tions that involve short-lived access to shared data
stored on a disk. Alternatives are needed in other
situations.
2.4. Virtually synchronous environments
One way out of the problems enumerated in Sec.
2.2 would be to base the system on atomic multicast
protocols.1 A multicast is atomic if all of its opera-
tional destinations receive the message unless the
sender fails, in which case either all receive it or none
does so. Moreover, all recipients see the same mes-
sage delivery ordering. We need to extend this
definition of atomicity to cover the case of a multicast
whose destinations include process groups with
memberships that may be changing. Such a multi-
cast should logically be delivered to the group
membership that applied when it was invoked, but
this may not be the one that is current at the time of
delivery. We will consequently require that the
delivery of an atomic multicast is always completed
before a group that forms part of its destinations is
allowed to take on a new member. We point out that
many existing atomic multicast protocols assume
static sets of destinations that are known when the
protocol is initiated [Chang] [Cristian] [Schneider-b].
We will use the term synchronous to a describe
an environment in which multicasts are atomic and
events such as message deliveries, process and site
failures, recoveries, and other events described below,
occur in the same order everywhere. In a synchro-
nous environment, mechanisms solving all the prob-
lems cited above can be implemented without much
difficulty. Processes can easily maintain a consistent
view of one another, as each process is always in the
same point in its computation as any other. Syn-
chronization, when needed, is simple for the same
reason. Process failures can be handled consistently
because all operational processes learn of a failure
simultaneously, in the same computational step.
Unfortunately, this is prohibitively expensive. The
problem is that it requires all message deliveries to
be ordered relative to one another, regardless of
whether the application needs this to maintain con-
sistency. The protocols needed to achieve such a
strong ordering are invariably costly, both in terms of
1This is a multi'cast to a set of processes, not a broad-
cast to all the machines connected to a local network with
a hardware broadcast capability. Such hardware might,
however, be exploited to optimize the implementation of
the multicast protocol [Babaoglu].
the number of messages sent and in terms of the time
spent waiting for them to terminate.
This leads us to the concept of virtual synchrony.
The basic idea is to preserve the illusion that events
are occurring instantaneously, but to use different
communication primitives that enforce weaker
delivery orderings in situations where the application
or tool is insensitive to the delivery ordering. For
example, one could imagine a multicast primitive
that delivers messages in the order that the sending
process sent them, but is completely unordered rela-
tive to multicasts from other origins. A process with
private access to a replicated FIFO queue could use
this primitive to update it, since updates would be
processed in the same order at all copies. On the
other hand, if more than one process can perform
operations on the queue, this primitive would be
inadequate, because updates from different processes
might be processed out of order. The advantage of
using this primitive in the former case is that it is
likely to have a cheaper implementation than a full
atomic multicast, and yet gives the degree of syn-
chrony needed for that application.
A virtually synchronous execution is thus charac-
terized by the following property:
It will appear to any observer -- any process
using the system -- that all processes observed
the same events in the same order. This applies
not just to message delivery events, but also to
failures, recoveries, group membership changes,
and other events described below. As we will see
in the next section, this enables one to make a-
priori assumptions about the actions other
processes will take, and simplifies algorithmic
design.
Recall that the actual sequence of events will some-
times differ from process to process in situations
where the resulting actions are the same (or semanti-
cally equivalent) to those that would have been taken
had the event sequences actually been identical. We
will exploit this to increase concurrency.
2.5. Other virtually synchronous tools
The above discussion is so focused on atomic mul-
ticasting that one might conclude that this is all
1SIS 2 provides. In fact, we view atomic multicasts as
just one of a family of tools that all provide virtually
synchronous behavior. For example, there is a tool
that supports bulk transfers of information between
processes in a way that looks instantaneous. Another
126