The message that the crashed printing office was processing can be passed to
the new printing office, once it has been determined that the message was not
handled. The agents don't notice any problem and the system just continues to sell
tickets. This scenario is one example of a fault tolerance strategy that Akka
provides, which is called the Restart strategy. Other strategies that can be used are
Resume, Stop and Escalate. These strategies are explained in more detail in chapter
4. Akka provides a way to select strategies for specific exceptions that can occur in
Actors. Since Akka controls how all messages between Actors are processed and
knows all the addresses of the Actors, it can stop processing messages for an Actor
that throws an exception, check which strategy should be used for the specific
exception and take the required action. (This is kind of a Cool Hand Luke
universe: the only failures are failures of communication.)
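
To make the idea of selecting a strategy per exception concrete, here is a minimal sketch in plain Python. It is a toy model and not Akka's API: the names Directive, PrinterJam, PrintingOffice, decide and run are all hypothetical and exist only for this illustration. The sketch assumes, as in the scenario above, that the message being processed during the crash was not handled and is therefore handed to the new instance.

```python
from collections import deque
from enum import Enum


class Directive(Enum):
    """The four strategies named in the text (toy model, not Akka's types)."""
    RESTART = "restart"
    RESUME = "resume"
    STOP = "stop"
    ESCALATE = "escalate"


class PrinterJam(Exception):
    """Hypothetical fault raised while the printing office handles a message."""


class PrintingOffice:
    """Hypothetical actor that prints tickets; a broken one fails on every message."""

    def __init__(self, broken=False):
        self.broken = broken
        self.printed = []

    def receive(self, message):
        if self.broken:
            raise PrinterJam(message)
        self.printed.append(message)


def decide(error):
    """Select a strategy for a specific exception type."""
    if isinstance(error, PrinterJam):
        return Directive.RESTART
    return Directive.ESCALATE


def run(mailbox, office):
    """Process the mailbox; on failure, apply the directive chosen by decide()."""
    while mailbox:
        message = mailbox.popleft()
        try:
            office.receive(message)
        except Exception as error:
            directive = decide(error)
            if directive is Directive.RESTART:
                office = PrintingOffice()    # fresh instance, same mailbox
                mailbox.appendleft(message)  # assumption: message was not handled
            elif directive is Directive.RESUME:
                continue                     # keep the instance, skip the message
            elif directive is Directive.STOP:
                break                        # stop processing altogether
            else:
                raise                        # escalate to whoever called run()
    return office


mailbox = deque(["ticket-1", "ticket-2", "ticket-3"])
office = run(mailbox, PrintingOffice(broken=True))
print(office.printed)
```

The senders that filled the mailbox are never involved in the recovery: the loop that delivers messages is the only place that sees the exception, which is what lets it swap in a new instance without anyone else noticing.
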
Fault tolerance does not mean that every possible fault is caught and recovered
from completely. A fault tolerant system is a system that can at least contain and
isolate faults in specific parts of the system, averting a full system crash. The goal
is to keep the system running, as was achieved by restarting the printing office.
Different faults require different corrective strategies. Some faults are solved by
restarting a part of the system; other faults might not be solvable at the point of
detection and may need to be handled at a higher level, as part of a larger
subsystem. We'll see how Akka provides for such cases with its notion of
Supervision later in the book.
As you would probably expect, replacing malfunctioning objects in a shared
mutable state approach is almost impossible to do, unless you are prepared to build
a framework of your own to support it. And this is not limited to malfunctioning
objects: what if you just wanted to replace the behavior of a particular object? (As
we will see later, Akka also provides functionality to hot swap the behavior of an
Actor.) Since you do not have control over how methods are called, nor possess the
ability to suspend a method and redirect it to another new object, the flexibility that
is offered by the message passing approach is unmatched.
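
As a rough illustration of why message passing makes behavior replaceable, the following plain Python sketch models an object that only ever receives messages through a single entry point. It is not Akka's hot swap API; the Agent class and its tell, _selling and _closed methods are hypothetical names chosen for this example.

```python
class Agent:
    """Hypothetical actor whose message handler can be replaced at runtime."""

    def __init__(self):
        self._behavior = self._selling
        self.log = []

    def tell(self, message):
        # Callers only send messages; they never call a behavior method directly.
        self._behavior(message)

    def _selling(self, message):
        if message == "close":
            self._behavior = self._closed   # swap behavior for later messages
        else:
            self.log.append(f"sold {message}")

    def _closed(self, message):
        if message == "open":
            self._behavior = self._selling  # swap back
        else:
            self.log.append(f"rejected {message}")


agent = Agent()
for message in ["ticket-1", "close", "ticket-2", "open", "ticket-3"]:
    agent.tell(message)
print(agent.log)
```

Because senders depend only on tell, the handler behind it can change between two messages without any caller being updated, which is not possible when callers invoke methods on a shared object directly.
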
Without going into a lot of detail, let's look briefly at exception handling.
Exceptions in standard, non-concurrent code are thrown up the call hierarchy. An
object needs to handle the exception, or rethrow it. Whenever an exception occurs
you need to stop the regular business that you are doing and fall back to error
handling, and then immediately continue where you left off after the error has been
dealt with. Since this is quite hard, most developers prefer to just throw an error all
the way up the stack, leaving it up to some type of framework to handle, aborting the