This is the kind of feature that requires a recovery method
to (1) support operation logging (i.e., logging the quantity
by which a field's value was decremented or incremented,
rather than logging the before and after values of the field
as in IMS), (2) avoid erroneous attempts to undo or redo
some actions unnecessarily by precisely tracking the state
of a page using the LSN concept, and (3) write CLRs.
Unlike in earlier recovery methods, in ARIES, CLRs have
the property that they are redo-only log records. By
appropriate chaining of the CLRs to log records written
during forward processing, a bounded amount of logging
is ensured during rollbacks, even in the face of repeated
failures during restart recovery or of nested rollbacks.
This is to be contrasted with what happens in IMS
[PeSt83], which may undo the same nonCLR multiple
times, and in AS/400 [ClCo89], DB2/MVS V1 and
NonStop SQL, which, in addition to undoing the same
nonCLR multiple times, may also undo CLRs one or more
times (see [MHLPS92] for examples). In the past, these
have caused severe problems in real-life situations.
When the undo of a log record causes a CLR to be written,
the CLR is made to point, via the UndoNxtLSN field of
the CLR, to the predecessor of the log record being
undone. The latter information is readily available since
every log record, including a CLR, contains a pointer
(PrevLSN) to the most recent preceding log record written
by the same transaction. Thus, during rollback, the
UndoNxtLSN field of the most recently written CLR keeps
track of the progress of rollback. It tells the system from
where to continue the rollback of the transaction, if a
system failure were to interrupt the completion of the
rollback or if a nested rollback were to be performed. It lets
the system bypass those log records that had already been
undone.
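The chaining just described can be illustrated with a minimal sketch. It is not the ARIES implementation: the record layout, the `append` and `rollback` helpers, and the `max_undos` parameter (used only to simulate a failure in the middle of a rollback) are assumptions made for illustration. Only the PrevLSN and UndoNxtLSN fields come from the text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    lsn: int
    txn: str
    kind: str                           # "update" or "CLR"
    prev_lsn: Optional[int]             # PrevLSN: previous record of the same transaction
    undo_nxt_lsn: Optional[int] = None  # UndoNxtLSN: meaningful only on CLRs

def append(log, txn, kind, prev_lsn, undo_nxt_lsn=None):
    """Hypothetical helper: append a record, using list position + 1 as the LSN."""
    rec = LogRecord(len(log) + 1, txn, kind, prev_lsn, undo_nxt_lsn)
    log.append(rec)
    return rec

def rollback(log, txn, max_undos=None):
    """Undo txn's updates in reverse order, writing one CLR per undone update.

    max_undos is an illustrative knob that stops the rollback early,
    standing in for a system failure during rollback.
    """
    by_lsn = {r.lsn: r for r in log}
    last = max((r for r in log if r.txn == txn), key=lambda r: r.lsn)
    nxt = last.lsn    # next record to examine
    prev = last.lsn   # PrevLSN for the next CLR we write
    undone = 0
    while nxt is not None:
        if max_undos is not None and undone >= max_undos:
            return    # simulated failure: rollback left incomplete
        rec = by_lsn[nxt]
        if rec.kind == "CLR":
            # CLRs are redo-only: never undo them, just follow UndoNxtLSN
            # to bypass the records that were already undone.
            nxt = rec.undo_nxt_lsn
        else:
            # (The actual undo of the update would happen here.)
            clr = append(log, txn, "CLR", prev, rec.prev_lsn)
            by_lsn[clr.lsn] = clr
            prev = clr.lsn
            undone += 1
            nxt = rec.prev_lsn
```

In this sketch, a transaction with three updates whose rollback is interrupted after two undos and then resumed writes exactly three CLRs in total. The resumed rollback reads the UndoNxtLSN of the last CLR and continues from the first update, so no update is undone twice and no CLR is undone.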
Since CLRs can describe what actions are actually
performed during the undo of an original action, the undo
action need not be, in terms of which page(s) is affected,
the exact inverse of the action that is being compensated
(i.e., logical undo is made possible). This allows very high
concurrency to be supported. For example, in a B+-tree, a
key inserted on page 10 by one transaction may be moved
to page 20 by another transaction before the key insertion
is committed, as we permit in ARIES/IM [Mohan95b,
MoLe92] (see [Mohan93a] for the description of
ARIES/LHS which also exploits this feature). Now, if the
first transaction were to roll back, then the key will be
located on page 20 by retraversing the tree and deleted
from there. A CLR will be written to describe the key
deletion on page 20. This enables page-oriented redo,
which is very efficient, during restart and media recovery
[MHLPS92].

A nested rollback is said to have occurred if a partial rollback
were to be later followed by a total rollback or another partial
rollback whose point of termination is an earlier point in the
transaction than the point of termination of the first rollback.

3.2.2 Restart Recovery
When restarting the transaction system after an abnormal
termination, recovery processing in ARIES involves
making three passes (analysis, redo and undo) over the
log. In order to make this processing efficient, periodically
during normal processing, ARIES takes checkpoints. The
checkpoint log records identify the transactions that are
active, their states, and the addresses of their most recently
written log records, and also the modified data (dirty data)
that is in the buffer pool. During restart recovery, ARIES
first scans the log from the last checkpoint to the end of
the log. During this analysis pass, information about dirty
data and transactions that were in progress at the time of
the checkpoint is brought up to date as of the end of the
log. The analysis pass, using the dirty data information,
determines the starting point (RedoLSN) for the log scan
of the immediately following redo pass. The analysis pass
also determines the list of transactions to be rolled back in
the undo pass. For each in-progress transaction, the LSN
of the most recently written log record will also be
determined.
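The analysis pass described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the ARIES algorithm in full: log records are plain dictionaries, the checkpoint is a dictionary with hypothetical keys `txns` and `dirty_pages`, a `commit` or `end` record is treated as removing a transaction from the in-progress list, and the in-doubt state is ignored.

```python
def analysis_pass(log, checkpoint):
    """Bring checkpoint information up to date as of the end of the log.

    log: records written after the checkpoint, in LSN order (assumed layout:
         {"lsn", "txn", "kind", "page"}).
    checkpoint: {"txns": {txn: last_lsn}, "dirty_pages": {page: first_dirtying_lsn}}
    Returns (in-progress transactions with their last LSN,
             dirty page information, RedoLSN).
    """
    txns = dict(checkpoint["txns"])
    dirty = dict(checkpoint["dirty_pages"])
    for rec in log:
        if rec["kind"] in ("commit", "end"):
            # Assumption: a terminated transaction needs no rollback.
            txns.pop(rec["txn"], None)
            continue
        # Track the most recently written log record of each transaction.
        txns[rec["txn"]] = rec["lsn"]
        if rec["kind"] in ("update", "CLR"):
            # Remember only the first LSN that dirtied the page.
            dirty.setdefault(rec["page"], rec["lsn"])
    # Redo must start at the earliest LSN that may not be reflected on disk.
    redo_lsn = min(dirty.values(), default=None)
    return txns, dirty, redo_lsn
```

The transactions remaining in the first returned value are the ones to be rolled back in the undo pass, each paired with the LSN of its most recently written log record.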
Next, during the redo pass, ARIES repeats history with
respect to those updates logged on stable storage but whose
effects on the database pages did not get reflected on disk
before the system failure. This is done for the updates of
ALL transactions, including the updates of those
transactions that had neither committed nor reached the
in-doubt state of two-phase commit by the time of the
crash (i.e., even the missing updates of the so-called loser
transactions are redone).
The process of repeating history essentially reestablishes
the state of the database as of the time of the failure. A log
record's update is redone if the affected page's page_LSN
is less than the log record's LSN. The redo pass also
obtains the locks needed to protect the uncommitted
updates of those distributed transactions which will remain
in the in-doubt (prepared) state [MoLO86] at the end of
restart recovery. In contrast, in the recovery methods of
System R [GMBLL81] and DB2 V1 [Crus84], only the
missing updates of terminated and in-doubt transactions
(the nonloser transactions) are redone during the redo
pass. This is called the selective redo paradigm. In
[MHLPS92], we show why this paradigm leads to
problems when fine-granularity (i.e., smaller than page-
granularity) locking is to be supported with WAL.
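The redo rule stated above (redo an update only if the page's page_LSN is less than the log record's LSN) can be sketched as follows. The record and page layouts are assumptions for illustration. Each update is logged as a `delta` in the style of operation logging, and redo is applied to the updates of all transactions, losers included.

```python
def redo_pass(log, pages, redo_lsn):
    """Repeat history from redo_lsn to the end of the log.

    log: records in LSN order (assumed layout:
         {"lsn", "txn", "kind", "page", "delta"}).
    pages: {page_id: {"page_lsn": int, "value": int}}, the on-disk state.
    Returns the LSNs of the records that were actually redone.
    """
    redone = []
    for rec in log:
        if rec["lsn"] < redo_lsn or rec["kind"] not in ("update", "CLR"):
            continue
        page = pages[rec["page"]]
        # The page already reflects this update unless its page_LSN is smaller.
        if page["page_lsn"] < rec["lsn"]:
            page["value"] += rec["delta"]   # reapply the logged operation
            page["page_lsn"] = rec["lsn"]   # record that the page now reflects it
            redone.append(rec["lsn"])
    return redone
```

Note that the sketch never checks whether the transaction that wrote a record committed. Under selective redo, that check would be present, and the updates of loser transactions would be skipped.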
The next pass is the undo pass during which all loser
transactions' updates are rolled back, in reverse