DRBD Features
Controllers with battery-backed write cache (BBWC) use a battery to back up their volatile
storage. On such devices, when power is restored after an outage, the controller flushes the most
recent pending writes out to disk from the battery-backed cache, ensuring all writes committed
to the volatile cache are actually transferred to stable storage. When running DRBD on top
of such devices, it may be acceptable to disable disk flushes, thereby improving DRBD's write
performance. See Section 6.12, “Disabling backing device flushes” [43] for details.
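As a sketch, disk flushes are disabled in the disk section of the resource configuration. The resource name r0 is illustrative, and the keywords below follow DRBD 8.3-style syntax; newer releases express the same settings as disk-flushes no; and md-flushes no;:

```
resource r0 {
  disk {
    # Safe only when the backing device sits behind a
    # battery-backed write cache (BBWC):
    no-disk-flushes;   # disable flushes for replicated data writes
    no-md-flushes;     # disable flushes for DRBD metadata writes
  }
}
```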
2.10. Disk error handling strategies
If a hard drive that is used as a backing block device for DRBD on one of the nodes fails, DRBD
may either pass on the I/O error to the upper layer (usually the file system) or it can mask I/O
errors from upper layers.
Passing on I/O errors. If DRBD is configured to “pass on” I/O errors, any such errors occurring
on the lower-level device are transparently passed to upper I/O layers. Thus, it is left to upper
layers to deal with such errors (this may result in a file system being remounted read-only, for
example). This strategy does not ensure service continuity, and is hence not recommended for
most users.
Masking I/O errors. If DRBD is configured to detach on lower-level I/O error, DRBD will do
so automatically upon occurrence of the first such error. The I/O error is masked
from upper layers while DRBD transparently fetches the affected block from the peer node,
over the network. From then onwards, DRBD is said to operate in diskless mode, and carries out
all subsequent I/O operations, read and write, on the peer node. Performance in this mode
inevitably suffers, but the service continues without interruption, and can be moved
to the peer node in a deliberate fashion at a convenient time.
See Section 6.9, “Configuring I/O error handling strategies” [39] for information on
configuring I/O error handling strategies for DRBD.
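The strategy is selected with the on-io-error option in the disk section of the resource configuration. A minimal sketch, with an illustrative resource name:

```
resource r0 {
  disk {
    # detach: mask the error and switch to diskless mode (recommended);
    # pass_on: hand the I/O error to the upper layers;
    # call-local-io-error: invoke the local-io-error handler script.
    on-io-error detach;
  }
}
```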
2.11. Strategies for dealing with outdated data
DRBD distinguishes between inconsistent and outdated data. Inconsistent data is data that cannot
be expected to be accessible and useful in any manner. The prime example for this is data on
a node that is currently the target of an on-going synchronization. Data on such a node is part
obsolete, part up to date, and impossible to identify as either. Thus, for example, if the device
holds a filesystem (as is commonly the case), that filesystem could not be expected to mount,
nor even to pass an automatic filesystem check.
Outdated data, by contrast, is data on a secondary node that is consistent, but no longer in
sync with the primary node. This may occur upon any interruption of the replication link, whether
temporary or permanent. Data on an outdated, disconnected secondary node is expected to be
clean, but it reflects a state of the peer node from some time in the past. In order to avoid services using
outdated data, DRBD disallows promoting [3] a resource that is in the outdated state.
DRBD has interfaces that allow an external application to outdate a secondary node as soon
as a network interruption occurs. DRBD will then refuse to switch the node to the primary
role, preventing applications from using the outdated data. A complete implementation of this
functionality exists for the Heartbeat cluster management framework [66] (where it uses a
communication channel separate from the DRBD replication link). However, the interfaces are
generic and may be easily used by any other cluster management application.
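For the Heartbeat/dopd case, the hook is the fence-peer handler in the resource configuration; the helper path below is illustrative and distribution-dependent:

```
resource r0 {
  handlers {
    # Ask dopd (via Heartbeat's communication channel) to
    # outdate the peer's data when replication is interrupted:
    fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
  }
  disk {
    # Fence by outdating data only, rather than
    # powering off the peer node:
    fencing resource-only;
  }
}
```

The same effect can also be produced manually with drbdadm outdate on the node holding the stale data; DRBD will then refuse to promote that resource until the replication link is re-established.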
Whenever an outdated resource has its replication link re-established, its outdated flag is
automatically cleared. A background synchronization [6] then follows.
See the section about the DRBD outdate-peer daemon (dopd) [77] for an example DRBD/
Heartbeat configuration enabling protection against inadvertent use of outdated data.