DRBD 8.0.x and Beyond: Key Technology for High Availability

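"DRBD v8.0.x and beyond" is a technical paper by Lars Ellenberg that examines the Distributed Replicated Block Device (DRBD) in its 8.0.x and later versions. DRBD is developed by Philipp Reisner, Lars Ellenberg, and their team at LINBIT. It aims to provide highly available storage that keeps data accessible even when a single storage node fails completely. The paper covers the following areas:

1. Shared disk versus shared nothing: The paper first describes DRBD's use in shared-disk cluster environments, even though DRBD itself is built on a shared-nothing design. This means DRBD provides data synchronization and redundancy without relying on the local storage of other nodes.

2. Design ideas and algorithms: Lars Ellenberg explains the design ideas behind DRBD, including several key algorithms the team developed. These are useful for DRBD itself and may also inform software RAID, snapshot techniques (such as dm-mirror), cluster file systems, or distributed databases.

3. Use cases and limitations: The paper covers typical and less typical use cases and discusses DRBD's strengths and limits, for example how to carry out a storage hardware upgrade without interrupting service.

4. Performance and upcoming features: The paper reports benchmark results and upcoming improvements, such as cache warming on the receiving side, which is intended to improve responsiveness and efficiency.

5. Practical tips: Finally, the author offers practical tips and strategies for using DRBD to improve the availability and performance of a system.

The paper combines technical depth with practical guidance: it presents DRBD's core techniques and design ideas, gives advice for real deployments, and looks ahead to future development. It is a useful reference for IT professionals working on highly available storage and distributed computing.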

6 RESYNC: MAGIC HEALING

Two questions to be answered:

• which blocks, and

• which direction?

6.1 network hiccup

Let's keep it simple: one Primary, one Secondary, and just a network hiccup with no node-role changes involved. We know the direction: Primary -> Secondary. We know which blocks to transfer, because the Primary keeps an in-memory bitmap, which is dirtied whenever a WRITE is completed to the upper layers without being acknowledged by the Secondary.

This may transfer more blocks than necessary, because of the granularity of the bitmap, and because for some blocks only the ack was lost while the data had been written correctly.
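The bitmap bookkeeping described above can be sketched as follows. This is a minimal illustration, not DRBD's implementation: the class name `DirtyBitmap`, its methods, and the 4 KiB-per-bit granularity are assumptions made for the example. It shows why coarse granularity makes the resync resend more data than was actually written.

```python
class DirtyBitmap:
    """One flag per fixed-size region; a set flag means 'resend this region'.

    Hypothetical sketch: the names and the 4096-byte granularity are
    assumptions for illustration only.
    """

    def __init__(self, device_bytes, bytes_per_bit=4096):
        self.bytes_per_bit = bytes_per_bit
        nbits = -(-device_bytes // bytes_per_bit)  # ceiling division
        self.bits = bytearray(nbits)  # one byte per flag, for readability

    def mark_unacked_write(self, offset, length):
        """Dirty every region touched by a write the peer did not acknowledge."""
        first = offset // self.bytes_per_bit
        last = (offset + length - 1) // self.bytes_per_bit
        for i in range(first, last + 1):
            self.bits[i] = 1

    def regions_to_resync(self):
        """Region numbers that must be sent Primary -> Secondary."""
        return [i for i, bit in enumerate(self.bits) if bit]


bitmap = DirtyBitmap(device_bytes=64 * 1024)
bitmap.mark_unacked_write(offset=5000, length=100)  # inside region 1
bitmap.mark_unacked_write(offset=8190, length=4)    # straddles regions 1 and 2

dirty_regions = bitmap.regions_to_resync()
bytes_written = 100 + 4
bytes_resent = len(dirty_regions) * bitmap.bytes_per_bit
```

In this example 104 bytes of unacknowledged writes dirty two regions, so 8192 bytes would be transferred during resync.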

6.2 node crash

If a Secondary node had crashed and was revived, the procedure is just the same as above.

If the Primary was rebooted while the Secondary was down, we'd lose the information stored in the dirty bitmap. So we keep a copy of it in a reserved meta-data area on disk, from which we can initialize the in-memory bitmap once we are configured again.
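As a rough illustration of keeping an on-disk copy of the bitmap, the sketch below writes the in-memory flags to a file and reads them back after a simulated reboot. The file stands in for the reserved meta-data area; the function names `save_bitmap` and `load_bitmap` and the one-byte-per-flag layout are assumptions for this example, not DRBD's on-disk format.

```python
import os
import tempfile


def save_bitmap(path, bits):
    """Write the bitmap and force it to stable storage before returning."""
    with open(path, "wb") as f:
        f.write(bytes(bits))
        f.flush()
        os.fsync(f.fileno())


def load_bitmap(path, nbits):
    """Initialize a fresh in-memory bitmap from the on-disk copy."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) != nbits:
        raise ValueError("on-disk bitmap does not match device size")
    return bytearray(data)


in_memory = bytearray(16)
in_memory[3] = in_memory[9] = 1  # two regions the Secondary has not seen

with tempfile.TemporaryDirectory() as tmp:
    meta = os.path.join(tmp, "meta-data.bin")  # stand-in for the meta-data area
    save_bitmap(meta, in_memory)
    restored = load_bitmap(meta, len(in_memory))  # "after the reboot"
```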

If a Primary node had crashed, we have a different problem.

There could have been in-flight IO, and we have no idea whether that made it to disk, to the network, or to both. Even though only very few blocks will be different, we have no idea which ones, so we have to assume that any block might be different.

For the sake of data-integrity, we would have to retransmit the entire disk, just to be sure...

6.3 Full Sync? No Way.

To avoid this, we could dirty the on-disk bitmap with each incoming write request, submit the write, and clear the bit after the write has completed successfully.

This would make three requests out of one. Worse, to be correct, the dirtying write would have to be synchronous: we'd have to wait for it to complete before we could submit the application write.
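The cost of this naive scheme is easy to see in a toy model. The sketch below only counts requests; `CountingDisk` and `naive_tracked_write` are hypothetical names, and no real IO is performed.

```python
class CountingDisk:
    """Toy disk that records every request it receives, in order."""

    def __init__(self):
        self.log = []

    def write(self, kind, block):
        self.log.append((kind, block))


def naive_tracked_write(disk, block):
    """Naive scheme: set the on-disk dirty bit, write the data, clear the bit."""
    disk.write("meta-set", block)    # would have to complete before the next step
    disk.write("data", block)        # the application's actual write
    disk.write("meta-clear", block)  # only after the data write has completed


disk = CountingDisk()
application_writes = 100
for block in range(application_writes):
    naive_tracked_write(disk, block)

requests_per_write = len(disk.log) / application_writes
```

Every application write turns into three disk requests, and the first of them sits on the latency path of the data write.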

We are smarter than that.

6.4 Peanuts . . .

To reliably keep track of the target blocks of in-flight IO, while minimizing the additional IO requests required for this housekeeping, we came up with the concept of the "Activity-Log".

Think of your storage as a huge heap of peanuts. Sisyphus has tagged them all with a distinct block number. There are many people running around, taking some of the peanuts in their pockets (that is the in-flight IO), and throwing them back on the heap (that is the IO completion). Painting them blue is allowed: these are the WRITEs for which we are still missing the other node's acknowledgment (dirty bits). Eating peanuts is strictly forbidden, as is re-tagging.

Blocks corresponding to the in-pocket peanuts have to be retransmitted; those corresponding to the heap don't need to be (but it would do no harm if some of them were).

Our mission is to know at any given moment, as precisely as possible, which peanuts are NOT in the pockets of those people (and not painted blue yet), because if we know that, we can avoid retransmitting the corresponding blocks after a Primary crash.

First, we take control of the situation. We structure the heap, and put the peanuts in order into boxes (activity-log extents), which in turn are numbered. We draw a line in the sand.
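The numbering of boxes can be illustrated with a small sketch. The excerpt does not state the box size or how the boxes are tracked, so the 4 KiB block size, the 4 MiB extent size, and the helper names below are assumptions chosen for the example. It shows only how block numbers map to extent numbers, and that remembering boxes instead of individual peanuts is conservative: every block in a remembered box would be treated as possibly in a pocket.

```python
BLOCK_BYTES = 4096               # assumed block size
EXTENT_BYTES = 4 * 1024 * 1024   # assumed size of one box (activity-log extent)
BLOCKS_PER_EXTENT = EXTENT_BYTES // BLOCK_BYTES


def extent_of(block_nr):
    """Number of the box that holds the given block."""
    return block_nr // BLOCKS_PER_EXTENT


def blocks_in_extent(extent_nr):
    """All block numbers stored in one box."""
    start = extent_nr * BLOCKS_PER_EXTENT
    return range(start, start + BLOCKS_PER_EXTENT)


in_flight_blocks = [0, 7, 1023, 1024, 500000]  # peanuts currently in pockets
boxes_touched = sorted({extent_of(b) for b in in_flight_blocks})

# If only the box numbers are known, every block in those boxes is suspect.
suspect_blocks = sum(len(blocks_in_extent(e)) for e in boxes_touched)
```

Here five in-flight blocks fall into three boxes, so 3072 blocks would be treated as suspect rather than the whole device.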

Some do that, anyways; call them Eh-i-oh and Silent Corruption ;)
