3 Design Tradeoffs
There are three phases of a disk imaging system: image creation, image distribution, and image installation. Each phase has aspects which must be balanced to fulfill a desired goal. We consider each phase in turn.
3.1 Image Creation
In image creation, the goal is to create a consistent snapshot of a disk or partition in the most efficient way possible.
Source availability: While it is possible for the source of the snapshot to be active during the image creation process, it is more common that it be quiescent to ensure consistency. Quiescence may be achieved either by using a separate partition or disk for the image source or by running the image creation tool in a standalone environment which doesn’t use the source partition. Whatever the technique, the time that the image source is “offline” may be a consideration. For example, an image creation tool which compresses the data as it reads it from the disk may take much longer than one that just reads the raw data and compresses later. However, the former will require much less space to store the initial image.
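The tradeoff can be made concrete with a minimal sketch, not any particular tool’s implementation: it reads a quiescent source device in fixed-size blocks and either compresses inline (longer offline time, small initial image) or dumps raw data for later, offline compression. The function and variable names are illustrative, and zlib stands in for whatever compressor a real tool would use.

    import zlib

    BLOCK = 1 << 20  # read the source in 1 MB blocks

    def capture(source, image, compress_inline=True):
        # 'source' is assumed to be a quiescent disk or partition device.
        comp = zlib.compressobj() if compress_inline else None
        with open(source, "rb") as src, open(image, "wb") as img:
            while True:
                block = src.read(BLOCK)
                if not block:
                    break
                # Inline compression keeps the source offline longer, but
                # the initial image needs far less space than a raw dump.
                img.write(comp.compress(block) if comp else block)
            if comp:
                img.write(comp.flush())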
Degree of compression and data segmentation: Another factor is how much (if any) and what kind of compression is used when creating the image. While compression would seem to be an obvious optimization, there are trade-offs. As mentioned, the time and CPU resources required to create an image are greater when compressing. Compression also impacts the distribution and decompression process. If a disk image is compressed as a single unit and even a single byte is lost during distribution, the decompression process will stall until the byte is acquired successfully. Thus, depending on the distribution medium, images may need to be broken into smaller pieces, each of which is compressed independently. This can make image distribution more robust and image installation more efficient at the expense of sub-optimal compression.
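A minimal sketch of such segmentation, under assumed conventions (a fixed 1 MB chunk size and a simple length prefix per chunk; a real image format would also carry sequence numbers and checksums):

    import struct
    import zlib

    CHUNK = 1 << 20  # segment the image into 1 MB units

    def write_chunked(source, image):
        # Each chunk is compressed independently and prefixed with its
        # compressed length, so a receiver can decompress any chunk it
        # holds without waiting for bytes lost from other chunks.
        with open(source, "rb") as src, open(image, "wb") as img:
            while True:
                data = src.read(CHUNK)
                if not data:
                    break
                cdata = zlib.compress(data)
                img.write(struct.pack("<I", len(cdata)))
                img.write(cdata)

Compressing each unit on its own forfeits redundancy that spans chunk boundaries, which is the sub-optimal compression referred to above.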
Filesystem-aware compression: A stated advantage of disk imaging over techniques that operate at the file level is that imaging requires no knowledge of the contents or semantics of the data being imaged. This matches well with typical file compression tools and algorithms which are likewise ignorant of the data being compressed. However, most disk images contain filesystems and most filesystems have a large amount of available (free) space in them, space that will dutifully be compressed even though the contents are irrelevant. Thus, the trade-off for being able to handle any content is wasted time and space creating the image and wasted time decompressing the image. One common workaround is to zero all the free space in filesystems on the disk prior to imaging, for example, by creating and then deleting a large file full of zeros. This at least ensures maximum compressibility of the free space.
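The workaround can be sketched as follows (a hypothetical helper, not a specific tool: it writes zeros until the filesystem reports it is full, then deletes the file):

    import os

    def zero_free_space(mountpoint):
        # Fill the filesystem with zeros, then delete the file, so that
        # free blocks compress to almost nothing when the disk is imaged.
        path = os.path.join(mountpoint, "zerofill.tmp")
        zeros = b"\0" * (1 << 20)
        try:
            with open(path, "wb") as f:
                while True:
                    f.write(zeros)  # raises OSError (ENOSPC) when full
        except OSError:
            pass
        finally:
            if os.path.exists(path):
                os.remove(path)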
A better solution is to perform filesystem-aware compression. A filesystem-aware compression tool understands the layout of a disk, identifying filesystems and differentiating the important, allocated blocks from the unimportant, free blocks. The allocated blocks are compressed while the free blocks are skipped. Of course, a disk imaging tool using filesystem-aware compression requires even more intimate knowledge of a filesystem than a file-level tool, but the imaging tool need not understand all filesystems it may encounter; it can always fall back on naive compression.
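A minimal sketch of the idea, assuming the tool has already parsed the filesystem’s own metadata into a bit-per-block allocation bitmap (ext2-style); a real image format would also record the disk offset of each saved range so installation can place the data correctly:

    import zlib

    def compress_allocated(disk_path, bitmap, block_size, image_path):
        # 'bitmap' is bytes taken from the filesystem's block allocation
        # map: bit i set means disk block i is allocated (important).
        comp = zlib.compressobj()
        with open(disk_path, "rb") as disk, open(image_path, "wb") as img:
            for i in range(len(bitmap) * 8):
                if bitmap[i // 8] & (1 << (i % 8)):
                    disk.seek(i * block_size)
                    img.write(comp.compress(disk.read(block_size)))
                # free blocks are never read, compressed, or stored
            img.write(comp.flush())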
3.2 Image Distribution
Image distribution is concerned with getting a disk image from a “server” to one or more “clients.” In our context it is assumed that the server and clients are different machines and not just different disks on the same machine. Furthermore, we restrict the discussion to distribution over a network.
Network bandwidth and latency: Perhaps the most important aspect of network distribution is bandwidth utilization. The availability of bandwidth affects how images are created (the degree of compression) as well as how many clients can be supported by a server (scaling). Bandwidth requirements are reduced significantly by using compression. Increased compression not only reduces the amount of data that needs to be transferred, it also slows the consumption rate of the client due to the need to decompress the data before writing it to disk. If image distribution is serialized, with only one client at a time, then compression alone may be sufficient to achieve a target bandwidth. However, if the goal is to distribute an image to multiple clients simultaneously, then typical unicast protocols will need to be replaced with broadcast or multicast. Broadcast works well in environments where all clients in the broadcast domain are involved in the image distribution. If the network is shared, then multicast is more appropriate, ensuring that unaffiliated machines are not affected. Just as in all data transfer protocols, the delay-bandwidth product affects how much data needs to be en route in order to keep clients busy (for example, a 100 Mb/s link with a 10 ms round-trip time requires roughly 125 KB in flight), and the bandwidth and latency influence the granularity of the error recovery protocol.
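As a concrete illustration of the multicast case, the following is a minimal sketch of a client joining a multicast group using standard socket options; the group address and port are assumptions, and a real tool would layer its own error recovery protocol on top.

    import socket
    import struct

    GROUP, PORT = "239.0.0.1", 5000  # hypothetical group and port

    def receive_image():
        # Joining the group means one transmitted stream reaches every
        # participating client; unaffiliated machines never see it.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                           socket.inet_aton("0.0.0.0"))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        while True:
            packet, _ = sock.recvfrom(65536)
            # ... hand 'packet' to chunk reassembly and decompression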
Network reliability: As alluded to earlier, the error rate of the network may affect how compression is performed. Smaller compression units may limit the effectiveness of the compression, but increase the ability of clients to remain busy in the face of lost packets. More generally, in lossy networks it is desirable to subdivide an image into “chunks” and include with each chunk additional information to make that chunk self-