block in another stripe. That is, a parity block in a stripe can
either (i) be stored on a separate and dedicated node or (ii) be
mixed with a data block in another stripe across all nodes in
the cluster.
2.2. Replication redundancy
Replication schemes are widely used to ensure high data reliability and to facilitate I/O parallelism in distributed storage clusters like GFS [1], HDFS [2] and QFS [24]. A cluster file system maintains two or three replicas for each data block. However, replicated data improve data reliability and I/O performance at the cost of high storage consumption. For example, the storage capacity overhead of replication and triplication is as high as 200% and 300%, respectively.
Archiving infrequently accessed data (a.k.a. unpopular data) can improve storage utilization. Additionally, such an archiving operation has little impact on I/O access services in clusters, since the data to be archived exhibit decreased access frequency. In Facebook's BLOB storage clusters, the BLOB storage workload changes along with Facebook's growth. Consequently, newly created BLOBs (i.e. hot data) have three replicas to support a high request rate; week-old BLOBs (i.e. warm data) have two duplicates using geo-replicated XOR coding; and 2+-month-old data (i.e. cold data) are converted into the format of (14,10) RS-coded data [18, 25].
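For a quick sanity check on these figures, the following sketch (illustrative only; the function names are ours, not part of any cited system) computes the raw-storage multiplier of n-way replication versus an (n, k) RS code. The 200% and 300% overheads quoted above correspond to the 2x and 3x multipliers, while a (14,10) RS code needs only 1.4x.

```python
def replication_overhead(copies: int) -> float:
    """n-way replication keeps `copies` full copies of every block."""
    return float(copies)          # e.g. 3 copies -> 3.0x raw storage (300%)

def rs_overhead(n: int, k: int) -> float:
    """An (n, k) RS stripe keeps k data blocks plus n - k parity blocks."""
    return n / k                  # e.g. (14, 10) -> 1.4x raw storage (140%)

print(replication_overhead(2), replication_overhead(3), rs_overhead(14, 10))
# 2.0 3.0 1.4
```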
Replica placement plays a significant role in data reliability. A typical example is the rack-aware replica placement policy adopted by HDFS [2]. With rack-aware replica placement in place, the three replicas of a data block are organized as follows: one replica is stored on a node in the local rack, another replica is placed on a node in a remote rack, and the last one is kept on a separate node in the same remote rack. When it comes to two-way replication, each primary block and its replica block are stored on two distinct nodes. When an erasure-coded data archival procedure is launched, it is challenging to determine the set of source data blocks for an archival stripe from a group of data replicas, because the replicas of any two distinct data blocks are randomly distributed.
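As an illustration of the rack-aware policy just described, the minimal sketch below picks one node in the local rack and two distinct nodes in one remote rack; the rack layout and node names are hypothetical and not drawn from the HDFS source code.

```python
import random

# racks maps a rack id to its list of node ids (hypothetical cluster layout).
def place_replicas(racks: dict, local_rack: str) -> list:
    # Replica 1: any node in the local rack.
    replica1 = random.choice(racks[local_rack])
    # Replicas 2 and 3: two distinct nodes in one randomly chosen remote rack.
    remote_rack = random.choice([r for r in racks if r != local_rack])
    replica2, replica3 = random.sample(racks[remote_rack], 2)
    return [replica1, replica2, replica3]

racks = {"rack-A": ["n1", "n2", "n3"], "rack-B": ["n4", "n5", "n6"]}
print(place_replicas(racks, "rack-A"))   # e.g. ['n2', 'n5', 'n6']
```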
2.3. Distributed erasure-coded archival
Decentralized encoding [12], parallel data archiving [15] and pipelined archival [14] belong to the camp of distributed data archival schemes (i.e. DArch). In the DArch cases, two or more nodes, acting as encoding nodes, accomplish the parity generation. In particular, each encoding node retrieves k different data blocks from existing data replicas and computes r parity blocks. In this section, we make use of a concrete example to illustrate the DArch scheme.
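To make the encoding-node role concrete, the sketch below derives r parity blocks from k fetched data blocks with a Vandermonde-style encoder over GF(2^8). It is a simplified stand-in rather than the authors' implementation: production RS coders choose the generator matrix more carefully and rely on optimized Galois-field libraries.

```python
# GF(2^8) log/antilog tables over the common 0x11d polynomial.
GF_EXP = [0] * 512
GF_LOG = [0] * 256
_x = 1
for _i in range(255):
    GF_EXP[_i] = _x
    GF_LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11d
for _i in range(255, 512):
    GF_EXP[_i] = GF_EXP[_i - 255]

def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements in GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

def encode_parity(data_blocks, r):
    """Derive r parity blocks from k equal-length data blocks.

    Parity i is the sum over j of alpha^(i*j) * D_j -- a Vandermonde-style
    sketch of how an encoding node turns k fetched blocks into r parities.
    """
    size = len(data_blocks[0])
    parities = [bytearray(size) for _ in range(r)]
    for i in range(r):
        for j, block in enumerate(data_blocks):
            coef = GF_EXP[(i * j) % 255]        # alpha^(i*j)
            for pos in range(size):
                parities[i][pos] ^= gf_mul(coef, block[pos])
    return parities

# k = 4 source blocks fetched from replica holders (toy 4-byte blocks).
blocks = [bytearray(b"ABCD"), bytearray(b"EFGH"),
          bytearray(b"IJKL"), bytearray(b"MNOP")]
print(encode_parity(blocks, r=2))
```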
Figure 3 demonstrates how to archive two-replica data with random distribution using RS codes with coding parameter k = 4 in a storage cluster, where {D1,1, D1,2, D1,3, D1,4} and {D2,1, D2,2, D2,3, D2,4} are the source data blocks of stripes 1 and 2, respectively. Data blocks are randomly distributed on eight data nodes, as long as the two replicas of any data block are stored on two different data nodes.
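The placement constraint in this example is simply that each block's two replicas are drawn from the eight data nodes without repetition. A minimal sketch (the node and block names mirror Fig. 3; the placement itself is random):

```python
import random

# Random two-replica placement: each block's two copies land on two
# distinct nodes, chosen among eight data nodes.
nodes = [f"SN{i}" for i in range(1, 9)]
blocks = [f"D{s},{j}" for s in (1, 2) for j in range(1, 5)]

placement = {blk: random.sample(nodes, 2) for blk in blocks}
for blk, (copy1, copy2) in placement.items():
    print(blk, "->", copy1, copy2)
```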
DArch embraces the idea of parallel archiving, namely, multiple encoding nodes share the responsibility of parity-block generation. For each archival stripe, a master node and some provider nodes are chosen according to data locality. A master node acts as the encoding node for a specific archival stripe. Specifically, given a stripe, the master node should be the node that stores the largest number of source data blocks in the stripe. For example, Fig. 3 illustrates that the two archival stripes are processed by two encoding nodes, N1 and N5. After a master node is selected for an archival stripe, DArch nominates the associated data-block provider nodes as follows: a higher priority is assigned to a node storing a larger number of source data blocks; the master node fetches data blocks from the remote node with the highest priority first, and so forth.
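The master-selection and provider-ranking rules above can be summarized by the following sketch (our own illustration; `placement` is a hypothetical map from a block to the nodes holding its replicas, and ties are broken by first occurrence).

```python
from collections import Counter

def select_master_and_providers(stripe_blocks, placement):
    """Pick the encoding (master) node and rank provider nodes for a stripe."""
    # Count how many source blocks of the stripe each node stores.
    holders = Counter()
    for blk in stripe_blocks:
        for node in placement[blk]:
            holders[node] += 1
    master = max(holders, key=holders.get)     # node with most local blocks

    # Blocks the master still needs to fetch from remote provider nodes.
    missing = [blk for blk in stripe_blocks if master not in placement[blk]]
    providers = Counter()
    for blk in missing:
        for node in placement[blk]:
            providers[node] += 1
    # Higher priority for nodes that can supply more of the missing blocks.
    order = sorted(providers, key=providers.get, reverse=True)
    return master, order

stripe1 = ["D1,1", "D1,2", "D1,3", "D1,4"]
placement = {"D1,1": ["SN1", "SN4"], "D1,2": ["SN1", "SN7"],
             "D1,3": ["SN6", "SN2"], "D1,4": ["SN6", "SN3"]}
print(select_master_and_providers(stripe1, placement))
# ('SN1', ['SN6', 'SN2', 'SN3'])  -- SN6 ranked first: it supplies both missing blocks
```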
In this study, we investigate optimization approaches on the basis of DArch, which is capable of accomplishing erasure-coded archival for replicas managed by random data placement strategies. After a thorough analysis, we observe that non-sequential disk reads coupled with imbalanced I/O loads suppress overall archiving performance. To alleviate these bottlenecks, we propose to incorporate a prefetching mechanism into the data-block-reading stage to enhance disk I/O throughput (see Section 3), and to adopt a load-balancing strategy that evenly distributes archival tasks among multiple nodes to increase the utilization of network bandwidth (see Section 4).
3. PREFETCHING-ENABLED ARCHIVING
3.1. Performance bottlenecks in data archival
In a handful of replica-based storage clusters (e.g. GFS [8], HDFS [9] and QFS [24]), files are organized in the form of large
[Figure 3: replicas of blocks D1,1–D1,4 and D2,1–D2,4 spread across storage nodes SN1–SN8; SN1 (*) is the encoding node for stripe 1 and SN5 (#) is the encoding node for stripe 2.]
FIGURE 3. Distributed archiving scheme (DArch) enables multiple encoding nodes (e.g. N1 and N5) to share the data archival responsibility for all archival stripes.