Regenerating Code Design Based on Binary Cyclic Codes for Distributed Storage


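"This paper presents a design framework of regenerating codes for distributed storage systems. The framework uses binary additions and bit-wise cyclic shifts as its basic operations, both for encoding and for node repair. The method can be regarded as a concatenated coding scheme in which the outer code is a binary cyclic code and the inner code is a regenerating code that uses the binary cyclic code as its alphabet set. Its advantage is that encoding and repair of a failed node have low computational complexity. As the size of the data file tends to infinity, the method asymptotically achieves the fundamental tradeoff curve between storage and repair bandwidth."

In distributed storage systems, regenerating codes are an important class of erasure-correcting codes designed for efficient node repair. They balance storage overhead against repair cost, which matters most in large data centers. With a conventional erasure code such as a Reed-Solomon code (as used in RAID-style arrays), repairing a failed node may require downloading a large amount of data, whereas a regenerating code allows the failed node to be repaired with less communication bandwidth.

The scheme proposed in the paper builds a concatenated structure on top of a binary cyclic code. A binary cyclic code is a linear block code that is closed under cyclic shifts of its codewords and is easy to implement in hardware. It is usually defined by a generator polynomial and can be encoded with modulo-2 additions and shifts. In the concatenation, the outer code is a binary cyclic code and the inner code is a regenerating code that uses the binary cyclic code as its alphabet set. Encoding and node repair are therefore cheap, because they need only binary operations and no finite-field multiplication or division.

The authors prove that, as the size of the data file tends to infinity, the scheme asymptotically achieves the fundamental tradeoff between per-node storage and repair bandwidth. This tradeoff curve is a key figure of merit in distributed storage design, since it describes the best achievable combinations of storage overhead and repair traffic under the reliability requirement. By exploiting the structure of binary cyclic codes, the scheme lowers the computational complexity of node repair while keeping repair bandwidth close to the optimum, which is relevant to large-scale, high-availability cloud storage.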

Regenerating Codes over a Binary Cyclic Code

Kenneth W. Shum‡, Hanxu Hou§†, Minghua Chen, Huanle Xu, and Hui Li†∗

‡ Institute of Network Coding, the Chinese University of Hong Kong

§ Department of Information Engineering, the Chinese University of Hong Kong

† Shenzhen Eng. Lab of Converged Networks Tech., Shenzhen Key Lab of Cloud Computing Tech. and App., Peking University Shenzhen Graduate School

Abstract— We present a design framework of regenerating codes for distributed storage systems which employ binary additions and bit-wise cyclic shifts as the basic operations. The proposed coding method can be regarded as a concatenated coding scheme with the outer code being a binary cyclic code, and the inner code a regenerating code utilizing the binary cyclic code as the alphabet set. The advantage of this approach is that encoding and repair of a failed node can be done with low computational complexity. It is proved that the proposed coding method can achieve the fundamental tradeoff curve between the storage and repair bandwidth asymptotically when the size of the data file is large.

I. INTRODUCTION

Regenerating codes are a class of erasure-correcting codes introduced by Dimakis et al. in [1], with the aim of efficient repair of storage nodes. A data file is encoded and distributed to n storage nodes, such that the file can be decoded from any k of them. Furthermore, upon the failure of a storage node, we want to repair the failed node by downloading some data from any d surviving nodes, with as little data sent to the new node as possible. The number of data packets sent to the new node during the repair process is an important metric of the efficiency of node repair, and is called the repair bandwidth in [1].

We differentiate two modes of repair. The first is called exact repair and the second functional repair. In exact repair, the content of the new node is required to be the same as that of the failed node. In functional repair, the content of the new node need not be the same as that of the failed one, but the property that any k nodes are sufficient to decode the original file must be maintained. It is shown in [1] that the minimization of repair bandwidth for functional repair is closely related to the single-source multicast problem in network coding theory. After formulating the problem using an information flow graph, a fundamental tradeoff between the amount of storage per node and the repair bandwidth is established. For exact repair, some recent results on the fundamental limit of repair bandwidth can be found in [2]. In the remainder of this paper, we focus on functional repair.
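As a rough numerical illustration of the tradeoff from [1]: with file size B, per-node storage α, and β units downloaded from each of d helper nodes (repair bandwidth γ = dβ), the cut-set bound reads B ≤ Σ_{i=0}^{k−1} min{α, (d − i)β}. The sketch below evaluates the two extreme points of that curve, minimum storage and minimum bandwidth. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

def cut_set_capacity(alpha, beta, k, d):
    """Right-hand side of the cut-set bound: sum of min(alpha, (d - i) * beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))

def msr_point(B, k, d):
    """Minimum-storage point: alpha = B/k, gamma = d*B / (k*(d - k + 1))."""
    alpha = Fraction(B, k)
    gamma = Fraction(d * B, k * (d - k + 1))
    return alpha, gamma

def mbr_point(B, k, d):
    """Minimum-bandwidth point: alpha = gamma = 2*d*B / (k*(2*d - k + 1))."""
    gamma = Fraction(2 * d * B, k * (2 * d - k + 1))
    return gamma, gamma

if __name__ == "__main__":
    B, k, d = 1, 2, 3  # hypothetical parameters chosen only for illustration
    for name, (alpha, gamma) in [("MSR", msr_point(B, k, d)),
                                 ("MBR", mbr_point(B, k, d))]:
        beta = gamma / d
        print(name, alpha, gamma, cut_set_capacity(alpha, beta, k, d))
```

At both points the cut-set capacity equals B exactly, which is what it means for a point to lie on the tradeoff curve.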

This work was partially supported by the National Basic Research Program of China (No. 2012CB315904), NSFC61179028, by a grant from the University Grants Committee of Hong Kong Special Administrative Region, China (Project No. AoE/E-02/08), and by the Shenzhen Key Laboratory of Network Coding Key Technology and Application, Shenzhen, China (ZSDY20120619151314964).

* Corresponding author.

In [3], the existence of linear network codes achieving all points on the fundamental tradeoff curve for functional-repair regenerating codes is shown. The construction relies on finite-field arithmetic and, as in applications of linear network codes to the single-source multicast problem in general, the underlying finite field must be sufficiently large. However, multiplication and division in a finite field are costly to implement in software or hardware. In the literature on coding for disk arrays, the computational complexity is reduced by replacing finite-field arithmetic with simple bit-wise operations. For example, in [4], a maximum-distance separable (MDS) code with a convolutional code as alphabet set is introduced by Piret and Krol. In [5], Blaum and Roth proposed a construction of array codes based on the ring of polynomials with binary coefficients modulo $1 + x + \cdots + x^{p-1}$ for some prime number p. A similar approach was considered by Xiao et al. in [6]. Motivated by these constructions of low-complexity array codes, a class of regenerating codes utilizing XOR operations and bit-wise shifts was proposed recently in [7]. The objective of this paper is to introduce another class of regenerating codes which enables repair by XOR and bit-wise cyclic shifts.
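To make "XOR and bit-wise cyclic shifts" concrete, here is a minimal sketch that represents a binary polynomial of degree below m as an m-bit integer (bit i holds the coefficient of x^i). It assumes the ring of binary polynomials modulo x^m + 1, in which multiplying by x is a cyclic shift. That is an illustrative choice and not necessarily the exact ring used in [5] or in the later sections of this paper; the helper names are hypothetical.

```python
def cyclic_shift(word, m, t=1):
    """Cyclically shift an m-bit word by t positions.

    Under the assumed representation this is multiplication by x^t
    modulo x^m + 1 over GF(2).
    """
    t %= m
    mask = (1 << m) - 1
    return ((word << t) | (word >> (m - t))) & mask

def poly_mul_mod(a, b, m):
    """Multiply binary polynomials a and b modulo x^m + 1.

    Only cyclic shifts and XORs are used: one shifted copy of `a`
    is XORed in for every nonzero coefficient of `b`.
    """
    result = 0
    for t in range(m):
        if (b >> t) & 1:
            result ^= cyclic_shift(a, m, t)
    return result

if __name__ == "__main__":
    m = 5
    a = 0b10011                                # 1 + x + x^4
    print(bin(cyclic_shift(a, m)))             # x * a mod (x^5 + 1)
    print(bin(poly_mul_mod(0b11, 0b11, m)))    # (1 + x)^2 = 1 + x^2
```

No multiplication table or field inversion appears anywhere, which is the source of the complexity savings the paragraph above refers to.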

After reviewing some preliminaries on binary cyclic codes in Section III, we show in Section IV that this family of regenerating codes can operate arbitrarily close to the fundamental tradeoff curve between storage and repair bandwidth. In Section V, we compare the computational complexity with that of functional-repair regenerating codes over a finite field.

II. A MOTIVATING EXAMPLE

The following example of a storage code illustrates the basic ideas. Suppose that we want to store some information bits in four storage nodes, such that we can recover the information bits from any two nodes. Nodes 1 and 2 store the information bits in uncoded format, and nodes 3 and 4 store some parity-check bits. The information bits are divided into groups of 2(m − 1) bits, for some positive odd integer m. Each group of 2(m − 1) bits is called a data chunk. As the data chunks are processed in the same manner, we focus on one data chunk. We divide the 2(m − 1) information bits into two equal parts, each consisting of m − 1 bits. Let the bits in the first part be b(1, 0), b(1, 1), . . . , b(1, m − 2), and the bits in the second part be b(2, 0), b(2, 1), . . . , b(2, m − 2). For i = 1, 2, let

$$b(i, m-1) := \sum_{j=0}^{m-2} b(i, j)$$
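The definition of b(i, m − 1) can be sketched in a few lines, assuming the sum is taken modulo 2 (the usual meaning for parity bits; the excerpt does not state it explicitly). The function name is hypothetical.

```python
def append_parity(bits):
    """Extend m-1 information bits to m bits by appending their mod-2 sum."""
    parity = 0
    for b in bits:
        parity ^= b          # XOR accumulates the sum modulo 2
    return list(bits) + [parity]

if __name__ == "__main__":
    part = [1, 0, 1, 1]      # m - 1 = 4 information bits, so m = 5 (odd)
    word = append_parity(part)
    print(word, sum(word) % 2)
```

Under this assumption every extended word has an even number of ones, so the m bits b(i, 0), . . . , b(i, m − 1) always sum to zero modulo 2.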

2014 IEEE International Symposium on Information Theory
