IEEE
Proof
IEEE COMMUNICATIONS LETTERS 1
Lowest Density MDS Array Codes of Distance 3
Zhijie Huang, Hong Jiang, Fellow, IEEE, Ke Zhou, Member, IEEE, Yuhong Zhao, and Chong Wang
Abstract—We present a new family of maximum-distance sep-
arable (MDS) array codes of distance 3, called S(ymmetry)-Code,
which has the properties of: 1) optimality in encoding, decoding
and update; 2) code length of either p or p − 1 with p being an
odd prime; 3) requiring less I/O cost than most existing lowest
density MDS array codes during single-erasure recovery; and
4) approaching the lower bound in I/O cost of single-erasure
recovery when the code length is small.
Index Terms—Array codes, maximum-distance separable
(MDS) codes, lowest density, storage systems.
I. INTRODUCTION
A
RRAY codes have been studied extensively in the past
thirty years due to their practical utility in communication
and storage systems. The main advantage of these codes is that
the encoding and decoding procedures use only simple XOR
operations, thus are more efficient than the well-known Reed–
Solomon codes in terms of computational complexity [1]. In
array codes, symbols are column vectors and hence each code-
word is a two-dimensional array. An array code of size m × n
can be regarded as a one-dimensional code defined over the
Abelian group G(2
m
).LetN be the number of its codewords,
then the dimension k can be defined as k = log
2
m
N, as with
the usual one-dimensional codes. Accordingly, the distance d
of the code is defined over G(2
m
), i.e., each column is regarded
as a symbol , and d refers to the column-wise distance. If the
Singleton bound is achieved, i.e., d = n−k+1, then the array
code is called a maximum-distance separable (MDS) array code.
When array codes are applied to storage systems, the n
columns (symbols) of a codeword are stored in n different
disks (nodes) respectively. Since every parity bit is constructed
from a certain subset of the data bits of the array, whenever
a data bit is changed, the parity bits to which it contributes
must be updated (read-modify-write) accordingly. Therefore, in
addition to the traditional encoding and decoding complexity,
another important performance metric of array codes is update
complexity—the average number of parity bits that must be
updated when a data bit is altered. Update complexity directly
dictates the communication overhead of small-write operations,
hence is particularly crucial when the codes are used in storage
Manuscript received February 8, 2015; accepted July 26, 2015. This work
was supported in part by the National Basic Research Program (973 Program)
of China under Grant 2011CB302305, by the NSFC under Grant 61232004,
and by the U.S. National Science Foundation under Grants CNS-1016609 and
CNS-1116606. The associate editor coordinating the review of this paper and
approving it for publication was H. Saeedi. (Corresponding author: Ke Zhou.)
Z. Huang, K. Zhou, Y. Zhao, and C. Wang are with the School of Computer
Science and Technology, Huazhong University of Science and Technology,
Wuhan 430074, China (e-mail: jayzy_huang@hust.edu.cn; k.zhou@hust.edu.
cn; yuhongzhao@hust.edu.cn; C_Wang@hust.edu.cn).
H. Jiang is with the Department of Computer Science and Engineering, Uni-
versity of Texas at Arlington, TX 76010 USA (e-mail: hong.jiang@uta.edu).
Digital Object Identifier 10.1109/LCOMM.2015.2464379
applications that require frequent updates of data, particularly
in a distributed storage system. In order to minimize the update
complexity, a subclass of MDS array codes, called lowest
density MDS array codes, have gained much attention in the
past few years [2]–[6]. These codes can achieve the lower
bound of update complexity, at the expense of certain limit in
the code length.
In recent years, array codes used in storage systems also
take the I/O cost of single-erasure recovery (IOCSER) as an
additional performance metric. IOCSER refers to the amount
of surviving information that needs to be accessed in order to
rebuild exactly the lost information, and it deserves attention
due to the following reasons: a) 99.75% of the actual recoveries
in the production environment are caused by single erasures [7],
b) IOCSER directly dictates the communication overhead and
the reconstruction time of single-erasure recovery. Researchers
have been trying to reduce the IOCSER in two ways: one is to
develop optimal single-erasure recovery algorithms for existing
array codes [8], [9], and the other is to design new codes aiming
to minimize the IOCSER, e.g., [10]–[12]. The latter can reduce
the IOCSER much more than the former, but the price is that
the new codes are usually unsatisfactory in terms of encoding,
decoding and update.
It is worth stressing that, none of the existing MDS array
codes is perfect in that every code requires certain trade-offs in
different performance metrics, according to its main application
scenarios. In this letter, we construct a new family of MDS array
codes of distance 3, called S(ymmetry)-Code. S-Code is a class
of lowest density MDS array codes with a regular geometrical
structure, so it enjoys the inherent optimal properties of lowest
density MDS array codes and it is possible to work out a spe-
cially optimized single-erasure decoding algorithm. In general,
it has the following attractive properties:
• Optimal encoding, decoding, and update complexity.
• Less IOCSER than existing lowest density MDS array
codes that have similar code length limit.
• Approaching the lower bound of IOCSER when the code
length is small. In particular, the S-Code of length 4
exactly achieves the theoretical lower bound of IOCSER.
II. S
YMMETRIC-CODE (S-CODE)
In S-Code, a codeword is a (p − 1) × p array containing
(p − 1) × (p − 2) data bits and 2(p − 1) parity bits, where p is
an odd prime. The parity bits are placed along two symmetric
diagonals, and each parity bit is constructed from the data bits
along the diagonal (anti-diagonal) that traverses it.
A. Encoding Procedure
Throughout this letter, we use b
i,j
to denote the ith bit in
the jth column, where 0 ≤ i ≤ p − 2 and 0 ≤ j ≤ p − 1, and
1558-2558 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.