DCS5: Diagonal Coding Scheme for Enhancing the
Endurance of SSD-based RAID-5 Systems
Yubiao Pan¹, Yongkun Li², Yinlong Xu², Weitao Zhang¹
School of Computer Science & Technology
University of Science & Technology of China
Hefei, China
E-mail: ¹{pyb, avenger}@mail.ustc.edu.cn, ²{ykli, ylxu}@ustc.edu.cn
Abstract—Solid-state drives (SSDs) have been widely deployed
in large-scale storage systems. To guarantee high reliability,
SSD-based storage systems still require data redundancy
schemes, e.g., RAID schemes. Traditional RAID-5 offers
load balancing and I/O parallelism, so it remains the first
choice for enhancing the reliability of SSD arrays. However,
some SSDs under the RAID-5 configuration may age much faster
than others because of the non-uniformity of workloads. These
SSDs wear out quickly, which decreases the endurance of the
whole SSD-based RAID array. To address this problem, we develop
a diagonal coding scheme, DCS5, to improve the wear-leveling
among devices in an SSD-based RAID-5 array. DCS5 can efficiently
improve the array endurance if accesses are aligned with the
stripe size, i.e., when data symbols in the same stripe receive
the same number of writes, while the number may differ between
stripes. To relax this assumption, we further propose an
enhanced scheme called DCS5+. DCS5+ improves the wear-leveling
among devices under general access patterns by responding
differently to different kinds of requests. We conduct extensive
trace-driven evaluations based on real-world workloads, and the
results show that our coding scheme efficiently enhances the
endurance of SSD-based RAID-5 arrays.
Keywords-Solid-state Drives; RAID; Endurance; System-level
Wear-leveling
I. INTRODUCTION
Solid-state drives (SSDs) have been revolutionizing the
traditional storage architecture and are being widely deployed in
large-scale storage systems, as they provide the same host
interface as hard disk drives (HDDs) while offering multiple
improvements over HDDs, such as higher I/O performance,
lower power consumption, and higher shock resistance.
However, SSDs also have multiple constraints. One major
concern is that each block in an SSD can only tolerate a
limited number of erasures. The typical value of this number is
10K for multi-level cell (MLC) SSDs, and it may even drop
to several thousand for triple-level cell (TLC) SSDs [1, 2].
Thus, single SSDs usually employ wear-leveling techniques
to balance the erasures on blocks inside SSDs [3, 4, 5, 6]. On
the other hand, bit errors are still common in SSDs, especially
for MLC SSDs and TLC SSDs, and they can be caused by
read disturb, write disturb, and even data retention [2, 7, 8]. In
particular, the bit error rate increases as the number of erasures
performed on the SSD increases, and the increase may become
sharp when SSDs approach their erasure limit. Even worse,
as the density of flash cells increases to enlarge the capacity
of single SSDs, the endurance of SSDs continues to decrease,
while the bit error rate further increases [2].
To enhance the endurance and reliability of SSDs, both
wear-leveling techniques and error correction codes (ECCs)
have been developed, but they only apply within single SSDs.
For SSD-based arrays, device-level redundancy schemes like
redundant array of independent disks (RAID) [9] become
necessary. In particular, RAID-5, which places parities evenly
across multiple devices, not only improves the system
reliability, but also provides high I/O throughput as it achieves
both load-balancing and I/O parallelism.
Fig. 1: Data organization in a RAID-5 system.
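As a rough illustration of the organization in Fig. 1, the following minimal Python sketch builds a five-device layout and counts writes per symbol. It rests on several assumptions that are not stated in the paper: the parity position is taken to rotate leftward by one device per stripe (extrapolated from the two stripes shown in the figure), the parity is taken to be the bytewise XOR of the data symbols (the conventional RAID-5 encoding), and every data-symbol update is taken to trigger exactly one parity update. All function names (`parity_device`, `layout`, `xor_parity`, `count_writes`) are hypothetical and introduced only for this example.

```python
from collections import Counter
from functools import reduce
from operator import xor


def parity_device(stripe, n_devices):
    """Device index holding the parity of `stripe`.

    Assumed rule: parity starts on the last device and moves one
    device to the left per stripe (matches the two stripes of Fig. 1).
    """
    return (n_devices - 1 - stripe) % n_devices


def layout(n_stripes, n_devices):
    """Return one row per stripe: 'C' marks parity, data symbols are numbered from 1."""
    rows, next_symbol = [], 1
    for stripe in range(n_stripes):
        p = parity_device(stripe, n_devices)
        row = []
        for dev in range(n_devices):
            if dev == p:
                row.append("C")
            else:
                row.append(str(next_symbol))
                next_symbol += 1
        rows.append(row)
    return rows


def xor_parity(symbols):
    """XOR of integer symbols (assumed encoding). XOR-ing the parity with
    the surviving data symbols rebuilds a single lost symbol."""
    return reduce(xor, symbols, 0)


def count_writes(updated_devices, stripe, n_devices):
    """Count writes per (stripe, device) when each listed data symbol is
    updated once, independently, with one parity update per data update."""
    writes = Counter()
    p = parity_device(stripe, n_devices)
    for dev in updated_devices:
        writes[(stripe, dev)] += 1   # the data symbol itself
        writes[(stripe, p)] += 1     # its parity symbol
    return writes


if __name__ == "__main__":
    for row in layout(2, 5):
        print(" ".join(row))
    data = [0x0A, 0x1B, 0x2C, 0x3D]
    c = xor_parity(data)
    # Rebuild the third data symbol from the other three plus the parity.
    rebuilt = xor_parity(data[:2] + data[3:] + [c])
    print(rebuilt == data[2])
    print(count_writes([0, 1, 2, 3], stripe=0, n_devices=5))
```

Under these assumptions, updating the four data symbols of stripe 0 once each leaves every data symbol with one write and the parity symbol with four, which is the kind of imbalance between parity and data symbols that this section is concerned with.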
An SSD-based RAID-5 array can reconstruct the symbols lost in a
single device failure. A RAID-5 array is divided into many stripes,
each of which consists of several data symbols and one parity
symbol that is encoded from the data symbols in the same stripe.
Symbols are striped across multiple devices in a round-robin
manner. For ease of presentation, C denotes a parity symbol and
D denotes a data symbol in this paper. For example, in Fig. 1,
data symbols 1, 2, 3, 4, 5, 6, 7, 8 are located in $D_{0,0}$,
$D_{0,1}$, $D_{0,2}$, $D_{0,3}$, $D_{1,0}$, $D_{1,1}$, $D_{1,2}$,
and $D_{1,4}$, respectively, and the corresponding two parity
symbols are located in $C_{0,4}$ and $C_{1,3}$. Note that
parity symbols receive more writes in RAID arrays because
updating a data symbol requires an update to the corresponding
parity symbol. For instance, if $D_{0,0}$, $D_{0,1}$, $D_{0,2}$,
and $D_{0,3}$ are each updated once independently, then the parity
symbol $C_{0,4}$ will be updated four times. Here we do not consider any
2014 9th IEEE International Conference on Networking, Architecture, and Storage
978-1-4799-4087-5/14 $31.00 © 2014 IEEE
DOI 10.1109/NAS.2014.16