CG-Resync: Conversion-Guided Resynchronization
for a SSD-based RAID Array∗
Letian Yi†, Jiwu Shu†§, Jiaxin Ou†, Weimin Zheng†
†Department of Computer Science and Technology, Tsinghua University, Beijing, China
†Tsinghua National Laboratory for Information Science and Technology, Beijing, China
lonat.front@gmail.com, shujw@tsinghua.edu.cn
Abstract—SSD-based RAID arrays have been widely adopted
in large-scale systems. One requirement on a RAID is to provide
data consistency, which can be compromised while write requests
are being served. NVRAM or on-storage logging can ensure
consistency, but these approaches are either very expensive or
substantially compromise performance. For an SSD-based RAID,
scanning the entire storage space during reboot after a crash
can restore consistency, but the resynchronization takes a long
time. To address the issue efficiently and cost-effectively,
we propose CG-Resync, a scheme that provides consistency
assurance for SSD-based RAIDs by leveraging the logging mechanism
that almost all SSDs already implement to accommodate flash's
out-of-place-write requirement. To identify the uncompleted writes
that leave stripes inconsistent, we use guided conversion in
managing the SSD's internal logs. In particular, only when a stripe
becomes consistent does CG-Resync allow the updated data on
the stripe to be removed from the log. We evaluate CG-Resync,
and experiments show that it provides improved RAID reliability
and availability upon a crash with little performance loss during
regular I/O operations.
Index Terms—SSD array; resynchronization; consistency
I. Introduction
A consistency problem arises when an SSD RAID array
crashes. We illustrate how consistency can be compromised
in Figure 1. The diagram depicts the steps involved in serving
a write request on a stripe of a RAID-5 array composed of
five SSDs. In the example, among the five sectors in the stripe,
Sector P is the parity sector. When the RAID carries out the
requested write to the S1 data sector, it must also update
the P parity sector in the stripe. As illustrated at Step 1, the
host issues a write request to update the data sector. Upon
receiving the request, the RAID recalculates a new parity for
the stripe at Step 2, and issues two writes, to S1 on SSD1 and
to P on SSD5, at Steps 3 and 4, respectively. A write request
arriving at an SSD is first queued in the SSD's controller, and
is then scheduled to commit its data to the flash memory. The two
updates (of the data and the parity) are probably committed to
the flash memory at different times; in the example, the parity
is committed on SSD5 at Step 5 and the data is committed on
SSD1 at Step 6. Between Step
§Corresponding author: Jiwu Shu (shujw@tsinghua.edu.cn).
∗This work is supported by the National Natural Science Foundation
of China (Grant Nos. 60925006 and 61232003), the National High Technology
Research and Development Program of China (Grant No. 2013AA013201),
the research fund of Tsinghua-Tencent Joint Laboratory for Internet
Innovation Technology, and the Tsinghua University Initiative Scientific
Research Program.
[Figure 1 graphic: a RAID-5 array of five SSDs with numbered step labels: (1) host writes S1; (2) computing a new parity; (3) and (4) issuing a write to an SSD; (5) parity is committed on the flash of an SSD; (6) data is committed on the flash of an SSD.]
Fig. 1: Illustration of servicing a write request from the host
in the RAID-5 array.
5 and Step 6, the stripe is in a window of vulnerability. If the
array crashes during this time window, a concurrent failure of
any of the remaining SSDs (SSD2, SSD3, or SSD4) would
result in data loss: while the stripe's data and parity
stay in an inconsistent state, the data residing on a failed SSD
cannot be reconstructed. As the vulnerability is due to the crash
of the array's component SSDs, we name this failure scenario
the array crash model.
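The window of vulnerability can be made concrete with a small sketch. It assumes the standard RAID-5 XOR parity and a toy stripe of two-byte sectors; the helper names (`xor_blocks`, `rebuild_s3`) and the sector contents are illustrative choices, not taken from the paper.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy stripe (assumed values): data sectors S1..S4 on SSD1..SSD4, parity P on SSD5.
old_data = [b"\x11\x11", b"\x22\x22", b"\x33\x33", b"\x44\x44"]
old_parity = xor_blocks(*old_data)

# Step 2: recompute the parity for the new S1 (read-modify-write form).
new_s1 = b"\xaa\xaa"
new_parity = xor_blocks(old_parity, old_data[0], new_s1)

def rebuild_s3(s1: bytes, s2: bytes, s4: bytes, parity: bytes) -> bytes:
    """Reconstruct S3 from the surviving sectors after SSD3 fails."""
    return xor_blocks(s1, s2, s4, parity)

# Between Step 5 and Step 6: P is already new, S1 is still old.
in_window = rebuild_s3(old_data[0], old_data[1], old_data[3], new_parity)
# After Step 6: both S1 and P are new, so the stripe is consistent again.
after_commit = rebuild_s3(new_s1, old_data[1], old_data[3], new_parity)

print(in_window == old_data[2])     # False: S3 cannot be reconstructed
print(after_commit == old_data[2])  # True: reconstruction succeeds
```

Inside the window the reconstruction mixes an old S1 with a new P, so the rebuilt sector does not match the lost S3; once both updates are on flash the same computation recovers it.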
If the RAID controller resides in the host machine, or the
SSD array is a software RAID, incidents such as operating system
crashes and power outages can also lead to the vulnerability. In
the example, if the RAID controller crashes between Steps 3
and 4 while all SSDs remain active, the stripe will be left in an
inconsistent state after S1 is committed on SSD1. We define
this failure scenario as the RAID crash model.
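For comparison, the conventional recovery under the RAID crash model is a full scan that recomputes every stripe's parity from its data. The sketch below illustrates that baseline only, not CG-Resync itself; it assumes XOR parity, an in-memory list of stripes, and the hypothetical names `Stripe` and `full_scan_resync`.

```python
from functools import reduce
from typing import List, Tuple

# Hypothetical in-memory stripe: (data sectors, parity sector).
Stripe = Tuple[List[bytes], bytes]

def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def full_scan_resync(stripes: List[Stripe]) -> Tuple[List[Stripe], List[int]]:
    """Visit every stripe, recompute its parity from the data sectors,
    and report the indices of stripes whose stored parity was stale."""
    repaired: List[Stripe] = []
    stale: List[int] = []
    for index, (data, parity) in enumerate(stripes):
        expected = xor_blocks(*data)
        if expected != parity:
            stale.append(index)
        repaired.append((data, expected))
    return repaired, stale

# Stripe 0 is consistent; stripe 1 has a stale parity (data written, parity not).
stripes = [
    ([b"\x01", b"\x02"], b"\x03"),
    ([b"\x0f", b"\xf0"], b"\x00"),
]
repaired, stale = full_scan_resync(stripes)
print(stale)  # [1]
```

The cost of this approach is visible in the loop: every stripe is read even though only the stripes touched by uncompleted writes can be inconsistent.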
To facilitate the presentation, we formally define a transaction
of a RAID array in this context as a unit of work that
comprises all the SSD writes needed to process a single host write.
In the above example, a transaction includes the writes to S1 and
P. After all SSD writes of the transaction are completed at
Step 6, the transaction is considered completed. If there are
uncompleted transactions in the event of a crash, the array is
in a vulnerable state.
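The transaction definition can be illustrated with a minimal bookkeeping sketch. It only models the definition: the class and method names (`Transaction`, `TransactionTable`, `begin`, `commit`, `uncompleted`) are hypothetical, and the table lives in volatile memory, whereas CG-Resync relies on the SSDs' internal logs.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple

# A single SSD write, identified as (SSD name, sector name), e.g. ("SSD1", "S1").
Write = Tuple[str, str]

@dataclass
class Transaction:
    """All SSD writes needed to process one host write."""
    tx_id: int
    pending: Set[Write] = field(default_factory=set)

    @property
    def completed(self) -> bool:
        # Completed once every SSD write has been committed to flash.
        return not self.pending

class TransactionTable:
    """Tracks transactions whose SSD writes are not all committed yet."""

    def __init__(self) -> None:
        self._txs: Dict[int, Transaction] = {}

    def begin(self, tx_id: int, writes: Iterable[Write]) -> None:
        self._txs[tx_id] = Transaction(tx_id, set(writes))

    def commit(self, tx_id: int, write: Write) -> None:
        tx = self._txs[tx_id]
        tx.pending.discard(write)
        if tx.completed:
            del self._txs[tx_id]

    def uncompleted(self) -> List[int]:
        # A non-empty result at crash time means the array is vulnerable.
        return sorted(self._txs)

table = TransactionTable()
table.begin(1, [("SSD1", "S1"), ("SSD5", "P")])
table.commit(1, ("SSD5", "P"))   # Step 5: parity committed
print(table.uncompleted())       # [1]
table.commit(1, ("SSD1", "S1"))  # Step 6: data committed
print(table.uncompleted())       # []
```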
II. The Design of CG-Resync
A. An Overview
Figure 3 shows a diagram of CG-Resync in the SSD-based array
architecture. To identify an uncompleted transaction, we need
to maintain the relationship between the updates and their
corresponding transactions. To this end, the Tx shepherd module
978-1-4799-2987-0/13/$31.00 ©2013 IEEE