An Experimental Study of Redundant Array of Independent SSDs and Filesystems
Yuxuan Xing¹, Ya Feng¹, Songping Yu¹, Zhengguo Chen¹, Fang Liu¹, Nong Xiao¹,²
¹ State Key Laboratory of High Performance Computing
National University of Defense Technology, NUDT
Changsha, China
² School of Data and Computer Science
Sun Yat-sen University
Guangzhou, China
xingyuxuan_2012@nudt.edu.cn
Abstract—Solid state disks (SSDs) become more and more
popular in personal devices and data centers. Flash chips can
be packaged in hard disk drive (HDD) form factors and
provide the same interface as HDDs, which makes it easy for
SSDs to replace HDDs in existing storage systems. PCIe-based
SSDs provide higher I/O performance, but they are still
relatively expensive. This paper studies the feasibility of
Redundant Arrays of Independent SSDs (RAIS) with different
filesystems. We comprehensively analyze the performance of
RAIS built from SATA SSDs and from PCIe SSDs separately.
We investigate different RAIS configurations (RAIS0, RAIS5,
and RAIS6) and filesystems under various I/O access patterns.
Finally, we present several key findings and recommendations
for building RAIS.
Keywords-Solid state disks; storage systems; Redundant
Arrays of Independent SSDs; I/O performance; filesystems
I. INTRODUCTION
Hard disk drives (HDDs) have been the leading medium in
storage systems for decades. HDD capacity keeps growing and
the price per byte keeps falling. However, I/O performance
has improved slowly because the mechanical design of HDDs
offers little internal parallelism. Unfortunately, this
situation is unlikely to change for a long time.
In recent years, new storage media such as flash, PCM, and
FeRAM have emerged. Flash memory has been widely deployed
in modern storage systems because of its advantageous
properties, including non-volatility, high I/O performance,
low power consumption, and technological maturity.
Flash-based solid state disks (SSDs), especially those built
from NAND flash, are becoming another leading medium in
storage systems. Major vendors such as EMC, NetApp, and IBM
have offered SSD-based storage products to both enterprise
and consumer markets for years. Some research institutes
also use SSDs to accelerate the I/O performance of the
compute nodes in the TianHe supercomputer [1]. Facebook,
Google, Baidu, and Alibaba have reported deploying many
SSDs in their data centers to meet the need for highly
reliable and available storage with high I/O
throughput [2].
Since David Patterson [3] first described how to
construct redundant arrays of independent disks (RAID) in
1988, RAID has been the dominant technology in enterprise
storage systems. Although SSDs can easily replace HDDs in an
array, building a highly reliable, high-performance
redundant array of independent SSDs (RAIS) [4] still requires
care. RAIS has its own design challenges because of
characteristics specific to SSDs. In addition, the filesystem
plays an important role in the storage system because it is a
key part of the I/O stack. Therefore, it is important to
evaluate comprehensively how the design of the RAIS and the
choice of filesystem together affect performance.
In our study, we use two kinds of SSDs (listed in Table 1)
to construct three types of RAIS: RAIS0, RAIS5, and RAIS6.
We choose these RAIS configurations because they are the
most widely used schemes in practice. We also choose four
filesystems to test with RAIS: XFS, ext4, F2FS, and
btrfs [5][6][7]. The main contributions of our work are as
follows:
• We analyze in detail the performance of a single raw
device by varying I/O access patterns and queue depth,
and examine its I/O behavior under four filesystems.
• We explore in depth the I/O characteristics of RAIS0,
RAIS5, and RAIS6, each built from two different types
of SSD separately.
• We conduct a comprehensive comparison of RAIS schemes
under different filesystems and derive design insights
for constructing cost-effective RAIS.
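As a rough illustration of how the three RAIS levels studied here differ, the following Python sketch computes the usable capacity and the number of device failures each level tolerates, using the standard RAID 0/5/6 rules (zero, one, and two devices' worth of parity per stripe). The class name `ArrayLayout`, the minimum device counts, and the example of four 100 GB devices are assumptions made for illustration only; they are not the configuration evaluated in this paper.

```python
from dataclasses import dataclass

# Devices' worth of parity per stripe under the standard RAID 0/5/6 rules.
PARITY_DEVICES = {"RAIS0": 0, "RAIS5": 1, "RAIS6": 2}
# Conventional minimum member counts (an assumption of this sketch).
MIN_DEVICES = {"RAIS0": 2, "RAIS5": 3, "RAIS6": 4}


@dataclass(frozen=True)
class ArrayLayout:
    """Hypothetical description of an array built from identical SSDs."""

    level: str
    devices: int
    device_capacity_gb: float

    def __post_init__(self) -> None:
        # Reject levels outside the three considered here.
        if self.level not in PARITY_DEVICES:
            raise ValueError(f"unsupported level: {self.level}")
        # Reject arrays too small for the chosen level.
        if self.devices < MIN_DEVICES[self.level]:
            raise ValueError(
                f"{self.level} needs at least {MIN_DEVICES[self.level]} devices"
            )

    @property
    def usable_capacity_gb(self) -> float:
        # Capacity left for data after setting parity space aside.
        data_devices = self.devices - PARITY_DEVICES[self.level]
        return data_devices * self.device_capacity_gb

    @property
    def fault_tolerance(self) -> int:
        # Number of simultaneous device failures the array survives.
        return PARITY_DEVICES[self.level]


if __name__ == "__main__":
    # Example values (4 devices of 100 GB each) are made up for illustration.
    for level in ("RAIS0", "RAIS5", "RAIS6"):
        layout = ArrayLayout(level, devices=4, device_capacity_gb=100.0)
        print(level, layout.usable_capacity_gb, layout.fault_tolerance)
```

Under these assumed values, RAIS0 exposes all 400 GB but tolerates no failure, RAIS5 exposes 300 GB and tolerates one failure, and RAIS6 exposes 200 GB and tolerates two. The sketch covers capacity and redundancy only; it says nothing about the I/O performance differences that the rest of this paper measures.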
Table 1: Storage device characteristics examined in our study
The rest of this paper is organized as follows. Section 2
presents an overview of SSDs, RAIS, and filesystems.
Section 3 describes the methodology and testbed. Section 4
analyzes the I/O performance in detail. Section 5 introduces
related work, and Section 6 concludes.
II. BACKGROUND
A. Solid State Disk
Flash memory is a non-volatile storage medium based on
semiconductor technology. It can be classified into two types:
NAND and NOR. NAND flash is more widely used than NOR flash
in mass storage devices because of its larger capacity and
lower cost.
Figure 1 shows a schematic of a flash package, which is
composed of one or more dies. Each die has separate chip
enable and ready/busy signals, so that one of the dies can
accept commands and data while the other is carrying out
another operation [8]. Thanks to the above features, NAND-
2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International
Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems
978-1-5090-4297-5/16 $31.00 © 2016 IEEE
DOI 10.1109/HPCC-SmartCity-DSS.2016.67