Wen ZHOU et al. Prober: exploiting sequential characteristics in buffer for improving SSDs write performance 953
In this paper, we make the following contributions.
First, to address the problem of cache pollution by LWdata,
we establish a block-level request sequence model, called
R-SEQ, and propose the Prober approach, which identifies
large sequential write requests at an early stage and labels
their pages as cold pages in the buffer. Combined with an
existing buffer management scheme, the SSD controller
migrates all cold pages to the tail of the queue and evicts
them from the buffer preferentially.
Second, to further reduce the response time, we propose an
active write-back scheme. It writes LWdata back in the
background and frees up buffer space for incoming requests,
which substantially reduces the average response time.
Third, we implement the Prober module and the active
write-back scheme on SSDsim. We conduct extensive experiments
under various real-world workloads with different ratios of
write requests, and examine the write buffer hit ratio, the
average response time and the number of erase operations. The
experimental results show that Prober identifies cold data
accurately and that our schemes are effective.
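To make the first two contributions concrete, the following minimal Python sketch shows how cold labels could interact with an LRU write buffer. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the class name ColdAwareBuffer, its methods and the cold flag are hypothetical, the underlying policy is assumed to be plain LRU, and the cold label is taken as an input rather than computed by Prober.

```python
from collections import OrderedDict


class ColdAwareBuffer:
    """LRU write buffer in which pages labeled cold are evicted first.

    Hypothetical sketch: the cold label is supplied by the caller.
    """

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        # lpn -> data; the FIRST item is the next eviction victim,
        # i.e. it plays the role of the "tail of the queue".
        self.pages = OrderedDict()
        self.cold = set()

    def write(self, lpn, data, cold=False):
        """Buffer a page; return the (lpn, data) pairs evicted to flash."""
        evicted = []
        if lpn not in self.pages and len(self.pages) >= self.capacity:
            evicted.append(self._evict())
        self.pages[lpn] = data
        if cold:
            self.cold.add(lpn)
            self.pages.move_to_end(lpn, last=False)  # eviction end
        else:
            self.cold.discard(lpn)
            self.pages.move_to_end(lpn)  # most recently used end
        return evicted

    def _evict(self):
        lpn, data = self.pages.popitem(last=False)
        self.cold.discard(lpn)
        return lpn, data

    def write_back_cold(self):
        """Flush every cold page, as a background task might when idle."""
        flushed = [(lpn, self.pages.pop(lpn)) for lpn in sorted(self.cold)]
        self.cold.clear()
        return flushed
```

In this sketch, a cold page is the first victim when space is needed, and write_back_cold() frees the same space ahead of time so that a later request does not have to wait for an eviction.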
The rest of this paper is organized as follows. Section 2
gives the background on SSDs and the motivation for Prober.
Section 3 details Prober-based buffer management, and
Section 4 presents an extensive evaluation of Prober.
Section 5 summarizes related work. Finally, Section 6
concludes the paper.
2 Background and motivation
In this section, we present the background on SSDs to facil-
itate our discussion and analyze the important observations
that motivate our Prober design.
2.1 SSD overview
Flash memory is widely used in the form of solid-state drives
(SSDs) in conventional machines. Inside an SSD, the flash
translation layer (FTL) plays an important role in managing
the DRAM buffer and the flash memory. In general, an FTL
consists of address mapping, wear leveling, garbage
collection and buffer replacement components. Some optional
functions, such as data compression/encryption and bad block
management, are also implemented in the FTL. Among them, the
address mapping scheme is the most important one and is the
foundation of the other algorithms.
According to mapping granularity, address mapping algorithms
are classified into page-level, block-level and hybrid
mapping. In page-level mapping [3,4], a given logical page
number (LPN) can be mapped to an arbitrary physical page
number (PPN) in the flash memory. However, the mapping table
of a page-level FTL is too large to be stored entirely in the
buffer, which leads to severe performance degradation. To
reduce the space overhead of the address mapping table,
block-level FTLs were proposed with a coarse-grained mapping
pattern, which records the mapping from logical block number
(LBN) to physical block number (PBN) and keeps the offset in
the physical block the same as that in the logical block.
A block-level FTL has lower space overhead than a page-level
FTL. However, the update operations generated by rewrite
requests incur many page migrations and add substantial
latency. To overcome these respective defects, researchers
proposed hybrid mapping schemes, which divide the blocks
into data blocks and log blocks and apply page-level and
block-level mapping to the different areas. Log blocks,
which store frequently updated pages, use page-level mapping,
while data blocks store consecutive data and adopt
block-level mapping. Since log blocks are consumed rapidly in
random-write applications, a hybrid FTL spends more time
converting log blocks into data blocks, which significantly
degrades performance. All of these FTL algorithms have been
studied extensively to overcome their flaws and improve
storage performance.
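The difference between page-level and block-level translation, and its effect on table size, can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and the geometry (64 pages per block, 4 KB pages, a 256 GiB device) is an assumption chosen for the arithmetic, not a configuration taken from this paper.

```python
PAGES_PER_BLOCK = 64  # assumed geometry, not from the paper


class PageLevelMap:
    """LPN -> PPN: any logical page may live in any physical page."""

    def __init__(self):
        self.table = {}

    def update(self, lpn, ppn):
        self.table[lpn] = ppn

    def translate(self, lpn):
        return self.table[lpn]


class BlockLevelMap:
    """LBN -> PBN: the offset inside the block is preserved."""

    def __init__(self):
        self.table = {}

    def update(self, lbn, pbn):
        self.table[lbn] = pbn

    def translate(self, lpn):
        # Split the logical page number into block number and offset,
        # then reuse the same offset inside the physical block.
        lbn, offset = divmod(lpn, PAGES_PER_BLOCK)
        return self.table[lbn] * PAGES_PER_BLOCK + offset


def table_entries(capacity_bytes, page_bytes, pages_per_block=PAGES_PER_BLOCK):
    """Return (page-level entries, block-level entries) for a device."""
    pages = capacity_bytes // page_bytes
    return pages, pages // pages_per_block
```

Under these assumed parameters, table_entries(256 * 2**30, 4096) gives 67,108,864 page-level entries against 1,048,576 block-level entries, a factor of 64. The price of the smaller table is visible in BlockLevelMap.translate: because the offset is fixed, updating a single page in place is impossible, and the rest of the block has to be migrated.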
Another important component of the FTL is the buffer
management module; buffer management has been widely studied
in databases and operating systems. The size of each buffer
entry can be a page or a block, depending on the address
mapping granularity. With a page-level FTL, the DRAM buffer
manages each page independently, so it accommodates more hot
pages and achieves higher write performance under random
access patterns. With a block-level FTL, pages belonging to
the same logical block are gathered in the buffer and written
to a physical block together when they are evicted. This
yields higher write performance under sequential access
patterns. In this paper, we adopt page-level granularity in
the DRAM buffer. However, our scheme can also be used with a
block-level FTL.
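The two buffer granularities described above can be contrasted with a short Python sketch. It assumes a plain LRU replacement policy and 64 pages per block; both are assumptions for illustration, and the class names PageBuffer and BlockBuffer are hypothetical rather than taken from SSDsim or from this paper.

```python
from collections import OrderedDict

PAGES_PER_BLOCK = 64  # assumed geometry, not from the paper


class PageBuffer:
    """LRU write buffer with one entry per logical page."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()  # lpn -> data, oldest first

    def write(self, lpn, data):
        """Buffer a page; return the (lpn, data) pairs flushed to flash."""
        flushed = []
        if lpn in self.pages:
            self.pages.move_to_end(lpn)  # write hit: refresh recency
        elif len(self.pages) >= self.capacity:
            flushed.append(self.pages.popitem(last=False))
        self.pages[lpn] = data
        return flushed


class BlockBuffer:
    """LRU write buffer whose entries group pages of one logical block."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.blocks = OrderedDict()  # lbn -> {offset: data}, oldest first
        self.used = 0  # number of buffered pages

    def write(self, lpn, data):
        """Buffer a page; evict whole page groups when over capacity."""
        lbn, offset = divmod(lpn, PAGES_PER_BLOCK)
        group = self.blocks.setdefault(lbn, {})
        self.blocks.move_to_end(lbn)
        if offset not in group:
            self.used += 1
        group[offset] = data
        flushed = []
        while self.used > self.capacity:
            victim_lbn, victim = self.blocks.popitem(last=False)
            self.used -= len(victim)
            # All pages of the victim block leave the buffer together.
            flushed.extend(
                (victim_lbn * PAGES_PER_BLOCK + off, d)
                for off, d in sorted(victim.items())
            )
        return flushed
```

PageBuffer evicts exactly one page at a time, so hot pages scattered across many blocks can stay resident. BlockBuffer flushes every buffered page of the victim block in one batch, which suits sequential writes but may push out a hot page along with its colder neighbours.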
Given the asymmetric read/write latencies of flash memory,
the DRAM buffer is mainly used to store dirty data generated
by write requests. Our scheme is based on a pure write
buffer, and it can easily be ported to a hybrid buffer.
2.2 Research motivations for Prober
Operations associated with large files are ubiquitous in vari-
ous applications. On personal desktops and tablet computers,
there are frequent file copy operations such as downloading