Improving Write Performance of LSMT-based Key-Value Store
WeiTao Zhang 1,2, Yinlong Xu 1,3, Yongkun Li 1,3, Dinglong Li 1
1. School of Computer Science and Technology, University of Science and Technology of China
2. AnHui Province Key Laboratory of High Performance Computing, Hefei, China
3. Collaborative Innovation Center of High Performance Computing, National University of Defense Technology
Email: {avenger, ldlong}@mail.ustc.edu.cn, {ylxu, ykli}@ustc.edu.cn
Abstract—Key-value stores are widely used to provide much higher read and write throughput than traditional SQL databases. LSMT (log-structured merge tree) based key-value stores, as one type of key-value store, are applied in many practical systems since they eliminate random writes while still providing good read performance. However, the data residing on disk needs compaction operations from time to time, which consume a large amount of I/O resources. Since disk access is much slower than DRAM access and most data resides on disk, compaction significantly influences system performance. In this paper, we propose a grouped level structure, which divides each level in LSMT into multiple groups. We also propose a new compaction method for the grouped level structure to reduce the compaction I/O overhead. Our experiments show that the grouped level structure saves about 55% to 78% of the compaction I/O, so it improves the write throughput by 69% to 284% while reducing the read throughput by only 5% to 9%. It improves the overall throughput by 30% to 69% with read-dominated workloads of 25% write operations and 75% read operations.
Keywords-Key-Value, Log Structured Merge Tree, Compaction, Big Data
I. INTRODUCTION
With a key-value store, data is represented as a collection of key-value pairs, where the key of each value is unique. It is a basic type of NoSQL database which does not rely on the traditional structures of relational databases. Compared with traditional relational databases, key-value stores are designed to handle huge amounts of data and have worked well for shopping cart contents, landing page URLs, default account numbers, etc. Key-value stores are fast, scalable, portable and flexible, so they have been widely used in practical systems, including Google’s BigTable [6], Facebook’s Cassandra [10], Apache HBase [1], Amazon’s Dynamo [8] and so on.
In some practical applications, key-value stores face write-intensive workloads that are dominated by data writes rather than reads [17], e.g., storing frequently-changing objects [4, 5, 12]. To improve write performance, many key-value stores are based on the log-structured merge tree (LSMT) [13], such as HBase, Cassandra, BigTable and LevelDB [2], which can be treated as a lightweight implementation of BigTable and has been widely deployed in many applications. These key-value stores offer a similar API to write data to and read data from the database, with the basic interfaces being put for writes and get for reads. To support write-intensive workloads, key-value stores usually adopt an append-only strategy to speed up writes, where update and delete operations are performed as append writes. To distinguish insert, delete and update operations, each key-value pair carries a flag identifying the operation, and obsolete key-value pairs are reclaimed by the compaction operation.
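To make the append-only write path concrete, the following C++ sketch (our own illustration, not the code of any particular system) records both puts and deletes as new entries tagged with an operation flag; the tombstones left by deletes are only reclaimed later by compaction.

#include <map>
#include <string>

// Illustrative sketch: every write is an append tagged with a flag;
// a delete appends a tombstone instead of erasing the old pair in place.
enum class OpFlag { kPut, kDelete };

struct Record {
  OpFlag flag;
  std::string value;  // empty for tombstones
};

class MemBuffer {
 public:
  void Put(const std::string& key, const std::string& value) {
    buf_[key] = {OpFlag::kPut, value};   // an update simply records a newer version
  }
  void Delete(const std::string& key) {
    buf_[key] = {OpFlag::kDelete, ""};   // obsolete versions are reclaimed by compaction
  }
  bool Get(const std::string& key, std::string* value) const {
    auto it = buf_.find(key);
    if (it == buf_.end() || it->second.flag == OpFlag::kDelete) return false;
    *value = it->second.value;
    return true;
  }
 private:
  std::map<std::string, Record> buf_;    // keys are kept sorted in memory
};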
In an LSMT key-value store, newly arriving key-value pairs are first stored in a memory buffer. When the buffer is full, the key-value pairs are packed into an sst file with sorted keys and persisted to external storage. Each sst file keeps a record of its minimum key and its maximum key, which we call its key interval. To further enhance read performance by exploiting the spatial and temporal locality of workloads, LSMT key-value stores divide all key-value pairs into several levels, where level i+1 usually holds multiple times as many sst files as level i. When level i is full, all of its sst files are compacted into level i+1.
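The following sketch illustrates this flush-and-level organization; the SstFile layout, the function names and the size ratio of 10 between adjacent levels are illustrative assumptions, not taken from a real system.

#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Illustrative on-disk unit: a run of sorted key-value pairs plus its key interval.
struct SstFile {
  std::string min_key, max_key;  // the file's key interval
  std::vector<std::pair<std::string, std::string>> entries;  // sorted by key
};

// Pack a full memory buffer into one sorted sst file (to be flushed to level 0).
// Assumes the buffer is non-empty, since it is only flushed when full.
SstFile Flush(const std::map<std::string, std::string>& memtable) {
  SstFile f;
  for (const auto& kv : memtable) f.entries.push_back(kv);
  f.min_key = f.entries.front().first;
  f.max_key = f.entries.back().first;
  return f;
}

// Level i+1 may hold roughly `ratio` times as many sst files as level i;
// once a level exceeds its limit, all of its files are compacted into the next level.
bool LevelIsFull(std::size_t level, std::size_t file_count, std::size_t ratio = 10) {
  std::size_t limit = 1;
  for (std::size_t i = 0; i <= level; ++i) limit *= ratio;
  return file_count >= limit;
}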
With compaction, the keys in different sst files are sorted, so apart from level 0, the key intervals of different sst files in the same level do not overlap with each other, and a given key may fall into only one sst file per level (apart from level 0). Compaction therefore limits the number of sst files that must be searched for a given key. For example, in Figure 1, key 7 falls in the key intervals of both files A and B, but only B contains key 8. Suppose that we compact A (with key interval [3, 11]) and B (with key interval [7, 44]) into files C (with key interval [3, 8]) and D (with key interval [11, 44]). Then key 7 falls only in the key interval of C, so we only need to search for key 7 in file C.
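The sketch below shows how such a compaction could merge two overlapping files into output files with disjoint key intervals; the SstFile layout, the names and the per-file size limit are our own illustrative assumptions.

#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Illustrative on-disk unit, as in the earlier sketch.
struct SstFile {
  std::string min_key, max_key;
  std::vector<std::pair<std::string, std::string>> entries;  // sorted by key
};

// Merge two overlapping files into new files whose key intervals do not overlap.
std::vector<SstFile> Compact(const SstFile& older, const SstFile& newer,
                             std::size_t max_entries_per_file) {
  // Merge by key; for a duplicate key the entry from the newer file wins.
  std::map<std::string, std::string> merged(older.entries.begin(), older.entries.end());
  for (const auto& kv : newer.entries) merged[kv.first] = kv.second;

  std::vector<SstFile> out;
  SstFile cur;
  for (const auto& kv : merged) {
    cur.entries.push_back(kv);
    if (cur.entries.size() == max_entries_per_file) {
      cur.min_key = cur.entries.front().first;
      cur.max_key = cur.entries.back().first;
      out.push_back(std::move(cur));
      cur = SstFile{};
    }
  }
  if (!cur.entries.empty()) {
    cur.min_key = cur.entries.front().first;
    cur.max_key = cur.entries.back().first;
    out.push_back(cur);
  }
  return out;  // consecutive output files cover disjoint key ranges
}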
To read the value of a given key k, we first compare k with the minimum key and the maximum key of an sst file. If k does not fall in its key interval, the corresponding value is not in this file; otherwise, we use binary search to compare k with the keys in this file. So with levelling and compaction, the read latency in key-value stores is limited.
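A minimal sketch of this per-file lookup is shown below, reusing the illustrative SstFile layout from the earlier sketches: the interval check rules out files that cannot contain k, and a binary search then locates k among the sorted entries.

#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Illustrative on-disk unit, as in the earlier sketches.
struct SstFile {
  std::string min_key, max_key;
  std::vector<std::pair<std::string, std::string>> entries;  // sorted by key
};

// Look up key k in one sst file; returns false if the file cannot or does not contain k.
bool Get(const SstFile& file, const std::string& k, std::string* value) {
  if (k < file.min_key || k > file.max_key) return false;  // k is outside the key interval
  auto it = std::lower_bound(
      file.entries.begin(), file.entries.end(), k,
      [](const std::pair<std::string, std::string>& e, const std::string& key) {
        return e.first < key;
      });
  if (it == file.entries.end() || it->first != k) return false;
  *value = it->second;
  return true;
}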
In the long run, the read latency of a key-value store can be kept bounded by compaction. However, the compaction process itself consumes a large amount of CPU and I/O resources, especially under write-intensive workloads. In an LSMT key-value store, most data resides on disk, and access to disk is much slower