Me-CLOCK: A Memory-Efficient Framework to Implement Replacement Policies for Large Caches
Zhiguang Chen, Nong Xiao, Member, IEEE,
Yutong Lu, Fang Liu, and Yang Ou
Abstract—Solid State Drives (SSDs) have been extensively deployed as the
cache of hard disk-based storage systems. An SSD-based cache generally
supplies ultra-large capacity, but managing so large a cache introduces
excessive memory overhead, which in turn makes the SSD-based cache neither
cost-effective nor energy-efficient. This work aims to reduce the memory
overhead introduced by the replacement policy of an SSD-based cache. Traditionally,
the data structures involved in a cache replacement policy reside in main memory.
However, these in-memory data structures are no longer suitable for SSD-based caches,
since the cache is much larger than ever. We propose a memory-efficient
framework that keeps most data structures in the SSD while leaving only a memory-
efficient data structure (i.e., a new bloom filter proposed in this work) in main memory.
Our framework can be used to implement any LRU-based replacement policy
with negligible memory overhead. We evaluate our proposals via theoretical
analysis and prototype implementation. Experimental results demonstrate that
our framework is practical for implementing most replacement policies for large caches,
and reduces the memory overhead by a factor of about 10.
Index Terms—Cache, SSD-based cache, SSD, cache replacement policy, storage
1 INTRODUCTION
Flash memory is well-known for its low latency, high parallelism,
and energy efficiency. Solid State Drives (SSDs) based on flash
memory inherit all these advantages, and are thus widely adopted by
large-scale storage systems. However, the price of SSDs is high
compared with hard disks, so storage systems built entirely of SSDs
are not cost-effective. Accordingly, a large number of storage
systems use SSDs as a cache for hard disks. One outstanding
feature of SSD-based caches is the large capacity compared with
DRAM-based caches. Managing so large a cache requires a large
volume of main memory, which makes the SSD-based cache neither
cost-effective nor energy-efficient. In this work, we aim to
reduce the memory overhead introduced by the cache replacement
policy. Existing cache replacement policies such as ARC [1], LIRS
[2], 2Q [3], and SAC [4] are mostly based on Least Recently Used
(LRU) queue. The LRU queue is traditionally implemented via
doubly linked list. Each entry of the list is a tuple containing at least three
fields, i.e., ⟨Page number, Left pointer, Right pointer⟩. The
Page number denotes the page associated with the entry. The left
and right pointers link adjacent entries in the list. If
each field is 4 bytes, the entire tuple takes up 12 bytes. Such a low
space overhead (i.e., 12 bytes for each 4 KB page) is negligible for
DRAM-based caches. However, for SSD-based caches that supply
much larger capacity than ever, keeping even such a low space
overhead in main memory is unacceptable. Take a 1 TB SSD-based
cache containing 256 M (1 TB / 4 KB) pages as an example: the
LRU policy used to manage the cache consumes as much as 3 GB
(256 M × 12 bytes) of main memory. Such a large DRAM requirement
significantly increases both the cost and the power consumption of
the SSD-based cache.
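The arithmetic above can be checked with a short script; the sizes are the running example from this section, and the per-entry layout is the 12-byte tuple described above:

```python
# Per-entry cost of a doubly linked LRU list: page number + two pointers.
FIELD_BYTES = 4
ENTRY_BYTES = 3 * FIELD_BYTES           # 12 bytes per cached page

PAGE_BYTES = 4 * 1024                   # 4 KB pages
CACHE_BYTES = 1 * 1024**4               # 1 TB SSD-based cache

pages = CACHE_BYTES // PAGE_BYTES       # number of cached pages
lru_overhead = pages * ENTRY_BYTES      # DRAM consumed by the LRU queue

print(pages)                     # 268435456  (i.e., 256 M pages)
print(lru_overhead // 1024**3)   # 3  (GB of main memory)
```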
In this work, we propose a memory-efficient framework to
implement the LRU policy. As the LRU policy is the foundation of most
complex cache replacement policies, our framework can be further
applied to those policies, enabling SSD-based caches to adopt
them without introducing excessive memory overhead.
The new framework is an improvement over CLOCK [5]. CLOCK
maintains two data structures, a First In First Out (FIFO) queue and
some reused flags. Each entry of the FIFO queue corresponds to a
page of the cache, where the reused flag appended to the entry is used
to indicate whether the corresponding page has been frequently
accessed or not. Frequently accessed pages are protected from being
evicted from the cache. CLOCK closely approximates the LRU policy,
and has thus been extensively adopted to implement LRU-based cache
replacement policies. However, as the data structures maintained by
CLOCK must reside in main memory, CLOCK is not suitable for
SSD-based caches due to the unacceptable memory overhead.
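For reference, the in-memory CLOCK baseline that Me-CLOCK improves upon can be sketched as follows; the class and method names are illustrative, not from the paper:

```python
class Clock:
    """Classic CLOCK: a FIFO queue of cached pages, each with a reused flag."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []          # FIFO order: index 0 is the oldest page
        self.reused = {}         # page -> reused flag (the per-entry bit)

    def access(self, page):
        """Record an access; return the evicted page, or None."""
        if page in self.reused:              # cache hit: just set the flag
            self.reused[page] = True
            return None
        victim = None
        if len(self.queue) == self.capacity:
            # Sweep: flagged (frequently accessed) pages get a second chance.
            while self.reused[self.queue[0]]:
                old = self.queue.pop(0)
                self.reused[old] = False
                self.queue.append(old)
            victim = self.queue.pop(0)       # first unflagged page is evicted
            del self.reused[victim]
        self.queue.append(page)
        self.reused[page] = False
        return victim
```

Note how the reused flags are consulted at arbitrary positions during the sweep; this random access is precisely what forces the whole structure to stay in main memory.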
Our framework improves CLOCK by keeping the FIFO queue
in SSD, rather than in main memory. The memory overhead of
CLOCK is mostly introduced by the FIFO queue. Keeping FIFO
queue in SSD is an intuitive optimization to reduce the memory
overhead. However, CLOCK cannot keep the FIFO queue in SSD, because
each entry of the FIFO queue is appended with a randomly
accessed reused flag. As a result, the whole FIFO queue is randomly
accessed, and thus must be kept in main memory. We propose a new
bloom filter [6] to replace the reused flags, enabling the FIFO queue
to be kept in SSD. Specifically, our framework consists of two
components: a FIFO queue and a bloom filter. As the entries of the FIFO
queue are no longer appended with randomly accessed reused flags,
operations on the FIFO queue occur only at the head or
tail. Thus, we maintain only the head and tail of the FIFO queue in main
memory, while keeping most entries of the FIFO queue in SSD. The bloom
filter takes over the responsibility of the reused flags, indicating whether
a page has been frequently accessed. It is kept in main memory but
introduces negligible memory overhead.
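This division of labor can be sketched as follows. The skeleton below uses our own naming: the SSD-resident body of the FIFO queue is mocked by a plain list, and the memory-efficient deletable bloom filter (the contribution of Section 4) is mocked by a set, since only the access pattern matters here:

```python
class MeClockSketch:
    """Illustrative Me-CLOCK skeleton: only the head/tail buffers and the
    'reused' filter live in RAM; the FIFO queue body resides on SSD."""

    def __init__(self):
        self.head_buf = []    # in-RAM buffer for newly inserted pages
        self.tail_buf = []    # in-RAM buffer for pages about to be examined
        self.ssd_queue = []   # stands in for the FIFO body stored on SSD
        self.reused = set()   # stands in for the in-RAM bloom filter

    def on_access(self, page, is_hit):
        if is_hit:
            self.reused.add(page)        # mark as reused; no queue update
        else:
            self.head_buf.append(page)   # sequential append at the head only

    def evict_one(self):
        """Return a victim page, giving reused pages a second chance."""
        # Refill the tail buffer with a sequential read from SSD if needed.
        if not self.tail_buf and self.ssd_queue:
            self.tail_buf.append(self.ssd_queue.pop(0))
        while self.tail_buf or self.head_buf:
            page = (self.tail_buf or self.head_buf).pop(0)
            if page in self.reused:      # second chance, as in CLOCK
                self.reused.discard(page)
                self.head_buf.append(page)
            else:
                return page
        return None
```

The key property is that the queue is only ever appended at the head and consumed at the tail, so the body between them can be flushed to, and fetched from, the SSD sequentially.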
Our memory-efficient CLOCK (Me-CLOCK) requires the bloom
filter to support element deletion while remaining memory-
efficient. However, no existing bloom filter meets both
demands. Accordingly, we propose such a new bloom filter, which
is another contribution of this work. We evaluate our proposals via
theoretical analysis and prototype implementation. Evaluation
results demonstrate that Me-CLOCK can be used to implement
complex LRU-based cache replacement policies, and that its memory
overhead is about 10 times lower than that of traditional
approaches such as the LRU queue and CLOCK. Note that Me-CLOCK is only
used to implement replacement policies; it cannot index pages
in the cache. The cache generally adopts a separate data
structure, such as a red-black tree, to index its contents.
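For background, the standard way to add deletion to a bloom filter is the counting bloom filter, which replaces each bit with a small counter; the extra bits per counter are exactly the memory cost that the new filter proposed in Section 4 is designed to avoid. A textbook sketch (our own naming):

```python
import hashlib

class CountingBloomFilter:
    """Textbook counting bloom filter: supports deletion, but each cell
    holds a counter rather than a bit, multiplying the memory cost."""

    def __init__(self, num_cells, num_hashes):
        self.cells = [0] * num_cells
        self.num_hashes = num_hashes

    def _positions(self, item):
        # Derive k independent cell indices from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.cells)

    def add(self, item):
        for pos in self._positions(item):
            self.cells[pos] += 1

    def remove(self, item):
        for pos in self._positions(item):
            if self.cells[pos] > 0:
                self.cells[pos] -= 1

    def __contains__(self, item):
        return all(self.cells[pos] > 0 for pos in self._positions(item))
```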
The rest of this paper is organized as follows. Section 2 presents
the background and motivations. Section 3 elaborates the principle
of Me-CLOCK. Section 4 proposes a bloom filter supporting dele-
tion. Sections 5 and 6 evaluate the new bloom filter and Me-CLOCK,
respectively. Section 7 summarizes related work. The last
section concludes this work.
2 BACKGROUND AND MOTIVATIONS
This section argues that the existing LRU and CLOCK cannot be
used to implement the replacement policies for large caches due to
their high memory overhead.
2.1 The LRU Queue
The LRU queue is the foundation of most replacement policies
such as ARC [1] and 2Q [3]. Operations on the LRU queue are
explained as follows.
ADD. When a page is accessed for the first time, the LRU policy
generates a new entry for it, and then adds the entry to the Most
Recently Used (MRU) end of LRU queue.
HIT. When a page hits in the cache, the related entry is moved to the
MRU end of the LRU queue.
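These two operations can be sketched with an ordered map standing in for the doubly linked list (Python's `OrderedDict` gives O(1) moves to the MRU end; the names are ours):

```python
from collections import OrderedDict

class LRUQueue:
    """LRU queue sketch; OrderedDict plays the doubly linked list."""

    def __init__(self):
        self.entries = OrderedDict()   # LRU end first, MRU end last

    def add(self, page):
        # ADD: a first access creates a new entry at the MRU end.
        self.entries[page] = True

    def hit(self, page):
        # HIT: move the entry of a cached page to the MRU end.
        self.entries.move_to_end(page)

    def evict(self):
        # Evict the entry at the LRU end (the least recently used page).
        page, _ = self.entries.popitem(last=False)
        return page
```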
The authors are with the State Key Laboratory of High Performance Computing,
National University of Defense Technology, Changsha, China.
E-mail: {chenzhiguang, nongxiao, ytlu, liufang, yangou}@nudt.edu.cn.
Manuscript received 4 June 2014; revised 8 Sept. 2015; accepted 11 Oct. 2015. Date of
publication 25 Oct. 2015; date of current version 15 July 2016.
Recommended for acceptance by J. D. Bruguera.
For information on obtaining reprints of this article, please send e-mail to:
reprints@ieee.org, and reference the Digital Object Identifier below.
Digital Object Identifier no. 10.1109/TC.2015.2495182
IEEE TRANSACTIONS ON COMPUTERS, VOL. 65, NO. 8, AUGUST 2016 2665