Hybrid Efficient Memory Access Monitoring
Wenzhe Zhang, Kai Lu, Xiaoping Wang
Science and Technology on Parallel and Distributed Processing Laboratory
Collaborative Innovation Center of High Performance Computing
State Key Laboratory of High-end Server & Storage Technology
College of Computer, National University of Defense Technology
Changsha, PR China
zhangwenzhe@nudt.edu.cn
lukainudt@163.com
xiaopingwang@nudt.edu.cn
Abstract—Highly efficient memory access monitoring for native
programs (C or C++ programs) is very attractive. It can be
leveraged to support analyzing the behavior of programs and
thus is needed by many tools such as program analyzers,
debuggers, and program controllers. Current mechanisms for
memory access monitoring mainly include dynamic
instrumentation, compiler instrumentation, and
page-protection, and they all fall short in terms of performance. In
this paper we propose a hybrid mechanism that combines the
compiler instrumentation and page-protection mechanism to
achieve better performance. Our key idea is to divide a
program into two parts, (1) the loop part and (2) the non-loop
part, and then apply a different mechanism to each part. We
show in this paper that compiler instrumentation is suitable
for the non-loop part while page-protection is suitable for
the loop part. By combining
them we can make the best use of the monitoring mechanisms
and thus achieve better performance.
Keywords-memory access monitoring; compiler
instrumentation; page-protection; hybrid mechanism; high
efficiency
I. INTRODUCTION
Memory access monitoring is widely adopted when we
need to analyze the behavior of programs and control them
more precisely. Current mechanisms for memory
access monitoring mainly include dynamic instrumentation
[1], compiler instrumentation [2], and page-protection [3].
They have been widely used in many tools. For example,
dynamic instrumentation is adopted in ConMem [5] to track
the memory access behavior for bug detection. Compiler
instrumentation [2] is used in CoreDet [6] to monitor and
tune the memory access behavior of different threads. Page-
protection is used in RFDet [4] to monitor and control
memory accesses for deterministic multi-threading.
Although convenient and widely adopted, current
mechanisms all fall short on performance. Dynamic and
compiler instrumentation both insert a special function call
before every memory access instruction and thus introduce
huge overhead [1][20]. Page-protection is supported by
hardware (the page table), but it only monitors memory
accesses at page granularity. A further byte-by-byte
comparison is needed to find the exact modifications, and
this comes with non-negligible overhead [4][19]. Thus, a
highly efficient memory access monitoring mechanism is
attractive and will benefit all the previous tools [5, 11].
This paper introduces a hybrid mechanism that combines
compiler instrumentation and page-protection mechanism to
achieve better performance without sacrificing any
monitoring precision. Our idea is to apply different strategies
to different parts of programs and thus make the best use of
different mechanisms. We divide a program into two parts,
(1) the loop part and (2) the non-loop part, and we show that
compiler instrumentation is suitable for the non-loop part
while page-protection is suitable for the loop part. Thus,
by combining them we obtain better performance for
monitoring memory accesses.
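A rough way to see the intuition is to count monitoring events. The sketch below is an illustration under simplifying assumptions, not the analysis given later in the paper. It assumes the loop writes one contiguous range, that instrumentation costs one hook call per write, and that page-protection costs one fault per distinct page touched. The function names hook_calls and pages_touched are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Instrumentation: one inserted call per write instruction executed. */
static size_t hook_calls(size_t n_writes)
{
    return n_writes;
}

/* Page-protection: one fault per distinct page in [base, base + len),
 * assuming each page is unprotected after its first fault. */
static size_t pages_touched(uintptr_t base, size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;
    uintptr_t first = base / page_size;
    uintptr_t last = (base + len - 1) / page_size;
    return (size_t)(last - first + 1);
}
```

For a loop that writes 1,000,000 four-byte integers starting at a page-aligned address with 4096-byte pages, this count gives 1,000,000 hook calls against 977 page faults. For a single isolated write, both counts are 1, but page-protection additionally pays for snapshotting and comparing a whole page, which this count deliberately ignores.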
The rest of this paper is organized as follows. We briefly
introduce the three current memory access monitoring
mechanisms and compare them in detail in
Section 2. Section 3 gives our design and implementation of
the hybrid memory access monitoring mechanism. We show
the experiment results in Section 4 and discuss related work
in Section 5. Section 6 concludes.
II. BACKGROUND
This section discusses and compares the three popular
mechanisms for memory access monitoring.
A. Dynamic Instrumentation
Dynamic instrumentation [1] is based on dynamic binary
translation [7], which translates the program’s code to
native code at run time and executes it. The program’s code
is first translated and copied to another memory location and
then executed. During translation and copying, we have the
opportunity to insert our own code. Figure 1 shows the case.
For the two memory accesses (write a and write b), we insert
a call to a special function fun with the addresses of a and b
as parameters. Thus, every time write a or write b is
executed, our function is called first, and in that function
we can record the access.
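At source level, the effect of the inserted calls can be pictured roughly as follows. This is a sketch in C, not the translated binary code a dynamic instrumentation tool would actually produce. Only fun, a, and b come from the text; access_log, access_count, LOG_CAPACITY, and instrumented_writes are hypothetical names, and the paper does not specify what fun records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_CAPACITY 1024

/* Hypothetical monitoring record: the addresses written, in order. */
static uintptr_t access_log[LOG_CAPACITY];
static size_t access_count = 0;

/* Monitoring hook: receives the target address before each write. */
static void fun(void *addr)
{
    assert(addr != NULL);
    if (access_count < LOG_CAPACITY)
        access_log[access_count] = (uintptr_t)addr;
    access_count++;   /* counts every access, even if the log is full */
}

static int a, b;

/* What the instrumented code effectively executes: each original write
 * is preceded by an inserted call to fun with the written address. */
static void instrumented_writes(void)
{
    fun(&a);   /* inserted */
    a = 1;     /* original: write a */
    fun(&b);   /* inserted */
    b = 2;     /* original: write b */
}
```

Calling instrumented_writes() once leaves two entries in access_log, the addresses of a and b, which shows why the overhead grows with the number of memory accesses executed: every write pays for one extra function call.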