A Virtual Sample Generation Approach for Speculative Multithreading
Using Feature Sets and Abstract Syntax Trees
Bin Liu, Yinliang Zhao, Meirong Li, Yanzhao Liu, Boqin Feng
School of Electronic and Information Engineering
Xi’an Jiaotong University
Xi’an, China
liubin2010@stu.xjtu.edu.cn, zhaoy@mail.xjtu.edu.cn, wodilili@126.com, erdoslyz@yahoo.cn, bqfeng@mail.xjtu.edu.cn
Abstract—Speculative multithreading (SpMT) is a thread-level
automatic parallelization technique for accelerating sequential
programs. Because approaches based on heuristic rules find only
locally optimal speculative thread solutions and have reached
their speedup limit, machine learning approaches have been
introduced into speculative multithreading to avoid the
shortcomings of experience-based heuristic rules. However, few
irregular programs are available to meet the training needs of
machine learning models. To solve this problem, we first build
feature sets based on the Olden benchmarks and then disturb
them into new sets. From the new sets, virtual samples are
generated by means of abstract syntax trees (ASTs). In this way,
we effectively alleviate the shortage of samples for speculative
multithreading based on machine learning. On Prophet, a
generic SpMT processor used to evaluate the performance of
multithreaded programs, the validity of the virtual samples is
verified, with an average speedup of 1.47. Experiments show
that the virtual samples can simulate a variety of procedure
structures from the Olden benchmarks and that this sample
generation technique can provide sufficient samples for model
training.
Keywords—Speculative Multithreading; Machine Learning; Program Features; Virtual Samples; Automatic Parallelization
I. INTRODUCTION
Multi-core processors are now the mainstream processor
architecture, offering more computing and storage resources,
higher communication bandwidth, and lower communication
latency. To improve the speedup of irregular programs on
multi-core processors, a huge number of sequential application
programs must be restructured so that they can execute in
parallel [1]. Thread-level speculation (TLS), also known as
SpMT, automatically parallelizes sequential programs on
multi-core processors through parallelizing compiler
technology [2]. Compared with manual parallelization, TLS can
accelerate irregular sequential application programs at low cost
and without user interaction. Examples of TLS include
Multiscalar, Hydra [3], Pinot [4], POSH [5] and Mitosis [6].
In recent years, machine learning approaches have been
introduced into TLS. Khan et al. [7] extracted features from
sequential programs and used machine learning algorithms to
cross-validate programs; the features were mapped to a model
library to obtain a program's execution performance.
Tournavitis et al. [8] extracted control- and data-dependence
features from sequential programs to establish an analysis
model. Taking run-time information as training samples, they
could predict the potential concurrency units of a program,
obtain the right candidate threads, and achieve good results.
Although this approach performed effectively, only static
program features were extracted, so insufficient program
information was obtained. Wang et al. [9] developed an
automatic compiler-based approach that uses machine learning
to map a parallelized program onto multi-core processors,
focusing on determining the best number of threads for a
parallel program and how those threads should be scheduled.
However, because the feature values of the samples were
estimated, the model was imperfect and imprecise. Although
these research efforts have had some effect on TLS, many
problems remain.
Sufficient training samples are an important guarantee of
the generalization capability of a learning model. In 1992,
Vetter et al. [10] proposed the idea of virtual samples and
generated virtual samples for image recognition through
geometric transformations. In [11], Li et al. proposed a non-
linear virtual sample generation technique using group
discovery technology and the parametric equations of a
hypersphere. In [12], a stream program generator produced
different streaming parallel programs as training examples for
machine learning models.
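To illustrate the basic idea of virtual samples as introduced in [10], the short sketch below derives several label-preserving training samples from a single image through generic geometric transformations. The particular transformations and the function name are our own illustrative choices, not those of the original work.

```python
import numpy as np

def virtual_image_samples(image: np.ndarray) -> list:
    """Create virtual samples from one labelled image via geometric transformations."""
    return [
        image,                 # original sample
        np.fliplr(image),      # horizontal mirror
        np.rot90(image, k=1),  # 90-degree rotation
        np.rot90(image, k=2),  # 180-degree rotation
    ]
```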
In our study, we propose an automatic compiler-based
approach that partitions irregular programs into ideal partition
structures using machine learning. The approach consists of
two stages. First, useful knowledge for guiding thread
partitioning is learned from the samples, where each sample
contains an irregular sequential program and its corresponding
ideal partition structure, i.e., the structure that obtains the
maximum speedup on multi-core processors. Second, this
knowledge is used to guide thread partitioning, so that an ideal
partition structure is produced whenever a new irregular
program must be handled.
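The following minimal sketch shows this two-stage flow. The data structures, the nearest-neighbour lookup, and all names here are hypothetical stand-ins chosen for brevity; they are not the components of our actual partitioner.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class PartitionPlan:
    """Ideal partition structure, e.g. where speculative threads are spawned."""
    spawn_points: List[str]

@dataclass
class Sample:
    """One training sample: program features plus its best-speedup partition."""
    features: Dict[str, float]
    plan: PartitionPlan

def learn(samples: List[Sample]) -> List[Sample]:
    """Stage 1: extract knowledge from the samples.

    Any supervised learner could sit here; keeping the labelled samples and
    answering queries by nearest neighbour is the simplest possible choice.
    """
    return samples

def partition(knowledge: List[Sample], new_features: Dict[str, float]) -> PartitionPlan:
    """Stage 2: use the learned knowledge to partition a new irregular program."""
    def distance(s: Sample) -> float:
        keys = set(s.features) | set(new_features)
        return sum((s.features.get(k, 0.0) - new_features.get(k, 0.0)) ** 2 for k in keys)
    return min(knowledge, key=distance).plan
```

In the real system the partition structures are produced by the compiler and evaluated on Prophet; the nearest-neighbour lookup above merely stands in for whatever learner is trained on the samples.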
Because the existing sample sets are not sufficient for
training the model, we propose a sample generation approach
to supplement the existing benchmark sets. First, we studied
the features that influence a program's speedup in irregular
application programs and constructed feature sets based on the
Olden benchmarks. New sets were then generated by
disturbing these feature sets, and virtual samples were
generated from them based on abstract syntax trees. Finally,
the validity of the virtual samples was verified on Prophet.
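As a rough illustration of these two steps, the sketch below disturbs a feature vector within a bounded range and then emits a synthetic procedure from an abstract syntax tree. The feature names, the disturbance ratio, and the use of Python's ast module are assumptions made for brevity; the actual generator targets C-like Olden-style programs and uses its own feature set.

```python
import ast
import random

# Hypothetical feature vector for one Olden-like procedure (illustrative values).
base_features = {"loop_count": 3, "branch_count": 4, "call_count": 2}

def disturb(features, ratio=0.2, rng=random.Random(0)):
    """Disturb each feature within +/- ratio of its value to form a new feature set."""
    new = {}
    for name, value in features.items():
        delta = max(1, int(round(value * ratio)))
        new[name] = max(1, value + rng.randint(-delta, delta))
    return new

def generate_virtual_sample(features):
    """Build an AST whose shape matches the feature set and unparse it to source."""
    body = []
    for i in range(features["loop_count"]):
        body.append(ast.parse(f"for i{i} in range(n): a[i{i}] = a[i{i}] + b[i{i}]").body[0])
    for i in range(features["branch_count"]):
        body.append(ast.parse(f"if a[{i}] > 0: a[{i}] = a[{i}] * 2").body[0])
    for i in range(features["call_count"]):
        body.append(ast.parse(f"helper{i}(a, n)").body[0])
    func = ast.parse("def virtual_sample(a, b, n): pass").body[0]
    func.body = body  # replace the placeholder body with the generated statements
    module = ast.Module(body=[func], type_ignores=[])
    return ast.unparse(ast.fix_missing_locations(module))

new_features = disturb(base_features)
print(generate_virtual_sample(new_features))  # one virtual sample per disturbed set
```

Repeating the disturb-and-generate loop over every benchmark-derived feature set yields an arbitrarily large pool of virtual samples for training.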