Matrix-based parallel pattern matching method
Hongli Zhang, Dongliang Xu, Lei Zhang, Yanbin Sun
School of Computer Science and Technology, Harbin Institute of Technology
Harbin, China
Email: Ray198421@gmail.com
Abstract—This study presents pattern matching algorithms,
based on vector and matrix models that are suitable for parallel
pattern matching. On these two models, we further proposed
the vector-based single-pattern matching (VBSP) and the matrix-
based multi-pattern matching (MBMP) algorithms, as well as the
matrix-based multi-pattern approximate (MBMPA) algorithm
and the matrix-based multi-pattern exact (MBMPE) algorithm.
The G-MBMP algorithm refers to the implementation of the
MBMP algorithm on a graphics processing unit (GPU). The
performance of the G-MBMPA is better than that of the G-
impMASM. The performance of the G-MBMPE is better than
that of the G-WM (GPU-based WM algorithm) and that of the
G-AC algorithms (GPU-based AC algorithm). The memory of the
G-MBMPE algorithm is the least of the three algorithms and is
significantly less than that of the G-AC algorithm.
Index Terms—matrix, parallel pattern matching, GPU, G-
MAMP.
I. INTRODUCTION
Pattern matching is a basic research problem in computer
science. This problem serves as the kernel module in network
security application systems, which include intrusion detection
systems and firewall. The performance of the pattern matching
algorithm directly affects the efficiency of the whole system.
The pattern matching algorithm has been studied for 30 years
and has become an established topic in the research area of
serial pattern matching. Studies on parallel pattern matching
are far less than those on serial pattern matching.
Rapid network development over the past few years made
it difficult for the traditional pattern matching algorithm of
network security systems to meet the current network traffic
requirements. Hence, an increasing number of researchers
began studying the parallel pattern matching algorithm. Re-
searchers also began searching for means of using hardware
to meet the demands of the current rapid network. Such
methods include Content-addressable memory (CAM) [1]–[3]
and Field-Programmable Gate Array (FPGA) [4] [5]. Although
the efficiency of CAM and FPGA is better than that of
traditional methods, the cost is higher, and portability is poor.
The general computing process of a graphics processing unit
(GPU) provides a new method for researchers. Aside from
graphics rendering, the GPU is also used for general-purpose
computing on graphics processing units (GPGPU). The GPG-
PU generally adopts the CPU + GPU heterogeneous model.
The central processing unit (CPU), which is unsuitable for data
parallel computing, is responsible for the execution of complex
logic processing and transaction management. Meanwhile, the
GPU focuses on intensive large-scale data parallel computing.
Unlike a CPU that is optimized for use on sequential code, all
commodity GPUs follow a streaming, data-parallel program-
ming model that resembles single-program multiple-data. The
GPU has a parallel multi-core architecture. Each core contains
thread processors that simultaneously run hundreds of threads.
The strong processing capacity and high bandwidth of the
GPU are used to compensate for the inadequate performance
of the CPU. This calculation method excavates the potential
performance of computers, in which a significant advantage
in terms of cost and performance is noted.
II. RELATED WORK
An increasing number of researchers are focusing on ap-
proximate matching because of its wide range of applications.
Prasad et al. [6] proposed two multi-pattern approximate
string matching (MASM) algorithms, namely, MASM1 and
MASM2. These two algorithms require no verification and
can handle patterns with length of more than that of a
computer word (w). MASM uses the bit-parallel automata
(BPA) of approximate matching and concatenation to form a
single-pattern from a set of r patterns. The main drawback
of this algorithm is that it requires all the patterns to be
of equal length (m), which is disadvantageous in improving
the performance through the use of GPU when . Xu et al.
[7] proposed an improved bit-parallel algorithm, impMASM,
which can efficiently port to the GPU and can handle patterns
of unequal lengths.
Over the past several years, researchers have developed
a variety of GPU-based pattern matching algorithms to im-
prove network speedup in exact matching. Some researchers
have integrated traditional multi-pattern matching algorithms,
including the Aho-Corasick (AC) [8]–[10] and Wu-Manber
(WM) algorithms [11]–[13], into a GPU. Tran et al. [8] pro-
posed a memory-efficient GPU-based parallelization approach
for the AC algorithm. The proposed approach parallelizes the
AC algorithm by efficiently placing and caching both the
input text string and reference data organized as a 2D table
(STT) in the on-chip shared memories and texture caches.
This condition significantly reduces average memory access
latencies while impressively improving system performance.
Huang et al. [12] implemented two sequential algorithms,
namely, the WM and AC algorithms, over the GPU parallel
computation platform. The experimental results showed that
the throughput of GPU implementation is approximately five
to seven times faster than that of the CPU. Huang et al.
[13] proposed a WM-like, GPU-based multi-pattern matching
IEEE ICC 2015 - Communication and Information Systems Security Symposium
978-1-4673-6432-4/15/$31.00 ©2015 IEEE 7114