Matrix-Based Parallel Pattern Matching Algorithms: GPU Acceleration and Performance Optimization

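This paper examines a matrix-based approach to parallel pattern matching. The authors, Hongli Zhang, Dongliang Xu, Lei Zhang, and Yanbin Sun of Harbin Institute of Technology, propose two core algorithms: vector-based single-pattern matching (VBSP) and matrix-based multi-pattern matching (MBMP). Both are built on models designed for parallel execution, which makes them well suited to acceleration on hardware such as graphics processing units (GPUs). The authors further develop the matrix-based multi-pattern approximate (MBMPA) and matrix-based multi-pattern exact (MBMPE) algorithms. The GPU implementation of MBMP is called G-MBMP, and it is compared with existing GPU-accelerated pattern matching algorithms: G-impMASM, G-WM (GPU-based Wu-Manber), and G-AC (GPU-based Aho-Corasick). According to the reported experiments, G-MBMPA outperforms G-impMASM, and G-MBMPE outperforms both G-WM and G-AC. G-MBMPE also uses the least memory of the three exact-matching algorithms, significantly less than G-AC. The keywords are matrix, parallel pattern matching, and GPU. The work is relevant to pattern search over large data sets and in real-time applications, with potential uses in fields such as bioinformatics, image processing, and network security.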

Matrix-based parallel pattern matching method

Hongli Zhang, Dongliang Xu, Lei Zhang, Yanbin Sun

School of Computer Science and Technology, Harbin Institute of Technology

Harbin, China

Email: Ray198421@gmail.com

Abstract—This study presents pattern matching algorithms based on vector and matrix models that are suitable for parallel pattern matching. On these two models, we further propose the vector-based single-pattern matching (VBSP) and the matrix-based multi-pattern matching (MBMP) algorithms, as well as the matrix-based multi-pattern approximate (MBMPA) algorithm and the matrix-based multi-pattern exact (MBMPE) algorithm. G-MBMP refers to the implementation of the MBMP algorithm on a graphics processing unit (GPU). G-MBMPA performs better than G-impMASM. G-MBMPE performs better than both G-WM (the GPU-based WM algorithm) and G-AC (the GPU-based AC algorithm). G-MBMPE also consumes the least memory of the three algorithms, significantly less than G-AC.

Index Terms—matrix, parallel pattern matching, GPU, G-MBMP.

I. INTRODUCTION

Pattern matching is a fundamental research problem in computer science. It serves as the kernel module of network security applications such as intrusion detection systems and firewalls, where the performance of the pattern matching algorithm directly affects the efficiency of the whole system. Serial pattern matching has been studied for 30 years and is a well-established research topic. Studies on parallel pattern matching are far fewer than those on serial pattern matching.

Rapid network growth over the past few years has made it difficult for the traditional pattern matching algorithms used in network security systems to keep up with current network traffic. Hence, an increasing number of researchers have begun studying parallel pattern matching algorithms. Researchers have also turned to hardware to meet the demands of today's high-speed networks. Such methods include content-addressable memory (CAM) [1]–[3] and field-programmable gate arrays (FPGA) [4], [5]. Although CAM and FPGA are more efficient than traditional methods, they cost more and are less portable.

General-purpose computing on graphics processing units (GPGPU) offers researchers an alternative. Aside from graphics rendering, the GPU is now also used for general-purpose computation. GPGPU generally adopts a heterogeneous CPU + GPU model. The central processing unit (CPU), which is ill suited to data-parallel computing, is responsible for complex logic processing and transaction management. Meanwhile, the GPU handles intensive, large-scale data-parallel computing. Unlike a CPU, which is optimized for sequential code, all commodity GPUs follow a streaming, data-parallel programming model that resembles single-program multiple-data (SPMD). The GPU has a parallel multi-core architecture in which each core contains thread processors that together run hundreds of threads simultaneously. The high processing capacity and high bandwidth of the GPU compensate for the limited performance of the CPU on such workloads. This approach taps the latent performance of the computer and offers a significant advantage in terms of cost and performance.
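The data-parallel division of work described in this section can be illustrated with a minimal sketch. It assumes the simplest possible decomposition, in which each logical GPU thread tests a single start offset of the text against one pattern. This is not the VBSP or MBMP algorithm of this paper, and the function names `match_at` and `parallel_find` are hypothetical. Python runs the "threads" one after another, so the sketch shows only that the per-thread work is independent, not any actual GPU speedup.

```python
def match_at(text, pattern, offset):
    """Work of one logical thread: compare the pattern at a single start offset."""
    return text[offset:offset + len(pattern)] == pattern


def parallel_find(text, pattern):
    """Emulate a data-parallel launch with one logical thread per start offset.

    Assumption: threads share read-only access to text and pattern and write
    one flag each, so no synchronization between threads is needed.
    """
    if not pattern or len(pattern) > len(text):
        return []
    n_threads = len(text) - len(pattern) + 1
    # On a GPU each iteration below would be executed by a separate thread.
    flags = [match_at(text, pattern, tid) for tid in range(n_threads)]
    return [tid for tid, hit in enumerate(flags) if hit]


if __name__ == "__main__":
    print(parallel_find(b"abracadabra", b"abra"))
```

Because no thread depends on the result of another, the loop body maps directly onto a single-program multiple-data launch; the CPU side would only prepare the input and collect the flags.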

II. RELATED WORK

An increasing number of researchers are focusing on approximate matching because of its wide range of applications. Prasad et al. [6] proposed two multi-pattern approximate string matching (MASM) algorithms, namely, MASM1 and MASM2. These two algorithms require no verification and can handle patterns longer than a computer word (w). MASM uses the bit-parallel automata (BPA) of approximate matching and concatenation to form a single pattern from a set of r patterns. The main drawback of this algorithm is that it requires all the patterns to be of equal length (m), which is disadvantageous in improving the performance through the use of GPU when . Xu et al. [7] proposed an improved bit-parallel algorithm, impMASM, which can be ported efficiently to the GPU and can handle patterns of unequal lengths.
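The kind of bit-parallel automaton that such algorithms build on can be illustrated for a single pattern. The following is a minimal sketch of the classic Shift-And automaton extended to at most k edit errors (substitutions, insertions, deletions). It is not MASM1, MASM2, or impMASM, it does not concatenate several patterns, and the function name `bitparallel_approx` is hypothetical. Python integers are unbounded, so the sketch also ignores the machine-word limit (w) mentioned above.

```python
def bitparallel_approx(text, pattern, k):
    """Return end indices in text where pattern matches with at most k edit errors."""
    m = len(pattern)
    if m == 0:
        return []
    full = (1 << m) - 1
    # masks[c] has bit i set when pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    # rows[d], bit i: pattern[:i+1] matches some suffix of the text read so far
    # with at most d errors. Before any text is read, only deletions apply.
    rows = [((1 << d) - 1) & full for d in range(k + 1)]
    accept = 1 << (m - 1)
    ends = []
    for pos, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = rows[0]                      # row d-1 before this character
        rows[0] = ((rows[0] << 1) | 1) & mask   # exact Shift-And step
        for d in range(1, k + 1):
            old = rows[d]
            rows[d] = ((((old << 1) | 1) & mask)   # characters match
                       | ((prev_old << 1) | 1)      # substitution
                       | prev_old                   # extra character in the text
                       | ((rows[d - 1] << 1) | 1)   # pattern character missing
                       ) & full
            prev_old = old
        if rows[k] & accept:
            ends.append(pos)
    return ends


if __name__ == "__main__":
    print(bitparallel_approx("xxabdxx", "abc", 1))
```

Each text character costs a constant number of word operations per error level, which is why bit-parallel automata are attractive when the pattern fits in one machine word.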

Over the past several years, researchers have developed a variety of GPU-based pattern matching algorithms to speed up exact matching in network applications. Some researchers have ported traditional multi-pattern matching algorithms, including the Aho-Corasick (AC) [8]–[10] and Wu-Manber (WM) [11]–[13] algorithms, to the GPU. Tran et al. [8] proposed a memory-efficient GPU-based parallelization approach for the AC algorithm. The approach parallelizes the AC algorithm by efficiently placing and caching both the input text string and the reference data, organized as a 2D state transition table (STT), in the on-chip shared memories and texture caches. This placement significantly reduces average memory access latencies and improves system performance.
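The 2D table used by the AC algorithm can be sketched as follows. This is a minimal, CPU-only illustration of building an Aho-Corasick state transition table and scanning with it, assuming a byte alphabet of 256 symbols. It does not reproduce the shared-memory or texture-cache placement of Tran et al., and the names `build_stt` and `scan` are hypothetical.

```python
from collections import deque


def build_stt(patterns, alphabet_size=256):
    """Build a dense table stt[state][byte] -> next state, plus per-state outputs."""
    stt = [[-1] * alphabet_size]
    outputs = [set()]
    # Phase 1: insert every pattern into a trie stored in the table.
    for idx, pat in enumerate(patterns):
        state = 0
        for byte in pat:
            if stt[state][byte] == -1:
                stt.append([-1] * alphabet_size)
                outputs.append(set())
                stt[state][byte] = len(stt) - 1
            state = stt[state][byte]
        outputs[state].add(idx)
    # Phase 2: breadth-first pass that resolves failure links into the table,
    # so every (state, byte) pair ends up with a direct transition.
    fail = [0] * len(stt)
    queue = deque()
    for byte in range(alphabet_size):
        nxt = stt[0][byte]
        if nxt == -1:
            stt[0][byte] = 0
        else:
            queue.append(nxt)
    while queue:
        state = queue.popleft()
        outputs[state] |= outputs[fail[state]]
        for byte in range(alphabet_size):
            nxt = stt[state][byte]
            if nxt == -1:
                stt[state][byte] = stt[fail[state]][byte]
            else:
                fail[nxt] = stt[fail[state]][byte]
                queue.append(nxt)
    return stt, outputs


def scan(stt, outputs, text):
    """Return (end_index, pattern_index) pairs for every match in text."""
    state = 0
    hits = []
    for pos, byte in enumerate(text):
        state = stt[state][byte]
        for idx in sorted(outputs[state]):
            hits.append((pos, idx))
    return hits


if __name__ == "__main__":
    table, outs = build_stt([b"he", b"she", b"his", b"hers"])
    print(scan(table, outs, b"ushers"))
```

Because every state has an entry for every byte, a scan performs exactly one table lookup per input byte with no failure-link traversal. The price is a table of (number of states) × 256 entries, which is why the memory footprint of AC-based matchers is a recurring concern.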

Huang et al. [12] implemented two sequential algorithms, namely, the WM and AC algorithms, on the GPU parallel computation platform. The experimental results showed that the throughput of the GPU implementation is approximately five to seven times that of the CPU. Huang et al. [13] proposed a WM-like, GPU-based multi-pattern matching

IEEE ICC 2015 - Communication and Information Systems Security Symposium
