A Fingerprint-based Fast Hash Table for High-Speed Packet Processing
*
Xuyu Xiang, Jiaohua Qin
College of Transportation and Logistics, Central South University of Forestry and Technology,
Changsha 410004, China E-mail:{xyuxiang@163.com,qinjiaohua@163.com}
Abstract
As link rates and traffic volumes of the Internet are constantly increasing, existing hash tables have a
challenge of high performance, that is how to both satisfy high-speed packet processing and limit memory
space. Recently proposed Pruned Fast Hash Table (PFHT) and Shared-node Fast Hash Table (SFHT)
suffer from both high-overhead update and large memory space, where update requires not only frequent
accesses of off-chip memory but also extra off-chip memory space to store either a backup Basic Fast Hash
Table or several replicas of items. In this paper, we propose a Fingerprint-based Fast Hash Table (F
2
HT),
when store/query an item
, we leverage the fingerprint '( )hx by hashing the item to select
the
('()mod)hx kth of k buckets indexed by
to store/search the item. F
2
HT is a trade-off scheme among the
query performance, the update overhead and memory space requirement of the Fast Hash Tables, which
has
(1)O
average memory accesses of insert, delete, and query, without extra off-chip memory space.
Experimental results show that compared with PFHT and SFHT, F
2
HT reduces the overhead of update by
one order of magnitude, as well as the size of off-chip memory space by several factors.
Keywords: Packet Processing
;
Hash Table
;
Bloom Filter
;
Fingerprint
1. Introduction
Hash table is a data structure that uses a hash function to map identifying values, known as keys,
to their associated values , that is, each lookup is independent of the number of elements stored in
the table
[1]
. Hash table designs also allow arbitrary insertions and deletions of key-value pairs, at
constant average
(1)O
[2,3]
. Hash collisions are practically unavoidable as hashing a random subset of
a large set of possible keys, which leads to increase of the operation spending and reduce of the
search performance. In order to reduce the Hash table Collision, some collision resolution strategies
are proposed, such as Linear Chaining, Linear Probing and Double Hashing
[1]
, to guarantee the hash
table of average search performance for constant
(1)O
. Hash table is widely used in packet
processing, such as IP find
[4,5]
, packet classification
[6-8]
, depth packet detection
[9-11]
and network
traffic monitoring
[12,13]
, etc.
Packet processing is concentrated on finding the best rules matched with each input data packet
in a given set of rules. For example, the IP lookup is concentrated on finding the Longest prefix
matching rule with the destination IP address of input data packet in a group of constituted rules by
the IP address prefix
[4,5]
. packet classification is concentrated on finding the rules that most closely
matches the input data packet 5-tuple fields and the priority in a group of rules by the 5-tuple (that
the source IP address, source port, destination IP address, destination port and protocol type) and
the priority
[6-8]
. Deep packet inspection is concentrated on finding all the rules that match the input
data packet payload in a set of rules constituted signature string
[9,10]
. Packet processing is a
computationally intensive operation
[14]
, usually deployed in the critical data path of the high-speed
router, the mass of data packets must be checked and the tens of thousands of rules be matched.
With the rapid growth of network bandwidth and traffic, the hash table will be facing a high-
performance challenge, namely how to meet the time and space requirements of high-speed packet
processing. In packet processing, the hash table is mainly used in the TCP connection records
[9,10]
,
the statistics of each flow state
[12,13]
, and representation and lookup of the rule set
[14,5-7,11]
.
Although the hash table has a good average performance, its worst performance is the bottleneck of
a high-speed packet processing. Collision in the hash table increases with its load, which not only
increase the worst detection sequence and cache queue length, and consume the limited computing
and storage resources of high-speed routers. To adapt to the high-speed, high-capacity packet
processing, there is an urgent need to optimize the hash table design based on hardware to improve
the worst performance of packet processing. On the one hand, in order to meet the time demand for
line-speed processing of network packets, such as packet processing throughput up to 10 ~ 40Gbps,
researchers use modern embedded memory technology, such as Application Specific Integrated
Journal of Convergence Information Technology(JCIT)
Volume 7, Number 19, Oct 2012
doi : 10.4156/jcit.vol7.issue19.30
A Fingerprint-based Fast Hash Table for High-Speed Packet Processing
Xuyu Xiang, Jiaohua Qin