
perform as well as Lea's algorithm, even in nonmultiprogrammed cases, although
lock-free algorithms are expected to benefit systems mainly in multiprogrammed
environments. Under high loads, they significantly outperform Lea’s algorithm,
exhibiting up to four times higher throughput. They also exhibit greater robustness,
for example in experiments where the hash function is biased to create nonuniform
distributions.
The remainder of this article is organized as follows: In the next section, we
describe the background and the new algorithm in depth. In Section 3, we present
the full correctness proof. In Section 4, the empirical results are presented and
discussed.
2. The Algorithm in Detail
Our hash table data structure consists of two interconnected substructures (see
Figure 1): A linked list of nodes containing the stored items and keys, and an
expanding array of pointers into the list. The array entries are the logical “buckets”
typical of most hash tables. Any item in the hash table can be reached by traversing
down the list from its head, while the bucket pointers provide shortcuts into the list
in order to minimize the search cost per item.
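To make this layout concrete, here is a minimal C sketch of the two substructures; the type and field names (so_key_t, node_t, hashtable_t) are ours rather than the paper's, and all concurrency control is omitted.

  #include <stdint.h>
  #include <stdlib.h>

  /* Keys as stored in the main list; the type name is ours. */
  typedef uintptr_t so_key_t;

  /* One node of the single main list: an item's key plus a link to the
     next node. */
  typedef struct node {
      so_key_t     key;
      struct node *next;
  } node_t;

  /* The table: an expanding array of bucket pointers that serve as
     shortcuts into the main list, plus the current size and item count. */
  typedef struct hashtable {
      node_t **buckets;   /* array of pointers into the list */
      size_t   size;      /* number of logical buckets, always a power of 2 */
      size_t   count;     /* number of items currently in the list */
  } hashtable_t;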
The main difficulty in maintaining this structure is in managing the continuous
coverage of the full length of the list by bucket pointers as the number of items in
the list grows. The distribution of bucket pointers among the list items must remain
dense enough to allow constant time access to any item. Therefore, new buckets
need to be created and assigned to sparsely covered regions in the list.
The bucket array initially has size 2, and is doubled every time the number of
items in the table exceeds size · L, where L is a small integer denoting the load
factor, the maximum number of items one would expect to find in each logical
bucket of the hash table. The initial state of all buckets is uninitialized, except
for the bucket of index 0, which points to an empty list, and is effectively the
head pointer of the main list structure. Each bucket goes through an initialization
procedure when first accessed, after which it points to some node in the list.
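Continuing the sketch above, the following sequential code illustrates the initial state and the doubling policy just described. The names MAX_LOAD, UNINITIALIZED, ht_create, and maybe_resize are ours, the load-factor constant is arbitrary, and the realloc-based resize merely stands in for the algorithm's lock-free growth mechanism.

  #define MAX_LOAD 4   /* the load factor L; the constant here is illustrative */

  /* A sentinel distinct from NULL marks an uninitialized bucket, so it
     can be told apart from a pointer to an empty list. */
  #define UNINITIALIZED ((node_t *)(uintptr_t)1)

  /* Create a table of size 2; only bucket 0 is initialized, acting as
     the head pointer of the (initially empty) main list. */
  hashtable_t *ht_create(void) {
      hashtable_t *t = malloc(sizeof *t);
      t->size  = 2;
      t->count = 0;
      t->buckets = malloc(t->size * sizeof(node_t *));
      t->buckets[0] = NULL;            /* head of the empty main list */
      t->buckets[1] = UNINITIALIZED;   /* initialized lazily on first access */
      return t;
  }

  /* Called after a successful insertion: double the bucket array once
     the number of items exceeds size * L. */
  static void maybe_resize(hashtable_t *t) {
      if (t->count > t->size * MAX_LOAD) {
          size_t new_size = 2 * t->size;
          t->buckets = realloc(t->buckets, new_size * sizeof(node_t *));
          for (size_t i = t->size; i < new_size; i++)
              t->buckets[i] = UNINITIALIZED;  /* new buckets start uninitialized */
          t->size = new_size;
      }
  }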
When an item of key k is inserted, deleted, or searched for in the table, a hash
function modulo the table size is used, that is, the bucket chosen for item k is
k mod size. The table size is always equal to some power 2^i, i ≥ 1, so that the
bucket index is exactly the integer represented by the key’s i least significant bits
(LSBs). The hash function’s dependency on the table size makes it necessary to
take special care as this size changes: an item that was inserted into the hash table's
list before the resize must be accessible, after the resize, from both the buckets it
already belonged to and from the new bucket it will logically belong to given the
new hash function.
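As a concrete illustration of this dependency (using the types from the sketches above and a helper name of our own): because size is a power of two, taking k mod size amounts to masking off all but the i least significant bits. A key of 13, for example, hashes to bucket 1 while the size is 4 (13 mod 4 = 1) but to bucket 5 once the table has doubled to 8 (13 mod 8 = 5), so after the resize it must also be reachable from bucket 5.

  /* The bucket chosen for key k is k mod size; since size == 2^i, this
     is exactly the integer formed by the key's i least significant bits. */
  static size_t bucket_index(so_key_t k, size_t size) {
      return (size_t)(k & (size - 1));   /* k mod size for a power-of-two size */
  }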
2.1. RECURSIVE SPLIT-ORDERING. The combination of a modulo-size hash
function and a 2^i table size is not new. It was the basis of the well known
sequential extensible Linear Hashing scheme proposed by Litwin [1980], was the
basis of the two-level locking hash scheme of Ellis [1983], and was recently used
by Lea [2003] in his concurrent extensible hashing scheme. The novelty here is
that we use it as a basis for a combinatorial structure that allows us to repeatedly
“split” all the items among the buckets without actually changing their position in
the main list.