Eunomia: An HTM Scaling Strategy for Concurrent Search Trees under High Contention

Research paper

Updated 2024-08-27. PDF, 644 KB.

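
"Eunomia is a study of scaling concurrent search trees under high contention using hardware transactional memory (HTM). It analyzes the performance problems that arise when HTM is used to build an efficient concurrent B+Tree, and proposes a new design pattern, Eunomia, that reduces HTM transaction aborts and improves concurrent performance."

On modern multicore processors, hardware transactional memory (HTM) has been widely used in concurrent data structures such as search trees to provide better performance and scalability. Under high contention, however, existing HTM-based solutions do not consistently deliver the expected performance. The paper focuses on the excessive transaction aborts that HTM incurs on concurrent search trees when conflicts are frequent, including aborts caused by false conflicts and aborts caused by true conflicts.

The paper first analyzes an HTM-based concurrent B+Tree in depth to identify why its HTM transactions abort. The causes may include, but are not limited to, oversized transactional regions that raise the probability of conflict, and version-control issues during concurrent updates. Based on this analysis, the authors propose the Eunomia design pattern, a set of principles for reducing HTM transaction aborts.

One key feature of Eunomia is that it splits an HTM region into multiple regions coordinated by version-based concurrency control. This shrinks the footprint of each transaction and so lowers the likelihood of conflict. Eunomia may also employ optimized conflict detection and resolution to handle false conflicts better and avoid unnecessary rollbacks. Finer-grained version management lets it track and coordinate concurrent operations more effectively, which reduces aborts caused by true conflicts. The design may further include prediction techniques that anticipate likely conflicts and adjust the execution path in advance, and it may adopt adaptive strategies that tune transaction size and execution policy to the current load and conflict level.

In short, Eunomia is an optimized design for applying HTM to concurrent search trees. By reducing transaction aborts and improving concurrency, it aims to scale search-tree performance under high contention. The result is of theoretical and practical value for understanding and improving the performance of concurrent data structures, particularly in large-scale parallel computing and distributed systems.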

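The summary above mentions splitting one HTM region into several regions tied together by version checks. The following is a minimal sketch of that general pattern only, not the paper's algorithm: all class and method names are hypothetical, a `threading.Lock` stands in for a small HTM region (Python has no HTM), and the version is assumed to change only on structural changes such as a split.

```python
import threading


class VersionedLeaf:
    """A leaf guarded by a version number (all names are hypothetical)."""

    def __init__(self):
        self.version = 0                # assumed to change on structural changes
        self.records = {}
        self.region = threading.Lock()  # stand-in for one small HTM region

    def snapshot(self):
        # First region: what a traversal would remember about this leaf.
        with self.region:
            return self.version

    def restructure(self):
        # For example a split; invalidates snapshots taken earlier.
        with self.region:
            self.version += 1

    def try_put(self, seen_version, key, value):
        # Second region: validate the snapshot, then update the leaf.
        with self.region:
            if self.version != seen_version:
                return False            # stale snapshot: caller must retry
            self.records[key] = value
            return True


def put(leaf, key, value):
    """Retry until the snapshot taken by the first region is still valid."""
    while not leaf.try_put(leaf.snapshot(), key, value):
        pass
```

The point of the pattern is that a failed validation only repeats the short second step and a fresh snapshot, rather than one large region covering everything.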

use of HTM to construct concurrent search tree structures. Here, we use an HTM-based B+Tree from DBX [32], an in-memory database, as an example. A B+Tree is a B-Tree in which internal nodes only store keys, and only leaves are associated with values [3]. The HTM-B+Tree adopts HTM regions to protect operations of the B+Tree such as get, put, and delete. This design was later adopted and shown to be effective in other distributed in-memory databases [10, 34]. Since most HTM-B+Tree operations share the same core procedure for accessing the B+Tree, here we use the put operation as an example to illustrate the access algorithm of HTM-B+Tree in Algorithm 1, which comprises the following steps. For brevity, we omit the structural changes and rebalance operations.

(1) Traversing the internal nodes (Lines 6-8). In this stage, the request traverses tree edges from the root to the target leaf node.

(2) Traversing the leaf nodes (Lines 10-15). The request first checks whether the target leaf already contains the key. If so, the put operation becomes an update; otherwise it inserts a new record.

(3) Propagating splits upwards (Lines 17-19). For a put operation, if the target leaf node is already full, the insertion triggers a split, which propagates upwards until it reaches an internal node with empty slots.

The three stages are enclosed in a monolithic HTM region marked by xbegin and xend primitives; such a coarse-grained HTM region eliminates the complexity of maintaining fine-grained locks and makes it easy to reason about correctness. As a result, it was shown to perform much better than a state-of-the-art B+Tree (i.e., Masstree [20]) under low to modest contention [32].
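Algorithm 1 is not reproduced in this excerpt, so the following is only a rough sketch of the three stages described above, not the paper's listing. It makes several assumptions: a single `threading.Lock` stands in for the xbegin/xend region (Python has no HTM), `FANOUT` and all class names are hypothetical, and delete and rebalancing are left out.

```python
import bisect
import threading

FANOUT = 4  # hypothetical maximum number of keys per node


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.vals = []  # leaves only: values aligned with keys
        self.kids = []  # internal nodes only: len(keys) + 1 children


class MonolithicBPlusTree:
    def __init__(self):
        self.root = Node(leaf=True)
        self.region = threading.Lock()  # stand-in for one xbegin/xend region

    def put(self, key, val):
        with self.region:  # the whole operation is one coarse region
            # Stage 1: traverse internal nodes from the root to the leaf.
            path, node = [], self.root
            while not node.leaf:
                path.append(node)
                node = node.kids[bisect.bisect_right(node.keys, key)]
            # Stage 2: search the leaf; update on a duplicate, else insert.
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.vals[i] = val
                return
            node.keys.insert(i, key)
            node.vals.insert(i, val)
            # Stage 3: propagate splits upwards while a node overflows.
            while len(node.keys) > FANOUT:
                sep, right = self._split(node)
                if not path:  # the root itself was split: grow the tree
                    new_root = Node(leaf=False)
                    new_root.keys, new_root.kids = [sep], [node, right]
                    self.root = new_root
                    return
                parent = path.pop()
                j = bisect.bisect_right(parent.keys, sep)
                parent.keys.insert(j, sep)
                parent.kids.insert(j + 1, right)
                node = parent

    def _split(self, node):
        mid = len(node.keys) // 2
        sep = node.keys[mid]
        right = Node(node.leaf)
        if node.leaf:  # the separator key stays in the right leaf
            right.keys, right.vals = node.keys[mid:], node.vals[mid:]
            node.keys, node.vals = node.keys[:mid], node.vals[:mid]
        else:          # the separator key moves up to the parent
            right.keys, right.kids = node.keys[mid + 1:], node.kids[mid + 1:]
            node.keys, node.kids = node.keys[:mid], node.kids[:mid + 1]
        return sep, right

    def get(self, key):
        with self.region:
            node = self.root
            while not node.leaf:
                node = node.kids[bisect.bisect_right(node.keys, key)]
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return node.vals[i]
            return None
```

Because every stage sits inside the same region, a conflict at any point forces the whole `put` to be repeated, which is what makes the design easy to reason about and also what the next section identifies as a problem.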

2.3 Issues under High Contention

While the HTM-based concurrent tree structure has high performance under low and modest contention, its performance may collapse under high contention. To illustrate this, we evaluate the throughput of the HTM-based B+Tree using the YCSB benchmark with the Zipfian input distribution [17, 25]. We adjust the skew coefficient θ of the Zipfian distribution to simulate different levels of contention. We run the test on a 20-core platform with Intel's TSX [12] support. All performance results are collected using 16 threads (a few cores are reserved for controlling threads). Threads are distributed equally across two sockets (see Section 5.1 for the detailed experimental setup).
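To see why the skew coefficient acts as a contention knob, the sketch below uses the common Zipfian definition in which the key of rank i is chosen with probability proportional to 1/i^θ. The key count and the "hottest 1%" cutoff are assumptions chosen for illustration, and the benchmark's own generator may differ in detail.

```python
def zipf_probs(n, theta):
    """Probability of each rank 1..n under a Zipfian law with skew theta."""
    weights = [1.0 / (rank ** theta) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def hot_share(n, theta, top_fraction=0.01):
    """Fraction of all accesses that land on the hottest top_fraction of keys."""
    probs = zipf_probs(n, theta)
    top = max(1, int(n * top_fraction))
    return sum(probs[:top])


if __name__ == "__main__":
    for theta in (0.0, 0.5, 0.6, 0.9, 0.99):
        print(f"theta={theta}: hottest 1% of keys receive "
              f"{hot_share(10_000, theta):.1%} of accesses")
```

With θ = 0 the distribution is uniform, so the hottest 1% of keys receive 1% of accesses; as θ grows, an increasing share of requests targets the same few records and therefore the same few leaf nodes.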

As shown in Figure 1, at a low contention rate (i.e., skew coefficient θ < 0.6), the HTM-based B+Tree achieves high and stable performance. However, when the contention rate increases (e.g., θ > 0.6), the performance of the HTM-based B+Tree drops sharply. When θ = 0.9, throughput falls below 3 million ops/s. To understand the reasons behind the performance collapse, we collect the number of HTM aborts. Since adding performance counters to each HTM region severely hinders the overall throughput, here we set performance counters in every 10 operations, so that the performance with HTM counters deviates little from that without counters. As shown in Figure 2, the HTM abort rate increases sharply with the contention rate; the HTM abort rate for θ = 0.9 is around 47X higher than that for θ = 0.5. The collected CPU cycles also show that frequent HTM aborts and retries waste more than 94% of the total CPU cycles when θ = 0.9.

Figure 1: Performance under different contention rates. (x-axis: Skew Coefficient (θ), 0 to 1; y-axis: Throughput (Million ops/sec); series: HTM-B+Tree.)

Figure 2: HTM aborts incurred by different reasons. (x-axis: Skew Coefficient (θ), 0.5 to 0.99; y-axis: HTM Aborts per Operation; categories: Same Record, Meta-Data, Different Records.)
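The sampling idea described above, instrumenting only one operation in ten so that the counters barely disturb throughput, can be sketched as follows. This is an illustration only: the class name and interface are hypothetical, and the abort count per operation is passed in by the caller rather than read from hardware counters.

```python
class SampledAbortCounter:
    """Records abort counts for one operation out of every `period`."""

    def __init__(self, period=10):
        self.period = period
        self.ops = 0
        self.sampled_ops = 0
        self.sampled_aborts = 0

    def record(self, aborts):
        # Called once per completed operation with its number of aborts.
        self.ops += 1
        if self.ops % self.period == 0:  # only every period-th op is counted
            self.sampled_ops += 1
            self.sampled_aborts += aborts

    def aborts_per_op(self):
        # Estimate of aborts per operation, based on the sampled ones only.
        if self.sampled_ops == 0:
            return 0.0
        return self.sampled_aborts / self.sampled_ops
```

The estimate is only as good as the sample: it assumes the sampled operations are representative of the rest.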

To understand the underlying reasons for collapsed performance under high contention, we perform a detailed analysis and uncover three main sources of aborts.

• High retry cost due to monolithic transactions. While using a monolithic transaction for a critical section provides consistency with trivial effort, it also causes increased abort rates. Even worse, a retry from a leaf node would waste a lot of useful work, causing high retry cost. Our analysis finds that the distribution of conflicts is non-uniform in the B+Tree: more than 90% of conflicts occur in the leaf level. In this case, a conflict in leaf nodes will abort the entire tree traversal from root to leaf, even though there is no conflict in internal nodes.

• False conflicts. False conflicts are conflicts incurred by requests accessing different records. False conflicts stem from two major reasons. The first one is cache line sharing of consecutive records. B+Tree arranges keys stored in a node in a consecutive manner to provide an ordered store. However, such data layout causes severe conflicts under high contention. Since HTM detects conflicts at cache line granularity, concurrently accessing data in the same cache line would result in increased
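Two of the abort sources listed above lend themselves to back-of-the-envelope arithmetic. The sketch below is a toy model, not a measurement from the paper. It assumes that every abort is caused by a leaf-level conflict, that keys are fixed 8-byte values stored back to back in a cache-line-aligned array, and that cache lines are 64 bytes, as on current x86 processors. The leaf-only retry is a hypothetical alternative used for comparison.

```python
CACHE_LINE_BYTES = 64  # cache-line size on current x86 processors
KEY_BYTES = 8          # assumption: fixed 8-byte keys stored back to back


def node_visits_monolithic(height, aborts):
    """Every abort restarts the whole root-to-leaf traversal."""
    return (aborts + 1) * height


def node_visits_leaf_retry(height, aborts):
    """Hypothetical alternative: only the leaf step is repeated on abort."""
    return (height - 1) + (aborts + 1)


def cache_line_of(slot):
    """Cache line holding key slot `slot` of a line-aligned key array."""
    return (slot * KEY_BYTES) // CACHE_LINE_BYTES


def is_false_conflict(slot_a, slot_b):
    """Different records that still share one cache line."""
    return slot_a != slot_b and cache_line_of(slot_a) == cache_line_of(slot_b)
```

Under these assumptions, a tree of height 4 with 3 leaf-level aborts costs 16 node visits when the whole traversal is retried and 7 when only the leaf step is retried. Slots 0 and 7 share a cache line and can conflict even though they hold different records, while slots 7 and 8 do not.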
