Cache-Oblivious Streaming B-trees: Data Structures and Efficiency Optimization


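"Cache-Oblivious Streaming B-trees" is a computer science paper by Michael A. Bender, Martin Farach-Colton, Jeremy T. Fineman, Yonatan R. Fogel, Bradley C. Kuszmaul, and Jelani Nelson. It studies streaming B-trees, dictionaries that support both insertions and range queries efficiently, in the cache-oblivious setting, where the data structure is not told the block-transfer size $B$ or the memory size $M$. Such structures suit workloads with heavy insert traffic and range scans over data that does not fit in memory.

The paper presents two cache-oblivious streaming B-trees: the shuttle tree and the cache-oblivious lookahead array (COLA).

For block-transfer size $B$ and $N$ elements, the shuttle tree performs a search in the optimal $O(\log_{B+1} N)$ block transfers and a range query over $L$ successive elements in the optimal $O(\log_{B+1} N + L/B)$ transfers. Its insertions cost $O((\log_{B+1} N)/B^{\Theta(1/(\log\log B)^2)} + (\log^2 N)/B)$ transfers.

The COLA emphasizes insertion speed: it inserts in amortized $O((\log N)/B)$ transfers, at the cost of $O(\log N)$-transfer searches.

Because neither structure depends on $B$ or $M$, the bounds hold for any cache size without tuning to a specific hardware configuration. This matters for large-scale data processing, distributed systems, and cloud computing, where a mismatch between a data structure's tuning and the actual cache hierarchy can become a performance bottleneck.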

There are several consequences of this invariant and balancing strategy. (See e.g. [3,6] for full proofs.)

LEMMA 1. Consider an $N$-node weight-balanced tree with constant balance parameter $c$.

(1) The degree of any node is $\Theta(c)$.

(2) For any node $u$ and constant $d \le h(u)$, the number of descendants of $u$ that have height at least $h(u) - d$ is $\Theta(c^d)$.

(3) Suppose that a node split has cost 1. Then the amortized cost to insert into the tree is the search cost plus $O(1)$.

(4) Suppose that splitting a node $u$ costs $\Theta(c^{h(u)})$. Then the amortized cost to insert into the tree is the search cost plus $O(\log N)$.

The shuttle tree supports inserts efficiently by using the buffers. The buffers work in much the same way as in a BRT: an element being inserted starts at the root, follows the appropriate root-to-leaf path, but pauses at buffers along the way. The element only gets “shuttled” down the tree when buffers overflow (and hence are full enough to amortize the cost of crossing block boundaries). Our shuttle tree differs from the BRT in that it has a linked list of buffers associated with each child pointer (rather than a single buffer as in the BRT). These buffers have doubly-exponentially increasing size.

To insert into a shuttle tree, start by inserting into the root node. To insert into a node in the tree, simply find the appropriate child pointer and insert into the corresponding linked list of buffers by inserting into the smallest buffer. When a buffer “overflows” (i.e., the height of the buffer shuttle tree exceeds the to-be-specified maximum), take each element in the buffer and insert it into the next (larger) buffer in the list. Once the largest buffer in the list overflows, insert these items into the child node in the shuttle tree. (Thus, data items in the shuttle tree live in two possible places, either in some buffer on a root-to-leaf path or in a leaf of the tree.)
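The overflow cascade described above can be sketched in Python. This is a toy model under loud assumptions, not the paper's structure: the class name `BufferChain`, the element-count capacities, and the plain list standing in for the child node are all hypothetical. In the paper each buffer is itself a shuttle tree and overflow is defined by its height, which this sketch does not model.

```python
from typing import List


class BufferChain:
    """Toy model of the linked list of buffers on one child pointer.

    Assumption: overflow is modeled by element count, with made-up
    capacities; the paper defines overflow by buffer height instead.
    """

    def __init__(self, capacities: List[int]) -> None:
        self.capacities = capacities
        self.buffers: List[List[int]] = [[] for _ in capacities]
        self.child: List[int] = []  # hypothetical stand-in for the child node

    def insert(self, item: int) -> None:
        # New elements always enter the smallest buffer.
        self.buffers[0].append(item)
        level = 0
        # Cascade: an overflowing buffer empties into the next larger one.
        while (level < len(self.buffers)
               and len(self.buffers[level]) > self.capacities[level]):
            spilled, self.buffers[level] = self.buffers[level], []
            if level + 1 < len(self.buffers):
                self.buffers[level + 1].extend(spilled)
            else:
                # The largest buffer overflowed: shuttle its items to the
                # child node, in arrival order (not sorted).
                self.child.extend(spilled)
            level += 1
```

With capacities `[2, 4]`, inserting six elements fills and flushes both buffers, so all six end up in `child` in the order they arrived, and a seventh element waits alone in the smallest buffer.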

When an inserted element reaches a leaf in the shuttle tree, insertions work in roughly the same way as in any SWBST with splits trickling up the tree. Note that at the time a node $u$ splits, the buffers in between $u$ and $u$’s parent have just been flushed.

LEMMA 2. When an element is inserted into a leaf $\ell$, all nodes on the path from the root to $\ell$ can be fetched without increasing the asymptotic complexity, as long as $M = \Omega(B \log N)$.

Proof. The reason we insert into a leaf is that its parent’s buffer has just overflowed, thus the grandparent buffer has just overflowed, and so on up the tree to the root. Thus, we just flushed buffers all the way down. If any subsequent block transfers evict the root-to-leaf path, we can charge the cost of replacing the relevant path block to the cost of evicting it in the first place.

We base our buffer sizes on Fibonacci numbers. Let $F_k$ be the $k$th Fibonacci number. Then $F_0 = 0$, $F_1 = 1$, and $F_k = F_{k-1} + F_{k-2}$. For all positive integers $h$, we define the Fibonacci factor of $h$, denoted by $\xi(h)$, as follows. If $h$ is a Fibonacci number, then $\xi(h) = h$. Otherwise, let $f$ be the largest Fibonacci number less than $h$. Then the Fibonacci factor of $h$ is $\xi(h) = \xi(h - f)$. The buffer sizes of a node at height $h + 1$ depend upon $\xi(h)$. In particular, consider a node $u$ at height $h + 1$ in the tree, and let $k$ be such that $F_k = \xi(h)$. We define the buffer-height-index function $H(j) = j - \lceil 2\log_\varphi j \rceil$, where $\varphi \approx 1.618$ is the golden ratio. Then $u$ has buffers with heights $F_{H(j)}$, for each integer $j$, $j = \Theta(1), \ldots, k-1, k$. In other words, there are roughly $k$ buffers increasing geometrically in their heights, and the largest buffer has height $F_{H(k)} = F_{k - \lceil 2\log_\varphi k \rceil}$. These settings mean that the parent node of a subtree containing roughly $K$ nodes has the largest buffer of size roughly $K^{1/\Theta((\log\log K)^2)}$.

Footnote (on the items flushed when a buffer overflows): These items are inserted in arrival order, not smallest to largest.

Footnote (on the starting value of $j$): We can start $j$ at any sufficiently large constant to help the proofs, in particular Lemma 16.
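The definitions above are easy to compute directly. The following Python sketch assumes the indexing $F_0 = 0$, $F_1 = 1$ and reads the buffer-height-index function as $H(j) = j - \lceil 2\log_\varphi j \rceil$. The function names and the `j_min` parameter (the "sufficiently large constant" at which $j$ starts) are hypothetical choices made for illustration, and indices $H(j) < 1$ are simply skipped here.

```python
import math
from typing import List

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, about 1.618


def fib(k: int) -> int:
    """k-th Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def fibonacci_factor(h: int) -> int:
    """xi(h): h if h is a Fibonacci number, else xi(h - f), where f is
    the largest Fibonacci number less than h."""
    if h < 1:
        raise ValueError("h must be a positive integer")
    while True:
        a, b = 1, 1  # consecutive Fibonacci numbers
        while b <= h:
            a, b = b, a + b
        # a is now the largest Fibonacci number <= h
        if a == h:
            return h
        h -= a


def buffer_height_index(j: int) -> int:
    """H(j) = j - ceil(2 * log_phi(j))."""
    return j - math.ceil(2 * math.log(j, PHI))


def buffer_heights(k: int, j_min: int) -> List[int]:
    """Heights F_{H(j)} for j = j_min..k (j_min is an assumed constant)."""
    heights = []
    for j in range(j_min, k + 1):
        idx = buffer_height_index(j)
        if idx >= 1:  # skip indices that give no positive height
            heights.append(fib(idx))
    return heights
```

For example, `fibonacci_factor(20)` peels off 13, then 5, and stops at the Fibonacci number 2, while `buffer_height_index(20)` is 7, so the largest buffer for $k = 20$ has height $F_7 = 13$.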

The shuttle tree as specified thus far cannot yet be analyzed in the cache-oblivious setting. To do so, we must enforce a particular dynamic layout in memory. We must show that the layout permits efficient operations and can be efficiently maintained.

Shuttle-tree layout

We lay out the shuttle tree recursively in a type of “van Emde Boas (vEB) layout” [20] that takes into account the lists of buffers and several additional complications.

We first explain how our vEB layout would proceed on a regular tree of height $h$. Let $F_i$ be the largest Fibonacci number strictly smaller than $h$. Then we split the tree at height $F_i$ (roughly $h/\varphi$, instead of at height $h/2$ as in the traditional vEB layout). That is, if $h = F_k$ is the $k$th Fibonacci number, then $i = k-1$ and we split the tree into a root subtree of height $F_{k-2}$ and leaf subtrees of height $F_{k-1}$, which are recursively laid out. It is important that the split is above the halfway point, height $h/2$, unlike in previous cache-oblivious search structures [6–8, 11, 20]. Fibonacci numbers are a convenient way to ensure this requirement because they enforce some integrality while roughly matching $F_k \approx \varphi^k$.
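The split rule for a plain tree (ignoring buffers) can be illustrated with a short Python sketch. The function names `split_height` and `decompose` are hypothetical, and the nested-tuple output is only a way to visualize the recursion, with the top subtree first and the bottom subtrees second. It is not a memory layout.

```python
from typing import Union


def split_height(h: int) -> int:
    """Largest Fibonacci number strictly smaller than h (requires h >= 2)."""
    if h < 2:
        raise ValueError("a tree of height < 2 is not split")
    a, b = 1, 1  # consecutive Fibonacci numbers
    while b < h:
        a, b = b, a + b
    return a


def decompose(h: int) -> Union[int, tuple]:
    """Recursively split a height-h tree into (top, bottom) heights.

    The bottom (leaf) subtrees get height split_height(h); the top
    (root) subtree gets the remaining h - split_height(h) levels.
    """
    if h == 1:
        return 1
    bottom = split_height(h)
    top = h - bottom
    return (decompose(top), decompose(bottom))
```

For a tree of height 8 the cut is at height 5, leaving a root subtree of height 3. For every height of at least 3 the cut lies strictly above $h/2$, which is the property the text asks for.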

We now give the vEB layout of the shuttle tree, which means also laying out the buffers; see Figure 1. Consider a (sub)tree of height $F_{k+1}$, with leaves of this tree having buffers of heights (geometrically increasing) up to $F_{H(k+1)}$. Think of this subtree and these buffers as a single entity, which we call a recursive subtree. When laying out this recursive subtree, we split the subtree at height $F_k$. In the “left” end of memory, we store the top recursive subtree of height $F_{k-1}$ (which includes leaf buffers of height up to $F_{H(k-1)}$) recursively. To the right of this subtree, we store the height-$F_{H(k)}$ leaf buffers, from left-to-right in the same order as the leaves. To the right of these buffers, we lay out each of the bottom recursive subtrees of height $F_k$ (including leaf buffers of height up to $F_{H(k)}$) recursively. To the right of each of the bottom recursive subtrees, we lay out that subtree’s height-$F_{H(k+1)}$ leaf buffers. We call the (contiguous) height-$F_k$ recursive subtree and height-$F_{H(k+1)}$ buffers (which appear immediately after the recursive subtree in the layout) a (height-$F_k$) buffered recursive subtree.

The leaves of the shuttle tree are special in that they do not have any buffers. We call a recursive subtree containing leaves of the top level (i.e., entire) tree a leaf recursive subtree. The recursive layout of a leaf recursive subtree is the same, except that the bottom recursive subtrees do not have any buffers coming out of the leaves. For convenience, we use the terms “recursive subtree” and “buffered recursive subtree” as generalizations of “leaf recursive subtree,” even though the leaves do not have buffers.

The buffer heights are carefully chosen using the Fibonacci factors to match this recursive layout. It is always the case that when splitting a tree at the proper height, the leaf nodes of a height-$F_k$ top or bottom recursive subtree have a height-$F_{H(k+1)}$ buffer that can be stored after the recursive subtree (as indicated in Figure 1). The exception is for buffers that would have a sufficiently small constant height; these buffers are omitted altogether. (The elimination of small buffers helps the analysis in Lemma 16.)

Another way to interpret buffer sizes is that a node has a buffer for every (sufficiently large) recursive subtree in which it is a leaf. Thus, nodes that are roots of height-2 or taller recursive subtrees (i.e., those having Fibonacci factors > 1) have no buffers, because they cannot also be leaves of recursive subtrees. This notion is captured by the following lemma, which can be proved by induction.
