ing versions descending by their DFS number satisfies this,
with the advantage that ancestorship can be tested in O(1)
time: let the interval I(v) = [DFS(v), max
wv
DFS(w)],
then v w ⇐⇒ DFS(w) ∈ I(v). As the version tree
changes, we can use an efficient renumbering scheme to re-
tain integer DFS values, such as in the order maintenance
problem [4].
2.2 Definitions
Consider a set of elements A and versions V . An element
(k, v) is a lead element (at v) if v ∈ V . Define lead (A, v) as
the total number of lead elements at v in A and lead(A, V ) =
P
v∈V
lead (A, v). The lead-below count is the total lead at
versions descendent from v, i.e. lead
below(v) =
P
xv
lead (v).
An element (k, x) is said to be live (or accessible) at version
v in A if x v and k has not been rewritten between x and
v, i.e. there is no other element (k, y) ∈ A with x ≺ y v.
Let liv e(A, v) be the total number of elements of A that are
live at v. Note that if v w then live(v) ≤ live(w). Also
live(v) ≤ live(parent(v)) + lead(v), (1)
with the difference between right and left-hand sides being
equal to the number of keys k which appear in both versions
v and parent(v). We use N to denote the total number of
keys written; for a version v, we use N
v
to denote the number
of keys that are live at v, i.e. the number of distinct keys
written in ancestor versions of v (each key is live at least
once, so
P
v
N
v
≥ N).
We assume that keys and values (which could be pointers to
data or real data) are all of fixed size.
3. A CACHE-OBLIVIOUS VERSIONED B-
TREE
In this section we present a cach e-oblivious versioned B-tree,
which we refer to as a stratified doubling array ( SDA). It
contains a collection of arrays of key-version-value tuples,
arranged into levels, with ‘forward pointers’ to facilitate
searchin g. Arrays in level l are roughly twice as large as
arrays in level l −1, hence ‘doubling’, and have d isjoint sets
of versions associated to them, hence ‘stratified’ in version
space.
The basic idea is to store arrays of kv-ordered elements, as
in the COLA of Bender et al. [5], except that we apply
a version split process, similar to the one employed in the
versioned B-tree, albeit more complex, in order to avoid ar-
rays containing too few elements from some version ( we call
this a ‘density’ property). The result is that each level may
have several arrays, tagged with disjoint sets of versions that
indicate which should be used.
3.1 Arrays
An array (A, V ) contains a set A of entries (k, v, x) where
k is a key, v is a version, and x is either a data value or
a forward pointer containing an array index (the array into
which it indexes will become clear from the context later),
ordered by (k, v). The set V is a set of ‘valid versions’ that
will be used for lookups and merges between various arrays.
Each array also contains a point er to a unique ‘next array’,
identifying the array, if any, into which its forward pointers
point. Arrays implement the following operations:
• search(k,v,[lb],[ub]): search for a (k, v) pair, within
optional lower and upper bounds. It returns the index
of a least upper bound y for (k, v) in the k-v order,
and t he destinations of the two closest forward point-
ers either side of y.
• iterate(loc): provides an iterator over elements start-
ing from index loc.
• append(k,v,x): appends the entry to the end of the
array, returning its location.
3.2 Definitions
For a version v, the density of version v in A is δ(A, v) =
live(A, v)/|A|. We say that a version v is dense in A if
δ(A, v) ≥ 1/3, and that an array (A, V ) is dense if every
v ∈ V is dense in A. Note that if v is dense in (A, V ) then
every descendant version is also dense there.
Given a non- empty set of versions V , we say a version v is
an orphan of V if it has no strict ancestor in V . We say
the array (A, V ) is a stratum if the orphans of V are all
siblings – they have the same parent, not in V , which we
write without ambiguity as parent(V );
For a version v and set of versions V , let T
V
[v] = {w ∈ V :
v w} be the subtree of V rooted at v. For W ⊂ V a set of
versions and A an array, define the split of A with respect
to W to be the set of all entries live in any version in W :
λ(A, W ) = {(k, x) ∈ A : (k, x) is live at some v ∈ W }, i.e.
the set of all keys live in any version in W . For W a stratum
with orphans w
i
having common parent p, define
arr_size(A, W ) := live(A, p) + lead(A, W )
= live(A, p) +
P
i
lead
below(A, w
i
)
(2)
As in (1), |λ
T
(A, W )| ≤ arr_size(A, W ) with the difference
being those keys live in the parent version but over-written
in all orphans of W .
As a special case, when W = T
V
[v] for some version v ∈ V ,
define λ
T
(A, V, v) = λ(A, T
V
[v]), and as usual where A
and V are clear, we write T [v] and λ
T
(v) for the set of
versions and corresponding split respectively. Note that
lead (A, T
V
[v]) = lead
below(A, v).
A version split of an array (A, V ) gives a set of strata {(A
i
, V
i
)}
i
such that A = ∪
i
A
i
, and V = ∪
i
V
i
, and V
i
are mutually dis-
joint.
3.3 Levels
As previously mentioned, an SDA keeps (k, v, x) tuples in
arrays arranged into levels. Each level l ≥ 0 contains a set
of arrays (A
l
i
, V
l
i
) with disjoint sets of valid versions. We
keep in memory a map from version to t he array in which
it is valid – if such a thing exists. We also keep track of the
subset of t hose versions (which we call ‘real’) for which there
is at least one lead key in th e array where v is valid.