In this study, we propose a novel skip list and Octree-
based dynamic index. As far as we know, ours is the first
work to set up an auxiliary cloud index using a skip list
and Octree structure.
Framework of the Skip-Octree index
In cloud storage systems , a whole dataset is distributed
and stored on multiple data servers. Therefore, query
performance is mainly affected by two aspects. One is
the manner in which to locate the corresponding data
servers that stored user required data effectively. The
other is the manner in which to improve the efficiency
of data access on each local data server. In this study, a
new double-layer cloud indexing framework based on
Octree and skip list is proposed.
Background of Octree and skip list
Octree is a type of multidimensional data structure with
which a multidimensional data space is recursively divided
into eight equal subspaces (namely quadran ts) until a qua d-
rant contains only one data object. In addition, Octree is an
adopted tree-based storage structure. For an Octree, an ori-
ginal data space is represented as a root node. Then, eight
quadrants which act as eight children nodes of the root are
generated by space partition. However, under the condition
in which data is both sparsed and skewed, the query per-
formance of Octree is worse than sequence retrieve. Hence,
the compressed Octree was proposed in the study in [28].
In a compressed Octree, all empty paths are removed.
Compared with R tree [29], the space division method of
compressed Octree is simpler, and no space overlap occurs.
Therefore, compressed Octree is used to index local data in
this study. For simplicity, the compressed Octree is also
called Octree in our cloud index framework.
The skip list [30] is a randomized data structure that or-
ganizes elements with hierarchical ordered link lists. Thus,
it is an extension of the ordered list. Because query pro-
cessing on each layer can skip many elements, a skip list
can provide adequate query performance with a balanced
binary tree. In addition, because a randomized algorithm
is adopted to maintain balance rather than employing
strictly enforced balancing, the insertion and deletion op-
erations in a skip list are much simpler and considerably
faster than the balanced binary tree. Furthermore, skip list
is well suited to parallel computation applications. The in-
sertion can be performed in parallel using different posi-
tions of the ordered list without rebalancing the global
data structure. Skip list has been embedded in some popu-
lar key-value store databases such as Leveldb and Redis.
Strictly speaking, skip list is not a search tree, but
its expected time complexity is O(log
2
n), which is
similar to a binary search tree. In our Skip-Octree,
the idea of skip list is utilized to accelerate the data
retrieval efficiency of Octree.
Skip-Octree index specification
Octree is an efficient three-dimensional space partition
method. However, in a cloud environment, extensive
data can enlarge Octree to such an extent that it be-
comes inaccessible. In this section, our proposed index
structure called Skip-Octree is described. Skip-Octree
provides a hierarchical view of the compressed Octree to
allow for logarithmic expected-time querying.
Design of Skip-Octree
Based on the randomizing idea of a skip list, the original
dataset is randomly divided into subsets with a probabil-
ity of 1/2. In addition, an individual Octree is con-
structed for each data set.
In Fig. 1, Q
0
, Q
1
,andQ
2
are three datasets, where Q
0
is the original dataset, Q
1
contains approximately half
the data of Q
0
and which is a subset of Q
0
, and Q
2
is a
subset of Q
1
. The query request is processed from right
to left, that is, from the smallest Octree to the largest.
For each non-empty subspace, a pointer links it between
different layers of the Octree. For example, if a user
wants to search a keyword k, the hierarchical Octree
index performs this query request at Q
2
. Then, because
k is not found on Q
2
, this query request is redirected to
Q
1
. Finally, Q
0
receives this query request and obtains k.
Because this query procedure has similar properties to
those of a skip list, the hierarchical Octree is essentially
a skip list recon struction.
Definition of Skip-Octree The Skip-Octree is defined
by a sequence of subsets Li of the input points S with L
0
= S and builds a compressed Octree Qi for each Li. For
i >0, Li is sampled from L
i-1
by maintaining each point
with a probability of 1/2. For each Li, a compressed
Octree Qi is built for the points in Li. Therefore, Qi can
be seen as forming a sequence of levels in the skip list
such that L
0
and Ltop are the bottom and top le vels,
respectively.
As Fig. 2 illustrates, a skip list is a randomized data
structure in which level 0 is denoted as L
0
that records
all original data. In the same manner, L
1
records
approximately half of the data of L
0
and L
2
records ap-
proximately half those of L
1
. In Skip-Octree, L
0
, L
1
, and
L
2
correspond to the three hierarchal Octree Q
0
, Q
1
, and
Q
2
. The multidimensional data space is partitioned by
Octree to obtain multiple level subspaces. The skip list
is used to organize these hierarchical data points and ac-
celerate query performance. In a skip list, the same
nodes between the upper and lower layers are associated
with the pointer. Thus, with the pointer pointed to the
root node in the topmost layer, we can find the specific
keyword by having the pointer move down . In addition,
with the locality sensitive hashing function [31], the
He et al. Journal of Cloud Computing: Advances, Systems and Applications (2016) 5:10 Page 3 of 11