Future Generation Computer Systems 72 (2017) 264–272
journal homepage: www.elsevier.com/locate/fgcs
A hybrid index for temporal big data
Mei Wang a, Meng Xiao a, Sancheng Peng b,c,∗, Guohua Liu a
a School of Computer Science and Technology, Donghua University, Shanghai, 201620, PR China
b School of Informatics, Guangdong University of Foreign Studies, Guangzhou, Guangdong Province, 510420, PR China
c Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou, Guangdong Province, 510420, PR China
Highlights
• A novel segmentation-based hybrid index, SHB+-Tree, is proposed for temporal big data.
• The proposed index integrates the advantages of a temporal index and an object index.
• A segmented storage strategy is proposed.
• A bottom-up index construction approach is provided.
• Experiments are conducted to verify the effectiveness of the proposed method.
Article info
Article history:
Received 15 November 2015
Received in revised form 14 May 2016
Accepted 6 August 2016
Available online 26 August 2016
Keywords:
Big data
Temporal database
Temporal index
SHB+-Tree index
Segmented storage
Abstract
A temporal index is an important means of accelerating queries over temporal big data.
However, current temporal indexes do not support a wide variety of queries well, and it is hard to
balance the efficiency of query execution against that of index construction and maintenance. In this
paper, we propose a novel segmentation-based hybrid B+-Tree index, called SHB+-Tree, for temporal big
data. First, the temporal data stored in a temporal table is partitioned into segments according to time
order. In each segment, a hybrid index is constructed by integrating a temporal index and an object
index, which share the underlying temporal data. The performance of construction and maintenance is
improved by employing the segmented storage strategy and bottom-up index construction approaches
for every part of the hybrid index. Experimental results on a benchmark data set verify the effectiveness
and efficiency of the proposed method.
© 2016 Elsevier B.V. All rights reserved.
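The segmentation idea summarized in the abstract can be illustrated with a minimal sketch. It is not the SHB+-Tree itself: sorted Python lists and a dictionary stand in for the B+-Tree structures, the bottom-up construction is not shown, and the names `Record`, `Segment`, `build_segments`, `query_at`, `query_object` and the parameter `seg_size` are hypothetical. The sketch only shows records being split into time-ordered segments, with a temporal lookup structure and an object lookup structure in each segment pointing at the same stored records.

```python
from bisect import bisect_right
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    obj_id: int  # hypothetical object key
    start: int   # interval start (inclusive)
    end: int     # interval end (exclusive)


class Segment:
    """One time-ordered segment; both lookups share self.records."""

    def __init__(self, records):
        self.records = records                      # sorted by start time
        self.starts = [r.start for r in records]    # stand-in for the temporal index
        self.max_end = max(r.end for r in records)  # lets a query skip the segment
        self.by_obj = defaultdict(list)             # stand-in for the object index
        for pos, r in enumerate(records):
            self.by_obj[r.obj_id].append(pos)       # store positions, not copies

    def at_time(self, t):
        # Only records that start at or before t can be valid at t.
        hi = bisect_right(self.starts, t)
        return [r for r in self.records[:hi] if r.end > t]

    def history(self, obj_id):
        return [self.records[p] for p in self.by_obj.get(obj_id, [])]


def build_segments(records, seg_size):
    """Sort by start time, then cut into fixed-size segments (an assumption)."""
    ordered = sorted(records, key=lambda r: r.start)
    return [Segment(ordered[i:i + seg_size])
            for i in range(0, len(ordered), seg_size)]


def query_at(segments, t):
    """Time-point query: records whose interval contains t."""
    out = []
    for seg in segments:
        if seg.starts[0] > t:
            break                    # later segments start even later
        if seg.max_end > t:
            out.extend(seg.at_time(t))
    return out


def query_object(segments, obj_id):
    """Object query: all versions of one object across segments."""
    out = []
    for seg in segments:
        out.extend(seg.history(obj_id))
    return out


records = [Record(1, 0, 5), Record(2, 1, 3), Record(1, 5, 9), Record(3, 6, 8)]
segments = build_segments(records, seg_size=2)
print([r.obj_id for r in query_at(segments, 2)])  # [1, 2]
print(query_object(segments, 1))
```

Because each segment is built from its own slice of the data, appending new records in this sketch only requires building a new trailing segment; earlier segments are left untouched.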
1. Introduction
In an era in which data are produced and shared at an
unprecedented pace, mining the information in big data
has become increasingly crucial. Temporal information is the
natural and basic description of the development and change of
real-world objects, and almost everything has explicit or implicit
temporal features. Because traditional snapshot databases record
information only at a specific point in time, they struggle
to reflect the dynamic changes of the real world sufficiently and
accurately. The management and retrieval of temporal big data are
therefore becoming increasingly urgent for most modern database systems.
Temporal big data management has already attracted wide
attention in both academia and industry. Tang [1] proposed
the concept of bi-temporal data early on. In this work,
each tuple of the temporal table carries two time intervals,
[start_t, end_t] and [start_v, end_v], representing transaction time and
valid time (also known as system time and application time, respectively).
He also proposed taking a time interval as a key, a departure
from traditional databases, which accept only numeric
or character keys. On this basis, many temporal database
prototypes have been implemented, such as TimeDB [2] and
TempDB [3]. Under the impetus of the above research and real
applications, ISO/IEC published a new edition of the SQL standard,
SQL:2011 [4,5], in December 2011, which includes important
functionality for creating and manipulating temporal tables. In the
meantime, many popular commercial databases, such as Oracle [6],
IBM DB2 [7], and SAP HANA [8], have also added temporal features. With
the development of temporal databases, some key technologies of
traditional databases have been re-examined. As an important
way to accelerate query performance, indexing has received considerable
attention. Some index structures have been proposed to support
∗ Corresponding author.
E-mail address: psc346@aliyun.com (S. Peng).
http://dx.doi.org/10.1016/j.future.2016.08.002
0167-739X/© 2016 Elsevier B.V. All rights reserved.