S. K. Sharma, X. Wang: Live Data Analytics With Collaborative Edge and Cloud Processing in Wireless IoT Networks
A. BASIC FEATURES
The term “big data” usually refers to extremely large,
heterogeneous and complex (semi-structured and unstructured)
data-sets, which cannot be handled by the conventional data
processing and storage tools/applications such as Relational
Database Management System (RDBMS) [13]. The importance
of big data lies in how meaningful information can be
extracted from it for a particular application rather than the
size of the data, and this extraction process requires novel
data analysis methods and huge processing power. In wireless
IoT environments, big data may be generated from a variety
of application scenarios ranging from smart homes
to e-Healthcare applications. In addition to the importance
of content and control signaling data in wireless networks,
location-based data from various sensors such as GPS sen-
sors and embedded sensors in mobile devices can provide
significant inputs to government bodies in developing
specific strategies for public facilities, transportation systems,
emergency responses and crime/risk warnings. Moreover, by
analyzing the habits and interests of customers, industries
may plan their future products in order to address their cus-
tomers’ personalized as well as group needs [3].
FIGURE 1. Main attributes of big data.
As depicted in Fig. 1, the commonly discussed attributes
of big data are [13]: (i) volume, (ii) variety, (iii) veracity,
(iv) velocity, and (v) value. The first two attributes,
i.e., volume and variety, reflect the hardware and software
requirements for handling massive heterogeneous data-sets,
while veracity and velocity translate into the need for
real-time processing with sufficient trustworthiness.
On the other hand, acquisition of the highest useful value
from the complex big data-sets in wireless IoT networks
requires interdisciplinary cooperation among academia,
enterprises and wireless industries [13].
B. CHALLENGES
Big data differs from traditional data mainly in
the following ways [14]: (i) the data rate is higher and the data
volume is constantly updated, (ii) the data is semi-structured
or unstructured, (iii) the data sources are fully distributed,
(iv) data access is in batch mode or real time rather than the more
interactive access typical of traditional data, and (v) integration
of heterogeneous data from different sources becomes
complicated. The heterogeneous data generated from IoT devices
may exhibit certain statistical features and strong correlations
across several dimensions such as time and location, and
the devices may also have social relations among themselves [15].
In addition, in hierarchical IoT networks, the aggregated features
of the data traffic can be exploited to regulate
the peak content demand, for example through cluster planning based
on data distribution, peak load shifting and cache provisioning.
Besides, IoT devices with similar interests may share the
contents from their nearby devices and this content sharing
may be enabled using either infrastructure-based communication
or some infrastructure-free communication. Such
peer-to-peer resource sharing, called crowd
computing [15], can exploit the spatial correlation as well as
the mobility of IoT devices.
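As an illustration of the correlation across IoT devices discussed above, the following minimal Python sketch computes the Pearson correlation between the time series of different nodes and lists the strongly correlated pairs. The node names, the readings, the 0.9 threshold and the helper functions `pearson` and `correlated_pairs` are hypothetical and chosen only for illustration; the paper does not prescribe a particular correlation measure.

```python
from math import sqrt


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlated_pairs(readings, threshold=0.9):
    """List node pairs whose readings correlate at or above the threshold."""
    names = sorted(readings)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if pearson(readings[first], readings[second]) >= threshold:
                pairs.append((first, second))
    return pairs


# Hypothetical readings taken at the same time instants by three nodes.
readings = {
    "node_a": [20.0, 21.0, 23.0, 24.0, 22.0],
    "node_b": [20.5, 21.5, 23.5, 24.5, 22.5],  # tracks node_a with an offset
    "node_c": [24.0, 23.0, 21.0, 20.0, 22.0],  # moves opposite to node_a
}

print(correlated_pairs(readings))
```

In this toy data-set only node_a and node_b exceed the threshold, so an aggregator could, under these assumptions, treat them as candidates for joint processing or content sharing.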
Big data analytics at the cloud centre can easily integrate
the data collected using distributed sensors and aggregator
nodes while exploiting correlation among the data-sets [16].
More importantly, this approach can analyze massive
data of ever-increasing scale and complexity, and can
provide a global point of view across the whole network.
Moreover, this cloud-based approach will lead to lower-error,
higher-precision, and more dynamic treatment of data than
the conventional data analytic approaches [16]. However,
handling IoT data in the cloud platform in a traditional way
creates several issues due to the specific features of IoT data
described below [17].
1) Distributed and heterogeneous data structure: IoT
data is generated by distributed heterogeneous nodes;
it is highly diverse, may range from integers
to characters, and can be semi-structured or unstructured,
such as audio, images, and video. In addition, big
data in wireless networks is usually distributed across
several domains such as frequency, space, time, codes,
and antennas. Also, the involved data sources may have
distinct characteristics in terms of data rate, mobility,
power levels and transmission schemes.
2) Real-time requirements: The dynamic environment
of wireless IoT systems creates the need to handle a
large volume of real-time and high-speed continuous
data streams.
3) Weak data semantics: The data acquired from
IoT sensors are mainly low-level data with weak
semantics and are gathered by resource-constrained
sensors/devices/objects. In order to extract
meaningful information from the collected data, the data
must be processed effectively by exploiting
aspects such as spatio-temporal correlation and
event-driven knowledge.
4) Data inaccuracy: Since the information gathered by
the employed sensing system may not be accurate
VOLUME 5, 2017 4623