An Introduction to Sensor Data Analytics 3
The large volumes of data lead to huge challenges in terms of
storage and processing of the data. It has been estimated that since
2008, the number of internet-connected devices has exceeded the
number of people on the planet. Thus, it is clear that the amount
of machine-generated data today greatly exceeds the amount of
human-generated data, and this gap is only likely to increase in
the foreseeable future. This is widely known as the big data problem
in the context of analytical applications [10], or the information
overload problem in stream processing.
In many cases, it is critical to perform in-network processing, wherein
the data is processed within the network itself, rather than at a
centralized service. This requires the effective design of distributed
processing algorithms, wherein queries and other mining algorithms
can be processed within the network in real time [12].
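The idea of in-network processing can be illustrated with a minimal sketch, assuming a tree-structured network and a simple average query. Each node merges the partial results of its children with its own reading and forwards only a single (sum, count) pair to its parent, so raw readings never travel to a centralized service. The names SensorNode, partial_aggregate, and network_average are hypothetical and chosen for illustration only; they are not taken from [12].

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SensorNode:
    """Hypothetical sensor node in a routing tree (illustrative only)."""
    reading: float
    children: List["SensorNode"] = field(default_factory=list)

    def partial_aggregate(self) -> Tuple[float, int]:
        """Return (sum, count) for the subtree rooted at this node.

        Each node combines its children's partial results with its own
        reading, so only one small pair is sent up each link.
        """
        total, count = self.reading, 1
        for child in self.children:
            child_sum, child_count = child.partial_aggregate()
            total += child_sum
            count += child_count
        return total, count


def network_average(root: SensorNode) -> float:
    """Answer an average query from the root's merged partial aggregate."""
    total, count = root.partial_aggregate()
    return total / count


# Assumed toy topology: a root, one relay with two leaves, and one extra leaf.
leaves = [SensorNode(20.0), SensorNode(22.0)]
relay = SensorNode(21.0, children=leaves)
root = SensorNode(25.0, children=[relay, SensorNode(24.0)])
print(network_average(root))
```

In this sketch every link carries one constant-size message regardless of subtree size, which is the property that makes such aggregates attractive when communication is expensive. Queries that do not decompose into mergeable partial results (for example, an exact median) would need a different design.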
In this book, we will provide an overview of the key areas of research
in sensor processing, as they relate to these challenges. We will also
study a number of new applications of sensor data such as social sensing,
mobile data processing, RFID processing, and the internet of things.
This chapter is organized as follows. In the next section, we will
discuss the key areas of research in sensor processing, as they relate
to the aforementioned challenges. Section 3 presents the conclusions
and summary.
2. Research in Sensor Processing
The research issues in the area of sensor processing arise along all
stages of the pipeline, from data collection and cleaning through data
management to knowledge discovery and mining. Furthermore, many
research issues arise in the context of in-network processing, which are
specific to the particular application domain. The specificity to the
application domain may arise in the context of other parts of the pipeline
as well. Therefore, we summarize the key research issues which arise in
the context of sensor data processing as follows:
Data Collection and Cleaning Issues: Numerous issues arise
in the context of collection of sensor data. Sensor data is inherently
noisy and uncertain, and may have either missed readings or
redundant readings, depending upon the application domain. For
example, in the context of RFID data, almost 30% of the readings
are dropped, and multiple sensors may track the same RFID object.
In the context of battery-driven sensors, numerous errors may