interests of people connected to web communities and a lot of other information. The Big
Data collected from the web is considered an unprecedented source to fuel data processing
and business intelligence. However, collecting, storing, analyzing, and processing these
Big Data as quickly as possible creates new challenges for both scientists and analysts. For
example, analyzing Big Data from social media is now widely accepted by many compa-
nies as a way of testing the acceptance of their products and services based on customers’
opinions. Opinion mining or sentiment analysis methods have been recently proposed for
extracting positive/negative words from Big Data. However, highly accurate and timely
processing and analysis of the huge amount of data to extract their meaning requires new
processing techniques. More precisely, a technology is needed to deal with the massive
amounts of unstructured and semistructured information in order to understand hidden
user behavior. Existing solutions are time consuming given the increase in data volume
and complexity. It is possible to use high-performance computing technology to accelerate
data processing through MapReduce ported to cloud computing. This will allow
companies to deliver more business value to their end customers in the dynamic and changing
business environment. This chapter discusses approaches proposed in literature and their
use in the cloud for Big Data analysis and processing.
Chapter 6, “The Art of Scheduling for Big Data Science,” by Florin Pop and Valentin
Cristea, moves the attention to applications that generate Big Data, like social networking
and social influence programs, cloud applications, public websites, scientific experiments
and simulations, data warehouses, monitoring platforms, and e-government services. Data
grow rapidly, since applications produce continuously increasing volumes of both
unstructured and structured data. The impact on data processing, transfer, and storage is the need
to reevaluate the approaches and solutions to better answer user needs. In this context,
scheduling models and algorithms have an important role. A large variety of solutions for
specific applications and platforms exist, so a thorough and systematic analysis of
existing solutions for scheduling models, methods, and algorithms used in Big Data processing
and storage environments has high importance. This chapter presents the best of existing
solutions and creates an overview of current and near-future trends. It highlights, from
a research perspective, the performance and limitations of existing solutions and offers
an overview of the current situation in the area of scheduling and resource management
related to Big Data processing.
Chapter 7, “Time–Space Scheduling in the MapReduce Framework,” by Zhuo Tang,
Ling Qi, Lingang Jiang, Kenli Li, and Keqin Li, focuses on the significance of Big Data, that
is, analyzing people’s behavior, intentions, and preferences in the growing and popular
social networks and, in addition to this, processing data with nontraditional structures
and exploring their meanings. Big Data is often used to describe a company’s large amount
of unstructured and semistructured data. Loading these data into a relational
database for analysis would require too much time and money. Big Data analysis and
cloud computing are often linked together because real-time analysis of large data requires
a framework similar to MapReduce to assign work to hundreds or even thousands of
computers. After several years of criticism, questioning, discussion, and speculation, Big Data
finally ushered in the era belonging to it. Hadoop presents MapReduce as an analytics