ations to more advanced methods that improve efficiency, including the frequent pattern growth approach, frequent
pattern mining with vertical data format, and mining closed and max frequent itemsets. The chapter also discusses
pattern evaluation methods and introduces measures for mining correlated patterns. Chapter 7 is on advanced pattern
mining methods. It discusses methods for pattern mining in multilevel and multidimensional space, mining
rare and negative patterns, mining colossal patterns and high-dimensional data, constraint-based pattern mining,
and mining compressed or approximate patterns. It also introduces methods for pattern exploration and application,
including semantic annotation of frequent patterns.
Chapter 8 and Chapter 9 describe methods for data classification. Due to the importance and diversity of classification
methods, the contents are partitioned into two chapters. Chapter 8 introduces basic concepts and methods for
classification, including decision tree induction, Bayes classification, and rule-based classification. It also discusses
model evaluation and selection methods and methods for improving classification accuracy, including ensemble
methods and how to handle imbalanced data. Chapter 9 discusses advanced methods for classification, including
Bayesian belief networks, the neural network technique of backpropagation, support vector machines, classification
using frequent patterns, k-nearest-neighbor classifiers, case-based reasoning, genetic algorithms, rough set theory,
and fuzzy set approaches. Additional topics include multiclass classification, semi-supervised classification, active
learning, and transfer learning.
Cluster analysis forms the topic of Chapter 10 and Chapter 11. Chapter 10 introduces the basic concepts and methods
for data clustering, including an overview of basic cluster analysis methods, partitioning methods, hierarchical
methods, density-based methods, and grid-based methods. It also introduces methods for the evaluation of clustering.
Chapter 11 discusses advanced methods for clustering, including probabilistic model-based clustering, clustering
high-dimensional data, clustering graph and network data, and clustering with constraints.
Chapter 12 is dedicated to outlier detection. It introduces the basic concepts of outliers and outlier analysis and
discusses various outlier detection methods from the viewpoint of degree of supervision (i.e., supervised, semi-supervised,
and unsupervised methods), as well as from the viewpoint of approach (i.e., statistical methods, proximity-based
methods, clustering-based methods, and classification-based methods). It also discusses methods for mining contextual
and collective outliers, and for outlier detection in high-dimensional data.
Finally, in Chapter 13, we discuss trends, applications, and research frontiers in data mining. We briefly cover
mining complex data types, including mining sequence data (e.g., time series, symbolic sequences, and biological
sequences), mining graphs and networks, and mining spatial, multimedia, text, and Web data. In-depth treatment
of data mining methods for such data is left to a book on advanced topics in data mining, the writing of which
is in progress. The chapter then moves ahead to cover other data mining methodologies, including statistical data
mining, foundations of data mining, visual and audio data mining, as well as data mining applications. It discusses
data mining for financial data analysis, for industries like retail and telecommunication, for use in science and engineering,
and for intrusion detection and prevention. It also discusses the relationship between data mining and
recommender systems. Because data mining is present in many aspects of daily life, we discuss issues regarding