1.5 OUTLINE OF THE BOOK
Chapters 2–10 deal with supervised pattern recognition and Chapters 11–16 deal
with the unsupervised case. Semi-supervised learning is introduced in Chapter 10.
The goal of each chapter is to start with the basics, definitions, and approaches,
and to move progressively to more advanced issues and recent techniques. To what
extent the various topics covered in the book will be presented in a first course
on pattern recognition depends very much on the course’s focus, on the students’
background, and, of course, on the lecturer. In the following outline of the
chapters, we give our view of the topics that we cover in a first course on pattern
recognition. No doubt, other views do exist and may be better suited to different
audiences. At the
end of each chapter, a number of problems and computer exercises are provided.
Chapter 2 is focused on Bayesian classification and techniques for estimating
unknown probability density functions. In a first course on pattern recognition,
the sections related to Bayesian inference, the maximum entropy method, and the
expectation maximization (EM) algorithm are omitted. Special focus is put on
Bayesian classification, the minimum distance (Euclidean and Mahalanobis) and
nearest neighbor classifiers, and the naive Bayes classifier. Bayesian networks
are briefly introduced.
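As an indication of the level at which students can exercise these ideas, the
following MATLAB fragment is a minimal sketch of a minimum Mahalanobis distance
classifier for two classes with known means and a common covariance matrix; all
numerical values are illustrative assumptions and are not taken from the text.

% Minimal sketch: minimum Mahalanobis distance classifier for two
% classes with known mean vectors and a common covariance matrix.
% All numerical values below are illustrative only.
m1 = [0; 0];  m2 = [3; 3];        % class mean vectors
S  = [1.1 0.3; 0.3 1.9];          % common covariance matrix
x  = [1.0; 2.2];                  % feature vector to be classified
d1 = (x - m1)' / S * (x - m1);    % squared Mahalanobis distance to class 1
d2 = (x - m2)' / S * (x - m2);    % squared Mahalanobis distance to class 2
if d1 < d2
    label = 1;                    % assign x to the nearer class
else
    label = 2;
end

Setting S to the identity matrix reduces this to the minimum Euclidean distance
classifier.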
Chapter 3 deals with the design of linear classifiers. The sections dealing with the
probability estimation property of the mean square solution, as well as the
bias-variance dilemma, are only briefly mentioned in our first course. The basic philosophy
underlying the support vector machines can also be explained, although a deeper
treatment requires mathematical tools (summarized in Appendix C) with which most
students are not yet familiar in a first course. Instead, emphasis is put
on the linear separability issue, the perceptron algorithm, and the mean square
and least squares solutions. After all, these topics have a much broader scope
and applicability. Support vector machines are briefly introduced. The geometric
interpretation offers students a better understanding of the SVM theory.
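To indicate the flavor of the corresponding computer exercises, the following is
a minimal MATLAB sketch of the perceptron algorithm in its reward-and-punishment
form; the toy data set, learning rate, and epoch limit are illustrative assumptions.

% Minimal perceptron sketch on a linearly separable toy data set.
% Data, learning rate, and epoch limit are illustrative only.
X = [0 0; 0 1; 2 2; 3 2];             % one feature vector per row
y = [-1; -1; 1; 1];                   % class labels in {-1, +1}
w = zeros(2, 1); b = 0; rho = 0.5;    % weight vector, bias, learning rate
for epoch = 1:100
    errors = 0;
    for i = 1:size(X, 1)
        if y(i) * (X(i, :) * w + b) <= 0    % sample on the wrong side
            w = w + rho * y(i) * X(i, :)';  % move the hyperplane toward it
            b = b + rho * y(i);
            errors = errors + 1;
        end
    end
    if errors == 0, break, end            % converged: all samples separated
end

For linearly separable data the loop is guaranteed to terminate; otherwise it
runs until the epoch limit is reached.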
Chapter 4 deals with the design of nonlinear classifiers. The section dealing with
exact classification is bypassed in a first course. The proof of the backpropagation
algorithm is usually very boring for most students, and we bypass its details.
A description of its rationale is given, and the students experiment with it using
MATLAB. The issues related to cost functions are bypassed. Pruning is discussed
with an emphasis on generalization issues. Emphasis is also given to Cover’s theorem
and radial basis function (RBF) networks. The nonlinear support vector machines,
decision trees, and combining classifiers are only briefly touched upon, via a
discussion of the basic philosophy behind their rationale.
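Since the students experiment with backpropagation in MATLAB, the following
minimal sketch shows what a single training step boils down to for a network
with one hidden layer, sigmoid activations, and a squared-error cost; the layer
sizes, initialization, and learning rate are illustrative assumptions.

% Minimal sketch: one backpropagation step for a one-hidden-layer
% network with sigmoid activations and squared-error cost.
% Sizes, initialization, and learning rate are illustrative only.
x = [0.5; -1.2];  t = 1;                    % input vector and target output
W1 = 0.1 * randn(3, 2); b1 = zeros(3, 1);   % hidden layer (3 units)
W2 = 0.1 * randn(1, 3); b2 = 0;             % output layer (1 unit)
sig = @(a) 1 ./ (1 + exp(-a));              % logistic activation
z    = sig(W1 * x + b1);                    % forward pass: hidden outputs
yhat = sig(W2 * z + b2);                    % forward pass: network output
d2 = (yhat - t) * yhat * (1 - yhat);        % output delta: error times slope
d1 = (W2' * d2) .* z .* (1 - z);            % hidden deltas, propagated back
rho = 0.5;                                  % learning rate
W2 = W2 - rho * d2 * z';  b2 = b2 - rho * d2;   % gradient-descent updates
W1 = W1 - rho * d1 * x';  b1 = b1 - rho * d1;

In practice the step is repeated over the training set until the cost function
levels off.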
Chapter 5 deals with the feature selection stage, and we have made an effort
to present most of the well-known techniques. In a first course we put emphasis
on the t-test. This is because hypothesis testing also has a broad horizon, and at
the same time it is easy for the students to apply it in computer exercises. Then,
depending on time constraints, divergence, the Bhattacharyya distance, and scatter
matrices are presented and commented on, although their more detailed treatment