
Introduction to Statistical Pattern Recognition
operation by observing the output voltage of a microphone over a period of
time. This problem reduces to discrimination of waveforms from good and bad
machines. On the other hand, recognition of printed English characters
corresponds to classification of geometric figures. In order to perform this
type of classification, we must first measure the observable characteristics
of the sample. The most primitive but assured way to extract all information
contained in the sample is to measure the time-sampled values for a waveform,
$x(t_1), \ldots, x(t_n)$, and the grey levels of pixels for a figure,
$x(1), \ldots, x(n)$, as shown in Fig. 1-1. These $n$ measurements form a
vector $X$.
Even under the normal machine condition, the observed waveforms are different
each time the observation is made. Therefore, $x(t_i)$ is a random variable
and will be expressed, using boldface, as $\mathbf{x}(t_i)$. Likewise, $X$ is
called a random vector if its components are random variables and is expressed
as $\mathbf{X}$. Similar arguments hold for characters: the observation,
$x(i)$, varies from one A to another and therefore $\mathbf{x}(i)$ is a random
variable, and $\mathbf{X}$ is a random vector.
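
As a minimal illustration of the measurement vector described above, the
following Python sketch samples a waveform at n time points to form
X = [x(t_1), ..., x(t_n)] and repeats the observation under the same
condition. The signal model (one period of a unit sine wave plus Gaussian
noise), the function name observe_waveform, and the noise level are
assumptions made for illustration only; they are not taken from the text.

```python
import math
import random


def observe_waveform(n, rng, noise_std=0.1):
    """Return one measurement vector X = [x(t_1), ..., x(t_n)].

    Hypothetical signal model (an assumption, not from the text):
    one period of a unit sine wave plus zero-mean Gaussian noise.
    """
    return [
        math.sin(2.0 * math.pi * i / n) + rng.gauss(0.0, noise_std)
        for i in range(n)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 8

# Two observations of the same machine under the same condition.
X_first = observe_waveform(n, rng)
X_second = observe_waveform(n, rng)

# Each observation has n components, but the values differ between
# observations, which is why X is treated as a random vector.
print(len(X_first), X_first != X_second)
```

Each call yields a different realization of the same n-dimensional random
vector; a collection of such realizations is what forms a distribution.
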
Thus, each waveform or character is expressed by a vector (or a sample) in an
$n$-dimensional space, and many waveforms or characters form a distribution of
$\mathbf{X}$ in the $n$-dimensional space. Figure 1-2 shows a simple
two-dimensional example of two distributions corresponding to normal and
abnormal machine conditions, where points depict the locations of samples and
solid lines are the contour lines of the probability density functions. If we
know these two distributions of $\mathbf{X}$ from past experience, we can set
up a boundary between these two distributions, $g(x_1, x_2) = 0$, which
divides the two-dimensional space into two regions. Once the boundary is
selected, we can classify a sample without a class label as coming from a
normal or an abnormal machine, depending on whether $g(x_1, x_2) < 0$ or
$g(x_1, x_2) > 0$.
We call $g(x_1, x_2)$ a discriminant function, and a network which detects the
sign of $g(x_1, x_2)$ is called a pattern recognition network, a categorizer,
or a classifier. Figure 1-3 shows a block diagram of a classifier in a general
$n$-dimensional space.
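
The sign test described above can be sketched in a few lines of Python. The
particular discriminant used here, a straight line x1 + x2 = 3, is purely
hypothetical and chosen only to make the example concrete; the names g and
classify are likewise illustrative.

```python
def g(x1, x2):
    """Hypothetical linear discriminant function (an assumption).

    The boundary g(x1, x2) = 0 is the line x1 + x2 = 3.
    """
    return x1 + x2 - 3.0


def classify(x1, x2):
    """Label a sample from the sign of g.

    Convention follows the text: g < 0 means normal, g > 0 means abnormal.
    A sample exactly on the boundary (g == 0) is assigned to "abnormal"
    here, which is an arbitrary choice.
    """
    return "normal" if g(x1, x2) < 0 else "abnormal"


print(classify(1.0, 1.0))  # g = -1.0, so the sample falls on the normal side
print(classify(3.0, 2.0))  # g = 2.0, so the sample falls on the abnormal side
```
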
Thus, in order to design a classifier, we must study the characteristics of
the distribution of $\mathbf{X}$ for each category and find a proper
discriminant function. This process is called learning or training, and
samples used to design a classifier are called learning or training samples.
The discussion can be easily extended to multi-category cases.
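
To make the idea of training concrete, the sketch below derives a discriminant
function from labeled training samples. It uses a nearest-mean rule, which is
a deliberately simple stand-in and not a method stated in the text: it amounts
to assuming that both classes have equal priors and the same spherical
spread. The training samples and the names mean, train, and g_trained are
hypothetical.

```python
def mean(samples):
    """Component-wise sample mean of a list of equal-length vectors."""
    count = len(samples)
    dim = len(samples[0])
    return [sum(s[k] for s in samples) / count for k in range(dim)]


def train(normal_samples, abnormal_samples):
    """Return a discriminant g(x) built from labeled training samples.

    Nearest-mean rule (a simplifying assumption):
    g(x) = |x - m_normal|^2 - |x - m_abnormal|^2,
    so g(x) < 0 when x is closer to the normal-class mean.
    """
    m_normal = mean(normal_samples)
    m_abnormal = mean(abnormal_samples)

    def g(x):
        d_normal = sum((a - b) ** 2 for a, b in zip(x, m_normal))
        d_abnormal = sum((a - b) ** 2 for a, b in zip(x, m_abnormal))
        return d_normal - d_abnormal

    return g


# Hypothetical two-dimensional training samples for each class.
normal_train = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.8)]
abnormal_train = [(3.0, 3.1), (2.8, 3.3), (3.2, 2.9)]

g_trained = train(normal_train, abnormal_train)

# Unlabeled samples are then classified by the sign of g_trained.
print(g_trained((1.1, 0.9)) < 0)  # near the normal-class mean
print(g_trained((2.9, 3.0)) > 0)  # near the abnormal-class mean
```

A design based on estimated density functions would replace the two squared
distances with quantities derived from those estimates; the sign test on the
resulting discriminant stays the same.
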
Thus, pattern recognition, or decision-making in a broader sense, may be
considered as a problem of estimating density functions in a high-dimensional
space and dividing the space into the regions of categories or classes. Because