Face Recognition Using Eigenfaces

Matthew A. Turk and Alex P. Pentland
Vision and Modeling Group, The Media Laboratory
Massachusetts Institute of Technology
Abstract
We present an approach to the detection and identification of human faces and describe a working, near-real-time face recognition system which tracks a subject's head and then recognizes the person by comparing characteristics of the face to those of known individuals. Our approach treats face recognition as a two-dimensional recognition problem, taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. Face images are projected onto a feature space ("face space") that best encodes the variation among known face images. The face space is defined by the "eigenfaces", which are the eigenvectors of the set of faces; they do not necessarily correspond to isolated features such as eyes, ears, and noses. The framework provides the ability to learn to recognize new faces in an unsupervised manner.
1 Introduction
Developing a computational model of face recognition is quite difficult, because faces are complex, multidimensional, and meaningful visual stimuli. They are a natural class of objects, and stand in stark contrast to sine wave gratings, the "blocks world", and other artificial stimuli used in human and computer vision research [1]. Thus unlike most early visual functions, for which we may construct detailed models of retinal or striate activity, face recognition is a very high level task for which computational approaches can currently only suggest broad constraints on the corresponding neural activity.
We therefore focused our research towards developing a sort of early, preattentive pattern recognition capability that does not depend upon having full three-dimensional models or detailed geometry. Our aim was to develop a computational model of face recognition which is fast, reasonably simple, and accurate in constrained environments such as an office or a household.
CH2983-5/91/0000/0586/$01.00 © 1991 IEEE

Although face recognition is a high level visual problem, there is quite a bit of structure imposed on the task. We take advantage of some of this structure by proposing a scheme for recognition which is based on an information theory approach, seeking to encode the most relevant information in a group of faces which will best distinguish them from one another. The approach transforms face images into
a small set of characteristic feature images, called "eigenfaces", which are the principal components of the initial training set of face images. Recognition is performed by projecting a new image into the subspace spanned by the eigenfaces ("face space") and then classifying the face by comparing its position in face space with the positions of known individuals. Automatically learning and later recognizing new faces is practical within this framework. Recognition under reasonably varying conditions is achieved by training on a limited number of characteristic views (e.g., a "straight on" view, a 45° view, and a profile view). The approach has advantages over other face recognition schemes in its speed and simplicity, learning capacity, and relative insensitivity to small or gradual changes in the face image.
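The pipeline just described — compute the principal components of a training set, project a new image into the spanned subspace, and classify by comparing positions in face space — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy data, the nearest-neighbor distance rule, and the use of NumPy's SVD to obtain the principal components are assumptions for the sake of a self-contained example.

```python
import numpy as np

def train_eigenfaces(faces, num_components):
    """faces: (n_images, n_pixels) array of flattened, same-size face images."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are orthonormal directions of maximal variance; the leading
    # rows play the role of the "eigenfaces".
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:num_components]
    # Position of each training face in face space.
    weights = centered @ eigenfaces.T
    return mean_face, eigenfaces, weights

def recognize(image, mean_face, eigenfaces, weights, labels):
    """Project a new image into face space; return the closest known label."""
    w = (image - mean_face) @ eigenfaces.T
    distances = np.linalg.norm(weights - w, axis=1)
    return labels[int(np.argmin(distances))]

# Toy example: four random "faces" of two hypothetical subjects, 64 pixels each.
rng = np.random.default_rng(0)
faces = rng.normal(size=(4, 64))
labels = ["alice", "alice", "bob", "bob"]
mean_face, eigenfaces, weights = train_eigenfaces(faces, num_components=3)
print(recognize(faces[0], mean_face, eigenfaces, weights, labels))  # prints "alice"
```

A real system would of course train on actual face images and add a rejection threshold for images that lie too far from all known positions in face space, rather than always returning the nearest label.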
1.1 Background and related work
Much of the work in computer recognition of faces has focused on detecting individual features such as the eyes, nose, mouth, and head outline, and defining a face model by the position, size, and relationships among these features. Beginning with Bledsoe's [2] and Kanade's [3] early systems, a number of automated or semi-automated face recognition strategies have modeled and classified faces based on normalized distances and ratios among feature points. This general approach has recently been continued and improved by the work of Yuille et al. [4].
Such approaches have proven difficult to extend to multiple views, and have often been quite fragile. Research in human strategies of face recognition, moreover, has shown that individual features and their immediate relationships comprise an insufficient representation to account for the performance of adult human face identification [5]. Nonetheless, this approach to face recognition remains the most popular one in the computer vision literature.
Connectionist approaches to face identification seek to capture the configurational, or gestalt-like nature of the task. Fleming and Cottrell [6], building on earlier work by Kohonen and Lahtio [7], use nonlinear units to train a network via back propagation to classify face images. Stonham's WISARD system [8] has been applied with some success to binary face images, recognizing both identity and expression. Most connectionist systems dealing with faces treat the input image as a general 2-D pattern,