Automatic 3D Face Reconstruction based on Single 2D Image
Yepeng Guan
School of Communication and Information Engineering, Shanghai University
149 Yanchang Rd., Shanghai 200072, China
E-mail: ypguan@shu.edu.cn
Tel: +86 21 56331967; fax: +86 21 56336908
Abstract
Face recognition offers several advantages as a
biometric: it is fast, reliable, and works at a distance
from the subject, which makes it a fertile area of
pattern recognition and image processing research. In
general, changes in head pose and facial expression
make a 2D image of a face an unreliable basis for
identification, whereas a 3D representation offers
every vertex of the 3D facial model as a candidate
feature for face classification and recognition.
Established 3D face reconstruction, however, requires
multiple 2D images of a subject taken at different
angles, and such images are not always available;
proposals to construct a 3D facial model from a single
2D image of the face are hampered by severe
restrictions on that image (frontal pose, neutral
expression). This paper proposes a new approach to
3D facial model reconstruction from a single image,
based on an affine transformation and a 3D statistical
face model. The experimental results presented for
these new algorithms indicate that they are efficient
and that the method is promising.
1. Introduction
Biometric identification is one of the most
important demonstrations of machine intelligence and
is gradually entering everyday life. Among biometric
features, the face is ubiquitous and accessible, requires
no physical contact, and can be captured quickly and
reliably, so face recognition remains one of the most
active research topics in pattern recognition and image
processing. Although many different algorithms have
been proposed over the years, face recognition remains
a very challenging problem [1].
Some inductive learning successes have been
reported, but only on very restricted and contrived
data sets [2]. Real-world applications, however, may
encounter more pronounced intra-subject facial
variations owing to different head poses or facial
expressions.
The Face Recognition Vendor Test 2002 (FRVT
2002) [3] evaluated state-of-the-art algorithms against
large-scale, real-world test datasets. The results
indicate that recognition accuracy for frontal face
poses under indoor lighting conditions can reach about
90%. Face recognition across different poses and
lighting conditions, however, remains far from
satisfactory.
Most appearance-based face recognition methods
[4-6] require that several training samples be available
under different conditions for each subject. In real
applications, however, only a small number of training
images is generally available per subject, so all facial
variations cannot be captured in most cases. Moreover,
the human face is a 3D elastic surface, so the 2D
image projection of a face is very sensitive to changes
in head pose and facial expression. Utilizing 3D facial
information is a promising way to deal with these
variations, because 3D facial descriptions carry rich
information with little dependence on pose and
makeup [7]. Recently, 3D face recognition has
attracted much attention in the face recognition
community [8-10]. If the face is represented as a 3D
surface, every vertex of the 3D facial model can serve
as a facial feature for classification and recognition.
By comparing a 3D test model against the 3D source
models in the database, the 3D information can be
fully exploited to yield more robust and effective face
recognition. Accordingly, performing the 3D face
reconstruction is the key issue.
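The vertex-based comparison described above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes the test and gallery models are already rigidly aligned and in one-to-one vertex correspondence, and all function names are illustrative.

```python
import numpy as np

def vertex_distance(probe, gallery):
    """Mean per-vertex Euclidean distance between two aligned 3D face
    models, each given as an (N, 3) array of corresponding vertices."""
    probe = np.asarray(probe, dtype=float)
    gallery = np.asarray(gallery, dtype=float)
    # Per-vertex 3D distances, averaged into a single dissimilarity score.
    return np.linalg.norm(probe - gallery, axis=1).mean()

def identify(probe, gallery_models):
    """Nearest-model classification: return the subject ID whose stored
    3D model has the smallest mean vertex distance to the probe."""
    return min(gallery_models,
               key=lambda sid: vertex_distance(probe, gallery_models[sid]))
```

A probe mesh is then matched by calling `identify(probe_vertices, {"subject1": v1, "subject2": v2, ...})`; in practice an alignment step (e.g. rigid registration) would precede the distance computation.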
Traditional image based 3D reconstruction methods
use multiple images to reconstruct the 3D geometry.
However, it is not always possible to obtain such
images. Even when multiple images become available,
parts of the scene appear in only one image due to
occlusions and/or lack of features to match between
images.
With just one view, 3D recovery is normally not
possible: a single view of a generic 3D object (if
shading is neglected) clearly does not provide
sufficient 3D information. If the object belongs to a
class of similar objects, however, it is possible to infer
appropriate transformations for the class and use them
2007 International Conference on Multimedia and Ubiquitous Engineering (MUE'07)
0-7695-2777-9/07 $20.00 © 2007