Published in IET Computer Vision
Received on 16th May 2012
Revised on 28th September 2012
Accepted on 12th October 2012
doi: 10.1049/iet-cvi.2012.0094
ISSN 1751-9632
Automatic face image annotation based on a single
template with constrained warping deformation
Yang Yang, Yuehu Liu, Jianyi Liu
The School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, People’s Republic of China
E-mail: yyang@mail.xjtu.edu.cn
Abstract: In this study, an automatic face image annotation method is proposed that aligns faces with different expressions to an
annotated neutral face. This work is useful in reducing the tedious manual work of labelling image data in large databases. However,
it is challenging because of the appearance variations caused by non-rigid face deformations under various expressions. Unlike
conventional approaches that require many image templates to model the query appearance, the proposed method needs only a
single given template. The authors address the problem through dense image alignment. Specifically, image
warping in the alignment process is constrained by prior knowledge about facial shape deformation. The proposed method is
independent of the appearance model, and is applicable to unseen faces. In addition, to initialise the warping parameters, the
authors present a robust patch-based estimation method. Context information for feature points is carefully modelled to
propagate the search path for local patch matching. The face annotation experiments are performed on faces with large
expressions, on noisy images and at low image resolutions. Comparisons with conventional methods
demonstrate the proposed method's superiority in both accuracy and robustness.
1 Introduction
Automatic face image annotation is an essential topic in
computer vision with applications in many areas, such as
face recognition [1], expression analysis and motion
tracking. A group of popular methods [2–5], which are
based on the active appearance model (AAM) [6], try to
construct a generative model from a set of templates. Their
model fitting follows an analysis-by-synthesis
strategy, which achieves high accuracy by generating a
virtual appearance for each query object. However, as
shown by Gross et al. [7], building this kind of
appearance model requires a large number of templates in
order to generalise to unseen faces.
Moreover, the model fitting process is sensitive to the
initial search position.
In this paper, we propose a novel face annotation
method based on a single template. The goal is to align
query expression faces with an annotated neutral face.
Fig. 1 illustrates our annotation task.
We take the input neutral face as a template, which has
been annotated with landmarks along the facial
contours. This template is deformed to match the query
expression faces by pixel mapping, so that all the
landmarks can be transferred accordingly. This
work can be used for automatic landmark labelling. It
reduces the labour-intensive work of preparing
training examples and removes labelling inconsistencies
among different labellers. Furthermore, the proposed
method can be applied to track facial expressions in
video sequences.
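The landmark transfer described above can be illustrated with a minimal sketch. It assumes that the pixel mapping is available as a dense displacement field stored in two grids, here called `flow_x` and `flow_y` (hypothetical names), that give the horizontal and vertical offset of every template pixel. This representation and the helper functions are illustrative assumptions, not the warp parameterisation used by the proposed method.

```python
from math import floor


def sample_bilinear(grid, x, y):
    """Bilinearly interpolate grid[row][col] (at least 2 x 2) at the point (x, y)."""
    height, width = len(grid), len(grid[0])
    # Clamp the query point to the grid so landmarks near the border stay valid.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    # Top-left corner of the enclosing cell (kept one short of the last index).
    x0 = min(int(floor(x)), width - 2)
    y0 = min(int(floor(y)), height - 2)
    ax, ay = x - x0, y - y0
    top = (1.0 - ax) * grid[y0][x0] + ax * grid[y0][x0 + 1]
    bottom = (1.0 - ax) * grid[y0 + 1][x0] + ax * grid[y0 + 1][x0 + 1]
    return (1.0 - ay) * top + ay * bottom


def transfer_landmarks(landmarks, flow_x, flow_y):
    """Move each template landmark (x, y) by the displacement sampled at it."""
    return [
        (x + sample_bilinear(flow_x, x, y), y + sample_bilinear(flow_y, x, y))
        for x, y in landmarks
    ]
```

Because the landmarks usually lie at sub-pixel positions, the displacement is interpolated rather than read from the nearest pixel; any other interpolation scheme could be substituted.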
Face alignment for unseen targets suffers from appearance
variations across various expressions. The importance of
shape constraints has been noted in recent alignment
methods [8, 9]. Cao et al. [8] utilised an explicit shape
regression method to infer the facial shape from coarse to
fine. Liang et al. [9] proposed a shape-constrained Markov
network for accurate face alignment. Unlike these
methods, we address the above problem by constraining the
shape deformation parameters.
The proposed method is independent of the appearance
model, but a set of facial shape samples is needed to
acquire prior knowledge about shape deformation.
Motivated by successful work on building facial
relative-motion models for expression synthesis [10] and
facial animation [11], we incorporate a relative motion
prior (RMP) into our annotation algorithm. RMP is a
statistical characteristic of facial shape deformation, such as
the change from a neutral expression to various expressions.
With the RMP constraint, model fitting is restricted to
search within plausible ranges of face motion. At the same time,
the fitting is more robust against the negative influence
caused by appearance variations. Taking the computational
cost into account, we solve our model fitting in the
inverse compositional (IC) framework [12]. The most
time-consuming terms are pre-computed offline,
while the deformation parameters are updated efficiently
in the online process.
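The split between offline and online computation in the IC framework can be illustrated with a deliberately reduced sketch: a one-dimensional signal and a translation-only warp, W(x; p) = x + p. The function names, the 1-D setting and the absence of any RMP constraint are assumptions made for illustration; this is not the warp model or the fitting procedure of the proposed method.

```python
def sample_linear(signal, x):
    """Linearly interpolate a 1-D signal at real-valued position x (edges clamped)."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    a = x - i
    return (1.0 - a) * signal[i] + a * signal[i + 1]


def precompute_template_terms(template):
    """Offline step: template gradient and the (scalar) Gauss-Newton Hessian."""
    n = len(template)
    grad = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        # Central difference inside the signal, one-sided at the two ends.
        grad.append((template[hi] - template[lo]) / (hi - lo))
    hessian = sum(g * g for g in grad)
    return grad, hessian


def align_inverse_compositional(template, image, p=0.0, iterations=30, tol=1e-8):
    """Estimate the shift p such that image(x + p) matches template(x)."""
    grad, hessian = precompute_template_terms(template)  # computed only once
    for _ in range(iterations):
        # Online step: only the residual depends on the current estimate of p.
        residual = [sample_linear(image, i + p) - t for i, t in enumerate(template)]
        dp = sum(g * r for g, r in zip(grad, residual)) / hessian
        p -= dp  # compose the current warp with the inverse of the increment
        if abs(dp) < tol:
            break
    return p
```

Since the gradient and the Hessian are evaluated on the template rather than on the warped query image, they do not change between iterations; each iteration only resamples the query signal and computes one weighted sum. The sketch assumes a non-constant template (so the Hessian is non-zero) and a starting value of p close enough to the true shift.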
Even with our method, a bad initial position may leave the
model stuck in a local minimum. We therefore propose
a robust initialisation method for estimating the deformation
parameters. A few matching features are detected according
IET Comput. Vis., 2013, Vol. 7, Iss. 1, pp. 20–28
© The Institution of Engineering and Technology 2013