GAO et al.: CAST2FACE: ASSIGNING CHARACTER NAMES ONTO FACES IN MOVIE 2301
Fig. 1. Cast2Face framework diagram with four components: 1) gallery face set collection; 2) face tracks extraction and description; 3) KMTJSRC
classification; and 4) CRF-based sequence labeling.
Therefore, besides considering the cast information and
using the robust KMTJSRC recognition algorithm, another
contribution of this paper is adopting the CRFs to model
the constraints among face tracks. In general, the higher a
character’s name ranks in the cast list, the more frequently the
character appears in the movie. Therefore, a prior probability
is assigned to each character in advance. Then, the final face
labeling is obtained with the CRF model based on both the
prior probabilities and the robust recognition in the individual
face track with KMTJSRC.
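As an illustration of a rank-based prior, the following minimal sketch assigns each character a probability that decreases with its position in the cast list. The 1/rank weighting, the `decay` parameter, and the function name `rank_prior` are assumptions made for this example only; this section does not give the exact form of the prior used in Cast2Face.

```python
def rank_prior(cast, decay=1.0):
    """Return a normalized prior that decreases with cast-list rank.

    Assumption: weight = 1 / rank**decay (a Zipf-like choice made for
    illustration, not the formula from the paper).
    """
    weights = [1.0 / (rank ** decay) for rank in range(1, len(cast) + 1)]
    total = sum(weights)
    return {name: w / total for name, w in zip(cast, weights)}


# Hypothetical cast list, ordered as it would appear in the credits.
cast = ["Lead", "Support", "Minor"]
prior = rank_prior(cast)
```

A larger `decay` concentrates more probability on the top-ranked names, while `decay=0` gives a uniform prior.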
B. Outline of Our Approach
The proposed Cast2Face method is a novel framework for
labeling the faces of the characters in a movie using its cast list.
Our method comprises four components, as shown in Fig. 1.
1) Gallery face set collection with cast analysis and
Web image search. Most of the previous methods use
supervised or semisupervised training data, which are
manually labeled or prepared to train the learning model.
In contrast, by using the textual source of the cast, our
approach collects the gallery face data from Google
image search accurately and automatically. The collected
gallery set contains sufficient face features and
can be obtained efficiently.
2) Probe face track extraction and description.
State-of-the-art face detection and tracking algorithms
are used to generate the face tracks, which yields
sufficient probe faces efficiently. Each face track is
then represented robustly with a feature description
based on scale-invariant feature transform (SIFT)
descriptors.
3) Face tracks identification using a robust KMTJSRC.
We address the computation of joint SR of visual signals
across multiple kernel-based representations, using the
form of kernel matrices to represent each probe face
track with the gallery set. Then, the recognition is
finished by choosing the character name with the biggest
ℓ1 distance in the weight parameters.
4) CRF model-based track sequence labeling considering
constraints among face tracks. Unlike in real-time
face recognition, faces in movies appear with widely
varying angles, resolutions, and expressions; thus, face
recognition performed directly on these faces often
yields unsatisfactory accuracy. However, the face tracks,
viewed as a time sequence, are subject to many
constraints. An ordinary classifier predicts a label for
a single sample without regard to neighboring samples,
so we adopt the CRF model, which takes context
information into account. By applying the CRF model to
face track sequence labeling and minimizing the energy
function, we obtain more robust labeling performance
relative to the initial KMTJSRC recognition.
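The energy minimization in step 4 can be sketched on a simple chain of face tracks. The example below is a minimal illustration under stated assumptions, not the CRF formulation of the paper: each track has a unary cost per character (for instance, a negative log of a classifier score combined with the cast prior), and a constant `switch_cost` penalizes a label change between neighboring tracks. The function name `label_tracks`, the constant penalty, and the cost values are all hypothetical.

```python
def label_tracks(unary, labels, switch_cost=1.0):
    """Exactly minimize a chain energy by dynamic programming (Viterbi).

    Energy = sum_t unary[t][y_t] + switch_cost * [y_t != y_(t-1)].
    Assumption: a constant penalty for label changes stands in for the
    pairwise constraints, which this section does not specify.
    """
    best = {y: unary[0][y] for y in labels}  # best energy ending in label y
    back = []                                # back-pointers per step
    for costs in unary[1:]:
        new_best, pointers = {}, {}
        for y in labels:
            prev = min(
                labels,
                key=lambda p: best[p] + (0.0 if p == y else switch_cost),
            )
            step = 0.0 if prev == y else switch_cost
            new_best[y] = best[prev] + step + costs[y]
            pointers[y] = prev
        best = new_best
        back.append(pointers)
    last = min(labels, key=lambda y: best[y])
    energy = best[last]
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, energy


# Hypothetical unary costs for three consecutive face tracks.
labels = ["A", "B"]
unary = [
    {"A": 0.1, "B": 2.0},
    {"A": 1.0, "B": 0.8},  # a per-track classifier would pick "B" here
    {"A": 0.2, "B": 1.5},
]
path, energy = label_tracks(unary, labels, switch_cost=1.0)
independent = [min(labels, key=lambda y: costs[y]) for costs in unary]
```

In this toy input, labeling each track independently gives A, B, A, whereas the chain model keeps A for all three tracks because switching twice costs more than accepting the slightly worse unary cost on the middle track.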
Compared with previous name-to-face studies,
the main contributions of this paper include the following.
1) To the best of our knowledge, Cast2Face, proposed in
this paper and in its conference version [21], is the
first work to combine character identification with
cast analysis and Web image retrieval.
2) A robust multitask joint SR method and the KMTJSRC
are developed to classify each face track without training
on a possibly contaminated gallery set.
3) The prior probability is introduced based on the charac-
ter name order in the cast list, and a CRF model is
used to relabel the whole face track more efficiently
and effectively with the consideration of the neighboring
constraints.
4) We design a novel application of our method that
automatically generates a spotlight summarization of
a particular actor across many of his/her movies.
More visual details can be seen in Fig. 2, which shows the
working mechanism of our proposed Cast2Face method.
II. CAST2FACE: ASSIGNING CHARACTER NAMES ONTO FACES
A. Cast-Based Web Image Search and Gallery Generation
The gallery data set and the real name for the final labeling
are very important for the character identification. We can
employ readily available textual annotation for TV and movie