Cast2Face, a Character-Naming Algorithm: Matching Faces in Movies to Actors and Their Characters

Research paper


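This document discusses a study named "Cast2Face", which addresses automatic character identification in movies: given the actor-character correspondence, it assigns the correct character name to each face image. The work was published in IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 12, p. 2299, December 2016. Automatic identification of movie characters has attracted many researchers and enabled a range of useful applications, but it remains challenging because character appearance varies widely and the available annotations are scarce and ambiguous. The Cast2Face framework was proposed against this background. Its core features are:

1. **Naming constraint**: The names assigned to faces are restricted to the character names in the movie's cast list, which keeps the naming accurate.
2. **Search by actor and movie name**: For each character, the corresponding actor name and the movie title serve as keywords to retrieve a set of face images from Google image search. These images form the candidate face set, called the "gallery".
3. **Face matching and recognition**: Using a robust kernel method, the system identifies face tracks in the movie and associates each one with an actor in the gallery, which determines the character identity. The approach relies on external information (actor and movie data) to support character identification.
4. **Supervision from the cast**: The known actor-character correspondence supplies the labels, so the framework does not depend on manually labeled training data.

Cast2Face combines content understanding with contextual information: it uses the actor-character correspondence and image retrieval to provide a targeted solution to movie character identification. The study is relevant to movie analysis, entertainment applications, and computer vision, and it shows how external data can strengthen face recognition when explicit annotations are lacking.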
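
As a rough illustration of the search step described above, the sketch below builds one query string per character from a cast list. The `CastEntry` type, the `build_gallery_queries` helper, and the example names are hypothetical. The source only states that the actor name and the movie title are used as keywords; how the images are then retrieved and filtered is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CastEntry:
    """One row of a cast list (hypothetical type, for illustration only)."""
    actor: str
    character: str


def build_gallery_queries(movie_title: str, cast: list[CastEntry]) -> dict[str, str]:
    """Map each character name to a query made of actor name + movie title."""
    return {entry.character: f"{entry.actor} {movie_title}" for entry in cast}


# Placeholder names, not taken from the paper.
cast = [CastEntry("Actor A", "Character X"), CastEntry("Actor B", "Character Y")]
queries = build_gallery_queries("Some Movie", cast)
```

Keying the result by character name reflects the naming constraint: every face retrieved for a query can only ever be labeled with a name from the cast list.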


GAO et al.: CAST2FACE: ASSIGNING CHARACTER NAMES ONTO FACES IN MOVIE 2301

Fig. 1. Cast2Face framework diagram with four components: 1) gallery face set collection; 2) face tracks extraction and description; 3) KMTJSRC classification; and 4) CRF-based sequence labeling.

Therefore, besides considering the cast information and using the robust KMTJSRC recognition algorithm, another contribution of this paper is adopting CRFs to model the constraints among face tracks. In general, the higher a character's name ranks in the cast list, the more frequently the character appears in the movie. Therefore, a prior probability is assigned to each character in advance. Then, the final face labeling is obtained with the CRF model based on both the prior probabilities and the robust recognition in the individual face track with KMTJSRC.
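
This excerpt does not give the formula for the prior, so the sketch below picks one purely for illustration: a weight proportional to 1/rank, normalized to sum to 1. The function name `rank_prior` and the 1/rank form are assumptions, not the paper's definition.

```python
def rank_prior(cast_order: list[str]) -> dict[str, float]:
    """Assign each character a prior that decreases with cast-list rank.

    Assumption: weight = 1 / rank (rank starts at 1), normalized to sum to 1.
    The paper may define the prior differently.
    """
    weights = [1.0 / rank for rank in range(1, len(cast_order) + 1)]
    total = sum(weights)
    return {name: w / total for name, w in zip(cast_order, weights)}


# Placeholder character names in cast-list order.
prior = rank_prior(["Lead", "Support", "Extra"])
```

Any monotonically decreasing weighting would express the same idea; the only property the text requires is that a higher-ranked character receives a larger prior.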

B. Outline of Our Approach

The Cast2Face method we propose is a novel framework for labeling the faces of the characters in a movie with cast. Our method comprises four components, as shown in Fig. 1.

1) Gallery face set collection with cast analysis and Web image search. Most of the previous methods use supervised or semisupervised training data, which are manually labeled or prepared to train the learning model. Unlike them, by using the textual source of the cast, our approach collects the gallery face data from the Google image search accurately and automatically. The collected gallery set not only contains sufficient face features, but also can be obtained efficiently.

2) Probe face track extraction and description, using state-of-the-art face detection and tracking algorithms to generate the face tracks. This step helps to obtain sufficient probe faces efficiently. After that, a robust face feature description method, which uses the scale-invariant feature transform (SIFT) descriptors, is introduced to represent each face track more robustly.

3) Face track identification using a robust KMTJSRC. We address the computation of the joint sparse representation (SR) of visual signals across multiple kernel-based representations, using kernel matrices to represent each probe face track with the gallery set. Then, the recognition is finished by choosing the character name with the largest distance in the weight parameters.

4) CRF model-based track sequence labeling considering constraints among face tracks. Unlike the real-time face recognition, faces in movies appear at various angles, resolutions, and expressions; thus, face recognition performed directly on these faces often has unsatisfactory accuracy. However, these face tracks, considered as a time sequence, are subject to many constraints. Therefore, we adopt the CRF model, which takes context information into account, whereas an ordinary classifier predicts a label for a single sample without regard to neighboring samples. By applying the CRF model to face track sequence labeling and minimizing the energy function, we get more robust labeling performance, starting from the initial recognition of KMTJSRC.
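
To make the sequence-labeling idea in step 4 concrete, the sketch below minimizes a simple chain energy over consecutive face tracks with dynamic programming (Viterbi). It is a simplification under stated assumptions, not the paper's model: the unary cost of a label is the negative log of (classifier score × prior), and the pairwise cost is a constant penalty `switch_cost` whenever two neighboring tracks receive different names. The paper's actual CRF potentials and inference procedure are not given in this excerpt, and every name and number here is hypothetical.

```python
import math


def label_tracks(scores, prior, switch_cost=1.0):
    """Return the label sequence minimizing unary + pairwise energy on a chain.

    scores: one dict per face track, mapping character name -> classifier
            score in (0, 1] (a stand-in for the per-track recognition result).
    prior:  dict mapping character name -> prior probability (> 0).
    switch_cost: assumed penalty when adjacent tracks take different labels.
    """
    if not scores:
        return []
    labels = list(prior)

    def unary(t, name):
        # Lower energy for labels that the classifier and the prior both favor.
        return -math.log(scores[t][name] * prior[name])

    # best[name] = minimal energy of any labeling of tracks 0..t ending in name
    best = {name: unary(0, name) for name in labels}
    back = []  # back[t - 1][name] = best predecessor label for track t
    for t in range(1, len(scores)):
        new_best, pointers = {}, {}
        for name in labels:
            prev = min(labels, key=lambda p: best[p] + (p != name) * switch_cost)
            new_best[name] = best[prev] + (prev != name) * switch_cost + unary(t, name)
            pointers[name] = prev
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final label.
    path = [min(labels, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy input: the middle track is weakly misclassified when judged in isolation.
# A uniform prior is used so that the effect of the pairwise term is visible.
prior = {"A": 0.5, "B": 0.5}
scores = [{"A": 0.9, "B": 0.1}, {"A": 0.45, "B": 0.55}, {"A": 0.9, "B": 0.1}]
smoothed = label_tracks(scores, prior, switch_cost=1.0)     # ['A', 'A', 'A']
independent = label_tracks(scores, prior, switch_cost=0.0)  # ['A', 'B', 'A']
```

With `switch_cost=0.0` each track is labeled on its own and the weak middle track flips to "B"; with a positive penalty the neighboring tracks pull it back to "A", which is the kind of context effect a per-sample classifier cannot provide.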

Compared with previous name-to-face studies, the main contributions of this paper include the following.

1) To the best of our knowledge, Cast2Face proposed in this paper as well as its conference version [21] is the first work combining the character identification with the cast analysis and Web image retrieval.

2) A robust multitask joint SR method and the KMTJSRC are developed to classify each face track without training on a possibly contaminated gallery set.

3) The prior probability is introduced based on the character name order in the cast list, and a CRF model is used to relabel the whole face track more efficiently and effectively with the consideration of the neighboring constraints.

4) We design a novel application of our method to automatically generate the spotlights summarization of a particular actor in many of his/her movies.

More visual details can be seen in Fig. 2, which shows the working mechanism of our proposed Cast2Face method.

II. CAST2FACE: ASSIGNING CHARACTER NAMES ONTO FACES

A. Cast-Based Web Image Search and Gallery Generation

The gallery data set and the real name for the final labeling are very important for the character identification. We can employ readily available textual annotation for TV and movie
