Comp-GAN: Compositional Generative Adversarial Network in Synthesizing and Recognizing Facial Expression

Wenxuan Wang^1, Qiang Sun^1, Yanwei Fu^1#, Tao Chen^1, Chenjie Cao^2, Ziqi Zheng^2, Guoqiang Xu^2, Han Qiu^2, Yu-Gang Jiang^1#, Xiangyang Xue^1
^1 Fudan University, ^2 Ping An OneConnect
ABSTRACT
Facial expressions are important in understanding our social interactions, so the ability to recognize them enables novel multimedia applications. With the advance of recent deep architectures, research on facial expression recognition has achieved great progress. However, these models still suffer from a lack of sufficient and diverse high-quality training faces, vulnerability to facial variations, and the restriction to recognizing only a limited number of basic emotion types. To tackle these problems, this paper proposes a novel end-to-end Compositional Generative Adversarial Network (Comp-GAN) that is able to synthesize new face images with specified poses and desired facial expressions; such synthesized images can be further utilized to help train a robust and generalized expression recognition model. Essentially, Comp-GAN can dynamically change the expression and pose of faces according to the input images while keeping the identity information. Specifically, the generator has two major components: one for generating images with the desired expression and the other for changing the pose of faces. Furthermore, a face reconstruction learning process is applied to re-generate the input image and constrain the generator to preserve key information such as facial identity. For the first time, various one/zero-shot facial expression recognition tasks have been created. We conduct extensive experiments to show that the images generated by Comp-GAN are helpful in improving the performance of one/zero-shot facial expression recognition.
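As a rough Python/PyTorch illustration of this compositional design, the generator can be sketched as two chained conditional editing sub-networks plus a reconstruction constraint. This is a minimal sketch, not the authors' implementation: the module names, layer sizes, and conditioning scheme below are all assumptions.

    import torch
    import torch.nn as nn

    class EditBlock(nn.Module):
        """Minimal conditional encoder-decoder; the condition vector is
        broadcast and concatenated as extra input channels (assumed)."""
        def __init__(self, img_channels, cond_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(img_channels + cond_dim, 64, 4, 2, 1), nn.ReLU(),
                nn.ConvTranspose2d(64, img_channels, 4, 2, 1), nn.Tanh(),
            )

        def forward(self, x, cond):
            b, _, h, w = x.shape
            cond_map = cond.view(b, -1, 1, 1).expand(b, cond.size(1), h, w)
            return self.net(torch.cat([x, cond_map], dim=1))

    class CompGenerator(nn.Module):
        """Compositional generator: an expression-editing sub-network
        followed by a pose-editing sub-network, as the abstract describes."""
        def __init__(self, img_channels=3, expr_dim=7, pose_dim=5):
            super().__init__()
            self.expr_editor = EditBlock(img_channels, expr_dim)
            self.pose_editor = EditBlock(img_channels, pose_dim)

        def forward(self, x, expr_code, pose_code):
            x = self.expr_editor(x, expr_code)      # edit expression first
            return self.pose_editor(x, pose_code)   # then edit pose

    # Reconstruction constraint (sketch): map the edited image back to the
    # original expression/pose codes and penalize the L1 distance to the
    # input, which encourages the generator to preserve facial identity.
    def reconstruction_loss(G, x, expr0, pose0, expr1, pose1):
        edited = G(x, expr1, pose1)
        recon = G(edited, expr0, pose0)
        return (recon - x).abs().mean()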
# indicates corresponding authors.
Wenxuan Wang, Yu-Gang Jiang, and Xiangyang Xue are with the Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University; Qiang Sun is with the Academy for Engineering & Technology, Fudan University; Tao Chen is with the School of Information Science and Technology, Fudan University; and Yanwei Fu is with the School of Data Science, and Fudan-Xinzailing Joint Research Centre for Big Data, Fudan University. {wxwang17, 18110860051, yanweifu, eetchen, xyxue, ygj}@fudan.edu.cn.
Chenjie Cao, Ziqi Zheng, Guoqiang Xu, and Han Qiu are with Ping An OneConnect. {caochenjie948, zhengziqi356, xuguoqiang371, hannaqiu}@pingan.com.
This work was supported in part by NSFC (No. 61572138 & No. U1611461), and STCSM Project (19ZR1471800).
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
MM ’19, October 21–25, 2019, Nice, France
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6889-6/19/10... $15.00
https://doi.org/10.1145/3343031.3351032
CCS CONCEPTS
• Computing methodologies → Activity recognition and un-
derstanding; Image representations; Visual inspection.
KEYWORDS
Facial Expression, Generative Adversarial Network
ACM Reference Format:
Wenxuan Wang, Qiang Sun, Yanwei Fu, Tao Chen, Chenjie Cao, Ziqi Zheng, Guoqiang Xu, Han Qiu, Yu-Gang Jiang, and Xiangyang Xue. 2019. Comp-GAN: Compositional Generative Adversarial Network in Synthesizing and Recognizing Facial Expression. In Proceedings of the 27th ACM International Conference on Multimedia (MM '19), October 21–25, 2019, Nice, France. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3343031.3351032
1 INTRODUCTION
Facial expression, as one important facial attribute, plays a key role in communication [20, 37] and can reflect the emotional state and intention of humans. Building a system capable of automatically recognizing facial expressions from media data has been an important research field over the past few years. Such a system can enable various multimedia applications in real-world scenarios, such as medical testing, education, security, driver fatigue surveillance, and many other human-computer interactions.
With the advance of recent deep architectures, several pilot studies [12, 24, 26] have investigated the possibility of learning representative deep facial emotion features from data. Consequently, and as expected, their results show that deep-feature-based expression recognition (ER) systems indeed outperform the traditional hand-crafted-feature-based ER models [15, 23, 27]. Despite the encouraging advances in these ER works, there are still several key challenges in extending facial ER systems to real-world applications.
(1) Lack of sufficient and diverse high-quality training data. The annotation task for facial expression generally requires devoted contributions from experts, and the labeling procedure is much more difficult and time-consuming than labeling image classes [2]. This poses a severe problem for training deep ER models in general. To bypass the problem of insufficient training data, the typical solution is to first pre-train a model on a large-scale auxiliary dataset for a different recognition task (e.g., ImageNet [13], CASIA WebFace [41]), and then fine-tune the model on the ER task of the target dataset. The performance of this approach is very sensitive to the relation between the recognition task of the auxiliary dataset and the ER task of the target dataset.
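A minimal sketch of this pre-train-then-fine-tune recipe is shown below. The torchvision ResNet-18 backbone with ImageNet weights, the 7-class expression label set, and the freeze-the-backbone policy are all illustrative assumptions, not choices made by the paper.

    import torch.nn as nn
    from torchvision import models

    # Start from a backbone pre-trained on a large auxiliary dataset
    # (here ImageNet, via the weights shipped with torchvision).
    backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

    # Freeze the pre-trained layers so fine-tuning only adapts the new head;
    # unfreezing upper blocks is a common variant when more ER data exists.
    for p in backbone.parameters():
        p.requires_grad = False

    # Replace the 1000-way ImageNet classifier with an ER head
    # (e.g., 7 basic expressions) and train it on the target ER dataset.
    num_expressions = 7
    backbone.fc = nn.Linear(backbone.fc.in_features, num_expressions)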
On the other hand, data augmentation is widely utilized to enlarge the training dataset. In ER tasks, several GAN-based methods have been applied to synthesize faces with different expressions [40], poses [16], and identities [1], respectively. However, they do not properly preserve the identity and expression