Deep Learning and Reliable Crowdsourcing: A New Method for Expression Recognition in the Wild


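"Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild"

Facial expression recognition is an important research task in computer vision, particularly for human-computer interaction, emotion analysis, and social psychology applications. Past research on facial expressions, however, has usually relied on relatively limited datasets, which leaves it unclear how well current methods work in real-world settings. The paper "Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild", written by Shan Li, Weihong Deng, and JunPing Du of Beijing University of Posts and Telecommunications, presents a new database, RAF-DB, and a new deep learning method, DLP-CNN (Deep Locality-Preserving Convolutional Neural Network).

RAF-DB contains about 30,000 facial images from thousands of individuals, and each image was independently labeled about 40 times. The authors use an EM (expectation-maximization) algorithm to filter out unreliable labels. The crowdsourced annotations show that real-world faces often express compound emotions, or even mixed ones, a property that lab-controlled datasets lack. According to the authors, RAF-DB is the first database that contains compound expressions in the wild.

The authors' cross-database study shows that the action units of basic emotions in RAF-DB are much more diverse than, and may even deviate from, those in lab-controlled data. To address this, they propose DLP-CNN, which aims to enhance the discriminative power of deep features. Convolutional neural networks (CNNs) already perform strongly on image recognition, but a softmax loss alone only pushes different classes apart and does not keep the features of one class compact, which matters when the same emotion is expressed in many different ways.

DLP-CNN therefore adds a locality-preserving loss to the CNN. This loss pulls each face toward its locally neighboring faces of the same class in deep feature space, while the jointly trained softmax loss keeps different classes apart. The benchmark experiments reported in the abstract, on 7-class basic expressions and 11-class compound expressions as well as on the SFEW and CK+ databases, show DLP-CNN outperforming handcrafted features and other deep learning based methods.

Overall, the paper provides a more realistic and diverse dataset for expression recognition research, together with a deep learning method adapted to such data. This improves the understanding of facial expressions in natural environments and offers a useful reference for the design of future emotion analysis systems.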


Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild

Shan Li, Weihong Deng, and JunPing Du

Beijing University of Posts and Telecommunications

{ls1995, whDeng, junpingd}@bupt.edu.cn

Abstract

Past research on facial expressions has used relatively limited datasets, which makes it unclear whether current methods can be employed in the real world. In this paper, we present a novel database, RAF-DB, which contains about 30000 facial images from thousands of individuals. Each image has been individually labeled about 40 times, and an EM algorithm was then used to filter out unreliable labels. Crowdsourcing reveals that real-world faces often express compound emotions, or even mixed ones. To the best of our knowledge, RAF-DB is the first database that contains compound expressions in the wild. Our cross-database study shows that the action units of basic emotions in RAF-DB are much more diverse than, or even deviate from, those of lab-controlled ones. To address this problem, we propose a new DLP-CNN (Deep Locality-Preserving CNN) method, which aims to enhance the discriminative power of deep features by preserving the locality closeness while maximizing the inter-class scatters. The benchmark experiments on the 7-class basic expressions and 11-class compound expressions, as well as the additional experiments on SFEW and CK+ databases, show that the proposed DLP-CNN outperforms the state-of-the-art handcrafted features and deep learning based methods for expression recognition in the wild.

1. Introduction

Millions of images are uploaded every day by users from different events and social gatherings. There is an increasing interest in designing systems capable of understanding human manifestations of emotional attributes and affective displays. To automatically learn the affective state of face images from the Internet, large annotated databases are required. However, the complexity of annotating emotion categories has hindered the collection of large annotated databases. On the other hand, popular AU coding [12] requires specific expertise that takes months to learn and perfect; hence, alternative solutions are needed. Moreover, due to cultural differences in the way facial emotion is perceived [13], it is difficult for psychologists to define definite prototypical AUs for each facial expression. Therefore, it is also worth studying the emotion of social images from the judgments of a large common population, in addition to the professional knowledge of a few experts.

In this paper, we propose to study common expression perception through a reliable crowdsourcing approach. Specifically, our well-trained annotators are asked to label face images with one of the seven basic categories [11], and each face is annotated independently a sufficient number of times, i.e., about 40 times in our experiment. Then, the noisy labels are filtered by an EM-based reliability evaluation algorithm, through which each image can be represented reliably by a 7-dimensional emotion probability vector. By analyzing 1.2 million labels of 29672 highly diverse facial images downloaded from the Internet, these Real-world Affective Faces (RAF) are naturally categorized into two types: basic expressions with a single-modal distribution and compound emotions with a bimodal distribution, an observation supporting a recent ground-breaking finding in the lab-controlled condition [10]. To the best of our knowledge, the real-world expression database RAF-DB is the first large-scale database providing labels of common expression perception and compound emotions in an unconstrained environment.
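As an illustration of how EM can turn noisy crowd labels into a per-image probability vector, the following is a minimal sketch and not the authors' algorithm, whose details are not given in this excerpt. It assumes a simplified "one-coin" annotator model in which each annotator labels correctly with an unknown accuracy and otherwise picks a wrong class uniformly at random. The function name `em_label_posteriors` and the input format are hypothetical.

```python
import math


def em_label_posteriors(labels, num_classes=7, num_iters=20):
    """Estimate per-image class posteriors and per-annotator accuracy.

    labels: list of (image_id, annotator_id, class_index) triples.
    Assumption (not from the paper): a one-coin annotator model.
    """
    images = sorted({i for i, _, _ in labels})
    annotators = sorted({a for _, a, _ in labels})

    # Initialise each image's posterior with its vote fractions.
    post = {i: [0.0] * num_classes for i in images}
    for i, _, k in labels:
        post[i][k] += 1.0
    for i in images:
        total = sum(post[i])
        post[i] = [v / total for v in post[i]]

    acc = {}
    for _ in range(num_iters):
        # M-step: accuracy = expected fraction of an annotator's labels
        # that agree with the current posteriors (clamped to avoid log 0).
        hit = {a: 0.0 for a in annotators}
        cnt = {a: 0 for a in annotators}
        for i, a, k in labels:
            hit[a] += post[i][k]
            cnt[a] += 1
        acc = {a: min(max(hit[a] / cnt[a], 1e-3), 1 - 1e-3) for a in annotators}
        prior = [
            max(sum(post[i][k] for i in images) / len(images), 1e-6)
            for k in range(num_classes)
        ]

        # E-step: recompute posteriors in log space for numerical stability.
        logp = {i: [math.log(p) for p in prior] for i in images}
        for i, a, k in labels:
            right = math.log(acc[a])
            wrong = math.log((1 - acc[a]) / (num_classes - 1))
            for c in range(num_classes):
                logp[i][c] += right if c == k else wrong
        for i in images:
            m = max(logp[i])
            w = [math.exp(v - m) for v in logp[i]]
            s = sum(w)
            post[i] = [v / s for v in w]
    return post, acc
```

Under this model, annotators who often disagree with the consensus receive a low estimated accuracy, so their labels carry little weight in the resulting 7-dimensional vector for each image. A posterior with one dominant peak would correspond to the single-modal case described above, and one with two comparable peaks to the bimodal case.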

The cross-database experiment and AU analysis on RAF-DB indicate that the AUs of real-world expressions are much more diverse than, or even deviate from, those of lab-controlled ones guided by psychologists. To address this ambiguity of unconstrained emotion, we further propose a novel Deep Locality-Preserving CNN (DLP-CNN). Inspired by [17], we develop a practical back-propagation algorithm which creates a locality-preserving loss (LP loss) aiming to pull the locally neighboring faces of the same class together. Jointly trained with the classical softmax loss, which forces different classes to stay apart, the locality-preserving loss drives the intra-class local clusters of each

http://whdeng.cn/RAF/model1.html
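The idea of a locality-preserving loss trained jointly with softmax, as described in the introduction, can be sketched as follows. The exact formulation is not given in this excerpt, so the form used here is an assumption: each feature vector is penalized by half its squared distance to the mean of its k nearest same-class neighbors. The function names, the neighbor count `k`, and the weight `lam` are hypothetical, and the sketch only evaluates the loss value; the paper trains the network by back-propagation.

```python
import math


def locality_preserving_loss(features, labels, k=2):
    """Sum over samples of 0.5 * ||x_i - mean of k nearest same-class neighbours||^2.

    Assumed form for illustration; not taken from the excerpt.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(features, labels)):
        # Candidate neighbours: all other samples with the same label.
        same = [f for j, (f, l) in enumerate(zip(features, labels))
                if l == y and j != i]
        if not same:
            continue
        same.sort(key=lambda f: math.dist(x, f))
        nbrs = same[:k]
        centre = [sum(col) / len(nbrs) for col in zip(*nbrs)]
        total += 0.5 * math.dist(x, centre) ** 2
    return total


def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed with the log-sum-exp trick."""
    total = 0.0
    for z, y in zip(logits, labels):
        m = max(z)
        log_sum = m + math.log(sum(math.exp(v - m) for v in z))
        total += log_sum - z[y]
    return total / len(labels)


def joint_loss(features, logits, labels, lam=0.01, k=2):
    """Softmax loss plus lam times the LP loss; lam is a hypothetical weight."""
    return (softmax_cross_entropy(logits, labels)
            + lam * locality_preserving_loss(features, labels, k))
```

In this sketch the softmax term separates classes, while the LP term shrinks each sample toward its own local same-class neighborhood rather than toward a single class center, which leaves room for one expression class to form several local clusters.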
