Spatio-temporal Analysis for Infrared Facial Expression
Recognition from Videos
Zhilei Liu
School of Computer Science and Technology
Tianjin University
Tianjin, China 300072
zhileiliu@tju.edu.cn
Cuicui Zhang
School of Marine Science and Technology
Tianjin University
Tianjin, China 300072
cuicui.zhang@tju.edu.cn
ABSTRACT
Facial expression recognition (FER) for emotion inference has
become one of the most important research fields in human-
computer interaction. Existing studies on FER mainly focus on
visible images, whose performance may be degraded by varying
lighting conditions. Recent studies have demonstrated the
advantages of infrared thermal images, which reflect temperature
distributions and are robust to lighting changes. In this paper, a
novel FER method based on infrared image sequences is proposed
using spatio-temporal feature analysis and deep Boltzmann
machines (DBM). Firstly, a dense motion field across the infrared
image sequence is generated using an optical flow algorithm. Then,
PCA is applied for dimension reduction and a three-layer DBM
structure is designed for final expression classification. Finally,
the effectiveness of the proposed method is demonstrated through
several experiments conducted on the NVIE database.
CCS Concepts
• Computing methodologies → Computer vision
representations; Image representations;
Keywords
Facial expression recognition; infrared image sequences; optical
flow; deep Boltzmann machine
1. INTRODUCTION
Facial expression recognition (FER) has become an important
area of personalized human-computer interaction [1, 3, 4].
Existing works on FER mainly focus on visible images, whose
performance may be affected by varying lighting conditions.
Recent studies have shown that infrared thermal images (IRTI),
which reflect the temperature distribution of subjects, are less
sensitive to lighting conditions [1, 2]. Infrared expression
recognition has therefore been recognized as a crucial complement
to visible-image-based FER [1, 5, 6]. Existing feature extraction
methods for FER can be roughly divided into two types: static
image based methods and dynamic image sequence (e.g., video)
based methods. Examples of the first type include the Active
Contour Model (ACM) [7], the Active Shape Model (ASM) [8],
the Active Appearance Model (AAM) [9], the Gabor filter [10],
the Elastic Graph Matching (EGM) [11], and the Fisher
Discriminant Analysis method [12]. The second type includes
dense motion field based methods (e.g., [13]) and key feature point
based methods (e.g., [14]). Existing recognition methods include
shallow structure methods such as the Hidden Markov Model
(HMM) [15], the Support Vector Machine (SVM) [16], and
Adaboost [17], as well as deep structure methods such as the Deep
Belief Network (DBN) [26], the Deep Boltzmann Machine (DBM)
[27], the Convolutional Neural Network (CNN) [1, 25], and the
Auto-Encoder (AE) [18]. These methods can be applied to either
visible images or infrared images.
There are also several methods designed specifically for infrared
images. For example, Benjamin Hernandez and Gustavo Olague
[19] used the Gray Level Co-occurrence Matrix (GLCM) to extract
features from infrared images for the recognition of surprise,
happiness, and anger. Guotai Jiang et al. [20] extracted global and
local features from the region of interest (ROI) in infrared images
for facial expression recognition. Yasunari Yoshitomi et al. [21]
used the two-dimensional Discrete Cosine Transform (2D-DCT)
to extract features in the frequency domain for facial expression
recognition. A. Merla and G.L. Romani [22] recognized happiness,
fear, and disgust by analyzing the facial temperature distribution
of 10 subjects. Although these methods, which exploit spatial
facial features, have achieved some success, they operate only on
static images rather than dynamic image sequences. Since facial
expressions are dynamic processes, temporal information is also
very important for recognition. Therefore, a spatio-temporal
method that combines both spatial and temporal features is needed.
To address the problems mentioned above, in this paper we
propose a novel spatio-temporal feature analysis method based on
optical flow and a deep learning model, the Deep Boltzmann
Machine (DBM) [27]. Unlike existing works that perform spatial
feature extraction on static images, spatio-temporal features are
extracted from infrared image sequences in this paper. Firstly, we
use the optical flow estimation method [23] to generate a dense
motion field between each pair of adjacent infrared images. Then,
we use principal component analysis (PCA) [24] for dimension
reduction. Finally, the DBM model is utilized to realize the FER
task.
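To make the first step concrete, the following is a minimal sketch of dense motion field extraction between adjacent frames of an infrared sequence. It assumes OpenCV's Farneback estimator as a stand-in for the optical flow method of [23]; the frame size, sequence length, and parameter values are illustrative only.

```python
# A minimal sketch of the dense motion field step, assuming OpenCV's
# Farneback estimator as a stand-in for the optical flow method of [23].
import cv2
import numpy as np

def dense_motion_field(frames):
    """frames: list of grayscale infrared images (H x W uint8 arrays).
    Returns an (N-1, H, W, 2) array of per-pixel (dx, dy) displacements
    between each pair of adjacent frames."""
    flows = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        # Positional arguments: pyr_scale, levels, winsize, iterations,
        # poly_n, poly_sigma, flags.
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)
    return np.stack(flows)

# Random frames stand in for a real 10-frame infrared sequence.
frames = [np.random.randint(0, 256, (120, 160), dtype=np.uint8)
          for _ in range(10)]
motion = dense_motion_field(frames)   # shape: (9, 120, 160, 2)
```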
2. INFRARED FACIAL EXPRESSION
RECOGNITION BASED ON OPTICAL
FLOW AND DBM
The framework of our method is illustrated in Fig. 1. It consists of
three steps: the optical flow algorithm for spatio-temporal dense
motion field extraction, PCA for dimension reduction, and the
DBM for facial expression recognition.
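As a rough illustration of the latter two steps, the sketch below reduces flattened motion fields with PCA and trains a classifier on the result. Standard libraries offer no ready-made DBM, so a logistic regression stands in for the three-layer DBM purely to show the data flow; the data are random placeholders, not NVIE features.

```python
# PCA dimension reduction followed by a stand-in classifier; the paper's
# three-layer DBM would replace the logistic regression in practice.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5000))      # 200 flattened motion fields (placeholder)
y = rng.integers(0, 6, size=200)      # six basic expression labels (placeholder)

model = make_pipeline(PCA(n_components=50),
                      LogisticRegression(max_iter=1000))
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```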