
Nesterov Accelerated Gradient Descent-based Convolution Neural
Network with Dropout for Facial Expression Recognition
Wanjuan Su, Luefeng Chen, Min Wu, Mengtian Zhou, Zhentao Liu and Weihua Cao
Abstract— A Nesterov accelerated gradient descent-based convolution neural network (NAGDCNN)
with dropout is proposed for facial expression recognition, which fuses the convolution neural
network (CNN) with Softmax regression to construct a deep convolution neural network (DCNN)
that can excavate high-level expression features and classify them. A dropout layer is added
after the sub-sampling layer, which effectively reduces overfitting and the network's training
time. Moreover, Nesterov accelerated gradient descent (NAGD) is used to optimize the network
weights, which helps prevent the updates from going too fast or too slow and enhances the
response capability of the network. To verify the effectiveness of the proposal, experiments
on a benchmark database are conducted, and the experimental results show that the proposal
outperforms state-of-the-art methods. Furthermore, an application experiment is also carried
out, and the results indicate the feasibility of the proposal in practical applications.
Key Words—Deep learning, facial expression recognition, Nesterov accelerated gradient descent,
dropout, principal component analysis

This work was supported by the National Natural Science Foundation of China under Grants
61733016, 61603356 and 61210011, the Hubei Provincial Natural Science Foundation of China
under Grant 2015CFA010, and the 111 Project under Grant B17040.
W. J. Su, L. F. Chen, M. Wu, M. T. Zhou, Z. T. Liu, and W. H. Cao are with the School of
Automation, China University of Geosciences, Wuhan 430074, China, and also with the Hubei Key
Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan 430074,
China. (Corresponding author: chenluefeng@cug.edu.cn)
I. INTRODUCTION
With the development of various technologies, the level of social intelligence keeps
increasing, and people's expectations for the human-robot interaction (HRI) experience are
rising accordingly. However, existing machines are unable to interact with people emotionally
[1]. Facial expression is one of the main channels through which humans express emotion [2],
so facial expression recognition (FER) is conducive to enabling machines to recognize, and
even understand, human emotions. FER has a wide range of applications [3], such as fatigue
driving detection, remote nursing, and HRI. Therefore, more accurate FER can promote the
development of social intelligence.
FER can be divided into expression feature extraction and expression feature recognition [4].
For expression feature extraction, various methods have been employed in previous papers,
e.g., active appearance models [5], the scale-invariant feature transform [6], local binary
patterns [7],
Gabor wavelet transform [8], and so on. In particular, principal component analysis (PCA) [9]
is a commonly used feature extraction algorithm that can simplify the data structure by
reducing its dimensionality. In order to learn projection subspaces equipped with robustness
and generalization ability, new subspace learning algorithms based on the standard PCA, linear
discriminant analysis, clustering-based discriminant analysis (CDA), and their combinations
are proposed in [10], where the combination of PCA and CDA achieves better performance on
facial expression databases. Here, PCA is also chosen to extract expression features to
address the problems of data redundancy and high dimensionality.
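As an illustration of this preprocessing step, the following Python sketch shows how PCA-based
dimensionality reduction can be applied to vectorized face images; it uses only NumPy, and the
image size and the number of retained components are illustrative assumptions rather than the
settings used in this paper.

import numpy as np

def pca_reduce(X, n_components=50):
    """Project row-vector samples X (n_samples x n_features) onto the
    top principal components. Returns projections, basis, and mean."""
    mean = X.mean(axis=0)                      # per-feature mean
    Xc = X - mean                              # center the data
    # Economy-size SVD: rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]             # top-k principal axes
    Z = Xc @ components.T                      # low-dimensional features
    return Z, components, mean

# Example: 100 face images of size 48x48, flattened to 2304-dim vectors
X = np.random.rand(100, 48 * 48)
Z, components, mean = pca_reduce(X, n_components=50)
print(Z.shape)  # (100, 50): compact expression features for the classifier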
Facial expression feature recognition aims to design a suitable classification mechanism to
recognize facial expressions; common algorithms include hidden Markov models [11], support
vector machines (SVM) [12], etc. A framework for FER using appearance features of salient
facial patches (SFP) is proposed in [13], which investigates the relevance of different facial
patches, and experiments on benchmark databases show the effectiveness of SFP. Nevertheless,
its process of obtaining high-level expression features is very complicated. In contrast, deep
learning (DL) has a strong capability for unsupervised feature learning, which has brought
about changes and leaps in various fields [14].
DL aims at discovering high-level distributed representations of the input data and has been
widely used in speech recognition, image recognition, and other fields. Hinton et al. [15]
used deep belief networks (DBN) and deep autoencoders to perform simple image recognition and
dimensionality reduction tasks, which demonstrated the feasibility of applying deep neural
networks (DNN) to image recognition. On this basis, many researchers have begun to apply DL
to FER. For instance, Liu et al. [16] proposed to adapt 3D convolutional neural networks
(3DCNN) with deformable action parts (DAP) constraints; namely, a deformable-parts learning
component is incorporated into the 3DCNN, which can detect specific facial action parts under
structured spatial constraints and obtain a discriminative part-based representation
simultaneously. A deep convolution neural network (DCNN) is applied to perform feature
learning and smile detection simultaneously in [17], where the learned features are used to
train an SVM or AdaBoost classifier, showing that the learned features have impressive
discriminative ability. It can be seen that DL can effectively combine feature learning and
classification into a single model. In a CNN, convolution layers and sub-sampling layers are
usually stacked iteratively to extract high-level semantic features.
Here, we develop a DCNN for FER by fusing the CNN with Softmax regression (SR). Besides, a
dropout layer is employed in the DCNN, which can effectively alleviate the overfitting problem
[18] and reduce the network's training time to some extent.
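To make the described structure concrete, the following Python sketch (written with PyTorch,
which is not used in this paper; the layer sizes, dropout rate, input resolution, learning
rate, and momentum are illustrative assumptions) stacks convolution and sub-sampling layers,
places a dropout layer after each sub-sampling layer, attaches a Softmax-regression output,
and updates the weights with Nesterov accelerated gradient descent.

import torch
import torch.nn as nn

class ExpressionCNN(nn.Module):
    """Convolution + sub-sampling blocks with dropout, followed by a
    Softmax-regression (fully connected + softmax) classifier."""
    def __init__(self, num_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5), nn.ReLU(),
            nn.MaxPool2d(2),          # sub-sampling layer
            nn.Dropout(p=0.5),        # dropout after sub-sampling
            nn.Conv2d(16, 32, kernel_size=5), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Dropout(p=0.5),
        )
        self.classifier = nn.Linear(32 * 9 * 9, num_classes)  # assumes 48x48 input

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)     # logits; softmax is applied inside the loss

model = ExpressionCNN()
# Nesterov accelerated gradient descent on the network weights
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, nesterov=True)
criterion = nn.CrossEntropyLoss()     # log-softmax + NLL, i.e., Softmax regression loss

# One illustrative training step on random data (batch of 48x48 grayscale faces)
images = torch.randn(8, 1, 48, 48)
labels = torch.randint(0, 7, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()

Setting nesterov=True in the SGD optimizer selects the look-ahead gradient evaluation that
distinguishes NAGD from classical momentum.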