Research on a head pose estimation method based on deep convolutional neural networks

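"Head pose estimation based on a deep convolutional network"

Head pose estimation based on deep convolutional networks is an important and challenging task in computer vision. Its goal is to determine the orientation and pose of the head by analyzing a face image. The technique has broad application prospects, including human-computer interaction, face recognition, and virtual reality.

A deep convolutional network (DCNN) is a deep-learning neural network model that learns image features automatically and uses them for classification. DCNNs have achieved strong results in image classification, object detection, and image segmentation. For head pose estimation, a DCNN extracts features from the face image and classifies them to estimate the pose. The approach handles large volumes of image data efficiently and learns robust feature representations, which improves estimation accuracy.

In this paper, the authors propose a DCNN-based head pose estimation method. The method first roughly crops the face image to remove background noise and non-facial regions, then uses a DCNN to extract features and classify the image into a head pose. The paper reports that, on the CAS-PEAL-R1, CMU PIE, and CUBIC FacePix databases, the method performs better than the state-of-the-art methods it was compared with.

Related key points:

1. Head pose estimation: an important and challenging computer vision task whose goal is to determine the orientation and pose of the head by analyzing a face image.
2. Deep convolutional network (DCNN): a deep-learning neural network model that learns image features automatically and classifies them.
3. Face image processing: a key step in head pose estimation, covering face detection, face recognition, and face alignment.
4. Feature extraction: an important step in head pose estimation; a DCNN learns the image features automatically.
5. Multiclass classification: the final goal of this approach; the DCNN assigns each face image to one of several pose classes to estimate the head pose.

Keywords: deep learning, head pose estimation, deep convolutional network, face image processing, feature extraction, multiclass classification.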

930 Cai et al. / Front Inform Technol Electron Eng 2015 16(11):930-939

Frontiers of Information Technology & Electronic Engineering

www.zju.edu.cn/jzus; engineering.cae.cn; www.springerlink.com

ISSN 2095-9184 (print); ISSN 2095-9230 (online)

E-mail: jzus@zju.edu.cn

Multiclass classification based on a deep convolutional network for head pose estimation*

Ying CAI¹,², Meng-long YANG‡³, Jun LI

(¹School of Computer Science, Sichuan University, Chengdu 610065, China)

(²College of Information Engineering, Sichuan Agricultural University, Yaan 625014, China)

(³School of Aeronautics and Astronautics, Sichuan University, Chengdu 610065, China)

E-mail: caiying34@qq.com; steinbeck@163.com; ljun402@163.com

Received Apr. 20, 2015; Revision accepted May 15, 2015; Crosschecked Oct. 16, 2015

Abstract: Head pose estimation has been considered an important and challenging task in computer vision. In this paper we propose a novel method to estimate head pose based on a deep convolutional neural network (DCNN) for 2D face images. We design an effective and simple method to roughly crop the face from the input image, maintaining the individual-relative facial features ratio. The method can be used in various poses. Then two convolutional neural networks are set up to train the head pose classifier and then compared with each other. The simpler one has six layers. It performs well on seven yaw poses but is somewhat unsatisfactory when mixed in two pitch poses. The other has eight layers and more pixels in input layers. It has better performance on more poses and more training samples. Before training the network, two reasonable strategies including shift and zoom are executed to prepare training samples. Finally, feature extraction filters are optimized together with the weight of the classification component through training, to minimize the classification error. Our method has been evaluated on the CAS-PEAL-R1, CMU PIE, and CUBIC FacePix databases. It has better performance than state-of-the-art methods for head pose estimation.

Key words: Head pose estimation, Deep convolutional neural network, Multiclass classification

doi:10.1631/FITEE.1500125 Document code: A CLC number: TP391
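
The abstract mentions that shift and zoom strategies are used to prepare training samples, but this excerpt gives no parameters. The following is a minimal NumPy sketch of what such an augmentation step could look like, not the authors' implementation. The function names `shift_image`, `zoom_image`, and `augment`, the shift offsets, the zoom factors, the edge-value border fill, and the nearest-neighbour sampling are all assumptions chosen for illustration.

```python
import numpy as np

def shift_image(img, dx, dy):
    """Shift a 2D image right by dx and down by dy pixels.

    Exposed borders are filled by repeating edge values (an assumption).
    """
    h, w = img.shape
    pad = max(abs(dx), abs(dy))
    padded = np.pad(img, pad, mode="edge")
    top, left = pad - dy, pad - dx
    return padded[top:top + h, left:left + w]

def zoom_image(img, factor):
    """Zoom about the image centre, keeping the output size.

    factor > 1 zooms in, factor < 1 zooms out. Sampling is nearest-neighbour
    and source coordinates are clipped at the border.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys = np.clip(np.rint(cy + (np.arange(h) - cy) / factor), 0, h - 1).astype(int)
    xs = np.clip(np.rint(cx + (np.arange(w) - cx) / factor), 0, w - 1).astype(int)
    return img[np.ix_(ys, xs)]

def augment(img, shifts=(-2, 0, 2), zooms=(0.9, 1.0, 1.1)):
    """Generate every shift/zoom combination of one cropped face image.

    The default offsets and factors are hypothetical values.
    """
    out = []
    for z in zooms:
        zoomed = zoom_image(img, z)
        for dy in shifts:
            for dx in shifts:
                out.append(shift_image(zoomed, dx, dy))
    return np.stack(out)
```

With the default arguments, `augment` turns one cropped face into 27 variants (3 zoom factors times 3 vertical times 3 horizontal shifts), all carrying the same pose label as the original image.
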

1 Introduction

The problem of head pose estimation has enjoyed substantial attention in the computer vision community. Robust algorithms of head pose estimation could be beneficial for many applications, such as video surveillance, human computer interaction, video conferencing, and face recognition. However, it is still an intrinsically challenging task because of the appearance variation between identities, complex illumination, varied background, and other factors. Many methods use classification or regression to solve the problem of pose estimation. In this paper, we treat the problem of head pose estimation as a classification question, because we believe that there are invariant essential features in the images with the same pose and these features are suitable for pose classification. Furthermore, we find that the deep convolutional neural network (DCNN) performs well on many visual tasks, because spatial topology and shift-invariant local features are well captured (LeCun et al., 1998). We consider that appropriate DCNN architecture and an effective image preprocess will produce good performance on head

‡ Corresponding author

* Project supported by the National Key Scientific Instrument and Equipment Development Project of China (No. 2013YQ49087903), the National Natural Science Foundation of China (No. 61402307), and the Educational Commission of Sichuan Province, China (No. 15ZA0007)

ORCID: Ying CAI, http://orcid.org/0000-0002-5096-6175

© Zhejiang University and Springer-Verlag Berlin Heidelberg 2015
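
The classification framing described in the introduction can be illustrated with a short NumPy sketch: each discrete pose becomes one class, the network outputs one score per class, and training minimizes the classification error through a softmax cross-entropy loss. This is only a sketch under stated assumptions. The yaw angles in `YAW_BINS` are hypothetical placeholders (the abstract mentions seven yaw poses but this excerpt does not list the angles), and the helper names are invented for illustration.

```python
import numpy as np

# Hypothetical yaw bins in degrees; the real pose labels depend on the dataset.
YAW_BINS = np.array([-45, -30, -15, 0, 15, 30, 45])

def angle_to_class(yaw_deg):
    """Map a yaw angle to the index of the nearest pose bin."""
    return int(np.argmin(np.abs(YAW_BINS - yaw_deg)))

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean negative log-likelihood of the true pose classes."""
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

def predict_yaw(logits):
    """Return the bin angle of the highest-scoring class for each sample."""
    return YAW_BINS[np.argmax(logits, axis=-1)]
```

Here `logits` stands in for the output of the final layer of a network with one unit per pose class. A regression formulation would instead predict the angle as a continuous value.
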
