1478 IEEE SIGNAL PROCESSING LETTERS, VOL. 24, NO. 10, OCTOBER 2017
Two-Stream Deep Correlation Network for
Frontal Face Recovery
Ting Zhang, Qiulei Dong, Ming Tang, and Zhanyi Hu
Abstract—Pose and textural variations are two dominant factors
that affect the performance of face recognition. It is widely believed
that generating the corresponding frontal face from a face image of
an arbitrary pose is an effective step toward improving recognition
performance. In the literature, however, the frontal face
is generally recovered by exploiting textural characteristics only.
In this letter, we propose a two-stream deep correlation network,
which incorporates both geometric and textural features for frontal
face recovery. Given a face image under an arbitrary pose as in-
put, geometric and textural characteristics are first extracted from
two separate streams. The extracted characteristics are then fused
through the proposed multiplicative patch correlation layer. These
two steps are integrated into one network for end-to-end train-
ing and prediction, which is demonstrated effective compared with
state-of-the-art methods on the benchmark datasets.
Index Terms—Correlation layer, deep neural network, frontal
face recovery, geometric stream, textural stream.
I. INTRODUCTION
FACE recognition is a field of great potential, which has been
widely used in access control, video surveillance, personal
verification, etc. Over the past decade, there have been tremen-
dous advances in face recognition, most of which are owed to
the development of deep learning [1]–[5]. Although data-driven
features extracted by deep neural networks show great advantages
over the hand-crafted ones in face recognition [6]–[10],
the performance of face recognition is usually influenced by the
large variations in pose, illumination, expression, etc. Among
them, pose variation has been a persistent challenge because it
may make the intraperson variance exceed the interperson one.

Manuscript received May 14, 2017; revised July 10, 2017; accepted July
20, 2017. Date of publication August 7, 2017; date of current version August
29, 2017. This work was supported in part by the Strategic Priority Research
Program of the Chinese Academy of Sciences under Grant XDB02070002,
and in part by the National Natural Science Foundation of China under Grant
61421004, Grant 61375042, and Grant 61573359. The associate editor coordinating
the review of this manuscript and approving it for publication was
Dr. Sumohana S. Channappayya. (Corresponding author: Qiulei Dong.)

T. Zhang is with the National Laboratory of Pattern Recognition, Institute
of Automation, Chinese Academy of Sciences, Beijing 100190, China, and also
with the University of Chinese Academy of Sciences, Beijing 100049, China
(e-mail: ting.zhang@nlpr.ia.ac.cn).

Q. Dong and Z. Hu are with the National Laboratory of Pattern Recognition,
Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China,
with the University of Chinese Academy of Sciences, Beijing 100049,
China, and also with the Center for Excellence in Brain Science and Intelligence
Technology, Chinese Academy of Sciences, Beijing 100190, China (e-mail:
qldong@nlpr.ia.ac.cn; huzy@nlpr.ia.ac.cn).

M. Tang is with the National Laboratory of Pattern Recognition, Institute
of Automation, Chinese Academy of Sciences, Beijing 100190, China (e-mail:
tangm@nlpr.ia.ac.cn).

Color versions of one or more of the figures in this letter are available online
at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/LSP.2017.2736542
In view of this, many methods have been proposed to transfer
a face image of arbitrary pose to the frontal one. These meth-
ods can be roughly classified into two groups: Two-dimensional
(2-D)-based methods [11]–[17] and 3-D-based ones [18]–[21].
The 2-D-based techniques usually encode a test image with
some exemplars, or use 2-D image matching algorithms to address
the pose variation. In [12], Markov random fields were applied
to infer the frontal face images. Li et al. [15] proposed an
elastic matching method that aligned patches and matched
face images of different poses based on a Gaussian mixture
model. In [1], a deep convolutional neural network was proposed
to recover the frontal image of neutral illumination from
those with arbitrary poses and illumination. In [11], a new deep
architecture was presented to generate face images with target-
poses from those with arbitrary poses and illumination. In [17],
recurrent neural networks were combined with autoencoders to
render sequences of rotated face images through incremental
3-D rotations.
The 3-D-based techniques attempt to match the captured 3-D
facial data to probe face images or align a probe face image to
a 3-D face model. Asthana et al. [19] constructed an aligned
3-D face model from a nonfrontal face image, and then rotated
the model to render a frontal face image. In [20], a virtual
view for the probe image was generated based on a set of 3-D
displacement fields sampled from a 3-D face database and the
synthesized faces were tested.
Despite the demonstrated success, the performance of ex-
isting methods on frontal face recovery is still limited. The
methods based on 3-D reconstruction are time consuming and
sometimes require several views captured at multiple poses.
Although 2-D reconstruction methods are efficient and require
only a single input image, their performance is limited
because they exploit only facial textures to align face images.
These textures are not effective enough to locate correspondences
when the face undergoes out-of-plane rotation.
In this letter, we propose a two-stream deep correlation net-
work (TSDCN) to solve the aforementioned limitations. Given
an input face image, we extract the textural and geometric features
independently via two streams. The textural stream performs
similarly to existing methods, and the geometric stream
predicts the angles of the face poses. The angle predictions are
then correlated with the texture correspondence to predict the
recovered face image. Experimental results on the Multi-PIE
and labeled faces in the wild (LFW) datasets demonstrate the
validity of the proposed method.
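The two-stream pipeline can be sketched in a few lines of NumPy. Everything here is an illustrative assumption rather than the letter's actual architecture: the patch and feature dimensions are invented, each deep stream is stood in for by a single random linear map, and the multiplicative patch correlation layer is approximated as an element-wise product of each patch's textural feature with a broadcast geometric (pose) feature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the letter): a flattened 32x32 face,
# split into 16 patches, each described by a 64-D textural feature.
NUM_PATCHES, FEAT_DIM, POSE_DIM = 16, 64, 3

# Stand-ins for the two streams: in the letter these are deep sub-networks;
# here each is a single random linear map for illustration only.
W_tex = rng.standard_normal((NUM_PATCHES * FEAT_DIM, 1024))
W_geo = rng.standard_normal((FEAT_DIM, POSE_DIM))

def textural_stream(image_vec):
    # (1024,) flattened face -> (16, 64) per-patch textural features.
    return (W_tex @ image_vec).reshape(NUM_PATCHES, FEAT_DIM)

def geometric_stream(pose_angles):
    # (3,) pose angles -> (64,) geometric feature shared across patches.
    return np.tanh(W_geo @ pose_angles)

def multiplicative_patch_correlation(tex, geo):
    # Fuse by element-wise (multiplicative) correlation: the geometric
    # feature modulates every patch's textural feature via broadcasting.
    return tex * geo

image = rng.standard_normal(1024)
pose = np.array([0.5, 0.1, 0.0])  # hypothetical yaw/pitch/roll
fused = multiplicative_patch_correlation(textural_stream(image),
                                         geometric_stream(pose))
print(fused.shape)  # (16, 64)
```

In this sketch the fused (16, 64) tensor would feed a decoder that predicts the recovered frontal face; the multiplicative form means that pose information gates, rather than is concatenated with, the texture correspondence.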
The contributions of this work include the following.
1) We propose a two-stream network to tackle the frontal
face recovery problem, which can independently capture
textural and geometric features of the input face image.
1070-9908 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.