top of the L-Softmax loss with weight normalization on a hypersphere manifold. Wang et al. [22] reformulated the softmax loss as a cosine loss by L2-normalizing both the features and the weight vectors to remove radial variations, while Deng et al. proposed the ArcFace loss [23], which uses the arc-cosine function to compute the angle between the current feature and the target weight. Most of these methods aim to improve discrimination based on holistic features. In contrast, we further exploit local facial information to make face features more informative.
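For reference, these margin-based softmax variants share a common structure; the following is a minimal PyTorch sketch of an ArcFace-style classification head, where the scale s and margin m values are illustrative defaults rather than those prescribed in [23]:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcMarginHead(nn.Module):
    """Sketch of an ArcFace-style angular-margin head (illustrative)."""
    def __init__(self, feat_dim, num_classes, s=64.0, m=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s, self.m = s, m  # scale and additive angular margin (assumed values)

    def forward(self, features, labels):
        # L2-normalize features and class weights so logits become cosines.
        cos = F.linear(F.normalize(features), F.normalize(self.weight))
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        # Add the angular margin m only to the target-class angle.
        one_hot = F.one_hot(labels, cos.size(1)).float()
        logits = self.s * torch.cos(theta + self.m * one_hot)
        return F.cross_entropy(logits, labels)
```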
B. Facial Parsing
Facial parsing estimates the semantic class of each pixel, implicitly providing a semantic segmentation of the face. Warrell and Prince [32] introduced LabelFaces, which
used priors to loosely model the topological structure of
face images. Le et al. [33] proposed an active shape model
that allowed for greater independence among facial compo-
nents and improved the appearance fitting step by a Viterbi
optimization process. Zhou et al. [34] presented an interlinked
convolutional neural network (iCNN), where a special inter-
linking layer was designed to integrate local information and
contextual information efficiently. Luo et al. [35] presented
a hierarchical structure via deep learning, where they used
component-specific segmentors on each component to estimate pixel-wise labels. Because such segmentors generalize poorly under complicated label interactions, an exemplar-based face parsing method [36] was proposed that relies on hand-labeled segmentation maps and a set of sparse keypoint descriptors.
Zhou et al. [37] presented a Fully-Convolutional continuous CRF Neural Network (FC-CNN) architecture to achieve high segmentation accuracy. On the other hand, parsing
information could benefit other facial tasks. Chen et al. [38]
made full use of the geometry prior (e.g., parsing maps) to
super-resolve low-resolution images. Lu et al. [39] advanced the expression synthesis domain by introducing a Couple-Agent Face Parsing based Generative Adversarial Network (CAFP-GAN) that unites the knowledge of facial semantic regions and controllable expression signals. Inspired by these works, we employ facial parsing, which assigns each pixel a probability over semantic classes, to obtain a facial semantic segmentation that reveals facial geometry and complements the appearance information.
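As a concrete illustration (not the specific parser of [34]), per-pixel class probabilities can be read off any segmentation network's output with a softmax over the channel dimension; the segmentor interface below is an assumption for the sketch:

```python
import torch.nn.functional as F

# Hypothetical usage: 'segmentor' is any network mapping a (B, 3, H, W)
# face image to (B, C, H, W) logits, one channel per semantic class
# (skin, brows, eyes, nose, mouth, ...).
def parse_face(segmentor, image):
    logits = segmentor(image)         # (B, C, H, W) raw class scores
    probs = F.softmax(logits, dim=1)  # per-pixel class probabilities
    labels = probs.argmax(dim=1)      # (B, H, W) hard segmentation map
    return probs, labels
```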
III. PROPOSED APPROACH
In this section, we first introduce the motivation and give an overview of our proposed approach, then describe each part of our framework in detail. Lastly, we present the implementation details.
A. Motivations
Deep learning based face recognition methods have proven effective, relying on the discriminative power of advanced networks. Nevertheless, the resulting features are built almost entirely on holistic facial appearance characteristics, so the representation remains insufficient because detailed local information is ignored. For example, the properties of facial components (e.g., eyes and nose) also provide evidence for discerning identities.
Fig. 2. Cosine similarity among faces. I_a and I_b1 are from different identities while I_b1 and I_b2 are the same person with various poses. 'AF', 'SF' and 'Fusion' indicate appearance features, semantic local features and their combination, respectively. With the help of semantic features, the face representation distinguishes the different identities I_a and I_b1 more clearly. On the other hand, the images I_b1 and I_b2 have a large appearance-feature distance due to pose variations, whereas the semantic features capture the characteristics of each facial component, which remain similar across faces of the same person.
To make full use of the local component features, we observe that facial parsing [34] can segment the face into semantic parts, covering rich localized information. The generated local features are potentially complementary to the holistic features.
Fig. 2 shows the cosine similarity scores of two persons with appearance features, semantic features and the fused features, respectively. I_a and I_b1 are different identities while I_b1 and I_b2 denote the same person with various poses. We compute the similarity between the different identities (I_a and I_b1) and the same person (I_b1 and I_b2) with these three types of features. The appearance similarity between I_a and I_b1 is higher than that between I_b1 and I_b2, so with the holistic information alone, the appearance features may lead to a wrong verification. For the semantic parsing features, the similarity between different persons is clearly lower than the appearance similarity. The semantic information generated by facial parsing provides details of facial components, such as big or small eyes, which complements the appearance information; fusing it in reduces the similarity among different persons. On the other hand, the appearance features of I_b1 and I_b2 differ greatly due to the pose variations, while the semantic features reveal the personalized attributes of each component and thus show the similarity among images of the same person. Therefore, our framework targets incorporating holistic and local information to enhance the discriminative ability of face descriptors.
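To make the comparison in Fig. 2 concrete, the three similarity scores can be computed as in the sketch below; the concatenation-based fusion is an assumption for illustration, not necessarily the fusion used in FSENet:

```python
import torch
import torch.nn.functional as F

def cosine_sim(x, y):
    # Cosine similarity between two feature vectors.
    return F.cosine_similarity(x.unsqueeze(0), y.unsqueeze(0)).item()

def fused_sim(af_x, af_y, sf_x, sf_y):
    # Illustrative 'Fusion': concatenate normalized appearance ('AF')
    # and semantic ('SF') features before measuring similarity.
    fx = torch.cat([F.normalize(af_x, dim=0), F.normalize(sf_x, dim=0)])
    fy = torch.cat([F.normalize(af_y, dim=0), F.normalize(sf_y, dim=0)])
    return cosine_sim(fx, fy)
```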
B. Face Segmentor-Enhanced Network
Our proposed FSENet simultaneously exploits global and local information and mainly consists of four parts: a backbone module, a semantic parsing network, a part mask, and a correlation matrix module. The holistic and local features are