Long-Term Face Tracking: A New Method Based on Tracking-Learning-Detection

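"tld_tracking-learning-detection applied to faces." This paper presents a system for long-term tracking of a human face in unconstrained videos, built on the Tracking-Learning-Detection (TLD) approach. TLD is an object tracking algorithm designed to cope with occlusions, motion changes, and similar difficulties in complex environments. The system extends the TLD framework with two key concepts: a generic detector and a validator. The generic detector is trained offline and localizes frontal faces, while the validator is trained online and decides which detected faces correspond to the tracked subject. This design makes the system more robust to occlusions and appearance changes and lets it run in real time.

The paper quantitatively evaluates several strategies for building the validator during tracking. The system is validated on two videos: a sitcom episode (23 minutes) and a surveillance video (8 minutes). In both cases it detects and tracks the face and automatically learns a multi-view model from a single frontal example and an unlabeled video. Keywords: long-term face tracking, learning, detection, verification, real-time.

The core of the system is the combination of detection, learning, and validation. First, the pre-trained detector proposes candidate face regions. The validator then filters these candidates using the information gathered during tracking, so that the correct target face is selected. Over time the system keeps learning and adapts to changes in the target face, such as viewpoint, illumination, and expression, which helps it stay accurate and stable over long sequences.

The system also handles occlusion: even when the target is partially occluded, tracking can be recovered through the validator's decisions. It learns multi-view facial appearance from a single frontal face image, which is particularly useful with unlabeled data, since labeled examples for every possible viewpoint are usually unavailable. Overall, this application of TLD to faces demonstrates long-term tracking in real-world unconstrained videos and offers a real-time face tracking solution for complex environments. Its study of validator design and learning strategies is of theoretical and practical value for improving the robustness and accuracy of object tracking algorithms.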

FACE-TLD: TRACKING-LEARNING-DETECTION APPLIED TO FACES

Zdenek Kalal†, Krystian Mikolajczyk†, Jiri Matas‡

† Centre for Vision, Speech and Signal Processing, University of Surrey, UK
‡ Center for Machine Perception, Czech Technical University, Czech Republic

ABSTRACT

A novel system for long-term tracking of a human face in unconstrained videos is built on the Tracking-Learning-Detection (TLD) approach. The system extends TLD with the concept of a generic detector and a validator, which is designed for real-time face tracking resistant to occlusions and appearance changes. The off-line trained detector localizes frontal faces and the online trained validator decides which faces correspond to the tracked subject. Several strategies for building the validator during tracking are quantitatively evaluated. The system is validated on a sitcom episode (23 min.) and a surveillance (8 min.) video. In both cases the system detects and tracks the face and automatically learns a multi-view model from a single frontal example and an unlabeled video.

Index Terms: long-term face tracking, learning, detection, verification, real-time

1. INTRODUCTION

Long-term real-time tracking of human faces in unconstrained environments is a challenging problem: given a single example of a specific face, track the face in a video that may include frame cuts, sudden appearance changes, long-lasting occlusions, etc. In such environments, frame-by-frame tracking meets face detection and verification at one point, with the common goal of determining the location of the specific face. This paper proposes a novel solution suitable for such situations.
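The way detection and verification combine to locate one specific face can be illustrated with a minimal sketch. It is not the authors' implementation: the `Validator` class, the use of plain feature vectors, cosine similarity, and the 0.8 acceptance threshold are all assumptions made for illustration. A generic detector is assumed to have already produced candidate boxes with a feature vector each; the validator, initialized from a single example, keeps the candidate most similar to what it has learned so far and adds it to its model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class Validator:
    """Hypothetical online validator: a growing set of positive examples."""

    def __init__(self, first_example, threshold=0.8):
        # The model starts from a single example of the target face.
        self.examples = [list(first_example)]
        self.threshold = threshold  # assumed acceptance threshold

    def score(self, feature):
        # Similarity to the closest stored example.
        return max(cosine(feature, e) for e in self.examples)

    def select(self, detections):
        """detections: list of (box, feature). Return the best accepted pair or None."""
        best, best_score = None, self.threshold
        for box, feature in detections:
            s = self.score(feature)
            if s >= best_score:
                best, best_score = (box, feature), s
        return best

    def learn(self, feature):
        # Extend the model with a newly validated appearance.
        self.examples.append(list(feature))


# Two faces reported by a generic detector in one frame (made-up values).
validator = Validator([1.0, 0.0, 0.0])
frame_detections = [
    ((10, 10, 40, 40), [0.9, 0.3, 0.0]),  # similar to the target
    ((80, 12, 40, 40), [0.0, 1.0, 0.2]),  # a different person
]
match = validator.select(frame_detections)
if match is not None:
    validator.learn(match[1])
```

When no candidate passes the threshold, `select` returns `None`, which corresponds to the target being absent or occluded in that frame; the model is left unchanged in that case.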

Fig. 1. Our system tracks, learns and detects a specific face in real-time in unconstrained videos.

Two approaches are used for modeling object appearance in tracking: static and adaptive. Static models [1] assume that the object appearance change is limited and known; unexpected changes of the object appearance cannot be tracked. This drawback is addressed by adaptive methods [2], which update the object model during tracking. The underlying assumption is that every update is correct. Every incorrect update brings error into the model, which accumulates over time and causes drift. In the context of faces, the drift problem has been addressed by the introduction of so-called visual constraints [3]. Even though this approach demonstrated increased robustness and accuracy, its performance was tested only on videos where the face was in the field of view. In scenarios where a face moves in and out of the frame, face re-detection is essential. Face detection has been extensively studied [4] and a range of ready-to-use face detectors are available [5], which enable tracking-by-detection. Apart from expensive offline training, the disadvantage of tracking-by-detection is that all faces have the same model and therefore the identities cannot be distinguished. To alleviate this problem, Li et al. [6] proposed a face tracking algorithm that splits the face model into three parts with different lifespans. This makes the tracker suitable for low-frame-rate videos, but the longest period the face can disappear from the camera view is limited. Another class of approaches for face tracking was developed as part of automatic character annotation in video [7]. These systems can handle the scenario considered in this paper, but they have been designed for offline processing, and adaptation for real-time tracking is not straightforward.
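The drift of adaptive models mentioned above can be made concrete with a toy example. The sketch below is an illustration under strong assumptions, not a method from the cited works: appearance is reduced to a single number, the model is an exponential moving average with an assumed rate of 0.1, and the function names and the gate of 0.3 are hypothetical. The target's true appearance is 0.0 and an occluder with appearance 1.0 covers it for 20 frames.

```python
def run_adaptive(observations, model=0.0, rate=0.1):
    """Blind adaptive model: every observation is assumed to be a correct update."""
    for obs in observations:
        model = (1 - rate) * model + rate * obs
    return model


def run_gated(observations, model=0.0, rate=0.1, gate=0.3):
    """Same update, but observations far from the current model are rejected."""
    for obs in observations:
        if abs(obs - model) <= gate:
            model = (1 - rate) * model + rate * obs
    return model


# 20 frames in which the tracker locks onto an occluder (appearance 1.0).
occluded = [1.0] * 20

blind = run_adaptive(occluded)  # equals 1 - 0.9**20, roughly 0.88
gated = run_gated(occluded)     # stays at 0.0: all updates were rejected
```

After k incorrect updates the blind model has moved a fraction 1 - (1 - rate)^k of the way toward the occluder, so the error grows with every wrong update. Rejecting implausible updates avoids this in the toy case, at the cost of also rejecting genuine but sudden appearance changes.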

In this work, we build on an approach called Tracking-Learning-Detection (TLD) [8], whose learning part was analyzed in [9]. The TLD method was designed for long-term tracking of arbitrary objects in unconstrained environments. The object was tracked and simultaneously learned in order to build a detector that supports the tracker once it fails. The detector was built upon the information from the first frame as well as the information provided by the tracker. This paper has three contributions w.r.t. TLD: (i) An additional source of information (offline detector) is embedded into the TLD framework, which simplifies the learning task in cases when the ob-
