Sparse Affine Hull for Visual Tracking
Jun Wang, Yuanyun Wang, Chengzhi Deng*, Huasheng Zhu, Shengqian Wang and Li Lv
1 Jiangxi Province Key Laboratory of Water Information Cooperative Sensing and Intelligent Processing,
Nanchang Institute of Technology, Nanchang 330099, China
2 School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China
Wangjun012778@126.com, Wangyyabc@163.com, dengchengzhi@126.com, zhuhuasheng@sohu.com, Sqwang113@yahoo.com
Abstract—It is a challenging task to develop a robust appearance model due to various factors such as partial occlusion, fast motion, background clutter and illumination variations. In this paper, we propose a novel target representation for visual tracking: a target candidate is represented by a sparse affine combination of dictionary templates within a particle filter framework. Affine combinations of templates can cover unknown target appearances. In order to adapt to the dynamic scenes across a video sequence, the dictionary templates are updated during tracking. Experimental results on several challenging video sequences against state-of-the-art tracking algorithms demonstrate that the proposed algorithm is robust to illumination variations, background clutter, and other challenges.
I. INTRODUCTION
Visual tracking is an important problem in computer vision with a variety of applications such as vehicle navigation, human-computer interaction and video surveillance. The goal of visual tracking is to locate a tracked target across a video sequence. Although much progress has been made in recent years [1], it remains a challenging task to design an effective appearance model due to factors such as fast motion, motion blur, partial occlusion, illumination variation, in-plane and out-of-plane rotations and background clutter.
Generally speaking, visual tracking algorithms can be classified as either generative [2]-[5], [8]-[10] or discriminative [11]-[17].
Generative tracking algorithms typically learn an appearance model to represent target candidates and take the image region with the minimal reconstruction residual as the tracked target in the current frame. In [2], a target candidate is divided into multiple non-overlapping image patches, each of which is represented by a histogram. The similarity between each patch in a target candidate and the corresponding patch in the template is measured and used as a voting map to evaluate the likelihood. Because a fixed target template is used, the algorithm in [2] alleviates the drift problem; however, it is not robust to dynamic scene variations.
Kwon et al. [3] use multiple target appearance models to adapt to significant appearance variations, and multiple motion models to cover motion variations, which makes their algorithm robust to complicated appearance changes. He et al. [4] represent a target by a locality sensitive histogram, which is robust to drastic illumination variations. Wang et al. [5] propose an affine hull based regularized target representation for visual tracking.
Recently, generative tracking algorithms based on sparse representation techniques [6] have been developed [7]-[10]. The L1 tracking algorithm [7] represents each target candidate as a sparse combination of target templates and trivial templates; the trivial templates make the L1 tracker robust to partial occlusion. In [9], the local patches in a target candidate are sparsely represented by the corresponding patches in the dictionary templates. Based on both holistic templates and local representations, Zhong et al. [8] propose a sparsity-based collaborative appearance model.
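To make the sparse-combination idea concrete, the following is a minimal sketch (our own illustration, not the actual implementation of [7]) that codes a candidate over a small dictionary of templates with an L1 penalty, solved by plain iterative soft thresholding (ISTA); the trivial templates of the L1 tracker are omitted and all names and values are illustrative.

```python
import numpy as np

def sparse_code(y, D, lam=0.05, n_iter=200):
    """Sparse representation of a candidate y over dictionary D:
    min_c 0.5*||y - D c||^2 + lam*||c||_1, solved by ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1/Lipschitz constant
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ c - y)              # gradient of the quadratic term
        c = c - step * grad                   # gradient descent step
        c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)  # soft threshold
    return c

# toy example: three template columns; the candidate equals template 1,
# so the code should concentrate on that column
D = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 0.5]])
y = D[:, 1]
c = sparse_code(y, D)
```

In a tracker, `y` and the columns of `D` would be vectorized image patches, and the reconstruction residual of each candidate would feed its observation likelihood.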
Unlike generative tracking algorithms, discriminative track-
ing algorithms formulate visual tracking as a binary classi-
fication problem, in which a classifier is learnt and used to
distinguish a target from its surrounding background. Avidan
[11] proposes an ensemble tracker by combining a set of weak
classifiers into a strong classifier. In [12], the discriminative
features are updated by an online boosting algorithm. Babenko
et al. [13] propose a discriminative tracking algorithm by introducing multiple instance learning to update the classifier. Bai et al. [14] propose a randomized ensemble tracking
algorithm by combining a set of weak classifiers with a weight
vector that is considered as a distribution of confidence. Hare et
al. [16] introduce a structured output SVM learning technique
and propose a tracking-by-detection algorithm.
For generative tracking algorithms, developing a robust appearance model is a crucial issue. Inspired by affine hull representation based face recognition [18] and sparse representation based visual tracking, we propose a novel visual tracking algorithm (referred to as SAHT), in which a target candidate is represented by a sparse affine combination of a set of dictionary templates. The proposed sparse affine hull based target representation has the advantages of both the affine hull (i.e., covering unknown target appearances that do not appear in the dictionary templates) and sparse representation (i.e., robustness to outliers).
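As a rough numerical illustration of a sparse affine combination (our own sketch, not the optimizer used in this paper), the affine constraint that the coefficients sum to one can be folded into the least-squares term as a quadratic penalty, after which the problem is again solvable by ISTA; the dictionary and the `lam` and `mu` values below are toy assumptions.

```python
import numpy as np

def sparse_affine_code(y, D, lam=0.05, mu=10.0, n_iter=500):
    """Sparse affine combination of templates:
    min_c 0.5*||y - D c||^2 + lam*||c||_1  s.t.  sum(c) = 1,
    with the affine constraint enforced softly via the penalty
    mu*(1'c - 1)^2, absorbed into an augmented least-squares system."""
    n = D.shape[1]
    D_aug = np.vstack([D, np.sqrt(2 * mu) * np.ones((1, n))])
    y_aug = np.append(y, np.sqrt(2 * mu))
    step = 1.0 / (np.linalg.norm(D_aug, 2) ** 2)
    c = np.full(n, 1.0 / n)                   # start at the hull centroid
    for _ in range(n_iter):
        grad = D_aug.T @ (D_aug @ c - y_aug)
        c = c - step * grad
        c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)
    return c

# toy dictionary of three orthonormal "templates"; the candidate lies
# inside the affine hull of templates 0 and 1 but equals neither,
# illustrating how the affine hull covers unseen appearances
D = np.eye(3)
y = np.array([0.6, 0.4, 0.0])
c = sparse_affine_code(y, D)
```

The recovered coefficients are sparse (template 2 unused), sum approximately to one, and reconstruct an appearance that is not itself in the dictionary.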
The remainder of this paper is organized as follows. Section
II presents the proposed visual tracking algorithm. Section
III evaluates experimental results of the proposed algorithm
against the state-of-the-art algorithms on challenging video
sequences. Section IV concludes the paper.
II. THE PROPOSED TRACKING ALGORITHM
In this section, under the particle filter framework, we
propose a novel target appearance model. A target candidate is
represented by sparse affine combinations of a set of dictionary
templates. To adapt to dynamic scene variations and keep the dictionary templates effective, the templates are dynamically updated.
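One bootstrap particle filter step of the kind this framework relies on might be sketched as follows; this is illustrative only, with `observe_residual` standing in for the reconstruction residual of the appearance model and all parameters assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observe_residual,
                         sigma_motion=2.0, sigma_obs=1.0):
    """One bootstrap particle filter step over candidate target states.
    observe_residual(state) stands in for the appearance model's
    reconstruction residual for the candidate cut out at `state`."""
    n = len(particles)
    # 1) resample according to the previous weights
    idx = rng.choice(n, size=n, p=weights)
    particles = particles[idx]
    # 2) propagate with a Gaussian motion model
    particles = particles + rng.normal(0.0, sigma_motion, size=particles.shape)
    # 3) reweight: smaller reconstruction residual -> larger weight
    r = np.array([observe_residual(p) for p in particles])
    weights = np.exp(-r ** 2 / (2 * sigma_obs ** 2))
    weights /= weights.sum()
    return particles, weights

# toy demo: 2-D target at (10, 20); residual = distance to it
true_pos = np.array([10.0, 20.0])
residual = lambda s: np.linalg.norm(s - true_pos)
particles = rng.normal(0.0, 5.0, size=(500, 2)) + true_pos
weights = np.full(500, 1.0 / 500)
for _ in range(10):
    particles, weights = particle_filter_step(particles, weights, residual)
estimate = (weights[:, None] * particles).sum(axis=0)
```

In an actual tracker the state would carry affine motion parameters rather than a 2-D position, and the weighted mean (or the highest-weight particle) gives the tracking result for the frame.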
2016 6th International Conference on Digital Home
978-1-5090-4400-9/16 $31.00 © 2016 IEEE
DOI 10.1109/ICDH.2016.24