SMEULDERS ET AL.: VISUAL TRACKING: AN EXPERIMENTAL SURVEY 1447
[IVT] Incremental Visual Tracking: The tracker in [54]
recognizes that in tracking it is important to keep an
extended model of appearances capturing the full range of
appearances of the target in the past. Eigen images of
the target are computed by incremental PCA over the tar-
get’s intensity-value template. They are stored in a leaking
memory to slowly forget old observations. Candidate windows are sampled by Particle Filtering [55] from the motion
model, which is a Gaussian distribution around the previ-
ous position. The confidence of each sample is the distance from the candidate window's intensity features to the target's Eigen-image subspace. The candidate window with the minimum distance is selected.
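The scoring step of IVT can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the subspace, dimensions, and the `crop_features` stand-in for template extraction are all assumed toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def subspace_distance(x, mean, basis):
    """Reconstruction error of feature vector x w.r.t. the Eigen-image
    subspace spanned by the orthonormal columns of `basis` around `mean`."""
    centered = x - mean
    proj = basis @ (basis.T @ centered)   # project onto the subspace
    return float(np.linalg.norm(centered - proj))

# Toy PCA model standing in for the incrementally updated Eigen images.
d, k = 64, 2
mean = rng.normal(size=d)
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))

def crop_features(pos):
    """Hypothetical stand-in for extracting the intensity template at `pos`:
    a subspace point plus noise that grows with distance from the target."""
    target = np.array([10.0, 20.0])
    noise = 0.05 * np.linalg.norm(pos - target)
    return mean + basis @ rng.normal(size=k) + noise * rng.normal(size=d)

# Particle filter step: Gaussian motion model around the previous position.
prev_pos = np.array([10.0, 20.0])
particles = prev_pos + rng.normal(scale=3.0, size=(50, 2))
scores = np.array([subspace_distance(crop_features(p), mean, basis)
                   for p in particles])
best = particles[int(np.argmin(scores))]  # minimum-distance candidate wins
```

In a full tracker the subspace itself would then be updated by incremental PCA with a forgetting factor, which the sketch omits.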
[TAG] Tracking on the Affine Group: The paper [56]
also uses an extended model of appearances. It extends the
traditional {translation, scale, rotation} motion types to
a more general 2-dimensional affine matrix group. The
tracker departs from the extended model of IVT adopt-
ing its appearance model including the incremental PCA
of the target intensity values. The tracker samples all pos-
sible transformations of the target from the affine group
using a Gaussian model.
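The candidate-generation step can be sketched as below. Note the simplification: TAG samples on the affine group itself (via Gaussian perturbations in its Lie algebra), whereas this sketch perturbs the six affine parameters directly; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_affine_candidates(prev, sigma, n):
    """Sample n candidate 2x3 affine warps around `prev` with Gaussian
    noise on the six parameters (a simplification of group sampling)."""
    return prev[None, :, :] + rng.normal(scale=sigma, size=(n, 2, 3))

def warp_point(A, p):
    """Apply a 2x3 affine warp [linear | translation] to a 2-D point."""
    return A[:, :2] @ p + A[:, 2]

identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
candidates = sample_affine_candidates(identity, sigma=0.05, n=100)
```

Each candidate warp would then be scored against the incremental-PCA appearance model exactly as in IVT.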
[TST] Tracking by Sampling Trackers: The paper [45]
observes that the real world varies significantly over time,
requiring the tracker to adapt to the current situation.
Therefore, the method relies on tracking by sampling many
trackers. In this way it maintains an extended model of
trackers. It can be conceived as the extended equivalent
of IVT. Each tracker is made from four components: an
appearance model, a motion model, a state representa-
tion and an observation model. Each component is further
divided into sub-components. The state of the target stores
the center, scale and spatial information, the latter further
subdivided by vertical projection of edges, similar to the
FRT-tracker. Multiple locations and scales are considered.
Sparse incremental PCA with leaking of HSI- and edge-
features captures the state’s appearance past over the last
five frames, similar to IVT. Only the states with the highest
Eigen values are computed. The motion model is composed
of multiple Gaussian distributions. The observation model
consists of Gaussian filter responses of the intensity fea-
tures. Basic trackers are formed from combinations of the
four components. In a new frame, the basic tracker with
the best target state is selected from the space of trackers.
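The combinatorial construction of the tracker space can be sketched as follows; the component names and the fitness function are illustrative assumptions, not the paper's actual sub-components.

```python
import itertools

# Component choices (names are illustrative, not from the paper).
appearance_models = ["inc_pca_hsi", "inc_pca_edge"]
motion_models = ["gaussian_narrow", "gaussian_wide"]
state_representations = ["center_scale", "center_scale_edge_projection"]
observation_models = ["gaussian_filter_intensity"]

# The space of basic trackers: every combination of the four components.
tracker_space = list(itertools.product(
    appearance_models, motion_models, state_representations,
    observation_models))

def evaluate(tracker, frame):
    """Hypothetical per-frame fitness of a basic tracker."""
    return (sum(len(name) for name in tracker) * (frame + 1)) % 7

def select_tracker(frame):
    """Pick the basic tracker with the best target state for this frame."""
    return max(tracker_space, key=lambda t: evaluate(t, frame))
```

In the actual method the space is sampled rather than exhaustively evaluated, but the selection-per-frame structure is the same.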
3.3 Tracking Using Matching with Constraints
Following major successes of sparse representations in the object detection and classification literature, a recent development in tracking reduces the target representation to a sparse representation and performs sparse optimization.
[TMC] Tracking by Monte Carlo sampling: The
method [43] aims to track targets for which the object shape
changes drastically over time by sparse optimization over
patch pairs. Given the target location in the first frame,
the target is modeled by sampling a fixed number of tar-
get patches that are described by edge features and color
histograms. Each patch is then associated with a corre-
sponding background patch sampled outside the object
boundaries. Patches are inserted as nodes in a star-shaped
graph where the edges represent the relative distance to the
center of the target. The best locations of the patches in the
new frame are found by warping each target patch to an
old target patch. Apart from the appearance probability, the
geometric likelihood is based on the difference in location
with the old one. The new target location is found by maxi-
mum a posteriori estimation. TMC has an elaborate update
scheme by adding patches, removing them, shifting them
to other locations, or slowly substituting their appearance
with the current appearance.
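The star-shaped geometric model behind the MAP step can be sketched as below. The appearance term is a hypothetical stand-in (TMC uses edge features and color histograms); the Gaussian geometric term and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Star-shaped graph: each target patch stores its offset to the target
# center; the graph edges encode these relative distances.
patch_offsets = rng.normal(scale=5.0, size=(8, 2))

def geometric_loglik(i, location, center, sigma=2.0):
    """Geometric term: penalize deviation from patch i's stored offset."""
    expected = center + patch_offsets[i]
    return -np.sum((location - expected) ** 2) / (2.0 * sigma ** 2)

def appearance_loglik(i, location):
    """Hypothetical appearance term (edge + color histogram match in TMC)."""
    return -0.01 * float(np.sum(location ** 2))

def center_logposterior(center, locations):
    """Log-posterior of a candidate center given per-patch locations; the
    MAP estimate maximizes this over candidate centers."""
    return sum(appearance_loglik(i, loc) + geometric_loglik(i, loc, center)
               for i, loc in enumerate(locations))
```

Monte Carlo sampling then proposes candidate centers and patch locations and keeps the posterior maximizer.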
[ACT] Adaptive Coupled-layer Tracking: The recent
tracker [57] aims for rapid and significant appearance
changes by sparse optimization in two layers. The tracker constrains changes in the local layer by maintaining a global layer. In each local layer, at the start, patches will receive
uniform weight and be grouped in a regular grid within
the target bounding box. Each patch is represented by a gray-level histogram and its location. For a new frame, the locations of the
patches are predicted by a constant-velocity Kalman-filter
and tuned to its position in the new frame by an affine
transformation. Patches which drift away from the target
are removed. The global layer contains a representation of
appearance, shape and motion. Color HSV-histograms of
target and background assess the appearance likelihood per
pixel. Motion is defined by computing the optical flow of
a set of salient points by KLT. The difference between the
velocity of the points and the velocity of the tracker assesses
the likelihood of the motion per pixel. Finally, the degree
of being inside or outside the convex hull spanned around
the patches gives the likelihood of a pixel. The local layer
uses these three likelihoods to modify the weight of each
patch and to decide whether to remove the patch or not.
Finally, the three likelihoods are combined into an overall
probability for each pixel to belong to the target. The local
layer in ACT is updated by adding and removing patches.
The global layer is slowly updated by the properties of the
stable patches of the local layer.
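The likelihood combination and patch reweighting can be sketched as follows; the product combination and the drop threshold are assumed simplifications of the paper's update rule.

```python
import numpy as np

rng = np.random.default_rng(3)

def combine_likelihoods(appearance, motion, hull):
    """Overall per-pixel target probability as the product of the three
    ACT cues (appearance, motion, convex-hull membership), each in [0, 1].
    The exact combination rule in the paper may differ."""
    return appearance * motion * hull

def update_patch_weights(weights, patch_scores, drop_below=0.05):
    """Reweight local-layer patches by their likelihood, renormalize, and
    drop patches that fall below a threshold (an assumed update rule)."""
    new_w = weights * patch_scores
    total = new_w.sum()
    if total > 0:
        new_w = new_w / total
    keep = new_w >= drop_below
    return new_w[keep], keep

h, w = 4, 4
appearance = rng.uniform(size=(h, w))
motion = rng.uniform(size=(h, w))
hull = rng.uniform(size=(h, w))
pixel_prob = combine_likelihoods(appearance, motion, hull)
```

The surviving (stable) patches would then slowly feed back into the global layer's appearance, shape, and motion model.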
[L1T] L1-minimization Tracker: The tracker [58]
employs sparse optimization by L1 from the past appearance.
It starts using the intensity values in target windows
sampled near the target as the bases for a sparse represen-
tation. Individual, non-target intensity values are used as
alternative bases. Candidate windows in the new frame
are sampled from a Gaussian distribution centered at the
previous target position by Particle Filtering. They are
expressed as a linear combination of these sparse bases
by L1-minimization such that many of the coefficients are
zero. The tracker expands the number of candidates by
also considering affine warps of the current candidates.
The search is applied over all candidate windows, selecting
the new target by the minimum L1-error. The method
concludes with an elaborate target window update scheme.
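The sparse-coding step can be sketched as below. The tracker in [58] uses its own L1 solver; ISTA is substituted here as a standard stand-in, and the dictionary sizes and noise level are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def soft_threshold(x, t):
    """Elementwise shrinkage operator used by ISTA."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def l1_sparse_code(D, y, lam=0.1, iters=1000):
    """Solve min_c 0.5*||D c - y||^2 + lam*||c||_1 by ISTA
    (a stand-in for the tracker's L1-minimization)."""
    L = np.linalg.norm(D, 2) ** 2     # Lipschitz constant of the gradient
    c = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ c - y)
        c = soft_threshold(c - grad / L, lam / L)
    return c

# Bases: target templates sampled near the target, plus one trivial
# (single-pixel) basis per pixel for non-target intensity values.
d, k = 20, 5
templates = rng.normal(size=(d, k))
D = np.hstack([templates, np.eye(d)])
y = templates[:, 0] + 0.01 * rng.normal(size=d)  # candidate window features
c = l1_sparse_code(D, y)
l1_error = np.linalg.norm(D @ c - y)  # candidates ranked by this error
```

Each candidate window would be coded this way, and the window with the minimum reconstruction error becomes the new target.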
[L1O] L1 Tracker with Occlusion detection: Advancing
the sparse optimization by L1, the paper [59] uses L2 least
squares optimization to improve the speed. It also considers
occlusion explicitly. The candidate windows are sorted on
the basis of the reconstruction error in the least squares. The
ones above a threshold are selected for L1-minimization. To
detect occluded pixels, the tracker marks as occluded those pixels whose coefficients on the alternative bases exceed a threshold. When more than 30% of the pixels are
occluded, L1O declares occlusion, which disables the model
updating.
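The occlusion test can be sketched as follows. Only the 30% pixel fraction comes from the text; the coefficient threshold is an assumed value.

```python
import numpy as np

def detect_occlusion(trivial_coeffs, coeff_thresh=0.1, pixel_frac=0.3):
    """Flag pixels whose alternative-basis (trivial-template) coefficient
    exceeds `coeff_thresh`; declare occlusion when more than `pixel_frac`
    (30% in L1O) of the pixels are flagged. `coeff_thresh` is an assumed
    value, not taken from the paper."""
    occluded_pixels = np.abs(trivial_coeffs) > coeff_thresh
    return bool(occluded_pixels.mean() > pixel_frac), occluded_pixels

coeffs = np.zeros(100)
coeffs[:40] = 0.5           # 40% of pixels explained by trivial bases
occluded, mask = detect_occlusion(coeffs)
update_model = not occluded  # occlusion disables model updating
```

Gating the model update this way prevents the occluder's appearance from being absorbed into the target model.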