L1-L2 Norm Target Representation: A Robust Visual Tracking Algorithm

Updated 2024-08-26. PDF, 1.44 MB.

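"L1-L2 norms based target representation for visual tracking"

Visual tracking is a core problem in computer vision: following the trajectory of a specific target across consecutive video frames. Illumination changes, partial occlusion, motion blur, and background clutter make stable, accurate tracking very difficult. Many existing trackers rely on appearance models that represent the target as a linear combination of templates, often fitted by least squares. Such models can fail when the target's appearance changes significantly.

The paper proposes a new target representation to make visual tracking more robust, built on the L1 norm and the L2 norm. A target candidate is no longer represented by a single template. It is expressed as a linear combination of a set of target templates, with the L1 norm used to encode the residual, which helps the model adapt to appearance changes. The L2 norm regularizes the coding coefficients, which guards against overfitting and stabilizes the model.

The L1 norm is known for favoring sparsity. Applied to the residual, it tolerates a small number of large errors, so the representation holds up under partial occlusion. The L2 norm smooths the coefficients, which helps suppress noise and outliers and further improves robustness.

To score how well a candidate matches the target, the authors propose a new likelihood evaluation function based on the reconstruction residual and the coding coefficients. It is designed to help the algorithm cope with the uncertainties that arise during tracking in complex scenes.

In the experiments, the proposed algorithm is compared with other recent algorithms on a set of challenging video sequences, and the reported results favor the proposed tracker in both accuracy and robustness. In short, the work combines the L1 and L2 norms into an appearance model that adapts better to appearance changes, particularly under partial occlusion.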

104 Y. Wang, J. Wang, C. Deng, H. Zhu, and S. Wang

Recently, sparse linear representations have been introduced to target representation. In the L1 algorithm [3], a target candidate is sparsely represented using both target templates and trivial templates. The target templates represent the target appearance, and the trivial templates describe outliers or occlusions. The L1 algorithm is robust to partial occlusions. However, solving the $\ell_1$ minimization problem is time-consuming, which limits real-time tracking performance. Jia et al. [23] propose a structural local sparse appearance model, where a target candidate is sparsely represented using partial information and spatial information via an alignment-pooling method. Taking advantage of generative and discriminative models, Zhong et al. [4] propose a sparsity-based collaborative appearance model based on both holistic templates and local representations. Recently, Zhang et al. [24] propose a structural sparse tracking algorithm that exploits the spatial layout structure among the local patches inside each target candidate. In [25], a target candidate is represented by sparse combinations of particles by exploiting underlying low-rank constraints.
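The role of the trivial templates can be illustrated with a small numerical sketch. This is not the L1 tracker of [3] itself: it drops the non-negativity constraints and the negative trivial templates, uses plain ISTA as the solver, and runs on random synthetic data. The names `sparse_code` and `soft_threshold`, the value of `lam`, and the test signal are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def sparse_code(y, templates, lam=0.01, n_iter=2000):
    """Minimize 0.5 * ||y - B c||_2^2 + lam * ||c||_1 with ISTA.

    B = [templates, I] stacks the target templates and one trivial
    template (a unit vector) per pixel. Simplified, hypothetical setup.
    """
    d, n = templates.shape
    B = np.hstack([templates, np.eye(d)])
    # Step size 1 / L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(B, 2) ** 2
    c = np.zeros(n + d)
    for _ in range(n_iter):
        grad = B.T @ (B @ c - y)
        c = soft_threshold(c - step * grad, step * lam)
    return c[:n], c[n:]  # target coefficients, trivial coefficients

# Synthetic candidate: the first template with four "occluded" pixels.
rng = np.random.default_rng(0)
templates = rng.normal(size=(64, 5))
templates /= np.linalg.norm(templates, axis=0)
y = templates[:, 0].copy()
y[10:14] += 2.0
a, e = sparse_code(y, templates)
occluded = set(np.argsort(-np.abs(e))[:4].tolist())
```

In this toy setting the largest trivial coefficients land on the corrupted pixels, which is the mechanism that makes the representation tolerant of partial occlusion. The inner loop also hints at why the approach is costly when it must be run for every particle in every frame.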

Discriminative tracking algorithms consider visual tracking as a binary classification problem, in which a classifier is learnt to distinguish a target from the surrounding background. Avidan [10] proposes an ensemble tracking algorithm that combines a set of weak classifiers into a strong classifier and computes a confidence value for each pixel. The target is located by a vote confidence map. Bai et al. [11] consider the contribution of confidences as a weight vector and combine a set of weak classifiers into a strong classifier. Babenko et al. [15] introduce the multiple instance learning framework into visual tracking, where positive and negative bags are considered as training samples. Kalal et al. [14] formulate visual tracking in a tracking-learning-detection framework. In [14], a bootstrapping classifier is learnt and used to select potential samples for updating unlabeled data with positive and negative constraints. Hare et al. [12] propose a tracking-by-detection algorithm based on an online structured output support vector machine (SVM). Ning et al. [26] learn a linear structured SVM and an explicit feature map to track objects. In [27, 28, 29], features based on deep convolutional neural networks are learnt.

3. The proposed visual tracking algorithm. In this section, we describe the $\ell_1$-$\ell_2$ norms based target representation and a likelihood evaluation based on the reconstruction residual and the coding coefficient. Based on the target representation and the likelihood evaluation, we outline the proposed tracking algorithm in a particle filter framework [30].

3.1. $\ell_1$-$\ell_2$ norms based target representation. During tracking, $m$ particles (i.e., target candidates) are sampled at the $t$-th frame. The state of a particle is denoted as $x_t^i$, $i = 1, 2, \cdots, m$. The corresponding observation of $x_t^i$ is denoted as $y_t^i$ at frame $t$. The state of the located target at frame $t$ is denoted as $\hat{x}_t$, and the corresponding observation is denoted as $\hat{y}_t$.
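The sampling step can be sketched as follows. The excerpt does not specify the motion model or the state parameterization, so the Gaussian random walk, the three-component state (center x, center y, scale), and the maximum-likelihood selection rule are assumptions made here; `sample_particles` and `locate_target` are hypothetical helper names.

```python
import numpy as np

def sample_particles(prev_state, sigma, m, rng):
    """Draw m candidate states from a Gaussian random walk around prev_state.

    prev_state: state of the target located in the previous frame.
    sigma: per-component standard deviations (assumed motion model).
    """
    prev_state = np.asarray(prev_state, dtype=float)
    noise = rng.normal(size=(m, prev_state.size)) * np.asarray(sigma, dtype=float)
    return prev_state + noise

def locate_target(particles, likelihoods):
    """Pick the particle with the highest likelihood as the located target."""
    return particles[int(np.argmax(likelihoods))]

rng = np.random.default_rng(0)
# Assumed state layout: (center x, center y, scale).
particles = sample_particles([120.0, 80.0, 1.0], sigma=[4.0, 4.0, 0.01], m=300, rng=rng)

# Placeholder likelihoods; in a tracker these come from the appearance model.
likelihoods = rng.random(300)
located = locate_target(particles, likelihoods)
```

Each row of `particles` plays the role of one candidate state. The observation of a candidate would be the image patch cropped at that state, which is omitted here.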

In visual tracking, the observation $y_t^i$ of a target candidate is often represented by a linear combination of target templates

$$y_t^i \approx d_1\alpha_1 + d_2\alpha_2 + \cdots + d_n\alpha_n, \tag{1}$$

where $D = [d_1, d_2, \cdots, d_n]$ is a set of target templates, $\alpha = [\alpha_1, \alpha_2, \cdots, \alpha_n]^\top \in \mathbb{R}^n$ is the corresponding template coefficient vector.

Different from sparse linear representations in [3, 4, 23], in the proposed tracking algorithm, the observation $y_t^i$ of a target candidate is approximated in the form of non-sparse combinations of a set of target templates by solving

$$\hat{\alpha} = \arg\min_{\alpha} \|y_t^i - D\alpha\|_1 + \lambda\|\alpha\|_2^2 \tag{2}$$
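A problem of the kind in (2) can be solved numerically in several ways; the excerpt ends before the paper describes its own solver. The sketch below assumes the objective is the $\ell_1$ norm of the residual plus $\lambda$ times the squared $\ell_2$ norm of the coefficients (the norm indices and exponent are not legible in the extracted text) and uses iteratively reweighted least squares (IRLS), which is a swapped-in technique and not necessarily the authors' method. The function names, `lam`, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def l1_l2_code(y, D, lam=0.1, n_iter=100, eps=1e-6):
    """Approximately minimize ||y - D a||_1 + lam * ||a||_2^2 by IRLS.

    Each |r_j| is majorized by r_j^2 / (2 |r_j_prev|) plus a constant, so
    every iteration reduces to a weighted ridge regression.
    """
    n = D.shape[1]
    # Ridge (least squares + L2 penalty) solution as the starting point.
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    for _ in range(n_iter):
        r = y - D @ alpha
        w = 1.0 / (2.0 * np.maximum(np.abs(r), eps))  # clamp avoids division by zero
        DW = D.T * w                                  # D^T W with W = diag(w)
        alpha = np.linalg.solve(DW @ D + lam * np.eye(n), DW @ y)
    return alpha

def objective(y, D, alpha, lam):
    """Value of the assumed objective ||y - D a||_1 + lam * ||a||_2^2."""
    return np.abs(y - D @ alpha).sum() + lam * alpha @ alpha

# Synthetic candidate: a non-sparse mix of templates with four corrupted pixels.
rng = np.random.default_rng(0)
D = rng.normal(size=(64, 5))
D /= np.linalg.norm(D, axis=0)
alpha_true = np.array([0.6, 0.3, 0.1, 0.0, 0.0])
y = D @ alpha_true
y[10:14] += 2.0

lam = 0.1
alpha_ridge = np.linalg.solve(D.T @ D + lam * np.eye(5), D.T @ y)
alpha_hat = l1_l2_code(y, D, lam=lam)
err_ridge = np.linalg.norm(alpha_ridge - alpha_true)
err_hat = np.linalg.norm(alpha_hat - alpha_true)
```

Comparing `err_hat` with `err_ridge` shows the intended effect on this toy data: the squared-error fit is pulled toward the corrupted pixels, while the $\ell_1$ residual lets a few large errors remain unexplained. Each IRLS step needs only an $n \times n$ linear solve, with $n$ the number of templates.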
