Pattern Recognition Letters 56 (2015) 52–59
Robust visual tracking based on product sparse coding✩
Huang Hong-tu∗, Bi Du-yan, Zha Yu-fei, Ma Shi-ping, Gao Shan, Liu Chang
Aeronautics and Astronautics Engineering College, Air Force Engineering University, Xi’an 710038, China
article info
Article history:
Received 28 September 2014
Available online 16 February 2015
Keywords:
Visual tracking
Product sparse coding
L1-norm minimization
Ridge regression
Support vector machine
abstract
In this paper, we propose a sparse coding tracking algorithm based on the Cartesian product of two
sub-codebooks. The original sparse coding problem is decomposed into two sub sparse coding problems,
so the dimension of the sparse representation is greatly enlarged at a lower computational cost.
Furthermore, to reduce the number of L1-norm minimizations, ridge regression is employed to exclude
a substantial number of outlying particles according to their reconstruction error. Finally, the
high-dimensional sparse representation is fed into the classifier, and the candidate with the maximal
response is taken as the target. Both qualitative and quantitative evaluations on challenging
benchmark image sequences demonstrate that the proposed tracking algorithm performs favorably
against several state-of-the-art algorithms.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
Visual tracking has long played a critical role in numerous
applications such as surveillance, military reconnaissance, motion
recognition and traffic monitoring, to name a few [1]. While much
progress has been made in recent decades, it remains
challenging in many scenarios, including pose variation, illumination
change, partial occlusion, motion blur and background clutter.
In the past few years, variations and extensions of L1-norm
minimization have been applied to many computer vision tasks, including
face recognition, image super-resolution, denoising, inpainting and
image classification [2]. Inspired by the success of sparse
representation in face recognition [3], many researchers have developed
robust visual tracking frameworks by casting tracking as a sparse
approximation problem on the codebook [4]. A thorough review can be
found in [5]. Sparse coding visual tracking algorithms can be classified
into two categories: generative models and discriminative models. Both
require obtaining the sparse representation first. Computing the sparse
representation is an L1-norm minimization problem, which can be solved
by the homotopy method, gradient projection method, iterative
shrinkage-thresholding method, interior-point method and
so on [6]. Sparse coding is known to be a competitive method given
sufficiently large codebooks [7]. However, sparse coding is
computationally expensive, and the computational cost increases sharply
with the size of the codebook. Its power is therefore mostly limited by
the size of the codebook in practice, especially for discriminative
sparse coding tracking algorithms, and many researchers have to make a
trade-off between speed and discriminative ability. Given an acceptable
computational cost, how to enlarge the codebook to improve the
discriminative power is a problem that urgently needs to be solved.
✩ This paper has been recommended for acceptance by A. Fernandez-Caballero.
∗ Corresponding author. Tel.: +86 29 84787724; fax: +86 29 84787724.
E-mail address: huanghongtu@sina.cn (H. Hong-tu).
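For orientation, the L1-norm minimization problem referred to in the introduction is commonly written in the following form, shown together with one iterative shrinkage-thresholding step. The notation (x, D, alpha, lambda, L) is assumed here for illustration and is not taken from the paper.

```latex
% Assumed notation: x \in R^d is a candidate, D \in R^{d \times K} is the codebook,
% \alpha \in R^K is the sparse code, and \lambda > 0 weights the sparsity term.
\[
  \min_{\alpha} \; \tfrac{1}{2}\,\lVert x - D\alpha \rVert_2^2 + \lambda \lVert \alpha \rVert_1
\]
% One iterative shrinkage-thresholding step, for any constant L \ge \lVert D \rVert_2^2:
\[
  \alpha^{(t+1)} = \mathcal{S}_{\lambda/L}\!\left(\alpha^{(t)} - \tfrac{1}{L}\, D^{\top}\bigl(D\alpha^{(t)} - x\bigr)\right),
  \qquad
  \bigl[\mathcal{S}_{\tau}(z)\bigr]_i = \operatorname{sign}(z_i)\,\max\bigl(\lvert z_i \rvert - \tau,\, 0\bigr)
\]
```

Each such step is dominated by two matrix-vector products with D, so its cost is O(dK) and grows with the number of codewords K, which is one way to see why large codebooks are expensive.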
In this paper we propose a robust product sparse coding tracking
algorithm. The codebook size is increased in a product manner, at a
lower computational cost than operating directly on the Cartesian
product of two sub-codebooks [7]. The original sparse coding problem is
decomposed into two sub sparse coding problems: each codeword in the
codebook is divided into two equal parts, so the sparse representations
of the candidate on the two sub-codebooks can be obtained
simultaneously. The final sparse representation is then calculated via
the product of the two obtained sparse coding coefficients. Finally,
the high-dimensional sparse representation is input into the SVM
classifier, and the candidate with the maximal score is regarded as the
target. To reduce the number of L1-norm minimizations, ridge regression
is adopted to exclude the candidates with large reconstruction error at
a lower computational cost. Tracking is carried out in the Bayesian
state inference framework, in which a particle filter is used for
propagating sample distributions over time. Numerous experiments on
various challenging sequences show that the proposed algorithm performs
favorably against state-of-the-art methods, and that the tracker based
on product sparse coding is superior to the original sparse coding
tracker under the same conditions.
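The pipeline outlined above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the "product" of the two coefficient vectors is their flattened outer product, uses a plain iterative shrinkage-thresholding loop as the L1 solver, and takes a closed-form ridge solution for the pre-filter. All function names, parameter values, and the random data are hypothetical.

```python
import numpy as np

def ista(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 (assumed solver)."""
    L = np.linalg.norm(D, 2) ** 2          # squared spectral norm bounds the gradient's Lipschitz constant
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L      # gradient step on the quadratic term
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding
    return a

def ridge_prefilter(D, candidates, lam=0.01, keep=10):
    """Indices of the `keep` candidates (columns) with the smallest ridge reconstruction error."""
    K = D.shape[1]
    P = np.linalg.solve(D.T @ D + lam * np.eye(K), D.T)  # ridge projection, shape (K, d)
    coeffs = P @ candidates                               # ridge coefficients, shape (K, n)
    errors = np.sum((candidates - D @ coeffs) ** 2, axis=0)
    return np.argsort(errors)[:keep]

def product_sparse_code(D, x, lam=0.1):
    """Code each half of x on the matching half of every codeword, then combine."""
    half = D.shape[0] // 2
    a1 = ista(D[:half], x[:half], lam)     # code on the first sub-codebook
    a2 = ista(D[half:], x[half:], lam)     # code on the second sub-codebook
    return np.outer(a1, a2).ravel()        # K*K-dimensional feature (assumed form of the product)

# Hypothetical data: 20 unit-norm codewords of dimension 32, 50 candidate particles.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 20))
D /= np.linalg.norm(D, axis=0)
candidates = rng.standard_normal((32, 50))
candidates[:, 7] = D[:, 3]                 # one candidate the codebook reconstructs almost exactly

kept = ridge_prefilter(D, candidates, keep=5)
features = np.stack([product_sparse_code(D, candidates[:, i]) for i in kept])
print(features.shape)                      # (5, 400)
```

Only the five surviving candidates go through the two L1 solves, each on a sub-codebook of 20 half-length codewords, yet every feature vector has 20 x 20 = 400 entries. In a full tracker these features would then be scored by a classifier such as the SVM mentioned in the text.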
The rest of the paper is organized as follows. In Section 2, we begin
by summarizing the related work on sparse coding tracking. In
Section 3, we give the details of the sparse representation based on
product sparse coding. Section 4 covers the initialization and
generalization analysis of the SVM classifier used in our paper. The
integration of our proposed model into the particle filter framework for
tracking is described in Section 5. Qualitative and quantitative evaluations of our
http://dx.doi.org/10.1016/j.patrec.2015.01.014
0167-8655/© 2015 Elsevier B.V. All rights reserved.