ADAPTIVE MULTIPLE CUES INTEGRATION FOR
PARTICLE FILTER TRACKING
Hao Zhou¹˒², Yun Gao¹, Guowu Yuan¹ and Rongbin Ji²
¹ School of Information Science and Engineering, Yunnan University
² Kunming Institute of Physics
zhouhao@ynu.edu.cn, gausegao@163.com, yuanguowu@sina.com, jirongbin@gmail.com
Keywords: Object tracking, Particle filtering, Multiple cues, Adaptive fusion
Abstract
Visual object tracking in complex environments is a challenging task in smart surveillance. To enhance robustness, many tracking algorithms based on multi-cue integration have been proposed; however, how the multiple cues should be fused during tracking is still an open issue. This paper integrates multiple cues into a particle filtering framework for robust tracking in situations where no single cue is suitable. A novel quality function is introduced to evaluate the reliability of each cue. With weights corresponding to the cue reliabilities, the combined likelihood is estimated as a weighted average of the per-cue likelihoods. Experiments show that tracking with multiple weighted cues is more reliable than single-cue tracking.
1 Introduction
Video-based moving object tracking is one of the most demanding tasks in computer vision, with applications such as visual surveillance and human-computer interaction. Reliable visual tracking of real-world objects is a challenging problem due to noise, occlusion, clutter and dynamic changes in the scene other than the motion of the objects of interest.
Visual tracking methods can be broadly divided into deterministic and probabilistic methods. Deterministic methods [1][2] may run into trouble when similar objects are present in the background or when complete occlusion occurs. Probabilistic methods view tracking as a state estimation problem under the Bayesian framework; representative examples are the Kalman filter, the particle filter and their derivatives [3][4][5][6]. The Kalman filter, a recursive linear estimator, is a special case of this more general probability density propagation, applying only to Gaussian densities. However, most models encountered in visual tracking are nonlinear, non-Gaussian, multi-modal or some combination of these. Because particle filters make no linear-Gaussian assumption and maintain multiple hypotheses, they have become popular tools for visual tracking; their popularity stems from their simplicity, flexibility, ease of implementation and modelling success over a wide range of challenging applications.
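The predict-update-resample recursion underlying this framework can be illustrated with a minimal sketch. The 1-D random-walk motion model and Gaussian observation likelihood below are illustrative assumptions, not the appearance-based model developed later in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observation,
                         transition_std=1.0, obs_std=1.0):
    """One predict-update-resample cycle of a bootstrap particle filter."""
    n = len(particles)
    # Predict: propagate each particle through a random-walk motion model.
    particles = particles + rng.normal(0.0, transition_std, size=n)
    # Update: reweight each particle by the Gaussian observation likelihood.
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights to avoid degeneracy.
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx], np.full(n, 1.0 / n)

# Track a target drifting along a line from noisy 1-D observations.
particles = rng.normal(0.0, 5.0, size=500)
weights = np.full(500, 1.0 / 500)
for z in [0.5, 1.1, 1.9, 3.2, 4.0]:
    particles, weights = particle_filter_step(particles, weights, z)
estimate = particles.mean()  # posterior mean as the state estimate
```

Because the posterior is represented by a particle set rather than a single Gaussian, the same recursion handles multi-modal and nonlinear cases where the Kalman filter fails.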
Particle filters employ appearance information to establish the observation model during tracking. The major difficulty in real scenarios is the variation of the target's appearance and its background: the appearance of the target tends to change, especially over a long tracking sequence, because of variations in illumination or viewpoint. Many of the proposed algorithms are based on a single feature or modality and are therefore often limited to controlled environments; a single feature is not sufficient to deal with a wide variety of environmental conditions. It has thus been argued in many works that considering multi-modal data improves tracking: it increases robustness by letting complementary observations from different sources work together.
How complementary information is combined to achieve a better result is the key issue in multi-cue object tracking.
In early studies, Birchfield [7] suggested combining colour and gradient cues within a hypothesize-and-test procedure for head tracking. In that work, the final output is the product of the likelihoods provided by each cue, implicitly assuming that all cues always provide equally reliable information about the target object; if one visual cue becomes unreliable, this product can lead to false outcomes. A way of evaluating diverse cues according to their reliabilities is therefore essential to multi-cue integration. In some studies, each cue has an adaptive reliability value associated with it and contributes to the joint result according to that reliability [8][9][10].
Cheng et al. [11] proposed a matching method based on fusing colour, contour and predicted target position to handle target movement and illumination change. It can assign different weights to different visual cues, but it does not show how its threshold is obtained. Liu et al. [12] proposed an adaptive multi-cue integration scheme within a Mean-Shift framework: a novel quality function evaluates the reliability of each cue, the cues are integrated as a weighted sum of probability distributions, and the weights are adapted according to the evaluated reliabilities. The integration procedure for colour and motion proposed in [13] was embedded into a mean-shift tracking process, using a democratic integration strategy for the colour and motion cues and a quality estimation function to evaluate the reliability of each cue.
bility of each cue. Erkut Erdem et al.
[14]
proposed model-
free tracker combines the FragTrack's arbitrary-fragments
based object representation and the concepts of adaptive
multi-cue integration. The proposed method associated a
reliability value to each fragment and dynamically adjust-
ing these reliabilities at each frame with respect to the