3 Incremental Learning for Tracking
We present details of the proposed incremental learning al-
gorithm for object tracking in this section. First we propose
an efficient method that incrementally updates an eigenbasis
as new observations arrive, which is used to learn the ap-
pearance of the target while tracking progresses. Next we
describe our approach for drawing particles in the motion
parameter space and predicting the most likely object loca-
tion with the help of the learned appearance model. Collec-
tively, we show how these two modules work in tandem to
track objects well under varying conditions.
3.1 Incremental Update of Eigenbasis and Mean
The appearance of a target object may change drastically
due to intrinsic and extrinsic factors as discussed earlier.
Therefore, to produce a robust tracker, it is important to
adapt the appearance model online, while tracking, to re-
flect these changes. The appearance model we have chosen,
an eigenbasis, is typically learned off-line from a set of training images $\{I_1, \ldots, I_n\}$ by computing the eigenvectors $U$ of the sample covariance matrix $\frac{1}{n-1}\sum_{i=1}^{n}(I_i - \bar{I})(I_i - \bar{I})^\top$, where $\bar{I} = \frac{1}{n}\sum_{i=1}^{n} I_i$ is the sample mean of the training images. Equivalently, one can obtain $U$ by computing the singular value decomposition $U\Sigma V^\top$ of the centered data matrix $[(I_1 - \bar{I}) \cdots (I_n - \bar{I})]$, whose columns are the training images minus their mean.
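To make this concrete, here is a minimal NumPy sketch of the off-line computation just described; the names (batch_eigenbasis, images, n_basis) are ours, not from the paper.

```python
import numpy as np

def batch_eigenbasis(images, n_basis):
    """Mean and top eigenvectors of a d x n matrix with one image per column."""
    mean = images.mean(axis=1, keepdims=True)   # sample mean I-bar
    centered = images - mean                    # [(I_1 - I-bar) ... (I_n - I-bar)]
    # Thin SVD of the centered data matrix: the columns of U are the
    # eigenvectors of the sample covariance, and sigma**2 / (n - 1) are
    # the corresponding eigenvalues.
    U, sigma, _ = np.linalg.svd(centered, full_matrices=False)
    return mean, U[:, :n_basis], sigma[:n_basis]
```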
Adapting the appearance model to account for novel
views of the target can be thought of as retraining the eigen-
basis with an additional $m$ images $\{I_{n+1}, \ldots, I_{n+m}\}$, for some value of $m$. Naively, this update could be performed by computing the singular value decomposition $U'\Sigma'V'^\top$ of the augmented (centered) data matrix $[(I_1 - \bar{I}') \cdots (I_{n+m} - \bar{I}')]$, where $\bar{I}'$ is the average of all $n + m$ training images.
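In terms of the sketch above, this naive update amounts to keeping every image and recomputing from scratch:

```python
def naive_update(old_images, new_images, n_basis):
    # Retains all n + m images and redoes the full SVD, so storage and
    # computation grow with the length of the sequence.
    augmented = np.hstack([old_images, new_images])
    return batch_eigenbasis(augmented, n_basis)
```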
Unfortunately this approach is unsatisfactory for online
applications, like visual tracking, due to its storage and com-
putational requirements. First, the naive approach uses the
entire set of training images for each update. If an update is
made at each video frame, then the number of images which
must be retained grows linearly with the length of the se-
quence. Second, the cost of computing the mean and singu-
lar value decomposition grows with the number of images,
so the algorithm will run ever slower as time progresses. In-
stead, the requirements of our application dictate that any
algorithm for updating the mean and eigenbasis must have
storage and computational requirements that are constant,
regardless of the number of images seen so far.
Numerous more sophisticated algorithms have been developed to efficiently update an eigenbasis as more data arrive (Golub and Van Loan 1996; Hall et al. 1998; Levy and Lindenbaum 2000; Brand 2002). However, most methods assume the sample mean is fixed when updating the eigenbasis, or equivalently that the data are inherently zero-mean. Neither assumption is appropriate in our application.
An exception is the method by Hall et al. (2002), which
does consider the change of the mean as each new datum
arrives. Although similar to our (independently-developed)
algorithm, it lacks the forgetting factor, which hurts its suit-
ability for tracking, and has a greater computational cost.
(Both of these disadvantages are demonstrated quantita-
tively in Sect. 4.3.) Part of the additional complexity arises because Hall's algorithm is based on the notion of adding eigenspaces, which requires computing the eigenvalue decomposition of each block of new data as it arrives. In this respect our
algorithm is simpler, since it incorporates new data directly,
without the additional step.
Here we extend one of these efficient update proce-
dures—the Sequential Karhunen–Loeve (SKL) algorithm of
Levy and Lindenbaum (2000)—presenting a new incremen-
tal PCA algorithm that correctly updates the eigenbasis as
well as the mean, given one or more additional training data.
Our algorithm, a variation of which was first presented in
Lim et al. (2005), has also been applied to algorithms where
the subspace mean plays an important role. For example, it
can be applied to adaptively update the between-class and
within-class covariance matrices used in Fisher linear dis-
criminant analysis (Lin et al. 2005). We begin with a sum-
mary of the SKL algorithm, then describe our new incre-
mental PCA algorithm, and follow with a discussion of a
forgetting factor which can be used to down-weight the ef-
fect of earlier observations on the PCA model.
Putting aside for the moment the problem of the sample
mean, suppose we have a $d \times n$ data matrix $A = \{I_1, \ldots, I_n\}$ where each column $I_i$ is an observation (a $d$-dimensional image vector in this paper), for which we have already computed the singular value decomposition $A = U \Sigma V^\top$. When a $d \times m$ matrix $B$ of new observations is available, the goal is to efficiently compute the SVD of the concatenation of $A$ and $B$: $[A \; B] = U' \Sigma' V'^\top$. Letting $\tilde{B}$ be the component of $B$ orthogonal to $U$, we can express the concatenation of $A$ and $B$ in a partitioned form as follows:
$$[A \;\; B] = \begin{bmatrix} U & \tilde{B} \end{bmatrix} \begin{bmatrix} \Sigma & U^\top B \\ 0 & \tilde{B}^\top B \end{bmatrix} \begin{bmatrix} V & 0 \\ 0 & I \end{bmatrix}^\top. \tag{1}$$
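The following sketch checks the partitioned form (1) numerically; here $\tilde{B}$ is taken to be an orthonormal basis for the component of $B$ orthogonal to $U$, obtained by a QR decomposition of the residual (one common choice, though the text only requires orthogonality).

```python
rng = np.random.default_rng(0)
d, n, m = 40, 20, 5
A, B = rng.standard_normal((d, n)), rng.standard_normal((d, m))
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)   # A = U Sigma V^T

proj = U.T @ B                              # coefficients of B in span(U)
B_tilde, _ = np.linalg.qr(B - U @ proj)     # orthonormal basis for the residual

k = sigma.size
middle = np.block([[np.diag(sigma), proj],
                   [np.zeros((m, k)), B_tilde.T @ B]])  # the matrix R below
right = np.block([[Vt.T, np.zeros((n, m))],
                  [np.zeros((m, k)), np.eye(m)]])       # [[V, 0], [0, I]]

assert np.allclose(np.hstack([A, B]),
                   np.hstack([U, B_tilde]) @ middle @ right.T)
```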
Let $R = \begin{bmatrix} \Sigma & U^\top B \\ 0 & \tilde{B}^\top B \end{bmatrix}$, which is a square matrix of size $k + m$, where $k$ is the number of singular values in $\Sigma$. The SVD of $R$, $R = \tilde{U} \tilde{\Sigma} \tilde{V}^\top$, can be computed in constant time regardless of $n$, the initial number of data. Now the SVD of $[A \; B]$ can be expressed as
$$[A \;\; B] = \left( \begin{bmatrix} U & \tilde{B} \end{bmatrix} \tilde{U} \right) \tilde{\Sigma} \left( \tilde{V}^\top \begin{bmatrix} V & 0 \\ 0 & I \end{bmatrix}^\top \right).$$
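Collecting these steps, a compact sketch of the resulting update, truncated to the leading n_basis components and omitting the mean update and forgetting factor introduced below:

```python
def skl_update(U, sigma, B, n_basis):
    """Fold a d x m block B of new observations into an existing SVD."""
    proj = U.T @ B
    B_tilde, _ = np.linalg.qr(B - U @ proj)
    k, m = sigma.size, B.shape[1]
    R = np.block([[np.diag(sigma), proj],
                  [np.zeros((m, k)), B_tilde.T @ B]])
    U_R, sigma_R, _ = np.linalg.svd(R)            # small (k+m) x (k+m) SVD
    U_new = np.hstack([U, B_tilde]) @ U_R         # U' = [U  B_tilde] U-tilde
    return U_new[:, :n_basis], sigma_R[:n_basis]  # Sigma' = Sigma-tilde; V' skipped
```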
Since an incremental PCA is only interested in computing $U'$ and $\Sigma'$, $V'$, whose size scales with the number of