stage, all samples are clearly labeled for the $L$ classes. Note that the test sample can represent a vehicle or a nonvehicle target. For each target category, there are $N$ feature types used for image representation.
Each feature type is extracted from the image sample and then converted into a vector. Let $I$ be a sample image from one target category. Let $\mathbf{f} \in \mathbb{R}^{m \times 1}$ be the $i$'th ($i = 1, 2, \ldots, N$) feature type extracted from $I$, where $m$ is the dimension of the feature. Let $\mathbf{D} \in \mathbb{R}^{m \times n_i}$ be a training sample set from the $j$'th ($j = 1, 2, \ldots, L$) target category, where $n_i$ is the number of training samples of the $i$'th feature type. Each entry $\mathbf{d}$ ($\mathbf{d} \in \mathbb{R}^{m \times 1}$) in $\mathbf{D}$ has the same feature type as $\mathbf{f}$. Theoretically, $\mathbf{f}$ can be well approximated by a linear combination of the training sample set,$^{14}$ i.e.,
$$\mathbf{f}_{m \times 1} = \psi_{i1}\mathbf{d}_{i1} + \psi_{i2}\mathbf{d}_{i2} + \cdots + \psi_{in_i}\mathbf{d}_{in_i}, \tag{1}$$
where $\psi_{ik}$ ($k = 1, 2, \ldots, n_i$) denotes the weighted coefficient.
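For concreteness, Eq. (1) is simply a matrix-vector product over the training samples of one feature type. A minimal numerical sketch follows; the dimensions and the random data are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Sketch of Eq. (1): a feature vector f of the i'th type is modeled as a
# weighted combination of training samples of the same feature type.
# All values below are synthetic placeholders.
rng = np.random.default_rng(0)

m, n_i = 64, 10                      # feature dimension, number of i'th-type samples
D_i = rng.standard_normal((m, n_i))  # columns d_i1, ..., d_in_i
psi = rng.standard_normal(n_i)       # weighted coefficients psi_i1, ..., psi_in_i

f = D_i @ psi                        # Eq. (1) written as f = D_i psi
```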
Considering all the training samples for $N$ feature types, the matrix $\mathbf{D} \in \mathbb{R}^{m \times J}$ can be written as follows:
$$\mathbf{D} = [\mathbf{D}_{m \times n_1}, \mathbf{D}_{m \times n_2}, \ldots, \mathbf{D}_{m \times n_N}], \tag{2}$$
where $\mathbf{D}$ is referred to as the dictionary matrix for the $j$'th target category, $J = \sum_{i=1}^{N} n_i$ denotes the total number of training samples over all feature types, and usually $m < J$. Note that in Eq. (2), we assume that all feature types have the same dimension. Thus, Eq. (1) can be rewritten as follows:
$$[\mathbf{f}]_{m \times 1} = [\mathbf{D}]_{m \times J}[\mathbf{s}]_{J \times 1}, \tag{3}$$
where

$$\mathbf{s} = [0, 0, \ldots, 0, \psi_{i1}, \psi_{i2}, \ldots, \psi_{in_i}, 0, 0, \ldots, 0]^{T}, \tag{4}$$
is a $J \times 1$ vector composed of the weighted coefficients. Theoretically, $\mathbf{s}$ is a sparse vector whose entries are all zero except those associated with the $i$'th feature type (i.e., $\psi_{i1}, \psi_{i2}, \ldots, \psi_{in_i}$). Thus, the numbers of nonzero and zero entries in $\mathbf{s}$ are $n_i$ and $J - n_i$, respectively. Specifically, for one target category, if $N$ feature types are extracted from each sample, the number of samples of each feature type is the same, i.e., $n_i = n_j$ ($i, j = 1, 2, \ldots, N$, and $i \neq j$). This means that the dimension of $\mathbf{s}$ is $J = N \times n_i$. Based on Eq. (4), the proportion of nonzero entries in $\mathbf{s}$ is $1/N$ [i.e., $n_i/J = n_i/(N \times n_i)$], which meets the requirements of the sparsity criterion.
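To make the block structure of Eqs. (2) to (4) concrete, the following sketch stacks $N$ synthetic sub-dictionaries into $\mathbf{D}$, builds a coefficient vector that is nonzero only on the block of the $i$'th feature type, and checks the $1/N$ proportion of nonzero entries. All sizes and data are illustrative assumptions, not values from the paper, chosen so that $m < J$ as the text requires:

```python
import numpy as np

# Sketch of Eqs. (2)-(4): stack per-feature-type sub-dictionaries into D,
# and build the coefficient vector s that is nonzero only for one block.
# All values below are synthetic placeholders.
rng = np.random.default_rng(1)

m, N, n_i = 32, 4, 10                # same n_i for every feature type; m < J
blocks = [rng.standard_normal((m, n_i)) for _ in range(N)]
D = np.hstack(blocks)                # Eq. (2): D is m x J, with J = N * n_i
J = D.shape[1]

i = 2                                # index of the feature type that generated f
s = np.zeros(J)
s[i * n_i:(i + 1) * n_i] = rng.standard_normal(n_i)  # Eq. (4): one nonzero block

f = D @ s                            # Eq. (3)
print(np.count_nonzero(s) / J)       # proportion of nonzero entries: 1/N = 0.25
```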
Having observed $\mathbf{f}$ and knowing the matrix $\mathbf{D}$, the general problem is to recover $\mathbf{s}$. Ideally, the entries in $\mathbf{s}$ are all zero except those associated with the same feature type as $\mathbf{f}$, which means that $\mathbf{s}$ can be well approximated by the best $n_i$-term representation.
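The paper does not prescribe a solver at this point; one standard way to compute an $n_i$-term approximation of $\mathbf{f}$ over the dictionary is a greedy method such as orthogonal matching pursuit. A hedged sketch, reusing the synthetic `D` and `f` from the previous snippet (the choice of algorithm and the scikit-learn call are assumptions for illustration, not the authors' method):

```python
from sklearn.linear_model import OrthogonalMatchingPursuit

# Greedy n_i-term approximation of f over the columns of D.
# D and f come from the previous synthetic sketch (m rows, J columns).
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10, fit_intercept=False)
omp.fit(D, f)
s_hat = omp.coef_      # at most n_i = 10 nonzero entries
```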
As $m < J$, this set of equations is underdetermined and has infinitely many solutions. To overcome this, the sparsest solution is usually sought, which can be done by solving the following $\ell_0$ optimization problem:$^{14}$

$$\arg\min_{\mathbf{s}_0} \|\mathbf{s}_0\|_0 \quad \text{s.t.} \quad r = \|\mathbf{f} - \mathbf{D}\mathbf{s}_0\| < \epsilon, \tag{5}$$
where $r$ denotes the reconstruction residual, $\|\mathbf{s}_0\|_0$ counts the number of nonzero entries in $\mathbf{s}_0$, and $\epsilon$ is a small positive number. However, $\ell_0$ optimization is NP-hard and computationally difficult to solve. Thanks to recent developments in the theory of compressed sensing,$^{15,16}$ if the vector $\mathbf{s}_0$ is sparse enough, optimization based on the $\ell_1$-norm
$$\arg\min_{\mathbf{s}_0} \|\mathbf{s}_0\|_1 \quad \text{s.t.} \quad r = \|\mathbf{f} - \mathbf{D}\mathbf{s}_0\| < \epsilon, \tag{6}$$
can exactly recover the $n_i$-sparse coefficients in $\mathbf{s}_0$, where $\|\mathbf{s}_0\|_1$ denotes the $\ell_1$-norm of $\mathbf{s}_0$ (which sums up the absolute values of all entries in $\mathbf{s}_0$). This is a convex optimization problem that conveniently reduces to a linear program known as basis pursuit,$^{17}$ and it can be solved by the interior point method$^{18}$ or the gradient projection method.$^{19}$
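As an illustration of Eq. (6) in its noiseless limit ($\epsilon \to 0$), basis pursuit can be posed as a linear program by splitting $\mathbf{s}_0$ into its positive and negative parts. The sketch below uses SciPy's general-purpose `linprog` solver rather than the specific interior point or gradient projection methods of Refs. 18 and 19; the helper name `basis_pursuit` and the reuse of the synthetic `D`, `s`, and `f` from the earlier sketch are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Basis pursuit sketch for Eq. (6) with epsilon -> 0:
#     min ||s||_1  s.t.  D s = f.
# Splitting s = u - v with u, v >= 0 turns the l1 objective into a linear
# one, giving a standard-form LP that an off-the-shelf solver can handle.
def basis_pursuit(D, f):
    m, J = D.shape
    c = np.ones(2 * J)                 # sum(u) + sum(v) equals ||s||_1 at optimum
    A_eq = np.hstack([D, -D])          # equality constraint D u - D v = f
    res = linprog(c, A_eq=A_eq, b_eq=f, bounds=(0, None), method="highs")
    u, v = res.x[:J], res.x[J:]
    return u - v                       # recovered coefficient vector s

# With the synthetic D, s, f from the earlier sketch, the block-sparse
# coefficients are recovered up to solver tolerance (when recovery
# conditions hold):
# s_hat = basis_pursuit(D, f)
# print(np.allclose(s_hat, s, atol=1e-6))
```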