COL 9(8), 081002(2011) CHINESE OPTICS LETTERS August 10, 2011
Novel averaging window filter for SIFT in infrared
face recognition
Junfeng Bai¹, Yong Ma²∗, Jing Li², Fan Fan², and Hongyuan Wang¹
¹Department of Electronic and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
²Department of Electronic and Information Engineering, Wuhan National Laboratory for Opto-Electronics, Huazhong University of Science and Technology, Wuhan 430074, China
∗Corresponding author: mayong@hust.edu.cn
Received December 21, 2010; accepted March 24, 2011; posted online June 16, 2011
The extraction of stable local features directly affects the performance of infrared face recognition algorithms. Recent studies on the application of the scale invariant feature transform (SIFT) to infrared face recognition show that a star-styled window filter (SWF) can remove the mismatches introduced by SIFT. This letter proposes an improved filter pattern, the Y-styled window filter (YWF), to further eliminate wrong matches. Compared with SWF, YWF patterns are sparser and do not maintain rotation invariance; thus, they are better suited to infrared face recognition. Our experimental results demonstrate that a YWF-based averaging window outperforms an SWF-based one in reducing wrong matches, thereby improving the reliability of infrared face recognition systems.
OCIS codes: 100.2000, 100.3008, 100.2960.
doi: 10.3788/COL201109.081002.
Infrared human face recognition has become an area of growing interest in the literature[1]. The most representative methods, including elemental shape matching, eigenface, metrics matching, template matching, symmetry waveforms, and face codes[2−4], are introduced from the visible domain. Among these methods, symmetry waveforms and face codes utilize the anatomical structure by analyzing the infrared vascular pattern, while the others extract and match thermal contours[1]. Compared with recognition in visible-spectrum imagery, face recognition in the thermal infrared domain has received relatively little attention in the literature.
For infrared face recognition, there are several successful candidate visual approaches based on invariant feature extraction[5]. Mikolajczyk et al. made the first effort in this area and achieved rotation invariance[6]. Lowe extended this approach and achieved scale invariance[7,8]. Many researchers have reported achievements in affine transformation and rotation invariance, including the scale invariant feature transform (SIFT), independent component analysis, an improved Harris corner detector, and fractal and genetic algorithms[9−14]. Among them, the methods based on scale-space feature extraction, i.e., SIFT[10] and the improved Harris corner detector[14], are the most applicable to infrared human face recognition. Between the two, the features extracted by SIFT are more dispersed in spatial distribution, more stable under occlusion, and relatively large in quantity[15]. Therefore, SIFT is more suitable for infrared features and is taken as our candidate method for investigation.
By introducing SIFT into the infrared domain, several problems in infrared human face recognition, such as subjects wearing glasses and facial rotation, can be solved directly. However, SIFT has an intrinsic defect: it generates mismatches for points surrounded by similar textures. Experiments by Tan et al. showed that most of these mismatches differ in mean brightness[16]. By applying a star-styled averaging window, such mismatches can be removed effectively. However, Tan's study examined only one pattern out of the many possible averaging window filters applicable in this scenario. In this letter, two other candidate filter patterns, namely, the cross-styled window filter (CWF) and the Y-styled window filter (YWF), are proposed and shown to yield better filtering results than the star-styled window filter (SWF).
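To make the filtering idea concrete, the following Python sketch (ours, not code from this letter) keeps a tentative SIFT match only when the mean brightness, sampled along a sparse window centered on each keypoint, agrees between the two images. The Y-shaped offset geometry, the brightness threshold, and the OpenCV-style match and keypoint fields are illustrative assumptions.

    import numpy as np

    # Hypothetical Y-styled pattern: three arms at 90, 210, and 330 degrees.
    # The exact YWF geometry proposed in this letter may differ.
    def y_window_offsets(radius=5):
        offsets = [(0, 0)]
        for angle in (90.0, 210.0, 330.0):
            t = np.deg2rad(angle)
            for r in range(1, radius + 1):
                offsets.append((int(round(r * np.sin(t))),
                                int(round(r * np.cos(t)))))
        return offsets

    def window_mean(img, y, x, offsets):
        # Average the gray levels sampled at the pattern offsets,
        # skipping samples that fall outside the image.
        h, w = img.shape
        vals = [int(img[y + dy, x + dx]) for dy, dx in offsets
                if 0 <= y + dy < h and 0 <= x + dx < w]
        return sum(vals) / len(vals)

    def filter_matches(img1, img2, kp1, kp2, matches, thresh=10.0):
        # Reject matches whose windowed mean brightness differs too much;
        # `matches` is assumed to hold OpenCV cv2.DMatch objects.
        offsets = y_window_offsets()
        kept = []
        for m in matches:
            x1, y1 = (int(round(c)) for c in kp1[m.queryIdx].pt)
            x2, y2 = (int(round(c)) for c in kp2[m.trainIdx].pt)
            d = abs(window_mean(img1, y1, x1, offsets)
                    - window_mean(img2, y2, x2, offsets))
            if d < thresh:
                kept.append(m)
        return kept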
The elegant design of the four SIFT stages enables it to extract distinctive invariant features from an image better than other algorithms. However, the construction of SIFT features is performed entirely in the scale space, and the features of the original image space are not used. Patches with similar local textures will therefore yield similar keypoint descriptors, leading to incorrect matches when used for object recognition. The proof is given below.
Let A and B be similar patches in I(x, y). Let a and b be points at the same relative physical location of A and B, respectively, i.e., a ∈ A, b ∈ B, and a ≈ kb, where k is the gray-level ratio. In addition, let a₀ and b₀ be local extrema in A and B, respectively. Our derivation follows stages similar to those in SIFT.
In the first stage, the two patches are subjected to the difference of Gaussian (DoG) transformation. Since the DoG operator is linear, we have

DoG(a) ≈ DoG(k · b) = k · DoG(b). (1)

Meanwhile, the relative physical locations of a₀ and b₀ are the same because a ≈ kb.
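For completeness, the linearity invoked in Eq. (1) can be written out. Writing G(σ₁) and G(σ₂) for the two Gaussian kernels whose difference defines the DoG operator, and ∗ for convolution (notation introduced here for illustration), the assumption a ≈ kb gives

DoG(a) = [G(σ₁) − G(σ₂)] ∗ a ≈ [G(σ₁) − G(σ₂)] ∗ (k · b) = k · [G(σ₁) − G(σ₂)] ∗ b = k · DoG(b).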
In the second stage, the previous two criteria are checked on a₀ and b₀. Since a₀ ≈ kb₀, it is impossible for one extremum to lie along an edge while the other does not. Assume that a₀ and b₀ are stable; thus, both of them will pass the test of the second stage.
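The edge criterion's indifference to the gray-level ratio can be checked explicitly. SIFT rejects edge responses by thresholding Tr(H)²/Det(H), where H is the 2 × 2 Hessian of the DoG response at the candidate point (this restatement of the standard SIFT edge test is added here for illustration). Since DoG(a) ≈ k · DoG(b) in a neighborhood of the extrema, the Hessians satisfy H(a₀) ≈ k · H(b₀), and the factor k cancels:

Tr(H(a₀))²/Det(H(a₀)) ≈ (k · Tr(H(b₀)))²/(k² · Det(H(b₀))) = Tr(H(b₀))²/Det(H(b₀)).

Hence a₀ and b₀ either both pass or both fail the edge test.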