FACE DETECTION USING LOCAL SMQT FEATURES AND SPLIT UP SNOW CLASSIFIER
Mikael Nilsson, Jörgen Nordberg, and Ingvar Claesson
Blekinge Institute of Technology
School of Engineering
Box 520, SE-372 25 Ronneby, Sweden
E-mail: mkn@bth.se, jno@bth.se, icl@bth.se
ABSTRACT
The purpose of this paper is threefold: firstly, the local Successive
Mean Quantization Transform features are proposed for illumination
and sensor insensitive operation in object recognition. Secondly, a
split up Sparse Network of Winnows is presented to speed up the
original classifier. Finally, the features and classifier are combined
for the task of frontal face detection. Detection results are presented
for the CMU+MIT and the BioID databases. For this face detector,
the Receiver Operating Characteristic (ROC) curve for the
BioID database yields the best published result. The result for the
CMU+MIT database is comparable to state-of-the-art face detectors.
A Matlab version of the face detection algorithm can be downloaded
from http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=13701&objectType=FILE.
Index Terms— Object detection, Pattern recognition, Lighting,
Image processing
1. INTRODUCTION
Illumination and sensor variation are major concerns in visual object
detection. It is desirable to transform the raw image, which varies with
illumination and sensor, so that the result contains only the structures of
the object. Some techniques previously proposed to reduce this variation
are Histogram Equalization (HE), variants of Local Binary Patterns
(LBP) [1] and the Modified Census Transform (MCT) [2]. HE is
computationally expensive in comparison to LBP and MCT; however,
LBP and MCT are typically restricted to extracting only binary
patterns in a local area. The Successive Mean Quantization
Transform (SMQT) [3] can be viewed as a tunable tradeoff between
the number of quantization levels in the result and the computational
load. In this paper the SMQT is used to extract features from the
local area of an image. Derivations of the sensor and illumination
insensitive properties of the local SMQT features are presented.
Pattern recognition in the context of appearance-based face detection
can be approached in several ways [4, 5]. Techniques proposed for this
task include the Neural Network (NN) [6], probabilistic modelling [7],
cascade of boosted features (AdaBoost) [8], Sparse Network of Winnows
(SNoW) [9], combination of AdaBoost and SNoW [2] and the Support
Vector Machine (SVM) [10]. This paper proposes an extension to the
SNoW classifier, the split up SNoW, for this classification task. The
split up SNoW will utilize the result from the original SNoW classifier
and create a cascade of classifiers to perform a more rapid detection.
It will be shown that the number of splits and the number of weak
classifiers can be arbitrary within the limits of the full classifier.
Further, a stronger classifier will utilize all information gained from
all weaker classifiers.
Face detection is a required first step in face recognition systems.
It also has several applications in areas such as video coding, video
conferencing, crowd surveillance and human-computer interfaces [5].
Here, a framework for face detection is proposed using the illumination
insensitive features gained from the local SMQT features and the rapid
detection achieved by the split up SNoW classifier. A description of
the scanning process and the database collection is presented. The
resulting face detection algorithm is also evaluated on two known
databases, the CMU+MIT database [6] and the BioID database [11].
2. LOCAL SMQT FEATURES
The SMQT performs an automatic structural breakdown of information;
our previous work with the SMQT can be found in [3]. Here, the
transform is applied to local areas in an image to extract illumination
insensitive features. Local areas can be defined in several ways. For
example, a straightforward method is to divide the image into blocks
of a predefined size. Another way is to extract values by interpolating
points on a circle with a given radius around a fixed point [1].
Nevertheless, once the local area is defined it will be a set of pixel
values. Let x be one pixel and D(x) be a set of |D(x)| = D pixels from
a local area in an image. Consider the SMQT transformation of the
local area
$\mathrm{SMQT}_L : D(x) \rightarrow M(x)$ (1)
which yields a new set of values. The resulting values are insensitive
to gain and bias [3]. These properties are desirable with regard to the
formation of the whole intensity image I(x), which is a product of the
reflectance R(x) and the illuminance E(x) [12]. Additionally, the
influence of the camera can be modelled as a gain factor g and a bias
term b [2]. Thus, a model of the image can be described by
$I(x) = g E(x) R(x) + b$. (2)
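The gain and bias insensitivity can be checked numerically. The following Python sketch is a minimal illustration based on our reading of the recursive mean-split construction in [3]; it is not the implementation used in this paper. The function name smqt, the patch values, and the chosen g and b are hypothetical and serve only as an example.

```python
import numpy as np


def smqt(values, levels):
    """Sketch of SMQT_L on a set of values (hypothetical helper).

    At each level the current subset is split at its mean. Values above
    the mean get a 1 bit at that level, and both halves are split again
    until `levels` levels have been processed.
    """
    values = np.asarray(values, dtype=float).ravel()
    out = np.zeros(values.size, dtype=int)

    def split(idx, level):
        # Stop when all levels are done or the subset is empty.
        if level > levels or idx.size == 0:
            return
        upper = values[idx] > values[idx].mean()
        # Level 1 sets the most significant bit, level L the least.
        out[idx[upper]] += 1 << (levels - level)
        split(idx[upper], level + 1)
        split(idx[~upper], level + 1)

    split(np.arange(values.size), 1)
    return out


# Assumed 3x3 local area, flattened; the values are illustrative only.
patch = np.array([52, 55, 61, 59, 79, 61, 76, 61, 62], dtype=float)

# Assumed camera gain (g > 0) and bias, applied to every pixel.
g, b = 3.0, 20.0

print(smqt(patch, 2))
print(smqt(g * patch + b, 2))
```

For any g > 0, the position of each value relative to the mean of its subset is unchanged at every level, so the two printed vectors agree. Integer-valued data and parameters are used here so that the floating-point comparisons against the mean are exact.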
In order to design a robust classifier for object detection the
reflectance should be extracted since it contains the object structure.
In general, the separation of the reflectance and the illuminance is
an ill-posed problem. A common approach to solving this problem
involves assuming that E(x) is spatially smooth. Further, if the
illuminance can be considered to be constant in the chosen local area
then E(x) is given by
$E(x) = E, \; \forall x \in D$. (3)
Given the validity of Eq. 3, the SMQT on the local area will yield
illumination and camera-insensitive features. This implies that all