HEAD DETECTION BASED ON CONVOLUTIONAL NEURAL NETWORK
WITH MULTI-STAGE WEIGHTED FEATURE
Ting Rui¹, Jian-chao Fei¹, Peng Cui², You Zhou³, Hu-sheng Fang¹
(1. College of Field Engineering, PLA Univ. of Sci. & Tech., Nanjing 210007, China; 2. Department of
Computer Science and Technology, Tsinghua University, Beijing 100084, China; 3. Jiangsu
Institute of Commerce, Nanjing 210007, China)
ABSTRACT
Human head detection is an important means of
pedestrian detection and counting. To date, head detection has
mainly relied on outline, color and template cues, which yield
low recognition rates and poor error tolerance. Recently, deep
learning has become a research hotspot in the field of pattern
recognition. As one deep learning model, the convolutional
neural network (CNN) performs well in image recognition and
speech analysis. In this paper, a new CNN-based method is
proposed. The method introduces a few new twists, such as
multi-stage weighted features and connections that skip layers,
to integrate global shape information and local motif
information. The experimental results show that the proposed
method achieves higher head detection accuracy than
traditional methods.
Index Terms— human head detection, deep learning,
multi-stage feature, convolutional neural network
1. INTRODUCTION
Compared with other body parts, the human head presents
abundant and stable features in an image. Because a head in
video is seldom occluded by other people, it also offers a
convenient target for tracking analysis [1]. On this basis,
pedestrian flow can be obtained accurately by analyzing head
movement. However, complex backgrounds and variations in
viewing angle and camera position make head detection a
challenging task. Existing state-of-the-art methods use
hand-crafted features such as HOG, LBP and their variations
and combinations, followed by a trainable classifier such as
an SVM [2-4].
As one deep learning model, the CNN has a weight-sharing
structure. This structure resembles a biological neural
network and reduces the number of parameters. Furthermore,
the CNN has a clear advantage in image analysis: it extracts
features automatically, and its convolution and subsampling
layers provide a degree of invariance to translation, scaling
and distortion [5-8].
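To make the parameter saving from weight sharing concrete, the short sketch below compares the number of trainable parameters in a fully connected layer with that of a convolutional layer. The layer sizes and the function names `dense_params` and `conv_params` are hypothetical, chosen only for illustration; they are not taken from the model proposed in this paper.

```python
def dense_params(in_h, in_w, out_units):
    """Weights plus biases when every output unit sees every input pixel."""
    return in_h * in_w * out_units + out_units


def conv_params(kernel_h, kernel_w, in_maps, out_maps):
    """Weights plus biases when each output map reuses one kernel per input map."""
    return kernel_h * kernel_w * in_maps * out_maps + out_maps


# Hypothetical sizes (assumption): a 32x32 grayscale input,
# 100 fully connected units versus six 5x5 feature maps.
dense = dense_params(32, 32, 100)
conv = conv_params(5, 5, 1, 6)
print("fully connected:", dense)
print("convolutional:  ", conv)
```

Under these assumed sizes the convolutional layer needs several hundred times fewer parameters, because its kernel weights do not grow with the image size.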
In this paper, we propose a CNN model with multi-stage
weighted features for head detection. The experiments that
follow show that this model improves detection accuracy.
2. THE STRUCTURE AND CHARACTERISTICS OF
CNN
A CNN learns and extracts features from data automatically
and has been applied successfully to pattern classification,
object detection and object recognition [9-10]. Its
generalization ability is also superior to that of traditional
methods. It can be treated as a supervised multi-layer network
in which convolutional layers and subsampling layers
alternate. The input image passes through the convolution and
subsampling layers, and the output layer gives the result.
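The alternation of convolution and subsampling described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the network used in this paper: the 6x6 input, the two 3x3 kernels, the choice of average subsampling, and the names `convolve_valid`, `subsample` and `forward` are all hypothetical, and bias terms and nonlinearities are omitted.

```python
def convolve_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1).

    As in most CNN code, the kernel is not flipped, so this is strictly
    a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def subsample(feature_map, size=2):
    """Average non-overlapping size x size blocks of a feature map."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            sum(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size)) / (size * size)
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def forward(image, kernels):
    """One convolution stage followed by one subsampling stage.

    Each kernel produces its own feature map.
    """
    return [subsample(convolve_valid(image, k)) for k in kernels]


# Hypothetical input and hand-picked kernels, for illustration only.
image = [[float(c) for c in range(6)] for _ in range(6)]
kernels = [
    [[-1.0, 0.0, 1.0]] * 3,                      # left-to-right change
    [[1.0] * 3, [0.0] * 3, [-1.0] * 3],          # top-to-bottom change
]
maps = forward(image, kernels)
print(len(maps), "feature maps of size",
      len(maps[0]), "x", len(maps[0][0]))
```

With these sizes, each 6x6 input becomes a 4x4 map after convolution and a 2x2 map after subsampling; a deeper network would repeat the same pair of stages on the resulting maps.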
In a convolution layer, every feature map captures one kind
of feature and shares a single set of parameters, while
different feature maps use different sets of parameters in
order to extract different features. During the training
stage, the CNN adjusts these parameters to drive training
toward the optimum. The principle of convolution is shown in
Figure 1 and the general formula is given in Equation 1.