Deformable Part Model based Multiple Pedestrian Detection for
Video Surveillance in Crowded Scenes
Lu Wang, Xiaoli Ji, Qingxu Deng and Mingxing Jia
College of Information Science and Engineering, Northeastern University, Shenyang, China
{wanglu, jixiaoli, dengqingxu, jiamingxing}@ise.neu.edu.cn
Keywords: Deformable Part-based Model, Multiple Pedestrian Detection, Crowd Detection, Video Surveillance.
Abstract: Pedestrian detection is a challenging task for video surveillance, and it becomes more difficult when
occlusion is prevalent. In this paper, we extend a deformable part-based pedestrian detector to pedestrian
detection in crowded scenes by considering both body part detection responses and the mutual spatial
relationships among detections. Specifically, we first decompose the full body detector into several body part
detectors, whose detection responses can be computed efficiently from the response of the full body detector.
Then, given the detection responses of the body part detectors, hypotheses are nominated by considering both
detection scores and the mutual spatial relationships among responses. Finally, a local optimization process
is applied to make the final decision, where an objective function that encourages detections with high
confidence, high discriminability and low conflict with other detections is proposed to select the best
candidate detections. Experimental results show the effectiveness of the proposed approach.
1 INTRODUCTION
Pedestrian detection is a very important task for
video surveillance. It is difficult due to pose
articulations, appearance variations, low
figure-ground contrast, etc. Recently, significant
advances have been made in detecting well separated
individual pedestrians by training detectors with
statistical machine learning methods and running
them on a detection window that slides over image
positions and across scale levels (Dollar, 2012).
However, when applied to the detection of crowds,
the performance of these detectors degrades
significantly due to the ambiguous appearance
caused by heavy occlusions.
The deformable part-based model (DPM) trained
using a latent support vector machine (Felzenszwalb,
2010) has proven to be one of the most powerful
object detectors. It runs detection on individual
parts and then sums up the responses to form the
final detection score. DPM is well suited to crowd
detection because parts can be flexibly removed
from and added to the model to deal with occlusion.
Several works apply DPM to deal with occlusion
(Ouyang, 2012; Shu, 2012; Yan, 2012). However,
(Ouyang, 2012) and (Shu, 2012) focus on improving
the responses in a detection window without
considering the detection responses of neighboring
windows; only (Yan, 2012) determines the visibility
of parts by simultaneously considering appearance
and mutual spatial relationship. Therefore, the aim
of this work is to adapt a DPM based full body
pedestrian detector to crowd detection in
surveillance scenarios by considering both body
part detection responses and the mutual spatial
relationships among detections.
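
To make the additive structure of the DPM score concrete, the following is a minimal sketch of how a window score can be formed from a root response and a subset of part responses. It follows the general form of the score in (Felzenszwalb, 2010), i.e. filter responses minus deformation costs plus a bias. The function name dpm_score, the visible mask and all numeric values are hypothetical and serve only as an illustration; this is not the part decomposition proposed in this paper.

```python
import numpy as np

def dpm_score(root_response, part_responses, deformation_costs, bias, visible=None):
    """Score one detection window as
    root + sum over visible parts of (response - deformation cost) + bias.

    `visible` is a hypothetical boolean mask; parts marked False are
    simply dropped from the sum, mimicking removal of occluded parts.
    """
    part_responses = np.asarray(part_responses, dtype=float)
    deformation_costs = np.asarray(deformation_costs, dtype=float)
    if visible is None:
        visible = np.ones(part_responses.shape, dtype=bool)
    visible = np.asarray(visible, dtype=bool)
    part_terms = part_responses - deformation_costs
    return float(root_response + part_terms[visible].sum() + bias)

# Made-up responses for four parts ordered from head to feet.
# The two lower parts respond poorly, as if they were occluded.
responses = [1.0, 0.8, -0.6, -0.9]
costs = [0.1, 0.1, 0.2, 0.2]

full_body = dpm_score(0.5, responses, costs, bias=-1.0)
upper_body = dpm_score(0.5, responses, costs, bias=-1.0,
                       visible=[True, True, False, False])
print(full_body, upper_body)
```

With these made-up numbers the full body score is pulled down by the two weak lower parts, while the score restricted to the upper parts stays positive. In practice a part-subset detector would also need its own bias so that scores from different subsets are comparable.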
In this paper, we assume that the camera looks down
onto a ground plane and that no camera parameters
are known. Specifically, we first propose to
decompose the original whole body detector trained
on the INRIA pedestrian dataset into several body
part detectors, whose responses are computed
efficiently. The bias term for each part detector
is estimated from the training data so that the
same threshold can be used to select responses from
different body part detectors. Then, given the
detection responses of the body part detectors,
hypotheses that may correspond to genuine
pedestrians are nominated by considering both
detection scores and the mutual spatial
relationships among responses. Finally, a local
optimization process is applied to make the final
decision, where an objective function that
encourages detections with high confidence, high
discriminability and low conflict with other
detections is proposed to select the best
detections from the mutually overlapped hypotheses.
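
As a rough illustration of the final selection step, the sketch below greedily picks hypotheses by trading confidence and discriminability against conflict with the detections already selected. The exact objective function is not given in this section, so everything here is an assumption made for illustration: the additive form of the gain, the use of box overlap (intersection over union) as the conflict measure, the greedy search, the conflict_weight parameter and the example values.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_detections(hyps, conflict_weight=2.0):
    """Greedily select hypothesis indices (assumed objective, not the paper's).

    Each hypothesis is a dict with 'box', 'confidence' and
    'discriminability'. The gain of adding a hypothesis is
    confidence + discriminability - conflict_weight * (overlap with
    already selected detections). Selection stops when no gain is positive.
    """
    selected = []
    remaining = list(range(len(hyps)))

    def gain(i):
        h = hyps[i]
        conflict = sum(iou(h["box"], hyps[j]["box"]) for j in selected)
        return h["confidence"] + h["discriminability"] - conflict_weight * conflict

    while remaining:
        best = max(remaining, key=gain)
        if gain(best) <= 0:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Made-up hypotheses: the second box heavily overlaps the first.
hyps = [
    {"box": (0, 0, 10, 20), "confidence": 1.0, "discriminability": 0.2},
    {"box": (1, 0, 11, 20), "confidence": 0.8, "discriminability": 0.1},
    {"box": (30, 0, 40, 20), "confidence": 0.5, "discriminability": 0.1},
]
chosen = select_detections(hyps)
print(chosen)
```

In this toy example the second hypothesis is rejected because its overlap with the first outweighs its own score, while the isolated third hypothesis is kept despite its lower confidence.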