STATISTICAL BACKGROUND SUBTRACTION BASED ON IMBALANCED LEARNING
Xiang Zhang*, Zhi Liu†, Hongsheng Li*, Xu Zhao‡, Ping Zhang*
*School of Electronic Engineering, University of Electronic Science and Technology, China
†School of Communication and Information Engineering, Shanghai University, China
‡School of Electronic Information, Shanghai Jiaotong University, China
uestchero@uestc.edu.cn, liuzhisjtu@163.com, lihongsheng@gmail.com,
zhaoxu@sjtu.edu.cn, pingzh@uestc.edu.cn
ABSTRACT
In this paper, we study the class imbalance problem in statistical
background subtraction. First, we discuss the nature of the
imbalance in background subtraction and conclude that foreground
and background are inherently imbalanced. Second,
following the imbalanced learning strategy in machine learning,
we present a spatio-temporal over-sampling method to
resolve the class imbalance in background subtraction. Our
method densely generates synthesized foreground samples in a
compact 3D spatio-temporal domain. The generated samples
reduce the imbalance level between foreground
and background in both quantity and quality, and therefore
contribute to improved detection performance. We also
define a new index to measure the change of imbalance level
during over-sampling. Experiments are conducted on public
datasets to demonstrate the effectiveness of our method.
Index Terms— imbalanced learning, class imbalance,
background subtraction, moving object detection
1. INTRODUCTION
Moving object detection, also called background subtraction,
is a key step in many computer vision applications. Interference
in the imaging environment is generally recognized as a
challenge for moving object detection. In an early work, Toyama
et al. [1] identify nine such problems, e.g., light switch,
waving trees, and sleeping and waking persons. More recently,
Brutzer et al. [2] list seven challenges: gradual and
sudden illumination changes, dynamic background, camouflage,
shadows, bootstrapping and video noise.
This work was supported by National Natural Science Foundation of
China (No. 61105001, No. 61171144 and No. 61308102), the Key (Key
grant) Project of Chinese Ministry of Education (No. 212053), the Inno-
vation Program of Shanghai Municipal Education Commission (No. 12Z-
Z086), the Chinese Postdoctoral Science Foundation (No. 2013M531946),
the Sichuan Provincial Key Technology Research and Development Program
(No. 2014GZX0009), the Fundamental Research Funds for the Central U-
niversities of China (No. ZYGX2013J059), and a Marie Curie International
Incoming Fellowship within the 7th European Community Framework Pro-
gramme under Grant No. 299202.
Fig. 1. $TP_{rate}$ and $TN_{rate}$ of eight state-of-the-art moving object
detection algorithms on the CDW2012 database: PBAS [4], SGMM [5],
DPGMM [6], PSP [7], CHEBY [8], KNN [9], SOBS [10] and KDE [11]
(vertical axis: rate, from 0 to 1).
Table 1. Global imbalance level on the CDW2012 database.
subset   1      2      3      4      5      6
η        0.031  0.014  0.045  0.038  0.053  0.081
We argue that the class imbalance is another key issue in
background subtraction. In machine learning [3], class im-
balance refers to the case that the majority (or negative) class
is represented by a large number of samples while the mi-
nority (or positive) class by only a few. Classifiers trained
with such skewed data tend to generate results with high true
negative rate and low true positive rate. We find that class
imbalance also exists in background subtraction. We define the
global imbalance degree of a video as $\eta = \mathrm{sum}(F)/\mathrm{sum}(B)$,
where $\mathrm{sum}(F)$ and $\mathrm{sum}(B)$ are the total numbers of foreground and
background pixels, respectively. We then compute $\eta$ on the
CDW2012 database (www.changedetection.net), which consists of
six subsets; the results are shown in Table 1.
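As a minimal sketch of how the global imbalance degree defined above could be computed, the Python snippet below counts foreground and background pixels over all ground-truth masks of a video and returns their ratio. The function name `global_imbalance`, the `Mask` type alias, and the convention that masks are binary (1 for foreground, 0 for background) are assumptions made for illustration; they are not taken from the paper, and real ground-truth files may use additional labels that would first have to be mapped to these two classes.

```python
from typing import Iterable, Sequence

# Assumed representation: one frame is a list of rows,
# each pixel labelled 1 (foreground) or 0 (background).
Mask = Sequence[Sequence[int]]


def global_imbalance(masks: Iterable[Mask]) -> float:
    """Return eta = sum(F) / sum(B) accumulated over all frames."""
    foreground = 0
    background = 0
    for mask in masks:
        for row in mask:
            for label in row:
                if label == 1:
                    foreground += 1
                else:
                    background += 1
    if background == 0:
        # eta is undefined when a video contains no background pixels.
        raise ValueError("no background pixels; eta is undefined")
    return foreground / background


# Toy video (hypothetical data): two 2x4 frames with 2 foreground
# pixels and 14 background pixels in total.
toy_video = [
    [[0, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 1, 0]],
]
eta = global_imbalance(toy_video)  # 2 / 14
```

A value of $\eta$ well below 1, as in this toy example, indicates that foreground is the minority class; a perfectly balanced video would give $\eta = 1$.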
Table 1 reveals that video data is indeed imbalanced,
where foreground and background are the minority class and
majority class, respectively. Define $TP_{rate}$ and $TN_{rate}$ as

$$TP_{rate} = \frac{TP}{TP + FN} \quad \text{and} \quad TN_{rate} = \frac{TN}{TN + FP},$$

where $TP$, $FP$, $TN$ and $FN$ are the total numbers of true positives,
false positives, true negatives and false negatives, respectively.
The performance of eight state-of-the-art moving object
detection algorithms [4]-[11] on the CDW2012 database is shown
in Fig. 1, where all algorithms exhibit high $TN_{rate}$
and rela-