978-1-5090-0654-0/16/$31.00 ©2016 IEEE 263 ICALIP 2016
MOVING OBJECT DETECTION BASED ON BACKGROUND DICTIONARY
Hua-sheng Zhu, Jun Wang, Chen-guang Xu, and Jun Ye
School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China
ABSTRACT
The Gaussian Mixture Model (GMM) and its variants
process images pixel by pixel, so they are sensitive to
noise and computationally expensive. In this paper,
we propose a robust moving object detection algorithm
based on background dictionary learning. We first
divide an image into multiple patches of equal size;
each patch belongs to either the object or the background. Then
a background dictionary is learnt for each patch. The
similarity between a patch and its background dictionary
is measured, and the patch is classified as object or
background accordingly. Additionally, to
adapt to dynamic changes across a video sequence, a
robust background dictionary updating scheme is
proposed. Experimental results demonstrate the
effectiveness and robustness of the proposed detection
algorithm.
Key Words — Moving object detection; Gaussian
mixture model; background dictionary; similarity
1. INTRODUCTION
Moving object detection is an active research topic in
computer vision. Existing methods
can be broadly categorized into three groups:
optical flow methods [1-3], frame subtraction
methods [4-5], and background subtraction methods [6-8].
Optical flow is the pattern of apparent motion of
objects, surfaces, and edges in a visual scene caused by
the relative motion between an observer and the scene.
The optical flow method is susceptible to
noise, and its computational cost is high. The frame
subtraction method captures a moving object by computing
the differences between two adjacent frames. This
method has a low computational cost, and it is robust to
illumination variations. However, the frame subtraction
method is unstable when the moving speed varies,
and it cannot capture the whole outline of the object. The
background subtraction method compares the intensity
of the current image with that of the corresponding
background. Because of its robustness, the
background subtraction method is widely applied to
moving object detection. Building a background model is a critical step
before segmenting a moving object. Stauffer et al. [9]
proposed a background modeling method based on the Gaussian
Mixture Model (GMM), which is widely applied in
moving object detection for videos. By updating
the background model in real time, GMM can efficiently overcome
the small perturbations caused by a dynamic
background and the noise caused by camera shake.
However, GMM is not robust to
severe illumination variations. Improved variants
[10-12] of GMM have been proposed; however, these methods
still process an image pixel by pixel, computing a probability
density for each pixel. The computational cost of GMM
is therefore high, and it is not robust to noise. Recently, a DPM
based object detection algorithm [13] and a deformation
dictionary based object detection algorithm [14] have been
proposed.
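As an illustration of the frame subtraction method discussed above, the following minimal Python sketch thresholds the absolute difference between two adjacent grayscale frames. The function name `frame_difference`, the threshold value of 25, and the toy 3x4 frames are assumptions made for this example; they are not taken from the cited works [4-5].

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Return a binary mask: 1 where the intensity changed by more than
    `threshold` between two adjacent frames, 0 elsewhere.

    Frames are lists of rows of grayscale intensities (assumed layout).
    """
    return [
        [1 if abs(curr - prev) > threshold else 0
         for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Toy frames (hypothetical data): a bright object appears in the middle row.
prev_frame = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
curr_frame = [[10, 10, 10, 10],
              [10, 200, 200, 10],
              [10, 10, 10, 10]]

mask = frame_difference(prev_frame, curr_frame)
print(mask)
```

Only pixels whose intensity changes between the two frames are marked, so the interior of a slowly moving, uniformly colored object may be missed, which is one reason the whole outline is not always recovered.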
In this paper, we propose a robust moving object
detection method based on a learnt background dictionary.
An image is divided into multiple image patches,
and the similarity between each patch and the
corresponding dictionary is computed. Each patch is then
classified as object or background based on this
similarity. The proposed method is robust to illumination
variation, and its computational cost is low.
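The patch-wise decision described above can be sketched in Python as follows. This is only an illustration under stated assumptions: each dictionary is represented as a plain list of example background patches, cosine similarity stands in for the similarity measure, and the threshold of 0.95 is arbitrary. The helper names (`split_into_patches`, `cosine_similarity`, `classify_patches`) are hypothetical.

```python
import math


def split_into_patches(frame, n):
    """Split a frame (list of rows) into non-overlapping n x n patches,
    in row-major order. Assumes height and width are multiples of n."""
    height, width = len(frame), len(frame[0])
    return [
        [row[left:left + n] for row in frame[top:top + n]]
        for top in range(0, height, n)
        for left in range(0, width, n)
    ]


def flatten(patch):
    """Turn an n x n patch into a flat vector."""
    return [value for row in patch for value in row]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def classify_patches(frame, dictionaries, n, threshold=0.95):
    """Label each patch by its best similarity to its own dictionary.

    `dictionaries[i]` is a list of background example patches for patch i
    (an assumed stand-in for a learnt dictionary).
    """
    labels = []
    for patch, atoms in zip(split_into_patches(frame, n), dictionaries):
        vector = flatten(patch)
        best = max(cosine_similarity(vector, flatten(atom)) for atom in atoms)
        labels.append("background" if best >= threshold else "object")
    return labels


# Toy data (hypothetical): a 2 x 4 background frame and a later frame in
# which the right-hand patch is covered by a moving object.
background = [[10, 20, 10, 20],
              [30, 40, 30, 40]]
current = [[11, 21, 200, 10],
           [29, 41, 10, 200]]

dictionaries = [[patch] for patch in split_into_patches(background, 2)]
labels = classify_patches(current, dictionaries, 2)
print(labels)
```

Cosine similarity is unchanged when all intensities in a patch are scaled by the same positive factor, which is why it was chosen for this sketch; other similarity measures could be substituted without changing the overall structure.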
2. BACKGROUND DICTIONARY BASED MOVING
OBJECT DETECTION
In this section, we describe the details of the proposed
method, including its framework,
dictionary initialization, and dictionary updating.
2.1. The Framework of the Proposed Method
The framework of the proposed moving object detection
method is illustrated in Fig. 1. It consists of four major
components: video input, background dictionary
initialization, background dictionary updating, and moving
object detection.
A video is composed of many frames:
$V = \{f_1, f_2, \ldots, f_t\}$, (1)
where $V$ is a video and $f_i$ is the $i$-th frame.
Each frame consists of the background and moving
object:
$f = b + g$, (2)
where $b$ is the background of the current frame image and $g$ is
the moving object.
Each frame can be divided into several image patches
of the same size:
$f = p_1, p_2, \ldots, p_m \in R^{n \times n}$, (3)