Depth Information Fused Salient Object Detection
Fangfang Chen, Congyan Lang, Songhe Feng, Zehai Song
School of Computer and Information Technology, Beijing Jiaotong University, China
{12120401, cylang, shfeng, zhsong}@bjtu.edu.cn
ABSTRACT
Saliency detection has emerged as a hot topic due to its potential applications in image and video understanding. Most existing saliency detection algorithms focus on two-dimensional information, while depth information is often ignored. In this paper, we first create the salient object ground truth for a specific image dataset that contains 600 RGB-D (color and depth) images taken from different surroundings with different angles and intensities of illumination. The depth image describes the depth of each object in the scene from the perspective of the viewer: the intensity value of every pixel in the depth image denotes its depth. With the help of depth information, a more precise object description can be acquired. Furthermore, several state-of-the-art saliency detection models can be utilized to generate 2D saliency maps, which are then fused with the depth map to detect the salient object in a given image. Experimental results demonstrate the effectiveness of the proposed method.
Categories and Subject Descriptors
I.4.8 [Image Processing And Computer Vision]: Scene Analysis
– color, depth cues.
General Terms
Theory
Keywords
salient object detection, depth information, RGB-D image, visual
attention.
1. INTRODUCTION
The rapid popularization of digital cameras and mobile phone cameras has led to an explosive growth of social image sharing web sites, such as Flickr. How to organize and analyze these large-scale image collections has recently become a hot topic. Visual saliency is deemed a fundamental issue in the fields of psychology, neuroscience, neural systems, and computer vision. It can be regarded as the ability of a visual system (human or machine) to select a certain subset of visual information for further processing [1]. The goal of salient object detection is to detect and extract the most salient and attention-grabbing object in a scene. The output is usually called a “saliency map”, in which the intensity of each pixel represents the probability of that pixel belonging to the salient object [1]. Visual saliency and saliency detection can be applied in many fields, including object detection and recognition [3], image indexing [2], image compression [4], multimedia question answering [5], movie2comics [6], tagging technology [7], and so on. Studies of the human visual system suggest that saliency is related to the uniqueness, rarity, and surprise of a scene, characterized by primitive features such as color, texture, and shape [8]. Recently, various efforts have been made to compute the salient object of a given image.
In this paper, we introduce the depth information of an image to assist salient object detection. Depth information is derived from the depth image, also known as a distance image, which encodes the distance between the viewer and the objects in the scene: the intensity of each pixel in the depth image corresponds to its depth. The larger the gray value, the farther the object, as shown in Figure 1. The proposed approach consists of three main steps. First, we create the salient object ground truth for an image dataset containing 600 RGB-D images introduced in [27]. Then, image segmentation is applied to decompose the image into multiple segments. Finally, an adaptive fusion strategy incorporates the depth map with 2D saliency maps computed by state-of-the-art saliency detection algorithms. The novelty lies in that, unlike existing algorithms that compute saliency maps only from two-dimensional visual features, our method combines the depth information with the saliency maps, which improves object edge detection performance and generates a more precise salient object.
(a) original image (b) depth image
Figure 1. Example of a depth image: the intensity of each pixel in image (b) corresponds to the depth information. The greater the gray value, the farther the object.
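The adaptive fusion step described above could be sketched, for instance, as a simple pixel-wise convex combination in which pixels nearer to the viewer receive more weight; the function name, the linear weighting, and the parameter `alpha` are illustrative assumptions here, not the paper's actual formulation:

```python
import numpy as np

def fuse_saliency_with_depth(saliency_map, depth_map, alpha=0.5):
    """Hypothetical fusion of a 2D saliency map with a depth map.

    Larger gray values in the depth map mean farther objects, so nearer
    pixels (smaller depth values) are assumed more likely to be salient.
    """
    eps = 1e-8
    # Normalize both maps to [0, 1].
    s = (saliency_map - saliency_map.min()) / (saliency_map.max() - saliency_map.min() + eps)
    d = (depth_map - depth_map.min()) / (depth_map.max() - depth_map.min() + eps)
    # Invert depth so that near objects get high weight.
    nearness = 1.0 - d
    # Blend the raw saliency with its depth-weighted version.
    fused = alpha * s + (1.0 - alpha) * s * nearness
    # Renormalize the fused map to [0, 1].
    return (fused - fused.min()) / (fused.max() - fused.min() + eps)
```

In this sketch, `alpha` trades off the original 2D saliency map against its depth-weighted version; the paper's actual fusion is adaptive rather than a fixed linear blend.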
The paper is organized as follows. Related work on salient object detection is reviewed in Section 2. We present the proposed algorithm in Section 3. Experimental results are shown in Section 4, and a conclusion is given in Section 5.
2. RELATED WORKS
We briefly introduce related work on image salient object detection in this section. Recently, many efforts have been made to propose various computational models for calculating salient objects or regions. According to whether prior knowledge is required, existing saliency detection algorithms can be
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
ICIMCS’14, July 10–12, 2014, Xiamen, Fujian, China.
Copyright 2014 ACM 978-1-4503-2810-4/14/07 …$15.00.