A Unified Approach to Salient Object Detection via Low Rank Matrix Recovery
Xiaohui Shen and Ying Wu
Northwestern University
2145 Sheridan Road, Evanston, IL 60208
{xsh835, yingwu}@eecs.northwestern.edu
Abstract
Salient object detection is not a pure low-level, bottom-
up process. Higher-level knowledge is important even for
task-independent image saliency. We propose a unified
model to incorporate traditional low-level features with
higher-level guidance to detect salient objects. In our mod-
el, an image is represented as a low-rank matrix plus sparse
noises in a certain feature space, where the non-salient re-
gions (or background) can be explained by the low-rank
matrix, and the salient regions are indicated by the sparse
noises. To ensure the validity of this model, a linear trans-
form for the feature space is introduced and needs to be
learned. Given an image, its low-level saliency is then ex-
tracted by identifying those sparse noises when recovering
the low-rank matrix. Furthermore, higher-level knowledge
is fused to compose a prior map, and is treated as a prior
term in the objective function to improve the performance.
Extensive experiments show that our model comfortably
achieves performance comparable to existing methods
even without the help of high-level knowledge. The
integration of top-down priors further improves the
performance and achieves state-of-the-art results. Moreover, the
proposed model can be considered as a prototype framework
not only for general salient object detection, but also for
potential task-dependent saliency applications.
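The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the standard robust PCA objective (minimize ||L||_* + lam * ||S||_1 subject to F = L + S), solved with an inexact augmented Lagrangian scheme. This is an assumption for illustration, not necessarily the exact formulation or solver used in this paper. The learned feature transform and the high-level prior term are omitted. The names low_rank_sparse_split and region_saliency, the default lam, and the per-region score (the l1 norm of each column of S) are likewise hypothetical choices.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: the proximal operator of tau * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def low_rank_sparse_split(F, lam=None, tol=1e-7, max_iter=500):
    """Split a nonzero feature matrix F (features x regions) into L + S.

    L is low rank (background), S is sparse (candidate salient regions).
    Assumed objective: min ||L||_* + lam * ||S||_1  s.t.  F = L + S.
    """
    F = np.asarray(F, dtype=float)
    d, n = F.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(d, n))      # common default, an assumption here
    fro = np.linalg.norm(F)                  # Frobenius norm
    spec = np.linalg.norm(F, 2)              # largest singular value
    Y = F / max(spec, np.abs(F).max() / lam) # dual variable initialisation
    mu, rho = 1.25 / spec, 1.5               # penalty weight and its growth rate
    L = np.zeros_like(F)
    S = np.zeros_like(F)
    for _ in range(max_iter):
        # Low-rank step: shrink the singular values by 1 / mu.
        U, sig, Vt = np.linalg.svd(F - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse step: shrink the entries by lam / mu.
        S = soft_threshold(F - L + Y / mu, lam / mu)
        R = F - L - S                        # constraint residual
        Y = Y + mu * R
        mu *= rho
        if np.linalg.norm(R) <= tol * fro:
            break
    return L, S


def region_saliency(S):
    """Score each region (column) by the l1 norm of its sparse component."""
    return np.abs(S).sum(axis=0)
```

Under these assumptions, each column of F holds the feature vector of one image region, and regions whose columns carry large entries in S are the ones the low-rank background fails to explain.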
1. Introduction
Image saliency is an important and fundamental research
problem in neuroscience and psychology to investigate the
mechanism of human visual systems in selecting regions of
interest from complex scenes. Recently it has also been an
active topic in computer vision, due to its applications to
object detection [11, 20] and image editing techniques [19,
8, 13, 4].
Figure 1. Illustration of our approach. Panels: (a) Input;
(b) High-level priors; (c) Saliency map; (d) Detection. By
integrating the low-level visual features from the image in (a)
and high-level priors from human perception in (b), we get the
saliency map of the image as in (c). The salient object is then
segmented based on the saliency map, which is shown in (d).
Visual saliency can be viewed from different perspectives.
Contrast-based and uniqueness-based methods are two typical
categories. Local contrast on multiple low-level features can
be used to detect low-level saliency [12], which has motivated
various models and methods that combine local, regional and
global contrast-based features [17, 23, 27, 18, 8, 4, 16]. In
addition, uniqueness is another point of view for saliency,
because salient regions can be regarded as those that cannot be
well “explained” by their surroundings [2], i.e., they are
unique. To measure uniqueness, different models such as
self-information [3], graphical models [9], log-spectrum [10]
and sparsity models [26] have been studied. Uniqueness is in
essence similar to high contrast, as regions that differ from
their surroundings usually have high responses on
contrast-based features.
These methods may work well for low-level saliency (or
salient regions), but they are neither sufficient nor
necessary, especially in cases where the saliency is also
related to human perception or is task-dependent. While
salient regions are mostly unique, the converse is not
necessarily true [14]. Not all unique regions are salient,
and a small region with high local contrast might be con-