Learning to Detect A Salient Object

Tie Liu (1), Jian Sun (2), Nan-Ning Zheng (1), Xiaoou Tang (2), Heung-Yeung Shum (2)
(1) Xi’an Jiaotong University, Xi’an, P.R. China
(2) Microsoft Research Asia, Beijing, P.R. China
Abstract

We study visual attention by detecting a salient object in an input image. We formulate salient object detection as an image segmentation problem, where we separate the salient object from the image background. We propose a set of novel features, including multi-scale contrast, center-surround histogram, and color spatial distribution, to describe a salient object locally, regionally, and globally. A Conditional Random Field is learned to effectively combine these features for salient object detection. We also constructed a large image database containing tens of thousands of carefully labeled images by multiple users. To our knowledge, it is the first large image database for quantitative evaluation of visual attention algorithms. We validate our approach on this image database, which is publicly available with this paper.
1. Introduction
“Everyone knows what attention is...”
—William James, 1890
The human brain and visual system pay more attention to some parts of an image. Visual attention has been studied by researchers in physiology, psychology, neural systems, and computer vision for a long time. There are many applications for visual attention, for example, automatic image cropping [23], adaptive image display on small devices [4], image/video compression, advertising design [7], and image collection browsing. Recent studies [18, 22, 26] demonstrated that visual attention helps object recognition, tracking, and detection as well.
Figure 1. Salient map. From top to bottom: input image, salient map computed by Itti’s algorithm (http://www.saliencytoolbox.net), and salient map computed by our approach.

Most existing visual attention approaches are based on the bottom-up computational framework [3, 6, 8, 9, 10, 11, 19, 25] because visual attention is in general unconsciously driven by low-level stimuli in the scene such as intensity, contrast, and motion. These approaches consist of the following three steps. The first step is feature extraction, in which multiple low-level visual features, such as intensity, color, orientation, texture, and motion, are extracted from the image at multiple scales. The second step is saliency computation. The saliency is computed by a center-surround operation [10], self-information [3], or a graph-based random walk [6] using multiple features. After normalization and linear/non-linear combination, a master map [24] or a salient map [11] is computed to represent the saliency of each image pixel. Last, a few key locations on the saliency map are identified by winner-take-all, inhibition-of-return, or other non-linear operations. While these approaches have worked well in finding a few fixation locations in both synthetic and natural images, they have not been able to accurately detect where visual attention should be.
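The three-step bottom-up pipeline above can be illustrated with a minimal, intensity-only center-surround sketch. This is a simplified, hypothetical rendition rather than Itti’s actual implementation: the pyramid uses 2×2 average pooling in place of Gaussian filtering, a single feature channel stands in for the full feature set, and the function names are ours.

```python
import numpy as np

def build_pyramid(img, levels):
    """Build an image pyramid by repeated 2x2 average pooling
    (a crude stand-in for Gaussian blurring and subsampling)."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        h, w = pyr[-1].shape
        h2, w2 = (h // 2) * 2, (w // 2) * 2
        down = pyr[-1][:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))
        pyr.append(down)
    return pyr

def upsample_to(img, shape):
    """Nearest-neighbour upsample back to a reference resolution."""
    ys = (np.arange(shape[0]) * img.shape[0] // shape[0]).clip(0, img.shape[0] - 1)
    xs = (np.arange(shape[1]) * img.shape[1] // shape[1]).clip(0, img.shape[1] - 1)
    return img[np.ix_(ys, xs)]

def center_surround_saliency(intensity, center_levels=(0, 1), delta=2):
    """Center-surround differences |center - surround| between a fine
    pyramid level (center) and a coarser one (surround), summed across
    scales and normalized to [0, 1] as a single saliency map."""
    pyr = build_pyramid(intensity, max(center_levels) + delta + 1)
    sal = np.zeros_like(pyr[0])
    for c in center_levels:
        center = upsample_to(pyr[c], pyr[0].shape)
        surround = upsample_to(pyr[c + delta], pyr[0].shape)
        sal += np.abs(center - surround)
    rng = sal.max() - sal.min()
    return (sal - sal.min()) / rng if rng > 0 else sal

# A bright square on a dark background: the response concentrates on
# high-contrast boundaries, mirroring the behaviour criticized in the text.
img = np.zeros((64, 64))
img[22:38, 22:38] = 1.0
saliency = center_surround_saliency(img)
```

Note that on this toy input the largest responses sit where coarse surround blocks straddle the square’s boundary, which is exactly the boundary-and-texture bias discussed next.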
For instance, the middle row in Figure 1 shows three salient maps computed using Itti’s algorithm [10]. Notice that the saliency concentrates on several small local regions with high-contrast structures, e.g., the background grid in (a), the shadow in (b), and the foreground boundary in (c). Although the leaf in (a) commands much attention, the saliency for the leaf is low. Therefore, these salient maps computed from low-level features are not a good indication of where a user’s attention is while perusing these images.
In this paper, we incorporate the high-level concept of a salient object into the process of visual attention computation. In Figure 1, the leaf, car, and woman attract the most visual attention in each respective image. We call them salient objects, or foreground objects that we are familiar with. As can