1552 IEEE SIGNAL PROCESSING LETTERS, VOL. 22, NO. 10, OCTOBER 2015
Visual Saliency Detection With Free Energy Theory
Ke Gu, Student Member, IEEE, Guangtao Zhai, Member, IEEE,WeisiLin, Senior Member, IEEE,
Xiaokang Yang, Senior Member, IEEE, and Wenjun Zhang, Fellow, IEEE
Abstract—Visual saliency can be thought of as a product of human brain activity. Most existing models are built upon local features, global features, or both. Lately, the so-called free energy principle has unified several brain theories within one framework, and tells, through a psychological measure, which parts of a visual stimulus easily surprise human viewers. We believe that this “surprise” should be highly related to visual saliency, and thereby introduce a novel computational Free Energy inspired Saliency detection technique (FES). Our method computes the local entropy of the gap between an input image signal and its predicted counterpart, which is reconstructed from the input with a semi-parametric model. Experimental results show that our algorithm predicts human fixation points accurately and is superior to classical and state-of-the-art competitors.
Index Terms—Bi-lateral filtering, free energy, linear autoregres-
sive (AR) model, saliency detection, semi-parametric model.
I. INTRODUCTION
SALIENCY detection is an active and important research topic in both the image processing and computer vision communities. In many applications of graphics, design, and human-computer interaction, we are strongly concerned with where human beings look in a scene, i.e., where the saliency spots are located. Visual saliency can promote the study of quality assessment [1], [2], object recognition [3], [4], and computer graphics [5]. Hence an efficient and effective computational model is eagerly required to detect salient areas in an encountered scene.
Several hundred saliency detection models have been proposed during the past 25 years [6], and this number is expected to keep growing quickly. Existing methods are divided into two types according to distinct attentional mechanisms: 1) top-down, task-dependent methods; 2) bottom-up, stimulus-driven methods. Because top-down approaches require prior knowledge about the visual content, bottom-up approaches that only use information from the visual signal itself have been broadly and deeply researched.
Manuscript received February 03, 2015; revised March 06, 2015; accepted March 13, 2015. Date of publication March 18, 2015; date of current version March 24, 2015. This work was supported in part by the National Science Foundation of China under Grants 61025005, 61371146, 61221001, and 61390514, by the Foundation for the Author of National Excellent Doctoral Dissertation of China under Grant 201339, and by the Shanghai Municipal Commission of Economy and Informatization under Grant 140310. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Zhengguo Li.
K. Gu, G. Zhai, X. Yang, and W. Zhang are with the Institute of Image Communication and Information Processing, Shanghai Key Laboratory of Digital Media Processing and Transmissions, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: guke.doctor@gmail.com; zhaiguangtao@sjtu.edu.cn; xkyang@sjtu.edu.cn; zhangwenjun@sjtu.edu.cn).
W. Lin is with the School of Computer Engineering, Nanyang Technological University, Singapore 639798 (e-mail: wslin@ntu.edu.sg).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LSP.2015.2413944
In this letter we concentrate on bottom-up methods. Many techniques in this class are designed to seek locations with maximum local saliency and employ biologically motivated local features [7]–[10]. These features, which mainly consist of intensity, edge, texture, color, and orientation, are inspired by neural responses in the lateral geniculate nucleus and the V1 cortex. The benchmark Itti model [7] provides a general architecture for detecting visual saliency. This model works by first subsampling an input image into a Gaussian pyramid, decomposing each pyramid level into separate channels for color, intensity, and orientation, and then summing and normalizing the maps in each channel across scales to yield the final saliency map.
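The pyramid-plus-center-surround pipeline of the Itti model can be sketched roughly as follows. This is a deliberately simplified stand-in (intensity channel only, a plain range normalization, and arbitrarily chosen center-surround level pairs), not the full model of [7]:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def itti_like_saliency(gray, n_levels=7, cs_pairs=((2, 5), (2, 6), (3, 6))):
    """Simplified Itti-style saliency on a single intensity channel."""
    # Gaussian pyramid: repeatedly blur and downsample by 2.
    pyr = [gray.astype(np.float64)]
    for _ in range(n_levels - 1):
        pyr.append(gaussian_filter(pyr[-1], sigma=1.0)[::2, ::2])

    h, w = pyr[2].shape  # common resolution for all feature maps
    sal = np.zeros((h, w))
    for c, s in cs_pairs:
        # Resample center (fine) and surround (coarse) levels to a common size.
        center = zoom(pyr[c], (h / pyr[c].shape[0], w / pyr[c].shape[1]), order=1)
        surround = zoom(pyr[s], (h / pyr[s].shape[0], w / pyr[s].shape[1]), order=1)
        fmap = np.abs(center - surround)       # center-surround difference
        rng = fmap.max() - fmap.min()
        if rng > 0:                            # simple per-map normalization
            fmap = (fmap - fmap.min()) / rng
        sal += fmap                            # sum across scales
    return sal / len(cs_pairs)
```

The across-scale summation at one common resolution mirrors the "summing and normalizing maps across scales" step described above, with the color and orientation channels omitted for brevity.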
Some other relevant algorithms depend on global features [11]–[15]. These techniques mainly attempt to find regions of a visual signal that exhibit unique frequencies in transform domains. This lets such algorithms quickly and precisely detect visual “pop-outs” on the basis of global considerations, and thus locate likely salient objects. The classical spectral residual (SR) model [11] was established upon the finding that more high-frequency than low-frequency information is stored in the residual of the Fourier amplitude spectrum, and this residual spectrum is used to constitute the saliency map.
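A minimal version of the SR idea can be written in a few lines: take the residual between the log amplitude spectrum and its local average, keep the original phase, and invert the transform. The filter sizes and smoothing parameter below are illustrative choices, not the exact settings of [11]:

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(gray, sigma=3.0):
    """Spectral-residual-style saliency map for a grayscale image."""
    f = np.fft.fft2(gray.astype(np.float64))
    log_amp = np.log(np.abs(f) + 1e-12)        # log amplitude spectrum
    phase = np.angle(f)                        # phase spectrum, kept as-is
    # Residual: log spectrum minus its local (3x3) average.
    residual = log_amp - uniform_filter(log_amp, size=3)
    # Back to the image domain; squared magnitude, then smoothing.
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return gaussian_filter(sal, sigma)
```

Because the residual suppresses the smooth, statistically redundant part of the spectrum, the reconstructed image concentrates energy at globally unusual regions, i.e., the "pop-outs" mentioned above.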
Recently, the adoption of only local or only global features was found to be somewhat limited. Thus, an increasing number of recent studies have been devoted to incorporating both types of features for saliency detection [16]–[20]. Most of them were developed based on complementary strategies, thereby gaining substantially higher performance. In [18], the authors took into account local and global image patch rarities (LG) as two complementary processes to design their saliency detection model. In [19], the content-aware saliency detection (CAS) model combines four basic principles of human visual attention, i.e., local low-level considerations, global considerations, visual organization rules, and high-level factors.
It is human viewers who decide visual saliency, and thus the most valid technique should closely approximate the response of the human brain to visual stimuli. Friston has lately unified several brain theories within the free-energy framework, which indicates that the brain's inference process always attempts to infer the meaningful part of a visual stimulus by removing the uncertainty [21]. It is natural that there exists a gap between the real scene and the brain's prediction, because the internal generative model cannot be universal. It is this gap that surprises human viewers, and thus attracts much more human attention. Therefore, we hypothesize that this gap (i.e., the “surprise”) highly correlates with visual saliency. Based on this postulation, this letter designs a new computational Free Energy inspired Saliency detection model (FES). Our work computes the local entropy of the gap between an image and its predicted version, reconstructed from the input by a semi-parametric model, which fuses the parametric autoregressive (AR)
1070-9908 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
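The core computation sketched in the abstract, local entropy of the residual between an image and its predicted version, might look roughly as follows. Note that the authors' semi-parametric AR/bilateral predictor is replaced here by a plain Gaussian-blur predictor, so this is only an illustrative stand-in for the "surprise" map, not the FES model itself; the patch size and bin count are arbitrary:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_entropy_saliency(gray, patch=16, bins=32):
    """Local entropy of a prediction residual (illustrative stand-in for FES)."""
    gray = gray.astype(np.float64)
    predicted = gaussian_filter(gray, sigma=2.0)  # stand-in for the predictor
    residual = gray - predicted                   # the "surprise" gap
    h, w = gray.shape
    sal = np.zeros((h // patch, w // patch))
    for i in range(sal.shape[0]):
        for j in range(sal.shape[1]):
            block = residual[i * patch:(i + 1) * patch,
                             j * patch:(j + 1) * patch]
            hist, _ = np.histogram(block, bins=bins)
            p = hist / hist.sum()
            p = p[p > 0]
            sal[i, j] = -np.sum(p * np.log2(p))   # Shannon entropy per patch
    return sal
```

Patches where the predictor fails badly yield broad residual histograms and hence high entropy, which is exactly the intuition behind equating the prediction gap with saliency.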