PCA-Driven Visual Saliency Detection Algorithm with Improved Accuracy


844 IEEE TRANSACTIONS ON BROADCASTING, VOL. 62, NO. 4, DECEMBER 2016

for the first time and proposed a feed-forward model to combine these features. The Koch and Ullman [17] model was first fully implemented and verified by Itti et al. [9] in a biologically plausible way. Since then, many models with different assumptions for attention modeling have been proposed. It is very difficult to classify all the saliency models. In this section, we provide a brief overview of recent saliency models. The models are classified intuitively into four classes: local models, global models, combined local+global models, and others. Note that we focus on models for human fixation prediction rather than models that detect the most salient region or object in an image.

Itti et al. [9] computed saliency maps for each of three features (color, intensity, and orientation) in parallel, where each feature was computed by a set of linear "center-surround" operations akin to visual receptive fields. These maps were then linearly summed and normalized to yield the "conspicuity maps". Le Meur et al. [18] proposed an approach based on an understanding of HVS behavior. Contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions were some of the features implemented in this model. Itti and Baldi [19] defined surprising stimuli as those that significantly change the beliefs of an observer, measured by the Kullback-Leibler (KL) distance between the observer's posterior and prior beliefs. Harel et al. [20] first extracted features similar to those in [9] to obtain three multi-scale feature maps (color, intensity, and orientation). Then, a fully connected graph was built and each graph was treated as a Markov chain to build an activation map. Seo and Milanfar [21] first computed local features measuring the likeness of a pixel to its surroundings. Then, the matrix cosine similarity (a generalization of cosine similarity) was employed to measure the similarity of each pixel to its surroundings. Gao et al. [22] defined saliency as classification with minimal expected error. The KL distance was utilized to measure mutual information between features at a scene location and class labels. Higher mutual information between a region and the class of interest indicates higher saliency of that region. Gu et al. [23] introduced free energy theory into saliency detection. They computed the local entropy of the gap between an image and its predicted version, reconstructed from the input image by a semi-parametric model. This model fused a parametric autoregressive (AR) operator, which can simulate a broad range of natural scenes, with non-parametric bilateral filtering, which works stably at image edges.
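To make the notion of surprise attributed to Itti and Baldi [19] above more concrete, the following is a minimal sketch, assuming beliefs are discrete probability distributions over a few hypotheses and that the KL distance is taken from the posterior to the prior. The function names (`bayes_update`, `kl_divergence`, `surprise`) and the toy numbers are illustrative assumptions, not taken from the cited work.

```python
import math

def bayes_update(prior, likelihood):
    """Posterior over hypotheses, given the prior and the likelihood of the observed data."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def kl_divergence(p, q):
    """KL distance KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def surprise(prior, likelihood):
    """Surprise of an observation: KL distance between posterior and prior beliefs."""
    posterior = bayes_update(prior, likelihood)
    return kl_divergence(posterior, prior)

# Hypothetical observer with two equally likely hypotheses.
prior = [0.5, 0.5]
expected = surprise(prior, [0.5, 0.5])    # data equally likely under both: beliefs unchanged
unexpected = surprise(prior, [0.9, 0.1])  # data strongly favors the first hypothesis
```

In this toy setting, an observation that leaves the beliefs unchanged carries zero surprise, while one that shifts them carries a positive amount.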

Bruce and Tsotsos [7] proposed the Attention based on Information Maximization (AIM) model, which aims at maximizing the information sampled from a scene. The proposed operation was based on Shannon's self-information measure and was implemented in a neural circuit, which was demonstrated to have close ties with the circuitry in the primate visual cortex. Zhang et al. [24] proposed a definition of saliency by considering what the visual system is trying to optimize when directing attention. The resulting model was a Bayesian framework from which bottom-up saliency emerged naturally as the self-information of visual features, and overall saliency (incorporating top-down information with bottom-up saliency) emerged as the pointwise mutual information between the features and the target when searching for a target. Garcia-Diaz et al. [25] proposed a visual saliency approach that relied on a contextually adapted representation produced through adaptive whitening of color and scale features. The proposal was grounded in the specific adaptation of the basis of low-level features to the statistical structure of the image. Adaptation was achieved through decorrelation and contrast normalization in several steps in a hierarchical approach, in compliance with coarse features described in biological visual systems. Saliency was simply the square of the vector norm in the resulting representation. Riche et al. [13] proposed a saliency prediction model that selected information worthy of attention based on multi-scale spatial rarity. First, they extracted low-level color and medium-level orientation features. Afterwards, a multi-scale rarity mechanism was applied. Finally, they fused the rarity maps into a final saliency map.
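The self-information idea shared by the models of Bruce and Tsotsos [7] and Zhang et al. [24] above can be illustrated with a short sketch. It assumes, purely for illustration, that the "feature" at each pixel is its quantized intensity and that feature probabilities are estimated from the histogram of the image itself; the cited models use richer features and different density estimates. The function name `self_information_map` is hypothetical.

```python
import math
from collections import Counter

def self_information_map(image):
    """Per-pixel saliency as Shannon self-information, -log2 p(value).

    p(value) is estimated from the value histogram of the image itself,
    which is a simplifying assumption for this sketch.
    """
    values = [v for row in image for v in row]
    counts = Counter(values)
    n = len(values)
    return [[-math.log2(counts[v] / n) for v in row] for row in image]

# Toy 4x4 image: one rare value (7) on a uniform background (0).
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 7, 0],
    [0, 0, 0, 0],
]
saliency = self_information_map(image)
```

The rare value occurs with probability 1/16 and therefore carries 4 bits of self-information, far more than the common background value.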

Li et al. [26] proposed a saliency model that combined global information from frequency domain analysis with local information from spatial domain analysis. In the frequency domain analysis, they suppressed repeating patterns by using spectrum smoothing. In the spatial domain analysis, they enhanced regions by using a center-surround mechanism. Goferman et al. [15] proposed a context-aware saliency detection model based on four principles: first, local low-level considerations such as color and contrast; second, global considerations, which suppress frequently occurring features while maintaining features that deviate from the norm; third, visual organization rules, which state that visual forms may possess one or several centers of gravity about which the form is organized; and fourth, high-level factors, such as human faces. Borji and Itti [14] proposed a framework that measured patch rarities locally and globally in the RGB and Lab color spaces and fused the local and global saliency maps of all channels from both color spaces into the final saliency map. Liu et al. [27] measured visual saliency as the unpredicted information of an image patch through an order-adaptive predictor under the minimum description length principle. Furthermore, a structural redundancy operator was also involved to improve the saliency detection performance.
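One simple way to read the idea of global patch rarity mentioned above is to score each patch by how far it lies, on average, from every other patch in the image. The sketch below follows that reading only; it is a simplified stand-in, not the formulation of Borji and Itti [14], and the helper names (`extract_patches`, `global_rarity`), the single-channel input, and the non-overlapping patch grid are all assumptions made for illustration.

```python
import math

def extract_patches(image, size):
    """Split an image (list of rows) into non-overlapping size x size patches.

    Returns a dict mapping (patch_row, patch_col) to the flattened patch values.
    """
    patches = {}
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            patches[(r // size, c // size)] = [
                image[r + dr][c + dc] for dr in range(size) for dc in range(size)
            ]
    return patches

def global_rarity(patches):
    """Mean Euclidean distance from each patch to every other patch."""
    scores = {}
    for key, patch in patches.items():
        dists = [math.dist(patch, other)
                 for other_key, other in patches.items() if other_key != key]
        scores[key] = sum(dists) / len(dists)
    return scores

# Toy 6x6 single-channel image with one distinct 2x2 patch in the middle.
image = [[0.1] * 6 for _ in range(6)]
image[2][2] = image[2][3] = image[3][2] = image[3][3] = 0.9
scores = global_rarity(extract_patches(image, 2))
most_rare = max(scores, key=scores.get)
```

A local variant would restrict the comparison to neighboring patches, and the two score maps could then be fused, in the spirit of the combined local+global models described above.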

Hou and Zhang [10] proposed the spectral residual saliency model based on the idea that statistical singularities in the spectrum might be responsible for anomalous regions in the image, where proto-objects pop out. They first analyzed the log spectrum of each image and obtained the spectral residual. Then they transformed the spectral residual to the spatial domain to obtain the saliency map. Guo et al. [28] showed that it was the phase spectrum, not the amplitude spectrum, of the Fourier transform that was the key to obtaining the location of salient areas. Later, Guo and Zhang [29] proposed a quaternion representation of an image composed of intensity, color, and motion features. Based on the principle of the phase spectrum of the Fourier transform, the spatiotemporal saliency map was calculated from its quaternion representation. Achanta et al. [30] proposed a frequency-tuned approach using low-level features of color and luminance. The input RGB image was transformed to the Lab color space and blurred with a Gaussian kernel to eliminate noise and texture details. Then the saliency map was computed using the Euclidean distance
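
The phase-spectrum observation of Guo et al. [28] described above can be sketched as follows: take the Fourier transform of the image, discard the amplitude so that every coefficient has unit magnitude, transform back, and square the result. This is a minimal sketch under stated assumptions: it uses a naive pure-Python DFT that is only practical for tiny images, a single grayscale channel, and no final smoothing step; the function names `dft2` and `phase_spectrum_saliency` are hypothetical.

```python
import cmath
import math

def dft2(x, inverse=False):
    """Naive 2-D discrete Fourier transform of a list of rows (O(N^4), tiny inputs only)."""
    rows, cols = len(x), len(x[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = sign * 2 * math.pi * (u * r / rows + v * c / cols)
                    total += x[r][c] * cmath.exp(1j * angle)
            # The inverse transform carries the 1 / (rows * cols) normalization.
            out[u][v] = total / (rows * cols) if inverse else total
    return out

def phase_spectrum_saliency(image):
    """Saliency from the phase spectrum alone: |IDFT(exp(i * phase))|^2."""
    spectrum = dft2(image)
    # Keep only the phase; every coefficient gets unit amplitude.
    unit = [[cmath.exp(1j * cmath.phase(f)) for f in row] for row in spectrum]
    reconstruction = dft2(unit, inverse=True)
    return [[abs(z) ** 2 for z in row] for row in reconstruction]

# Toy 8x8 image: uniform background with one bright pixel at row 5, column 2.
image = [[10.0] * 8 for _ in range(8)]
image[5][2] = 200.0
saliency = phase_spectrum_saliency(image)
peak = max(((r, c) for r in range(8) for c in range(8)),
           key=lambda rc: saliency[rc[0]][rc[1]])
```

In this toy case the uniform background contributes only to the zero-frequency amplitude, so the phase-only reconstruction concentrates at the bright pixel. The spectral residual model of Hou and Zhang [10] differs in that it keeps a residual of the log amplitude spectrum instead of discarding the amplitude entirely.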
