Convolutional Neural Network and Potential Regions of Interest: An Efficient Saliency Detection Model

2015 11th International Conference on Natural Computation (ICNC'15)

A Combined Convolutional Neural Network and Potential Region-Of-Interest Model for Saliency Detection

Yu Hu, Zhen Liang, Zheru Chi, and Hong Fu

Department of Electronic and Information Engineering, the Hong Kong Polytechnic University, Hong Kong SAR, China

Department of Computer Science, Chu Hai College of Higher Education, Hong Kong SAR, China

Abstract-Developing a saliency detection model that approaches human performance is a challenging research topic. In this paper, a new saliency model is proposed to detect saliency in natural scenes by using a trained convolutional neural network and a region-based validation method. The convolutional neural network (CNN) focuses on image details and the local contrast of an image, while the region-based validation method focuses on global information. Experimental results show that the two components of the model are complementary to each other in producing high-quality saliency maps.

Keywords-saliency map; saliency detection; machine learning; convolutional neural networks

I. INTRODUCTION

Computational visual attention models have been developed over the past decades to filter unnecessary information out of visual content, because a growing amount of visual content is produced on a daily basis. Inspired by the human visual attention system, which selects prominent regions by saliency-based means, saliency detection models have been extensively developed to imitate the visual search process and to automatically determine the importance of regions that might attract human attention. With increasing computational power, new approaches such as deep machine learning are being employed in saliency modeling. However, generating saliency maps automatically from visual content without any prior knowledge remains a challenging task. Most bottom-up models that produce a saliency map can be categorized into two classes: learning-based approaches and non-learning-based approaches.

Non-learning-based saliency models are generally classified into three types. In the first type, images are treated as collections of pixels, where every pixel corresponds to a saliency value. The saliency value of a pixel is generated from its contrast with neighboring pixels. A typical model of this type is the biologically plausible model introduced by Itti et al. [1], which produces saliency maps from the intensity, color and orientation of images with center-surround bias. In the second type, images are first transformed into the frequency domain. For instance, Hou and Zhang [2] proposed a frequency-based model which uses a log-spectrum representation to generate saliency maps. In the last type, pre-segmentation steps are performed so that images are treated as separate regions. For instance, Fu et al. [3] introduced a region-based attention model which applies popping-out cutting to segment the conspicuous regions from the background. Moreover, Liang et al. [4] proposed a hypergraph structure which describes the spatial relationships among several segments. In that model, potential Regions-Of-Interest (p-ROIs) are produced to estimate the hypothetical ground truth of given scenes.
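
The pixel-based idea described above, where a pixel's saliency comes from its contrast with neighboring pixels, can be illustrated with a minimal Python sketch. It is not the model of Itti et al. [1], which combines intensity, color and orientation channels; it uses a single grayscale channel only, and the function name `local_contrast_saliency`, the `radius` parameter and the toy image are assumptions made for illustration.

```python
def local_contrast_saliency(image, radius=1):
    """Return a saliency map in which each pixel's value is the absolute
    difference between its intensity and the mean of its neighbors.

    `image` is a list of rows of grayscale intensities (hypothetical input
    format chosen for this sketch).
    """
    height, width = len(image), len(image[0])
    saliency = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the surrounding pixels inside a square window,
            # clipped at the image border and excluding the pixel itself.
            neighbors = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
                if (ny, nx) != (y, x)
            ]
            if not neighbors:  # a 1x1 image has no surround
                continue
            surround_mean = sum(neighbors) / len(neighbors)
            saliency[y][x] = abs(image[y][x] - surround_mean)
    # Normalize to [0, 1] so that maps from different images are comparable.
    peak = max(max(row) for row in saliency)
    if peak > 0:
        saliency = [[value / peak for value in row] for row in saliency]
    return saliency


# Toy input: a flat 5x5 image with a single bright pixel in the middle.
image = [[0.1] * 5 for _ in range(5)]
image[2][2] = 0.9
saliency_map = local_contrast_saliency(image)
```

On this toy input the isolated bright pixel receives the highest saliency, while pixels in flat regions receive values near zero. Because only a local window is examined, the sketch also shows why such methods can miss the global structure of a scene.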

Each type of method has its disadvantages. Pixel-based methods may neglect the global associations within a given scene. Frequency-based models are weak at suppressing the effect of noise in background regions. As for region-based methods, the main drawback is that they overlook the details of image content and lose information on the relationships among segments.

Learning-based saliency models are pixel-based in general. A commonly used strategy is to perform feature extraction first and then apply a machine learning model to obtain the final saliency maps. Two approaches have been proposed to extract features. One is to use hand-designed features. For example, Judd et al. [5] proposed a pixel-based model that considers 33 hand-designed features and generates saliency maps with the aid of Support Vector Machines (SVMs). Another saliency model, introduced by Xu et al. [6], acquires 20 hand-designed features at the pixel, object and semantic levels. Their model also utilizes SVMs to produce the final saliency maps. The other feature extraction approach is to employ machine learning techniques. For example, a multi-layer sparse network is used as the feature extractor in Shen’s saliency model [7]. A hand-designed feature extractor is precise and effective at picking out the features that a designer deems useful. However, it may neglect some detailed, important and useful information. Using machine learning as a feature extractor has the advantage of acquiring important information that can easily be overlooked by human beings. Nonetheless, it may also extract unwanted and useless features, or even features with negative effects.
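
The two-stage strategy described above, feature extraction followed by a learned classifier, can be sketched in plain Python. This is only an illustration under stated assumptions: the two features (intensity and 4-neighbor contrast) are hypothetical stand-ins, not the 33 features of Judd et al. [5] or the 20 features of Xu et al. [6]; the linear SVM is trained with a simple subgradient descent on the hinge loss rather than a production solver; and the 5x5 image and its labels are toy data.

```python
def extract_features(image, y, x):
    """Return two illustrative features for pixel (y, x): its intensity and
    its absolute contrast with the mean of its 4-connected neighbors."""
    height, width = len(image), len(image[0])
    neighbors = [
        image[ny][nx]
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
        if 0 <= ny < height and 0 <= nx < width
    ]
    surround = sum(neighbors) / len(neighbors) if neighbors else image[y][x]
    return [image[y][x], abs(image[y][x] - surround)]


def decision_score(weights, bias, features):
    """Linear decision function w . f + b."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def train_linear_svm(samples, labels, epochs=200, learning_rate=0.1, reg=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge
    loss. Labels must be +1 (salient) or -1 (non-salient)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            violated = label * decision_score(weights, bias, features) < 1
            # Shrink the weights (gradient step on the L2 regularizer).
            weights = [w * (1 - learning_rate * reg) for w in weights]
            if violated:
                # Hinge-loss subgradient step for a margin violation.
                weights = [w + learning_rate * label * f
                           for w, f in zip(weights, features)]
                bias += learning_rate * label
    return weights, bias


def predict_saliency_map(image, weights, bias):
    """Label each pixel 1 (salient) or 0 by the sign of its decision score."""
    return [
        [1 if decision_score(weights, bias, extract_features(image, y, x)) > 0 else 0
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]


# Toy training data: a dark 5x5 image whose center pixel is marked salient.
image = [[0.1] * 5 for _ in range(5)]
image[2][2] = 0.9
samples = [extract_features(image, y, x) for y in range(5) for x in range(5)]
labels = [1 if (y, x) == (2, 2) else -1 for y in range(5) for x in range(5)]

weights, bias = train_linear_svm(samples, labels)
predicted = predict_saliency_map(image, weights, bias)
```

With this toy data, the classifier is expected to mark only the bright center pixel as salient. In practice, an established SVM library and many labeled images would replace the hand-written training loop, and the quality of the result would depend heavily on which features the designer chose.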

Meanwhile, machine learning has been developing rapidly over the past decade. It has shown promising performance in the field of computer vision, including saliency detection. Machine learning models are usually used as classifiers, such as the SVMs in Judd’s work [5]. Other machine learning methods like neural
