Convolutional Neural Networks and Potential Regions of Interest: An Efficient Saliency Detection Model
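This paper examines the use of convolutional neural networks (CNNs) for saliency detection, in particular for detecting salient targets in natural scenes. Saliency detection is a challenging computer vision task whose goal is to model the human ability to perceive the attractive or salient parts of an image. The authors, Yu Hu, Zhen Liang, Zheru Chi, and Hong Fu, propose a method that combines a CNN with a potential Region-Of-Interest (p-ROI) model to improve the accuracy and performance of saliency detection.

The CNN is chosen because it handles image features and local contrast well: it learns complex visual features directly from raw pixel data. Through a series of convolutional, pooling, and fully connected layers, it captures the spatial structure and texture of an image, which is essential for separating salient objects from the background.

The potential Region-Of-Interest model, by contrast, emphasizes global information. It determines the salient regions of the whole image by validating whether each candidate region exhibits saliency characteristics. This region-level validation complements the local analysis of the CNN, so that the result is not limited to local features but also takes the overall layout and context into account.

Experimental results show that combining the CNN with the p-ROI model improves the quality of the saliency maps, indicating that the two components complement each other during detection. The integration is intended to make the model more robust and accurate, bringing it closer to human saliency detection across complex natural scenes.

Keywords: saliency map, saliency detection, machine learning, convolutional neural networks. The paper was published in 2015 at the 11th International Conference on Natural Computation (ICNC'15). It shows how neural network techniques can be used within a deep learning framework to improve saliency detection algorithms.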
978-1-4673-7678-5 ©2015 IEEE 154
2015 11th International Conference on Natural Computation (ICNC'15)
A Combined Convolutional Neural Network and
Potential Region-Of-Interest Model for
Saliency Detection
Yu Hu (a), Zhen Liang (a), Zheru Chi (a), and Hong Fu (a,b)
(a) Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
(b) Department of Computer Science, Chu Hai College of Higher Education, Hong Kong SAR, China
Abstract-Building a saliency detection model that approaches human
performance is a challenging research topic. In this paper, a new
saliency model is proposed to detect saliency in natural scenes by
using a trained convolutional neural network and a region-based
validation method. The convolutional neural network (CNN)
focuses on image details and the local contrast of an image, while
the region-based validation method focuses on global information.
Experimental results show that the two components of the model
are complementary to each other in producing high-quality
saliency maps.
Keywords-saliency map; saliency detection; machine learning;
convolutional neural networks
I. INTRODUCTION
Computational visual attention models have been developed for
decades to reduce unnecessary information in visual content,
because a growing amount of visual content is produced on a
daily basis. Inspired by the human visual attention system, which
selects prominent regions by saliency-based means, saliency
detection models have been developed extensively to imitate the
visual search process and automatically determine the importance
of regions that might attract human attention. With increasing
computational power, new approaches such as deep machine
learning are being employed in saliency modeling. However,
generating saliency maps automatically from visual content
without any prior knowledge remains a challenging task. Most
bottom-up models for producing a saliency map can be categorized
into two classes: learning-based approaches and
non-learning-based approaches.
Non-learning-based saliency models are generally classified
into three types. In the first type, images are treated as
collections of pixels, and every pixel is assigned a saliency
value generated from its contrast with neighboring pixels. A
typical model of this type is the biologically plausible model
introduced by Itti et al. [1], which produces saliency maps from
the intensity, color and orientation of images with
center-surround bias. In the second type, images are first cast
into the frequency domain. For instance, Hou and Zhang [2]
proposed a frequency-based model which uses a log spectrum
representation to generate saliency maps. In the last type,
pre-segmentation steps are performed to treat images as separate
regions. For instance, Fu et al. [3] introduced a region-based
attention model which applies popping-out cutting to segment the
conspicuous regions from the background. Moreover, Liang et al.
[4] proposed a hypergraph structure which describes the spatial
relationships among several segments. In their model, potential
Regions-Of-Interest (p-ROIs) are produced to estimate the
hypothetical ground truth of given scenes.
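The first, pixel-based type can be illustrated with a small sketch. The code below is not the model of Itti et al. [1], which works on several feature channels and scales; it is a simplified, single-scale illustration that uses intensity only. The function names, the `radius` parameter and the toy image are assumptions made for this example.

```python
def local_contrast_saliency(image, radius=1):
    """Assign each pixel the absolute difference between its intensity
    and the mean intensity of its neighbors within `radius`.
    `image` is a list of rows of floats (hypothetical toy format)."""
    height, width = len(image), len(image[0])
    saliency = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbors = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
                if (ny, nx) != (y, x)
            ]
            # A 1x1 image has no neighbors; treat its contrast as zero.
            surround = sum(neighbors) / len(neighbors) if neighbors else image[y][x]
            saliency[y][x] = abs(image[y][x] - surround)
    return saliency


def normalize(saliency):
    """Scale a saliency map so that its largest value is 1."""
    peak = max(max(row) for row in saliency)
    if peak == 0:
        return [row[:] for row in saliency]
    return [[value / peak for value in row] for row in saliency]


# Toy 5x5 grayscale image: one bright pixel on a dark background.
toy_image = [[0.1] * 5 for _ in range(5)]
toy_image[2][2] = 0.9
toy_saliency = normalize(local_contrast_saliency(toy_image))
```

In this toy case the bright pixel receives the highest saliency and its immediate neighbors a small response. Because every value depends only on a 3x3 window, the sketch has no notion of the scene as a whole.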
Each type of method has its disadvantages. Pixel-based methods
may neglect the global association of a given scene.
Frequency-based models are weak at reducing the effect of noise
in background regions. As for region-based methods, the main
drawbacks are overlooking the details of image contents and
losing information on the relationships among segments.
Learning-based saliency models are pixel-based in general.
A commonly used strategy is to perform feature extraction first
and then apply a machine learning model to obtain the final
saliency maps. Two approaches have been proposed to extract
features. One is to use hand-designed features. For example,
Judd et al. [5] proposed a pixel-based model that uses 33
hand-designed features to generate saliency maps with the aid of
Support Vector Machines (SVMs). Another saliency model,
introduced by Xu et al. [6], acquires 20 hand-designed features
at the pixel, object and semantic levels, and also utilizes SVMs
to produce the final saliency maps. The other feature extraction
approach is to employ machine learning techniques. For example,
a multi-layer sparse network is used as the feature extractor in
Shen's saliency model [7]. A hand-designed feature extractor is
precise and effective at picking out the features that its
designer deems useful. Its drawback is that it may neglect some
detailed, important and useful information. Using machine
learning as a feature extractor has the advantage of acquiring
important information which can be easily overlooked by human
beings. Nonetheless, it may also extract unwanted and useless
features, or even features with negative effects.
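The two-stage strategy described above (extract features per pixel, then classify) can be sketched as follows. This is only an illustration under stated assumptions: the two features (intensity and 3x3 local contrast) are hypothetical stand-ins for the 33 features of [5], and a plain perceptron is substituted for the SVM so that the example needs no external library. All function names and the toy data are invented for this sketch.

```python
def extract_features(image, y, x):
    """Hypothetical hand-designed features for one pixel: its intensity
    and its contrast against the mean of its 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    neighbors = [
        image[ny][nx]
        for ny in range(max(0, y - 1), min(height, y + 2))
        for nx in range(max(0, x - 1), min(width, x + 2))
        if (ny, nx) != (y, x)
    ]
    surround = sum(neighbors) / len(neighbors) if neighbors else image[y][x]
    return [image[y][x], abs(image[y][x] - surround)]


def train_perceptron(samples, labels, max_epochs=100):
    """Train a linear classifier (a perceptron standing in for the SVM).
    Labels are +1 (salient) or -1 (background).
    Returns (weights, bias, converged)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in zip(samples, labels):
            score = sum(w * f for w, f in zip(weights, features)) + bias
            if label * score <= 0:  # misclassified: shift the boundary
                weights = [w + label * f for w, f in zip(weights, features)]
                bias += label
                mistakes += 1
        if mistakes == 0:
            return weights, bias, True
    return weights, bias, False


def predict_mask(image, weights, bias):
    """Classify every pixel; 1 marks a pixel predicted as salient."""
    mask = []
    for y in range(len(image)):
        row = []
        for x in range(len(image[0])):
            features = extract_features(image, y, x)
            score = sum(w * f for w, f in zip(weights, features)) + bias
            row.append(1 if score > 0 else 0)
        mask.append(row)
    return mask


# Toy 6x6 image with a bright 2x2 block, plus its ground-truth mask.
toy_image = [[0.1] * 6 for _ in range(6)]
truth_mask = [[0] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        toy_image[y][x] = 0.9
        truth_mask[y][x] = 1

samples = [extract_features(toy_image, y, x) for y in range(6) for x in range(6)]
labels = [1 if truth_mask[y][x] else -1 for y in range(6) for x in range(6)]
weights, bias, converged = train_perceptron(samples, labels)
predicted_mask = predict_mask(toy_image, weights, bias)
```

The toy data is linearly separable on intensity alone, so the perceptron is expected to converge and reproduce the ground-truth mask. On real images neither the features nor the classifier used here would be adequate.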
Meanwhile, machine learning has been developing rapidly over
the past decade. It has shown promising performance in the
field of computer vision, including saliency detection. Machine
learning models are usually used as classifiers, such as the
SVMs in Judd's work [5]. Other machine learning methods like neural