A Robust Deep Learning Method for Salient Object Detection

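The research paper "Robust salient object detection for RGB images" studies how to make salient object detection robust on RGB images. It notes that although existing supervised salient object detection models have improved markedly on benchmark datasets, most of them assume that an image contains at least one salient object, which can degrade results on complex real-world scenes. To address this, the paper introduces salient object existence prediction, i.e., deciding inside the deep network whether an image contains a salient object, in order to learn a better salient object detection model. For the dense salient object detection task, the model compensates for lost spatial information by mixing and upsampling high-level semantic features in a top-down manner, combining high-level features with saliency existence information. The model can also recognize non-salient images that contain no salient object.

The contribution of the paper is a new salient object detection method intended to improve robustness and adaptability on RGB images. The key points are:

1. **Salient object detection**: An important problem in computer vision whose goal is to identify the objects in an image that attract the most attention. It is a key step in many applications, such as image summarization, video editing, and human-computer interaction.
2. **Challenging the assumption**: Existing salient object detection models usually assume that every image contains at least one salient object. Real-world images may violate this assumption, and model performance drops on such images.
3. **Salient object existence prediction**: The proposed method adds a prediction step that decides whether an image contains a salient object. This extra prediction layer strengthens the model's ability to recognize images without salient objects and thereby improves overall detection accuracy.
4. **Deep network integration**: Using a deep network for feature extraction and learning, the model progressively mixes and upsamples high-level semantic features, which helps recover the spatial information lost during downsampling.
5. **Spatial information recovery**: In the dense salient object detection task, the model progressively recovers spatial information by combining high-level features with saliency existence information, which is essential for accurately identifying and localizing salient objects.
6. **Improved robustness**: The paper aims to build a model that adapts to a wide range of complex scenes, including images without salient objects, which improves robustness in practical applications.
7. **Evaluation and benchmarks**: The experimental section may involve tests on standard benchmark datasets to demonstrate the improvement over existing techniques, typically through comparative analysis and quantitative evaluation.

In summary, the paper proposes a new framework that improves the structure and training strategy of salient object detection models so that they better handle the variety of images encountered in the real world. The approach may help advance salient object detection, particularly in complex and variable image environments.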

Robust salient object detection for RGB images

features for salient object detection. Zhang et al. [58] propose a controlled bi-directional passing of features between shallow and deep layers to obtain accurate predictions. Deng et al. [9] develop a recurrent residual refinement network for saliency map refinement that alternately incorporates shallow-layer and deep-layer features. Qin et al. [36] propose BASNet, an end-to-end predict-refine architecture. It captures both large-scale and fine structures, e.g., thin regions and holes, and produces salient object detection maps with clear boundaries. Wu et al. [52] propose a cascaded partial decoder framework, which discards low-level features to reduce the complexity of deep aggregation models and uses the generated, relatively precise attention map to refine high-level features and improve performance. Li et al. [27] employ a multi-scale cascade structure and a refinement module to filter out errors, which better consolidates contextual information and intermediate saliency priors.

Although the aforementioned approaches employ powerful CNNs and achieve remarkable success in salient object detection, they produce unsatisfactory results on non-salient images, as shown in Fig. 1. As a result, there is still considerable room for performance improvement.

2.2 Salient object existence prediction

Wang et al. [45] exploit hand-crafted global features from multiple sources of saliency information to directly predict the existence and the position of the salient object in web images with a random forest. The purpose of that work differs from ours: it focuses more on locating the salient object, and its result is expressed as a bounding box enclosing the salient object region. Zhang et al. [56] investigate not only the existence but also the number of salient objects based on holistic cues. If the image contains no salient objects, an all-black saliency map is generated directly and salient object detection is not performed. Jiang et al. [19] propose a supervised learning approach that jointly addresses salient object detection and existence prediction, using a structural SVM framework to predict both image-level existence labels and pixel-level saliency values. Hou et al. [17] predict the saliency existence of the input image by introducing another branch into a salient object detection network with short connections. That work does not train the two tasks jointly; it only improves the accuracy of salient object existence prediction on top of the salient object detection network.
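The all-black fallback described for Zhang et al. [56] can be illustrated with a small gating function. This is only a sketch of the general idea, not the implementation from any cited work: the function name `gate_saliency_map`, the 0.5 threshold, and the representation of a saliency map as a nested list of floats are all assumptions made for illustration.

```python
from typing import List

SaliencyMap = List[List[float]]


def gate_saliency_map(
    existence_prob: float,
    saliency_map: SaliencyMap,
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> SaliencyMap:
    """Keep the predicted map if the image is judged to contain a salient
    object; otherwise return an all-black (all-zero) map of the same shape."""
    if existence_prob >= threshold:
        # Copy rows so the caller's map is not aliased.
        return [row[:] for row in saliency_map]
    return [[0.0] * len(row) for row in saliency_map]


# Hypothetical 2x2 prediction used only to exercise the function.
pred = [[0.1, 0.8], [0.7, 0.2]]
kept = gate_saliency_map(0.9, pred)   # image judged salient: map is kept
blank = gate_saliency_map(0.2, pred)  # image judged non-salient: all zeros
```

A hard gate like this makes the final map depend entirely on the existence decision, so an existence error blanks an otherwise good prediction.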

In this paper, we focus on both recognizing saliency existence and locating salient objects. By incorporating image-level labels, better pixel-level salient object detection performance can be achieved.
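One common way to combine image-level and pixel-level supervision is a weighted sum of two losses. The excerpt does not give the paper's actual objective, so the sketch below is an assumption throughout: it uses binary cross-entropy for both terms, a hypothetical weight `lam`, and derives the image-level existence label from whether the ground-truth mask contains any foreground pixel.

```python
import math
from typing import List

Grid = List[List[float]]


def binary_cross_entropy(p: float, y: float, eps: float = 1e-7) -> float:
    """BCE for a single probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def joint_loss(
    pixel_probs: Grid,
    pixel_labels: Grid,
    existence_prob: float,
    lam: float = 1.0,  # assumed weight of the image-level term
) -> float:
    """Mean pixel-level BCE plus lam times the image-level existence BCE."""
    flat_p = [p for row in pixel_probs for p in row]
    flat_y = [y for row in pixel_labels for y in row]
    pixel_term = sum(
        binary_cross_entropy(p, y) for p, y in zip(flat_p, flat_y)
    ) / len(flat_p)
    # Assumption: an image is "salient" if its mask has any foreground pixel.
    existence_label = 1.0 if any(y > 0.5 for y in flat_y) else 0.0
    image_term = binary_cross_entropy(existence_prob, existence_label)
    return pixel_term + lam * image_term


# Hypothetical non-salient image: the ground-truth mask is empty.
pixel_probs = [[0.2, 0.1], [0.3, 0.2]]
empty_mask = [[0.0, 0.0], [0.0, 0.0]]
loss_absent = joint_loss(pixel_probs, empty_mask, existence_prob=0.1)
loss_present = joint_loss(pixel_probs, empty_mask, existence_prob=0.9)
```

With an empty mask, predicting that a salient object is present is penalized by the image-level term, so `loss_present` is larger than `loss_absent` even though the pixel-level predictions are identical.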

2.3 Multi-task models

Salient object detection aims to identify the most visually distinctive objects or regions in an image and then segment them out from the background. Semantic segmentation, image classification, salient object contour detection, subitizing, and salient object existence prediction have been explored in recent years to guide salient object detection. Li et al. [26] set up a multi-task learning scheme for exploring the intrinsic correlations between saliency detection and image semantic segmentation. Cholakkal et al. [8] propose a framework for top-down salient object detection that incorporates a tightly coupled image classification module. Wang et al. [43] propose to use image-level tags as weak supervision for learning to predict pixel-level saliency maps, addressing the problem that training DNNs requires costly pixel-level annotations. Li et al. [23] propose to combine a coarse salient object activation map from the classification network with saliency maps generated by unsupervised methods as pixel-level annotation, and to train fully convolutional networks for salient object detection under the supervision of these noisy annotations. Wang et al. [22] design a deep multi-scale refinement network for both salient region detection and salient object contour detection. Zhuge et al. [61] propose a fully convolutional network that integrates multi-level convolutional features recurrently with the guidance of object boundary information. He et al. [14] detect salient objects with the aid of subitizing. Jiang et al. [19] propose to jointly train the salient object detection and existence prediction problems with a structural SVM framework. Li et al. [28] graft a salient object detection decoder onto an existing contour detection network to form a multi-task network architecture without using any manually labeled salient object masks. Although it implements joint training of two tasks, both are pixel-level segmentation tasks, so it lacks richer multi-modal information. Hou et al. [16] aim at solving pixel-wise binary problems, including salient object detection, skeleton extraction, and edge detection, by introducing a horizontal cascade encoder architecture. However, this general structure cannot handle multiple tasks at the same time and does not consider the complementarity between tasks. Wang et al. [50] design a neural network with two branches, attention box prediction (ABP) and aesthetics assessment (AA), to crop photographs with the best aesthetic quality. The ABP subnetwork is responsible for inferring the initial cropping, and the AA network determines the final cropping. The ABP task is followed by the AA task, so the two tasks are not learned simultaneously; they only share several convolutional blocks at the bottom of the network.

A CNN model trained end-to-end for both salient object detection and salient object existence prediction remains unexplored in the aforementioned literature.
