Deep-Learning-Driven Image Object Detection and Semantic Segmentation


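"This research paper explores image object detection and semantic segmentation based on convolutional neural networks (CNNs). It proposes an unsupervised co-segmentation algorithm for images that contain multiple foreground objects and whose backgrounds change dramatically. Color edge images are extracted in RGB space for semantic extraction, and the method distinguishes foreground from background by recursively modeling the appearance distributions of pixels and regions. Exploiting the correlations between different image regions strengthens the consistency of the foreground and background models. Experimental results show that a deep convolutional neural network can classify scene images semantically through end-to-end feature learning and can segment them accurately. Keywords include CNN, AdaBoost, and image segmentation."

In this paper, the authors focus on using deep learning, and convolutional neural networks (CNNs) in particular, to address two key problems in image processing: object detection and semantic segmentation. Object detection identifies and localizes specific objects in an image. Semantic segmentation goes a step further and assigns every pixel of the image to an object class or to the background.

The proposed unsupervised co-segmentation algorithm works without the large amounts of labeled data that supervised learning tasks normally require. It analyzes color edge images in RGB space to extract semantic information. A color edge image highlights object boundaries, which helps separate foreground from background. By recursively modeling the appearance distributions of pixels and regions, the algorithm learns the characteristics of different objects and can therefore isolate the foreground objects.

To make the foreground and background models more coherent, the paper introduces a correlation analysis between regions. This strategy takes the relationships among parts of the image into account and helps identify and segment foreground and background more accurately, especially when the background changes dramatically, as in dynamic scenes or complex environments.

The experimental section demonstrates the performance of deep CNNs under end-to-end learning. End-to-end learning lets the network map raw input images directly to the final pixel-level classification, with no hand-designed intermediate steps. In this way the CNN learns and extracts features automatically, performs semantic classification, and achieves accurate segmentation of scene images.

Among the keywords, the CNN is a core deep learning architecture that is particularly well suited to image processing because it learns multi-level image features automatically. AdaBoost is an ensemble learning method commonly used for classification; in this paper it may serve as a component for improving model performance.

In summary, the paper proposes an unsupervised co-segmentation algorithm that uses a CNN together with region-correlation enhancement to achieve accurate object detection and semantic segmentation of multi-object images with complex backgrounds. This research, for autonomous driving…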
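The idea of recursively modeling foreground and background appearance can be illustrated with a small sketch. The excerpt does not say which appearance model the authors use, so the quantized color histograms below, the function names `quantize_colors` and `refine_foreground`, and the parameters `bins` and `iterations` are all assumptions made for illustration, not the paper's method. The loop alternates between estimating one color distribution per class and relabeling each pixel by the more likely class.

```python
import numpy as np

def quantize_colors(image, bins=8):
    """Map each uint8 RGB pixel to a single colour-histogram bin index."""
    q = (image.astype(np.int64) * bins) // 256  # per-channel bin in [0, bins)
    return q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]

def refine_foreground(image, init_mask, bins=8, iterations=5, eps=1e-6):
    """Alternate between fitting fg/bg histograms and relabelling pixels.

    Hypothetical stand-in for an appearance model; not taken from the paper.
    """
    idx = quantize_colors(image, bins)
    mask = init_mask.astype(bool)
    n_bins = bins ** 3
    for _ in range(iterations):
        if mask.all() or not mask.any():
            break  # one of the two models would have no samples
        # Estimate a colour distribution for each class from the current labels.
        fg_hist = np.bincount(idx[mask], minlength=n_bins) + eps
        bg_hist = np.bincount(idx[~mask], minlength=n_bins) + eps
        fg_prob = fg_hist / fg_hist.sum()
        bg_prob = bg_hist / bg_hist.sum()
        # Relabel every pixel with the class under which its colour is more likely.
        new_mask = fg_prob[idx] > bg_prob[idx]
        if np.array_equal(new_mask, mask):
            break  # labels are stable
        mask = new_mask
    return mask
```

Starting from a rough box around an object, a call such as `refine_foreground(image, rough_box_mask)` tightens the mask whenever the object's colors differ from the background's. A real co-segmentation system would also share the models across several images and add spatial smoothness, which this sketch omits.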

DEEP LEARNING & NEURAL COMPUTING FOR INTELLIGENT SENSING AND CONTROL

Image object detection and semantic segmentation based on convolutional neural network

Laigang Zhang · Zhou Sheng · Yibin Li · Qun Sun · Ying Zhao · Deying Feng

Received: 24 March 2019 / Accepted: 23 August 2019

© Springer-Verlag London Ltd., part of Springer Nature 2019

Abstract

In this paper, an unsupervised co-segmentation algorithm is proposed that can be applied to images containing multiple foreground objects and dramatically changing backgrounds. The color edge image in RGB space is extracted for semantic extraction. The method effectively distinguishes foreground from background by recursively modeling the appearance distribution of pixels and regions. The coherence of the image foreground and background models is enhanced by using the correlation between different image regions and within the image interior. Experimental results show that a deep convolutional neural network can effectively realize semantic classification of scene images through end-to-end feature learning and achieve accurate semantic segmentation of scene images.
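A color edge image in RGB space can be sketched in a few lines of NumPy. The abstract does not name the edge operator, so the finite-difference gradient and the channel-wise maximum below are assumptions, and `color_edge_map` is a hypothetical helper name. The point of working per channel is that boundaries between colors of similar brightness, which vanish in a grayscale conversion, still produce a response.

```python
import numpy as np

def color_edge_map(image):
    """Edge strength of an RGB image: largest per-channel gradient magnitude.

    Illustrative operator only; the paper's actual edge detector is not
    specified in this excerpt.
    """
    img = image.astype(np.float64)
    # Finite-difference derivatives along rows (axis 0) and columns (axis 1),
    # computed independently for each colour channel.
    gy, gx = np.gradient(img, axis=(0, 1))
    magnitude = np.sqrt(gx ** 2 + gy ** 2)  # shape (H, W, 3)
    return magnitude.max(axis=2)            # shape (H, W)
```

For example, an image whose left half is pure red and whose right half is pure green has a constant channel mean, so a gradient computed on that mean is zero everywhere, while `color_edge_map` responds along the red-green boundary.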

Keywords CNN · AdaBoost · Image object detection · Semantic segmentation

1 Introduction

Scene understanding is a hot topic in computer vision and artificial intelligence. Its research results have been widely used in fields such as robot navigation, network search, security monitoring, and medical care. Several branch tasks of scene understanding, such as target detection and image semantic segmentation, have seen breakthroughs in recent years, but many shortcomings remain. For example, it is difficult to obtain reliable and robust features for dynamic target classification in a scene because of the deformation of the target itself and the interference of external factors [1].

Object detection is a broad problem in computer and machine vision, and complex backgrounds add further challenges and sources of error. Many object detection algorithms have difficulty coping with the influence of occlusion and pixel moment. Hence, a highly robust algorithm, ORBTRIAN, is proposed for low-resolution images, and a gradient enhancement machine learning algorithm is used to detect ORB. This work has been compared with techniques based on AdaBoost and SURF. Analysis shows that the performance of the earlier model improved to 3.8%. The feature points extracted by the ORB method are filtered further to reduce processing: only those points farthest from its centroid triangle are selected, and only one feature point is selected. The result was about 28, much faster than earlier calculations [2, 3]. Tree-based GB

Corresponding author: Zhou Sheng, yushin@stu.cque.edu.cn

Laigang Zhang, zhanglaigang@lcu.edu.cn

Yibin Li, liyb@sdu.edu.cn

Qun Sun, sunqun@lcu.edu.cn

Ying Zhao, zhaoying@lcu.edu.cn

Deying Feng, fengdeying@lcu.edu.cn

School of Mechanical and Automotive Engineering, Liaocheng University, Liaocheng, Shandong, China

School of Management, Wuhan Donghu University, Wuhan, China

Shandong University, Jinan, Shandong, China


Neural Computing and Applications

https://doi.org/10.1007/s00521-019-04491-4
