A Survey of Deep-Learning-Driven Image Semantic Segmentation: Current State and Challenges

Updated 2024-07-15 · PDF, 1.99 MB


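This document surveys the state of research, open challenges, and outlook for digital image semantic segmentation algorithms, a key branch of computer vision that is essential to applications such as autonomous driving, unmanned aerial vehicle (UAV) systems, and virtual or augmented reality systems. With the rise of deep learning, image semantic segmentation has become an active research topic.

In the deep learning era, semantic segmentation relies mainly on neural network models, in particular convolutional neural networks (CNNs). CNNs combine strong feature extraction with local receptive fields and perform well on both image classification and pixel-level labelling. Architectures such as FCN (Fully Convolutional Network), U-Net, SegNet, and DeepLab replace fully connected layers with convolutional layers and add upsampling, which enables end-to-end learning from an image to per-pixel class predictions.

The article outlines the mainstream image semantic segmentation algorithms:

1. **Fully Convolutional Network (FCN)**: among the first models to apply a convolutional neural network end to end to pixel-level prediction. It extends a classification network into a segmentation network and substantially reduces computation compared with classifying patches one by one.
2. **U-Net**: combines a downsampling (encoder) path with an upsampling (decoder) path and uses skip connections to preserve spatial detail, which makes it well suited to tasks such as medical image segmentation.
3. **SegNet**: an encoder-decoder network whose decoder upsamples using the max-pooling indices recorded by the encoder, producing accurate pixel-level labels.
4. **DeepLab** series: includes DeepLab v1, v2, v3, and v3+, which use atrous (dilated) convolution and multi-scale feature fusion to improve accuracy, particularly on large images and complex scenes.
5. **Mask R-CNN**: combines a region proposal network (RPN) with an FCN-style mask branch and performs both object detection and instance segmentation.

Despite the progress brought by deep learning, image semantic segmentation still faces challenges such as small objects, occlusion, illumination changes, and objects that resemble the background. To address them, researchers are exploring new network architectures, training strategies such as transfer learning and self-supervised learning, and methods that fuse multimodal information such as data from multiple sensors. Future research directions may include more efficient computation, lightweight models, interpretability, and real-time deployment on edge devices. Continued progress in digital image semantic segmentation will play a key role in improving the performance and user experience of intelligent systems.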
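To make the atrous (dilated) convolution mentioned for the DeepLab series concrete, here is a minimal one-dimensional sketch in pure Python. It is an illustration only: the function name `dilated_conv1d` and the toy signal are assumptions made for this example, and DeepLab itself applies the same idea to two-dimensional feature maps with learned kernels.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D cross-correlation with `dilation - 1` skipped samples between taps."""
    # Number of input samples covered by one output value (the receptive field).
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + k * dilation]
                       for k, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # hypothetical 3-tap difference filter

dense = dilated_conv1d(signal, kernel, dilation=1)   # each output sees 3 samples
atrous = dilated_conv1d(signal, kernel, dilation=2)  # each output sees 5 samples
```

With the same three weights, the dilated version covers a wider input window per output, which is how atrous convolution enlarges the receptive field without adding parameters.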


Digital Image Semantic Segmentation Algorithms: A Survey

where $E_i(c_i : x_i)$ measures the probability that pixel $i$ is labelled $c_i$ under feature $x_i$, and $E_{ij}(c_i : c_j)$ measures the consistency of the labels of two connected pixels.
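The two kinds of energy terms described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the survey's exact formulation: the unary term is taken to be the negative log-probability of the chosen label, the pairwise term is a Potts penalty over 4-connected neighbours, and `crf_energy` and `pairwise_weight` are hypothetical names.

```python
import math


def crf_energy(labels, probs, pairwise_weight=1.0):
    """Energy of a labelling on a 2-D grid: unary terms plus Potts pairwise terms.

    labels[r][c] is the class index of a pixel;
    probs[r][c][k] is the probability of class k at that pixel.
    """
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            # Unary term (assumed form): low when the label is probable.
            energy += -math.log(probs[r][c][labels[r][c]])
            # Pairwise term (assumed Potts form): penalise right and down
            # neighbours that carry a different label.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and labels[rr][cc] != labels[r][c]:
                    energy += pairwise_weight
    return energy


probs = [[[0.9, 0.1], [0.8, 0.2]],
         [[0.7, 0.3], [0.4, 0.6]]]
smooth = crf_energy([[0, 0], [0, 0]], probs)  # all pixels share one label
mixed = crf_energy([[0, 0], [0, 1]], probs)   # one pixel disagrees with its neighbours
```

In this toy grid the smooth labelling has the lower energy even though the bottom-right pixel individually prefers class 1, because the two pairwise penalties outweigh the unary gain.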

In [11] a new high-order conditional random field was proposed. The model combined object detection results based on global shape features with the pairwise conditional random field model. Object detectors and a foreground-background segmentation algorithm were used to obtain object regions in the image, and new high-order energy terms were defined on those regions. The resulting model was a weighted mixture of the high-order energy terms and the pairwise conditional random field model, and its optimal solution was the final semantic segmentation result of the image. The new high-order energy term is defined as:

$$E(x_d) = -|x_d| \max\bigl(0,\ (1 - R)\max(0,\ C_d - C_t)\bigr) \qquad (7)$$

$$R = \dots \qquad (8)$$

where $x_d$ is the set of random label variables corresponding to all pixels that make up a single object region and $C_t$ is the threshold; adjusting this threshold controls the final recognition accuracy. Wang et al. [12] proposed an improved image segmentation algorithm based on a robust high-order conditional random field model. For a given label set, the max-flow/min-cut algorithm was applied to obtain a locally optimal solution, which was then used to update the node labels, and the expansion algorithm was run on the unlabelled nodes. The flows and edges of the graph were updated dynamically at each iteration, so the time per iteration decreased rapidly. The experimental results showed faster convergence for the same segmentation quality. The image semantic segmentation algorithms based on conditional random fields are shown in Table 2.

Table 2. Comparison of algorithms based on conditional random field (%)

| Author | Algorithm features | Dataset | Segmentation result |
|---|---|---|---|
| ZHANG [8] | CRF, dense features, high-order potential energy | MSRC-21 | 75.8 (mA) |
| ZHANG [9] | CRF, Joint-boosting algorithm | MSRC-21 | 71.6 (mA) |
| ZUO [10] | CRF, interactive | Self-built dataset | 95.3 (mA) |
| MAO [11] | CRF, high-order energy items | MSRC-21 | 72.2 (PA) |
| WANG [12] | CRF, Maximum Flow-Minimum Cut | MSRC-21 | 0.7 s (time) |
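The max-flow/min-cut step used by Wang et al. [12] can be illustrated with a generic sketch. The code below is a plain Edmonds-Karp implementation on a toy graph, not the dynamically updated algorithm of [12]; the node layout, capacities, and the function name `max_flow_min_cut` are assumptions made for this example. Four pixels in a row (nodes 1 to 4) are linked to a source (node 0, one label) and a sink (node 5, the other label), with neighbouring pixels joined by smoothness edges.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max flow; returns (flow value, set of nodes on the source side)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if residual[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push flow through it.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
    # Nodes still reachable from the source in the residual graph form one side of the cut.
    source_side = {v for v in range(n) if parent[v] != -1}
    return flow, source_side


capacity = [[0] * 6 for _ in range(6)]
for pixel, (to_source, to_sink) in enumerate([(5, 1), (4, 1), (1, 4), (1, 5)], start=1):
    capacity[0][pixel] = to_source  # hypothetical affinity to the source label
    capacity[pixel][5] = to_sink    # hypothetical affinity to the sink label
for a, b in [(1, 2), (2, 3), (3, 4)]:
    capacity[a][b] = capacity[b][a] = 1  # smoothness between neighbouring pixels

flow, source_side = max_flow_min_cut(capacity, source=0, sink=5)
labels = [1 if pixel in source_side else 0 for pixel in range(1, 5)]
```

Pixels that end up on the source side of the minimum cut receive one label and the remaining pixels receive the other, which is the basic mechanism that graph-cut segmentation builds on.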

Chen et al. [13] proposed a new image semantic segmentation model that builds on low-level segmentation results. First, low-level image blocks were obtained by histogram thresholding and K-means. Then the high-level semantic information of the image was acquired with a bag-of-words model. Finally, the high-level semantic information was used together with a support vector machine to relabel the image blocks and obtain the final semantic segmentation result. In [14] an image semantic segmentation algorithm based on texture primitive blocks was proposed. First, texture primitive features were extracted, and k-means and k-d trees were used to obtain the texture primitive block segmentation map of the image. Then the texture primitive blocks were mapped to semantic labels using image semantic learning and prediction methods based on a support vector machine.
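The K-means step that both methods use to form low-level blocks can be sketched as follows. This is a simplified illustration: it clusters scalar pixel intensities with Lloyd's algorithm, whereas [13] and [14] operate on richer features, and the function name `kmeans_1d`, the evenly spaced initialisation, and the sample intensities are assumptions made for this example.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalar values; returns (centroids, assignments)."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centroids evenly spaced over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    assignments = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        assignments = [min(range(k), key=lambda j: abs(v - centroids[j]))
                       for v in values]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(values, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = sum(members) / len(members)
    return centroids, assignments


pixels = [12, 15, 10, 200, 210, 205, 14, 198]  # hypothetical grey levels
centroids, blocks = kmeans_1d(pixels, k=2)
```

Each cluster index in `blocks` marks which low-level block a pixel belongs to; a classifier such as a support vector machine would then assign a semantic label to each block.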

The two papers share a similar idea. First the image is subdivided, and then the high-level semantic information of the image is obtained. Then the support vector machine is
