Hierarchical Weakly Supervised Learning for
Residential Area Semantic Segmentation in
Remote Sensing Images
Libao Zhang, Member, IEEE, Jie Ma, Xinran Lv, and Donghui Chen
Abstract— Residential-area segmentation is one of the most
fundamental tasks in the field of remote sensing. Recently, fully
supervised convolutional neural network (CNN)-based methods
have shown superiority in the field of semantic segmentation.
However, a serious problem for those CNN-based methods is that
pixel-level annotations are expensive and laborious. In this study,
a novel hierarchical weakly supervised learning (HWSL) method
is proposed to realize pixel-level semantic segmentation in remote
sensing images. First, a weakly supervised hierarchical saliency
analysis is proposed to capture a sequence of class-specific
hierarchical saliency maps by computing the gradient maps with
respect to the middle layers of the CNN. Then, superpixels
and low-rank matrix recovery are introduced to highlight the
common salient areas and fuse class-specific saliency maps with
adaptive weights. Finally, a subtraction operation between class-
specific saliency maps is conducted to generate hierarchical
residual saliency maps and fulfill residential-area segmentation.
Comprehensive evaluations with two remote sensing data sets
and comparison with seven methods validate the superiority of
the proposed HWSL model.
Index Terms— Deep learning, remote sensing, saliency analysis,
semantic segmentation, weakly supervised.
I. INTRODUCTION
The semantic segmentation of residential areas, i.e., annotating
residential areas pixelwise in remote sensing
images (RSIs) [1], is a fundamental task in the field of
remote sensing. During the past few years, deep learning,
which can automatically discover problem-specific features,
has received extensive attention in image
segmentation and object detection tasks.
In particular, convolutional neural networks (CNNs) [2]
are the most widely used deep-learning architecture. They
typically require large numbers of training images to
avoid overfitting and to improve the generalization
ability of the network. With the development of CNNs,
the accuracy of the segmentation task has been boosted
significantly. Shelhamer et al. [3] built fully convolutional
networks (FCNs), which take input of arbitrary size and
produce correspondingly sized output with efficient inference
and learning. Badrinarayanan et al. [4] designed a trainable
segmentation engine, which consists of an encoder network
and a corresponding decoder network followed by a pixelwise
classification layer. Chen et al. [5] brought together meth-
ods from deep CNNs (DCNNs) and probabilistic graphical
models for addressing the task of pixel-level classification.
Ronneberger et al. [6] designed a segmentation architecture
consisting of a contracting path to capture context and a
symmetric expanding path that enables precise localization.
As these frameworks are all optimized with pixel-level
loss functions, their performance depends on a large
amount of annotated data. Therefore, a common bottleneck
of these methods is that they operate
in a fully supervised manner, i.e., they typically require
large numbers of pixel-level annotations in the training phase. The
process is inevitably expensive, laborious, and also prone to
error. Because of the complicated surface features and rich
background interference in RSIs, labeling them pixel by
pixel is even more labor-intensive.
Weakly supervised annotations, in the form of bounding
boxes (approximate location) and image-level labels (whether
the input image contains objects), are much easier to acquire
compared to precise pixel-level annotations. Weakly super-
vised methods, which rely on weakly supervised annota-
tions, can therefore be viewed as the means to address
the limitation of fully supervised CNN-based approaches.
Simonyan et al. [7] utilized gradient maps to achieve
object localization in natural scenes in a weakly supervised
way. However, the gradient saliency maps of RSIs are less
satisfactory, since the gray levels of RSIs vary drastically.
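As an illustration of this gradient-based saliency idea, a minimal PyTorch sketch is given below. It is not the implementation of [7] or of this letter: the choice of VGG-16, the preprocessing assumptions, and the function name class_saliency_map are our own for exposition.

    import torch
    from torchvision import models

    # Illustrative pretrained classifier (any image-level classification CNN works).
    model = models.vgg16(pretrained=True).eval()

    def class_saliency_map(image, class_idx):
        # image: normalized tensor of shape (1, 3, H, W)
        image = image.clone().requires_grad_(True)
        score = model(image)[0, class_idx]         # pre-softmax class score
        score.backward()                           # gradient of the score w.r.t. the input
        saliency, _ = image.grad.abs().max(dim=1)  # channel-wise maximum of |gradient|
        return saliency.squeeze(0)                 # (H, W) class saliency map

The saliency map is simply the magnitude of the class-score gradient at each input pixel, which indicates how strongly that pixel influences the predicted class.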
In this work, a novel hierarchical weakly supervised learn-
ing (HWSL) model is proposed to realize semantic seg-
mentation with only image-level annotations. Here, image-level tags
are used to train a classification CNN, which is also respon-
sible for generating the class-specific gradient hierarchical
saliency maps (CS-GHSMs) with respect to middle convo-
lutional layers. As the layers go deeper, those CS-GHSMs
can progressively capture local and global salient features,
which are beneficial to segmentation tasks [1]. Multiscale
features are then integrated by fusing the CS-GHSMs
with the help of superpixels and low-rank matrix recovery.
Finally, a subtraction operation between foreground and back-
ground fused saliency maps is implemented to suppress the
background.
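A minimal sketch of taking class-specific gradient saliency maps at middle convolutional layers is shown below, again assuming PyTorch and VGG-16. The tapped layer indices, the bilinear upsampling, and the final foreground-minus-background subtraction are illustrative assumptions; the superpixel and low-rank fusion steps of HWSL are omitted here.

    import torch
    import torch.nn.functional as F
    from torchvision import models

    model = models.vgg16(pretrained=True).eval()
    # Example middle layers to tap (our own choice, not the letter's configuration).
    taps = [model.features[15], model.features[22], model.features[29]]

    def hierarchical_saliency(image, class_idx):
        feats = []
        hooks = [m.register_forward_hook(lambda _m, _i, out: feats.append(out))
                 for m in taps]
        score = model(image)[0, class_idx]          # class score from image-level training
        grads = torch.autograd.grad(score, feats)   # gradients w.r.t. middle feature maps
        for h in hooks:
            h.remove()
        maps = []
        for g in grads:
            m = g.abs().max(dim=1, keepdim=True)[0]  # channel-wise maximum
            m = F.interpolate(m, size=image.shape[-2:],
                              mode="bilinear", align_corners=False)
            maps.append(m.squeeze())                 # one saliency map per tapped layer
        return maps

    # Illustrative background suppression by subtraction between fused maps:
    # residual = torch.clamp(foreground_map - background_map, min=0)

Deeper taps yield coarser, more global responses, while shallower taps keep finer spatial detail, which is the hierarchy exploited by the proposed fusion.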
The major contributions are as follows.
1) A novel weakly supervised semantic segmentation
method is proposed to generate accurate saliency maps
in RSIs using image-level labels rather than pixel-level
labels, which saves considerable labeling effort.