Diverse Deep Supervision: Tackling the Distracted Supervision Challenge in Semantic Edge Detection

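"This article summarizes recent work on semantic edge detection (SED), which jointly extracts edges and the category each edge belongs to, and which is useful in applications such as semantic segmentation, object proposal generation, and object recognition. SED faces a key challenge: it must combine two distinct supervision targets, fine edge localization and high-level semantic recognition. Existing state-of-the-art SED methods have had difficulty using deep supervision to improve performance. Deep supervision typically attaches additional supervision signals to different layers of a network to help the model learn better feature representations. In SED, however, the lower layers are suited to producing category-agnostic edges while the higher layers must recognize category-aware semantic edges, so the supervision signals become distracted and conflict with one another. To address this, the paper proposes a novel fully convolutional neural network architecture that uses diverse deep supervision (DDS) within a multi-task framework. The design splits the network into two parts: the lower layers generate category-agnostic edges, where coarser supervision helps the model learn generic edge features, while the higher layers focus on category-aware semantic edge detection, where finer supervision guides the network toward higher-level semantics. To overcome the distracted supervision challenge, the paper also introduces an information converter unit that coordinates the information flow between levels, so that the layers work together while still benefiting from deep supervision. The approach is intended to improve the accuracy and robustness of semantic edge detection."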

Semantic Edge Detection with Diverse Deep Supervision

Yun Liu, Ming-Ming Cheng, JiaWang Bian, Le Zhang, Peng-Tao Jiang, Yang Cao

Nankai University, University of Adelaide, Advanced Digital Sciences Center

ABSTRACT

Semantic edge detection (SED), which aims at jointly extracting edges as well as their category information, has far-reaching applications in domains such as semantic segmentation, object proposal generation, and object recognition. SED naturally requires achieving two distinct supervision targets: locating fine detailed edges and identifying high-level semantics. We shed light on how such distracted supervision targets prevent state-of-the-art SED methods from effectively using deep supervision to improve results. In this paper, we propose a novel fully convolutional neural network architecture using diverse deep supervision (DDS) within a multi-task framework where lower layers aim at generating category-agnostic edges, while higher layers are responsible for the detection of category-aware semantic edges. To overcome the distracted supervision challenge, a novel information converter unit is introduced, whose effectiveness has been extensively evaluated in several popular benchmark datasets, including SBD, Cityscapes, and PASCAL VOC2012. Source code will be released upon paper acceptance.

KEYWORDS

Semantic edge detection, diverse deep supervision

1 INTRODUCTION

Classical edge detection aims to detect edges and objects' boundaries. It is category-agnostic in the sense that recognizing object categories is not necessary. It can be viewed as a pixel-wise binary classification problem whose objective is to classify each pixel as belonging to one class, indicating the edge, or to the other class, indicating non-edge. In this paper we consider more practical scenarios of semantic edge detection, in which the detection of edges and the recognition of edges' categories within an image is jointly achieved.
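The two formulations in the paragraph above can be made concrete with a small sketch. It assumes nested Python lists stand in for image arrays, and that a semantic edge label is stored as one binary map per category, so a pixel may belong to several categories at once. The function names `binary_edge_loss` and `semantic_to_binary` and the 2 x 2 example are hypothetical; plain cross-entropy is used only to illustrate the pixel-wise binary view and is not taken from the paper.

```python
import math

def binary_edge_loss(prob, label, eps=1e-7):
    """Mean per-pixel binary cross-entropy between two H x W nested lists."""
    total, count = 0.0, 0
    for prob_row, label_row in zip(prob, label):
        for p, y in zip(prob_row, label_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count

def semantic_to_binary(semantic_label):
    """Collapse K per-category H x W edge maps into one binary edge map."""
    height, width = len(semantic_label[0]), len(semantic_label[0][0])
    return [[int(any(cat[r][c] for cat in semantic_label)) for c in range(width)]
            for r in range(height)]

# Hypothetical 2 x 2 label with two categories; pixel (0, 0) lies on both.
semantic = [
    [[1, 0], [0, 0]],  # edge map of category 0
    [[1, 0], [0, 1]],  # edge map of category 1
]
binary = semantic_to_binary(semantic)        # category-agnostic view
uninformed = [[0.5, 0.5], [0.5, 0.5]]        # predictor that always says 0.5
loss = binary_edge_loss(uninformed, binary)  # equals ln 2 for this predictor
```

Collapsing the category maps discards exactly the information that separates semantic edge detection from classical edge detection, which is why the two tasks need different labels.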

Semantic edge detection (SED) [ ] is an active research topic in computer vision due to its wide-ranging applications in problems such as object proposal generation [ ], occlusion and depth reasoning [ ], 3D reconstruction [ ], object detection [11, 12], image-based localization [32] and so on.

Recently, deep convolutional neural networks (DCNNs) reign undisputed as the new de-facto method for category-agnostic edge detection [ ], where near human-level performance has been achieved. Deep learning for category-aware SED, which jointly detects visually salient edges and recognizes their categories, has however yet to witness such vast popularity. Hariharan et al. [ ] first combined generic object detectors with bottom-up edges to recognize semantic edges. A fully convolutional encoder-decoder network is proposed in [ ] to detect object contours but without recognizing specific categories. Recently, CASENet [ ] introduces a skip-layer structure to enrich category-wise edge activations with bottom layer features, improving on previous state-of-the-art methods by a significant margin.

Figure 1: An example of our DDS algorithm. (a) shows the original image from the SBD dataset. (b)-(c) show its semantic edge map and corresponding color codes (Person, Motorbike, Person+Motorbike). (d)-(g) display category-agnostic edges from Side-1 to Side-4. (h)-(i) show semantic edges of Side-5 and DDS output, respectively.
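The split between side outputs described in the caption can be sketched as a combined training loss. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes four low-level side outputs supervised with a category-agnostic edge label, and one high-level side output plus a fused output supervised with per-category labels. It uses plain sigmoid cross-entropy with equal weights, which the preview does not specify, and it omits the information converter unit entirely. The names `flatten`, `sigmoid_bce`, and `dds_loss` are hypothetical.

```python
import math

def flatten(x):
    """Yield the scalars of an arbitrarily nested list in order."""
    if isinstance(x, (list, tuple)):
        for item in x:
            yield from flatten(item)
    else:
        yield x

def sigmoid_bce(logits, labels):
    """Mean sigmoid cross-entropy over two equally shaped nested lists."""
    pairs = list(zip(flatten(logits), flatten(labels)))
    total = 0.0
    for z, y in pairs:
        # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(z).
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(pairs)

def dds_loss(low_side_logits, high_side_logits, fused_logits, semantic_label):
    """Sum of per-output losses (assumption: equal weights, plain cross-entropy).

    low_side_logits: list of H x W maps, supervised as category-agnostic edges.
    high_side_logits, fused_logits: K x H x W maps, supervised per category.
    semantic_label: K x H x W binary maps, one per category.
    """
    height, width = len(semantic_label[0]), len(semantic_label[0][0])
    # A pixel is a category-agnostic edge if any category marks it as an edge.
    binary_label = [[int(any(cat[r][c] for cat in semantic_label))
                     for c in range(width)] for r in range(height)]
    loss = sum(sigmoid_bce(side, binary_label) for side in low_side_logits)
    loss += sigmoid_bce(high_side_logits, semantic_label)
    loss += sigmoid_bce(fused_logits, semantic_label)
    return loss

# Hypothetical 2 x 2 example with two categories and all-zero logits.
semantic = [[[1, 0], [0, 0]], [[1, 0], [0, 1]]]
zeros_hw = [[0.0, 0.0], [0.0, 0.0]]
zeros_khw = [zeros_hw, zeros_hw]
total = dds_loss([zeros_hw] * 4, zeros_khw, zeros_khw, semantic)  # 6 * ln 2
```

Keeping the two label types separate lets each side output be trained only on the target it is suited to, which is the intuition behind using different supervision at different depths.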

Distracted supervision paradox in SED. SED naturally requires achieving two distinct supervision targets: i) locating fine detailed edges by capturing discontinuity among image regions, mainly using low-level features; and ii) identifying abstracted high-level semantics by summarizing different appearance variations of the target categories. Such a distracted supervision paradox prevents the state-of-the-art SED method, i.e. CASENet [ ], from successfully applying deep supervision, whose effectiveness has been demonstrated in a wide range of other computer vision tasks, e.g. image categorization [ ], object detection [ ], visual tracking [ ], and category-agnostic edge detection [29, 38].

In this paper, we propose a diverse deep supervision (DDS) method, which employs deep supervision with different loss functions for high-level and low-level feature learning as shown in Fig. 2(b). While mainly using high-level convolution (i.e. conv) features for semantic classification and low-level conv ones for non-semantic edge details is intuitive and straightforward, directly doing this as in CASENet [ ] results in even worse performance than directly learning semantic edges without deep supervision or category-agnostic edge guidance. In [ ], Yu et al. claimed that deep supervision for lower layers of the network is not necessary, after unsuccessfully trying various ways of adding deep supervision. As illustrated in Fig. 2(b), we propose an information converter unit for changing the backbone DCNN features into different representations, for training category-agnostic or semantic edges respectively. Without

arXiv:1804.02864v1 [cs.CV] 9 Apr 2018
