Generalized orderless pooling performs implicit salient matching
Marcel Simon¹, Yang Gao², Trevor Darrell², Joachim Denzler¹, Erik Rodner³
¹ Computer Vision Group, University of Jena, Germany
² EECS, UC Berkeley, USA
³ Corporate Research and Technology, Carl Zeiss AG
{marcel.simon, joachim.denzler}@uni-jena.de, {yg, trevor}@eecs.berkeley.edu
Abstract
Most recent CNN architectures use average pooling as a final feature encoding step. In the field of fine-grained recognition, however, recent global representations like bilinear pooling offer improved performance. In this paper, we generalize average and bilinear pooling to “α-pooling”, allowing the pooling strategy to be learned during training. In addition, we present a novel way to visualize decisions made by these approaches. We identify the parts of training images that have the highest influence on the prediction for a given test image. This allows decisions to be justified to users and the influence of semantic parts to be analyzed. For example, we can show that the higher-capacity VGG16 model focuses much more on the bird’s head than, e.g., the lower-capacity VGG-M model when recognizing fine-grained bird categories. Both contributions allow us to analyze what changes when moving between average and bilinear pooling. In addition, experiments show that our generalized approach can outperform both across a variety of standard datasets.
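To make the idea concrete, the following short Python/NumPy sketch illustrates one way a generalized pooling layer can interpolate between average and bilinear pooling. The parameterization via an element-wise power of the local features, as well as the function and variable names, are our own illustrative assumptions and not necessarily the exact formulation defined in this paper.

import numpy as np

def alpha_pooling(features, alpha):
    # Illustrative sketch (assumed parameterization): each local CNN feature
    # x_i of shape (D,) contributes the outer product of sgn(x_i) * |x_i|^(alpha-1)
    # with x_i, averaged over all N spatial locations. For non-negative ReLU
    # activations, alpha = 1 makes every row equal the average-pooled feature,
    # while alpha = 2 yields classic (full) bilinear pooling.
    n, d = features.shape
    powered = np.sign(features) * np.abs(features) ** (alpha - 1.0)
    pooled = powered.T @ features / n          # (D, D) aggregated matrix
    return pooled.reshape(-1)                  # flattened global descriptor

# Tiny usage example with random non-negative "activations" of a 7x7 conv map.
rng = np.random.default_rng(0)
x = rng.random((49, 8))
avg_like = alpha_pooling(x, 1.0)   # behaves like average pooling
bilinear = alpha_pooling(x, 2.0)   # classic bilinear pooling

In the learned setting described in the abstract, alpha would be treated as a trainable parameter of the network rather than a fixed constant.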
1. Introduction
Deep architectures are characterized by interleaved convolution layers to compute intermediate features and pooling layers to aggregate information. Inspired by recent results in fine-grained recognition [19, 10] showing that certain pooling strategies offer performance equivalent to that of classic models involving explicit correspondence, we investigate here a new pooling layer generalization for deep neural networks suitable for both fine-grained and more generic recognition tasks.
Figure 1. We present the novel pooling strategy α-pooling, which replaces the final average pooling or bilinear pooling layer in CNNs. It allows for a smooth combination of average and bilinear pooling techniques. The optimal pooling strategy can be learned during training to adapt to the properties of the task. In addition, we present a novel way to visualize α-pooling-based classification decisions. In particular, it allows incorrect classification decisions to be analyzed, which is an important addition to all widely used orderless pooling strategies.

Fine-grained recognition has developed from a niche research field into a popular topic with numerous applications, ranging from automated monitoring of animal species [9] to fine-grained recognition of cloth types [8]. The defining property of fine-grained recognition is that all possible object categories share a similar object structure and hence similar object parts. Since the objects do not significantly differ in overall shape, subtle differences in the appearance of an object part can make the difference between two classes. For example, one of the most popular fine-grained tasks is bird species recognition. All birds have the same basic body structure with beak, head, throat, belly, wings, and tail parts, and two species might dif-