Learning to Navigate for Fine-grained Classification

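"Learning to Navigate for Fine-grained Classification" is an ECCV 2018 paper that studies how to localize informative regions for fine-grained image recognition without bounding-box or part annotations.

In fine-grained image classification, telling apart categories that differ only in subtle details is a very challenging task. Conventional image classification methods often fail to capture these details, such as the texture of a bird's feathers or the brand logo on a car. To address this, the paper proposes a self-supervision mechanism, the Navigator-Teacher-Scrutinizer Network (NTS-Net), which localizes discriminative image regions without bounding-box or part annotations.

NTS-Net consists of three components: the Navigator, the Teacher and the Scrutinizer. The key idea is to exploit the intrinsic consistency between a region's informativeness and its probability of belonging to the ground-truth class. Specifically:

1. Navigator agent: detects the most informative regions of the image. Guided by feedback from the Teacher, it gradually learns to pick out the local regions that matter most for fine-grained classification.
2. Teacher agent: evaluates the regions proposed by the Navigator and returns feedback, steering the Navigator toward more discriminative parts of the image.
3. Scrutinizer agent: the final stage. It scrutinizes the regions selected by the Navigator and makes the fine-grained prediction from them.

The training strategy lets the model learn to localize regions without manual region annotations, which reduces the annotation effort needed to prepare data. Only image-level class labels are required; the region-level supervision comes from the model itself.

The contributions of NTS-Net are:

- A self-supervision mechanism that needs no bounding-box or part annotations, reducing the dependence on large amounts of manual labeling.
- A multi-agent cooperative network structure that, by learning from this intrinsic consistency, improves the localization and recognition of fine-grained features.
- A clear improvement in accuracy on fine-grained image classification tasks.

In this way, NTS-Net addresses a key problem in fine-grained image classification and also suggests a direction for weakly supervised learning in computer vision.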

4 Yang et al.

within the network and predict the location of informative regions. Lin et al. [28] use a bilinear model to build discriminative features of the whole image; the model is able to capture subtle differences between different subordinate classes. Zhang et al. [47] propose a two-step approach to learn a bunch of part detectors and part saliency maps. Fu et al. [12] use an alternate optimization scheme to train an attention proposal network and a region-based classifier; they show that the two tasks are correlated and can benefit each other. Zhao et al. [48] propose the Diversified Visual Attention Network (DVAN) to explicitly pursue the diversity of attention and better gather discriminative information. Lam et al. [25] propose a Heuristic-Successor Network (HSNet) to formulate the fine-grained classification problem as a sequential search for informative regions in an image.

2.2 Object detection

Early object detection methods employ SIFT [34] or HOG [10] features. Recent works mainly focus on convolutional neural networks. Approaches like R-CNN [14], OverFeat [40] and SPPnet [16] adopt traditional image-processing methods to generate object proposals and perform category classification and bounding box regression. Later works like Faster R-CNN [38] propose a Region Proposal Network (RPN) for proposal generation. YOLO [37] and SSD [31] improve detection speed over Faster R-CNN [38] by employing a single-shot architecture. On the other hand, Feature Pyramid Networks (FPN) [27] focus on better addressing the multi-scale problem and generate anchors from multiple feature maps. Our method requires selecting informative regions, which can also be viewed as object detection. To the best of our knowledge, we are the first to introduce FPN into fine-grained classification while eliminating the need for human annotations.
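
To make "anchors from multiple feature maps" concrete, the following is a minimal sketch of multi-scale anchor generation. It is not the configuration used in the paper: the function names, the strides, the box sizes and the aspect ratios are all hypothetical values chosen for illustration, and boxes are not clipped to the image.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def make_anchors(image_size: int, stride: int, box_size: float,
                 aspect_ratios: Sequence[float]) -> List[Box]:
    """Generate anchors centred on every cell of one square feature map.

    Assumes the feature map has image_size // stride cells per side.
    Each aspect ratio is width / height; the box area stays box_size ** 2.
    """
    cells = image_size // stride
    anchors: List[Box] = []
    for row in range(cells):
        for col in range(cells):
            # Centre of this cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for ratio in aspect_ratios:
                w = box_size * ratio ** 0.5
                h = box_size / ratio ** 0.5
                anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def make_pyramid_anchors(image_size: int,
                         levels: Sequence[Tuple[int, float]],
                         aspect_ratios: Sequence[float]) -> List[Box]:
    """Concatenate anchors from several (stride, box_size) pyramid levels."""
    anchors: List[Box] = []
    for stride, box_size in levels:
        anchors.extend(make_anchors(image_size, stride, box_size, aspect_ratios))
    return anchors


# Hypothetical two-level pyramid on a 64x64 image: a finer map with small
# boxes and a coarser map with large boxes.
example_anchors = make_pyramid_anchors(64, [(32, 48.0), (64, 96.0)], [1.0])
```

Coarser feature maps contribute fewer but larger boxes, so regions of different sizes are covered without rescaling the image.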

2.3 Learning to rank

Learning to rank is drawing attention in the field of machine learning and information retrieval [30]. The training data consist of lists of items with assigned orders, while the objective is to learn the order for item lists. The ranking loss function is designed to penalize pairs with wrong order. Let $X = \{X_1, X_2, \cdots, X_n\}$ denote the objects to rank, and $Y = \{Y_1, Y_2, \cdots, Y_n\}$ the indexing of the objects, where $Y_i \geq Y_j$ means $X_i$ should be ranked before $X_j$. Let $\mathcal{F}$ be the hypothesis set of ranking functions. The goal is to find a ranking function $F \in \mathcal{F}$ that minimizes a certain loss function defined on $\{X_1, X_2, \cdots, X_n\}$, $\{Y_1, Y_2, \cdots, Y_n\}$ and $F$. There are many ranking methods. Generally speaking, these methods can be divided into three categories: the point-wise approach [9], the pair-wise approach [18,4] and the list-wise approach [6,44].
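
As a small illustration of what "pairs with wrong order" means, the sketch below counts mis-ordered pairs for a set of predicted scores. It is a plain 0-1 pair count, not a loss taken from the paper; the function name `count_misordered_pairs` is hypothetical, and it assumes that a higher score means an earlier rank and that tied pairs are not penalized.

```python
from itertools import combinations
from typing import Sequence


def count_misordered_pairs(scores: Sequence[float],
                           labels: Sequence[float]) -> int:
    """Count pairs (i, j) whose predicted order contradicts the label order.

    scores[i] plays the role of F(X_i) and labels[i] the role of Y_i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    wrong = 0
    for i, j in combinations(range(len(scores)), 2):
        label_diff = labels[i] - labels[j]
        score_diff = scores[i] - scores[j]
        # Opposite signs: the labels prefer one item, the scores the other.
        if label_diff * score_diff < 0:
            wrong += 1
    return wrong


# Items 1 and 2 are swapped relative to their labels, so one pair is wrong.
example_errors = count_misordered_pairs([0.9, 0.2, 0.5], [3, 2, 1])
```

A count like this is not differentiable, which is why practical ranking losses replace it with a smooth or hinge-style surrogate.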

The point-wise approach assigns each datum a numerical score, and the learning-to-rank problem can be formulated as a regression problem, for example with the L2 loss function:

$$L_{\text{point}}(F, X, Y) = \sum_{i=1}^{n} \left(F(X_i) - Y_i\right)^2 \qquad (1)$$
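
The point-wise L2 loss of Eq. (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `pointwise_l2_loss` and the toy ranking function are hypothetical names, and items are plain numbers rather than image regions.

```python
from typing import Callable, Sequence


def pointwise_l2_loss(rank_fn: Callable[[float], float],
                      items: Sequence[float],
                      targets: Sequence[float]) -> float:
    """Sum of squared differences between predicted scores and target scores.

    rank_fn plays the role of F, items of X and targets of Y in Eq. (1).
    """
    if len(items) != len(targets):
        raise ValueError("items and targets must have the same length")
    return sum((rank_fn(x) - y) ** 2 for x, y in zip(items, targets))


# Toy ranking function that scores an item as twice its value.
example_loss = pointwise_l2_loss(lambda x: 2.0 * x, [1.0, 2.0, 3.0], [2.0, 5.0, 5.0])
```

Because each item is penalized independently, this loss cares about matching the target values themselves, not only about getting the relative order right.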
