Large-Scale Few-Shot Learning: Knowledge Transfer with Class Hierarchy
Aoxue Li¹  Tiange Luo¹  Zhiwu Lu²∗  Tao Xiang³  Liwei Wang¹
¹Peking University  ²Renmin University of China  ³Queen Mary University of London
zhiwu.lu@gmail.com  t.xiang@qmul.ac.uk
Abstract
Recently, large-scale few-shot learning (FSL) has become
topical. It has been discovered that, for a large-scale FSL problem
with 1,000 classes in the source domain, a strong baseline
emerges: simply training a deep feature embedding
model on the aggregated source classes and performing
nearest neighbor (NN) search with the learned features
on the target classes. The state-of-the-art large-scale FSL
methods struggle to beat this baseline, indicating intrinsic
limitations on scalability. In this paper, we thus propose a
novel large-scale FSL model that exploits a class hierarchy
encoding the semantic relationships between the source and
target classes. Specifically, a deep feature embedding model
is learned to predict class labels for each training sample
at different layers of the hierarchy. Since the target classes
share some of the labels at the top layers of the hierarchy,
more transferable features are obtained even with only the
source class samples for model training. Extensive experiments
show that the proposed model significantly outperforms
not only the NN baseline but also the state-of-the-art
alternatives. Further, we show that the proposed model can
be easily extended to the large-scale zero-shot learning (ZSL)
problem and also achieves state-of-the-art results.
1. Introduction
In the past five years, object recognition research
has focused on large-scale recognition problems such as the
ImageNet ILSVRC challenges [34]. Deep neural network
(DNN) based models [37, 42, 12, 41] have achieved super-
human performance on the ILSVRC 1K recognition task.
However, most existing object recognition models, particularly
the DNN based ones, require hundreds of image
samples to be collected for each object class; many
object classes are rare, and it is very hard to collect sufficient
training samples for them, even with social media. Therefore, it is
highly desirable to develop object recognition models that
require only a few training samples/shots per object class.
∗Corresponding author.
[Figure 1: plot of top-5 accuracy (%) vs. K-shot (K = 1–5) for NN, PPA, LSD, and SGM.]
Figure 1. Comparative results for large-scale FSL on the ImNet
dataset [17]. The top-5 accuracy over target class samples is used
as the evaluation metric. Notations: NN – nearest neighbor (NN)
search performed in a learned feature space using K samples per
target class as the references; SGM – FSL with the squared gradient
magnitude (SGM) loss [11]; PPA – parameter prediction from
activations (PPA) [31]; LSD – large-scale diffusion (LSD) [3].
To overcome this challenge, meta-learning based few-shot
learning (FSL) [4, 19, 35, 10, 31, 30, 44, 5, 40] has
become a hot topic. FSL is inspired by the fact that humans
can recognize target visual objects almost effortlessly
from only a few samples, thanks to the ability to learn to learn
and to transfer knowledge. Similarly, in the FSL problem, we
are provided with a set of source classes and a set of target
classes under the setting that: (1) the target classes have no
overlap with the source classes in the label space; (2) each
source class has sufficient labelled samples, whereas each
target class has only a few labelled samples. FSL thus aims
to transfer knowledge from the source to the target classes.
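To make the NN baseline from Figure 1 concrete, the following is a minimal sketch (not the paper's code): given features already extracted by an embedding model trained on the source classes, each query image is assigned to the target class whose K-shot support features are nearest under cosine similarity. All function and variable names here are illustrative.

```python
import numpy as np

def nn_baseline_predict(support_feats, support_labels, query_feats):
    """NN baseline for FSL: classify each query feature by cosine
    similarity to the per-class mean of the K support features per
    target class. Names are illustrative, not from the paper."""
    # L2-normalize so that dot products equal cosine similarities.
    def l2n(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    classes = np.unique(support_labels)
    # One prototype per class: mean of its K support features.
    protos = np.stack([support_feats[support_labels == c].mean(axis=0)
                       for c in classes])
    sims = l2n(query_feats) @ l2n(protos).T   # (num_query, num_class)
    return classes[sims.argmax(axis=1)]

# Toy usage: 2 target classes, K=2 shots, 3-d features.
support = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
                    [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]])
labels = np.array([0, 0, 1, 1])
queries = np.array([[0.95, 0.05, 0.0], [0.05, 0.95, 0.0]])
print(nn_baseline_predict(support, labels, queries))  # -> [0 1]
```

Note that the only learning happens when training the feature embedding on the source classes; the target classes are handled purely by this parameter-free search, which is what makes the baseline so hard to beat at scale.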
The focus of this work is on the large-scale FSL setting,
where a large number of source classes is provided. This
is very different from the most widely used meta-learning
evaluation benchmarks such as miniImageNet [45], which
contains 64 source classes with 600 samples per class.
Yet it is more realistic: after all, there are thousands of classes
in ImageNet that we can use, so why not include more
source classes when it comes to FSL? It is noted that a