Fig. 2. A Faster R-CNN with Feature Pyramid Network. The input image is fed to the backbone network, then the feature pyramid network (light
yellow) computes multi-scale features. The region proposal network proposes candidate boxes, which are filtered with non-maximum suppression
(NMS). Features for the remaining boxes are pooled with RoIAlign and fed to the box head, which predicts object category and refined box
coordinates. Finally, redundant and low-quality predictions are removed with NMS. Blue labels are class names in the detectron2 implementation.
Figure courtesy of Hiroto Honda. https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd
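Both detection pipelines above use non-maximum suppression (NMS) to prune redundant overlapping boxes. As an illustration only (function names and the corner-format box convention are ours, not detectron2's API), a minimal greedy NMS with its IoU helper can be sketched as:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero area if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that
    overlap it above the threshold, and repeat on the remainder.
    Returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

Production detectors use batched, vectorized implementations (e.g. on the GPU), but the greedy logic is the same.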
Fig. 3. The DETR object detector. The image is fed to the backbone, then positional encodings are added to the features and fed to the transformer encoder. The decoder takes as input object query embeddings, attends to the encoded representation, and outputs a fixed number of object detections, which are finally thresholded, without the need for NMS [9]. Image courtesy of Carion et al. [9].
mean average precision (mAP) evaluation metric and the differences between Pascal VOC and MS COCO implementations.
4 FEW-SHOT OBJECT DETECTION
Informally, few-shot object detection (FSOD) is the task of learning to detect new object categories from only one or a few training examples per class. In this section, we describe the FSOD framework, how it differs from few-shot classification, common datasets and evaluation metrics, and FSOD methods. We provide a taxonomy of popular few-shot and self-supervised object detection methods in Figure 1.
4.1 FSOD Framework
We formally introduce the dominant FSOD framework, as formalized by Kang et al. [54] (Figure 4). FSOD partitions objects into two disjoint sets of categories: base (also called known or source) classes, for which a large number of training examples is available; and novel (also called unseen or target) classes, for which only a few training examples (shots) per class are available. In the vast majority of the FSOD literature, the object detector's backbone (usually a ResNet-50 or ResNet-101) is assumed to have been pretrained on an image classification dataset such as ImageNet. Then, the FSOD task is formalized as follows:
(1) Base training.² Annotations are given only for the base classes, with a large number of training examples per class (bikes in the example). We train the FSOD method on the base classes.
2. In the context of self-supervised learning, base training may also be referred to as finetuning or training. This should not be confused with base training in the meta-learning framework; rather, it is similar to the meta-training phase [32].
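The base/novel partition above can be made concrete with a small sketch. Assuming COCO-style per-instance annotation dicts, the helper below (the function name, argument names, and dict keys are illustrative, not from any FSOD codebase) builds the base training set and a K-shot novel set:

```python
import random
from collections import defaultdict

def split_fsod(annotations, base_classes, novel_classes, k_shots, seed=0):
    """Partition detection annotations into a base set (all examples of
    the base classes) and a novel set (at most k_shots annotated
    instances per novel class).

    `annotations` is a list of dicts with at least a "category" key,
    e.g. {"image_id": 3, "category": "bike", "bbox": [x, y, w, h]}.
    """
    rng = random.Random(seed)  # fixed seed: shot sampling affects results
    by_class = defaultdict(list)
    for ann in annotations:
        by_class[ann["category"]].append(ann)

    # Base training: every annotation of every base class.
    base_set = [a for c in base_classes for a in by_class[c]]

    # Few-shot set: sample at most k_shots instances per novel class.
    novel_set = []
    for c in novel_classes:
        pool = by_class[c]
        novel_set.extend(rng.sample(pool, min(k_shots, len(pool))))
    return base_set, novel_set
```

In practice, benchmarks such as those of Kang et al. [54] fix the base/novel splits and the sampled shots so that methods are compared on identical data.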