SuperGlue: Deep Learning Feature Matching with Graph Neural Networks


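"SuperGlue is a paper published at CVPR 2020 that proposes a new neural network model for matching local features in images. SuperGlue solves the matching problem between two sets of features by jointly finding correspondences and rejecting non-matchable points."

SuperGlue aims to improve the efficiency and accuracy of image feature matching. It is built on a Graph Neural Network (GNN) that learns correspondences between features and, through end-to-end training, learns priors over geometric transformations and regularities of the 3D world.

The core of the paper is estimating the match assignment by solving a differentiable optimal transport problem. Optimal transport is a classical mathematical formulation for assignment problems; here its costs are predicted by the graph neural network, so the matching adapts to the input image features. This lets SuperGlue take the underlying 3D scene into account while treating the feature assignment as a whole.

The flexible context aggregation mechanism is based on attention. It allows SuperGlue to reason about the 3D scene behind the images and about the feature assignments jointly, focusing on relevant keypoints and ignoring irrelevant ones, which improves matching precision.

Compared with traditional, hand-designed heuristics, SuperGlue learns its matching behavior from data: rather than relying on predefined rules, it is trained on a large number of image pairs and learns more complex and more general matching priors.

On pose estimation in challenging indoor and outdoor environments, SuperGlue outperforms other learned approaches and achieves state-of-the-art results. It also matches in real time on a modern GPU and can be readily integrated into existing Structure-from-Motion (SfM) or Simultaneous Localization and Mapping (SLAM) systems.

In short, SuperGlue provides an efficient and accurate local feature matching method by learning regularities of the 3D world, which makes it relevant to real-world visual localization and navigation tasks.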

SuperGlue: Learning Feature Matching with Graph Neural Networks

Paul-Edouard Sarlin∗, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich

ETH Zurich, Magic Leap, Inc.

Abstract

This paper introduces SuperGlue, a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. Assignments are estimated by solving a differentiable optimal transport problem, whose costs are predicted by a graph neural network. We introduce a flexible context aggregation mechanism based on attention, enabling SuperGlue to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from image pairs. SuperGlue outperforms other learned approaches and achieves state-of-the-art results on the task of pose estimation in challenging real-world indoor and outdoor environments. The proposed method performs matching in real-time on a modern GPU and can be readily integrated into modern SfM or SLAM systems. The code and trained weights are publicly available at github.com/magicleap/SuperGluePretrainedNetwork.

1. Introduction

Correspondences between points in images are essential for estimating the 3D structure and camera poses in geometric computer vision tasks such as Simultaneous Localization and Mapping (SLAM) and Structure-from-Motion (SfM). Such correspondences are generally estimated by matching local features, a process known as data association. Large viewpoint and lighting changes, occlusion, blur, and lack of texture are factors that make 2D-to-2D data association particularly challenging.

In this paper, we present a new way of thinking about the feature matching problem. Instead of learning better task-agnostic local features followed by simple matching heuristics and tricks, we propose to learn the matching process from pre-existing local features using a novel neural architecture called SuperGlue. In the context of SLAM, which typically [7] decomposes the problem into the visual feature extraction front-end and the bundle adjustment or pose estimation back-end, our network lies directly in the middle – SuperGlue is a learnable middle-end (see Figure 1).
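To make "simple matching heuristics" concrete, the sketch below shows one common heuristic of that kind, mutual nearest-neighbor matching on descriptors. This is an illustrative assumption: the text above does not name a specific heuristic, and the function name `mutual_nearest_neighbors` and the toy descriptors are hypothetical.

```python
import numpy as np


def mutual_nearest_neighbors(desc_a, desc_b):
    """Return index pairs (i, j) where desc_a[i] and desc_b[j] pick each other.

    desc_a: (M, D) array of descriptors from image A.
    desc_b: (N, D) array of descriptors from image B.
    """
    # Pairwise squared Euclidean distances, shape (M, N).
    dist = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(axis=-1)
    nn_ab = dist.argmin(axis=1)  # closest B descriptor for every A descriptor
    nn_ba = dist.argmin(axis=0)  # closest A descriptor for every B descriptor
    idx_a = np.arange(desc_a.shape[0])
    # Keep a pair only if the choice is mutual; everything else stays unmatched.
    keep = nn_ba[nn_ab] == idx_a
    return np.stack([idx_a[keep], nn_ab[keep]], axis=1)


# Toy example: the third descriptor of A has no counterpart in B.
desc_a = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
desc_b = np.array([[1.1, 0.0], [0.0, 0.1]])
matches = mutual_nearest_neighbors(desc_a, desc_b)
```

Each descriptor is matched in isolation here, using appearance only. A learned middle-end would replace this step by reasoning over all keypoints of both images at once.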

[Figure 1 diagram labels: Deep Front-End (Detector & Descriptor); Deep Middle-End Matcher (SuperGlue); Back-End Optimizer]

Figure 1: Feature matching with SuperGlue. Our approach establishes pointwise correspondences from off-the-shelf local features: it acts as a middle-end between hand-crafted or learned front-end and back-end. SuperGlue uses a graph neural network and attention to solve an assignment optimization problem, and handles partial point visibility and occlusion elegantly, producing a partial assignment.
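The attention-based aggregation named in the caption can be illustrated with a stripped-down sketch: each keypoint first attends to keypoints in its own image (self-attention), then to keypoints in the other image (cross-attention). This is not the paper's architecture. It assumes single-head scaled dot-product attention with no learned projections and a plain residual update, and the names `attend` and `self_and_cross_update` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(queries, sources):
    """Single-head scaled dot-product attention without learned weights.

    queries: (M, D) features that receive messages.
    sources: (N, D) features that send messages.
    Returns the aggregated messages (M, D) and the attention weights (M, N).
    """
    dim = queries.shape[-1]
    scores = queries @ sources.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    return weights @ sources, weights


def self_and_cross_update(feat_a, feat_b):
    """One self-attention step followed by one cross-attention step."""
    # Self-attention: each keypoint aggregates context from its own image.
    msg_a, _ = attend(feat_a, feat_a)
    msg_b, _ = attend(feat_b, feat_b)
    feat_a, feat_b = feat_a + msg_a, feat_b + msg_b
    # Cross-attention: each keypoint aggregates context from the other image.
    msg_a, _ = attend(feat_a, feat_b)
    msg_b, _ = attend(feat_b, feat_a)
    return feat_a + msg_a, feat_b + msg_b


rng = np.random.default_rng(0)
feat_a = rng.normal(size=(4, 8))  # 4 keypoints in image A, 8-dim features
feat_b = rng.normal(size=(6, 8))  # 6 keypoints in image B
out_a, out_b = self_and_cross_update(feat_a, feat_b)
```

In a trained network the queries, keys, and values would come from learned projections, and the features would also encode keypoint positions. The sketch only shows how information flows within and across the two images.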

In this work, learning feature matching is viewed as finding the partial assignment between two sets of local features. We revisit the classical graph-based strategy of matching by solving a linear assignment problem, which, when relaxed to an optimal transport problem, can be solved differentiably. The cost function of this optimization is predicted by a Graph Neural Network (GNN). Inspired by the success of the Transformer [55], it uses self- (intra-image) and cross- (inter-image) attention to leverage both spatial relationships of the keypoints and their visual appearance. This formulation enforces the assignment structure of the predictions while enabling the cost to learn complex priors, elegantly handling occlusion and non-repeatable keypoints. Our method is trained end-to-end from image pairs – we learn priors for pose estimation from a large annotated dataset, enabling SuperGlue to reason about the 3D scene and the assignment. Our work can be applied to a variety of multiple-view geometry problems that require high-quality feature correspondences (see Figure 2).
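The relaxation of linear assignment to optimal transport can be sketched with Sinkhorn iterations, which alternately normalize the rows and columns of a score matrix and consist only of differentiable operations. The sketch below is a minimal NumPy version under stated assumptions: the extra "unmatched" row and column, the fixed `unmatched_score`, the marginals, and the iteration count are illustrative choices, and the function name `sinkhorn_partial_assignment` is hypothetical. In the method described above, the scores would be predicted by the GNN rather than given by hand.

```python
import numpy as np


def logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis.
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def sinkhorn_partial_assignment(scores, unmatched_score=0.0, n_iters=100):
    """Soft partial assignment from an (M, N) score matrix.

    Returns an (M + 1, N + 1) matrix; the last row and column collect the
    mass of points that are left unmatched.
    """
    m, n = scores.shape
    # Augment the scores with one slack row and one slack column.
    aug = np.full((m + 1, n + 1), unmatched_score, dtype=float)
    aug[:m, :n] = scores
    # Each real point carries mass 1; the slack row can absorb all N points
    # of the other set and the slack column all M points.
    log_mu = np.log(np.concatenate([np.ones(m), [n]]))
    log_nu = np.log(np.concatenate([np.ones(n), [m]]))
    u = np.zeros(m + 1)
    v = np.zeros(n + 1)
    for _ in range(n_iters):
        # Alternate row and column normalization in the log domain.
        u = log_mu - logsumexp(aug + v[None, :], axis=1)
        v = log_nu - logsumexp(aug + u[:, None], axis=0)
    return np.exp(aug + u[:, None] + v[None, :])


# Two points in A, three in B; the third point of B has no good match.
scores = np.array([[5.0, 0.0, 0.0],
                   [0.0, 5.0, 0.0]])
P = sinkhorn_partial_assignment(scores)
```

In this toy case most of the mass of the third point of B ends up in the slack row, which is how a partial assignment can express "no match". Because every step is differentiable, gradients can flow from the assignment back to whatever produced the scores.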

∗ Work done at Magic Leap, Inc. for a Master’s degree. The author thanks his academic supervisors: Cesar Cadena, Marcin Dymczyk, Juan Nieto.
