ASD-SLAM: A Novel Adaptive-Scale Descriptor Learning for Visual SLAM
Taiyuan Ma¹, Yafei Wang¹, Zili Wang², Xulei Liu¹ and Huimin Zhang¹
Abstract— Visual Odometry and Simultaneous Localization and Mapping (SLAM) are widely used in autonomous driving. In traditional keypoint-based visual SLAM systems, the feature matching accuracy of the front end plays a decisive role and becomes the bottleneck restricting positioning accuracy, especially in challenging scenarios such as viewpoint variation and highly repetitive scenes. Thus, increasing the discriminability and matchability of feature descriptors is important for improving the positioning accuracy of visual SLAM. In this paper, we propose a novel adaptive-scale triplet loss function and apply it to a triplet network to generate an adaptive-scale descriptor (ASD). Based on ASD, we design our monocular SLAM system (ASD-SLAM), a deep-learning-enhanced system built on the state-of-the-art ORB-SLAM system. The experimental results show that ASD achieves better performance on the UBC benchmark dataset, and at the same time the ASD-SLAM system outperforms current popular visual SLAM frameworks on the KITTI Odometry Dataset.
I. INTRODUCTION
Feature matching is one of the key steps in Simultaneous Localization and Mapping (SLAM), and it in turn depends on the quality of the descriptors. Descriptors are feature abstractions of the original image pixels. Effective descriptors should be able to cope with image transformations, illumination changes and so on while describing the image features. Over the past decade, research focused on hand-crafted keypoint descriptors such as SIFT [1], SURF [2] and ORB [3]. These descriptors still play important roles in current popular visual SLAM frameworks such as ORB-SLAM2 [4]. Among these hand-crafted descriptors, SIFT achieves higher matching precision but is computationally expensive.
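To make the role of the descriptor concrete, consider how matching is typically performed: each keypoint descriptor in one image is compared to all descriptors in the other, and a candidate pair is accepted only if the best match is clearly closer than the second best. The following minimal sketch (illustrative only; the function name and the 0.8 ratio threshold are our own choices, not part of any cited system) implements this nearest-neighbor search with a ratio test over unit-norm descriptors:

    import numpy as np

    def match_descriptors(desc_a, desc_b, ratio=0.8):
        """Nearest-neighbor descriptor matching with a ratio test.

        desc_a: (N, D) unit-norm descriptors from image A.
        desc_b: (M, D) unit-norm descriptors from image B (M >= 2).
        Returns (i, j) index pairs whose best match is unambiguous.
        """
        # For unit vectors, ||a - b||^2 = 2 - 2 * a.b, so the dot
        # product yields Euclidean distances cheaply.
        dists = np.sqrt(np.maximum(2.0 - 2.0 * desc_a @ desc_b.T, 0.0))
        matches = []
        for i, row in enumerate(dists):
            j1, j2 = np.argsort(row)[:2]      # best and second-best candidates
            if row[j1] < ratio * row[j2]:     # reject ambiguous matches
                matches.append((i, j1))
        return matches

The more discriminative the descriptor, the more true correspondences survive the ratio test, which is precisely what the front end of a keypoint-based SLAM system needs.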
The recent rise of deep learning has created the opportunity to develop learning-based, data-driven techniques for keypoint description. According to [8], descriptors produced by trained CNNs outperform hand-crafted descriptors in terms of their invariance properties in patch verification tasks. Among the CNN-based methods for keypoint description [5-7], [11-17], the best-known models are DeepDesc [5], L2-Net [6], CS L2-Net [6] and HardNet [7]; like SIFT and ORB, they produce 128- or 256-dimensional unit feature vectors. Studies of keypoint description with trained CNNs invariably compare against hand-crafted descriptors, and reach the common conclusion that the learned descriptors have superior invariance properties [8].
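As a rough illustration of this family of models (a sketch under our own assumptions; layer sizes are illustrative and not the exact L2-Net or HardNet architecture), such a descriptor is a small convolutional network over grayscale patches whose output is L2-normalized to a fixed-length unit vector:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PatchDescriptor(nn.Module):
        """L2-Net/HardNet-style CNN: 32x32 grayscale patch -> 128-D unit vector.

        Layer sizes are illustrative, not the exact published architectures.
        """
        def __init__(self, dim=128):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
                nn.Conv2d(128, dim, 8),  # 8x8 feature map -> 1x1 global projection
            )

        def forward(self, patch):
            x = self.features(patch).flatten(1)  # (B, dim)
            # Unit L2 norm makes descriptors directly comparable by
            # Euclidean distance, like a normalized SIFT vector.
            return F.normalize(x, p=2, dim=1)

Because the output is a unit vector, it can be dropped into the same nearest-neighbor matching pipeline sketched above.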
¹T. Ma, Y. Wang, X. Liu and H. Zhang are with the School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China (corresponding author: Yafei Wang, e-mail: wyfjlu@sjtu.edu.cn).
²Z. Wang is with Xiao Peng, Guangzhou, China.
Although these learning-based descriptors achieve good performance in patch verification tasks, they are not popular in practical applications. In particular, according to recent research [9], in some complicated tasks such as SFM, traditional hand-crafted features (SIFT [1] and its variants [10]) still prevail over the learned ones. The main reason is that most studies did not consider the specific requirements of applications such as SLAM and SFM when designing their loss functions, which makes the resulting descriptors difficult to apply in those settings. Traditionally, most studies focus on data augmentation or on building more suitable datasets to improve robustness to illumination and viewpoint changes in practical applications, and ignore the importance of the loss function.
For example, most learning-based methods adopt Siamese losses [5], [11-13] or triplet losses [6], [7], [14-17], which aim to reduce the distance between similar image patches and increase the distance between dissimilar ones. Triplet losses are generally reported to perform better than Siamese losses [17].
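To make these objectives concrete, a generic margin-based triplet loss over L2-normalized descriptors can be sketched as follows (this is the standard formulation, not our adaptive-scale loss, which is introduced later):

    import torch
    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=1.0):
        """Generic margin-based triplet loss on (B, D) descriptor batches.

        (anchor, positive) are descriptors of matching patches;
        negative is a non-matching patch for the same anchor.
        """
        d_pos = F.pairwise_distance(anchor, positive)  # matching-pair distances
        d_neg = F.pairwise_distance(anchor, negative)  # non-matching distances
        # Only the relative gap d_neg - d_pos is pushed past `margin`;
        # the absolute scale of the distances themselves is unconstrained.
        return F.relu(d_pos - d_neg + margin).mean()

Note that this loss constrains only the gap between the positive and negative distances, leaving the absolute distance scale free, which leads directly to the problem discussed next.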
However, triplet losses suffer from scale uncertainty [18], which is fatal for feature matching across multiple frames in SLAM and SFM. Therefore, in order to enable the descriptor to adapt to the feature matching of consecutive frames in SLAM, we propose an adaptive-scale triplet loss function and apply it to a triplet network, which better resolves the scale uncertainty problem and yields our adaptive-scale descriptor (ASD).
Moreover, by replacing the front end of the traditional visual SLAM framework with ASD, we design a deep-learning-enhanced SLAM system (ASD-SLAM). We separately evaluate the performance of ASD and the positioning accuracy of ASD-SLAM on public datasets. The experimental results show that ASD achieves better performance in patch verification tasks, and that the positioning results of ASD-SLAM are more accurate than those of influential monocular SLAM systems such as ORB-SLAM and LDSO. In addition, ASD is not limited to SLAM; it can also be extended to other similar fields such as SFM. In summary, our main contributions¹ are the following:
• We propose an adaptive-scale triplet loss function and apply it to a triplet network to generate ASD, which achieves state-of-the-art performance on the public Brown dataset.
• We design a deep-learning-enhanced SLAM system (ASD-SLAM), which obtains better results than state-of-the-art visual SLAM systems such as ORB-SLAM and LDSO.
¹https://github.com/mataiyuan/ASD-SLAM?files=1