The idea of using a single camera has become popular since the emergence of single-camera SLAM, or MonoSLAM (Davison 2003). This is probably also because a single camera is now easier to obtain than a stereo pair, being available in cell phones, personal digital assistants and personal computers. The monocular approach thus offers a simple, flexible and economical solution in terms of both hardware and processing time.
Monocular SLAM is a particular case of bearing-only SLAM. The latter is a partially observable problem, in which a single observation does not provide enough information to determine the depth of a landmark. This causes a landmark initialization problem, whose solutions can be divided into two categories: delayed and undelayed (Lemaire et al. 2007; Vidal et al. 2007). Salient features must be tracked across multiple observations to recover three-dimensional information from a single camera.
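As an illustration of why initialization may be delayed, the following minimal sketch triangulates a landmark only once two bearing observations with sufficient parallax are available. It uses a simple midpoint-triangulation rule in Python and is not the method of any of the systems cited above; the function name and the parallax threshold are illustrative assumptions.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2, min_parallax_deg=1.0):
    """Delayed-initialization sketch: intersect two bearing rays.

    c1, c2 -- camera centres (3,) in the world frame
    d1, d2 -- unit bearing vectors (3,) in the world frame
    Returns the midpoint of the closest points on the two rays, or None
    when the parallax is too small for the depth to be observable.
    """
    b = float(np.dot(d1, d2))                       # cosine of the angle between the rays
    parallax = np.degrees(np.arccos(np.clip(abs(b), 0.0, 1.0)))
    if parallax < min_parallax_deg:                 # nearly parallel rays: depth not observable
        return None
    w = c1 - c2
    p, q = float(np.dot(d1, w)), float(np.dot(d2, w))
    s = (b * q - p) / (1.0 - b * b)                 # distance along the first ray
    t = b * s + q                                   # distance along the second ray
    return 0.5 * ((c1 + s * d1) + (c2 + t * d2))    # 3D landmark estimate
```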
Even though many contributions have been made to visual SLAM, many problems remain; the solutions proposed for the visual SLAM problem are reviewed in Sect. 6. Many visual SLAM systems accumulate large errors while the environment is being explored (or fail completely in visually complex environments), which leads to inconsistent estimates of the robot position and to incoherent maps. There are three primary reasons:
(1) First, it is generally assumed that camera movement is smooth and that the appearance of salient features remains consistent (Davison 2003; Nistér et al. 2004), but in general this is not true. These assumptions are closely tied to the choice of salient feature detector and matching technique. They lead to inaccurate camera position estimates when the captured images have little texture or are blurred by rapid movements of the sensor (e.g. due to vibration or quick changes of direction) (Pupilli and Calway 2006). Such conditions are typical when the camera is carried by a person, a humanoid robot or a quad-rotor helicopter, among others. One way of alleviating this problem to some extent is the use of keyframes (see "Appendix I") (Mouragnon et al. 2006; Klein and Murray 2008); a simple keyframe-selection heuristic is sketched in the first example after this list. Alternatively, Pretto et al. (2007) and Mei and Reid (2008) analyze the problem of real-time visual tracking over image sequences blurred by an out-of-focus camera.
(2) Second, most researchers assume that the environment to be explored is static and contains only stationary, rigid elements, whereas in reality most environments contain people and objects in motion. If this is not taken into account, the moving elements produce false matches and consequently introduce unpredictable errors throughout the system; the second example after this list sketches one simple way of rejecting such matches. The first approaches to this problem are proposed by Wang et al. (2007), Wangsiripitak and Murray (2009) and Migliore et al. (2009), as well as Lin and Wang (2010).
(3) Third, the world is visually repetitive. There are many similar textures, such as repeated architectural elements, foliage and walls of brick or stone, and some objects, such as traffic signals, appear repeatedly in urban outdoor environments. This makes it difficult to recognize a previously explored area and to perform SLAM over large areas.
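The following is a minimal sketch of the kind of keyframe-selection rule mentioned in point (1). It is only an assumed illustration: the function name, the tracked-feature ratio and the parallax thresholds are not taken from Mouragnon et al. (2006) or Klein and Murray (2008).

```python
def should_add_keyframe(n_tracked, n_tracked_in_last_kf, frames_since_kf,
                        median_parallax_deg,
                        min_track_ratio=0.7, min_frames=10, min_parallax_deg=2.0):
    """Keyframe-insertion heuristic (illustrative thresholds only).

    A new keyframe is created when tracking starts to degrade (few of the
    last keyframe's features survive) or when the camera has moved enough
    for new geometry to be triangulated reliably.
    """
    tracking_degraded = n_tracked < min_track_ratio * n_tracked_in_last_kf
    enough_baseline = (frames_since_kf >= min_frames
                       and median_parallax_deg >= min_parallax_deg)
    return tracking_degraded or enough_baseline
```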
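For point (2), one common pragmatic safeguard, not the specific method of any of the works cited there, is to fit the dominant static-scene motion with RANSAC and discard the matches that disagree with it. The sketch below assumes OpenCV is available; the function name and threshold values are illustrative.

```python
import cv2
import numpy as np

def flag_dynamic_matches(pts_prev, pts_curr, ransac_thresh_px=1.0):
    """Flag matches that disagree with the dominant (static-scene) motion.

    pts_prev, pts_curr -- Nx2 arrays of matched image points in two frames.
    A fundamental matrix is fitted with RANSAC; the consensus set is assumed
    to come from the static background, and outliers are treated as candidate
    moving objects (or plain mismatches) and excluded from the SLAM update.
    """
    F, inlier_mask = cv2.findFundamentalMat(
        pts_prev.astype(np.float32), pts_curr.astype(np.float32),
        cv2.FM_RANSAC, ransac_thresh_px, 0.99)
    if F is None or inlier_mask is None:
        return np.zeros(len(pts_prev), dtype=bool)   # estimation failed
    return inlier_mask.ravel() == 0                   # True -> likely dynamic/outlier
```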
4 Salient feature selection
We distinguish between salient features and landmarks, since some articles treat the two terms interchangeably. According to Frintrop and Jensfelt (2008), a landmark is a region in the real world described by its 3D position and appearance information. A salient feature, on the other hand, is a region of the image described by its 2D position (on the image) and an