Taketomi et al. IPSJ Transactions on Computer Vision and Applications
(2017) 9:16
Page 3 of 11
techniques as relocalization. Basically, relocalization is performed to recover the camera pose, whereas loop detection is performed to obtain a geometrically consistent map.
Pose-graph optimization has been widely used to suppress the accumulated error by optimizing camera poses [12, 13]. In this method, the relationship between camera poses is represented as a graph, and a consistent graph is built to suppress the error in the optimization. Bundle adjustment (BA) is also used to minimize the reprojection error of the map by optimizing both the map and the camera poses [14]. In large environments, this optimization procedure is employed to minimize estimation errors efficiently. In small environments, BA may be performed without loop closing because the accumulated error is small.
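As a toy illustration of pose-graph optimization (our own one-dimensional example, not taken from [12, 13]): poses along a line are linked by noisy odometry edges plus one loop-closure edge, and the globally consistent poses are the least-squares solution of the edge constraints.

```python
import numpy as np

# Hypothetical 1D pose graph: five poses connected by drifting odometry
# edges and one loop-closure edge. Each edge measures z = x_j - x_i.
edges = [
    (0, 1, 1.1), (1, 2, 1.0), (2, 3, 0.9), (3, 4, 1.05),  # odometry
    (0, 4, 4.0),                                          # loop closure
]
n = 5

# Build the linear least-squares system J x = z.
J = np.zeros((len(edges), n))
z = np.zeros(len(edges))
for row, (i, j, meas) in enumerate(edges):
    J[row, i] = -1.0
    J[row, j] = 1.0
    z[row] = meas

# Fix the gauge freedom by anchoring the first pose at 0 (drop its column).
Jf = J[:, 1:]
x = np.zeros(n)
x[1:] = np.linalg.lstsq(Jf, z, rcond=None)[0]
# The 0.05 discrepancy between the summed odometry (4.05) and the loop
# closure (4.0) is spread evenly over all five edges.
```

Real pose-graph optimization works on 6 DoF poses and is nonlinear, so it is solved iteratively (e.g., by Gauss-Newton), but the principle of distributing the loop-closure error over the graph is the same.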
2.3 Summary
As listed above, the framework of vSLAM algorithms is composed of five modules: initialization, tracking, mapping, relocalization, and global map optimization. Since each vSLAM algorithm employs different methodologies for these modules, the characteristics of a vSLAM algorithm depend highly on the methodologies employed. It is therefore important to understand each module of a vSLAM algorithm in order to know its performance, advantages, and limitations.
It should be noted that the term tracking and mapping (TAM) is used instead of localization and mapping. TAM was first used in Parallel Tracking and Mapping (PTAM) [15] because localization and mapping are not performed simultaneously in the traditional sense: tracking is performed on every frame in one thread, whereas mapping is performed at certain times in another thread. After PTAM was proposed, most vSLAM algorithms have followed the framework of TAM. Therefore, TAM is used in this paper.
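The two-thread TAM design described above can be sketched as follows. This is a minimal sketch with hypothetical names (real systems such as PTAM implement actual pose estimation and bundle adjustment); the point is only the split between per-frame tracking and asynchronous keyframe-based mapping.

```python
import queue
import threading

# Keyframes flow from the tracking thread to the mapping thread via a queue.
keyframes = queue.Queue()
map_points = []

def track(frames):
    # Tracking: runs on every frame, estimating the camera pose against
    # the current map (pose estimation itself is elided in this sketch).
    for i, frame in enumerate(frames):
        if i % 3 == 0:            # promote occasional frames to keyframes
            keyframes.put(frame)
    keyframes.put(None)           # signal the mapping thread to stop

def map_worker():
    # Mapping: runs only when a new keyframe arrives, where a real system
    # would triangulate new points and refine the map (e.g., local BA).
    while True:
        kf = keyframes.get()
        if kf is None:
            break
        map_points.append(kf)

mapper = threading.Thread(target=map_worker)
mapper.start()
track(range(10))                  # stand-in for a camera frame stream
mapper.join()
```

Because mapping is decoupled from the frame rate, the expensive map refinement never blocks per-frame pose tracking, which is the key benefit of the TAM split.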
3 Related technologies
vSLAM, visual odometry, and online structure from
motion are all designed to estimate camera motion and the 3D structure of an unknown environment. In this section, we
explain the relationship among them.
3.1 Visual odometry
Odometry is the estimation of sequential changes in sensor position over time, using sensors such as a wheel encoder to acquire relative sensor movement. Camera-based odometry, called visual odometry (VO), is also one of the active research fields in the literature [16, 17]. From a technical point of view, vSLAM and VO are highly relevant techniques because both basically estimate sensor positions. According to survey papers in robotics [18, 19], the relationship between vSLAM and VO can be represented as follows.
vSLAM = VO + global map optimization
The main difference between these two techniques is global map optimization in the mapping. In other words, VO is equivalent to the modules in Section 2.1. In VO, the geometric consistency of a map is considered only over a small portion of the map, or only relative camera motion is computed without mapping. In vSLAM, on the other hand, the global geometric consistency of the map is normally considered. Therefore, to build a geometrically consistent map, global optimization is performed in recent vSLAM algorithms.
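To make the distinction concrete, the following sketch (our own example, not from [18, 19]) shows what a VO frontend does: it chains per-frame relative motions, so any measurement error accumulates as drift that VO alone cannot remove.

```python
import math

# Hypothetical VO sketch: the frontend receives per-frame relative motions
# (dx, dy, dtheta), expressed in the previous frame's coordinates, and
# simply chains them together.
def compose(pose, delta):
    x, y, th = pose
    dx, dy, dth = delta
    # Rotate the relative translation into the world frame, then accumulate.
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

# Ideal case: four unit steps with exact 90-degree turns close the loop.
pose = (0.0, 0.0, 0.0)
for _ in range(4):
    pose = compose(pose, (1.0, 0.0, math.pi / 2))

# Biased case: a 2% rotation error per step leaves a residual position
# error; vSLAM would remove it with loop detection and global map
# optimization once the starting position is revisited.
drift = (0.0, 0.0, 0.0)
for _ in range(4):
    drift = compose(drift, (1.0, 0.0, math.pi / 2 * 1.02))
```

In the ideal case the chained motions return exactly to the origin, while the biased case ends measurably away from it, illustrating why the global map optimization term in the equation above is what separates vSLAM from VO.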
The relationship between vSLAM and VO can also be seen in the pairs of papers [20, 21] and [22, 23]. In [20] and [22], techniques for VO were first proposed. Techniques for vSLAM were then proposed by adding global optimization to the VO [21, 23].
3.2 Structure from motion
Structure from motion (SfM) is a technique for estimating camera motion and the 3D structure of the environment in a batch manner [24]. In [25], an SfM method that runs online was proposed, which the authors named real-time SfM. From a technical point of view, there is no definitive difference between vSLAM and real-time SfM. This may be why the term “real-time SfM” is not found in recent papers.
As explained in this section, vSLAM, VO, and real-
time SfM share many common components. Therefore,
we introduce all of them and do not distinguish these
technologies in this paper.
4 Feature-based methods
Two types of feature-based methods exist in the literature: filter-based and BA-based methods. In this section, we explain both types and provide a comparison. Even though some of the methods were proposed before 2010, we explain them here because they can be considered fundamental frameworks for the other methods.
4.1 MonoSLAM
The first monocular vSLAM algorithm was developed in 2003 by Davison et al. [26, 27], who named it MonoSLAM. MonoSLAM is considered a representative filter-based vSLAM algorithm. In MonoSLAM, camera motion and the 3D structure of an unknown environment are simultaneously estimated using an extended Kalman filter (EKF). The six-degree-of-freedom (6 DoF) camera motion and the 3D positions of feature points are represented as a state vector in the EKF. Uniform motion is assumed in the prediction model, and the results of feature-point tracking are used as observations. Depending on camera movement, new feature points are added to the state vector. Note that the initial map