Visual-lidar Odometry and Mapping: Low-drift, Robust, and Fast
Ji Zhang and Sanjiv Singh
Abstract— Here we present a general framework for combining visual odometry and lidar odometry in a fundamental, first-principles way. The method shows improvements in
performance over the state of the art, particularly in robustness
to aggressive motion and temporary lack of visual features. The
proposed on-line method starts with visual odometry to estimate
the ego-motion and to register point clouds from a scanning
lidar at a high frequency but low fidelity. Then, scan matching
based lidar odometry refines the motion estimation and point
cloud registration simultaneously. We show results with datasets
collected in our own experiments as well as using the KITTI
odometry benchmark. Our proposed method is ranked #1 on
the benchmark in terms of average translation and rotation
errors, with a relative position drift of 0.75%. In addition to comparing motion estimation accuracy, we evaluate the robustness of the method when the sensor suite moves at high speed and is subject to significant ambient lighting changes.
I. INTRODUCTION
Recent separate results in visual odometry and lidar odom-
etry are promising in that they can provide solutions to 6-
DOF state estimation, mapping, and even obstacle detection.
However, drawbacks are present when each sensor is used alone. Visual odometry methods require moderate lighting conditions and fail when too few distinct visual features are available. On the other hand, motion estimation via moving
lidars involves motion distortion in point clouds as range
measurements are received at different times during contin-
uous lidar motion. Hence, the motion often has to be solved
with a large number of variables. Scan matching also fails in
degenerate scenes such as those dominated by planar areas.
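To make the distortion issue concrete, the sketch below de-skews a single lidar sweep by interpolating the sensor pose across the sweep duration and re-projecting each point into the frame at the end of the sweep. It is a minimal illustration assuming a constant-velocity motion model, and the function name and pose inputs are hypothetical; it is not the formulation used in this paper.

```python
# Minimal sketch: de-skewing one lidar sweep by interpolating the sensor pose
# over the sweep. The constant-velocity assumption and the names used here are
# illustrative only, not the V-LOAM formulation.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def deskew_sweep(points, stamps, pose_start, pose_end):
    """Re-project every point into the sensor frame at the end of the sweep.

    points:     (N, 3) raw lidar points, each in the sensor frame at its own
                measurement time.
    stamps:     (N,) per-point times, normalized to [0, 1] over the sweep.
    pose_start: (Rotation, translation) of the sensor in the world frame at sweep start.
    pose_end:   (Rotation, translation) of the sensor in the world frame at sweep end.
    """
    R0, t0 = pose_start
    R1, t1 = pose_end

    # Interpolate the pose at each point's timestamp (constant velocity between endpoints).
    key_rots = Rotation.from_quat(np.vstack([R0.as_quat(), R1.as_quat()]))
    R_i = Slerp([0.0, 1.0], key_rots)(stamps)              # one rotation per point
    t_i = (1.0 - stamps)[:, None] * t0 + stamps[:, None] * t1

    # Transform each point to the world frame, then into the frame at sweep end.
    p_world = R_i.apply(points) + t_i
    return R1.inv().apply(p_world - t1)

# Toy usage: a sweep collected while the sensor translates 0.5 m along x.
pts = np.random.rand(1000, 3) * 10.0
ts = np.linspace(0.0, 1.0, len(pts))
I = Rotation.identity()
deskewed = deskew_sweep(pts, ts, (I, np.zeros(3)), (I, np.array([0.5, 0.0, 0.0])))
```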
Here, we propose a fundamental, first-principles method for ego-motion estimation that combines a monocular camera and a 3D lidar. We would like to accurately estimate the
6-DOF motion as well as a spatial, metric representation of
the environment, in real-time and onboard a robot navigating
in an unknown environment. While cameras and lidars have
complementary strengths and weaknesses, it is not straight-
forward to combine them in a traditional filter. Our method
tightly couples the two modes such that it can handle both
aggressive motion including translation and rotation, and
lack of optical texture as in complete whiteout or blackout
imagery. In non-pathological cases, high accuracy in motion
estimation and environment reconstruction is possible.
J. Zhang and S. Singh are with the Robotics Institute at Carnegie Mellon University. Emails: zhangji@cmu.edu and ssingh@cmu.edu.

Fig. 1. The method aims at motion estimation and mapping using a monocular camera combined with a 3D lidar. A visual odometry method estimates motion at a high frequency but low fidelity to register point clouds. Then, a lidar odometry method matches the point clouds at a low frequency to refine the motion estimates and incrementally build maps. The lidar odometry also removes distortion in the point clouds caused by drift of the visual odometry. The combination of the two sensors allows the method to map accurately even under rapid motion and in undesirable lighting conditions.

Our proposed method, namely V-LOAM, exploits the advantages of each sensor and compensates for the drawbacks of the other, and hence shows further improvement in performance over the state of the art. The method has two sequentially staggered processes. The first uses visual odometry running at a high frequency, the image frame rate (60 Hz), to
estimate motion. The second uses lidar odometry at a low
frequency (1 Hz) to refine motion estimates and remove
distortion in the point clouds caused by drift of the visual
odometry. The distortion-free point clouds are matched and
registered to incrementally build maps. The result is that
the visual odometry handles rapid motion, and the lidar
odometry ensures low drift and robustness in undesirable
lighting conditions. Our finding is that the maps are often
accurate without the need for post-processing. Although
loop closure can further improve the maps, we intentionally
choose not to do so since the emphasis of this work is to
push the limit of accurate odometry estimation.
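As a structural sketch only, the skeleton below mirrors this two-rate design: a visual-odometry update runs once per camera frame and accumulates a high-frequency motion estimate, and once per lidar sweep a scan-matching step refines the pose and registers the sweep onto the map. The class names, interfaces, and data layout are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative skeleton of the two sequentially staggered processes described
# above: per-frame visual odometry (~60 Hz) and per-sweep lidar odometry (~1 Hz).
# VisualOdometry, LidarOdometry, and run() are hypothetical placeholders.
import numpy as np

class VisualOdometry:
    """High-frequency, low-fidelity frame-to-frame motion estimation."""
    def update(self, image):
        # Feature tracking and motion estimation would go here.
        return np.eye(4)                       # 4x4 incremental transform

class LidarOdometry:
    """Low-frequency scan matching that refines motion and builds the map."""
    def __init__(self):
        self.pose = np.eye(4)
        self.map_clouds = []
    def refine(self, sweep, vo_motion_over_sweep):
        # Scan matching against the map would go here; the visual-odometry
        # motion serves as the initial guess and is used to undistort the sweep.
        self.pose = self.pose @ vo_motion_over_sweep
        self.map_clouds.append(sweep)
        return self.pose

def run(frames, sweep_done_at):
    """frames: iterable of camera images; sweep_done_at: dict mapping the index
    of the frame at which a lidar sweep completes to that sweep's point cloud."""
    vo, lo = VisualOdometry(), LidarOdometry()
    pose = np.eye(4)                           # high-rate pose, updated every frame
    vo_since_sweep = np.eye(4)                 # VO motion accumulated over the sweep
    for k, image in enumerate(frames):
        step = vo.update(image)
        pose = pose @ step                     # ~60 Hz estimate, drifts over time
        vo_since_sweep = vo_since_sweep @ step
        if k in sweep_done_at:                 # a sweep just finished (~1 Hz)
            pose = lo.refine(sweep_done_at[k], vo_since_sweep)
            vo_since_sweep = np.eye(4)         # restart accumulation for next sweep
    return pose
```

In the full system the refined pose would also feed back into the high-rate estimate published between sweeps; the sketch collapses that feedback into the single pose variable for brevity.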
The basic algorithm of V-LOAM is general enough that it
can be adapted to use range sensors of different kinds, e.g., a time-of-flight camera. The method can also be configured to
provide localization only, if a prior map is available.
In addition to evaluation on the KITTI odometry bench-
mark [1], we further experiment with a wide-angle camera
and a fisheye camera. Our conclusion is that the fisheye
camera provides more robustness but lower accuracy because
of its larger field of view and higher image distortion.
However, after the scan matching refinement, the final motion
estimation reaches the same level of accuracy. Our experimental results can be seen in a publicly available video: www.youtube.com/watch?v=-6cwhPMAap8.
II. RELATED WORK
Vision and lidar based methods are common for state
estimation [2]. With stereo cameras [3], [4], the baseline
provides a reference to help determine the scale of the motion. However, if a monocular camera is used [5]–[7], the scale of the motion is generally unsolvable without aid from other sensors or assumptions about the motion. The introduction of
RGB-D cameras provides an efficient way to associate visual
images with depth. Motion estimation with RGB-D cameras