DynaSLAM: Tracking, Mapping and Inpainting in Dynamic Scenes
Berta Bescos, José M. Fácil, Javier Civera and José Neira
Abstract— The assumption of scene rigidity is typical in
SLAM algorithms. Such a strong assumption limits the use
of most visual SLAM systems in populated real-world environ-
ments, which are the target of several relevant applications like
service robotics or autonomous vehicles.
In this paper we present DynaSLAM, a visual SLAM system
that, building on ORB-SLAM2 [1], adds the capabilities of dy-
namic object detection and background inpainting. DynaSLAM
is robust in dynamic scenarios for monocular, stereo and
RGB-D configurations. We are capable of detecting the moving
objects either by multi-view geometry, deep learning or both.
Having a static map of the scene allows inpainting the frame
background that has been occluded by such dynamic objects.
We evaluate our system in public monocular, stereo and
RGB-D datasets. We study the impact of several accuracy/speed
trade-offs to assess the limits of the proposed methodology.
DynaSLAM outperforms the accuracy of standard visual SLAM
baselines in highly dynamic scenarios, and it also estimates
a map of the static parts of the scene, which is essential for
long-term applications in real-world environments.
I. INTRODUCTION
Simultaneous Localization and Mapping (SLAM) is a
prerequisite for many robotic applications, for example
collision-free navigation. SLAM techniques jointly estimate
a map of an unknown environment and the robot pose
within it, only from the data streams of its on-board
sensors. The map allows the robot to continually localize
within the same environment without accumulating drift.
This is in contrast to odometry approaches that integrate the
incremental motion estimated within a local window and are
unable to correct the drift when revisiting places.
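The drift behaviour described above can be illustrated with a one-dimensional toy example (purely illustrative; the step size and noise level are assumptions, not values from the paper): integrating noisy incremental motions makes the error variance grow with the number of steps, whereas a map allows re-anchoring the estimate when a place is revisited.

```python
import random

random.seed(0)

# Odometry integrates noisy incremental motions; the accumulated
# error grows roughly as sigma * sqrt(n), i.e., without bound.
true_step = 1.0     # assumed ground-truth step length
noise_sigma = 0.05  # assumed per-step measurement noise

position_estimate = 0.0
true_position = 0.0
for _ in range(1000):
    true_position += true_step
    position_estimate += true_step + random.gauss(0.0, noise_sigma)

drift = abs(position_estimate - true_position)
print(drift)  # on the order of noise_sigma * sqrt(1000), i.e. a few percent
```

A SLAM system, in contrast, can reset this accumulated error by localizing against its map when revisiting a known place.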
Visual SLAM, where the main sensor is a camera, has
received a high degree of attention and research efforts over
the last years. The minimalistic solution of a monocular cam-
era has practical advantages with respect to size, power and
cost, but it also poses several challenges, such as the unobservability
of the scale and the difficulty of state initialization. By using more complex
setups, like stereo or RGB-D cameras, these issues are solved
and the robustness of visual SLAM systems can be greatly
improved.
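The scale ambiguity mentioned above disappears with a calibrated stereo pair, since metric depth follows directly from disparity. A minimal sketch (the focal length and baseline values are illustrative assumptions, not taken from the paper):

```python
# Depth from stereo disparity: z = f * b / d, where f is the focal
# length in pixels, b the baseline in meters and d the disparity in
# pixels. A single monocular view cannot recover z without extra
# information, which is the source of the scale unobservability.

def stereo_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Metric depth of a point observed with the given disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Example with assumed calibration: f = 700 px, b = 0.12 m.
z = stereo_depth(disparity_px=10.0, focal_px=700.0, baseline_m=0.12)
print(round(z, 2))  # 8.4 (meters)
```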
The research community has addressed SLAM from
many different angles. However, the vast majority of the
approaches and datasets assume a static environment. As
This work has been supported by NVIDIA Corporation through the
donation of a Titan X GPU, by the Spanish Ministry of Economy and
Competitiveness (projects DPI2015-68905-P and DPI2015-67275-P, FPI
grant BES-2016-077836), and by the Aragón regional government (Grupo
DGA T04-FSE).
Berta Bescos, José M. Fácil, Javier Civera and José Neira
are with the Instituto de Investigación en Ingeniería de
Aragón (I3A), Universidad de Zaragoza, Zaragoza 50018, Spain
{bbescos,jmfacil,jcivera,jneira}@unizar.es
(a) Input RGB-D frames with dynamic content.
(b) Output RGB-D frames. Dynamic content has been removed. Occluded
background has been reconstructed with information from previous views.
(c) Map of the static part of the scene, after removal of the dynamic objects.
Fig. 1: Overview of DynaSLAM results for the RGB-D case.
a consequence, they can only handle small fractions of
dynamic content by classifying it as outliers to the static
model. Although the static assumption holds for some robotic
applications, it limits the applicability of visual SLAM in
many relevant cases, such as intelligent autonomous systems
operating in populated real-world environments over long
periods of time.
Visual SLAM methods can be classified into feature-based methods
[2], [3], which rely on matching salient points and can only esti-
mate a sparse reconstruction; and direct methods [4], [5], [6],
which can in principle estimate a completely dense
reconstruction by directly minimizing the photometric
error, often with TV regularization. Some direct methods focus on
high-gradient areas, estimating semi-dense maps [7], [8].
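The photometric error minimized by direct methods can be sketched with a toy example. Here the target image is assumed to be already warped into the reference frame; in a real system this warp depends on the estimated camera pose and per-pixel depth, which are the quantities being optimized (the function and image values below are illustrative, not from any cited system):

```python
def photometric_error(ref, warped):
    """Sum of squared intensity differences between a reference image
    and a target image already warped into the reference frame.
    Images are given as lists of rows of scalar intensities.
    Direct methods minimize this quantity over pose (and depths)."""
    return sum((r - w) ** 2
               for ref_row, warp_row in zip(ref, warped)
               for r, w in zip(ref_row, warp_row))

ref = [[10, 20], [30, 40]]
print(photometric_error(ref, ref))                    # 0: perfect alignment
print(photometric_error(ref, [[11, 21], [31, 41]]))   # 4: each pixel off by 1
```

Feature-based methods instead minimize a geometric reprojection error over a sparse set of matched keypoints, which is why they yield only sparse reconstructions.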
None of the above methods, which constitute the state of the
art, addresses the very common problem of dynamic objects
in the scene, e.g., people walking, bicycles or cars. Detecting
and dealing with dynamic objects in visual SLAM poses
several challenges for both mapping and tracking, including:
1) How to detect such dynamic objects in the images to:
a) Prevent the tracking algorithm from using
matches that belong to dynamic objects.
arXiv:1806.05620v2 [cs.CV] 15 Aug 2018