AutoRemover: Automatic Object Removal for Autonomous Driving Videos
Rong Zhang,1∗ Wei Li,2,4† Peng Wang,2 Chenye Guan,2 Jin Fang,2 Yuhang Song,3 Jinhui Yu,1 Baoquan Chen,4 Weiwei Xu,1† Ruigang Yang2
1Zhejiang University, 2Baidu Research, Baidu Inc., 3University of Southern California, 4Peking University
cadzhangrong@zju.edu.cn, liweimcc@gmail.com, jerryking234@gmail.com, {guanchenye, fangjin}@baidu.com,
yuhangso@usc.edu, jhyu@cad.zju.edu.cn, baoquan@pku.edu.cn, xww@cad.zju.edu.cn, ryang2@outlook.com
Abstract
Motivated by the need for photo-realistic simulation in autonomous driving, in this paper we present AutoRemover, a video inpainting algorithm designed specifically for generating street-view videos without any moving objects. Our setup poses two challenges. The first is shadows: shadows are usually unlabeled but tightly coupled with the moving objects. The second is the large camera ego-motion in the videos. To deal with shadows, we build an autonomous driving shadow dataset and design a deep neural network to detect shadows automatically. To deal with large ego-motion, we take advantage of the multi-source data available in autonomous driving, in particular the 3D data. More specifically, the geometric relationship between frames is incorporated into an inpainting deep neural network to produce high-quality, structurally consistent video output. Experiments show that our method outperforms other state-of-the-art (SOTA) object removal algorithms, reducing the RMSE by over 19%.
1 Introduction
With the explosive growth of AI robotic techniques, especially autonomous driving (AD) vehicles, countless images and videos, as well as other sensor data, are captured daily. To fuel the learning-based AI algorithms (such as perception, scene parsing, and planning) in these intelligent systems, a large amount of annotated data is still in great demand. Thus, building virtual simulators that save the massive effort of labeling and processing the captured data is essential to making the data best used for various AD applications (Alhaija et al. 2018; Seif and Hu 2016). One basic procedure in these applications is removing the unwanted or hard-to-annotate parts of the raw data, a.k.a. object removal or image/video inpainting. As shown in Figure 1, with the simulation system developed in (Li et al. 2019), the background image obtained by removing the foreground vehicles can be used to synthesize new traffic images with annotations or to reconstruct 3D road models with clean textures, which is one of the desirable ways of data augmentation.
The image inpainting problem has been widely investigated and also forms the basis of video inpainting.
∗The authors from Zhejiang University are affiliated with the State Key Lab of CAD&CG.
†Corresponding authors.
Figure 1: The 1st row shows a source image from a video and its inpainted result. The 2nd row shows the use of inpainting in data augmentation and simulation: with the inpainted background, vehicles can be moved or inserted to synthesize new traffic images. The 3rd row shows that inpainted videos are used to yield 3D models with clean texture.
Technically, image inpainting algorithms either utilize similar patches in the current image to fill the hole (optimization-based methods) or directly hallucinate content learned from training images (learning-based methods). Recently, CNNs, and especially GANs, have hugely advanced image inpainting (Pathak et al. 2016; Iizuka, Simo-Serra, and Ishikawa 2017; Yu et al. 2018a), yielding visually plausible and impressive results. However, directly applying image inpainting techniques to videos suffers from jittering and inconsistency. Thus, various temporal constraints have been introduced in recent video inpainting approaches (Huang et al. 2016; Xu et al. 2019), whose core is to jointly estimate optical flow and inpaint color.
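To make this flow-guided idea concrete, the following minimal sketch (our illustration under assumed inputs, not the implementation of any cited system) fills the masked pixels of a frame by propagating colors from the previous frame along a precomputed backward optical flow field; the function name, array shapes, and the availability of valid flow inside the hole are all assumptions of this example.

import numpy as np

# Illustrative sketch: propagate colors from frame t-1 into the masked
# region of frame t along a backward optical flow field, where flow[y, x]
# maps a pixel of frame t to its source location in frame t-1.
def propagate_colors(frame_prev, mask, flow):
    # frame_prev: (H, W, 3) previous frame; mask: (H, W) bool, True where
    # frame t must be inpainted; flow: (H, W, 2) backward flow (dx, dy).
    H, W = mask.shape
    filled = np.zeros((H, W, 3), dtype=frame_prev.dtype)
    ys, xs = np.nonzero(mask)
    # Follow each hole pixel's flow vector back into the previous frame,
    # clamping the source coordinates to the image bounds.
    src_x = np.clip(np.round(xs + flow[ys, xs, 0]).astype(int), 0, W - 1)
    src_y = np.clip(np.round(ys + flow[ys, xs, 1]).astype(int), 0, H - 1)
    filled[ys, xs] = frame_prev[src_y, src_x]
    return filled

In practice the flow is itself unknown inside the hole, since the occluding object moves with it, which is exactly why the approaches above jointly estimate optical flow and inpaint color rather than warping naively as in this sketch.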
Even though several video inpainting systems have been proposed very recently, their target scenarios usually exhibit only small camera ego-motion relative to the foreground objects' movements, where the flow between frames is easy to estimate. Unfortunately, the videos captured by AD vehicles have large camera ego-motion (Fig-