Figure 2: Sensor performance in a fog chamber with very dense fog (columns: RGB camera, gated camera, lidar, bird's eye view). The first row shows recordings without fog, while the second row shows the same scene in dense fog.
outdoor environments is an open challenge. Recent ap-
proaches tackle the lack of dense training data by propos-
ing semi-supervised methods relying on relative depth [10],
stereo images [15, 16, 31], sparse lidar points [31] or seman-
tic labels [62]. Passive methods have in common that their precision is more than an order of magnitude below that of scanning lidar systems, which makes them no viable alternative to ubiquitous lidar ranging in autonomous vehicles [51]. In this work, we propose a method that closes this precision gap using low-cost gated imagers.
Sparse Depth Completion. As an alternative approach to recovering accurate dense depth, recent works propose depth completion from sparse lidar measurements. Similar to monocular depth estimation, learned encoder-decoder architectures have been proposed for this task [11, 27, 37]. Jaritz et al. [27] propose to incorporate RGB color data for upsampling sparse depth samples, but also require sparse depth samples in downstream scene understanding tasks.
To allow for an independent design of depth estimation and
scene analysis algorithms, the completion architecture has
to be trained with varying sparsity patterns [27, 37] or ad-
ditional validity maps [11]. While these depth completion
methods offer improved depth estimates, they suffer from
the same limitation as scanned lidar: low spatial resolu-
tion at long ranges due to limited angular sampling, low-
resolution detectors, and costly mechanical scanning.
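To make the robustness requirement above concrete, the following minimal sketch (not the pipeline of any cited work; function and parameter names are illustrative) shows how a dense depth map can be randomly subsampled during training to simulate varying sparsity patterns, with the mask doubling as a validity map:

```python
import numpy as np

def random_sparsity(dense_depth, keep_fraction, rng=None):
    """Simulate a sparse lidar-like input by randomly masking a dense
    depth map. Varying `keep_fraction` across training batches mimics
    the varying sparsity patterns used to make completion networks
    robust; the boolean mask serves as a validity map.
    """
    rng = np.random.default_rng(rng)
    mask = rng.random(dense_depth.shape) < keep_fraction
    sparse = np.where(mask, dense_depth, 0.0)  # 0 marks missing depth
    return sparse, mask

# Toy example: keep ~30% of a constant 10 m depth map.
depth = np.full((4, 6), 10.0)
sparse, valid = random_sparsity(depth, keep_fraction=0.3, rng=0)
```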
Time-of-Flight Depth Cameras. Amplitude-modulated C-
ToF cameras [19, 30, 33], such as Microsoft’s Kinect One,
have become broadly adopted for indoor sensing [23, 53].
These cameras measure depth by recording the phase shift of periodically modulated flood illumination, from which the time of flight of the reflected flood light from the source to the scene and back to the camera can be extracted. How-
ever, in addition to the modulated light, this sensing ap-
proach also records all ambient background light. While
per-pixel lock-in amplification removes background com-
ponents efficiently in indoor scenarios [33], and learned ar-
chitectures can alleviate multi-path distortions [55], existing C-ToF cameras are limited to ranges of a few meters in outdoor scenarios with strong sunlight [22].
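The phase-to-depth relation underlying these cameras can be sketched as follows (a minimal illustration of the principle; the modulation frequency and measured phase are made-up example values, not parameters from the cited sensors):

```python
import math

def ctof_depth(phase_shift_rad, mod_freq_hz, c=299_792_458.0):
    """Depth from the phase shift of amplitude-modulated C-ToF light.

    The round trip of length 2d delays the modulation by
    phi = 2*pi*f*(2d/c), so d = c*phi / (4*pi*f). The estimate is
    unambiguous only for depths below c/(2f).
    """
    return c * phase_shift_rad / (4.0 * math.pi * mod_freq_hz)

# Example: 20 MHz modulation, measured phase of pi/2 -> d ~ 1.87 m;
# the unambiguous range at 20 MHz is c/(2f), roughly 7.5 m.
d = ctof_depth(math.pi / 2, 20e6)
```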
Gated cameras send out pulses of flood-light and only
record photons from a certain distance by opening and closing the camera after a given delay. Gated imaging was first proposed by Heckman et al. [21]. This acquisition mode allows backscatter from fog, rain, and snow to be gated out [18]. Busck et al. [3, 6, 7] use gated imaging for
high-resolution depth sensing by capturing large sequences
of narrow gated slices. However, as the depth accuracy is
inversely related to the gate width, and hence the number
of required captures, sequentially capturing high-resolution
gated depth is infeasible at real-time frame-rates. Recently,
a line of research proposes analytic reconstruction mod-
els for known pulse and integration shapes [34, 35, 61].
These approaches require perfect knowledge of the integra-
tion and pulse profiles, which is impractical due to drift,
and they provide low precision for broad gating windows
in real-time capture settings. Adam et al. [2] and Schober et al. [50] present Bayesian methods for pulsed time-of-flight imaging of room-sized scenes. These methods solve probabilistic per-pixel estimation problems using priors on depth, reflectivity, and ambient light, which is possible when using nanosecond exposure profiles [2, 50] for room-sized scenes. In this work, we demonstrate that exploiting spatio-temporal scene semantics allows recovering dense, lidar-accurate depth from only three slices, with exposures two orders of magnitude longer (> 100 ns), acquired in real time. Using such wide exposure gates allows us to rely on low-cost gated CMOS imagers instead of detectors with high temporal resolution, such as SPADs.
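The gating geometry described above can be sketched in a few lines (the delay and gate width below are illustrative values, not the actual settings used in this work):

```python
def gate_range_window(delay_s, gate_width_s, c=299_792_458.0):
    """Range interval imaged by a single gated slice.

    The camera opens after `delay_s` and stays open for `gate_width_s`;
    only photons with round-trip time in [delay, delay + width] are
    integrated, i.e. reflectors at ranges between c*delay/2 and
    c*(delay + width)/2 contribute to the slice.
    """
    r_min = c * delay_s / 2.0
    r_max = c * (delay_s + gate_width_s) / 2.0
    return r_min, r_max

# Example: a 100 ns gate opened after 200 ns images roughly 30-45 m.
r_min, r_max = gate_range_window(200e-9, 100e-9)
```

Note how a broad (> 100 ns) gate covers a range window of many meters, which is why per-slice intensities alone give only coarse depth and the reconstruction must combine several slices.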
3. Gated Imaging
In this section, we review gated imaging and propose an
analytic per-pixel depth estimation method.
Gated Imaging. Consider the setup shown in Figure 3,
where an amplitude-modulated source flood-illuminates the
scene with broad rect-shaped “pulses” of light. The syn-
chronized camera opens after a delay ξ to receive only pho-
tons with round-trip path-length longer than ξ · c, where c
is the speed of light. Assuming a dominating Lambertian reflector at distance r, the detector gain is temporally modulated with the gating function g, resulting in the exposure measurement
\[ I(r) = \alpha C(r) = \int_{-\infty}^{\infty} g(t - \xi)\, \kappa(t, r)\, dt, \tag{1} \]
where κ is the temporal scene response, α the albedo of
the reflector, and C (r) the range-intensity profile. With the
reflector at distance r, the temporal scene response can be
described as
\[ \kappa(t, r) = \alpha\, p\!\left(t - \frac{2r}{c}\right) \beta(r), \tag{2} \]
where p is the laser pulse profile, and atmospheric effects, e.g., in a scattering medium, are modeled by the