FlowNet3D: Learning Scene Flow in 3D Point Clouds
Xingyu Liu*¹    Charles R. Qi*²    Leonidas J. Guibas¹,²
¹Stanford University    ²Facebook AI Research
* indicates equal contributions.
Abstract
Many applications in robotics and human-computer in-
teraction can benefit from understanding 3D motion of
points in a dynamic environment, commonly known as scene
flow. While most previous methods focus on stereo and
RGB-D images as input, few try to estimate scene flow di-
rectly from point clouds. In this work, we propose a novel
deep neural network named FlowNet3D that learns scene
flow from point clouds in an end-to-end fashion. Our net-
work simultaneously learns deep hierarchical features of
point clouds and flow embeddings that represent point mo-
tions, supported by two newly proposed learning layers for
point sets. We evaluate the network on both challenging
synthetic data from FlyingThings3D and real Lidar scans
from KITTI. Trained on synthetic data only, our network
successfully generalizes to real scans, outperforming various baselines and showing results competitive with the prior art. We also demonstrate two applications of our scene flow output (scan registration and motion segmentation) to illustrate its potentially wide range of use cases.
1. Introduction
Scene flow is the 3D motion field of points in the
scene [27]. Its projection to an image plane becomes 2D
optical flow. It is a low-level understanding of a dynamic
environment, without any assumed knowledge of structure
or motion of the scene. With this flexibility, scene flow can
serve many higher-level applications. For example, it provides motion cues for object segmentation, action recognition, and camera pose estimation, or can even act as a regularization for other 3D vision problems.
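To make the relation between the two quantities concrete (the notation below is ours, introduced only for illustration and consistent with the standard definition in [27]): if a 3D point at position X_t moves to X_{t+1} between frames and pi denotes the camera projection onto the image plane, then

% Scene flow vs. optical flow (illustrative notation, not from the original text)
\begin{align*}
  \text{scene flow:}   \quad & D_t = X_{t+1} - X_t \;\in\; \mathbb{R}^3, \\
  \text{optical flow:} \quad & d_t = \pi(X_{t+1}) - \pi(X_t) = \pi(X_t + D_t) - \pi(X_t) \;\in\; \mathbb{R}^2.
\end{align*}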
However, for this 3D flow estimation problem, most pre-
vious works rely on 2D representations. They extend meth-
ods for optical flow estimation to stereo or RGB-D images,
and usually estimate optical flow and disparity maps separately [33, 28, 16], without directly optimizing for 3D scene
flow. These methods cannot be applied to cases where point
clouds are the only input.
Figure 1: End-to-end scene flow estimation from point clouds. Our model directly consumes raw point clouds from two consecutive frames, and outputs dense scene flow (as translation vectors) for all points in the 1st frame.

Very recently, researchers in the robotics community started to study scene flow estimation directly in 3D point clouds (e.g. from Lidar) [7, 25]. But those works did not
benefit from deep learning as they built multi-stage systems
based on hand-crafted features, with simple models such as
logistic regression. These methods often rely on strong assumptions, such as scene rigidity or the existence of point correspondences, which makes it hard to adapt them to benefit from deep networks. On the other hand, in the
learning domain, Qi et al. [19, 20] recently proposed novel
deep architectures that directly consume point clouds for
3D classification and segmentation. However, their work
focused on processing static point clouds.
In this work, we connect the above two research frontiers
by proposing a deep neural network called FlowNet3D that
learns scene flow in 3D point clouds end-to-end. As illus-
trated in Fig. 1, given input point clouds from two consec-
utive frames (point cloud 1 and point cloud 2), our network
estimates a translational flow vector for every point in the
first frame to indicate its motion between the two frames.
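To make this input/output contract concrete, here is a minimal sketch assuming NumPy; the function name is ours and it implements only a naive nearest-neighbor baseline, not the proposed network, shown purely to illustrate the shapes involved.

import numpy as np

# Naive nearest-neighbor baseline (NOT the proposed network), shown only to make the
# input/output contract concrete: frame-1 points (n1, 3) and frame-2 points (n2, 3) go in,
# one translational flow vector per frame-1 point (n1, 3) comes out.
def nearest_neighbor_flow(points1: np.ndarray, points2: np.ndarray) -> np.ndarray:
    # Pairwise squared distances between the two point sets, shape (n1, n2).
    d2 = ((points1[:, None, :] - points2[None, :, :]) ** 2).sum(axis=-1)
    nearest = points2[d2.argmin(axis=1)]   # closest frame-2 point for each frame-1 point
    return nearest - points1               # flow = displacement to that closest point

Such a baseline breaks down whenever the two frames are sampled differently or the geometry is not distinctive, which is precisely the association problem the learned flow embedding layer below is meant to handle.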
The network, based on the building blocks from [19], is
able to simultaneously learn deep hierarchical features of
point clouds and flow embeddings that represent their mo-
tions. While there are no correspondences between the two sampled point clouds, our network learns to associate points based on their spatial locality and geometric similarity, through our newly proposed flow embedding layer.
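As a rough sketch of this idea, under our own simplifications (a single random-weight layer standing in for the learned network, a fixed-radius neighborhood instead of the paper's sampling and grouping, which are defined later), each frame-1 point aggregates, over nearby frame-2 points, an encoding of the two points' features and their spatial displacement:

import numpy as np

# Illustrative sketch only, not the paper's exact layer.
# xyz1: (n1, 3), feat1: (n1, c), xyz2: (n2, 3), feat2: (n2, c).
def flow_embedding_sketch(xyz1, feat1, xyz2, feat2, radius=1.0, out_dim=64, seed=0):
    rng = np.random.default_rng(seed)
    c = feat1.shape[1]
    W = rng.standard_normal((2 * c + 3, out_dim))         # stand-in for a learned MLP layer
    embeddings = np.zeros((xyz1.shape[0], out_dim))
    for i, (x, f) in enumerate(zip(xyz1, feat1)):
        near = np.linalg.norm(xyz2 - x, axis=1) < radius  # frame-2 points near this frame-1 point
        if not near.any():
            continue
        # Encode (frame-1 feature, frame-2 feature, displacement) per neighbor, then max-pool.
        pairs = np.concatenate(
            [np.repeat(f[None, :], near.sum(), axis=0), feat2[near], xyz2[near] - x], axis=1)
        embeddings[i] = np.maximum(pairs @ W, 0.0).max(axis=0)
    return embeddings  # (n1, out_dim): one flow embedding per frame-1 point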
Each output embedding implicitly represents the 3D mo-
tion of a point. The network then up-samples and refines these embeddings in an informed way through another novel set upconv layer. Compared to direct feature