Pointwise Convolutional Neural Networks
Binh-Son Hua Minh-Khoi Tran Sai-Kit Yeung
The University of Tokyo Singapore University of Technology and Design
Abstract
Deep learning with 3D data such as reconstructed point
clouds and CAD models has received great research interest
recently. However, the capability of using point clouds
with convolutional neural networks has so far not been fully
explored. In this paper, we present a convolutional neural
network for semantic segmentation and object recognition
with 3D point clouds. At the core of our network is pointwise
convolution, a new convolution operator that can be
applied at each point of a point cloud. Our fully convolutional
network design, while being surprisingly simple to
implement, can yield competitive accuracy in both semantic
segmentation and object recognition tasks.
1. Introduction
Deep learning with 3D data has received great research
interest recently, leading to noticeable advances in
typical applications including scene understanding, shape
completion, and shape matching. Among these, scene understanding
is considered one of the most important tasks
for robots and drones as it can assist exploratory scene
navigation. Tasks such as semantic scene segmentation and
object recognition are often performed to predict contextual
information about objects for both indoor and outdoor
scenes.
Unfortunately, deep learning in 3D has been deemed difficult
because there are several ways to represent 3D data,
such as volumes, point clouds, or multi-view images. Volume
representation is a true 3D representation and is straightforward
to implement but often requires a large amount of
memory for data storage. By contrast, multi-view representation
is not a true 3D representation but shows promising
prediction accuracy as existing pre-trained weights from 2D
networks can be utilized. Among such representations, point
clouds have been the most flexible as they are compact and
can be exported from a wide range of CAD modelling
and 3D reconstruction software. However, the capability of
using point clouds with neural networks has so far not been
fully explored.
This work was done when Binh-Son Hua was a postdoctoral researcher
at Singapore University of Technology and Design in 2017.
Figure 1: Pointwise convolution. We define a new convolution
operator for point cloud input. For each point, nearest
neighbors are queried on the fly and binned into kernel
cells before convolving with kernel weights. By stacking
pointwise convolution operators together, we can build fully
convolutional neural networks for scene segmentation and
object recognition for point clouds.
In this paper, we present a convolutional neural network
for semantic segmentation and object recognition with 3D
point clouds. At the core of our network is a new convolution
operator, called pointwise convolution, which can be applied
at each point in a point cloud to learn pointwise features.
This leads to surprisingly simple and fully convolutional network
designs for scene segmentation and object recognition.
Our experiments show that pointwise convolution can yield
accuracy competitive with previous techniques while being
much simpler to implement. In summary, our contributions
are:
• A pointwise convolution operator that can output features
at each point in a point cloud;
• Two pointwise convolutional neural networks for semantic
scene segmentation and object recognition.
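The operator outlined in the Figure 1 caption (query neighbors of each
point, bin them into kernel cells, convolve with kernel weights) could be
sketched roughly as follows. This is a minimal NumPy illustration and not
the authors' implementation: the function name pointwise_conv, the cubic
K x K x K kernel, the radius parameter, the brute-force neighbor query,
and the per-cell averaging of features are all assumptions made here for
illustration.

```python
import numpy as np


def pointwise_conv(points, feats, weights, radius):
    """Illustrative pointwise convolution (hypothetical helper).

    points:  (N, 3) xyz coordinates
    feats:   (N, C_in) input features per point
    weights: (K, K, K, C_in, C_out) one weight matrix per kernel cell
    radius:  half-width of the cubic kernel centred at each point
    Returns (N, C_out) output features, one row per input point.
    """
    n = points.shape[0]
    k = weights.shape[0]
    out = np.zeros((n, weights.shape[-1]))
    cell = 2.0 * radius / k  # edge length of one kernel cell

    for i in range(n):
        # Brute-force neighbor query: points inside the cube around point i.
        offset = points - points[i]
        idx = np.nonzero(np.all(np.abs(offset) < radius, axis=1))[0]

        # Bin each neighbor into a kernel cell by its offset from the centre.
        cells = np.floor((offset[idx] + radius) / cell).astype(int)
        cells = np.clip(cells, 0, k - 1)
        cx, cy, cz = cells[:, 0], cells[:, 1], cells[:, 2]

        # Assumption: average the features of the points that share a cell.
        summed = np.zeros((k, k, k, feats.shape[1]))
        counts = np.zeros((k, k, k))
        np.add.at(summed, (cx, cy, cz), feats[idx])
        np.add.at(counts, (cx, cy, cz), 1.0)
        mean = summed / np.maximum(counts, 1.0)[..., None]

        # Convolve: apply each cell's weights and accumulate over all cells.
        out[i] = np.einsum("abcd,abcde->e", mean, weights)
    return out


# Usage: the output keeps one feature vector per input point.
rng = np.random.default_rng(0)
pts = rng.random((64, 3))
x = rng.random((64, 4))
w = rng.standard_normal((3, 3, 3, 4, 8))
y = pointwise_conv(pts, x, w, radius=0.2)
print(y.shape)  # (64, 8)
```

Because the output has the same number of points as the input, such
operators can be stacked without any pooling or upsampling step, which is
what makes a fully convolutional design straightforward. A practical
implementation would replace the O(N^2) neighbor search with a spatial
index.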
arXiv:1712.05245v2 [cs.CV] 29 Mar 2018