A GRAPH-CNN FOR 3D POINT CLOUD CLASSIFICATION
Yingxue Zhang and Michael Rabbat
McGill University
Montreal, Canada
ABSTRACT
Graph convolutional neural networks (Graph-CNNs) extend tradi-
tional CNNs to handle data that is supported on a graph. Major chal-
lenges when working with data on graphs are that the support set (the
vertices of the graph) do not typically have a natural ordering, and in
general, the topology of the graph is not regular (i.e., vertices do not
all have the same number of neighbors). Thus, Graph-CNNs have
huge potential to deal with 3D point cloud data which has been ob-
tained from sampling a manifold. In this paper we develop a Graph-
CNN for classifying 3D point cloud data, called PointGCN
1
. The
architecture combines localized graph convolutions with two types
of graph downsampling operations (also known as pooling). By
the effective exploration of the point cloud local structure using the
Graph-CNN, the proposed architecture achieves competitive perfor-
mance on the 3D object classification benchmark ModelNet, and our
architecture is more stable than competing schemes.
Index Terms— Graph convolutional neural networks, graph
signal processing, 3D point cloud data, supervised learning
1. INTRODUCTION
With the advent of very large datasets and improved computational
capabilities, methods using convolutional neural networks (CNNs)
now achieve state-of-the-art performance on a variety of tasks, in-
cluding speech recognition and image classification. Many emerging
applications give rise to data that may be viewed as being supported
on the vertices of a graph, and field of graph signal processing (GSP)
has developed filtering and other operations on graph signals [1, 2].
Data may either be naturally sampled on the vertices or edges of a
graph (e.g., flows on a transportation network), or the data may sim-
ply be unstructured and a graph is imposed to capture the manifold
structure underlying the data (e.g., the 3D point clouds considered
in this paper). Unlike the domains encountered in more traditional
signal processing (e.g., 1D time-series, 2D images), general graph
topologies do not have the same regularity or symmetries, and so
there is not a unique, well-defined notion of convolution on a graph.
This has motivated researchers to develop a variety of approaches to
convolutions on graphs, which can then be applied in graph-CNNs
and other graph-based signal processing architectures.
Bruna et al. [3, 4] first proposed the idea of using a graph con-
volution defined in the graph spectral domain together with a graph
multiresolution clustering approach to achieve pooling/downsampling.
Defferrard et al. [5] propose a fast localized convolution operation
by leveraging the recursive form of Chebyshev polynomials to both
avoid explicitly calculating the Fourier graph basis and to allow the
number of learnable filter coefficients to be independent of the graph
1
Code is available at https://github.com/maggie0106/
Graph-CNN-in-3D-Point-Cloud-Classification
size. Atwood and Towsley [6] use a similar localized filtering idea
but define the convolution process directly in the spatial domain by
searching the receptive filed at different scales using random walk.
Graph kernels have also been applied to the graph classification
task [7, 8] which aims to classify a graph based on its topology, as
opposed to classifying or otherwise processing signals on a graph.
However, Graph kernels suffer from quadratic training complexity
in the number of graphs [8].
The formulation of Graph-CNNs opens up a range of applica-
tions. Defferrard et al. [5] validate their model on an image classifi-
cation task and demonstrate the effectiveness of Graph-CNNs. Kipf
and Welling [9] study the application of the Graph-CNNs to semi-
supervised learning. In this paper, we explore the application of the
Graph-CNNs in 3D point cloud data.
GSP techniques have been applied to process 3D point cloud
data, such as that obtained by light detection and ranging (LiDAR)
sensors. Rather than binning point clouds into voxels, graph-based
approaches fit a graph with one vertex for each point and edges be-
tween nearby points, and then operate on the graph. The effective-
ness of GSP for processing 3D point cloud data has been demon-
strated in applications such as data visualization, in-painting, and
compression [10, 11, 12, 13].
In this work, we propose a Graph-CNN architecture called
PointGCN for classifying 3D point cloud data by exploring its local
structure encoded in the constructed graph. Unlike most previous
Graph-CNNs, in this setting both the signals and the graph structure
vary from input to input. The proposed architecture uses existing
graph convolution operation together with two types of specifi-
cally designed pooling layers for point cloud data. The architecture
learns a latent signature summarizing each point cloud at different
receptive fields.
We achieve an average classification accuracy comparable to the
state-of-the-art on the ModelNet benchmark, and the variance of the
proposed approach is substantially lower than existing point-based
classification methods.
2. PROBLEM STATEMENT
We consider a classification problem where we are given m labeled
training instances {(X
j
, y
j
)}, each composed of an input X
j
∈ X
and an output y
j
∈ Y. Our goal is produce a function y = f(X)
to predict the output y associated with a new, unseen input X. For
point-based 3D classification problem, we consider the case where
the output space Y is finite (the classes), and each input X
j
is a set
of n points, {x
j,1
, . . . , x
j,n
} ⊂ R
3
.
Previous work has taken different approaches to classifying 3D
point clouds [14, 15, 16, 17, 18, 19, 20, 21], including rendering and
processing a collection of 2D images (projections of the points onto
an image plane from different perspectives), or binning the points
arXiv:1812.01711v1 [cs.CV] 28 Nov 2018