Robust Place Recognition using an Imaging Lidar
Tixiao Shan, Brendan Englot, Fábio Duarte, Carlo Ratti, and Daniela Rus
Abstract— We propose a methodology for robust, real-time
place recognition using an imaging lidar, which yields image-
quality high-resolution 3D point clouds. Utilizing the intensity
readings of an imaging lidar, we project the point cloud
and obtain an intensity image. ORB feature descriptors are
extracted from the image and encoded into a bag-of-words
vector. The vector, used to identify the point cloud, is inserted
into a database that is maintained by DBoW for fast place
recognition queries. The returned candidate is further validated
by matching visual feature descriptors. To reject matching outliers, we apply Perspective-n-Point (PnP) with RANSAC, which minimizes the reprojection error between the visual features' positions in Euclidean space and their correspondences in 2D image space. Combining the advantages of both camera- and lidar-based place recognition approaches, our method is truly rotation-invariant and can handle reverse and upside-down revisiting. The proposed method is evaluated on datasets
gathered from a variety of platforms over different scales
and environments. Our implementation is available at https:
//git.io/imaging-lidar-place-recognition.
I. INTRODUCTION
Place recognition plays an important role in many mo-
bile robotics applications, such as solving the kidnapped
robot problem, localizing a robot in a known map, and
maintaining the accuracy of simultaneous localization and
mapping (SLAM). During the last two decades, a variety
of place recognition methods have achieved great success
in tackling such problems using camera, lidar, and other
perceptual sensors. Camera-based place recognition methods
often extract visual features from textured scenes and find
candidates using a bag-of-words approach. However, such methods are sensitive to illumination and viewpoint changes.
On the other hand, lidar-based place recognition methods,
which often extract local or global descriptors from a point
cloud, are invariant to such changes. The long detection
range and wide aperture of lidar permit the capture of many
structural details of an environment. Yet such details are often
discarded during descriptor extraction, which may result
in false positive detections when surrounded by repeating
structures. Because most lidars offer only low resolution, camera-based methods typically cannot be applied to lidar data; conversely, lidar-based methods typically cannot be applied to camera data, which lacks structural information. However, with the recent availability of high-resolution lidars such as the Ouster OS1-128 and the Velodyne VLS-128, we can begin to bridge the gap between camera-based and lidar-based place recognition methods.
T. Shan, F. Duarte, and C. Ratti are with the Department of Urban Studies and Planning, Massachusetts Institute of Technology, USA, {shant, fduarte, ratti}@mit.edu.
B. Englot is with the Department of Mechanical Engineering, Stevens Institute of Technology, USA, benglot@stevens.edu.
T. Shan and D. Rus are with the Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, USA, {shant, rus}@mit.edu.
Fig. 1: A demonstration of the proposed method applied to a
mapping task. Left: a loop is found when the place is revisited.
Grayscale images are intensity images projected from point clouds.
Green lines connect the matched features. Right: top-view point
cloud map of a parking lot. Red line indicates the traversed
trajectory. Blue segments along with green dots indicate detected
loop closures using our method. Note that features are extracted
from the traffic arrow on the ground for place recognition.
We refer to such a high-resolution lidar, which produces image-quality 3D scans, as an imaging lidar. Driven by the prospects of this technology,
we present a method for robust place recognition using
an imaging lidar. We first project the high-resolution point
cloud with intensity information onto an intensity image.
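As an illustration of this projection step, the sketch below maps an intensity-annotated point cloud onto a 2D image using a simple spherical model, and also records the 3D point behind each pixel. The helper name, the 1024×128 resolution, and the vertical field of view are assumptions made for the example, not the exact parameters of the sensor or of our implementation.

```python
import numpy as np

def project_to_intensity_image(points, intensities,
                               height=128, width=1024,
                               fov_up_deg=22.5, fov_down_deg=-22.5):
    """Spherically project an intensity-annotated point cloud onto a 2D image.

    points      : (N, 3) x, y, z coordinates in the sensor frame
    intensities : (N,) lidar intensity (reflectivity) readings
    Returns an (H, W) uint8 intensity image and an (H, W, 3) map holding the
    3D point behind each pixel. Resolution and vertical field of view are
    illustrative values, not the actual sensor parameters.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1) + 1e-6
    yaw = np.arctan2(y, x)            # azimuth angle
    pitch = np.arcsin(z / depth)      # elevation angle
    fov_up, fov_down = np.deg2rad(fov_up_deg), np.deg2rad(fov_down_deg)

    # Map azimuth to image columns and elevation to image rows.
    u = np.clip(np.floor(0.5 * (1.0 - yaw / np.pi) * width),
                0, width - 1).astype(np.int32)
    v = np.clip(np.floor((fov_up - pitch) / (fov_up - fov_down) * height),
                0, height - 1).astype(np.int32)

    image = np.zeros((height, width), dtype=np.uint8)
    xyz_map = np.zeros((height, width, 3), dtype=np.float32)
    # Scale intensity to [0, 255]; real sensors may need percentile or gamma scaling.
    image[v, u] = np.clip(intensities / (intensities.max() + 1e-6) * 255.0,
                          0, 255).astype(np.uint8)
    xyz_map[v, u] = points
    return image, xyz_map
```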
We then extract Oriented FAST and Rotated BRIEF (ORB) feature descriptors from the intensity image. The extracted descriptors are converted into a bag-of-words (BoW) vector, which serves as a compact representation of the original point cloud. A DBoW database is built from these vectors and queried for place recognition.
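Our implementation relies on DBoW, a C++ library; as a simplified stand-in, the sketch below extracts ORB descriptors with OpenCV, quantizes them against an assumed pre-trained vocabulary of binary words, and scores places by the cosine similarity of their BoW histograms. The vocabulary, helper names, and acceptance threshold are illustrative assumptions only.

```python
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=500)

def bow_vector(image, vocabulary):
    """Extract ORB features from an intensity image and quantize them into a
    normalized bag-of-words histogram. `vocabulary` is an assumed pre-trained
    (K, 32) uint8 array holding one binary ORB word per row."""
    keypoints, descriptors = orb.detectAndCompute(image, None)
    if descriptors is None:
        return keypoints, None, np.zeros(len(vocabulary))
    # Hamming distance from every descriptor to every word (fine for small K).
    dists = np.count_nonzero(
        np.unpackbits(descriptors[:, None, :] ^ vocabulary[None, :, :], axis=2),
        axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(np.float64)
    hist /= np.linalg.norm(hist) + 1e-12
    return keypoints, descriptors, hist

def query_database(hist, db_hists, min_score=0.3):
    """Return the index of the most similar stored place, or None if the best
    cosine similarity falls below an assumed acceptance threshold."""
    if not db_hists:
        return None
    scores = [float(np.dot(hist, h)) for h in db_hists]
    best = int(np.argmax(scores))
    return best if scores[best] > min_score else None
```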
If a candidate is found, we match the ORB descriptors to ensure that enough features can be matched between the two places. To reject matching outliers, we formulate the verification as an optimization problem by applying Perspective-n-Point (PnP) with Random Sample Consensus (RANSAC).
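This geometric check might be sketched as follows: the 3D position of each matched feature in the candidate scan is recovered from the lidar point behind its pixel, and cv2.solvePnPRansac discards matches whose reprojection error into the query image is too large. The intrinsic matrix K, standing in for whatever model relates 3D points to intensity-image pixels, and the inlier threshold are assumptions for the example.

```python
import cv2
import numpy as np

def verify_candidate(des_query, des_cand, kp_query, xyz_cand, K, min_inliers=25):
    """Match ORB descriptors between the query and the candidate place, then
    reject outliers with PnP RANSAC.

    des_query, des_cand : binary ORB descriptors of the two intensity images
    kp_query            : keypoints of the query image (2D pixel positions)
    xyz_cand            : (M, 3) lidar points behind the candidate's keypoints,
                          indexed like des_cand
    K                   : assumed 3x3 intrinsic matrix of the projection model
    """
    if des_query is None or des_cand is None:
        return False, None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_cand, des_query)
    if len(matches) < min_inliers:
        return False, None

    object_pts = np.float32([xyz_cand[m.queryIdx] for m in matches])
    image_pts = np.float32([kp_query[m.trainIdx].pt for m in matches])

    # Minimize the reprojection error of the candidate's 3D feature positions
    # against their 2D correspondences in the query image.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_pts, image_pts, K, distCoeffs=None,
        iterationsCount=100, reprojectionError=5.0)
    if not ok or inliers is None or len(inliers) < min_inliers:
        return False, None
    return True, (rvec, tvec)
```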
A representative example of our method is shown in Figure 1. The main contributions of our work, which combines techniques from both camera-based and lidar-based place recognition methods, are as follows:
• Real-time, robust place recognition designed for imaging lidar and, to our knowledge, the first to use projected lidar intensity images for place recognition.
• The proposed method is invariant to sensor attitude changes and can detect reverse revisiting and even upside-down revisiting.
• Our method is extensively validated with data gathered
across different scales, platforms, and environments.
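For concreteness, the hypothetical helpers sketched above could be chained into a single per-scan loop-closure check as follows; the place database is kept as plain Python lists purely for illustration.

```python
import numpy as np

def process_scan(points, intensities, vocabulary, K, db_hists, db_entries):
    """Check one incoming scan for a loop closure, then add it to the database.

    db_hists   : list of BoW histograms of previously visited places
    db_entries : list of (descriptors, keypoint_xyz) tuples for those places
    Returns (matched_place_index, (rvec, tvec)) on success, otherwise None.
    """
    image, xyz_map = project_to_intensity_image(points, intensities)
    keypoints, descriptors, hist = bow_vector(image, vocabulary)

    loop = None
    match_idx = query_database(hist, db_hists)
    if match_idx is not None:
        cand_des, cand_xyz = db_entries[match_idx]
        ok, pose = verify_candidate(descriptors, cand_des, keypoints, cand_xyz, K)
        if ok:
            loop = (match_idx, pose)

    # Store this place so future scans can match against it.
    kp_xyz = np.float32([xyz_map[int(kp.pt[1]), int(kp.pt[0])] for kp in keypoints])
    db_hists.append(hist)
    db_entries.append((descriptors, kp_xyz))
    return loop
```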
II. RELATED WORK
Our work draws upon concepts used in both camera-based
and lidar-based place recognition methods. Due to their low hardware cost and robustness in texture-rich