Obstacle Detection for Self-Driving Cars Using Only Monocular
Cameras and Wheel Odometry

Christian Häne¹, Torsten Sattler¹, and Marc Pollefeys¹
Abstract— Mapping the environment is crucial to enable path
planning and obstacle avoidance for self-driving vehicles and
other robots. In this paper, we concentrate on ground-based
vehicles and present an approach that extracts static obstacles
from depth maps computed from multiple consecutive images.
In contrast to existing approaches, our system does not require
accurate visual inertial odometry estimation but solely relies on
the readily available wheel odometry. To handle the resulting
higher pose uncertainty, our system fuses obstacle detections
over time and between cameras to estimate the free and
occupied space around the vehicle. Using monocular fisheye
cameras, we are able to cover a wider field of view and detect
obstacles closer to the car, which are often not within the
standard field of view of a classical binocular stereo camera
setup. Our quantitative analysis shows that our system is
accurate enough for navigation purposes of self-driving cars
and runs in real-time.
I. INTRODUCTION
Reliably and accurately detecting obstacles is one of the
core problems that need to be solved to enable autonomous
navigation for robots and vehicles. For many use cases such
as micro aerial vehicles (MAVs) or self-driving cars, obstacle
detection approaches need to run in (near) real-time so that
evasive actions can be performed. At the same time, solutions
to the obstacle detection problem are often restricted by the
type of vehicle and the available resources. For example, a
MAV has restricted computational capabilities and can carry
only a certain payload while car manufacturers are interested
in using sensors already built into series vehicles in order to
keep self-driving cars affordable.
There essentially exist two approaches for obstacle de-
tection. Active methods use sensors such as laser scanners,
time-of-flight, structured light or ultrasound to search for
obstacles. In contrast, passive methods try to detect obstacles
based on passive measurements of the scene, e.g., in camera
images. They have the advantage that they work over a
wide range of weather and lighting conditions, offer a high
resolution, and are cheap. In addition, a
wide field of view can be covered using, for example, fisheye
cameras. In this paper, we therefore present an obstacle
detection system for static objects based on camera images.
We use stereo vision techniques [1], [2], [3] to obtain a
3D model of the scene. Our main motivation is to enable
self-driving cars to detect static objects such as parked cars
and signposts, determine the amount of free space around
them, and measure the distance between obstacles, e.g., to
*This work has been supported by the EU's 7th Framework Programme
(FP7/2007-2013) under grant #269916 (V-Charge).
¹Department of Computer Science, ETH Zürich, Switzerland {chaene,
sattlert, pomarc}@inf.ethz.ch
determine the size of an empty parking spot. We therefore
detect obstacles as objects obtruding from the ground [4].
Most existing stereo vision-based techniques rely on classical
forward facing binocular stereo cameras with a relatively
narrow field of view and visual (inertial) odometry (VIO)
systems to provide accurate vehicle poses. These systems
are mainly targeted at detecting objects in front of the car
and are therefore used in standard on-road forward-driving
situations. For many maneuvers, such as parking into
a parking spot or navigating in a narrow parking garage, a
full surround view is very important. We show that for such
situations accurate obstacle detections can be obtained from
a system that uses only monocular fisheye cameras and the
less accurate poses provided by the vehicle's wheel odometry,
if the noisy individual detections are properly fused over
time. The resulting system does not require a complex VIO
system but simply exploits information that is already available,
and it runs in real-time on our test vehicle. We thus avoid
any unnecessary delay a VIO system might introduce.
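The temporal fusion of noisy per-frame detections into an estimate of free and occupied space can be sketched as a standard log-odds occupancy grid update. This is a minimal illustration, not the authors' exact formulation; the grid size, resolution, sensor model probabilities, and all function names are assumptions:

```python
import numpy as np

# Inverse sensor model (illustrative values): probability a cell is
# occupied given a detection (hit) or given it was observed as free (miss).
P_HIT, P_MISS = 0.7, 0.4
L_HIT = np.log(P_HIT / (1 - P_HIT))     # log-odds increment for a detection
L_MISS = np.log(P_MISS / (1 - P_MISS))  # log-odds decrement for free space

class OccupancyGrid2D:
    """2D log-odds occupancy grid centered on the world origin."""

    def __init__(self, size=200, resolution=0.1):
        self.res = resolution             # meters per cell
        self.log_odds = np.zeros((size, size))
        self.origin = size // 2           # grid center = world origin

    def world_to_cell(self, x, y):
        return (int(round(x / self.res)) + self.origin,
                int(round(y / self.res)) + self.origin)

    def update(self, hits, misses):
        """Fuse one frame: 'hits' are detected obstacle points,
        'misses' are points observed as free space (world coordinates)."""
        for x, y in hits:
            i, j = self.world_to_cell(x, y)
            self.log_odds[i, j] += L_HIT
        for x, y in misses:
            i, j = self.world_to_cell(x, y)
            self.log_odds[i, j] += L_MISS

    def occupancy(self, x, y):
        """Posterior occupancy probability of the cell containing (x, y)."""
        i, j = self.world_to_cell(x, y)
        return 1.0 / (1.0 + np.exp(-self.log_odds[i, j]))

# Repeated noisy detections of the same obstacle drive its probability
# toward 1, while repeatedly observed free space is driven toward 0,
# even though each individual measurement is unreliable.
grid = OccupancyGrid2D()
for _ in range(5):
    grid.update(hits=[(2.0, 1.0)], misses=[(1.0, 0.5)])
```

Because the per-cell updates are additive in log-odds space, pose errors from the wheel odometry merely blur the accumulated evidence rather than invalidate it, which is why fusing many noisy frames can compensate for less accurate poses.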
This paper makes the following contributions: We describe
the overall obstacle detection system and explain each part in
detail, highlighting the rationale behind our design decisions.
We demonstrate experimentally that highly precise vehicle
poses are not required for accurate obstacle detection and
show that a proper fusion of individual measurements can
compensate for pose errors. Self-driving cars are currently a
very active field of research and we believe that the proposed
system and our results will be of interest to many researchers
working in this field. To our knowledge, ours
is the first system that uses monocular fisheye cameras and
only relies on the wheel odometry.
The paper is structured as follows: The remainder of this
section discusses related work. Sec. II provides an overview
over both our vehicle setup and our obstacle detection
system. Sec. III explains the computation of the depth maps.
Obstacle extraction is described in Sec. IV, while Sec. V
details how to fuse detections from multiple depth maps.
Sec. VI experimentally evaluates the proposed method.
A. Related Work
In contrast to motion stereo systems, active sensors such as
lidar [5] and binocular stereo cameras [6] can provide a depth
map of the environment at any time, even when the vehicle
is not moving. Stereo cameras offer the advantage of being
cheap to produce while providing high-quality measurements
in real-time [6]. Thus, many obstacle detection systems rely
on a stereo setup [7], [8], [9]. Obstacles are usually detected
in an occupancy grid [10], [11], a digital elevation map [12] or