A Census-Based Stereo Vision Algorithm Using Modified Semi-Global Matching
and Plane Fitting to Improve Matching Quality∗
Martin Humenberger, Tobias Engelke, Wilfried Kubinger
AIT Austrian Institute of Technology
Donau-City-Strasse 1, 1220 Vienna, Austria
martin.humenberger@ait.ac.at, tobias.engelke@ait.ac.at, wilfried.kubinger@ait.ac.at
Abstract
This paper introduces a new segmentation-based approach for disparity
optimization in stereo vision. The main contribution is a significant
enhancement of the matching quality at occlusions and in textureless
areas by segmenting either the left color image or the calculated
texture image. The local cost calculation is done with a Census-based
correlation method and is compared with the standard sum of absolute
differences. The confidence of a match is measured, and only
non-confident or non-textured pixels are estimated by calculating a
disparity plane for the corresponding segment. The quality of the
locally optimized matches is increased by a modified Semi-Global
Matching (SGM) step with subpixel accuracy. In contrast to standard
SGM, the disparity optimization is applied not to the whole image but
to horizontal stripes of the image. It is shown that this modification
significantly reduces the memory consumption at nearly constant
matching quality and thus enables an embedded realization. Using the
Middlebury ranking as the evaluation criterion, it is shown that the
proposed algorithm performs well in comparison to the pure Census
correlation. It reaches a top-ten rank if subpixel accuracy is
considered. Furthermore, the matching quality of the algorithm,
especially of the texture-based plane fitting, is shown on two
real-world scenes where a significant enhancement could be achieved.
1. Introduction
3D perception of the environment surrounding a robot platform or an
autonomous vehicle is essential for reliable operation. Common sensors
are based on laser, radar, or time-of-flight. These techniques enable
high-quality 3D perception with the drawback of low resolution and
high costs. For a number of robot applications, such as people or
scene recognition as well as robot navigation, digital cameras are
used. Stereo vision is a technology that uses two digital cameras
mounted in parallel to determine the depth of a scene. Its advantages
are the low price, the high resolution, and the fact that the images
can also be used for any other application. It is also well suited to
home applications because it is a purely passive technology and thus
does not affect the surrounding environment.
∗This work has been supported by the European Union project
ROBOTS@HOME under grant FP6-2006-IST-6-045350.
For depth calculation, the so-called correspondence problem (stereo
matching), which is the search for corresponding projections of the
same scene point onto both camera planes, has to be solved. The
horizontal displacement of corresponding pixels is denoted as
disparity. Area-based stereo matching algorithms try to calculate the
complete disparity map, which is an image of the same size as the
camera images that holds the disparity instead of the intensity value
for each pixel. The advantage is that a huge number of surrounding 3D
points can be determined with a single capture. The matching process
is based on a similarity comparison of image areas (correlation);
thus, textureless areas are a difficult challenge. Pixels visible in
only one of the images are called occlusions and obviously cannot be
found by correlation.
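As a brief illustration of how a disparity value translates into depth, the following sketch uses the standard pinhole relation Z = f * B / d for a rectified camera pair with parallel optical axes. This relation is not stated in the text above and is added here as an assumption; the function name, parameter names, and the numeric values are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for one pixel of a rectified stereo pair.

    Assumes the pinhole relation Z = f * B / d, with the focal length f
    in pixels, the baseline B in meters, and the disparity d in pixels.
    Returns None when the disparity is not positive (no valid match).
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


# Hypothetical setup: 400 px focal length, 0.5 m baseline.
for d in (8, 16, 32):
    print(d, depth_from_disparity(d, 400.0, 0.5))
```

Under this assumption, doubling the disparity halves the depth, so nearby points produce large disparities and distant points produce small ones.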
In general, area-based matching algorithms calculate the costs for
each matching candidate and optimize them afterwards to find the
correct matches. Once the local costs are calculated, a minimum search
(winner takes all, WTA) can be used to find the best matching pixels.
Another strategy is to apply global optimization to the local costs to
enhance the probability of correct matching. Here, not only the
pixels’ neighborhoods are used to calculate the costs, but the whole
scanline or even the whole image. With these techniques, better
results can be achieved, especially in textureless areas. The drawback
of globally optimizing algorithms is the huge processing time and
memory consumption. To the authors’ knowledge, no implementation of a
global optimization is commercially available for purely embedded
real-time platforms without dedicated hardware such as field
programmable gate arrays (FPGAs).
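The local strategy described above (cost calculation followed by a WTA minimum search) can be sketched on a single scanline as follows. This is a minimal illustration, not the implementation evaluated in this paper: it uses the sum of absolute differences as the local cost, assumes rectified images in which a left pixel at column x matches the right pixel at column x - d, and all function names and test values are hypothetical.

```python
def sad_cost(left, right, x, d, radius):
    """Sum of absolute differences between the window centered at x in
    the left scanline and the window centered at x - d in the right one."""
    total = 0
    for k in range(-radius, radius + 1):
        total += abs(left[x + k] - right[x + k - d])
    return total


def wta_disparity(left, right, max_disp, radius=1):
    """Winner-takes-all disparity per pixel of one scanline.

    Pixels whose window or search range would leave the scanline are
    marked with None.
    """
    width = len(left)
    disparities = [None] * width
    for x in range(radius + max_disp, width - radius):
        costs = [sad_cost(left, right, x, d, radius)
                 for d in range(max_disp + 1)]
        # Minimum search; on ties the smallest disparity is chosen.
        disparities[x] = costs.index(min(costs))
    return disparities


# Hypothetical scanlines: the left one is the right one shifted by 2 px.
right_line = [0, 0, 10, 50, 90, 20, 0, 0, 0, 0, 0, 0]
left_line = [0, 0, 0, 0, 10, 50, 90, 20, 0, 0, 0, 0]
print(wta_disparity(left_line, right_line, max_disp=3))
```

In the textured part of this example the minimum is unique and the shift of 2 pixels is recovered. In the flat part at the end of the scanline several candidates have the same cost, so the minimum search alone cannot decide, which is the textureless-area problem mentioned above.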
978-1-4244-7028-0/10/$26.00 ©2010 IEEE