Efficient High-Resolution Stereo Matching using Local Plane Sweeps
Sudipta N. Sinha
Microsoft Research
sudipsin@microsoft.com
Daniel Scharstein
Middlebury College
schar@middlebury.edu
Richard Szeliski
Microsoft Research
szeliski@microsoft.com
Abstract
We present a stereo algorithm designed for speed and
efficiency that uses local slanted plane sweeps to pro-
pose disparity hypotheses for a semi-global matching al-
gorithm. Our local plane hypotheses are derived from ini-
tial sparse feature correspondences followed by an itera-
tive clustering step. Local plane sweeps are then performed
around each slanted plane to produce out-of-plane paral-
lax and matching-cost estimates. A final global optimiza-
tion stage, implemented using semi-global matching, as-
signs each pixel to one of the local plane hypotheses. By
only exploring a small fraction of the whole disparity space
volume, our technique achieves significant speedups over
previous algorithms while attaining state-of-the-art accuracy
on high-resolution stereo pairs of up to 19 megapixels.
1. Introduction
As imaging and processor systems continue to increase
in resolution and power, the need for more efficient stereo
matching algorithms is becoming more acute. Increasing
the image resolution not only increases the number of pix-
els that must be processed, it also increases the number
of disparity levels that must be considered. For example,
the full-size 2005 Middlebury stereo pairs [15] average 1.4
megapixels (MP) and have a disparity range of 200 pixels;
the recent Disney/ETH datasets [12] are as large as 19 MP
with disparity ranges up to 1000 pixels.
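To make this scaling concrete, the following back-of-the-envelope sketch counts the matching-cost evaluations that a full disparity space image would require for the two dataset sizes quoted above. The function name `dsi_cells` is hypothetical, and the count assumes one cost per pixel per integer disparity level, ignoring any subpixel sampling.

```python
def dsi_cells(megapixels: float, disparity_range: int) -> int:
    """Number of (pixel, disparity) cells in a full DSI, assuming
    one cell per pixel per integer disparity level."""
    return round(megapixels * 1_000_000) * disparity_range


# Figures quoted in the text: 1.4 MP with 200 disparity levels,
# and 19 MP with up to 1000 disparity levels.
middlebury_2005 = dsi_cells(1.4, 200)
disney_eth = dsi_cells(19, 1000)

# Ratio between the two full-search volumes.
growth = disney_eth / middlebury_2005
print(f"{middlebury_2005:.2e} cells vs {disney_eth:.2e} cells ({growth:.0f}x)")
```

Under these assumptions, the full search volume grows by a factor of roughly 68 while the pixel count grows by a factor of less than 14, because the disparity range scales with resolution as well.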
While great advances have been made in the last decade,
most algorithms (with the exception of seed-and-grow and
“PatchMatch” approaches, which we discuss in the next
section) still evaluate the complete disparity space image
(DSI), either explicitly, or by doing a local correspondence
search over the full range of disparities.
Figure 1. A disparity map computed in 15 seconds by our method
for the 11-megapixel Couch stereo pair selected from the multi-view
datasets released by [12] (best seen in color). More than
98.5% of pixels agree to within 2.0 disparities with their result,
which was computed from 100 densely-spaced input images.
In this paper, we avoid this full search by using sparse
feature correspondences to propose local planes along which
we perform small-disparity plane sweeps. This has the advantage
of handling highly slanted surfaces without requiring
many disparity hypotheses and without any bias toward
fronto-parallel orientations. The local plane sweep not only
performs subpixel registration, it also deals gracefully with
curved surfaces, which a single plane would fail to model.
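The idea of sweeping a small disparity range around a slanted plane can be sketched as follows. This is an illustrative toy and not the authors' implementation: the plane parameterization d(x, y) = a*x + b*y + c, the function names, and the offset range and step are all assumptions made for the example.

```python
def plane_disparity(plane, x, y):
    """Disparity of a slanted plane d(x, y) = a*x + b*y + c at pixel (x, y)."""
    a, b, c = plane
    return a * x + b * y + c


def local_sweep_disparities(plane, x, y, max_offset=2.0, step=0.5):
    """Candidate disparities at (x, y): the plane's disparity plus a small
    set of out-of-plane offsets in [-max_offset, +max_offset]."""
    base = plane_disparity(plane, x, y)
    n = int(round(max_offset / step))
    return [base + k * step for k in range(-n, n + 1)]


# Hypothetical plane whose disparity rises by 0.25 per column.
plane = (0.25, 0.0, 40.0)
near = local_sweep_disparities(plane, x=0, y=0)    # centered on 40.0
far = local_sweep_disparities(plane, x=400, y=0)   # centered on 140.0
```

In this toy setting each pixel receives only nine candidates close to its plane, even though the plane itself spans 100 disparities across 400 columns. A fronto-parallel sweep would have to cover that whole interval to model the same surface.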
The key to our approach is its ability to efficiently pro-
pose and evaluate local plane-sweep hypotheses indepen-
dently. Our algorithm is also able to propagate promising
hypotheses into adjoining image regions that have not yet
been adequately modeled. We merge the candidate surfaces
from the local plane sweeps in a final optimization step us-
ing a variant of semi-global matching [10]. The resulting
algorithm exhibits high efficiency since it only evaluates
a small fraction of potential disparity hypotheses. It also
maintains high accuracy due to its subpixel registration and
edge-aware global optimization components. We present an
experimental comparison with several state-of-the-art tech-
niques that demonstrates the superior accuracy and high ef-
ficiency of our approach on 20 high-resolution stereo pairs,
including seven new 5-6 MP datasets with ground truth.
2014 IEEE Conference on Computer Vision and Pattern Recognition
1063-6919/14 $31.00 © 2014 IEEE
DOI 10.1109/CVPR.2014.205