Cross-trees, edge and superpixel priors-based cost aggregation
for stereo matching
Feiyang Cheng
a
, Hong Zhang
a
, Mingui Sun
b
, Ding Yuan
a,
n
a
Image Research Center, Beihang University, Xueyuan Rd., Haiidan Dist., 100191, Beijing, China
b
Department of Neurosurgery, University of Pittsburgh, Suite B-400, 200 Lothrop Street, PA15213, Pittsburgh, USA
article info
Article history:
Received 9 April 2014
Received in revised form
6 November 2014
Accepted 5 January 2015
Available online 14 January 2015
Keywords:
Stereo matching
Cost aggregation
Image filtering
Spanning trees
abstract
In this paper, we propose a novel cross-trees structure to perform the non-local cost aggregation strategy,
and the cross-trees structure consists of a horizontal-tree and a vertical-tree. Compared to other spanning
trees, the significant superiorities of the cross-trees are that the trees' constructions are efficient and the
trees are exactly unique since the constructions are independent on any local or global property of the
image itself. Additionally, two different priors: edge prior and superpixel prior, are proposed to tackle the
false cost aggregations which cross the depth boundaries. Hence, our method contains two different
algorithms in terms of cross-treesþprior. By traversing the two crossed trees successively, a fast non-local
cost aggregation algorithm is performed twice to compute the aggregated cost volume. Performance
evaluation on the 27 Middlebury data sets shows that both our algorithms outperform the other two tree-
based non-local methods, namely minimum spanning tree (MST) and segment-tree (ST).
& 2015 Elsevier Ltd. All rights reserved.
1. Introduction
Dense two-frame stereo matching has been extensively inves-
tigated for decades as a traditional low-level vision task, since it is
crucial for many applications such as 3D reconstruction [1,2],
image-based rendering [3,4] and anonymous driving [5]. Accord-
ing to the analysis and taxonomy scheme proposed in [6], stereo
matching algorithms can be categorized into two groups: local
algorithms and global algorithms. Stereo matching algorithms are
often implemented following a subset of the four steps or all:
1. Cost function/cost volume estimation.
2. Cost aggregation within a support region.
3. Disparity computation/optimization.
4. Disparity refinement.
Global algorithms usually make explicit smoothness assumptions,
and minimize a predefined energy function to obtain optimal results
[7–9]. Despite the reliable matching results obtained, global algo-
rithms are often time-consuming. All local algorithms compute the
matching cost (step 1) firs tl y and then perform the cost aggregation
(step 2) to get a locally optimized cost volume [6,10–14].Wemainly
focus on efficient and effectiv e local and non-local methods in this
paper, and the readers are referr ed to a recent study for a compre-
hensiv e study of the global methods [1 5].
To find a correspondence (x, x’), the problem of the local
methods can be concluded as a comparison of the similarity of
two local patches which around x and x' respectively. The similarity
of the two patches is computed by aggregating the costs of the
pixels within the patches. Hence, the cost aggregation (step 2)
procedure has important impacts on the accuracy and the efficiency
of a local algorithm. The cost aggregation of a pixel in traditional
local algorithms is usually performed by averaging the costs of the
pixel itself and all its neighboring pixels. Here, the implicit assump-
tion is that all the pixels which lie in a special local support region
have similar disparities, as shown in Fig. 1(a). Such local methods
suffer from well known ”edge fatten” effect once the local support
regions cover the depth boundaries. The problem can be explained
in the context of image filtering. For instance, the box filter always
blurs the edges of an image during the image denoising procedure.
Hence, the problem of the cost aggregation step is how to choose
optimal local support regions for each pixel. Various researches
have been conducted to estimate optimal support regions for
the cost aggregation, such as various window-based methods
[10,11] and adaptive support weights (ASW) methods (also known
as local filtering-based methods) [12–14] which have state of the
art performance in the last years. However, the selected support
Contents lists available at ScienceDirect
journal homepage: www.elsevier.com/locate/pr
Pattern Recognition
http://dx.doi.org/10.1016/j.patcog.2015.01.002
0031-3203/& 2015 Elsevier Ltd. All rights reserved.
n
Corresponding author. Tel.:/fax: þ86 10 82338991.
E-mail addresses: chfybuaa@gmail.com (F. Cheng),
dmrzhang@buaa.edu.cn (H. Zhang), drsun@pitt.edu (M. Sun),
dyuan@buaa.edu.cn (D. Yuan).
Pattern Recognition 48 (2015) 2269–2278