ISSN 1054-6618, Pattern Recognition and Image Analysis, 2016, Vol. 26, No. 4, pp. 726–733. © Pleiades Publishing, Ltd., 2016.
Effective Energy-based Multi-view Piecewise Planar Stereo¹
Yiran Wang(a), Wei Wang(a), Hai Zhu(a), and Shi Dong(b,c,*)
(a) School of Network Engineering, Zhoukou Normal University, Zhoukou 466000, China
(b) School of Computer Science and Technology, Zhoukou Normal University, Zhoukou 466000, China
(c) Department of Computer Science and Engineering, Washington University in St. Louis, Saint Louis, MO 63130
*e-mail: njbsok@gmail.com
Abstract– In piecewise planar scene modeling, several challenging issues persist, in particular how to
generate sufficient candidate planes and how to assign an optimal plane to each spatial patch. To address
these issues, we present a novel multi-view piecewise planar stereo method for complete reconstruction.
In our method, reconstruction is formulated as an energy-based plane labeling problem in which
photo-consistency and geometric constraints are incorporated into a unified superpixel-level MRF (Markov
Random Field) framework. To improve the efficacy of plane inference and optimization, an effective
multi-direction plane sweep over a heavily restricted search space is carried out to generate sufficient and
reliable candidate planes. Experiments show that our method can effectively handle many challenging factors
(e.g., slanted surfaces and textureless regions) and achieve satisfactory results.
Keywords: plane fitting, multi-view stereo, energy optimization, piecewise planar stereo, depth map
DOI: 10.1134/S1054661816040209
1. INTRODUCTION
The piecewise planar model (PPM) is widely used
for the reconstruction of urban scenes,
where the higher-order planarity prior significantly
helps to overcome several difficulties that
traditional pixel-level stereo methods handle poorly,
e.g., textureless regions and unavoidable occlusions.
Although many excellent algorithms have been
proposed in recent years, automatically reconstructing
a scene with complex structure, completely
and accurately, remains an open problem.
Most existing piecewise planar methods either
infer spatial planes along a very restricted number
of directions (e.g., the Manhattan-world model) or
rely heavily on the number of initial 3D points. As a
result, they often produce only geometrically simplistic
models or fail to recover complex scene structures.
To address these problems, we propose a novel
energy-based multi-view piecewise planar stereo
method for the complete reconstruction of the scene,
which incorporates cross-view photometric and
geometric consistency under a superpixel-level MRF
framework. By virtue of these strong constraints and
the expressive and inferential power of the plane-based
formulation, our method can effectively
recover the complete structures of the scene.
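As a rough illustration of what an energy-based plane labeling over superpixels typically looks like, the generic pairwise MRF form is sketched below. The symbols are assumptions introduced only for this sketch and are not taken from the formulation in this paper: S is the set of superpixels, N the set of neighboring superpixel pairs, f_s the candidate plane assigned to superpixel s, D_s a data cost (e.g., a photo-consistency cost), V_{st} a pairwise regularization cost, and \lambda a weight balancing the two.

```latex
% Generic pairwise MRF labeling energy (illustrative notation only)
E(f) = \sum_{s \in S} D_s(f_s)
     + \lambda \sum_{(s,t) \in N} V_{st}(f_s, f_t),
\qquad
f^{*} = \arg\min_{f} E(f)
```

Under this generic form, the reconstruction is the labeling f^{*} of minimum energy, and each label f_s can only be drawn from the candidate plane set.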
The main contributions of our work can be summarized
as follows: (1) a natural integration of pixel-level and
superpixel-level multi-view stereo under a unified
MRF framework for the complete reconstruction of
urban scenes; (2) a novel and robust energy formulation
that incorporates a variety of constraints, such as
photo-consistency, an occlusion penalty, and geometric
relations between planes; (3) an effective multi-direction
plane sweeping method with a restricted search space,
which generates sufficient and reliable candidate planes
for inferring the planes corresponding to textureless regions.
¹ The article is published in the original.
2. RELATED WORK
Most existing segmentation-based methods [1, 2]
start from a set of candidate planes generated from
reconstructed sparse or quasi-dense data (e.g., 3D
points) and then use a global optimization
method (e.g., Graph Cuts [3]) to infer the spatial
planes in poorly textured regions, where the
reconstructed points are either few or of very low quality.
In these methods, image over-segmentation is usually
adopted because pixels of similar
appearance are more likely to belong to the same
plane; the spatial patch associated with an image
segment (i.e., superpixel) is then modeled as a plane.
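One common way to obtain a candidate plane from a group of reconstructed 3D points is a least-squares fit. The sketch below is only an illustration of that general idea, not the procedure used in [1, 2] or in this paper; the function names `fit_plane` and `point_plane_distances` are hypothetical, and the points are assumed to be given as an (N, 3) NumPy array.

```python
import numpy as np


def fit_plane(points):
    """Least-squares plane through an (N, 3) array of 3D points.

    Returns (normal, d) with a unit normal such that normal . x + d = 0
    for points x on the plane. The sign of the normal is arbitrary.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance of the centered points: the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    d = -normal.dot(centroid)
    return normal, d


def point_plane_distances(points, normal, d):
    """Unsigned distance of each point to the plane (normal, d)."""
    return np.abs(np.asarray(points, dtype=float) @ normal + d)


# Four points lying exactly on the plane z = 2.
pts = np.array([[0, 0, 2], [1, 0, 2], [0, 1, 2], [1, 1, 2]], dtype=float)
normal, d = fit_plane(pts)
residuals = point_plane_distances(pts, normal, d)
```

In practice such a fit is usually wrapped in a robust scheme (e.g., RANSAC) so that outlying points do not bias the plane; that step is omitted here for brevity.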
In practice, such methods depend heavily on the
candidate planes used, because a true plane that is
missing from the initial candidate set cannot be
recovered in the subsequent optimization.
Moreover, in two-view dense reconstruction, some
pixel- or region-matching measures may
remain ambiguous because fewer observations are available.
By comparison, multi-view piecewise planar stereo
methods can be effective for the complete
reconstruction of complex scenes because more
observations and constraints are available. In general, in order
Received May 20, 2015
REPRESENTATION, PROCESSING, ANALYSIS,
AND UNDERSTANDING OF IMAGES