H. Javidnia, P. Corcoran: Depth Map Post-Processing Approach
FIGURE 1. Overview of the adaptive random walk with restart.
not affected by illumination variation because of the gradient
and census transforms, its processing time is fast in comparison
with recently studied methods, it performs well in both outdoor
and indoor environments, and it gives us the option of obtaining
an estimate of the depth in low-texture scenes.
One important advantage of this algorithm, which convinced
us to employ it as part of our approach, is its good
performance on high-resolution images. A traditional way
to speed up stereo computation is to use image pyramids or
downsized images, which also reduces the disparity range. This
down-sampling in the disparity computation causes some
small objects to be missed. Full disparity resolution at
large distances is vital for long-range object detection. The
key point about the chosen algorithm is that the image does not
need to be down-sampled to speed up the method.
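The effect of down-sampling on long-range depth can be illustrated with a small sketch. It assumes the standard rectified pinhole stereo relation Z = f·B/d, which is not stated in this section, and the focal length and baseline used below are hypothetical values chosen only for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d (assumed model)."""
    return focal_px * baseline_m / disparity_px


def depth_step(focal_px, baseline_m, disparity_px):
    """Depth jump when the disparity drops by one pixel (one quantization step)."""
    far = depth_from_disparity(focal_px, baseline_m, disparity_px - 1)
    near = depth_from_disparity(focal_px, baseline_m, disparity_px)
    return far - near


# Hypothetical rig: f = 1000 px at full resolution, baseline 0.5 m.
# An object at 50 m has disparity 10 px at full resolution and 5 px at half.
step_full = depth_step(1000.0, 0.5, 10)  # full-resolution images
step_half = depth_step(500.0, 0.5, 5)    # images down-sampled by a factor of 2
```

Under these assumptions the depth step of a single disparity level at 50 m is more than twice as large after down-sampling, which is one way to see why distant or small objects are lost.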
The comparison of this method with several other methods
carried out in this paper showed that it gives acceptable depth
estimation on high-resolution images of 2864 × 1924 pixels.
Acceptable depth estimation refers to the fact that the
algorithm does not have the problem of estimating different
layers of depth within one object. It respects the depth layers
without conflict. This feature, along with the fast processing
time, makes the algorithm suitable for high-resolution
real-time applications. It also allows us to build a more
accurate filter, which is described later in the paper.
A. ALGORITHM DESIGN
The initial matching cost in ARWR is calculated pixel-wise
by employing the census transform and gradient image matching.
The census-based matching technique, or census transform, was
initially introduced by Zabih and Woodfill in 1994 [18]. It is a
non-parametric local transform that maps the intensity values
of the pixels within a square window to a bit string, thereby
capturing the image structure. In other words, it computes for
every pixel a binary string (census signature) by comparing
its grey value with the grey values in its neighborhood.
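The census signature described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the ARWR implementation: the window radius, the bit ordering, the edge-replication padding and the function names are all assumptions, and signatures are compared with the Hamming distance, as is usual for census-based matching.

```python
import numpy as np


def census_transform(img, radius=1):
    """Per-pixel census signature over a (2r+1) x (2r+1) window (r <= 3 fits in 64 bits)."""
    img = np.asarray(img, dtype=np.int64)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")  # assumption: replicate borders
    sig = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the centre pixel is not compared with itself
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            # Bit is 1 where the neighbour is darker than the centre pixel.
            bit = (neighbour < img).astype(np.uint64)
            sig = (sig << np.uint64(1)) | bit
    return sig


def hamming_distance(a, b):
    """Number of differing bits between two arrays of census signatures."""
    x = np.bitwise_xor(a, b)
    count = np.zeros(x.shape, dtype=np.uint8)
    while np.any(x):
        count += (x & np.uint64(1)).astype(np.uint8)
        x = x >> np.uint64(1)
    return count
```

Because only the ordering of grey values enters the signature, any monotonically increasing change in intensity (for example a gain and offset between the two cameras) leaves the signature unchanged.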
The census transform is robust to radiometric variations,
but noise in the local image structure is also encoded, because
the encoding is based on the intensity of the pixels. The encoded
noise introduces matching ambiguity, especially in areas with
repetitive or similar texture patterns.
To overcome this problem, gradient image matching is
employed as part of the local matching block in ARWR.
At this stage, gradient images are computed using 5 × 5
Sobel filters. The whole ARWR process is shown
in Fig. 1.
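A gradient-based matching cost of this kind can be sketched as follows. The 5 × 5 kernel is built as the outer product of a binomial smoothing vector and a derivative vector, which is one common construction of a 5 × 5 Sobel filter; the absolute-difference cost, the border handling and all function names are assumptions made for illustration rather than details taken from ARWR.

```python
import numpy as np

SMOOTH = np.array([1, 4, 6, 4, 1], dtype=np.float64)
DERIV = np.array([-1, -2, 0, 2, 1], dtype=np.float64)
SOBEL_X = np.outer(SMOOTH, DERIV)  # responds to horizontal intensity changes
SOBEL_Y = SOBEL_X.T                # responds to vertical intensity changes


def filter2d(img, kernel):
    """Cross-correlate img with kernel, replicating the image borders."""
    img = np.asarray(img, dtype=np.float64)
    kh, kw = kernel.shape
    h, w = img.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(img)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def gradient_images(img):
    """Horizontal and vertical 5 x 5 Sobel responses."""
    return filter2d(img, SOBEL_X), filter2d(img, SOBEL_Y)


def shift_right(a, d):
    """Shift columns right by d >= 0 so that column x holds a[:, x - d]."""
    if d == 0:
        return a.copy()
    out = np.empty_like(a)
    out[:, d:] = a[:, :-d]
    out[:, :d] = a[:, :1]  # replicate the first column into the gap
    return out


def gradient_cost(left, right, d):
    """Per-pixel cost |grad L(x) - grad R(x - d)| summed over both directions."""
    gxl, gyl = gradient_images(left)
    gxr, gyr = gradient_images(right)
    return (np.abs(gxl - shift_right(gxr, d)) +
            np.abs(gyl - shift_right(gyr, d)))
```

Since the kernel coefficients sum to zero, a constant brightness offset between the two images does not change the gradient images, so the cost at the correct disparity is unaffected by it.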
The green block in Fig. 1 shows the local matching block
including the transformation and matching parts.
The usual similarity criteria in stereo matching are
only strictly valid for surfaces with Lambertian (diffuse)
reflectance characteristics. Specular reflections are viewpoint
dependent and may cause large intensity differences at
corresponding image points. In the presence of specular reflection,
traditional stereo methods are often unable to establish any
correspondence, or the calculated disparity values tend to be
inaccurate.
In this case, using gradient image matching makes
the local matching method more robust on non-Lambertian
surfaces.
Noise variation can be critical to the performance of local
pixel-wise matching methods. For this reason, the SLIC
(Simple Linear Iterative Clustering) algorithm is employed in
ARWR (the blue block in Fig. 1). SLIC is one of the commonly
used super-pixel segmentation methods [19].
The local measurements in the matching block are more
robust to noise variation when super-pixels are treated
as the smallest units of the image to be matched to the target
image. Using super-pixels instead of pixels in the matching
also reduces the memory requirements of the whole
algorithm.
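One way to see both effects is to average a pixel-wise cost volume inside each super-pixel. The sketch below assumes that a label map is already available (for example from a SLIC implementation) and that costs are combined by a plain mean; both are illustrative assumptions, and the function name is hypothetical.

```python
import numpy as np


def superpixel_mean_cost(cost, labels):
    """Average a pixel-wise cost volume within each super-pixel.

    cost:   (H, W, D) array, one cost per pixel and disparity candidate.
    labels: (H, W) integer array of super-pixel ids in 0..K-1.
    Returns a (K, D) array, one cost vector per super-pixel.
    """
    cost = np.asarray(cost, dtype=np.float64)
    flat_labels = np.asarray(labels).ravel()
    k = int(flat_labels.max()) + 1
    flat_cost = cost.reshape(-1, cost.shape[-1])
    sums = np.zeros((k, cost.shape[-1]))
    np.add.at(sums, flat_labels, flat_cost)  # accumulate costs per label
    counts = np.bincount(flat_labels, minlength=k).astype(np.float64)
    return sums / np.maximum(counts, 1.0)[:, None]
```

The averaged volume has K × D entries instead of H × W × D, and averaging over many pixels damps independent per-pixel noise in the cost.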
At the last step of ARWR, shown as the pink
block in Fig. 1, the calculated matching cost is updated using
the RWR algorithm to determine the optimum disparity with
respect to occluded and discontinuity regions. The standard
VOLUME 4, 2016 5511