disparity of every image point should be determined to obtain a 2D disparity
or depth map. For example, if the resolution of each image is N × M, a brute
force approach for finding all correspondences needs (N × M)² computations.
To reduce the computational complexity of the stereo matching problem, the
3D geometry of a stereo vision system should be considered so that stereo
matching is restricted to certain image areas. To know the stereo geometry,
calibration of the stereo vision system with respect to a reference
coordinate system is needed. A direct calibration method is presented to
obtain the geometric relationship between the two cameras, which is
represented by a 3D Euclidean transformation, that is, a rotation and a
translation. From the stereo calibration, we know that stereo
correspondences are related by a 2D point-to-line projection, called
epipolar geometry.
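The point-to-line relation can be illustrated with a minimal sketch. It
assumes the calibration convention X_right = R X_left + t and normalized
image coordinates (camera intrinsics already removed); the values of R, t,
and the 3D point are hypothetical, and the helper names are chosen only for
this example. Under these assumptions, the essential matrix E = [t]x R maps
a left image point to its epipolar line in the right image.

```python
import math

def skew(t):
    """Cross-product matrix [t]x, so that [t]x v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def essential_matrix(R, t):
    """E = [t]x R for the assumed convention X_right = R X_left + t."""
    return matmul(skew(t), R)

def epipolar_line(E, x_left):
    """Line (a, b, c) in the right image, meaning a*x + b*y + c = 0."""
    return matvec(E, [x_left[0], x_left[1], 1.0])

def project(X):
    """Pinhole projection to normalized image coordinates."""
    return (X[0] / X[2], X[1] / X[2])

# Hypothetical calibration result: 5 degree rotation about the y axis
# and a small translation (illustrative values, not from the text).
angle = math.radians(5.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [-0.1, 0.02, 0.01]

X_left = [0.3, -0.2, 2.0]                       # 3D point, left camera frame
X_right = [r + ti for r, ti in zip(matvec(R, X_left), t)]

x_left = project(X_left)
x_right = project(X_right)
a, b, c = epipolar_line(essential_matrix(R, t), x_left)

# The conjugate point lies on the epipolar line, so this is close to zero.
residual = a * x_right[0] + b * x_right[1] + c
```

With R set to the identity and t along the horizontal axis only, the same
function returns a line with a = 0, that is, a horizontal line at the height
of the original point.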
The epipolar geometry between the stereo images provides a very strong
constraint for finding stereo correspondences. The epipolar constraint is
essential in stereo matching because an image point in one image has its
conjugate point on an epipolar line in the other image, and that epipolar
line is determined by the original point. The search for a correspondence
is thus reduced from the whole image to a single line, which greatly
reduces the computational complexity and time of stereo matching. If the
epipolar lines are parallel to the horizontal image axis, stereo matching
can be done efficiently since the correspondences between stereo images lie
only along the same horizontal line. For this reason, it is better to convert
nonparallel epipolar lines to parallel ones. In terms of stereo configuration,
this is the same as converting a general stereo configuration to the parallel
stereo configuration, which is called stereo rectification. Rectification
transforms the stereo images so that all epipolar lines become parallel to
the horizontal image axis. Therefore, in the rectified stereo images,
corresponding image points always lie on the same horizontal line. This
geometric property also greatly reduces computation time for stereo matching.
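To give a rough sense of the savings, the sketch below counts candidate
comparisons for an N × M image pair. It assumes that N is the image width,
that each comparison has unit cost, and that an optional disparity bound
max_disparity is known in advance; the bound and the function names are
assumptions made for illustration and do not come from the text.

```python
def brute_force_comparisons(n, m):
    """Every pixel of one image compared with every pixel of the other."""
    return (n * m) ** 2

def scanline_comparisons(n, m):
    """Rectified pair: each pixel compared only with the n pixels on its row."""
    return (n * m) * n

def bounded_disparity_comparisons(n, m, max_disparity):
    """Rectified pair with disparities limited to 0..max_disparity (assumed bound)."""
    return (n * m) * (max_disparity + 1)

# Hypothetical 640 x 480 image pair with a disparity bound of 64 pixels.
n, m = 640, 480
counts = {
    "brute force": brute_force_comparisons(n, m),
    "same row only": scanline_comparisons(n, m),
    "same row, bounded disparity": bounded_disparity_comparisons(n, m, 64),
}
for name, value in counts.items():
    print(f"{name}: {value:,}")
```

For these illustrative numbers, restricting the search to one row lowers
the count by a factor of M, and bounding the disparity lowers it further.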
Over the years, various stereo matching techniques have been introduced.
Most stereo matching techniques fall into two types: in terms of matching
cost and energy aggregation, they are categorized as either local or global
stereo matching techniques. Local stereo matching techniques use image
templates defined in both stereo images to measure their correlation
[14,15]. Common similarity measures are SAD (sum of absolute differences),
SSD (sum of squared differences), and NCC (normalized cross-correlation).
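The three measures can be written down in a few lines. The sketch below
assumes that each template has been flattened into a list of intensities of
equal length; the function names and sample values are illustrative. NCC is
given here in its plain form without mean subtraction; a zero-mean variant
also exists and is not shown.

```python
import math

def sad(a, b):
    """Sum of absolute differences: lower means more similar."""
    return sum(abs(p - q) for p, q in zip(a, b))

def ssd(a, b):
    """Sum of squared differences: lower means more similar."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def ncc(a, b):
    """Normalized cross-correlation: higher means more similar."""
    numerator = sum(p * q for p, q in zip(a, b))
    denominator = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return numerator / denominator if denominator > 0 else 0.0

# Two hypothetical 2 x 2 templates, flattened row by row.
left_template = [10, 20, 30, 40]
right_template = [12, 18, 33, 40]

print(sad(left_template, right_template))
print(ssd(left_template, right_template))
print(ncc(left_template, right_template))
```

SAD and SSD are costs to be minimized, while NCC is a score to be
maximized. NCC is unchanged when one template is multiplied by a constant
gain, which is one reason to prefer it when the two cameras differ in
exposure.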
In the template-based method, a cost function is defined based on the
similarity between two image templates, one from the left image and one
from the right image. Suppose an image template is defined in the left
image and many candidate templates are defined along an epipolar line in
the right image. Then the matching template in the right image is the one
that minimizes the matching cost. Local matching techniques are useful when
only some parts of an image are of interest for obtaining the depth map of
the area.
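The search described above can be sketched for a rectified pair, where the
epipolar line is the image row itself. The sketch assumes images stored as
nested lists of intensities, SAD as the matching cost, and the common
convention that a left pixel at column x appears at column x - d in the
right image with d >= 0. The function name, window size, and synthetic
images are illustrative only.

```python
def match_along_row(left, right, row, col, half, max_disparity):
    """Disparity in 0..max_disparity that minimizes the SAD cost.

    The reference template is the (2*half+1) square window centered at
    (row, col) in the left image; candidate templates are centered at
    (row, col - d) in the right image.
    """
    def window(img, r, c):
        return [img[i][j]
                for i in range(r - half, r + half + 1)
                for j in range(c - half, c + half + 1)]

    reference = window(left, row, col)
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        c = col - d
        if c - half < 0:          # candidate window would leave the image
            break
        candidate = window(right, row, c)
        cost = sum(abs(p - q) for p, q in zip(reference, candidate))
        if cost < best_cost:      # keep the first minimum found
            best_d, best_cost = d, cost
    return best_d

# Synthetic test pair: the right image is the left image shifted by
# true_d columns, so every left pixel has disparity true_d.
height, width, true_d = 5, 32, 4
left = [[(c * c + 3 * r) % 97 for c in range(width)] for r in range(height)]
right = [[left[r][c + true_d] if c + true_d < width else 0
          for c in range(width)] for r in range(height)]

found = match_along_row(left, right, row=2, col=15, half=1, max_disparity=8)
print(found)
```

Because only the requested pixel is processed, the same function can be
applied to a region of interest instead of the whole image, which is the
use case mentioned above for local methods.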
Most recent investigations in stereo vision, however, address global
error minimization. In global matching methods, a cost function is defined
in terms of image data and depth continuity [14]. The data term measures