purpose vision system must be aware that key object features may not be visible even
when an object is present and visible in the field of view. In situations where it is not
possible to take intelligent actions based on what is currently visible, the general
purpose system should automatically request the acquisition of image data from new
vantage points [32].
The visible-invariant surface characteristics that we have decided to use are the
Gaussian curvature (K) and the mean curvature (H), which are referred to collectively
as surface curvature. We abbreviate this term as S-curvature. When a surface region is
visible, its S-curvature is invariant to changes in surface parameterization and to
translations and rotations of object surfaces. In addition, mean curvature is an
extrinsic surface property whereas Gaussian curvature is intrinsic. These terms are
discussed later. Differential geometry emphasizes that these are quite reasonable
surface features to consider.
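As a concrete illustration of the two curvature quantities, the following minimal Python sketch estimates K and H at a point of a graph surface z = f(x, y) using the standard Monge-patch formulas and central finite differences. The function name `curvatures`, the step size `h`, and the sign convention for H (upward-pointing normal) are assumptions made for this example and are not taken from the text.

```python
import math

def curvatures(f, x, y, h=1e-3):
    """Estimate Gaussian (K) and mean (H) curvature of z = f(x, y) at (x, y).

    Assumption: derivatives are approximated by central differences with
    step h, and H is signed with respect to the upward-pointing normal.
    """
    # First partial derivatives.
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    # Second partial derivatives.
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / (h * h)
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / (h * h)
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)
    # Metric determinant of the graph surface: g = 1 + fx^2 + fy^2.
    g = 1.0 + fx * fx + fy * fy
    K = (fxx * fyy - fxy * fxy) / (g * g)
    H = ((1 + fy * fy) * fxx - 2 * fx * fy * fxy
         + (1 + fx * fx) * fyy) / (2 * g ** 1.5)
    return K, H

def sphere_cap(x, y, radius=2.0):
    """Upper hemisphere of a sphere centered at the origin."""
    return math.sqrt(radius * radius - x * x - y * y)

def saddle(x, y):
    """Hyperbolic paraboloid z = x * y."""
    return x * y
```

Under these assumptions, the sphere cap of radius 2 gives K near 1/4 and H near -1/2 at every visible point sampled, and the saddle z = xy gives K near -1 and H near 0 at the origin.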
Since we can seldom obtain perfect sensor data from the real world, it is desirable
to compute a “rich” characterization of a surface that preserves the surface structure
information and is insensitive to noise. Noise insensitivity may be achieved by
computing redundant, or at least “overlapping,” information about a surface. In
order to have a very rich geometric representation, we propose to combine surface
critical points (local maxima, minima, and saddle points) and large metric determinant
points (depth-discontinuities) with the surface curvature information to characterize
a depth map surface in more detail. These additional features provide useful complementary
information and can be computed for a small additional cost. Given a depth map
surface characterization, we suggest that depth map surface region characteristics
can be matched against pre-computed object model surface region characteristics
guided by depth-discontinuity and critical point information to achieve object
recognition.
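The two complementary feature types described above can be sketched for a depth map stored as a list of rows. This is only an illustrative sketch: the function name `characterize`, the thresholds `grad_eps` and `jump_thresh`, the unit pixel spacing, and the rule of classifying critical points by the sign of the Hessian determinant are all assumptions of the example, not the procedure of the text.

```python
import math

def characterize(z, grad_eps=1e-6, jump_thresh=5.0):
    """Return (critical_points, jump_points) for the interior of depth map z.

    Assumptions: unit pixel spacing, central differences, and hypothetical
    thresholds. A pixel is a depth-discontinuity candidate when the square
    root of the metric determinant, sqrt(1 + fx^2 + fy^2), is large.
    """
    rows, cols = len(z), len(z[0])
    critical, jumps = [], []
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            fx = (z[i][j + 1] - z[i][j - 1]) / 2.0
            fy = (z[i + 1][j] - z[i - 1][j]) / 2.0
            sqrt_g = math.sqrt(1.0 + fx * fx + fy * fy)
            if sqrt_g > jump_thresh:
                jumps.append((i, j))
                continue
            if abs(fx) > grad_eps or abs(fy) > grad_eps:
                continue  # gradient is not (numerically) zero
            fxx = z[i][j + 1] - 2 * z[i][j] + z[i][j - 1]
            fyy = z[i + 1][j] - 2 * z[i][j] + z[i - 1][j]
            fxy = (z[i + 1][j + 1] - z[i + 1][j - 1]
                   - z[i - 1][j + 1] + z[i - 1][j - 1]) / 4.0
            det = fxx * fyy - fxy * fxy
            if abs(det) <= grad_eps:
                continue  # flat or degenerate point: not reported here
            if det < 0:
                kind = "saddle"
            else:
                kind = "minimum" if fxx > 0 else "maximum"
            critical.append((i, j, kind))
    return critical, jumps

# Small synthetic depth maps used to exercise the sketch.
bowl = [[(i - 2) ** 2 + (j - 2) ** 2 for j in range(5)] for i in range(5)]
twist = [[(i - 2) * (j - 2) for j in range(5)] for i in range(5)]
step = [[0.0 if j < 2 else 100.0 for j in range(5)] for i in range(5)]
```

On these toy inputs, the bowl yields a single minimum at its center, the twisted patch yields a single saddle, and the step edge yields only large-metric-determinant points along the jump. Real range data would need smoothing before such thresholds become meaningful.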
The matching algorithm of a robust 3-D object recognition system must be
view-independent. One could use multiple view ideas similar to those of Koenderink
and van Doorn (visual potential) [30, 31] or Chakravarty and Freeman (characteristic
views) [10], but we are pursuing a new, more compact, scheme that does not
increase its storage requirements so dramatically as object complexity increases.
After the matching algorithm has produced a list of possible objects and their
respective locations and orientations, we can use a depth-buffer algorithm to create a
synthetic depth map using the world model. Verification matching could be done
directly between the synthetic depth map and the sensor data, or we may run the
surface characterization algorithm on the synthetic data to yield a synthetic scene
description that could be matched against the surface characterization scene description
computed from the sensor data. If major discrepancies exist, the system should
try to remedy the problems in its understanding automatically. It may also be
necessary to compute our surface characterization using different window sizes
(scales) and correlate features in this scale-space dimension to help overcome the
effects of noise. The matching algorithm, the matching object representation, the
feedback process, and scale-space ideas require further study.
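The verification step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the world model is reduced to a list of spheres `(cx, cy, cz, r)`, projection is orthographic along the depth axis, smaller depth means closer to the sensor, and the names `render_depth` and `mean_abs_discrepancy` are hypothetical. Direct comparison of depth values is only one of the two verification options mentioned in the text.

```python
import math

def render_depth(spheres, width, height, background=math.inf):
    """Depth-buffer rendering of spheres under orthographic projection.

    Each pixel keeps the smallest depth seen so far, which is the core of
    a depth-buffer (z-buffer) algorithm.
    """
    depth = [[background] * width for _ in range(height)]
    for cx, cy, cz, r in spheres:
        for row in range(height):
            for col in range(width):
                d2 = r * r - (col - cx) ** 2 - (row - cy) ** 2
                if d2 < 0:
                    continue  # pixel lies outside this sphere's silhouette
                d = cz - math.sqrt(d2)  # depth of the near side of the sphere
                if d < depth[row][col]:
                    depth[row][col] = d
    return depth

def mean_abs_discrepancy(synthetic, sensed):
    """Mean absolute depth difference over pixels finite in both maps."""
    total, count = 0.0, 0
    for s_row, m_row in zip(synthetic, sensed):
        for s, m in zip(s_row, m_row):
            if math.isfinite(s) and math.isfinite(m):
                total += abs(s - m)
                count += 1
    return total / count if count else math.inf

# Hypothesized scene: a small sphere in front of a larger one.
scene = [(2, 2, 10.0, 2.0), (2, 2, 5.0, 1.0)]
synthetic = render_depth(scene, 5, 5)
```

A large discrepancy between the synthetic and sensed depth maps would signal that the hypothesized objects, locations, or orientations need to be revised.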
5. REVIEW OF DIFFERENTIAL GEOMETRY OF SURFACES
In Section 3, we discussed how range-image object recognition might be decomposed
into a surface recognition problem. We assume that surfaces can be recognized
by their characteristics. But what does this term “surface characteristic” mean?