Journal of Mechanical Science and Technology 28 (7) (2014) 2459~2467
www.springerlink.com/content/1738-494x
DOI 10.1007/s12206-014-0603-7
Voxel-encoded descriptor for 3D model retrieval by exploring model's spatial information†
Jin-Yuan Jia¹, Qian Zhang¹, Long Zeng² and Shuang Liang¹,*
¹School of Software Engineering, Tongji University, Shanghai, China
²Department of Mechanical and Aerospace Engineering, Hong Kong University of Science and Technology, Hong Kong, China
(Manuscript Received October 11, 2013; Revised March 2, 2014; Accepted April 23, 2014)
Abstract
Retrieving products similar to a given one has attracted considerable attention. However, products are usually assembled from multiple components, which frustrates previous visual-based retrieval descriptors. We design a voxel-encoded descriptor (VED) that explores a model's spatial information, i.e., both boundary data and internal data. The descriptor is computed in three steps. First, the posture of a polygonal model is normalized by an improved voxel-based principal component analysis technique. Then, six color images are generated by projecting the voxels along the model's six local axes. The color value of each pixel encodes the status of all voxels intersecting the ray that starts from the pixel and runs parallel to the axis; the status of these voxels embodies the spatial distribution of the model along the ray. Finally, the VED is computed by applying the 2D Fourier transform to the six color images. With the VED, we can distinguish a hollow sphere from a solid one. To improve retrieval efficiency, the database structure is optimized by an improved geometric manifold entropy (iGEOMEN) scheme. VED and iGEOMEN are integrated into a model retrieval system. Experimental results demonstrate that the VED descriptor outperforms previous visual-based shape descriptors, especially on complex assembly models.
Keywords: iGEOMEN; Voxel-encoded descriptor; Model retrieval; Visual similarity; Voxel representation; Shape descriptor
1. Introduction
This work is motivated by an industrial project in which designers often need to find 3D designs similar to a given model. Previous visual-based retrieval methods, e.g., silhouettes [1], binary images [2], depth images [3, 4], characteristic lines [5], etc., are popular because of their efficiency. However, most design models are assembled from multiple components, and the accuracy of these visual-based methods decreases considerably, especially on complex assembly models. The reason is that previous shape descriptors analyze only a model's boundary data and do not explore its spatial structures.
That is, in an assembly model, some components may be occluded by other components when viewed from a specific angle, and the regions between these components act as internal structures. An assembly model therefore usually has a complex internal structure from any specific view. Even if two complex models have the same appearance from all views, their internal structures may differ significantly.
We propose here a new shape descriptor, the voxel-encoded descriptor (VED), based on a voxel representation; it encodes not only a model's visible characteristics but also its internal structures. A given 3D model is approximated by six color images projected along its six local axes. For each pixel of an image, a ray starting from the pixel and parallel to the local axis is constructed. Suppose n voxels intersect this ray, and each voxel has two states, occupied or empty, corresponding to the binary codes 1 and 0. The states of all voxels along the ray can then be written as a binary string of length n. This string is divided into three sub-strings, each translated into one color value of the pixel for the red, green and blue channels, respectively. Finally, the 2D Fourier transform is applied to the six color images to extract the VED.
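The per-ray encoding and the Fourier step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`encode_ray`, `image_descriptor`) are hypothetical, and the exact mapping from a binary sub-string to a channel value is an assumption (here each sub-string is read as a binary integer and scaled to 0..255).

```python
import numpy as np

def encode_ray(occupancy):
    """Encode the occupancy states of the voxels along one ray as an
    RGB triple. `occupancy` is a sequence of booleans (True = occupied).
    The binary string is split into three sub-strings for the R, G and
    B channels; the scaling convention is an assumption."""
    bits = ''.join('1' if v else '0' for v in occupancy)
    k = -(-len(bits) // 3)  # ceiling division: sub-string length
    channels = []
    for i in range(3):
        s = bits[i * k:(i + 1) * k]
        if s:
            # Read the sub-string as a binary integer, scale to 0..255.
            channels.append(round(255 * int(s, 2) / (2 ** len(s) - 1)))
        else:
            channels.append(0)  # ray shorter than three sub-strings
    return tuple(channels)

def image_descriptor(img, k=8):
    """Compact descriptor of one color image via the 2D Fourier
    transform: magnitudes of the k x k lowest-frequency coefficients
    per channel (the coefficient selection is an assumption)."""
    feats = []
    for c in range(img.shape[2]):
        spec = np.abs(np.fft.fft2(img[:, :, c]))
        feats.append(spec[:k, :k].ravel())
    return np.concatenate(feats)
```

For example, a ray whose voxels are all occupied maps to white, and an all-empty ray maps to black; rays with mixed occupancy produce intermediate colors that distinguish different internal distributions along the same viewing direction.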
An assembly model is first normalized by an improved voxel-based pose normalization process, for two reasons. First, the VED descriptor becomes invariant to translation, (uniform) scale and rotation, owing to the voxel-based principal component analysis technique (denoted VPCA and detailed in Sec. 3). Second, because of the voxelization, the VED tolerates a certain level of noise and defects, e.g., holes and cracks [6], which are common in digitized models.
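The pose-normalization idea can be sketched as standard PCA over the centers of the occupied voxels. This is only an illustrative sketch: the paper's improved VPCA (Sec. 3) may differ in detail, and the function name `vpca_align` and the scale convention are assumptions.

```python
import numpy as np

def vpca_align(voxel_centers):
    """Pose-normalize a model given the 3D centers of its occupied
    voxels: center (translation invariance), rotate into the principal
    axes (rotation invariance), and divide by the largest extent
    (uniform-scale invariance)."""
    pts = np.asarray(voxel_centers, dtype=float)
    centered = pts - pts.mean(axis=0)          # remove translation
    cov = centered.T @ centered / len(pts)     # 3x3 covariance of voxel centers
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    axes = eigvecs[:, ::-1]                    # principal axis first
    aligned = centered @ axes                  # rotate into the PCA frame
    scale = np.abs(aligned).max()              # largest extent
    return aligned / scale if scale > 0 else aligned
```

After this step, the six local axes along which the color images are projected coincide with the model's principal axes, so the same model in any initial pose yields the same set of images.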
In addition, to retrieve models efficiently from a large-scale 3D model database, the database structure is optimized with an improved geometric manifold entropy technique, denoted
* Corresponding author. Tel.: +86 21 69585491, Fax.: +86 21 69583731. E-mail address: shuangliang@tongji.edu.cn
† Recommended by Associate Editor Gil Ho Yoon
© KSME & Springer 2014