proposed a hybrid descriptor formed by combining features
extracted from a depth-buffer and spherical-function based
representation, with enhanced translation and rotation in-
variance properties. The advantage of this method over similar approaches is its high discriminative power combined with low space and time requirements.
2.2 Relevance Feedback in 3D Object Retrieval
To enable the machine to retrieve information by adapting to individual categorization criteria, relevance feedback (RF) was introduced as a means to involve the user in the retrieval process and guide the retrieval system towards the target. Relevance feedback was first used to improve text retrieval (Rocchio 1971), was later successfully employed in image retrieval systems and has lately appeared in a few 3D object retrieval systems. It is the information acquired from the user's interaction with the retrieval system about the relevance of a subset of the retrieved results.
Further information on relevance feedback methods can be
found in Ruthven and Lalmas (2003), Crucianu et al. (2004),
Zhou and Huang (2001) and Papadakis et al. (2008b).
Local relevance feedback (LRF), also known as pseudo
or blind relevance feedback, is different from the conven-
tional approach in that the user does not actually provide
any feedback at all. Instead, the required training data are
obtained based only on the unsupervised retrieval result.
The procedure comprises two steps. First, the user submits a query to the system, which uses a set of low-level features to produce a ranked list of results; this list is not displayed to the user. Second, the system reconfigures itself using only the top m matches of the list, on the assumption that these are most likely relevant to the user's query.
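To make the procedure concrete, the following is a minimal sketch of the two-step LRF loop in Python, assuming features are stored as rows of a NumPy array, similarity is measured by Euclidean distance and the system reconfigures itself by averaging the query with the top m matches; the update rule shown is one common choice, not necessarily the one used by the systems cited below.

```python
import numpy as np

def local_relevance_feedback(query, database, m=5):
    """Pseudo/blind relevance feedback: re-rank without user input."""
    # Step 1: initial unsupervised ranking (not displayed to the user).
    dists = np.linalg.norm(database - query, axis=1)
    ranking = np.argsort(dists)
    # Step 2: assume the top m matches are relevant and reconfigure
    # the query, here by averaging it with those matches (a common
    # choice; actual systems may use different update rules).
    expanded = np.vstack([query, database[ranking[:m]]]).mean(axis=0)
    # Final ranking with the expanded query.
    return np.argsort(np.linalg.norm(database - expanded, axis=1))
```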
LRF was first employed in the context of text retrieval,
in order to extend the keywords comprising the query with
related words from the top ranked retrieved documents.
Apart from a few studies that incorporated RF in 3D ob-
ject retrieval (Elad et al. 2001; Bang and Chen 2002; Atmosukarto et al. 2005; Lou et al. 2003; Leifman et al. 2005;
Akbar et al. 2006; Novotni et al. 2005), LRF has only lately
been examined in Papadakis et al. (2008b).
3 Computation of the PANORAMA Descriptor
In this section, we first describe the steps for the compu-
tation of the proposed descriptor (PANORAMA), namely:
(i) pose normalization (Sect. 3.1), (ii) extraction of the
panoramic views (Sect. 3.2) and (iii) feature extraction
(Sect. 3.3). Finally, in Sect. 3.4 we describe a weighting scheme that is applied to the features and the procedure for
comparing two PANORAMA descriptors.
3.1 Pose Normalization
Prior to the extraction of the PANORAMA descriptor, we normalize the pose of a 3D object, since its translation, rotation and scale characteristics should not influence the measure of similarity between objects.
To normalize the translation of a 3D model we compute
its centroid using CPCA (Vranic 2004). In CPCA, the cen-
troid of a 3D mesh model is computed as the average of its triangle centroids, where every triangle is weighted proportionally to its surface area. We translate the model so that its centroid coincides with the origin; translation invariance is thus achieved, since the centroids of all 3D models coincide.
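For illustration, the following is a minimal sketch of this translation step in Python, assuming the mesh is given as a vertex array and a triangle index array (hypothetical input conventions, not prescribed by the method):

```python
import numpy as np

def translate_to_centroid(vertices, triangles):
    """Center a mesh at its area-weighted centroid, as in CPCA.

    vertices  : (V, 3) array of vertex coordinates
    triangles : (T, 3) array of vertex indices per triangle
    """
    a, b, c = (vertices[triangles[:, i]] for i in range(3))
    tri_centroids = (a + b + c) / 3.0
    # Triangle area = half the norm of the edge cross product.
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    centroid = (areas[:, None] * tri_centroids).sum(axis=0) / areas.sum()
    # Translate so that the centroid coincides with the origin.
    return vertices - centroid
```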
To normalize for rotation, we use CPCA and NPCA (Pa-
padakis et al. 2007) in order to align the principal axes of a
3D model with the coordinate axes. First, we align the 3D model using CPCA, which determines the principal axes from the model's spatial surface distribution, and then using NPCA, which determines them from the surface orientation distribution. Both methods use Principal Component Analysis (PCA) to compute the principal axes of the 3D model.
The difference between the two methods lies in the input
data that are used for the computation of the covariance ma-
trix. In particular, CPCA uses the surface coordinates, whereas NPCA uses the surface orientation coordinates, which are obtained from the triangles' normal vectors. Detailed descriptions of the formulations of CPCA and NPCA can be found in Vranic (2004) and in our previous work (Papadakis et al. 2007), respectively.
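The full CPCA and NPCA formulations, including the continuous area-weighted covariance integrals and the disambiguation of axis signs, are given in the cited papers; the simplified sketch below only illustrates the shared PCA step and the different input data (triangle centroids for CPCA versus triangle normals for NPCA), with sign and handedness issues ignored:

```python
import numpy as np

def principal_axes(points, weights):
    """Eigenvectors of the weighted covariance matrix of the points."""
    mean = np.average(points, axis=0, weights=weights)
    centered = points - mean
    cov = (weights[:, None] * centered).T @ centered / weights.sum()
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    return eigvecs[:, ::-1]                  # descending order

def align(vertices, triangles, mode="cpca"):
    a, b, c = (vertices[triangles[:, i]] for i in range(3))
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    if mode == "cpca":
        # CPCA: spatial surface distribution (triangle centroids).
        data = (a + b + c) / 3.0
    else:
        # NPCA: surface orientation distribution (unit normals).
        n = np.cross(b - a, c - a)
        data = n / np.linalg.norm(n, axis=1, keepdims=True)
    # Rotate the principal axes onto the coordinate axes.
    return vertices @ principal_axes(data, areas)
```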
Thus, we obtain two alternative aligned versions of the
3D model, which are separately used to extract two sets of
features that are integrated into a single feature vector (see
Sect. 3.4).
The PANORAMA shape descriptor is rendered scale invariant by normalizing the corresponding features to the unit L1 norm. As will be described later in Sects. 3.3.1 and 3.3.2, the features used by the PANORAMA descriptor are obtained from the 2D Discrete Fourier Transform and the 2D Discrete Wavelet Transform. The corresponding coefficients are proportional to the object's scale; therefore, by normalizing the coefficients to unit L1 norm we are in fact normalizing all objects to the same scale.
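In practice, this normalization amounts to a single division; a minimal sketch, assuming the coefficients are collected in a flat array:

```python
import numpy as np

def l1_normalize(features):
    # Divide by the L1 norm so the absolute coefficients sum to 1,
    # cancelling the proportionality to the object's scale.
    return features / np.abs(features).sum()
```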
3.2 Extraction of Panoramic Views
After the normalization of a 3D model’s pose, the next step
is to acquire a set of panoramic views.
To obtain a panoramic view, we project the model to the
lateral surface of a cylinder of radius R and height H =2R,
centered at the origin with its axis parallel to one of the co-
ordinate axes (see Fig. 1). We set the value of R to 3 · d_mean, where d_mean is the mean distance of the model's surface from its centroid. For each model, the value of d_mean is determined using the diagonal elements of the covariance matrix