well labeled images in over 22,000 object categories. By utilizing
ImageNet to train object detectors, Lai et al. [2012] demonstrated that the resulting detectors can reliably label objects in 3D
scenes. While the amount of available 3D data continues to grow,
it is unlikely that it will ever come close to matching the volume
of image data. Moreover, compared to 2D images, 3D shapes are
inherently more difficult to acquire and process, requiring more ef-
fort to label and analyze. Our work demonstrates how the advantages of image data, namely its sheer volume and relative ease of processing, can be exploited to address challenges arising in the segmentation of 3D shapes.
Projective shape analysis. Treating a 3D shape as a collection
of 2D projections rendered from multiple directions is not new to
computer graphics. Murase and Nayar [1995] recognize an object
by matching its appearance with a large set of 2D images obtained
automatically by rendering 3D models under varying poses and illu-
minations. Lindstrom and Turk [2000] compute an image-space er-
ror metric from these projections to guide mesh simplification. Cyr
and Kimia [2001] generate projections from selected view direc-
tions and use them to identify 3D objects and their poses. Sketch-
or image-based 3D shape retrieval [Eitz et al. 2012] compares ob-
ject projections with query images or user-drawn sketches in 2D.
Similarities among 2D shapes can be evaluated using techniques such as the light field descriptor (LFD) [Chen et al. 2003] and cross-correlation [Makadia and
Daniilidis 2010]. Liu and Zhang [2007] embed a 3D mesh into the
spectral domain, turning the 3D segmentation problem into a con-
tour analysis one. 3D reconstruction from multi-view images is one
of the most fundamental problems in computer vision. Our work
applies projective analysis to a new application: semantic segmen-
tation of 3D shapes. Specifically, we fuse labeled segmentations
learned from back-projected 2D labels to obtain a coherent seman-
tic labeling of a 3D object.
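To make this fusion step concrete, below is a minimal sketch of one plausible fusion rule, weighted per-face majority voting over back-projected view labels. The array layout and the convention that -1 marks a face invisible in a view are our own assumptions, and this is not the paper's actual formulation, which seeks a coherent labeling across the surface.

    import numpy as np

    def fuse_view_labels(face_labels, view_weights=None):
        # face_labels: (V, F) integer array; entry [v, f] is the label
        # back-projected onto mesh face f from view v, or -1 if face f
        # is not visible in that view (a convention assumed here).
        # view_weights: optional (V,) array of per-view confidences.
        V, F = face_labels.shape
        n_labels = int(face_labels.max()) + 1
        w = np.ones(V) if view_weights is None else np.asarray(view_weights, float)
        votes = np.zeros((F, n_labels))
        for v in range(V):
            visible = np.nonzero(face_labels[v] >= 0)[0]
            votes[visible, face_labels[v, visible]] += w[v]
        # Each face takes the label with the largest accumulated vote;
        # a coherent labeling would further smooth votes across
        # adjacent faces rather than decide each face independently.
        return votes.argmax(axis=1)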
Image and shape hybrid processing. 3D shape reconstruction
often benefits from utilizing available 2D data, e.g., from registered
photographs, to improve the quality of 3D scans [Li et al. 2011]. On
the other hand, leveraging a priori 3D geometry of a given object
category can alleviate the ill-posed nature of image analysis from
single photographs. Chang et al. [2009] and Pepik et al. [2012]
combine the representational power of 3D objects with 2D object
category detectors to estimate viewpoints. Xu et al. [2011] take a
data-driven approach for photo-inspired 3D shape creation, where
the best matching 3D candidate is deformed to fit the silhouette of
the object captured in a single photograph. In our work, we also
take a hybrid approach where the semantics of 3D shapes is guided
by constraints learned via projective shape analysis.
Image retrieval. Measuring image similarity for retrieval is ex-
tensively studied in computer vision; see [Xiao et al. 2010] for
a systematic study of image features for scene retrieval. Well-
known distance measures between 2D shapes include Hamming
distance, Hausdorff distance [Baddeley 1992], shock graph edit
distance [Klein et al. 2001], distance between Fourier descrip-
tors [Chen et al. 2003], inner distance shape context [Ling and Ja-
cobs 2007], and context-sensitive shape similarity [Bai et al. 2010].
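As a concrete point of reference for one of these measures, the symmetric Hausdorff distance between two 2D shapes represented as contour point sets can be computed directly; a minimal sketch using SciPy:

    import numpy as np
    from scipy.spatial.distance import directed_hausdorff

    def hausdorff_2d(a, b):
        # a, b: (N, 2) and (M, 2) arrays of 2D contour points. The
        # symmetric distance is the larger of the two directed terms.
        return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

    # Example: samplings of concentric circles with radii 1.0 and 1.1.
    t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
    circle = np.stack([np.cos(t), np.sin(t)], axis=1)
    print(hausdorff_2d(circle, 1.1 * circle))  # ~0.1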
In contrast to previous attempts, we not only retrieve a 2D shape but also infer a semantic labeling of its interior. Unlike existing contour-based methods [Ling and Jacobs 2007], our region-
based analysis allows shape retrieval and label transfer to be con-
ducted in a coherent manner. Moreover, our image retrieval is not
cross-category, but within-category, with the goal of finding shapes
with similar topological features to guide part-aware label trans-
fer. To properly evaluate the differences between the corresponding parts of two shapes, we implicitly warp one shape to match the other before computing dissimilarity using a topology-aware Hausdorff distance measure.

Figure 3: Region-based matching via warp alignment. Both the labeled images (left column) and the query projection (middle column) are cut into axis-aligned slabs. Each labeled image is then warped to match the query projection. The dissimilarity is measured on the warp-aligned shapes, allowing the matching to favor the shape with a similar topology (top row) over the one with parts at similar scales and positions (bottom row). Note that although the bottom chair is visually more similar, the top chair is more useful for labeling the armrest area of the query projection.
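For intuition, here is a rough sketch of this slab-based matching under simplifying assumptions of our own: equal-height horizontal slabs cut from the bounding box, nearest-neighbor resampling as the warp, and plain pixel disagreement standing in for the topology-aware Hausdorff measure:

    import numpy as np

    def crop_to_foreground(mask):
        # Crop a binary mask to the bounding box of its foreground.
        ys, xs = np.nonzero(mask)
        return mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    def resize_nn(mask, shape):
        # Nearest-neighbor resampling of a mask to (height, width);
        # this plays the role of the per-slab warp.
        h, w = shape
        ys = np.arange(h) * mask.shape[0] // h
        xs = np.arange(w) * mask.shape[1] // w
        return mask[np.ix_(ys, xs)]

    def slab_warp_dissimilarity(labeled, query, n_slabs=4):
        # Cut both cropped masks into n_slabs horizontal slabs, warp
        # each labeled slab onto the corresponding query slab, and
        # report the fraction of pixels on which the warp-aligned
        # shapes disagree.
        labeled = crop_to_foreground(labeled)
        query = crop_to_foreground(query)
        mismatched, total = 0, 0
        for i in range(n_slabs):
            l0 = i * labeled.shape[0] // n_slabs
            l1 = (i + 1) * labeled.shape[0] // n_slabs
            q0 = i * query.shape[0] // n_slabs
            q1 = (i + 1) * query.shape[0] // n_slabs
            q_slab = query[q0:q1]
            l_slab = resize_nn(labeled[l0:l1], q_slab.shape)
            mismatched += np.count_nonzero(l_slab != q_slab)
            total += q_slab.size
        return mismatched / total

Because each slab is stretched independently, an exemplar whose parts merely sit at different heights or scales can still align well, which is the effect Figure 3 illustrates.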
Image label transfer. Semantic label transfer is another core
problem in computer vision. Existing approaches can be classified as learning-based or non-parametric. The former learn a model for each object category; a successful example is TextonBoost [Shotton et al. 2006], which trains a conditional random field (CRF) model. A drawback of learning-based methods is that they do not scale well with the number of object categories. With
the emergence of large image databases, non-parametric methods
have demonstrated their advantages. Given an input image, Liu et
al. [2011a] first retrieve its nearest neighbors from a large database
using GIST matching [Oliva and Torralba 2001]; they then transfer annotations from each of these neighbors and integrate them via dense correspondences estimated with SIFT flow [Liu et al. 2011b]. Compared to learning-based approaches, this method has few parameters and allows more images and/or new categories to be added without requiring additional training. When the set of anno-
tated images is small, Zhang et al. [2010] and Chen et al. [2012]
further learn an object model from the retrieved nearest neighbors
to improve the performance of label transfer. Our approach incor-
porates the same nearest neighbor idea, but instead of performing
label transfer within the whole image domain, we compute seman-
tic labeling for the interior of the 2D shape only. This provides
us additional constraints for obtaining a better labeling result. In
addition, almost all existing dense correspondence estimation ap-
proaches [Liu et al. 2011b; Berg et al. 2005; Leordeanu and Hebert
2005; Duchenne et al. 2011] rely on local intensity patterns and are
unsuitable for transferring labels to textureless 2D projections.
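For illustration, the following is a minimal sketch of such nearest-neighbor label transfer restricted to the shape interior; every name here is hypothetical, a downsampled-mask descriptor stands in for GIST, and a whole-shape nearest-neighbor resize stands in for dense correspondence:

    import numpy as np

    def mask_descriptor(mask, size=16):
        # Crude retrieval descriptor: a downsampled copy of the binary
        # mask, standing in for GIST or any stronger image feature.
        ys = np.arange(size) * mask.shape[0] // size
        xs = np.arange(size) * mask.shape[1] // size
        return mask[np.ix_(ys, xs)].astype(float).ravel()

    def transfer_labels(query_mask, database):
        # database: list of (exemplar_mask, label_map) pairs, where
        # label_map is an integer image with 0 marking background.
        q = mask_descriptor(query_mask)
        _, labels = min(database,
                        key=lambda e: np.linalg.norm(mask_descriptor(e[0]) - q))
        # Warp the winning exemplar's labels onto the query frame ...
        h, w = query_mask.shape
        ys = np.arange(h) * labels.shape[0] // h
        xs = np.arange(w) * labels.shape[1] // w
        warped = labels[np.ix_(ys, xs)]
        # ... and keep them only inside the query shape, the extra
        # constraint a 2D projection provides over a full photograph.
        return np.where(query_mask, warped, 0)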
3 Overview
Our image-driven shape analysis is based on a dataset of pre-labeled images that captures semantic knowledge about the relevant
class of shapes. The input is a 3D mesh model, possibly non-
manifold, incomplete, or self-intersecting. The 3D shape and the
labeled images belong to the same semantic class. We assume that
both the input and the objects captured in the labeled images are
in their upright orientations. In practice, we found the assumption
to hold for the vast majority of the data, e.g., almost all chair im-
ages found on Google. We apply our multi-view shape matching