From a Single RGB Image to a 3D Mesh Model: An Analysis of the Pixel2Mesh Method


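"Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images"

Pixel2Mesh is a deep learning method that produces a 3D triangular mesh from a single RGB image. It was proposed by Nanyang Wang, Yinda Zhang, Zhuwen Li and colleagues from Fudan University, Princeton University, Intel Labs and Tencent AI Lab. The method addresses a limitation of earlier 3D shape representations such as volumes and point clouds, which are non-trivial to convert into the more directly usable mesh form.

In Pixel2Mesh the 3D mesh is treated as a graph and processed by a graph-based convolutional neural network (GCN). The network extracts perceptual features from the input image and uses them to progressively deform an initial ellipsoid into the target geometry. The deformation follows a coarse-to-fine strategy, which keeps the whole procedure stable.

Several mesh-related losses constrain the deformation. In the paper these are a Chamfer loss and a normal loss, which pull the predicted vertices and surface orientation toward the ground-truth shape, plus a Laplacian regularizer and an edge-length regularizer, which keep neighboring vertices moving coherently and penalize overly long edges. The network has no color loss: image appearance enters only through the perceptual features, and the connectivity of the mesh is inherited from the initial ellipsoid rather than learned.

The Pixel2Mesh workflow can be summarized as follows:

1. Image feature extraction: a 2D convolutional network extracts features from the input RGB image for the later shape estimation.
2. Initialization: a simple ellipsoid mesh serves as the starting shape.
3. Graph convolution: GCN layers operate on the mesh graph, updating per-vertex features that combine vertex coordinates, shape features, and image features.
4. Deformation: vertex positions are adjusted step by step, from a coarse mesh to a finer one, until the mesh takes on the target shape.
5. Loss optimization: during training, backpropagation minimizes the mesh losses described above.
6. Output: the resulting mesh can be used directly for rendering, animation, or other 3D applications.

The advantage of this approach is that it recovers 3D information from a single 2D image and outputs a mesh directly, which is of practical value for 3D modeling and computer vision. Although Pixel2Mesh performs well in many scenarios, it may struggle with images that contain complex textures, occlusion, or large illumination changes, and these cases call for further research.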
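The geometric loss mentioned in the summary can be illustrated with the Chamfer distance, which the Pixel2Mesh paper uses to compare predicted vertices with ground-truth surface points. The following is a minimal NumPy sketch, not the authors' implementation; the function name `chamfer_distance` and the brute-force pairwise computation are assumptions made for illustration.

```python
import numpy as np

def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance between point sets.

    pred: (N, 3) array of predicted vertex positions.
    gt:   (M, 3) array of ground-truth surface points.
    Returns the sum of squared distances from each point to its
    nearest neighbor in the other set, taken in both directions.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Pairwise squared distances, shape (N, M). Brute force: O(N * M) memory.
    diff = pred[:, None, :] - gt[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    pred_to_gt = sq_dist.min(axis=1).sum()  # each predicted point -> nearest gt point
    gt_to_pred = sq_dist.min(axis=0).sum()  # each gt point -> nearest predicted point
    return pred_to_gt + gt_to_pred

if __name__ == "__main__":
    pred = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    gt = [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
    print(chamfer_distance(pred, gt))
```

The brute-force distance matrix is fine for a few thousand points; larger point sets would normally use a spatial index or a GPU kernel instead.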

N. Wang, Y. Zhang, Z. Li, Y. Fu, W. Liu, Y. Jiang

[...] constraint on a modern GPU. Most recently, Tatarchenko et al. [30] have proposed an octree representation, which allows reconstructing higher-resolution outputs with a limited memory budget. However, a 3D voxel grid is still not a popular shape representation in the game and movie industries. To avoid the drawbacks of the voxel representation, Fan et al. [9] propose to generate point clouds from single images. The point cloud representation has no local connections between points, and thus the point positions have a very large degree of freedom. Consequently, the generated point cloud is usually not close to a surface and cannot be used to recover a 3D mesh directly. Besides these typical 3D representations, there is an interesting work [28] which uses a so-called "geometry image" to represent a 3D shape. Thus, their network is a 2D convolutional neural network which conducts an image-to-image mapping. Our work is most closely related to two recent works, [17] and [24]. However, the former adopts simple silhouette supervision, and hence does not perform well for complicated objects such as cars and lamps; the latter needs a large model repository to generate a combined model.

Our base network is a graph neural network [26]; this architecture has been adopted for shape analysis [31]. Meanwhile, there are charting-based methods which directly apply convolutions on surface manifolds [2,22,23] for shape analysis. As far as we know, these architectures have never been adopted for 3D reconstruction from single images, though graphs and surface manifolds are natural representations for meshed objects. For a comprehensive understanding of graph neural networks, the charting-based methods, and their applications, please refer to the survey [3].

3 Method

3.1 Preliminary: Graph-based Convolution

We first provide some background on graph-based convolution; a more detailed introduction can be found in [3]. A 3D mesh is a collection of vertices, edges and faces that defines the shape of a 3D object; it can be represented by a graph $\mathcal{M} = (\mathcal{V}, \mathcal{E}, \mathcal{F})$, where $\mathcal{V} = \{v_i\}_{i=1}^{N}$ is the set of $N$ vertices in the mesh, $\mathcal{E} = \{e_i\}_{i=1}^{E}$ is the set of $E$ edges, each connecting two vertices, and $\mathcal{F} = \{f_i\}_{i=1}^{N}$ are the feature vectors attached to the vertices. A graph-based convolutional layer is defined on an irregular graph as:

$$f_p^{l+1} = w_0 f_p^{l} + \sum_{q \in \mathcal{N}(p)} w_1 f_q^{l} \tag{1}$$

where $f_p^{l} \in \mathbb{R}^{d_l}$ and $f_p^{l+1} \in \mathbb{R}^{d_{l+1}}$ are the feature vectors on vertex $p$ before and after the convolution, and $\mathcal{N}(p)$ is the set of neighboring vertices of $p$; $w_0$ and $w_1$ are the learnable parameter matrices of size $d_l \times d_{l+1}$ that are applied to all vertices. Note that $w_1$ is shared for all edges, and thus (1) works on nodes with different vertex degrees. In our case, the attached feature vector $f_p$ is the concatenation of the 3D vertex coordinate, the feature encoding the 3D shape, and the feature learned from the input color image (if they exist). Running convolutions updates the features, which is equivalent to applying a deformation.
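To make Eq. (1) concrete, the sketch below builds the neighbor sets N(p) from a triangle list and applies one graph convolution in NumPy. It is an illustration under stated assumptions, not the paper's code: the helper names `neighbors_from_faces` and `graph_conv` are hypothetical, features are stored as row vectors so that the weight matrices have the d_l × d_{l+1} shape given in the text, and bias and nonlinearity are omitted because Eq. (1) does not include them.

```python
import numpy as np

def neighbors_from_faces(num_vertices, faces):
    """Return N(p) for every vertex p, given triangles as index triples."""
    nbrs = [set() for _ in range(num_vertices)]
    for a, b, c in faces:
        # Every pair of vertices in a triangle shares an edge.
        nbrs[a].update((b, c))
        nbrs[b].update((a, c))
        nbrs[c].update((a, b))
    return [sorted(s) for s in nbrs]

def graph_conv(features, nbrs, w0, w1):
    """One layer of Eq. (1): f_p^{l+1} = w0 f_p^l + sum_{q in N(p)} w1 f_q^l.

    features: (N, d_l) array, one row per vertex.
    w0, w1:   (d_l, d_{l+1}) arrays shared by all vertices and edges.
    """
    features = np.asarray(features, dtype=float)
    out = features @ w0              # self term, shape (N, d_{l+1})
    neighbor_proj = features @ w1    # w1 applied once to every vertex
    for p, qs in enumerate(nbrs):
        if qs:
            # Sum over neighbors handles any vertex degree.
            out[p] += neighbor_proj[qs].sum(axis=0)
    return out

if __name__ == "__main__":
    # A tetrahedron: 4 vertices, 4 triangular faces, every vertex has degree 3.
    faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    coords = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    nbrs = neighbors_from_faces(len(coords), faces)
    rng = np.random.default_rng(0)
    w0 = rng.normal(size=(3, 8))
    w1 = rng.normal(size=(3, 8))
    print(graph_conv(coords, nbrs, w0, w1).shape)
```

Because `w1` is applied to every vertex once and then summed over each neighbor list, the same two matrices serve vertices of any degree, which is the property the text points out.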


Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images. In ECCV 2018.
