WarpingGAN: A Network for 3D Point Cloud Generation with Uniform Priors




WarpingGAN: Warping Multiple Uniform Priors for Adversarial 3D Point Cloud Generation

Yingzhi Tang1*, Yue Qian1*, Qijian Zhang1, Yiming Zeng1, Junhui Hou1, Xuefei Zhe2
1 City University of Hong Kong   2 Tencent AI Lab
{yztang4-c, yueqian4-c, qijizhang3-c, ym.zeng}@my.cityu.edu.hk, jh.hou@cityu.edu.hk

[...] augmented/virtual reality [16], animation [9, 27], and immersive telepresence [21]. However, obtaining 3D point cloud data is still costly and time-consuming in realistic scenarios, especially for shapes with complex geometry and topology. Besides, point clouds acquired with 3D sensing devices are usually incomplete and sparse due to occlusions, distances, and surface materials. The great success of generative adversarial network (GAN)-based 2D image generation [4, 6, 31] makes synthesizing realistic-looking 3D point clouds promising, i.e., generating point clouds whose statistical distribution is similar to that of real point clouds. However, the essentially different data modality and the unique characteristics of 3D point clouds, i.e., their irregular structure and lack of ordering, make it non-trivial to extend GAN-based 2D image generation methods to 3D point cloud generation.

Recently, several works on 3D point cloud generation have been proposed [1, 10, 12, 15, 20, 22, 26, 28]. For example, GAN-based methods [1, 10, 20, 22, 26] usually use multi-layer perceptrons (MLPs) as generators to directly learn mapping functions between latent codes and 3D point clouds, which requires a large number of parameters to fit. Moreover, as the adversarial learning mechanism cannot impose strong constraints on global shapes and local geometric details, these approaches tend to generate non-uniformly distributed point clouds, as illustrated in Fig. 2. Yang et al. [28] and Luo et al. [17] considered 3D point cloud generation as a probabilistic problem, first sampling points from a Gaussian space and then moving them to the

Figure 1 (panels: ShapeGF, DPM, PDGN, TreeGAN, WarpingGAN, SP-GAN).
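Visual comparison of the shapes generated by state-of-the-art 3D point cloud generation methods. TreeGAN [20], PDGN [10] and SP-GAN [15] are GAN-based, while ShapeGF [5] and DPM [17] are probabilistic-based.

Abstract

We propose WarpingGAN, an effective and efficient 3D point cloud generation network. Unlike existing methods that generate point clouds by directly learning the mapping functions between latent codes and 3D shapes, WarpingGAN learns a unified local-warping function to warp multiple identical pre-defined priors (i.e., sets of points uniformly distributed on regular 3D grids) into 3D shapes driven by local structure-aware semantics. In addition, we exploit the principle of the discriminator and tailor a stitching loss that eliminates the gaps between different partitions of a generated shape to improve quality. Owing to this novel generating mechanism, WarpingGAN, a single lightweight network after one-time training, can efficiently generate uniformly distributed 3D point clouds with various resolutions. Extensive experimental results demonstrate the superiority of WarpingGAN over state-of-the-art methods in terms of quantitative metrics, visual quality, and efficiency. The source code is publicly available at https://github.com/yztang4/WarpingGAN.git.

1. Introduction

This work was supported by the Hong Kong Research Grants Council under grants CityU 11202320 and 11218121. Corresponding author: J. Hou. * Equal contribution.

Figure 2 panels: (a) TreeGAN, (b) PDGN, (c) SP-GAN.

Existing 3D point cloud generation methods can be roughly classified into two categories, i.e., GAN-based and probabilistic-based methods.

GAN-based approaches. As the first work, Achlioptas et al. [1] proposed rGAN, whose generator consists of several fully connected layers. However, neither the generator nor the discriminator of rGAN can make good use of local information, and rGAN tends to generate defective parts. Valsesia et al. [22] utilized a graph convolution network to learn local dependencies between a point and its neighbors by designing a localized operation. Shu et al. [20] proposed TreeGAN, where a tree structure is introduced to preserve ancestor information instead of neighbor information when generating new points. Hui et al. [10] proposed a progressive learning strategy to generate multi-resolution point clouds, with a learning-based bilateral interpolation to exploit local geometric structure. The above methods employ MLPs on the global feature to directly generate point clouds, which can be hard to optimize and inflexible in terms of the number of points. To simplify the learning process, TreeGAN and PDGN adopt an inefficient progressive architecture in the generator.

Recently, Li et al. [15] proposed SP-GAN for point cloud generation and manipulation. They introduce a pre-defined sphere to perform deformation.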
By contrast, our method adopts multiple uniform 3D priors to warp each shape partition, enabling higher-quality point clouds. Furthermore, unlike SP-GAN, our architecture does not employ the time-consuming kNN operation in the generator. The experimental results show that our proposed WarpingGAN is more effective and efficient than SP-GAN.

It is worth noting that Wang et al. [24] experimentally found that current GAN-based frameworks can only adopt PointNet as the discriminator. More advanced point cloud frameworks such as PointNet++ [19] and DGCNN [25] cannot be optimized as discriminators. However, PointNet learns point-wise features and uses the max-pooling symmetric function to select the critical points that determine the global shape feature. Thus, only a small portion of critical points guides the global shape, and local geometric information is lost. Therefore, a well-designed generator is crucial for GAN-based methods.

Probabilistic-based approaches regard point clouds as samples from a distribution, and then move the sampled points to the target positions during the generative phase. PointFlow [28] uses the continuous normalizing flow framework to transform the parameters of the distribution of shapes and the distribution of points given a shape. ShapeGF [5] generates point clouds by learning the gradient field of the log-density and moving points gradually in the gradient direction. DPM [17] simulates the generation process as non-equilibrium thermodynamics by converting the

Figure 2 panel: (d) WarpingGAN.
Figure 2.
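Visual comparison of point clouds generated by different GAN-based methods.

target positions by learning a distribution transformation. However, since these methods tend to estimate the average distribution of the training data, the generated point clouds are blurry and lack clear global shapes and local details. Besides limited quality, existing methods are also inefficient owing to time-consuming k-nearest-neighbor (kNN) search, e.g., PDGN [10] and SP-GAN [15], a progressive generation process, e.g., TreeGAN [20] and PDGN [10], or a two-stage training strategy, e.g., ShapeGF [5], which also prevents end-to-end optimization.

To address these issues, we propose WarpingGAN, which introduces a novel GAN-based 3D point cloud generation mechanism. By leveraging multiple uniform 3D priors, i.e., sets of points uniformly distributed in a unit 3D cube, WarpingGAN formulates generation as learning a function that warps the multiple 3D priors into different local regions of a 3D shape under the guidance of local structure-aware semantics. This is fundamentally different from existing methods, which directly learn to generate a fixed number of points from a latent code. We also design a stitching loss that reduces the local discrepancy between generated and real point clouds in order to close the gaps between different partitions. This new mechanism makes WarpingGAN compact and efficient, and it allows WarpingGAN to generate point clouds with different numbers of points after one-time training. In addition, the uniformity of the 3D priors implicitly regularizes WarpingGAN to some extent, promoting uniformly distributed point clouds, as shown in Fig. 2.

In summary, our work makes the following contributions:
• we study GAN-based 3D point cloud generation from the new perspective of unified local warping, which makes WarpingGAN lightweight, efficient, and flexible in its output;
• by exploiting the inherent design of the discriminator without introducing additional complex operations, we propose a stitching loss tailored to WarpingGAN to boost the performance of the generator; and
• we conduct extensive experiments and analyses that demonstrate the superiority of WarpingGAN over state-of-the-art methods.

2. Related Work

[...] process only requires unified parameters $\Theta$ to be optimized, which is more compact and easier. Moreover, although we expect that using multiple priors rather than a single one will improve generation quality, the weak supervision ability of the discriminator may prevent this objective from being realized, i.e., the supervision is insufficient to drive the local regions warped from different priors to fit tightly to each other, resulting in gaps between them. To handle this issue, we further tailor a simple yet effective stitching loss to supervise the training of the generator.

Fig. 3 illustrates the overall architecture of the generator of our WarpingGAN, which consists of two modules, i.e., code enhancement and unified local warping. Specifically, taking a latent code $z$ as input, the code enhancement module first enhances its ability to represent 3D shapes by changing its distribution. Conditioned on the enhanced code, the unified local-warping module is then performed twice in succession to warp multiple pre-defined uniform 3D priors into different local regions, which are finally assembled into a 3D shape. Owing to this warping-style mechanism, WarpingGAN is highly compact and efficient.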
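The generator pipeline described above (code enhancement, then two passes of a shared local-warping MLP over M identical grid priors) can be sketched in NumPy as follows. This is only an illustrative sketch with random, untrained weights: the hidden-layer widths, the enhanced-code dimension D = 256, the 4 x 4 x 8 grid, and all function names are assumptions made for the example, not details taken from the paper or its released code.

```python
import numpy as np

def make_uniform_prior(nx=4, ny=4, nz=8):
    """Points on a regular 3D grid inside the unit cube, shape (nx*ny*nz, 3)."""
    axes = [np.linspace(0.0, 1.0, k) for k in (nx, ny, nz)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3)

def init_mlp(rng, sizes):
    """Random (weight, bias) pairs; a stand-in for trained parameters."""
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp(x, params, slope=0.2):
    """Point-wise MLP with LeakyReLU on the hidden layers."""
    for idx, (w, b) in enumerate(params):
        x = x @ w + b
        if idx < len(params) - 1:
            x = np.where(x > 0, x, slope * x)
    return x

def unified_local_warp(points, z_enh, params):
    """Warp M parts, shape (M, n, 3), with one MLP shared by all parts."""
    M, n, _ = points.shape
    local = z_enh.reshape(M, -1)                     # M local codes, each of length D/M
    glob = np.broadcast_to(z_enh, (M, z_enh.size))   # global code copied M times
    code = np.concatenate([local, glob], axis=1)     # (M, D/M + D)
    code = np.broadcast_to(code[:, None, :], (M, n, code.shape[1]))
    return mlp(np.concatenate([points, code], axis=-1), params)

def generate(z, M=16, D=256, grid=(4, 4, 8), seed=0):
    """Latent code z of shape (C,) -> point cloud of shape (M * n, 3)."""
    rng = np.random.default_rng(seed)
    enhance = init_mlp(rng, [z.size, 128, 128, 256, 256, D])  # five FC layers
    in_dim = 3 + D // M + D                                   # coordinates + local + global
    warp1 = init_mlp(rng, [in_dim, 64, 3])
    warp2 = init_mlp(rng, [in_dim, 64, 3])
    z_enh = mlp(z, enhance)
    prior = make_uniform_prior(*grid)
    priors = np.broadcast_to(prior, (M,) + prior.shape)       # M identical priors
    parts = unified_local_warp(priors, z_enh, warp1)          # first warping
    parts = unified_local_warp(parts, z_enh, warp2)           # second warping
    return parts.reshape(-1, 3)                               # merge parts into one cloud
```

Because the warping MLPs act point-wise and their weights do not depend on n, calling `generate(z, grid=(4, 4, 4))` with the same weights yields 1024 points instead of 2048, which is the sense in which one trained network can serve several resolutions.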
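Besides, WarpingGAN is able to generate point

noise distribution into the shape distribution with a Markov chain. Some of these methods [17, 28] can achieve flexible generation because they process each point independently. However, because correlations between points are missing, they cannot properly resolve the non-uniformity and noise of the generated shapes. In addition, many probabilistic-based methods [5] adopt a two-stage training process that requires additional autoencoder training. Note that ShapeGF uses a GAN structure in addition to the autoencoder.

Point cloud autoencoders aim to reconstruct input point clouds through a narrow bottleneck layer. For example, FoldingNet [29] introduces a folding-style operation that reconstructs 3D shapes by learning a mapping function from a 2D grid to the 3D point cloud. AtlasNet [7] uses a set of 2D grids to reconstruct the surface patch by patch through multiple independent MLPs. Bednarik et al. [2] addressed the patch collapse and overlap problems of AtlasNet by computing differential properties (first- and second-order derivatives) of the reconstructed surfaces.

Remark: We believe that such folding-style decoders are potentially applicable to GAN-based point cloud generation, which has been overlooked by previous frameworks. In this work, we take a step in this direction by investigating a more powerful point cloud generator. Note that directly adopting the decoders of FoldingNet and AtlasNet as the generator of a GAN-based point cloud generation framework cannot produce satisfactory point clouds; see the analysis in Section 3.1 and the experimental demonstration in Section 4.3.

3. Proposed Method

3.1. Problem Analysis and Formulation

Given a latent code $z \in \mathbb{R}^{C}$ following a Gaussian distribution, the generator $F(\cdot)$ tries to produce a point cloud $P \in \mathbb{R}^{N \times 3}$ with the same statistical distribution as the real dataset $\{\tilde{P}\}$. Most existing GAN-based methods directly learn the mapping function between $z$ and $P$, i.e., $F(z; \Theta) = P$, where $\Theta$ denotes the network parameters to be learned. However, owing to the weak supervision ability of the discriminator, learning $F(\cdot)$ is very difficult, especially for datasets with complex shapes, which limits the quality of the generated point clouds. More specifically, Wang et al. [24] pointed out that the PointNet architecture [18], the only feasible discriminator, can perceive only the distribution of a small number of critical points rather than that of the whole point cloud.

Inspired by the folding-style design mentioned above, we consider warping a uniform 3D prior (i.e., a set of 3D points uniformly distributed on a regular 3D grid), denoted as $U \in [0, 1]^{N \times 3}$, into a 3D shape. Accordingly, we reformulate the generation process point-wise as $F(u_i, z; \Theta) = p_i$, $\forall i \in [1, N]$, where $u_i$ and $p_i$ are the $i$-th points of $U$ and $P$, respectively. We expect the uniformity of the pre-defined 3D prior to regularize the distribution of all generated points and thus mitigate the limitation of the max-pooling-based discriminator mentioned above.

However, a single prior may not generate complex shapes well, because the topology of the prior differs essentially from that of 3D objects. We therefore use multiple uniform 3D priors, denoted as $\{U_j \in [0, 1]^{n \times 3}\}_{j=1}^{M}$, and warp each of them to capture one local region of the generated point cloud $P$ (i.e., $P_j \in \mathbb{R}^{n \times 3}$), where $N = M \times n$. To achieve this, one intuitive way, similar to AtlasNet [7], is to learn a warping process independently for each pair of $U_j$ and $P_j$, i.e., $p_i^j = F(u_i^j, z; \Theta_j)$, where $u_i^j$ and $p_i^j$ are the $i$-th points of $U_j$ and $P_j$, respectively. However, this significantly increases the number of network parameters $\{\Theta_j\}_{j=1}^{M}$ and makes the network difficult to train.

To realize generation in an efficient and effective manner, we finally formulate it as a unified local-warping process. That is, we use $M$ different globally correlated local codes $\{z_j\}_{j=1}^{M}$ (see Section 3.2 for details), each of which is expected to embed the semantics of one typical local region. A local code can thus drive the warping of a prior into the corresponding structure through the same MLP. The process is therefore generally written as

$$p_i^j = F(u_i^j, z_j; \Theta), \quad \forall j \in [1, M] \text{ and } \forall i \in [1, n]. \tag{1}$$

3.2. Warping-based Generator

Code enhancement. As mentioned above, WarpingGAN aims to learn a warping function under the guidance of the latent code $z$. However, $z$ is randomly drawn from a Gaussian distribution and thus lacks the implicit semantic information needed to represent the shape faithfully.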
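To fill this knowledge gap, we propose the code enhancement module shown in Fig. 3(a), composed of five fully connected layers that transform $z$ into $\tilde{z} \in \mathbb{R}^{D}$ of a higher dimension ($D > C$). In Section 4.3, we illustrate that after the data-driven training process, such a module can transform the Gaussian distribution into a distribution comparable to that of the features extracted from real datasets, which encode shape semantics.

Figure 3. Architecture of the WarpingGAN generator, comprising (a) the code enhancement module and (b) the unified local-warping module. During training, it takes a latent code and M pre-defined uniform 3D priors as input. The unified local-warping module is performed twice in succession to generate a 3D shape. (Diagram labels: uniform priors, M × (n × 3); first-warped parts, M × (n × 3); second-warped parts, M × (n × 3); generated point cloud, N × 3; global shape code, D × 1; shared MLPs with input M × (n × (3 + d + D)) and output M × (n × 3); operations: copy, concatenate, split, merge.)

clouds with different numbers of points after one-time training by changing the size of $U_j$ (i.e., the value of $n$).

Unified local warping. As shown in Fig. 3(b), the unified local-warping module, guided by the enhanced latent code $\tilde{z}$, generates various local regions from the multiple 3D priors; these regions are then assembled into a point cloud with complex topology. Specifically, we first split $\tilde{z}$ into $M$ local codes of equal length, denoted as $\{\tilde{z}_j \in \mathbb{R}^{D/M}\}_{j=1}^{M}$, and then concatenate each local code with the global shape information to obtain $z_j = [\tilde{z}_j, \tilde{z}] \in \mathbb{R}^{D(M+1)/M}$. Finally, we concatenate $z_j$ with each coordinate of $U_j$ and feed the result into MLPs to regress the points of the $j$-th local region point by point. We perform this warping process twice in succession, so the unified local-warping process can be written as

$$p_i^j = F\big(F(u_i^j, z_j; \Theta_1), z_j; \Theta_2\big), \quad \forall j \in [1, M] \text{ and } \forall i \in [1, n], \tag{2}$$

where $\Theta_1$ and $\Theta_2$ are the network parameters of the two successive warping processes. Note that concatenating $\tilde{z}$ to the local codes is crucial, because $\tilde{z}$ provides the basic global shape information that coordinates the different local codes (see the example in Section 4.3).

Figure 4. Architecture of the WarpingGAN discriminator. It takes a real or generated point cloud as input and produces a confidence value together with the key points (shown in red) used by the stitching loss. (Diagram labels: real/generated point cloud, MLPs, pooling, global feature, real/fake?, stitching loss.)

3.3. Training Objectives

Discriminator. Following previous works [10, 20], we adopt PointNet as the discriminator $D(\cdot)$ to train the generator. As shown in Fig. 4, it takes a real or generated point cloud as input and performs binary classification. Specifically, it first learns point-wise features with MLPs and then applies max pooling to obtain a global shape feature, which is used to determine the confidence value. Meanwhile, thanks to the max-pooling operation, we can retrieve a small number of key points from the input point cloud, which describe the skeleton of the input shape [18].

Stitching loss. As analyzed in Section 3.1, the design of the vanilla discriminator leads to limited supervision ability and cannot make the generated local regions fit tightly to each other. To address this issue, we use the key points, without introducing additional complex operations, to propose a stitching loss that minimizes the local discrepancy between $P$ and $\tilde{P}$. Let $\{q_i\}_{i=1}^{Q} \subset P$ and $\{\tilde{q}_i\}_{i=1}^{\tilde{Q}} \subset \tilde{P}$ denote the key points retrieved from the input point clouds by the max-pooling operation of the discriminator. We first find the $K$ nearest neighbors of each $q_i$, $\mathcal{N}(q_i) = \{p_i^k\}_{k=1}^{K} \subset P$, and compute the pairwise distances $d_i^k = \|q_i - p_i^k\|_2$. The same operation is performed on the other point cloud, and the stitching loss compares the two resulting sums over key points (only this skeleton of the equation survives in the source; the per-key-point term is marked $[\ldots]$):

$$\mathcal{L}_s = [\ldots] \sum_{i=1}^{Q} [\ldots]_i(\mathcal{N}_i) - \sum_{i=1}^{\tilde{Q}} [\ldots]_i(\tilde{\mathcal{N}}_i) \, [\ldots] \tag{3}$$

4. Experiments

4.1.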
Experiment Settings

Dataset. Following the settings in [20], we selected three categories of ShapeNet, i.e., Chair, Airplane and Car, to train and evaluate WarpingGAN. Each point cloud contains 2048 points.

Implementation details. WarpingGAN samples latent codes of dimension C = 128 from a Gaussian distribution as input and generates point clouds with N = 2048 points each. We set the number of priors M to 16 for all shapes, K = 40 for computing the stitching loss, and λs = 0.05 and λgp = 10 during the training phase. We adopted LeakyReLU with a negative slope of 0.2 in each layer. We used Adam with learning rate r = 0.0001, β1 = 0 and β2 = 0.99 to optimize both the generator and the discriminator, and set the batch size to 32. We implemented the whole network in PyTorch and trained it on an Nvidia RTX 2080 Ti GPU with an Intel(R) Xeon(R) CPU.

Table 1. Quantitative comparison of WarpingGAN with five state-of-the-art methods over two categories. The listed MMD and COV values were obtained by multiplying the original values by 10^3 and 10^2, respectively. ↑ (resp. ↓) means the higher (resp. lower), the better.

Method          Chair                       Airplane
                MMD↓   COV↑    Uniform↓     MMD↓   COV↑    Uniform↓
TreeGAN [20]    9.6    45.00   0.88         3.8    42.50   0.45
PDGN [10]       9.3    51.25   0.85         3.4    41.25   0.21
SP-GAN [15]     11.5   41.25   0.34         3.5    46.25   0.05
ShapeGF [5]     9.6    50.00   0.64         3.5    47.50   0.09
DPM [17]        9.4    37.50   1.45         3.4    33.75   0.35
WarpingGAN      8.7    53.75   0.29         3.3    48.75   0.02

4.2. Comparison with State-of-the-Art Methods

We compared the proposed WarpingGAN with five state-of-the-art point cloud generation frameworks, including three GAN-based methods, i.e., TreeGAN [20], PDGN [10] and SP-GAN [15], and two probabilistic-based methods, i.e., ShapeGF [5] and DPM [17].

Quantitative comparison. Following the settings of SP-GAN [15], we used Minimal Matching Distance (MMD) and Coverage (COV) to quantitatively evaluate the quality of the point clouds generated by different methods.
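To make the two metrics concrete, the following NumPy sketch computes MMD and COV between a set of generated clouds and a reference set, using their usual definitions: MMD averages, over reference clouds, the distance to the closest generated cloud, and COV is the fraction of reference clouds that are the nearest neighbor of at least one generated cloud. The sketch assumes the Chamfer distance as the cloud-to-cloud distance, which this section does not specify, and the names `chamfer` and `mmd_cov` are hypothetical.

```python
import numpy as np

def chamfer(a, b):
    """Symmetric Chamfer distance between point sets a (n, 3) and b (m, 3)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)   # squared pairwise distances
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def mmd_cov(generated, reference):
    """Return (MMD, COV) for two lists of point clouds."""
    # dist[g, r] = distance between generated cloud g and reference cloud r
    dist = np.array([[chamfer(g, r) for r in reference] for g in generated])
    mmd = dist.min(axis=0).mean()                 # closest generated cloud per reference
    matched = np.unique(dist.argmin(axis=1))      # reference matched by each generated cloud
    cov = matched.size / len(reference)
    return mmd, cov
```

A lower MMD means each real shape has a close generated counterpart, while a higher COV means the generated set does not collapse onto a few reference shapes; the two are therefore usually reported together.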
Besides, we also adopted the uniformity loss* of [14] to quantitatively measure the uniformity of the generated point clouds.

As listed in Table 1, our WarpingGAN outperforms all the other methods on all metrics. Specifically, the lower MMD values imply that WarpingGAN generates shapes with high fidelity to the point clouds in the real dataset, the higher COV values show that the generated shapes cover a larger fraction of the real shapes, and the lower Uniform values indicate that WarpingGAN generates point clouds with more uniformly distributed points. Moreover, we point out that MMD and COV do not necessarily and reliably correlate with the quality of generated data, as also discussed in [28, 30]. We therefore refer readers to the visual quality of the generated point clouds shown below and in the Supplementary Material.

Visual comparison. We visualized the generated 3D

*We measured the normalized point clouds with various percentages of points, i.e., p ∈ {0.002, 0.004, 0.006, 0.008, 0.012, 0.015}, and report the average uniformity over all p.

Applying the same nearest-neighbor operation to the key points $\{\tilde{q}_i\}_{i=1}^{\tilde{Q}}$ of $\tilde{P}$, the stitching loss of Eq. (3) is defined from these quantities, where [...] $K$ and $\bar{d}_i$ is the mean of $\{d_i^k\}_{k=1}^{K}$. Note that the stitching loss performs the kNN operation on only a very small number of key points and only during training, so the proposed method remains highly efficient in both training and testing; see Section 4.2 for details.

Joint optimization. To train the proposed WarpingGAN, we adopt the improved WGAN loss [8], which comprises a generator loss $\mathcal{L}_g(\cdot)$ and a discriminator loss $\mathcal{L}_d(\cdot)$ with a Lipschitz constraint. Specifically, $\mathcal{L}_g(\cdot)$ is defined as

$$[\ldots] \tag{4}$$

where $\mathbb{P}_P$ is the distribution of generated shapes and $\mathcal{L}_s$ is the proposed stitching loss, balanced by the weight $\lambda_s > 0$. $\mathcal{L}_d(\cdot)$ is defined as

$$[\ldots] + \lambda_{gp}\, \mathbb{E}_{\hat{p}} \big[ (\|\nabla_{\hat{p}} D(\hat{p})\|_2 - 1)^2 \big], \tag{5}$$

where $\mathbb{P}_{\tilde{P}}$ is the distribution of real point clouds $\tilde{P}$, $\hat{p}$ is sampled uniformly by interpolating between pairs of shapes sampled from $\mathbb{P}_P$ and $\mathbb{P}_{\tilde{P}}$ so as to satisfy the 1-Lipschitz constraint [8], and $\lambda_{gp} > 0$ is the weight balancing the gradient penalty term.

Figure 5. Visual illustration of the generated Chair, Airplane and Car shapes by our WarpingGAN. These shapes ha
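The stitching loss of Section 3.3 can be sketched in NumPy as below. Since Eq. (3) is only partly legible in this copy, the sketch makes explicit assumptions: the per-key-point term is taken to be the variance of the K neighbor distances $d_i^k$ around their mean $\bar{d}_i$, the two point clouds are compared by the absolute difference of the key-point averages, and a key point is not counted as its own neighbor. The function names and the use of raw per-point feature matrices as input are likewise hypothetical.

```python
import numpy as np

def critical_point_indices(features):
    """Indices of the points selected by max pooling over per-point features (N, F)."""
    return np.unique(features.argmax(axis=0))

def knn_distance_variance(points, keypoint_idx, K):
    """For each key point, variance of the distances to its K nearest neighbours."""
    q = points[keypoint_idx]                                          # (Q, 3)
    d = np.linalg.norm(q[:, None, :] - points[None, :, :], axis=-1)   # (Q, N)
    d_knn = np.sort(d, axis=1)[:, 1:K + 1]    # skip column 0: the key point itself
    return d_knn.var(axis=1)                  # mean squared deviation from d_bar_i

def stitching_loss(gen, gen_feat, real, real_feat, K=40):
    """Assumed form: |mean_i var_i(generated) - mean_i var_i(real)|."""
    s_gen = knn_distance_variance(gen, critical_point_indices(gen_feat), K).mean()
    s_real = knn_distance_variance(real, critical_point_indices(real_feat), K).mean()
    return abs(s_gen - s_real)
```

Only the key points selected by max pooling (at most one per feature channel) take part in the neighbor search, which is why the extra cost during training stays small compared with running kNN on every point.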
