3. Related work
Apart from their application in computer vision [28, 29, 30, 31, 33, 77, 82, 85], GANs have also been employed in natural language processing [43, 44, 80], medicine [67, 37], and several other fields [16, 51, 60]. Much recent research has accordingly focused on providing ways to avoid the problems discussed in Section 2 [45, 53].
Mode collapse
For instance, Metz et al. [53] unroll the optimization of the discriminator to obtain a better estimate of the optimal discriminator at each step, which remedies mode collapse. However, due to its high computational complexity, this approach does not scale to large datasets. VEEGAN [69] adds a reconstruction term to the bi-directional GAN [15, 14] objective that does not depend on the discriminator; this term can provide a training signal to the generator even when the discriminator does not. PacGAN [45] changes the discriminator to make decisions based on a pack of samples. This change mitigates mode collapse by making it easier for the discriminator to detect a lack of diversity and naturally penalize the generator when mode collapse happens. Lucic et al. [49], motivated by the better performance of supervised GANs, propose using a small set of labels and a semi-supervised method to infer labels for the entire dataset. They further improve performance by utilizing an auxiliary rotation loss similar to that of RotNet [17].
Mode connecting
Based on Theorem 1, to avoid mode connecting one has to either use a latent variable $z$ with a disconnected support, or allow $G_\theta$ to be a discontinuous function [27, 36, 40, 46, 64].
To obtain a disconnected latent space, DeLiGAN [22] samples $z$ from a mixture of Gaussians, while Odena et al. [59] add a discrete dimension to the latent variable. Other methods dissect the latent space post-training using some variant of rejection sampling: for example, Azadi et al. [2] perform rejection sampling based on the discriminator's score, and Tanielian et al. [70] reject samples for which the norm of the generator's Jacobian is higher than a certain threshold.
The discontinuous-generator approach is mostly realized by learning multiple generators, with the primary motivation being to remedy mode collapse, which also reduces mode connecting. Both MGAN [27] and DMWGAN [36] employ K different generators while penalizing them for overlapping with each other. However, these works do not explicitly address the case where some of the data modes are not captured. Also, as shown in Liu et al. [46], MGAN is quite sensitive to the choice of K. By contrast, Self-Conditioned GAN [46] clusters the space using the discriminator's final layer and uses the cluster labels as self-supervised conditions. However, in practice, their clustering does not seem to be reliable (e.g., in terms of NMI on labeled datasets), and the features depend heavily on the choice of the discriminator's architecture. In addition, there is no guarantee that the generators will be guided to generate from their assigned clusters. GAN-Tree [40] uses hierarchical clustering to address continuous multi-modal data, with the number of parameters increasing linearly with the number of clusters. It is thus limited to very few clusters (e.g., 5) and can only capture a few modes.
Another recently expanding direction explores the benefit of using image augmentation techniques for generative modeling. Some works simply augment the data using various perturbations (e.g., random crop, horizontal flipping) [34]. Others [9, 49, 84] incorporate regularization on top of the augmentations; for example, CRGAN [83] enforces consistency across different image perturbations. ADA [32] processes each image using non-leaking augmentations and adaptively tunes the augmentation strength during training. These works are orthogonal to ours and can be combined with our method.
4. Method
This section first describes how GANs are trained on a partitioned space using a mixture of generators/discriminators and the unified objective function required for this goal. We then explain our differentiable space partitioner and how we guide the generators towards the right region. We conclude the section by making connections to supervised GANs, which use an auxiliary classifier [56, 59].
Multi-generator/discriminator objective: Given a partitioning of the space, we train a generator ($G_i$) and a discriminator ($D_i$) for each region. To avoid over-parameterization and to allow information sharing across regions, we employ parameter sharing across the $G_i$'s ($D_i$'s) by tying their parameters except the input (last) layer. The mixture of these generators serves as our main generator $G$. We use the following objective function to train our GANs:
$$\sum_{i}^{k} \pi_i \Big[ \min_{G_i} \max_{D_i} V(D_i, G_i, A_i) \Big] \tag{1}$$

where $A_1, A_2, \ldots, A_k$ is a partitioning of the space, $\pi_i := p_{\text{data}}(x \in A_i)$, and:

$$V(D, G, A) = \mathbb{E}_{x \sim p_{\text{data}}(x \mid x \in A)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z \mid G(z) \in A)}[\log(1 - D(G(z)))] \tag{2}$$
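As a rough illustration of how objective (1) can be computed in practice, the following PyTorch-style sketch sums the per-region GAN losses weighted by $\pi_i$. It is a minimal sketch under our own naming assumptions: `generators`, `discriminators`, `pi`, and the per-region batches are hypothetical placeholders; the networks are kept fully separate here, whereas in our method the $G_i$'s ($D_i$'s) share all parameters except the input (last) layer; and $z$ is drawn from the unconditional prior for each $G_i$, whereas Eq. (2) conditions on $G_i(z) \in A_i$ (handled by the guidance described later in this section).

```python
import torch
import torch.nn.functional as F

def mixture_gan_losses(generators, discriminators, pi, region_batches, z_dim, device):
    """Per-region GAN losses combined as in Eq. (1).

    generators / discriminators: lists of k region-specific networks
    pi: length-k tensor with pi[i] = p_data(x in A_i)
    region_batches: list of k real-image batches, batch i drawn from region A_i
    """
    d_loss_total, g_loss_total = 0.0, 0.0
    for i, (G_i, D_i, x_real) in enumerate(zip(generators, discriminators, region_batches)):
        z = torch.randn(x_real.size(0), z_dim, device=device)
        x_fake = G_i(z)

        real_logits = D_i(x_real)
        fake_logits = D_i(x_fake.detach())

        # V(D_i, G_i, A_i): D_i should output 1 on real samples from A_i and 0 on fakes.
        d_loss = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) \
               + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))

        # Non-saturating generator loss for the same value function.
        gen_logits = D_i(x_fake)
        g_loss = F.binary_cross_entropy_with_logits(gen_logits, torch.ones_like(gen_logits))

        # Weight each region's loss by pi_i, as in Eq. (1).
        d_loss_total = d_loss_total + pi[i] * d_loss
        g_loss_total = g_loss_total + pi[i] * g_loss
    return d_loss_total, g_loss_total
```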
We motivate this objective by making a connection to the Jensen–Shannon distance (JSD) between the distribution of our mixture of generators and the data distribution in the following theorem.
Theorem 2. Let $P = \sum_i^k \pi_i p_i$, $Q = \sum_i^k \pi_i q_i$, and $A_1, A_2, \ldots, A_k$ be a partitioning of the space, such that the support of each distribution $p_i$ and $q_i$ is $A_i$. Then:

$$JSD(P \,\|\, Q) = \sum_i \pi_i \, JSD(p_i \,\|\, q_i) \tag{3}$$