Fig. 2. The schematic diagram of the loss function in the loss-sensitive GAN. $\Delta(x, G(z))$ represents the distance between $x$ and $G(z)$, which can be a p-norm. As the generator $G$ iterates, the similarity between the generated samples and the real samples increases, so the loss function can more reasonably address the vanishing gradient problem.
an infinite modeling ability, so there is no restriction on the distribution of real samples, which may lead to the vanishing gradient problem. The loss-sensitive GAN learns a loss function $L_\theta(x)$ parameterized by $\theta$, which restricts the modeling ability of the discriminator $D$. The loss function is learned with a data-dependent margin, under the assumption that the loss of the real data distribution should be smaller than that of the generated one. In this way, the loss-sensitive GAN proves that, for Lipschitz densities, the distribution of the generated samples converges to the distribution of the real samples. The objective function of the loss-sensitive GAN is defined in Eqs. (13) and (14),
$$\min_{D} V(D) = \mathbb{E}_{x \sim p_{data}(x)}\!\left[ L_\theta(x) \right] + \lambda\, \mathbb{E}_{\substack{x \sim p_{data}(x) \\ z \sim p_z(z)}}\!\left[ \Delta(x, G(z)) + L_\theta(x) - L_\theta(G(z)) \right]_{+}, \tag{13}$$

$$\min_{G} V(G) = \mathbb{E}_{z \sim p_z(z)}\!\left[ L_\theta(G(z)) \right], \tag{14}$$
where $(a)_{+} = \max(a, 0)$, $\Delta(x, G(z))$ represents the margin measuring the difference between $x$ and $G(z)$, and $\lambda$ is a balancing parameter. Fig. 2 illustrates this idea in detail. As can be seen from Fig. 2, if the generated data distribution is close to the real one, it is no longer treated as a negative sample, and more effort can be concentrated on improving the samples that are far away from the real samples. It should be noted that the margin is not a fixed constant; it is a similarity function chosen for the specific experiment, such as a p-norm. When a generated sample is very close to a real one in the metric space, the margin vanishes.
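As a concrete illustration, the following is a minimal PyTorch sketch of the two objectives above; it is not the original implementation. It assumes that `loss_net` is a network implementing $L_\theta$ that maps a batch of samples to per-sample scalar losses, that real and generated samples are paired within a batch, and that the margin $\Delta$ is an L1 norm; all names are illustrative.

```python
import torch

def lsgan_d_loss(loss_net, x_real, x_fake, lam=1.0):
    """Loss-sensitive D objective, Eq. (13): the learned loss should be
    smaller on real samples than on generated ones by the margin Delta."""
    l_real = loss_net(x_real)  # L_theta(x), shape (batch,)
    l_fake = loss_net(x_fake)  # L_theta(G(z)), shape (batch,)
    # Margin Delta(x, G(z)): per-sample L1 distance (one p-norm choice);
    # assumes x_real and x_fake are paired batches of equal shape.
    delta = (x_real - x_fake).abs().flatten(1).sum(dim=1)
    # Hinge (.)_+: pairs that already satisfy the margin contribute
    # nothing, so well-modelled samples stop acting as negatives.
    margin_term = torch.relu(delta + l_real - l_fake)
    return (l_real + lam * margin_term).mean()

def lsgan_g_loss(loss_net, x_fake):
    """Generator objective, Eq. (14): minimize L_theta(G(z))."""
    return loss_net(x_fake).mean()
```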
(b) Maximum mean discrepancy (MMD)
Assume that $\chi$ is a non-empty metric space, $\mathcal{F}$ is a class of functions $f: \chi \to \mathbb{R}$, and $X \sim p$, $Y \sim q$. The maximum mean discrepancy (MMD) [25] between $p$ and $q$ is defined in Eq. (15),

$$\mathrm{MMD}[\mathcal{F}, p, q] = \sup_{f \in \mathcal{F}} \left( \mathbb{E}[f(X)] - \mathbb{E}[f(Y)] \right). \tag{15}$$
The reproducing kernel Hilbert space (RKHS) $\mathcal{H}$ is an infinite-dimensional function space. Each $f \in \mathcal{H}$ is associated with a reproducing kernel $k$, and $f$ can be formulated as,

$$f(x) = \langle f, k(\cdot, x) \rangle_{\mathcal{H}} = \sum_i \alpha_i\, k(x, x_i). \tag{16}$$
If $\mathcal{F}$ is chosen to be the RKHS $\mathcal{H}$, the mean embedding $\mu_p$ of $p$ is calculated as follows,

$$\mu_p = \int_{\chi} k(x, \cdot)\, p(dx) \in \mathcal{H}. \tag{17}$$
For each $f \in \mathcal{H}$, $\mathbb{E}[f(X)] = \langle f, \mu_p \rangle_{\mathcal{H}}$, so the MMD can be formulated as a mean feature matching problem, as shown in Eq. (18),

$$\begin{aligned}
\mathrm{MMD}[\mathcal{F}, p, q] &= \sup_{\|f\|_{\mathcal{H}} \le 1} \left( \mathbb{E}_p[f(x)] - \mathbb{E}_q[f(y)] \right)\\
&= \sup_{\|f\|_{\mathcal{H}} \le 1} \left( \mathbb{E}_p \langle \varphi(x), f \rangle_{\mathcal{H}} - \mathbb{E}_q \langle \varphi(y), f \rangle_{\mathcal{H}} \right)\\
&= \sup_{\|f\|_{\mathcal{H}} \le 1} \langle \mu_p - \mu_q, f \rangle_{\mathcal{H}}\\
&= \| \mu_p - \mu_q \|_{\mathcal{H}},
\end{aligned} \tag{18}$$
where $\varphi(\cdot)$ denotes the feature map that embeds $x$ into $\mathcal{H}$, $\mu_p = \mathbb{E}_p[\varphi(x)]$, and $\mu_q = \mathbb{E}_q[\varphi(y)]$. The MMD was first proposed for the two-sample test problem, i.e., to determine whether two distributions $p$ and $q$ differ. In practical applications, the square of the MMD is generally used, and it is defined as,
$$\begin{aligned}
\mathrm{MMD}^2[\mathcal{F}, p, q] &= \langle \mu_p - \mu_q, \mu_p - \mu_q \rangle_{\mathcal{H}}\\
&= \langle \mu_p, \mu_p \rangle_{\mathcal{H}} + \langle \mu_q, \mu_q \rangle_{\mathcal{H}} - 2 \langle \mu_p, \mu_q \rangle_{\mathcal{H}}\\
&= \mathbb{E}_p \langle \varphi(x), \varphi(x') \rangle_{\mathcal{H}} - 2\, \mathbb{E}_{p,q} \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}} + \mathbb{E}_q \langle \varphi(y), \varphi(y') \rangle_{\mathcal{H}}.
\end{aligned} \tag{19}$$
Using the kernel trick, each inner product in Eq. (19) can be evaluated through a kernel function $k(x, y) = \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}}$. Various kernel functions can be chosen, such as the linear kernel, the Gaussian kernel, and the Laplacian kernel.
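To make Eq. (19) and the kernel trick concrete, here is a small NumPy sketch of the (biased) squared-MMD estimator with a Gaussian kernel. The bandwidth `sigma` is an assumption added for generality; the fixed kernel $\exp(-\|x - y\|^2)$ used below in GMMN corresponds to omitting the $2\sigma^2$ factor.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for all pairs of rows."""
    sq_dists = (np.sum(a**2, axis=1)[:, None]
                + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    """Biased estimator of Eq. (19):
    E_p[k(x, x')] - 2 E_{p,q}[k(x, y)] + E_q[k(y, y')]."""
    return (gaussian_kernel(x, x, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean())

# Two-sample test intuition: samples from the same distribution give a
# near-zero MMD^2, while a shifted distribution gives a larger value.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(500, 2))
print(mmd2(x, rng.normal(0.0, 1.0, size=(500, 2))))  # close to 0
print(mmd2(x, rng.normal(2.0, 1.0, size=(500, 2))))  # clearly larger
```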
Based on the fixed Gaussian kernel $k(x, y) = \exp(-\|x - y\|^2)$, Li et al. [18] proposed the generative moment matching networks (GMMN), which measure the discrepancy between the two distributions in GANs by minimizing the MMD distance. Unlike regular GANs, the GMMN uses an autoencoder instead of a discriminator to estimate the discrepancy between the two distributions. Although this improves the stability of the generated samples during training, the training efficiency of GMMN is not satisfactory. To improve the generalization ability and computational efficiency of GMMN, the MMDGAN [19] replaced the static fixed Gaussian kernel with an adversarially learned kernel. The learned kernel consists of a Gaussian kernel composed with an injective function $f_\phi$, where $k(x, y) = \exp(-\|f_\phi(x) - f_\phi(y)\|^2)$. In addition, to encourage $f_\phi$ to be injective, they used an autoencoder in the discriminator.
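A compact PyTorch sketch of this adversarially learned kernel is given below, under stated assumptions: `f_phi` stands for any embedding network producing flat feature vectors, and the autoencoder reconstruction term used to enforce injectivity is omitted. In MMDGAN, the critic $f_\phi$ is trained to maximize this squared MMD while the generator minimizes it.

```python
import torch

def learned_kernel(f_phi, a, b):
    """MMDGAN kernel k(x, y) = exp(-||f_phi(x) - f_phi(y)||^2),
    where f_phi is a learned (ideally injective) embedding network."""
    fa, fb = f_phi(a), f_phi(b)            # embed both sample sets
    sq_dists = torch.cdist(fa, fb) ** 2    # pairwise squared distances
    return torch.exp(-sq_dists)

def mmd2_learned(f_phi, x_real, x_fake):
    """Biased squared MMD under the learned kernel (cf. Eq. (19));
    the autoencoding penalty on f_phi is omitted in this sketch."""
    return (learned_kernel(f_phi, x_real, x_real).mean()
            - 2.0 * learned_kernel(f_phi, x_real, x_fake).mean()
            + learned_kernel(f_phi, x_fake, x_fake).mean())
```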
3) Other objective function methods: In addition to using the Lipschitz density assumption to constrain the sample distribution, non-probabilistic criteria can also be used to measure GANs. The energy-based GAN (EBGAN) [20] is a typical method of this form. Unlike the discriminator used in regular GANs, the discriminator