in Section 3, and experimental results are presented in Section 4. Finally, we
conclude the paper in Section 5.
2 Related Work
Deep generative models attempt to capture the probability distribution of the
given data. Restricted Boltzmann Machines (RBMs), one type of deep generative
model, are the basis of many other hierarchical models, and they have been used
to model the distributions of images [15] and documents [16]. Deep Belief
Networks (DBNs) [17] and Deep Boltzmann Machines (DBMs) [5] extend RBMs. The
most successful application of DBNs is image classification [17], where DBNs
are used to extract feature representations. However, RBMs, DBNs and DBMs all
suffer from intractable partition functions or intractable posterior
distributions, so approximate methods must be used to learn these models.
Another important deep generative model is the Variational Autoencoder (VAE)
[6], a directed model that can be trained with gradient-based optimization
methods. However, VAEs are trained by maximizing a variational lower bound on
the data likelihood, which tends to produce blurry generated images.
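Concretely, with the notation of [6] (encoder $q_\phi(z|x)$, decoder $p_\theta(x|z)$ and prior $p(z)$), the bound maximized by a VAE takes the standard form
$$\log p_\theta(x) \,\ge\, \mathbb{E}_{q_\phi(z|x)}\!\left[\log p_\theta(x|z)\right] - D_{\mathrm{KL}}\!\left(q_\phi(z|x)\,\|\,p(z)\right),$$
where the pixel-wise reconstruction term is one commonly cited source of the blurriness mentioned above.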
Recently, Generative Adversarial Networks (GANs) were proposed by Goodfellow
et al. [7], who formulated GAN learning as a game-theoretic scenario between a
generator and a discriminator. Compared with the above models, training GANs
does not require any approximation method. Like VAEs, GANs can also be trained
end-to-end through differentiable networks. Having shown powerful capability on
unsupervised tasks, GANs have been applied to many specific tasks, such as
image generation [18], image super-resolution [9], text-to-image synthesis [19]
and image-to-image translation [20]. By combining a traditional content loss
with an adversarial loss, super-resolution generative adversarial networks [9]
achieve state-of-the-art performance on the task of image super-resolution.
Reed et al. [19] proposed a model to synthesize images from text descriptions
based on conditional GANs [21]. Isola et al. [20] also use conditional GANs to
translate images from one representation to another. In addition to
unsupervised learning tasks, GANs also show potential for semi-supervised
learning. Salimans et al. [10] proposed a GAN-based framework for
semi-supervised learning, in which the discriminator not only outputs the
probability that an input image comes from the real data but also the
probabilities that it belongs to each class.
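As a point of reference, in the original formulation [7] the generator $G$ and the discriminator $D$ play the two-player minimax game
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\!\left[\log\left(1 - D(G(z))\right)\right],$$
which both networks optimize directly with gradient-based methods; in the semi-supervised setting of [10], the discriminator is extended to output $K+1$ classes, with the extra class reserved for generated samples.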
Despite the great success GANs have achieved, improving the quality of
generated images remains a challenge, and many works have been proposed to
address it. Radford et al. [13] first introduced convolutional layers into the
GAN architecture and proposed a network architecture called deep convolutional
generative adversarial networks (DCGANs). Denton et al. [22] proposed another
framework, the Laplacian pyramid of generative adversarial networks (LAPGAN),
which constructs a Laplacian pyramid to generate high-resolution images
starting from low-resolution ones. Further, Salimans et al. [10] proposed a
technique called feature matching to get better