Figure 1: DCGAN generator used for LSUN scene modeling. A 100-dimensional uniform distribution Z is projected to a convolutional representation with a small spatial extent and many feature maps. A series of four fractionally-strided convolutions (in some recent papers, these are wrongly called deconvolutions) then converts this high-level representation into a 64 × 64 pixel image. Notably, no fully connected or pooling layers are used.
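For concreteness, the following is a minimal PyTorch sketch of the generator in Figure 1. PyTorch itself, the channel widths, and the BatchNorm/ReLU/Tanh placement are assumptions of this sketch; the figure specifies only the projection of Z to a small spatial volume and the four fractionally-strided convolutions up to 64 × 64.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Sketch of a DCGAN-style generator: z (100-d) -> 64x64x3 image.

    Channel widths and the BatchNorm/ReLU/Tanh choices follow common
    DCGAN practice; treat them as illustrative assumptions rather than
    the exact configuration in the figure.
    """
    def __init__(self, z_dim=100, ngf=128):
        super().__init__()
        self.net = nn.Sequential(
            # Project z to a 4x4 spatial volume with many feature maps.
            nn.ConvTranspose2d(z_dim, ngf * 8, 4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(inplace=True),
            # Four fractionally-strided (transposed) convolutions, each
            # doubling spatial resolution: 4 -> 8 -> 16 -> 32 -> 64.
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ngf, 3, 4, stride=2, padding=1, bias=False),
            nn.Tanh(),  # outputs in [-1, 1]; no fully connected or pooling layers
        )

    def forward(self, z):
        # z: (batch, z_dim) -> (batch, z_dim, 1, 1) for the first conv.
        return self.net(z.view(z.size(0), z.size(1), 1, 1))
```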
We found that leaving the Adam momentum term β1 at the suggested value of 0.9 resulted in training oscillation and instability, while reducing it to 0.5 helped stabilize training.
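A minimal sketch of this optimizer setting, assuming PyTorch's Adam (the learning rate and the placeholder networks are illustrative assumptions; only the β1 value comes from the text):

```python
import torch
import torch.nn as nn

# Placeholder networks standing in for the actual generator/discriminator.
generator = nn.Linear(100, 64 * 64 * 3)
discriminator = nn.Linear(64 * 64 * 3, 1)

# Adam with the momentum term beta1 reduced from its suggested default of
# 0.9 to 0.5, as described above. The learning rate is an illustrative
# assumption; only the beta1 value comes from the text.
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
```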
4.1 LSUN
As the visual quality of samples from generative image models has improved, concerns about over-fitting and memorization of training samples have risen. To demonstrate how our model scales with more data and higher-resolution generation, we train a model on the LSUN bedrooms dataset, which contains a little over 3 million training examples. Recent analysis has shown a direct link between how fast models learn and their generalization performance (Hardt et al., 2015). We show samples from one epoch of training (Fig. 2), mimicking online learning, in addition to samples after convergence (Fig. 3), to demonstrate that our model is not producing high-quality samples simply by overfitting/memorizing training examples. No data augmentation was applied to the images.
4.1.1 DEDUPLICATION
To further decrease the likelihood of the generator memorizing input examples (Fig. 2), we perform a simple image de-duplication process. We fit a 3072-128-3072 de-noising, dropout-regularized ReLU autoencoder on 32×32 downsampled center-crops of training examples. The resulting code-layer activations are then binarized by thresholding the ReLU activations, which has been shown to be an effective information-preserving technique (Srivastava et al., 2014) and provides a convenient form of semantic hashing, allowing for linear-time de-duplication. Visual inspection of hash collisions showed high precision, with an estimated false positive rate of less than 1 in 100. Additionally, the technique detected and removed approximately 275,000 near duplicates, suggesting high recall.
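A minimal sketch of this de-duplication pipeline, assuming PyTorch (the corruption rate, the zero threshold, and the omitted training loop are assumptions of the sketch; the 3072-128-3072 shape and ReLU binarization are as described above):

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Sketch of the 3072-128-3072 de-noising autoencoder described above
    (3072 = 32*32*3 flattened pixels). Input dropout provides the
    de-noising regularization; the corruption rate is an assumption."""
    def __init__(self):
        super().__init__()
        self.corrupt = nn.Dropout(p=0.5)  # assumed corruption rate
        self.encode = nn.Sequential(nn.Linear(3072, 128), nn.ReLU())
        self.decode = nn.Linear(128, 3072)

    def forward(self, x):
        code = self.encode(self.corrupt(x))
        return self.decode(code), code

def semantic_hashes(model, images):
    """Binarize the 128-d code layer by thresholding the ReLU activations
    at zero, yielding a hashable 128-bit code per image."""
    model.eval()  # disable dropout so hashes are deterministic
    with torch.no_grad():
        _, codes = model(images)  # images: (N, 3072) float tensor
    return [tuple((c > 0).int().tolist()) for c in codes]

def deduplicate(model, images):
    """Linear-time de-duplication: keep the first image per hash bucket."""
    seen, keep = set(), []
    for i, h in enumerate(semantic_hashes(model, images)):
        if h not in seen:
            seen.add(h)
            keep.append(i)
    return keep  # indices of retained (non-duplicate) images
```

In practice the 128 bits would be packed into a compact key, but a set of 128-tuples already gives the linear-time collision lookup described above.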
4.2 FACES
We scraped images containing human faces from random web image queries of people's names. The names were acquired from DBpedia, with the criterion that the people were born in the modern era. This dataset contains 3M images of 10K people. We run an OpenCV face detector on these images, keeping detections that are of sufficiently high resolution, which gives us approximately 350,000 face boxes. We use these face boxes for training. No data augmentation was applied to the images.