Algorithm 1: Adversarial training of refiner network R_θ

Input: Sets of synthetic images x_i ∈ X and real images y_j ∈ Y, max number of steps (T), number of discriminator network updates per step (K_d), number of generative network updates per step (K_g).
Output: ConvNet model R_θ.

for t = 1, ..., T do
    for k = 1, ..., K_g do
        1. Sample a mini-batch of synthetic images x_i.
        2. Update θ by taking an SGD step on the mini-batch loss L_R(θ) in (4).
    end
    for k = 1, ..., K_d do
        1. Sample a mini-batch of synthetic images x_i and real images y_j.
        2. Compute x̃_i = R_θ(x_i) with the current θ.
        3. Update φ by taking an SGD step on the mini-batch loss L_D(φ) in (2).
    end
end
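The alternating updates of Algorithm 1 can be sketched directly in code. Below is a minimal PyTorch-style sketch, not the paper's implementation: it assumes R and D are nn.Modules, that D outputs per-patch probabilities of the 'fake' class, and that sample_synthetic and sample_real are hypothetical mini-batch loaders.

import torch

def train(R, D, sample_synthetic, sample_real,
          T=10000, K_g=2, K_d=1, lam=0.1, lr=1e-4, eps=1e-8):
    opt_R = torch.optim.SGD(R.parameters(), lr=lr)
    opt_D = torch.optim.SGD(D.parameters(), lr=lr)
    for t in range(T):
        # K_g refiner updates: minimize L_R(theta) in Eq. (4), phi held fixed
        for _ in range(K_g):
            x = sample_synthetic()            # mini-batch of synthetic images
            x_ref = R(x)
            loss_R = (-torch.log(1 - D(x_ref) + eps).sum()      # realism term
                      + lam * (x_ref - x).abs().sum())          # l1 self-regularization
            opt_R.zero_grad(); loss_R.backward(); opt_R.step()
        # K_d discriminator updates: minimize L_D(phi) in Eq. (2), theta held fixed
        for _ in range(K_d):
            x, y = sample_synthetic(), sample_real()
            x_ref = R(x).detach()             # refined images with current theta
            loss_D = (-torch.log(D(x_ref) + eps).sum()          # refined -> fake
                      - torch.log(1 - D(y) + eps).sum())        # real -> not fake
            opt_D.zero_grad(); loss_D.backward(); opt_D.step()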
Figure 3. Illustration of local adversarial loss. The discriminator network outputs a w × h probability map. The adversarial loss function is the sum of the cross-entropy losses over the local patches.
loss function (1) used in our implementation is:

$$
\mathcal{L}_R(\theta) = -\sum_i \log\bigl(1 - D_\phi(R_\theta(x_i))\bigr)
+ \lambda \left\lVert R_\theta(x_i) - x_i \right\rVert_1, \qquad (4)
$$
where ‖·‖₁ is the ℓ₁ norm. We implement R_θ as a fully convolutional neural net without striding or pooling. This modifies the synthetic image on a pixel level, rather than holistically modifying the image content as in, e.g., a fully connected encoder network, and thus preserves the global structure and the annotations. We learn the refiner and discriminator parameters by minimizing L_R(θ) and L_D(φ) alternately. While updating the parameters of R_θ, we keep φ fixed, and while updating D_φ, we fix θ. We summarize this training procedure in Algorithm 1.
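To make the architectural constraint concrete, here is a minimal sketch of a fully convolutional refiner with no striding or pooling, so the output keeps the input's resolution. The layer count and widths are illustrative assumptions, not the paper's exact architecture.

import torch.nn as nn

class Refiner(nn.Module):
    def __init__(self, channels=1, features=64, n_layers=4):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, stride=1, padding=1), nn.ReLU()]
        for _ in range(n_layers):
            layers += [nn.Conv2d(features, features, 3, stride=1, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(features, channels, kernel_size=1)]  # back to image channels
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # Stride-1, padded convolutions keep the spatial size unchanged,
        # so the image is modified pixel by pixel and the global structure
        # and annotations are preserved.
        return self.net(x)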
2.2. Local Adversarial Loss
Another key requirement for the refiner network is that it should learn to model the real image characteristics without introducing any artifacts.

Figure 4. Illustration of using a history of refined images. See text for details.
When we train a single strong discriminator network, the refiner network tends to over-emphasize certain image features to fool the current discriminator network, leading to drift and the introduction of artifacts. A key observation is that any local patch sampled from the refined image should have statistics similar to those of a real image patch. Therefore, rather than defining a global discriminator network, we can define a discriminator network that classifies all local image patches separately. This not only limits the receptive field, and hence the capacity, of the discriminator network, but also provides many samples per image for learning the discriminator network. It also improves training of the refiner network because we obtain multiple 'realism loss' values per image.
In our implementation, we design the discriminator D to be a fully convolutional network that outputs a w × h probability map, where each entry is the probability of the corresponding local patch belonging to the fake class and w × h is the number of local patches in the image. While training the refiner network, we sum the cross-entropy loss values over the w × h local patches, as illustrated in Figure 3.
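A minimal sketch of this local adversarial loss follows, assuming PyTorch; the strides and layer sizes (which determine w × h) are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    def __init__(self, channels=1, features=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, features, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(features, features, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(features, 1, kernel_size=1),  # one logit per local patch
            nn.Sigmoid(),                           # probability of the 'fake' class
        )

    def forward(self, x):
        return self.net(x)                          # shape: (batch, 1, w, h)

def local_adversarial_loss(p_fake, is_fake, eps=1e-8):
    # Cross-entropy summed over all entries of the w x h probability map,
    # so each image contributes many 'realism loss' values.
    target = torch.ones_like(p_fake) if is_fake else torch.zeros_like(p_fake)
    bce = -(target * torch.log(p_fake + eps)
            + (1 - target) * torch.log(1 - p_fake + eps))
    return bce.sum()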
2.3. Updating Discriminator using a History of
Refined Images
Another problem with adversarial training is that the discriminator network focuses only on the latest refined images. This may cause (i) divergence of the adversarial training, and (ii) the refiner network re-introducing artifacts that the discriminator has forgotten about. Any refined image generated by the refiner network at any time during the entire training procedure is a 'fake' image for the discriminator. Hence, the discriminator should be able to classify all of these images as fake. Based on this observation, we introduce a method to improve the stability of adversarial training by updating the discriminator using a history of refined images, rather than only those in the current mini-batch. We slightly modify Algorithm 1 to keep a buffer of refined images generated by previous networks. Let B be the size of the buffer and b be the mini-batch size used in Algorithm 1.
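A minimal sketch of such a history buffer is below. The sampling scheme is an assumption here (the text's full description continues past this excerpt): draw half of each size-b discriminator mini-batch from the buffer and half from the current refiner, then overwrite randomly chosen buffer slots with the newly refined images.

import random
import torch

class ImageHistoryBuffer:
    def __init__(self, B):
        self.B = B              # maximum number of stored refined images
        self.images = []        # list of single-image tensors

    def push(self, refined):
        # refined: a (b, C, H, W) batch of newly refined images
        for img in refined.detach():
            if len(self.images) < self.B:
                self.images.append(img)
            else:
                self.images[random.randrange(self.B)] = img  # replace at random

    def sample(self, n):
        # Return n previously refined images (requires at least n in the buffer).
        return torch.stack(random.sample(self.images, n))

A discriminator step would then mix buffer.sample(b // 2) with b // 2 freshly refined images before computing L_D(φ), so the discriminator keeps seeing fakes produced by earlier versions of the refiner.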