Unpaired Image-to-Image Translation using Adversarial Consistency Loss


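"The paper 'Unpaired Image-to-Image Translation using Adversarial Consistency Loss' was written by Yihao Zhao, Ruihai Wu and Hao Dong of the Department of Computer Science and Technology at Peking University. It addresses the challenge of translating images when one-to-one paired image data are unavailable, and proposes a novel adversarial-consistency loss to solve it. The conventional cycle-consistency loss is limited when handling geometric changes, removing large objects or ignoring irrelevant textures. The proposed method aims to overcome these limits while preserving the important features of the source image, and it achieves state-of-the-art results on three challenging tasks: glasses removal, male-to-female translation and selfie-to-anime translation. Keywords include generative adversarial networks and dual learning."

Detailed walkthrough of the paper:

Unpaired image-to-image translation is a vision problem that aims to discover mappings between different image domains. In many practical applications, exactly paired image data are hard to obtain. Conventional solutions such as the cycle-consistency loss impose a strict pixel-level constraint so that an image translated to the other domain can be translated back to the original image. A significant drawback of this approach is that it cannot handle geometric transformations, remove large objects from an image, or ignore irrelevant texture details.

To overcome these problems, the authors propose the adversarial-consistency loss. This loss does not require the translated image to be translated back to one specific source image. Instead, it encourages the translated image to retain the important features of the source image. As a result, it adapts better to geometric changes in the scene, allows objects to be removed or added, and can ignore texture details that do not matter in the target domain.

In the experiments, the authors report superior performance on three challenging tasks. The first is glasses removal, where the method removes glasses from portrait photos without affecting other facial features. The second is male-to-female translation, which involves complex changes in facial features and style. The third is selfie-to-anime translation, which requires capturing human facial features and reproducing them in a cartoon style.

Generative adversarial networks (GANs) are the core technique of the paper. A GAN trains two neural networks, a generator and a discriminator, against each other so that the generator learns to produce new images that look realistic. The dual learning strategy mentioned in the paper may also be used to improve the generalisation ability and learning efficiency of the model.

By introducing the adversarial-consistency loss, the paper offers a new perspective on unpaired image-to-image translation, addresses several key limitations of existing methods, and improves the quality and realism of the translations.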

Yihao Zhao, Ruihai Wu, and Hao Dong


Fig. 2. The training schema of our model. (Left) Our model contains two generators, $G_S : (X_T, Z) \to X_S$ and $G_T : (X_S, Z) \to X_T$, and three discriminators: $D_S$ and $D_T$ for $\mathcal{L}^S_{adv}$ and $\mathcal{L}^T_{adv}$, and $\hat{D}$ for $\mathcal{L}_{acl}$. $D_S$ and $D_T$ ensure that the translated images belong to the correct image domain, while $\hat{D}$ encourages the translated images to preserve important features of the source images. The noise vectors $z_i$ are randomly sampled from $\mathcal{N}(0, 1)$. (Right) $\mathcal{L}_{idt}$ encourages the generators to maintain features, improves the image quality, stabilises the training process and prevents mode collapse, where the noise vector is from the noise encoder. The blocks with the same colour indicate shared parameters.

Moreover, there are two kinds of discriminators: $D_S$/$D_T$ and $\hat{D}$. $\hat{D}$ is a consistency discriminator. Its goal is to ensure the consistency between source images and translated images, and this is the core of our method. The goal of $D_S$ and $D_T$ is to distinguish between real and fake images in a certain domain. Specifically, the task of $D_S$ is to distinguish between $X_S$ and $G_S(X_T)$, and the task of $D_T$ is to distinguish between $X_T$ and $G_T(X_S)$.
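The division of labour between the two domain discriminators can be sketched in a few lines of Python. This is only an illustration, assuming a source domain S and a target domain T: images are stood in for by short lists of floats, and `make_toy_generator` and `domain_discriminator_batches` are hypothetical helpers invented here, not part of the paper or of any library.

```python
import random
from typing import Callable, List, Tuple

Image = List[float]                          # toy stand-in for an image tensor
Generator = Callable[[Image, float], Image]  # (x, z) -> translated image


def make_toy_generator(offset: float) -> Generator:
    """Hypothetical generator: shifts every pixel by `offset`, perturbed by noise z."""
    def generate(x: Image, z: float) -> Image:
        return [v + offset + 0.1 * z for v in x]
    return generate


def domain_discriminator_batches(
    real_images: List[Image],
    other_domain_images: List[Image],
    generator: Generator,
    rng: random.Random,
) -> Tuple[List[Image], List[Image]]:
    """Return the (real, fake) batches that one domain discriminator must tell apart."""
    # Each fake image is a translation of an image from the other domain,
    # generated with a noise value z drawn from N(0, 1).
    fake_images = [generator(x, rng.gauss(0.0, 1.0)) for x in other_domain_images]
    return real_images, fake_images


rng = random.Random(0)
x_s = [[0.0, 0.1], [0.2, 0.3]]              # toy source-domain images
x_t = [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]]  # toy target-domain images
g_t = make_toy_generator(offset=1.0)        # plays the role of the S -> T generator
g_s = make_toy_generator(offset=-1.0)       # plays the role of the T -> S generator

real_t, fake_t = domain_discriminator_batches(x_t, x_s, g_t, rng)  # inputs for the T-domain discriminator
real_s, fake_s = domain_discriminator_batches(x_s, x_t, g_s, rng)  # inputs for the S-domain discriminator
```

The consistency discriminator is deliberately left out of this sketch, because the passage above describes only its goal and not its inputs.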

The objective of ACL-GAN has three parts. The first, adversarial-translation loss, matches the distributions of generated images to the data distributions in the target domain. The second, adversarial-consistency loss, preserves significant features of the source images in the translated images, i.e., it results in reasonable mappings between domains. The third, identity loss and bounded focus mask, can further help to improve the image quality and maintain the image background. The data are forwarded as shown in Fig. 2, and the details of our method are described below.
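One common way to combine several loss terms is a weighted sum. The sketch below assumes that form for the three parts listed above. The excerpt does not state how the terms are weighted, so the function name `total_objective` and the weights `lambda_acl`, `lambda_idt` and `lambda_mask` are hypothetical placeholders.

```python
def total_objective(
    adv: float,
    acl: float,
    idt: float,
    mask: float,
    lambda_acl: float = 1.0,   # hypothetical weight, not taken from the paper excerpt
    lambda_idt: float = 1.0,   # hypothetical weight, not taken from the paper excerpt
    lambda_mask: float = 1.0,  # hypothetical weight, not taken from the paper excerpt
) -> float:
    """Weighted sum of the three parts of the objective (an assumed form)."""
    translation_part = adv                                  # part 1: adversarial-translation loss
    consistency_part = lambda_acl * acl                     # part 2: adversarial-consistency loss
    quality_part = lambda_idt * idt + lambda_mask * mask    # part 3: identity loss and bounded focus mask
    return translation_part + consistency_part + quality_part


example_total = total_objective(1.0, 2.0, 3.0, 4.0, lambda_acl=0.5, lambda_idt=0.5, lambda_mask=0.25)
```

Keeping the weights as explicit parameters makes it easy to switch a term off (weight 0) when checking what each part contributes.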

3.1 Adversarial-Translation Loss

For image translation between domains, we apply the classical adversarial loss, which we call adversarial-translation loss in our method, to both generators, $G_S$ and $G_T$, and both discriminators, $D_S$ and $D_T$. For generator $G_T$ and its discriminator $D_T$, the adversarial-translation loss is as follows:

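$$\mathcal{L}^{T}_{adv}(G_T, D_T, X_S, X_T) = \mathbb{E}_{x_T \sim p_{X_T}}[\log D_T(x_T)] + \mathbb{E}_{\bar{x}_T \sim p_{\{\bar{x}_T\}}}[\log(1 - D_T(\bar{x}_T))] \qquad (1)$$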
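The adversarial-translation loss in Eq. (1) has the form of the classical GAN objective, so it can be estimated from a batch by averaging log-probabilities. The sketch below assumes the discriminator outputs a probability in (0, 1) for each image. The function name and the clamping constant `eps` are illustrative choices, not taken from the paper.

```python
import math
from typing import Sequence


def adversarial_translation_loss(
    d_real: Sequence[float],
    d_fake: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Batch estimate of E[log D(x)] + E[log(1 - D(x_bar))].

    d_real: discriminator outputs on real target-domain images.
    d_fake: discriminator outputs on translated (generated) images.
    eps:    clamp that keeps log() away from zero (an assumed safeguard).
    """
    real_term = sum(math.log(max(p, eps)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that cannot tell real from fake outputs 0.5 everywhere,
# which gives 2 * log(0.5), roughly -1.386.
undecided = adversarial_translation_loss([0.5, 0.5], [0.5, 0.5])

# A confident, correct discriminator pushes the value towards 0 from below.
confident = adversarial_translation_loss([0.99, 0.98], [0.01, 0.02])
```

In the classical GAN setup the discriminator is trained to increase this value while the generator is trained to decrease it, which is why `confident` is larger than `undecided`.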
