Pluralistic Image Completion: Generating Diverse and Plausible Solutions

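"This paper proposes a method for pluralistic image completion, which aims to generate multiple diverse and plausible solutions for the image completion task. Traditional inpainting techniques usually produce a single result, whereas this method can create several possible and varied completions. The main challenge is that learning-based methods usually have only one ground truth training instance per label. To address this, the authors design a new framework with two parallel paths. One is a reconstructive path, which uses the single given ground truth to obtain a prior distribution of the missing parts and rebuilds the original image from it. The other is a generative path, whose conditional prior is coupled to the distribution obtained in the reconstructive path. Both paths are supported by GANs. The authors also introduce a new short+long term attention layer that exploits distant relations among decoder and encoder features, improving appearance consistency. Experiments on the Paris buildings, CelebA-HQ faces, and ImageNet natural image datasets show that the method not only produces high-quality completion results but also provides diverse plausible outputs."

This paper applies deep learning in computer vision, specifically to image inpainting (completion). Keywords: artificial intelligence, deep learning, machine learning, CV (computer vision). The paper observes that existing image completion techniques tend to produce a single result, although many reasonable completions may exist for the same input. The core of the proposed framework is two parallel paths. The reconstructive path uses the single ground truth to learn a prior distribution of the missing parts and to rebuild the whole image. The generative path has a conditional prior that is coupled to the distribution from the reconstructive path and is used to generate different possibilities. The summary relates the framework to conditional variational autoencoders (Conditional VAEs), which on their own still yield minimal diversity according to the abstract. Both paths are supported by generative adversarial networks (GANs), and the new short+long term attention layer captures distant relations between decoder and encoder features, which helps keep the image appearance consistent. The experiments show high-quality and diverse completions on faces, buildings, and natural scenes without any post-processing, which is relevant to applications such as image editing, restoration, and artistic creation.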

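The attention idea mentioned in the summary above (relating decoder features to encoder features across distant positions) can be illustrated with a small NumPy sketch. This is not the authors' layer: the dot-product scoring, the reuse of one attention map for both feature sets, the concatenation at the end, and the names `short_long_attention`, `dec_feat`, and `enc_feat` are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def short_long_attention(dec_feat, enc_feat):
    """Toy attention over N spatial positions with C channels each.

    dec_feat, enc_feat: arrays of shape (N, C). The attention map is
    computed from decoder features only (an assumption of this sketch)
    and then applied to both decoder and encoder features.
    """
    scores = dec_feat @ dec_feat.T      # (N, N) pairwise similarities
    attn = softmax(scores, axis=-1)     # each row sums to 1
    short_term = attn @ dec_feat        # attends within decoder features
    long_term = attn @ enc_feat         # pulls in encoder features
    fused = np.concatenate([short_term, long_term], axis=-1)
    return fused, attn

rng = np.random.default_rng(0)
dec = rng.normal(size=(16, 8))  # hypothetical decoder features
enc = rng.normal(size=(16, 8))  # hypothetical encoder features
fused, attn = short_long_attention(dec, enc)
print(fused.shape)  # (16, 16)
```

Because every position attends to every other position, a feature far from the hole can still influence the filled region, which is the intuition behind the claimed gain in appearance consistency.
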
Pluralistic Image Completion

Chuanxia Zheng Tat-Jen Cham Jianfei Cai

School of Computer Science and Engineering

Nanyang Technological University, Singapore

{chuanxia001,astjcham,asjfcai}@ntu.edu.sg

Figure 1. Example completion results of our method on images of a face, a building, and natural scenery with various masks (missing regions shown in white). For each group, the masked input image is shown left, followed by sampled results from our model without any post-processing. The results are diverse and plausible. (Zoom in to see the details.)

Abstract

Most image completion methods produce only one result for each masked input, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion, the task of generating multiple and diverse plausible solutions for image completion. A major challenge faced by learning-based approaches is that there is usually only one ground truth training instance per label. As such, sampling from conditional VAEs still leads to minimal diversity. To overcome this, we propose a novel and probabilistically principled framework with two parallel paths. One is a reconstructive path that uses the single given ground truth to obtain a prior distribution of the missing parts and rebuilds the original image from this distribution. The other is a generative path for which the conditional prior is coupled to the distribution obtained in the reconstructive path. Both are supported by GANs. We also introduce a new short+long term attention layer that exploits distant relations among decoder and encoder features, improving appearance consistency. When tested on datasets with buildings (Paris), faces (CelebA-HQ), and natural images (ImageNet), our method not only generated higher-quality completion results, but also produced multiple and diverse plausible outputs.
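One common way to couple two distributions in a VAE-style model is a KL divergence term between them. The sketch below assumes diagonal Gaussians for both the distribution from the reconstructive path and the conditional prior of the generative path. The excerpt does not state the exact loss, its direction, or its weighting, so the function names and the toy encoder outputs here are hypothetical.

```python
import numpy as np

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) for diagonal Gaussians, summed over latent dimensions."""
    var_q = np.exp(logvar_q)
    var_p = np.exp(logvar_p)
    return 0.5 * np.sum(
        logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def sample_latent(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

rng = np.random.default_rng(0)
dim = 4

# Hypothetical encoder outputs (stand-ins for trained networks):
# distribution inferred from the hidden region (reconstructive path)
mu_rec, logvar_rec = rng.normal(size=dim), np.zeros(dim)
# conditional prior inferred from the visible region (generative path)
mu_gen, logvar_gen = np.zeros(dim), np.zeros(dim)

# Coupling term: pulls the two distributions toward each other in training.
coupling_loss = kl_diag_gaussians(mu_rec, logvar_rec, mu_gen, logvar_gen)

# At test time only the visible region is available, so several latent
# codes are drawn from the conditional prior, one per candidate completion.
samples = [sample_latent(mu_gen, logvar_gen, rng) for _ in range(3)]
```

Each sampled latent code would be decoded into a different completion, which is where the diversity of the outputs comes from in this kind of model.
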

1. Introduction

Image completion is a highly subjective process. Supposing you were shown the various images with missing regions in fig. 1, what would you imagine to be occupying these holes? Bertalmio et al. [4] related how expert conservators would inpaint damaged art by: 1) imagining the semantic content to be filled based on the overall scene; 2) ensuring structural continuity between the masked and unmasked regions; and 3) filling in visually realistic content for missing regions. Nonetheless, each expert will independently end up creating substantially different details, even if they may universally agree on high-level semantics, such as general placement of eyes on a damaged portrait.

Based on this observation, our main goal is to generate multiple and diverse plausible results when presented with a masked image; in this paper we refer to this task as pluralistic image completion (depicted in fig. 1). This is in contrast to approaches that attempt to generate only a single “guess” for the missing parts.
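The task described above can be made concrete with a toy example: one masked input, several latent samples, and several completions that all agree with the input on the visible pixels. In this sketch `toy_generator` is a hypothetical stand-in for a trained decoder, and both the mask convention (1 = visible, 0 = missing) and the compositing step are assumptions rather than details from the paper.

```python
import numpy as np

def composite(masked_image, mask, generated):
    """Keep visible pixels from the input and fill the hole from `generated`.

    mask: 1.0 where the pixel is visible, 0.0 where it is missing
    (an assumed convention for this sketch).
    """
    return mask * masked_image + (1.0 - mask) * generated

def toy_generator(z, shape):
    """Hypothetical stand-in for a trained decoder: a constant image
    whose intensity depends on the latent sample z."""
    return np.full(shape, 1.0 / (1.0 + np.exp(-z)))

rng = np.random.default_rng(0)
image = rng.random((8, 8))     # toy grayscale image
mask = np.ones((8, 8))
mask[2:6, 2:6] = 0.0           # square hole in the middle
masked_image = image * mask

# One completion per latent sample: different fills, same visible pixels.
completions = [
    composite(masked_image, mask, toy_generator(z, image.shape))
    for z in rng.standard_normal(3)
]
```

A single-output method corresponds to producing just one such completion; the pluralistic setting asks for a set of them that differ inside the hole.
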

Early image completion works [4, 7, 5, 8, 3, 13] focus only on steps 2 and 3 above, by assuming that gaps should be filled with content similar to that of the background. Although these approaches produce high-quality texture-consistent images, they cannot capture global semantics or hallucinate new content for large holes. More recently, some learning-based image completion methods [29, 14, 39, 40, 42, 24, 38] were proposed that infer seman-
