MobileStyleGAN: A Lightweight Convolutional Neural Network for
High-Fidelity Image Synthesis
Sergei Belousov
sergei.o.belousov@gmail.com
Abstract
In recent years, the use of Generative Adversarial Networks (GANs) has become very popular in generative image modeling. While style-based GAN architectures yield state-of-the-art results in high-fidelity image synthesis, they are computationally very complex. In our work, we focus on the performance optimization of style-based generative models. We analyze the most computationally expensive parts of StyleGAN2 and propose changes to the generator network that make it possible to deploy style-based generative networks on edge devices. We introduce the MobileStyleGAN architecture, which has 3.5x fewer parameters and is 9.5x less computationally complex than StyleGAN2, while providing comparable quality.
1. Introduction
In recent years, high-fidelity image synthesis has significantly improved through the use of Generative Adversarial Networks (GANs) [9]. Whereas early work such as DCGAN [27] could generate images with resolutions of up to 64x64 pixels, modern networks such as BigGAN [3] and StyleGAN [20, 21, 19] allow the generation of photorealistic images with up to 512x512 and even 1024x1024 pixels. Although the quality of generative models has significantly improved, image generation still requires substantial computational resources. This high computational complexity makes it difficult to deploy state-of-the-art generative models to edge devices.
For example, the StyleGAN2 [21] network can synthesize realistic face images 1024x1024 pixels in size with FID=2.84 on the FFHQ dataset. However, it contains 28.27M parameters and has a computational complexity of 143.15 GMAC.
We propose MobileStyleGAN, a new lightweight architecture for high-resolution, high-quality image generation. Taking the original StyleGAN2 architecture as a baseline, we revisit the computationally expensive parts of the network to create our own lightweight model that provides comparable quality (Figure 1). The whole network contains 8.01M parameters, has a computational complexity of 15.09 GMAC, and achieves FID=12.38 on the FFHQ dataset.
Our main contributions are:
• We introduce an end-to-end wavelet-based convolu-
tional neural network for high-fidelity image synthesis.
• We introduce Depthwise Separable Modulated Convo-
lution as a lightweight version of Modulated Convolu-
tion to decrease computational complexity.
• We introduce a revisited version of the demodulation mechanism that is compatible with graph optimizations such as operation fusion.
• We propose a pipeline based on knowledge distillation
to train our network.
2. Related Work
2.1. StyleGAN
StyleGAN [20] is a modern generative model for high-
resolution image generation. The key aspects of the Style-
GAN network are:
• It uses progressive growing to increase the resolution
gradually.
• It generates images from a fixed (learned constant) input tensor, rather than from stochastically sampled latent variables as in conventional GANs.
• The stochastically sampled latent variables are nonlinearly transformed by an 8-layer mapping network and then used as style vectors through AdaIN [16] at each resolution (a minimal sketch of this style modulation is given after this list).
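To make the style-injection step concrete, the following is a minimal PyTorch-style sketch of AdaIN-based modulation as used in the original StyleGAN. The class name AdaIN and the arguments channels and style_dim are illustrative assumptions, not identifiers from any official implementation.

```python
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Adaptive instance normalization: normalize content features per channel,
    then rescale/shift them with a style-dependent affine transform."""
    def __init__(self, channels, style_dim):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        # Affine projection from the style vector w to per-channel scale and bias.
        self.affine = nn.Linear(style_dim, channels * 2)
        # Initialize scales to 1 so the layer starts as plain instance normalization.
        self.affine.bias.data[:channels] = 1.0

    def forward(self, x, w):
        # x: (B, C, H, W) feature map, w: (B, style_dim) style vector.
        scale, bias = self.affine(w).chunk(2, dim=1)
        x = self.norm(x)
        return scale[:, :, None, None] * x + bias[:, :, None, None]
```

In StyleGAN, one such modulation is applied per convolution, so the style vector controls the statistics of the feature maps at every resolution.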
StyleGAN2 [21] improves upon StyleGAN by:
• Eliminating droplet artifacts by normalizing with estimated statistics (weight demodulation) instead of normalizing with the actual feature statistics as AdaIN does (see the sketch after this list).
• Reducing the stagnation of details such as eyes and teeth (their tendency to stay fixed in place as other features move) by using a hierarchical generator with skip connections instead of progressive growing.
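The weight demodulation mentioned above can be sketched as follows. This is a minimal PyTorch-style illustration of a StyleGAN2-like modulated convolution with demodulation, not the revisited variant proposed in this work (described later); the class name ModulatedConv2d and its arguments are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModulatedConv2d(nn.Module):
    """Modulated convolution with weight demodulation (StyleGAN2-style).
    The style scales the input channels of the weight; demodulation then rescales
    each output filter to unit expected norm, replacing explicit feature-map
    normalization such as AdaIN."""
    def __init__(self, in_ch, out_ch, kernel_size, style_dim, eps=1e-8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size))
        self.affine = nn.Linear(style_dim, in_ch)  # style -> per-input-channel scale
        self.eps = eps
        self.padding = kernel_size // 2

    def forward(self, x, w):
        b, in_ch, h, width = x.shape
        s = self.affine(w).view(b, 1, in_ch, 1, 1)            # modulation scales s_i
        weight = self.weight[None] * s                         # w'_ijk = s_i * w_ijk
        demod = torch.rsqrt(weight.pow(2).sum(dim=[2, 3, 4]) + self.eps)
        weight = weight * demod[:, :, None, None, None]        # demodulated w''_ijk
        # Grouped-convolution trick: fold the batch into the group dimension
        # so each sample is convolved with its own modulated weights.
        weight = weight.view(-1, in_ch, *self.weight.shape[2:])
        x = x.view(1, -1, h, width)
        out = F.conv2d(x, weight, padding=self.padding, groups=b)
        return out.view(b, -1, h, width)
```

Because the normalization is folded into the convolution weights rather than applied to the feature maps, it relies on estimated rather than actual statistics, which is what removes the droplet artifacts observed with AdaIN.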