kernel. In fact, we do not consider gradient direction in-
formation since gradient intensity is adequate to reveal the
sharpness of local regions in recovered images. Hence we
adopt the intensity maps as the gradient maps. Such gradi-
ent maps can be regarded as another kind of images, so that
techniques for image-to-image translation can be utilized
to learn the mapping between two modalities. The transla-
tion process is equivalent to the spatial distribution transla-
tion from LR edge sharpness to HR edge sharpness. Since
most areas of the gradient map are close to zero, the convolutional
neural network can concentrate more on the spatial
relationship of outlines. Therefore, it may be easier for the
network to capture structure dependency and consequently
produce approximate gradient maps for SR images.
As shown in Figure 2, the gradient branch incorpo-
rates several intermediate-level representations from the SR
branch. The motivation of such a scheme is that the well-
designed SR branch is capable of carrying rich structural in-
formation which is pivotal to the recovery of gradient maps.
Hence we utilize the features as a strong prior to promote
the performance of the gradient branch, whose parameters
can be largely reduced in this case. Between every two
intermediate features, there is a gradient block, which can be
any basic block for extracting higher-level features. Once we get
the SR gradient maps by the gradient branch, we are able to
integrate the obtained gradient features into the SR branch
to guide SR reconstruction in turn. The magnitude of the
gradient map can implicitly reflect whether a recovered region
should be sharp or smooth. In practice, we feed the feature
maps produced by the next-to-last layer of the gradient branch
to the SR branch. Meanwhile, we generate the output gra-
dient maps by a 1 × 1 convolution layer with these feature
maps as inputs.
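As a small illustration (with hypothetical channel counts), a 1 × 1 convolution is just a per-pixel linear projection over channels, which is how the next-to-last features can be collapsed into a single-channel gradient map:

```python
import numpy as np

# Hypothetical sizes: 64 feature channels at 8x8 spatial resolution.
C, H, W = 64, 8, 8
rng = np.random.default_rng(0)
feat = rng.standard_normal((C, H, W))   # next-to-last layer features
w = rng.standard_normal((1, C))         # 1x1 conv weights (one output channel)

# A 1x1 convolution mixes channels independently at every pixel:
# contract the channel axis of the features with the weight matrix.
grad_map = np.tensordot(w, feat, axes=([1], [0]))
print(grad_map.shape)  # (1, 8, 8)
```

The same per-pixel projection with more output channels would produce the feature maps passed back to the SR branch.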
3.2.2 Structure-Preserving SR Branch
We design a structure-preserving SR branch to get the final
SR outputs. This branch consists of two parts. The first
part is a regular SR network comprising multiple generative
neural blocks, which can be of any architecture. Here we
introduce the Residual in Residual Dense Block (RRDB)
proposed in ESRGAN [42]. There are 23 RRDB blocks in
the original model. Therefore, we feed the feature
maps from the 5th, 10th, 15th, and 20th blocks into the gradient
branch. Since regular SR models produce images with
only 3 channels, we remove the last convolutional reconstruction
layer and feed the output features to the subsequent
part. The second part of the SR branch incorporates the SR
gradient feature maps obtained from the gradient branch as
mentioned above. We merge the structural information with a
fusion block that fuses the features from the two branches.
Specifically, we concatenate the two features and
then use another RRDB block and convolutional layer to
reconstruct the final SR features. It is noteworthy that we
only add one RRDB block into the SR branch. Thus the pa-
rameter increment is slight compared to the original model
with 23 blocks.
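The fusion step can be sketched as channel-wise concatenation; the channel counts and spatial size below are assumptions for illustration, and the subsequent RRDB block and convolution are only indicated, not implemented:

```python
import numpy as np

rng = np.random.default_rng(0)
f_sr = rng.standard_normal((64, 32, 32))    # SR-branch features
f_grad = rng.standard_normal((64, 32, 32))  # gradient-branch features

# Fusion block, step 1: concatenate along the channel axis.
fused = np.concatenate([f_sr, f_grad], axis=0)

# An RRDB block plus a convolutional layer would then map the 128
# fused channels back to the final SR features (omitted here).
print(fused.shape)  # (128, 32, 32)
```

Because only one extra RRDB block operates on the fused tensor, the added parameter cost stays small relative to the 23-block backbone.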
3.3. Objective Functions
Conventional Loss: Most SR methods optimize their
elaborately designed networks with a common pixelwise loss,
which is effective for super-resolution as measured by PSNR.
This loss reduces the average pixel difference
between recovered images and ground truths, but the
results may be too smooth to retain sharp edges for visual
quality. Nevertheless, it is still widely used to accelerate
convergence and improve SR performance:
$\mathcal{L}^{SR}_{Pix_I} = \mathbb{E}_{I^{SR}} \left\| G(I^{LR}) - I^{HR} \right\|_1 . \quad (3)$
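For a single image pair, Eq. (3) reduces to a mean absolute error; this sketch uses the mean over pixels as a stand-in for the expectation and norm:

```python
import numpy as np

def l1_pixel_loss(sr, hr):
    """Pixelwise L1 loss of Eq. (3) for one image pair (mean absolute error)."""
    return np.abs(sr - hr).mean()

sr = np.zeros((3, 4, 4))  # toy G(I_LR) "prediction"
hr = np.ones((3, 4, 4))   # toy I_HR ground truth
print(l1_pixel_loss(sr, hr))  # 1.0
```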
Perceptual loss has been proposed in [20] to improve per-
ceptual quality of recovered images. Features containing se-
mantic information are extracted by a pre-trained VGG net-
work [36]. The Euclidean distances between the features of
HR images and SR ones are minimized in perceptual loss:
$\mathcal{L}^{SR}_{Per} = \mathbb{E}_{I^{SR}} \left\| \phi_i(G(I^{LR})) - \phi_i(I^{HR}) \right\|_1 , \quad (4)$

where $\phi_i(\cdot)$ denotes the $i$-th layer output of the VGG model.
Methods [27, 42] based on generative adversarial net-
works (GANs) [3, 4, 15, 16, 21, 33] also play an important
role in the SR problem. The discriminator $D_I$ and the
generator $G$ are optimized by a two-player game as follows:
$\mathcal{L}^{SR}_{Dis_I} = -\mathbb{E}_{I^{SR}}\left[\log\left(1 - D_I(I^{SR})\right)\right] - \mathbb{E}_{I^{HR}}\left[\log D_I(I^{HR})\right], \quad (5)$

$\mathcal{L}^{SR}_{Adv_I} = -\mathbb{E}_{I^{SR}}\left[\log D_I(G(I^{LR}))\right]. \quad (6)$
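For a single sample with discriminator scores in (0, 1), Eqs. (5) and (6) reduce to the expressions below; the expectations are dropped in this one-sample sketch:

```python
import numpy as np

def discriminator_loss(d_sr, d_hr):
    """Single-sample Eq. (5): d_sr = D_I(I_SR), d_hr = D_I(I_HR)."""
    return -np.log(1.0 - d_sr) - np.log(d_hr)

def adversarial_loss(d_sr):
    """Single-sample Eq. (6): d_sr = D_I(G(I_LR))."""
    return -np.log(d_sr)

# At the 0.5/0.5 equilibrium the discriminator loss equals 2*log(2);
# a generator that fully fools D_I (score 1.0) pays zero adversarial loss.
print(discriminator_loss(0.5, 0.5))
print(adversarial_loss(1.0))  # 0.0
```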
Following [21, 42], we employ the relativistic average GAN
(RaGAN) to achieve better optimization in practice. Models
supervised by the above objective functions merely consider
the image-space constraint on images but neglect the
semantically structural information provided by the gradient
space. While the generated results look photo-realistic,
they also contain a number of undesired geometric distortions.
Thus we introduce a gradient loss to alleviate this issue.
Gradient Loss: Our motivation can be illustrated clearly
by Figure 3. Here we only consider a simple one-dimensional
case. If the model is optimized only in image space by the
L1 loss, we usually obtain an SR sequence like Figure 3 (b) given
an input test sequence whose ground truth is a sharp
edge, as in Figure 3 (a). The model fails to recover sharp edges
because it tends to produce a statistical average
of the possible HR solutions in the training data. In this
case, if we compute and show the gradient magnitudes of the
two sequences, it can be observed that the SR gradient is
flat with low values while the HR gradient is a spike with