Accurate Image Super-Resolution Using Very Deep Convolutional Networks
Jiwon Kim, Jung Kwon Lee and Kyoung Mu Lee
Department of ECE, ASRI, Seoul National University, Korea
{j.kim, deruci, kyoungmu}@snu.ac.kr
Abstract
We present a highly accurate single-image super-resolution (SR) method. Our method uses a very deep convolutional network inspired by VGG-net used for ImageNet classification [19]. We find increasing our network depth shows a significant improvement in accuracy. Our final model uses 20 weight layers. By cascading small filters many times in a deep network structure, contextual information over large image regions is exploited in an efficient way. With very deep networks, however, convergence speed becomes a critical issue during training. We propose a simple yet effective training procedure. We learn residuals only and use extremely high learning rates (10^4 times higher than SRCNN [6]) enabled by adjustable gradient clipping. Our proposed method performs better than existing methods in accuracy, and the visual improvements in our results are easily noticeable.
1. Introduction
We address the problem of generating a high-resolution (HR) image given a low-resolution (LR) image, commonly referred to as single image super-resolution (SISR) [12, 8, 9]. SISR is widely used in computer vision applications ranging from security and surveillance imaging to medical imaging, where more image details are required on demand.
Many SISR methods have been studied in the computer vision community. Early methods include interpolation such as bicubic interpolation and Lanczos resampling [7], and more powerful methods utilize statistical image priors [20, 13] or internal patch recurrence [9].
Currently, learning methods are widely used to model a mapping from LR to HR patches. Neighbor embedding [4, 15] methods interpolate the patch subspace. Sparse coding [25, 26, 21, 22] methods use a learned compact dictionary based on sparse signal representation. Lately, random forest [18] and convolutional neural network (CNN) [6] methods have also been used with large improvements in accuracy.
Among them, Dong et al. [6] demonstrated that a CNN can be used to learn a mapping from LR to HR in an end-to-end manner. Their method, termed SRCNN, does not require any engineered features that are typically necessary in other methods [25, 26, 21, 22] and shows state-of-the-art performance.
Figure 1: Our VDSR improves PSNR for scale factor ×2 on dataset Set5 in comparison to the state-of-the-art methods (SRCNN uses the public slower implementation using CPU). VDSR outperforms SRCNN by a large margin (0.87 dB). [Plot: PSNR (dB) versus running time (s) for VDSR (Ours), SRCNN, SelfEx, RFL, and A+.]
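For context on the metric in Figure 1, PSNR between a reference and an estimate is the standard 10·log10(peak²/MSE) in dB; the following is a minimal sketch (not the paper's evaluation code), assuming images scaled to [0, 1]:

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# A uniform error of 0.01 on [0, 1] images gives 40 dB.
a = np.zeros((4, 4))
b = np.full((4, 4), 0.01)
print(round(psnr(a, b), 2))  # -> 40.0
```

Because the scale is logarithmic, a 0.87 dB gap such as the one in Figure 1 corresponds to a sizeable reduction in mean squared error.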
While SRCNN successfully introduced a deep learning technique into the super-resolution (SR) problem, we find it has limitations in three aspects: first, it relies on the context of small image regions; second, training converges too slowly; third, the network only works for a single scale.
In this work, we propose a new method to practically resolve these issues.
Context We utilize contextual information spread over very large image regions. For a large scale factor, it is often the case that the information contained in a small patch is not sufficient for detail recovery (ill-posed). Our very deep network with a large receptive field takes a large image context into account.
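As a back-of-the-envelope check on how stacking small filters grows context: the receptive field of D stacked stride-1 convolutions with kernel size k is D(k-1)+1 pixels per side. The sketch below (the helper name is ours) confirms that 20 layers of 3×3 filters cover a 41×41 region.

```python
def receptive_field(depth: int, kernel: int = 3) -> int:
    """One-sided receptive field (in pixels) of `depth` stacked
    stride-1 convolutions, each with odd kernel size `kernel`."""
    # Every additional layer widens the context by (kernel - 1) pixels.
    return depth * (kernel - 1) + 1

# 20 layers of 3x3 filters: each output pixel sees a 41x41 input region.
print(receptive_field(20))  # -> 41
```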
Convergence We suggest a way to speed up the training: residual-learning CNN and extremely high learning rates. As the LR and HR images share the same information to a large extent, explicitly modelling the residual image, which is the difference between the HR and LR images, is advantageous. We propose a network structure for effi-
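The residual-learning setup described above can be sketched as follows. This is a toy NumPy illustration, not the paper's actual network: the helper names are ours, and the clipping rule (bounding each gradient by theta divided by the learning rate so the effective update stays stable) is one reading of the "adjustable gradient clipping" mentioned in the abstract.

```python
import numpy as np

def residual_target(hr: np.ndarray, lr_up: np.ndarray) -> np.ndarray:
    """Training target: the residual r = HR - (upscaled LR),
    rather than the full HR image."""
    return hr - lr_up

def reconstruct(lr_up: np.ndarray, residual: np.ndarray) -> np.ndarray:
    # Final SR output: add the predicted residual back to the input.
    return lr_up + residual

def clip_adjustable(grad: np.ndarray, theta: float, lr: float) -> np.ndarray:
    """Clip gradients to [-theta/lr, theta/lr]; as the learning rate
    grows, the bound tightens, keeping the step magnitude in check."""
    bound = theta / lr
    return np.clip(grad, -bound, bound)

# With a perfect residual prediction, reconstruction recovers HR exactly.
rng = np.random.default_rng(0)
hr = rng.random((8, 8))
lr_up = rng.random((8, 8))
assert np.allclose(reconstruct(lr_up, residual_target(hr, lr_up)), hr)
```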