function [47], random forest [37], and anchored neighborhood regression [41], [42] have been proposed to further improve the mapping accuracy and speed. The sparse-coding-based method and its several improvements [41], [42], [48] are among the state-of-the-art SR methods today. In these methods, the patches are the focus of the optimization; the patch extraction and aggregation steps are treated as pre-/post-processing and handled separately.
The majority of SR algorithms [2], [4], [15], [41], [48],
[49], [50], [51] focus on gray-scale or single-channel
image super-resolution. For color images, the aforementioned methods first transform the problem into a different color space (YCbCr or YUV), and SR is applied only to the luminance channel. There are also works attempting to super-resolve all channels simultaneously. For example, Kim and Kwon [25] and Dai et al. [7] apply their models to each RGB channel and combine the outputs to produce the final results. However, none of these works analyzes the SR performance on different channels or the necessity of recovering all three channels.
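To make the luminance-only pipeline concrete, the sketch below converts an RGB image to YCbCr, applies SR to the Y channel only, and upscales the chroma channels with plain bicubic interpolation. It is a minimal illustration in Python with Pillow; super_resolve_y is a hypothetical placeholder for any single-channel SR method, not a routine from the cited works.

    # Sketch of the common luminance-only SR pipeline (Pillow for color handling);
    # "super_resolve_y" is a hypothetical stand-in for any single-channel SR method.
    from PIL import Image

    def upscale_luminance_only(rgb_image, scale, super_resolve_y):
        # Work in YCbCr and super-resolve only the luminance (Y) channel.
        y, cb, cr = rgb_image.convert("YCbCr").split()
        target = (rgb_image.width * scale, rgb_image.height * scale)
        y_sr = super_resolve_y(y, target)             # learned SR on the luminance channel
        cb_up = cb.resize(target, Image.BICUBIC)      # chroma: plain bicubic upscaling
        cr_up = cr.resize(target, Image.BICUBIC)
        return Image.merge("YCbCr", (y_sr, cb_up, cr_up)).convert("RGB")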
2.2 Convolutional Neural Networks
Convolutional neural networks (CNN) date back
decades [27], and deep CNNs have recently gained explosive popularity, partly due to their success in image classification [18], [26]. They have also been success-
fully applied to other computer vision fields, such as
object detection [34], [40], [52], face recognition [39], and
pedestrian detection [35]. Several factors are of central
importance in this progress: (i) the efficient training
implementation on modern powerful GPUs [26], (ii) the
proposal of the Rectified Linear Unit (ReLU) [33], which makes convergence much faster while still yielding good quality [26], and (iii) the easy access to an abundance of data (like ImageNet [9]) for training larger models. Our method also benefits from these advances.
2.3 Deep Learning for Image Restoration
There have been a few studies of using deep learning
techniques for image restoration. The multi-layer perceptron (MLP), whose layers are all fully-connected (in contrast to convolutional), has been applied to natural image denoising [3] and post-deblurring denoising [36]. More closely related to our work, convolutional neural networks have been applied to natural image denoising [22] and to removing noisy patterns (dirt/rain) [12]. These restoration
problems are more or less denoising-driven. Cui et al. [5]
propose to embed auto-encoder networks in their super-resolution pipeline under the notion of the internal example-based approach [16]. The deep model is not specifically
designed to be an end-to-end solution, since each layer
of the cascade requires independent optimization of the
self-similarity search process and the auto-encoder. In contrast, the proposed SRCNN optimizes an end-to-end mapping. Further, the SRCNN is faster. It is not only a quantitatively superior method, but also a practically useful one.
3 CONVOLUTIONAL NEURAL NETWORKS FOR
SUPER-RESOLUTION
3.1 Formulation
Given a single low-resolution image, we first upscale it to the desired size using bicubic interpolation, which is the only pre-processing we perform^3. Let us denote
the interpolated image as Y. Our goal is to recover
from Y an image F(Y) that is as similar as possible
to the ground truth high-resolution image X. For ease of presentation, we still call Y a “low-resolution” image, although it has the same size as X. We wish to learn a mapping F, which conceptually consists of three
operations:
1) Patch extraction and representation: this opera-
tion extracts (overlapping) patches from the low-
resolution image Y and represents each patch as a
high-dimensional vector. These vectors comprise a
set of feature maps, whose number equals the dimensionality of the vectors.
2) Non-linear mapping: this operation nonlinearly
maps each high-dimensional vector onto another
high-dimensional vector. Each mapped vector is
conceptually the representation of a high-resolution
patch. These vectors comprise another set of feature
maps.
3) Reconstruction: this operation aggregates the
above high-resolution patch-wise representations
to generate the final high-resolution image. This
image is expected to be similar to the ground truth
X.
We will show that all these operations form a convolu-
tional neural network. An overview of the network is
depicted in Figure 2. Next we detail our definition of
each operation.
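As a rough illustration of how these three operations can be realized as stacked convolutions, the sketch below uses PyTorch; the class name, the filter sizes (f1 = 9, f2 = 1, f3 = 5), and the layer widths (n1 = 64, n2 = 32) are illustrative assumptions, not settings given in this section.

    # Minimal sketch of the three-operation pipeline as a three-layer CNN (PyTorch).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ThreeLayerSRSketch(nn.Module):
        def __init__(self, c=1, n1=64, n2=32, f1=9, f2=1, f3=5):  # assumed hyper-parameters
            super().__init__()
            # 1) Patch extraction and representation: n1 filters of support c x f1 x f1.
            self.conv1 = nn.Conv2d(c, n1, kernel_size=f1, padding=f1 // 2)
            # 2) Non-linear mapping: each n1-dimensional vector -> an n2-dimensional vector.
            self.conv2 = nn.Conv2d(n1, n2, kernel_size=f2, padding=f2 // 2)
            # 3) Reconstruction: aggregate the patch-wise representations into the HR image.
            self.conv3 = nn.Conv2d(n2, c, kernel_size=f3, padding=f3 // 2)

        def forward(self, y):
            h = F.relu(self.conv1(y))   # feature maps of the interpolated image Y
            h = F.relu(self.conv2(h))   # high-resolution patch representations
            return self.conv3(h)        # F(Y); no non-linearity after the last layer

    # Usage: bicubic-upscale the low-resolution input to the target size first
    # (the only pre-processing), then apply the network.
    lr = torch.rand(1, 1, 32, 32)
    Y = F.interpolate(lr, scale_factor=3, mode="bicubic", align_corners=False)
    X_hat = ThreeLayerSRSketch()(Y)     # same spatial size as Y

Zero-padding is used here only so that the sketch's output matches its input size; this is an implementation choice of the sketch rather than part of the formulation.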
3.1.1 Patch extraction and representation
A popular strategy in image restoration (e.g., [1]) is to
densely extract patches and then represent them by a set
of pre-trained bases such as PCA, DCT, Haar, etc. This
is equivalent to convolving the image by a set of filters,
each of which is a basis. In our formulation, we incorporate the optimization of these bases into the optimization of
the network. Formally, our first layer is expressed as an operation $F_1$:
$$
F_1(\mathbf{Y}) = \max\bigl(0,\; W_1 * \mathbf{Y} + B_1\bigr), \tag{1}
$$
where $W_1$ and $B_1$ represent the filters and biases, respectively, and '$*$' denotes the convolution operation. Here, $W_1$ corresponds to $n_1$ filters of support $c \times f_1 \times f_1$, where $c$ is the number of channels in the input image and $f_1$ is the spatial size of a filter. Intuitively, $W_1$ applies $n_1$ convolutions on the image, and each convolution has a kernel of size $c \times f_1 \times f_1$.
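To make the shapes in Eq. (1) concrete, the brief sketch below writes the first layer in PyTorch's functional form; the values n1 = 64, c = 1, and f1 = 9 are arbitrary assumptions chosen for illustration, not settings taken from this section.

    # Eq. (1) written out as convolution + ReLU; n1=64, c=1, f1=9 are assumed values.
    import torch
    import torch.nn.functional as F

    n1, c, f1 = 64, 1, 9
    Y = torch.rand(1, c, 96, 96)              # the interpolated "low-resolution" image Y
    W1 = torch.randn(n1, c, f1, f1) * 1e-3    # n1 filters of support c x f1 x f1
    B1 = torch.zeros(n1)                      # one bias per filter
    F1_Y = F.relu(F.conv2d(Y, W1, bias=B1, padding=f1 // 2))  # max(0, W1 * Y + B1)
    print(F1_Y.shape)                         # torch.Size([1, 64, 96, 96]): n1 feature maps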
3. Bicubic interpolation is also a convolutional operation, so it can
be formulated as a convolutional layer. However, the output size of
this layer is larger than the input size, so there is a fractional stride. To
take advantage of popular, well-optimized implementations such as cuda-convnet [26], we exclude this “layer” from learning.