ZENG et al.: CDA FOR SINGLE IMAGE SR 29
given a set of samples Y = [y_1, y_2, ..., y_N], where y_i ∈ R^d,
the training objective of an autoencoder is to minimize the
reconstruction error

    Σ_i ||y_i − ŷ_i||^2    (1)
where y_i and ŷ_i are the original input and the reconstructed
input, respectively. The hidden layer implies an encoding
process and a decoding process

    h_i = f(W y_i + b),  ŷ_i = f(W′ h_i + b′)    (2)
where h_i ∈ R^n is the compact representation, W and W′
represent the weight matrices for the encoding and decoding layers,
and b and b′ denote the bias terms. f(·) is the activation function,
which we set as the sigmoid function in this paper

    f(z) = 1 / (1 + exp(−z)).    (3)
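To make (1)–(3) concrete, the following is a minimal NumPy sketch of an autoencoder's forward pass and reconstruction error; the dimensions and the randomly initialized weights are illustrative assumptions, not trained values.

```python
import numpy as np

def sigmoid(z):
    # (3): f(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d, n, N = 64, 32, 100                    # input dim, hidden dim, sample count (illustrative)
Y = rng.random((d, N))                   # samples y_i stacked as columns

W = rng.standard_normal((n, d)) * 0.1    # encoding weights W (untrained, for illustration)
b = np.zeros((n, 1))
W_p = rng.standard_normal((d, n)) * 0.1  # decoding weights W'
b_p = np.zeros((d, 1))

H = sigmoid(W @ Y + b)                   # encoding in (2): h_i = f(W y_i + b)
Y_hat = sigmoid(W_p @ H + b_p)           # decoding in (2): y_hat_i = f(W' h_i + b')

loss = np.sum((Y - Y_hat) ** 2)          # reconstruction error (1)
```

Training would then minimize `loss` over (W, W′, b, b′), e.g. by gradient descent.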
The autoencoders can induce very useful representations
of the inputs. However, they can only handle a single sam-
ple and cannot model the relationship between a sample pair.
In image SR, we are interested in the joint task of discov-
ering suitable representations for image pairs and encoding
their relationship. We argue that a better representation should
depend not only on the input image but also on the internal
relationship between the HR/LR image pairs. With this in
mind, we develop the CDA.
A. CDA
CDA has a three-stage architecture, as shown in Fig. 1. The
first and third stages employ two autoencoders for learning
the representations of LR and HR image patches, respectively.
The second stage incorporates a one-layer neural network to
transform the LR representation into the HR representation.
Following the above notations, the two autoencoders generate
the hidden representations h^L and h^H, which we term the
intrinsic representations of the LR and HR input, respectively.
Given the LR input y_i and the corresponding HR input x_i, the
intrinsic representations can be obtained by

    h^L_i = f(W_1 y_i + b_1)    (4)
    h^H_i = f(W_3 x_i + b_3).    (5)
For reconstruction, the decoding processes imply that

    ŷ_i = f(W′_1 h^L_i + b′_1)    (6)
    x̂_i = f(W′_3 h^H_i + b′_3).    (7)
The parameters (W_1, W′_1, b_1, b′_1) characterize the LR autoencoder
(LRAE) while (W_3, W′_3, b_3, b′_3) parameterize the HR
autoencoder.
After obtaining the LR/HR intrinsic representations, the
neural network implements the mapping from h^L to h^H.
Mathematically, let us denote the parameters in this stage as
(W_2, b_2), where W_2 is the weight matrix and b_2 is the bias
term. The mapping function then becomes

    h^H_i = f(W_2 h^L_i + b_2).    (8)
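Since (8) is a single sigmoid layer, the mapping stage can be sketched in a few lines of NumPy; the dimensions n_L, n_H and the randomly initialized parameters below are illustrative assumptions (trained values would come from the procedure in Section II-C).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_L, n_H = 32, 48                           # illustrative sizes of the LR/HR intrinsic representations
h_L = rng.random((n_L, 1))                  # an LR intrinsic representation, as from (4)

W2 = rng.standard_normal((n_H, n_L)) * 0.1  # mapping weights (untrained, for illustration)
b2 = np.zeros((n_H, 1))

h_H = sigmoid(W2 @ h_L + b2)                # (8): predicted HR intrinsic representation
```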
Algorithm 1 CDA for SR
Input: an LR image Y and a well-trained CDA model
{W_1, W_2, W_3, b_1, b_2, b_3}
Output: the HR image X̂
Step 1: Extract low-resolution image patches y_i using (9);
Step 2: for each image patch y_i
  Step 2.1: Obtain the LR intrinsic representation h^L_i by (4);
  Step 2.2: Obtain the HR intrinsic representation h^H_i by (8);
  Step 2.3: Obtain the HR image patch x̂_i by (7);
Step 3: Reconstruct the HR image X̂ using (10).
The construction of the CDA suggests that the model is simple
and flexible. The autoencoders ensure that the intrinsic representations
fit the LR and HR images well, and the neural
network can learn complex relationships between the LR/HR
representations; notably, the mapping function and the intrinsic
representations are jointly optimized and thus correlated.
Therefore, the constructed architecture is a data-driven model
for single image SR. Note that we can replace the autoencoder
with the stacked autoencoder [46] or the de-noising
autoencoder [47] to obtain further performance improvement.
B. Super-Resolution by CDA
For single image SR, CDA is a three-layer forward network
employing a fast feed-forward process, as shown in Fig. 1. The
SR steps are as follows.
Following the preprocessing step found in most SR meth-
ods, a single LR image is first upscaled to the desired size
using bi-cubic interpolation. To avoid confusion, this interpolated
LR image is denoted by Y. The LR image patches
y_i (i = 1, 2, ..., N) are obtained through

    y_i = R_i Y    (9)

where R_i is the operator to extract the ith local patch in Y.
Taking y_i as the input of CDA, the forward process incorporates
(4), (8), and (7) to infer the LR intrinsic representation h^L_i,
the HR intrinsic representation h^H_i, and the final restored HR
patch x̂_i, respectively. To estimate the whole HR image X̂,
we merge all restored patches by averaging the overlapping
regions between adjacent patches

    X̂ = (Σ_i R_i^T R_i)^−1 Σ_i R_i^T x̂_i.    (10)
Algorithm 1 describes CDA for SR in detail. The dimensions
of the hidden units in each layer are discussed in Section III.
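Because R_i simply selects a square window of pixels, Σ_i R_i^T R_i is diagonal (each diagonal entry counts how many patches cover that pixel), so (10) reduces to per-pixel averaging of the overlapping restored patches. A minimal sketch of the extraction (9) and merging (10) under that observation, with illustrative patch sizes:

```python
import numpy as np

def extract_patches(Y, p, stride):
    """(9): y_i = R_i Y -- slide a p x p window over Y."""
    patches, coords = [], []
    H, W = Y.shape
    for r in range(0, H - p + 1, stride):
        for c in range(0, W - p + 1, stride):
            patches.append(Y[r:r + p, c:c + p].ravel())
            coords.append((r, c))
    return patches, coords

def merge_patches(patches, coords, shape, p):
    """(10): accumulate R_i^T x_hat_i, then divide by the per-pixel overlap count."""
    acc = np.zeros(shape)              # sum_i R_i^T x_hat_i
    cnt = np.zeros(shape)              # diagonal of sum_i R_i^T R_i
    for x_hat, (r, c) in zip(patches, coords):
        acc[r:r + p, c:c + p] += x_hat.reshape(p, p)
        cnt[r:r + p, c:c + p] += 1.0
    return acc / np.maximum(cnt, 1.0)  # guard uncovered pixels, if any

Y = np.arange(25, dtype=float).reshape(5, 5)
patches, coords = extract_patches(Y, p=3, stride=2)
X_hat = merge_patches(patches, coords, Y.shape, p=3)  # recovers Y exactly here
```

Feeding the extracted patches straight back into the merge reproduces Y, which is a handy sanity check for the averaging in (10).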
C. Training CDA
CDA needs to discover the LR/HR intrinsic representations
and simultaneously join them using a well-trained mapping
function. For this purpose, we have designed a two-part
training procedure: the first part is initialization (stages 1–3
in Fig. 2), and the second part is the fine-tuning implemented
in stage 4.
1) Initialization: To train CDA, the intrinsic representations
of the LR/HR inputs are first generated. According to
the autoencoder introduced in the beginning of Section II,