A Deep-Learning-Driven Method for Stereoscopic Image Quality Assessment


"基于深度学习的立体图像质量评估" 在当今数字化时代，立体3D（S3D）图像和视频越来越普及，对它们的质量评估成为了一个关键的研究领域，以确保观众的体验质量（QoE）。传统的客观质量评估方法通常利用视差图来提取特征，但随着深度学习技术的发展，人们开始探索将其应用到S3D图像质量评估中。这篇研究论文，"基于深度学习的立体图像质量评估"，由Kai Wang、Jun Zhou、Ning Liu和Xiao Gu等人撰写，他们来自上海交通大学的图像通信与网络工程研究所和上海市数字媒体处理与传输重点实验室。论文提出了一种创新的S3D图像质量评估（S3DIQA）方法，该方法充分利用了深度学习的优势。在该方法中，研究人员采用卷积受限玻尔兹曼机（CRBM）与因子第三阶RBM（FTO-RBM）相结合的模型作为特征提取工具。CRBM是一种无监督学习模型，能够从原始数据中学习高级抽象特征。而FTO-RBM则通过考虑更高阶的依赖关系，进一步增强了特征表示的能力。通过对预处理过的左右两幅图像进行特征提取，这种方法能够更准确地捕捉到S3D图像中的质量差异。论文中的实验部分可能详细比较了新提出的深度学习方法与其他传统方法在预测用户感知质量方面的性能。通常，这些比较会包括相关性分析，如Pearson相关系数和SROCC（结构相似性秩相关系数），以及可能的视觉一致性评价。此外，可能还会通过大量的主观评估实验结果来验证模型的准确性。深度学习在S3D图像质量评估中的应用开辟了新的研究方向，因为传统的基于统计和几何的方法可能无法完全捕捉到复杂的视觉感知效应。通过深度学习模型，可以模拟人眼对立体图像的复杂感知过程，从而提供更接近于人类感知的评估结果。总结来说，这篇论文贡献在于： 1. 提出了一种结合CRBM和FTO-RBM的深度学习模型，用于S3D图像的质量评估。 2. 展示了深度学习在提取立体图像特征方面的优势，特别是对于复杂视觉效果的捕获。 3. 可能通过广泛的实验验证了该方法的预测性能，与传统方法进行了对比，并与主观评估结果相一致。这篇研究不仅对学术界有重要意义，也为S3D图像和视频的编码、传输和显示提供了更精确的质量控制手段，有助于提升整体的用户体验。

Stereoscopic Images Quality Assessment Based On Deep Learning

Kai Wang¹,², Jun Zhou¹,² (Member, IEEE), Ning Liu¹,², Xiao Gu¹,²

Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University

Shanghai Key Laboratory of Digital Media Processing and Transmissions

Shanghai, 200240, China

Email: {aa576aaa, zhoujun, ningliu, gugu97}@sjtu.edu.cn

Abstract—With the popularity of stereoscopic 3D (S3D) images and videos, many advanced objective quality assessment methods have been proposed to evaluate viewers' Quality of Experience (QoE). Most of these algorithms take advantage of disparity maps to extract useful features. Meanwhile, deep learning has been one of the most active research topics in recent years, yet few efforts have applied it to the objective quality evaluation of S3D images. In this paper, we propose an S3D image quality assessment (S3D IQA) method based on deep learning. In this method, Convolutional Restricted Boltzmann Machines (CRBM) combined with a Factored Third-Order RBM (FTO-RBM) serve as the learning model that automatically extracts feature maps from pre-processed left and right images. An improved traversal algorithm based on two pooling strategies is then put forward to optimize the extracted feature maps, which significantly improves the final quality assessment performance. Experimental results show that our S3D IQA method achieves good performance on the 3D databases tested.

Index Terms—Stereoscopic Image Quality Assessment, Convolutional Restricted Boltzmann Machines (CRBM), Factored Third-Order RBM (FTO-RBM), Deep Learning, Optimized Feature Maps

I. INTRODUCTION

In recent years, stereoscopic 3D (S3D) movies have become increasingly popular among viewers. However, the viewing experience may be unsatisfying if we watch S3D films for a long time, and many S3D image and video quality assessment (QA) methods have been proposed to help improve it. These methods can be classified as subjective or objective QA. Compared with objective QA, subjective assessment is expensive and time-consuming, so more attention has been focused on building objective S3D IQA models.

Most objective S3D IQA models propose their own stereo-vision-related features to set up an IQA model. Deep learning can extract features automatically, and has been used in fields such as speech processing and image classification. Recently, some researchers have combined deep learning with 2D IQA. Mocanu et al. [1] used the reconstruction error of a Gaussian-Bernoulli RBM to define RBMSim for performing IQA. Hou et al. [2] employed a DBN to obtain features and produced five classification results representing five IQA scores. In [3], a DNN was used to collect features; compared with shallow architectures, this method better approximates the response of the human visual system (HVS) in IQA. However, fewer researchers have combined deep learning with S3D IQA.

Fig. 1. CRBM+FTO-RBM model. The inputs (V1) of the CRBM are real-valued, while the units of the other layers (H1 to H3) are all binary.

Pooling is a common and effective way to optimize feature maps. The goal of spatial feature pooling is to convert a joint feature representation into a more usable one that preserves significant information while removing irrelevant details. Boureau et al. [4] gave a detailed theoretical analysis of pooling. Wang et al. [5] investigated three spatial feature pooling methods in the context of perceptual IQA.
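As a generic illustration of spatial feature pooling, the sketch below reduces a small feature map over non-overlapping windows. Max and average pooling are used here only as familiar stand-ins: this excerpt does not name the two pooling strategies the paper adopts, and `pool2d`, `mean`, and the sample values are hypothetical.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn over non-overlapping size x size windows of a 2D list.

    Rows or columns that do not fill a complete window are dropped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical 4 x 4 feature map.
feature_map = [
    [1.0, 2.0, 0.0, 4.0],
    [3.0, 4.0, 1.0, 1.0],
    [0.0, 0.0, 5.0, 6.0],
    [2.0, 2.0, 7.0, 8.0],
]

max_pooled = pool2d(feature_map, 2, max)   # keeps the strongest response per window
avg_pooled = pool2d(feature_map, 2, mean)  # keeps the average response per window
print(max_pooled)
print(avg_pooled)
```

Both variants shrink the 4 x 4 map to 2 x 2. Max pooling keeps only the most salient response in each window, while average pooling smooths over the whole window, which is the kind of trade-off between preserving significant information and discarding detail that the paragraph describes.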

In this paper, we combine CRBM with FTO-RBM as the model for learning S3D image feature maps, which are then optimized by an improved traversal algorithm based on pooling methods. The rest of the paper is organized as follows. Section II first introduces our learning model and then describes the complete S3D IQA method. Section III describes two pooling methods in detail and then gives the improved traversal algorithm. Section IV gives experimental results on two benchmark databases, and Section V draws the conclusion.

II. PROPOSED METHOD

A. Learning Model

The complete learning model used in this paper is shown in Fig. 1. In this model, two CRBMs form the underlying architecture, and an FTO-RBM serves as the top-level model.

The basic CRBM consists of two layers: a visible layer V1 and a hidden layer H1. In addition, probabilistic max-pooling (layer P1) is introduced into the CRBM to obtain more stable performance [7]. The CRBM can handle larger-scale images and supplies K groups of elementary features $\{f^{ele}_k\}$ ($k \in [1, K]$) in the pooling layer (layer P1). The training process of the CRBM is similar to that of an RBM, which is convenient and highly efficient.
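To make the probabilistic max-pooling step concrete, here is a minimal sketch for a single pooling block. It assumes the usual formulation from the convolutional deep belief network literature, in which at most one hidden unit per block may be on, so the hidden units and an extra "all off" state compete in a softmax. The excerpt does not spell out the paper's exact equations, and `prob_max_pool_block` and its argument are hypothetical names.

```python
import math


def prob_max_pool_block(block_inputs):
    """Return (hidden_probs, pool_prob) for one pooling block.

    block_inputs: bottom-up inputs I_i to the hidden units of the block.
    hidden_probs[i] = exp(I_i) / (1 + sum_j exp(I_j))  -> P(h_i = 1)
    pool_prob       = 1 - 1 / (1 + sum_j exp(I_j))     -> P(pooling unit = 1)
    """
    # Shift by the largest exponent (0 belongs to the "all off" state)
    # so that math.exp never overflows.
    shift = max(0.0, max(block_inputs))
    exp_inputs = [math.exp(v - shift) for v in block_inputs]
    off_weight = math.exp(-shift)
    total = off_weight + sum(exp_inputs)
    hidden_probs = [e / total for e in exp_inputs]
    pool_prob = 1.0 - off_weight / total
    return hidden_probs, pool_prob


# A 2 x 2 block (flattened) whose four hidden units all receive zero input:
# the five states (four units plus "all off") are then equally likely.
hidden_probs, pool_prob = prob_max_pool_block([0.0, 0.0, 0.0, 0.0])
print(hidden_probs, pool_prob)
```

Because only one hidden unit in a block can be active, the pooling unit is on exactly when some hidden unit is on, so `pool_prob` equals the sum of `hidden_probs`.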

978-1-5090-5316-2/16/$31.00 ©2016 IEEE. VCIP 2016, Nov. 27-30, 2016, Chengdu, China

