reconstructed based on the learnt quality-aware dictionary
and its corresponding sparse coefficients w.r.t. the learnt
feature-aware dictionary.
(3) In addition to the commonly used Gabor filter response
based features, which have proven useful for quality
assessment, we also incorporate the Histogram of Oriented
Gradients (HOG) descriptor for local feature representation.
Experimental results demonstrate the effectiveness of the
HOG feature.
The remainder of this paper is organized as follows. Section 2
introduces the related works and then describes the general idea
of this work. Section 3 illustrates the detailed design of the pro-
posed model. Experimental results and analyses are presented in
Section 4, and finally conclusions are drawn in Section 5.
2. Related works and general idea
2.1. Dictionary learning using sparse coding
The goal of sparse coding is to simulate the sparsity of simple-
cell RF properties in V1. Previous studies have demonstrated that
sparsity is an important prior, based on the observation that
natural images generally contain sparse structures and can be
described by only a small number of structural primitives like lines
and edges [35]. We aim to learn an overcomplete dictionary in
which each basis function is tailored to one specific structural
primitive or one particular feature, so that any complex structure
in an image can be described as a linear combination of a set of
basis functions.Particularly, given n patches, each patch is
described by a d-dimensional feature vector y
i
2 R
d
, such that
the raw patches can be represented by a matrix
Y ¼½y
1
; ...; y
i
; ...; y
n
2R
dn
. From these raw patches, a dictionary
D ¼½d
1
; ...; d
j
; ...; d
m
2R
dm
(allowing m > d to make the dic-
tionary overcomplete) and the corresponding sparse coefficients
C ¼½c
1
; ...; c
i
; ...; c
n
2R
mn
can be learned simultaneously by
using existing dictionary learning algorithms [36–38], where
d
j
2 R
d
, and c
i
2 R
m
. Mathematically, this process can be accom-
plished by optimizing the following objective function:
min
fD;Cg
ky
i
Dc
i
k
2
F
no
; subjec to
8i; kc
i
k
0
6 T
0
ð1Þ
where $\|\cdot\|_F$ denotes the Frobenius norm, $\|\cdot\|_0$ denotes the $\ell_0$-norm,
and $T_0$ is a predefined sparsity constraint factor that represents the
maximum number of non-zero elements in each sparse coefficient
vector $c_i$. Although the $\ell_0$-norm gives a straightforward measurement of
sparsity, introducing the $\ell_0$-norm sparsity constraint makes this
problem NP-hard. Fortunately, recent developments in the field of
optimization theory reveal that the $\ell_0$-norm minimization problem
has the same solution as the $\ell_1$-norm minimization
problem if the restricted isometry property (RIP) condition is satisfied
[39]. Based on this important theory, the above formulation can
be rewritten as follows:
$$\{D, C\} = \arg\min_{\{D, C\}} \sum_{i=1}^{n} \frac{1}{2} \|y_i - D c_i\|_F^2 + \lambda \|c_i\|_1 \qquad (2)$$
where $\lambda$ is a positive constant controlling the relative importance of
the reconstruction error term and the sparsity constraint term. Typically,
both D and C are unknown at this stage. To solve this problem,
several approaches have been proposed to seek an optimal
sparse solution, such as K-SVD [37] and online dictionary learning
(ODL) [38]. Once the dictionary D is learned, given a testing sample
similarly described by a d-dimensional feature vector $y_t$, we can
automatically convert it to its sparse coefficients $c_t$ by solving the
following $\ell_1$-norm minimization problem:
$$c_t = \arg\min_{c_t \in \mathbb{R}^m} \frac{1}{2} \|y_t - D c_t\|_F^2 + \lambda \|c_t\|_1 \qquad (3)$$
Generally, we term the sparse coefficient vector $c_t$ the sparse
feature of $y_t$ over dictionary D because the majority of elements in $c_t$
are zeros. Since the sparse feature has been proven to be highly
consistent with visual perception, it has been widely used in many
computer vision and image processing applications, such as object
classification [40], visual saliency detection [41], face recognition
[42], and image quality assessment [33,34].
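As a concrete illustration, the learning step of Eq. (2) and the coding step of Eq. (3) can be sketched with scikit-learn, whose dictionary learner follows the online approach of ODL [38]. The patch count, dimensions, and regularization values below are illustrative assumptions, not the settings of any particular method discussed here:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

rng = np.random.default_rng(0)
# n = 500 patches, each described by a d = 64-dimensional feature vector.
Y = rng.standard_normal((500, 64))

# Learn an overcomplete dictionary (m = 128 > d = 64 atoms), cf. Eq. (2).
# scikit-learn stores samples and atoms as rows, i.e. components_ is D^T.
dico = MiniBatchDictionaryLearning(n_components=128, alpha=1.0, random_state=0)
C = dico.fit_transform(Y)    # sparse coefficients of the training patches
D = dico.components_         # shape (128, 64): one dictionary atom per row

# Sparse-code a new test vector y_t over the learned dictionary, cf. Eq. (3).
y_t = rng.standard_normal((1, 64))
c_t = sparse_encode(y_t, D, algorithm='lasso_lars', alpha=0.1)
print(c_t.shape)             # (1, 128): most entries are zero
```

Swapping `algorithm='lasso_lars'` for `'omp'` in `sparse_encode` recovers the $\ell_0$-constrained form of Eq. (1) via orthogonal matching pursuit.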
2.2. Sparse coding solution for IQA
As stated above, the sparse feature obtained with the learnt dic-
tionary is a promising solution to predict the perceived quality. In
this subsection, we give a short overview of some representative
sparse coding-based IQA methods.
Chang et al. proposed a visual cortex-like FR-IQA metric by
modeling the neural processing mechanism of RFs of simple cells
in V1 [33]. Specifically, independent component analysis (ICA) is
adopted to train a feature detector from a collection of natural
image samples for sparse coding. Then, the image quality of a dis-
torted image is quantified by measuring the sparse feature fidelity
(SFF). Mathematically, the proposed SFF metric is defined as
$$\mathrm{SFF}(I_{ref}, I_{dis}) = \frac{1}{K \cdot M} \sum_{i=1}^{K} \sum_{j=1}^{M} \frac{2 A_{ij} B_{ij} + c}{(A_{ij})^2 + (B_{ij})^2 + c} \qquad (4)$$
where K denotes the number of sparse feature vectors in an
image, M is the dimension of each sparse feature vector, and $A_{ij}$ and $B_{ij}$
represent the values of the j-th element in the i-th sparse feature
vector of the reference image $I_{ref}$ and the distorted image $I_{dis}$,
respectively.
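Eq. (4) amounts to an element-wise similarity averaged over all K·M coefficients, as the following sketch shows (array shapes and the value of c are illustrative assumptions):

```python
import numpy as np

def sff(A, B, c=1e-4):
    """Sparse feature fidelity in the form of Eq. (4).

    A, B: (K, M) arrays holding the K sparse feature vectors (dimension M)
    of the reference and distorted images; c is a small stabilizing
    constant. Shapes and the value of c here are assumptions.
    """
    num = 2.0 * A * B + c
    den = A ** 2 + B ** 2 + c
    return float(np.mean(num / den))

# Identical sparse features give the maximum score of 1.
A = np.array([[0.5, 0.0, -1.2],
              [0.0, 2.0, 0.0]])
print(sff(A, A))  # 1.0
```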
Guha et al. devised a new FR-IQA metric, named sparse
representation-based quality index (SPARQ) [34]. Different from
Chang’s work, K-SVD algorithm is used for dictionary learning in
this work. The fidelity of the sparse coefficients is computed to
measure the image quality by
$$\mathrm{SPARQ}(I_{ref}, I_{dis}) = \frac{1}{K} \sum_{i=1}^{K} \left\{ \frac{|x_{r,i}^{T} x_{d,i}| + c}{\|x_{r,i}\|_2 \|x_{d,i}\|_2 + c} \left( 1 - \frac{\|x_{r,i} - x_{d,i}\|_2 + c}{\|x_{r,i}\|_2 + \|x_{d,i}\|_2 + c} \right) \right\} \qquad (5)$$
where K denotes the number of sparse feature vectors in an
image, and $x_{r,i}$ and $x_{d,i}$ represent the sparse feature vectors of the i-th
image patch in $I_{ref}$ and $I_{dis}$, respectively.
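A per-patch sketch of Eq. (5) follows: a normalized-correlation factor between the two sparse feature vectors, attenuated by their normalized distance. This is one plausible reading of the formula; array shapes and the value of c are illustrative assumptions:

```python
import numpy as np

def sparq(Xr, Xd, c=1e-4):
    """Sketch of the SPARQ index of Eq. (5) (one plausible reading).

    Xr, Xd: (K, m) arrays of the sparse feature vectors of the K patches
    in the reference and distorted images; c is a small constant.
    """
    dot = np.abs(np.sum(Xr * Xd, axis=1))       # |x_r^T x_d| per patch
    nr = np.linalg.norm(Xr, axis=1)
    nd = np.linalg.norm(Xd, axis=1)
    corr = (dot + c) / (nr * nd + c)            # normalized correlation
    diff = (np.linalg.norm(Xr - Xd, axis=1) + c) / (nr + nd + c)
    return float(np.mean(corr * (1.0 - diff)))

Xr = np.array([[1.0, 0.0],
               [0.0, 2.0]])
print(round(sparq(Xr, Xr), 4))  # close to 1.0 for identical inputs
```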
In addition to the solution for FR-IQA, sparse coding has been
adopted for OA-BIQA as well. He et al. proposed a simple yet effec-
tive BIQA metric based on sparse representation of natural scene
statistics (SRNSS) [43]. In this work, a set of NSS feature vectors
and the corresponding human opinion scores are collected from
the training images to construct a dictionary. In the testing stage,
by extracting the NSS feature vector from a testing image, the
sparse coefficients over the constructed dictionary can be obtained
by using the sparse coding strategy in Eq. (3). The final quality
score is computed by weighting the human opinion scores of all
the training images using the estimated sparse coefficients.
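The final weighting step of SRNSS can be sketched as follows; the function name, the use of absolute coefficient values as weights, and the normalization are assumptions made for illustration:

```python
import numpy as np

def srnss_score(c_t, mos, eps=1e-8):
    """Weight the training images' human opinion scores by the estimated
    sparse coefficients (sketch of the SRNSS idea; names and the
    normalization are assumptions).

    c_t: (n,) sparse coefficients of the test image over the dictionary,
         one entry per training image.
    mos: (n,) human opinion scores of the training images.
    """
    w = np.abs(c_t)
    return float(np.dot(w, mos) / (np.sum(w) + eps))

# A test image coded mostly by the third training image inherits
# a score close to that image's opinion score.
c_t = np.array([0.0, 0.1, 0.9, 0.0])
mos = np.array([20.0, 40.0, 80.0, 60.0])
print(srnss_score(c_t, mos))  # close to 76.0
```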
Although the performances are promising, the existing
sparse coding-based IQA methods still suffer from the following
limitations:
(a) In those sparse coding-based FR-IQA methods (e.g., SFF and
SPARQ), the dictionary is learned in an unsupervised way
and acts as an unsupervised cortex-like feature detector.
Q. Jiang et al. / J. Vis. Commun. Image R. 33 (2015) 123–133