convolutional neural network trained on large-scale datasets like ImageNet [8]. Since the collected images of the existing datasets are usually free of degradation, such as low visibility [41, 12, 34], color cast [42, 2], and overexposure, evaluation of standard neural network-based methods for image classification shows a significant drop in classification accuracy when applied to low-quality images [13].

In practice, one solution to this problem is to first improve the visibility of degraded images by image enhancement methods, and then to perform image classification, called two-stage methods in this paper [7, 17, 18]. However, the main reason why image degradation affects image classification is that the structural and statistical properties of pixels in the neighborhood are obstructed by image degradation. Since the existing image enhancement methods are devised to achieve a pleasing visual effect, they cannot guarantee that regions with similar structure in the image are enhanced uniformly, leading to uncovered and incomplete feature representation for classification. A typical example is shown in Figure 1. As can be seen, there is still a margin between the feature distributions of the enhanced image and the clear image.

Another feasible solution for low-quality image classification is to transform it into a domain adaptation problem and to match the degraded and clear images with adversarial learning or kernelized training. This method is based on the assumption that the marginal distributions of low- and high-quality images can be aligned in the learned feature space by a deep network.
Therefore, after decreasing the divergence between the marginal distributions in the learned feature space, the classifier trained with high-quality images tends to perform well on the low-quality images. While promising, most existing domain adaptation approaches require either complex neural network architectures [3, 27, 14] or fine-tuning on the target domain [31, 21, 26]. Different from these methods, this paper proposes to learn a transferable mapping relationship between deep representations of low- and high-quality images, and to leverage it as a deep degradation prior (DDP) for the image classifier. This method is based on the following statistical observations.

Deep Degradation Prior for Low-Quality Image Classification

Yang Wang1, Yang Cao1*, Zheng-Jun Zha1†, Jing Zhang2, Zhiwei Xiong1
1 University of Science and Technology of China, Hefei, China  2 UBTECH Sydney AI Centre, University of Sydney, Sydney, Australia
ywang120@mail.ustc.edu.cn, {forrest, zhazj, zwxiong}@ustc.edu.cn, jing.zhang1@sydney.edu.au

Abstract

State-of-the-art image classification algorithms based on convolutional neural networks (CNNs) are usually trained on large-scale annotated datasets of high-quality images. When applied to low-quality images, their performance drops significantly, because image degradation obstructs the structural and statistical properties of neighboring pixels. To address this problem, this paper proposes a novel deep degradation prior for low-quality image classification. It is based on the statistical observations that, in the deep representation space, image patches with structural similarity have a uniform distribution even if they come from different images, and that under the same degradation condition, the distributions of corresponding patches in low- and high-quality images have uniform margins. Accordingly, we propose a feature de-drifting module (FDM) to learn the mapping relationship between the deep representations of low- and high-quality images, and leverage it as a deep degradation prior (DDP) for low-quality image classification. Since the statistical properties are independent of image content, the deep degradation prior can be learned on a training set of limited images without the supervision of semantic labels, and provided in a "plug-in" form to existing classification networks to improve their performance on degraded images. Evaluation results on the benchmark dataset ImageNet-C show that our proposed DDP can improve the accuracy of pre-trained network models by more than 20% under various degradation conditions. Even in the extreme setting of training DDP with only 10 images from the CUB-C dataset, our method improves the accuracy of VGG16 on ImageNet-C from 37% to 55%.
1. Introduction

* Co-first authors. † Corresponding author.

Figure 1. We select four similar patches a1-a4 from different clear images and visualize their features with VGG16 trained on ImageNet, as shown in (A). We visualize the feature distributions with t-SNE [20], as shown in (D). As shown in (D), the clear patches have a uniform distribution in the feature space. Fog obstructs the statistical properties of pixels within local patches, resulting in feature drift, as shown in (B) and (D). Dehazing clearly improves visibility, but the feature drift phenomenon remains, as shown in (C) and (D). (Panels: hazy image / degraded features; dehazing result / enhanced features; clear image / clear features.)

Image patches with structural similarity have a uniform distribution in the deep representation space, even if they come from different images; under the same degradation condition, the distributions of corresponding structurally similar patches in low- and high-quality images have uniform margins. Since the statistical properties are independent of image content, our proposed deep degradation prior can be learned from a small number of images without the supervision of semantic labels.

Specifically, we propose a feature de-drifting module (FDM) to learn the mapping relationship between the deep representations of low- and high-quality images, and leverage it as a deep degradation prior (DDP) for low-quality image classification. The FDM is designed to compensate for the attenuation effect of image degradation on features, where a four-layer feed-forward network is adopted to simulate the visual processing mechanism of the non-classical receptive field model. After training on an arbitrary degraded image dataset, the FDM can be easily plugged into existing classification networks to improve their performance on degraded images. Evaluation results on the benchmark dataset ImageNet-C [13] show that our method performs well on low-quality image classification. The proposed DDP can be learned from only 10 images of the CUB-C dataset and improves the accuracy of VGG16 on ImageNet-C from 37% to 55%. The contributions of this paper are summarized as follows:

(1) This paper finds that image patches with structural similarity have a uniform distribution in the deep representation space, and that under the same degradation condition, the distributions of low- and high-quality image patches have uniform margins, and accordingly proposes a novel deep degradation prior for low-quality image classification.
(2) This paper proposes a feature de-drifting module (FDM) to learn the mapping relationship between the deep representations of low- and high-quality images. After training on a small set of degraded images, the FDM can be easily plugged into existing classification networks to boost their performance on degraded images.
(3) Evaluation results on the benchmark dataset ImageNet-C show that the proposed method performs well under different degradation conditions. Even in the extreme case of training with only 10 images from the CUB-C dataset, our method improves the accuracy of VGG16 from 37% to 55%.

2. Related Work

The representation difference between degraded and clear images leads to a shift in feature distributions and a performance drop when pre-trained classification models are evaluated across domains [29]. Image enhancement algorithms can restore degraded images to clear versions so that the human eye can recognize objects and structural details. For example, B. Li et al. [17] use a dehazing network to improve object detection performance in foggy conditions. D. Dai et al. [7] evaluate the influence of image super-resolution on high-quality images.

images from another dataset (e.g., ImageNet). This property comes from the fact that local patches are statistically irrelevant to specific image content (e.g., semantics) and can be embedded into a low-dimensional manifold [35]. Besides, recent progress in deep convolutional neural networks also witnesses a similar phenomenon: features learned in shallow layers are mainly low-level ones, such as edges, colors, and textures [9, 38]. Accordingly, the learned convolutional weights can be regarded as visual dictionaries to extract (represent) these low-level features (local patches). Motivated by the above work, we propose a novel deep degradation prior which is simple and effective for downstream high-level vision tasks, such as classification on degraded images.
Specifically, we argue that: 1) the clear image patches from different datasets share a similar distribution in the feature embedding space, resulting in an indistinguishable cluster; 2) the degraded image patches have the same property, while they are separated from the clear ones due to their distinct local statistics; 3) if we can learn a mapping between the clear features and the degraded features, it can be used for arbitrary natural images.

To illustrate the above claims, we conducted a statistical experiment on clear and foggy images, synthesized according to the hazy model [13] on the ImageNet [8] and CUB [33] datasets. Some exemplar hazy images are shown in the right corners of Figure 2. First, for the ImageNet dataset, we sampled hundreds of clear patches and the corresponding foggy patches from the same positions, denoted S_I and S_IF, respectively. The same procedure was carried out on the CUB dataset, and the sampled patches are denoted S_C and S_CF, respectively. Referring to [35], the patch size was set to 10*10 in the experiment. To better visualize their distributions and avoid messy clusters, we filtered the patches so that they share similar local structures. Mathematically, they were subjected to the following constraint:

SSIM(p_i, p_j) > T,  ∀ p_i, p_j ∈ S_I or S_C,    (1)

where SSIM denotes the structural similarity measurement [4], and T is a threshold set to 0.7. We kept 500 patches for both the clear and foggy cases on each dataset, as shown in the top/bottom middle part of Figure 2.

Then, to obtain their feature representations in the embedding space [40], we used the VGG16 network pre-trained on the ImageNet dataset as the feature extractor. Specifically, the features of each patch were extracted from the "conv2_2" layer, since its receptive field is 9*9, almost the same as the patch size. These features are shown in the red, blue, green, and yellow cubes in Figure 2.
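The patch-filtering constraint of Eq. (1) can be sketched in a few lines. The sketch below is a simplified, hypothetical version: it uses a single-window SSIM over the whole 10*10 patch (the paper uses the full SSIM of [4]) and greedily compares every patch against a seed patch rather than enforcing the constraint over all pairs.

```python
import numpy as np

def ssim_single_window(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified single-window SSIM over a whole patch (an
    approximation of the full SSIM measurement of [4])."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def filter_similar_patches(patches, threshold=0.7, keep=500):
    """Greedy approximation of Eq. (1): keep patches whose SSIM to the
    first (seed) patch exceeds T; cap at 500 as in the experiment."""
    seed = patches[0]
    kept = [p for p in patches if ssim_single_window(seed, p) > threshold]
    return kept[:keep]

# toy usage with hypothetical 10*10 patches: noisy copies of one ramp
rng = np.random.default_rng(0)
base = np.tile(np.linspace(0.0, 1.0, 10), (10, 1))
patches = [np.clip(base + 0.01 * rng.standard_normal((10, 10)), 0.0, 1.0)
           for _ in range(20)]
print(len(filter_similar_patches(patches)) > 0)  # True: the seed matches itself
```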
Finally, we leveraged t-SNE [20] to visualize them, as shown in the central part of Figure 2. It is clear that: 1) features from clear images are clustered together regardless of which dataset they come from, and the same holds for the foggy case; 2) there is a large gap between the clear features and the foggy ones. Therefore, if we find a mapping between them, we can bridge them together. In this sense, restoring a degraded image patch to a clear one in the feature embedding space will certainly benefit downstream classification and detection tasks. In this paper, we call such a mapping the deep degradation prior (DDP). In the next part, we present an efficient solution showing how to learn an effective DDP.

4. Learning DDP by Deep Neural Network

4.1. Overview of the Network

Given a clear dataset and its paired degraded dataset without semantic labels, our goal is to learn an effective DDP, which can be seamlessly plugged into existing convolutional neural networks to enhance their generalizability on degraded images. To this end, we propose an effective learning method that reconstructs the clear features from the degraded ones under the supervision of a simple Mean Square Error (MSE) loss. As shown in Figure 3 (a), during the training phase, we first use a pre-trained model to extract the low-level features of both degraded and clear images, for example, from "conv2_2" in VGG16 [28] and from the first layer in AlexNet [15], respectively. This part of the network is fixed during the training phase and is called the Shallow Pretrained Layers (SPL) in this paper. The degraded and clear features are denoted "DF" and "CF" in Figure 3 (a).

Then, we propose a novel feature de-drifting module (FDM) to accomplish the feature reconstruction. Taking a hazy image as an example, it is concatenated with the DF from the SPL and fed into the FDM. Leveraging the residual learning idea, the output feature from the FDM is fused with the original DF by an element-wise sum.
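The frozen SPL feature extraction described above can be sketched as follows. This is a stand-in with random weights for illustration only; in the paper the weights come from an ImageNet-pretrained VGG16 and the stack up to "conv2_2" is kept fixed.

```python
import torch
import torch.nn as nn

# Stand-in for the Shallow Pretrained Layers (SPL): the VGG16 stack up
# to conv2_2. Random weights here; the paper uses frozen pretrained ones.
spl = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),    # conv1_1
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),   # conv1_2
    nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),  # conv2_1
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True), # conv2_2
).eval()
for p in spl.parameters():
    p.requires_grad_(False)   # SPL stays fixed during FDM training

with torch.no_grad():
    df = spl(torch.rand(1, 3, 224, 224))  # a stand-in degraded image
print(tuple(df.shape))  # (1, 128, 112, 112) -- matches the sizes in Table 1
```

The 128-channel, 112*112 output is exactly the feature-map size the Table 1 FDM expects (131 input channels after concatenating the 3-channel image).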
The resulting enhanced feature (EF) is compared with its paired CF to calculate the MSE loss, and the error is back-propagated to the FDM to update its parameters.

During the testing phase, we insert the trained FDM into an existing classification network, i.e., between its SPL and the subsequent deep pre-trained layers (DPL), as shown in Figure 3 (b). It is noteworthy that the learned weights in the FDM serve as the learned DDP to map the degraded features to the clear ones (see Figure 2).

4.2. Feature De-drifting Module

Degradation changes the statistics of a local patch, which results in a biased feature response from the SPL compared with the original clear one. In order to correct the drifted feature response, we propose a novel feature de-drifting module. It is inspired by the non-classical receptive field of human vision [6], as shown in Figure 3 (c). The non-classical receptive field is very useful for enhancing high frequencies while maintaining low frequencies.

Figure 3. (a)-(b) Diagram of the training/testing phase of the proposed method. (c) Illustration of the non-classical receptive field [6] for enhancing the high frequencies while maintaining the low frequencies. (d) The network structure of our feature de-drifting module (FDM).
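The training step of Figure 3 (a) can be sketched as below. It is a minimal sketch, not the paper's implementation: the one-conv "SPL" and "FDM" are toy stand-ins, and resizing the degraded image to the feature map's spatial size before the channel-wise concatenation is our assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fdm_train_step(spl, fdm, optimizer, degraded, clear):
    """One FDM update: the frozen SPL yields degraded (DF) and clear
    (CF) features; the FDM sees the degraded image concatenated with
    DF, and its residual output gives EF = DF + FDM([image, DF]),
    supervised by an MSE loss against CF."""
    with torch.no_grad():
        df = spl(degraded)   # degraded features, extractor frozen
        cf = spl(clear)      # paired clear features
    img = F.interpolate(degraded, size=df.shape[-2:], mode='bilinear',
                        align_corners=False)  # resize image to DF size
    ef = df + fdm(torch.cat([img, df], dim=1))  # enhanced features
    loss = F.mse_loss(ef, cf)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# toy stand-ins: a frozen one-conv "SPL" and a one-conv "FDM"
spl = nn.Conv2d(3, 128, 3, stride=2, padding=1).eval()
for p in spl.parameters():
    p.requires_grad_(False)
fdm = nn.Conv2d(131, 128, 3, padding=1)  # 131 = 3 image + 128 feature channels
opt = torch.optim.SGD(fdm.parameters(), lr=0.001)
loss = fdm_train_step(spl, fdm, opt, torch.rand(2, 3, 64, 64),
                      torch.rand(2, 3, 64, 64))
print(loss >= 0.0)  # True: a finite, non-negative MSE
```

At test time the trained `fdm` (with the residual sum) is simply inserted between the SPL and the frozen deeper layers, as in Figure 3 (b).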
Mathematically, given an input I, a filter with a non-classical receptive field generates an output f as follows:

f = A1(I ∗ G1(σ1)) + A2(I ∗ G2(σ2)) + A3(I ∗ G3(σ3)),    (2)

where G1 ∼ G3 denote three Gaussian filters with different filter bandwidths, i.e.,

G(σ) = (1 / √(2πσ²)) exp(−(x² + y²) / (2σ²)),    (3)

A1 ∼ A3 represent the coefficients in the central, surrounded, and marginal frequency areas, respectively, ∗ denotes the convolution operation, and σ1 ∼ σ3 are the scale parameters determining the filter bandwidths. Eq. (2) can be reformulated as:

f = A1(I ∗ G1) + A2((I ∗ G1) ∗ G2′) + A3(((I ∗ G1) ∗ G2′) ∗ G3′),    (4)

where G2′ and G3′ are also Gaussian filters, with scale parameters √(σ2² − σ1²) and √(σ3² − σ2²), respectively. In this way, the convolutional result in the first (second) term can be used as the input of the second (third) Gaussian filter G2′ (G3′).

Inspired by the non-classical receptive field and Eq. (4), we design our FDM as shown in Figure 3 (d). The first three blocks, filled in orange, yellow, and blue, represent the convolution process in the central, surrounded, and marginal frequency areas, respectively. Each block consists of two convolutional layers. The output features from the three blocks are then concatenated together and fed into a final 1*1 convolutional layer to simulate the linear weighting between the central, surrounded, and marginal frequency parts, as shown in Eq. (2). Details of the FDM are summarized in Table 1 and Table 2 for VGG16 and AlexNet, respectively.

Table 1. The details of FDM with VGG16.

Layer | Input Size | Num | Filter | Stride | Pad
G1_1 | 131*112*112 | 128 | 3 | 1 | 1
G1_2 | 128*112*112 | 128 | 3 | 1 | 1
G2_1 | 128*112*112 | 64 | 3 | 1 | 1
G2_2 | 64*112*112 | 64 | 3 | 1 | 1
G3_1 | 64*112*112 | 32 | 3 | 1 | 1
G3_2 | 32*112*112 | 32 | 3 | 1 | 1
W | 224*112*112 | 128 | 1 | 1 | 0

Table 2. The details of FDM with AlexNet.

Layer | Input Size | Num | Filter | Stride | Pad
G1_1 | 67*27*27 | 128 | 3 | 1 | 1
G1_2 | 128*27*27 | 128 | 3 | 1 | 1
G2_1 | 128*27*27 | 64 | 3 | 1 | 1
G2_2 | 64*27*27 | 64 | 3 | 1 | 1
G3_1 | 64*27*27 | 32 | 3 | 1 | 1
G3_2 | 32*27*27 | 32 | 3 | 1 | 1
W | 224*27*27 | 64 | 1 | 1 | 0
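The FDM of Table 1 can be sketched as a PyTorch module. The layer sizes follow Table 1 directly (three cascaded two-conv blocks feeding each other, mirroring the cascaded Gaussians of Eq. (4), with a 1*1 mixing layer for the weighting A1-A3 of Eq. (2)); the ReLU activations are our assumption, since the table lists only the convolutions.

```python
import torch
import torch.nn as nn

class FDM(nn.Module):
    """Feature de-drifting module as read from Table 1 (VGG16 variant):
    G1 -> G2 -> G3 are cascaded, like the central / surrounded /
    marginal Gaussian stages of Eq. (4); their outputs are concatenated
    (128 + 64 + 32 = 224 channels) and mixed by the 1x1 layer W."""
    def __init__(self, in_ch=131, out_ch=128):
        super().__init__()
        self.g1 = nn.Sequential(
            nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True))
        self.g2 = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.g3 = nn.Sequential(
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True))
        self.w = nn.Conv2d(224, out_ch, 1)  # 1x1 mixing layer "W"

    def forward(self, x):
        f1 = self.g1(x)    # "central" stage
        f2 = self.g2(f1)   # "surrounded" stage, fed by f1 (cf. Eq. 4)
        f3 = self.g3(f2)   # "marginal" stage, fed by f2
        return self.w(torch.cat([f1, f2, f3], dim=1))

x = torch.randn(1, 131, 56, 56)  # 3 image + 128 feature channels
out = FDM()(x)
print(tuple(out.shape))  # (1, 128, 56, 56)
```

For AlexNet (Table 2), only the input/output channel counts and spatial size change (`in_ch=67`, `out_ch=64` on 27*27 maps).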
5. Experiments

We evaluate the performance of our proposed deep degradation prior on the ImageNet-C dataset [13], which is a rigorous benchmark with 50,000 images in 1,000 categories, widely used for robustness evaluation of image classifiers. In this paper, we mainly focus on three kinds of degradation conditions: fog, low contrast, and brightness. For each of these degradation conditions, we perform experiments on five levels of degradation. Examples of degraded images are shown in Figure 4. Besides, under each degradation condition, we test the influence of data size on deep degradation prior modeling.

Figure 4. Some examples of degraded images from ImageNet-C (fog, contrast, and brightness at levels 1-5).

Following the degraded image generation methods in [13], we use the clean images of the CUB dataset (11,788 images in total) to synthesize the degraded images, named CUB-C, for training the FD-Module. We employ AlexNet [15] and VGG16 [28] pre-trained on clean images as base models. The FD-Module is trained for 5,000 iterations using SGD with a batch size of 32. The initial learning rate is 0.001 and decreases to 0.0001 after 2,500 iterations. In the FD-Module, the filter weights of each layer are initialized using the MSRA initialization method.

5.1. Fog

Fog is very common in images captured outdoors and causes image degradation due to atmospheric absorption and scattering. The high-frequency components and color fidelity are degraded in foggy images. Since the degradation is spatially variant, it changes the structure of local regions inconsistently and significantly increases the difficulty of feature extraction. The goal of image dehazing methods is to enhance the contrast and restore structural details, which makes images more visually pleasing but cannot eliminate the structural inconsistency.
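The FD-Module training schedule stated in Section 5 (5,000 SGD iterations, batch size 32, learning rate 0.001 dropped to 0.0001 after 2,500 iterations, MSRA initialization) can be sketched as follows; the two-conv `fdm` is a placeholder for the real module of Table 1, and the forward/backward pass is elided.

```python
import torch
import torch.nn as nn

# placeholder for the FD-Module; the real architecture is in Table 1
fdm = nn.Sequential(nn.Conv2d(131, 128, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(128, 128, 3, padding=1))

# MSRA (Kaiming) initialization of every filter, as stated in the text
for m in fdm.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        nn.init.zeros_(m.bias)

# SGD with lr 0.001, decayed by 10x after 2,500 of the 5,000 iterations
opt = torch.optim.SGD(fdm.parameters(), lr=0.001)
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[2500],
                                             gamma=0.1)

for step in range(2501):
    # ... forward/backward on a batch of 32 paired images goes here ...
    opt.step()
    sched.step()
print(round(opt.param_groups[0]['lr'], 6))  # 0.0001 past the milestone
```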
In this paper, we select five state-of-the-art image dehazing methods as baselines: DehazeNet [4], AoD-Net [17], FPCNet [42], FAMED-Net [43], and GFN-Net [25]. All of the test foggy images are dehazed by the baseline methods and then sent to pre-trained image classifiers (VGG16 and AlexNet). Our proposed FDM is first trained on foggy images and then plugged into the same pre-trained image classifiers (VGG16 and AlexNet). The evaluations are performed on the five degradation levels respectively.

As can be seen from Figure 5 (a), our proposed method significantly surpasses the two-stage methods in terms of classification accuracy, especially when the fog concentration is large. Moreover, with the increase of fog concentration, the performance of the two-stage methods drops notably, while o