convolutional neural network trained on large-scale datasets like ImageNet [8]. Since the collected images of the existing datasets are usually free of degradation, such as low visibility [41, 12, 34], color cast [42, 2], and overexposure, evaluation of standard neural network-based methods for image classification shows a significant drop in classification accuracy when applied to low-quality images [13].

In practice, one solution to this problem is to first improve the visibility of degraded images by image enhancement methods, and then to perform image classification, called two-stage methods in this paper [7, 17, 18]. However, the main reason why image degradation affects image classification is that the structural and statistical properties of pixels in the neighborhood are obstructed by image degradation. Since the existing image enhancement methods are devised to achieve a pleasing visual effect, they cannot guarantee that regions with similar structure in the image are enhanced uniformly, leading to uncovered and incomplete feature representation for classification. A typical example is shown in Figure 1. As can be seen, there is still a margin between the feature distributions of the enhanced image and the clear image.

Another feasible solution for low-quality image classification is to transform it into a domain adaptation problem and to match the degraded and clear images with adversarial learning or kernelized training. This method is based on the assumption that the marginal distributions of low- and high-quality images can be aligned in the learned feature space by a deep network.
Therefore, after decreasing the divergence between the marginal distributions in the learned feature space, the classifier trained with high-quality images tends to perform well on the low-quality images. While promising, most existing domain adaptation approaches require either complex neural network architectures [3, 27, 14] or fine-tuning on the target domain [31, 21, 26]. Different from these methods, this paper proposes to learn a transferable mapping relationship between deep representations of low- and high-quality images, and to leverage it as a deep degradation prior (DDP) for the image classifier. This method is based on the following statistical observations.

Deep Degradation Prior for Low-Quality Image Classification

Yang Wang1, Yang Cao1*, Zheng-Jun Zha1†, Jing Zhang2, Zhiwei Xiong1
1 University of Science and Technology of China, Hefei, China  2 UBTECH Sydney AI Centre, University of Sydney, Sydney, Australia
ywang120@mail.ustc.edu.cn, {forrest, zhazj, zwxiong}@ustc.edu.cn, jing.zhang1@sydney.edu.au

Abstract

State-of-the-art image classification algorithms based on convolutional neural networks (CNNs) are usually trained on large-scale annotated datasets of high-quality images. When applied to low-quality images, their performance drops significantly, because image degradation obstructs the structural and statistical properties of neighboring pixels. To address this problem, this paper proposes a novel deep degradation prior for low-quality image classification. It is based on the statistical observations that, in the deep representation space, image patches with structural similarity have a uniform distribution even if they come from different images, and that under the same degradation condition, the distributions of corresponding patches in low- and high-quality images have uniform margins. Accordingly, we propose a feature de-drifting module (FDM) to learn the mapping relationship between the deep representations of low- and high-quality images, and leverage it as a deep degradation prior (DDP) for low-quality image classification. Since the statistical properties are independent of image content, the deep degradation prior can be learned on a training set of limited images without the supervision of semantic labels, and provided in a "plug-in" form to existing classification networks to improve their performance on degraded images. Evaluation results on the benchmark dataset ImageNet-C show that our proposed DDP can improve the accuracy of pre-trained network models by more than 20% under various degradation conditions. Even in the extreme setting of training DDP with only 10 images from the CUB-C dataset, our method improves the accuracy of VGG16 on ImageNet-C from 37% to 55%.
1. Introduction

* Co-first authors. † Corresponding author.

Figure 1. We select four similar patches a1-a4 from different clear images and visualize their features with VGG16 trained on ImageNet, as shown in (A). We visualize the feature distributions with t-SNE [20], as shown in (D). As shown in (D), the clear patches have a uniform distribution in the feature space. Fog obstructs the statistical properties of pixels within local patches, resulting in feature drift, as shown in (B) and (D). Dehazing clearly improves visibility, but the feature drift phenomenon remains, as shown in (C) and (D). (Panels: hazy image / degraded features; dehazing result / enhanced features; clear image / clear features.)

Image patches with structural similarity have a uniform distribution in the deep representation space, even if they come from different images; under the same degradation condition, the distributions of corresponding structurally similar patches in low- and high-quality images have uniform margins. Since the statistical properties are independent of image content, our proposed deep degradation prior can be learned from a small number of images without the supervision of semantic labels.

Specifically, we propose a feature de-drifting module (FDM) to learn the mapping relationship between the deep representations of low- and high-quality images, and leverage it as a deep degradation prior (DDP) for low-quality image classification. The FDM is designed to compensate for the attenuation effect of image degradation on features, where a four-layer feed-forward network is adopted to simulate the visual processing mechanism of the non-classical receptive field model. After training on an arbitrary degraded image dataset, the FDM can be easily plugged into existing classification networks to improve their performance on degraded images. Evaluation results on the benchmark dataset ImageNet-C [13] show that our method performs well on low-quality image classification. The proposed DDP can be learned from only 10 images of the CUB-C dataset and improves the accuracy of VGG16 on ImageNet-C from 37% to 55%. The contributions of this paper are summarized as follows:

(1) This paper finds that image patches with structural similarity have a uniform distribution in the deep representation space, and that under the same degradation condition, the distributions of low- and high-quality image patches have uniform margins, and accordingly proposes a novel deep degradation prior for low-quality image classification.
(2) This paper proposes a feature de-drifting module (FDM) to learn the mapping relationship between the deep representations of low- and high-quality images. After training on a small set of degraded images, the FDM can be easily plugged into existing classification networks to boost their performance on degraded images.
(3) Evaluation results on the benchmark dataset ImageNet-C show that the proposed method performs well under different degradation conditions. Even in the extreme case of training with only 10 images from the CUB-C dataset, our method improves the accuracy of VGG16 from 37% to 55%.

2. Related Work

The representation difference between degraded and clear images leads to a shift in feature distributions and a performance drop when pre-trained classification models are evaluated across domains [29]. Image enhancement algorithms can restore degraded images to clear versions so that the human eye can recognize objects and structural details. For example, B. Li et al. [17] use a dehazing network to improve object detection performance in foggy conditions. D. Dai et al. [7] evaluate the influence of image super-resolution on high-quality images.

images from another dataset (e.g., ImageNet). This property comes from the fact that local patches are statistically irrelevant to specific image content (e.g., semantics) and can be embedded into a low-dimensional manifold [35]. Besides, recent progress in deep convolutional neural networks also witnesses a similar phenomenon: features learned in shallow layers are mainly low-level ones, such as edges, colors, and textures [9, 38]. Accordingly, the learned convolutional weights can be regarded as visual dictionaries to extract (represent) these low-level features (local patches). Motivated by the above work, we propose a novel deep degradation prior which is simple and effective for downstream high-level vision tasks, such as classification on degraded images.
Specifically, we argue that: 1) the clear image patches from different datasets share a similar distribution in the feature embedding space, resulting in an indistinguishable cluster; 2) the degraded image patches have the same property, while they are separated from the clear ones due to their distinct local statistics; 3) if we can learn a mapping between the clear features and the degraded features, it can be used for arbitrary natural images.

To illustrate the above claims, we conducted a statistical experiment on clear and foggy images, synthesized according to the hazy model [13] on the ImageNet [8] and CUB [33] datasets. Some exemplar hazy images are shown in the right corners of Figure 2. First, for the ImageNet dataset, we sampled hundreds of clear patches and the corresponding foggy patches from the same positions, denoted S_I and S_IF, respectively. The same procedure was carried out on the CUB dataset, and the sampled patches are denoted S_C and S_CF, respectively. Referring to [35], the patch size was set to 10*10 in the experiment. To better visualize their distributions and avoid messy clusters, we filtered the patches so that they share similar local structures. Mathematically, they were subjected to the following constraint:

SSIM(p_i, p_j) > T,  ∀ p_i, p_j ∈ S_I or S_C,    (1)

where SSIM denotes the structural similarity measurement [4], and T is a threshold set to 0.7. We kept 500 patches for both the clear and foggy cases on each dataset, as shown in the top/bottom middle part of Figure 2.

Then, to obtain their feature representations in the embedding space [40], we used the VGG16 network pre-trained on the ImageNet dataset as the feature extractor. Specifically, the features of each patch were extracted from the "conv2_2" layer, since its receptive field is 9*9, almost the same as the patch size. These features are shown in the red, blue, green, and yellow cubes in Figure 2.
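The patch-filtering constraint of Eq. (1) can be sketched in a few lines. The sketch below is a simplified, hypothetical version: it uses a single-window SSIM over the whole 10*10 patch (the paper uses the full SSIM of [4]) and greedily compares every patch against a seed patch rather than enforcing the constraint over all pairs.

```python
import numpy as np

def ssim_single_window(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified single-window SSIM over a whole patch (an
    approximation of the full SSIM measurement of [4])."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def filter_similar_patches(patches, threshold=0.7, keep=500):
    """Greedy approximation of Eq. (1): keep patches whose SSIM to the
    first (seed) patch exceeds T; cap at 500 as in the experiment."""
    seed = patches[0]
    kept = [p for p in patches if ssim_single_window(seed, p) > threshold]
    return kept[:keep]

# toy usage with hypothetical 10*10 patches: noisy copies of one ramp
rng = np.random.default_rng(0)
base = np.tile(np.linspace(0.0, 1.0, 10), (10, 1))
patches = [np.clip(base + 0.01 * rng.standard_normal((10, 10)), 0.0, 1.0)
           for _ in range(20)]
print(len(filter_similar_patches(patches)) > 0)  # True: the seed matches itself
```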
Finally, we leveraged t-SNE [20] to visualize them, as shown in the central part of Figure 2. It is clear that: 1) features from clear images are clustered together regardless of which dataset they come from, and the same holds for the foggy case; 2) there is a large gap between the clear features and the foggy ones. Therefore, if we find a mapping between them, we can bridge them together. In this sense, restoring a degraded image patch to a clear one in the feature embedding space will certainly benefit downstream classification and detection tasks. In this paper, we call such a mapping the deep degradation prior (DDP). In the next part, we present an efficient solution showing how to learn an effective DDP.

4. Learning DDP by Deep Neural Network

4.1. Overview of the Network

Given a clear dataset and its paired degraded dataset without semantic labels, our goal is to learn an effective DDP, which can be seamlessly plugged into existing convolutional neural networks to enhance their generalizability on degraded images. To this end, we propose an effective learning method that reconstructs the clear features from the degraded ones under the supervision of a simple Mean Square Error (MSE) loss. As shown in Figure 3 (a), during the training phase, we first use a pre-trained model to extract the low-level features of both degraded and clear images, for example, from "conv2_2" in VGG16 [28] and from the first layer in AlexNet [15], respectively. This part of the network is fixed during the training phase and is called the Shallow Pretrained Layers (SPL) in this paper. The degraded and clear features are denoted "DF" and "CF" in Figure 3 (a).

Then, we propose a novel feature de-drifting module (FDM) to accomplish the feature reconstruction. Taking a hazy image as an example, it is concatenated with the DF from the SPL and fed into the FDM. Leveraging the residual learning idea, the output feature from the FDM is fused with the original DF by an element-wise sum.
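The frozen SPL feature extraction described above can be sketched as follows. This is a stand-in with random weights for illustration only; in the paper the weights come from an ImageNet-pretrained VGG16 and the stack up to "conv2_2" is kept fixed.

```python
import torch
import torch.nn as nn

# Stand-in for the Shallow Pretrained Layers (SPL): the VGG16 stack up
# to conv2_2. Random weights here; the paper uses frozen pretrained ones.
spl = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),    # conv1_1
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),   # conv1_2
    nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),  # conv2_1
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True), # conv2_2
).eval()
for p in spl.parameters():
    p.requires_grad_(False)   # SPL stays fixed during FDM training

with torch.no_grad():
    df = spl(torch.rand(1, 3, 224, 224))  # a stand-in degraded image
print(tuple(df.shape))  # (1, 128, 112, 112) -- matches the sizes in Table 1
```

The 128-channel, 112*112 output is exactly the feature-map size the Table 1 FDM expects (131 input channels after concatenating the 3-channel image).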
The resulting enhanced feature (EF) is compared with its paired CF to calculate the MSE loss, and the error is back-propagated to the FDM to update its parameters.

During the testing phase, we insert the trained FDM into an existing classification network, i.e., between its SPL and the subsequent deep pre-trained layers (DPL), as shown in Figure 3 (b). It is noteworthy that the learned weights in the FDM serve as the learned DDP to map the degraded features to the clear ones (see Figure 2).

4.2. Feature De-drifting Module

Degradation changes the statistics of a local patch, which results in a biased feature response from the SPL compared with the original clear one. In order to correct the drifted feature response, we propose a novel feature de-drifting module. It is inspired by the non-classical receptive field of human vision [6], as shown in Figure 3 (c). The non-classical receptive field is very useful for enhancing high frequencies while maintaining low frequencies.

Figure 3. (a)-(b) Diagram of the training/testing phase of the proposed method. (c) Illustration of the non-classical receptive field [6] for enhancing the high frequencies while maintaining the low frequencies. (d) The network structure of our feature de-drifting module (FDM).
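The training step of Figure 3 (a) can be sketched as below. It is a minimal sketch, not the paper's implementation: the one-conv "SPL" and "FDM" are toy stand-ins, and resizing the degraded image to the feature map's spatial size before the channel-wise concatenation is our assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fdm_train_step(spl, fdm, optimizer, degraded, clear):
    """One FDM update: the frozen SPL yields degraded (DF) and clear
    (CF) features; the FDM sees the degraded image concatenated with
    DF, and its residual output gives EF = DF + FDM([image, DF]),
    supervised by an MSE loss against CF."""
    with torch.no_grad():
        df = spl(degraded)   # degraded features, extractor frozen
        cf = spl(clear)      # paired clear features
    img = F.interpolate(degraded, size=df.shape[-2:], mode='bilinear',
                        align_corners=False)  # resize image to DF size
    ef = df + fdm(torch.cat([img, df], dim=1))  # enhanced features
    loss = F.mse_loss(ef, cf)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# toy stand-ins: a frozen one-conv "SPL" and a one-conv "FDM"
spl = nn.Conv2d(3, 128, 3, stride=2, padding=1).eval()
for p in spl.parameters():
    p.requires_grad_(False)
fdm = nn.Conv2d(131, 128, 3, padding=1)  # 131 = 3 image + 128 feature channels
opt = torch.optim.SGD(fdm.parameters(), lr=0.001)
loss = fdm_train_step(spl, fdm, opt, torch.rand(2, 3, 64, 64),
                      torch.rand(2, 3, 64, 64))
print(loss >= 0.0)  # True: a finite, non-negative MSE
```

At test time the trained `fdm` (with the residual sum) is simply inserted between the SPL and the frozen deeper layers, as in Figure 3 (b).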
Mathematically, given an input I, a filter with a non-classical receptive field generates an output f as follows:

f = A1(I ∗ G1(σ1)) + A2(I ∗ G2(σ2)) + A3(I ∗ G3(σ3)),    (2)

where G1 ∼ G3 denote three Gaussian filters with different filter bandwidths, i.e.,

G(σ) = (1 / √(2πσ²)) exp(−(x² + y²) / (2σ²)),    (3)

A1 ∼ A3 represent the coefficients in the central, surrounded, and marginal frequency areas, respectively, ∗ denotes the convolution operation, and σ1 ∼ σ3 are the scale parameters determining the filter bandwidths. Eq. (2) can be reformulated as:

f = A1(I ∗ G1) + A2((I ∗ G1) ∗ G2′) + A3(((I ∗ G1) ∗ G2′) ∗ G3′),    (4)

where G2′ and G3′ are also Gaussian filters, with scale parameters √(σ2² − σ1²) and √(σ3² − σ2²), respectively. In this way, the convolutional result in the first (second) term can be used as the input of the second (third) Gaussian filter G2′ (G3′).

Inspired by the non-classical receptive field and Eq. (4), we design our FDM as shown in Figure 3 (d). The first three blocks, filled in orange, yellow, and blue, represent the convolution process in the central, surrounded, and marginal frequency areas, respectively. Each block consists of two convolutional layers. The output features from the three blocks are then concatenated together and fed into a final 1*1 convolutional layer to simulate the linear weighting between the central, surrounded, and marginal frequency parts, as shown in Eq. (2). Details of the FDM are summarized in Table 1 and Table 2 for VGG16 and AlexNet, respectively.

Table 1. The details of FDM with VGG16.

Layer | Input Size | Num | Filter | Stride | Pad
G1_1 | 131*112*112 | 128 | 3 | 1 | 1
G1_2 | 128*112*112 | 128 | 3 | 1 | 1
G2_1 | 128*112*112 | 64 | 3 | 1 | 1
G2_2 | 64*112*112 | 64 | 3 | 1 | 1
G3_1 | 64*112*112 | 32 | 3 | 1 | 1
G3_2 | 32*112*112 | 32 | 3 | 1 | 1
W | 224*112*112 | 128 | 1 | 1 | 0

Table 2. The details of FDM with AlexNet.

Layer | Input Size | Num | Filter | Stride | Pad
G1_1 | 67*27*27 | 128 | 3 | 1 | 1
G1_2 | 128*27*27 | 128 | 3 | 1 | 1
G2_1 | 128*27*27 | 64 | 3 | 1 | 1
G2_2 | 64*27*27 | 64 | 3 | 1 | 1
G3_1 | 64*27*27 | 32 | 3 | 1 | 1
G3_2 | 32*27*27 | 32 | 3 | 1 | 1
W | 224*27*27 | 64 | 1 | 1 | 0
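The FDM of Table 1 can be sketched as a PyTorch module. The layer sizes follow Table 1 directly (three cascaded two-conv blocks feeding each other, mirroring the cascaded Gaussians of Eq. (4), with a 1*1 mixing layer for the weighting A1-A3 of Eq. (2)); the ReLU activations are our assumption, since the table lists only the convolutions.

```python
import torch
import torch.nn as nn

class FDM(nn.Module):
    """Feature de-drifting module as read from Table 1 (VGG16 variant):
    G1 -> G2 -> G3 are cascaded, like the central / surrounded /
    marginal Gaussian stages of Eq. (4); their outputs are concatenated
    (128 + 64 + 32 = 224 channels) and mixed by the 1x1 layer W."""
    def __init__(self, in_ch=131, out_ch=128):
        super().__init__()
        self.g1 = nn.Sequential(
            nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True))
        self.g2 = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.g3 = nn.Sequential(
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True))
        self.w = nn.Conv2d(224, out_ch, 1)  # 1x1 mixing layer "W"

    def forward(self, x):
        f1 = self.g1(x)    # "central" stage
        f2 = self.g2(f1)   # "surrounded" stage, fed by f1 (cf. Eq. 4)
        f3 = self.g3(f2)   # "marginal" stage, fed by f2
        return self.w(torch.cat([f1, f2, f3], dim=1))

x = torch.randn(1, 131, 56, 56)  # 3 image + 128 feature channels
out = FDM()(x)
print(tuple(out.shape))  # (1, 128, 56, 56)
```

For AlexNet (Table 2), only the input/output channel counts and spatial size change (`in_ch=67`, `out_ch=64` on 27*27 maps).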
5. Experiments

We evaluate the performance of our proposed deep degradation prior on the ImageNet-C dataset [13], which is a rigorous benchmark with 50,000 images in 1,000 categories, widely used for robustness evaluation of image classifiers. In this paper, we mainly focus on three kinds of degradation conditions: fog, low contrast, and brightness. For each of these degradation conditions, we perform experiments on five levels of degradation. Examples of degraded images are shown in Figure 4. Besides, under each degradation condition, we test the influence of data size on deep degradation prior modeling.

Figure 4. Some examples of degraded images from ImageNet-C (fog, contrast, and brightness at levels 1-5).

Following the degraded image generation methods in [13], we use the clean images of the CUB dataset (11,788 images in total) to synthesize the degraded images, named CUB-C, for training the FD-Module. We employ AlexNet [15] and VGG16 [28] pre-trained on clean images as base models. The FD-Module is trained for 5,000 iterations using SGD with a batch size of 32. The initial learning rate is 0.001 and decreases to 0.0001 after 2,500 iterations. In the FD-Module, the filter weights of each layer are initialized using the MSRA initialization method.

5.1. Fog

Fog is very common in images captured outdoors and causes image degradation due to atmospheric absorption and scattering. The high-frequency components and color fidelity are degraded in foggy images. Since the degradation is spatially variant, it changes the structure of local regions inconsistently and significantly increases the difficulty of feature extraction. The goal of image dehazing methods is to enhance the contrast and restore structural details, which makes images more visually pleasing but cannot eliminate the structural inconsistency.
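The FD-Module training schedule stated in Section 5 (5,000 SGD iterations, batch size 32, learning rate 0.001 dropped to 0.0001 after 2,500 iterations, MSRA initialization) can be sketched as follows; the two-conv `fdm` is a placeholder for the real module of Table 1, and the forward/backward pass is elided.

```python
import torch
import torch.nn as nn

# placeholder for the FD-Module; the real architecture is in Table 1
fdm = nn.Sequential(nn.Conv2d(131, 128, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(128, 128, 3, padding=1))

# MSRA (Kaiming) initialization of every filter, as stated in the text
for m in fdm.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        nn.init.zeros_(m.bias)

# SGD with lr 0.001, decayed by 10x after 2,500 of the 5,000 iterations
opt = torch.optim.SGD(fdm.parameters(), lr=0.001)
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[2500],
                                             gamma=0.1)

for step in range(2501):
    # ... forward/backward on a batch of 32 paired images goes here ...
    opt.step()
    sched.step()
print(round(opt.param_groups[0]['lr'], 6))  # 0.0001 past the milestone
```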
In this paper, we select five state-of-the-art image dehazing methods as baselines: DehazeNet [4], AoD-Net [17], FPCNet [42], FAMED-Net [43], and GFN-Net [25]. All of the test foggy images are dehazed by the baseline methods and then sent to pre-trained image classifiers (VGG16 and AlexNet). Our proposed FDM is first trained on foggy images and then plugged into the same pre-trained image classifiers (VGG16 and AlexNet). The evaluations are performed on the five degradation levels respectively.

As can be seen from Figure 5 (a), our proposed method significantly surpasses the two-stage methods in terms of classification accuracy, especially when the fog concentration is large. Moreover, with the increase of fog concentration, the performance of the two-stage methods drops notably, while o