COMPACT CONVOLUTIONAL NEURAL NETWORK TRANSFER LEARNING FOR
SMALL-SCALE IMAGE CLASSIFICATION

Zengxi Li*, Yan Song*, Ian McLoughlin†, Lirong Dai*

* National Engineering Laboratory of Speech and Language Information Processing, USTC
† School of Computing, University of Kent
ABSTRACT
Transfer learning methods have demonstrated state-of-the-art performance on various small-scale image classification tasks. This is generally achieved by exploiting information from a convolutional neural network pre-trained on ImageNet (ImageNet CNN). However, the transferred CNN model generally has high computational complexity and storage requirements, which raises issues for real-world applications, especially on portable devices such as phones and tablets without high-performance GPUs. Several approximation methods have been proposed to reduce the complexity by reconstructing the linear or non-linear filters (responses) in the convolutional layers with a series of smaller ones.
In this paper, we present a compact CNN transfer learning method for small-scale image classification. Specifically, it can be decomposed into fine-tuning and joint learning stages. In the fine-tuning stage, a high-performance target CNN is trained by transferring information from the ImageNet CNN. In the joint learning stage, a compact target CNN is optimized on ground-truth labels, jointly with the predictions of the high-performance target CNN. Experimental results on CIFAR-10 and MIT Indoor Scene demonstrate the effectiveness and efficiency of the proposed method.
Index Terms— CNN, Transfer Learning, Image Classifi-
cation
1. INTRODUCTION
Recently, deep convolutional neural networks (CNNs) have achieved outstanding performance in large-scale visual recognition competitions [1]. Generally, a deep CNN structure can be decomposed into (1) convolutional layers, which perform non-linear feature extraction via convolution, rectified linear unit (ReLU), and max-pooling operations, and (2) fully connected layers, which map the extracted features into
posterior probabilities. It is known that the powerful modeling capability of a deep CNN comes mainly from its complex structure, with millions of parameters tuned on a large-scale labeled dataset such as ImageNet [2].

We acknowledge the support of the following organizations for research funding: National Natural Science Foundation of China (Grant Nos. 61273264 and 61172158), Science and Technology Department of Anhui Province (Grant No. 15CZZ02007), and Chinese Academy of Sciences (Grant No. XDB02070006).
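To make the CNN structure described above concrete, the following is a minimal PyTorch sketch of a network of this form, assuming 32x32 RGB inputs (as in CIFAR-10); the layer widths are illustrative assumptions, not the architecture used in this paper:

    import torch
    import torch.nn as nn

    class SimpleCNN(nn.Module):
        """Toy CNN: conv/ReLU/max-pool feature extractor + FC classifier."""
        def __init__(self, num_classes=10):
            super().__init__()
            # (1) Convolutional layers: non-linear feature extraction.
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),                      # 32x32 -> 16x16
                nn.Conv2d(32, 64, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),                      # 16x16 -> 8x8
            )
            # (2) Fully connected layers: features -> class posteriors
            #     (softmax is applied inside the cross-entropy loss).
            self.classifier = nn.Sequential(
                nn.Linear(64 * 8 * 8, 256),
                nn.ReLU(inplace=True),
                nn.Linear(256, num_classes),
            )

        def forward(self, x):
            x = self.features(x)
            x = torch.flatten(x, 1)
            return self.classifier(x)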
However, for small-scale datasets, e.g. MIT Indoor Scene [3], such a complex CNN may be prone to over-fitting, leading to reduced performance. In such cases, several recent works indicate that it is preferable to transfer a previously well-trained CNN rather than to train a new CNN on the limited labeled data. For example, Razavian et al. conducted a series of experiments on various recognition tasks using CNN features as a generic image representation [4]. Chatfield et al. compared the results of using CNNs with various structures, e.g. CNN-F, CNN-M and CNN-S [5]. In [6], Girshick et al. showed that a CNN fine-tuning scheme can yield a significant performance boost. In [7], the transferability of features from different layers was comprehensively evaluated. The effectiveness of CNN fine-tuning schemes has thus been validated on similar tasks.
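As a concrete illustration of such a fine-tuning scheme, the sketch below adapts an ImageNet-pretrained model from torchvision to a target task; AlexNet and the hyper-parameters are illustrative stand-ins, not the exact model or settings used in this paper:

    import torch
    import torch.nn as nn
    from torchvision import models

    num_target_classes = 67  # e.g. MIT Indoor Scene has 67 categories

    # Transfer the internal layers from the pre-trained source CNN.
    net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

    # Replace the 1000-way ImageNet output layer with a new one
    # sized for the target task.
    net.classifier[6] = nn.Linear(net.classifier[6].in_features,
                                  num_target_classes)

    # Fine-tune all layers on the target data with a small learning
    # rate, so the transferred parameters are only gently adjusted.
    optimizer = torch.optim.SGD(net.parameters(), lr=1e-3, momentum=0.9)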
Despite the superior performance of transferred CNNs, their high computational complexity and storage requirements make them difficult to apply in real-world systems, especially on portable devices, such as mobile phones and tablets, that lack high-performance GPUs. It is therefore of practical importance to improve CNN efficiency without reducing performance. Several approximation methods have been developed to reconstruct linear filters or responses with a series of smaller ones [8, 9]. In [10], Zhang et al. proposed to minimize the reconstruction error of the non-linear responses, subject to a low-rank constraint. These methods mostly focus on the convolutional layers of CNNs.
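As a sketch of the linear filter-reconstruction idea behind these methods, the helper below replaces one convolutional layer with two smaller ones via a truncated SVD of its flattened weights; the rank r is a hypothetical accuracy/efficiency knob, and the cited methods use more elaborate (in [10], non-linear and data-driven) criteria:

    import torch
    import torch.nn as nn

    def low_rank_conv(conv: nn.Conv2d, r: int) -> nn.Sequential:
        """Approximate a KxK conv layer by r basis filters + a 1x1 conv."""
        C_out, C_in, K, _ = conv.weight.shape
        W = conv.weight.data.reshape(C_out, C_in * K * K)  # flatten filters
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        # First factor: r "basis" filters of the original spatial size.
        first = nn.Conv2d(C_in, r, K, stride=conv.stride,
                          padding=conv.padding, bias=False)
        first.weight.data = Vh[:r].reshape(r, C_in, K, K)
        # Second factor: 1x1 conv recombining the r basis responses.
        second = nn.Conv2d(r, C_out, 1, bias=conv.bias is not None)
        second.weight.data = (U[:, :r] * S[:r]).reshape(C_out, r, 1, 1)
        if conv.bias is not None:
            second.bias.data = conv.bias.data.clone()
        return nn.Sequential(first, second)

The multiply count per output position drops roughly from C_out*C_in*K^2 to r*(C_in*K^2 + C_out), so a small rank r yields large savings.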
In this paper, we propose a compact transfer learning scheme for small-scale recognition tasks, as shown in Fig. 1. Given a CNN pre-trained on a source task (i.e. ImageNet), the transfer process is decomposed into fine-tuning and joint learning stages. In the fine-tuning stage, a high-performance CNN model for the target dataset, such as MIT Indoor Scene or CIFAR-10, is fine-tuned by transferring the parameters of internal layers from the pre-trained CNN. In the joint learning stage, a compact CNN model that satisfies the complexity and storage requirements is first designed, and then optimized with an objective function which exploits the information in the output probabilities of the high-performance CNN. This may enforce the compact