Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
Google, Inc.
fchollet@google.com
Abstract
We present an interpretation of Inception modules in con-
volutional neural networks as being an intermediate step
in-between regular convolution and the depthwise separable
convolution operation (a depthwise convolution followed by
a pointwise convolution). In this light, a depthwise separable
convolution can be understood as an Inception module with
a maximally large number of towers. This observation leads
us to propose a novel deep convolutional neural network
architecture inspired by Inception, where Inception modules
have been replaced with depthwise separable convolutions.
We show that this architecture, dubbed Xception, slightly
outperforms Inception V3 on the ImageNet dataset (which
Inception V3 was designed for), and significantly outper-
forms Inception V3 on a larger image classification dataset
comprising 350 million images and 17,000 classes. Since
the Xception architecture has the same number of param-
eters as Inception V3, the performance gains are not due
to increased capacity but rather to a more efficient use of
model parameters.
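To make the operation referred to above concrete, the following sketch expresses one depthwise separable convolution with standard Keras layers; the input shape and channel counts are illustrative and are not taken from the paper.

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(32, 32, 64))  # hypothetical feature map

# Depthwise step: one 3x3 spatial filter per input channel, no channel mixing.
x = layers.DepthwiseConv2D(kernel_size=3, padding="same", use_bias=False)(inputs)
# Pointwise step: a 1x1 convolution that mixes channels only.
outputs = layers.Conv2D(filters=128, kernel_size=1, use_bias=False)(x)

# Keras also provides a fused form of the same two steps:
fused = layers.SeparableConv2D(filters=128, kernel_size=3,
                               padding="same", use_bias=False)(inputs)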
1. Introduction
Convolutional neural networks have emerged as the mas-
ter algorithm in computer vision in recent years, and de-
veloping recipes for designing them has been a subject of
considerable attention. The history of convolutional neural
network design started with LeNet-style models [10], which
were simple stacks of convolutions for feature extraction
and max-pooling operations for spatial sub-sampling. In
2012, these ideas were refined into the AlexNet architec-
ture [9], where convolution operations were being repeated
multiple times in-between max-pooling operations, allowing
the network to learn richer features at every spatial scale.
What followed was a trend to make this style of network
increasingly deeper, mostly driven by the yearly ILSVRC
competition; first with Zeiler and Fergus in 2013 [25] and
then with the VGG architecture in 2014 [18].
At this point a new style of network emerged, the Incep-
tion architecture, introduced by Szegedy et al. in 2014 [20]
as GoogLeNet (Inception V1), later refined as Inception V2
[7], Inception V3 [21], and most recently Inception-ResNet [19]. Inception itself was inspired by the earlier Network-In-Network architecture [11]. Since its first introduction,
Inception has been one of the best performing families of
models on the ImageNet dataset [14], as well as internal
datasets in use at Google, in particular JFT [5].
The fundamental building block of Inception-style mod-
els is the Inception module, of which several different ver-
sions exist. In figure 1 we show the canonical form of an
Inception module, as found in the Inception V3 architec-
ture. An Inception model can be understood as a stack of
such modules. This is a departure from earlier VGG-style
networks which were stacks of simple convolution layers.
While Inception modules are conceptually similar to con-
volutions (they are convolutional feature extractors), they
empirically appear to be capable of learning richer repre-
sentations with fewer parameters. How do they work, and
how do they differ from regular convolutions? What design
strategies come after Inception?
1.1. The Inception hypothesis
A convolution layer attempts to learn filters in a 3D space,
with 2 spatial dimensions (width and height) and a chan-
nel dimension; thus a single convolution kernel is tasked
with simultaneously mapping cross-channel correlations and
spatial correlations.
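As a worked illustration of this coupling (the channel counts below are hypothetical, chosen only for the arithmetic), the weight tensor of a regular convolution layer spans both kinds of correlation at once:

# Parameter count of one regular 3x3 convolution layer, with hypothetical sizes.
# The single kernel tensor spans height, width, input channels and output
# channels, so spatial and cross-channel structure are learned jointly.
kernel_h, kernel_w = 3, 3
in_channels, out_channels = 256, 256
regular_conv_params = kernel_h * kernel_w * in_channels * out_channels
print(regular_conv_params)  # 589824 weights in a single jointly learned kernel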
The idea behind the Inception module is to make this
process easier and more efficient by explicitly factoring it
into a series of operations that would independently look at
cross-channel correlations and at spatial correlations. More
precisely, the typical Inception module first looks at cross-
channel correlations via a set of 1x1 convolutions, mapping
the input data into 3 or 4 separate spaces that are smaller than
the original input space, and then maps all correlations in
these smaller 3D spaces, via regular 3x3 or 5x5 convolutions.
This is illustrated in figure 1. In effect, the fundamental hypothesis behind Inception is that cross-channel correlations and spatial correlations are sufficiently decoupled that it is preferable not to map them jointly¹.
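A minimal sketch of this factoring, written as a single Inception-style tower in Keras, may help: a 1x1 convolution first maps cross-channel correlations into a smaller space, and a 3x3 convolution then maps spatial correlations within it. The module of figure 1 runs several such towers in parallel and concatenates their outputs; all channel counts below are illustrative, not the paper's.

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(32, 32, 256))  # hypothetical feature map

# Cross-channel step: a 1x1 convolution projects the input into a smaller space.
x = layers.Conv2D(filters=64, kernel_size=1, activation="relu")(inputs)
# Spatial step: a regular 3x3 convolution maps correlations within that space.
tower = layers.Conv2D(filters=64, kernel_size=3, padding="same",
                      activation="relu")(x)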
¹ A variant of the process is to independently look at width-wise corre-