Going Deeper with Convolutions
Christian Szegedy¹, Wei Liu², Yangqing Jia¹, Pierre Sermanet¹, Scott Reed³,
Dragomir Anguelov¹, Dumitru Erhan¹, Vincent Vanhoucke¹, Andrew Rabinovich⁴

¹Google Inc.   ²University of North Carolina, Chapel Hill
³University of Michigan, Ann Arbor   ⁴Magic Leap Inc.

¹{szegedy,jiayq,sermanet,dragomir,dumitru,vanhoucke}@google.com
²wliu@cs.unc.edu   ³reedscott@umich.edu   ⁴arabinovich@magicleap.com
Abstract
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our ILSVRC14 submission is called GoogLeNet, a 22-layer-deep network whose quality is assessed in the context of classification and detection.
1. Introduction
In the last three years, our object classification and detection capabilities have dramatically improved due to advances in deep learning and convolutional networks [10].
Encouragingly, most of this progress is not just the result of more powerful hardware, larger datasets, and bigger models, but mainly a consequence of new ideas, algorithms, and improved network architectures. The top entries in the ILSVRC 2014 competition, for example, used no new data sources beyond the competition's own classification dataset (repurposed for detection). Our
GoogLeNet submission to ILSVRC 2014 actually uses 12 times fewer parameters than the winning architecture of Krizhevsky et al. [9] from two years ago, while being significantly more accurate. On the object detection front, the biggest gains have come not from the naive application of ever bigger deep networks, but from the synergy of deep architectures and classical computer vision, such as the R-CNN algorithm by Girshick et al. [6].
Another notable factor is that with the ongoing traction of mobile and embedded computing, the efficiency of our algorithms, especially their power and memory use, gains importance. It is noteworthy that the considerations leading to the design of the deep architecture presented in this paper included this factor, rather than a sheer fixation on accuracy numbers. For most of the experiments, the models were designed to keep a computational budget of 1.5 billion multiply-adds at inference time, so that they do not end up as a purely academic curiosity but can be put to real-world use, even on large datasets, at a reasonable cost.
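To give a sense of scale for this budget, the following sketch counts the multiply-adds of a single convolutional layer; the layer dimensions below are hypothetical, chosen only for illustration, not taken from the paper:

```python
def conv_multiply_adds(h, w, c_in, c_out, k):
    """Multiply-adds for a k x k convolution over an h x w x c_in input
    producing c_out feature maps (stride 1, 'same' padding): each of the
    h * w * c_out outputs needs k * k * c_in multiply-adds."""
    return h * w * c_out * (k * k * c_in)

# Hypothetical layer: 56x56 spatial resolution, 64 -> 192 channels, 3x3 kernels.
ops = conv_multiply_adds(56, 56, 64, 192, 3)
print(ops)  # 346816512 -- a single layer already consumes ~23% of a 1.5e9 budget
```

A single mid-resolution layer of this (plausible) size uses roughly a quarter of the stated 1.5 billion multiply-add budget, which illustrates why the network's width and depth cannot be increased naively.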
In this paper, we will focus on an efficient deep neural network architecture for computer vision, codenamed Inception, which derives its name from the Network in Network paper by Lin et al. [12] in conjunction with the famous "we need to go deeper" internet meme [1]. In our case, the word "deep" is used in two senses: first, in the sense that we introduce a new level of organization in the form of the "Inception module", and second, in the more direct sense of increased network depth. In general, one can view the Inception model as a logical culmination of [12], while taking inspiration and guidance from the theoretical work by Arora et al. [2]. The benefits of the architecture are experimentally verified on the ILSVRC 2014 classification and detection challenges, where it significantly outperforms the current state of the art.
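The Inception module itself is specified later in the paper; as a purely structural sketch of the idea, the following code runs four parallel branches (1x1, 3x3, and 5x5 convolutions plus max pooling, with 1x1 "reduce" layers before the expensive convolutions) and concatenates their outputs channel-wise. All weights are random and the branch widths are hypothetical, not GoogLeNet's actual values:

```python
import numpy as np

def conv2d(x, w):
    """Naive 'same'-padded convolution with ReLU.
    x: (H, W, C_in), w: (k, k, C_in, C_out) with odd k."""
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, w.shape[-1]))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.tensordot(xp[i:i + k, j:j + k], w, axes=3)
    return np.maximum(out, 0.0)

def maxpool3x3(x):
    """3x3 max pooling, stride 1, 'same' padding."""
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)), constant_values=-np.inf)
    H, W = x.shape[:2]
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + 3, j:j + 3].max(axis=(0, 1))
    return out

def inception(x, c1, c3r, c3, c5r, c5, cp, rng):
    """One Inception-style module: four parallel branches concatenated
    along the channel axis. c3r/c5r are the widths of the 1x1 'reduce'
    layers placed before the more expensive 3x3/5x5 convolutions."""
    cin = x.shape[-1]
    w = lambda k, ci, co: rng.standard_normal((k, k, ci, co)) * 0.1
    b1 = conv2d(x, w(1, cin, c1))                          # 1x1 branch
    b3 = conv2d(conv2d(x, w(1, cin, c3r)), w(3, c3r, c3))  # 1x1 reduce -> 3x3
    b5 = conv2d(conv2d(x, w(1, cin, c5r)), w(5, c5r, c5))  # 1x1 reduce -> 5x5
    bp = conv2d(maxpool3x3(x), w(1, cin, cp))              # pool -> 1x1 project
    return np.concatenate([b1, b3, b5, bp], axis=-1)

rng = np.random.default_rng(0)
y = inception(rng.standard_normal((8, 8, 16)), 8, 6, 12, 2, 4, 4, rng)
print(y.shape)  # (8, 8, 28): 8 + 12 + 4 + 4 output channels
```

Note how the 1x1 reductions shrink the channel count before the 3x3 and 5x5 convolutions run; this is what lets the depth and width grow while the multiply-add budget stays roughly constant.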
2. Related Work
Starting with LeNet-5 [10], convolutional neural networks (CNN) have typically had a standard structure –
stacked convolutional layers (optionally followed by con-