
Network In Network

Min Lin (1,2), Qiang Chen (2), Shuicheng Yan (2)
(1) Graduate School for Integrative Sciences and Engineering
(2) Department of Electronic & Computer Engineering
National University of Singapore, Singapore
{linmin,chenqiang,eleyans}@nus.edu.sg
Abstract
We propose a novel deep network structure called “Network In Network”(NIN)
to enhance model discriminability for local patches within the receptive field. The
conventional convolutional layer uses linear filters followed by a nonlinear acti-
vation function to scan the input. Instead, we build micro neural networks with
more complex structures to abstract the data within the receptive field. We in-
stantiate the micro neural network with a multilayer perceptron, which is a potent
function approximator. The feature maps are obtained by sliding the micro net-
works over the input in a similar manner as CNN; they are then fed into the next
layer. Deep NIN can be implemented by stacking multiple of the above-described
structures. With enhanced local modeling via the micro network, we are able to uti-
lize global average pooling over feature maps in the classification layer, which is
easier to interpret and less prone to overfitting than traditional fully connected lay-
ers. We demonstrate state-of-the-art classification performance with NIN on
CIFAR-10 and CIFAR-100, and reasonable performance on the SVHN and MNIST
datasets.
1 Introduction
Convolutional neural networks (CNNs) [1] consist of alternating convolutional layers and pooling
layers. Convolutional layers take the inner product of a linear filter and the underlying receptive
field, followed by a nonlinear activation function, at every local portion of the input. The resulting
outputs are called feature maps.
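A minimal sketch of this computation is shown below (this code is not from the paper; the function name conv_feature_map, the tensor shapes, and the choice of ReLU as the nonlinearity are illustrative assumptions):

import numpy as np

def conv_feature_map(x, w, b):
    """Compute one feature map: x has shape (C, H, W), w has shape (C, k, k), b is a scalar bias."""
    C, H, W = x.shape
    _, k, _ = w.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[:, i:i + k, j:j + k]      # local receptive field
            out[i, j] = np.sum(patch * w) + b   # inner product with the linear filter
    return np.maximum(out, 0.0)                 # nonlinear activation (ReLU)

A full convolutional layer applies many such filters to the same input, yielding one feature map per filter.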
The convolution filter in CNN is a generalized linear model (GLM) for the underlying data patch,
and we argue that the level of abstraction is low with GLM. By abstraction we mean that the fea-
ture is invariant to the variants of the same concept [2]. Replacing the GLM with a more potent
nonlinear function approximator can enhance the abstraction ability of the local model. The GLM
can achieve a good extent of abstraction when the samples of the latent concepts are linearly
separable, i.e., when the variants of the concepts all live on one side of the separation plane defined
by the GLM. Thus, conventional CNN implicitly makes the assumption that the latent concepts are
linearly separable.
However, the data for the same concept often live on a nonlinear manifold; therefore, the
representations that capture these concepts are generally highly nonlinear functions of the input.
In NIN, the GLM is replaced with a "micro network" structure, which is a general nonlinear function
approximator. In this work, we choose the multilayer perceptron [3] as the instantiation of the micro
network, since it is a universal function approximator and a neural network trainable by
back-propagation.
The resulting structure, which we call an mlpconv layer, is compared with CNN in Figure 1. Both the
linear convolutional layer and the mlpconv layer map the local receptive field to an output feature
vector. The mlpconv layer maps the input local patch to the output feature vector with a multilayer
perceptron (MLP) consisting of multiple fully connected layers with nonlinear activation functions.
The MLP is shared among all local receptive fields. The feature maps are obtained by sliding the MLP
over the input in a similar manner as CNN and are then fed into the next layer.
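A minimal sketch of an mlpconv layer is given below (assuming PyTorch; the class name MLPConv, the channel sizes, and the kernel size are illustrative assumptions rather than the paper's exact configuration). Because the MLP is shared among all local receptive fields, its fully connected layers can be realized as 1x1 convolutions stacked on top of an ordinary convolution:

import torch
import torch.nn as nn

class MLPConv(nn.Module):
    """One mlpconv layer: a shared micro network (MLP) slid over the input."""
    def __init__(self, in_channels, hidden1, hidden2, out_channels,
                 kernel_size=5, padding=2):
        super().__init__()
        self.net = nn.Sequential(
            # First MLP layer: linear map from each local patch to hidden1 units.
            nn.Conv2d(in_channels, hidden1, kernel_size, padding=padding),
            nn.ReLU(inplace=True),
            # Remaining MLP layers act per spatial position, i.e. 1x1 convolutions.
            nn.Conv2d(hidden1, hidden2, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden2, out_channels, kernel_size=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)  # output feature maps, fed into the next layer

# Example: sliding the micro network over a batch of 32x32 RGB inputs.
x = torch.randn(8, 3, 32, 32)
feature_maps = MLPConv(3, 192, 160, 96)(x)  # shape (8, 96, 32, 32)

The 1x1 convolutions make the per-position computation a full MLP while keeping the output a set of feature maps with the same spatial layout as those of a conventional convolutional layer.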