Deep Learning: Convolutional Neural Network (CNN) Architectures Explained


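This PDF covers the concepts and architecture of convolutional neural networks (Convolutional Neural Networks, CNNs) and their application to image processing. It describes the basic building blocks, layer types, and design patterns of CNNs, and discusses computational considerations and several classic network architectures.

CNNs are very similar to ordinary neural networks: they are made up of neurons with learnable weights and biases. Each neuron receives some inputs, performs a dot product, and optionally follows it with a non-linearity. The whole network still expresses a single differentiable score function from raw image pixels to class scores, and it still has a loss function (e.g. SVM or softmax) on the last (fully-connected) layer, which is used for training.

What sets CNNs apart is the explicit assumption that the inputs are images. This assumption lets the architecture exploit the spatially local structure of images and share parameters across positions, which greatly reduces the number of parameters. The key components of a CNN architecture are:

1. **Convolutional Layer**: slides a set of filters (kernels) over the input volume to extract features. Each filter produces one 2-dimensional activation map, i.e. one depth slice of the output. Different filters can respond to different features, such as edges or textures.
2. **Pooling Layer**: reduces the spatial size of the data, which cuts computation and helps control overfitting. Common choices are max pooling and average pooling.
3. **Normalization Layer**: normalizes neuron activations with the aim of speeding up learning and improving stability.
4. **Fully-Connected Layer**: one or more fully-connected layers usually follow the convolutional layers and map the extracted features to the final class scores.
5. **Converting fully-connected layers to convolutional layers**: because both layer types compute dot products, a fully-connected layer can be rewritten as an equivalent convolutional layer. The conversion does not reduce the number of parameters; its benefit is that the network can then be evaluated efficiently at many spatial positions of a larger image in a single forward pass. The PDF discusses this in connection with architectures such as AlexNet.

The PDF also discusses common patterns in CNN architectures, including how layers are ordered and sized, and case studies of classic networks such as LeNet, AlexNet, ZFNet, GoogLeNet, and VGGNet. These networks differ in depth, width, filter size, and stride, and show how such choices affect performance. The section on computational considerations likely covers the trade-offs between computational efficiency, memory usage, and model complexity, which matter in practice, especially with large image datasets. Finally, the PDF lists additional references for further study. Overall, it serves as an introduction to CNNs, from basic principles to practical use.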

higher layers of the network. Now, we will have an entire set of filters in each CONV layer (e.g. 12 filters), and each of them will produce a separate 2-dimensional activation map. We will stack these activation maps along the depth dimension and produce the output volume.
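
The stacking of activation maps can be illustrated with a minimal pure-Python sketch. It assumes stride 1 and no zero-padding, neither of which is specified in this excerpt, and the names `conv_forward` and `shape` as well as the toy sizes are hypothetical choices for illustration. Each of the 12 filters produces one 2-dimensional map, and the maps end up side by side along the depth axis of the output.

```python
import random

def conv_forward(volume, filters, biases):
    """Naive CONV layer (assumed: stride 1, no zero-padding).

    volume:  nested list indexed [y][x][d], shape H x W x D
    filters: list of K filters, each indexed [fy][fx][d], shape F x F x D
    biases:  list of K floats
    returns: nested list indexed [y][x][k], shape (H-F+1) x (W-F+1) x K
    """
    H, W, D = len(volume), len(volume[0]), len(volume[0][0])
    F = len(filters[0])
    out_h, out_w = H - F + 1, W - F + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            column = []  # one entry per filter: the depth axis of the output
            for filt, b in zip(filters, biases):
                total = b
                for fy in range(F):
                    for fx in range(F):
                        for d in range(D):
                            total += filt[fy][fx][d] * volume[y + fy][x + fx][d]
                column.append(total)
            row.append(column)
        output.append(row)
    return output

def shape(v):
    """Shape of a 3-level nested list as (height, width, depth)."""
    return (len(v), len(v[0]), len(v[0][0]))

random.seed(0)
H, W, D, F, K = 8, 8, 3, 5, 12  # toy sizes; K = 12 filters as in the text
volume = [[[random.random() for _ in range(D)] for _ in range(W)]
          for _ in range(H)]
filters = [[[[random.gauss(0, 0.1) for _ in range(D)] for _ in range(F)]
            for _ in range(F)] for _ in range(K)]
biases = [0.0] * K

out = conv_forward(volume, filters, biases)
print(shape(out))  # (4, 4, 12): 12 activation maps stacked along depth
```

Every output entry at a given depth index `k` is computed with the same filter `filters[k]`, which is the parameter sharing described in the text.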

The brain view. If you’re a fan of the brain/neuron analogies, every entry in the 3D output volume can also be interpreted as an output of a neuron that looks at only a small region in the input and shares parameters with all neurons to the left and right spatially (since these numbers all result from applying the same filter). We now discuss the details of the neuron connectivities, their arrangement in space, and their parameter sharing scheme.

Local Connectivity. When dealing with high-dimensional inputs such as images, as we saw above, it is impractical to connect neurons to all neurons in the previous volume. Instead, we will connect each neuron to only a local region of the input volume. The spatial extent of this connectivity is a hyperparameter called the receptive field of the neuron (equivalently, this is the filter size). The extent of the connectivity along the depth axis is always equal to the depth of the input volume. It is important to emphasize again this asymmetry in how we treat the spatial dimensions (width and height) and the depth dimension: the connections are local in space (along width and height), but always full along the entire depth of the input volume.
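
As a rough illustration of this asymmetry, the sketch below extracts the region that a single neuron is connected to. The function names `receptive_field` and `neuron_output` and the toy 6x6x4 volume are assumptions made for this example, not part of the original notes. The patch is cut out along height and width but keeps every depth channel.

```python
def receptive_field(volume, y, x, F):
    """Return the F x F x D patch whose top-left corner is (y, x).

    volume is a nested list indexed [row][col][depth]. The patch is
    local along height and width but keeps every depth channel.
    """
    return [[list(volume[y + dy][x + dx]) for dx in range(F)]
            for dy in range(F)]

def neuron_output(patch, weights, bias):
    """Dot product of one neuron's weights with its patch, plus the bias."""
    total = bias
    for patch_row, weight_row in zip(patch, weights):
        for patch_cell, weight_cell in zip(patch_row, weight_row):
            for p, w in zip(patch_cell, weight_cell):
                total += p * w
    return total

# Toy 6x6x4 volume; each value encodes its position as 100*row + 10*col + depth.
volume = [[[100 * r + 10 * c + d for d in range(4)] for c in range(6)]
          for r in range(6)]

patch = receptive_field(volume, y=2, x=1, F=3)
patch_shape = (len(patch), len(patch[0]), len(patch[0][0]))
print(patch_shape)   # (3, 3, 4): 3x3 in space, all 4 depth channels
print(patch[0][0])   # [210, 211, 212, 213]: full depth at row 2, col 1
```

A neuron at this position would hold one weight per patch entry, so `neuron_output(patch, weights, bias)` needs `weights` of the same 3x3x4 shape.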

Example 1. Suppose that the input volume has size [32x32x3] (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.

Example 2. Suppose an input volume had size [16x16x20]. Then using an example receptive field size of 3x3, every neuron in the Conv Layer would now have a total of 3*3*20 = 180 connections to the input volume. Notice that, again, the connectivity is local in space (e.g. 3x3), but full along the input depth (20).
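
The arithmetic in both examples can be checked with a short sketch. The helper names `weights_per_neuron` and `fully_connected_weights` are hypothetical; the second one is included only to contrast local connectivity with a neuron wired to every entry of the input volume.

```python
def weights_per_neuron(filter_size, input_depth):
    """Weights of one locally connected neuron: F * F * D (bias not counted)."""
    return filter_size * filter_size * input_depth

def fully_connected_weights(width, height, depth):
    """Weights of one neuron connected to the whole input volume."""
    return width * height * depth

# Example 1: [32x32x3] input, 5x5 receptive field
example1 = weights_per_neuron(5, 3)
print(example1, "weights, plus 1 bias")   # 75 weights, plus 1 bias
print(fully_connected_weights(32, 32, 3)) # 3072 if connected to everything

# Example 2: [16x16x20] input, 3x3 receptive field
example2 = weights_per_neuron(3, 20)
print(example2)                            # 180
print(fully_connected_weights(16, 16, 20)) # 5120 if connected to everything
```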
