ResNet Model Architecture Explained
1. The Core Concept of ResNet: Residual Learning
Traditional deep neural networks suffer from a degradation problem as layers are added: as network depth increases, accuracy saturates and then drops. To address this, ResNet introduces the concept of residual learning[^3].
A residual block uses a skip connection so that each layer can focus on fitting the residual mapping F(x) = H(x) - x between input and output, rather than fitting the target function H(x) directly, which simplifies optimization. This design not only helps mitigate vanishing/exploding gradients but also makes deeper networks easier to train.
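The idea can be sketched in a few lines of framework-free Python (the `residual_block` helper below is purely illustrative, not from any library):

```python
# Minimal sketch of residual learning: instead of fitting H(x) directly,
# the block fits the residual F(x) = H(x) - x and adds the input back.
def residual_block(x, F):
    """H(x) = F(x) + x — the skip connection restores the input."""
    return F(x) + x

# If the optimal mapping is close to the identity, F only has to push its
# output toward zero — an easier target than learning H from scratch.
out = residual_block(5.0, lambda x: 0.0)  # F ≡ 0 gives H(x) = x, so out == 5.0
```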
2. Basic Building Blocks: BasicBlock and Bottleneck
ResNets of different depths are built from two main types of building units:
BasicBlock: used in the shallower variants (ResNet-18 and ResNet-34). It consists of two consecutive standard 3x3 convolutional layers, each with batch normalization and a ReLU activation.
```python
import torch.nn as nn

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(planes, planes * self.expansion, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes * self.expansion)
        self.downsample = downsample

    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity  # the skip connection
        out = self.relu(out)
        return out
```
Bottleneck: used in the deeper variants (ResNet-50 and above). It adopts a "bottleneck" strategy, reducing the number of feature maps in the middle to lower the computational cost: a 1x1 convolution first reduces the channel dimension, a 3x3 convolution operates at the reduced width, and a final 1x1 convolution expands the channels back.
```python
class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, in_planes, planes, stride=1, downsample=None):
        super().__init__()
        # 1x1 conv: reduce the channel dimension
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        # 3x3 conv at the reduced width
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        # 1x1 conv: expand channels back by a factor of `expansion`
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1,
                               bias=False)
        self.bn3 = nn.BatchNorm2d(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)
        out = self.conv3(out)
        out = self.bn3(out)
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity
        out = self.relu(out)
        return out
```
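To see why the bottleneck saves computation, compare bias-free weight counts at 256 input/output channels (simple arithmetic, matching the example in the original paper):

```python
# Parameter count of a bias-free conv layer: in_ch * out_ch * k * k
def conv_params(in_ch, out_ch, k):
    return in_ch * out_ch * k * k

# Plain design: two 3x3 convs at 256 channels.
plain = conv_params(256, 256, 3) + conv_params(256, 256, 3)

# Bottleneck: 1x1 reduce to 64, 3x3 at 64, 1x1 expand back to 256.
bottleneck = (conv_params(256, 64, 1)
              + conv_params(64, 64, 3)
              + conv_params(64, 256, 1))

print(plain, bottleneck)  # 1179648 vs 69632 — roughly 17x fewer weights
```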
3. The Downsampling Mechanism
When the spatial size of a feature map needs to change, ResNet downsamples with a stride > 1 convolution[^1]. To keep shapes consistent, the identity shortcut must then also adjust its channel count and/or spatial resolution, which is typically done with an additional 1x1 convolutional layer on the shortcut path.
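The shape bookkeeping is ordinary convolution arithmetic; a small sketch (the helper is hypothetical, the formula is the standard conv output-size rule):

```python
# Output spatial size of a convolution: floor((H + 2p - k) / s) + 1
def conv_out_size(h, k, s, p):
    return (h + 2 * p - k) // s + 1

# Main path: a 3x3 conv with stride 2, padding 1 halves a 56x56 feature map.
main = conv_out_size(56, k=3, s=2, p=1)      # 56 -> 28

# Shortcut path: a 1x1 conv with the same stride 2 (no padding) matches the
# spatial size, while its out_channels match the main path's channel count.
shortcut = conv_out_size(56, k=1, s=2, p=0)  # 56 -> 28
```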
4. Overall Architecture
The full ResNet consists of several stages: within each stage, identically configured basic or bottleneck blocks are stacked repeatedly to form a layer group, while downsampling may occur between adjacent stages. The final layer is typically a fully connected layer for classification.
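As a concrete illustration, the per-stage block counts from the original paper let you recover each variant's nominal depth (1 stem conv + the weighted conv layers in all four stages + 1 fully connected layer):

```python
# Blocks per stage for common ResNet variants.
configs = {
    "resnet18": ("basic", [2, 2, 2, 2]),       # 2 convs per BasicBlock
    "resnet34": ("basic", [3, 4, 6, 3]),
    "resnet50": ("bottleneck", [3, 4, 6, 3]),  # 3 convs per Bottleneck
}

def depth(kind, blocks):
    convs_per_block = 2 if kind == "basic" else 3
    # 1 stem conv + conv layers in all stages + 1 fully connected layer
    return 1 + convs_per_block * sum(blocks) + 1

for name, (kind, blocks) in configs.items():
    print(name, depth(kind, blocks))  # 18, 34, 50 — the names are the depths
```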