MNIST Handwritten Digit Recognition
Posted: 2024-12-27 14:27:43
### Implementing MNIST Handwritten Digit Recognition
#### Loading the Required Libraries
To build and run a simple convolutional neural network (CNN) for MNIST handwritten digit classification, first load a few basic Python libraries along with the PyTorch framework.
```python
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.optim as optim
```
#### Preparing the Dataset
The MNIST dataset contains 70,000 grayscale images, each 28×28 pixels, representing handwritten digits in ten classes from 0 to 9. The images are split into two parts: a training set of 60,000 samples and a test set of 10,000 samples.
Preprocessing the input data matters: `ToTensor` converts each image into a Tensor with pixel values scaled to [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 then shifts that range to [-1, 1]:
```python
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
batch_size = 64
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)
```
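The `Normalize` step applies `(x - mean) / std` per channel; with mean and standard deviation both 0.5, the [0, 1] output of `ToTensor` is mapped onto [-1, 1]. A minimal arithmetic check of that mapping (plain Python, no dataset download needed):

```python
# Normalize((0.5,), (0.5,)) applies (x - mean) / std to each pixel value.
def normalize(x, mean=0.5, std=0.5):
    return (x - mean) / std

print(normalize(0.0))  # darkest pixel after ToTensor -> -1.0
print(normalize(1.0))  # brightest pixel -> 1.0
```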
#### Building the Model
Below is a very basic CNN architecture: two convolutional layers followed by two fully connected layers.
```python
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        # Convolutional layers: a 5x5 kernel with padding=2 preserves spatial size
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=2)
        # Fully connected layers: 32 feature maps of 7x7 remain after two poolings
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # (N, 16, 14, 14)
        x = self.pool(torch.relu(self.conv2(x)))  # (N, 32, 7, 7)
        x = x.view(-1, 32 * 7 * 7)                # flatten for the linear layers
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)                           # raw logits; CrossEntropyLoss applies softmax
        return x

model = SimpleCNN()
print(model)
```
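The `32 * 7 * 7` input size of `fc1` follows from the layer arithmetic: each convolution uses a 5×5 kernel with padding 2 and stride 1 (size-preserving), and each 2×2 max pool halves the spatial size. A small sketch of that trace:

```python
# Output size of a conv layer: (size + 2*padding - kernel) // stride + 1.
def conv_out(size, kernel=5, stride=1, padding=2):
    return (size + 2 * padding - kernel) // stride + 1

size = 28
size = conv_out(size) // 2  # conv1 (28 -> 28) then 2x2 pool -> 14
size = conv_out(size) // 2  # conv2 (14 -> 14) then 2x2 pool -> 7
print(size, 32 * size * size)  # 7 1568, matching nn.Linear(32 * 7 * 7, 128)
```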
#### Training
Use cross-entropy (`CrossEntropyLoss`) as the loss function and stochastic gradient descent (SGD) as the optimizer. Iterate over the training set for several epochs, updating the weights until performance is satisfactory.
```python
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
num_epochs = 5
for epoch in range(num_epochs):
    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data
        optimizer.zero_grad()  # clear gradients from the previous step
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 100 == 99:  # report the average loss every 100 mini-batches
            print(f'Epoch [{epoch + 1}/{num_epochs}], Step [{i + 1}], Loss: {running_loss / 100:.4f}')
            running_loss = 0.0
print('Finished Training')
```
#### Evaluation
Once training is complete, the model can be used for prediction. To measure how well it generalizes, evaluate it on data it has never seen before: the held-out test set.
```python
model.eval()  # switch to evaluation mode
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed during evaluation
    for data in test_loader:
        images, labels = data
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest logit per image
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
accuracy = 100 * correct / total
print(f'Test Accuracy of the network on the 10000 test images: {accuracy:.2f}%')
```
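In the loop above, `torch.max(outputs, 1)` returns both the maximum logit and its index along the class dimension; that index is the predicted digit. The same argmax idea, shown with a hypothetical list of 10 logits for a single image (not real model output):

```python
# Hypothetical logits for one image (10 classes, digits 0-9).
logits = [0.1, -2.0, 0.3, 5.2, 0.0, 1.1, -0.5, 0.2, 0.9, 0.4]
predicted = max(range(len(logits)), key=lambda c: logits[c])
print(predicted)  # -> 3, since class 3 has the largest logit
```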