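Write a PyTorch-based handwritten English letter recognition system that trains on a GPU and visualizes the training results
Time: 2023-12-10 19:40:15 Views: 83
Sure. Below is a PyTorch-based handwritten English letter recognition system that trains on a GPU and visualizes the training results.
First, install the required libraries: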
```python
!pip install torch torchvision matplotlib
```
Next, load the dataset. MNIST contains only the digits 0-9, so it cannot supply letter samples. This example instead uses the `letters` split of EMNIST, available in torchvision as `datasets.EMNIST`, whose labels 1-26 correspond to the 26 letters.
```python
import torch
import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Load the EMNIST "letters" split (28x28 grayscale handwritten letters)
train_dataset = datasets.EMNIST(root='./data',
                                split='letters',
                                train=True,
                                transform=transforms.ToTensor(),
                                download=True)
test_dataset = datasets.EMNIST(root='./data',
                               split='letters',
                               train=False,
                               transform=transforms.ToTensor(),
                               download=True)

for dataset in (train_dataset, test_dataset):
    # Labels are 1-26; shift them to 0-25 so they match the 26 output units
    dataset.targets = dataset.targets - 1
    # EMNIST images are stored transposed; swap height and width so letters are upright
    dataset.data = dataset.data.transpose(1, 2).contiguous()

# Define the data loaders
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=64,
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=64,
                                          shuffle=False)
```
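Next, define the model. This is a simple convolutional neural network with two convolution layers and two fully connected layers.
```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.fc1 = nn.Linear(320, 50)   # 20 channels * 4 * 4 after two conv + pool stages
        self.fc2 = nn.Linear(50, 26)    # one output per letter

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax itself
        return self.fc2(x)

# Move the model to the GPU (requires a CUDA-capable device)
model = Net().cuda()
```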
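To see where the 320 in `fc1` comes from, the feature-map shapes can be traced on a dummy batch. This is a minimal sketch that rebuilds the two convolution layers on their own and runs on the CPU; the helper name `trace_shapes` is made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def trace_shapes(batch_size=4):
    """Return the tensor shape after each conv + pool stage for 28x28 input."""
    conv1 = nn.Conv2d(1, 10, kernel_size=5)
    conv2 = nn.Conv2d(10, 20, kernel_size=5)
    x = torch.zeros(batch_size, 1, 28, 28)
    shapes = [tuple(x.shape)]
    x = F.max_pool2d(conv1(x), 2)   # 28 -> 24 (5x5 conv) -> 12 (2x2 pool)
    shapes.append(tuple(x.shape))
    x = F.max_pool2d(conv2(x), 2)   # 12 -> 8 (5x5 conv) -> 4 (2x2 pool)
    shapes.append(tuple(x.shape))
    x = x.view(batch_size, -1)      # flatten: 20 channels * 4 * 4 = 320
    shapes.append(tuple(x.shape))
    return shapes

for shape in trace_shapes():
    print(shape)
```

If the input size or kernel sizes change, the flattened width changes with them, and `fc1` has to be adjusted to match.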
Then define the loss function and the optimizer.
```python
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
```
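`nn.CrossEntropyLoss` expects raw scores and applies log-softmax internally, so the model does not need its own softmax layer for training. The sketch below checks this on random tensors (the tensors are stand-ins, not real model output) and also shows that passing log-probabilities instead of logits gives the same loss value, since log-softmax leaves log-probabilities unchanged.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 26)             # fake scores: 8 samples, 26 letters
targets = torch.randint(0, 26, (8,))    # fake class indices in [0, 25]

# CrossEntropyLoss on raw logits
ce = nn.CrossEntropyLoss()(logits, targets)
# The same quantity computed explicitly as log-softmax followed by NLL loss
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)
# CrossEntropyLoss fed with log-probabilities instead of logits
ce_on_logprobs = nn.CrossEntropyLoss()(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(ce, nll), torch.allclose(ce, ce_on_logprobs))
```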
Next, define the training procedure.
```python
def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # Move the batch to the GPU
        data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        # Report progress every 100 batches
        if batch_idx % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
```
Then define the evaluation procedure. It returns the average per-sample loss and the accuracy (in percent) on the test set.
```python
def test():
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.cuda(), target.cuda()
            output = model(data)
            # criterion returns the batch mean, so weight it by the batch size
            test_loss += criterion(output, target).item() * data.size(0)
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)
    accuracy = 100. * correct / len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset), accuracy))
    return test_loss, accuracy
```
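Overall accuracy can hide letters that are often confused with one another. As a possible extension, per-letter accuracy can be computed from the predictions and labels. The helper `per_class_accuracy` below is a hypothetical addition, shown on a tiny hand-made example with three classes rather than on real model output.

```python
import torch

def per_class_accuracy(preds, targets, num_classes=26):
    """Return the accuracy for each class (None if the class has no samples)."""
    result = []
    for c in range(num_classes):
        mask = targets == c          # samples whose true label is class c
        n = int(mask.sum())
        if n == 0:
            result.append(None)
        else:
            result.append(float((preds[mask] == c).sum()) / n)
    return result

# Toy data: predicted and true class indices for six samples
preds = torch.tensor([0, 0, 1, 2, 2, 2])
targets = torch.tensor([0, 1, 1, 2, 2, 0])
acc = per_class_accuracy(preds, targets, num_classes=3)
for idx, value in enumerate(acc):
    print(chr(idx + 65), value)
```

In the evaluation loop, `preds` and `targets` would be the concatenation of each batch's `output.argmax(dim=1)` and `target`, moved to the CPU.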
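Now start training. We train for 10 epochs, record the training loss, test loss, and test accuracy after each epoch, and plot them at the end. The training loss is computed batch by batch without gradients, so the whole training set never has to fit on the GPU at once.
```python
import matplotlib.pyplot as plt

def average_loss(loader):
    """Mean per-sample loss of the current model over a data loader."""
    model.eval()
    total = 0.0
    with torch.no_grad():
        for data, target in loader:
            data, target = data.cuda(), target.cuda()
            total += criterion(model(data), target).item() * data.size(0)
    return total / len(loader.dataset)

train_losses = []
test_losses = []
test_accs = []
for epoch in range(1, 11):
    train(epoch)
    test_loss, test_acc = test()
    train_losses.append(average_loss(train_loader))
    test_losses.append(test_loss)
    test_accs.append(test_acc)

# Loss curves
plt.plot(train_losses, label='Training loss')
plt.plot(test_losses, label='Test loss')
plt.xlabel('Epoch')
plt.legend()
plt.show()

# Accuracy curve
plt.plot(test_accs, label='Test accuracy')
plt.xlabel('Epoch')
plt.legend()
plt.show()
```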
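Finally, the trained model can be used to recognize a handwritten letter from an image file:
```python
from PIL import Image

# Load the image, convert it to grayscale, and resize it to the 28x28 input the network expects
img = Image.open('letter.png').convert('L').resize((28, 28))
# Convert the image to a tensor and add a batch dimension: shape (1, 1, 28, 28)
img_tensor = transforms.ToTensor()(img).unsqueeze(0).cuda()
# Run the model in evaluation mode without tracking gradients
model.eval()
with torch.no_grad():
    output = model(img_tensor)
# Map the class index 0-25 to 'A'-'Z'
pred = chr(output.argmax().item() + 65)
print('Prediction: {}'.format(pred))
```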
That completes a PyTorch-based handwritten English letter recognition system that trains on a GPU and visualizes the training results.