Using a deep convolutional neural network (either built from scratch or a pretrained model such as VGG, GoogLeNet, ResNet, or DenseNet), classify the CIFAR-10 image dataset (https://www.cs.toronto.edu/~kriz/cifar.html). The dataset contains 50,000 training images and 10,000 test images. Requirements: split 10,000 images off the training set as a validation set for model selection; Kaggle's free GPU may be used for training. Plot accuracy vs. epochs on the training and test sets, and loss vs. epochs on the validation set, for a total of 3 plots.
Posted: 2023-06-11 07:08:49
Sure. This task calls for a deep learning framework; I'll use PyTorch.
First, download the CIFAR-10 dataset:
```python
import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Augmentation is applied to the training set only; the test set
# should get plain normalization, not random flips and crops.
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),  # data augmentation
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

trainset = datasets.CIFAR10(root='./data', train=True, download=True, transform=train_transform)
testset = datasets.CIFAR10(root='./data', train=False, download=True, transform=test_transform)
```
Here the random flips and crops augment the training data to improve the model's generalization.
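As a quick sanity check on the normalization above: `transforms.Normalize` computes `(x - mean) / std` per channel, so with mean and std both 0.5 the `ToTensor` output range [0, 1] is mapped to [-1, 1]. A minimal sketch of the arithmetic:

```python
def normalize(pixel, mean=0.5, std=0.5):
    # mirrors what transforms.Normalize does per channel: (x - mean) / std
    return (pixel - mean) / std

# the extremes of the [0, 1] pixel range map to -1 and 1
print(normalize(0.0), normalize(1.0))  # -1.0 1.0
```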
Next, split part of the training set off as a validation set, using PyTorch's `SubsetRandomSampler`:
```python
import numpy as np
import torch.utils.data as data

num_train = len(trainset)
indices = list(range(num_train))
split = int(num_train * 0.2)  # 20% of the training data (10,000 images) for validation
np.random.shuffle(indices)
train_idx, valid_idx = indices[split:], indices[:split]

train_sampler = data.sampler.SubsetRandomSampler(train_idx)
valid_sampler = data.sampler.SubsetRandomSampler(valid_idx)

train_loader = data.DataLoader(trainset, batch_size=128, sampler=train_sampler, num_workers=4)
valid_loader = data.DataLoader(trainset, batch_size=128, sampler=valid_sampler, num_workers=4)
test_loader = data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=4)
```
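One subtlety worth guarding against: the split should be reproducible across runs (the code above does not fix a seed) and the two index sets must be disjoint. A small self-contained sketch of the same indexing logic, using the standard library's `random` in place of `np.random` so it runs on its own:

```python
import random

num_train = 50000              # size of the CIFAR-10 training set
rng = random.Random(0)         # fixed seed so the split is reproducible across runs
indices = list(range(num_train))
rng.shuffle(indices)
split = int(num_train * 0.2)   # 20% held out: 10,000 validation images
train_idx, valid_idx = indices[split:], indices[:split]

print(len(train_idx), len(valid_idx))            # 40000 10000
print(set(train_idx) & set(valid_idx) == set())  # True: the subsets are disjoint
```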
Next, define a convolutional neural network; here we use ResNet-18:
```python
import torch.nn as nn
import torchvision.models as models

class ResNet(nn.Module):
    def __init__(self):
        super(ResNet, self).__init__()
        # train a ResNet-18 from scratch with a 10-way output for CIFAR-10
        self.resnet = models.resnet18(pretrained=False, num_classes=10)

    def forward(self, x):
        return self.resnet(x)

model = ResNet().cuda()
```
With the model defined, set up the loss function and optimizer:
```python
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
```
Now we can train the model:
```python
import torch

num_epochs = 20
train_loss, valid_loss = [], []
train_acc, valid_acc = [], []

for epoch in range(num_epochs):
    # ---- training phase ----
    model.train()
    running_loss = 0.0
    running_corrects = 0
    for inputs, labels in train_loader:
        inputs, labels = inputs.cuda(), labels.cuda()
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        _, preds = torch.max(outputs, 1)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * inputs.size(0)
        running_corrects += torch.sum(preds == labels.data)
    train_loss.append(running_loss / len(train_sampler))
    train_acc.append(running_corrects.item() / len(train_sampler))

    # ---- validation phase ----
    model.eval()
    running_loss = 0.0
    running_corrects = 0
    with torch.no_grad():
        for inputs, labels in valid_loader:
            inputs, labels = inputs.cuda(), labels.cuda()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            _, preds = torch.max(outputs, 1)
            running_loss += loss.item() * inputs.size(0)
            running_corrects += torch.sum(preds == labels.data)
    valid_loss.append(running_loss / len(valid_sampler))
    valid_acc.append(running_corrects.item() / len(valid_sampler))

    print('Epoch [{}/{}], Train Loss: {:.4f}, Train Acc: {:.4f}, Valid Loss: {:.4f}, Valid Acc: {:.4f}'
          .format(epoch + 1, num_epochs, train_loss[-1], train_acc[-1], valid_loss[-1], valid_acc[-1]))
```
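Since the assignment uses the validation set for model selection, it is worth tracking the best validation accuracy across epochs and saving that checkpoint (e.g. with `torch.save(model.state_dict(), 'best_model.pt')` inside the loop). The selection logic itself is just a running maximum; a sketch with illustrative accuracy values (not real training results):

```python
# hypothetical per-epoch validation accuracies -- illustrative numbers only
valid_acc = [0.62, 0.71, 0.78, 0.76, 0.80, 0.79]

best_acc, best_epoch = 0.0, -1
for epoch, acc in enumerate(valid_acc):
    if acc > best_acc:
        best_acc, best_epoch = acc, epoch
        # in the real training loop, save the weights at this point:
        # torch.save(model.state_dict(), 'best_model.pt')

print(best_epoch, best_acc)  # 4 0.8
```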
Finally, plot accuracy vs. epochs for the training and validation sets:
```python
import matplotlib.pyplot as plt
plt.plot(train_acc, label='Train')
plt.plot(valid_acc, label='Valid')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
```
And plot loss vs. epochs on the validation set:
```python
plt.plot(train_loss, label='Train')
plt.plot(valid_loss, label='Valid')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
```
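Note that the assignment also asks for test-set accuracy, which the loop above never computes; an evaluation pass over `test_loader`, structured exactly like the validation phase, is needed (per epoch, or at the end with the best checkpoint). The aggregation itself is just batch-wise counting; a minimal self-contained sketch of that logic, with toy prediction/label pairs standing in for real model output:

```python
def batch_accuracy(batches):
    # mirrors the validation loop: count correct predictions across batches,
    # then divide by the total number of examples
    correct = total = 0
    for preds, labels in batches:
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total

# two toy batches of (predicted class, true class) lists -- illustrative only
batches = [([1, 2, 3], [1, 2, 0]), ([0, 0], [0, 1])]
print(batch_accuracy(batches))  # 3 correct out of 5 -> 0.6
```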
Putting the pieces above together gives the complete script.