Training on your own dataset with PyTorch
Posted: 2023-10-08 22:08:35 Views: 110
Loading your own dataset and training on it in PyTorch takes the following steps:
1. Import the required libraries and modules:
```python
import torch
import torchvision
from torchvision import datasets, transforms
```
2. Create a custom dataset class:
```python
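class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, root, transform=None):
        self.root = root  # directory the data is read from
        self.data = ...  # load your own data here
        self.labels = ...  # load your own labels here
        self.transform = transform

    def __getitem__(self, index):
        # Return one (sample, label) pair, applying the transform if given
        img, target = self.data[index], self.labels[index]
        if self.transform is not None:
            img = self.transform(img)
        return img, target

    def __len__(self):
        # Number of samples; DataLoader uses this to build batches
        return len(self.data)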
```
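The `...` placeholders above depend on how your data is stored. As one possible illustration, here is a minimal sketch that assumes the images and labels already sit in memory as tensors. The class name `InMemoryImageDataset` and the synthetic 28x28 grayscale data are hypothetical and only stand in for real data.

```python
import torch
from torch.utils.data import Dataset


class InMemoryImageDataset(Dataset):
    """Hypothetical dataset that keeps images and labels as in-memory tensors."""

    def __init__(self, images, labels, transform=None):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels
        self.transform = transform

    def __getitem__(self, index):
        img, target = self.images[index], self.labels[index]
        if self.transform is not None:
            img = self.transform(img)
        return img, target

    def __len__(self):
        return len(self.images)


# Synthetic stand-in data (an assumption): 8 grayscale 28x28 images, 3 classes.
demo_images = torch.rand(8, 1, 28, 28)
demo_labels = torch.randint(0, 3, (8,))
demo_dataset = InMemoryImageDataset(demo_images, demo_labels)
sample_img, sample_label = demo_dataset[0]
```

For data stored on disk, `__init__` would typically collect file paths and `__getitem__` would open one file per call, so that the whole dataset never has to fit in memory.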
3. Preprocess and transform the data:
```python
transform = transforms.Compose([
    transforms.ToTensor(),  # PIL image / uint8 array -> float tensor in [0, 1]
    transforms.Normalize(mean=[0.5], std=[0.5]),  # one value per channel (single-channel here)
])
```
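To see what these two transforms do to pixel values, the arithmetic can be reproduced with plain tensors. This is only a sketch of the math, not the torchvision implementation; the helper `normalize` is a hypothetical stand-in for `transforms.Normalize`, which computes `(x - mean) / std` per channel.

```python
import torch


def normalize(x, mean=0.5, std=0.5):
    # Same arithmetic as transforms.Normalize for a single channel
    return (x - mean) / std


# ToTensor scales 8-bit pixel values from [0, 255] to [0, 1]
raw_pixels = torch.tensor([0, 255], dtype=torch.uint8)
scaled = raw_pixels.float() / 255

# With mean=0.5 and std=0.5, the range [0, 1] is mapped to [-1, 1]
normalized = normalize(torch.tensor([0.0, 0.5, 1.0]))
```

For RGB images, pass one mean and one std per channel, for example three values each.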
4. Load the training and test datasets:
```python
train_dataset = CustomDataset(root='path_to_train_data', transform=transform)
test_dataset = CustomDataset(root='path_to_test_data', transform=transform)
batch_size = 32
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
```
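It can help to check what a `DataLoader` actually yields before training. The sketch below uses `TensorDataset` with synthetic data (an assumption, chosen so the snippet runs without any files) and records the shape of every batch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 70 synthetic grayscale images, so the last batch is smaller than batch_size
demo_set = TensorDataset(torch.rand(70, 1, 28, 28), torch.randint(0, 3, (70,)))
demo_loader = DataLoader(demo_set, batch_size=32, shuffle=True)

# Each iteration yields an (images, labels) pair stacked along dimension 0
batch_shapes = [tuple(images.shape) for images, _ in demo_loader]
```

Here the loader produces three batches of 32, 32, and 6 samples. Passing `drop_last=True` would discard the incomplete final batch, which is sometimes preferred during training.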
5. Define the model architecture and loss function:
```python
model = ...  # define your model architecture here
criterion = torch.nn.CrossEntropyLoss()  # loss function; expects raw logits and integer class labels
```
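The model itself is left open above. As a purely illustrative example, a small convolutional classifier might look like the sketch below. The class `SmallCNN`, its layer sizes, the 1x28x28 input shape, and the three classes are all assumptions, not part of the original recipe.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Hypothetical classifier for 1x28x28 inputs."""

    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),  # 1x28x28 -> 8x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),  # 8x28x28 -> 8x14x14
        )
        self.classifier = nn.Linear(8 * 14 * 14, num_classes)

    def forward(self, x):
        x = self.features(x)
        # Flatten everything except the batch dimension before the linear layer
        return self.classifier(torch.flatten(x, 1))


demo_model = SmallCNN(num_classes=3)
demo_logits = demo_model(torch.rand(4, 1, 28, 28))
# CrossEntropyLoss takes logits of shape (N, C) and integer targets of shape (N,)
demo_loss = nn.CrossEntropyLoss()(demo_logits, torch.tensor([0, 1, 2, 0]))
```

Note that the model returns raw logits with no softmax layer, because `CrossEntropyLoss` applies log-softmax internally.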
6. Define the optimizer and the training loop:
```python
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # define the optimizer
num_epochs = 10
model.train()  # enable training behavior (dropout, batch-norm updates)
for epoch in range(num_epochs):
    for images, labels in train_loader:
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        # Backward pass and optimization step
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```
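In practice the loop usually also moves data to the chosen device and tracks the loss so progress is visible. The following is a minimal end-to-end sketch under stated assumptions: the data is synthetic, the model is a single linear layer, and the names `net`, `loss_fn`, `opt`, and `epoch_losses` plus the learning rate are chosen only for this toy problem.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic two-class data (an assumption): the label depends on the first feature
features = torch.randn(256, 10)
targets = (features[:, 0] > 0).long()
loader = DataLoader(TensorDataset(features, targets), batch_size=32, shuffle=True)

net = nn.Linear(10, 2).to(device)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(net.parameters(), lr=0.05)

epoch_losses = []
for epoch in range(5):
    net.train()
    running = 0.0
    for x, y in loader:
        # Inputs must live on the same device as the model parameters
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        opt.step()
        # Weight by batch size so the epoch value is a per-sample average
        running += loss.item() * x.size(0)
    epoch_losses.append(running / len(loader.dataset))
```

Printing `epoch_losses` after each epoch is a quick check that the loss is going down; if it is not, the learning rate or the data pipeline is a common place to look.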
7. Evaluate the model on the test set:
```python
model.eval()  # switch the model to evaluation mode
total_correct = 0
total_samples = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in test_loader:
        outputs = model(images)
        predicted = outputs.argmax(dim=1)  # class with the highest logit
        total_samples += labels.size(0)
        total_correct += (predicted == labels).sum().item()
accuracy = total_correct / total_samples
print('Test accuracy: {:.2f}%'.format(100 * accuracy))
```
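If evaluation is needed in more than one place, for example after every epoch, the same logic can be wrapped in a function. The helper `evaluate` below is a hypothetical sketch. The toy check uses `nn.Identity` as the "model" so that the logits equal the inputs and the expected accuracy can be worked out by hand.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, device="cpu"):
    """Return the fraction of samples in loader that model classifies correctly."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            predicted = model(images).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Toy check: predictions are the argmax of each input row -> [0, 1, 0, 1]
inputs = torch.tensor([[2.0, 1.0], [0.0, 3.0], [5.0, 4.0], [1.0, 2.0]])
labels = torch.tensor([0, 1, 1, 1])
toy_loader = DataLoader(TensorDataset(inputs, labels), batch_size=2)
demo_accuracy = evaluate(nn.Identity(), toy_loader)  # 3 of 4 correct
```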