Transfer learning from the MNIST dataset to the FashionMNIST dataset
Posted: 2024-01-21 13:05:09
Transfer learning lets us start from a model already trained on one task and quickly train an effective model for another task. Here, we can take a model trained on the MNIST dataset and, via transfer learning, train a model that performs well on the FashionMNIST dataset.
The concrete procedure is as follows:
1. Load the pretrained MNIST model and add a fully connected layer suited to FashionMNIST after its last layer (the output layer should have as many neurons as FashionMNIST has classes, i.e. 10).
2. Freeze all layers of the MNIST model and train only the newly added fully connected layer. This helps prevent overfitting on the new task.
3. Train the new fully connected layer until the model's performance on FashionMNIST starts to converge.
4. Unfreeze the earlier layers of the MNIST model and retrain the whole model until performance improves further.
5. Fine-tune the entire model to push performance up further.
Note that the effectiveness of transfer learning depends heavily on how similar the source task is to the target task, so the pretraining dataset should be as close to the target task as possible. (MNIST and FashionMNIST are a good match: both are 28x28 grayscale images with 10 classes.)
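The freeze-then-unfreeze schedule in steps 2-4 can be sketched in PyTorch. This is a minimal illustration on a toy two-layer model (the layer sizes are arbitrary, not tied to any particular architecture); it also shows a common refinement, giving the pretrained layers a smaller learning rate during fine-tuning:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Toy backbone + new head, standing in for a pretrained model and its
# task-specific output layer
backbone = nn.Sequential(nn.Linear(784, 84), nn.ReLU())
head = nn.Linear(84, 10)  # 10 outputs, one per FashionMNIST class

# Step 2: freeze the pretrained layers and train only the new head
for param in backbone.parameters():
    param.requires_grad = False
optimizer = optim.Adam(head.parameters(), lr=1e-3)

# Step 4: unfreeze everything and fine-tune end to end, with a smaller
# learning rate for the pretrained layers so their features drift slowly
for param in backbone.parameters():
    param.requires_grad = True
optimizer = optim.Adam([
    {'params': backbone.parameters(), 'lr': 1e-4},
    {'params': head.parameters(), 'lr': 1e-3},
])
```

The two optimizer definitions correspond to the two training phases; per-parameter-group learning rates are optional but often help when fine-tuning.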
Related questions
Code for transfer learning from the MNIST dataset to the FashionMNIST dataset
The following is a PyTorch code example of transfer learning from the MNIST dataset to the FashionMNIST dataset:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST, FashionMNIST
from torchvision.transforms import ToTensor, Normalize
from tqdm import tqdm
# Load the MNIST dataset
mnist_trainset = MNIST(root='./data', train=True, download=True, transform=ToTensor())
mnist_testset = MNIST(root='./data', train=False, download=True, transform=ToTensor())
mnist_trainloader = DataLoader(mnist_trainset, batch_size=64, shuffle=True)
mnist_testloader = DataLoader(mnist_testset, batch_size=64, shuffle=False)
# Load the FashionMNIST dataset
fashion_trainset = FashionMNIST(root='./data', train=True, download=True, transform=ToTensor())
fashion_testset = FashionMNIST(root='./data', train=False, download=True, transform=ToTensor())
fashion_trainloader = DataLoader(fashion_trainset, batch_size=64, shuffle=True)
fashion_testloader = DataLoader(fashion_testset, batch_size=64, shuffle=False)
# Define the model
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.pool2 = nn.MaxPool2d(kernel_size=2)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool1(torch.relu(self.conv1(x)))
        x = self.pool2(torch.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x
# Load the model pretrained on MNIST
pretrained_model = LeNet()
pretrained_model.load_state_dict(torch.load('mnist_model.pt'))
# Freeze all pretrained layers first, so only the new head will train
for param in pretrained_model.parameters():
    param.requires_grad = False
# Replace the final fully connected layer with a fresh one for the 10
# FashionMNIST classes. fc3 is the layer used in forward(), so the new
# weights actually take part in the computation (and, being created
# after the freeze loop, they remain trainable).
pretrained_model.fc3 = nn.Linear(84, 10)
# Define the optimizer (over the new layer only) and the loss function
optimizer = optim.Adam(pretrained_model.fc3.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
# Train the newly added fully connected layer
for epoch in range(10):
    running_loss = 0.0
    for i, data in tqdm(enumerate(fashion_trainloader), total=len(fashion_trainloader)):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = pretrained_model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print('Epoch %d, loss: %.3f' % (epoch + 1, running_loss / len(fashion_trainloader)))
# Unfreeze all layers of the pretrained model
for param in pretrained_model.parameters():
    param.requires_grad = True
# Redefine the optimizer over all parameters, and the loss function
optimizer = optim.Adam(pretrained_model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
# End-to-end fine-tuning
for epoch in range(10):
    running_loss = 0.0
    for i, data in tqdm(enumerate(fashion_trainloader), total=len(fashion_trainloader)):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = pretrained_model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print('Epoch %d, loss: %.3f' % (epoch + 1, running_loss / len(fashion_trainloader)))
# Evaluate on the test set
total = 0
correct = 0
with torch.no_grad():
    for data in tqdm(fashion_testloader, total=len(fashion_testloader)):
        images, labels = data
        outputs = pretrained_model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the %d test images: %.2f%%' % (total, 100.0 * correct / total))
# Save the model
torch.save(pretrained_model.state_dict(), 'fashion_model.pt')
```
In this code, we first load the MNIST and FashionMNIST datasets and define the LeNet model. We then load the LeNet weights trained on MNIST and give the model a fresh final fully connected layer for FashionMNIST. Next, we freeze the pretrained layers and train only that new layer. Once it has trained, we unfreeze all layers and fine-tune the model end to end. Finally, we evaluate the model on the FashionMNIST test set and save its parameters.
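Note that the listing above loads a checkpoint `mnist_model.pt`, which must already exist. Below is a minimal, self-contained sketch of the pretraining step that produces it; for brevity it runs a single step on a dummy batch, which you would replace with real batches from `mnist_trainloader`:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Same LeNet architecture as in the listing above
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.pool2 = nn.MaxPool2d(kernel_size=2)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool1(torch.relu(self.conv1(x)))
        x = self.pool2(torch.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

model = LeNet()
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
# One illustrative training step on a dummy batch shaped like MNIST
# (batch of 8 grayscale 28x28 images); loop over mnist_trainloader
# for actual pretraining
inputs = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()
# Save the checkpoint that the transfer-learning code loads
torch.save(model.state_dict(), 'mnist_model.pt')
```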
Code implementation of transfer learning from the MNIST dataset to the FashionMNIST dataset
The following is an example of transfer learning from the MNIST dataset to the FashionMNIST dataset using the Keras framework:
```python
import keras
from keras.datasets import mnist, fashion_mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
# Load the MNIST dataset
(x_train_mnist, y_train_mnist), (x_test_mnist, y_test_mnist) = mnist.load_data()
# Load the FashionMNIST dataset
(x_train_fashion, y_train_fashion), (x_test_fashion, y_test_fashion) = fashion_mnist.load_data()
# Reshape and normalize pixel values to the range [0, 1]
x_train_mnist = x_train_mnist.reshape(x_train_mnist.shape[0], 28, 28, 1).astype('float32') / 255
x_test_mnist = x_test_mnist.reshape(x_test_mnist.shape[0], 28, 28, 1).astype('float32') / 255
x_train_fashion = x_train_fashion.reshape(x_train_fashion.shape[0], 28, 28, 1).astype('float32') / 255
x_test_fashion = x_test_fashion.reshape(x_test_fashion.shape[0], 28, 28, 1).astype('float32') / 255
# Convert class vectors to one-hot class matrices
num_classes = 10
y_train_mnist = keras.utils.to_categorical(y_train_mnist, num_classes)
y_test_mnist = keras.utils.to_categorical(y_test_mnist, num_classes)
y_train_fashion = keras.utils.to_categorical(y_train_fashion, num_classes)
y_test_fashion = keras.utils.to_categorical(y_test_fashion, num_classes)
# Build the MNIST model
model_mnist = Sequential()
model_mnist.add(Flatten(input_shape=(28, 28, 1)))
model_mnist.add(Dense(128, activation='relu'))
model_mnist.add(Dropout(0.5))
model_mnist.add(Dense(num_classes, activation='softmax'))
model_mnist.compile(loss=keras.losses.categorical_crossentropy,
                    optimizer=keras.optimizers.Adadelta(),
                    metrics=['accuracy'])
# Train the MNIST model
model_mnist.fit(x_train_mnist, y_train_mnist,
                batch_size=128,
                epochs=10,
                verbose=1,
                validation_data=(x_test_mnist, y_test_mnist))
# Freeze the first two layers of the MNIST model and build the
# FashionMNIST model on top of them. The reused Flatten and Dense
# layers already handle the 28x28x1 input, so only a new Dropout and
# output layer are added.
for layer in model_mnist.layers[:2]:
    layer.trainable = False
model_fashion = Sequential(model_mnist.layers[:2])
model_fashion.add(Dropout(0.5))
model_fashion.add(Dense(num_classes, activation='softmax'))
model_fashion.compile(loss=keras.losses.categorical_crossentropy,
                      optimizer=keras.optimizers.Adadelta(),
                      metrics=['accuracy'])
# Train the FashionMNIST model
model_fashion.fit(x_train_fashion, y_train_fashion,
                  batch_size=128,
                  epochs=10,
                  verbose=1,
                  validation_data=(x_test_fashion, y_test_fashion))
```
This code first loads the MNIST and FashionMNIST datasets and scales pixel values to [0, 1]. It then builds a simple MNIST model and trains it. Next, the first two layers of the MNIST model are frozen and reused as the base of a new FashionMNIST model, which is then trained. Freezing those layers preserves the useful features learned on MNIST, so that only the new layers adapt during fine-tuning on FashionMNIST.
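A quick way to sanity-check layer freezing in Keras is to verify that a `trainable=False` layer's weights are unchanged by a training step. A self-contained sketch on random data (the layer name and sizes here are illustrative, not from the code above):

```python
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten

# Small model with one layer we will freeze
model = Sequential([
    keras.Input(shape=(28, 28, 1)),
    Flatten(),
    Dense(128, activation='relu', name='frozen_dense'),
    Dense(10, activation='softmax'),
])
# Freeze before compiling (changing trainable after compile requires
# recompiling for it to take effect)
model.get_layer('frozen_dense').trainable = False
model.compile(loss='categorical_crossentropy', optimizer='adam')

# One training epoch on random data
x = np.random.rand(32, 28, 28, 1).astype('float32')
y = keras.utils.to_categorical(np.random.randint(0, 10, 32), 10)
before = model.get_layer('frozen_dense').get_weights()[0].copy()
model.fit(x, y, epochs=1, verbose=0)
after = model.get_layer('frozen_dense').get_weights()[0]
# before and after should be identical for the frozen layer
```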