Transfer learning from the MNIST dataset to the FashionMNIST dataset
Date: 2024-01-21 21:04:49
Transfer learning means taking a model that has already been trained and, through fine-tuning or other techniques, adapting it to a new task or dataset. In this example, we take a model trained on the MNIST dataset and fine-tune it so that it fits the FashionMNIST dataset.
The transfer learning steps are as follows:
1. Load a model already trained on MNIST. Choose one that performs well on MNIST, such as LeNet-5.
2. Add new fully connected or convolutional layers on top of the model to adapt it to the features of FashionMNIST.
3. Randomly initialize the newly added layers, and freeze all layers of the pretrained model so that their parameters do not change.
4. Fine-tune on FashionMNIST. Split FashionMNIST into a training set and a validation set, and use techniques such as cross-validation to pick the best hyperparameters.
5. Unfreeze all layers of the pretrained model and fine-tune end to end, continuing to train on FashionMNIST.
6. Evaluate the model's performance on the test set.
Note that the success of transfer learning depends on how similar the two datasets are. Here, MNIST contains handwritten digits while FashionMNIST contains images of clothing items, but both consist of 28x28 single-channel grayscale images with 10 classes, so the input format matches exactly and transfer learning can work well.
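The freeze-and-unfreeze pattern in steps 3 and 5 can be sketched in PyTorch. The toy two-layer model below is a hypothetical stand-in for the pretrained network; the pattern is the same for any `nn.Module`:

```python
import torch.nn as nn

# Toy stand-in for a pretrained network (hypothetical; any nn.Module works).
model = nn.Sequential(nn.Linear(784, 84), nn.Linear(84, 10))

# Step 3: freeze every pretrained parameter so training cannot change it.
for param in model.parameters():
    param.requires_grad = False

# Replace the head; a freshly constructed layer is trainable by default,
# so only the new head receives gradient updates.
model[1] = nn.Linear(84, 10)

# Step 5: once the head has converged, unfreeze everything
# for end-to-end fine-tuning.
for param in model.parameters():
    param.requires_grad = True
```

Optimizers should be rebuilt after each freeze/unfreeze change so that they only track the parameters that are currently trainable.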
Related questions
Code implementing transfer learning from the MNIST dataset to the FashionMNIST dataset
The following is a PyTorch code example of transfer learning from the MNIST dataset to the FashionMNIST dataset:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST, FashionMNIST
from torchvision.transforms import ToTensor
from tqdm import tqdm
# Load the MNIST dataset (kept for reference; the pretrained weights are loaded from disk below)
mnist_trainset = MNIST(root='./data', train=True, download=True, transform=ToTensor())
mnist_testset = MNIST(root='./data', train=False, download=True, transform=ToTensor())
mnist_trainloader = DataLoader(mnist_trainset, batch_size=64, shuffle=True)
mnist_testloader = DataLoader(mnist_testset, batch_size=64, shuffle=False)
# Load the FashionMNIST dataset
fashion_trainset = FashionMNIST(root='./data', train=True, download=True, transform=ToTensor())
fashion_testset = FashionMNIST(root='./data', train=False, download=True, transform=ToTensor())
fashion_trainloader = DataLoader(fashion_trainset, batch_size=64, shuffle=True)
fashion_testloader = DataLoader(fashion_testset, batch_size=64, shuffle=False)
# Define the model
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.pool2 = nn.MaxPool2d(kernel_size=2)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool1(torch.relu(self.conv1(x)))
        x = self.pool2(torch.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x
# Load the pretrained model
pretrained_model = LeNet()
pretrained_model.load_state_dict(torch.load('mnist_model.pt'))
# Freeze all layers of the pretrained model so their parameters do not change
for param in pretrained_model.parameters():
    param.requires_grad = False
# Replace the output layer with a freshly initialized head for FashionMNIST;
# a newly constructed layer is trainable by default, so only it will be updated
pretrained_model.fc3 = nn.Linear(84, 10)
# Define the optimizer (over the new head only) and the loss function
optimizer = optim.Adam(pretrained_model.fc3.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
# Train the new output layer
for epoch in range(10):
    running_loss = 0.0
    for i, data in tqdm(enumerate(fashion_trainloader), total=len(fashion_trainloader)):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = pretrained_model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print('Epoch %d, loss: %.3f' % (epoch + 1, running_loss / len(fashion_trainloader)))
# Unfreeze all layers of the pretrained model
for param in pretrained_model.parameters():
    param.requires_grad = True
# Rebuild the optimizer over all parameters (a smaller learning rate is often used for this stage)
optimizer = optim.Adam(pretrained_model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
# End-to-end fine-tuning
for epoch in range(10):
    running_loss = 0.0
    for i, data in tqdm(enumerate(fashion_trainloader), total=len(fashion_trainloader)):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = pretrained_model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print('Epoch %d, loss: %.3f' % (epoch + 1, running_loss / len(fashion_trainloader)))
# Evaluate on the test set
pretrained_model.eval()
total = 0
correct = 0
with torch.no_grad():
    for data in tqdm(fashion_testloader, total=len(fashion_testloader)):
        images, labels = data
        outputs = pretrained_model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the %d test images: %.2f%%' % (total, 100.0 * correct / total))
# Save the model
torch.save(pretrained_model.state_dict(), 'fashion_model.pt')
```
In this code, we first load the MNIST and FashionMNIST datasets and define the LeNet model. We then load the LeNet weights trained on MNIST and swap in a freshly initialized output layer for FashionMNIST. Next, we freeze all pretrained layers and train only the new output layer. Once that training finishes, we unfreeze all layers and fine-tune end to end. Finally, we evaluate the model on the FashionMNIST test set and save its parameters.
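The `torch.max` call in the test loop returns label indices; mapping those indices back to FashionMNIST class names (the ordering below is fixed by the dataset definition) can be sketched as follows, here with a fabricated batch of logits instead of real model output:

```python
import torch

# FashionMNIST label names, in the dataset's fixed index order.
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Fabricated logits for one image; index 3 ('Dress') has the largest score.
logits = torch.tensor([[0.1, 0.2, 0.05, 2.5, 0.1, 0.0, 0.3, 0.1, 0.2, 0.1]])
_, predicted = torch.max(logits, 1)
print(classes[predicted.item()])  # Dress
```

The same lookup works on real predictions from the fine-tuned model, one name per row of the batch.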
Code implementation of transfer learning from the MNIST dataset to the FashionMNIST dataset
The following is an example of transfer learning from the MNIST dataset to the FashionMNIST dataset using the Keras framework:
```python
import keras
from keras.datasets import mnist, fashion_mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
# Load the MNIST dataset
(x_train_mnist, y_train_mnist), (x_test_mnist, y_test_mnist) = mnist.load_data()
# Load the FashionMNIST dataset
(x_train_fashion, y_train_fashion), (x_test_fashion, y_test_fashion) = fashion_mnist.load_data()
# Reshape and normalize pixel values to the range [0, 1]
x_train_mnist = x_train_mnist.reshape(x_train_mnist.shape[0], 28, 28, 1).astype('float32') / 255
x_test_mnist = x_test_mnist.reshape(x_test_mnist.shape[0], 28, 28, 1).astype('float32') / 255
x_train_fashion = x_train_fashion.reshape(x_train_fashion.shape[0], 28, 28, 1).astype('float32') / 255
x_test_fashion = x_test_fashion.reshape(x_test_fashion.shape[0], 28, 28, 1).astype('float32') / 255
# Convert class vectors to binary class matrices (one-hot encoding)
num_classes = 10
y_train_mnist = keras.utils.to_categorical(y_train_mnist, num_classes)
y_test_mnist = keras.utils.to_categorical(y_test_mnist, num_classes)
y_train_fashion = keras.utils.to_categorical(y_train_fashion, num_classes)
y_test_fashion = keras.utils.to_categorical(y_test_fashion, num_classes)
# Build the MNIST model
model_mnist = Sequential()
model_mnist.add(Flatten(input_shape=(28, 28, 1)))
model_mnist.add(Dense(128, activation='relu'))
model_mnist.add(Dropout(0.5))
model_mnist.add(Dense(num_classes, activation='softmax'))
model_mnist.compile(loss=keras.losses.categorical_crossentropy,
                    optimizer=keras.optimizers.Adadelta(),
                    metrics=['accuracy'])
# Train the MNIST model
model_mnist.fit(x_train_mnist, y_train_mnist,
                batch_size=128,
                epochs=10,
                verbose=1,
                validation_data=(x_test_mnist, y_test_mnist))
# Freeze the first two layers of the MNIST model and build the FashionMNIST model on top of them
for layer in model_mnist.layers[:2]:
    layer.trainable = False
model_fashion = Sequential(model_mnist.layers[:2])  # frozen Flatten + Dense(128)
model_fashion.add(Dense(128, activation='relu'))
model_fashion.add(Dropout(0.5))
model_fashion.add(Dense(num_classes, activation='softmax'))
model_fashion.compile(loss=keras.losses.categorical_crossentropy,
                      optimizer=keras.optimizers.Adadelta(),
                      metrics=['accuracy'])
# Train the FashionMNIST model
model_fashion.fit(x_train_fashion, y_train_fashion,
                  batch_size=128,
                  epochs=10,
                  verbose=1,
                  validation_data=(x_test_fashion, y_test_fashion))
```
This code first loads the MNIST and FashionMNIST datasets and normalizes the pixel values to the range 0 to 1. It then builds a simple MNIST model and trains it. Next, it freezes the first two layers of the MNIST model, builds a new FashionMNIST model on top of them, and trains that model. Freezing the first two layers preserves the useful features learned on MNIST while the new layers are fine-tuned on FashionMNIST.
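The `keras.utils.to_categorical` call used above one-hot encodes the integer labels. The equivalent transformation in plain NumPy (a sketch that is independent of Keras) makes the encoding explicit:

```python
import numpy as np

labels = np.array([0, 2, 9])  # example integer class labels
num_classes = 10
# Row i of the identity matrix is the one-hot vector for class i,
# so indexing by the label array one-hot encodes the whole batch at once.
one_hot = np.eye(num_classes, dtype='float32')[labels]
print(one_hot.shape)  # (3, 10)
```

Each output row contains a single 1.0 at the position of the label, which is what `categorical_crossentropy` expects as its targets.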