Give me a code example for fine-tuning
Below is an example of fine-tuning a RoBERTa model with Hugging Face's Transformers library, using a text classification task:
```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler
# Load the datasets (placeholders; replace with your own torch Dataset objects)
train_dataset = ...
valid_dataset = ...
# Load the RoBERTa model and tokenizer
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=2)  # set num_labels to your number of classes
# Freeze the embeddings and the first 3 encoder layers
for param in model.roberta.embeddings.parameters():
    param.requires_grad = False
for i in range(3):
    for param in model.roberta.encoder.layer[i].parameters():
        param.requires_grad = False
# Other hyperparameters
epochs = 5
batch_size = 32
learning_rate = 1e-5
# Create DataLoaders
train_sampler = RandomSampler(train_dataset)
train_dataloader = DataLoader(train_dataset, sampler=train_sampler, batch_size=batch_size)
valid_sampler = SequentialSampler(valid_dataset)
valid_dataloader = DataLoader(valid_dataset, sampler=valid_sampler, batch_size=batch_size)
# Optimizer; no separate loss function is needed, because the model
# computes the cross-entropy loss itself when labels are passed in
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate, eps=1e-8)
# Training function: one pass over the training set, returns the average loss
def train(model, dataloader, optimizer):
    model.train()
    total_loss = 0
    for step, batch in enumerate(dataloader):
        input_ids, attention_mask, labels = batch
        optimizer.zero_grad()
        # The model returns the loss directly when labels are provided
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        loss = outputs.loss
        total_loss += loss.item()
        loss.backward()
        optimizer.step()
    return total_loss / len(dataloader)
# Validation function: returns the average loss and accuracy on the validation set
def evaluate(model, dataloader):
    model.eval()
    total_loss = 0
    total_correct = 0
    with torch.no_grad():
        for step, batch in enumerate(dataloader):
            input_ids, attention_mask, labels = batch
            outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
            loss = outputs.loss
            logits = outputs.logits
            total_loss += loss.item()
            preds = torch.argmax(logits, dim=1)
            total_correct += torch.sum(preds == labels).item()
    acc = total_correct / len(dataloader.dataset)
    return total_loss / len(dataloader), acc
# Train the model
for epoch in range(1, epochs + 1):
    train_loss = train(model, train_dataloader, optimizer)
    valid_loss, valid_acc = evaluate(model, valid_dataloader)
    print(f'Epoch {epoch}: train_loss={train_loss:.3f}, valid_loss={valid_loss:.3f}, valid_acc={valid_acc:.3f}')
```
In the example above, the RoBERTa model and tokenizer are loaded first, the embeddings and the first three encoder layers are frozen, the remaining hyperparameters are set, the DataLoaders are created, and the optimizer is defined; finally, the training and validation functions are defined and called to train the model. Adjust the details to fit your own data and task.
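The `train_dataset` and `valid_dataset` placeholders need to yield `(input_ids, attention_mask, labels)` tuples, since that is how the training loop unpacks each batch. Below is a minimal sketch of building such datasets with the tokenizer from the example, assuming you already have lists of raw texts and integer labels (`train_texts`, `train_labels`, `valid_texts`, and `valid_labels` are hypothetical variable names):
```python
import torch
from torch.utils.data import TensorDataset

def build_dataset(texts, labels, tokenizer, max_length=128):
    # Tokenize all texts at once, padding/truncating to a fixed length
    encodings = tokenizer(
        texts,
        padding='max_length',
        truncation=True,
        max_length=max_length,
        return_tensors='pt',
    )
    # Order matters: the training loop unpacks (input_ids, attention_mask, labels)
    return TensorDataset(
        encodings['input_ids'],
        encodings['attention_mask'],
        torch.tensor(labels),
    )

train_dataset = build_dataset(train_texts, train_labels, tokenizer)
valid_dataset = build_dataset(valid_texts, valid_labels, tokenizer)
```
A `TensorDataset` keeps each batch as a plain tuple, which is what `train` and `evaluate` expect; if you use a dataset that returns dictionaries instead, the batch unpacking in those functions would need to change accordingly.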