Using bert-base-chinese for entity recognition: give me a Python example
Time: 2023-12-18 15:03:53 Views: 90
The following Python example uses bert-base-chinese for Chinese named entity recognition:
```python
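from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Define the BIO label set; the model's output size must match its length
labels = ['O', 'B-LOC', 'I-LOC', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG']

# Load the tokenizer and model.
# Note: bert-base-chinese ships without a trained token-classification head, so
# the head is randomly initialized here. Fine-tune on labeled NER data (or load
# a fine-tuned checkpoint) before relying on the predictions.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForTokenClassification.from_pretrained("bert-base-chinese", num_labels=len(labels))
model.eval()

# Input text
text = "清华大学位于北京市海淀区"

# Tokenize and encode (bert-base-chinese splits Chinese text into single characters)
inputs = tokenizer(text, return_tensors="pt")

# Run the model without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.argmax(outputs.logits, dim=2)

# Decode the predictions, skipping special tokens such as [CLS] and [SEP]
tokens = tokenizer.convert_ids_to_tokens(inputs.input_ids[0])
entities = []
for token, label_idx in zip(tokens, predictions[0].tolist()):
    if token in tokenizer.all_special_tokens:
        continue
    if labels[label_idx] != "O":
        entities.append((labels[label_idx], token))

# Print the entity recognition results
for entity in entities:
    print(entity)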
```
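Output (illustrative of what a fine-tuned checkpoint would print; bert-base-chinese emits one token per Chinese character, and with the untuned classification head loaded above the labels are effectively arbitrary):
```
('B-ORG', '清')
('I-ORG', '华')
('I-ORG', '大')
('I-ORG', '学')
('B-LOC', '北')
('I-LOC', '京')
('I-LOC', '市')
('I-LOC', '海')
('I-LOC', '淀')
('I-LOC', '区')
```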
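The per-token tags printed above are usually merged into whole entity spans before use. Below is a minimal sketch of that step in plain Python. The helper name `merge_bio` and the hand-written tag sequence are assumptions made for illustration; they are not part of the transformers library, and the tags are not real model output.

```python
from typing import List, Optional, Tuple


def merge_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_type, text) spans.

    Hypothetical helper for illustration only.
    """
    spans: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_text = ""
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span; close any open one first
            if current_type is not None:
                spans.append((current_type, current_text))
            current_type, current_text = tag[2:], token
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open span
            current_text += token
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the open span and drop the stray token
            if current_type is not None:
                spans.append((current_type, current_text))
            current_type, current_text = None, ""
    if current_type is not None:
        spans.append((current_type, current_text))
    return spans


# Hand-written tags for the example sentence (assumed, not model output)
tokens = list("清华大学位于北京市海淀区")
tags = ["B-ORG", "I-ORG", "I-ORG", "I-ORG", "O", "O",
        "B-LOC", "I-LOC", "I-LOC", "I-LOC", "I-LOC", "I-LOC"]
print(merge_bio(tokens, tags))
```

Dropping an `I-` tag that has no matching open span is one design choice among several; some pipelines treat such a tag as the start of a new span instead.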