Can you give me some code for named entity recognition based on a BERT model?
Posted: 2023-05-21 14:05:46 Views: 137
Sure. Here is a code example for named entity recognition based on the BERT model:
```python
import torch
from transformers import BertTokenizer, BertForTokenClassification

# NOTE: 'bert-base-chinese' ships without a token-classification head, so the
# 6-label head created here is randomly initialized. Fine-tune the model on
# labeled NER data (or load a fine-tuned checkpoint) before trusting the output.
tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
model = BertForTokenClassification.from_pretrained('bert-base-chinese', num_labels=6)
model.eval()

# Label ids assumed by the decoding loop below:
#   0 = outside any entity
#   1 = begin (B), 2 = middle (M), 4 = end (E)
#   3 = begin if no entity is open, otherwise middle
#   5 = end, but only if an entity is open


def ner(text):
    # Encode with [CLS]/[SEP] and add a batch dimension.
    token_ids = tokenizer.encode(text, add_special_tokens=True)
    input_ids = torch.tensor([token_ids])
    with torch.no_grad():
        logits = model(input_ids)[0]  # shape: (1, seq_len, num_labels)
    predictions = torch.argmax(logits, dim=2)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(token_ids)

    entities = []
    current_entity = ""
    current_tag = ""
    for token, tag in zip(tokens, predictions):
        if token in (tokenizer.cls_token, tokenizer.sep_token):
            continue  # never treat [CLS] or [SEP] as entity text
        if tag == 1:
            # An explicit begin tag always starts a fresh entity.
            current_entity = token
            current_tag = "B"
        elif tag == 2:
            current_entity += token
            current_tag = "M"
        elif tag == 3:
            current_entity += token
            current_tag = "B" if current_tag == "" else "M"
        elif tag == 4 or (tag == 5 and current_tag != ""):
            # An end tag closes the entity and emits it.
            current_entity += token
            entities.append((current_entity, "E"))
            current_entity = ""
            current_tag = ""
        else:
            # Outside tag, or an end tag with nothing open: drop any partial entity.
            current_entity = ""
            current_tag = ""
    return entities


text = "乔布斯成立了苹果公司并带领团队革命性地推出了iPod、iPhone和iPad等产品"
print(ner(text))
```
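One detail worth noting: the function concatenates the tokens returned by `convert_ids_to_tokens` as they are, and BERT's WordPiece tokenizer marks continuation pieces of a split word with a `##` prefix. The sketch below shows one way such pieces could be glued back together before building the entity string. The helper name `merge_wordpieces` and the sample token list are hypothetical and are not taken from an actual tokenizer run.

```python
def merge_wordpieces(tokens):
    """Glue '##' continuation pieces onto the preceding piece."""
    words = []
    for token in tokens:
        if token.startswith("##") and words:
            # Continuation piece: strip the marker and extend the last word.
            words[-1] += token[2:]
        else:
            words.append(token)
    return words


# Hypothetical token list, for illustration only.
pieces = ["苹", "果", "ip", "##hone"]
merged = merge_wordpieces(pieces)
print("".join(merged))
```

Calling a helper like this on the tokens of a finished entity would keep `##` markers out of the returned text.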
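With a fine-tuned checkpoint, the call returns a list of `(entity, tag)` tuples in the form shown below. The values are illustrative, not the output of an actual run: the classification head loaded from `bert-base-chinese` is randomly initialized, so the code as written produces arbitrary labels until the model has been fine-tuned on NER data.
```
[('乔布斯', 'E'), ('苹果', 'E'), ('iPod', 'E'), ('iPhone', 'E'), ('iPad', 'E')]
```
Every tuple carries `E` because an entity is appended to the result only when its end tag is reached. `B` and `M` mark the beginning and the middle of an entity while it is still being built.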
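To make the begin/middle/end tagging concrete, here is a minimal sketch that decodes a hand-written tag sequence without any model. It assumes string tags `B`, `M`, `E` and `O` (outside) instead of the numeric ids used above, and the function name `decode_bme` is made up for this example.

```python
def decode_bme(chars, tags):
    """Collect (text, start, end) spans from a B/M/E/O tag sequence."""
    spans = []
    start = None  # index where the currently open entity began
    for i, tag in enumerate(tags):
        if tag == "B":
            start = i
        elif tag == "M" and start is not None:
            continue  # still inside the open entity
        elif tag == "E" and start is not None:
            spans.append(("".join(chars[start:i + 1]), start, i + 1))
            start = None
        else:
            # "O", or an M/E tag with no open entity: discard any partial span.
            start = None
    return spans


chars = list("乔布斯成立了苹果公司")
tags = ["B", "M", "E", "O", "O", "O", "B", "M", "M", "E"]
print(decode_bme(chars, tags))
# [('乔布斯', 0, 3), ('苹果公司', 6, 10)]
```

Returning character offsets along with the text makes it easy to map each entity back to its position in the input sentence.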