Code for extracting nouns from English comments and computing their weights
Posted: 2024-03-23 17:36:59
The following example uses Python and the NLTK library to extract the nouns from an English comment and compute a weight for each one:
```python
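import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.probability import FreqDist
from nltk.tag import pos_tag

# One-time download of the NLTK data used below.
# Newer NLTK releases may instead need 'punkt_tab' and
# 'averaged_perceptron_tagger_eng'.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('stopwords')

# English comment (example)
comment = "This restaurant has amazing food and great service. Highly recommend!"

# Tokenize and tag each token with its part of speech
tokens = word_tokenize(comment.lower())
tags = pos_tag(tokens)

# Keep nouns only (Penn Treebank noun tags start with 'N': NN, NNS, NNP, NNPS)
nouns = [word for word, pos in tags if pos.startswith('N')]

# Remove stop words
stop_words = set(stopwords.words('english'))
nouns = [noun for noun in nouns if noun not in stop_words]

# Build the frequency distribution
fdist = FreqDist(nouns)

# Print the 10 most frequent nouns and their weights
for noun, frequency in fdist.most_common(10):
    print(f"{noun}: {frequency}")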
```
The output is as follows:
```
restaurant: 1
food: 1
service: 1
```
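The number after each noun is how many times it appears in the comment; this raw count is used as the weight. In this one-sentence example every noun appears once, so all the weights are equal.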
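
Raw counts are hard to compare between comments of different lengths. As a minimal sketch (not part of the original example), the counts can be normalized so that the weights sum to 1. The function name `noun_weights` and the sample noun list are hypothetical, chosen only for illustration. The code uses the standard library alone and assumes the nouns have already been extracted.

```python
from collections import Counter


def noun_weights(nouns):
    """Map each noun to its share of all noun occurrences (weights sum to 1)."""
    counts = Counter(nouns)
    total = sum(counts.values())
    if total == 0:
        return {}  # no nouns, so there is nothing to weight
    # most_common() sorts by count, so the dict is ordered from heaviest to lightest
    return {noun: count / total for noun, count in counts.most_common()}


# Hypothetical noun list, as if collected from several comments
sample_nouns = ["food", "service", "food", "restaurant", "food", "service"]

weights = noun_weights(sample_nouns)
for noun, weight in weights.items():
    print(f"{noun}: {weight:.2f}")
```

With this sample list the script prints `food: 0.50`, `service: 0.33`, and `restaurant: 0.17`. The same function can be fed the `nouns` list built with NLTK if relative weights are preferred over raw counts.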