Code for extracting nouns from English reviews and computing their weights

Time: 2024-03-23 17:36:59  Views: 10

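The following example uses Python and the NLTK library to extract the nouns from an English review and compute their weights:

```python
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.probability import FreqDist
from nltk.tag import pos_tag

# One-time setup: download the tokenizer, POS tagger, and stopword data.
# Resource names vary by NLTK version (newer releases use 'punkt_tab'
# and 'averaged_perceptron_tagger_eng').
# nltk.download('punkt')
# nltk.download('averaged_perceptron_tagger')
# nltk.download('stopwords')

# English review (example)
comment = "This restaurant has amazing food and great service. Highly recommend!"

# Tokenize and tag parts of speech
tokens = word_tokenize(comment.lower())
tags = pos_tag(tokens)

# Keep nouns (Penn Treebank tags NN, NNS, NNP, NNPS all start with 'N')
nouns = [word for word, pos in tags if pos.startswith('N')]

# Remove stopwords
stop_words = set(stopwords.words('english'))
nouns = [noun for noun in nouns if noun not in stop_words]

# Build the frequency distribution
fdist = FreqDist(nouns)

# Print up to the 10 most frequent nouns with their weights
for noun, frequency in fdist.most_common(10):
    print(f"{noun}: {frequency}")
```

The output is:

```
restaurant: 1
food: 1
service: 1
```

The number after each noun is how many times it occurs in the review, which serves as its weight here.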
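Raw counts are only one possible notion of weight, and they grow with the amount of text. As a minimal sketch, assuming the nouns have already been extracted as in the example above, the counts can be normalized to relative frequencies so that weights from reviews of different lengths are comparable. The function name `noun_weights` and the sample noun list are hypothetical and not part of the original code; only the standard library is used.

```python
from collections import Counter

def noun_weights(nouns):
    """Map each noun to its share of all noun occurrences (weights sum to 1)."""
    counts = Counter(nouns)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders nouns from most to least frequent
    return {noun: count / total for noun, count in counts.most_common()}

# Hypothetical nouns, as if collected from several reviews
nouns = ["food", "service", "food", "restaurant", "food", "service"]
weights = noun_weights(nouns)

for noun, weight in weights.items():
    print(f"{noun}: {weight:.2f}")
# food: 0.50
# service: 0.33
# restaurant: 0.17
```

With a single short review every noun tends to occur once, so all weights come out equal; normalization becomes informative once nouns from many reviews are pooled.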