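Romance of the Three Kingdoms word cloud in Python
Date: 2023-11-23 19:58:00 Views: 117
The following Python code builds a word cloud from the text of Romance of the Three Kingdoms (三国演义) and prints the five most frequent words: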
```python
import jieba
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Read the full text of the novel
with open('threekingdoms.txt', 'r', encoding='utf-8') as f:
    txt = f.read()

# Segment the text into words
words = jieba.lcut(txt)

# Merge aliases: map each character's alternative names to one canonical name
dupDict = {'曹操': ['孟德', '丞相'],
           '玄德': ['刘备', '皇叔', '刘皇叔', '玄德曰'],
           '云长': ['关羽', '关云长', '关公'],
           '孔明': ['诸葛亮', '诸葛', '孔明曰'],
           '张飞': ['翼德'],
           '赵云': ['子龙', '赵子龙'],
           '周瑜': ['公瑾', '都督']}
ls = []
for i in words:
    # Skip single-character tokens (mostly punctuation and function words)
    if len(i) == 1:
        continue
    for key in dupDict:
        if i in dupDict[key]:
            i = key
    ls.append(i)
words = ls

# Generate the word cloud; font_path must point to a font with Chinese glyphs
wordcloud = WordCloud(font_path='simhei.ttf', background_color='white',
                      width=800, height=600).generate(' '.join(words))

# Display the word cloud
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

# Print the 5 most frequent words
word_freq = {}
for word in words:
    if word in word_freq:
        word_freq[word] += 1
    else:
        word_freq[word] = 1
top_words = sorted(word_freq.items(), key=lambda x: x[1], reverse=True)[:5]
print('Top 5 most frequent words:')
for word, freq in top_words:
    print(word, freq)
```
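The alias-merging and counting steps can also be written more compactly. The sketch below is one possible variant, not the original author's method: it inverts the alias table once so each token needs a single lookup, and uses `collections.Counter` for the frequencies. The names `ALIASES`, `CANONICAL`, `count_words`, and the `sample` token list are hypothetical and exist only for illustration; the sample stands in for the output of `jieba.lcut`.

```python
from collections import Counter

# Hypothetical alias table: canonical name -> alternative names in the text.
ALIASES = {
    '曹操': ['孟德', '丞相'],
    '孔明': ['诸葛亮', '诸葛'],
}

# Invert the table once: alias -> canonical name.
CANONICAL = {alias: name for name, aliases in ALIASES.items() for alias in aliases}


def count_words(tokens, top_n=5):
    """Drop single-character tokens, merge aliases, and return the top_n counts."""
    merged = (CANONICAL.get(t, t) for t in tokens if len(t) > 1)
    return Counter(merged).most_common(top_n)


# Illustrative tokens standing in for the result of jieba.lcut(txt).
sample = ['曹操', '孟德', '丞相', '曰', '孔明', '诸葛亮', '荆州']
print(count_words(sample))
```

If the counts are kept as a dictionary, they could likely be passed straight to `WordCloud.generate_from_frequencies` instead of joining the words back into a string, which avoids having the library re-tokenize the text.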