A detailed walkthrough of a Python word cloud experiment for Romance of the Three Kingdoms
Date: 2023-08-18 18:15:36
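The following steps show how to build a word cloud of Romance of the Three Kingdoms (三国演义) in Python:
1. Install the required libraries
Building the word cloud requires the wordcloud and jieba libraries. The later steps also import Pillow, numpy, and matplotlib, which pip normally installs as dependencies of wordcloud. The two libraries can be installed with the following commands: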
```
pip install wordcloud
```
```
pip install jieba
```
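Before moving on, it can help to confirm which packages are actually importable in the current environment. The following is a minimal sketch using only the standard library. The list of module names is taken from the imports used later in this walkthrough, and `find_missing` is a hypothetical helper name, not part of any of these libraries.

```python
import importlib.util

# Top-level module names imported later in this walkthrough.
# Note that Pillow is imported under the name "PIL".
REQUIRED = ["jieba", "wordcloud", "PIL", "numpy", "matplotlib"]


def find_missing(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Any name reported as missing can then be installed with pip in the same way as above (Pillow is installed as `pillow`).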
2. Import the required libraries
Import jieba, WordCloud, and PIL, along with numpy and matplotlib:
```
import jieba
from wordcloud import WordCloud
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
```
3. Read the text and segment it
Read the text of the novel and use jieba to split it into a list of words:
```
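# Read the whole novel; the with block closes the file afterwards
with open('sanguo.txt', 'r', encoding='utf-8') as f:
    text = f.read()
words = jieba.lcut(text)  # segment the text into a list of words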
```
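The step above assumes that sanguo.txt is UTF-8 encoded. Chinese text files are sometimes saved in a GB-family encoding instead, in which case the read fails with a `UnicodeDecodeError`. The following is a minimal sketch of a fallback, assuming that trying UTF-8 first and GB18030 second is acceptable for the file at hand. The function names are hypothetical.

```python
from pathlib import Path


def decode_with_fallback(raw, encodings=("utf-8", "gb18030")):
    """Try each encoding in order and return the first successful decode."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError(f"Could not decode the data with any of {encodings}")


def read_text_with_fallback(path, encodings=("utf-8", "gb18030")):
    """Read a file as bytes, then decode it with the fallback above."""
    return decode_with_fallback(Path(path).read_bytes(), encodings)
```

With this helper, the read in step 3 could be written as `text = read_text_with_fallback('sanguo.txt')`.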
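4. Remove stop words
Remove stop words from the words list. You can write your own stop word list or use a third-party one. The code below loads the list from a local stopwords.txt file with one word per line, to which custom stop words can be added: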
```
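# Load one stop word per line into a set for fast membership tests
with open('stopwords.txt', 'r', encoding='utf-8') as f:
    stopwords = {line.strip() for line in f}
new_words = []
for word in words:
    # Skip stop words and newline tokens
    if word not in stopwords and word != '\n':
        new_words.append(word)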
```
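After filtering, it is often useful to look at the word frequencies directly before drawing anything. The following sketch counts the tokens with `collections.Counter`, merges a few courtesy names into the full name of the same character, and skips single-character tokens. The alias table, the `min_length` threshold, and the function name are illustrative assumptions, and the sample list is hand-written rather than real jieba output.

```python
from collections import Counter

# Illustrative alias table: courtesy names mapped to the full name.
ALIASES = {"孔明": "诸葛亮", "玄德": "刘备", "孟德": "曹操"}


def count_words(words, aliases=None, min_length=2):
    """Count tokens, merging aliases and skipping tokens shorter than min_length."""
    aliases = aliases or {}
    counts = Counter()
    for word in words:
        word = aliases.get(word, word)  # replace an alias with the full name
        if len(word) >= min_length:
            counts[word] += 1
    return counts


# Tiny hand-written token list standing in for new_words.
sample = ["孔明", "诸葛亮", "曰", "玄德", "刘备", "刘备", "曹操"]
counts = count_words(sample, ALIASES)
print(counts.most_common(3))
```

Since a `Counter` is a dictionary of word counts, the result could also be passed to `WordCloud.generate_from_frequencies` instead of joining the words into a string.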
5. Generate the word cloud
Join the filtered word list into a single space-separated string and generate the word cloud with WordCloud:
```
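text = ' '.join(new_words)
# The mask image defines the shape; pure white areas are left empty
mask = np.array(Image.open('mask.png'))
# font_path must point to a font with Chinese glyphs, otherwise words render as boxes
wc = WordCloud(background_color='white', mask=mask, font_path='simfang.ttf')
wc.generate(text)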
```
6. Display the word cloud
Use matplotlib to display the generated word cloud:
```
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
plt.show()
```
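Displaying the figure does not write anything to disk. If the image should also be kept as a file, a `WordCloud` object can write itself out with its `to_file` method. The following is a small sketch wrapped in a hypothetical helper that also creates the output folder if needed. The default file name is an assumption.

```python
from pathlib import Path


def save_wordcloud(wc, path="sanguo_wordcloud.png"):
    """Write the rendered cloud to an image file and return the path."""
    # Create the parent folder if it does not exist yet
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    wc.to_file(str(path))
    return str(path)
```

After `wc.generate(text)` has run, calling `save_wordcloud(wc)` would save the image next to the script.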
The complete code is as follows:
```
import jieba
from wordcloud import WordCloud
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
with open('sanguo.txt', 'r', encoding='utf-8') as f:
    text = f.read()
words = jieba.lcut(text)
with open('stopwords.txt', 'r', encoding='utf-8') as f:
    stopwords = {line.strip() for line in f}
new_words = []
for word in words:
    if word not in stopwords and word != '\n':
        new_words.append(word)
text = ' '.join(new_words)
mask = np.array(Image.open('mask.png'))
wc = WordCloud(background_color='white', mask=mask, font_path='simfang.ttf')
wc.generate(text)
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
plt.show()
```
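Note that the code above requires three files to be prepared in advance: the text of the novel (sanguo.txt), the stop word list (stopwords.txt), and the image that defines the shape of the word cloud (mask.png). The font passed as font_path (simfang.ttf) must also be available, either as a file path or in a system font folder.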
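A missing input file otherwise only shows up as an error partway through the script, so a quick check up front can save time. The following is a minimal sketch that assumes all files sit in the working directory. The helper name is hypothetical, and the font is listed only as an assumption, since on some platforms it may be found in a system font folder instead.

```python
from pathlib import Path

# File names as used in this walkthrough. The font is included because
# WordCloud is given font_path='simfang.ttf'; on some systems it may be
# resolved from a system font folder rather than the working directory.
REQUIRED_FILES = ["sanguo.txt", "stopwords.txt", "mask.png", "simfang.ttf"]


def missing_files(names, base_dir="."):
    """Return the names that do not exist as files under base_dir."""
    base = Path(base_dir)
    return [name for name in names if not (base / name).is_file()]


missing = missing_files(REQUIRED_FILES)
if missing:
    print("Missing input files:", ", ".join(missing))
```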