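Write Python code that scrapes a script and produces character appearance counts and a word cloud
Time: 2023-05-13 08:06:03 Views: 121
Sure. Here is Python code that scrapes a script, counts how often each character's name appears, and generates a word cloud: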
```python
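import requests
from bs4 import BeautifulSoup
import jieba
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Fetch the script page (placeholder URL; replace it with the real one)
url = 'https://www.example.com/script.html'
response = requests.get(url, timeout=10)
response.raise_for_status()
response.encoding = response.apparent_encoding  # avoid garbled Chinese text
soup = BeautifulSoup(response.text, 'html.parser')

# Assumes the script text sits in <div class="script">; adjust for the real page
node = soup.find('div', {'class': 'script'})
if node is None:
    raise ValueError('script container not found on the page')
script = node.get_text()

# Segment the text into words
words = jieba.cut(script)

# Count occurrences of each word, character names included
counts = {}
for word in words:
    word = word.strip()
    if len(word) < 2:
        continue  # skip whitespace, punctuation, and single-character tokens
    elif word == '某某':  # map an alias to the character's actual name
        word = '张三'
    counts[word] = counts.get(word, 0) + 1

# Generate the word cloud; font_path must point to a font with Chinese glyphs
wc = WordCloud(font_path='msyh.ttc', background_color='white', max_words=2000, width=1920, height=1080)
wc.generate_from_frequencies(counts)
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
plt.show()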
```
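Note that the code above is for reference only. Before using it, adapt the URL, the `div` selector, the character-name mapping, and the font path to your actual situation.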
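
One adaptation worth considering: the loop above counts every word of two or more characters, so character names are mixed in with ordinary vocabulary. The following is a minimal sketch of counting only the characters, assuming you already know the cast and can list the names and nicknames by hand. The `ALIASES` table, the `count_characters` helper, and the sample dialogue are all hypothetical and are not part of the original answer. It uses only the standard library and matches names directly in the raw text rather than relying on word segmentation.

```python
import re
from collections import Counter

# Hypothetical alias table: each spelling on the left is counted
# under the canonical character name on the right.
ALIASES = {
    '张三': '张三',
    '老张': '张三',
    '李四': '李四',
}

def count_characters(script, aliases):
    """Count how often each character is mentioned anywhere in the script text."""
    # Try longer aliases first so a short name never pre-empts
    # a longer alias that contains it.
    names = sorted(aliases, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(name) for name in names))
    return Counter(aliases[m.group(0)] for m in pattern.finditer(script))

# Made-up sample text standing in for the scraped script
sample = '张三：李四，你来了。\n李四：老张，好久不见。\n张三：坐吧。'
character_counts = count_characters(sample, ALIASES)
print(character_counts.most_common())
```

Because `Counter` is a `dict` subclass, the result should be usable in place of `counts` when calling `wc.generate_from_frequencies`, which would give a word cloud of character names only. Matching on raw text counts speaker labels and in-dialogue mentions alike; if only speaking turns matter, the pattern would need to be anchored to the start of each line.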