Douban Movie Top250 Word Cloud in Python
Date: 2023-11-05 22:53:16
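Sure, I can help you implement this. First, you need to scrape the Douban Movie Top250 data, which can be done with Python's requests and BeautifulSoup libraries. The following code scrapes the movie entries from the Top250 page: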
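```python
import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'
# Send a browser-like User-Agent, since some sites reject the default one used by requests.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}

# Note: this requests only the landing page, which lists the first 25 movies.
res = requests.get(url, headers=headers)
res.raise_for_status()  # fail early on an HTTP error status
soup = BeautifulSoup(res.text, 'html.parser')

movies = []
# Each movie entry sits in an element with the class "item".
for item in soup.select('.item'):
    movie = {
        'title': item.select('.title')[0].text,  # first .title element of the entry
        'rating': item.select('.rating_num')[0].text,
        'link': item.select('a')[0]['href'],
        'cover': item.select('img')[0]['src'],
    }
    movies.append(movie)
```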
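The request above covers only the landing page, so `movies` ends up with 25 entries rather than 250. Below is a minimal sketch of collecting every page. It assumes the list is paginated through a `start` query parameter in steps of 25, which is how the site's page links appeared at the time of writing and may change. `page_urls` and `fetch_all_movies` are hypothetical helper names, and the one-second delay is an arbitrary courtesy pause.

```python
import time

BASE_URL = 'https://movie.douban.com/top250'
PAGE_SIZE = 25  # assumption: 25 movies per page
TOTAL = 250


def page_urls(base_url=BASE_URL, total=TOTAL, page_size=PAGE_SIZE):
    """Build one URL per result page: ?start=0, 25, ..., 225."""
    return [f'{base_url}?start={start}' for start in range(0, total, page_size)]


def fetch_all_movies(headers, delay=1.0):
    """Fetch every page and collect the same fields as the single-page loop."""
    # Third-party imports are kept local so page_urls() works without them.
    import requests
    from bs4 import BeautifulSoup

    movies = []
    for url in page_urls():
        res = requests.get(url, headers=headers, timeout=10)
        res.raise_for_status()
        soup = BeautifulSoup(res.text, 'html.parser')
        for item in soup.select('.item'):
            movies.append({
                'title': item.select('.title')[0].text,
                'rating': item.select('.rating_num')[0].text,
                'link': item.select('a')[0]['href'],
                'cover': item.select('img')[0]['src'],
            })
        time.sleep(delay)  # pause between requests to avoid hammering the site
    return movies
```

Calling `movies = fetch_all_movies(headers)` would then stand in for the single-page request and loop.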
Next, use the jieba and wordcloud libraries to build the word cloud. The following code generates it:
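```python
import jieba
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Join the titles with a separator so words are not merged across title boundaries.
text = ' '.join(movie['title'] for movie in movies)

# jieba segments the Chinese text; WordCloud expects space-separated words.
words = ' '.join(jieba.cut(text))

# font_path must point to a font with Chinese glyphs. 'msyh.ttc' (Microsoft YaHei)
# is a Windows font, so use the path of another CJK font on other systems.
wc = WordCloud(background_color="white", max_words=2000, font_path='msyh.ttc')
wc.generate(words)

plt.imshow(wc)
plt.axis("off")
plt.show()
```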
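Segmented movie titles tend to contain many one-character tokens and stray whitespace, which can crowd the picture. As an optional refinement, here is a minimal sketch that counts tokens and drops the short ones. `token_frequencies`, its `min_length` default, and the sample tokens are illustrative assumptions, not part of the original answer.

```python
from collections import Counter


def token_frequencies(tokens, min_length=2, stopwords=()):
    """Count tokens, skipping whitespace, very short tokens and stopwords."""
    counts = Counter()
    for token in tokens:
        token = token.strip()
        if len(token) < min_length or token in stopwords:
            continue
        counts[token] += 1
    return dict(counts)


# Illustrative, already-segmented tokens; in the script above they would
# come from jieba.cut(text).
sample = ['肖申克', '的', '救赎', ' ', '霸王别姬', '的', '救赎']
freqs = token_frequencies(sample)
```

The resulting dictionary can be passed to `wc.generate_from_frequencies(freqs)` in place of `wc.generate(words)`.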
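Finally, run the code to generate the word cloud for the Douban Movie Top250. Note that jieba and wordcloud must be installed beforehand, along with requests, beautifulsoup4, and matplotlib, which the code also imports.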
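To check for missing dependencies before running the scripts, something like the following sketch may help. It uses only the standard library. `missing_packages` is a hypothetical helper name, and the list holds import names, so BeautifulSoup appears as `bs4` even though it is installed from PyPI as `beautifulsoup4`.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names used by the scripts above.
required = ['requests', 'bs4', 'jieba', 'wordcloud', 'matplotlib']
missing = missing_packages(required)
if missing:
    print('Missing modules:', ', '.join(missing))
else:
    print('All required modules are available.')
```

Anything reported as missing can be installed with pip under its PyPI package name.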