Douban Movie Top250 visualization code
Date: 2023-08-30 17:12:38
The following is an example of visualization code for the Douban Movie Top250 list:
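```python
import requests
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt

# URL of the Douban Movie Top250 list
url = 'https://movie.douban.com/top250'
# Douban commonly rejects requests that lack a browser-like User-Agent,
# so send one explicitly (the exact value here is only an example)
headers = {'User-Agent': 'Mozilla/5.0'}

# Fetch the page HTML and fail early on an HTTP error status
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
html = response.content.decode('utf-8')

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

# Extract each movie's name, score, and number of ratings
movies = soup.find_all('div', {'class': 'info'})
names = [movie.find('span', {'class': 'title'}).get_text() for movie in movies]
scores = [float(movie.find('span', {'class': 'rating_num'}).get_text()) for movie in movies]
# The fourth <span> inside the "star" div holds text such as "123456人评价"
comments = [int(movie.find('div', {'class': 'star'}).find_all('span')[3].get_text().replace('人评价', '')) for movie in movies]

# Store the data in a Pandas DataFrame
df = pd.DataFrame({'Name': names, 'Score': scores, 'Comment': comments})

# Draw a scatter plot of score against number of ratings
plt.scatter(df['Comment'], df['Score'], alpha=0.5)
plt.xlabel('Number of Ratings')
plt.ylabel('Score')
plt.title('Douban Top250 Movies')
plt.show()
```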
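This code fetches the HTML of the Douban Movie Top250 page and parses it with BeautifulSoup. It then stores each movie's name, score, and number of ratings in a Pandas DataFrame and draws a scatter plot of score against rating count. As written, it requests only the single URL shown, so it covers the movies listed on that one page rather than all 250.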
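Because the snippet above requests a single URL, covering the whole list would need one request per page. The following is a minimal sketch of two helpers for that, under assumptions not stated in the original: that the list is paginated with a `start` query parameter in steps of 25, and that the rating-count text always contains a run of digits. The names `page_urls`, `parse_rating_count`, `BASE_URL`, and `PAGE_SIZE` are hypothetical and introduced only for illustration.

```python
import re
from urllib.parse import urlencode

BASE_URL = 'https://movie.douban.com/top250'
PAGE_SIZE = 25  # assumption: each page lists 25 movies


def page_urls(total=250, page_size=PAGE_SIZE):
    """Return one URL per page, using `start` as the offset parameter (assumed)."""
    return [
        f"{BASE_URL}?{urlencode({'start': start})}"
        for start in range(0, total, page_size)
    ]


def parse_rating_count(text):
    """Extract the integer from a string such as '123456人评价'."""
    match = re.search(r'\d+', text)
    if match is None:
        raise ValueError(f'no digits found in {text!r}')
    return int(match.group())
```

Each URL returned by `page_urls()` could be passed to `requests.get` in place of `url`, with the extracted names, scores, and counts appended to shared lists before the DataFrame is built. Using a regular expression in `parse_rating_count` rather than `str.replace` is a design choice that tolerates surrounding whitespace or a changed suffix; pausing briefly between requests is also advisable when fetching several pages.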