Python code for scraping the Douban Movie Top 250
Date: 2023-09-01 07:12:01
The following example uses Python to scrape the Douban Movie Top 250 list:
```python
import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'
# Send a browser-like User-Agent header with every request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
movie_list = []

# The list is paginated 25 films per page through the `start` query parameter
for i in range(0, 250, 25):
    res = requests.get(url, headers=headers, params={'start': str(i)}, timeout=10)
    res.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
    soup = BeautifulSoup(res.text, 'html.parser')
    movies = soup.find_all('div', {'class': 'info'})
    for movie in movies:
        # find() returns the first matching element inside this entry
        title = movie.find('span', {'class': 'title'}).text
        rating = movie.find('span', {'class': 'rating_num'}).text
        # Not every film has a one-line quote, so look it up once and fall back to None
        inq = movie.find('span', {'class': 'inq'})
        quote = inq.text if inq else None
        movie_list.append({'title': title, 'rating': rating, 'quote': quote})

for movie in movie_list:
    print(movie['title'], movie['rating'], movie['quote'])
```
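This code uses the requests library to fetch each page and the BeautifulSoup library to parse it. It loops over the ten pages of the list (25 films per page, selected with the `start` query parameter), finds each film's title, rating, and one-line quote (if it has one), and stores them as a dictionary in a list. Finally, it prints the information for every film.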
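The pagination step can be checked without touching the network. The following is a minimal sketch, assuming the list holds 250 entries at 25 per page as in the code above; `BASE_URL` and `page_urls` are hypothetical names introduced here for illustration. It builds the query string with the standard library's `urlencode`, which gives the same `?start=N` form that the `params` argument produces in the original request.

```python
from urllib.parse import urlencode

BASE_URL = 'https://movie.douban.com/top250'  # same URL as in the code above


def page_urls(total=250, page_size=25):
    """Return the URL of every page of the list, in order (hypothetical helper)."""
    return [
        f"{BASE_URL}?{urlencode({'start': start})}"
        for start in range(0, total, page_size)
    ]


urls = page_urls()
print(len(urls))   # 10
print(urls[0])     # https://movie.douban.com/top250?start=0
print(urls[-1])    # https://movie.douban.com/top250?start=225
```

Printing the URLs first is a cheap way to confirm that the offsets run from 0 to 225 before sending any requests.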