Scraping the Douban Top 250
Time: 2023-12-18 09:04:40
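Hello. If you want to scrape the movie information from the Douban Top 250, you can do it in Python with the requests and BeautifulSoup libraries. Here is a simple example: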
```python
import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'
# Send a browser-like User-Agent; the site may reject the default one used by requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
movie_list = []

for i in range(10):  # Scrape the first 10 pages (25 movies per page)
    start = i * 25
    req_url = url + '?start=' + str(start) + '&filter='
    res = requests.get(req_url, headers=headers, timeout=10)
    res.raise_for_status()  # Stop on an HTTP error instead of parsing an error page
    soup = BeautifulSoup(res.text, 'html.parser')
    movies = soup.select('.item')  # One .item element per movie
    for movie in movies:
        title = movie.select('.title')[0].text
        rating_num = movie.select('.rating_num')[0].text
        movie_info = movie.select('.bd p')[0].text.strip()
        movie_list.append({'title': title, 'rating_num': rating_num, 'movie_info': movie_info})

# Print every collected movie
for movie in movie_list:
    print(movie['title'], movie['rating_num'], movie['movie_info'])
```
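This code scrapes the title, the rating, and the first descriptive paragraph (the text of the first `p` element under `.bd`) for each movie in the Douban Top 250 and stores each movie as a dictionary in a list. The program then iterates over the list and prints each movie's information.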
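If you would rather keep the results than only print them, one option is to write the list to a CSV file. The following is a minimal sketch, not part of the original answer: the helper names `normalize_whitespace` and `write_movies_csv` are hypothetical, and the sample record is a made-up placeholder rather than real scraped data. It assumes each entry has the same three keys as the dictionaries in `movie_list`, and it collapses the line breaks and repeated spaces that scraped text such as `movie_info` usually contains.

```python
import csv
import io

# Column order; these match the dictionary keys used in movie_list
FIELDS = ['title', 'rating_num', 'movie_info']


def normalize_whitespace(text):
    """Collapse runs of whitespace (newlines, tabs, non-breaking spaces) into single spaces."""
    return ' '.join(text.split())


def write_movies_csv(movies, fileobj):
    """Write a list of movie dictionaries to an open text file object as CSV."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for movie in movies:
        writer.writerow({key: normalize_whitespace(movie[key]) for key in FIELDS})


# Placeholder record for illustration only (not real scraped output)
sample = [{
    'title': 'Example Movie',
    'rating_num': '9.0',
    'movie_info': 'Director: A\n   1999\xa0/\xa0Drama',
}]

# Write to an in-memory buffer so the example runs without touching the disk
buffer = io.StringIO()
write_movies_csv(sample, buffer)
print(buffer.getvalue())
```

To save the real results, you could pass a file opened with `open('douban_top250.csv', 'w', newline='', encoding='utf-8-sig')` (the file name is an assumption) together with `movie_list`. The `utf-8-sig` encoding is a common choice so that spreadsheet programs detect the Chinese titles correctly.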