Use Python to scrape the 250 entries from https://book.douban.com/top250
Time: 2023-07-14 11:53:29 Views: 55
Sure, here is Python code that implements this scraping task:
```python
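import requests
from bs4 import BeautifulSoup

url = 'https://book.douban.com/top250'
# Douban tends to reject requests that lack a browser-like User-Agent.
headers = {'User-Agent': 'Mozilla/5.0'}
books = []

# 10 pages of 25 books each; `start` is the offset of the first book on a page.
for i in range(0, 250, 25):
    params = {'start': str(i)}
    response = requests.get(url, params=params, headers=headers, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    # Assumes each book is a <tr class="item"> row. The author line and the
    # rating sit next to <div class="pl2">, not inside it, so search the row.
    book_items = soup.find_all('tr', class_='item')
    for item in book_items:
        title_link = item.find('div', class_='pl2').find('a')
        book = {}
        # Collapse the whitespace and line breaks inside the link text.
        book['title'] = ' '.join(title_link.text.split())
        book['link'] = title_link['href']
        # The info line reads "author / publisher / date / price".
        book['author'] = item.find('p', class_='pl').text.strip().split('/')[0].strip()
        book['rating'] = item.find('span', class_='rating_nums').text.strip()
        # Drop the surrounding parentheses and the padding inside them.
        book['num_of_comments'] = item.find('span', class_='pl').text.strip().strip('()').strip()
        books.append(book)

print(books)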
```
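This code scrapes the book information from every page of the Douban Books Top 250 list and stores it in a list. Each record contains the title, link, author, rating, and comment count.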
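
Printing 250 dictionaries is hard to read, so it can help to save the records instead. The following is a minimal sketch, assuming each record has the five keys used above; `books_to_csv`, `FIELDS`, and the sample record are hypothetical names and placeholder values introduced here for illustration, not part of the original answer.

```python
import csv
import io

# Column order for the output; matches the keys built by the scraper.
FIELDS = ['title', 'link', 'author', 'rating', 'num_of_comments']


def books_to_csv(books):
    """Serialize a list of book dicts to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator='\n')
    writer.writeheader()
    writer.writerows(books)
    return buffer.getvalue()


# Placeholder record for demonstration only, not real scraped data.
sample = [{
    'title': 'Example Title',
    'link': 'https://example.com/book/1',
    'author': 'Example Author',
    'rating': '9.0',
    'num_of_comments': '100',
}]

csv_text = books_to_csv(sample)
print(csv_text)
```

To write a file instead of a string, the same `csv.DictWriter` calls work on a file opened with `open('books.csv', 'w', newline='', encoding='utf-8-sig')`; the `utf-8-sig` encoding is a common choice so that spreadsheet programs display Chinese titles correctly.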