Code to scrape all the data from https://yuedu.163.com/book/rank/sell/week/p1/s20

Time: 2023-06-15 09:04:00 Views: 149

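The following Python 3 code scrapes the data from https://yuedu.163.com/book/rank/sell/week/p1/s20:

```python
import requests
from bs4 import BeautifulSoup

url = 'https://yuedu.163.com/book/rank/sell/week/p1/s20'

# Send a GET request; fail fast on timeouts and HTTP errors
response = requests.get(url, timeout=10)
response.raise_for_status()

# Parse the returned HTML
soup = BeautifulSoup(response.text, 'html.parser')


def text_of(parent, name, class_name=None):
    """Return the stripped text of the first matching tag, or '' if it is missing."""
    tag = parent.find(name, class_=class_name) if class_name else parent.find(name)
    return tag.get_text(strip=True) if tag else ''


# Each book is expected to sit in a <div class="book-info"> element
for book in soup.find_all('div', class_='book-info'):
    print(text_of(book, 'h3'))             # title
    print(text_of(book, 'p', 'author'))    # author
    print(text_of(book, 'p', 'category'))  # category
    print(text_of(book, 'p', 'desc'))      # description
    print('-' * 50)
```

The code sends a GET request to the site with the requests library to fetch the page content, then parses that content with BeautifulSoup to collect the information for every book on the page. Finally, it prints each book's title, author, category, and description. A field whose tag is missing from the page is printed as an empty line instead of raising an error.

Note that this code only scrapes the first page. To fetch more pages, change the page-number segment of the URL (it appears to be `p1`, so `p2` would be the second page).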
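The page-by-page loop could be sketched as follows. This is a minimal sketch, not a tested crawler: it assumes that the `p1` segment of the URL is the page number and that later pages follow the same pattern, neither of which has been verified against the site. The names `page_url`, `fetch_html`, and `crawl_pages` are hypothetical helpers introduced here for illustration, and `parse` stands for any function that turns one page of HTML into a list of books, for example a wrapper around the BeautifulSoup extraction in the answer.

```python
import time

# Assumption: "p{page}" is the page number; "s20" is left unchanged.
URL_TEMPLATE = 'https://yuedu.163.com/book/rank/sell/week/p{page}/s20'


def page_url(page):
    """Build the ranking URL for a given 1-based page number."""
    return URL_TEMPLATE.format(page=page)


def fetch_html(url):
    """Download one page and return its HTML text."""
    import requests  # imported here so the URL helpers work without it
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def crawl_pages(first_page, last_page, parse, fetch=fetch_html, delay=1.0):
    """Fetch pages first_page..last_page and collect the parsed items.

    Stops early at the first page for which parse() returns nothing,
    on the assumption that an empty page marks the end of the ranking.
    """
    results = []
    for page in range(first_page, last_page + 1):
        if page > first_page:
            time.sleep(delay)  # pause between requests to avoid hammering the site
        items = parse(fetch(page_url(page)))
        if not items:
            break
        results.extend(items)
    return results
```

Passing `fetch` and `parse` in as arguments keeps the loop independent of the network and of the page layout, so the same loop can be tried against saved HTML before it is pointed at the live site.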
