Scraping Douban Books with Python
Date: 2023-11-08 12:50:19
Scraping Douban book data with Python
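Sure, here is a simple example of a Python scraper for Douban Books that you can use as a reference. First, install the three libraries it needs: requests, BeautifulSoup4, and lxml: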
```
pip install requests
pip install BeautifulSoup4
pip install lxml
```
Then you can use the following code to scrape Douban Books:
```python
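import requests
from bs4 import BeautifulSoup

url = 'https://book.douban.com/top250'
# Send a browser-like User-Agent header with the request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
res = requests.get(url, headers=headers, timeout=10)
res.raise_for_status()  # stop early on an HTTP error status
soup = BeautifulSoup(res.text, 'lxml')

# Each book on the page is rendered as its own <table> inside div.article.
book_list = soup.find('div', {'class': 'article'}).find_all('table')
for book in book_list:
    book_name = book.find('div', {'class': 'pl2'}).a['title']
    # p.pl is the publication line; it typically also lists publisher, date and price.
    book_author = book.find('p', {'class': 'pl'}).get_text().strip()
    book_score = book.find('span', {'class': 'rating_nums'}).get_text().strip()
    # Some books have no one-line description (span.inq), so guard against None.
    inq = book.find('span', {'class': 'inq'})
    book_intro = inq.get_text().strip() if inq else ''
    print(f'Title: {book_name}\nAuthor: {book_author}\nRating: {book_score}\nDescription: {book_intro}\n')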
```
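This code scrapes the title, author, rating, and one-line description of each book on the Douban Books Top 250 page at that URL and prints them to the console. Note that a single request only returns one page of the list, not all 250 books. You can adapt the code to your own needs.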
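One common adaptation is walking through every page of the list. The sketch below is only an illustration under some assumptions: that the list is paginated 25 books at a time through a `start` query parameter, and that a short pause between requests is acceptable. The names `page_urls` and `crawl` are hypothetical helpers, not part of any library.

```python
import time
from urllib.parse import urlencode

BASE_URL = 'https://book.douban.com/top250'
PAGE_SIZE = 25  # assumption: each page of the list shows 25 books


def page_urls(total=250, page_size=PAGE_SIZE):
    """Return one URL per page, using a `start` offset (assumed parameter name)."""
    return [f'{BASE_URL}?{urlencode({"start": offset})}'
            for offset in range(0, total, page_size)]


def crawl(urls, fetch, delay=1.0):
    """Call fetch(url) for each URL in order, sleeping between requests."""
    pages = []
    for i, page_url in enumerate(urls):
        if i > 0:
            time.sleep(delay)  # pause so the requests are not sent in a burst
        pages.append(fetch(page_url))
    return pages
```

With the script above, `fetch` could be something like `lambda u: requests.get(u, headers=headers).text`, and each returned page can then go through the same BeautifulSoup loop. Passing the fetch function in as an argument keeps the paging logic separate from the network call, which also makes it easy to try out with a stub.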