Scraping the Douban Books Top 50 with Python (regular expressions)
Date: 2023-12-20 13:32:10
A crawler for Douban Books, written in Python.
Below is example code that uses regular expressions to scrape the Douban Books Top 50:
```python
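import re

import requests

# Assumption: Douban tends to reject the default requests User-Agent,
# so send a browser-like one
headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://book.douban.com/top250'
# Capture each book's link and title; [^>]*? skips any attributes
# that sit between href and title
pattern = r'<div class="pl2">\s*<a href="([^"]+)"[^>]*?title="([^"]+)"'

results = []
# Assumption: each page lists 25 books, so the Top 50 spans start=0 and start=25
for start in (0, 25):
    # Send the request and get the page content
    response = requests.get(url, params={'start': start}, headers=headers, timeout=10)
    response.raise_for_status()
    # Extract the book information with the regular expression
    results.extend(re.findall(pattern, response.text, re.S))

# Print the book information
for book_url, book_title in results:
    print("Book link:", book_url)
    print("Book title:", book_title)
    print("--")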
```
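To check what a pattern of this kind captures without touching the network, it can help to run it against a small hand-written fragment first. The sketch below is only an illustration: `sample_html` is a hypothetical fragment (the real Douban markup may differ), and the slightly stricter pattern shown here is an assumption, not necessarily the one the page requires.

```python
import re

# Hypothetical HTML fragment for illustration; the real page markup may differ.
sample_html = '''
<div class="pl2">
  <a href="https://example.com/subject/1/" title="Book One">Book One</a>
</div>
<div class="pl2">
  <a href="https://example.com/subject/2/" class="x" title="Book Two">Book Two</a>
</div>
'''

# Two capture groups: the link and the title.
# [^"]+ stops at the closing quote, and [^>]*? skips any other
# attributes that appear between href and title.
pattern = r'<div class="pl2">\s*<a href="([^"]+)"[^>]*?title="([^"]+)"'

# With two groups, re.findall returns a list of (link, title) tuples.
books = re.findall(pattern, sample_html)
for link, title in books:
    print(link, title)
```

Because the pattern relies on `\s` and negated character classes rather than `.`, it matches across line breaks even without `re.S`. Using `[^"]+` instead of `.*?` also keeps each group from running past its own closing quote when extra attributes are present.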