import requests from bs4 import BeautifulSoup import lxml headers = { "User-Agent": "Mozilla/5.0 (Macintosh; Intel)" } for start_name in range(0, 250, 25): res = requests.get(f"https://book.douban.com/top250?start=", headers=headers) print(res.status_code) soup = BeautifulSoup(res.text, 'lxml') t1 = soup.findAll('div', attrs={'class': 'pl2'}) for i in t1: t2 = i.find('a').text.strip() # strip whitespace from the <a> tag text print(t2); remove the whitespace that shows up in the output
Time: 2023-06-19 21:03:11 Views: 310
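The error in the code is that the URL is not built correctly when the request is sent: the f-string ends at `start=` and never interpolates the loop variable, so every iteration fetches the same first page. The page offset `start_name` should be inserted into the URL, as follows: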
```python
import requests
from bs4 import BeautifulSoup
import lxml  # not used directly; BeautifulSoup's 'lxml' parser needs it installed

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel)"
}

# `start` is the offset of the first book on each page: 0, 25, ..., 225
for start_name in range(0, 250, 25):
    url = f"https://book.douban.com/top250?start={start_name}"
    res = requests.get(url, headers=headers)
    print(res.status_code)
    soup = BeautifulSoup(res.text, 'lxml')
    t1 = soup.find_all('div', attrs={'class': 'pl2'})
    for i in t1:
        t2 = i.find('a').text.strip()  # strip leading/trailing whitespace from the <a> text
        print(t2)
```
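The question also asks how to remove the whitespace that appears in the printed titles. `strip()` only trims the two ends of the string, so any line breaks or runs of spaces inside the `<a>` text would remain. A minimal sketch of one way to handle this, assuming the unwanted whitespace is ordinary spaces and newlines inside the tag text; the helper name `normalize_title` and its `sep` parameter are hypothetical, not part of the original code:

```python
def normalize_title(raw: str, sep: str = " ") -> str:
    """Collapse every run of whitespace (spaces, tabs, newlines) in `raw`.

    str.split() with no arguments splits on any whitespace run and drops
    leading/trailing whitespace, so joining the pieces back with `sep`
    leaves exactly one separator between words.
    Pass sep="" to remove the whitespace entirely.
    """
    return sep.join(raw.split())


# Example input shaped like a multi-line <a> text (made-up sample string)
sample = "  红楼梦\n\n   : 脂砚斋评本 "
print(normalize_title(sample))          # single spaces between parts
print(normalize_title(sample, sep=""))  # no whitespace at all
```

In the loop above this would replace the `strip()` call, for example `t2 = normalize_title(i.find('a').text)`. Using `sep=""` suits titles written only in Chinese, while the default keeps the word spacing in titles that contain English.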