Show which page the data comes from when printing the output
Time: 2024-04-03 16:36:03 Views: 82
Sure, you can modify the code as follows:
```
import requests
from bs4 import BeautifulSoup
from pprint import pprint

# The request headers are the same for every page, so define them once.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
}

for page in range(5):
    # Each page holds 20 comments; `start` is the offset of the first one.
    url = "https://movie.douban.com/subject/30391186/comments?start={}&limit=20&sort=new_score&status=P".format(page * 20)
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")

    # Collect the comments for this page only.
    comments = []
    for comment in soup.select(".comment-item"):
        content = comment.select_one(".comment-content").get_text().strip()
        author = comment.select_one(".comment-info a").get_text().strip()
        likes = comment.select_one(".votes").get_text().strip()
        time = comment.select_one(".comment-time").get_text().strip()
        location = comment.select_one(".comment-location").contents[-1].strip()
        comments.append({
            "content": content,
            "author": author,
            "likes": likes,
            "time": time,
            "location": location
        })

    # Print a header with the 1-based page number, then this page's data.
    print("Data for page {}:".format(page + 1))
    pprint(comments)
```
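In this code, we added a print statement inside the page loop to show which page the data belongs to. The format method inserts the page number into the string (page+1, because the loop counts from 0), and the print function outputs it before the comments for that page.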
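To see the page-numbering logic on its own, without any network requests, here is a minimal sketch. The helper names `page_start` and `page_header` and the constant `PAGE_SIZE` are hypothetical and are not part of the code above; they only illustrate how a zero-based loop index maps to the `start` offset in the URL and to the 1-based page number that gets printed.

```python
PAGE_SIZE = 20  # assumed comments per page, matching limit=20 in the URL


def page_start(page_index, page_size=PAGE_SIZE):
    """Return the `start` offset for a zero-based page index."""
    return page_index * page_size


def page_header(page_index):
    """Return the header line for a zero-based page index, shown 1-based."""
    return "Data for page {}:".format(page_index + 1)


for page in range(3):
    # Prints the header followed by the offset used for that page.
    print(page_header(page), "start =", page_start(page))
```

Keeping the offset and the displayed page number in separate helpers makes it harder to mix up the two: the URL needs `page * 20`, while the reader expects pages counted from 1.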