
Scraping Douban Movie Reviews with Python

Date: 2023-09-14 19:12:28 Views: 67

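Sure. You can do this with Python's requests and BeautifulSoup libraries. First, use requests to send an HTTP request and fetch the page content:

```python
import requests

url = "https://movie.douban.com/subject/1292052/comments?start=0&limit=20&sort=new_score&status=P"
# Assumption: Douban may reject the default requests User-Agent, so send a browser-like one.
headers = {"User-Agent": "Mozilla/5.0"}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
html = response.text
```

This example uses the film The Shawshank Redemption (《肖申克的救赎》) and fetches its comments page. The URL carries several query parameters, such as start, limit, and sort, which you can change as needed.

Next, parse the HTML with BeautifulSoup and extract the information you need:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
# Each short comment sits in a <span class="short"> element.
comments = soup.find_all("span", class_="short")
for comment in comments:
    print(comment.text)
```

Here find_all looks up every span tag whose class attribute is "short", which is the tag that holds the comment text. The text attribute then returns the text inside each tag, and the loop prints it to the console.

That gives you a simple Douban review scraper. In real use, keep the relevant legal and ethical rules in mind and avoid infringing on other people's rights.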
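Since the answer notes that start, limit, and sort can be changed, here is a minimal sketch of building the URL for several pages with the standard library. The helper names `comment_page_url` and `comment_page_urls` are hypothetical, and the idea that advancing start by limit moves to the next page is an assumption based on the parameter names in the example URL, not something confirmed by Douban.

```python
from urllib.parse import urlencode

# URL pattern taken from the example above; subject_id identifies the film.
BASE = "https://movie.douban.com/subject/{subject_id}/comments"


def comment_page_url(subject_id, start=0, limit=20, sort="new_score", status="P"):
    """Build the URL of one comments page from its query parameters."""
    query = urlencode({"start": start, "limit": limit, "sort": sort, "status": status})
    return BASE.format(subject_id=subject_id) + "?" + query


def comment_page_urls(subject_id, pages, limit=20):
    """Return URLs for the first `pages` pages.

    Assumption: `start` is an offset, so each page advances it by `limit`.
    """
    return [
        comment_page_url(subject_id, start=page * limit, limit=limit)
        for page in range(pages)
    ]


for page_url in comment_page_urls(1292052, pages=3):
    print(page_url)
```

Each generated URL can be passed to requests.get in place of the hard-coded one. If you fetch several pages in a loop, pausing between requests (for example with time.sleep) keeps the load on the site low.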
