Scraping Douban Movie Short Comments with a Crawler
Posted: 2023-09-18 11:06:01
You can write a crawler with third-party Python libraries such as Requests and BeautifulSoup to fetch short comments for a Douban movie. Here is a simple example:
```python
import requests
from bs4 import BeautifulSoup


def crawl_douban_comments(movie_url):
    # Send a browser-like User-Agent; the default one is often rejected.
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36'
    }
    response = requests.get(movie_url, headers=headers, timeout=10)
    # Fail loudly on HTTP errors instead of parsing an error page.
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    # Each short comment is the element with class "short" inside a "comment-item".
    comment_tags = soup.select('.comment-item .short')
    return [comment.text.strip() for comment in comment_tags]


# Example Douban movie comments URL
movie_url = 'https://movie.douban.com/subject/26931786/comments?status=P'
comments = crawl_douban_comments(movie_url)
for comment in comments:
    print(comment)
```
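Note that Douban has anti-scraping mechanisms and may throttle or block frequent requests. When running a crawler in practice, make sure you follow the site's terms of use and common crawler etiquette.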
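One way to keep the request rate low is to pause between page fetches. The following is a minimal sketch, not a definitive implementation: `page_urls` and `crawl_politely` are hypothetical helper names, the `start` and `limit` query parameters are an assumption about how the comments page is paginated, and the 2-second default delay is an arbitrary choice rather than a documented limit.

```python
import time
from urllib.parse import urlencode


def page_urls(base_url, pages, page_size=20):
    """Build one URL per comment page.

    Assumption: the site paginates with `start` and `limit` query parameters.
    """
    urls = []
    for i in range(pages):
        query = urlencode({'start': i * page_size, 'limit': page_size, 'status': 'P'})
        urls.append(f"{base_url}?{query}")
    return urls


def crawl_politely(urls, fetch, delay=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing `delay` seconds between requests.

    `fetch` is any callable that returns a list of comments for a URL.
    Stops early when a page yields nothing (no more comments, or blocked).
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # wait before every request except the first
        batch = fetch(url)
        if not batch:
            break
        results.extend(batch)
    return results
```

With the function from the example above, a call might look like `crawl_politely(page_urls('https://movie.douban.com/subject/26931786/comments', 3), crawl_douban_comments)`. Passing `fetch` and `sleep` as parameters keeps the throttling logic testable without touching the network.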