Code for scraping Dianping review data

Dianping's anti-scraping measures are fairly strict, so some care is needed to avoid being blocked. The following is an example of scraping Dianping review data with Python:

```python
import time

import requests
from bs4 import BeautifulSoup

HEADERS = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'),
    'Cookie': 'xxxxxxx',  # paste the cookies you obtained manually here
}


def get_comments(soup):
    """Extract reviewer name, star rating, and review text from a parsed review page."""
    comments = []
    comment_list = soup.find('div', class_='comment-list')
    if comment_list is None:  # this page has no review list
        return comments
    for comment in comment_list.find_all('div', class_='comment'):
        name = comment.find('div', class_='user-info').find('a').text.strip()
        # The rating is the number of star spans inside the rank element
        stars = comment.find('span', class_='sml-rank-stars')
        star = len(stars.find_all('span', class_='sml-str'))
        content = comment.find('div', class_='comment-txt').text.strip()
        comments.append({'name': name, 'star': star, 'content': content})
    return comments


if __name__ == '__main__':
    comments = []
    for i in range(1, 11):  # review pages 1 to 10
        print('Scraping review page %d...' % i)
        url = 'https://www.dianping.com/shop/5343507/review_all/p%d' % i
        # Fetch each page once and parse the same response
        response = requests.get(url, headers=HEADERS, timeout=10)
        soup = BeautifulSoup(response.text, 'html.parser')
        if not soup.find_all('div', class_='content'):
            break  # no content block: stop paging
        comments += get_comments(soup)
        time.sleep(2)  # pause between pages

    # Save the reviews to a file
    with open('comments.txt', 'w', encoding='utf-8') as f:
        for comment in comments:
            f.write('Name: %s, Rating: %d, Review: %s\n'
                    % (comment['name'], comment['star'], comment['content']))
```

Note that you must obtain the cookies manually and fill them into the `Cookie` field of `HEADERS`; otherwise the site will block the requests. In addition, to avoid being blocked, keep a reasonable interval between requests; here the script pauses for 2 seconds after each page of reviews.
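The pause between pages can be separated from the fetching logic so that it is easy to adjust and to test without touching the network. The sketch below is one possible way to do that, not part of the original answer: the names `next_delay` and `crawl_pages`, the injectable `sleep` argument, and the random jitter added on top of the 2-second base are all assumptions made for illustration.

```python
import random
import time


def next_delay(base=2.0, jitter=1.0, rng=random):
    """Return a pause of `base` seconds plus up to `jitter` seconds of random extra."""
    return base + rng.uniform(0.0, jitter)


def crawl_pages(fetch_page, page_numbers, sleep=time.sleep, base=2.0, jitter=1.0):
    """Call fetch_page(n) for each page number, pausing after every non-empty page.

    Stops at the first page that returns no items. `fetch_page` is a
    hypothetical callable supplied by the caller (for example, a function that
    downloads one review page and returns the parsed comments as a list).
    """
    results = []
    for n in page_numbers:
        items = fetch_page(n)
        if not items:
            break  # an empty page is treated as the end of the reviews
        results.extend(items)
        sleep(next_delay(base, jitter))
    return results


if __name__ == '__main__':
    # Stand-in data instead of real HTTP requests, with sleeping disabled
    fake_pages = {1: ['a', 'b'], 2: ['c'], 3: []}
    print(crawl_pages(lambda n: fake_pages.get(n, []), range(1, 11),
                      sleep=lambda seconds: None))
```

With a real fetcher, `crawl_pages` would be called with the default `sleep=time.sleep`, so each non-empty page is followed by a pause of 2 to 3 seconds.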
