
Python Crawler Code for Scraping and Analyzing QQ Music Song Comments

Time: 2024-12-25 16:30:50

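Python crawlers are widely used to collect data from websites, for example page content or API responses. QQ Music has anti-scraping measures and a relatively complex page structure, so libraries such as `requests` (sends HTTP requests), `BeautifulSoup` (parses HTML), or `Scrapy` (a more full-featured framework) are used to extract the comments. The following simple example uses `requests` and `BeautifulSoup` to fetch some of the comments for a song on QQ Music:

```python
import requests
from bs4 import BeautifulSoup

# URL of the song page to scrape; replace xxxxx and yyyyy with the real songmid and hashcode
url = "https://y.qq.com/n/yqq/song/index.html?songmid=xxxxx&hashcode=yyyyy"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}

# Send the request and fail early on an HTTP error status
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()

# Parse the HTML (the 'lxml' parser requires the lxml package)
soup = BeautifulSoup(response.text, 'lxml')

# Locate the comment area; find() returns None when nothing matches,
# e.g. if the page structure differs or the comments are loaded dynamically
comments_area = soup.find('div', {'class': 'comment_list'})

# Extract the username and text of each comment
comments = comments_area.find_all('li', {'class': 'comment-item'}) if comments_area is not None else []
for comment in comments:
    user_tag = comment.find('a', {'class': 'nick'})            # username
    content_tag = comment.find('span', {'class': 'content'})   # comment text
    if user_tag is not None and content_tag is not None:
        print(f"{user_tag.text.strip()} - {content_tag.text.strip()}")
```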
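The title also mentions analyzing the comments, which the scraping example does not cover. As a minimal sketch, assuming the scraped results have been collected into a list of `(username, content)` pairs instead of being printed, a first pass at analysis could look like the following. The function name `summarize_comments` and the sample data are hypothetical and not part of the original answer. Character bigrams are used here only as a standard-library stand-in; real Chinese text analysis would normally rely on a word segmentation library.

```python
import re
from collections import Counter


def summarize_comments(comments):
    """Summarize a list of (username, content) pairs.

    Returns the number of non-empty comments, their average length,
    the most active users, and the most common character bigrams.
    """
    # Drop comments whose text is empty or whitespace only
    cleaned = [(user, content.strip()) for user, content in comments
               if content and content.strip()]
    total = len(cleaned)
    average_length = sum(len(text) for _, text in cleaned) / total if total else 0.0

    # Count how many non-empty comments each user posted
    user_counts = Counter(user for user, _ in cleaned)

    # Count adjacent character pairs inside each run of word characters,
    # so punctuation and whitespace never appear in a bigram
    bigrams = Counter()
    for _, text in cleaned:
        for run in re.findall(r"\w+", text):
            bigrams.update(run[i:i + 2] for i in range(len(run) - 1))

    return {
        "total": total,
        "average_length": average_length,
        "top_users": user_counts.most_common(3),
        "top_bigrams": bigrams.most_common(5),
    }


# Hypothetical sample data standing in for scraped results
sample = [
    ("alice", "好听，单曲循环"),
    ("bob", "单曲循环一整天"),
    ("alice", "  "),
]
summary = summarize_comments(sample)
print(summary)
```

Returning a plain dictionary keeps the sketch easy to extend, for example by writing the summary to a file or feeding the counts into a plotting library.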
