Use Python to scrape the comments on a Bilibili video and analyze them with a word cloud
Date: 2024-06-08 22:09:48
Sure. Below is Python code that fetches the comments of a Bilibili video and generates a word cloud from them. It requires the `requests`, `jieba`, and `wordcloud` libraries:
```python
import requests
import jieba
from wordcloud import WordCloud
# Fetch the first page of comments for a Bilibili video
def get_comments(av_num):
    # oid is the video's av number; pn=1 requests only the first page
    url = f"https://api.bilibili.com/x/v2/reply?jsonp=jsonp&pn=1&type=1&oid={av_num}&sort=2"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    }
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    data = response.json()
    # "replies" may be missing or null when there are no comments to return
    replies = (data.get("data") or {}).get("replies") or []
    return [reply["content"]["message"] for reply in replies]
# Generate the word cloud image
def gen_word_cloud(text):
    # Segment the Chinese text into space-separated words
    cut_text = " ".join(jieba.cut(text))
    wc = WordCloud(
        font_path="msyh.ttc",  # must be a font with Chinese glyphs (Microsoft YaHei here)
        background_color="white",
        width=1000,
        height=600,
        max_words=200,
        max_font_size=80,
    )
    wc.generate(cut_text)
    wc.to_file("wordcloud.png")
if __name__ == "__main__":
    av_num = "enter the video's av number here"
    comments = get_comments(av_num)
    # Join with newlines so words from adjacent comments do not run together
    text = "\n".join(comments)
    gen_word_cloud(text)
```
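`get_comments()` fetches the video's comments through Bilibili's API; as written, it requests only the first page (`pn=1`). `gen_word_cloud()` segments the text with `jieba` and renders the word cloud with `wordcloud`, saving it as `wordcloud.png`. The `if __name__ == "__main__":` block calls both functions to produce the word cloud of the video's comments. Make sure to fill in the correct av number for the video.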
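Since the request above covers only one page, more comments could be gathered by looping over `pn`. The following is a minimal sketch, not part of the original answer: `collect_messages` and `fetch_page` are hypothetical names, and the loop assumes each page has the same `data` → `replies` → `content` → `message` shape the code above reads, with an empty or null `replies` marking the end. The page fetcher is passed in as a function so the loop can be tried offline with canned data.

```python
from typing import Callable, Dict, List


def collect_messages(fetch_page: Callable[[int], Dict], max_pages: int = 5) -> List[str]:
    """Collect comment texts from pages 1..max_pages.

    fetch_page(pn) is a hypothetical callable returning the parsed JSON
    for page pn, in the shape the code above reads.
    """
    messages: List[str] = []
    for pn in range(1, max_pages + 1):
        payload = fetch_page(pn)
        # Assumption: "replies" is missing, null, or empty once pages run out.
        replies = (payload.get("data") or {}).get("replies") or []
        if not replies:
            break
        messages.extend(reply["content"]["message"] for reply in replies)
    return messages


# Offline demonstration with canned pages instead of real HTTP requests.
fake_pages = {
    1: {"data": {"replies": [{"content": {"message": "first"}},
                             {"content": {"message": "second"}}]}},
    2: {"data": {"replies": [{"content": {"message": "third"}}]}},
    3: {"data": {"replies": None}},
}
demo_messages = collect_messages(lambda pn: fake_pages.get(pn, {"data": {}}), max_pages=10)
print(demo_messages)
```

In real use, `fetch_page` would wrap the `requests.get` call from `get_comments()` with `pn` substituted into the URL, ideally with a short pause between requests.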
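Word clouds of comments tend to be dominated by particles and filler words, so it may help to filter tokens before rendering. The sketch below is an illustration under assumptions rather than part of the original answer: `token_frequencies`, the stopword set, and the sample tokens are all hypothetical, and the tokens stand in for what `jieba.cut(text)` would yield.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def token_frequencies(tokens: Iterable[str], stopwords: Set[str], min_len: int = 2) -> Dict[str, int]:
    """Count tokens, skipping stopwords, whitespace, and tokens shorter than min_len."""
    counts: Counter = Counter()
    for token in tokens:
        token = token.strip()
        if len(token) < min_len or token in stopwords:
            continue
        counts[token] += 1
    return dict(counts)


# Hypothetical token stream standing in for the output of jieba.cut(text).
demo_tokens = ["视频", "的", "很", "好看", "视频", " ", "哈哈", "的", "好看", "视频"]
demo_stopwords = {"哈哈"}  # illustrative stopword list, not a recommended one
demo_freq = token_frequencies(demo_tokens, demo_stopwords)
print(demo_freq)
```

The resulting dictionary could then be passed to `WordCloud.generate_from_frequencies()` in place of the `wc.generate(cut_text)` call.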