How to Batch-Scrape Xiaohongshu Post Tags with Python
Posted: 2023-05-28 15:05:54
Here is one possible approach:
1. Install the requests and BeautifulSoup modules:
```shell
pip install requests
pip install beautifulsoup4
```
2. Write the scraper code:
```python
import requests
from bs4 import BeautifulSoup

# Xiaohongshu post URL
url = 'https://www.xiaohongshu.com/discovery/item/5f627f2f0000000001002fb3'

# Send a request for the HTML page; a browser-like User-Agent
# helps avoid the request being rejected outright
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers)
html = response.text

# Parse the HTML and select the tag elements
soup = BeautifulSoup(html, 'html.parser')
tags = soup.select('.tag-text')

# Print the tags
for tag in tags:
    print(tag.text)
```
3. Batch-scrape tags from a list of posts:
```python
import requests
from bs4 import BeautifulSoup

# List of Xiaohongshu post URLs
urls = [
    'https://www.xiaohongshu.com/discovery/item/5f627f2f0000000001002fb3',
    'https://www.xiaohongshu.com/discovery/item/5f5e5f3d0000000001002fac',
    'https://www.xiaohongshu.com/discovery/item/5f5e5f3d0000000001002fac'
]

headers = {'User-Agent': 'Mozilla/5.0'}

# Loop over the URL list and scrape the tags from each post
for url in urls:
    # Send a request for the HTML page
    response = requests.get(url, headers=headers)
    html = response.text

    # Parse the HTML and select the tag elements
    soup = BeautifulSoup(html, 'html.parser')
    tags = soup.select('.tag-text')

    # Print the tags
    for tag in tags:
        print(tag.text)
```
The code above batch-scrapes the tags of the listed Xiaohongshu posts. If you need to save the tags to a file instead of printing them, you can use Python's file-handling functions.
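The file-saving step could be sketched as follows. This is a minimal illustration, not part of the original post: it assumes you replace the `print` calls in the loop with code that collects results into a hypothetical `tags_by_url` dict, shown here with sample data so the writing logic is self-contained.

```python
import csv

# Hypothetical sample data: in the scraping loop above you would
# collect tags per post, e.g. tags_by_url[url] = [t.text for t in tags]
tags_by_url = {
    'https://www.xiaohongshu.com/discovery/item/5f627f2f0000000001002fb3':
        ['travel', 'food'],
}

# Write one (url, tag) pair per row to a CSV file
with open('tags.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['url', 'tag'])
    for url, tag_list in tags_by_url.items():
        for tag in tag_list:
            writer.writerow([url, tag])
```

Using one row per tag keeps the file easy to load back into pandas or a spreadsheet for counting tag frequencies later.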