Please use Python to crawl Weibo data, using comments, likes, reposts, followers, time, status, and post text as the data features.
Time: 2023-10-20 18:18:32 Views: 59
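Sure. Crawling Weibo data with Python requires obtaining authorization first and then fetching the data through a third-party library. Below is a basic example for reference: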
```python
import weibo_sdk
from weibo_sdk import WeiboOAuth2, WeiboAPI
import pandas as pd
# Obtain authorization
app_key = 'your_app_key'
app_secret = 'your_app_secret'
redirect_url = 'your_redirect_url'
auth = WeiboOAuth2(app_key, app_secret, redirect_url)
url = auth.authorize_url
code = 'your_code'  # taken from the callback URL after authorization
token = auth.auth_access(code)
client = WeiboAPI(token)
# Fetch Weibo data
uid = 'your_user_id'
statuses = client.get('statuses/user_timeline', uid=uid, count=100)  # the 100 most recent posts
weibo_data = []
for status in statuses['statuses']:
    user = status['user']
    # One flat record per post: post fields first, then author fields
    weibo_dict = {
        'id': status['id'],
        'text': status['text'],
        'created_at': status['created_at'],
        'comments_count': status['comments_count'],
        'attitudes_count': status['attitudes_count'],
        'reposts_count': status['reposts_count'],
        'user_id': user['id'],
        'user_name': user['name'],
        'user_followers_count': user['followers_count'],
        'user_friends_count': user['friends_count'],
        'user_statuses_count': user['statuses_count'],
        'user_description': user['description'],
    }
    weibo_data.append(weibo_dict)
# Organize the data into a DataFrame with a fixed column order
weibo_df = pd.DataFrame(weibo_data)
weibo_df = weibo_df[['id', 'text', 'created_at', 'comments_count', 'attitudes_count', 'reposts_count',
                     'user_id', 'user_name', 'user_followers_count', 'user_friends_count',
                     'user_statuses_count', 'user_description']]
```
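The request above returns a single page of at most 100 posts. If more history is needed, one option is to request successive pages and stop at the first empty one. The sketch below shows that loop on its own. `collect_statuses` and `fetch_page` are hypothetical names introduced for illustration, and the sketch assumes each response is a dict with a `statuses` list, as in the code above.

```python
from typing import Callable, Dict, List


def collect_statuses(fetch_page: Callable[[int], Dict], max_pages: int = 5) -> List[Dict]:
    """Call fetch_page(1), fetch_page(2), ... and gather every 'statuses' list."""
    collected: List[Dict] = []
    for page in range(1, max_pages + 1):
        response = fetch_page(page)
        batch = response.get('statuses', [])
        if not batch:
            # An empty page is treated as the end of the timeline
            break
        collected.extend(batch)
    return collected
```

With the client from the example, `fetch_page` could be something like `lambda page: client.get('statuses/user_timeline', uid=uid, count=100, page=page)`, assuming the endpoint accepts a `page` parameter. Keeping `max_pages` small helps stay within the API's rate limits.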
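The authorization and timeline code uses a third-party library imported as weibo_sdk, which must be installed first, along with pandas. Check that the class and method names match the SDK you actually install. The collected fields are the post ID, post text, creation time, comment count, like count, repost count, user ID, user name, follower count, following count, post count, and user description. Finally, the data is organized into a DataFrame, which is convenient for later processing and analysis.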
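For analysis, the time feature is usually more useful as a real datetime than as a string, and missing fields should not crash the loop. The following is a minimal sketch of a per-post conversion. `status_to_row` is a hypothetical helper, and the time format is an assumption based on timestamps of the form `Tue May 31 17:46:55 +0800 2011`; adjust it if your responses differ.

```python
from datetime import datetime

# Assumed layout of created_at, e.g. "Tue May 31 17:46:55 +0800 2011"
WEIBO_TIME_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def status_to_row(status: dict) -> dict:
    """Flatten one status dict into a feature row, tolerating missing keys."""
    user = status.get('user') or {}
    try:
        created = datetime.strptime(status.get('created_at'), WEIBO_TIME_FORMAT)
    except (TypeError, ValueError):
        # Missing or unexpected timestamp: keep the row, leave the time empty
        created = None
    return {
        'id': status.get('id'),
        'text': (status.get('text') or '').strip(),
        'created_at': created,
        'comments_count': status.get('comments_count', 0),
        'attitudes_count': status.get('attitudes_count', 0),
        'reposts_count': status.get('reposts_count', 0),
        'user_id': user.get('id'),
        'user_followers_count': user.get('followers_count', 0),
    }
```

The rows can then be passed straight to pandas, for example `pd.DataFrame([status_to_row(s) for s in statuses['statuses']])`.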