Python: scraping BOSS Zhipin and visualizing the results (source code)
Date: 2023-12-30 10:01:58 Views: 46
Happy to help! Below is Python source code that scrapes BOSS Zhipin and visualizes the results:
```
# Import the required modules
import requests
import pandas as pd
import json
# Set the request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
# Set the query parameters (job keyword and city)
params = {
    'query': 'Python',
    'city': '深圳'
}
# Fetch the first page of results (page numbering starting at 1 is an assumption)
url = 'https://www.zhipin.com/api/company/search/bossSearchByIndex.json'
params['page'] = '1'
response = requests.get(url, headers=headers, params=params, timeout=10).content
data = json.loads(response)['zpData']['rows']
# Keep requesting the following pages until one comes back empty
while True:
    params['page'] = str(int(params['page']) + 1)
    # The same JSON endpoint is queried for every page, so json.loads can parse the reply
    response = requests.get(url, headers=headers, params=params, timeout=10).content
    page_data = json.loads(response)['zpData']['rows']
    if page_data:
        data.extend(page_data)
    else:
        break
# Store the collected rows in a DataFrame
df = pd.DataFrame(data)
# Visualize the data
import matplotlib.pyplot as plt
from wordcloud import WordCloud
# Count postings per company and keep the 50 most frequent
company_count = df['company_short_name'].value_counts().head(50)
# Draw the word cloud; Chinese company names need a CJK-capable font passed via font_path
wordcloud = WordCloud(width=1200, height=800)
wordcloud.fit_words(company_count.to_dict())
plt.imshow(wordcloud)
plt.axis('off')
plt.show()
```
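The paging loop above assumes every reply is valid JSON containing `zpData.rows`. A site with anti-scraping measures may instead return an HTML page or an error payload, in which case `json.loads(...)['zpData']['rows']` raises. The following is a minimal sketch of a more defensive version, not a definitive implementation. The helper names `extract_rows` and `fetch_all`, the `max_pages` cap, and the 1-based page numbering are assumptions introduced for illustration; the response shape is taken from the code above.

```python
import json
from typing import Callable, List


def extract_rows(raw: bytes) -> List[dict]:
    """Return zpData.rows from a raw response body, or [] if it is missing or malformed."""
    try:
        payload = json.loads(raw)
    except (ValueError, TypeError):
        # An HTML error page or login page is not valid JSON
        return []
    if not isinstance(payload, dict):
        return []
    zp_data = payload.get('zpData')
    if not isinstance(zp_data, dict):
        return []
    rows = zp_data.get('rows')
    return rows if isinstance(rows, list) else []


def fetch_all(fetch_page: Callable[[int], bytes], max_pages: int = 50) -> List[dict]:
    """Collect rows page by page until an empty page or max_pages is reached."""
    collected: List[dict] = []
    for page in range(1, max_pages + 1):  # page numbering from 1 is an assumption
        rows = extract_rows(fetch_page(page))
        if not rows:
            break
        collected.extend(rows)
    return collected


# Hypothetical usage with the url, headers and params defined in the answer:
# data = fetch_all(lambda page: requests.get(
#     url, headers=headers, params={**params, 'page': str(page)}, timeout=10).content)
```

Passing the fetch step in as a function keeps the paging logic separate from the network call, so it can be exercised with canned responses. The `max_pages` cap prevents an endless loop if the server keeps returning non-empty pages.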
Hope this helps!
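P.S. The per-company counts that feed the word cloud can also be computed straight from the scraped rows without pandas. This is a rough sketch under assumptions: the helper name `company_frequencies` is hypothetical, and the `company_short_name` key is taken from the code above, not verified against the live API.

```python
from collections import Counter
from typing import Dict, Iterable


def company_frequencies(rows: Iterable[dict],
                        key: str = 'company_short_name',
                        top_n: int = 50) -> Dict[str, int]:
    """Count how often each company appears and keep the top_n most frequent."""
    names = (row.get(key) for row in rows)
    # Skip rows where the key is missing or empty
    counts = Counter(name for name in names if name)
    return dict(counts.most_common(top_n))
```

The resulting dict can be handed to `WordCloud.generate_from_frequencies`. It is worth checking that the dict is not empty before rendering.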