Python Web Scraping and Data Visualization: BOSS Zhipin

The following steps scrape job postings from BOSS Zhipin with Python, store them in MySQL, and visualize them:

1. Import the required libraries

```python
import re
from collections import Counter

import requests
from bs4 import BeautifulSoup
import pymysql
from pyecharts.charts import Bar, Pie, WordCloud
from pyecharts import options as opts
from flask import Flask, render_template
```

2. Fetch a page

```python
# Fetch one page from BOSS Zhipin and parse it into a BeautifulSoup tree
def get_data(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
    response = requests.get(url, headers=headers, timeout=10)
    soup = BeautifulSoup(response.text, 'html.parser')
    return soup
```

3. Scrape multiple pages

```python
# Scrape result pages 1 to 10 and collect one row per job posting
def get_all_data():
    all_data = []
    for i in range(1, 11):
        url = 'https://www.zhipin.com/c101280100-p100109/?page={}&ka=page-{}'.format(i, i)
        soup = get_data(url)
        data_list = soup.find_all('div', class_='job-primary')
        for data in data_list:
            job_name = data.find('div', class_='job-title').text.strip()
            salary = data.find('span', class_='red').text.strip()
            company = data.find('div', class_='company-text').find('a').text.strip()
            education = data.find('div', class_='job-limit clearfix').find_all('span')[1].text.strip()
            welfare = data.find('div', class_='info-append').find_all('span')
            # Join the welfare tags into one comma-separated string so that
            # they fit into a single text column in MySQL
            welfare_text = ','.join(w.text.strip() for w in welfare)
            all_data.append([job_name, salary, company, education, welfare_text])
    return all_data
```

4. Store the data

```python
# Save the scraped rows to a MySQL database
def save_data(data):
    db = pymysql.connect(host='localhost', user='root', password='123456', port=3306, database='boss_zhipin')
    cursor = db.cursor()
    sql = 'INSERT INTO job_info(job_name, salary, company, education, welfare) values(%s, %s, %s, %s, %s)'
    try:
        cursor.executemany(sql, data)
        db.commit()
    except Exception as e:
        print(e)
        db.rollback()
    finally:
        # Close the connection whether or not the insert succeeded
        db.close()
```

The `job_info` table must already exist. Call `save_data(get_all_data())` once to populate it before starting the web app in the next step.

5. Visualize the data

Each route below returns the chart options as JSON. `index.html` must exist in the Flask `templates` directory and is expected to load those options from `/salary`, `/education`, and `/welfare`.

```python
app = Flask(__name__)


# Run one query and return all rows, always closing the connection
def query(sql):
    db = pymysql.connect(host='localhost', user='root', password='123456', port=3306, database='boss_zhipin')
    try:
        with db.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetchall()
    finally:
        db.close()


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/salary')
def salary():
    results = query('SELECT salary FROM job_info')
    salary_list = []
    for result in results:
        # Salary is stored as text, typically a range such as '15-25K';
        # int() on the whole string would fail, so take the lower bound
        match = re.match(r'\d+', result[0] or '')
        if match:
            salary_list.append(int(match.group()))
    # Half-open 5k-wide buckets: 0k-5k, 5k-10k, ..., 30k-35k
    salary_dict = {}
    for i in range(0, 31, 5):
        salary_dict['{}k-{}k'.format(i, i + 5)] = 0
    for value in salary_list:
        low = value // 5 * 5
        key = '{}k-{}k'.format(low, low + 5)
        if key in salary_dict:  # values of 35k and above are not counted
            salary_dict[key] += 1
    bar = Bar()
    bar.add_xaxis(list(salary_dict.keys()))
    bar.add_yaxis('Salary distribution', list(salary_dict.values()))
    bar.set_global_opts(title_opts=opts.TitleOpts(title='BOSS Zhipin salary distribution'))
    return bar.dump_options_with_quotes()


@app.route('/education')
def education():
    results = query('SELECT education FROM job_info')
    # Count how many postings ask for each education level
    education_dict = Counter(result[0] for result in results)
    pie = Pie()
    pie.add('', list(education_dict.items()))
    pie.set_global_opts(title_opts=opts.TitleOpts(title='BOSS Zhipin education requirements'))
    return pie.dump_options_with_quotes()


@app.route('/welfare')
def welfare():
    results = query('SELECT welfare FROM job_info')
    welfare_dict = Counter()
    for result in results:
        # welfare is stored as a comma-separated string; split it back into tags
        welfare_dict.update(tag for tag in (result[0] or '').split(',') if tag)
    wordcloud = WordCloud()
    wordcloud.add('', list(welfare_dict.items()), word_size_range=[20, 100])
    wordcloud.set_global_opts(title_opts=opts.TitleOpts(title='BOSS Zhipin benefits word cloud'))
    return wordcloud.dump_options_with_quotes()


if __name__ == '__main__':
    app.run()
```
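
The `/salary` route has to turn free-form salary text into a number before it can bucket it. The sketch below shows one way to do that, assuming the scraped text is a range such as `15-25K`, optionally followed by a suffix like `·13薪`. That format is an assumption about the site, not something the code above guarantees. The helper names `parse_salary` and `bucket_salaries` are hypothetical and only for illustration. Bucketing by the midpoint of the range is a design choice made here, and text that does not match the pattern is skipped.

```python
import re

# Hypothetical helpers, not part of the original code.
# Assumed input format: '<low>-<high>K', e.g. '15-25K' or '15-25K·13薪'.
SALARY_RE = re.compile(r'(\d+)\s*-\s*(\d+)\s*[kK]')


def parse_salary(text):
    """Return (low, high) in thousands, or None if the text is not a K-range."""
    match = SALARY_RE.search(text or '')
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def bucket_salaries(texts, width=5, top=35):
    """Count postings by the midpoint of their salary range.

    Buckets are half-open ([0, 5), [5, 10), ...), so a value on a boundary
    is counted exactly once. Midpoints at or above `top` are ignored.
    """
    buckets = {'{}k-{}k'.format(lo, lo + width): 0 for lo in range(0, top, width)}
    for text in texts:
        parsed = parse_salary(text)
        if parsed is None:
            continue  # skip text that is not a monthly K-range
        mid = sum(parsed) / 2
        lo = int(mid // width) * width
        key = '{}k-{}k'.format(lo, lo + width)
        if key in buckets:
            buckets[key] += 1
    return buckets
```

The dictionary returned by `bucket_salaries` keeps its keys in ascending order, so `list(buckets.keys())` and `list(buckets.values())` can be passed straight to `Bar.add_xaxis` and `Bar.add_yaxis`.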
