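Scrape all staff of Zhongnan University of Economics and Law, including name, title, areas of expertise, and school, and visualize the data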

Time: 2023-10-20 22:33:32

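Sure. This task can be done with Python's requests, BeautifulSoup, pandas, and matplotlib libraries. Below is a simple implementation. The URL and the table class it uses must match the actual page, so check them against the page source before running it:

``` python
import requests
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt

# Target page and request headers
url = 'http://www.znufe.edu.cn/szdw/jsml.htm'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}

# Send the request and fetch the page content
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on HTTP errors
response.encoding = 'utf-8'

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

# Locate the table that holds the staff list
table = soup.find('table', {'class': 'table table-bordered table-hover table-striped'})
if table is None:
    raise SystemExit('Staff table not found; check the URL and the table class')

# Walk the table rows (skipping the header row), extract name, title,
# areas of expertise, and school, and store them in a list of dicts
employees = []
for tr in table.find_all('tr')[1:]:
    tds = tr.find_all('td')
    if len(tds) < 4:
        continue  # skip rows that do not have all four columns
    employee = {
        'name': tds[0].text.strip(),
        'title': tds[1].text.strip(),
        'expertise': tds[2].text.strip(),
        'department': tds[3].text.strip()
    }
    employees.append(employee)

# Convert the data to a pandas DataFrame and show the first 5 rows
df = pd.DataFrame(employees)
print(df.head())

# Count staff per school and draw a pie chart
departments = df.groupby('department').size()
plt.pie(departments, labels=departments.index, autopct='%1.1f%%')
plt.title('Employees by Department')
plt.show()
```

Note: this code is for learning and reference only. When scraping website data, comply with the relevant laws and regulations.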
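If the scraped list covers many schools, a pie chart with one slice per school can become hard to read. As a minimal sketch, assuming records shaped like the `employees` list of dicts built in the code above, the helper below keeps the largest `n` groups and merges the rest into a single slice before plotting. The function name `top_n_with_other`, the `Other` label, and the sample records are hypothetical and not part of the original answer.

```python
from collections import Counter


def top_n_with_other(records, key, n=5, other_label="Other"):
    """Count records by `key`, keep the n largest groups, merge the rest.

    Returns a list of (label, count) pairs, largest first, with the merged
    remainder (if any) appended last under `other_label`.
    """
    # Records whose value for `key` is missing or empty are skipped.
    counts = Counter(r[key] for r in records if r.get(key))
    ranked = counts.most_common()  # sorted by count, descending
    head, tail = ranked[:n], ranked[n:]
    remainder = sum(count for _, count in tail)
    if remainder:
        head.append((other_label, remainder))
    return head


# Hypothetical sample records in the same shape as `employees`.
sample = [
    {"name": "A", "department": "School of Law"},
    {"name": "B", "department": "School of Law"},
    {"name": "C", "department": "School of Finance"},
    {"name": "D", "department": "School of Accounting"},
    {"name": "E", "department": ""},  # empty department is skipped
]

slices = top_n_with_other(sample, "department", n=1)
labels = [label for label, _ in slices]
sizes = [count for _, count in slices]
print(labels, sizes)
```

`labels` and `sizes` can then be passed to `plt.pie(sizes, labels=labels, autopct='%1.1f%%')` in place of the per-school series.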
