Scraping and Visualizing 51job (前程无忧) Listings with Python
Date: 2024-01-09 17:05:05
The following steps use a Python scraper to collect job listings from 51job (前程无忧) and visualize them:
1. Import the required libraries:
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt
```
2. Send the HTTP request and parse the HTML page:
```python
url = 'https://search.51job.com/list/000000,000000,0000,00,9,99,python,2,1.html'
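# Send a browser-like User-Agent and a timeout so the request is less likely to be rejected or to hang.
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=10)
response.raise_for_status()  # stop early on HTTP errors
response.encoding = response.apparent_encoding  # guess the charset from the response body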
soup = BeautifulSoup(response.text, 'html.parser')
```
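The URL above fetches a single results page. As a minimal sketch of how several pages might be requested, the helper below turns the keyword and the final number of that URL into placeholders. It assumes the final number is the page index, which the source does not confirm; `URL_TEMPLATE` and `build_search_url` are hypothetical names introduced for illustration.

```python
from urllib.parse import quote

# Template copied from the URL above. The keyword and the last number
# (ASSUMED to be the page index) are replaced by placeholders.
URL_TEMPLATE = (
    'https://search.51job.com/list/'
    '000000,000000,0000,00,9,99,{keyword},2,{page}.html'
)

def build_search_url(keyword, page=1):
    """Build the search URL for one results page (hypothetical helper)."""
    if page < 1:
        raise ValueError('page must be >= 1')
    # Percent-encode the keyword so non-ASCII search terms are URL-safe.
    return URL_TEMPLATE.format(keyword=quote(keyword), page=page)

# URLs for the first three pages of the "python" search.
urls = [build_search_url('python', page) for page in range(1, 4)]
```

Each URL in `urls` can then be passed to `requests.get` as in step 2, ideally with a short pause between requests.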
3. Extract the job listings:
```python
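# These CSS selectors are tied to the page's markup; if a list comes back empty, inspect the HTML and adjust them.
job_titles = soup.select('.t1 span a')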
company_names = soup.select('.t2 a')
salaries = soup.select('.t4')
```
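Selecting each field across the whole page yields three independent lists, which can fall out of step if a header row or an incomplete listing matches one selector but not the others. A possible alternative is to walk the page row by row, as sketched below. The `.el` row selector and the sample markup are assumptions made for illustration and may not match the real page; `extract_rows` is a hypothetical helper.

```python
from bs4 import BeautifulSoup

# HYPOTHETICAL markup that mimics a results list; the real page may differ.
SAMPLE_HTML = """
<div class="el title"><span class="t1">Title</span><span class="t2">Company</span><span class="t4">Salary</span></div>
<div class="el"><p class="t1"><span><a>Python Developer</a></span></p><span class="t2"><a>Acme</a></span><span class="t4">1-1.5万/月</span></div>
<div class="el"><p class="t1"><span><a>Data Engineer</a></span></p><span class="t2"><a>Globex</a></span><span class="t4"></span></div>
"""

def extract_rows(soup, row_selector='.el'):
    """Return one dict per listing row, keeping the fields of a row together."""
    records = []
    for row in soup.select(row_selector):
        title = row.select_one('.t1 span a')
        company = row.select_one('.t2 a')
        salary = row.select_one('.t4')
        if title is None or company is None:
            continue  # header row or malformed listing
        records.append({
            'Job Title': title.get_text(strip=True),
            'Company Name': company.get_text(strip=True),
            # A missing salary becomes an empty string instead of shifting the column.
            'Salary': salary.get_text(strip=True) if salary is not None else '',
        })
    return records

records = extract_rows(BeautifulSoup(SAMPLE_HTML, 'html.parser'))
```

A list of dicts like `records` can be passed directly to `pd.DataFrame`, so every row keeps its own title, company, and salary.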
4. Build a DataFrame and save the data to an Excel file:
```python
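# All three lists must have the same length, otherwise pd.DataFrame raises a ValueError.
data = {
    'Job Title': [title.text.strip() for title in job_titles],
    'Company Name': [name.text.strip() for name in company_names],
    'Salary': [salary.text.strip() for salary in salaries],
}
df = pd.DataFrame(data)
# Writing .xlsx files requires the openpyxl package to be installed.
df.to_excel('job_data.xlsx', index=False)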
```
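The `Salary` column is stored as raw text, so it cannot be averaged or plotted directly. The sketch below shows one way to convert it to numbers. It assumes salary strings such as `1-1.5万/月` or `12-24万/年` (a range or single value, a 千/万 unit, and a per-month or per-year period); these formats are an assumption about the site's data, and anything else is returned as `None`. `parse_salary` is a hypothetical helper.

```python
import re

# Multipliers for the amount unit and months per pay period (ASSUMED formats).
UNIT = {'千': 1_000, '万': 10_000}
MONTHS_PER_PERIOD = {'月': 1, '年': 12}

# Matches e.g. "8-10千/月", "1-1.5万/月", "20万/年".
SALARY_RE = re.compile(r'^(\d+(?:\.\d+)?)(?:-(\d+(?:\.\d+)?))?(千|万)/(月|年)$')

def parse_salary(text):
    """Return (low, high) in yuan per month, or None if the format is not recognized."""
    match = SALARY_RE.match(text.strip())
    if match is None:
        return None
    low, high, unit, period = match.groups()
    months = MONTHS_PER_PERIOD[period]
    low_value = float(low) * UNIT[unit] / months
    # A single figure without a range is used for both bounds.
    high_value = float(high) * UNIT[unit] / months if high else low_value
    return (low_value, high_value)
```

For example, `df['Salary'].map(parse_salary)` would give a column of `(low, high)` tuples, with `None` wherever the text did not match.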
5. Analyze and visualize the data:
```python
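# Count the number of postings per company and keep the ten largest
company_counts = df['Company Name'].value_counts().head(10)
# Draw a bar chart.
# Chinese company names need a font with CJK glyphs configured in matplotlib, otherwise they render as empty boxes.
plt.bar(company_counts.index, company_counts.values)
plt.xlabel('Company')
plt.ylabel('Count')
plt.title('Top 10 Companies Hiring Python Developers')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()  # keep the rotated labels inside the figure
plt.show()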
```
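With these steps you scrape job listings from 51job, save them to a local Excel file, and then use matplotlib to analyze and chart the data, here as a bar chart of the ten companies with the most postings.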