Scraping 51job job listings with Python
Date: 2023-09-20 14:11:14
1. First, install the two libraries requests and beautifulsoup4, for example with the commands pip install requests and pip install beautifulsoup4.
2. Import the libraries you need.
```
import requests
from bs4 import BeautifulSoup
```
3. Use the requests library to send a request to the 51job site and retrieve the page content.
```
url = 'https://search.51job.com/list/000000,000000,0000,00,9,99,Python,2,1.html'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
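# A timeout keeps the script from hanging if the server does not respond.
response = requests.get(url, headers=headers, timeout=10)
# Stop early on HTTP errors (4xx/5xx) instead of parsing an error page.
response.raise_for_status()
html = response.text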
```
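To request more than one results page, the URL from step 3 can be generated instead of hard-coded. The sketch below assumes that the path segment before `,2,` is the search keyword and that the last number before `.html` is the page index; the text above confirms neither, and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Pattern copied from the URL used in step 3. The meaning of the two
# placeholders (keyword, page number) is an assumption.
SEARCH_URL_TEMPLATE = (
    'https://search.51job.com/list/'
    '000000,000000,0000,00,9,99,{keyword},2,{page}.html'
)


def build_search_url(keyword, page=1):
    """Return a search URL for the given keyword and 1-based page number."""
    if page < 1:
        raise ValueError('page must be >= 1')
    # Percent-encode the keyword so spaces and non-ASCII text are safe in the path.
    return SEARCH_URL_TEMPLATE.format(keyword=quote(keyword, safe=''), page=page)


if __name__ == '__main__':
    for page in range(1, 4):
        print(build_search_url('Python', page))
```

Each generated URL can be passed to requests.get together with the same headers as in step 3. Pausing briefly between requests, for example with time.sleep, keeps the load on the site low.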
4. Use the BeautifulSoup library to parse the page content and extract the job information you need.
```
soup = BeautifulSoup(html, 'html.parser')
job_list = soup.find_all('div', class_='el')
for job in job_list:
    # find() returns None when a tag is missing, so look up every field first.
    tags = [
        job.find('a', attrs={'target': '_blank'}),  # job title
        job.find('a', attrs={'class': 'name'}),     # company name
        job.find('span', attrs={'class': 't4'}),    # salary
        job.find('span', attrs={'class': 't3'}),    # location
        job.find('span', attrs={'class': 't5'}),    # posting date
    ]
    # Skip rows (such as a header row) that lack any of these tags.
    if any(tag is None for tag in tags):
        continue
    job_name, company_name, salary, location, release_time = (
        tag.text.strip() for tag in tags
    )
    print('Job title:', job_name)
    print('Company:', company_name)
    print('Salary:', salary)
    print('Location:', location)
    print('Posted:', release_time)
    print('-----------------------------------------------')
```
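Because find() returns None when a tag is missing, it can help to wrap the extraction in a function and try it on a small HTML fragment before running it against the live site. The following is a minimal sketch under stated assumptions: the sample fragment is hand-written to match the selectors from step 4 (it is not copied from 51job, and the real markup may differ), and `text_or_none`, `parse_jobs`, and `FIELDS` are hypothetical names introduced for illustration.

```python
from bs4 import BeautifulSoup

# Hand-written sample that mimics the selectors used in step 4.
# It is NOT copied from 51job; the real markup may differ.
SAMPLE_HTML = """
<div class="el title"><span class="t3">Location</span></div>
<div class="el">
  <a target="_blank" href="#">Python Developer</a>
  <a class="name" href="#">Example Co.</a>
  <span class="t3">Shanghai</span>
  <span class="t4">10-15k/month</span>
  <span class="t5">09-20</span>
</div>
"""

# Output field name -> (tag name, attributes), same selectors as step 4.
FIELDS = {
    'job_name': ('a', {'target': '_blank'}),
    'company_name': ('a', {'class': 'name'}),
    'salary': ('span', {'class': 't4'}),
    'location': ('span', {'class': 't3'}),
    'release_time': ('span', {'class': 't5'}),
}


def text_or_none(parent, tag_name, attrs):
    """Return the stripped text of the first matching tag, or None."""
    tag = parent.find(tag_name, attrs=attrs)
    return tag.get_text(strip=True) if tag is not None else None


def parse_jobs(html):
    """Return one dict per 'el' row that contains every expected field."""
    soup = BeautifulSoup(html, 'html.parser')
    jobs = []
    for row in soup.find_all('div', class_='el'):
        job = {key: text_or_none(row, name, attrs)
               for key, (name, attrs) in FIELDS.items()}
        # Skip header rows and any row that is missing a field.
        if all(value is not None for value in job.values()):
            jobs.append(job)
    return jobs


if __name__ == '__main__':
    for job in parse_jobs(SAMPLE_HTML):
        print(job)
```

Returning a list of dicts instead of printing inside the loop keeps parsing separate from output, so the same function can be pointed at the HTML fetched in step 3.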
The complete code is as follows:
```
import requests
from bs4 import BeautifulSoup

url = 'https://search.51job.com/list/000000,000000,0000,00,9,99,Python,2,1.html'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
# Use a timeout and stop early on HTTP errors (4xx/5xx).
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
html = response.text

soup = BeautifulSoup(html, 'html.parser')
job_list = soup.find_all('div', class_='el')
for job in job_list:
    # find() returns None when a tag is missing, so look up every field first.
    tags = [
        job.find('a', attrs={'target': '_blank'}),  # job title
        job.find('a', attrs={'class': 'name'}),     # company name
        job.find('span', attrs={'class': 't4'}),    # salary
        job.find('span', attrs={'class': 't3'}),    # location
        job.find('span', attrs={'class': 't5'}),    # posting date
    ]
    # Skip rows (such as a header row) that lack any of these tags.
    if any(tag is None for tag in tags):
        continue
    job_name, company_name, salary, location, release_time = (
        tag.text.strip() for tag in tags
    )
    print('Job title:', job_name)
    print('Company:', company_name)
    print('Salary:', salary)
    print('Location:', location)
    print('Posted:', release_time)
    print('-----------------------------------------------')
```