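Using lxml to parse Boss job postings: job title, salary, location, work experience, education, job category, number of openings, job description, responsibilities, benefits, company name, company type, and company size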
Time: 2023-11-29 13:45:35
```python
import requests
from lxml import etree

url = 'https://www.zhipin.com/job_detail/?query=python&city=101010100&industry=&position='
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
# Fail fast on network errors or non-2xx responses instead of parsing an error page.
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
html = etree.HTML(response.text)


def pick(nodes, index=0):
    """Return the stripped text at `index`, or '' if the XPath matched too few nodes."""
    return nodes[index].strip() if len(nodes) > index else ''


def joined(nodes):
    """Join a list of text nodes into one stripped string."""
    return ''.join(nodes).strip()


# Shared XPath prefixes, so each selector is written only once.
primary = '//div[@class="job-banner"]/div[@class="info-primary"]'
sections = '//div[@class="job-detail"]/div[@class="detail-content"]/div[@class="job-sec"]'
company = '//div[@class="job-sec company-info"]'

# Job title
job_title = pick(html.xpath(primary + '/div[@class="name"]/h1/text()'))
# Salary
salary = pick(html.xpath(primary + '/div[@class="name"]/span/text()'))

# This answer assumes the <p> text nodes appear in this order:
# location, work experience, education, job category, number of openings.
info = html.xpath(primary + '/p/text()')
location = pick(info, 0)
experience = pick(info, 1)
education = pick(info, 2)
job_type = pick(info, 3)
recruit_num = pick(info, 4)

# Job description (text of every job-sec block)
job_description = joined(html.xpath(sections + '/div[@class="text"]/text()'))
# Responsibilities (first job-sec block)
job_responsibility = joined(html.xpath(sections + '[1]/div[@class="text"]/text()'))
# Benefits (second job-sec block)
job_benefits = joined(html.xpath(sections + '[2]/div[@class="text"]/text()'))

# Company name
company_name = pick(html.xpath(company + '/div[@class="name"]/text()'))
# Company type and company size
company_info = html.xpath(company + '/div[@class="text"]/p/text()')
company_type = pick(company_info, 0)
company_size = pick(company_info, 1)

print('Job title:', job_title)
print('Salary:', salary)
print('Location:', location)
print('Work experience:', experience)
print('Education:', education)
print('Job category:', job_type)
print('Number of openings:', recruit_num)
print('Job description:', job_description)
print('Responsibilities:', job_responsibility)
print('Benefits:', job_benefits)
print('Company name:', company_name)
print('Company type:', company_type)
print('Company size:', company_size)
```
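The XPath expressions above only work if the response actually contains those nodes, and the live page may differ from what the answer assumes. One way to check the expressions without hitting the site is to run them against a small local fragment. The sketch below does that. `SAMPLE_HTML` and the `first_text` helper are hypothetical names invented for illustration, and the fragment merely mirrors the class names used above; it is not copied from the real page.

```python
from lxml import etree

# Hypothetical fragment that mirrors the class names used by the XPath
# expressions above; it is NOT taken from the real site.
SAMPLE_HTML = """
<div class="job-banner">
  <div class="info-primary">
    <div class="name"><h1>Python Developer</h1><span>15-25K</span></div>
    <p>Beijing<em></em>3-5 years<em></em>Bachelor</p>
  </div>
</div>
"""


def first_text(tree, xpath, index=0, default=""):
    """Return the stripped text node at `index`, or `default` if there are too few matches."""
    nodes = tree.xpath(xpath)
    return nodes[index].strip() if len(nodes) > index else default


tree = etree.HTML(SAMPLE_HTML)
base = '//div[@class="job-banner"]/div[@class="info-primary"]'

sample = {
    "job_title": first_text(tree, base + '/div[@class="name"]/h1/text()'),
    "salary": first_text(tree, base + '/div[@class="name"]/span/text()'),
    "location": first_text(tree, base + '/p/text()', 0),
    "experience": first_text(tree, base + '/p/text()', 1),
    "education": first_text(tree, base + '/p/text()', 2),
    # The fragment has only three text nodes, so index 4 falls back to the default.
    "recruit_num": first_text(tree, base + '/p/text()', 4),
}
print(sample)
```

Returning a default instead of indexing with `[0]` directly means a missing node shows up as an empty field rather than an `IndexError`, which makes it easier to see which selectors no longer match.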