
Python crawler code for 51job (前程无忧)

Date: 2023-11-19 11:43:01 Views: 32

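Note that scraping a website's data without the site owner's permission may violate laws and regulations, so use this with caution. Below is an example Python script that scrapes the 51job (前程无忧) website:

```python
import requests
from bs4 import BeautifulSoup

# Request the search results page
url = 'https://search.51job.com/list/000000,000000,0000,00,9,99,Python,2,1.html'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors
response.encoding = response.apparent_encoding  # let requests guess the page encoding

# Parse the HTML
soup = BeautifulSoup(response.text, 'html.parser')
job_list = soup.find_all('div', class_='el')

# Print the fields of each job entry
for job in job_list:
    link = job.find('a', target='_blank')
    fields = [job.find('span', class_=c) for c in ('t2', 't3', 't4')]
    # Skip rows (such as a header row) that lack any expected element
    if link is None or any(f is None for f in fields):
        continue
    job_name = link.text.strip()
    job_company, job_location, job_salary = (f.text.strip() for f in fields)
    print(job_name, job_company, job_location, job_salary)
```

This example uses the requests library to fetch the page and BeautifulSoup to parse the HTML, then prints the job title, company name, location, and salary for each entry on the search results page. The class names (`el`, `t2`, `t3`, `t4`) are tied to the page markup, so the loop skips any row that lacks them instead of raising an `AttributeError`. This is only a simple crawler: more complex features (such as scraping after login or pagination) require more careful design.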
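As one possible starting point for the pagination mentioned above, the sketch below builds a URL per results page and fetches them one at a time with a pause in between. It rests on assumptions that are not confirmed by the original answer: that the last number before `.html` in the search URL is the page index, and that a one-second delay is acceptable to the site. The names `URL_TEMPLATE`, `build_page_urls`, and `crawl_pages` are hypothetical helpers introduced here for illustration.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple
from urllib.parse import quote

# Assumption: the final number before ".html" is the page index.
URL_TEMPLATE = (
    "https://search.51job.com/list/"
    "000000,000000,0000,00,9,99,{keyword},2,{page}.html"
)


def build_page_urls(keyword: str, pages: int) -> List[str]:
    """Return search URLs for pages 1..pages (hypothetical helper)."""
    return [
        URL_TEMPLATE.format(keyword=quote(keyword), page=page)
        for page in range(1, pages + 1)
    ]


def crawl_pages(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay: float = 1.0,
) -> Iterator[Tuple[str, str]]:
    """Yield (url, html) pairs, sleeping `delay` seconds between requests."""
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay)  # be polite: do not hammer the server
        yield url, fetch(url)
```

The `fetch` argument is any function that takes a URL and returns the page text, for example `lambda u: requests.get(u, headers=headers, timeout=10).text`. Passing it in keeps the paging logic separate from the HTTP client, so each returned page can be handed to the same BeautifulSoup parsing loop as in the single-page script.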
