Scraping job listings with XPath in Python and saving them to CSV
Date: 2023-10-10 21:02:58
The steps for scraping job listings with XPath in Python and saving them to a CSV file are as follows:
1. Import the required libraries:
```python
import requests
from lxml import etree
import csv
```
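2. Send a request to fetch the page content:
```python
url = "URL of the job listings page"  # placeholder: replace with the real URL
response = requests.get(url, timeout=10)  # a timeout avoids hanging forever
response.raise_for_status()  # stop early on an HTTP error status
```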
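Some sites respond differently to requests without a browser-like `User-Agent`, and some pages are not served as UTF-8. The following is a minimal sketch of a more defensive fetch. The function name `fetch_page`, the header value, and the 10-second default timeout are illustrative assumptions, not requirements of any particular site.

```python
import requests

# Hypothetical helper; the name, header value and timeout are assumptions.
def fetch_page(url, timeout=10):
    """Return the decoded HTML of `url`, raising on HTTP errors."""
    headers = {"User-Agent": "Mozilla/5.0 (compatible; job-scraper-example)"}
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx
    # Guess the charset from the body; useful when the declared charset
    # is missing or wrong and response.text would otherwise be garbled.
    response.encoding = response.apparent_encoding
    return response.text
```

The returned string can be passed to `etree.HTML` in the same way as `response.text`.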
3. Parse the page content:
```python
html = etree.HTML(response.text)
```
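4. Select the job fields with XPath:
```python
# Each expression should select text, e.g. one ending in /text()
title = html.xpath("XPath expression 1")
company = html.xpath("XPath expression 2")
salary = html.xpath("XPath expression 3")
```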
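Selecting each field with a separate page-wide expression assumes that every listing has all three fields; if one listing lacks a salary, the lists fall out of step. A common alternative is to select each listing's container node first and then evaluate relative XPath expressions inside it. The sketch below uses an invented HTML fragment: the class names (`job`, `title`, `company`, `salary`) and the helper `first_text` are hypothetical and would need to match the real page.

```python
from lxml import etree

# Invented sample markup; a real page will use different tags and classes.
SAMPLE_HTML = """
<div class="job"><a class="title">Python Developer</a>
  <span class="company">Acme</span><span class="salary">15k-25k</span></div>
<div class="job"><a class="title">Data Engineer</a>
  <span class="company">Globex</span></div>
"""

def first_text(node, expr):
    """Return the first stripped text match of a relative XPath, or ''."""
    matches = node.xpath(expr)
    return matches[0].strip() if matches else ""

html = etree.HTML(SAMPLE_HTML)
rows = []
for job in html.xpath('//div[@class="job"]'):
    # The leading "." makes each expression relative to this listing only
    rows.append([
        first_text(job, './a[@class="title"]/text()'),
        first_text(job, './span[@class="company"]/text()'),
        first_text(job, './span[@class="salary"]/text()'),
    ])

print(rows)
```

Because every row is built from a single listing node, a missing field becomes an empty cell in that row instead of shifting the values of later listings.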
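5. Create the CSV file and write the header row:
```python
# newline='' lets the csv module control line endings itself
csv_file = open('招聘信息.csv', 'w', newline='', encoding='utf-8')
writer = csv.writer(csv_file)
writer.writerow(['标题', '公司', '薪水'])  # header: title, company, salary
```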
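6. Iterate over the job listings and write them to the CSV file:
```python
# zip stops at the shortest list, so uneven results cannot raise IndexError
for row in zip(title, company, salary):
    writer.writerow(row)
```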
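Pairing three separate lists by position breaks down when they differ in length: indexing can raise `IndexError`, and the built-in `zip` silently stops at the shortest list. If keeping partial rows matters more, `itertools.zip_longest` from the standard library pads the shorter lists instead. The sample values below are invented for illustration. Padding cannot tell which listing a missing value belonged to, so it only helps when the gaps fall at the end of a list.

```python
from itertools import zip_longest

# Invented sample data: the salary list is one item short
title = ["Python Developer", "Data Engineer"]
company = ["Acme", "Globex"]
salary = ["15k-25k"]

# fillvalue pads the shorter lists so every title still gets a row
rows = [list(row) for row in zip_longest(title, company, salary, fillvalue="")]
print(rows)
```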
7. Close the CSV file:
```python
csv_file.close()
```
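An explicit `close()` is skipped if an exception occurs between opening and closing the file, whereas a `with` block closes the file in either case. The sketch below wraps the write in a hypothetical `save_rows` function (the name and parameters are assumptions). It uses `utf-8-sig`, which prepends a byte order mark that helps some versions of Excel detect the encoding of Chinese text; plain `utf-8` is fine for most other readers.

```python
import csv

# Hypothetical helper; the name and default header are assumptions.
def save_rows(path, rows, header=("标题", "公司", "薪水")):
    """Write the header and rows to `path`; the file is closed on exit."""
    # newline='' lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8-sig") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(header)
        writer.writerows(rows)
```

With the lists from step 4, a call might look like `save_rows('招聘信息.csv', zip(title, company, salary))`.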
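Complete code example:
```python
import requests
from lxml import etree
import csv

url = "URL of the job listings page"  # placeholder: replace with the real URL
response = requests.get(url, timeout=10)  # a timeout avoids hanging forever
response.raise_for_status()  # stop early on an HTTP error status

html = etree.HTML(response.text)
# Each expression should select text, e.g. one ending in /text()
title = html.xpath("XPath expression 1")
company = html.xpath("XPath expression 2")
salary = html.xpath("XPath expression 3")

# newline='' lets the csv module control line endings itself
csv_file = open('招聘信息.csv', 'w', newline='', encoding='utf-8')
writer = csv.writer(csv_file)
writer.writerow(['标题', '公司', '薪水'])  # header: title, company, salary
# zip stops at the shortest list, so uneven results cannot raise IndexError
for row in zip(title, company, salary):
    writer.writerow(row)
csv_file.close()
```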
Running this code creates a CSV file named "招聘信息.csv" in the current working directory, containing the title, company, and salary of each scraped job listing.
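A quick way to check the result is to read the file back with `csv.DictReader`, which maps each row to the header names. The sketch below first writes a small invented sample to a temporary directory so that it runs on its own; the file name `jobs_sample.csv` is an assumption, and in practice `path` would point at the generated file.

```python
import csv
import os
import tempfile

# Write a tiny invented sample so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "jobs_sample.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["标题", "公司", "薪水"])
    writer.writerow(["Python Developer", "Acme", "15k-25k"])

# Read it back: each row becomes a dict keyed by the header row
with open(path, newline="", encoding="utf-8") as f:
    records = list(csv.DictReader(f))

print(len(records), records[0]["公司"])
```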