Scraping Lianjia Second-Hand Housing Listings with Python
Date: 2023-11-29 19:45:21
The following steps show how to scrape Lianjia second-hand housing listings with Python:
1. Import the required libraries
```python
import requests
from lxml import etree
```
2. Send the request and fetch the page content
```python
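# Listing page for Beijing second-hand homes
url = 'https://bj.lianjia.com/ershoufang/rs/'
# Send a browser-like User-Agent so the request is less likely to be rejected
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
html = response.text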
```
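The request above retrieves a single results page. As a minimal sketch of how URLs for several pages might be generated, the snippet below assumes that result pages are addressed with a `pg<N>/` path segment. This pattern, the `BASE_URL` constant, and the helper name `build_page_urls` are assumptions for illustration and are not part of the original code, so check them against the live site before relying on them.

```python
BASE_URL = 'https://bj.lianjia.com/ershoufang/'


def build_page_urls(first_page, last_page, base_url=BASE_URL):
    """Return listing-page URLs for first_page..last_page (inclusive).

    Assumes pages are addressed as <base_url>pg<N>/ (hypothetical pattern).
    """
    if first_page < 1 or last_page < first_page:
        raise ValueError('expected 1 <= first_page <= last_page')
    return [f'{base_url}pg{page}/' for page in range(first_page, last_page + 1)]


urls = build_page_urls(1, 3)
for url in urls:
    print(url)
```

Each URL could then be passed to `requests.get(url, headers=headers)` as in step 2, ideally with a short pause between requests.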
3. Parse the page content
```python
selector = etree.HTML(html)
# Each listing is wrapped in a <div class="info clear"> element
infos = selector.xpath('//div[@class="info clear"]')
for info in infos:
    title = info.xpath('div[@class="title"]/a/text()')[0]
    address = info.xpath('div[@class="address"]/div[@class="houseInfo"]/a/text()')[0]
    # houseInfo is a '|'-separated string; the second field is taken as the size
    size = info.xpath('div[@class="address"]/div[@class="houseInfo"]/text()')[0].strip()
    size = size.split('|')[1].strip()
    # positionInfo is read twice: first text node as floor, second as year
    floor = info.xpath('div[@class="flood"]/div[@class="positionInfo"]/text()')[0].strip()
    year = info.xpath('div[@class="flood"]/div[@class="positionInfo"]/text()')[1].strip()
    price = info.xpath('div[@class="priceInfo"]/div[@class="totalPrice"]/span/text()')[0]
    unit = info.xpath('div[@class="priceInfo"]/div[@class="unitPrice"]/span/text()')[0]
    print(title, address, size, floor, year, price, unit)
```
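Indexing an XPath result with `[0]` or `[1]` raises `IndexError` whenever a listing lacks the expected node. A minimal sketch of a more defensive approach follows. The helper names `first_text` and `split_house_info` are hypothetical, and plain lists stand in for the lists that `xpath()` returns so that the snippet runs without network access.

```python
def first_text(results, index=0, default=''):
    """Return results[index] stripped of whitespace, or default if missing."""
    try:
        return results[index].strip()
    except IndexError:
        return default


def split_house_info(text):
    """Split a '|'-separated houseInfo string into a list of trimmed fields."""
    return [field.strip() for field in text.split('|') if field.strip()]


# Plain lists stand in for the results of info.xpath(...).
print(first_text(['  750  ']))        # a node was matched
print(first_text([], default='N/A'))  # nothing matched: fall back to default

fields = split_house_info('2室1厅 | 89.3平米 | 南 | 精装 | 低楼层(共26层) | 2016年建 | 板楼')
print(fields[1])
```

Inside the loop, a call such as `first_text(info.xpath('div[@class="title"]/a/text()'))` would then replace the bare `[0]` lookup, so one incomplete listing does not abort the whole run.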
4. Run the program and view the output
```python
# Sample output
# 金隅万科城 金隅万科城 2室1厅 | 89.3平米 | 南 | 精装 | 低楼层(共26层) | 2016年建 | 板楼 750万 84054元/平米
# ...
```
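Printing is enough for a quick check, but the rows are easier to reuse when stored in a file. The following is a minimal sketch using the standard-library `csv` module. The function name `write_rows` and the column names are assumptions for illustration, and the sample row reuses values from the output shown above.

```python
import csv
import io

FIELDS = ['title', 'price', 'unit']  # hypothetical column names


def write_rows(rows, stream):
    """Write an iterable of dicts to stream as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = [{'title': '金隅万科城', 'price': '750万', 'unit': '84054元/平米'}]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open('ershoufang.csv', 'w', newline='', encoding='utf-8-sig')` (the file name is only an example); the `utf-8-sig` encoding helps spreadsheet programs detect the Chinese text correctly.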