Anjuke Second-Hand Housing Crawler with XPath

Date: 2023-12-26 08:28:58

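An Anjuke second-hand housing crawler can use XPath to parse the page data. The following example parses an Anjuke second-hand listing page with XPath:

```python
import requests
from lxml import etree


def get_house_info(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    # A timeout keeps the crawler from hanging on an unresponsive page
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    html = etree.HTML(response.text)

    # Extract the listing fields with XPath
    titles = html.xpath('//div[@class="house-title"]/a/text()')
    prices = html.xpath('//span[@class="price-det"]/strong/text()')
    areas = html.xpath('//div[@class="details-item"][1]/span[2]/text()')

    # Print the listing information
    for title, price, area in zip(titles, prices, areas):
        print("Title:", title.strip())
        print("Price:", price.strip())
        print("Area:", area.strip())
        print("--")


def first_url():
    # Placeholder: the original snippet calls first_url() without defining it.
    # Replace this with code that returns the listing-page URLs of the target cities.
    return []


# Get the listing URLs for one hundred cities
urls = first_url()

# Visit each city's URL and parse its data
for url in urls:
    get_house_info(url)
```

The code first sends a request with the requests library to fetch the page, then uses the etree module of lxml to turn the response text into a parseable HTML tree. It then applies XPath expressions to extract each listing's title, price, and area, and prints them.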
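Pairing three page-wide result lists with `zip` only works when every listing contains all three fields; if one listing has no price, the later values shift onto the wrong titles. A minimal sketch of one way around this is to select each listing's container first and then run relative XPath expressions (starting with `.//`) inside it. The `house-item` wrapper class, the sample markup, and the helper names `first_text` and `parse_listings` are assumptions made for illustration, not taken from the real Anjuke page; the inner class names follow the code above.

```python
from lxml import etree

# Hypothetical markup: the "house-item" wrapper and all sample values are
# invented for illustration; only the inner class names come from the example.
SAMPLE_HTML = """
<div class="house-item">
  <div class="house-title"><a> Sunny two-bedroom flat </a></div>
  <div class="details-item"><span>2 rooms</span><span>89 m2</span></div>
  <span class="price-det"><strong>320</strong></span>
</div>
<div class="house-item">
  <div class="house-title"><a>Studio near the metro</a></div>
  <div class="details-item"><span>1 room</span><span>45 m2</span></div>
</div>
"""


def first_text(node, expr):
    """Return the first stripped text match of a relative XPath, or None."""
    matches = node.xpath(expr)
    return matches[0].strip() if matches else None


def parse_listings(html_text):
    """Build one dict per listing so fields from different listings never mix."""
    root = etree.HTML(html_text)
    listings = []
    for item in root.xpath('//div[@class="house-item"]'):
        listings.append({
            "title": first_text(item, './/div[@class="house-title"]/a/text()'),
            "price": first_text(item, './/span[@class="price-det"]/strong/text()'),
            "area": first_text(item, './/div[@class="details-item"][1]/span[2]/text()'),
        })
    return listings


if __name__ == "__main__":
    for listing in parse_listings(SAMPLE_HTML):
        print(listing)
```

In this sketch the second listing has no price element, so its `price` comes back as `None` instead of borrowing a value from another listing. To try the same idea on a live page, the container expression would need to be adapted to whatever element actually wraps each listing there.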
