
Write a Scrapy spider that scrapes second-hand housing listings from Beike (ke.com)

Posted: 2023-05-22 19:01:55 Views: 554

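Good question! Here is a simple Scrapy spider that scrapes second-hand housing listings from Beike:

```python
import scrapy


class BeikeSpider(scrapy.Spider):
    name = 'beike'
    allowed_domains = ['ke.com']
    start_urls = ['https://sz.ke.com/ershoufang/']

    def parse(self, response):
        # Each listing sits in a <div class="info clear"> block.
        for house in response.xpath('//div[@class="info clear"]'):
            yield {
                'title': house.xpath('.//div[@class="title"]/a/text()').get(),
                'price': house.xpath('.//div[@class="priceInfo"]/div[@class="totalPrice"]/span/text()').get(),
                'area': house.xpath('.//div[@class="houseInfo"]/text()').get(),
            }

        # Follow the "next page" link if one is present.
        next_page = response.xpath('//a[@class="next"]/@href').get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
```

This spider scrapes Beike's second-hand listings for Shenzhen and yields one dictionary per listing containing its title, price, and area. If a next-page link exists, the spider follows it and keeps scraping. To scrape another city, change `start_urls` to that city's listing page; `allowed_domains = ['ke.com']` also matches subdomains of ke.com, so it normally needs changing only if the target lives on a different domain.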
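As a rough illustration of switching cities, the sketch below builds a start URL from a city code. It assumes every city follows the same `https://<city_code>.ke.com/ershoufang/` pattern as the Shenzhen URL in the answer; that pattern is inferred from the single example, not confirmed. The helper names `build_start_url` and `is_allowed` are hypothetical, and only `sz` appears in the original answer.

```python
from urllib.parse import urlparse

# Assumption: each city is served from "<city_code>.ke.com", mirroring
# the "sz" (Shenzhen) URL used in the answer.
BASE_DOMAIN = "ke.com"
LISTING_PATH = "/ershoufang/"


def build_start_url(city_code: str) -> str:
    """Return the second-hand listing URL for a city subdomain code (hypothetical helper)."""
    code = city_code.strip().lower()
    if not code.isalnum():
        raise ValueError(f"invalid city code: {city_code!r}")
    return f"https://{code}.{BASE_DOMAIN}{LISTING_PATH}"


def is_allowed(url: str) -> bool:
    """Approximate an allowed_domains check: the domain itself or any subdomain of it."""
    host = urlparse(url).hostname or ""
    return host == BASE_DOMAIN or host.endswith("." + BASE_DOMAIN)


if __name__ == "__main__":
    url = build_start_url("sz")
    print(url, is_allowed(url))
```

Inside the spider, this could be used as `start_urls = [build_start_url("sz")]`. Any other city code should be checked against the site before use.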
