
Scraping JD.com Product Information with Scrapy

Date: 2023-10-21 16:04:50 Views: 192

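Scrapy is a powerful Python crawling framework that can be used to scrape information from many kinds of websites. Below is a simple example of using Scrapy to scrape product information from JD.com (京东):

1. Create the Scrapy project

Run the following command to create a Scrapy project:

```
scrapy startproject jingdong
```

This creates a Scrapy project named "jingdong".

2. Create the spider

From inside the project directory, run the following commands to generate a spider:

```
cd jingdong
scrapy genspider jingdong_spider jd.com
```

This creates the file "jingdong_spider.py" under the project's "spiders" directory. It will hold the spider that scrapes product information from jd.com.

3. Write the spider code

Open "jingdong_spider.py" and replace its contents with the following code:

```python
import scrapy


class JingdongSpider(scrapy.Spider):
    # This name is what `scrapy crawl` expects in step 4.
    name = "jingdong"
    allowed_domains = ["jd.com"]
    start_urls = ["https://list.jd.com/list.html?cat=9987,653,655"]

    def parse(self, response):
        # One <li> per product in the listing grid.
        for sel in response.xpath('//ul[@class="gl-warp clearfix"]/li'):
            wrap = sel.xpath('div[@class="gl-i-wrap"]')
            yield {
                # .get() returns None instead of raising IndexError when a node is missing.
                "name": wrap.xpath('div[@class="p-name"]/a/em/text()').get(),
                "price": wrap.xpath('div[@class="p-price"]/strong/i/text()').get(),
            }
```

This spider requests "https://list.jd.com/list.html?cat=9987,653,655", extracts the name and price of each product on the page, and yields each product as a dictionary. The XPath expressions depend on the page's HTML structure, so they need updating if the site's markup changes.

4. Run the spider

Run the following command to start the spider:

```
scrapy crawl jingdong
```

Scrapy starts crawling and writes each scraped item to the console log. This is only a simple example; you can adapt the code to your own needs or use it as a starting point for scraping other websites.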
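The spider yields raw strings straight from the page, so the name may carry stray whitespace and the price arrives as text. One way to tidy this up is an item pipeline. The following is a minimal sketch, not part of the original example: `clean_item` and `CleanItemPipeline` are hypothetical names, and it assumes each item is a plain dictionary with the `name` and `price` keys used above.

```python
from decimal import Decimal, InvalidOperation


def clean_item(item):
    """Return a normalized copy of a scraped {'name', 'price'} dict."""
    # Missing values (None) are treated as empty strings before stripping.
    name = (item.get("name") or "").strip()
    raw_price = (item.get("price") or "").strip()
    try:
        # Decimal avoids float rounding issues for monetary values.
        price = Decimal(raw_price)
    except InvalidOperation:
        # Empty or non-numeric price text becomes None.
        price = None
    return {"name": name, "price": price}


class CleanItemPipeline:
    """Hypothetical pipeline; Scrapy calls process_item for each yielded item."""

    def process_item(self, item, spider):
        return clean_item(item)


if __name__ == "__main__":
    print(clean_item({"name": "  Example phone \n", "price": " 1999.00 "}))
```

If this class were saved in the project's "pipelines.py", it could be enabled by adding `ITEM_PIPELINES = {"jingdong.pipelines.CleanItemPipeline": 300}` to "settings.py". The module path assumes the project layout generated in step 1.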
