Saving Scrapy items to a JSON file
Date: 2023-11-28 15:05:41
The following steps save Scrapy items to a JSON file:
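1. In the project's settings.py, add the following to enable JSON output. Note that `FEED_FORMAT` and `FEED_URI` are the older setting names; they have been deprecated since Scrapy 2.1 in favor of the `FEEDS` setting: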
```
FEED_FORMAT = "json"
FEED_URI = "output.json"
```
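On Scrapy 2.1 or later, the same feed can be described with the single `FEEDS` setting. The following is a minimal sketch for settings.py, assuming the same `output.json` target; the `encoding`, `indent`, and `overwrite` values are illustrative choices that are not part of the original steps, and `overwrite` is only available in more recent Scrapy releases:

```python
# settings.py (sketch): FEEDS maps each output URI to its export options.
FEEDS = {
    "output.json": {
        "format": "json",     # export all items as a single JSON array
        "encoding": "utf8",   # write non-ASCII text as-is instead of \uXXXX escapes
        "indent": 2,          # pretty-print the output; optional
        "overwrite": True,    # replace the file on each run instead of appending
    },
}
```

Because `FEEDS` is a dictionary keyed by output URI, a second entry can be added later if another file or format is needed.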
2. In the project's items.py, define the data fields to scrape, for example:
```python
import scrapy


class BookItem(scrapy.Item):
    # Fields to collect for each book
    title = scrapy.Field()
    author = scrapy.Field()
    description = scrapy.Field()
```
3. In the project's spider file, populate the item fields you defined, for example:
```python
import scrapy

from myproject.items import BookItem


class MySpider(scrapy.Spider):
    name = 'myspider'  # must match the name passed to `scrapy crawl`
    allowed_domains = ['example.com']
    start_urls = ['http://www.example.com']

    def parse(self, response):
        item = BookItem()
        # Placeholder selectors; adjust them to the target page's markup
        item['title'] = response.css('title::text').get()
        item['author'] = response.css('author::text').get()
        item['description'] = response.css('description::text').get()
        yield item
```
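If more control over the written file is needed than the built-in feed export gives, items can also be written from an item pipeline. The sketch below is a hypothetical `JsonLinesWriterPipeline` (the class name and the `items.jl` path are assumptions, not part of the original steps). It writes one JSON object per line rather than a single JSON array, and relies only on the `open_spider`, `close_spider`, and `process_item` hooks that Scrapy calls on pipelines:

```python
import json


class JsonLinesWriterPipeline:
    """Hypothetical pipeline: writes each item as one JSON object per line."""

    def __init__(self, path="items.jl"):
        self.path = path  # assumed output path, chosen for illustration
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Called once when the spider finishes
        self.file.close()

    def process_item(self, item, spider):
        # dict(item) works for scrapy.Item instances and plain dicts alike
        line = json.dumps(dict(item), ensure_ascii=False)
        self.file.write(line + "\n")
        return item  # pass the item on to any later pipeline
```

To try it, the class would go in the project's pipelines.py and be enabled in settings.py with something like `ITEM_PIPELINES = {"myproject.pipelines.JsonLinesWriterPipeline": 300}`, where `myproject` is the package name assumed in the spider's import.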
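4. Run the spider and save its output as a JSON file. The `-o` option names the output file, and the export format is inferred from the `.json` extension. For example: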
```
scrapy crawl myspider -o output.json
```
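Once the crawl finishes, it can be worth checking that the file parses and that the fields were actually filled in. The following is a small sketch, assuming the feed was written as a single JSON array to `output.json` with the three `BookItem` fields; `load_items` and `missing_fields` are hypothetical helper names, not part of Scrapy:

```python
import json
from pathlib import Path


def load_items(path="output.json"):
    """Read a JSON feed (a single JSON array) into a list of dicts."""
    text = Path(path).read_text(encoding="utf-8")
    return json.loads(text)


def missing_fields(items, required=("title", "author", "description")):
    """Return (index, field) pairs for items whose field is empty or absent."""
    return [
        (i, field)
        for i, item in enumerate(items)
        for field in required
        if not item.get(field)
    ]
```

A selector that matches nothing yields `None` for that field, so an item with a `null` author would show up in the result of `missing_fields(load_items())` as a pair such as `(0, "author")`.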