Asynchronous MySQL writes in Scrapy
Date: 2023-11-08 19:06:04
Storing Scrapy-crawled data in MySQL asynchronously
Scrapy's Item Pipeline mechanism supports coroutine methods (Scrapy 2.0+), which makes it straightforward to store scraped data in a MySQL database asynchronously. The implementation steps are as follows:
1. Install the asynchronous MySQL driver aiomysql: `pip install aiomysql`
2. Configure the MySQL connection in settings.py, and switch Scrapy to the asyncio Twisted reactor so that coroutine pipeline methods can be awaited:
```
TWISTED_REACTOR = 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'

MYSQL_HOST = 'localhost'
MYSQL_PORT = 3306
MYSQL_USER = 'root'
MYSQL_PASSWORD = 'password'
MYSQL_DBNAME = 'database_name'
```
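To keep credentials out of version control, the same settings can instead be read from environment variables. This is an optional sketch; the variable names mirror the setting names but are otherwise arbitrary:

```python
# settings.py -- read connection details from the environment,
# falling back to the defaults shown above when a variable is unset.
import os

MYSQL_HOST = os.environ.get('MYSQL_HOST', 'localhost')
MYSQL_PORT = int(os.environ.get('MYSQL_PORT', '3306'))  # settings values from the env are strings
MYSQL_USER = os.environ.get('MYSQL_USER', 'root')
MYSQL_PASSWORD = os.environ.get('MYSQL_PASSWORD', 'password')
MYSQL_DBNAME = os.environ.get('MYSQL_DBNAME', 'database_name')
```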
3. Implement the pipeline around an asynchronous MySQL connection pool. Note that `from_crawler` is called synchronously by Scrapy and must not be a coroutine; the pool is therefore created in `open_spider` instead:
```
import aiomysql


class MySQLPipeline:
    def __init__(self, mysql_host, mysql_port, mysql_user, mysql_password, mysql_dbname):
        self.mysql_host = mysql_host
        self.mysql_port = mysql_port
        self.mysql_user = mysql_user
        self.mysql_password = mysql_password
        self.mysql_dbname = mysql_dbname
        self.pool = None

    @classmethod
    def from_crawler(cls, crawler):
        # Called synchronously by Scrapy: only read settings here,
        # and defer pool creation to open_spider.
        settings = crawler.settings
        return cls(
            mysql_host=settings.get('MYSQL_HOST', 'localhost'),
            mysql_port=settings.getint('MYSQL_PORT', 3306),
            mysql_user=settings.get('MYSQL_USER', 'root'),
            mysql_password=settings.get('MYSQL_PASSWORD', 'password'),
            mysql_dbname=settings.get('MYSQL_DBNAME', 'database_name'),
        )

    async def open_spider(self, spider):
        # Requires the asyncio reactor configured in step 2.
        self.pool = await aiomysql.create_pool(
            host=self.mysql_host,
            port=self.mysql_port,
            user=self.mysql_user,
            password=self.mysql_password,
            db=self.mysql_dbname,
            charset='utf8mb4',
            autocommit=True,
            minsize=1,
            maxsize=10,
        )

    async def process_item(self, item, spider):
        async with self.pool.acquire() as conn:
            async with conn.cursor() as cur:
                sql = "INSERT INTO table_name (field1, field2) VALUES (%s, %s)"
                await cur.execute(sql, (item['field1'], item['field2']))
        return item

    async def close_spider(self, spider):
        self.pool.close()
        await self.pool.wait_closed()
```
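The INSERT statement in `process_item` hard-codes two fields. A small helper can derive the column list and placeholders from the item itself; this is a sketch, with `table_name`, `field1`, and `field2` standing in for your real table and fields. Note that the table and column names are interpolated directly into the SQL, so they must come from trusted code, never from scraped data; only the values go through `%s` placeholders:

```python
def build_insert(table, item):
    """Build an INSERT statement and its parameter tuple from a dict-like item.

    `table` and the item's keys must be trusted identifiers (they are
    interpolated into the SQL); the values are passed as parameters.
    """
    columns = ", ".join(item.keys())
    placeholders = ", ".join(["%s"] * len(item))
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, tuple(item.values())


sql, params = build_insert("table_name", {"field1": "a", "field2": "b"})
# sql    -> "INSERT INTO table_name (field1, field2) VALUES (%s, %s)"
# params -> ("a", "b")
```

Inside `process_item` the call would then be `await cur.execute(*build_insert("table_name", dict(item)))`.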
4. Enable MySQLPipeline in settings.py:
```
ITEM_PIPELINES = {
'myproject.pipelines.MySQLPipeline': 300,
}
```
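Because `process_item` only touches the pool's `acquire`/`cursor` interface, its logic can be exercised locally without a running MySQL server by stubbing the pool with `unittest.mock`. This is purely a testing sketch, not part of the pipeline; `insert_item` below repeats the body of `process_item` with the pool passed in:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def insert_item(pool, item):
    # Same body as the pipeline's process_item, with the pool as a parameter.
    async with pool.acquire() as conn:
        async with conn.cursor() as cur:
            sql = "INSERT INTO table_name (field1, field2) VALUES (%s, %s)"
            await cur.execute(sql, (item['field1'], item['field2']))
    return item


# Stub the pool: acquire() and cursor() both behave as async context managers.
pool, conn, cur = MagicMock(), MagicMock(), AsyncMock()
pool.acquire.return_value.__aenter__.return_value = conn
conn.cursor.return_value.__aenter__.return_value = cur

item = {'field1': 'a', 'field2': 'b'}
result = asyncio.run(insert_item(pool, item))
print(cur.execute.call_args.args[1])  # the parameter tuple passed to execute
```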