How to save images with a Python crawler
Date: 2024-10-05 17:03:24
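A Python crawler typically fetches the page with the requests library and then parses the HTML with a library such as BeautifulSoup (or uses a framework such as Scrapy) to find the image links. Once it has an image URL, it can download the image bytes with requests and write them to a local file; Pillow (PIL) is only needed if the image has to be opened or processed afterwards. A simple sequence of steps follows:
1. Import the required libraries: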
```python
import requests
from bs4 import BeautifulSoup
import os
```
2. Define a function that downloads an image:
```python
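def download_image(url, save_path):
    # Fetch the image; the timeout keeps a stalled server from hanging the crawl
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # do not save an HTTP error page as an image
    # Write the raw bytes to disk in binary mode
    with open(save_path, 'wb') as f:
        f.write(response.content)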
```
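For large images, holding the whole response in memory may be undesirable. The following is a minimal sketch of a streamed variant, assuming requests is installed; the names `write_chunks` and `download_image_streamed`, the `.part` suffix, and the 8192-byte chunk size are illustrative choices, not part of the original answer.

```python
import os


def write_chunks(chunks, save_path):
    """Write an iterable of byte chunks to save_path via a temporary file."""
    tmp_path = save_path + ".part"  # hypothetical suffix for the partial file
    with open(tmp_path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
    # Rename only after the whole file has been written
    os.replace(tmp_path, save_path)


def download_image_streamed(url, save_path, chunk_size=8192):
    """Download url to save_path without loading the full body into memory."""
    import requests  # imported here so write_chunks works without requests

    with requests.get(url, stream=True, timeout=10) as response:
        response.raise_for_status()
        write_chunks(response.iter_content(chunk_size=chunk_size), save_path)
```

Writing to a temporary name first means an interrupted download should not leave a truncated file under the final name.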
3. Crawl the page and process the images:
```python
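from urllib.parse import urljoin

def spider(url):
    # Send the request
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Parse the HTML
    soup = BeautifulSoup(response.text, 'html.parser')
    # Find all image tags, assuming they share the same class name
    img_tags = soup.find_all('img', class_='image-class')
    # Create the directory for saved images if it does not exist
    os.makedirs('images', exist_ok=True)
    for img in img_tags:
        src = img.get('src')  # image address; may be missing or relative
        if not src:
            continue
        img_url = urljoin(url, src)  # resolve relative links against the page URL
        local_path = os.path.join('images', img_url.split('/')[-1])  # build the local path
        download_image(img_url, local_path)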
```
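Taking the last `/`-separated segment of the URL as the file name can misbehave when the URL carries a query string or ends with a slash. The sketch below shows one possible way to derive a file name from the URL path only; `filename_from_url`, the `.jpg` default extension, and the SHA-1 fallback are assumptions made for illustration.

```python
import hashlib
import os
from urllib.parse import unquote, urlparse


def filename_from_url(img_url, default_ext=".jpg"):
    """Derive a local file name from the path part of an image URL."""
    # Use only the path, so query strings and fragments are ignored
    path = unquote(urlparse(img_url).path)
    name = os.path.basename(path)
    if not name or "." not in name:
        # No usable name in the URL: fall back to a short hash of the URL
        digest = hashlib.sha1(img_url.encode("utf-8")).hexdigest()[:16]
        name = digest + default_ext
    return name
```

For example, `filename_from_url('http://example.com/a/cat.png?size=large')` returns `'cat.png'`, and the result could be passed to `os.path.join('images', ...)` when building the local path.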
4. Call the spider function to start crawling:
```python
spider('http://example.com')
```
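Some servers answer an image URL with an HTML page (for example a login or error page) and a successful status code, which would then be saved under an image file name. As a hedged sketch, the `Content-Type` response header can be checked before writing; `looks_like_image` is a hypothetical helper name, and the check trusts whatever the server reports.

```python
def looks_like_image(content_type):
    """Return True if a Content-Type header value names an image media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalize case
    media_type = content_type.split(";")[0].strip().lower()
    return media_type.startswith("image/")
```

With requests, the header value is available as `response.headers.get('Content-Type')`, so a download function could skip the file when `looks_like_image(...)` returns `False`.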