
Crawling multiple images in Python and saving them to the local disk

Date: 2024-10-10 13:11:41

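Writing a multi-image crawler in Python that saves the images to the local disk usually involves an HTTP library (such as requests), an HTML parsing library (such as BeautifulSoup), and file operations (the os module). A simplified walkthrough follows:

1. Import the required libraries:

```python
import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
```

2. Define the target URL and send an HTTP request to fetch the page content:

```python
url = "http://example.com/images"  # replace with the site you want to crawl images from
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early if the server returned an error status
```

3. Parse the page to extract the image links:

```python
soup = BeautifulSoup(response.text, 'html.parser')
# Resolve relative src values against the page URL so every link can be requested directly
image_links = [urljoin(url, img['src']) for img in soup.find_all('img', src=True)]
```

This assumes that every image link sits in the `src` attribute of an `<img>` tag.

4. Create the target directory if it does not already exist:

```python
target_dir = 'images_from_web'  # target folder path
os.makedirs(target_dir, exist_ok=True)  # does nothing if the folder is already there
```

5. Download the images one by one and save them locally:

```python
for link in image_links:
    response_img = requests.get(link, stream=True, timeout=10)
    filename = os.path.join(target_dir, link.split('/')[-1])
    with open(filename, 'wb') as f:
        for chunk in response_img.iter_content(chunk_size=1024):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
```

This downloads each image link and saves it in the specified target directory. Note that `link.split('/')[-1]` keeps any query string as part of the file name.

6. The crawler is finished at this point. Remember to handle the exceptions that may occur, such as failed requests or file write errors.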
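Step 6 leaves the exception handling open. The following is a minimal sketch of one way to do it, not a definitive implementation: the function names `filename_for` and `download_images`, the `.jpg` fallback name, the 10-second timeout, and the 8192-byte chunk size are all assumptions made for illustration. It resolves relative `src` values with `urljoin`, drops query strings from file names, and records failed downloads instead of aborting the whole run.

```python
import os
from urllib.parse import urljoin, urlparse

import requests


def filename_for(url, index):
    """Derive a local file name from the URL path, ignoring any query string."""
    name = os.path.basename(urlparse(url).path)
    # Hypothetical fallback: a numbered .jpg name when the path has no last segment.
    return name or f"image_{index}.jpg"


def download_images(page_url, srcs, target_dir, session=None):
    """Download each src (resolved against page_url); return (saved, failed)."""
    session = session or requests.Session()
    os.makedirs(target_dir, exist_ok=True)
    saved, failed = [], []
    for index, src in enumerate(srcs):
        url = urljoin(page_url, src)  # turn relative links into absolute ones
        path = os.path.join(target_dir, filename_for(url, index))
        try:
            resp = session.get(url, stream=True, timeout=10)
            resp.raise_for_status()  # treat 4xx/5xx responses as failures
            with open(path, "wb") as f:
                for chunk in resp.iter_content(chunk_size=8192):
                    f.write(chunk)
        except (requests.RequestException, OSError) as exc:
            # Record the failure and continue with the remaining images.
            failed.append((url, str(exc)))
            continue
        saved.append(path)
    return saved, failed
```

Calling `download_images(url, [img['src'] for img in soup.find_all('img', src=True)], 'images_from_web')` would then return the list of saved paths together with a list of `(url, error message)` pairs for the images that could not be fetched. A failure partway through a write can leave a partial file behind, so a stricter version might delete `path` in the `except` branch.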
