Scraping all images served with Content-Type: image/png from a web page in Python
Date: 2024-12-25 10:28:18
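In Python, the `requests` and `beautifulsoup4` libraries can be combined to scrape images of a specific type, such as `image/png`. The following example shows how to fetch PNG images selected by their Content-Type: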
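```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def get_image_links(url):
    """Collect absolute http(s) URLs from <img src> and <a href> on the page."""
    response = requests.get(url, timeout=10)
    if response.status_code != 200:
        return []  # always return a list so the caller can iterate safely
    soup = BeautifulSoup(response.content, 'html.parser')
    candidates = [img['src'] for img in soup.find_all('img', src=True)]
    candidates += [a['href'] for a in soup.find_all('a', href=True)]
    # Resolve relative links against the page URL; dict.fromkeys removes duplicates in order
    absolute = dict.fromkeys(urljoin(url, link) for link in candidates)
    return [link for link in absolute if urlparse(link).scheme in ('http', 'https')]


def download_images(links, save_path='images'):
    """Save every link whose own response reports Content-Type: image/png."""
    os.makedirs(save_path, exist_ok=True)
    for link in links:
        try:
            # stream=True defers the body download until the headers are checked
            with requests.get(link, stream=True, timeout=10) as resp:
                # The Content-Type must come from this response, not from the HTML page
                content_type = resp.headers.get('Content-Type', '').lower()
                if resp.status_code != 200 or not content_type.startswith('image/png'):
                    continue
                filename = os.path.basename(urlparse(link).path) or 'image.png'
                with open(os.path.join(save_path, filename), 'wb') as f:
                    f.write(resp.content)
            print(f"Downloaded image: {link}")
        except Exception as e:
            print(f"Failed to download image: {link} - {e}")


# Example usage
url = "https://example.com"  # site to crawl
links = get_image_links(url)
download_images(links)
```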
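The script first fetches the page, then parses the HTML to collect candidate image links. It then checks whether each link's response reports the content type `image/png` and, if so, downloads the file.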
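The content-type check can be isolated and tested without any network access. The sketch below is one possible way to do it, not part of the original answer: the helper names `is_png_content_type` and `looks_like_png` and the constant `PNG_SIGNATURE` are hypothetical, introduced here for illustration. It assumes the header may carry parameters (for example `; charset=binary`) or mixed case, and it adds an optional second check against the 8-byte PNG file signature, since a server can send a wrong Content-Type.

```python
# Hypothetical helpers for illustration; names are not from the original answer.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def is_png_content_type(header_value):
    """Return True if a Content-Type header value denotes image/png."""
    if not header_value:
        return False
    # Drop parameters such as "; charset=binary" and normalise case
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "image/png"


def looks_like_png(data):
    """Return True if the bytes start with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


print(is_png_content_type("Image/PNG; charset=binary"))
print(is_png_content_type("text/html; charset=utf-8"))
print(looks_like_png(PNG_SIGNATURE + b"...rest of file"))
```

In a downloader, `is_png_content_type` would be applied to the `Content-Type` header of each image response, and `looks_like_png` to the downloaded bytes before writing them to disk.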