Write a Python crawler that downloads Pokémon images of 100 KB or larger from Baidu Images
Posted: 2023-05-20 13:06:55
You can crawl Baidu Images with Python's requests and BeautifulSoup libraries. Here is sample code:
```python
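import os
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup

SAVE_DIR = 'pokemon_images'
MIN_SIZE = 100 * 1024  # 100 KB, in bytes


def save_image(content, save_path):
    """Write already-downloaded image bytes to save_path."""
    with open(save_path, 'wb') as f:
        f.write(content)


def crawl_pokemon_images():
    # Search URL for the keyword "宝可梦" (Pokémon), URL-encoded in the `word` parameter.
    url = 'https://image.baidu.com/search/index?tn=baiduimage&word=%E5%AE%9D%E5%8F%AF%E6%A2%A6&ct=201326592&z=&lm=-1&pn=0&ic=0&s=0&se=1&sme=0&tab=0&width=&height=&face=0&istype=2&qc=&nc=1&fr=&pn=0&rn=30'
    # Create the output folder up front; open() fails if it does not exist.
    os.makedirs(SAVE_DIR, exist_ok=True)
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, 'html.parser')
    # Assumes results appear in the static HTML as <img class="main_img">.
    # If the page renders its results with JavaScript, this list will be empty.
    img_tags = soup.find_all('img', class_='main_img')
    for img_tag in img_tags:
        img_url = img_tag.get('src')
        if not img_url:
            continue
        # Protocol-relative URLs ("//host/path") need a scheme.
        if not img_url.startswith('http'):
            img_url = 'https:' + img_url
        if 'jpg' not in img_url:
            continue
        response = requests.get(img_url, timeout=10)
        if response.status_code != 200:
            continue
        # Measure the body itself: the Content-Length header may be missing.
        if len(response.content) < MIN_SIZE:
            continue
        # Take the file name from the URL path only, without the query string.
        filename = os.path.basename(urlparse(img_url).path)
        save_path = os.path.join(SAVE_DIR, filename)
        # Reuse the bytes already fetched instead of downloading a second time.
        save_image(response.content, save_path)


if __name__ == '__main__':
    crawl_pokemon_images()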
```
This code crawls the Baidu Images search results for the keyword "宝可梦" (Pokémon) and downloads every image of at least 100 KB into the local "pokemon_images" folder.
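
The 100 KB filter can be decided from the `Content-Length` header when the server sends one, falling back to the length of the downloaded body when it does not. The following is a minimal sketch of that decision, using only the standard library. The helper names `declared_size` and `passes_size_filter` are hypothetical and not part of the answer above, and `headers` is assumed to be a dict-like mapping such as `response.headers` from requests.

```python
MIN_SIZE = 100 * 1024  # 100 KB, in bytes


def declared_size(headers):
    """Return the size declared in Content-Length, or None if missing or malformed."""
    value = headers.get('Content-Length')
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def passes_size_filter(headers, body=None, min_size=MIN_SIZE):
    """Return True if the image is at least min_size bytes.

    Prefers the declared header size; falls back to the actual body length.
    With neither available, the image is rejected.
    """
    size = declared_size(headers)
    if size is None and body is not None:
        size = len(body)
    return size is not None and size >= min_size


if __name__ == '__main__':
    print(passes_size_filter({'Content-Length': '204800'}))  # header says 200 KB
    print(passes_size_filter({}, body=b'x' * 10))            # no header, tiny body
```

With requests, passing `stream=True` to `requests.get` defers the body download, so `passes_size_filter(response.headers)` could in principle reject small images before their content is fetched. Note that a plain `dict` is case-sensitive, whereas `response.headers` in requests is not.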