Write Python code that crawls 100 cat images and 100 dog images from Baidu Images and saves them into two matching folders named cat and dog
Date: 2024-06-10 10:05:46 Views: 142
The following is one way to implement this in Python:
```python
import os

import requests

# Search endpoint from the original answer; {word} is replaced by the keyword
SEARCH_URL = 'https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&is=&fp=result&queryWord={word}&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=-1&z=&ic=&word={word}&s=&se=&tab=&width=&height=&face=&istype=&qc=&nc=&fr=&step_word='
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
}


def download_images(word, folder, count=100, page_size=30):
    """Save up to `count` thumbnails for `word` as folder/folder{i}.jpg."""
    # Create the image storage directory
    os.makedirs(folder, exist_ok=True)
    saved = 0
    offset = 0
    while saved < count:
        # One response may hold fewer than `count` entries, so request page by page.
        # Assumption: `pn` (result offset) and `rn` (results per page) control paging;
        # they are not part of the original URL.
        response = requests.get(SEARCH_URL.format(word=word), headers=HEADERS,
                                params={'pn': offset, 'rn': page_size}, timeout=10)
        response.raise_for_status()
        # Skip entries without a thumbURL (for example an empty trailing item)
        urls = [item['thumbURL'] for item in response.json().get('data', [])
                if isinstance(item, dict) and item.get('thumbURL')]
        before = saved
        for img_url in urls[:count - saved]:
            img_response = requests.get(img_url, headers=HEADERS, timeout=10)
            if img_response.ok:
                with open(os.path.join(folder, f'{folder}{saved}.jpg'), 'wb') as f:
                    f.write(img_response.content)
                saved += 1
        if saved == before:
            break  # no results or every download failed; stop instead of looping forever
        offset += page_size
    return saved


# Crawl the cat images, then the dog images
download_images('猫', 'cat')
download_images('狗', 'dog')
```
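The code above first creates two folders to hold the cat and dog images, then uses the requests module to send HTTP requests and fetch Baidu's search results as JSON. It then extracts each image's thumbnail URL (the `thumbURL` field) from the JSON data and downloads the image into the matching folder. When the script finishes, the cat and dog folders contain the downloaded images, up to 100 cats and 100 dogs, named cat{i}.jpg and dog{i}.jpg respectively.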
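The extraction step described above can be tried offline, without any network access. The following is a minimal sketch under stated assumptions, not a description of Baidu's documented API: the helper names `extract_thumb_urls` and `page_offsets` are hypothetical, the sample payload is made up for illustration, and it assumes that a single response may hold fewer than 100 entries and that some entries may lack a `thumbURL`.

```python
def extract_thumb_urls(payload):
    """Return the thumbURL of every entry in payload['data'] that has one."""
    urls = []
    for item in payload.get('data') or []:
        # Skip placeholders such as {} or None instead of raising KeyError
        if isinstance(item, dict) and item.get('thumbURL'):
            urls.append(item['thumbURL'])
    return urls


def page_offsets(total, page_size):
    """Return the start offsets needed to cover `total` results page by page."""
    return list(range(0, total, page_size))


# Made-up payload for illustration only; real responses carry many more fields
sample = {
    'data': [
        {'thumbURL': 'https://example.com/a.jpg'},
        {'thumbURL': ''},   # empty URL: skipped
        {},                 # empty entry: skipped
        {'thumbURL': 'https://example.com/b.jpg'},
    ]
}

print(extract_thumb_urls(sample))
print(page_offsets(100, 30))
```

Filtering before indexing avoids the `KeyError` or `IndexError` that a fixed `range(100)` loop raises when the response is shorter than expected or contains an entry without a URL. With a page size of 30, collecting 100 results takes four requests, starting at offsets 0, 30, 60, and 90.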