Python code that scrapes 100 cat images and 100 dog images from Baidu Images and saves them into two folders named cat and dog
Date: 2024-06-10 19:04:28 Views: 99
```python
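import os

import requests


def download_images(keyword, folder, num, max_failures=5):
    """Download up to `num` thumbnails for `keyword` into `folder`."""
    # Create the folder that stores the images
    os.makedirs(folder, exist_ok=True)
    url = 'https://image.baidu.com/search/acjson'
    params = {
        'tn': 'resultjson_com',
        'logid': '123456',
        'ipn': 'rj',
        'ct': 201326592,
        'is': '',
        'fp': 'result',
        'queryWord': keyword,
        'cl': 2,
        'lm': -1,
        'ie': 'utf-8',
        'oe': 'utf-8',
        'adpicid': '',
        'st': -1,
        'z': '',
        'ic': 0,
        'word': keyword,
        's': '',
        'se': '',
        'tab': '',
        'width': '',
        'height': '',
        'face': 0,
        'istype': 2,
        'qc': '',
        'nc': 1,
        'fr': '',
        'pn': 0,   # offset of the first result on the page (assumed meaning)
        'rn': 30,  # number of results per page (assumed meaning)
    }
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }
    count = 0
    failures = 0
    while count < num and failures < max_failures:
        try:
            response = requests.get(url, headers=headers, params=params, timeout=10)
            response.raise_for_status()
            response.encoding = response.apparent_encoding
            json_data = response.json()
        except (requests.RequestException, ValueError) as exc:
            # Network error or malformed JSON: retry the same page, but not forever
            print('Error fetching results: {}'.format(exc))
            failures += 1
            continue
        # Entries without a thumbnail URL (such as a trailing empty item) are ignored
        items = [item for item in (json_data.get('data') or []) if item.get('thumbURL')]
        if not items:
            print('No more results for {}.'.format(keyword))
            break
        for item in items:
            try:
                img_data = requests.get(item['thumbURL'], headers=headers, timeout=10)
            except requests.RequestException as exc:
                # A single failed image is skipped instead of aborting the page
                print('Skipping image: {}'.format(exc))
                continue
            if img_data.status_code != 200:
                continue
            img_path = os.path.join(folder, '{}.jpg'.format(count + 1))
            with open(img_path, 'wb') as f:
                f.write(img_data.content)
            print('{} saved.'.format(img_path))
            count += 1
            if count == num:
                break
        # Request the next page; without this the same 30 results repeat
        params['pn'] += params['rn']
    return count


if __name__ == '__main__':
    # The search keywords stay in Chinese; the folders are named cat and dog
    download_images('猫', 'cat', 100)
    download_images('狗', 'dog', 100)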
```
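The query parameters in the code above include `pn` and `rn`. Assuming that `pn` is the zero-based offset of the first result and `rn` is the number of results per page (the source does not say so, so treat this as an assumption), collecting 100 images takes several requests with increasing `pn` values. A minimal sketch of that sequence, using a hypothetical helper named `page_offsets`:

```python
def page_offsets(total, page_size=30):
    """Yield the `pn` offset of each result page needed to cover `total` items.

    `page_offsets` is an illustrative helper, not part of any Baidu API.
    """
    if total < 0 or page_size <= 0:
        raise ValueError('total must be >= 0 and page_size must be > 0')
    # One offset per page: 0, page_size, 2 * page_size, ...
    for offset in range(0, total, page_size):
        yield offset


if __name__ == '__main__':
    # Offsets needed to cover 100 results at 30 results per page
    print(list(page_offsets(100)))  # [0, 30, 60, 90]
```

Each yielded value would be assigned to `params['pn']` before the next search request. Some pages may return fewer usable images than `rn`, so a real loop may still need more pages than this lower bound.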
Note: an individual download can fail because of network problems or because the image no longer exists, so the requests need exception handling.
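One way to handle such transient failures is to retry a call a few times before giving up. The sketch below is only an illustration: `call_with_retry` is a hypothetical helper, and the attempt count and delay are arbitrary choices rather than values taken from the original answer.

```python
import time


def call_with_retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call `action()` until it succeeds or `attempts` tries have failed.

    Only the exception types listed in `exceptions` trigger a retry;
    the last failure is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions as exc:
            if attempt == attempts:
                raise
            print('Attempt {} failed ({}); retrying.'.format(attempt, exc))
            time.sleep(delay)


if __name__ == '__main__':
    state = {'calls': 0}

    def flaky():
        # Simulated download that fails twice, then succeeds
        state['calls'] += 1
        if state['calls'] < 3:
            raise IOError('temporary failure')
        return 'image bytes'

    print(call_with_retry(flaky, attempts=3, delay=0))
```

With `requests`, the wrapped action could be something like `lambda: requests.get(img_url, headers=headers, timeout=10)` with `exceptions=(requests.RequestException,)`, so that only network-level errors are retried.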