Write Python code that crawls 100 cat images and 100 dog images from Baidu Images and saves them into two folders named cat and dog
Date: 2024-06-10 22:05:07 Views: 311
The following Python code crawls cat and dog images from Baidu Images, targeting 100 of each, and saves them into two folders named cat and dog:
```python
import os

import requests


def download_images(keyword, folder, limit=100):
    """Download up to `limit` thumbnails for `keyword` from Baidu Images into `folder`."""
    # Create the target folder if it does not exist yet.
    os.makedirs(folder, exist_ok=True)
    url = 'https://image.baidu.com/search/acjson'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }
    params = {
        'tn': 'resultjson',
        'logid': '7124381283671390293',
        'ipn': 'rj',
        'ct': '201326592',
        'is': '',
        'fp': 'result',
        'queryWord': keyword,
        'cl': '2',
        'lm': '-1',
        'ie': 'utf-8',
        'oe': 'utf-8',
        'adpicid': '',
        'st': '-1',
        'z': '',
        'ic': '0',
        'word': keyword,
        's': '',
        'se': '',
        'tab': '',
        'width': '',
        'height': '',
        'face': '0',
        'istype': '2',
        'qc': '',
        'nc': '1',
        'fr': '',
        'expermode': '',
        'force': '',
        'pn': '',
        'rn': '30',
        'gsm': '1e',
        '1590666527759': ''
    }
    saved = 0
    # 'rn' asks for 30 results per request and 'pn' is the result offset,
    # so four requests cover up to 120 results; stop once `limit` are saved.
    for offset in range(0, limit, 30):
        params['pn'] = str(offset)
        response = requests.get(url, headers=headers, params=params, timeout=10)
        json_data = response.json()
        for item in json_data.get('data', []):
            if saved >= limit:
                break
            # Skip entries that carry no thumbnail link.
            image_url = item.get('thumbURL')
            if image_url is None:
                continue
            try:
                image = requests.get(image_url, headers=headers, timeout=10)
                image.raise_for_status()
                file_path = os.path.join(folder, item['di'] + '.jpg')
                with open(file_path, 'wb') as f:
                    f.write(image.content)
                saved += 1
            except Exception as e:
                # Report the failure and move on to the next image.
                print(e)


download_images('猫', 'cat')
download_images('狗', 'dog')
```
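This code works by calling the Baidu Images search interface to get image links for a keyword, then requesting each link (the `thumbURL` field of every result) with `requests.get` to fetch the image content, and finally writing that content to a file in the local folder.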
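The link-gathering step can be checked without any network access. Below is a minimal sketch, assuming the parsed response has the shape the code above reads (a `data` list whose entries may carry a `thumbURL`). The helper name `collect_thumb_urls` and the sample data are hypothetical and are not part of the original answer or of any Baidu API.

```python
def collect_thumb_urls(pages, limit=100):
    """Return up to `limit` unique thumbnail URLs from parsed result pages.

    `pages` is a list of dicts shaped like the JSON the code above reads:
    each has a 'data' list whose entries may contain a 'thumbURL' key.
    """
    if limit <= 0:
        return []
    urls = []
    seen = set()
    for page in pages:
        for item in page.get('data', []):
            url = item.get('thumbURL')
            # Skip entries without a link and links already collected.
            if not url or url in seen:
                continue
            seen.add(url)
            urls.append(url)
            if len(urls) >= limit:
                return urls
    return urls


# Hand-written sample data (an assumption, not real API output).
sample_pages = [
    {'data': [{'thumbURL': 'https://example.com/a.jpg', 'di': '1'},
              {'thumbURL': 'https://example.com/b.jpg', 'di': '2'},
              {}]},
    {'data': [{'thumbURL': 'https://example.com/a.jpg', 'di': '1'},
              {'thumbURL': 'https://example.com/c.jpg', 'di': '3'}]},
]

print(collect_thumb_urls(sample_pages, limit=2))
```

Separating link collection from downloading makes it easier to cap the count at exactly the requested number and to skip duplicates that appear on more than one result page.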