Write code that can scrape images from Baidu

Posted: 2023-12-06 18:44:08 Views: 79

The following is a simple Python script that downloads images from Baidu image search results:

```python
import os

import requests

# Search keyword ('猫' means "cat") and the number of images to download
search_word = '猫'
num_images = 10

# Request headers (a browser-like User-Agent)
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}

# Create the folder the images are saved to
os.makedirs(search_word, exist_ok=True)

url = 'http://image.baidu.com/search/acjson'
params = {
    'tn': 'resultjson_com', 'ipn': 'rj', 'ct': 201326592, 'fp': 'result',
    'queryWord': search_word, 'cl': 2, 'lm': -1, 'ie': 'utf-8', 'oe': 'utf-8',
    'adpicid': '', 'st': -1, 'z': '', 'ic': 0, 'word': search_word, 's': '',
    'se': '', 'tab': '', 'width': '', 'height': '', 'face': 0, 'istype': 2,
    'qc': '', 'nc': 1, 'fr': '', 'pn': 0, 'rn': 30, 'gsm': '1e',
}

# Fetch result pages (rn results each) until num_images images are saved
saved = 0
page = 0
while saved < num_images:
    params['pn'] = page * params['rn']
    response = requests.get(url, headers=headers, params=params, timeout=10)
    # Entries without a 'thumbURL' are skipped
    image_urls = [info['thumbURL'] for info in response.json().get('data', [])
                  if 'thumbURL' in info]
    if not image_urls:
        break  # no more results
    for image_url in image_urls[:num_images - saved]:
        response_image = requests.get(image_url, headers=headers, timeout=10)
        with open(os.path.join(search_word, '{}.jpg'.format(saved)), 'wb') as f:
            f.write(response_image.content)
        saved += 1
    page += 1
```

Note: scraping Baidu Images carries some risk and may trigger anti-scraping mechanisms. Prefer a legitimate way of obtaining images, such as an API provided by the Baidu open platform.
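One common way to lower the chance of tripping anti-scraping measures is to retry failed requests slowly instead of hammering the server. The block below is a minimal sketch of that idea, not part of the original answer: `fetch_with_retry` and `http_get` are hypothetical helper names, the retry count and delays are assumed values, and only the standard library is used.

```python
import time
import urllib.request
from typing import Callable, Optional


def http_get(url: str, timeout: float = 10.0) -> bytes:
    """Download a URL with a browser-like User-Agent (illustrative helper)."""
    request = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def fetch_with_retry(fetch: Callable[[str], bytes],
                     url: str,
                     retries: int = 3,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[bytes]:
    """Call fetch(url), waiting base_delay * 2**attempt between failed attempts.

    Returns the downloaded bytes, or None if every attempt raised an exception.
    """
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries - 1:
                return None  # give up after the last attempt
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    return None
```

In the script above, a call such as `fetch_with_retry(http_get, image_url)` could stand in for the direct image request, with a `None` result meaning the image is skipped.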
