
Complete code for scraping cat images with PyCharm

Time: 2023-07-25 17:52:33 Views: 99

Scraping images with Python 3 + lxml + requests + PyCharm. The PyCharm editor is recommended.

Browsing a site's images means clicking through page after page, which gets tedious, so I would rather just scrape them all.

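Below is the complete code for scraping cat images, written in PyCharm:

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

# Number of pages to crawl and the image type to keep
page_num = 10
image_type = "jpg"

# Directory where the images are saved
save_path = "./cat_images/"

# Create the output directory if it does not exist yet
os.makedirs(save_path, exist_ok=True)

# Crawl each results page in turn
for i in range(1, page_num + 1):
    # Build the URL
    url = f"https://www.pexels.com/search/cat/?page={i}&format=js&seed=2021-08-27%2003%3A57%3A03%20%2B0200"
    # Send the request
    response = requests.get(url, timeout=10)
    # Parse the response
    soup = BeautifulSoup(response.content, "html.parser")
    # Find all images
    images = soup.find_all("img")
    # Download each image
    for j, image in enumerate(images):
        # Get the image URL; skip <img> tags that have no src attribute
        image_url = image.get("src")
        if not image_url:
            continue
        # Resolve relative paths against the page URL
        image_url = urljoin(url, image_url)
        # Check the extension on the URL path only, ignoring any query string
        if urlparse(image_url).path.endswith("." + image_type):
            # Send the request to download the image
            image_response = requests.get(image_url, timeout=10)
            # Save the image locally
            file_name = os.path.join(save_path, f"{i}_{j}.{image_type}")
            with open(file_name, "wb") as f:
                f.write(image_response.content)
            print(f"Downloaded image {j} from page {i}")
```

Notes:

1. First, import the `requests`, `BeautifulSoup`, and `os` modules, plus `urljoin` and `urlparse` from `urllib.parse`.
2. Then set the number of pages to crawl, the image type, and the path where images are saved.
3. Create the folder for the saved images.
4. Loop over each page: build the URL, send the request, parse the response, and find all images.
5. Loop over the images on the page: get the image URL, check whether the image type matches, send the download request, and save the image locally.
6. Finally, print a message for each image that has been downloaded.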
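The type check in step 5 is easy to get wrong, because image URLs often carry a query string after the extension or are given as relative paths. The following is a minimal sketch of that step on its own, not part of the original answer. It assumes the standard library's `html.parser` in place of BeautifulSoup so that it runs without third-party packages, and the names `ImgSrcCollector`, `extract_image_urls`, and the `example.com` URLs are hypothetical and used only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> tags with no usable src
                self.sources.append(src)


def extract_image_urls(html, base_url, image_type="jpg"):
    """Return absolute image URLs whose path ends with the given extension."""
    collector = ImgSrcCollector()
    collector.feed(html)
    suffix = "." + image_type.lower()
    urls = []
    for src in collector.sources:
        absolute = urljoin(base_url, src)  # resolve relative paths
        # Compare against the path only, so a query string such as "?w=500"
        # does not hide the extension
        if urlparse(absolute).path.lower().endswith(suffix):
            urls.append(absolute)
    return urls


# Made-up HTML for illustration; no network access is needed
sample_html = """
<img src="/photos/cat1.jpg?w=500">
<img src="https://images.example.com/cat2.JPG">
<img src="/photos/logo.png">
<img alt="no source">
"""
found = extract_image_urls(sample_html, "https://www.example.com/search/cat/")
print(found)
```

Keeping the extraction in a pure function like this makes it possible to check the filtering logic against saved HTML before sending any real requests.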
