
Scraping images from a web page with Python

Time: 2024-09-24 15:15:57 Views: 60

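In Python, you can use the `requests` library to send an HTTP request and fetch a page's content, then parse the HTML with BeautifulSoup (or a crawling framework such as Scrapy) to locate the images. Pillow (PIL) does not parse HTML; it is only needed if you want to open or process the images after downloading them. The basic steps are:

1. Install the required libraries: `requests` and `beautifulsoup4`, plus `Pillow` if you plan to process the downloaded images. You can install them with pip:

```
pip install requests beautifulsoup4 Pillow
```

2. Send a GET request: use `requests.get()` to fetch the page's HTML, for example:

```python
import requests

url = 'http://example.com'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses
html_content = response.text
```

3. Parse the HTML: use BeautifulSoup to parse the HTML and locate the image elements. Assuming the images are in `<img>` tags, find all of them:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, 'html.parser')
img_tags = soup.find_all('img')
```

4. Download the images: loop over the image tags, download each image with `requests`, and save it to the local file system. The `src` attribute may be missing or relative, so skip empty values and resolve the rest against the page URL:

```python
import os
from urllib.parse import urljoin

for img_tag in img_tags:
    src = img_tag.get('src')
    if not src:
        continue  # skip <img> tags without a src attribute
    img_url = urljoin(url, src)  # resolve relative paths against the page URL
    img_response = requests.get(img_url, timeout=10)
    filename = os.path.join(os.getcwd(), os.path.basename(img_url))
    with open(filename, 'wb') as f:
        f.write(img_response.content)
```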
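One detail worth checking before downloading anything is how image URLs turn into local file names. `os.path.basename(img_url)` keeps any query string in the name and returns an empty string when the URL ends in a slash. The following is a minimal sketch using only the standard library; the helper names `resolve_image_url` and `filename_from_url`, the `fallback` parameter, and the sample URLs are all hypothetical and chosen for illustration, not taken from the steps above.

```python
import os
from urllib.parse import urljoin, urlparse


def resolve_image_url(page_url, src):
    """Turn a possibly relative src value into an absolute URL."""
    return urljoin(page_url, src)


def filename_from_url(img_url, fallback="image"):
    """Derive a file name from the URL path, ignoring the query string."""
    path = urlparse(img_url).path
    name = os.path.basename(path)
    # A URL ending in "/" has no last path segment, so use the fallback name.
    return name or fallback


# Hypothetical page URL and src values, for illustration only.
page_url = "http://example.com/gallery/index.html"
srcs = ["/static/a.png", "thumbs/b.jpg?size=small", "https://cdn.example.com/c.gif"]

for src in srcs:
    absolute = resolve_image_url(page_url, src)
    print(absolute, "->", filename_from_url(absolute))
```

Because these helpers never touch the network, they can be tried out on a handful of `src` values before running the real download loop. Two images with the same last path segment would still overwrite each other, so a real script may want to add a counter or a hash to the name.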
