Scraping and saving JD computer images with the BeautifulSoup library
Sure, here are the implementation steps:
1. Import the required libraries
```python
from bs4 import BeautifulSoup
import requests
import os
```
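If the two third-party libraries are not installed yet, they can be installed with `pip install requests beautifulsoup4`; `os` is part of the standard library.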
2. Fetch the HTML content of the JD computer search page
```python
url = "https://search.jd.com/Search?keyword=%E7%94%B5%E8%84%91&enc=utf-8&wq=%E7%94%B5%E8%84%91&pvid=fb1d9f6b63eb4b1d8f9d5f9d9d7f0f6c"
response = requests.get(url)
html_content = response.text
```
3. Parse the HTML content with BeautifulSoup
```python
soup = BeautifulSoup(html_content, 'html.parser')
```
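As a quick sanity check, you can print the page title; if JD answered with a verification or login page instead of the search results, the title usually makes that obvious. This is just an optional sketch:
```python
# Optional sanity check: confirm we actually received the search results page
print(soup.title.string if soup.title else 'no <title> found')
```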
4. Find all the computer image tags
```python
# The class name here depends on JD's current page markup and may need updating
img_tags = soup.find_all('img', class_='err-product')
```
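If `find_all` with that class comes back empty, a CSS-selector lookup is an alternative. The `li.gl-item` / `div.p-img` class names below are assumptions about JD's search-result markup and should be verified against the actual page source:
```python
# Alternative (sketch): select images inside each search-result item via CSS selectors.
# The class names are assumptions and may need adjusting after inspecting the page.
img_tags = soup.select('li.gl-item div.p-img img')
```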
5. Loop over all the image tags, get each image URL, and save the image to disk (the complete code below creates the 电脑图片 output directory before this loop runs)
```python
for img in img_tags:
    img_url = img.get('src')
    if not img_url:
        continue  # skip images whose URL is filled in lazily and has no src yet
    # JD image URLs are often protocol-relative ("//img..."), so add a scheme
    if img_url.startswith('//'):
        img_url = 'https:' + img_url
    img_name = img_url.split('/')[-1]
    img_path = os.path.join('电脑图片', img_name)
    img_data = requests.get(img_url).content
    with open(img_path, 'wb') as f:
        f.write(img_data)
```
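If you want the loop to keep going when a single image fails to download, the per-image work can be pulled into a small helper such as the sketch below, reusing the imports from step 1 (the function name, timeout value, and default directory are just illustrative):
```python
# Sketch of a slightly more defensive per-image download helper
def download_image(img_url, save_dir='电脑图片'):
    os.makedirs(save_dir, exist_ok=True)          # make sure the target directory exists
    img_path = os.path.join(save_dir, img_url.split('/')[-1])
    try:
        resp = requests.get(img_url, timeout=10)  # avoid hanging on a slow image
        resp.raise_for_status()
    except requests.RequestException:
        return False                              # skip this image instead of crashing
    with open(img_path, 'wb') as f:
        f.write(resp.content)
    return True
```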
The complete code is as follows:
```python
from bs4 import BeautifulSoup
import requests
import os

# A browser-like User-Agent makes it less likely that JD blocks the request
headers = {'User-Agent': 'Mozilla/5.0'}
url = "https://search.jd.com/Search?keyword=%E7%94%B5%E8%84%91&enc=utf-8&wq=%E7%94%B5%E8%84%91&pvid=fb1d9f6b63eb4b1d8f9d5f9d9d7f0f6c"
response = requests.get(url, headers=headers)
html_content = response.text

soup = BeautifulSoup(html_content, 'html.parser')
# The class name here depends on JD's current page markup and may need updating
img_tags = soup.find_all('img', class_='err-product')

# Create the output directory if it does not exist yet
if not os.path.exists('电脑图片'):
    os.mkdir('电脑图片')

for img in img_tags:
    img_url = img.get('src')
    if not img_url:
        continue  # skip images whose URL is filled in lazily and has no src yet
    # JD image URLs are often protocol-relative ("//img..."), so add a scheme
    if img_url.startswith('//'):
        img_url = 'https:' + img_url
    img_name = img_url.split('/')[-1]
    img_path = os.path.join('电脑图片', img_name)
    img_data = requests.get(img_url).content
    with open(img_path, 'wb') as f:
        f.write(img_data)
```
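Keep in mind that JD's search page has anti-scraping measures, so even with a browser-like User-Agent the request may be answered with a verification page rather than real results; in that case a browser automation tool such as Selenium is usually needed. Images further down the results list may also be lazy-loaded and not carry their real URL in the `src` attribute, which is why the loop above skips tags without one.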
Note: this code is for learning purposes only; do not use it for commercial purposes.