Using a Python web scraper to download 5000 images from Douban and save them to the D: drive
Time: 2024-03-13 22:43:52 Views: 110
Sure, this is fairly straightforward. Here is a Python script that implements it:
```python
import os

import requests
from bs4 import BeautifulSoup

# Fetch the album page and collect the image links
url = 'https://www.douban.com/photos/album/1647826981/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
html = requests.get(url, headers=headers, timeout=10)
html.raise_for_status()  # stop early if the page request failed
soup = BeautifulSoup(html.content, 'html.parser')
# Skip <img> tags that have no src attribute
img_urls = [img['src'] for img in soup.find_all('img') if img.get('src')]

# Create the output directory if it does not already exist
dir_path = 'D:\\douban_pics'
os.makedirs(dir_path, exist_ok=True)

# Download at most the first 5000 images found on the page
for i, img_url in enumerate(img_urls[:5000]):
    response = requests.get(img_url, headers=headers, timeout=10)
    with open(os.path.join(dir_path, f'{i}.jpg'), 'wb') as f:
        f.write(response.content)
```
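This code first fetches the Douban album page, then uses BeautifulSoup to extract the image links from it, creates a folder, and loops over the links to download up to the first 5000 images into that folder. Note that it parses only a single page, so the number of images actually saved is limited to the `<img>` tags found on that page; the `[:5000]` slice is just an upper bound.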
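Since one album page cannot hold 5000 images, reaching that count would mean walking through several pages. The following is a minimal sketch of that loop, not a confirmed description of how Douban paginates: `collect_image_urls`, the `fetch_page(start)` callback, and the `page_size=18` default are all hypothetical names and values introduced here for illustration. The callback is assumed to return the image URLs for the page beginning at offset `start`; in practice it would wrap the `requests.get` and BeautifulSoup steps from the script above.

```python
from itertools import count


def collect_image_urls(fetch_page, limit, page_size=18):
    """Gather up to `limit` unique image URLs by walking paginated results.

    fetch_page(start) is a caller-supplied (hypothetical) function returning
    the list of image URLs on the page that begins at offset `start`.
    page_size is an assumed step between page offsets, not a known site value.
    """
    if limit <= 0:
        return []
    collected = []
    seen = set()
    for start in count(0, page_size):
        added = 0
        for url in fetch_page(start):
            if url in seen:
                continue  # ignore duplicates across or within pages
            seen.add(url)
            collected.append(url)
            added += 1
            if len(collected) >= limit:
                return collected
        if added == 0:
            break  # an empty or fully repeated page means there is nothing left
    return collected


def fake_fetch_page(start):
    # Stand-in for a real request: pretends the album holds 40 images.
    end = min(start + 18, 40)
    return [f'https://example.com/img/{n}.jpg' for n in range(start, end)]


if __name__ == '__main__':
    urls = collect_image_urls(fake_fetch_page, limit=5000)
    print(len(urls))
```

Passing the page fetcher in as a callback keeps the stopping logic (limit reached, or a page with no new links) separate from the network code, so it can be checked offline with a stand-in like `fake_fetch_page` before pointing it at a real site.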