Scraping all League of Legends champion skin images with requests
Posted: 2024-10-11 20:02:35
To scrape all champion skin images from the League of Legends (LoL) site with the Python libraries `requests` and BeautifulSoup, break the task into a few steps:
1. First, install the required libraries:
```
pip install requests beautifulsoup4
```
2. Write a function that fetches a page and parses its HTML:
```python
import requests
from bs4 import BeautifulSoup

def get_images(url):
    response = requests.get(url)
    response.raise_for_status()  # fail fast on HTTP errors
    soup = BeautifulSoup(response.text, 'html.parser')
    # The 'skin-image' class is a placeholder; inspect the real page
    # and adjust the selector to match its actual markup.
    img_tags = soup.find_all('img', {'class': 'skin-image'})
    return [img['src'] for img in img_tags]
```
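The parsing step can be checked offline. The sketch below runs the same BeautifulSoup query against a small inline HTML snippet (the `skin-image` class and the example URLs are assumptions, not the real page markup):

```python
from bs4 import BeautifulSoup

# A minimal stand-in for a champion page; the class name is assumed.
sample_html = """
<div>
  <img class="skin-image" src="https://example.com/skins/ahri_0.jpg">
  <img class="skin-image" src="https://example.com/skins/ahri_1.jpg">
  <img class="logo" src="https://example.com/logo.png">
</div>
"""

soup = BeautifulSoup(sample_html, 'html.parser')
img_tags = soup.find_all('img', {'class': 'skin-image'})
urls = [img['src'] for img in img_tags]
print(urls)  # only the two skin images are matched, not the logo
```

Selecting on a class attribute keeps unrelated images (logos, icons) out of the result list, which is why the scraper filters here rather than downloading every `<img>` tag.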
3. The champion listing is usually spread across several pages, so you can build a URL template and loop through the pages, collecting image links until a page comes back empty (the `?page=` parameter is an assumption; check how the target site actually paginates):
```python
base_url = 'https://lol.gamepedia.com/List_of_champions'
images_list = []
page_number = 1

while True:
    hero_page_url = f'{base_url}?page={page_number}'
    img_urls_on_page = get_images(hero_page_url)
    if not img_urls_on_page:  # no more images on this page: stop
        break
    images_list.extend(img_urls_on_page)
    page_number += 1  # move on to the next page
```
4. Finally, save the images to a local folder, taking care to handle network errors and possible download rate limits:
```python
import os

save_dir = 'champion_skins'
if not os.path.exists(save_dir):
    os.makedirs(save_dir)

for img_url in images_list:
    try:
        response = requests.get(img_url, stream=True)
        response.raise_for_status()
        with open(os.path.join(save_dir, img_url.split('/')[-1]), 'wb') as f:
            for chunk in response.iter_content(chunk_size=1024):  # download in chunks to avoid loading whole files into memory
                f.write(chunk)
    except Exception as e:
        print(f"Failed to download {img_url}: {e}")
```
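One wrinkle in the save step: `img_url.split('/')[-1]` keeps any query string (e.g. `ahri_0.jpg?v=2`) in the filename. A hedged sketch of a safer filename helper, using only the standard library (the `unnamed.jpg` fallback name is an arbitrary choice):

```python
import os
from urllib.parse import urlparse

def local_filename(img_url, save_dir):
    """Derive a local path from an image URL, dropping any query string."""
    path = urlparse(img_url).path            # e.g. '/skins/ahri_0.jpg'
    name = os.path.basename(path) or 'unnamed.jpg'  # fallback for bare paths
    return os.path.join(save_dir, name)

print(local_filename('https://example.com/skins/ahri_0.jpg?v=2', 'skins'))

# In the download loop, a short pause between requests avoids hammering
# the server (0.5s is a guess, not a documented limit):
# time.sleep(0.5)
```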