Scraping the animal pictures from https://sc.chinaz.com/tupian/xiaogouxiaomaotupian.html
Time: 2024-01-02 22:04:56 Views: 24
You can use Python's requests and beautifulsoup4 libraries to fetch and parse the page.
First, use the requests library to fetch the page content:
```python
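import requests

url = "https://sc.chinaz.com/tupian/xiaogouxiaomaotupian.html"
# A timeout keeps the script from hanging on a slow server
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
html = response.text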
```
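Some sites reject requests that carry no browser-like User-Agent, or serve pages without declaring a charset. The following is a minimal sketch of a slightly more defensive fetch, not part of the original answer: the helper name `fetch_html`, the `DEFAULT_HEADERS` value, and the 10-second default timeout are all assumptions chosen for illustration.

```python
import requests

# Hypothetical header value; adjust it to identify your own script.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper/0.1)"}


def fetch_html(url, timeout=10.0, session=None):
    """Return the decoded HTML of `url`, raising on HTTP error statuses."""
    # Accept an optional requests.Session so connections can be reused
    http = session or requests
    response = http.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    # requests falls back to ISO-8859-1 for text responses without a declared
    # charset; in that case use the encoding guessed from the body instead
    if response.encoding is None or response.encoding.lower() == "iso-8859-1":
        response.encoding = response.apparent_encoding
    return response.text
```

Calling `fetch_html(url)` with the address above would then stand in for the `requests.get(url)` and `response.text` pair.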
Next, use the beautifulsoup4 library to parse the page and extract the links of all images:
```python
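from urllib.parse import urljoin
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
# Assumes lazy-loaded <img class="lazy"> tags store the real address in data-original
img_list = soup.find_all("img", class_="lazy")
# Skip tags without the attribute; urljoin resolves relative or scheme-less links
img_urls = [urljoin(url, img["data-original"]) for img in img_list if img.get("data-original")]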
```
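The extraction step can be tried offline before pointing it at the real page. The sketch below runs the same `find_all("img", class_="lazy")` selector against a made-up HTML snippet; the helper name `extract_image_urls`, the `SAMPLE_HTML` markup, and the example.com addresses are illustrative assumptions, and the fallback to `src` is an addition that the original answer does not use.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_image_urls(html, base_url):
    """Return absolute image URLs for every <img class="lazy"> in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img", class_="lazy"):
        # Prefer the lazy-load attribute, fall back to a plain src
        src = img.get("data-original") or img.get("src")
        if src:
            # urljoin turns "//host/a.jpg" and "/a.jpg" into absolute URLs
            urls.append(urljoin(base_url, src))
    return urls


# Made-up markup for illustration; the real page structure may differ.
SAMPLE_HTML = """
<div>
  <img class="lazy" data-original="//img.example.com/a.jpg">
  <img class="lazy" src="/static/b.png">
  <img class="lazy">
  <img class="logo" src="/logo.png">
</div>
"""

print(extract_image_urls(SAMPLE_HTML, "https://example.com/gallery/page.html"))
# ['https://img.example.com/a.jpg', 'https://example.com/static/b.png']
```

The third tag is skipped because it has neither attribute, and the fourth because its class is not `lazy`.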
Finally, use the requests library to download the images to the local disk:
```python
for i, img_url in enumerate(img_urls):
    response = requests.get(img_url, timeout=10)
    # Save the raw bytes; every file gets a .jpg name regardless of its real type
    with open(f"animal_{i}.jpg", "wb") as f:
        f.write(response.content)
```
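The loop above names every file `.jpg` and stops at the first failed request. One possible variation, sketched here under assumptions, keeps the extension found in the URL, skips failed downloads, and pauses between requests. The names `filename_for` and `download_images`, the `images` output directory, and the one-second delay are hypothetical choices, not part of the original answer.

```python
import os
import time
from urllib.parse import urlparse

import requests

KNOWN_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def filename_for(url, index, default_ext=".jpg"):
    """Build a local file name such as animal_3.png from an image URL."""
    # Look only at the path so query strings do not end up in the extension
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    if ext not in KNOWN_EXTENSIONS:
        ext = default_ext
    return f"animal_{index}{ext}"


def download_images(img_urls, out_dir="images", delay=1.0, timeout=10.0):
    """Download each URL into `out_dir` and return the saved file paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for i, img_url in enumerate(img_urls):
        try:
            response = requests.get(img_url, timeout=timeout)
            response.raise_for_status()
        except requests.RequestException as exc:
            # Report the failure and carry on with the next image
            print(f"skipped {img_url}: {exc}")
            continue
        path = os.path.join(out_dir, filename_for(img_url, i))
        with open(path, "wb") as f:
            f.write(response.content)
        saved.append(path)
        time.sleep(delay)  # pause between requests to limit load on the server
    return saved
```

With the `img_urls` list from the previous step, `download_images(img_urls)` would write the files into an `images` folder next to the script.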
The complete code is as follows:
```python
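import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Fetch the page
url = "https://sc.chinaz.com/tupian/xiaogouxiaomaotupian.html"
response = requests.get(url, timeout=10)
response.raise_for_status()
html = response.text

# Collect the image links (assumes the real address is stored in data-original)
soup = BeautifulSoup(html, "html.parser")
img_list = soup.find_all("img", class_="lazy")
img_urls = [urljoin(url, img["data-original"]) for img in img_list if img.get("data-original")]

# Download each image into the current directory
for i, img_url in enumerate(img_urls):
    response = requests.get(img_url, timeout=10)
    with open(f"animal_{i}.jpg", "wb") as f:
        f.write(response.content)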
```
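Note: when scraping web pages, comply with the relevant laws and regulations and do not violate the site's rules. Also pay attention to copyright when scraping images, and make sure your actions do not infringe on the rights of others.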
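One practical way to check part of a site's rules is its robots.txt file, which Python's standard `urllib.robotparser` module can interpret. The sketch below parses a made-up robots.txt so it runs offline; the helper name `allowed_by_robots` and the `SAMPLE_ROBOTS` rules are assumptions for illustration and say nothing about what the real site permits.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if `robots_txt` lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(allowed_by_robots(SAMPLE_ROBOTS, "https://example.com/tupian/page.html"))  # True
print(allowed_by_robots(SAMPLE_ROBOTS, "https://example.com/private/x.html"))  # False
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` downloads the actual robots.txt instead of parsing a string. A permissive robots.txt does not settle the copyright or terms-of-use questions mentioned above.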