Read the contents of two files, text1 and text2, to obtain two sets of cookie information (Cookie1 and Cookie2), crawl images from the corresponding web pages, and save them locally under the names image1 and image2 (complete Python code)
Time: 2024-10-23 20:10:44  Views: 27
In Python, you can use the `requests` library to send HTTP requests (attaching the cookies as a header), the `BeautifulSoup` library to parse the HTML and extract image links, and then download each image with another `requests` call (or with `urllib`/`wget`). Here is a basic example, assuming the files text1 and text2 already contain the Cookie1 and Cookie2 strings respectively:
```python
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# Read the cookie string from a text file
def get_cookies(file_name):
    with open(file_name, 'r') as f:
        return f.read().strip()

# Crawl the page and save every image it references
def download_images(url, cookie, image_name):
    response = requests.get(url, headers={'Cookie': cookie})
    if response.status_code == 200:
        soup = BeautifulSoup(response.text, 'html.parser')
        img_tags = soup.find_all('img')  # assumes the target images are in <img> tags
        for img in img_tags:
            img_url = img.get('src')
            if not img_url:
                continue  # skip <img> tags without a src attribute
            img_url = urljoin(url, img_url)  # resolve relative image paths
            try:
                local_img_path = os.path.join(os.getcwd(), image_name + '_' + img_url.split('/')[-1])
                # send the same cookie when fetching the image itself
                img_data = requests.get(img_url, headers={'Cookie': cookie}).content
                with open(local_img_path, 'wb') as img_file:
                    img_file.write(img_data)
                print(f"Image {img_url} downloaded to {local_img_path}")
            except Exception as e:
                print(f"Error downloading {img_url}: {e}")

# Crawl and save images with each cookie
for file_name, cookie_prefix in [('text1', 'Cookie1'), ('text2', 'Cookie2')]:
    cookie = get_cookies(file_name)
    url = "http://example.com"  # replace with the actual page you want to crawl
    image_name = f"{cookie_prefix}_images"
    download_images(url, cookie, image_name)
```
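If the text files store cookies in the common `name=value; name2=value2` header form, you can also parse them into a dict and pass them through the `cookies=` parameter of `requests.get` instead of a raw `Cookie` header. A minimal sketch (the file format here is an assumption):

```python
# Parse a "name=value; name2=value2" cookie string into a dict,
# as typically copied from a browser's request headers.
def parse_cookie_string(cookie_str):
    cookies = {}
    for pair in cookie_str.split(';'):
        pair = pair.strip()
        if '=' in pair:
            name, _, value = pair.partition('=')
            cookies[name.strip()] = value.strip()
    return cookies

cookies = parse_cookie_string("sessionid=abc123; token=xyz")
print(cookies)  # {'sessionid': 'abc123', 'token': 'xyz'}
# These can then be sent with: requests.get(url, cookies=cookies)
```

Passing a dict lets `requests` handle quoting and header assembly for you, which is slightly more robust than concatenating the raw string yourself.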