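Use Python to batch-download all the images on a page, use a selector to choose which images are downloaded and in what order, and convert them to a PDF
Time: 2023-03-22 12:03:31 Views: 95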
Hello! Below is example Python code that downloads the images on a page that match a CSS selector and converts them, in order, into a PDF:
``` python
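import os
import tempfile
from io import BytesIO
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from fpdf import FPDF
from PIL import Image

# URL of the webpage to download images from
url = 'https://example.com/'
# CSS selector for the images to download
selector = 'img'

# Initialize the PDF object
pdf = FPDF()

# Send a GET request to the URL and get the webpage HTML content
response = requests.get(url, timeout=30)
response.raise_for_status()
html_content = response.content

# Parse the HTML content using BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')

# Find all the images in the HTML using the CSS selector (document order)
images = soup.select(selector)

# Temporary directory for the downloaded images; FPDF embeds them from file paths
with tempfile.TemporaryDirectory() as tmp_dir:
    # Loop through each image and download it
    for i, image in enumerate(images):
        # Get the image source URL; skip tags without a src attribute
        src = image.get('src')
        if not src:
            continue
        # Resolve relative URLs against the page URL; skip data: URIs and the like
        src = urljoin(url, src)
        if not src.startswith(('http://', 'https://')):
            continue
        # Send a GET request to the image URL and get the image binary content
        image_response = requests.get(src, timeout=30)
        if not image_response.ok:
            continue
        # Convert the binary image content to a PIL Image object
        try:
            pil_image = Image.open(BytesIO(image_response.content)).convert('RGB')
        except OSError:
            # Not a raster image Pillow can read (for example SVG)
            continue
        # Save as JPEG so that FPDF can read it from disk
        image_path = os.path.join(tmp_dir, f'{i:04d}.jpg')
        pil_image.save(image_path, 'JPEG')
        # Add the image on a new page, stretched to the full page size
        # (the aspect ratio is not preserved)
        pdf.add_page()
        pdf.image(image_path, 0, 0, pdf.w, pdf.h)

    # Save the PDF file while the temporary images still exist
    pdf_filename = 'output.pdf'
    pdf.output(pdf_filename)

print(f'PDF saved to {os.path.abspath(pdf_filename)}')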
```
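The script keeps the images in document order, which is the order `soup.select` returns them in. If only some of the images are wanted, or a different order is needed, one option is to pick entries by position after the selector has run. The sketch below assumes `srcs` holds the `src` values collected from the selected tags; the helper name `pick_image_urls` and its `order` parameter are hypothetical and not part of any library.

```python
from urllib.parse import urljoin


def pick_image_urls(page_url, srcs, order=None):
    """Return absolute image URLs, optionally re-ordered by position.

    page_url: URL of the page the src values came from.
    srcs:     src attribute values in document order (empty entries are skipped).
    order:    optional list of 0-based positions into the cleaned list;
              only these entries are kept, in exactly this order.
    """
    # Drop missing or empty src values and resolve relative paths
    urls = [urljoin(page_url, s) for s in srcs if s]
    if order is None:
        return urls
    # Ignore positions that are out of range instead of raising IndexError
    return [urls[i] for i in order if 0 <= i < len(urls)]


# Example: take position 2 of the cleaned list first, then position 0
example = pick_image_urls(
    'https://example.com/gallery/',
    ['a.png', None, '/img/b.jpg', 'https://cdn.example.com/c.gif'],
    order=[2, 0],
)
print(example)
```

In the script, `srcs` could be built with `[img.get('src') for img in soup.select(selector)]`, and the returned list would replace the loop over `images`.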
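Change the `url` and `selector` variables to match the page and the images you want to download. Before running the code, make sure the required dependencies are installed:
- requests
- beautifulsoup4 (imported as `bs4`)
- Pillow
- fpdf (or fpdf2, its maintained successor; both are imported as `fpdf`)
You can install them with `pip install`. Note that the script downloads the images one at a time, so on a page with many images it may take a while to finish.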
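If the download time becomes a problem, one possible approach is to fetch the images concurrently while keeping their order. The following is a minimal sketch, not part of the original answer: `download_all` and `fake_fetch` are hypothetical names, and the fetch function is passed in as a parameter so that the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, max_workers=8):
    """Fetch every URL concurrently and return the results in input order.

    fetch: a callable that takes one URL and returns its content as bytes.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of `urls`,
        # regardless of which download finishes first
        return list(pool.map(fetch, urls))


# Stand-in fetcher (an assumption for illustration) instead of a real HTTP call
def fake_fetch(u):
    return u.encode('utf-8')


pages = download_all(['a', 'b', 'c'], fake_fetch)
print(pages)
```

With real pages, `fetch` could be something like `lambda u: requests.get(u, timeout=30).content`; because the result order matches the input order, the PDF pages still follow the order chosen by the selector.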