I want to use Python to get the number after TOTAL in a PDF and export it to Excel

Time: 2024-05-02 08:22:20 Views: 7

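You can use the third-party library PyPDF2 to read the text of the PDF, then use a regular expression to pull the number out of that text. Here is an example:

```python
import re

import xlwt  # writes the number to an Excel (.xls) file
from PyPDF2 import PdfReader

# Read the PDF file
reader = PdfReader('example.pdf')

# Match the number after TOTAL; the dot is escaped and the decimal part is optional
pattern = re.compile(r'TOTAL\s+(\d+(?:\.\d+)?)')

total = None
for page in reader.pages:
    text = page.extract_text() or ''
    match = pattern.search(text)
    if match:
        total = match.group(1)
        break  # stop at the first page that contains a match

# Write the number to an Excel file
if total is not None:
    book = xlwt.Workbook()
    sheet = book.add_sheet('Sheet 1')
    sheet.write(0, 0, 'TOTAL')
    sheet.write(0, 1, float(total))
    book.save('example.xls')
```

Note that this is only an example; you will need to adjust the file names, the regular expression, and so on to fit your actual PDF. You also need to install the PyPDF2 and xlwt libraries before running it. The code uses the `PdfReader`, `pages`, and `extract_text` names, because the older `PdfFileReader`, `getPage`, and `extractText` names were removed in PyPDF2 3.0.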
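Real invoices often print the amount as something like `TOTAL: $1,234.56`, which the simple pattern in the example would not match. The following is a minimal sketch of a more tolerant pattern, using only the standard library. The name `extract_totals`, the accepted currency symbols, the optional colon, and the case-insensitive matching are all assumptions for illustration, not something the question specifies.

```python
import re
from decimal import Decimal

# Hypothetical pattern (an assumption, adjust to your PDF): the word TOTAL,
# an optional colon, an optional currency symbol, then a number that may use
# comma thousands separators and an optional decimal part.
TOTAL_PATTERN = re.compile(
    r'\bTOTAL\b\s*:?\s*[$€£¥]?\s*(\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?)',
    re.IGNORECASE,
)


def extract_totals(text):
    """Return every number that follows the word TOTAL in text, as Decimal values."""
    return [
        Decimal(m.group(1).replace(',', ''))  # drop thousands separators
        for m in TOTAL_PATTERN.finditer(text)
    ]


sample = "Subtotal 10.00\nTOTAL: $1,234.56\nGrand total 99"
print(extract_totals(sample))  # [Decimal('1234.56'), Decimal('99')]
```

The `\b` word boundaries keep `Subtotal` from matching. To use this with the example above, you would pass each page's extracted text to `extract_totals` and convert the values with `float(...)` before writing them to the sheet.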