Read PDFs with Python, search for a word, return which file contains it, and test with an example

Time: 2024-05-14 17:12:49 Views: 13

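You can use the third-party Python library PyPDF2 to read PDF files and the `re` module to search for a specific term. Here is an example:

```python
import re

from PyPDF2 import PdfReader  # PyPDF2 >= 3.0; older releases used PdfFileReader


def search_term_in_pdf(term, file_path):
    """Report the first page of file_path that contains term."""
    reader = PdfReader(file_path)
    # re.escape makes the term match literally instead of as a regex pattern
    pattern = re.compile(re.escape(term))
    for page_num, page in enumerate(reader.pages, start=1):
        text = page.extract_text() or ""  # guard against pages that yield no text
        if pattern.search(text):
            return f"Term '{term}' found in page {page_num} of file {file_path}"
    return f"Term '{term}' not found in file {file_path}"


# Example usage:
print(search_term_in_pdf('Python', 'example.pdf'))
```

This function searches the given PDF file for the term. If the term is found, it returns a string naming the page and file of the first match; otherwise it returns a not-found message. Note that it checks a single file and stops at the first matching page. To test it, save the code to a Python file and replace the file name and search term with your own values.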
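Since the question asks which file contains the term, the single-file search can be extended to scan a whole folder. The following is only a minimal sketch under stated assumptions: `find_files_containing` and `extract_pdf_text` are hypothetical names introduced here, the PDFs are assumed to sit directly in one folder, matching is assumed to be literal and case-insensitive, and PyPDF2 3.0 or later is assumed for `PdfReader`.

```python
import re
from pathlib import Path


def extract_pdf_text(path):
    """Return a list with the text of each page (assumes PyPDF2 >= 3.0)."""
    from PyPDF2 import PdfReader  # imported lazily so the search logic is testable alone

    reader = PdfReader(str(path))
    return [page.extract_text() or "" for page in reader.pages]


def find_files_containing(term, folder, extract=extract_pdf_text):
    """Map each PDF file name in folder to the 1-based pages that contain term.

    `extract` is a hypothetical hook: any callable that takes a path and
    returns a list of page texts. It defaults to the PyPDF2-based helper.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)  # literal, case-insensitive
    hits = {}
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        try:
            pages = extract(pdf_path)
        except Exception as exc:  # e.g. a corrupt or encrypted file
            print(f"Skipping {pdf_path.name}: {exc}")
            continue
        matched = [num for num, text in enumerate(pages, start=1) if pattern.search(text)]
        if matched:
            hits[pdf_path.name] = matched
    return hits
```

Calling `find_files_containing('Python', 'pdfs')` returns a dict that maps each matching file name to the page numbers where the term appears; an empty dict means no file in the folder matched. Passing the extractor as a parameter is a design choice that lets you test the search logic with plain-text stand-ins before pointing it at real PDFs.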