How to fix AttributeError: module 'pdfminer' has no attribute 'extract_tables'

Date: 2023-06-14 08:04:00 Views: 479

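This error is not caused by an outdated installation. Neither pdfminer nor pdfminer.six, its community-maintained fork, has ever provided an `extract_tables` function, so upgrading alone will not make the call work. The name comes from pdfplumber, a separate library built on top of pdfminer.six, where `extract_tables()` is a method of a page object. You can resolve the error in one of two ways:

1. If you need real table extraction, install pdfplumber and call `extract_tables()` on a page object instead of on the `pdfminer` module:

```
pip install pdfplumber
```

2. If you want to stay with pdfminer, use the pdfminer.six package:

```
pip install pdfminer.six
```

Then walk the layout objects yourself. pdfminer.six has no table layout class (there is no `LTTable`), so the function below only approximates tables: it treats text lines that share the same rounded baseline as the cells of one row. This heuristic works for simple grids and will misread wrapped or merged cells.

```
from collections import defaultdict

from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextBox, LTTextLine
from pdfminer.pdfdocument import PDFDocument, PDFTextExtractionNotAllowed
from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfparser import PDFParser


def extract_tables(pdf_path):
    """Return one list of rows per page; each row is a list of cell strings."""
    tables = []
    with open(pdf_path, "rb") as fp:
        parser = PDFParser(fp)
        doc = PDFDocument(parser)
        if not doc.is_extractable:
            raise PDFTextExtractionNotAllowed(pdf_path)
        rsrcmgr = PDFResourceManager()
        laparams = LAParams()
        device = PDFPageAggregator(rsrcmgr, laparams=laparams)
        interpreter = PDFPageInterpreter(rsrcmgr, device)
        for page in PDFPage.create_pages(doc):
            interpreter.process_page(page)
            layout = device.get_result()
            # Bucket every text line by its rounded baseline (y0).
            rows = defaultdict(list)
            for lt_obj in layout:
                if isinstance(lt_obj, LTTextBox):
                    for line in lt_obj:
                        if isinstance(line, LTTextLine):
                            rows[round(line.y0)].append((line.x0, line.get_text().strip()))
            # PDF y grows upward: sort rows top to bottom, cells left to right.
            table = [
                [text for _, text in sorted(cells)]
                for _, cells in sorted(rows.items(), reverse=True)
            ]
            tables.append(table)
    return tables
```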
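The attribute name in the error message matches the `extract_tables()` method that pdfplumber, a separate library built on pdfminer.six, exposes on its page objects. As a minimal sketch, assuming pdfplumber is installed (`pip install pdfplumber`) and the PDF contains tables it can detect, the extraction could look roughly like this. The helper names `clean_table` and `extract_tables_with_pdfplumber` are illustrative and are not part of either library.

```python
def clean_table(table):
    """Normalize one table: turn None cells into "" and strip whitespace.

    pdfplumber represents empty cells as None, so this keeps rows uniform.
    """
    return [
        [(cell or "").strip() for cell in row]
        for row in table
    ]


def extract_tables_with_pdfplumber(pdf_path):
    """Return a list of (page_number, table) pairs found in the PDF."""
    # Imported lazily so clean_table() stays usable without pdfplumber.
    import pdfplumber

    results = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_tables() belongs to the page object,
            # not to the pdfminer module.
            for table in page.extract_tables():
                results.append((page.page_number, clean_table(table)))
    return results
```

Calling `extract_tables_with_pdfplumber("report.pdf")` (the file name is a placeholder) would return each detected table as a list of rows together with its 1-based page number. Detection quality depends on the PDF, so the result may be an empty list for documents without ruled or well-aligned tables.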
