Running pytesseract.py prints "Usage: pytesseract [-l lang] input_file"

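pytesseract is a Python library for OCR (optical character recognition) that extracts text from images. When you run pytesseract.py, the message "Usage: pytesseract [-l lang] input_file" appears because the command-line arguments are missing or wrong. Running the file directly, for example from an IDE, passes no arguments at all, so the script prints its usage text and exits.

The correct form on the command line is: pytesseract [-l lang] input_file

Here "[-l lang]" is an optional argument that specifies the language (English by default), and "input_file" is the path of the image file to run OCR on.

To recognize English text, use: pytesseract input_file

To recognize text in another language, use: pytesseract -l lang input_file

Make sure that both the pytesseract library and the Tesseract OCR engine are installed. If it still does not run, check that the arguments are correct and that the input file exists.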
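As a rough sketch of the two ways to supply those arguments: the first helper builds the command line described above, and the second calls the library from your own script instead of running pytesseract.py itself. The helper names `build_command` and `ocr_image` are made up for illustration; `pytesseract.image_to_string` and `PIL.Image.open` are the real library calls. The sketch assumes pytesseract and Pillow are installed.

```python
from pathlib import Path


def build_command(input_file, lang=None):
    """Build the argument list for `pytesseract [-l lang] input_file`."""
    args = ["pytesseract"]
    if lang:
        # -l is optional; without it Tesseract uses its default language.
        args += ["-l", lang]
    args.append(str(input_file))
    return args


def ocr_image(input_file, lang=None):
    """OCR one image from inside a script (hypothetical helper)."""
    path = Path(input_file)
    if not path.is_file():
        raise FileNotFoundError(f"input file not found: {path}")
    # Imported here so build_command() works even without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

For instance, `build_command("scan.png", "chi_sim")` gives the list you would pass to `subprocess.run`, while `ocr_image("scan.png", "chi_sim")` returns the recognized text directly. The file name `scan.png` is only a placeholder.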

Traceback (most recent call last):
  File "D:\py\lib\site-packages\pytesseract\pytesseract.py", line 255, in run_tesseract
    proc = subprocess.Popen(cmd_args, **subprocess_args())
  File "D:\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "D:\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\pythonProject\test3.py", line 21, in <module>
    subtitle_text = pytesseract.image_to_string(subtitle_region, config=config)
  File "D:\py\lib\site-packages\pytesseract\pytesseract.py", line 427, in image_to_string
    }[output_type]()
  File "D:\py\lib\site-packages\pytesseract\pytesseract.py", line 426, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "D:\py\lib\site-packages\pytesseract\pytesseract.py", line 288, in run_and_get_output
    run_tesseract(**kwargs)
  File "D:\py\lib\site-packages\pytesseract\pytesseract.py", line 260, in run_tesseract
    raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.

I got this error after the change you just made.

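This error means that the Tesseract OCR engine is either not installed or not on your PATH environment variable. pytesseract is only a wrapper: it launches the tesseract executable as a separate process, which is why the traceback begins with a FileNotFoundError raised from subprocess. To fix it, install Tesseract and add it to PATH. You can download Tesseract from: https://github.com/UB-Mannheim/tesseract/wiki. After installation, add the Tesseract installation directory to PATH.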
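If changing PATH is inconvenient, a script can also tell pytesseract where the executable is. The following is a minimal sketch, not the only way to do it: `find_tesseract` and `configure_pytesseract` are hypothetical helper names, and any install location you pass in is your own assumption about where Tesseract was installed. `pytesseract.pytesseract.tesseract_cmd` is the library's real setting for the executable path.

```python
import shutil
from pathlib import Path


def find_tesseract(extra_candidates=()):
    """Return a usable tesseract executable path, or None if none is found."""
    # First choice: whatever the PATH environment variable resolves to.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Fallback: explicit locations supplied by the caller (assumptions).
    for candidate in extra_candidates:
        if Path(candidate).is_file():
            return str(candidate)
    return None


def configure_pytesseract(tesseract_path):
    """Point pytesseract at a specific tesseract executable."""
    # Imported lazily so find_tesseract() works without pytesseract installed.
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

A script might call `find_tesseract([r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"])` (an example location only) and, if the result is not `None`, pass it to `configure_pytesseract` before calling `image_to_string`.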

C:\Users\huawei\PycharmProjects\python.exe C:\Users\huawei\PycharmProjects\pythonProject4\识别文字.py
Traceback (most recent call last):
  File "C:\Users\huawei\PycharmProjects\pythonProject4\识别文字.py", line 8, in <module>
    text = pytesseract.image_to_string(image, lang='chi_sim')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\huawei\AppData\Roaming\Python\Python311\site-packages\pytesseract\pytesseract.py", line 423, in image_to_string
    return {
           ^
  File "C:\Users\huawei\AppData\Roaming\Python\Python311\site-packages\pytesseract\pytesseract.py", line 426, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
                           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\huawei\AppData\Roaming\Python\Python311\site-packages\pytesseract\pytesseract.py", line 288, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\huawei\AppData\Roaming\Python\Python311\site-packages\pytesseract\pytesseract.py", line 264, in run_tesseract
    raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\\Program Files (x86)\\Tesseract-OCR\\/tessdata/chi_sim.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'chi_sim\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

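This error is caused by a missing language data file for the Tesseract OCR engine. According to the message, the missing file is `chi_sim.traineddata`, which Tesseract looked for in the `tessdata` folder under C:\Program Files (x86)\Tesseract-OCR. To fix it, make sure the Tesseract OCR engine is installed correctly and that the Simplified Chinese language data has been downloaded. You can follow these steps:

1. Make sure the Tesseract OCR engine is installed correctly. You can get it from the Tesseract OCR project page (https://github.com/tesseract-ocr/tesseract) for your operating system and follow the installation guide.
2. Download the Simplified Chinese language data file `chi_sim.traineddata` from the Tesseract OCR data repository (https://github.com/tesseract-ocr/tessdata) and save it in the `tessdata` directory of the Tesseract OCR engine.
3. Set the TESSDATA_PREFIX environment variable. In PyCharm you can add it to the run configuration: choose "Run" -> "Edit Configurations" from the menu bar, find your project's configuration in the dialog, and under "Environment variables" add a new variable named "TESSDATA_PREFIX" whose value is the path of the Tesseract OCR engine's `tessdata` directory.

After completing these steps, run your code again and text recognition should work. Note that the paths and environment variable settings may need adjusting depending on your system and on where the Tesseract OCR engine is installed. If the problem persists, consult the Tesseract OCR documentation and support resources for more detailed guidance.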
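The checks in the steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than part of pytesseract: `missing_languages` and `use_tessdata` are hypothetical helper names, and the directory you pass in is an assumption about your own installation. Whether TESSDATA_PREFIX should name the `tessdata` directory itself or its parent can depend on the Tesseract version; the error message above asks for the `tessdata` directory, so the sketch follows that.

```python
import os
from pathlib import Path


def missing_languages(tessdata_dir, lang):
    """Return the language codes in `lang` that have no .traineddata file.

    `lang` uses Tesseract's own syntax, e.g. "chi_sim" or "chi_sim+eng".
    """
    tessdata = Path(tessdata_dir)
    return [
        code
        for code in lang.split("+")
        if not (tessdata / f"{code}.traineddata").is_file()
    ]


def use_tessdata(tessdata_dir):
    """Set TESSDATA_PREFIX for this process and the processes it starts."""
    # The tesseract child process inherits this environment variable.
    os.environ["TESSDATA_PREFIX"] = str(tessdata_dir)
```

Calling `missing_languages(tessdata_dir, "chi_sim")` before `image_to_string(image, lang='chi_sim')` tells you whether the download in step 2 landed in the right folder, and `use_tessdata(tessdata_dir)` is an in-script alternative to the PyCharm setting in step 3.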
