This article looks at how Tesseract computes word confidence scores. It also covers installing tesseract with yum and python3 tesserocr with pip on CentOS 7, installing tesseract-ocr on CentOS 7 for CAPTCHA recognition, using tesseract 4.0.0-1.4.4 from Java, and leptonica & tesseract & tess4j.
Contents:
- How does Tesseract compute the word confidence score?
- Installing tesseract with yum and python3 tesserocr with pip on CentOS 7
- Installing tesseract-ocr on CentOS 7 for CAPTCHA recognition (installing tesseract with yum)
- Using tesseract 4.0.0-1.4.4 from Java
- leptonica & tesseract & tess4j
How does Tesseract compute the word confidence score?
I set hocr_char_boxes to 1 to get character-level confidence scores.
Below is an excerpt of the hOCR output for the single word "Patient".
<span class='ocr_line' id='line_1_1' title="bbox 92 11 735 31; baseline 0 -5; x_size 18; x_descenders 3; x_ascenders 4">
<span class='ocrx_word' id='word_1_1' title='bbox 92 11 156 26; x_wconf 96'>
<span class='ocrx_cinfo' title='x_bboxes 92 11 102 26; x_conf 99.574387'>P</span>
<span class='ocrx_cinfo' title='x_bboxes 92 11 103 26; x_conf 99.574722'>a</span>
<span class='ocrx_cinfo' title='x_bboxes 105 15 115 26; x_conf 99.572853'>t</span>
<span class='ocrx_cinfo' title='x_bboxes 116 11 125 26; x_conf 99.571632'>i</span>
<span class='ocrx_cinfo' title='x_bboxes 128 15 138 26; x_conf 99.518715'>e</span>
<span class='ocrx_cinfo' title='x_bboxes 140 15 149 26; x_conf 99.574196'>n</span>
<span class='ocrx_cinfo' title='x_bboxes 151 12 156 26; x_conf 99.548164'>t</span>
</span>
</span>
How does Tesseract compute x_wconf, and how does it relate to the per-character x_conf values?
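One way to explore the question empirically is to parse the hOCR and compare each word's x_wconf with simple aggregates of its characters' x_conf values. The sketch below uses only the Python standard library. The names `HocrConfParser` and `word_confidences` are made up for this illustration, and the mean and minimum are just two candidate aggregates to compare against, not a statement of how Tesseract derives x_wconf.

```python
import re
from html.parser import HTMLParser

# The hOCR excerpt from the question, with well-formed class attributes.
SAMPLE = """
<span class='ocr_line' id='line_1_1' title="bbox 92 11 735 31; baseline 0 -5; x_size 18; x_descenders 3; x_ascenders 4">
<span class='ocrx_word' id='word_1_1' title='bbox 92 11 156 26; x_wconf 96'>
<span class='ocrx_cinfo' title='x_bboxes 92 11 102 26; x_conf 99.574387'>P</span>
<span class='ocrx_cinfo' title='x_bboxes 92 11 103 26; x_conf 99.574722'>a</span>
<span class='ocrx_cinfo' title='x_bboxes 105 15 115 26; x_conf 99.572853'>t</span>
<span class='ocrx_cinfo' title='x_bboxes 116 11 125 26; x_conf 99.571632'>i</span>
<span class='ocrx_cinfo' title='x_bboxes 128 15 138 26; x_conf 99.518715'>e</span>
<span class='ocrx_cinfo' title='x_bboxes 140 15 149 26; x_conf 99.574196'>n</span>
<span class='ocrx_cinfo' title='x_bboxes 151 12 156 26; x_conf 99.548164'>t</span>
</span>
</span>
"""


class HocrConfParser(HTMLParser):
    """Collect x_wconf per word and x_conf per character from hOCR spans."""

    def __init__(self):
        super().__init__()
        self.words = []       # one dict per ocrx_word span
        self._stack = []      # class of every currently open <span>
        self._current = None  # the word being filled in, if any

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attr = dict(attrs)
        cls = attr.get("class") or ""
        title = attr.get("title") or ""
        self._stack.append(cls)
        if cls == "ocrx_word":
            m = re.search(r"x_wconf\s+([\d.]+)", title)
            self._current = {
                "text": "",
                "x_wconf": float(m.group(1)) if m else None,
                "x_conf": [],
            }
            self.words.append(self._current)
        elif cls == "ocrx_cinfo" and self._current is not None:
            m = re.search(r"x_conf\s+([\d.]+)", title)
            if m:
                self._current["x_conf"].append(float(m.group(1)))

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            if self._stack.pop() == "ocrx_word":
                self._current = None

    def handle_data(self, data):
        # Only text directly inside a character span belongs to the word.
        if self._current is not None and self._stack and self._stack[-1] == "ocrx_cinfo":
            self._current["text"] += data


def word_confidences(hocr):
    """Return x_wconf next to the mean and minimum of the character x_conf values."""
    parser = HocrConfParser()
    parser.feed(hocr)
    parser.close()
    rows = []
    for word in parser.words:
        chars = word["x_conf"]
        rows.append({
            "text": word["text"],
            "x_wconf": word["x_wconf"],
            "char_mean": sum(chars) / len(chars) if chars else None,
            "char_min": min(chars) if chars else None,
        })
    return rows


if __name__ == "__main__":
    for row in word_confidences(SAMPLE):
        print(row)
```

For the excerpt above, the character mean is about 99.56 and the minimum is 99.518715, while x_wconf is 96, so the word score is evidently not a plain mean or minimum of the listed x_conf values.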
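Installing tesseract with yum and python3 tesserocr with pip on CentOS 7
On CentOS 7, install tesseract with yum, then install the Python 3 tesserocr package with pip.
Posted 2018-09-04 00:00:27. Tags: centos7, python3, tesserocr. Category: python.
Copyright notice: this is an original article by the blogger and may not be reposted without the blogger's permission. https://blog.csdn.net/zyy247796143/article/details/82356867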
# Install the EPEL repository:
yum -y install epel-release
# Install tesseract:
yum -y install tesseract
# List the languages tesseract supports:
tesseract --list-langs
List of available languages (1):
eng
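Only English is available at this point. To install more language packs, fetch them with git and move them into the tessdata directory:
git clone https://github.com/tesseract-ocr/tessdata.git
mv tessdata/* /usr/share/tesseract/tessdata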
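To confirm from a script that the new language files were picked up, the output of `tesseract --list-langs` can be parsed. The following is a minimal sketch; the function names `parse_list_langs` and `installed_languages` are invented for this example, and it assumes the output has the shape shown above (a header line followed by one language code per line). Standard output and standard error are merged because the stream used may differ between tesseract versions.

```python
import subprocess


def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages (N):" header.
        if not line or line.lower().startswith("list of available languages"):
            continue
        langs.append(line)
    return langs


def installed_languages(binary="tesseract"):
    """Run the tesseract binary and return the languages it reports."""
    proc = subprocess.run(
        [binary, "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams into stdout
        universal_newlines=True,
        check=True,
    )
    return parse_list_langs(proc.stdout)


if __name__ == "__main__":
    print(parse_list_langs("List of available languages (1):\neng\n"))
```

Calling `installed_languages()` before and after the `mv` step shows whether the extra traineddata files ended up in the directory tesseract actually reads.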
Install pillow and tesserocr with pip:
pip3 install pillow tesserocr
pillow installed successfully, but tesserocr failed with an error:
Installing collected packages: tesserocr
Running setup.py install for tesserocr ... error
Complete output from command /usr/local/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-i48iarbe/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-p27b42h9/install-record.txt --single-version-externally-managed --compile:
pkg-config failed to find tesseract/lept libraries: b"Package tesseract was not found in the pkg-config search path.\nPerhaps you should add the directory containing `tesseract.pc'\nto the PKG_CONFIG_PATH environment variable\nNo package 'tesseract' found\n"
Supporting tesseract v3.04.00
Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 197632}}
/usr/local/python3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'tesserocr' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/python3/include/python3.6m -c tesserocr.cpp -o build/temp.linux-x86_64-3.6/tesserocr.o
tesserocr.cpp:597:34: fatal error: leptonica/allheaders.h: No such file or directory
#include "leptonica/allheaders.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/usr/local/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-i48iarbe/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-p27b42h9/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-i48iarbe/tesserocr/
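# Fix: the build cannot find the tesseract and leptonica headers, so install the tesseract-devel package:
yum install tesseract-devel
# Then install tesserocr with pip again:
pip3 install tesserocr
This time it completes without errors.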
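Once tesserocr imports cleanly, a quick smoke test is to run OCR on an image and look at the per-word confidences it reports. The sketch below is a minimal example under a few assumptions: the helper names `pair_words` and `ocr_with_confidences` are made up here, the image path is whatever file you pass in, and pairing words with confidences by position assumes the recognized text splits on whitespace into the same words the engine scored. `PyTessBaseAPI`, `SetImageFile`, `GetUTF8Text` and `AllWordConfidences` are tesserocr calls.

```python
def pair_words(text, confidences):
    """Pair whitespace-separated words with confidences by position."""
    return list(zip(text.split(), confidences))


def ocr_with_confidences(image_path, lang="eng"):
    """Recognize an image file and return (word, confidence) pairs."""
    # Imported lazily so pair_words stays usable without tesserocr installed.
    import tesserocr

    with tesserocr.PyTessBaseAPI(lang=lang) as api:
        api.SetImageFile(image_path)
        text = api.GetUTF8Text()
        return pair_words(text, api.AllWordConfidences())


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for word, conf in ocr_with_confidences(sys.argv[1]):
            print(conf, word)
```

Run it as `python3 script.py some_image.png`, where the script and image names are placeholders. An ImportError at this point means the pip build still did not succeed.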
Installing tesseract-ocr on CentOS 7 for CAPTCHA recognition (installing tesseract with yum)

Step 1: install tesseract
yum install tesseract -y
Check the installed version:
tesseract -v
Step 2: install more languages
yum install -y tesseract-langpack-rus
Reposted from http://tutorialspots.com/how-to-install-tesseract-on-centos-7-4500.html
1. Install the CentOS system dependencies
yum install -y automake autoconf libtool gcc gcc-c++
yum install -y libpng-devel libjpeg-devel libtiff-devel
Some freshly provisioned servers also need:
yum install gtk2-devel yasm glibc.i686 libstdc++.so.6 libgtk-x11-2.0.so libatk-1.0.so.0 libcairo.so.2 libcups.so.2 libgdk-x11-2.0.so.0 libgdk_pixbuf-2.0.so.0 libgtk-x11-2.0.so.0 libpango-1.0.so.0 libpangocairo-1.0.so.0 libICE.so.6 libSM.so.6 libmng.so.1 libpng12.so.0 libGLU.so.1 -y
2. Install leptonica
wget http://www.leptonica.org/source/leptonica-1.72.tar.gz
tar xvzf leptonica-1.72.tar.gz
cd leptonica-1.72/
./configure --prefix=/usr/local/
make && make install
Configure the environment. Open /etc/bashrc:
vim /etc/bashrc
and add the following lines:
PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig
export PKG_CONFIG_PATH
CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/usr/local/include/
export CPLUS_INCLUDE_PATH
C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/leptonica/include/leptonica
export C_INCLUDE_PATH
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LD_LIBRARY_PATH
LIBRARY_PATH=$LIBRARY_PATH:/usr/local/lib
export LIBRARY_PATH
TESSDATA_PREFIX=/usr/local/share/tessdata
export TESSDATA_PREFIX
Finally, reload the file:
source /etc/bashrc
Then open /etc/profile:
vim /etc/profile
and append the following at the end:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LIBLEPT_HEADERSDIR=/usr/local/include
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
source /etc/profile
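A mistyped variable name in these files fails silently, because an undefined shell variable simply expands to an empty string. As a rough check, the sketch below looks for the expected directories in the colon-separated variables set above. The function name `missing_path_entries` and the `REQUIRED` table are assumptions made for this example; adjust the table to match your own prefix.

```python
import os

# Directories the build steps above expect to find, keyed by variable name.
REQUIRED = {
    "PKG_CONFIG_PATH": "/usr/local/lib/pkgconfig",
    "LD_LIBRARY_PATH": "/usr/local/lib",
}


def missing_path_entries(env, required=REQUIRED):
    """Return the variables whose colon-separated value lacks the expected directory."""
    missing = {}
    for var, wanted in required.items():
        # Drop empty entries left behind by leading or doubled colons.
        entries = [e.rstrip("/") for e in env.get(var, "").split(":") if e]
        if wanted.rstrip("/") not in entries:
            missing[var] = wanted
    return missing


if __name__ == "__main__":
    for var, wanted in missing_path_entries(os.environ).items():
        print("{} does not contain {}".format(var, wanted))
```

Run it in a fresh login shell so that it sees the values produced by /etc/bashrc and /etc/profile rather than those of the current session.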
3. Install tesseract-ocr
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.zip
unzip 3.04.zip
cd tesseract-3.04/
./autogen.sh
./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
make && make install
sudo ldconfig
4. Install the language data
- Download the model files for the languages you need from https://github.com/tesseract-ocr/tessdata
- Move the model files to /usr/local/share/tessdata
(for example eng.traineddata and osd.traineddata)
https://www.cnblogs.com/panpan61803/p/10978117.html
https://www.cnblogs.com/arachis/p/OCR.html
https://blog.csdn.net/diyiday/article/details/80004793
Using tesseract 4.0.0-1.4.4 from Java
Tip:
Consider using tess4j directly. tess4j is a wrapper around tesseract and is simpler to use.
First, add the dependency:
<!-- https://mvnrepository.com/artifact/org.bytedeco.javacpp-presets/tesseract -->
<dependency>
<groupId>org.bytedeco.javacpp-presets</groupId>
<artifactId>tesseract</artifactId>
<version>4.0.0-1.4.4</version>
</dependency>
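You also need to download the installer:
https://sourceforge.net/projects/tesseract-ocr-alt/files/
If the page will not open, download it over a connection outside the country, or look for a domestic mirror.
To keep things simple, select every component in the installer.
Open cmd as administrator and run a quick test.
If the command reports an "Empty page!!" error, the image is probably the problem: its resolution and DPI are too low.
For example, the first image I tried could not be recognized.
A different image worked.
The recognition rate is still not great,
but it does show that the code itself is fine.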
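If low resolution is the cause, enlarging the image before recognition is a common workaround. The sketch below only does the arithmetic: given the pixel size and the DPI of the source image, it returns the pixel size needed to reach a target DPI. The function name `upscale_size` is made up for this example, and the 300 DPI default is an assumption based on commonly cited guidance for Tesseract input, not something measured here.

```python
import math


def upscale_size(width, height, current_dpi, target_dpi=300):
    """Return the (width, height) in pixels needed to reach target_dpi.

    Images already at or above the target are left at their original size.
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    factor = max(1.0, target_dpi / current_dpi)
    return math.ceil(width * factor), math.ceil(height * factor)


if __name__ == "__main__":
    # A 400x100 pixel scan at 150 DPI needs doubling to reach 300 DPI.
    print(upscale_size(400, 100, 150))
```

The resulting size can be passed to whatever image library you use for resizing before handing the image to tesseract.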
leptonica & tesseract & tess4j
wget http://www.leptonica.org/source/leptonica-1.73.tar.gz
tar -zxvf leptonica-1.73.tar.gz
cd leptonica-1.73
./configure && make && sudo make install
wget https://github.com/tesseract-ocr/tesseract/archive/master.zip
unzip master.zip
cd tesseract-master
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
wget https://github.com/tesseract-ocr/langdata/archive/master.zip
GitHub repositories and related links:
https://github.com/danbloomberg/leptonica
https://github.com/tesseract-ocr/tesseract
https://github.com/tesseract-ocr/langdata
http://san-yun.iteye.com/blog/1954866
https://github.com/iOS0x00/DiscuzAPI/blob/master/DiscuzAPI.py