Python Image Recognition with OCR
2020-11-07 20:56:00 【Coxhuang】
Contents
- Python Image Recognition with OCR
- #1 Requirements
- #2 Environment
- #3 Installation
- #3.1 macOS
- #3.2 Linux (CentOS)
- #4 Usage
- #4.1 Installing the pytesseract library for Python
- #4.2 Python code
- #5 Online example
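Python Image Recognition with OCR
#1 Requirements
- Recognize the information in an image, for example a QR code. (Tesseract extracts printed text; decoding the payload of a QR code itself requires a separate barcode library.)
#2 Environment
macOS / Linux, Python 3.7.6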
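#3 Installation
#3.1 macOS
1. Install tesseract
# Install tesseract only, without the training tools
brew install tesseract
# Install tesseract together with the training tools
brew install --with-training-tools tesseract
# Install tesseract together with all languages. The language packs are large and slow to install, so it is better to skip this and add languages as needed
brew install --all-languages tesseract
# Install tesseract together with the training tools and all languages
brew install --all-languages --with-training-tools tesseract
Note: the --with-training-tools and --all-languages options date from the Homebrew version used when this was written. Newer Homebrew releases may reject install-time options, in which case the plain brew install tesseract command is the one to use.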
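2. Download a language pack
URL : https://github.com/tesseract-ocr/tessdata
I installed the Chinese (Simplified) language pack here.
Chinese language pack : https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
This link opens a GitHub file page rather than the file itself, so use the download button on that page to get chi_sim.traineddata.
Then copy the downloaded language pack to the following path (the version directory, 4.0.0_1 here, depends on the tesseract version installed) :
/usr/local/Cellar/tesseract/4.0.0_1/share/tessdata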
3. List the locally installed language packs
tesseract --list-langs
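The same check can be scripted before running any OCR job. The following is a minimal sketch, assuming the tesseract binary is on the PATH; the helper names `parse_list_langs` and `installed_languages` are made up for this illustration, and the header line format is an assumption based on typical `tesseract --list-langs` output.

```python
import subprocess


def parse_list_langs(output):
    """Turn the text printed by `tesseract --list-langs` into a list of codes."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is assumed to be a header such as
    # "List of available languages (3):", so it is skipped.
    return lines[1:]


def installed_languages():
    """Run `tesseract --list-langs` and return the language codes it reports."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some versions may write the list to stderr instead of stdout.
    return parse_list_langs(result.stdout or result.stderr)
```

Calling `"chi_sim" in installed_languages()` then tells you whether the Chinese language pack was copied to the right place.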
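#3.2 Linux (CentOS)
1. Install dependencies
yum install autoconf automake libtool libjpeg-devel libpng-devel libtiff-devel zlib-devel
2. Install leptonica
Download : wget https://github.com/tesseract-ocr/tesseract/archive/4.1.0.tar.gz
Note that this URL points to the tesseract 4.1.0 source archive, while the steps below expect leptonica-1.74.4.tar.gz. Obtain that archive from the leptonica project instead.
Extract and install
tar -xzvf leptonica-1.74.4.tar.gz
cd leptonica-1.74.4
./configure --prefix=/usr/local/leptonica
make
sudo make install
3. Install tesseract-ocr
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.zip
unzip 3.04.zip
cd tesseract-3.04/
# A GitHub source archive ships without a generated configure script, so generate it first
./autogen.sh
./configure
make && sudo make install
sudo ldconfig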
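4. Download a language pack
I installed the Chinese (Simplified) language pack here.
Chinese language pack : https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
Then copy the downloaded language pack to the following path :
/usr/local/share/tessdata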
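A quick way to confirm that the language pack landed in one of the directories named above is sketched below. The helper names `find_traineddata` and `tessdata_config` are hypothetical, and the directory list simply repeats the two paths from this article, so adjust it to your own installation. The `--tessdata-dir` option is the one pytesseract's README passes through its `config` argument.

```python
from pathlib import Path

# The two tessdata directories mentioned in this article (assumed locations).
CANDIDATE_DIRS = [
    Path("/usr/local/share/tessdata"),
    Path("/usr/local/Cellar/tesseract/4.0.0_1/share/tessdata"),
]


def find_traineddata(lang, dirs=CANDIDATE_DIRS):
    """Return the first existing <lang>.traineddata file, or None if absent."""
    for directory in dirs:
        candidate = Path(directory) / f"{lang}.traineddata"
        if candidate.is_file():
            return candidate
    return None


def tessdata_config(tessdata_dir):
    """Build a config string that points tesseract at a specific tessdata dir."""
    return f'--tessdata-dir "{tessdata_dir}"'
```

If `find_traineddata("chi_sim")` returns a path, its parent directory can be handed to `tessdata_config` and the result passed as `config=` to `pytesseract.image_to_string`.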
#4 Usage
#4.1 Installing the pytesseract library for Python
pip install pytesseract
pip install Pillow
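pytesseract is a wrapper that calls the tesseract executable, so it only works if that binary can be found. The sketch below, using a hypothetical helper named `locate_tesseract`, checks the PATH with the standard library before any OCR call is made.

```python
import shutil


def locate_tesseract(names=("tesseract",)):
    """Return the full path of the first executable found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


# If the binary lives outside PATH, pytesseract can be pointed at it explicitly:
#   import pytesseract
#   pytesseract.pytesseract.tesseract_cmd = "/full/path/to/tesseract"
```

A `None` result usually means the installation from section #3 did not finish or its bin directory is not on the PATH.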
#4.2 Python code
from PIL import Image
import pytesseract
# Specify the image path and the recognition language
data = pytesseract.image_to_string(Image.open('/Users/Documents/1.png'), lang='chi_sim')
print(data)
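Output for Chinese text often contains stray spaces between characters and trailing blank lines. The sketch below wraps the call above in a function and adds a small clean-up step; `ocr_image` and `clean_ocr_text` are names invented for this example, and the clean-up rule (dropping spaces between two CJK characters) is an assumption about what you want, not something the library does for you.

```python
import re

# Spaces or tabs that sit between two CJK unified ideographs.
_CJK_GAP = re.compile(r"(?<=[\u4e00-\u9fff])[ \t]+(?=[\u4e00-\u9fff])")


def clean_ocr_text(text):
    """Remove spaces between Chinese characters and trim surrounding whitespace."""
    return _CJK_GAP.sub("", text).strip()


def ocr_image(path, lang="chi_sim"):
    """Run Tesseract on the image at `path` and return the cleaned text."""
    # Imported here so clean_ocr_text stays usable without Pillow or pytesseract.
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        raw = pytesseract.image_to_string(image, lang=lang)
    return clean_ocr_text(raw)
```

With this in place, `print(ocr_image('/Users/Documents/1.png'))` replaces the two lines of the original snippet. Spaces between Latin words are left untouched.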
#5 Online example
URL :
Copyright notice
This article was written by [Coxhuang]. Please include a link to the original when reposting. Thank you.
https://cloud.tencent.com/developer/article/1744581