Python image recognition OCR
2020-11-07 20:56:00 【Coxhuang】
Contents
- Python image recognition OCR
- #1 Requirements
- #2 Environment
- #3 Installation
- #3.1 macOS
- #3.2 Linux (CentOS)
- #4 Usage
- #4.1 Install the pytesseract library
- #4.2 Python code
- #5 Online example
Python image recognition OCR
#1 Requirements
- Recognize the information in an image, such as a QR code
#2 Environment
macOS / Linux, Python 3.7.6
#3 Installation
#3.1 macOS
1. Install tesseract
    # Install tesseract only, without the training tools
    brew install tesseract
    # Install tesseract together with the training tools
    brew install --with-training-tools tesseract
    # Install tesseract together with all language packs. The full set is large and slow to install, so this is not recommended; pick languages as needed
    brew install --all-languages tesseract
    # Install tesseract with both the training tools and all language packs
    brew install --all-languages --with-training-tools tesseract
The --with-training-tools and --all-languages options reflect the Homebrew formula available when this was written; newer Homebrew releases may no longer accept them.
2. Download the language pack
Address: https://github.com/tesseract-ocr/tessdata
I installed the Simplified Chinese language pack:
Chinese language pack: https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
Then copy the downloaded language pack to the following path:
    /usr/local/Cellar/tesseract/4.0.0_1/share/tessdata
3. List the locally installed language packs
    tesseract --list-langs
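The check in step 3 can also be done from Python. The following is a minimal sketch, not part of the original setup: the helper names `parse_langs` and `installed_langs` are made up for illustration, and the header format ("List of available languages ...") is assumed from typical tesseract output.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, e.g. "List of available languages (3):"
    return [line for line in lines if not line.lower().startswith("list of")]


def installed_langs():
    """Return the installed language codes, or [] if tesseract is not on PATH."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Assumption: depending on the tesseract version, the list may be
    # written to stdout or stderr, so both streams are scanned.
    return parse_langs(result.stdout + "\n" + result.stderr)


langs = installed_langs()
print("chi_sim installed:", "chi_sim" in langs)
```

If `chi_sim` is reported as missing, the language pack was probably copied to a different tessdata directory than the one tesseract reads.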
#3.2 Linux (CentOS)
1. Install dependencies
    yum install autoconf automake libtool libjpeg-devel libpng-devel libtiff-devel zlib-devel
2. Install leptonica
Download:
    wget http://www.leptonica.org/source/leptonica-1.74.4.tar.gz
Unpack and install:
    tar -xzvf leptonica-1.74.4.tar.gz
    cd leptonica-1.74.4
    ./configure --prefix=/usr/local/leptonica
    make
    sudo make install
3. Install tesseract-ocr
    wget https://github.com/tesseract-ocr/tesseract/archive/3.04.zip
    unzip 3.04.zip
    cd tesseract-3.04/
    ./autogen.sh
    ./configure
    make && make install
    sudo ldconfig
I installed the Simplified Chinese language pack:
Chinese language pack: https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
Then copy the downloaded language pack to the following path:
    /usr/local/share/tessdata
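The copy step can be scripted so that a wrong file or a missing tessdata directory is caught early. This is only a sketch: `install_traineddata` is a hypothetical helper, and the paths in the usage comment are the ones mentioned in this article, which may differ on your machine.

```python
import shutil
from pathlib import Path


def install_traineddata(downloaded_file, tessdata_dir):
    """Copy a downloaded .traineddata file into a tessdata directory."""
    src = Path(downloaded_file)
    dest_dir = Path(tessdata_dir)
    if src.suffix != ".traineddata":
        raise ValueError(f"not a .traineddata file: {src}")
    if not dest_dir.is_dir():
        raise NotADirectoryError(f"tessdata directory not found: {dest_dir}")
    dest = dest_dir / src.name
    shutil.copy2(src, dest)  # copies the content and keeps file metadata
    return dest


# Example usage (paths are assumptions; writing to /usr/local usually needs root):
# install_traineddata("chi_sim.traineddata", "/usr/local/share/tessdata")
```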
#4 Usage
#4.1 Install the pytesseract library
    pip install pytesseract
    pip install Pillow
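pytesseract is a wrapper around the tesseract command-line program, so both the Python packages and the binary need to be present. A quick way to confirm this is sketched below; `check_ocr_setup` is an illustrative name, not part of pytesseract.

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report which OCR prerequisites can be found in this environment."""
    return {
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "PIL": importlib.util.find_spec("PIL") is not None,
        "tesseract binary": shutil.which("tesseract") is not None,
    }


for name, found in check_ocr_setup().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```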
#4.2 Python code
from PIL import Image
import pytesseract

# Open the image and run OCR on it; lang selects the language pack (chi_sim = Simplified Chinese)
data = pytesseract.image_to_string(Image.open('/Users/Documents/1.png'), lang='chi_sim')
print(data)
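For repeated use, the call above can be wrapped in a small function, with a cleanup step for the result. This is a sketch under a few assumptions: `ocr_image` and `clean_ocr_text` are hypothetical helpers, and the cleanup assumes that the raw output may contain blank lines and stray spaces between Chinese characters, which is common but not guaranteed.

```python
import re
from pathlib import Path

# Whitespace that sits between two CJK ideographs
_CJK_GAP = re.compile(r"(?<=[\u4e00-\u9fff])[ \t]+(?=[\u4e00-\u9fff])")


def clean_ocr_text(raw):
    """Drop blank lines and remove spaces between Chinese characters."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(_CJK_GAP.sub("", line) for line in lines if line)


def ocr_image(path, lang="chi_sim"):
    """Run tesseract on one image and return the cleaned text."""
    image_path = Path(path)
    if not image_path.is_file():
        raise FileNotFoundError(f"image not found: {image_path}")
    # Imported here so the path check works even before the libraries are installed
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return clean_ocr_text(pytesseract.image_to_string(img, lang=lang))


# Example usage (same path as in the snippet above):
# print(ocr_image('/Users/Documents/1.png'))
```

Checking the path first gives a clearer error than the one raised from inside the imaging library, and the `with` block closes the image file once recognition finishes.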
#5 Online example
Address :
Copyright notice
This article was written by [Coxhuang]. Please include a link to the original when reposting. Thank you.