子嘉的博客
高子嘉

子嘉
2024-02-28
pytesseract

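# Environment setup

Offline image recognition (OCR) in Python relies on the Tesseract engine; the corresponding Python library is pytesseract. Tesseract itself must be installed before pytesseract can be used.

See the Tesseract installation documentation and the documentation on compiling from source.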
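Because pytesseract is a wrapper that calls the tesseract executable, a quick first check is whether that executable can be found on PATH. The following is a minimal sketch using only the standard library; the helper name `find_tesseract` is made up for illustration.

```python
import shutil


def find_tesseract(names=("tesseract",)):
    """Return the full path of the first matching executable on PATH, or None.

    `find_tesseract` is a hypothetical helper, not part of pytesseract.
    """
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH; install it first")
    else:
        # If the binary lives outside PATH, pytesseract can be pointed at it
        # by assigning the path to pytesseract.pytesseract.tesseract_cmd.
        print("tesseract found at", path)
```

If the function returns None, install Tesseract by following the sections below before going any further.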

# Installing on Windows

# Installing on macOS

Build from source:

# Packages which are always needed.
brew install automake autoconf libtool
brew install pkgconfig
brew install icu4c
brew install leptonica
# Packages required for training tools.
brew install pango
# Optional packages for extra features.
brew install libarchive
# Optional package for builds using g++.
brew install gcc

# Clone the Tesseract source and build it.
git clone https://github.com/tesseract-ocr/tesseract/
cd tesseract
./autogen.sh
mkdir build
cd build
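# Optionally add CXX=g++-8 to the configure command if you really want to use a different compiler.
# Note: the paths below assume Homebrew lives under /usr/local. On Apple Silicon
# Homebrew installs under /opt/homebrew, so adjust them (see `brew --prefix`).
../configure PKG_CONFIG_PATH=/usr/local/opt/icu4c/lib/pkgconfig:/usr/local/opt/libarchive/lib/pkgconfig:/usr/local/opt/libffi/lib/pkgconfig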
make -j
# Optionally install Tesseract.
sudo make install
# Optionally build and install the training tools (only if you need them).
make training
sudo make training-install
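After `sudo make install`, one way to confirm that the build actually works is to run `tesseract --version` and read the version number back. The sketch below does this from Python. The helper names are made up, and the regular expression assumes the version line looks like `tesseract 5.x.y`, which may differ for unusual builds.

```python
import re
import subprocess


def parse_tesseract_version(text):
    """Extract (major, minor, patch) from `tesseract --version` output, or None."""
    match = re.search(r"tesseract\s+v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_tesseract_version(cmd="tesseract"):
    """Run `<cmd> --version` and return the parsed version, or None if unavailable."""
    try:
        result = subprocess.run(
            [cmd, "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # The executable is not on PATH, so the install did not succeed.
        return None
    # Some builds print the version to stderr, so look at both streams.
    return parse_tesseract_version(result.stdout + result.stderr)


if __name__ == "__main__":
    print("Installed Tesseract version:", installed_tesseract_version())
```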

After installation, set the environment variable:

export TESSDATA_PREFIX=/usr/local/share/tessdata
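To see whether the directory named by TESSDATA_PREFIX actually holds the language files Tesseract will look for, a small check like the one below can help. It is only a sketch: `missing_traineddata` is a hypothetical helper, and the fallback path simply repeats the value exported above.

```python
import os
from pathlib import Path


def missing_traineddata(langs, tessdata_dir=None):
    """Return the languages whose <lang>.traineddata file is absent.

    If tessdata_dir is not given, TESSDATA_PREFIX is used, falling back to
    /usr/local/share/tessdata (the value exported in this post).
    """
    if tessdata_dir is None:
        tessdata_dir = os.environ.get("TESSDATA_PREFIX", "/usr/local/share/tessdata")
    base = Path(tessdata_dir)
    return [lang for lang in langs if not (base / f"{lang}.traineddata").is_file()]


if __name__ == "__main__":
    missing = missing_traineddata(["eng"])
    if missing:
        print("Missing trained data for:", ", ".join(missing))
    else:
        print("All requested language files are present")
```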

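Running the code at this point fails with the error: Error opening data file /usr/local/share/tessdata/eng.traineddata

The error occurs because Tesseract tries to load its trained language data and cannot find it. The data can be downloaded from GitHub at https://github.com/tesseract-ocr/tessdata

Alternatively, download the file directly with the command below, then place it in the directory that TESSDATA_PREFIX points to: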

wget https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata
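The same download can be done from Python so that the file lands directly in the tessdata directory. This is a rough sketch using only the standard library. The function names are made up, the URL pattern is taken from the wget command above, and writing to /usr/local/share may require elevated permissions.

```python
import os
import urllib.request
from pathlib import Path

# Base URL taken from the wget command in this post.
TESSDATA_RAW = "https://github.com/tesseract-ocr/tessdata/raw/main"


def traineddata_url(lang):
    """Build the download URL for one language file (hypothetical helper)."""
    return f"{TESSDATA_RAW}/{lang}.traineddata"


def download_traineddata(lang="eng", dest_dir=None):
    """Download <lang>.traineddata into dest_dir (default: TESSDATA_PREFIX)."""
    dest = Path(
        dest_dir or os.environ.get("TESSDATA_PREFIX", "/usr/local/share/tessdata")
    )
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / f"{lang}.traineddata"
    # urlretrieve follows the redirect GitHub issues for /raw/ URLs.
    urllib.request.urlretrieve(traineddata_url(lang), target)
    return target
```

Calling `download_traineddata("eng")` fetches the English data; other language codes from the same repository should work the same way.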

# Code example

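import pytesseract
from PIL import Image

# Open the image to recognize.
img = Image.open('./num.jpg')
# Printing the Image object shows its mode and size, confirming that it loaded.
print(img)
# Run Tesseract OCR on the image and get the recognized text back as a string.
text = pytesseract.image_to_string(img)
print(text)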
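If the image contains only digits, as the file name num.jpg suggests, recognition can often be improved by passing Tesseract options through the `config` argument of `image_to_string`. The sketch below is one possible setup, not a tuned recipe. The helper names are made up, `--psm 7` assumes the image holds a single line of text, and `tessedit_char_whitelist` is not honored by the LSTM engine in Tesseract 4.0.

```python
def digits_config(psm=7):
    """Build a Tesseract config string that restricts output to digits."""
    return f"--psm {psm} -c tessedit_char_whitelist=0123456789"


def read_digits(path, psm=7):
    """OCR the image at `path` and return the recognized digits as a string."""
    # Imported here so the helper above can be used without these packages.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        # Converting to grayscale is a common, simple preprocessing step.
        gray = img.convert("L")
        text = pytesseract.image_to_string(gray, lang="eng", config=digits_config(psm))
    return text.strip()


# Example usage (requires Tesseract, pytesseract, Pillow and the image file):
# print(read_digits("./num.jpg"))
```

If the digits span several lines, a different page segmentation mode such as `--psm 6` may suit the image better.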
Last updated: 2024/03/02, 18:30:15