├── .gitignore
├── .gitmodules
├── LICENSE
├── MistralOCR
    ├── __init__.py
    ├── i18n.csv
    ├── mistral_ocr.py
    └── mistral_ocr_config.py
├── README.md
├── demo_AbaOCR
    ├── README.md
    ├── __init__.py
    ├── aba_ocr.py
    ├── aba_ocr_config.py
    └── i18n.csv
├── win7_x64_Pix2Text
    ├── __init__.py
    ├── i18n.csv
    ├── p2t_api.py
    └── p2t_config.py
├── win7_x64_RapidOCR-json
    ├── __init__.py
    ├── api_rapidocr.py
    ├── i18n.csv
    ├── rapidocr.py
    └── rapidocr_config.py
└── win_linux_PaddleOCR-json
    ├── PPOCR_api.py
    ├── PPOCR_config.py
    ├── PPOCR_umi.py
    ├── README.md
    ├── __init__.py
    └── i18n.csv


/.gitignore:
--------------------------------------------------------------------------------
1 | **/*.dll
2 | **/*.exe
3 | **/models
4 | **/__pycache__
5 | **/site-packages
6 | *.7z
7 | temp


--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
 1 | [submodule "chineseocr_umi_plugin"]
 2 | 	path = chineseocr_umi_plugin
 3 | 	url = git@github.com:qwedc001/chineseocr_umi_plugin.git
 4 | [submodule "tesseractOCR_umi_plugin"]
 5 | 	path = tesseractOCR_umi_plugin
 6 | 	url = git@github.com:qwedc001/tesseractOCR_umi_plugin.git
 7 | [submodule "WechatOCR_umi_plugin"]
 8 | 	path = WechatOCR_umi_plugin
 9 | 	url = git@github.com:eaeful/WechatOCR_umi_plugin.git
10 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2023 hiroi-sora
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/MistralOCR/__init__.py:
--------------------------------------------------------------------------------
 1 | from . import mistral_ocr
 2 | from . import mistral_ocr_config
 3 | 
 4 | # Plugin information
 5 | PluginInfo = {
 6 |     # Plugin group
 7 |     "group": "ocr",
 8 |     # Global options
 9 |     "global_options": mistral_ocr_config.globalOptions,
10 |     # Local options
11 |     "local_options": mistral_ocr_config.localOptions,
12 |     # API class
13 |     "api_class": mistral_ocr.Api,
14 | }
15 | 


--------------------------------------------------------------------------------
/MistralOCR/i18n.csv:
--------------------------------------------------------------------------------
 1 | key,en_US
 2 | Mistral OCR,Mistral OCR
 3 | API密钥,API Key
 4 | Mistral API的密钥，用于访问OCR服务。,API key for Mistral OCR service.
 5 | 模型,Model
 6 | Mistral OCR使用的模型名称。,Model name used by Mistral OCR.
 7 | 超时时间,Timeout
 8 | API请求的超时时间。,Timeout for API requests.
 9 | 秒,seconds
10 | 包含图像Base64,Include Image Base64
11 | 是否在响应中包含图像的Base64编码。,Whether to include Base64 encoded images in the response.
12 | 文字识别（Mistral OCR）,Text Recognition (Mistral OCR)
13 | 语言,Language
14 | 自动检测,Auto Detect
15 | 中文,Chinese
16 | 英文,English
17 | 日语,Japanese
18 | 韩语,Korean
19 | 法语,French
20 | 德语,German
21 | 俄语,Russian
22 | 西班牙语,Spanish
23 | 葡萄牙语,Portuguese
24 | 意大利语,Italian
25 | 识别的目标语言，自动检测可能会影响准确性。,Target language for recognition. Auto detection may affect accuracy.
26 | 最小图像尺寸,Minimum Image Size
27 | 提取图像的最小尺寸（像素）。0表示不限制。,Minimum size of extracted images (pixels). 0 means no limit.
28 | 


--------------------------------------------------------------------------------
/MistralOCR/mistral_ocr.py:
--------------------------------------------------------------------------------
  1 | import base64
  2 | import json
  3 | import time
  4 | import os
  5 | import sys
  6 | from urllib.parse import urlparse
  7 | import socket
  8 | import ssl
  9 | from http.client import HTTPConnection, HTTPSConnection
 10 | 
 11 | # Built-in HTTP client, so the plugin does not depend on the requests library
 12 | class Response:
 13 |     """HTTP response class that mimics requests.Response."""
 14 |     def __init__(self, status_code, headers, content):
 15 |         self.status_code = status_code
 16 |         self.headers = headers
 17 |         self._content = content
 18 |         self._text = None
 19 |         self._json = None
 20 | 
 21 |     @property
 22 |     def content(self):
 23 |         """Return the response body as bytes."""
 24 |         return self._content
 25 | 
 26 |     @property
 27 |     def text(self):
 28 |         """Return the response body decoded as UTF-8 text."""
 29 |         if self._text is None:
 30 |             self._text = self._content.decode('utf-8')
 31 |         return self._text
 32 | 
 33 |     def json(self):
 34 |         """Parse the response body as JSON."""
 35 |         if self._json is None:
 36 |             self._json = json.loads(self.text)
 37 |         return self._json
 38 | 
 39 | class RequestException(Exception):
 40 |     """Base class for request errors."""
 41 |     pass
 42 | 
 43 | class Timeout(RequestException):
 44 |     """Raised when a request times out."""
 45 |     pass
 46 | 
 47 | class ConnectionError(RequestException):
 48 |     """Raised on connection failures (shadows the built-in ConnectionError in this module)."""
 49 |     pass
 50 | 
 51 | # A minimal stand-in for the requests module
 52 | class SimpleRequests:
 53 |     def __init__(self):
 54 |         self.exceptions = type('exceptions', (), {
 55 |             'Timeout': Timeout,
 56 |             'ConnectionError': ConnectionError,
 57 |             'RequestException': RequestException
 58 |         })
 59 | 
 60 |     def post(self, url, headers=None, json_data=None, timeout=30):
 61 |         """
 62 |         Send a POST request.
 63 | 
 64 |         Args:
 65 |             url (str): Request URL.
 66 |             headers (dict): Request headers.
 67 |             json_data (dict): JSON request body.
 68 |             timeout (int): Timeout in seconds.
 69 | 
 70 |         Returns:
 71 |             Response: The response object.
 72 |         """
 73 |         parsed_url = urlparse(url)
 74 | 
 75 |         # Decide between HTTP and HTTPS
 76 |         is_https = parsed_url.scheme == 'https'
 77 | 
 78 |         # Resolve host and port. urlparse handles IPv6 literals and
 79 |         # userinfo, which a plain split(':') on netloc would break on.
 80 |         host = parsed_url.hostname
 81 |         port = parsed_url.port
 82 |         if port is None:
 83 |             port = 443 if is_https else 80
 84 | 
 85 |         # Build the request path
 86 |         path = parsed_url.path
 87 |         if not path:
 88 |             path = '/'
 89 |         if parsed_url.query:
 90 |             path = f"{path}?{parsed_url.query}"
 91 | 
 92 |         # Copy the headers so the caller's dict is not mutated
 93 |         headers = dict(headers) if headers else {}
 94 | 
 95 |         # Build the request body
 96 |         body = None
 97 |         if json_data is not None:
 98 |             body = json.dumps(json_data).encode('utf-8')
 99 |             headers['Content-Type'] = 'application/json'
100 |             headers['Content-Length'] = str(len(body))
101 | 
102 |         conn = None
103 |         try:
104 |             # Open the connection
105 |             if is_https:
106 |                 conn = HTTPSConnection(host, port, timeout=timeout)
107 |             else:
108 |                 conn = HTTPConnection(host, port, timeout=timeout)
109 | 
110 |             # Send the request
111 |             conn.request('POST', path, body=body, headers=headers)
112 | 
113 |             # Get the response
114 |             resp = conn.getresponse()
115 | 
116 |             # Read the response body
117 |             content = resp.read()
118 | 
119 |             # Collect the response headers (lower-cased names)
120 |             resp_headers = {k.lower(): v for k, v in resp.getheaders()}
121 | 
122 |             # Build the response object
123 |             response = Response(resp.status, resp_headers, content)
124 | 
125 |             return response
126 | 
127 |         except socket.timeout:
128 |             raise Timeout("请求超时")
129 |         except (socket.error, ssl.SSLError) as e:
130 |             raise ConnectionError(f"连接错误: {str(e)}")
131 |         except Exception as e:
132 |             raise RequestException(f"请求失败: {str(e)}")
133 |         finally:
134 |             # Always close the connection, including on errors
135 |             if conn is not None:
136 |                 conn.close()
137 | 
138 | # Instance of the requests stand-in
139 | requests = SimpleRequests()
140 | HTTP_CLIENT_AVAILABLE = True
141 | 
142 | class Api:
143 |     def __init__(self, globalArgd):
144 |         """Initialize the Mistral OCR API interface."""
145 |         self.api_key = globalArgd.get("api_key", "")
146 |         self.model = globalArgd.get("model", "mistral-ocr-latest")
147 |         self.timeout = globalArgd.get("timeout", 30)
148 |         self.include_image_base64 = globalArgd.get("include_image_base64", False)
149 |         self.language = "auto"
150 |         self.image_min_size = 0
151 |         self.api_url = "https://api.mistral.ai/v1/ocr"
152 |         self.headers = {
153 |             "Content-Type": "application/json",
154 |             "Authorization": f"Bearer {self.api_key}"
155 |         }
156 |         # Check whether the dependencies are available
157 |         self.dependency_error = self._check_dependencies()
158 | 
159 |     def _check_dependencies(self):
160 |         """Check that the required dependencies are available."""
161 |         if not HTTP_CLIENT_AVAILABLE:
162 |             return "[Error] HTTP客户端初始化失败。请联系插件作者。"
163 |         return ""
164 | 
165 |     def start(self, argd):
166 |         """Start the engine and apply the local options."""
167 |         # Check for dependency errors first
168 |         if self.dependency_error:
169 |             return self.dependency_error
170 | 
171 |         if not self.api_key:
172 |             return "[Error] API密钥未设置，请在全局设置中配置Mistral API密钥。"
173 | 
174 |         self.language = argd.get("language", "auto")
175 |         self.image_min_size = argd.get("image_min_size", 0)
176 |         return ""
177 | 
178 |     def stop(self):
179 |         """Stop the engine."""
180 |         pass
181 | 
182 |     def runPath(self, imgPath):
183 |         """Run OCR on an image given its file path."""
184 |         # Check for dependency errors
185 |         if self.dependency_error:
186 |             return {"code": 102, "data": self.dependency_error}
187 | 
188 |         try:
189 |             # Read the image file and encode it as base64
190 |             with open(imgPath, "rb") as image_file:
191 |                 image_base64 = base64.b64encode(image_file.read()).decode('utf-8')
192 | 
193 |             return self._process_image_base64(image_base64)
194 |         except Exception as e:
195 |             return {"code": 102, "data": f"[Error] 图片读取失败: {str(e)}"}
196 | 
197 |     def runBytes(self, imageBytes):
198 |         """Run OCR on an image given as bytes."""
199 |         # Check for dependency errors
200 |         if self.dependency_error:
201 |             return {"code": 102, "data": self.dependency_error}
202 | 
203 |         try:
204 |             image_base64 = base64.b64encode(imageBytes).decode('utf-8')
205 |             return self._process_image_base64(image_base64)
206 |         except Exception as e:
207 |             return {"code": 102, "data": f"[Error] 图片处理失败: {str(e)}"}
208 | 
209 |     def runBase64(self, imageBase64):
210 |         """Run OCR on a base64-encoded image."""
211 |         # Check for dependency errors
212 |         if self.dependency_error:
213 |             return {"code": 102, "data": self.dependency_error}
214 |             
215 |         try:
216 |             return self._process_image_base64(imageBase64)
217 |         except Exception as e:
218 |             return {"code": 102, "data": f"[Error] 图片处理失败: {str(e)}"}
219 | 
220 |     def _process_image_base64(self, image_base64):
221 |         """Send a base64-encoded image to the Mistral OCR API."""
222 |         if not HTTP_CLIENT_AVAILABLE:
223 |             return {"code": 102, "data": self.dependency_error}
224 | 
225 |         try:
226 |             # Build the API request
227 |             payload = {
228 |                 "model": self.model,
229 |                 "document": {
230 |                     "type": "image_url",
231 |                     "image_url": f"data:image/jpeg;base64,{image_base64}"
232 |                 },
233 |                 "include_image_base64": self.include_image_base64,
234 |                 "image_min_size": self.image_min_size
235 |             }
236 | 
237 |             # Send the request to the Mistral API
238 |             response = requests.post(
239 |                 self.api_url,
240 |                 headers=self.headers,
241 |                 json_data=payload,
242 |                 timeout=self.timeout
243 |             )
244 | 
245 |             # Check the response status
246 |             if response.status_code != 200:
247 |                 error_message = f"API请求失败: HTTP {response.status_code}"
248 |                 try:
249 |                     error_data = response.json()
250 |                     if "error" in error_data:
251 |                         error_message += f" - {error_data['error']['message']}"
252 |                 except Exception:
253 |                     pass
254 |                 return {"code": 102, "data": f"[Error] {error_message}"}
255 | 
256 |             # Parse the response data
257 |             ocr_result = response.json()
258 | 
259 |             # Convert to the format Umi-OCR expects
260 |             return self._convert_to_umi_format(ocr_result)
261 | 
262 |         except requests.exceptions.Timeout:
263 |             return {"code": 102, "data": "[Error] API请求超时，请检查网络连接或增加超时时间。"}
264 |         except requests.exceptions.ConnectionError:
265 |             return {"code": 102, "data": "[Error] 网络连接错误，请检查网络连接。"}
266 |         except Exception as e:
267 |             return {"code": 102, "data": f"[Error] OCR处理失败: {str(e)}"}
268 | 
269 |     def _convert_to_umi_format(self, mistral_result):
270 |         """Convert a Mistral OCR result to the format Umi-OCR expects."""
271 |         try:
272 |             # Check whether there is any page data
273 |             if not mistral_result.get("pages"):
274 |                 return {"code": 101, "data": ""}
275 | 
276 |             # Take the Markdown text of the first page
277 |             page = mistral_result["pages"][0]
278 |             markdown_text = page.get("markdown", "")
279 | 
280 |             if not markdown_text:
281 |                 return {"code": 101, "data": ""}
282 | 
283 |             # Extract the text content
284 |             text_blocks = []
285 | 
286 |             # Split the Markdown into lines and create one text block per line
287 |             lines = markdown_text.split("\n")
288 |             y_offset = 0
289 |             line_height = 40  # assume each line is 40 pixels high
290 | 
291 |             for line in lines:
292 |                 if line.strip():  # skip blank lines
293 |                     # Create the text block
294 |                     text_block = {
295 |                         "text": line,
296 |                         "box": [[0, y_offset], [800, y_offset], [800, y_offset + line_height], [0, y_offset + line_height]],
297 |                         "score": 1.0
298 |                     }
299 |                     text_blocks.append(text_block)
300 |                     y_offset += line_height
301 | 
302 |             # If no text blocks were extracted, return the no-text result
303 |             if not text_blocks:
304 |                 return {"code": 101, "data": ""}
305 | 
306 |             # Return the success result
307 |             return {
308 |                 "code": 100,
309 |                 "data": text_blocks
310 |             }
311 |             
312 |         except Exception as e:
313 |             return {"code": 102, "data": f"[Error] 结果转换失败: {str(e)}"}
314 | 
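As a minimal sketch of what `_convert_to_umi_format` does with the Markdown it receives, the standalone function below turns each non-empty line into one text block with a synthetic bounding box. The name `markdown_to_blocks` and its `width` and `line_height` parameters are illustrative assumptions, not part of the plugin; the boxes are placeholders stacked from top to bottom, not coordinates measured from the image.

```python
def markdown_to_blocks(markdown_text, width=800, line_height=40):
    """Turn non-empty Markdown lines into Umi-OCR style text blocks.

    Hypothetical helper for illustration; the boxes are synthetic.
    """
    blocks = []
    y = 0
    for line in markdown_text.split("\n"):
        if not line.strip():
            continue  # skip blank lines, as the plugin does
        blocks.append({
            "text": line,
            "box": [[0, y], [width, y], [width, y + line_height], [0, y + line_height]],
            "score": 1.0,
        })
        y += line_height
    return blocks


blocks = markdown_to_blocks("# Title\n\nfirst line\nsecond line")
for block in blocks:
    print(block["box"][0], block["text"])
```

Because the API response carries no per-line coordinates in this code path, every block gets the same width and a fixed height, so layout-dependent features in Umi-OCR only see the line order.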


--------------------------------------------------------------------------------
/MistralOCR/mistral_ocr_config.py:
--------------------------------------------------------------------------------
 1 | from plugin_i18n import Translator
 2 | 
 3 | # UI translation
 4 | tr = Translator(__file__, "i18n.csv")
 5 | 
 6 | # Global options
 7 | globalOptions = {
 8 |     "title": tr("Mistral OCR"),
 9 |     "type": "group",
10 |     "api_key": {
11 |         "title": tr("API密钥"),
12 |         "default": "",
13 |         "toolTip": tr("Mistral API的密钥，用于访问OCR服务。"),
14 |     },
15 |     "model": {
16 |         "title": tr("模型"),
17 |         "default": "mistral-ocr-latest",
18 |         "toolTip": tr("Mistral OCR使用的模型名称。"),
19 |     },
20 |     "timeout": {
21 |         "title": tr("超时时间"),
22 |         "isInt": True,
23 |         "default": 30,
24 |         "min": 5,
25 |         "max": 120,
26 |         "unit": tr("秒"),
27 |         "toolTip": tr("API请求的超时时间。"),
28 |     },
29 |     "include_image_base64": {
30 |         "title": tr("包含图像Base64"),
31 |         "default": False,
32 |         "toolTip": tr("是否在响应中包含图像的Base64编码。"),
33 |         "advanced": True,
34 |     },
35 | }
36 | 
37 | # Local options
38 | localOptions = {
39 |     "title": tr("文字识别（Mistral OCR）"),
40 |     "type": "group",
41 |     "language": {
42 |         "title": tr("语言"),
43 |         "optionsList": [
44 |             ["auto", tr("自动检测")],
45 |             ["zh", tr("中文")],
46 |             ["en", tr("英文")],
47 |             ["ja", tr("日语")],
48 |             ["ko", tr("韩语")],
49 |             ["fr", tr("法语")],
50 |             ["de", tr("德语")],
51 |             ["ru", tr("俄语")],
52 |             ["es", tr("西班牙语")],
53 |             ["pt", tr("葡萄牙语")],
54 |             ["it", tr("意大利语")],
55 |         ],
56 |         "default": "auto",
57 |         "toolTip": tr("识别的目标语言，自动检测可能会影响准确性。"),
58 |     },
59 |     "image_min_size": {
60 |         "title": tr("最小图像尺寸"),
61 |         "isInt": True,
62 |         "default": 0,
63 |         "min": 0,
64 |         "max": 1000,
65 |         "toolTip": tr("提取图像的最小尺寸（像素）。0表示不限制。"),
66 |         "advanced": True,
67 |     },
68 | }
69 | 
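The option dictionaries above mix group metadata (`title`, `type`) with per-option entries, and `Api.__init__` later reads the resulting values with calls such as `globalArgd.get("timeout", 30)`. How Umi-OCR builds that argument dictionary is not shown in this repository, so the following is only a rough sketch of how declared defaults could be collected. `collect_defaults` is a hypothetical helper, and the example dictionary uses plain strings instead of `tr(...)` so that it runs outside Umi-OCR.

```python
def collect_defaults(options):
    """Return {option_name: default} for entries that declare a default.

    Hypothetical helper; top-level strings such as "title" and "type"
    are not options and are skipped because they are not dicts.
    """
    return {
        name: spec["default"]
        for name, spec in options.items()
        if isinstance(spec, dict) and "default" in spec
    }


example_options = {
    "title": "Mistral OCR",
    "type": "group",
    "model": {"title": "Model", "default": "mistral-ocr-latest"},
    "timeout": {"title": "Timeout", "isInt": True, "default": 30, "min": 5, "max": 120},
}
defaults = collect_defaults(example_options)
print(defaults)
```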


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | <p align="center">
  2 |   <a href="https://github.com/hiroi-sora/Umi-OCR">
  3 |     <img width="200" height="128" src="https://tupian.li/images/2022/10/27/icon---256.png" alt="Umi-OCR">
  4 |   </a>
  5 | </p>
  6 | 
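  7 | <h1 align="center">Umi-OCR Plugin Repository</h1>
  8 | 
  9 | This repository hosts plugins for the open-source software [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR).
 10 | 
 11 | Umi-OCR (v2 and later) can import OCR engines and other components as plugins. Just place the plugin files in the directory the software designates.
 12 | 
 13 | - [How do I develop a plugin?](demo_AbaOCR)
 14 | 
 15 | ## How to install plugins
 16 | 
 17 | 1. **Download the plugin archive from [Releases](https://github.com/hiroi-sora/Umi-OCR_plugins/releases).** Do not download the repository's source code directly!
 18 | 2. **Download the plugin archive from [Releases](https://github.com/hiroi-sora/Umi-OCR_plugins/releases).** Do not download the repository's source code directly!!
 19 | 3. **Download the plugin archive from [Releases](https://github.com/hiroi-sora/Umi-OCR_plugins/releases).** Do not download the repository's source code directly!!!
 20 | 
 21 | (Important things are worth saying three times.)
 22 | 
 23 | 4. Extract the downloaded file and place it in: `UmiOCR-data/plugins`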
 24 | 
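After extracting, a quick sanity check is that each plugin folder under `UmiOCR-data/plugins` contains an `__init__.py`, which the plugin development guide in `demo_AbaOCR` describes as mandatory. The sketch below lists such folders. `find_plugins` is a hypothetical helper, not part of Umi-OCR, and the relative path assumes the script is run from the Umi-OCR installation directory.

```python
import os


def find_plugins(plugins_dir):
    """List sub-folders of plugins_dir that contain an __init__.py."""
    found = []
    for name in sorted(os.listdir(plugins_dir)):
        folder = os.path.join(plugins_dir, name)
        if os.path.isfile(os.path.join(folder, "__init__.py")):
            found.append(name)
    return found


if __name__ == "__main__":
    plugins_dir = os.path.join("UmiOCR-data", "plugins")  # path from step 4
    if os.path.isdir(plugins_dir):
        print(find_plugins(plugins_dir))
    else:
        print("plugins directory not found:", plugins_dir)
```

A folder that is missing from the output was most likely extracted one level too deep, or came from the repository source rather than a release archive.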
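 25 | ## OCR text recognition plugins
 26 | 
 27 | ### win7_x64_PaddleOCR-json / linux_x64_PaddleOCR-json
 28 | 
 29 | - Bundled with the Umi-OCR_Paddle edition
 30 | - Currently the only plugin that supports both Windows and Linux
 31 | 
 32 | > An open-source offline OCR component with excellent performance and accuracy. It supports acceleration through the mkldnn math library and can make full use of the CPU. Suited to high-spec computers.
 33 | 
 34 | | Source repository      | [PaddleOCR-json](https://github.com/hiroi-sora/PaddleOCR-json)                                    |
 35 | | ---------------------- | ------------------------------------------------------------------------------------------------- |
 36 | | Download               | [Releases](https://github.com/hiroi-sora/Umi-OCR_plugins/releases)                                |
 37 | | Computation            | Local, CPU                                                                                        |
 38 | | Platform compatibility | Windows 7 x64 / Linux x64                                                                         |
 39 | | Hardware compatibility | CPU must support the AVX instruction set (Atom, Itanium, Celeron, and Pentium are not supported)  |
 40 | | Bundled languages      | `Simplified Chinese, Traditional Chinese, English, Japanese, Korean, Russian`                     |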
 41 | 
 42 | ---
 43 | 
 44 | ### win7_x64_RapidOCR-json
 45 | 
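 46 | - Bundled with the Umi-OCR_Rapid edition
 47 | 
 48 | > Roughly a "lightweight edition" of PaddleOCR. It has good CPU compatibility and low memory usage, but is somewhat slower. Suited to older, low-spec computers.
 49 | 
 50 | | Source repository      | [RapidOCR-json](https://github.com/hiroi-sora/RapidOCR-json)                  |
 51 | | ---------------------- | ----------------------------------------------------------------------------- |
 52 | | Download               | [Releases](https://github.com/hiroi-sora/Umi-OCR_plugins/releases)            |
 53 | | Computation            | Local, CPU                                                                    |
 54 | | Platform compatibility | Windows 7 or later, 64-bit                                                    |
 55 | | Hardware compatibility | No special requirements                                                       |
 56 | | Bundled languages      | `Simplified Chinese, Traditional Chinese, English, Japanese, Korean, Russian` |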
 57 | 
 58 | ---
 59 | 
 60 | ### win7_x64_Pix2Text
 61 | 
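 62 | > Supports Chinese and English text, math formulas, and mixed layouts. The plugin is large and slow to load, but recognition is fast.
 63 | 
 64 | | Source repository      | [Pix2Text](https://github.com/breezedeus/Pix2Text)                 |
 65 | | ---------------------- | ------------------------------------------------------------------ |
 66 | | Download               | [Releases](https://github.com/hiroi-sora/Umi-OCR_plugins/releases) |
 67 | | Computation            | Local, CPU                                                         |
 68 | | Platform compatibility | Windows 7 or later, 64-bit                                         |
 69 | | Hardware compatibility | No special requirements                                            |
 70 | | Bundled languages      | `Chinese / English / math formulas`                                |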
 71 | 
 72 | ---
 73 | 
 74 | ### TesseractOCR_umi_plugin
 75 | 
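 76 | > A long-established open-source model with support for many languages. It is fairly fast, with excellent accuracy for English and somewhat lower accuracy for Chinese. Recognition data for multiple less common languages can be imported.  
 77 | > It ships with its own layout recognition model, which can tidy up complex document layouts and is more accurate than Umi's built-in layout parser. If you use this plugin, set "Layout parsing scheme" (排版解析方案) to "No processing" (不做处理) in Umi's tab page settings.  
 78 | 
 79 | | Source repository      | [TesseractOCR](https://github.com/tesseract-ocr/tesseract)                                                                         |
 80 | | ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
 81 | | Download               | [Releases](https://github.com/qwedc001/tesseractOCR_umi_plugin/releases)                                                           |
 82 | | Computation            | Local, CPU                                                                                                                         |
 83 | | Platform compatibility | Windows 7 or later, 64-bit                                                                                                         |
 84 | | Hardware compatibility | No special requirements                                                                                                            |
 85 | | Bundled languages      | `Simplified Chinese, Traditional Chinese, English, Japanese, math formulas` (models for other languages can be downloaded separately) |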
 86 | 
 87 | ---
 88 | 
 89 | ### chineseocr_umi_plugin
 90 | 
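 91 | > Supports Chinese and English recognition. A lightweight ChineseOCR model; integration with Umi is still in progress.
 92 | 
 93 | | Source repository      | [ChineseOCR](https://github.com/DayBreak-u/chineseocr_lite/)           |
 94 | | ---------------------- | ---------------------------------------------------------------------- |
 95 | | Download               | [Releases](https://github.com/qwedc001/chineseocr_umi_plugin/releases) |
 96 | | Computation            | Local, CPU                                                             |
 97 | | Platform compatibility | Windows 7 or later, 64-bit                                             |
 98 | | Hardware compatibility | No special requirements                                                |
 99 | | Bundled languages      | Chinese and English                                                    |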
100 | 
101 | ---
102 | 
103 | ### WechatOCR_umi_plugin
104 | 
105 | > Calls WeChat OCR offline to recognize text.
106 | 
107 | | Source repository      | [WechatOCR_umi_plugin](https://github.com/eaeful/WechatOCR_umi_plugin) |
108 | | ---------------------- | ---------------------------------------------------------------------- |
109 | | Download               | [Releases](https://github.com/eaeful/WechatOCR_umi_plugin/releases)    |
110 | | Computation            | Local, CPU                                                             |
111 | | Platform compatibility | Windows 7 or later, 64-bit                                             |
112 | | Hardware compatibility | No special requirements                                                |
113 | | Bundled languages      | Chinese, English, and Japanese                                         |
114 | 
115 | ---
116 | ### mistral.ai_umi_plugin
117 | 
118 | > Text recognition based on the Mistral AI OCR API.
119 | 
120 | | Source repository      | [mistral.ai_umi_plugin](https://github.com/chunzhimoe/mistral.ai_umi_plugin/) |
121 | | ---------------------- | ----------------------------------------------------------------------------- |
122 | | Download               | [Releases](https://github.com/chunzhimoe/mistral.ai_umi_plugin/releases)      |
123 | | Computation            | Cloud, API calls                                                              |
124 | | Platform compatibility | Cross-platform                                                                |
125 | | Hardware compatibility | No special requirements                                                       |
126 | | Bundled languages      | Multilingual recognition                                                      |
127 | 
128 | ## Plugin development
129 | 
130 | See the [plugin development guide and demo](demo_AbaOCR).
131 | 
132 | # Umi-OCR project structure
133 | 
134 | ### Repositories:
135 | 
136 | - [Main repository](https://github.com/hiroi-sora/Umi-OCR)
137 | - [Plugin repository](https://github.com/hiroi-sora/Umi-OCR_plugins) 👈
138 | - [Windows runtime](https://github.com/hiroi-sora/Umi-OCR_runtime_windows)
139 | - [Linux runtime](https://github.com/hiroi-sora/Umi-OCR_runtime_linux)
140 | 
141 | ### Project layout:
142 | 
143 | The `**` suffix marks content contained in this repository (the plugin repository).
144 | 
145 | ```
146 | Umi-OCR
147 | └─ UmiOCR-data
148 |    ├─ main.py
149 |    ├─ version.py
150 |    ├─ qt_res
151 |    │  └─ Qt resources, including icons and QML source
152 |    ├─ py_src
153 |    │  └─ Python source
154 |    ├─ plugins **
155 |    │  └─ Plugins
156 |    └─ i18n
157 |       └─ Translation files
158 | ```
159 | 


--------------------------------------------------------------------------------
/demo_AbaOCR/README.md:
--------------------------------------------------------------------------------
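  1 | # OCR Plugin Development Guide
  2 | 
  3 | Hello, developer, and welcome to Umi-OCR plugin development. This document describes how to develop an **OCR plugin**.
  4 | 
  5 | The files in the `demo_AbaOCR` directory form a minimal demo that you can build on.
  6 | 
  7 | ## Getting started
  8 | 
  9 | The walkthrough below uses an example to show how an OCR plugin is developed.
 10 | 
 11 | Suppose we want to bring an existing OCR component, **Aba Aba OCR** (阿巴阿巴OCR), into Umi. As the name suggests, **Aba Aba OCR** returns `阿巴阿巴阿巴` no matter what image it is given.
 12 | 
 13 | ## 1. Identify the configurable options
 14 | 
 15 | Every OCR plugin has some options that users can customize. Broadly, they fall into two categories: **global options** and **local options**.
 16 | 
 17 | ##### Global options:
 18 | 
 19 | Options that should stay the same in every context and on every tab page. Examples: the key for an online API, the timeout, the thread count of a local engine, or whether hardware acceleration is enabled.
 20 | 
 21 | ##### Local options:
 22 | 
 23 | Options that may differ from one tab page to another. Example: the recognition language.
 24 | 
 25 | #### Aba's options:
 26 | 
 27 | Suppose **Aba OCR** has the following options:
 28 | 
 29 | | Option   | Type   | Scope  |
 30 | | -------- | ------ | ------ |
 31 | | api_key  | String | Global |
 32 | | language | Enum   | Local  |
 33 | 
 34 | Then create an [aba_ocr_config.py](aba_ocr_config.py):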
 35 | 
 36 | ```python
 37 | from plugin_i18n import Translator
 38 | 
 39 | # UI translation
 40 | tr = Translator(__file__, "i18n.csv")
 41 | 
 42 | # Global options
 43 | globalOptions = {
 44 |     "title": tr("阿巴阿巴OCR"),
 45 |     "type": "group",
 46 |     "api_key": {
 47 |         "title": tr("Api密钥"),
 48 |         "default": "",
 49 |         "toolTip": tr("阿巴阿巴OCR的Api密钥。"),
 50 |     },
 51 | }
 52 | 
 53 | # Local options
 54 | localOptions = {
 55 |     "title": tr("文字识别（阿巴阿巴OCR）"),
 56 |     "type": "group",
 57 |     "language": {
 58 |         "title": tr("语言"),
 59 |         "optionsList": [
 60 |             ["zh_CN", "简体中文"],
 61 |             ["zh_TW", "繁體中文"],
 62 |             ["en_US", "English"],
 63 |             ["ja_JP", "日本語"],
 64 |         ],
 65 |     },
 66 | }
 67 | ```
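 68 | #### Notes:
 69 | 
 70 | ##### UI translation mechanism:
 71 | 
 72 | Umi-OCR embeds a simple translation mechanism for plugin UIs. It is neither a standard-library nor a third-party module, and it only works inside Umi.
 73 | 
 74 | At the top of the file, initialize the translator with this fixed boilerplate:
 75 | 
 76 | ```python
 77 | from plugin_i18n import Translator
 78 | tr = Translator(__file__, "i18n.csv")
 79 | ```
 80 | 
 81 | To translate a string:
 82 | 
 83 | ```python
 84 | str1 = tr("字符串1")
 85 | str2 = tr("字符串2")
 86 | ```
 87 | 
 88 | Writing the translation table:
 89 | 
 90 | `i18n.csv` lists every string together with its translations. The following example provides English, Traditional Chinese, Japanese, and Russian:
 91 | 
 92 | ```
 93 | key,en_US,zh_TW,ja_JP,ru_RU
 94 | 字符串1,String 1,字串1,文字列1,Строка 1
 95 | 字符串2,String 2,字串2,文字列2,Строка 2
 96 | ```
 97 | 
 98 | If that is too much work, you can translate into English only. When a non-Chinese language (such as Japanese, Korean, Russian, German, or French) has no translation, the English one is used by default.
 99 | 
100 | You can edit the CSV table in Excel, but the file must end up encoded as `utf-8`. (Excel may not save as utf-8 by default, so be sure to check.)
101 | 
102 | Example `i18n.csv` for Aba OCR: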
103 | 
104 | ```
105 | key,en_US
106 | 阿巴阿巴OCR,Aba OCR
107 | Api密钥,Api Key
108 | 阿巴阿巴OCR的Api密钥。,Api key for Aba OCR.
109 | 文字识别（阿巴阿巴OCR）,Text recognition (Aba OCR)
110 | 语言,language
111 | ```
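To make the fallback rule concrete, here is a minimal sketch of a CSV-backed lookup. It is not Umi-OCR's `plugin_i18n.Translator`, whose implementation is not shown here: `SimpleTranslator` is a hypothetical class, it takes the CSV text and a language code directly, and it assumes that keys are written in Simplified Chinese and that an unknown key is returned unchanged.

```python
import csv
import io


class SimpleTranslator:
    """Hypothetical stand-in for illustration; not Umi-OCR's Translator."""

    def __init__(self, csv_text, lang):
        self.lang = lang
        # Map each key to its row, e.g. {"字符串1": {"en_US": "String 1", ...}}
        self.table = {row["key"]: row for row in csv.DictReader(io.StringIO(csv_text))}

    def __call__(self, key):
        if self.lang == "zh_CN":
            return key  # keys are assumed to be written in Simplified Chinese
        row = self.table.get(key, {})
        # Prefer the requested language, then English, then the key itself
        return row.get(self.lang) or row.get("en_US") or key


CSV_TEXT = "key,en_US,ja_JP\n字符串1,String 1,文字列1\n字符串2,String 2,\n"

tr_ja = SimpleTranslator(CSV_TEXT, "ja_JP")
print(tr_ja("字符串1"))  # Japanese column is filled in
print(tr_ja("字符串2"))  # Japanese cell is empty, falls back to English
```

Using the source string itself as the key keeps the plugin code readable in its original language and makes a missing table entry degrade gracefully instead of raising an error.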
112 | 
113 | ##### Plugin option dictionaries:
114 | 
115 | A plugin must define two dictionaries: the global options `globalOptions` and the local options `localOptions`.
116 | 
117 | The outer layer of each option dictionary follows this fixed form:
118 | 
119 | ```python
120 | options = {
121 |     "title": tr("配置组名称"),
122 |     "type": "group",
123 |     # TODO: option entries
124 | }
125 | ```
126 | 
127 | The inner option entries are written as follows:
128 | 
129 | ```python
130 |     "boolean (switch)": {
131 |         "title": "Title",
132 |         "toolTip": "Tooltip shown on mouse hover",
133 |         "default": True / False,
134 |     },
135 |     "text (text box)": {
136 |         "title": ,
137 |         "default": "text",
138 |     },
139 |     "number (input box)": {
140 |         "title": ,
141 |         "isInt": True for integer / False for float,
142 |         "default": 233,
143 |         "max": optional, upper bound,
144 |         "min": optional, lower bound,
145 |         "unit": optional, unit. e.g. tr("秒"),
146 |     },
147 |     "enum (drop-down list)": {
148 |         "title": ,
149 |         "optionsList": [
150 |             ["key1", "name1"],
151 |             ["key2", "name2"],
152 |         ],
153 |     },
154 |     "file path (file picker)": {
155 |         "title": ,
156 |         "type": "file",
157 |         "default": "default path",
158 |         "selectExisting": True to select an existing file / False to create a new file or folder,
159 |         "selectFolder": True to select a folder / False to select a file,
160 |         "dialogTitle": dialog title,
161 |         "nameFilters": ["Images (*.jpg *.jpeg)", "Type 2..."] may be omitted when selecting folders
162 |     },
163 | 
164 |     # Parameters that every option entry may optionally include:
165 |     "toolTip": optional, string, tooltip shown on mouse hover,
166 |     "advanced": optional, set to True for an advanced option that is normally hidden
167 | ```
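The template above is schematic rather than valid Python. As a hedged illustration, the following dictionary fills in one entry of each kind with concrete values. The option names (`use_gpu`, `timeout`, `model_path`) and their values are invented for this example, and plain strings stand in for `tr(...)` so that the snippet runs outside Umi-OCR.

```python
# In a real plugin, wrap the user-visible strings in tr(...); plain
# strings are used here so the snippet runs outside Umi-OCR.
globalOptions = {
    "title": "Example OCR",
    "type": "group",
    "use_gpu": {  # boolean -> switch
        "title": "Use GPU",
        "default": False,
        "toolTip": "Enable hardware acceleration.",
    },
    "api_key": {  # text -> text box
        "title": "Api Key",
        "default": "",
    },
    "timeout": {  # number -> input box
        "title": "Timeout",
        "isInt": True,
        "default": 30,
        "min": 5,
        "max": 120,
        "unit": "s",
        "advanced": True,  # advanced option, normally hidden
    },
    "language": {  # enum -> drop-down list
        "title": "Language",
        "optionsList": [
            ["zh_CN", "简体中文"],
            ["en_US", "English"],
        ],
    },
    "model_path": {  # file -> file picker
        "title": "Model file",
        "type": "file",
        "default": "",
        "selectExisting": True,
        "selectFolder": False,
        "dialogTitle": "Choose a model file",
        "nameFilters": ["Models (*.onnx)"],
    },
}
```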
168 | 
169 | ##### Empty option group
170 | 
171 | If a plugin really has no global or local options, it can provide an empty option group dictionary:
172 | ```python
173 | # Empty local options
174 | localOptions = {
175 |     "title": tr("文字识别（阿巴阿巴OCR）"),
176 |     "type": "group",
177 | }
178 | ```
179 | 
180 | 
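181 | ## 2. Build the OCR API class
182 | 
183 | Every OCR plugin must provide an API class with the following methods:
184 | 
185 | | Method      | Description                                                            | Input               | Return value                                   |
186 | | ----------- | ---------------------------------------------------------------------- | ------------------- | ---------------------------------------------- |
187 | | `__init__`  | Initialize the API class. Do not perform long-running operations here. | Global options dict | /                                              |
188 | | `start`     | Start the engine or API. Long-running operations are allowed here.     | Local options dict  | `""` on success, `"[Error] XXX..."` on failure |
189 | | `stop`      | Stop the engine or API.                                                | /                   | /                                              |
190 | | `runPath`   | Run OCR on a file path                                                 | Image path string   | OCR result                                     |
191 | | `runBytes`  | Run OCR on a byte stream                                               | Image bytes         | OCR result                                     |
192 | | `runBase64` | Run OCR on base64 input                                                | Image base64 string | OCR result                                     |
193 | 
194 | Format of the OCR result:
195 | 
196 | Success, with text:
197 | ```python
198 | {
199 |     "code": 100,
200 |     "data": [
201 |         { # first text group
202 |             "text": "recognized text",
203 |             "box": [[0, 0], [200, 0], [200, 40], [0, 40]], # bounding box of the text
204 |             "score": 1, # confidence, 0 to 1. Use 1 if unavailable.
205 |         },
206 |         {}, # second text group, and so on
207 |     ],
208 | }
209 | ```
210 | 
211 | Success, but the image contains no text:
212 | ```python
213 | {
214 |     "code": 101,
215 |     "data": "",
216 | }
217 | ```
218 | 
219 | Failure:
220 | ```python
221 | {
222 |     "code": 102, # custom error code: any value greater than 101
223 |     "data": "[Error] reason for the error...",
224 | }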
225 | ```
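Since every `run*` method has to return one of these three shapes, it can help to build them through small helpers. The functions below are a sketch under that assumption. Their names (`ocr_success`, `ocr_no_text`, `ocr_error`, `text_block`) are hypothetical and not part of any Umi-OCR API.

```python
def ocr_success(blocks):
    """Result for recognized text; an empty list becomes 'no text'."""
    if not blocks:
        return ocr_no_text()
    return {"code": 100, "data": blocks}


def ocr_no_text():
    """Result for a successful run that found no text."""
    return {"code": 101, "data": ""}


def ocr_error(message, code=102):
    """Result for a failure; custom codes must be greater than 101."""
    if code <= 101:
        raise ValueError("error codes must be greater than 101")
    if not message.startswith("[Error]"):
        message = f"[Error] {message}"
    return {"code": code, "data": message}


def text_block(text, box, score=1):
    """One recognized text group with its bounding box and confidence."""
    return {"text": text, "box": box, "score": score}


result = ocr_success([text_block("hello", [[0, 0], [200, 0], [200, 40], [0, 40]])])
print(result["code"])
print(ocr_error("engine not started"))
```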
226 | 
227 | #### Aba's API class:
228 | 
229 | Create an [aba_ocr.py](aba_ocr.py):
230 | 
231 | ```python
232 | class Api:  # API class
233 |     def __init__(self, globalArgd):
234 |         self.lang = ""  # current language
235 |         api_key = globalArgd["api_key"]
236 |         print("阿巴阿巴OCR获取 api_key： ", api_key)
237 | 
238 |     # Start the engine. Returns "" on success, "[Error] xxx" on failure
239 |     def start(self, argd):
240 |         self.lang = argd["language"]
241 |         print("阿巴阿巴OCR当前语言： ", self.lang)
242 |         return ""
243 | 
244 |     def stop(self):  # stop the engine
245 |         pass
246 | 
247 |     def runPath(self, imgPath: str):  # OCR from a file path
248 |         res = self._ocr()
249 |         return res
250 | 
251 |     def runBytes(self, imageBytes):  # OCR from a byte stream
252 |         res = self._ocr()
253 |         return res
254 | 
255 |     def runBase64(self, imageBase64):  # OCR from a base64 string
256 |         res = self._ocr()
257 |         return res
258 | 
259 |     def _ocr(self):
260 |         flag = True
261 |         text = ""
262 |         if self.lang == "zh_CN":
263 |             text = "阿巴阿巴阿巴"
264 |         elif self.lang == "zh_TW":
265 |             text = "阿巴阿巴阿巴"
266 |         elif self.lang == "en_US":
267 |             text = "Aba Aba Aba"
268 |         elif self.lang == "ja_JP":
269 |             text = "あばあばあば"
270 |         else:
271 |             flag = False
272 |             text = f"[Error] 未知的语言：{self.lang}"
273 |         if flag:
274 |             res = {
275 |                 "code": 100,
276 |                 "data": [
277 |                     {
278 |                         "text": text,
279 |                         "box": [[0, 0], [200, 0], [200, 40], [0, 40]],
280 |                         "score": 1,
281 |                     }
282 |                 ],
283 |             }
284 |         else:
285 |             res = {"code": 102, "data": text}
286 |         return res
287 | ```
288 | 
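To see how the interface methods fit together, the sketch below drives an interface class through one `start` / `runPath` / `stop` cycle. `EchoApi` and `run_once` are hypothetical names used only for illustration, and the call order is an assumption based on the method table above; Umi-OCR's real scheduling may differ.

```python
class EchoApi:
    """Hypothetical stand-in that follows the same interface as the Api class above."""

    def __init__(self, globalArgd):
        self.lang = ""

    def start(self, argd):
        lang = argd.get("language", "")
        if not lang:
            return "[Error] language is not set"
        self.lang = lang
        return ""  # an empty string signals success

    def stop(self):
        self.lang = ""

    def runPath(self, imgPath):
        block = {
            "text": f"{self.lang}:{imgPath}",
            "box": [[0, 0], [200, 0], [200, 40], [0, 40]],
            "score": 1,
        }
        return {"code": 100, "data": [block]}


def run_once(api_class, global_args, local_args, img_path):
    """Create the interface, start it, recognize one image, then stop it."""
    api = api_class(global_args)
    err = api.start(local_args)
    if err:  # a non-empty string means start() failed
        return {"code": 102, "data": err}
    try:
        return api.runPath(img_path)
    finally:
        api.stop()  # always release the engine, even if runPath raises
```

A driver like this is also a convenient way to smoke-test a new plugin outside Umi-OCR before copying it into the plugins folder.
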
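289 | ## 3. Build the plugin structure
290 | 
291 | Each plugin is a folder.
292 | 
293 | The folder name uniquely identifies a plugin. It must consist of ASCII characters and **must not share a name with any existing Python module**.
294 | 
295 | The folder must contain an [`__init__.py`](__init__.py). Umi reads and loads `__init__.py` to import the plugin dynamically.
296 | 
297 | `__init__.py` must define a dict named `PluginInfo`, as follows: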
298 | ```python
299 | PluginInfo = {
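300 |     "group": "ocr",  # fixed value that defines the plugin group
301 |     "global_options": ...,  # fill in: global config dict
302 |     "local_options": ...,  # fill in: local config dict
303 |     "api_class": ...,  # fill in: interface class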
304 | }
305 | ```
306 | 
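The loading step can be pictured with the standard `importlib` module. The sketch below is an assumption about how a host application might import a plugin folder and check its `PluginInfo`; it is not Umi-OCR's actual loader. `load_plugin`, `REQUIRED_KEYS`, and the throwaway folder name `demo_tmp_plugin_xyz` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keys listed in the PluginInfo template above.
REQUIRED_KEYS = ("group", "global_options", "local_options", "api_class")


def load_plugin(plugins_dir: str, name: str) -> dict:
    """Import the package `name` from `plugins_dir` and return its PluginInfo."""
    if plugins_dir not in sys.path:
        sys.path.insert(0, plugins_dir)
    importlib.invalidate_caches()  # make sure newly created folders are seen
    module = importlib.import_module(name)
    info = getattr(module, "PluginInfo", None)
    if not isinstance(info, dict):
        raise ValueError(f"{name}: PluginInfo dict is missing")
    missing = [key for key in REQUIRED_KEYS if key not in info]
    if missing:
        raise ValueError(f"{name}: PluginInfo lacks keys {missing}")
    return info


# Demonstration with a throwaway plugin folder.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_tmp_plugin_xyz"
    pkg.mkdir()
    (pkg / "__init__.py").write_text(
        "class Api:\n    pass\n"
        "PluginInfo = {'group': 'ocr', 'global_options': {}, "
        "'local_options': {}, 'api_class': Api}\n",
        encoding="utf-8",
    )
    info = load_plugin(tmp, "demo_tmp_plugin_xyz")
    print(info["group"])
```

Because an import like this goes through Python's normal module system, a folder named after an existing module would resolve to that module instead of the plugin, which is presumably the reason for the naming rule above.
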
307 | #### Aba's structure:
308 | 
309 | The Aba OCR plugin folder is `demo_AbaOCR`, which contains these files:
310 | 
311 | [`__init__.py`](__init__.py)  
312 | [`aba_ocr.py`](aba_ocr.py)  
313 | [`aba_ocr_config.py`](aba_ocr_config.py)  
314 | [`i18n.csv`](i18n.csv)  
315 | 
316 | The content of `__init__.py` is:
317 | 
318 | ```python
319 | from . import aba_ocr
320 | from . import aba_ocr_config
321 | 
322 | # Plugin info
323 | PluginInfo = {
324 |     # Plugin group
325 |     "group": "ocr",
326 |     # Global options
327 |     "global_options": aba_ocr_config.globalOptions,
328 |     # Local options
329 |     "local_options": aba_ocr_config.localOptions,
330 |     # Interface class
331 |     "api_class": aba_ocr.Api,
332 | }
333 | ```
334 | 
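335 | ## 4. Install the plugin
336 | 
337 | With the steps above, we have created a runnable **Aba OCR** plugin: `demo_AbaOCR`.
338 | 
339 | Next, place it in the Umi-OCR directory:
340 | 
341 | `UmiOCR-data\plugins`
342 | 
343 | Start Umi-OCR, and you can then switch to **Aba OCR** in the global settings.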
344 | 
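Copying the folder by hand is enough, but the step can also be scripted. The sketch below is a hypothetical helper, not something shipped with Umi-OCR: `install_plugin` and the `umi_root` argument are illustrative names, and only the `UmiOCR-data/plugins` location comes from the text above. It relies on `shutil.copytree(..., dirs_exist_ok=True)`, which needs Python 3.8 or later.

```python
import shutil
from pathlib import Path


def install_plugin(plugin_src: str, umi_root: str) -> Path:
    """Copy a plugin folder into <umi_root>/UmiOCR-data/plugins and return the new path."""
    src = Path(plugin_src)
    if not (src / "__init__.py").is_file():
        raise FileNotFoundError(f"{src} has no __init__.py")
    dest = Path(umi_root) / "UmiOCR-data" / "plugins" / src.name
    # Creates missing parent folders and overwrites an older copy of the plugin.
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest
```
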


--------------------------------------------------------------------------------
/demo_AbaOCR/__init__.py:
--------------------------------------------------------------------------------
 1 | from . import aba_ocr
 2 | from . import aba_ocr_config
 3 | 
 4 | # Plugin info
 5 | PluginInfo = {
 6 |     # Plugin group
 7 |     "group": "ocr",
 8 |     # Global options
 9 |     "global_options": aba_ocr_config.globalOptions,
10 |     # Local options
11 |     "local_options": aba_ocr_config.localOptions,
12 |     # Interface class
13 |     "api_class": aba_ocr.Api,
14 | }
15 | 


--------------------------------------------------------------------------------
/demo_AbaOCR/aba_ocr.py:
--------------------------------------------------------------------------------
 1 | # demo: Aba Aba OCR
 2 | 
 3 | 
 4 | class Api:  # plugin interface
 5 |     def __init__(self, globalArgd):
 6 |         self.lang = ""  # current language
 7 |         api_key = globalArgd["api_key"]
 8 |         print("阿巴阿巴OCR获取 api_key： ", api_key)
 9 | 
10 |     # Start the engine. Returns "" on success, "[Error] xxx" on failure
11 |     def start(self, argd):
12 |         self.lang = argd["language"]
13 |         print("阿巴阿巴OCR当前语言： ", self.lang)
14 |         return ""
15 | 
16 |     def stop(self):  # stop the engine
17 |         pass
18 | 
19 |     def runPath(self, imgPath: str):  # OCR from an image path
20 |         res = self._ocr()
21 |         return res
22 | 
23 |     def runBytes(self, imageBytes):  # OCR from a byte stream
24 |         res = self._ocr()
25 |         return res
26 | 
27 |     def runBase64(self, imageBase64):  # OCR from a base64 string
28 |         res = self._ocr()
29 |         return res
30 | 
31 |     def _ocr(self):
32 |         flag = True
33 |         text = ""
34 |         if self.lang == "zh_CN":
35 |             text = "阿巴阿巴阿巴"
36 |         elif self.lang == "zh_TW":
37 |             text = "阿巴阿巴阿巴"
38 |         elif self.lang == "en_US":
39 |             text = "Aba Aba Aba"
40 |         elif self.lang == "ja_JP":
41 |             text = "あばあばあば"
42 |         else:
43 |             flag = False
44 |             text = f"[Error] 未知的语言：{self.lang}"
45 |         if flag:
46 |             res = {
47 |                 "code": 100,
48 |                 "data": [
49 |                     {
50 |                         "text": text,
51 |                         "box": [[0, 0], [200, 0], [200, 40], [0, 40]],
52 |                         "score": 1,
53 |                     }
54 |                 ],
55 |             }
56 |         else:
57 |             res = {"code": 102, "data": text}
58 |         return res
59 | 


--------------------------------------------------------------------------------
/demo_AbaOCR/aba_ocr_config.py:
--------------------------------------------------------------------------------
 1 | from plugin_i18n import Translator
 2 | 
 3 | tr = Translator(__file__, "i18n.csv")
 4 | 
 5 | globalOptions = {
 6 |     "title": tr("阿巴阿巴OCR"),
 7 |     "type": "group",
 8 |     "api_key": {
 9 |         "title": tr("Api密钥"),
10 |         "default": "",
11 |         "toolTip": tr("阿巴阿巴OCR的Api密钥。"),
12 |     },
13 | }
14 | 
15 | localOptions = {
16 |     "title": tr("文字识别（阿巴阿巴OCR）"),
17 |     "type": "group",
18 |     "language": {
19 |         "title": tr("语言"),
20 |         "optionsList": [
21 |             ["zh_CN", "简体中文"],
22 |             ["zh_TW", "繁體中文"],
23 |             ["en_US", "English"],
24 |             ["ja_JP", "日本語"],
25 |         ],
26 |     },
27 | }
28 | 


--------------------------------------------------------------------------------
/demo_AbaOCR/i18n.csv:
--------------------------------------------------------------------------------
1 | key,en_US
2 | 阿巴阿巴OCR,Aba OCR
3 | Api密钥,Api Key
4 | 阿巴阿巴OCR的Api密钥。,Api key for Aba OCR.
5 | 文字识别（阿巴阿巴OCR）,Text recognition (Aba OCR)
6 | 语言,Language
7 | 


--------------------------------------------------------------------------------
/win7_x64_Pix2Text/__init__.py:
--------------------------------------------------------------------------------
 1 | from . import p2t_api
 2 | from . import p2t_config
 3 | 
 4 | PluginInfo = {
 5 |     "group": "ocr",  # fixed value that defines the plugin group
 6 |     "global_options": p2t_config.globalOptions,  # global config dict
 7 |     "local_options": p2t_config.localOptions,  # local config dict
 8 |     "api_class": p2t_api.Api,  # interface class
 9 | }
10 | 


--------------------------------------------------------------------------------
/win7_x64_Pix2Text/i18n.csv:
--------------------------------------------------------------------------------
 1 | key,en_US
 2 | 支持中文/英文/数学公式/混排,Chinese + English + Formulas
 3 | （本地）,(local)
 4 | 文字识别,Text Recognition
 5 | 启用文字识别,Enable Text
 6 | 支持简体中文+英文,Support Chinese + English
 7 | 启用数学公式,Enable Math Formula
 8 | 限制图像边长,Limit image edge length
 9 | （默认）,(Default)
10 | 将边长大于该值的图片进行压缩，可以提高识别速度。可能降低识别精度。,Compressing images with side lengths greater than this value can improve recognition speed. May reduce recognition accuracy.


--------------------------------------------------------------------------------
/win7_x64_Pix2Text/p2t_api.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import sys
  3 | import site
  4 | import time
  5 | import base64
  6 | import shutil
  7 | from PIL import Image
  8 | from io import BytesIO
  9 | from typing import Union
 10 | 
 11 | # Directory of this file
 12 | CurrentDir = os.path.dirname(os.path.abspath(__file__))
 13 | # Directory of bundled dependencies
 14 | SitePackages = os.path.join(CurrentDir, "site-packages")
 15 | # Directory of model files
 16 | Models = os.path.join(CurrentDir, "models")
 17 | 
 18 | 
 19 | class Api:  # plugin interface
 20 |     def __init__(self, globalArgd):
 21 |         self.p2t = None
 22 |         self.argd = {
 23 |             "recognize_text": True,  # enable text recognition
 24 |             "recognize_formula": True,  # enable formula recognition
 25 |             "resized_shape": 608,  # image side-length limit
 26 |         }
 27 | 
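 28 |     # Start the engine. Returns "" on success, "[Error] xxx" on failure
 29 |     def start(self, argd):
 30 |         self.argd = argd  # keep the local options
 31 |         self.argd["resized_shape"] = int(argd["resized_shape"])  # make sure the type is int
 32 |         if self.p2t:  # engine already started
 33 |             return ""
 34 |         # Add the dependency directory to the module search path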
 35 |         site.addsitedir(SitePackages)
 36 |         try:
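 37 |             # Provide stand-in output streams so third-party libraries do not raise errors
 38 |             class _std:
 39 |                 def write(self, e):
 40 |                     print(e)
 41 | 
 42 |             sys.__stdout__ = _std()  # an instance, so write() receives the text argument
 43 |             sys.__stderr__ = _std()
 44 |             # Import dependencies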
 45 |             t1 = time.time()
 46 |             from pix2text import Pix2Text
 47 |             import numpy as np
 48 | 
 49 |             t2 = time.time()
 50 |             print("import 耗时：", t2 - t1)
 51 | 
 52 |             self.np = np
 53 |             # Instantiate P2T
 54 |             self.p2t = Pix2Text(
 55 |                 # Classification (analyzer) model
 56 |                 analyzer_config=dict(
 57 |                     model_name="mfd",
 58 |                     model_type="yolov7_tiny",
 59 |                     model_fp=os.path.join(Models, "mfd-yolov7_tiny.pt"),
 60 |                 ),
 61 |                 # Formula model
 62 |                 formula_config=dict(
 63 |                     model_name="mfr",
 64 |                     model_backend="onnx",
 65 |                     model_dir=os.path.join(Models, "mfr-onnx"),
 66 |                 ),
 67 |                 # Text model
 68 |                 text_config=dict(
 69 |                     # Detection
 70 |                     det_model_name="ch_PP-OCRv3_det",
 71 |                     det_model_backend="onnx",
 72 |                     det_model_fp=os.path.join(Models, "ch_PP-OCRv3_det_infer.onnx"),
 73 |                     # Recognition
 74 |                     rec_model_name="densenet_lite_136-gru",
 75 |                     rec_model_backend="onnx",
 76 |                     rec_model_fp=os.path.join(
 77 |                         Models,
 78 |                         "cnocr-v2.3-densenet_lite_136-gru-epoch=004-ft-model.onnx",
 79 |                     ),
 80 |                 ),
 81 |             )
 82 |             return ""
 83 |         except Exception as e:
 84 |             self.p2t = None
 85 |             err = str(e)
 86 |             if "DLL load failed while importing onnxruntime_pybind11_state" in str(e):
 87 |                 err += "\n请下载 Please download VC++2022 :\nhttps://aka.ms/vs/17/release/vc_redist.x64.exe"
 88 |             return f"[Error] {err}"
 89 | 
 90 |     def stop(self):
 91 |         self.p2t = None
 92 | 
 93 |     # Convert p2t text results to the Umi-OCR format
 94 |     def _text_standardized(self, res):
 95 |         datas = []
 96 |         for tb in res:
 97 |             datas.append(
 98 |                 {
 99 |                     "text": tb["text"],
100 |                     "box": tb["position"].tolist(),  # NumPy array to list
101 |                     "score": 1,  # no confidence information available
102 |                 }
103 |             )
104 |         if datas:
105 |             return {"code": 100, "data": datas}
106 |         else:
107 |             return {"code": 101, "data": ""}
108 | 
109 |     # Convert p2t mixed (text + formula) results to the Umi-OCR format
110 |     def _tf_standardized(self, res):
111 |         # print("result:", res)
112 |         resL = len(res)
113 |         tbs = []
114 | 
115 |         # Merge two boxes into their common bounding box
116 |         def mergeBox(b1, b2):
117 |             p00 = min([b1[0][0], b1[3][0], b2[0][0], b2[3][0]])
118 |             p01 = min([b1[0][1], b1[1][1], b2[0][1], b2[1][1]])
119 |             p20 = max([b1[1][0], b1[2][0], b2[1][0], b2[2][0]])
120 |             p21 = max([b1[2][1], b1[3][1], b2[2][1], b2[3][1]])
121 |             return [
122 |                 [p00, p01],
123 |                 [p20, p01],
124 |                 [p20, p21],
125 |                 [p00, p21],
126 |             ]
127 | 
128 |         # State of the current line
129 |         line_number = 0
130 |         index = 0
131 |         while True:
132 |             if index == resL:
133 |                 break
134 |             # One line
135 |             text = ""
136 |             score = 0
137 |             num = 0
138 |             box = res[index]["position"].tolist()
139 |             # Walk through the blocks of one line
140 |             while True:
141 |                 # End of the line
142 |                 data = res[index]
143 |                 if data["line_number"] > line_number:
144 |                     line_number = data["line_number"]
145 |                     break
146 |                 # Collect the block into the line
147 |                 text += data["text"]
148 |                 score += data["score"] if "score" in data else 1.0
149 |                 num += 1
150 |                 box = mergeBox(data["position"].tolist(), box)
151 |                 index += 1
152 |                 if index == resL:
153 |                     break
154 |             print(index, score, num)
155 |             # Assemble the line
156 |             tbs.append(
157 |                 {
158 |                     "text": text,
159 |                     "score": score / num,
160 |                     "box": box,
161 |                 }
162 |             )
163 |         if tbs:
164 |             return {"code": 100, "data": tbs}
165 |         else:
166 |             return {"code": 101, "data": ""}
167 | 
168 |     # Run one recognition. Accepts a path string or a PIL Image
169 |     def _run(self, img: Union[str, Image.Image]):
170 |         if not self.p2t:
171 |             return {"code": 201, "data": "p2t not initialized."}
172 |         t, f = self.argd["recognize_text"], self.argd["recognize_formula"]
173 |         resized_shape = self.argd["resized_shape"]
174 |         try:
175 |             if t and f:  # mixed recognition
176 |                 res = self.p2t.recognize(img, resized_shape=resized_shape)
177 |                 return self._tf_standardized(res)
178 |             elif t:  # text only
179 |                 # res = self.p2t.recognize_text(img)
180 |                 res = self.p2t.text_ocr.ocr(self.np.array(Image.open(img) if isinstance(img, str) else img))  # open the file first when given a path
181 |                 return self._text_standardized(res)
182 |             elif f:  # formula only
183 |                 im = img
184 |                 if isinstance(im, str):  # load the image from the path
185 |                     im = Image.open(im)
186 |                 text = self.p2t.recognize_formula(im)
187 |                 w, h = im.size
188 |                 if text:
189 |                     return {
190 |                         "code": 100,
191 |                         "data": [
192 |                             {
193 |                                 "box": [[0, 0], [w, 0], [w, h], [0, h]],
194 |                                 "score": 1,
195 |                                 "text": text,
196 |                             }
197 |                         ],
198 |                     }
199 |                 else:
200 |                     return self._text_standardized([])
201 |             else:
202 |                 return {
203 |                     "code": 202,
204 |                     "data": "未启用文字识别或公式识别。\nText or formula recognition is not enabled.",
205 |                 }
206 |         except Exception as e:
207 |             return {"code": 203, "data": f"p2t recognize error: {e}"}
208 | 
209 |     def runPath(self, imgPath: str):  # OCR from an image path
210 |         return self._run(imgPath)
211 | 
212 |     def runBytes(self, imageBytes):  # OCR from a byte stream
213 |         bytesIO = BytesIO(imageBytes)
214 |         image = Image.open(bytesIO)
215 |         return self._run(image)
216 | 
217 |     def runBase64(self, imageBase64):  # OCR from a base64 string
218 |         imageBytes = base64.b64decode(imageBase64)
219 |         return self.runBytes(imageBytes)
220 | 
221 | 
222 | """
223 | If Win7 reports the error
224 | [Error] DLL load failed while importing onnxruntime_pybind11_state: 找不到指定的程序。 (The specified procedure could not be found.)
225 | then download VC++2019
226 | https://aka.ms/vs/16/release/VC_redist.x64.exe
227 | """
228 | 


--------------------------------------------------------------------------------
/win7_x64_Pix2Text/p2t_config.py:
--------------------------------------------------------------------------------
 1 | from plugin_i18n import Translator
 2 | 
 3 | # UI translation
 4 | tr = Translator(__file__, "i18n.csv")
 5 | 
 6 | 
 7 | # Global options
 8 | globalOptions = {
 9 |     "title": "Pix2Text" + tr("（本地）"),
10 |     "type": "group",
11 |     "tips": {
12 |         "title": tr("支持中文/英文/数学公式/混排"),
13 |         "btnsList": [],
14 |     },
15 | }
16 | 
17 | # Local options
18 | localOptions = {
19 |     "title": tr("文字识别") + " (Pix2Text)",
20 |     "type": "group",
21 |     "recognize_text": {
22 |         "title": tr("启用文字识别"),
23 |         "toolTip": tr("支持简体中文+英文"),
24 |         "default": True,
25 |     },
26 |     "recognize_formula": {
27 |         "title": tr("启用数学公式"),
28 |         "default": True,
29 |     },
30 |     "resized_shape": {
31 |         "title": tr("限制图像边长"),
32 |         "optionsList": [
33 |             [608, "608 " + tr("（默认）")],
34 |             [1216, "1216"],
35 |             [2432, "2432"],
36 |             [4864, "4864"],
37 |         ],
38 |         "toolTip": tr(
39 |             "将边长大于该值的图片进行压缩，可以提高识别速度。可能降低识别精度。"
40 |         ),
41 |     },
42 | }
43 | 


--------------------------------------------------------------------------------
/win7_x64_RapidOCR-json/__init__.py:
--------------------------------------------------------------------------------
 1 | from . import api_rapidocr
 2 | from . import rapidocr_config
 3 | 
 4 | # Plugin info
 5 | PluginInfo = {
 6 |     # Plugin group
 7 |     "group": "ocr",
 8 |     # Global options
 9 |     "global_options": rapidocr_config.globalOptions,
10 |     # Local options
11 |     "local_options": rapidocr_config.localOptions,
12 |     # Interface class
13 |     "api_class": api_rapidocr.Api,
14 | }
15 | 


--------------------------------------------------------------------------------
/win7_x64_RapidOCR-json/api_rapidocr.py:
--------------------------------------------------------------------------------
 1 | # Python API for calling RapidOCR-json.exe
 2 | # Project home page:
 3 | # https://github.com/hiroi-sora/RapidOCR-json
 4 | 
 5 | import os
 6 | from .rapidocr import Rapid_pipe
 7 | from .rapidocr_config import LangDict
 8 | 
 9 | # Path of the executable
10 | ExePath = os.path.dirname(os.path.abspath(__file__)) + "/RapidOCR-json.exe"
11 | 
12 | 
13 | class Api:  # public interface
14 |     def __init__(self, globalArgd):
15 |         # Check that the executable exists
16 |         if not os.path.exists(ExePath):
17 |             raise ValueError(f'[Error] Exe path "{ExePath}" does not exist.')
18 |         # Initialize parameters
19 |         self.api = None  # engine API object
20 |         self.exeConfigs = {  # launch arguments for the executable
21 |             "models": "models",
22 |             "ensureAscii": 1,
23 |             "det": None,
24 |             "cls": None,
25 |             "rec": None,
26 |             "keys": None,
27 |             "doAngle": 0,
28 |             "mostAngle": 0,
29 |             "maxSideLen": None,
30 |             "numThread": globalArgd["numThread"],
31 |         }
32 | 
33 |     def start(self, argd):  # Start the engine. Returns "" on success, "[Error] xxx" on failure
34 |         # Load the local options
35 |         tempConfigs = self.exeConfigs.copy()
36 |         try:
37 |             lang = LangDict[argd["language"]]
38 |             tempConfigs.update(lang)
39 |             if argd["angle"]:
40 |                 tempConfigs["doAngle"] = tempConfigs["mostAngle"] = 1
41 |             else:
42 |                 tempConfigs["doAngle"] = tempConfigs["mostAngle"] = 0
43 |             tempConfigs["maxSideLen"] = argd["maxSideLen"]
44 |         except Exception as e:
45 |             self.api = None
46 |             return f"[Error] OCR start fail. Argd: {argd}\n{e}"
47 | 
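48 |         # If the engine is already running with identical arguments, no restart is needed
49 |         if self.api is not None:
50 |             if set(tempConfigs.items()) == set(self.exeConfigs.items()):
51 |                 return ""
52 |             # The engine is running but the arguments changed, so stop the old engine
53 |             self.stop()
54 |         # Start a new engine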
55 |         self.exeConfigs = tempConfigs
56 |         try:
57 |             self.api = Rapid_pipe(ExePath, tempConfigs)
58 |         except Exception as e:
59 |             self.api = None
60 |             return f"[Error] OCR init fail. Argd: {tempConfigs}\n{e}"
61 |         return ""
62 | 
63 |     def stop(self):  # stop the engine
64 |         if self.api is None:
65 |             return
66 |         self.api.exit()
67 |         self.api = None
68 | 
69 |     def runPath(self, imgPath: str):  # OCR from an image path
70 |         res = self.api.run(imgPath)
71 |         return res
72 | 
73 |     def runBytes(self, imageBytes):  # OCR from a byte stream
74 |         res = self.api.runBytes(imageBytes)
75 |         return res
76 | 
77 |     def runBase64(self, imageBase64):  # OCR from a base64 string
78 |         res = self.api.runBase64(imageBase64)
79 |         return res
80 | 


--------------------------------------------------------------------------------
/win7_x64_RapidOCR-json/i18n.csv:
--------------------------------------------------------------------------------
 1 | key,en_US,zh_TW,ja_JP
 2 | RapidOCR（本地）,RapidOCR (local),RapidOCR（本地）,RapidOCR（ローカル）
 3 | 线程数,Number of threads,線程數,スレッド数
 4 | 文字识别（RapidOCR）,Text Recognition (RapidOCR),文字識別（RapidOCR）,文字認識（RapidOCR）
 5 | 语言/模型库,Language/Model Library,語言/模型庫,言語/モデルライブラリ
 6 | 纠正文本方向,Correct Text Direction,糾正文字方向,テキスト方向の修正
 7 | 启用方向分类，识别倾斜或倒置的文本。可能降低识别速度。,Enable directional classification to recognize tilted or inverted text. May reduce recognition speed.,啟用方向分類，識別傾斜或倒置的文字。 可能降低識別速度。,方向分類を有効にして、傾斜または反転したテキストを識別します。認識速度が低下する可能性があります。
 8 | 限制图像边长,Limit image edge length,限制影像邊長,画像の辺の長さを制限する
 9 | （默认）,(Default),（默認）,(デフォルト)
10 | 无限制,Unlimited,無限制,制限なし
11 | 将边长大于该值的图片进行压缩，可以提高识别速度。可能降低识别精度。,Compressing images with side lengths greater than this value can improve recognition speed. May reduce recognition accuracy.,將邊長大於該值的圖片進行壓縮，可以提高識別速度。 可能降低識別精度。,辺の長さがこの値より大きい画像を圧縮することで、認識速度を高めることができます。認識精度が低下する可能性があります。
12 | 


--------------------------------------------------------------------------------
/win7_x64_RapidOCR-json/rapidocr.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import atexit  # exit handling
  3 | import subprocess  # processes and pipes
  4 | from json import loads as jsonLoads, dumps as jsonDumps
  5 | from sys import platform as sysPlatform  # used for Popen silent mode
  6 | from base64 import b64encode  # base64 encoding
  7 | 
  8 | 
  9 | class Rapid_pipe:  # Calls the OCR engine (pipe mode)
 10 |     def __init__(self, exePath: str, argument: dict = None):
 11 |         """Initialize the recognizer (pipe mode).\n
 12 |         `exePath`: path to the recognizer `RapidOCR-json.exe`.\n
 13 |         `argument`: launch arguments, a dict `{"key": value}`. See https://github.com/hiroi-sora/RapidOCR-json for details.
 14 |         """
 15 |         cwd = os.path.abspath(os.path.join(exePath, os.pardir))  # parent folder of the exe
 16 |         # Process the launch arguments
 17 |         if argument is not None:
 18 |             for key, value in argument.items():
 19 |                 if isinstance(value, str):  # wrap string values in double quotes
 20 |                     exePath += f' --{key}="{value}"'
 21 |                 else:
 22 |                     exePath += f" --{key}={value}"
 23 |         if "ensureAscii" not in exePath:
 24 |             exePath += f" --ensureAscii=1"
 25 |         # Run the child process in silent mode, without a console window
 26 |         self.ret = None
 27 |         startupinfo = None
 28 |         if "win32" in str(sysPlatform).lower():
 29 |             startupinfo = subprocess.STARTUPINFO()
 30 |             startupinfo.dwFlags = (
 31 |                 subprocess.CREATE_NEW_CONSOLE | subprocess.STARTF_USESHOWWINDOW
 32 |             )
 33 |             startupinfo.wShowWindow = subprocess.SW_HIDE
 34 |         self.ret = subprocess.Popen(  # open the pipes
 35 |             exePath,
 36 |             cwd=cwd,
 37 |             stdin=subprocess.PIPE,
 38 |             stdout=subprocess.PIPE,
 39 |             stderr=subprocess.DEVNULL,  # discard stderr output
 40 |             startupinfo=startupinfo,  # enable silent mode
 41 |         )
 42 |         # Wait for the child process to finish initializing
 43 |         while True:
 44 |             if self.ret.poll() is not None:  # child process exited: initialization failed
 45 |                 raise Exception(f"OCR init fail.")
 46 |             initStr = self.ret.stdout.readline().decode("utf-8", errors="ignore")
 47 |             if "OCR init completed." in initStr:  # initialization succeeded
 48 |                 break
 49 |         atexit.register(self.exit)  # force-stop the child process when the program exits
 50 | 
 51 |     def runDict(self, writeDict: dict):
 52 |         """Send an instruction dict to the engine process.\n
 53 |         `writeDict`: instruction dict.\n
 54 |         `return`:  {"code": status code, "data": list of results or error message string}\n"""
 55 |         # Check the child process
 56 |         if not self.ret:
 57 |             return {"code": 901, "data": f"引擎实例不存在。"}
 58 |         if self.ret.poll() is not None:
 59 |             return {"code": 902, "data": f"子进程已崩溃。"}
 60 |         # Send the instruction
 61 |         writeStr = jsonDumps(writeDict, ensure_ascii=True, indent=None) + "\n"
 62 |         try:
 63 |             self.ret.stdin.write(writeStr.encode("utf-8"))
 64 |             self.ret.stdin.flush()
 65 |         except Exception as e:
 66 |             return {"code": 902, "data": f"向识别器进程传入指令失败，疑似子进程已崩溃。{e}"}
 67 |         # Read the response
 68 |         try:
 69 |             getStr = self.ret.stdout.readline().decode("utf-8", errors="ignore")
 70 |         except Exception as e:
 71 |             return {"code": 903, "data": f"读取识别器进程输出值失败。异常信息：[{e}]"}
 72 |         try:
 73 |             return jsonLoads(getStr)
 74 |         except Exception as e:
 75 |             return {"code": 904, "data": f"识别器输出值反序列化JSON失败。异常信息：[{e}]。原始内容：[{getStr}]"}
 76 | 
 77 |     def run(self, imgPath: str):
 78 |         """Run text recognition on a local image.\n
 79 |         `imgPath`: image path.\n
 80 |         `return`:  {"code": status code, "data": list of results or error message string}\n"""
 81 |         writeDict = {"image_path": imgPath}
 82 |         return self.runDict(writeDict)
 83 | 
 84 |     def runBase64(self, imageBase64: str):
 85 |         """Run text recognition on an image encoded as a base64 string.\n
 86 |         `imageBase64`: base64 string of the image.\n
 87 |         `return`:  {"code": status code, "data": list of results or error message string}\n"""
 88 |         writeDict = {"image_base64": imageBase64}
 89 |         return self.runDict(writeDict)
 90 | 
 91 |     def runBytes(self, imageBytes):
 92 |         """Run text recognition on the raw bytes of an image.\n
 93 |         `imageBytes`: image byte stream.\n
 94 |         `return`:  {"code": status code, "data": list of results or error message string}\n"""
 95 |         imageBase64 = b64encode(imageBytes).decode("utf-8")
 96 |         return self.runBase64(imageBase64)
 97 | 
 98 |     def exit(self):
 99 |         """Shut down the engine child process."""
100 |         if not self.ret:
101 |             return
102 |         self.ret.kill()  # kill the child process
103 |         self.ret = None
104 |         atexit.unregister(self.exit)  # remove the exit handler
105 |         print("### RapidOCR引擎子进程关闭！")
106 | 
107 |     @staticmethod
108 |     def printResult(res: dict):
109 |         """For debugging: pretty-print an OCR result.\n
110 |         `res`: OCR result."""
111 | 
112 |         # Recognition succeeded
113 |         if res["code"] == 100:
114 |             index = 1
115 |             for line in res["data"]:
116 |                 print(f"{index}-置信度：{round(line['score'], 2)}，文本：{line['text']}")
117 |                 index += 1
118 |         elif res["code"] == 101:
119 |             print("图片中未识别出文字。")
120 |         else:
121 |             print(f"图片识别失败。错误码：{res['code']}，错误信息：{res['data']}")
122 | 
123 |     def __del__(self):
124 |         self.exit()
125 | 


--------------------------------------------------------------------------------
/win7_x64_RapidOCR-json/rapidocr_config.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import psutil
  3 | from plugin_i18n import Translator
  4 | 
  5 | tr = Translator(__file__, "i18n.csv")
  6 | 
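  7 | # Path of the model configuration file
  8 | MODELS_CONFIGS = "/models/configs.txt"
  9 | # Stores the language dict
 10 | LangDict = {}
 11 | 
 12 | 
 13 | # Build the list of model libraries dynamically
 14 | def _getlanguageList():
 15 |     global LangDict
 16 |     """Example of the configs.txt format: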
 17 |     简体中文(V4)
 18 |     ch_PP-OCRv4_det_infer.onnx
 19 |     ch_ppocr_mobile_v2.0_cls_infer.onnx
 20 |     rec_ch_PP-OCRv4_infer.onnx
 21 |     dict_chinese.txt
 22 | 
 23 |     """
 24 |     optionsList = []
 25 |     configsPath = os.path.dirname(os.path.abspath(__file__)) + MODELS_CONFIGS
 26 |     try:
 27 |         with open(configsPath, "r", encoding="utf-8") as file:
 28 |             content = file.read()
 29 |             parts = content.split("\n\n")
 30 |             for part in parts:
 31 |                 items = part.split("\n")
 32 |                 if len(items) == 5:
 33 |                     title, det, cls, rec, keys = items
 34 |                     LangDict[title] = {
 35 |                         "det": det,
 36 |                         "cls": cls,
 37 |                         "rec": rec,
 38 |                         "keys": keys,
 39 |                     }
 40 |                     optionsList.append([title, title])
 41 |         return optionsList
 42 |     except FileNotFoundError:
 43 |         print("[Error] RapidOCR配置文件configs不存在，请检查文件路径是否正确。", configsPath)
 44 |     except IOError:
 45 |         print("[Error] RapidOCR配置文件configs无法打开或读取。")
 46 |     return []
 47 | 
 48 | 
 49 | _LanguageList = _getlanguageList()
 50 | 
 51 | 
 52 | # Get the optimal number of threads
 53 | def _getThreads():
 54 |     try:
 55 |         phyCore = psutil.cpu_count(logical=False)  # number of physical cores
 56 |         lgiCore = psutil.cpu_count(logical=True)  # number of logical cores
 57 |         if (
 58 |             not isinstance(phyCore, int)
 59 |             or not isinstance(lgiCore, int)
 60 |             or lgiCore < phyCore
 61 |         ):
 62 |             raise ValueError("核心数计算异常")
 63 |         # Logical cores equal the physical cores, or exactly twice as many: return the logical core count
 64 |         if phyCore * 2 == lgiCore or phyCore == lgiCore:
 65 |             return lgiCore
 66 |         # Hybrid (big/little core) CPU: return the thread count of the big cores
 67 |         big = lgiCore - phyCore
 68 |         return big * 2
 69 |     except Exception as e:
 70 |         print("[Warning] 无法获取CPU核心数！", e)
 71 |         return 4
 72 | 
 73 | 
 74 | _threads = _getThreads()
 75 | 
 76 | globalOptions = {
 77 |     "title": tr("RapidOCR（本地）"),
 78 |     "type": "group",
 79 |     "numThread": {
 80 |         "title": tr("线程数"),
 81 |         "default": _threads,
 82 |         "min": 1,
 83 |         "isInt": True,
 84 |     },
 85 | }
 86 | 
 87 | localOptions = {
 88 |     "title": tr("文字识别（RapidOCR）"),
 89 |     "type": "group",
 90 |     "language": {
 91 |         "title": tr("语言/模型库"),
 92 |         "optionsList": _LanguageList,
 93 |     },
 94 |     "angle": {
 95 |         "title": tr("纠正文本方向"),
 96 |         "default": False,
 97 |         "toolTip": tr("启用方向分类，识别倾斜或倒置的文本。可能降低识别速度。"),
 98 |     },
 99 |     "maxSideLen": {
100 |         "title": tr("限制图像边长"),
101 |         "optionsList": [
102 |             [1024, "1024 " + tr("（默认）")],
103 |             [2048, "2048"],
104 |             [4096, "4096"],
105 |             [999999, tr("无限制")],
106 |         ],
107 |         "toolTip": tr("将边长大于该值的图片进行压缩，可以提高识别速度。可能降低识别精度。"),
108 |     },
109 | }
110 | 


--------------------------------------------------------------------------------
/win_linux_PaddleOCR-json/PPOCR_api.py:
--------------------------------------------------------------------------------
  1 | # Python API for calling PaddleOCR-json.exe
  2 | # Project home page:
  3 | # https://github.com/hiroi-sora/PaddleOCR-json
  4 | 
  5 | import os
  6 | import socket  # sockets
  7 | import atexit  # exit handling
  8 | import subprocess  # processes and pipes
  9 | import re  # regex
 10 | from json import loads as jsonLoads, dumps as jsonDumps
 11 | from sys import platform as sysPlatform  # used for Popen silent mode
 12 | from base64 import b64encode  # base64 encoding
 13 | 
 14 | 
 15 | class PPOCR_pipe:  # Calls the OCR engine (pipe mode)
 16 |     def __init__(self, exePath: str, modelsPath: str = None, argument: dict = None):
 17 |         """Initialize the recognizer (pipe mode).\n
 18 |         `exePath`: path to the recognizer `PaddleOCR_json.exe`.\n
 19 |         `modelsPath`: path to the `models` folder. If None, the models are assumed to be in the same directory as the recognizer.\n
 20 |         `argument`: launch arguments, a dict `{"key": value}`. See https://github.com/hiroi-sora/PaddleOCR-json for details.
 21 |         """
 22 |         # Private member variables
 23 |         self.__ENABLE_CLIPBOARD = False
 24 | 
 25 |         exePath = os.path.abspath(exePath)
 26 |         cwd = os.path.abspath(os.path.join(exePath, os.pardir))  # parent folder of the exe
 27 |         cmds = [exePath]
 28 |         # Process the launch arguments
 29 |         if modelsPath is not None:
 30 |             if os.path.exists(modelsPath) and os.path.isdir(modelsPath):
 31 |                 cmds += ["--models_path", os.path.abspath(modelsPath)]
 32 |             else:
 33 |                 raise Exception(
 34 |                     f"Input modelsPath doesn't exist or isn't a directory. modelsPath: [{modelsPath}]"
 35 |                 )
 36 |         if isinstance(argument, dict):
 37 |             for key, value in argument.items():
 38 |                 # Popen() requires every element of the input list to be str or bytes
 39 |                 if isinstance(value, bool):
 40 |                     cmds += [f"--{key}={value}"]  # Boolean arguments must have key and value joined together
 41 |                 elif isinstance(value, str):
 42 |                     cmds += [f"--{key}", value]
 43 |                 else:
 44 |                     cmds += [f"--{key}", str(value)]
 45 |         # Run the child process in silent mode, without showing a console window
 46 |         self.ret = None
 47 |         startupinfo = None
 48 |         if "win32" in str(sysPlatform).lower():
 49 |             startupinfo = subprocess.STARTUPINFO()
 50 |             startupinfo.dwFlags = (
 51 |                 subprocess.CREATE_NEW_CONSOLE | subprocess.STARTF_USESHOWWINDOW
 52 |             )
 53 |             startupinfo.wShowWindow = subprocess.SW_HIDE
 54 |         self.ret = subprocess.Popen(  # Open the pipe
 55 |             cmds,
 56 |             cwd=cwd,
 57 |             stdin=subprocess.PIPE,
 58 |             stdout=subprocess.PIPE,
 59 |             stderr=subprocess.DEVNULL,  # Discard stderr output
 60 |             startupinfo=startupinfo,  # Enable silent mode
 61 |         )
 62 |         # Wait for the child process to finish starting
 63 |         while True:
 64 |             if self.ret.poll() is not None:  # Child process has exited: initialization failed
 65 |                 raise Exception("OCR init fail.")
 66 |             initStr = self.ret.stdout.readline().decode("utf-8", errors="ignore")
 67 |             if "OCR init completed." in initStr:  # Initialization succeeded
 68 |                 break
 69 |             elif "OCR clipboard enbaled." in initStr:  # Clipboard support detected
 70 |                 self.__ENABLE_CLIPBOARD = True
 71 |         atexit.register(self.exit)  # Force-stop the child process when the program terminates
 72 | 
 73 |     def isClipboardEnabled(self) -> bool:
 74 |         return self.__ENABLE_CLIPBOARD
 75 | 
 76 |     def getRunningMode(self) -> str:
 77 |         # The default pipe mode can only run locally
 78 |         return "local"
 79 | 
 80 |     def runDict(self, writeDict: dict):
 81 |         """Send a command dict to the engine process.\n
 82 |         `writeDict`: Command dict.\n
 83 |         `return`:  {"code": status code, "data": list of results or an error message string}\n"""
 84 |         # Check the child process
 85 |         if not self.ret:
 86 |             return {"code": 901, "data": f"引擎实例不存在。"}
 87 |         if self.ret.poll() is not None:
 88 |             return {"code": 902, "data": f"子进程已崩溃。"}
 89 |         # Send the input
 90 |         writeStr = jsonDumps(writeDict, ensure_ascii=True, indent=None) + "\n"
 91 |         try:
 92 |             self.ret.stdin.write(writeStr.encode("utf-8"))
 93 |             self.ret.stdin.flush()
 94 |         except Exception as e:
 95 |             return {
 96 |                 "code": 902,
 97 |                 "data": f"向识别器进程传入指令失败，疑似子进程已崩溃。{e}",
 98 |             }
 99 |         # Read the response
100 |         try:
101 |             getStr = self.ret.stdout.readline().decode("utf-8", errors="ignore")
102 |         except Exception as e:
103 |             return {"code": 903, "data": f"读取识别器进程输出值失败。异常信息：[{e}]"}
104 |         try:
105 |             return jsonLoads(getStr)
106 |         except Exception as e:
107 |             return {
108 |                 "code": 904,
109 |                 "data": f"识别器输出值反序列化JSON失败。异常信息：[{e}]。原始内容：[{getStr}]",
110 |             }
111 | 
112 |     def run(self, imgPath: str):
113 |         """Run OCR on a local image.\n
114 |         `imgPath`: Image path.\n
115 |         `return`:  {"code": status code, "data": list of results or an error message string}\n"""
116 |         writeDict = {"image_path": imgPath}
117 |         return self.runDict(writeDict)
118 | 
119 |     def runClipboard(self):
120 |         """Immediately run OCR on the first image in the clipboard.\n
121 |         `return`:  {"code": status code, "data": list of results or an error message string}\n"""
122 |         if self.__ENABLE_CLIPBOARD:
123 |             return self.run("clipboard")
124 |         else:
125 |             raise Exception("剪贴板功能不存在或已禁用。")
126 | 
127 |     def runBase64(self, imageBase64: str):
128 |         """Run OCR on an image encoded as a base64 string.\n
129 |         `imageBase64`: Base64 string of the image.\n
130 |         `return`:  {"code": status code, "data": list of results or an error message string}\n"""
131 |         writeDict = {"image_base64": imageBase64}
132 |         return self.runDict(writeDict)
133 | 
134 |     def runBytes(self, imageBytes):
135 |         """Run OCR on the raw bytes of an image.\n
136 |         `imageBytes`: Image bytes.\n
137 |         `return`:  {"code": status code, "data": list of results or an error message string}\n"""
138 |         imageBase64 = b64encode(imageBytes).decode("utf-8")
139 |         return self.runBase64(imageBase64)
140 | 
141 |     def exit(self):
142 |         """Shut down the engine child process."""
143 |         if hasattr(self, "ret"):
144 |             if not self.ret:
145 |                 return
146 |             try:
147 |                 self.ret.kill()  # Kill the child process
148 |             except Exception as e:
149 |                 print(f"[Error] ret.kill() {e}")
150 |         self.ret = None
151 |         atexit.unregister(self.exit)  # Remove the exit handler
152 |         print("###  PPOCR引擎子进程关闭！")
153 | 
154 |     @staticmethod
155 |     def printResult(res: dict):
156 |         """For debugging: pretty-print an OCR result.\n
157 |         `res`: OCR result."""
158 | 
159 |         # Recognition succeeded
160 |         if res["code"] == 100:
161 |             index = 1
162 |             for line in res["data"]:
163 |                 print(
164 |                     f"{index}-置信度：{round(line['score'], 2)}，文本：{line['text']}"
165 |                 )
166 |                 index += 1
167 |         elif res["code"] == 101:
168 |             print("图片中未识别出文字。")
169 |         else:
170 |             print(f"图片识别失败。错误码：{res['code']}，错误信息：{res['data']}")
171 | 
172 |     def __del__(self):
173 |         self.exit()
174 | 
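The argument handling and request encoding used by `PPOCR_pipe` can be tried in isolation, without the engine binary. The following is a minimal sketch that mirrors those two steps, assuming the same rules as the class above; `build_cmds` and `encode_request` are hypothetical helper names that do not exist in the module.

```python
import json


def build_cmds(exe_path: str, argument: dict) -> list:
    """Turn an options dict into a Popen-style argument list (illustrative helper)."""
    cmds = [exe_path]
    for key, value in argument.items():
        # bool must be checked first: it is a subclass of int in Python.
        if isinstance(value, bool):
            cmds.append(f"--{key}={value}")  # key and value joined in one token
        else:
            cmds += [f"--{key}", str(value)]  # every element must be str
    return cmds


def encode_request(write_dict: dict) -> bytes:
    """Serialize one command as a single ASCII-only JSON line (illustrative helper)."""
    return (json.dumps(write_dict, ensure_ascii=True, indent=None) + "\n").encode("utf-8")


print(build_cmds("PaddleOCR-json.exe", {"cls": False, "cpu_threads": 4}))
# ['PaddleOCR-json.exe', '--cls=False', '--cpu_threads', '4']
print(encode_request({"image_path": "test.png"}))
# b'{"image_path": "test.png"}\n'
```

Keeping each request on one newline-terminated line is what lets the reader side use a single `readline()` per response.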


--------------------------------------------------------------------------------
/win_linux_PaddleOCR-json/PPOCR_config.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import psutil
  3 | from plugin_i18n import Translator
  4 | 
  5 | tr = Translator(__file__, "i18n.csv")
  6 | 
  7 | # Path to the model library config list
  8 | MODELS_CONFIGS = "/models/configs.txt"
  9 | 
 10 | 
 11 | # Dynamically build the list of model libraries
 12 | def _getlanguageList():
 13 |     """Example of the configs.txt format:
 14 |     config_chinese.txt 简体中文
 15 |     config_en.txt English
 16 |     """
 17 |     optionsList = []
 18 |     configsPath = os.path.dirname(os.path.abspath(__file__)) + MODELS_CONFIGS
 19 |     try:
 20 |         with open(configsPath, "r", encoding="utf-8") as file:
 21 |             for l in file.read().splitlines():
 22 |                 parts = l.strip().split(" ", 1)
 23 |                 if len(parts) < 2:  # Skip blank or malformed lines
 24 |                     continue
 25 |                 optionsList.append([f"models/{parts[0]}", parts[1]])
 26 |         return optionsList
 27 |     except FileNotFoundError:
 28 |         print(
 29 |             "[Error] PPOCR配置文件configs不存在，请检查文件路径是否正确。", configsPath
 30 |         )
 31 |     except IOError:
 32 |         print("[Error] PPOCR配置文件configs无法打开或读取。")
 33 |     return []
 34 | 
 35 | 
 36 | _LanguageList = _getlanguageList()
 37 | 
 38 | 
 39 | # Get the optimal thread count. The user's setting can override this computed value.
 40 | def _getThreads():
 41 |     threadsCount = 1
 42 |     try:
 43 |         phyCore = psutil.cpu_count(logical=False)  # Number of physical cores
 44 |         lgiCore = psutil.cpu_count(logical=True)  # Number of logical cores
 45 |         if (
 46 |             not isinstance(phyCore, int)
 47 |             or not isinstance(lgiCore, int)
 48 |             or lgiCore < phyCore
 49 |         ):
 50 |             raise ValueError("核心数计算异常")
 51 |         # No hybrid cores (logical == physical, or 2x with hyper-threading): use the logical core count
 52 |         if phyCore * 2 == lgiCore or phyCore == lgiCore:
 53 |             threadsCount = lgiCore
 54 |         # Hybrid big/little-core CPU: use the thread count of the big cores
 55 |         else:
 56 |             big = lgiCore - phyCore
 57 |             threadsCount = big * 2
 58 |         threadsCount = int(threadsCount)
 59 |     except Exception as e:
 60 |         print("[Warning] 无法获取CPU核心数！", e)
 61 |     # Cap the thread count at 16
 62 |     if threadsCount > 16:
 63 |         threadsCount = 16
 64 |     return threadsCount
 65 | 
 66 | 
 67 | _threads = _getThreads()
 68 | 
 69 | 
 70 | # Get the default memory usage cap. The user's setting can override this computed value.
 71 | def _getRamMax():
 72 |     ramMax = 1024
 73 |     try:
 74 |         # Total system memory, in bytes
 75 |         totalMemoryBytes = psutil.virtual_memory().total
 76 |         # Convert total memory to MB
 77 |         ramMax = totalMemoryBytes / 1048576
 78 |         ramMax *= 0.5  # Use half of the total memory
 79 |         ramMax = int(ramMax)
 80 |     except Exception as e:
 81 |         print("[Warning] 无法获取系统总内存数！", e)
 82 |     # Default memory floor is 512 MB, ceiling is 8 GB
 83 |     if ramMax < 512:
 84 |         ramMax = 512
 85 |     elif ramMax > 8192:
 86 |         ramMax = 8192
 87 |     return ramMax
 88 | 
 89 | 
 90 | _ramMax = _getRamMax()
 91 | 
 92 | globalOptions = {
 93 |     "title": tr("PaddleOCR（本地）"),
 94 |     "type": "group",
 95 |     "enable_mkldnn": {
 96 |         "title": tr("启用MKL-DNN加速"),
 97 |         "default": True,
 98 |         "toolTip": tr(
 99 |             "使用MKL-DNN数学库提高神经网络的计算速度。能大幅加快OCR识别速度，但也会增加内存占用。"
100 |         ),
101 |     },
102 |     "cpu_threads": {
103 |         "title": tr("线程数"),
104 |         "default": _threads,
105 |         "min": 1,
106 |         "isInt": True,
107 |     },
108 |     "ram_max": {
109 |         "title": tr("内存占用限制"),
110 |         "default": _ramMax,
111 |         "min": -1,
112 |         "unit": "MB",
113 |         "isInt": True,
114 |         "toolTip": tr("值>0时启用。引擎内存占用超过该值时，执行内存清理。"),
115 |     },
116 |     "ram_time": {
117 |         "title": tr("内存闲时清理"),
118 |         "default": 60,
119 |         "min": -1,
120 |         "unit": tr("秒"),
121 |         "isInt": True,
122 |         "toolTip": tr("值>0时启用。引擎空闲时间超过该值时，执行内存清理。"),
123 |     },
124 | }
125 | 
126 | localOptions = {
127 |     "title": tr("文字识别（PaddleOCR）"),
128 |     "type": "group",
129 |     "language": {
130 |         "title": tr("语言/模型库"),
131 |         "optionsList": _LanguageList,
132 |     },
133 |     "cls": {
134 |         "title": tr("纠正文本方向"),
135 |         "default": False,
136 |         "toolTip": tr("启用方向分类，识别倾斜或倒置的文本。可能降低识别速度。"),
137 |     },
138 |     "limit_side_len": {
139 |         "title": tr("限制图像边长"),
140 |         "optionsList": [
141 |             [960, "960 " + tr("（默认）")],
142 |             [2880, "2880"],
143 |             [4320, "4320"],
144 |             [999999, tr("无限制")],
145 |         ],
146 |         "toolTip": tr(
147 |             "将边长大于该值的图片进行压缩，可以提高识别速度。可能降低识别精度。"
148 |         ),
149 |     },
150 | }
151 | 
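The two default-value heuristics in this config module can be checked without `psutil` by feeding in core counts and memory sizes directly. The sketch below restates them as pure functions, assuming the intended RAM default is half of total memory as the source comments state; `pick_threads` and `default_ram_max` are hypothetical names used only for illustration.

```python
def pick_threads(phy_cores: int, logical_cores: int, cap: int = 16) -> int:
    """Thread-count heuristic: logical cores, or big-core threads on hybrid CPUs."""
    if logical_cores < phy_cores:
        raise ValueError("logical core count cannot be below physical core count")
    if phy_cores * 2 == logical_cores or phy_cores == logical_cores:
        threads = logical_cores  # uniform cores, with or without hyper-threading
    else:
        # Hybrid CPU: (logical - physical) equals the number of big cores,
        # each of which contributes two threads.
        threads = (logical_cores - phy_cores) * 2
    return min(threads, cap)


def default_ram_max(total_bytes: int) -> int:
    """Half of total memory in MB, clamped to the range 512..8192."""
    half_mb = int(total_bytes / 1048576 * 0.5)
    return max(512, min(half_mb, 8192))


print(pick_threads(8, 16))   # 16
print(pick_threads(14, 20))  # 12  (6 big cores with 2 threads each, plus 8 small cores)
print(default_ram_max(4 * 1024**3))  # 2048
```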


--------------------------------------------------------------------------------
/win_linux_PaddleOCR-json/PPOCR_umi.py:
--------------------------------------------------------------------------------
  1 | # Umi-OCR plugin interface: PaddleOCR-json
  2 | # Project homepage:
  3 | # https://github.com/hiroi-sora/PaddleOCR-json
  4 | 
  5 | import os
  6 | import psutil  # process inspection
  7 | from platform import system  # platform detection
  8 | import threading  # threads
  9 | 
 10 | from call_func import CallFunc
 11 | from .PPOCR_api import PPOCR_pipe
 12 | 
 13 | # Name of the engine executable (entry point)
 14 | # TODO: switch to Umi's internal platform flag instead of detecting the platform here
 15 | system_type = system()
 16 | ExeFile = ""
 17 | if system_type == "Windows":
 18 |     ExeFile = "PaddleOCR-json.exe"
 19 | elif system_type == "Linux":
 20 |     ExeFile = "run.sh"
 21 | else:
 22 |     raise NotImplementedError(f"[Error] PaddleOCR: Unsupported system: {system_type}")
 23 | 
 24 | # Path to the engine executable
 25 | ExePath = os.path.join(os.path.dirname(os.path.abspath(__file__)), ExeFile)
 26 | # Mapping table for the engine's startup arguments: maps option keys to startup arguments
 27 | ExeConfigs = [
 28 |     ("enable_mkldnn", "enable_mkldnn"),  # MKL acceleration
 29 |     ("config_path", "language"),  # Config file path
 30 |     ("cls", "cls"),  # Direction classification
 31 |     ("use_angle_cls", "cls"),  # Direction classification
 32 |     ("limit_side_len", "limit_side_len"),  # Long-side size limit
 33 |     ("cpu_threads", "cpu_threads"),  # Thread count
 34 | ]
 35 | 
 36 | 
 37 | class Api:  # Public interface
 38 |     def __init__(self, globalArgd):
 39 |         # Check that the executable path exists
 40 |         if not os.path.exists(ExePath):
 41 |             raise ValueError(f'[Error] Exe path "{ExePath}" does not exist.')
 42 |         # Initialize parameters
 43 |         self.api = None  # API object
 44 |         self.exeConfigs = {}  # Dict of exe startup arguments
 45 |         self._updateExeConfigs(self.exeConfigs, globalArgd)  # Update the startup-argument dict
 46 |         # Memory cleanup parameters
 47 |         self.ramInfo = {"max": -1, "time": -1, "timerID": ""}
 48 |         m = globalArgd["ram_max"]
 49 |         if isinstance(m, (int, float)):
 50 |             self.ramInfo["max"] = m
 51 |         m = globalArgd["ram_time"]
 52 |         if isinstance(m, (int, float)):
 53 |             self.ramInfo["time"] = m
 54 |         self.isInit = True
 55 |         self.lock = threading.Lock()  # Thread lock
 56 | 
 57 |     # Update startup arguments: write the values from data into target
 58 |     def _updateExeConfigs(self, target, data):
 59 |         for c in ExeConfigs:
 60 |             if c[1] in data:
 61 |                 target[c[0]] = data[c[1]]
 62 | 
 63 |     # Start the engine. Returns "" on success, "[Error] xxx" on failure
 64 |     def start(self, argd):
 65 |         with self.lock:
 66 |             # Load the local parameters
 67 |             tempConfigs = self.exeConfigs.copy()
 68 |             self._updateExeConfigs(tempConfigs, argd)
 69 |             # If the engine is already running and the merged parameters match the current ones, no restart is needed
 70 |             if self.api is not None and set(tempConfigs.items()) == set(
 71 |                 self.exeConfigs.items()
 72 |             ):
 73 |                 return ""
 74 |             # Record the parameters
 75 |             self.exeConfigs = tempConfigs
 76 |             # Start the engine
 77 |             try:
 78 |                 self.stop()
 79 |                 self.api = PPOCR_pipe(ExePath, argument=tempConfigs)
 80 |             except Exception as e:
 81 |                 self.api = None
 82 |                 return f"[Error] OCR init fail. Argd: {tempConfigs}\n{e}"
 83 |             return ""
 84 | 
 85 |     def stop(self):  # Stop the engine
 86 |         if self.api is None:
 87 |             return
 88 |         self.api.exit()
 89 |         self.api = None
 90 | 
 91 |     def runPath(self, imgPath: str):  # OCR from a file path
 92 |         self._runBefore()
 93 |         res = self.api.run(imgPath)
 94 |         self._ramClear()
 95 |         return res
 96 | 
 97 |     def runBytes(self, imageBytes):  # OCR from raw bytes
 98 |         self._runBefore()
 99 |         res = self.api.runBytes(imageBytes)
100 |         self._ramClear()
101 |         return res
102 | 
103 |     def runBase64(self, imageBase64):  # OCR from a base64 string
104 |         self._runBefore()
105 |         res = self.api.runBase64(imageBase64)
106 |         self._ramClear()
107 |         return res
108 | 
109 |     def _runBefore(self):
110 |         CallFunc.delayStop(self.ramInfo["timerID"])  # Stop the RAM cleanup timer
111 | 
112 |     def _restart(self):  # Restart the engine
113 |         with self.lock:
114 |             try:
115 |                 self.stop()
116 |                 self.api = PPOCR_pipe(ExePath, argument=self.exeConfigs)
117 |             except Exception as e:
118 |                 self.api = None
119 |                 print(f"[Error] OCR restart fail: {e}")
120 | 
121 |     def _ramClear(self):  # Memory cleanup
122 |         if self.ramInfo["max"] > 0:
123 |             pid = self.api.ret.pid
124 |             rss = psutil.Process(pid).memory_info().rss
125 |             rss /= 1048576
126 |             if rss > self.ramInfo["max"]:
127 |                 self._restart()
128 |         if self.ramInfo["time"] > 0:
129 |             self.ramInfo["timerID"] = CallFunc.delay(
130 |                 self._restart, self.ramInfo["time"]
131 |             )
132 | 
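The way option keys are mapped to engine startup arguments, and how that decides whether a restart is needed, can be shown with plain dicts. This is a minimal sketch that reuses the mapping table from the module above; `update_exe_configs` and the sample option values are assumptions made for illustration.

```python
# (startup argument, option key) pairs, copied from the ExeConfigs table above.
EXE_CONFIGS = [
    ("enable_mkldnn", "enable_mkldnn"),
    ("config_path", "language"),
    ("cls", "cls"),
    ("use_angle_cls", "cls"),
    ("limit_side_len", "limit_side_len"),
    ("cpu_threads", "cpu_threads"),
]


def update_exe_configs(target: dict, data: dict) -> None:
    """Write every mapped option found in data into target, under its startup-argument name."""
    for exe_key, option_key in EXE_CONFIGS:
        if option_key in data:
            target[exe_key] = data[option_key]


# Hypothetical option values, shaped like the global and local option groups.
global_args = {"enable_mkldnn": True, "cpu_threads": 8, "ram_max": 2048}
local_args = {"language": "models/config_en.txt", "cls": False, "limit_side_len": 960}

current = {}
update_exe_configs(current, global_args)  # ram_max is not mapped, so it is ignored

merged = current.copy()
update_exe_configs(merged, local_args)
print(merged != current)  # True: new arguments, so the engine must be (re)started

again = merged.copy()
update_exe_configs(again, local_args)
print(again != merged)  # False: same arguments, a running engine can be reused
```

Note that the single `cls` option feeds two startup arguments, `cls` and `use_angle_cls`, so both always carry the same value.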


--------------------------------------------------------------------------------
/win_linux_PaddleOCR-json/README.md:
--------------------------------------------------------------------------------
  1 | # PaddleOCR-json Plugin
  2 | 
  3 | Compatible with `Windows 7 x64` and `Linux x64`.
  4 | 
  5 | - [Windows quick deployment](#win-1)
  6 | - [Windows deployment from source](#win-2)
  7 | - [Linux quick deployment](#linux-1)
  8 | - [Linux deployment from source](#linux-2)
  9 | 
 10 | <a id="win-1"></a>
 11 | 
 12 | ## Windows quick deployment
 13 | 
 14 | - Set up Umi-OCR itself by following the [Umi-OCR Windows runtime](https://github.com/hiroi-sora/Umi-OCR_runtime_windows) instructions.
 15 | - Go to the [Umi-OCR_plugins releases page](https://github.com/hiroi-sora/Umi-OCR_plugins/releases), download the latest Windows release package `win7_x64_PaddleOCR-json_**.7z`, and extract it.
 16 | - Move the extracted folder into the Umi-OCR plugin directory `UmiOCR-data/plugins`.
 17 | 
 18 | <a id="win-2"></a>
 19 | 
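 20 | ## Windows deployment from source
 21 | 
 22 | ### Step 1: Prepare Umi-OCR and the plugin source code
 23 | 
 24 | - Set up Umi-OCR itself by following the [Umi-OCR Windows runtime](https://github.com/hiroi-sora/Umi-OCR_runtime_windows) instructions.
 25 | - In the Umi-OCR directory, create a `tools` directory.
 26 | - Download the **plugin repository** source code: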
 27 | 
 28 | ```sh
 29 | git clone https://github.com/hiroi-sora/Umi-OCR_plugins.git
 30 | ```
 31 | 
 32 | ### Step 2: Prepare the PaddleOCR-json executable
 33 | 
 34 | #### Option 1: Download directly
 35 | 
 36 | - In a browser, open the [PaddleOCR-json releases page](https://github.com/hiroi-sora/PaddleOCR-json/releases), find the link to the latest Windows release package `PaddleOCR-json_v1.X.X_windows_x86-64.7z`, then download and extract the archive.
 37 | - Rename the extracted folder to `win7_x64_PaddleOCR-json`.
 38 | 
 39 | #### Option 2: Build from source
 40 | 
 41 | - See the [PaddleOCR-json Windows build guide](https://github.com/hiroi-sora/PaddleOCR-json/blob/main/cpp/README.md).
 42 | 
 43 | - Once the build succeeds, copy all generated executables into a folder named `win7_x64_PaddleOCR-json`.
 44 | 
 45 | ### Step 3: Assemble and install the plugin
 46 | 
 47 | - Copy all files from `tools\Umi-OCR_plugins\win_linux_PaddleOCR-json` into `win7_x64_PaddleOCR-json`.
 48 | - In `win7_x64_PaddleOCR-json`, double-click `PaddleOCR-json.exe` to test it. Normally a console window opens and displays `OCR init completed.`.
 49 | - Copy the entire `win7_x64_PaddleOCR-json` folder into `UmiOCR-data\plugins`.
 50 | 
 51 | ### Final test
 52 | 
 53 | Launch Umi-OCR and try out its features.
 54 | 
 55 | Scroll to the bottom of the global settings to find the performance options for the PaddleOCR-json plugin.
 56 | 
 57 | <a id="linux-1"></a>
 58 | 
 59 | ## Linux quick deployment
 60 | 
 61 | - Set up Umi-OCR itself by following the [Umi-OCR Linux runtime](https://github.com/hiroi-sora/Umi-OCR_runtime_linux) instructions.
 62 | - Go to the Umi-OCR plugin directory `UmiOCR-data/plugins`.
 63 | - In a browser, open the [Umi-OCR_plugins releases page](https://github.com/hiroi-sora/Umi-OCR_plugins/releases), find the link to the latest Linux release package `linux_x64_PaddleOCR-json`, then download and extract the archive.
 64 | 
 65 | Example:
 66 | 
 67 | ```
 68 | # Go to the plugin directory
 69 | cd UmiOCR-data/plugins
 70 | # Create it if it does not exist
 71 | # mkdir UmiOCR-data/plugins
 72 | # Download
 73 | wget https://github.com/hiroi-sora/Umi-OCR_plugins/releases/download/2.1.3_dev/linux_x64_PaddleOCR-json_v140_beta.tar.xz
 74 | # Extract
 75 | tar -v -xf linux_x64_PaddleOCR-json_v140_beta.tar.xz
 76 | # Done. Launch Umi-OCR and test it.
 77 | ```
 78 | 
 79 | <a id="linux-2"></a>
 80 | 
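 81 | ## Linux deployment from source
 82 | 
 83 | ### Step 1: Prepare Umi-OCR and the plugin source code
 84 | 
 85 | - Set up Umi-OCR itself by following the [Umi-OCR Linux runtime](https://github.com/hiroi-sora/Umi-OCR_runtime_linux) instructions.
 86 | - In the Umi-OCR directory, create a `tools` directory:
 87 | 
 88 | ```sh
 89 | mkdir tools
 90 | cd tools
 91 | ```
 92 | 
 93 | - Download the plugin repository source code: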
 94 | 
 95 | ```sh
 96 | git clone https://github.com/hiroi-sora/Umi-OCR_plugins.git
 97 | ```
 98 | 
 99 | ### Step 2: Prepare the PaddleOCR-json executable
100 | 
101 | #### Option 1: Download directly
102 | 
103 | - In a browser, open the [PaddleOCR-json releases page](https://github.com/hiroi-sora/PaddleOCR-json/releases), find the link to the latest Linux release package `PaddleOCR-json_v1.X.X_debian_gcc_x86-64.tar.xz`, then download and extract the archive. Example:
104 | 
105 | ```
106 | # Download
107 | wget https://github.com/hiroi-sora/PaddleOCR-json/releases/download/v1.4.0-beta.2/PaddleOCR-json_v1.4.0.beta.2_debian_gcc_x86-64.tar.xz
108 | # Extract
109 | tar -v -xf PaddleOCR-json_v1.4.0.beta.2_debian_gcc_x86-64.tar.xz
110 | # Rename the folder to something shorter for convenience
111 | mv PaddleOCR-json_v1.4.0_debian_gcc_x86-64 PaddleOCR-json-bin
112 | # Run a quick test. If you have no test image, leave the path empty.
113 | ./PaddleOCR-json-bin/run.sh --image_path="path/to/test-image"
114 | # If it prints OCR init completed., the test passed.
115 | # If it keeps running after printing OCR init completed., press Ctrl+C to stop it.
116 | ```
117 | 
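118 | #### Option 2: Build from source
119 | 
120 | - See the [PaddleOCR-json Linux build guide](https://github.com/hiroi-sora/PaddleOCR-json/blob/main/cpp/README-linux.md).
121 | 
122 | - Once the build succeeds, copy all generated executables into the `tools/PaddleOCR-json-bin` directory mentioned above.
123 | 
124 | #### Preparation complete
125 | 
126 | - After the steps above, you should have the following directory structure: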
127 | 
128 | ```
129 | Umi-OCR
130 | ├─ umi-ocr.sh
131 | ├─ UmiOCR-data
132 | └─ tools
133 |    ├─ Umi-OCR_plugins
134 |    └─ PaddleOCR-json-bin
135 | ```
136 | 
137 | ### Step 3: Assemble and install the plugin
138 | 
139 | - Make sure you are in the `tools` directory, which contains `Umi-OCR_plugins` and `PaddleOCR-json-bin`.
140 | - Run the following commands:
141 | 
142 | ```sh
143 | # Create the plugin directory
144 | mkdir -p linux_x64_PaddleOCR-json
145 | # Copy the executable files
146 | cp -rf PaddleOCR-json-bin/* linux_x64_PaddleOCR-json/
147 | # Copy the plugin control source code
148 | cp -rf Umi-OCR_plugins/win_linux_PaddleOCR-json/* linux_x64_PaddleOCR-json/
149 | # Put the fully assembled plugin into the Umi-OCR plugin directory
150 | cp -rf linux_x64_PaddleOCR-json ../UmiOCR-data/plugins/
151 | ```
152 | 
153 | ### Final test
154 | 
155 | Launch Umi-OCR and try out its features.
156 | 
157 | Scroll to the bottom of the global settings to find the performance options for the PaddleOCR-json plugin.
158 | 


--------------------------------------------------------------------------------
/win_linux_PaddleOCR-json/__init__.py:
--------------------------------------------------------------------------------
 1 | from . import PPOCR_umi
 2 | from . import PPOCR_config
 3 | 
 4 | # Plugin information
 5 | PluginInfo = {
 6 |     # Plugin group
 7 |     "group": "ocr",
 8 |     # Global options
 9 |     "global_options": PPOCR_config.globalOptions,
10 |     # Local options
11 |     "local_options": PPOCR_config.localOptions,
12 |     # API class
13 |     "api_class": PPOCR_umi.Api,
14 | }
15 | 


--------------------------------------------------------------------------------
/win_linux_PaddleOCR-json/i18n.csv:
--------------------------------------------------------------------------------
 1 | key,en_US,zh_TW,ja_JP
 2 | PaddleOCR（本地）,PaddleOCR (Local),PaddleOCR（本地）,PaddleOCR（ローカル）
 3 | 启用MKL-DNN加速,Enable MKL-DNN acceleration,啟用MKL-DNN加速,MKL-DNN加速を有効にする
 4 | 使用MKL-DNN数学库提高神经网络的计算速度。能大幅加快OCR识别速度，但也会增加内存占用。,"Use the MKL-DNN mathematical library to improve the computation speed of the neural network. This can significantly accelerate OCR recognition speed, but it will also increase memory usage.",使用MKL-DNN數學庫提高神經網路的計算速度。 能大幅加快OCR識別速度，但也會新增記憶體佔用。,MKL-DNN数学ライブラリを用いてニューラルネットワークの計算速度を向上させた。OCRの認識速度を大幅に速めることができますが、メモリの使用量も増加します。
 5 | 线程数,Number of threads,線程數,スレッド数
 6 | 内存占用限制,Memory usage limit,記憶體佔用限制,メモリ使用量の制限
 7 | 值>0时启用。引擎内存占用超过该值时，执行内存清理。,"Enable when the value is greater than 0. When the memory usage of the engine exceeds this value, memory cleanup will be performed.",值>0時啟用。 引擎記憶體佔用超過該值時，執行記憶體清理。,値>0の場合に有効になります。エンジンメモリがこの値を超えて占有されている場合は、メモリクリーンアップを実行します。
 8 | 内存闲时清理,Memory cleanup during idle time,記憶體閑時清理,メモリアイドル時のクリーンアップ
 9 | 秒,seconds,秒,秒
10 | 值>0时启用。引擎空闲时间超过该值时，执行内存清理。,"Enable when the value is greater than 0. When the idle time of the engine exceeds this value, memory cleanup will be performed.",值>0時啟用。 引擎空閒時間超過該值時，執行記憶體清理。,値>0の場合に有効になります。エンジンアイドル時間がこの値を超えると、メモリクリーンアップが実行されます。
11 | 文字识别（PaddleOCR）,Text recognition (PaddleOCR),文字識別（PaddleOCR）,文字認識（PaddleOCR）
12 | 语言/模型库,Language/Model library,語言/模型庫,言語/モデルライブラリ
13 | 纠正文本方向,Text direction correction,糾正文字方向,テキスト方向の修正
14 | 启用方向分类，识别倾斜或倒置的文本。可能降低识别速度。,Enable direction classification to recognize slanted or inverted text. This may reduce recognition speed.,啟用方向分類，識別傾斜或倒置的文字。 可能降低識別速度。,方向分類を有効にして、傾斜または反転したテキストを識別します。認識速度が低下する可能性があります。
15 | 限制图像边长,Limit image edge length,限制影像邊長,画像の辺の長さを制限する
16 | （默认）,(Default),（默認）,(デフォルト)
17 | 无限制,Unlimited,無限制,制限なし
18 | 将边长大于该值的图片进行压缩，可以提高识别速度。可能降低识别精度。,Compress images with edge length greater than this value to improve recognition speed. This may reduce recognition accuracy.,將邊長大於該值的圖片進行壓縮，可以提高識別速度。 可能降低識別精度。,辺の長さがこの値より大きい画像を圧縮することで、認識速度を高めることができます。認識精度が低下する可能性があります。
19 | 


--------------------------------------------------------------------------------