├── .gitignore
├── .gitmodules
├── LICENSE
├── README.md
├── TODO.md
├── build
│   ├── dldt
│   │   └── dldt_setup.sh
│   ├── ffmpeg
│   │   ├── ffmpeg_premake.sh
│   │   └── ffmpeg_setup.sh
│   ├── openblas
│   │   └── openblas_setup.sh
│   └── opencv
│       └── opencv_setup.sh
├── create_wheel
│   ├── LICENSE
│   ├── LICENSE_MIT
│   ├── README.md
│   ├── cv2
│   │   └── __init__.py
│   └── setup.py
├── download_all_stuff.sh
├── requirements.txt
├── tests
│   ├── README.md
│   ├── dislike.jpg
│   ├── examples.ipynb
│   ├── helloworld.png
│   ├── pixellink.py
│   ├── prepare_and_run_tests.sh
│   ├── requirements.txt
│   ├── short_video.mp4
│   ├── tests.py
│   ├── text-detection-0004.bin.sha256sum
│   ├── text-detection-0004.xml.sha256sum
│   ├── text-recognition-0012.bin.sha256sum
│   ├── text-recognition-0012.xml.sha256sum
│   └── text_recognition.py
└── tests_openvino
    ├── README.md
    ├── pixellink.py
    ├── prepare_and_run_tests.sh
    ├── requirements.txt
    ├── tests.py
    ├── text-detection-0004.bin.sha256sum
    ├── text-detection-0004.xml.sha256sum
    └── text_recognition.py
/.gitignore:
--------------------------------------------------------------------------------
1 | openblas/*
2 | !openblas/.gitkeep
3 |
4 | dldt/*
5 | !dldt/.gitkeep
6 |
7 | opencv/*
8 | !opencv/.gitkeep
9 |
10 | ffmpeg/*
11 | !ffmpeg/.gitkeep
12 |
13 | build/*
14 | !build/opencv
15 | build/opencv/*
16 | !build/opencv/opencv_setup.sh
17 |
18 | !build/dldt
19 | build/dldt/*
20 | !build/dldt/dldt_setup.sh
21 | !build/dldt/*.patch
22 |
23 | !build/ffmpeg
24 | build/ffmpeg/*
25 | !build/ffmpeg/ffmpeg_setup.sh
26 | !build/ffmpeg/ffmpeg_premake.sh
27 |
28 | !build/openblas
29 | build/openblas/*
30 | !build/openblas/openblas_setup.sh
31 |
32 | create_wheel/*
33 | !create_wheel/LICENSE*
34 | !create_wheel/README.md
35 | !create_wheel/setup.py
36 | !create_wheel/cv2
37 | create_wheel/cv2/*
38 | !create_wheel/cv2/__init__.py
39 |
40 | tests/venv_t
41 | tests/rateme*
42 | venv
43 |
44 | tests_openvino/venv_t
45 | tests_openvino/venv_d
46 | tests_openvino/rateme*
47 | tests_openvino/*.bin
48 | tests_openvino/*.xml
49 |
50 | *cache*
51 | .ipynb_checkpoints
52 | *tar.gz
53 | *tar.bz2
54 | *.zip
55 | *.swp
56 | *.whl
57 | *.xml
58 | TODO.txt
59 | *.bin
60 | *.weights
61 |
--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
1 | [submodule "dldt"]
2 | path = dldt
3 | url = https://github.com/openvinotoolkit/openvino
4 | [submodule "opencv"]
5 | path = opencv
6 | url = https://github.com/opencv/opencv
7 | [submodule "ffmpeg"]
8 | path = ffmpeg
9 | url = https://github.com/FFmpeg/FFmpeg
10 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Kabakov Borys
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [](https://pepy.tech/project/opencv-python-inference-engine) [](https://pepy.tech/project/opencv-python-inference-engine/month) [](https://pepy.tech/project/opencv-python-inference-engine/week)
2 |
3 | # opencv-python-inference-engine
4 |
5 | ---
6 |
7 | $${\color{red}It \space is \space deprecated \space now, \space all \space future \space updates \space are \space improbable}$$
8 |
9 | A lot has changed during my military leave:
10 |
11 | 1. Everything [changed since OpenVINO 2021.1](https://github.com/openvinotoolkit/openvino/releases/tag/2022.1.0): now there should be [two separate libs](https://opencv.org/how-to-use-opencv-with-openvino/), and the building process and the inference engine API have changed dramatically, *without backwards compatibility* (btw, opencv-python now has *official* python builds).
12 | 2. opencv now has [small package installations via pip](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html), so the main reason for creating this package is gone.
13 | 3. Just [look at the current official way](https://github.com/banderlog/opencv-python-inference-engine/tree/dev/create_wheel/cv2) of importing the cv2 C++ lib into the python module via 5 python scripts -- I do not want to mess with this growing lump of crutches.
14 |
15 | My advice is to use the openvino and opencv-python packages; see the rewritten examples in [`tests_openvino`](https://github.com/banderlog/opencv-python-inference-engine/tree/master/tests_openvino)
16 |
17 | ---
18 |
19 | This is an *unofficial* pre-built OpenCV package with the inference engine part of [OpenVINO](https://github.com/openvinotoolkit/openvino) for Python.
20 |
21 | ## Installing from `pip3`
22 |
23 | Remove previously installed versions of `cv2` (e.g., `opencv-python`, `opencv-python-headless`), then:
24 |
25 | ```bash
26 | pip3 install opencv-python-inference-engine
27 | ```
28 |
29 | ## Examples of usage
30 |
31 | Please see the `examples.ipynb` in the `tests` folder.
32 |
33 | You will need to preprocess the data as the model requires and decode the output. A description of the decoding *should* be in the model documentation, with examples in the OpenVINO documentation; however, in some cases the original article may be the only source of information. Some models are very simple to encode/decode, others are tough (e.g., PixelLink in tests).
34 |
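Below is a minimal sketch of that preprocess/infer/decode cycle with the `cv2.dnn` API, distilled from `tests/pixellink.py`; the model files and the input size are the ones used by `text-detection-0004` and are assumptions for any other model.

```python
import cv2

# read the model pair (XML topology + BIN weights)
net = cv2.dnn.readNet('text-detection-0004.xml', 'text-detection-0004.bin')
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)

# preprocess: pack the image into the NCHW blob the model expects
img = cv2.imread('helloworld.png')
blob = cv2.dnn.blobFromImage(img, 1, (1280, 768))
net.setInput(blob)

# infer; decoding these raw outputs is model-specific (see tests/pixellink.py)
outs = net.forward(net.getUnconnectedOutLayersNames())
```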
35 |
36 | ## Downloading intel models
37 |
38 | The official way is awkward because you need to git clone the whole [model_zoo](https://github.com/opencv/open_model_zoo) ([details](https://github.com/opencv/open_model_zoo/issues/522))
39 |
40 | It is better to find a model description [here](https://github.com/opencv/open_model_zoo/blob/master/models/intel/index.md) and download it manually from [here](https://download.01.org/opencv/2021/openvinotoolkit/2021.2/open_model_zoo/models_bin/3/)
41 |
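For scripted downloads, here is a sketch of the manual route, assuming the URL layout used in `tests/prepare_and_run_tests.sh` (the base URL and the `FP32` precision folder may differ for other releases):

```python
import urllib.request

# base URL layout as in tests/prepare_and_run_tests.sh (an assumption for other releases)
base = 'https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1'
model = 'text-detection-0004'

# fetch the XML topology and the BIN weights next to each other
for ext in ('xml', 'bin'):
    fname = f'{model}.{ext}'
    urllib.request.urlretrieve(f'{base}/{model}/FP32/{fname}', fname)
```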
42 |
43 | ## Description
44 |
45 |
46 | ### Why
47 |
48 | I needed the ability to quickly deploy a small package able to run models from [Intel's model zoo](https://github.com/openvinotoolkit/open_model_zoo) and to use a [Movidius NCS](https://software.intel.com/en-us/neural-compute-stick).
49 | The well-known [opencv-python](https://github.com/skvark/opencv-python) package can't do this.
50 | The official way is to use OpenVINO, but it is big and clumsy (just try to use it with a python venv or to quickly download it on a cloud instance).
51 |
52 |
53 | ### Limitations
54 |
55 | + Package comes without contrib modules.
56 | + You need to [add udev rules](https://www.intel.com/content/www/us/en/support/articles/000057005/boards-and-kits.html) if you want a working MYRIAD plugin.
57 | + It was tested on Ubuntu 18.04, Ubuntu 18.10 as a Windows 10 subsystem (WSL), and Gentoo.
58 | + It will not work on Ubuntu 16.04 and below (except v4.1.0.4).
59 | + I have not made builds for Windows or MacOS.
60 | + It is built with `ffmpeg` and `v4l` support (`ffmpeg` libs included).
61 | + No GTK/QT support -- use `matplotlib` for plotting your results.
62 | + It is 64-bit only.
63 |
64 | ### Main differences from `opencv-python-headless`
65 |
66 | + Usage of `AVX2` instructions
67 | + No `JPEG 2000`, `WEBP`, `OpenEXR` support
68 | + `TBB` used as a parallel framework
69 | + Inference Engine with `MYRIAD` plugin
70 |
71 | ### Main differences from OpenVINO
72 |
73 | + No model-optimizer
74 | + No [ITT](https://software.intel.com/en-us/articles/intel-itt-api-open-source)
75 | + No [IPP](https://software.intel.com/en-us/ipp)
76 | + No [Intel Media SDK](https://software.intel.com/en-us/media-sdk)
77 | + No [OpenVINO IE API](https://github.com/opencv/dldt/tree/2020/inference-engine/ie_bridges/python/src/openvino/inference_engine)
78 | + No python2 support (it is dead)
79 | + No Gstreamer (use ffmpeg)
80 | + No GTK (+16 MB and a lot of problems and extra work to compile Qt/GTK libs from sources)
81 |
82 | For additional info read `cv2.getBuildInformation()` output.
83 |
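For example, a quick way to see which of the features above made it into a given build (grepping the summary text is my own heuristic, not an official API):

```python
import cv2

info = cv2.getBuildInformation()
# print the summary lines for the parts this package cares about
for key in ('Inference Engine', 'FFMPEG', 'TBB', 'GUI'):
    print([line.strip() for line in info.splitlines() if key in line])
```
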
84 | ### Versioning
85 |
86 | `YYYY.MM.DD`, because it is the simplest way to track opencv/openvino versions.
87 |
88 | ## Compiling from source
89 |
90 | You will need ~7GB of RAM and ~10GB of disk space.
91 |
92 | I am using Ubuntu 18.04 (python 3.6) [multipass](https://multipass.run/) instance: `multipass launch -c 6 -d 10G -m 7G 18.04`.
93 |
94 | ### Requirements
95 |
96 | From [opencv](https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html), [dldt](https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html),
97 | [ffmpeg](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu), and [ngraph](https://www.ngraph.ai/documentation/buildlb)
98 |
99 | ```bash
100 | # We need newer `cmake` for dldt (fastest way I know)
101 | # >=cmake-3.16
102 | sudo apt remove --purge cmake
103 | hash -r
104 | sudo snap install cmake --classic
105 |
106 | # nasm for ffmpeg
107 | # libusb-1.0-0-dev for MYRIAD plugin
108 | sudo apt update
109 | sudo apt install build-essential git pkg-config python3-dev nasm python3 virtualenv libusb-1.0-0-dev chrpath shellcheck
110 |
111 | # for ngraph
112 | # the `dldt/_deps/ext_onnx-src/onnx/gen_proto.py` has `#!/usr/bin/env python` string and will throw an error otherwise
113 | sudo ln -s /usr/bin/python3 /usr/bin/python
114 | ```
115 |
116 | ### Preparing
117 |
118 | ```bash
119 | git clone https://github.com/banderlog/opencv-python-inference-engine
120 | cd opencv-python-inference-engine
121 | # git checkout dev
122 | ./download_all_stuff.sh
123 | ```
124 |
125 | ### Compilation
126 |
127 | ```bash
128 | cd build/ffmpeg
129 | ./ffmpeg_setup.sh &&
130 | ./ffmpeg_premake.sh &&
131 | make -j6 &&
132 | make install
133 |
134 | cd ../dldt
135 | ./dldt_setup.sh &&
136 | make -j6
137 |
138 | # NB: check the `-D INF_ENGINE_RELEASE` value,
139 | # it should be in the form YYYYAABBCC (e.g. 2020.1.0.2 -> 2020010002)
140 | cd ../opencv
141 | ./opencv_setup.sh &&
142 | make -j6
143 | ```
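
The `YYYYAABBCC` comment above encodes an OpenVINO version as a single number; a small throwaway helper (my own sketch, not part of the build scripts) makes the mapping explicit:

```python
# convert an OpenVINO version string to the YYYYAABBCC number
# expected by -D INF_ENGINE_RELEASE (e.g. 2020.1.0.2 -> 2020010002)
def inf_engine_release(version: str) -> int:
    year, *rest = version.split('.')
    aa, bb, cc = (rest + ['0', '0', '0'])[:3]
    return int(f"{year}{int(aa):02d}{int(bb):02d}{int(cc):02d}")

assert inf_engine_release('2020.1.0.2') == 2020010002
```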
144 |
145 | ### Wheel creation
146 |
147 | ```bash
148 | # get all compiled libs together
149 | cd ../../
150 | cp build/opencv/lib/python3/cv2.cpython*.so create_wheel/cv2/cv2.so
151 |
152 | cp dldt/bin/intel64/Release/lib/*.so create_wheel/cv2/
153 | cp dldt/bin/intel64/Release/lib/*.mvcmd create_wheel/cv2/
154 | cp dldt/bin/intel64/Release/lib/plugins.xml create_wheel/cv2/
155 | cp dldt/inference-engine/temp/tbb/lib/libtbb.so.2 create_wheel/cv2/
156 |
157 | cp build/ffmpeg/binaries/lib/*.so create_wheel/cv2/
158 |
159 | # change RPATH
160 | cd create_wheel
161 | for i in cv2/*.so; do chrpath -r '$ORIGIN' $i; done
162 |
163 | # final .whl will be in /create_wheel/dist/
164 | # NB: check version in the `setup.py`
165 | ../venv/bin/python3 setup.py bdist_wheel
166 | ```
167 |
168 | ### Optional things to play with
169 |
170 | + [dldt build instruction](https://github.com/openvinotoolkit/openvino/wiki/CMakeOptionsForCustomCompilation)
171 | + [dldt cmake flags](https://github.com/openvinotoolkit/openvino/blob/master/inference-engine/cmake/features.cmake)
172 | + [opencv cmake flags](https://github.com/opencv/opencv/blob/master/CMakeLists.txt)
173 |
174 | **NB:** removing `QUIET` from `find_package()` in the project CMake files could help to solve some problems -- CMake will start to log them.
175 |
176 |
177 | #### GTK2
178 |
179 | Make the following changes in `opencv-python-inference-engine/build/opencv/opencv_setup.sh`:
180 | 1. change the string `-D WITH_GTK=OFF \` to `-D WITH_GTK=ON \`
181 | 2. `export PKG_CONFIG_PATH=$ABS_PORTION/build/ffmpeg/binaries/lib/pkgconfig:$PKG_CONFIG_PATH` -- you will need to
182 |    add the absolute paths to the `.pc` files. On Ubuntu 18.04 they are here:
183 |    `/usr/lib/x86_64-linux-gnu/pkgconfig/:/usr/share/pkgconfig/:/usr/local/lib/pkgconfig/:/usr/lib/pkgconfig/`
184 |
185 | Exporting `PKG_CONFIG_PATH` for `ffmpeg` somehow messes with the default values.
186 |
187 | It will add ~16MB to the package.
188 |
189 | #### Integrated Performance Primitives
190 |
191 | Just set `-D WITH_IPP=ON` in `opencv_setup.sh`.
192 |
193 | It will add ~30MB to the final `cv2.so` size and boost _some_ opencv functions.
194 |
195 | [Official Intel's IPP benchmarks](https://software.intel.com/en-us/ipp/benchmarks) (may ask for registration)
196 |
197 | #### MKL
198 |
199 | You need to download an MKL-DNN release and set two flags: `-D GEMM=MKL` and `-D MKLROOT` ([details](https://github.com/opencv/dldt/issues/327))
200 |
201 | OpenVINO comes with a 30MB `libmkl_tiny_tbb.so`, but [you will not be able to compile it](https://github.com/intel/mkl-dnn/issues/674), because it is made from the proprietary MKL.
202 |
203 | Our opensource MKL-DNN experiment ended with a 125MB `libmklml_gnu.so` and inference speed comparable to the 5MB openblas ([details](https://github.com/banderlog/opencv-python-inference-engine/issues/5)).
204 |
205 |
206 | #### CUDA
207 |
208 | I did not try it. It cannot be universal: it will only work with the certain combination of GPU+CUDA+cuDNN for which it was compiled.
209 |
210 | + [Compile OpenCV’s ‘dnn’ module with NVIDIA GPU support](https://www.pyimagesearch.com/2020/02/10/opencv-dnn-with-nvidia-gpus-1549-faster-yolo-ssd-and-mask-r-cnn/)
211 | + [Use OpenCV’s ‘dnn’ module with NVIDIA GPUs, CUDA, and cuDNN](https://www.pyimagesearch.com/2020/02/03/how-to-use-opencvs-dnn-module-with-nvidia-gpus-cuda-and-cudnn/)
212 |
213 |
214 | #### OpenMP
215 |
216 | It is possible to compile OpenBLAS, dldt, and OpenCV with OpenMP. I am not sure the result would be better than it is now, but who knows.
217 |
--------------------------------------------------------------------------------
/TODO.md:
--------------------------------------------------------------------------------
1 | # TODO list
2 |
3 | + Auto value for `-D INF_ENGINE_RELEASE`: https://github.com/openvinotoolkit/openvino/issues/1435
4 | + `ENABLE_AVX512F`: how often do you see such CPUs in clouds?
5 | + `avresample` from ffmpeg to opencv -- do we need it?
7 |
--------------------------------------------------------------------------------
/build/dldt/dldt_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # https://github.com/openvinotoolkit/openvino/wiki/CMakeOptionsForCustomCompilation
4 | # https://github.com/openvinotoolkit/openvino/issues/4527
5 | # -D ENABLE_OPENCV=OFF \
6 | # https://github.com/openvinotoolkit/openvino/issues/5100
7 | # -D BUILD_SHARED_LIBS=OFF \
8 | # -D BUILD_SHARED_LIBS=ON \
9 | # https://github.com/openvinotoolkit/openvino/issues/5209
10 | # -D NGRAPH_TOOLS_ENABLE=OFF \
11 | cmake -D CMAKE_BUILD_TYPE=Release \
12 | -D THREADING=TBB \
13 | -D ENABLE_MKL_DNN=ON \
14 | -D GEMM=JIT \
15 | -D ENABLE_FASTER_BUILD=ON \
16 | -D ENABLE_LTO=ON \
17 | -D ENABLE_VPU=ON \
18 | -D ENABLE_MYRIAD=ON \
19 | -D ENABLE_SSE42=ON \
20 | -D ENABLE_AVX2=ON \
21 | -D ENABLE_AVX512F=OFF \
22 | -D BUILD_TESTS=OFF \
23 | -D ENABLE_ALTERNATIVE_TEMP=OFF \
24 | -D ENABLE_CLDNN=OFF \
25 | -D ENABLE_CLDNN_TESTS=OFF \
26 | -D ENABLE_DOCS=OFF \
27 | -D ENABLE_GAPI_TESTS=OFF \
28 | -D ENABLE_GNA=OFF \
29 | -D ENABLE_OPENCV=OFF \
30 | -D ENABLE_PROFILING_ITT=OFF \
31 | -D ENABLE_PYTHON=OFF \
32 | -D ENABLE_SAMPLES=OFF \
33 | -D ENABLE_SPEECH_DEMO=OFF \
34 | -D ENABLE_TESTS=OFF \
35 | -D GAPI_TEST_PERF=OFF \
36 | -D NGRAPH_ONNX_IMPORT_ENABLE=ON \
37 | -D NGRAPH_TEST_UTIL_ENABLE=OFF \
38 | -D NGRAPH_TOOLS_ENABLE=OFF \
39 | -D NGRAPH_UNIT_TEST_ENABLE=OFF \
40 | -D SELECTIVE_BUILD=OFF ../../dldt/
41 |
--------------------------------------------------------------------------------
/build/ffmpeg/ffmpeg_premake.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Build ffmpeg shared libraries without version suffix
4 | # from
5 |
6 | OLD1='SLIBNAME_WITH_VERSION=$(SLIBNAME).$(LIBVERSION)'
7 | OLD2='SLIBNAME_WITH_MAJOR=$(SLIBNAME).$(LIBMAJOR)'
8 | OLD3='SLIB_INSTALL_NAME=$(SLIBNAME_WITH_VERSION)'
9 | OLD4='SLIB_INSTALL_LINKS=$(SLIBNAME_WITH_MAJOR) $(SLIBNAME)'
10 |
11 | NEW1='SLIBNAME_WITH_VERSION=$(SLIBNAME)'
12 | NEW2='SLIBNAME_WITH_MAJOR=$(SLIBNAME)'
13 | NEW3='SLIB_INSTALL_NAME=$(SLIBNAME)'
14 | NEW4='SLIB_INSTALL_LINKS='
15 |
16 |
17 | sed -i -e "s/${OLD1}/${NEW1}/" -e "s/${OLD2}/${NEW2}/" -e "s/${OLD3}/${NEW3}/" -e "s/${OLD4}/${NEW4}/" ./ffbuild/config.mak
18 |
--------------------------------------------------------------------------------
/build/ffmpeg/ffmpeg_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # deprecated: --enable-avresample , switch to libswresample
3 | # The libswresample library performs highly optimized audio resampling,
4 | # rematrixing and sample format conversion operations.
5 |
6 | PATH_TO_SCRIPT=`dirname $(realpath $0)`
7 |
8 | ../../ffmpeg/configure \
9 | --prefix=$PATH_TO_SCRIPT/binaries \
10 | --disable-programs \
11 | --disable-avdevice \
12 | --disable-postproc \
13 | --disable-static \
15 | --disable-swresample \
17 | --disable-avfilter \
18 | --disable-alsa \
19 | --disable-appkit \
20 | --disable-avfoundation \
21 | --disable-bzlib \
22 | --disable-coreimage \
23 | --disable-iconv \
24 | --disable-lzma \
25 | --disable-sndio \
26 | --disable-schannel \
27 | --disable-sdl2 \
28 | --disable-securetransport \
29 | --disable-xlib \
30 | --disable-zlib \
31 | --disable-audiotoolbox \
32 | --disable-amf \
33 | --disable-cuvid \
34 | --disable-d3d11va \
35 | --disable-dxva2 \
36 | --disable-ffnvcodec \
37 | --disable-nvdec \
38 | --disable-nvenc \
39 | --disable-v4l2-m2m \
40 | --disable-vaapi \
41 | --disable-vdpau \
42 | --disable-videotoolbox \
43 | --disable-doc \
45 | --enable-pic \
46 | --enable-shared \
47 |
--------------------------------------------------------------------------------
/build/openblas/openblas_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Please refer here for details:
4 | #
5 | # If you compile it with `make FC=gfortran`,
6 | # you'll need `libgfortran.so.4` and `libquadmath.so.0`
7 |
8 | cmake -D NO_LAPACKE=1 \
9 | -D CMAKE_BUILD_TYPE=Release \
10 | -D NOFORTRAN=ON \
11 | -D BUILD_RELAPACK=OFF \
12 | -D NO_AFFINITY=1 \
13 | -D USE_OPENMP=0 \
14 | -D NO_WARMUP=1 \
15 | -D NUM_THREADS=64 \
16 | -D GEMM_MULTITHREAD_THRESHOLD=64 \
17 | -D BUILD_SHARED_LIBS=ON \
18 | -D CMAKE_INSTALL_PREFIX=./ ../../openblas/
19 |
--------------------------------------------------------------------------------
/build/opencv/opencv_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # for CPU_BASELINE and CPU_DISPATCH see https://github.com/opencv/opencv/wiki/CPU-optimizations-build-options
4 | # they should match with ones for dldt/inference-engine/src/extension/cmake/OptimizationFlags.cmake
5 | #
6 | # -DINF_ENGINE_RELEASE= should match dldt version
7 | # See
8 | # From
9 | # "Force IE version, should be in form YYYYAABBCC (e.g. 2020.1.0.2 -> 2020010002)")
10 |
11 | tmp=$(pwd)
12 | ABS_PORTION=${tmp%%"/build/opencv"}
13 |
14 | FFMPEG_PATH=$ABS_PORTION/build/ffmpeg/binaries
15 | export LD_LIBRARY_PATH=$FFMPEG_PATH/lib/:$LD_LIBRARY_PATH
16 | export PKG_CONFIG_PATH=$FFMPEG_PATH/lib/pkgconfig:$PKG_CONFIG_PATH
17 | export PKG_CONFIG_LIBDIR=$FFMPEG_PATH/lib/:$PKG_CONFIG_LIBDIR
18 |
19 | # grep "5" from "Python 3.5.2"
20 | PY_VER=`$ABS_PORTION/venv/bin/python3 --version | sed -rn "s/Python .\.(.)\..$/\1/p"`
21 | PY_LIB_PATH=`find $ABS_PORTION/venv/lib/ -iname libpython3.${PY_VER}m.so`
22 |
23 |
24 | cmake -D CMAKE_BUILD_TYPE=RELEASE \
25 | -D BUILD_DOCS=OFF \
26 | -D BUILD_EXAMPLES=OFF \
27 | -D BUILD_JPEG=OFF \
29 | -D BUILD_PERF_TESTS=OFF \
30 | -D BUILD_SHARED_LIBS=OFF \
31 | -D BUILD_TESTS=OFF \
32 | -D BUILD_opencv_apps=OFF \
33 | -D BUILD_opencv_java=OFF \
34 | -D BUILD_opencv_python2.7=OFF \
35 | -D BUILD_opencv_python2=OFF \
36 | -D BUILD_opencv_python3=ON \
37 | -D BUILD_opencv_world=OFF \
38 | -D CMAKE_INSTALL_PREFIX=./binaries/ \
39 | -D CPU_BASELINE=SSE4_2 \
40 | -D CPU_DISPATCH=AVX,AVX2,FP16,AVX512 \
41 | -D CV_TRACE=OFF \
42 | -D ENABLE_CXX11=ON \
43 | -D ENABLE_PRECOMPILED_HEADERS=OFF \
44 | -D FFMPEG_INCLUDE_DIRS=$FFMPEG_PATH/include \
45 | -D INF_ENGINE_INCLUDE_DIRS=$ABS_PORTION/dldt/inference-engine/include \
46 | -D INF_ENGINE_LIB_DIRS=$ABS_PORTION/dldt/bin/intel64/Release/lib \
47 | -D INF_ENGINE_RELEASE=2021040200 \
48 | -D INSTALL_CREATE_DISTRIB=ON \
49 | -D INSTALL_C_EXAMPLES=OFF \
50 | -D INSTALL_PYTHON_EXAMPLES=OFF \
51 | -D JPEG_INCLUDE_DIR=$JPEG_INCLUDE_DIR \
52 | -D JPEG_LIBRARY=$JPEG_LIBRARY \
53 | -D OPENCV_ENABLE_NONFREE=OFF \
54 | -D OPENCV_FORCE_3RDPARTY_BUILD=ON \
55 | -D OPENCV_SKIP_PYTHON_LOADER=ON \
56 | -D PYTHON3_EXECUTABLE=$ABS_PORTION/venv/bin/python3 \
57 | -D PYTHON3_LIBRARY:PATH=$PY_LIB_PATH \
58 | -D PYTHON3_NUMPY_INCLUDE_DIRS:PATH=$ABS_PORTION/venv/lib/python3.${PY_VER}/site-packages/numpy/core/include \
59 | -D PYTHON3_PACKAGES_PATH=$ABS_PORTION/venv/lib/python3.${PY_VER}/site-packages \
60 | -D PYTHON_DEFAULT_EXECUTABLE=$ABS_PORTION/venv/bin/python3 \
61 | -D PYTHON_INCLUDE_DIR=/usr/include/python3.${PY_VER} \
62 | -D WITH_1394=OFF \
63 | -D WITH_CUDA=OFF \
64 | -D WITH_EIGEN=OFF \
65 | -D WITH_FFMPEG=ON \
66 | -D WITH_GSTREAMER=OFF \
67 | -D WITH_GTK=OFF \
68 | -D WITH_INF_ENGINE=ON \
69 | -D WITH_IPP=OFF \
70 | -D WITH_ITT=OFF \
71 | -D WITH_JASPER=OFF \
72 | -D WITH_NGRAPH=ON \
73 | -D WITH_OPENEXR=OFF \
74 | -D WITH_OPENMP=OFF \
75 | -D WITH_PNG=ON \
76 | -D WITH_PROTOBUF=ON \
77 | -D WITH_QT=OFF \
78 | -D WITH_TBB=ON \
79 | -D WITH_V4L=ON \
80 | -D WITH_VTK=OFF \
81 | -D WITH_WEBP=OFF \
82 | -D ngraph_DIR=$ABS_PORTION/build/dldt/ngraph ../../opencv
83 |
--------------------------------------------------------------------------------
/create_wheel/LICENSE:
--------------------------------------------------------------------------------
1 |
2 | Apache License
3 | Version 2.0, January 2004
4 | http://www.apache.org/licenses/
5 |
6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 |
8 | 1. Definitions.
9 |
10 | "License" shall mean the terms and conditions for use, reproduction,
11 | and distribution as defined by Sections 1 through 9 of this document.
12 |
13 | "Licensor" shall mean the copyright owner or entity authorized by
14 | the copyright owner that is granting the License.
15 |
16 | "Legal Entity" shall mean the union of the acting entity and all
17 | other entities that control, are controlled by, or are under common
18 | control with that entity. For the purposes of this definition,
19 | "control" means (i) the power, direct or indirect, to cause the
20 | direction or management of such entity, whether by contract or
21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
22 | outstanding shares, or (iii) beneficial ownership of such entity.
23 |
24 | "You" (or "Your") shall mean an individual or Legal Entity
25 | exercising permissions granted by this License.
26 |
27 | "Source" form shall mean the preferred form for making modifications,
28 | including but not limited to software source code, documentation
29 | source, and configuration files.
30 |
31 | "Object" form shall mean any form resulting from mechanical
32 | transformation or translation of a Source form, including but
33 | not limited to compiled object code, generated documentation,
34 | and conversions to other media types.
35 |
36 | "Work" shall mean the work of authorship, whether in Source or
37 | Object form, made available under the License, as indicated by a
38 | copyright notice that is included in or attached to the work
39 | (an example is provided in the Appendix below).
40 |
41 | "Derivative Works" shall mean any work, whether in Source or Object
42 | form, that is based on (or derived from) the Work and for which the
43 | editorial revisions, annotations, elaborations, or other modifications
44 | represent, as a whole, an original work of authorship. For the purposes
45 | of this License, Derivative Works shall not include works that remain
46 | separable from, or merely link (or bind by name) to the interfaces of,
47 | the Work and Derivative Works thereof.
48 |
49 | "Contribution" shall mean any work of authorship, including
50 | the original version of the Work and any modifications or additions
51 | to that Work or Derivative Works thereof, that is intentionally
52 | submitted to Licensor for inclusion in the Work by the copyright owner
53 | or by an individual or Legal Entity authorized to submit on behalf of
54 | the copyright owner. For the purposes of this definition, "submitted"
55 | means any form of electronic, verbal, or written communication sent
56 | to the Licensor or its representatives, including but not limited to
57 | communication on electronic mailing lists, source code control systems,
58 | and issue tracking systems that are managed by, or on behalf of, the
59 | Licensor for the purpose of discussing and improving the Work, but
60 | excluding communication that is conspicuously marked or otherwise
61 | designated in writing by the copyright owner as "Not a Contribution."
62 |
63 | "Contributor" shall mean Licensor and any individual or Legal Entity
64 | on behalf of whom a Contribution has been received by Licensor and
65 | subsequently incorporated within the Work.
66 |
67 | 2. Grant of Copyright License. Subject to the terms and conditions of
68 | this License, each Contributor hereby grants to You a perpetual,
69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
70 | copyright license to reproduce, prepare Derivative Works of,
71 | publicly display, publicly perform, sublicense, and distribute the
72 | Work and such Derivative Works in Source or Object form.
73 |
74 | 3. Grant of Patent License. Subject to the terms and conditions of
75 | this License, each Contributor hereby grants to You a perpetual,
76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
77 | (except as stated in this section) patent license to make, have made,
78 | use, offer to sell, sell, import, and otherwise transfer the Work,
79 | where such license applies only to those patent claims licensable
80 | by such Contributor that are necessarily infringed by their
81 | Contribution(s) alone or by combination of their Contribution(s)
82 | with the Work to which such Contribution(s) was submitted. If You
83 | institute patent litigation against any entity (including a
84 | cross-claim or counterclaim in a lawsuit) alleging that the Work
85 | or a Contribution incorporated within the Work constitutes direct
86 | or contributory patent infringement, then any patent licenses
87 | granted to You under this License for that Work shall terminate
88 | as of the date such litigation is filed.
89 |
90 | 4. Redistribution. You may reproduce and distribute copies of the
91 | Work or Derivative Works thereof in any medium, with or without
92 | modifications, and in Source or Object form, provided that You
93 | meet the following conditions:
94 |
95 | (a) You must give any other recipients of the Work or
96 | Derivative Works a copy of this License; and
97 |
98 | (b) You must cause any modified files to carry prominent notices
99 | stating that You changed the files; and
100 |
101 | (c) You must retain, in the Source form of any Derivative Works
102 | that You distribute, all copyright, patent, trademark, and
103 | attribution notices from the Source form of the Work,
104 | excluding those notices that do not pertain to any part of
105 | the Derivative Works; and
106 |
107 | (d) If the Work includes a "NOTICE" text file as part of its
108 | distribution, then any Derivative Works that You distribute must
109 | include a readable copy of the attribution notices contained
110 | within such NOTICE file, excluding those notices that do not
111 | pertain to any part of the Derivative Works, in at least one
112 | of the following places: within a NOTICE text file distributed
113 | as part of the Derivative Works; within the Source form or
114 | documentation, if provided along with the Derivative Works; or,
115 | within a display generated by the Derivative Works, if and
116 | wherever such third-party notices normally appear. The contents
117 | of the NOTICE file are for informational purposes only and
118 | do not modify the License. You may add Your own attribution
119 | notices within Derivative Works that You distribute, alongside
120 | or as an addendum to the NOTICE text from the Work, provided
121 | that such additional attribution notices cannot be construed
122 | as modifying the License.
123 |
124 | You may add Your own copyright statement to Your modifications and
125 | may provide additional or different license terms and conditions
126 | for use, reproduction, or distribution of Your modifications, or
127 | for any such Derivative Works as a whole, provided Your use,
128 | reproduction, and distribution of the Work otherwise complies with
129 | the conditions stated in this License.
130 |
131 | 5. Submission of Contributions. Unless You explicitly state otherwise,
132 | any Contribution intentionally submitted for inclusion in the Work
133 | by You to the Licensor shall be under the terms and conditions of
134 | this License, without any additional terms or conditions.
135 | Notwithstanding the above, nothing herein shall supersede or modify
136 | the terms of any separate license agreement you may have executed
137 | with Licensor regarding such Contributions.
138 |
139 | 6. Trademarks. This License does not grant permission to use the trade
140 | names, trademarks, service marks, or product names of the Licensor,
141 | except as required for reasonable and customary use in describing the
142 | origin of the Work and reproducing the content of the NOTICE file.
143 |
144 | 7. Disclaimer of Warranty. Unless required by applicable law or
145 | agreed to in writing, Licensor provides the Work (and each
146 | Contributor provides its Contributions) on an "AS IS" BASIS,
147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148 | implied, including, without limitation, any warranties or conditions
149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150 | PARTICULAR PURPOSE. You are solely responsible for determining the
151 | appropriateness of using or redistributing the Work and assume any
152 | risks associated with Your exercise of permissions under this License.
153 |
154 | 8. Limitation of Liability. In no event and under no legal theory,
155 | whether in tort (including negligence), contract, or otherwise,
156 | unless required by applicable law (such as deliberate and grossly
157 | negligent acts) or agreed to in writing, shall any Contributor be
158 | liable to You for damages, including any direct, indirect, special,
159 | incidental, or consequential damages of any character arising as a
160 | result of this License or out of the use or inability to use the
161 | Work (including but not limited to damages for loss of goodwill,
162 | work stoppage, computer failure or malfunction, or any and all
163 | other commercial damages or losses), even if such Contributor
164 | has been advised of the possibility of such damages.
165 |
166 | 9. Accepting Warranty or Additional Liability. While redistributing
167 | the Work or Derivative Works thereof, You may choose to offer,
168 | and charge a fee for, acceptance of support, warranty, indemnity,
169 | or other liability obligations and/or rights consistent with this
170 | License. However, in accepting such obligations, You may act only
171 | on Your own behalf and on Your sole responsibility, not on behalf
172 | of any other Contributor, and only if You agree to indemnify,
173 | defend, and hold each Contributor harmless for any liability
174 | incurred by, or claims asserted against, such Contributor by reason
175 | of your accepting any such warranty or additional liability.
176 |
177 | END OF TERMS AND CONDITIONS
178 |
179 | APPENDIX: How to apply the Apache License to your work.
180 |
181 | To apply the Apache License to your work, attach the following
182 | boilerplate notice, with the fields enclosed by brackets "[]"
183 | replaced with your own identifying information. (Don't include
184 | the brackets!) The text should be enclosed in the appropriate
185 | comment syntax for the file format. We also recommend that a
186 | file or class name and description of purpose be included on the
187 | same "printed page" as the copyright notice for easier
188 | identification within third-party archives.
189 |
190 | Copyright [yyyy] [name of copyright owner]
191 |
192 | Licensed under the Apache License, Version 2.0 (the "License");
193 | you may not use this file except in compliance with the License.
194 | You may obtain a copy of the License at
195 |
196 | http://www.apache.org/licenses/LICENSE-2.0
197 |
198 | Unless required by applicable law or agreed to in writing, software
199 | distributed under the License is distributed on an "AS IS" BASIS,
200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201 | See the License for the specific language governing permissions and
202 | limitations under the License.
203 |
--------------------------------------------------------------------------------
/create_wheel/LICENSE_MIT:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Kabakov Borys
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/create_wheel/README.md:
--------------------------------------------------------------------------------
1 | # README
2 |
3 | This is a pre-built [OpenCV](https://github.com/opencv/opencv) with [Inference Engine](https://github.com/openvinotoolkit/openvino) module package for Python3.
4 | You need that module if you want to run models from [Intel's model zoo](https://github.com/openvinotoolkit/open_model_zoo).
5 |
6 | It is built with `ffmpeg` and `v4l` but without GTK/QT (use `matplotlib` for plotting your results).
7 | Contrib modules and haarcascades are not included.
8 |
9 | For additional info visit the [project homepage](https://github.com/banderlog/opencv-python-inference-engine)
10 |
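A minimal smoke test for an installed wheel, mirroring `tests/tests.py` (the video file name is an assumption -- use any media you have):

```python
import cv2

# decode one video frame through the bundled ffmpeg libs
cap = cv2.VideoCapture('short_video.mp4')
ok, frame = cap.read()
assert ok and frame is not None
print(cv2.__version__)
```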
--------------------------------------------------------------------------------
/create_wheel/cv2/__init__.py:
--------------------------------------------------------------------------------
1 | import importlib
2 | from .cv2 import *
3 |
4 | # wildcard import above does not import "private" variables like __version__
5 | # this makes them available
6 | globals().update(importlib.import_module('cv2.cv2').__dict__)
7 |
--------------------------------------------------------------------------------
/create_wheel/setup.py:
--------------------------------------------------------------------------------
1 | import setuptools
2 |
3 |
4 | with open("README.md", "r") as fh:
5 | long_description = fh.read()
6 |
7 |
8 | # This creates a list which is empty but returns a length of 1.
9 | # Should make the wheel a binary distribution and platlib compliant.
10 | # from
11 | class EmptyListWithLength(list):
12 | def __len__(self):
13 | return 1
14 |
15 |
16 | setuptools.setup(
17 | name='opencv-python-inference-engine',
18 | version='2022.01.05',
19 | url="https://github.com/banderlog/opencv-python-inference-engine",
20 | maintainer="Kabakov Borys",
21 | license='MIT, Apache 2.0',
22 | description="Wrapper package for OpenCV with Inference Engine python bindings",
23 | long_description=long_description,
24 | long_description_content_type="text/markdown",
25 | ext_modules=EmptyListWithLength(),
26 | packages=['cv2'],
27 | package_data={'cv2': ['*.so*', '*.mvcmd', '*.xml']},
28 | include_package_data=True,
29 | install_requires=['numpy'],
30 | classifiers=[
31 | 'Development Status :: 5 - Production/Stable',
32 | 'Environment :: Console',
33 | 'Intended Audience :: Developers',
34 | 'Intended Audience :: Education',
35 | 'Intended Audience :: Information Technology',
36 | 'Intended Audience :: Science/Research',
37 | 'Programming Language :: Python :: 3',
38 | 'Programming Language :: C++',
39 | 'Operating System :: POSIX :: Linux',
40 | 'Topic :: Scientific/Engineering',
41 | 'Topic :: Scientific/Engineering :: Image Recognition',
42 | 'Topic :: Software Development',
43 | ],
44 | )
--------------------------------------------------------------------------------
/download_all_stuff.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # colors
4 | end="\033[0m"
5 | red="\033[0;31m"
6 | green="\033[0;32m"
7 |
8 | green () {
9 | echo -e "${green}${1}${end}"
10 | }
11 |
12 |
13 | red () {
14 | echo -e "${red}${1}${end}"
15 | }
16 |
17 |
18 | ROOT_DIR=$(pwd)
19 |
20 | # check Ubuntu version (20.04 build will not work on 18.04)
21 | if test $(lsb_release -rs) != 18.04; then
22 | red "\n!!! You are NOT on Ubuntu 18.04 !!!\n"
23 | fi
24 |
25 | green "RESET GIT SUBMODULES"
26 | # git checkout dev
27 | # for update use `git submodule update --init --recursive --jobs=4`
28 | # cd submodule dir and `git fetch --tags && git checkout tags/`
29 | git submodule update --init --recursive --depth=1 --jobs=4
30 | # the command to restore changes differs between git versions (e.g., `restore`)
31 | git submodule foreach --recursive git checkout .
32 | # remove untracked
33 | git submodule foreach --recursive git clean -dxf
34 |
35 | green "CLEAN BUILD DIRS"
36 | find build/dldt/ -mindepth 1 -not -name 'dldt_setup.sh' -not -name '*.patch' -delete
37 | find build/opencv/ -mindepth 1 -not -name 'opencv_setup.sh' -delete
38 | find build/ffmpeg/ -mindepth 1 -not -name 'ffmpeg_*.sh' -delete
39 |
40 | green "CLEAN WHEEL DIR"
41 | find create_wheel/cv2/ -type f -not -name '__init__.py' -delete
42 | rm -drf create_wheel/build
43 | rm -drf create_wheel/dist
44 | rm -drf create_wheel/*egg-info
45 |
46 | green "CREATE VENV"
47 | cd $ROOT_DIR
48 |
49 | if [[ ! -d ./venv ]]; then
50 | virtualenv --clear --always-copy -p /usr/bin/python3 ./venv
51 | ./venv/bin/pip3 install -r requirements.txt
52 | fi
53 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy
2 |
--------------------------------------------------------------------------------
/tests/README.md:
--------------------------------------------------------------------------------
1 | # Tests for opencv-python-inference-engine wheel
2 |
3 | ## Requirements
4 |
5 | `sudo apt install virtualenv`
6 |
7 | ## Usage
8 |
9 | ### Features
10 |
11 | Just run bash script and read output.
12 |
13 | ```bash
14 | cd tests
15 | ./prepare_and_run_tests.sh
16 | ```
17 |
18 | ### Inference speed
19 |
20 | Something like below. The general idea is to measure only the inference speed, without preprocessing and decoding.
21 | Also, the 1st inference must not be counted, because it loads everything into memory.
22 |
23 | **NB:** be strict about Backend and Target
24 |
25 | ```python
26 | import cv2
27 |
28 | class PixelLinkDetectorTest():
29 | """ Cut version of PixelLinkDetector """
30 | def __init__(self, xml_model_path: str):
31 | self.net = cv2.dnn.readNet(xml_model_path, xml_model_path[:-3] + 'bin')
32 |
33 | def detect(self, img: 'np.ndarray') -> None:
34 | blob = cv2.dnn.blobFromImage(img, 1, (1280, 768))
35 | self.net.setInput(blob)
36 | out_layer_names = self.net.getUnconnectedOutLayersNames()
37 | return self.net.forward(out_layer_names)
38 |
39 |
40 | # check opencv version
41 | cv2.__version__
42 |
43 | # read img and network
44 | img = cv2.imread('helloworld.png')
45 | detector = PixelLinkDetectorTest('text-detection-0004.xml')
46 |
47 | # select target & backend, please read the documentation for details:
48 | #
49 | detector.net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
50 | detector.net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
51 |
52 | # 1st inference does not count
53 | links, pixels = detector.detect(img)
54 |
55 | # use magic function
56 | %timeit links, pixels = detector.detect(img)
57 | ```
58 |
59 |
60 | ## Models
61 |
62 | + [rateme](https://github.com/banderlog/rateme) (YOLO3)
63 | + [text-detection-0004](https://github.com/opencv/open_model_zoo/blob/master/models/intel/text-detection-0004/description/text-detection-0004.md)
64 | + [text-recognition-0012](https://github.com/opencv/open_model_zoo/blob/master/models/intel/text-recognition-0012/description/text-recognition-0012.md)
65 |
66 | ## Files
67 |
68 | + `short_video.mp4` from [here](https://www.pexels.com/video/a-cattails-fluff-floats-in-air-2156021/) (free)
69 | + `dislike.jpg` from [rateme repository](https://github.com/banderlog/rateme/blob/master/test_imgs/dislike.jpg)
70 | + `helloworld.png` -- I either made it myself or forgot where it was downloaded from
71 |
--------------------------------------------------------------------------------
/tests/dislike.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/banderlog/opencv-python-inference-engine/0abe6990b938275c48ad990f2b484dd03ad0f39d/tests/dislike.jpg
--------------------------------------------------------------------------------
/tests/helloworld.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/banderlog/opencv-python-inference-engine/0abe6990b938275c48ad990f2b484dd03ad0f39d/tests/helloworld.png
--------------------------------------------------------------------------------
/tests/pixellink.py:
--------------------------------------------------------------------------------
1 | """ Wrapper class for Intel's PixelLink realisation (text segmentation NN)
2 | text-detection-00[34]
3 |
4 | For text-detection-002 you'll need to uncomment string in detect()
5 | """
6 | import cv2
7 | import numpy as np
8 | from skimage.morphology import label
9 | from skimage.measure import regionprops
10 | from typing import List, Tuple
11 | from skimage.measure._regionprops import RegionProperties
12 |
13 |
14 | class PixelLinkDetector():
15 | """ Wrapper class for Intel's version of PixelLink text detector
16 |
17 | See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \
18 | text-detection-0004/description/text-detection-0004.md
19 |
20 | :param xml_model_path: path to XML file
21 |
22 | **Example:**
23 |
24 | .. code-block:: python
25 | detector = PixelLinkDetector('text-detection-0004.xml')
26 | img = cv2.imread('tmp.jpg')
27 | # ~250ms on i7-6700K
28 | detector.detect(img)
29 | # ~2ms
30 | bboxes = detector.decode()
31 | """
32 | def __init__(self, xml_model_path: str, txt_threshold=0.5):
33 | """
34 | :param xml_model_path: path to model's XML file
35 | :param txt_threshold: confidence, defaults to ``0.5``
36 | """
37 | self._net = cv2.dnn.readNet(xml_model_path, xml_model_path[:-3] + 'bin')
38 | self._txt_threshold = txt_threshold
39 |
40 | def detect(self, img: np.ndarray) -> None:
41 | """ GetPixelLink's outputs (BxCxHxW):
42 | + [1x16x192x320] - logits related to linkage between pixels and their neighbors
43 | + [1x2x192x320] - logits related to text/no-text classification for each pixel
44 |
45 | B - batch size
46 | C - number of channels
47 | H - image height
48 | W - image width
49 |
50 | :param img: image as ``numpy.ndarray``
51 | """
52 | self._img_shape = img.shape
53 | blob = cv2.dnn.blobFromImage(img, 1, (1280, 768))
54 | self._net.setInput(blob)
55 | out_layer_names = self._net.getUnconnectedOutLayersNames()
56 | # for text-detection-002
57 | # self.pixels, self.links = self._net.forward(out_layer_names)
58 | # for text-detection-00[34]
59 | self.links, self.pixels = self._net.forward(out_layer_names)
60 |
61 | def get_mask(self) -> np.ndarray:
62 | """ Get binary mask of detected text pixels
63 | """
64 | pixel_mask = self._get_pixel_scores() >= self._txt_threshold
65 | return pixel_mask.astype(np.uint8)
66 |
67 | def _logsumexp(self, a: np.ndarray, axis=-1) -> np.ndarray:
68 | """ Castrated function from scipy
69 | https://github.com/scipy/scipy/blob/v1.6.2/scipy/special/_logsumexp.py
70 |
71 | Compute the log of the sum of exponentials of input elements.
72 | """
73 | a_max = np.amax(a, axis=axis, keepdims=True)
74 | tmp = np.exp(a - a_max)
75 | s = np.sum(tmp, axis=axis, keepdims=True)
76 | out = np.log(s)
77 | out += a_max
78 | return out
79 |
80 | def _get_pixel_scores(self) -> np.ndarray:
81 | """ get softmaxed properly shaped pixel scores """
82 | # move channels to the end
83 | tmp = np.transpose(self.pixels, (0, 2, 3, 1))
84 | # softmax from scipy
85 | tmp = np.exp(tmp - self._logsumexp(tmp, axis=-1))
86 | # select single-batch, single-channel values
87 | return tmp[0, :, :, 1]
88 |
89 | def _get_txt_regions(self, pixel_mask: np.ndarray) -> List[RegionProperties]:
90 | """ kernels are class dependent """
91 | img_h, img_w = self._img_shape[:2]
92 | _, mask = cv2.threshold(pixel_mask, 0, 1, cv2.THRESH_BINARY)
93 | # transformations:
94 | # the kernel size should be image-size dependent (default (21, 21));
95 | # on small images it will connect separate words
96 | txt_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
97 | mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, txt_kernel)
98 | # connect regions on mask of original img size
99 | mask = cv2.resize(mask, (img_w, img_h), interpolation=cv2.INTER_NEAREST)
100 | # Label connected regions of an integer array
101 | mask = label(mask, background=0, connectivity=2)
102 | # Measure properties of labeled image regions.
103 | txt_regions = regionprops(mask)
104 | return txt_regions
105 |
106 | def _get_txt_bboxes(self, txt_regions: List[RegionProperties]) -> List[Tuple[int, int, int, int]]:
107 | """ Filter text area by area and height
108 |
109 | :return: ``[(ymin, xmin, ymax, xmax)]``
110 | """
111 | min_area = 0
112 | min_height = 4
113 | boxes = []
114 | for p in txt_regions:
115 | if p.area > min_area:
116 | bbox = p.bbox
117 | if (bbox[2] - bbox[0]) > min_height:
118 | boxes.append(bbox)
119 | return boxes
120 |
121 | def decode(self) -> List[Tuple[int, int, int, int]]:
122 | """ Decode PixelLink's output
123 |
124 | :return: bounding_boxes
125 |
126 | .. note::
127 | bounding_boxes format: [ymin, xmin, ymax, xmax]
128 |
129 | """
130 | mask = self.get_mask()
131 | bboxes = self._get_txt_bboxes(self._get_txt_regions(mask))
132 | # sort by xmin, ymin
133 | bboxes = sorted(bboxes, key=lambda x: (x[1], x[0]))
134 | return bboxes
135 |
--------------------------------------------------------------------------------
/tests/prepare_and_run_tests.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | green="\033[0;32m"
4 | red="\033[0;31m"
5 | end="\033[0m"
6 |
7 | green () {
8 | echo -e "${green}${1}${end}"
9 | }
10 |
11 | red () {
12 | echo -e "${red}${1}${end}"
13 | }
14 |
15 |
16 | # check if (no ARG was given and no appropriate wheel file has been compiled) or
17 | # (some args were provided but arg1 is not an existing file)
18 | # of course, you could still shoot yourself in the foot here in different ways
19 | if ([ ! $# -ge 1 ] && ! $(ls ../create_wheel/dist/opencv_python_inference_engine*.whl &> /dev/null)) ||
20 | ([ $# -ge 1 ] && [ ! -f $1 ]); then
21 | red "How do you propose to run wheel tests without a wheel?"
22 | red "Compile it or provide it as ARG1 to the script"
23 | exit 1
24 | fi
25 |
26 | echo "======================================================================"
27 | green "CREATE SEPARATE TEST VENV"
28 | if [ ! -d ./venv_t ]; then
29 | virtualenv --clear --always-copy -p /usr/bin/python3 ./venv_t
30 | fi
31 |
32 |
33 | green "INSTALLING DEPENDENCIES"
34 | if [ $1 ]; then
35 | # install ARGV1
36 | green "Installing from provided path"
37 | WHEEL="$1"
38 | else
39 | # install compiled wheel
40 | green "Installing from default path"
41 | WHEEL=$(realpath ../create_wheel/dist/opencv_python_inference_engine*.whl)
42 | fi
43 |
44 | ./venv_t/bin/pip3 install --force-reinstall "$WHEEL"
45 | ./venv_t/bin/pip3 install -r requirements.txt
46 |
47 |
48 | green "GET MODELS"
49 |
50 | if [ ! -d "rateme" ]; then
51 | ./venv_t/bin/pip3 install "https://github.com/banderlog/rateme/releases/download/v0.1.1/rateme-0.1.1.tar.gz"
52 | fi
53 |
54 | # urls, filenames and checksums are from:
55 | # +
56 | # +
57 | declare -a models=("text-detection-0004.xml"
58 | "text-detection-0004.bin"
59 | "text-recognition-0012.xml"
60 | "text-recognition-0012.bin")
61 |
62 | url_start="https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1"
63 |
64 | for i in "${models[@]}"; do
65 | # if no such file
66 | if [ ! -f $i ]; then
67 | # download
68 | wget "${url_start}/${i%.*}/FP32/${i}"
69 | else
70 | # checksum
71 | sha256sum -c "${i}.sha256sum" || red "PROBLEMS ^^^"
72 | fi
73 | done
74 |
75 | green "For \"$WHEEL\""
76 | green "RUN TESTS with ./venv_t/bin/python ./tests.py"
77 | ./venv_t/bin/python ./tests.py
78 |
--------------------------------------------------------------------------------
/tests/requirements.txt:
--------------------------------------------------------------------------------
1 | scikit-image
2 |
--------------------------------------------------------------------------------
/tests/short_video.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/banderlog/opencv-python-inference-engine/0abe6990b938275c48ad990f2b484dd03ad0f39d/tests/short_video.mp4
--------------------------------------------------------------------------------
/tests/tests.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | import cv2
3 | from pixellink import PixelLinkDetector
4 | from text_recognition import TextRecognizer
5 | from rateme.utils import RateMe
6 |
7 |
8 | class TestPackage(unittest.TestCase):
9 |
10 | def test_dnn_module(self):
11 | model = RateMe()
12 | img = cv2.imread('dislike.jpg')
13 | answer = model.predict(img)
14 | self.assertEqual(answer, 'dislike')
15 | print('rateme: passed')
16 |
17 | def test_inference_engine(self):
18 | img = cv2.imread('helloworld.png')
19 | detector4 = PixelLinkDetector('text-detection-0004.xml')
20 | detector4.detect(img)
21 | bboxes = detector4.decode()
22 |
23 | recognizer12 = TextRecognizer('./text-recognition-0012.xml')
24 | answer = recognizer12.do_ocr(img, bboxes)
25 | self.assertEqual(answer, ['hello', 'world'])
26 | print('text detection and recognition: passed')
27 |
28 | def test_ffmpeg(self):
29 | cap = cv2.VideoCapture('short_video.mp4')
30 | answer, img = cap.read()
31 | self.assertTrue(answer)
32 | print('video opening: passed')
33 |
34 |
35 | if __name__ == '__main__':
36 | unittest.main()
37 |
--------------------------------------------------------------------------------
/tests/text-detection-0004.bin.sha256sum:
--------------------------------------------------------------------------------
1 | 6da6456f27123be2d9a0e68bb73a7750f6aaee2f0af75d7f34ec6fa97f6727dc text-detection-0004.bin
2 |
--------------------------------------------------------------------------------
/tests/text-detection-0004.xml.sha256sum:
--------------------------------------------------------------------------------
1 | 244f836e36d63c9bd45b2123f4b9e4672cae6be348c15cac857d75a8b9852dd7 text-detection-0004.xml
2 |
--------------------------------------------------------------------------------
/tests/text-recognition-0012.bin.sha256sum:
--------------------------------------------------------------------------------
1 | b0d99549692baeea3e83709a671844a365b15bd40e36d9a5d3ef5368a69d2897 text-recognition-0012.bin
2 |
--------------------------------------------------------------------------------
/tests/text-recognition-0012.xml.sha256sum:
--------------------------------------------------------------------------------
1 | 54fd8ae6ea5ae11fdeb85f5c6b701793c28883f1e3dd8c3a531c43db6c3713ea text-recognition-0012.xml
2 |
--------------------------------------------------------------------------------
/tests/text_recognition.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import numpy as np
3 | from typing import List
4 |
5 |
6 | class TextRecognizer():
7 | def __init__(self, xml_model_path: str):
8 | """ Class for the Intels' OCR model pipeline
9 |
10 | See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \
11 | text-recognition-0012/description/text-recognition-0012.md
12 |
13 | :param xml_model_path: path to model's XML file
14 | """
15 | # load model
16 | self._net = cv2.dnn.readNetFromModelOptimizer(xml_model_path, xml_model_path[:-3] + 'bin')
17 |
18 | def _get_confidences(self, img: np.ndarray, box: tuple) -> np.ndarray:
19 | """ get OCR prediction confidences from a part of image in memory
20 |
21 | :param img: BGR image
22 | :param box: (ymin, xmin, ymax, xmax)
23 |
24 | :return: blob with the shape [30, 1, 37] in the format [WxBxL], where:
25 | W - output sequence length
26 | B - batch size
27 | L - confidence distribution across alphanumeric symbols:
28 | "0123456789abcdefghijklmnopqrstuvwxyz#", where # - special
29 | blank character for CTC decoding algorithm.
30 | """
31 | y1, x1, y2, x2 = box
32 | img = img[y1:y2, x1:x2]
33 | img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
34 | blob = cv2.dnn.blobFromImage(img, 1, (120, 32))
35 | self._net.setInput(blob)
36 | outs = self._net.forward()
37 | return outs
38 |
39 | def do_ocr(self, img: np.ndarray, bboxes: List[tuple]) -> List[str]:
40 | """ Run OCR pipeline with greedy decoder for each single word (bbox)
41 |
42 | :param img: BGR image
43 | :param bboxes: list of separate word bboxes (ymin, xmin, ymax, xmax)
44 |
45 | :return: recognized words
46 |
47 | For TF version use:
48 |
49 | .. code-block:: python
50 |
51 | # 30 is `confs.shape[0]` it is fixed
52 | a, b = tf.nn.ctc_beam_search_decoder(confs, np.array([30]))
53 | idxs_no_blanks = tf.sparse.to_dense(a[0])[0].numpy()
54 | word = ''.join(char_vec[idxs_no_blanks])
55 | """
56 | words = []
57 | # net could detect only these chars
58 | char_vec = np.array(list("0123456789abcdefghijklmnopqrstuvwxyz#"))
59 |
60 | for box in bboxes:
61 | # confidence distribution across symbols
62 | confs = self._get_confidences(img, box)
63 | # get maximal confidence for the whole beam width aka greedy decoder
64 | idxs = confs[:, 0, :].argmax(axis=1)
65 | # drop blank characters '#' with id == 36 in char_vec
66 | # supposedly we take only separate words as input
67 | idxs_no_blanks = idxs[idxs != 36]
68 | # join into a string
69 | word = ''.join(char_vec[idxs_no_blanks])
70 | words.append(word)
71 |
72 | return words
73 |
--------------------------------------------------------------------------------
/tests_openvino/README.md:
--------------------------------------------------------------------------------
1 | The only way to download models is through the model downloader; manual download is no longer available:
2 | -
3 | -
4 | -
5 | -
6 | - (you need to clone the GitHub repo to get them)
7 |
8 | Sometimes models are backwards compatible with a newer OpenVINO version, sometimes not.
9 | Sometimes new model versions stop working altogether.
10 |
11 | The IE API for loading and running networks is now deprecated; use the openvino API instead:
12 | - see the differences in `pixellink.py` and `text_recognition.py` between the `tests` and `tests_openvino` folders
13 | -
14 |
--------------------------------------------------------------------------------
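A minimal side-by-side sketch of the API difference the README points to, taken from the two versions of these files (the model path is a placeholder):

.. code-block:: python

    import cv2
    from openvino.runtime import Core

    xml_path = 'text-detection-0004.xml'  # placeholder path

    # deprecated style (tests/): OpenCV dnn reads the IR xml/bin pair
    net_old = cv2.dnn.readNet(xml_path, xml_path[:-3] + 'bin')

    # current style (tests_openvino/): openvino runtime reads and compiles it
    core = Core()
    net_new = core.compile_model(core.read_model(xml_path), device_name="CPU")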
/tests_openvino/pixellink.py:
--------------------------------------------------------------------------------
1 | """ Wrapper class for Intel's PixelLink realisation (text segmentation NN)
2 | text-detection-00[34]
3 |
4 | For text-detection-002 you'll need to uncomment the relevant line in detect()
5 | """
6 | import cv2
7 | import numpy as np
8 | from openvino.runtime import Core
9 | from skimage.morphology import label
10 | from skimage.measure import regionprops
11 | from typing import List, Tuple
12 | from skimage.measure._regionprops import RegionProperties
13 |
14 |
15 | class PixelLinkDetector():
16 | """ Wrapper class for Intel's version of PixelLink text detector
17 |
18 | See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \
19 | text-detection-0004/description/text-detection-0004.md
20 |
21 | :param xml_model_path: path to XML file
22 |
23 | **Example:**
24 |
25 | .. code-block:: python
26 |
27 | detector = PixelLinkDetector('text-detection-0004.xml')
28 | img = cv2.imread('tmp.jpg')
29 | detector.detect(img)  # ~250ms on i7-6700K
30 | bboxes = detector.decode()  # ~2ms
31 |
32 | """
33 | def __init__(self, xml_model_path: str, txt_threshold=0.5):
34 | """
35 | :param xml_model_path: path to model's XML file
36 | :param txt_threshold: text/no-text confidence threshold, defaults to ``0.5``
37 | """
38 | ie = Core()
39 | model = ie.read_model(xml_model_path)
40 | self._net = ie.compile_model(model=model, device_name="CPU")
41 | #self._net = cv2.dnn.readNet(xml_model_path, xml_model_path[:-3] + 'bin')
42 | self._txt_threshold = txt_threshold
43 |
44 | def detect(self, img: np.ndarray) -> None:
45 | """ GetPixelLink's outputs (BxCxHxW):
46 | + [1x16x192x320] - logits related to linkage between pixels and their neighbors
47 | + [1x2x192x320] - logits related to text/no-text classification for each pixel
48 |
49 | B - batch size
50 | C - number of channels
51 | H - image height
52 | W - image width
53 |
54 | :param img: image as ``numpy.ndarray``
55 | """
56 | #input_layer = self._net.input(0)
57 | output_layer_1 = self._net.output(0)
58 | output_layer_2 = self._net.output(1)
59 | self._img_shape = img.shape
60 | blob = cv2.dnn.blobFromImage(img, 1, (1280, 768))
61 | # old cv2.dnn API equivalent:
62 | #self._net.setInput(blob)
63 | #out_layer_names = self._net.getUnconnectedOutLayersNames()
64 | # for text-detection-002:
65 | #self.pixels, self.links = self._net.forward(out_layer_names)
66 | # for text-detection-00[34]:
67 | #self.links, self.pixels = self._net.forward(out_layer_names)
68 | # the openvino API returns a dict keyed by output port,
69 | # so each tensor is fetched explicitly:
70 | out = self._net([blob])
71 | self.links = out[output_layer_1]
72 | self.pixels = out[output_layer_2]
73 |
74 | def get_mask(self) -> np.ndarray:
75 | """ Get binary mask of detected text pixels
76 | """
77 | pixel_mask = self._get_pixel_scores() >= self._txt_threshold
78 | return pixel_mask.astype(np.uint8)
79 |
80 | def _logsumexp(self, a: np.ndarray, axis=-1) -> np.ndarray:
81 | """ Castrated function from scipy
82 | https://github.com/scipy/scipy/blob/v1.6.2/scipy/special/_logsumexp.py
83 |
84 | Compute the log of the sum of exponentials of input elements.
85 | """
86 | a_max = np.amax(a, axis=axis, keepdims=True)
87 | tmp = np.exp(a - a_max)
88 | s = np.sum(tmp, axis=axis, keepdims=True)
89 | out = np.log(s)
90 | out += a_max
91 | return out
92 |
93 | def _get_pixel_scores(self) -> np.ndarray:
94 | """ get softmaxed properly shaped pixel scores """
95 | # move channels to the end
96 | tmp = np.transpose(self.pixels, (0, 2, 3, 1))
97 | # softmax from scipy
98 | tmp = np.exp(tmp - self._logsumexp(tmp, axis=-1))
99 | # select single-batch, single-channel values
100 | return tmp[0, :, :, 1]
101 |
102 | def _get_txt_regions(self, pixel_mask: np.ndarray) -> List[RegionProperties]:
103 | """ kernels are class dependent """
104 | img_h, img_w = self._img_shape[:2]
105 | _, mask = cv2.threshold(pixel_mask, 0, 1, cv2.THRESH_BINARY)
106 | # morphological transformations
107 | # kernel size should be image size dependent (default (21, 21));
108 | # on a small image a large kernel will connect separate words
109 | txt_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
110 | mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, txt_kernel)
111 | # connect regions on mask of original img size
112 | mask = cv2.resize(mask, (img_w, img_h), interpolation=cv2.INTER_NEAREST)
113 | # Label connected regions of an integer array
114 | mask = label(mask, background=0, connectivity=2)
115 | # Measure properties of labeled image regions.
116 | txt_regions = regionprops(mask)
117 | return txt_regions
118 |
119 | def _get_txt_bboxes(self, txt_regions: List[RegionProperties]) -> List[Tuple[int, int, int, int]]:
120 | """ Filter text area by area and height
121 |
122 | :return: ``[(ymin, xmin, ymax, xmax)]``
123 | """
124 | min_area = 0
125 | min_height = 4
126 | boxes = []
127 | for p in txt_regions:
128 | if p.area > min_area:
129 | bbox = p.bbox
130 | if (bbox[2] - bbox[0]) > min_height:
131 | boxes.append(bbox)
132 | return boxes
133 |
134 | def decode(self) -> List[Tuple[int, int, int, int]]:
135 | """ Decode PixelLink's output
136 |
137 | :return: bounding_boxes
138 |
139 | .. note::
140 | bounding_boxes format: [ymin ,xmin ,ymax, xmax]
141 |
142 | """
143 | mask = self.get_mask()
144 | bboxes = self._get_txt_bboxes(self._get_txt_regions(mask))
145 | # sort by xmin, ymin
146 | bboxes = sorted(bboxes, key=lambda x: (x[1], x[0]))
147 | return bboxes
148 |
--------------------------------------------------------------------------------
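The `_logsumexp` helper above exists to make the softmax in `_get_pixel_scores` numerically stable: `exp(x - logsumexp(x))` equals `exp(x) / sum(exp(x))`, but it does not overflow on large logits. A quick standalone check (the logits here are synthetic, not real network output):

.. code-block:: python

    import numpy as np

    def logsumexp(a, axis=-1):
        a_max = np.amax(a, axis=axis, keepdims=True)
        return np.log(np.sum(np.exp(a - a_max), axis=axis, keepdims=True)) + a_max

    logits = np.array([1000.0, 1001.0])          # naive exp() overflows here
    stable = np.exp(logits - logsumexp(logits))  # ~[0.269, 0.731]
    assert np.isclose(stable.sum(), 1.0)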
/tests_openvino/prepare_and_run_tests.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | green="\033[0;32m"
4 | red="\033[0;31m"
5 | end="\033[0m"
6 |
7 | green () {
8 | echo -e "${green}${1}${end}"
9 | }
10 |
11 | red () {
12 | echo -e "${red}${1}${end}"
13 | }
14 |
15 |
16 | echo "======================================================================"
17 | green "CREATE VENV WITH OPENCV AND OPENVINO RUNTIME"
18 | if [ ! -d ./venv_t ]; then
19 | virtualenv --clear --always-copy -p /usr/bin/python3 ./venv_t
20 | fi
21 | green "CREATE SEPARATE VENV WITH OPENVINO DEV TO USE MODEL DOWNLOADER"
22 | if [ ! -d ./venv_d ]; then
23 | virtualenv --clear --always-copy -p /usr/bin/python3 ./venv_d
24 | fi
25 |
26 |
27 | green "INSTALLING DEPENDENCIES"
28 | ./venv_t/bin/pip3 install -r requirements.txt
29 | ./venv_d/bin/pip3 install openvino-dev==2022.3.0
30 |
31 |
32 | green "GET MODELS"
33 | if [ ! -f "rateme-0.1.1.tar.gz" ]; then
34 | wget "https://github.com/banderlog/rateme/releases/download/v0.1.1/rateme-0.1.1.tar.gz"
35 | fi
36 | ./venv_t/bin/pip3 install --no-deps "rateme-0.1.1.tar.gz"
37 |
38 | # download models from intel
39 | if [ ! -f "intel/text-recognition-0012/FP32/text-recognition-0012.bin" ]; then
40 | ./venv_d/bin/omz_downloader --precision FP32 -o ./ --name text-recognition-0012
41 | fi
42 |
43 | # specifically, the newer model version does not work, or something changed in the decoder
44 | declare -a models=("text-detection-0004.xml"
45 | "text-detection-0004.bin")
46 |
47 | url_start="https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1"
48 |
49 | for i in "${models[@]}"; do
50 | # if no such file
51 | if [ ! -f $i ]; then
52 | # download
53 | wget "${url_start}/${i%.*}/FP32/${i}"
54 | else
55 | # checksum
56 | sha256sum -c "${i}.sha256sum" || red "PROBLEMS ^^^"
57 | fi
58 | done
59 |
60 |
61 |
62 | green "For \"$WHEEL\""
63 | green "RUN TESTS with ./venv_t/bin/python ./tests.py"
64 | ./venv_t/bin/python ./tests.py
65 |
--------------------------------------------------------------------------------
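The checksum step above (`sha256sum -c`) verifies the pinned model files against the `*.sha256sum` files in this folder. If those pinned files ever need regenerating, a standard-library sketch like this produces the same line format (file names as used by the script):

.. code-block:: python

    import hashlib
    from pathlib import Path

    def sha256_of(path: str) -> str:
        # hex SHA-256 digest of a file, read in chunks
        h = hashlib.sha256()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(1 << 20), b''):
                h.update(chunk)
        return h.hexdigest()

    for name in ("text-detection-0004.xml", "text-detection-0004.bin"):
        # same line format as the sha256sum tool: "<digest>  <name>"
        Path(name + ".sha256sum").write_text(f"{sha256_of(name)}  {name}\n")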
/tests_openvino/requirements.txt:
--------------------------------------------------------------------------------
1 | scikit-image
2 | openvino==2022.3.0
3 | opencv-python
4 |
--------------------------------------------------------------------------------
/tests_openvino/tests.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | import cv2
3 | from pixellink import PixelLinkDetector
4 | from text_recognition import TextRecognizer
5 | from rateme.utils import RateMe
6 |
7 |
8 | class TestPackage(unittest.TestCase):
9 |
10 | def test_dnn_module(self):
11 | model = RateMe()
12 | img = cv2.imread('../tests/dislike.jpg')
13 | answer = model.predict(img)
14 | self.assertEqual(answer, 'dislike')
15 | print('rateme: passed')
16 |
17 | def test_inference_engine(self):
18 | img = cv2.imread('../tests/helloworld.png')
19 | detector4 = PixelLinkDetector('text-detection-0004.xml')
20 | detector4.detect(img)
21 | bboxes = detector4.decode()
22 |
23 | recognizer12 = TextRecognizer('intel/text-recognition-0012/FP32/text-recognition-0012.xml')
24 | answer = recognizer12.do_ocr(img, bboxes)
25 | self.assertEqual(answer, ['hello', 'world'])
26 | print('text detection and recognition: passed')
27 |
28 |
29 | if __name__ == '__main__':
30 | unittest.main()
31 |
--------------------------------------------------------------------------------
/tests_openvino/text-detection-0004.bin.sha256sum:
--------------------------------------------------------------------------------
1 | 6da6456f27123be2d9a0e68bb73a7750f6aaee2f0af75d7f34ec6fa97f6727dc text-detection-0004.bin
2 |
--------------------------------------------------------------------------------
/tests_openvino/text-detection-0004.xml.sha256sum:
--------------------------------------------------------------------------------
1 | 244f836e36d63c9bd45b2123f4b9e4672cae6be348c15cac857d75a8b9852dd7 text-detection-0004.xml
2 |
--------------------------------------------------------------------------------
/tests_openvino/text_recognition.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | from openvino.runtime import Core
3 | import numpy as np
4 | from typing import List
5 |
6 |
7 | class TextRecognizer():
8 | def __init__(self, xml_model_path: str):
9 | """ Class for the Intels' OCR model pipeline
10 |
11 | See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \
12 | text-recognition-0012/description/text-recognition-0012.md
13 |
14 | :param xml_model_path: path to model's XML file
15 | """
16 | # load model
17 | ie = Core()
18 | model = ie.read_model(xml_model_path)
19 | self._net = ie.compile_model(model=model, device_name="CPU")
20 | #self._net = cv2.dnn.readNetFromModelOptimizer(xml_model_path, xml_model_path[:-3] + 'bin')
21 |
22 | def _get_confidences(self, img: np.ndarray, box: tuple) -> np.ndarray:
23 | """ get OCR prediction confidences from a part of image in memory
24 |
25 | :param img: BGR image
26 | :param box: (ymin ,xmin ,ymax, xmax)
27 |
28 | :return: inference result; its output blob has the shape [30, 1, 37] in the format [WxBxL], where:
29 | W - output sequence length
30 | B - batch size
31 | L - confidence distribution across alphanumeric symbols:
32 | "0123456789abcdefghijklmnopqrstuvwxyz#", where # - special
33 | blank character for CTC decoding algorithm.
34 | """
35 | y1, x1, y2, x2 = box
36 | img = img[y1:y2, x1:x2]
37 | img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
38 | blob = cv2.dnn.blobFromImage(img, 1, (120, 32))
39 | #self._net.setInput(blob)
40 | #outs = self._net.forward()
41 | outs = self._net([blob])
42 | return outs
43 |
44 | def do_ocr(self, img: np.ndarray, bboxes: List[tuple]) -> List[str]:
45 | """ Run OCR pipeline with greedy decoder for each single word (bbox)
46 |
47 | :param img: BGR image
48 | :param bboxes: list of separate word bboxes (ymin, xmin, ymax, xmax)
49 |
50 | :return: recognized words
51 |
52 | For TF version use:
53 |
54 | .. code-block:: python
55 |
56 | # 30 is `confs.shape[0]`; it is fixed
57 | a, b = tf.nn.ctc_beam_search_decoder(confs, np.array([30]))
58 | idxs_no_blanks = tf.sparse.to_dense(a[0])[0].numpy()
59 | word = ''.join(char_vec[idxs_no_blanks])
60 | """
61 | words = []
62 | # the net can recognize only these chars
63 | char_vec = np.array(list("0123456789abcdefghijklmnopqrstuvwxyz#"))
64 | out_layer_name = self._net.output(0)
65 |
66 | for box in bboxes:
67 | # confidence distribution across symbols
68 | confs = self._get_confidences(img, box)[out_layer_name]
69 | # take the most confident symbol at each step, aka greedy decoding
70 | idxs = confs[:, 0, :].argmax(axis=1)
71 | # drop blank characters '#' (id == 36 in char_vec);
72 | # supposedly we take only separate words as input
73 | idxs_no_blanks = idxs[idxs != 36]
74 | # join to string
75 | word = ''.join(char_vec[idxs_no_blanks])
76 | words.append(word)
77 |
78 | return words
79 |
--------------------------------------------------------------------------------
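Note the indexing change compared with the `tests/` version of this file: a compiled OpenVINO model returns a dict-like result keyed by output port, so the raw blob must be pulled out explicitly instead of coming straight back from `forward()`. A minimal sketch (`model.xml` is a placeholder; the dummy input shape here is an assumption matching the 1x1x32x120 grayscale input this OCR code prepares):

.. code-block:: python

    import numpy as np
    from openvino.runtime import Core

    core = Core()
    net = core.compile_model(core.read_model('model.xml'), device_name="CPU")

    blob = np.zeros((1, 1, 32, 120), dtype=np.float32)  # dummy grayscale crop
    result = net([blob])          # dict-like: output port -> ndarray
    outs = result[net.output(0)]  # equivalent of the old net.forward()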