├── .gitignore
├── .gitmodules
├── LICENSE
├── README.md
├── TODO.md
├── build
│   ├── dldt
│   │   └── dldt_setup.sh
│   ├── ffmpeg
│   │   ├── ffmpeg_premake.sh
│   │   └── ffmpeg_setup.sh
│   ├── openblas
│   │   └── openblas_setup.sh
│   └── opencv
│       └── opencv_setup.sh
├── create_wheel
│   ├── LICENSE
│   ├── LICENSE_MIT
│   ├── README.md
│   ├── cv2
│   │   └── __init__.py
│   └── setup.py
├── download_all_stuff.sh
├── requirements.txt
├── tests
│   ├── README.md
│   ├── dislike.jpg
│   ├── examples.ipynb
│   ├── helloworld.png
│   ├── pixellink.py
│   ├── prepare_and_run_tests.sh
│   ├── requirements.txt
│   ├── short_video.mp4
│   ├── tests.py
│   ├── text-detection-0004.bin.sha256sum
│   ├── text-detection-0004.xml.sha256sum
│   ├── text-recognition-0012.bin.sha256sum
│   ├── text-recognition-0012.xml.sha256sum
│   └── text_recognition.py
└── tests_openvino
    ├── README.md
    ├── pixellink.py
    ├── prepare_and_run_tests.sh
    ├── requirements.txt
    ├── tests.py
    ├── text-detection-0004.bin.sha256sum
    ├── text-detection-0004.xml.sha256sum
    └── text_recognition.py

/.gitignore:
--------------------------------------------------------------------------------
1 | openblas/*
2 | !openblas/.gitkeep
3 | 
4 | dldt/*
5 | !dldt/.gitkeep
6 | 
7 | opencv/*
8 | !opencv/.gitkeep
9 | 
10 | ffmpeg/*
11 | !ffmpeg/.gitkeep
12 | 
13 | build/*
14 | !build/opencv
15 | build/opencv/*
16 | !build/opencv/opencv_setup.sh
17 | 
18 | !build/dldt
19 | build/dldt/*
20 | !build/dldt/dldt_setup.sh
21 | !build/dldt/*.patch
22 | 
23 | !build/ffmpeg
24 | build/ffmpeg/*
25 | !build/ffmpeg/ffmpeg_setup.sh
26 | !build/ffmpeg/ffmpeg_premake.sh
27 | 
28 | !build/openblas
29 | build/openblas/*
30 | !build/openblas/openblas_setup.sh
31 | 
32 | create_wheel/*
33 | !create_wheel/LICENSE*
34 | !create_wheel/README.md
35 | !create_wheel/setup.py
36 | !create_wheel/cv2
37 | create_wheel/cv2/*
38 | !create_wheel/cv2/__init__.py
39 | 
40 | tests/venv_t
41 | tests/rateme*
42 | venv
43 | 
44 | tests_openvino/venv_t
45 | tests_openvino/venv_d
46 | tests_openvino/rateme*
47 | tests_openvino/*.bin
48 | tests_openvino/*.xml
49 | 
50 | *cache*
51 | .ipynb_checkpoints
52 | *tar.gz
53 | *tar.bz2
54 | *.zip
55 | *.swp
56 | *.whl
57 | *.xml
58 | TODO.txt
59 | *.bin
60 | *.weights
61 | 
--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
1 | [submodule "dldt"]
2 |     path = dldt
3 |     url = https://github.com/openvinotoolkit/openvino
4 | [submodule "opencv"]
5 |     path = opencv
6 |     url = https://github.com/opencv/opencv
7 | [submodule "ffmpeg"]
8 |     path = ffmpeg
9 |     url = https://github.com/FFmpeg/FFmpeg
10 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2019 Kabakov Borys
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [![Downloads](https://pepy.tech/badge/opencv-python-inference-engine)](https://pepy.tech/project/opencv-python-inference-engine) [![Downloads](https://pepy.tech/badge/opencv-python-inference-engine/month)](https://pepy.tech/project/opencv-python-inference-engine/month) [![Downloads](https://pepy.tech/badge/opencv-python-inference-engine/week)](https://pepy.tech/project/opencv-python-inference-engine/week)
2 | 
3 | # opencv-python-inference-engine
4 | 
5 | ---
6 | 
7 | $${\color{red}It \space is \space deprecated \space now, \space all \space future \space updates \space are \space improbable}$$
8 | 
9 | A lot has changed during my military leave:
10 | 
11 | 1. Everything [changed since OpenVINO 2021.1](https://github.com/openvinotoolkit/openvino/releases/tag/2022.1.0): now there should be [two separate libs](https://opencv.org/how-to-use-opencv-with-openvino/), and the building process and the inference engine API have changed dramatically, *without backwards compatibility* (btw, opencv-python now has *official* python builds).
12 | 2. OpenVINO now has [small package installations via pip](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html), so the main reason for creating this package is gone.
13 | 3. Just [look at the current official way](https://github.com/banderlog/opencv-python-inference-engine/tree/dev/create_wheel/cv2) of importing the cv2 cxx lib into the python module via 5 python scripts -- I do not want to mess with this growing lump of crutches.
14 | 
15 | My advice is to use the openvino and opencv-python packages; see the rewritten examples in [`tests_openvino`](https://github.com/banderlog/opencv-python-inference-engine/tree/master/tests_openvino).
16 | 
17 | ---
18 | 
19 | This is an *unofficial* pre-built OpenCV package for Python with the inference engine part of [OpenVINO](https://github.com/openvinotoolkit/openvino).
20 | 
21 | ## Installing from `pip3`
22 | 
23 | Remove any previously installed versions of `cv2`, then:
24 | 
25 | ```bash
26 | pip3 install opencv-python-inference-engine
27 | ```
28 | 
29 | ## Examples of usage
30 | 
31 | Please see the `examples.ipynb` in the `tests` folder.
32 | 
33 | You will need to preprocess data as the model requires and decode the output. A description of the decoding *should* be in the model documentation, with examples in the OpenVINO documentation; however, in some cases the original article may be the only source of information. Some models are very simple to encode/decode, others are tough (e.g., PixelLink in tests).
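
For illustration, a minimal sketch of the typical flow (the file names below are placeholders, and the preprocessing parameters must come from the concrete model's documentation):

```python
import cv2

# load an OpenVINO IR model pair (placeholder file names)
net = cv2.dnn.readNet('model.xml', 'model.bin')
# be explicit about backend and target
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

img = cv2.imread('image.jpg')
# preprocess exactly as the model requires (input size, scale, mean, channel order)
blob = cv2.dnn.blobFromImage(img, scalefactor=1.0, size=(1280, 768))
net.setInput(blob)
outs = net.forward(net.getUnconnectedOutLayersNames())
# decoding `outs` is model-specific -- see the model's documentation
```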
34 | 
35 | 
36 | ## Downloading intel models
37 | 
38 | The official way is awkward, because you need to git clone the whole [model_zoo](https://github.com/opencv/open_model_zoo) ([details](https://github.com/opencv/open_model_zoo/issues/522)).
39 | 
40 | It is better to find a model description [here](https://github.com/opencv/open_model_zoo/blob/master/models/intel/index.md) and download it manually from [here](https://download.01.org/opencv/2021/openvinotoolkit/2021.2/open_model_zoo/models_bin/3/).
41 | 
42 | 
43 | ## Description
44 | 
45 | 
46 | ### Why
47 | 
48 | I needed the ability to quickly deploy a small package that is able to run models from [Intel's model zoo](https://github.com/openvinotoolkit/open_model_zoo) and use [Movidius NCS](https://software.intel.com/en-us/neural-compute-stick).
49 | The well-known [opencv-python](https://github.com/skvark/opencv-python) can't do this.
50 | The official way is to use OpenVINO, but it is big and clumsy (just try to use it with a python venv, or to download it quickly on a cloud instance).
51 | 
52 | 
53 | ### Limitations
54 | 
55 | + The package comes without contrib modules.
56 | + You need to [add udev rules](https://www.intel.com/content/www/us/en/support/articles/000057005/boards-and-kits.html) if you want a working MYRIAD plugin.
57 | + It was tested on Ubuntu 18.04, Ubuntu 18.10 as a Windows 10 Subsystem, and Gentoo.
58 | + It will not work for Ubuntu 16.04 and below (except v4.1.0.4).
59 | + I have not made builds for Windows or macOS.
60 | + It is built with `ffmpeg` and `v4l` support (`ffmpeg` libs included).
61 | + No GTK/QT support -- use `matplotlib` for plotting your results.
62 | + It is 64-bit only.
63 | 
64 | ### Main differences from `opencv-python-headless`
65 | 
66 | + Usage of `AVX2` instructions
67 | + No `JPEG 2000`, `WEBP`, `OpenEXR` support
68 | + `TBB` used as a parallel framework
69 | + Inference Engine with `MYRIAD` plugin
70 | 
71 | ### Main differences from OpenVINO
72 | 
73 | + No model-optimizer
74 | + No [ITT](https://software.intel.com/en-us/articles/intel-itt-api-open-source)
75 | + No [IPP](https://software.intel.com/en-us/ipp)
76 | + No [Intel Media SDK](https://software.intel.com/en-us/media-sdk)
77 | + No [OpenVINO IE API](https://github.com/opencv/dldt/tree/2020/inference-engine/ie_bridges/python/src/openvino/inference_engine)
78 | + No python2 support (it is dead)
79 | + No Gstreamer (use ffmpeg)
80 | + No GTK (+16 MB, plus a lot of problems and extra work to compile the Qt/GTK libs from sources)
81 | 
82 | For additional info, read the `cv2.getBuildInformation()` output.
83 | 
84 | ### Versioning
85 | 
86 | `YYYY.MM.DD`, because it is the simplest way to track opencv/openvino versions.
87 | 
88 | ## Compiling from source
89 | 
90 | You will need ~7GB RAM and ~10GB of disk space.
91 | 
92 | I am using an Ubuntu 18.04 (python 3.6) [multipass](https://multipass.run/) instance: `multipass launch -c 6 -d 10G -m 7G 18.04`.
93 | 
94 | ### Requirements
95 | 
96 | Collected from the [opencv](https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html), [dldt](https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html),
97 | [ffmpeg](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu), and [ngraph](https://www.ngraph.ai/documentation/buildlb) build guides:
98 | 
99 | ```bash
100 | # We need a newer `cmake` for dldt (fastest way I know)
101 | # >=cmake-3.16
102 | sudo apt remove --purge cmake
103 | hash -r
104 | sudo snap install cmake --classic
105 | 
106 | # nasm for ffmpeg
107 | # libusb-1.0-0-dev for the MYRIAD plugin
108 | sudo apt update
109 | sudo apt install build-essential git pkg-config python3-dev nasm python3 virtualenv libusb-1.0-0-dev chrpath shellcheck
110 | 
111 | # for ngraph
112 | # `dldt/_deps/ext_onnx-src/onnx/gen_proto.py` has a `#!/usr/bin/env python` shebang and will throw an error otherwise
113 | sudo ln -s /usr/bin/python3 /usr/bin/python
114 | ```
115 | 
116 | ### Preparing
117 | 
118 | ```bash
119 | git clone https://github.com/banderlog/opencv-python-inference-engine
120 | cd opencv-python-inference-engine
121 | # git checkout dev
122 | ./download_all_stuff.sh
123 | ```
124 | 
125 | ### Compilation
126 | 
127 | ```bash
128 | cd build/ffmpeg
129 | ./ffmpeg_setup.sh &&
130 | ./ffmpeg_premake.sh &&
131 | make -j6 &&
132 | make install
133 | 
134 | cd ../dldt
135 | ./dldt_setup.sh &&
136 | make -j6
137 | 
138 | # NB: check the `-D INF_ENGINE_RELEASE` value,
139 | # it should be in the form YYYYAABBCC (e.g., 2020.1.0.2 -> 2020010002)
140 | cd ../opencv
141 | ./opencv_setup.sh &&
142 | make -j6
143 | ```
144 | 
145 | ### Wheel creation
146 | 
147 | ```bash
148 | # get all compiled libs together
149 | cd ../../
150 | cp build/opencv/lib/python3/cv2.cpython*.so create_wheel/cv2/cv2.so
151 | 
152 | cp dldt/bin/intel64/Release/lib/*.so create_wheel/cv2/
153 | cp dldt/bin/intel64/Release/lib/*.mvcmd create_wheel/cv2/
154 | cp dldt/bin/intel64/Release/lib/plugins.xml create_wheel/cv2/
155 | cp dldt/inference-engine/temp/tbb/lib/libtbb.so.2 create_wheel/cv2/
156 | 
157 | cp build/ffmpeg/binaries/lib/*.so create_wheel/cv2/
158 | 
159 | # change RPATH
160 | cd create_wheel
161 | for i in cv2/*.so; do chrpath -r '$ORIGIN' $i; done
162 | 
163 | # the final .whl will be in /create_wheel/dist/
164 | # NB: check the version in `setup.py`
165 | ../venv/bin/python3 setup.py bdist_wheel
166 | ```
167 | 
168 | ### Optional things to play with
169 | 
170 | + [dldt build instruction](https://github.com/openvinotoolkit/openvino/wiki/CMakeOptionsForCustomCompilation)
171 | + [dldt cmake flags](https://github.com/openvinotoolkit/openvino/blob/master/inference-engine/cmake/features.cmake)
172 | + [opencv cmake flags](https://github.com/opencv/opencv/blob/master/CMakeLists.txt)
173 | 
174 | **NB:** removing `QUIET` from `find_package()` in the project CMake files can help to solve some problems -- cmake will start to log them.
175 | 
176 | 
177 | #### GTK2
178 | 
179 | Make the following changes in `opencv-python-inference-engine/build/opencv/opencv_setup.sh` (a sketch of both edits follows below):
180 | 1. change the string `-D WITH_GTK=OFF \` to `-D WITH_GTK=ON \`
181 | 2. `export PKG_CONFIG_PATH=$ABS_PORTION/build/ffmpeg/binaries/lib/pkgconfig:$PKG_CONFIG_PATH` -- you will need to
182 | add the absolute paths to the system `.pc` files. On Ubuntu 18.04 they are here:
183 | `/usr/lib/x86_64-linux-gnu/pkgconfig/:/usr/share/pkgconfig/:/usr/local/lib/pkgconfig/:/usr/lib/pkgconfig/`
184 | 
185 | Exporting `PKG_CONFIG_PATH` for `ffmpeg` somehow messes with the default values.
186 | 
187 | It will add ~16MB to the package.
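
A minimal sketch of the two edits (the `sed` assumes the repo root as the working directory; the `export` shows how the modified line inside `opencv_setup.sh` could look with the Ubuntu 18.04 default `.pc` locations appended):

```bash
# 1: flip the GTK flag inside the cmake invocation
sed -i 's/WITH_GTK=OFF/WITH_GTK=ON/' build/opencv/opencv_setup.sh

# 2: keep the ffmpeg pkgconfig dir first, then the system .pc locations
export PKG_CONFIG_PATH=$ABS_PORTION/build/ffmpeg/binaries/lib/pkgconfig:/usr/lib/x86_64-linux-gnu/pkgconfig/:/usr/share/pkgconfig/:/usr/local/lib/pkgconfig/:/usr/lib/pkgconfig/
```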
188 | 
189 | #### Integrated Performance Primitives
190 | 
191 | Just set `-D WITH_IPP=ON` in `opencv_setup.sh`.
192 | 
193 | It will add +30MB to the final `cv2.so` size, and it will boost _some_ opencv functions.
194 | 
195 | [Intel's official IPP benchmarks](https://software.intel.com/en-us/ipp/benchmarks) (may ask for registration)
196 | 
197 | #### MKL
198 | 
199 | You need to download an MKL-DNN release and set two flags: `-D GEMM=MKL` and `-D MKLROOT` ([details](https://github.com/opencv/dldt/issues/327)).
200 | 
201 | OpenVINO comes with a 30MB `libmkl_tiny_tbb.so`, but [you will not be able to compile it](https://github.com/intel/mkl-dnn/issues/674), because it is made from the proprietary MKL.
202 | 
203 | Our opensource MKL-DNN experiment will end with a 125MB `libmklml_gnu.so` and inference speed comparable with the 5MB openblas ([details](https://github.com/banderlog/opencv-python-inference-engine/issues/5)).
204 | 
205 | 
206 | #### CUDA
207 | 
208 | I did not try it. It cannot be universal: it will only work with the certain combination of GPU+CUDA+cuDNN for which it was compiled.
209 | 
210 | + [Compile OpenCV's 'dnn' module with NVIDIA GPU support](https://www.pyimagesearch.com/2020/02/10/opencv-dnn-with-nvidia-gpus-1549-faster-yolo-ssd-and-mask-r-cnn/)
211 | + [Use OpenCV's 'dnn' module with NVIDIA GPUs, CUDA, and cuDNN](https://www.pyimagesearch.com/2020/02/03/how-to-use-opencvs-dnn-module-with-nvidia-gpus-cuda-and-cudnn/)
212 | 
213 | 
214 | #### OpenMP
215 | 
216 | It is possible to compile OpenBLAS, dldt, and OpenCV with OpenMP. I am not sure that the result would be better than now, but who knows.
217 | 
--------------------------------------------------------------------------------
/TODO.md:
--------------------------------------------------------------------------------
1 | # TODO list
2 | 
3 | + Auto value for `-D INF_ENGINE_RELEASE`: https://github.com/openvinotoolkit/openvino/issues/1435
4 | + 
5 | + `ENABLE_AVX512F`: how often do you see such CPUs in clouds?
6 | + `avresample` from ffmpeg to opencv, do we need it?
7 | 
--------------------------------------------------------------------------------
/build/dldt/dldt_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # https://github.com/openvinotoolkit/openvino/wiki/CMakeOptionsForCustomCompilation
4 | # https://github.com/openvinotoolkit/openvino/issues/4527
5 | # -D ENABLE_OPENCV=OFF \
6 | # https://github.com/openvinotoolkit/openvino/issues/5100
7 | # -D BUILD_SHARED_LIBS=OFF \
8 | # -D BUILD_SHARED_LIBS=ON \
9 | # https://github.com/openvinotoolkit/openvino/issues/5209
10 | # -D NGRAPH_TOOLS_ENABLE=OFF \
11 | cmake -D CMAKE_BUILD_TYPE=Release \
12 |       -D THREADING=TBB \
13 |       -D ENABLE_MKL_DNN=ON \
14 |       -D GEMM=JIT \
15 |       -D ENABLE_FASTER_BUILD=ON \
16 |       -D ENABLE_LTO=ON \
17 |       -D ENABLE_VPU=ON \
18 |       -D ENABLE_MYRIAD=ON \
19 |       -D ENABLE_SSE42=ON \
20 |       -D ENABLE_AVX2=ON \
21 |       -D ENABLE_AVX512F=OFF \
22 |       -D BUILD_TESTS=OFF \
23 |       -D ENABLE_ALTERNATIVE_TEMP=OFF \
24 |       -D ENABLE_CLDNN=OFF \
25 |       -D ENABLE_CLDNN_TESTS=OFF \
26 |       -D ENABLE_DOCS=OFF \
27 |       -D ENABLE_GAPI_TESTS=OFF \
28 |       -D ENABLE_GNA=OFF \
29 |       -D ENABLE_OPENCV=OFF \
30 |       -D ENABLE_PROFILING_ITT=OFF \
31 |       -D ENABLE_PYTHON=OFF \
32 |       -D ENABLE_SAMPLES=OFF \
33 |       -D ENABLE_SPEECH_DEMO=OFF \
34 |       -D ENABLE_TESTS=OFF \
35 |       -D GAPI_TEST_PERF=OFF \
36 |       -D NGRAPH_ONNX_IMPORT_ENABLE=ON \
37 |       -D NGRAPH_TEST_UTIL_ENABLE=OFF \
38 |       -D NGRAPH_TOOLS_ENABLE=OFF \
39 |       -D NGRAPH_UNIT_TEST_ENABLE=OFF \
40 |       -D SELECTIVE_BUILD=OFF ../../dldt/
41 | 
--------------------------------------------------------------------------------
/build/ffmpeg/ffmpeg_premake.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # Build ffmpeg shared libraries without version suffix
4 | # from
5 | 
6 | OLD1='SLIBNAME_WITH_VERSION=$(SLIBNAME).$(LIBVERSION)'
7 | OLD2='SLIBNAME_WITH_MAJOR=$(SLIBNAME).$(LIBMAJOR)'
8 | OLD3='SLIB_INSTALL_NAME=$(SLIBNAME_WITH_VERSION)'
9 | OLD4='SLIB_INSTALL_LINKS=$(SLIBNAME_WITH_MAJOR) $(SLIBNAME)'
10 | 
11 | NEW1='SLIBNAME_WITH_VERSION=$(SLIBNAME)'
12 | NEW2='SLIBNAME_WITH_MAJOR=$(SLIBNAME)'
13 | NEW3='SLIB_INSTALL_NAME=$(SLIBNAME)'
14 | NEW4='SLIB_INSTALL_LINKS='
15 | 
16 | 
17 | sed -i -e "s/${OLD1}/${NEW1}/" -e "s/${OLD2}/${NEW2}/" -e "s/${OLD3}/${NEW3}/" -e "s/${OLD4}/${NEW4}/" ./ffbuild/config.mak
18 | 
--------------------------------------------------------------------------------
/build/ffmpeg/ffmpeg_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # deprecated: --enable-avresample, switch to libswresample
3 | # The libswresample library performs highly optimized audio resampling,
4 | # rematrixing and sample format conversion operations.
5 | 
6 | PATH_TO_SCRIPT=`dirname $(realpath $0)`
7 | 
8 | ../../ffmpeg/configure \
9 |     --prefix=$PATH_TO_SCRIPT/binaries \
10 |     --disable-programs \
11 |     --disable-avdevice \
12 |     --disable-postproc \
13 |     --disable-static \
14 |     --disable-avdevice \
15 |     --disable-swresample \
16 |     --disable-postproc \
17 |     --disable-avfilter \
18 |     --disable-alsa \
19 |     --disable-appkit \
20 |     --disable-avfoundation \
21 |     --disable-bzlib \
22 |     --disable-coreimage \
23 |     --disable-iconv \
24 |     --disable-lzma \
25 |     --disable-sndio \
26 |     --disable-schannel \
27 |     --disable-sdl2 \
28 |     --disable-securetransport \
29 |     --disable-xlib \
30 |     --disable-zlib \
31 |     --disable-audiotoolbox \
32 |     --disable-amf \
33 |     --disable-cuvid \
34 |     --disable-d3d11va \
35 |     --disable-dxva2 \
36 |     --disable-ffnvcodec \
37 |     --disable-nvdec \
38 |     --disable-nvenc \
39 |     --disable-v4l2-m2m \
40 |     --disable-vaapi \
41 |     --disable-vdpau \
42 |     --disable-videotoolbox \
43 |     --disable-doc \
44 |     --disable-static \
45 |     --enable-pic \
46 |     --enable-shared \
47 | 
--------------------------------------------------------------------------------
/build/openblas/openblas_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # Please refer here for details:
4 | # 
5 | # If you compile it with `make FC=gfortran`,
6 | # you'll need `libgfortran.so.4` and `libquadmath.so.0`
7 | 
8 | cmake -D NO_LAPACKE=1 \
9 |       -D CMAKE_BUILD_TYPE=Release \
10 |       -D NOFORTRAN=ON \
11 |       -D BUILD_RELAPACK=OFF \
12 |       -D NO_AFFINITY=1 \
13 |       -D USE_OPENMP=0 \
14 |       -D NO_WARMUP=1 \
15 |       -D NUM_THREADS=64 \
16 |       -D GEMM_MULTITHREAD_THRESHOLD=64 \
17 |       -D BUILD_SHARED_LIBS=ON \
18 |       -D CMAKE_INSTALL_PREFIX=./ ../../openblas/
19 | 
--------------------------------------------------------------------------------
/build/opencv/opencv_setup.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # for CPU_BASELINE and CPU_DISPATCH see https://github.com/opencv/opencv/wiki/CPU-optimizations-build-options
4 | # they should match the ones from dldt/inference-engine/src/extension/cmake/OptimizationFlags.cmake
5 | #
6 | # -DINF_ENGINE_RELEASE= should match the dldt version
7 | # See
8 | # From
9 | # "Force IE version, should be in form YYYYAABBCC (e.g. 2020.1.0.2 -> 2020010002)")
10 | 
11 | tmp=$(pwd)
12 | ABS_PORTION=${tmp%%"/build/opencv"}
13 | 
14 | FFMPEG_PATH=$ABS_PORTION/build/ffmpeg/binaries
15 | export LD_LIBRARY_PATH=$FFMPEG_PATH/lib/:$LD_LIBRARY_PATH
16 | export PKG_CONFIG_PATH=$FFMPEG_PATH/lib/pkgconfig:$PKG_CONFIG_PATH
17 | export PKG_CONFIG_LIBDIR=$FFMPEG_PATH/lib/:$PKG_CONFIG_LIBDIR
18 | 
19 | # grep "5" from "Python 3.5.2"
20 | PY_VER=`$ABS_PORTION/venv/bin/python3 --version | sed -rn "s/Python .\.(.)\..$/\1/p"`
21 | PY_LIB_PATH=`find $ABS_PORTION/venv/lib/ -iname libpython3.${PY_VER}m.so`
22 | 
23 | 
24 | cmake -D CMAKE_BUILD_TYPE=RELEASE \
25 |       -D BUILD_DOCS=OFF \
26 |       -D BUILD_EXAMPLES=OFF \
27 |       -D BUILD_JPEG=OFF \
28 |       -D BUILD_JPEG=OFF \
29 |       -D BUILD_PERF_TESTS=OFF \
30 |       -D BUILD_SHARED_LIBS=OFF \
31 |       -D BUILD_TESTS=OFF \
32 |       -D BUILD_opencv_apps=OFF \
33 |       -D BUILD_opencv_java=OFF \
34 |       -D BUILD_opencv_python2.7=OFF \
35 |       -D BUILD_opencv_python2=OFF \
36 |       -D BUILD_opencv_python3=ON \
37 |       -D BUILD_opencv_world=OFF \
38 |       -D CMAKE_INSTALL_PREFIX=./binaries/ \
39 |       -D CPU_BASELINE=SSE4_2 \
40 |       -D CPU_DISPATCH=AVX,AVX2,FP16,AVX512 \
41 |       -D CV_TRACE=OFF \
42 |       -D ENABLE_CXX11=ON \
43 |       -D ENABLE_PRECOMPILED_HEADERS=OFF \
44 |       -D FFMPEG_INCLUDE_DIRS=$FFMPEG_PATH/include \
45 |       -D INF_ENGINE_INCLUDE_DIRS=$ABS_PORTION/dldt/inference-engine/include \
46 |       -D INF_ENGINE_LIB_DIRS=$ABS_PORTION/dldt/bin/intel64/Release/lib \
47 |       -D INF_ENGINE_RELEASE=2021040200 \
48 |       -D INSTALL_CREATE_DISTRIB=ON \
49 |       -D INSTALL_C_EXAMPLES=OFF \
50 |       -D INSTALL_PYTHON_EXAMPLES=OFF \
51 |       -D JPEG_INCLUDE_DIR=$JPEG_INCLUDE_DIR \
52 |       -D JPEG_LIBRARY=$JPEG_LIBRARY \
53 |       -D OPENCV_ENABLE_NONFREE=OFF \
54 |       -D OPENCV_FORCE_3RDPARTY_BUILD=ON \
55 |       -D OPENCV_SKIP_PYTHON_LOADER=ON \
56 |       -D PYTHON3_EXECUTABLE=$ABS_PORTION/venv/bin/python3 \
57 |       -D PYTHON3_LIBRARY:PATH=$PY_LIB_PATH \
58 |       -D PYTHON3_NUMPY_INCLUDE_DIRS:PATH=$ABS_PORTION/venv/lib/python3.${PY_VER}/site-packages/numpy/core/include \
59 |       -D PYTHON3_PACKAGES_PATH=$ABS_PORTION/venv/lib/python3.${PY_VER}/site-packages \
60 |       -D PYTHON_DEFAULT_EXECUTABLE=$ABS_PORTION/venv/bin/python3 \
61 |       -D PYTHON_INCLUDE_DIR=/usr/include/python3.${PY_VER} \
62 |       -D WITH_1394=OFF \
63 |       -D WITH_CUDA=OFF \
64 |       -D WITH_EIGEN=OFF \
65 |       -D WITH_FFMPEG=ON \
66 |       -D WITH_GSTREAMER=OFF \
67 |       -D WITH_GTK=OFF \
68 |       -D WITH_INF_ENGINE=ON \
69 |       -D WITH_IPP=OFF \
70 |       -D WITH_ITT=OFF \
71 |       -D WITH_JASPER=OFF \
72 |       -D WITH_NGRAPH=ON \
73 |       -D WITH_OPENEXR=OFF \
74 |       -D WITH_OPENMP=OFF \
75 |       -D WITH_PNG=ON \
76 |       -D WITH_PROTOBUF=ON \
77 |       -D WITH_QT=OFF \
78 |       -D WITH_TBB=ON \
79 |       -D WITH_V4L=ON \
80 |       -D WITH_VTK=OFF \
81 |       -D WITH_WEBP=OFF \
82 |       -D ngraph_DIR=$ABS_PORTION/build/dldt/ngraph ../../opencv
--------------------------------------------------------------------------------
/create_wheel/LICENSE:
--------------------------------------------------------------------------------
1 | 
2 | Apache License
3 | Version 2.0, January 2004
4 | http://www.apache.org/licenses/
5 | 
6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 | 
8 | 1. Definitions.
9 | 
10 | "License" shall mean the terms and conditions for use, reproduction,
11 | and distribution as defined by Sections 1 through 9 of this document.
12 | 
13 | "Licensor" shall mean the copyright owner or entity authorized by
14 | the copyright owner that is granting the License.
15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 
135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 
194 | You may obtain a copy of the License at
195 | 
196 | http://www.apache.org/licenses/LICENSE-2.0
197 | 
198 | Unless required by applicable law or agreed to in writing, software
199 | distributed under the License is distributed on an "AS IS" BASIS,
200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201 | See the License for the specific language governing permissions and
202 | limitations under the License.
203 | 
--------------------------------------------------------------------------------
/create_wheel/LICENSE_MIT:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2019 Kabakov Borys
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/create_wheel/README.md:
--------------------------------------------------------------------------------
1 | # README
2 | 
3 | This is a pre-built [OpenCV](https://github.com/opencv/opencv) package for Python3 with the [Inference Engine](https://github.com/openvinotoolkit/openvino) module.
4 | You need that module if you want to run models from [Intel's model zoo](https://github.com/openvinotoolkit/open_model_zoo).
5 | 
6 | It is built with `ffmpeg` and `v4l` but without GTK/QT (use matplotlib for plotting your results).
7 | Contrib modules and haarcascades are not included.
8 | 
9 | For additional info visit the [project homepage](https://github.com/banderlog/opencv-python-inference-engine)
--------------------------------------------------------------------------------
/create_wheel/cv2/__init__.py:
--------------------------------------------------------------------------------
1 | import importlib
2 | from .cv2 import *
3 | 
4 | # wildcard import above does not import "private" variables like __version__
5 | # this makes them available
6 | globals().update(importlib.import_module('cv2.cv2').__dict__)
--------------------------------------------------------------------------------
/create_wheel/setup.py:
--------------------------------------------------------------------------------
1 | import setuptools
2 | 
3 | 
4 | with open("README.md", "r") as fh:
5 |     long_description = fh.read()
6 | 
7 | 
8 | # This creates a list which is empty but returns a length of 1.
9 | # Should make the wheel a binary distribution and platlib compliant.
10 | # from
11 | class EmptyListWithLength(list):
12 |     def __len__(self):
13 |         return 1
14 | 
15 | 
16 | setuptools.setup(
17 |     name='opencv-python-inference-engine',
18 |     version='2022.01.05',
19 |     url="https://github.com/banderlog/opencv-python-inference-engine",
20 |     maintainer="Kabakov Borys",
21 |     license='MIT, Apache 2.0',
22 |     description="Wrapper package for OpenCV with Inference Engine python bindings",
23 |     long_description=long_description,
24 |     long_description_content_type="text/markdown",
25 |     ext_modules=EmptyListWithLength(),
26 |     packages=['cv2'],
27 |     package_data={'cv2': ['*.so*', '*.mvcmd', '*.xml']},
28 |     include_package_data=True,
29 |     install_requires=['numpy'],
30 |     classifiers=[
31 |         'Development Status :: 5 - Production/Stable',
32 |         'Environment :: Console',
33 |         'Intended Audience :: Developers',
34 |         'Intended Audience :: Education',
35 |         'Intended Audience :: Information Technology',
36 |         'Intended Audience :: Science/Research',
37 |         'Programming Language :: Python :: 3',
38 |         'Programming Language :: C++',
39 |         'Operating System :: POSIX :: Linux',
40 |         'Topic :: Scientific/Engineering',
41 |         'Topic :: Scientific/Engineering :: Image Recognition',
42 |         'Topic :: Software Development',
43 |     ],
44 | )
--------------------------------------------------------------------------------
/download_all_stuff.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # colors
4 | end="\033[0m"
5 | red="\033[0;31m"
6 | green="\033[0;32m"
7 | 
8 | green () {
9 |     echo -e "${green}${1}${end}"
10 | }
11 | 
12 | 
13 | red () {
14 |     echo -e "${red}${1}${end}"
15 | }
16 | 
17 | 
18 | ROOT_DIR=$(pwd)
19 | 
20 | # check Ubuntu version (a 20.04 build will not work on 18.04)
21 | if test $(lsb_release -rs) != 18.04; then
22 |     red "\n!!! You are NOT on Ubuntu 18.04 !!!\n"
23 | fi
24 | 
25 | green "RESET GIT SUBMODULES"
26 | # git checkout dev
27 | # for an update use `git submodule update --init --recursive --jobs=4`
28 | # cd into the submodule dir and `git fetch --tags && git checkout tags/`
29 | git submodule update --init --recursive --depth=1 --jobs=4
30 | # the command to restore changes differs between git versions (e.g., `restore`)
31 | git submodule foreach --recursive git checkout .
32 | # remove untracked
33 | git submodule foreach --recursive git clean -dxf
34 | 
35 | green "CLEAN BUILD DIRS"
36 | find build/dldt/ -mindepth 1 -not -name 'dldt_setup.sh' -not -name '*.patch' -delete
37 | find build/opencv/ -mindepth 1 -not -name 'opencv_setup.sh' -delete
38 | find build/ffmpeg/ -mindepth 1 -not -name 'ffmpeg_*.sh' -delete
39 | 
40 | green "CLEAN WHEEL DIR"
41 | find create_wheel/cv2/ -type f -not -name '__init__.py' -delete
42 | rm -drf create_wheel/build
43 | rm -drf create_wheel/dist
44 | rm -drf create_wheel/*egg-info
45 | 
46 | green "CREATE VENV"
47 | cd $ROOT_DIR
48 | 
49 | if [[ ! -d ./venv ]]; then
50 |     virtualenv --clear --always-copy -p /usr/bin/python3 ./venv
51 |     ./venv/bin/pip3 install -r requirements.txt
52 | fi
53 | 
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy
2 | 
--------------------------------------------------------------------------------
/tests/README.md:
--------------------------------------------------------------------------------
1 | # Tests for opencv-python-inference-engine wheel
2 | 
3 | ## Requirements
4 | 
5 | `sudo apt install virtualenv`
6 | 
7 | ## Usage
8 | 
9 | ### Features
10 | 
11 | Just run the bash script and read its output.
12 | 
13 | ```bash
14 | cd tests
15 | ./prepare_and_run_tests.sh
16 | ```
17 | 
18 | ### Inference speed
19 | 
20 | Something like the snippet below. The general idea is to test only the inference speed, without preprocessing and decoding.
21 | Also, the 1st inference must not be counted, because it loads everything into memory.
22 | 
23 | **NB:** be strict about Backend and Target
24 | 
25 | ```python
26 | import cv2
27 | 
28 | class PixelLinkDetectorTest():
29 |     """ Cut version of PixelLinkDetector """
30 |     def __init__(self, xml_model_path: str):
31 |         self.net = cv2.dnn.readNet(xml_model_path, xml_model_path[:-3] + 'bin')
32 | 
33 |     def detect(self, img: 'np.ndarray'):
34 |         blob = cv2.dnn.blobFromImage(img, 1, (1280, 768))
35 |         self.net.setInput(blob)
36 |         out_layer_names = self.net.getUnconnectedOutLayersNames()
37 |         return self.net.forward(out_layer_names)
38 | 
39 | 
40 | # check opencv version
41 | cv2.__version__
42 | 
43 | # read img and network
44 | img = cv2.imread('helloworld.png')
45 | detector = PixelLinkDetectorTest('text-detection-0004.xml')
46 | 
47 | # select target & backend, please read the documentation for details:
48 | #
49 | detector.net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
50 | detector.net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
51 | 
52 | # 1st inference does not count
53 | links, pixels = detector.detect(img)
54 | 
55 | # use magic function
56 | %timeit links, pixels = detector.detect(img)
57 | ```
58 | 
59 | 
60 | ## Models
61 | 
62 | + [rateme](https://github.com/banderlog/rateme) (YOLO3)
63 | + [text-detection-0004](https://github.com/opencv/open_model_zoo/blob/master/models/intel/text-detection-0004/description/text-detection-0004.md)
64 | + [text-recognition-0012](https://github.com/opencv/open_model_zoo/blob/master/models/intel/text-recognition-0012/description/text-recognition-0012.md)
65 | 
66 | ## Files
67 | 
68 | + `short_video.mp4` from [here](https://www.pexels.com/video/a-cattails-fluff-floats-in-air-2156021/) (free)
69 | + `dislike.jpg` from the [rateme repository](https://github.com/banderlog/rateme/blob/master/test_imgs/dislike.jpg)
70 | + `helloworld.png` -- I either made it or forgot where I downloaded it from
--------------------------------------------------------------------------------
/tests/dislike.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/banderlog/opencv-python-inference-engine/0abe6990b938275c48ad990f2b484dd03ad0f39d/tests/dislike.jpg
--------------------------------------------------------------------------------
/tests/helloworld.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/banderlog/opencv-python-inference-engine/0abe6990b938275c48ad990f2b484dd03ad0f39d/tests/helloworld.png
--------------------------------------------------------------------------------
/tests/pixellink.py:
--------------------------------------------------------------------------------
1 | """ Wrapper class for Intel's PixelLink realisation (text segmentation NN)
2 | text-detection-00[34]
3 | 
4 | For text-detection-002 you'll need to uncomment a string in detect()
5 | """
6 | import cv2
7 | import numpy as np
8 | from skimage.morphology import label
9 | from skimage.measure import regionprops
10 | from typing import List, Tuple
11 | from skimage.measure._regionprops import RegionProperties
12 | 
13 | 
14 | class PixelLinkDetector():
15 |     """ Wrapper class for Intel's version of the PixelLink text detector
16 | 
17 |     See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \
18 |         text-detection-0004/description/text-detection-0004.md
19 | 
20 |     :param xml_model_path: path to XML file
21 | 
22 |     **Example:**
23 | 
24 |     .. code-block:: python
25 |         detector = PixelLinkDetector('text-detection-0004.xml')
26 |         img = cv2.imread('tmp.jpg')
27 |         # ~250ms on i7-6700K
28 |         detector.detect(img)
29 |         # ~2ms
30 |         bboxes = detector.decode()
31 |     """
32 |     def __init__(self, xml_model_path: str, txt_threshold=0.5):
33 |         """
34 |         :param xml_model_path: path to model's XML file
35 |         :param txt_threshold: confidence, defaults to ``0.5``
36 |         """
37 |         self._net = cv2.dnn.readNet(xml_model_path, xml_model_path[:-3] + 'bin')
38 |         self._txt_threshold = txt_threshold
39 | 
40 |     def detect(self, img: np.ndarray) -> None:
41 |         """ Get PixelLink's outputs (BxCxHxW):
42 |         + [1x16x192x320] - logits related to linkage between pixels and their neighbors
43 |         + [1x2x192x320] - logits related to text/no-text classification for each pixel
44 | 
45 |         B - batch size
46 |         C - number of channels
47 |         H - image height
48 |         W - image width
49 | 
50 |         :param img: image as ``numpy.ndarray``
51 |         """
52 |         self._img_shape = img.shape
53 |         blob = cv2.dnn.blobFromImage(img, 1, (1280, 768))
54 |         self._net.setInput(blob)
55 |         out_layer_names = self._net.getUnconnectedOutLayersNames()
56 |         # for text-detection-002
57 |         # self.pixels, self.links = self._net.forward(out_layer_names)
58 |         # for text-detection-00[34]
59 |         self.links, self.pixels = self._net.forward(out_layer_names)
60 | 
61 |     def get_mask(self) -> np.ndarray:
62 |         """ Get binary mask of detected text pixels
63 |         """
64 |         pixel_mask = self._get_pixel_scores() >= self._txt_threshold
65 |         return pixel_mask.astype(np.uint8)
66 | 
67 |     def _logsumexp(self, a: np.ndarray, axis=-1) -> np.ndarray:
68 |         """ Castrated function from scipy
69 |         https://github.com/scipy/scipy/blob/v1.6.2/scipy/special/_logsumexp.py
70 | 
71 |         Compute the log of the sum of exponentials of input elements.
72 |         """
73 |         a_max = np.amax(a, axis=axis, keepdims=True)
74 |         tmp = np.exp(a - a_max)
75 |         s = np.sum(tmp, axis=axis, keepdims=True)
76 |         out = np.log(s)
77 |         out += a_max
78 |         return out
79 | 
80 |     def _get_pixel_scores(self) -> np.ndarray:
81 |         """ get softmaxed properly shaped pixel scores """
82 |         # move channels to the end
83 |         tmp = np.transpose(self.pixels, (0, 2, 3, 1))
84 |         # softmax from scipy
85 |         tmp = np.exp(tmp - self._logsumexp(tmp, axis=-1))
86 |         # select single batch, single channel values
87 |         return tmp[0, :, :, 1]
88 | 
89 |     def _get_txt_regions(self, pixel_mask: np.ndarray) -> List[RegionProperties]:
90 |         """ kernels are class dependent """
91 |         img_h, img_w = self._img_shape[:2]
92 |         _, mask = cv2.threshold(pixel_mask, 0, 1, cv2.THRESH_BINARY)
93 |         # transmutations
94 |         # kernel size should be image size dependent (default (21,21))
95 |         # on a small image it will connect separate words
96 |         txt_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
97 |         mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, txt_kernel)
98 |         # connect regions on mask of original img size
99 |         mask = cv2.resize(mask, (img_w, img_h), interpolation=cv2.INTER_NEAREST)
100 |         # Label connected regions of an integer array
101 |         mask = label(mask, background=0, connectivity=2)
102 |         # Measure properties of labeled image regions.
103 |         txt_regions = regionprops(mask)
104 |         return txt_regions
105 | 
106 |     def _get_txt_bboxes(self, txt_regions: List[RegionProperties]) -> List[Tuple[int, int, int, int]]:
107 |         """ Filter text areas by area and height
108 | 
109 |         :return: ``[(ymin, xmin, ymax, xmax)]``
110 |         """
111 |         min_area = 0
112 |         min_height = 4
113 |         boxes = []
114 |         for p in txt_regions:
115 |             if p.area > min_area:
116 |                 bbox = p.bbox
117 |                 if (bbox[2] - bbox[0]) > min_height:
118 |                     boxes.append(bbox)
119 |         return boxes
120 | 
121 |     def decode(self) -> List[Tuple[int, int, int, int]]:
122 |         """ Decode PixelLink's output
123 | 
124 |         :return: bounding_boxes
125 | 
126 |         .. note::
127 |             bounding_boxes format: [ymin, xmin, ymax, xmax]
128 | 
129 |         """
130 |         mask = self.get_mask()
131 |         bboxes = self._get_txt_bboxes(self._get_txt_regions(mask))
132 |         # sort by xmin, ymin
133 |         bboxes = sorted(bboxes, key=lambda x: (x[1], x[0]))
134 |         return bboxes
135 | 
--------------------------------------------------------------------------------
/tests/prepare_and_run_tests.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | green="\033[0;32m"
4 | red="\033[0;31m"
5 | end="\033[0m"
6 | 
7 | green () {
8 |     echo -e "${green}${1}${end}"
9 | }
10 | 
11 | red () {
12 |     echo -e "${red}${1}${end}"
13 | }
14 | 
15 | 
16 | # check if (no ARG given and no appropriate compiled files exist) or
17 | # (some args provided but arg1 is not an existing file)
18 | # of course, you could shoot yourself in the foot here in different ways
19 | if ([ ! $# -ge 1 ] && ! $(ls ../create_wheel/dist/opencv_python_inference_engine*.whl &> /dev/null)) ||
20 |    ([ $# -ge 1 ] && [ ! -f $1 ]); then
21 |     red "How are you supposed to run wheel tests without a wheel?"
22 |     red "Compile it or provide it as ARG1 to this script"
23 |     exit 1
24 | fi
25 | 
26 | echo "======================================================================"
27 | green "CREATE SEPARATE TEST VENV"
28 | if [ ! -d ./venv_t ]; then
29 |     virtualenv --clear --always-copy -p /usr/bin/python3 ./venv_t
30 | fi
31 | 
32 | 
33 | green "INSTALLING DEPENDENCIES"
34 | if [ $1 ]; then
35 |     # install ARGV1
36 |     green "Installing from provided path"
37 |     WHEEL="$1"
38 | else
39 |     # install compiled wheel
40 |     green "Installing from default path"
41 |     WHEEL=$(realpath ../create_wheel/dist/opencv_python_inference_engine*.whl)
42 | fi
43 | 
44 | ./venv_t/bin/pip3 install --force-reinstall "$WHEEL"
45 | ./venv_t/bin/pip3 install -r requirements.txt
46 | 
47 | 
48 | green "GET MODELS"
49 | 
50 | if [ ! -d "rateme" ]; then
51 |     ./venv_t/bin/pip3 install "https://github.com/banderlog/rateme/releases/download/v0.1.1/rateme-0.1.1.tar.gz"
52 | fi
53 | 
54 | # urls, filenames and checksums are from:
55 | # +
56 | # +
57 | declare -a models=("text-detection-0004.xml"
58 |                    "text-detection-0004.bin"
59 |                    "text-recognition-0012.xml"
60 |                    "text-recognition-0012.bin")
61 | 
62 | url_start="https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1"
63 | 
64 | for i in "${models[@]}"; do
65 |     # if there is no such file
66 |     if [ ! -f $i ]; then
67 |         # download it
68 |         wget "${url_start}/${i%.*}/FP32/${i}"
69 |     else
70 |         # verify its checksum
71 |         sha256sum -c "${i}.sha256sum" || red "PROBLEMS ^^^"
72 |     fi
73 | done
74 | 
75 | green "For \"$WHEEL\""
76 | green "RUN TESTS with ./venv_t/bin/python ./tests.py"
77 | ./venv_t/bin/python ./tests.py
78 | 
--------------------------------------------------------------------------------
/tests/requirements.txt:
--------------------------------------------------------------------------------
1 | scikit-image
2 | 
--------------------------------------------------------------------------------
/tests/short_video.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/banderlog/opencv-python-inference-engine/0abe6990b938275c48ad990f2b484dd03ad0f39d/tests/short_video.mp4
--------------------------------------------------------------------------------
/tests/tests.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | import cv2
3 | from pixellink import PixelLinkDetector
4 | from text_recognition import TextRecognizer
5 | from rateme.utils import RateMe
6 | 
7 | 
8 | class TestPackage(unittest.TestCase):
9 | 
10 |     def test_dnn_module(self):
11 |         model = RateMe()
12 |         img = cv2.imread('dislike.jpg')
13 |         answer = model.predict(img)
14 |         self.assertEqual(answer, 'dislike')
15 |         print('rateme: passed')
16 | 
17 |     def test_inference_engine(self):
18 |         img = cv2.imread('helloworld.png')
19 |         detector4 = PixelLinkDetector('text-detection-0004.xml')
20 |         detector4.detect(img)
21 |         bboxes = detector4.decode()
22 | 
23 |         recognizer12 = TextRecognizer('./text-recognition-0012.xml')
24 |         answer = recognizer12.do_ocr(img, bboxes)
25 |         self.assertEqual(answer, ['hello', 'world'])
26 |         print('text detection and recognition: passed')
27 | 
28 |     def test_ffmpeg(self):
29 |         cap = cv2.VideoCapture('short_video.mp4')
30 |         answer, img = cap.read()
31 |         self.assertTrue(answer)
32 |         print('video opening: passed')
33 | 
34 | 
35 | if __name__ == '__main__':
36 |     unittest.main()
37 | 
--------------------------------------------------------------------------------
/tests/text-detection-0004.bin.sha256sum:
--------------------------------------------------------------------------------
1 | 6da6456f27123be2d9a0e68bb73a7750f6aaee2f0af75d7f34ec6fa97f6727dc text-detection-0004.bin
2 | 
-------------------------------------------------------------------------------- /tests/text-detection-0004.xml.sha256sum: -------------------------------------------------------------------------------- 1 | 244f836e36d63c9bd45b2123f4b9e4672cae6be348c15cac857d75a8b9852dd7 text-detection-0004.xml 2 | -------------------------------------------------------------------------------- /tests/text-recognition-0012.bin.sha256sum: -------------------------------------------------------------------------------- 1 | b0d99549692baeea3e83709a671844a365b15bd40e36d9a5d3ef5368a69d2897 text-recognition-0012.bin 2 | -------------------------------------------------------------------------------- /tests/text-recognition-0012.xml.sha256sum: -------------------------------------------------------------------------------- 1 | 54fd8ae6ea5ae11fdeb85f5c6b701793c28883f1e3dd8c3a531c43db6c3713ea text-recognition-0012.xml 2 | -------------------------------------------------------------------------------- /tests/text_recognition.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | from typing import List 4 | 5 | 6 | class TextRecognizer(): 7 | def __init__(self, xml_model_path: str): 8 | """ Class for the Intels' OCR model pipeline 9 | 10 | See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \ 11 | text-recognition-0012/description/text-recognition-0012.md 12 | 13 | :param xml_model_path: path to model's XML file 14 | """ 15 | # load model 16 | self._net = cv2.dnn.readNetFromModelOptimizer(xml_model_path, xml_model_path[:-3] + 'bin') 17 | 18 | def _get_confidences(self, img: np.ndarray, box: tuple) -> np.ndarray: 19 | """ get OCR prediction confidences from a part of image in memory 20 | 21 | :param img: BGR image 22 | :param box: (ymin ,xmin ,ymax, xmax) 23 | 24 | :return: blob with the shape [30, 1, 37] in the format [WxBxL], where: 25 | W - output sequence length 26 | B - batch size 27 | L - confidence distribution across alphanumeric symbols: 28 | "0123456789abcdefghijklmnopqrstuvwxyz#", where # - special 29 | blank character for CTC decoding algorithm. 30 | """ 31 | y1, x1, y2, x2 = box 32 | img = img[y1:y2, x1:x2] 33 | img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) 34 | blob = cv2.dnn.blobFromImage(img, 1, (120, 32)) 35 | self._net.setInput(blob) 36 | outs = self._net.forward() 37 | return outs 38 | 39 | def do_ocr(self, img: np.ndarray, bboxes: List[tuple]) -> List[str]: 40 | """ Run OCR pipeline with greedy decoder for each single word (bbox) 41 | 42 | :param img: BGR image 43 | :param bboxes: list of sepaate word bboxes (ymin ,xmin ,ymax, xmax) 44 | 45 | :return: recognized words 46 | 47 | For TF version use: 48 | 49 | .. 
code-block:: python 50 | 51 | # 30 is `confs.shape[0]` it is fixed 52 | a, b = tf.nn.ctc_beam_search_decoder(confs, np.array([30])) 53 | idx_no_blanks = tf.sparse.to_dense(a[0])[0].numpy() 54 | word = ''.join(char_vec[idxs_no_blanks]) 55 | """ 56 | words = [] 57 | # net could detect only these chars 58 | char_vec = np.array(list("0123456789abcdefghijklmnopqrstuvwxyz#")) 59 | 60 | for box in bboxes: 61 | # confidence distribution across symbols 62 | confs = self._get_confidences(img, box) 63 | # get maximal confidence for the whole beam width aka greedy decoder 64 | idxs = confs[:, 0, :].argmax(axis=1) 65 | # drop blank characters '#' with id == 36 in charvec 66 | # isupposedly we taking only separate words as input 67 | idxs_no_blanks = idxs[idxs != 36] 68 | # joint to string 69 | word = ''.join(char_vec[idxs_no_blanks]) 70 | words.append(word) 71 | 72 | return words 73 | -------------------------------------------------------------------------------- /tests_openvino/README.md: -------------------------------------------------------------------------------- 1 | Only way to download models -- through model downloader, no manual download anymore: 2 | - 3 | - 4 | - 5 | - 6 | - (you need to clone github repo to get them) 7 | 8 | Sometimes models are backwards compatible to new OpenVINO version, sometimes no. 9 | Sometimes new model versions became unworkable. 10 | 11 | IE API for network upload and usage now deprecated, one should use openvino API instead: 12 | - see differences of `pixellink.py` and `text_recognition.py` between `tests` and `tests_openvino` folders 13 | - 14 | -------------------------------------------------------------------------------- /tests_openvino/pixellink.py: -------------------------------------------------------------------------------- 1 | """ Wrapper class for Intel's PixelLink realisation (text segmentation NN) 2 | text-detection-00[34] 3 | 4 | For text-detection-002 you'll need to uncomment string in detect() 5 | """ 6 | import cv2 7 | import numpy as np 8 | from openvino.runtime import Core 9 | from skimage.morphology import label 10 | from skimage.measure import regionprops 11 | from typing import List, Tuple 12 | from skimage.measure._regionprops import RegionProperties 13 | 14 | 15 | class PixelLinkDetector(): 16 | """ Wrapper class for Intel's version of PixelLink text detector 17 | 18 | See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \ 19 | text-detection-0004/description/text-detection-0004.md 20 | 21 | :param xml_model_path: path to XML file 22 | 23 | **Example:** 24 | 25 | .. 
--------------------------------------------------------------------------------
/tests_openvino/pixellink.py:
--------------------------------------------------------------------------------
1 | """ Wrapper class for Intel's PixelLink implementation (text segmentation NN)
2 | text-detection-00[34]
3 | 
4 | For text-detection-002 you'll need to swap the output layers in detect()
5 | """
6 | import cv2
7 | import numpy as np
8 | from openvino.runtime import Core
9 | from skimage.morphology import label
10 | from skimage.measure import regionprops
11 | from typing import List, Tuple
12 | from skimage.measure._regionprops import RegionProperties
13 | 
14 | 
15 | class PixelLinkDetector():
16 |     """ Wrapper class for Intel's version of the PixelLink text detector
17 | 
18 |     See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \
19 |         text-detection-0004/description/text-detection-0004.md
20 | 
21 |     :param xml_model_path: path to model's XML file
22 | 
23 |     **Example:**
24 | 
25 |     .. code-block:: python
26 | 
27 |         detector = PixelLinkDetector('text-detection-0004.xml')
28 |         img = cv2.imread('tmp.jpg')
29 |         # ~250ms on i7-6700K
30 |         detector.detect(img)
31 |         # ~2ms
32 |         bboxes = detector.decode()
33 |     """
34 |     def __init__(self, xml_model_path: str, txt_threshold=0.5):
35 |         """
36 |         :param xml_model_path: path to model's XML file
37 |         :param txt_threshold: text confidence threshold, defaults to ``0.5``
38 |         """
39 |         ie = Core()
40 |         model = ie.read_model(xml_model_path)
41 |         self._net = ie.compile_model(model=model, device_name="CPU")
42 |         self._txt_threshold = txt_threshold
43 | 
44 |     def detect(self, img: np.ndarray) -> None:
45 |         """ Get PixelLink's outputs (BxCxHxW):
46 |         + [1x16x192x320] - logits related to linkage between pixels and their neighbors
47 |         + [1x2x192x320] - logits related to text/no-text classification for each pixel
48 | 
49 |         B - batch size
50 |         C - number of channels
51 |         H - image height
52 |         W - image width
53 | 
54 |         :param img: image as ``numpy.ndarray``
55 |         """
56 |         output_layer_1 = self._net.output(0)
57 |         output_layer_2 = self._net.output(1)
58 |         self._img_shape = img.shape
59 |         blob = cv2.dnn.blobFromImage(img, 1, (1280, 768))
60 |         out = self._net([blob])
61 |         # for text-detection-002 the two outputs come in the opposite order,
62 |         # i.e. swap links and pixels below
63 |         self.links = out[output_layer_1]
64 |         self.pixels = out[output_layer_2]
65 | 
66 |     def get_mask(self) -> np.ndarray:
67 |         """ Get binary mask of detected text pixels
68 |         """
69 |         pixel_mask = self._get_pixel_scores() >= self._txt_threshold
70 |         return pixel_mask.astype(np.uint8)
71 | 
72 |     def _logsumexp(self, a: np.ndarray, axis=-1) -> np.ndarray:
73 |         """ Stripped-down version of the scipy function
74 |         https://github.com/scipy/scipy/blob/v1.6.2/scipy/special/_logsumexp.py
75 | 
76 |         Compute the log of the sum of exponentials of input elements.
85 | """ 86 | a_max = np.amax(a, axis=axis, keepdims=True) 87 | tmp = np.exp(a - a_max) 88 | s = np.sum(tmp, axis=axis, keepdims=True) 89 | out = np.log(s) 90 | out += a_max 91 | return out 92 | 93 | def _get_pixel_scores(self) -> np.ndarray: 94 | """ get softmaxed properly shaped pixel scores """ 95 | # move channels to the end 96 | tmp = np.transpose(self.pixels, (0, 2, 3, 1)) 97 | # softmax from scipy 98 | tmp = np.exp(tmp - self._logsumexp(tmp, axis=-1)) 99 | # select single batch, single chanel values 100 | return tmp[0, :, :, 1] 101 | 102 | def _get_txt_regions(self, pixel_mask: np.ndarray) -> List[RegionProperties]: 103 | """ kernels are class dependent """ 104 | img_h, img_w = self._img_shape[:2] 105 | _, mask = cv2.threshold(pixel_mask, 0, 1, cv2.THRESH_BINARY) 106 | # transmutatioins 107 | # kernel size should be image size dependant (default (21,21)) 108 | # on small image it will connect separate words 109 | txt_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2)) 110 | mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, txt_kernel) 111 | # connect regions on mask of original img size 112 | mask = cv2.resize(mask, (img_w, img_h), interpolation=cv2.INTER_NEAREST) 113 | # Label connected regions of an integer array 114 | mask = label(mask, background=0, connectivity=2) 115 | # Measure properties of labeled image regions. 116 | txt_regions = regionprops(mask) 117 | return txt_regions 118 | 119 | def _get_txt_bboxes(self, txt_regions: List[RegionProperties]) -> List[Tuple[int, int, int, int]]: 120 | """ Filter text area by area and height 121 | 122 | :return: ``[(ymin, xmin, ymax, xmax)]`` 123 | """ 124 | min_area = 0 125 | min_height = 4 126 | boxes = [] 127 | for p in txt_regions: 128 | if p.area > min_area: 129 | bbox = p.bbox 130 | if (bbox[2] - bbox[0]) > min_height: 131 | boxes.append(bbox) 132 | return boxes 133 | 134 | def decode(self) -> List[Tuple[int, int, int, int]]: 135 | """ Decode PixelLink's output 136 | 137 | :return: bounding_boxes 138 | 139 | .. note:: 140 | bounding_boxes format: [ymin ,xmin ,ymax, xmax] 141 | 142 | """ 143 | mask = self.get_mask() 144 | bboxes = self._get_txt_bboxes(self._get_txt_regions(mask)) 145 | # sort by xmin, ymin 146 | bboxes = sorted(bboxes, key=lambda x: (x[1], x[0])) 147 | return bboxes 148 | -------------------------------------------------------------------------------- /tests_openvino/prepare_and_run_tests.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | green="\033[0;32m" 4 | red="\033[0;31m" 5 | end="\033[0m" 6 | 7 | green () { 8 | echo -e "${green}${1}${end}" 9 | } 10 | 11 | red () { 12 | echo -e "${red}${1}${end}" 13 | } 14 | 15 | 16 | echo "======================================================================" 17 | green "CREATE VENV WITH OPENCV AND OPENVINO RUNTIME" 18 | if [ ! -d ./venv_t ]; then 19 | virtualenv --clear --always-copy -p /usr/bin/python3 ./venv_t 20 | fi 21 | green "CREATE SEPARATE VENV WITH OPENVINO DEV TO USE MODEL DOWNLOADER" 22 | if [ ! -d ./venv_d ]; then 23 | virtualenv --clear --always-copy -p /usr/bin/python3 ./venv_d 24 | fi 25 | 26 | 27 | green "INSTALLING DEPENDENCIES" 28 | ./venv_t/bin/pip3 install -r requirements.txt 29 | ./venv_d/bin/pip3 install openvino-dev==2022.3.0 30 | 31 | 32 | green "GET MODELS" 33 | if [ ! 
-f "rateme-0.1.1.tar.gz" ]; then 34 | wget "https://github.com/banderlog/rateme/releases/download/v0.1.1/rateme-0.1.1.tar.gz" 35 | fi 36 | ./venv_t/bin/pip3 install --no-deps "rateme-0.1.1.tar.gz" 37 | 38 | # download models from intel 39 | if [ ! -f "intel/text-recognition-0012/FP32/text-recognition-0012.bin" ]; then 40 | ./venv_d/bin/omz_downloader --precision FP32 -o ./ --name text-recognition-0012 41 | fi 42 | 43 | # particularly that new model does not work or something changed in decoder 44 | declare -a models=("text-detection-0004.xml" 45 | "text-detection-0004.bin") 46 | 47 | url_start="https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1" 48 | 49 | for i in "${models[@]}"; do 50 | # if no such file 51 | if [ ! -f $i ]; then 52 | # download 53 | wget "${url_start}/${i%.*}/FP32/${i}" 54 | else 55 | # checksum 56 | sha256sum -c "${i}.sha256sum" || red "PROBLEMS ^^^" 57 | fi 58 | done 59 | 60 | 61 | 62 | green "For \"$WHEEL\"" 63 | green "RUN TESTS with ./venv_t/bin/python ./tests.py" 64 | ./venv_t/bin/python ./tests.py 65 | -------------------------------------------------------------------------------- /tests_openvino/requirements.txt: -------------------------------------------------------------------------------- 1 | scikit-image 2 | openvino==2022.3.0 3 | opencv-python 4 | -------------------------------------------------------------------------------- /tests_openvino/tests.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | import cv2 3 | from pixellink import PixelLinkDetector 4 | from text_recognition import TextRecognizer 5 | from rateme.utils import RateMe 6 | 7 | 8 | class TestPackage(unittest.TestCase): 9 | 10 | def test_dnn_module(self): 11 | model = RateMe() 12 | img = cv2.imread('../tests/dislike.jpg') 13 | answer = model.predict(img) 14 | self.assertEqual(answer, 'dislike') 15 | print('rateme: passed') 16 | 17 | def test_inference_engine(self): 18 | img = cv2.imread('../tests/helloworld.png') 19 | detector4 = PixelLinkDetector('text-detection-0004.xml') 20 | detector4.detect(img) 21 | bboxes = detector4.decode() 22 | 23 | recognizer12 = TextRecognizer('intel/text-recognition-0012/FP32/text-recognition-0012.xml') 24 | answer = recognizer12.do_ocr(img, bboxes) 25 | self.assertEqual(answer, ['hello', 'world']) 26 | print('text detection and recognition: passed') 27 | 28 | 29 | if __name__ == '__main__': 30 | unittest.main() 31 | -------------------------------------------------------------------------------- /tests_openvino/text-detection-0004.bin.sha256sum: -------------------------------------------------------------------------------- 1 | 6da6456f27123be2d9a0e68bb73a7750f6aaee2f0af75d7f34ec6fa97f6727dc text-detection-0004.bin 2 | -------------------------------------------------------------------------------- /tests_openvino/text-detection-0004.xml.sha256sum: -------------------------------------------------------------------------------- 1 | 244f836e36d63c9bd45b2123f4b9e4672cae6be348c15cac857d75a8b9852dd7 text-detection-0004.xml 2 | -------------------------------------------------------------------------------- /tests_openvino/text_recognition.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | from openvino.runtime import Core 3 | import numpy as np 4 | from typing import List 5 | 6 | 7 | class TextRecognizer(): 8 | def __init__(self, xml_model_path: str): 9 | """ Class for the Intels' OCR model pipeline 10 | 11 | See 
--------------------------------------------------------------------------------
/tests_openvino/text_recognition.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | from openvino.runtime import Core
3 | import numpy as np
4 | from typing import List
5 | 
6 | 
7 | class TextRecognizer():
8 |     def __init__(self, xml_model_path: str):
9 |         """ Class for Intel's OCR model pipeline
10 | 
11 |         See https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/ \
12 |             text-recognition-0012/description/text-recognition-0012.md
13 | 
14 |         :param xml_model_path: path to model's XML file
15 |         """
16 |         # load model
17 |         ie = Core()
18 |         model = ie.read_model(xml_model_path)
19 |         self._net = ie.compile_model(model=model, device_name="CPU")
20 | 
21 |     def _get_confidences(self, img: np.ndarray, box: tuple) -> np.ndarray:
22 |         """ get OCR prediction confidences from a part of an image in memory
23 | 
24 |         :param img: BGR image
25 |         :param box: (ymin, xmin, ymax, xmax)
26 | 
27 |         :return: blob with the shape [30, 1, 37] in the format [WxBxL], where:
28 |             W - output sequence length
29 |             B - batch size
30 |             L - confidence distribution across alphanumeric symbols:
31 |                 "0123456789abcdefghijklmnopqrstuvwxyz#", where # is a special
32 |                 blank character for the CTC decoding algorithm
33 |         """
34 |         y1, x1, y2, x2 = box
35 |         img = img[y1:y2, x1:x2]
36 |         img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
37 |         blob = cv2.dnn.blobFromImage(img, 1, (120, 32))
38 |         outs = self._net([blob])
39 |         # index the result with the output layer to get the actual blob
40 |         return outs[self._net.output(0)]
41 | 
42 |     def do_ocr(self, img: np.ndarray, bboxes: List[tuple]) -> List[str]:
43 |         """ Run the OCR pipeline with a greedy decoder for each single word (bbox)
44 | 
45 |         :param img: BGR image
46 |         :param bboxes: list of separate word bboxes (ymin, xmin, ymax, xmax)
47 | 
48 |         :return: recognized words
49 | 
50 |         For the TF version use:
51 | 
52 |         .. code-block:: python
53 | 
54 |             # 30 is `confs.shape[0]`, it is fixed
55 |             a, b = tf.nn.ctc_beam_search_decoder(confs, np.array([30]))
56 |             idxs_no_blanks = tf.sparse.to_dense(a[0])[0].numpy()
57 |             word = ''.join(char_vec[idxs_no_blanks])
58 |         """
59 |         words = []
60 |         # the net can detect only these chars
61 |         char_vec = np.array(list("0123456789abcdefghijklmnopqrstuvwxyz#"))
62 | 
63 |         for box in bboxes:
64 |             # confidence distribution across symbols
65 |             confs = self._get_confidences(img, box)
66 |             # take the index with maximal confidence at each step, aka greedy decoding
67 |             idxs = confs[:, 0, :].argmax(axis=1)
68 |             # drop blank characters '#' (id == 36 in char_vec);
69 |             # supposedly we take only separate words as input
70 |             idxs_no_blanks = idxs[idxs != 36]
71 |             # join to string
72 |             word = ''.join(char_vec[idxs_no_blanks])
73 |             words.append(word)
74 | 
75 |         return words
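76 | 
77 | 
78 | if __name__ == '__main__':
79 |     # A minimal self-contained sketch of the greedy CTC decoding used above,
80 |     # run on hypothetical confidences instead of a real model output
81 |     # (index mapping: '0'..'9' == 0..9, 'a'..'z' == 10..35, blank '#' == 36)
82 |     confs = np.zeros((30, 1, 37))
83 |     confs[:, 0, 36] = 1.0  # blank everywhere by default
84 |     for t, idx in enumerate([17, 14, 21, 21, 24]):  # h, e, l, l, o
85 |         confs[t, 0, idx] = 2.0
86 |     idxs = confs[:, 0, :].argmax(axis=1)
87 |     chars = np.array(list("0123456789abcdefghijklmnopqrstuvwxyz#"))
88 |     print(''.join(chars[idxs[idxs != 36]]))  # -> hello
89 | 
--------------------------------------------------------------------------------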