├── ESP32-CAM-pinout-new.png ├── LICENSE ├── README.md ├── edge-impulse-esp32-cam-bare ├── camera_index.h ├── camera_pins.h ├── dl_lib_matrix3d.h ├── dl_lib_matrix3dq.h ├── edge-impulse-esp32-cam-bare.ino ├── esp_image.hpp ├── frmn.h ├── image_util.c ├── image_util.h └── mtmn.h ├── edge-impulse-esp32-cam ├── camera_index.h ├── camera_pins.h ├── dl_lib_matrix3d.h ├── dl_lib_matrix3dq.h ├── edge-impulse-esp32-cam.ino ├── esp_image.hpp ├── frmn.h ├── image_util.c ├── image_util.h └── mtmn.h ├── ei-esp32-cam-cat-dog-arduino-1.0.4.zip └── esp32-cam-edge-impulse.png /ESP32-CAM-pinout-new.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/alankrantas/edge-impulse-esp32-cam-image-classification/bea4de2a83737349598063bc4ded2949cdcc5b25/ESP32-CAM-pinout-new.png -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Alan Wang 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Live Image Classification on ESP32-CAM and ST7735 TFT using MobileNet v1 from Edge Impulse (TinyML) 2 | 3 | ![41Ub2S0SjXL _AC_](https://user-images.githubusercontent.com/44191076/153631624-e13576b3-b440-4cd0-8a42-fd29cbe25a2d.jpg) 4 | 5 | This example runs a micro neural network model on the 10-dollar Ai-Thinker ESP32-CAM board and shows the image classification results on a small TFT LCD display. 6 | 7 | > Note that I am not testing or improving this any further. This is simply a proof of concept and a demonstration that you can make simple, practical edge AI devices without making them overly complicated. 8 | 9 | This is modified from [ESP32 Cam and Edge Impulse](https://github.com/edgeimpulse/example-esp32-cam) with simplified code, TFT support and the necessary libraries copied from Espressif's [esp-face](https://github.com/Yuri-R-Studio/esp-face). ```esp-face``` has since been refactored into [esp-dl](https://github.com/espressif/esp-dl) to support their other products, which broke the original example. The original example also requires a WiFi connection and suffers from image lag, which makes it difficult to use. My version works more like a hand-held point-and-shoot camera. 10 | 11 | > See the original example repo or [this article](https://www.survivingwithandroid.com/tinyml-esp32-cam-edge-image-classification-with-edge-impulse/) about how to generate your own model on Edge Impulse. 
You can also still run the original example by copying all the libraries in this example into the project directory, then re-opening the .ino script. 12 | 13 | ![demo](https://user-images.githubusercontent.com/44191076/154735134-12b59e38-79d6-4890-945c-db0604b0444e.JPG) 14 | 15 | See the [video demonstration](https://www.youtube.com/watch?v=UoWfiEZE0Y4) 16 | 17 | [中文版介紹 (introduction in Chinese)](https://alankrantas.medium.com/tinyml-%E5%BD%B1%E5%83%8F%E8%BE%A8%E8%AD%98%E5%BE%AE%E5%9E%8B%E5%8C%96-%E5%9C%A8%E6%88%90%E6%9C%AC-10-%E7%BE%8E%E5%85%83%E7%9A%84-esp32-cam-%E9%96%8B%E7%99%BC%E6%9D%BF%E4%B8%8A%E5%8D%B3%E6%99%82%E5%88%86%E9%A1%9E%E8%B2%93%E7%8B%97-%E5%BE%9E%E6%AD%A4%E8%B7%9F%E9%BA%BB%E7%85%A9%E7%9A%84-wifi-%E9%80%A3%E7%B7%9A%E8%AA%AA%E6%8B%9C%E6%8B%9C-%E4%BD%BF%E7%94%A8-mobilenet-v1-%E6%A8%A1%E5%9E%8B%E8%88%87%E9%81%B7%E7%A7%BB%E5%AD%B8%E7%BF%92-10fb02da83e9) 18 | 19 | ## Setup 20 | 21 | The following is needed in your Arduino IDE: 22 | 23 | * [Arduino-ESP32 board support](https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json) (select ```Ai Thinker ESP32-CAM```) 24 | * [Adafruit GFX Library](https://github.com/adafruit/Adafruit-GFX-Library) 25 | * [Adafruit ST7735 and ST7789 Library](https://github.com/adafruit/Adafruit-ST7735-Library) 26 | * [Import](https://docs.arduino.cc/software/ide-v1/tutorials/installing-libraries) the model library you've generated from Edge Impulse Studio 27 | * Download [edge-impulse-esp32-cam](https://github.com/alankrantas/edge-impulse-esp32-cam-image-classification/tree/main/edge-impulse-esp32-cam) from this repo and open the ```.ino``` file in the directory. 28 | 29 | Note that you won't be able to read any serial output if you use Arduino IDE 2.0! 
30 | 31 | ## Wiring 32 | 33 | ![pinout](https://github.com/alankrantas/edge-impulse-esp32-cam-image-classification/raw/main/ESP32-CAM-pinout-new.png) 34 | 35 | ![wiring](https://github.com/alankrantas/edge-impulse-esp32-cam-image-classification/raw/main/esp32-cam-edge-impulse.png) 36 | 37 | For the ESP32-CAM, the side with the reset button is "up". The whole system is powered from a power module that can output both 5V and 3.3V: the ESP32-CAM is powered by 5V and the TFT by 3.3V. I use a 7.5V 1A charger (power modules require 6.5V+ to provide a stable 5V). The power module I use only outputs 500 mA max, but you don't need much since we don't use WiFi. 38 | 39 | | USB-TTL pins | ESP32-CAM | 40 | | --- | --- | 41 | | Tx | GPIO 3 (U0R) | 42 | | Rx | GPIO 1 (U0T) | 43 | | GND | GND | 44 | 45 | The USB-TTL's GND should be connected to the breadboard, not the ESP32-CAM itself. If you want to upload code, disconnect the power, connect GPIO 0 to GND (also on the breadboard), then power it up; it will then be in flash mode. (The alternative is to remove the ESP32-CAM itself and use the ESP32-CAM-MB programmer board.) 46 | 47 | | TFT pins | ESP32-CAM | 48 | | --- | --- | 49 | | SCK (SCL) | GPIO 14 | 50 | | MOSI (SDA) | GPIO 13 | 51 | | RESET (RST) | GPIO 12 | 52 | | DC | GPIO 2 | 53 | | CS | GPIO 15 | 54 | | BL (back light) | 3V3 | 55 | 56 | The script will display a 120x120 image on the TFT, so any 160x128 or 128x128 version can be used. But you might want to change the parameter in ```tft.initR(INITR_GREENTAB);``` to ```INITR_REDTAB``` or ```INITR_BLACKTAB``` to get correct text colors. 57 | 58 | | Button | ESP32-CAM | 59 | | --- | --- | 60 | | BTN | 3V3 | 61 | | BTN | GPIO 4 | 62 | 63 | Note that since the button pin is shared with the flash LED (it is the only available pin left; GPIO 16 is camera-related), the button has to be **pulled down** with two 10 KΩ resistors. 
64 | 65 | ## The Example Model - Cat & Dog Classification 66 | 67 | My demo model used Microsoft's [Kaggle Cats and Dogs Dataset](https://www.microsoft.com/en-us/download/details.aspx?id=54765), which has 12,500 cats and 12,500 dogs. 24,969 photos were successfully uploaded and split into an 80/20 training/test set. The variety of the images is perfect since we are not doing YOLO- or SSD-style object detection. 68 | 69 | ![下載](https://user-images.githubusercontent.com/44191076/154785876-b65de5e1-acba-4c2a-9c25-01d02e9b7a2b.png) 70 | 71 | The model I chose was ```MobileNetV1 96x96 0.25 (no final dense layer, 0.1 dropout)``` with transfer learning. Since free Edge Impulse accounts have a training time limit of 20 minutes per job, I could only train the model for 5 cycles. (You can go [ask for more](https://forum.edgeimpulse.com/t/err-deadlineexceeded-ways-to-fix-this/2354/2) though...) I imagine that if you have only a dozen images per class, you can try better models or longer training cycles. 72 | 73 | Anyway, I got ```89.8%``` accuracy for the training set and ```86.97%``` for the test set, which seems decent enough. 74 | 75 | ![1](https://user-images.githubusercontent.com/44191076/153631673-96b90c0b-5745-43b9-9e5f-9a426d8bfe61.png) 76 | 77 | Also, the ESP32-CAM was not an officially supported board when I created this project, so I could not use the EON Tuner for further fine-tuning. 78 | 79 | You can find my published Edge Impulse project here: [esp32-cam-cat-dog](https://studio.edgeimpulse.com/public/76904/latest). [ei-esp32-cam-cat-dog-arduino-1.0.4.zip](https://github.com/alankrantas/edge-impulse-esp32-cam-image-classification/blob/main/ei-esp32-cam-cat-dog-arduino-1.0.4.zip) is the downloaded Arduino library, which can be imported into the Arduino IDE. 80 | 81 | The camera captures 240x240 images and resizes them to 96x96 for the model input, and also resizes the original image to 120x120 for the TFT display. 
The model inference time (prediction time) is 2607 ms (2.6 secs) per image, which is not very fast, but the results are mostly good. I don't know yet whether different image sets or models may affect the results. 82 | 83 | > Note: the demo model has only two classes - dog and cat - thus it will try to "predict" whatever it sees as either a dog or a cat. A better model should have a third class of "neither dog nor cat" to avoid invalid responses. 84 | 85 | ## Boilerplate Version 86 | 87 | The [edge-impulse-esp32-cam-bare](https://github.com/alankrantas/edge-impulse-esp32-cam-image-classification/tree/main/edge-impulse-esp32-cam-bare) is the version that doesn't use any external devices. The model runs in a non-stop loop. You can point the camera at the images and read the prediction via the serial port (use Arduino IDE 1.x). 88 | 89 | ![bogdan-farca-CEx86maLUSc-unsplash](https://user-images.githubusercontent.com/44191076/153636524-9b2edab9-7c50-4aa1-9d6e-74477d67011f.jpg) 90 | 91 | ![richard-brutyo-Sg3XwuEpybU-unsplash](https://user-images.githubusercontent.com/44191076/153636561-16f7fb47-dcfc-4988-8772-85dcc5acfdac.jpg) 92 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam-bare/camera_pins.h: -------------------------------------------------------------------------------- 1 | 2 | #if defined(CAMERA_MODEL_WROVER_KIT) 3 | #define PWDN_GPIO_NUM -1 4 | #define RESET_GPIO_NUM -1 5 | #define XCLK_GPIO_NUM 21 6 | #define SIOD_GPIO_NUM 26 7 | #define SIOC_GPIO_NUM 27 8 | 9 | #define Y9_GPIO_NUM 35 10 | #define Y8_GPIO_NUM 34 11 | #define Y7_GPIO_NUM 39 12 | #define Y6_GPIO_NUM 36 13 | #define Y5_GPIO_NUM 19 14 | #define Y4_GPIO_NUM 18 15 | #define Y3_GPIO_NUM 5 16 | #define Y2_GPIO_NUM 4 17 | #define VSYNC_GPIO_NUM 25 18 | #define HREF_GPIO_NUM 23 19 | #define PCLK_GPIO_NUM 22 20 | 21 | #elif defined(CAMERA_MODEL_ESP_EYE) 22 | #define PWDN_GPIO_NUM -1 23 | #define RESET_GPIO_NUM -1 24 | #define XCLK_GPIO_NUM 4 25 | #define 
SIOD_GPIO_NUM 18 26 | #define SIOC_GPIO_NUM 23 27 | 28 | #define Y9_GPIO_NUM 36 29 | #define Y8_GPIO_NUM 37 30 | #define Y7_GPIO_NUM 38 31 | #define Y6_GPIO_NUM 39 32 | #define Y5_GPIO_NUM 35 33 | #define Y4_GPIO_NUM 14 34 | #define Y3_GPIO_NUM 13 35 | #define Y2_GPIO_NUM 34 36 | #define VSYNC_GPIO_NUM 5 37 | #define HREF_GPIO_NUM 27 38 | #define PCLK_GPIO_NUM 25 39 | 40 | #elif defined(CAMERA_MODEL_M5STACK_PSRAM) 41 | #define PWDN_GPIO_NUM -1 42 | #define RESET_GPIO_NUM 15 43 | #define XCLK_GPIO_NUM 27 44 | #define SIOD_GPIO_NUM 25 45 | #define SIOC_GPIO_NUM 23 46 | 47 | #define Y9_GPIO_NUM 19 48 | #define Y8_GPIO_NUM 36 49 | #define Y7_GPIO_NUM 18 50 | #define Y6_GPIO_NUM 39 51 | #define Y5_GPIO_NUM 5 52 | #define Y4_GPIO_NUM 34 53 | #define Y3_GPIO_NUM 35 54 | #define Y2_GPIO_NUM 32 55 | #define VSYNC_GPIO_NUM 22 56 | #define HREF_GPIO_NUM 26 57 | #define PCLK_GPIO_NUM 21 58 | 59 | #elif defined(CAMERA_MODEL_M5STACK_V2_PSRAM) 60 | #define PWDN_GPIO_NUM -1 61 | #define RESET_GPIO_NUM 15 62 | #define XCLK_GPIO_NUM 27 63 | #define SIOD_GPIO_NUM 22 64 | #define SIOC_GPIO_NUM 23 65 | 66 | #define Y9_GPIO_NUM 19 67 | #define Y8_GPIO_NUM 36 68 | #define Y7_GPIO_NUM 18 69 | #define Y6_GPIO_NUM 39 70 | #define Y5_GPIO_NUM 5 71 | #define Y4_GPIO_NUM 34 72 | #define Y3_GPIO_NUM 35 73 | #define Y2_GPIO_NUM 32 74 | #define VSYNC_GPIO_NUM 25 75 | #define HREF_GPIO_NUM 26 76 | #define PCLK_GPIO_NUM 21 77 | 78 | #elif defined(CAMERA_MODEL_M5STACK_WIDE) 79 | #define PWDN_GPIO_NUM -1 80 | #define RESET_GPIO_NUM 15 81 | #define XCLK_GPIO_NUM 27 82 | #define SIOD_GPIO_NUM 22 83 | #define SIOC_GPIO_NUM 23 84 | 85 | #define Y9_GPIO_NUM 19 86 | #define Y8_GPIO_NUM 36 87 | #define Y7_GPIO_NUM 18 88 | #define Y6_GPIO_NUM 39 89 | #define Y5_GPIO_NUM 5 90 | #define Y4_GPIO_NUM 34 91 | #define Y3_GPIO_NUM 35 92 | #define Y2_GPIO_NUM 32 93 | #define VSYNC_GPIO_NUM 25 94 | #define HREF_GPIO_NUM 26 95 | #define PCLK_GPIO_NUM 21 96 | 97 | #elif defined(CAMERA_MODEL_M5STACK_ESP32CAM) 98 | 
#define PWDN_GPIO_NUM -1 99 | #define RESET_GPIO_NUM 15 100 | #define XCLK_GPIO_NUM 27 101 | #define SIOD_GPIO_NUM 25 102 | #define SIOC_GPIO_NUM 23 103 | 104 | #define Y9_GPIO_NUM 19 105 | #define Y8_GPIO_NUM 36 106 | #define Y7_GPIO_NUM 18 107 | #define Y6_GPIO_NUM 39 108 | #define Y5_GPIO_NUM 5 109 | #define Y4_GPIO_NUM 34 110 | #define Y3_GPIO_NUM 35 111 | #define Y2_GPIO_NUM 17 112 | #define VSYNC_GPIO_NUM 22 113 | #define HREF_GPIO_NUM 26 114 | #define PCLK_GPIO_NUM 21 115 | 116 | #elif defined(CAMERA_MODEL_AI_THINKER) 117 | #define PWDN_GPIO_NUM 32 118 | #define RESET_GPIO_NUM -1 119 | #define XCLK_GPIO_NUM 0 120 | #define SIOD_GPIO_NUM 26 121 | #define SIOC_GPIO_NUM 27 122 | 123 | #define Y9_GPIO_NUM 35 124 | #define Y8_GPIO_NUM 34 125 | #define Y7_GPIO_NUM 39 126 | #define Y6_GPIO_NUM 36 127 | #define Y5_GPIO_NUM 21 128 | #define Y4_GPIO_NUM 19 129 | #define Y3_GPIO_NUM 18 130 | #define Y2_GPIO_NUM 5 131 | #define VSYNC_GPIO_NUM 25 132 | #define HREF_GPIO_NUM 23 133 | #define PCLK_GPIO_NUM 22 134 | 135 | #elif defined(CAMERA_MODEL_TTGO_T_JOURNAL) 136 | #define PWDN_GPIO_NUM 0 137 | #define RESET_GPIO_NUM 15 138 | #define XCLK_GPIO_NUM 27 139 | #define SIOD_GPIO_NUM 25 140 | #define SIOC_GPIO_NUM 23 141 | 142 | #define Y9_GPIO_NUM 19 143 | #define Y8_GPIO_NUM 36 144 | #define Y7_GPIO_NUM 18 145 | #define Y6_GPIO_NUM 39 146 | #define Y5_GPIO_NUM 5 147 | #define Y4_GPIO_NUM 34 148 | #define Y3_GPIO_NUM 35 149 | #define Y2_GPIO_NUM 17 150 | #define VSYNC_GPIO_NUM 22 151 | #define HREF_GPIO_NUM 26 152 | #define PCLK_GPIO_NUM 21 153 | 154 | #else 155 | #error "Camera model not selected" 156 | #endif 157 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam-bare/dl_lib_matrix3d.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | 10 | #if CONFIG_SPIRAM_SUPPORT || 
CONFIG_ESP32_SPIRAM_SUPPORT 11 | #include "freertos/FreeRTOS.h" 12 | #define DL_SPIRAM_SUPPORT 1 13 | #else 14 | #define DL_SPIRAM_SUPPORT 0 15 | #endif 16 | 17 | 18 | #ifndef max 19 | #define max(x, y) (((x) < (y)) ? (y) : (x)) 20 | #endif 21 | 22 | #ifndef min 23 | #define min(x, y) (((x) < (y)) ? (x) : (y)) 24 | #endif 25 | 26 | typedef float fptp_t; 27 | typedef uint8_t uc_t; 28 | 29 | typedef enum 30 | { 31 | DL_SUCCESS = 0, 32 | DL_FAIL = 1, 33 | } dl_error_type; 34 | 35 | typedef enum 36 | { 37 | PADDING_VALID = 0, /*!< Valid padding */ 38 | PADDING_SAME = 1, /*!< Same padding, from right to left, free input */ 39 | PADDING_SAME_DONT_FREE_INPUT = 2, /*!< Same padding, from right to left, do not free input */ 40 | PADDING_SAME_MXNET = 3, /*!< Same padding, from left to right */ 41 | } dl_padding_type; 42 | 43 | typedef enum 44 | { 45 | DL_POOLING_MAX = 0, /*!< Max pooling */ 46 | DL_POOLING_AVG = 1, /*!< Average pooling */ 47 | } dl_pooling_type; 48 | /* 49 | * Matrix for 3d 50 | * @Warning: the sequence of variables is fixed, cannot be modified, otherwise there will be errors in esp_dsp_dot_float 51 | */ 52 | typedef struct 53 | { 54 | int w; /*!< Width */ 55 | int h; /*!< Height */ 56 | int c; /*!< Channel */ 57 | int n; /*!< Number of filter, input and output must be 1 */ 58 | int stride; /*!< Step between lines */ 59 | fptp_t *item; /*!< Data */ 60 | } dl_matrix3d_t; 61 | 62 | typedef struct 63 | { 64 | int w; /*!< Width */ 65 | int h; /*!< Height */ 66 | int c; /*!< Channel */ 67 | int n; /*!< Number of filter, input and output must be 1 */ 68 | int stride; /*!< Step between lines */ 69 | uc_t *item; /*!< Data */ 70 | } dl_matrix3du_t; 71 | 72 | typedef enum 73 | { 74 | UPSAMPLE_NEAREST_NEIGHBOR = 0, /*!< Use nearest neighbor interpolation as the upsample method*/ 75 | UPSAMPLE_BILINEAR = 1, /*!< Use nearest bilinear interpolation as the upsample method*/ 76 | } dl_upsample_type; 77 | 78 | typedef struct 79 | { 80 | int stride_x; /*!< Strides of width */ 
81 | int stride_y; /*!< Strides of height */ 82 | dl_padding_type padding; /*!< Padding type */ 83 | } dl_matrix3d_mobilenet_config_t; 84 | 85 | /* 86 | * @brief Allocate a zero-initialized space. Must use 'dl_lib_free' to free the memory. 87 | * 88 | * @param cnt Count of units. 89 | * @param size Size of unit. 90 | * @param align Align of memory. If not required, set 0. 91 | * @return Pointer of allocated memory. Null for failed. 92 | */ 93 | static void *dl_lib_calloc(int cnt, int size, int align) 94 | { 95 | int total_size = cnt * size + align + sizeof(void *); 96 | void *res = malloc(total_size); 97 | if (NULL == res) 98 | { 99 | #if DL_SPIRAM_SUPPORT 100 | res = heap_caps_malloc(total_size, MALLOC_CAP_8BIT | MALLOC_CAP_SPIRAM); 101 | } 102 | if (NULL == res) 103 | { 104 | printf("Item psram alloc failed. Size: %d x %d\n", cnt, size); 105 | #else 106 | printf("Item alloc failed. Size: %d x %d, SPIRAM_FLAG: %d\n", cnt, size, DL_SPIRAM_SUPPORT); 107 | #endif 108 | return NULL; 109 | } 110 | bzero(res, total_size); 111 | void **data = (void **)res + 1; 112 | void **aligned; 113 | if (align) 114 | aligned = (void **)(((size_t)data + (align - 1)) & -align); 115 | else 116 | aligned = data; 117 | 118 | aligned[-1] = res; 119 | return (void *)aligned; 120 | } 121 | 122 | /** 123 | * @brief Free the memory space allocated by 'dl_lib_calloc' 124 | * 125 | */ 126 | static inline void dl_lib_free(void *d) 127 | { 128 | if (NULL == d) 129 | return; 130 | 131 | free(((void **)d)[-1]); 132 | } 133 | 134 | /* 135 | * @brief Allocate a 3D matrix with float items, the access sequence is NHWC 136 | * 137 | * @param n Number of matrix3d, for filters it is out channels, for others it is 1 138 | * @param w Width of matrix3d 139 | * @param h Height of matrix3d 140 | * @param c Channel of matrix3d 141 | * @return 3d matrix 142 | */ 143 | static inline dl_matrix3d_t *dl_matrix3d_alloc(int n, int w, int h, int c) 144 | { 145 | dl_matrix3d_t *r = (dl_matrix3d_t *)dl_lib_calloc(1, 
sizeof(dl_matrix3d_t), 0); 146 | if (NULL == r) 147 | { 148 | printf("internal r failed.\n"); 149 | return NULL; 150 | } 151 | fptp_t *items = (fptp_t *)dl_lib_calloc(n * w * h * c, sizeof(fptp_t), 0); 152 | if (NULL == items) 153 | { 154 | printf("matrix3d item alloc failed.\n"); 155 | dl_lib_free(r); 156 | return NULL; 157 | } 158 | 159 | r->w = w; 160 | r->h = h; 161 | r->c = c; 162 | r->n = n; 163 | r->stride = w * c; 164 | r->item = items; 165 | 166 | return r; 167 | } 168 | 169 | /* 170 | * @brief Allocate a 3D matrix with 8-bits items, the access sequence is NHWC 171 | * 172 | * @param n Number of matrix3d, for filters it is out channels, for others it is 1 173 | * @param w Width of matrix3d 174 | * @param h Height of matrix3d 175 | * @param c Channel of matrix3d 176 | * @return 3d matrix 177 | */ 178 | static inline dl_matrix3du_t *dl_matrix3du_alloc(int n, int w, int h, int c) 179 | { 180 | dl_matrix3du_t *r = (dl_matrix3du_t *)dl_lib_calloc(1, sizeof(dl_matrix3du_t), 0); 181 | if (NULL == r) 182 | { 183 | printf("internal r failed.\n"); 184 | return NULL; 185 | } 186 | uc_t *items = (uc_t *)dl_lib_calloc(n * w * h * c, sizeof(uc_t), 0); 187 | if (NULL == items) 188 | { 189 | printf("matrix3du item alloc failed.\n"); 190 | dl_lib_free(r); 191 | return NULL; 192 | } 193 | 194 | r->w = w; 195 | r->h = h; 196 | r->c = c; 197 | r->n = n; 198 | r->stride = w * c; 199 | r->item = items; 200 | 201 | return r; 202 | } 203 | 204 | /* 205 | * @brief Free a matrix3d 206 | * 207 | * @param m matrix3d with float items 208 | */ 209 | static inline void dl_matrix3d_free(dl_matrix3d_t *m) 210 | { 211 | if (NULL == m) 212 | return; 213 | if (NULL == m->item) 214 | { 215 | dl_lib_free(m); 216 | return; 217 | } 218 | dl_lib_free(m->item); 219 | dl_lib_free(m); 220 | } 221 | 222 | /* 223 | * @brief Free a matrix3d 224 | * 225 | * @param m matrix3d with 8-bits items 226 | */ 227 | static inline void dl_matrix3du_free(dl_matrix3du_t *m) 228 | { 229 | if (NULL == m) 230 | 
return; 231 | if (NULL == m->item) 232 | { 233 | dl_lib_free(m); 234 | return; 235 | } 236 | dl_lib_free(m->item); 237 | dl_lib_free(m); 238 | } 239 | 240 | 241 | /* 242 | * @brief Dot product with a vector and matrix 243 | * 244 | * @param out Space to put the result 245 | * @param in input vector 246 | * @param f filter matrix 247 | */ 248 | void dl_matrix3dff_dot_product(dl_matrix3d_t *out, dl_matrix3d_t *in, dl_matrix3d_t *f); 249 | 250 | /** 251 | * @brief Do a softmax operation on a matrix3d 252 | * 253 | * @param in Input matrix3d 254 | */ 255 | void dl_matrix3d_softmax(dl_matrix3d_t *m); 256 | 257 | /** 258 | * @brief Copy a range of float items from an existing matrix to a preallocated matrix 259 | * 260 | * @param dst The destination slice matrix 261 | * @param src The source matrix to slice 262 | * @param x X-offset of the origin of the returned matrix within the sliced matrix 263 | * @param y Y-offset of the origin of the returned matrix within the sliced matrix 264 | * @param w Width of the resulting matrix 265 | * @param h Height of the resulting matrix 266 | */ 267 | void dl_matrix3d_slice_copy(dl_matrix3d_t *dst, 268 | dl_matrix3d_t *src, 269 | int x, 270 | int y, 271 | int w, 272 | int h); 273 | 274 | /** 275 | * @brief Copy a range of 8-bits items from an existing matrix to a preallocated matrix 276 | * 277 | * @param dst The destination slice matrix 278 | * @param src The source matrix to slice 279 | * @param x X-offset of the origin of the returned matrix within the sliced matrix 280 | * @param y Y-offset of the origin of the returned matrix within the sliced matrix 281 | * @param w Width of the resulting matrix 282 | * @param h Height of the resulting matrix 283 | */ 284 | void dl_matrix3du_slice_copy(dl_matrix3du_t *dst, 285 | dl_matrix3du_t *src, 286 | int x, 287 | int y, 288 | int w, 289 | int h); 290 | 291 | /** 292 | * @brief Transform a sliced matrix block from nhwc to nchw; the block needs to be contiguous in memory. 
293 | * 294 | * @param out The destination sliced matrix in nchw 295 | * @param in The source sliced matrix in nhwc 296 | */ 297 | void dl_matrix3d_sliced_transform_nchw(dl_matrix3d_t *out, 298 | dl_matrix3d_t *in); 299 | 300 | /** 301 | * @brief Do a general CNN layer pass, dimension is (number, width, height, channel) 302 | * 303 | * @param in Input matrix3d 304 | * @param filter Weights of the neurons 305 | * @param bias Bias for the CNN layer 306 | * @param stride_x The step length of the convolution window in x(width) direction 307 | * @param stride_y The step length of the convolution window in y(height) direction 308 | * @param padding One of VALID or SAME 309 | * @param mode Do convolution using the C implementation or the Xtensa implementation, 0 or 1 respectively. 310 | * If ESP_PLATFORM is not defined, this value is not used. Default is 0 311 | * @return dl_matrix3d_t* The result of CNN layer 312 | */ 313 | dl_matrix3d_t *dl_matrix3d_conv(dl_matrix3d_t *in, 314 | dl_matrix3d_t *filter, 315 | dl_matrix3d_t *bias, 316 | int stride_x, 317 | int stride_y, 318 | int padding, 319 | int mode); 320 | 321 | /** 322 | * @brief Do a global average pooling layer pass, dimension is (number, width, height, channel) 323 | * 324 | * @param in Input matrix3d 325 | * 326 | * @return The result of global average pooling layer 327 | */ 328 | dl_matrix3d_t *dl_matrix3d_global_pool(dl_matrix3d_t *in); 329 | 330 | /** 331 | * @brief Calculate pooling layer of a feature map 332 | * 333 | * @param in Input matrix, size (1, w, h, c) 334 | * @param f_w Window width 335 | * @param f_h Window height 336 | * @param stride_x Stride in horizontal direction 337 | * @param stride_y Stride in vertical direction 338 | * @param padding Padding type: PADDING_VALID and PADDING_SAME 339 | * @param pooling_type Pooling type: DL_POOLING_MAX and DL_POOLING_AVG 340 | * @return dl_matrix3d_t* Resulting matrix, size (1, w', h', c) 341 | */ 342 | dl_matrix3d_t *dl_matrix3d_pooling(dl_matrix3d_t *in, 343 | int f_w, 344 | 
int f_h, 345 | int stride_x, 346 | int stride_y, 347 | dl_padding_type padding, 348 | dl_pooling_type pooling_type); 349 | /** 350 | * @brief Do a batch normalization operation, update the input matrix3d: input = input * scale + offset 351 | * 352 | * @param m Input matrix3d 353 | * @param scale scale matrix3d, scale = gamma/((moving_variance+sigma)^(1/2)) 354 | * @param offset Offset matrix3d, offset = beta-(moving_mean*gamma/((moving_variance+sigma)^(1/2))) 355 | */ 356 | void dl_matrix3d_batch_normalize(dl_matrix3d_t *m, 357 | dl_matrix3d_t *scale, 358 | dl_matrix3d_t *offset); 359 | 360 | /** 361 | * @brief Add a pair of matrix3d item-by-item: res=in_1+in_2 362 | * 363 | * @param in_1 First Floating point input matrix3d 364 | * @param in_2 Second Floating point input matrix3d 365 | * 366 | * @return dl_matrix3d_t* Added data 367 | */ 368 | dl_matrix3d_t *dl_matrix3d_add(dl_matrix3d_t *in_1, dl_matrix3d_t *in_2); 369 | 370 | /** 371 | * @brief Concatenate the channels of two matrix3ds into a new matrix3d 372 | * 373 | * @param in_1 First Floating point input matrix3d 374 | * @param in_2 Second Floating point input matrix3d 375 | * 376 | * @return dl_matrix3d_t* A newly allocated matrix3d with values in_1|in_2 377 | */ 378 | dl_matrix3d_t *dl_matrix3d_concat(dl_matrix3d_t *in_1, dl_matrix3d_t *in_2); 379 | 380 | /** 381 | * @brief Concatenate the channels of four matrix3ds into a new matrix3d 382 | * 383 | * @param in_1 First Floating point input matrix3d 384 | * @param in_2 Second Floating point input matrix3d 385 | * @param in_3 Third Floating point input matrix3d 386 | * @param in_4 Fourth Floating point input matrix3d 387 | * 388 | * @return A newly allocated matrix3d with values in_1|in_2|in_3|in_4 389 | */ 390 | dl_matrix3d_t *dl_matrix3d_concat_4(dl_matrix3d_t *in_1, 391 | dl_matrix3d_t *in_2, 392 | dl_matrix3d_t *in_3, 393 | dl_matrix3d_t *in_4); 394 | 395 | /** 396 | * @brief Concatenate the channels of eight matrix3ds into a new matrix3d 397 | 
398 | * @param in_1 First Floating point input matrix3d 399 | * @param in_2 Second Floating point input matrix3d 400 | * @param in_3 Third Floating point input matrix3d 401 | * @param in_4 Fourth Floating point input matrix3d 402 | * @param in_5 Fifth Floating point input matrix3d 403 | * @param in_6 Sixth Floating point input matrix3d 404 | * @param in_7 Seventh Floating point input matrix3d 405 | * @param in_8 Eighth Floating point input matrix3d 406 | * 407 | * @return A newly allocated matrix3d with values in_1|in_2|in_3|in_4|in_5|in_6|in_7|in_8 408 | */ 409 | dl_matrix3d_t *dl_matrix3d_concat_8(dl_matrix3d_t *in_1, 410 | dl_matrix3d_t *in_2, 411 | dl_matrix3d_t *in_3, 412 | dl_matrix3d_t *in_4, 413 | dl_matrix3d_t *in_5, 414 | dl_matrix3d_t *in_6, 415 | dl_matrix3d_t *in_7, 416 | dl_matrix3d_t *in_8); 417 | 418 | /** 419 | * @brief Do a mobilefacenet block forward, dimension is (number, width, height, channel) 420 | * 421 | * @param in Input matrix3d 422 | * @param pw Weights of the pointwise conv layer 423 | * @param pw_bn_scale The scale params of the batch_normalize layer after the pointwise conv layer 424 | * @param pw_bn_offset The offset params of the batch_normalize layer after the pointwise conv layer 425 | * @param dw Weights of the depthwise conv layer 426 | * @param dw_bn_scale The scale params of the batch_normalize layer after the depthwise conv layer 427 | * @param dw_bn_offset The offset params of the batch_normalize layer after the depthwise conv layer 428 | * @param pw_linear Weights of the pointwise linear conv layer 429 | * @param pw_linear_bn_scale The scale params of the batch_normalize layer after the pointwise linear conv layer 430 | * @param pw_linear_bn_offset The offset params of the batch_normalize layer after the pointwise linear conv layer 431 | * @param stride_x The step length of the convolution window in x(width) direction 432 | * @param stride_y The step length of the convolution window in y(height) direction 433 | * @param 
padding One of VALID or SAME 434 | * @param mode Do convolution using the C implementation or the Xtensa implementation, 0 or 1 respectively. 435 | If ESP_PLATFORM is not defined, this value is not used. Default is 0 436 | * @return The result of a mobilefacenet block 437 | */ 438 | dl_matrix3d_t *dl_matrix3d_mobilefaceblock(dl_matrix3d_t *in, 439 | dl_matrix3d_t *pw, 440 | dl_matrix3d_t *pw_bn_scale, 441 | dl_matrix3d_t *pw_bn_offset, 442 | dl_matrix3d_t *dw, 443 | dl_matrix3d_t *dw_bn_scale, 444 | dl_matrix3d_t *dw_bn_offset, 445 | dl_matrix3d_t *pw_linear, 446 | dl_matrix3d_t *pw_linear_bn_scale, 447 | dl_matrix3d_t *pw_linear_bn_offset, 448 | int stride_x, 449 | int stride_y, 450 | int padding, 451 | int mode, 452 | int shortcut); 453 | 454 | /** 455 | * @brief Do a mobilefacenet block forward with 1x1 split conv, dimension is (number, width, height, channel) 456 | * 457 | * @param in Input matrix3d 458 | * @param pw_1 Weights of the pointwise conv layer 1 459 | * @param pw_2 Weights of the pointwise conv layer 2 460 | * @param pw_bn_scale The scale params of the batch_normalize layer after the pointwise conv layer 461 | * @param pw_bn_offset The offset params of the batch_normalize layer after the pointwise conv layer 462 | * @param dw Weights of the depthwise conv layer 463 | * @param dw_bn_scale The scale params of the batch_normalize layer after the depthwise conv layer 464 | * @param dw_bn_offset The offset params of the batch_normalize layer after the depthwise conv layer 465 | * @param pw_linear_1 Weights of the pointwise linear conv layer 1 466 | * @param pw_linear_2 Weights of the pointwise linear conv layer 2 467 | * @param pw_linear_bn_scale The scale params of the batch_normalize layer after the pointwise linear conv layer 468 | * @param pw_linear_bn_offset The offset params of the batch_normalize layer after the pointwise linear conv layer 469 | * @param stride_x The step length of the convolution window in x(width) direction 470 | * @param stride_y The step 
length of the convolution window in y(height) direction 471 | * @param padding One of VALID or SAME 472 | * @param mode Do convolution using the C implementation or the Xtensa implementation, 0 or 1 respectively. 473 | * If ESP_PLATFORM is not defined, this value is not used. Default is 0 474 | * @return The result of a mobilefacenet block 475 | */ 476 | dl_matrix3d_t *dl_matrix3d_mobilefaceblock_split(dl_matrix3d_t *in, 477 | dl_matrix3d_t *pw_1, 478 | dl_matrix3d_t *pw_2, 479 | dl_matrix3d_t *pw_bn_scale, 480 | dl_matrix3d_t *pw_bn_offset, 481 | dl_matrix3d_t *dw, 482 | dl_matrix3d_t *dw_bn_scale, 483 | dl_matrix3d_t *dw_bn_offset, 484 | dl_matrix3d_t *pw_linear_1, 485 | dl_matrix3d_t *pw_linear_2, 486 | dl_matrix3d_t *pw_linear_bn_scale, 487 | dl_matrix3d_t *pw_linear_bn_offset, 488 | int stride_x, 489 | int stride_y, 490 | int padding, 491 | int mode, 492 | int shortcut); 493 | 494 | /** 495 | * @brief Initialize the matrix3d feature map to bias 496 | * 497 | * @param out The matrix3d feature map that needs to be initialized 498 | * @param bias The bias of a convolution operation 499 | */ 500 | void dl_matrix3d_init_bias(dl_matrix3d_t *out, dl_matrix3d_t *bias); 501 | 502 | /** 503 | * @brief Do an elementwise multiplication of two matrix3ds 504 | * 505 | * @param out Preallocated matrix3d, size (n, w, h, c) 506 | * @param in1 Input matrix 1, size (n, w, h, c) 507 | * @param in2 Input matrix 2, size (n, w, h, c) 508 | */ 509 | void dl_matrix3d_multiply(dl_matrix3d_t *out, dl_matrix3d_t *in1, dl_matrix3d_t *in2); 510 | 511 | // 512 | // Activation 513 | // 514 | 515 | /** 516 | * @brief Do a standard relu operation, update the input matrix3d 517 | * 518 | * @param m Floating point input matrix3d 519 | */ 520 | void dl_matrix3d_relu(dl_matrix3d_t *m); 521 | 522 | /** 523 | * @brief Do a relu (Rectifier Linear Unit) operation, update the input matrix3d 524 | * 525 | * @param in Floating point input matrix3d 526 | * @param clip If value is higher than this, it will be clipped to this value 
527 | */ 528 | void dl_matrix3d_relu_clip(dl_matrix3d_t *m, fptp_t clip); 529 | 530 | /** 531 | * @brief Do a PReLU (Parametric ReLU) operation, update the input matrix3d 532 | * 533 | * @param in Floating point input matrix3d 534 | * @param alpha If a value is less than zero, it is multiplied by this factor 535 | */ 536 | void dl_matrix3d_p_relu(dl_matrix3d_t *in, dl_matrix3d_t *alpha); 537 | 538 | /** 539 | * @brief Do a leaky ReLU (Rectified Linear Unit) operation, update the input matrix3d 540 | * 541 | * @param m Floating point input matrix3d 542 | * @param alpha If a value is less than zero, it is multiplied by this factor 543 | */ 544 | void dl_matrix3d_leaky_relu(dl_matrix3d_t *m, fptp_t alpha); 545 | 546 | // 547 | // Conv 1x1 548 | // 549 | /** 550 | * @brief Do 1x1 convolution with a matrix3d 551 | * 552 | * @param out Preallocated matrix3d, size (1, w, h, n) 553 | * @param in Input matrix, size (1, w, h, c) 554 | * @param filter 1x1 filter, size (n, 1, 1, c) 555 | */ 556 | void dl_matrix3dff_conv_1x1(dl_matrix3d_t *out, 557 | dl_matrix3d_t *in, 558 | dl_matrix3d_t *filter); 559 | 560 | /** 561 | * @brief Do 1x1 convolution with a matrix3d, with bias adding 562 | * 563 | * @param out Preallocated matrix3d, size (1, w, h, n) 564 | * @param in Input matrix, size (1, w, h, c) 565 | * @param filter 1x1 filter, size (n, 1, 1, c) 566 | * @param bias Bias, size (1, 1, 1, n) 567 | */ 568 | void dl_matrix3dff_conv_1x1_with_bias(dl_matrix3d_t *out, 569 | dl_matrix3d_t *in, 570 | dl_matrix3d_t *filter, 571 | dl_matrix3d_t *bias); 572 | 573 | /** 574 | * @brief Do 1x1 convolution with an 8-bit fixed point matrix 575 | * 576 | * @param out Preallocated matrix3d, size (1, w, h, n) 577 | * @param in Input matrix, size (1, w, h, c) 578 | * @param filter 1x1 filter, size (n, 1, 1, c) 579 | */ 580 | void dl_matrix3duf_conv_1x1(dl_matrix3d_t *out, 581 | dl_matrix3du_t *in, 582 | dl_matrix3d_t *filter); 583 | 584 | /** 585 | * @brief Do 1x1
convolution with an 8-bit fixed point matrix, with bias adding 586 | * 587 | * @param out Preallocated matrix3d, size (1, w, h, n) 588 | * @param in Input matrix, size (1, w, h, c) 589 | * @param filter 1x1 filter, size (n, 1, 1, c) 590 | * @param bias Bias, size (1, 1, 1, n) 591 | */ 592 | void dl_matrix3duf_conv_1x1_with_bias(dl_matrix3d_t *out, 593 | dl_matrix3du_t *in, 594 | dl_matrix3d_t *filter, 595 | dl_matrix3d_t *bias); 596 | 597 | // 598 | // Conv 3x3 599 | // 600 | 601 | /** 602 | * @brief Do 3x3 convolution with a matrix3d, without padding 603 | * 604 | * @param out Preallocated matrix3d, size (1, w, h, n) 605 | * @param in Input matrix, size (1, w, h, c) 606 | * @param f 3x3 filter, size (n, 3, 3, c) 607 | * @param step_x Stride of width 608 | * @param step_y Stride of height 609 | */ 610 | void dl_matrix3dff_conv_3x3_op(dl_matrix3d_t *out, 611 | dl_matrix3d_t *in, 612 | dl_matrix3d_t *f, 613 | int step_x, 614 | int step_y); 615 | 616 | /** 617 | * @brief Do 3x3 convolution with a matrix3d, with bias adding 618 | * 619 | * @param input Input matrix, size (1, w, h, c) 620 | * @param filter 3x3 filter, size (n, 3, 3, c) 621 | * @param bias Bias, size (1, 1, 1, n) 622 | * @param stride_x Stride of width 623 | * @param stride_y Stride of height 624 | * @param padding Padding type 625 | * @return dl_matrix3d_t* Resulting matrix3d 626 | */ 627 | dl_matrix3d_t *dl_matrix3dff_conv_3x3(dl_matrix3d_t *in, 628 | dl_matrix3d_t *filter, 629 | dl_matrix3d_t *bias, 630 | int stride_x, 631 | int stride_y, 632 | dl_padding_type padding); 633 | 634 | // 635 | // Conv Common 636 | // 637 | 638 | /** 639 | * @brief Do a general convolution layer pass with an 8-bit fixed point matrix, size is (number, width, height, channel) 640 | * 641 | * @param in Input image 642 | * @param filter Weights of the neurons 643 | * @param bias Bias for the CNN layer 644 | * @param stride_x The step length of the convolution window in x(width) direction 645 | * @param stride_y The step 
length of the convolution window in y(height) direction 646 | * @param padding Padding type 647 | * @return dl_matrix3d_t* Resulting matrix3d 648 | */ 649 | dl_matrix3d_t *dl_matrix3duf_conv_common(dl_matrix3du_t *in, 650 | dl_matrix3d_t *filter, 651 | dl_matrix3d_t *bias, 652 | int stride_x, 653 | int stride_y, 654 | dl_padding_type padding); 655 | 656 | /** 657 | * @brief Do a general convolution layer pass, size is (number, width, height, channel) 658 | * 659 | * @param in Input image 660 | * @param filter Weights of the neurons 661 | * @param bias Bias for the CNN layer 662 | * @param stride_x The step length of the convolution window in x(width) direction 663 | * @param stride_y The step length of the convolution window in y(height) direction 664 | * @param padding Padding type 665 | * @return dl_matrix3d_t* Resulting matrix3d 666 | */ 667 | dl_matrix3d_t *dl_matrix3dff_conv_common(dl_matrix3d_t *in, 668 | dl_matrix3d_t *filter, 669 | dl_matrix3d_t *bias, 670 | int stride_x, 671 | int stride_y, 672 | dl_padding_type padding); 673 | 674 | // 675 | // Depthwise 3x3 676 | // 677 | 678 | /** 679 | * @brief Do 3x3 depthwise convolution with a float matrix3d 680 | * 681 | * @param in Input matrix, size (1, w, h, c) 682 | * @param filter 3x3 filter, size (1, 3, 3, c) 683 | * @param stride_x Stride of width 684 | * @param stride_y Stride of height 685 | * @param padding Padding type, 0: valid, 1: same 686 | * @return dl_matrix3d_t* Resulting float matrix3d 687 | */ 688 | dl_matrix3d_t *dl_matrix3dff_depthwise_conv_3x3(dl_matrix3d_t *in, 689 | dl_matrix3d_t *filter, 690 | int stride_x, 691 | int stride_y, 692 | int padding); 693 | 694 | /** 695 | * @brief Do 3x3 depthwise convolution with a 8-bit fixed point matrix 696 | * 697 | * @param in Input matrix, size (1, w, h, c) 698 | * @param filter 3x3 filter, size (1, 3, 3, c) 699 | * @param stride_x Stride of width 700 | * @param stride_y Stride of height 701 | * @param padding Padding type, 0: valid, 1: same 702 | * 
@return dl_matrix3d_t* Resulting float matrix3d 703 | */ 704 | dl_matrix3d_t *dl_matrix3duf_depthwise_conv_3x3(dl_matrix3du_t *in, 705 | dl_matrix3d_t *filter, 706 | int stride_x, 707 | int stride_y, 708 | int padding); 709 | 710 | /** 711 | * @brief Do 3x3 depthwise convolution with a float matrix3d, without padding 712 | * 713 | * @param out Preallocated matrix3d, size (1, w, h, n) 714 | * @param in Input matrix, size (1, w, h, c) 715 | * @param f 3x3 filter, size (1, 3, 3, c) 716 | * @param step_x Stride of width 717 | * @param step_y Stride of height 718 | */ 719 | void dl_matrix3dff_depthwise_conv_3x3_op(dl_matrix3d_t *out, 720 | dl_matrix3d_t *in, 721 | dl_matrix3d_t *f, 722 | int step_x, 723 | int step_y); 724 | 725 | // 726 | // Depthwise Common 727 | // 728 | 729 | /** 730 | * @brief Do a depthwise CNN layer pass, dimension is (number, width, height, channel) 731 | * 732 | * @param in Input matrix3d 733 | * @param filter Weights of the neurons 734 | * @param stride_x The step length of the convolution window in x(width) direction 735 | * @param stride_y The step length of the convolution window in y(height) direction 736 | * @param padding One of VALID or SAME 737 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 738 | * If ESP_PLATFORM is not defined, this value is not used. 
Default is 0 739 | * @return The result of depthwise CNN layer 740 | */ 741 | dl_matrix3d_t *dl_matrix3dff_depthwise_conv_common(dl_matrix3d_t *in, 742 | dl_matrix3d_t *filter, 743 | int stride_x, 744 | int stride_y, 745 | dl_padding_type padding); 746 | 747 | // 748 | // FC 749 | // 750 | /** 751 | * @brief Do a general fully connected layer pass, dimension is (number, width, height, channel) 752 | * 753 | * @param in Input matrix3d, size is (1, w, 1, 1) 754 | * @param filter Weights of the neurons, size is (1, w, h, 1) 755 | * @param bias Bias for the fc layer, size is (1, 1, 1, h) 756 | * @return The result of fc layer, size is (1, 1, 1, h) 757 | */ 758 | void dl_matrix3dff_fc(dl_matrix3d_t *out, 759 | dl_matrix3d_t *in, 760 | dl_matrix3d_t *filter); 761 | 762 | /** 763 | * @brief Do fully connected layer forward, with bias adding 764 | * 765 | * @param out Preallocated resulting matrix, size (1, 1, 1, h) 766 | * @param in Input matrix, size (1, 1, 1, w) 767 | * @param filter Filter matrix, size (1, w, h, 1) 768 | * @param bias Bias matrix, size (1, 1, 1, h) 769 | */ 770 | void dl_matrix3dff_fc_with_bias(dl_matrix3d_t *out, 771 | dl_matrix3d_t *in, 772 | dl_matrix3d_t *filter, 773 | dl_matrix3d_t *bias); 774 | 775 | // 776 | // Mobilenet 777 | // 778 | 779 | /** 780 | * @brief Do a mobilenet block forward, dimension is (number, width, height, channel) 781 | * 782 | * @param in Input matrix3d 783 | * @param filter Weights of the neurons 784 | * @param stride_x The step length of the convolution window in x(width) direction 785 | * @param stride_y The step length of the convolution window in y(height) direction 786 | * @param padding One of VALID or SAME 787 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 788 | * If ESP_PLATFORM is not defined, this value is not used. 
Default is 0 789 | * @return The result of depthwise CNN layer 790 | */ 791 | dl_matrix3d_t *dl_matrix3dff_mobilenet(dl_matrix3d_t *in, 792 | dl_matrix3d_t *dilate_filter, 793 | dl_matrix3d_t *dilate_prelu, 794 | dl_matrix3d_t *depthwise_filter, 795 | dl_matrix3d_t *depthwise_prelu, 796 | dl_matrix3d_t *compress_filter, 797 | dl_matrix3d_t *bias, 798 | dl_matrix3d_mobilenet_config_t config); 799 | 800 | /** 801 | * @brief Do a mobilenet block forward, dimension is (number, width, height, channel) 802 | * 803 | * @param in Input matrix3du 804 | * @param filter Weights of the neurons 805 | * @param stride_x The step length of the convolution window in x(width) direction 806 | * @param stride_y The step length of the convolution window in y(height) direction 807 | * @param padding One of VALID or SAME 808 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 809 | * If ESP_PLATFORM is not defined, this value is not used. Default is 0 810 | * @return The result of depthwise CNN layer 811 | */ 812 | dl_matrix3d_t *dl_matrix3duf_mobilenet(dl_matrix3du_t *in, 813 | dl_matrix3d_t *dilate_filter, 814 | dl_matrix3d_t *dilate_prelu, 815 | dl_matrix3d_t *depthwise_filter, 816 | dl_matrix3d_t *depthwise_prelu, 817 | dl_matrix3d_t *compress_filter, 818 | dl_matrix3d_t *bias, 819 | dl_matrix3d_mobilenet_config_t config); 820 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam-bare/edge-impulse-esp32-cam-bare.ino: -------------------------------------------------------------------------------- 1 | /* 2 | Live Image Classification on ESP32-CAM using MobileNet v1 from Edge Impulse 3 | Modified from https://github.com/edgeimpulse/example-esp32-cam. 4 | 5 | Note: 6 | Do not use Arduino IDE 2.0 or you won't be able to see the serial output! 
7 | */ 8 | 9 | #include // replace with your deployed Edge Impulse library 10 | 11 | #define CAMERA_MODEL_AI_THINKER 12 | 13 | #include "img_converters.h" 14 | #include "image_util.h" 15 | #include "esp_camera.h" 16 | #include "camera_pins.h" 17 | 18 | dl_matrix3du_t *resized_matrix = NULL; 19 | ei_impulse_result_t result = {0}; 20 | 21 | // setup 22 | void setup() { 23 | Serial.begin(115200); 24 | 25 | // cam config 26 | camera_config_t config; 27 | config.ledc_channel = LEDC_CHANNEL_0; 28 | config.ledc_timer = LEDC_TIMER_0; 29 | config.pin_d0 = Y2_GPIO_NUM; 30 | config.pin_d1 = Y3_GPIO_NUM; 31 | config.pin_d2 = Y4_GPIO_NUM; 32 | config.pin_d3 = Y5_GPIO_NUM; 33 | config.pin_d4 = Y6_GPIO_NUM; 34 | config.pin_d5 = Y7_GPIO_NUM; 35 | config.pin_d6 = Y8_GPIO_NUM; 36 | config.pin_d7 = Y9_GPIO_NUM; 37 | config.pin_xclk = XCLK_GPIO_NUM; 38 | config.pin_pclk = PCLK_GPIO_NUM; 39 | config.pin_vsync = VSYNC_GPIO_NUM; 40 | config.pin_href = HREF_GPIO_NUM; 41 | config.pin_sscb_sda = SIOD_GPIO_NUM; 42 | config.pin_sscb_scl = SIOC_GPIO_NUM; 43 | config.pin_pwdn = PWDN_GPIO_NUM; 44 | config.pin_reset = RESET_GPIO_NUM; 45 | config.xclk_freq_hz = 20000000; 46 | config.pixel_format = PIXFORMAT_JPEG; 47 | config.frame_size = FRAMESIZE_240X240; 48 | config.jpeg_quality = 10; 49 | config.fb_count = 1; 50 | 51 | // camera init 52 | esp_err_t err = esp_camera_init(&config); 53 | if (err != ESP_OK) { 54 | Serial.printf("Camera init failed with error 0x%x", err); 55 | return; 56 | } 57 | 58 | sensor_t * s = esp_camera_sensor_get(); 59 | // initial sensors are flipped vertically and colors are a bit saturated 60 | if (s->id.PID == OV3660_PID) { 61 | s->set_vflip(s, 1); // flip it back 62 | s->set_brightness(s, 1); // up the brightness just a bit 63 | s->set_saturation(s, 0); // lower the saturation 64 | } 65 | 66 | Serial.println("Camera Ready!"); 67 | } 68 | 69 | // main loop 70 | void loop() { 71 | 72 | // capture a image and classify it 73 | String result = classify(); 74 | 75 | // 
display result 76 | Serial.printf("Result: %s\n", result.c_str()); 77 | } 78 | 79 | // classify labels 80 | String classify() { 81 | 82 | // run image capture once to force clear buffer 83 | // otherwise the captured image below would only show up the next time you press the button! 84 | capture_quick(); 85 | 86 | // capture image from camera 87 | if (!capture()) return "Error"; 88 | 89 | Serial.println("Getting image..."); 90 | signal_t signal; 91 | signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT; 92 | signal.get_data = &raw_feature_get_data; 93 | 94 | Serial.println("Run classifier..."); 95 | // Feed signal to the classifier 96 | EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false /* debug */); 97 | // --- Free memory --- 98 | dl_matrix3du_free(resized_matrix); 99 | 100 | // --- "res" holds the returned error code; the predictions are stored in "result" --- 101 | ei_printf("run_classifier returned: %d\n", res); 102 | if (res != 0) return "Error"; 103 | 104 | // --- print the predictions --- 105 | ei_printf("Predictions (DSP: %d ms., Classification: %d ms., Anomaly: %d ms.): \n", 106 | result.timing.dsp, result.timing.classification, result.timing.anomaly); 107 | int index = 0; 108 | float score = 0.0; 109 | for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) { 110 | // record the most probable label 111 | if (result.classification[ix].value > score) { 112 | score = result.classification[ix].value; 113 | index = ix; 114 | } 115 | ei_printf("    %s: \t%f\r\n", result.classification[ix].label, result.classification[ix].value); 116 | } 117 | 118 | #if EI_CLASSIFIER_HAS_ANOMALY == 1 119 | ei_printf("    anomaly score: %f\r\n", result.anomaly); 120 | #endif 121 | 122 | // --- return the most probable label --- 123 | return String(result.classification[index].label); 124 | } 125 | 126 | // quick capture (to clear buffer) 127 | void capture_quick() { 128 | camera_fb_t *fb = NULL; 129 | fb = esp_camera_fb_get(); 130 | if (!fb) return; 131 | 
esp_camera_fb_return(fb); 132 | } 133 | 134 | // capture image from cam 135 | bool capture() { 136 | 137 | Serial.println("Capture image..."); 138 | esp_err_t res = ESP_OK; 139 | camera_fb_t *fb = NULL; 140 | fb = esp_camera_fb_get(); 141 | if (!fb) { 142 | Serial.println("Camera capture failed"); 143 | return false; 144 | } 145 | 146 | // --- Convert frame to RGB888 --- 147 | Serial.println("Converting to RGB888..."); 148 | // Allocate rgb888_matrix buffer 149 | dl_matrix3du_t *rgb888_matrix = dl_matrix3du_alloc(1, fb->width, fb->height, 3); 150 | fmt2rgb888(fb->buf, fb->len, fb->format, rgb888_matrix->item); 151 | 152 | // --- Resize the RGB888 frame to 96x96 in this example --- 153 | Serial.println("Resizing the frame buffer..."); 154 | resized_matrix = dl_matrix3du_alloc(1, EI_CLASSIFIER_INPUT_WIDTH, EI_CLASSIFIER_INPUT_HEIGHT, 3); 155 | image_resize_linear(resized_matrix->item, rgb888_matrix->item, EI_CLASSIFIER_INPUT_WIDTH, EI_CLASSIFIER_INPUT_HEIGHT, 3, fb->width, fb->height); 156 | 157 | // --- Free memory --- 158 | dl_matrix3du_free(rgb888_matrix); 159 | esp_camera_fb_return(fb); 160 | 161 | return true; 162 | } 163 | 164 | int raw_feature_get_data(size_t offset, size_t out_len, float *signal_ptr) { 165 | 166 | size_t pixel_ix = offset * 3; 167 | size_t bytes_left = out_len; 168 | size_t out_ptr_ix = 0; 169 | 170 | // read byte for byte 171 | while (bytes_left != 0) { 172 | // grab the values and convert to r/g/b 173 | uint8_t r, g, b; 174 | r = resized_matrix->item[pixel_ix]; 175 | g = resized_matrix->item[pixel_ix + 1]; 176 | b = resized_matrix->item[pixel_ix + 2]; 177 | 178 | // then convert to out_ptr format 179 | float pixel_f = (r << 16) + (g << 8) + b; 180 | signal_ptr[out_ptr_ix] = pixel_f; 181 | 182 | // and go to the next pixel 183 | out_ptr_ix++; 184 | pixel_ix += 3; 185 | bytes_left--; 186 | } 187 | 188 | return 0; 189 | } 190 | -------------------------------------------------------------------------------- 
/edge-impulse-esp32-cam-bare/esp_image.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | * ESPRESSIF MIT License 3 | * 4 | * Copyright (c) 2018 5 | * 6 | * Permission is hereby granted for use on ESPRESSIF SYSTEMS products only, in which case, 7 | * it is free of charge, to any person obtaining a copy of this software and associated 8 | * documentation files (the "Software"), to deal in the Software without restriction, including 9 | * without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 10 | * and/or sell copies of the Software, and to permit persons to whom the Software is furnished 11 | * to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all copies or 14 | * substantial portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 18 | * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 19 | * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 20 | * IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 21 | * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
22 | * 23 | */ 24 | #pragma once 25 | 26 | #ifdef __cplusplus 27 | extern "C" 28 | { 29 | #endif 30 | 31 | #include 32 | #include 33 | #include 34 | 35 | #ifdef __cplusplus 36 | } 37 | #endif 38 | 39 | typedef enum 40 | { 41 | IMAGE_RESIZE_BILINEAR = 0, /*!< resize by bilinear interpolation of four pixels */ 42 | IMAGE_RESIZE_MEAN, /*!< resize by taking the mean of four pixels */ 43 | IMAGE_RESIZE_NEAREST /*!< resize by taking the nearest pixel */ 44 | } image_resize_t; 45 | 46 | template <typename T> 47 | class Image 48 | { 49 | public: 50 | /** 51 | * @brief Convert an RGB565 pixel to RGB888 52 | * 53 | * @param input Pixel value in RGB565 54 | * @param output Pixel value in RGB888 55 | */ 56 | static inline void pixel_rgb565_to_rgb888(uint16_t input, T *output) 57 | { 58 | output[2] = (input & 0x1F00) >> 5; //blue 59 | output[1] = ((input & 0x7) << 5) | ((input & 0xE000) >> 11); //green 60 | output[0] = input & 0xF8; //red 61 | }; 62 | 63 | /** 64 | * @brief Resize an RGB565 image to an RGB888 image 65 | * 66 | * @param dst_image The destination image 67 | * @param y_start The start y index of where resized image located 68 | * @param y_end The end y index of where resized image located 69 | * @param x_start The start x index of where resized image located 70 | * @param x_end The end x index of where resized image located 71 | * @param channel The channel number of image 72 | * @param src_image The source image 73 | * @param src_h The height of source image 74 | * @param src_w The width of source image 75 | * @param dst_w The width of destination image 76 | * @param shift_left The bit number of left shifting 77 | * @param type The resize type 78 | */ 79 | static void resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint16_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 80 | 81 | /** 82 | * @brief Resize an RGB888 image to an RGB888 image 83 | * 84 | * @param dst_image The destination image 85 | * @param y_start The start y index of where resized image located 86 | * @param y_end The end y index of where resized image located 87 | * @param x_start The start x index of where resized image located 88 | * @param x_end The end x index of
where resized image located 89 | * @param channel The channel number of image 90 | * @param src_image The source image 91 | * @param src_h The height of source image 92 | * @param src_w The width of source image 93 | * @param dst_w The width of destination image 94 | * @param shift_left The bit number of left shifting 95 | * @param type The resize type 96 | */ 97 | static void resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint8_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 98 | // static void resize_to_rgb565(uint16_t *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint16_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 99 | // static void resize_to_rgb565(uint16_t *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint8_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 100 | }; 101 | 102 | template <typename T> 103 | void Image<T>::resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint16_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type) 104 | { 105 | assert(channel == 3); 106 | float scale_y = (float)src_h / (y_end - y_start); 107 | float scale_x = (float)src_w / (x_end - x_start); 108 | int temp[13]; 109 | 110 | switch (type) 111 | { 112 | case IMAGE_RESIZE_BILINEAR: 113 | for (size_t y = y_start; y < y_end; y++) 114 | { 115 | float ratio_y[2]; 116 | ratio_y[0] = (float)((y + 0.5) * scale_y - 0.5); // y 117 | int src_y = (int)ratio_y[0]; // y1 118 | ratio_y[0] -= src_y; // y - y1 119 | 120 | if (src_y < 0) 121 | { 122 | ratio_y[0] = 0; 123 | src_y = 0; 124 | } 125 | if (src_y > src_h - 2) 126 | { 127 | ratio_y[0] = 0; 128 | src_y = src_h - 2; 129 | } 130 | ratio_y[1] = 1 - ratio_y[0]; // y2 - y 131 | 132 | int _dst_i = y * dst_w; 133 | 134 | int _src_row_0 = src_y * src_w; 135 | int _src_row_1 = 
_src_row_0 + src_w; 136 | 137 | for (size_t x = x_start; x < x_end; x++) 138 | { 139 | float ratio_x[2]; 140 | ratio_x[0] = (float)((x + 0.5) * scale_x - 0.5); // x 141 | int src_x = (int)ratio_x[0]; // x1 142 | ratio_x[0] -= src_x; // x - x1 143 | 144 | if (src_x < 0) 145 | { 146 | ratio_x[0] = 0; 147 | src_x = 0; 148 | } 149 | if (src_x > src_w - 2) 150 | { 151 | ratio_x[0] = 0; 152 | src_x = src_w - 2; 153 | } 154 | ratio_x[1] = 1 - ratio_x[0]; // x2 - x 155 | 156 | int dst_i = (_dst_i + x) * channel; 157 | 158 | int src_row_0 = _src_row_0 + src_x; 159 | int src_row_1 = _src_row_1 + src_x; 160 | 161 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_0], temp); 162 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_0 + 1], temp + 3); 163 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_1], temp + 6); 164 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_1 + 1], temp + 9); 165 | 166 | for (int c = 0; c < channel; c++) 167 | { 168 | temp[12] = round(temp[c] * ratio_x[1] * ratio_y[1] + temp[channel + c] * ratio_x[0] * ratio_y[1] + temp[channel + channel + c] * ratio_x[1] * ratio_y[0] + temp[channel + channel + channel + c] * ratio_x[0] * ratio_y[0]); 169 | dst_image[dst_i + c] = (shift_left > 0) ?
(temp[12] << shift_left) : (temp[12] >> -shift_left); 170 | } 171 | } 172 | } 173 | break; 174 | 175 | case IMAGE_RESIZE_MEAN: 176 | shift_left -= 2; 177 | for (int y = y_start; y < y_end; y++) 178 | { 179 | int _dst_i = y * dst_w; 180 | 181 | float _src_row_0 = rintf(y * scale_y) * src_w; 182 | float _src_row_1 = _src_row_0 + src_w; 183 | 184 | for (int x = x_start; x < x_end; x++) 185 | { 186 | int dst_i = (_dst_i + x) * channel; 187 | 188 | int src_row_0 = (_src_row_0 + rintf(x * scale_x)); 189 | int src_row_1 = (_src_row_1 + rintf(x * scale_x)); 190 | 191 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_0], temp); 192 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_0 + 1], temp + 3); 193 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_1], temp + 6); 194 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_row_1 + 1], temp + 9); 195 | 196 | dst_image[dst_i] = (shift_left > 0) ? ((temp[0] + temp[3] + temp[6] + temp[9]) << shift_left) : ((temp[0] + temp[3] + temp[6] + temp[9]) >> -shift_left); 197 | dst_image[dst_i + 1] = (shift_left > 0) ? ((temp[1] + temp[4] + temp[7] + temp[10]) << shift_left) : ((temp[1] + temp[4] + temp[7] + temp[10]) >> -shift_left); 198 | dst_image[dst_i + 2] = (shift_left > 0) ? ((temp[2] + temp[5] + temp[8] + temp[11]) << shift_left) : ((temp[2] + temp[5] + temp[8] + temp[11]) >> -shift_left); 199 | } 200 | } 201 | 202 | break; 203 | 204 | case IMAGE_RESIZE_NEAREST: 205 | for (size_t y = y_start; y < y_end; y++) 206 | { 207 | int _dst_i = y * dst_w; 208 | float _src_i = rintf(y * scale_y) * src_w; 209 | 210 | for (size_t x = x_start; x < x_end; x++) 211 | { 212 | int dst_i = (_dst_i + x) * channel; 213 | int src_i = _src_i + rintf(x * scale_x); 214 | 215 | Image<T>::pixel_rgb565_to_rgb888(src_image[src_i], temp); 216 | 217 | dst_image[dst_i] = (shift_left > 0) ? (temp[0] << shift_left) : (temp[0] >> -shift_left); 218 | dst_image[dst_i + 1] = (shift_left > 0) ?
(temp[1] << shift_left) : (temp[1] >> -shift_left); 219 | dst_image[dst_i + 2] = (shift_left > 0) ? (temp[2] << shift_left) : (temp[2] >> -shift_left); 220 | } 221 | } 222 | break; 223 | 224 | default: 225 | break; 226 | } 227 | } 228 | 229 | template <typename T> 230 | void Image<T>::resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint8_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type) 231 | { 232 | float scale_y = (float)src_h / (y_end - y_start); 233 | float scale_x = (float)src_w / (x_end - x_start); 234 | int temp; 235 | 236 | switch (type) 237 | { 238 | case IMAGE_RESIZE_BILINEAR: 239 | for (size_t y = y_start; y < y_end; y++) 240 | { 241 | float ratio_y[2]; 242 | ratio_y[0] = (float)((y + 0.5) * scale_y - 0.5); // y 243 | int src_y = (int)ratio_y[0]; // y1 244 | ratio_y[0] -= src_y; // y - y1 245 | 246 | if (src_y < 0) 247 | { 248 | ratio_y[0] = 0; 249 | src_y = 0; 250 | } 251 | if (src_y > src_h - 2) 252 | { 253 | ratio_y[0] = 0; 254 | src_y = src_h - 2; 255 | } 256 | ratio_y[1] = 1 - ratio_y[0]; // y2 - y 257 | 258 | int _dst_i = y * dst_w; 259 | 260 | int _src_row_0 = src_y * src_w; 261 | int _src_row_1 = _src_row_0 + src_w; 262 | 263 | for (size_t x = x_start; x < x_end; x++) 264 | { 265 | float ratio_x[2]; 266 | ratio_x[0] = (float)((x + 0.5) * scale_x - 0.5); // x 267 | int src_x = (int)ratio_x[0]; // x1 268 | ratio_x[0] -= src_x; // x - x1 269 | 270 | if (src_x < 0) 271 | { 272 | ratio_x[0] = 0; 273 | src_x = 0; 274 | } 275 | if (src_x > src_w - 2) 276 | { 277 | ratio_x[0] = 0; 278 | src_x = src_w - 2; 279 | } 280 | ratio_x[1] = 1 - ratio_x[0]; // x2 - x 281 | 282 | int dst_i = (_dst_i + x) * channel; 283 | 284 | int src_row_0 = (_src_row_0 + src_x) * channel; 285 | int src_row_1 = (_src_row_1 + src_x) * channel; 286 | 287 | for (int c = 0; c < channel; c++) 288 | { 289 | temp = round(src_image[src_row_0 + c] * ratio_x[1] * ratio_y[1] + src_image[src_row_0 + channel + c] * ratio_x[0] *
ratio_y[1] + src_image[src_row_1 + c] * ratio_x[1] * ratio_y[0] + src_image[src_row_1 + channel + c] * ratio_x[0] * ratio_y[0]); 290 | dst_image[dst_i + c] = (shift_left > 0) ? (temp << shift_left) : (temp >> -shift_left); 291 | } 292 | } 293 | } 294 | break; 295 | 296 | case IMAGE_RESIZE_MEAN: 297 | shift_left -= 2; 298 | 299 | for (size_t y = y_start; y < y_end; y++) 300 | { 301 | int _dst_i = y * dst_w; 302 | 303 | float _src_row_0 = rintf(y * scale_y) * src_w; 304 | float _src_row_1 = _src_row_0 + src_w; 305 | 306 | for (size_t x = x_start; x < x_end; x++) 307 | { 308 | int dst_i = (_dst_i + x) * channel; 309 | 310 | int src_row_0 = (_src_row_0 + rintf(x * scale_x)) * channel; 311 | int src_row_1 = (_src_row_1 + rintf(x * scale_x)) * channel; 312 | 313 | for (size_t c = 0; c < channel; c++) 314 | { 315 | temp = (int)src_image[src_row_0 + c] + (int)src_image[src_row_0 + channel + c] + (int)src_image[src_row_1 + c] + (int)src_image[src_row_1 + channel + c]; 316 | dst_image[dst_i + c] = (shift_left > 0) ? (temp << shift_left) : (temp >> -shift_left); 317 | } 318 | } 319 | } 320 | break; 321 | 322 | case IMAGE_RESIZE_NEAREST: 323 | for (size_t y = y_start; y < y_end; y++) 324 | { 325 | int _dst_i = y * dst_w; 326 | float _src_i = rintf(y * scale_y) * src_w; 327 | 328 | for (size_t x = x_start; x < x_end; x++) 329 | { 330 | int dst_i = (_dst_i + x) * channel; 331 | int src_i = (_src_i + rintf(x * scale_x)) * channel; 332 | 333 | for (size_t c = 0; c < channel; c++) 334 | { 335 | dst_image[dst_i + c] = (shift_left > 0) ? 
((T)src_image[src_i + c] << shift_left) : ((T)src_image[src_i + c] >> -shift_left); 336 | } 337 | } 338 | } 339 | break; 340 | 341 | default: 342 | break; 343 | } 344 | } -------------------------------------------------------------------------------- /edge-impulse-esp32-cam-bare/frmn.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #if __cplusplus 4 | extern "C" 5 | { 6 | #endif 7 | 8 | #include "dl_lib_matrix3d.h" 9 | #include "dl_lib_matrix3dq.h" 10 | 11 | /** 12 | * @brief Forward the face recognition process with frmn model. Calculate in float. 13 | * 14 | * @param in Image matrix, rgb888 format, size is 56x56, normalized 15 | * @return dl_matrix3d_t* Face ID feature vector, size is 512 16 | */ 17 | dl_matrix3d_t *frmn(dl_matrix3d_t *in); 18 | 19 | /**@{*/ 20 | /** 21 | * @brief Forward the face recognition process with specified model. Calculate in quantization. 22 | * 23 | * @param in Image matrix, rgb888 format, size is 56x56, normalized 24 | * @param mode 0: C implement; 1: handwrite xtensa instruction implement 25 | * @return Face ID feature vector, size is 512 26 | */ 27 | dl_matrix3dq_t *frmn_q(dl_matrix3dq_t *in, dl_conv_mode mode); 28 | 29 | dl_matrix3dq_t *frmn2p_q(dl_matrix3dq_t *in, dl_conv_mode mode); 30 | 31 | dl_matrix3dq_t *mfn56_42m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 32 | 33 | dl_matrix3dq_t *mfn56_72m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 34 | 35 | dl_matrix3dq_t *mfn56_112m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 36 | 37 | dl_matrix3dq_t *mfn56_156m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 38 | 39 | /**@}*/ 40 | 41 | #if __cplusplus 42 | } 43 | #endif 44 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam-bare/image_util.h: -------------------------------------------------------------------------------- 1 | /* 2 | * ESPRESSIF MIT License 3 | * 4 | * Copyright (c) 2018 5 | * 6 | * Permission is hereby granted for 
use on ESPRESSIF SYSTEMS products only, in which case, 7 | * it is free of charge, to any person obtaining a copy of this software and associated 8 | * documentation files (the "Software"), to deal in the Software without restriction, including 9 | * without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 10 | * and/or sell copies of the Software, and to permit persons to whom the Software is furnished 11 | * to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all copies or 14 | * substantial portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 18 | * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 19 | * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 20 | * IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 21 | * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 22 | * 23 | */ 24 | #pragma once 25 | #ifdef __cplusplus 26 | extern "C" 27 | { 28 | #endif 29 | #include 30 | #include 31 | #include "mtmn.h" 32 | 33 | #define LANDMARKS_NUM (10) 34 | 35 | #define MAX_VALID_COUNT_PER_IMAGE (30) 36 | 37 | #define DL_IMAGE_MIN(A, B) ((A) < (B) ? (A) : (B)) 38 | #define DL_IMAGE_MAX(A, B) ((A) < (B) ? 
(B) : (A)) 39 | 40 | #define RGB565_MASK_RED 0xF800 41 | #define RGB565_MASK_GREEN 0x07E0 42 | #define RGB565_MASK_BLUE 0x001F 43 | 44 | typedef enum 45 | { 46 | BINARY, /*!< binary */ 47 | } en_threshold_mode; 48 | 49 | typedef struct 50 | { 51 | fptp_t landmark_p[LANDMARKS_NUM]; /*!< landmark struct */ 52 | } landmark_t; 53 | 54 | typedef struct 55 | { 56 | fptp_t box_p[4]; /*!< box struct */ 57 | } box_t; 58 | 59 | typedef struct tag_box_list 60 | { 61 | uint8_t *category; /*!< The category of the corresponding box */ 62 | fptp_t *score; /*!< The confidence score of the class corresponding to the box */ 63 | box_t *box; /*!< Anchor boxes or predicted boxes*/ 64 | landmark_t *landmark; /*!< The landmarks corresponding to the box */ 65 | int len; /*!< The num of the boxes */ 66 | } box_array_t; 67 | 68 | typedef struct tag_image_box 69 | { 70 | struct tag_image_box *next; /*!< Next image_box_t */ 71 | uint8_t category; 72 | fptp_t score; /*!< The confidence score of the class corresponding to the box */ 73 | box_t box; /*!< Anchor boxes or predicted boxes */ 74 | box_t offset; /*!< The predicted anchor-based offset */ 75 | landmark_t landmark; /*!< The landmarks corresponding to the box */ 76 | } image_box_t; 77 | 78 | typedef struct tag_image_list 79 | { 80 | image_box_t *head; /*!< The current head of the image_list */ 81 | image_box_t *origin_head; /*!< The original head of the image_list */ 82 | int len; /*!< Length of the image_list */ 83 | } image_list_t; 84 | 85 | /** 86 | * @brief Get the width and height of the box. 87 | * 88 | * @param box Input box 89 | * @param w Resulting width of the box 90 | * @param h Resulting height of the box 91 | */ 92 | static inline void image_get_width_and_height(box_t *box, float *w, float *h) 93 | { 94 | *w = box->box_p[2] - box->box_p[0] + 1; 95 | *h = box->box_p[3] - box->box_p[1] + 1; 96 | } 97 | 98 | /** 99 | * @brief Get the area of the box. 
100 | * 101 | * @param box Input box 102 | * @param area Resulting area of the box 103 | */ 104 | static inline void image_get_area(box_t *box, float *area) 105 | { 106 | float w, h; 107 | image_get_width_and_height(box, &w, &h); 108 | *area = w * h; 109 | } 110 | 111 | /** 112 | * @brief calibrate the boxes by offset 113 | * 114 | * @param image_list Input boxes 115 | * @param image_height Height of the original image 116 | * @param image_width Width of the original image 117 | */ 118 | static inline void image_calibrate_by_offset(image_list_t *image_list, int image_height, int image_width) 119 | { 120 | for (image_box_t *head = image_list->head; head; head = head->next) 121 | { 122 | float w, h; 123 | image_get_width_and_height(&(head->box), &w, &h); 124 | head->box.box_p[0] = DL_IMAGE_MAX(0, head->box.box_p[0] + head->offset.box_p[0] * w); 125 | head->box.box_p[1] = DL_IMAGE_MAX(0, head->box.box_p[1] + head->offset.box_p[1] * w); 126 | head->box.box_p[2] += head->offset.box_p[2] * w; 127 | if (head->box.box_p[2] > image_width) 128 | { 129 | head->box.box_p[2] = image_width - 1; 130 | head->box.box_p[0] = image_width - w; 131 | } 132 | head->box.box_p[3] += head->offset.box_p[3] * h; 133 | if (head->box.box_p[3] > image_height) 134 | { 135 | head->box.box_p[3] = image_height - 1; 136 | head->box.box_p[1] = image_height - h; 137 | } 138 | } 139 | } 140 | 141 | /** 142 | * @brief calibrate the landmarks 143 | * 144 | * @param image_list Input landmarks 145 | */ 146 | static inline void image_landmark_calibrate(image_list_t *image_list) 147 | { 148 | for (image_box_t *head = image_list->head; head; head = head->next) 149 | { 150 | float w, h; 151 | image_get_width_and_height(&(head->box), &w, &h); 152 | head->landmark.landmark_p[0] = head->box.box_p[0] + head->landmark.landmark_p[0] * w; 153 | head->landmark.landmark_p[1] = head->box.box_p[1] + head->landmark.landmark_p[1] * h; 154 | 155 | head->landmark.landmark_p[2] = head->box.box_p[0] + 
head->landmark.landmark_p[2] * w; 156 | head->landmark.landmark_p[3] = head->box.box_p[1] + head->landmark.landmark_p[3] * h; 157 | 158 | head->landmark.landmark_p[4] = head->box.box_p[0] + head->landmark.landmark_p[4] * w; 159 | head->landmark.landmark_p[5] = head->box.box_p[1] + head->landmark.landmark_p[5] * h; 160 | 161 | head->landmark.landmark_p[6] = head->box.box_p[0] + head->landmark.landmark_p[6] * w; 162 | head->landmark.landmark_p[7] = head->box.box_p[1] + head->landmark.landmark_p[7] * h; 163 | 164 | head->landmark.landmark_p[8] = head->box.box_p[0] + head->landmark.landmark_p[8] * w; 165 | head->landmark.landmark_p[9] = head->box.box_p[1] + head->landmark.landmark_p[9] * h; 166 | } 167 | } 168 | 169 | /** 170 | * @brief Convert a rectangular box into a square box 171 | * 172 | * @param boxes Input boxes 173 | * @param width Width of the original image 174 | * @param height Height of the original image 175 | */ 176 | static inline void image_rect2sqr(box_array_t *boxes, int width, int height) 177 | { 178 | for (int i = 0; i < boxes->len; i++) 179 | { 180 | box_t *box = &(boxes->box[i]); 181 | 182 | int x1 = round(box->box_p[0]); 183 | int y1 = round(box->box_p[1]); 184 | int x2 = round(box->box_p[2]); 185 | int y2 = round(box->box_p[3]); 186 | 187 | int w = x2 - x1 + 1; 188 | int h = y2 - y1 + 1; 189 | int l = DL_IMAGE_MAX(w, h); 190 | 191 | box->box_p[0] = DL_IMAGE_MAX(round(DL_IMAGE_MAX(0, x1) + 0.5 * (w - l)), 0); 192 | box->box_p[1] = DL_IMAGE_MAX(round(DL_IMAGE_MAX(0, y1) + 0.5 * (h - l)), 0); 193 | 194 | box->box_p[2] = box->box_p[0] + l - 1; 195 | if (box->box_p[2] > width) 196 | { 197 | box->box_p[2] = width - 1; 198 | box->box_p[0] = width - l; 199 | } 200 | box->box_p[3] = box->box_p[1] + l - 1; 201 | if (box->box_p[3] > height) 202 | { 203 | box->box_p[3] = height - 1; 204 | box->box_p[1] = height - l; 205 | } 206 | } 207 | } 208 | 209 | /**@{*/ 210 | /** 211 | * @brief Convert RGB565 image to RGB888 image 212 | * 213 | * @param in Input RGB565 
image 214 | * @param dst Resulting RGB888 image 215 | */ 216 | static inline void rgb565_to_888(uint16_t in, uint8_t *dst) 217 | { /*{{{*/ 218 | in = (in & 0xFF) << 8 | (in & 0xFF00) >> 8; 219 | dst[2] = (in & RGB565_MASK_BLUE) << 3; // blue 220 | dst[1] = (in & RGB565_MASK_GREEN) >> 3; // green 221 | dst[0] = (in & RGB565_MASK_RED) >> 8; // red 222 | 223 | // dst[0] = (in & 0x1F00) >> 5; 224 | // dst[1] = ((in & 0x7) << 5) | ((in & 0xE000) >> 11); 225 | // dst[2] = in & 0xF8; 226 | } /*}}}*/ 227 | 228 | static inline void rgb565_to_888_q16(uint16_t in, int16_t *dst) 229 | { /*{{{*/ 230 | in = (in & 0xFF) << 8 | (in & 0xFF00) >> 8; 231 | dst[2] = (in & RGB565_MASK_BLUE) << 3; // blue 232 | dst[1] = (in & RGB565_MASK_GREEN) >> 3; // green 233 | dst[0] = (in & RGB565_MASK_RED) >> 8; // red 234 | 235 | // dst[0] = (in & 0x1F00) >> 5; 236 | // dst[1] = ((in & 0x7) << 5) | ((in & 0xE000) >> 11); 237 | // dst[2] = in & 0xF8; 238 | } /*}}}*/ 239 | /**@}*/ 240 | 241 | /** 242 | * @brief Convert RGB888 image to RGB565 image 243 | * 244 | * @param in Resulting RGB565 image 245 | * @param r The red channel of the Input RGB888 image 246 | * @param g The green channel of the Input RGB888 image 247 | * @param b The blue channel of the Input RGB888 image 248 | */ 249 | static inline void rgb888_to_565(uint16_t *in, uint8_t r, uint8_t g, uint8_t b) 250 | { /*{{{*/ 251 | uint16_t rgb565 = 0; 252 | rgb565 = ((r >> 3) << 11); 253 | rgb565 |= ((g >> 2) << 5); 254 | rgb565 |= (b >> 3); 255 | rgb565 = (rgb565 & 0xFF) << 8 | (rgb565 & 0xFF00) >> 8; 256 | *in = rgb565; 257 | } /*}}}*/ 258 | 259 | /** 260 | * @brief Filter out the resulting boxes whose confidence score is lower than the threshold and convert the boxes to the actual boxes on the original image.((x, y, w, h) -> (x1, y1, x2, y2)) 261 | * 262 | * @param score Confidence score of the boxes 263 | * @param offset The predicted anchor-based offset 264 | * @param landmark The landmarks corresponding to the box 265 | * @param width 
Width of the original image 266 | * @param height Height of the original image 267 | * @param anchor_number Anchor number of the detection output feature map 268 | * @param anchors_size The anchor size 269 | * @param score_threshold Threshold of the confidence score 270 | * @param stride 271 | * @param resized_height_scale 272 | * @param resized_width_scale 273 | * @param do_regression 274 | * @return image_list_t* 275 | */ 276 | image_list_t *image_get_valid_boxes(fptp_t *score, 277 | fptp_t *offset, 278 | fptp_t *landmark, 279 | int width, 280 | int height, 281 | int anchor_number, 282 | int *anchors_size, 283 | fptp_t score_threshold, 284 | int stride, 285 | fptp_t resized_height_scale, 286 | fptp_t resized_width_scale, 287 | bool do_regression); 288 | /** 289 | * @brief Sort the resulting box lists by their confidence score. 290 | * 291 | * @param image_sorted_list The sorted box list. 292 | * @param insert_list The box list that has not been sorted. 293 | */ 294 | void image_sort_insert_by_score(image_list_t *image_sorted_list, const image_list_t *insert_list); 295 | 296 | /** 297 | * @brief Run NMS algorithm 298 | * 299 | * @param image_list The input boxes list 300 | * @param nms_threshold NMS threshold 301 | * @param same_area The flag of boxes with same area 302 | */ 303 | void image_nms_process(image_list_t *image_list, fptp_t nms_threshold, int same_area); 304 | 305 | /** 306 | * @brief Resize an image to half size 307 | * 308 | * @param dimage The output image 309 | * @param dw Width of the output image 310 | * @param dh Height of the output image 311 | * @param dc Channel of the output image 312 | * @param simage Source image 313 | * @param sw Width of the source image 314 | * @param sc Channel of the source image 315 | */ 316 | void image_zoom_in_twice(uint8_t *dimage, 317 | int dw, 318 | int dh, 319 | int dc, 320 | uint8_t *simage, 321 | int sw, 322 | int sc); 323 | 324 | /** 325 | * @brief Resize the image in RGB888 format via bilinear 
interpolation 326 | * 327 | * @param dst_image The output image 328 | * @param src_image Source image 329 | * @param dst_w Width of the output image 330 | * @param dst_h Height of the output image 331 | * @param dst_c Channel of the output image 332 | * @param src_w Width of the source image 333 | * @param src_h Height of the source image 334 | */ 335 | void image_resize_linear(uint8_t *dst_image, uint8_t *src_image, int dst_w, int dst_h, int dst_c, int src_w, int src_h); 336 | 337 | /** 338 | * @brief Crop, rotate and zoom the image in RGB888 format. 339 | * 340 | * @param corp_image The output image 341 | * @param src_image Source image 342 | * @param rotate_angle Rotation angle 343 | * @param ratio Scaling ratio 344 | * @param center Center of rotation 345 | */ 346 | void image_cropper(uint8_t *corp_image, uint8_t *src_image, int dst_w, int dst_h, int dst_c, int src_w, int src_h, float rotate_angle, float ratio, float *center); 347 | 348 | /** 349 | * @brief Convert the rgb565 image to the rgb888 image 350 | * 351 | * @param m The output rgb888 image 352 | * @param bmp The input rgb565 image 353 | * @param count Total pixels of the rgb565 image 354 | */ 355 | void image_rgb565_to_888(uint8_t *m, uint16_t *bmp, int count); 356 | 357 | /** 358 | * @brief Convert the rgb888 image to the rgb565 image 359 | * 360 | * @param bmp The output rgb565 image 361 | * @param m The input rgb888 image 362 | * @param count Total pixels of the rgb888 image 363 | */ 364 | void image_rgb888_to_565(uint16_t *bmp, uint8_t *m, int count); 365 | 366 | /** 367 | * @brief Draw rectangles on the rgb565 image 368 | * 369 | * @param buf Input image 370 | * @param boxes Rectangle boxes 371 | * @param width Width of the input image 372 | */ 373 | void draw_rectangle_rgb565(uint16_t *buf, box_array_t *boxes, int width); 374 | 375 | /** 376 | * @brief Draw rectangles on the rgb888 image 377 | * 378 | * @param buf Input image 379 | * @param boxes Rectangle boxes 380 | * @param width Width of the 
input image 381 | */ 382 | void draw_rectangle_rgb888(uint8_t *buf, box_array_t *boxes, int width); 383 | 384 | /** 385 | * @brief Get the pixel difference of two images 386 | * 387 | * @param dst The output pixel difference 388 | * @param src1 Input image 1 389 | * @param src2 Input image 2 390 | * @param count Total pixels of the input image 391 | */ 392 | void image_abs_diff(uint8_t *dst, uint8_t *src1, uint8_t *src2, int count); 393 | 394 | /** 395 | * @brief Binarize an image to 0 and value. 396 | * 397 | * @param dst The output image 398 | * @param src Source image 399 | * @param threshold Threshold of binarization 400 | * @param value The value of binarization 401 | * @param count Total pixels of the input image 402 | * @param mode Threshold mode 403 | */ 404 | void image_threshold(uint8_t *dst, uint8_t *src, int threshold, int value, int count, en_threshold_mode mode); 405 | 406 | /** 407 | * @brief Erode the image 408 | * 409 | * @param dst The output image 410 | * @param src Source image 411 | * @param src_w Width of the source image 412 | * @param src_h Height of the source image 413 | * @param src_c Channel of the source image 414 | */ 415 | void image_erode(uint8_t *dst, uint8_t *src, int src_w, int src_h, int src_c); 416 | 417 | typedef float matrixType; 418 | typedef struct 419 | { 420 | int w; /*!< width */ 421 | int h; /*!< height */ 422 | matrixType **array; /*!< array */ 423 | } Matrix; 424 | 425 | /** 426 | * @brief Allocate a 2d matrix 427 | * 428 | * @param h Height of matrix 429 | * @param w Width of matrix 430 | * @return Matrix* 2d matrix 431 | */ 432 | Matrix *matrix_alloc(int h, int w); 433 | 434 | /** 435 | * @brief Free a 2d matrix 436 | * 437 | * @param m 2d matrix 438 | */ 439 | void matrix_free(Matrix *m); 440 | 441 | /** 442 | * @brief Get the similarity matrix of similarity transformation 443 | * 444 | * @param srcx Source x coordinates 445 | * @param srcy Source y coordinates 446 | * @param dstx Destination x coordinates 447 | * 
@param dsty Destination y coordinates 448 | * @param num The number of the coordinates 449 | * @return Matrix* The resulting transformation matrix 450 | */ 451 | Matrix *get_similarity_matrix(float *srcx, float *srcy, float *dstx, float *dsty, int num); 452 | 453 | /** 454 | * @brief Get the affine transformation matrix 455 | * 456 | * @param srcx Source x coordinates 457 | * @param srcy Source y coordinates 458 | * @param dstx Destination x coordinates 459 | * @param dsty Destination y coordinates 460 | * @return Matrix* The resulting transformation matrix 461 | */ 462 | Matrix *get_affine_transform(float *srcx, float *srcy, float *dstx, float *dsty); 463 | 464 | /** 465 | * @brief Applies an affine transformation to an image 466 | * 467 | * @param img Input image 468 | * @param crop Dst output image that has the size dsize and the same type as src 469 | * @param M Affine transformation matrix 470 | */ 471 | void warp_affine(dl_matrix3du_t *img, dl_matrix3du_t *crop, Matrix *M); 472 | 473 | /** 474 | * @brief Resize the image in RGB888 format via bilinear interpolation, and quantify the output image 475 | * 476 | * @param dst_image Quantized output image 477 | * @param src_image Input image 478 | * @param dst_w Width of the output image 479 | * @param dst_h Height of the output image 480 | * @param dst_c Channel of the output image 481 | * @param src_w Width of the input image 482 | * @param src_h Height of the input image 483 | * @param shift Shift parameter of quantization. 484 | */ 485 | void image_resize_linear_q(qtp_t *dst_image, uint8_t *src_image, int dst_w, int dst_h, int dst_c, int src_w, int src_h, int shift); 486 | 487 | /** 488 | * @brief Preprocess the input image of object detection model. The process is like this: resize -> normalize -> quantify 489 | * 490 | * @param image Input image, RGB888 format. 491 | * @param input_w Width of the input image. 492 | * @param input_h Height of the input image. 
493 | * @param target_size Target size of the model input image. 494 | * @param exponent Exponent of the quantized model input image. 495 | * @param process_mode Process mode. 0: resize with padding to keep height == width. 1: resize without padding, height != width. 496 | * @return dl_matrix3dq_t* The resulting preprocessed image. 497 | */ 498 | dl_matrix3dq_t *image_resize_normalize_quantize(uint8_t *image, int input_w, int input_h, int target_size, int exponent, int process_mode); 499 | 500 | /** 501 | * @brief Resize the image in RGB565 format via mean neighbour interpolation, and quantify the output image 502 | * 503 | * @param dimage Quantized output image. 504 | * @param simage Input image. 505 | * @param dw Width of the allocated output image memory. 506 | * @param dc Channel of the allocated output image memory. 507 | * @param sw Width of the input image. 508 | * @param sh Height of the input image. 509 | * @param tw Target width of the output image. 510 | * @param th Target height of the output image. 511 | * @param shift Shift parameter of quantization. 512 | */ 513 | void image_resize_shift_fast(qtp_t *dimage, uint16_t *simage, int dw, int dc, int sw, int sh, int tw, int th, int shift); 514 | 515 | /** 516 | * @brief Resize the image in RGB565 format via nearest neighbour interpolation, and quantify the output image 517 | * 518 | * @param dimage Quantized output image. 519 | * @param simage Input image. 520 | * @param dw Width of the allocated output image memory. 521 | * @param dc Channel of the allocated output image memory. 522 | * @param sw Width of the input image. 523 | * @param sh Height of the input image. 524 | * @param tw Target width of the output image. 525 | * @param th Target height of the output image. 526 | * @param shift Shift parameter of quantization. 
527 | */ 528 | void image_resize_nearest_shift(qtp_t *dimage, uint16_t *simage, int dw, int dc, int sw, int sh, int tw, int th, int shift); 529 | 530 | /** 531 | * @brief Crop the image in RGB565 format and resize it to target size, then quantify the output image 532 | * 533 | * @param dimage Quantized output image. 534 | * @param simage Input image. 535 | * @param dw Target size of the output image. 536 | * @param sw Width of the input image. 537 | * @param sh Height of the input image. 538 | * @param x1 The x coordinate of the upper left corner of the cropped area 539 | * @param y1 The y coordinate of the upper left corner of the cropped area 540 | * @param x2 The x coordinate of the lower right corner of the cropped area 541 | * @param y2 The y coordinate of the lower right corner of the cropped area 542 | * @param shift Shift parameter of quantization. 543 | */ 544 | void image_crop_shift_fast(qtp_t *dimage, uint16_t *simage, int dw, int sw, int sh, int x1, int y1, int x2, int y2, int shift); 545 | 546 | #ifdef __cplusplus 547 | } 548 | #endif 549 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam-bare/mtmn.h: -------------------------------------------------------------------------------- 1 | /* 2 | * ESPRESSIF MIT License 3 | * 4 | * Copyright (c) 2018 5 | * 6 | * Permission is hereby granted for use on ESPRESSIF SYSTEMS products only, in which case, 7 | * it is free of charge, to any person obtaining a copy of this software and associated 8 | * documentation files (the "Software"), to deal in the Software without restriction, including 9 | * without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 10 | * and/or sell copies of the Software, and to permit persons to whom the Software is furnished 11 | * to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all copies or 14 | * substantial 
portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 18 | * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 19 | * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 20 | * IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 21 | * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 22 | * 23 | */ 24 | #pragma once 25 | 26 | #ifdef __cplusplus 27 | extern "C" 28 | { 29 | #endif 30 | #include "dl_lib_matrix3d.h" 31 | #include "dl_lib_matrix3dq.h" 32 | 33 | /** 34 | * Detection results with MTMN. 35 | * 36 | */ 37 | typedef struct 38 | { 39 | dl_matrix3d_t *category; /*!< Classification result after softmax, channel is 2 */ 40 | dl_matrix3d_t *offset; /*!< Bounding box offset of 2 points: top-left and bottom-right, channel is 4 */ 41 | dl_matrix3d_t *landmark; /*!< Offsets of 5 landmarks: 42 | * - Left eye 43 | * - Left mouth corner 44 | * - Nose 45 | * - Right eye 46 | * - Right mouth corner 47 | * 48 | * channel is 10 49 | * */ 50 | } mtmn_net_t; 51 | 52 | 53 | /** 54 | * @brief Free a mtmn_net_t 55 | * 56 | * @param p A mtmn_net_t pointer 57 | * 58 | */ 59 | 60 | void mtmn_net_t_free(mtmn_net_t *p); 61 | 62 | /** 63 | * @brief Forward the pnet process, coarse detection. Calculate in float. 64 | * 65 | * @param in Image matrix, rgb888 format, size is 320x240 66 | * @return Scores for every pixel, and the corresponding box offsets. 67 | */ 68 | mtmn_net_t *pnet_lite_f(dl_matrix3du_t *in); 69 | 70 | /** 71 | * @brief Forward the rnet process, refine the boxes from pnet. Calculate in float. 72 | * 73 | * @param in Image matrix, rgb888 format 74 | * @param threshold Score threshold to detect human face 75 | * @return Scores for every box, and the corresponding box offsets. 
76 | */ 77 | mtmn_net_t *rnet_lite_f_with_score_verify(dl_matrix3du_t *in, float threshold); 78 | 79 | /** 80 | * @brief Forward the onet process, refine the boxes from rnet. Calculate in float. 81 | * 82 | * @param in Image matrix, rgb888 format 83 | * @param threshold Score threshold to detect human face 84 | * @return Scores for every box, and the corresponding box offsets and landmarks. 85 | */ 86 | mtmn_net_t *onet_lite_f_with_score_verify(dl_matrix3du_t *in, float threshold); 87 | 88 | /** 89 | * @brief Forward the pnet process, coarse detection. Calculate in quantization. 90 | * 91 | * @param in Image matrix, rgb888 format, size is 320x240 92 | * @return Scores for every pixel, and the corresponding box offsets. 93 | */ 94 | mtmn_net_t *pnet_lite_q(dl_matrix3du_t *in, dl_conv_mode mode); 95 | 96 | /** 97 | * @brief Forward the rnet process, refine the boxes from pnet. Calculate in quantization. 98 | * 99 | * @param in Image matrix, rgb888 format 100 | * @param threshold Score threshold to detect human face 101 | * @return Scores for every box, and the corresponding box offsets. 102 | */ 103 | mtmn_net_t *rnet_lite_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 104 | 105 | /** 106 | * @brief Forward the onet process, refine the boxes from rnet. Calculate in quantization. 107 | * 108 | * @param in Image matrix, rgb888 format 109 | * @param threshold Score threshold to detect human face 110 | * @return Scores for every box, and the corresponding box offsets and landmarks. 111 | */ 112 | mtmn_net_t *onet_lite_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 113 | 114 | /** 115 | * @brief Forward the pnet process, coarse detection. Calculate in quantization. 116 | * 117 | * @param in Image matrix, rgb888 format, size is 320x240 118 | * @return Scores for every pixel, and the corresponding box offsets. 
119 | */ 120 | mtmn_net_t *pnet_heavy_q(dl_matrix3du_t *in, dl_conv_mode mode); 121 | 122 | /** 123 | * @brief Forward the rnet process, refine the boxes from pnet. Calculate in quantization. 124 | * 125 | * @param in Image matrix, rgb888 format 126 | * @param threshold Score threshold to detect human face 127 | * @return Scores for every box, and the corresponding box offsets. 128 | */ 129 | mtmn_net_t *rnet_heavy_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 130 | 131 | /** 132 | * @brief Forward the onet process, refine the boxes from rnet. Calculate in quantization. 133 | * 134 | * @param in Image matrix, rgb888 format 135 | * @param threshold Score threshold to detect human face 136 | * @return Scores for every box, and the corresponding box offsets and landmarks. 137 | */ 138 | mtmn_net_t *onet_heavy_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 139 | 140 | #ifdef __cplusplus 141 | } 142 | #endif 143 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam/camera_pins.h: -------------------------------------------------------------------------------- 1 | 2 | #if defined(CAMERA_MODEL_WROVER_KIT) 3 | #define PWDN_GPIO_NUM -1 4 | #define RESET_GPIO_NUM -1 5 | #define XCLK_GPIO_NUM 21 6 | #define SIOD_GPIO_NUM 26 7 | #define SIOC_GPIO_NUM 27 8 | 9 | #define Y9_GPIO_NUM 35 10 | #define Y8_GPIO_NUM 34 11 | #define Y7_GPIO_NUM 39 12 | #define Y6_GPIO_NUM 36 13 | #define Y5_GPIO_NUM 19 14 | #define Y4_GPIO_NUM 18 15 | #define Y3_GPIO_NUM 5 16 | #define Y2_GPIO_NUM 4 17 | #define VSYNC_GPIO_NUM 25 18 | #define HREF_GPIO_NUM 23 19 | #define PCLK_GPIO_NUM 22 20 | 21 | #elif defined(CAMERA_MODEL_ESP_EYE) 22 | #define PWDN_GPIO_NUM -1 23 | #define RESET_GPIO_NUM -1 24 | #define XCLK_GPIO_NUM 4 25 | #define SIOD_GPIO_NUM 18 26 | #define SIOC_GPIO_NUM 23 27 | 28 | #define Y9_GPIO_NUM 36 29 | #define Y8_GPIO_NUM 37 30 | #define Y7_GPIO_NUM 38 31 | #define 
Y6_GPIO_NUM 39 32 | #define Y5_GPIO_NUM 35 33 | #define Y4_GPIO_NUM 14 34 | #define Y3_GPIO_NUM 13 35 | #define Y2_GPIO_NUM 34 36 | #define VSYNC_GPIO_NUM 5 37 | #define HREF_GPIO_NUM 27 38 | #define PCLK_GPIO_NUM 25 39 | 40 | #elif defined(CAMERA_MODEL_M5STACK_PSRAM) 41 | #define PWDN_GPIO_NUM -1 42 | #define RESET_GPIO_NUM 15 43 | #define XCLK_GPIO_NUM 27 44 | #define SIOD_GPIO_NUM 25 45 | #define SIOC_GPIO_NUM 23 46 | 47 | #define Y9_GPIO_NUM 19 48 | #define Y8_GPIO_NUM 36 49 | #define Y7_GPIO_NUM 18 50 | #define Y6_GPIO_NUM 39 51 | #define Y5_GPIO_NUM 5 52 | #define Y4_GPIO_NUM 34 53 | #define Y3_GPIO_NUM 35 54 | #define Y2_GPIO_NUM 32 55 | #define VSYNC_GPIO_NUM 22 56 | #define HREF_GPIO_NUM 26 57 | #define PCLK_GPIO_NUM 21 58 | 59 | #elif defined(CAMERA_MODEL_M5STACK_V2_PSRAM) 60 | #define PWDN_GPIO_NUM -1 61 | #define RESET_GPIO_NUM 15 62 | #define XCLK_GPIO_NUM 27 63 | #define SIOD_GPIO_NUM 22 64 | #define SIOC_GPIO_NUM 23 65 | 66 | #define Y9_GPIO_NUM 19 67 | #define Y8_GPIO_NUM 36 68 | #define Y7_GPIO_NUM 18 69 | #define Y6_GPIO_NUM 39 70 | #define Y5_GPIO_NUM 5 71 | #define Y4_GPIO_NUM 34 72 | #define Y3_GPIO_NUM 35 73 | #define Y2_GPIO_NUM 32 74 | #define VSYNC_GPIO_NUM 25 75 | #define HREF_GPIO_NUM 26 76 | #define PCLK_GPIO_NUM 21 77 | 78 | #elif defined(CAMERA_MODEL_M5STACK_WIDE) 79 | #define PWDN_GPIO_NUM -1 80 | #define RESET_GPIO_NUM 15 81 | #define XCLK_GPIO_NUM 27 82 | #define SIOD_GPIO_NUM 22 83 | #define SIOC_GPIO_NUM 23 84 | 85 | #define Y9_GPIO_NUM 19 86 | #define Y8_GPIO_NUM 36 87 | #define Y7_GPIO_NUM 18 88 | #define Y6_GPIO_NUM 39 89 | #define Y5_GPIO_NUM 5 90 | #define Y4_GPIO_NUM 34 91 | #define Y3_GPIO_NUM 35 92 | #define Y2_GPIO_NUM 32 93 | #define VSYNC_GPIO_NUM 25 94 | #define HREF_GPIO_NUM 26 95 | #define PCLK_GPIO_NUM 21 96 | 97 | #elif defined(CAMERA_MODEL_M5STACK_ESP32CAM) 98 | #define PWDN_GPIO_NUM -1 99 | #define RESET_GPIO_NUM 15 100 | #define XCLK_GPIO_NUM 27 101 | #define SIOD_GPIO_NUM 25 102 | #define SIOC_GPIO_NUM 23 103 | 
104 | #define Y9_GPIO_NUM 19 105 | #define Y8_GPIO_NUM 36 106 | #define Y7_GPIO_NUM 18 107 | #define Y6_GPIO_NUM 39 108 | #define Y5_GPIO_NUM 5 109 | #define Y4_GPIO_NUM 34 110 | #define Y3_GPIO_NUM 35 111 | #define Y2_GPIO_NUM 17 112 | #define VSYNC_GPIO_NUM 22 113 | #define HREF_GPIO_NUM 26 114 | #define PCLK_GPIO_NUM 21 115 | 116 | #elif defined(CAMERA_MODEL_AI_THINKER) 117 | #define PWDN_GPIO_NUM 32 118 | #define RESET_GPIO_NUM -1 119 | #define XCLK_GPIO_NUM 0 120 | #define SIOD_GPIO_NUM 26 121 | #define SIOC_GPIO_NUM 27 122 | 123 | #define Y9_GPIO_NUM 35 124 | #define Y8_GPIO_NUM 34 125 | #define Y7_GPIO_NUM 39 126 | #define Y6_GPIO_NUM 36 127 | #define Y5_GPIO_NUM 21 128 | #define Y4_GPIO_NUM 19 129 | #define Y3_GPIO_NUM 18 130 | #define Y2_GPIO_NUM 5 131 | #define VSYNC_GPIO_NUM 25 132 | #define HREF_GPIO_NUM 23 133 | #define PCLK_GPIO_NUM 22 134 | 135 | #elif defined(CAMERA_MODEL_TTGO_T_JOURNAL) 136 | #define PWDN_GPIO_NUM 0 137 | #define RESET_GPIO_NUM 15 138 | #define XCLK_GPIO_NUM 27 139 | #define SIOD_GPIO_NUM 25 140 | #define SIOC_GPIO_NUM 23 141 | 142 | #define Y9_GPIO_NUM 19 143 | #define Y8_GPIO_NUM 36 144 | #define Y7_GPIO_NUM 18 145 | #define Y6_GPIO_NUM 39 146 | #define Y5_GPIO_NUM 5 147 | #define Y4_GPIO_NUM 34 148 | #define Y3_GPIO_NUM 35 149 | #define Y2_GPIO_NUM 17 150 | #define VSYNC_GPIO_NUM 22 151 | #define HREF_GPIO_NUM 26 152 | #define PCLK_GPIO_NUM 21 153 | 154 | #else 155 | #error "Camera model not selected" 156 | #endif 157 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam/dl_lib_matrix3d.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | 10 | #if CONFIG_SPIRAM_SUPPORT || CONFIG_ESP32_SPIRAM_SUPPORT 11 | #include "freertos/FreeRTOS.h" 12 | #define DL_SPIRAM_SUPPORT 1 13 | #else 14 | #define DL_SPIRAM_SUPPORT 0 15 | #endif 16 | 17 | 18 | #ifndef 
max 19 | #define max(x, y) (((x) < (y)) ? (y) : (x)) 20 | #endif 21 | 22 | #ifndef min 23 | #define min(x, y) (((x) < (y)) ? (x) : (y)) 24 | #endif 25 | 26 | typedef float fptp_t; 27 | typedef uint8_t uc_t; 28 | 29 | typedef enum 30 | { 31 | DL_SUCCESS = 0, 32 | DL_FAIL = 1, 33 | } dl_error_type; 34 | 35 | typedef enum 36 | { 37 | PADDING_VALID = 0, /*!< Valid padding */ 38 | PADDING_SAME = 1, /*!< Same padding, from right to left, free input */ 39 | PADDING_SAME_DONT_FREE_INPUT = 2, /*!< Same padding, from right to left, do not free input */ 40 | PADDING_SAME_MXNET = 3, /*!< Same padding, from left to right */ 41 | } dl_padding_type; 42 | 43 | typedef enum 44 | { 45 | DL_POOLING_MAX = 0, /*!< Max pooling */ 46 | DL_POOLING_AVG = 1, /*!< Average pooling */ 47 | } dl_pooling_type; 48 | /* 49 | * Matrix for 3d 50 | * @Warning: the sequence of variables is fixed and cannot be modified, otherwise there will be errors in esp_dsp_dot_float 51 | */ 52 | typedef struct 53 | { 54 | int w; /*!< Width */ 55 | int h; /*!< Height */ 56 | int c; /*!< Channel */ 57 | int n; /*!< Number of filters; for input and output it must be 1 */ 58 | int stride; /*!< Step between lines */ 59 | fptp_t *item; /*!< Data */ 60 | } dl_matrix3d_t; 61 | 62 | typedef struct 63 | { 64 | int w; /*!< Width */ 65 | int h; /*!< Height */ 66 | int c; /*!< Channel */ 67 | int n; /*!< Number of filters; for input and output it must be 1 */ 68 | int stride; /*!< Step between lines */ 69 | uc_t *item; /*!< Data */ 70 | } dl_matrix3du_t; 71 | 72 | typedef enum 73 | { 74 | UPSAMPLE_NEAREST_NEIGHBOR = 0, /*!< Use nearest neighbor interpolation as the upsample method*/ 75 | UPSAMPLE_BILINEAR = 1, /*!< Use bilinear interpolation as the upsample method*/ 76 | } dl_upsample_type; 77 | 78 | typedef struct 79 | { 80 | int stride_x; /*!< Strides of width */ 81 | int stride_y; /*!< Strides of height */ 82 | dl_padding_type padding; /*!< Padding type */ 83 | } dl_matrix3d_mobilenet_config_t; 84 | 85 | /* 86 | * @brief Allocate a 
zero-initialized space. Must use 'dl_lib_free' to free the memory. 87 | * 88 | * @param cnt Count of units. 89 | * @param size Size of unit. 90 | * @param align Align of memory. If not required, set 0. 91 | * @return Pointer of allocated memory. Null for failed. 92 | */ 93 | static void *dl_lib_calloc(int cnt, int size, int align) 94 | { 95 | int total_size = cnt * size + align + sizeof(void *); 96 | void *res = malloc(total_size); 97 | if (NULL == res) 98 | { 99 | #if DL_SPIRAM_SUPPORT 100 | res = heap_caps_malloc(total_size, MALLOC_CAP_8BIT | MALLOC_CAP_SPIRAM); 101 | } 102 | if (NULL == res) 103 | { 104 | printf("Item psram alloc failed. Size: %d x %d\n", cnt, size); 105 | #else 106 | printf("Item alloc failed. Size: %d x %d, SPIRAM_FLAG: %d\n", cnt, size, DL_SPIRAM_SUPPORT); 107 | #endif 108 | return NULL; 109 | } 110 | bzero(res, total_size); 111 | void **data = (void **)res + 1; 112 | void **aligned; 113 | if (align) 114 | aligned = (void **)(((size_t)data + (align - 1)) & -align); 115 | else 116 | aligned = data; 117 | 118 | aligned[-1] = res; 119 | return (void *)aligned; 120 | } 121 | 122 | /** 123 | * @brief Free the memory space allocated by 'dl_lib_calloc' 124 | * 125 | */ 126 | static inline void dl_lib_free(void *d) 127 | { 128 | if (NULL == d) 129 | return; 130 | 131 | free(((void **)d)[-1]); 132 | } 133 | 134 | /* 135 | * @brief Allocate a 3D matrix with float items, the access sequence is NHWC 136 | * 137 | * @param n Number of matrix3d, for filters it is out channels, for others it is 1 138 | * @param w Width of matrix3d 139 | * @param h Height of matrix3d 140 | * @param c Channel of matrix3d 141 | * @return 3d matrix 142 | */ 143 | static inline dl_matrix3d_t *dl_matrix3d_alloc(int n, int w, int h, int c) 144 | { 145 | dl_matrix3d_t *r = (dl_matrix3d_t *)dl_lib_calloc(1, sizeof(dl_matrix3d_t), 0); 146 | if (NULL == r) 147 | { 148 | printf("internal r failed.\n"); 149 | return NULL; 150 | } 151 | fptp_t *items = (fptp_t *)dl_lib_calloc(n * w * h * 
c, sizeof(fptp_t), 0); 152 | if (NULL == items) 153 | { 154 | printf("matrix3d item alloc failed.\n"); 155 | dl_lib_free(r); 156 | return NULL; 157 | } 158 | 159 | r->w = w; 160 | r->h = h; 161 | r->c = c; 162 | r->n = n; 163 | r->stride = w * c; 164 | r->item = items; 165 | 166 | return r; 167 | } 168 | 169 | /* 170 | * @brief Allocate a 3D matrix with 8-bit items, the access sequence is NHWC 171 | * 172 | * @param n Number of matrix3d, for filters it is out channels, for others it is 1 173 | * @param w Width of matrix3d 174 | * @param h Height of matrix3d 175 | * @param c Channel of matrix3d 176 | * @return 3d matrix 177 | */ 178 | static inline dl_matrix3du_t *dl_matrix3du_alloc(int n, int w, int h, int c) 179 | { 180 | dl_matrix3du_t *r = (dl_matrix3du_t *)dl_lib_calloc(1, sizeof(dl_matrix3du_t), 0); 181 | if (NULL == r) 182 | { 183 | printf("internal r failed.\n"); 184 | return NULL; 185 | } 186 | uc_t *items = (uc_t *)dl_lib_calloc(n * w * h * c, sizeof(uc_t), 0); 187 | if (NULL == items) 188 | { 189 | printf("matrix3du item alloc failed.\n"); 190 | dl_lib_free(r); 191 | return NULL; 192 | } 193 | 194 | r->w = w; 195 | r->h = h; 196 | r->c = c; 197 | r->n = n; 198 | r->stride = w * c; 199 | r->item = items; 200 | 201 | return r; 202 | } 203 | 204 | /* 205 | * @brief Free a matrix3d 206 | * 207 | * @param m matrix3d with float items 208 | */ 209 | static inline void dl_matrix3d_free(dl_matrix3d_t *m) 210 | { 211 | if (NULL == m) 212 | return; 213 | if (NULL == m->item) 214 | { 215 | dl_lib_free(m); 216 | return; 217 | } 218 | dl_lib_free(m->item); 219 | dl_lib_free(m); 220 | } 221 | 222 | /* 223 | * @brief Free a matrix3d 224 | * 225 | * @param m matrix3d with 8-bit items 226 | */ 227 | static inline void dl_matrix3du_free(dl_matrix3du_t *m) 228 | { 229 | if (NULL == m) 230 | return; 231 | if (NULL == m->item) 232 | { 233 | dl_lib_free(m); 234 | return; 235 | } 236 | dl_lib_free(m->item); 237 | dl_lib_free(m); 238 | } 239 | 240 | 241 | /* 242 | * @brief Dot
product with a vector and matrix 243 | * 244 | * @param out Space to put the result 245 | * @param in input vector 246 | * @param f filter matrix 247 | */ 248 | void dl_matrix3dff_dot_product(dl_matrix3d_t *out, dl_matrix3d_t *in, dl_matrix3d_t *f); 249 | 250 | /** 251 | * @brief Do a softmax operation on a matrix3d 252 | * 253 | * @param in Input matrix3d 254 | */ 255 | void dl_matrix3d_softmax(dl_matrix3d_t *m); 256 | 257 | /** 258 | * @brief Copy a range of float items from an existing matrix to a preallocated matrix 259 | * 260 | * @param dst The destination slice matrix 261 | * @param src The source matrix to slice 262 | * @param x X-offset of the origin of the returned matrix within the sliced matrix 263 | * @param y Y-offset of the origin of the returned matrix within the sliced matrix 264 | * @param w Width of the resulting matrix 265 | * @param h Height of the resulting matrix 266 | */ 267 | void dl_matrix3d_slice_copy(dl_matrix3d_t *dst, 268 | dl_matrix3d_t *src, 269 | int x, 270 | int y, 271 | int w, 272 | int h); 273 | 274 | /** 275 | * @brief Copy a range of 8-bit items from an existing matrix to a preallocated matrix 276 | * 277 | * @param dst The destination slice matrix 278 | * @param src The source matrix to slice 279 | * @param x X-offset of the origin of the returned matrix within the sliced matrix 280 | * @param y Y-offset of the origin of the returned matrix within the sliced matrix 281 | * @param w Width of the resulting matrix 282 | * @param h Height of the resulting matrix 283 | */ 284 | void dl_matrix3du_slice_copy(dl_matrix3du_t *dst, 285 | dl_matrix3du_t *src, 286 | int x, 287 | int y, 288 | int w, 289 | int h); 290 | 291 | /** 292 | * @brief Transform a sliced matrix block from nhwc to nchw, the block needs to be memory contiguous.
293 | * 294 | * @param out The destination sliced matrix in nchw 295 | * @param in The source sliced matrix in nhwc 296 | */ 297 | void dl_matrix3d_sliced_transform_nchw(dl_matrix3d_t *out, 298 | dl_matrix3d_t *in); 299 | 300 | /** 301 | * @brief Do a general CNN layer pass, dimension is (number, width, height, channel) 302 | * 303 | * @param in Input matrix3d 304 | * @param filter Weights of the neurons 305 | * @param bias Bias for the CNN layer 306 | * @param stride_x The step length of the convolution window in x(width) direction 307 | * @param stride_y The step length of the convolution window in y(height) direction 308 | * @param padding One of VALID or SAME 309 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 310 | * If ESP_PLATFORM is not defined, this value is not used. Default is 0 311 | * @return dl_matrix3d_t* The result of CNN layer 312 | */ 313 | dl_matrix3d_t *dl_matrix3d_conv(dl_matrix3d_t *in, 314 | dl_matrix3d_t *filter, 315 | dl_matrix3d_t *bias, 316 | int stride_x, 317 | int stride_y, 318 | int padding, 319 | int mode); 320 | 321 | /** 322 | * @brief Do a global average pooling layer pass, dimension is (number, width, height, channel) 323 | * 324 | * @param in Input matrix3d 325 | * 326 | * @return The result of global average pooling layer 327 | */ 328 | dl_matrix3d_t *dl_matrix3d_global_pool(dl_matrix3d_t *in); 329 | 330 | /** 331 | * @brief Calculate pooling layer of a feature map 332 | * 333 | * @param in Input matrix, size (1, w, h, c) 334 | * @param f_w Window width 335 | * @param f_h Window height 336 | * @param stride_x Stride in horizontal direction 337 | * @param stride_y Stride in vertical direction 338 | * @param padding Padding type: PADDING_VALID and PADDING_SAME 339 | * @param pooling_type Pooling type: DL_POOLING_MAX and POOLING_AVG 340 | * @return dl_matrix3d_t* Resulting matrix, size (1, w', h', c) 341 | */ 342 | dl_matrix3d_t *dl_matrix3d_pooling(dl_matrix3d_t *in, 343 | int f_w, 344 | 
int f_h, 345 | int stride_x, 346 | int stride_y, 347 | dl_padding_type padding, 348 | dl_pooling_type pooling_type); 349 | /** 350 | * @brief Do a batch normalization operation, update the input matrix3d: input = input * scale + offset 351 | * 352 | * @param m Input matrix3d 353 | * @param scale scale matrix3d, scale = gamma/((moving_variance+sigma)^(1/2)) 354 | * @param offset Offset matrix3d, offset = beta-(moving_mean*gamma/((moving_variance+sigma)^(1/2))) 355 | */ 356 | void dl_matrix3d_batch_normalize(dl_matrix3d_t *m, 357 | dl_matrix3d_t *scale, 358 | dl_matrix3d_t *offset); 359 | 360 | /** 361 | * @brief Add a pair of matrix3d item-by-item: res=in_1+in_2 362 | * 363 | * @param in_1 First Floating point input matrix3d 364 | * @param in_2 Second Floating point input matrix3d 365 | * 366 | * @return dl_matrix3d_t* Added data 367 | */ 368 | dl_matrix3d_t *dl_matrix3d_add(dl_matrix3d_t *in_1, dl_matrix3d_t *in_2); 369 | 370 | /** 371 | * @brief Concatenate the channels of two matrix3ds into a new matrix3d 372 | * 373 | * @param in_1 First Floating point input matrix3d 374 | * @param in_2 Second Floating point input matrix3d 375 | * 376 | * @return dl_matrix3d_t* A newly allocated matrix3d with values in_1|in_2 377 | */ 378 | dl_matrix3d_t *dl_matrix3d_concat(dl_matrix3d_t *in_1, dl_matrix3d_t *in_2); 379 | 380 | /** 381 | * @brief Concatenate the channels of four matrix3ds into a new matrix3d 382 | * 383 | * @param in_1 First Floating point input matrix3d 384 | * @param in_2 Second Floating point input matrix3d 385 | * @param in_3 Third Floating point input matrix3d 386 | * @param in_4 Fourth Floating point input matrix3d 387 | * 388 | * @return A newly allocated matrix3d with values in_1|in_2|in_3|in_4 389 | */ 390 | dl_matrix3d_t *dl_matrix3d_concat_4(dl_matrix3d_t *in_1, 391 | dl_matrix3d_t *in_2, 392 | dl_matrix3d_t *in_3, 393 | dl_matrix3d_t *in_4); 394 | 395 | /** 396 | * @brief Concatenate the channels of eight matrix3ds into a new matrix3d 397 | *
398 | * @param in_1 First Floating point input matrix3d 399 | * @param in_2 Second Floating point input matrix3d 400 | * @param in_3 Third Floating point input matrix3d 401 | * @param in_4 Fourth Floating point input matrix3d 402 | * @param in_5 Fifth Floating point input matrix3d 403 | * @param in_6 Sixth Floating point input matrix3d 404 | * @param in_7 Seventh Floating point input matrix3d 405 | * @param in_8 Eighth Floating point input matrix3d 406 | * 407 | * @return A newly allocated matrix3d with values in_1|in_2|in_3|in_4|in_5|in_6|in_7|in_8 408 | */ 409 | dl_matrix3d_t *dl_matrix3d_concat_8(dl_matrix3d_t *in_1, 410 | dl_matrix3d_t *in_2, 411 | dl_matrix3d_t *in_3, 412 | dl_matrix3d_t *in_4, 413 | dl_matrix3d_t *in_5, 414 | dl_matrix3d_t *in_6, 415 | dl_matrix3d_t *in_7, 416 | dl_matrix3d_t *in_8); 417 | 418 | /** 419 | * @brief Do a mobilefacenet block forward, dimension is (number, width, height, channel) 420 | * 421 | * @param in Input matrix3d 422 | * @param pw Weights of the pointwise conv layer 423 | * @param pw_bn_scale The scale params of the batch_normalize layer after the pointwise conv layer 424 | * @param pw_bn_offset The offset params of the batch_normalize layer after the pointwise conv layer 425 | * @param dw Weights of the depthwise conv layer 426 | * @param dw_bn_scale The scale params of the batch_normalize layer after the depthwise conv layer 427 | * @param dw_bn_offset The offset params of the batch_normalize layer after the depthwise conv layer 428 | * @param pw_linear Weights of the pointwise linear conv layer 429 | * @param pw_linear_bn_scale The scale params of the batch_normalize layer after the pointwise linear conv layer 430 | * @param pw_linear_bn_offset The offset params of the batch_normalize layer after the pointwise linear conv layer 431 | * @param stride_x The step length of the convolution window in x(width) direction 432 | * @param stride_y The step length of the convolution window in y(height) direction 433 | * @param
padding One of VALID or SAME 434 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 435 | * If ESP_PLATFORM is not defined, this value is not used. Default is 0 436 | * @return The result of a mobilefacenet block 437 | */ 438 | dl_matrix3d_t *dl_matrix3d_mobilefaceblock(dl_matrix3d_t *in, 439 | dl_matrix3d_t *pw, 440 | dl_matrix3d_t *pw_bn_scale, 441 | dl_matrix3d_t *pw_bn_offset, 442 | dl_matrix3d_t *dw, 443 | dl_matrix3d_t *dw_bn_scale, 444 | dl_matrix3d_t *dw_bn_offset, 445 | dl_matrix3d_t *pw_linear, 446 | dl_matrix3d_t *pw_linear_bn_scale, 447 | dl_matrix3d_t *pw_linear_bn_offset, 448 | int stride_x, 449 | int stride_y, 450 | int padding, 451 | int mode, 452 | int shortcut); 453 | 454 | /** 455 | * @brief Do a mobilefacenet block forward with 1x1 split conv, dimension is (number, width, height, channel) 456 | * 457 | * @param in Input matrix3d 458 | * @param pw_1 Weights of the pointwise conv layer 1 459 | * @param pw_2 Weights of the pointwise conv layer 2 460 | * @param pw_bn_scale The scale params of the batch_normalize layer after the pointwise conv layer 461 | * @param pw_bn_offset The offset params of the batch_normalize layer after the pointwise conv layer 462 | * @param dw Weights of the depthwise conv layer 463 | * @param dw_bn_scale The scale params of the batch_normalize layer after the depthwise conv layer 464 | * @param dw_bn_offset The offset params of the batch_normalize layer after the depthwise conv layer 465 | * @param pw_linear_1 Weights of the pointwise linear conv layer 1 466 | * @param pw_linear_2 Weights of the pointwise linear conv layer 2 467 | * @param pw_linear_bn_scale The scale params of the batch_normalize layer after the pointwise linear conv layer 468 | * @param pw_linear_bn_offset The offset params of the batch_normalize layer after the pointwise linear conv layer 469 | * @param stride_x The step length of the convolution window in x(width) direction 470 | * @param stride_y The step 
length of the convolution window in y(height) direction 471 | * @param padding One of VALID or SAME 472 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 473 | * If ESP_PLATFORM is not defined, this value is not used. Default is 0 474 | * @return The result of a mobilefacenet block 475 | */ 476 | dl_matrix3d_t *dl_matrix3d_mobilefaceblock_split(dl_matrix3d_t *in, 477 | dl_matrix3d_t *pw_1, 478 | dl_matrix3d_t *pw_2, 479 | dl_matrix3d_t *pw_bn_scale, 480 | dl_matrix3d_t *pw_bn_offset, 481 | dl_matrix3d_t *dw, 482 | dl_matrix3d_t *dw_bn_scale, 483 | dl_matrix3d_t *dw_bn_offset, 484 | dl_matrix3d_t *pw_linear_1, 485 | dl_matrix3d_t *pw_linear_2, 486 | dl_matrix3d_t *pw_linear_bn_scale, 487 | dl_matrix3d_t *pw_linear_bn_offset, 488 | int stride_x, 489 | int stride_y, 490 | int padding, 491 | int mode, 492 | int shortcut); 493 | 494 | /** 495 | * @brief Initialize the matrix3d feature map to bias 496 | * 497 | * @param out The matrix3d feature map that needs to be initialized 498 | * @param bias The bias of a convolution operation 499 | */ 500 | void dl_matrix3d_init_bias(dl_matrix3d_t *out, dl_matrix3d_t *bias); 501 | 502 | /** 503 | * @brief Do an elementwise multiplication of two matrix3ds 504 | * 505 | * @param out Preallocated matrix3d, size (n, w, h, c) 506 | * @param in1 Input matrix 1, size (n, w, h, c) 507 | * @param in2 Input matrix 2, size (n, w, h, c) 508 | */ 509 | void dl_matrix3d_multiply(dl_matrix3d_t *out, dl_matrix3d_t *in1, dl_matrix3d_t *in2); 510 | 511 | // 512 | // Activation 513 | // 514 | 515 | /** 516 | * @brief Do a standard relu operation, update the input matrix3d 517 | * 518 | * @param m Floating point input matrix3d 519 | */ 520 | void dl_matrix3d_relu(dl_matrix3d_t *m); 521 | 522 | /** 523 | * @brief Do a relu (Rectifier Linear Unit) operation, update the input matrix3d 524 | * 525 | * @param in Floating point input matrix3d 526 | * @param clip If value is higher than this, it will be clipped to this value
527 | */ 528 | void dl_matrix3d_relu_clip(dl_matrix3d_t *m, fptp_t clip); 529 | 530 | /** 531 | * @brief Do a Prelu (Rectifier Linear Unit) operation, update the input matrix3d 532 | * 533 | * @param in Floating point input matrix3d 534 | * @param alpha If value is less than zero, it will be updated by multiplying this factor 535 | */ 536 | void dl_matrix3d_p_relu(dl_matrix3d_t *in, dl_matrix3d_t *alpha); 537 | 538 | /** 539 | * @brief Do a leaky relu (Rectifier Linear Unit) operation, update the input matrix3d 540 | * 541 | * @param in Floating point input matrix3d 542 | * @param alpha If value is less than zero, it will be updated by multiplying this factor 543 | */ 544 | void dl_matrix3d_leaky_relu(dl_matrix3d_t *m, fptp_t alpha); 545 | 546 | // 547 | // Conv 1x1 548 | // 549 | /** 550 | * @brief Do 1x1 convolution with a matrix3d 551 | * 552 | * @param out Preallocated matrix3d, size (1, w, h, n) 553 | * @param in Input matrix, size (1, w, h, c) 554 | * @param filter 1x1 filter, size (n, 1, 1, c) 555 | */ 556 | void dl_matrix3dff_conv_1x1(dl_matrix3d_t *out, 557 | dl_matrix3d_t *in, 558 | dl_matrix3d_t *filter); 559 | 560 | /** 561 | * @brief Do 1x1 convolution with a matrix3d, with bias adding 562 | * 563 | * @param out Preallocated matrix3d, size (1, w, h, n) 564 | * @param in Input matrix, size (1, w, h, c) 565 | * @param filter 1x1 filter, size (n, 1, 1, c) 566 | * @param bias Bias, size (1, 1, 1, n) 567 | */ 568 | void dl_matrix3dff_conv_1x1_with_bias(dl_matrix3d_t *out, 569 | dl_matrix3d_t *in, 570 | dl_matrix3d_t *filter, 571 | dl_matrix3d_t *bias); 572 | 573 | /** 574 | * @brief Do 1x1 convolution with an 8-bit fixed point matrix 575 | * 576 | * @param out Preallocated matrix3d, size (1, w, h, n) 577 | * @param in Input matrix, size (1, w, h, c) 578 | * @param filter 1x1 filter, size (n, 1, 1, c) 579 | */ 580 | void dl_matrix3duf_conv_1x1(dl_matrix3d_t *out, 581 | dl_matrix3du_t *in, 582 | dl_matrix3d_t *filter); 583 | 584 | /** 585 | * @brief Do 1x1 
convolution with an 8-bit fixed point matrix, with bias adding 586 | * 587 | * @param out Preallocated matrix3d, size (1, w, h, n) 588 | * @param in Input matrix, size (1, w, h, c) 589 | * @param filter 1x1 filter, size (n, 1, 1, c) 590 | * @param bias Bias, size (1, 1, 1, n) 591 | */ 592 | void dl_matrix3duf_conv_1x1_with_bias(dl_matrix3d_t *out, 593 | dl_matrix3du_t *in, 594 | dl_matrix3d_t *filter, 595 | dl_matrix3d_t *bias); 596 | 597 | // 598 | // Conv 3x3 599 | // 600 | 601 | /** 602 | * @brief Do 3x3 convolution with a matrix3d, without padding 603 | * 604 | * @param out Preallocated matrix3d, size (1, w, h, n) 605 | * @param in Input matrix, size (1, w, h, c) 606 | * @param f 3x3 filter, size (n, 3, 3, c) 607 | * @param step_x Stride of width 608 | * @param step_y Stride of height 609 | */ 610 | void dl_matrix3dff_conv_3x3_op(dl_matrix3d_t *out, 611 | dl_matrix3d_t *in, 612 | dl_matrix3d_t *f, 613 | int step_x, 614 | int step_y); 615 | 616 | /** 617 | * @brief Do 3x3 convolution with a matrix3d, with bias adding 618 | * 619 | * @param input Input matrix, size (1, w, h, c) 620 | * @param filter 3x3 filter, size (n, 3, 3, c) 621 | * @param bias Bias, size (1, 1, 1, n) 622 | * @param stride_x Stride of width 623 | * @param stride_y Stride of height 624 | * @param padding Padding type 625 | * @return dl_matrix3d_t* Resulting matrix3d 626 | */ 627 | dl_matrix3d_t *dl_matrix3dff_conv_3x3(dl_matrix3d_t *in, 628 | dl_matrix3d_t *filter, 629 | dl_matrix3d_t *bias, 630 | int stride_x, 631 | int stride_y, 632 | dl_padding_type padding); 633 | 634 | // 635 | // Conv Common 636 | // 637 | 638 | /** 639 | * @brief Do a general convolution layer pass with an 8-bit fixed point matrix, size is (number, width, height, channel) 640 | * 641 | * @param in Input image 642 | * @param filter Weights of the neurons 643 | * @param bias Bias for the CNN layer 644 | * @param stride_x The step length of the convolution window in x(width) direction 645 | * @param stride_y The step 
length of the convolution window in y(height) direction 646 | * @param padding Padding type 647 | * @return dl_matrix3d_t* Resulting matrix3d 648 | */ 649 | dl_matrix3d_t *dl_matrix3duf_conv_common(dl_matrix3du_t *in, 650 | dl_matrix3d_t *filter, 651 | dl_matrix3d_t *bias, 652 | int stride_x, 653 | int stride_y, 654 | dl_padding_type padding); 655 | 656 | /** 657 | * @brief Do a general convolution layer pass, size is (number, width, height, channel) 658 | * 659 | * @param in Input image 660 | * @param filter Weights of the neurons 661 | * @param bias Bias for the CNN layer 662 | * @param stride_x The step length of the convolution window in x(width) direction 663 | * @param stride_y The step length of the convolution window in y(height) direction 664 | * @param padding Padding type 665 | * @return dl_matrix3d_t* Resulting matrix3d 666 | */ 667 | dl_matrix3d_t *dl_matrix3dff_conv_common(dl_matrix3d_t *in, 668 | dl_matrix3d_t *filter, 669 | dl_matrix3d_t *bias, 670 | int stride_x, 671 | int stride_y, 672 | dl_padding_type padding); 673 | 674 | // 675 | // Depthwise 3x3 676 | // 677 | 678 | /** 679 | * @brief Do 3x3 depthwise convolution with a float matrix3d 680 | * 681 | * @param in Input matrix, size (1, w, h, c) 682 | * @param filter 3x3 filter, size (1, 3, 3, c) 683 | * @param stride_x Stride of width 684 | * @param stride_y Stride of height 685 | * @param padding Padding type, 0: valid, 1: same 686 | * @return dl_matrix3d_t* Resulting float matrix3d 687 | */ 688 | dl_matrix3d_t *dl_matrix3dff_depthwise_conv_3x3(dl_matrix3d_t *in, 689 | dl_matrix3d_t *filter, 690 | int stride_x, 691 | int stride_y, 692 | int padding); 693 | 694 | /** 695 | * @brief Do 3x3 depthwise convolution with an 8-bit fixed point matrix 696 | * 697 | * @param in Input matrix, size (1, w, h, c) 698 | * @param filter 3x3 filter, size (1, 3, 3, c) 699 | * @param stride_x Stride of width 700 | * @param stride_y Stride of height 701 | * @param padding Padding type, 0: valid, 1: same 702 | *
@return dl_matrix3d_t* Resulting float matrix3d 703 | */ 704 | dl_matrix3d_t *dl_matrix3duf_depthwise_conv_3x3(dl_matrix3du_t *in, 705 | dl_matrix3d_t *filter, 706 | int stride_x, 707 | int stride_y, 708 | int padding); 709 | 710 | /** 711 | * @brief Do 3x3 depthwise convolution with a float matrix3d, without padding 712 | * 713 | * @param out Preallocated matrix3d, size (1, w, h, n) 714 | * @param in Input matrix, size (1, w, h, c) 715 | * @param f 3x3 filter, size (1, 3, 3, c) 716 | * @param step_x Stride of width 717 | * @param step_y Stride of height 718 | */ 719 | void dl_matrix3dff_depthwise_conv_3x3_op(dl_matrix3d_t *out, 720 | dl_matrix3d_t *in, 721 | dl_matrix3d_t *f, 722 | int step_x, 723 | int step_y); 724 | 725 | // 726 | // Depthwise Common 727 | // 728 | 729 | /** 730 | * @brief Do a depthwise CNN layer pass, dimension is (number, width, height, channel) 731 | * 732 | * @param in Input matrix3d 733 | * @param filter Weights of the neurons 734 | * @param stride_x The step length of the convolution window in x(width) direction 735 | * @param stride_y The step length of the convolution window in y(height) direction 736 | * @param padding One of VALID or SAME 737 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 738 | * If ESP_PLATFORM is not defined, this value is not used. 
Default is 0 739 | * @return The result of depthwise CNN layer 740 | */ 741 | dl_matrix3d_t *dl_matrix3dff_depthwise_conv_common(dl_matrix3d_t *in, 742 | dl_matrix3d_t *filter, 743 | int stride_x, 744 | int stride_y, 745 | dl_padding_type padding); 746 | 747 | // 748 | // FC 749 | // 750 | /** 751 | * @brief Do a general fully connected layer pass, dimension is (number, width, height, channel) 752 | * 753 | * @param in Input matrix3d, size is (1, w, 1, 1) 754 | * @param filter Weights of the neurons, size is (1, w, h, 1) 755 | * @param out Preallocated resulting matrix, size is (1, 1, 1, h) 756 | * 757 | */ 758 | void dl_matrix3dff_fc(dl_matrix3d_t *out, 759 | dl_matrix3d_t *in, 760 | dl_matrix3d_t *filter); 761 | 762 | /** 763 | * @brief Do fully connected layer forward, with bias adding 764 | * 765 | * @param out Preallocated resulting matrix, size (1, 1, 1, h) 766 | * @param in Input matrix, size (1, 1, 1, w) 767 | * @param filter Filter matrix, size (1, w, h, 1) 768 | * @param bias Bias matrix, size (1, 1, 1, h) 769 | */ 770 | void dl_matrix3dff_fc_with_bias(dl_matrix3d_t *out, 771 | dl_matrix3d_t *in, 772 | dl_matrix3d_t *filter, 773 | dl_matrix3d_t *bias); 774 | 775 | // 776 | // Mobilenet 777 | // 778 | 779 | /** 780 | * @brief Do a mobilenet block forward, dimension is (number, width, height, channel) 781 | * 782 | * @param in Input matrix3d 783 | * @param filter Weights of the neurons 784 | * @param stride_x The step length of the convolution window in x(width) direction 785 | * @param stride_y The step length of the convolution window in y(height) direction 786 | * @param padding One of VALID or SAME 787 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 788 | * If ESP_PLATFORM is not defined, this value is not used.
Default is 0 789 | * @return The result of depthwise CNN layer 790 | */ 791 | dl_matrix3d_t *dl_matrix3dff_mobilenet(dl_matrix3d_t *in, 792 | dl_matrix3d_t *dilate_filter, 793 | dl_matrix3d_t *dilate_prelu, 794 | dl_matrix3d_t *depthwise_filter, 795 | dl_matrix3d_t *depthwise_prelu, 796 | dl_matrix3d_t *compress_filter, 797 | dl_matrix3d_t *bias, 798 | dl_matrix3d_mobilenet_config_t config); 799 | 800 | /** 801 | * @brief Do a mobilenet block forward, dimension is (number, width, height, channel) 802 | * 803 | * @param in Input matrix3du 804 | * @param filter Weights of the neurons 805 | * @param stride_x The step length of the convolution window in x(width) direction 806 | * @param stride_y The step length of the convolution window in y(height) direction 807 | * @param padding One of VALID or SAME 808 | * @param mode Do convolution using C implement or xtensa implement, 0 or 1, with respect 809 | * If ESP_PLATFORM is not defined, this value is not used. Default is 0 810 | * @return The result of depthwise CNN layer 811 | */ 812 | dl_matrix3d_t *dl_matrix3duf_mobilenet(dl_matrix3du_t *in, 813 | dl_matrix3d_t *dilate_filter, 814 | dl_matrix3d_t *dilate_prelu, 815 | dl_matrix3d_t *depthwise_filter, 816 | dl_matrix3d_t *depthwise_prelu, 817 | dl_matrix3d_t *compress_filter, 818 | dl_matrix3d_t *bias, 819 | dl_matrix3d_mobilenet_config_t config); 820 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam/edge-impulse-esp32-cam.ino: -------------------------------------------------------------------------------- 1 | /* 2 | Live Image Classification on ESP32-CAM and ST7735 TFT display 3 | using MobileNet v1 from Edge Impulse 4 | Modified from https://github.com/edgeimpulse/example-esp32-cam. 5 | 6 | Note: 7 | The ST7735 TFT size has to be at least 120x120. 8 | Do not use Arduino IDE 2.0 or you won't be able to see the serial output! 
9 | */ 10 | 11 | #include <esp32-cam-cat-dog_inferencing.h> // replace with your deployed Edge Impulse library 12 | 13 | #define CAMERA_MODEL_AI_THINKER 14 | 15 | #include "img_converters.h" 16 | #include "image_util.h" 17 | #include "esp_camera.h" 18 | #include "camera_pins.h" 19 | 20 | #include <Adafruit_GFX.h> // Core graphics library 21 | #include <Adafruit_ST7735.h> // Hardware-specific library for ST7735 22 | 23 | #define TFT_SCLK 14 // SCL 24 | #define TFT_MOSI 13 // SDA 25 | #define TFT_RST 12 // RES (RESET) 26 | #define TFT_DC 2 // Data Command control pin 27 | #define TFT_CS 15 // Chip select control pin 28 | // BL (back light) and VCC -> 3V3 29 | 30 | #define BTN 4 // button (shared with flash led) 31 | 32 | dl_matrix3du_t *resized_matrix = NULL; 33 | ei_impulse_result_t result = {0}; 34 | 35 | Adafruit_ST7735 tft = Adafruit_ST7735(TFT_CS, TFT_DC, TFT_MOSI, TFT_SCLK, TFT_RST); 36 | 37 | // setup 38 | void setup() { 39 | Serial.begin(115200); 40 | 41 | // button 42 | pinMode(BTN, INPUT); 43 | 44 | // TFT display init 45 | tft.initR(INITR_GREENTAB); // you might need to use INITR_REDTAB or INITR_BLACKTAB to get correct text colors 46 | tft.setRotation(0); 47 | tft.fillScreen(ST77XX_BLACK); 48 | 49 | // cam config 50 | camera_config_t config; 51 | config.ledc_channel = LEDC_CHANNEL_0; 52 | config.ledc_timer = LEDC_TIMER_0; 53 | config.pin_d0 = Y2_GPIO_NUM; 54 | config.pin_d1 = Y3_GPIO_NUM; 55 | config.pin_d2 = Y4_GPIO_NUM; 56 | config.pin_d3 = Y5_GPIO_NUM; 57 | config.pin_d4 = Y6_GPIO_NUM; 58 | config.pin_d5 = Y7_GPIO_NUM; 59 | config.pin_d6 = Y8_GPIO_NUM; 60 | config.pin_d7 = Y9_GPIO_NUM; 61 | config.pin_xclk = XCLK_GPIO_NUM; 62 | config.pin_pclk = PCLK_GPIO_NUM; 63 | config.pin_vsync = VSYNC_GPIO_NUM; 64 | config.pin_href = HREF_GPIO_NUM; 65 | config.pin_sscb_sda = SIOD_GPIO_NUM; 66 | config.pin_sscb_scl = SIOC_GPIO_NUM; 67 | config.pin_pwdn = PWDN_GPIO_NUM; 68 | config.pin_reset = RESET_GPIO_NUM; 69 | config.xclk_freq_hz = 20000000; 70 | config.pixel_format = PIXFORMAT_JPEG; 71 | config.frame_size = FRAMESIZE_240X240; 72 |
config.jpeg_quality = 10; 73 | config.fb_count = 1; 74 | 75 | // camera init 76 | esp_err_t err = esp_camera_init(&config); 77 | if (err != ESP_OK) { 78 | Serial.printf("Camera init failed with error 0x%x", err); 79 | return; 80 | } 81 | 82 | sensor_t * s = esp_camera_sensor_get(); 83 | // initial sensors are flipped vertically and colors are a bit saturated 84 | if (s->id.PID == OV3660_PID) { 85 | s->set_vflip(s, 1); // flip it back 86 | s->set_brightness(s, 1); // up the brightness just a bit 87 | s->set_saturation(s, 0); // lower the saturation 88 | } 89 | 90 | Serial.println("Camera Ready!...(standby, press button to start)"); 91 | tft_drawtext(4, 4, "Standby", 1, ST77XX_BLUE); 92 | } 93 | 94 | // main loop 95 | void loop() { 96 | 97 | // wait until the button is pressed 98 | while (!digitalRead(BTN)); 99 | delay(100); 100 | 101 | // capture an image and classify it 102 | String result = classify(); 103 | 104 | // display result 105 | Serial.printf("Result: %s\n", result.c_str()); 106 | tft_drawtext(4, 120 - 16, result, 2, ST77XX_GREEN); 107 | } 108 | 109 | // classify labels 110 | String classify() { 111 | 112 | // run image capture once to force clear buffer 113 | // otherwise the captured image below would only show up next time you pressed the button!
capture_quick(); 115 | 116 | // capture image from camera 117 | if (!capture()) return "Error"; 118 | tft_drawtext(4, 4, "Classifying...", 1, ST77XX_CYAN); 119 | 120 | Serial.println("Getting image..."); 121 | signal_t signal; 122 | signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT; 123 | signal.get_data = &raw_feature_get_data; 124 | 125 | Serial.println("Run classifier..."); 126 | // Feed signal to the classifier 127 | EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false /* debug */); 128 | // --- Free memory --- 129 | dl_matrix3du_free(resized_matrix); 130 | 131 | // --- "res" holds the returned status; the prediction data is stored in "result" --- 132 | ei_printf("run_classifier returned: %d\n", res); 133 | if (res != 0) return "Error"; 134 | 135 | // --- print the predictions --- 136 | ei_printf("Predictions (DSP: %d ms., Classification: %d ms., Anomaly: %d ms.): \n", 137 | result.timing.dsp, result.timing.classification, result.timing.anomaly); 138 | int index = 0; 139 | float score = 0.0; 140 | for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) { 141 | // record the most probable label 142 | if (result.classification[ix].value > score) { 143 | score = result.classification[ix].value; 144 | index = ix; 145 | } 146 | ei_printf(" %s: \t%f\r\n", result.classification[ix].label, result.classification[ix].value); 147 | tft_drawtext(4, 12 + 8 * ix, String(result.classification[ix].label) + " " + String(result.classification[ix].value * 100) + "%", 1, ST77XX_ORANGE); 148 | } 149 | 150 | #if EI_CLASSIFIER_HAS_ANOMALY == 1 151 | ei_printf(" anomaly score: %f\r\n", result.anomaly); 152 | #endif 153 | 154 | // --- return the most probable label --- 155 | return String(result.classification[index].label); 156 | } 157 | 158 | // quick capture (to clear buffer) 159 | void capture_quick() { 160 | camera_fb_t *fb = NULL; 161 | fb = esp_camera_fb_get(); 162 | if (!fb) return; 163 | esp_camera_fb_return(fb); 164 | } 165 | 166 | // capture
image from cam 167 | bool capture() { 168 | 169 | Serial.println("Capture image..."); 170 | esp_err_t res = ESP_OK; 171 | camera_fb_t *fb = NULL; 172 | fb = esp_camera_fb_get(); 173 | if (!fb) { 174 | Serial.println("Camera capture failed"); 175 | return false; 176 | } 177 | 178 | // --- Convert frame to RGB888 --- 179 | Serial.println("Converting to RGB888..."); 180 | // Allocate rgb888_matrix buffer 181 | dl_matrix3du_t *rgb888_matrix = dl_matrix3du_alloc(1, fb->width, fb->height, 3); 182 | fmt2rgb888(fb->buf, fb->len, fb->format, rgb888_matrix->item); 183 | 184 | // --- Resize the RGB888 frame to 96x96 in this example --- 185 | Serial.println("Resizing the frame buffer..."); 186 | resized_matrix = dl_matrix3du_alloc(1, EI_CLASSIFIER_INPUT_WIDTH, EI_CLASSIFIER_INPUT_HEIGHT, 3); 187 | image_resize_linear(resized_matrix->item, rgb888_matrix->item, EI_CLASSIFIER_INPUT_WIDTH, EI_CLASSIFIER_INPUT_HEIGHT, 3, fb->width, fb->height); 188 | 189 | // --- Convert frame to RGB565 and display on the TFT --- 190 | Serial.println("Converting to RGB565 and display on TFT..."); 191 | uint8_t *rgb565 = (uint8_t *) malloc(240 * 240 * 3); 192 | jpg2rgb565(fb->buf, fb->len, rgb565, JPG_SCALE_2X); // scale to half size 193 | tft.drawRGBBitmap(0, 0, (uint16_t*)rgb565, 120, 120); 194 | 195 | // --- Free memory --- 196 | free(rgb565); 197 | dl_matrix3du_free(rgb888_matrix); 198 | esp_camera_fb_return(fb); 199 | 200 | return true; 201 | } 202 | 203 | int raw_feature_get_data(size_t offset, size_t out_len, float *signal_ptr) { 204 | 205 | size_t pixel_ix = offset * 3; 206 | size_t bytes_left = out_len; 207 | size_t out_ptr_ix = 0; 208 | 209 | // read pixel by pixel 210 | while (bytes_left != 0) { 211 | // grab the values and convert to r/g/b 212 | uint8_t r, g, b; 213 | r = resized_matrix->item[pixel_ix]; 214 | g = resized_matrix->item[pixel_ix + 1]; 215 | b = resized_matrix->item[pixel_ix + 2]; 216 | 217 | // then convert to out_ptr format 218 | float pixel_f = (r << 16) + (g << 8) + b;
219 | signal_ptr[out_ptr_ix] = pixel_f; 220 | 221 | // and go to the next pixel 222 | out_ptr_ix++; 223 | pixel_ix += 3; 224 | bytes_left--; 225 | } 226 | 227 | return 0; 228 | } 229 | 230 | // draw text on TFT 231 | void tft_drawtext(int16_t x, int16_t y, String text, uint8_t font_size, uint16_t color) { 232 | tft.setCursor(x, y); 233 | tft.setTextSize(font_size); // font size 1 = 6x8, 2 = 12x16, 3 = 18x24 234 | tft.setTextColor(color); 235 | tft.setTextWrap(true); 236 | tft.print(text); 237 | } 238 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam/esp_image.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | * ESPRESSIF MIT License 3 | * 4 | * Copyright (c) 2018 5 | * 6 | * Permission is hereby granted for use on ESPRESSIF SYSTEMS products only, in which case, 7 | * it is free of charge, to any person obtaining a copy of this software and associated 8 | * documentation files (the "Software"), to deal in the Software without restriction, including 9 | * without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 10 | * and/or sell copies of the Software, and to permit persons to whom the Software is furnished 11 | * to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all copies or 14 | * substantial portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 18 | * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR 19 | * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 21 | * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 22 | * 23 | */ 24 | #pragma once 25 | 26 | #ifdef __cplusplus 27 | extern "C" 28 | { 29 | #endif 30 | 31 | #include <stdint.h> 32 | #include <math.h> 33 | #include <assert.h> 34 | 35 | #ifdef __cplusplus 36 | } 37 | #endif 38 | 39 | typedef enum 40 | { 41 | IMAGE_RESIZE_BILINEAR = 0, /*!< bilinear interpolation */ 42 | IMAGE_RESIZE_MEAN, /*!< mean of the four nearest pixels */ 43 | IMAGE_RESIZE_NEAREST /*!< nearest neighbour */ 44 | } image_resize_t; 45 | 46 | template <typename T> 47 | class Image 48 | { 49 | public: 50 | /** 51 | * @brief Convert a RGB565 pixel to RGB888 52 | * 53 | * @param input Pixel value in RGB565 54 | * @param output Pixel value in RGB888 55 | */ 56 | static inline void pixel_rgb565_to_rgb888(uint16_t input, T *output) 57 | { 58 | output[2] = (input & 0x1F00) >> 5; //blue 59 | output[1] = ((input & 0x7) << 5) | ((input & 0xE000) >> 11); //green 60 | output[0] = input & 0xF8; //red 61 | }; 62 | 63 | /** 64 | * @brief Resize a RGB565 image to a RGB888 image 65 | * 66 | * @param dst_image The destination image 67 | * @param y_start The start y index of where resized image located 68 | * @param y_end The end y index of where resized image located 69 | * @param x_start The start x index of where resized image located 70 | * @param x_end The end x index of where resized image located 71 | * @param channel The channel number of image 72 | * @param src_image The source image 73 | * @param src_h The height of source image 74 | * @param src_w The width of source image 75 | * @param dst_w The width of destination image 76 | * @param shift_left The bit number of left shifting 77 | * @param type The resize type 78 | */ 79 | static void resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint16_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 80 | 81 | /** 82 | * @brief Resize a RGB888 image to a RGB888 image 83 | * 84 | * @param
dst_image The destination image 85 | * @param y_start The start y index of where resized image located 86 | * @param y_end The end y index of where resized image located 87 | * @param x_start The start x index of where resized image located 88 | * @param x_end The end x index of where resized image located 89 | * @param channel The channel number of image 90 | * @param src_image The source image 91 | * @param src_h The height of source image 92 | * @param src_w The width of source image 93 | * @param dst_w The width of destination image 94 | * @param shift_left The bit number of left shifting 95 | * @param type The resize type 96 | */ 97 | static void resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint8_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 98 | // static void resize_to_rgb565(uint16_t *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint16_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 99 | // static void resize_to_rgb565(uint16_t *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint8_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type); 100 | }; 101 | 102 | template <typename T> 103 | void Image<T>::resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint16_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type) 104 | { 105 | assert(channel == 3); 106 | float scale_y = (float)src_h / (y_end - y_start); 107 | float scale_x = (float)src_w / (x_end - x_start); 108 | int temp[13]; 109 | 110 | switch (type) 111 | { 112 | case IMAGE_RESIZE_BILINEAR: 113 | for (size_t y = y_start; y < y_end; y++) 114 | { 115 | float ratio_y[2]; 116 | ratio_y[0] = (float)((y + 0.5) * scale_y - 0.5); // y 117 | int src_y = (int)ratio_y[0]; // y1 118 | ratio_y[0] -= src_y; // y - y1 119 | 120 | if (src_y < 0) 121 | { 122 | ratio_y[0]
= 0; 123 | src_y = 0; 124 | } 125 | if (src_y > src_h - 2) 126 | { 127 | ratio_y[0] = 0; 128 | src_y = src_h - 2; 129 | } 130 | ratio_y[1] = 1 - ratio_y[0]; // y2 - y 131 | 132 | int _dst_i = y * dst_w; 133 | 134 | int _src_row_0 = src_y * src_w; 135 | int _src_row_1 = _src_row_0 + src_w; 136 | 137 | for (size_t x = x_start; x < x_end; x++) 138 | { 139 | float ratio_x[2]; 140 | ratio_x[0] = (float)((x + 0.5) * scale_x - 0.5); // x 141 | int src_x = (int)ratio_x[0]; // x1 142 | ratio_x[0] -= src_x; // x - x1 143 | 144 | if (src_x < 0) 145 | { 146 | ratio_x[0] = 0; 147 | src_x = 0; 148 | } 149 | if (src_x > src_w - 2) 150 | { 151 | ratio_x[0] = 0; 152 | src_x = src_w - 2; 153 | } 154 | ratio_x[1] = 1 - ratio_x[0]; // x2 - x 155 | 156 | int dst_i = (_dst_i + x) * channel; 157 | 158 | int src_row_0 = _src_row_0 + src_x; 159 | int src_row_1 = _src_row_1 + src_x; 160 | 161 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_0], temp); 162 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_0 + 1], temp + 3); 163 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_1], temp + 6); 164 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_1 + 1], temp + 9); 165 | 166 | for (int c = 0; c < channel; c++) 167 | { 168 | temp[12] = round(temp[c] * ratio_x[1] * ratio_y[1] + temp[channel + c] * ratio_x[0] * ratio_y[1] + temp[channel + channel + c] * ratio_x[1] * ratio_y[0] + temp[channel + channel + channel + c] * ratio_x[0] * ratio_y[0]); 169 | dst_image[dst_i + c] = (shift_left > 0) ?
(temp[12] << shift_left) : (temp[12] >> -shift_left); 170 | } 171 | } 172 | } 173 | break; 174 | 175 | case IMAGE_RESIZE_MEAN: 176 | shift_left -= 2; 177 | for (int y = y_start; y < y_end; y++) 178 | { 179 | int _dst_i = y * dst_w; 180 | 181 | float _src_row_0 = rintf(y * scale_y) * src_w; 182 | float _src_row_1 = _src_row_0 + src_w; 183 | 184 | for (int x = x_start; x < x_end; x++) 185 | { 186 | int dst_i = (_dst_i + x) * channel; 187 | 188 | int src_row_0 = (_src_row_0 + rintf(x * scale_x)); 189 | int src_row_1 = (_src_row_1 + rintf(x * scale_x)); 190 | 191 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_0], temp); 192 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_0 + 1], temp + 3); 193 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_1], temp + 6); 194 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_row_1 + 1], temp + 9); 195 | 196 | dst_image[dst_i] = (shift_left > 0) ? ((temp[0] + temp[3] + temp[6] + temp[9]) << shift_left) : ((temp[0] + temp[3] + temp[6] + temp[9]) >> -shift_left); 197 | dst_image[dst_i + 1] = (shift_left > 0) ? ((temp[1] + temp[4] + temp[7] + temp[10]) << shift_left) : ((temp[1] + temp[4] + temp[7] + temp[10]) >> -shift_left); 198 | dst_image[dst_i + 2] = (shift_left > 0) ? ((temp[2] + temp[5] + temp[8] + temp[11]) << shift_left) : ((temp[2] + temp[5] + temp[8] + temp[11]) >> -shift_left); 199 | } 200 | } 201 | 202 | break; 203 | 204 | case IMAGE_RESIZE_NEAREST: 205 | for (size_t y = y_start; y < y_end; y++) 206 | { 207 | int _dst_i = y * dst_w; 208 | float _src_i = rintf(y * scale_y) * src_w; 209 | 210 | for (size_t x = x_start; x < x_end; x++) 211 | { 212 | int dst_i = (_dst_i + x) * channel; 213 | int src_i = _src_i + rintf(x * scale_x); 214 | 215 | Image<int>::pixel_rgb565_to_rgb888(src_image[src_i], temp); 216 | 217 | dst_image[dst_i] = (shift_left > 0) ? (temp[0] << shift_left) : (temp[0] >> -shift_left); 218 | dst_image[dst_i + 1] = (shift_left > 0) ?
(temp[1] << shift_left) : (temp[1] >> -shift_left); 219 | dst_image[dst_i + 2] = (shift_left > 0) ? (temp[2] << shift_left) : (temp[2] >> -shift_left); 220 | } 221 | } 222 | break; 223 | 224 | default: 225 | break; 226 | } 227 | } 228 | 229 | template <typename T> 230 | void Image<T>::resize_to_rgb888(T *dst_image, int y_start, int y_end, int x_start, int x_end, int channel, uint8_t *src_image, int src_h, int src_w, int dst_w, int shift_left, image_resize_t type) 231 | { 232 | float scale_y = (float)src_h / (y_end - y_start); 233 | float scale_x = (float)src_w / (x_end - x_start); 234 | int temp; 235 | 236 | switch (type) 237 | { 238 | case IMAGE_RESIZE_BILINEAR: 239 | for (size_t y = y_start; y < y_end; y++) 240 | { 241 | float ratio_y[2]; 242 | ratio_y[0] = (float)((y + 0.5) * scale_y - 0.5); // y 243 | int src_y = (int)ratio_y[0]; // y1 244 | ratio_y[0] -= src_y; // y - y1 245 | 246 | if (src_y < 0) 247 | { 248 | ratio_y[0] = 0; 249 | src_y = 0; 250 | } 251 | if (src_y > src_h - 2) 252 | { 253 | ratio_y[0] = 0; 254 | src_y = src_h - 2; 255 | } 256 | ratio_y[1] = 1 - ratio_y[0]; // y2 - y 257 | 258 | int _dst_i = y * dst_w; 259 | 260 | int _src_row_0 = src_y * src_w; 261 | int _src_row_1 = _src_row_0 + src_w; 262 | 263 | for (size_t x = x_start; x < x_end; x++) 264 | { 265 | float ratio_x[2]; 266 | ratio_x[0] = (float)((x + 0.5) * scale_x - 0.5); // x 267 | int src_x = (int)ratio_x[0]; // x1 268 | ratio_x[0] -= src_x; // x - x1 269 | 270 | if (src_x < 0) 271 | { 272 | ratio_x[0] = 0; 273 | src_x = 0; 274 | } 275 | if (src_x > src_w - 2) 276 | { 277 | ratio_x[0] = 0; 278 | src_x = src_w - 2; 279 | } 280 | ratio_x[1] = 1 - ratio_x[0]; // x2 - x 281 | 282 | int dst_i = (_dst_i + x) * channel; 283 | 284 | int src_row_0 = (_src_row_0 + src_x) * channel; 285 | int src_row_1 = (_src_row_1 + src_x) * channel; 286 | 287 | for (int c = 0; c < channel; c++) 288 | { 289 | temp = round(src_image[src_row_0 + c] * ratio_x[1] * ratio_y[1] + src_image[src_row_0 + channel + c] * ratio_x[0] *
ratio_y[1] + src_image[src_row_1 + c] * ratio_x[1] * ratio_y[0] + src_image[src_row_1 + channel + c] * ratio_x[0] * ratio_y[0]); 290 | dst_image[dst_i + c] = (shift_left > 0) ? (temp << shift_left) : (temp >> -shift_left); 291 | } 292 | } 293 | } 294 | break; 295 | 296 | case IMAGE_RESIZE_MEAN: 297 | shift_left -= 2; 298 | 299 | for (size_t y = y_start; y < y_end; y++) 300 | { 301 | int _dst_i = y * dst_w; 302 | 303 | float _src_row_0 = rintf(y * scale_y) * src_w; 304 | float _src_row_1 = _src_row_0 + src_w; 305 | 306 | for (size_t x = x_start; x < x_end; x++) 307 | { 308 | int dst_i = (_dst_i + x) * channel; 309 | 310 | int src_row_0 = (_src_row_0 + rintf(x * scale_x)) * channel; 311 | int src_row_1 = (_src_row_1 + rintf(x * scale_x)) * channel; 312 | 313 | for (size_t c = 0; c < channel; c++) 314 | { 315 | temp = (int)src_image[src_row_0 + c] + (int)src_image[src_row_0 + channel + c] + (int)src_image[src_row_1 + c] + (int)src_image[src_row_1 + channel + c]; 316 | dst_image[dst_i + c] = (shift_left > 0) ? (temp << shift_left) : (temp >> -shift_left); 317 | } 318 | } 319 | } 320 | break; 321 | 322 | case IMAGE_RESIZE_NEAREST: 323 | for (size_t y = y_start; y < y_end; y++) 324 | { 325 | int _dst_i = y * dst_w; 326 | float _src_i = rintf(y * scale_y) * src_w; 327 | 328 | for (size_t x = x_start; x < x_end; x++) 329 | { 330 | int dst_i = (_dst_i + x) * channel; 331 | int src_i = (_src_i + rintf(x * scale_x)) * channel; 332 | 333 | for (size_t c = 0; c < channel; c++) 334 | { 335 | dst_image[dst_i + c] = (shift_left > 0) ? 
((T)src_image[src_i + c] << shift_left) : ((T)src_image[src_i + c] >> -shift_left); 336 | } 337 | } 338 | } 339 | break; 340 | 341 | default: 342 | break; 343 | } 344 | } -------------------------------------------------------------------------------- /edge-impulse-esp32-cam/frmn.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #if __cplusplus 4 | extern "C" 5 | { 6 | #endif 7 | 8 | #include "dl_lib_matrix3d.h" 9 | #include "dl_lib_matrix3dq.h" 10 | 11 | /** 12 | * @brief Forward the face recognition process with frmn model. Calculate in float. 13 | * 14 | * @param in Image matrix, rgb888 format, size is 56x56, normalized 15 | * @return dl_matrix3d_t* Face ID feature vector, size is 512 16 | */ 17 | dl_matrix3d_t *frmn(dl_matrix3d_t *in); 18 | 19 | /**@{*/ 20 | /** 21 | * @brief Forward the face recognition process with specified model. Calculate in quantization. 22 | * 23 | * @param in Image matrix, rgb888 format, size is 56x56, normalized 24 | * @param mode 0: C implement; 1: handwrite xtensa instruction implement 25 | * @return Face ID feature vector, size is 512 26 | */ 27 | dl_matrix3dq_t *frmn_q(dl_matrix3dq_t *in, dl_conv_mode mode); 28 | 29 | dl_matrix3dq_t *frmn2p_q(dl_matrix3dq_t *in, dl_conv_mode mode); 30 | 31 | dl_matrix3dq_t *mfn56_42m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 32 | 33 | dl_matrix3dq_t *mfn56_72m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 34 | 35 | dl_matrix3dq_t *mfn56_112m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 36 | 37 | dl_matrix3dq_t *mfn56_156m_q(dl_matrix3dq_t *in, dl_conv_mode mode); 38 | 39 | /**@}*/ 40 | 41 | #if __cplusplus 42 | } 43 | #endif 44 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam/image_util.h: -------------------------------------------------------------------------------- 1 | /* 2 | * ESPRESSIF MIT License 3 | * 4 | * Copyright (c) 2018 5 | * 6 | * Permission is hereby granted for use on 
ESPRESSIF SYSTEMS products only, in which case, 7 | * it is free of charge, to any person obtaining a copy of this software and associated 8 | * documentation files (the "Software"), to deal in the Software without restriction, including 9 | * without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 10 | * and/or sell copies of the Software, and to permit persons to whom the Software is furnished 11 | * to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all copies or 14 | * substantial portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 18 | * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 19 | * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 20 | * IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 21 | * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 22 | * 23 | */ 24 | #pragma once 25 | #ifdef __cplusplus 26 | extern "C" 27 | { 28 | #endif 29 | #include <stdint.h> 30 | #include <math.h> 31 | #include "mtmn.h" 32 | 33 | #define LANDMARKS_NUM (10) 34 | 35 | #define MAX_VALID_COUNT_PER_IMAGE (30) 36 | 37 | #define DL_IMAGE_MIN(A, B) ((A) < (B) ? (A) : (B)) 38 | #define DL_IMAGE_MAX(A, B) ((A) < (B) ?
(B) : (A)) 39 | 40 | #define RGB565_MASK_RED 0xF800 41 | #define RGB565_MASK_GREEN 0x07E0 42 | #define RGB565_MASK_BLUE 0x001F 43 | 44 | typedef enum 45 | { 46 | BINARY, /*!< binary */ 47 | } en_threshold_mode; 48 | 49 | typedef struct 50 | { 51 | fptp_t landmark_p[LANDMARKS_NUM]; /*!< landmark struct */ 52 | } landmark_t; 53 | 54 | typedef struct 55 | { 56 | fptp_t box_p[4]; /*!< box struct */ 57 | } box_t; 58 | 59 | typedef struct tag_box_list 60 | { 61 | uint8_t *category; /*!< The category of the corresponding box */ 62 | fptp_t *score; /*!< The confidence score of the class corresponding to the box */ 63 | box_t *box; /*!< Anchor boxes or predicted boxes*/ 64 | landmark_t *landmark; /*!< The landmarks corresponding to the box */ 65 | int len; /*!< The num of the boxes */ 66 | } box_array_t; 67 | 68 | typedef struct tag_image_box 69 | { 70 | struct tag_image_box *next; /*!< Next image_box_t */ 71 | uint8_t category; 72 | fptp_t score; /*!< The confidence score of the class corresponding to the box */ 73 | box_t box; /*!< Anchor boxes or predicted boxes */ 74 | box_t offset; /*!< The predicted anchor-based offset */ 75 | landmark_t landmark; /*!< The landmarks corresponding to the box */ 76 | } image_box_t; 77 | 78 | typedef struct tag_image_list 79 | { 80 | image_box_t *head; /*!< The current head of the image_list */ 81 | image_box_t *origin_head; /*!< The original head of the image_list */ 82 | int len; /*!< Length of the image_list */ 83 | } image_list_t; 84 | 85 | /** 86 | * @brief Get the width and height of the box. 87 | * 88 | * @param box Input box 89 | * @param w Resulting width of the box 90 | * @param h Resulting height of the box 91 | */ 92 | static inline void image_get_width_and_height(box_t *box, float *w, float *h) 93 | { 94 | *w = box->box_p[2] - box->box_p[0] + 1; 95 | *h = box->box_p[3] - box->box_p[1] + 1; 96 | } 97 | 98 | /** 99 | * @brief Get the area of the box. 
100 | * 101 | * @param box Input box 102 | * @param area Resulting area of the box 103 | */ 104 | static inline void image_get_area(box_t *box, float *area) 105 | { 106 | float w, h; 107 | image_get_width_and_height(box, &w, &h); 108 | *area = w * h; 109 | } 110 | 111 | /** 112 | * @brief calibrate the boxes by offset 113 | * 114 | * @param image_list Input boxes 115 | * @param image_height Height of the original image 116 | * @param image_width Width of the original image 117 | */ 118 | static inline void image_calibrate_by_offset(image_list_t *image_list, int image_height, int image_width) 119 | { 120 | for (image_box_t *head = image_list->head; head; head = head->next) 121 | { 122 | float w, h; 123 | image_get_width_and_height(&(head->box), &w, &h); 124 | head->box.box_p[0] = DL_IMAGE_MAX(0, head->box.box_p[0] + head->offset.box_p[0] * w); 125 | head->box.box_p[1] = DL_IMAGE_MAX(0, head->box.box_p[1] + head->offset.box_p[1] * w); 126 | head->box.box_p[2] += head->offset.box_p[2] * w; 127 | if (head->box.box_p[2] > image_width) 128 | { 129 | head->box.box_p[2] = image_width - 1; 130 | head->box.box_p[0] = image_width - w; 131 | } 132 | head->box.box_p[3] += head->offset.box_p[3] * h; 133 | if (head->box.box_p[3] > image_height) 134 | { 135 | head->box.box_p[3] = image_height - 1; 136 | head->box.box_p[1] = image_height - h; 137 | } 138 | } 139 | } 140 | 141 | /** 142 | * @brief calibrate the landmarks 143 | * 144 | * @param image_list Input landmarks 145 | */ 146 | static inline void image_landmark_calibrate(image_list_t *image_list) 147 | { 148 | for (image_box_t *head = image_list->head; head; head = head->next) 149 | { 150 | float w, h; 151 | image_get_width_and_height(&(head->box), &w, &h); 152 | head->landmark.landmark_p[0] = head->box.box_p[0] + head->landmark.landmark_p[0] * w; 153 | head->landmark.landmark_p[1] = head->box.box_p[1] + head->landmark.landmark_p[1] * h; 154 | 155 | head->landmark.landmark_p[2] = head->box.box_p[0] + 
head->landmark.landmark_p[2] * w; 156 | head->landmark.landmark_p[3] = head->box.box_p[1] + head->landmark.landmark_p[3] * h; 157 | 158 | head->landmark.landmark_p[4] = head->box.box_p[0] + head->landmark.landmark_p[4] * w; 159 | head->landmark.landmark_p[5] = head->box.box_p[1] + head->landmark.landmark_p[5] * h; 160 | 161 | head->landmark.landmark_p[6] = head->box.box_p[0] + head->landmark.landmark_p[6] * w; 162 | head->landmark.landmark_p[7] = head->box.box_p[1] + head->landmark.landmark_p[7] * h; 163 | 164 | head->landmark.landmark_p[8] = head->box.box_p[0] + head->landmark.landmark_p[8] * w; 165 | head->landmark.landmark_p[9] = head->box.box_p[1] + head->landmark.landmark_p[9] * h; 166 | } 167 | } 168 | 169 | /** 170 | * @brief Convert a rectangular box into a square box 171 | * 172 | * @param boxes Input box 173 | * @param width Width of the orignal image 174 | * @param height height of the orignal image 175 | */ 176 | static inline void image_rect2sqr(box_array_t *boxes, int width, int height) 177 | { 178 | for (int i = 0; i < boxes->len; i++) 179 | { 180 | box_t *box = &(boxes->box[i]); 181 | 182 | int x1 = round(box->box_p[0]); 183 | int y1 = round(box->box_p[1]); 184 | int x2 = round(box->box_p[2]); 185 | int y2 = round(box->box_p[3]); 186 | 187 | int w = x2 - x1 + 1; 188 | int h = y2 - y1 + 1; 189 | int l = DL_IMAGE_MAX(w, h); 190 | 191 | box->box_p[0] = DL_IMAGE_MAX(round(DL_IMAGE_MAX(0, x1) + 0.5 * (w - l)), 0); 192 | box->box_p[1] = DL_IMAGE_MAX(round(DL_IMAGE_MAX(0, y1) + 0.5 * (h - l)), 0); 193 | 194 | box->box_p[2] = box->box_p[0] + l - 1; 195 | if (box->box_p[2] > width) 196 | { 197 | box->box_p[2] = width - 1; 198 | box->box_p[0] = width - l; 199 | } 200 | box->box_p[3] = box->box_p[1] + l - 1; 201 | if (box->box_p[3] > height) 202 | { 203 | box->box_p[3] = height - 1; 204 | box->box_p[1] = height - l; 205 | } 206 | } 207 | } 208 | 209 | /**@{*/ 210 | /** 211 | * @brief Convert RGB565 image to RGB888 image 212 | * 213 | * @param in Input RGB565 
image 214 | * @param dst Resulting RGB888 image 215 | */ 216 | static inline void rgb565_to_888(uint16_t in, uint8_t *dst) 217 | { /*{{{*/ 218 | in = (in & 0xFF) << 8 | (in & 0xFF00) >> 8; 219 | dst[2] = (in & RGB565_MASK_BLUE) << 3; // blue 220 | dst[1] = (in & RGB565_MASK_GREEN) >> 3; // green 221 | dst[0] = (in & RGB565_MASK_RED) >> 8; // red 222 | 223 | // dst[0] = (in & 0x1F00) >> 5; 224 | // dst[1] = ((in & 0x7) << 5) | ((in & 0xE000) >> 11); 225 | // dst[2] = in & 0xF8; 226 | } /*}}}*/ 227 | 228 | static inline void rgb565_to_888_q16(uint16_t in, int16_t *dst) 229 | { /*{{{*/ 230 | in = (in & 0xFF) << 8 | (in & 0xFF00) >> 8; 231 | dst[2] = (in & RGB565_MASK_BLUE) << 3; // blue 232 | dst[1] = (in & RGB565_MASK_GREEN) >> 3; // green 233 | dst[0] = (in & RGB565_MASK_RED) >> 8; // red 234 | 235 | // dst[0] = (in & 0x1F00) >> 5; 236 | // dst[1] = ((in & 0x7) << 5) | ((in & 0xE000) >> 11); 237 | // dst[2] = in & 0xF8; 238 | } /*}}}*/ 239 | /**@}*/ 240 | 241 | /** 242 | * @brief Convert RGB888 image to RGB565 image 243 | * 244 | * @param in Resulting RGB565 image 245 | * @param r The red channel of the Input RGB888 image 246 | * @param g The green channel of the Input RGB888 image 247 | * @param b The blue channel of the Input RGB888 image 248 | */ 249 | static inline void rgb888_to_565(uint16_t *in, uint8_t r, uint8_t g, uint8_t b) 250 | { /*{{{*/ 251 | uint16_t rgb565 = 0; 252 | rgb565 = ((r >> 3) << 11); 253 | rgb565 |= ((g >> 2) << 5); 254 | rgb565 |= (b >> 3); 255 | rgb565 = (rgb565 & 0xFF) << 8 | (rgb565 & 0xFF00) >> 8; 256 | *in = rgb565; 257 | } /*}}}*/ 258 | 259 | /** 260 | * @brief Filter out the resulting boxes whose confidence score is lower than the threshold and convert the boxes to the actual boxes on the original image.((x, y, w, h) -> (x1, y1, x2, y2)) 261 | * 262 | * @param score Confidence score of the boxes 263 | * @param offset The predicted anchor-based offset 264 | * @param landmark The landmarks corresponding to the box 265 | * @param width 
Width of the original image 266 | * @param height Height of the original image 267 | * @param anchor_number Anchor number of the detection output feature map 268 | * @param anchors_size The anchor size 269 | * @param score_threshold Threshold of the confidence score 270 | * @param stride 271 | * @param resized_height_scale 272 | * @param resized_width_scale 273 | * @param do_regression 274 | * @return image_list_t* 275 | */ 276 | image_list_t *image_get_valid_boxes(fptp_t *score, 277 | fptp_t *offset, 278 | fptp_t *landmark, 279 | int width, 280 | int height, 281 | int anchor_number, 282 | int *anchors_size, 283 | fptp_t score_threshold, 284 | int stride, 285 | fptp_t resized_height_scale, 286 | fptp_t resized_width_scale, 287 | bool do_regression); 288 | /** 289 | * @brief Sort the resulting box lists by their confidence score. 290 | * 291 | * @param image_sorted_list The sorted box list. 292 | * @param insert_list The box list that has not been sorted. 293 | */ 294 | void image_sort_insert_by_score(image_list_t *image_sorted_list, const image_list_t *insert_list); 295 | 296 | /** 297 | * @brief Run the NMS algorithm 298 | * 299 | * @param image_list The input box list 300 | * @param nms_threshold NMS threshold 301 | * @param same_area The flag of boxes with the same area 302 | */ 303 | void image_nms_process(image_list_t *image_list, fptp_t nms_threshold, int same_area); 304 | 305 | /** 306 | * @brief Resize an image to half size 307 | * 308 | * @param dimage The output image 309 | * @param dw Width of the output image 310 | * @param dh Height of the output image 311 | * @param dc Channel of the output image 312 | * @param simage Source image 313 | * @param sw Width of the source image 314 | * @param sc Channel of the source image 315 | */ 316 | void image_zoom_in_twice(uint8_t *dimage, 317 | int dw, 318 | int dh, 319 | int dc, 320 | uint8_t *simage, 321 | int sw, 322 | int sc); 323 | 324 | /** 325 | * @brief Resize the image in RGB888 format via bilinear
interpolation 326 | * 327 | * @param dst_image The output image 328 | * @param src_image Source image 329 | * @param dst_w Width of the output image 330 | * @param dst_h Height of the output image 331 | * @param dst_c Channel of the output image 332 | * @param src_w Width of the source image 333 | * @param src_h Height of the source image 334 | */ 335 | void image_resize_linear(uint8_t *dst_image, uint8_t *src_image, int dst_w, int dst_h, int dst_c, int src_w, int src_h); 336 | 337 | /** 338 | * @brief Crop, rotate and zoom the image in RGB888 format, 339 | * 340 | * @param corp_image The output image 341 | * @param src_image Source image 342 | * @param rotate_angle Rotate angle 343 | * @param ratio scaling ratio 344 | * @param center Center of rotation 345 | */ 346 | void image_cropper(uint8_t *corp_image, uint8_t *src_image, int dst_w, int dst_h, int dst_c, int src_w, int src_h, float rotate_angle, float ratio, float *center); 347 | 348 | /** 349 | * @brief Convert the rgb565 image to the rgb888 image 350 | * 351 | * @param m The output rgb888 image 352 | * @param bmp The input rgb565 image 353 | * @param count Total pixels of the rgb565 image 354 | */ 355 | void image_rgb565_to_888(uint8_t *m, uint16_t *bmp, int count); 356 | 357 | /** 358 | * @brief Convert the rgb888 image to the rgb565 image 359 | * 360 | * @param bmp The output rgb565 image 361 | * @param m The input rgb888 image 362 | * @param count Total pixels of the rgb565 image 363 | */ 364 | void image_rgb888_to_565(uint16_t *bmp, uint8_t *m, int count); 365 | 366 | /** 367 | * @brief draw rectangle on the rgb565 image 368 | * 369 | * @param buf Input image 370 | * @param boxes Rectangle Boxes 371 | * @param width Width of the input image 372 | */ 373 | void draw_rectangle_rgb565(uint16_t *buf, box_array_t *boxes, int width); 374 | 375 | /** 376 | * @brief draw rectangle on the rgb888 image 377 | * 378 | * @param buf Input image 379 | * @param boxes Rectangle Boxes 380 | * @param width Width of the 
input image 381 | */ 382 | void draw_rectangle_rgb888(uint8_t *buf, box_array_t *boxes, int width); 383 | 384 | /** 385 | * @brief Get the pixel difference of two images 386 | * 387 | * @param dst The output pixel difference 388 | * @param src1 Input image 1 389 | * @param src2 Input image 2 390 | * @param count Total pixels of the input image 391 | */ 392 | void image_abs_diff(uint8_t *dst, uint8_t *src1, uint8_t *src2, int count); 393 | 394 | /** 395 | * @brief Binarize an image to 0 and value. 396 | * 397 | * @param dst The output image 398 | * @param src Source image 399 | * @param threshold Threshold of binarization 400 | * @param value The value of binarization 401 | * @param count Total pixels of the input image 402 | * @param mode Threshold mode 403 | */ 404 | void image_threshold(uint8_t *dst, uint8_t *src, int threshold, int value, int count, en_threshold_mode mode); 405 | 406 | /** 407 | * @brief Erode the image 408 | * 409 | * @param dst The output image 410 | * @param src Source image 411 | * @param src_w Width of the source image 412 | * @param src_h Height of the source image 413 | * @param src_c Channel of the source image 414 | */ 415 | void image_erode(uint8_t *dst, uint8_t *src, int src_w, int src_h, int src_c); 416 | 417 | typedef float matrixType; 418 | typedef struct 419 | { 420 | int w; /*!< width */ 421 | int h; /*!< height */ 422 | matrixType **array; /*!< array */ 423 | } Matrix; 424 | 425 | /** 426 | * @brief Allocate a 2d matrix 427 | * 428 | * @param h Height of matrix 429 | * @param w Width of matrix 430 | * @return Matrix* 2d matrix 431 | */ 432 | Matrix *matrix_alloc(int h, int w); 433 | 434 | /** 435 | * @brief Free a 2d matrix 436 | * 437 | * @param m 2d matrix 438 | */ 439 | void matrix_free(Matrix *m); 440 | 441 | /** 442 | * @brief Get the similarity matrix of similarity transformation 443 | * 444 | * @param srcx Source x coordinates 445 | * @param srcy Source y coordinates 446 | * @param dstx Destination x coordinates 447 | * 
@param dsty Destination y coordinates 448 | * @param num The number of the coordinates 449 | * @return Matrix* The resulting transformation matrix 450 | */ 451 | Matrix *get_similarity_matrix(float *srcx, float *srcy, float *dstx, float *dsty, int num); 452 | 453 | /** 454 | * @brief Get the affine transformation matrix 455 | * 456 | * @param srcx Source x coordinates 457 | * @param srcy Source y coordinates 458 | * @param dstx Destination x coordinates 459 | * @param dsty Destination y coordinates 460 | * @return Matrix* The resulting transformation matrix 461 | */ 462 | Matrix *get_affine_transform(float *srcx, float *srcy, float *dstx, float *dsty); 463 | 464 | /** 465 | * @brief Applies an affine transformation to an image 466 | * 467 | * @param img Input image 468 | * @param crop Dst output image that has the size dsize and the same type as src 469 | * @param M Affine transformation matrix 470 | */ 471 | void warp_affine(dl_matrix3du_t *img, dl_matrix3du_t *crop, Matrix *M); 472 | 473 | /** 474 | * @brief Resize the image in RGB888 format via bilinear interpolation, and quantify the output image 475 | * 476 | * @param dst_image Quantized output image 477 | * @param src_image Input image 478 | * @param dst_w Width of the output image 479 | * @param dst_h Height of the output image 480 | * @param dst_c Channel of the output image 481 | * @param src_w Width of the input image 482 | * @param src_h Height of the input image 483 | * @param shift Shift parameter of quantization. 484 | */ 485 | void image_resize_linear_q(qtp_t *dst_image, uint8_t *src_image, int dst_w, int dst_h, int dst_c, int src_w, int src_h, int shift); 486 | 487 | /** 488 | * @brief Preprocess the input image of object detection model. The process is like this: resize -> normalize -> quantify 489 | * 490 | * @param image Input image, RGB888 format. 491 | * @param input_w Width of the input image. 492 | * @param input_h Height of the input image. 
493 | * @param target_size Target size of the model input image. 494 | * @param exponent Exponent of the quantized model input image. 495 | * @param process_mode Process mode. 0: resize with padding to keep height == width. 1: resize without padding, height != width. 496 | * @return dl_matrix3dq_t* The resulting preprocessed image. 497 | */ 498 | dl_matrix3dq_t *image_resize_normalize_quantize(uint8_t *image, int input_w, int input_h, int target_size, int exponent, int process_mode); 499 | 500 | /** 501 | * @brief Resize the image in RGB565 format via mean neighbour interpolation, and quantify the output image 502 | * 503 | * @param dimage Quantized output image. 504 | * @param simage Input image. 505 | * @param dw Width of the allocated output image memory. 506 | * @param dc Channel of the allocated output image memory. 507 | * @param sw Width of the input image. 508 | * @param sh Height of the input image. 509 | * @param tw Target width of the output image. 510 | * @param th Target height of the output image. 511 | * @param shift Shift parameter of quantization. 512 | */ 513 | void image_resize_shift_fast(qtp_t *dimage, uint16_t *simage, int dw, int dc, int sw, int sh, int tw, int th, int shift); 514 | 515 | /** 516 | * @brief Resize the image in RGB565 format via nearest neighbour interpolation, and quantify the output image 517 | * 518 | * @param dimage Quantized output image. 519 | * @param simage Input image. 520 | * @param dw Width of the allocated output image memory. 521 | * @param dc Channel of the allocated output image memory. 522 | * @param sw Width of the input image. 523 | * @param sh Height of the input image. 524 | * @param tw Target width of the output image. 525 | * @param th Target height of the output image. 526 | * @param shift Shift parameter of quantization. 
527 | */ 528 | void image_resize_nearest_shift(qtp_t *dimage, uint16_t *simage, int dw, int dc, int sw, int sh, int tw, int th, int shift); 529 | 530 | /** 531 | * @brief Crop the image in RGB565 format and resize it to the target size, then quantize the output image 532 | * 533 | * @param dimage Quantized output image. 534 | * @param simage Input image. 535 | * @param dw Target size of the output image. 536 | * @param sw Width of the input image. 537 | * @param sh Height of the input image. 538 | * @param x1 The x coordinate of the upper left corner of the cropped area 539 | * @param y1 The y coordinate of the upper left corner of the cropped area 540 | * @param x2 The x coordinate of the lower right corner of the cropped area 541 | * @param y2 The y coordinate of the lower right corner of the cropped area 542 | * @param shift Shift parameter of quantization. 543 | */ 544 | void image_crop_shift_fast(qtp_t *dimage, uint16_t *simage, int dw, int sw, int sh, int x1, int y1, int x2, int y2, int shift); 545 | 546 | #ifdef __cplusplus 547 | } 548 | #endif 549 | -------------------------------------------------------------------------------- /edge-impulse-esp32-cam/mtmn.h: -------------------------------------------------------------------------------- 1 | /* 2 | * ESPRESSIF MIT License 3 | * 4 | * Copyright (c) 2018 5 | * 6 | * Permission is hereby granted for use on ESPRESSIF SYSTEMS products only, in which case, 7 | * it is free of charge, to any person obtaining a copy of this software and associated 8 | * documentation files (the "Software"), to deal in the Software without restriction, including 9 | * without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 10 | * and/or sell copies of the Software, and to permit persons to whom the Software is furnished 11 | * to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all copies or 14 | * substantial
portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 18 | * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 19 | * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 20 | * IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 21 | * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 22 | * 23 | */ 24 | #pragma once 25 | 26 | #ifdef __cplusplus 27 | extern "C" 28 | { 29 | #endif 30 | #include "dl_lib_matrix3d.h" 31 | #include "dl_lib_matrix3dq.h" 32 | 33 | /** 34 | * Detection results with MTMN. 35 | * 36 | */ 37 | typedef struct 38 | { 39 | dl_matrix3d_t *category; /*!< Classification result after softmax, channel is 2 */ 40 | dl_matrix3d_t *offset; /*!< Bounding box offset of 2 points: top-left and bottom-right, channel is 4 */ 41 | dl_matrix3d_t *landmark; /*!< Offsets of 5 landmarks: 42 | * - Left eye 43 | * - Mouth left side 44 | * - Nose 45 | * - Right eye 46 | * - Mouth right side 47 | * 48 | * channel is 10 49 | * */ 50 | } mtmn_net_t; 51 | 52 | 53 | /** 54 | * @brief Free a mtmn_net_t 55 | * 56 | * @param p A mtmn_net_t pointer 57 | * 58 | */ 59 | 60 | void mtmn_net_t_free(mtmn_net_t *p); 61 | 62 | /** 63 | * @brief Run the pnet forward pass for coarse detection. Calculated in floating point. 64 | * 65 | * @param in Image matrix, rgb888 format, size is 320x240 66 | * @return Scores for every pixel, and the corresponding box offsets. 67 | */ 68 | mtmn_net_t *pnet_lite_f(dl_matrix3du_t *in); 69 | 70 | /** 71 | * @brief Run the rnet forward pass to refine the boxes from pnet. Calculated in floating point. 72 | * 73 | * @param in Image matrix, rgb888 format 74 | * @param threshold Score threshold to detect a human face 75 | * @return Scores for every box, and the corresponding box offsets.
76 | */ 77 | mtmn_net_t *rnet_lite_f_with_score_verify(dl_matrix3du_t *in, float threshold); 78 | 79 | /** 80 | * @brief Run the onet forward pass to refine the boxes from rnet. Calculated in floating point. 81 | * 82 | * @param in Image matrix, rgb888 format 83 | * @param threshold Score threshold to detect a human face 84 | * @return Scores for every box, the corresponding box offsets, and landmarks. 85 | */ 86 | mtmn_net_t *onet_lite_f_with_score_verify(dl_matrix3du_t *in, float threshold); 87 | 88 | /** 89 | * @brief Run the pnet forward pass for coarse detection. Calculated in quantized arithmetic. 90 | * 91 | * @param in Image matrix, rgb888 format, size is 320x240 92 | * @return Scores for every pixel, and the corresponding box offsets. 93 | */ 94 | mtmn_net_t *pnet_lite_q(dl_matrix3du_t *in, dl_conv_mode mode); 95 | 96 | /** 97 | * @brief Run the rnet forward pass to refine the boxes from pnet. Calculated in quantized arithmetic. 98 | * 99 | * @param in Image matrix, rgb888 format 100 | * @param threshold Score threshold to detect a human face 101 | * @return Scores for every box, and the corresponding box offsets. 102 | */ 103 | mtmn_net_t *rnet_lite_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 104 | 105 | /** 106 | * @brief Run the onet forward pass to refine the boxes from rnet. Calculated in quantized arithmetic. 107 | * 108 | * @param in Image matrix, rgb888 format 109 | * @param threshold Score threshold to detect a human face 110 | * @return Scores for every box, the corresponding box offsets, and landmarks. 111 | */ 112 | mtmn_net_t *onet_lite_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 113 | 114 | /** 115 | * @brief Run the pnet forward pass for coarse detection. Calculated in quantized arithmetic. 116 | * 117 | * @param in Image matrix, rgb888 format, size is 320x240 118 | * @return Scores for every pixel, and the corresponding box offsets.
119 | */ 120 | mtmn_net_t *pnet_heavy_q(dl_matrix3du_t *in, dl_conv_mode mode); 121 | 122 | /** 123 | * @brief Run the rnet forward pass to refine the boxes from pnet. Calculated in quantized arithmetic. 124 | * 125 | * @param in Image matrix, rgb888 format 126 | * @param threshold Score threshold to detect a human face 127 | * @return Scores for every box, and the corresponding box offsets. 128 | */ 129 | mtmn_net_t *rnet_heavy_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 130 | 131 | /** 132 | * @brief Run the onet forward pass to refine the boxes from rnet. Calculated in quantized arithmetic. 133 | * 134 | * @param in Image matrix, rgb888 format 135 | * @param threshold Score threshold to detect a human face 136 | * @return Scores for every box, the corresponding box offsets, and landmarks. 137 | */ 138 | mtmn_net_t *onet_heavy_q_with_score_verify(dl_matrix3du_t *in, float threshold, dl_conv_mode mode); 139 | 140 | #ifdef __cplusplus 141 | } 142 | #endif 143 | -------------------------------------------------------------------------------- /ei-esp32-cam-cat-dog-arduino-1.0.4.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/alankrantas/edge-impulse-esp32-cam-image-classification/bea4de2a83737349598063bc4ded2949cdcc5b25/ei-esp32-cam-cat-dog-arduino-1.0.4.zip -------------------------------------------------------------------------------- /esp32-cam-edge-impulse.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/alankrantas/edge-impulse-esp32-cam-image-classification/bea4de2a83737349598063bc4ded2949cdcc5b25/esp32-cam-edge-impulse.png --------------------------------------------------------------------------------