├── .gitignore
├── images
│   ├── zadig-1.png
│   └── zadig-2.png
├── .gitmodules
├── linux.md
├── windows.md
├── inference-app
│   ├── src
│   │   ├── dsp_pipeline.h
│   │   ├── ml_model.h
│   │   ├── dsp_pipeline.cpp
│   │   ├── ml_model.cpp
│   │   ├── main.cpp
│   │   └── tflite_model.h
│   ├── CMakeLists.txt
│   └── pico_sdk_import.cmake
├── .github
│   └── stale.yml
├── README.md
├── colab_utils
│   ├── serial_monitor.py
│   ├── pico.py
│   └── audio.py
├── LICENSE
└── ml_audio_classifier_example_for_pico.ipynb
/.gitignore: -------------------------------------------------------------------------------- 1 | inference-app/build/ 2 | 3 | --------------------------------------------------------------------------------
/images/zadig-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/HEAD/images/zadig-1.png --------------------------------------------------------------------------------
/images/zadig-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/HEAD/images/zadig-2.png --------------------------------------------------------------------------------
/.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "inference-app/lib/microphone-library-for-pico"] 2 | path = inference-app/lib/microphone-library-for-pico 3 | url = https://github.com/ArmDeveloperEcosystem/microphone-library-for-pico.git 4 | [submodule "inference-app/lib/pico-tflmicro"] 5 | path = inference-app/lib/pico-tflmicro 6 | url = https://github.com/raspberrypi/pico-tflmicro.git 7 | [submodule "inference-app/lib/CMSIS_5"] 8 | path = inference-app/lib/CMSIS_5 9 | url = https://github.com/ARM-software/CMSIS_5.git 10 | shallow = true --------------------------------------------------------------------------------
/linux.md: -------------------------------------------------------------------------------- 1 | # Linux Specific Instructions 2 | 3 | ## udev 4 | 5 | In order to flash the board using WebUSB on Linux, udev rules need to be configured. Without them, flashing fails with an "Access denied" error. 6 | 7 | 1. As root (or with sudo), create a file named `/etc/udev/rules.d/60-rp2040.rules` containing the following: 8 | ``` 9 | SUBSYSTEMS=="usb", ATTRS{idVendor}=="2e8a", MODE:="0666" 10 | ``` 11 | 2. Run `sudo udevadm control --reload-rules` 12 | 3. Run `sudo udevadm trigger` 13 | 4. If the board was plugged in previously, unplug it and plug it back in. --------------------------------------------------------------------------------
/windows.md: -------------------------------------------------------------------------------- 1 | # Windows Specific Instructions 2 | 3 | ## WinUSB driver 4 | 5 | In order to flash the board using WebUSB on Windows, the WinUSB driver must be installed. 6 | 7 | 1. Download [Zadig](https://github.com/pbatard/libwdi/releases/download/b755/zadig-2.6.exe) 8 | 1. Put the board into USB boot ROM mode by holding down the BOOT or BOOTSEL button while plugging in the USB cable. 9 | 1. Open Zadig 10 | 1. In the drop-down, ensure "RP2 Boot (Interface 1)" is selected: 11 | ![Zadig step 1](images/zadig-1.png) 12 | 1. Click the "Install Driver" button. 13 | 1. The
WinUSB driver is installed 14 | ![Zadig step 2](images/zadig-2.png) 15 | -------------------------------------------------------------------------------- /inference-app/src/dsp_pipeline.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 3 | * 4 | * SPDX-License-Identifier: Apache-2.0 5 | * 6 | */ 7 | 8 | #ifndef _DSP_PIPELINE_H_ 9 | #define _DSP_PIPELINE_H_ 10 | 11 | #include "arm_math.h" 12 | 13 | class DSPPipeline { 14 | public: 15 | DSPPipeline(int fft_size); 16 | virtual ~DSPPipeline(); 17 | 18 | int init(); 19 | void calculate_spectrum(const int16_t* input, int8_t* output, int32_t scale_divider, float scale_zero_point); 20 | void shift_spectrogram(int8_t* spectrogram, int shift_amount, int spectrogram_width); 21 | 22 | private: 23 | int _fft_size; 24 | int16_t* _hanning_window; 25 | arm_rfft_instance_q15 _S_q15; 26 | }; 27 | 28 | #endif 29 | -------------------------------------------------------------------------------- /.github/stale.yml: -------------------------------------------------------------------------------- 1 | # Number of days of inactivity before an issue becomes stale 2 | daysUntilStale: 60 3 | # Number of days of inactivity before a stale issue is closed 4 | daysUntilClose: 7 5 | # Issues with these labels will never be considered stale 6 | exemptLabels: 7 | - pinned 8 | - security 9 | - help wanted 10 | # Label to use when marking an issue as stale 11 | staleLabel: stale 12 | # Comment to post when marking an issue as stale. Set to `false` to disable 13 | markComment: > 14 | This issue has been automatically marked as stale because it has not had 15 | recent activity. It will be closed if no further activity occurs. Thank you 16 | for your contributions. 17 | # Comment to post when closing a stale issue. Set to `false` to disable 18 | closeComment: false 19 | # Never stale pull requests 20 | only: issues -------------------------------------------------------------------------------- /inference-app/src/ml_model.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 
3 | * 4 | * SPDX-License-Identifier: Apache-2.0 5 | * 6 | */ 7 | 8 | #ifndef _ML_MODEL_H_ 9 | #define _ML_MODEL_H_ 10 | 11 | #include "tensorflow/lite/micro/all_ops_resolver.h" 12 | #include "tensorflow/lite/micro/micro_error_reporter.h" 13 | #include "tensorflow/lite/micro/micro_interpreter.h" 14 | 15 | class MLModel { 16 | public: 17 | MLModel(const unsigned char tflite_model[], int tensor_arena_size); 18 | virtual ~MLModel(); 19 | 20 | int init(); 21 | void* input_data(); 22 | float predict(); 23 | 24 | float input_scale() const; 25 | int32_t input_zero_point() const; 26 | private: 27 | const unsigned char* _tflite_model; 28 | int _tensor_arena_size; 29 | 30 | uint8_t* _tensor_arena; 31 | tflite::MicroErrorReporter _error_reporter; 32 | const tflite::Model* _model; 33 | tflite::MicroInterpreter* _interpreter; 34 | TfLiteTensor* _input_tensor; 35 | TfLiteTensor* _output_tensor; 36 | 37 | tflite::AllOpsResolver _opsResolver; 38 | }; 39 | 40 | #endif 41 | --------------------------------------------------------------------------------
/README.md: -------------------------------------------------------------------------------- 1 | # ML Audio Classifier Example for Pico 2 | 3 | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/blob/main/ml_audio_classifier_example_for_pico.ipynb) 4 | 5 | An end-to-end flow for ML audio classification: collecting training data, transfer learning and re-training, and deploying the model to a [Raspberry Pi RP2040](https://www.raspberrypi.org/products/rp2040/) based device, all from Google Colab. 6 | 7 | For an overview of the flow, see the ["End-to-end tinyML audio classification with the Raspberry Pi RP2040" guest blog post on the TensorFlow blog.](https://blog.tensorflow.org/2021/09/TinyML-Audio-for-everyone.html) 8 | 9 | ## Supported Hardware 10 | 11 | One of the following: 12 | 13 | * [SparkFun MicroMod RP2040 Processor](https://www.sparkfun.com/products/17720) + [SparkFun MicroMod Machine Learning Carrier Board](https://www.sparkfun.com/products/16400) 14 | * [Raspberry Pi Pico](https://www.raspberrypi.org/products/raspberry-pi-pico/) + [Adafruit PDM MEMS Microphone Breakout](https://www.adafruit.com/product/3492) + breadboard 15 | 16 | ## License 17 | 18 | [Apache-2.0 License](LICENSE) 19 | 20 | --- 21 | 22 | Disclaimer: This is not an official Arm product.
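For orientation, the feature front end on the device works like this: a sliding 256-sample window of 16 kHz audio is Hanning-windowed, run through a real FFT, and the 129 magnitude bins are quantized into int8 columns of a 124-column spectrogram that feeds the CNN. A minimal NumPy sketch of one spectrum column, using the constants from `inference-app/src` (a rough equivalent for intuition only, not bit-exact with the fixed-point q15 firmware path):

```python
# Rough NumPy equivalent of DSPPipeline::calculate_spectrum() -- the firmware
# itself uses CMSIS-DSP q15 fixed-point routines, so exact values differ.
import numpy as np

FFT_SIZE = 256  # matches FFT_SIZE in inference-app/src/main.cpp

def spectrum_column(frame, input_scale, input_zero_point):
    """frame: FFT_SIZE int16 samples -> FFT_SIZE/2 + 1 int8 spectrogram bins."""
    n = np.arange(FFT_SIZE)
    window = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / FFT_SIZE))  # Hanning, as in init()
    spectrum = np.abs(np.fft.rfft(frame * window))             # 129 magnitude bins
    q = spectrum / (64 * input_scale) + input_zero_point       # divider = 64 * model input scale
    return np.clip(np.round(q), -128, 127).astype(np.int8)    # saturate like __SSAT(..., 8)
```

Here `input_scale` and `input_zero_point` are the quantization parameters of the model's input tensor (see `MLModel::input_scale()` / `MLModel::input_zero_point()` below).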
23 | -------------------------------------------------------------------------------- /inference-app/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | cmake_minimum_required(VERSION 3.12) 2 | 3 | # initialize pico_sdk from GIT 4 | # (note this can come from environment, CMake cache etc) 5 | # set(PICO_SDK_FETCH_FROM_GIT on) 6 | 7 | # pico_sdk_import.cmake is a single file copied from this SDK 8 | # note: this must happen before project() 9 | include(pico_sdk_import.cmake) 10 | 11 | project(pico_inference_app) 12 | 13 | # initialize the Pico SDK 14 | pico_sdk_init() 15 | 16 | # Define ARM_CPU, CMSIS ROOT and DSP to use CMSIS-DSP 17 | set(ARM_CPU "cortex-m0plus") 18 | set(ROOT ${CMAKE_CURRENT_LIST_DIR}/lib/CMSIS_5) 19 | set(DSP ${ROOT}/CMSIS/DSP) 20 | 21 | set(CONFIGTABLE ON) 22 | set(RFFT_Q15_256 ON) 23 | set(ALLFAST ON) 24 | 25 | # include CMSIS-DSP .cmake for GCC Toolchain 26 | include(${DSP}/Toolchain/GCC.cmake) 27 | 28 | # add CMSIS-DSP Source directory as subdirectory 29 | add_subdirectory(${DSP}/Source EXCLUDE_FROM_ALL) 30 | 31 | # rest of your project 32 | add_executable(pico_inference_app 33 | ${CMAKE_CURRENT_LIST_DIR}/src/main.cpp 34 | ${CMAKE_CURRENT_LIST_DIR}/src/dsp_pipeline.cpp 35 | ${CMAKE_CURRENT_LIST_DIR}/src/ml_model.cpp 36 | ) 37 | 38 | target_link_libraries(pico_inference_app 39 | pico_stdlib 40 | hardware_pwm 41 | pico-tflmicro 42 | pico_pdm_microphone 43 | CMSISDSPTransform CMSISDSPSupport CMSISDSPCommon CMSISDSPComplexMath CMSISDSPFastMath CMSISDSPBasicMath 44 | ) 45 | 46 | # enable usb output, disable uart output 47 | pico_enable_stdio_usb(pico_inference_app 1) 48 | pico_enable_stdio_uart(pico_inference_app 0) 49 | 50 | # create map/bin/hex/uf2 file in addition to ELF. 51 | pico_add_extra_outputs(pico_inference_app) 52 | 53 | add_subdirectory("lib/microphone-library-for-pico" EXCLUDE_FROM_ALL) 54 | add_subdirectory("lib/pico-tflmicro" EXCLUDE_FROM_ALL) 55 | -------------------------------------------------------------------------------- /inference-app/src/dsp_pipeline.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 
3 | * 4 | * SPDX-License-Identifier: Apache-2.0 5 | * 6 | */ 7 | 8 | #include "dsp_pipeline.h" 9 | 10 | DSPPipeline::DSPPipeline(int fft_size) : 11 | _fft_size(fft_size), 12 | _hanning_window(NULL) 13 | { 14 | } 15 | 16 | DSPPipeline::~DSPPipeline() 17 | { 18 | if (_hanning_window != NULL) { 19 | delete [] _hanning_window; 20 | 21 | _hanning_window = NULL; 22 | } 23 | } 24 | 25 | int DSPPipeline::init() 26 | { 27 | _hanning_window = new int16_t[_fft_size]; 28 | if (_hanning_window == NULL) { 29 | return 0; 30 | } 31 | 32 | for (size_t i = 0; i < _fft_size; i++) { 33 | float32_t f = 0.5 * (1.0 - arm_cos_f32(2 * PI * i / _fft_size)); 34 | 35 | arm_float_to_q15(&f, &_hanning_window[i], 1); 36 | } 37 | 38 | if (arm_rfft_init_q15(&_S_q15, _fft_size, 0, 1) != ARM_MATH_SUCCESS) { 39 | return 0; 40 | } 41 | 42 | return 1; 43 | } 44 | 45 | void DSPPipeline::calculate_spectrum(const int16_t* input, int8_t* output, int32_t scale_divider, float scale_zero_point) 46 | { 47 | int16_t windowed_input[_fft_size]; 48 | int16_t fft_q15[_fft_size * 2]; 49 | int16_t fft_mag_q15[_fft_size / 2 + 1]; 50 | 51 | // apply the DSP pipeline: Hanning Window + FFT 52 | arm_mult_q15(_hanning_window, input, windowed_input, _fft_size); 53 | arm_rfft_q15(&_S_q15, windowed_input, fft_q15); 54 | arm_cmplx_mag_q15(fft_q15, fft_mag_q15, (_fft_size / 2) + 1); 55 | 56 | int8_t* dst = output; 57 | 58 | for (int j = 0; j < ((_fft_size / 2) + 1); j++) { 59 | *dst++ = __SSAT((fft_mag_q15[j] / scale_divider) + scale_zero_point, 8); 60 | } 61 | } 62 | 63 | void DSPPipeline::shift_spectrogram(int8_t* spectrogram, int shift_amount, int spectrogram_width) 64 | { 65 | int spectrogram_height = _fft_size / 2 + 1; 66 | 67 | memmove(spectrogram, spectrogram + (spectrogram_height * shift_amount), spectrogram_height * (spectrogram_width - shift_amount) * sizeof(spectrogram[0])); 68 | } 69 | -------------------------------------------------------------------------------- /inference-app/pico_sdk_import.cmake: -------------------------------------------------------------------------------- 1 | # This is a copy of /external/pico_sdk_import.cmake 2 | 3 | # This can be dropped into an external project to help locate this SDK 4 | # It should be include()ed prior to project() 5 | 6 | if (DEFINED ENV{PICO_SDK_PATH} AND (NOT PICO_SDK_PATH)) 7 | set(PICO_SDK_PATH $ENV{PICO_SDK_PATH}) 8 | message("Using PICO_SDK_PATH from environment ('${PICO_SDK_PATH}')") 9 | endif () 10 | 11 | if (DEFINED ENV{PICO_SDK_FETCH_FROM_GIT} AND (NOT PICO_SDK_FETCH_FROM_GIT)) 12 | set(PICO_SDK_FETCH_FROM_GIT $ENV{PICO_SDK_FETCH_FROM_GIT}) 13 | message("Using PICO_SDK_FETCH_FROM_GIT from environment ('${PICO_SDK_FETCH_FROM_GIT}')") 14 | endif () 15 | 16 | if (DEFINED ENV{PICO_SDK_FETCH_FROM_GIT_PATH} AND (NOT PICO_SDK_FETCH_FROM_GIT_PATH)) 17 | set(PICO_SDK_FETCH_FROM_GIT_PATH $ENV{PICO_SDK_FETCH_FROM_GIT_PATH}) 18 | message("Using PICO_SDK_FETCH_FROM_GIT_PATH from environment ('${PICO_SDK_FETCH_FROM_GIT_PATH}')") 19 | endif () 20 | 21 | set(PICO_SDK_PATH "${PICO_SDK_PATH}" CACHE PATH "Path to the Raspberry Pi Pico SDK") 22 | set(PICO_SDK_FETCH_FROM_GIT "${PICO_SDK_FETCH_FROM_GIT}" CACHE BOOL "Set to ON to fetch copy of SDK from git if not otherwise locatable") 23 | set(PICO_SDK_FETCH_FROM_GIT_PATH "${PICO_SDK_FETCH_FROM_GIT_PATH}" CACHE FILEPATH "location to download SDK") 24 | 25 | if (NOT PICO_SDK_PATH) 26 | if (PICO_SDK_FETCH_FROM_GIT) 27 | include(FetchContent) 28 | set(FETCHCONTENT_BASE_DIR_SAVE ${FETCHCONTENT_BASE_DIR}) 29 | if (PICO_SDK_FETCH_FROM_GIT_PATH) 30 | 
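# resolve the user-specified download path against the source dir and
# point FetchContent's base directory at it (restored to the default below)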
get_filename_component(FETCHCONTENT_BASE_DIR "${PICO_SDK_FETCH_FROM_GIT_PATH}" REALPATH BASE_DIR "${CMAKE_SOURCE_DIR}") 31 | endif () 32 | FetchContent_Declare( 33 | pico_sdk 34 | GIT_REPOSITORY https://github.com/raspberrypi/pico-sdk 35 | GIT_TAG master 36 | ) 37 | if (NOT pico_sdk) 38 | message("Downloading Raspberry Pi Pico SDK") 39 | FetchContent_Populate(pico_sdk) 40 | set(PICO_SDK_PATH ${pico_sdk_SOURCE_DIR}) 41 | endif () 42 | set(FETCHCONTENT_BASE_DIR ${FETCHCONTENT_BASE_DIR_SAVE}) 43 | else () 44 | message(FATAL_ERROR 45 | "SDK location was not specified. Please set PICO_SDK_PATH or set PICO_SDK_FETCH_FROM_GIT to on to fetch from git." 46 | ) 47 | endif () 48 | endif () 49 | 50 | get_filename_component(PICO_SDK_PATH "${PICO_SDK_PATH}" REALPATH BASE_DIR "${CMAKE_BINARY_DIR}") 51 | if (NOT EXISTS ${PICO_SDK_PATH}) 52 | message(FATAL_ERROR "Directory '${PICO_SDK_PATH}' not found") 53 | endif () 54 | 55 | set(PICO_SDK_INIT_CMAKE_FILE ${PICO_SDK_PATH}/pico_sdk_init.cmake) 56 | if (NOT EXISTS ${PICO_SDK_INIT_CMAKE_FILE}) 57 | message(FATAL_ERROR "Directory '${PICO_SDK_PATH}' does not appear to contain the Raspberry Pi Pico SDK") 58 | endif () 59 | 60 | set(PICO_SDK_PATH ${PICO_SDK_PATH} CACHE PATH "Path to the Raspberry Pi Pico SDK" FORCE) 61 | 62 | include(${PICO_SDK_INIT_CMAKE_FILE}) 63 | -------------------------------------------------------------------------------- /colab_utils/serial_monitor.py: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 3 | # 4 | # SPDX-License-Identifier: Apache-2.0 5 | # 6 | 7 | from IPython import display 8 | 9 | def run_serial_monitor(): 10 | display.display(display.Javascript(''' 11 | if ('serial' in navigator) { 12 | const scriptElement = document.createElement("script"); 13 | scriptElement.src = "https://cdnjs.cloudflare.com/ajax/libs/xterm/3.14.5/xterm.min.js"; 14 | document.body.appendChild(scriptElement); 15 | 16 | const linkElement = document.createElement("link"); 17 | linkElement.rel = "stylesheet" 18 | linkElement.href = "https://cdnjs.cloudflare.com/ajax/libs/xterm/3.14.5/xterm.min.css"; 19 | document.body.appendChild(linkElement); 20 | 21 | const connectDisconnectButton = document.createElement("button"); 22 | 23 | connectDisconnectButton.innerHTML = "Connect Port"; 24 | 25 | document.querySelector("#output-area").appendChild(connectDisconnectButton); 26 | 27 | terminalDiv = document.createElement("div"); 28 | terminalDiv.style = "margin: 5px"; 29 | 30 | document.querySelector("#output-area").appendChild(terminalDiv); 31 | 32 | let port = undefined; 33 | let reader = undefined; 34 | let keepReading = true; 35 | let term = undefined; 36 | 37 | connectDisconnectButton.onclick = async () => { 38 | if (port !== undefined) { 39 | if (reader !== undefined) { 40 | keepReading = false; 41 | try { 42 | await reader.cancel(); 43 | } catch (e) {} 44 | } 45 | port = undefined; 46 | reader = undefined; 47 | 48 | connectDisconnectButton.innerHTML = "Connect Port"; 49 | 50 | return; 51 | } 52 | 53 | port = await navigator.serial.requestPort(); 54 | keepReading = true; 55 | 56 | connectDisconnectButton.innerHTML = "Disconnect Port"; 57 | 58 | await port.open({ baudRate: 115200 }); 59 | 60 | if (term === undefined) { 61 | term = new Terminal(); 62 | term.open(terminalDiv); 63 | } 64 | term.clear(); 65 | 66 | const decoder = new TextDecoder(); 67 | 68 | while (port && keepReading) { 69 | try { 70 | reader = port.readable.getReader(); 71 | 72 | 
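// pump the serial stream: each read() resolves with { value, done }, where
// value is a Uint8Array chunk; done turns true once the reader is cancelled
// (via the Disconnect button), letting the loop exit and the port close cleanly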
while (true) { 73 | const { value, done } = await reader.read(); 74 | if (done) { 75 | keepReading = false; 76 | break; 77 | } 78 | 79 | term.write(decoder.decode(value, { stream: true })); 80 | } 81 | } catch (error) { 82 | keepReading = false; 83 | } finally { 84 | await reader.releaseLock(); 85 | } 86 | } 87 | 88 | await port.close(); 89 | 90 | port = undefined; 91 | reader = undefined; 92 | 93 | connectDisconnectButton.innerHTML = "Connect Port"; 94 | }; 95 | } else { 96 | document.querySelector("#output-area").appendChild(document.createTextNode( 97 | "Oh no! Your browser does not support Web Serial!" 98 | )); 99 | } 100 | ''')) 101 | --------------------------------------------------------------------------------
/inference-app/src/ml_model.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 3 | * 4 | * SPDX-License-Identifier: Apache-2.0 5 | * 6 | */ 7 | 8 | #include "tensorflow/lite/schema/schema_generated.h" 9 | #include "tensorflow/lite/version.h" 10 | 11 | #include "ml_model.h" 12 | 13 | MLModel::MLModel(const unsigned char tflite_model[], int tensor_arena_size) : 14 | _tflite_model(tflite_model), 15 | _tensor_arena_size(tensor_arena_size), 16 | _tensor_arena(NULL), 17 | _model(NULL), 18 | _interpreter(NULL), 19 | _input_tensor(NULL), 20 | _output_tensor(NULL) 21 | { 22 | } 23 | 24 | MLModel::~MLModel() 25 | { 26 | if (_interpreter != NULL) { 27 | delete _interpreter; 28 | _interpreter = NULL; 29 | } 30 | 31 | if (_tensor_arena != NULL) { 32 | delete [] _tensor_arena; 33 | _tensor_arena = NULL; 34 | } 35 | } 36 | 37 | int MLModel::init() 38 | { 39 | _model = tflite::GetModel(_tflite_model); 40 | if (_model->version() != TFLITE_SCHEMA_VERSION) { 41 | TF_LITE_REPORT_ERROR(&_error_reporter, 42 | "Model provided is schema version %d not equal " 43 | "to supported version %d.", 44 | _model->version(), TFLITE_SCHEMA_VERSION); 45 | 46 | return 0; 47 | } 48 | 49 | _tensor_arena = new uint8_t[_tensor_arena_size]; 50 | if (_tensor_arena == NULL) { 51 | TF_LITE_REPORT_ERROR(&_error_reporter, 52 | "Failed to allocate tensor arena of size %d", 53 | _tensor_arena_size); 54 | return 0; 55 | } 56 | 57 | _interpreter = new tflite::MicroInterpreter( 58 | _model, _opsResolver, 59 | _tensor_arena, _tensor_arena_size, 60 | &_error_reporter 61 | ); 62 | if (_interpreter == NULL) { 63 | TF_LITE_REPORT_ERROR(&_error_reporter, 64 | "Failed to allocate interpreter"); 65 | return 0; 66 | } 67 | 68 | TfLiteStatus allocate_status = _interpreter->AllocateTensors(); 69 | if (allocate_status != kTfLiteOk) { 70 | TF_LITE_REPORT_ERROR(&_error_reporter, "AllocateTensors() failed"); 71 | return 0; 72 | } 73 | 74 | _input_tensor = _interpreter->input(0); 75 | _output_tensor = _interpreter->output(0); 76 | 77 | return 1; 78 | } 79 | 80 | void* MLModel::input_data() 81 | { 82 | if (_input_tensor == NULL) { 83 | return NULL; 84 | } 85 | 86 | return _input_tensor->data.data; 87 | } 88 | 89 | float MLModel::predict() 90 | { 91 | TfLiteStatus invoke_status = _interpreter->Invoke(); 92 | 93 | if (invoke_status != kTfLiteOk) { 94 | return NAN; 95 | } 96 | 97 | float y_quantized = _output_tensor->data.int8[0]; 98 | float y = (y_quantized - _output_tensor->params.zero_point) * _output_tensor->params.scale; 99 | 100 | return y; 101 | } 102 | 103 | float MLModel::input_scale() const 104 | { 105 | if (_input_tensor == NULL) { 106 | return NAN; 107 | } 108 | 109 | return _input_tensor->params.scale; 110 | } 111 |
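// Note: with a TensorFlow Lite int8-quantized model, real values map to
// quantized ones as real = input_scale() * (q - input_zero_point()); callers
// quantize features into the input tensor with these two parameters,
// mirroring the dequantization predict() applies to the output tensor.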
112 | int32_t MLModel::input_zero_point() const 113 | { 114 | if (_input_tensor == NULL) { 115 | return 0; 116 | } 117 | 118 | return _input_tensor->params.zero_point; 119 | } --------------------------------------------------------------------------------
/colab_utils/pico.py: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 3 | # 4 | # SPDX-License-Identifier: Apache-2.0 5 | # 6 | 7 | from IPython import display 8 | from google.colab import output 9 | import base64 10 | 11 | def flash_pico(binary_file): 12 | def read_binary_base64(binary_file): 13 | with open(binary_file, 'rb') as f: 14 | return base64.b64encode(f.read()).decode('ascii') 15 | 16 | output.register_callback('notebook.read_binary_base64', read_binary_base64) 17 | 18 | display.display(display.Javascript(""" 19 | const picotoolJsScript = document.createElement("script"); 20 | 21 | picotoolJsScript.src = "https://armdeveloperecosystem.github.io/picotool.js/src/picotool.js"; 22 | picotoolJsScript.type = "text/javascript"; 23 | 24 | document.body.append(picotoolJsScript); 25 | 26 | async function readBinary() { 27 | const result = await google.colab.kernel.invokeFunction('notebook.read_binary_base64', ['###binary_file###']); 28 | let str = result.data['text/plain']; 29 | 30 | str = str.substring(1, str.length - 1); 31 | 32 | return Uint8Array.from(atob(str), c => c.charCodeAt(0)); 33 | } 34 | 35 | async function flashPico() { 36 | statusDiv.innerHTML = ''; 37 | 38 | try { 39 | statusDiv.innerHTML += "requesting device ...<br/>"; 40 | const usbDevice = await PicoTool.requestDevice(); 41 | 42 | const fileData = (await readBinary()).buffer; 43 | const writeData = new ArrayBuffer(PicoTool.FLASH_SECTOR_ERASE_SIZE); 44 | 45 | const srcDataView = new DataView(fileData); 46 | const dstDataView = new DataView(writeData); 47 | 48 | const picoTool = new PicoTool(usbDevice); 49 | 50 | statusDiv.innerHTML += "opening device ...<br/>"; 51 | await picoTool.open(); 52 | 53 | // statusDiv.innerHTML += "reset ...<br/>"; 54 | await picoTool.reset(); 55 | 56 | // statusDiv.innerHTML += "exlusive access device ...<br/>"; 57 | await picoTool.exlusiveAccess(1); 58 | 59 | // statusDiv.innerHTML += "exit xip ...<br/>"; 60 | await picoTool.exitXip(); 61 | 62 | for (let i = 0; i < fileData.byteLength; i += PicoTool.FLASH_SECTOR_ERASE_SIZE) { 63 | let j = 0; 64 | for (j = 0; j < PicoTool.FLASH_SECTOR_ERASE_SIZE && (i + j) < fileData.byteLength; j++) { 65 | dstDataView.setUint8(j, srcDataView.getUint8(i + j)); 66 | } 67 | 68 | statusDiv.innerHTML += "erasing ... "; 69 | await picoTool.flashErase(PicoTool.FLASH_START + i, PicoTool.FLASH_SECTOR_ERASE_SIZE); 70 | 71 | statusDiv.innerHTML += "writing ... "; 72 | await picoTool.write(PicoTool.FLASH_START + i, writeData); 73 | 74 | statusDiv.innerHTML += " " + ((i + j) * 100 / fileData.byteLength).toFixed(2) + "%<br/>"; 75 | } 76 | 77 | statusDiv.innerHTML += "rebooting device ...<br/>"; 78 | await picoTool.reboot(0, PicoTool.SRAM_END, 512); 79 | 80 | statusDiv.innerHTML += "closing device ...<br/>"; 81 | await picoTool.close(); 82 | } catch (e) { 83 | statusDiv.innerHTML = 'Error: ' + e.message; 84 | } 85 | } 86 | 87 | let statusDiv = undefined; 88 | 89 | if ('usb' in navigator) { 90 | const flashButton = document.createElement("button"); 91 | 92 | flashButton.innerHTML = "Flash"; 93 | flashButton.onclick = flashPico; 94 | 95 | document.querySelector("#output-area").appendChild(flashButton); 96 | 97 | statusDiv = document.createElement("div"); 98 | statusDiv.style = "margin: 5px"; 99 | 100 | document.querySelector("#output-area").appendChild(statusDiv); 101 | } else { 102 | document.querySelector("#output-area").appendChild(document.createTextNode( 103 | "Oh no! Your browser does not support WebUSB!" 104 | )); 105 | } 106 | """.replace('###binary_file###', binary_file))) 107 | --------------------------------------------------------------------------------
/colab_utils/audio.py: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 3 | # 4 | # SPDX-License-Identifier: Apache-2.0 5 | # 6 | 7 | from IPython import display 8 | from google.colab import output 9 | import base64 10 | from datetime import datetime 11 | from os import path 12 | 13 | def record_wav_file(folder): 14 | def save_wav_file(folder, s): 15 | b = base64.b64decode(s.split(',')[1]) 16 | 17 | file_path = path.join(folder, datetime.now().strftime("%d-%m-%Y-%H-%M-%S.wav")) 18 | 19 | print(f'Saving file: {file_path}') 20 | 21 | with open(file_path, 'wb') as f: 22 | f.write(b) 23 | 24 | output.register_callback('notebook.save_wav_file', save_wav_file) 25 | 26 | display.display(display.Javascript(""" 27 | const recorderJsScript = document.createElement("script"); 28 | const audioInputSelect = document.createElement("select"); 29 | const recordButton = document.createElement("button"); 30 | 31 | recorderJsScript.src = "https://sandeepmistry.github.io/Recorderjs/dist/recorder.js"; 32 | recorderJsScript.type = "text/javascript"; 33 | 34 | recordButton.innerHTML = "⏺ Start Recording"; 35 | 36 | document.body.append(recorderJsScript); 37 | document.querySelector("#output-area").appendChild(audioInputSelect); 38 | document.querySelector("#output-area").appendChild(recordButton); 39 | 40 | async function updateAudioInputSelect() { 41 | while (audioInputSelect.firstChild) { 42 | audioInputSelect.removeChild(audioInputSelect.firstChild); 43 | } 44 | 45 | const deviceInfos = await navigator.mediaDevices.enumerateDevices(); 46 | 47 | for (let i = 0; i !== deviceInfos.length; ++i) { 48 | const deviceInfo = deviceInfos[i]; 49 | const option = document.createElement("option"); 50 | 51 | option.value = deviceInfo.deviceId; 52 | 53 | if (deviceInfo.kind === "audioinput") { 54 | option.text = deviceInfo.label || "Microphone " + (audioInputSelect.length + 1); 55 | option.selected = (option.text.indexOf("MicNode") !== -1); 56 | audioInputSelect.appendChild(option); 57 | } 58 | } 59 | } 60 | 61 | const blob2text = (blob) => new Promise(resolve => { 62 | const reader = new FileReader(); 63 | reader.onloadend = e => resolve(e.srcElement.result); 64 | reader.readAsDataURL(blob) 65 | }); 66 | 67 | let recorder = undefined; 68 | let stream = undefined; 69 | 70 | // https://www.html5rocks.com/en/tutorials/getusermedia/intro/ 71 | recordButton.onclick = async () => { 72 | if (recordButton.innerHTML != "⏺ Start Recording") { 73 | recordButton.disabled = true; 74 | recorder.stop(); 75 | 76 | recorder.exportWAV(async (blob) => { 77 | const text
= await blob2text(blob); 78 | 79 | google.colab.kernel.invokeFunction('notebook.save_wav_file', ['###folder###', text], {}); 80 | 81 | recordButton.innerHTML = "⏺ Start Recording"; 82 | recordButton.disabled = false; 83 | 84 | stream.getTracks().forEach(function(track) { 85 | if (track.readyState === 'live') { 86 | track.stop(); 87 | } 88 | }); 89 | }); 90 | 91 | return; 92 | } 93 | 94 | const constraints = { 95 | audio: { 96 | deviceId: { 97 | }, 98 | sampleRate: 16000 99 | }, 100 | video: false 101 | }; 102 | 103 | stream = await navigator.mediaDevices.getUserMedia(constraints); 104 | 105 | if (audioInputSelect.value === "") { 106 | await updateAudioInputSelect(); 107 | 108 | stream.getTracks().forEach(function(track) { 109 | if (track.readyState === 'live') { 110 | track.stop(); 111 | } 112 | }); 113 | 114 | constraints.audio.deviceId.exact = audioInputSelect.value; 115 | stream = await navigator.mediaDevices.getUserMedia(constraints); 116 | } 117 | 118 | const audioContext = new AudioContext({ 119 | sampleRate: 16000 120 | }); 121 | 122 | const input = audioContext.createMediaStreamSource(stream); 123 | 124 | recorder = new Recorder(input, { 125 | numChannels: 1 126 | }); 127 | 128 | recordButton.innerHTML = "⏹ Stop Recording"; 129 | 130 | recorder.record(); 131 | }; 132 | 133 | updateAudioInputSelect(); 134 | """.replace('###folder###', folder))) --------------------------------------------------------------------------------
/inference-app/src/main.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 3 | * 4 | * SPDX-License-Identifier: Apache-2.0 5 | * 6 | */ 7 | 8 | #include <stdio.h> 9 | 10 | #include "pico/stdlib.h" 11 | #include "hardware/pwm.h" 12 | 13 | extern "C" { 14 | #include "pico/pdm_microphone.h" 15 | } 16 | 17 | #include "tflite_model.h" 18 | 19 | #include "dsp_pipeline.h" 20 | #include "ml_model.h" 21 | 22 | // constants 23 | #define SAMPLE_RATE 16000 24 | #define FFT_SIZE 256 25 | #define SPECTRUM_SHIFT 4 26 | #define INPUT_BUFFER_SIZE ((FFT_SIZE / 2) * SPECTRUM_SHIFT) 27 | #define INPUT_SHIFT 0 28 | 29 | // microphone configuration 30 | const struct pdm_microphone_config pdm_config = { 31 | // GPIO pin for the PDM DAT signal 32 | .gpio_data = 2, 33 | 34 | // GPIO pin for the PDM CLK signal 35 | .gpio_clk = 3, 36 | 37 | // PIO instance to use 38 | .pio = pio0, 39 | 40 | // PIO State Machine instance to use 41 | .pio_sm = 0, 42 | 43 | // sample rate in Hz 44 | .sample_rate = SAMPLE_RATE, 45 | 46 | // number of samples to buffer 47 | .sample_buffer_size = INPUT_BUFFER_SIZE, 48 | }; 49 | 50 | q15_t capture_buffer_q15[INPUT_BUFFER_SIZE]; 51 | volatile int new_samples_captured = 0; 52 | 53 | q15_t input_q15[INPUT_BUFFER_SIZE + (FFT_SIZE / 2)]; 54 | 55 | DSPPipeline dsp_pipeline(FFT_SIZE); 56 | MLModel ml_model(tflite_model, 128 * 1024); 57 | 58 | int8_t* scaled_spectrum = nullptr; 59 | int32_t spectrogram_divider; 60 | float spectrogram_zero_point; 61 | 62 | void on_pdm_samples_ready(); 63 | 64 | int main( void ) 65 | { 66 | // initialize stdio 67 | stdio_init_all(); 68 | 69 | printf("hello pico fire alarm detection\n"); 70 | 71 | gpio_set_function(PICO_DEFAULT_LED_PIN, GPIO_FUNC_PWM); 72 | 73 | uint pwm_slice_num = pwm_gpio_to_slice_num(PICO_DEFAULT_LED_PIN); 74 | uint pwm_chan_num = pwm_gpio_to_channel(PICO_DEFAULT_LED_PIN); 75 | 76 | // Set period of 256 cycles (0 to 255 inclusive) 77 | pwm_set_wrap(pwm_slice_num, 256); 78 | 79 | // Set the PWM running 80 |
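// (the LED duty level is later set from the model output each loop iteration
// via pwm_set_chan_level(..., prediction * 255), so the on-board LED
// brightness tracks the classifier's confidence)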
pwm_set_enabled(pwm_slice_num, true); 81 | 82 | if (!ml_model.init()) { 83 | printf("Failed to initialize ML model!\n"); 84 | while (1) { tight_loop_contents(); } 85 | } 86 | 87 | if (!dsp_pipeline.init()) { 88 | printf("Failed to initialize DSP Pipeline!\n"); 89 | while (1) { tight_loop_contents(); } 90 | } 91 | 92 | scaled_spectrum = (int8_t*)ml_model.input_data(); 93 | spectrogram_divider = 64 * ml_model.input_scale(); 94 | spectrogram_zero_point = ml_model.input_zero_point(); 95 | 96 | // initialize the PDM microphone 97 | if (pdm_microphone_init(&pdm_config) < 0) { 98 | printf("PDM microphone initialization failed!\n"); 99 | while (1) { tight_loop_contents(); } 100 | } 101 | 102 | // set callback that is called when all the samples in the library 103 | // internal sample buffer are ready for reading 104 | pdm_microphone_set_samples_ready_handler(on_pdm_samples_ready); 105 | 106 | // start capturing data from the PDM microphone 107 | if (pdm_microphone_start() < 0) { 108 | printf("PDM microphone start failed!\n"); 109 | while (1) { tight_loop_contents(); } 110 | } 111 | 112 | while (1) { 113 | // wait for new samples 114 | while (new_samples_captured == 0) { 115 | tight_loop_contents(); 116 | } 117 | new_samples_captured = 0; 118 | 119 | dsp_pipeline.shift_spectrogram(scaled_spectrum, SPECTRUM_SHIFT, 124); 120 | 121 | // keep the last FFT_SIZE / 2 samples of the input buffer as overlap for the next frames 122 | memmove(input_q15, &input_q15[INPUT_BUFFER_SIZE], (FFT_SIZE / 2) * sizeof(q15_t)); 123 | 124 | // copy new samples to end of the input buffer with a bit shift of INPUT_SHIFT 125 | arm_shift_q15(capture_buffer_q15, INPUT_SHIFT, input_q15 + (FFT_SIZE / 2), INPUT_BUFFER_SIZE); 126 | 127 | for (int i = 0; i < SPECTRUM_SHIFT; i++) { 128 | dsp_pipeline.calculate_spectrum( 129 | input_q15 + i * ((FFT_SIZE / 2)), 130 | scaled_spectrum + (129 * (124 - SPECTRUM_SHIFT + i)), // 129 = FFT_SIZE / 2 + 1 bins per column, 124 columns 131 | spectrogram_divider, spectrogram_zero_point 132 | ); 133 | } 134 | 135 | float prediction = ml_model.predict(); 136 | 137 | if (prediction >= 0.5) { 138 | printf("\t🔥 🔔\tdetected!\t(prediction = %f)\n\n", prediction); 139 | } else { 140 | printf("\t🔕\tNOT detected\t(prediction = %f)\n\n", prediction); 141 | } 142 | 143 | pwm_set_chan_level(pwm_slice_num, pwm_chan_num, prediction * 255); 144 | } 145 | 146 | return 0; 147 | } 148 | 149 | void on_pdm_samples_ready() 150 | { 151 | // callback from library when all the samples in the library 152 | // internal sample buffer are ready for reading 153 | 154 | // read in the new samples 155 | new_samples_captured = pdm_microphone_read(capture_buffer_q15, INPUT_BUFFER_SIZE); 156 | } --------------------------------------------------------------------------------
/LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity.
For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 
134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 
193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /inference-app/src/tflite_model.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2021 Arm Limited and Contributors. All rights reserved. 3 | * 4 | * SPDX-License-Identifier: Apache-2.0 5 | * 6 | */ 7 | 8 | #ifndef _TFLITE_MODEL_H_ 9 | #define _TFLITE_MODEL_H_ 10 | 11 | alignas(8) const unsigned char tflite_model[] = { 12 | 0x20, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33, 0x00, 0x00, 0x00, 0x00, 13 | 0x14, 0x00, 0x20, 0x00, 0x1c, 0x00, 0x18, 0x00, 0x14, 0x00, 0x10, 0x00, 14 | 0x0c, 0x00, 0x00, 0x00, 0x08, 0x00, 0x04, 0x00, 0x14, 0x00, 0x00, 0x00, 15 | 0x1c, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00, 0x4c, 0x00, 0x00, 0x00, 16 | 0x68, 0x04, 0x00, 0x00, 0x78, 0x04, 0x00, 0x00, 0x98, 0x0e, 0x00, 0x00, 17 | 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 18 | 0x0c, 0x00, 0x00, 0x00, 0x08, 0x00, 0x0c, 0x00, 0x08, 0x00, 0x04, 0x00, 19 | 0x08, 0x00, 0x00, 0x00, 0x0e, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 20 | 0x13, 0x00, 0x00, 0x00, 0x6d, 0x69, 0x6e, 0x5f, 0x72, 0x75, 0x6e, 0x74, 21 | 0x69, 0x6d, 0x65, 0x5f, 0x76, 0x65, 0x72, 0x73, 0x69, 0x6f, 0x6e, 0x00, 22 | 0x0f, 0x00, 0x00, 0x00, 0x18, 0x04, 0x00, 0x00, 0x10, 0x04, 0x00, 0x00, 23 | 0xf8, 0x03, 0x00, 0x00, 0xd8, 0x03, 0x00, 0x00, 0xc8, 0x01, 0x00, 0x00, 24 | 0x98, 0x01, 0x00, 0x00, 0x90, 0x01, 0x00, 0x00, 0x88, 0x01, 0x00, 0x00, 25 | 0x80, 0x01, 0x00, 0x00, 0x78, 0x01, 0x00, 0x00, 0x48, 0x00, 0x00, 0x00, 26 | 0x34, 0x00, 0x00, 0x00, 0x2c, 0x00, 0x00, 0x00, 0x24, 0x00, 0x00, 0x00, 27 | 0x04, 0x00, 0x00, 0x00, 0x42, 0xfc, 0xff, 0xff, 0x04, 0x00, 0x00, 0x00, 28 | 0x10, 0x00, 0x00, 0x00, 0x32, 0x2e, 0x33, 0x2e, 0x30, 0x00, 0x00, 0x00, 29 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xc8, 0xf2, 0xff, 0xff, 30 | 0xcc, 0xf2, 0xff, 0xff, 0x66, 0xfc, 0xff, 0xff, 0x04, 0x00, 0x00, 0x00, 31 | 0x04, 0x00, 0x00, 0x00, 0x4d, 0xfa, 0xff, 0xff, 0x76, 0xfc, 0xff, 0xff, 32 | 0x04, 0x00, 0x00, 0x00, 0x20, 0x01, 0x00, 0x00, 0xaf, 0xf0, 0x0f, 0xb7, 33 | 0xd6, 0xdb, 0xc6, 0x18, 0x42, 0x39, 0x4b, 0x41, 0xe3, 0x54, 0x13, 0x1d, 34 | 0x0b, 0x14, 0x1d, 0x40, 0x1c, 0x1d, 0x47, 0x02, 0x19, 0x99, 0xab, 0xdd, 35 | 0x0c, 0xf4, 0x1a, 0xc7, 0x91, 0xab, 0xe6, 0xbf, 0xb9, 0xd3, 0xc0, 0xc8, 36 | 0xc4, 0xd0, 0xd5, 0xce, 0xd2, 0xe7, 0xbe, 0xdb, 0xb0, 0xec, 0x02, 0xc9, 37 | 0xbe, 0xe4, 0xdb, 0x2a, 0x1f, 0x44, 0x10, 0x12, 0xd7, 0x30, 0xfa, 0xfe, 38 | 0xf7, 0xfc, 0x10, 0x1a, 0x0d, 0x18, 0x0e, 0x11, 0x1b, 0xc1, 0xb7, 0xc5, 39 | 0x02, 0x0b, 0x03, 0xac, 0x91, 0xc4, 0xe0, 0xb7, 0xdb, 0xd7, 0xc9, 0xd2, 40 | 0xab, 0xad, 0xdf, 0xfa, 0xce, 0xe5, 0xd6, 0xbf, 0xaf, 0x0c, 0x15, 0xdc, 41 | 0xe4, 0x06, 0xc2, 0x17, 0x24, 0x38, 0x29, 0x21, 0xd5, 0x4a, 0xf7, 0x17, 42 | 0x04, 0x09, 0xfb, 0x51, 0x13, 0x00, 0x38, 0x00, 0x1f, 0x94, 0xdc, 0x94, 43 | 0x08, 0x06, 0x03, 0xcb, 0x8b, 0x40, 0xd9, 0xa1, 0x0c, 0xdd, 0xdb, 0xf1, 44 | 0xcb, 0xd1, 0xdb, 0xd0, 0xdb, 0xb2, 0xf5, 0xbc, 0xc7, 0x04, 0xfb, 0x9c, 45 | 0xe7, 0xf7, 0xf3, 0x36, 0x35, 0x40, 0x2e, 0x07, 0x2b, 0x31, 0x19, 0x23, 46 | 0xe1, 0x06, 0x18, 
0x24, 0x2b, 0x18, 0x29, 0x05, 0x04, 0xc4, 0xb5, 0x0c, 47 | 0x09, 0x10, 0x07, 0xa7, 0x81, 0x02, 0x0b, 0x0b, 0xe3, 0xf4, 0xee, 0xd2, 48 | 0xbe, 0xeb, 0xe3, 0xd0, 0xe0, 0xe0, 0xdf, 0xcf, 0xc5, 0x05, 0x1b, 0xe9, 49 | 0xc2, 0xd0, 0xe9, 0xfa, 0x42, 0x1f, 0x4f, 0x4a, 0xb8, 0x7b, 0xfb, 0x09, 50 | 0xdf, 0xfd, 0x06, 0x41, 0x17, 0x01, 0x39, 0x0c, 0x1c, 0xae, 0xe2, 0xf9, 51 | 0x0a, 0x0c, 0x01, 0x1f, 0xa2, 0x2a, 0xf4, 0x3f, 0x02, 0x0d, 0x11, 0xe0, 52 | 0xae, 0xd9, 0xe8, 0xfc, 0xde, 0xeb, 0xf4, 0xdb, 0x98, 0x38, 0xfb, 0xc1, 53 | 0xeb, 0xc6, 0xe2, 0x37, 0x49, 0x5f, 0x2d, 0xe6, 0xfd, 0x2c, 0x13, 0x16, 54 | 0x15, 0x0a, 0x09, 0x2a, 0x25, 0x18, 0x3e, 0xff, 0x37, 0x91, 0xe3, 0xe7, 55 | 0xed, 0xfa, 0xf7, 0xdd, 0xaa, 0x23, 0xe6, 0xf1, 0xd3, 0xeb, 0xef, 0xd1, 56 | 0xfb, 0xef, 0xc8, 0xe0, 0xd1, 0xee, 0xe6, 0xc9, 0x0c, 0xf4, 0xff, 0xff, 57 | 0x10, 0xf4, 0xff, 0xff, 0x14, 0xf4, 0xff, 0xff, 0x18, 0xf4, 0xff, 0xff, 58 | 0xb2, 0xfd, 0xff, 0xff, 0x04, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 59 | 0x9b, 0xff, 0xff, 0xff, 0x0d, 0xff, 0xff, 0xff, 0x89, 0xfe, 0xff, 0xff, 60 | 0xe5, 0xfe, 0xff, 0xff, 0x75, 0x00, 0x00, 0x00, 0x50, 0xff, 0xff, 0xff, 61 | 0x1e, 0xfe, 0xff, 0xff, 0x56, 0x00, 0x00, 0x00, 0xde, 0xfd, 0xff, 0xff, 62 | 0x04, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x7f, 0xf8, 0xec, 0xac, 63 | 0x31, 0x01, 0xd1, 0xb1, 0x3f, 0xfc, 0x15, 0xd1, 0x13, 0x26, 0xcf, 0xb7, 64 | 0x19, 0x12, 0xf5, 0x06, 0xdc, 0xf1, 0xe4, 0x1d, 0x2c, 0xfd, 0x17, 0xdd, 65 | 0x03, 0x18, 0xf8, 0xe5, 0x26, 0x00, 0x02, 0xde, 0xf9, 0x44, 0x09, 0xd4, 66 | 0x34, 0xf8, 0x07, 0xec, 0xe4, 0x12, 0x06, 0xce, 0x36, 0x01, 0x07, 0xbd, 67 | 0x41, 0x12, 0x07, 0xc1, 0x75, 0x03, 0x0c, 0xce, 0xe9, 0x1a, 0xd6, 0xc2, 68 | 0x0d, 0x88, 0x28, 0x47, 0xe1, 0x07, 0xf7, 0xef, 0x01, 0x8b, 0x31, 0x41, 69 | 0xe9, 0xfb, 0x14, 0xf2, 0xff, 0x84, 0x4c, 0x0d, 0xde, 0x0e, 0x06, 0xed, 70 | 0x10, 0x98, 0x06, 0x14, 0xf9, 0x0f, 0x05, 0x09, 0x1c, 0x81, 0x03, 0xf7, 71 | 0xe6, 0x1b, 0x09, 0x19, 0x0e, 0x85, 0x47, 0xf8, 0xea, 0xe1, 0xfa, 0x0f, 72 | 0x03, 0x86, 0x50, 0x23, 0xd4, 0x0c, 0x0e, 0xe6, 0xf7, 0x82, 0x26, 0x54, 73 | 0xd2, 0xfe, 0x06, 0x01, 0xc5, 0xe9, 0x26, 0x1e, 0x28, 0x42, 0x0b, 0x3b, 74 | 0x91, 0xd3, 0xe2, 0x1d, 0x10, 0xfa, 0x1a, 0xec, 0x81, 0xb5, 0xaa, 0xb3, 75 | 0xad, 0xcf, 0xd1, 0xc4, 0xf1, 0x36, 0x37, 0x5c, 0x27, 0x01, 0x23, 0xfe, 76 | 0xeb, 0x0b, 0x33, 0x37, 0x30, 0xf5, 0xff, 0x0e, 0x94, 0xd5, 0xd9, 0xb8, 77 | 0xd7, 0xdb, 0xe6, 0xf6, 0xad, 0xcf, 0x00, 0xf3, 0xea, 0x26, 0x12, 0x1a, 78 | 0x96, 0x8d, 0x08, 0xe3, 0x37, 0x14, 0x0d, 0x14, 0xa6, 0x81, 0xaf, 0xb1, 79 | 0x94, 0xbd, 0xca, 0xce, 0xc8, 0xa1, 0xb7, 0xe0, 0xea, 0xea, 0xeb, 0xfb, 80 | 0x24, 0x42, 0x44, 0x1d, 0x4b, 0x45, 0x0c, 0x01, 0x54, 0x5f, 0x34, 0x56, 81 | 0x18, 0x43, 0x49, 0x1a, 0x39, 0x64, 0x60, 0x57, 0x4d, 0x11, 0xf4, 0x3b, 82 | 0x07, 0x0f, 0x30, 0x06, 0x06, 0x12, 0x27, 0x1d, 0xe5, 0xcf, 0xd8, 0xb4, 83 | 0xd5, 0xd8, 0xdf, 0xf4, 0xc5, 0x8d, 0xb0, 0xc0, 0xb5, 0xb3, 0xba, 0xd8, 84 | 0xd0, 0x1b, 0xef, 0x51, 0x3e, 0xdf, 0x19, 0xb9, 0x23, 0x7f, 0xf9, 0xfe, 85 | 0x05, 0x59, 0x03, 0xce, 0xeb, 0x44, 0xf1, 0x25, 0x03, 0xf7, 0x0f, 0x00, 86 | 0xf9, 0xc5, 0xad, 0x10, 0xf0, 0xda, 0x06, 0xd8, 0x15, 0x1a, 0xde, 0x11, 87 | 0x0e, 0xe1, 0xfb, 0xff, 0xe3, 0x24, 0x00, 0xf8, 0x12, 0x0f, 0xe0, 0xf9, 88 | 0x10, 0x1e, 0xf3, 0x28, 0x0a, 0xd7, 0x0a, 0xc5, 0xe9, 0x66, 0xe3, 0x1a, 89 | 0x18, 0x13, 0x00, 0xdc, 0xbc, 0xde, 0xd0, 0xc0, 0xfb, 0x4a, 0x5a, 0xc5, 90 | 0xb2, 0x01, 0xee, 0x81, 0xce, 0x4a, 0x26, 0x18, 0x0a, 0x26, 0x3f, 0xe3, 91 | 0xc7, 0x0c, 0x1c, 0xe7, 0x06, 0x35, 0x30, 0xe5, 0xd2, 0x24, 0x12, 0x0d, 92 | 0x10, 0x56, 0xfe, 0xf6, 0xea, 
0x10, 0x1c, 0xe5, 0xf7, 0x0d, 0x2a, 0xdc, 93 | 0xcd, 0x57, 0x1a, 0x14, 0xce, 0xfa, 0x14, 0xdd, 0xc6, 0x58, 0x45, 0xeb, 94 | 0xa7, 0xd6, 0xd2, 0x99, 0xdf, 0x50, 0x4a, 0xf6, 0x01, 0x54, 0x15, 0x40, 95 | 0x81, 0x9d, 0x93, 0x29, 0xfd, 0x3f, 0x31, 0x13, 0xa7, 0xc3, 0xbe, 0xfb, 96 | 0xfd, 0x0f, 0x36, 0x17, 0xc0, 0xa2, 0xc6, 0xfd, 0x06, 0xf0, 0x06, 0x34, 97 | 0xb0, 0xc9, 0xe0, 0x12, 0xfd, 0x20, 0x0a, 0x03, 0xc1, 0xae, 0xd4, 0xdd, 98 | 0x05, 0x21, 0x25, 0xeb, 0x98, 0xc5, 0xc2, 0xf3, 0xee, 0x3b, 0x2f, 0x0d, 99 | 0xa9, 0xc1, 0xf4, 0xec, 0xec, 0x23, 0x6d, 0xdc, 0xa1, 0xb8, 0xab, 0x45, 100 | 0x8b, 0x04, 0x22, 0x0e, 0x1d, 0x04, 0x1a, 0x45, 0x81, 0x02, 0x1d, 0x19, 101 | 0x18, 0x2e, 0x0f, 0x16, 0xa7, 0xf2, 0x0a, 0x1c, 0x15, 0x18, 0x16, 0xf5, 102 | 0xbb, 0xf1, 0x0b, 0x0b, 0x20, 0xf2, 0x17, 0xff, 0xd6, 0xeb, 0x0b, 0x22, 103 | 0x0e, 0x2b, 0x26, 0x09, 0xb6, 0xf6, 0x1f, 0x02, 0x1b, 0xd1, 0x14, 0xf4, 104 | 0xc3, 0xfd, 0x29, 0x06, 0x03, 0x1e, 0xef, 0x42, 0x9d, 0xb7, 0xf4, 0x2d, 105 | 0x33, 0x40, 0x1f, 0x3c, 0xea, 0xff, 0xff, 0xff, 0x04, 0x00, 0x00, 0x00, 106 | 0x08, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x20, 0x01, 0x00, 0x00, 107 | 0x00, 0x00, 0x06, 0x00, 0x08, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00, 108 | 0x04, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 109 | 0x20, 0x00, 0x00, 0x00, 0x84, 0xf6, 0xff, 0xff, 0x88, 0xf6, 0xff, 0xff, 110 | 0x0f, 0x00, 0x00, 0x00, 0x4d, 0x4c, 0x49, 0x52, 0x20, 0x43, 0x6f, 0x6e, 111 | 0x76, 0x65, 0x72, 0x74, 0x65, 0x64, 0x2e, 0x00, 0x01, 0x00, 0x00, 0x00, 112 | 0x14, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0e, 0x00, 0x18, 0x00, 0x14, 0x00, 113 | 0x10, 0x00, 0x0c, 0x00, 0x08, 0x00, 0x04, 0x00, 0x0e, 0x00, 0x00, 0x00, 114 | 0x14, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00, 0xb0, 0x01, 0x00, 0x00, 115 | 0xb4, 0x01, 0x00, 0x00, 0xb8, 0x01, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 116 | 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 117 | 0x5c, 0x01, 0x00, 0x00, 0xf8, 0x00, 0x00, 0x00, 0x94, 0x00, 0x00, 0x00, 118 | 0x6c, 0x00, 0x00, 0x00, 0x28, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 119 | 0xaa, 0xff, 0xff, 0xff, 0x0c, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 120 | 0x05, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 121 | 0x01, 0x00, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00, 0x4a, 0xff, 0xff, 0xff, 122 | 0x14, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x10, 0x00, 0x00, 0x00, 123 | 0x14, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x30, 0xf7, 0xff, 0xff, 124 | 0x01, 0x00, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 125 | 0x08, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00, 126 | 0x00, 0x00, 0x0a, 0x00, 0x10, 0x00, 0x0c, 0x00, 0x08, 0x00, 0x04, 0x00, 127 | 0x0a, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 128 | 0x03, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 129 | 0x02, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 130 | 0xae, 0xff, 0xff, 0xff, 0x24, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 131 | 0x34, 0x00, 0x00, 0x00, 0x38, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 132 | 0x00, 0x00, 0x0e, 0x00, 0x18, 0x00, 0x17, 0x00, 0x10, 0x00, 0x0c, 0x00, 133 | 0x08, 0x00, 0x04, 0x00, 0x0e, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 134 | 0x02, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 135 | 0x00, 0x00, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 136 | 0x01, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0e, 0x00, 137 | 0x18, 0x00, 0x14, 0x00, 0x10, 0x00, 0x0c, 0x00, 0x0b, 0x00, 0x04, 0x00, 138 | 
0x0e, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 139 | 0x2c, 0x00, 0x00, 0x00, 0x30, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 140 | 0x0c, 0x00, 0x14, 0x00, 0x13, 0x00, 0x0c, 0x00, 0x08, 0x00, 0x07, 0x00, 141 | 0x0c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x00, 0x00, 0x00, 142 | 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 143 | 0x06, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 144 | 0x03, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0e, 0x00, 145 | 0x14, 0x00, 0x00, 0x00, 0x10, 0x00, 0x0c, 0x00, 0x0b, 0x00, 0x04, 0x00, 146 | 0x0e, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x4a, 147 | 0x18, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00, 0x08, 0x00, 0x08, 0x00, 148 | 0x00, 0x00, 0x07, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 149 | 0x01, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 150 | 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 151 | 0x0c, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 152 | 0x0d, 0x00, 0x00, 0x00, 0xc0, 0x07, 0x00, 0x00, 0x5c, 0x07, 0x00, 0x00, 153 | 0x08, 0x07, 0x00, 0x00, 0xf8, 0x05, 0x00, 0x00, 0xd4, 0x04, 0x00, 0x00, 154 | 0x38, 0x04, 0x00, 0x00, 0x3c, 0x03, 0x00, 0x00, 0xb0, 0x02, 0x00, 0x00, 155 | 0x3c, 0x02, 0x00, 0x00, 0x90, 0x01, 0x00, 0x00, 0x0c, 0x01, 0x00, 0x00, 156 | 0x70, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x88, 0xf8, 0xff, 0xff, 157 | 0x18, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 158 | 0x0d, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 0x44, 0x00, 0x00, 0x00, 159 | 0x02, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x01, 0x00, 0x00, 0x00, 160 | 0x6c, 0xf8, 0xff, 0xff, 0x08, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 161 | 0x01, 0x00, 0x00, 0x00, 0x80, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 162 | 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x3b, 163 | 0x08, 0x00, 0x00, 0x00, 0x49, 0x64, 0x65, 0x6e, 0x74, 0x69, 0x74, 0x79, 164 | 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 165 | 0x01, 0x00, 0x00, 0x00, 0xf0, 0xf8, 0xff, 0xff, 0x18, 0x00, 0x00, 0x00, 166 | 0x20, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 167 | 0x00, 0x00, 0x00, 0x09, 0x74, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 168 | 0xff, 0xff, 0xff, 0xff, 0x01, 0x00, 0x00, 0x00, 0xd4, 0xf8, 0xff, 0xff, 169 | 0x08, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 170 | 0xfd, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, 171 | 0x01, 0x00, 0x00, 0x00, 0xbf, 0xb6, 0x86, 0x3d, 0x3a, 0x00, 0x00, 0x00, 172 | 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 173 | 0x74, 0x5f, 0x64, 0x65, 0x6e, 0x73, 0x65, 0x5f, 0x31, 0x2f, 0x4d, 0x61, 174 | 0x74, 0x4d, 0x75, 0x6c, 0x3b, 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 175 | 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x64, 0x65, 0x6e, 0x73, 0x65, 176 | 0x5f, 0x31, 0x2f, 0x42, 0x69, 0x61, 0x73, 0x41, 0x64, 0x64, 0x00, 0x00, 177 | 0x02, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 178 | 0xe2, 0xf9, 0xff, 0xff, 0x14, 0x00, 0x00, 0x00, 0x34, 0x00, 0x00, 0x00, 179 | 0x0b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x64, 0x00, 0x00, 0x00, 180 | 0x5c, 0xf9, 0xff, 0xff, 0x08, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 181 | 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 182 | 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x88, 0x6b, 0xbd, 0x39, 183 | 0x35, 0x00, 0x00, 0x00, 0x6d, 0x6f, 0x64, 
0x65, 0x6c, 0x5f, 0x31, 0x2f, 184 | 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x64, 0x65, 0x6e, 0x73, 0x65, 0x5f, 185 | 0x31, 0x2f, 0x42, 0x69, 0x61, 0x73, 0x41, 0x64, 0x64, 0x2f, 0x52, 0x65, 186 | 0x61, 0x64, 0x56, 0x61, 0x72, 0x69, 0x61, 0x62, 0x6c, 0x65, 0x4f, 0x70, 187 | 0x2f, 0x72, 0x65, 0x73, 0x6f, 0x75, 0x72, 0x63, 0x65, 0x00, 0x00, 0x00, 188 | 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x62, 0xfa, 0xff, 0xff, 189 | 0x14, 0x00, 0x00, 0x00, 0x34, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00, 190 | 0x00, 0x00, 0x00, 0x09, 0x88, 0x00, 0x00, 0x00, 0xdc, 0xf9, 0xff, 0xff, 191 | 0x08, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 192 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 193 | 0x01, 0x00, 0x00, 0x00, 0x0a, 0xee, 0xe0, 0x3b, 0x59, 0x00, 0x00, 0x00, 194 | 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 195 | 0x74, 0x5f, 0x64, 0x65, 0x6e, 0x73, 0x65, 0x5f, 0x31, 0x2f, 0x4d, 0x61, 196 | 0x74, 0x4d, 0x75, 0x6c, 0x3b, 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 197 | 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x64, 0x65, 0x6e, 0x73, 0x65, 198 | 0x5f, 0x31, 0x2f, 0x4c, 0x61, 0x73, 0x74, 0x56, 0x61, 0x6c, 0x75, 0x65, 199 | 0x51, 0x75, 0x61, 0x6e, 0x74, 0x2f, 0x46, 0x61, 0x6b, 0x65, 0x51, 0x75, 200 | 0x61, 0x6e, 0x74, 0x57, 0x69, 0x74, 0x68, 0x4d, 0x69, 0x6e, 0x4d, 0x61, 201 | 0x78, 0x56, 0x61, 0x72, 0x73, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 202 | 0x01, 0x00, 0x00, 0x00, 0x20, 0x01, 0x00, 0x00, 0xb0, 0xfa, 0xff, 0xff, 203 | 0x18, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x3c, 0x00, 0x00, 0x00, 204 | 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 0x4c, 0x00, 0x00, 0x00, 205 | 0x02, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x20, 0x01, 0x00, 0x00, 206 | 0x94, 0xfa, 0xff, 0xff, 0x08, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 207 | 0x01, 0x00, 0x00, 0x00, 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 208 | 0x01, 0x00, 0x00, 0x00, 0xcc, 0x95, 0x57, 0x3d, 0x17, 0x00, 0x00, 0x00, 209 | 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x66, 0x6c, 0x61, 0x74, 210 | 0x74, 0x65, 0x6e, 0x2f, 0x52, 0x65, 0x73, 0x68, 0x61, 0x70, 0x65, 0x00, 211 | 0x02, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x20, 0x01, 0x00, 0x00, 212 | 0x20, 0xfb, 0xff, 0xff, 0x18, 0x00, 0x00, 0x00, 0x28, 0x00, 0x00, 0x00, 213 | 0x44, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 214 | 0x5c, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 215 | 0x06, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 216 | 0x0c, 0xfb, 0xff, 0xff, 0x08, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 217 | 0x01, 0x00, 0x00, 0x00, 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 218 | 0x01, 0x00, 0x00, 0x00, 0xcc, 0x95, 0x57, 0x3d, 0x1d, 0x00, 0x00, 0x00, 219 | 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x6d, 0x61, 0x78, 0x5f, 220 | 0x70, 0x6f, 0x6f, 0x6c, 0x69, 0x6e, 0x67, 0x32, 0x64, 0x2f, 0x4d, 0x61, 221 | 0x78, 0x50, 0x6f, 0x6f, 0x6c, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 222 | 0x01, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 223 | 0x08, 0x00, 0x00, 0x00, 0xa8, 0xfb, 0xff, 0xff, 0x18, 0x00, 0x00, 0x00, 224 | 0x28, 0x00, 0x00, 0x00, 0x48, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 225 | 0x00, 0x00, 0x00, 0x09, 0xcc, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 226 | 0xff, 0xff, 0xff, 0xff, 0x0d, 0x00, 0x00, 0x00, 0x0d, 0x00, 0x00, 0x00, 227 | 0x08, 0x00, 0x00, 0x00, 0x94, 0xfb, 0xff, 0xff, 0x08, 0x00, 0x00, 0x00, 228 | 0x14, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0xeb, 0xff, 0xff, 0xff, 229 | 0xff, 
0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 230 | 0xcc, 0x95, 0x57, 0x3d, 0x88, 0x00, 0x00, 0x00, 0x6d, 0x6f, 0x64, 0x65, 231 | 0x6c, 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x63, 0x6f, 232 | 0x6e, 0x76, 0x32, 0x64, 0x2f, 0x52, 0x65, 0x6c, 0x75, 0x3b, 0x6d, 0x6f, 233 | 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 234 | 0x63, 0x6f, 0x6e, 0x76, 0x32, 0x64, 0x2f, 0x42, 0x69, 0x61, 0x73, 0x41, 235 | 0x64, 0x64, 0x3b, 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x71, 236 | 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x63, 0x6f, 0x6e, 0x76, 0x32, 0x64, 0x2f, 237 | 0x43, 0x6f, 0x6e, 0x76, 0x32, 0x44, 0x3b, 0x6d, 0x6f, 0x64, 0x65, 0x6c, 238 | 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x63, 0x6f, 0x6e, 239 | 0x76, 0x32, 0x64, 0x2f, 0x42, 0x69, 0x61, 0x73, 0x41, 0x64, 0x64, 0x2f, 240 | 0x52, 0x65, 0x61, 0x64, 0x56, 0x61, 0x72, 0x69, 0x61, 0x62, 0x6c, 0x65, 241 | 0x4f, 0x70, 0x2f, 0x72, 0x65, 0x73, 0x6f, 0x75, 0x72, 0x63, 0x65, 0x31, 242 | 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 243 | 0x0d, 0x00, 0x00, 0x00, 0x0d, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 244 | 0xa0, 0xfc, 0xff, 0xff, 0x18, 0x00, 0x00, 0x00, 0x28, 0x00, 0x00, 0x00, 245 | 0x44, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 246 | 0x6c, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 247 | 0x20, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 248 | 0x8c, 0xfc, 0xff, 0xff, 0x08, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 249 | 0x01, 0x00, 0x00, 0x00, 0x80, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 250 | 0x01, 0x00, 0x00, 0x00, 0x03, 0x03, 0xe3, 0x3e, 0x2d, 0x00, 0x00, 0x00, 251 | 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x72, 0x65, 0x73, 0x69, 252 | 0x7a, 0x69, 0x6e, 0x67, 0x2f, 0x72, 0x65, 0x73, 0x69, 0x7a, 0x65, 0x2f, 253 | 0x52, 0x65, 0x73, 0x69, 0x7a, 0x65, 0x4e, 0x65, 0x61, 0x72, 0x65, 0x73, 254 | 0x74, 0x4e, 0x65, 0x69, 0x67, 0x68, 0x62, 0x6f, 0x72, 0x00, 0x00, 0x00, 255 | 0x04, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 256 | 0x20, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x92, 0xfd, 0xff, 0xff, 257 | 0x14, 0x00, 0x00, 0x00, 0x84, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 258 | 0x00, 0x00, 0x00, 0x02, 0x04, 0x01, 0x00, 0x00, 0x0c, 0xfd, 0xff, 0xff, 259 | 0x08, 0x00, 0x00, 0x00, 0x48, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 260 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 261 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 262 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 263 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 264 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 265 | 0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0xa0, 0xed, 0x84, 0x3a, 266 | 0x61, 0x5f, 0x15, 0x3b, 0x8e, 0x36, 0x10, 0x3b, 0x15, 0x03, 0xe3, 0x3a, 267 | 0x73, 0x57, 0x9e, 0x3a, 0x32, 0x52, 0xfb, 0x3a, 0xda, 0x85, 0xbe, 0x3a, 268 | 0x0f, 0xef, 0x02, 0x3b, 0x87, 0x00, 0x00, 0x00, 0x6d, 0x6f, 0x64, 0x65, 269 | 0x6c, 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x63, 0x6f, 270 | 0x6e, 0x76, 0x32, 0x64, 0x2f, 0x52, 0x65, 0x6c, 0x75, 0x3b, 0x6d, 0x6f, 271 | 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 272 | 0x63, 0x6f, 0x6e, 0x76, 0x32, 0x64, 0x2f, 0x42, 0x69, 0x61, 0x73, 0x41, 273 | 0x64, 0x64, 0x3b, 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x71, 274 | 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x63, 0x6f, 0x6e, 
0x76, 0x32, 0x64, 0x2f, 275 | 0x43, 0x6f, 0x6e, 0x76, 0x32, 0x44, 0x3b, 0x6d, 0x6f, 0x64, 0x65, 0x6c, 276 | 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x63, 0x6f, 0x6e, 277 | 0x76, 0x32, 0x64, 0x2f, 0x42, 0x69, 0x61, 0x73, 0x41, 0x64, 0x64, 0x2f, 278 | 0x52, 0x65, 0x61, 0x64, 0x56, 0x61, 0x72, 0x69, 0x61, 0x62, 0x6c, 0x65, 279 | 0x4f, 0x70, 0x2f, 0x72, 0x65, 0x73, 0x6f, 0x75, 0x72, 0x63, 0x65, 0x00, 280 | 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0xb2, 0xfe, 0xff, 0xff, 281 | 0x14, 0x00, 0x00, 0x00, 0x88, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 282 | 0x00, 0x00, 0x00, 0x09, 0xe4, 0x00, 0x00, 0x00, 0x2c, 0xfe, 0xff, 0xff, 283 | 0x08, 0x00, 0x00, 0x00, 0x4c, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 284 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 285 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 286 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 287 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 288 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 289 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 290 | 0x09, 0xe7, 0x15, 0x3b, 0x5a, 0x72, 0xa8, 0x3b, 0xdd, 0xa0, 0xa2, 0x3b, 291 | 0x0a, 0x00, 0x80, 0x3b, 0x9e, 0x8f, 0x32, 0x3b, 0xed, 0xb4, 0x8d, 0x3b, 292 | 0x05, 0xda, 0x56, 0x3b, 0x45, 0xa7, 0x93, 0x3b, 0x61, 0x00, 0x00, 0x00, 293 | 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 0x71, 0x75, 0x61, 0x6e, 294 | 0x74, 0x5f, 0x63, 0x6f, 0x6e, 0x76, 0x32, 0x64, 0x2f, 0x43, 0x6f, 0x6e, 295 | 0x76, 0x32, 0x44, 0x3b, 0x6d, 0x6f, 0x64, 0x65, 0x6c, 0x5f, 0x31, 0x2f, 296 | 0x71, 0x75, 0x61, 0x6e, 0x74, 0x5f, 0x63, 0x6f, 0x6e, 0x76, 0x32, 0x64, 297 | 0x2f, 0x4c, 0x61, 0x73, 0x74, 0x56, 0x61, 0x6c, 0x75, 0x65, 0x51, 0x75, 298 | 0x61, 0x6e, 0x74, 0x2f, 0x46, 0x61, 0x6b, 0x65, 0x51, 0x75, 0x61, 0x6e, 299 | 0x74, 0x57, 0x69, 0x74, 0x68, 0x4d, 0x69, 0x6e, 0x4d, 0x61, 0x78, 0x56, 300 | 0x61, 0x72, 0x73, 0x50, 0x65, 0x72, 0x43, 0x68, 0x61, 0x6e, 0x6e, 0x65, 301 | 0x6c, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 302 | 0x08, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 303 | 0xbe, 0xff, 0xff, 0xff, 0x14, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 304 | 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x24, 0x00, 0x00, 0x00, 305 | 0xb0, 0xff, 0xff, 0xff, 0x15, 0x00, 0x00, 0x00, 0x6d, 0x6f, 0x64, 0x65, 306 | 0x6c, 0x5f, 0x31, 0x2f, 0x66, 0x6c, 0x61, 0x74, 0x74, 0x65, 0x6e, 0x2f, 307 | 0x43, 0x6f, 0x6e, 0x73, 0x74, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 308 | 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0e, 0x00, 0x18, 0x00, 0x14, 0x00, 309 | 0x13, 0x00, 0x0c, 0x00, 0x08, 0x00, 0x04, 0x00, 0x0e, 0x00, 0x00, 0x00, 310 | 0x18, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 311 | 0x00, 0x00, 0x00, 0x02, 0x30, 0x00, 0x00, 0x00, 0x04, 0x00, 0x04, 0x00, 312 | 0x04, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00, 0x6d, 0x6f, 0x64, 0x65, 313 | 0x6c, 0x5f, 0x31, 0x2f, 0x72, 0x65, 0x73, 0x69, 0x7a, 0x69, 0x6e, 0x67, 314 | 0x2f, 0x72, 0x65, 0x73, 0x69, 0x7a, 0x65, 0x2f, 0x73, 0x69, 0x7a, 0x65, 315 | 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 316 | 0x14, 0x00, 0x1c, 0x00, 0x18, 0x00, 0x17, 0x00, 0x10, 0x00, 0x0c, 0x00, 317 | 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x14, 0x00, 0x00, 0x00, 318 | 0x18, 0x00, 0x00, 0x00, 0x34, 0x00, 0x00, 0x00, 0x50, 0x00, 0x00, 0x00, 319 | 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 0x50, 0x00, 0x00, 0x00, 320 | 0x04, 0x00, 
0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x7c, 0x00, 0x00, 0x00, 321 | 0x81, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x0c, 0x00, 322 | 0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x04, 0x00, 0x0c, 0x00, 0x00, 0x00, 323 | 0x08, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 324 | 0x80, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x01, 0x00, 0x00, 0x00, 325 | 0x03, 0x03, 0xe3, 0x3e, 0x07, 0x00, 0x00, 0x00, 0x69, 0x6e, 0x70, 0x75, 326 | 0x74, 0x5f, 0x31, 0x00, 0x04, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 327 | 0x7c, 0x00, 0x00, 0x00, 0x81, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 328 | 0x06, 0x00, 0x00, 0x00, 0x7c, 0x00, 0x00, 0x00, 0x5c, 0x00, 0x00, 0x00, 329 | 0x48, 0x00, 0x00, 0x00, 0x38, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00, 330 | 0x04, 0x00, 0x00, 0x00, 0xa8, 0xff, 0xff, 0xff, 0x0e, 0x00, 0x00, 0x00, 331 | 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0e, 0xb8, 0xff, 0xff, 0xff, 332 | 0x09, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 333 | 0x0c, 0x00, 0x0c, 0x00, 0x0b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 334 | 0x0c, 0x00, 0x00, 0x00, 0x16, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x16, 335 | 0xe0, 0xff, 0xff, 0xff, 0x11, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 336 | 0x00, 0x00, 0x00, 0x11, 0xf0, 0xff, 0xff, 0xff, 0x03, 0x00, 0x00, 0x00, 337 | 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x0c, 0x00, 0x10, 0x00, 338 | 0x0f, 0x00, 0x00, 0x00, 0x08, 0x00, 0x04, 0x00, 0x0c, 0x00, 0x00, 0x00, 339 | 0x61, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x61 340 | }; 341 | 342 | #endif 343 | -------------------------------------------------------------------------------- /ml_audio_classifier_example_for_pico.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "ml_audio_classifier_example_for_pico.ipynb", 7 | "provenance": [], 8 | "collapsed_sections": [] 9 | }, 10 | "kernelspec": { 11 | "display_name": "Python 3", 12 | "name": "python3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "w7gY4ELwlmOk" 23 | }, 24 | "source": [ 25 | "\n", 26 | "\"Arm" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "id": "m174xspUl7LC" 33 | }, 34 | "source": [ 35 | "# ML Audio Classifier Example for Pico\n", 36 | "\n", 37 | "```\n", 38 | "Copyright (c) 2021 Arm Limited and Contributors. All rights reserved.\n", 39 | "\n", 40 | "SPDX-License-Identifier: Apache-2.0\n", 41 | "```\n", 42 | "\n", 43 | "Authors: [Sandeep Mistry](https://twitter.com/sandeepmistry), [Henri Woodcock](https://twitter.com/henriwoodcock) from the [Arm Software Developers team](https://twitter.com/armsoftwaredev)" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": { 49 | "id": "_h5zXsutmdCS" 50 | }, 51 | "source": [ 52 | "## Introduction\n", 53 | "\n", 54 | "This tutorial will guide you through how to train a TensorFlow-based audio classification Machine Learning (ML) model to detect a fire alarm sound. We’ll show you how to use [TensorFlow Lite for Microcontrollers](https://www.tensorflow.org/lite/microcontrollers) with Arm [CMSIS-NN](https://arm-software.github.io/CMSIS_5/NN/html/index.html) accelerated kernels to deploy the ML model to an [Arm Cortex-M0+](https://developer.arm.com/ip-products/processors/cortex-m/cortex-m0-plus) based microcontroller (MCU) board for local on-device ML inferencing.
Arm’s [CMSIS-DSP](https://arm-software.github.io/CMSIS_5/DSP/html/index.html) library, which provides optimized Digital Signal Processing (DSP) function implementations for [Arm Cortex-M](https://developer.arm.com/ip-products/processors/cortex-m) processors, will also be used to extract features from the real-time audio data prior to inference.\n", 55 | "\n", 56 | "While this guide focuses on detecting a fire alarm sound, it can be adapted for other sound classification tasks. You may also need to adapt the feature extraction stages and/or adjust the ML model architecture for your use case.\n" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": { 62 | "id": "6w60BKbnzzrP" 63 | }, 64 | "source": [ 65 | "## What you need to get started\n", 66 | "\n", 67 | "### Development Environment\n", 68 | "\n", 69 | " * [Google Chrome](https://www.google.com/intl/en_ca/chrome/)\n", 70 | " * [Google Colab](https://colab.research.google.com/notebooks/)\n", 71 | " * A [Google Account](https://www.google.com/account/about/)\n", 72 | "\n", 73 | "### Hardware\n", 74 | "\n", 75 | "You’ll need one of the following development boards, both based on [Raspberry Pi’s RP2040 MCU chip](https://www.raspberrypi.org/products/rp2040/), which was released in early 2021.\n", 76 | "\n", 77 | "\n", 78 | "#### SparkFun RP2040 MicroMod and MicroMod ML Carrier\n", 79 | "\n", 80 | "This is recommended for people who are new to electronics and microcontrollers. While it does cost a bit more than the option below, it is easier to assemble and does not require a soldering iron, soldering skills, or knowledge of how to wire up breadboards.\n", 81 | "\n", 82 | " * [SparkFun MicroMod RP2040 Processor](https://www.sparkfun.com/products/17720)\n", 83 | " * For the brains of the operation! Contains Raspberry Pi’s RP2040 MCU and 16MB of flash storage\n", 84 | " * [SparkFun MicroMod Machine Learning Carrier Board](https://www.sparkfun.com/products/16400)\n", 85 | " * Enables USB connectivity, and provides a built-in microphone, IMU and camera connector\n", 86 | " * A USB-C cable to connect the board to your computer\n", 87 | " * A Phillips screwdriver\n", 88 | "\n", 89 | "#### Raspberry Pi Pico and PDM microphone board\n", 90 | "\n", 91 | "This option is slightly lower in cost; however, it requires a soldering iron and knowledge of how to wire a breadboard with electronic components.\n", 92 | "\n", 93 | " * [Raspberry Pi Pico](https://www.raspberrypi.org/products/raspberry-pi-pico/)\n", 94 | " * [Adafruit PDM MEMS Microphone Breakout](https://www.adafruit.com/product/3492)\n", 95 | " * Half-size or full-size breadboard\n", 96 | " * Jumper wires\n", 97 | " * A micro USB-B cable to connect the board to your computer\n", 98 | " * Soldering iron\n", 99 | "\n", 100 | "#### More information\n", 101 | "\n", 102 | "Both of the options above will allow you to collect real-time 16 kHz audio from a digital microphone and process the audio signal in real-time on the development board’s Arm Cortex-M0+ processor, which operates at 125 MHz.
The application running on the Arm Cortex-M0+ will have a Digital Signal Processing (DSP) stage to extract features from the audio signal; the extracted features will then be fed into a neural network to perform a classification task to determine if a fire alarm sound is present in the board’s environment.\n", 103 | "\n", 104 | "### Hardware Setup\n", 105 | "\n", 106 | "#### SparkFun MicroMod RP2040\n", 107 | "\n", 108 | "For assembly, remove the screw on the carrier board, slide the MicroMod RP2040 Processor board into the socket at an angle, and secure it in place with the screw. See the [MicroMod Machine Learning Carrier Board Hookup Guide](https://learn.sparkfun.com/tutorials/micromod-machine-learning-carrier-board-hookup-guide#hardware-hookup) for more details.\n", 109 | "\n", 110 | "\n", 111 | "#### Raspberry Pi Pico\n", 112 | "\n", 113 | "Follow the [Hardware Setup section of the \"Create a USB Microphone with the Raspberry Pi Pico\"](https://www.hackster.io/sandeep-mistry/create-a-usb-microphone-with-the-raspberry-pi-pico-cc9bd5#toc-hardware-setup-5) guide for assembly instructions.\n" 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": { 119 | "id": "Ojuc2yoIrA8G" 120 | }, 121 | "source": [ 122 | "## Install dependencies" 123 | ] 124 | }, 125 | { 126 | "cell_type": "markdown", 127 | "metadata": { 128 | "id": "2t4hpGoHoUR6" 129 | }, 130 | "source": [ 131 | "### Python Libraries\n", 132 | "\n", 133 | "Let's start by installing the Python library dependencies:" 134 | ] 135 | }, 136 | { 137 | "cell_type": "code", 138 | "metadata": { 139 | "id": "B6CQatks4z4p" 140 | }, 141 | "source": [ 142 | "!pip install librosa matplotlib pandas \"tensorflow==2.8.*\" \"tensorflow-io==0.24.*\" \"tensorflow-model-optimization==0.7.2\"\n", 143 | "\n", 144 | "!pip install git+https://github.com/ARM-software/CMSIS_5.git@5.8.0#egg=CMSISDSP\\&subdirectory=CMSIS/DSP/PythonWrapper" 145 | ], 146 | "execution_count": null, 147 | "outputs": [] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "metadata": { 152 | "id": "3QMTfgPBoeRB" 153 | }, 154 | "source": [ 155 | "### Command line tools\n", 156 | "\n", 157 | "Now let's install the command line tools we will need to build applications for the Raspberry Pi RP2040:" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "metadata": { 163 | "id": "Lb3DjP6lXyUf" 164 | }, 165 | "source": [ 166 | "import tensorflow as tf\n", 167 | "\n", 168 | "tf.keras.utils.get_file('cmake-3.21.0-linux-x86_64.tar.gz',\n", 169 | "    'https://github.com/Kitware/CMake/releases/download/v3.21.0/cmake-3.21.0-linux-x86_64.tar.gz',\n", 170 | "    cache_dir='./',\n", 171 | "    cache_subdir='tools',\n", 172 | "    extract=True)\n", 173 | "\n", 174 | "tf.keras.utils.get_file('gcc-arm-none-eabi-10-2020-q4-major-x86_64-linux.tar.bz2',\n", 175 | "    'https://developer.arm.com/-/media/Files/downloads/gnu-rm/10-2020q4/gcc-arm-none-eabi-10-2020-q4-major-x86_64-linux.tar.bz2',\n", 176 | "    cache_dir='./',\n", 177 | "    cache_subdir='tools',\n", 178 | "    extract=True)" 179 | ], 180 | "execution_count": null, 181 | "outputs": [] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "metadata": { 186 | "id": "TV4uI68XX_J0" 187 | }, 188 | "source": [ 189 | "!apt-get install -y xxd" 190 | ], 191 | "execution_count": null, 192 | "outputs": [] 193 | }, 194 | { 195 | "cell_type": "markdown", 196 | "metadata": { 197 | "id": "EPi1R_THovN9" 198 | }, 199 | "source": [ 200 | "Now add the downloaded and extracted tools to
the `PATH` environment variable, so we can use them later on without specifying the full path to them:" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "metadata": { 206 | "id": "9evc84YJYz2P" 207 | }, 208 | "source": [ 209 | "import os\n", 210 | "\n", 211 | "os.environ['PATH'] = f\"{os.getcwd()}/tools/cmake-3.21.0-linux-x86_64/bin:{os.environ['PATH']}\"\n", 212 | "os.environ['PATH'] = f\"{os.getcwd()}/tools/gcc-arm-none-eabi-10-2020-q4-major/bin:{os.environ['PATH']}\"" 213 | ], 214 | "execution_count": null, 215 | "outputs": [] 216 | }, 217 | { 218 | "cell_type": "markdown", 219 | "metadata": { 220 | "id": "MT78ms3GpQIv" 221 | }, 222 | "source": [ 223 | "### Raspberry Pi Pico SDK\n", 224 | "\n", 225 | "We can use `git` to get `v1.2.0` of the [Raspberry Pi Pico SDK](https://github.com/raspberrypi/pico-sdk):" 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "metadata": { 231 | "id": "9P02F36WY5uZ" 232 | }, 233 | "source": [ 234 | "%%shell\n", 235 | "git clone --branch 1.2.0 https://github.com/raspberrypi/pico-sdk.git\n", 236 | "cd pico-sdk\n", 237 | "git submodule init\n", 238 | "git submodule update" 239 | ], 240 | "execution_count": null, 241 | "outputs": [] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": { 246 | "id": "iE_gX86hpk8h" 247 | }, 248 | "source": [ 249 | "Set the `PICO_SDK_PATH` environment variable to specify the location of the `pico-sdk`:" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "metadata": { 255 | "id": "yXCHQtSWY_dS" 256 | }, 257 | "source": [ 258 | "os.environ['PICO_SDK_PATH'] = f\"{os.getcwd()}/pico-sdk\"" 259 | ], 260 | "execution_count": null, 261 | "outputs": [] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "metadata": { 266 | "id": "YzxhKhMQpzqr" 267 | }, 268 | "source": [ 269 | "**You will need to change the code cell below** to select the board you will be using for the remainder of the tutorial.\n", 270 | "\n", 271 | "By default, the `PICO_BOARD` environment variable is set to `sparkfun_micromod` for the SparkFun RP2040 MicroMod. Set the value to `pico` if you are using a Raspberry Pi Pico board." 272 | ] 273 | }, 274 | { 275 | "cell_type": "code", 276 | "metadata": { 277 | "id": "5WVJZJFUTMEC" 278 | }, 279 | "source": [ 280 | "# for SparkFun MicroMod\n", 281 | "os.environ['PICO_BOARD'] = 'sparkfun_micromod'\n", 282 | "\n", 283 | "# for Raspberry Pi Pico (uncomment next line)\n", 284 | "# os.environ['PICO_BOARD'] = 'pico'\n", 285 | "\n", 286 | "print(f\"PICO_BOARD env. var. set to '{os.environ['PICO_BOARD']}'\")" 287 | ], 288 | "execution_count": null, 289 | "outputs": [] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": { 294 | "id": "n1-rsj4vchBC" 295 | }, 296 | "source": [ 297 | "### Project Files\n", 298 | "\n", 299 | "The source code for the inference application and the Python utilities for Google Colab can also be cloned using `git`:" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "metadata": { 305 | "id": "-jDNNonWcyAL" 306 | }, 307 | "source": [ 308 | "%%shell\n", 309 | "git clone --recurse-submodules https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico.git" 310 | ], 311 | "execution_count": null, 312 | "outputs": [] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": { 317 | "id": "yWTsVPrgdL9R" 318 | }, 319 | "source": [ 320 | "For convenience, we can create symbolic links for the project files that we've cloned to the root Google Colab folder:" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "metadata": { 326 | "id": "_1JjfZ15eZX9" 327 | }, 328 | "source": [ 329 | "%%shell\n", 330 | "ln -s ml-audio-classifier-example-for-pico/colab_utils colab_utils\n", 331 | "ln -s ml-audio-classifier-example-for-pico/inference-app inference-app" 332 | ], 333 | "execution_count": null, 334 | "outputs": [] 335 | }, 336 | { 337 | "cell_type": "markdown", 338 | "metadata": { 339 | "id": "HMuv2599z8hy" 340 | }, 341 | "source": [ 342 | "## Baseline model\n", 343 | "\n", 344 | "We will start by training a generic sound classifier model with TensorFlow using the [ESC-50: Dataset for Environmental Sound Classification](https://github.com/karolpiczak/ESC-50). This will allow us to create a more generic model that is trained on a broader dataset, and then use Transfer Learning later on to fine-tune it for our specific audio classification task.\n", 345 | "\n", 346 | "This model will be trained on the ESC-50 dataset, which contains 50 types of sounds; each sound category has 40 audio files that are 5 seconds each in length. Each audio file will be split into 1-second soundbites, and any soundbites that contain pure silence will be discarded.\n", 347 | "\n", 348 | "### Prepare dataset\n", 349 | "\n", 350 | "#### Download and extract\n", 351 | "\n", 352 | "The ESC-50 dataset will be downloaded and extracted to the `datasets` folder using the [`tf.keras.utils.get_file(...)`](https://www.tensorflow.org/api_docs/python/tf/keras/utils/get_file) function."
353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "metadata": { 358 | "id": "EVM5_CQdoEAZ" 359 | }, 360 | "source": [ 361 | "import tensorflow as tf\n", 362 | "\n", 363 | "tf.keras.utils.get_file('esc-50.zip',\n", 364 | "    'https://github.com/karoldvl/ESC-50/archive/master.zip',\n", 365 | "    cache_dir='./',\n", 366 | "    cache_subdir='datasets',\n", 367 | "    extract=True)" 368 | ], 369 | "execution_count": null, 370 | "outputs": [] 371 | }, 372 | { 373 | "cell_type": "markdown", 374 | "metadata": { 375 | "id": "tH8NUBY90nlH" 376 | }, 377 | "source": [ 378 | "#### Load dataset metadata\n", 379 | "\n", 380 | "Now we will use the [pandas](https://pandas.pydata.org/) library to read the `datasets/ESC-50-master/meta/esc50.csv` file, which contains the metadata for the audio files in the ESC-50 dataset:" 381 | ] 382 | }, 383 | { 384 | "cell_type": "code", 385 | "metadata": { 386 | "id": "UJnoX0I70r2j" 387 | }, 388 | "source": [ 389 | "import pandas as pd\n", 390 | "\n", 391 | "esc50_csv = './datasets/ESC-50-master/meta/esc50.csv'\n", 392 | "base_data_path = './datasets/ESC-50-master/audio/'\n", 393 | "\n", 394 | "df = pd.read_csv(esc50_csv)\n", 395 | "df.head()" 396 | ], 397 | "execution_count": null, 398 | "outputs": [] 399 | }, 400 | { 401 | "cell_type": "markdown", 402 | "metadata": { 403 | "id": "5SMgChs01Jry" 404 | }, 405 | "source": [ 406 | "Then add a new column with the `fullpath` of the wave files:" 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "metadata": { 412 | "id": "mioGTM5M1TQN" 413 | }, 414 | "source": [ 415 | "from os import path\n", 416 | "\n", 417 | "base_data_path = './datasets/ESC-50-master/audio/'\n", 418 | "\n", 419 | "df['fullpath'] = df['filename'].map(lambda x: path.join(base_data_path, x))\n", 420 | "\n", 421 | "df.head()" 422 | ], 423 | "execution_count": null, 424 | "outputs": [] 425 | }, 426 | { 427 | "cell_type": "markdown", 428 | "metadata": { 429 | "id": "LqnpuM-G2jDr" 430 | }, 431 | "source": [ 432 | "#### Load wave file data\n", 433 | "\n", 434 | "We can then define a new function named `load_wav` to load audio samples from a wave file using TensorFlow's [`tf.io.read_file(...)`](https://www.tensorflow.org/api_docs/python/tf/io/read_file) and [`tf.audio.decode_wav(...)`](https://www.tensorflow.org/api_docs/python/tf/audio/decode_wav) APIs.
The [`tfio.audio.resample(...)`](https://www.tensorflow.org/io/api_docs/python/tfio/audio/resample) API will be used to resample the audio samples at the specified sampling rate.\n", 435 | "\n", 436 | "[librosa](https://librosa.org/)'s [`load(...)`](https://librosa.org/doc/main/generated/librosa.load.html) API will be used as a fallback if TensorFlow is unable to decode the wave file.\n" 437 | ] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "metadata": { 442 | "id": "70RwxZs12iPZ" 443 | }, 444 | "source": [ 445 | "import tensorflow_io as tfio\n", 446 | "import librosa\n", 447 | "\n", 448 | "def load_wav(filename, desired_sample_rate, desired_channels):\n", 449 | "    try:\n", 450 | "        file_contents = tf.io.read_file(filename)\n", 451 | "        wav, sample_rate = tf.audio.decode_wav(file_contents, desired_channels=desired_channels)\n", 452 | "        wav = tf.squeeze(wav, axis=-1)\n", 453 | "    except:\n", 454 | "        # fall back to librosa if the wav file can't be read with TF\n", 455 | "        filename = tf.cast(filename, tf.string)\n", 456 | "        wav, sample_rate = librosa.load(filename.numpy().decode('utf-8'), sr=None, mono=(desired_channels == 1))\n", 457 | "\n", 458 | "    wav = tfio.audio.resample(wav, rate_in=tf.cast(sample_rate, dtype=tf.int64), rate_out=tf.cast(desired_sample_rate, dtype=tf.int64))\n", 459 | "\n", 460 | "    return wav" 461 | ], 462 | "execution_count": null, 463 | "outputs": [] 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "metadata": { 468 | "id": "P_2AYbxi5oIp" 469 | }, 470 | "source": [ 471 | "Now let's load the first wave file, which is the sound of a dog barking, from the pandas `DataFrame`, and plot it over time using `matplotlib`. The [`IPython.display.Audio(...)`](https://ipython.org/ipython-doc/3/api/generated/IPython.display.html#IPython.display.Audio) API can be used to play back the audio samples inside the notebook.\n", 472 | "\n", 473 | "\n" 474 | ] 475 | }, 476 | { 477 | "cell_type": "code", 478 | "metadata": { 479 | "id": "YgLhGyi75bvI" 480 | }, 481 | "source": [ 482 | "import matplotlib.pyplot as plt\n", 483 | "from IPython import display\n", 484 | "\n", 485 | "sample_rate = 16000\n", 486 | "channels = 1\n", 487 | "\n", 488 | "test_wav_file_path = df['fullpath'][0]\n", 489 | "test_wav_data = load_wav(test_wav_file_path, sample_rate, channels)\n", 490 | "\n", 491 | "plt.plot(test_wav_data)\n", 492 | "plt.show()\n", 493 | "\n", 494 | "display.Audio(test_wav_data, rate=sample_rate)" 495 | ], 496 | "execution_count": null, 497 | "outputs": [] 498 | }, 499 | { 500 | "cell_type": "markdown", 501 | "metadata": { 502 | "id": "FLVRrhvt7E9T" 503 | }, 504 | "source": [ 505 | "If we zoom in and only plot samples `32000` to `48000`, we can get a closer plot of the audio samples in the wave file in the 2 to 3 second span:" 506 | ] 507 | }, 508 | { 509 | "cell_type": "code", 510 | "metadata": { 511 | "id": "gvKn45Pl6X4B" 512 | }, 513 | "source": [ 514 | "_ = plt.plot(test_wav_data[32000:48000])" 515 | ], 516 | "execution_count": null, 517 | "outputs": [] 518 | }, 519 | { 520 | "cell_type": "markdown", 521 | "metadata": { 522 | "id": "sYg6G3ivfOPq" 523 | }, 524 | "source": [ 525 | "We can then use the [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) TensorFlow API to create a pipeline that loads all wave file data from the dataset."
526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "metadata": { 531 | "id": "hDr615fNfLdp" 532 | }, 533 | "source": [ 534 | "fullpaths = df['fullpath']\n", 535 | "targets = df['target']\n", 536 | "folds = df['fold']\n", 537 | "\n", 538 | "fullpaths_ds = tf.data.Dataset.from_tensor_slices((fullpaths, targets, folds))\n", 539 | "fullpaths_ds.element_spec" 540 | ], 541 | "execution_count": null, 542 | "outputs": [] 543 | }, 544 | { 545 | "cell_type": "markdown", 546 | "metadata": { 547 | "id": "v2IJbEHPf2ok" 548 | }, 549 | "source": [ 550 | "Map each `fullpath` value to wave file samples:" 551 | ] 552 | }, 553 | { 554 | "cell_type": "code", 555 | "metadata": { 556 | "id": "u-A6jYUz6lfI" 557 | }, 558 | "source": [ 559 | "def load_wav_for_map(fullpath, label, fold):\n", 560 | "    wav = tf.py_function(load_wav, [fullpath, sample_rate, channels], tf.float32)\n", 561 | "\n", 562 | "    return wav, label, fold\n", 563 | "\n", 564 | "wav_ds = fullpaths_ds.map(load_wav_for_map)\n", 565 | "wav_ds.element_spec" 566 | ], 567 | "execution_count": null, 568 | "outputs": [] 569 | }, 570 | { 571 | "cell_type": "markdown", 572 | "metadata": { 573 | "id": "V2RD4zXVRAJ2" 574 | }, 575 | "source": [ 576 | "#### Split wave file data\n", 577 | "\n", 578 | "We would like to train the model on 1-second soundbites, so we must split the 5 seconds of audio per item in the ESC-50 dataset into slices of 16000 samples. We will also stride the original audio samples `4000` samples at a time, and filter out any sound chunks that contain pure silence." 579 | ] 580 | }, 581 | { 582 | "cell_type": "code", 583 | "metadata": { 584 | "id": "UnDQ_PDERIJl" 585 | }, 586 | "source": [ 587 | "@tf.function\n", 588 | "def split_wav(wav, width, stride):\n", 589 | "    return tf.map_fn(fn=lambda t: wav[t * stride:t * stride + width], elems=tf.range((tf.shape(wav)[0] - width) // stride), fn_output_signature=tf.float32)\n", 590 | "\n", 591 | "@tf.function\n", 592 | "def wav_not_empty(wav):\n", 593 | "    return tf.experimental.numpy.any(wav)\n", 594 | "\n", 595 | "def split_wav_for_flat_map(wav, label, fold):\n", 596 | "    wavs = split_wav(wav, width=16000, stride=4000)\n", 597 | "    labels = tf.repeat(label, tf.shape(wavs)[0])\n", 598 | "    folds = tf.repeat(fold, tf.shape(wavs)[0])\n", 599 | "\n", 600 | "    return tf.data.Dataset.from_tensor_slices((wavs, labels, folds))\n", 601 | "\n", 602 | "split_wav_ds = wav_ds.flat_map(split_wav_for_flat_map)\n", 603 | "split_wav_ds = split_wav_ds.filter(lambda x, y, z: wav_not_empty(x))" 604 | ], 605 | "execution_count": null, 606 | "outputs": [] 607 | }, 608 | { 609 | "cell_type": "markdown", 610 | "metadata": { 611 | "id": "pyiCY4I-9AFS" 612 | }, 613 | "source": [ 614 | "Let's plot the first 5 soundbites over time using `matplotlib`:" 615 | ] 616 | }, 617 | { 618 | "cell_type": "code", 619 | "metadata": { 620 | "id": "qeRO3Z4khchs" 621 | }, 622 | "source": [ 623 | "for wav, _, _ in split_wav_ds.take(5):\n", 624 | "    _ = plt.plot(wav)\n", 625 | "    plt.show()" 626 | ], 627 | "execution_count": null, 628 | "outputs": [] 629 | }, 630 | { 631 | "cell_type": "markdown", 632 | "metadata": { 633 | "id": "XdV6yAYYhsBw" 634 | }, 635 | "source": [ 636 | "#### Create Spectrograms\n", 637 | "\n", 638 | "Rather than passing the time series data directly into our TensorFlow model, we will transform the audio data into an audio spectrogram representation.
This will create a 2D representation of the audio signal’s frequency content over time.\n", 639 | "\n", 640 | "The input audio signal we will use will have a sampling rate of 16 kHz, which means one second of audio will contain 16,000 samples. Using TensorFlow’s [`tf.signal.stft(...)`](https://www.tensorflow.org/api_docs/python/tf/signal/stft) function we can transform a 1-second audio signal into a 2D tensor representation. We will choose a frame length of 256 and a frame step of 128, so the output of this feature extraction stage will be a Tensor that has a shape of `(124, 129)`: 124 frames, since `1 + (16000 - 256) // 128 = 124`, and 129 frequency bins, since `256 // 2 + 1 = 129`.\n" 641 | ] 642 | }, 643 | { 644 | "cell_type": "code", 645 | "metadata": { 646 | "id": "PzOXIaNkh9jW" 647 | }, 648 | "source": [ 649 | "@tf.function\n", 650 | "def create_spectrogram(samples):\n", 651 | "    return tf.abs(\n", 652 | "        tf.signal.stft(samples, frame_length=256, frame_step=128)\n", 653 | "    )" 654 | ], 655 | "execution_count": null, 656 | "outputs": [] 657 | }, 658 | { 659 | "cell_type": "markdown", 660 | "metadata": { 661 | "id": "Ts4rVQvG9kgL" 662 | }, 663 | "source": [ 664 | "Let's take the same 2 to 3 second interval of the first dog barking wave file and create its spectrogram representation:" 665 | ] 666 | }, 667 | { 668 | "cell_type": "code", 669 | "metadata": { 670 | "id": "NuGcCinfilFs" 671 | }, 672 | "source": [ 673 | "spectrogram = create_spectrogram(test_wav_data[32000:48000])\n", 674 | "\n", 675 | "spectrogram.shape" 676 | ], 677 | "execution_count": null, 678 | "outputs": [] 679 | }, 680 | { 681 | "cell_type": "markdown", 682 | "metadata": { 683 | "id": "mFq1Mpq_-Ecc" 684 | }, 685 | "source": [ 686 | "We can then create a `plot_spectrogram` function to plot the spectrogram representation using `matplotlib`:" 687 | ] 688 | }, 689 | { 690 | "cell_type": "code", 691 | "metadata": { 692 | "id": "hrj38Jdiig3H" 693 | }, 694 | "source": [ 695 | "import numpy as np\n", 696 | "\n", 697 | "def plot_spectrogram(spectrogram, vmax=None):\n", 698 | "    transposed_spectrogram = tf.transpose(spectrogram)\n", 699 | "\n", 700 | "    fig = plt.figure(figsize=(8,6))\n", 701 | "    height = transposed_spectrogram.shape[0]\n", 702 | "    X = np.arange(transposed_spectrogram.shape[1])\n", 703 | "    Y = np.arange(height * int(sample_rate / 256), step=int(sample_rate / 256))\n", 704 | "\n", 705 | "    im = plt.pcolormesh(X, Y, transposed_spectrogram, vmax=vmax)\n", 706 | "\n", 707 | "    fig.colorbar(im)\n", 708 | "    plt.show()\n", 709 | "\n", 710 | "\n", 711 | "plot_spectrogram(spectrogram)" 712 | ], 713 | "execution_count": null, 714 | "outputs": [] 715 | }, 716 | { 717 | "cell_type": "markdown", 718 | "metadata": { 719 | "id": "THC4DDLg-SyA" 720 | }, 721 | "source": [ 722 | "Then we can map each split wave item to a spectrogram:" 723 | ] 724 | }, 725 | { 726 | "cell_type": "code", 727 | "metadata": { 728 | "id": "Xw-KTABdvSpG" 729 | }, 730 | "source": [ 731 | "def create_spectrogram_for_map(samples, label, fold):\n", 732 | "    return create_spectrogram(samples), label, fold\n", 733 | "\n", 734 | "spectrograms_ds = split_wav_ds.map(create_spectrogram_for_map)\n", 735 | "spectrograms_ds.element_spec" 736 | ], 737 | "execution_count": null, 738 | "outputs": [] 739 | }, 740 | { 741 | "cell_type": "markdown", 742 | "metadata": { 743 | "id": "nMFP7b-l-kOb" 744 | }, 745 | "source": [ 746 | "Let's plot the first 5 spectrograms in the dataset:" 747 | ] 748 | }, 749 | { 750 | "cell_type": "code", 751 | "metadata": { 752 | "id": "Cc3fqcs6lMal" 753 | }, 754 | "source": [ 755 | "for s, _, _ in spectrograms_ds.take(5):\n", 756 | "    plot_spectrogram(s)" 757 | ], 758 | "execution_count": null, 759 | "outputs": [] 760 | }, 761 | { 762 | "cell_type": "markdown", 763 | "metadata": { 764 | "id": "OrCmiQVyQ_NB" 765 | }, 766 | "source": [ 767 | "### Split Dataset\n", 768 | "\n", 769 | "Before we start training the ML classifier model, we must split the dataset up into three parts: training, validation, and test.\n", 770 | "\n", 771 | "We will use the same technique as in TensorFlow's [Transfer learning with YAMNet for environmental sound classification](https://www.tensorflow.org/tutorials/audio/transfer_learning_audio#split_the_data) guide, and use the `fold` column of the ESC-50 dataset to determine the split.\n", 772 | "\n", 773 | "Before splitting the dataset, let's set a random seed for reproducibility:" 774 | ] 775 | }, 776 | { 777 | "cell_type": "code", 778 | "metadata": { 779 | "id": "T3bR-BHBRCYi" 780 | }, 781 | "source": [ 782 | "import numpy as np\n", 783 | "import tensorflow as tf\n", 784 | "\n", 785 | "# Set seed for experiment reproducibility\n", 786 | "random_seed = 42\n", 787 | "tf.random.set_seed(random_seed)\n", 788 | "np.random.seed(random_seed)" 789 | ], 790 | "execution_count": null, 791 | "outputs": [] 792 | }, 793 | { 794 | "cell_type": "markdown", 795 | "metadata": { 796 | "id": "kq3kDVG9_nhW" 797 | }, 798 | "source": [ 799 | "Entries with a `fold` value of less than 4 will be used for training, those with a `fold` value of 4 will be used for validation, and the remaining items will be used for testing.\n", 800 | "\n", 801 | "The `fold` column will be removed as it is no longer needed, and the dimensions of the spectrogram shape will be expanded from `(124, 129)` to `(124, 129, 1)`. The training items will also be shuffled." 802 | ] 803 | }, 804 | { 805 | "cell_type": "code", 806 | "metadata": { 807 | "id": "SyclwrPnMEIh" 808 | }, 809 | "source": [ 810 | "cached_ds = spectrograms_ds.cache()\n", 811 | "\n", 812 | "train_ds = cached_ds.filter(lambda spectrogram, label, fold: fold < 4)\n", 813 | "val_ds = cached_ds.filter(lambda spectrogram, label, fold: fold == 4)\n", 814 | "test_ds = cached_ds.filter(lambda spectrogram, label, fold: fold > 4)\n", 815 | "\n", 816 | "# remove the folds column as it's no longer needed\n", 817 | "remove_fold_column = lambda spectrogram, label, fold: (tf.expand_dims(spectrogram, axis=-1), label)\n", 818 | "\n", 819 | "train_ds = train_ds.map(remove_fold_column)\n", 820 | "val_ds = val_ds.map(remove_fold_column)\n", 821 | "test_ds = test_ds.map(remove_fold_column)\n", 822 | "\n", 823 | "train_ds = train_ds.cache().shuffle(1000, seed=random_seed).batch(32).prefetch(tf.data.AUTOTUNE)\n", 824 | "val_ds = val_ds.cache().batch(32).prefetch(tf.data.AUTOTUNE)\n", 825 | "test_ds = test_ds.cache().batch(32).prefetch(tf.data.AUTOTUNE)" 826 | ], 827 | "execution_count": null, 828 | "outputs": [] 829 | }, 830 | { 831 | "cell_type": "markdown", 832 | "metadata": { 833 | "id": "LIdHUP8mRF-9" 834 | }, 835 | "source": [ 836 | "### Train Model\n", 837 | "\n", 838 | "Now that we have the features extracted from the audio signal, we can create a model using TensorFlow’s Keras API. The model will consist of 8 layers:\n", 839 | "\n", 840 | " 1. An input layer.\n", 841 | " 1. A preprocessing layer that will resize the input tensor from 124x129x1 to 32x32x1.\n", 842 | " 1. A normalization layer that will scale the input values between -1 and 1.\n", 843 | " 1. A 2D convolution layer with 8 filters, a kernel size of 8x8, a stride of 2x2, and a ReLU activation function.\n", 844 | " 1. 
A 2D max pooling layer with a pool size of 2x2.\n", 845 | " 1. A flatten layer to flatten the 2D data to 1D.\n", 846 | " 1. A dropout layer that will help reduce overfitting during training.\n", 847 | " 1. A dense layer with 50 outputs and a softmax activation function, which outputs the likelihood of each sound category (between 0 and 1).\n" 848 | ] 849 | }, 850 | { 851 | "cell_type": "markdown", 852 | "metadata": { 853 | "id": "pbTAU5yZA43R" 854 | }, 855 | "source": [ 856 | "Before we build the model using [TensorFlow's Keras APIs](https://www.tensorflow.org/api_docs/python/tf/keras), we will create a normalization layer and adapt it to all of the spectrogram dataset items." 857 | ] 858 | }, 859 | { 860 | "cell_type": "code", 861 | "metadata": { 862 | "id": "fFLHe9y-iGmj" 863 | }, 864 | "source": [ 865 | "for spectrogram, _, _ in cached_ds.take(1):\n", 866 | "    input_shape = tf.expand_dims(spectrogram, axis=-1).shape\n", 867 | "    print('Input shape:', input_shape)\n", 868 | "\n", 869 | "norm_layer = tf.keras.layers.experimental.preprocessing.Normalization()\n", 870 | "norm_layer.adapt(cached_ds.map(lambda x, y, z: tf.reshape(x, input_shape)))" 871 | ], 872 | "execution_count": null, 873 | "outputs": [] 874 | }, 875 | { 876 | "cell_type": "markdown", 877 | "metadata": { 878 | "id": "W3N5rys9EF8B" 879 | }, 880 | "source": [ 881 | "Define a sequential 8-layer model as described above:" 882 | ] 883 | }, 884 | { 885 | "cell_type": "code", 886 | "metadata": { 887 | "id": "XyO5ilzPPePi" 888 | }, 889 | "source": [ 890 | "baseline_model = tf.keras.models.Sequential([\n", 891 | "    tf.keras.layers.Input(shape=input_shape),\n", 892 | "    tf.keras.layers.experimental.preprocessing.Resizing(32, 32, interpolation=\"nearest\"),\n", 893 | "    norm_layer,\n", 894 | "    tf.keras.layers.Conv2D(8, kernel_size=(8,8), strides=(2, 2), activation=\"relu\"),\n", 895 | "    tf.keras.layers.MaxPool2D(pool_size=(2,2)),\n", 896 | "    tf.keras.layers.Flatten(),\n", 897 | "    tf.keras.layers.Dropout(0.25),\n", 898 | "    tf.keras.layers.Dense(50, activation='softmax')\n", 899 | "])\n", 900 | "\n", 901 | "baseline_model.summary()" 902 | ], 903 | "execution_count": null, 904 | "outputs": [] 905 | }, 906 | { 907 | "cell_type": "markdown", 908 | "metadata": { 909 | "id": "N4PgoMF8ENpu" 910 | }, 911 | "source": [ 912 | "Compile the model with `accuracy` metrics, an Adam optimizer, and a sparse categorical crossentropy loss function, and define early stopping and dynamic learning rate scheduler callbacks for training."
913 | ] 914 | }, 915 | { 916 | "cell_type": "code", 917 | "metadata": { 918 | "id": "RyTlP0G1QHD6" 919 | }, 920 | "source": [ 921 | "METRICS = [\n", 922 | "    \"accuracy\",\n", 923 | "]\n", 924 | "\n", 925 | "baseline_model.compile(\n", 926 | "    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),\n", 927 | "    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),\n", 928 | "    metrics=METRICS,\n", 929 | ")\n", 930 | "\n", 931 | "def scheduler(epoch, lr):\n", 932 | "    if epoch < 100:\n", 933 | "        return lr\n", 934 | "    else:\n", 935 | "        return lr * tf.math.exp(-0.1)\n", 936 | "\n", 937 | "callbacks = [\n", 938 | "    tf.keras.callbacks.EarlyStopping(verbose=1, patience=25),\n", 939 | "    tf.keras.callbacks.LearningRateScheduler(scheduler)\n", 940 | "]" 941 | ], 942 | "execution_count": null, 943 | "outputs": [] 944 | }, 945 | { 946 | "cell_type": "markdown", 947 | "metadata": { 948 | "id": "FBUof7CcErKE" 949 | }, 950 | "source": [ 951 | "Train the model:" 952 | ] 953 | }, 954 | { 955 | "cell_type": "code", 956 | "metadata": { 957 | "id": "lqs-O9o8QV58" 958 | }, 959 | "source": [ 960 | "EPOCHS = 250\n", 961 | "history = baseline_model.fit(\n", 962 | "    train_ds,\n", 963 | "    validation_data=val_ds,\n", 964 | "    epochs=EPOCHS,\n", 965 | "    callbacks=callbacks,\n", 966 | ")" 967 | ], 968 | "execution_count": null, 969 | "outputs": [] 970 | }, 971 | { 972 | "cell_type": "markdown", 973 | "metadata": { 974 | "id": "AjnfP2hCFB9z" 975 | }, 976 | "source": [ 977 | "Evaluate the loss and accuracy of the model on the test dataset:" 978 | ] 979 | }, 980 | { 981 | "cell_type": "code", 982 | "metadata": { 983 | "id": "sWpXhy1eQmgH" 984 | }, 985 | "source": [ 986 | "baseline_model.evaluate(test_ds)" 987 | ], 988 | "execution_count": null, 989 | "outputs": [] 990 | }, 991 | { 992 | "cell_type": "markdown", 993 | "metadata": { 994 | "id": "cnDIdyN38UVh" 995 | }, 996 | "source": [ 997 | "The baseline model has a relatively low accuracy (~24%); however, in the next steps we will use it as a starting point to fine-tune a more accurate model for our use case." 998 | ] 999 | }, 1000 | { 1001 | "cell_type": "markdown", 1002 | "metadata": { 1003 | "id": "TkW5RFLVFIQO" 1004 | }, 1005 | "source": [ 1006 | "Save the model:" 1007 | ] 1008 | }, 1009 | { 1010 | "cell_type": "code", 1011 | "metadata": { 1012 | "id": "ZmZNjn-OR-rF" 1013 | }, 1014 | "source": [ 1015 | "baseline_model.save(\"baseline_model\")" 1016 | ], 1017 | "execution_count": null, 1018 | "outputs": [] 1019 | }, 1020 | { 1021 | "cell_type": "markdown", 1022 | "metadata": { 1023 | "id": "CalvtW18FOra" 1024 | }, 1025 | "source": [ 1026 | "Create a zip file of the saved model for download purposes:" 1027 | ] 1028 | }, 1029 | { 1030 | "cell_type": "code", 1031 | "metadata": { 1032 | "id": "00l_RwHWSY-Y" 1033 | }, 1034 | "source": [ 1035 | "!zip -r baseline_model.zip baseline_model" 1036 | ], 1037 | "execution_count": null, 1038 | "outputs": [] 1039 | }, 1040 | { 1041 | "cell_type": "markdown", 1042 | "metadata": { 1043 | "id": "EopW6v06SsRu" 1044 | }, 1045 | "source": [ 1046 | "## Transfer Learning\n", 1047 | "\n", 1048 | "Now we will use Transfer Learning and change the classification head of the model to train a binary classification model for fire alarm sounds.\n", 1049 | "\n", 1050 | "Transfer Learning is the process of retraining a model that has been developed for one task to complete a new, similar task.
The idea is that the model has learned transferable \"skills\" and its weights and biases can be used in other models as a starting point.\n", 1051 | "\n", 1052 | "Transfer learning is very common in computer vision. Large companies spend weeks training models on ImageNet; this is not possible for most people, so they reuse the models built by these companies to complete their own tasks. A model designed to recognise 1000 different objects in an image can be adapted to recognise other, similar objects.\n", 1053 | "\n", 1054 | "As humans we use transfer learning too. The skills you developed to learn to walk could also be used to learn to run later on.\n", 1055 | "\n", 1056 | "In a neural network, the first few layers of a model start to perform a \"feature extraction\", such as finding shapes, edges and colours. The layers later on are used as classifiers; they take the extracted features and classify them.\n", 1057 | "\n", 1058 | "You can find more information and visualizations about this here: https://yosinski.com/deepvis.\n", 1059 | "\n", 1060 | "Because of this, we can assume the first few layers have learned quite general feature extraction techniques that can be applied to all similar tasks, and so we can freeze all of these layers. The classifier layers, however, will need to be retrained for the new classes.\n", 1061 | "\n", 1062 | "To do this, we break the process into two steps (a minimal code sketch of this recipe is shown at the end of this section):\n", 1063 | " 1. Freeze the \"backbone\" of the model and train the head with a fairly high learning rate, slowly reducing the learning rate.\n", 1064 | " 1. Unfreeze the \"backbone\" and fine-tune the model with a low learning rate.\n", 1065 | "\n", 1066 | "\n", 1067 | "### Dataset\n", 1068 | "\n", 1069 | "We have collected 10 fire alarm clips from [freesound.org](https://freesound.org/) and [BigSoundBank.com](https://bigsoundbank.com/). Background noise clips from the [SpeechCommands](https://www.tensorflow.org/datasets/catalog/speech_commands) dataset will be used for non-fire alarm sounds. This dataset is small and represents the sort of data you might expect to see in the real world. Data augmentation techniques will be used to supplement the training data we've collected."
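,
"\n",
"As a minimal sketch of the two-step recipe above (`transfer_model`, `train_ds` and `val_ds` are hypothetical placeholder names; the exact code we use later in this tutorial may differ in its details):\n",
"\n",
"```python\n",
"import tensorflow as tf\n",
"\n",
"def two_step_fine_tune(transfer_model, train_ds, val_ds):\n",
"    # Sketch only: treat everything except the new classification head as the \"backbone\".\n",
"    backbone_layers = transfer_model.layers[:-1]\n",
"\n",
"    # Step 1: freeze the backbone and train only the head, starting with a higher learning rate.\n",
"    for layer in backbone_layers:\n",
"        layer.trainable = False\n",
"    transfer_model.compile(\n",
"        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),\n",
"        loss=tf.keras.losses.BinaryCrossentropy(),\n",
"        metrics=[\"accuracy\"],\n",
"    )\n",
"    transfer_model.fit(train_ds, validation_data=val_ds, epochs=10)\n",
"\n",
"    # Step 2: unfreeze the backbone and fine-tune the whole model with a low learning rate.\n",
"    for layer in backbone_layers:\n",
"        layer.trainable = True\n",
"    transfer_model.compile(\n",
"        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),\n",
"        loss=tf.keras.losses.BinaryCrossentropy(),\n",
"        metrics=[\"accuracy\"],\n",
"    )\n",
"    transfer_model.fit(train_ds, validation_data=val_ds, epochs=10)\n",
"```\n"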
1070 | ] 1071 | }, 1072 | { 1073 | "cell_type": "markdown", 1074 | "metadata": { 1075 | "id": "APrXeNMkZmyT" 1076 | }, 1077 | "source": [ 1078 | "### Download datasets\n", 1079 | "\n", 1080 | "We've created an archive with the following wave files for you:\n", 1081 | "\n", 1082 | " * https://freesound.org/people/rayprice/sounds/155006/ ([CC BY 3.0 license](https://creativecommons.org/licenses/by/3.0/))\n", 1083 | "\n", 1084 | " * https://freesound.org/people/deleted_user_2104797/sounds/164686/ ([CC0 1.0 license](https://creativecommons.org/publicdomain/zero/1.0/))\n", 1085 | "\n", 1086 | " * https://freesound.org/people/AdamWeeden/sounds/255180/ ([CC BY 3.0 license](https://creativecommons.org/licenses/by/3.0/))\n", 1087 | "\n", 1088 | " * https://freesound.org/people/MoonlightShadow/sounds/325367/ ([CC0 1.0 license](https://creativecommons.org/publicdomain/zero/1.0/))\n", 1089 | "\n", 1090 | " * https://freesound.org/people/SpliceSound/sounds/369847/ ([CC0 1.0 license](https://creativecommons.org/publicdomain/zero/1.0/))\n", 1091 | "\n", 1092 | " * https://freesound.org/people/SpliceSound/sounds/369848/ ([CC0 1.0 license](https://creativecommons.org/publicdomain/zero/1.0/))\n", 1093 | "\n", 1094 | " * https://bigsoundbank.com/detail-0800-smoke-detector-alarm.html ([free of charge and royalty free](https://bigsoundbank.com/droit.html))\n", 1095 | "\n", 1096 | " * https://bigsoundbank.com/detail-1151-smoke-detector-alarm-2.html ([free of charge and royalty free](https://bigsoundbank.com/droit.html))\n", 1097 | "\n", 1098 | " * https://bigsoundbank.com/detail-1153-smoke-detector-alarm-3.html ([free of charge and royalty free](https://bigsoundbank.com/droit.html))\n" 1099 | ] 1100 | }, 1101 | { 1102 | "cell_type": "code", 1103 | "metadata": { 1104 | "id": "h-FUUQXMZaJQ" 1105 | }, 1106 | "source": [ 1107 | "tf.keras.utils.get_file('fire_alarms.tar.gz',\n", 1108 | "    'https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/archive/refs/heads/fire_alarms.tar.gz',\n", 1109 | "    cache_dir='./',\n", 1110 | "    cache_subdir='datasets',\n", 1111 | "    extract=True)" 1112 | ], 1113 | "execution_count": null, 1114 | "outputs": [] 1115 | }, 1116 | { 1117 | "cell_type": "code", 1118 | "metadata": { 1119 | "id": "CpS5FBOJaK0z" 1120 | }, 1121 | "source": [ 1122 | "# Since we only need the files in the _background_noise_ folder of the dataset\n", 1123 | "# use the curl command to download the archive file and then manually extract\n", 1124 | "# using the tar command, instead of using tf.keras.utils.get_file(...)\n", 1125 | "# in Python\n", 1126 | "\n", 1127 | "!mkdir -p datasets/speech_commands\n", 1128 | "!curl -L -o datasets/speech_commands_v0.02.tar.gz http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz\n", 1129 | "!tar --wildcards --directory datasets/speech_commands -xzvf datasets/speech_commands_v0.02.tar.gz './_background_noise_/*'" 1130 | ], 1131 | "execution_count": null, 1132 | "outputs": [] 1133 | }, 1134 | { 1135 | "cell_type": "markdown", 1136 | "metadata": { 1137 | "id": "2NoayTieb_WR" 1138 | }, 1139 | "source": [ 1140 | "### Load dataset\n", 1141 | "\n", 1142 | "Instead of using a pandas DataFrame to load the dataset, we will load the fire alarm files and background noise files separately. The `label` and `fold` values will be mapped manually."
1143 | ] 1144 | }, 1145 | { 1146 | "cell_type": "code", 1147 | "metadata": { 1148 | "id": "owe9kEqBcWR0" 1149 | }, 1150 | "source": [ 1151 | "fire_alarm_files_ds = tf.data.Dataset.list_files(\"datasets/ml-audio-classifier-example-for-pico-fire_alarms/*.wav\", shuffle=False)\n", 1152 | "fire_alarm_files_ds = fire_alarm_files_ds.map(lambda x: (x, 1, -1))" 1153 | ], 1154 | "execution_count": null, 1155 | "outputs": [] 1156 | }, 1157 | { 1158 | "cell_type": "code", 1159 | "metadata": { 1160 | "id": "StJH-68kbYvF" 1161 | }, 1162 | "source": [ 1163 | "background_noise_files_ds = tf.data.Dataset.list_files(\"datasets/speech_commands/_background_noise_/*.wav\", shuffle=False)\n", 1164 | "background_noise_files_ds = background_noise_files_ds.map(lambda x: (x, 0, -1))" 1165 | ], 1166 | "execution_count": null, 1167 | "outputs": [] 1168 | }, 1169 | { 1170 | "cell_type": "code", 1171 | "metadata": { 1172 | "id": "TMOJWhoFcm-5" 1173 | }, 1174 | "source": [ 1175 | "fire_alarm_wav_ds = fire_alarm_files_ds.map(load_wav_for_map)\n", 1176 | "fire_alarm_wav_ds = fire_alarm_wav_ds.cache()\n", 1177 | "\n", 1178 | "background_noise_wav_ds = background_noise_files_ds.map(load_wav_for_map)\n", 1179 | "background_noise_wav_ds = background_noise_wav_ds.cache()" 1180 | ], 1181 | "execution_count": null, 1182 | "outputs": [] 1183 | }, 1184 | { 1185 | "cell_type": "markdown", 1186 | "metadata": { 1187 | "id": "l29lUf44JB1r" 1188 | }, 1189 | "source": [ 1190 | "Let's plot and listen to the first fire alarm file:" 1191 | ] 1192 | }, 1193 | { 1194 | "cell_type": "code", 1195 | "metadata": { 1196 | "id": "2TpRuruRTwpi" 1197 | }, 1198 | "source": [ 1199 | "for wav_data, _, _ in fire_alarm_wav_ds.take(1):\n", 1200 | "    plt.plot(wav_data)\n", 1201 | "    plt.ylim([-1, 1])\n", 1202 | "    plt.show()\n", 1203 | "\n", 1204 | "    display.display(display.Audio(wav_data, rate=sample_rate))" 1205 | ], 1206 | "execution_count": null, 1207 | "outputs": [] 1208 | }, 1209 | { 1210 | "cell_type": "markdown", 1211 | "metadata": { 1212 | "id": "3i6TmVUUJeuh" 1213 | }, 1214 | "source": [ 1215 | "Then do the same for the first background noise file:" 1216 | ] 1217 | }, 1218 | { 1219 | "cell_type": "code", 1220 | "metadata": { 1221 | "id": "CIxUgvNaZWWN" 1222 | }, 1223 | "source": [ 1224 | "for wav_data, _, _ in background_noise_wav_ds.take(1):\n", 1225 | "    plt.plot(wav_data)\n", 1226 | "    plt.ylim([-1, 1])\n", 1227 | "    plt.show()\n", 1228 | "\n", 1229 | "    display.display(display.Audio(wav_data, rate=sample_rate))" 1230 | ], 1231 | "execution_count": null, 1232 | "outputs": [] 1233 | }, 1234 | { 1235 | "cell_type": "markdown", 1236 | "metadata": { 1237 | "id": "KNoTz4FpJv8i" 1238 | }, 1239 | "source": [ 1240 | "Then split the audio samples into 1-second soundbites:" 1241 | ] 1242 | }, 1243 | { 1244 | "cell_type": "code", 1245 | "metadata": { 1246 | "id": "JYJT95GgpX4p" 1247 | }, 1248 | "source": [ 1249 | "split_fire_alarm_wav_ds = fire_alarm_wav_ds.flat_map(split_wav_for_flat_map)\n", 1250 | "split_fire_alarm_wav_ds = split_fire_alarm_wav_ds.filter(lambda x, y, z: wav_not_empty(x))\n", 1251 | "\n", 1252 | "split_background_noise_wav_ds = background_noise_wav_ds.flat_map(split_wav_for_flat_map)\n", 1253 | "split_background_noise_wav_ds = split_background_noise_wav_ds.filter(lambda x, y, z: wav_not_empty(x))" 1254 | ], 1255 | "execution_count": null, 1256 | "outputs": [] 1257 | }, 1258 | { 1259 | "cell_type": "markdown", 1260 | "metadata": { 1261 | "id": "LICTjFR3J8uO" 1262 | }, 1263 | "source": [ 1264 | "TensorFlow Lite for Microcontrollers (TFLu) provides a subset of TensorFlow operations, so we are unable to use the `tf.signal.stft(...)` API we’ve used for feature extraction of the baseline model on our MCU. However, we can leverage Arm’s CMSIS-DSP library to generate spectrograms on the MCU. CMSIS-DSP contains support for both floating-point and fixed-point DSP operations, which are optimized for Arm Cortex-M processors, including the Arm Cortex-M0+ that we will be deploying the ML model to. The Arm Cortex-M0+ does not contain a floating-point unit (FPU), so it is better to leverage a 16-bit fixed-point, DSP-based feature extraction pipeline on the board.\n", 1265 | "\n", 1266 | "We can leverage CMSIS-DSP’s Python Wrapper in the notebook to perform the same operations on our training pipeline using 16-bit fixed-point math. At a high level we can replicate the TensorFlow STFT API with the following CMSIS-DSP based operations:\n", 1267 | "\n", 1268 | " 1. Manually creating a Hanning Window of length 256 using the Hanning Window formula, `w[n] = 0.5 * (1 - cos(2πn / N))` with `N = 256`, along with CMSIS-DSP’s `arm_cos_f32` API.\n", 1269 | " 1. Creating a CMSIS-DSP `arm_rfft_instance_q15` instance and initializing it using CMSIS-DSP’s `arm_rfft_init_q15` API.\n", 1270 | " 1. Looping through the audio data 256 samples at a time, with a stride of 128 (this matches the parameters we’ve passed into the TF STFT API).\n", 1271 | " 1. Multiplying the 256 samples by the Hanning Window, using CMSIS-DSP’s `arm_mult_q15` API.\n", 1272 | " 1. Calculating the FFT of the output of the previous step, using CMSIS-DSP’s `arm_rfft_q15` API.\n", 1273 | " 1. Calculating the magnitude of the previous step, using CMSIS-DSP’s `arm_cmplx_mag_q15` API.\n", 1274 | " 1. The FFT magnitude of each window represents one column of the spectrogram.\n", 1275 | " 1. Since our baseline model expects a floating-point input instead of the 16-bit quantized values we were using, the CMSIS-DSP `arm_q15_to_float` API can be used to convert the spectrogram data from 16-bit fixed-point values to floating-point values for training.\n", 1276 | "\n", 1277 | "For an in-depth description of how to create audio spectrograms using fixed-point operations with CMSIS-DSP, please see the [Towards Data Science “Fixed-point DSP for Data Scientists” guide](https://towardsdatascience.com/fixed-point-dsp-for-data-scientists-d773a4271f7f).\n", 1278 | "\n" 1279 | ] 1280 | }, 1281 | { 1282 | "cell_type": "code", 1283 | "metadata": { 1284 | "id": "cdhrliYiAfKE" 1285 | }, 1286 | "source": [ 1287 | "import cmsisdsp\n", 1288 | "from numpy import pi as PI\n", 1289 | "\n", 1290 | "window_size = 256\n", 1291 | "step_size = 128\n", 1292 | "\n", 1293 | "hanning_window_f32 = np.zeros(window_size)\n", 1294 | "for i in range(window_size):\n", 1295 | "    hanning_window_f32[i] = 0.5 * (1 - cmsisdsp.arm_cos_f32(2 * PI * i / window_size))\n", 1296 | "\n", 1297 | "hanning_window_q15 = cmsisdsp.arm_float_to_q15(hanning_window_f32)\n", 1298 | "\n", 1299 | "rfftq15 = cmsisdsp.arm_rfft_instance_q15()\n", 1300 | "status = cmsisdsp.arm_rfft_init_q15(rfftq15, window_size, 0, 1)\n", 1301 | "\n", 1302 | "def get_arm_spectrogram(waveform):\n", 1303 | "\n", 1304 | "    num_frames = int(1 + (len(waveform) - window_size) // step_size)\n", 1305 | "    fft_size = int(window_size // 2 + 1)\n", 1306 | "\n", 1307 | "    # Convert the audio to q15\n", 1308 | "    waveform_q15 = cmsisdsp.arm_float_to_q15(waveform)\n", 1309 | "\n", 1310 | "    # Create empty spectrogram array\n", 1311 | "    spectrogram_q15 = np.empty((num_frames, fft_size), dtype=np.int16)\n", 1312 | "\n", 1313 | "
1313 | "  start_index = 0\n", 1314 | "\n", 1315 | "  for index in range(num_frames):\n", 1316 | "    # Take the window from the waveform.\n", 1317 | "    window = waveform_q15[start_index:start_index + window_size]\n", 1318 | "\n", 1319 | "    # Apply the Hanning Window.\n", 1320 | "    window = cmsisdsp.arm_mult_q15(window, hanning_window_q15)\n", 1321 | "\n", 1322 | "    # Calculate the FFT, shift by 7 according to docs\n", 1323 | "    window = cmsisdsp.arm_rfft_q15(rfftq15, window)\n", 1324 | "\n", 1325 | "    # Take the absolute value of the FFT and add to the Spectrogram.\n", 1326 | "    spectrogram_q15[index] = cmsisdsp.arm_cmplx_mag_q15(window)[:fft_size]\n", 1327 | "\n", 1328 | "    # Increase the start index of the window by the overlap amount.\n", 1329 | "    start_index += step_size\n", 1330 | "\n", 1331 | "  # Convert to numpy output ready for keras\n", 1332 | "  return cmsisdsp.arm_q15_to_float(spectrogram_q15).reshape(num_frames, fft_size) * 512" 1333 | ], 1334 | "execution_count": null, 1335 | "outputs": [] 1336 | }, 1337 | { 1338 | "cell_type": "markdown", 1339 | "metadata": { 1340 | "id": "9hHnibH7LCz1" 1341 | }, 1342 | "source": [ 1343 | "Let's create a spectrogram representation for all of the fire alarm soundbites, and plot the first spectrogram." 1344 | ] 1345 | }, 1346 | { 1347 | "cell_type": "code", 1348 | "metadata": { 1349 | "id": "BRDqTa1CZfEA" 1350 | }, 1351 | "source": [ 1352 | "@tf.function\n", 1353 | "def create_arm_spectrogram_for_map(wav, label, fold):\n", 1354 | "  spectrogram = tf.py_function(get_arm_spectrogram, [wav], tf.float32)\n", 1355 | "\n", 1356 | "  return spectrogram, label, fold\n", 1357 | "\n", 1358 | "fire_alarm_spectrograms_ds = split_fire_alarm_wav_ds.map(create_arm_spectrogram_for_map)\n", 1359 | "fire_alarm_spectrograms_ds = fire_alarm_spectrograms_ds.cache()\n", 1360 | "\n", 1361 | "for spectrogram, _, _ in fire_alarm_spectrograms_ds.take(1):\n", 1362 | "  plot_spectrogram(spectrogram)\n" 1363 | ], 1364 | "execution_count": null, 1365 | "outputs": [] 1366 | }, 1367 | { 1368 | "cell_type": "markdown", 1369 | "metadata": { 1370 | "id": "RZ-fpDdkLyyy" 1371 | }, 1372 | "source": [ 1373 | "Then do the same for the background noise soundbites:" 1374 | ] 1375 | }, 1376 | { 1377 | "cell_type": "code", 1378 | "metadata": { 1379 | "id": "haGn0RT9Z1FI" 1380 | }, 1381 | "source": [ 1382 | "background_noise_spectrograms_ds = split_background_noise_wav_ds.map(create_arm_spectrogram_for_map)\n", 1383 | "background_noise_spectrograms_ds = background_noise_spectrograms_ds.cache()\n", 1384 | "\n", 1385 | "for spectrogram, _, _ in background_noise_spectrograms_ds.take(1):\n", 1386 | "  plot_spectrogram(spectrogram)" 1387 | ], 1388 | "execution_count": null, 1389 | "outputs": [] 1390 | }, 1391 | { 1392 | "cell_type": "markdown", 1393 | "metadata": { 1394 | "id": "Wi-trTZtL6FG" 1395 | }, 1396 | "source": [ 1397 | "Now let's calculate the lengths of each dataset to see how balanced they are:" 1398 | ] 1399 | }, 1400 | { 1401 | "cell_type": "code", 1402 | "metadata": { 1403 | "id": "ouhaiuLzMFiN" 1404 | }, 1405 | "source": [ 1406 | "def calculate_ds_len(ds):\n", 1407 | "  count = 0\n", 1408 | "  for _, _, _ in ds:\n", 1409 | "    count += 1\n", 1410 | "  \n", 1411 | "  return count\n", 1412 | "\n", 1413 | "num_fire_alarm_spectrograms = calculate_ds_len(fire_alarm_spectrograms_ds)\n", 1414 | "num_background_noise_spectrograms = calculate_ds_len(background_noise_spectrograms_ds)\n", 1415 | "\n", 1416 | "print(f\"num_fire_alarm_spectrograms = {num_fire_alarm_spectrograms}\")\n", 1417 |
"print(f\"num_background_noise_spectrograms = {num_background_noise_spectrograms}\")" 1418 | ], 1419 | "execution_count": null, 1420 | "outputs": [] 1421 | }, 1422 | { 1423 | "cell_type": "markdown", 1424 | "metadata": { 1425 | "id": "EigYdlAXMpkP" 1426 | }, 1427 | "source": [ 1428 | "We can see there a more background noise samples than fire alarm samples. In the next section we will use data augmentation to balance this." 1429 | ] 1430 | }, 1431 | { 1432 | "cell_type": "markdown", 1433 | "metadata": { 1434 | "id": "sxYKzLPCmIrO" 1435 | }, 1436 | "source": [ 1437 | "### Data Augmentation" 1438 | ] 1439 | }, 1440 | { 1441 | "cell_type": "markdown", 1442 | "metadata": { 1443 | "id": "iGZI32bKMI9Z" 1444 | }, 1445 | "source": [ 1446 | "Data augmentation is a set of techniques used to increase the size of a dataset. This is done by slightly modifying samples from the dataset or by creating synthetic data. In this situation we are using audio and we will create a few functions to augment the different samples. We will use three techniques:\n", 1447 | "\n", 1448 | " * adding white noise to the audio samples\n", 1449 | " * adding random silence to the audio\n", 1450 | " * mixing two audio samples together\n", 1451 | "\n", 1452 | "As well as increasing the size of the dataset, data augmentation also helps to reduce overfitting by training the model on different (not perfect) data samples. For example, on a microcontroller you are unlikely to have perfect high quality audio, and so a technique like adding white noise can help the model work in situations where your microphone might every so often have noise in there.\n", 1453 | "\n", 1454 | "First let's plot the time representation of the first fire alarm soundbite over time along with it's spectrogram representation, so we can compare against the augmented versions.\n" 1455 | ] 1456 | }, 1457 | { 1458 | "cell_type": "code", 1459 | "metadata": { 1460 | "id": "sjWJd9Zdo4NH" 1461 | }, 1462 | "source": [ 1463 | "for wav, _, _ in split_fire_alarm_wav_ds.take(1):\n", 1464 | " test_fire_alarm_wav = wav\n", 1465 | "\n", 1466 | "plt.plot(test_fire_alarm_wav)\n", 1467 | "plt.ylim([-1, 1])\n", 1468 | "plt.show()\n", 1469 | "\n", 1470 | "plot_spectrogram(get_arm_spectrogram(test_fire_alarm_wav), vmax=25)\n", 1471 | "\n", 1472 | "display.display(display.Audio(test_fire_alarm_wav, rate=sample_rate))" 1473 | ], 1474 | "execution_count": null, 1475 | "outputs": [] 1476 | }, 1477 | { 1478 | "cell_type": "markdown", 1479 | "metadata": { 1480 | "id": "MDSEeTsLo5aN" 1481 | }, 1482 | "source": [ 1483 | "#### White Noise" 1484 | ] 1485 | }, 1486 | { 1487 | "cell_type": "markdown", 1488 | "metadata": { 1489 | "id": "61lkrtSuNIS1" 1490 | }, 1491 | "source": [ 1492 | "TensorFlow's [`tf.random.uniform(...)`](https://www.tensorflow.org/api_docs/python/tf/random/uniform) API can be used generate a Tensor of equal shape to the original audio. This Tensor can then be multiplied by a random scalar, and then added to the original audio samples. The [`tf.clip_by_value(...)`](https://www.tensorflow.org/api_docs/python/tf/clip_by_value) API will also be used to ensure the audio remains in the range of -1.0 to 1.0." 
1493 | ] 1494 | }, 1495 | { 1496 | "cell_type": "code", 1497 | "metadata": { 1498 | "id": "mXV0un4cmRTY" 1499 | }, 1500 | "source": [ 1501 | "def add_white_noise(audio):\n", 1502 | "  # generate noise and the scalar multiplier\n", 1503 | "  noise = tf.random.uniform(shape=tf.shape(audio), minval=-1, maxval=1)\n", 1504 | "  noise_scalar = tf.random.uniform(shape=[1], minval=0, maxval=0.2)\n", 1505 | "\n", 1506 | "  # add them to the original audio\n", 1507 | "  audio_with_noise = audio + (noise * noise_scalar)\n", 1508 | "  \n", 1509 | "  # finally, clip the values to ensure they are still between -1 and 1\n", 1510 | "  audio_with_noise = tf.clip_by_value(audio_with_noise, clip_value_min=-1, clip_value_max=1)\n", 1511 | "  \n", 1512 | "  return audio_with_noise" 1513 | ], 1514 | "execution_count": null, 1515 | "outputs": [] 1516 | }, 1517 | { 1518 | "cell_type": "markdown", 1519 | "metadata": { 1520 | "id": "6z-wGwCmNx4x" 1521 | }, 1522 | "source": [ 1523 | "Let's apply the white noise to the fire alarm sound and then plot it to compare. We can also listen to the difference." 1524 | ] 1525 | }, 1526 | { 1527 | "cell_type": "code", 1528 | "metadata": { 1529 | "id": "FsffnIb7paHj" 1530 | }, 1531 | "source": [ 1532 | "test_fire_alarm_with_white_noise_wav = add_white_noise(test_fire_alarm_wav)\n", 1533 | "\n", 1534 | "plt.plot(test_fire_alarm_with_white_noise_wav)\n", 1535 | "plt.ylim([-1, 1])\n", 1536 | "plt.show()\n", 1537 | "\n", 1538 | "plot_spectrogram(get_arm_spectrogram(test_fire_alarm_with_white_noise_wav), vmax=25)\n", 1539 | "\n", 1540 | "display.display(display.Audio(test_fire_alarm_with_white_noise_wav, rate=sample_rate))" 1541 | ], 1542 | "execution_count": null, 1543 | "outputs": [] 1544 | }, 1545 | { 1546 | "cell_type": "markdown", 1547 | "metadata": { 1548 | "id": "KCE-90pJpoio" 1549 | }, 1550 | "source": [ 1551 | "#### Random Silence\n", 1552 | "\n", 1553 | "TensorFlow's [`tf.random.categorical(...)`](https://www.tensorflow.org/api_docs/python/tf/random/categorical) API can be used to generate a Tensor of equal shape to the original audio containing a mask of 0 or 1 values. This mask can then be cast to a float type of 1.0 or 0.0, so that it can be multiplied by the original audio signal to create random periods of silence." 1554 | ] 1555 | }, 1556 | { 1557 | "cell_type": "code", 1558 | "metadata": { 1559 | "id": "rSNkZPqPnW_k" 1560 | }, 1561 | "source": [ 1562 | "def add_random_silence(audio):\n", 1563 | "  audio_mask = tf.random.categorical(tf.math.log([[0.2, 0.8]]), num_samples=tf.shape(audio)[0])\n", 1564 | "  audio_mask = tf.cast(audio_mask, dtype=tf.float32)\n", 1565 | "  audio_mask = tf.squeeze(audio_mask, axis=0)\n", 1566 | "\n", 1567 | "  # multiply the audio input by the mask\n", 1568 | "  augmented_audio = audio * audio_mask\n", 1569 | "  \n", 1570 | "  return augmented_audio" 1571 | ], 1572 | "execution_count": null, 1573 | "outputs": [] 1574 | }, 1575 | { 1576 | "cell_type": "markdown", 1577 | "metadata": { 1578 | "id": "BQ2p-MiwOzm9" 1579 | }, 1580 | "source": [ 1581 | "Let's apply the random silence to the fire alarm sound and then plot it to compare. We can also listen to the difference."
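, "\n", "(In terms of the code above, each output sample is `audio[i] * mask[i]`, where `mask[i]` is 1 with probability 0.8 and 0 with probability 0.2, so on average roughly 20% of the samples are silenced.)"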
1582 | ] 1583 | }, 1584 | { 1585 | "cell_type": "code", 1586 | "metadata": { 1587 | "id": "Wl94M9Ryp9zJ" 1588 | }, 1589 | "source": [ 1590 | "test_fire_alarm_with_random_silence_wav = add_random_silence(test_fire_alarm_wav)\n", 1591 | "\n", 1592 | "plt.plot(test_fire_alarm_with_random_silence_wav)\n", 1593 | "plt.ylim([-1, 1])\n", 1594 | "plt.show()\n", 1595 | "\n", 1596 | "plot_spectrogram(get_arm_spectrogram(test_fire_alarm_with_random_silence_wav), vmax=25)\n", 1597 | "\n", 1598 | "display.display(display.Audio(test_fire_alarm_with_random_silence_wav, rate=sample_rate))" 1599 | ], 1600 | "execution_count": null, 1601 | "outputs": [] 1602 | }, 1603 | { 1604 | "cell_type": "markdown", 1605 | "metadata": { 1606 | "id": "oSfVXJhcs1sn" 1607 | }, 1608 | "source": [ 1609 | "#### Audio Mixups\n", 1610 | "\n", 1611 | "We can combine a fire alarm soundbite with a background noise soundbite to create a mixed-up version of the two.\n", 1612 | "\n", 1613 | "Let's select the first background noise soundbite to see how this can be done." 1614 | ] 1615 | }, 1616 | { 1617 | "cell_type": "code", 1618 | "metadata": { 1619 | "id": "2mfaofOktV19" 1620 | }, 1621 | "source": [ 1622 | "for wav, _, _ in split_background_noise_wav_ds.take(1):\n", 1623 | "  test_background_noise_wav = wav\n", 1624 | "\n", 1625 | "plt.plot(test_background_noise_wav)\n", 1626 | "plt.ylim([-1, 1])\n", 1627 | "plt.show()\n", 1628 | "\n", 1629 | "plot_spectrogram(get_arm_spectrogram(test_background_noise_wav))\n", 1630 | "\n", 1631 | "display.display(display.Audio(test_background_noise_wav, rate=sample_rate))" 1632 | ], 1633 | "execution_count": null, 1634 | "outputs": [] 1635 | }, 1636 | { 1637 | "cell_type": "markdown", 1638 | "metadata": { 1639 | "id": "SmiYtvxwPXd0" 1640 | }, 1641 | "source": [ 1642 | "We will multiply the background noise soundbite by a random scalar before adding it to the original fire alarm soundbite, and then ensure the mixed-up values remain in the range of -1.0 to 1.0." 1643 | ] 1644 | }, 1645 | { 1646 | "cell_type": "code", 1647 | "metadata": { 1648 | "id": "Vdzr8Hy9szgl" 1649 | }, 1650 | "source": [ 1651 | "def add_audio_mixup(audio, mixup_audio):\n", 1652 | "  # randomly generate a scalar\n", 1653 | "  noise_scalar = tf.random.uniform(shape=[1], minval=0, maxval=1)\n", 1654 | "\n", 1655 | "  # add the background noise to the audio\n", 1656 | "  augmented_audio = audio + (mixup_audio * noise_scalar)\n", 1657 | "  \n", 1658 | "  # finally, clip the values so they are still between -1 and 1\n", 1659 | "  augmented_audio = tf.clip_by_value(augmented_audio, clip_value_min=-1, clip_value_max=1)\n", 1660 | "  \n", 1661 | "  return augmented_audio" 1662 | ], 1663 | "execution_count": null, 1664 | "outputs": [] 1665 | }, 1666 | { 1667 | "cell_type": "markdown", 1668 | "metadata": { 1669 | "id": "0FvodQe0Pu8w" 1670 | }, 1671 | "source": [ 1672 | "Let's apply the audio mixup to the fire alarm sound and then plot it to compare. We can also listen to the difference."
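, "\n", "(In terms of the code above, the result is `clip(audio + s * mixup_audio, -1, 1)` with `s` drawn uniformly from [0, 1], so the background noise is mixed in at a random level up to the full amplitude of the original recording.)"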
1673 | ] 1674 | }, 1675 | { 1676 | "cell_type": "code", 1677 | "metadata": { 1678 | "id": "82af5IyPuqQU" 1679 | }, 1680 | "source": [ 1681 | "test_fire_alarm_with_mixup_wav = add_audio_mixup(test_fire_alarm_wav, test_background_noise_wav)\n", 1682 | "\n", 1683 | "plt.plot(test_fire_alarm_with_mixup_wav)\n", 1684 | "plt.ylim([-1, 1])\n", 1685 | "plt.show()\n", 1686 | "\n", 1687 | "plot_spectrogram(get_arm_spectrogram(test_fire_alarm_with_mixup_wav), vmax=25)\n", 1688 | "\n", 1689 | "display.display(display.Audio(test_fire_alarm_with_mixup_wav, rate=sample_rate))" 1690 | ], 1691 | "execution_count": null, 1692 | "outputs": [] 1693 | }, 1694 | { 1695 | "cell_type": "markdown", 1696 | "metadata": { 1697 | "id": "ntHK8oSHvl3N" 1698 | }, 1699 | "source": [ 1700 | "### Create Augmented Dataset\n", 1701 | "\n", 1702 | "We can now combine all three augmentation techniques to balance our dataset.\n", 1703 | "\n", 1704 | "First let's calculate how many augmented files we need to generate:" 1705 | ] 1706 | }, 1707 | { 1708 | "cell_type": "code", 1709 | "metadata": { 1710 | "id": "gFrYCTPYL0Vs" 1711 | }, 1712 | "source": [ 1713 | "num_augmented_fire_alarm_spectrograms = num_background_noise_spectrograms - num_fire_alarm_spectrograms\n", 1714 | "\n", 1715 | "print(f'num_augmented_fire_alarm_spectrograms = {num_augmented_fire_alarm_spectrograms}')" 1716 | ], 1717 | "execution_count": null, 1718 | "outputs": [] 1719 | }, 1720 | { 1721 | "cell_type": "markdown", 1722 | "metadata": { 1723 | "id": "2RkWzf_qQSlG" 1724 | }, 1725 | "source": [ 1726 | "Then we can divide by 3 to calculate how many augmented soundbites per technique to generate:" 1727 | ] 1728 | }, 1729 | { 1730 | "cell_type": "code", 1731 | "metadata": { 1732 | "id": "KpJwqgWbNLCk" 1733 | }, 1734 | "source": [ 1735 | "num_white_noise_fire_alarm_spectrograms = num_augmented_fire_alarm_spectrograms // 3\n", 1736 | "num_random_silence_fire_alarm_spectrograms = num_augmented_fire_alarm_spectrograms // 3\n", 1737 | "num_audio_mixup_fire_alarm_spectrograms = num_augmented_fire_alarm_spectrograms // 3\n", 1738 | "\n", 1739 | "print(f'num_white_noise_fire_alarm_spectrograms = {num_white_noise_fire_alarm_spectrograms}')\n", 1740 | "print(f'num_random_silence_fire_alarm_spectrograms = {num_random_silence_fire_alarm_spectrograms}')\n", 1741 | "print(f'num_audio_mixup_fire_alarm_spectrograms = {num_audio_mixup_fire_alarm_spectrograms}')" 1742 | ], 1743 | "execution_count": null, 1744 | "outputs": [] 1745 | }, 1746 | { 1747 | "cell_type": "markdown", 1748 | "metadata": { 1749 | "id": "Y12lDIeNQe7e" 1750 | }, 1751 | "source": [ 1752 | "Select and shuffle the number of soundbites required:" 1753 | ] 1754 | }, 1755 | { 1756 | "cell_type": "code", 1757 | "metadata": { 1758 | "id": "c0_hV-Y8vrFx" 1759 | }, 1760 | "source": [ 1761 | "split_fire_alarm_wav_ds = split_fire_alarm_wav_ds.cache()\n", 1762 | "preaugmented_split_fire_alarm_wav = split_fire_alarm_wav_ds.shuffle(num_augmented_fire_alarm_spectrograms, seed=random_seed).take(num_augmented_fire_alarm_spectrograms)" 1763 | ], 1764 | "execution_count": null, 1765 | "outputs": [] 1766 | }, 1767 | { 1768 | "cell_type": "markdown", 1769 | "metadata": { 1770 | "id": "VQKWXZSuQk2T" 1771 | }, 1772 | "source": [ 1773 | "Create the white noise augmented soundbites:" 1774 | ] 1775 | }, 1776 | { 1777 | "cell_type": "code", 1778 | "metadata": { 1779 | "id": "mBu718tfNxLY" 1780 | }, 1781 | "source": [ 1782 | "def add_white_noise_for_map(wav, label, fold):\n", 1783 | "  return add_white_noise(wav), label, fold\n", 1784 |
"\n", 1785 | "white_noise_fire_alarm_wav_ds = preaugmented_split_fire_alarm_wav.take(num_white_noise_fire_alarm_spectrograms)\n", 1786 | "white_noise_fire_alarm_wav_ds = white_noise_fire_alarm_wav_ds.map(add_white_noise_for_map)" 1787 | ], 1788 | "execution_count": null, 1789 | "outputs": [] 1790 | }, 1791 | { 1792 | "cell_type": "markdown", 1793 | "metadata": { 1794 | "id": "tK2pAnK-QqIq" 1795 | }, 1796 | "source": [ 1797 | "Create the random noise augmented soundbites:" 1798 | ] 1799 | }, 1800 | { 1801 | "cell_type": "code", 1802 | "metadata": { 1803 | "id": "0oKTJN6uPVIT" 1804 | }, 1805 | "source": [ 1806 | "def add_random_silence_for_map(wav, label, fold):\n", 1807 | " return add_random_silence(wav), label, fold\n", 1808 | "\n", 1809 | "random_silence_fire_alarm_wav_ds = preaugmented_split_fire_alarm_wav.skip(num_white_noise_fire_alarm_spectrograms)\n", 1810 | "random_silence_fire_alarm_wav_ds = random_silence_fire_alarm_wav_ds.take(num_random_silence_fire_alarm_spectrograms)\n", 1811 | "random_silence_fire_alarm_wav_ds = random_silence_fire_alarm_wav_ds.map(add_random_silence_for_map)" 1812 | ], 1813 | "execution_count": null, 1814 | "outputs": [] 1815 | }, 1816 | { 1817 | "cell_type": "markdown", 1818 | "metadata": { 1819 | "id": "FmYVy-VwQx_J" 1820 | }, 1821 | "source": [ 1822 | "Create the audio mixup augmented soundbites:" 1823 | ] 1824 | }, 1825 | { 1826 | "cell_type": "code", 1827 | "metadata": { 1828 | "id": "DQ0jfAJVQre7" 1829 | }, 1830 | "source": [ 1831 | "audio_mixup_background_noise_ds = split_background_noise_wav_ds.shuffle(num_audio_mixup_fire_alarm_spectrograms).take(num_audio_mixup_fire_alarm_spectrograms)\n", 1832 | "audio_mixup_background_noise_iter = iter(audio_mixup_background_noise_ds.map(lambda x, y, z: x))\n", 1833 | "\n", 1834 | "def add_audio_mixup_for_map(wav, label, fold):\n", 1835 | " return add_audio_mixup(wav, next(audio_mixup_background_noise_iter)), label, fold\n", 1836 | "\n", 1837 | "audio_mixup_split_fire_alarm_wav_ds = preaugmented_split_fire_alarm_wav.skip(num_white_noise_fire_alarm_spectrograms + num_random_silence_fire_alarm_spectrograms)\n", 1838 | "audio_mixup_split_fire_alarm_wav_ds = audio_mixup_split_fire_alarm_wav_ds.take(num_audio_mixup_fire_alarm_spectrograms)\n", 1839 | "audio_mixup_split_fire_alarm_wav_ds = audio_mixup_split_fire_alarm_wav_ds.map(add_audio_mixup_for_map)" 1840 | ], 1841 | "execution_count": null, 1842 | "outputs": [] 1843 | }, 1844 | { 1845 | "cell_type": "markdown", 1846 | "metadata": { 1847 | "id": "Cx-yL-pdQ5KF" 1848 | }, 1849 | "source": [ 1850 | "Combine all the augmented soundbites together and map them to their spectrogram representations:" 1851 | ] 1852 | }, 1853 | { 1854 | "cell_type": "code", 1855 | "metadata": { 1856 | "id": "N85bl1Xx1C4l" 1857 | }, 1858 | "source": [ 1859 | "augment_split_fire_alarm_wav_ds = tf.data.Dataset.concatenate(white_noise_fire_alarm_wav_ds, random_silence_fire_alarm_wav_ds)\n", 1860 | "augment_split_fire_alarm_wav_ds = tf.data.Dataset.concatenate(augment_split_fire_alarm_wav_ds, audio_mixup_split_fire_alarm_wav_ds)\n", 1861 | "\n", 1862 | "augment_fire_alarm_spectrograms_ds = augment_split_fire_alarm_wav_ds.map(create_arm_spectrogram_for_map)" 1863 | ], 1864 | "execution_count": null, 1865 | "outputs": [] 1866 | }, 1867 | { 1868 | "cell_type": "markdown", 1869 | "metadata": { 1870 | "id": "vD6zn_eWu-YS" 1871 | }, 1872 | "source": [ 1873 | "### Split Dataset\n", 1874 | "\n", 1875 | "Now combine the spectrogram datasets, and split them into training, validation, and test sets. 
Instead of using the `fold` value to split them, we will shuffle all the items, and then split by percentage." 1876 | ] 1877 | }, 1878 | { 1879 | "cell_type": "code", 1880 | "metadata": { 1881 | "id": "UmQxp4EtFiVn" 1882 | }, 1883 | "source": [ 1884 | "full_ds = tf.data.Dataset.concatenate(fire_alarm_spectrograms_ds, background_noise_spectrograms_ds)\n", 1885 | "full_ds = tf.data.Dataset.concatenate(full_ds, augment_fire_alarm_spectrograms_ds)\n", 1886 | "full_ds = full_ds.cache()\n", 1887 | "\n", 1888 | "full_ds_size = calculate_ds_len(full_ds)\n", 1889 | "\n", 1890 | "print(f'full_ds_size = {full_ds_size}')\n", 1891 | "\n", 1892 | "full_ds = full_ds.shuffle(full_ds_size)" 1893 | ], 1894 | "execution_count": null, 1895 | "outputs": [] 1896 | }, 1897 | { 1898 | "cell_type": "code", 1899 | "metadata": { 1900 | "id": "3CltJjc1Dgjl" 1901 | }, 1902 | "source": [ 1903 | "train_ds_size = int(0.60 * full_ds_size)\n", 1904 | "val_ds_size = int(0.20 * full_ds_size)\n", 1905 | "test_ds_size = int(0.20 * full_ds_size)\n", 1906 | "\n", 1907 | "train_ds = full_ds.take(train_ds_size)\n", 1908 | "\n", 1909 | "remaining_ds = full_ds.skip(train_ds_size) \n", 1910 | "val_ds = remaining_ds.take(val_ds_size)\n", 1911 | "test_ds = remaining_ds.skip(val_ds_size)\n", 1912 | "\n", 1913 | "# remove the fold column (no longer needed) and add a channel dimension\n", 1914 | "remove_fold_column = lambda spectrogram, label, fold: (tf.expand_dims(spectrogram, axis=-1), label)\n", 1915 | "\n", 1916 | "train_ds = train_ds.map(remove_fold_column)\n", 1917 | "val_ds = val_ds.map(remove_fold_column)\n", 1918 | "test_ds = test_ds.map(remove_fold_column)\n", 1919 | "\n", 1920 | "train_ds = train_ds.cache().shuffle(1000).batch(32).prefetch(tf.data.AUTOTUNE)\n", 1921 | "val_ds = val_ds.cache().batch(32).prefetch(tf.data.AUTOTUNE)\n", 1922 | "test_ds = test_ds.cache().batch(32).prefetch(tf.data.AUTOTUNE)" 1923 | ], 1924 | "execution_count": null, 1925 | "outputs": [] 1926 | }, 1927 | { 1928 | "cell_type": "markdown", 1929 | "metadata": { 1930 | "id": "_ffwTQt_vFAx" 1931 | }, 1932 | "source": [ 1933 | "### Replace Baseline Model Classification Head and Train Model\n", 1934 | "\n", 1935 | "The model we previously trained on the ESC-50 dataset predicted the presence of 50 sound types, which resulted in the final dense layer of the model having 50 outputs. The new model we would like to create is a binary classifier, and needs to have a single output value.\n", 1936 | "\n", 1937 | "We will load the baseline model, and swap out the final dense layer to match our needs:\n" 1938 | ] 1939 | }, 1940 | { 1941 | "cell_type": "code", 1942 | "metadata": { 1943 | "id": "tqY5I0QeGKAV" 1944 | }, 1945 | "source": [ 1946 | "# we need a new head with one neuron.\n", 1947 | "model_body = tf.keras.Model(inputs=baseline_model.input, outputs=baseline_model.layers[-2].output)\n", 1948 | "\n", 1949 | "classifier_head = tf.keras.layers.Dense(1, activation=\"sigmoid\")(model_body.output)\n", 1950 | "\n", 1951 | "fine_tune_model = tf.keras.Model(model_body.input, classifier_head)\n", 1952 | "\n", 1953 | "fine_tune_model.summary()" 1954 | ], 1955 | "execution_count": null, 1956 | "outputs": [] 1957 | }, 1958 | { 1959 | "cell_type": "markdown", 1960 | "metadata": { 1961 | "id": "w8Sla0atSL1G" 1962 | }, 1963 | "source": [ 1964 | "To freeze a layer in TensorFlow we can set `layer.trainable = False`.
Let's loop through all the layers and do this:" 1965 | ] 1966 | }, 1967 | { 1968 | "cell_type": "code", 1969 | "metadata": { 1970 | "id": "lbcW4ANCGXUa" 1971 | }, 1972 | "source": [ 1973 | "for layer in fine_tune_model.layers:\n", 1974 | "  layer.trainable = False" 1975 | ], 1976 | "execution_count": null, 1977 | "outputs": [] 1978 | }, 1979 | { 1980 | "cell_type": "markdown", 1981 | "metadata": { 1982 | "id": "MY9pZMYjSSHV" 1983 | }, 1984 | "source": [ 1985 | "and now unfreeze the last layer (the head):" 1986 | ] 1987 | }, 1988 | { 1989 | "cell_type": "code", 1990 | "metadata": { 1991 | "id": "HMBdchlqSToY" 1992 | }, 1993 | "source": [ 1994 | "fine_tune_model.layers[-1].trainable = True" 1995 | ], 1996 | "execution_count": null, 1997 | "outputs": [] 1998 | }, 1999 | { 2000 | "cell_type": "markdown", 2001 | "metadata": { 2002 | "id": "WuE1ONbpSWil" 2003 | }, 2004 | "source": [ 2005 | "Then we can `compile` the model, this time using a binary crossentropy loss function as this model contains a single output." 2006 | ] 2007 | }, 2008 | { 2009 | "cell_type": "code", 2010 | "metadata": { 2011 | "id": "CUkVzymnGOOV" 2012 | }, 2013 | "source": [ 2014 | "METRICS = [\n", 2015 | "  \"accuracy\",\n", 2016 | "]\n", 2017 | "\n", 2018 | "fine_tune_model.compile(\n", 2019 | "    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),\n", 2020 | "    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),\n", 2021 | "    metrics=METRICS,\n", 2022 | ")\n", 2023 | "\n", 2024 | "def scheduler(epoch, lr):\n", 2025 | "  if epoch < 10:\n", 2026 | "    return lr\n", 2027 | "  else:\n", 2028 | "    return lr * tf.math.exp(-0.1)\n", 2029 | "\n", 2030 | "callbacks = [\n", 2031 | "    tf.keras.callbacks.EarlyStopping(verbose=1, patience=5), \n", 2032 | "    tf.keras.callbacks.LearningRateScheduler(scheduler)\n", 2033 | "]" 2034 | ], 2035 | "execution_count": null, 2036 | "outputs": [] 2037 | }, 2038 | { 2039 | "cell_type": "markdown", 2040 | "metadata": { 2041 | "id": "YblAlGy4SpRQ" 2042 | }, 2043 | "source": [ 2044 | "Kick off training:" 2045 | ] 2046 | }, 2047 | { 2048 | "cell_type": "code", 2049 | "metadata": { 2050 | "id": "GoEcHNBcGqnT" 2051 | }, 2052 | "source": [ 2053 | "EPOCHS = 25\n", 2054 | "\n", 2055 | "history_1 = fine_tune_model.fit(\n", 2056 | "    train_ds, \n", 2057 | "    validation_data=val_ds, \n", 2058 | "    epochs=EPOCHS,\n", 2059 | "    callbacks=callbacks,\n", 2060 | ")" 2061 | ], 2062 | "execution_count": null, 2063 | "outputs": [] 2064 | }, 2065 | { 2066 | "cell_type": "markdown", 2067 | "metadata": { 2068 | "id": "KCWZ_SUkS95R" 2069 | }, 2070 | "source": [ 2071 | "Now unfreeze all the layers, and train for a few more epochs:" 2072 | ] 2073 | }, 2074 | { 2075 | "cell_type": "code", 2076 | "metadata": { 2077 | "id": "CiJYDSOxe6uQ" 2078 | }, 2079 | "source": [ 2080 | "for layer in fine_tune_model.layers:\n", 2081 | "  layer.trainable = True\n", 2082 | "\n", 2083 | "fine_tune_model.compile(\n", 2084 | "    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),\n", 2085 | "    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),\n", 2086 | "    metrics=METRICS,\n", 2087 | ")\n", 2088 | "\n", 2089 | "def scheduler(epoch, lr):\n", 2090 | "  return lr * tf.math.exp(-0.1)\n", 2091 | "\n", 2092 | "callbacks = [\n", 2093 | "    tf.keras.callbacks.EarlyStopping(verbose=1, patience=5), \n", 2094 | "    tf.keras.callbacks.LearningRateScheduler(scheduler)\n", 2095 | "]" 2096 | ], 2097 | "execution_count": null, 2098 | "outputs": [] 2099 | }, 2100 | { 2101 | "cell_type": "code", 2102 | "metadata": { 2103 | "id": "gRq1N0qXIBSs" 2104 | },
2105 | "source": [ 2106 | "EPOCHS = 10\n", 2107 | "\n", 2108 | "history_2 = fine_tune_model.fit(\n", 2109 | " train_ds, \n", 2110 | " validation_data=val_ds, \n", 2111 | " epochs=EPOCHS,\n", 2112 | " callbacks=callbacks,\n", 2113 | ")" 2114 | ], 2115 | "execution_count": null, 2116 | "outputs": [] 2117 | }, 2118 | { 2119 | "cell_type": "code", 2120 | "metadata": { 2121 | "id": "kOewhILQcemp" 2122 | }, 2123 | "source": [ 2124 | "fine_tune_model.save(\"fine_tuned_model\")" 2125 | ], 2126 | "execution_count": null, 2127 | "outputs": [] 2128 | }, 2129 | { 2130 | "cell_type": "markdown", 2131 | "metadata": { 2132 | "id": "BXENjCJJzyrO" 2133 | }, 2134 | "source": [ 2135 | "## Training with your own audio (optional)\n", 2136 | "\n" 2137 | ] 2138 | }, 2139 | { 2140 | "cell_type": "markdown", 2141 | "metadata": { 2142 | "id": "AIIuAagAjoVq" 2143 | }, 2144 | "source": [ 2145 | "We now have an ML model which can classify the presence of fire alarm sound. However this model was trained on publicly available sound recordings which might not match the sound characteristics of the hardware microphone we will use for inferencing.\n", 2146 | "\n", 2147 | "The Raspberry Pi RP2040 MCU has a native USB feature that allows it to act like a USB microphone. We can flash an application to the board to enable it to act like a USB microphone to our PC. Then we can extend Google Colab’s capabilities with the [Web Audio API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API) on a modern Web browser like Google Chrome to collect live data samples all from within Google Colab!\n", 2148 | "\n", 2149 | "**If you don't have a fire alarm handy to record from, you can skip to the [next section](#scrollTo=Model_Optimization).**\n" 2150 | ] 2151 | }, 2152 | { 2153 | "cell_type": "markdown", 2154 | "metadata": { 2155 | "id": "R-ajrgSMaBvI" 2156 | }, 2157 | "source": [ 2158 | "### Record your own audio\n", 2159 | "\n", 2160 | "#### Software Setup\n", 2161 | "\n", 2162 | "Now we can use the USB microphone example from the [Microphone Library for Pico](https://github.com/ArmDeveloperEcosystem/microphone-library-for-pico). The example application can be compiled using `cmake` and `make`. Then we can flash the example application to the board over USB by putting the board into “boot ROM mode” which will allow us to upload an application to the board.\n", 2163 | "\n", 2164 | "Let's use `git` to clone the library source code and accompanying examples:\n" 2165 | ] 2166 | }, 2167 | { 2168 | "cell_type": "code", 2169 | "metadata": { 2170 | "id": "GBsW1c5xaFwy" 2171 | }, 2172 | "source": [ 2173 | "%%shell\n", 2174 | "git clone https://github.com/ArmDeveloperEcosystem/microphone-library-for-pico.git" 2175 | ], 2176 | "execution_count": null, 2177 | "outputs": [] 2178 | }, 2179 | { 2180 | "cell_type": "markdown", 2181 | "metadata": { 2182 | "id": "sQ9vcp6ApjXN" 2183 | }, 2184 | "source": [ 2185 | "Now let's change to the libraries directory folder, and create a `build` folder to run `cmake` on:" 2186 | ] 2187 | }, 2188 | { 2189 | "cell_type": "code", 2190 | "metadata": { 2191 | "id": "0R5fJC5HaMom" 2192 | }, 2193 | "source": [ 2194 | "%%shell\n", 2195 | "cd microphone-library-for-pico\n", 2196 | "mkdir -p build\n", 2197 | "cd build\n", 2198 | "cmake .. 
-DPICO_BOARD=${PICO_BOARD}" 2199 | ], 2200 | "execution_count": null, 2201 | "outputs": [] 2202 | }, 2203 | { 2204 | "cell_type": "markdown", 2205 | "metadata": { 2206 | "id": "Cr0Txtlip6iw" 2207 | }, 2208 | "source": [ 2209 | "Then we can run `make` to compile the example:" 2210 | ] 2211 | }, 2212 | { 2213 | "cell_type": "code", 2214 | "metadata": { 2215 | "id": "Q7ohkHZnp_Ns" 2216 | }, 2217 | "source": [ 2218 | "%%shell\n", 2219 | "cd microphone-library-for-pico/build\n", 2220 | "\n", 2221 | "make -j usb_microphone" 2222 | ], 2223 | "execution_count": null, 2224 | "outputs": [] 2225 | }, 2226 | { 2227 | "cell_type": "markdown", 2228 | "metadata": { 2229 | "id": "m9asx3qzuIvR" 2230 | }, 2231 | "source": [ 2232 | "#### Flashing the board\n", 2233 | "\n", 2234 | "If you are using a [WebUSB API](https://wicg.github.io/webusb/) enabled browser like Google Chrome, you can directly flash the image onto the board from within Google Colab! (Otherwise, you can manually download the .uf2 file to your computer and then drag it onto the USB disk for the RP2040 board.)\n", 2235 | "\n", 2236 | "**Note for Windows**: If you are using Windows you must install WinUSB drivers in order to use WebUSB; you can do so by following the instructions found [here](https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/blob/main/windows.md).\n", 2237 | "\n", 2238 | "**Note for Linux**: If you are using Linux you must configure udev in order to use WebUSB; you can do so by following the instructions found [here](https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/blob/main/linux.md).\n", 2239 | "\n", 2240 | "First you must place the board in USB Boot ROM mode, as follows:\n", 2241 | "\n", 2242 | " * SparkFun MicroMod\n", 2243 | "   * Plug the USB-C cable into the board and your PC to power the board\n", 2244 | "   * While holding down the BOOT button on the board, tap the RESET button\n", 2245 | " * Raspberry Pi Pico\n", 2246 | "   * Plug the USB Micro cable into your PC, but do NOT plug in the Pico side.\n", 2247 | "   * While holding down the white BOOTSEL button, plug in the micro USB cable to the Pico\n", 2248 | "\n", 2249 | "\n", 2250 | "\n", 2251 | "\n", 2252 | "\n" 2253 | ] 2254 | }, 2255 | { 2256 | "cell_type": "markdown", 2257 | "metadata": { 2258 | "id": "hmtVMRLrvfvw" 2259 | }, 2260 | "source": [ 2261 | "Run the code cell below and then click the \"Flash\" button to upload the USB microphone example application to the board over USB." 2262 | ] 2263 | }, 2264 | { 2265 | "cell_type": "code", 2266 | "metadata": { 2267 | "id": "Vhy9ddOXAZnY" 2268 | }, 2269 | "source": [ 2270 | "from colab_utils.pico import flash_pico\n", 2271 | "\n", 2272 | "flash_pico('microphone-library-for-pico/build/examples/usb_microphone/usb_microphone.bin')" 2273 | ], 2274 | "execution_count": null, 2275 | "outputs": [] 2276 | }, 2277 | { 2278 | "cell_type": "markdown", 2279 | "metadata": { 2280 | "id": "p1GJDEYIwI0r" 2281 | }, 2282 | "source": [ 2283 | "Now you can record audio by running the cells below: select the \"MicNode\" item from the drop-down, and then click the \"Start Recording\" button to start capturing audio. You must click the \"Stop Recording\" button to stop recording."
2284 | ] 2285 | }, 2286 | { 2287 | "cell_type": "markdown", 2288 | "metadata": { 2289 | "id": "wlpeIDxIxSIw" 2290 | }, 2291 | "source": [ 2292 | "Record your own fire alarm sounds:" 2293 | ] 2294 | }, 2295 | { 2296 | "cell_type": "code", 2297 | "metadata": { 2298 | "id": "3Pgq9iZ7ybmt" 2299 | }, 2300 | "source": [ 2301 | "from colab_utils.audio import record_wav_file\n", 2302 | "\n", 2303 | "os.makedirs('datasets/custom/fire_alarm', exist_ok=True)\n", 2304 | "\n", 2305 | "record_wav_file('datasets/custom/fire_alarm')" 2306 | ], 2307 | "execution_count": null, 2308 | "outputs": [] 2309 | }, 2310 | { 2311 | "cell_type": "markdown", 2312 | "metadata": { 2313 | "id": "r8nXgrdTyIfB" 2314 | }, 2315 | "source": [ 2316 | "Record your own background noise sounds:" 2317 | ] 2318 | }, 2319 | { 2320 | "cell_type": "code", 2321 | "metadata": { 2322 | "id": "0AH4-FQ3zSHV" 2323 | }, 2324 | "source": [ 2325 | "os.makedirs('datasets/custom/background_noise', exist_ok=True)\n", 2326 | "\n", 2327 | "record_wav_file('datasets/custom/background_noise')" 2328 | ], 2329 | "execution_count": null, 2330 | "outputs": [] 2331 | }, 2332 | { 2333 | "cell_type": "markdown", 2334 | "metadata": { 2335 | "id": "NoQuE4jOzKoX" 2336 | }, 2337 | "source": [ 2338 | "We can zip up the recorded wave files to download and use again:" 2339 | ] 2340 | }, 2341 | { 2342 | "cell_type": "code", 2343 | "metadata": { 2344 | "id": "cxsdP2uY0IpD" 2345 | }, 2346 | "source": [ 2347 | "!zip -r custom.zip datasets/custom" 2348 | ], 2349 | "execution_count": null, 2350 | "outputs": [] 2351 | }, 2352 | { 2353 | "cell_type": "markdown", 2354 | "metadata": { 2355 | "id": "YqOLRYDT33Y2" 2356 | }, 2357 | "source": [ 2358 | "### Load dataset\n", 2359 | "\n", 2360 | "We can load and transform the custom recorded dataset using the same pipeline we used before."
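, "\n", "As before, each element flowing through the pipeline is a `(spectrogram, label, fold)` tuple, with `label = 1` for fire alarm, `label = 0` for background noise, and `fold = -1` as a placeholder since folds are not used for the custom recordings."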
2361 | ] 2362 | }, 2363 | { 2364 | "cell_type": "code", 2365 | "metadata": { 2366 | "id": "B8D_30220vlN" 2367 | }, 2368 | "source": [ 2369 | "custom_fire_alarm_ds = tf.data.Dataset.list_files(\"datasets/custom/fire_alarm/*.wav\", shuffle=False)\n", 2370 | "custom_fire_alarm_ds = custom_fire_alarm_ds.map(lambda x: (x, 1, -1))\n", 2371 | "custom_fire_alarm_ds = custom_fire_alarm_ds.map(load_wav_for_map)\n", 2372 | "custom_fire_alarm_ds = custom_fire_alarm_ds.flat_map(split_wav_for_flat_map)\n", 2373 | "custom_fire_alarm_ds = custom_fire_alarm_ds.map(create_arm_spectrogram_for_map)\n", 2374 | "\n", 2375 | "custom_background_noise_ds = tf.data.Dataset.list_files(\"datasets/custom/background_noise/*.wav\", shuffle=False)\n", 2376 | "custom_background_noise_ds = custom_background_noise_ds.map(lambda x: (x, 0, -1))\n", 2377 | "custom_background_noise_ds = custom_background_noise_ds.map(load_wav_for_map)\n", 2378 | "custom_background_noise_ds = custom_background_noise_ds.flat_map(split_wav_for_flat_map)\n", 2379 | "custom_background_noise_ds = custom_background_noise_ds.map(create_arm_spectrogram_for_map)\n", 2380 | "\n", 2381 | "custom_ds = tf.data.Dataset.concatenate(custom_fire_alarm_ds, custom_background_noise_ds)\n", 2382 | "custom_ds = custom_ds.map(lambda x, y, z: (tf.expand_dims(x, axis=-1), y, z))\n", 2383 | "custom_ds_len = calculate_ds_len(custom_ds)\n", 2384 | "\n", 2385 | "print(f'custom_ds_len = {custom_ds_len}')\n", 2386 | "\n", 2387 | "custom_ds = custom_ds.map(lambda x, y, z: (x, y))\n", 2388 | "\n", 2389 | "custom_ds = custom_ds.shuffle(custom_ds_len).cache()" 2390 | ], 2391 | "execution_count": null, 2392 | "outputs": [] 2393 | }, 2394 | { 2395 | "cell_type": "markdown", 2396 | "metadata": { 2397 | "id": "3H8L-1jp0MX2" 2398 | }, 2399 | "source": [ 2400 | "Evaluate the model's performance on the custom dataset before fine-tuning:" 2401 | ] 2402 | }, 2403 | { 2404 | "cell_type": "code", 2405 | "metadata": { 2406 | "id": "sJDj7dD1PGfN" 2407 | }, 2408 | "source": [ 2409 | "fine_tune_model.evaluate(custom_ds.batch(1))" 2410 | ], 2411 | "execution_count": null, 2412 | "outputs": [] 2413 | }, 2414 | { 2415 | "cell_type": "markdown", 2416 | "metadata": { 2417 | "id": "DphMcfEf3_hR" 2418 | }, 2419 | "source": [ 2420 | "### Fine tune model" 2421 | ] 2422 | }, 2423 | { 2424 | "cell_type": "code", 2425 | "metadata": { 2426 | "id": "hp8R30Uc5FRc" 2427 | }, 2428 | "source": [ 2429 | "EPOCHS = 25\n", 2430 | "\n", 2431 | "for layer in fine_tune_model.layers:\n", 2432 | "  layer.trainable = False\n", 2433 | "\n", 2434 | "fine_tune_model.layers[-1].trainable = True\n", 2435 | "\n", 2436 | "fine_tune_model.compile(\n", 2437 | "    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),\n", 2438 | "    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),\n", 2439 | "    metrics=METRICS,\n", 2440 | ")\n", 2441 | "\n", 2442 | "history_3 = fine_tune_model.fit(\n", 2443 | "    custom_ds.take(int(custom_ds_len * 0.8)).batch(1), \n", 2444 | "    validation_data=custom_ds.skip(int(custom_ds_len * 0.8)).batch(1), \n", 2445 | "    epochs=EPOCHS,\n", 2446 | "    # callbacks=callbacks,\n", 2447 | ")" 2448 | ], 2449 | "execution_count": null, 2450 | "outputs": [] 2451 | }, 2452 | { 2453 | "cell_type": "code", 2454 | "metadata": { 2455 | "id": "sx0iguJePXHN" 2456 | }, 2457 | "source": [ 2458 | "fine_tune_model.evaluate(custom_ds.batch(1))" 2459 | ], 2460 | "execution_count": null, 2461 | "outputs": [] 2462 | }, 2463 | { 2464 | "cell_type": "code", 2465 | "metadata": { 2466 | "id": "alhP_RWTPkNl" 2467 | }, 2468 | "source": [ 2469 |
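"# persist the fine-tuned weights so the Model Optimization section below picks them up\n",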
"fine_tune_model.save('fine_tuned_model')" 2470 | ], 2471 | "execution_count": null, 2472 | "outputs": [] 2473 | }, 2474 | { 2475 | "cell_type": "markdown", 2476 | "metadata": { 2477 | "id": "8HW-PdaT5Fx0" 2478 | }, 2479 | "source": [ 2480 | "## Model Optimization\n", 2481 | "\n", 2482 | "To optimize the model to run on the Arm Cortex-M0+ processor, we will use a process called model quantization. Model quantization converts the model’s weights and biases from 32-bit floating-point values to 8-bit values. The [pico-tflmicro](https://github.com/raspberrypi/pico-tflmicro) library, which is a port of TFLu for the RP2040’s Pico SDK contains Arm’s CMSIS-NN library, which supports optimized kernel operations for quantized 8-bit weights on Arm Cortex-M processors.\n", 2483 | "\n" 2484 | ] 2485 | }, 2486 | { 2487 | "cell_type": "markdown", 2488 | "metadata": { 2489 | "id": "2HRVOG9-5JsE" 2490 | }, 2491 | "source": [ 2492 | "### Quantization Aware Training\n", 2493 | "\n", 2494 | "We can use [TensorFlow’s Quantization Aware Training (QAT)](https://www.tensorflow.org/model_optimization/guide/quantization/training) feature to easily convert the floating-point model to quantized." 2495 | ] 2496 | }, 2497 | { 2498 | "cell_type": "code", 2499 | "metadata": { 2500 | "id": "XyWDswSD5PKF" 2501 | }, 2502 | "source": [ 2503 | "final_model = tf.keras.models.load_model(\"fine_tuned_model\")" 2504 | ], 2505 | "execution_count": null, 2506 | "outputs": [] 2507 | }, 2508 | { 2509 | "cell_type": "code", 2510 | "metadata": { 2511 | "id": "2uWWv-Wd5Uvx" 2512 | }, 2513 | "source": [ 2514 | "import tensorflow_model_optimization as tfmot\n", 2515 | "\n", 2516 | "def apply_qat_to_dense_and_cnn(layer):\n", 2517 | " if isinstance(layer, (tf.keras.layers.Dense, tf.keras.layers.Conv2D)):\n", 2518 | " return tfmot.quantization.keras.quantize_annotate_layer(layer)\n", 2519 | " return layer\n", 2520 | "\n", 2521 | "annotated_model = tf.keras.models.clone_model(\n", 2522 | " fine_tune_model,\n", 2523 | " clone_function=apply_qat_to_dense_and_cnn,\n", 2524 | ")\n", 2525 | "\n", 2526 | "quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)\n", 2527 | "quant_aware_model.summary()" 2528 | ], 2529 | "execution_count": null, 2530 | "outputs": [] 2531 | }, 2532 | { 2533 | "cell_type": "code", 2534 | "metadata": { 2535 | "id": "jmHPDwpI5g29" 2536 | }, 2537 | "source": [ 2538 | "quant_aware_model.compile(\n", 2539 | " optimizer=tf.keras.optimizers.Adam(),\n", 2540 | " loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),\n", 2541 | " metrics=METRICS,\n", 2542 | ")\n", 2543 | "\n", 2544 | "EPOCHS=1\n", 2545 | "quant_aware_history = quant_aware_model.fit(\n", 2546 | " train_ds, \n", 2547 | " validation_data=val_ds, \n", 2548 | " epochs=EPOCHS\n", 2549 | ")" 2550 | ], 2551 | "execution_count": null, 2552 | "outputs": [] 2553 | }, 2554 | { 2555 | "cell_type": "markdown", 2556 | "metadata": { 2557 | "id": "rINV6dEU5sqp" 2558 | }, 2559 | "source": [ 2560 | "### Saving model in TFLite format\n", 2561 | "\n", 2562 | "We will now use the [tf.lite.TFLiteConverter.from_keras_model(...)](https://www.tensorflow.org/api_docs/python/tf/lite/TFLiteConverter#from_keras_model) API to convert the quantized Keras model to TF Lite format, and then save it to disk as a `.tflite` file." 
2563 | ] 2564 | }, 2565 | { 2566 | "cell_type": "code", 2567 | "metadata": { 2568 | "id": "rRuuJi145r8r" 2569 | }, 2570 | "source": [ 2571 | "converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)\n", 2572 | "converter.optimizations = [tf.lite.Optimize.DEFAULT]\n", 2573 | "\n", 2574 | "def representative_data_gen():\n", 2575 | "  for input_value, output_value in train_ds.unbatch().batch(1).take(100):\n", 2576 | "    # Model has only one input so each data point has one element.\n", 2577 | "    yield [input_value]\n", 2578 | "  \n", 2579 | "converter.representative_dataset = representative_data_gen\n", 2580 | "# Ensure that if any ops can't be quantized, the converter throws an error\n", 2581 | "converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]\n", 2582 | "# Set the input and output tensors to int8 (APIs added in r2.3)\n", 2583 | "converter.inference_input_type = tf.int8\n", 2584 | "converter.inference_output_type = tf.int8\n", 2585 | "tflite_model_quant = converter.convert()\n", 2586 | "\n", 2587 | "with open(\"tflite_model.tflite\", \"wb\") as f:\n", 2588 | "  f.write(tflite_model_quant)" 2589 | ], 2590 | "execution_count": null, 2591 | "outputs": [] 2592 | }, 2593 | { 2594 | "cell_type": "markdown", 2595 | "metadata": { 2596 | "id": "StbdFxpk6D_8" 2597 | }, 2598 | "source": [ 2599 | "### Test TF Lite model\n", 2600 | "\n", 2601 | "Since TensorFlow also supports loading TF Lite models using [`tensorflow.lite`](https://www.tensorflow.org/api_docs/python/tf/lite), we can also verify the functionality of the quantized model and compare its accuracy with the regular unquantized model inside the notebook." 2602 | ] 2603 | }, 2604 | { 2605 | "cell_type": "code", 2606 | "metadata": { 2607 | "id": "qZkEUTBr6Hd-" 2608 | }, 2609 | "source": [ 2610 | "import tensorflow.lite as tflite\n", 2611 | "\n", 2612 | "# Load the interpreter and allocate tensors\n", 2613 | "interpreter = tflite.Interpreter(\"tflite_model.tflite\")\n", 2614 | "interpreter.allocate_tensors()\n", 2615 | "\n", 2616 | "# Load input and output details\n", 2617 | "input_details = interpreter.get_input_details()[0]\n", 2618 | "output_details = interpreter.get_output_details()[0]\n", 2619 | "\n", 2620 | "# Set quantization values\n", 2621 | "input_scale, input_zero_point = input_details[\"quantization\"]\n", 2622 | "output_scale, output_zero_point = output_details[\"quantization\"]" 2623 | ], 2624 | "execution_count": null, 2625 | "outputs": [] 2626 | }, 2627 | { 2628 | "cell_type": "code", 2629 | "metadata": { 2630 | "id": "yLw9Yy7w6bwy" 2631 | }, 2632 | "source": [ 2633 | "# Calculate the number of correct predictions\n", 2634 | "correct = 0\n", 2635 | "test_ds_len = 0\n", 2636 | "\n", 2637 | "# Loop through entire test set\n", 2638 | "for x, y in test_ds.unbatch():\n", 2639 | "  # original shape is [124, 129, 1] expand to [1, 124, 129, 1]\n", 2640 | "  x = tf.expand_dims(x, 0).numpy()\n", 2641 | "  \n", 2642 | "  # quantize the input value\n", 2643 | "  if (input_scale, input_zero_point) != (0, 0):\n", 2644 | "    x = x / input_scale + input_zero_point\n", 2645 | "    x = x.astype(input_details['dtype'])\n", 2646 | "\n", 2647 | "  # set the input tensor on the interpreter\n", 2648 | "  interpreter.set_tensor(input_details[\"index\"], x)\n", 2649 | "  \n", 2650 | "  # run the model\n", 2651 | "  interpreter.invoke()\n", 2652 | "\n", 2653 | "  # Get output data from model and convert to fp32\n", 2654 | "  output_data = interpreter.get_tensor(output_details[\"index\"])\n", 2655 | "  output_data =
output_data.astype(np.float32)\n", 2656 | "\n", 2657 | "  # Dequantize the output\n", 2658 | "  if (output_scale, output_zero_point) != (0.0, 0):\n", 2659 | "    output_data = (output_data - output_zero_point) * output_scale\n", 2660 | "\n", 2661 | "  # convert output to category\n", 2662 | "  if output_data[0][0] >= 0.5:\n", 2663 | "    category = 1\n", 2664 | "  else:\n", 2665 | "    category = 0\n", 2666 | "  \n", 2667 | "  # add 1 if category == y\n", 2668 | "  correct += 1 if category == y.numpy() else 0\n", 2669 | "\n", 2670 | "  test_ds_len += 1" 2671 | ], 2672 | "execution_count": null, 2673 | "outputs": [] 2674 | }, 2675 | { 2676 | "cell_type": "code", 2677 | "metadata": { 2678 | "id": "R38f75_p61se" 2679 | }, 2680 | "source": [ 2681 | "accuracy = correct / test_ds_len\n", 2682 | "print(f\"Accuracy for quantized model is {accuracy*100:.2f}% on the test set.\")" 2683 | ], 2684 | "execution_count": null, 2685 | "outputs": [] 2686 | }, 2687 | { 2688 | "cell_type": "markdown", 2689 | "metadata": { 2690 | "id": "exRvsfKz7JPR" 2691 | }, 2692 | "source": [ 2693 | "## Deploy on Device" 2694 | ] 2695 | }, 2696 | { 2697 | "cell_type": "markdown", 2698 | "metadata": { 2699 | "id": "WcWf2CCK_dst" 2700 | }, 2701 | "source": [ 2702 | "#### Convert `.tflite` to `.h` file\n", 2703 | "\n", 2704 | "The RP2040 MCU on the boards we are deploying to does not have a built-in file system, which means we cannot use the .tflite file directly on the board. However, we can use the Linux `xxd` command to convert the .tflite file to a .h file which can then be compiled into the inference application in the next step." 2705 | ] 2706 | }, 2707 | { 2708 | "cell_type": "code", 2709 | "metadata": { 2710 | "id": "PWBsWuIN7MHx" 2711 | }, 2712 | "source": [ 2713 | "%%shell\n", 2714 | "echo \"alignas(8) const unsigned char tflite_model[] = {\" > tflite_model.h\n", 2715 | "cat tflite_model.tflite | xxd -i >> tflite_model.h\n", 2716 | "echo \"};\" >> tflite_model.h" 2717 | ], 2718 | "execution_count": null, 2719 | "outputs": [] 2720 | }, 2721 | { 2722 | "cell_type": "markdown", 2723 | "metadata": { 2724 | "id": "x14EFzyN_mk_" 2725 | }, 2726 | "source": [ 2727 | "#### Inference Application\n", 2728 | "\n", 2729 | "We now have a model that is ready to be deployed to the device! We’ve created an application template for inference which can be compiled with the .h file that we’ve generated for the model.\n", 2730 | "\n", 2731 | "The C++ application uses the `pico-sdk` as the base, along with the `CMSIS-DSP`, `pico-tflmicro`, and `Microphone Library for Pico` libraries. Its general structure is as follows:\n", 2732 | " 1. Initialization\n", 2733 | "    1. Configure the board's built-in LED for output. The application will map the brightness of the LED to the output of the model. (0.0 LED off, 1.0 LED on with full brightness)\n", 2734 | "    1. Set up the TF Lite library and TF Lite model for inference\n", 2735 | "    1. Set up the CMSIS-DSP based DSP pipeline\n", 2736 | "    1. Set up and start the microphone for real-time audio\n", 2737 | " 1. Inference loop\n", 2738 | "    1. Wait for 128 * 4 = 512 new audio samples from the microphone\n", 2739 | "    1. Shift the spectrogram array over by 4 columns\n", 2740 | "    1. Shift the audio input buffer over by 128 * 4 = 512 samples and copy in the new samples\n", 2741 | "    1. Calculate 4 new spectrogram columns for the updated input buffer\n", 2742 | "    1. Perform inference on the spectrogram data\n", 2743 | "    1.
Map the inference output value to the on-board LED’s brightness and output the status to the USB port\n", 2744 | "\n", 2745 | "In order to run in real time, each cycle of the inference loop must take under (512 / 16000) = 0.032 seconds or 32 milliseconds. The model we’ve trained and converted takes 24 ms for inference, which gives us ~8 ms for the other operations in the loop.\n", 2746 | "\n", 2747 | "The value of 128 was used above to match the stride of 128 used in the training pipeline for the spectrogram. A shift of 4 columns was used so that the loop fits within the real-time constraints.\n", 2748 | "\n", 2749 | "The source code for the inference application can be found on GitHub: https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/tree/main/inference-app\n", 2750 | "\n", 2751 | "**Note:** We have already cloned this project in the setup steps from earlier.\n" 2752 | ] 2753 | }, 2754 | { 2755 | "cell_type": "markdown", 2756 | "metadata": { 2757 | "id": "mbJTbrhI2Lm7" 2758 | }, 2759 | "source": [ 2760 | "Now we can copy the updated `tflite_model.h` file over:" 2761 | ] 2762 | }, 2763 | { 2764 | "cell_type": "code", 2765 | "metadata": { 2766 | "id": "ieVMgugZ-k5L" 2767 | }, 2768 | "source": [ 2769 | "!cp tflite_model.h inference-app/src/tflite_model.h" 2770 | ], 2771 | "execution_count": null, 2772 | "outputs": [] 2773 | }, 2774 | { 2775 | "cell_type": "markdown", 2776 | "metadata": { 2777 | "id": "0NehBUZd_ydi" 2778 | }, 2779 | "source": [ 2780 | "#### Compile Inference Application\n", 2781 | "\n", 2782 | "Once again we can use `cmake` to set up the project before compiling it:" 2783 | ] 2784 | }, 2785 | { 2786 | "cell_type": "code", 2787 | "metadata": { 2788 | "id": "bPswlG9u_AfN" 2789 | }, 2790 | "source": [ 2791 | "%%shell\n", 2792 | "cd inference-app\n", 2793 | "mkdir -p build\n", 2794 | "cd build\n", 2795 | "cmake .. -DPICO_BOARD=${PICO_BOARD}" 2796 | ], 2797 | "execution_count": null, 2798 | "outputs": [] 2799 | }, 2800 | { 2801 | "cell_type": "markdown", 2802 | "metadata": { 2803 | "id": "9m89TzMy2r01" 2804 | }, 2805 | "source": [ 2806 | "Then use `make` to compile it:" 2807 | ] 2808 | }, 2809 | { 2810 | "cell_type": "code", 2811 | "metadata": { 2812 | "id": "Sm9D9Bsv2mre" 2813 | }, 2814 | "source": [ 2815 | "%%shell\n", 2816 | "cd inference-app/build\n", 2817 | "\n", 2818 | "make -j" 2819 | ], 2820 | "execution_count": null, 2821 | "outputs": [] 2822 | }, 2823 | { 2824 | "cell_type": "markdown", 2825 | "metadata": { 2826 | "id": "8PIvWvzJABRb" 2827 | }, 2828 | "source": [ 2829 | "#### Flash inferencing application to board\n", 2830 | "\n", 2831 | "You’ll need to put the board into USB boot ROM mode again to load the new application to it. If you are using a WebUSB API enabled browser like Google Chrome, you can directly flash the image onto the board from within Google Colab!
Otherwise, you can manually download the .uf2 file to your computer and then drag it onto the USB disk for the RP2040 board.\n", 2832 | "\n", 2833 | "**Note for Windows**: If you are using Windows you must install WinUSB drivers in order to use WebUSB; you can do so by following the instructions found [here](https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/blob/main/windows.md).\n", 2834 | "\n", 2835 | "**Note for Linux**: If you are using Linux you must configure udev in order to use WebUSB; you can do so by following the instructions found [here](https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/blob/main/linux.md).\n", 2836 | "\n", 2837 | " * SparkFun MicroMod\n", 2838 | "   * Plug the USB-C cable into the board and your PC to power the board\n", 2839 | "   * While holding down the BOOT button on the board, tap the RESET button\n", 2840 | "\n", 2841 | " * Raspberry Pi Pico\n", 2842 | "   * Plug the USB Micro cable into your PC, but do NOT plug in the Pico side.\n", 2843 | "   * While holding down the white BOOTSEL button, plug in the micro USB cable to the Pico\n", 2844 | "\n", 2845 | "\n", 2846 | "Then run the code cell below and click the \"Flash\" button." 2847 | ] 2848 | }, 2849 | { 2850 | "cell_type": "code", 2851 | "metadata": { 2852 | "id": "CsWtAuAiAFYc" 2853 | }, 2854 | "source": [ 2855 | "from colab_utils.pico import flash_pico\n", 2856 | "\n", 2857 | "flash_pico('inference-app/build/pico_inference_app.bin')" 2858 | ], 2859 | "execution_count": null, 2860 | "outputs": [] 2861 | }, 2862 | { 2863 | "cell_type": "markdown", 2864 | "metadata": { 2865 | "id": "0qFp9s1I0P7G" 2866 | }, 2867 | "source": [ 2868 | "### Monitoring the Inference on the board\n", 2869 | "\n", 2870 | "\n", 2871 | "Now that the inference application is running on the board you can observe it in action in two ways:\n", 2872 | "\n", 2873 | " 1. Visually by observing the brightness of the LED on the board. It should remain off or dim when no fire alarm sound is present, and be on when a fire alarm sound is present.\n", 2874 | "\n", 2875 | " 1. Connecting to the board’s USB serial port to view output from the inference application. If you are using a [Web Serial API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Serial_API) enabled browser like Google Chrome, this can be done directly from Google Colab!\n", 2876 | "\n", 2877 | "#### Test Audio\n", 2878 | "\n", 2879 | "Use the code cell below to play back the fire alarm sounds used during training from your computer.
You may need to adjust your computer's speaker volume.\n" 2880 | ] 2881 | }, 2882 | { 2883 | "cell_type": "code", 2884 | "metadata": { 2885 | "id": "4gqmp3nPtp6P" 2886 | }, 2887 | "source": [ 2888 | "for wav, _, _ in fire_alarm_wav_ds:\n", 2889 | "  display.display(display.Audio(wav, rate=sample_rate))" 2890 | ], 2891 | "execution_count": null, 2892 | "outputs": [] 2893 | }, 2894 | { 2895 | "cell_type": "markdown", 2896 | "metadata": { 2897 | "id": "w2ufVilt4BKU" 2898 | }, 2899 | "source": [ 2900 | "#### Serial Monitor\n", 2901 | "\n", 2902 | "Run the code cell below and then click the \"Connect Port\" button to view the serial output from the board:" 2903 | ] 2904 | }, 2905 | { 2906 | "cell_type": "code", 2907 | "metadata": { 2908 | "id": "RAvIj39m3fUa" 2909 | }, 2910 | "source": [ 2911 | "from colab_utils.serial_monitor import run_serial_monitor\n", 2912 | "\n", 2913 | "run_serial_monitor()" 2914 | ], 2915 | "execution_count": null, 2916 | "outputs": [] 2917 | }, 2918 | { 2919 | "cell_type": "markdown", 2920 | "metadata": { 2921 | "id": "YoQzVwGA41Eq" 2922 | }, 2923 | "source": [ 2924 | "## Improving the model\n", 2925 | "\n", 2926 | "You now have the first version of the model deployed to the board, and it is performing inference on live 16 kHz audio data!\n", 2927 | "\n", 2928 | "Test out various sounds to see if the model has the expected output. Maybe the fire alarm sound is being falsely detected (false positive) or not detected when it should be (false negative). \n", 2929 | "\n", 2930 | "If this occurs, you can record new audio data for the scenario(s) by flashing the USB microphone application firmware to the board, recording the data for training, re-training the model and converting to TF Lite format, and re-compiling + flashing the inference application to the board.\n", 2931 | "\n", 2932 | "Supervised machine learning models can generally only be as good as the training data they are trained with, so additional training data for these scenarios might help. You can also try to experiment with changing the model architecture or feature extraction process - but keep in mind that your model must be small enough and fast enough to run on the RP2040 MCU!\n", 2933 | "\n", 2934 | "## Conclusion\n", 2935 | "\n", 2936 | "This guide covered an end-to-end flow of how to train a custom audio classifier model to run locally on a development board that uses an Arm Cortex-M0+ processor. Google Colab was used to train the model using Transfer Learning techniques along with a smaller dataset and data augmentation techniques. We also collected our own data from the microphone that is used at inference time by loading a USB microphone application onto the board, and extending Colab’s features with the Web Audio API and custom JavaScript.\n", 2937 | "\n", 2938 | "The training side of the project combined Google’s Colab service and Chrome browser with the open source TensorFlow library. The inference application captured audio data from a digital microphone, used Arm’s CMSIS-DSP library for the feature extraction stage, then used TensorFlow Lite for Microcontrollers with Arm CMSIS-NN accelerated kernels to perform inference with an 8-bit quantized model that classified a real-time 16 kHz audio input on an Arm Cortex-M0+ processor.\n", 2939 | "\n", 2940 | "The Web Audio API, Web USB API, and Web Serial API features of Google Chrome were used to extend Google Colab’s functionality to interact with the development board.
This allowed us to experiment with and develop our application entirely with a web browser and deploy it to a constrained development board for on-device inference.\n", 2941 | "\n", 2942 | "Since the ML processing was performed on the development board's RP2040 MCU, privacy was preserved as no raw audio data left the device at inference time.\n", 2943 | "\n" 2944 | ] 2945 | } 2946 | ] 2947 | } 2948 | --------------------------------------------------------------------------------