├── .gitignore
├── LICENSE
├── README.md
├── config_parser.py
├── docs
│   └── config_file_explained.md
├── download_gspeech_v2.sh
├── inference.py
├── label_map.json
├── make_data_list.py
├── models
│   ├── __init__.py
│   └── kwmlp.py
├── notebooks
│   ├── README.md
│   ├── keyword_mlp_tutorial.ipynb
│   └── mlp-mixer-audio.ipynb
├── requirements.txt
├── resources
│   ├── kw-mlp.png
│   └── wandb.png
├── sample_configs
│   └── base_config.yaml
├── train.py
├── utils
│   ├── __init__.py
│   ├── augment.py
│   ├── dataset.py
│   ├── loss.py
│   ├── misc.py
│   ├── opt.py
│   ├── scheduler.py
│   └── trainer.py
└── window_inference.py
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__/
2 | .vscode/
3 | env/
4 | configs/
5 | data/
6 | runs/
7 | wandb/
8 | notes.txt
9 | tests
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2021 Mashrur Mahmud Morshed and Ahmad Omar Ahsan
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Keyword-MLP
2 |
3 | Official PyTorch implementation of [*Attention-Free Keyword Spotting*](https://arxiv.org/abs/2110.07749v1).
4 |
5 |
6 |
7 |
8 |
9 | ## Setup
10 |
11 | ```
12 | pip install -r requirements.txt
13 | ```
14 |
15 | ## Dataset
16 | To download the Google Speech Commands V2 dataset, run the provided shell script as shown below. It downloads and extracts the dataset to the destination path you provide.
17 |
18 | ```
19 | sh ./download_gspeech_v2.sh <destination_path>
20 | ```
21 |
22 | ## Training
23 |
24 | The Speech Commands V2 dataset provides two files: `validation_list.txt` and `testing_list.txt`. Run:
25 |
26 | ```
27 | python make_data_list.py -v -t -d -o