├── .gitignore ├── LICENSE.txt ├── README.md ├── audio_please_unzip.zip ├── data └── info_file.csv ├── environments ├── requirements.txt └── umap_tut_env.yaml ├── example_imgs └── tool_image.png ├── functions ├── __init__.py ├── __pycache__ │ ├── __init__.cpython-37.pyc │ ├── audio_functions.cpython-37.pyc │ ├── custom_dist_functions_umap.cpython-37.pyc │ ├── evaluation_functions.cpython-37.pyc │ ├── plot_functions.cpython-37.pyc │ └── preprocessing_functions.cpython-37.pyc ├── audio_functions.py ├── custom_dist_functions_umap.py ├── evaluation_functions.py ├── plot_functions.py └── preprocessing_functions.py ├── notebooks ├── .ipynb_checkpoints │ ├── 01_generate_spectrograms-checkpoint.ipynb │ ├── 02a_generate_UMAP_basic-checkpoint.ipynb │ ├── 02b_generate_UMAP_timeshift-checkpoint.ipynb │ ├── 03_UMAP_clustering-checkpoint.ipynb │ ├── 03_UMAP_eval-checkpoint.ipynb │ ├── 03_UMAP_viz_part_1_prep-checkpoint.ipynb │ └── 03_UMAP_viz_part_2_tool-checkpoint.ipynb ├── 01_generate_spectrograms.ipynb ├── 02a_generate_UMAP_basic.ipynb ├── 02b_generate_UMAP_timeshift.ipynb ├── 03_UMAP_clustering.ipynb ├── 03_UMAP_eval.ipynb ├── 03_UMAP_viz_part_1_prep.ipynb └── 03_UMAP_viz_part_2_tool.ipynb ├── parameters ├── __pycache__ │ └── spec_params.cpython-37.pyc └── spec_params.py └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | /audio 2 | /data_all 3 | .ipynb_checkpoints 4 | /notebooks/.ipynb_checkpoints 5 | 6 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | This repository contains code and the exemplary audio data file "audio_please_unzip.zip". 2 | Separate licenses apply to code vs. the exemplary audio data file. 
3 | 4 | For all code, the following MIT-license applies: 5 | 6 | Copyright (c) 2022 Mara Thomas 7 | 8 | Permission is hereby granted, free of charge, to any person obtaining 9 | a copy of this software and associated documentation files (with the 10 | exception of the file "audio_please_unzip.zip"), to deal in the 11 | Software without restriction, including 12 | without limitation the rights to use, copy, modify, merge, publish, 13 | distribute, sublicense, and/or sell copies of the Software, and to 14 | permit persons to whom the Software is furnished to do so, subject to 15 | the following conditions: 16 | 17 | The above copyright notice and this permission notice shall be 18 | included in all copies or substantial portions of the Software. 19 | 20 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 21 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 22 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 23 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 24 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 25 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 26 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 27 | 28 | Only the exemplary audio data file "audio_please_unzip.zip" is exempt from the MIT-license. This file is under exclusive copyright, meaning that it is not allowed to copy, distribute, or modify it. 
39 | 40 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Tutorial for generating and evaluating latent-space representations of vocalizations using UMAP 2 | 3 | [![DOI](https://zenodo.org/badge/400540617.svg)](https://zenodo.org/badge/latestdoi/400540617) 4 | 5 | 6 | This tutorial contains a sequence of Jupyter notebooks that help you generate latent-space representations from input audio files, evaluate them, and build an interactive visualization. 7 | 8 |

9 | 10 |

11 | 12 | ## 1. Structure 13 | 14 | Keep the directory structure the way it is and put your data in the 'audio' and 'data' folders. Do not change the folder structure or the location of notebooks or function files! 15 | 16 | ├── notebooks <- contains analysis scripts 17 | │ ├── 01_generate_spectrograms.ipynb 18 | │ ├── ... 19 | │ └── ... 20 | ├── audio <- ! put your input soundfiles in this folder or unzip the provided example audio! 21 | │ ├── call_1.wav 22 | │ ├── call_2.wav 23 | │ └── ... 24 | ├── functions <- contains functions that will be called in analysis scripts 25 | │ ├── audio_functions.py 26 | │ ├── ... 27 | │ └── ... 28 | ├── data <- ! put a .csv metadata file of your input in this folder or use the provided example csv! 29 | │ └── info_file.csv 30 | ├── parameters 31 | │ └── spec_params.py <- this file contains parameters for spectrogramming (fft_win, fft_hop...) 32 | ├── environments 33 | │ └── umap_tut_env.yaml <- conda environment file (linux) 34 | ├── ... 35 | 36 | 37 | ## 2. Requirements 38 | 39 | ### 2.1. Packages, installations etc. 40 | 41 | Python >= 3.8 is recommended. I recommend __installing the packages manually__, but a conda environment file is also included in /environments (created on Linux; dependencies may differ on other operating systems!). 42 | 43 | For manual install, these are the core packages: 44 | 45 | >umap-learn 46 | 47 | >librosa 48 | 49 | >ipywidgets 50 | 51 | >pandas=1.2.4 52 | 53 | >seaborn 54 | 55 | >pysoundfile=0.10.3 56 | 57 | >voila 58 | 59 | >hdbscan 60 | 61 | >plotly 62 | 63 | >graphviz 64 | 65 | >networkx 66 | 67 | >pygraphviz 68 | 69 | 70 | Make sure to enable jupyter widgets with: 71 | >jupyter nbextension enable --py widgetsnbextension 72 | 73 | 74 | __NOTE__: Graphviz, networkx and pygraphviz are only required for one plot, so if you fail to install them, you can still run 99% of the code. 75 | 76 | 77 | #### This is an example for a manual installation on Windows with Python 3.8
and conda: 78 | 79 | If you haven't worked with Python and/or conda (a package manager), an easy way to get started is to install Anaconda or Miniconda (only the basic/core parts of Anaconda) first: 80 | 81 | - Anaconda: [https://www.anaconda.com/products/individual-d](https://www.anaconda.com/products/individual-d) 82 | 83 | - Miniconda: [https://docs.conda.io/en/latest/miniconda.html](https://docs.conda.io/en/latest/miniconda.html) 84 | 85 | After successful installation, create and activate your environment with conda: 86 | 87 | ``` 88 | conda create --name my_env 89 | conda activate my_env 90 | ``` 91 | 92 | Then, install the required core packages: 93 | 94 | ``` 95 | conda install -c conda-forge umap-learn 96 | conda install -c conda-forge librosa 97 | conda install ipywidgets 98 | conda install pandas=1.2.4 99 | conda install seaborn 100 | conda install -c conda-forge pysoundfile=0.10.3 101 | conda install -c conda-forge voila 102 | conda install -c anaconda graphviz 103 | conda install -c conda-forge hdbscan 104 | conda install -c plotly plotly 105 | conda install networkx 106 | conda install -c conda-forge pygraphviz 107 | ``` 108 | 109 | Finally, enable ipywidgets in Jupyter Notebook: 110 | 111 | ``` 112 | jupyter nbextension enable --py widgetsnbextension 113 | ``` 114 | 115 | Clone this repository or download it as a zip and unpack it. Make sure the subdirectory structure matches the one described in section "Structure" and prepare your input files as described in section "Input requirements". 116 | 117 | 118 | Start Jupyter Notebook with 119 | ``` 120 | jupyter notebook 121 | ``` 122 | 123 | and select the first notebook file to start your analysis (see section "Where to start"). 124 | 125 | 126 | ### 2.2. Input requirements 127 | 128 | #### 2.2.1. Audio files 129 | 130 | All audio input files need to be in a subfolder /audio. This folder should not contain any other files. 
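Before running the notebooks, it can be worth double-checking that /audio really contains nothing but sound files. A minimal sketch (the folder name and the .wav-only assumption mirror the description above; adjust the suffix check if your recordings use another format):

```python
from pathlib import Path

def check_audio_folder(audio_dir="audio"):
    """List the .wav files in audio_dir and warn about anything else.

    'audio' is the subfolder described above; pass a different path if
    your working directory is not the repository root.
    """
    audio_dir = Path(audio_dir)
    # separate sound files from anything that should not be there
    wavs = sorted(p.name for p in audio_dir.iterdir() if p.suffix.lower() == ".wav")
    others = sorted(p.name for p in audio_dir.iterdir() if p.suffix.lower() != ".wav")
    if others:
        print("Warning: non-wav files found in /audio:", others)
    return wavs
```

Running `check_audio_folder()` from the repository root should return only your call files.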
131 | 132 | To use the provided example data of meerkat calls, please unzip the file 'audio_please_unzip.zip' and verify that all audio files have been unpacked into an /audio folder according to the structure described in Section 1. 133 | 134 | To use your own data, create a subfolder "/audio" and put your sound files there (make sure that the /audio folder contains __only__ your input files, nothing else). Each sound file should contain a single vocalization or syllable. 135 | (You may have to detect and extract such vocal elements first, if working with acoustic recordings.) 136 | 137 | 138 | Ideally, start and end of the sound file correspond exactly to start and end of the vocalization. 139 | If there are delays in the onset of the vocalizations, these should be the same for all sound files. 140 | Otherwise, vocalizations may appear dissimilar or distant in latent space simply because their onset times are different. 141 | If it is not possible to mark the start times correctly, use the timeshift option to generate UMAP embeddings, 142 | but note that it comes at the cost of increased computation time. 143 | 144 | #### 2.2.2. [Optional: Info file] 145 | 146 | Use the provided info_file.csv file for the example audio data or, if you are using your own data, add a ";"-separated info_file.csv file with headers containing the filenames of the input audio, some labels and any other additional metadata (if available) in the subfolder "/data". 147 | If some or all labels are unknown, there should still be a label column and unknown labels should be marked with "unknown". 148 | 149 | Structure of info_file.csv must be: 150 | 151 | | filename | label | ... | .... 152 | ----------------------------------------- 153 | | call_1.wav | alarm | ... | .... 154 | | call_2.wav | contact | ... | .... 155 | | ... | ... | ... | ....
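To check that your metadata file matches this structure before starting the analysis, you can load it with pandas. A sketch (the default path and the required columns follow the description above):

```python
import pandas as pd

def load_info_file(path="data/info_file.csv"):
    """Load the ';'-separated metadata file and verify required columns."""
    info = pd.read_csv(path, sep=";")
    # 'filename' and 'label' are the two columns the tutorial relies on
    missing = {"filename", "label"} - set(info.columns)
    if missing:
        raise ValueError(f"info_file.csv is missing columns: {sorted(missing)}")
    return info
```

Remember that unknown labels should appear as the literal string "unknown" in the label column.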
156 | 157 | If you don't provide an info_file.csv, a default one will be generated, containing ALL files that are found in /audio and with all vocalizations labelled as "unknown". 158 | 159 | 160 | ## 3. Where to start 161 | 162 | 1. Start with 01_generate_spectrograms.ipynb to generate spectrograms from input audio files. 163 | 2. Generate latent space representations with 02a_generate_UMAP_basic.ipynb OR 02b_generate_UMAP_timeshift.ipynb. 164 | 165 | 3. You can now 166 | - __Evaluate__ the latent space representation with 03_UMAP_eval.ipynb, 167 | 168 | - __Visualize__ the latent space representation by running 03_UMAP_viz_part_1_prep.ipynb and 03_UMAP_viz_part_2_tool.ipynb or 169 | 170 | - __Apply clustering__ on the latent space representation with 03_UMAP_clustering.ipynb 171 | 172 | 173 | ## 4. Data accessibility 174 | 175 | All code is under the MIT license. Exclusive copyright applies to the audio data file (audio_please_unzip.zip), meaning that you cannot reproduce, distribute or create derivative works from it. You may use this data to test the provided code, but not for any other purposes. If you are interested in using the exemplary data beyond the sole purpose of testing the provided code, please get in touch with Prof. Marta Manser. See the license for details. 
176 | -------------------------------------------------------------------------------- /audio_please_unzip.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/audio_please_unzip.zip -------------------------------------------------------------------------------- /environments/requirements.txt: -------------------------------------------------------------------------------- 1 | # This file may be used to create an environment using: 2 | # $ conda create --name --file 3 | # platform: linux-64 4 | _libgcc_mutex=0.1=main 5 | _openmp_mutex=4.5=1_gnu 6 | anyio=3.1.0=py39hf3d152e_0 7 | appdirs=1.4.4=pyh9f0ad1d_0 8 | argon2-cffi=20.1.0=py39h27cfd23_1 9 | async_generator=1.10=py_0 10 | attrs=20.2.0=py_0 11 | audioread=2.1.9=py39hf3d152e_0 12 | backcall=0.2.0=py_0 13 | blas=1.0=mkl 14 | bleach=3.2.1=py_0 15 | brotlipy=0.7.0=py39h3811e60_1001 16 | bzip2=1.0.8=h7f98852_4 17 | ca-certificates=2021.5.30=ha878542_0 18 | certifi=2021.5.30=py39hf3d152e_0 19 | cffi=1.14.5=py39he32792d_0 20 | chardet=4.0.0=py39hf3d152e_1 21 | cryptography=3.4.7=py39hbca0aa6_0 22 | cycler=0.10.0=py39h06a4308_0 23 | dbus=1.13.18=hb2f20db_0 24 | decorator=5.0.9=pyhd8ed1ab_0 25 | defusedxml=0.6.0=py_0 26 | entrypoints=0.3=py39h06a4308_0 27 | expat=2.4.1=h2531618_2 28 | ffmpeg=4.3.1=hca11adc_2 29 | fontconfig=2.13.1=h6c09931_0 30 | freetype=2.10.4=h5ab3b9f_0 31 | gettext=0.19.8.1=h0b5b191_1005 32 | glib=2.68.2=h36276a3_0 33 | gmp=6.2.1=h58526e2_0 34 | gnutls=3.6.13=h85f3911_1 35 | gst-plugins-base=1.14.0=h8213a91_2 36 | gstreamer=1.14.0=h28cd5cc_2 37 | icu=58.2=he6710b0_3 38 | idna=2.10=pyh9f0ad1d_0 39 | importlib-metadata=2.0.0=py_1 40 | importlib_metadata=2.0.0=1 41 | intel-openmp=2021.2.0=h06a4308_610 42 | ipykernel=5.3.4=py39hb070fc8_0 43 | ipython=7.22.0=py39hb070fc8_0 44 | ipython_genutils=0.2.0=pyhd3eb1b0_1 45 | ipywidgets=7.5.1=py_1 46 | jedi=0.17.2=py39h06a4308_1 
47 | jinja2=2.11.2=py_0 48 | joblib=1.0.1=pyhd8ed1ab_0 49 | jpeg=9b=h024ee3a_2 50 | jsonschema=3.2.0=py_2 51 | jupyter_client=6.1.7=py_0 52 | jupyter_core=4.7.1=py39h06a4308_0 53 | jupyter_server=1.8.0=pyhd8ed1ab_0 54 | jupyterlab_pygments=0.1.2=pyh9f0ad1d_0 55 | kiwisolver=1.3.1=py39h2531618_0 56 | lame=3.100=h7f98852_1001 57 | lcms2=2.12=h3be6417_0 58 | ld_impl_linux-64=2.35.1=h7274673_9 59 | libffi=3.3=he6710b0_2 60 | libflac=1.3.3=h9c3ff4c_1 61 | libgcc-ng=9.3.0=h5101ec6_17 62 | libgfortran-ng=7.5.0=ha8ba4b0_17 63 | libgfortran4=7.5.0=ha8ba4b0_17 64 | libgomp=9.3.0=h5101ec6_17 65 | libllvm10=10.0.1=he513fc3_3 66 | libogg=1.3.4=h7f98852_1 67 | libopus=1.3.1=h7f98852_1 68 | libpng=1.6.37=hbc83047_0 69 | librosa=0.8.1=pyhd8ed1ab_0 70 | libsndfile=1.0.31=h9c3ff4c_1 71 | libsodium=1.0.18=h7b6447c_0 72 | libstdcxx-ng=9.3.0=hd4cf53a_17 73 | libtiff=4.2.0=h85742a9_0 74 | libuuid=1.0.3=h1bed415_2 75 | libvorbis=1.3.7=h9c3ff4c_0 76 | libwebp-base=1.2.0=h27cfd23_0 77 | libxcb=1.14=h7b6447c_0 78 | libxml2=2.9.10=hb55368b_3 79 | llvmlite=0.36.0=py39h1bbdace_0 80 | lz4-c=1.9.3=h2531618_0 81 | markupsafe=2.0.1=py39h27cfd23_0 82 | matplotlib=3.3.4=py39h06a4308_0 83 | matplotlib-base=3.3.4=py39h62a2d02_0 84 | mistune=0.8.4=py39h27cfd23_1000 85 | mkl=2021.2.0=h06a4308_296 86 | mkl-service=2.3.0=py39h27cfd23_1 87 | mkl_fft=1.3.0=py39h42c9631_2 88 | mkl_random=1.2.1=py39ha9443f7_2 89 | nbclient=0.5.3=pyhd8ed1ab_0 90 | nbconvert=6.0.7=py39hf3d152e_3 91 | nbformat=5.0.8=py_0 92 | ncurses=6.2=he6710b0_1 93 | nest-asyncio=1.5.1=pyhd8ed1ab_0 94 | nettle=3.6=he412f7d_0 95 | notebook=6.4.0=py39h06a4308_0 96 | numba=0.53.1=py39h56b8d98_1 97 | numpy=1.20.2=py39h2d18471_0 98 | numpy-base=1.20.2=py39hfae3a4d_0 99 | olefile=0.46=py_0 100 | openh264=2.1.1=h780b84a_0 101 | openssl=1.1.1k=h7f98852_0 102 | packaging=20.9=pyh44b312d_0 103 | pandas=1.2.4=py39h2531618_0 104 | pandoc=2.11=hb0f4dca_0 105 | pandocfilters=1.4.3=py39h06a4308_1 106 | parso=0.7.0=py_0 107 | pcre=8.44=he6710b0_0 108 | 
pexpect=4.8.0=pyhd3eb1b0_3 109 | pickleshare=0.7.5=pyhd3eb1b0_1003 110 | pillow=8.2.0=py39he98fc37_0 111 | pip=21.1.2=py39h06a4308_0 112 | plotly=4.14.3=py_0 113 | pooch=1.4.0=pyhd8ed1ab_0 114 | prometheus_client=0.8.0=py_0 115 | prompt-toolkit=3.0.8=py_0 116 | ptyprocess=0.7.0=pyhd3eb1b0_2 117 | pycparser=2.20=pyh9f0ad1d_2 118 | pygments=2.7.1=py_0 119 | pynndescent=0.5.2=pyh44b312d_0 120 | pyopenssl=20.0.1=pyhd8ed1ab_0 121 | pyparsing=2.4.7=pyhd3eb1b0_0 122 | pyqt=5.9.2=py39h2531618_6 123 | pyrsistent=0.17.3=py39h27cfd23_0 124 | pysocks=1.7.1=py39hf3d152e_3 125 | pysoundfile=0.10.3.post1=pyhd3deb0d_0 126 | python=3.9.5=h12debd9_4 127 | python-dateutil=2.8.1=pyhd3eb1b0_0 128 | python_abi=3.9=1_cp39 129 | pytz=2021.1=pyhd3eb1b0_0 130 | pyzmq=20.0.0=py39h2531618_1 131 | qt=5.9.7=h5867ecd_1 132 | readline=8.1=h27cfd23_0 133 | requests=2.25.1=pyhd3deb0d_0 134 | resampy=0.2.2=py_0 135 | retrying=1.3.3=py_2 136 | scikit-learn=0.24.2=py39ha9443f7_0 137 | scipy=1.6.2=py39had2a1c9_1 138 | seaborn=0.11.1=pyhd3eb1b0_0 139 | send2trash=1.5.0=pyhd3eb1b0_1 140 | setuptools=52.0.0=py39h06a4308_0 141 | sip=4.19.13=py39h2531618_0 142 | six=1.16.0=pyhd3eb1b0_0 143 | sniffio=1.2.0=py39hf3d152e_1 144 | sqlite=3.35.4=hdfb4753_0 145 | tbb=2020.2=h4bd325d_4 146 | terminado=0.9.4=py39h06a4308_0 147 | testpath=0.4.4=py_0 148 | threadpoolctl=2.1.0=pyh5ca1d4c_0 149 | tk=8.6.10=hbc83047_0 150 | tornado=6.1=py39h27cfd23_0 151 | traitlets=5.0.5=py_0 152 | tzdata=2020f=h52ac0ba_0 153 | umap-learn=0.5.1=py39hf3d152e_1 154 | urllib3=1.26.5=pyhd8ed1ab_0 155 | voila=0.2.10=pyhd8ed1ab_0 156 | wcwidth=0.2.5=py_0 157 | webencodings=0.5.1=py39h06a4308_1 158 | websocket-client=0.57.0=py39hf3d152e_4 159 | wheel=0.36.2=pyhd3eb1b0_0 160 | widgetsnbextension=3.5.1=py39h06a4308_0 161 | x264=1!161.3030=h7f98852_1 162 | xz=5.2.5=h7b6447c_0 163 | zeromq=4.3.3=he6710b0_3 164 | zipp=3.3.1=py_0 165 | zlib=1.2.11=h7b6447c_3 166 | zstd=1.4.9=haebb681_0 167 | 
-------------------------------------------------------------------------------- /environments/umap_tut_env.yaml: -------------------------------------------------------------------------------- 1 | name: umap_tut_env 2 | channels: 3 | - conda-forge 4 | - defaults 5 | dependencies: 6 | - _libgcc_mutex=0.1 7 | - _openmp_mutex=4.5 8 | - anyio=2.2.0 9 | - appdirs=1.4.4 10 | - argon2-cffi=20.1.0 11 | - async_generator=1.10 12 | - attrs=20.3.0 13 | - audioread=2.1.9 14 | - babel=2.9.0 15 | - backcall=0.2.0 16 | - blas=1.0 17 | - bleach=3.3.0 18 | - brotlipy=0.7.0 19 | - bzip2=1.0.8 20 | - ca-certificates=2021.5.30 21 | - cachecontrol=0.12.6 22 | - cairo=1.14.12 23 | - certifi=2021.5.30 24 | - cffi=1.14.5 25 | - chardet=4.0.0 26 | - cryptography=3.4.7 27 | - cycler=0.10.0 28 | - cython=0.29.24 29 | - dbus=1.13.18 30 | - decorator=5.0.7 31 | - defusedxml=0.7.1 32 | - entrypoints=0.3 33 | - expat=2.3.0 34 | - ffmpeg=4.3.1 35 | - fontconfig=2.13.1 36 | - freetype=2.10.4 37 | - fribidi=1.0.10 38 | - gettext=0.19.8.1 39 | - glib=2.56.2 40 | - gmp=6.2.1 41 | - gnutls=3.6.13 42 | - graphite2=1.3.14 43 | - graphviz=2.40.1 44 | - gst-plugins-base=1.14.0 45 | - gstreamer=1.14.0 46 | - harfbuzz=1.8.8 47 | - hdbscan=0.8.27 48 | - hdmedians=0.14.2 49 | - icu=58.2 50 | - idna=2.10 51 | - importlib-metadata=3.10.0 52 | - importlib_metadata=3.10.0 53 | - iniconfig=1.1.1 54 | - intel-openmp=2021.2.0 55 | - ipykernel=5.3.4 56 | - ipython=7.22.0 57 | - ipython_genutils=0.2.0 58 | - ipywidgets=7.6.3 59 | - jedi=0.17.0 60 | - jinja2=2.11.3 61 | - joblib=1.0.1 62 | - jpeg=9d 63 | - json5=0.9.5 64 | - jsonschema=3.2.0 65 | - jupyter-packaging=0.7.12 66 | - jupyter_client=6.1.12 67 | - jupyter_core=4.7.1 68 | - jupyter_server=1.4.1 69 | - jupyterlab=3.0.14 70 | - jupyterlab_pygments=0.1.2 71 | - jupyterlab_server=2.4.0 72 | - jupyterlab_widgets=1.0.0 73 | - kiwisolver=1.3.1 74 | - lame=3.100 75 | - lcms2=2.12 76 | - ld_impl_linux-64=2.33.1 77 | - libffi=3.3 78 | - libflac=1.3.3 79 | - 
libgcc=7.2.0 80 | - libgcc-ng=9.3.0 81 | - libgfortran-ng=7.5.0 82 | - libgfortran4=7.5.0 83 | - libgomp=9.3.0 84 | - libllvm10=10.0.1 85 | - libogg=1.3.4 86 | - libopus=1.3.1 87 | - libpng=1.6.37 88 | - librosa=0.8.0 89 | - libsndfile=1.0.31 90 | - libsodium=1.0.18 91 | - libstdcxx-ng=9.3.0 92 | - libtiff=4.2.0 93 | - libuuid=1.0.3 94 | - libvorbis=1.3.7 95 | - libwebp-base=1.2.0 96 | - libxcb=1.14 97 | - libxml2=2.9.10 98 | - llvmlite=0.36.0 99 | - lockfile=0.12.2 100 | - lz4-c=1.9.3 101 | - markupsafe=1.1.1 102 | - matplotlib=3.3.4 103 | - matplotlib-base=3.3.4 104 | - mistune=0.8.4 105 | - mkl=2021.2.0 106 | - mkl-service=2.3.0 107 | - mkl_fft=1.3.0 108 | - mkl_random=1.2.1 109 | - more-itertools=8.8.0 110 | - msgpack-python=1.0.2 111 | - natsort=7.1.1 112 | - nbclassic=0.2.6 113 | - nbclient=0.5.3 114 | - nbconvert=6.0.7 115 | - nbformat=5.1.3 116 | - ncurses=6.2 117 | - nest-asyncio=1.5.1 118 | - nettle=3.6 119 | - networkx=2.5 120 | - nodejs=6.11.2 121 | - notebook=6.3.0 122 | - numba=0.53.1 123 | - numpy=1.20.1 124 | - numpy-base=1.20.1 125 | - olefile=0.46 126 | - openh264=2.1.1 127 | - openjpeg=2.4.0 128 | - openssl=1.1.1k 129 | - packaging=20.9 130 | - pandas=1.2.4 131 | - pandoc=2.12 132 | - pandocfilters=1.4.3 133 | - pango=1.42.4 134 | - parso=0.8.2 135 | - patsy=0.5.1 136 | - pcre=8.44 137 | - pexpect=4.8.0 138 | - pickleshare=0.7.5 139 | - pillow=8.1.2 140 | - pip=21.0.1 141 | - pixman=0.40.0 142 | - plotly=4.14.3 143 | - pluggy=0.13.1 144 | - pooch=1.3.0 145 | - prometheus_client=0.10.1 146 | - prompt-toolkit=3.0.17 147 | - ptyprocess=0.7.0 148 | - py=1.10.0 149 | - pycparser=2.20 150 | - pygments=2.8.1 151 | - pygraphviz=1.3 152 | - pynndescent=0.5.2 153 | - pyopenssl=20.0.1 154 | - pyparsing=2.4.7 155 | - pyqt=5.9.2 156 | - pyrsistent=0.17.3 157 | - pysocks=1.7.1 158 | - pysoundfile=0.10.3.post1 159 | - pytest=6.2.4 160 | - python=3.7.10 161 | - python-dateutil=2.8.1 162 | - python_abi=3.7 163 | - pytz=2021.1 164 | - pyzmq=20.0.0 165 | - qt=5.9.7 
166 | - readline=8.1 167 | - requests=2.25.1 168 | - resampy=0.2.2 169 | - retrying=1.3.3 170 | - scikit-bio=0.5.6 171 | - scikit-learn=0.24.1 172 | - scipy=1.6.2 173 | - seaborn=0.11.1 174 | - send2trash=1.5.0 175 | - setuptools=52.0.0 176 | - sip=4.19.8 177 | - six=1.15.0 178 | - sniffio=1.2.0 179 | - sqlite=3.35.4 180 | - statsmodels=0.12.2 181 | - tbb=2020.2 182 | - terminado=0.9.4 183 | - testpath=0.4.4 184 | - threadpoolctl=2.1.0 185 | - tk=8.6.10 186 | - toml=0.10.2 187 | - tornado=6.1 188 | - traitlets=5.0.5 189 | - typing_extensions=3.7.4.3 190 | - umap-learn=0.5.1 191 | - urllib3=1.26.4 192 | - voila=0.2.10 193 | - wcwidth=0.2.5 194 | - webencodings=0.5.1 195 | - wheel=0.36.2 196 | - widgetsnbextension=3.5.1 197 | - x264=1!161.3030 198 | - xz=5.2.5 199 | - zeromq=4.3.4 200 | - zipp=3.4.1 201 | - zlib=1.2.11 202 | - zstd=1.4.9 203 | - pip: 204 | - audeer==1.14.0 205 | - audiofile==0.4.2 206 | - pathlib2==2.3.5 207 | - sox==1.4.1 208 | - tqdm==4.60.0 209 | prefix: /home/mthomas/anaconda3/envs/umap_tut_env 210 | -------------------------------------------------------------------------------- /example_imgs/tool_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/example_imgs/tool_image.png -------------------------------------------------------------------------------- /functions/__init__.py: -------------------------------------------------------------------------------- 1 | # init 2 | -------------------------------------------------------------------------------- /functions/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/functions/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- 
/functions/__pycache__/audio_functions.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/functions/__pycache__/audio_functions.cpython-37.pyc -------------------------------------------------------------------------------- /functions/__pycache__/custom_dist_functions_umap.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/functions/__pycache__/custom_dist_functions_umap.cpython-37.pyc -------------------------------------------------------------------------------- /functions/__pycache__/evaluation_functions.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/functions/__pycache__/evaluation_functions.cpython-37.pyc -------------------------------------------------------------------------------- /functions/__pycache__/plot_functions.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/functions/__pycache__/plot_functions.cpython-37.pyc -------------------------------------------------------------------------------- /functions/__pycache__/preprocessing_functions.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/functions/__pycache__/preprocessing_functions.cpython-37.pyc -------------------------------------------------------------------------------- /functions/audio_functions.py: 
-------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import soundfile as sf 4 | import io 5 | import librosa 6 | from scipy.signal import butter, lfilter 7 | 8 | 9 | def generate_mel_spectrogram(data, rate, n_mels, window, fft_win , fft_hop, fmax, fmin=0): 10 | 11 | """ 12 | Function that generates mel spectrogram from audio data using librosa functions 13 | 14 | Parameters 15 | ---------- 16 | data: 1D numpy array (float) 17 | Audio data 18 | rate: numeric(integer) 19 | samplerate in Hz 20 | n_mels: numeric (integer) 21 | number of mel bands 22 | window: string 23 | spectrogram window generation type ('hann'...) 24 | fft_win: numeric (float) 25 | window length in s 26 | fft_hop: numeric (float) 27 | hop between window start in s 28 | 29 | Returns 30 | ------- 31 | result : 2D np.array 32 | Mel-transformed spectrogram, dB scale 33 | 34 | Example 35 | ------- 36 | >>> 37 | 38 | """ 39 | spectro = np.nan 40 | 41 | try: 42 | n_fft = int(fft_win * rate) 43 | hop_length = int(fft_hop * rate) 44 | 45 | s = librosa.feature.melspectrogram(y = data , 46 | sr = rate, 47 | n_mels = n_mels , 48 | fmax = fmax, 49 | fmin = fmin, 50 | n_fft = n_fft, 51 | hop_length = hop_length, 52 | window = window, 53 | win_length = n_fft) 54 | 55 | spectro = librosa.power_to_db(s, ref=np.max) 56 | except: 57 | print("Failed to generate spectrogram.") 58 | 59 | return spectro 60 | 61 | 62 | def generate_stretched_mel_spectrogram(data, sr, duration, n_mels, window, fft_win , fft_hop, MAX_DURATION): 63 | """ 64 | Function that generates stretched mel spectrogram from audio data using librosa functions 65 | 66 | Parameters 67 | ---------- 68 | data: 1D numpy array (float) 69 | Audio data 70 | sr: numeric(integer) 71 | samplerate in Hz 72 | duration: numeric (float) 73 | duration of audio in seconds 74 | n_mels: numeric (integer) 75 | number of mel bands 76 | window: string 77 | spectrogram window generation type ('hann'...) 
78 | fft_win: numeric (float) 79 | window length in s 80 | fft_hop: numeric (float) 81 | hop between window start in s 82 | 83 | Returns 84 | ------- 85 | result : 2D np.array 86 | stretched, mel-transformed spectrogram, dB scale 87 | ------- 88 | >>> 89 | 90 | """ 91 | n_fft = int(fft_win * sr) 92 | hop_length = int(fft_hop * sr) 93 | stretch_rate = duration/MAX_DURATION 94 | 95 | # generate normal spectrogram (NOT mel transformed) 96 | D = librosa.stft(y=data, 97 | n_fft = n_fft, 98 | hop_length = hop_length, 99 | window=window, 100 | win_length = n_fft 101 | ) 102 | 103 | # Stretch spectrogram using phase vocoder algorithm 104 | D_stretched = librosa.core.phase_vocoder(D, stretch_rate, hop_length=hop_length) 105 | D_stretched = np.abs(D_stretched)**2 106 | 107 | # mel transform 108 | spectro = librosa.feature.melspectrogram(S=D_stretched, 109 | sr=sr, 110 | n_mels=n_mels, 111 | fmax=4000) 112 | 113 | # Convert to db scale 114 | s = librosa.power_to_db(spectro, ref=np.max) 115 | 116 | return s 117 | 118 | def read_wavfile(filename, channel=0): 119 | """ 120 | Function that reads audio data and sr from audiofile 121 | If audio is stereo, channel 0 is selected by default. 
122 | 123 | Parameters 124 | ---------- 125 | filename: String 126 | path to wav file 127 | 128 | channel: Integer (0 or 1) 129 | which channel is selected for stereo files 130 | default is 0 131 | 132 | Returns 133 | ------- 134 | data : 1D np.array 135 | Raw audio data (Amplitude) 136 | 137 | sr: numeric (Integer) 138 | Samplerate (in Hz) 139 | """ 140 | data = np.nan 141 | sr = np.nan 142 | 143 | if os.path.exists(filename): 144 | try: 145 | data, sr = sf.read(filename) 146 | if data.ndim>1: 147 | data = data[:,channel] 148 | except: 149 | print("Couldn't read: ", filename) 150 | else: 151 | print("No such file or directory: ", filename) 152 | 153 | 154 | return data, sr 155 | 156 | 157 | 158 | # Butter bandpass filter implementation: 159 | # from https://scipy-cookbook.readthedocs.io/items/ButterworthBandpass.html 160 | 161 | def butter_bandpass_filter(data, lowcut, highcut, sr, order=5): 162 | """ 163 | Function that applies a butter bandpass filter on audio data 164 | and returns the filtered audio 165 | 166 | Parameters 167 | ---------- 168 | data: 1D np.array 169 | audio data (amplitude) 170 | 171 | lowcut: Numeric 172 | lower bound for bandpass filter 173 | 174 | highcut: Numeric 175 | upper bound for bandpass filter 176 | 177 | sr: Numeric 178 | samplerate in Hz 179 | 180 | order: Numeric 181 | order of the filter 182 | 183 | Returns 184 | ------- 185 | filtered_data : 1D np.array 186 | filtered audio data 187 | """ 188 | 189 | nyq = 0.5 * sr 190 | low = lowcut / nyq 191 | high = highcut / nyq 192 | b, a = butter(order, [low, high], btype='band') 193 | 194 | filtered_data = lfilter(b, a, data) 195 | return filtered_data 196 | 197 | 198 | -------------------------------------------------------------------------------- /functions/custom_dist_functions_umap.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # In[ ]: 5 | 6 | 7 | # -*- coding: utf-8 -*- 8 | """ 9 | Created 
on Tue May 4 17:39:59 2021 10 | 11 | Collection of custom distance functions for UMAP 12 | 13 | @author: marathomas 14 | """ 15 | 16 | import numpy as np 17 | import numba 18 | from numba import jit 19 | 20 | MIN_OVERLAP = 0.9 21 | 22 | 23 | @numba.njit() 24 | def unpack_specs(a,b): 25 | """ 26 | Function that unpacks two specs that have been transformed into 27 | a 1D array with preprocessing_functions.pad_transform_spec and 28 | restores their original 2D shape 29 | 30 | Parameters 31 | ---------- 32 | a,b : 1D numpy arrays (numeric) 33 | 34 | Returns 35 | ------- 36 | spec_s, spec_l : 2D numpy arrays (numeric) 37 | the restored specs 38 | Example 39 | ------- 40 | >>> 41 | 42 | """ 43 | 44 | a_shape0 = int(a[0]) 45 | a_shape1 = int(a[1]) 46 | b_shape0 = int(b[0]) 47 | b_shape1 = int(b[1]) 48 | 49 | spec_a= np.reshape(a[2:(a_shape0*a_shape1)+2], (a_shape0, a_shape1)) 50 | spec_b= np.reshape(b[2:(b_shape0*b_shape1)+2], (b_shape0, b_shape1)) 51 | 52 | len_a = a_shape1 53 | len_b = b_shape1 54 | 55 | # find bigger spec 56 | spec_s = spec_a 57 | spec_l = spec_b 58 | 59 | if len_a>len_b: 60 | spec_s = spec_b 61 | spec_l = spec_a 62 | 63 | return spec_s, spec_l 64 | 65 | 66 | @numba.njit() 67 | def calc_timeshift_pad(a,b): 68 | """ 69 | Custom numba-compatible distance function for UMAP. 70 | Calculates distance between two spectrograms a,b 71 | by shifting the shorter spectrogram along the longer 72 | one and finding the minimum distance overlap (according to 73 | spec_dist). Non-overlapping sections of the shorter spec are 74 | zero-padded to match the longer spec when calculating the distance. 75 | Uses global variable OVERLAP to constrain shifting to have 76 | OVERLAP*100 % of overlap between specs. 
77 | 78 | Parameters 79 | ---------- 80 | a,b : 1D numpy arrays (numeric) 81 | pad_transformed spectrograms 82 | (with preprocessing_functions.pad_transform_spec) 83 | 84 | Returns 85 | ------- 86 | dist : numeric (float64) 87 | distance between spectrograms a,b 88 | 89 | Example 90 | ------- 91 | >>> 92 | 93 | """ 94 | 95 | spec_s, spec_l = unpack_specs(a,b) 96 | 97 | len_s = spec_s.shape[1] 98 | len_l = spec_l.shape[1] 99 | 100 | nfreq = spec_s.shape[0] 101 | 102 | # define start position 103 | min_overlap_frames = int(MIN_OVERLAP * len_s) 104 | start_timeline = min_overlap_frames-len_s 105 | max_timeline = len_l - min_overlap_frames 106 | 107 | n_of_calculations = int((((max_timeline+1-start_timeline)+(max_timeline+1-start_timeline))/2) +1) 108 | 109 | distances = np.full((n_of_calculations),999.) 110 | 111 | count=0 112 | 113 | for timeline_p in range(start_timeline, max_timeline+1,2): 114 | #print("timeline: ", timeline_p) 115 | # mismatch on left side 116 | if timeline_p < 0: 117 | 118 | len_overlap = len_s - abs(timeline_p) 119 | 120 | pad_s = np.full((nfreq, (len_l-len_overlap)),0.) 121 | pad_l = np.full((nfreq, (len_s-len_overlap)),0.) 122 | 123 | s_config = np.append(spec_s, pad_s, axis=1).astype(np.float64) 124 | l_config = np.append(pad_l, spec_l, axis=1).astype(np.float64) 125 | 126 | # mismatch on right side 127 | elif timeline_p > (len_l-len_s): 128 | 129 | len_overlap = len_l - timeline_p 130 | 131 | pad_s = np.full((nfreq, (len_l-len_overlap)),0.) 132 | pad_l = np.full((nfreq, (len_s-len_overlap)),0.) 133 | 134 | s_config = np.append(pad_s, spec_s, axis=1).astype(np.float64) 135 | l_config = np.append(spec_l, pad_l, axis=1).astype(np.float64) 136 | 137 | # no mismatch on either side 138 | else: 139 | len_overlap = len_s 140 | start_col_l = timeline_p 141 | end_col_l = start_col_l + len_overlap 142 | 143 | pad_s_left = np.full((nfreq, start_col_l),0.) 144 | pad_s_right = np.full((nfreq, (len_l - end_col_l)),0.) 
145 | 146 | l_config = spec_l.astype(np.float64) 147 | s_config = np.append(pad_s_left, spec_s, axis=1).astype(np.float64) 148 | s_config = np.append(s_config, pad_s_right, axis=1).astype(np.float64) 149 | 150 | size = s_config.shape[0]*s_config.shape[1] 151 | distances[count] = spec_dist(s_config, l_config, size) 152 | count = count + 1 153 | 154 | 155 | min_dist = np.min(distances) 156 | return min_dist 157 | 158 | -------------------------------------------------------------------------------- /functions/evaluation_functions.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # In[4]: 5 | 6 | 7 | # -*- coding: utf-8 -*- 8 | """ 9 | Created on Tue May 4 17:39:59 2021 10 | 11 | Collection of custom evaluation functions for embedding 12 | 13 | @author: marathomas 14 | """ 15 | 16 | import numpy as np 17 | import pandas as pd 18 | from sklearn.neighbors import NearestNeighbors 19 | from sklearn.metrics import silhouette_samples, silhouette_score 20 | import seaborn as sns 21 | import matplotlib.pyplot as plt 22 | import string 23 | from scipy.spatial.distance import pdist, squareform 24 | import sklearn 25 | from sklearn.metrics.pairwise import euclidean_distances 26 | 27 | 28 | def make_nn_stats_dict(calltypes, labels, nb_indices): 29 | """ 30 | Function that evaluates the labels of the k nearest neighbors of 31 | all datapoints in a dataset. 32 | 33 | Parameters 34 | ---------- 35 | calltypes : 1D numpy array (string) or list of strings 36 | set of class labels 37 | labels: 1D numpy array (string) or list of strings 38 | vector/list of class labels in dataset 39 | nb_indices: 2D numpy array (numeric integer) 40 | Array I(X,k) containing the indices of the k 41 | nearest neighbors for each datapoint X of a 42 | dataset 43 | 44 | Returns 45 | ------- 46 | nn_stats_dict : dictionary[] = 2D numpy array (numeric) 47 | dictionary that contains one array for each type of label.
48 | Given a label L, nn_stats_dict[L] contains an array A(X,Y), 49 | where Y is the number of class labels in the dataset and each 50 | row X represents a datapoint of label L in the dataset. 51 | A[i,j] is the number of nearest neighbors of datapoint i that 52 | are of label calltypes[j]. 53 | 54 | Example 55 | ------- 56 | >>> 57 | 58 | """ 59 | nn_stats_dict = {} 60 | 61 | for calltype in calltypes: 62 | # which datapoints in the dataset are of this specific calltype? 63 | # -> get their indices 64 | call_indices = np.asarray(np.where(labels==calltype))[0] 65 | 66 | # initialize array that can save the class labels of the k nearest 67 | # neighbors of all these datapoints 68 | calltype_counts = np.zeros((call_indices.shape[0],len(calltypes))) 69 | 70 | # for each datapoint 71 | for i,ind in enumerate(call_indices): 72 | # what are the indices of its k nearest neighbors 73 | nearest_neighbors = nb_indices[ind] 74 | # for each of these neighbors 75 | for neighbor in nearest_neighbors: 76 | # what is their label 77 | neighbor_label = labels[neighbor] 78 | # put a +1 in the array 79 | calltype_counts[i,np.where(np.asarray(calltypes)==neighbor_label)[0][0]] += 1 80 | 81 | # save the resulting array in dictionary 82 | # (1 array per calltype) 83 | nn_stats_dict[calltype] = calltype_counts 84 | 85 | return nn_stats_dict 86 | 87 | def get_knn(k,embedding): 88 | """ 89 | Function that finds the k nearest neighbors (based on 90 | euclidean distance) for each datapoint in a multidimensional 91 | dataset 92 | 93 | Parameters 94 | ---------- 95 | k : integer 96 | number of nearest neighbors 97 | embedding: 2D numpy array (numeric) 98 | a dataset E(X,Y) with X datapoints and Y dimensions 99 | 100 | Returns 101 | ------- 102 | indices: 2D numpy array (numeric) 103 | Array I(X,k) containing the indices of the k 104 | nearest neighbors for each datapoint X of the input 105 | dataset 106 | 107 | distances: 2D numpy array (numeric) 108 | Array D(X,k) containing the euclidean
distance to each 109 | of the k nearest neighbors for each datapoint X of the 110 | input dataset. D[i,j] is the euclidean distance of datapoint 111 | embedding[i,:] to its jth neighbor. 112 | 113 | Example 114 | ------- 115 | >>> 116 | 117 | """ 118 | 119 | # Find k nearest neighbors 120 | nbrs = NearestNeighbors(metric='euclidean',n_neighbors=k+1, algorithm='brute').fit(embedding) 121 | distances, indices = nbrs.kneighbors(embedding) 122 | 123 | # need to remove the first neighbor, because that is the datapoint itself 124 | indices = indices[:,1:] 125 | distances = distances[:,1:] 126 | 127 | return indices, distances 128 | 129 | 130 | def make_statstabs(nn_stats_dict, calltypes, labels,k): 131 | """ 132 | Function that generates two summary tables containing 133 | the frequency of different class labels among the k nearest 134 | neighbors of datapoints belonging to a class. 135 | 136 | Parameters 137 | ---------- 138 | nn_stats_dict : dictionary[] = 2D numpy array (numeric) 139 | dictionary that contains one array for each type of label. 140 | Given a label L, nn_stats_dict[L] contains an array A(X,Y), 141 | where Y is the number of class labels in the dataset and each 142 | row X represents a datapoint of label L in the dataset. 143 | A[i,j] is the number of nearest neighbors of datapoint i that 144 | are of label calltypes[j]. 145 | (as returned from evaluation_functions.make_nn_stats_dict) 146 | calltypes : 1D numpy array (string) or list of strings 147 | set of class labels 148 | labels: 1D numpy array (string) or list of strings 149 | vector/list of class labels in dataset 150 | k: Integer 151 | number of nearest neighbors 152 | 153 | Returns 154 | ------- 155 | stats_tab: 2D pandas dataframe (numeric) 156 | Summary table T(X,Y) with X,Y = number of classes.
157 | T[i,j] is the average percentage of datapoints with class label j 158 | in the neighborhood of datapoints with class label i 159 | 160 | stats_tab_norm: 2D pandas dataframe (numeric) 161 | Summary table N(X,Y) with X,Y = number of classes. 162 | N[i,j] is the log2-transformed ratio of the percentage of datapoints 163 | with class label j in the neighborhood of datapoints with class label i 164 | to the percentage that would be expected by random chance and random 165 | distribution. (N[i,j] = log2(T[i,j]/random_expect)) 166 | 167 | Example 168 | ------- 169 | >>> 170 | 171 | """ 172 | 173 | # Get the class frequencies in the dataset 174 | overall = np.zeros((len(calltypes))) 175 | for i,calltype in enumerate(calltypes): 176 | overall[i] = sum(labels==calltype) 177 | overall = (overall/np.sum(overall))*100 178 | 179 | # Initialize empty array for stats_tab and stats_tab_norm 180 | stats_tab = np.zeros((len(calltypes),len(calltypes))) 181 | stats_tab_norm = np.zeros((len(calltypes),len(calltypes))) 182 | 183 | # For each calltype 184 | for i, calltype in enumerate(calltypes): 185 | # Get the table with all neighbor label counts per datapoint 186 | stats = nn_stats_dict[calltype] 187 | # Average across all datapoints and transform to percentage 188 | stats_tab[i,:] = (np.mean(stats,axis=0)/k)*100 189 | # Divide by overall percentage of this class in dataset 190 | # for the normalized statstab version 191 | stats_tab_norm[i,:] = ((np.mean(stats,axis=0)/k)*100)/overall 192 | 193 | # Turn into dataframe 194 | stats_tab = pd.DataFrame(stats_tab) 195 | stats_tab_norm = pd.DataFrame(stats_tab_norm) 196 | 197 | # Add row with overall frequencies to statstab 198 | stats_tab.loc[len(stats_tab)] = overall 199 | 200 | # Name columns and rows 201 | stats_tab.columns = calltypes 202 | stats_tab.index = calltypes+['overall'] 203 | 204 | stats_tab_norm.columns = calltypes 205 | stats_tab_norm.index = calltypes 206 | 207 | # Replace zeros with small value as otherwise log2 
transform cannot be applied 208 | x=stats_tab_norm.replace(0, 0.0001) 209 | 210 | # log2-transform the ratios that are currently in statstabnorm 211 | stats_tab_norm = np.log2(x) 212 | 213 | return stats_tab, stats_tab_norm 214 | 215 | 216 | class nn: 217 | """ 218 | A class to represent nearest neighbor statistics for a 219 | given latent space representation of a labelled dataset 220 | 221 | Attributes 222 | ---------- 223 | embedding : 2D numpy array (numeric) 224 | a dataset E(X,Y) with X datapoints and Y dimensions 225 | 226 | labels: 1D numpy array (string) or list of strings 227 | vector/list of class labels in dataset 228 | k : integer 229 | number of nearest neighbors to consider 230 | 231 | statstab: 2D pandas dataframe (numeric) 232 | Summary table T(X,Y) with X,Y = number of classes. 233 | T[i,j] is the average percentage of datapoints with class label j 234 | in the neighborhood of datapoints with class label i 235 | 236 | statstabnorm: 2D pandas dataframe (numeric) 237 | Summary table N(X,Y) with X,Y = number of classes. 238 | N[i,j] is the log2-transformed ratio of the percentage of datapoints 239 | with class label j in the neighborhood of datapoints with class label i 240 | to the percentage that would be expected by random chance and random 241 | distribution. (N[i,j] = log2(T[i,j]/random_expect)) 242 | 243 | Methods 244 | ------- 245 | 246 | knn_cc(): 247 | returns k nearest neighbor fractional consistency for each class 248 | (1D numpy array). What percentage of datapoints (of this class) 249 | have fully consistent k neighbors (all k are also of the same class) 250 | 251 | knn_accuracy(): 252 | returns k nearest neighbor classifier accuracy for each class 253 | (1D numpy array).
What percentage of datapoints (of this class) 254 | have a majority of same-class neighbors among k nearest neighbors 255 | 256 | get_statstab(): 257 | returns statstab 258 | 259 | get_statstabnorm(): 260 | returns statstabnorm 261 | 262 | get_S(): 263 | returns S score of embedding 264 | S(class X) is the average percentage of same-class neighbors 265 | among the k nearest neighbors of all datapoints of 266 | class X. S of an embedding is the average of S(class X) over all 267 | classes X (unweighted, e.g. does not consider class frequencies). 268 | 269 | get_Snorm(): 270 | returns Snorm score of embedding 271 | Snorm(class X) is the log2 transformed, normalized percentage of 272 | same-class neighbors among the k nearest neighbors of all datapoints of 273 | class X. Snorm of an embedding is the average of Snorm(class X) over all 274 | classes X. 275 | 276 | get_ownclass_S(): 277 | returns array of S(class X) score for each class X in the dataset 278 | (alphanumerically sorted by class name) 279 | S(class X) is the average percentage of same-class neighbors 280 | among the k nearest neighbors of all datapoints of 281 | class X. 282 | 283 | get_ownclass_Snorm(): 284 | returns array of Snorm(class X) score for each class X in the dataset 285 | (alphanumerically sorted by class name) 286 | Snorm(class X) is the log2 transformed, normalized percentage of 287 | same-class neighbors among the k nearest neighbors of all datapoints of 288 | class X. 
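To make the S score concrete, here is a self-contained toy computation of S (plain numpy standing in for this class; the data, class names, and choice of k are purely illustrative):

```python
import numpy as np

# toy 2D embedding: two well-separated classes of 20 points each
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
labels = np.array(["A"] * 20 + ["B"] * 20)

k = 5
d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
np.fill_diagonal(d, np.inf)          # exclude the point itself, as get_knn does
idx = np.argsort(d, axis=1)[:, :k]   # indices of the k nearest neighbors

# per-point percentage of same-class neighbors, averaged per class, then over classes
same = (labels[idx] == labels[:, None]).mean(axis=1) * 100
S = np.mean([same[labels == c].mean() for c in ["A", "B"]])
assert S == 100.0  # clusters are far apart, so every neighbor is same-class
```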
289 | 290 | plot_heat_S(vmin, vmax, center, cmap, cbar, outname) 291 | plots heatmap of S scores 292 | 293 | plot_heat_Snorm(vmin, vmax, center, cmap, cbar, outname) 294 | plots heatmap of Snorm scores 295 | 296 | plot_heat_fold(center, cmap, cbar, outname) 297 | plots heatmap of fold likelihood (statstabnorm scores to the power of 2) 298 | 299 | draw_simgraph(outname) 300 | draws similarity graph based on statstabnorm scores 301 | 302 | """ 303 | def __init__(self, embedding, labels, k): 304 | 305 | self.embedding = embedding 306 | self.labels = labels 307 | self.k = k 308 | 309 | label_types = sorted(list(set(labels))) 310 | 311 | indices, distances = get_knn(k,embedding) 312 | nn_stats_dict = make_nn_stats_dict(label_types, labels, indices) 313 | stats_tab, stats_tab_norm = make_statstabs(nn_stats_dict, label_types, labels, k) 314 | 315 | self.nn_stats_dict = nn_stats_dict 316 | self.statstab = stats_tab 317 | self.statstabnorm = stats_tab_norm 318 | 319 | def knn_cc(self): 320 | label_types = sorted(list(set(self.labels))) 321 | consistent = [] 322 | for i,labeltype in enumerate(label_types): 323 | statsd = self.nn_stats_dict[labeltype] 324 | x = statsd[:,i] 325 | cc = (np.sum(x == self.k) / statsd.shape[0])*100 326 | consistent.append(cc) 327 | return np.asarray(consistent) 328 | 329 | def knn_accuracy(self): 330 | label_types = sorted(list(set(self.labels))) 331 | has_majority = [] 332 | if (self.k % 2) == 0: 333 | n_majority = (self.k/2)+ 1 334 | else: 335 | n_majority = (self.k/2)+ 0.5 336 | for i,labeltype in enumerate(label_types): 337 | statsd = self.nn_stats_dict[labeltype] 338 | x = statsd[:,i] 339 | cc = (np.sum(x >= n_majority) / statsd.shape[0])*100 340 | has_majority.append(cc) 341 | return np.asarray(has_majority) 342 | 343 | def get_statstab(self): 344 | return self.statstab 345 | 346 | def get_statstabnorm(self): 347 | return self.statstabnorm 348 | 349 | def get_S(self): 350 | return np.mean(np.diagonal(self.statstab)) 351 | 352 | def get_Snorm(self):
353 | return np.mean(np.diagonal(self.statstabnorm)) 354 | 355 | def get_ownclass_S(self): 356 | return np.diagonal(self.statstab) 357 | 358 | def get_ownclass_Snorm(self): 359 | return np.diagonal(self.statstabnorm) 360 | 361 | def plot_heat_S(self,vmin=0, vmax=100, center=50, cmap=sns.color_palette("Greens", as_cmap=True), cbar=None, outname=None): 362 | plt.figure(figsize=(6,6)) 363 | ax=sns.heatmap(self.statstab, annot=True, vmin=vmin, vmax=vmax, center=center, cmap=cmap, cbar=cbar) 364 | plt.xlabel("neighbor label") 365 | plt.ylabel("datapoint label") 366 | plt.title("Nearest Neighbor Frequency P") 367 | if outname: 368 | plt.savefig(outname, facecolor="white") 369 | 370 | def plot_heat_Snorm(self,vmin=-13, vmax=13, center=0, cmap=sns.diverging_palette(20, 145, as_cmap=True), cbar=None, outname=None): 371 | plt.figure(figsize=(6,6)) 372 | ax=sns.heatmap(self.statstabnorm, annot=True, vmin=vmin, vmax=vmax, center=center, cmap=cmap, cbar=cbar) 373 | plt.xlabel("neighbor label") 374 | plt.ylabel("datapoint label") 375 | plt.title("Normalized Nearest Neighbor Frequency Pnorm") 376 | if outname: 377 | plt.savefig(outname, facecolor="white") 378 | 379 | def plot_heat_fold(self, center=1, cmap=sns.diverging_palette(20, 145, as_cmap=True), cbar=None, outname=None): 380 | plt.figure(figsize=(6,6)) 381 | ax=sns.heatmap(np.power(2,self.statstabnorm), annot=True, center=center, cmap=cmap, cbar=cbar) 382 | plt.xlabel("neighbor label") 383 | plt.ylabel("datapoint label") 384 | plt.title("Nearest Neighbor fold likelihood") 385 | if outname: 386 | plt.savefig(outname, facecolor="white") 387 | 388 | def draw_simgraph(self, outname="simgraph.png"): 389 | 390 | # Imports here because specific to this method and 391 | # sometimes problematic to install (dependencies) 392 | 393 | import networkx as nx 394 | import pygraphviz 395 | 396 | calltypes = sorted(list(set(self.labels))) 397 | sim_mat = np.asarray(self.statstabnorm).copy() 398 | for i in range(sim_mat.shape[0]): 399 | for 
j in range(i,sim_mat.shape[0]): 400 | if i!=j: 401 | sim_mat[i,j] = np.mean((sim_mat[i,j], sim_mat[j,i])) 402 | sim_mat[j,i] = sim_mat[i,j] 403 | else: 404 | sim_mat[i,j] = 0 405 | 406 | dist_mat = sim_mat*(-1) 407 | dist_mat = np.interp(dist_mat, (dist_mat.min(), dist_mat.max()), (1, 10)) 408 | 409 | for i in range(dist_mat.shape[0]): 410 | dist_mat[i,i] = 0 411 | 412 | dt = [('len', float)] 413 | 414 | A = dist_mat 415 | A = A.view(dt) 416 | 417 | G = nx.from_numpy_matrix(A) 418 | G = nx.relabel_nodes(G, dict(zip(range(len(G.nodes())),calltypes))) 419 | 420 | G = nx.drawing.nx_agraph.to_agraph(G) 421 | 422 | G.node_attr.update(color="#bec1d4", style="filled", shape='circle', fontsize='20') 423 | G.edge_attr.update(color="blue", width="2.0") 424 | print("Graph saved at ", outname) 425 | G.draw(outname, format='png', prog='neato') 426 | return G 427 | 428 | 429 | 430 | class sil: 431 | """ 432 | A class to represent Silhouette score statistics for a 433 | given latent space representation of a labelled dataset 434 | 435 | Attributes 436 | ---------- 437 | embedding : 2D numpy array (numeric) 438 | a dataset E(X,Y) with X datapoints and Y dimensions 439 | 440 | labels: 1D numpy array (string) or list of strings 441 | vector/list of class labels in dataset 442 | 443 | 444 | labeltypes: list of strings 445 | alphanumerically sorted set of class labels 446 | 447 | avrg_SIL: Numeric (float) 448 | The average Silhouette score of the dataset 449 | 450 | sample_SIL: 1D numpy array (numeric) 451 | The Silhouette scores for each datapoint in the dataset 452 | 453 | Methods 454 | ------- 455 | 456 | get_avrg_score(): 457 | returns the average Silhouette score of the dataset 458 | 459 | get_score_per_class(): 460 | returns the average Silhouette score per class for each 461 | class in the dataset as 1D numpy array 462 | (alphanumerically sorted classes) 463 | 464 | get_sample_scores(): 465 | returns the Silhouette scores for each datapoint in the dataset 466 | (1D numpy array, 
numeric) 467 | 468 | 469 | """ 470 | def __init__(self, embedding, labels): 471 | 472 | self.embedding = embedding 473 | self.labels = labels 474 | self.labeltypes = sorted(list(set(labels))) 475 | 476 | self.avrg_SIL = silhouette_score(embedding, labels) 477 | self.sample_SIL = silhouette_samples(embedding, labels) 478 | 479 | def get_avrg_score(self): 480 | return self.avrg_SIL 481 | 482 | def get_score_per_class(self): 483 | scores = np.zeros((len(self.labeltypes),)) 484 | for i, label in enumerate(self.labeltypes): 485 | ith_cluster_silhouette_values = self.sample_SIL[self.labels == label] 486 | scores[i] = np.mean(ith_cluster_silhouette_values) 487 | #scores_tab = pd.DataFrame([scores],columns=self.labeltypes) 488 | return scores 489 | 490 | def get_sample_scores(self): 491 | return self.sample_SIL 492 | 493 | def plot_sil(self, mypalette="Set2", embedding_type=None, outname=None): 494 | labeltypes = sorted(list(set(self.labels))) 495 | n_clusters = len(labeltypes) 496 | 497 | # Create a subplot with 1 row and 2 columns 498 | fig, ax1 = plt.subplots(1, 1) 499 | fig.set_size_inches(9, 7) 500 | ax1.set_xlim([-1, 1]) 501 | ax1.set_ylim([0, self.embedding.shape[0] + (n_clusters + 1) * 10]) 502 | y_lower = 10 503 | 504 | pal = sns.color_palette(mypalette, n_colors=len(labeltypes)) 505 | color_dict = dict(zip(labeltypes, pal)) 506 | 507 | labeltypes = sorted(labeltypes, reverse=True) 508 | 509 | 510 | for i, cluster_label in enumerate(labeltypes): 511 | ith_cluster_silhouette_values = self.sample_SIL[self.labels == cluster_label] 512 | ith_cluster_silhouette_values.sort() 513 | 514 | size_cluster_i = ith_cluster_silhouette_values.shape[0] 515 | y_upper = y_lower + size_cluster_i 516 | 517 | ax1.fill_betweenx(np.arange(y_lower, y_upper), 518 | 0, ith_cluster_silhouette_values, 519 | facecolor=color_dict[cluster_label], edgecolor=color_dict[cluster_label], alpha=0.7) 520 | 521 | ax1.text(-0.05, y_lower + 0.5 * size_cluster_i, cluster_label) 522 | 523 | # Compute the 
new y_lower for next plot 524 | y_lower = y_upper + 10 # 10 for the 0 samples 525 | 526 | if embedding_type: 527 | mytitle = "Silhouette plot for "+embedding_type+" labels" 528 | else: 529 | mytitle = "Silhouette plot" 530 | 531 | ax1.set_title(mytitle) 532 | ax1.set_xlabel("Silhouette value") 533 | ax1.set_ylabel("Cluster label") 534 | 535 | # The vertical line for average silhouette score of all the values 536 | ax1.axvline(x=self.avrg_SIL, color="red", linestyle="--") 537 | 538 | if outname: 539 | plt.savefig(outname, facecolor="white") 540 | 541 | 542 | 543 | 544 | def plot_within_without(embedding,labels, distance_metric = "euclidean", outname=None,xmin=0, xmax=12, ymax=0.5, nbins=50,nrows=4, ncols=2, density=True): 545 | """ 546 | Function that plots distribution of pairwise distances within a class 547 | vs. towards other classes ("between"), for each class in a dataset 548 | 549 | Parameters 550 | ---------- 551 | embedding : 2D numpy array (numeric) 552 | a dataset E(X,Y) with X datapoints and Y dimensions 553 | 554 | labels: 1D numpy array (string) or list of strings 555 | vector/list of class labels in dataset 556 | 557 | 558 | distance_metric: String 559 | Type of distance metric, e.g. "euclidean", "manhattan"... 560 | all scipy.spatial.distance metrics are allowed 561 | 562 | outname: String 563 | Output filename at which plot will be saved 564 | No plot will be saved if outname is None 565 | (e.g. 
"my_folder/my_img.png") 566 | 567 | xmin, xmax: Numeric 568 | Min and max of x-axis 569 | 570 | ymax: Numeric 571 | Max of yaxis 572 | 573 | nbins: Integer 574 | Number of bins in histograms 575 | 576 | nrows: Integer 577 | Number of rows of subplots 578 | 579 | ncols: Integer 580 | Number of columns of subplots 581 | 582 | density: Boolean 583 | Plot density histogram if density=True 584 | else plot frequency histogram 585 | 586 | Returns 587 | ------- 588 | 589 | - 590 | 591 | """ 592 | 593 | distmat_embedded = squareform(pdist(embedding, metric=distance_metric)) 594 | labels = np.asarray(labels) 595 | calltypes = sorted(list(set(labels))) 596 | 597 | self_dists={} 598 | other_dists={} 599 | 600 | for calltype in calltypes: 601 | x=distmat_embedded[np.where(labels==calltype)] 602 | x = np.transpose(x) 603 | y = x[np.where(labels==calltype)] 604 | 605 | self_dists[calltype] = y[np.triu_indices(n=y.shape[0], m=y.shape[1],k = 1)] 606 | y = x[np.where(labels!=calltype)] 607 | other_dists[calltype] = y[np.triu_indices(n=y.shape[0], m=y.shape[1], k = 1)] 608 | 609 | plt.figure(figsize=(8, 8)) 610 | i=1 611 | 612 | for calltype in calltypes: 613 | 614 | plt.subplot(nrows, ncols, i) 615 | n, bins, patches = plt.hist(x=self_dists[calltype], label="within", density=density, 616 | bins=np.linspace(xmin, xmax, nbins), color='green', 617 | alpha=0.5, rwidth=0.85) 618 | 619 | plt.vlines(x=np.mean(self_dists[calltype]),ymin=0,ymax=ymax,color='green', linestyles='dotted') 620 | 621 | n, bins, patches = plt.hist(x=other_dists[calltype], label="between", density=density, 622 | bins=np.linspace(xmin, xmax, nbins), color='red', 623 | alpha=0.5, rwidth=0.85) 624 | 625 | plt.vlines(x=np.mean(other_dists[calltype]),ymin=0,ymax=ymax,color='red', linestyles='dotted') 626 | plt.legend() 627 | plt.grid(axis='y', alpha=0.75) 628 | plt.title(calltype) 629 | plt.xlim(xmin,xmax) 630 | plt.ylim(0, ymax) 631 | 632 | if (i%ncols)==1: 633 | ylabtitle = 'Density' if density else 'Frequency' 634 | 
plt.ylabel(ylabtitle) 635 | if i>=((nrows*ncols)-ncols): 636 | plt.xlabel(distance_metric+' distance') 637 | 638 | i=i+1 639 | 640 | plt.tight_layout() 641 | if outname: 642 | plt.savefig(outname, facecolor="white") 643 | 644 | 645 | def next_sameclass_nb(embedding, labels): 646 | """ 647 | Function that calculates the neighborhood degree of the closest 648 | same-class neighbor for a given labelled dataset. Calculation is 649 | based on euclidean distance and done for each datapoint. E.g. 6 650 | means that the 6th nearest neighbor of this datapoint is the first 651 | to be of the same-class (the first 5 nearest neighbors are of 652 | different class) 653 | 654 | Parameters: 655 | ---------- 656 | embedding : 2D numpy array (numeric) 657 | a dataset E(X,Y) with X datapoints and Y dimensions 658 | 659 | labels: 1D numpy array (string) or list of strings 660 | vector/list of class labels in dataset 661 | 662 | Returns: 663 | ------- 664 | 665 | nbs_to_sameclass: 1D numpy array 666 | nearest same-class neighborhood degree for 667 | each datapoint of the input dataset 668 | 669 | """ 670 | indices = [] 671 | distmat = euclidean_distances(embedding, embedding) 672 | k = embedding.shape[0]-1 673 | 674 | nbs_to_sameclass = [] 675 | 676 | for i in range(distmat.shape[0]): 677 | neighbors = [] 678 | distances = distmat[i,:] 679 | ranks = np.array(distances).argsort().argsort() 680 | for j in range(1,embedding.shape[0]): 681 | ind = np.where(ranks==j)[0] 682 | nb_label = labels[ind[0]] 683 | neighbors.append(nb_label) 684 | 685 | neighbors = np.asarray(neighbors) 686 | 687 | # How many neighbors until I encounter a same-class neighbor? 
701 | first_occurrence = np.where(neighbors==labels[i])[0][0] 702 | 703 | nbs_to_sameclass.append(first_occurrence) 704 | 705 | nbs_to_sameclass = np.asarray(nbs_to_sameclass) 706 | return nbs_to_sameclass 707 | 708 | 709 | 710 | -------------------------------------------------------------------------------- /functions/plot_functions.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # In[4]: 5 | 6 | 7 | # -*- coding: utf-8 -*- 8 | """ 9 | Created on Tue May 4 17:39:59 2021 10 | 11 | Collection of custom plotting functions for embeddings 12 | 13 | @author: marathomas 14 | """ 15 | 16 | import pandas as pd 17 | import numpy as np 18 | import matplotlib.pyplot as plt 19 | from mpl_toolkits.mplot3d import Axes3D 20 | from matplotlib.legend import Legend 21 | import matplotlib 22 | import seaborn as sns 23 | import plotly.express as px 24 | import plotly.graph_objects as go 25 | 26 | 27 | 28 | def umap_2Dplot(x,y, scat_labels, mycolors, outname=None, showlegend=True): 29 | """ 30 | Function that creates (and saves) a 2D plot from an 31 | input dataset, color-coded by the provided labels. 32 | 33 | Parameters 34 | ---------- 35 | x : 1D numpy array (numeric) or list 36 | x coordinates of datapoints 37 | 38 | y: 1D numpy array (numeric) or list 39 | y coordinates of datapoints 40 | 41 | scat_labels: List-of-Strings 42 | Datapoint labels 43 | 44 | mycolors: String or List-of-Strings 45 | Seaborn color palette name (e.g.
"Set2") or list of 46 | colors (Hex value strings) used for coloring datapoints 47 | (e.g. ["#FFEBCD","#0000FF",...]) 48 | 49 | outname: String 50 | Output filename at which plot will be saved 51 | No plot will be saved if outname is None 52 | (e.g. "my_folder/my_img.png") 53 | 54 | showlegend: Boolean 55 | Show legend if True, else don't 56 | 57 | Returns 58 | ------- 59 | 60 | - 61 | 62 | """ 63 | 64 | labeltypes = sorted(list(set(scat_labels))) 65 | pal = sns.color_palette(mycolors, n_colors=len(labeltypes)) 66 | color_dict = dict(zip(labeltypes, pal)) 67 | c = [color_dict[val] for val in scat_labels] 68 | 69 | fig = plt.figure(figsize=(6,6)) 70 | 71 | plt.scatter(x, y, alpha=1, 72 | s=10, c=c) 73 | plt.xlabel('UMAP1') 74 | plt.ylabel('UMAP2'); 75 | 76 | scatters = [] 77 | for label in labeltypes: 78 | scatters.append(matplotlib.lines.Line2D([0],[0], linestyle="none", c=color_dict[label], marker = 'o')) 79 | 80 | if showlegend: plt.legend(scatters, labeltypes, numpoints = 1) 81 | if outname: plt.savefig(outname, facecolor="white") 82 | 83 | 84 | 85 | def umap_3Dplot(x,y,z,scat_labels, mycolors,outname=None, showlegend=True): 86 | """ 87 | Function that creates (and saves) a 3D plot from an 88 | input dataset, color-coded by the provided labels. 89 | 90 | Parameters 91 | ---------- 92 | x : 1D numpy array (numeric) or list 93 | x coordinates of datapoints 94 | 95 | y: 1D numpy array (numeric) or list 96 | y coordinates of datapoints 97 | 98 | z: 1D numpy array (numeric) or list 99 | z coordinates of datapoints 100 | 101 | scat_labels: List-of-Strings 102 | Datapoint labels 103 | 104 | mycolors: String or List-of-Strings 105 | Seaborn color palette name (e.g. "Set2") or list of 106 | colors (Hex value strings) used for coloring datapoints 107 | (e.g. ["#FFEBCD","#0000FF",...]) 108 | 109 | outname: String 110 | Output filename at which plot will be saved 111 | No plot will be saved if outname is None 112 | (e.g.
"my_folder/my_img.png") 113 | 114 | showlegend: Boolean 115 | Show legend if True, else don't 116 | 117 | Returns 118 | ------- 119 | 120 | - 121 | 122 | """ 123 | labeltypes = sorted(list(set(scat_labels))) 124 | pal = sns.color_palette(mycolors, n_colors=len(labeltypes)) 125 | color_dict = dict(zip(labeltypes, pal)) 126 | c = [color_dict[val] for val in scat_labels] 127 | 128 | fig = plt.figure(figsize=(10,10)) 129 | ax = fig.add_subplot(111, projection='3d') 130 | 131 | Axes3D.scatter(ax, 132 | xs = x, 133 | ys = y, 134 | zs = z, 135 | zdir='z', 136 | s=20, 137 | label = c, 138 | c=c, 139 | depthshade=False) 140 | 141 | ax.set_xlabel('UMAP1') 142 | ax.set_ylabel('UMAP2') 143 | ax.set_zlabel('UMAP3') 144 | 145 | ax.xaxis.pane.fill = False 146 | ax.yaxis.pane.fill = False 147 | ax.zaxis.pane.fill = False 148 | 149 | ax.xaxis.pane.set_edgecolor('w') 150 | ax.yaxis.pane.set_edgecolor('w') 151 | ax.zaxis.pane.set_edgecolor('w') 152 | 153 | 154 | 155 | if showlegend: 156 | scatters = [] 157 | for label in labeltypes: 158 | scatters.append(matplotlib.lines.Line2D([0],[0], linestyle="none", c=color_dict[label], marker = 'o')) 159 | 160 | ax.legend(scatters, labeltypes, numpoints = 1) 161 | 162 | if outname: plt.savefig(outname, facecolor="white") 163 | 164 | 165 | 166 | def plotly_viz(x,y,z,scat_labels, mycolors): 167 | """ 168 | Function that creates an interactive 3D plot with plotly from 169 | an input dataset, color-coded by the provided labels. 170 | 171 | Parameters 172 | ---------- 173 | x : 1D numpy array (numeric) or list 174 | x coordinates of datapoints 175 | 176 | y: 1D numpy array (numeric) or list 177 | y coordinates of datapoints 178 | 179 | z: 1D numpy array (numeric) or list 180 | z coordinates of datapoints 181 | 182 | scat_labels: List-of-Strings 183 | Datapoint labels 184 | 185 | mycolors: String or List-of-Strings 186 | Seaborn color palette name (e.g. "Set2") or list of 187 | colors (Hex value strings) used for coloring datapoints 188 | (e.g.
["#FFEBCD","#0000FF",...]) 189 | 190 | Returns 191 | ------- 192 | 193 | - 194 | 195 | """ 196 | labeltypes = sorted(list(set(scat_labels))) 197 | pal = sns.color_palette(mycolors, n_colors=len(labeltypes)) 198 | color_dict = dict(zip(labeltypes, pal)) 199 | c = [color_dict[val] for val in scat_labels] 200 | 201 | fig = go.Figure(data=[go.Scatter3d(x=x, y=y, z=z, 202 | mode='markers', 203 | hovertext = scat_labels, 204 | marker=dict( 205 | size=4, 206 | color=c, # set color to an array/list of desired values 207 | opacity=0.8 208 | ))]) 209 | 210 | fig.update_layout(scene = dict( 211 | xaxis_title='UMAP1', 212 | yaxis_title='UMAP2', 213 | zaxis_title='UMAP3'), 214 | width=700, 215 | margin=dict(r=20, b=10, l=10, t=10)) 216 | 217 | return fig 218 | 219 | 220 | -------------------------------------------------------------------------------- /functions/preprocessing_functions.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | def calc_zscore(s): 4 | """ 5 | Function that z-score transforms each value of a 2D array 6 | (not along any axis). numba-compatible. 
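A quick standalone check of what this transform produces (a vectorized numpy equivalent of the element-wise loop below; not the module's own code):

```python
import numpy as np

spec = np.array([[1.0, 2.0], [3.0, 4.0]])
z = (spec - spec.mean()) / spec.std()   # one mean/std over the whole 2D array

# the transformed array has zero mean and unit standard deviation
assert abs(z.mean()) < 1e-9
assert abs(z.std() - 1.0) < 1e-9
```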
7 | 8 | Parameters 9 | ---------- 10 | s : 2D numpy array (numeric) 11 | 12 | Returns 13 | ------- 14 | spec : 2D numpy array (numeric) 15 | the z-transformed array 16 | 17 | """ 18 | spec = s.copy() 19 | mn = np.mean(spec) 20 | std = np.std(spec) 21 | for i in range(spec.shape[0]): 22 | for j in range(spec.shape[1]): 23 | spec[i,j] = (spec[i,j]-mn)/std 24 | return spec 25 | 26 | def pad_spectro(spec,maxlen): 27 | """ 28 | Function that pads a spectrogram with shape (X,Y) with 29 | zeros, so that the result has shape (X,maxlen) 30 | 31 | Parameters 32 | ---------- 33 | spec : 2D numpy array (numeric) 34 | a spectrogram S(X,Y) with X frequency bins and Y timeframes 35 | maxlen: maximal length (integer) 36 | 37 | Returns 38 | ------- 39 | padded_spec : 2D numpy array (numeric) 40 | a zero-padded spectrogram S(X,maxlen) with X frequency bins 41 | and maxlen timeframes 42 | 43 | """ 44 | padding = maxlen - spec.shape[1] 45 | z = np.zeros((spec.shape[0],padding)) 46 | padded_spec = np.append(spec, z, axis=1) 47 | return padded_spec 48 | 49 | 50 | def pad_transform_spectro(spec,maxlen): 51 | """ 52 | Function that encodes a 2D spectrogram in a 1D array, so that it 53 | can later be restored again. 54 | Flattens and pads a spectrogram with default value 999 55 | to a given length.
Size of the original spectrogram is encoded 56 | in the first two cells of the resulting array 57 | 58 | Parameters 59 | ---------- 60 | spec : 2D numpy array (numeric) 61 | a spectrogram S(X,Y) with X frequency bins and Y timeframes 62 | maxlen: Integer 63 | n of timeframes to which spec should be padded 64 | 65 | Returns 66 | ------- 67 | trans_spec : 1D numpy array (numeric) 68 | the padded and flattened spectrogram 69 | 70 | """ 71 | flat_spec = spec.flatten() 72 | trans_spec = np.concatenate((np.asarray([spec.shape[0], spec.shape[1]]), flat_spec, np.asarray([999]*(maxlen-flat_spec.shape[0]-2)))) 73 | trans_spec = np.float64(trans_spec) 74 | 75 | return trans_spec 76 | -------------------------------------------------------------------------------- /notebooks/.ipynb_checkpoints/02a_generate_UMAP_basic-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Step 2, alternative a: Generate UMAP representations from spectrograms - Basic pipeline" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Introduction" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "This script creates UMAP representations from spectrograms using the basic pipeline.\n", 22 | "\n", 23 | "#### The following structure and files are required in the project directory:\n", 24 | "\n", 25 | " ├── data\n", 26 | " │ ├── df.pkl <- pickled pandas dataframe with metadata and spectrograms (generated in\n", 27 | " | 01_generate_spectrograms.ipynb)\n", 28 | " ├── parameters \n", 29 | " ├── functions <- the folder with the function files provided in the repo \n", 30 | " ├── notebooks <- the folder with the notebook files provided in the repo \n", 31 | " ├── ... 
\n", 32 | " \n", 33 | "\n", 34 | "#### The following columns must exist (somewhere) in the pickled dataframe df.pkl:\n", 35 | "\n", 36 | " | spectrograms | ....\n", 37 | " ------------------------------------------\n", 38 | " | 2D np.array | ....\n", 39 | " | ... | ....\n", 40 | " | ... | .... \n", 41 | " \n", 42 | "\n", 43 | "#### The following files are generated in this script:\n", 44 | "\n", 45 | " ├── data\n", 46 | " │ ├── df_umap.pkl <- pickled pandas dataframe with metadata, spectrograms AND UMAP coordinates" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "## Import statements, constants and functions" 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": 1, 59 | "metadata": {}, 60 | "outputs": [], 61 | "source": [ 62 | "import pandas as pd\n", 63 | "import numpy as np\n", 64 | "import pickle\n", 65 | "import os\n", 66 | "from pathlib import Path\n", 67 | "import umap\n", 68 | "import sys \n", 69 | "sys.path.insert(0, '..')\n", 70 | "\n", 71 | "from functions.preprocessing_functions import calc_zscore, pad_spectro\n", 72 | "from functions.custom_dist_functions_umap import unpack_specs" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": 2, 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "P_DIR = str(Path(os.getcwd()).parents[0]) # project directory\n", 82 | "DATA = os.path.join(os.path.sep, P_DIR, 'data') # path to data subfolder in project directory\n", 83 | "DF_NAME = 'df.pkl' # name of pickled dataframe with metadata and spectrograms" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": {}, 89 | "source": [ 90 | "Specify UMAP parameters. If desired, other inputs can be used for UMAP, such as denoised spectrograms, bandpass filtered spectrograms or other (MFCC, specs on frequency scale...) by changining the INPUT_COL parameter." 
91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 3, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "INPUT_COL = 'spectrograms' # column that is used for UMAP\n", 100 | " # could also choose 'denoised_spectrograms' or 'stretched_spectrograms' etc etc...\n", 101 | " \n", 102 | "METRIC_TYPE = 'euclidean' # distance metric used in UMAP. Check UMAP documentation for other options\n", 103 | " # e.g. 'euclidean', 'correlation', 'cosine', 'manhattan' ...\n", 104 | " \n", 105 | "N_COMP = 3 # number of dimensions desired in latent space " 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "metadata": {}, 111 | "source": [ 112 | "## 1. Load data" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": 4, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "df = pd.read_pickle(os.path.join(os.path.sep, DATA, DF_NAME))" 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "## 2. UMAP\n", 129 | "### 2.1. Prepare UMAP input" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "In this step, the spectrograms are z-transformed, zero-padded and concatenated to obtain numeric vectors." 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": 5, 142 | "metadata": {}, 143 | "outputs": [], 144 | "source": [ 145 | "# Basic pipeline\n", 146 | "# No time-shift allowed, spectrograms should be aligned at the start.
All spectrograms are zero-padded \n", 147 | "# to equal length\n", 148 | " \n", 149 | "specs = df[INPUT_COL] # choose spectrogram column\n", 150 | "specs = [calc_zscore(s) for s in specs] # z-transform each spectrogram\n", 151 | "\n", 152 | "maxlen = np.max([spec.shape[1] for spec in specs]) # find maximal length in dataset\n", 153 | "flattened_specs = [pad_spectro(spec, maxlen).flatten() for spec in specs] # pad all specs to maxlen, then row-wise concatenate (flatten)\n", 154 | "data = np.asarray(flattened_specs) # data is the final input data for UMAP" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "### 2.2. Specify UMAP parameters" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": 6, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "reducer = umap.UMAP(n_components=N_COMP, metric = METRIC_TYPE, # specify parameters of UMAP reducer\n", 171 | " min_dist = 0, random_state=2204) " 172 | ] 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "metadata": {}, 177 | "source": [ 178 | "### 2.3. Fit UMAP" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": 7, 184 | "metadata": {}, 185 | "outputs": [], 186 | "source": [ 187 | "embedding = reducer.fit_transform(data) # embedding contains the new coordinates of datapoints in 3D space" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "## 3.
Save dataframe" 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": 8, 200 | "metadata": {}, 201 | "outputs": [], 202 | "source": [ 203 | "# Add UMAP coordinates to dataframe\n", 204 | "for i in range(N_COMP):\n", 205 | " df['UMAP'+str(i+1)] = embedding[:,i]\n", 206 | "\n", 207 | "# Save dataframe\n", 208 | "df.to_pickle(os.path.join(os.path.sep, DATA, 'df_umap.pkl'))" 209 | ] 210 | } 211 | ], 212 | "metadata": { 213 | "kernelspec": { 214 | "display_name": "Python 3", 215 | "language": "python", 216 | "name": "python3" 217 | }, 218 | "language_info": { 219 | "codemirror_mode": { 220 | "name": "ipython", 221 | "version": 3 222 | }, 223 | "file_extension": ".py", 224 | "mimetype": "text/x-python", 225 | "name": "python", 226 | "nbconvert_exporter": "python", 227 | "pygments_lexer": "ipython3", 228 | "version": "3.7.10" 229 | } 230 | }, 231 | "nbformat": 4, 232 | "nbformat_minor": 4 233 | } 234 | -------------------------------------------------------------------------------- /notebooks/.ipynb_checkpoints/02b_generate_UMAP_timeshift-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Step 2, alternative b: Generate UMAP representations from spectrograms - custom distance (time-shift)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Introduction" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "This script creates UMAP representations from spectrograms, while allowing for some time-shift of spectrograms. 
This increases computation time, but is well suited for calls that are not well aligned at the start.\n", 22 | "\n", 23 | "#### The following structure and files are required in the project directory:\n", 24 | "\n", 25 | " ├── data\n", 26 | " │ ├── df.pkl <- pickled pandas dataframe with metadata and spectrograms (generated in\n", 27 | " | 01_generate_spectrograms.ipynb)\n", 28 | " ├── parameters \n", 29 | " ├── functions <- the folder with the function files provided in the repo \n", 30 | " ├── notebooks <- the folder with the notebook files provided in the repo \n", 31 | " ├── ... \n", 32 | " \n", 33 | "\n", 34 | "#### The following columns must exist (somewhere) in the pickled dataframe df.pkl:\n", 35 | "\n", 36 | " | spectrograms | ....\n", 37 | " ------------------------------------------\n", 38 | " | 2D np.array | ....\n", 39 | " | ... | ....\n", 40 | " | ... | .... \n", 41 | " \n", 42 | "#### The following files are generated in this script:\n", 43 | "\n", 44 | " ├── data\n", 45 | " │ ├── df_umap.pkl <- pickled pandas dataframe with metadata, spectrograms AND UMAP coordinates" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": {}, 51 | "source": [ 52 | "## Import statements, constants and functions" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 1, 58 | "metadata": {}, 59 | "outputs": [], 60 | "source": [ 61 | "import pandas as pd\n", 62 | "import numpy as np\n", 63 | "import pickle\n", 64 | "import os\n", 65 | "from pathlib import Path\n", 66 | "import umap\n", 67 | "import sys \n", 68 | "sys.path.insert(0, '..')\n", 69 | "\n", 70 | "from functions.preprocessing_functions import calc_zscore, pad_spectro\n", 71 | "from functions.custom_dist_functions_umap import unpack_specs" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 2, 77 | "metadata": {}, 78 | "outputs": [], 79 | "source": [ 80 | "P_DIR = str(Path(os.getcwd()).parents[0]) # project directory\n", 81 | "DATA = os.path.join(os.path.sep, 
P_DIR, 'data') " 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "Specify UMAP parameters. If desired, other inputs can be used for UMAP, such as denoised spectrograms, bandpass filtered spectrograms or other (MFCC, specs on frequency scale...) by changing the INPUT_COL parameter." 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 7, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "INPUT_COL = 'spectrograms' # column that is used for UMAP\n", 98 | " # could also choose 'denoised_spectrograms' or 'stretched_spectrograms' etc etc...\n", 99 | "\n", 100 | "MIN_OVERLAP = 0.9 # time shift constraint\n", 101 | " # MIN_OVERLAP*100 % of the shorter spectrogram must overlap with the longer spectrogram\n", 102 | " # when finding the position with the least error during the time-shifting\n", 103 | "\n", 104 | "METRIC_TYPE = 'euclidean' # distance metric used in UMAP.\n", 105 | " # If performing time-shift, only 'euclidean', 'correlation', 'cosine' and 'manhattan' \n", 106 | " # are available\n", 107 | " \n", 108 | "N_COMP = 3 # number of dimensions desired in latent space " 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "metadata": {}, 114 | "source": [ 115 | "## 1. Load data" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": 4, 121 | "metadata": {}, 122 | "outputs": [], 123 | "source": [ 124 | "df = pd.read_pickle(os.path.join(os.path.sep, DATA, 'df.pkl'))" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": {}, 130 | "source": [ 131 | "## 2. UMAP" 132 | ] 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "metadata": {}, 137 | "source": [ 138 | "In this step, the spectrograms are z-transformed, zero-padded and concatenated to obtain numeric vectors." 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": [ 145 | "### 2.1.
Load custom distance function with time-shift" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": null, 151 | "metadata": {}, 152 | "outputs": [], 153 | "source": [ 154 | "# Pipeline with allowing for time-shift of spectrograms. When assessing distance between spectrograms,\n", 155 | "# the shorter spectrogram is shifted along the longer one to find the position of minimum-error overlap.\n", 156 | "# The shorter is then zero-padded to the length of the longer one and distance is calculated using the \n", 157 | "# chosen METRIC_TYPE distance (euclidean, manhattan, cosine, correlation)\n", 158 | "# This also means that the dimensionality of the spectrogram vectors can be different for each pairwise \n", 159 | "# comparison. Hence, we need some sort of normalization to the dimensionality, otherwise metrics like \n", 160 | "# euclidean or manhattan will automatically be larger for high-dimensional spectrogram vectors (i.e. calls\n", 161 | "# with long duration). Therefore, euclidean and manhattan are normalized to the size of the spectrogram.\n", 162 | " \n", 163 | "from functions.preprocessing_functions import pad_transform_spectro\n", 164 | "import numba\n", 165 | "\n", 166 | "if METRIC_TYPE=='euclidean':\n", 167 | " @numba.njit()\n", 168 | " def spec_dist(a,b,size):\n", 169 | " dist = np.sqrt((np.sum(np.subtract(a,b)*np.subtract(a,b)))) / np.sqrt(size)\n", 170 | " return dist\n", 171 | "elif METRIC_TYPE=='manhattan':\n", 172 | " @numba.njit()\n", 173 | " def spec_dist(a,b,size):\n", 174 | " dist = (np.sum(np.abs(np.subtract(a,b)))) / size\n", 175 | " return dist\n", 176 | "elif METRIC_TYPE=='cosine':\n", 177 | " @numba.njit()\n", 178 | " def spec_dist(a,b,size):\n", 179 | " # turn into unit vectors by dividing each vector field by magnitude of vector\n", 180 | " dot_product = np.sum(a*b)\n", 181 | " a_magnitude = np.sqrt(np.sum(a*a))\n", 182 | " b_magnitude = np.sqrt(np.sum(b*b))\n", 183 | " dist = 1 - dot_product/(a_magnitude*b_magnitude)\n", 184 |
" return dist\n", 185 | "\n", 186 | "elif METRIC_TYPE=='correlation':\n", 187 | " @numba.njit()\n", 188 | " def spec_dist(a,b,size):\n", 189 | " a_meandiff = a - np.mean(a)\n", 190 | " b_meandiff = b - np.mean(b)\n", 191 | " dot_product = np.sum(a_meandiff*b_meandiff)\n", 192 | " a_meandiff_magnitude = np.sqrt(np.sum(a_meandiff*a_meandiff))\n", 193 | " b_meandiff_magnitude = np.sqrt(np.sum(b_meandiff*b_meandiff))\n", 194 | " dist = 1 - dot_product/(a_meandiff_magnitude * b_meandiff_magnitude)\n", 195 | " return dist\n", 196 | "else:\n", 197 | " print('Metric type ', METRIC_TYPE, ' not compatible with option TIME_SHIFT = True')\n", 198 | " raise\n", 199 | " \n", 200 | "@numba.njit()\n", 201 | "def calc_timeshift_pad(a,b):\n", 202 | " spec_s, spec_l = unpack_specs(a,b)\n", 203 | "\n", 204 | " len_s = spec_s.shape[1]\n", 205 | " len_l = spec_l.shape[1]\n", 206 | "\n", 207 | " nfreq = spec_s.shape[0] \n", 208 | "\n", 209 | " # define start position\n", 210 | " min_overlap_frames = int(MIN_OVERLAP * len_s)\n", 211 | " start_timeline = min_overlap_frames-len_s\n", 212 | " max_timeline = len_l - min_overlap_frames\n", 213 | " n_of_calculations = int((((max_timeline+1-start_timeline)+(max_timeline+1-start_timeline))/2) +1)\n", 214 | " distances = np.full((n_of_calculations),999.)\n", 215 | " count=0\n", 216 | " \n", 217 | " for timeline_p in range(start_timeline, max_timeline+1,2):\n", 218 | " # mismatch on left side\n", 219 | " if timeline_p < 0:\n", 220 | " len_overlap = len_s - abs(timeline_p)\n", 221 | " pad_s = np.full((nfreq, (len_l-len_overlap)),0.)\n", 222 | " pad_l = np.full((nfreq, (len_s-len_overlap)),0.)\n", 223 | " s_config = np.append(spec_s, pad_s, axis=1).astype(np.float64)\n", 224 | " l_config = np.append(pad_l, spec_l, axis=1).astype(np.float64)\n", 225 | " \n", 226 | " # mismatch on right side\n", 227 | " elif timeline_p > (len_l-len_s):\n", 228 | " len_overlap = len_l - timeline_p\n", 229 | " pad_s = np.full((nfreq, (len_l-len_overlap)),0.)\n", 230 | " 
pad_l = np.full((nfreq, (len_s-len_overlap)),0.)\n", 231 | " s_config = np.append(pad_s, spec_s, axis=1).astype(np.float64)\n", 232 | " l_config = np.append(spec_l, pad_l, axis=1).astype(np.float64)\n", 233 | " \n", 234 | " else:\n", 235 | " len_overlap = len_s\n", 236 | " start_col_l = timeline_p\n", 237 | " end_col_l = start_col_l + len_overlap\n", 238 | " pad_s_left = np.full((nfreq, start_col_l),0.)\n", 239 | " pad_s_right = np.full((nfreq, (len_l - end_col_l)),0.)\n", 240 | " l_config = spec_l.astype(np.float64)\n", 241 | " s_config = np.append(pad_s_left, spec_s, axis=1).astype(np.float64)\n", 242 | " s_config = np.append(s_config, pad_s_right, axis=1).astype(np.float64)\n", 243 | " \n", 244 | " size = s_config.shape[0]*s_config.shape[1]\n", 245 | " distances[count] = spec_dist(s_config, l_config, size)\n", 246 | " count = count + 1\n", 247 | " \n", 248 | " min_dist = np.min(distances)\n", 249 | " return min_dist" 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "metadata": {}, 255 | "source": [ 256 | "### 2.1. Prepare UMAP input" 257 | ] 258 | }, 259 | { 260 | "cell_type": "code", 261 | "execution_count": null, 262 | "metadata": {}, 263 | "outputs": [], 264 | "source": [ 265 | "specs = df[INPUT_COL]\n", 266 | "specs = [calc_zscore(s) for s in specs] # z-transform\n", 267 | " \n", 268 | "n_bins = specs[0].shape[0]\n", 269 | "maxlen = np.max([spec.shape[1] for spec in specs]) * n_bins + 2\n", 270 | "trans_specs = [pad_transform_spectro(spec, maxlen) for spec in specs]\n", 271 | "data = np.asarray(trans_specs)" 272 | ] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": {}, 277 | "source": [ 278 | "### 2.2. 
Specify UMAP parameters" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "metadata": {}, 285 | "outputs": [], 286 | "source": [ 287 | "reducer = umap.UMAP(n_components=N_COMP, metric = calc_timeshift_pad, min_dist = 0, random_state=2204) " 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "metadata": {}, 293 | "source": [ 294 | "### 2.3. Fit UMAP" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 23, 300 | "metadata": {}, 301 | "outputs": [ 302 | { 303 | "name": "stderr", 304 | "output_type": "stream", 305 | "text": [ 306 | "/home/mthomas/anaconda3/envs/umap_tut_env/lib/python3.7/site-packages/umap/umap_.py:1728: UserWarning: custom distance metric does not return gradient; inverse_transform will be unavailable. To enable using inverse_transform method method, define a distance function that returns a tuple of (distance [float], gradient [np.array])\n", 307 | " \"custom distance metric does not return gradient; inverse_transform will be unavailable. \"\n" 308 | ] 309 | } 310 | ], 311 | "source": [ 312 | "embedding = reducer.fit_transform(data) # embedding contains the new coordinates of datapoints in 3D space" 313 | ] 314 | }, 315 | { 316 | "cell_type": "markdown", 317 | "metadata": {}, 318 | "source": [ 319 | "## 3. 
Save dataframe" 320 | ] 321 | }, 322 | { 323 | "cell_type": "code", 324 | "execution_count": 25, 325 | "metadata": {}, 326 | "outputs": [], 327 | "source": [ 328 | "# Add UMAP coordinates to dataframe\n", 329 | "for i in range(N_COMP):\n", 330 | " df['UMAP'+str(i+1)] = embedding[:,i]\n", 331 | "\n", 332 | "# Save dataframe\n", 333 | "df.to_pickle(os.path.join(os.path.sep, DATA, 'df_umap.pkl'))" 334 | ] 335 | } 336 | ], 337 | "metadata": { 338 | "kernelspec": { 339 | "display_name": "Python 3", 340 | "language": "python", 341 | "name": "python3" 342 | }, 343 | "language_info": { 344 | "codemirror_mode": { 345 | "name": "ipython", 346 | "version": 3 347 | }, 348 | "file_extension": ".py", 349 | "mimetype": "text/x-python", 350 | "name": "python", 351 | "nbconvert_exporter": "python", 352 | "pygments_lexer": "ipython3", 353 | "version": "3.7.10" 354 | } 355 | }, 356 | "nbformat": 4, 357 | "nbformat_minor": 4 358 | } 359 | -------------------------------------------------------------------------------- /notebooks/.ipynb_checkpoints/03_UMAP_viz_part_1_prep-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Interactive visualization of UMAP representations: Part 1 (Prep)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "This script creates a spectrogram image for each call and saves all images in a pickled dictionary in the data subfolder (image_data.pkl). These images will be displayed later in the interactive visualization tool; generating them beforehand makes the tool faster, as images don't need to be created on-the-fly, but can be accessed through the dictionary. \n", 15 | "\n", 16 | "The default dictionary key is the filename without datatype specifier (e.g. 
without .wav), but if the dataframe contains a column 'callID', this is used as keys.\n", 17 | "\n", 18 | "#### The following minimal structure and files are required in the project directory:\n", 19 | "\n", 20 | " ├── data\n", 21 | " │ ├── df_umap.pkl <- pickled pandas dataframe with metadata, raw_audio, spectrograms and UMAP coordinates\n", 22 | " | (generated in 02a_generate_UMAP_basic.ipynb or 02b_generate_UMAP_timeshift.ipynb)\n", 23 | " ├── parameters \n", 24 | " │ ├── spec_params.py <- python file containing the spectrogram parameters used (generated in \n", 25 | " | 01_generate_spectrograms.ipynb) \n", 26 | " ├── functions <- the folder with the function files provided in the repo \n", 27 | " ├── notebooks <- the folder with the notebook files provided in the repo \n", 28 | " ├── ... \n", 29 | "\n", 30 | "\n", 31 | "#### The following columns must exist (somewhere) in the pickled dataframe df.pkl:\n", 32 | "(callID is optional)\n", 33 | "\n", 34 | " | filename | spectrograms | samplerate_hz | [optional: callID]\n", 35 | " --------------------------------------------------------------------\n", 36 | " | call_1.wav | 2D np.array | 8000 | [call_1]\n", 37 | " | call_2.wav | ... | 48000 | [call_2] \n", 38 | " | ... | ... | .... | .... 
\n", 39 | "\n", 40 | "#### The following files are generated in this script:\n", 41 | "\n", 42 | " ├── data\n", 43 | " │ ├── df_umap.pkl <- is overwritten with updated version of df_umap.pkl (with ID column) \n", 44 | " │ ├── image_data.pkl <- pickled dictionary with spectrogram images as values, ID column as keys" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "## Import statements, constants and functions" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 1, 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "import pandas as pd\n", 61 | "import numpy as np\n", 62 | "import pickle\n", 63 | "import matplotlib.pyplot as plt\n", 64 | "import os\n", 65 | "from pathlib import Path\n", 66 | "import soundfile as sf\n", 67 | "import io\n", 68 | "import librosa\n", 69 | "import librosa.display\n", 70 | "import umap\n", 71 | "\n", 72 | "import sys \n", 73 | "sys.path.insert(0, '..')" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": 3, 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [ 82 | "P_DIR = str(Path(os.getcwd()).parents[0]) \n", 83 | "DATA = os.path.join(os.path.sep, P_DIR, 'data') \n", 84 | "DF_NAME = 'df_umap.pkl'\n", 85 | "\n", 86 | "SPEC_COL = 'spectrograms' # column name that contains the spectrograms\n", 87 | "ID_COL = 'callID' # column name that contains call identifier (must be unique)\n", 88 | "\n", 89 | "\n", 90 | "OVERWRITE = False # If there already exists an image_data.pkl, should it be overwritten? 
Default no" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 4, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "# Spectrogramming parameters (needed for generating the images)\n", 100 | "\n", 101 | "from parameters.spec_params import FFT_WIN, FFT_HOP, FMIN, FMAX\n", 102 | "\n", 103 | "# Make sure the spectrogramming parameters are correct!\n", 104 | "# They are used to set the correct time and frequency axis labels for the spectrogram images. \n", 105 | "\n", 106 | "# If you are using bandpass-filtered spectrograms...\n", 107 | "if 'filtered' in SPEC_COL:\n", 108 | " # ...FMIN is set to LOWCUT, FMAX to HIGHCUT and N_MELS to N_MELS_FILTERED\n", 109 | " from parameters.spec_params import LOWCUT, HIGHCUT, N_MELS_FILTERED\n", 110 | " \n", 111 | " FMIN = LOWCUT\n", 112 | " FMAX = HIGHCUT\n", 113 | " N_MELS = N_MELS_FILTERED" 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": {}, 119 | "source": [ 120 | "## 1. Read in files" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 5, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "df = pd.read_pickle(os.path.join(os.path.sep, DATA, DF_NAME))" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "### 1.1. 
Check if call identifier column is present" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": 6, 142 | "metadata": {}, 143 | "outputs": [ 144 | { 145 | "name": "stdout", 146 | "output_type": "stream", 147 | "text": [ 148 | "No ID-Column found ( callID )\n", 149 | "Default ID column callID will be generated from filename.\n" 150 | ] 151 | } 152 | ], 153 | "source": [ 154 | "# Default callID will be the name of the wav file\n", 155 | "\n", 156 | "if ID_COL not in df.columns:\n", 157 | " print('No ID-Column found (', ID_COL, ')')\n", 158 | " \n", 159 | " if 'filename' in df.columns:\n", 160 | " print(\"Default ID column \", ID_COL, \"will be generated from filename.\")\n", 161 | " df[ID_COL] = [x.split(\".\")[0] for x in df['filename']]\n", 162 | " else:\n", 163 | " raise KeyError('No filename column found to generate the ID column from')" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": {}, 169 | "source": [ 170 | "## 2. Generate spectrogram images" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "A spectrogram image is generated from each row in the dataframe. Images are saved in a dictionary (keys are the ID_COL of the dataframe).\n", 178 | "\n", 179 | "The dictionary is pickled and saved as image_data.pkl. It will later be loaded in the interactive visualization script and these images will be displayed in the visualization."
180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 7, 185 | "metadata": {}, 186 | "outputs": [ 187 | { 188 | "name": "stdout", 189 | "output_type": "stream", 190 | "text": [ 191 | "\r", 192 | "Processing i: 0 / 6428\r", 193 | "Processing i: 1 / 6428\r", 194 | "Processing i: 2 / 6428\r", 195 | "Processing i: 3 / 6428" 196 | ] 197 | }, 198 | { 199 | "name": "stderr", 200 | "output_type": "stream", 201 | "text": [ 202 | "/home/mthomas/anaconda3/envs/umap_tut_env/lib/python3.7/site-packages/librosa/display.py:974: MatplotlibDeprecationWarning: The 'basey' parameter of __init__() has been renamed 'base' since Matplotlib 3.3; support for the old name will be dropped two minor releases later.\n", 203 | " scaler(mode, **kwargs)\n", 204 | "/home/mthomas/anaconda3/envs/umap_tut_env/lib/python3.7/site-packages/librosa/display.py:974: MatplotlibDeprecationWarning: The 'linthreshy' parameter of __init__() has been renamed 'linthresh' since Matplotlib 3.3; support for the old name will be dropped two minor releases later.\n", 205 | " scaler(mode, **kwargs)\n" 206 | ] 207 | }, 208 | { 209 | "name": "stdout", 210 | "output_type": "stream", 211 | "text": [ 212 | "Processing i: 6427 / 6428 / 6428" 213 | ] 214 | } 215 | ], 216 | "source": [ 217 | "if OVERWRITE==False and os.path.isfile(os.path.join(os.path.sep,DATA,'image_data.pkl')):\n", 218 | " print(\"File already exists. 
Overwrite is set to FALSE, so no new image_data will be generated.\")\n", 219 | " \n", 220 | " # Double-check if image_data contains all the required calls\n", 221 | " with open(os.path.join(os.path.sep, DATA, 'image_data.pkl'), 'rb') as handle:\n", 222 | " image_data = pickle.load(handle) \n", 223 | " image_keys = list(image_data.keys())\n", 224 | " expected_keys = list(df[ID_COL])\n", 225 | " missing = list(set(expected_keys)-set(image_keys))\n", 226 | " \n", 227 | " if len(missing)>0:\n", 228 | " print(\"BUT: The current image_data.pkl file doesn't seem to contain all calls that are in your dataframe!\")\n", 229 | " \n", 230 | "else:\n", 231 | " image_data = {}\n", 232 | " for i,dat in enumerate(df.spectrograms):\n", 233 | " print('\\rProcessing i:',i,'/',df.shape[0], end='')\n", 234 | " dat = np.asarray(df.iloc[i][SPEC_COL]) \n", 235 | " sr = df.iloc[i]['samplerate_hz']\n", 236 | " plt.figure()\n", 237 | " librosa.display.specshow(dat,sr=sr, hop_length=int(FFT_HOP * sr) , fmin=FMIN, fmax=FMAX, y_axis='mel', x_axis='s',cmap='inferno')\n", 238 | " buf = io.BytesIO()\n", 239 | " plt.savefig(buf, format='png')\n", 240 | " byte_im = buf.getvalue()\n", 241 | " image_data[df.iloc[i][ID_COL]] = byte_im\n", 242 | " plt.close()\n", 243 | "\n", 244 | " # Store data (serialize)\n", 245 | " with open(os.path.join(os.path.sep,DATA,'image_data.pkl'), 'wb') as handle:\n", 246 | " pickle.dump(image_data, handle, protocol=pickle.HIGHEST_PROTOCOL)" 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "## 3. Save dataframe" 254 | ] 255 | }, 256 | { 257 | "cell_type": "markdown", 258 | "metadata": {}, 259 | "source": [ 260 | "Save the dataframe to make sure it contains the correct ID column for access to the image_data."
261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": 8, 266 | "metadata": {}, 267 | "outputs": [], 268 | "source": [ 269 | "df.to_pickle(os.path.join(os.path.sep, DATA, DF_NAME))" 270 | ] 271 | } 272 | ], 273 | "metadata": { 274 | "kernelspec": { 275 | "display_name": "Python 3", 276 | "language": "python", 277 | "name": "python3" 278 | }, 279 | "language_info": { 280 | "codemirror_mode": { 281 | "name": "ipython", 282 | "version": 3 283 | }, 284 | "file_extension": ".py", 285 | "mimetype": "text/x-python", 286 | "name": "python", 287 | "nbconvert_exporter": "python", 288 | "pygments_lexer": "ipython3", 289 | "version": "3.7.10" 290 | } 291 | }, 292 | "nbformat": 4, 293 | "nbformat_minor": 4 294 | } 295 | -------------------------------------------------------------------------------- /notebooks/02a_generate_UMAP_basic.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Step 2, alternative a: Generate UMAP representations from spectrograms - Basic pipeline" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Introduction" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "This script creates UMAP representations from spectrograms using the basic pipeline.\n", 22 | "\n", 23 | "#### The following structure and files are required in the project directory:\n", 24 | "\n", 25 | " ├── data\n", 26 | " │ ├── df.pkl <- pickled pandas dataframe with metadata and spectrograms (generated in\n", 27 | " | 01_generate_spectrograms.ipynb)\n", 28 | " ├── parameters \n", 29 | " ├── functions <- the folder with the function files provided in the repo \n", 30 | " ├── notebooks <- the folder with the notebook files provided in the repo \n", 31 | " ├── ... 
\n", 32 | " \n", 33 | "\n", 34 | "#### The following columns must exist (somewhere) in the pickled dataframe df.pkl:\n", 35 | "\n", 36 | " | spectrograms | ....\n", 37 | " ------------------------------------------\n", 38 | " | 2D np.array | ....\n", 39 | " | ... | ....\n", 40 | " | ... | .... \n", 41 | " \n", 42 | "\n", 43 | "#### The following files are generated in this script:\n", 44 | "\n", 45 | " ├── data\n", 46 | " │ ├── df_umap.pkl <- pickled pandas dataframe with metadata, spectrograms AND UMAP coordinates" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "## Import statements, constants and functions" 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": 1, 59 | "metadata": {}, 60 | "outputs": [], 61 | "source": [ 62 | "import pandas as pd\n", 63 | "import numpy as np\n", 64 | "import pickle\n", 65 | "import os\n", 66 | "from pathlib import Path\n", 67 | "import umap\n", 68 | "import sys \n", 69 | "sys.path.insert(0, '..')\n", 70 | "\n", 71 | "from functions.preprocessing_functions import calc_zscore, pad_spectro\n", 72 | "from functions.custom_dist_functions_umap import unpack_specs" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": 2, 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "P_DIR = str(Path(os.getcwd()).parents[0]) # project directory\n", 82 | "DATA = os.path.join(os.path.sep, P_DIR, 'data') # path to data subfolder in project directory\n", 83 | "DF_NAME = 'df.pkl' # name of pickled dataframe with metadata and spectrograms" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": {}, 89 | "source": [ 90 | "Specify UMAP parameters. If desired, other inputs can be used for UMAP, such as denoised spectrograms, bandpass filtered spectrograms or other (MFCC, specs on frequency scale...) by changining the INPUT_COL parameter." 
91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 3, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "INPUT_COL = 'spectrograms' # column that is used for UMAP\n", 100 | " # could also choose 'denoised_spectrograms' or 'stretched_spectrograms' etc etc...\n", 101 | " \n", 102 | "METRIC_TYPE = 'euclidean' # distance metric used in UMAP. Check UMAP documentation for other options\n", 103 | " # e.g. 'euclidean', 'correlation', 'cosine', 'manhattan' ...\n", 104 | " \n", 105 | "N_COMP = 3 # number of dimensions desired in latent space " 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "metadata": {}, 111 | "source": [ 112 | "## 1. Load data" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": 4, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "df = pd.read_pickle(os.path.join(os.path.sep, DATA, DF_NAME))" 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "## 2. UMAP\n", 129 | "### 2.1. Prepare UMAP input" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "In this step, the spectrograms are z-transformed, zero-padded and concatenated to obtain numeric vectors." 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": 5, 142 | "metadata": {}, 143 | "outputs": [], 144 | "source": [ 145 | "# Basic pipeline\n", 146 | "# No time-shift allowed, spectrograms should be aligned at the start. 
All spectrograms are zero-padded \n", 147 | "# to equal length\n", 148 | " \n", 149 | "specs = df[INPUT_COL] # choose spectrogram column\n", 150 | "specs = [calc_zscore(s) for s in specs] # z-transform each spectrogram\n", 151 | "\n", 152 | "maxlen = np.max([spec.shape[1] for spec in specs]) # find maximal length in dataset\n", 153 | "flattened_specs = [pad_spectro(spec, maxlen).flatten() for spec in specs] # pad all specs to maxlen, then row-wise concatenate (flatten)\n", 154 | "data = np.asarray(flattened_specs) # data is the final input data for UMAP" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "### 2.2. Specify UMAP parameters" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": 6, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "reducer = umap.UMAP(n_components=N_COMP, metric = METRIC_TYPE, # specify parameters of UMAP reducer\n", 171 | " min_dist = 0, random_state=2204) " 172 | ] 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "metadata": {}, 177 | "source": [ 178 | "### 2.3. Fit UMAP" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": 7, 184 | "metadata": {}, 185 | "outputs": [], 186 | "source": [ 187 | "embedding = reducer.fit_transform(data) # embedding contains the new coordinates of datapoints in 3D space" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "## 3. 
Save dataframe" 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": 8, 200 | "metadata": {}, 201 | "outputs": [], 202 | "source": [ 203 | "# Add UMAP coordinates to dataframe\n", 204 | "for i in range(N_COMP):\n", 205 | " df['UMAP'+str(i+1)] = embedding[:,i]\n", 206 | "\n", 207 | "# Save dataframe\n", 208 | "df.to_pickle(os.path.join(os.path.sep, DATA, 'df_umap.pkl'))" 209 | ] 210 | } 211 | ], 212 | "metadata": { 213 | "kernelspec": { 214 | "display_name": "Python 3", 215 | "language": "python", 216 | "name": "python3" 217 | }, 218 | "language_info": { 219 | "codemirror_mode": { 220 | "name": "ipython", 221 | "version": 3 222 | }, 223 | "file_extension": ".py", 224 | "mimetype": "text/x-python", 225 | "name": "python", 226 | "nbconvert_exporter": "python", 227 | "pygments_lexer": "ipython3", 228 | "version": "3.7.10" 229 | } 230 | }, 231 | "nbformat": 4, 232 | "nbformat_minor": 4 233 | } 234 | -------------------------------------------------------------------------------- /notebooks/02b_generate_UMAP_timeshift.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Step 2, alternative b: Generate UMAP representations from spectrograms - custom distance (time-shift)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Introduction" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "This script creates UMAP representations from spectrograms, while allowing for some time-shift of spectrograms. 
This increases computation time, but is well suited for calls that are not well aligned at the start.\n", 22 | "\n", 23 | "#### The following structure and files are required in the project directory:\n", 24 | "\n", 25 | " ├── data\n", 26 | " │ ├── df.pkl <- pickled pandas dataframe with metadata and spectrograms (generated in\n", 27 | " | 01_generate_spectrograms.ipynb)\n", 28 | " ├── parameters \n", 29 | " ├── functions <- the folder with the function files provided in the repo \n", 30 | " ├── notebooks <- the folder with the notebook files provided in the repo \n", 31 | " ├── ... \n", 32 | " \n", 33 | "\n", 34 | "#### The following columns must exist (somewhere) in the pickled dataframe df.pkl:\n", 35 | "\n", 36 | " | spectrograms | ....\n", 37 | " ------------------------------------------\n", 38 | " | 2D np.array | ....\n", 39 | " | ... | ....\n", 40 | " | ... | .... \n", 41 | " \n", 42 | "#### The following files are generated in this script:\n", 43 | "\n", 44 | " ├── data\n", 45 | " │ ├── df_umap.pkl <- pickled pandas dataframe with metadata, spectrograms AND UMAP coordinates" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": {}, 51 | "source": [ 52 | "## Import statements, constants and functions" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 1, 58 | "metadata": {}, 59 | "outputs": [], 60 | "source": [ 61 | "import pandas as pd\n", 62 | "import numpy as np\n", 63 | "import pickle\n", 64 | "import os\n", 65 | "from pathlib import Path\n", 66 | "import umap\n", 67 | "import sys \n", 68 | "sys.path.insert(0, '..')\n", 69 | "\n", 70 | "from functions.preprocessing_functions import calc_zscore, pad_spectro\n", 71 | "from functions.custom_dist_functions_umap import unpack_specs" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 2, 77 | "metadata": {}, 78 | "outputs": [], 79 | "source": [ 80 | "P_DIR = str(Path(os.getcwd()).parents[0]) # project directory\n", 81 | "DATA = os.path.join(os.path.sep, 
P_DIR, 'data') " 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "Specify UMAP parameters. If desired, other inputs can be used for UMAP, such as denoised spectrograms, bandpass-filtered spectrograms or other representations (e.g. MFCCs, spectrograms on a linear frequency scale), by changing the INPUT_COL parameter." 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 7, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "INPUT_COL = 'spectrograms' # column that is used for UMAP\n", 98 | " # could also choose 'denoised_spectrograms' or 'stretched_spectrograms' etc etc...\n", 99 | "\n", 100 | "MIN_OVERLAP = 0.9 # time shift constraint\n", 101 | " # MIN_OVERLAP*100 % of the shorter spectrogram must overlap with the longer spectrogram\n", 102 | " # when finding the position with the least error during the time-shifting\n", 103 | "\n", 104 | "METRIC_TYPE = 'euclidean' # distance metric used in UMAP.\n", 105 | " # If performing time-shift, only 'euclidean', 'correlation', 'cosine' and 'manhattan' \n", 106 | " # are available\n", 107 | " \n", 108 | "N_COMP = 3 # number of dimensions desired in latent space " 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "metadata": {}, 114 | "source": [ 115 | "## 1. Load data" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": 4, 121 | "metadata": {}, 122 | "outputs": [], 123 | "source": [ 124 | "df = pd.read_pickle(os.path.join(os.path.sep, DATA, 'df.pkl'))" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": {}, 130 | "source": [ 131 | "## 2. UMAP" 132 | ] 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "metadata": {}, 137 | "source": [ 138 | "In this step, the spectrograms are z-transformed, zero-padded and concatenated to obtain numeric vectors." 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": [ 145 | "### 2.1. 
Load custom distance function with time-shift" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": null, 151 | "metadata": {}, 152 | "outputs": [], 153 | "source": [ 154 | "# Pipeline allowing for time-shift of spectrograms. When assessing distance between spectrograms,\n", 155 | "# the shorter spectrogram is shifted along the longer one to find the position of minimum-error overlap.\n", 156 | "# The shorter is then zero-padded to the length of the longer one and distance is calculated using the \n", 157 | "# chosen METRIC_TYPE distance (euclidean, manhattan, cosine, correlation)\n", 158 | "# This also means that the dimensionality of the spectrogram vectors can be different for each pairwise \n", 159 | "# comparison. Hence, we need some sort of normalization to the dimensionality, otherwise metrics like \n", 160 | "# euclidean or manhattan will automatically be larger for high-dimensional spectrogram vectors (i.e. calls\n", 161 | "# with long duration). Therefore, euclidean and manhattan are normalized to the size of the spectrogram.\n", 162 | " \n", 163 | "from functions.preprocessing_functions import pad_transform_spectro\n", 164 | "import numba\n", 165 | "\n", 166 | "if METRIC_TYPE=='euclidean':\n", 167 | " @numba.njit()\n", 168 | " def spec_dist(a,b,size):\n", 169 | " dist = np.sqrt((np.sum(np.subtract(a,b)*np.subtract(a,b)))) / np.sqrt(size)\n", 170 | " return dist\n", 171 | "elif METRIC_TYPE=='manhattan':\n", 172 | " @numba.njit()\n", 173 | " def spec_dist(a,b,size):\n", 174 | " dist = (np.sum(np.abs(np.subtract(a,b)))) / size\n", 175 | " return dist\n", 176 | "elif METRIC_TYPE=='cosine':\n", 177 | " @numba.njit()\n", 178 | " def spec_dist(a,b,size):\n", 179 | " # turn into unit vectors by dividing each vector field by magnitude of vector\n", 180 | " dot_product = np.sum(a*b)\n", 181 | " a_magnitude = np.sqrt(np.sum(a*a))\n", 182 | " b_magnitude = np.sqrt(np.sum(b*b))\n", 183 | " dist = 1 - dot_product/(a_magnitude*b_magnitude)\n", 184 | 
" return dist\n", 185 | "\n", 186 | "elif METRIC_TYPE=='correlation':\n", 187 | " @numba.njit()\n", 188 | " def spec_dist(a,b,size):\n", 189 | " a_meandiff = a - np.mean(a)\n", 190 | " b_meandiff = b - np.mean(b)\n", 191 | " dot_product = np.sum(a_meandiff*b_meandiff)\n", 192 | " a_meandiff_magnitude = np.sqrt(np.sum(a_meandiff*a_meandiff))\n", 193 | " b_meandiff_magnitude = np.sqrt(np.sum(b_meandiff*b_meandiff))\n", 194 | " dist = 1 - dot_product/(a_meandiff_magnitude * b_meandiff_magnitude)\n", 195 | " return dist\n", 196 | "else:\n", 197 | " print('Metric type ', METRIC_TYPE, ' not compatible with option TIME_SHIFT = True')\n", 198 | " raise\n", 199 | " \n", 200 | "@numba.njit()\n", 201 | "def calc_timeshift_pad(a,b):\n", 202 | " spec_s, spec_l = unpack_specs(a,b)\n", 203 | "\n", 204 | " len_s = spec_s.shape[1]\n", 205 | " len_l = spec_l.shape[1]\n", 206 | "\n", 207 | " nfreq = spec_s.shape[0] \n", 208 | "\n", 209 | " # define start position\n", 210 | " min_overlap_frames = int(MIN_OVERLAP * len_s)\n", 211 | " start_timeline = min_overlap_frames-len_s\n", 212 | " max_timeline = len_l - min_overlap_frames\n", 213 | " n_of_calculations = int((((max_timeline+1-start_timeline)+(max_timeline+1-start_timeline))/2) +1)\n", 214 | " distances = np.full((n_of_calculations),999.)\n", 215 | " count=0\n", 216 | " \n", 217 | " for timeline_p in range(start_timeline, max_timeline+1,2):\n", 218 | " # mismatch on left side\n", 219 | " if timeline_p < 0:\n", 220 | " len_overlap = len_s - abs(timeline_p)\n", 221 | " pad_s = np.full((nfreq, (len_l-len_overlap)),0.)\n", 222 | " pad_l = np.full((nfreq, (len_s-len_overlap)),0.)\n", 223 | " s_config = np.append(spec_s, pad_s, axis=1).astype(np.float64)\n", 224 | " l_config = np.append(pad_l, spec_l, axis=1).astype(np.float64)\n", 225 | " \n", 226 | " # mismatch on right side\n", 227 | " elif timeline_p > (len_l-len_s):\n", 228 | " len_overlap = len_l - timeline_p\n", 229 | " pad_s = np.full((nfreq, (len_l-len_overlap)),0.)\n", 230 | " 
pad_l = np.full((nfreq, (len_s-len_overlap)),0.)\n", 231 | " s_config = np.append(pad_s, spec_s, axis=1).astype(np.float64)\n", 232 | " l_config = np.append(spec_l, pad_l, axis=1).astype(np.float64)\n", 233 | " \n", 234 | " else:\n", 235 | " len_overlap = len_s\n", 236 | " start_col_l = timeline_p\n", 237 | " end_col_l = start_col_l + len_overlap\n", 238 | " pad_s_left = np.full((nfreq, start_col_l),0.)\n", 239 | " pad_s_right = np.full((nfreq, (len_l - end_col_l)),0.)\n", 240 | " l_config = spec_l.astype(np.float64)\n", 241 | " s_config = np.append(pad_s_left, spec_s, axis=1).astype(np.float64)\n", 242 | " s_config = np.append(s_config, pad_s_right, axis=1).astype(np.float64)\n", 243 | " \n", 244 | " size = s_config.shape[0]*s_config.shape[1]\n", 245 | " distances[count] = spec_dist(s_config, l_config, size)\n", 246 | " count = count + 1\n", 247 | " \n", 248 | " min_dist = np.min(distances)\n", 249 | " return min_dist" 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "metadata": {}, 255 | "source": [ 256 | "### 2.1. Prepare UMAP input" 257 | ] 258 | }, 259 | { 260 | "cell_type": "code", 261 | "execution_count": null, 262 | "metadata": {}, 263 | "outputs": [], 264 | "source": [ 265 | "specs = df[INPUT_COL]\n", 266 | "specs = [calc_zscore(s) for s in specs] # z-transform\n", 267 | " \n", 268 | "n_bins = specs[0].shape[0]\n", 269 | "maxlen = np.max([spec.shape[1] for spec in specs]) * n_bins + 2\n", 270 | "trans_specs = [pad_transform_spectro(spec, maxlen) for spec in specs]\n", 271 | "data = np.asarray(trans_specs)" 272 | ] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": {}, 277 | "source": [ 278 | "### 2.2. 
Specify UMAP parameters" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "metadata": {}, 285 | "outputs": [], 286 | "source": [ 287 | "reducer = umap.UMAP(n_components=N_COMP, metric = calc_timeshift_pad, min_dist = 0, random_state=2204) " 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "metadata": {}, 293 | "source": [ 294 | "### 2.3. Fit UMAP" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 23, 300 | "metadata": {}, 301 | "outputs": [ 302 | { 303 | "name": "stderr", 304 | "output_type": "stream", 305 | "text": [ 306 | "/home/mthomas/anaconda3/envs/umap_tut_env/lib/python3.7/site-packages/umap/umap_.py:1728: UserWarning: custom distance metric does not return gradient; inverse_transform will be unavailable. To enable using inverse_transform method method, define a distance function that returns a tuple of (distance [float], gradient [np.array])\n", 307 | " \"custom distance metric does not return gradient; inverse_transform will be unavailable. \"\n" 308 | ] 309 | } 310 | ], 311 | "source": [ 312 | "embedding = reducer.fit_transform(data) # embedding contains the new coordinates of datapoints in 3D space" 313 | ] 314 | }, 315 | { 316 | "cell_type": "markdown", 317 | "metadata": {}, 318 | "source": [ 319 | "## 3. 
Save dataframe" 320 | ] 321 | }, 322 | { 323 | "cell_type": "code", 324 | "execution_count": 25, 325 | "metadata": {}, 326 | "outputs": [], 327 | "source": [ 328 | "# Add UMAP coordinates to dataframe\n", 329 | "for i in range(N_COMP):\n", 330 | " df['UMAP'+str(i+1)] = embedding[:,i]\n", 331 | "\n", 332 | "# Save dataframe\n", 333 | "df.to_pickle(os.path.join(os.path.sep, DATA, 'df_umap.pkl'))" 334 | ] 335 | } 336 | ], 337 | "metadata": { 338 | "kernelspec": { 339 | "display_name": "Python 3", 340 | "language": "python", 341 | "name": "python3" 342 | }, 343 | "language_info": { 344 | "codemirror_mode": { 345 | "name": "ipython", 346 | "version": 3 347 | }, 348 | "file_extension": ".py", 349 | "mimetype": "text/x-python", 350 | "name": "python", 351 | "nbconvert_exporter": "python", 352 | "pygments_lexer": "ipython3", 353 | "version": "3.7.10" 354 | } 355 | }, 356 | "nbformat": 4, 357 | "nbformat_minor": 4 358 | } 359 | -------------------------------------------------------------------------------- /notebooks/03_UMAP_viz_part_1_prep.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Interactive visualization of UMAP representations: Part 1 (Prep)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "This script creates a spectrogram image for each call and saves all images in a pickled dictionary in the data subfolder (image_data.pkl). These images will be displayed later in the interactive visualization tool; generating them beforehand makes the tool faster, as images don't need to be created on-the-fly, but can be accessed through the dictionary. \n", 15 | "\n", 16 | "The default dictionary key is the filename without datatype specifier (e.g. 
without .wav), but if the dataframe contains a column 'callID', this is used as keys.\n", 17 | "\n", 18 | "#### The following minimal structure and files are required in the project directory:\n", 19 | "\n", 20 | " ├── data\n", 21 | " │ ├── df_umap.pkl <- pickled pandas dataframe with metadata, raw_audio, spectrograms and UMAP coordinates\n", 22 | " | (generated in 02a_generate_UMAP_basic.ipynb or 02b_generate_UMAP_timeshift.ipynb)\n", 23 | " ├── parameters \n", 24 | " │ ├── spec_params.py <- python file containing the spectrogram parameters used (generated in \n", 25 | " | 01_generate_spectrograms.ipynb) \n", 26 | " ├── functions <- the folder with the function files provided in the repo \n", 27 | " ├── notebooks <- the folder with the notebook files provided in the repo \n", 28 | " ├── ... \n", 29 | "\n", 30 | "\n", 31 | "#### The following columns must exist (somewhere) in the pickled dataframe df.pkl:\n", 32 | "(callID is optional)\n", 33 | "\n", 34 | " | filename | spectrograms | samplerate_hz | [optional: callID]\n", 35 | " --------------------------------------------------------------------\n", 36 | " | call_1.wav | 2D np.array | 8000 | [call_1]\n", 37 | " | call_2.wav | ... | 48000 | [call_2] \n", 38 | " | ... | ... | .... | .... 
\n", 39 | "\n", 40 | "#### The following files are generated in this script:\n", 41 | "\n", 42 | " ├── data\n", 43 | " │ ├── df_umap.pkl <- is overwritten with updated version of df_umap.pkl (with ID column) \n", 44 | " │ ├── image_data.pkl <- pickled dictionary with spectrogram images as values, ID column as keys" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "## Import statements, constants and functions" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 1, 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "import pandas as pd\n", 61 | "import numpy as np\n", 62 | "import pickle\n", 63 | "import matplotlib.pyplot as plt\n", 64 | "import os\n", 65 | "from pathlib import Path\n", 66 | "import soundfile as sf\n", 67 | "import io\n", 68 | "import librosa\n", 69 | "import librosa.display\n", 70 | "import umap\n", 71 | "\n", 72 | "import sys \n", 73 | "sys.path.insert(0, '..')" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": 3, 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [ 82 | "P_DIR = str(Path(os.getcwd()).parents[0]) \n", 83 | "DATA = os.path.join(os.path.sep, P_DIR, 'data') \n", 84 | "DF_NAME = 'df_umap.pkl'\n", 85 | "\n", 86 | "SPEC_COL = 'spectrograms' # column name that contains the spectrograms\n", 87 | "ID_COL = 'callID' # column name that contains call identifier (must be unique)\n", 88 | "\n", 89 | "\n", 90 | "OVERWRITE = False # If there already exists an image_data.pkl, should it be overwritten? 
Default no" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 4, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "# Spectrogramming parameters (needed for generating the images)\n", 100 | "\n", 101 | "from parameters.spec_params import FFT_WIN, FFT_HOP, FMIN, FMAX\n", 102 | "\n", 103 | "# Make sure the spectrogramming parameters are correct!\n", 104 | "# They are used to set the correct time and frequency axis labels for the spectrogram images. \n", 105 | "\n", 106 | "# If you are using bandpass-filtered spectrograms...\n", 107 | "if 'filtered' in SPEC_COL:\n", 108 | " # ...FMIN is set to LOWCUT, FMAX to HIGHCUT and N_MELS to N_MELS_FILTERED\n", 109 | " from parameters.spec_params import LOWCUT, HIGHCUT, N_MELS_FILTERED\n", 110 | " \n", 111 | " FMIN = LOWCUT\n", 112 | " FMAX = HIGHCUT\n", 113 | " N_MELS = N_MELS_FILTERED" 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": {}, 119 | "source": [ 120 | "## 1. Read in files" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 5, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "df = pd.read_pickle(os.path.join(os.path.sep, DATA, DF_NAME))" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "### 1.1. 
Check if call identifier column is present" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": 6, 142 | "metadata": {}, 143 | "outputs": [ 144 | { 145 | "name": "stdout", 146 | "output_type": "stream", 147 | "text": [ 148 | "No ID-Column found ( callID )\n", 149 | "Default ID column callID will be generated from filename.\n" 150 | ] 151 | } 152 | ], 153 | "source": [ 154 | "# Default callID will be the name of the wav file\n", 155 | "\n", 156 | "if ID_COL not in df.columns:\n", 157 | " print('No ID-Column found (', ID_COL, ')')\n", 158 | " \n", 159 | " if 'filename' in df.columns:\n", 160 | " print(\"Default ID column \", ID_COL, \"will be generated from filename.\")\n", 161 | " df[ID_COL] = [x.split(\".\")[0] for x in df['filename']]\n", 162 | " else:\n", 163 | " raise KeyError('Found neither ' + ID_COL + ' nor filename column to generate IDs from')" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": {}, 169 | "source": [ 170 | "## 2. Generate spectrogram images" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "A spectrogram image is generated from each row in the dataframe. Images are saved in a dictionary (keys are the ID_COL of the dataframe).\n", 178 | "\n", 179 | "The dictionary is pickled and saved as image_data.pkl. It will later be loaded in the interactive visualization script and these images will be displayed in the visualization." 
180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 7, 185 | "metadata": {}, 186 | "outputs": [ 187 | { 188 | "name": "stdout", 189 | "output_type": "stream", 190 | "text": [ 191 | "\r", 192 | "Processing i: 0 / 6428\r", 193 | "Processing i: 1 / 6428\r", 194 | "Processing i: 2 / 6428\r", 195 | "Processing i: 3 / 6428" 196 | ] 197 | }, 198 | { 199 | "name": "stderr", 200 | "output_type": "stream", 201 | "text": [ 202 | "/home/mthomas/anaconda3/envs/umap_tut_env/lib/python3.7/site-packages/librosa/display.py:974: MatplotlibDeprecationWarning: The 'basey' parameter of __init__() has been renamed 'base' since Matplotlib 3.3; support for the old name will be dropped two minor releases later.\n", 203 | " scaler(mode, **kwargs)\n", 204 | "/home/mthomas/anaconda3/envs/umap_tut_env/lib/python3.7/site-packages/librosa/display.py:974: MatplotlibDeprecationWarning: The 'linthreshy' parameter of __init__() has been renamed 'linthresh' since Matplotlib 3.3; support for the old name will be dropped two minor releases later.\n", 205 | " scaler(mode, **kwargs)\n" 206 | ] 207 | }, 208 | { 209 | "name": "stdout", 210 | "output_type": "stream", 211 | "text": [ 212 | "Processing i: 6427 / 6428 / 6428" 213 | ] 214 | } 215 | ], 216 | "source": [ 217 | "if OVERWRITE==False and os.path.isfile(os.path.join(os.path.sep,DATA,'image_data.pkl')):\n", 218 | " print(\"File already exists. 
Overwrite is set to FALSE, so no new image_data will be generated.\")\n", 219 | " \n", 220 | " # Double-check if image_data contains all the required calls\n", 221 | " with open(os.path.join(os.path.sep, DATA, 'image_data.pkl'), 'rb') as handle:\n", 222 | " image_data = pickle.load(handle) \n", 223 | " image_keys = list(image_data.keys())\n", 224 | " expected_keys = list(df[ID_COL])\n", 225 | " missing = list(set(expected_keys)-set(image_keys))\n", 226 | " \n", 227 | " if len(missing)>0:\n", 228 | " print(\"BUT: The current image_data.pkl file doesn't seem to contain all calls that are in your dataframe!\")\n", 229 | " \n", 230 | "else:\n", 231 | " image_data = {}\n", 232 | " for i,dat in enumerate(df.spectrograms):\n", 233 | " print('\\rProcessing i:',i,'/',df.shape[0], end='')\n", 234 | " dat = np.asarray(df.iloc[i][SPEC_COL]) \n", 235 | " sr = df.iloc[i]['samplerate_hz']\n", 236 | " plt.figure()\n", 237 | " librosa.display.specshow(dat,sr=sr, hop_length=int(FFT_HOP * sr) , fmin=FMIN, fmax=FMAX, y_axis='mel', x_axis='s',cmap='inferno')\n", 238 | " buf = io.BytesIO()\n", 239 | " plt.savefig(buf, format='png')\n", 240 | " byte_im = buf.getvalue()\n", 241 | " image_data[df.iloc[i][ID_COL]] = byte_im\n", 242 | " plt.close()\n", 243 | "\n", 244 | " # Store data (serialize)\n", 245 | " with open(os.path.join(os.path.sep,DATA,'image_data.pkl'), 'wb') as handle:\n", 246 | " pickle.dump(image_data, handle, protocol=pickle.HIGHEST_PROTOCOL)" 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "## 3. Save dataframe" 254 | ] 255 | }, 256 | { 257 | "cell_type": "markdown", 258 | "metadata": {}, 259 | "source": [ 260 | "Save the dataframe to make sure it contains the correct ID column for access to the image_data." 
261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": 8, 266 | "metadata": {}, 267 | "outputs": [], 268 | "source": [ 269 | "df.to_pickle(os.path.join(os.path.sep, DATA, DF_NAME))" 270 | ] 271 | } 272 | ], 273 | "metadata": { 274 | "kernelspec": { 275 | "display_name": "Python 3", 276 | "language": "python", 277 | "name": "python3" 278 | }, 279 | "language_info": { 280 | "codemirror_mode": { 281 | "name": "ipython", 282 | "version": 3 283 | }, 284 | "file_extension": ".py", 285 | "mimetype": "text/x-python", 286 | "name": "python", 287 | "nbconvert_exporter": "python", 288 | "pygments_lexer": "ipython3", 289 | "version": "3.7.10" 290 | } 291 | }, 292 | "nbformat": 4, 293 | "nbformat_minor": 4 294 | } 295 | -------------------------------------------------------------------------------- /parameters/__pycache__/spec_params.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/marathomas/tutorial_repo/5123e19118f51c81e3c933b19fe264a0e7744798/parameters/__pycache__/spec_params.cpython-37.pyc -------------------------------------------------------------------------------- /parameters/spec_params.py: -------------------------------------------------------------------------------- 1 | N_MELS = 40 2 | FFT_WIN = 0.03 3 | FFT_HOP = 0.00375 4 | WINDOW = "hann" 5 | FMIN = 0 6 | FMAX = 4000 7 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import find_packages, setup 2 | 3 | setup( 4 | name='umap_tutorial', 5 | packages=find_packages(), 6 | version='0.1.0', 7 | description='Tutorial for generating latent-space representations from vocalizations using UMAP', 8 | author='Mara Thomas', 9 | license='MIT', 10 | ) 11 | --------------------------------------------------------------------------------
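As a self-contained reference for the preprocessing shared by the notebooks above, here is a minimal sketch in plain numpy of the basic pipeline from 02a (z-score, zero-pad to the longest call, flatten) together with the size-normalized euclidean distance used in 02b. The helpers `calc_zscore` and `pad_spectro` below are simplified stand-ins for the repo's versions in functions/preprocessing_functions.py, assumed to behave as the notebook comments describe, not copied from the repo.

```python
import numpy as np

def calc_zscore(spec):
    # z-transform the whole spectrogram to zero mean and unit variance
    return (spec - spec.mean()) / spec.std()

def pad_spectro(spec, maxlen):
    # zero-pad the time axis (columns) on the right up to maxlen frames
    nfreq, ntime = spec.shape
    out = np.zeros((nfreq, maxlen))
    out[:, :ntime] = spec
    return out

def spec_dist_euclidean(a, b):
    # euclidean distance normalized by vector size, as in the time-shift
    # notebook, so longer calls do not automatically get larger distances
    return np.sqrt(np.sum((a - b) ** 2)) / np.sqrt(a.size)

# two toy "spectrograms" of different lengths (40 mel bins, 50 vs. 80 frames)
rng = np.random.default_rng(0)
specs = [rng.random((40, 50)), rng.random((40, 80))]
specs = [calc_zscore(s) for s in specs]

maxlen = max(s.shape[1] for s in specs)  # longest call in the dataset: 80 frames
data = np.asarray([pad_spectro(s, maxlen).flatten() for s in specs])
print(data.shape)  # (2, 3200): two calls, 40 * 80 values each
```

With umap-learn installed, `umap.UMAP(n_components=3, min_dist=0, random_state=2204).fit_transform(data)` would then produce the 3-D embedding, as in section 2 of the notebooks.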