├── slides
│   ├── ml-4-audio-session1.pdf
│   ├── ml-4-audio-session2.pdf
│   ├── ml-4-audio-session3.pdf
│   ├── ml-4-audio-paper-reading-2-hubert.pdf
│   ├── ml-4-audio-paper-reading-1-wav2vec2.pdf
│   ├── ml-4-audio-paper-reading-3-data2vec.pdf
│   └── ml-4-audio-paper-reading-4-speecht5.pdf
├── README.md
└── notebooks
    ├── session2
    │   ├── speech_recognition.ipynb
    │   └── audio_data.ipynb
    └── session3
        └── text_to_speech.ipynb

/slides/ml-4-audio-session1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vaibhavs10/ml-with-audio/HEAD/slides/ml-4-audio-session1.pdf
--------------------------------------------------------------------------------

/slides/ml-4-audio-session2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vaibhavs10/ml-with-audio/HEAD/slides/ml-4-audio-session2.pdf
--------------------------------------------------------------------------------

/slides/ml-4-audio-session3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vaibhavs10/ml-with-audio/HEAD/slides/ml-4-audio-session3.pdf
--------------------------------------------------------------------------------

/slides/ml-4-audio-paper-reading-2-hubert.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vaibhavs10/ml-with-audio/HEAD/slides/ml-4-audio-paper-reading-2-hubert.pdf
--------------------------------------------------------------------------------

/slides/ml-4-audio-paper-reading-1-wav2vec2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vaibhavs10/ml-with-audio/HEAD/slides/ml-4-audio-paper-reading-1-wav2vec2.pdf
--------------------------------------------------------------------------------

/slides/ml-4-audio-paper-reading-3-data2vec.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vaibhavs10/ml-with-audio/HEAD/slides/ml-4-audio-paper-reading-3-data2vec.pdf
--------------------------------------------------------------------------------

/slides/ml-4-audio-paper-reading-4-speecht5.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vaibhavs10/ml-with-audio/HEAD/slides/ml-4-audio-paper-reading-4-speecht5.pdf
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Hugging Face Machine Learning for Audio Study Group

Welcome to the ML for Audio Study Group. Through a series of presentations, paper readings, and discussions, we'll explore the field of applying Machine Learning in the audio domain. Some examples of this are:
* Generating synthetic speech from a given text (think of conversational assistants).
* Transcribing audio signals to text.
* Removing noise from an audio signal.
* Separating different sources of audio.
* Identifying which speaker is talking.
* And much more!

We suggest you join the community Discord at http://hf.co/join/discord; we look forward to meeting you in the #ml-4-audio-study-group channel 🤗. Remember, this is a community effort, so make this space your own!

## Organisation

We'll kick off with some basics and then collaboratively decide the further direction of the group.
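The `audio_data.ipynb` notebook in session 2 covers these basics in depth. As a quick taste, the sketch below (a hedged, minimal example assuming only NumPy; the function name and parameters are illustrative, not from the course material) shows how a raw waveform is split into overlapping frames and turned into a magnitude spectrogram, the classic time-frequency representation that underlies many of the use cases above:

```python
import numpy as np

def waveform_to_spectrogram(signal, frame_len=512, hop=256):
    """Split a 1-D waveform into overlapping, windowed frames and take
    the magnitude of each frame's FFT -- a minimal spectrogram."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-redundant half of the spectrum
    return np.abs(np.fft.rfft(frames, axis=1))

# 1 second of a 440 Hz sine wave at a 16 kHz sampling rate
sr = 16_000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)

spec = waveform_to_spectrogram(signal)
print(spec.shape)  # (n_frames, frame_len // 2 + 1)
```

Each row of `spec` is one time step and each column one frequency bin; for the pure 440 Hz tone, the energy concentrates in the bin nearest 440 * frame_len / sr. Models such as wav2vec 2.0 (see the paper-reading slides) learn their features directly from the raw waveform instead, but spectrograms remain the standard mental model for audio data.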

Before each session:
* Read/watch the related resources

During each session, you can:
* Ask questions in the forum
* Give a short (~10-15 min) presentation on the topic (agreed upon beforehand)

Before/after each session:
* Keep discussing and asking questions about the topic (#ml-4-audio-study-group channel on Discord)
* Share interesting resources

## Schedule

| Date | Topics | Resources (to read beforehand) |
|--------------|-----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Dec 14, 2021 | Kickoff + Overview of audio-related use cases ([video](https://www.youtube.com/watch?v=cAviRhkqdnc&ab_channel=HuggingFace), [questions](https://discuss.huggingface.co/t/ml-for-audio-study-group-kick-off-dec-14/12745)) | [The 3 DL Frameworks for e2e Speech Recognition that power your devices](https://heartbeat.comet.ml/the-3-deep-learning-frameworks-for-end-to-end-speech-recognition-that-power-your-devices-37b891ddc380) |
| Dec 21, 2021 | Intro to Audio and ASR ([video](https://www.youtube.com/watch?v=D-MH6YjuIlE&ab_channel=HuggingFace), [questions](https://discuss.huggingface.co/t/ml-for-audio-study-group-intro-to-audio-and-asr-dec-21/12890)) |