├── GOATBookLM_Nari_Labs_DIA_1B_Open_Source_Podcast_Generator_Shared.ipynb
├── README.md
├── combined_20250526_212529.wav
├── f_f_voices.mp3
├── f_f_voices.txt
├── m_f_voices.mp3
├── m_f_voices.txt
├── m_m_voices.mp3
└── m_m_voices.txt

/README.md:
--------------------------------------------------------------------------------
1 | **Welcome To GOATBookLM The Open Source Podcast Generator Powered by Nari Labs Dia 1B**
2 |
3 | The Colab notebook in this repo enables you to generate **dual voice** podcast-style audio files using Nari Labs' open source audio model Dia-1.6B.
4 |
5 | **Link to the notebook in Colab:**
6 |
7 | https://colab.research.google.com/github/smartaces/dia_podcast_generator/blob/main/GOATBookLM_Nari_Labs_DIA_1B_Open_Source_Podcast_Generator_Shared.ipynb
8 |
9 | It is structured as follows...
10 |
11 | 1. First, an example of generating a basic short clip with Dia-1.6B using the default randomized voices functionality.
12 |
13 | 2. After that, the notebook lets you choose and load base voices, so you can create longer podcast-style audio files with CONSISTENT voices throughout.
14 |
15 | **Beyond this, the notebook also includes:**
16 |
17 | * A Dia-formatted podcast script generator that works from any text source you paste in. This makes it simple to quickly create a script optimized for how Dia works. You can of course also modify the script as you wish.
18 |
19 | * The script generator allows you to use a variety of OpenAI, Google Gemini or Anthropic models.
20 |
21 | * The notebook also allows you to preview sections of the podcast audio you generate, and redraft or regenerate parts if needed.
22 |
23 | * Finally, the notebook exports a single complete file which you can listen to as a full podcast recording.
24 |
25 | ![image](https://github.com/user-attachments/assets/3f1ab2a9-9ef2-4297-bfdb-1b32e992540b)
26 |
27 |
--------------------------------------------------------------------------------
/combined_20250526_212529.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/smartaces/dia_podcast_generator/814adf9a530aabab7b8d0a0cdcceef1c98f5451f/combined_20250526_212529.wav
--------------------------------------------------------------------------------
/f_f_voices.mp3:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/smartaces/dia_podcast_generator/814adf9a530aabab7b8d0a0cdcceef1c98f5451f/f_f_voices.mp3
--------------------------------------------------------------------------------
/f_f_voices.txt:
--------------------------------------------------------------------------------
1 | [S1] Welcome back to another episode of AI Unfiltered! I'm Jamie. [S2] And I'm Taylor. Today, we have some really exciting news from the text-to-speech frontier. [S1] That's right! There's a new open source TTS model on the block called Dia. [S2] So, Dia is developed by a two-person startup called Nari Labs.
2 |
--------------------------------------------------------------------------------
/m_f_voices.mp3:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/smartaces/dia_podcast_generator/814adf9a530aabab7b8d0a0cdcceef1c98f5451f/m_f_voices.mp3
--------------------------------------------------------------------------------
/m_f_voices.txt:
--------------------------------------------------------------------------------
1 | [S1] Welcome back to another episode of AI Unfiltered! I'm Jamie.
2 | [S2] And I'm Taylor. Today, we have some really exciting news from the text-to-speech frontier.
3 | [S1] That's right. There's a new open source TTS model on the block called Dia.
4 | [S2] So, Dia is developed by a two-person startup called Nari Labs.
5 |
--------------------------------------------------------------------------------
/m_m_voices.mp3:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/smartaces/dia_podcast_generator/814adf9a530aabab7b8d0a0cdcceef1c98f5451f/m_m_voices.mp3
--------------------------------------------------------------------------------
/m_m_voices.txt:
--------------------------------------------------------------------------------
1 | [S1] Pretty amazing for scriptwriters, podcasters, or even game developers. Oh, and you don’t need to fish around for complicated setup — Nari Labs has released Python code and a browser-based demo.
2 | [S2] Yeah, I tried their Gradio demo last night. Seriously impressive. The model even switches emotional tone smoothly, which is something ElevenLabs and the new Sesame model struggle with.
3 |
--------------------------------------------------------------------------------
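The voice transcripts above mark each speaker turn with a `[S1]` or `[S2]` tag, and the README describes generating longer podcasts section by section. As a rough illustration only, the sketch below parses such a script into speaker turns and groups them into smaller sections that could be previewed or regenerated one at a time. The helper names `parse_turns` and `chunk_turns` and the `max_chars` limit are hypothetical and are not taken from the notebook, and the sketch does not call the Dia model itself.

```python
import re
from typing import List, Tuple

# Matches a speaker tag such as [S1] or [S2] plus the text up to the next tag.
TURN_RE = re.compile(r"\[(S\d+)\]\s*(.*?)(?=\[S\d+\]|\Z)", re.DOTALL)


def parse_turns(script: str) -> List[Tuple[str, str]]:
    """Split a tagged script into (speaker, text) pairs, in order."""
    turns = []
    for speaker, text in TURN_RE.findall(script):
        cleaned = " ".join(text.split())  # collapse newlines and extra spaces
        if cleaned:
            turns.append((speaker, cleaned))
    return turns


def chunk_turns(turns: List[Tuple[str, str]], max_chars: int = 400) -> List[str]:
    """Group consecutive turns into sections of roughly max_chars characters.

    max_chars is an assumed, illustrative limit. A single turn longer than
    the limit is kept whole as its own section rather than being split.
    """
    chunks: List[str] = []
    current = ""
    for speaker, text in turns:
        line = f"[{speaker}] {text}"
        candidate = f"{current} {line}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Sample text copied from the transcripts in this repo.
    sample = (
        "[S1] Welcome back to another episode of AI Unfiltered! I'm Jamie. "
        "[S2] And I'm Taylor."
    )
    for section in chunk_turns(parse_turns(sample), max_chars=70):
        print(section)
```

Keeping whole turns together means a section never ends in the middle of a speaker's sentence, which seems a reasonable default when sections are regenerated independently.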