├── .devcontainer └── devcontainer.json ├── .github └── workflows │ └── stale.yaml ├── .gitignore ├── Dockerfile ├── LICENSE.md ├── README.md ├── __pycache__ └── download.cpython-312.pyc ├── assets ├── audio │ ├── essence_calculus.m4a │ ├── groq_ama_trimmed_20min.m4a │ └── transformers_explained.m4a └── groqlabs.svg ├── download.py ├── examples ├── essence_calculus │ └── generated_notes.pdf └── transformers_explained │ └── generated_notes.pdf ├── main.py ├── packages.txt ├── replit.nix └── requirements.txt /.devcontainer/devcontainer.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "Python 3", 3 | // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile 4 | "image": "mcr.microsoft.com/devcontainers/python:1-3.11-bullseye", 5 | "customizations": { 6 | "codespaces": { 7 | "openFiles": [ 8 | "README.md", 9 | "main.py" 10 | ] 11 | }, 12 | "vscode": { 13 | "settings": {}, 14 | "extensions": [ 15 | "ms-python.python", 16 | "ms-python.vscode-pylance" 17 | ] 18 | } 19 | }, 20 | "updateContentCommand": "[ -f packages.txt ] && sudo apt update && sudo apt upgrade -y && sudo xargs apt install -y ~/.streamlit/config.toml 19 | 20 | ENTRYPOINT ["streamlit", "run", "main.py", "--server.port=8080", "--server.address=0.0.0.0", "--server.headless=true"] -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Benjamin Klieger 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

2 |
3 | Generate Organized Notes with BlogWizard 4 |
5 |
6 | BlogWizard: Generate blog articles from video or audio
using Groq, Whisper, and Llama3 7 |
8 |

9 | 10 | 11 |

12 | 13 | 14 | 15 | 16 |

17 | 18 |

19 | Overview • 20 | Features • 21 | Quickstart • 22 | Contributing 23 |

24 | 25 |
26 | 27 | [Demo of BlogWizard](https://github.com/user-attachments/assets/0893d952-bb3b-4f94-b79d-cedb6183024b) 28 | > Demo of BlogWizard's fast audio transcription and structured blog generation. 29 | 30 | 31 | ## Overview 32 | 33 | BlogWizard is a Streamlit app that scaffolds blog creation by iteratively structuring and generating articles from audio transcribed with Groq's Whisper API. The app mixes Llama3-8b and Llama3-70b, using the larger model to generate the blog structure and the faster of the two to write the content. 34 | 35 | 36 | ### Features 37 | 38 | - 🎧 Generate a structured blog from audio transcribed by Whisper-large and text written by Llama3 39 | - ⚡ Lightning-fast audio transcription and text generation using Groq 40 | - 📖 Scaffolded prompting strategically switches between Llama3-70b and Llama3-8b to balance speed and quality 41 | - 🖊️ Markdown styling creates aesthetic blog posts in the Streamlit app that can include tables and code 42 | - 📂 Lets the user download a text or PDF file with the entire blog contents 43 | 44 | ## Quickstart 45 | 46 | > [!IMPORTANT] 47 | > To use BlogWizard, you can use the hosted version at [https://blogwizard.groqlabs.com/](https://blogwizard.groqlabs.com/). 48 | > Alternatively, you can run BlogWizard locally with Streamlit using the quickstart instructions. 49 | 50 | 51 | ### Run locally: 52 | 53 | To run BlogWizard locally with Streamlit, follow the steps below. 54 | 55 | #### Step 1 56 | First, you can set your Groq API key in the environment variables: 57 | 58 | ~~~ 59 | export GROQ_API_KEY="gsk_yA..." 60 | ~~~ 61 | 62 | This step is optional and allows you to skip setting the Groq API key later in the Streamlit app. 63 | 64 | #### Step 2 65 | Next, you can set up a virtual environment and install the dependencies. 66 | 67 | ~~~ 68 | python3 -m venv venv 69 | ~~~ 70 | 71 | ~~~ 72 | source venv/bin/activate 73 | ~~~ 74 | 75 | ~~~ 76 | pip3 install -r requirements.txt 77 | ~~~ 78 | 79 | 80 | #### Step 3 81 | Finally, you can run the Streamlit app. 82 | 83 | ~~~ 84 | python3 -m streamlit run main.py 85 | ~~~ 86 | 87 | ## Details 88 | 89 | 90 | ### Technologies 91 | 92 | - Streamlit 93 | - Llama3 on Groq Cloud 94 | - Whisper-large on Groq Cloud 95 | - PyDub to trim longer audio files down to the first 19 minutes 96 | - Python 3.12 is recommended 97 | 98 | ### Limitations 99 | 100 | Audio files larger than 25 MB or YouTube videos longer than 19 minutes will be trimmed to those thresholds and then summarized. 101 | 102 | BlogWizard may generate inaccurate information or placeholder content. It should be used to generate blog drafts for entertainment purposes only. 103 | 104 | ## Contributing 105 | 106 | Improvements through PRs are welcome!
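For contributors, the sketch below gives a rough sense of the two-stage flow described in the Overview: the outline model drafts the blog structure as JSON, and the content model then writes each section. This is a minimal, illustrative sketch only (it assumes `GROQ_API_KEY` is set and uses the current default model names); see `main.py` for the actual implementation, which adds streaming, generation statistics, and trimming of long audio.

~~~
import json
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

def outline(transcript: str) -> dict:
    # Stage 1: the larger model drafts the blog structure as JSON.
    resp = client.chat.completions.create(
        model="llama3-70b-8192",
        messages=[
            {"role": "system", "content": 'Write in JSON format: {"Section title": "Section description"}'},
            {"role": "user", "content": f"### Transcript\n\n{transcript}\n\nCreate a structure for a blog article."},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

def write_section(transcript: str, title: str, description: str) -> str:
    # Stage 2: the faster model writes the body text for one section.
    resp = client.chat.completions.create(
        model="llama3-8b-8192",
        messages=[
            {"role": "system", "content": "You are an expert blog writer. Write only the body paragraphs for the given section."},
            {"role": "user", "content": f"### Transcript\n\n{transcript}\n\n### Section\n\n{title}: {description}"},
        ],
    )
    return resp.choices[0].message.content

# transcript = ...  # text returned by Whisper transcription
# blog = {title: write_section(transcript, title, desc) for title, desc in outline(transcript).items()}
~~~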
107 | 108 | Adapted from ScribeWizard by [Ben Klieger](https://github.com/bklieger-groq) 109 | -------------------------------------------------------------------------------- /__pycache__/download.cpython-312.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/groq/blogwizard/35dba42d09711065678957c9e552c65876cf995f/__pycache__/download.cpython-312.pyc -------------------------------------------------------------------------------- /assets/audio/essence_calculus.m4a: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/groq/blogwizard/35dba42d09711065678957c9e552c65876cf995f/assets/audio/essence_calculus.m4a -------------------------------------------------------------------------------- /assets/audio/groq_ama_trimmed_20min.m4a: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/groq/blogwizard/35dba42d09711065678957c9e552c65876cf995f/assets/audio/groq_ama_trimmed_20min.m4a -------------------------------------------------------------------------------- /assets/audio/transformers_explained.m4a: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/groq/blogwizard/35dba42d09711065678957c9e552c65876cf995f/assets/audio/transformers_explained.m4a -------------------------------------------------------------------------------- /assets/groqlabs.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 10 | 11 | 12 | 13 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | -------------------------------------------------------------------------------- /download.py: -------------------------------------------------------------------------------- 1 | from __future__ import unicode_literals 2 | import yt_dlp as youtube_dl 3 | import os 4 | import time 5 | import os 6 | import shutil 7 | 8 | MAX_FILE_SIZE = 25 * 1024 * 1024 # 25 MB 9 | LARGER_MAX_FILE_SIZE = 225 * 1024 * 1024 # 225 MB, around a 3 hour audio file max 10 | FILE_TOO_LARGE_MESSAGE = "The audio file is too large for the current size and rate limits using Whisper. If you used a YouTube link, please try a shorter video clip. If you uploaded an audio file, try trimming or compressing the audio to under 25 MB."
11 | max_retries = 3 12 | delay = 2 13 | 14 | 15 | class MyLogger(object): 16 | def __init__(self, external_logger=lambda x: None): 17 | self.external_logger = external_logger 18 | 19 | def debug(self, msg): 20 | print("[debug]: ", msg) 21 | self.external_logger(msg) 22 | 23 | def warning(self, msg): 24 | print("[warning]: ", msg) 25 | 26 | def error(self, msg): 27 | print("[error]: ", msg) 28 | 29 | 30 | def my_hook(d): 31 | print("hook", d["status"]) 32 | if d["status"] == "finished": 33 | print("Done downloading, now converting ...") 34 | 35 | 36 | def get_ydl_opts(external_logger=lambda x: None): 37 | return { 38 | "format": "bestaudio/best", 39 | "postprocessors": [ 40 | { 41 | "key": "FFmpegExtractAudio", 42 | "preferredcodec": "mp3", 43 | "preferredquality": "192", # set the preferred bitrate to 192kbps 44 | } 45 | ], 46 | "logger": MyLogger(external_logger), 47 | "outtmpl": "./downloads/audio/%(title)s.%(ext)s", # Set the output filename directly 48 | "progress_hooks": [my_hook], 49 | } 50 | 51 | 52 | def download_video_audio(url, external_logger=lambda x: None): 53 | retries = 0 54 | while retries < max_retries: 55 | try: 56 | ydl_opts = get_ydl_opts(external_logger) 57 | with youtube_dl.YoutubeDL(ydl_opts) as ydl: 58 | print("Going to download ", url) 59 | info = ydl.extract_info(url, download=False) 60 | filesize = info.get("filesize", 0) 61 | if filesize > MAX_FILE_SIZE: 62 | if filesize > LARGER_MAX_FILE_SIZE: 63 | # raise error we are not transcribing any video over 3 hours 64 | raise Exception(FILE_TOO_LARGE_MESSAGE) 65 | else: 66 | print("Only the first 19 minutes of the file will be summarized.") 67 | 68 | filename = ydl.prepare_filename(info) 69 | res = ydl.download([url]) 70 | print("youtube-dl result :", res) 71 | mp3_filename = os.path.splitext(filename)[0] + '.mp3' 72 | print('mp3 file name - ', mp3_filename) 73 | return mp3_filename 74 | except Exception as e: 75 | retries += 1 76 | print( 77 | f"An error occurred during downloading (Attempt {retries}/{max_retries}):", 78 | str(e), 79 | ) 80 | if retries >= max_retries: 81 | raise e 82 | time.sleep(delay) 83 | 84 | 85 | 86 | def delete_download(path): 87 | try: 88 | if os.path.isfile(path): 89 | os.remove(path) 90 | print(f"File {path} has been deleted.") 91 | elif os.path.isdir(path): 92 | shutil.rmtree(path) 93 | print(f"Directory {path} and its contents have been deleted.") 94 | else: 95 | print(f"The path {path} is neither a file nor a directory.") 96 | except PermissionError: 97 | print(f"Permission denied: Unable to delete {path}.") 98 | except FileNotFoundError: 99 | print(f"File or directory not found: {path}") 100 | except Exception as e: 101 | print(f"An error occurred while trying to delete {path}: {str(e)}") 102 | -------------------------------------------------------------------------------- /examples/essence_calculus/generated_notes.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/groq/blogwizard/35dba42d09711065678957c9e552c65876cf995f/examples/essence_calculus/generated_notes.pdf -------------------------------------------------------------------------------- /examples/transformers_explained/generated_notes.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/groq/blogwizard/35dba42d09711065678957c9e552c65876cf995f/examples/transformers_explained/generated_notes.pdf -------------------------------------------------------------------------------- /main.py: 
-------------------------------------------------------------------------------- 1 | import streamlit as st 2 | from groq import Groq 3 | import json 4 | import os 5 | from io import BytesIO 6 | from md2pdf.core import md2pdf 7 | from dotenv import load_dotenv 8 | from download import download_video_audio, delete_download 9 | from pydub import AudioSegment 10 | 11 | load_dotenv() 12 | 13 | # idk where it is in the code but for file upload, if we're given a video then we can just split it using python libraries and get the audio. 14 | 15 | GROQ_API_KEY = os.environ.get("GROQ_API_KEY", None) 16 | 17 | MAX_FILE_SIZE = 25 * 1024 * 1024 # 25 MB 18 | FILE_TOO_LARGE_MESSAGE = "The audio file is too large for the current size and rate limits using Whisper. If you used a YouTube link, please try a shorter video clip. If you uploaded an audio file, try trimming or compressing the audio to under 25 MB." 19 | 20 | global_variable = None 21 | audio_file_path = None 22 | 23 | if 'api_key' not in st.session_state: 24 | st.session_state.api_key = GROQ_API_KEY 25 | 26 | if 'groq' not in st.session_state: 27 | if GROQ_API_KEY: 28 | st.session_state.groq = Groq() 29 | 30 | st.set_page_config( 31 | page_title="BlogWizard", 32 | page_icon="🧙‍♂️", 33 | ) 34 | 35 | class GenerationStatistics: 36 | def __init__(self, input_time=0,output_time=0,input_tokens=0,output_tokens=0,total_time=0,model_name="llama3-8b-8192"): 37 | self.input_time = input_time 38 | self.output_time = output_time 39 | self.input_tokens = input_tokens 40 | self.output_tokens = output_tokens 41 | self.total_time = total_time # Sum of queue, prompt (input), and completion (output) times 42 | self.model_name = model_name 43 | 44 | def get_input_speed(self): 45 | """ 46 | Tokens per second calculation for input 47 | """ 48 | if self.input_time != 0: 49 | return self.input_tokens / self.input_time 50 | else: 51 | return 0 52 | 53 | def get_output_speed(self): 54 | """ 55 | Tokens per second calculation for output 56 | """ 57 | if self.output_time != 0: 58 | return self.output_tokens / self.output_time 59 | else: 60 | return 0 61 | 62 | def add(self, other): 63 | """ 64 | Add statistics from another GenerationStatistics object to this one. 
65 | """ 66 | if not isinstance(other, GenerationStatistics): 67 | raise TypeError("Can only add GenerationStatistics objects") 68 | 69 | self.input_time += other.input_time 70 | self.output_time += other.output_time 71 | self.input_tokens += other.input_tokens 72 | self.output_tokens += other.output_tokens 73 | self.total_time += other.total_time 74 | 75 | def __str__(self): 76 | return (f"\n## {self.get_output_speed():.2f} T/s ⚡\nRound trip time: {self.total_time:.2f}s Model: {self.model_name}\n\n" 77 | f"| Metric | Input | Output | Total |\n" 78 | f"|-----------------|----------------|-----------------|----------------|\n" 79 | f"| Speed (T/s) | {self.get_input_speed():.2f} | {self.get_output_speed():.2f} | {(self.input_tokens + self.output_tokens) / self.total_time if self.total_time != 0 else 0:.2f} |\n" 80 | f"| Tokens | {self.input_tokens} | {self.output_tokens} | {self.input_tokens + self.output_tokens} |\n" 81 | f"| Inference Time (s) | {self.input_time:.2f} | {self.output_time:.2f} | {self.total_time:.2f} |") 82 | 83 | class NoteSection: 84 | def __init__(self, structure, transcript): 85 | self.structure = structure 86 | self.contents = {title: "" for title in self.flatten_structure(structure)} 87 | self.placeholders = {title: st.empty() for title in self.flatten_structure(structure)} 88 | 89 | st.markdown("## Raw transcript:") 90 | st.markdown(transcript) 91 | st.markdown("---") 92 | 93 | def flatten_structure(self, structure): 94 | sections = [] 95 | for title, content in structure.items(): 96 | sections.append(title) 97 | if isinstance(content, dict): 98 | sections.extend(self.flatten_structure(content)) 99 | return sections 100 | 101 | def update_content(self, title, new_content): 102 | try: 103 | self.contents[title] += new_content 104 | self.display_content(title) 105 | except TypeError as e: 106 | pass 107 | 108 | def display_content(self, title): 109 | if self.contents[title].strip(): 110 | self.placeholders[title].markdown(f"## {title}\n{self.contents[title]}") 111 | 112 | def return_existing_contents(self, level=1) -> str: 113 | existing_content = "" 114 | for title, content in self.structure.items(): 115 | if self.contents[title].strip(): # Only include title if there is content 116 | existing_content += f"{'#' * level} {title}\n{self.contents[title]}.\n\n" 117 | if isinstance(content, dict): 118 | existing_content += self.get_markdown_content(content, level + 1) 119 | return existing_content 120 | 121 | def display_structure(self, structure=None, level=1): 122 | if structure is None: 123 | structure = self.structure 124 | 125 | for title, content in structure.items(): 126 | if self.contents[title].strip(): # Only display title if there is content 127 | st.markdown(f"{'#' * level} {title}") 128 | self.placeholders[title].markdown(self.contents[title]) 129 | if isinstance(content, dict): 130 | self.display_structure(content, level + 1) 131 | 132 | def display_toc(self, structure, columns, level=1, col_index=0): 133 | for title, content in structure.items(): 134 | with columns[col_index % len(columns)]: 135 | st.markdown(f"{' ' * (level-1) * 2}- {title}") 136 | col_index += 1 137 | if isinstance(content, dict): 138 | col_index = self.display_toc(content, columns, level + 1, col_index) 139 | return col_index 140 | 141 | def get_markdown_content(self, structure=None, level=1): 142 | """ 143 | Returns the markdown styled pure string with the contents. 
144 | """ 145 | if structure is None: 146 | structure = self.structure 147 | 148 | markdown_content = "" 149 | for title, content in structure.items(): 150 | if self.contents[title].strip(): # Only include title if there is content 151 | markdown_content += f"{'#' * level} {title}\n{self.contents[title]}.\n\n" 152 | if isinstance(content, dict): 153 | markdown_content += self.get_markdown_content(content, level + 1) 154 | return markdown_content 155 | 156 | def get_markdown_arabic(self, structure=None, level=1): 157 | """ 158 | Returns the dictionary contents of the structure. 159 | """ 160 | if structure is None: 161 | structure = self.structure 162 | 163 | markdown_content = "" 164 | for title, content in structure.items(): 165 | if self.contents[title].strip(): # Only include title if there is content 166 | markdown_content += translate_to_arabic(f"{'#' * level} {title}\n{self.contents[title]}.\n\n") 167 | if isinstance(content, dict): 168 | markdown_content += translate_to_arabic(self.get_markdown_content(content, level + 1)) 169 | return markdown_content 170 | 171 | 172 | def create_markdown_file(content: str) -> BytesIO: 173 | """ 174 | Create a Markdown file from the provided content. 175 | """ 176 | markdown_file = BytesIO() 177 | markdown_file.write(content.encode('utf-8')) 178 | markdown_file.seek(0) 179 | return markdown_file 180 | 181 | def create_pdf_file(content: str): 182 | """ 183 | Create a PDF file from the provided content. 184 | """ 185 | pdf_buffer = BytesIO() 186 | md2pdf(pdf_buffer, md_content=content) 187 | pdf_buffer.seek(0) 188 | return pdf_buffer 189 | 190 | def transcribe_audio(audio_file): 191 | """ 192 | Transcribes audio using Groq's Whisper API. 193 | """ 194 | transcription = st.session_state.groq.audio.transcriptions.create( 195 | file=audio_file, 196 | model="whisper-large-v3", 197 | prompt="If Groq is mentioned it is spelled Groq", 198 | response_format="json", 199 | language="en", 200 | temperature=0.0 201 | ) 202 | 203 | results = transcription.text 204 | return results 205 | 206 | def generate_notes_structure(transcript: str, blog_style, model: str = "llama3-70b-8192"): 207 | """ 208 | Returns notes structure content as well as total tokens and total time for generation. 209 | """ 210 | 211 | shot_example = """ 212 | "Introduction": "Brief overview of the topic. Why it's relevant and important", 213 | "Key Topic Discussions [1-3]": "Talk about the key moments of the topic", 214 | "Analysis and Insights": "Highlight insights and statistics. May include past and present comparison", 215 | "Takeaways": "Share advice that may be relevant to the readers", 216 | "Conclusion": "May include recap of key points, implications for the future, call to action." 
217 | }""" 218 | if blog_style == "Customer Case Study": 219 | shot_example = """ 220 | Customer company description 221 | Challenge 222 | -List one to three main challenges that a customer or an end user faces 223 | -These challenges should clearly express why there is a need for solution 224 | Solution 225 | -Explain how the featured customer solves the above stated challenge 226 | -In this description, include unique advantages and specific ROI that the customer offers to its end users 227 | -Explain how Groq enables this customer to deliver this solution better than anyone else - this should usually include something about our value prop around speed, scalability, performance, or ROI 228 | Key Features 229 | Opportunity 230 | -Explain the ways this solution can transform an end user's experience, disrupt an industry, or change the course of the world 231 | -Explain how the solution can be applied to various industries and use cases""" 232 | elif blog_style == "Launch of new Product": 233 | shot_example = """ 234 | Introduction 235 | -Name of model is now available on GroqCloud 236 | -How to access the model 237 | -A video or image showcasing the model running 238 | -Quote from senior level executive, internal or external 239 | Advantages of model 240 | -Speed 241 | -Quality 242 | -Performance 243 | -Price 244 | -Third party benchmarks if available 245 | Background on the model 246 | -How was it built? 247 | -Who does it serve? 248 | -What use cases can it help with most? 249 | -Why does it matter that the model is running on Groq 250 | -Name of model running on GroqCloud means (speed, accessibility, performance, or some other value prop) for developers -and enterprises that is otherwise unavailable in the market 251 | CTA 252 | -Start building with Name of model today 253 | -Call out any tools or features that make the model more enticing (tool use, higher rate limits, etc) 254 | """ 255 | 256 | completion = st.session_state.groq.chat.completions.create( 257 | model=model, 258 | messages=[ 259 | { 260 | "role": "system", 261 | "content": "Write in JSON format:\n\n{\"Title of section goes here\":\"Description of section goes here\",\"Title of section goes here\":\"Description of section goes here\",\"Title of section goes here\":\"Description of section goes here\"}" 262 | }, 263 | { 264 | "role": "user", 265 | "content": f"### Transcript {transcript}\n\n### Example\n\n{shot_example}### Instructions\n\nCreate a structure for a comprehensive blog article on the above transcribed audio. Section titles and content descriptions must be comprehensive. Quality over quantity." 266 | } 267 | ], 268 | temperature=0.3, 269 | max_tokens=8000, 270 | top_p=1, 271 | stream=False, 272 | response_format={"type": "json_object"}, 273 | stop=None, 274 | ) 275 | 276 | usage = completion.usage 277 | statistics_to_return = GenerationStatistics(input_time=usage.prompt_time, output_time=usage.completion_time, input_tokens=usage.prompt_tokens, output_tokens=usage.completion_tokens, total_time=usage.total_time, model_name=model) 278 | 279 | return statistics_to_return, completion.choices[0].message.content 280 | 281 | def generate_section(blog_length, transcript: str, existing_notes: str, section: str, model: str = "llama3-8b-8192"): 282 | stream = st.session_state.groq.chat.completions.create( 283 | model=model, 284 | messages=[ 285 | { 286 | "role": "system", 287 | "content": f"You are an expert blog writer. Generate body content in third-person for the section provided based on the transcript. 
Do *not* repeat any content from previous sections. No need to preface with any titles or pleasantries, just provide the paragraphs. Max word count of {blog_length} words." 288 | }, 289 | { 290 | "role": "user", 291 | "content": f"### Transcript\n\n{transcript}\n\n### Existing Notes\n\n{existing_notes}\n\n### Instructions\n\nGenerate short blog-like paragraphs only for this section based on the transcript: \n\n{section}" 292 | } 293 | ], 294 | temperature=0.3, 295 | max_tokens=8000, 296 | top_p=1, 297 | stream=True, 298 | stop=None, 299 | ) 300 | 301 | for chunk in stream: 302 | tokens = chunk.choices[0].delta.content 303 | if tokens: 304 | yield tokens 305 | if x_groq := chunk.x_groq: 306 | if not x_groq.usage: 307 | continue 308 | usage = x_groq.usage 309 | statistics_to_return = GenerationStatistics(input_time=usage.prompt_time, output_time=usage.completion_time, input_tokens=usage.prompt_tokens, output_tokens=usage.completion_tokens, total_time=usage.total_time, model_name=model) 310 | yield statistics_to_return 311 | 312 | # Initialize 313 | if 'button_disabled' not in st.session_state: 314 | st.session_state.button_disabled = False 315 | 316 | if 'button_text' not in st.session_state: 317 | st.session_state.button_text = "Generate Blog" 318 | 319 | if 'statistics_text' not in st.session_state: 320 | st.session_state.statistics_text = "" 321 | 322 | if 'buttons_misc_disabled' not in st.session_state: 323 | st.session_state.buttons_misc_disabled = True 324 | 325 | if 'notes' not in st.session_state: 326 | st.session_state.notes = None 327 | # if 'notes_structure_json' not in st.session_state: 328 | # st.session_state.notes_structure_json = {} 329 | 330 | st.write(""" 331 | # BlogWizard: Create structured blog from audio 🗒️⚡ 332 | """) 333 | 334 | def enable_buttons_misc(): 335 | st.session_state.buttons_misc_disabled = False 336 | 337 | def disable(): 338 | st.session_state.button_disabled = True 339 | # and also enable the miscs buttons 340 | st.session_state.buttons_misc_disabled = False 341 | 342 | def enable(): 343 | st.session_state.button_disabled = False 344 | 345 | def empty_st(): 346 | st.empty() 347 | 348 | def translate(text, selected_lang): 349 | chat_completion = st.session_state.groq.chat.completions.create( 350 | messages=[ 351 | { 352 | "role": "system", 353 | "content": f"Translate this text into {selected_lang}. Use markdown." 
354 | }, 355 | { 356 | "role": "user", 357 | "content": text, 358 | } 359 | ], 360 | model="llama-3.3-70b-versatile" 361 | ) 362 | print(f"translated notes in {selected_lang}: ", chat_completion.choices[0].message.content) 363 | 364 | return chat_completion.choices[0].message.content 365 | 366 | def translate_to_arabic(markdown_content): 367 | 368 | chat_completion = st.session_state.groq.chat.completions.create( 369 | messages=[ 370 | { 371 | "role": "system", 372 | "content": "Translate the entire text into Arabic" 373 | }, 374 | { 375 | "role": "user", 376 | "content": markdown_content, 377 | } 378 | ], 379 | model="allam-2-7b", 380 | ) 381 | print("translated notes: ", chat_completion.choices[0].message.content) 382 | 383 | return chat_completion.choices[0].message.content 384 | 385 | 386 | image_file = "assets/groqlabs.svg" 387 | try: 388 | with st.sidebar: 389 | 390 | if image_file: 391 | st.image(image_file, width=200) 392 | 393 | st.write(f"# 🧙‍♂️ BlogWizard \n## Generate blog from audio in seconds using Groq, Whisper, and Llama3") 394 | st.markdown(f"[Github Repository](https://github.com/cho-groq/BlogWizard)\n\n") 395 | 396 | STYLES = [ 397 | "Default", 398 | "Customer Case Study", 399 | "Launch of new Product" 400 | ] 401 | 402 | st.title("Blog options") 403 | 404 | # Create a dropdown selector 405 | blog_style = st.selectbox("Choose a template style:", options=STYLES) 406 | 407 | BLOG_WORD_COUNT = { 408 | "Up to 800 words":200, 409 | "Up to 1400 words":300, 410 | "Up to 2500 words":500, 411 | } 412 | 413 | # Create a dropdown selector 414 | blog_length = st.selectbox("Choose a word count:", options=BLOG_WORD_COUNT.keys()) 415 | 416 | st.info("Audio files and YouTube videos over 19 minutes will be summarized only up to the first 19 minutes. Videos longer than 3 hours are not allowed") 417 | 418 | audio_files = { 419 | "Groq AI Weekly Updates": { 420 | "file_path": "assets/audio/groq_ama_trimmed_20min.m4a", 421 | "youtube_link": "https://www.youtube.com/watch?v=A3IRU6aoLYA" 422 | }, 423 | "Highlights of 2025 LIV Golf Riyadh Round 1": { 424 | "file_path": "assets/audio/transformers_explained.m4a", 425 | "youtube_link": "https://www.youtube.com/watch?v=SZorAJ4I-sA" 426 | }, 427 | "Joaquin Niemann LIV Golf Adelaide Postgame Winner Interview": { 428 | "file_path": "assets/audio/essence_calculus.m4a", 429 | "youtube_link": "https://www.youtube.com/watch?v=xIVKjjKQgl4" 430 | } 431 | } 432 | 433 | st.write(f"---") 434 | 435 | st.write(f"# Sample Audio Files") 436 | 437 | for audio_name, audio_info in audio_files.items(): 438 | 439 | st.write(f"### {audio_name}") 440 | 441 | # Read audio file as binary 442 | with open(audio_info['file_path'], 'rb') as audio_file: 443 | audio_bytes = audio_file.read() 444 | 445 | # Create download button 446 | # st.download_button( 447 | # label=f"Download audio", 448 | # data=audio_bytes, 449 | # file_name=audio_info['file_path'], 450 | # mime='audio/m4a' 451 | # ) 452 | 453 | st.markdown(f"[Youtube Link]({audio_info['youtube_link']})") 454 | st.write(f"\n\n") 455 | 456 | st.write(f"---") 457 | 458 | st.write("# Customization Settings\n🧪 These settings are experimental.\n") 459 | st.write(f"By default, BlogWizard uses Llama3.3-70b for generating the blog outline and Llama3-8b for the content. This balances quality with speed and rate limit usage. 
You can customize these selections below.") 460 | outline_model_options = ["llama-3.3-70b-versatile", "llama3-70b-8192", "deepseek-r1-distill-qwen-32b", "mixtral-8x7b-32768", "gemma2-9b-it"] 461 | outline_selected_model = st.selectbox("Outline generation:", outline_model_options) 462 | content_model_options = ["llama3-8b-8192", "llama3-70b-8192", "mixtral-8x7b-32768", "gemma2-9b-it"] 463 | content_selected_model = st.selectbox("Content generation:", content_model_options) 464 | 465 | 466 | # Add note about rate limits 467 | st.info("Important: Different models have different token and rate limits which may cause runtime errors.") 468 | 469 | LANGUAGES = { 470 | "en": "English", 471 | "fr": "French", 472 | "es": "Spanish", 473 | "de": "German", 474 | "it": "Italian", 475 | "pt": "Portuguese", 476 | "zh": "Chinese", 477 | "ja": "Japanese", 478 | "ko": "Korean", 479 | "ru": "Russian", 480 | "hi": "Hindi", 481 | "nl": "Dutch", 482 | "sv": "Swedish", 483 | "fi": "Finnish", 484 | "da": "Danish", 485 | "no": "Norwegian", 486 | "pl": "Polish", 487 | "tr": "Turkish", 488 | "he": "Hebrew", 489 | } 490 | 491 | st.title("Language Translate") 492 | 493 | # Create a dropdown selector 494 | selected_lang = st.selectbox("Choose a language to translate to:", options=LANGUAGES.values()) 495 | 496 | @st.dialog(f"{selected_lang} translation", width="large") 497 | def language(item): 498 | st.markdown(item) 499 | 500 | 501 | # Get the abbreviation code from the selected language, but not needed for text 502 | # selected_code = next(code for code, name in LANGUAGES.items() if name == selected_lang) 503 | 504 | 505 | if selected_lang: 506 | if st.button("Translate into language", disabled=st.session_state.buttons_misc_disabled): 507 | translation = translate(st.session_state.notes.get_markdown_content(), selected_lang) 508 | language(translation) 509 | 510 | 511 | st.title("Translate into Arabic:") 512 | 513 | @st.dialog("Arabic Translation", width="large") 514 | def arabic(item): 515 | st.markdown( 516 | f'
<div dir="rtl">{item}</div>
', 517 | unsafe_allow_html=True 518 | ) 519 | 520 | if "arabic" not in st.session_state: 521 | if st.button("Translate into Arabic", disabled=st.session_state.buttons_misc_disabled): 522 | arabic_translation = st.session_state.notes.get_markdown_arabic() 523 | print(arabic_translation) 524 | arabic(arabic_translation) 525 | 526 | def linkedin_post(text, selected_lang, social_media): 527 | chat_completion = st.session_state.groq.chat.completions.create( 528 | messages=[ 529 | { 530 | "role": "system", 531 | "content": f"Create a social post in the style of {social_media}. Use markdown and emojis.{' Make it less than 280 characters.' if social_media == 'X' else ''}", 532 | 533 | }, 534 | { 535 | "role": "user", 536 | "content": text, 537 | } 538 | ], 539 | model="llama-3.3-70b-versatile", 540 | ) 541 | 542 | temp = chat_completion.choices[0].message.content 543 | chat_completion2 = st.session_state.groq.chat.completions.create( 544 | messages=[ 545 | { 546 | "role": "system", 547 | "content": f"Translate this markdown text into {selected_lang}:", 548 | }, 549 | { 550 | "role": "user", 551 | "content": temp, 552 | } 553 | ], 554 | model="llama-3.3-70b-versatile", 555 | ) 556 | 557 | return chat_completion2.choices[0].message.content 558 | 559 | @st.dialog("Social Media Post", width="large") 560 | def vote(item): 561 | # Path = f'''{item}''' 562 | st.markdown(item) 563 | 564 | 565 | social_media_options = ["LinkedIn", "Facebook", "X", "Instagram", "Reddit"] 566 | st.title("Turn into a Social Media post") 567 | social_media = st.selectbox("Choose a social media platform:", social_media_options) 568 | st.write("Also uses the langugage above to translate.") 569 | 570 | if "vote" not in st.session_state: 571 | if st.button("Create Social Media post", disabled=st.session_state.buttons_misc_disabled): 572 | linkedin_post_text = linkedin_post(st.session_state.notes.get_markdown_content(), selected_lang, social_media) 573 | vote(linkedin_post_text) 574 | 575 | st.markdown(""" 576 | - [Groq Terms of Use](https://groq.com/terms-of-use/) 577 | - [Groq Privacy Policy (PDF)](https://groq.com/wp-content/uploads/2024/05/Groq-Privacy-Policy_Final_30MAY2024.pdf) 578 | """) 579 | 580 | if st.button('End Generation and Download Blog'): 581 | if "notes" in st.session_state: 582 | 583 | # Create markdown file 584 | markdown_file = create_markdown_file(st.session_state.notes.get_markdown_content()) 585 | st.download_button( 586 | label='Download Text', 587 | data=markdown_file, 588 | file_name='generated_notes.txt', 589 | mime='text/plain' 590 | ) 591 | 592 | # Create pdf file (styled) 593 | pdf_file = create_pdf_file(st.session_state.notes.get_markdown_content()) 594 | st.download_button( 595 | label='Download PDF', 596 | data=pdf_file, 597 | file_name='generated_notes.pdf', 598 | mime='application/pdf' 599 | ) 600 | st.session_state.button_disabled = False 601 | else: 602 | raise ValueError("Please generate content first before downloading the blog.") 603 | 604 | input_method = st.radio("Choose input method:", ["Upload audio file", "YouTube link"]) 605 | audio_file = None 606 | youtube_link = None 607 | groq_input_key = None 608 | with st.form("groqform"): 609 | if not GROQ_API_KEY: 610 | groq_input_key = st.text_input("Enter your Groq API Key (gsk_yA...):", "", type="password", autocomplete="off") 611 | 612 | # Add radio button to choose between file upload and YouTube link 613 | 614 | if input_method == "Upload audio file": 615 | audio_file = st.file_uploader("Upload an audio file", type=["mp3", "wav", "m4a"]) 
# TODO: Add a max size 616 | else: 617 | youtube_link = st.text_input("Enter YouTube link:", "") 618 | 619 | # Generate button 620 | submitted = st.form_submit_button(st.session_state.button_text, on_click=disable, disabled=st.session_state.button_disabled) 621 | 622 | #processing status 623 | status_text = st.empty() 624 | def display_status(text): 625 | status_text.write(text) 626 | 627 | def clear_status(): 628 | status_text.empty() 629 | 630 | download_status_text = st.empty() 631 | def display_download_status(text:str): 632 | download_status_text.write(text) 633 | 634 | def clear_download_status(): 635 | download_status_text.empty() 636 | 637 | # Statistics 638 | placeholder = st.empty() 639 | def display_statistics(): 640 | with placeholder.container(): 641 | if st.session_state.statistics_text: 642 | if "Transcribing audio in background" not in st.session_state.statistics_text: 643 | st.markdown(st.session_state.statistics_text + "\n\n---\n") # Format with line if showing statistics 644 | else: 645 | st.markdown(st.session_state.statistics_text) 646 | else: 647 | placeholder.empty() 648 | 649 | # this displays the notes on the second go around when the user clicks a button on the side of the page 650 | if 'notes' in st.session_state and st.session_state.notes is not None: 651 | st.markdown(st.session_state.notes.get_markdown_content()) 652 | 653 | if submitted: 654 | if input_method == "Upload audio file" and audio_file is None: 655 | st.error("Please upload an audio file") 656 | elif input_method == "YouTube link" and not youtube_link: 657 | st.error("Please enter a YouTube link") 658 | else: 659 | st.session_state.button_disabled = True 660 | # Show temporary message before transcription is generated and statistics show 661 | 662 | audio_file_path = None 663 | 664 | if input_method == "YouTube link": 665 | display_status("Downloading audio from YouTube link ....") 666 | audio_file_path = download_video_audio(youtube_link, display_download_status) 667 | if audio_file_path is None: 668 | st.error("Failed to download audio from YouTube link. 
Please try again.") 669 | enable() 670 | clear_status() 671 | else: 672 | # Read the downloaded file and create a file-like objec 673 | display_status("Processing Youtube audio ....") 674 | 675 | # Check size first to ensure will work with Whisper 676 | if os.path.getsize(audio_file_path) > MAX_FILE_SIZE: 677 | # use pydub to get the first 15 minutes of the audio file 678 | print(FILE_TOO_LARGE_MESSAGE) 679 | audio = AudioSegment.from_file(audio_file_path) 680 | 681 | # Extract the first 19 minutes 682 | fifteen_minutes_in_ms = 19 * 60 * 1000 # pydub works in milliseconds 683 | trimmed_audio = audio[:fifteen_minutes_in_ms] 684 | 685 | # Export directly to the original file path, overwriting it 686 | trimmed_audio.export(audio_file_path, format="mp3") 687 | 688 | # Now read the file (either original or trimmed) into memory 689 | with open(audio_file_path, 'rb') as f: 690 | file_contents = f.read() 691 | audio_file = BytesIO(file_contents) 692 | 693 | audio_file.name = os.path.basename(audio_file_path) # Set the file name 694 | delete_download(audio_file_path) 695 | clear_download_status() 696 | 697 | if not GROQ_API_KEY: 698 | st.session_state.groq = Groq(api_key=groq_input_key) 699 | 700 | display_status("Transcribing audio in background....") 701 | transcription_text = transcribe_audio(audio_file) 702 | 703 | display_statistics() 704 | 705 | 706 | display_status("Generating blog structure....") 707 | large_model_generation_statistics, notes_structure = generate_notes_structure(transcription_text, blog_style, model=str(outline_selected_model)) 708 | # print("Structure: ",notes_structure) 709 | 710 | display_status("Generating blog ...") 711 | total_generation_statistics = GenerationStatistics(model_name=str(content_selected_model)) 712 | clear_status() 713 | 714 | 715 | try: 716 | notes_structure_json = json.loads(notes_structure) 717 | st.session_state.notes_structure_json = notes_structure_json 718 | # print(notes_structure_json) 719 | notes = NoteSection(structure=notes_structure_json,transcript=transcription_text) 720 | 721 | st.session_state.notes = notes 722 | 723 | st.session_state.notes.display_structure() 724 | print( st.session_state.notes.display_structure()) 725 | 726 | # will this save the notes 727 | # st.session_state.markdown = st.session_state.notes.get_markdown_content() 728 | # print("this is the markdown: "+st.session_state.markdown) 729 | 730 | # st.write(st.session_state.markdown) 731 | 732 | # st.markdown(st.session_state.notes.get_markdown_content()) 733 | 734 | # st.markdown(st.session_state.markdown) 735 | 736 | def stream_section_content(sections): 737 | for title, content in sections.items(): 738 | if isinstance(content, str): 739 | content_stream = generate_section(blog_length, transcript=transcription_text, existing_notes=notes.return_existing_contents(), section=(title + ": " + content),model=str(content_selected_model)) 740 | for chunk in content_stream: 741 | # Check if GenerationStatistics data is returned instead of str tokens 742 | chunk_data = chunk 743 | if type(chunk_data) == GenerationStatistics: 744 | total_generation_statistics.add(chunk_data) 745 | 746 | st.session_state.statistics_text = str(total_generation_statistics) 747 | display_statistics() 748 | elif chunk is not None: 749 | st.session_state.notes.update_content(title, chunk) 750 | elif isinstance(content, dict): 751 | stream_section_content(content) 752 | 753 | stream_section_content(notes_structure_json) 754 | # st.write(st.session_state.notes) 755 | # st.write("NONONONONON") 756 | # 
st.markdown(st.session_state.notes) 757 | 758 | except json.JSONDecodeError: 759 | st.error("Failed to decode the blog structure. Please try again.") 760 | 761 | enable() 762 | 763 | 764 | except Exception as e: 765 | st.session_state.button_disabled = False 766 | 767 | if hasattr(e, 'status_code') and e.status_code == 413: 768 | # In the future, this limitation will be fixed as BlogWizard will automatically split the audio file and transcribe each part. 769 | st.error(FILE_TOO_LARGE_MESSAGE) 770 | else: 771 | st.error(e) 772 | 773 | if st.button("Clear"): 774 | st.rerun() 775 | 776 | # Remove audio after exception to prevent data storage leak 777 | if audio_file_path is not None: 778 | delete_download(audio_file_path) 779 | -------------------------------------------------------------------------------- /packages.txt: -------------------------------------------------------------------------------- 1 | weasyprint 2 | ffmpeg -------------------------------------------------------------------------------- /replit.nix: -------------------------------------------------------------------------------- 1 | { pkgs }: { 2 | deps = [ 3 | pkgs.libffi 4 | pkgs.zlib 5 | pkgs.tk 6 | pkgs.tcl 7 | pkgs.openjpeg 8 | pkgs.libxcrypt 9 | pkgs.libwebp 10 | pkgs.libtiff 11 | pkgs.libjpeg 12 | pkgs.libimagequant 13 | pkgs.lcms2 14 | pkgs.freetype 15 | pkgs.pango 16 | pkgs.harfbuzz 17 | pkgs.glib 18 | pkgs.fontconfig 19 | pkgs.rustc 20 | pkgs.libiconv 21 | pkgs.cargo 22 | pkgs.cacert 23 | pkgs.glibcLocales 24 | pkgs.pkg-config 25 | pkgs.arrow-cpp 26 | pkgs.ghostscript 27 | pkgs.ffmpeg 28 | ]; 29 | } 30 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | altair==5.3.0 2 | annotated-types==0.6.0 3 | anyio==4.3.0 4 | attrs==23.2.0 5 | blinker==1.8.2 6 | Brotli==1.1.0 7 | cachetools==5.3.3 8 | certifi==2024.7.4 9 | cffi==1.16.0 10 | charset-normalizer==3.3.2 11 | click==8.1.7 12 | cssselect2==0.7.0 13 | defusedxml==0.7.1 14 | distro==1.9.0 15 | docopt==0.6.2 16 | exceptiongroup==1.2.1 17 | fonttools==4.51.0 18 | fpdf2==2.7.9 19 | gitdb==4.0.11 20 | GitPython==3.1.43 21 | groq==0.6.0 22 | h11==0.14.0 23 | html5lib==1.1 24 | httpcore==1.0.5 25 | httpx==0.27.0 26 | idna==3.7 27 | Jinja2==3.1.6 28 | jsonschema==4.22.0 29 | jsonschema-specifications==2023.12.1 30 | markdown-it-py==3.0.0 31 | markdown2==2.4.13 32 | MarkupSafe==2.1.5 33 | md2pdf==1.0.1 34 | mdurl==0.1.2 35 | mutagen==1.47.0 36 | numpy==1.26.4 37 | packaging==24.0 38 | pandas==2.2.2 39 | pillow==11.1.0 40 | protobuf==4.25.3 41 | pyarrow==16.1.0 42 | pycparser==2.22 43 | pycryptodomex==3.20.0 44 | pydantic==2.7.1 45 | pydantic_core==2.18.2 46 | pydeck==0.9.1 47 | pydyf==0.10.0 48 | Pygments==2.18.0 49 | pyphen==0.15.0 50 | python-dateutil==2.9.0.post0 51 | python-dotenv==1.0.1 52 | pytz==2024.1 53 | referencing==0.35.1 54 | requests==2.32.3 55 | rich==13.7.1 56 | rpds-py==0.18.1 57 | six==1.16.0 58 | smmap==5.0.1 59 | sniffio==1.3.1 60 | streamlit==1.42.0 61 | tenacity==8.3.0 62 | tinycss2==1.3.0 63 | toml==0.10.2 64 | toolz==0.12.1 65 | tornado==6.4.2 66 | typing_extensions==4.11.0 67 | tzdata==2024.1 68 | urllib3==2.2.2 69 | watchdog==4.0.1 70 | weasyprint==62.3 71 | webencodings==0.5.1 72 | websockets 73 | yt-dlp @ https://github.com/yt-dlp/yt-dlp/archive/master.tar.gz 74 | zopfli==0.2.3 75 | pydub --------------------------------------------------------------------------------