├── .env.example
├── .gitignore
├── README.md
├── __init__.py
├── assets
│   ├── arrow_1.png
│   ├── axd_pipeline.png
│   ├── axd_workflow.png
│   ├── bg.png
│   ├── scanlines_mask.png
│   ├── system_design.png
│   └── white_gradient_mask.png
├── blog_editor.py
├── cli.py
├── errors.py
├── evals
│   ├── __init__.py
│   └── evals.py
├── file_system
│   ├── __init__.py
│   ├── file_helper.py
│   ├── file_repository.py
│   └── handlers
│       ├── __init__.py
│       ├── blog_handler.py
│       ├── file_handler.py
│       ├── interface.py
│       ├── metadata_handler.py
│       ├── podcast_handler.py
│       └── thumbnails_handler.py
├── helpers
│   ├── __init__.py
│   ├── notion_service.py
│   ├── podcast_generator.py
│   ├── resume_extractor.py
│   ├── thumbnail_generator.py
│   └── transcriber.py
├── llms
│   ├── anthropic_client.py
│   ├── llm.py
│   ├── llm_service.py
│   ├── models.yaml
│   ├── ollama_client.py
│   └── openai_client.py
└── schemas
    ├── __init__.py
    ├── file.py
    └── prompt.py

/.env.example:
--------------------------------------------------------------------------------
1 | OPENAI_API_KEY=
2 | ANTHROPIC_API_KEY=
3 | ASSEMBLYAI_API_KEY=
4 | NOTION_TOKEN=
5 | NOTION_DATABASE_ID=
6 | FIREBASE_CREDENTIALS_PATH=
7 | FIREBASE_STORAGE_BUCKET=
8 | ELEVENLABS_API_KEY=
9 | GOOGLE_API_KEY=
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Python
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 | *.so
6 | .Python
7 | build/
8 | develop-eggs/
9 | dist/
10 | downloads/
11 | eggs/
12 | .eggs/
13 | lib/
14 | lib64/
15 | parts/
16 | sdist/
17 | var/
18 | wheels/
19 | *.egg-info/
20 | .installed.cfg
21 | *.egg
22 |
23 | # Virtual Environment
24 | venv/
25 | env/
26 | ENV/
27 | .env
28 | .venv
29 | env.bak/
30 | venv.bak/
31 |
32 | # IDE
33 | .idea/
34 | .vscode/
35 | *.swp
36 | *.swo
37 | .DS_Store
38 |
39 | # Testing
40 | .coverage
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .pytest_cache/
45 | coverage.xml
46 | *.cover
47 | .hypothesis/
48 |
49 | # Jupyter Notebook
50 | .ipynb_checkpoints
51 |
52 | # Local development settings
53 | local_settings.py
54 | db.sqlite3
55 | db.sqlite3-journal
56 |
57 | # Logs
58 | *.log
59 | logs/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Blog Editor
2 |
3 | A CLI tool to transcribe a given .m4a audio file, generate thumbnails, and generate blog assets (title, description, blog post, LinkedIn post).
4 |
5 | ### Demo:
6 | https://github.com/user-attachments/assets/35e19e20-e7df-443e-995c-37b2b10227ed
7 |
8 | ## Problem Context
9 |
10 | This is the current workflow for creating a blog post (for my newsletter, [Ambitious x Driven](https://www.ambitiousxdriven.com)):
11 |
12 | ![axd_workflow](./assets/axd_workflow.png)
13 | As you can see, the major bottleneck for content production is preparing the blog assets from the recording.
14 |
15 |
16 | As this is a side-project (that I'm doing next to my full-time CS Masters @ ETH Zurich), this would be unsustainable in the long run.
17 |
18 | Thus, I decided to build a Python CLI tool leveraging the latest LLMs & prompt-engineering techniques to automate this workflow.
19 |
20 | ## Data processing pipeline
21 |
22 | Here is a diagram that shows the processing pipeline for the blog writer.
23 |
24 | ![axd_data_flow](./assets/axd_pipeline.png)
25 |
26 | To generate the blog assets, you need to input an audio file, a PDF resume, and a photo of the guest.
27 | It then:
28 |
29 | 1. Extracts the guest details from the resume
30 | 2. Transcribes the audio file (using AssemblyAI w/ speaker diarization)
31 | 3. Generates landscape + square thumbnails in the Ambitious x Driven style by using an 🤗 Hugging Face image-segmentation model & the resume details
32 | 4. Generates a blog post, title, description and LinkedIn post using the guest details (to reduce hallucinations and fix transcription errors) and the transcript.
33 |
34 | ## System design
35 | Below is the system design for the blog writer.
36 | ![axd_design](./assets/system_design.png)
37 |
38 | - I created a separate FileHelper class + FileHandlers that directly interact with the file system (as it's more convenient for a personal project and for viewing the outputs).
39 | - I encapsulate each blog 'component' into a separate schema and helper class ('Files', 'Metadata', 'Thumbnails', 'Blog').
40 |
41 | Here is how a typical file change happens (e.g. generating a title):
42 |
43 | 1. The user requests a file from the CLI (files are requested by name, as there are multiple Zoom recordings in the folder)
44 | 2. The BlogEditor class requests the file from the FileHelper
45 | 3. The FileHelper assembles the file from its parts using the relevant FileHandlers, then returns it to the BlogEditor (and on to the CLI for viewing)
46 | 4. Then, through the CLI, the user can apply the relevant functions (e.g. generate all, transcribe, edit, ...). The BlogEditor gets the latest file, applies the requested function, saves the file, then returns it to the CLI for viewing.
47 |
48 | ## Usage
49 |
50 | On macOS:
51 |
52 | ```bash
53 | # Create virtual environment
54 | python3 -m venv venv
55 |
56 | # Activate virtual environment
57 | source venv/bin/activate
58 |
59 | # Install dependencies
60 | pip install -r requirements.txt
61 | ```
62 |
63 | #### Required changes before using:
64 |
65 | - Update the Zoom folder path in `blog_editor.py`
66 | - You'll have to write your own `prompts.py` file (I can provide a skeleton if needed - reach out here: http://linkedin.com/in/anirudhhramesh/ or anirudhh.ramesh[AT]gmail.com)
67 | - You'll have to include a `.env` file following the `.env.example` file
68 |
69 | #### Required files
70 |
71 | - Provide a .m4a/.mp3 two-person interview recording (with one guest and one interviewer)
72 | - Provide resume.pdf (of the guest)
73 | - Provide photo.png (of the guest) - the filename needs to end with `photo.png` (see the example folder layout below)
74 |
75 | Then:
76 |
77 | ```bash
78 | python cli.py
79 | ```
80 |
81 | This will run and open the CLI interface. I recommend switching your terminal to full-screen mode before running it.
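For reference, here is roughly what a blog folder inside the Zoom directory looks like once everything has been generated. This is reconstructed from the file handlers below, so treat the exact names as indicative rather than a contract:

```
Jane Doe Interview/
├── audio.m4a        # the interview recording (.m4a or .mp3)
├── resume.pdf       # the guest's resume
├── photo.png        # the guest's photo
├── metadata/        # resume, guest, utterances & transcript (JSON + txt/md)
├── thumbnails/      # photo_no_bg.png + landscape/square thumbnail parameters
├── generated/       # versioned history per attribute (e.g. title/title_v1.txt)
└── content/         # latest title/description/blog/linkedin + final thumbnails
```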
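If you'd rather drive the same flow from Python instead of the interactive CLI, the public methods on `BlogEditor` map closely onto the CLI commands. A minimal sketch ("Jane Doe Interview" is a placeholder for a folder name returned by `list_files`):

```python
# Minimal sketch: run the pipeline programmatically instead of via cli.py.
from blog_editor import BlogEditor

editor = BlogEditor()                              # loads .env and wires up all services
print(editor.list_files())                         # blog folders found in your Zoom directory
editor.generate_all("Jane Doe Interview")          # resume -> transcript -> thumbnails -> blog assets
editor.publish_notion_draft("Jane Doe Interview")  # push the finished draft to Notion
```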
82 |
83 | ## Features
84 |
85 | In the CLI interface:
86 | - list (see all the blogs found in your Zoom folder)
87 | - get (get the blog with the given name; this becomes your 'working blog')
88 | - generate all (generate all the attributes for the blog)
89 | - edit (edit the attribute with the given value)
90 | - publish (publish the blog to Notion)
91 | - reset all (reset all the attributes for the blog) - (NOT YET IMPLEMENTED)
92 | - reset (reset the attribute with the given value) - (NOT YET IMPLEMENTED)
93 |
94 | ## Prompt-engineering details for generating a human-sounding 'true-to-transcript' blog
95 |
96 | - I'll probably write a paper on this, so I'll share more details once that's prepared :)
97 | - In the meantime, I'm looking for labs/researchers to advise the paper-writing; if you're interested, please email me: anirudhh.ramesh[AT]gmail.com
98 |
99 | ### Citations
100 |
101 | I use the [BiRefNet model](https://github.com/ZhengPeng7/BiRefNet) from Hugging Face for image segmentation. It seemed to work best out of the couple of models I tried.
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/__init__.py
--------------------------------------------------------------------------------
/assets/arrow_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/assets/arrow_1.png
--------------------------------------------------------------------------------
/assets/axd_pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/assets/axd_pipeline.png
--------------------------------------------------------------------------------
/assets/axd_workflow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/assets/axd_workflow.png
--------------------------------------------------------------------------------
/assets/bg.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/assets/bg.png
--------------------------------------------------------------------------------
/assets/scanlines_mask.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/assets/scanlines_mask.png
--------------------------------------------------------------------------------
/assets/system_design.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/assets/system_design.png
--------------------------------------------------------------------------------
/assets/white_gradient_mask.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/assets/white_gradient_mask.png
-------------------------------------------------------------------------------- /blog_editor.py: -------------------------------------------------------------------------------- 1 | import time 2 | import os 3 | from typing import List 4 | from file_system.file_helper import FileHelper 5 | from llms.llm_service import LLMService 6 | from helpers.transcriber import Transcriber 7 | from helpers.resume_extractor import ResumeExtractor 8 | from helpers.thumbnail_generator import ThumbnailGenerator 9 | from prompts.prompts import Prompts 10 | from helpers.notion_service import NotionService 11 | from helpers.podcast_generator import PodcastGenerator 12 | from dotenv import load_dotenv 13 | from schemas.file import Blog, Thumbnails 14 | from schemas.prompt import SimpleResponse, Prompt 15 | from errors import GuestNotFoundError 16 | 17 | class BlogEditor(): 18 | """ 19 | Class to handle the blog editing process 20 | """ 21 | 22 | def __init__(self): 23 | """ 24 | Initialize the BlogEditor 25 | """ 26 | # Load env variables 27 | load_dotenv() 28 | config = self.get_env_vars() 29 | 30 | # Initialize services 31 | self.file_helper = FileHelper('/Users/anirudhh/Documents/Zoom_v2') 32 | self.llm = LLMService(config) 33 | self.prompts = Prompts(self.file_helper) 34 | self.resume_extractor = ResumeExtractor(self.llm, self.prompts) 35 | self.thumbnail_generator = ThumbnailGenerator() 36 | self.transcriber = Transcriber(config, self.llm, self.prompts) 37 | self.podcast_generator = PodcastGenerator(config, self.llm, self.prompts) 38 | self.notion_service = NotionService(config) 39 | 40 | def get_env_vars(self): 41 | """ 42 | Get the environment variables 43 | """ 44 | return { 45 | "ASSEMBLYAI_API_KEY": os.getenv("ASSEMBLYAI_API_KEY"), 46 | "NOTION_TOKEN": os.getenv("NOTION_TOKEN"), 47 | "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY"), 48 | "ANTHROPIC_API_KEY": os.getenv("ANTHROPIC_API_KEY"), 49 | "NOTION_DATABASE_ID": os.getenv("NOTION_DATABASE_ID"), 50 | "FIREBASE_CREDENTIALS_PATH": os.getenv("FIREBASE_CREDENTIALS_PATH"), 51 | "FIREBASE_STORAGE_BUCKET": os.getenv("FIREBASE_STORAGE_BUCKET"), 52 | "ELEVENLABS_API_KEY": os.getenv("ELEVENLABS_API_KEY"), 53 | } 54 | 55 | # List & get files 56 | 57 | def list_files(self) -> List[str]: 58 | """ 59 | List all the files in the file system 60 | """ 61 | return self.file_helper.list_files() 62 | 63 | def get(self, file_name: str) -> Blog: 64 | """ 65 | Get the blog with the given file name 66 | """ 67 | return self.file_helper.get(file_name) 68 | 69 | 70 | # Extract metadata from the existing documents 71 | 72 | def extract_resume(self, file_name: str, callback=None) -> None: 73 | """ 74 | Extract the resume from the given file name 75 | """ 76 | blog = self.file_helper.get(file_name) 77 | 78 | if not self.check_files(blog): 79 | print("Missing files, upload them!") 80 | return 81 | 82 | if not blog.metadata.resume: 83 | if callback: 84 | callback("Extracting resume for " + file_name) 85 | blog.metadata.resume = self.resume_extractor.extract(blog) 86 | self.file_helper.save(blog) 87 | 88 | def transcribe(self, file_name: str, callback=None) -> None: 89 | """ 90 | Transcribe the given file name 91 | """ 92 | blog = self.file_helper.get(file_name) 93 | 94 | if not self.check_files(blog): 95 | print("Missing files, upload them!") 96 | return 97 | 98 | if not blog.metadata.utterances: 99 | if callback: 100 | callback("Utterances not found, generating for " + file_name) 101 | blog.metadata.utterances = self.transcriber.transcribe(blog.files.audio_file) 102 | 
self.file_helper.save(blog)
103 |
104 |         if not blog.metadata.transcript:
105 |             if callback:
106 |                 callback("Transcript not found, generating for " + file_name)
107 |             blog.metadata.transcript = self.transcriber.generate_transcript(blog.metadata.utterances)
108 |             self.file_helper.save(blog)
109 |
110 |     # Enrich guest
111 |     def enrich_guest(self, file_name: str, callback=None):
112 |         """
113 |         Enrich the guest data with first_name, origin, top companies & universities
114 |         """
115 |         blog = self.file_helper.get(file_name)
116 |
117 |         if not blog.metadata.guest:
118 |             blog.metadata.guest = self.resume_extractor.enrich_guest(blog)
119 |             self.file_helper.save(blog)
120 |
121 |     # Generate thumbnails
122 |     def generate_thumbnails(self, file_name, callback=None):
123 |         """
124 |         Generate the thumbnails for the given file name
125 |         """
126 |         blog = self.file_helper.get(file_name)
127 |
128 |         if not blog.files.photo:
129 |             print(f"Photo not found for {file_name}, upload it!")
130 |             return
131 |         if not blog.files.resume_file:
132 |             print(f"Resume not found for {file_name}, upload & generate it!")
133 |             return
134 |
135 |         # Always generate thumbnails
136 |         if callback:
137 |             callback(f"Generating thumbnails for {file_name}")
138 |         blog.thumbnails = self.thumbnail_generator.generate_thumbnails(blog)
139 |         self.file_helper.save(blog)
140 |
141 |     # Generate blog assets (title, description, linkedin, blog)
142 |
143 |     def generate(self, file_name: str, attr: str, model="opus", llm_stream=None, callback=None):
144 |         """
145 |         Handle the generation of an attribute for a blog
146 |         """
147 |         file = self.file_helper.get(file_name)
148 |
149 |         if not self.check_files(file):
150 |             print("Missing files, upload them!")
151 |             return
152 |
153 |         # The attribute has not been generated previously, so generate it
154 |         if not getattr(file.blog, attr):
155 |             prompt = self.prompts.get_prompt(file, attr)
156 |             self._generate(file, attr, prompt, model, llm_stream, callback)
157 |         else:
158 |             # Attribute was already generated once, so skip it
159 |             message = f"{attr} already generated for {file_name}, skipping."
160 |             print(message)
161 |             if llm_stream: llm_stream(message)
162 |             if callback: callback(message)
163 |
164 |     def edit(self, file_name: str, attr: str, instructions: str, model="opus", llm_stream=None, callback=None):
165 |         """
166 |         Edit the given attribute for the given file name
167 |
168 |         TODO: This can be merged into generate method
169 |         """
170 |         file = self.file_helper.get(file_name)
171 |
172 |         if not self.check_files(file):
173 |             print("Missing files, upload them!")
174 |             return
175 |
176 |         if not getattr(file.blog, attr):
177 |             message = f"{attr} not generated for {file_name}, generating it!"
178 |             print(message)
179 |             if llm_stream: llm_stream(message)
180 |             if callback: callback(message)
181 |             self._generate(file, attr, self.prompts.get_prompt(file, attr), model, llm_stream, callback)
182 |             return
183 |
184 |         attr_prompt = self.prompts.get_prompt(file, attr)
185 |         prompt = f"""
186 |         Here is the previous chat conversation:
187 |
188 |
189 |         {attr_prompt.text}
190 |
191 |
192 |         {getattr(file.blog, attr)}
193 |
194 |
195 |
196 |         You have been provided with the following user instructions:
197 |
198 |         {instructions}
199 |
200 |
201 |         Edit the text: {getattr(file.blog, attr)} using the above instructions and return.
202 | """ 203 | 204 | self._generate(file, attr, Prompt(text=prompt, model=attr_prompt.model), model, llm_stream, callback) 205 | 206 | def _generate(self, file: Blog, attr: str, prompt: str, model="opus", llm_stream=None, callback=None): 207 | """ 208 | Generate a specified attribute for the blog 209 | """ 210 | if callback: 211 | callback(f"Generating {attr} for {file.name}") 212 | 213 | if llm_stream: 214 | print(f"Streaming {attr} for {file.name}") 215 | response = self.llm.stream_prompt(prompt.text, model=prompt.model, llm_stream=llm_stream) 216 | setattr(file.blog, attr, response) 217 | self.file_helper.save(file) 218 | else: 219 | if attr in ["title", "description", "linkedin"]: 220 | response = self.llm.prompt(prompt.text, model=prompt.model, schema=SimpleResponse) 221 | else: 222 | response = self.llm.prompt(prompt.text, model=prompt.model) 223 | setattr(file.blog, attr, response) 224 | self.file_helper.save(file) 225 | 226 | # Publish the blog 227 | # TODO: Move these into a dedicated helper class 228 | def publish_markdown_draft(self, file_name, callback=None): 229 | """ 230 | Publish the blog as a markdown draft 231 | """ 232 | file = self.file_helper.get(file_name) 233 | 234 | if not file.blog: 235 | file.blog = Blog() 236 | 237 | if callback: 238 | callback(f"Publishing markdown draft for {file_name}") 239 | 240 | # TODO: Old code, this must be fixed 241 | with open(f"/temp/{file_name}.md", "w") as f: 242 | blog_content = f""" 243 | # Title: {file.blog.title} 244 | Description: {file.blog.description} 245 | --- 246 | # Overview 247 | {file.metadata.resume} 248 | --- 249 | # Blog 250 | {file.blog.content} 251 | --- 252 | # LinkedIn 253 | {file.blog.linkedin} 254 | """ 255 | f.write(blog_content) 256 | 257 | def publish_notion_draft(self, file_name, callback=None): 258 | """ 259 | Publish the blog to notion 260 | """ 261 | file = self.file_helper.get(file_name) 262 | 263 | if callback: 264 | callback(f"Publishing {file_name} to notion") 265 | 266 | if not file.files.photo: 267 | print(f"Photo not found for {file_name}, upload it!") 268 | return 269 | 270 | if not file.thumbnails.landscape: 271 | print(f"Banner not found for {file_name}, upload it!") 272 | return 273 | 274 | if not file.thumbnails.square: 275 | print(f"Square thumbnail not found for {file_name}, generate it!") 276 | return 277 | 278 | if not file.metadata.resume: 279 | print(f"Resume not found for {file_name}, generate it!") 280 | return 281 | 282 | if not file.blog.title: 283 | print(f"Title not found for {file_name}, generate it!") 284 | return 285 | 286 | if not file.blog.description: 287 | print(f"Description not found for {file_name}, generate it!") 288 | return 289 | 290 | if not file.blog.content: 291 | print(f"Blog not found for {file_name}, generate it!") 292 | return 293 | 294 | if not file.blog.linkedin: 295 | print(f"LinkedIn not found for {file_name}, generate it!") 296 | return 297 | 298 | self.notion_service.create_page(file) 299 | 300 | if callback: 301 | callback(f"Published {file_name} to notion") 302 | 303 | def reset(self, file_name, callback=None): 304 | """ 305 | Reset the blog with the given file name 306 | """ 307 | # TODO: Not yet implemented 308 | if callback: 309 | callback(f"Resetting {file_name}. 
NOT YET IMPLEMENTED") 310 | self.file_helper.reset(file_name) 311 | 312 | def generate_all(self, file_name, model="opus", llm_stream=None, callback=None): 313 | """ 314 | Generate all the attributes for the given file name 315 | """ 316 | if callback: 317 | callback(f"Generating all for {file_name}") 318 | 319 | self.extract_resume(file_name, callback=callback) 320 | self.transcribe(file_name, callback=callback) 321 | 322 | # Enrich the guest 323 | self.enrich_guest(file_name, callback=callback) 324 | 325 | # Generate thumbnails 326 | self.generate_thumbnails(file_name, callback=callback) 327 | 328 | # Generate blog 329 | for attr in Blog.__annotations__.keys(): 330 | self.generate(file_name, attr, model=model, llm_stream=llm_stream, callback=callback) 331 | 332 | # Generate podcast 333 | # if callback: 334 | # callback("Generating podcast intro...") 335 | # file = self.file_helper.get(file_name) 336 | # self.podcast_generator.generate_intro(file) 337 | 338 | # if callback: 339 | # callback("Generating podcast...") 340 | # self.podcast_generator.generate_podcast(file) 341 | 342 | print(f"All generated for {file_name}") 343 | 344 | # Validation 345 | 346 | def check_files(self, file: Blog) -> bool: 347 | """ 348 | Check if all the required files are present for the given blog 349 | 350 | To generate a blog, we need the following files: audio.m4a, resume.pdf and a photo.png 351 | """ 352 | if not file.files.audio_file: 353 | return False 354 | if not file.files.resume_file: 355 | return False 356 | if not file.files.photo: 357 | return False 358 | 359 | return True -------------------------------------------------------------------------------- /cli.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import cmd 3 | from blog_editor import BlogEditor 4 | import curses 5 | import time 6 | from errors import GuestNotFoundError 7 | import itertools 8 | import sys 9 | import os 10 | import logging 11 | from schemas.file import Metadata, Blog 12 | 13 | def main(stdscr): 14 | """ 15 | CLI for the blog editor to easily generate blogs 16 | """ 17 | # Clear screen and hide cursor 18 | curses.curs_set(0) 19 | stdscr.clear() 20 | 21 | # Get terminal dimensions 22 | height, width = stdscr.getmaxyx() 23 | current_file = None 24 | current_file_name = "No file set!" 25 | 26 | # Initialize input variables 27 | input_buffer = "" 28 | input_y = height - 2 # Position at bottom of screen 29 | 30 | # Add preview scroll position 31 | preview_scroll = 0 32 | visible_preview_lines = height - 4 33 | 34 | welcome_text = "Welcome to Blog Generator CLI!" 
35 |     commands = ["list", "get", "generate", "edit", "view", "publish", "reset", "help", "quit"]
36 |
37 |     content_text = "Here are the available commands: \n - " + "\n - ".join(commands) + "\n \nTo start, use 'get <file_name>'\n"
38 |     preview_lines = ["Preview screen"]
39 |
40 |     # Initialize blog editor
41 |     blog_editor = BlogEditor()
42 |
43 |     def refresh_screen():
44 |         nonlocal stdscr, height, width, welcome_text, current_file_name, content_text, preview_lines, visible_preview_lines, input_buffer, preview_scroll, input_y
45 |         stdscr.clear()
46 |         try:
47 |             # Redraw header
48 |             stdscr.addstr(0, 0, welcome_text[:width-1])
49 |             header_text = "Currently editing file: "
50 |             stdscr.addstr(1, 0, header_text)
51 |             stdscr.addstr(1, len(header_text), current_file_name[:width-len(header_text)-1], curses.A_REVERSE)
52 |
53 |             # Redraw content with up-to-date info
54 |             if current_file_name != "No file set!":
55 |                 current_file = blog_editor.get(current_file_name)
56 |                 content_text = f"{current_file.__str__()}"
57 |
58 |             content_y_pos = 3
59 |             for paragraph in content_text.split('\n'):
60 |                 # Skip empty paragraphs but still add the spacing
61 |                 if not paragraph.strip():
62 |                     content_y_pos += 1
63 |                     continue
64 |
65 |                 was_wrapped = False
66 |                 line = paragraph
67 |                 while line and content_y_pos < height - 2:
68 |                     if len(line) > width - 1:
69 |                         wrap_point = line[:width-1].rfind(' ')
70 |                         if wrap_point == -1: # No space found, force wrap at width
71 |                             wrap_point = width - 1
72 |
73 |                         stdscr.addstr(content_y_pos, 0, line[:wrap_point])
74 |                         line = line[wrap_point:].lstrip()
75 |                         was_wrapped = True
76 |                     else:
77 |                         stdscr.addstr(content_y_pos, 0, line)
78 |                         line = ''
79 |                     content_y_pos += 1
80 |
81 |                 # Add extra line break after wrapped paragraphs
82 |                 if was_wrapped and content_y_pos < height - 2:
83 |                     content_y_pos += 1
84 |
85 |             # Divider
86 |             if content_y_pos < height - 2:
87 |                 stdscr.addstr(content_y_pos, 0, "-" * (width - 1))
88 |                 content_y_pos += 1
89 |
90 |             # Redraw preview with text wrapping and paragraph spacing
91 |             preview_y_pos = content_y_pos + 1
92 |             preview_text = preview_lines[0].split('\n')
93 |
94 |             wrapped_preview_lines = []
95 |             for paragraph in preview_text:
96 |                 if not paragraph.strip():
97 |                     wrapped_preview_lines.append('')
98 |                     continue
99 |
100 |                 line = paragraph
101 |                 paragraph_lines = []
102 |                 while line:
103 |                     if len(line) > width - 1:
104 |                         wrap_point = line[:width-1].rfind(' ')
105 |                         if wrap_point == -1:
106 |                             wrap_point = width - 1
107 |                         paragraph_lines.append(line[:wrap_point])
108 |                         line = line[wrap_point:].lstrip()
109 |                     else:
110 |                         paragraph_lines.append(line)
111 |                         line = ''
112 |
113 |                 # Add the paragraph lines
114 |                 wrapped_preview_lines.extend(paragraph_lines)
115 |                 # Add extra line break if the paragraph was wrapped
116 |                 if len(paragraph_lines) > 1:
117 |                     wrapped_preview_lines.append('')
118 |
119 |             for i in range(visible_preview_lines):
120 |                 line_idx = i + preview_scroll
121 |                 if line_idx < len(wrapped_preview_lines):
122 |                     if preview_y_pos + i < height - 2:
123 |                         stdscr.addstr(preview_y_pos + i, 0, wrapped_preview_lines[line_idx])
124 |
125 |             # Redraw input line
126 |             stdscr.addstr(input_y, 0, "> " + input_buffer[:width-3]) # Leave room for "> "
127 |         except curses.error:
128 |             pass
129 |         stdscr.refresh()
130 |
131 |     def llm_stream(text):
132 |         nonlocal preview_lines
133 |         preview_lines[0] += text
134 |         refresh_screen()
135 |
136 |     def cli_callback(text):
137 |         nonlocal preview_lines
138 |         preview_lines[0] = f"Generating...\nCurrent status: {text}\n\n"
139 |         refresh_screen()
140 |
141 |     while True:
142 |         refresh_screen()
143 |
144 |         # Get user input
145 |         key = stdscr.getch()
146 |         # if key == ord('q'):
147 |         #     break
148 |         if key == curses.KEY_UP:
149 |             if preview_scroll > 0:
150 |                 preview_scroll -= 1
151 |         elif key == curses.KEY_DOWN:
152 |             if preview_scroll < len(preview_lines[0].split('\n')) - visible_preview_lines:
153 |                 preview_scroll += 1
154 |         elif key == curses.KEY_BACKSPACE or key == 127:
155 |             input_buffer = input_buffer[:-1]
156 |         elif key == ord('\n'):
157 |             # Parse and handle commands
158 |             cmd_parts = input_buffer.strip().split()
159 |             if not cmd_parts:
160 |                 pass # Empty line
161 |             else:
162 |                 cmd = cmd_parts[0]
163 |                 param = cmd_parts[1] if len(cmd_parts) > 1 else None
164 |                 extra = " ".join(cmd_parts[2:]) if len(cmd_parts) > 2 else None
165 |
166 |                 # List files
167 |                 if cmd == 'list':
168 |                     files = blog_editor.list_files()
169 |                     content_text = "Files: \n - " + "\n - ".join(files)
170 |
171 |                 # Choose the working file
172 |                 elif cmd == 'get':
173 |                     if param:
174 |                         try:
175 |                             file_name = " ".join(cmd_parts[1:])
176 |                             current_file_name = file_name
177 |                             current_file = blog_editor.get(file_name)
178 |                             content_text = f"{current_file.__str__()}"
179 |
180 |                             preview_lines[0] = f"File {file_name} loaded!"
181 |                         except GuestNotFoundError:
182 |                             content_text = f"File '{file_name}' not found! \n \n Here is the list of available files: \n - " + "\n - ".join(blog_editor.list_files())
183 |
184 |                 # Help
185 |                 elif cmd == 'help':
186 |                     content_text = "Here are the available commands: \n - " + "\n - ".join(commands)
187 |
188 |                 # Quit
189 |                 elif cmd in ('quit', 'exit'):
190 |                     break
191 |
192 |                 elif cmd == 'publish':
193 |                     preview_lines[0] = f"Publishing {current_file_name} to Notion"
194 |                     blog_editor.publish_notion_draft(current_file_name)
195 |
196 |                 elif cmd in ['generate', 'edit', 'reset'] and current_file is None:
197 |                     content_text = "Set a file first using 'get <file_name>'!"
198 |
199 |                 elif cmd in ['generate', 'view', 'edit', 'overwrite', 'reset'] and current_file is not None:
200 |                     # Assert param is not None
201 |                     if param is None:
202 |                         preview_lines[0] = f"Parameter is required for this command! Use '{cmd} <param>'"
203 |
204 |                     # Generating content for the first time
205 |                     if cmd == 'generate':
206 |                         preview_scroll = 0
207 |
208 |                         if param == 'all':
209 |                             # Generate all content
210 |                             blog_editor.generate_all(current_file_name, llm_stream=llm_stream, callback=cli_callback)
211 |
212 |                         elif param in ['thumbnail', 'thumbnails']:
213 |                             # Maybe need a task scheduler here?
214 | blog_editor.generate_thumbnails(current_file_name) 215 | 216 | else: 217 | if param not in Blog.__annotations__.keys(): 218 | preview_lines[0] = f"Parameter '{param}' is not present in {current_file_name}" 219 | else: 220 | blog_editor.generate(current_file_name, param, llm_stream=llm_stream, callback=cli_callback) 221 | 222 | # Editing content 223 | elif cmd == 'edit': 224 | if param not in Blog.__annotations__.keys(): 225 | preview_lines[0] = f"Parameter '{param}' is not present in {current_file_name}" 226 | else: 227 | blog_editor.edit(current_file_name, param, extra, llm_stream=llm_stream, callback=cli_callback) 228 | 229 | elif cmd == 'view': 230 | # Get the latest version of the file 231 | current_file = blog_editor.get(current_file_name) 232 | 233 | # Metadata params 234 | if param in Metadata.__annotations__.keys(): 235 | preview_scroll = 0 236 | param_value = getattr(current_file.metadata, param, f"Attribute '{param}' not found") 237 | if param_value: 238 | preview_lines[0] = param_value.__str__() 239 | else: 240 | preview_lines[0] = f"'{param}' is not present in {current_file_name}" 241 | 242 | elif param in Blog.__annotations__.keys(): 243 | preview_scroll = 0 244 | param_value = getattr(current_file.blog, param, f"Attribute '{param}' not found") 245 | if param_value: 246 | preview_lines[0] = param_value 247 | else: 248 | preview_lines[0] = f"'{param}' is not present in {current_file_name}" 249 | 250 | elif cmd == 'overwrite': 251 | # E.g. overwrite title will set title = 252 | preview_lines[0] = f"Not yet implemented: \ncmd: {cmd}, param: {param}, extra: {extra}" 253 | 254 | elif cmd == 'reset': 255 | preview_lines[0] = f"Not yet implemented: \ncmd: {cmd}, param: {param}, extra: {extra}" 256 | 257 | if param == 'all': 258 | # blog_editor.reset_all(current_file) 259 | pass 260 | else: 261 | # blog_editor.reset("param") 262 | pass 263 | 264 | input_buffer = "" 265 | elif 32 <= key <= 126: 266 | try: 267 | input_buffer += chr(key) 268 | except Exception as e: 269 | logging.error(f"Character input error: {str(e)}") 270 | 271 | 272 | if __name__ == '__main__': 273 | # Redirect stdout and stderr before starting curses 274 | logging.getLogger().setLevel(logging.ERROR) # Or use logging.CRITICAL for even less output 275 | 276 | # sys.stdout = open(os.devnull, 'w') 277 | # sys.stderr = open(os.devnull, 'w') 278 | error = "no error hmm?" 279 | try: 280 | curses.wrapper(main) 281 | except Exception as e: 282 | logging.error(f"Curses wrapper error: {str(e)}") 283 | error = str(e) 284 | 285 | print(f"Exiting... 
due to error: {error}")
286 |
--------------------------------------------------------------------------------
/errors.py:
--------------------------------------------------------------------------------
1 |
2 | class GuestNotFoundError(Exception):
3 |     """
4 |     Exception raised when the guest/blog_name is not found in the Zoom folder
5 |     """
6 |     pass
7 |
--------------------------------------------------------------------------------
/evals/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/evals/__init__.py
--------------------------------------------------------------------------------
/evals/evals.py:
--------------------------------------------------------------------------------
1 | from math import exp
2 | from time import sleep
3 | from typing import List
4 | from collections import defaultdict
5 | from schemas.file import File
6 | from llms.llm_service import LLMService
7 | import numpy as np
8 | from prompts.prompts import Prompts
9 | # from sentence_transformers import SentenceTransformers
10 |
11 |
12 |
13 | class Evals:
14 |     """
15 |     Evals class for the file object
16 |     """
17 |     def __init__(self, file_helper, llm_service, dataset):
18 |         self.file_helper = file_helper
19 |         self.llm_service = llm_service
20 |         self.prompts = Prompts(self.file_helper)
21 |         self.dataset = dataset
22 |
23 |     def _get_attribute(self, file: File, attr: str):
24 |         """
25 |         Get the attribute from the file
26 |         """
27 |         if attr in ["title", "description", "content", "linkedin"]:
28 |             return getattr(file.blog, attr)
29 |         elif attr in ["top_companies", "top_universities"]:
30 |             return getattr(file.metadata.guest, attr)
31 |
32 |     def eval_model(self, model: str, attr: str, iterations = 1):
33 |         """
34 |         Evaluate a provided model + prompt configuration for a given attribute against the dataset
35 |         """
36 |         # Run all evals
37 |         evals = []
38 |         for i in range(iterations):
39 |             for file in self.dataset:
40 |                 candidate = self.llm_service.prompt(model=model, prompt=self.prompts.get_prompt(file, attr).text)
41 |                 reference = self._get_attribute(file, attr)
42 |                 evals.append(self.eval_all(candidate, reference, file))
43 |
44 |                 print(f"candidate: {candidate} \nreference: {reference} \nevals: {evals[-1]}")
45 |                 print(f"================================================ \n \n")
46 |
47 |             if i > 0:
48 |                 sleep(1)
49 |
50 |         # Average the results across iterations
51 |         results = defaultdict(lambda: 0)
52 |         for scores in evals:
53 |             for key, value in scores.items():
54 |                 results[key] += value
55 |
56 |         for key, value in results.items():
57 |             results[key] /= (iterations * len(self.dataset))
58 |
59 |         return results
60 |
61 |     def eval_all(self, candidate: str, reference: str, file: File = None):
62 |         """
63 |         Evaluate a generated candidate text against its reference
64 |         """
65 |
66 |         scores = {
67 |             "bleu1": self.bleu_score(candidate, reference, 1),
68 |             "bleu2": self.bleu_score(candidate, reference, 2),
69 |             "bleu3": self.bleu_score(candidate, reference, 3),
70 |             "transcript_overlap": self.transcript_overlap_score(candidate, file.metadata.transcript.text),
71 |             # "rouge": self.rouge_score(candidate, reference),
72 |             # "cosine": self.cosine_similarity(candidate, reference),
73 |             # "perplexity": self.perplexity(candidate, reference)
74 |         }
75 |
76 |         return scores
77 |
78 |     def _split_text(self, text: str, n: int):
79 |         """
80 |         Split the text into n-grams
81 |         """
82 |         words = text
83 |         if
isinstance(text, str): 84 | words = text.split() #Bad code, but handles top_companies case (parsed as list) 85 | return [tuple(words[i:i+n]) for i in range(len(words)-n+1)] 86 | 87 | def bleu_score(self, candidate: str, reference: str, n: int = 1) -> float: 88 | """ 89 | Calculate the BLEU score for a given candidate and reference text 90 | """ 91 | if not candidate or not reference: 92 | return 0.0 93 | 94 | candidate_ngrams = self._split_text(candidate, n) 95 | reference_ngrams = self._split_text(reference, n) 96 | 97 | # Count occurrences of each n-gram in both texts 98 | candidate_counts = defaultdict(lambda: 0) 99 | reference_counts = defaultdict(lambda: 0) 100 | 101 | for ngram in candidate_ngrams: 102 | candidate_counts[ngram] += 1 103 | 104 | for ngram in reference_ngrams: 105 | reference_counts[ngram] += 1 106 | 107 | # Count matches (clipped by reference counts) 108 | matches = 0 109 | for ngram, count in candidate_counts.items(): 110 | matches += min(count, reference_counts[ngram]) 111 | 112 | # Apply brevity penalty based on word count 113 | bp = 1.0 114 | 115 | if isinstance(candidate, str): 116 | candidate_words = candidate.split() 117 | else: 118 | candidate_words = candidate 119 | 120 | if isinstance(reference, str): 121 | reference_words = reference.split() 122 | else: 123 | reference_words = reference 124 | 125 | if len(candidate_words) < len(reference_words): 126 | bp = exp(1 - len(reference_words) / len(candidate_words)) 127 | 128 | # Calculate final score 129 | if len(candidate_ngrams) == 0: 130 | return 0.0 131 | return bp * (matches / len(candidate_ngrams)) 132 | 133 | def transcript_overlap_score(self, candidate: str, transcript: str): 134 | """ 135 | Calculate the transcript overlap score between the candidate and reference text 136 | """ 137 | # For each word in the candidate, check if it appears in the reference 138 | candidate_words = candidate.split() 139 | transcript_words = set(transcript.split()) 140 | 141 | # Count the number of words that appear in both the candidate and reference 142 | matches = 0 143 | for word in candidate_words: 144 | if word in transcript_words: 145 | matches += 1 146 | 147 | # Return the ratio of matches to the total number of words in the candidate 148 | return matches / len(candidate_words) 149 | 150 | def rouge_score(self, candidate: str, reference: str): 151 | #TODO: Implement ROUGE score 152 | pass 153 | 154 | def cosine_similarity(self, candidate: str, reference: str): 155 | # embedding_model = SentenceTransformers('all-MiniLM-L6-v2') # or another suitable model 156 | 157 | # if reference not in self.dataset_embeddings: 158 | # #cache dataset embeddings 159 | # self.dataset_embeddings[reference] = embedding_model.encode(reference) 160 | 161 | # # Get embeddings for both texts 162 | # candidate_embedding = embedding_model.encode(candidate) 163 | # reference_embedding = self.dataset_embeddings[reference] 164 | 165 | # # Calculate cosine similarity 166 | # return np.dot(candidate_embedding, reference_embedding) / ( 167 | # np.linalg.norm(candidate_embedding) * np.linalg.norm(reference_embedding) 168 | # ) 169 | pass 170 | 171 | def sentence_wise_similarity(self, candidate: str, reference: str): 172 | # """ 173 | # Calculate sentence-by-sentence similarity scores 174 | # """ 175 | # # Split into sentences 176 | # candidate_sentences = sent_tokenize(candidate) 177 | # reference_sentences = sent_tokenize(reference) 178 | 179 | # # Calculate similarities for each sentence pair 180 | # similarities = [] 181 | # for c_sent in 
candidate_sentences: 182 | # sent_scores = [] 183 | # for r_sent in reference_sentences: 184 | # score = self.cosine_similarity(c_sent, r_sent) 185 | # sent_scores.append(score) 186 | # similarities.append(max(sent_scores)) # Take best match for each candidate sentence 187 | 188 | # # Return average similarity 189 | # return sum(similarities) / len(similarities) if similarities else 0.0 190 | pass 191 | 192 | def BERTScore(self, candidate: str, reference: str): 193 | pass 194 | 195 | def perplexity(self, candidate: str, reference: str): 196 | pass 197 | 198 | def mauve_score(self, candidate: str, reference: str): 199 | pass 200 | 201 | def compression_ratio(self, candidate: str, reference: str): 202 | pass 203 | 204 | def rouge_l_score(self, candidate: str, reference: str): 205 | # Longest common subsequence 206 | pass 207 | 208 | # Knowledge Triplet Evaluation 209 | def knowledge_triplet_evaluation(self, candidate: str, reference: str): 210 | pass 211 | 212 | # Custom & Advanced evals 213 | def overlap_score(self, candidate: str, reference: str): 214 | """ 215 | Get the vocab overlap between generated content & original transcript 216 | """ 217 | pass 218 | 219 | def llm_judge(self, file: File): #Specific to the task: Blog, Title, LinkedIn, etc. 220 | pass 221 | 222 | #Util funcs 223 | def _embed_file(self, file: File): 224 | pass -------------------------------------------------------------------------------- /file_system/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/file_system/__init__.py -------------------------------------------------------------------------------- /file_system/file_helper.py: -------------------------------------------------------------------------------- 1 | from typing import Dict, Type 2 | from .file_repository import FileRepository 3 | from .handlers.file_handler import FileHandler 4 | from .handlers.metadata_handler import MetadataHandler 5 | from .handlers.thumbnails_handler import ThumbnailsHandler 6 | from .handlers.blog_handler import BlogHandler 7 | from .handlers.podcast_handler import PodcastHandler 8 | from schemas.file import File 9 | from errors import GuestNotFoundError 10 | 11 | class FileHelper: 12 | """ 13 | Helper class to handle the business logic of the (Zoom) file system 14 | """ 15 | 16 | def __init__(self, directory: str): 17 | """ 18 | Initialize the file helper 19 | """ 20 | self.file_repository = FileRepository(directory) 21 | self.handlers: Dict[str, FileHandler] = { 22 | 'files': FileHandler(self.file_repository), 23 | 'metadata': MetadataHandler(self.file_repository), 24 | 'thumbnails': ThumbnailsHandler(self.file_repository), 25 | 'blog': BlogHandler(self.file_repository), 26 | } 27 | 28 | def list_files(self): 29 | """ 30 | List all the files in the directory 31 | """ 32 | return self.file_repository.list_files() 33 | 34 | def get(self, blog_name: str) -> File: 35 | """ 36 | Get the folder named with the blog_name from the Zoom directory 37 | """ 38 | 39 | # Check if the blog exists 40 | if blog_name not in self.list_files(): 41 | raise GuestNotFoundError(f"Blog '{blog_name}' not found") 42 | 43 | data = {'name': blog_name} 44 | 45 | # Get each section of the blog 46 | for section, handler in self.handlers.items(): 47 | data[section] = handler.get(blog_name) 48 | 49 | return File(**data) 50 | 51 | def save(self, blog: File) -> None: 52 | """ 53 | Save the blog to the Zoom directory 54 | 
""" 55 | # Save each section using its specific handler, only if the section has changed 56 | for section, handler in self.handlers.items(): 57 | if handler.has_changed(blog, self.get(blog.name)): 58 | handler.save(blog.name, getattr(blog, section)) 59 | 60 | def reset(self, blog_name: str): 61 | """ 62 | Reset the blog by deleting all its files 63 | """ 64 | for handler in self.handlers.values(): 65 | handler.reset(blog_name) 66 | -------------------------------------------------------------------------------- /file_system/file_repository.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | 4 | from schemas.file import Blog 5 | 6 | class FileRepository: 7 | """ 8 | Class to handle the Zoom file system 9 | """ 10 | 11 | def __init__(self, directory: str): 12 | """ 13 | Initialize the file repository 14 | """ 15 | self.directory = directory 16 | 17 | def list_files(self): 18 | """ 19 | List all the files in the directory 20 | """ 21 | files = os.listdir(self.directory) 22 | # Filter out hidden files/directories that start with . 23 | visible_files = [f for f in files if not f.startswith('.')] 24 | return sorted(visible_files) 25 | 26 | # Handle retrieving user-uploaded files 27 | def get_file_ends_with(self, file_path: str, extensions: list[str]): 28 | """ 29 | Get the first file in the given file_path that ends with the given extension 30 | """ 31 | files = [] 32 | 33 | for extension in extensions: 34 | files.extend([f for f in os.listdir(f"{self.directory}/{file_path}") if f.endswith(extension)]) 35 | 36 | if len(files) == 0: 37 | return None 38 | else: 39 | return f"{self.directory}/{file_path}/{files[0]}" 40 | 41 | # Handle JSON files 42 | def get_json(self, file_path: str): 43 | """ 44 | Get the JSON data from the given file path 45 | """ 46 | try: 47 | with open(f"{self.directory}/{file_path}", "r") as f: 48 | return json.load(f) 49 | except FileNotFoundError: 50 | return None 51 | 52 | def save_json(self, file_path: str, data: dict): 53 | """ 54 | Save the JSON data to the given file path 55 | """ 56 | self._ensure_directory_exists(file_path) 57 | with open(f"{self.directory}/{file_path}", "w") as f: 58 | json.dump(data, f) 59 | 60 | # Handle image files 61 | def get_image(self, file_path: str): 62 | """ 63 | Returns the bytes of the image 64 | """ 65 | try: 66 | with open(f"{self.directory}/{file_path}", "rb") as f: 67 | return f.read() 68 | except FileNotFoundError: 69 | return None 70 | 71 | def save_image(self, file_path: str, data: bytes): 72 | """ 73 | Save the image data to the given file path 74 | """ 75 | self._ensure_directory_exists(file_path) 76 | with open(f"{self.directory}/{file_path}", "wb") as f: 77 | f.write(data) 78 | 79 | # Handle markdown files 80 | def get_text(self, file_path: str): 81 | """ 82 | Get the text from the given file path 83 | """ 84 | try: 85 | with open(f"{self.directory}/{file_path}", "r") as f: 86 | return f.read() 87 | except FileNotFoundError: 88 | return None 89 | 90 | def save_text(self, file_path: str, data: str): 91 | """ 92 | Save the text to the given file path 93 | """ 94 | self._ensure_directory_exists(file_path) 95 | with open(f"{self.directory}/{file_path}", "w") as f: 96 | f.write(data) 97 | 98 | def _ensure_directory_exists(self, file_path: str): 99 | """ 100 | Helper method to create directory structure if it doesn't exist 101 | """ 102 | directory = os.path.dirname(f"{self.directory}/{file_path}") 103 | os.makedirs(directory, exist_ok=True) 
-------------------------------------------------------------------------------- /file_system/handlers/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/file_system/handlers/__init__.py -------------------------------------------------------------------------------- /file_system/handlers/blog_handler.py: -------------------------------------------------------------------------------- 1 | from .interface import HandlerInterface 2 | from typing import Any 3 | from schemas.file import Blog 4 | 5 | class BlogHandler(HandlerInterface): 6 | """ 7 | Handler class to handle the blog section of the blog schema 8 | """ 9 | 10 | def get(self, file_name: str) -> Blog: 11 | data = {} 12 | 13 | for attr in Blog.__annotations__.keys(): 14 | # Get the latest version 15 | version = self.file_repository.get_json(f"{file_name}/generated/{attr}/version.json") 16 | 17 | if version: 18 | version = int(version.get("version", 0)) 19 | else: 20 | version = 0 21 | 22 | data[attr] = self.file_repository.get_text(f"{file_name}/generated/{attr}/{attr}_v{version}.txt") 23 | 24 | return Blog.model_validate(data) 25 | 26 | def save(self, file_name: str, data: Any) -> None: 27 | old_data = self.get(file_name) 28 | 29 | for field, content in data.model_dump().items(): 30 | # If the attribute has changed, update the version 31 | if self.attr_has_changed(field, data, old_data): 32 | print(f"Attribute {field} has changed, updating version") 33 | # Keep a record of the previous version 34 | version = self.file_repository.get_json(f"{file_name}/generated/{field}/version.json") 35 | 36 | if version: 37 | version = int(version.get("version", 0)) + 1 38 | else: 39 | version = 1 40 | 41 | # Update the version and save to generated 42 | self.file_repository.save_text(f"{file_name}/generated/{field}/{field}_v{version}.txt", content) 43 | self.file_repository.save_json(f"{file_name}/generated/{field}/version.json", {"version": version}) 44 | 45 | # Rewrite the latest version in the root directory 46 | # TODO: Currently just writing text (and not json)) 47 | self.file_repository.save_text(f"{file_name}/content/{field}.txt", content) -------------------------------------------------------------------------------- /file_system/handlers/file_handler.py: -------------------------------------------------------------------------------- 1 | from schemas.file import Files 2 | from .interface import HandlerInterface 3 | from typing import Any 4 | 5 | class FileHandler(HandlerInterface): 6 | """ 7 | Handler class to handle the files section of the blog schema 8 | """ 9 | 10 | def get(self, file_name: str) -> Files: 11 | data = {} 12 | 13 | for attr in Files.__annotations__.keys(): 14 | if attr == "audio_file": 15 | data[attr] = self.file_repository.get_file_ends_with(f"{file_name}", [".m4a", ".mp3"]) 16 | elif attr == "video_file": 17 | data[attr] = self.file_repository.get_file_ends_with(f"{file_name}", [".mp4", ".mov"]) 18 | elif attr == "resume_file": 19 | data[attr] = self.file_repository.get_file_ends_with(f"{file_name}", [".pdf"]) 20 | elif attr == "portrait": 21 | data[attr] = self.file_repository.get_file_ends_with(f"{file_name}", [f"{attr}.png", f"{attr}.jpeg", f"{attr}.jpg"]) 22 | elif attr == "photo": 23 | data[attr] = self.file_repository.get_file_ends_with(f"{file_name}", [f"{attr}.png", f"{attr}.jpeg", f"{attr}.jpg"]) 24 | 25 | return Files.model_validate(data) 26 | 27 | def save(self, 
file_name: str, data: Any) -> None: 28 | # Normally we will never save imported files 29 | pass -------------------------------------------------------------------------------- /file_system/handlers/interface.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | from typing import Any 3 | from errors import GuestNotFoundError 4 | from file_system.file_repository import FileRepository 5 | 6 | class HandlerInterface(ABC): 7 | """ 8 | Interface class for the handlers 9 | """ 10 | 11 | def __init__(self, file_repository: FileRepository): 12 | """ 13 | Initialize the handler 14 | """ 15 | self.file_repository = file_repository 16 | 17 | @abstractmethod 18 | def get(self, blog_name: str) -> Any: 19 | """ 20 | Get the data from the given blog name 21 | """ 22 | pass 23 | 24 | @abstractmethod 25 | def save(self, blog_name: str, data: Any) -> None: 26 | """ 27 | Save the data to the given blog name 28 | """ 29 | pass 30 | 31 | def has_changed(self, new_data: Any, old_data: Any) -> bool: 32 | """ 33 | Check if the data has changed 34 | """ 35 | return any(self.attr_has_changed(attr, new_data, old_data) for attr in new_data.__annotations__.keys()) 36 | 37 | def attr_has_changed(self, attr: str, new_data: Any, old_data: Any) -> bool: 38 | """ 39 | Check if a specific attribute of the data has changed 40 | """ 41 | return getattr(new_data, attr) != getattr(old_data, attr) 42 | 43 | # @abstractmethod 44 | # def reset(self, blog_name: str) -> None: 45 | # pass -------------------------------------------------------------------------------- /file_system/handlers/metadata_handler.py: -------------------------------------------------------------------------------- 1 | from .interface import HandlerInterface 2 | from schemas.file import Metadata 3 | 4 | class MetadataHandler(HandlerInterface): 5 | """ 6 | Handler class to handle the metadata section of the blog schema 7 | """ 8 | 9 | def get(self, file_name: str) -> Metadata: 10 | data = {} 11 | 12 | for attr in Metadata.__annotations__.keys(): 13 | data[attr] = self.file_repository.get_json(f"{file_name}/metadata/{attr}.json") 14 | 15 | return Metadata.model_validate(data) 16 | 17 | def save(self, file_name: str, new_data: Metadata) -> None: 18 | old_data = self.get(file_name) 19 | 20 | # Save each attribute as a JSON file 21 | for attr in Metadata.__annotations__.keys(): 22 | # Get the attribute data 23 | attr_data = getattr(new_data, attr) 24 | 25 | # Save the attribute data if it exists and has changed 26 | if attr_data is not None and self.attr_has_changed(attr, new_data, old_data): 27 | print(f"{attr} has changed for {file_name}, saving it!") 28 | self.file_repository.save_json(f"{file_name}/metadata/{attr}.json", attr_data.model_dump()) 29 | 30 | # TODO: This is bad and hard-coded 31 | if attr == "transcript": 32 | self.file_repository.save_text(f"{file_name}/metadata/transcript.txt", attr_data.text) 33 | self.file_repository.save_text(f"{file_name}/metadata/transcript.md", attr_data.text) 34 | # self.file_repository.save_text(f"{file_name}/content/questions.txt", attr_data.questions) -------------------------------------------------------------------------------- /file_system/handlers/podcast_handler.py: -------------------------------------------------------------------------------- 1 | from .interface import HandlerInterface 2 | from typing import Any 3 | from schemas.file import Podcast 4 | 5 | class PodcastHandler(HandlerInterface): 6 | """ 7 | Handler class to handle the 
podcast section of the blog schema (WIP) 8 | """ 9 | 10 | def save(self, file_name: str, data: Any) -> None: 11 | raise NotImplementedError("PodcastHandler not implemented") 12 | 13 | def get(self, file_name: str) -> Podcast: 14 | raise NotImplementedError("PodcastHandler not implemented") -------------------------------------------------------------------------------- /file_system/handlers/thumbnails_handler.py: -------------------------------------------------------------------------------- 1 | from .interface import HandlerInterface 2 | from typing import Any 3 | from schemas.file import Thumbnails 4 | 5 | class ThumbnailsHandler(HandlerInterface): 6 | """ 7 | Handler class to handle the thumbnails section of the blog schema 8 | """ 9 | 10 | def get(self, file_name: str) -> Thumbnails: 11 | data = {} 12 | 13 | for attr in Thumbnails.__annotations__.keys(): 14 | if attr == "photo_no_bg": 15 | data[attr] = self.file_repository.get_image(f"{file_name}/thumbnails/{attr}.png") 16 | elif attr == "landscape" or attr == "square": 17 | data[attr] = self.file_repository.get_image(f"{file_name}/content/{attr}.png") 18 | else: 19 | data[attr] = self.file_repository.get_json(f"{file_name}/thumbnails/{attr}.json") 20 | 21 | return Thumbnails.model_validate(data) 22 | 23 | def save(self, file_name: str, data: Any) -> None: 24 | # Only save changes 25 | old_data = self.get(file_name) 26 | if self.attr_has_changed("photo_no_bg", data, old_data): 27 | # Save parameters and no_bg to generated folder 28 | self.file_repository.save_image(f"{file_name}/thumbnails/photo_no_bg.png", data.photo_no_bg) 29 | self.file_repository.save_json(f"{file_name}/thumbnails/landscape_params.json", data.landscape_params.model_dump()) 30 | self.file_repository.save_json(f"{file_name}/thumbnails/square_params.json", data.square_params.model_dump()) 31 | 32 | # Save final images to main directory 33 | if self.attr_has_changed("landscape", data, old_data): 34 | self.file_repository.save_image(f"{file_name}/content/landscape.png", data.landscape) 35 | 36 | if self.attr_has_changed("square", data, old_data): 37 | self.file_repository.save_image(f"{file_name}/content/square.png", data.square) -------------------------------------------------------------------------------- /helpers/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/helpers/__init__.py -------------------------------------------------------------------------------- /helpers/notion_service.py: -------------------------------------------------------------------------------- 1 | from schemas.file import File 2 | from notion_client import Client 3 | import re 4 | import firebase_admin 5 | from firebase_admin import credentials, storage 6 | from uuid import uuid4 7 | import os 8 | import requests 9 | 10 | class NotionService: 11 | """ 12 | Service class to publish to Notion 13 | """ 14 | 15 | def __init__(self, config): 16 | """ 17 | Initialize the Notion service 18 | """ 19 | self.client = Client(auth=config["NOTION_TOKEN"]) 20 | self.database = config["NOTION_DATABASE_ID"] 21 | 22 | if not firebase_admin._apps: 23 | cred = credentials.Certificate(config["FIREBASE_CREDENTIALS_PATH"]) 24 | firebase_admin.initialize_app(cred, { 25 | 'storageBucket': config["FIREBASE_STORAGE_BUCKET"] 26 | }) 27 | 28 | self.bucket = storage.bucket() 29 | 30 | def upload_image(self, image_path): 31 | """ 32 | Notion does not support uploading images 
directly, so we upload to Firebase Storage and return the public URL. 33 | """ 34 | # Generate a unique filename 35 | file_extension = os.path.splitext(image_path)[1] 36 | destination_blob_name = f"images/{uuid4()}{file_extension}" 37 | 38 | # Upload the file 39 | blob = self.bucket.blob(destination_blob_name) 40 | blob.upload_from_filename(image_path) 41 | 42 | # Make the blob publicly accessible 43 | blob.make_public() 44 | 45 | # Return the public URL 46 | return blob.public_url 47 | 48 | #TODO: Add banner and photo to the page. Requires uploading file servce then adding 49 | 50 | def create_page(self, blog: File): 51 | """ 52 | Create a new page in Notion, in the specified database (at init) 53 | """ 54 | photo_url = self.upload_image(blog.files.portrait) 55 | banner_url = self.upload_image(blog.files.portrait.rsplit('portrait.jpeg', 1)[0] + 'content/square.png') 56 | 57 | # Split content by Markdown headers 58 | parts = re.split(r'(^#{1,3}\s.*$)', blog.blog.content, flags=re.MULTILINE) 59 | 60 | children = [] 61 | 62 | # Add the title as a h1 heading 63 | children.append({ 64 | "object": "block", 65 | "type": "heading_1", 66 | "heading_1": {"rich_text": [{"type": "text", "text": {"content": blog.blog.title}}]} 67 | }) 68 | 69 | # Add the description as a paragraph in grey italics 70 | children.append({ 71 | "object": "block", 72 | "type": "paragraph", 73 | "paragraph": {"rich_text": [{"type": "text", "text": {"content": blog.blog.description, "link": None}}]} 74 | }) 75 | 76 | # Add the banner 77 | children.append({ 78 | "object": "block", 79 | "type": "image", 80 | "image": { 81 | "type": "external", 82 | "external": { 83 | "url": banner_url 84 | } 85 | } 86 | }) 87 | 88 | # # Add the photo 89 | # children.append({ 90 | # "object": "block", 91 | # "type": "image", 92 | # "image": { 93 | # "type": "external", 94 | # "external": { 95 | # "url": photo_url 96 | # } 97 | # } 98 | # }) 99 | 100 | current_block = None 101 | 102 | for part in parts: 103 | part = part.strip() 104 | if not part: 105 | continue 106 | 107 | if part.startswith('# '): 108 | block_type = "heading_1" 109 | content = part[2:] 110 | elif part.startswith('## '): 111 | block_type = "heading_2" 112 | content = part[3:] 113 | elif part.startswith('### '): 114 | block_type = "heading_3" 115 | content = part[4:] 116 | else: 117 | block_type = "paragraph" 118 | content = part 119 | 120 | if current_block: 121 | children.append(current_block) 122 | 123 | current_block = { 124 | "object": "block", 125 | "type": block_type, 126 | block_type: { 127 | "rich_text": [] 128 | } 129 | } 130 | 131 | # Split content into chunks of 2000 characters or less 132 | while content: 133 | chunk = content[:2000] 134 | content = content[2000:] 135 | current_block[block_type]["rich_text"].append({"type": "text", "text": {"content": chunk}}) 136 | 137 | if content: # If there's more content, create a new block 138 | children.append(current_block) 139 | current_block = { 140 | "object": "block", 141 | "type": block_type, 142 | block_type: { 143 | "rich_text": [] 144 | } 145 | } 146 | 147 | if current_block: 148 | children.append(current_block) 149 | 150 | new_page = self.client.pages.create( 151 | parent={"database_id": self.database}, 152 | properties={ 153 | "Name": {"title": [{"text": {"content": blog.name}}]}, 154 | }, 155 | children=children 156 | ) 157 | 158 | return new_page 159 | -------------------------------------------------------------------------------- /helpers/podcast_generator.py: 
-------------------------------------------------------------------------------- /helpers/podcast_generator.py: -------------------------------------------------------------------------------- 1 | from schemas.file import File 2 | from elevenlabs.client import ElevenLabs 3 | from llms.llm_service import LLMService 4 | import random 5 | import os 6 | from moviepy import AudioFileClip, ColorClip, CompositeVideoClip, TextClip 7 | 8 | class PodcastGenerator: 9 | """ 10 | Service class to generate podcasts 11 | """ 12 | 13 | def __init__(self, config, llm, prompts): 14 | """ 15 | Initialize the PodcastGenerator 16 | """ 17 | self.config = config 18 | self.client = ElevenLabs(api_key=config["ELEVENLABS_API_KEY"]) 19 | self.llm = llm 20 | self.prompts = prompts 21 | 22 | def generate(self, file: File) -> None: 23 | """ 24 | Generate a podcast from the given blog. WIP 25 | """ 26 | # You can use voice cloning to clone the guest's voices! 27 | 28 | # Use the blog.metadata.utterances and blog.metadata.guest to generate a podcast 29 | response = self.client.voices.get_all() 30 | print(f"Voices: {response}") 31 | 32 | # Split content into lines and categorize them as interviewer/guest parts 33 | lines = file.blog.content.split('\n') 34 | interviewer_parts = [] 35 | guest_parts = [] 36 | current_guest_response = [] 37 | 38 | for line in lines: 39 | # Skip main headings (single #) 40 | if line.startswith('# '): 41 | continue 42 | elif line.startswith('###'): 43 | # If we have collected any guest response, save it 44 | if current_guest_response: 45 | guest_parts.append(' '.join(current_guest_response)) 46 | current_guest_response = [] 47 | # Add interviewer question (remove the ### prefix) 48 | interviewer_parts.append(line.lstrip('#').strip()) 49 | elif line.strip(): # Only include non-empty lines 50 | current_guest_response.append(line) 51 | 52 | # Add the final guest response if any 53 | if current_guest_response: 54 | guest_parts.append(' '.join(current_guest_response)) 55 | 56 | 57 | print(f"Interviewer parts: {interviewer_parts}") 58 | print(f"Guest parts: {guest_parts}") 59 | 60 | total_audio_bytes = b"" 61 | 62 | for i in range(2): # WIP: only voice the first two Q&A exchanges for now 63 | # Generate the interviewer question 64 | audio = self.client.generate(text=interviewer_parts[i], voice=response.voices[1]) 65 | audio_bytes = b"".join(list(audio)) 66 | total_audio_bytes += audio_bytes 67 | 68 | # Generate the guest response 69 | audio = self.client.generate(text=guest_parts[i], voice=response.voices[2]) 70 | audio_bytes = b"".join(list(audio)) 71 | total_audio_bytes += audio_bytes 72 | 73 | # Save the audio file to the output directory 74 | random_id = random.randint(1000, 9999) 75 | output_path = f"/Users/anirudhh/Documents/axd_blogs_voice/podcast_{random_id}.mp3" 76 | with open(output_path, "wb") as f: 77 | f.write(total_audio_bytes) 78 | # Note: the audio generators above are already consumed; the podcast lives at output_path 79 | 80 | def clone_speaker(self, file: File) -> None: 81 | """ 82 | Clone the speaker's voice 83 | """ 84 | # Get the utterances that belong to the guest 85 | guest_utterances = [utterance for utterance in file.metadata.utterances.utterances if utterance.speaker == file.metadata.guest] 86 | 87 | # Split the audio file into chunks based on the guest_utterances start + end timestamps 88 | 89 | # Join the chunks back together which are longer than 10 seconds 90 | 91 | # Use the ElevenLabs API to clone the guest voice 92 | 93 | raise NotImplementedError("Not implemented") 94 | 95 | def smart_cut(self, file: File) -> None: 96 | """ 97 | Smartly cut the transcript segments by using the utterances + cross-matching the blog content 98 | 99 | Any papers on this?
100 | """ 101 | 102 | raise NotImplementedError("Not implemented") 103 | 104 | def generate_intro(self, file: File) -> None: 105 | """ 106 | Generate an intro for the podcast with both audio and video 107 | """ 108 | 109 | speaker_voice = self.client.voices.get(voice_id="5HDW0qbyNwiQkijnIHQ4") 110 | 111 | # TODO remove (for testing) 112 | # voices = self.client.voices.get_all() 113 | # speaker_voice = voices.voices[0] 114 | 115 | prompt = self.prompts.podcast_intro_prompt(file) 116 | intro_text = self.llm.prompt(prompt) 117 | 118 | # Generate the intro audio 119 | audio = self.client.generate(text=intro_text, voice=speaker_voice) 120 | audio_bytes = b"".join(list(audio)) 121 | random_id = random.randint(1000, 9999) 122 | 123 | # Create paths for both audio and video files 124 | base_path = f"/Users/anirudhh/Documents/Zoom_v2/irina_clip/content/intro_{random_id}" 125 | audio_path = f"{base_path}.mp3" 126 | video_path = f"{base_path}.mp4" 127 | 128 | os.makedirs(os.path.dirname(audio_path), exist_ok=True) 129 | 130 | # Save the audio file 131 | with open(audio_path, "wb") as f: 132 | f.write(audio_bytes) 133 | 134 | # Create video from audio using updated MoviePy syntax 135 | audio_clip = AudioFileClip(audio_path) 136 | 137 | # Create a simple colored background with updated syntax 138 | video_clip = ColorClip( 139 | size=(1920, 1080), 140 | color=(25, 25, 25), 141 | duration=audio_clip.duration 142 | ) 143 | 144 | # Split intro text into sentences for better presentation 145 | sentences = [s.strip() for s in intro_text.split('.') if s.strip()] 146 | 147 | # Create text clips for each sentence 148 | text_clips = [] 149 | duration_per_text = audio_clip.duration / len(sentences) 150 | 151 | for i, sentence in enumerate(sentences): 152 | text_clip = TextClip( 153 | text=sentence, 154 | font='Arial.ttf', 155 | font_size=48, 156 | color='white', 157 | size=(1600, None), # Width constraint, height automatic 158 | method='caption' 159 | ).with_position(('center', 'center')) 160 | 161 | # Add fade in/out effects and set the timing 162 | start_time = i * duration_per_text 163 | text_clip = (text_clip 164 | .with_start(start_time) 165 | .with_duration(duration_per_text)) 166 | 167 | text_clips.append(text_clip) 168 | 169 | # Combine background and text clips 170 | final_clip = CompositeVideoClip( 171 | [video_clip] + text_clips 172 | ).with_audio(audio_clip) 173 | 174 | # Write the final video file with modern codec settings 175 | final_clip.write_videofile( 176 | video_path, 177 | fps=12, 178 | codec='libx264', 179 | audio_codec='aac', 180 | preset='medium', 181 | threads=16 182 | ) 183 | 184 | # Clean up resources 185 | audio_clip.close() 186 | final_clip.close() 187 | for clip in text_clips: 188 | clip.close() 189 | os.remove(audio_path) 190 | 191 | # TODO: Append the intro to the podcast 192 | 193 | 194 | def generate_podcast(self, file: File) -> None: 195 | """ 196 | Generate a podcast from the given blog 197 | """ 198 | 199 | voices = self.client.voices.get_all() 200 | host_voice = self.client.voices.get(voice_id="5HDW0qbyNwiQkijnIHQ4") 201 | guest_voice = self.client.voices.get(voice_id="cgSgspJ2msm6clMCkdW9") 202 | 203 | blog = file.blog.content 204 | # Split the blog into h3 (host questions) and p (guest responses) 205 | lines = blog.split('\n') 206 | tasks = [] 207 | 208 | for line in lines: 209 | if line.startswith('###'): 210 | line = line.replace('#', '').strip() 211 | print(f"Line: {line}") 212 | if line.strip(): 213 | if tasks and tasks[-1][0] == 'host': 214 | tasks[-1] = ('host', 
tasks[-1][1] + ' ' + line) 215 | else: 216 | tasks.append(('host', line)) 217 | else: 218 | if not line.startswith('#'): 219 | if line.strip(): 220 | if tasks and tasks[-1][0] == 'guest': 221 | tasks[-1] = ('guest', tasks[-1][1] + ' ' + line) 222 | else: 223 | tasks.append(('guest', line)) 224 | 225 | print(f"Tasks: {tasks}") 226 | 227 | audio_bytes_list = [] 228 | # Generate the podcast 229 | for task in tasks: 230 | if task[0] == 'host': 231 | audio = self.client.generate(text=task[1], voice=host_voice) 232 | else: 233 | audio = self.client.generate(text=task[1], voice=guest_voice) 234 | audio_bytes = b"".join(list(audio)) 235 | audio_bytes_list.append(audio_bytes) 236 | 237 | # Save the entire audio_bytes to a file 238 | random_id = random.randint(1000, 9999) 239 | audio_path = f"/Users/anirudhh/Documents/Zoom_v2/irina_clip/content/podcast_{random_id}.mp3" 240 | with open(audio_path, "wb") as f: 241 | f.write(b''.join(audio_bytes_list)) 242 | 243 | def generate_outro(self, file: File) -> None: 244 | """ 245 | Generate an outro for the podcast 246 | """ 247 | 248 | # Recap podcast, add-in shoutouts to sponsors 249 | 250 | # Make a summary of the podcast in the outro 251 | 252 | sponsors = {"Founderful"} 253 | sponsors_prompt = self.prompts.sponsors_prompt(sponsors) 254 | # Make a shoutout to the sponsors as well 255 | 256 | 257 | raise NotImplementedError("Not implemented") 258 | -------------------------------------------------------------------------------- /helpers/resume_extractor.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from schemas.file import Resume, Guest, File 3 | from schemas.prompt import SimpleResponse, ListResponse 4 | from prompts.prompts import Prompts 5 | import PyPDF2 6 | 7 | logging.basicConfig(level=logging.INFO) 8 | logger = logging.getLogger(__name__) 9 | 10 | class ResumeExtractor: 11 | """ 12 | Service class to extract resume data from a PDF in a structured way. 
13 | """ 14 | 15 | def __init__(self, llm, prompts): 16 | """ 17 | Initialize the ResumeExtractor 18 | """ 19 | self.llm = llm 20 | self.prompts = prompts 21 | 22 | def extract(self, file: File): 23 | """ 24 | Extract the resume data from the given PDF file 25 | """ 26 | text = "" 27 | 28 | with open(file.files.resume_file, "rb") as pdf_file: # avoid shadowing the `file` argument 29 | pdf_reader = PyPDF2.PdfReader(pdf_file) 30 | for page in pdf_reader.pages: 31 | text += page.extract_text() 32 | 33 | prompt = self.prompts.extract_resume_prompt(text) 34 | return self.llm.prompt(prompt.text, model=prompt.model, schema=Resume) 35 | 36 | def enrich_guest(self, file: File): 37 | """ 38 | Enrich the guest data with first_name, origin, top companies & universities using the resume 39 | """ 40 | 41 | first_name_prompt = self.prompts.first_name_prompt(file) 42 | top_companies_prompt = self.prompts.top_companies_prompt(file) 43 | top_universities_prompt = self.prompts.top_universities_prompt(file) 44 | origin_prompt = self.prompts.origin_prompt(file) 45 | 46 | first_name = self.llm.prompt(first_name_prompt.text, model=first_name_prompt.model, schema=SimpleResponse) 47 | top_companies = self.llm.prompt(top_companies_prompt.text, model=top_companies_prompt.model, schema=ListResponse) 48 | top_universities = self.llm.prompt(top_universities_prompt.text, model=top_universities_prompt.model, schema=ListResponse) 49 | origin = self.llm.prompt(origin_prompt.text, model=origin_prompt.model, schema=SimpleResponse) 50 | 51 | return Guest( 52 | first_name=first_name.response, 53 | top_companies=top_companies.response, 54 | top_universities=top_universities.response, 55 | origin=origin.response 56 | ) -------------------------------------------------------------------------------- /helpers/thumbnail_generator.py: -------------------------------------------------------------------------------- 1 | import io 2 | import time 3 | from typing import List 4 | from schemas.file import File, Thumbnails, ThumbnailParams 5 | 6 | from PIL import Image, ImageDraw, ImageFont, ImageEnhance 7 | from pydantic import BaseModel 8 | from gradio_client import Client, handle_file 9 | 10 | # Class to generate the thumbnails 11 | class ThumbnailGenerator: 12 | """ 13 | Class to generate the thumbnails 14 | """ 15 | 16 | def __init__(self): 17 | """ 18 | Initialize the ThumbnailGenerator 19 | """ 20 | pass 21 | 22 | def generate_thumbnails(self, file: File): 23 | """ 24 | Generate the thumbnails for the given blog 25 | """ 26 | photo_no_bg = self.remove_bg(file) 27 | 28 | universities_text_height = int(self.calculate_text_height(file.metadata.guest.top_universities, self.get_font("university", 74))) 29 | 30 | if file.thumbnails and file.thumbnails.landscape_params: 31 | landscape_params = file.thumbnails.landscape_params 32 | else: 33 | landscape_params = ThumbnailParams( 34 | height=1200, width=1680, 35 | companies_font_size=145, companies_x_offset=64, companies_y_offset=53, 36 | universities_x_offset=64, universities_y_offset=1200 - universities_text_height - 60, 37 | portrait_ratio=0.9, portrait_align="right" 38 | ) 39 | landscape = self.generate_thumbnail(file, photo_no_bg, landscape_params) 40 | 41 | companies_text_height = int(self.calculate_text_height(file.metadata.guest.top_companies, self.get_font("company", 99))) 42 | 43 | if file.thumbnails and file.thumbnails.square_params: 44 | square_params = file.thumbnails.square_params 45 | else: 46 | square_params = ThumbnailParams( 47 | height=1080, width=1080, 48 | companies_font_size=99, companies_x_offset=14,
companies_y_offset=18, 49 | universities_x_offset=14, universities_y_offset=companies_text_height + 60, 50 | portrait_ratio=0.8, portrait_align="center" 51 | ) 52 | square = self.generate_thumbnail(file, photo_no_bg, square_params) 53 | 54 | return Thumbnails( 55 | photo_no_bg=self.image_to_bytes(photo_no_bg), 56 | landscape=self.image_to_bytes(landscape), 57 | square=self.image_to_bytes(square), 58 | landscape_params=landscape_params, 59 | square_params=square_params 60 | ) 61 | 62 | def generate_thumbnail(self, file: File, guest_photo_no_bg: str, params: ThumbnailParams): 63 | """ 64 | Generate the thumbnail for the given blog, using the provided thumbnail parameters. 65 | 66 | Generates companies, universities, portrait and name overlays, then stitches it together. 67 | """ 68 | # 1. Generate companies text overlay 69 | companies_overlay, companies_mask = self.generate_companies_overlay(file, params) 70 | 71 | # 2. Generate universities text overlay 72 | universities_overlay, universities_mask = self.generate_universities_overlay(file, params) 73 | 74 | # 3. Generate portrait with name overlay 75 | portrait, portrait_gray, name_overlay_position = self.generate_portrait(file, guest_photo_no_bg, params) 76 | 77 | # 4. Paste everything together and save 78 | 79 | # Companies and universities 80 | thumbnail = Image.open("assets/bg.png") 81 | thumbnail = thumbnail.resize((params.width, params.height), Image.LANCZOS) 82 | 83 | thumbnail.paste(companies_mask, (params.companies_x_offset, params.companies_y_offset), companies_overlay) 84 | thumbnail.paste(universities_overlay, (params.universities_x_offset, params.universities_y_offset), universities_overlay) 85 | 86 | # Portrait 87 | portrait_width, portrait_height = portrait.size 88 | 89 | if params.portrait_align == "right": 90 | paste_x = params.width - portrait_width + params.portrait_x_offset #Right of thumbnail 91 | elif params.portrait_align == "center": 92 | paste_x = ((params.width - portrait_width) // 2) + params.portrait_x_offset #Center of thumbnail 93 | 94 | paste_y = params.height - portrait_height - params.portrait_y_offset #Bottom of thumbnail 95 | thumbnail.paste(portrait_gray, (paste_x, paste_y), portrait) 96 | 97 | # Name + Arrow 98 | name_overlay = self.generate_name_overlay(file, params) 99 | name_overlay_position = (name_overlay_position[0] + params.name_x_offset + paste_x, name_overlay_position[1] - params.name_y_offset + paste_y) 100 | thumbnail.paste(name_overlay, name_overlay_position, name_overlay) 101 | 102 | return thumbnail 103 | 104 | # Thumbnail generation 105 | # 1. 
Generate companies text overlay 106 | def generate_companies_overlay(self, file: File, params: ThumbnailParams): 107 | """ 108 | Generates the companies text overlay for the thumbnail 109 | """ 110 | # Load the white gradient mask (ensure it's RGBA or RGB) 111 | gradient_mask = Image.open('assets/white_gradient_mask.png').convert('RGBA') 112 | # Resize gradient to match text overlay size 113 | gradient_mask = gradient_mask.resize((params.width, params.height), Image.LANCZOS) 114 | 115 | # Create a new RGBA image for the text overlay 116 | text_overlay = Image.new('RGBA', (params.width, params.height), (0, 0, 0, 0)) 117 | 118 | # Get the font 119 | font = self.get_font("company", params.companies_font_size) 120 | 121 | # Create a mask for the text (this will be the shape of the text) 122 | text_mask = Image.new('L', (params.width, params.height), 0) # L mode for grayscale (mask) 123 | text_draw = ImageDraw.Draw(text_mask) 124 | 125 | # Draw the text in white on the text_mask 126 | spacing = font.size * 0.21 127 | text_draw.text((0, 0), '\n'.join(file.metadata.guest.top_companies), font=font, fill=255, spacing=spacing) # White text as a mask 128 | 129 | # Now, composite the gradient with the text mask 130 | gradient_filled_text = Image.composite(gradient_mask, text_overlay, text_mask).convert('RGBA') 131 | 132 | return gradient_filled_text, text_mask 133 | 134 | # 2. Generate universities text overlay 135 | def generate_universities_overlay(self, file: File, params: ThumbnailParams, debug = False): 136 | """ 137 | Generates the universities text overlay for the thumbnail 138 | """ 139 | uni_text = '\n'.join(file.metadata.guest.top_universities) 140 | 141 | font = self.get_font("university", params.universities_font_size) 142 | text_height = self.calculate_text_height(file.metadata.guest.top_universities, font) 143 | 144 | # Calculate text width 145 | text_width = max(font.getbbox(university)[2] - font.getbbox(university)[0] for university in file.metadata.guest.top_universities) 146 | 147 | overlay_width = int(text_width) 148 | overlay_height = int(text_height) 149 | 150 | # Create an RGBA image to write the text, sized to the calculated dimensions 151 | colour = (255, 255, 255, 255) if debug else (0, 0, 0, 0) 152 | text_overlay = Image.new('RGBA', (overlay_width, overlay_height), colour) 153 | 154 | # Draw the text in hex(38, 38, 38) on the text_overlay 155 | draw = ImageDraw.Draw(text_overlay) 156 | spacing = font.size * 0.21 157 | draw.text((0, 0), uni_text, font=font, fill=(38, 38, 38, 255), spacing=spacing) 158 | 159 | # Create a mask for the text 160 | text_mask = text_overlay.convert('L') 161 | 162 | return text_overlay, text_mask 163 | 164 | # 3. 
Generate portrait with name overlay 165 | def generate_portrait(self, file: File, guest_photo_no_bg: str, params: ThumbnailParams): 166 | """ 167 | Generates the portrait for the thumbnail 168 | """ 169 | # Load the images 170 | portrait = guest_photo_no_bg 171 | portrait = portrait.crop(portrait.getbbox()) 172 | grayscale = portrait.convert('L') 173 | 174 | # Resize the portrait to 2/3 of height while maintaining the aspect ratio 175 | height = int(params.portrait_ratio * params.height) 176 | width = int(portrait.width * height / portrait.height) 177 | portrait = portrait.resize((width, height), Image.LANCZOS) 178 | grayscale = grayscale.resize((width, height), Image.LANCZOS) 179 | 180 | # Increase the contrast and brightness of the grayscale image 181 | # contrast_enhancer = ImageEnhance.Contrast(grayscale) 182 | # grayscale = contrast_enhancer.enhance(1.5) 183 | 184 | # Increase the brightness (exposure) 185 | # brightness_enhancer = ImageEnhance.Brightness(grayscale) 186 | # grayscale = brightness_enhancer.enhance(1.2) # Slight increase in brightness 187 | 188 | # Paste the "scanlines_mask.png" onto grayscale 189 | scanlines_mask = Image.open('assets/scanlines_mask.png') 190 | max_dimension = max(grayscale.width, grayscale.height) 191 | scanlines_mask = scanlines_mask.resize((max_dimension, max_dimension), Image.LANCZOS) 192 | grayscale.paste(scanlines_mask, (0, 0), scanlines_mask) 193 | 194 | # Check for transparent areas 195 | width, height = portrait.size 196 | transparent_areas = [] 197 | for y in range(height): 198 | for x in range(width): 199 | if portrait.getpixel((x, y))[3] == 0: # Check alpha channel 200 | transparent_areas.append((x, y)) 201 | 202 | # Split the portrait into 4 quadrants 203 | width, height = portrait.size 204 | mid_x, mid_y = width // 2, height // 2 205 | 206 | # Focus on the top right quadrant 207 | top_right_transparent = [ 208 | (x, y) for x, y in transparent_areas 209 | if x >= mid_x and y < mid_y 210 | ] 211 | 212 | if top_right_transparent: 213 | # Calculate the center of the transparent area in the top right quadrant 214 | center_x = sum(x for x, _ in top_right_transparent) // len(top_right_transparent) 215 | center_y = sum(y for _, y in top_right_transparent) // len(top_right_transparent) 216 | 217 | # Create the name overlay 218 | name_overlay = self.generate_name_overlay(file, params) 219 | 220 | # Calculate the position to center the name_overlay on the transparent area 221 | overlay_x = center_x - name_overlay.width // 2 222 | overlay_y = center_y - name_overlay.height // 2 223 | 224 | # Ensure the overlay stays within the portrait bounds 225 | overlay_x = max(0, min(overlay_x, width - name_overlay.width)) 226 | overlay_y = max(0, min(overlay_y, height - name_overlay.height)) 227 | 228 | name_overlay_position = (overlay_x, overlay_y) 229 | else: 230 | name_overlay_position = (0, 0) 231 | print("No transparent areas found in the top right quadrant.") 232 | 233 | # Return name_overlay_position so we can paste it onto thumbnail (to avoid portrait cropping when manually adjusting name x,y offset) 234 | return portrait, grayscale, name_overlay_position 235 | 236 | # 4. 
Generate name overlay 237 | def generate_name_overlay(self, file: File, params: ThumbnailParams, debug = False): 238 | """ 239 | Generates the name overlay for the thumbnail 240 | """ 241 | font = self.get_font("name", params.name_font_size) 242 | 243 | # Get the size of the text using getbbox() 244 | bbox = font.getbbox(file.metadata.guest.first_name) 245 | text_width = bbox[2] - bbox[0] 246 | text_height = bbox[3] - bbox[1] 247 | 248 | # Open and resize the arrow image 249 | arrow = Image.open('assets/arrow_1.png') 250 | arrow_width, arrow_height = arrow.size 251 | arrow_aspect_ratio = arrow_width / arrow_height 252 | new_arrow_height = 150 253 | new_arrow_width = int(new_arrow_height * arrow_aspect_ratio) 254 | arrow = arrow.resize((new_arrow_width, new_arrow_height), Image.LANCZOS) 255 | 256 | # Create the frame based on the bounding box of the text and arrow 257 | gap = 40 258 | image_width = max(text_width, new_arrow_width) + 20 # Add some padding 259 | image_height = text_height + new_arrow_height + gap + 20 # Add some padding 260 | 261 | colour = (255, 255, 255, 255) if debug else (0, 0, 0, 0) 262 | image = Image.new('RGBA', (image_width, image_height), colour) 263 | draw = ImageDraw.Draw(image) 264 | 265 | # Calculate positions for text and arrow 266 | text_x = (image_width - text_width) // 2 267 | text_y = 10 # Top padding 268 | arrow_x = (image_width - new_arrow_width) // 2 269 | arrow_y = text_y + text_height + gap # 10 pixels gap between text and arrow 270 | 271 | # Draw the text and paste the arrow 272 | fill = (0, 0, 0, 255) if debug else (255, 255, 255, 255) 273 | draw.text((text_x, text_y), file.metadata.guest.first_name, font=font, fill=fill) 274 | image.paste(arrow, (arrow_x, arrow_y), arrow) 275 | 276 | return image 277 | 278 | # === Helper functions === 279 | # Background removal 280 | def remove_bg(self, file: File, debug=False): 281 | """ 282 | Remove the background from the given photo using a HuggingFace BG removal model. 283 | """ 284 | if debug: 285 | return Image.open(file.files.photo).convert("RGBA") 286 | 287 | # If existing bg_removed photo, check if it matches the current photo otherwise remove background for the current photo 288 | if file.thumbnails and file.thumbnails.photo_no_bg: 289 | # TODO: Add support to auto-remove background if the photo has changed 290 | return Image.open(io.BytesIO(file.thumbnails.photo_no_bg)) 291 | 292 | client = Client("ZhengPeng7/BiRefNet_demo") 293 | result = client.predict( 294 | images=handle_file(file.files.photo), 295 | resolution="1024x1024", 296 | weights_file="General", 297 | api_name="/image" 298 | ) 299 | 300 | return Image.open(result[0]) 301 | 302 | def calculate_text_height(self, texts: List[str], font: ImageFont.FreeTypeFont): 303 | """ 304 | Calculate the height of a block of text with the given font (to calculate positioning of overlays) 305 | """ 306 | # Calculate the height of the text 307 | ascent, descent = font.getmetrics() 308 | (width, baseline), (offset_x, offset_y) = font.font.getsize(texts[0]) 309 | 310 | line_height = ascent + descent - offset_y 311 | 312 | # Calculate total text height including line spacing 313 | line_spacing = font.size * 0.21 # You can adjust this value 314 | text_height = line_height * len(texts) + line_spacing * (len(texts) - 1) 315 | 316 | return text_height 317 | 318 | def get_font(self, font_name: str, font_size: int): 319 | """ 320 | Gets the font for the given font name and size with the specified style. 
321 | """ 322 | fonts = { 323 | "company": ImageFont.truetype("/Users/anirudhh/Library/Fonts/Inter-VariableFont_opsz,wght.ttf", font_size), 324 | "university": ImageFont.truetype("/Users/anirudhh/Library/Fonts/JetBrainsMono-VariableFont_wght.ttf", font_size), 325 | "name": ImageFont.truetype("/Users/anirudhh/Library/Fonts/LondrinaSolid-Regular.ttf", font_size) 326 | } 327 | 328 | selected_font = fonts[font_name] 329 | 330 | if font_name == "company": 331 | selected_font.set_variation_by_name("Bold") 332 | elif font_name == "university": 333 | selected_font.set_variation_by_name("Regular") 334 | 335 | return selected_font 336 | 337 | def image_to_bytes(self, img: Image.Image, format: str = 'PNG') -> bytes: 338 | """ 339 | Convert an image to bytes 340 | """ 341 | img_byte_arr = io.BytesIO() 342 | img.save(img_byte_arr, format=format) 343 | return img_byte_arr.getvalue() -------------------------------------------------------------------------------- /helpers/transcriber.py: -------------------------------------------------------------------------------- 1 | import assemblyai as aai 2 | from schemas.file import Utterances, Utterance, Transcript, Word 3 | from schemas.prompt import SimpleResponse 4 | 5 | class Transcriber: 6 | """ 7 | Service class to transcribe audio files 8 | """ 9 | 10 | def __init__(self, config, llm, prompts): 11 | """ 12 | Initialize the Transcriber 13 | """ 14 | aai.settings.api_key = config["ASSEMBLYAI_API_KEY"] 15 | self.llm = llm 16 | self.prompts = prompts 17 | 18 | def transcribe(self, audio_file_path: str): 19 | """ 20 | Transcribe the given audio file (hardcoded to use AssemblyAI with 2 speakers for now) 21 | """ 22 | config = aai.TranscriptionConfig(speaker_labels=True, speakers_expected=2) 23 | transcriber = aai.Transcriber() 24 | 25 | transcript = transcriber.transcribe( 26 | audio_file_path, 27 | config=config 28 | ) 29 | 30 | print(transcript.utterances[0]) 31 | # Map AssemblyAI transcript to our Transcript schema 32 | utterances = [ 33 | Utterance( 34 | confidence=utterance.confidence, 35 | end=utterance.end, 36 | speaker=utterance.speaker, 37 | start=utterance.start, 38 | text=utterance.text, 39 | words=[Word( 40 | text=word.text, 41 | start=word.start, 42 | end=word.end, 43 | confidence=word.confidence, 44 | speaker=word.speaker 45 | ) for word in utterance.words] 46 | ) 47 | for utterance in transcript.utterances 48 | ] 49 | 50 | return Utterances(utterances=utterances) 51 | 52 | def generate_transcript(self, utterances: Utterances) -> Transcript: 53 | """ 54 | Generate a transcript from the given utterances, by identifying the interviewer (h2) and guest (p). 55 | """ 56 | # Merge the utterances into a single transcript 57 | transcript = " ".join([f"Speaker {utterance.speaker}: {utterance.text}" for utterance in utterances.utterances]) 58 | 59 | # Analyze the transcript using LLM 'haiku' 60 | # Use CoT reasoning in the prompt even and then Pydantic validation for a more sophisticated prompt? 
61 | prompt = self.prompts.identify_speaker_prompt(transcript) 62 | 63 | # Identify the guest speaker 64 | guest = self.llm.prompt(prompt.text, model=prompt.model, schema=SimpleResponse) 65 | 66 | if guest.response not in ['A', 'B']: 67 | guest.response = 'B' # Fallback to B 68 | 69 | guest_speaker = guest.response 70 | 71 | # Parse the transcript by labelling using the guest speaker 72 | annotated_transcript = "" 73 | questions = "" 74 | 75 | for utterance in utterances.utterances: 76 | if utterance.speaker == guest_speaker: 77 | annotated_transcript += f"{utterance.text} \n" 78 | else: 79 | annotated_transcript += f"## {utterance.text} \n" 80 | questions += f"## {utterance.text} \n \n" 81 | 82 | return Transcript(text=annotated_transcript) -------------------------------------------------------------------------------- /llms/anthropic_client.py: -------------------------------------------------------------------------------- 1 | import instructor 2 | from anthropic import Anthropic 3 | from pydantic import BaseModel 4 | from llms.llm import LLM 5 | 6 | class AnthropicClient(LLM): 7 | """ 8 | Service class to interact with the LLM 9 | """ 10 | 11 | def __init__(self, config, models): 12 | """ 13 | Initialize the LLM service 14 | """ 15 | super().__init__(provider="anthropic", config=config, models=models) 16 | self.client = Anthropic(api_key=config["ANTHROPIC_API_KEY"]) 17 | self.llm_instructor = instructor.from_anthropic(self.client) # reuse the client that already carries the API key 18 | 19 | def prompt(self, prompt: str, model: str = "sonnet", schema:BaseModel=None): 20 | """ 21 | Generate a response from the LLM 22 | """ 23 | if model == "debug": 24 | return f"DEBUG LLM: {prompt[:50]}" 25 | 26 | if schema: 27 | return self.llm_instructor.messages.create( 28 | model=self.get_model(model), 29 | messages=[ 30 | {"role": "user", "content": prompt} 31 | ], 32 | response_model=schema, 33 | max_tokens=4096 34 | ) 35 | 36 | response = self.client.messages.create( 37 | model=self.get_model(model), 38 | messages=[ 39 | {"role": "user", "content": prompt} 40 | ], 41 | max_tokens=4096 42 | ) 43 | 44 | return response.content[0].text 45 | 46 | def stream_prompt(self, prompt: str, model: str = "sonnet", llm_stream=None): 47 | """ 48 | Stream a response from the LLM 49 | """ 50 | if model == "debug": 51 | return f"DEBUG LLM: {prompt[:50]}" 52 | 53 | response = "" 54 | with self.client.messages.stream( 55 | model=self.get_model(model), 56 | messages=[{"role": "user", "content": prompt}], 57 | max_tokens=4096 58 | ) as stream: 59 | for text in stream.text_stream: 60 | response += text 61 | if llm_stream: llm_stream(text) # guard like the other clients: llm_stream defaults to None 62 | 63 | # TODO: Maybe do 'post-processing' on the response to parse the JSON? We can't simultaneously stream and parse the JSON. 64 | return self.parse_response(response)
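For context on the `schema` path above, this is roughly how a structured call flows end to end; the prompt text and model alias are invented for illustration, and a valid `ANTHROPIC_API_KEY` is assumed:

```python
# Illustrative sketch: a structured call through AnthropicClient, mirroring
# how Transcriber.generate_transcript identifies the guest speaker.
from schemas.prompt import SimpleResponse

# client = AnthropicClient(config, models)  # as wired up by LLMService
# result = client.prompt(
#     "Which speaker is the guest, A or B? Answer with the letter only.",
#     model="haiku",
#     schema=SimpleResponse,
# )
# instructor validates (and retries) the completion into the Pydantic schema:
# assert isinstance(result, SimpleResponse) and result.response in ("A", "B")
```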
-------------------------------------------------------------------------------- /llms/llm.py: -------------------------------------------------------------------------------- 1 | import re 2 | import json 3 | from abc import ABC, abstractmethod 4 | from pydantic import BaseModel 5 | 6 | class LLM(ABC): 7 | 8 | @abstractmethod 9 | def __init__(self, provider: str, config, models: dict): 10 | self.provider = provider 11 | self.model_mapping = {} 12 | 13 | for key, value in models.items(): 14 | if (value["provider"] == self.provider): 15 | self.model_mapping[key] = value["model"] 16 | 17 | def get_model(self, model: str): 18 | """ 19 | Given the model alias, return the actual model name 20 | """ 21 | if model in self.model_mapping: 22 | return self.model_mapping[model] 23 | else: 24 | raise ValueError(f"Model {model} not found in model mapping") 25 | 26 | @abstractmethod 27 | def prompt(self, prompt: str, model: str = "sonnet", schema:BaseModel=None): 28 | raise NotImplementedError("prompt() must be implemented by subclass") 29 | 30 | @abstractmethod 31 | def stream_prompt(self, prompt: str, model: str = "sonnet", llm_stream=None): 32 | raise NotImplementedError("stream_prompt() must be implemented by subclass") 33 | 34 | def parse_response(self, response:str): 35 | """ 36 | Parse a response string to extract JSON if present, otherwise return original string 37 | """ 38 | 39 | response = response.strip() 40 | # Try to find JSON pattern {'response': '...'} 41 | match = re.search(r"\{[\s]*'response'[\s]*:[\s]*'([^']*)'[\s]*\}", response) 42 | 43 | if not match: 44 | print(f"No JSON found in response: {response}") 45 | return response 46 | 47 | try: 48 | # Try parsing the matched JSON string 49 | json_str = match.group(0).replace("'", '"') 50 | parsed = json.loads(json_str) 51 | return parsed['response'] 52 | except (json.JSONDecodeError, KeyError): # avoid a bare except; only JSON/shape failures fall through 53 | # If JSON parsing fails, return the matched string 54 | print(f"JSON parsing failed for response: {response}") 55 | return match.group(1)
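To make `parse_response`'s contract concrete, here is the same extraction logic pulled out as a standalone function (prints omitted), with example inputs:

```python
import re
import json

def parse_response(response: str):
    # Same logic as LLM.parse_response: extract a {'response': '...'} pattern
    # if present, otherwise return the stripped string unchanged.
    response = response.strip()
    match = re.search(r"\{[\s]*'response'[\s]*:[\s]*'([^']*)'[\s]*\}", response)
    if not match:
        return response
    try:
        return json.loads(match.group(0).replace("'", '"'))['response']
    except (json.JSONDecodeError, KeyError):
        return match.group(1)

assert parse_response("Sure! {'response': 'B'}") == "B"
assert parse_response("plain text, no JSON") == "plain text, no JSON"
```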
-------------------------------------------------------------------------------- /llms/llm_service.py: -------------------------------------------------------------------------------- 1 | from llms.anthropic_client import AnthropicClient 2 | from llms.openai_client import OpenAIClient 3 | from llms.ollama_client import OllamaClient 4 | 5 | import yaml 6 | from pydantic import BaseModel 7 | 8 | class LLMService: 9 | """ 10 | Service class that routes prompt calls to the configured LLM clients 11 | """ 12 | 13 | def __init__(self, config): 14 | # Load in the models config 15 | self.models = {} 16 | with open("llms/models.yaml", "r") as f: 17 | self.models = yaml.safe_load(f) 18 | 19 | self.anthropic = AnthropicClient(config, self.models) 20 | self.openai = OpenAIClient(config, self.models) 21 | self.ollama = OllamaClient(config, self.models) 22 | 23 | # Create a model router from the yaml file 24 | model_mapping = { 25 | "anthropic": self.anthropic, 26 | "openai": self.openai, 27 | "ollama": self.ollama 28 | } 29 | 30 | self.model_router = {} 31 | for model, value in self.models.items(): 32 | if value["provider"] in model_mapping: 33 | self.model_router[model] = model_mapping[value["provider"]] 34 | else: 35 | raise ValueError(f"Provider '{value['provider']}' is not yet implemented") 36 | 37 | def _get(self, model: str): 38 | return self.model_router[model] 39 | 40 | def prompt(self, prompt: str, model: str = "sonnet", schema:BaseModel=None): 41 | client = self._get(model) 42 | return client.prompt(model=model, prompt=prompt, schema=schema) 43 | 44 | def stream_prompt(self, prompt: str, model: str = "sonnet", llm_stream=None): 45 | client = self._get(model) 46 | return client.stream_prompt(model=model, prompt=prompt, llm_stream=llm_stream) -------------------------------------------------------------------------------- /llms/models.yaml: -------------------------------------------------------------------------------- 1 | sonnet-3.7: 2 | provider: anthropic 3 | model: claude-3-7-sonnet-20250219 4 | max_tokens: 200000 5 | sonnet-3.5: 6 | provider: anthropic 7 | model: claude-3-5-sonnet-20241022 8 | max_tokens: 200000 9 | sonnet: 10 | provider: anthropic 11 | model: claude-3-sonnet-20240229 12 | max_tokens: 200000 13 | opus: 14 | provider: anthropic 15 | model: claude-3-opus-20240229 16 | max_tokens: 200000 17 | haiku-3.5: 18 | provider: anthropic 19 | model: claude-3-5-haiku-20241022 20 | max_tokens: 200000 21 | haiku: 22 | provider: anthropic 23 | model: claude-3-haiku-20240307 24 | max_tokens: 200000 25 | gpt-4o: 26 | provider: openai 27 | model: gpt-4o-2024-08-06 28 | max_tokens: 128000 29 | llama3: 30 | provider: ollama 31 | model: llama3.2:3b 32 | max_tokens: 4096 33 | -------------------------------------------------------------------------------- /llms/ollama_client.py: -------------------------------------------------------------------------------- 1 | from llms.llm import LLM 2 | from pydantic import BaseModel 3 | from ollama import Client 4 | 5 | class OllamaClient(LLM): 6 | """ 7 | Ollama class for interacting with Ollama models 8 | """ 9 | 10 | def __init__(self, config, models): 11 | super().__init__(provider="ollama", config=config, models=models) 12 | self.client = Client(host=config.get("OLLAMA_HOST", "http://localhost:11434")) 13 | 14 | def prompt(self, prompt: str, model: str = "llama3", schema:BaseModel=None): # default to an Ollama alias ("haiku" routes to Anthropic) 15 | """ 16 | Generate a response from Ollama 17 | """ 18 | if model == "debug": 19 | return f"DEBUG LLM: {prompt[:50]}" 20 | 21 | if schema: 22 | # Note: Ollama doesn't have native structured output support like Anthropic 23 | # You might want to implement a custom solution for schema validation 24 | raise NotImplementedError("Schema validation not supported for Ollama") 25 | 26 | response = self.client.chat(model=self.get_model(model), messages=[ 27 | { 28 | "role": "user", 29 | "content": prompt 30 | } 31 | ]) 32 | 33 | return response['message']['content'] 34 | 35 | def stream_prompt(self, prompt: str, model: str = "llama3", llm_stream=None): 36 | """ 37 | Stream a response from Ollama 38 | """ 39 | if model == "debug": 40 | return f"DEBUG LLM: {prompt[:50]}" 41 | 42 | response = "" 43 | for chunk in self.client.chat( 44 | model=self.get_model(model), 45 | messages=[{"role": "user", "content": prompt}], 46 | stream=True 47 | ): 48 | text = chunk['message']['content'] 49 | response += text 50 | if llm_stream: 51 | llm_stream(text) 52 | 53 | return self.parse_response(response)
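One possible way to close the `NotImplementedError` gap in `OllamaClient.prompt` is to request JSON in the prompt and validate it with the Pydantic schema client-side. A sketch under that assumption, not wired into the client (newer Ollama versions also accept a `format` argument on `chat`, which would harden this further):

```python
# Sketch: schema support for Ollama via prompt-level JSON + Pydantic validation.
# `client` stands in for the ollama.Client above; not part of the repo.
import json
from pydantic import ValidationError

def prompt_with_schema(client, model: str, prompt: str, schema):
    # Append the JSON schema so the model knows the expected shape
    wrapped = (
        f"{prompt}\n\nRespond ONLY with JSON matching this schema:\n"
        f"{json.dumps(schema.model_json_schema())}"
    )
    response = client.chat(model=model, messages=[{"role": "user", "content": wrapped}])
    try:
        return schema.model_validate_json(response['message']['content'])
    except ValidationError:
        # A real implementation would retry or attempt to repair the output here
        raise
```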
-------------------------------------------------------------------------------- /llms/openai_client.py: -------------------------------------------------------------------------------- 1 | from llms.llm import LLM 2 | from pydantic import BaseModel 3 | from openai import OpenAI 4 | import instructor 5 | import json 6 | 7 | class OpenAIClient(LLM): 8 | """ 9 | OpenAI client implementation of the LLM interface 10 | """ 11 | 12 | def __init__(self, config, models): 13 | super().__init__(provider="openai", config=config, models=models) 14 | 15 | self.client = OpenAI(api_key=config["OPENAI_API_KEY"]) 16 | self.llm_instructor = instructor.patch(self.client) 17 | 18 | def prompt(self, prompt: str, model: str = "gpt-4o", schema:BaseModel=None): 19 | """ 20 | Generate a response from the LLM 21 | """ 22 | if model == "debug": 23 | return f"DEBUG LLM: {prompt[:50]}" 24 | 25 | if schema: 26 | return self.llm_instructor.chat.completions.create( 27 | model=self.get_model(model), 28 | messages=[{"role": "user", "content": prompt}], 29 | response_model=schema 30 | ) 31 | 32 | response = self.client.chat.completions.create( 33 | model=self.get_model(model), 34 | messages=[{"role": "user", "content": prompt}] 35 | ) 36 | return response.choices[0].message.content 37 | 38 | def stream_prompt(self, prompt: str, model: str = "gpt-4o", llm_stream=None): # "gpt-4" is not an alias in models.yaml; default to gpt-4o 39 | """ 40 | Stream a response from OpenAI 41 | """ 42 | if model == "debug": 43 | return f"DEBUG LLM: {prompt[:50]}" 44 | 45 | response = "" 46 | stream = self.client.chat.completions.create( 47 | model=self.get_model(model), 48 | messages=[{"role": "user", "content": prompt}], 49 | stream=True 50 | ) 51 | 52 | for chunk in stream: 53 | if chunk.choices[0].delta.content is not None: 54 | text = chunk.choices[0].delta.content 55 | response += text 56 | if llm_stream: 57 | llm_stream(text) 58 | 59 | return self.parse_response(response)
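For reference, the `llm_stream` parameter in all three clients is a callback invoked once per streamed chunk; the CLI presumably uses it to render output live. A usage sketch (the `service` instance and prompt text are illustrative):

```python
# Sketch: consuming stream_prompt with a callback that renders text as it arrives.
def print_chunk(text: str) -> None:
    print(text, end="", flush=True)

# service = LLMService(config)
# full_text = service.stream_prompt(
#     "Summarise this interview in one sentence.",
#     model="haiku-3.5",
#     llm_stream=print_chunk,
# )
# The client accumulates the streamed text and runs parse_response on it,
# so full_text is the complete (possibly JSON-unwrapped) response.
```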
-------------------------------------------------------------------------------- /schemas/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AnirudhhRamesh/BlogEditor/e60167d66a4a3b5e796ab8c378e2809f771c5b10/schemas/__init__.py -------------------------------------------------------------------------------- /schemas/file.py: -------------------------------------------------------------------------------- 1 | from pydantic import BaseModel, ConfigDict 2 | from typing import Optional, List 3 | 4 | class Files(BaseModel): 5 | """ 6 | Files uploaded by the user 7 | """ 8 | audio_file: str 9 | video_file: str 10 | resume_file: str 11 | portrait: str 12 | photo: str 13 | 14 | # Metadata assets 15 | class Resume(BaseModel): 16 | """ 17 | Schema for the resume extracted from the resume pdf file 18 | """ 19 | name: str 20 | studies: List[str] 21 | experiences: List[str] 22 | linkedin_url: str 23 | 24 | def __str__(self): 25 | return f"Resume: {self.name}\nStudies: {self.studies}\nExperiences: {self.experiences}\nLinkedIn URL: {self.linkedin_url}" 26 | 27 | class Word(BaseModel): 28 | """ 29 | Schema for the word extracted from the transcription 30 | """ 31 | text: str 32 | start: int 33 | end: int 34 | confidence: float 35 | speaker: str 36 | 37 | class Utterance(BaseModel): 38 | """ 39 | AssemblyAI schema for the utterance extracted from the transcription 40 | """ 41 | confidence: float 42 | end: int 43 | speaker: str 44 | start: int 45 | text: str 46 | words: Optional[List[Word]] = None 47 | 48 | model_config = ConfigDict(extra="allow") # Pydantic v2 replacement for the old `class Config: allow_extra` 49 | 50 | 51 | def __str__(self): 52 | return f"Utterance: {self.text}\nSpeaker: {self.speaker}\nConfidence: {self.confidence}\nStart: {self.start}\nEnd: {self.end}" 53 | 54 | class Utterances(BaseModel): 55 | """ 56 | Schema for the utterances extracted from the transcription 57 | """ 58 | utterances: List[Utterance] 59 | 60 | def __str__(self): 61 | return "\n".join([utterance.__str__() for utterance in self.utterances]) 62 | 63 | class Transcript(BaseModel): 64 | """ 65 | Schema for the transcript (generated from the AssemblyAI transcription) 66 | text is the transcript in markdown format 67 | """ 68 | text: str 69 | 70 | def __str__(self): 71 | return self.text 72 | 73 | class Guest(BaseModel): 74 | """ 75 | Schema for the guest 76 | """ 77 | first_name: str 78 | top_companies: List[str] 79 | top_universities: List[str] 80 | origin: str 81 | 82 | class Metadata(BaseModel): 83 | """ 84 | Metadata extracted from the files 85 | """ 86 | resume: Optional[Resume] 87 | utterances: Optional[Utterances] 88 | transcript: Optional[Transcript] 89 | guest: Optional[Guest] 90 | 91 | class ThumbnailParams(BaseModel): 92 | """ 93 | Schema for the thumbnail parameters 94 | """ 95 | height: int 96 | width: int 97 | 98 | # Fonts 99 | companies_font_size: int 100 | companies_x_offset: int 101 | companies_y_offset: int 102 | 103 | universities_font_size: int = 74 104 | universities_x_offset: int 105 | universities_y_offset: int 106 | 107 | name_font_size: int = 74 108 | name_x_offset: int = 0 109 | name_y_offset: int = 0 110 | 111 | portrait_ratio: float #TODO: 0.9 previously? 112 | portrait_align: str 113 | portrait_x_offset: int = 0 114 | portrait_y_offset: int = 0 115 | 116 | class Thumbnails(BaseModel): 117 | """ 118 | Thumbnails generated from the metadata & files 119 | """ 120 | photo_no_bg: Optional[bytes] = None 121 | landscape: Optional[bytes] = None 122 | landscape_params: Optional[ThumbnailParams] = None 123 | square: Optional[bytes] = None 124 | square_params: Optional[ThumbnailParams] = None 125 | 126 | class Blog(BaseModel): 127 | """ 128 | Blog assets generated from the metadata & files 129 | """ 130 | structure: Optional[str] = None 131 | content: Optional[str] = None 132 | title: Optional[str] = None 133 | description: Optional[str] = None 134 | linkedin: Optional[str] = None 135 | 136 | # Podcast assets 137 | class Podcast(BaseModel): 138 | """ 139 | Podcast generated from the metadata & files 140 | """ 141 | title: Optional[str] = None 142 | description: Optional[str] = None 143 | blog: Optional[str] = None 144 | linkedin: Optional[str] = None 145 | 146 | class File(BaseModel): 147 | """ 148 | Schema for the file object 149 | """ 150 | name: Optional[str] 151 | files: Optional[Files] 152 | metadata: Optional[Metadata] 153 | thumbnails: Optional[Thumbnails] 154 | blog: Optional[Blog] 155 | # podcast: Podcast 156 | 157 | def __str__(self): 158 | # Backslashes are not allowed inside f-string expressions before Python 3.12, so strip newlines via a helper 159 | preview = lambda text: text[:100].replace('\r\n', ' ').replace('\n', ' ').replace('\r', ' ') if text else "Not generated" 160 | return f"""File: {self.name} 161 | Metadata: 162 | - Resume: {self.metadata.resume.__str__()} 163 | - Utterances: {"Generated" if self.metadata.utterances else "Not generated"} 164 | - Transcript: {"Generated" if self.metadata.transcript else "Not generated"} 165 | 166 | Thumbnails: 167 | - Landscape: {"Generated" if self.thumbnails.landscape else "Not generated"} 168 | - Square: {"Generated" if self.thumbnails.square else "Not generated"} 169 | 170 | Blog: 171 | - Title: {preview(self.blog.title)} 172 | - Description: {preview(self.blog.description)} 173 | - Content: {preview(self.blog.content)} 174 | - Linkedin: {preview(self.blog.linkedin)} 175 | """ 176 | 177 | # Misc 178 | class Prompt(BaseModel): 179 | """ 180 | Schema for the prompts 181 | """ 182 | text: str 183 | model: str = "opus"
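These Pydantic models are exactly what the file-system handlers persist: `save` serialises with `model_dump`, and `get` reconstructs with `model_validate`. A round-trip sketch with invented values:

```python
# Sketch: the JSON round-trip the handlers rely on. Values are illustrative.
from schemas.file import Guest

guest = Guest(
    first_name="Ada",
    top_companies=["Acme"],
    top_universities=["ETH Zurich"],
    origin="UK",
)
payload = guest.model_dump()              # what a handler writes to disk as JSON
restored = Guest.model_validate(payload)  # what a handler's get() reconstructs
assert restored == guest
```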
-------------------------------------------------------------------------------- /schemas/prompt.py: -------------------------------------------------------------------------------- 1 | from typing import List 2 | from pydantic import BaseModel 3 | 4 | class Prompt(BaseModel): 5 | """ 6 | Represents a prompt for an LLM 7 | """ 8 | text: str 9 | model: str 10 | 11 | class SimpleResponse(BaseModel): 12 | """ 13 | Represents a simple response from an LLM 14 | """ 15 | response: str 16 | 17 | class ListResponse(BaseModel): 18 | """ 19 | Represents a list response from an LLM 20 | """ 21 | response: List[str] --------------------------------------------------------------------------------
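Finally, these response schemas are how `ResumeExtractor.enrich_guest` keeps LLM output machine-readable. A usage sketch (prompt text and model alias invented; valid API keys in `.env` assumed):

```python
# Sketch: a structured call through LLMService using ListResponse.
# from llms.llm_service import LLMService
# from schemas.prompt import ListResponse
#
# service = LLMService(config)
# companies = service.prompt(
#     "List the 3 most recognisable companies on this resume:\n<resume text>",
#     model="haiku-3.5",
#     schema=ListResponse,
# )
# companies.response  # -> e.g. ["Google", "McKinsey", "Stripe"]
```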