├── requirements.txt
├── .gitignore
├── LICENSE
├── README.md
└── popcorn_storyboard.py

/requirements.txt:
--------------------------------------------------------------------------------
pydantic
aiohttp
python-dotenv

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
secrets.py
.env
__pycache__/
*.pyc
popcorn_output/
sequence_*/
.DS_Store
.vscode/
.idea/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2024 Open Higgsfield Popcorn Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Open Higgsfield Popcorn 🍿

An open-source alternative to **Higgsfield Popcorn**, designed to generate consistent, cinematic storyboards and visual sequences using AI.

## What is Higgsfield Popcorn?

[Higgsfield Popcorn](https://higgsfield.ai) is a powerful AI tool for creators that generates consistent character and environment sequences for storyboards, marketing campaigns, and visual storytelling. It solves a major pain point in AI image generation: **consistency**. It allows users to create 4-8 frames that look like they belong to the same movie or narrative, maintaining character identity and visual style across different shots and angles.

## About This Project

**Open Higgsfield Popcorn** is an open-source implementation inspired by the original tool. It leverages the power of **MuAPI** (using models like `gpt-5-mini` and `nano-banana`) to achieve similar results: creating coherent, multi-frame visual stories from text prompts and reference images.

### Key Features

* **Consistent Storytelling**: Generates 2-12 frames that maintain visual consistency in style, characters, and lighting.
* **Auto Mode**: Simply provide a prompt (e.g., "detective investigating a crime scene") and let the AI plan and generate the entire sequence.
* **Manual Mode**: Have full control by specifying the description for each shot individually.
* **Reference-Driven**: Use character and environment reference images to guide the generation and ensure identity consistency.
* **Cinematic Planning**: Uses an LLM "Director" to intelligently plan shot types (wide, close-up, etc.) and camera angles based on your narrative context.
* **Style Control**: Choose from various visual styles (Cinematic Realistic, Anime, Noir, etc.).

## Installation

1. Clone this repository.
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Configure your API key:
   * Create a `secrets.py` file in the project root containing your **MuAPI** key: `MUAPIAPP_API_KEY = "your_key_here"`. The file is git-ignored, so it will not be committed.
   * You can get a key from [muapi.ai](https://muapi.ai).

## Usage

### Auto Mode
Generate a sequence from a single prompt. The AI will plan the shots for you.

```bash
python popcorn_storyboard.py --prompt "A cyberpunk hacker breaking into a secure server room" --frames 6 --style "cyberpunk neon"
```

### Manual Mode
Define exactly what happens in each frame.

```bash
python popcorn_storyboard.py --manual_shots "wide shot of a spooky house" "close up of a hand opening the door" "interior view of a dusty hallway" --style "horror"
```

### Using References
Provide reference image URLs (e.g., your main character or a specific location) to guide the AI.

```bash
python popcorn_storyboard.py --prompt "A knight fighting a dragon" --references https://example.com/knight.png https://example.com/dragon.png
```

## Options

* `--prompt`: The main story or scene description (Required for Auto Mode).
* `--manual_shots`: List of descriptions for each frame (Enables Manual Mode).
* `--frames`: Number of frames to generate (Default: 6; must be 2-12 in Auto Mode; ignored in Manual Mode, where the shot count is used).
* `--style`: Visual style of the sequence (Default: "cinematic realistic").
* `--references`: URLs of reference images (Up to 4 recommended; local file paths are not supported).
* `--output`: Directory to save the results (Default: "popcorn_output").
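
## Output

Each run writes its results into a timestamped folder (`sequence_YYYYMMDD_HHMMSS`) under the output directory: generated backgrounds in `backgrounds/`, frames in `frames/`, and a `sequence.json` manifest. A trimmed sketch of the manifest's shape (the field names come from the script; the values here are illustrative):

```json
{
  "prompt": "A cyberpunk hacker breaking into a secure server room",
  "manual_shots": null,
  "num_frames": 6,
  "references": [{"url": "...", "type": "character", "description": "...", "key_features": ["..."]}],
  "frames": [{"frame_number": 1, "shot_type": "wide_shot", "camera_angle": "eye_level", "description": "...", "focus_elements": ["..."], "composition_notes": "...", "duration_hint": 1.5, "background_id": "bg_1"}],
  "backgrounds": [{"id": "bg_1", "description": "...", "frames": [1, 2, 3, 4, 5, 6]}],
  "frame_paths": ["popcorn_output/sequence_20240101_120000/frames/frame_01.png"],
  "bg_paths": {"bg_1": {"path": "...", "url": "..."}},
  "style": "cyberpunk neon",
  "consistency_rules": "..."
}
```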

## License

This project is open-source and available under the MIT License.

--------------------------------------------------------------------------------
/popcorn_storyboard.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
"""
Popcorn Storyboard Generator
Inspired by Higgsfield Popcorn - Generate 2-12 consistent frames from text + reference images

Usage:
    python popcorn_storyboard.py --prompt "detective investigating a crime scene" --references https://example.com/char.png https://example.com/env.png --frames 6
"""

import asyncio
import argparse
import logging
import json
import sys
from pathlib import Path
from typing import List, Optional, Dict
from datetime import datetime
from pydantic import BaseModel
import aiohttp

import time
from dotenv import load_dotenv
load_dotenv()

# NOTE: this local module shadows Python's stdlib `secrets`; it must live next to
# this script and define MUAPIAPP_API_KEY.
from secrets import MUAPIAPP_API_KEY

class LLMClient:
    def __init__(self, api_key: str, model: str = "gpt-5-mini"):
        self.api_key = api_key
        self.model = model
        self.base_url = "https://api.muapi.ai/api/v1"

    async def call(self, system_prompt: str, user_prompt: str, response_format: Optional[type] = None, temperature: float = 0.7) -> dict:
        url = f"{self.base_url}/{self.model}"
        headers = {
            "Content-Type": "application/json",
            "x-api-key": self.api_key
        }

        full_prompt = f"{system_prompt}\n\n{user_prompt}" if system_prompt else user_prompt

        # NOTE: temperature is accepted for interface parity but is not currently
        # forwarded in the payload.
        payload = {
            "prompt": full_prompt,
        }

        begin = time.time()

        async with aiohttp.ClientSession() as session:
            async with session.post(url, headers=headers, json=payload) as response:
                if response.status != 200:
                    text = await response.text()
                    raise Exception(f"Error: {response.status}, {text}")

                result = await response.json()
                request_id = result["request_id"]
                logger.info(f"Task submitted. Request ID: {request_id}")

            result_url = f"{self.base_url}/predictions/{request_id}/result"
            headers = {"x-api-key": self.api_key}

            while True:
                async with session.get(result_url, headers=headers) as response:
                    if response.status == 200:
                        result = await response.json()
                        status = result["status"]

                        if status == "completed":
                            end = time.time()
                            logger.info(f"Task completed in {end - begin:.2f} seconds.")
                            text = result["outputs"][0]

                            if response_format == dict:
                                try:
                                    clean_text = text.strip()
                                    if clean_text.startswith("```json"):
                                        clean_text = clean_text[7:]
                                    if clean_text.startswith("```"):
                                        clean_text = clean_text[3:]
                                    if clean_text.endswith("```"):
                                        clean_text = clean_text[:-3]
                                    return json.loads(clean_text.strip())
                                except json.JSONDecodeError:
                                    logger.warning("Failed to parse JSON from LLM response")
                                    return {"raw_output": text}
                            return text

                        elif status == "failed":
                            raise Exception(f"Task failed: {result.get('error')}")
                    else:
                        text = await response.text()
                        raise Exception(f"Error: {response.status}, {text}")

                await asyncio.sleep(0.5)
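
# Usage sketch (illustrative, not executed; assumes an enclosing event loop and a
# valid key). The client collapses system+user prompts into MuAPI's single
# "prompt" field and polls /predictions/{request_id}/result until completion:
#   llm = LLMClient(MUAPIAPP_API_KEY)
#   text = await llm.call("You are a director.", "Describe one shot.")
#   plan = await llm.call("You are a director.", "Return JSON.", response_format=dict)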

class NanoBananaImageGenerator:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.muapi.ai/api/v1"

    async def _submit_and_poll(self, model: str, payload: dict) -> str:
        url = f"{self.base_url}/{model}"
        headers = {
            "Content-Type": "application/json",
            "x-api-key": self.api_key
        }

        async with aiohttp.ClientSession() as session:
            # Submit
            async with session.post(url, headers=headers, json=payload) as response:
                if response.status != 200:
                    text = await response.text()
                    raise Exception(f"Error submitting to {model}: {response.status}, {text}")
                result = await response.json()
                request_id = result["request_id"]
                logger.info(f"Task submitted to {model}. Request ID: {request_id}")
            # Poll
            result_url = f"{self.base_url}/predictions/{request_id}/result"
            headers = {"x-api-key": self.api_key}

            while True:
                async with session.get(result_url, headers=headers) as response:
                    if response.status != 200:
                        text = await response.text()
                        raise Exception(f"Polling error: {response.status}, {text}")

                    result = await response.json()
                    status = result["status"]

                    if status == "completed":
                        return result["outputs"][0]
                    elif status == "failed":
                        raise Exception(f"Task failed: {result.get('error')}")

                await asyncio.sleep(0.5)

    async def generate_image(self, prompt: str, aspect_ratio: str = "16:9") -> str:
        """Generate image from text (T2I)"""
        payload = {
            "prompt": prompt,
            "aspect_ratio": aspect_ratio
        }
        return await self._submit_and_poll("nano-banana", payload)

    async def generate_with_references(self, prompt: str, reference_images: List[str], aspect_ratio: str = "16:9") -> str:
        """Generate image with references (I2I/Edit)"""
        payload = {
            "prompt": prompt,
            "images_list": reference_images,
            "aspect_ratio": aspect_ratio
        }
        return await self._submit_and_poll("nano-banana-edit", payload)
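
# Usage sketch (illustrative URLs; assumes an enclosing event loop). T2I for
# backgrounds, then I2I/edit with the result as a reference:
#   gen = NanoBananaImageGenerator(MUAPIAPP_API_KEY)
#   bg_url = await gen.generate_image("empty rain-soaked alley at night")
#   frame_url = await gen.generate_with_references("hero enters the alley", [bg_url])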

# Setup logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)


# ============================================================================
# VISION ANALYSIS
# ============================================================================

async def ensure_url(image_path: str) -> str:
    """Validate that the input is a URL."""
    if not image_path.startswith(("http://", "https://")):
        raise ValueError(f"Invalid reference: '{image_path}'. References must be image URLs (starting with http:// or https://). Local files are not supported.")
    return image_path


class VisionAdapter:
    """
    Vision-capable adapter using GPT-5-Nano:
    - Accepts image URLs
    - Produces structured analysis + description
    """

    BASE_URL = "https://api.muapi.ai/api/v1"
    MODEL = "gpt-5-nano"

    def __init__(self, api_key: str):
        self.api_key = api_key
        if not self.api_key:
            raise ValueError("API key missing")

    async def analyze(
        self,
        prompt: str,
        image_url: str
    ) -> dict:
        """
        Given an image URL, produce structured analysis.
        Returns the model's parsed JSON when possible, otherwise {"summary": raw_text}.
        """
        # Ensure we have a URL
        image_url = await ensure_url(image_url)

        submit_url = f"{self.BASE_URL}/{self.MODEL}"

        headers = {
            "Content-Type": "application/json",
            "x-api-key": self.api_key,
        }

        payload = {
            "prompt": f"Analyze this image and return structured JSON: {prompt}",
            "image_url": image_url,
        }

        async with aiohttp.ClientSession() as session:
            # Submit
            logger.info(f"[VisionAdapter] Submitting analysis request for {image_url}")
            async with session.post(submit_url, headers=headers, json=payload) as resp:
                if resp.status != 200:
                    logger.error(f"[VisionAdapter] Submit failed: {resp.status}")
                    raise RuntimeError(f"MuAPI vision submit error {resp.status}: {await resp.text()}")
                data = await resp.json()
                request_id = data["request_id"]
                logger.info(f"[VisionAdapter] Request submitted. ID: {request_id}")

            # Poll
            result_url = f"{self.BASE_URL}/predictions/{request_id}/result"

            while True:
                await asyncio.sleep(0.5)

                async with session.get(result_url, headers={"x-api-key": self.api_key}) as resp:
                    if resp.status != 200:
                        raise RuntimeError(f"MuAPI poll error {resp.status}: {await resp.text()}")

                    result = await resp.json()
                    status = result["status"]

                    if status == "completed":
                        logger.info(f"[VisionAdapter] Request {request_id} completed.")
                        output_text = result["outputs"][0]

                        # Try to parse JSON (if model outputs JSON)
                        try:
                            return json.loads(output_text)
                        except json.JSONDecodeError:
                            # If it's plain text, wrap it
                            return {"summary": output_text}

                    if status == "failed":
                        logger.error(f"[VisionAdapter] Request {request_id} failed.")
                        raise RuntimeError(f"MuAPI vision task failed: {result.get('error')}")

                    logger.debug(f"[VisionAdapter] Polling {request_id}: {status}...")


# ============================================================================
# DATA MODELS
# ============================================================================

class ReferenceImage(BaseModel):
    """Analyzed reference image"""
    url: str
    type: str  # "character", "environment", "prop", "lighting"
    description: str
    key_features: List[str]


class FramePlan(BaseModel):
    """Plan for a single frame in the sequence"""
    frame_number: int
    shot_type: str  # "wide", "medium", "close_up", "extreme_close_up"
    camera_angle: str  # "eye_level", "low_angle", "high_angle", "dutch_angle"
    description: str
    focus_elements: List[str]
    composition_notes: str
    duration_hint: float = 1.5  # seconds per frame
    background_id: str  # Links this frame to a BackgroundPlan.id


class BackgroundPlan(BaseModel):
    """Plan for a background/environment"""
    id: str  # e.g., "bg_1", "kitchen", "studio"
    description: str
    frames: List[int]  # List of frame numbers that use this background


class PopcornSequence(BaseModel):
    """Complete sequence plan"""
    prompt: str
    num_frames: int
    references: List[ReferenceImage]
    frames: List[FramePlan]
    backgrounds: List[BackgroundPlan]  # Backgrounds shared across the frames above
    style: str
    consistency_rules: str
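
# Illustrative FramePlan instance (field values are examples, not defaults):
#   FramePlan(frame_number=1, shot_type="wide_shot", camera_angle="eye_level",
#             description="Detective scans the crowded market",
#             focus_elements=["detective"], composition_notes="rule of thirds",
#             background_id="bg_1")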

# ============================================================================
# REFERENCE ANALYZER
# ============================================================================

class ReferenceAnalyzer:
    """Analyze reference images using vision AI"""

    def __init__(self, vision: VisionAdapter, llm: LLMClient):
        self.vision = vision
        self.llm = llm

    async def analyze_reference(self, image_url: str, index: int) -> ReferenceImage:
        """Analyze a single reference image"""
        logger.info(f"  Analyzing reference {index + 1}...")

        # Step 1: Vision analysis
        analysis_prompt = """Describe this image in detail:
- What type of subject is this? (character, environment, object, lighting setup)
- Key visual features (colors, textures, style, mood)
- Notable details for consistency (clothing, architecture, props)

Return JSON: {"type": "...", "description": "...", "key_features": [...]}"""

        vision_result = await self.vision.analyze(analysis_prompt, image_url)

        # Step 2: Structure the analysis
        ref_type = vision_result.get("type", "unknown")
        description = vision_result.get("description", vision_result.get("summary", ""))
        features = vision_result.get("key_features", [])

        # If features not provided, extract from description
        if not features and description:
            features = description.split(", ")[:5]

        return ReferenceImage(
            url=image_url,
            type=ref_type,
            description=description,
            key_features=features
        )


# ============================================================================
# SEQUENCE PLANNER
# ============================================================================

class SequencePlanner:
    """Plan shot sequence - adapts to ANY use case (storytelling, product, architecture, etc.)"""

    def __init__(self, llm: LLMClient):
        self.llm = llm

    async def plan_sequence(
        self,
        prompt: str,
        num_frames: int,
        references: List[ReferenceImage],
        style: str = "cinematic realistic"
    ) -> PopcornSequence:
        """Plan a coherent sequence of frames - adapts to prompt context"""
        logger.info(f"Planning {num_frames}-frame sequence...")

        # Build context from references
        ref_context = self._build_reference_context(references)

        # Create planning prompt - LET THE LLM DECIDE THE APPROACH
        planning_prompt = f"""You are a professional visual planner. Analyze the user's request and plan {num_frames} frames accordingly.

USER PROMPT: "{prompt}"

REFERENCE IMAGES AVAILABLE:
{ref_context}

STYLE: {style}

TASK: Plan {num_frames} frames that best fulfill the user's prompt.

CONTEXT DETECTION:
- If prompt is about storytelling/narrative → plan cinematic shots (wide, close-up, camera angles)
- If prompt is about product photography → plan product shots (front view, side view, top view, detail shots)
- If prompt is about architecture → plan architectural views (exterior, interior, details, context)
- If prompt is about fashion → plan fashion shots (full body, detail shots, poses)
- If prompt is about anything else → adapt intelligently

SHOT TYPE GUIDANCE:
- Storytelling: "wide_shot", "close_up", "medium_shot", "over_shoulder", etc.
- Product: "front_view", "side_view", "top_view", "45_degree_angle", "detail_shot", "lifestyle_shot"
- Architecture: "exterior_wide", "interior_view", "detail_element", "aerial_view"
- Fashion: "full_body", "upper_body", "detail_closeup", "editorial_pose"
- Generic: describe the view naturally

CAMERA ANGLE GUIDANCE:
- Storytelling: "eye_level", "low_angle", "high_angle", "dutch_angle"
- Product: "straight_on", "slightly_elevated", "45_degree", "overhead"
- Architecture: "street_level", "elevated", "birds_eye"
- Fashion: "eye_level", "low_angle", "high_fashion_angle"
- Generic: describe the angle naturally

Each frame should:
1. Use the reference images for consistency
2. Vary views/angles for visual interest
3. Maintain consistent lighting and style
4. Create a natural progression

BACKGROUND/ENVIRONMENT HANDLING:
- Analyze the prompt to determine needed backgrounds
- Product photography/static scenes → Usually 1 consistent background
- Storytelling with movement → May need multiple backgrounds
- Define the specific backgrounds needed for this sequence

Examples:
- Prompt: "product shots for e-commerce" → All frames: same white/studio background
- Prompt: "hero escapes building and runs to car" → Frame 1: building interior, Frames 2-3: street, Frame 4: car
- Prompt: "detective in crowded market" → Background: "Bustling market street with colorful stalls, awnings, cobblestone path, and atmospheric lighting" (Note: NO mention of detective)
- Prompt: "orange cup on table" → Background: "Empty wooden table surface with blurred kitchen background" (Note: NO mention of cup)

CRITICAL CONSISTENCY RULES:
- Subject/product/character must look IDENTICAL in every frame
- Lighting approach and color palette must match across all frames
- Style must be uniform ({style})
- Background consistency: decide based on prompt context

Return JSON:
{{
    "backgrounds": [
        {{
            "id": "bg_1",
            "description": "Detailed description of the SETTING ONLY. Do NOT mention the main character, product, or subject. Describe the environment as if the subject is not there.",
            "frames": [1, 2, 3, 4]
        }}
    ],
    "frames": [
        {{
            "frame_number": 1,
            "shot_type": "",
            "camera_angle": "",
            "description": "Detailed description of what we see in this frame",
            "focus_elements": ["element1", "element2"],
            "composition_notes": "Technical composition details",
            "background_id": "bg_1"
        }},
        ...
    ],
    "consistency_rules": "Summary of what must stay consistent across all frames"
}}

IMPORTANT:
1. Choose shot_type and camera_angle terminology that matches the context of the prompt.
2. Define backgrounds separately in the 'backgrounds' list.
3. Link each frame to a background_id."""

        response = await self.llm.call(
            system_prompt="You are an expert visual planner who adapts to any creative brief - from film to product photography to architecture to fashion and beyond.",
            user_prompt=planning_prompt,
            response_format=dict,
            temperature=0.4
        )

        # Parse response
        frames = []
        for f in response["frames"]:
            # Ensure background_id exists, default to first bg if missing
            if "background_id" not in f and response.get("backgrounds"):
                f["background_id"] = response["backgrounds"][0]["id"]
            frames.append(FramePlan(**f))

        backgrounds = [BackgroundPlan(**b) for b in response.get("backgrounds", [])]
        consistency_rules = response.get("consistency_rules", "Maintain visual consistency")

        return PopcornSequence(
            prompt=prompt,
            num_frames=num_frames,
            references=references,
            frames=frames,
            backgrounds=backgrounds,
            style=style,
            consistency_rules=consistency_rules
        )

    def _build_reference_context(self, references: List[ReferenceImage]) -> str:
        """Build text context from reference images"""
        lines = []
        for i, ref in enumerate(references):
            lines.append(f"Reference {i+1} ({ref.type}):")
            lines.append(f"  Description: {ref.description}")
            lines.append(f"  Key features: {', '.join(ref.key_features)}")
        return "\n".join(lines)
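
    # NOTE: plan_manual_sequence is called by PopcornGenerator.generate in manual
    # mode but was missing from the original source. This is a minimal sketch:
    # each user-supplied shot becomes one frame verbatim, and the LLM is asked
    # only for a single shared, subject-free background so FrameGenerator still
    # receives a usable background_id.
    async def plan_manual_sequence(
        self,
        manual_shots: List[str],
        references: List[ReferenceImage],
        style: str = "cinematic realistic"
    ) -> PopcornSequence:
        """Plan a sequence directly from user-authored shot descriptions."""
        logger.info(f"Planning manual sequence with {len(manual_shots)} shots...")

        # One shared setting, described without the subject (mirrors the auto-mode rule)
        bg_response = await self.llm.call(
            system_prompt="You are a visual planner.",
            user_prompt=(
                "Describe ONE background/setting (environment only, do not mention any "
                f"subject) that fits these shots. Return JSON {{\"description\": \"...\"}}. "
                f"Shots: {manual_shots}"
            ),
            response_format=dict,
            temperature=0.4
        )
        bg_description = bg_response.get("description", "Neutral, softly lit setting")

        frames = [
            FramePlan(
                frame_number=i + 1,
                shot_type="shot",  # the user's text already implies the framing
                camera_angle="eye_level",
                description=shot,
                focus_elements=[],
                composition_notes="",
                background_id="bg_1"
            )
            for i, shot in enumerate(manual_shots)
        ]
        backgrounds = [BackgroundPlan(
            id="bg_1",
            description=bg_description,
            frames=[f.frame_number for f in frames]
        )]

        return PopcornSequence(
            prompt="Manual Mode",
            num_frames=len(frames),
            references=references,
            frames=frames,
            backgrounds=backgrounds,
            style=style,
            consistency_rules="Keep subject identity, lighting, and style consistent across all frames."
        )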
" 512 | "background reference, location shot, environmental view, high quality, 4k" 513 | ) 514 | 515 | # Generate 516 | bg_url = await self.image_gen.generate_image( 517 | prompt=prompt, 518 | aspect_ratio="16:9" 519 | ) 520 | 521 | # Download 522 | bg_filename = f"{bg.id}.png" 523 | bg_path_local = self.output_dir / bg_filename 524 | 525 | try: 526 | async with aiohttp.ClientSession() as session: 527 | async with session.get(bg_url) as response: 528 | if response.status == 200: 529 | with open(bg_path_local, 'wb') as f: 530 | while True: 531 | chunk = await response.content.read(1024) 532 | if not chunk: 533 | break 534 | f.write(chunk) 535 | bg_data[bg.id] = {"path": str(bg_path_local), "url": bg_url} 536 | logger.info(f" ✓ Saved: {bg_path_local.name}") 537 | else: 538 | logger.error(f" ✗ Failed to download background {bg.id}") 539 | bg_data[bg.id] = {"path": bg_url, "url": bg_url} 540 | except Exception as e: 541 | logger.error(f" ✗ Download error for {bg.id}: {e}") 542 | bg_data[bg.id] = {"path": bg_url, "url": bg_url} 543 | 544 | return bg_data 545 | 546 | 547 | # ============================================================================ 548 | # FRAME GENERATOR 549 | # ============================================================================ 550 | 551 | class FrameGenerator: 552 | """Generate consistent frames using nano-banana-edit""" 553 | 554 | def __init__(self, image_gen: NanoBananaImageGenerator, output_dir: Path, style: str): 555 | self.image_gen = image_gen 556 | self.output_dir = output_dir / "frames" 557 | self.output_dir.mkdir(parents=True, exist_ok=True) 558 | self.style = style 559 | 560 | async def generate_frames( 561 | self, 562 | sequence: PopcornSequence, 563 | bg_data: Dict[str, Dict[str, str]] 564 | ) -> List[str]: 565 | """Generate all frames using reference images for consistency""" 566 | logger.info(f"Generating {sequence.num_frames} frames...") 567 | 568 | # Collect user-provided reference URLs 569 | user_ref_urls = [ref.url for ref in sequence.references] 570 | 571 | # Track first frame details for consistency 572 | first_frame_details = None 573 | 574 | # Generate frames 575 | frame_paths = [] 576 | for frame in sequence.frames: 577 | logger.info(f" Frame {frame.frame_number}/{sequence.num_frames}: {frame.shot_type}") 578 | 579 | # Build frame prompt with detail consistency 580 | prompt = self._build_frame_prompt(frame, sequence, first_frame_details) 581 | 582 | # Extract details from first frame for future consistency 583 | if frame.frame_number == 1: 584 | first_frame_details = self._extract_detail_hints(frame) 585 | 586 | # Prepare references for this frame 587 | current_frame_refs = user_ref_urls.copy() 588 | current_bg_info = bg_data.get(frame.background_id) 589 | if current_bg_info and current_bg_info.get("url"): 590 | current_frame_refs.append(current_bg_info["url"]) 591 | 592 | # Generate with references 593 | frame_url = await self.image_gen.generate_with_references( 594 | prompt=prompt, 595 | reference_images=current_frame_refs, 596 | aspect_ratio="16:9" 597 | ) 598 | 599 | # Download frame 600 | frame_filename = f"frame_{frame.frame_number:02d}.png" 601 | frame_path_local = self.output_dir / frame_filename 602 | 603 | try: 604 | async with aiohttp.ClientSession() as session: 605 | async with session.get(frame_url) as response: 606 | if response.status == 200: 607 | with open(frame_path_local, 'wb') as f: 608 | while True: 609 | chunk = await response.content.read(1024) 610 | if not chunk: 611 | break 612 | f.write(chunk) 613 | 
                            frame_paths.append(str(frame_path_local))
                            logger.info(f"  ✓ Saved: {frame_path_local.name}")
                        else:
                            logger.error(f"  ✗ Failed to download frame {frame.frame_number}")
                            frame_paths.append(frame_url)
            except Exception as e:
                logger.error(f"  ✗ Download error for frame {frame.frame_number}: {e}")
                frame_paths.append(frame_url)

        return frame_paths

    def _extract_detail_hints(self, frame: FramePlan) -> str:
        """Extract detail-level consistency hints from first frame description"""
        desc = frame.description.lower()
        hints = []

        # SUBJECT/CHARACTER DETAILS (accessories, props that should stay consistent)
        # Check negations first so "no gloves" / "no glasses" are not misread as
        # the positive case (the substring "glove" also matches "no glove")
        if "bare hand" in desc or "no glove" in desc:
            hints.append("bare hands (no gloves)")
        elif "glove" in desc:
            hints.append("wearing gloves")

        if "watch" in desc:
            hints.append("wearing watch")

        if "no glass" in desc:
            hints.append("no glasses")
        elif "glass" in desc and "wearing" in desc:
            hints.append("wearing glasses")

        if "holding" in desc:
            # Extract what's being held
            for item in ["cup", "mug", "phone", "bag", "tool", "book", "pen", "bottle", "box"]:
                if item in desc:
                    hints.append(f"holding {item}")

        return ", ".join(hints) if hints else ""

    def _build_frame_prompt(self, frame: FramePlan, sequence: PopcornSequence, first_frame_details: Optional[str] = None) -> str:
        """Build detailed prompt for frame generation with detail consistency"""

        # Build base components
        prompt_parts = [
            f"{sequence.style} style",
            f"{frame.shot_type.replace('_', ' ')} from {frame.camera_angle.replace('_', ' ')}",
            frame.description,
        ]

        # Add detail consistency for frames 2+
        if first_frame_details and frame.frame_number > 1:
            prompt_parts.append(f"MAINTAIN EXACT DETAILS FROM FRAME 1: {first_frame_details}")

        prompt_parts.extend([
            frame.composition_notes,
            sequence.consistency_rules,
            "subject naturally integrated in scene with proper lighting",
            "match scene lighting and atmosphere",
            "blend seamlessly with environment",
            "professional storyboard frame",
            "consistent with reference images",
            "cinematic framing",
            "high quality"
        ])

        return ", ".join([p for p in prompt_parts if p])
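
# Example of a composed frame prompt (illustrative values; empty parts are dropped):
#   "cinematic realistic style, close up from low angle, <frame description>,
#    MAINTAIN EXACT DETAILS FROM FRAME 1: holding cup, <composition notes>,
#    <consistency rules>, subject naturally integrated in scene with proper
#    lighting, ..., high quality"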


# ============================================================================
# MAIN ORCHESTRATOR
# ============================================================================

class PopcornGenerator:
    """Main orchestrator for Popcorn-style generation"""

    def __init__(self, muapi_key: str, output_dir: str, style: str = "cinematic realistic"):
        self.llm = LLMClient(muapi_key, model="gpt-5-mini")
        self.vision = VisionAdapter(muapi_key)
        self.image_gen = NanoBananaImageGenerator(muapi_key)
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)
        self.style = style

        self.analyzer = ReferenceAnalyzer(self.vision, self.llm)
        self.planner = SequencePlanner(self.llm)
        self.bg_generator = BackgroundGenerator(self.image_gen, self.output_dir, style)
        self.generator = FrameGenerator(self.image_gen, self.output_dir, style)

    async def generate(
        self,
        prompt: str,
        reference_urls: List[str],
        num_frames: int = 6,
        manual_shots: Optional[List[str]] = None
    ) -> Dict:
        """Generate a Popcorn-style sequence"""
        logger.info("🍿 Popcorn Generator Starting...")
        if manual_shots:
            logger.info(f"  Mode: MANUAL ({len(manual_shots)} shots)")
        else:
            logger.info(f"  Mode: AUTO (Prompt: {prompt})")
        logger.info(f"  Frames: {num_frames}")
        logger.info(f"  References: {len(reference_urls)}")
        logger.info(f"  Style: {self.style}")

        # Step 1: Analyze references
        logger.info("\n📸 Step 1: Analyzing reference images...")
        references = []
        for i, url in enumerate(reference_urls):
            ref = await self.analyzer.analyze_reference(url, i)
            references.append(ref)
            logger.info(f"  ✓ {ref.type}: {ref.description[:60]}...")

        # Step 2: Plan sequence
        logger.info("\n🎬 Step 2: Planning shot sequence...")
        if manual_shots:
            sequence = await self.planner.plan_manual_sequence(manual_shots, references, self.style)
        else:
            sequence = await self.planner.plan_sequence(prompt, num_frames, references, self.style)

        logger.info(f"  ✓ Planned {len(sequence.frames)} frames")
        logger.info(f"  Consistency: {sequence.consistency_rules}")

        # Step 3: Generate backgrounds
        logger.info("\n🏙️ Step 3: Generating backgrounds...")
        bg_data = await self.bg_generator.generate_backgrounds(sequence)
        logger.info(f"  ✓ Generated {len(bg_data)} backgrounds")

        # Step 4: Generate frames
        logger.info("\n🖼️ Step 4: Generating frames...")
        frame_paths = await self.generator.generate_frames(sequence, bg_data)
        logger.info(f"  ✓ Generated {len(frame_paths)} frames")

        # Step 5: Save sequence data
        sequence_file = self.output_dir / "sequence.json"
        with open(sequence_file, 'w') as f:
            json.dump({
                "prompt": prompt if not manual_shots else "Manual Mode",
                "manual_shots": manual_shots,
                "num_frames": len(sequence.frames),
                "references": [ref.model_dump() for ref in references],
                "frames": [frame.model_dump() for frame in sequence.frames],
                "backgrounds": [bg.model_dump() for bg in sequence.backgrounds],
                "frame_paths": frame_paths,
                "bg_paths": bg_data,
                "style": self.style,
                "consistency_rules": sequence.consistency_rules
            }, f, indent=2)

        logger.info("\n✅ Sequence complete!")
        logger.info(f"  Frames: {self.output_dir}/frames/")
        logger.info(f"  Data: {sequence_file}")

        return {
            "sequence": sequence,
            "frame_paths": frame_paths,
            "output_dir": str(self.output_dir)
        }


# ============================================================================
# CLI
# ============================================================================

async def main():
    parser = argparse.ArgumentParser(description="Popcorn Storyboard Generator")

    parser.add_argument(
        "--prompt",
        help="Text prompt describing the scene/action (Required for Auto Mode)"
    )

    parser.add_argument(
        "--manual_shots",
        nargs="+",
        help="List of manual shot descriptions (Enables Manual Mode)"
    )

    parser.add_argument(
        "--references",
        nargs="+",
        default=[],
        help="Reference image URLs (up to 4 recommended)"
    )

    parser.add_argument(
        "--frames",
        type=int,
        default=6,
        help="Number of frames to generate (Auto Mode only)"
    )
    parser.add_argument(
        "--style",
        default="cinematic realistic",
        help="Visual style (e.g., 'anime', 'watercolor', 'noir film')"
    )

    parser.add_argument(
        "--output",
        default="popcorn_output",
        help="Output directory"
    )

    args = parser.parse_args()

    # Validate
    if not args.prompt and not args.manual_shots:
        print("Error: Must provide either --prompt OR --manual_shots")
        sys.exit(1)

    if args.manual_shots and len(args.manual_shots) < 1:
        print("Error: Manual mode requires at least 1 shot description")
        sys.exit(1)

    # Frame count is only enforced in Auto Mode; Manual Mode uses the shot count
    if not args.manual_shots and not (2 <= args.frames <= 12):
        print("Error: Frames must be between 2 and 12")
        sys.exit(1)

    if len(args.references) > 4:
        print("Warning: More than 4 references may reduce consistency")

    # Get API key
    muapi_key = MUAPIAPP_API_KEY

    if not muapi_key:
        print("Error: MUAPIAPP_API_KEY missing in secrets.py")
        sys.exit(1)

    # Create output directory
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_dir = Path(args.output) / f"sequence_{timestamp}"

    # Generate
    generator = PopcornGenerator(muapi_key, str(output_dir), args.style)

    try:
        result = await generator.generate(
            prompt=args.prompt if args.prompt else "Manual Mode",
            reference_urls=args.references,
            num_frames=args.frames if not args.manual_shots else len(args.manual_shots),
            manual_shots=args.manual_shots
        )
        print(f"\n🎉 Success! Frames saved to: {result['output_dir']}")

    except Exception as e:
        logger.error(f"Generation failed: {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    asyncio.run(main())

--------------------------------------------------------------------------------