180 | );
181 | }
182 |
--------------------------------------------------------------------------------
/CLAUDE.md:
--------------------------------------------------------------------------------
1 | # CLAUDE.md - Technical Notes for LLM Council
2 |
3 | This file contains technical details, architectural decisions, and important implementation notes for future development sessions.
4 |
5 | ## Project Overview
6 |
7 | LLM Council is a 3-stage deliberation system where multiple LLMs collaboratively answer user questions. The key innovation is anonymized peer review in Stage 2, preventing models from playing favorites.
8 |
9 | ## Architecture
10 |
11 | ### Backend Structure (`backend/`)
12 |
13 | **`config.py`**
14 | - Contains `COUNCIL_MODELS` (list of OpenRouter model identifiers)
15 | - Contains `CHAIRMAN_MODEL` (model that synthesizes final answer)
16 | - Uses environment variable `OPENROUTER_API_KEY` from `.env`
17 | - Backend runs on **port 8008** in Docker deployment
18 |
19 | **`openrouter.py`**
20 | - `query_model()`: Single async model query
21 | - `query_models_parallel()`: Parallel queries using `asyncio.gather()`
22 | - Returns dict with 'content' and optional 'reasoning_details'
23 | - Graceful degradation: returns None on failure, continues with successful responses
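
A minimal sketch of this fan-out pattern (assuming an `httpx`-based async client and the standard OpenRouter chat-completions endpoint; the actual signatures in `openrouter.py` may differ):

```python
import asyncio
import httpx

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

async def query_model(client: httpx.AsyncClient, model: str, prompt: str, api_key: str):
    """Query one model; return None on any failure (graceful degradation)."""
    try:
        resp = await client.post(
            OPENROUTER_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            json={"model": model, "messages": [{"role": "user", "content": prompt}]},
            timeout=120,
        )
        resp.raise_for_status()
        data = resp.json()
        return {"model": model, "content": data["choices"][0]["message"]["content"]}
    except Exception:
        return None  # one failed model never fails the whole request

async def query_models_parallel(models: list[str], prompt: str, api_key: str):
    """Fan out to all council models at once; keep only successful replies."""
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(
            *(query_model(client, m, prompt, api_key) for m in models)
        )
    return [r for r in results if r is not None]
```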
24 |
25 | **`council.py`** - The Core Logic
26 | - `stage1_collect_responses()`: Parallel queries to all council models
27 | - `stage2_collect_rankings()`:
28 | - Anonymizes responses as "Response A, B, C, etc."
29 | - Creates `label_to_model` mapping for de-anonymization
30 | - Prompts models to evaluate and rank (with strict format requirements)
31 | - Returns tuple: (rankings_list, label_to_model_dict)
32 | - Each ranking includes both raw text and `parsed_ranking` list
33 | - `stage3_synthesize_final()`: Chairman synthesizes from all responses + rankings
34 | - `parse_ranking_from_text()`: Extracts "FINAL RANKING:" section, handles both numbered lists and plain format
35 | - `calculate_aggregate_rankings()`: Computes average rank position across all peer evaluations
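
A sketch of the aggregation step, assuming each Stage 2 result carries a `parsed_ranking` list of labels (output field names here are illustrative):

```python
from collections import defaultdict

def calculate_aggregate_rankings(rankings: list[dict], label_to_model: dict) -> list[dict]:
    """Average each model's rank position (1 = best) across all peer evaluations."""
    positions = defaultdict(list)
    for ranking in rankings:
        for pos, label in enumerate(ranking.get("parsed_ranking", []), start=1):
            model = label_to_model.get(label)
            if model:
                positions[model].append(pos)
    aggregated = [
        {"model": model, "average_rank": sum(p) / len(p), "votes": len(p)}
        for model, p in positions.items()
    ]
    return sorted(aggregated, key=lambda item: item["average_rank"])
```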
36 |
37 | **`storage.py`**
38 | - JSON-based conversation storage in `data/conversations/`
39 | - Each conversation: `{id, created_at, title, messages[]}`
40 | - Assistant messages contain: `{role, stage1, stage2, stage3}`
41 | - Note: metadata (label_to_model, aggregate_rankings) is passed to `add_assistant_message()` along with the three stages and is also returned via the API (see the sketch below)
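
Roughly, a stored conversation looks like this (illustrative values only; `storage.py` is the authoritative source for the exact shape):

```python
example_conversation = {
    "id": "uuid4-string",
    "created_at": "2025-01-01T12:00:00Z",
    "title": "Example question",
    "messages": [
        {"role": "user", "content": "What is ...?"},
        {
            "role": "assistant",
            "stage1": [],    # individual model responses
            "stage2": [],    # raw evaluations + parsed rankings
            "stage3": {},    # chairman synthesis
            "metadata": {},  # label_to_model + aggregate_rankings (if saved)
        },
    ],
}
```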
42 |
43 | **`main.py`**
44 | - FastAPI app with CORS enabled for:
45 | - Development: localhost:5173 (Vite), localhost:3000
46 | - Production: localhost:80, localhost (Docker/Nginx)
47 | - POST `/api/conversations/{id}/message` returns metadata in addition to stages
48 | - Metadata includes: label_to_model mapping and aggregate_rankings
49 | - `/health` endpoint for Docker health checks
50 |
51 | ### Frontend Structure (`frontend/src/`)
52 |
53 | **`App.jsx`**
54 | - Main orchestration: manages conversations list and current conversation
55 | - Handles message sending and metadata storage
56 | - Important: metadata arrives on the `stage2_complete` stream event and is kept in UI state so rankings can be rendered as soon as Stage 2 finishes
57 |
58 | **`components/ChatInterface.jsx`**
59 | - Multiline textarea (3 rows, resizable)
60 | - Enter to send, Shift+Enter for new line
61 | - User messages wrapped in markdown-content class for padding
62 |
63 | **`components/Stage1.jsx`**
64 | - Tab view of individual model responses
65 | - ReactMarkdown rendering with markdown-content wrapper
66 |
67 | **`components/Stage2.jsx`**
68 | - **Critical Feature**: Tab view showing RAW evaluation text from each model
69 | - De-anonymization happens CLIENT-SIDE for display (models receive anonymous labels)
70 | - Shows "Extracted Ranking" below each evaluation so users can validate parsing
71 | - Aggregate rankings shown with average position and vote count
72 | - Explanatory text clarifies that boldface model names are for readability only
73 |
74 | **`components/Stage3.jsx`**
75 | - Final synthesized answer from chairman
76 | - Green-tinted background (#f0fff0) to highlight conclusion
77 |
78 | **Styling (`*.css`)**
79 | - Light mode theme (not dark mode)
80 | - Primary color: #4a90e2 (blue)
81 | - Global markdown styling in `index.css` with `.markdown-content` class
82 | - 12px padding on all markdown content to prevent cluttered appearance
83 |
84 | ## Key Design Decisions
85 |
86 | ### Stage 2 Prompt Format
87 | The Stage 2 prompt is very specific to ensure parseable output:
88 | ```
89 | 1. Evaluate each response individually first
90 | 2. Provide "FINAL RANKING:" header
91 | 3. Numbered list format: "1. Response C", "2. Response A", etc.
92 | 4. No additional text after ranking section
93 | ```
94 |
95 | This strict format allows reliable parsing while still getting thoughtful evaluations.
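
A minimal sketch of that parsing approach, including a fallback for models that skip the numbered list (the real `parse_ranking_from_text()` may differ in detail):

```python
import re

def parse_ranking_from_text(text: str) -> list[str]:
    """Extract an ordered list of labels like 'Response C' from an evaluation."""
    # Prefer the section after the required "FINAL RANKING:" header.
    _, _, tail = text.partition("FINAL RANKING:")
    section = tail if tail else text
    # Works for both "1. Response C" numbered lines and bare "Response C" mentions.
    ordered = []
    for letter in re.findall(r"Response\s+([A-Z])", section):
        label = f"Response {letter}"
        if label not in ordered:  # keep first-occurrence order, drop repeats
            ordered.append(label)
    return ordered
```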
96 |
97 | ### De-anonymization Strategy
98 | - Models receive: "Response A", "Response B", etc.
99 | - Backend creates mapping: `{"Response A": "openai/gpt-5.1", ...}`
100 | - Frontend displays model names in **bold** for readability
101 | - Users see explanation that original evaluation used anonymous labels
102 | - This prevents bias while maintaining transparency
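
Roughly how the labels and mapping are built, assuming Stage 1 results expose `model` and `content` fields (the helper name is illustrative):

```python
import string

def anonymize_responses(stage1_results: list[dict]) -> tuple[str, dict]:
    """Label responses 'Response A', 'Response B', ... and record the reverse mapping."""
    label_to_model = {}
    blocks = []
    for letter, result in zip(string.ascii_uppercase, stage1_results):
        label = f"Response {letter}"
        label_to_model[label] = result["model"]        # e.g. "openai/gpt-5.1"
        blocks.append(f"{label}:\n{result['content']}")
    return "\n\n".join(blocks), label_to_model
```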
103 |
104 | ### Error Handling Philosophy
105 | - Continue with successful responses if some models fail (graceful degradation)
106 | - Never fail the entire request due to single model failure
107 | - Log errors but don't expose to user unless all models fail
108 |
109 | ### UI/UX Transparency
110 | - All raw outputs are inspectable via tabs
111 | - Parsed rankings shown below raw text for validation
112 | - Users can verify system's interpretation of model outputs
113 | - This builds trust and allows debugging of edge cases
114 |
115 | ## Important Implementation Details
116 |
117 | ### Relative Imports
118 | All backend modules use relative imports (e.g., `from .config import ...`) not absolute imports. This is critical for Python's module system to work correctly when running as `python -m backend.main`.
119 |
120 | ### Port Configuration
121 |
122 | **Development Mode:**
123 | - Backend: 8008 (direct access)
124 | - Frontend: 5173 (Vite dev server)
125 | - Frontend calls backend directly at `http://localhost:8008`
126 |
127 | **Docker/Production Mode:**
128 | - Backend: 8008 (internal container port)
129 | - Nginx: 80 (external access port)
130 | - Frontend built as static files served by Nginx
131 | - Frontend uses relative URLs (e.g., `/api/...`) which are proxied to backend by Nginx
132 | - No CORS issues because all traffic goes through single origin (port 80)
133 |
134 | The `frontend/src/api.js` module automatically detects the environment via `import.meta.env.PROD` and switches between the development and production API base URLs.
135 |
136 | ### Markdown Rendering
137 | All ReactMarkdown components must be wrapped in an element with the `markdown-content` class (e.g. `<div className="markdown-content">`) for proper spacing. This class is defined globally in `index.css`.
138 |
139 | ### Model Configuration
140 | Models are hardcoded in `backend/config.py`. The chairman can be the same model as a council member or a different one. The current default chairman is Gemini, per user preference.
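
Illustrative sketch only (the identifiers below are placeholders, not the configured models; assumes python-dotenv is used to read `.env`):

```python
import os
from dotenv import load_dotenv

load_dotenv()
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")

COUNCIL_MODELS = [
    "openai/gpt-5.1",               # placeholder identifiers, not the real council
    "anthropic/claude-sonnet-4.5",
    "google/gemini-2.5-pro",
]
CHAIRMAN_MODEL = "google/gemini-2.5-pro"  # chairman may also be a council member
```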
141 |
142 | ## Docker Deployment
143 |
144 | The application can be deployed using Docker Compose:
145 |
146 | ```bash
147 | # Build and start containers
148 | docker-compose up -d
149 |
150 | # View logs
151 | docker-compose logs -f
152 |
153 | # Rebuild after code changes
154 | docker-compose up -d --build
155 |
156 | # Stop containers
157 | docker-compose down
158 | ```
159 |
160 | **Architecture:**
161 | - `backend` container: Python FastAPI app on port 8008
162 | - `nginx` container: Nginx reverse proxy on port 80
163 | - Serves frontend static files from `/usr/share/nginx/html`
164 | - Proxies `/api/*` requests to backend container
165 | - Handles CORS by making everything same-origin
166 | - Shared volume: `./data` for conversation persistence
167 |
168 | **Important:** Before deploying, ensure:
169 | 1. Frontend is built: `cd frontend && npm run build`
170 | 2. `.env` file exists with `OPENROUTER_API_KEY`
171 | 3. Port 80 is available on host machine
172 |
173 | ## Common Gotchas
174 |
175 | 1. **Module Import Errors**: Always run backend as `python -m backend.main` from project root, not from backend directory
176 | 2. **CORS Issues in Docker**:
177 | - Frontend must use relative URLs in production (handled automatically by `import.meta.env.PROD`)
178 | - Backend CORS must allow `http://localhost` and `http://localhost:80`
179 | - Never expose backend port 8008 externally in production
180 | 3. **Ranking Parse Failures**: If models don't follow format, fallback regex extracts any "Response X" patterns in order
181 | 4. **Missing Metadata**: assistant messages saved before metadata was persisted will not have it; the frontend should handle a missing `metadata` field
182 | 5. **Docker Build Issues**: Remember to rebuild frontend before `docker-compose up --build`
183 |
184 | ## Future Enhancement Ideas
185 |
186 | - Configurable council/chairman via UI instead of config file
187 | - Streaming responses instead of batch loading
188 | - Export conversations to markdown/PDF
189 | - Model performance analytics over time
190 | - Custom ranking criteria (not just accuracy/insight)
191 | - Support for reasoning models (o1, etc.) with special handling
192 |
193 | ## Testing Notes
194 |
195 | Use `test_openrouter.py` to verify API connectivity and test different model identifiers before adding to council. The script tests both streaming and non-streaming modes.
196 |
197 | ## Data Flow Summary
198 |
199 | ```
200 | User Query
201 | ↓
202 | Stage 1: Parallel queries → [individual responses]
203 | ↓
204 | Stage 2: Anonymize → Parallel ranking queries → [evaluations + parsed rankings]
205 | ↓
206 | Aggregate Rankings Calculation → [sorted by avg position]
207 | ↓
208 | Stage 3: Chairman synthesis with full context
209 | ↓
210 | Return: {stage1, stage2, stage3, metadata}
211 | ↓
212 | Frontend: Display with tabs + validation UI
213 | ```
214 |
215 | The entire flow is async/parallel where possible to minimize latency.
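
The non-streaming path condenses to roughly this sketch of what `run_full_council()` does (call signatures match how the stage functions are used in `backend/main.py`):

```python
from backend.council import (
    stage1_collect_responses,
    stage2_collect_rankings,
    stage3_synthesize_final,
    calculate_aggregate_rankings,
)

async def run_full_council(user_query: str):
    """Stages run sequentially; the models within Stage 1 and Stage 2 run in parallel."""
    stage1_results = await stage1_collect_responses(user_query)
    stage2_results, label_to_model = await stage2_collect_rankings(user_query, stage1_results)
    aggregate_rankings = calculate_aggregate_rankings(stage2_results, label_to_model)
    stage3_result = await stage3_synthesize_final(user_query, stage1_results, stage2_results)
    metadata = {"label_to_model": label_to_model, "aggregate_rankings": aggregate_rankings}
    return stage1_results, stage2_results, stage3_result, metadata
```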
216 |
--------------------------------------------------------------------------------
/backend/main.py:
--------------------------------------------------------------------------------
1 | """FastAPI backend for LLM Council."""
2 |
3 | from fastapi import FastAPI, HTTPException, Request
4 | from fastapi.middleware.cors import CORSMiddleware
5 | from fastapi.responses import StreamingResponse, JSONResponse
6 | from fastapi.exceptions import RequestValidationError
7 | from pydantic import BaseModel, field_validator
8 | from typing import List, Dict, Any
9 | import uuid
10 | import json
11 | import asyncio
12 |
13 | from slowapi import Limiter
14 | from slowapi.util import get_remote_address
15 | from slowapi.errors import RateLimitExceeded
16 |
17 | from . import storage
18 | from .council import (
19 | run_full_council,
20 | generate_conversation_title,
21 | stage1_collect_responses,
22 | stage2_collect_rankings,
23 | stage3_synthesize_final,
24 | calculate_aggregate_rankings,
25 | )
26 |
27 | # Constants
28 | MAX_MESSAGE_LENGTH = 1000
29 | RATE_LIMIT_MESSAGE = "5/minute"
30 | RATE_LIMIT_STREAM = "5/minute"
31 |
32 |
33 | def get_real_ip(request: Request) -> str:
34 | """Get real client IP, considering proxy headers."""
35 | # Check X-Forwarded-For first (set by Nginx)
36 | forwarded_for = request.headers.get("X-Forwarded-For")
37 | if forwarded_for:
38 | # Take the first IP in the chain (original client)
39 | return forwarded_for.split(",")[0].strip()
40 |
41 | # Check X-Real-IP (set by Nginx)
42 | real_ip = request.headers.get("X-Real-IP")
43 | if real_ip:
44 | return real_ip
45 |
46 | # Fall back to direct client IP
47 | return get_remote_address(request)
48 |
49 |
50 | # Initialize rate limiter with real IP detection
51 | limiter = Limiter(key_func=get_real_ip)
52 |
53 | app = FastAPI(title="LLM Council API")
54 | app.state.limiter = limiter
55 |
56 |
57 | # Custom error handlers
58 | @app.exception_handler(RateLimitExceeded)
59 | async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
60 | """Handle rate limit exceeded errors."""
61 | return JSONResponse(
62 | status_code=429,
63 | content={
64 | "error": {
65 | "code": "RATE_LIMIT_EXCEEDED",
66 | "message": "请求过于频繁,请稍后再试",
67 | "message_en": "Too many requests, please try again later",
68 | "details": {
69 | "retry_after": 60
70 | }
71 | }
72 | },
73 | headers={"Retry-After": "60"}
74 | )
75 |
76 |
77 | @app.exception_handler(RequestValidationError)
78 | async def validation_error_handler(request: Request, exc: RequestValidationError):
79 | """Handle validation errors with user-friendly messages."""
80 | errors = exc.errors()
81 |
82 | # Check for content length error
83 | for error in errors:
84 | if "content" in str(error.get("loc", [])):
85 | msg = error.get("msg", "")
86 |             if str(MAX_MESSAGE_LENGTH) in msg or "too long" in msg.lower() or "超过" in msg:
87 | return JSONResponse(
88 | status_code=400,
89 | content={
90 | "error": {
91 | "code": "CONTENT_TOO_LONG",
92 | "message": f"消息内容不能超过 {MAX_MESSAGE_LENGTH} 个字符",
93 | "message_en": f"Message content cannot exceed {MAX_MESSAGE_LENGTH} characters",
94 | "details": {
95 | "max_length": MAX_MESSAGE_LENGTH
96 | }
97 | }
98 | }
99 | )
100 |
101 | # Default validation error
102 | return JSONResponse(
103 | status_code=400,
104 | content={
105 | "error": {
106 | "code": "VALIDATION_ERROR",
107 | "message": "请求参数无效",
108 | "message_en": "Invalid request parameters",
109 | "details": {
110 | "errors": [str(e) for e in errors]
111 | }
112 | }
113 | }
114 | )
115 |
116 | # Enable CORS for local development and production (Docker with Nginx)
117 | app.add_middleware(
118 | CORSMiddleware,
119 | allow_origins=[
120 | "http://localhost:5173", # Vite dev server
121 | "http://localhost:3000", # Alternative dev server
122 | "http://localhost:80", # Docker Nginx
123 | "http://localhost", # Docker Nginx (default port)
124 | ],
125 | allow_credentials=True,
126 | allow_methods=["*"],
127 | allow_headers=["*"],
128 | )
129 |
130 |
131 | class CreateConversationRequest(BaseModel):
132 | """Request to create a new conversation."""
133 |
134 | pass
135 |
136 |
137 | class SendMessageRequest(BaseModel):
138 | """Request to send a message in a conversation."""
139 |
140 | content: str
141 |
142 | @field_validator('content')
143 | @classmethod
144 | def validate_content(cls, v: str) -> str:
145 | if not v or len(v.strip()) == 0:
146 | raise ValueError('消息内容不能为空')
147 | if len(v) > MAX_MESSAGE_LENGTH:
148 | raise ValueError(f'消息内容不能超过 {MAX_MESSAGE_LENGTH} 个字符 (当前: {len(v)})')
149 | return v.strip()
150 |
151 |
152 | class ConversationMetadata(BaseModel):
153 | """Conversation metadata for list view."""
154 |
155 | id: str
156 | created_at: str
157 | title: str
158 | message_count: int
159 |
160 |
161 | class Conversation(BaseModel):
162 | """Full conversation with all messages."""
163 |
164 | id: str
165 | created_at: str
166 | title: str
167 | messages: List[Dict[str, Any]]
168 |
169 |
170 | @app.get("/")
171 | async def root():
172 | """Health check endpoint."""
173 | return {"status": "ok", "service": "LLM Council API"}
174 |
175 |
176 | @app.get("/health")
177 | async def health():
178 | """Health check endpoint for Docker."""
179 | return {"status": "ok"}
180 |
181 |
182 | @app.get("/api/conversations", response_model=List[ConversationMetadata])
183 | async def list_conversations():
184 | """List all conversations (metadata only)."""
185 | return storage.list_conversations()
186 |
187 |
188 | @app.post("/api/conversations", response_model=Conversation)
189 | async def create_conversation(request: CreateConversationRequest):
190 | """Create a new conversation."""
191 | conversation_id = str(uuid.uuid4())
192 | conversation = storage.create_conversation(conversation_id)
193 | return conversation
194 |
195 |
196 | @app.get("/api/conversations/{conversation_id}", response_model=Conversation)
197 | async def get_conversation(conversation_id: str):
198 | """Get a specific conversation with all its messages."""
199 | conversation = storage.get_conversation(conversation_id)
200 | if conversation is None:
201 | raise HTTPException(status_code=404, detail="Conversation not found")
202 | return conversation
203 |
204 |
205 | @app.post("/api/conversations/{conversation_id}/message")
206 | @limiter.limit(RATE_LIMIT_MESSAGE)
207 | async def send_message(request: Request, conversation_id: str, body: SendMessageRequest):
208 | """
209 | Send a message and run the 3-stage council process.
210 | Returns the complete response with all stages.
211 | """
212 | # Check if conversation exists
213 | conversation = storage.get_conversation(conversation_id)
214 | if conversation is None:
215 | raise HTTPException(status_code=404, detail="Conversation not found")
216 |
217 | # Check if this is the first message
218 | is_first_message = len(conversation["messages"]) == 0
219 |
220 | # Add user message
221 | storage.add_user_message(conversation_id, body.content)
222 |
223 | # If this is the first message, generate a title
224 | if is_first_message:
225 | title = await generate_conversation_title(body.content)
226 | storage.update_conversation_title(conversation_id, title)
227 |
228 | # Run the 3-stage council process
229 | stage1_results, stage2_results, stage3_result, metadata = await run_full_council(
230 | body.content
231 | )
232 |
233 | # Add assistant message with all stages and metadata
234 | storage.add_assistant_message(
235 | conversation_id, stage1_results, stage2_results, stage3_result, metadata
236 | )
237 |
238 | # Return the complete response with metadata
239 | return {
240 | "stage1": stage1_results,
241 | "stage2": stage2_results,
242 | "stage3": stage3_result,
243 | "metadata": metadata,
244 | }
245 |
246 |
247 | @app.post("/api/conversations/{conversation_id}/message/stream")
248 | @limiter.limit(RATE_LIMIT_STREAM)
249 | async def send_message_stream(request: Request, conversation_id: str, body: SendMessageRequest):
250 | """
251 | Send a message and stream the 3-stage council process.
252 | Returns Server-Sent Events as each stage completes.
253 | """
254 | # Check if conversation exists
255 | conversation = storage.get_conversation(conversation_id)
256 | if conversation is None:
257 | raise HTTPException(status_code=404, detail="Conversation not found")
258 |
259 | # Check if this is the first message
260 | is_first_message = len(conversation["messages"]) == 0
261 |
262 | async def event_generator():
263 | try:
264 | # Add user message
265 | storage.add_user_message(conversation_id, body.content)
266 |
267 | # Start title generation in parallel (don't await yet)
268 | title_task = None
269 | if is_first_message:
270 | title_task = asyncio.create_task(
271 | generate_conversation_title(body.content)
272 | )
273 |
274 | # Stage 1: Collect responses
275 | yield f"data: {json.dumps({'type': 'stage1_start'})}\n\n"
276 | stage1_results = await stage1_collect_responses(body.content)
277 | yield f"data: {json.dumps({'type': 'stage1_complete', 'data': stage1_results})}\n\n"
278 |
279 | # Stage 2: Collect rankings
280 | yield f"data: {json.dumps({'type': 'stage2_start'})}\n\n"
281 | stage2_results, label_to_model = await stage2_collect_rankings(
282 | body.content, stage1_results
283 | )
284 | aggregate_rankings = calculate_aggregate_rankings(
285 | stage2_results, label_to_model
286 | )
287 | yield f"data: {json.dumps({'type': 'stage2_complete', 'data': stage2_results, 'metadata': {'label_to_model': label_to_model, 'aggregate_rankings': aggregate_rankings}})}\n\n"
288 |
289 | # Stage 3: Synthesize final answer
290 | yield f"data: {json.dumps({'type': 'stage3_start'})}\n\n"
291 | stage3_result = await stage3_synthesize_final(
292 | body.content, stage1_results, stage2_results
293 | )
294 | yield f"data: {json.dumps({'type': 'stage3_complete', 'data': stage3_result})}\n\n"
295 |
296 | # Wait for title generation if it was started
297 | if title_task:
298 | title = await title_task
299 | storage.update_conversation_title(conversation_id, title)
300 | yield f"data: {json.dumps({'type': 'title_complete', 'data': {'title': title}})}\n\n"
301 |
302 | # Save complete assistant message with metadata
303 | metadata = {
304 | "label_to_model": label_to_model,
305 | "aggregate_rankings": aggregate_rankings,
306 | }
307 | storage.add_assistant_message(
308 | conversation_id, stage1_results, stage2_results, stage3_result, metadata
309 | )
310 |
311 | # Send completion event
312 | yield f"data: {json.dumps({'type': 'complete'})}\n\n"
313 |
314 | except Exception as e:
315 | # Send error event
316 | yield f"data: {json.dumps({'type': 'error', 'message': str(e)})}\n\n"
317 |
318 | return StreamingResponse(
319 | event_generator(),
320 | media_type="text/event-stream",
321 | headers={
322 | "Cache-Control": "no-cache",
323 | "Connection": "keep-alive",
324 | },
325 | )
326 |
327 |
328 | if __name__ == "__main__":
329 | import uvicorn
330 |
331 | uvicorn.run(app, host="0.0.0.0", port=8008)
332 |
--------------------------------------------------------------------------------
/frontend/src/App.jsx:
--------------------------------------------------------------------------------
1 | import { useState, useEffect } from "react";
2 | import { useNavigate, useParams, Routes, Route } from "react-router-dom";
3 | import { useTranslation } from "react-i18next";
4 | import { Toaster } from "sonner";
5 | import Sidebar from "./components/Sidebar";
6 | import ChatInterface from "./components/ChatInterface";
7 | import PartnerFooter from "./components/PartnerFooter";
8 | import LanguageSwitcher from "./components/LanguageSwitcher";
9 | import { api } from "./api";
10 | import { Sheet, SheetContent } from "@/components/ui/sheet";
11 | import { Button } from "@/components/ui/button";
12 | import { Menu } from "lucide-react";
13 |
14 | function AppContent() {
15 | const { t } = useTranslation();
16 | const [conversations, setConversations] = useState([]);
17 | const [currentConversation, setCurrentConversation] = useState(null);
18 | const [isLoading, setIsLoading] = useState(false);
19 | const [isMobileMenuOpen, setIsMobileMenuOpen] = useState(false);
20 |
21 | const navigate = useNavigate();
22 | const { conversationId } = useParams();
23 |
24 | // Load conversations on mount
25 | useEffect(() => {
26 | loadConversations();
27 | }, []);
28 |
29 | // Load conversation details when URL changes
30 | useEffect(() => {
31 | if (conversationId) {
32 | loadConversation(conversationId);
33 | } else {
34 | setCurrentConversation(null);
35 | }
36 | }, [conversationId]);
37 |
38 | const loadConversations = async () => {
39 | try {
40 | const convs = await api.listConversations();
41 | setConversations(convs);
42 | } catch (error) {
43 | console.error("Failed to load conversations:", error);
44 | }
45 | };
46 |
47 | const loadConversation = async (id) => {
48 | try {
49 | const conv = await api.getConversation(id);
50 | setCurrentConversation(conv);
51 | } catch (error) {
52 | console.error("Failed to load conversation:", error);
53 | }
54 | };
55 |
56 | const handleNewConversation = async () => {
57 | try {
58 | const newConv = await api.createConversation();
59 | setConversations([
60 |         { id: newConv.id, created_at: newConv.created_at, title: newConv.title, message_count: 0 },
61 | ...conversations,
62 | ]);
63 | // Navigate to new conversation URL
64 | navigate(`/c/${newConv.id}`);
65 | } catch (error) {
66 | console.error("Failed to create conversation:", error);
67 | }
68 | };
69 |
70 | const handleSelectConversation = (id) => {
71 | // Navigate to conversation URL
72 | navigate(`/c/${id}`);
73 | };
74 |
75 | const handleSendMessage = async (content) => {
76 | if (!conversationId) return;
77 |
78 | setIsLoading(true);
79 | try {
80 | // Optimistically add user message to UI
81 | const userMessage = { role: "user", content };
82 | setCurrentConversation((prev) => ({
83 | ...prev,
84 | messages: [...prev.messages, userMessage],
85 | }));
86 |
87 | // Create a partial assistant message that will be updated progressively
88 | const assistantMessage = {
89 | role: "assistant",
90 | stage1: null,
91 | stage2: null,
92 | stage3: null,
93 | metadata: null,
94 | loading: {
95 | stage1: false,
96 | stage2: false,
97 | stage3: false,
98 | },
99 | };
100 |
101 | // Add the partial assistant message
102 | setCurrentConversation((prev) => ({
103 | ...prev,
104 | messages: [...prev.messages, assistantMessage],
105 | }));
106 |
107 | // Send message with streaming
108 | await api.sendMessageStream(
109 | conversationId,
110 | content,
111 | (eventType, event) => {
112 | switch (eventType) {
113 | case "stage1_start":
114 | setCurrentConversation((prev) => {
115 | const messages = [...prev.messages];
116 | const lastMsg = messages[messages.length - 1];
117 | lastMsg.loading.stage1 = true;
118 | return { ...prev, messages };
119 | });
120 | break;
121 |
122 | case "stage1_complete":
123 | setCurrentConversation((prev) => {
124 | const messages = [...prev.messages];
125 | const lastMsg = messages[messages.length - 1];
126 | lastMsg.stage1 = event.data;
127 | lastMsg.loading.stage1 = false;
128 | return { ...prev, messages };
129 | });
130 | break;
131 |
132 | case "stage2_start":
133 | setCurrentConversation((prev) => {
134 | const messages = [...prev.messages];
135 | const lastMsg = messages[messages.length - 1];
136 | lastMsg.loading.stage2 = true;
137 | return { ...prev, messages };
138 | });
139 | break;
140 |
141 | case "stage2_complete":
142 | setCurrentConversation((prev) => {
143 | const messages = [...prev.messages];
144 | const lastMsg = messages[messages.length - 1];
145 | lastMsg.stage2 = event.data;
146 | lastMsg.metadata = event.metadata;
147 | lastMsg.loading.stage2 = false;
148 | return { ...prev, messages };
149 | });
150 | break;
151 |
152 | case "stage3_start":
153 | setCurrentConversation((prev) => {
154 | const messages = [...prev.messages];
155 | const lastMsg = messages[messages.length - 1];
156 | lastMsg.loading.stage3 = true;
157 | return { ...prev, messages };
158 | });
159 | break;
160 |
161 | case "stage3_complete":
162 | setCurrentConversation((prev) => {
163 | const messages = [...prev.messages];
164 | const lastMsg = messages[messages.length - 1];
165 | lastMsg.stage3 = event.data;
166 | lastMsg.loading.stage3 = false;
167 | return { ...prev, messages };
168 | });
169 | break;
170 |
171 | case "title_complete":
172 | // Reload conversations to get updated title
173 | loadConversations();
174 | break;
175 |
176 | case "complete":
177 | // Stream complete, reload conversations list
178 | loadConversations();
179 | setIsLoading(false);
180 | break;
181 |
182 | case "error":
183 | console.error("Stream error:", event.message);
184 | setIsLoading(false);
185 | break;
186 |
187 | default:
188 | console.log("Unknown event type:", eventType);
189 | }
190 | },
191 | );
192 | } catch (error) {
193 | console.error("Failed to send message:", error);
194 | // Remove optimistic messages on error
195 | setCurrentConversation((prev) => ({
196 | ...prev,
197 | messages: prev.messages.slice(0, -2),
198 | }));
199 | setIsLoading(false);
200 | }
201 | };
202 |
203 | return (
204 |