├── .env.example ├── .gitignore ├── .python-version ├── README.md ├── pic.png ├── pyproject.toml ├── server.py ├── tests.py └── uv.lock /.env.example: -------------------------------------------------------------------------------- 1 | # Required API Keys 2 | ANTHROPIC_API_KEY="your-anthropic-api-key" # Needed if proxying *to* Anthropic 3 | OPENAI_API_KEY="sk-..." 4 | GEMINI_API_KEY="your-google-ai-studio-key" 5 | 6 | # Optional: Provider Preference and Model Mapping 7 | # Controls which provider (google or openai) is preferred for mapping haiku/sonnet. 8 | # Defaults to openai if not set. 9 | PREFERRED_PROVIDER="openai" 10 | 11 | # Optional: Specify the exact models to map haiku/sonnet to. 12 | # If PREFERRED_PROVIDER=google, these MUST be valid Gemini model names known to the server. 13 | # Defaults to gemini-2.5-pro-preview-03-25 and gemini-2.0-flash if PREFERRED_PROVIDER=google. 14 | # Defaults to gpt-4.1 and gpt-4.1-mini if PREFERRED_PROVIDER=openai. 15 | # BIG_MODEL="gpt-4.1" 16 | # SMALL_MODEL="gpt-4.1-mini" 17 | 18 | # Example Google mapping: 19 | # PREFERRED_PROVIDER="google" 20 | # BIG_MODEL="gemini-2.5-pro-preview-03-25" 21 | # SMALL_MODEL="gemini-2.0-flash" -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Environment variables 2 | .env 3 | 4 | # Python 5 | __pycache__/ 6 | *.py[cod] 7 | *$py.class 8 | *.so 9 | .Python 10 | build/ 11 | develop-eggs/ 12 | dist/ 13 | downloads/ 14 | eggs/ 15 | .eggs/ 16 | lib/ 17 | lib64/ 18 | parts/ 19 | sdist/ 20 | var/ 21 | wheels/ 22 | *.egg-info/ 23 | .installed.cfg 24 | *.egg 25 | 26 | # Virtual environments 27 | venv/ 28 | env/ 29 | ENV/ 30 | 31 | # Logs 32 | *.log 33 | 34 | # IDE specific files 35 | .idea/ 36 | .vscode/ 37 | *.swp 38 | *.swo -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- 1 | 3.10 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Anthropic API Proxy for Gemini & OpenAI Models 🔄 2 | 3 | **Use Anthropic clients (like Claude Code) with Gemini or OpenAI backends.** 🤝 4 | 5 | A proxy server that lets you use Anthropic clients with Gemini or OpenAI models via LiteLLM. 🌉 6 | 7 | 8 | ![Anthropic API Proxy](pic.png) 9 | 10 | ## Quick Start ⚡ 11 | 12 | ### Prerequisites 13 | 14 | - OpenAI API key 🔑 15 | - Google AI Studio (Gemini) API key (if using Google provider) 🔑 16 | - [uv](https://github.com/astral-sh/uv) installed. 17 | 18 | ### Setup 🛠️ 19 | 20 | 1. **Clone this repository**: 21 | ```bash 22 | git clone https://github.com/1rgs/claude-code-openai.git 23 | cd claude-code-openai 24 | ``` 25 | 26 | 2. **Install uv** (if you haven't already): 27 | ```bash 28 | curl -LsSf https://astral.sh/uv/install.sh | sh 29 | ``` 30 | *(`uv` will handle dependencies based on `pyproject.toml` when you run the server)* 31 | 32 | 3. **Configure Environment Variables**: 33 | Copy the example environment file: 34 | ```bash 35 | cp .env.example .env 36 | ``` 37 | Edit `.env` and fill in your API keys and model configurations: 38 | 39 | * `ANTHROPIC_API_KEY`: (Optional) Needed only if proxying *to* Anthropic models. 40 | * `OPENAI_API_KEY`: Your OpenAI API key (Required if using the default OpenAI preference or as fallback). 
41 | * `GEMINI_API_KEY`: Your Google AI Studio (Gemini) API key (Required if `PREFERRED_PROVIDER=google`).
42 | * `PREFERRED_PROVIDER` (Optional): Set to `openai` (default) or `google`. This determines the primary backend for mapping `haiku`/`sonnet`.
43 | * `BIG_MODEL` (Optional): The model to map `sonnet` requests to. Defaults to `gpt-4.1` (if `PREFERRED_PROVIDER=openai`) or `gemini-2.5-pro-preview-03-25`.
44 | * `SMALL_MODEL` (Optional): The model to map `haiku` requests to. Defaults to `gpt-4.1-mini` (if `PREFERRED_PROVIDER=openai`) or `gemini-2.0-flash`.
45 |
46 | **Mapping Logic:**
47 | - If `PREFERRED_PROVIDER=openai` (default), `haiku`/`sonnet` map to `SMALL_MODEL`/`BIG_MODEL` prefixed with `openai/`.
48 | - If `PREFERRED_PROVIDER=google`, `haiku`/`sonnet` map to `SMALL_MODEL`/`BIG_MODEL` prefixed with `gemini/` *if* those models are in the server's known `GEMINI_MODELS` list (otherwise they fall back to the OpenAI mapping).
49 |
50 | 4. **Run the server**:
51 | ```bash
52 | uv run uvicorn server:app --host 0.0.0.0 --port 8082 --reload
53 | ```
54 | *(`--reload` is optional, for development)*
55 |
56 | ### Using with Claude Code 🎮
57 |
58 | 1. **Install Claude Code** (if you haven't already):
59 | ```bash
60 | npm install -g @anthropic-ai/claude-code
61 | ```
62 |
63 | 2. **Connect to your proxy**:
64 | ```bash
65 | ANTHROPIC_BASE_URL=http://localhost:8082 claude
66 | ```
67 |
68 | 3. **That's it!** Your Claude Code client will now use the configured backend models (OpenAI by default) through the proxy. 🎯
69 |
70 | ## Model Mapping 🗺️
71 |
72 | The proxy automatically maps Claude models to either OpenAI or Gemini models based on the configured provider and `BIG_MODEL`/`SMALL_MODEL`:
73 |
74 | | Claude Model | Default Mapping | When BIG_MODEL/SMALL_MODEL is a Gemini model |
75 | |--------------|--------------|---------------------------|
76 | | haiku | openai/gpt-4.1-mini | gemini/[model-name] |
77 | | sonnet | openai/gpt-4.1 | gemini/[model-name] |
78 |
79 | ### Supported Models
80 |
81 | #### OpenAI Models
82 | The following OpenAI models are supported with automatic `openai/` prefix handling:
83 | - o3-mini
84 | - o1
85 | - o1-mini
86 | - o1-pro
87 | - gpt-4.5-preview
88 | - gpt-4o
89 | - gpt-4o-audio-preview
90 | - chatgpt-4o-latest
91 | - gpt-4o-mini
92 | - gpt-4o-mini-audio-preview
93 | - gpt-4.1
94 | - gpt-4.1-mini
95 |
96 | #### Gemini Models
97 | The following Gemini models are supported with automatic `gemini/` prefix handling:
98 | - gemini-2.5-pro-preview-03-25
99 | - gemini-2.0-flash
100 |
101 | ### Model Prefix Handling
102 | The proxy automatically adds the appropriate prefix to model names:
103 | - OpenAI models get the `openai/` prefix
104 | - Gemini models get the `gemini/` prefix
105 | - `BIG_MODEL` and `SMALL_MODEL` get the appropriate prefix based on whether they appear in the OpenAI or Gemini model lists
106 |
107 | For example:
108 | - `gpt-4o` becomes `openai/gpt-4o`
109 | - `gemini-2.5-pro-preview-03-25` becomes `gemini/gemini-2.5-pro-preview-03-25`
110 | - When `BIG_MODEL` is set to a Gemini model, Claude Sonnet will map to `gemini/[model-name]`
111 |
112 | ### Customizing Model Mapping
113 |
114 | Control the mapping using environment variables in your `.env` file (or set them directly in your shell):
115 |
116 | **Example 1: Default (Use OpenAI)**
117 | No changes needed in `.env` beyond the API keys, or ensure:
118 | ```dotenv
119 | OPENAI_API_KEY="your-openai-key"
120 | GEMINI_API_KEY="your-google-key" # Needed if PREFERRED_PROVIDER=google
121 | # PREFERRED_PROVIDER="openai" # Optional, it's the default
122 | # BIG_MODEL="gpt-4.1" # Optional, it's the default
123 | # SMALL_MODEL="gpt-4.1-mini" # Optional, it's the default
124 | ```
125 |
126 | **Example 2: Prefer Google**
127 | ```dotenv
128 | GEMINI_API_KEY="your-google-key"
129 | OPENAI_API_KEY="your-openai-key" # Needed for fallback
130 | PREFERRED_PROVIDER="google"
131 | # BIG_MODEL="gemini-2.5-pro-preview-03-25" # Optional, it's the default for Google pref
132 | # SMALL_MODEL="gemini-2.0-flash" # Optional, it's the default for Google pref
133 | ```
134 |
135 | **Example 3: Use Specific OpenAI Models**
136 | ```dotenv
137 | OPENAI_API_KEY="your-openai-key"
138 | GEMINI_API_KEY="your-google-key"
139 | PREFERRED_PROVIDER="openai"
140 | BIG_MODEL="gpt-4o" # Example specific model
141 | SMALL_MODEL="gpt-4o-mini" # Example specific model
142 | ```
143 |
144 | ## How It Works 🧩
145 |
146 | This proxy works by:
147 |
148 | 1. **Receiving requests** in Anthropic's API format 📥
149 | 2. **Translating** the requests to OpenAI format via LiteLLM 🔄
150 | 3. **Sending** the translated request to the configured backend (OpenAI or Gemini) 📤
151 | 4. **Converting** the response back to Anthropic format 🔄
152 | 5. **Returning** the formatted response to the client ✅
153 |
154 | The proxy handles both streaming and non-streaming responses, maintaining compatibility with all Claude clients. 🌊
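For a concrete sense of that round trip, here is a minimal sketch (not part of the repository) that posts an Anthropic-style request to a locally running proxy using `httpx`; the port, model name, and prompt are placeholders, and the printed model reflects whatever mapping you configured:

```python
import httpx

# Anthropic Messages API style payload; "sonnet" in the model name triggers the BIG_MODEL mapping.
payload = {
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 256,
    "messages": [
        {"role": "user", "content": "Say hello in one short sentence."}
    ],
}

# Assumes the proxy from the Quick Start is listening on http://localhost:8082.
resp = httpx.post("http://localhost:8082/v1/messages", json=payload, timeout=60.0)
resp.raise_for_status()

data = resp.json()
print(data["model"])               # e.g. "openai/gpt-4.1" with the default mapping
print(data["content"][0]["text"])  # the assistant reply, already back in Anthropic format
```

Streaming works the same way with `"stream": true`, in which case the proxy returns Anthropic-style server-sent events.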
155 |
156 | ## Contributing 🤝
157 |
158 | Contributions are welcome! Please feel free to submit a Pull Request. 🎁
159 |
--------------------------------------------------------------------------------
/pic.png:
--------------------------------------------------------------------------------
 https://raw.githubusercontent.com/1rgs/claude-code-proxy/e9c8cf8de6e8f11cf54dd677634e9796e040f2fd/pic.png
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [project]
2 | name = "anthropic-proxy"
3 | version = "0.1.0"
4 | description = "Proxy that translates between Anthropic API and LiteLLM"
5 | readme = "README.md"
6 | requires-python = ">=3.10"
7 | dependencies = [
8 |     "fastapi[standard]>=0.115.11",
9 |     "uvicorn>=0.34.0",
10 |     "httpx>=0.25.0",
11 |     "pydantic>=2.0.0",
12 |     "litellm>=1.40.14",
13 |     "python-dotenv>=1.0.0",
14 | ]
15 |
--------------------------------------------------------------------------------
/server.py:
--------------------------------------------------------------------------------
1 | from fastapi import FastAPI, Request, HTTPException
2 | import uvicorn
3 | import logging
4 | import json
5 | from pydantic import BaseModel, Field, field_validator
6 | from typing import List, Dict, Any, Optional, Union, Literal
7 | import httpx
8 | import os
9 | from fastapi.responses import JSONResponse, StreamingResponse
10 | import litellm
11 | import uuid
12 | import time
13 | from dotenv import load_dotenv
14 | import re
15 | from datetime import datetime
16 | import sys
17 |
18 | # Load environment variables from .env file
19 | load_dotenv()
20 |
21 | # Configure logging
22 | logging.basicConfig(
23 |     level=logging.WARN,  # Change to INFO level to show more details
24 |     format='%(asctime)s - %(levelname)s - %(message)s',
25 | )
26 | logger = logging.getLogger(__name__)
27 |
28 | # Configure uvicorn to be quieter
29 | import uvicorn
30 | # Tell uvicorn's loggers to be quiet
31 | logging.getLogger("uvicorn").setLevel(logging.WARNING)
32 | logging.getLogger("uvicorn.access").setLevel(logging.WARNING)
33 | logging.getLogger("uvicorn.error").setLevel(logging.WARNING)
34 |
35 | # Create a filter to block any log messages containing specific strings
36 | class MessageFilter(logging.Filter):
37 |     def filter(self, record):
38 |         # Block messages containing these strings
39 |         blocked_phrases = [
40 |             "LiteLLM completion()",
41 |             "HTTP Request:",
42 |             "selected model name for cost calculation",
43 |             "utils.py",
44 |             "cost_calculator"
45 |         ]
46 |
47 |         if hasattr(record, 'msg') and isinstance(record.msg, str):
48 |             for phrase in blocked_phrases:
49 |                 if phrase in record.msg:
50 |                     return False
51 |         return True
52 |
53 | # Apply the filter to the root logger to catch all messages
54 | root_logger = logging.getLogger()
55 | root_logger.addFilter(MessageFilter())
56 |
57 | # Custom formatter for model mapping logs
58 | class ColorizedFormatter(logging.Formatter):
59 |     """Custom formatter to highlight model mappings"""
60 |     BLUE = "\033[94m"
61 |     GREEN = "\033[92m"
62 |     YELLOW = "\033[93m"
63 |     RED = "\033[91m"
64 |     RESET = "\033[0m"
65 |     BOLD = "\033[1m"
66 |
67 |     def format(self, record):
68 |         if record.levelno == logging.DEBUG and "MODEL MAPPING" in record.msg:
69 |             # Apply colors and formatting to model mapping logs
70 |             return f"{self.BOLD}{self.GREEN}{record.msg}{self.RESET}"
71 |         return super().format(record)
72 |
73 | # Apply custom formatter to console handler
74 | for handler in logger.handlers:
75 |     if isinstance(handler, logging.StreamHandler):
76 |         handler.setFormatter(ColorizedFormatter('%(asctime)s - %(levelname)s - %(message)s'))
77 |
78 | app = FastAPI()
79 |
80 | # Get API keys from environment
81 | ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")
82 | OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
83 | GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
84 |
85 | # Get preferred provider (default to openai)
86 | PREFERRED_PROVIDER = os.environ.get("PREFERRED_PROVIDER", "openai").lower()
87 |
88 | # Get model mapping configuration from environment
89 | # Default to latest OpenAI models if not set
90 | BIG_MODEL = os.environ.get("BIG_MODEL", "gpt-4.1")
91 | SMALL_MODEL = os.environ.get("SMALL_MODEL", "gpt-4.1-mini")
92 |
93 | # List of OpenAI models
94 | OPENAI_MODELS = [
95 |     "o3-mini",
96 |     "o1",
97 |     "o1-mini",
98 |     "o1-pro",
99 |     "gpt-4.5-preview",
100 |     "gpt-4o",
101 |     "gpt-4o-audio-preview",
102 |     "chatgpt-4o-latest",
103 |     "gpt-4o-mini",
104 |     "gpt-4o-mini-audio-preview",
105 |     "gpt-4.1",  # Added default big model
106 |     "gpt-4.1-mini"  # Added default small model
107 | ]
108 |
109 | # List of Gemini models
110 | GEMINI_MODELS = [
111 |     "gemini-2.5-pro-preview-03-25",
112 |     "gemini-2.0-flash"
113 | ]
114 |
115 | # Helper function to clean schema for Gemini
116 | def clean_gemini_schema(schema: Any) -> Any:
117 |     """Recursively removes unsupported fields from a JSON schema for Gemini."""
118 |     if isinstance(schema, dict):
119 |         # Remove specific keys unsupported by Gemini tool parameters
120 |         schema.pop("additionalProperties", None)
121 |         schema.pop("default", None)
122 |
123 |         # Check for unsupported 'format' in string types
124 |         if schema.get("type") == "string" and "format" in schema:
125 |             allowed_formats = {"enum", "date-time"}
126 |             if schema["format"] not in allowed_formats:
127 |                 logger.debug(f"Removing unsupported format '{schema['format']}' for string type in Gemini schema.")
128 |                 schema.pop("format")
129 |
130 |         # Recursively clean nested schemas (properties, items, etc.)
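        # Illustrative example (hypothetical schema, not taken from a real tool):
        #   {"type": "object", "additionalProperties": False,
        #    "properties": {"when": {"type": "string", "format": "date"}}}
        # comes back as {"type": "object", "properties": {"when": {"type": "string"}}}:
        # "additionalProperties" is popped above, and the recursion below strips the
        # unsupported "format": "date" from the nested string property, while "enum"
        # and "date-time" formats would be preserved.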
131 | for key, value in list(schema.items()): # Use list() to allow modification during iteration 132 | schema[key] = clean_gemini_schema(value) 133 | elif isinstance(schema, list): 134 | # Recursively clean items in a list 135 | return [clean_gemini_schema(item) for item in schema] 136 | return schema 137 | 138 | # Models for Anthropic API requests 139 | class ContentBlockText(BaseModel): 140 | type: Literal["text"] 141 | text: str 142 | 143 | class ContentBlockImage(BaseModel): 144 | type: Literal["image"] 145 | source: Dict[str, Any] 146 | 147 | class ContentBlockToolUse(BaseModel): 148 | type: Literal["tool_use"] 149 | id: str 150 | name: str 151 | input: Dict[str, Any] 152 | 153 | class ContentBlockToolResult(BaseModel): 154 | type: Literal["tool_result"] 155 | tool_use_id: str 156 | content: Union[str, List[Dict[str, Any]], Dict[str, Any], List[Any], Any] 157 | 158 | class SystemContent(BaseModel): 159 | type: Literal["text"] 160 | text: str 161 | 162 | class Message(BaseModel): 163 | role: Literal["user", "assistant"] 164 | content: Union[str, List[Union[ContentBlockText, ContentBlockImage, ContentBlockToolUse, ContentBlockToolResult]]] 165 | 166 | class Tool(BaseModel): 167 | name: str 168 | description: Optional[str] = None 169 | input_schema: Dict[str, Any] 170 | 171 | class ThinkingConfig(BaseModel): 172 | enabled: bool 173 | 174 | class MessagesRequest(BaseModel): 175 | model: str 176 | max_tokens: int 177 | messages: List[Message] 178 | system: Optional[Union[str, List[SystemContent]]] = None 179 | stop_sequences: Optional[List[str]] = None 180 | stream: Optional[bool] = False 181 | temperature: Optional[float] = 1.0 182 | top_p: Optional[float] = None 183 | top_k: Optional[int] = None 184 | metadata: Optional[Dict[str, Any]] = None 185 | tools: Optional[List[Tool]] = None 186 | tool_choice: Optional[Dict[str, Any]] = None 187 | thinking: Optional[ThinkingConfig] = None 188 | original_model: Optional[str] = None # Will store the original model name 189 | 190 | @field_validator('model') 191 | def validate_model_field(cls, v, info): # Renamed to avoid conflict 192 | original_model = v 193 | new_model = v # Default to original value 194 | 195 | logger.debug(f"📋 MODEL VALIDATION: Original='{original_model}', Preferred='{PREFERRED_PROVIDER}', BIG='{BIG_MODEL}', SMALL='{SMALL_MODEL}'") 196 | 197 | # Remove provider prefixes for easier matching 198 | clean_v = v 199 | if clean_v.startswith('anthropic/'): 200 | clean_v = clean_v[10:] 201 | elif clean_v.startswith('openai/'): 202 | clean_v = clean_v[7:] 203 | elif clean_v.startswith('gemini/'): 204 | clean_v = clean_v[7:] 205 | 206 | # --- Mapping Logic --- START --- 207 | mapped = False 208 | # Map Haiku to SMALL_MODEL based on provider preference 209 | if 'haiku' in clean_v.lower(): 210 | if PREFERRED_PROVIDER == "google" and SMALL_MODEL in GEMINI_MODELS: 211 | new_model = f"gemini/{SMALL_MODEL}" 212 | mapped = True 213 | else: 214 | new_model = f"openai/{SMALL_MODEL}" 215 | mapped = True 216 | 217 | # Map Sonnet to BIG_MODEL based on provider preference 218 | elif 'sonnet' in clean_v.lower(): 219 | if PREFERRED_PROVIDER == "google" and BIG_MODEL in GEMINI_MODELS: 220 | new_model = f"gemini/{BIG_MODEL}" 221 | mapped = True 222 | else: 223 | new_model = f"openai/{BIG_MODEL}" 224 | mapped = True 225 | 226 | # Add prefixes to non-mapped models if they match known lists 227 | elif not mapped: 228 | if clean_v in GEMINI_MODELS and not v.startswith('gemini/'): 229 | new_model = f"gemini/{clean_v}" 230 | mapped = True # Technically mapped 
to add prefix 231 | elif clean_v in OPENAI_MODELS and not v.startswith('openai/'): 232 | new_model = f"openai/{clean_v}" 233 | mapped = True # Technically mapped to add prefix 234 | # --- Mapping Logic --- END --- 235 | 236 | if mapped: 237 | logger.debug(f"📌 MODEL MAPPING: '{original_model}' ➡️ '{new_model}'") 238 | else: 239 | # If no mapping occurred and no prefix exists, log warning or decide default 240 | if not v.startswith(('openai/', 'gemini/', 'anthropic/')): 241 | logger.warning(f"⚠️ No prefix or mapping rule for model: '{original_model}'. Using as is.") 242 | new_model = v # Ensure we return the original if no rule applied 243 | 244 | # Store the original model in the values dictionary 245 | values = info.data 246 | if isinstance(values, dict): 247 | values['original_model'] = original_model 248 | 249 | return new_model 250 | 251 | class TokenCountRequest(BaseModel): 252 | model: str 253 | messages: List[Message] 254 | system: Optional[Union[str, List[SystemContent]]] = None 255 | tools: Optional[List[Tool]] = None 256 | thinking: Optional[ThinkingConfig] = None 257 | tool_choice: Optional[Dict[str, Any]] = None 258 | original_model: Optional[str] = None # Will store the original model name 259 | 260 | @field_validator('model') 261 | def validate_model_token_count(cls, v, info): # Renamed to avoid conflict 262 | # Use the same logic as MessagesRequest validator 263 | # NOTE: Pydantic validators might not share state easily if not class methods 264 | # Re-implementing the logic here for clarity, could be refactored 265 | original_model = v 266 | new_model = v # Default to original value 267 | 268 | logger.debug(f"📋 TOKEN COUNT VALIDATION: Original='{original_model}', Preferred='{PREFERRED_PROVIDER}', BIG='{BIG_MODEL}', SMALL='{SMALL_MODEL}'") 269 | 270 | # Remove provider prefixes for easier matching 271 | clean_v = v 272 | if clean_v.startswith('anthropic/'): 273 | clean_v = clean_v[10:] 274 | elif clean_v.startswith('openai/'): 275 | clean_v = clean_v[7:] 276 | elif clean_v.startswith('gemini/'): 277 | clean_v = clean_v[7:] 278 | 279 | # --- Mapping Logic --- START --- 280 | mapped = False 281 | # Map Haiku to SMALL_MODEL based on provider preference 282 | if 'haiku' in clean_v.lower(): 283 | if PREFERRED_PROVIDER == "google" and SMALL_MODEL in GEMINI_MODELS: 284 | new_model = f"gemini/{SMALL_MODEL}" 285 | mapped = True 286 | else: 287 | new_model = f"openai/{SMALL_MODEL}" 288 | mapped = True 289 | 290 | # Map Sonnet to BIG_MODEL based on provider preference 291 | elif 'sonnet' in clean_v.lower(): 292 | if PREFERRED_PROVIDER == "google" and BIG_MODEL in GEMINI_MODELS: 293 | new_model = f"gemini/{BIG_MODEL}" 294 | mapped = True 295 | else: 296 | new_model = f"openai/{BIG_MODEL}" 297 | mapped = True 298 | 299 | # Add prefixes to non-mapped models if they match known lists 300 | elif not mapped: 301 | if clean_v in GEMINI_MODELS and not v.startswith('gemini/'): 302 | new_model = f"gemini/{clean_v}" 303 | mapped = True # Technically mapped to add prefix 304 | elif clean_v in OPENAI_MODELS and not v.startswith('openai/'): 305 | new_model = f"openai/{clean_v}" 306 | mapped = True # Technically mapped to add prefix 307 | # --- Mapping Logic --- END --- 308 | 309 | if mapped: 310 | logger.debug(f"📌 TOKEN COUNT MAPPING: '{original_model}' ➡️ '{new_model}'") 311 | else: 312 | if not v.startswith(('openai/', 'gemini/', 'anthropic/')): 313 | logger.warning(f"⚠️ No prefix or mapping rule for token count model: '{original_model}'. 
Using as is.") 314 | new_model = v # Ensure we return the original if no rule applied 315 | 316 | # Store the original model in the values dictionary 317 | values = info.data 318 | if isinstance(values, dict): 319 | values['original_model'] = original_model 320 | 321 | return new_model 322 | 323 | class TokenCountResponse(BaseModel): 324 | input_tokens: int 325 | 326 | class Usage(BaseModel): 327 | input_tokens: int 328 | output_tokens: int 329 | cache_creation_input_tokens: int = 0 330 | cache_read_input_tokens: int = 0 331 | 332 | class MessagesResponse(BaseModel): 333 | id: str 334 | model: str 335 | role: Literal["assistant"] = "assistant" 336 | content: List[Union[ContentBlockText, ContentBlockToolUse]] 337 | type: Literal["message"] = "message" 338 | stop_reason: Optional[Literal["end_turn", "max_tokens", "stop_sequence", "tool_use"]] = None 339 | stop_sequence: Optional[str] = None 340 | usage: Usage 341 | 342 | @app.middleware("http") 343 | async def log_requests(request: Request, call_next): 344 | # Get request details 345 | method = request.method 346 | path = request.url.path 347 | 348 | # Log only basic request details at debug level 349 | logger.debug(f"Request: {method} {path}") 350 | 351 | # Process the request and get the response 352 | response = await call_next(request) 353 | 354 | return response 355 | 356 | # Not using validation function as we're using the environment API key 357 | 358 | def parse_tool_result_content(content): 359 | """Helper function to properly parse and normalize tool result content.""" 360 | if content is None: 361 | return "No content provided" 362 | 363 | if isinstance(content, str): 364 | return content 365 | 366 | if isinstance(content, list): 367 | result = "" 368 | for item in content: 369 | if isinstance(item, dict) and item.get("type") == "text": 370 | result += item.get("text", "") + "\n" 371 | elif isinstance(item, str): 372 | result += item + "\n" 373 | elif isinstance(item, dict): 374 | if "text" in item: 375 | result += item.get("text", "") + "\n" 376 | else: 377 | try: 378 | result += json.dumps(item) + "\n" 379 | except: 380 | result += str(item) + "\n" 381 | else: 382 | try: 383 | result += str(item) + "\n" 384 | except: 385 | result += "Unparseable content\n" 386 | return result.strip() 387 | 388 | if isinstance(content, dict): 389 | if content.get("type") == "text": 390 | return content.get("text", "") 391 | try: 392 | return json.dumps(content) 393 | except: 394 | return str(content) 395 | 396 | # Fallback for any other type 397 | try: 398 | return str(content) 399 | except: 400 | return "Unparseable content" 401 | 402 | def convert_anthropic_to_litellm(anthropic_request: MessagesRequest) -> Dict[str, Any]: 403 | """Convert Anthropic API request format to LiteLLM format (which follows OpenAI).""" 404 | # LiteLLM already handles Anthropic models when using the format model="anthropic/claude-3-opus-20240229" 405 | # So we just need to convert our Pydantic model to a dict in the expected format 406 | 407 | messages = [] 408 | 409 | # Add system message if present 410 | if anthropic_request.system: 411 | # Handle different formats of system messages 412 | if isinstance(anthropic_request.system, str): 413 | # Simple string format 414 | messages.append({"role": "system", "content": anthropic_request.system}) 415 | elif isinstance(anthropic_request.system, list): 416 | # List of content blocks 417 | system_text = "" 418 | for block in anthropic_request.system: 419 | if hasattr(block, 'type') and block.type == "text": 420 | system_text 
+= block.text + "\n\n" 421 | elif isinstance(block, dict) and block.get("type") == "text": 422 | system_text += block.get("text", "") + "\n\n" 423 | 424 | if system_text: 425 | messages.append({"role": "system", "content": system_text.strip()}) 426 | 427 | # Add conversation messages 428 | for idx, msg in enumerate(anthropic_request.messages): 429 | content = msg.content 430 | if isinstance(content, str): 431 | messages.append({"role": msg.role, "content": content}) 432 | else: 433 | # Special handling for tool_result in user messages 434 | # OpenAI/LiteLLM format expects the assistant to call the tool, 435 | # and the user's next message to include the result as plain text 436 | if msg.role == "user" and any(block.type == "tool_result" for block in content if hasattr(block, "type")): 437 | # For user messages with tool_result, split into separate messages 438 | text_content = "" 439 | 440 | # Extract all text parts and concatenate them 441 | for block in content: 442 | if hasattr(block, "type"): 443 | if block.type == "text": 444 | text_content += block.text + "\n" 445 | elif block.type == "tool_result": 446 | # Add tool result as a message by itself - simulate the normal flow 447 | tool_id = block.tool_use_id if hasattr(block, "tool_use_id") else "" 448 | 449 | # Handle different formats of tool result content 450 | result_content = "" 451 | if hasattr(block, "content"): 452 | if isinstance(block.content, str): 453 | result_content = block.content 454 | elif isinstance(block.content, list): 455 | # If content is a list of blocks, extract text from each 456 | for content_block in block.content: 457 | if hasattr(content_block, "type") and content_block.type == "text": 458 | result_content += content_block.text + "\n" 459 | elif isinstance(content_block, dict) and content_block.get("type") == "text": 460 | result_content += content_block.get("text", "") + "\n" 461 | elif isinstance(content_block, dict): 462 | # Handle any dict by trying to extract text or convert to JSON 463 | if "text" in content_block: 464 | result_content += content_block.get("text", "") + "\n" 465 | else: 466 | try: 467 | result_content += json.dumps(content_block) + "\n" 468 | except: 469 | result_content += str(content_block) + "\n" 470 | elif isinstance(block.content, dict): 471 | # Handle dictionary content 472 | if block.content.get("type") == "text": 473 | result_content = block.content.get("text", "") 474 | else: 475 | try: 476 | result_content = json.dumps(block.content) 477 | except: 478 | result_content = str(block.content) 479 | else: 480 | # Handle any other type by converting to string 481 | try: 482 | result_content = str(block.content) 483 | except: 484 | result_content = "Unparseable content" 485 | 486 | # In OpenAI format, tool results come from the user (rather than being content blocks) 487 | text_content += f"Tool result for {tool_id}:\n{result_content}\n" 488 | 489 | # Add as a single user message with all the content 490 | messages.append({"role": "user", "content": text_content.strip()}) 491 | else: 492 | # Regular handling for other message types 493 | processed_content = [] 494 | for block in content: 495 | if hasattr(block, "type"): 496 | if block.type == "text": 497 | processed_content.append({"type": "text", "text": block.text}) 498 | elif block.type == "image": 499 | processed_content.append({"type": "image", "source": block.source}) 500 | elif block.type == "tool_use": 501 | # Handle tool use blocks if needed 502 | processed_content.append({ 503 | "type": "tool_use", 504 | "id": block.id, 
505 | "name": block.name, 506 | "input": block.input 507 | }) 508 | elif block.type == "tool_result": 509 | # Handle different formats of tool result content 510 | processed_content_block = { 511 | "type": "tool_result", 512 | "tool_use_id": block.tool_use_id if hasattr(block, "tool_use_id") else "" 513 | } 514 | 515 | # Process the content field properly 516 | if hasattr(block, "content"): 517 | if isinstance(block.content, str): 518 | # If it's a simple string, create a text block for it 519 | processed_content_block["content"] = [{"type": "text", "text": block.content}] 520 | elif isinstance(block.content, list): 521 | # If it's already a list of blocks, keep it 522 | processed_content_block["content"] = block.content 523 | else: 524 | # Default fallback 525 | processed_content_block["content"] = [{"type": "text", "text": str(block.content)}] 526 | else: 527 | # Default empty content 528 | processed_content_block["content"] = [{"type": "text", "text": ""}] 529 | 530 | processed_content.append(processed_content_block) 531 | 532 | messages.append({"role": msg.role, "content": processed_content}) 533 | 534 | # Cap max_tokens for OpenAI models to their limit of 16384 535 | max_tokens = anthropic_request.max_tokens 536 | if anthropic_request.model.startswith("openai/") or anthropic_request.model.startswith("gemini/"): 537 | max_tokens = min(max_tokens, 16384) 538 | logger.debug(f"Capping max_tokens to 16384 for OpenAI/Gemini model (original value: {anthropic_request.max_tokens})") 539 | 540 | # Create LiteLLM request dict 541 | litellm_request = { 542 | "model": anthropic_request.model, # t understands "anthropic/claude-x" format 543 | "messages": messages, 544 | "max_tokens": max_tokens, 545 | "temperature": anthropic_request.temperature, 546 | "stream": anthropic_request.stream, 547 | } 548 | 549 | # Add optional parameters if present 550 | if anthropic_request.stop_sequences: 551 | litellm_request["stop"] = anthropic_request.stop_sequences 552 | 553 | if anthropic_request.top_p: 554 | litellm_request["top_p"] = anthropic_request.top_p 555 | 556 | if anthropic_request.top_k: 557 | litellm_request["top_k"] = anthropic_request.top_k 558 | 559 | # Convert tools to OpenAI format 560 | if anthropic_request.tools: 561 | openai_tools = [] 562 | is_gemini_model = anthropic_request.model.startswith("gemini/") 563 | 564 | for tool in anthropic_request.tools: 565 | # Convert to dict if it's a pydantic model 566 | if hasattr(tool, 'dict'): 567 | tool_dict = tool.dict() 568 | else: 569 | # Ensure tool_dict is a dictionary, handle potential errors if 'tool' isn't dict-like 570 | try: 571 | tool_dict = dict(tool) if not isinstance(tool, dict) else tool 572 | except (TypeError, ValueError): 573 | logger.error(f"Could not convert tool to dict: {tool}") 574 | continue # Skip this tool if conversion fails 575 | 576 | # Clean the schema if targeting a Gemini model 577 | input_schema = tool_dict.get("input_schema", {}) 578 | if is_gemini_model: 579 | logger.debug(f"Cleaning schema for Gemini tool: {tool_dict.get('name')}") 580 | input_schema = clean_gemini_schema(input_schema) 581 | 582 | # Create OpenAI-compatible function tool 583 | openai_tool = { 584 | "type": "function", 585 | "function": { 586 | "name": tool_dict["name"], 587 | "description": tool_dict.get("description", ""), 588 | "parameters": input_schema # Use potentially cleaned schema 589 | } 590 | } 591 | openai_tools.append(openai_tool) 592 | 593 | litellm_request["tools"] = openai_tools 594 | 595 | # Convert tool_choice to OpenAI format if 
present 596 | if anthropic_request.tool_choice: 597 | if hasattr(anthropic_request.tool_choice, 'dict'): 598 | tool_choice_dict = anthropic_request.tool_choice.dict() 599 | else: 600 | tool_choice_dict = anthropic_request.tool_choice 601 | 602 | # Handle Anthropic's tool_choice format 603 | choice_type = tool_choice_dict.get("type") 604 | if choice_type == "auto": 605 | litellm_request["tool_choice"] = "auto" 606 | elif choice_type == "any": 607 | litellm_request["tool_choice"] = "any" 608 | elif choice_type == "tool" and "name" in tool_choice_dict: 609 | litellm_request["tool_choice"] = { 610 | "type": "function", 611 | "function": {"name": tool_choice_dict["name"]} 612 | } 613 | else: 614 | # Default to auto if we can't determine 615 | litellm_request["tool_choice"] = "auto" 616 | 617 | return litellm_request 618 | 619 | def convert_litellm_to_anthropic(litellm_response: Union[Dict[str, Any], Any], 620 | original_request: MessagesRequest) -> MessagesResponse: 621 | """Convert LiteLLM (OpenAI format) response to Anthropic API response format.""" 622 | 623 | # Enhanced response extraction with better error handling 624 | try: 625 | # Get the clean model name to check capabilities 626 | clean_model = original_request.model 627 | if clean_model.startswith("anthropic/"): 628 | clean_model = clean_model[len("anthropic/"):] 629 | elif clean_model.startswith("openai/"): 630 | clean_model = clean_model[len("openai/"):] 631 | 632 | # Check if this is a Claude model (which supports content blocks) 633 | is_claude_model = clean_model.startswith("claude-") 634 | 635 | # Handle ModelResponse object from LiteLLM 636 | if hasattr(litellm_response, 'choices') and hasattr(litellm_response, 'usage'): 637 | # Extract data from ModelResponse object directly 638 | choices = litellm_response.choices 639 | message = choices[0].message if choices and len(choices) > 0 else None 640 | content_text = message.content if message and hasattr(message, 'content') else "" 641 | tool_calls = message.tool_calls if message and hasattr(message, 'tool_calls') else None 642 | finish_reason = choices[0].finish_reason if choices and len(choices) > 0 else "stop" 643 | usage_info = litellm_response.usage 644 | response_id = getattr(litellm_response, 'id', f"msg_{uuid.uuid4()}") 645 | else: 646 | # For backward compatibility - handle dict responses 647 | # If response is a dict, use it, otherwise try to convert to dict 648 | try: 649 | response_dict = litellm_response if isinstance(litellm_response, dict) else litellm_response.dict() 650 | except AttributeError: 651 | # If .dict() fails, try to use model_dump or __dict__ 652 | try: 653 | response_dict = litellm_response.model_dump() if hasattr(litellm_response, 'model_dump') else litellm_response.__dict__ 654 | except AttributeError: 655 | # Fallback - manually extract attributes 656 | response_dict = { 657 | "id": getattr(litellm_response, 'id', f"msg_{uuid.uuid4()}"), 658 | "choices": getattr(litellm_response, 'choices', [{}]), 659 | "usage": getattr(litellm_response, 'usage', {}) 660 | } 661 | 662 | # Extract the content from the response dict 663 | choices = response_dict.get("choices", [{}]) 664 | message = choices[0].get("message", {}) if choices and len(choices) > 0 else {} 665 | content_text = message.get("content", "") 666 | tool_calls = message.get("tool_calls", None) 667 | finish_reason = choices[0].get("finish_reason", "stop") if choices and len(choices) > 0 else "stop" 668 | usage_info = response_dict.get("usage", {}) 669 | response_id = response_dict.get("id", 
f"msg_{uuid.uuid4()}") 670 | 671 | # Create content list for Anthropic format 672 | content = [] 673 | 674 | # Add text content block if present (text might be None or empty for pure tool call responses) 675 | if content_text is not None and content_text != "": 676 | content.append({"type": "text", "text": content_text}) 677 | 678 | # Add tool calls if present (tool_use in Anthropic format) - only for Claude models 679 | if tool_calls and is_claude_model: 680 | logger.debug(f"Processing tool calls: {tool_calls}") 681 | 682 | # Convert to list if it's not already 683 | if not isinstance(tool_calls, list): 684 | tool_calls = [tool_calls] 685 | 686 | for idx, tool_call in enumerate(tool_calls): 687 | logger.debug(f"Processing tool call {idx}: {tool_call}") 688 | 689 | # Extract function data based on whether it's a dict or object 690 | if isinstance(tool_call, dict): 691 | function = tool_call.get("function", {}) 692 | tool_id = tool_call.get("id", f"tool_{uuid.uuid4()}") 693 | name = function.get("name", "") 694 | arguments = function.get("arguments", "{}") 695 | else: 696 | function = getattr(tool_call, "function", None) 697 | tool_id = getattr(tool_call, "id", f"tool_{uuid.uuid4()}") 698 | name = getattr(function, "name", "") if function else "" 699 | arguments = getattr(function, "arguments", "{}") if function else "{}" 700 | 701 | # Convert string arguments to dict if needed 702 | if isinstance(arguments, str): 703 | try: 704 | arguments = json.loads(arguments) 705 | except json.JSONDecodeError: 706 | logger.warning(f"Failed to parse tool arguments as JSON: {arguments}") 707 | arguments = {"raw": arguments} 708 | 709 | logger.debug(f"Adding tool_use block: id={tool_id}, name={name}, input={arguments}") 710 | 711 | content.append({ 712 | "type": "tool_use", 713 | "id": tool_id, 714 | "name": name, 715 | "input": arguments 716 | }) 717 | elif tool_calls and not is_claude_model: 718 | # For non-Claude models, convert tool calls to text format 719 | logger.debug(f"Converting tool calls to text for non-Claude model: {clean_model}") 720 | 721 | # We'll append tool info to the text content 722 | tool_text = "\n\nTool usage:\n" 723 | 724 | # Convert to list if it's not already 725 | if not isinstance(tool_calls, list): 726 | tool_calls = [tool_calls] 727 | 728 | for idx, tool_call in enumerate(tool_calls): 729 | # Extract function data based on whether it's a dict or object 730 | if isinstance(tool_call, dict): 731 | function = tool_call.get("function", {}) 732 | tool_id = tool_call.get("id", f"tool_{uuid.uuid4()}") 733 | name = function.get("name", "") 734 | arguments = function.get("arguments", "{}") 735 | else: 736 | function = getattr(tool_call, "function", None) 737 | tool_id = getattr(tool_call, "id", f"tool_{uuid.uuid4()}") 738 | name = getattr(function, "name", "") if function else "" 739 | arguments = getattr(function, "arguments", "{}") if function else "{}" 740 | 741 | # Convert string arguments to dict if needed 742 | if isinstance(arguments, str): 743 | try: 744 | args_dict = json.loads(arguments) 745 | arguments_str = json.dumps(args_dict, indent=2) 746 | except json.JSONDecodeError: 747 | arguments_str = arguments 748 | else: 749 | arguments_str = json.dumps(arguments, indent=2) 750 | 751 | tool_text += f"Tool: {name}\nArguments: {arguments_str}\n\n" 752 | 753 | # Add or append tool text to content 754 | if content and content[0]["type"] == "text": 755 | content[0]["text"] += tool_text 756 | else: 757 | content.append({"type": "text", "text": tool_text}) 758 | 759 | # Get usage 
information - extract values safely from object or dict 760 | if isinstance(usage_info, dict): 761 | prompt_tokens = usage_info.get("prompt_tokens", 0) 762 | completion_tokens = usage_info.get("completion_tokens", 0) 763 | else: 764 | prompt_tokens = getattr(usage_info, "prompt_tokens", 0) 765 | completion_tokens = getattr(usage_info, "completion_tokens", 0) 766 | 767 | # Map OpenAI finish_reason to Anthropic stop_reason 768 | stop_reason = None 769 | if finish_reason == "stop": 770 | stop_reason = "end_turn" 771 | elif finish_reason == "length": 772 | stop_reason = "max_tokens" 773 | elif finish_reason == "tool_calls": 774 | stop_reason = "tool_use" 775 | else: 776 | stop_reason = "end_turn" # Default 777 | 778 | # Make sure content is never empty 779 | if not content: 780 | content.append({"type": "text", "text": ""}) 781 | 782 | # Create Anthropic-style response 783 | anthropic_response = MessagesResponse( 784 | id=response_id, 785 | model=original_request.model, 786 | role="assistant", 787 | content=content, 788 | stop_reason=stop_reason, 789 | stop_sequence=None, 790 | usage=Usage( 791 | input_tokens=prompt_tokens, 792 | output_tokens=completion_tokens 793 | ) 794 | ) 795 | 796 | return anthropic_response 797 | 798 | except Exception as e: 799 | import traceback 800 | error_traceback = traceback.format_exc() 801 | error_message = f"Error converting response: {str(e)}\n\nFull traceback:\n{error_traceback}" 802 | logger.error(error_message) 803 | 804 | # In case of any error, create a fallback response 805 | return MessagesResponse( 806 | id=f"msg_{uuid.uuid4()}", 807 | model=original_request.model, 808 | role="assistant", 809 | content=[{"type": "text", "text": f"Error converting response: {str(e)}. Please check server logs."}], 810 | stop_reason="end_turn", 811 | usage=Usage(input_tokens=0, output_tokens=0) 812 | ) 813 | 814 | async def handle_streaming(response_generator, original_request: MessagesRequest): 815 | """Handle streaming responses from LiteLLM and convert to Anthropic format.""" 816 | try: 817 | # Send message_start event 818 | message_id = f"msg_{uuid.uuid4().hex[:24]}" # Format similar to Anthropic's IDs 819 | 820 | message_data = { 821 | 'type': 'message_start', 822 | 'message': { 823 | 'id': message_id, 824 | 'type': 'message', 825 | 'role': 'assistant', 826 | 'model': original_request.model, 827 | 'content': [], 828 | 'stop_reason': None, 829 | 'stop_sequence': None, 830 | 'usage': { 831 | 'input_tokens': 0, 832 | 'cache_creation_input_tokens': 0, 833 | 'cache_read_input_tokens': 0, 834 | 'output_tokens': 0 835 | } 836 | } 837 | } 838 | yield f"event: message_start\ndata: {json.dumps(message_data)}\n\n" 839 | 840 | # Content block index for the first text block 841 | yield f"event: content_block_start\ndata: {json.dumps({'type': 'content_block_start', 'index': 0, 'content_block': {'type': 'text', 'text': ''}})}\n\n" 842 | 843 | # Send a ping to keep the connection alive (Anthropic does this) 844 | yield f"event: ping\ndata: {json.dumps({'type': 'ping'})}\n\n" 845 | 846 | tool_index = None 847 | current_tool_call = None 848 | tool_content = "" 849 | accumulated_text = "" # Track accumulated text content 850 | text_sent = False # Track if we've sent any text content 851 | text_block_closed = False # Track if text block is closed 852 | input_tokens = 0 853 | output_tokens = 0 854 | has_sent_stop_reason = False 855 | last_tool_index = 0 856 | 857 | # Process each chunk 858 | async for chunk in response_generator: 859 | try: 860 | 861 | 862 | # Check if this is the end of 
the response with usage data 863 | if hasattr(chunk, 'usage') and chunk.usage is not None: 864 | if hasattr(chunk.usage, 'prompt_tokens'): 865 | input_tokens = chunk.usage.prompt_tokens 866 | if hasattr(chunk.usage, 'completion_tokens'): 867 | output_tokens = chunk.usage.completion_tokens 868 | 869 | # Handle text content 870 | if hasattr(chunk, 'choices') and len(chunk.choices) > 0: 871 | choice = chunk.choices[0] 872 | 873 | # Get the delta from the choice 874 | if hasattr(choice, 'delta'): 875 | delta = choice.delta 876 | else: 877 | # If no delta, try to get message 878 | delta = getattr(choice, 'message', {}) 879 | 880 | # Check for finish_reason to know when we're done 881 | finish_reason = getattr(choice, 'finish_reason', None) 882 | 883 | # Process text content 884 | delta_content = None 885 | 886 | # Handle different formats of delta content 887 | if hasattr(delta, 'content'): 888 | delta_content = delta.content 889 | elif isinstance(delta, dict) and 'content' in delta: 890 | delta_content = delta['content'] 891 | 892 | # Accumulate text content 893 | if delta_content is not None and delta_content != "": 894 | accumulated_text += delta_content 895 | 896 | # Always emit text deltas if no tool calls started 897 | if tool_index is None and not text_block_closed: 898 | text_sent = True 899 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': delta_content}})}\n\n" 900 | 901 | # Process tool calls 902 | delta_tool_calls = None 903 | 904 | # Handle different formats of tool calls 905 | if hasattr(delta, 'tool_calls'): 906 | delta_tool_calls = delta.tool_calls 907 | elif isinstance(delta, dict) and 'tool_calls' in delta: 908 | delta_tool_calls = delta['tool_calls'] 909 | 910 | # Process tool calls if any 911 | if delta_tool_calls: 912 | # First tool call we've seen - need to handle text properly 913 | if tool_index is None: 914 | # If we've been streaming text, close that text block 915 | if text_sent and not text_block_closed: 916 | text_block_closed = True 917 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 918 | # If we've accumulated text but not sent it, we need to emit it now 919 | # This handles the case where the first delta has both text and a tool call 920 | elif accumulated_text and not text_sent and not text_block_closed: 921 | # Send the accumulated text 922 | text_sent = True 923 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': accumulated_text}})}\n\n" 924 | # Close the text block 925 | text_block_closed = True 926 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 927 | # Close text block even if we haven't sent anything - models sometimes emit empty text blocks 928 | elif not text_block_closed: 929 | text_block_closed = True 930 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 931 | 932 | # Convert to list if it's not already 933 | if not isinstance(delta_tool_calls, list): 934 | delta_tool_calls = [delta_tool_calls] 935 | 936 | for tool_call in delta_tool_calls: 937 | # Get the index of this tool call (for multiple tools) 938 | current_index = None 939 | if isinstance(tool_call, dict) and 'index' in tool_call: 940 | current_index = tool_call['index'] 941 | elif hasattr(tool_call, 'index'): 942 | current_index = tool_call.index 943 
| else: 944 | current_index = 0 945 | 946 | # Check if this is a new tool or a continuation 947 | if tool_index is None or current_index != tool_index: 948 | # New tool call - create a new tool_use block 949 | tool_index = current_index 950 | last_tool_index += 1 951 | anthropic_tool_index = last_tool_index 952 | 953 | # Extract function info 954 | if isinstance(tool_call, dict): 955 | function = tool_call.get('function', {}) 956 | name = function.get('name', '') if isinstance(function, dict) else "" 957 | tool_id = tool_call.get('id', f"toolu_{uuid.uuid4().hex[:24]}") 958 | else: 959 | function = getattr(tool_call, 'function', None) 960 | name = getattr(function, 'name', '') if function else '' 961 | tool_id = getattr(tool_call, 'id', f"toolu_{uuid.uuid4().hex[:24]}") 962 | 963 | # Start a new tool_use block 964 | yield f"event: content_block_start\ndata: {json.dumps({'type': 'content_block_start', 'index': anthropic_tool_index, 'content_block': {'type': 'tool_use', 'id': tool_id, 'name': name, 'input': {}}})}\n\n" 965 | current_tool_call = tool_call 966 | tool_content = "" 967 | 968 | # Extract function arguments 969 | arguments = None 970 | if isinstance(tool_call, dict) and 'function' in tool_call: 971 | function = tool_call.get('function', {}) 972 | arguments = function.get('arguments', '') if isinstance(function, dict) else '' 973 | elif hasattr(tool_call, 'function'): 974 | function = getattr(tool_call, 'function', None) 975 | arguments = getattr(function, 'arguments', '') if function else '' 976 | 977 | # If we have arguments, send them as a delta 978 | if arguments: 979 | # Try to detect if arguments are valid JSON or just a fragment 980 | try: 981 | # If it's already a dict, use it 982 | if isinstance(arguments, dict): 983 | args_json = json.dumps(arguments) 984 | else: 985 | # Otherwise, try to parse it 986 | json.loads(arguments) 987 | args_json = arguments 988 | except (json.JSONDecodeError, TypeError): 989 | # If it's a fragment, treat it as a string 990 | args_json = arguments 991 | 992 | # Add to accumulated tool content 993 | tool_content += args_json if isinstance(args_json, str) else "" 994 | 995 | # Send the update 996 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': anthropic_tool_index, 'delta': {'type': 'input_json_delta', 'partial_json': args_json}})}\n\n" 997 | 998 | # Process finish_reason - end the streaming response 999 | if finish_reason and not has_sent_stop_reason: 1000 | has_sent_stop_reason = True 1001 | 1002 | # Close any open tool call blocks 1003 | if tool_index is not None: 1004 | for i in range(1, last_tool_index + 1): 1005 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': i})}\n\n" 1006 | 1007 | # If we accumulated text but never sent or closed text block, do it now 1008 | if not text_block_closed: 1009 | if accumulated_text and not text_sent: 1010 | # Send the accumulated text 1011 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': accumulated_text}})}\n\n" 1012 | # Close the text block 1013 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 1014 | 1015 | # Map OpenAI finish_reason to Anthropic stop_reason 1016 | stop_reason = "end_turn" 1017 | if finish_reason == "length": 1018 | stop_reason = "max_tokens" 1019 | elif finish_reason == "tool_calls": 1020 | stop_reason = "tool_use" 1021 | elif finish_reason == "stop": 
1022 | stop_reason = "end_turn" 1023 | 1024 | # Send message_delta with stop reason and usage 1025 | usage = {"output_tokens": output_tokens} 1026 | 1027 | yield f"event: message_delta\ndata: {json.dumps({'type': 'message_delta', 'delta': {'stop_reason': stop_reason, 'stop_sequence': None}, 'usage': usage})}\n\n" 1028 | 1029 | # Send message_stop event 1030 | yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n" 1031 | 1032 | # Send final [DONE] marker to match Anthropic's behavior 1033 | yield "data: [DONE]\n\n" 1034 | return 1035 | except Exception as e: 1036 | # Log error but continue processing other chunks 1037 | logger.error(f"Error processing chunk: {str(e)}") 1038 | continue 1039 | 1040 | # If we didn't get a finish reason, close any open blocks 1041 | if not has_sent_stop_reason: 1042 | # Close any open tool call blocks 1043 | if tool_index is not None: 1044 | for i in range(1, last_tool_index + 1): 1045 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': i})}\n\n" 1046 | 1047 | # Close the text content block 1048 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 1049 | 1050 | # Send final message_delta with usage 1051 | usage = {"output_tokens": output_tokens} 1052 | 1053 | yield f"event: message_delta\ndata: {json.dumps({'type': 'message_delta', 'delta': {'stop_reason': 'end_turn', 'stop_sequence': None}, 'usage': usage})}\n\n" 1054 | 1055 | # Send message_stop event 1056 | yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n" 1057 | 1058 | # Send final [DONE] marker to match Anthropic's behavior 1059 | yield "data: [DONE]\n\n" 1060 | 1061 | except Exception as e: 1062 | import traceback 1063 | error_traceback = traceback.format_exc() 1064 | error_message = f"Error in streaming: {str(e)}\n\nFull traceback:\n{error_traceback}" 1065 | logger.error(error_message) 1066 | 1067 | # Send error message_delta 1068 | yield f"event: message_delta\ndata: {json.dumps({'type': 'message_delta', 'delta': {'stop_reason': 'error', 'stop_sequence': None}, 'usage': {'output_tokens': 0}})}\n\n" 1069 | 1070 | # Send message_stop event 1071 | yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n" 1072 | 1073 | # Send final [DONE] marker 1074 | yield "data: [DONE]\n\n" 1075 | 1076 | @app.post("/v1/messages") 1077 | async def create_message( 1078 | request: MessagesRequest, 1079 | raw_request: Request 1080 | ): 1081 | try: 1082 | # print the body here 1083 | body = await raw_request.body() 1084 | 1085 | # Parse the raw body as JSON since it's bytes 1086 | body_json = json.loads(body.decode('utf-8')) 1087 | original_model = body_json.get("model", "unknown") 1088 | 1089 | # Get the display name for logging, just the model name without provider prefix 1090 | display_model = original_model 1091 | if "/" in display_model: 1092 | display_model = display_model.split("/")[-1] 1093 | 1094 | # Clean model name for capability check 1095 | clean_model = request.model 1096 | if clean_model.startswith("anthropic/"): 1097 | clean_model = clean_model[len("anthropic/"):] 1098 | elif clean_model.startswith("openai/"): 1099 | clean_model = clean_model[len("openai/"):] 1100 | 1101 | logger.debug(f"📊 PROCESSING REQUEST: Model={request.model}, Stream={request.stream}") 1102 | 1103 | # Convert Anthropic request to LiteLLM format 1104 | litellm_request = convert_anthropic_to_litellm(request) 1105 | 1106 | # Determine which API key to use based on the model 
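# Note: the key selected below is passed to litellm.completion()/litellm.acompletion()
# via the "api_key" entry of litellm_request, so each request carries the credentials
# for whichever provider prefix ("openai/", "gemini/", or Anthropic passthrough) it mapped to.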
1107 | if request.model.startswith("openai/"): 1108 | litellm_request["api_key"] = OPENAI_API_KEY 1109 | logger.debug(f"Using OpenAI API key for model: {request.model}") 1110 | elif request.model.startswith("gemini/"): 1111 | litellm_request["api_key"] = GEMINI_API_KEY 1112 | logger.debug(f"Using Gemini API key for model: {request.model}") 1113 | else: 1114 | litellm_request["api_key"] = ANTHROPIC_API_KEY 1115 | logger.debug(f"Using Anthropic API key for model: {request.model}") 1116 | 1117 | # For OpenAI models - modify request format to work with limitations 1118 | if "openai" in litellm_request["model"] and "messages" in litellm_request: 1119 | logger.debug(f"Processing OpenAI model request: {litellm_request['model']}") 1120 | 1121 | # For OpenAI models, we need to convert content blocks to simple strings 1122 | # and handle other requirements 1123 | for i, msg in enumerate(litellm_request["messages"]): 1124 | # Special case - handle message content directly when it's a list of tool_result 1125 | # This is a specific case we're seeing in the error 1126 | if "content" in msg and isinstance(msg["content"], list): 1127 | is_only_tool_result = True 1128 | for block in msg["content"]: 1129 | if not isinstance(block, dict) or block.get("type") != "tool_result": 1130 | is_only_tool_result = False 1131 | break 1132 | 1133 | if is_only_tool_result and len(msg["content"]) > 0: 1134 | logger.warning(f"Found message with only tool_result content - special handling required") 1135 | # Extract the content from all tool_result blocks 1136 | all_text = "" 1137 | for block in msg["content"]: 1138 | all_text += "Tool Result:\n" 1139 | result_content = block.get("content", []) 1140 | 1141 | # Handle different formats of content 1142 | if isinstance(result_content, list): 1143 | for item in result_content: 1144 | if isinstance(item, dict) and item.get("type") == "text": 1145 | all_text += item.get("text", "") + "\n" 1146 | elif isinstance(item, dict): 1147 | # Fall back to string representation of any dict 1148 | try: 1149 | item_text = item.get("text", json.dumps(item)) 1150 | all_text += item_text + "\n" 1151 | except: 1152 | all_text += str(item) + "\n" 1153 | elif isinstance(result_content, str): 1154 | all_text += result_content + "\n" 1155 | else: 1156 | try: 1157 | all_text += json.dumps(result_content) + "\n" 1158 | except: 1159 | all_text += str(result_content) + "\n" 1160 | 1161 | # Replace the list with extracted text 1162 | litellm_request["messages"][i]["content"] = all_text.strip() or "..." 1163 | logger.warning(f"Converted tool_result to plain text: {all_text.strip()[:200]}...") 1164 | continue # Skip normal processing for this message 1165 | 1166 | # 1. 
Handle content field - normal case 1167 | if "content" in msg: 1168 | # Check if content is a list (content blocks) 1169 | if isinstance(msg["content"], list): 1170 | # Convert complex content blocks to simple string 1171 | text_content = "" 1172 | for block in msg["content"]: 1173 | if isinstance(block, dict): 1174 | # Handle different content block types 1175 | if block.get("type") == "text": 1176 | text_content += block.get("text", "") + "\n" 1177 | 1178 | # Handle tool_result content blocks - extract nested text 1179 | elif block.get("type") == "tool_result": 1180 | tool_id = block.get("tool_use_id", "unknown") 1181 | text_content += f"[Tool Result ID: {tool_id}]\n" 1182 | 1183 | # Extract text from the tool_result content 1184 | result_content = block.get("content", []) 1185 | if isinstance(result_content, list): 1186 | for item in result_content: 1187 | if isinstance(item, dict) and item.get("type") == "text": 1188 | text_content += item.get("text", "") + "\n" 1189 | elif isinstance(item, dict): 1190 | # Handle any dict by trying to extract text or convert to JSON 1191 | if "text" in item: 1192 | text_content += item.get("text", "") + "\n" 1193 | else: 1194 | try: 1195 | text_content += json.dumps(item) + "\n" 1196 | except: 1197 | text_content += str(item) + "\n" 1198 | elif isinstance(result_content, dict): 1199 | # Handle dictionary content 1200 | if result_content.get("type") == "text": 1201 | text_content += result_content.get("text", "") + "\n" 1202 | else: 1203 | try: 1204 | text_content += json.dumps(result_content) + "\n" 1205 | except: 1206 | text_content += str(result_content) + "\n" 1207 | elif isinstance(result_content, str): 1208 | text_content += result_content + "\n" 1209 | else: 1210 | try: 1211 | text_content += json.dumps(result_content) + "\n" 1212 | except: 1213 | text_content += str(result_content) + "\n" 1214 | 1215 | # Handle tool_use content blocks 1216 | elif block.get("type") == "tool_use": 1217 | tool_name = block.get("name", "unknown") 1218 | tool_id = block.get("id", "unknown") 1219 | tool_input = json.dumps(block.get("input", {})) 1220 | text_content += f"[Tool: {tool_name} (ID: {tool_id})]\nInput: {tool_input}\n\n" 1221 | 1222 | # Handle image content blocks 1223 | elif block.get("type") == "image": 1224 | text_content += "[Image content - not displayed in text format]\n" 1225 | 1226 | # Make sure content is never empty for OpenAI models 1227 | if not text_content.strip(): 1228 | text_content = "..." 1229 | 1230 | litellm_request["messages"][i]["content"] = text_content.strip() 1231 | # Also check for None or empty string content 1232 | elif msg["content"] is None: 1233 | litellm_request["messages"][i]["content"] = "..." # Empty content not allowed 1234 | 1235 | # 2. Remove any fields OpenAI doesn't support in messages 1236 | for key in list(msg.keys()): 1237 | if key not in ["role", "content", "name", "tool_call_id", "tool_calls"]: 1238 | logger.warning(f"Removing unsupported field from message: {key}") 1239 | del msg[key] 1240 | 1241 | # 3. 
Final validation - check for any remaining invalid values and dump full message details 1242 | for i, msg in enumerate(litellm_request["messages"]): 1243 | # Log the message format for debugging 1244 | logger.debug(f"Message {i} format check - role: {msg.get('role')}, content type: {type(msg.get('content'))}") 1245 | 1246 | # If content is still a list or None, replace with placeholder 1247 | if isinstance(msg.get("content"), list): 1248 | logger.warning(f"CRITICAL: Message {i} still has list content after processing: {json.dumps(msg.get('content'))}") 1249 | # Last resort - stringify the entire content as JSON 1250 | litellm_request["messages"][i]["content"] = f"Content as JSON: {json.dumps(msg.get('content'))}" 1251 | elif msg.get("content") is None: 1252 | logger.warning(f"Message {i} has None content - replacing with placeholder") 1253 | litellm_request["messages"][i]["content"] = "..." # Fallback placeholder 1254 | 1255 | # Only log basic info about the request, not the full details 1256 | logger.debug(f"Request for model: {litellm_request.get('model')}, stream: {litellm_request.get('stream', False)}") 1257 | 1258 | # Handle streaming mode 1259 | if request.stream: 1260 | # Use LiteLLM for streaming 1261 | num_tools = len(request.tools) if request.tools else 0 1262 | 1263 | log_request_beautifully( 1264 | "POST", 1265 | raw_request.url.path, 1266 | display_model, 1267 | litellm_request.get('model'), 1268 | len(litellm_request['messages']), 1269 | num_tools, 1270 | 200 # Assuming success at this point 1271 | ) 1272 | # Ensure we use the async version for streaming 1273 | response_generator = await litellm.acompletion(**litellm_request) 1274 | 1275 | return StreamingResponse( 1276 | handle_streaming(response_generator, request), 1277 | media_type="text/event-stream" 1278 | ) 1279 | else: 1280 | # Use LiteLLM for regular completion 1281 | num_tools = len(request.tools) if request.tools else 0 1282 | 1283 | log_request_beautifully( 1284 | "POST", 1285 | raw_request.url.path, 1286 | display_model, 1287 | litellm_request.get('model'), 1288 | len(litellm_request['messages']), 1289 | num_tools, 1290 | 200 # Assuming success at this point 1291 | ) 1292 | start_time = time.time() 1293 | litellm_response = litellm.completion(**litellm_request) 1294 | logger.debug(f"✅ RESPONSE RECEIVED: Model={litellm_request.get('model')}, Time={time.time() - start_time:.2f}s") 1295 | 1296 | # Convert LiteLLM response to Anthropic format 1297 | anthropic_response = convert_litellm_to_anthropic(litellm_response, request) 1298 | 1299 | return anthropic_response 1300 | 1301 | except Exception as e: 1302 | import traceback 1303 | error_traceback = traceback.format_exc() 1304 | 1305 | # Capture as much info as possible about the error 1306 | error_details = { 1307 | "error": str(e), 1308 | "type": type(e).__name__, 1309 | "traceback": error_traceback 1310 | } 1311 | 1312 | # Check for LiteLLM-specific attributes 1313 | for attr in ['message', 'status_code', 'response', 'llm_provider', 'model']: 1314 | if hasattr(e, attr): 1315 | error_details[attr] = getattr(e, attr) 1316 | 1317 | # Check for additional exception details in dictionaries 1318 | if hasattr(e, '__dict__'): 1319 | for key, value in e.__dict__.items(): 1320 | if key not in error_details and key not in ['args', '__traceback__']: 1321 | error_details[key] = str(value) 1322 | 1323 | # Log all error details 1324 | logger.error(f"Error processing request: {json.dumps(error_details, indent=2)}") 1325 | 1326 | # Format error for response 1327 | error_message = 
f"Error: {str(e)}" 1328 | if 'message' in error_details and error_details['message']: 1329 | error_message += f"\nMessage: {error_details['message']}" 1330 | if 'response' in error_details and error_details['response']: 1331 | error_message += f"\nResponse: {error_details['response']}" 1332 | 1333 | # Return detailed error 1334 | status_code = error_details.get('status_code', 500) 1335 | raise HTTPException(status_code=status_code, detail=error_message) 1336 | 1337 | @app.post("/v1/messages/count_tokens") 1338 | async def count_tokens( 1339 | request: TokenCountRequest, 1340 | raw_request: Request 1341 | ): 1342 | try: 1343 | # Log the incoming token count request 1344 | original_model = request.original_model or request.model 1345 | 1346 | # Get the display name for logging, just the model name without provider prefix 1347 | display_model = original_model 1348 | if "/" in display_model: 1349 | display_model = display_model.split("/")[-1] 1350 | 1351 | # Clean model name for capability check 1352 | clean_model = request.model 1353 | if clean_model.startswith("anthropic/"): 1354 | clean_model = clean_model[len("anthropic/"):] 1355 | elif clean_model.startswith("openai/"): 1356 | clean_model = clean_model[len("openai/"):] 1357 | 1358 | # Convert the messages to a format LiteLLM can understand 1359 | converted_request = convert_anthropic_to_litellm( 1360 | MessagesRequest( 1361 | model=request.model, 1362 | max_tokens=100, # Arbitrary value not used for token counting 1363 | messages=request.messages, 1364 | system=request.system, 1365 | tools=request.tools, 1366 | tool_choice=request.tool_choice, 1367 | thinking=request.thinking 1368 | ) 1369 | ) 1370 | 1371 | # Use LiteLLM's token_counter function 1372 | try: 1373 | # Import token_counter function 1374 | from litellm import token_counter 1375 | 1376 | # Log the request beautifully 1377 | num_tools = len(request.tools) if request.tools else 0 1378 | 1379 | log_request_beautifully( 1380 | "POST", 1381 | raw_request.url.path, 1382 | display_model, 1383 | converted_request.get('model'), 1384 | len(converted_request['messages']), 1385 | num_tools, 1386 | 200 # Assuming success at this point 1387 | ) 1388 | 1389 | # Count tokens 1390 | token_count = token_counter( 1391 | model=converted_request["model"], 1392 | messages=converted_request["messages"], 1393 | ) 1394 | 1395 | # Return Anthropic-style response 1396 | return TokenCountResponse(input_tokens=token_count) 1397 | 1398 | except ImportError: 1399 | logger.error("Could not import token_counter from litellm") 1400 | # Fallback to a simple approximation 1401 | return TokenCountResponse(input_tokens=1000) # Default fallback 1402 | 1403 | except Exception as e: 1404 | import traceback 1405 | error_traceback = traceback.format_exc() 1406 | logger.error(f"Error counting tokens: {str(e)}\n{error_traceback}") 1407 | raise HTTPException(status_code=500, detail=f"Error counting tokens: {str(e)}") 1408 | 1409 | @app.get("/") 1410 | async def root(): 1411 | return {"message": "Anthropic Proxy for LiteLLM"} 1412 | 1413 | # Define ANSI color codes for terminal output 1414 | class Colors: 1415 | CYAN = "\033[96m" 1416 | BLUE = "\033[94m" 1417 | GREEN = "\033[92m" 1418 | YELLOW = "\033[93m" 1419 | RED = "\033[91m" 1420 | MAGENTA = "\033[95m" 1421 | RESET = "\033[0m" 1422 | BOLD = "\033[1m" 1423 | UNDERLINE = "\033[4m" 1424 | DIM = "\033[2m" 1425 | def log_request_beautifully(method, path, claude_model, openai_model, num_messages, num_tools, status_code): 1426 | """Log requests in a beautiful, 
twitter-friendly format showing Claude to OpenAI mapping.""" 1427 | # Format the Claude model name nicely 1428 | claude_display = f"{Colors.CYAN}{claude_model}{Colors.RESET}" 1429 | 1430 | # Extract endpoint name 1431 | endpoint = path 1432 | if "?" in endpoint: 1433 | endpoint = endpoint.split("?")[0] 1434 | 1435 | # Extract just the OpenAI model name without provider prefix 1436 | openai_display = openai_model 1437 | if "/" in openai_display: 1438 | openai_display = openai_display.split("/")[-1] 1439 | openai_display = f"{Colors.GREEN}{openai_display}{Colors.RESET}" 1440 | 1441 | # Format tools and messages 1442 | tools_str = f"{Colors.MAGENTA}{num_tools} tools{Colors.RESET}" 1443 | messages_str = f"{Colors.BLUE}{num_messages} messages{Colors.RESET}" 1444 | 1445 | # Format status code 1446 | status_str = f"{Colors.GREEN}✓ {status_code} OK{Colors.RESET}" if status_code == 200 else f"{Colors.RED}✗ {status_code}{Colors.RESET}" 1447 | 1448 | 1449 | # Put it all together in a clear, beautiful format 1450 | log_line = f"{Colors.BOLD}{method} {endpoint}{Colors.RESET} {status_str}" 1451 | model_line = f"{claude_display} → {openai_display} {tools_str} {messages_str}" 1452 | 1453 | # Print to console 1454 | print(log_line) 1455 | print(model_line) 1456 | sys.stdout.flush() 1457 | 1458 | if __name__ == "__main__": 1459 | import sys 1460 | if len(sys.argv) > 1 and sys.argv[1] == "--help": 1461 | print("Run with: uvicorn server:app --reload --host 0.0.0.0 --port 8082") 1462 | sys.exit(0) 1463 | 1464 | # Configure uvicorn to run with minimal logs 1465 | uvicorn.run(app, host="0.0.0.0", port=8082, log_level="error") -------------------------------------------------------------------------------- /tests.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | """ 3 | Comprehensive test suite for Claude-on-OpenAI Proxy. 4 | 5 | This script provides tests for both streaming and non-streaming requests, 6 | with various scenarios including tool use, multi-turn conversations, 7 | and content blocks. 
8 | 9 | Usage: 10 | python tests.py # Run all tests 11 | python tests.py --no-streaming # Skip streaming tests 12 | python tests.py --simple # Run only simple tests 13 | python tests.py --tools # Run tool-related tests only 14 | """ 15 | 16 | import os 17 | import json 18 | import time 19 | import httpx 20 | import argparse 21 | import asyncio 22 | import sys 23 | from datetime import datetime 24 | from typing import Dict, Any, List, Optional, Set 25 | from dotenv import load_dotenv 26 | 27 | # Load environment variables 28 | load_dotenv() 29 | 30 | # Configuration 31 | ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY") 32 | PROXY_API_KEY = os.environ.get("ANTHROPIC_API_KEY") # Using same key for proxy 33 | ANTHROPIC_API_URL = "https://api.anthropic.com/v1/messages" 34 | PROXY_API_URL = "http://localhost:8082/v1/messages" 35 | ANTHROPIC_VERSION = "2023-06-01" 36 | MODEL = "claude-3-sonnet-20240229" # Change to your preferred model 37 | 38 | # Headers 39 | anthropic_headers = { 40 | "x-api-key": ANTHROPIC_API_KEY, 41 | "anthropic-version": ANTHROPIC_VERSION, 42 | "content-type": "application/json", 43 | } 44 | 45 | proxy_headers = { 46 | "x-api-key": PROXY_API_KEY, 47 | "anthropic-version": ANTHROPIC_VERSION, 48 | "content-type": "application/json", 49 | } 50 | 51 | # Tool definitions 52 | calculator_tool = { 53 | "name": "calculator", 54 | "description": "Evaluate mathematical expressions", 55 | "input_schema": { 56 | "type": "object", 57 | "properties": { 58 | "expression": { 59 | "type": "string", 60 | "description": "The mathematical expression to evaluate" 61 | } 62 | }, 63 | "required": ["expression"] 64 | } 65 | } 66 | 67 | weather_tool = { 68 | "name": "weather", 69 | "description": "Get weather information for a location", 70 | "input_schema": { 71 | "type": "object", 72 | "properties": { 73 | "location": { 74 | "type": "string", 75 | "description": "The city or location to get weather for" 76 | }, 77 | "units": { 78 | "type": "string", 79 | "enum": ["celsius", "fahrenheit"], 80 | "description": "Temperature units" 81 | } 82 | }, 83 | "required": ["location"] 84 | } 85 | } 86 | 87 | search_tool = { 88 | "name": "search", 89 | "description": "Search for information on the web", 90 | "input_schema": { 91 | "type": "object", 92 | "properties": { 93 | "query": { 94 | "type": "string", 95 | "description": "The search query" 96 | } 97 | }, 98 | "required": ["query"] 99 | } 100 | } 101 | 102 | # Test scenarios 103 | TEST_SCENARIOS = { 104 | # Simple text response 105 | "simple": { 106 | "model": MODEL, 107 | "max_tokens": 300, 108 | "messages": [ 109 | {"role": "user", "content": "Hello, world! Can you tell me about Paris in 2-3 sentences?"} 110 | ] 111 | }, 112 | 113 | # Basic tool use 114 | "calculator": { 115 | "model": MODEL, 116 | "max_tokens": 300, 117 | "messages": [ 118 | {"role": "user", "content": "What is 135 + 7.5 divided by 2.5?"} 119 | ], 120 | "tools": [calculator_tool], 121 | "tool_choice": {"type": "auto"} 122 | }, 123 | 124 | # Multiple tools 125 | "multi_tool": { 126 | "model": MODEL, 127 | "max_tokens": 500, 128 | "temperature": 0.7, 129 | "top_p": 0.95, 130 | "system": "You are a helpful assistant that uses tools when appropriate. Be concise and precise.", 131 | "messages": [ 132 | {"role": "user", "content": "I'm planning a trip to New York next week. 
What's the weather like and what are some interesting places to visit?"} 133 | ], 134 | "tools": [weather_tool, search_tool], 135 | "tool_choice": {"type": "auto"} 136 | }, 137 | 138 | # Multi-turn conversation 139 | "multi_turn": { 140 | "model": MODEL, 141 | "max_tokens": 500, 142 | "messages": [ 143 | {"role": "user", "content": "Let's do some math. What is 240 divided by 8?"}, 144 | {"role": "assistant", "content": "To calculate 240 divided by 8, I'll perform the division:\n\n240 ÷ 8 = 30\n\nSo the result is 30."}, 145 | {"role": "user", "content": "Now multiply that by 4 and tell me the result."} 146 | ], 147 | "tools": [calculator_tool], 148 | "tool_choice": {"type": "auto"} 149 | }, 150 | 151 | # Content blocks 152 | "content_blocks": { 153 | "model": MODEL, 154 | "max_tokens": 500, 155 | "messages": [ 156 | {"role": "user", "content": [ 157 | {"type": "text", "text": "I need to know the weather in Los Angeles and calculate 75.5 / 5. Can you help with both?"} 158 | ]} 159 | ], 160 | "tools": [calculator_tool, weather_tool], 161 | "tool_choice": {"type": "auto"} 162 | }, 163 | 164 | # Simple streaming test 165 | "simple_stream": { 166 | "model": MODEL, 167 | "max_tokens": 100, 168 | "stream": True, 169 | "messages": [ 170 | {"role": "user", "content": "Count from 1 to 5, with one number per line."} 171 | ] 172 | }, 173 | 174 | # Tool use with streaming 175 | "calculator_stream": { 176 | "model": MODEL, 177 | "max_tokens": 300, 178 | "stream": True, 179 | "messages": [ 180 | {"role": "user", "content": "What is 135 + 17.5 divided by 2.5?"} 181 | ], 182 | "tools": [calculator_tool], 183 | "tool_choice": {"type": "auto"} 184 | } 185 | } 186 | 187 | # Required event types for Anthropic streaming responses 188 | REQUIRED_EVENT_TYPES = { 189 | "message_start", 190 | "content_block_start", 191 | "content_block_delta", 192 | "content_block_stop", 193 | "message_delta", 194 | "message_stop" 195 | } 196 | 197 | # ================= NON-STREAMING TESTS ================= 198 | 199 | def get_response(url, headers, data): 200 | """Send a request and get the response.""" 201 | start_time = time.time() 202 | response = httpx.post(url, headers=headers, json=data, timeout=30) 203 | elapsed = time.time() - start_time 204 | 205 | print(f"Response time: {elapsed:.2f} seconds") 206 | return response 207 | 208 | def compare_responses(anthropic_response, proxy_response, check_tools=False): 209 | """Compare the two responses to see if they're similar enough.""" 210 | anthropic_json = anthropic_response.json() 211 | proxy_json = proxy_response.json() 212 | 213 | print("\n--- Anthropic Response Structure ---") 214 | print(json.dumps({k: v for k, v in anthropic_json.items() if k != "content"}, indent=2)) 215 | 216 | print("\n--- Proxy Response Structure ---") 217 | print(json.dumps({k: v for k, v in proxy_json.items() if k != "content"}, indent=2)) 218 | 219 | # Basic structure verification with more flexibility 220 | # The proxy might map values differently, so we're more lenient in our checks 221 | assert proxy_json.get("role") == "assistant", "Proxy role is not 'assistant'" 222 | assert proxy_json.get("type") == "message", "Proxy type is not 'message'" 223 | 224 | # Check if stop_reason is reasonable (might be different between Anthropic and our proxy) 225 | valid_stop_reasons = ["end_turn", "max_tokens", "stop_sequence", "tool_use", None] 226 | assert proxy_json.get("stop_reason") in valid_stop_reasons, "Invalid stop reason" 227 | 228 | # Check content exists and has valid structure 229 | assert "content" in 
anthropic_json, "No content in Anthropic response" 230 | assert "content" in proxy_json, "No content in Proxy response" 231 | 232 | anthropic_content = anthropic_json["content"] 233 | proxy_content = proxy_json["content"] 234 | 235 | # Make sure content is a list and has at least one item 236 | assert isinstance(anthropic_content, list), "Anthropic content is not a list" 237 | assert isinstance(proxy_content, list), "Proxy content is not a list" 238 | assert len(proxy_content) > 0, "Proxy content is empty" 239 | 240 | # If we're checking for tool uses 241 | if check_tools: 242 | # Check if content has tool use 243 | anthropic_tool = None 244 | proxy_tool = None 245 | 246 | # Find tool use in Anthropic response 247 | for item in anthropic_content: 248 | if item.get("type") == "tool_use": 249 | anthropic_tool = item 250 | break 251 | 252 | # Find tool use in Proxy response 253 | for item in proxy_content: 254 | if item.get("type") == "tool_use": 255 | proxy_tool = item 256 | break 257 | 258 | # At least one of them should have a tool use 259 | if anthropic_tool is not None: 260 | print("\n---------- ANTHROPIC TOOL USE ----------") 261 | print(json.dumps(anthropic_tool, indent=2)) 262 | 263 | if proxy_tool is not None: 264 | print("\n---------- PROXY TOOL USE ----------") 265 | print(json.dumps(proxy_tool, indent=2)) 266 | 267 | # Check tool structure 268 | assert proxy_tool.get("name") is not None, "Proxy tool has no name" 269 | assert proxy_tool.get("input") is not None, "Proxy tool has no input" 270 | 271 | print("\n✅ Both responses contain tool use") 272 | else: 273 | print("\n⚠️ Proxy response does not contain tool use, but Anthropic does") 274 | elif proxy_tool is not None: 275 | print("\n---------- PROXY TOOL USE ----------") 276 | print(json.dumps(proxy_tool, indent=2)) 277 | print("\n⚠️ Proxy response contains tool use, but Anthropic does not") 278 | else: 279 | print("\n⚠️ Neither response contains tool use") 280 | 281 | # Check if content has text 282 | anthropic_text = None 283 | proxy_text = None 284 | 285 | for item in anthropic_content: 286 | if item.get("type") == "text": 287 | anthropic_text = item.get("text") 288 | break 289 | 290 | for item in proxy_content: 291 | if item.get("type") == "text": 292 | proxy_text = item.get("text") 293 | break 294 | 295 | # For tool use responses, there might not be text content 296 | if check_tools and (anthropic_text is None or proxy_text is None): 297 | print("\n⚠️ One or both responses don't have text content (expected for tool-only responses)") 298 | return True 299 | 300 | assert anthropic_text is not None, "No text found in Anthropic response" 301 | assert proxy_text is not None, "No text found in Proxy response" 302 | 303 | # Print the first few lines of each text response 304 | max_preview_lines = 5 305 | anthropic_preview = "\n".join(anthropic_text.strip().split("\n")[:max_preview_lines]) 306 | proxy_preview = "\n".join(proxy_text.strip().split("\n")[:max_preview_lines]) 307 | 308 | print("\n---------- ANTHROPIC TEXT PREVIEW ----------") 309 | print(anthropic_preview) 310 | 311 | print("\n---------- PROXY TEXT PREVIEW ----------") 312 | print(proxy_preview) 313 | 314 | # Check for some minimum text overlap - proxy might have different exact wording 315 | # but should have roughly similar content 316 | return True # We're not enforcing similarity, just basic structure 317 | 318 | def test_request(test_name, request_data, check_tools=False): 319 | """Run a test with the given request data.""" 320 | print(f"\n{'='*20} RUNNING TEST: 
{test_name} {'='*20}") 321 | 322 | # Log the request data 323 | print(f"\nRequest data:\n{json.dumps({k: v for k, v in request_data.items() if k != 'messages'}, indent=2)}") 324 | 325 | # Make copies of the request data to avoid modifying the original 326 | anthropic_data = request_data.copy() 327 | proxy_data = request_data.copy() 328 | 329 | try: 330 | # Send requests to both APIs 331 | print("\nSending to Anthropic API...") 332 | anthropic_response = get_response(ANTHROPIC_API_URL, anthropic_headers, anthropic_data) 333 | 334 | print("\nSending to Proxy...") 335 | proxy_response = get_response(PROXY_API_URL, proxy_headers, proxy_data) 336 | 337 | # Check response codes 338 | print(f"\nAnthropic status code: {anthropic_response.status_code}") 339 | print(f"Proxy status code: {proxy_response.status_code}") 340 | 341 | if anthropic_response.status_code != 200 or proxy_response.status_code != 200: 342 | print("\n⚠️ One or both requests failed") 343 | if anthropic_response.status_code != 200: 344 | print(f"Anthropic error: {anthropic_response.text}") 345 | if proxy_response.status_code != 200: 346 | print(f"Proxy error: {proxy_response.text}") 347 | return False 348 | 349 | # Compare the responses 350 | result = compare_responses(anthropic_response, proxy_response, check_tools=check_tools) 351 | if result: 352 | print(f"\n✅ Test {test_name} passed!") 353 | return True 354 | else: 355 | print(f"\n❌ Test {test_name} failed!") 356 | return False 357 | 358 | except Exception as e: 359 | print(f"\n❌ Error in test {test_name}: {str(e)}") 360 | import traceback 361 | traceback.print_exc() 362 | return False 363 | 364 | # ================= STREAMING TESTS ================= 365 | 366 | class StreamStats: 367 | """Track statistics about a streaming response.""" 368 | 369 | def __init__(self): 370 | self.event_types = set() 371 | self.event_counts = {} 372 | self.first_event_time = None 373 | self.last_event_time = None 374 | self.total_chunks = 0 375 | self.events = [] 376 | self.text_content = "" 377 | self.content_blocks = {} 378 | self.has_tool_use = False 379 | self.has_error = False 380 | self.error_message = "" 381 | self.text_content_by_block = {} 382 | 383 | def add_event(self, event_data): 384 | """Track information about each received event.""" 385 | now = datetime.now() 386 | if self.first_event_time is None: 387 | self.first_event_time = now 388 | self.last_event_time = now 389 | 390 | self.total_chunks += 1 391 | 392 | # Record event type and increment count 393 | if "type" in event_data: 394 | event_type = event_data["type"] 395 | self.event_types.add(event_type) 396 | self.event_counts[event_type] = self.event_counts.get(event_type, 0) + 1 397 | 398 | # Track specific event data 399 | if event_type == "content_block_start": 400 | block_idx = event_data.get("index") 401 | content_block = event_data.get("content_block", {}) 402 | if content_block.get("type") == "tool_use": 403 | self.has_tool_use = True 404 | self.content_blocks[block_idx] = content_block 405 | self.text_content_by_block[block_idx] = "" 406 | 407 | elif event_type == "content_block_delta": 408 | block_idx = event_data.get("index") 409 | delta = event_data.get("delta", {}) 410 | if delta.get("type") == "text_delta": 411 | text = delta.get("text", "") 412 | self.text_content += text 413 | # Also track text by block ID 414 | if block_idx in self.text_content_by_block: 415 | self.text_content_by_block[block_idx] += text 416 | 417 | # Keep track of all events for debugging 418 | self.events.append(event_data) 419 | 420 | def 
get_duration(self): 421 | """Calculate the total duration of the stream in seconds.""" 422 | if self.first_event_time is None or self.last_event_time is None: 423 | return 0 424 | return (self.last_event_time - self.first_event_time).total_seconds() 425 | 426 | def summarize(self): 427 | """Print a summary of the stream statistics.""" 428 | print(f"Total chunks: {self.total_chunks}") 429 | print(f"Unique event types: {sorted(list(self.event_types))}") 430 | print(f"Event counts: {json.dumps(self.event_counts, indent=2)}") 431 | print(f"Duration: {self.get_duration():.2f} seconds") 432 | print(f"Has tool use: {self.has_tool_use}") 433 | 434 | # Print the first few lines of content 435 | if self.text_content: 436 | max_preview_lines = 5 437 | text_preview = "\n".join(self.text_content.strip().split("\n")[:max_preview_lines]) 438 | print(f"Text preview:\n{text_preview}") 439 | else: 440 | print("No text content extracted") 441 | 442 | if self.has_error: 443 | print(f"Error: {self.error_message}") 444 | 445 | async def stream_response(url, headers, data, stream_name): 446 | """Send a streaming request and process the response.""" 447 | print(f"\nStarting {stream_name} stream...") 448 | stats = StreamStats() 449 | error = None 450 | 451 | try: 452 | async with httpx.AsyncClient() as client: 453 | # Add stream flag to ensure it's streamed 454 | request_data = data.copy() 455 | request_data["stream"] = True 456 | 457 | start_time = time.time() 458 | async with client.stream("POST", url, json=request_data, headers=headers, timeout=30) as response: 459 | if response.status_code != 200: 460 | error_text = await response.aread() 461 | stats.has_error = True 462 | stats.error_message = f"HTTP {response.status_code}: {error_text.decode('utf-8')}" 463 | error = stats.error_message 464 | print(f"Error: {stats.error_message}") 465 | return stats, error 466 | 467 | print(f"{stream_name} connected, receiving events...") 468 | 469 | # Process each chunk 470 | buffer = "" 471 | async for chunk in response.aiter_text(): 472 | if not chunk.strip(): 473 | continue 474 | 475 | # Handle multiple events in one chunk 476 | buffer += chunk 477 | events = buffer.split("\n\n") 478 | 479 | # Process all complete events 480 | for event_text in events[:-1]: # All but the last (possibly incomplete) event 481 | if not event_text.strip(): 482 | continue 483 | 484 | # Parse server-sent event format 485 | if "data: " in event_text: 486 | # Extract the data part 487 | data_parts = [] 488 | for line in event_text.split("\n"): 489 | if line.startswith("data: "): 490 | data_part = line[len("data: "):] 491 | # Skip the "[DONE]" marker 492 | if data_part == "[DONE]": 493 | break 494 | data_parts.append(data_part) 495 | 496 | if data_parts: 497 | try: 498 | event_data = json.loads("".join(data_parts)) 499 | stats.add_event(event_data) 500 | except json.JSONDecodeError as e: 501 | print(f"Error parsing event: {e}\nRaw data: {''.join(data_parts)}") 502 | 503 | # Keep the last (potentially incomplete) event for the next iteration 504 | buffer = events[-1] if events else "" 505 | 506 | # Process any remaining complete events in the buffer 507 | if buffer.strip(): 508 | lines = buffer.strip().split("\n") 509 | data_lines = [line[len("data: "):] for line in lines if line.startswith("data: ")] 510 | if data_lines and data_lines[0] != "[DONE]": 511 | try: 512 | event_data = json.loads("".join(data_lines)) 513 | stats.add_event(event_data) 514 | except: 515 | pass 516 | 517 | elapsed = time.time() - start_time 518 | print(f"{stream_name} 
stream completed in {elapsed:.2f} seconds") 519 | except Exception as e: 520 | stats.has_error = True 521 | stats.error_message = str(e) 522 | error = str(e) 523 | print(f"Error in {stream_name} stream: {e}") 524 | 525 | return stats, error 526 | 527 | def compare_stream_stats(anthropic_stats, proxy_stats): 528 | """Compare the statistics from the two streams to see if they're similar enough.""" 529 | 530 | print("\n--- Stream Comparison ---") 531 | 532 | # Required events 533 | anthropic_missing = REQUIRED_EVENT_TYPES - anthropic_stats.event_types 534 | proxy_missing = REQUIRED_EVENT_TYPES - proxy_stats.event_types 535 | 536 | print(f"Anthropic missing event types: {anthropic_missing}") 537 | print(f"Proxy missing event types: {proxy_missing}") 538 | 539 | # Check if proxy has the required events 540 | if proxy_missing: 541 | print(f"⚠️ Proxy is missing required event types: {proxy_missing}") 542 | else: 543 | print("✅ Proxy has all required event types") 544 | 545 | # Compare content 546 | if anthropic_stats.text_content and proxy_stats.text_content: 547 | anthropic_preview = "\n".join(anthropic_stats.text_content.strip().split("\n")[:5]) 548 | proxy_preview = "\n".join(proxy_stats.text_content.strip().split("\n")[:5]) 549 | 550 | print("\n--- Anthropic Content Preview ---") 551 | print(anthropic_preview) 552 | 553 | print("\n--- Proxy Content Preview ---") 554 | print(proxy_preview) 555 | 556 | # Compare tool use 557 | if anthropic_stats.has_tool_use and proxy_stats.has_tool_use: 558 | print("✅ Both have tool use") 559 | elif anthropic_stats.has_tool_use and not proxy_stats.has_tool_use: 560 | print("⚠️ Anthropic has tool use but proxy does not") 561 | elif not anthropic_stats.has_tool_use and proxy_stats.has_tool_use: 562 | print("⚠️ Proxy has tool use but Anthropic does not") 563 | 564 | # Success as long as the proxy produced some content (text or tool use) and had no errors 565 | return (not proxy_stats.has_error and 566 | (len(proxy_stats.text_content) > 0 or proxy_stats.has_tool_use)) 567 | 568 | async def test_streaming(test_name, request_data): 569 | """Run a streaming test with the given request data.""" 570 | print(f"\n{'='*20} RUNNING STREAMING TEST: {test_name} {'='*20}") 571 | 572 | # Log the request data 573 | print(f"\nRequest data:\n{json.dumps({k: v for k, v in request_data.items() if k != 'messages'}, indent=2)}") 574 | 575 | # Make copies of the request data to avoid modifying the original 576 | anthropic_data = request_data.copy() 577 | proxy_data = request_data.copy() 578 | 579 | if not anthropic_data.get("stream"): 580 | anthropic_data["stream"] = True 581 | if not proxy_data.get("stream"): 582 | proxy_data["stream"] = True 583 | 584 | check_tools = "tools" in request_data 585 | 586 | try: 587 | # Send streaming requests 588 | anthropic_stats, anthropic_error = await stream_response( 589 | ANTHROPIC_API_URL, anthropic_headers, anthropic_data, "Anthropic" 590 | ) 591 | 592 | proxy_stats, proxy_error = await stream_response( 593 | PROXY_API_URL, proxy_headers, proxy_data, "Proxy" 594 | ) 595 | 596 | # Print statistics 597 | print("\n--- Anthropic Stream Statistics ---") 598 | anthropic_stats.summarize() 599 | 600 | print("\n--- Proxy Stream Statistics ---") 601 | proxy_stats.summarize() 602 | 603 | # Compare the responses 604 | if anthropic_error: 605 | print(f"\n⚠️ Anthropic stream had an error: {anthropic_error}") 606 | # If Anthropic errors, the test passes if proxy does anything useful 607 | if not proxy_error and proxy_stats.total_chunks > 0: 608 | print(f"\n✅ Test {test_name} passed! 
(Proxy worked even though Anthropic failed)") 609 | return True 610 | else: 611 | print(f"\n❌ Test {test_name} failed! Both streams had errors.") 612 | return False 613 | 614 | if proxy_error: 615 | print(f"\n❌ Test {test_name} failed! Proxy had an error: {proxy_error}") 616 | return False 617 | 618 | result = compare_stream_stats(anthropic_stats, proxy_stats) 619 | if result: 620 | print(f"\n✅ Test {test_name} passed!") 621 | return True 622 | else: 623 | print(f"\n❌ Test {test_name} failed!") 624 | return False 625 | 626 | except Exception as e: 627 | print(f"\n❌ Error in test {test_name}: {str(e)}") 628 | import traceback 629 | traceback.print_exc() 630 | return False 631 | 632 | # ================= MAIN ================= 633 | 634 | async def run_tests(args): 635 | """Run all tests based on command-line arguments.""" 636 | # Track test results 637 | results = {} 638 | 639 | # First run non-streaming tests 640 | if not args.streaming_only: 641 | print("\n\n=========== RUNNING NON-STREAMING TESTS ===========\n") 642 | for test_name, test_data in TEST_SCENARIOS.items(): 643 | # Skip streaming tests 644 | if test_data.get("stream"): 645 | continue 646 | 647 | # Skip tool tests if requested 648 | if args.simple and "tools" in test_data: 649 | continue 650 | 651 | # Skip non-tool tests if tools_only 652 | if args.tools_only and "tools" not in test_data: 653 | continue 654 | 655 | # Run the test 656 | check_tools = "tools" in test_data 657 | result = test_request(test_name, test_data, check_tools=check_tools) 658 | results[test_name] = result 659 | 660 | # Now run streaming tests 661 | if not args.no_streaming: 662 | print("\n\n=========== RUNNING STREAMING TESTS ===========\n") 663 | for test_name, test_data in TEST_SCENARIOS.items(): 664 | # Only select streaming tests, or force streaming 665 | if not test_data.get("stream") and not test_name.endswith("_stream"): 666 | continue 667 | 668 | # Skip tool tests if requested 669 | if args.simple and "tools" in test_data: 670 | continue 671 | 672 | # Skip non-tool tests if tools_only 673 | if args.tools_only and "tools" not in test_data: 674 | continue 675 | 676 | # Run the streaming test 677 | result = await test_streaming(test_name, test_data) 678 | results[f"{test_name}_streaming"] = result 679 | 680 | # Print summary 681 | print("\n\n=========== TEST SUMMARY ===========\n") 682 | total = len(results) 683 | passed = sum(1 for v in results.values() if v) 684 | 685 | for test, result in results.items(): 686 | print(f"{test}: {'✅ PASS' if result else '❌ FAIL'}") 687 | 688 | print(f"\nTotal: {passed}/{total} tests passed") 689 | 690 | if passed == total: 691 | print("\n🎉 All tests passed!") 692 | return True 693 | else: 694 | print(f"\n⚠️ {total - passed} tests failed") 695 | return False 696 | 697 | async def main(): 698 | # Check that API key is set 699 | if not ANTHROPIC_API_KEY: 700 | print("Error: ANTHROPIC_API_KEY not set in .env file") 701 | return 702 | 703 | # Parse command-line arguments 704 | parser = argparse.ArgumentParser(description="Test the Claude-on-OpenAI proxy") 705 | parser.add_argument("--no-streaming", action="store_true", help="Skip streaming tests") 706 | parser.add_argument("--streaming-only", action="store_true", help="Only run streaming tests") 707 | parser.add_argument("--simple", action="store_true", help="Only run simple tests (no tools)") 708 | parser.add_argument("--tools-only", action="store_true", help="Only run tool tests") 709 | args = parser.parse_args() 710 | 711 | # Run tests 712 | success = await 
run_tests(args) 713 | sys.exit(0 if success else 1) 714 | 715 | if __name__ == "__main__": 716 | asyncio.run(main()) --------------------------------------------------------------------------------