├── .env.example ├── .gitignore ├── .python-version ├── README.md ├── pic.png ├── pyproject.toml ├── server.py ├── tests.py └── uv.lock /.env.example: -------------------------------------------------------------------------------- 1 | # Required API Keys 2 | ANTHROPIC_API_KEY="your-anthropic-api-key" # Needed if proxying *to* Anthropic 3 | OPENAI_API_KEY="sk-..." 4 | GEMINI_API_KEY="your-google-ai-studio-key" 5 | 6 | # Optional: Provider Preference and Model Mapping 7 | # Controls which provider (google or openai) is preferred for mapping haiku/sonnet. 8 | # Defaults to openai if not set. 9 | PREFERRED_PROVIDER="openai" 10 | 11 | # Optional: Specify the exact models to map haiku/sonnet to. 12 | # If PREFERRED_PROVIDER=google, these MUST be valid Gemini model names known to the server. 13 | # Defaults to gemini-2.5-pro-preview-03-25 and gemini-2.0-flash if PREFERRED_PROVIDER=google. 14 | # Defaults to gpt-4.1 and gpt-4.1-mini if PREFERRED_PROVIDER=openai. 15 | # BIG_MODEL="gpt-4.1" 16 | # SMALL_MODEL="gpt-4.1-mini" 17 | 18 | # Example Google mapping: 19 | # PREFERRED_PROVIDER="google" 20 | # BIG_MODEL="gemini-2.5-pro-preview-03-25" 21 | # SMALL_MODEL="gemini-2.0-flash" -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Environment variables 2 | .env 3 | 4 | # Python 5 | __pycache__/ 6 | *.py[cod] 7 | *$py.class 8 | *.so 9 | .Python 10 | build/ 11 | develop-eggs/ 12 | dist/ 13 | downloads/ 14 | eggs/ 15 | .eggs/ 16 | lib/ 17 | lib64/ 18 | parts/ 19 | sdist/ 20 | var/ 21 | wheels/ 22 | *.egg-info/ 23 | .installed.cfg 24 | *.egg 25 | 26 | # Virtual environments 27 | venv/ 28 | env/ 29 | ENV/ 30 | 31 | # Logs 32 | *.log 33 | 34 | # IDE specific files 35 | .idea/ 36 | .vscode/ 37 | *.swp 38 | *.swo -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- 1 | 3.10 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Anthropic API Proxy for Gemini & OpenAI Models 🔄 2 | 3 | **Use Anthropic clients (like Claude Code) with Gemini or OpenAI backends.** 🤝 4 | 5 | A proxy server that lets you use Anthropic clients with Gemini or OpenAI models via LiteLLM. 🌉 6 | 7 | 8 | ![Anthropic API Proxy](pic.png) 9 | 10 | ## Quick Start ⚡ 11 | 12 | ### Prerequisites 13 | 14 | - OpenAI API key 🔑 15 | - Google AI Studio (Gemini) API key (if using Google provider) 🔑 16 | - [uv](https://github.com/astral-sh/uv) installed. 17 | 18 | ### Setup 🛠️ 19 | 20 | 1. **Clone this repository**: 21 | ```bash 22 | git clone https://github.com/1rgs/claude-code-openai.git 23 | cd claude-code-openai 24 | ``` 25 | 26 | 2. **Install uv** (if you haven't already): 27 | ```bash 28 | curl -LsSf https://astral.sh/uv/install.sh | sh 29 | ``` 30 | *(`uv` will handle dependencies based on `pyproject.toml` when you run the server)* 31 | 32 | 3. **Configure Environment Variables**: 33 | Copy the example environment file: 34 | ```bash 35 | cp .env.example .env 36 | ``` 37 | Edit `.env` and fill in your API keys and model configurations: 38 | 39 | * `ANTHROPIC_API_KEY`: (Optional) Needed only if proxying *to* Anthropic models. 40 | * `OPENAI_API_KEY`: Your OpenAI API key (Required if using the default OpenAI preference or as fallback). 
41 | * `GEMINI_API_KEY`: Your Google AI Studio (Gemini) API key (Required if `PREFERRED_PROVIDER=google`).
42 | * `PREFERRED_PROVIDER` (Optional): Set to `openai` (default) or `google`. This determines the primary backend for mapping `haiku`/`sonnet`.
43 | * `BIG_MODEL` (Optional): The model to map `sonnet` requests to. Defaults to `gpt-4.1` (if `PREFERRED_PROVIDER=openai`) or `gemini-2.5-pro-preview-03-25`.
44 | * `SMALL_MODEL` (Optional): The model to map `haiku` requests to. Defaults to `gpt-4.1-mini` (if `PREFERRED_PROVIDER=openai`) or `gemini-2.0-flash`.
45 |
46 | **Mapping Logic:**
47 | - If `PREFERRED_PROVIDER=openai` (default), `haiku`/`sonnet` map to `SMALL_MODEL`/`BIG_MODEL` prefixed with `openai/`.
48 | - If `PREFERRED_PROVIDER=google`, `haiku`/`sonnet` map to `SMALL_MODEL`/`BIG_MODEL` prefixed with `gemini/` *if* those models are in the server's known `GEMINI_MODELS` list (otherwise they fall back to the OpenAI mapping).
49 |
50 | 4. **Run the server**:
51 | ```bash
52 | uv run uvicorn server:app --host 0.0.0.0 --port 8082 --reload
53 | ```
54 | *(`--reload` is optional, for development)*
55 |
56 | ### Using with Claude Code 🎮
57 |
58 | 1. **Install Claude Code** (if you haven't already):
59 | ```bash
60 | npm install -g @anthropic-ai/claude-code
61 | ```
62 |
63 | 2. **Connect to your proxy**:
64 | ```bash
65 | ANTHROPIC_BASE_URL=http://localhost:8082 claude
66 | ```
67 |
68 | 3. **That's it!** Your Claude Code client will now use the configured backend models (OpenAI by default) through the proxy. 🎯
69 |
70 | ## Model Mapping 🗺️
71 |
72 | The proxy automatically maps Claude models to either OpenAI or Gemini models based on the configured provider and `BIG_MODEL`/`SMALL_MODEL`:
73 |
74 | | Claude Model | Default Mapping | When BIG_MODEL/SMALL_MODEL is a Gemini model |
75 | |--------------|--------------|---------------------------|
76 | | haiku | openai/gpt-4.1-mini | gemini/[model-name] |
77 | | sonnet | openai/gpt-4.1 | gemini/[model-name] |
78 |
79 | ### Supported Models
80 |
81 | #### OpenAI Models
82 | The following OpenAI models are supported with automatic `openai/` prefix handling:
83 | - o3-mini
84 | - o1
85 | - o1-mini
86 | - o1-pro
87 | - gpt-4.5-preview
88 | - gpt-4o
89 | - gpt-4o-audio-preview
90 | - chatgpt-4o-latest
91 | - gpt-4o-mini
92 | - gpt-4o-mini-audio-preview
93 | - gpt-4.1
94 | - gpt-4.1-mini
95 |
96 | #### Gemini Models
97 | The following Gemini models are supported with automatic `gemini/` prefix handling:
98 | - gemini-2.5-pro-preview-03-25
99 | - gemini-2.0-flash
100 |
101 | ### Model Prefix Handling
102 | The proxy automatically adds the appropriate prefix to model names:
103 | - OpenAI models get the `openai/` prefix
104 | - Gemini models get the `gemini/` prefix
105 | - `BIG_MODEL` and `SMALL_MODEL` get the appropriate prefix based on whether they appear in the OpenAI or Gemini model lists
106 |
107 | For example:
108 | - `gpt-4o` becomes `openai/gpt-4o`
109 | - `gemini-2.5-pro-preview-03-25` becomes `gemini/gemini-2.5-pro-preview-03-25`
110 | - When `BIG_MODEL` is set to a Gemini model, Claude Sonnet will map to `gemini/[model-name]`
111 |
112 | ### Customizing Model Mapping
113 |
114 | Control the mapping using environment variables in your `.env` file (or set them directly in your shell):
115 |
116 | **Example 1: Default (Use OpenAI)**
117 | No changes needed in `.env` beyond the API keys, or ensure:
118 | ```dotenv
119 | OPENAI_API_KEY="your-openai-key"
120 | GEMINI_API_KEY="your-google-key" # Needed if PREFERRED_PROVIDER=google
121 | # PREFERRED_PROVIDER="openai" # Optional, it's the default
122 | # BIG_MODEL="gpt-4.1" # Optional, it's the default
123 | # SMALL_MODEL="gpt-4.1-mini" # Optional, it's the default
124 | ```
125 |
126 | **Example 2: Prefer Google**
127 | ```dotenv
128 | GEMINI_API_KEY="your-google-key"
129 | OPENAI_API_KEY="your-openai-key" # Needed for fallback
130 | PREFERRED_PROVIDER="google"
131 | # BIG_MODEL="gemini-2.5-pro-preview-03-25" # Optional, it's the default for Google pref
132 | # SMALL_MODEL="gemini-2.0-flash" # Optional, it's the default for Google pref
133 | ```
134 |
135 | **Example 3: Use Specific OpenAI Models**
136 | ```dotenv
137 | OPENAI_API_KEY="your-openai-key"
138 | GEMINI_API_KEY="your-google-key"
139 | PREFERRED_PROVIDER="openai"
140 | BIG_MODEL="gpt-4o" # Example specific model
141 | SMALL_MODEL="gpt-4o-mini" # Example specific model
142 | ```
143 |
144 | ## How It Works 🧩
145 |
146 | This proxy works by:
147 |
148 | 1. **Receiving requests** in Anthropic's API format 📥
149 | 2. **Translating** the requests to OpenAI format via LiteLLM 🔄
150 | 3. **Sending** the translated request to the configured backend (OpenAI or Gemini) 📤
151 | 4. **Converting** the response back to Anthropic format 🔄
152 | 5. **Returning** the formatted response to the client ✅
153 |
154 | The proxy handles both streaming and non-streaming responses, maintaining compatibility with all Claude clients. 🌊
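For a concrete sense of that round trip, here is a minimal sketch (not part of the repository) that posts an Anthropic-style request to a locally running proxy using `httpx`; the port, model name, and prompt are placeholders, and the printed model reflects whatever mapping you configured:

```python
import httpx

# Anthropic Messages API style payload; "sonnet" in the model name triggers the BIG_MODEL mapping.
payload = {
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 256,
    "messages": [
        {"role": "user", "content": "Say hello in one short sentence."}
    ],
}

# Assumes the proxy from the Quick Start is listening on http://localhost:8082.
resp = httpx.post("http://localhost:8082/v1/messages", json=payload, timeout=60.0)
resp.raise_for_status()

data = resp.json()
print(data["model"])               # e.g. "openai/gpt-4.1" with the default mapping
print(data["content"][0]["text"])  # the assistant reply, already back in Anthropic format
```

Streaming works the same way with `"stream": true`, in which case the proxy returns Anthropic-style server-sent events.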
155 |
156 | ## Contributing 🤝
157 |
158 | Contributions are welcome! Please feel free to submit a Pull Request. 🎁
159 |
--------------------------------------------------------------------------------
/pic.png:
--------------------------------------------------------------------------------
 https://raw.githubusercontent.com/1rgs/claude-code-proxy/e9c8cf8de6e8f11cf54dd677634e9796e040f2fd/pic.png
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [project]
2 | name = "anthropic-proxy"
3 | version = "0.1.0"
4 | description = "Proxy that translates between Anthropic API and LiteLLM"
5 | readme = "README.md"
6 | requires-python = ">=3.10"
7 | dependencies = [
8 |     "fastapi[standard]>=0.115.11",
9 |     "uvicorn>=0.34.0",
10 |     "httpx>=0.25.0",
11 |     "pydantic>=2.0.0",
12 |     "litellm>=1.40.14",
13 |     "python-dotenv>=1.0.0",
14 | ]
15 |
--------------------------------------------------------------------------------
/server.py:
--------------------------------------------------------------------------------
1 | from fastapi import FastAPI, Request, HTTPException
2 | import uvicorn
3 | import logging
4 | import json
5 | from pydantic import BaseModel, Field, field_validator
6 | from typing import List, Dict, Any, Optional, Union, Literal
7 | import httpx
8 | import os
9 | from fastapi.responses import JSONResponse, StreamingResponse
10 | import litellm
11 | import uuid
12 | import time
13 | from dotenv import load_dotenv
14 | import re
15 | from datetime import datetime
16 | import sys
17 |
18 | # Load environment variables from .env file
19 | load_dotenv()
20 |
21 | # Configure logging
22 | logging.basicConfig(
23 |     level=logging.WARN,  # Change to INFO level to show more details
24 |     format='%(asctime)s - %(levelname)s - %(message)s',
25 | )
26 | logger = logging.getLogger(__name__)
27 |
28 | # Configure uvicorn to be quieter
29 | import uvicorn
30 | # Tell uvicorn's loggers to be quiet
31 | logging.getLogger("uvicorn").setLevel(logging.WARNING)
32 | logging.getLogger("uvicorn.access").setLevel(logging.WARNING)
33 | logging.getLogger("uvicorn.error").setLevel(logging.WARNING)
34 |
35 | # Create a filter to block any log messages containing specific strings
36 | class MessageFilter(logging.Filter):
37 |     def filter(self, record):
38 |         # Block messages containing these strings
39 |         blocked_phrases = [
40 |             "LiteLLM completion()",
41 |             "HTTP Request:",
42 |             "selected model name for cost calculation",
43 |             "utils.py",
44 |             "cost_calculator"
45 |         ]
46 |
47 |         if hasattr(record, 'msg') and isinstance(record.msg, str):
48 |             for phrase in blocked_phrases:
49 |                 if phrase in record.msg:
50 |                     return False
51 |         return True
52 |
53 | # Apply the filter to the root logger to catch all messages
54 | root_logger = logging.getLogger()
55 | root_logger.addFilter(MessageFilter())
56 |
57 | # Custom formatter for model mapping logs
58 | class ColorizedFormatter(logging.Formatter):
59 |     """Custom formatter to highlight model mappings"""
60 |     BLUE = "\033[94m"
61 |     GREEN = "\033[92m"
62 |     YELLOW = "\033[93m"
63 |     RED = "\033[91m"
64 |     RESET = "\033[0m"
65 |     BOLD = "\033[1m"
66 |
67 |     def format(self, record):
68 |         if record.levelno == logging.DEBUG and "MODEL MAPPING" in record.msg:
69 |             # Apply colors and formatting to model mapping logs
70 |             return f"{self.BOLD}{self.GREEN}{record.msg}{self.RESET}"
71 |         return super().format(record)
72 |
73 | # Apply custom formatter to console handler
74 | for handler in logger.handlers:
75 |     if isinstance(handler, logging.StreamHandler):
76 |         handler.setFormatter(ColorizedFormatter('%(asctime)s - %(levelname)s - %(message)s'))
77 |
78 | app = FastAPI()
79 |
80 | # Get API keys from environment
81 | ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")
82 | OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
83 | GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
84 |
85 | # Get preferred provider (default to openai)
86 | PREFERRED_PROVIDER = os.environ.get("PREFERRED_PROVIDER", "openai").lower()
87 |
88 | # Get model mapping configuration from environment
89 | # Default to latest OpenAI models if not set
90 | BIG_MODEL = os.environ.get("BIG_MODEL", "gpt-4.1")
91 | SMALL_MODEL = os.environ.get("SMALL_MODEL", "gpt-4.1-mini")
92 |
93 | # List of OpenAI models
94 | OPENAI_MODELS = [
95 |     "o3-mini",
96 |     "o1",
97 |     "o1-mini",
98 |     "o1-pro",
99 |     "gpt-4.5-preview",
100 |     "gpt-4o",
101 |     "gpt-4o-audio-preview",
102 |     "chatgpt-4o-latest",
103 |     "gpt-4o-mini",
104 |     "gpt-4o-mini-audio-preview",
105 |     "gpt-4.1",  # Added default big model
106 |     "gpt-4.1-mini"  # Added default small model
107 | ]
108 |
109 | # List of Gemini models
110 | GEMINI_MODELS = [
111 |     "gemini-2.5-pro-preview-03-25",
112 |     "gemini-2.0-flash"
113 | ]
114 |
115 | # Helper function to clean schema for Gemini
116 | def clean_gemini_schema(schema: Any) -> Any:
117 |     """Recursively removes unsupported fields from a JSON schema for Gemini."""
118 |     if isinstance(schema, dict):
119 |         # Remove specific keys unsupported by Gemini tool parameters
120 |         schema.pop("additionalProperties", None)
121 |         schema.pop("default", None)
122 |
123 |         # Check for unsupported 'format' in string types
124 |         if schema.get("type") == "string" and "format" in schema:
125 |             allowed_formats = {"enum", "date-time"}
126 |             if schema["format"] not in allowed_formats:
127 |                 logger.debug(f"Removing unsupported format '{schema['format']}' for string type in Gemini schema.")
128 |                 schema.pop("format")
129 |
130 |         # Recursively clean nested schemas (properties, items, etc.)
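        # Illustrative example (hypothetical schema, not taken from a real tool):
        #   {"type": "object", "additionalProperties": False,
        #    "properties": {"when": {"type": "string", "format": "date"}}}
        # comes back as {"type": "object", "properties": {"when": {"type": "string"}}}:
        # "additionalProperties" is popped above, and the recursion below strips the
        # unsupported "format": "date" from the nested string property, while "enum"
        # and "date-time" formats would be preserved.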
131 | for key, value in list(schema.items()): # Use list() to allow modification during iteration 132 | schema[key] = clean_gemini_schema(value) 133 | elif isinstance(schema, list): 134 | # Recursively clean items in a list 135 | return [clean_gemini_schema(item) for item in schema] 136 | return schema 137 | 138 | # Models for Anthropic API requests 139 | class ContentBlockText(BaseModel): 140 | type: Literal["text"] 141 | text: str 142 | 143 | class ContentBlockImage(BaseModel): 144 | type: Literal["image"] 145 | source: Dict[str, Any] 146 | 147 | class ContentBlockToolUse(BaseModel): 148 | type: Literal["tool_use"] 149 | id: str 150 | name: str 151 | input: Dict[str, Any] 152 | 153 | class ContentBlockToolResult(BaseModel): 154 | type: Literal["tool_result"] 155 | tool_use_id: str 156 | content: Union[str, List[Dict[str, Any]], Dict[str, Any], List[Any], Any] 157 | 158 | class SystemContent(BaseModel): 159 | type: Literal["text"] 160 | text: str 161 | 162 | class Message(BaseModel): 163 | role: Literal["user", "assistant"] 164 | content: Union[str, List[Union[ContentBlockText, ContentBlockImage, ContentBlockToolUse, ContentBlockToolResult]]] 165 | 166 | class Tool(BaseModel): 167 | name: str 168 | description: Optional[str] = None 169 | input_schema: Dict[str, Any] 170 | 171 | class ThinkingConfig(BaseModel): 172 | enabled: bool 173 | 174 | class MessagesRequest(BaseModel): 175 | model: str 176 | max_tokens: int 177 | messages: List[Message] 178 | system: Optional[Union[str, List[SystemContent]]] = None 179 | stop_sequences: Optional[List[str]] = None 180 | stream: Optional[bool] = False 181 | temperature: Optional[float] = 1.0 182 | top_p: Optional[float] = None 183 | top_k: Optional[int] = None 184 | metadata: Optional[Dict[str, Any]] = None 185 | tools: Optional[List[Tool]] = None 186 | tool_choice: Optional[Dict[str, Any]] = None 187 | thinking: Optional[ThinkingConfig] = None 188 | original_model: Optional[str] = None # Will store the original model name 189 | 190 | @field_validator('model') 191 | def validate_model_field(cls, v, info): # Renamed to avoid conflict 192 | original_model = v 193 | new_model = v # Default to original value 194 | 195 | logger.debug(f"📋 MODEL VALIDATION: Original='{original_model}', Preferred='{PREFERRED_PROVIDER}', BIG='{BIG_MODEL}', SMALL='{SMALL_MODEL}'") 196 | 197 | # Remove provider prefixes for easier matching 198 | clean_v = v 199 | if clean_v.startswith('anthropic/'): 200 | clean_v = clean_v[10:] 201 | elif clean_v.startswith('openai/'): 202 | clean_v = clean_v[7:] 203 | elif clean_v.startswith('gemini/'): 204 | clean_v = clean_v[7:] 205 | 206 | # --- Mapping Logic --- START --- 207 | mapped = False 208 | # Map Haiku to SMALL_MODEL based on provider preference 209 | if 'haiku' in clean_v.lower(): 210 | if PREFERRED_PROVIDER == "google" and SMALL_MODEL in GEMINI_MODELS: 211 | new_model = f"gemini/{SMALL_MODEL}" 212 | mapped = True 213 | else: 214 | new_model = f"openai/{SMALL_MODEL}" 215 | mapped = True 216 | 217 | # Map Sonnet to BIG_MODEL based on provider preference 218 | elif 'sonnet' in clean_v.lower(): 219 | if PREFERRED_PROVIDER == "google" and BIG_MODEL in GEMINI_MODELS: 220 | new_model = f"gemini/{BIG_MODEL}" 221 | mapped = True 222 | else: 223 | new_model = f"openai/{BIG_MODEL}" 224 | mapped = True 225 | 226 | # Add prefixes to non-mapped models if they match known lists 227 | elif not mapped: 228 | if clean_v in GEMINI_MODELS and not v.startswith('gemini/'): 229 | new_model = f"gemini/{clean_v}" 230 | mapped = True # Technically mapped 
to add prefix 231 | elif clean_v in OPENAI_MODELS and not v.startswith('openai/'): 232 | new_model = f"openai/{clean_v}" 233 | mapped = True # Technically mapped to add prefix 234 | # --- Mapping Logic --- END --- 235 | 236 | if mapped: 237 | logger.debug(f"📌 MODEL MAPPING: '{original_model}' ➡️ '{new_model}'") 238 | else: 239 | # If no mapping occurred and no prefix exists, log warning or decide default 240 | if not v.startswith(('openai/', 'gemini/', 'anthropic/')): 241 | logger.warning(f"⚠️ No prefix or mapping rule for model: '{original_model}'. Using as is.") 242 | new_model = v # Ensure we return the original if no rule applied 243 | 244 | # Store the original model in the values dictionary 245 | values = info.data 246 | if isinstance(values, dict): 247 | values['original_model'] = original_model 248 | 249 | return new_model 250 | 251 | class TokenCountRequest(BaseModel): 252 | model: str 253 | messages: List[Message] 254 | system: Optional[Union[str, List[SystemContent]]] = None 255 | tools: Optional[List[Tool]] = None 256 | thinking: Optional[ThinkingConfig] = None 257 | tool_choice: Optional[Dict[str, Any]] = None 258 | original_model: Optional[str] = None # Will store the original model name 259 | 260 | @field_validator('model') 261 | def validate_model_token_count(cls, v, info): # Renamed to avoid conflict 262 | # Use the same logic as MessagesRequest validator 263 | # NOTE: Pydantic validators might not share state easily if not class methods 264 | # Re-implementing the logic here for clarity, could be refactored 265 | original_model = v 266 | new_model = v # Default to original value 267 | 268 | logger.debug(f"📋 TOKEN COUNT VALIDATION: Original='{original_model}', Preferred='{PREFERRED_PROVIDER}', BIG='{BIG_MODEL}', SMALL='{SMALL_MODEL}'") 269 | 270 | # Remove provider prefixes for easier matching 271 | clean_v = v 272 | if clean_v.startswith('anthropic/'): 273 | clean_v = clean_v[10:] 274 | elif clean_v.startswith('openai/'): 275 | clean_v = clean_v[7:] 276 | elif clean_v.startswith('gemini/'): 277 | clean_v = clean_v[7:] 278 | 279 | # --- Mapping Logic --- START --- 280 | mapped = False 281 | # Map Haiku to SMALL_MODEL based on provider preference 282 | if 'haiku' in clean_v.lower(): 283 | if PREFERRED_PROVIDER == "google" and SMALL_MODEL in GEMINI_MODELS: 284 | new_model = f"gemini/{SMALL_MODEL}" 285 | mapped = True 286 | else: 287 | new_model = f"openai/{SMALL_MODEL}" 288 | mapped = True 289 | 290 | # Map Sonnet to BIG_MODEL based on provider preference 291 | elif 'sonnet' in clean_v.lower(): 292 | if PREFERRED_PROVIDER == "google" and BIG_MODEL in GEMINI_MODELS: 293 | new_model = f"gemini/{BIG_MODEL}" 294 | mapped = True 295 | else: 296 | new_model = f"openai/{BIG_MODEL}" 297 | mapped = True 298 | 299 | # Add prefixes to non-mapped models if they match known lists 300 | elif not mapped: 301 | if clean_v in GEMINI_MODELS and not v.startswith('gemini/'): 302 | new_model = f"gemini/{clean_v}" 303 | mapped = True # Technically mapped to add prefix 304 | elif clean_v in OPENAI_MODELS and not v.startswith('openai/'): 305 | new_model = f"openai/{clean_v}" 306 | mapped = True # Technically mapped to add prefix 307 | # --- Mapping Logic --- END --- 308 | 309 | if mapped: 310 | logger.debug(f"📌 TOKEN COUNT MAPPING: '{original_model}' ➡️ '{new_model}'") 311 | else: 312 | if not v.startswith(('openai/', 'gemini/', 'anthropic/')): 313 | logger.warning(f"⚠️ No prefix or mapping rule for token count model: '{original_model}'. 
Using as is.") 314 | new_model = v # Ensure we return the original if no rule applied 315 | 316 | # Store the original model in the values dictionary 317 | values = info.data 318 | if isinstance(values, dict): 319 | values['original_model'] = original_model 320 | 321 | return new_model 322 | 323 | class TokenCountResponse(BaseModel): 324 | input_tokens: int 325 | 326 | class Usage(BaseModel): 327 | input_tokens: int 328 | output_tokens: int 329 | cache_creation_input_tokens: int = 0 330 | cache_read_input_tokens: int = 0 331 | 332 | class MessagesResponse(BaseModel): 333 | id: str 334 | model: str 335 | role: Literal["assistant"] = "assistant" 336 | content: List[Union[ContentBlockText, ContentBlockToolUse]] 337 | type: Literal["message"] = "message" 338 | stop_reason: Optional[Literal["end_turn", "max_tokens", "stop_sequence", "tool_use"]] = None 339 | stop_sequence: Optional[str] = None 340 | usage: Usage 341 | 342 | @app.middleware("http") 343 | async def log_requests(request: Request, call_next): 344 | # Get request details 345 | method = request.method 346 | path = request.url.path 347 | 348 | # Log only basic request details at debug level 349 | logger.debug(f"Request: {method} {path}") 350 | 351 | # Process the request and get the response 352 | response = await call_next(request) 353 | 354 | return response 355 | 356 | # Not using validation function as we're using the environment API key 357 | 358 | def parse_tool_result_content(content): 359 | """Helper function to properly parse and normalize tool result content.""" 360 | if content is None: 361 | return "No content provided" 362 | 363 | if isinstance(content, str): 364 | return content 365 | 366 | if isinstance(content, list): 367 | result = "" 368 | for item in content: 369 | if isinstance(item, dict) and item.get("type") == "text": 370 | result += item.get("text", "") + "\n" 371 | elif isinstance(item, str): 372 | result += item + "\n" 373 | elif isinstance(item, dict): 374 | if "text" in item: 375 | result += item.get("text", "") + "\n" 376 | else: 377 | try: 378 | result += json.dumps(item) + "\n" 379 | except: 380 | result += str(item) + "\n" 381 | else: 382 | try: 383 | result += str(item) + "\n" 384 | except: 385 | result += "Unparseable content\n" 386 | return result.strip() 387 | 388 | if isinstance(content, dict): 389 | if content.get("type") == "text": 390 | return content.get("text", "") 391 | try: 392 | return json.dumps(content) 393 | except: 394 | return str(content) 395 | 396 | # Fallback for any other type 397 | try: 398 | return str(content) 399 | except: 400 | return "Unparseable content" 401 | 402 | def convert_anthropic_to_litellm(anthropic_request: MessagesRequest) -> Dict[str, Any]: 403 | """Convert Anthropic API request format to LiteLLM format (which follows OpenAI).""" 404 | # LiteLLM already handles Anthropic models when using the format model="anthropic/claude-3-opus-20240229" 405 | # So we just need to convert our Pydantic model to a dict in the expected format 406 | 407 | messages = [] 408 | 409 | # Add system message if present 410 | if anthropic_request.system: 411 | # Handle different formats of system messages 412 | if isinstance(anthropic_request.system, str): 413 | # Simple string format 414 | messages.append({"role": "system", "content": anthropic_request.system}) 415 | elif isinstance(anthropic_request.system, list): 416 | # List of content blocks 417 | system_text = "" 418 | for block in anthropic_request.system: 419 | if hasattr(block, 'type') and block.type == "text": 420 | system_text 
+= block.text + "\n\n" 421 | elif isinstance(block, dict) and block.get("type") == "text": 422 | system_text += block.get("text", "") + "\n\n" 423 | 424 | if system_text: 425 | messages.append({"role": "system", "content": system_text.strip()}) 426 | 427 | # Add conversation messages 428 | for idx, msg in enumerate(anthropic_request.messages): 429 | content = msg.content 430 | if isinstance(content, str): 431 | messages.append({"role": msg.role, "content": content}) 432 | else: 433 | # Special handling for tool_result in user messages 434 | # OpenAI/LiteLLM format expects the assistant to call the tool, 435 | # and the user's next message to include the result as plain text 436 | if msg.role == "user" and any(block.type == "tool_result" for block in content if hasattr(block, "type")): 437 | # For user messages with tool_result, split into separate messages 438 | text_content = "" 439 | 440 | # Extract all text parts and concatenate them 441 | for block in content: 442 | if hasattr(block, "type"): 443 | if block.type == "text": 444 | text_content += block.text + "\n" 445 | elif block.type == "tool_result": 446 | # Add tool result as a message by itself - simulate the normal flow 447 | tool_id = block.tool_use_id if hasattr(block, "tool_use_id") else "" 448 | 449 | # Handle different formats of tool result content 450 | result_content = "" 451 | if hasattr(block, "content"): 452 | if isinstance(block.content, str): 453 | result_content = block.content 454 | elif isinstance(block.content, list): 455 | # If content is a list of blocks, extract text from each 456 | for content_block in block.content: 457 | if hasattr(content_block, "type") and content_block.type == "text": 458 | result_content += content_block.text + "\n" 459 | elif isinstance(content_block, dict) and content_block.get("type") == "text": 460 | result_content += content_block.get("text", "") + "\n" 461 | elif isinstance(content_block, dict): 462 | # Handle any dict by trying to extract text or convert to JSON 463 | if "text" in content_block: 464 | result_content += content_block.get("text", "") + "\n" 465 | else: 466 | try: 467 | result_content += json.dumps(content_block) + "\n" 468 | except: 469 | result_content += str(content_block) + "\n" 470 | elif isinstance(block.content, dict): 471 | # Handle dictionary content 472 | if block.content.get("type") == "text": 473 | result_content = block.content.get("text", "") 474 | else: 475 | try: 476 | result_content = json.dumps(block.content) 477 | except: 478 | result_content = str(block.content) 479 | else: 480 | # Handle any other type by converting to string 481 | try: 482 | result_content = str(block.content) 483 | except: 484 | result_content = "Unparseable content" 485 | 486 | # In OpenAI format, tool results come from the user (rather than being content blocks) 487 | text_content += f"Tool result for {tool_id}:\n{result_content}\n" 488 | 489 | # Add as a single user message with all the content 490 | messages.append({"role": "user", "content": text_content.strip()}) 491 | else: 492 | # Regular handling for other message types 493 | processed_content = [] 494 | for block in content: 495 | if hasattr(block, "type"): 496 | if block.type == "text": 497 | processed_content.append({"type": "text", "text": block.text}) 498 | elif block.type == "image": 499 | processed_content.append({"type": "image", "source": block.source}) 500 | elif block.type == "tool_use": 501 | # Handle tool use blocks if needed 502 | processed_content.append({ 503 | "type": "tool_use", 504 | "id": block.id, 
505 | "name": block.name, 506 | "input": block.input 507 | }) 508 | elif block.type == "tool_result": 509 | # Handle different formats of tool result content 510 | processed_content_block = { 511 | "type": "tool_result", 512 | "tool_use_id": block.tool_use_id if hasattr(block, "tool_use_id") else "" 513 | } 514 | 515 | # Process the content field properly 516 | if hasattr(block, "content"): 517 | if isinstance(block.content, str): 518 | # If it's a simple string, create a text block for it 519 | processed_content_block["content"] = [{"type": "text", "text": block.content}] 520 | elif isinstance(block.content, list): 521 | # If it's already a list of blocks, keep it 522 | processed_content_block["content"] = block.content 523 | else: 524 | # Default fallback 525 | processed_content_block["content"] = [{"type": "text", "text": str(block.content)}] 526 | else: 527 | # Default empty content 528 | processed_content_block["content"] = [{"type": "text", "text": ""}] 529 | 530 | processed_content.append(processed_content_block) 531 | 532 | messages.append({"role": msg.role, "content": processed_content}) 533 | 534 | # Cap max_tokens for OpenAI models to their limit of 16384 535 | max_tokens = anthropic_request.max_tokens 536 | if anthropic_request.model.startswith("openai/") or anthropic_request.model.startswith("gemini/"): 537 | max_tokens = min(max_tokens, 16384) 538 | logger.debug(f"Capping max_tokens to 16384 for OpenAI/Gemini model (original value: {anthropic_request.max_tokens})") 539 | 540 | # Create LiteLLM request dict 541 | litellm_request = { 542 | "model": anthropic_request.model, # t understands "anthropic/claude-x" format 543 | "messages": messages, 544 | "max_tokens": max_tokens, 545 | "temperature": anthropic_request.temperature, 546 | "stream": anthropic_request.stream, 547 | } 548 | 549 | # Add optional parameters if present 550 | if anthropic_request.stop_sequences: 551 | litellm_request["stop"] = anthropic_request.stop_sequences 552 | 553 | if anthropic_request.top_p: 554 | litellm_request["top_p"] = anthropic_request.top_p 555 | 556 | if anthropic_request.top_k: 557 | litellm_request["top_k"] = anthropic_request.top_k 558 | 559 | # Convert tools to OpenAI format 560 | if anthropic_request.tools: 561 | openai_tools = [] 562 | is_gemini_model = anthropic_request.model.startswith("gemini/") 563 | 564 | for tool in anthropic_request.tools: 565 | # Convert to dict if it's a pydantic model 566 | if hasattr(tool, 'dict'): 567 | tool_dict = tool.dict() 568 | else: 569 | # Ensure tool_dict is a dictionary, handle potential errors if 'tool' isn't dict-like 570 | try: 571 | tool_dict = dict(tool) if not isinstance(tool, dict) else tool 572 | except (TypeError, ValueError): 573 | logger.error(f"Could not convert tool to dict: {tool}") 574 | continue # Skip this tool if conversion fails 575 | 576 | # Clean the schema if targeting a Gemini model 577 | input_schema = tool_dict.get("input_schema", {}) 578 | if is_gemini_model: 579 | logger.debug(f"Cleaning schema for Gemini tool: {tool_dict.get('name')}") 580 | input_schema = clean_gemini_schema(input_schema) 581 | 582 | # Create OpenAI-compatible function tool 583 | openai_tool = { 584 | "type": "function", 585 | "function": { 586 | "name": tool_dict["name"], 587 | "description": tool_dict.get("description", ""), 588 | "parameters": input_schema # Use potentially cleaned schema 589 | } 590 | } 591 | openai_tools.append(openai_tool) 592 | 593 | litellm_request["tools"] = openai_tools 594 | 595 | # Convert tool_choice to OpenAI format if 
present 596 | if anthropic_request.tool_choice: 597 | if hasattr(anthropic_request.tool_choice, 'dict'): 598 | tool_choice_dict = anthropic_request.tool_choice.dict() 599 | else: 600 | tool_choice_dict = anthropic_request.tool_choice 601 | 602 | # Handle Anthropic's tool_choice format 603 | choice_type = tool_choice_dict.get("type") 604 | if choice_type == "auto": 605 | litellm_request["tool_choice"] = "auto" 606 | elif choice_type == "any": 607 | litellm_request["tool_choice"] = "any" 608 | elif choice_type == "tool" and "name" in tool_choice_dict: 609 | litellm_request["tool_choice"] = { 610 | "type": "function", 611 | "function": {"name": tool_choice_dict["name"]} 612 | } 613 | else: 614 | # Default to auto if we can't determine 615 | litellm_request["tool_choice"] = "auto" 616 | 617 | return litellm_request 618 | 619 | def convert_litellm_to_anthropic(litellm_response: Union[Dict[str, Any], Any], 620 | original_request: MessagesRequest) -> MessagesResponse: 621 | """Convert LiteLLM (OpenAI format) response to Anthropic API response format.""" 622 | 623 | # Enhanced response extraction with better error handling 624 | try: 625 | # Get the clean model name to check capabilities 626 | clean_model = original_request.model 627 | if clean_model.startswith("anthropic/"): 628 | clean_model = clean_model[len("anthropic/"):] 629 | elif clean_model.startswith("openai/"): 630 | clean_model = clean_model[len("openai/"):] 631 | 632 | # Check if this is a Claude model (which supports content blocks) 633 | is_claude_model = clean_model.startswith("claude-") 634 | 635 | # Handle ModelResponse object from LiteLLM 636 | if hasattr(litellm_response, 'choices') and hasattr(litellm_response, 'usage'): 637 | # Extract data from ModelResponse object directly 638 | choices = litellm_response.choices 639 | message = choices[0].message if choices and len(choices) > 0 else None 640 | content_text = message.content if message and hasattr(message, 'content') else "" 641 | tool_calls = message.tool_calls if message and hasattr(message, 'tool_calls') else None 642 | finish_reason = choices[0].finish_reason if choices and len(choices) > 0 else "stop" 643 | usage_info = litellm_response.usage 644 | response_id = getattr(litellm_response, 'id', f"msg_{uuid.uuid4()}") 645 | else: 646 | # For backward compatibility - handle dict responses 647 | # If response is a dict, use it, otherwise try to convert to dict 648 | try: 649 | response_dict = litellm_response if isinstance(litellm_response, dict) else litellm_response.dict() 650 | except AttributeError: 651 | # If .dict() fails, try to use model_dump or __dict__ 652 | try: 653 | response_dict = litellm_response.model_dump() if hasattr(litellm_response, 'model_dump') else litellm_response.__dict__ 654 | except AttributeError: 655 | # Fallback - manually extract attributes 656 | response_dict = { 657 | "id": getattr(litellm_response, 'id', f"msg_{uuid.uuid4()}"), 658 | "choices": getattr(litellm_response, 'choices', [{}]), 659 | "usage": getattr(litellm_response, 'usage', {}) 660 | } 661 | 662 | # Extract the content from the response dict 663 | choices = response_dict.get("choices", [{}]) 664 | message = choices[0].get("message", {}) if choices and len(choices) > 0 else {} 665 | content_text = message.get("content", "") 666 | tool_calls = message.get("tool_calls", None) 667 | finish_reason = choices[0].get("finish_reason", "stop") if choices and len(choices) > 0 else "stop" 668 | usage_info = response_dict.get("usage", {}) 669 | response_id = response_dict.get("id", 
f"msg_{uuid.uuid4()}") 670 | 671 | # Create content list for Anthropic format 672 | content = [] 673 | 674 | # Add text content block if present (text might be None or empty for pure tool call responses) 675 | if content_text is not None and content_text != "": 676 | content.append({"type": "text", "text": content_text}) 677 | 678 | # Add tool calls if present (tool_use in Anthropic format) - only for Claude models 679 | if tool_calls and is_claude_model: 680 | logger.debug(f"Processing tool calls: {tool_calls}") 681 | 682 | # Convert to list if it's not already 683 | if not isinstance(tool_calls, list): 684 | tool_calls = [tool_calls] 685 | 686 | for idx, tool_call in enumerate(tool_calls): 687 | logger.debug(f"Processing tool call {idx}: {tool_call}") 688 | 689 | # Extract function data based on whether it's a dict or object 690 | if isinstance(tool_call, dict): 691 | function = tool_call.get("function", {}) 692 | tool_id = tool_call.get("id", f"tool_{uuid.uuid4()}") 693 | name = function.get("name", "") 694 | arguments = function.get("arguments", "{}") 695 | else: 696 | function = getattr(tool_call, "function", None) 697 | tool_id = getattr(tool_call, "id", f"tool_{uuid.uuid4()}") 698 | name = getattr(function, "name", "") if function else "" 699 | arguments = getattr(function, "arguments", "{}") if function else "{}" 700 | 701 | # Convert string arguments to dict if needed 702 | if isinstance(arguments, str): 703 | try: 704 | arguments = json.loads(arguments) 705 | except json.JSONDecodeError: 706 | logger.warning(f"Failed to parse tool arguments as JSON: {arguments}") 707 | arguments = {"raw": arguments} 708 | 709 | logger.debug(f"Adding tool_use block: id={tool_id}, name={name}, input={arguments}") 710 | 711 | content.append({ 712 | "type": "tool_use", 713 | "id": tool_id, 714 | "name": name, 715 | "input": arguments 716 | }) 717 | elif tool_calls and not is_claude_model: 718 | # For non-Claude models, convert tool calls to text format 719 | logger.debug(f"Converting tool calls to text for non-Claude model: {clean_model}") 720 | 721 | # We'll append tool info to the text content 722 | tool_text = "\n\nTool usage:\n" 723 | 724 | # Convert to list if it's not already 725 | if not isinstance(tool_calls, list): 726 | tool_calls = [tool_calls] 727 | 728 | for idx, tool_call in enumerate(tool_calls): 729 | # Extract function data based on whether it's a dict or object 730 | if isinstance(tool_call, dict): 731 | function = tool_call.get("function", {}) 732 | tool_id = tool_call.get("id", f"tool_{uuid.uuid4()}") 733 | name = function.get("name", "") 734 | arguments = function.get("arguments", "{}") 735 | else: 736 | function = getattr(tool_call, "function", None) 737 | tool_id = getattr(tool_call, "id", f"tool_{uuid.uuid4()}") 738 | name = getattr(function, "name", "") if function else "" 739 | arguments = getattr(function, "arguments", "{}") if function else "{}" 740 | 741 | # Convert string arguments to dict if needed 742 | if isinstance(arguments, str): 743 | try: 744 | args_dict = json.loads(arguments) 745 | arguments_str = json.dumps(args_dict, indent=2) 746 | except json.JSONDecodeError: 747 | arguments_str = arguments 748 | else: 749 | arguments_str = json.dumps(arguments, indent=2) 750 | 751 | tool_text += f"Tool: {name}\nArguments: {arguments_str}\n\n" 752 | 753 | # Add or append tool text to content 754 | if content and content[0]["type"] == "text": 755 | content[0]["text"] += tool_text 756 | else: 757 | content.append({"type": "text", "text": tool_text}) 758 | 759 | # Get usage 
information - extract values safely from object or dict 760 | if isinstance(usage_info, dict): 761 | prompt_tokens = usage_info.get("prompt_tokens", 0) 762 | completion_tokens = usage_info.get("completion_tokens", 0) 763 | else: 764 | prompt_tokens = getattr(usage_info, "prompt_tokens", 0) 765 | completion_tokens = getattr(usage_info, "completion_tokens", 0) 766 | 767 | # Map OpenAI finish_reason to Anthropic stop_reason 768 | stop_reason = None 769 | if finish_reason == "stop": 770 | stop_reason = "end_turn" 771 | elif finish_reason == "length": 772 | stop_reason = "max_tokens" 773 | elif finish_reason == "tool_calls": 774 | stop_reason = "tool_use" 775 | else: 776 | stop_reason = "end_turn" # Default 777 | 778 | # Make sure content is never empty 779 | if not content: 780 | content.append({"type": "text", "text": ""}) 781 | 782 | # Create Anthropic-style response 783 | anthropic_response = MessagesResponse( 784 | id=response_id, 785 | model=original_request.model, 786 | role="assistant", 787 | content=content, 788 | stop_reason=stop_reason, 789 | stop_sequence=None, 790 | usage=Usage( 791 | input_tokens=prompt_tokens, 792 | output_tokens=completion_tokens 793 | ) 794 | ) 795 | 796 | return anthropic_response 797 | 798 | except Exception as e: 799 | import traceback 800 | error_traceback = traceback.format_exc() 801 | error_message = f"Error converting response: {str(e)}\n\nFull traceback:\n{error_traceback}" 802 | logger.error(error_message) 803 | 804 | # In case of any error, create a fallback response 805 | return MessagesResponse( 806 | id=f"msg_{uuid.uuid4()}", 807 | model=original_request.model, 808 | role="assistant", 809 | content=[{"type": "text", "text": f"Error converting response: {str(e)}. Please check server logs."}], 810 | stop_reason="end_turn", 811 | usage=Usage(input_tokens=0, output_tokens=0) 812 | ) 813 | 814 | async def handle_streaming(response_generator, original_request: MessagesRequest): 815 | """Handle streaming responses from LiteLLM and convert to Anthropic format.""" 816 | try: 817 | # Send message_start event 818 | message_id = f"msg_{uuid.uuid4().hex[:24]}" # Format similar to Anthropic's IDs 819 | 820 | message_data = { 821 | 'type': 'message_start', 822 | 'message': { 823 | 'id': message_id, 824 | 'type': 'message', 825 | 'role': 'assistant', 826 | 'model': original_request.model, 827 | 'content': [], 828 | 'stop_reason': None, 829 | 'stop_sequence': None, 830 | 'usage': { 831 | 'input_tokens': 0, 832 | 'cache_creation_input_tokens': 0, 833 | 'cache_read_input_tokens': 0, 834 | 'output_tokens': 0 835 | } 836 | } 837 | } 838 | yield f"event: message_start\ndata: {json.dumps(message_data)}\n\n" 839 | 840 | # Content block index for the first text block 841 | yield f"event: content_block_start\ndata: {json.dumps({'type': 'content_block_start', 'index': 0, 'content_block': {'type': 'text', 'text': ''}})}\n\n" 842 | 843 | # Send a ping to keep the connection alive (Anthropic does this) 844 | yield f"event: ping\ndata: {json.dumps({'type': 'ping'})}\n\n" 845 | 846 | tool_index = None 847 | current_tool_call = None 848 | tool_content = "" 849 | accumulated_text = "" # Track accumulated text content 850 | text_sent = False # Track if we've sent any text content 851 | text_block_closed = False # Track if text block is closed 852 | input_tokens = 0 853 | output_tokens = 0 854 | has_sent_stop_reason = False 855 | last_tool_index = 0 856 | 857 | # Process each chunk 858 | async for chunk in response_generator: 859 | try: 860 | 861 | 862 | # Check if this is the end of 
the response with usage data 863 | if hasattr(chunk, 'usage') and chunk.usage is not None: 864 | if hasattr(chunk.usage, 'prompt_tokens'): 865 | input_tokens = chunk.usage.prompt_tokens 866 | if hasattr(chunk.usage, 'completion_tokens'): 867 | output_tokens = chunk.usage.completion_tokens 868 | 869 | # Handle text content 870 | if hasattr(chunk, 'choices') and len(chunk.choices) > 0: 871 | choice = chunk.choices[0] 872 | 873 | # Get the delta from the choice 874 | if hasattr(choice, 'delta'): 875 | delta = choice.delta 876 | else: 877 | # If no delta, try to get message 878 | delta = getattr(choice, 'message', {}) 879 | 880 | # Check for finish_reason to know when we're done 881 | finish_reason = getattr(choice, 'finish_reason', None) 882 | 883 | # Process text content 884 | delta_content = None 885 | 886 | # Handle different formats of delta content 887 | if hasattr(delta, 'content'): 888 | delta_content = delta.content 889 | elif isinstance(delta, dict) and 'content' in delta: 890 | delta_content = delta['content'] 891 | 892 | # Accumulate text content 893 | if delta_content is not None and delta_content != "": 894 | accumulated_text += delta_content 895 | 896 | # Always emit text deltas if no tool calls started 897 | if tool_index is None and not text_block_closed: 898 | text_sent = True 899 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': delta_content}})}\n\n" 900 | 901 | # Process tool calls 902 | delta_tool_calls = None 903 | 904 | # Handle different formats of tool calls 905 | if hasattr(delta, 'tool_calls'): 906 | delta_tool_calls = delta.tool_calls 907 | elif isinstance(delta, dict) and 'tool_calls' in delta: 908 | delta_tool_calls = delta['tool_calls'] 909 | 910 | # Process tool calls if any 911 | if delta_tool_calls: 912 | # First tool call we've seen - need to handle text properly 913 | if tool_index is None: 914 | # If we've been streaming text, close that text block 915 | if text_sent and not text_block_closed: 916 | text_block_closed = True 917 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 918 | # If we've accumulated text but not sent it, we need to emit it now 919 | # This handles the case where the first delta has both text and a tool call 920 | elif accumulated_text and not text_sent and not text_block_closed: 921 | # Send the accumulated text 922 | text_sent = True 923 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': accumulated_text}})}\n\n" 924 | # Close the text block 925 | text_block_closed = True 926 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 927 | # Close text block even if we haven't sent anything - models sometimes emit empty text blocks 928 | elif not text_block_closed: 929 | text_block_closed = True 930 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 931 | 932 | # Convert to list if it's not already 933 | if not isinstance(delta_tool_calls, list): 934 | delta_tool_calls = [delta_tool_calls] 935 | 936 | for tool_call in delta_tool_calls: 937 | # Get the index of this tool call (for multiple tools) 938 | current_index = None 939 | if isinstance(tool_call, dict) and 'index' in tool_call: 940 | current_index = tool_call['index'] 941 | elif hasattr(tool_call, 'index'): 942 | current_index = tool_call.index 943 
| else: 944 | current_index = 0 945 | 946 | # Check if this is a new tool or a continuation 947 | if tool_index is None or current_index != tool_index: 948 | # New tool call - create a new tool_use block 949 | tool_index = current_index 950 | last_tool_index += 1 951 | anthropic_tool_index = last_tool_index 952 | 953 | # Extract function info 954 | if isinstance(tool_call, dict): 955 | function = tool_call.get('function', {}) 956 | name = function.get('name', '') if isinstance(function, dict) else "" 957 | tool_id = tool_call.get('id', f"toolu_{uuid.uuid4().hex[:24]}") 958 | else: 959 | function = getattr(tool_call, 'function', None) 960 | name = getattr(function, 'name', '') if function else '' 961 | tool_id = getattr(tool_call, 'id', f"toolu_{uuid.uuid4().hex[:24]}") 962 | 963 | # Start a new tool_use block 964 | yield f"event: content_block_start\ndata: {json.dumps({'type': 'content_block_start', 'index': anthropic_tool_index, 'content_block': {'type': 'tool_use', 'id': tool_id, 'name': name, 'input': {}}})}\n\n" 965 | current_tool_call = tool_call 966 | tool_content = "" 967 | 968 | # Extract function arguments 969 | arguments = None 970 | if isinstance(tool_call, dict) and 'function' in tool_call: 971 | function = tool_call.get('function', {}) 972 | arguments = function.get('arguments', '') if isinstance(function, dict) else '' 973 | elif hasattr(tool_call, 'function'): 974 | function = getattr(tool_call, 'function', None) 975 | arguments = getattr(function, 'arguments', '') if function else '' 976 | 977 | # If we have arguments, send them as a delta 978 | if arguments: 979 | # Try to detect if arguments are valid JSON or just a fragment 980 | try: 981 | # If it's already a dict, use it 982 | if isinstance(arguments, dict): 983 | args_json = json.dumps(arguments) 984 | else: 985 | # Otherwise, try to parse it 986 | json.loads(arguments) 987 | args_json = arguments 988 | except (json.JSONDecodeError, TypeError): 989 | # If it's a fragment, treat it as a string 990 | args_json = arguments 991 | 992 | # Add to accumulated tool content 993 | tool_content += args_json if isinstance(args_json, str) else "" 994 | 995 | # Send the update 996 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': anthropic_tool_index, 'delta': {'type': 'input_json_delta', 'partial_json': args_json}})}\n\n" 997 | 998 | # Process finish_reason - end the streaming response 999 | if finish_reason and not has_sent_stop_reason: 1000 | has_sent_stop_reason = True 1001 | 1002 | # Close any open tool call blocks 1003 | if tool_index is not None: 1004 | for i in range(1, last_tool_index + 1): 1005 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': i})}\n\n" 1006 | 1007 | # If we accumulated text but never sent or closed text block, do it now 1008 | if not text_block_closed: 1009 | if accumulated_text and not text_sent: 1010 | # Send the accumulated text 1011 | yield f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': accumulated_text}})}\n\n" 1012 | # Close the text block 1013 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 1014 | 1015 | # Map OpenAI finish_reason to Anthropic stop_reason 1016 | stop_reason = "end_turn" 1017 | if finish_reason == "length": 1018 | stop_reason = "max_tokens" 1019 | elif finish_reason == "tool_calls": 1020 | stop_reason = "tool_use" 1021 | elif finish_reason == "stop": 
1022 | stop_reason = "end_turn" 1023 | 1024 | # Send message_delta with stop reason and usage 1025 | usage = {"output_tokens": output_tokens} 1026 | 1027 | yield f"event: message_delta\ndata: {json.dumps({'type': 'message_delta', 'delta': {'stop_reason': stop_reason, 'stop_sequence': None}, 'usage': usage})}\n\n" 1028 | 1029 | # Send message_stop event 1030 | yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n" 1031 | 1032 | # Send final [DONE] marker to match Anthropic's behavior 1033 | yield "data: [DONE]\n\n" 1034 | return 1035 | except Exception as e: 1036 | # Log error but continue processing other chunks 1037 | logger.error(f"Error processing chunk: {str(e)}") 1038 | continue 1039 | 1040 | # If we didn't get a finish reason, close any open blocks 1041 | if not has_sent_stop_reason: 1042 | # Close any open tool call blocks 1043 | if tool_index is not None: 1044 | for i in range(1, last_tool_index + 1): 1045 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': i})}\n\n" 1046 | 1047 | # Close the text content block 1048 | yield f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n" 1049 | 1050 | # Send final message_delta with usage 1051 | usage = {"output_tokens": output_tokens} 1052 | 1053 | yield f"event: message_delta\ndata: {json.dumps({'type': 'message_delta', 'delta': {'stop_reason': 'end_turn', 'stop_sequence': None}, 'usage': usage})}\n\n" 1054 | 1055 | # Send message_stop event 1056 | yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n" 1057 | 1058 | # Send final [DONE] marker to match Anthropic's behavior 1059 | yield "data: [DONE]\n\n" 1060 | 1061 | except Exception as e: 1062 | import traceback 1063 | error_traceback = traceback.format_exc() 1064 | error_message = f"Error in streaming: {str(e)}\n\nFull traceback:\n{error_traceback}" 1065 | logger.error(error_message) 1066 | 1067 | # Send error message_delta 1068 | yield f"event: message_delta\ndata: {json.dumps({'type': 'message_delta', 'delta': {'stop_reason': 'error', 'stop_sequence': None}, 'usage': {'output_tokens': 0}})}\n\n" 1069 | 1070 | # Send message_stop event 1071 | yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n" 1072 | 1073 | # Send final [DONE] marker 1074 | yield "data: [DONE]\n\n" 1075 | 1076 | @app.post("/v1/messages") 1077 | async def create_message( 1078 | request: MessagesRequest, 1079 | raw_request: Request 1080 | ): 1081 | try: 1082 | # print the body here 1083 | body = await raw_request.body() 1084 | 1085 | # Parse the raw body as JSON since it's bytes 1086 | body_json = json.loads(body.decode('utf-8')) 1087 | original_model = body_json.get("model", "unknown") 1088 | 1089 | # Get the display name for logging, just the model name without provider prefix 1090 | display_model = original_model 1091 | if "/" in display_model: 1092 | display_model = display_model.split("/")[-1] 1093 | 1094 | # Clean model name for capability check 1095 | clean_model = request.model 1096 | if clean_model.startswith("anthropic/"): 1097 | clean_model = clean_model[len("anthropic/"):] 1098 | elif clean_model.startswith("openai/"): 1099 | clean_model = clean_model[len("openai/"):] 1100 | 1101 | logger.debug(f"📊 PROCESSING REQUEST: Model={request.model}, Stream={request.stream}") 1102 | 1103 | # Convert Anthropic request to LiteLLM format 1104 | litellm_request = convert_anthropic_to_litellm(request) 1105 | 1106 | # Determine which API key to use based on the model 
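# Note: the key selected below is passed to litellm.completion()/litellm.acompletion()
# via the "api_key" entry of litellm_request, so each request carries the credentials
# for whichever provider prefix ("openai/", "gemini/", or Anthropic passthrough) it mapped to.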
1107 | if request.model.startswith("openai/"): 1108 | litellm_request["api_key"] = OPENAI_API_KEY 1109 | logger.debug(f"Using OpenAI API key for model: {request.model}") 1110 | elif request.model.startswith("gemini/"): 1111 | litellm_request["api_key"] = GEMINI_API_KEY 1112 | logger.debug(f"Using Gemini API key for model: {request.model}") 1113 | else: 1114 | litellm_request["api_key"] = ANTHROPIC_API_KEY 1115 | logger.debug(f"Using Anthropic API key for model: {request.model}") 1116 | 1117 | # For OpenAI models - modify request format to work with limitations 1118 | if "openai" in litellm_request["model"] and "messages" in litellm_request: 1119 | logger.debug(f"Processing OpenAI model request: {litellm_request['model']}") 1120 | 1121 | # For OpenAI models, we need to convert content blocks to simple strings 1122 | # and handle other requirements 1123 | for i, msg in enumerate(litellm_request["messages"]): 1124 | # Special case - handle message content directly when it's a list of tool_result 1125 | # This is a specific case we're seeing in the error 1126 | if "content" in msg and isinstance(msg["content"], list): 1127 | is_only_tool_result = True 1128 | for block in msg["content"]: 1129 | if not isinstance(block, dict) or block.get("type") != "tool_result": 1130 | is_only_tool_result = False 1131 | break 1132 | 1133 | if is_only_tool_result and len(msg["content"]) > 0: 1134 | logger.warning(f"Found message with only tool_result content - special handling required") 1135 | # Extract the content from all tool_result blocks 1136 | all_text = "" 1137 | for block in msg["content"]: 1138 | all_text += "Tool Result:\n" 1139 | result_content = block.get("content", []) 1140 | 1141 | # Handle different formats of content 1142 | if isinstance(result_content, list): 1143 | for item in result_content: 1144 | if isinstance(item, dict) and item.get("type") == "text": 1145 | all_text += item.get("text", "") + "\n" 1146 | elif isinstance(item, dict): 1147 | # Fall back to string representation of any dict 1148 | try: 1149 | item_text = item.get("text", json.dumps(item)) 1150 | all_text += item_text + "\n" 1151 | except: 1152 | all_text += str(item) + "\n" 1153 | elif isinstance(result_content, str): 1154 | all_text += result_content + "\n" 1155 | else: 1156 | try: 1157 | all_text += json.dumps(result_content) + "\n" 1158 | except: 1159 | all_text += str(result_content) + "\n" 1160 | 1161 | # Replace the list with extracted text 1162 | litellm_request["messages"][i]["content"] = all_text.strip() or "..." 1163 | logger.warning(f"Converted tool_result to plain text: {all_text.strip()[:200]}...") 1164 | continue # Skip normal processing for this message 1165 | 1166 | # 1. 
Handle content field - normal case 1167 | if "content" in msg: 1168 | # Check if content is a list (content blocks) 1169 | if isinstance(msg["content"], list): 1170 | # Convert complex content blocks to simple string 1171 | text_content = "" 1172 | for block in msg["content"]: 1173 | if isinstance(block, dict): 1174 | # Handle different content block types 1175 | if block.get("type") == "text": 1176 | text_content += block.get("text", "") + "\n" 1177 | 1178 | # Handle tool_result content blocks - extract nested text 1179 | elif block.get("type") == "tool_result": 1180 | tool_id = block.get("tool_use_id", "unknown") 1181 | text_content += f"[Tool Result ID: {tool_id}]\n" 1182 | 1183 | # Extract text from the tool_result content 1184 | result_content = block.get("content", []) 1185 | if isinstance(result_content, list): 1186 | for item in result_content: 1187 | if isinstance(item, dict) and item.get("type") == "text": 1188 | text_content += item.get("text", "") + "\n" 1189 | elif isinstance(item, dict): 1190 | # Handle any dict by trying to extract text or convert to JSON 1191 | if "text" in item: 1192 | text_content += item.get("text", "") + "\n" 1193 | else: 1194 | try: 1195 | text_content += json.dumps(item) + "\n" 1196 | except: 1197 | text_content += str(item) + "\n" 1198 | elif isinstance(result_content, dict): 1199 | # Handle dictionary content 1200 | if result_content.get("type") == "text": 1201 | text_content += result_content.get("text", "") + "\n" 1202 | else: 1203 | try: 1204 | text_content += json.dumps(result_content) + "\n" 1205 | except: 1206 | text_content += str(result_content) + "\n" 1207 | elif isinstance(result_content, str): 1208 | text_content += result_content + "\n" 1209 | else: 1210 | try: 1211 | text_content += json.dumps(result_content) + "\n" 1212 | except: 1213 | text_content += str(result_content) + "\n" 1214 | 1215 | # Handle tool_use content blocks 1216 | elif block.get("type") == "tool_use": 1217 | tool_name = block.get("name", "unknown") 1218 | tool_id = block.get("id", "unknown") 1219 | tool_input = json.dumps(block.get("input", {})) 1220 | text_content += f"[Tool: {tool_name} (ID: {tool_id})]\nInput: {tool_input}\n\n" 1221 | 1222 | # Handle image content blocks 1223 | elif block.get("type") == "image": 1224 | text_content += "[Image content - not displayed in text format]\n" 1225 | 1226 | # Make sure content is never empty for OpenAI models 1227 | if not text_content.strip(): 1228 | text_content = "..." 1229 | 1230 | litellm_request["messages"][i]["content"] = text_content.strip() 1231 | # Also check for None or empty string content 1232 | elif msg["content"] is None: 1233 | litellm_request["messages"][i]["content"] = "..." # Empty content not allowed 1234 | 1235 | # 2. Remove any fields OpenAI doesn't support in messages 1236 | for key in list(msg.keys()): 1237 | if key not in ["role", "content", "name", "tool_call_id", "tool_calls"]: 1238 | logger.warning(f"Removing unsupported field from message: {key}") 1239 | del msg[key] 1240 | 1241 | # 3. 
Final validation - check for any remaining invalid values and dump full message details 1242 | for i, msg in enumerate(litellm_request["messages"]): 1243 | # Log the message format for debugging 1244 | logger.debug(f"Message {i} format check - role: {msg.get('role')}, content type: {type(msg.get('content'))}") 1245 | 1246 | # If content is still a list or None, replace with placeholder 1247 | if isinstance(msg.get("content"), list): 1248 | logger.warning(f"CRITICAL: Message {i} still has list content after processing: {json.dumps(msg.get('content'))}") 1249 | # Last resort - stringify the entire content as JSON 1250 | litellm_request["messages"][i]["content"] = f"Content as JSON: {json.dumps(msg.get('content'))}" 1251 | elif msg.get("content") is None: 1252 | logger.warning(f"Message {i} has None content - replacing with placeholder") 1253 | litellm_request["messages"][i]["content"] = "..." # Fallback placeholder 1254 | 1255 | # Only log basic info about the request, not the full details 1256 | logger.debug(f"Request for model: {litellm_request.get('model')}, stream: {litellm_request.get('stream', False)}") 1257 | 1258 | # Handle streaming mode 1259 | if request.stream: 1260 | # Use LiteLLM for streaming 1261 | num_tools = len(request.tools) if request.tools else 0 1262 | 1263 | log_request_beautifully( 1264 | "POST", 1265 | raw_request.url.path, 1266 | display_model, 1267 | litellm_request.get('model'), 1268 | len(litellm_request['messages']), 1269 | num_tools, 1270 | 200 # Assuming success at this point 1271 | ) 1272 | # Ensure we use the async version for streaming 1273 | response_generator = await litellm.acompletion(**litellm_request) 1274 | 1275 | return StreamingResponse( 1276 | handle_streaming(response_generator, request), 1277 | media_type="text/event-stream" 1278 | ) 1279 | else: 1280 | # Use LiteLLM for regular completion 1281 | num_tools = len(request.tools) if request.tools else 0 1282 | 1283 | log_request_beautifully( 1284 | "POST", 1285 | raw_request.url.path, 1286 | display_model, 1287 | litellm_request.get('model'), 1288 | len(litellm_request['messages']), 1289 | num_tools, 1290 | 200 # Assuming success at this point 1291 | ) 1292 | start_time = time.time() 1293 | litellm_response = litellm.completion(**litellm_request) 1294 | logger.debug(f"✅ RESPONSE RECEIVED: Model={litellm_request.get('model')}, Time={time.time() - start_time:.2f}s") 1295 | 1296 | # Convert LiteLLM response to Anthropic format 1297 | anthropic_response = convert_litellm_to_anthropic(litellm_response, request) 1298 | 1299 | return anthropic_response 1300 | 1301 | except Exception as e: 1302 | import traceback 1303 | error_traceback = traceback.format_exc() 1304 | 1305 | # Capture as much info as possible about the error 1306 | error_details = { 1307 | "error": str(e), 1308 | "type": type(e).__name__, 1309 | "traceback": error_traceback 1310 | } 1311 | 1312 | # Check for LiteLLM-specific attributes 1313 | for attr in ['message', 'status_code', 'response', 'llm_provider', 'model']: 1314 | if hasattr(e, attr): 1315 | error_details[attr] = getattr(e, attr) 1316 | 1317 | # Check for additional exception details in dictionaries 1318 | if hasattr(e, '__dict__'): 1319 | for key, value in e.__dict__.items(): 1320 | if key not in error_details and key not in ['args', '__traceback__']: 1321 | error_details[key] = str(value) 1322 | 1323 | # Log all error details 1324 | logger.error(f"Error processing request: {json.dumps(error_details, indent=2)}") 1325 | 1326 | # Format error for response 1327 | error_message = 
f"Error: {str(e)}" 1328 | if 'message' in error_details and error_details['message']: 1329 | error_message += f"\nMessage: {error_details['message']}" 1330 | if 'response' in error_details and error_details['response']: 1331 | error_message += f"\nResponse: {error_details['response']}" 1332 | 1333 | # Return detailed error 1334 | status_code = error_details.get('status_code', 500) 1335 | raise HTTPException(status_code=status_code, detail=error_message) 1336 | 1337 | @app.post("/v1/messages/count_tokens") 1338 | async def count_tokens( 1339 | request: TokenCountRequest, 1340 | raw_request: Request 1341 | ): 1342 | try: 1343 | # Log the incoming token count request 1344 | original_model = request.original_model or request.model 1345 | 1346 | # Get the display name for logging, just the model name without provider prefix 1347 | display_model = original_model 1348 | if "/" in display_model: 1349 | display_model = display_model.split("/")[-1] 1350 | 1351 | # Clean model name for capability check 1352 | clean_model = request.model 1353 | if clean_model.startswith("anthropic/"): 1354 | clean_model = clean_model[len("anthropic/"):] 1355 | elif clean_model.startswith("openai/"): 1356 | clean_model = clean_model[len("openai/"):] 1357 | 1358 | # Convert the messages to a format LiteLLM can understand 1359 | converted_request = convert_anthropic_to_litellm( 1360 | MessagesRequest( 1361 | model=request.model, 1362 | max_tokens=100, # Arbitrary value not used for token counting 1363 | messages=request.messages, 1364 | system=request.system, 1365 | tools=request.tools, 1366 | tool_choice=request.tool_choice, 1367 | thinking=request.thinking 1368 | ) 1369 | ) 1370 | 1371 | # Use LiteLLM's token_counter function 1372 | try: 1373 | # Import token_counter function 1374 | from litellm import token_counter 1375 | 1376 | # Log the request beautifully 1377 | num_tools = len(request.tools) if request.tools else 0 1378 | 1379 | log_request_beautifully( 1380 | "POST", 1381 | raw_request.url.path, 1382 | display_model, 1383 | converted_request.get('model'), 1384 | len(converted_request['messages']), 1385 | num_tools, 1386 | 200 # Assuming success at this point 1387 | ) 1388 | 1389 | # Count tokens 1390 | token_count = token_counter( 1391 | model=converted_request["model"], 1392 | messages=converted_request["messages"], 1393 | ) 1394 | 1395 | # Return Anthropic-style response 1396 | return TokenCountResponse(input_tokens=token_count) 1397 | 1398 | except ImportError: 1399 | logger.error("Could not import token_counter from litellm") 1400 | # Fallback to a simple approximation 1401 | return TokenCountResponse(input_tokens=1000) # Default fallback 1402 | 1403 | except Exception as e: 1404 | import traceback 1405 | error_traceback = traceback.format_exc() 1406 | logger.error(f"Error counting tokens: {str(e)}\n{error_traceback}") 1407 | raise HTTPException(status_code=500, detail=f"Error counting tokens: {str(e)}") 1408 | 1409 | @app.get("/") 1410 | async def root(): 1411 | return {"message": "Anthropic Proxy for LiteLLM"} 1412 | 1413 | # Define ANSI color codes for terminal output 1414 | class Colors: 1415 | CYAN = "\033[96m" 1416 | BLUE = "\033[94m" 1417 | GREEN = "\033[92m" 1418 | YELLOW = "\033[93m" 1419 | RED = "\033[91m" 1420 | MAGENTA = "\033[95m" 1421 | RESET = "\033[0m" 1422 | BOLD = "\033[1m" 1423 | UNDERLINE = "\033[4m" 1424 | DIM = "\033[2m" 1425 | def log_request_beautifully(method, path, claude_model, openai_model, num_messages, num_tools, status_code): 1426 | """Log requests in a beautiful, 
twitter-friendly format showing Claude to OpenAI mapping.""" 1427 | # Format the Claude model name nicely 1428 | claude_display = f"{Colors.CYAN}{claude_model}{Colors.RESET}" 1429 | 1430 | # Extract endpoint name 1431 | endpoint = path 1432 | if "?" in endpoint: 1433 | endpoint = endpoint.split("?")[0] 1434 | 1435 | # Extract just the OpenAI model name without provider prefix 1436 | openai_display = openai_model 1437 | if "/" in openai_display: 1438 | openai_display = openai_display.split("/")[-1] 1439 | openai_display = f"{Colors.GREEN}{openai_display}{Colors.RESET}" 1440 | 1441 | # Format tools and messages 1442 | tools_str = f"{Colors.MAGENTA}{num_tools} tools{Colors.RESET}" 1443 | messages_str = f"{Colors.BLUE}{num_messages} messages{Colors.RESET}" 1444 | 1445 | # Format status code 1446 | status_str = f"{Colors.GREEN}✓ {status_code} OK{Colors.RESET}" if status_code == 200 else f"{Colors.RED}✗ {status_code}{Colors.RESET}" 1447 | 1448 | 1449 | # Put it all together in a clear, beautiful format 1450 | log_line = f"{Colors.BOLD}{method} {endpoint}{Colors.RESET} {status_str}" 1451 | model_line = f"{claude_display} → {openai_display} {tools_str} {messages_str}" 1452 | 1453 | # Print to console 1454 | print(log_line) 1455 | print(model_line) 1456 | sys.stdout.flush() 1457 | 1458 | if __name__ == "__main__": 1459 | import sys 1460 | if len(sys.argv) > 1 and sys.argv[1] == "--help": 1461 | print("Run with: uvicorn server:app --reload --host 0.0.0.0 --port 8082") 1462 | sys.exit(0) 1463 | 1464 | # Configure uvicorn to run with minimal logs 1465 | uvicorn.run(app, host="0.0.0.0", port=8082, log_level="error") -------------------------------------------------------------------------------- /tests.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | """ 3 | Comprehensive test suite for Claude-on-OpenAI Proxy. 4 | 5 | This script provides tests for both streaming and non-streaming requests, 6 | with various scenarios including tool use, multi-turn conversations, 7 | and content blocks. 
8 | 9 | Usage: 10 | python tests.py # Run all tests 11 | python tests.py --no-streaming # Skip streaming tests 12 | python tests.py --simple # Run only simple tests 13 | python tests.py --tools # Run tool-related tests only 14 | """ 15 | 16 | import os 17 | import json 18 | import time 19 | import httpx 20 | import argparse 21 | import asyncio 22 | import sys 23 | from datetime import datetime 24 | from typing import Dict, Any, List, Optional, Set 25 | from dotenv import load_dotenv 26 | 27 | # Load environment variables 28 | load_dotenv() 29 | 30 | # Configuration 31 | ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY") 32 | PROXY_API_KEY = os.environ.get("ANTHROPIC_API_KEY") # Using same key for proxy 33 | ANTHROPIC_API_URL = "https://api.anthropic.com/v1/messages" 34 | PROXY_API_URL = "http://localhost:8082/v1/messages" 35 | ANTHROPIC_VERSION = "2023-06-01" 36 | MODEL = "claude-3-sonnet-20240229" # Change to your preferred model 37 | 38 | # Headers 39 | anthropic_headers = { 40 | "x-api-key": ANTHROPIC_API_KEY, 41 | "anthropic-version": ANTHROPIC_VERSION, 42 | "content-type": "application/json", 43 | } 44 | 45 | proxy_headers = { 46 | "x-api-key": PROXY_API_KEY, 47 | "anthropic-version": ANTHROPIC_VERSION, 48 | "content-type": "application/json", 49 | } 50 | 51 | # Tool definitions 52 | calculator_tool = { 53 | "name": "calculator", 54 | "description": "Evaluate mathematical expressions", 55 | "input_schema": { 56 | "type": "object", 57 | "properties": { 58 | "expression": { 59 | "type": "string", 60 | "description": "The mathematical expression to evaluate" 61 | } 62 | }, 63 | "required": ["expression"] 64 | } 65 | } 66 | 67 | weather_tool = { 68 | "name": "weather", 69 | "description": "Get weather information for a location", 70 | "input_schema": { 71 | "type": "object", 72 | "properties": { 73 | "location": { 74 | "type": "string", 75 | "description": "The city or location to get weather for" 76 | }, 77 | "units": { 78 | "type": "string", 79 | "enum": ["celsius", "fahrenheit"], 80 | "description": "Temperature units" 81 | } 82 | }, 83 | "required": ["location"] 84 | } 85 | } 86 | 87 | search_tool = { 88 | "name": "search", 89 | "description": "Search for information on the web", 90 | "input_schema": { 91 | "type": "object", 92 | "properties": { 93 | "query": { 94 | "type": "string", 95 | "description": "The search query" 96 | } 97 | }, 98 | "required": ["query"] 99 | } 100 | } 101 | 102 | # Test scenarios 103 | TEST_SCENARIOS = { 104 | # Simple text response 105 | "simple": { 106 | "model": MODEL, 107 | "max_tokens": 300, 108 | "messages": [ 109 | {"role": "user", "content": "Hello, world! Can you tell me about Paris in 2-3 sentences?"} 110 | ] 111 | }, 112 | 113 | # Basic tool use 114 | "calculator": { 115 | "model": MODEL, 116 | "max_tokens": 300, 117 | "messages": [ 118 | {"role": "user", "content": "What is 135 + 7.5 divided by 2.5?"} 119 | ], 120 | "tools": [calculator_tool], 121 | "tool_choice": {"type": "auto"} 122 | }, 123 | 124 | # Multiple tools 125 | "multi_tool": { 126 | "model": MODEL, 127 | "max_tokens": 500, 128 | "temperature": 0.7, 129 | "top_p": 0.95, 130 | "system": "You are a helpful assistant that uses tools when appropriate. Be concise and precise.", 131 | "messages": [ 132 | {"role": "user", "content": "I'm planning a trip to New York next week. 
What's the weather like and what are some interesting places to visit?"} 133 | ], 134 | "tools": [weather_tool, search_tool], 135 | "tool_choice": {"type": "auto"} 136 | }, 137 | 138 | # Multi-turn conversation 139 | "multi_turn": { 140 | "model": MODEL, 141 | "max_tokens": 500, 142 | "messages": [ 143 | {"role": "user", "content": "Let's do some math. What is 240 divided by 8?"}, 144 | {"role": "assistant", "content": "To calculate 240 divided by 8, I'll perform the division:\n\n240 ÷ 8 = 30\n\nSo the result is 30."}, 145 | {"role": "user", "content": "Now multiply that by 4 and tell me the result."} 146 | ], 147 | "tools": [calculator_tool], 148 | "tool_choice": {"type": "auto"} 149 | }, 150 | 151 | # Content blocks 152 | "content_blocks": { 153 | "model": MODEL, 154 | "max_tokens": 500, 155 | "messages": [ 156 | {"role": "user", "content": [ 157 | {"type": "text", "text": "I need to know the weather in Los Angeles and calculate 75.5 / 5. Can you help with both?"} 158 | ]} 159 | ], 160 | "tools": [calculator_tool, weather_tool], 161 | "tool_choice": {"type": "auto"} 162 | }, 163 | 164 | # Simple streaming test 165 | "simple_stream": { 166 | "model": MODEL, 167 | "max_tokens": 100, 168 | "stream": True, 169 | "messages": [ 170 | {"role": "user", "content": "Count from 1 to 5, with one number per line."} 171 | ] 172 | }, 173 | 174 | # Tool use with streaming 175 | "calculator_stream": { 176 | "model": MODEL, 177 | "max_tokens": 300, 178 | "stream": True, 179 | "messages": [ 180 | {"role": "user", "content": "What is 135 + 17.5 divided by 2.5?"} 181 | ], 182 | "tools": [calculator_tool], 183 | "tool_choice": {"type": "auto"} 184 | } 185 | } 186 | 187 | # Required event types for Anthropic streaming responses 188 | REQUIRED_EVENT_TYPES = { 189 | "message_start", 190 | "content_block_start", 191 | "content_block_delta", 192 | "content_block_stop", 193 | "message_delta", 194 | "message_stop" 195 | } 196 | 197 | # ================= NON-STREAMING TESTS ================= 198 | 199 | def get_response(url, headers, data): 200 | """Send a request and get the response.""" 201 | start_time = time.time() 202 | response = httpx.post(url, headers=headers, json=data, timeout=30) 203 | elapsed = time.time() - start_time 204 | 205 | print(f"Response time: {elapsed:.2f} seconds") 206 | return response 207 | 208 | def compare_responses(anthropic_response, proxy_response, check_tools=False): 209 | """Compare the two responses to see if they're similar enough.""" 210 | anthropic_json = anthropic_response.json() 211 | proxy_json = proxy_response.json() 212 | 213 | print("\n--- Anthropic Response Structure ---") 214 | print(json.dumps({k: v for k, v in anthropic_json.items() if k != "content"}, indent=2)) 215 | 216 | print("\n--- Proxy Response Structure ---") 217 | print(json.dumps({k: v for k, v in proxy_json.items() if k != "content"}, indent=2)) 218 | 219 | # Basic structure verification with more flexibility 220 | # The proxy might map values differently, so we're more lenient in our checks 221 | assert proxy_json.get("role") == "assistant", "Proxy role is not 'assistant'" 222 | assert proxy_json.get("type") == "message", "Proxy type is not 'message'" 223 | 224 | # Check if stop_reason is reasonable (might be different between Anthropic and our proxy) 225 | valid_stop_reasons = ["end_turn", "max_tokens", "stop_sequence", "tool_use", None] 226 | assert proxy_json.get("stop_reason") in valid_stop_reasons, "Invalid stop reason" 227 | 228 | # Check content exists and has valid structure 229 | assert "content" in 
anthropic_json, "No content in Anthropic response" 230 | assert "content" in proxy_json, "No content in Proxy response" 231 | 232 | anthropic_content = anthropic_json["content"] 233 | proxy_content = proxy_json["content"] 234 | 235 | # Make sure content is a list and has at least one item 236 | assert isinstance(anthropic_content, list), "Anthropic content is not a list" 237 | assert isinstance(proxy_content, list), "Proxy content is not a list" 238 | assert len(proxy_content) > 0, "Proxy content is empty" 239 | 240 | # If we're checking for tool uses 241 | if check_tools: 242 | # Check if content has tool use 243 | anthropic_tool = None 244 | proxy_tool = None 245 | 246 | # Find tool use in Anthropic response 247 | for item in anthropic_content: 248 | if item.get("type") == "tool_use": 249 | anthropic_tool = item 250 | break 251 | 252 | # Find tool use in Proxy response 253 | for item in proxy_content: 254 | if item.get("type") == "tool_use": 255 | proxy_tool = item 256 | break 257 | 258 | # At least one of them should have a tool use 259 | if anthropic_tool is not None: 260 | print("\n---------- ANTHROPIC TOOL USE ----------") 261 | print(json.dumps(anthropic_tool, indent=2)) 262 | 263 | if proxy_tool is not None: 264 | print("\n---------- PROXY TOOL USE ----------") 265 | print(json.dumps(proxy_tool, indent=2)) 266 | 267 | # Check tool structure 268 | assert proxy_tool.get("name") is not None, "Proxy tool has no name" 269 | assert proxy_tool.get("input") is not None, "Proxy tool has no input" 270 | 271 | print("\n✅ Both responses contain tool use") 272 | else: 273 | print("\n⚠️ Proxy response does not contain tool use, but Anthropic does") 274 | elif proxy_tool is not None: 275 | print("\n---------- PROXY TOOL USE ----------") 276 | print(json.dumps(proxy_tool, indent=2)) 277 | print("\n⚠️ Proxy response contains tool use, but Anthropic does not") 278 | else: 279 | print("\n⚠️ Neither response contains tool use") 280 | 281 | # Check if content has text 282 | anthropic_text = None 283 | proxy_text = None 284 | 285 | for item in anthropic_content: 286 | if item.get("type") == "text": 287 | anthropic_text = item.get("text") 288 | break 289 | 290 | for item in proxy_content: 291 | if item.get("type") == "text": 292 | proxy_text = item.get("text") 293 | break 294 | 295 | # For tool use responses, there might not be text content 296 | if check_tools and (anthropic_text is None or proxy_text is None): 297 | print("\n⚠️ One or both responses don't have text content (expected for tool-only responses)") 298 | return True 299 | 300 | assert anthropic_text is not None, "No text found in Anthropic response" 301 | assert proxy_text is not None, "No text found in Proxy response" 302 | 303 | # Print the first few lines of each text response 304 | max_preview_lines = 5 305 | anthropic_preview = "\n".join(anthropic_text.strip().split("\n")[:max_preview_lines]) 306 | proxy_preview = "\n".join(proxy_text.strip().split("\n")[:max_preview_lines]) 307 | 308 | print("\n---------- ANTHROPIC TEXT PREVIEW ----------") 309 | print(anthropic_preview) 310 | 311 | print("\n---------- PROXY TEXT PREVIEW ----------") 312 | print(proxy_preview) 313 | 314 | # Check for some minimum text overlap - proxy might have different exact wording 315 | # but should have roughly similar content 316 | return True # We're not enforcing similarity, just basic structure 317 | 318 | def test_request(test_name, request_data, check_tools=False): 319 | """Run a test with the given request data.""" 320 | print(f"\n{'='*20} RUNNING TEST: 
{test_name} {'='*20}") 321 | 322 | # Log the request data 323 | print(f"\nRequest data:\n{json.dumps({k: v for k, v in request_data.items() if k != 'messages'}, indent=2)}") 324 | 325 | # Make copies of the request data to avoid modifying the original 326 | anthropic_data = request_data.copy() 327 | proxy_data = request_data.copy() 328 | 329 | try: 330 | # Send requests to both APIs 331 | print("\nSending to Anthropic API...") 332 | anthropic_response = get_response(ANTHROPIC_API_URL, anthropic_headers, anthropic_data) 333 | 334 | print("\nSending to Proxy...") 335 | proxy_response = get_response(PROXY_API_URL, proxy_headers, proxy_data) 336 | 337 | # Check response codes 338 | print(f"\nAnthropic status code: {anthropic_response.status_code}") 339 | print(f"Proxy status code: {proxy_response.status_code}") 340 | 341 | if anthropic_response.status_code != 200 or proxy_response.status_code != 200: 342 | print("\n⚠️ One or both requests failed") 343 | if anthropic_response.status_code != 200: 344 | print(f"Anthropic error: {anthropic_response.text}") 345 | if proxy_response.status_code != 200: 346 | print(f"Proxy error: {proxy_response.text}") 347 | return False 348 | 349 | # Compare the responses 350 | result = compare_responses(anthropic_response, proxy_response, check_tools=check_tools) 351 | if result: 352 | print(f"\n✅ Test {test_name} passed!") 353 | return True 354 | else: 355 | print(f"\n❌ Test {test_name} failed!") 356 | return False 357 | 358 | except Exception as e: 359 | print(f"\n❌ Error in test {test_name}: {str(e)}") 360 | import traceback 361 | traceback.print_exc() 362 | return False 363 | 364 | # ================= STREAMING TESTS ================= 365 | 366 | class StreamStats: 367 | """Track statistics about a streaming response.""" 368 | 369 | def __init__(self): 370 | self.event_types = set() 371 | self.event_counts = {} 372 | self.first_event_time = None 373 | self.last_event_time = None 374 | self.total_chunks = 0 375 | self.events = [] 376 | self.text_content = "" 377 | self.content_blocks = {} 378 | self.has_tool_use = False 379 | self.has_error = False 380 | self.error_message = "" 381 | self.text_content_by_block = {} 382 | 383 | def add_event(self, event_data): 384 | """Track information about each received event.""" 385 | now = datetime.now() 386 | if self.first_event_time is None: 387 | self.first_event_time = now 388 | self.last_event_time = now 389 | 390 | self.total_chunks += 1 391 | 392 | # Record event type and increment count 393 | if "type" in event_data: 394 | event_type = event_data["type"] 395 | self.event_types.add(event_type) 396 | self.event_counts[event_type] = self.event_counts.get(event_type, 0) + 1 397 | 398 | # Track specific event data 399 | if event_type == "content_block_start": 400 | block_idx = event_data.get("index") 401 | content_block = event_data.get("content_block", {}) 402 | if content_block.get("type") == "tool_use": 403 | self.has_tool_use = True 404 | self.content_blocks[block_idx] = content_block 405 | self.text_content_by_block[block_idx] = "" 406 | 407 | elif event_type == "content_block_delta": 408 | block_idx = event_data.get("index") 409 | delta = event_data.get("delta", {}) 410 | if delta.get("type") == "text_delta": 411 | text = delta.get("text", "") 412 | self.text_content += text 413 | # Also track text by block ID 414 | if block_idx in self.text_content_by_block: 415 | self.text_content_by_block[block_idx] += text 416 | 417 | # Keep track of all events for debugging 418 | self.events.append(event_data) 419 | 420 | def 
get_duration(self): 421 | """Calculate the total duration of the stream in seconds.""" 422 | if self.first_event_time is None or self.last_event_time is None: 423 | return 0 424 | return (self.last_event_time - self.first_event_time).total_seconds() 425 | 426 | def summarize(self): 427 | """Print a summary of the stream statistics.""" 428 | print(f"Total chunks: {self.total_chunks}") 429 | print(f"Unique event types: {sorted(list(self.event_types))}") 430 | print(f"Event counts: {json.dumps(self.event_counts, indent=2)}") 431 | print(f"Duration: {self.get_duration():.2f} seconds") 432 | print(f"Has tool use: {self.has_tool_use}") 433 | 434 | # Print the first few lines of content 435 | if self.text_content: 436 | max_preview_lines = 5 437 | text_preview = "\n".join(self.text_content.strip().split("\n")[:max_preview_lines]) 438 | print(f"Text preview:\n{text_preview}") 439 | else: 440 | print("No text content extracted") 441 | 442 | if self.has_error: 443 | print(f"Error: {self.error_message}") 444 | 445 | async def stream_response(url, headers, data, stream_name): 446 | """Send a streaming request and process the response.""" 447 | print(f"\nStarting {stream_name} stream...") 448 | stats = StreamStats() 449 | error = None 450 | 451 | try: 452 | async with httpx.AsyncClient() as client: 453 | # Add stream flag to ensure it's streamed 454 | request_data = data.copy() 455 | request_data["stream"] = True 456 | 457 | start_time = time.time() 458 | async with client.stream("POST", url, json=request_data, headers=headers, timeout=30) as response: 459 | if response.status_code != 200: 460 | error_text = await response.aread() 461 | stats.has_error = True 462 | stats.error_message = f"HTTP {response.status_code}: {error_text.decode('utf-8')}" 463 | error = stats.error_message 464 | print(f"Error: {stats.error_message}") 465 | return stats, error 466 | 467 | print(f"{stream_name} connected, receiving events...") 468 | 469 | # Process each chunk 470 | buffer = "" 471 | async for chunk in response.aiter_text(): 472 | if not chunk.strip(): 473 | continue 474 | 475 | # Handle multiple events in one chunk 476 | buffer += chunk 477 | events = buffer.split("\n\n") 478 | 479 | # Process all complete events 480 | for event_text in events[:-1]: # All but the last (possibly incomplete) event 481 | if not event_text.strip(): 482 | continue 483 | 484 | # Parse server-sent event format 485 | if "data: " in event_text: 486 | # Extract the data part 487 | data_parts = [] 488 | for line in event_text.split("\n"): 489 | if line.startswith("data: "): 490 | data_part = line[len("data: "):] 491 | # Skip the "[DONE]" marker 492 | if data_part == "[DONE]": 493 | break 494 | data_parts.append(data_part) 495 | 496 | if data_parts: 497 | try: 498 | event_data = json.loads("".join(data_parts)) 499 | stats.add_event(event_data) 500 | except json.JSONDecodeError as e: 501 | print(f"Error parsing event: {e}\nRaw data: {''.join(data_parts)}") 502 | 503 | # Keep the last (potentially incomplete) event for the next iteration 504 | buffer = events[-1] if events else "" 505 | 506 | # Process any remaining complete events in the buffer 507 | if buffer.strip(): 508 | lines = buffer.strip().split("\n") 509 | data_lines = [line[len("data: "):] for line in lines if line.startswith("data: ")] 510 | if data_lines and data_lines[0] != "[DONE]": 511 | try: 512 | event_data = json.loads("".join(data_lines)) 513 | stats.add_event(event_data) 514 | except: 515 | pass 516 | 517 | elapsed = time.time() - start_time 518 | print(f"{stream_name} 
stream completed in {elapsed:.2f} seconds") 519 | except Exception as e: 520 | stats.has_error = True 521 | stats.error_message = str(e) 522 | error = str(e) 523 | print(f"Error in {stream_name} stream: {e}") 524 | 525 | return stats, error 526 | 527 | def compare_stream_stats(anthropic_stats, proxy_stats): 528 | """Compare the statistics from the two streams to see if they're similar enough.""" 529 | 530 | print("\n--- Stream Comparison ---") 531 | 532 | # Required events 533 | anthropic_missing = REQUIRED_EVENT_TYPES - anthropic_stats.event_types 534 | proxy_missing = REQUIRED_EVENT_TYPES - proxy_stats.event_types 535 | 536 | print(f"Anthropic missing event types: {anthropic_missing}") 537 | print(f"Proxy missing event types: {proxy_missing}") 538 | 539 | # Check if proxy has the required events 540 | if proxy_missing: 541 | print(f"⚠️ Proxy is missing required event types: {proxy_missing}") 542 | else: 543 | print("✅ Proxy has all required event types") 544 | 545 | # Compare content 546 | if anthropic_stats.text_content and proxy_stats.text_content: 547 | anthropic_preview = "\n".join(anthropic_stats.text_content.strip().split("\n")[:5]) 548 | proxy_preview = "\n".join(proxy_stats.text_content.strip().split("\n")[:5]) 549 | 550 | print("\n--- Anthropic Content Preview ---") 551 | print(anthropic_preview) 552 | 553 | print("\n--- Proxy Content Preview ---") 554 | print(proxy_preview) 555 | 556 | # Compare tool use 557 | if anthropic_stats.has_tool_use and proxy_stats.has_tool_use: 558 | print("✅ Both have tool use") 559 | elif anthropic_stats.has_tool_use and not proxy_stats.has_tool_use: 560 | print("⚠️ Anthropic has tool use but proxy does not") 561 | elif not anthropic_stats.has_tool_use and proxy_stats.has_tool_use: 562 | print("⚠️ Proxy has tool use but Anthropic does not") 563 | 564 | # Success as long as the proxy produced some content (text or tool use) and had no errors 565 | return (not proxy_stats.has_error and 566 | (len(proxy_stats.text_content) > 0 or proxy_stats.has_tool_use)) 567 | 568 | async def test_streaming(test_name, request_data): 569 | """Run a streaming test with the given request data.""" 570 | print(f"\n{'='*20} RUNNING STREAMING TEST: {test_name} {'='*20}") 571 | 572 | # Log the request data 573 | print(f"\nRequest data:\n{json.dumps({k: v for k, v in request_data.items() if k != 'messages'}, indent=2)}") 574 | 575 | # Make copies of the request data to avoid modifying the original 576 | anthropic_data = request_data.copy() 577 | proxy_data = request_data.copy() 578 | 579 | if not anthropic_data.get("stream"): 580 | anthropic_data["stream"] = True 581 | if not proxy_data.get("stream"): 582 | proxy_data["stream"] = True 583 | 584 | check_tools = "tools" in request_data 585 | 586 | try: 587 | # Send streaming requests 588 | anthropic_stats, anthropic_error = await stream_response( 589 | ANTHROPIC_API_URL, anthropic_headers, anthropic_data, "Anthropic" 590 | ) 591 | 592 | proxy_stats, proxy_error = await stream_response( 593 | PROXY_API_URL, proxy_headers, proxy_data, "Proxy" 594 | ) 595 | 596 | # Print statistics 597 | print("\n--- Anthropic Stream Statistics ---") 598 | anthropic_stats.summarize() 599 | 600 | print("\n--- Proxy Stream Statistics ---") 601 | proxy_stats.summarize() 602 | 603 | # Compare the responses 604 | if anthropic_error: 605 | print(f"\n⚠️ Anthropic stream had an error: {anthropic_error}") 606 | # If Anthropic errors, the test passes if proxy does anything useful 607 | if not proxy_error and proxy_stats.total_chunks > 0: 608 | print(f"\n✅ Test {test_name} passed! 
(Proxy worked even though Anthropic failed)") 609 | return True 610 | else: 611 | print(f"\n❌ Test {test_name} failed! Both streams had errors.") 612 | return False 613 | 614 | if proxy_error: 615 | print(f"\n❌ Test {test_name} failed! Proxy had an error: {proxy_error}") 616 | return False 617 | 618 | result = compare_stream_stats(anthropic_stats, proxy_stats) 619 | if result: 620 | print(f"\n✅ Test {test_name} passed!") 621 | return True 622 | else: 623 | print(f"\n❌ Test {test_name} failed!") 624 | return False 625 | 626 | except Exception as e: 627 | print(f"\n❌ Error in test {test_name}: {str(e)}") 628 | import traceback 629 | traceback.print_exc() 630 | return False 631 | 632 | # ================= MAIN ================= 633 | 634 | async def run_tests(args): 635 | """Run all tests based on command-line arguments.""" 636 | # Track test results 637 | results = {} 638 | 639 | # First run non-streaming tests 640 | if not args.streaming_only: 641 | print("\n\n=========== RUNNING NON-STREAMING TESTS ===========\n") 642 | for test_name, test_data in TEST_SCENARIOS.items(): 643 | # Skip streaming tests 644 | if test_data.get("stream"): 645 | continue 646 | 647 | # Skip tool tests if requested 648 | if args.simple and "tools" in test_data: 649 | continue 650 | 651 | # Skip non-tool tests if tools_only 652 | if args.tools_only and "tools" not in test_data: 653 | continue 654 | 655 | # Run the test 656 | check_tools = "tools" in test_data 657 | result = test_request(test_name, test_data, check_tools=check_tools) 658 | results[test_name] = result 659 | 660 | # Now run streaming tests 661 | if not args.no_streaming: 662 | print("\n\n=========== RUNNING STREAMING TESTS ===========\n") 663 | for test_name, test_data in TEST_SCENARIOS.items(): 664 | # Only select streaming tests, or force streaming 665 | if not test_data.get("stream") and not test_name.endswith("_stream"): 666 | continue 667 | 668 | # Skip tool tests if requested 669 | if args.simple and "tools" in test_data: 670 | continue 671 | 672 | # Skip non-tool tests if tools_only 673 | if args.tools_only and "tools" not in test_data: 674 | continue 675 | 676 | # Run the streaming test 677 | result = await test_streaming(test_name, test_data) 678 | results[f"{test_name}_streaming"] = result 679 | 680 | # Print summary 681 | print("\n\n=========== TEST SUMMARY ===========\n") 682 | total = len(results) 683 | passed = sum(1 for v in results.values() if v) 684 | 685 | for test, result in results.items(): 686 | print(f"{test}: {'✅ PASS' if result else '❌ FAIL'}") 687 | 688 | print(f"\nTotal: {passed}/{total} tests passed") 689 | 690 | if passed == total: 691 | print("\n🎉 All tests passed!") 692 | return True 693 | else: 694 | print(f"\n⚠️ {total - passed} tests failed") 695 | return False 696 | 697 | async def main(): 698 | # Check that API key is set 699 | if not ANTHROPIC_API_KEY: 700 | print("Error: ANTHROPIC_API_KEY not set in .env file") 701 | return 702 | 703 | # Parse command-line arguments 704 | parser = argparse.ArgumentParser(description="Test the Claude-on-OpenAI proxy") 705 | parser.add_argument("--no-streaming", action="store_true", help="Skip streaming tests") 706 | parser.add_argument("--streaming-only", action="store_true", help="Only run streaming tests") 707 | parser.add_argument("--simple", action="store_true", help="Only run simple tests (no tools)") 708 | parser.add_argument("--tools-only", action="store_true", help="Only run tool tests") 709 | args = parser.parse_args() 710 | 711 | # Run tests 712 | success = await 
run_tests(args) 713 | sys.exit(0 if success else 1) 714 | 715 | if __name__ == "__main__": 716 | asyncio.run(main()) --------------------------------------------------------------------------------