├── .python-version ├── src └── aider_mcp_server │ ├── atoms │ ├── __init__.py │ ├── tools │ │ ├── __init__.py │ │ ├── aider_list_models.py │ │ └── aider_ai_code.py │ ├── utils.py │ ├── data_types.py │ └── logging.py │ ├── tests │ ├── __init__.py │ └── atoms │ │ ├── __init__.py │ │ ├── tools │ │ ├── __init__.py │ │ ├── test_aider_list_models.py │ │ └── test_aider_ai_code.py │ │ └── test_logging.py │ ├── __init__.py │ ├── __main__.py │ └── server.py ├── .claude └── commands │ ├── multi_aider_sub_agent.md │ ├── context_prime.md │ ├── context_prime_w_aider.md │ └── jprompt_ultra_diff_review.md ├── .mcp.json ├── .env.sample ├── pyproject.toml ├── .gitignore ├── specs └── init-aider-mcp-exp.md ├── ai_docs ├── programmable-aider-documentation.md └── just-prompt-example-mcp-server.xml └── README.md /.python-version: -------------------------------------------------------------------------------- 1 | 3.12 2 | -------------------------------------------------------------------------------- /src/aider_mcp_server/atoms/__init__.py: -------------------------------------------------------------------------------- 1 | # Atoms package initialization -------------------------------------------------------------------------------- /src/aider_mcp_server/tests/__init__.py: -------------------------------------------------------------------------------- 1 | # Tests package initialization -------------------------------------------------------------------------------- /src/aider_mcp_server/atoms/tools/__init__.py: -------------------------------------------------------------------------------- 1 | # Tools package initialization -------------------------------------------------------------------------------- /src/aider_mcp_server/tests/atoms/__init__.py: -------------------------------------------------------------------------------- 1 | # Atoms tests package initialization -------------------------------------------------------------------------------- /src/aider_mcp_server/tests/atoms/tools/__init__.py: -------------------------------------------------------------------------------- 1 | # Tools tests package initialization -------------------------------------------------------------------------------- /src/aider_mcp_server/atoms/utils.py: -------------------------------------------------------------------------------- 1 | DEFAULT_EDITOR_MODEL = "openai/gpt-4.1" 2 | DEFAULT_TESTING_MODEL = "openai/gpt-4.1" 3 | 4 | -------------------------------------------------------------------------------- /src/aider_mcp_server/__init__.py: -------------------------------------------------------------------------------- 1 | from aider_mcp_server.__main__ import main 2 | 3 | # This just re-exports the main function from __main__.py -------------------------------------------------------------------------------- /.claude/commands/multi_aider_sub_agent.md: -------------------------------------------------------------------------------- 1 | Run a multi aider_ai_code call with a sub agent call to fulfill the following tasks back to back in the most sensible order. If the given task(s) can be broken down into smaller tasks, do that. If tasks are dependent on certain changes to be made first, make sure to run the dependent tasks first. 
$ARGUMENTS -------------------------------------------------------------------------------- /.claude/commands/context_prime.md: -------------------------------------------------------------------------------- 1 | ## Context 2 | 3 | READ README.md, THEN run git ls-files and eza --git-ignore --tree to understand the context of the project don't read any other files. 4 | 5 | ## Commands & Feedback Loops 6 | 7 | We're using `uv run pytest` to run tests. 8 | 9 | You can validate the app works with `uv run aider-mcp-server --help`. -------------------------------------------------------------------------------- /.mcp.json: -------------------------------------------------------------------------------- 1 | { 2 | "mcpServers": { 3 | "aider-mcp-server": { 4 | "type": "stdio", 5 | "command": "uv", 6 | "args": [ 7 | "--directory", 8 | ".", 9 | "run", 10 | "aider-mcp-server", 11 | "--editor-model", 12 | "gemini/gemini-2.5-pro-preview-03-25", 13 | "--current-working-dir", 14 | "." 15 | ], 16 | "env": {} 17 | } 18 | } 19 | } 20 | -------------------------------------------------------------------------------- /.claude/commands/context_prime_w_aider.md: -------------------------------------------------------------------------------- 1 | ## Context 2 | 3 | READ README.md, THEN run git ls-files and eza --git-ignore --tree to understand the context of the project don't read any other files. 4 | 5 | ## Commands & Feedback Loops 6 | 7 | To validate code use `uv run pytest` to run tests. (don't run this now) 8 | 9 | You can validate the app works with `uv run aider-mcp-server --help`. 10 | 11 | ## Coding 12 | 13 | For coding always use the aider_ai_code tool. -------------------------------------------------------------------------------- /src/aider_mcp_server/atoms/tools/aider_list_models.py: -------------------------------------------------------------------------------- 1 | from typing import List 2 | from aider.models import fuzzy_match_models 3 | 4 | def list_models(substring: str) -> List[str]: 5 | """ 6 | List available models that match the provided substring. 7 | 8 | Args: 9 | substring (str): Substring to match against available models. 10 | 11 | Returns: 12 | List[str]: List of model names matching the substring. 
13 | """ 14 | return fuzzy_match_models(substring) -------------------------------------------------------------------------------- /.env.sample: -------------------------------------------------------------------------------- 1 | # Environment Variables for just-prompt 2 | 3 | # OpenAI API Key 4 | OPENAI_API_KEY=your_openai_api_key_here 5 | 6 | # Anthropic API Key 7 | ANTHROPIC_API_KEY=your_anthropic_api_key_here 8 | 9 | # Gemini API Key 10 | GEMINI_API_KEY=your_gemini_api_key_here 11 | 12 | # Groq API Key 13 | GROQ_API_KEY=your_groq_api_key_here 14 | 15 | # DeepSeek API Key 16 | DEEPSEEK_API_KEY=your_deepseek_api_key_here 17 | 18 | # OpenRouter API Key 19 | OPENROUTER_API_KEY=your_openrouter_api_key_here 20 | 21 | # Ollama endpoint (if not default) 22 | OLLAMA_HOST=http://localhost:11434 23 | 24 | FIREWORKS_API_KEY= -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [project] 2 | name = "aider-mcp-server" 3 | version = "0.1.0" 4 | description = "Model context protocol server for offloading ai coding work to Aider" 5 | readme = "README.md" 6 | authors = [ 7 | { name = "IndyDevDan", email = "minor7addfloortom@gmail.com" } 8 | ] 9 | requires-python = ">=3.12" 10 | dependencies = [ 11 | "aider-chat>=0.81.0", 12 | "boto3>=1.37.27", 13 | "mcp>=1.6.0", 14 | "pydantic>=2.11.2", 15 | "pytest>=8.3.5", 16 | "rich>=14.0.0", 17 | ] 18 | 19 | [project.scripts] 20 | aider-mcp-server = "aider_mcp_server:main" 21 | 22 | [build-system] 23 | requires = ["hatchling"] 24 | build-backend = "hatchling.build" 25 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Python-generated files 2 | __pycache__/ 3 | *.py[oc] 4 | build/ 5 | dist/ 6 | wheels/ 7 | *.egg-info 8 | 9 | # Virtual environments 10 | .venv 11 | 12 | .env 13 | 14 | # Byte-compiled / optimized / DLL files 15 | __pycache__/ 16 | *.py[cod] 17 | *$py.class 18 | 19 | # Distribution / packaging 20 | dist/ 21 | build/ 22 | *.egg-info/ 23 | *.egg 24 | 25 | # Unit test / coverage reports 26 | htmlcov/ 27 | .tox/ 28 | .nox/ 29 | .coverage 30 | .coverage.* 31 | .cache 32 | nosetests.xml 33 | coverage.xml 34 | *.cover 35 | .hypothesis/ 36 | .pytest_cache/ 37 | 38 | # Jupyter Notebook 39 | .ipynb_checkpoints 40 | 41 | # Environments 42 | .env 43 | .venv 44 | env/ 45 | venv/ 46 | ENV/ 47 | env.bak/ 48 | venv.bak/ 49 | 50 | # mypy 51 | .mypy_cache/ 52 | .dmypy.json 53 | dmypy.json 54 | 55 | # IDE specific files 56 | .idea/ 57 | .vscode/ 58 | *.swp 59 | *.swo 60 | .DS_Store 61 | 62 | 63 | prompts/responses 64 | .aider* 65 | 66 | focus_output/ 67 | 68 | # Log files 69 | logs/ 70 | *.log -------------------------------------------------------------------------------- /src/aider_mcp_server/__main__.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import asyncio 3 | from aider_mcp_server.server import serve 4 | from aider_mcp_server.atoms.utils import DEFAULT_EDITOR_MODEL 5 | 6 | def main(): 7 | # Create the argument parser 8 | parser = argparse.ArgumentParser(description="Aider MCP Server - Offload AI coding tasks to Aider") 9 | 10 | # Add arguments 11 | parser.add_argument( 12 | "--editor-model", 13 | type=str, 14 | default=DEFAULT_EDITOR_MODEL, 15 | help=f"Editor model to use (default: {DEFAULT_EDITOR_MODEL})" 16 | ) 17 | parser.add_argument( 18 | 
"--current-working-dir", 19 | type=str, 20 | required=True, 21 | help="Current working directory (must be a valid git repository)" 22 | ) 23 | 24 | args = parser.parse_args() 25 | 26 | # Run the server asynchronously 27 | asyncio.run(serve( 28 | editor_model=args.editor_model, 29 | current_working_dir=args.current_working_dir 30 | )) 31 | 32 | if __name__ == "__main__": 33 | main() -------------------------------------------------------------------------------- /src/aider_mcp_server/tests/atoms/tools/test_aider_list_models.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | from aider_mcp_server.atoms.tools.aider_list_models import list_models 3 | 4 | def test_list_models_openai(): 5 | """Test that list_models returns GPT-4o model when searching for openai.""" 6 | models = list_models("openai") 7 | assert any("gpt-4o" in model for model in models), "Expected to find GPT-4o model in the list" 8 | 9 | def test_list_models_gemini(): 10 | """Test that list_models returns Gemini models when searching for gemini.""" 11 | models = list_models("gemini") 12 | assert any("gemini" in model.lower() for model in models), "Expected to find Gemini models in the list" 13 | 14 | def test_list_models_empty(): 15 | """Test that list_models with an empty string returns all models.""" 16 | models = list_models("") 17 | assert len(models) > 0, "Expected to get at least some models with empty string" 18 | 19 | def test_list_models_nonexistent(): 20 | """Test that list_models with a nonexistent model returns an empty list.""" 21 | models = list_models("this_model_does_not_exist_12345") 22 | assert len(models) == 0, "Expected to get no models with a nonexistent model name" -------------------------------------------------------------------------------- /.claude/commands/jprompt_ultra_diff_review.md: -------------------------------------------------------------------------------- 1 | # Ultra Diff Review 2 | > Execute each task in the order given to conduct a thorough code review. 3 | 4 | ## Task 1: Create diff.txt 5 | 6 | Create a new file called diff.md. 7 | 8 | At the top of the file, add the following markdown: 9 | 10 | ```md 11 | # Code Review 12 | - Review the diff, report on issues, bugs, and improvements. 13 | - End with a concise markdown table of any issues found, their solutions, and a risk assessment for each issue if applicable. 14 | - Use emojis to convey the severity of each issue. 15 | 16 | ## Diff 17 | 18 | ``` 19 | 20 | ## Task 2: git diff and append 21 | 22 | Then run git diff and append the output to the file. 23 | 24 | ## Task 3: just-prompt multi-llm tool call 25 | 26 | Then use that file as the input to this just-prompt tool call. 27 | 28 | prompts_from_file_to_file( 29 | from_file = diff.md, 30 | models = "openai:o3-mini, anthropic:claude-3-7-sonnet-20250219:4k, gemini:gemini-2.0-flash-thinking-exp" 31 | output_dir = ultra_diff_review/ 32 | ) 33 | 34 | ## Task 4: Read the output files and synthesize 35 | 36 | Then read the output files and think hard to synthesize the results into a new single file called `ultra_diff_review/fusion_ultra_diff_review.md` following the original instructions plus any additional instructions or callouts you think are needed to create the best possible review. 37 | 38 | ## Task 5: Present the results 39 | 40 | Then let me know which issues you think are worth resolving and we'll proceed from there. 
-------------------------------------------------------------------------------- /src/aider_mcp_server/atoms/data_types.py: -------------------------------------------------------------------------------- 1 | from typing import List, Optional, Dict, Any, Union 2 | from pydantic import BaseModel, Field 3 | 4 | # MCP Protocol Base Types 5 | class MCPRequest(BaseModel): 6 | """Base class for MCP protocol requests.""" 7 | name: str 8 | parameters: Dict[str, Any] 9 | 10 | class MCPResponse(BaseModel): 11 | """Base class for MCP protocol responses.""" 12 | pass 13 | 14 | class MCPErrorResponse(MCPResponse): 15 | """Error response for MCP protocol.""" 16 | error: str 17 | 18 | # Tool-specific request parameter models 19 | class AICodeParams(BaseModel): 20 | """Parameters for the aider_ai_code tool.""" 21 | ai_coding_prompt: str 22 | relative_editable_files: List[str] 23 | relative_readonly_files: List[str] = Field(default_factory=list) 24 | 25 | class ListModelsParams(BaseModel): 26 | """Parameters for the list_models tool.""" 27 | substring: str = "" 28 | 29 | # Tool-specific response models 30 | class AICodeResponse(MCPResponse): 31 | """Response for the aider_ai_code tool.""" 32 | status: str # 'success' or 'failure' 33 | message: Optional[str] = None 34 | 35 | class ListModelsResponse(MCPResponse): 36 | """Response for the list_models tool.""" 37 | models: List[str] 38 | 39 | # Specific request types 40 | class AICodeRequest(MCPRequest): 41 | """Request for the aider_ai_code tool.""" 42 | name: str = "aider_ai_code" 43 | parameters: AICodeParams 44 | 45 | class ListModelsRequest(MCPRequest): 46 | """Request for the list_models tool.""" 47 | name: str = "list_models" 48 | parameters: ListModelsParams 49 | 50 | # Union type for all possible MCP responses 51 | MCPToolResponse = Union[AICodeResponse, ListModelsResponse, MCPErrorResponse] -------------------------------------------------------------------------------- /specs/init-aider-mcp-exp.md: -------------------------------------------------------------------------------- 1 | # Aider Model Context Protocol (MCP) Experimental Server 2 | > Here we detail how we'll build the experimental ai coding aider mcp server. 3 | 4 | ## Why? 5 | Claude Code is a new, powerful agentic coding tool that is currently in beta. It's great but it's incredibly expensive. 6 | We can offload some of the work to a simpler ai coding tool: Aider. The original AI Coding Assistant. 7 | 8 | By discretely offloading work to Aider, we can not only reduce costs but use Claude Code (and auxillary LLM calls combined with aider) to better create more, reliable code through multiple - focused - LLM calls. 9 | 10 | ## Resources to ingest 11 | > To understand how we'll build this, READ these files 12 | 13 | ai_docs/just-prompt-example-mcp-server.xml 14 | ai_docs/programmable-aider-documentation.md 15 | 16 | ## Implementation Notes 17 | 18 | - We want to mirror the exact structure of the just-prompt codebase as closely as possible. Minus of course the tools that are specific to just-prompt. 19 | - Every atom must be tested in a respective tests/*_test.py file. 20 | - every atom/tools/*.py must only have a single responsibility - one method. 21 | - when we run aider run in no commit mode, we should not commit any changes to the codebase. 22 | - if architect_model is not provided, don't use architect mode. 
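As a reference for the notes above, here is a minimal sketch of what a one-shot, no-commit aider run could look like, adapted from `ai_docs/programmable-aider-documentation.md`. The function name is illustrative only, and the final wiring in `atoms/tools/aider_ai_code.py` may differ.

```python
# Sketch only: the no-commit, non-architect path described in the notes above.
from aider.models import Model
from aider.coders import Coder
from aider.io import InputOutput


def one_shot_aider(prompt: str, editable: list[str], readonly: list[str]) -> None:
    model = Model("gemini/gemini-2.5-pro-exp-03-25")  # DEFAULT_EDITOR_MODEL per this spec
    coder = Coder.create(
        main_model=model,
        io=InputOutput(yes=True),       # answer prompts automatically for one-shot runs
        fnames=editable,                # files aider may edit
        read_only_fnames=readonly,      # context-only files
        auto_commits=False,             # no-commit mode: never commit to the codebase
        suggest_shell_commands=False,
        detect_urls=False,
        use_git=True,
    )
    coder.run(prompt)                   # single one-shot coding pass
```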
23 | 24 | ## Application Structure 25 | 26 | - src/ 27 | - aider_mcp_server/ 28 | - __init__.py 29 | - __main__.py 30 | - server.py 31 | - serve(editor_model: str = DEFAULT_EDITOR_MODEL, current_working_dir: str = ".", architect_model: str = None) -> None 32 | - atoms/ 33 | - __init__.py 34 | - tools/ 35 | - __init__.py 36 | - aider_ai_code.py 37 | - code_with_aider(ai_coding_prompt: str, relative_editable_files: List[str], relative_readonly_files: List[str] = []) -> str 38 | - runs one shot aider based on ai_docs/programmable-aider-documentation.md 39 | - outputs 'success' or 'failure' 40 | - aider_list_models.py 41 | - list_models(substring: str) -> List[str] 42 | - calls aider.models.fuzzy_match_models(substr: str) and returns the list of models 43 | - utils.py 44 | - DEFAULT_EDITOR_MODEL = "gemini/gemini-2.5-pro-exp-03-25" 45 | - DEFAULT_ARCHITECT_MODEL = "gemini/gemini-2.5-pro-exp-03-25" 46 | - data_types.py 47 | - tests/ 48 | - __init__.py 49 | - atoms/ 50 | - __init__.py 51 | - tools/ 52 | - __init__.py 53 | - test_aider_ai_code.py 54 | - here create tests for basic 'math' functionality: 'add, 'subtract', 'multiply', 'divide'. Use temp dirs. 55 | - test_aider_list_models.py 56 | - here create a real call to list_models(openai) and assert gpt-4o substr in list. 57 | 58 | ## Commands 59 | 60 | - if for whatever reason you need additional python packages use `uv add `. 61 | 62 | ## Validation 63 | - Use `uv run pytest ` to run tests. Every atom/ must be tested. 64 | - Don't mock any tests - run real LLM calls. Make sure to test for failure paths. 65 | - At the end run `uv run aider-mcp-server --help` to validate the server is working. 66 | 67 | -------------------------------------------------------------------------------- /ai_docs/programmable-aider-documentation.md: -------------------------------------------------------------------------------- 1 | # Aider is a programmable AI coding assistant 2 | 3 | Here's how to use it in python to build tools that allow us to offload ai coding tasks to aider. 
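For orientation, a call site built on the two helpers defined in the Code Examples section below might look like this. It is only a sketch; the file path and model string are placeholders.

```python
# Hypothetical usage of build_ai_coding_assistant() and ai_code() from the
# Code Examples section below; file names and model strings are placeholders.
params = AICodeParams(
    architect=False,                 # single-model mode; no separate editor model
    prompt="Add a divide() function with zero-division handling to calculator.py",
    model="gemini/gemini-2.5-pro-exp-03-25",
    editable_context=["calculator.py"],
    readonly_context=[],
    settings={"auto_commits": False},  # leave committing to the caller
    use_git=True,
)

coder = build_ai_coding_assistant(params)  # configure a Coder from the params
ai_code(coder, params)                     # run the one-shot coding pass
```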
4 | 5 | ## Code Examples 6 | 7 | ``` 8 | 9 | class AICodeParams(BaseModel): 10 | architect: bool = True 11 | prompt: str 12 | model: str 13 | editor_model: Optional[str] = None 14 | editable_context: List[str] 15 | readonly_context: List[str] = [] 16 | settings: Optional[dict] 17 | use_git: bool = True 18 | 19 | 20 | def build_ai_coding_assistant(params: AICodeParams) -> Coder: 21 | """Create and configure a Coder instance based on provided parameters""" 22 | settings = params.settings or {} 23 | auto_commits = settings.get("auto_commits", False) 24 | suggest_shell_commands = settings.get("suggest_shell_commands", False) 25 | detect_urls = settings.get("detect_urls", False) 26 | 27 | # Extract budget_tokens setting once for both models 28 | budget_tokens = settings.get("budget_tokens") 29 | 30 | if params.architect: 31 | model = Model(model=params.model, editor_model=params.editor_model) 32 | extra_params = {} 33 | 34 | # Add reasoning_effort if available 35 | if settings.get("reasoning_effort"): 36 | extra_params["reasoning_effort"] = settings["reasoning_effort"] 37 | 38 | # Add thinking budget if specified 39 | if budget_tokens is not None: 40 | extra_params = add_thinking_budget_to_params(extra_params, budget_tokens) 41 | 42 | model.extra_params = extra_params 43 | return Coder.create( 44 | main_model=model, 45 | edit_format="architect", 46 | io=InputOutput(yes=True), 47 | fnames=params.editable_context, 48 | read_only_fnames=params.readonly_context, 49 | auto_commits=auto_commits, 50 | suggest_shell_commands=suggest_shell_commands, 51 | detect_urls=detect_urls, 52 | use_git=params.use_git, 53 | ) 54 | else: 55 | model = Model(params.model) 56 | extra_params = {} 57 | 58 | # Add reasoning_effort if available 59 | if settings.get("reasoning_effort"): 60 | extra_params["reasoning_effort"] = settings["reasoning_effort"] 61 | 62 | # Add thinking budget if specified (consistent for both modes) 63 | if budget_tokens is not None: 64 | extra_params = add_thinking_budget_to_params(extra_params, budget_tokens) 65 | 66 | model.extra_params = extra_params 67 | return Coder.create( 68 | main_model=model, 69 | io=InputOutput(yes=True), 70 | fnames=params.editable_context, 71 | read_only_fnames=params.readonly_context, 72 | auto_commits=auto_commits, 73 | suggest_shell_commands=suggest_shell_commands, 74 | detect_urls=detect_urls, 75 | use_git=params.use_git, 76 | ) 77 | 78 | 79 | def ai_code(coder: Coder, params: AICodeParams): 80 | """Execute AI coding using provided coder instance and parameters""" 81 | # Execute the AI coding with the provided prompt 82 | coder.run(params.prompt) 83 | 84 | 85 | ``` -------------------------------------------------------------------------------- /src/aider_mcp_server/atoms/logging.py: -------------------------------------------------------------------------------- 1 | import os 2 | import logging 3 | import time 4 | from pathlib import Path 5 | from typing import Optional, Union 6 | 7 | 8 | class Logger: 9 | """Custom logger that writes to both console and file.""" 10 | 11 | def __init__( 12 | self, 13 | name: str, 14 | log_dir: Optional[Union[str, Path]] = None, 15 | level: int = logging.INFO, 16 | ): 17 | """ 18 | Initialize the logger. 
19 | 20 | Args: 21 | name: Logger name 22 | log_dir: Directory to store log files (defaults to ./logs) 23 | level: Logging level 24 | """ 25 | self.name = name 26 | self.level = level 27 | 28 | # Set up the logger 29 | self.logger = logging.getLogger(name) 30 | self.logger.setLevel(level) 31 | self.logger.propagate = False 32 | 33 | # Clear any existing handlers 34 | if self.logger.handlers: 35 | self.logger.handlers.clear() 36 | 37 | # Define a standard formatter 38 | log_formatter = logging.Formatter( 39 | '%(asctime)s [%(levelname)s] %(name)s: %(message)s', 40 | datefmt='%Y-%m-%d %H:%M:%S' 41 | ) 42 | 43 | # Add console handler with standard formatting 44 | console_handler = logging.StreamHandler() 45 | console_handler.setFormatter(log_formatter) 46 | console_handler.setLevel(level) 47 | self.logger.addHandler(console_handler) 48 | 49 | # Add file handler if log_dir is provided 50 | if log_dir is not None: 51 | # Create log directory if it doesn't exist 52 | log_dir = Path(log_dir) 53 | log_dir.mkdir(parents=True, exist_ok=True) 54 | 55 | # Use a fixed log file name 56 | log_file_name = "aider_mcp_server.log" 57 | log_file_path = log_dir / log_file_name 58 | 59 | # Set up file handler to append 60 | file_handler = logging.FileHandler(log_file_path, mode='a') 61 | # Use the same formatter as the console handler 62 | file_handler.setFormatter(log_formatter) 63 | file_handler.setLevel(level) 64 | self.logger.addHandler(file_handler) 65 | 66 | self.log_file_path = log_file_path 67 | self.logger.info(f"Logging to: {log_file_path}") 68 | 69 | def debug(self, message: str, **kwargs): 70 | """Log a debug message.""" 71 | self.logger.debug(message, **kwargs) 72 | 73 | def info(self, message: str, **kwargs): 74 | """Log an info message.""" 75 | self.logger.info(message, **kwargs) 76 | 77 | def warning(self, message: str, **kwargs): 78 | """Log a warning message.""" 79 | self.logger.warning(message, **kwargs) 80 | 81 | def error(self, message: str, **kwargs): 82 | """Log an error message.""" 83 | self.logger.error(message, **kwargs) 84 | 85 | def critical(self, message: str, **kwargs): 86 | """Log a critical message.""" 87 | self.logger.critical(message, **kwargs) 88 | 89 | def exception(self, message: str, **kwargs): 90 | """Log an exception message with traceback.""" 91 | self.logger.exception(message, **kwargs) 92 | 93 | 94 | def get_logger( 95 | name: str, 96 | log_dir: Optional[Union[str, Path]] = None, 97 | level: int = logging.INFO, 98 | ) -> Logger: 99 | """ 100 | Get a configured logger instance. 
101 | 102 | Args: 103 | name: Logger name 104 | log_dir: Directory to store log files (defaults to ./logs) 105 | level: Logging level 106 | 107 | Returns: 108 | Configured Logger instance 109 | """ 110 | if log_dir is None: 111 | # Default log directory is ./logs 112 | log_dir = Path("./logs") 113 | 114 | return Logger( 115 | name=name, 116 | log_dir=log_dir, 117 | level=level, 118 | ) 119 | -------------------------------------------------------------------------------- /src/aider_mcp_server/tests/atoms/test_logging.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | import logging 3 | from pathlib import Path 4 | 5 | from aider_mcp_server.atoms.logging import Logger, get_logger 6 | 7 | 8 | def test_logger_creation_and_file_output(tmp_path): 9 | """Test Logger instance creation using get_logger and log file existence with fixed name.""" 10 | log_dir = tmp_path / "logs" 11 | logger_name = "test_logger_creation" 12 | expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name 13 | 14 | # --- Test get_logger --- 15 | logger = get_logger( 16 | name=logger_name, 17 | log_dir=log_dir, 18 | level=logging.INFO, 19 | ) 20 | assert logger is not None, "Logger instance from get_logger should be created" 21 | assert logger.name == logger_name 22 | 23 | # Log a message to ensure file handling is triggered 24 | logger.info("Initial log message.") 25 | 26 | # Verify log directory and file exist 27 | assert log_dir.exists(), f"Log directory should be created by get_logger at {log_dir}" 28 | assert log_dir.is_dir(), f"Log path created by get_logger should be a directory" 29 | assert expected_log_file.exists(), f"Log file should be created by get_logger at {expected_log_file}" 30 | assert expected_log_file.is_file(), f"Log path created by get_logger should point to a file" 31 | 32 | 33 | def test_log_levels_and_output(tmp_path): 34 | """Test logging at different levels to the fixed log file using get_logger.""" 35 | log_dir = tmp_path / "logs" 36 | logger_name = "test_logger_levels" 37 | expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name 38 | 39 | # Instantiate our custom logger with DEBUG level using get_logger 40 | logger = get_logger( 41 | name=logger_name, 42 | log_dir=log_dir, 43 | level=logging.DEBUG, 44 | ) 45 | 46 | # Log messages at different levels 47 | messages = { 48 | logging.DEBUG: "This is a debug message.", 49 | logging.INFO: "This is an info message.", 50 | logging.WARNING: "This is a warning message.", 51 | logging.ERROR: "This is an error message.", 52 | logging.CRITICAL: "This is a critical message.", 53 | } 54 | 55 | logger.debug(messages[logging.DEBUG]) 56 | logger.info(messages[logging.INFO]) 57 | logger.warning(messages[logging.WARNING]) 58 | logger.error(messages[logging.ERROR]) 59 | logger.critical(messages[logging.CRITICAL]) 60 | 61 | # Verify file output 62 | assert expected_log_file.exists(), "Log file should exist for level testing" 63 | 64 | file_content = expected_log_file.read_text() 65 | 66 | # Verify file output contains messages and level indicators 67 | for level, msg in messages.items(): 68 | level_name = logging.getLevelName(level) 69 | assert msg in file_content, f"Message '{msg}' not found in file content" 70 | assert level_name in file_content, f"Level '{level_name}' not found in file content" 71 | assert logger_name in file_content, f"Logger name '{logger_name}' not found in file content" 72 | 73 | 74 | def test_log_level_filtering(tmp_path): 75 | """Test that messages 
below the set log level are filtered using get_logger.""" 76 | log_dir = tmp_path / "logs" 77 | logger_name = "test_logger_filtering" 78 | expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name 79 | 80 | # Instantiate the logger with WARNING level using get_logger 81 | logger = get_logger( 82 | name=logger_name, 83 | log_dir=log_dir, 84 | level=logging.WARNING, 85 | ) 86 | 87 | # Log messages at different levels 88 | debug_msg = "This debug message should NOT appear." 89 | info_msg = "This info message should NOT appear." 90 | warning_msg = "This warning message SHOULD appear." 91 | error_msg = "This error message SHOULD appear." 92 | critical_msg = "This critical message SHOULD appear." # Add critical for completeness 93 | 94 | logger.debug(debug_msg) 95 | logger.info(info_msg) 96 | logger.warning(warning_msg) 97 | logger.error(error_msg) 98 | logger.critical(critical_msg) 99 | 100 | # Verify file output filtering 101 | assert expected_log_file.exists(), "Log file should exist for filtering testing" 102 | 103 | file_content = expected_log_file.read_text() 104 | 105 | assert debug_msg not in file_content, "Debug message should be filtered from file" 106 | assert info_msg not in file_content, "Info message should be filtered from file" 107 | assert warning_msg in file_content, "Warning message should appear in file" 108 | assert error_msg in file_content, "Error message should appear in file" 109 | assert critical_msg in file_content, "Critical message should appear in file" 110 | assert logging.getLevelName(logging.DEBUG) not in file_content, "DEBUG level indicator should be filtered from file" 111 | assert logging.getLevelName(logging.INFO) not in file_content, "INFO level indicator should be filtered from file" 112 | assert logging.getLevelName(logging.WARNING) in file_content, "WARNING level indicator should appear in file" 113 | assert logging.getLevelName(logging.ERROR) in file_content, "ERROR level indicator should appear in file" 114 | assert logging.getLevelName(logging.CRITICAL) in file_content, "CRITICAL level indicator should appear in file" 115 | assert logger_name in file_content, f"Logger name '{logger_name}' should appear in file content" 116 | 117 | 118 | def test_log_appending(tmp_path): 119 | """Test that log messages are appended to the existing log file.""" 120 | log_dir = tmp_path / "logs" 121 | logger_name_1 = "test_logger_append_1" 122 | logger_name_2 = "test_logger_append_2" 123 | expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name 124 | 125 | # First logger instance and message 126 | logger1 = get_logger( 127 | name=logger_name_1, 128 | log_dir=log_dir, 129 | level=logging.INFO, 130 | ) 131 | message1 = "First message to append." 132 | logger1.info(message1) 133 | 134 | # Ensure some time passes or context switches if needed, though file handler should manage appending 135 | # Second logger instance (or could reuse logger1) and message 136 | logger2 = get_logger( 137 | name=logger_name_2, # Can use a different name or the same 138 | log_dir=log_dir, 139 | level=logging.INFO, 140 | ) 141 | message2 = "Second message to append." 
142 | logger2.info(message2) 143 | 144 | # Verify both messages are in the file 145 | assert expected_log_file.exists(), "Log file should exist for appending test" 146 | file_content = expected_log_file.read_text() 147 | 148 | assert message1 in file_content, "First message not found in appended log file" 149 | assert logger_name_1 in file_content, "First logger name not found in appended log file" 150 | assert message2 in file_content, "Second message not found in appended log file" 151 | assert logger_name_2 in file_content, "Second logger name not found in appended log file" 152 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Aider MCP Server - Experimental 2 | > Model context protocol server for offloading AI coding work to Aider, enhancing development efficiency and flexibility. 3 | 4 | ## Overview 5 | 6 | This server allows Claude Code to offload AI coding tasks to Aider, the best open source AI coding assistant. By delegating certain coding tasks to Aider, we can reduce costs, gain control over our coding model and operate Claude Code in a more orchestrative way to review and revise code. 7 | 8 | ## Setup 9 | 10 | 0. Clone the repository: 11 | 12 | ```bash 13 | git clone https://github.com/disler/aider-mcp-server.git 14 | ``` 15 | 16 | 1. Install dependencies: 17 | 18 | ```bash 19 | uv sync 20 | ``` 21 | 22 | 2. Create your environment file: 23 | 24 | ```bash 25 | cp .env.sample .env 26 | ``` 27 | 28 | 3. Configure your API keys in the `.env` file (or use the mcpServers "env" section) to have the api key needed for the model you want to use in aider: 29 | 30 | ``` 31 | GEMINI_API_KEY=your_gemini_api_key_here 32 | OPENAI_API_KEY=your_openai_api_key_here 33 | ANTHROPIC_API_KEY=your_anthropic_api_key_here 34 | ...see .env.sample for more 35 | ``` 36 | 37 | 4. Copy and fill out the the `.mcp.json` into the root of your project and update the `--directory` to point to this project's root directory and the `--current-working-dir` to point to the root of your project. 38 | 39 | ```json 40 | { 41 | "mcpServers": { 42 | "aider-mcp-server": { 43 | "type": "stdio", 44 | "command": "uv", 45 | "args": [ 46 | "--directory", 47 | "", 48 | "run", 49 | "aider-mcp-server", 50 | "--editor-model", 51 | "gpt-4o", 52 | "--current-working-dir", 53 | "" 54 | ], 55 | "env": { 56 | "GEMINI_API_KEY": "", 57 | "OPENAI_API_KEY": "", 58 | "ANTHROPIC_API_KEY": "", 59 | ...see .env.sample for more 60 | } 61 | } 62 | } 63 | } 64 | ``` 65 | 66 | ## Testing 67 | > Tests run with gemini-2.5-pro-exp-03-25 68 | 69 | To run all tests: 70 | 71 | ```bash 72 | uv run pytest 73 | ``` 74 | 75 | To run specific tests: 76 | 77 | ```bash 78 | # Test listing models 79 | uv run pytest src/aider_mcp_server/tests/atoms/tools/test_aider_list_models.py 80 | 81 | # Test AI coding 82 | uv run pytest src/aider_mcp_server/tests/atoms/tools/test_aider_ai_code.py 83 | ``` 84 | 85 | Note: The AI coding tests require a valid API key for the Gemini model. Make sure to set it in your `.env` file before running the tests. 
86 | 87 | ## Add this MCP server to Claude Code 88 | 89 | ### Add with `gemini-2.5-pro-exp-03-25` 90 | 91 | ```bash 92 | claude mcp add aider-mcp-server -s local \ 93 | -- \ 94 | uv --directory "" \ 95 | run aider-mcp-server \ 96 | --editor-model "gemini/gemini-2.5-pro-exp-03-25" \ 97 | --current-working-dir "" 98 | ``` 99 | 100 | ### Add with `gemini-2.5-pro-preview-03-25` 101 | 102 | ```bash 103 | claude mcp add aider-mcp-server -s local \ 104 | -- \ 105 | uv --directory "" \ 106 | run aider-mcp-server \ 107 | --editor-model "gemini/gemini-2.5-pro-preview-03-25" \ 108 | --current-working-dir "" 109 | ``` 110 | 111 | ### Add with `quasar-alpha` 112 | 113 | ```bash 114 | claude mcp add aider-mcp-server -s local \ 115 | -- \ 116 | uv --directory "" \ 117 | run aider-mcp-server \ 118 | --editor-model "openrouter/openrouter/quasar-alpha" \ 119 | --current-working-dir "" 120 | ``` 121 | 122 | ### Add with `llama4-maverick-instruct-basic` 123 | 124 | ```bash 125 | claude mcp add aider-mcp-server -s local \ 126 | -- \ 127 | uv --directory "" \ 128 | run aider-mcp-server \ 129 | --editor-model "fireworks_ai/accounts/fireworks/models/llama4-maverick-instruct-basic" \ 130 | --current-working-dir "" 131 | ``` 132 | 133 | ## Usage 134 | 135 | This MCP server provides the following functionalities: 136 | 137 | 1. **Offload AI coding tasks to Aider**: 138 | - Takes a prompt and file paths 139 | - Uses Aider to implement the requested changes 140 | - Returns success or failure 141 | 142 | 2. **List available models**: 143 | - Provides a list of models matching a substring 144 | - Useful for discovering supported models 145 | 146 | 147 | ## Available Tools 148 | 149 | This MCP server exposes the following tools: 150 | 151 | ### 1. `aider_ai_code` 152 | 153 | This tool allows you to run Aider to perform AI coding tasks based on a provided prompt and specified files. 154 | 155 | **Parameters:** 156 | 157 | - `ai_coding_prompt` (string, required): The natural language instruction for the AI coding task. 158 | - `relative_editable_files` (list of strings, required): A list of file paths (relative to the `current_working_dir`) that Aider is allowed to modify. If a file doesn't exist, it will be created. 159 | - `relative_readonly_files` (list of strings, optional): A list of file paths (relative to the `current_working_dir`) that Aider can read for context but cannot modify. Defaults to an empty list `[]`. 160 | - `model` (string, optional): The primary AI model Aider should use for generating code. Defaults to `"gemini/gemini-2.5-pro-exp-03-25"`. You can use the `list_models` tool to find other available models. 161 | - `editor_model` (string, optional): The AI model Aider should use for editing/refining code, particularly when using architect mode. If not provided, the primary `model` might be used depending on Aider's internal logic. Defaults to `None`. 162 | 163 | **Example Usage (within an MCP request):** 164 | 165 | Claude Code Prompt: 166 | ``` 167 | Use the Aider AI Code tool to: Refactor the calculate_sum function in calculator.py to handle potential TypeError exceptions. 
168 | ``` 169 | 170 | Result: 171 | ```json 172 | { 173 | "name": "aider_ai_code", 174 | "parameters": { 175 | "ai_coding_prompt": "Refactor the calculate_sum function in calculator.py to handle potential TypeError exceptions.", 176 | "relative_editable_files": ["src/calculator.py"], 177 | "relative_readonly_files": ["docs/requirements.txt"], 178 | "model": "openai/gpt-4o" 179 | } 180 | } 181 | ``` 182 | 183 | **Returns:** 184 | 185 | - A simple dict: {success, diff} 186 | - `success`: boolean - Whether the operation was successful. 187 | - `diff`: string - The diff of the changes made to the file. 188 | 189 | ### 2. `list_models` 190 | 191 | This tool lists available AI models supported by Aider that match a given substring. 192 | 193 | **Parameters:** 194 | 195 | - `substring` (string, required): The substring to search for within the names of available models. 196 | 197 | **Example Usage (within an MCP request):** 198 | 199 | Claude Code Prompt: 200 | ``` 201 | Use the Aider List Models tool to: List models that contain the substring "gemini". 202 | ``` 203 | 204 | Result: 205 | ```json 206 | { 207 | "name": "list_models", 208 | "parameters": { 209 | "substring": "gemini" 210 | } 211 | } 212 | ``` 213 | 214 | **Returns:** 215 | 216 | - A list of model name strings that match the provided substring. Example: `["gemini/gemini-1.5-flash", "gemini/gemini-1.5-pro", "gemini/gemini-pro"]` 217 | 218 | ## Architecture 219 | 220 | The server is structured as follows: 221 | 222 | - **Server layer**: Handles MCP protocol communication 223 | - **Atoms layer**: Individual, pure functional components 224 | - **Tools**: Specific capabilities (AI coding, listing models) 225 | - **Utils**: Constants and helper functions 226 | - **Data Types**: Type definitions using Pydantic 227 | 228 | All components are thoroughly tested for reliability. 229 | 230 | ## Codebase Structure 231 | 232 | The project is organized into the following main directories and files: 233 | 234 | ``` 235 | . 
236 | ├── ai_docs # Documentation related to AI models and examples 237 | │ ├── just-prompt-example-mcp-server.xml 238 | │ └── programmable-aider-documentation.md 239 | ├── pyproject.toml # Project metadata and dependencies 240 | ├── README.md # This file 241 | ├── specs # Specification documents 242 | │ └── init-aider-mcp-exp.md 243 | ├── src # Source code directory 244 | │ └── aider_mcp_server # Main package for the server 245 | │ ├── __init__.py # Package initializer 246 | │ ├── __main__.py # Main entry point for the server executable 247 | │ ├── atoms # Core, reusable components (pure functions) 248 | │ │ ├── __init__.py 249 | │ │ ├── data_types.py # Pydantic models for data structures 250 | │ │ ├── logging.py # Custom logging setup 251 | │ │ ├── tools # Individual tool implementations 252 | │ │ │ ├── __init__.py 253 | │ │ │ ├── aider_ai_code.py # Logic for the aider_ai_code tool 254 | │ │ │ └── aider_list_models.py # Logic for the list_models tool 255 | │ │ └── utils.py # Utility functions and constants (like default models) 256 | │ ├── server.py # MCP server logic, tool registration, request handling 257 | │ └── tests # Unit and integration tests 258 | │ ├── __init__.py 259 | │ └── atoms # Tests for the atoms layer 260 | │ ├── __init__.py 261 | │ ├── test_logging.py # Tests for logging 262 | │ └── tools # Tests for the tools 263 | │ ├── __init__.py 264 | │ ├── test_aider_ai_code.py # Tests for AI coding tool 265 | │ └── test_aider_list_models.py # Tests for model listing tool 266 | ``` 267 | 268 | - **`src/aider_mcp_server`**: Contains the main application code. 269 | - **`atoms`**: Holds the fundamental building blocks. These are designed to be pure functions or simple classes with minimal dependencies. 270 | - **`tools`**: Each file here implements the core logic for a specific MCP tool (`aider_ai_code`, `list_models`). 271 | - **`utils.py`**: Contains shared constants like default model names. 272 | - **`data_types.py`**: Defines Pydantic models for request/response structures, ensuring data validation. 273 | - **`logging.py`**: Sets up a consistent logging format for console and file output. 274 | - **`server.py`**: Orchestrates the MCP server. It initializes the server, registers the tools defined in the `atoms/tools` directory, handles incoming requests, routes them to the appropriate tool logic, and sends back responses according to the MCP protocol. 275 | - **`__main__.py`**: Provides the command-line interface entry point (`aider-mcp-server`), parsing arguments like `--editor-model` and starting the server defined in `server.py`. 276 | - **`tests`**: Contains tests mirroring the structure of the `src` directory, ensuring that each component (especially atoms) works as expected. 
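## Using the Atoms Directly

The atoms can also be exercised straight from Python, which is handy when debugging outside of the MCP protocol. A minimal sketch, assuming dependencies are installed via `uv sync` and the relevant API key is set in your environment; the file path and working directory below are placeholders:

```python
# Sketch: calling the atoms directly, bypassing the MCP server layer.
import json

from aider_mcp_server.atoms.tools.aider_list_models import list_models
from aider_mcp_server.atoms.tools.aider_ai_code import code_with_aider

print(list_models("gemini"))  # list of matching model names

result_json = code_with_aider(
    ai_coding_prompt="Add a subtract() function to calculator.py",
    relative_editable_files=["src/calculator.py"],   # placeholder path
    relative_readonly_files=[],
    model="gemini/gemini-2.5-pro-exp-03-25",
    working_dir="/path/to/your/git/repo",            # must be a valid git repository
)
result = json.loads(result_json)                     # {"success": bool, "diff": str}
print(result["success"])
```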
277 | 278 | -------------------------------------------------------------------------------- /src/aider_mcp_server/atoms/tools/aider_ai_code.py: -------------------------------------------------------------------------------- 1 | import json 2 | from typing import List, Optional, Dict, Any, Union 3 | import os 4 | import os.path 5 | import subprocess 6 | from aider.models import Model 7 | from aider.coders import Coder 8 | from aider.io import InputOutput 9 | from aider_mcp_server.atoms.logging import get_logger 10 | from aider_mcp_server.atoms.utils import DEFAULT_EDITOR_MODEL 11 | 12 | # Configure logging for this module 13 | logger = get_logger(__name__) 14 | 15 | # Type alias for response dictionary 16 | ResponseDict = Dict[str, Union[bool, str]] 17 | 18 | 19 | def _get_changes_diff_or_content( 20 | relative_editable_files: List[str], working_dir: str = None 21 | ) -> str: 22 | """ 23 | Get the git diff for the specified files, or their content if git fails. 24 | 25 | Args: 26 | relative_editable_files: List of files to check for changes 27 | working_dir: The working directory where the git repo is located 28 | """ 29 | diff = "" 30 | # Log current directory for debugging 31 | current_dir = os.getcwd() 32 | logger.info(f"Current directory during diff: {current_dir}") 33 | if working_dir: 34 | logger.info(f"Using working directory: {working_dir}") 35 | 36 | # Always attempt to use git 37 | files_arg = " ".join(relative_editable_files) 38 | logger.info(f"Attempting to get git diff for: {' '.join(relative_editable_files)}") 39 | 40 | try: 41 | # Use git -C to specify the repository directory 42 | if working_dir: 43 | diff_cmd = f"git -C {working_dir} diff -- {files_arg}" 44 | else: 45 | diff_cmd = f"git diff -- {files_arg}" 46 | 47 | logger.info(f"Running git command: {diff_cmd}") 48 | diff = subprocess.check_output( 49 | diff_cmd, shell=True, text=True, stderr=subprocess.PIPE 50 | ) 51 | logger.info("Successfully obtained git diff.") 52 | except subprocess.CalledProcessError as e: 53 | logger.warning( 54 | f"Git diff command failed with exit code {e.returncode}. Error: {e.stderr.strip()}" 55 | ) 56 | logger.warning("Falling back to reading file contents.") 57 | diff = "Git diff failed. Current file contents:\n\n" 58 | for file_path in relative_editable_files: 59 | full_path = ( 60 | os.path.join(working_dir, file_path) if working_dir else file_path 61 | ) 62 | if os.path.exists(full_path): 63 | try: 64 | with open(full_path, "r") as f: 65 | content = f.read() 66 | diff += f"--- {file_path} ---\n{content}\n\n" 67 | logger.info(f"Read content for {file_path}") 68 | except Exception as read_e: 69 | logger.error( 70 | f"Failed reading file {full_path} for content fallback: {read_e}" 71 | ) 72 | diff += f"--- {file_path} --- (Error reading file)\n\n" 73 | else: 74 | logger.warning(f"File {full_path} not found during content fallback.") 75 | diff += f"--- {file_path} --- (File not found)\n\n" 76 | except Exception as e: 77 | logger.error(f"Unexpected error getting git diff: {str(e)}") 78 | diff = f"Error getting git diff: {str(e)}\n\n" # Provide error in diff string as fallback 79 | return diff 80 | 81 | 82 | def _check_for_meaningful_changes( 83 | relative_editable_files: List[str], working_dir: str = None 84 | ) -> bool: 85 | """ 86 | Check if the edited files contain meaningful content. 
87 | 88 | Args: 89 | relative_editable_files: List of files to check 90 | working_dir: The working directory where files are located 91 | """ 92 | for file_path in relative_editable_files: 93 | # Use the working directory if provided 94 | full_path = os.path.join(working_dir, file_path) if working_dir else file_path 95 | logger.info(f"Checking for meaningful content in: {full_path}") 96 | 97 | if os.path.exists(full_path): 98 | try: 99 | with open(full_path, "r") as f: 100 | content = f.read() 101 | # Check if the file has more than just whitespace or a single comment line, 102 | # or contains common code keywords. This is a heuristic. 103 | stripped_content = content.strip() 104 | if stripped_content and ( 105 | len(stripped_content.split("\n")) > 1 106 | or any( 107 | kw in content 108 | for kw in [ 109 | "def ", 110 | "class ", 111 | "import ", 112 | "from ", 113 | "async def", 114 | ] 115 | ) 116 | ): 117 | logger.info(f"Meaningful content found in: {file_path}") 118 | return True 119 | except Exception as e: 120 | logger.error( 121 | f"Failed reading file {full_path} during meaningful change check: {e}" 122 | ) 123 | # If we can't read it, we can't confirm meaningful change from this file 124 | continue 125 | else: 126 | logger.info( 127 | f"File not found or empty, skipping meaningful check: {full_path}" 128 | ) 129 | 130 | logger.info("No meaningful changes detected in any editable files.") 131 | return False 132 | 133 | 134 | def _process_coder_results( 135 | relative_editable_files: List[str], working_dir: str = None 136 | ) -> ResponseDict: 137 | """ 138 | Process the results after Aider has run, checking for meaningful changes 139 | and retrieving the diff or content. 140 | 141 | Args: 142 | relative_editable_files: List of files that were edited 143 | working_dir: The working directory where the git repo is located 144 | 145 | Returns: 146 | Dictionary with success status and diff output 147 | """ 148 | diff_output = _get_changes_diff_or_content(relative_editable_files, working_dir) 149 | logger.info("Checking for meaningful changes in edited files...") 150 | has_meaningful_content = _check_for_meaningful_changes( 151 | relative_editable_files, working_dir 152 | ) 153 | 154 | if has_meaningful_content: 155 | logger.info("Meaningful changes found. Processing successful.") 156 | return {"success": True, "diff": diff_output} 157 | else: 158 | logger.warning( 159 | "No meaningful changes detected. Processing marked as unsuccessful." 160 | ) 161 | # Even if no meaningful content, provide the diff/content if available 162 | return { 163 | "success": False, 164 | "diff": diff_output 165 | or "No meaningful changes detected and no diff/content available.", 166 | } 167 | 168 | 169 | def _format_response(response: ResponseDict) -> str: 170 | """ 171 | Format the response dictionary as a JSON string. 172 | 173 | Args: 174 | response: Dictionary containing success status and diff output 175 | 176 | Returns: 177 | JSON string representation of the response 178 | """ 179 | return json.dumps(response, indent=4) 180 | 181 | 182 | def code_with_aider( 183 | ai_coding_prompt: str, 184 | relative_editable_files: List[str], 185 | relative_readonly_files: List[str] = [], 186 | model: str = DEFAULT_EDITOR_MODEL, 187 | working_dir: str = None, 188 | ) -> str: 189 | """ 190 | Run Aider to perform AI coding tasks based on the provided prompt and files. 191 | 192 | Args: 193 | ai_coding_prompt (str): The prompt for the AI to execute. 
194 | relative_editable_files (List[str]): List of files that can be edited. 195 | relative_readonly_files (List[str], optional): List of files that can be read but not edited. Defaults to []. 196 | model (str, optional): The model to use. Defaults to DEFAULT_EDITOR_MODEL. 197 | working_dir (str, required): The working directory where git repository is located and files are stored. 198 | 199 | Returns: 200 | Dict[str, Any]: {'success': True/False, 'diff': str with git diff output} 201 | """ 202 | logger.info("Starting code_with_aider process.") 203 | logger.info(f"Prompt: '{ai_coding_prompt}'") 204 | 205 | # Working directory must be provided 206 | if not working_dir: 207 | error_msg = "Error: working_dir is required for code_with_aider" 208 | logger.error(error_msg) 209 | return json.dumps({"success": False, "diff": error_msg}) 210 | 211 | logger.info(f"Working directory: {working_dir}") 212 | logger.info(f"Editable files: {relative_editable_files}") 213 | logger.info(f"Readonly files: {relative_readonly_files}") 214 | logger.info(f"Model: {model}") 215 | 216 | try: 217 | # Configure the model 218 | logger.info("Configuring AI model...") # Point 1: Before init 219 | ai_model = Model(model) 220 | logger.info(f"Configured model: {model}") 221 | logger.info("AI model configured.") # Point 2: After init 222 | 223 | # Create the coder instance 224 | logger.info("Creating Aider coder instance...") 225 | # Use working directory for chat history file if provided 226 | history_dir = working_dir 227 | abs_editable_files = [ 228 | os.path.join(working_dir, file) for file in relative_editable_files 229 | ] 230 | abs_readonly_files = [ 231 | os.path.join(working_dir, file) for file in relative_readonly_files 232 | ] 233 | chat_history_file = os.path.join(history_dir, ".aider.chat.history.md") 234 | logger.info(f"Using chat history file: {chat_history_file}") 235 | 236 | coder = Coder.create( 237 | main_model=ai_model, 238 | io=InputOutput( 239 | yes=True, 240 | chat_history_file=chat_history_file, 241 | ), 242 | fnames=abs_editable_files, 243 | read_only_fnames=abs_readonly_files, 244 | auto_commits=False, # We'll handle commits separately 245 | suggest_shell_commands=False, 246 | detect_urls=False, 247 | use_git=True, # Always use git 248 | ) 249 | logger.info("Aider coder instance created successfully.") 250 | 251 | # Run the coding session 252 | logger.info("Starting Aider coding session...") # Point 3: Before run 253 | result = coder.run(ai_coding_prompt) 254 | logger.info(f"Aider coding session result: {result}") 255 | logger.info("Aider coding session finished.") # Point 4: After run 256 | 257 | # Process the results after the coder has run 258 | logger.info("Processing coder results...") # Point 5: Processing results 259 | try: 260 | response = _process_coder_results(relative_editable_files, working_dir) 261 | logger.info("Coder results processed.") 262 | except Exception as e: 263 | logger.exception( 264 | f"Error processing coder results: {str(e)}" 265 | ) # Point 6: Error 266 | response = { 267 | "success": False, 268 | "diff": f"Error processing files after execution: {str(e)}", 269 | } 270 | 271 | except Exception as e: 272 | logger.exception( 273 | f"Critical Error in code_with_aider: {str(e)}" 274 | ) # Point 6: Error 275 | response = { 276 | "success": False, 277 | "diff": f"Unhandled Error during Aider execution: {str(e)}", 278 | } 279 | 280 | formatted_response = _format_response(response) 281 | logger.info( 282 | f"code_with_aider process completed. 
Success: {response.get('success')}" 283 | ) 284 | logger.info( 285 | f"Formatted response: {formatted_response}" 286 | ) # Log complete response for debugging 287 | return formatted_response 288 | -------------------------------------------------------------------------------- /src/aider_mcp_server/server.py: -------------------------------------------------------------------------------- 1 | import json 2 | import sys 3 | import os 4 | import asyncio 5 | import subprocess 6 | import logging 7 | from typing import Dict, Any, Optional, List, Tuple, Union 8 | 9 | import mcp 10 | from mcp.server import Server 11 | from mcp.server.stdio import stdio_server 12 | from mcp.types import Tool, TextContent 13 | 14 | from aider_mcp_server.atoms.logging import get_logger 15 | from aider_mcp_server.atoms.utils import DEFAULT_EDITOR_MODEL 16 | from aider_mcp_server.atoms.tools.aider_ai_code import code_with_aider 17 | from aider_mcp_server.atoms.tools.aider_list_models import list_models 18 | 19 | # Configure logging 20 | logger = get_logger(__name__) 21 | 22 | # Define MCP tools 23 | AIDER_AI_CODE_TOOL = Tool( 24 | name="aider_ai_code", 25 | description="Run Aider to perform AI coding tasks based on the provided prompt and files", 26 | inputSchema={ 27 | "type": "object", 28 | "properties": { 29 | "ai_coding_prompt": { 30 | "type": "string", 31 | "description": "The prompt for the AI to execute", 32 | }, 33 | "relative_editable_files": { 34 | "type": "array", 35 | "description": "LIST of relative paths to files that can be edited", 36 | "items": {"type": "string"}, 37 | }, 38 | "relative_readonly_files": { 39 | "type": "array", 40 | "description": "LIST of relative paths to files that can be read but not edited, add files that are not editable but useful for context", 41 | "items": {"type": "string"}, 42 | }, 43 | "model": { 44 | "type": "string", 45 | "description": "The primary AI model Aider should use for generating code, leave blank unless model is specified in the request", 46 | }, 47 | }, 48 | "required": ["ai_coding_prompt", "relative_editable_files"], 49 | }, 50 | ) 51 | 52 | LIST_MODELS_TOOL = Tool( 53 | name="list_models", 54 | description="List available models that match the provided substring", 55 | inputSchema={ 56 | "type": "object", 57 | "properties": { 58 | "substring": { 59 | "type": "string", 60 | "description": "Substring to match against available models", 61 | } 62 | }, 63 | }, 64 | ) 65 | 66 | 67 | def is_git_repository(directory: str) -> Tuple[bool, Union[str, None]]: 68 | """ 69 | Check if the specified directory is a git repository. 70 | 71 | Args: 72 | directory (str): The directory to check. 73 | 74 | Returns: 75 | Tuple[bool, Union[str, None]]: A tuple containing a boolean indicating if it's a git repo, 76 | and an error message if it's not. 
77 | """ 78 | try: 79 | # Make sure the directory exists 80 | if not os.path.isdir(directory): 81 | return False, f"Directory does not exist: {directory}" 82 | 83 | # Use the git command with -C option to specify the working directory 84 | # This way we don't need to change our current directory 85 | result = subprocess.run( 86 | ["git", "-C", directory, "rev-parse", "--is-inside-work-tree"], 87 | capture_output=True, 88 | text=True, 89 | check=False, 90 | ) 91 | 92 | if result.returncode == 0 and result.stdout.strip() == "true": 93 | return True, None 94 | else: 95 | return False, result.stderr.strip() or "Directory is not a git repository" 96 | 97 | except subprocess.SubprocessError as e: 98 | return False, f"Error checking git repository: {str(e)}" 99 | except Exception as e: 100 | return False, f"Unexpected error checking git repository: {str(e)}" 101 | 102 | 103 | def process_aider_ai_code_request( 104 | params: Dict[str, Any], 105 | editor_model: str, 106 | current_working_dir: str, 107 | ) -> Dict[str, Any]: 108 | """ 109 | Process an aider_ai_code request. 110 | 111 | Args: 112 | params (Dict[str, Any]): The request parameters. 113 | editor_model (str): The editor model to use. 114 | current_working_dir (str): The current working directory where git repo is located. 115 | 116 | Returns: 117 | Dict[str, Any]: The response data. 118 | """ 119 | ai_coding_prompt = params.get("ai_coding_prompt", "") 120 | relative_editable_files = params.get("relative_editable_files", []) 121 | relative_readonly_files = params.get("relative_readonly_files", []) 122 | 123 | # Ensure relative_editable_files is a list 124 | if isinstance(relative_editable_files, str): 125 | logger.info( 126 | f"Converting single editable file string to list: {relative_editable_files}" 127 | ) 128 | relative_editable_files = [relative_editable_files] 129 | 130 | # Ensure relative_readonly_files is a list 131 | if isinstance(relative_readonly_files, str): 132 | logger.info( 133 | f"Converting single readonly file string to list: {relative_readonly_files}" 134 | ) 135 | relative_readonly_files = [relative_readonly_files] 136 | 137 | # Get the model from request parameters if provided 138 | request_model = params.get("model") 139 | 140 | # Log the request details 141 | logger.info(f"AI Coding Request: Prompt: '{ai_coding_prompt}'") 142 | logger.info(f"Editable files: {relative_editable_files}") 143 | logger.info(f"Readonly files: {relative_readonly_files}") 144 | logger.info(f"Editor model: {editor_model}") 145 | if request_model: 146 | logger.info(f"Request-specified model: {request_model}") 147 | 148 | # Use the model specified in the request if provided, otherwise use the editor model 149 | model_to_use = request_model if request_model else editor_model 150 | 151 | # Use the passed-in current_working_dir parameter 152 | logger.info(f"Using working directory for code_with_aider: {current_working_dir}") 153 | 154 | result_json = code_with_aider( 155 | ai_coding_prompt=ai_coding_prompt, 156 | relative_editable_files=relative_editable_files, 157 | relative_readonly_files=relative_readonly_files, 158 | model=model_to_use, 159 | working_dir=current_working_dir, 160 | ) 161 | 162 | # Parse the JSON string result 163 | try: 164 | result_dict = json.loads(result_json) 165 | except json.JSONDecodeError as e: 166 | logger.error(f"Error: Failed to parse JSON response from code_with_aider: {e}") 167 | logger.error(f"Received raw response: {result_json}") 168 | return {"error": "Failed to process AI coding result"} 169 | 170 | 
logger.info( 171 | f"AI Coding Request Completed. Success: {result_dict.get('success', False)}" 172 | ) 173 | return { 174 | "success": result_dict.get("success", False), 175 | "diff": result_dict.get("diff", "Error retrieving diff"), 176 | } 177 | 178 | 179 | def process_list_models_request(params: Dict[str, Any]) -> Dict[str, Any]: 180 | """ 181 | Process a list_models request. 182 | 183 | Args: 184 | params (Dict[str, Any]): The request parameters. 185 | 186 | Returns: 187 | Dict[str, Any]: The response data. 188 | """ 189 | substring = params.get("substring", "") 190 | 191 | # Log the request details 192 | logger.info(f"List Models Request: Substring: '{substring}'") 193 | 194 | models = list_models(substring) 195 | logger.info(f"Found {len(models)} models matching '{substring}'") 196 | 197 | return {"models": models} 198 | 199 | 200 | def handle_request( 201 | request: Dict[str, Any], 202 | current_working_dir: str, 203 | editor_model: str, 204 | ) -> Dict[str, Any]: 205 | """ 206 | Handle incoming MCP requests according to the MCP protocol. 207 | 208 | Args: 209 | request (Dict[str, Any]): The request JSON. 210 | current_working_dir (str): The current working directory. Must be a valid git repository. 211 | editor_model (str): The editor model to use. 212 | 213 | Returns: 214 | Dict[str, Any]: The response JSON. 215 | """ 216 | try: 217 | # Validate current_working_dir is provided and is a git repository 218 | if not current_working_dir: 219 | error_msg = "Error: current_working_dir is required. Please provide a valid git repository path." 220 | logger.error(error_msg) 221 | return {"error": error_msg} 222 | 223 | # MCP protocol requires 'name' and 'parameters' fields 224 | if "name" not in request: 225 | logger.error("Error: Received request missing 'name' field.") 226 | return {"error": "Missing 'name' field in request"} 227 | 228 | request_type = request.get("name") 229 | params = request.get("parameters", {}) 230 | 231 | logger.info( 232 | f"Received request: Type='{request_type}', CWD='{current_working_dir}'" 233 | ) 234 | 235 | # Validate that the current_working_dir is a git repository before changing to it 236 | is_git_repo, error_message = is_git_repository(current_working_dir) 237 | if not is_git_repo: 238 | error_msg = f"Error: The specified directory '{current_working_dir}' is not a valid git repository: {error_message}" 239 | logger.error(error_msg) 240 | return {"error": error_msg} 241 | 242 | # Set working directory 243 | logger.info(f"Changing working directory to: {current_working_dir}") 244 | os.chdir(current_working_dir) 245 | 246 | # Route to the appropriate handler based on request type 247 | if request_type == "aider_ai_code": 248 | return process_aider_ai_code_request( 249 | params, editor_model, current_working_dir 250 | ) 251 | 252 | elif request_type == "list_models": 253 | return process_list_models_request(params) 254 | 255 | else: 256 | # Unknown request type 257 | logger.warning(f"Warning: Unknown request type received: {request_type}") 258 | return {"error": f"Unknown request type: {request_type}"} 259 | 260 | except Exception as e: 261 | # Handle any errors 262 | logger.exception( 263 | f"Critical Error: Unhandled exception during request processing: {str(e)}" 264 | ) 265 | return {"error": f"Internal server error: {str(e)}"} 266 | 267 | 268 | async def serve( 269 | editor_model: str = DEFAULT_EDITOR_MODEL, 270 | current_working_dir: str = None, 271 | ) -> None: 272 | """ 273 | Start the MCP server following the Model Context Protocol. 
274 | 275 | The server reads JSON requests from stdin and writes JSON responses to stdout. 276 | Each request should contain a 'name' field indicating the tool to invoke, and 277 | a 'parameters' field with the tool-specific parameters. 278 | 279 | Args: 280 | editor_model (str, optional): The editor model to use. Defaults to DEFAULT_EDITOR_MODEL. 281 | current_working_dir (str, required): The current working directory. Must be a valid git repository. 282 | 283 | Raises: 284 | ValueError: If current_working_dir is not provided or is not a git repository. 285 | """ 286 | logger.info(f"Starting Aider MCP Server") 287 | logger.info(f"Editor Model: {editor_model}") 288 | 289 | # Validate current_working_dir is provided 290 | if not current_working_dir: 291 | error_msg = "Error: current_working_dir is required. Please provide a valid git repository path." 292 | logger.error(error_msg) 293 | raise ValueError(error_msg) 294 | 295 | logger.info(f"Initial Working Directory: {current_working_dir}") 296 | 297 | # Validate that the current_working_dir is a git repository 298 | is_git_repo, error_message = is_git_repository(current_working_dir) 299 | if not is_git_repo: 300 | error_msg = f"Error: The specified directory '{current_working_dir}' is not a valid git repository: {error_message}" 301 | logger.error(error_msg) 302 | raise ValueError(error_msg) 303 | 304 | logger.info(f"Validated git repository at: {current_working_dir}") 305 | 306 | # Set working directory 307 | logger.info(f"Setting working directory to: {current_working_dir}") 308 | os.chdir(current_working_dir) 309 | 310 | # Create the MCP server 311 | server = Server("aider-mcp-server") 312 | 313 | @server.list_tools() 314 | async def list_tools() -> List[Tool]: 315 | """Register all available tools with the MCP server.""" 316 | return [AIDER_AI_CODE_TOOL, LIST_MODELS_TOOL] 317 | 318 | @server.call_tool() 319 | async def call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]: 320 | """Handle tool calls from the MCP client.""" 321 | logger.info(f"Received Tool Call: Name='{name}'") 322 | logger.info(f"Arguments: {arguments}") 323 | 324 | try: 325 | if name == "aider_ai_code": 326 | logger.info(f"Processing 'aider_ai_code' tool call...") 327 | result = process_aider_ai_code_request( 328 | arguments, editor_model, current_working_dir 329 | ) 330 | return [TextContent(type="text", text=json.dumps(result))] 331 | 332 | elif name == "list_models": 333 | logger.info(f"Processing 'list_models' tool call...") 334 | result = process_list_models_request(arguments) 335 | return [TextContent(type="text", text=json.dumps(result))] 336 | 337 | else: 338 | logger.warning(f"Warning: Received call for unknown tool: {name}") 339 | return [ 340 | TextContent( 341 | type="text", text=json.dumps({"error": f"Unknown tool: {name}"}) 342 | ) 343 | ] 344 | 345 | except Exception as e: 346 | logger.exception(f"Error: Exception during tool call '{name}': {e}") 347 | return [ 348 | TextContent( 349 | type="text", 350 | text=json.dumps( 351 | {"error": f"Error processing tool {name}: {str(e)}"} 352 | ), 353 | ) 354 | ] 355 | 356 | # Initialize and run the server 357 | try: 358 | options = server.create_initialization_options() 359 | logger.info("Initializing stdio server connection...") 360 | async with stdio_server() as (read_stream, write_stream): 361 | logger.info("Server running. 
Waiting for requests...") 362 | await server.run(read_stream, write_stream, options, raise_exceptions=True) 363 | except Exception as e: 364 | logger.exception( 365 | f"Critical Error: Server stopped due to unhandled exception: {e}" 366 | ) 367 | raise 368 | finally: 369 | logger.info("Aider MCP Server shutting down.") 370 | -------------------------------------------------------------------------------- /src/aider_mcp_server/tests/atoms/tools/test_aider_ai_code.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import tempfile 4 | import pytest 5 | import shutil 6 | import subprocess 7 | from aider_mcp_server.atoms.tools.aider_ai_code import code_with_aider 8 | from aider_mcp_server.atoms.utils import DEFAULT_TESTING_MODEL 9 | 10 | @pytest.fixture 11 | def temp_dir(): 12 | """Create a temporary directory with an initialized Git repository for testing.""" 13 | tmp_dir = tempfile.mkdtemp() 14 | 15 | # Initialize git repository in the temp directory 16 | subprocess.run(["git", "init"], cwd=tmp_dir, capture_output=True, text=True, check=True) 17 | 18 | # Configure git user for the repository 19 | subprocess.run(["git", "config", "user.name", "Test User"], cwd=tmp_dir, capture_output=True, text=True, check=True) 20 | subprocess.run(["git", "config", "user.email", "test@example.com"], cwd=tmp_dir, capture_output=True, text=True, check=True) 21 | 22 | # Create and commit an initial file to have a valid git history 23 | with open(os.path.join(tmp_dir, "README.md"), "w") as f: 24 | f.write("# Test Repository\nThis is a test repository for Aider MCP Server tests.") 25 | 26 | subprocess.run(["git", "add", "README.md"], cwd=tmp_dir, capture_output=True, text=True, check=True) 27 | subprocess.run(["git", "commit", "-m", "Initial commit"], cwd=tmp_dir, capture_output=True, text=True, check=True) 28 | 29 | yield tmp_dir 30 | 31 | # Clean up 32 | shutil.rmtree(tmp_dir) 33 | 34 | def test_addition(temp_dir): 35 | """Test that code_with_aider can create a file that adds two numbers.""" 36 | # Create the test file 37 | test_file = os.path.join(temp_dir, "math_add.py") 38 | with open(test_file, "w") as f: 39 | f.write("# This file should implement addition\n") 40 | 41 | prompt = "Implement a function add(a, b) that returns the sum of a and b in the math_add.py file." 
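    # Note (assumption based on the assertions below): code_with_aider returns a
    # JSON string that decodes to an object with "success" and "diff" keys; the
    # test parses that string rather than inspecting files directly at this point.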
42 | 43 | # Run code_with_aider with working_dir 44 | result = code_with_aider( 45 | ai_coding_prompt=prompt, 46 | relative_editable_files=[test_file], 47 | working_dir=temp_dir # Pass the temp directory as working_dir 48 | ) 49 | 50 | # Parse the JSON result 51 | result_dict = json.loads(result) 52 | 53 | # Check that it succeeded 54 | assert result_dict["success"] is True, "Expected code_with_aider to succeed" 55 | assert "diff" in result_dict, "Expected diff to be in result" 56 | 57 | # Check that the file was modified correctly 58 | with open(test_file, "r") as f: 59 | content = f.read() 60 | 61 | assert any(x in content for x in ["def add(a, b):", "def add(a:"]), "Expected to find add function in the file" 62 | assert "return a + b" in content, "Expected to find return statement in the file" 63 | 64 | # Try to import and use the function 65 | import sys 66 | sys.path.append(temp_dir) 67 | from math_add import add 68 | assert add(2, 3) == 5, "Expected add(2, 3) to return 5" 69 | 70 | def test_subtraction(temp_dir): 71 | """Test that code_with_aider can create a file that subtracts two numbers.""" 72 | # Create the test file 73 | test_file = os.path.join(temp_dir, "math_subtract.py") 74 | with open(test_file, "w") as f: 75 | f.write("# This file should implement subtraction\n") 76 | 77 | prompt = "Implement a function subtract(a, b) that returns a minus b in the math_subtract.py file." 78 | 79 | # Run code_with_aider with working_dir 80 | result = code_with_aider( 81 | ai_coding_prompt=prompt, 82 | relative_editable_files=[test_file], 83 | working_dir=temp_dir # Pass the temp directory as working_dir 84 | ) 85 | 86 | # Parse the JSON result 87 | result_dict = json.loads(result) 88 | 89 | # Check that it succeeded 90 | assert result_dict["success"] is True, "Expected code_with_aider to succeed" 91 | assert "diff" in result_dict, "Expected diff to be in result" 92 | 93 | # Check that the file was modified correctly 94 | with open(test_file, "r") as f: 95 | content = f.read() 96 | 97 | assert any(x in content for x in ["def subtract(a, b):", "def subtract(a:"]), "Expected to find subtract function in the file" 98 | assert "return a - b" in content, "Expected to find return statement in the file" 99 | 100 | # Try to import and use the function 101 | import sys 102 | sys.path.append(temp_dir) 103 | from math_subtract import subtract 104 | assert subtract(5, 3) == 2, "Expected subtract(5, 3) to return 2" 105 | 106 | def test_multiplication(temp_dir): 107 | """Test that code_with_aider can create a file that multiplies two numbers.""" 108 | # Create the test file 109 | test_file = os.path.join(temp_dir, "math_multiply.py") 110 | with open(test_file, "w") as f: 111 | f.write("# This file should implement multiplication\n") 112 | 113 | prompt = "Implement a function multiply(a, b) that returns the product of a and b in the math_multiply.py file." 
114 | 
115 |     # Run code_with_aider with working_dir
116 |     result = code_with_aider(
117 |         ai_coding_prompt=prompt,
118 |         relative_editable_files=[test_file],
119 |         working_dir=temp_dir  # Pass the temp directory as working_dir
120 |     )
121 | 
122 |     # Parse the JSON result
123 |     result_dict = json.loads(result)
124 | 
125 |     # Check that it succeeded
126 |     assert result_dict["success"] is True, "Expected code_with_aider to succeed"
127 |     assert "diff" in result_dict, "Expected diff to be in result"
128 | 
129 |     # Check that the file was modified correctly
130 |     with open(test_file, "r") as f:
131 |         content = f.read()
132 | 
133 |     assert any(x in content for x in ["def multiply(a, b):", "def multiply(a:"]), "Expected to find multiply function in the file"
134 |     assert "return a * b" in content, "Expected to find return statement in the file"
135 | 
136 |     # Try to import and use the function
137 |     import sys
138 |     sys.path.append(temp_dir)
139 |     from math_multiply import multiply
140 |     assert multiply(2, 3) == 6, "Expected multiply(2, 3) to return 6"
141 | 
142 | def test_division(temp_dir):
143 |     """Test that code_with_aider can create a file that divides two numbers."""
144 |     # Create the test file
145 |     test_file = os.path.join(temp_dir, "math_divide.py")
146 |     with open(test_file, "w") as f:
147 |         f.write("# This file should implement division\n")
148 | 
149 |     prompt = "Implement a function divide(a, b) that returns a divided by b in the math_divide.py file. Handle division by zero by returning None."
150 | 
151 |     # Run code_with_aider with working_dir
152 |     result = code_with_aider(
153 |         ai_coding_prompt=prompt,
154 |         relative_editable_files=[test_file],
155 |         working_dir=temp_dir  # Pass the temp directory as working_dir
156 |     )
157 | 
158 |     # Parse the JSON result
159 |     result_dict = json.loads(result)
160 | 
161 |     # Check that it succeeded
162 |     assert result_dict["success"] is True, "Expected code_with_aider to succeed"
163 |     assert "diff" in result_dict, "Expected diff to be in result"
164 | 
165 |     # Check that the file was modified correctly
166 |     with open(test_file, "r") as f:
167 |         content = f.read()
168 | 
169 |     assert any(x in content for x in ["def divide(a, b):", "def divide(a:"]), "Expected to find divide function in the file"
170 |     assert "return" in content, "Expected to find return statement in the file"
171 | 
172 |     # Try to import and use the function
173 |     import sys
174 |     sys.path.append(temp_dir)
175 |     from math_divide import divide
176 |     assert divide(6, 3) == 2, "Expected divide(6, 3) to return 2"
177 |     assert divide(1, 0) is None, "Expected divide(1, 0) to return None"
178 | 
179 | def test_failure_case(temp_dir):
180 |     """Test that code_with_aider returns error information for a failure scenario."""
181 |     original_cwd = os.getcwd()  # Remember where we started so we can restore it later
182 |     try:
183 |         # Work inside the temporary git repository created by the fixture
184 |         os.chdir(temp_dir)
185 | 
186 |         # Create a test file in the temp directory
187 |         test_file = os.path.join(temp_dir, "failure_test.py")
188 |         with open(test_file, "w") as f:
189 |             f.write("# This file should trigger a failure\n")
190 | 
191 |         # Use an invalid model name to ensure a failure
192 |         prompt = "This prompt should fail because we're using a non-existent model."
193 | 
194 |         # Run code_with_aider with an invalid model name
195 |         result = code_with_aider(
196 |             ai_coding_prompt=prompt,
197 |             relative_editable_files=[test_file],
198 |             model="non_existent_model_123456789",  # This model doesn't exist
199 |             working_dir=temp_dir  # Pass the temp directory as working_dir
200 |         )
201 | 
202 |         # Parse the JSON result
203 |         result_dict = json.loads(result)
204 | 
205 |         # Check the result - we're still expecting success=False but the important part
206 |         # is that we get a diff that explains the error.
207 |         # The diff should indicate that no meaningful changes were made,
208 |         # often because the model couldn't be reached or produced no output.
209 |         assert "diff" in result_dict, "Expected diff to be in result"
210 |         diff_content = result_dict["diff"]
211 |         assert "File contents after editing (git not used):" in diff_content or "No meaningful changes detected" in diff_content, \
212 |             f"Expected error information like 'File contents after editing' or 'No meaningful changes' in diff, but got: {diff_content}"
213 |     finally:
214 |         # Restore the original working directory
215 |         os.chdir(original_cwd)
216 | 
217 | def test_complex_tasks(temp_dir):
218 |     """Test that code_with_aider correctly implements more complex tasks."""
219 |     # Create the test file for a calculator class
220 |     test_file = os.path.join(temp_dir, "calculator.py")
221 |     with open(test_file, "w") as f:
222 |         f.write("# This file should implement a calculator class\n")
223 | 
224 |     # More complex prompt suitable for architect mode
225 |     prompt = """
226 | Create a Calculator class with the following features:
227 | 1. Basic operations: add, subtract, multiply, divide methods
228 | 2. Memory functions: memory_store, memory_recall, memory_clear
229 | 3. A history feature that keeps track of operations
230 | 4. A method to show_history
231 | 5. Error handling for division by zero
232 | 
233 | All methods should be well-documented with docstrings.
234 | """ 235 | 236 | # Run code_with_aider with explicit model 237 | result = code_with_aider( 238 | ai_coding_prompt=prompt, 239 | relative_editable_files=[test_file], 240 | model=DEFAULT_TESTING_MODEL, # Main model 241 | working_dir=temp_dir # Pass the temp directory as working_dir 242 | ) 243 | 244 | # Parse the JSON result 245 | result_dict = json.loads(result) 246 | 247 | # Check that it succeeded 248 | assert result_dict["success"] is True, "Expected code_with_aider with architect mode to succeed" 249 | assert "diff" in result_dict, "Expected diff to be in result" 250 | 251 | # Check that the file was modified correctly with expected elements 252 | with open(test_file, "r") as f: 253 | content = f.read() 254 | 255 | # Check for class definition and methods - relaxed assertions to accommodate type hints 256 | assert "class Calculator" in content, "Expected to find Calculator class definition" 257 | assert "add" in content, "Expected to find add method" 258 | assert "subtract" in content, "Expected to find subtract method" 259 | assert "multiply" in content, "Expected to find multiply method" 260 | assert "divide" in content, "Expected to find divide method" 261 | assert "memory_" in content, "Expected to find memory functions" 262 | assert "history" in content, "Expected to find history functionality" 263 | 264 | # Import and test basic calculator functionality 265 | import sys 266 | sys.path.append(temp_dir) 267 | from calculator import Calculator 268 | 269 | # Test the calculator 270 | calc = Calculator() 271 | 272 | # Test basic operations 273 | assert calc.add(2, 3) == 5, "Expected add(2, 3) to return 5" 274 | assert calc.subtract(5, 3) == 2, "Expected subtract(5, 3) to return 2" 275 | assert calc.multiply(2, 3) == 6, "Expected multiply(2, 3) to return 6" 276 | assert calc.divide(6, 3) == 2, "Expected divide(6, 3) to return 2" 277 | 278 | # Test division by zero error handling 279 | try: 280 | result = calc.divide(5, 0) 281 | assert result is None or isinstance(result, (str, type(None))), \ 282 | "Expected divide by zero to return None, error message, or raise exception" 283 | except Exception: 284 | # It's fine if it raises an exception - that's valid error handling too 285 | pass 286 | 287 | # Test memory functions if implemented as expected 288 | try: 289 | calc.memory_store(10) 290 | assert calc.memory_recall() == 10, "Expected memory_recall() to return stored value" 291 | calc.memory_clear() 292 | assert calc.memory_recall() == 0 or calc.memory_recall() is None, \ 293 | "Expected memory_recall() to return 0 or None after clearing" 294 | except (AttributeError, TypeError): 295 | # Some implementations might handle memory differently 296 | pass 297 | 298 | def test_diff_output(temp_dir): 299 | """Test that code_with_aider produces proper git diff output when modifying existing files.""" 300 | # Create an initial math file 301 | test_file = os.path.join(temp_dir, "math_operations.py") 302 | initial_content = """# Math operations module 303 | def add(a, b): 304 | return a + b 305 | 306 | def subtract(a, b): 307 | return a - b 308 | """ 309 | 310 | with open(test_file, "w") as f: 311 | f.write(initial_content) 312 | 313 | # Commit the initial file to git 314 | subprocess.run(["git", "add", "math_operations.py"], cwd=temp_dir, capture_output=True, text=True, check=True) 315 | subprocess.run(["git", "commit", "-m", "Add initial math operations"], cwd=temp_dir, capture_output=True, text=True, check=True) 316 | 317 | # Now modify the file using Aider 318 | prompt = "Add a multiply 
function that takes two parameters and returns their product. Also add a docstring to the existing add function." 319 | 320 | result = code_with_aider( 321 | ai_coding_prompt=prompt, 322 | relative_editable_files=["math_operations.py"], 323 | model=DEFAULT_TESTING_MODEL, 324 | working_dir=temp_dir 325 | ) 326 | 327 | # Parse the JSON result 328 | result_dict = json.loads(result) 329 | 330 | # Check that it succeeded 331 | assert result_dict["success"] is True, "Expected code_with_aider to succeed" 332 | assert "diff" in result_dict, "Expected diff to be in result" 333 | 334 | # Verify the diff contains expected git diff markers 335 | diff_content = result_dict["diff"] 336 | assert "diff --git" in diff_content, "Expected git diff header in diff output" 337 | assert "@@" in diff_content, "Expected hunk headers (@@) in diff output" 338 | assert "+++ b/math_operations.py" in diff_content, "Expected new file marker in diff" 339 | assert "--- a/math_operations.py" in diff_content, "Expected old file marker in diff" 340 | 341 | # Verify the diff shows additions (lines starting with +) 342 | diff_lines = diff_content.split('\n') 343 | added_lines = [line for line in diff_lines if line.startswith('+') and not line.startswith('+++')] 344 | assert len(added_lines) > 0, "Expected to find added lines in diff" 345 | 346 | # Check that multiply function was actually added to the file 347 | with open(test_file, "r") as f: 348 | final_content = f.read() 349 | 350 | assert "def multiply" in final_content, "Expected multiply function to be added" 351 | assert "docstring" in final_content.lower() or '"""' in final_content, "Expected docstring to be added" 352 | -------------------------------------------------------------------------------- /ai_docs/just-prompt-example-mcp-server.xml: -------------------------------------------------------------------------------- 1 | This file is a merged representation of a subset of the codebase, containing files not matching ignore patterns, combined into a single document by Repomix. 2 | 3 | 4 | This section contains a summary of this file. 5 | 6 | 7 | This file contains a packed representation of the entire repository's contents. 8 | It is designed to be easily consumable by AI systems for analysis, code review, 9 | or other automated processes. 10 | 11 | 12 | 13 | The content is organized as follows: 14 | 1. This summary section 15 | 2. Repository information 16 | 3. Directory structure 17 | 4. Repository files, each consisting of: 18 | - File path as an attribute 19 | - Full contents of the file 20 | 21 | 22 | 23 | - This file should be treated as read-only. Any changes should be made to the 24 | original repository files, not this packed version. 25 | - When processing this file, use the file path to distinguish 26 | between different files in the repository. 27 | - Be aware that this file may contain sensitive information. Handle it with 28 | the same level of security as you would the original repository. 29 | 30 | 31 | 32 | - Some files may have been excluded based on .gitignore rules and Repomix's configuration 33 | - Binary files are not included in this packed representation. 
Please refer to the Repository Structure section for a complete list of file paths, including binary files 34 | - Files matching these patterns are excluded: uv.lock, example_outputs/*, ai_docs 35 | - Files matching patterns in .gitignore are excluded 36 | - Files matching default ignore patterns are excluded 37 | - Files are sorted by Git change count (files with more changes are at the bottom) 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | .claude/ 48 | commands/ 49 | context_prime_w_lead.md 50 | context_prime.md 51 | jprompt_ultra_diff_review.md 52 | project_hello_w_name.md 53 | project_hello.md 54 | prompts/ 55 | countdown_component.txt 56 | mock_bin_search.txt 57 | mock_ui_component.txt 58 | specs/ 59 | init-just-prompt.md 60 | src/ 61 | just_prompt/ 62 | atoms/ 63 | llm_providers/ 64 | __init__.py 65 | anthropic.py 66 | deepseek.py 67 | gemini.py 68 | groq.py 69 | ollama.py 70 | openai.py 71 | shared/ 72 | __init__.py 73 | data_types.py 74 | model_router.py 75 | utils.py 76 | validator.py 77 | __init__.py 78 | molecules/ 79 | __init__.py 80 | list_models.py 81 | list_providers.py 82 | prompt_from_file_to_file.py 83 | prompt_from_file.py 84 | prompt.py 85 | tests/ 86 | atoms/ 87 | llm_providers/ 88 | __init__.py 89 | test_anthropic.py 90 | test_deepseek.py 91 | test_gemini.py 92 | test_groq.py 93 | test_ollama.py 94 | test_openai.py 95 | shared/ 96 | __init__.py 97 | test_model_router.py 98 | test_utils.py 99 | test_validator.py 100 | __init__.py 101 | molecules/ 102 | __init__.py 103 | test_list_models.py 104 | test_list_providers.py 105 | test_prompt_from_file_to_file.py 106 | test_prompt_from_file.py 107 | test_prompt.py 108 | __init__.py 109 | __init__.py 110 | __main__.py 111 | server.py 112 | ultra_diff_review/ 113 | diff_anthropic_claude-3-7-sonnet-20250219_4k.md 114 | diff_gemini_gemini-2.0-flash-thinking-exp.md 115 | diff_openai_o3-mini.md 116 | fusion_ultra_diff_review.md 117 | .env.sample 118 | .gitignore 119 | .mcp.json 120 | .python-version 121 | list_models.py 122 | pyproject.toml 123 | README.md 124 | 125 | 126 | 127 | This section contains the contents of the repository's files. 128 | 129 | 130 | READ README.md, THEN run git ls-files to understand the context of the project. 131 | 132 | Be sure to also READ: $ARGUMENTS and nothing else. 133 | 134 | 135 | 136 | hi how are you 137 | 138 | 139 | 140 | Create a countdown timer component that satisfies these requirements: 141 | 142 | 1. Framework implementations: 143 | - Vue.js 144 | - Svelte 145 | - React 146 | - Vanilla JavaScript 147 | 148 | 2. Component interface: 149 | - :start-time: number (starting time in seconds) 150 | - :format: number (display format, 0 = MM:SS, 1 = HH:MM:SS) 151 | 152 | 3. Features: 153 | - Count down from start-time to zero 154 | - Display remaining time in specified format 155 | - Stop counting when reaching zero 156 | - Emit/callback 'finished' event when countdown completes 157 | - Provide a visual indication when time is running low (< 10% of total) 158 | 159 | 4. Include: 160 | - Component implementation 161 | - Sample usage 162 | - Clear comments explaining key parts 163 | 164 | Provide clean, well-structured code for each framework version. 
165 | 166 | 167 | 168 | python: return code exclusively: def binary_search(arr, target) -> Optional[int]: 169 | 170 | 171 | 172 | Build vue, react, and svelte components for this component definition: 173 | 174 | 175 | 176 | The tree is a json object that looks like this: 177 | 178 | ```json 179 | { 180 | "name": "TableOfContents", 181 | "children": [ 182 | { 183 | "name": "Item", 184 | "children": [ 185 | { 186 | "name": "Item", 187 | "children": [] 188 | } 189 | ] 190 | }, 191 | { 192 | "name": "Item 2", 193 | "children": [] 194 | } 195 | ] 196 | } 197 | ``` 198 | 199 | 200 | 201 | # Specification for Just Prompt 202 | > We're building a lightweight wrapper mcp server around openai, anthropic, gemini, groq, deepseek, and ollama. 203 | 204 | ## Implementation details 205 | 206 | - First, READ ai_docs/* to understand the providers, models, and to see an example mcp server. 207 | - Mirror the work done inside `of ai_docs/pocket-pick-mcp-server-example.xml`. Here we have a complete example of how to build a mcp server. We also have a complete codebase structure that we want to replicate. With some slight tweaks - see `Codebase Structure` below. 208 | - Don't mock any tests - run simple "What is the capital of France?" tests and expect them to pass case insensitive. 209 | - Be sure to use load_dotenv() in the tests. 210 | - models_prefixed_by_provider look like this: 211 | - openai:gpt-4o 212 | - anthropic:claude-3-5-sonnet-20240620 213 | - gemini:gemini-1.5-flash 214 | - groq:llama-3.1-70b-versatile 215 | - deepseek:deepseek-coder 216 | - ollama:llama3.1 217 | - or using short names: 218 | - o:gpt-4o 219 | - a:claude-3-5-sonnet-20240620 220 | - g:gemini-1.5-flash 221 | - q:llama-3.1-70b-versatile 222 | - d:deepseek-coder 223 | - l:llama3.1 224 | - Be sure to comment every function and class with clear doc strings. 225 | - Don't explicitly write out the full list of models for a provider. Instead, use the `list_models` function. 226 | - Create a 'magic' function somewhere using the weak_provider_and_model param - make sure this is callable. We're going to take the 'models_prefixed_by_provider' and pass it to this function running a custom prompt where we ask the model to return the right model for this given item. TO be clear the 'models_prefixed_by_provider' will be a natural language query and will sometimes be wrong, so we want to correct it after parsing the provider and update it to the right value by provider this weak model prompt the list_model() call for the provider, then add the to the prompt and ask it to return the right model ONLY IF the model (from the split : call) is not in the providers list_model() already. If we run this functionality be sure to log 'weak_provider_and_model' and the 'models_prefixed_by_provider' and the 'corrected_model' to the console. If we dont just say 'using and '. 227 | - For tests use these models 228 | - o:gpt-4o-mini 229 | - a:claude-3-5-haiku 230 | - g:gemini-2.0-flash 231 | - q:qwen-2.5-32b 232 | - d:deepseek-coder 233 | - l:gemma3:12b 234 | - To implement list models read `list_models.py`. 
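The 'magic' correction flow described a few bullets above could look roughly like the following. This is an illustrative sketch only: `split_provider_and_model` mirrors the helper described later in this spec, while `resolve_model` and `weak_model_suggestion` are hypothetical names, and the actual weak-model prompt call is elided in favour of a pre-computed suggestion.

```python
from typing import List, Tuple


def split_provider_and_model(models_prefixed_by_provider: str) -> Tuple[str, str]:
    """Split only on the first ':' so model names may themselves contain ':'."""
    provider, sep, model = models_prefixed_by_provider.partition(":")
    if not sep:
        raise ValueError(f"Expected '<provider>:<model>', got: {models_prefixed_by_provider!r}")
    return provider, model


def resolve_model(models_prefixed_by_provider: str,
                  available_models: List[str],
                  weak_model_suggestion: str) -> str:
    """Keep the requested model if the provider already lists it; otherwise
    fall back to the weak model's suggestion and log the correction."""
    provider, model = split_provider_and_model(models_prefixed_by_provider)
    if model in available_models:
        print(f"using {provider} and {model}")
        return model
    print(f"corrected {models_prefixed_by_provider!r} to {weak_model_suggestion!r}")
    return weak_model_suggestion


# e.g. resolve_model("o:gpt4o", ["gpt-4o", "gpt-4o-mini"], "gpt-4o") returns "gpt-4o"
```

The point is simply that the requested model is kept untouched whenever the provider already lists it, and the correction is logged otherwise.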
235 | 236 | ## Tools we want to expose 237 | > Here's the tools we want to expose: 238 | 239 | prompt(text, models_prefixed_by_provider: List[str]) -> List[str] (return value is list of responses) 240 | 241 | prompt_from_file(file, models_prefixed_by_provider: List[str]) -> List[str] (return value is list of responses) 242 | 243 | prompt_from_file_to_file(file, models_prefixed_by_provider: List[str], output_dir: str = ".") -> List[str] (return value is a list of file paths) 244 | 245 | list_providers() -> List[str] 246 | 247 | list_models(provider: str) -> List[str] 248 | 249 | ## Codebase Structure 250 | 251 | - .env.sample 252 | - src/ 253 | - just_prompt/ 254 | - __init__.py 255 | - __main__.py 256 | - server.py 257 | - serve(weak_provider_and_model: str = "o:gpt-4o-mini") -> None 258 | - atoms/ 259 | - __init__.py 260 | - llm_providers/ 261 | - __init__.py 262 | - openai.py 263 | - prompt(text, model) -> str 264 | - list_models() -> List[str] 265 | - anthropic.py 266 | - ...same as openai.py 267 | - gemini.py 268 | - ... 269 | - groq.py 270 | - ... 271 | - deepseek.py 272 | - ... 273 | - ollama.py 274 | - ... 275 | - shared/ 276 | - __init__.py 277 | - validator.py 278 | - validate_models_prefixed_by_provider(models_prefixed_by_provider: List[str]) -> raise error if a model prefix does not match a provider 279 | - utils.py 280 | - split_provider_and_model(model: str) -> Tuple[str, str] - be sure this only splits the first : in the model string and leaves the rest of the string as the model name. Models will have additional : in the string and we want to ignore them and leave them for the model name. 281 | - data_types.py 282 | - class PromptRequest(BaseModel) {text: str, models_prefixed_by_provider: List[str]} 283 | - class PromptResponse(BaseModel) {responses: List[str]} 284 | - class PromptFromFileRequest(BaseModel) {file: str, models_prefixed_by_provider: List[str]} 285 | - class PromptFromFileResponse(BaseModel) {responses: List[str]} 286 | - class PromptFromFileToFileRequest(BaseModel) {file: str, models_prefixed_by_provider: List[str], output_dir: str = "."} 287 | - class PromptFromFileToFileResponse(BaseModel) {file_paths: List[str]} 288 | - class ListProvidersRequest(BaseModel) {} 289 | - class ListProvidersResponse(BaseModel) {providers: List[str]} - returns all providers with long and short names 290 | - class ListModelsRequest(BaseModel) {provider: str} 291 | - class ListModelsResponse(BaseModel) {models: List[str]} - returns all models for a given provider 292 | - class ModelAlias(BaseModel) {provider: str, model: str} 293 | - class ModelProviders(Enum): 294 | OPENAI = ("openai", "o") 295 | ANTHROPIC = ("anthropic", "a") 296 | GEMINI = ("gemini", "g") 297 | GROQ = ("groq", "q") 298 | DEEPSEEK = ("deepseek", "d") 299 | OLLAMA = ("ollama", "l") 300 | 301 | def __init__(self, full_name, short_name): 302 | self.full_name = full_name 303 | self.short_name = short_name 304 | 305 | @classmethod 306 | def from_name(cls, name): 307 | for provider in cls: 308 | if provider.full_name == name or provider.short_name == name: 309 | return provider 310 | return None 311 | - model_router.py 312 | - molecules/ 313 | - __init__.py 314 | - prompt.py 315 | - prompt_from_file.py 316 | - prompt_from_file_to_file.py 317 | - list_providers.py 318 | - list_models.py 319 | - tests/ 320 | - __init__.py 321 | - atoms/ 322 | - __init__.py 323 | - llm_providers/ 324 | - __init__.py 325 | - test_openai.py 326 | - test_anthropic.py 327 | - test_gemini.py 328 | - test_groq.py 329 | - test_deepseek.py 330 | 
- test_ollama.py 331 | - shared/ 332 | - __init__.py 333 | - test_utils.py 334 | - molecules/ 335 | - __init__.py 336 | - test_prompt.py 337 | - test_prompt_from_file.py 338 | - test_prompt_from_file_to_file.py 339 | - test_list_providers.py 340 | - test_list_models.py 341 | 342 | ## Per provider documentation 343 | 344 | ### OpenAI 345 | See: `ai_docs/llm_providers_details.xml` 346 | 347 | ### Anthropic 348 | See: `ai_docs/llm_providers_details.xml` 349 | 350 | ### Gemini 351 | See: `ai_docs/llm_providers_details.xml` 352 | 353 | ### Groq 354 | 355 | Quickstart 356 | Get up and running with the Groq API in a few minutes. 357 | 358 | Create an API Key 359 | Please visit here to create an API Key. 360 | 361 | Set up your API Key (recommended) 362 | Configure your API key as an environment variable. This approach streamlines your API usage by eliminating the need to include your API key in each request. Moreover, it enhances security by minimizing the risk of inadvertently including your API key in your codebase. 363 | 364 | In your terminal of choice: 365 | 366 | export GROQ_API_KEY= 367 | Requesting your first chat completion 368 | curl 369 | JavaScript 370 | Python 371 | JSON 372 | Install the Groq Python library: 373 | 374 | pip install groq 375 | Performing a Chat Completion: 376 | 377 | import os 378 | 379 | from groq import Groq 380 | 381 | client = Groq( 382 | api_key=os.environ.get("GROQ_API_KEY"), 383 | ) 384 | 385 | chat_completion = client.chat.completions.create( 386 | messages=[ 387 | { 388 | "role": "user", 389 | "content": "Explain the importance of fast language models", 390 | } 391 | ], 392 | model="llama-3.3-70b-versatile", 393 | ) 394 | 395 | print(chat_completion.choices[0].message.content) 396 | Now that you have successfully received a chat completion, you can try out the other endpoints in the API. 397 | 398 | Next Steps 399 | Check out the Playground to try out the Groq API in your browser 400 | Join our GroqCloud developer community on Discord 401 | Chat with our Docs at lightning speed using the Groq API! 402 | Add a how-to on your project to the Groq API Cookbook 403 | 404 | ### DeepSeek 405 | See: `ai_docs/llm_providers_details.xml` 406 | 407 | ### Ollama 408 | See: `ai_docs/llm_providers_details.xml` 409 | 410 | 411 | ## Validation (close the loop) 412 | 413 | - Run `uv run pytest ` to validate the tests are passing - do this iteratively as you build out the tests. 414 | - After code is written, run `uv run pytest` to validate all tests are passing. 415 | - At the end Use `uv run just-prompt --help` to validate the mcp server works. 416 | 417 | 418 | 419 | # LLM Providers package - interfaces for various LLM APIs 420 | 421 | 422 | 423 | """ 424 | DeepSeek provider implementation. 425 | """ 426 | 427 | import os 428 | from typing import List 429 | import logging 430 | from openai import OpenAI 431 | from dotenv import load_dotenv 432 | 433 | # Load environment variables 434 | load_dotenv() 435 | 436 | # Configure logging 437 | logger = logging.getLogger(__name__) 438 | 439 | # Initialize DeepSeek client with OpenAI-compatible interface 440 | client = OpenAI( 441 | api_key=os.environ.get("DEEPSEEK_API_KEY"), 442 | base_url="https://api.deepseek.com" 443 | ) 444 | 445 | 446 | def prompt(text: str, model: str) -> str: 447 | """ 448 | Send a prompt to DeepSeek and get a response. 
449 | 450 | Args: 451 | text: The prompt text 452 | model: The model name 453 | 454 | Returns: 455 | Response string from the model 456 | """ 457 | try: 458 | logger.info(f"Sending prompt to DeepSeek model: {model}") 459 | 460 | # Create chat completion 461 | response = client.chat.completions.create( 462 | model=model, 463 | messages=[{"role": "user", "content": text}], 464 | stream=False, 465 | ) 466 | 467 | # Extract response content 468 | return response.choices[0].message.content 469 | except Exception as e: 470 | logger.error(f"Error sending prompt to DeepSeek: {e}") 471 | raise ValueError(f"Failed to get response from DeepSeek: {str(e)}") 472 | 473 | 474 | def list_models() -> List[str]: 475 | """ 476 | List available DeepSeek models. 477 | 478 | Returns: 479 | List of model names 480 | """ 481 | try: 482 | logger.info("Listing DeepSeek models") 483 | response = client.models.list() 484 | 485 | # Extract model IDs 486 | models = [model.id for model in response.data] 487 | 488 | return models 489 | except Exception as e: 490 | logger.error(f"Error listing DeepSeek models: {e}") 491 | # Return some known models if API fails 492 | logger.info("Returning hardcoded list of known DeepSeek models") 493 | return [ 494 | "deepseek-coder", 495 | "deepseek-chat", 496 | "deepseek-reasoner", 497 | "deepseek-coder-v2", 498 | "deepseek-reasoner-lite" 499 | ] 500 | 501 | 502 | 503 | """ 504 | Google Gemini provider implementation. 505 | """ 506 | 507 | import os 508 | from typing import List 509 | import logging 510 | import google.generativeai as genai 511 | from dotenv import load_dotenv 512 | 513 | # Load environment variables 514 | load_dotenv() 515 | 516 | # Configure logging 517 | logger = logging.getLogger(__name__) 518 | 519 | # Initialize Gemini 520 | genai.configure(api_key=os.environ.get("GEMINI_API_KEY")) 521 | 522 | 523 | def prompt(text: str, model: str) -> str: 524 | """ 525 | Send a prompt to Google Gemini and get a response. 526 | 527 | Args: 528 | text: The prompt text 529 | model: The model name 530 | 531 | Returns: 532 | Response string from the model 533 | """ 534 | try: 535 | logger.info(f"Sending prompt to Gemini model: {model}") 536 | 537 | # Create generative model 538 | gemini_model = genai.GenerativeModel(model_name=model) 539 | 540 | # Generate content 541 | response = gemini_model.generate_content(text) 542 | 543 | return response.text 544 | except Exception as e: 545 | logger.error(f"Error sending prompt to Gemini: {e}") 546 | raise ValueError(f"Failed to get response from Gemini: {str(e)}") 547 | 548 | 549 | def list_models() -> List[str]: 550 | """ 551 | List available Google Gemini models. 
552 | 553 | Returns: 554 | List of model names 555 | """ 556 | try: 557 | logger.info("Listing Gemini models") 558 | 559 | # Get the list of models 560 | models = [] 561 | for m in genai.list_models(): 562 | if "generateContent" in m.supported_generation_methods: 563 | models.append(m.name) 564 | 565 | # Format model names - strip the "models/" prefix if present 566 | formatted_models = [model.replace("models/", "") for model in models] 567 | 568 | return formatted_models 569 | except Exception as e: 570 | logger.error(f"Error listing Gemini models: {e}") 571 | # Return some known models if API fails 572 | logger.info("Returning hardcoded list of known Gemini models") 573 | return [ 574 | "gemini-1.5-pro", 575 | "gemini-1.5-flash", 576 | "gemini-1.5-flash-latest", 577 | "gemini-1.0-pro", 578 | "gemini-2.0-flash" 579 | ] 580 | 581 | 582 | 583 | """ 584 | Groq provider implementation. 585 | """ 586 | 587 | import os 588 | from typing import List 589 | import logging 590 | from groq import Groq 591 | from dotenv import load_dotenv 592 | 593 | # Load environment variables 594 | load_dotenv() 595 | 596 | # Configure logging 597 | logger = logging.getLogger(__name__) 598 | 599 | # Initialize Groq client 600 | client = Groq(api_key=os.environ.get("GROQ_API_KEY")) 601 | 602 | 603 | def prompt(text: str, model: str) -> str: 604 | """ 605 | Send a prompt to Groq and get a response. 606 | 607 | Args: 608 | text: The prompt text 609 | model: The model name 610 | 611 | Returns: 612 | Response string from the model 613 | """ 614 | try: 615 | logger.info(f"Sending prompt to Groq model: {model}") 616 | 617 | # Create chat completion 618 | chat_completion = client.chat.completions.create( 619 | messages=[{"role": "user", "content": text}], 620 | model=model, 621 | ) 622 | 623 | # Extract response content 624 | return chat_completion.choices[0].message.content 625 | except Exception as e: 626 | logger.error(f"Error sending prompt to Groq: {e}") 627 | raise ValueError(f"Failed to get response from Groq: {str(e)}") 628 | 629 | 630 | def list_models() -> List[str]: 631 | """ 632 | List available Groq models. 633 | 634 | Returns: 635 | List of model names 636 | """ 637 | try: 638 | logger.info("Listing Groq models") 639 | response = client.models.list() 640 | 641 | # Extract model IDs 642 | models = [model.id for model in response.data] 643 | 644 | return models 645 | except Exception as e: 646 | logger.error(f"Error listing Groq models: {e}") 647 | # Return some known models if API fails 648 | logger.info("Returning hardcoded list of known Groq models") 649 | return [ 650 | "llama-3.3-70b-versatile", 651 | "llama-3.1-70b-versatile", 652 | "llama-3.1-8b-versatile", 653 | "mixtral-8x7b-32768", 654 | "gemma-7b-it", 655 | "qwen-2.5-32b" 656 | ] 657 | 658 | 659 | 660 | # Shared package - common utilities and data types 661 | 662 | 663 | 664 | # Atoms package - basic building blocks 665 | 666 | 667 | 668 | # Molecules package - higher-level functionality built from atoms 669 | 670 | 671 | 672 | """ 673 | List models functionality for just-prompt. 674 | """ 675 | 676 | from typing import List 677 | import logging 678 | from ..atoms.shared.validator import validate_provider 679 | from ..atoms.shared.model_router import ModelRouter 680 | 681 | logger = logging.getLogger(__name__) 682 | 683 | 684 | def list_models(provider: str) -> List[str]: 685 | """ 686 | List available models for a provider. 
687 | 688 | Args: 689 | provider: Provider name (full or short) 690 | 691 | Returns: 692 | List of model names 693 | """ 694 | # Validate provider 695 | validate_provider(provider) 696 | 697 | # Get models from provider 698 | return ModelRouter.route_list_models(provider) 699 | 700 | 701 | 702 | """ 703 | List providers functionality for just-prompt. 704 | """ 705 | 706 | from typing import List, Dict 707 | import logging 708 | from ..atoms.shared.data_types import ModelProviders 709 | 710 | logger = logging.getLogger(__name__) 711 | 712 | 713 | def list_providers() -> List[Dict[str, str]]: 714 | """ 715 | List all available providers with their full and short names. 716 | 717 | Returns: 718 | List of dictionaries with provider information 719 | """ 720 | providers = [] 721 | for provider in ModelProviders: 722 | providers.append({ 723 | "name": provider.name, 724 | "full_name": provider.full_name, 725 | "short_name": provider.short_name 726 | }) 727 | 728 | return providers 729 | 730 | 731 | 732 | # LLM Providers tests package 733 | 734 | 735 | 736 | """ 737 | Tests for DeepSeek provider. 738 | """ 739 | 740 | import pytest 741 | import os 742 | from dotenv import load_dotenv 743 | from just_prompt.atoms.llm_providers import deepseek 744 | 745 | # Load environment variables 746 | load_dotenv() 747 | 748 | # Skip tests if API key not available 749 | if not os.environ.get("DEEPSEEK_API_KEY"): 750 | pytest.skip("DeepSeek API key not available", allow_module_level=True) 751 | 752 | 753 | def test_list_models(): 754 | """Test listing DeepSeek models.""" 755 | models = deepseek.list_models() 756 | assert isinstance(models, list) 757 | assert len(models) > 0 758 | assert all(isinstance(model, str) for model in models) 759 | 760 | 761 | def test_prompt(): 762 | """Test sending prompt to DeepSeek.""" 763 | response = deepseek.prompt("What is the capital of France?", "deepseek-coder") 764 | assert isinstance(response, str) 765 | assert len(response) > 0 766 | assert "paris" in response.lower() or "Paris" in response 767 | 768 | 769 | 770 | """ 771 | Tests for Groq provider. 772 | """ 773 | 774 | import pytest 775 | import os 776 | from dotenv import load_dotenv 777 | from just_prompt.atoms.llm_providers import groq 778 | 779 | # Load environment variables 780 | load_dotenv() 781 | 782 | # Skip tests if API key not available 783 | if not os.environ.get("GROQ_API_KEY"): 784 | pytest.skip("Groq API key not available", allow_module_level=True) 785 | 786 | 787 | def test_list_models(): 788 | """Test listing Groq models.""" 789 | models = groq.list_models() 790 | assert isinstance(models, list) 791 | assert len(models) > 0 792 | assert all(isinstance(model, str) for model in models) 793 | 794 | 795 | def test_prompt(): 796 | """Test sending prompt to Groq.""" 797 | response = groq.prompt("What is the capital of France?", "qwen-2.5-32b") 798 | assert isinstance(response, str) 799 | assert len(response) > 0 800 | assert "paris" in response.lower() or "Paris" in response 801 | 802 | 803 | 804 | # Shared tests package 805 | 806 | 807 | 808 | """ 809 | Tests for utility functions. 
810 | """ 811 | 812 | import pytest 813 | from just_prompt.atoms.shared.utils import split_provider_and_model, get_provider_from_prefix 814 | 815 | 816 | def test_split_provider_and_model(): 817 | """Test splitting provider and model from string.""" 818 | # Test basic splitting 819 | provider, model = split_provider_and_model("openai:gpt-4") 820 | assert provider == "openai" 821 | assert model == "gpt-4" 822 | 823 | # Test short provider name 824 | provider, model = split_provider_and_model("o:gpt-4") 825 | assert provider == "o" 826 | assert model == "gpt-4" 827 | 828 | # Test model with colons 829 | provider, model = split_provider_and_model("ollama:llama3:latest") 830 | assert provider == "ollama" 831 | assert model == "llama3:latest" 832 | 833 | # Test invalid format 834 | with pytest.raises(ValueError): 835 | split_provider_and_model("invalid-model-string") 836 | 837 | 838 | def test_get_provider_from_prefix(): 839 | """Test getting provider from prefix.""" 840 | # Test full names 841 | assert get_provider_from_prefix("openai") == "openai" 842 | assert get_provider_from_prefix("anthropic") == "anthropic" 843 | assert get_provider_from_prefix("gemini") == "gemini" 844 | assert get_provider_from_prefix("groq") == "groq" 845 | assert get_provider_from_prefix("deepseek") == "deepseek" 846 | assert get_provider_from_prefix("ollama") == "ollama" 847 | 848 | # Test short names 849 | assert get_provider_from_prefix("o") == "openai" 850 | assert get_provider_from_prefix("a") == "anthropic" 851 | assert get_provider_from_prefix("g") == "gemini" 852 | assert get_provider_from_prefix("q") == "groq" 853 | assert get_provider_from_prefix("d") == "deepseek" 854 | assert get_provider_from_prefix("l") == "ollama" 855 | 856 | # Test invalid prefix 857 | with pytest.raises(ValueError): 858 | get_provider_from_prefix("unknown") 859 | 860 | 861 | 862 | # Atoms tests package 863 | 864 | 865 | 866 | # Molecules tests package 867 | 868 | 869 | 870 | """ 871 | Tests for list_providers functionality. 872 | """ 873 | 874 | import pytest 875 | from just_prompt.molecules.list_providers import list_providers 876 | 877 | 878 | def test_list_providers(): 879 | """Test listing providers.""" 880 | providers = list_providers() 881 | 882 | # Check basic structure 883 | assert isinstance(providers, list) 884 | assert len(providers) > 0 885 | assert all(isinstance(p, dict) for p in providers) 886 | 887 | # Check expected providers are present 888 | provider_names = [p["name"] for p in providers] 889 | assert "OPENAI" in provider_names 890 | assert "ANTHROPIC" in provider_names 891 | assert "GEMINI" in provider_names 892 | assert "GROQ" in provider_names 893 | assert "DEEPSEEK" in provider_names 894 | assert "OLLAMA" in provider_names 895 | 896 | # Check each provider has required fields 897 | for provider in providers: 898 | assert "name" in provider 899 | assert "full_name" in provider 900 | assert "short_name" in provider 901 | 902 | # Check full_name and short_name values 903 | if provider["name"] == "OPENAI": 904 | assert provider["full_name"] == "openai" 905 | assert provider["short_name"] == "o" 906 | elif provider["name"] == "ANTHROPIC": 907 | assert provider["full_name"] == "anthropic" 908 | assert provider["short_name"] == "a" 909 | 910 | 911 | 912 | # Tests package 913 | 914 | 915 | 916 | # just-prompt - A lightweight wrapper MCP server for various LLM providers 917 | 918 | __version__ = "0.1.0" 919 | 920 | 921 | 922 | # Code Review 923 | 924 | I've analyzed the changes made to the `list_models.py` file. 
The diff shows a complete refactoring of the file that organizes model listing functionality into separate functions for different AI providers. 925 | 926 | ## Key Changes 927 | 928 | 1. **Code Organization:** The code has been restructured from a series of commented blocks into organized functions for each AI provider. 929 | 2. **Function Implementation:** Each provider now has a dedicated function for listing their available models. 930 | 3. **DeepSeek API Key:** A hardcoded API key is now present in the DeepSeek function. 931 | 4. **Function Execution:** All functions are defined but commented out at the bottom of the file. 932 | 933 | ## Issues and Improvements 934 | 935 | ### 1. Hardcoded API Key 936 | The `list_deepseek_models()` function contains a hardcoded API key: `"sk-ds-3f422175ff114212a42d7107c3efd1e4"`. This is a significant security risk as API keys should never be stored in source code. 937 | 938 | ### 2. Inconsistent Environment Variable Usage 939 | Most functions use environment variables for API keys, but the DeepSeek function does not follow this pattern. 940 | 941 | ### 3. Error Handling 942 | None of the functions include error handling for API failures, network issues, or missing API keys. 943 | 944 | ### 4. Import Organization 945 | Import statements are scattered throughout the functions instead of being consolidated at the top of the file. 946 | 947 | ### 5. No Main Function 948 | There's no main function or entrypoint that would allow users to select which model list they want to see. 949 | 950 | ## Issue Summary 951 | 952 | | Issue | Solution | Risk Assessment | 953 | |-------|----------|-----------------| 954 | | 🚨 Hardcoded API key in DeepSeek function | Replace with environment variable: `api_key=os.environ.get("DEEPSEEK_API_KEY")` | High - Security risk, potential unauthorized API usage and charges | 955 | | ⚠️ No error handling | Add try/except blocks to handle API errors, network issues, and missing credentials | Medium - Code will fail without clear error messages | 956 | | 🔧 Inconsistent environment variable usage | Standardize API key access across all providers | Low - Maintenance and consistency issue | 957 | | 🔧 Scattered imports | Consolidate common imports at the top of the file | Low - Code organization issue | 958 | | 💡 No main function or CLI | Add a main function with argument parsing to run specific provider functions | Low - Usability enhancement | 959 | | 💡 Missing API key validation | Add checks to validate API keys are present before making API calls | Medium - Prevents unclear errors when keys are missing | 960 | 961 | The most critical issue is the hardcoded API key which should be addressed immediately to prevent security risks. 962 | 963 | 964 | 965 | ## Code Review 966 | 967 | The diff introduces modularity and improves the structure of the script by encapsulating the model listing logic for each provider into separate functions. However, there are a few issues and areas for improvement. 968 | 969 | **Issues, Bugs, and Improvements:** 970 | 971 | 1. **🚨 Hardcoded API Key (DeepSeek):** The `list_deepseek_models` function includes a hardcoded API key for DeepSeek. This is a major security vulnerability as API keys should be kept secret and managed securely, preferably through environment variables. 972 | 973 | 2. **⚠️ Lack of Error Handling:** The script lacks error handling. If API calls fail due to network issues, invalid API keys, or other reasons, the script will likely crash or produce uninformative error messages. 
Robust error handling is crucial for production-ready code. 974 | 975 | 3. **ℹ️ Inconsistent API Key Loading (Minor):** While `dotenv` is used for Anthropic and Gemini API keys, OpenAI, Groq, and DeepSeek (partially) rely directly on environment variables. While functional, consistent use of `dotenv` for all API keys would enhance maintainability and project consistency. 976 | 977 | 4. **ℹ️ Missing Function Docstrings (Minor):** The functions lack docstrings explaining their purpose, parameters (if any), and return values. Docstrings enhance code readability and make it easier to understand the function's role. 978 | 979 | 5. **ℹ️ No Centralized Configuration (Minor):** While using environment variables is good, having a more centralized configuration mechanism (even if it's just a `.env` file loaded by `dotenv`) could be beneficial for managing various settings in the future. 980 | 981 | **Markdown Table of Issues:** 982 | 983 | | Issue | Solution | Risk Assessment | 984 | |----------------------------|-------------------------------------------------------------|-----------------| 985 | | 🚨 **Hardcoded API Key (DeepSeek)** | Use environment variables to store and access the DeepSeek API key. | High | 986 | | ⚠️ **Lack of Error Handling** | Implement `try-except` blocks to handle potential API errors. | Medium | 987 | | ℹ️ **Inconsistent API Key Loading** | Use `dotenv` consistently for all API keys. | Low | 988 | | ℹ️ **Missing Function Docstrings** | Add docstrings to each function explaining its purpose. | Low | 989 | | ℹ️ **No Centralized Config** | Consider a more centralized configuration approach if needed. | Low | 990 | 991 | 992 | 993 | Below is a review of the changes with observations on potential issues, bugs, and improvements: 994 | 995 | 1. Hardcoded API key in list_deepseek_models 996 |  • Issue: The Deepseek function uses a hardcoded API key ("sk-ds-3f422175ff114212a42d7107c3efd1e4"). This exposes sensitive credentials in the source code. 997 |  • Recommendation: Retrieve the key from an environment variable (or a secure vault) as is done for other models. 998 |  • Severity: 🚨 Critical 999 | 1000 | 2. Repeated load_dotenv calls 1001 |  • Issue: Both list_anthropic_models and list_gemini_models call load_dotenv() even if they might be used in the same run. 1002 |  • Recommendation: Consider loading environment variables once in a main entry point or in a shared initialization function. 1003 |  • Severity: ⚠️ Moderate 1004 | 1005 | 3. Redundant API calls in list_gemini_models 1006 |  • Issue: The Gemini function calls client.models.list() twice (once for generateContent and again for embedContent). This might be inefficient if each call performs network I/O. 1007 |  • Recommendation: Cache the result of client.models.list() into a variable and reuse it for both loops. 1008 |  • Severity: ⚠️ Low 1009 | 1010 | 4. Inconsistent variable naming and potential confusion 1011 |  • Observation: In list_groq_models, the result of client.models.list() is stored in a variable named chat_completion even though the function is about listing models. 1012 |  • Recommendation: Use a name such as models or model_list for clarity. 1013 |  • Severity: ℹ️ Low 1014 | 1015 | 5. Lack of error handling for API calls 1016 |  • Observation: All functions simply print the results of API calls without handling potential exceptions (e.g., network errors, invalid credentials). 1017 |  • Recommendation: Wrap API calls in try-except blocks and add meaningful error messages. 
1018 |  • Severity: ⚠️ Moderate 1019 | 1020 | 6. Consistency in output formatting 1021 |  • Observation: While some functions print header messages (like list_anthropic_models and list_gemini_models), others (like list_openai_models or list_deepseek_models) simply print the raw result. 1022 |  • Recommendation: Add consistent formatting or output messages for clarity. 1023 |  • Severity: ℹ️ Low 1024 | 1025 | Below is a concise summary in a markdown table: 1026 | 1027 | | Issue | Solution | Risk Assessment | 1028 | |--------------------------------------|------------------------------------------------------------------------------------------|--------------------------| 1029 | | Hardcoded API key in Deepseek | Use an environment variable (e.g., os.environ.get("DEEPSEEK_API_KEY")) | 🚨 Critical | 1030 | | Multiple load_dotenv() calls | Load environment variables once at program start instead of in each function | ⚠️ Moderate | 1031 | | Redundant API call in Gemini models | Cache client.models.list() in a variable and reuse it for looping through supported actions | ⚠️ Low | 1032 | | Inconsistent variable naming (Groq) | Rename variables (e.g., change "chat_completion" to "models" in list_groq_models) | ℹ️ Low (cosmetic) | 1033 | | Lack of error handling | Wrap API calls in try-except blocks and log errors or provide user-friendly error messages | ⚠️ Moderate | 1034 | 1035 | This review should help in making the code more secure, efficient, and maintainable. 1036 | 1037 | 1038 | 1039 | # Ultra Diff Review - Fusion Analysis 1040 | 1041 | ## Overview 1042 | This is a synthesized analysis combining insights from multiple LLM reviews of the changes made to `list_models.py`. The code has been refactored to organize model listing functionality into separate functions for different AI providers. 1043 | 1044 | ## Critical Issues 1045 | 1046 | ### 1. 🚨 Hardcoded API Key (DeepSeek) 1047 | **Description**: The `list_deepseek_models()` function contains a hardcoded API key (`"sk-ds-3f422175ff114212a42d7107c3efd1e4"`). 1048 | **Impact**: Major security vulnerability that could lead to unauthorized API usage and charges. 1049 | **Solution**: Use environment variables instead: 1050 | ```python 1051 | api_key=os.environ.get("DEEPSEEK_API_KEY") 1052 | ``` 1053 | 1054 | ### 2. ⚠️ Lack of Error Handling 1055 | **Description**: None of the functions include error handling for API failures, network issues, or missing credentials. 1056 | **Impact**: Code will crash or produce uninformative errors with actual usage. 1057 | **Solution**: Implement try-except blocks for all API calls: 1058 | ```python 1059 | try: 1060 | client = DeepSeek(api_key=os.environ.get("DEEPSEEK_API_KEY")) 1061 | models = client.models.list() 1062 | # Process models 1063 | except Exception as e: 1064 | print(f"Error fetching DeepSeek models: {e}") 1065 | ``` 1066 | 1067 | ## Medium Priority Issues 1068 | 1069 | ### 3. ⚠️ Multiple load_dotenv() Calls 1070 | **Description**: Both `list_anthropic_models()` and `list_gemini_models()` call `load_dotenv()` independently. 1071 | **Impact**: Redundant operations if multiple functions are called in the same run. 1072 | **Solution**: Move `load_dotenv()` to a single location at the top of the file. 1073 | 1074 | ### 4. ⚠️ Inconsistent API Key Access Patterns 1075 | **Description**: Different functions use different methods to access API keys. 1076 | **Impact**: Reduces code maintainability and consistency. 1077 | **Solution**: Standardize API key access patterns across all providers. 
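One way to address this (and the repeated `load_dotenv()` calls from issue 3) is a single key-lookup helper that every provider function uses. The following is a minimal sketch under stated assumptions: the helper name `require_api_key`, the env-var mapping, and the OpenAI-compatible DeepSeek client shown here are illustrative, not code from the reviewed file.
```python
import os
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables once at module import, not inside each list_* function.
load_dotenv()

# Single source of truth for which environment variable each provider reads.
API_KEY_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


def require_api_key(provider: str) -> str:
    """Return the API key for a provider, failing loudly if it is missing."""
    env_var = API_KEY_ENV_VARS[provider]
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Missing environment variable {env_var} for provider '{provider}'")
    return key


def list_deepseek_models() -> None:
    """List DeepSeek models via an OpenAI-compatible client, with no hardcoded key."""
    client = OpenAI(api_key=require_api_key("deepseek"), base_url="https://api.deepseek.com")
    for model in client.models.list():
        print(model.id)
```
With a helper like this, a missing key surfaces as one clear error message instead of an opaque failure deep inside an API call.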
1078 | 1079 | ### 5. ⚠️ Redundant API Call in Gemini Function 1080 | **Description**: `list_gemini_models()` calls `client.models.list()` twice for different filtering operations. 1081 | **Impact**: Potential performance issue - may make unnecessary network calls. 1082 | **Solution**: Store results in a variable and reuse: 1083 | ```python 1084 | models = client.models.list() 1085 | print("List of models that support generateContent:\n") 1086 | for m in models: 1087 | # Filter for generateContent 1088 | 1089 | print("List of models that support embedContent:\n") 1090 | for m in models: 1091 | # Filter for embedContent 1092 | ``` 1093 | 1094 | ## Low Priority Issues 1095 | 1096 | ### 6. ℹ️ Inconsistent Variable Naming 1097 | **Description**: In `list_groq_models()`, the result of `client.models.list()` is stored in a variable named `chat_completion`. 1098 | **Impact**: Low - could cause confusion during maintenance. 1099 | **Solution**: Use a more appropriate variable name like `models` or `model_list`. 1100 | 1101 | ### 7. ℹ️ Inconsistent Output Formatting 1102 | **Description**: Some functions include descriptive print statements, while others just print raw results. 1103 | **Impact**: Low - user experience inconsistency. 1104 | **Solution**: Standardize output formatting across all functions. 1105 | 1106 | ### 8. ℹ️ Scattered Imports 1107 | **Description**: Import statements are scattered throughout functions rather than at the top of the file. 1108 | **Impact**: Low - code organization issue. 1109 | **Solution**: Consolidate imports at the top of the file. 1110 | 1111 | ### 9. ℹ️ Missing Function Docstrings 1112 | **Description**: Functions lack documentation describing their purpose and usage. 1113 | **Impact**: Low - reduces code readability and maintainability. 1114 | **Solution**: Add docstrings to all functions. 1115 | 1116 | ### 10. 💡 No Main Function 1117 | **Description**: There's no main function to coordinate the execution of different provider functions. 1118 | **Impact**: Low - usability enhancement needed. 1119 | **Solution**: Add a main function with argument parsing to run specific provider functions. 1120 | 1121 | ## Summary Table 1122 | 1123 | | ID | Issue | Solution | Risk Assessment | 1124 | |----|-------|----------|-----------------| 1125 | | 1 | 🚨 Hardcoded API key (DeepSeek) | Use environment variables | High | 1126 | | 2 | ⚠️ No error handling | Add try/except blocks for API calls | Medium | 1127 | | 3 | ⚠️ Multiple load_dotenv() calls | Move to single location at file top | Medium | 1128 | | 4 | ⚠️ Inconsistent API key access | Standardize patterns across providers | Medium | 1129 | | 5 | ⚠️ Redundant API call (Gemini) | Cache API response in variable | Medium | 1130 | | 6 | ℹ️ Inconsistent variable naming | Rename variables appropriately | Low | 1131 | | 7 | ℹ️ Inconsistent output formatting | Standardize output format | Low | 1132 | | 8 | ℹ️ Scattered imports | Consolidate imports at file top | Low | 1133 | | 9 | ℹ️ Missing function docstrings | Add documentation to functions | Low | 1134 | | 10 | 💡 No main function | Add main() with argument parsing | Low | 1135 | 1136 | ## Recommendation 1137 | The hardcoded API key issue (#1) should be addressed immediately as it poses a significant security risk. Following that, implementing proper error handling (#2) would greatly improve the reliability of the code. 1138 | 1139 | 1140 | 1141 | 3.12 1142 | 1143 | 1144 | 1145 | READ README.md, THEN run git ls-files to understand the context of the project. 
1146 | 1147 | 1148 | 1149 | hi how are you $ARGUMENTS 1150 | 1151 | 1152 | 1153 | """ 1154 | Data types and models for just-prompt MCP server. 1155 | """ 1156 | 1157 | from enum import Enum 1158 | 1159 | 1160 | class ModelProviders(Enum): 1161 | """ 1162 | Enum of supported model providers with their full and short names. 1163 | """ 1164 | OPENAI = ("openai", "o") 1165 | ANTHROPIC = ("anthropic", "a") 1166 | GEMINI = ("gemini", "g") 1167 | GROQ = ("groq", "q") 1168 | DEEPSEEK = ("deepseek", "d") 1169 | OLLAMA = ("ollama", "l") 1170 | 1171 | def __init__(self, full_name, short_name): 1172 | self.full_name = full_name 1173 | self.short_name = short_name 1174 | 1175 | @classmethod 1176 | def from_name(cls, name): 1177 | """ 1178 | Get provider enum from full or short name. 1179 | 1180 | Args: 1181 | name: The provider name (full or short) 1182 | 1183 | Returns: 1184 | ModelProviders: The corresponding provider enum, or None if not found 1185 | """ 1186 | for provider in cls: 1187 | if provider.full_name == name or provider.short_name == name: 1188 | return provider 1189 | return None 1190 | 1191 | 1192 | 1193 | """ 1194 | Validation utilities for just-prompt. 1195 | """ 1196 | 1197 | from typing import List, Dict, Optional, Tuple 1198 | import logging 1199 | import os 1200 | from .data_types import ModelProviders 1201 | from .utils import split_provider_and_model, get_api_key 1202 | 1203 | logger = logging.getLogger(__name__) 1204 | 1205 | 1206 | def validate_models_prefixed_by_provider(models_prefixed_by_provider: List[str]) -> bool: 1207 | """ 1208 | Validate that provider prefixes in model strings are valid. 1209 | 1210 | Args: 1211 | models_prefixed_by_provider: List of model strings in format "provider:model" 1212 | 1213 | Returns: 1214 | True if all valid, raises ValueError otherwise 1215 | """ 1216 | if not models_prefixed_by_provider: 1217 | raise ValueError("No models provided") 1218 | 1219 | for model_string in models_prefixed_by_provider: 1220 | try: 1221 | provider_prefix, model_name = split_provider_and_model(model_string) 1222 | provider = ModelProviders.from_name(provider_prefix) 1223 | if provider is None: 1224 | raise ValueError(f"Unknown provider prefix: {provider_prefix}") 1225 | except Exception as e: 1226 | logger.error(f"Validation error for model string '{model_string}': {str(e)}") 1227 | raise 1228 | 1229 | return True 1230 | 1231 | 1232 | def validate_provider(provider: str) -> bool: 1233 | """ 1234 | Validate that a provider name is valid. 1235 | 1236 | Args: 1237 | provider: Provider name (full or short) 1238 | 1239 | Returns: 1240 | True if valid, raises ValueError otherwise 1241 | """ 1242 | provider_enum = ModelProviders.from_name(provider) 1243 | if provider_enum is None: 1244 | raise ValueError(f"Unknown provider: {provider}") 1245 | 1246 | return True 1247 | 1248 | 1249 | def validate_provider_api_keys() -> Dict[str, bool]: 1250 | """ 1251 | Validate that API keys are available for each provider. 
1252 | 1253 | Returns: 1254 | Dictionary mapping provider names to availability status (True if available, False otherwise) 1255 | """ 1256 | available_providers = {} 1257 | 1258 | # Check API keys for each provider 1259 | for provider in ModelProviders: 1260 | provider_name = provider.full_name 1261 | 1262 | # Special case for Ollama which uses OLLAMA_HOST instead of an API key 1263 | if provider_name == "ollama": 1264 | host = os.environ.get("OLLAMA_HOST") 1265 | is_available = host is not None and host.strip() != "" 1266 | available_providers[provider_name] = is_available 1267 | else: 1268 | # Get API key 1269 | api_key = get_api_key(provider_name) 1270 | is_available = api_key is not None and api_key.strip() != "" 1271 | available_providers[provider_name] = is_available 1272 | 1273 | return available_providers 1274 | 1275 | 1276 | def print_provider_availability(detailed: bool = True) -> None: 1277 | """ 1278 | Print information about which providers are available based on API keys. 1279 | 1280 | Args: 1281 | detailed: Whether to print detailed information about missing keys 1282 | """ 1283 | availability = validate_provider_api_keys() 1284 | 1285 | available = [p for p, status in availability.items() if status] 1286 | unavailable = [p for p, status in availability.items() if not status] 1287 | 1288 | # Print availability information 1289 | logger.info(f"Available LLM providers: {', '.join(available)}") 1290 | 1291 | if detailed and unavailable: 1292 | env_vars = { 1293 | "openai": "OPENAI_API_KEY", 1294 | "anthropic": "ANTHROPIC_API_KEY", 1295 | "gemini": "GEMINI_API_KEY", 1296 | "groq": "GROQ_API_KEY", 1297 | "deepseek": "DEEPSEEK_API_KEY", 1298 | "ollama": "OLLAMA_HOST" 1299 | } 1300 | 1301 | logger.warning(f"The following providers are unavailable due to missing API keys:") 1302 | for provider in unavailable: 1303 | env_var = env_vars.get(provider) 1304 | if env_var: 1305 | logger.warning(f" - {provider}: Missing environment variable {env_var}") 1306 | else: 1307 | logger.warning(f" - {provider}: Missing configuration") 1308 | 1309 | 1310 | 1311 | """ 1312 | Prompt from file functionality for just-prompt. 1313 | """ 1314 | 1315 | from typing import List 1316 | import logging 1317 | import os 1318 | from pathlib import Path 1319 | from .prompt import prompt 1320 | 1321 | logger = logging.getLogger(__name__) 1322 | 1323 | 1324 | def prompt_from_file(file: str, models_prefixed_by_provider: List[str] = None) -> List[str]: 1325 | """ 1326 | Read text from a file and send it as a prompt to multiple models. 1327 | 1328 | Args: 1329 | file: Path to the text file 1330 | models_prefixed_by_provider: List of model strings in format "provider:model" 1331 | If None, uses the DEFAULT_MODELS environment variable 1332 | 1333 | Returns: 1334 | List of responses from the models 1335 | """ 1336 | file_path = Path(file) 1337 | 1338 | # Validate file 1339 | if not file_path.exists(): 1340 | raise FileNotFoundError(f"File not found: {file}") 1341 | 1342 | if not file_path.is_file(): 1343 | raise ValueError(f"Not a file: {file}") 1344 | 1345 | # Read file content 1346 | try: 1347 | with open(file_path, 'r', encoding='utf-8') as f: 1348 | text = f.read() 1349 | except Exception as e: 1350 | logger.error(f"Error reading file {file}: {e}") 1351 | raise ValueError(f"Error reading file: {str(e)}") 1352 | 1353 | # Send prompt with file content 1354 | return prompt(text, models_prefixed_by_provider) 1355 | 1356 | 1357 | 1358 | """ 1359 | Tests for Gemini provider. 
1360 | """ 1361 | 1362 | import pytest 1363 | import os 1364 | from dotenv import load_dotenv 1365 | from just_prompt.atoms.llm_providers import gemini 1366 | 1367 | # Load environment variables 1368 | load_dotenv() 1369 | 1370 | # Skip tests if API key not available 1371 | if not os.environ.get("GEMINI_API_KEY"): 1372 | pytest.skip("Gemini API key not available", allow_module_level=True) 1373 | 1374 | 1375 | def test_list_models(): 1376 | """Test listing Gemini models.""" 1377 | models = gemini.list_models() 1378 | 1379 | # Assertions 1380 | assert isinstance(models, list) 1381 | assert len(models) > 0 1382 | assert all(isinstance(model, str) for model in models) 1383 | 1384 | # Check for at least one expected model containing gemini 1385 | gemini_models = [model for model in models if "gemini" in model.lower()] 1386 | assert len(gemini_models) > 0, "No Gemini models found" 1387 | 1388 | 1389 | def test_prompt(): 1390 | """Test sending prompt to Gemini.""" 1391 | # Using gemini-1.5-flash as the model for testing 1392 | response = gemini.prompt("What is the capital of France?", "gemini-1.5-flash") 1393 | 1394 | # Assertions 1395 | assert isinstance(response, str) 1396 | assert len(response) > 0 1397 | assert "paris" in response.lower() or "Paris" in response 1398 | 1399 | 1400 | 1401 | """ 1402 | Tests for Ollama provider. 1403 | """ 1404 | 1405 | import pytest 1406 | import os 1407 | from dotenv import load_dotenv 1408 | from just_prompt.atoms.llm_providers import ollama 1409 | 1410 | # Load environment variables 1411 | load_dotenv() 1412 | 1413 | 1414 | def test_list_models(): 1415 | """Test listing Ollama models.""" 1416 | models = ollama.list_models() 1417 | assert isinstance(models, list) 1418 | assert isinstance(models[0], str) 1419 | assert len(models) > 0 1420 | 1421 | 1422 | def test_prompt(): 1423 | """Test sending prompt to Ollama.""" 1424 | # Using llama3 as default model - adjust if needed based on your environment 1425 | 1426 | response = ollama.prompt("What is the capital of France?", "gemma3:12b") 1427 | 1428 | # Assertions 1429 | assert isinstance(response, str) 1430 | assert len(response) > 0 1431 | assert "paris" in response.lower() or "Paris" in response 1432 | 1433 | 1434 | 1435 | """ 1436 | Tests for OpenAI provider. 1437 | """ 1438 | 1439 | import pytest 1440 | import os 1441 | from dotenv import load_dotenv 1442 | from just_prompt.atoms.llm_providers import openai 1443 | 1444 | # Load environment variables 1445 | load_dotenv() 1446 | 1447 | # Skip tests if API key not available 1448 | if not os.environ.get("OPENAI_API_KEY"): 1449 | pytest.skip("OpenAI API key not available", allow_module_level=True) 1450 | 1451 | 1452 | def test_list_models(): 1453 | """Test listing OpenAI models.""" 1454 | models = openai.list_models() 1455 | 1456 | # Assertions 1457 | assert isinstance(models, list) 1458 | assert len(models) > 0 1459 | assert all(isinstance(model, str) for model in models) 1460 | 1461 | # Check for at least one expected model 1462 | gpt_models = [model for model in models if "gpt" in model.lower()] 1463 | assert len(gpt_models) > 0, "No GPT models found" 1464 | 1465 | 1466 | def test_prompt(): 1467 | """Test sending prompt to OpenAI.""" 1468 | response = openai.prompt("What is the capital of France?", "gpt-4o-mini") 1469 | 1470 | # Assertions 1471 | assert isinstance(response, str) 1472 | assert len(response) > 0 1473 | assert "paris" in response.lower() or "Paris" in response 1474 | 1475 | 1476 | 1477 | """ 1478 | Tests for validator functions. 
1479 | """ 1480 | 1481 | import pytest 1482 | import os 1483 | from unittest.mock import patch 1484 | from just_prompt.atoms.shared.validator import ( 1485 | validate_models_prefixed_by_provider, 1486 | validate_provider, 1487 | validate_provider_api_keys, 1488 | print_provider_availability 1489 | ) 1490 | 1491 | 1492 | def test_validate_models_prefixed_by_provider(): 1493 | """Test validating model strings.""" 1494 | # Valid model strings 1495 | assert validate_models_prefixed_by_provider(["openai:gpt-4o-mini"]) == True 1496 | assert validate_models_prefixed_by_provider(["anthropic:claude-3-5-haiku"]) == True 1497 | assert validate_models_prefixed_by_provider(["o:gpt-4o-mini", "a:claude-3-5-haiku"]) == True 1498 | 1499 | # Invalid model strings 1500 | with pytest.raises(ValueError): 1501 | validate_models_prefixed_by_provider([]) 1502 | 1503 | with pytest.raises(ValueError): 1504 | validate_models_prefixed_by_provider(["unknown:model"]) 1505 | 1506 | with pytest.raises(ValueError): 1507 | validate_models_prefixed_by_provider(["invalid-format"]) 1508 | 1509 | 1510 | def test_validate_provider(): 1511 | """Test validating provider names.""" 1512 | # Valid providers 1513 | assert validate_provider("openai") == True 1514 | assert validate_provider("anthropic") == True 1515 | assert validate_provider("o") == True 1516 | assert validate_provider("a") == True 1517 | 1518 | # Invalid providers 1519 | with pytest.raises(ValueError): 1520 | validate_provider("unknown") 1521 | 1522 | with pytest.raises(ValueError): 1523 | validate_provider("") 1524 | 1525 | 1526 | def test_validate_provider_api_keys(): 1527 | """Test validating provider API keys.""" 1528 | # Use mocked environment variables with a mix of valid, empty, and missing keys 1529 | with patch.dict(os.environ, { 1530 | "OPENAI_API_KEY": "test-key", 1531 | "ANTHROPIC_API_KEY": "test-key", 1532 | "GROQ_API_KEY": "test-key", 1533 | # GEMINI_API_KEY not defined 1534 | "DEEPSEEK_API_KEY": "test-key", 1535 | "OLLAMA_HOST": "http://localhost:11434" 1536 | }): 1537 | # Call the function to validate provider API keys 1538 | availability = validate_provider_api_keys() 1539 | 1540 | # Check that each provider has the correct availability status 1541 | assert availability["openai"] is True 1542 | assert availability["anthropic"] is True 1543 | assert availability["groq"] is True 1544 | 1545 | # This depends on the actual implementation. 
Since we're mocking the environment, 1546 | # let's just assert that the keys exist rather than specific values 1547 | assert "gemini" in availability 1548 | assert "deepseek" in availability 1549 | assert "ollama" in availability 1550 | 1551 | # Make sure all providers are included in the result 1552 | assert set(availability.keys()) == {"openai", "anthropic", "gemini", "groq", "deepseek", "ollama"} 1553 | 1554 | 1555 | def test_validate_provider_api_keys_none(): 1556 | """Test validating provider API keys when none are available.""" 1557 | # Use mocked environment variables with no API keys 1558 | with patch.dict(os.environ, {}, clear=True): 1559 | # Call the function to validate provider API keys 1560 | availability = validate_provider_api_keys() 1561 | 1562 | # Check that all providers are marked as unavailable 1563 | assert all(status is False for status in availability.values()) 1564 | assert set(availability.keys()) == {"openai", "anthropic", "gemini", "groq", "deepseek", "ollama"} 1565 | 1566 | 1567 | def test_print_provider_availability(): 1568 | """Test printing provider availability.""" 1569 | # Mock the validate_provider_api_keys function to return a controlled result 1570 | mock_availability = { 1571 | "openai": True, 1572 | "anthropic": False, 1573 | "gemini": True, 1574 | "groq": False, 1575 | "deepseek": True, 1576 | "ollama": False 1577 | } 1578 | 1579 | with patch('just_prompt.atoms.shared.validator.validate_provider_api_keys', 1580 | return_value=mock_availability): 1581 | 1582 | # Mock the logger to verify the log messages 1583 | with patch('just_prompt.atoms.shared.validator.logger') as mock_logger: 1584 | # Call the function to print provider availability 1585 | print_provider_availability(detailed=True) 1586 | 1587 | # Verify that info was called with a message about available providers 1588 | mock_logger.info.assert_called_once() 1589 | info_call_args = mock_logger.info.call_args[0][0] 1590 | assert "Available LLM providers:" in info_call_args 1591 | assert "openai" in info_call_args 1592 | assert "gemini" in info_call_args 1593 | assert "deepseek" in info_call_args 1594 | 1595 | # Check that warning was called multiple times 1596 | assert mock_logger.warning.call_count >= 2 1597 | 1598 | # Check that the first warning is about missing API keys 1599 | warning_calls = [call[0][0] for call in mock_logger.warning.call_args_list] 1600 | assert "The following providers are unavailable due to missing API keys:" in warning_calls 1601 | 1602 | 1603 | 1604 | """ 1605 | Tests for prompt_from_file functionality. 
1606 | """ 1607 | 1608 | import pytest 1609 | import os 1610 | import tempfile 1611 | from dotenv import load_dotenv 1612 | from just_prompt.molecules.prompt_from_file import prompt_from_file 1613 | 1614 | # Load environment variables 1615 | load_dotenv() 1616 | 1617 | 1618 | def test_nonexistent_file(): 1619 | """Test with non-existent file.""" 1620 | with pytest.raises(FileNotFoundError): 1621 | prompt_from_file("/non/existent/file.txt", ["o:gpt-4o-mini"]) 1622 | 1623 | 1624 | def test_file_read(): 1625 | """Test that the file is read correctly and processes with real API call.""" 1626 | # Create temporary file with a simple question 1627 | with tempfile.NamedTemporaryFile(mode='w+', delete=False) as temp: 1628 | temp.write("What is the capital of France?") 1629 | temp_path = temp.name 1630 | 1631 | try: 1632 | # Make real API call 1633 | response = prompt_from_file(temp_path, ["o:gpt-4o-mini"]) 1634 | 1635 | # Assertions 1636 | assert isinstance(response, list) 1637 | assert len(response) == 1 1638 | assert "paris" in response[0].lower() or "Paris" in response[0] 1639 | finally: 1640 | # Clean up 1641 | os.unlink(temp_path) 1642 | 1643 | 1644 | 1645 | """ 1646 | Tests for prompt functionality. 1647 | """ 1648 | 1649 | import pytest 1650 | import os 1651 | from dotenv import load_dotenv 1652 | from just_prompt.molecules.prompt import prompt 1653 | 1654 | # Load environment variables 1655 | load_dotenv() 1656 | 1657 | def test_prompt_basic(): 1658 | """Test basic prompt functionality with a real API call.""" 1659 | # Define a simple test case 1660 | test_prompt = "What is the capital of France?" 1661 | test_models = ["openai:gpt-4o-mini"] 1662 | 1663 | # Call the prompt function with a real model 1664 | response = prompt(test_prompt, test_models) 1665 | 1666 | # Assertions 1667 | assert isinstance(response, list) 1668 | assert len(response) == 1 1669 | assert "paris" in response[0].lower() or "Paris" in response[0] 1670 | 1671 | def test_prompt_multiple_models(): 1672 | """Test prompt with multiple models.""" 1673 | # Skip if API keys aren't available 1674 | if not os.environ.get("OPENAI_API_KEY") or not os.environ.get("ANTHROPIC_API_KEY"): 1675 | pytest.skip("Required API keys not available") 1676 | 1677 | # Define a simple test case 1678 | test_prompt = "What is the capital of France?" 1679 | test_models = ["openai:gpt-4o-mini", "anthropic:claude-3-5-haiku-20241022"] 1680 | 1681 | # Call the prompt function with multiple models 1682 | response = prompt(test_prompt, test_models) 1683 | 1684 | # Assertions 1685 | assert isinstance(response, list) 1686 | assert len(response) == 2 1687 | # Check all responses contain Paris 1688 | for r in response: 1689 | assert "paris" in r.lower() or "Paris" in r 1690 | 1691 | 1692 | 1693 | # Environment Variables for just-prompt 1694 | 1695 | # OpenAI API Key 1696 | OPENAI_API_KEY=your_openai_api_key_here 1697 | 1698 | # Anthropic API Key 1699 | ANTHROPIC_API_KEY=your_anthropic_api_key_here 1700 | 1701 | # Gemini API Key 1702 | GEMINI_API_KEY=your_gemini_api_key_here 1703 | 1704 | # Groq API Key 1705 | GROQ_API_KEY=your_groq_api_key_here 1706 | 1707 | # DeepSeek API Key 1708 | DEEPSEEK_API_KEY=your_deepseek_api_key_here 1709 | 1710 | # Ollama endpoint (if not default) 1711 | OLLAMA_HOST=http://localhost:11434 1712 | 1713 | 1714 | 1715 | """ 1716 | Anthropic provider implementation. 
1717 | """ 1718 | 1719 | import os 1720 | import re 1721 | import anthropic 1722 | from typing import List, Tuple 1723 | import logging 1724 | from dotenv import load_dotenv 1725 | 1726 | # Load environment variables 1727 | load_dotenv() 1728 | 1729 | # Configure logging 1730 | logger = logging.getLogger(__name__) 1731 | 1732 | # Initialize Anthropic client 1733 | client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY")) 1734 | 1735 | 1736 | def parse_thinking_suffix(model: str) -> Tuple[str, int]: 1737 | """ 1738 | Parse a model name to check for thinking token budget suffixes. 1739 | Only works with the claude-3-7-sonnet-20250219 model. 1740 | 1741 | Supported formats: 1742 | - model:1k, model:4k, model:16k 1743 | - model:1000, model:1054, model:1333, etc. (any value between 1024-16000) 1744 | 1745 | Args: 1746 | model: The model name potentially with a thinking suffix 1747 | 1748 | Returns: 1749 | Tuple of (base_model_name, thinking_budget) 1750 | If no thinking suffix is found, thinking_budget will be 0 1751 | """ 1752 | # Look for patterns like ":1k", ":4k", ":16k" or ":1000", ":1054", etc. 1753 | pattern = r'^(.+?)(?::(\d+)k?)?$' 1754 | match = re.match(pattern, model) 1755 | 1756 | if not match: 1757 | return model, 0 1758 | 1759 | base_model = match.group(1) 1760 | thinking_suffix = match.group(2) 1761 | 1762 | # Validate the model - only claude-3-7-sonnet-20250219 supports thinking 1763 | if base_model != "claude-3-7-sonnet-20250219": 1764 | logger.warning(f"Model {base_model} does not support thinking, ignoring thinking suffix") 1765 | return base_model, 0 1766 | 1767 | if not thinking_suffix: 1768 | return model, 0 1769 | 1770 | # Convert to integer 1771 | try: 1772 | thinking_budget = int(thinking_suffix) 1773 | # If a small number like 1, 4, 16 is provided, assume it's in "k" (multiply by 1024) 1774 | if thinking_budget < 100: 1775 | thinking_budget *= 1024 1776 | 1777 | # Adjust values outside the range 1778 | if thinking_budget < 1024: 1779 | logger.warning(f"Thinking budget {thinking_budget} below minimum (1024), using 1024 instead") 1780 | thinking_budget = 1024 1781 | elif thinking_budget > 16000: 1782 | logger.warning(f"Thinking budget {thinking_budget} above maximum (16000), using 16000 instead") 1783 | thinking_budget = 16000 1784 | 1785 | logger.info(f"Using thinking budget of {thinking_budget} tokens for model {base_model}") 1786 | return base_model, thinking_budget 1787 | except ValueError: 1788 | logger.warning(f"Invalid thinking budget format: {thinking_suffix}, ignoring") 1789 | return base_model, 0 1790 | 1791 | 1792 | def prompt_with_thinking(text: str, model: str, thinking_budget: int) -> str: 1793 | """ 1794 | Send a prompt to Anthropic Claude with thinking enabled and get a response. 
1795 | 1796 | Args: 1797 | text: The prompt text 1798 | model: The base model name (without thinking suffix) 1799 | thinking_budget: The token budget for thinking 1800 | 1801 | Returns: 1802 | Response string from the model 1803 | """ 1804 | try: 1805 | # Ensure max_tokens is greater than thinking_budget 1806 | # Documentation requires this: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#max-tokens-and-context-window-size 1807 | max_tokens = thinking_budget + 1000 # Adding 1000 tokens for the response 1808 | 1809 | logger.info(f"Sending prompt to Anthropic model {model} with thinking budget {thinking_budget}") 1810 | message = client.messages.create( 1811 | model=model, 1812 | max_tokens=max_tokens, 1813 | thinking={ 1814 | "type": "enabled", 1815 | "budget_tokens": thinking_budget, 1816 | }, 1817 | messages=[{"role": "user", "content": text}] 1818 | ) 1819 | 1820 | # Extract the response from the message content 1821 | # Filter out thinking blocks and only get text blocks 1822 | text_blocks = [block for block in message.content if block.type == "text"] 1823 | 1824 | if not text_blocks: 1825 | raise ValueError("No text content found in response") 1826 | 1827 | return text_blocks[0].text 1828 | except Exception as e: 1829 | logger.error(f"Error sending prompt with thinking to Anthropic: {e}") 1830 | raise ValueError(f"Failed to get response from Anthropic with thinking: {str(e)}") 1831 | 1832 | 1833 | def prompt(text: str, model: str) -> str: 1834 | """ 1835 | Send a prompt to Anthropic Claude and get a response. 1836 | 1837 | Automatically handles thinking suffixes in the model name (e.g., claude-3-7-sonnet-20250219:4k) 1838 | 1839 | Args: 1840 | text: The prompt text 1841 | model: The model name, optionally with thinking suffix 1842 | 1843 | Returns: 1844 | Response string from the model 1845 | """ 1846 | # Parse the model name to check for thinking suffixes 1847 | base_model, thinking_budget = parse_thinking_suffix(model) 1848 | 1849 | # If thinking budget is specified, use prompt_with_thinking 1850 | if thinking_budget > 0: 1851 | return prompt_with_thinking(text, base_model, thinking_budget) 1852 | 1853 | # Otherwise, use regular prompt 1854 | try: 1855 | logger.info(f"Sending prompt to Anthropic model: {base_model}") 1856 | message = client.messages.create( 1857 | model=base_model, max_tokens=4096, messages=[{"role": "user", "content": text}] 1858 | ) 1859 | 1860 | # Extract the response from the message content 1861 | # Get only text blocks 1862 | text_blocks = [block for block in message.content if block.type == "text"] 1863 | 1864 | if not text_blocks: 1865 | raise ValueError("No text content found in response") 1866 | 1867 | return text_blocks[0].text 1868 | except Exception as e: 1869 | logger.error(f"Error sending prompt to Anthropic: {e}") 1870 | raise ValueError(f"Failed to get response from Anthropic: {str(e)}") 1871 | 1872 | 1873 | def list_models() -> List[str]: 1874 | """ 1875 | List available Anthropic models. 
1876 | 1877 | Returns: 1878 | List of model names 1879 | """ 1880 | try: 1881 | logger.info("Listing Anthropic models") 1882 | response = client.models.list() 1883 | 1884 | models = [model.id for model in response.data] 1885 | return models 1886 | except Exception as e: 1887 | logger.error(f"Error listing Anthropic models: {e}") 1888 | # Return some known models if API fails 1889 | logger.info("Returning hardcoded list of known Anthropic models") 1890 | return [ 1891 | "claude-3-7-sonnet", 1892 | "claude-3-5-sonnet", 1893 | "claude-3-5-sonnet-20240620", 1894 | "claude-3-opus-20240229", 1895 | "claude-3-sonnet-20240229", 1896 | "claude-3-haiku-20240307", 1897 | "claude-3-5-haiku", 1898 | ] 1899 | 1900 | 1901 | 1902 | """ 1903 | OpenAI provider implementation. 1904 | """ 1905 | 1906 | import os 1907 | from openai import OpenAI 1908 | from typing import List 1909 | import logging 1910 | from dotenv import load_dotenv 1911 | 1912 | # Load environment variables 1913 | load_dotenv() 1914 | 1915 | # Configure logging 1916 | logger = logging.getLogger(__name__) 1917 | 1918 | # Initialize OpenAI client 1919 | client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) 1920 | 1921 | 1922 | def prompt(text: str, model: str) -> str: 1923 | """ 1924 | Send a prompt to OpenAI and get a response. 1925 | 1926 | Args: 1927 | text: The prompt text 1928 | model: The model name 1929 | 1930 | Returns: 1931 | Response string from the model 1932 | """ 1933 | try: 1934 | logger.info(f"Sending prompt to OpenAI model: {model}") 1935 | response = client.chat.completions.create( 1936 | model=model, 1937 | messages=[{"role": "user", "content": text}], 1938 | ) 1939 | 1940 | return response.choices[0].message.content 1941 | except Exception as e: 1942 | logger.error(f"Error sending prompt to OpenAI: {e}") 1943 | raise ValueError(f"Failed to get response from OpenAI: {str(e)}") 1944 | 1945 | 1946 | def list_models() -> List[str]: 1947 | """ 1948 | List available OpenAI models. 1949 | 1950 | Returns: 1951 | List of model names 1952 | """ 1953 | try: 1954 | logger.info("Listing OpenAI models") 1955 | response = client.models.list() 1956 | 1957 | # Return all models without filtering 1958 | models = [model.id for model in response.data] 1959 | 1960 | return models 1961 | except Exception as e: 1962 | logger.error(f"Error listing OpenAI models: {e}") 1963 | raise ValueError(f"Failed to list OpenAI models: {str(e)}") 1964 | 1965 | 1966 | 1967 | """ 1968 | Model router for dispatching requests to the appropriate provider. 1969 | """ 1970 | 1971 | import logging 1972 | from typing import List, Dict, Any, Optional 1973 | import importlib 1974 | from .utils import split_provider_and_model 1975 | from .data_types import ModelProviders 1976 | 1977 | logger = logging.getLogger(__name__) 1978 | 1979 | 1980 | class ModelRouter: 1981 | """ 1982 | Routes requests to the appropriate provider based on the model string. 1983 | """ 1984 | 1985 | @staticmethod 1986 | def validate_and_correct_model(provider_name: str, model_name: str) -> str: 1987 | """ 1988 | Validate a model name against available models for a provider, and correct it if needed. 
1989 | 1990 | Args: 1991 | provider_name: Provider name (full name) 1992 | model_name: Model name to validate and potentially correct 1993 | 1994 | Returns: 1995 | Validated and potentially corrected model name 1996 | """ 1997 | # Early return for our thinking token model to bypass validation 1998 | if "claude-3-7-sonnet-20250219" in model_name: 1999 | return model_name 2000 | 2001 | try: 2002 | # Import the provider module 2003 | provider_module_name = f"just_prompt.atoms.llm_providers.{provider_name}" 2004 | provider_module = importlib.import_module(provider_module_name) 2005 | 2006 | # Get available models 2007 | available_models = provider_module.list_models() 2008 | 2009 | # Check if model is in available models 2010 | if model_name in available_models: 2011 | return model_name 2012 | 2013 | # Model needs correction - use the default correction model 2014 | import os 2015 | 2016 | correction_model = os.environ.get( 2017 | "CORRECTION_MODEL", "anthropic:claude-3-7-sonnet-20250219" 2018 | ) 2019 | 2020 | # Use magic model correction 2021 | corrected_model = ModelRouter.magic_model_correction( 2022 | provider_name, model_name, correction_model 2023 | ) 2024 | 2025 | if corrected_model != model_name: 2026 | logger.info( 2027 | f"Corrected model name from '{model_name}' to '{corrected_model}' for provider '{provider_name}'" 2028 | ) 2029 | return corrected_model 2030 | 2031 | return model_name 2032 | except Exception as e: 2033 | logger.warning( 2034 | f"Error validating model '{model_name}' for provider '{provider_name}': {e}" 2035 | ) 2036 | return model_name 2037 | 2038 | @staticmethod 2039 | def route_prompt(model_string: str, text: str) -> str: 2040 | """ 2041 | Route a prompt to the appropriate provider. 2042 | 2043 | Args: 2044 | model_string: String in format "provider:model" 2045 | text: The prompt text 2046 | 2047 | Returns: 2048 | Response from the model 2049 | """ 2050 | provider_prefix, model = split_provider_and_model(model_string) 2051 | provider = ModelProviders.from_name(provider_prefix) 2052 | 2053 | if not provider: 2054 | raise ValueError(f"Unknown provider prefix: {provider_prefix}") 2055 | 2056 | # Validate and potentially correct the model name 2057 | validated_model = ModelRouter.validate_and_correct_model( 2058 | provider.full_name, model 2059 | ) 2060 | 2061 | # Import the appropriate provider module 2062 | try: 2063 | module_name = f"just_prompt.atoms.llm_providers.{provider.full_name}" 2064 | provider_module = importlib.import_module(module_name) 2065 | 2066 | # Call the prompt function 2067 | return provider_module.prompt(text, validated_model) 2068 | except ImportError as e: 2069 | logger.error(f"Failed to import provider module: {e}") 2070 | raise ValueError(f"Provider not available: {provider.full_name}") 2071 | except Exception as e: 2072 | logger.error(f"Error routing prompt to {provider.full_name}: {e}") 2073 | raise 2074 | 2075 | @staticmethod 2076 | def route_list_models(provider_name: str) -> List[str]: 2077 | """ 2078 | Route a list_models request to the appropriate provider. 
2079 | 2080 | Args: 2081 | provider_name: Provider name (full or short) 2082 | 2083 | Returns: 2084 | List of model names 2085 | """ 2086 | provider = ModelProviders.from_name(provider_name) 2087 | 2088 | if not provider: 2089 | raise ValueError(f"Unknown provider: {provider_name}") 2090 | 2091 | # Import the appropriate provider module 2092 | try: 2093 | module_name = f"just_prompt.atoms.llm_providers.{provider.full_name}" 2094 | provider_module = importlib.import_module(module_name) 2095 | 2096 | # Call the list_models function 2097 | return provider_module.list_models() 2098 | except ImportError as e: 2099 | logger.error(f"Failed to import provider module: {e}") 2100 | raise ValueError(f"Provider not available: {provider.full_name}") 2101 | except Exception as e: 2102 | logger.error(f"Error listing models for {provider.full_name}: {e}") 2103 | raise 2104 | 2105 | @staticmethod 2106 | def magic_model_correction(provider: str, model: str, correction_model: str) -> str: 2107 | """ 2108 | Correct a model name using a correction AI model if needed. 2109 | 2110 | Args: 2111 | provider: Provider name 2112 | model: Original model name 2113 | correction_model: Model to use for the correction llm prompt, e.g. "o:gpt-4o-mini" 2114 | 2115 | Returns: 2116 | Corrected model name 2117 | """ 2118 | provider_module_name = f"just_prompt.atoms.llm_providers.{provider}" 2119 | 2120 | try: 2121 | provider_module = importlib.import_module(provider_module_name) 2122 | available_models = provider_module.list_models() 2123 | 2124 | # If model is already in available models, no correction needed 2125 | if model in available_models: 2126 | logger.info(f"Using {provider} and {model}") 2127 | return model 2128 | 2129 | # Model needs correction - use correction model to correct it 2130 | correction_provider, correction_model_name = split_provider_and_model( 2131 | correction_model 2132 | ) 2133 | correction_provider_enum = ModelProviders.from_name(correction_provider) 2134 | 2135 | if not correction_provider_enum: 2136 | logger.warning( 2137 | f"Invalid correction model provider: {correction_provider}, skipping correction" 2138 | ) 2139 | return model 2140 | 2141 | correction_module_name = ( 2142 | f"just_prompt.atoms.llm_providers.{correction_provider_enum.full_name}" 2143 | ) 2144 | correction_module = importlib.import_module(correction_module_name) 2145 | 2146 | # Build prompt for the correction model 2147 | prompt = f""" 2148 | Given a user-provided model name "{model}" for the provider "{provider}", and the list of actual available models below, 2149 | return the closest matching model name from the available models list. 2150 | Only return the exact model name, nothing else. 
2151 | 2152 | Available models: {', '.join(available_models)} 2153 | """ 2154 | # Get correction from correction model 2155 | corrected_model = correction_module.prompt( 2156 | prompt, correction_model_name 2157 | ).strip() 2158 | 2159 | # Verify the corrected model exists in the available models 2160 | if corrected_model in available_models: 2161 | logger.info(f"correction_model: {correction_model}") 2162 | logger.info(f"models_prefixed_by_provider: {provider}:{model}") 2163 | logger.info(f"corrected_model: {corrected_model}") 2164 | return corrected_model 2165 | else: 2166 | logger.warning( 2167 | f"Corrected model {corrected_model} not found in available models" 2168 | ) 2169 | return model 2170 | 2171 | except Exception as e: 2172 | logger.error(f"Error in model correction: {e}") 2173 | return model 2174 | 2175 | 2176 | 2177 | """ 2178 | Utility functions for just-prompt. 2179 | """ 2180 | 2181 | from typing import Tuple, List 2182 | import os 2183 | from dotenv import load_dotenv 2184 | import logging 2185 | 2186 | # Set up logging 2187 | logging.basicConfig( 2188 | level=logging.INFO, 2189 | format='%(asctime)s [%(levelname)s] %(message)s', 2190 | datefmt='%Y-%m-%d %H:%M:%S' 2191 | ) 2192 | 2193 | # Load environment variables 2194 | load_dotenv() 2195 | 2196 | # Default model constants 2197 | DEFAULT_MODEL = "anthropic:claude-3-7-sonnet-20250219" 2198 | 2199 | 2200 | def split_provider_and_model(model_string: str) -> Tuple[str, str]: 2201 | """ 2202 | Split a model string into provider and model name. 2203 | 2204 | Note: This only splits the first colon in the model string and leaves the rest of the string 2205 | as the model name. Models will have additional colons in the string and we want to ignore them 2206 | and leave them for the model name. 2207 | 2208 | Args: 2209 | model_string: String in format "provider:model" 2210 | 2211 | Returns: 2212 | Tuple containing (provider, model) 2213 | """ 2214 | parts = model_string.split(":", 1) 2215 | if len(parts) != 2: 2216 | raise ValueError(f"Invalid model string format: {model_string}. Expected format: 'provider:model'") 2217 | 2218 | provider, model = parts 2219 | return provider, model 2220 | 2221 | 2222 | def get_provider_from_prefix(prefix: str) -> str: 2223 | """ 2224 | Get the full provider name from a prefix. 2225 | 2226 | Args: 2227 | prefix: Provider prefix (short or full name) 2228 | 2229 | Returns: 2230 | Full provider name 2231 | """ 2232 | from .data_types import ModelProviders 2233 | 2234 | provider = ModelProviders.from_name(prefix) 2235 | if provider is None: 2236 | raise ValueError(f"Unknown provider prefix: {prefix}") 2237 | 2238 | return provider.full_name 2239 | 2240 | 2241 | def get_models_prefixed_by_provider(provider_prefix: str, model_name: str) -> str: 2242 | """ 2243 | Format a model string with provider prefix. 2244 | 2245 | Args: 2246 | provider_prefix: The provider prefix (short or full name) 2247 | model_name: The model name 2248 | 2249 | Returns: 2250 | Formatted string in "provider:model" format 2251 | """ 2252 | provider = get_provider_from_prefix(provider_prefix) 2253 | return f"{provider}:{model_name}" 2254 | 2255 | 2256 | def get_api_key(provider: str) -> str: 2257 | """ 2258 | Get the API key for a provider from environment variables. 
2259 | 2260 | Args: 2261 | provider: Provider name (full name) 2262 | 2263 | Returns: 2264 | API key as string 2265 | """ 2266 | key_mapping = { 2267 | "openai": "OPENAI_API_KEY", 2268 | "anthropic": "ANTHROPIC_API_KEY", 2269 | "gemini": "GEMINI_API_KEY", 2270 | "groq": "GROQ_API_KEY", 2271 | "deepseek": "DEEPSEEK_API_KEY" 2272 | } 2273 | 2274 | env_var = key_mapping.get(provider) 2275 | if not env_var: 2276 | return None 2277 | 2278 | return os.environ.get(env_var) 2279 | 2280 | 2281 | 2282 | """ 2283 | Tests for Anthropic provider. 2284 | """ 2285 | 2286 | import pytest 2287 | import os 2288 | from dotenv import load_dotenv 2289 | from just_prompt.atoms.llm_providers import anthropic 2290 | 2291 | # Load environment variables 2292 | load_dotenv() 2293 | 2294 | # Skip tests if API key not available 2295 | if not os.environ.get("ANTHROPIC_API_KEY"): 2296 | pytest.skip("Anthropic API key not available", allow_module_level=True) 2297 | 2298 | 2299 | def test_list_models(): 2300 | """Test listing Anthropic models.""" 2301 | models = anthropic.list_models() 2302 | 2303 | # Assertions 2304 | assert isinstance(models, list) 2305 | assert len(models) > 0 2306 | assert all(isinstance(model, str) for model in models) 2307 | 2308 | # Check for at least one expected model 2309 | claude_models = [model for model in models if "claude" in model.lower()] 2310 | assert len(claude_models) > 0, "No Claude models found" 2311 | 2312 | 2313 | def test_prompt(): 2314 | """Test sending prompt to Anthropic.""" 2315 | # Use the correct model name from the available models 2316 | response = anthropic.prompt("What is the capital of France?", "claude-3-5-haiku-20241022") 2317 | 2318 | # Assertions 2319 | assert isinstance(response, str) 2320 | assert len(response) > 0 2321 | assert "paris" in response.lower() or "Paris" in response 2322 | 2323 | 2324 | def test_parse_thinking_suffix(): 2325 | """Test parsing thinking suffix from model names.""" 2326 | # Test cases with no suffix 2327 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet") == ("claude-3-7-sonnet", 0) 2328 | assert anthropic.parse_thinking_suffix("claude-3-5-haiku-20241022") == ("claude-3-5-haiku-20241022", 0) 2329 | 2330 | # Test cases with supported model and k suffixes 2331 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:1k") == ("claude-3-7-sonnet-20250219", 1024) 2332 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:4k") == ("claude-3-7-sonnet-20250219", 4096) 2333 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:15k") == ("claude-3-7-sonnet-20250219", 15360) # 15*1024=15360 < 16000 2334 | 2335 | # Test cases with supported model and numeric suffixes 2336 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:1024") == ("claude-3-7-sonnet-20250219", 1024) 2337 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:4096") == ("claude-3-7-sonnet-20250219", 4096) 2338 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:8000") == ("claude-3-7-sonnet-20250219", 8000) 2339 | 2340 | # Test cases with non-supported model 2341 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet:1k") == ("claude-3-7-sonnet", 0) 2342 | assert anthropic.parse_thinking_suffix("claude-3-5-haiku:4k") == ("claude-3-5-haiku", 0) 2343 | 2344 | # Test cases with out-of-range values (should adjust to valid range) 2345 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:500") == ("claude-3-7-sonnet-20250219", 1024) # Below min 1024, should use 1024 
2346 | assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:20000") == ("claude-3-7-sonnet-20250219", 16000) # Above max 16000, should use 16000 2347 | 2348 | 2349 | def test_prompt_with_thinking(): 2350 | """Test sending prompt with thinking enabled.""" 2351 | # Test with 1k thinking tokens on the supported model 2352 | response = anthropic.prompt("What is the capital of Spain?", "claude-3-7-sonnet-20250219:1k") 2353 | 2354 | # Assertions 2355 | assert isinstance(response, str) 2356 | assert len(response) > 0 2357 | assert "madrid" in response.lower() or "Madrid" in response 2358 | 2359 | # Test with 2k thinking tokens on the supported model 2360 | response = anthropic.prompt("What is the capital of Germany?", "claude-3-7-sonnet-20250219:2k") 2361 | 2362 | # Assertions 2363 | assert isinstance(response, str) 2364 | assert len(response) > 0 2365 | assert "berlin" in response.lower() or "Berlin" in response 2366 | 2367 | # Test with out-of-range but auto-corrected thinking tokens 2368 | response = anthropic.prompt("What is the capital of Italy?", "claude-3-7-sonnet-20250219:500") 2369 | 2370 | # Assertions (should still work with a corrected budget of 1024) 2371 | assert isinstance(response, str) 2372 | assert len(response) > 0 2373 | assert "rome" in response.lower() or "Rome" in response 2374 | 2375 | 2376 | 2377 | """ 2378 | Tests for model router. 2379 | """ 2380 | 2381 | import pytest 2382 | import os 2383 | from unittest.mock import patch, MagicMock 2384 | import importlib 2385 | from just_prompt.atoms.shared.model_router import ModelRouter 2386 | from just_prompt.atoms.shared.data_types import ModelProviders 2387 | 2388 | 2389 | @patch('importlib.import_module') 2390 | def test_route_prompt(mock_import_module): 2391 | """Test routing prompts to the appropriate provider.""" 2392 | # Set up mock 2393 | mock_module = MagicMock() 2394 | mock_module.prompt.return_value = "Paris is the capital of France." 2395 | mock_import_module.return_value = mock_module 2396 | 2397 | # Test with full provider name 2398 | response = ModelRouter.route_prompt("openai:gpt-4o-mini", "What is the capital of France?") 2399 | assert response == "Paris is the capital of France." 2400 | mock_import_module.assert_called_with("just_prompt.atoms.llm_providers.openai") 2401 | mock_module.prompt.assert_called_with("What is the capital of France?", "gpt-4o-mini") 2402 | 2403 | # Test with short provider name 2404 | response = ModelRouter.route_prompt("o:gpt-4o-mini", "What is the capital of France?") 2405 | assert response == "Paris is the capital of France." 
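    # Note: the short prefix "o" is resolved to the full provider name by
    # ModelProviders.from_name, so with importlib.import_module patched this call
    # hits the same mocked provider module as the full-name case above.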
2406 | 2407 | # Test invalid provider 2408 | with pytest.raises(ValueError): 2409 | ModelRouter.route_prompt("unknown:model", "What is the capital of France?") 2410 | 2411 | 2412 | @patch('importlib.import_module') 2413 | def test_route_list_models(mock_import_module): 2414 | """Test routing list_models requests to the appropriate provider.""" 2415 | # Set up mock 2416 | mock_module = MagicMock() 2417 | mock_module.list_models.return_value = ["model1", "model2"] 2418 | mock_import_module.return_value = mock_module 2419 | 2420 | # Test with full provider name 2421 | models = ModelRouter.route_list_models("openai") 2422 | assert models == ["model1", "model2"] 2423 | mock_import_module.assert_called_with("just_prompt.atoms.llm_providers.openai") 2424 | mock_module.list_models.assert_called_once() 2425 | 2426 | # Test with short provider name 2427 | models = ModelRouter.route_list_models("o") 2428 | assert models == ["model1", "model2"] 2429 | 2430 | # Test invalid provider 2431 | with pytest.raises(ValueError): 2432 | ModelRouter.route_list_models("unknown") 2433 | 2434 | 2435 | def test_validate_and_correct_model_shorthand(): 2436 | """Test validation and correction of shorthand model names like a:sonnet.3.7.""" 2437 | try: 2438 | # Test with shorthand notation a:sonnet.3.7 2439 | # This should be corrected to claude-3-7-sonnet-20250219 2440 | # First, use the split_provider_and_model to get the provider and model 2441 | from just_prompt.atoms.shared.utils import split_provider_and_model 2442 | provider_prefix, model = split_provider_and_model("a:sonnet.3.7") 2443 | 2444 | # Get the provider enum 2445 | provider = ModelProviders.from_name(provider_prefix) 2446 | 2447 | # Call validate_and_correct_model 2448 | result = ModelRouter.magic_model_correction(provider.full_name, model, "anthropic:claude-3-7-sonnet-20250219") 2449 | 2450 | # The magic_model_correction method should correct sonnet.3.7 to claude-3-7-sonnet-20250219 2451 | assert "claude-3-7" in result, f"Expected sonnet.3.7 to be corrected to a claude-3-7 model, got {result}" 2452 | print(f"Shorthand model 'sonnet.3.7' was corrected to '{result}'") 2453 | except Exception as e: 2454 | pytest.fail(f"Test failed with error: {e}") 2455 | 2456 | 2457 | 2458 | """ 2459 | Tests for list_models functionality for all providers. 
2460 | """ 2461 | 2462 | import pytest 2463 | import os 2464 | from dotenv import load_dotenv 2465 | from just_prompt.molecules.list_models import list_models 2466 | 2467 | # Load environment variables 2468 | load_dotenv() 2469 | 2470 | def test_list_models_openai(): 2471 | """Test listing OpenAI models with real API call.""" 2472 | # Skip if API key isn't available 2473 | if not os.environ.get("OPENAI_API_KEY"): 2474 | pytest.skip("OpenAI API key not available") 2475 | 2476 | # Test with full provider name 2477 | models = list_models("openai") 2478 | 2479 | # Assertions 2480 | assert isinstance(models, list) 2481 | assert len(models) > 0 2482 | 2483 | # Check for specific model patterns that should exist 2484 | assert any("gpt" in model.lower() for model in models) 2485 | 2486 | def test_list_models_anthropic(): 2487 | """Test listing Anthropic models with real API call.""" 2488 | # Skip if API key isn't available 2489 | if not os.environ.get("ANTHROPIC_API_KEY"): 2490 | pytest.skip("Anthropic API key not available") 2491 | 2492 | # Test with full provider name 2493 | models = list_models("anthropic") 2494 | 2495 | # Assertions 2496 | assert isinstance(models, list) 2497 | assert len(models) > 0 2498 | 2499 | # Check for specific model patterns that should exist 2500 | assert any("claude" in model.lower() for model in models) 2501 | 2502 | def test_list_models_gemini(): 2503 | """Test listing Gemini models with real API call.""" 2504 | # Skip if API key isn't available 2505 | if not os.environ.get("GEMINI_API_KEY"): 2506 | pytest.skip("Gemini API key not available") 2507 | 2508 | # Test with full provider name 2509 | models = list_models("gemini") 2510 | 2511 | # Assertions 2512 | assert isinstance(models, list) 2513 | assert len(models) > 0 2514 | 2515 | # Check for specific model patterns that should exist 2516 | assert any("gemini" in model.lower() for model in models) 2517 | 2518 | def test_list_models_groq(): 2519 | """Test listing Groq models with real API call.""" 2520 | # Skip if API key isn't available 2521 | if not os.environ.get("GROQ_API_KEY"): 2522 | pytest.skip("Groq API key not available") 2523 | 2524 | # Test with full provider name 2525 | models = list_models("groq") 2526 | 2527 | # Assertions 2528 | assert isinstance(models, list) 2529 | assert len(models) > 0 2530 | 2531 | # Check for specific model patterns (llama or mixtral are common in Groq) 2532 | assert any(("llama" in model.lower() or "mixtral" in model.lower()) for model in models) 2533 | 2534 | def test_list_models_deepseek(): 2535 | """Test listing DeepSeek models with real API call.""" 2536 | # Skip if API key isn't available 2537 | if not os.environ.get("DEEPSEEK_API_KEY"): 2538 | pytest.skip("DeepSeek API key not available") 2539 | 2540 | # Test with full provider name 2541 | models = list_models("deepseek") 2542 | 2543 | # Assertions 2544 | assert isinstance(models, list) 2545 | assert len(models) > 0 2546 | 2547 | # Check for basic list return (no specific pattern needed) 2548 | assert all(isinstance(model, str) for model in models) 2549 | 2550 | def test_list_models_ollama(): 2551 | """Test listing Ollama models with real API call.""" 2552 | # Test with full provider name 2553 | models = list_models("ollama") 2554 | 2555 | # Assertions 2556 | assert isinstance(models, list) 2557 | assert len(models) > 0 2558 | 2559 | # Check for basic list return (model entries could be anything) 2560 | assert all(isinstance(model, str) for model in models) 2561 | 2562 | def test_list_models_with_short_names(): 2563 | 
"""Test listing models using short provider names.""" 2564 | # Test each provider with short name (only if API key available) 2565 | 2566 | # OpenAI - short name "o" 2567 | if os.environ.get("OPENAI_API_KEY"): 2568 | models = list_models("o") 2569 | assert isinstance(models, list) 2570 | assert len(models) > 0 2571 | assert any("gpt" in model.lower() for model in models) 2572 | 2573 | # Anthropic - short name "a" 2574 | if os.environ.get("ANTHROPIC_API_KEY"): 2575 | models = list_models("a") 2576 | assert isinstance(models, list) 2577 | assert len(models) > 0 2578 | assert any("claude" in model.lower() for model in models) 2579 | 2580 | # Gemini - short name "g" 2581 | if os.environ.get("GEMINI_API_KEY"): 2582 | models = list_models("g") 2583 | assert isinstance(models, list) 2584 | assert len(models) > 0 2585 | assert any("gemini" in model.lower() for model in models) 2586 | 2587 | # Groq - short name "q" 2588 | if os.environ.get("GROQ_API_KEY"): 2589 | models = list_models("q") 2590 | assert isinstance(models, list) 2591 | assert len(models) > 0 2592 | 2593 | # DeepSeek - short name "d" 2594 | if os.environ.get("DEEPSEEK_API_KEY"): 2595 | models = list_models("d") 2596 | assert isinstance(models, list) 2597 | assert len(models) > 0 2598 | 2599 | # Ollama - short name "l" 2600 | models = list_models("l") 2601 | assert isinstance(models, list) 2602 | assert len(models) > 0 2603 | 2604 | def test_list_models_invalid_provider(): 2605 | """Test with invalid provider name.""" 2606 | # Test invalid provider 2607 | with pytest.raises(ValueError): 2608 | list_models("unknown_provider") 2609 | 2610 | 2611 | 2612 | """ 2613 | Tests for prompt_from_file_to_file functionality. 2614 | """ 2615 | 2616 | import pytest 2617 | import os 2618 | import tempfile 2619 | import shutil 2620 | from dotenv import load_dotenv 2621 | from just_prompt.molecules.prompt_from_file_to_file import prompt_from_file_to_file 2622 | 2623 | # Load environment variables 2624 | load_dotenv() 2625 | 2626 | 2627 | def test_directory_creation_and_file_writing(): 2628 | """Test that the output directory is created and files are written with real API responses.""" 2629 | # Create temporary input file with a simple question 2630 | with tempfile.NamedTemporaryFile(mode='w+', delete=False) as temp_file: 2631 | temp_file.write("What is the capital of France?") 2632 | input_path = temp_file.name 2633 | 2634 | # Create a deep non-existent directory path 2635 | temp_dir = os.path.join(tempfile.gettempdir(), "just_prompt_test_dir", "output") 2636 | 2637 | try: 2638 | # Make real API call 2639 | file_paths = prompt_from_file_to_file( 2640 | input_path, 2641 | ["o:gpt-4o-mini"], 2642 | temp_dir 2643 | ) 2644 | 2645 | # Assertions 2646 | assert isinstance(file_paths, list) 2647 | assert len(file_paths) == 1 2648 | 2649 | # Check that the file exists 2650 | assert os.path.exists(file_paths[0]) 2651 | 2652 | # Check that the file has a .md extension 2653 | assert file_paths[0].endswith('.md') 2654 | 2655 | # Check file content contains the expected response 2656 | with open(file_paths[0], 'r') as f: 2657 | content = f.read() 2658 | assert "paris" in content.lower() or "Paris" in content 2659 | finally: 2660 | # Clean up 2661 | os.unlink(input_path) 2662 | # Remove the created directory and all its contents 2663 | if os.path.exists(os.path.dirname(temp_dir)): 2664 | shutil.rmtree(os.path.dirname(temp_dir)) 2665 | 2666 | 2667 | 2668 | { 2669 | "mcpServers": { 2670 | "just-prompt": { 2671 | "type": "stdio", 2672 | "command": "uv", 2673 | "args": [ 
2674 | "--directory", 2675 | ".", 2676 | "run", 2677 | "just-prompt", 2678 | "--default-models", 2679 | "anthropic:claude-3-7-sonnet-20250219,openai:o3-mini,gemini:gemini-2.5-pro-exp-03-25" 2680 | ], 2681 | "env": {} 2682 | } 2683 | } 2684 | } 2685 | 2686 | 2687 | 2688 | [project] 2689 | name = "just-prompt" 2690 | version = "0.1.0" 2691 | description = "A lightweight MCP server for various LLM providers" 2692 | readme = "README.md" 2693 | requires-python = ">=3.10" 2694 | dependencies = [ 2695 | "anthropic>=0.49.0", 2696 | "google-genai>=1.7.0", 2697 | "groq>=0.20.0", 2698 | "ollama>=0.4.7", 2699 | "openai>=1.68.0", 2700 | "python-dotenv>=1.0.1", 2701 | "pydantic>=2.0.0", 2702 | "mcp>=0.1.5", 2703 | ] 2704 | 2705 | [project.scripts] 2706 | just-prompt = "just_prompt.__main__:main" 2707 | 2708 | [project.optional-dependencies] 2709 | test = [ 2710 | "pytest>=7.3.1", 2711 | "pytest-asyncio>=0.20.3", 2712 | ] 2713 | 2714 | [build-system] 2715 | requires = ["setuptools>=61.0"] 2716 | build-backend = "setuptools.build_meta" 2717 | 2718 | 2719 | 2720 | """ 2721 | Ollama provider implementation. 2722 | """ 2723 | 2724 | import os 2725 | from typing import List 2726 | import logging 2727 | import ollama 2728 | from dotenv import load_dotenv 2729 | 2730 | # Load environment variables 2731 | load_dotenv() 2732 | 2733 | # Configure logging 2734 | logger = logging.getLogger(__name__) 2735 | 2736 | 2737 | def prompt(text: str, model: str) -> str: 2738 | """ 2739 | Send a prompt to Ollama and get a response. 2740 | 2741 | Args: 2742 | text: The prompt text 2743 | model: The model name 2744 | 2745 | Returns: 2746 | Response string from the model 2747 | """ 2748 | try: 2749 | logger.info(f"Sending prompt to Ollama model: {model}") 2750 | 2751 | # Create chat completion 2752 | response = ollama.chat( 2753 | model=model, 2754 | messages=[ 2755 | { 2756 | "role": "user", 2757 | "content": text, 2758 | }, 2759 | ], 2760 | ) 2761 | 2762 | # Extract response content 2763 | return response.message.content 2764 | except Exception as e: 2765 | logger.error(f"Error sending prompt to Ollama: {e}") 2766 | raise ValueError(f"Failed to get response from Ollama: {str(e)}") 2767 | 2768 | 2769 | def list_models() -> List[str]: 2770 | """ 2771 | List available Ollama models. 2772 | 2773 | Returns: 2774 | List of model names 2775 | """ 2776 | logger.info("Listing Ollama models") 2777 | response = ollama.list() 2778 | 2779 | # Extract model names from the models attribute 2780 | models = [model.model for model in response.models] 2781 | 2782 | return models 2783 | 2784 | 2785 | 2786 | """ 2787 | Prompt from file to file functionality for just-prompt. 2788 | """ 2789 | 2790 | from typing import List 2791 | import logging 2792 | import os 2793 | from pathlib import Path 2794 | from .prompt_from_file import prompt_from_file 2795 | from ..atoms.shared.utils import DEFAULT_MODEL 2796 | 2797 | logger = logging.getLogger(__name__) 2798 | 2799 | 2800 | def prompt_from_file_to_file(file: str, models_prefixed_by_provider: List[str] = None, output_dir: str = ".") -> List[str]: 2801 | """ 2802 | Read text from a file, send it as prompt to multiple models, and save responses to files. 
2803 | 2804 | Args: 2805 | file: Path to the text file 2806 | models_prefixed_by_provider: List of model strings in format "provider:model" 2807 | If None, uses the DEFAULT_MODELS environment variable 2808 | output_dir: Directory to save response files 2809 | 2810 | Returns: 2811 | List of paths to the output files 2812 | """ 2813 | # Validate output directory 2814 | output_path = Path(output_dir) 2815 | if not output_path.exists(): 2816 | output_path.mkdir(parents=True, exist_ok=True) 2817 | 2818 | if not output_path.is_dir(): 2819 | raise ValueError(f"Not a directory: {output_dir}") 2820 | 2821 | # Get the base name of the input file 2822 | input_file_name = Path(file).stem 2823 | 2824 | # Get responses 2825 | responses = prompt_from_file(file, models_prefixed_by_provider) 2826 | 2827 | # Save responses to files 2828 | output_files = [] 2829 | 2830 | # Get the models that were actually used 2831 | models_used = models_prefixed_by_provider 2832 | if not models_used: 2833 | default_models = os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL) 2834 | models_used = [model.strip() for model in default_models.split(",")] 2835 | 2836 | for i, (model_string, response) in enumerate(zip(models_used, responses)): 2837 | # Sanitize model string for filename (replace colons with underscores) 2838 | safe_model_name = model_string.replace(":", "_") 2839 | 2840 | # Create output filename with .md extension 2841 | output_file = output_path / f"{input_file_name}_{safe_model_name}.md" 2842 | 2843 | # Write response to file as markdown 2844 | try: 2845 | with open(output_file, 'w', encoding='utf-8') as f: 2846 | f.write(response) 2847 | output_files.append(str(output_file)) 2848 | except Exception as e: 2849 | logger.error(f"Error writing response to {output_file}: {e}") 2850 | output_files.append(f"Error: {str(e)}") 2851 | 2852 | return output_files 2853 | 2854 | 2855 | 2856 | """ 2857 | Prompt functionality for just-prompt. 2858 | """ 2859 | 2860 | from typing import List 2861 | import logging 2862 | import concurrent.futures 2863 | import os 2864 | from ..atoms.shared.validator import validate_models_prefixed_by_provider 2865 | from ..atoms.shared.utils import split_provider_and_model, DEFAULT_MODEL 2866 | from ..atoms.shared.model_router import ModelRouter 2867 | 2868 | logger = logging.getLogger(__name__) 2869 | 2870 | 2871 | def _process_model_prompt(model_string: str, text: str) -> str: 2872 | """ 2873 | Process a single model prompt. 2874 | 2875 | Args: 2876 | model_string: String in format "provider:model" 2877 | text: The prompt text 2878 | 2879 | Returns: 2880 | Response from the model 2881 | """ 2882 | try: 2883 | return ModelRouter.route_prompt(model_string, text) 2884 | except Exception as e: 2885 | logger.error(f"Error processing prompt for {model_string}: {e}") 2886 | return f"Error ({model_string}): {str(e)}" 2887 | 2888 | 2889 | def _correct_model_name(provider: str, model: str, correction_model: str) -> str: 2890 | """ 2891 | Correct a model name using the correction model. 
2892 | 2893 | Args: 2894 | provider: Provider name 2895 | model: Model name 2896 | correction_model: Model to use for correction 2897 | 2898 | Returns: 2899 | Corrected model name 2900 | """ 2901 | try: 2902 | return ModelRouter.magic_model_correction(provider, model, correction_model) 2903 | except Exception as e: 2904 | logger.error(f"Error correcting model name {provider}:{model}: {e}") 2905 | return model 2906 | 2907 | 2908 | def prompt(text: str, models_prefixed_by_provider: List[str] = None) -> List[str]: 2909 | """ 2910 | Send a prompt to multiple models using parallel processing. 2911 | 2912 | Args: 2913 | text: The prompt text 2914 | models_prefixed_by_provider: List of model strings in format "provider:model" 2915 | If None, uses the DEFAULT_MODELS environment variable 2916 | 2917 | Returns: 2918 | List of responses from the models 2919 | """ 2920 | # Use default models if no models provided 2921 | if not models_prefixed_by_provider: 2922 | default_models = os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL) 2923 | models_prefixed_by_provider = [model.strip() for model in default_models.split(",")] 2924 | # Validate model strings 2925 | validate_models_prefixed_by_provider(models_prefixed_by_provider) 2926 | 2927 | # Prepare corrected model strings 2928 | corrected_models = [] 2929 | for model_string in models_prefixed_by_provider: 2930 | provider, model = split_provider_and_model(model_string) 2931 | 2932 | # Get correction model from environment 2933 | correction_model = os.environ.get("CORRECTION_MODEL", DEFAULT_MODEL) 2934 | 2935 | # Check if model needs correction 2936 | corrected_model = _correct_model_name(provider, model, correction_model) 2937 | 2938 | # Use corrected model 2939 | if corrected_model != model: 2940 | model_string = f"{provider}:{corrected_model}" 2941 | 2942 | corrected_models.append(model_string) 2943 | 2944 | # Process each model in parallel using ThreadPoolExecutor 2945 | responses = [] 2946 | with concurrent.futures.ThreadPoolExecutor() as executor: 2947 | # Submit all tasks 2948 | future_to_model = { 2949 | executor.submit(_process_model_prompt, model_string, text): model_string 2950 | for model_string in corrected_models 2951 | } 2952 | 2953 | # Collect results in order 2954 | for model_string in corrected_models: 2955 | for future, future_model in future_to_model.items(): 2956 | if future_model == model_string: 2957 | responses.append(future.result()) 2958 | break 2959 | 2960 | return responses 2961 | 2962 | 2963 | 2964 | def list_openai_models(): 2965 | from openai import OpenAI 2966 | 2967 | client = OpenAI() 2968 | 2969 | print(client.models.list()) 2970 | 2971 | 2972 | def list_groq_models(): 2973 | import os 2974 | from groq import Groq 2975 | 2976 | client = Groq( 2977 | api_key=os.environ.get("GROQ_API_KEY"), 2978 | ) 2979 | 2980 | chat_completion = client.models.list() 2981 | 2982 | print(chat_completion) 2983 | 2984 | 2985 | def list_anthropic_models(): 2986 | import anthropic 2987 | import os 2988 | from dotenv import load_dotenv 2989 | 2990 | load_dotenv() 2991 | 2992 | client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY")) 2993 | models = client.models.list() 2994 | print("Available Anthropic models:") 2995 | for model in models.data: 2996 | print(f"- {model.id}") 2997 | 2998 | 2999 | def list_gemini_models(): 3000 | import os 3001 | from google import genai 3002 | from dotenv import load_dotenv 3003 | 3004 | load_dotenv() 3005 | 3006 | client = genai.Client(api_key=os.environ.get("GEMINI_API_KEY")) 3007 | 3008 | 
print("List of models that support generateContent:\n") 3009 | for m in client.models.list(): 3010 | for action in m.supported_actions: 3011 | if action == "generateContent": 3012 | print(m.name) 3013 | 3014 | print("List of models that support embedContent:\n") 3015 | for m in client.models.list(): 3016 | for action in m.supported_actions: 3017 | if action == "embedContent": 3018 | print(m.name) 3019 | 3020 | 3021 | def list_deepseek_models(): 3022 | from openai import OpenAI 3023 | 3024 | # for backward compatibility, you can still use `https://api.deepseek.com/v1` as `base_url`. 3025 | client = OpenAI( 3026 | api_key="sk-ds-3f422175ff114212a42d7107c3efd1e4", # fake 3027 | base_url="https://api.deepseek.com", 3028 | ) 3029 | print(client.models.list()) 3030 | 3031 | 3032 | def list_ollama_models(): 3033 | import ollama 3034 | 3035 | print(ollama.list()) 3036 | 3037 | 3038 | # Uncomment to run the functions 3039 | # list_openai_models() 3040 | # list_groq_models() 3041 | # list_anthropic_models() 3042 | # list_gemini_models() 3043 | # list_deepseek_models() 3044 | # list_ollama_models() 3045 | 3046 | 3047 | 3048 | """ 3049 | Main entry point for just-prompt. 3050 | """ 3051 | 3052 | import argparse 3053 | import asyncio 3054 | import logging 3055 | import os 3056 | import sys 3057 | from dotenv import load_dotenv 3058 | from .server import serve 3059 | from .atoms.shared.utils import DEFAULT_MODEL 3060 | from .atoms.shared.validator import print_provider_availability 3061 | 3062 | # Load environment variables 3063 | load_dotenv() 3064 | 3065 | # Configure logging 3066 | logging.basicConfig( 3067 | level=logging.INFO, 3068 | format='%(asctime)s [%(levelname)s] %(message)s', 3069 | datefmt='%Y-%m-%d %H:%M:%S' 3070 | ) 3071 | logger = logging.getLogger(__name__) 3072 | 3073 | 3074 | def main(): 3075 | """ 3076 | Main entry point for just-prompt. 
3077 | """ 3078 | parser = argparse.ArgumentParser(description="just-prompt - A lightweight MCP server for various LLM providers") 3079 | parser.add_argument( 3080 | "--default-models", 3081 | default=DEFAULT_MODEL, 3082 | help="Comma-separated list of default models to use for prompts and model name correction, in format provider:model" 3083 | ) 3084 | parser.add_argument( 3085 | "--log-level", 3086 | choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], 3087 | default="INFO", 3088 | help="Logging level" 3089 | ) 3090 | parser.add_argument( 3091 | "--show-providers", 3092 | action="store_true", 3093 | help="Show available providers and exit" 3094 | ) 3095 | 3096 | args = parser.parse_args() 3097 | 3098 | # Set logging level 3099 | logging.getLogger().setLevel(getattr(logging, args.log_level)) 3100 | 3101 | # Show provider availability 3102 | print_provider_availability() 3103 | 3104 | # If --show-providers flag is provided, exit after showing provider info 3105 | if args.show_providers: 3106 | sys.exit(0) 3107 | 3108 | try: 3109 | # Start server (asyncio) 3110 | asyncio.run(serve(args.default_models)) 3111 | except Exception as e: 3112 | logger.error(f"Error starting server: {e}") 3113 | sys.exit(1) 3114 | 3115 | 3116 | if __name__ == "__main__": 3117 | main() 3118 | 3119 | 3120 | 3121 | # Python-generated files 3122 | __pycache__/ 3123 | *.py[oc] 3124 | build/ 3125 | dist/ 3126 | wheels/ 3127 | *.egg-info 3128 | 3129 | # Virtual environments 3130 | .venv 3131 | 3132 | .env 3133 | 3134 | # Byte-compiled / optimized / DLL files 3135 | __pycache__/ 3136 | *.py[cod] 3137 | *$py.class 3138 | 3139 | # Distribution / packaging 3140 | dist/ 3141 | build/ 3142 | *.egg-info/ 3143 | *.egg 3144 | 3145 | # Unit test / coverage reports 3146 | htmlcov/ 3147 | .tox/ 3148 | .nox/ 3149 | .coverage 3150 | .coverage.* 3151 | .cache 3152 | nosetests.xml 3153 | coverage.xml 3154 | *.cover 3155 | .hypothesis/ 3156 | .pytest_cache/ 3157 | 3158 | # Jupyter Notebook 3159 | .ipynb_checkpoints 3160 | 3161 | # Environments 3162 | .env 3163 | .venv 3164 | env/ 3165 | venv/ 3166 | ENV/ 3167 | env.bak/ 3168 | venv.bak/ 3169 | 3170 | # mypy 3171 | .mypy_cache/ 3172 | .dmypy.json 3173 | dmypy.json 3174 | 3175 | # IDE specific files 3176 | .idea/ 3177 | .vscode/ 3178 | *.swp 3179 | *.swo 3180 | .DS_Store 3181 | 3182 | 3183 | prompts/responses 3184 | .aider* 3185 | 3186 | focus_output/ 3187 | 3188 | 3189 | 3190 | # Ultra Diff Review 3191 | > Execute each task in the order given to conduct a thorough code review. 3192 | 3193 | ## Task 1: Create diff.txt 3194 | 3195 | Create a new file called diff.md. 3196 | 3197 | At the top of the file, add the following markdown: 3198 | 3199 | ```md 3200 | # Code Review 3201 | - Review the diff, report on issues, bugs, and improvements. 3202 | - End with a concise markdown table of any issues found, their solutions, and a risk assessment for each issue if applicable. 3203 | - Use emojis to convey the severity of each issue. 3204 | 3205 | ## Diff 3206 | 3207 | ``` 3208 | 3209 | ## Task 2: git diff and append 3210 | 3211 | Then run git diff and append the output to the file. 3212 | 3213 | ## Task 3: just-prompt multi-llm tool call 3214 | 3215 | Then use that file as the input to this just-prompt tool call. 
3216 | 3217 | prompt_from_file_to_file( 3218 | file = "diff.md", 3219 | models_prefixed_by_provider = ["openai:o3-mini", "anthropic:claude-3-7-sonnet-20250219:4k", "gemini:gemini-2.0-flash-thinking-exp"], 3220 | output_dir = "ultra_diff_review/" 3221 | ) 3222 | 3223 | ## Task 4: Read the output files and synthesize 3224 | 3225 | Then read the output files and think hard to synthesize the results into a new single file called `ultra_diff_review/fusion_ultra_diff_review.md` following the original instructions plus any additional instructions or callouts you think are needed to create the best possible review. 3226 | 3227 | ## Task 5: Present the results 3228 | 3229 | Then let me know which issues you think are worth resolving and we'll proceed from there. 3230 | 3231 | 3232 | 3233 | """ 3234 | MCP server for just-prompt. 3235 | """ 3236 | 3237 | import asyncio 3238 | import logging 3239 | import os 3240 | from typing import List, Dict, Any, Optional 3241 | from mcp.server import Server 3242 | from mcp.server.stdio import stdio_server 3243 | from mcp.types import Tool, TextContent 3244 | from pydantic import BaseModel, Field 3245 | from .atoms.shared.utils import DEFAULT_MODEL 3246 | from .atoms.shared.validator import print_provider_availability 3247 | from .molecules.prompt import prompt 3248 | from .molecules.prompt_from_file import prompt_from_file 3249 | from .molecules.prompt_from_file_to_file import prompt_from_file_to_file 3250 | from .molecules.list_providers import list_providers as list_providers_func 3251 | from .molecules.list_models import list_models as list_models_func 3252 | from dotenv import load_dotenv 3253 | 3254 | # Load environment variables 3255 | load_dotenv() 3256 | 3257 | # Configure logging 3258 | logging.basicConfig( 3259 | level=logging.INFO, 3260 | format='%(asctime)s [%(levelname)s] %(message)s', 3261 | datefmt='%Y-%m-%d %H:%M:%S' 3262 | ) 3263 | logger = logging.getLogger(__name__) 3264 | 3265 | # Tool names enum 3266 | class JustPromptTools: 3267 | PROMPT = "prompt" 3268 | PROMPT_FROM_FILE = "prompt_from_file" 3269 | PROMPT_FROM_FILE_TO_FILE = "prompt_from_file_to_file" 3270 | LIST_PROVIDERS = "list_providers" 3271 | LIST_MODELS = "list_models" 3272 | 3273 | # Schema classes for MCP tools 3274 | class PromptSchema(BaseModel): 3275 | text: str = Field(..., description="The prompt text") 3276 | models_prefixed_by_provider: Optional[List[str]] = Field( 3277 | None, 3278 | description="List of models with provider prefixes (e.g., 'openai:gpt-4o' or 'o:gpt-4o'). If not provided, uses default models." 3279 | ) 3280 | 3281 | class PromptFromFileSchema(BaseModel): 3282 | file: str = Field(..., description="Path to the file containing the prompt") 3283 | models_prefixed_by_provider: Optional[List[str]] = Field( 3284 | None, 3285 | description="List of models with provider prefixes (e.g., 'openai:gpt-4o' or 'o:gpt-4o'). If not provided, uses default models." 3286 | ) 3287 | 3288 | class PromptFromFileToFileSchema(BaseModel): 3289 | file: str = Field(..., description="Path to the file containing the prompt") 3290 | models_prefixed_by_provider: Optional[List[str]] = Field( 3291 | None, 3292 | description="List of models with provider prefixes (e.g., 'openai:gpt-4o' or 'o:gpt-4o'). If not provided, uses default models."
3293 | ) 3294 | output_dir: str = Field( 3295 | default=".", 3296 | description="Directory to save the response files to (default: current directory)" 3297 | ) 3298 | 3299 | class ListProvidersSchema(BaseModel): 3300 | pass 3301 | 3302 | class ListModelsSchema(BaseModel): 3303 | provider: str = Field(..., description="Provider to list models for (e.g., 'openai' or 'o')") 3304 | 3305 | 3306 | async def serve(default_models: str = DEFAULT_MODEL) -> None: 3307 | """ 3308 | Start the MCP server. 3309 | 3310 | Args: 3311 | default_models: Comma-separated list of default models to use for prompts and corrections 3312 | """ 3313 | # Set global default models for prompts and corrections 3314 | os.environ["DEFAULT_MODELS"] = default_models 3315 | 3316 | # Parse default models into a list 3317 | default_models_list = [model.strip() for model in default_models.split(",")] 3318 | 3319 | # Set the first model as the correction model 3320 | correction_model = default_models_list[0] if default_models_list else "o:gpt-4o-mini" 3321 | os.environ["CORRECTION_MODEL"] = correction_model 3322 | 3323 | logger.info(f"Starting server with default models: {default_models}") 3324 | logger.info(f"Using correction model: {correction_model}") 3325 | 3326 | # Check and log provider availability 3327 | print_provider_availability() 3328 | 3329 | # Create the MCP server 3330 | server = Server("just-prompt") 3331 | 3332 | @server.list_tools() 3333 | async def list_tools() -> List[Tool]: 3334 | """Register all available tools with the MCP server.""" 3335 | return [ 3336 | Tool( 3337 | name=JustPromptTools.PROMPT, 3338 | description="Send a prompt to multiple LLM models", 3339 | inputSchema=PromptSchema.schema(), 3340 | ), 3341 | Tool( 3342 | name=JustPromptTools.PROMPT_FROM_FILE, 3343 | description="Send a prompt from a file to multiple LLM models", 3344 | inputSchema=PromptFromFileSchema.schema(), 3345 | ), 3346 | Tool( 3347 | name=JustPromptTools.PROMPT_FROM_FILE_TO_FILE, 3348 | description="Send a prompt from a file to multiple LLM models and save responses to files", 3349 | inputSchema=PromptFromFileToFileSchema.schema(), 3350 | ), 3351 | Tool( 3352 | name=JustPromptTools.LIST_PROVIDERS, 3353 | description="List all available LLM providers", 3354 | inputSchema=ListProvidersSchema.schema(), 3355 | ), 3356 | Tool( 3357 | name=JustPromptTools.LIST_MODELS, 3358 | description="List all available models for a specific LLM provider", 3359 | inputSchema=ListModelsSchema.schema(), 3360 | ), 3361 | ] 3362 | 3363 | @server.call_tool() 3364 | async def call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]: 3365 | """Handle tool calls from the MCP client.""" 3366 | logger.info(f"Tool call: {name}, arguments: {arguments}") 3367 | 3368 | try: 3369 | if name == JustPromptTools.PROMPT: 3370 | models_to_use = arguments.get("models_prefixed_by_provider") 3371 | responses = prompt(arguments["text"], models_to_use) 3372 | 3373 | # Get the model names that were actually used 3374 | models_used = models_to_use if models_to_use else [model.strip() for model in os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL).split(",")] 3375 | 3376 | return [TextContent( 3377 | type="text", 3378 | text="\n".join([f"Model: {models_used[i]}\nResponse: {resp}" 3379 | for i, resp in enumerate(responses)]) 3380 | )] 3381 | 3382 | elif name == JustPromptTools.PROMPT_FROM_FILE: 3383 | models_to_use = arguments.get("models_prefixed_by_provider") 3384 | responses = prompt_from_file(arguments["file"], models_to_use) 3385 | 3386 | # Get the model names 
that were actually used 3387 | models_used = models_to_use if models_to_use else [model.strip() for model in os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL).split(",")] 3388 | 3389 | return [TextContent( 3390 | type="text", 3391 | text="\n".join([f"Model: {models_used[i]}\nResponse: {resp}" 3392 | for i, resp in enumerate(responses)]) 3393 | )] 3394 | 3395 | elif name == JustPromptTools.PROMPT_FROM_FILE_TO_FILE: 3396 | output_dir = arguments.get("output_dir", ".") 3397 | models_to_use = arguments.get("models_prefixed_by_provider") 3398 | file_paths = prompt_from_file_to_file( 3399 | arguments["file"], 3400 | models_to_use, 3401 | output_dir 3402 | ) 3403 | return [TextContent( 3404 | type="text", 3405 | text=f"Responses saved to:\n" + "\n".join(file_paths) 3406 | )] 3407 | 3408 | elif name == JustPromptTools.LIST_PROVIDERS: 3409 | providers = list_providers_func() 3410 | provider_text = "\nAvailable Providers:\n" 3411 | for provider in providers: 3412 | provider_text += f"- {provider['name']}: full_name='{provider['full_name']}', short_name='{provider['short_name']}'\n" 3413 | return [TextContent( 3414 | type="text", 3415 | text=provider_text 3416 | )] 3417 | 3418 | elif name == JustPromptTools.LIST_MODELS: 3419 | models = list_models_func(arguments["provider"]) 3420 | return [TextContent( 3421 | type="text", 3422 | text=f"Models for provider '{arguments['provider']}':\n" + 3423 | "\n".join([f"- {model}" for model in models]) 3424 | )] 3425 | 3426 | else: 3427 | return [TextContent( 3428 | type="text", 3429 | text=f"Unknown tool: {name}" 3430 | )] 3431 | 3432 | except Exception as e: 3433 | logger.error(f"Error handling tool call: {name}, error: {e}") 3434 | return [TextContent( 3435 | type="text", 3436 | text=f"Error: {str(e)}" 3437 | )] 3438 | 3439 | # Initialize and run the server 3440 | try: 3441 | options = server.create_initialization_options() 3442 | async with stdio_server() as (read_stream, write_stream): 3443 | await server.run(read_stream, write_stream, options, raise_exceptions=True) 3444 | except Exception as e: 3445 | logger.error(f"Error running server: {e}") 3446 | raise 3447 | 3448 | 3449 | 3450 | # Just Prompt - A lightweight MCP server for LLM providers 3451 | 3452 | `just-prompt` is a Model Context Protocol (MCP) server that provides a unified interface to various Large Language Model (LLM) providers including OpenAI, Anthropic, Google Gemini, Groq, DeepSeek, and Ollama. 3453 | 3454 | ## Tools 3455 | 3456 | The following MCP tools are available in the server: 3457 | 3458 | - **`prompt`**: Send a prompt to multiple LLM models 3459 | - Parameters: 3460 | - `text`: The prompt text 3461 | - `models_prefixed_by_provider` (optional): List of models with provider prefixes. If not provided, uses default models. 3462 | 3463 | - **`prompt_from_file`**: Send a prompt from a file to multiple LLM models 3464 | - Parameters: 3465 | - `file`: Path to the file containing the prompt 3466 | - `models_prefixed_by_provider` (optional): List of models with provider prefixes. If not provided, uses default models. 3467 | 3468 | - **`prompt_from_file_to_file`**: Send a prompt from a file to multiple LLM models and save responses as markdown files 3469 | - Parameters: 3470 | - `file`: Path to the file containing the prompt 3471 | - `models_prefixed_by_provider` (optional): List of models with provider prefixes. If not provided, uses default models.
3472 | - `output_dir` (default: "."): Directory to save the response markdown files to 3473 | 3474 | - **`list_providers`**: List all available LLM providers 3475 | - Parameters: None 3476 | 3477 | - **`list_models`**: List all available models for a specific LLM provider 3478 | - Parameters: 3479 | - `provider`: Provider to list models for (e.g., 'openai' or 'o') 3480 | 3481 | ## Provider Prefixes 3482 | > every model must be prefixed with the provider name 3483 | > 3484 | > use the short name for faster referencing 3485 | 3486 | - `o` or `openai`: OpenAI 3487 | - `o:gpt-4o-mini` 3488 | - `openai:gpt-4o-mini` 3489 | - `a` or `anthropic`: Anthropic 3490 | - `a:claude-3-5-haiku` 3491 | - `anthropic:claude-3-5-haiku` 3492 | - `g` or `gemini`: Google Gemini 3493 | - `g:gemini-2.5-pro-exp-03-25` 3494 | - `gemini:gemini-2.5-pro-exp-03-25` 3495 | - `q` or `groq`: Groq 3496 | - `q:llama-3.1-70b-versatile` 3497 | - `groq:llama-3.1-70b-versatile` 3498 | - `d` or `deepseek`: DeepSeek 3499 | - `d:deepseek-coder` 3500 | - `deepseek:deepseek-coder` 3501 | - `l` or `ollama`: Ollama 3502 | - `l:llama3.1` 3503 | - `ollama:llama3.1` 3504 | 3505 | ## Features 3506 | 3507 | - Unified API for multiple LLM providers 3508 | - Support for text prompts from strings or files 3509 | - Run multiple models in parallel 3510 | - Automatic model name correction using the first model in the `--default-models` list 3511 | - Ability to save responses to files 3512 | - Easy listing of available providers and models 3513 | 3514 | ## Installation 3515 | 3516 | ```bash 3517 | # Clone the repository 3518 | git clone https://github.com/yourusername/just-prompt.git 3519 | cd just-prompt 3520 | 3521 | # Install dependencies with uv 3522 | uv sync 3523 | ``` 3524 | 3525 | ### Environment Variables 3526 | 3527 | Create a `.env` file with your API keys (you can copy the `.env.sample` file): 3528 | 3529 | ```bash 3530 | cp .env.sample .env 3531 | ``` 3532 | 3533 | Then edit the `.env` file to add your API keys (or export them in your shell): 3534 | 3535 | ``` 3536 | OPENAI_API_KEY=your_openai_api_key_here 3537 | ANTHROPIC_API_KEY=your_anthropic_api_key_here 3538 | GEMINI_API_KEY=your_gemini_api_key_here 3539 | GROQ_API_KEY=your_groq_api_key_here 3540 | DEEPSEEK_API_KEY=your_deepseek_api_key_here 3541 | OLLAMA_HOST=http://localhost:11434 3542 | ``` 3543 | 3544 | ## Claude Code Installation 3545 | 3546 | Default model set to `anthropic:claude-3-7-sonnet-20250219`. 3547 | 3548 | If you use Claude Code directly from this repository, you can see in the `.mcp.json` file that we set the default models to: 3549 | 3550 | ``` 3551 | { 3552 | "mcpServers": { 3553 | "just-prompt": { 3554 | "type": "stdio", 3555 | "command": "uv", 3556 | "args": [ 3557 | "--directory", 3558 | ".", 3559 | "run", 3560 | "just-prompt", 3561 | "--default-models", 3562 | "anthropic:claude-3-7-sonnet-20250219,openai:o3-mini,gemini:gemini-2.5-pro-exp-03-25" 3563 | ], 3564 | "env": {} 3565 | } 3566 | } 3567 | } 3568 | ``` 3569 | 3570 | The `--default-models` parameter sets the models to use when none are explicitly provided to the API endpoints. The first model in the list is also used for model name correction when needed. This can be a list of models separated by commas. 3571 | 3572 | When starting the server, it will automatically check which API keys are available in your environment and inform you which providers you can use. If a key is missing, the provider will be listed as unavailable, but the server will still start and can be used with the providers that are available.
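As a quick illustration, here is an editor's sketch (the helper name `configure_defaults` is hypothetical, but the logic mirrors the `serve()` function shown earlier in this file) of how the `--default-models` value becomes the default model list and the correction model:

```python
import os

def configure_defaults(default_models: str) -> list[str]:
    # Expose the raw comma-separated value to the rest of the process.
    os.environ["DEFAULT_MODELS"] = default_models
    models = [m.strip() for m in default_models.split(",")]
    # The first entry doubles as the model-name correction model.
    os.environ["CORRECTION_MODEL"] = models[0] if models else "o:gpt-4o-mini"
    return models

models = configure_defaults("anthropic:claude-3-7-sonnet-20250219,openai:o3-mini")
print(models)                          # ['anthropic:claude-3-7-sonnet-20250219', 'openai:o3-mini']
print(os.environ["CORRECTION_MODEL"])  # anthropic:claude-3-7-sonnet-20250219
```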
3573 | 3574 | ### Using `mcp add-json` 3575 | 3576 | Copy this command and paste it into Claude Code, BUT don't run it until you have copied the JSON below to your clipboard: 3577 | 3578 | ``` 3579 | claude mcp add just-prompt "$(pbpaste)" 3580 | ``` 3581 | 3582 | JSON to copy: 3583 | 3584 | ``` 3585 | { 3586 | "command": "uv", 3587 | "args": ["--directory", ".", "run", "just-prompt"] 3588 | } 3589 | ``` 3590 | 3591 | With a custom default model set to `openai:gpt-4o`: 3592 | 3593 | ``` 3594 | { 3595 | "command": "uv", 3596 | "args": ["--directory", ".", "run", "just-prompt", "--default-models", "openai:gpt-4o"] 3597 | } 3598 | ``` 3599 | 3600 | With multiple default models: 3601 | 3602 | ``` 3603 | { 3604 | "command": "uv", 3605 | "args": ["--directory", ".", "run", "just-prompt", "--default-models", "anthropic:claude-3-7-sonnet-20250219,openai:gpt-4o,gemini:gemini-2.5-pro-exp-03-25"] 3606 | } 3607 | ``` 3608 | 3609 | ### Using `mcp add` with project scope 3610 | 3611 | ```bash 3612 | # With default model (anthropic:claude-3-7-sonnet-20250219) 3613 | claude mcp add just-prompt -s project \ 3614 | -- \ 3615 | uv --directory . \ 3616 | run just-prompt 3617 | 3618 | # With custom default model 3619 | claude mcp add just-prompt -s project \ 3620 | -- \ 3621 | uv --directory . \ 3622 | run just-prompt --default-models "openai:gpt-4o" 3623 | 3624 | # With multiple default models 3625 | claude mcp add just-prompt -s user \ 3626 | -- \ 3627 | uv --directory . \ 3628 | run just-prompt --default-models "anthropic:claude-3-7-sonnet-20250219:4k,openai:o3-mini,gemini:gemini-2.0-flash,openai:gpt-4.5-preview,gemini:gemini-2.5-pro-exp-03-25" 3629 | ``` 3630 | 3631 | 3632 | ## `mcp remove` 3633 | 3634 | claude mcp remove just-prompt 3635 | 3636 | ## Running Tests 3637 | 3638 | ```bash 3639 | uv run pytest 3640 | ``` 3641 | 3642 | ## Codebase Structure 3643 | 3644 | ``` 3645 | . 3646 | ├── ai_docs/ # Documentation for AI model details 3647 | │ ├── llm_providers_details.xml 3648 | │ └── pocket-pick-mcp-server-example.xml 3649 | ├── list_models.py # Script to list available LLM models 3650 | ├── pyproject.toml # Python project configuration 3651 | ├── specs/ # Project specifications 3652 | │ └── init-just-prompt.md 3653 | ├── src/ # Source code directory 3654 | │ └── just_prompt/ 3655 | │ ├── __init__.py 3656 | │ ├── __main__.py 3657 | │ ├── atoms/ # Core components 3658 | │ │ ├── llm_providers/ # Individual provider implementations 3659 | │ │ │ ├── anthropic.py 3660 | │ │ │ ├── deepseek.py 3661 | │ │ │ ├── gemini.py 3662 | │ │ │ ├── groq.py 3663 | │ │ │ ├── ollama.py 3664 | │ │ │ └── openai.py 3665 | │ │ └── shared/ # Shared utilities and data types 3666 | │ │ ├── data_types.py 3667 | │ │ ├── model_router.py 3668 | │ │ ├── utils.py 3669 | │ │ └── validator.py 3670 | │ ├── molecules/ # Higher-level functionality 3671 | │ │ ├── list_models.py 3672 | │ │ ├── list_providers.py 3673 | │ │ ├── prompt.py 3674 | │ │ ├── prompt_from_file.py 3675 | │ │ └── prompt_from_file_to_file.py 3676 | │ ├── server.py # MCP server implementation 3677 | │ └── tests/ # Test directory 3678 | │ ├── atoms/ # Tests for atoms 3679 | │ │ ├── llm_providers/ 3680 | │ │ └── shared/ 3681 | │ └── molecules/ # Tests for molecules 3682 | ``` 3683 | 3684 | ## Context Priming 3685 | READ README.md, then run git ls-files, and 'eza --git-ignore --tree' to understand the context of the project. 3686 | 3687 | ## Thinking Tokens with Claude 3688 | 3689 | The Anthropic Claude model `claude-3-7-sonnet-20250219` supports extended thinking capabilities using thinking tokens.
This lets Claude work through a more thorough reasoning process before answering. 3690 | 3691 | You can enable thinking tokens by adding a suffix to the model name in this format: 3692 | - `anthropic:claude-3-7-sonnet-20250219:1k` - Use 1024 thinking tokens 3693 | - `anthropic:claude-3-7-sonnet-20250219:4k` - Use 4096 thinking tokens 3694 | - `anthropic:claude-3-7-sonnet-20250219:8000` - Use 8000 thinking tokens 3695 | 3696 | Example usage: 3697 | ```bash 3698 | # Using 4k thinking tokens with Claude 3699 | uv run just-prompt prompt "Analyze the advantages and disadvantages of quantum computing vs classical computing" \ 3700 | --models-prefixed-by-provider anthropic:claude-3-7-sonnet-20250219:4k 3701 | ``` 3702 | 3703 | Notes: 3704 | - Thinking tokens are only supported for the `claude-3-7-sonnet-20250219` model 3705 | - Valid thinking token budgets range from 1024 to 16000 3706 | - Values outside this range will be automatically adjusted to be within range 3707 | - You can specify the budget with k notation (1k, 4k, etc.) or with exact numbers (1024, 4096, etc.); a short illustrative parsing sketch appears at the end of this file 3708 | 3709 | ## Resources 3710 | - https://docs.anthropic.com/en/api/models-list?q=list+models 3711 | - https://github.com/googleapis/python-genai 3712 | - https://platform.openai.com/docs/api-reference/models/list 3713 | - https://api-docs.deepseek.com/api/list-models 3714 | - https://github.com/ollama/ollama-python 3715 | - https://github.com/openai/openai-python 3716 | 3717 | 3718 | 3719 | --------------------------------------------------------------------------------
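As a closing illustration of the thinking-token suffixes described in the README above, here is an editor's sketch (the helper `parse_thinking_suffix` is hypothetical, not the project's actual parser) of splitting a model string into its base name and a clamped thinking budget:

```python
def parse_thinking_suffix(model: str) -> tuple[str, int | None]:
    # Interpret a trailing ":1k" / ":4k" / ":8000" style suffix as a thinking
    # token budget, clamped to the documented 1024-16000 range.
    parts = model.rsplit(":", 1)
    if len(parts) != 2:
        return model, None                      # no suffix present
    base, suffix = parts[0], parts[1].lower()
    digits = suffix[:-1] if suffix.endswith("k") else suffix
    if not digits.isdigit():
        return model, None                      # trailing part is not a budget
    budget = int(digits) * (1024 if suffix.endswith("k") else 1)
    budget = max(1024, min(budget, 16000))      # clamp to the supported range
    return base, budget

print(parse_thinking_suffix("claude-3-7-sonnet-20250219:4k"))    # ('claude-3-7-sonnet-20250219', 4096)
print(parse_thinking_suffix("claude-3-7-sonnet-20250219:8000"))  # ('claude-3-7-sonnet-20250219', 8000)
print(parse_thinking_suffix("claude-3-7-sonnet-20250219"))       # ('claude-3-7-sonnet-20250219', None)
```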