├── .gitignore ├── .gitattributes ├── requirements.txt ├── aphra ├── workflows │ ├── short_article │ │ ├── docs │ │ │ ├── workflow-diagram.png │ │ │ ├── workflow-diagram.md │ │ │ └── README.md │ │ ├── prompts │ │ │ ├── step3_system.txt │ │ │ ├── step5_system.txt │ │ │ ├── step1_system.txt │ │ │ ├── step4_system.txt │ │ │ ├── step2_system.txt │ │ │ ├── step1_user.txt │ │ │ ├── step3_user.txt │ │ │ ├── step5_user.txt │ │ │ ├── step2_user.txt │ │ │ └── step4_user.txt │ │ ├── examples │ │ │ ├── __init__.py │ │ │ ├── simple_demo.py │ │ │ └── gradio_demo.py │ │ ├── tests │ │ │ ├── __init__.py │ │ │ ├── test_prompts.py │ │ │ └── test_parsers.py │ │ ├── config │ │ │ └── default.toml │ │ ├── aux │ │ │ ├── __init__.py │ │ │ └── parsers.py │ │ ├── __init__.py │ │ └── short_article_workflow.py │ └── __init__.py ├── __init__.py ├── core │ ├── __init__.py │ ├── context.py │ ├── prompts.py │ ├── workflow.py │ ├── config.py │ ├── parsers.py │ ├── llm_client.py │ └── registry.py └── translate.py ├── docs ├── index.html └── aphra.html ├── CITATION.cff ├── config.example.toml ├── Dockerfile ├── .github ├── ISSUE_TEMPLATE │ ├── feature_request.md │ └── bug_report.md └── workflows │ └── pylint.yml ├── pyproject.toml ├── .pylintrc ├── tests ├── test_translate.py ├── test_llm_client.py ├── test_context.py ├── test_core_prompts.py ├── test_core_parsers.py └── test_registry.py ├── LICENSE ├── setup.py ├── entrypoint.sh ├── aphra_runner.py ├── gradio-demo.py ├── CONTRIBUTING.md └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | .DS_Store 3 | model_calls.log 4 | config.toml 5 | __pycache__ 6 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | # Auto detect text files and perform LF normalization 2 | * text=auto 3 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | openai>=1.40.2 2 | toml>=0.10.2 3 | requests>=2.32.3 4 | setuptools>=72.1.0 5 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/docs/workflow-diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavidLMS/aphra/HEAD/aphra/workflows/short_article/docs/workflow-diagram.png -------------------------------------------------------------------------------- /docs/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step3_system.txt: -------------------------------------------------------------------------------- 1 | You are tasked with translating a {source_language} text into {target_language} while maintaining the author's original writing style. -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step5_system.txt: -------------------------------------------------------------------------------- 1 | You are tasked with creating an improved {target_language} translation of a {source_language} text. You will be provided with several pieces of information to help you create this translation. 
-------------------------------------------------------------------------------- /aphra/workflows/short_article/examples/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Examples for the Short Article workflow. 3 | 4 | This package contains demonstration scripts and examples showing how to use 5 | the Short Article workflow for translation tasks. 6 | """ 7 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/tests/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Tests for the Short Article workflow. 3 | 4 | This package contains tests specific to the short article translation workflow, 5 | including tests for parsers, prompts, and workflow execution. 6 | """ 7 | -------------------------------------------------------------------------------- /aphra/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Aphra package initializer. 3 | This module exposes the main API components and modules. 4 | """ 5 | from .translate import translate 6 | from . import workflows 7 | from . import core 8 | 9 | __all__ = ['translate', 'workflows', 'core'] 10 | -------------------------------------------------------------------------------- /CITATION.cff: -------------------------------------------------------------------------------- 1 | cff-version: 1.2.0 2 | message: "If you use this software, please cite it as below." 3 | authors: 4 | - family-names: "Romero Santos" 5 | given-names: "David" 6 | title: "🌐💬 Aphra: An open-source translation agent" 7 | date-released: 2024-08-17 8 | url: "https://github.com/DavidLMS/aphra" 9 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step1_system.txt: -------------------------------------------------------------------------------- 1 | You are an expert translator tasked with analyzing and understanding a {source_language} text. Your goal is to identify specific terms, legal {source_language} terms, phrases, and cultural references that may need explanation or adaptation for an {target_language}-speaking audience. -------------------------------------------------------------------------------- /aphra/workflows/short_article/config/default.toml: -------------------------------------------------------------------------------- 1 | # Default configuration for the Short Article workflow 2 | # These values can be overridden in config.toml under the [short_article] section 3 | 4 | # LLM models used by this workflow 5 | writer = "anthropic/claude-sonnet-4" 6 | searcher = "perplexity/sonar" 7 | critiquer = "anthropic/claude-sonnet-4" -------------------------------------------------------------------------------- /aphra/workflows/short_article/aux/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Auxiliary utilities for the Short Article workflow. 3 | 4 | This module contains parsers and utilities specific to the short article 5 | translation workflow. 
6 | """ 7 | 8 | from .parsers import parse_analysis, parse_translation 9 | 10 | __all__ = ['parse_analysis', 'parse_translation'] 11 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Short Article workflow for translating articles and blog posts. 3 | 4 | This module contains the short article translation workflow which implements 5 | a 5-step process for contextual translation of articles and blog posts. 6 | """ 7 | 8 | from .short_article_workflow import ShortArticleWorkflow 9 | 10 | __all__ = ['ShortArticleWorkflow'] 11 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step4_system.txt: -------------------------------------------------------------------------------- 1 | You are a professional translator and language expert specializing in {source_language} to {target_language} translations. Your task is to critically analyze a basic {target_language} translation of a {source_language} text and provide suggestions for improvement. You will also identify terms that would benefit from translator's notes for better understanding. -------------------------------------------------------------------------------- /config.example.toml: -------------------------------------------------------------------------------- 1 | [openrouter] 2 | # Replace with your actual OpenRouter API key 3 | api_key = "your_openrouter_api_key_here" 4 | 5 | [short_article] 6 | # Configuration for the Short Article workflow 7 | # These values override the defaults in workflows/short_article/config/default.toml 8 | writer = "anthropic/claude-sonnet-4" 9 | searcher = "perplexity/sonar" 10 | critiquer = "anthropic/claude-sonnet-4" -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step2_system.txt: -------------------------------------------------------------------------------- 1 | You are tasked with searching for information about a specific term, taking into account provided keywords, to assist a {source_language} to {target_language} translator in making the most reliable and contextualized translation possible. Your goal is to provide comprehensive context and relevant information that will help the translator understand the nuances and cultural implications of the term. -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | # Use an official Python runtime as a parent image 2 | FROM python:3.8-slim 3 | 4 | # Set the working directory in the container 5 | WORKDIR /workspace 6 | 7 | # Copy the necessary files to install dependencies 8 | COPY requirements.txt ./ 9 | COPY setup.py ./ 10 | COPY config.toml ./config.toml 11 | 12 | # Install the dependencies 13 | RUN pip install --no-cache-dir -r requirements.txt 14 | 15 | # Copy the rest of the application 16 | COPY . . 
17 | 18 | # Ensure the entry script has execution permissions 19 | RUN chmod +x /workspace/entrypoint.sh 20 | 21 | # Set the entrypoint to the script and pass any arguments 22 | ENTRYPOINT ["/workspace/entrypoint.sh"] 23 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: "[REQUEST]" 5 | labels: enhancement 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 21 | -------------------------------------------------------------------------------- /.github/workflows/pylint.yml: -------------------------------------------------------------------------------- 1 | name: Pylint 2 | 3 | on: [push] 4 | 5 | jobs: 6 | build: 7 | runs-on: ubuntu-latest 8 | strategy: 9 | matrix: 10 | python-version: ["3.9", "3.10", "3.11"] 11 | steps: 12 | - uses: actions/checkout@v4 13 | - name: Set up Python ${{ matrix.python-version }} 14 | uses: actions/setup-python@v3 15 | with: 16 | python-version: ${{ matrix.python-version }} 17 | - name: Install dependencies 18 | run: | 19 | python -m pip install --upgrade pip 20 | pip install openai toml requests pylint 21 | - name: Analysing the code with pylint 22 | run: | 23 | find aphra -name '*.py' -not -path '*/examples/*' -not -path '*/tests/*' | xargs python -m pylint --rcfile=.pylintrc 24 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [tool.poetry] 2 | name = "aphra" 3 | version = "2.1.0" 4 | description = "A translation package using LLMs" 5 | authors = ["DavidLMS "] 6 | license = "MIT" 7 | homepage = "https://davidlms.github.io/aphra" 8 | repository = "https://github.com/DavidLMS/aphra" 9 | readme = "README.md" 10 | packages = [ 11 | { include = "aphra" } 12 | ] 13 | include = [ 14 | "aphra/prompts/*.txt" 15 | ] 16 | 17 | [tool.poetry.dependencies] 18 | python = ">=3.8" 19 | openai = ">=1.40.2" 20 | toml = ">=0.10.2" 21 | requests = ">=2.32.3" 22 | setuptools = ">=72.1.0" 23 | 24 | [tool.poetry.scripts] 25 | aphra-translate = "aphra.translate:main" 26 | 27 | [build-system] 28 | requires = ["poetry-core>=1.0.0"] 29 | build-backend = "poetry.core.masonry.api" 30 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: "[BUG]" 5 | labels: bug 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. 
See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Desktop (please complete the following information):** 27 | - OS: [e.g. iOS] 28 | - Browser [e.g. chrome, safari] 29 | - Version [e.g. 22] 30 | 31 | **Additional context** 32 | Add any other context about the problem here. 33 | -------------------------------------------------------------------------------- /.pylintrc: -------------------------------------------------------------------------------- 1 | [MESSAGES CONTROL] 2 | # Disable specific warnings that are acceptable in this codebase context: 3 | # - broad-exception-caught: We catch general exceptions for graceful fallbacks 4 | # - import-outside-toplevel: Used for lazy loading in auto-discovery system 5 | # - logging-fstring-interpolation: f-strings in logging are acceptable 6 | # - invalid-name: Variable naming follows project conventions 7 | disable=broad-exception-caught,import-outside-toplevel,logging-fstring-interpolation,invalid-name 8 | 9 | [DESIGN] 10 | # Maximum number of arguments for function / method 11 | max-args=10 12 | 13 | # Maximum number of locals for function / method body 14 | max-locals=25 15 | 16 | # Maximum number of return statements in function / method body 17 | max-returns=6 18 | 19 | # Maximum number of branch for function / method body 20 | max-branches=15 21 | 22 | # Maximum number of statements in function / method body 23 | max-statements=50 24 | 25 | [FORMAT] 26 | # Maximum number of characters on a single line 27 | max-line-length=120 28 | -------------------------------------------------------------------------------- /aphra/core/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Core components for the Aphra translation system. 3 | 4 | This module contains the fundamental building blocks used across 5 | all workflows. 6 | """ 7 | 8 | from .llm_client import LLMModelClient 9 | from .parsers import parse_xml_tag, parse_multiple_xml_tags, parse_xml_tag_with_attributes 10 | from .prompts import get_prompt, list_workflow_prompts 11 | from .context import TranslationContext 12 | from .workflow import AbstractWorkflow 13 | from .registry import ( 14 | WorkflowRegistry, 15 | get_registry, 16 | register_workflow, 17 | get_workflow, 18 | get_suitable_workflow 19 | ) 20 | 21 | __all__ = [ 22 | 'LLMModelClient', 23 | 'parse_xml_tag', 24 | 'parse_multiple_xml_tags', 25 | 'parse_xml_tag_with_attributes', 26 | 'get_prompt', 27 | 'list_workflow_prompts', 28 | 'TranslationContext', 29 | 'AbstractWorkflow', 30 | 'WorkflowRegistry', 31 | 'get_registry', 32 | 'register_workflow', 33 | 'get_workflow', 34 | 'get_suitable_workflow' 35 | ] 36 | -------------------------------------------------------------------------------- /tests/test_translate.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for the translate function in the aphra module. 3 | """ 4 | 5 | import unittest 6 | from aphra.translate import translate 7 | 8 | class TestTranslate(unittest.TestCase): 9 | """ 10 | Test cases for the translate function in the aphra module. 11 | """ 12 | 13 | def setUp(self): 14 | """ 15 | Set up the test case with default parameters. 
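Note: this test exercises the real translation pipeline end to end, so it expects a local config.toml that provides a valid OpenRouter API key (see config.example.toml).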
16 | """ 17 | self.source_language = 'Spanish' 18 | self.target_language = 'English' 19 | self.text = 'Hola mundo' 20 | self.config_file = 'config.toml' 21 | 22 | def test_translation(self): 23 | """ 24 | Test the translate function to ensure it returns a valid translation. 25 | """ 26 | translation = translate( 27 | self.source_language, 28 | self.target_language, 29 | self.text, 30 | self.config_file, 31 | log_calls=False 32 | ) 33 | self.assertIsNotNone(translation) 34 | 35 | if __name__ == '__main__': 36 | unittest.main() 37 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 David Romero 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | """ 2 | Setup script for the Aphra package. 3 | 4 | This script uses setuptools to package and distribute the Aphra package, which 5 | provides translation functionality using LLMs (Large Language Models). 6 | """ 7 | 8 | from setuptools import setup, find_packages 9 | 10 | setup( 11 | name='aphra', 12 | version='2.1.0', 13 | packages=find_packages(), 14 | install_requires=[ 15 | 'openai>=1.40.2', 16 | 'toml>=0.10.2', 17 | 'requests>=2.32.3', 18 | 'setuptools>=72.1.0' 19 | ], 20 | entry_points={ 21 | 'console_scripts': [ 22 | 'aphra-translate=aphra.translate:main', 23 | ], 24 | }, 25 | package_data={ 26 | 'aphra': ['prompts/*.txt'], 27 | }, 28 | include_package_data=True, 29 | description='A translation package using LLMs', 30 | author='DavidLMS', 31 | author_email='dobles-establecer-0m@icloud.com', 32 | url='https://github.com/DavidLMS/aphra', 33 | classifiers=[ 34 | 'Programming Language :: Python :: 3', 35 | 'License :: OSI Approved :: MIT License', 36 | 'Operating System :: OS Independent', 37 | ], 38 | python_requires='>=3.8', 39 | ) 40 | -------------------------------------------------------------------------------- /tests/test_llm_client.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for the LLMModelClient class in the aphra module. 3 | """ 4 | 5 | import unittest 6 | 7 | class TestLLMModelClient(unittest.TestCase): 8 | """ 9 | Test cases for the LLMModelClient class in the aphra module. 
10 | """ 11 | 12 | def setUp(self): 13 | """ 14 | Set up the test case with a default configuration file. 15 | """ 16 | self.config_file = 'config.toml' 17 | from aphra.core.llm_client import LLMModelClient 18 | self.client = LLMModelClient(self.config_file) 19 | 20 | def test_load_config(self): 21 | """ 22 | Test loading the configuration file. 23 | """ 24 | self.assertIsNotNone(self.client.api_key_openrouter) 25 | 26 | def test_call_model(self): 27 | """ 28 | Test making a call to the model. 29 | """ 30 | system_prompt = "Translate the following text." 31 | user_prompt = "Hola mundo" 32 | model_name = "anthropic/claude-sonnet-4" 33 | response = self.client.call_model(system_prompt, user_prompt, model_name) 34 | self.assertIsNotNone(response) 35 | 36 | if __name__ == '__main__': 37 | unittest.main() 38 | -------------------------------------------------------------------------------- /tests/test_context.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for the TranslationContext class in the aphra module. 3 | """ 4 | 5 | import unittest 6 | 7 | class TestTranslationContext(unittest.TestCase): 8 | """ 9 | Test cases for the TranslationContext class in the aphra module. 10 | """ 11 | 12 | def setUp(self): 13 | """ 14 | Set up the test case with default parameters. 15 | """ 16 | self.config_file = 'config.toml' 17 | from aphra.core.llm_client import LLMModelClient 18 | from aphra.core.context import TranslationContext 19 | 20 | model_client = LLMModelClient(self.config_file) 21 | self.context = TranslationContext( 22 | model_client=model_client, 23 | source_language='English', 24 | target_language='Spanish', 25 | log_calls=False 26 | ) 27 | 28 | def test_context_initialization(self): 29 | """ 30 | Test initializing the TranslationContext. 31 | """ 32 | self.assertIsNotNone(self.context.model_client) 33 | self.assertEqual(self.context.source_language, 'English') 34 | self.assertEqual(self.context.target_language, 'Spanish') 35 | self.assertFalse(self.context.log_calls) 36 | 37 | if __name__ == '__main__': 38 | unittest.main() 39 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/docs/workflow-diagram.md: -------------------------------------------------------------------------------- 1 | ```mermaid 2 | flowchart LR 3 | T[📄 Original Text] 4 | 5 | subgraph "1. Analysis" 6 | direction TB 7 | A[🤖 LLM Writer] -->C[📄 Glossary] 8 | end 9 | 10 | subgraph "2. Search" 11 | direction TB 12 | D[🔎 LLM Searcher] --> F[📄 Contextualized Glossary] 13 | end 14 | 15 | subgraph "3. Initial Translation" 16 | direction TB 17 | G[🤖 LLM Writer] -->H[📝 Basic Translation] 18 | end 19 | 20 | subgraph "4. Critique" 21 | direction TB 22 | I[⚖️ LLM Critic] --> J[💬 Critique] 23 | end 24 | 25 | subgraph "5. 
Final Translation" 26 | direction TB 27 | K[🤖 LLM Writer] --> L[✅ Final Translation] 28 | end 29 | 30 | T --> A 31 | T --> G 32 | C --> D 33 | F --> I 34 | H --> I 35 | T --> K 36 | H --> K 37 | F --> K 38 | J --> K 39 | 40 | classDef default fill:#abb,stroke:#333,stroke-width:2px; 41 | classDef robot fill:#bbf,stroke:#333,stroke-width:2px; 42 | classDef document fill:#bfb,stroke:#333,stroke-width:2px; 43 | classDef search fill:#fbf,stroke:#333,stroke-width:2px; 44 | classDef critic fill:#ffb,stroke:#333,stroke-width:2px; 45 | class A,G,K robot; 46 | class T,B,C,F,H,L document; 47 | class D search; 48 | class I,J critic; 49 | ``` -------------------------------------------------------------------------------- /entrypoint.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Verify that the correct number of arguments have been passed 4 | if [ "$#" -ne 4 ];then 5 | echo "Usage: $0 " 6 | exit 1 7 | fi 8 | 9 | # Assign arguments to variables 10 | SOURCE_LANGUAGE=$1 11 | TARGET_LANGUAGE=$2 12 | INPUT_FILE=$3 13 | OUTPUT_FILE=$4 14 | 15 | # Ensure the input file exists 16 | if [ ! -f "$INPUT_FILE" ]; then 17 | echo "Input file $INPUT_FILE does not exist." 18 | exit 1 19 | fi 20 | 21 | # Read the content of the input file into a variable 22 | TEXT=$(cat "$INPUT_FILE") 23 | 24 | # Escaping the content to ensure it's safely passed into the Python command 25 | ESCAPED_TEXT=$(printf '%s\n' "$TEXT" | sed -e 's/"/\\"/g' -e 's/\$/\\$/g') 26 | 27 | # Prepare the Python command with the actual content of the ESCAPED_TEXT variable 28 | PYTHON_COMMAND=$(cat < "$OUTPUT_FILE" 40 | 41 | # Output a message with the output file name 42 | OUTPUT_FILENAME=$(basename "$OUTPUT_FILE") 43 | echo "Translation completed. See file $OUTPUT_FILENAME for the result." 44 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step1_user.txt: -------------------------------------------------------------------------------- 1 | Here is the {source_language} text you need to analyze: 2 | 3 | <{source_language}_text> 4 | {post_content} 5 | 6 | 7 | Please follow these steps: 8 | 9 | 1. Carefully read and analyze the {source_language} text. 10 | 11 | 2. Identify and list any terms, phrases, or cultural references that may be difficult for an {target_language}-speaking audience to understand. This may include: 12 | - Idiomatic expressions 13 | - Legal {source_language} terms 14 | - Culturally specific terms or concepts 15 | - Historical or geographical references 16 | - Wordplay or puns that don't translate directly 17 | 18 | The choices must be present in the text. 19 | 20 | Present your analysis in the following format: 21 | 22 | 23 | Reasoning about the suitability of the chosen terms and/or phrases. 24 | 25 | 26 | 27 | {source_language} term/phrase 1keywords that you would use in a search engine to get the proper context of the term 28 | {source_language} term/phrase 1keywords that you would use in a search engine to get the proper context of the term 29 | (Continue for all identified elements) 30 | 31 | 32 | Remember to be thorough in your analysis and explanations, considering both linguistic and cultural aspects of the text. -------------------------------------------------------------------------------- /aphra_runner.py: -------------------------------------------------------------------------------- 1 | """ 2 | Command-line runner for Aphra translation system. 
3 | 4 | This module provides a command-line interface to the Aphra translation 5 | functionality, allowing batch processing of text files. 6 | """ 7 | import sys 8 | import urllib.parse 9 | from aphra import translate 10 | 11 | def decode_path(path): 12 | """ 13 | Decode URL-encoded file paths. 14 | 15 | Args: 16 | path: URL-encoded file path string 17 | 18 | Returns: 19 | str: Decoded file path 20 | """ 21 | return urllib.parse.unquote(path) 22 | 23 | def main(): 24 | """ 25 | Main entry point for command-line translation. 26 | 27 | Processes command-line arguments and performs translation using Aphra. 28 | Expected arguments: config_file source_lang target_lang input_file output_file 29 | """ 30 | config_file = decode_path(sys.argv[1]) 31 | source_language = sys.argv[2] 32 | target_language = sys.argv[3] 33 | input_file = decode_path(sys.argv[4]) 34 | output_file = decode_path(sys.argv[5]) 35 | 36 | with open(input_file, 'r', encoding='utf-8') as f: 37 | text = f.read() 38 | 39 | translated_text = translate( 40 | source_language=source_language, 41 | target_language=target_language, 42 | text=text, 43 | config_file=config_file 44 | ) 45 | 46 | with open(output_file, 'w', encoding='utf-8') as f: 47 | f.write(translated_text) 48 | 49 | if __name__ == "__main__": 50 | main() 51 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step3_user.txt: -------------------------------------------------------------------------------- 1 | Here is the {source_language} text to be translated: 2 | 3 | <{source_language}_text> 4 | {text} 5 | 6 | 7 | Your goal is to produce an accurate {target_language} translation that preserves the nuances, tone, and stylistic elements of the original {source_language} text. Follow these steps: 8 | 9 | 1. Carefully read the {source_language} text and analyze the author's writing style. Pay attention to: 10 | - Sentence structure and length 11 | - Word choice and level of formality 12 | - Use of literary devices or figurative language 13 | - Rhythm and flow of the text 14 | - Any unique or distinctive elements of the author's voice 15 | 16 | 2. Begin the translation process: 17 | - Translate the text sentence by sentence, ensuring accuracy of meaning 18 | - Choose {target_language} words and phrases that best capture the tone and style of the original 19 | - Maintain similar sentence structures where possible, unless it compromises clarity in {target_language} 20 | - Preserve any idiomatic expressions, metaphors, or cultural references, adapting them if necessary to make sense in {target_language} while retaining their essence 21 | 22 | 3. After completing the translation, review it to ensure it reads naturally in {target_language} while still echoing the original {source_language} style. 23 | 24 | 4. Provide your {target_language} translation within tags. 25 | 26 | 5. After the translation, briefly explain (in 2-3 sentences) how you maintained the author's writing style in your translation. Include this explanation within tags. 27 | 28 | Remember, the goal is not just to convey the meaning, but to do so in a way that an {target_language} reader would have a similar experience to a {source_language} reader of the original text. -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step5_user.txt: -------------------------------------------------------------------------------- 1 | Follow these steps carefully: 2 | 3 | 1. 
First, read the original {source_language} text: 4 | 5 | {text} 6 | 7 | 8 | 2. Next, review the basic {target_language} translation: 9 | 10 | {translation} 11 | 12 | 13 | 3. Carefully study the glossary of terms, which provides explanations and context for better translation: 14 | 15 | {glossary} 16 | 17 | 18 | 4. Consider the critique of the basic translation: 19 | 20 | {critique} 21 | 22 | 23 | 5. Now, create a new translation taking into account the glossary of terms and the critique. Remember to maintain the author's original style. Pay close attention to the nuances and context provided in the glossary and address the issues raised in the critique. 24 | 25 | 6. If it is necessary to make a clarification through a translator's note, do so by inserting a numbered reference in square brackets immediately after the term that needs clarification. For example: "Term[1] that needs clarification in the text." 26 | 27 | 7. After completing your translation, add a translator's notes section at the end of the document. List each numbered note with its corresponding explanation. For example: 28 | 29 | [1] Description of the note that clarifies term 1. 30 | [2] Description of the note that clarifies term 2. 31 | 32 | 8. Present your final output in the following format: 33 | 34 | Your new {target_language} translation, including any numbered references for translator's notes. 35 | 36 | List your numbered translator's notes here, if any. 37 | 38 | 39 | Remember to carefully consider the context, maintain the author's style, and address the issues raised in the critique while creating your improved translation. -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step2_user.txt: -------------------------------------------------------------------------------- 1 | The term to be searched is: 2 | 3 | {term} 4 | 5 | 6 | The keywords to consider for context are: 7 | 8 | {keywords} 9 | 10 | 11 | Follow these steps to complete the task: 12 | 13 | 1. Conduct a thorough search for information about the term, paying special attention to its usage in {source_language}-speaking contexts. 14 | 15 | 2. Consider the provided keywords and how they relate to the term. Look for connections between the term and these keywords to provide a more focused context. 16 | 17 | 3. Gather information from reliable sources, including dictionaries, academic papers, news articles, and cultural references. 18 | 19 | 4. Organize the information you find into the following categories: 20 | a. Definition and literal meaning 21 | b. Cultural context and usage 22 | c. Regional variations (if applicable) 23 | d. Historical background (if relevant) 24 | e. Related terms or concepts 25 | f. Examples of usage in sentences or phrases 26 | 27 | 5. Provide any additional information that might be helpful for a translator, such as potential false friends, idiomatic expressions, or common translation pitfalls related to this term. 28 | 29 | 6. If the term has multiple meanings or uses, make sure to cover all relevant interpretations, especially those that might be influenced by the provided keywords. 30 | 31 | Present your findings in a clear, concise manner, using bullet points where appropriate. Begin your response with an opening statement that introduces the term and its general meaning or significance. 32 | 33 | Provide your complete response within tags. This will allow the translator to easily identify and utilize the information you've gathered. 
34 | 35 | Remember, your goal is to provide comprehensive context that will enable the translator to make informed decisions about the most appropriate translation of the term, considering its cultural and linguistic nuances. -------------------------------------------------------------------------------- /aphra/translate.py: -------------------------------------------------------------------------------- 1 | """ 2 | Module for translating text using multiple steps and language models. 3 | 4 | This module provides the main translation functionality using Aphra's 5 | workflow-based translation system. 6 | """ 7 | 8 | from .core.llm_client import LLMModelClient 9 | from .core.context import TranslationContext 10 | from .core.registry import get_suitable_workflow 11 | 12 | def load_model_client(config_file): 13 | """ 14 | Loads the LLMModelClient with the provided configuration file. 15 | 16 | :param config_file: Path to the TOML file containing the configuration. 17 | :return: An instance of LLMModelClient initialized with the provided configuration. 18 | """ 19 | return LLMModelClient(config_file) 20 | 21 | def translate(source_language, target_language, text, config_file="config.toml", log_calls=False): 22 | """ 23 | Translates the provided text from the source language to the target language using workflows. 24 | 25 | This function provides a convenient interface to Aphra's workflow-based 26 | translation system. 27 | 28 | :param source_language: The source language of the text. 29 | :param target_language: The target language of the text. 30 | :param text: The text to be translated. 31 | :param config_file: Path to the TOML file containing the configuration. 32 | :param log_calls: Boolean indicating whether to log the call details. 33 | :return: The improved translation of the text. 34 | """ 35 | # Load the model client 36 | model_client = load_model_client(config_file) 37 | 38 | # Create translation context 39 | context = TranslationContext( 40 | model_client=model_client, 41 | source_language=source_language, 42 | target_language=target_language, 43 | log_calls=log_calls 44 | ) 45 | 46 | # Find the most suitable workflow for this content 47 | workflow = get_suitable_workflow(text) 48 | 49 | if workflow is None: 50 | raise ValueError("No suitable workflow found for the provided text") 51 | 52 | # Execute the workflow 53 | result = workflow.run(context, text) 54 | 55 | return result 56 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/prompts/step4_user.txt: -------------------------------------------------------------------------------- 1 | Here is the original {source_language} text: 2 | <{source_language}_text> 3 | {text} 4 | 5 | 6 | Here is the basic {target_language} translation: 7 | <{target_language}_translation> 8 | {translation} 9 | 10 | 11 | Here is a glossary of terms from the original text, explained and contextualized for a better translation: 12 | 13 | {glossary} 14 | 15 | 16 | Please follow these steps to complete your task: 17 | 18 | 1. Carefully read the {source_language} text, the {target_language} translation, and the glossary. 19 | 20 | 2. Analyze the translation for accuracy, fluency, and cultural appropriateness. Consider the following aspects: 21 | - Semantic accuracy: Does the translation convey the same meaning as the original? 22 | - Grammar and syntax: Is the {target_language} grammatically correct and natural-sounding? 
23 | - Idiomatic expressions: Are {source_language} idioms appropriately translated or adapted? 24 | - Cultural nuances: Are cultural references accurately conveyed or explained? 25 | - Terminology: Is specialized vocabulary correctly translated, especially considering the provided glossary? 26 | 27 | 3. Identify terms or concepts that would benefit from a translator's note. These may include: 28 | - Cultural references that may not be familiar to the target audience 29 | - Words or phrases with multiple meanings or connotations in {source_language} 30 | - Concepts that require additional context for full understanding 31 | 32 | 4. Provide your criticism and suggestions in the following format: 33 | 34 | 35 | 36 | [List specific suggestions for improving the translation, with explanations for each suggestion] 37 | 38 | 39 | 40 | [List terms or concepts that should have a translator's note, explaining why each note is necessary and what information it should include] 41 | 42 | 43 | 44 | Be thorough in your analysis, but also concise in your explanations. Focus on the most important improvements and notes that would significantly enhance the quality and clarity of the translation. -------------------------------------------------------------------------------- /aphra/core/context.py: -------------------------------------------------------------------------------- 1 | """ 2 | Context management for translation workflows. 3 | 4 | This module provides the TranslationContext class that encapsulates 5 | all the state and configuration needed during translation execution. 6 | """ 7 | 8 | from dataclasses import dataclass 9 | from typing import Dict, Any, Optional 10 | from .llm_client import LLMModelClient 11 | 12 | @dataclass 13 | class TranslationContext: 14 | """ 15 | Context for translation containing parameters and settings. 16 | 17 | This class encapsulates the parameters and settings needed for performing a translation, 18 | including the model client, source and target languages, and logging preferences. 19 | """ 20 | model_client: LLMModelClient 21 | source_language: str 22 | target_language: str 23 | log_calls: bool 24 | 25 | # Additional fields for workflow state 26 | metadata: Dict[str, Any] = None 27 | intermediate_results: Dict[str, Any] = None 28 | workflow_config: Optional[Dict[str, Any]] = None 29 | 30 | def __post_init__(self): 31 | """Initialize optional fields if not provided.""" 32 | if self.metadata is None: 33 | self.metadata = {} 34 | if self.intermediate_results is None: 35 | self.intermediate_results = {} 36 | if self.workflow_config is None: 37 | self.workflow_config = {} 38 | 39 | def get_workflow_config(self, key: str = None, default: Any = None) -> Any: 40 | """ 41 | Get workflow-specific configuration value. 42 | 43 | Args: 44 | key: Configuration key to retrieve. If None, returns full config dict. 45 | default: Default value if key is not found. 46 | 47 | Returns: 48 | Configuration value or default if not found. 
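Example (illustrative; assumes a workflow such as short_article has already loaded its configuration into this context):
    writer_model = context.get_workflow_config('writer', default='anthropic/claude-sonnet-4')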
49 | """ 50 | if key is None: 51 | return self.workflow_config 52 | return self.workflow_config.get(key, default) 53 | 54 | def set_workflow_config(self, config: Dict[str, Any]) -> None: 55 | """Set workflow-specific configuration.""" 56 | self.workflow_config = config 57 | 58 | def store_result(self, step_name: str, result: Any) -> None: 59 | """Store intermediate result from a workflow step.""" 60 | self.intermediate_results[step_name] = result 61 | 62 | def get_result(self, step_name: str) -> Any: 63 | """Retrieve intermediate result from a workflow step.""" 64 | return self.intermediate_results.get(step_name) 65 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/tests/test_prompts.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for prompt functionality specific to the Short Article workflow. 3 | 4 | These tests verify that the workflow can correctly load and format 5 | its specific prompt templates. 6 | """ 7 | 8 | import unittest 9 | from ....core.prompts import get_prompt 10 | 11 | class TestShortArticlePrompts(unittest.TestCase): 12 | """ 13 | Test cases for the short article workflow prompt functions. 14 | """ 15 | 16 | def test_get_prompt_with_formatting(self): 17 | """ 18 | Test getting a prompt and formatting it correctly with the new signature. 19 | """ 20 | file_name = 'step1_system.txt' 21 | prompt = get_prompt('short_article', file_name, 22 | source_language='Spanish', 23 | target_language='English') 24 | 25 | # Verify that the prompt contains the formatted languages 26 | self.assertIn('Spanish', prompt) 27 | self.assertIn('English', prompt) 28 | 29 | def test_get_prompt_without_formatting(self): 30 | """ 31 | Test getting a prompt without formatting parameters. 32 | """ 33 | file_name = 'step1_system.txt' 34 | prompt = get_prompt('short_article', file_name) 35 | 36 | # Should return the prompt template content 37 | self.assertIsInstance(prompt, str) 38 | self.assertGreater(len(prompt), 0) 39 | 40 | def test_get_prompt_all_steps(self): 41 | """ 42 | Test that all prompt files for the workflow can be loaded. 43 | """ 44 | step_files = [ 45 | 'step1_system.txt', 'step1_user.txt', 46 | 'step2_system.txt', 'step2_user.txt', 47 | 'step3_system.txt', 'step3_user.txt', 48 | 'step4_system.txt', 'step4_user.txt', 49 | 'step5_system.txt', 'step5_user.txt' 50 | ] 51 | 52 | for file_name in step_files: 53 | with self.subTest(file_name=file_name): 54 | prompt = get_prompt('short_article', file_name) 55 | self.assertIsInstance(prompt, str) 56 | self.assertGreater(len(prompt), 0) 57 | 58 | def test_get_prompt_missing_file(self): 59 | """ 60 | Test behavior when requesting a non-existent prompt file. 61 | """ 62 | with self.assertRaises(FileNotFoundError): 63 | get_prompt('short_article', 'nonexistent_prompt.txt') 64 | 65 | def test_get_prompt_missing_workflow(self): 66 | """ 67 | Test behavior when requesting prompts from a non-existent workflow. 68 | """ 69 | with self.assertRaises(FileNotFoundError): 70 | get_prompt('nonexistent_workflow', 'step1_system.txt') 71 | 72 | if __name__ == '__main__': 73 | unittest.main() 74 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/aux/parsers.py: -------------------------------------------------------------------------------- 1 | """ 2 | Parsers specific to the Short Article workflow. 
3 | 4 | This module contains parsers for extracting content from LLM responses 5 | that are specific to the short article translation workflow. 6 | 7 | These parsers use the generic XML parsing functions from the core module 8 | to avoid code duplication while maintaining a clear API. 9 | """ 10 | 11 | import logging 12 | from typing import List, Dict, Any 13 | from ....core.parsers import parse_xml_tag, parse_multiple_xml_tags 14 | 15 | def parse_analysis(analysis_str: str) -> List[Dict[str, Any]]: 16 | """ 17 | Parses the analysis part of the provided string and returns 18 | a list of items with their names and keywords. 19 | 20 | Uses generic XML parsers from the core module to extract structured data 21 | from the <analysis> tag and its nested <item> elements. 22 | 23 | Args: 24 | analysis_str: String containing the analysis in the specified format. 25 | 26 | Returns: 27 | List[Dict]: A list of dictionaries, each containing 'name' and 'keywords' from the analysis. 28 | """ 29 | # 1. Extract content of <analysis> tag 30 | analysis_content = parse_xml_tag(analysis_str, "analysis") 31 | if not analysis_content: 32 | logging.error('Could not find <analysis> tag in content') 33 | return [] 34 | 35 | # 2. Extract all <item> tags within the analysis 36 | item_contents = parse_multiple_xml_tags(analysis_content, "item") 37 | if not item_contents: 38 | logging.warning('No <item> tags found within <analysis>') 39 | return [] 40 | 41 | # 3. For each item, extract name and keywords 42 | items = [] 43 | for item_content in item_contents: 44 | name = parse_xml_tag(item_content, "name") 45 | keywords_str = parse_xml_tag(item_content, "keywords") 46 | 47 | if name and keywords_str: 48 | items.append({ 49 | 'name': name, 50 | 'keywords': keywords_str.split(', ') 51 | }) 52 | else: 53 | logging.warning('Incomplete item found - name: %s, keywords: %s', name, keywords_str) 54 | 55 | return items 56 | 57 | def parse_translation(translation_str: str) -> str: 58 | """ 59 | Parses the provided string and returns the content within 60 | <improved_translation> tags. 61 | 62 | Uses the generic XML parser from the core module to extract the translation. 63 | 64 | Args: 65 | translation_str: String containing the translation in the specified format. 66 | 67 | Returns: 68 | str: String containing the content. 69 | """ 70 | result = parse_xml_tag(translation_str, "improved_translation") 71 | if result is None: 72 | logging.error('Could not find <improved_translation> tag in content') 73 | return "" 74 | 75 | return result 76 | -------------------------------------------------------------------------------- /aphra/core/prompts.py: -------------------------------------------------------------------------------- 1 | """ 2 | Core prompt template loading utilities. 3 | 4 | This module provides generic prompt template loading functionality 5 | for all workflows in the Aphra translation system. 6 | """ 7 | 8 | import os 9 | from importlib import resources 10 | 11 | def get_prompt(workflow_name: str, file_name: str, **kwargs) -> str: 12 | """ 13 | Reads a prompt template from a workflow's prompts directory and formats it.
14 | 15 | Args: 16 | workflow_name: Name of the workflow (e.g., 'short_article', 'subtitles') 17 | file_name: Name of the prompt file (e.g., 'step1_system.txt') 18 | **kwargs: Optional keyword arguments to format the prompt template 19 | 20 | Returns: 21 | str: The formatted prompt content 22 | 23 | Raises: 24 | FileNotFoundError: If the prompt file doesn't exist 25 | KeyError: If required format parameters are missing 26 | """ 27 | try: 28 | # Try using importlib.resources first (works in packaged installations) 29 | ref = resources.files('aphra.workflows') / workflow_name / 'prompts' / file_name 30 | with ref.open('r', encoding="utf-8") as file: 31 | content = file.read() 32 | except (AttributeError, FileNotFoundError) as exc: 33 | # Fallback to direct file access (works in development) 34 | workflows_path = os.path.dirname(os.path.dirname(__file__)) # Go up to aphra/ 35 | file_path = os.path.join(workflows_path, 'workflows', workflow_name, 'prompts', file_name) 36 | 37 | if not os.path.exists(file_path): 38 | raise FileNotFoundError(f"Prompt file not found: {file_path}") from exc 39 | 40 | with open(file_path, 'r', encoding="utf-8") as file: 41 | content = file.read() 42 | 43 | # Format the content with provided kwargs if any 44 | if kwargs: 45 | try: 46 | formatted_prompt = content.format(**kwargs) 47 | except KeyError as exc: 48 | msg = f"Missing format parameter {exc} for prompt {workflow_name}/{file_name}" 49 | raise KeyError(msg) from exc 50 | else: 51 | formatted_prompt = content 52 | 53 | return formatted_prompt 54 | 55 | def list_workflow_prompts(workflow_name: str) -> list[str]: 56 | """ 57 | List all available prompt files for a workflow. 58 | 59 | Args: 60 | workflow_name: Name of the workflow 61 | 62 | Returns: 63 | list[str]: List of prompt filenames available for the workflow 64 | 65 | Raises: 66 | FileNotFoundError: If the workflow prompts directory doesn't exist 67 | """ 68 | try: 69 | # Try using importlib.resources first 70 | prompts_ref = resources.files('aphra.workflows') / workflow_name / 'prompts' 71 | return [f.name for f in prompts_ref.iterdir() if f.is_file()] 72 | except (AttributeError, FileNotFoundError) as exc: 73 | # Fallback to direct directory access 74 | workflows_path = os.path.dirname(os.path.dirname(__file__)) 75 | prompts_path = os.path.join(workflows_path, 'workflows', workflow_name, 'prompts') 76 | 77 | if not os.path.exists(prompts_path): 78 | msg = f"Workflow prompts directory not found: {prompts_path}" 79 | raise FileNotFoundError(msg) from exc 80 | 81 | return [f for f in os.listdir(prompts_path) 82 | if os.path.isfile(os.path.join(prompts_path, f))] 83 | -------------------------------------------------------------------------------- /aphra/core/workflow.py: -------------------------------------------------------------------------------- 1 | """ 2 | Workflow base classes. 3 | 4 | This module defines the contract for translation workflows. 5 | """ 6 | 7 | from abc import ABC, abstractmethod 8 | from typing import Dict, Any 9 | from .context import TranslationContext 10 | from .config import load_workflow_config 11 | 12 | class AbstractWorkflow(ABC): 13 | """ 14 | Base class for translation workflows. 15 | 16 | A workflow orchestrates a translation process for a specific type of content 17 | using methods that can be overridden to customize behavior. 18 | """ 19 | 20 | @abstractmethod 21 | def get_workflow_name(self) -> str: 22 | """ 23 | Get the unique name of this workflow. 
24 | 25 | Returns: 26 | str: The workflow name identifier 27 | """ 28 | raise NotImplementedError("Subclasses must implement get_workflow_name") 29 | 30 | @abstractmethod 31 | def is_suitable_for(self, text: str, **kwargs) -> bool: 32 | """ 33 | Determine if this workflow is suitable for the given content. 34 | 35 | Args: 36 | text: The text content to evaluate 37 | **kwargs: Additional evaluation parameters 38 | 39 | Returns: 40 | bool: True if this workflow is suitable for the content 41 | """ 42 | raise NotImplementedError("Subclasses must implement is_suitable_for") 43 | 44 | def load_config(self, global_config_path: str = None) -> Dict[str, Any]: 45 | """ 46 | Load workflow-specific configuration. 47 | 48 | This method automatically loads the workflow's default configuration 49 | and applies user overrides from the global config file. 50 | 51 | Args: 52 | global_config_path: Path to global config file. If None, uses 'config.toml' 53 | 54 | Returns: 55 | Dict containing merged configuration values 56 | """ 57 | return load_workflow_config(self.get_workflow_name(), global_config_path) 58 | 59 | def run(self, context: TranslationContext, text: str = None) -> str: 60 | """ 61 | Run the complete workflow with configuration management. 62 | 63 | This method: 64 | 1. Loads workflow-specific configuration 65 | 2. Sets it in the translation context 66 | 3. Calls the execute method 67 | 68 | Args: 69 | context: The translation context 70 | text: The text to translate (optional if already in context) 71 | 72 | Returns: 73 | str: The final translation result 74 | """ 75 | # Load workflow configuration and set it in context 76 | workflow_config = self.load_config() 77 | context.set_workflow_config(workflow_config) 78 | 79 | # Get text from context if not provided 80 | if text is None: 81 | text = getattr(context, 'text', '') 82 | 83 | return self.execute(context, text) 84 | 85 | @abstractmethod 86 | def execute(self, context: TranslationContext, text: str) -> str: 87 | """ 88 | Execute the complete workflow with the given context and text. 89 | 90 | Args: 91 | context: The translation context 92 | text: The text to translate 93 | 94 | Returns: 95 | str: The final translation result 96 | """ 97 | raise NotImplementedError("Subclasses must implement execute") 98 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/examples/simple_demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | """ 3 | Simple demo showing how to use the Short Article Workflow directly. 4 | 5 | This example demonstrates the basic usage of the ShortArticleWorkflow 6 | without any web interface - just pure Python code. 7 | """ 8 | 9 | import os 10 | import tempfile 11 | import toml 12 | from ..short_article_workflow import ShortArticleWorkflow 13 | from ....core.context import TranslationContext 14 | from ....core.llm_client import LLMModelClient 15 | 16 | def main(): 17 | """ 18 | Simple example of using the Short Article Workflow. 19 | """ 20 | # Sample text to translate 21 | sample_text = """ 22 | El cambio climático es uno de los desafíos más importantes de nuestro tiempo. 23 | Los científicos han demostrado que las actividades humanas están causando 24 | un calentamiento global sin precedentes, lo que está alterando los patrones 25 | climáticos en todo el mundo. 
26 | """ 27 | 28 | # Configuration - you would normally have this in a config file 29 | # Note: workflow-specific config (writer, searcher, etc.) is now handled 30 | # automatically by the workflow itself from its default.toml 31 | config_data = { 32 | "openrouter": { 33 | "api_key": "your-openrouter-api-key-here" # Replace with your actual key 34 | } 35 | } 36 | 37 | print("🌐💬 Short Article Workflow Demo") 38 | print("=" * 40) 39 | 40 | # Check if API key is provided 41 | if config_data["openrouter"]["api_key"] == "your-openrouter-api-key-here": 42 | print("❌ Please set your OpenRouter API key in this script") 43 | print(" Edit the 'api_key' value in the config_data dictionary") 44 | return 45 | 46 | with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.toml') as tmp: 47 | toml.dump(config_data, tmp) 48 | config_file = tmp.name 49 | 50 | try: 51 | # Load the model client with the config file 52 | model_client = LLMModelClient(config_file) 53 | 54 | # Create translation context 55 | context = TranslationContext( 56 | model_client=model_client, 57 | source_language="Spanish", 58 | target_language="English", 59 | log_calls=True # Enable logging to see the workflow steps 60 | ) 61 | 62 | print("📝 Original text:") 63 | print(f" {sample_text.strip()[:100]}...") 64 | print() 65 | 66 | print("🚀 Running Short Article Workflow...") 67 | print(" This will execute 5 steps: analyze → search → translate → critique → refine") 68 | print() 69 | 70 | # Create and run the workflow 71 | workflow = ShortArticleWorkflow() 72 | translation = workflow.run(context, sample_text.strip()) 73 | 74 | print("✅ Translation completed!") 75 | print("=" * 40) 76 | print("🎯 Final translation:") 77 | print(f" {translation}") 78 | 79 | except Exception as exc: 80 | print(f"❌ Error during translation: {exc}") 81 | print(" Make sure you have:") 82 | print(" 1. Set a valid OpenRouter API key") 83 | print(" 2. Internet connection for API calls") 84 | print(" 3. Installed all required dependencies") 85 | 86 | finally: 87 | # Clean up temporary config file 88 | os.unlink(config_file) 89 | 90 | if __name__ == "__main__": 91 | main() 92 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/docs/README.md: -------------------------------------------------------------------------------- 1 | # Short Article Workflow 2 | 3 | The Short Article workflow implements a comprehensive 5-step translation process designed for articles, blog posts, and general text content. 4 | 5 | ## Overview 6 | 7 | This workflow provides high-quality translation through a multi-stage process that ensures accuracy, context awareness, and natural language flow. 8 | 9 | ### When to Use 10 | 11 | This workflow is suitable for: 12 | - Articles and blog posts 13 | - General text content 14 | - Documents requiring high translation quality 15 | - Content where context and nuance are important 16 | 17 | ### Workflow Steps 18 | 19 | 1. **Analysis** - Identify key terms and concepts in the source text 20 | 2. **Search** - Generate contextual explanations using web search 21 | 3. **Translation** - Create initial translation based on analysis and context 22 | 4. **Critique** - Evaluate translation quality and identify improvements 23 | 5. **Refinement** - Produce final improved translation incorporating feedback 24 | 25 | ## Workflow Diagram 26 | 27 | ![Short Article Workflow Diagram](workflow-diagram.png) 28 | 29 | For the interactive Mermaid version, see [workflow-diagram.md](workflow-diagram.md). 
30 | 31 | ## Configuration 32 | 33 | ### Default Models 34 | - **Writer**: `anthropic/claude-sonnet-4` (for analysis, translation, and refinement) 35 | - **Searcher**: `perplexity/sonar` (for web search and context gathering) 36 | - **Critiquer**: `anthropic/claude-sonnet-4` (for translation evaluation) 37 | 38 | ### Customization 39 | 40 | You can override the default configuration in your `config.toml`: 41 | 42 | ```toml 43 | [short_article] 44 | writer = "different/model" 45 | searcher = "different/search-model" 46 | critiquer = "different/critique-model" 47 | ``` 48 | 49 | ## Usage Examples 50 | 51 | ### Basic Usage 52 | 53 | ```python 54 | from aphra import translate 55 | 56 | translation = translate( 57 | source_language="Spanish", 58 | target_language="English", 59 | text="Your text here", 60 | config_file="config.toml" 61 | ) 62 | ``` 63 | 64 | ### Direct Workflow Usage 65 | 66 | ```python 67 | from aphra.workflows.short_article import ShortArticleWorkflow 68 | from aphra.core.context import TranslationContext 69 | from aphra.core.llm_client import LLMModelClient 70 | 71 | workflow = ShortArticleWorkflow() 72 | model_client = LLMModelClient('config.toml') 73 | 74 | context = TranslationContext( 75 | model_client=model_client, 76 | source_language="Spanish", 77 | target_language="English", 78 | log_calls=False 79 | ) 80 | 81 | result = workflow.run(context, "Your text here") 82 | ``` 83 | 84 | For more examples, see the [examples directory](../examples/). 85 | 86 | ## Testing 87 | 88 | The workflow includes comprehensive tests that make real API calls to validate functionality: 89 | 90 | ```bash 91 | python -m pytest aphra/workflows/short_article/tests/ -v 92 | ``` 93 | 94 | ## Implementation Details 95 | 96 | - **Auto-Discovery**: Automatically detected by the system 97 | - **Self-Contained**: All workflow code, prompts, and configuration in one directory 98 | - **Extensible**: Inherit from this workflow to create specialized versions 99 | - **Configurable**: User can override any model or parameter 100 | 101 | ## Performance 102 | 103 | - **Average Time**: 20-30 seconds for typical articles (varies by content length and API response times) 104 | - **API Calls**: 5-7 calls per translation (depending on search results) 105 | - **Quality**: High-quality translations with context awareness and critique-driven refinement -------------------------------------------------------------------------------- /aphra/workflows/short_article/tests/test_parsers.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for the parser functions specific to the Short Article workflow. 3 | 4 | These tests verify that the workflow-specific parsers work correctly 5 | using the generic core parsers internally. 6 | """ 7 | 8 | import unittest 9 | from ..aux.parsers import parse_analysis, parse_translation 10 | 11 | class TestShortArticleParsers(unittest.TestCase): 12 | """ 13 | Test cases for the short article workflow-specific parser functions. 14 | """ 15 | 16 | def test_parse_analysis(self): 17 | """ 18 | Test parsing an analysis string into items using the new generic parsers. 19 | """ 20 | analysis_str = ( 21 | "Holahello, hi" 22 | ) 23 | parsed = parse_analysis(analysis_str) 24 | self.assertEqual(len(parsed), 1) 25 | self.assertEqual(parsed[0]['name'], 'Hola') 26 | self.assertIn('hello', parsed[0]['keywords']) 27 | 28 | def test_parse_analysis_multiple_items(self): 29 | """ 30 | Test parsing an analysis string with multiple items. 
31 | """ 32 | analysis_str = ( 33 | "" 34 | "Holahello, hi" 35 | "Mundoworld, earth" 36 | "" 37 | ) 38 | parsed = parse_analysis(analysis_str) 39 | self.assertEqual(len(parsed), 2) 40 | self.assertEqual(parsed[0]['name'], 'Hola') 41 | self.assertEqual(parsed[1]['name'], 'Mundo') 42 | self.assertIn('hello', parsed[0]['keywords']) 43 | self.assertIn('world', parsed[1]['keywords']) 44 | 45 | def test_parse_analysis_empty(self): 46 | """ 47 | Test parsing an empty or invalid analysis string. 48 | """ 49 | # Empty analysis 50 | parsed = parse_analysis("") 51 | self.assertEqual(len(parsed), 0) 52 | 53 | # No analysis tag 54 | parsed = parse_analysis("No analysis here") 55 | self.assertEqual(len(parsed), 0) 56 | 57 | def test_parse_translation(self): 58 | """ 59 | Test parsing a translation string into improved translation content. 60 | """ 61 | translation_str = "Hello world" 62 | translation = parse_translation(translation_str) 63 | self.assertEqual(translation, "Hello world") 64 | 65 | def test_parse_translation_with_extra_content(self): 66 | """ 67 | Test parsing translation with content before and after the tag. 68 | """ 69 | translation_str = ( 70 | "Some preamble text\n" 71 | "Hello beautiful world\n" 72 | "Some postamble text" 73 | ) 74 | translation = parse_translation(translation_str) 75 | self.assertEqual(translation, "Hello beautiful world") 76 | 77 | def test_parse_translation_missing_tag(self): 78 | """ 79 | Test parsing translation when the tag is missing. 80 | """ 81 | translation_str = "Just some text without tags" 82 | translation = parse_translation(translation_str) 83 | self.assertEqual(translation, "") 84 | 85 | def test_parse_translation_empty_tag(self): 86 | """ 87 | Test parsing translation with empty tag. 88 | """ 89 | translation_str = "" 90 | translation = parse_translation(translation_str) 91 | self.assertEqual(translation, "") 92 | 93 | if __name__ == '__main__': 94 | unittest.main() 95 | -------------------------------------------------------------------------------- /tests/test_core_prompts.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for the core prompt system. 3 | 4 | These tests verify the generic prompt loading functionality that can be used 5 | across all workflows. 6 | """ 7 | 8 | import unittest 9 | from aphra.core.prompts import get_prompt, list_workflow_prompts 10 | 11 | 12 | class TestCorePrompts(unittest.TestCase): 13 | """ 14 | Test cases for the core prompt system functions. 15 | """ 16 | 17 | def test_get_prompt_basic(self): 18 | """ 19 | Test basic prompt loading functionality. 20 | """ 21 | # Test with the existing short_article workflow 22 | prompt = get_prompt('short_article', 'step1_system.txt') 23 | self.assertIsInstance(prompt, str) 24 | self.assertGreater(len(prompt), 0) 25 | 26 | def test_get_prompt_with_formatting(self): 27 | """ 28 | Test prompt loading with formatting parameters. 29 | """ 30 | prompt = get_prompt('short_article', 'step1_system.txt', 31 | source_language='Spanish', 32 | target_language='English') 33 | self.assertIn('Spanish', prompt) 34 | self.assertIn('English', prompt) 35 | 36 | def test_get_prompt_missing_workflow(self): 37 | """ 38 | Test behavior when requesting prompts from non-existent workflow. 39 | """ 40 | with self.assertRaises(FileNotFoundError): 41 | get_prompt('nonexistent_workflow', 'step1_system.txt') 42 | 43 | def test_get_prompt_missing_file(self): 44 | """ 45 | Test behavior when requesting non-existent prompt file. 
46 | """ 47 | with self.assertRaises(FileNotFoundError): 48 | get_prompt('short_article', 'nonexistent_prompt.txt') 49 | 50 | def test_get_prompt_missing_format_parameter(self): 51 | """ 52 | Test behavior when required format parameter is missing. 53 | """ 54 | # This should raise KeyError if the template requires parameters 55 | # but they're not provided (depends on the actual template content) 56 | try: 57 | prompt = get_prompt('short_article', 'step1_system.txt', 58 | source_language='Spanish') 59 | # If it doesn't raise an error, the template is flexible 60 | self.assertIsInstance(prompt, str) 61 | except KeyError: 62 | # If it raises KeyError, that's also acceptable behavior 63 | pass 64 | 65 | def test_list_workflow_prompts(self): 66 | """ 67 | Test listing available prompt files for a workflow. 68 | """ 69 | prompts = list_workflow_prompts('short_article') 70 | self.assertIsInstance(prompts, list) 71 | self.assertGreater(len(prompts), 0) 72 | 73 | # Check that expected files are present 74 | expected_files = [ 75 | 'step1_system.txt', 'step1_user.txt', 76 | 'step2_system.txt', 'step2_user.txt', 77 | 'step3_system.txt', 'step3_user.txt', 78 | 'step4_system.txt', 'step4_user.txt', 79 | 'step5_system.txt', 'step5_user.txt' 80 | ] 81 | 82 | for expected_file in expected_files: 83 | self.assertIn(expected_file, prompts) 84 | 85 | def test_list_workflow_prompts_missing_workflow(self): 86 | """ 87 | Test listing prompts for non-existent workflow. 88 | """ 89 | with self.assertRaises(FileNotFoundError): 90 | list_workflow_prompts('nonexistent_workflow') 91 | 92 | def test_prompt_file_extensions(self): 93 | """ 94 | Test that only .txt files are listed as prompts. 95 | """ 96 | prompts = list_workflow_prompts('short_article') 97 | 98 | # All files should have .txt extension 99 | for prompt_file in prompts: 100 | self.assertTrue(prompt_file.endswith('.txt')) 101 | 102 | 103 | if __name__ == '__main__': 104 | unittest.main() -------------------------------------------------------------------------------- /aphra/core/config.py: -------------------------------------------------------------------------------- 1 | """ 2 | Generic configuration management for workflows. 3 | 4 | This module provides functions to load and merge workflow-specific configuration 5 | with user overrides for any workflow in the system. 6 | """ 7 | 8 | import os 9 | from typing import Dict, Any, Optional 10 | import logging 11 | import toml 12 | 13 | def load_workflow_config(workflow_name: str, 14 | global_config_path: Optional[str] = None) -> Dict[str, Any]: 15 | """ 16 | Load workflow configuration with user overrides. 17 | 18 | This generic function works for any workflow by: 19 | 1. Loading default config from workflows/{workflow_name}/config/default.toml 20 | 2. Applying user overrides from config.toml section [{workflow_name}] 21 | 3. Returning the merged configuration 22 | 23 | Args: 24 | workflow_name: Name of the workflow (e.g., 'short_article', 'subtitles') 25 | global_config_path: Path to global config.toml file. If None, looks for 26 | config.toml in the current working directory. 
27 | 28 | Returns: 29 | Dict containing merged configuration values 30 | 31 | Example: 32 | config = load_workflow_config('short_article') 33 | writer_model = config.get('writer', 'default-model') 34 | """ 35 | # Build path to workflow's default config 36 | # Assuming we're in aphra/core/ and want to reach aphra/workflows/ 37 | core_dir = os.path.dirname(__file__) 38 | aphra_dir = os.path.dirname(core_dir) 39 | workflow_config_path = os.path.join( 40 | aphra_dir, 'workflows', workflow_name, 'config', 'default.toml' 41 | ) 42 | 43 | # Load default workflow config 44 | config = {} 45 | try: 46 | with open(workflow_config_path, 'r', encoding='utf-8') as config_file: 47 | config = toml.load(config_file) 48 | logging.debug("Loaded default config for workflow '%s'", workflow_name) 49 | except FileNotFoundError: 50 | logging.warning("Default config not found for workflow '%s' at %s", 51 | workflow_name, workflow_config_path) 52 | config = {} 53 | except Exception as exc: 54 | logging.error("Error loading default config for workflow '%s': %s", 55 | workflow_name, exc) 56 | config = {} 57 | 58 | # Load user overrides from global config 59 | if global_config_path is None: 60 | global_config_path = 'config.toml' 61 | 62 | try: 63 | with open(global_config_path, 'r', encoding='utf-8') as config_file: 64 | global_config = toml.load(config_file) 65 | 66 | # Apply overrides from workflow-specific section 67 | if workflow_name in global_config: 68 | config.update(global_config[workflow_name]) 69 | logging.debug("Applied user overrides for workflow '%s'", workflow_name) 70 | 71 | except FileNotFoundError: 72 | logging.debug("Global config file not found: %s", global_config_path) 73 | # No global config file, use defaults 74 | except Exception as exc: 75 | logging.warning("Error reading global config file %s: %s", 76 | global_config_path, exc) 77 | # Error reading config, use defaults 78 | 79 | return config 80 | 81 | def get_workflow_config_path(workflow_name: str) -> str: 82 | """ 83 | Get the path to a workflow's default configuration file. 84 | 85 | Args: 86 | workflow_name: Name of the workflow 87 | 88 | Returns: 89 | str: Path to the workflow's default.toml file 90 | """ 91 | core_dir = os.path.dirname(__file__) 92 | aphra_dir = os.path.dirname(core_dir) 93 | return os.path.join(aphra_dir, 'workflows', workflow_name, 'config', 'default.toml') 94 | 95 | def workflow_has_config(workflow_name: str) -> bool: 96 | """ 97 | Check if a workflow has a configuration file. 98 | 99 | Args: 100 | workflow_name: Name of the workflow 101 | 102 | Returns: 103 | bool: True if the workflow has a default.toml file 104 | """ 105 | config_path = get_workflow_config_path(workflow_name) 106 | return os.path.isfile(config_path) 107 | -------------------------------------------------------------------------------- /aphra/core/parsers.py: -------------------------------------------------------------------------------- 1 | """ 2 | Generic parsing utilities for XML-like content extraction. 3 | 4 | This module provides generic parsers that can be used across different 5 | workflows for extracting content from XML-like tags in LLM responses. 6 | """ 7 | 8 | import logging 9 | import re 10 | from typing import Optional 11 | 12 | def parse_xml_tag(content: str, tag_name: str) -> Optional[str]: 13 | """ 14 | Extract content from within XML-like tags in a string. 15 | 16 | This is a generic parser that can extract content from any XML-like tag 17 | in LLM responses, making it reusable across different workflows. 
18 | 19 | Args: 20 | content: The string content containing XML-like tags 21 | tag_name: The name of the tag to extract (without < >) 22 | 23 | Returns: 24 | str: The content within the tags, or None if not found 25 | 26 | Example: 27 | >>> content = "Some text Hello World more text" 28 | >>> parse_xml_tag(content, "result") 29 | "Hello World" 30 | """ 31 | try: 32 | start_tag = f"<{tag_name}>" 33 | end_tag = f"" 34 | 35 | start_index = content.find(start_tag) 36 | if start_index == -1: 37 | logging.warning("Start tag '<%s>' not found in content", tag_name) 38 | return None 39 | 40 | start_index += len(start_tag) 41 | end_index = content.find(end_tag, start_index) 42 | 43 | if end_index == -1: 44 | logging.warning("End tag '' not found in content", tag_name) 45 | return None 46 | 47 | extracted_content = content[start_index:end_index].strip() 48 | return extracted_content 49 | 50 | except Exception as exc: 51 | logging.error("Error parsing XML tag '%s': %s", tag_name, exc) 52 | return None 53 | 54 | def parse_multiple_xml_tags(content: str, tag_name: str) -> list[str]: 55 | """ 56 | Extract content from multiple XML-like tags of the same type. 57 | 58 | Args: 59 | content: The string content containing XML-like tags 60 | tag_name: The name of the tag to extract (without < >) 61 | 62 | Returns: 63 | list[str]: List of content within all matching tags 64 | 65 | Example: 66 | >>> content = "Text First more Second end" 67 | >>> parse_multiple_xml_tags(content, "item") 68 | ["First", "Second"] 69 | """ 70 | try: 71 | # Use regex to find all occurrences 72 | pattern = f"<{re.escape(tag_name)}>(.*?)" 73 | matches = re.findall(pattern, content, re.DOTALL) 74 | 75 | # Strip whitespace from each match 76 | results = [match.strip() for match in matches] 77 | return results 78 | 79 | except Exception as exc: 80 | logging.error("Error parsing multiple XML tags '%s': %s", tag_name, exc) 81 | return [] 82 | 83 | def parse_xml_tag_with_attributes(content: str, tag_name: str) -> Optional[dict]: 84 | """ 85 | Extract content and attributes from XML-like tags. 
86 | 87 | Args: 88 | content: The string content containing XML-like tags 89 | tag_name: The name of the tag to extract (without < >) 90 | 91 | Returns: 92 | dict: Dictionary with 'content' and 'attributes' keys, or None if not found 93 | 94 | Example: 95 | >>> content = 'Text Hello World' 96 | >>> parse_xml_tag_with_attributes(content, "result") 97 | {"content": "Hello World", "attributes": {"type": "success"}} 98 | """ 99 | try: 100 | # Pattern to match tag with optional attributes 101 | pattern = f"<{re.escape(tag_name)}([^>]*)>(.*?)" 102 | match = re.search(pattern, content, re.DOTALL) 103 | 104 | if not match: 105 | logging.warning("Tag '<%s>' not found in content", tag_name) 106 | return None 107 | 108 | attributes_str = match.group(1).strip() 109 | tag_content = match.group(2).strip() 110 | 111 | # Parse attributes if any 112 | attributes = {} 113 | if attributes_str: 114 | # Simple attribute parsing (handles key="value" format) 115 | attr_pattern = r'(\w+)="([^"]*)"' 116 | attributes = dict(re.findall(attr_pattern, attributes_str)) 117 | 118 | return { 119 | 'content': tag_content, 120 | 'attributes': attributes 121 | } 122 | 123 | except Exception as exc: 124 | logging.error("Error parsing XML tag with attributes '%s': %s", tag_name, exc) 125 | return None 126 | -------------------------------------------------------------------------------- /aphra/core/llm_client.py: -------------------------------------------------------------------------------- 1 | """ 2 | Module for interacting with the model via the OpenRouter API. 3 | """ 4 | 5 | import logging 6 | import toml 7 | import requests 8 | from openai import OpenAI 9 | 10 | class LLMModelClient: 11 | """ 12 | A client for interacting with the model via the OpenRouter API. 13 | """ 14 | 15 | def __init__(self, config_file): 16 | """ 17 | Initializes the LLMModelClient with the configuration from a file. 18 | 19 | :param config_file: Path to the TOML file containing the configuration. 20 | """ 21 | self.load_config(config_file) 22 | self.client = OpenAI( 23 | base_url="https://openrouter.ai/api/v1", 24 | api_key=self.api_key_openrouter 25 | ) 26 | self.logging_configured = False 27 | 28 | def load_config(self, config_file_path): 29 | """ 30 | Loads configuration from a TOML file. 31 | 32 | :param config_file_path: Path to the TOML file. 33 | """ 34 | try: 35 | with open(config_file_path, 'r', encoding='utf-8') as file: 36 | config = toml.load(file) 37 | self.api_key_openrouter = config['openrouter']['api_key'] 38 | except FileNotFoundError: 39 | logging.error('File not found: %s', config_file_path) 40 | raise 41 | except toml.TomlDecodeError: 42 | logging.error('Error decoding TOML file: %s', config_file_path) 43 | raise 44 | except KeyError as exc: 45 | logging.error('Missing key in config file: %s', exc) 46 | raise 47 | 48 | def call_model(self, system_prompt, user_prompt, model_name, *, 49 | log_call=False, enable_web_search=False, 50 | web_search_context="high"): 51 | """ 52 | Calls the model with the provided prompts. 53 | 54 | :param system_prompt: The system prompt to set the context for the model. 55 | :param user_prompt: The user prompt to send to the model. 56 | :param model_name: The name of the model to use. 57 | :param log_call: Boolean indicating whether to log the call details. 58 | :param enable_web_search: Boolean indicating whether to enable web search via OpenRouter. 59 | :param web_search_context: Context size for web search ('low', 'medium', 'high'). 60 | :return: The model's response. 
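        A minimal usage sketch (the model id shown is just this project's
        default writer; any OpenRouter model id works)::

            client = LLMModelClient("config.toml")
            reply = client.call_model(
                "You are a concise assistant.",
                "Say hello in French.",
                "anthropic/claude-sonnet-4",
                log_call=False
            )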
61 | """ 62 | response = None 63 | try: 64 | # Prepare the request parameters 65 | request_params = { 66 | "model": model_name, 67 | "messages": [ 68 | {"role": "system", "content": system_prompt}, 69 | {"role": "user", "content": user_prompt} 70 | ] 71 | } 72 | 73 | # Add web search capabilities if enabled (OpenRouter format) 74 | if enable_web_search: 75 | # Append :online to model name for web search 76 | if not model_name.endswith(":online"): 77 | request_params["model"] = f"{model_name}:online" 78 | 79 | # Add web search options 80 | request_params["web_search_options"] = { 81 | "search_context_size": web_search_context 82 | } 83 | 84 | response = self.client.chat.completions.create(**request_params) 85 | response_content = response.choices[0].message.content 86 | 87 | if log_call: 88 | self.log_model_call(user_prompt, response_content) 89 | 90 | return response_content 91 | except requests.exceptions.RequestException as exc: 92 | logging.error('Request error: %s', exc) 93 | raise 94 | except (ValueError, KeyError, TypeError) as exc: 95 | logging.error('Error parsing response: %s', exc) 96 | if response and hasattr(response, 'text'): 97 | logging.error('Response content: %s', response.text) 98 | else: 99 | logging.error('No response available') 100 | raise 101 | 102 | def log_model_call(self, user_prompt, response): 103 | """ 104 | Logs the details of a model call to a log file. 105 | 106 | :param user_prompt: The user prompt sent to the model. 107 | :param response: The response received from the model. 108 | """ 109 | if not self.logging_configured: 110 | logging.basicConfig(filename='aphra.log', level=logging.INFO, 111 | format='%(asctime)s - %(levelname)s - %(message)s') 112 | self.logging_configured = True 113 | 114 | logging.info("\nUSER_PROMPT\n") 115 | logging.info(user_prompt) 116 | logging.info("\nRESPONSE\n") 117 | logging.info(response) 118 | -------------------------------------------------------------------------------- /tests/test_core_parsers.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for the generic XML parser functions in the core module. 3 | 4 | These tests verify the generic parsing functionality that can be used 5 | across all workflows. 6 | """ 7 | 8 | import unittest 9 | from aphra.core.parsers import ( 10 | parse_xml_tag, 11 | parse_multiple_xml_tags, 12 | parse_xml_tag_with_attributes 13 | ) 14 | 15 | 16 | class TestCoreParsers(unittest.TestCase): 17 | """ 18 | Test cases for the core generic XML parser functions. 19 | """ 20 | 21 | def test_parse_xml_tag_simple(self): 22 | """ 23 | Test parsing a simple XML tag. 24 | """ 25 | content = "Some text Hello World more text" 26 | result = parse_xml_tag(content, "result") 27 | self.assertEqual(result, "Hello World") 28 | 29 | def test_parse_xml_tag_multiline(self): 30 | """ 31 | Test parsing XML tag with multiline content. 32 | """ 33 | content = """Some text 34 | 35 | Hello 36 | World 37 | 38 | more text""" 39 | result = parse_xml_tag(content, "result") 40 | self.assertEqual(result.strip(), "Hello\n World") 41 | 42 | def test_parse_xml_tag_missing(self): 43 | """ 44 | Test parsing when tag is missing. 45 | """ 46 | content = "Some text without the tag" 47 | result = parse_xml_tag(content, "result") 48 | self.assertIsNone(result) 49 | 50 | def test_parse_xml_tag_empty(self): 51 | """ 52 | Test parsing empty XML tag. 
53 | """ 54 | content = "Some text more text" 55 | result = parse_xml_tag(content, "result") 56 | self.assertEqual(result, "") 57 | 58 | def test_parse_multiple_xml_tags(self): 59 | """ 60 | Test parsing multiple XML tags of the same type. 61 | """ 62 | content = "Text First more Second end" 63 | results = parse_multiple_xml_tags(content, "item") 64 | self.assertEqual(len(results), 2) 65 | self.assertEqual(results[0], "First") 66 | self.assertEqual(results[1], "Second") 67 | 68 | def test_parse_multiple_xml_tags_empty(self): 69 | """ 70 | Test parsing when no matching tags exist. 71 | """ 72 | content = "Text without any matching tags" 73 | results = parse_multiple_xml_tags(content, "item") 74 | self.assertEqual(len(results), 0) 75 | 76 | def test_parse_multiple_xml_tags_nested_content(self): 77 | """ 78 | Test parsing multiple XML tags with nested content. 79 | """ 80 | content = """ 81 | 82 | First 83 | A 84 | 85 | 86 | Second 87 | B 88 | 89 | """ 90 | results = parse_multiple_xml_tags(content, "item") 91 | self.assertEqual(len(results), 2) 92 | self.assertIn("First", results[0]) 93 | self.assertIn("Second", results[1]) 94 | 95 | def test_parse_xml_tag_with_attributes_simple(self): 96 | """ 97 | Test parsing XML tag with simple attributes. 98 | """ 99 | content = 'Text Hello World' 100 | result = parse_xml_tag_with_attributes(content, "result") 101 | 102 | self.assertIsInstance(result, dict) 103 | self.assertEqual(result['content'], "Hello World") 104 | self.assertEqual(result['attributes']['type'], "success") 105 | 106 | def test_parse_xml_tag_with_attributes_multiple(self): 107 | """ 108 | Test parsing XML tag with multiple attributes. 109 | """ 110 | content = 'Text Hello World' 111 | result = parse_xml_tag_with_attributes(content, "result") 112 | 113 | self.assertEqual(result['content'], "Hello World") 114 | self.assertEqual(result['attributes']['type'], "success") 115 | self.assertEqual(result['attributes']['code'], "200") 116 | 117 | def test_parse_xml_tag_with_attributes_no_attributes(self): 118 | """ 119 | Test parsing XML tag without attributes. 120 | """ 121 | content = 'Text Hello World' 122 | result = parse_xml_tag_with_attributes(content, "result") 123 | 124 | self.assertEqual(result['content'], "Hello World") 125 | self.assertEqual(len(result['attributes']), 0) 126 | 127 | def test_parse_xml_tag_with_attributes_missing(self): 128 | """ 129 | Test parsing when tag with attributes is missing. 130 | """ 131 | content = "Text without the tag" 132 | result = parse_xml_tag_with_attributes(content, "result") 133 | self.assertIsNone(result) 134 | 135 | 136 | if __name__ == '__main__': 137 | unittest.main() -------------------------------------------------------------------------------- /aphra/workflows/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Workflow implementations with automatic discovery. 3 | 4 | This module automatically discovers and imports all workflow classes 5 | from subdirectories, making it easy to add new workflows without 6 | modifying this file. 
7 | """ 8 | 9 | import os 10 | import importlib 11 | import logging 12 | from typing import List, Type, Dict 13 | 14 | # Import the base class for type checking 15 | try: 16 | from ..core.workflow import AbstractWorkflow 17 | except ImportError: 18 | # Fallback for cases where core is not yet available 19 | AbstractWorkflow = None 20 | 21 | logger = logging.getLogger(__name__) 22 | 23 | # Initialize __all__ as empty list - will be populated by auto-discovery 24 | __all__ = [] 25 | 26 | def _discover_workflows() -> Dict[str, Type]: 27 | """ 28 | Auto-discover workflow classes from subdirectories. 29 | 30 | Scans all subdirectories of the workflows package and looks for 31 | classes that inherit from AbstractWorkflow. 32 | 33 | Returns: 34 | Dict[str, Type]: Mapping of class name to workflow class 35 | """ 36 | workflows = {} 37 | current_dir = os.path.dirname(__file__) 38 | 39 | if not current_dir: 40 | logger.warning("Could not determine workflows directory") 41 | return workflows 42 | 43 | try: 44 | # Scan all items in the workflows directory 45 | for item in os.listdir(current_dir): 46 | item_path = os.path.join(current_dir, item) 47 | 48 | # Skip files and special directories 49 | if not os.path.isdir(item_path) or item.startswith('__'): 50 | continue 51 | 52 | # Skip if no __init__.py (not a proper Python package) 53 | init_file = os.path.join(item_path, '__init__.py') 54 | if not os.path.exists(init_file): 55 | logger.debug("Skipping %s: no __init__.py found", item) 56 | continue 57 | 58 | try: 59 | # Import the workflow package 60 | module = importlib.import_module(f'.{item}', package=__name__) 61 | logger.debug("Successfully imported workflow package: %s", item) 62 | 63 | # Look for workflow classes in the module 64 | workflow_classes_found = 0 65 | for attr_name in dir(module): 66 | attr = getattr(module, attr_name, None) 67 | 68 | # Check if it's a class that inherits from AbstractWorkflow 69 | if (isinstance(attr, type) and 70 | AbstractWorkflow is not None and 71 | issubclass(attr, AbstractWorkflow) and 72 | attr != AbstractWorkflow): 73 | 74 | workflows[attr_name] = attr 75 | workflow_classes_found += 1 76 | logger.debug("Discovered workflow: %s from %s", attr_name, item) 77 | 78 | if workflow_classes_found == 0: 79 | logger.warning("No workflow classes found in %s", item) 80 | 81 | except ImportError as exc: 82 | logger.warning("Failed to import workflow package %s: %s", item, exc) 83 | continue 84 | except Exception as exc: 85 | logger.error("Unexpected error while discovering workflow %s: %s", item, exc) 86 | continue 87 | 88 | except OSError as exc: 89 | logger.error("Failed to scan workflows directory: %s", exc) 90 | 91 | logger.debug("Workflow discovery completed. Found %d workflows: %s", 92 | len(workflows), list(workflows.keys())) 93 | return workflows 94 | 95 | def _setup_module_exports(workflows: Dict[str, Type]) -> List[str]: 96 | """ 97 | Set up module-level exports for discovered workflows. 
98 | 99 | Args: 100 | workflows: Dictionary of workflow name to class mappings 101 | 102 | Returns: 103 | List[str]: List of exported workflow class names 104 | """ 105 | exported_classes = [] 106 | 107 | # Add each workflow class to the module globals and collect names 108 | for class_name, workflow_class in workflows.items(): 109 | globals()[class_name] = workflow_class 110 | exported_classes.append(class_name) 111 | 112 | # Sort for consistency 113 | exported_classes.sort() 114 | return exported_classes 115 | 116 | # Perform auto-discovery 117 | logger.debug("Starting workflow auto-discovery...") 118 | _discovered_workflows = _discover_workflows() 119 | 120 | # Set up module exports 121 | __all__ = _setup_module_exports(_discovered_workflows) 122 | 123 | # Log final state 124 | logger.debug("Workflows module initialized with: %s", __all__) 125 | 126 | # For backward compatibility and explicit access 127 | def get_available_workflows() -> List[str]: 128 | """ 129 | Get a list of all available workflow class names. 130 | 131 | Returns: 132 | List[str]: List of available workflow class names 133 | """ 134 | return list(__all__) 135 | 136 | def get_workflow_class(name: str) -> Type: 137 | """ 138 | Get a workflow class by name. 139 | 140 | Args: 141 | name: The name of the workflow class 142 | 143 | Returns: 144 | Type: The workflow class 145 | 146 | Raises: 147 | AttributeError: If the workflow class is not found 148 | """ 149 | if name not in globals(): 150 | raise AttributeError(f"Workflow class '{name}' not found. Available: {__all__}") 151 | 152 | return globals()[name] 153 | -------------------------------------------------------------------------------- /tests/test_registry.py: -------------------------------------------------------------------------------- 1 | """ 2 | Test cases for the workflow registry and auto-discovery system. 3 | 4 | These tests verify that the registry correctly discovers and manages workflows. 5 | """ 6 | 7 | import unittest 8 | from aphra.core.registry import WorkflowRegistry, get_registry, get_workflow 9 | from aphra.core.workflow import AbstractWorkflow 10 | 11 | 12 | class TestWorkflowRegistry(unittest.TestCase): 13 | """ 14 | Test cases for the workflow registry system. 15 | """ 16 | 17 | def setUp(self): 18 | """Set up test cases with a fresh registry instance.""" 19 | self.registry = WorkflowRegistry() 20 | 21 | def test_registry_initialization(self): 22 | """ 23 | Test that registry initializes and discovers workflows. 24 | """ 25 | # Should discover at least the short_article workflow 26 | workflows = self.registry.list_workflows() 27 | self.assertGreater(len(workflows), 0) 28 | self.assertIn('short_article', workflows) 29 | 30 | def test_get_workflow_by_name(self): 31 | """ 32 | Test retrieving a workflow by name. 33 | """ 34 | workflow = self.registry.get_workflow('short_article') 35 | self.assertIsNotNone(workflow) 36 | self.assertIsInstance(workflow, AbstractWorkflow) 37 | self.assertEqual(workflow.get_workflow_name(), 'short_article') 38 | 39 | def test_get_workflow_missing(self): 40 | """ 41 | Test retrieving a non-existent workflow. 42 | """ 43 | workflow = self.registry.get_workflow('nonexistent_workflow') 44 | self.assertIsNone(workflow) 45 | 46 | def test_suitable_workflow_discovery(self): 47 | """ 48 | Test finding suitable workflow for content. 
49 | """ 50 | # Should find short_article workflow for general text 51 | workflow = self.registry.get_suitable_workflow('Hello world') 52 | self.assertIsNotNone(workflow) 53 | self.assertEqual(workflow.get_workflow_name(), 'short_article') 54 | 55 | def test_suitable_workflow_none_found(self): 56 | """ 57 | Test when no suitable workflow is found. 58 | """ 59 | # Empty text should not match any workflow 60 | workflow = self.registry.get_suitable_workflow('') 61 | self.assertIsNone(workflow) 62 | 63 | def test_workflow_info(self): 64 | """ 65 | Test getting information about a workflow. 66 | """ 67 | info = self.registry.get_workflow_info('short_article') 68 | self.assertIsNotNone(info) 69 | self.assertIn('name', info) 70 | self.assertIn('class', info) 71 | self.assertIn('module', info) 72 | self.assertEqual(info['name'], 'short_article') 73 | 74 | def test_workflow_info_missing(self): 75 | """ 76 | Test getting info for non-existent workflow. 77 | """ 78 | info = self.registry.get_workflow_info('nonexistent_workflow') 79 | self.assertIsNone(info) 80 | 81 | def test_global_registry_singleton(self): 82 | """ 83 | Test that get_registry returns the same instance. 84 | """ 85 | registry1 = get_registry() 86 | registry2 = get_registry() 87 | self.assertIs(registry1, registry2) 88 | 89 | def test_global_get_workflow_function(self): 90 | """ 91 | Test the global get_workflow convenience function. 92 | """ 93 | workflow = get_workflow('short_article') 94 | self.assertIsNotNone(workflow) 95 | self.assertEqual(workflow.get_workflow_name(), 'short_article') 96 | 97 | 98 | class TestWorkflowAutoDiscovery(unittest.TestCase): 99 | """ 100 | Test cases for the auto-discovery system. 101 | """ 102 | 103 | def test_workflows_module_discovery(self): 104 | """ 105 | Test that workflows are auto-discovered in the workflows module. 106 | """ 107 | from aphra.workflows import __all__, get_available_workflows 108 | 109 | # Should discover at least ShortArticleWorkflow 110 | self.assertGreater(len(__all__), 0) 111 | self.assertIn('ShortArticleWorkflow', __all__) 112 | 113 | # get_available_workflows should return the same list 114 | available = get_available_workflows() 115 | self.assertEqual(available, __all__) 116 | 117 | def test_workflow_class_access(self): 118 | """ 119 | Test that discovered workflow classes are accessible. 120 | """ 121 | from aphra.workflows import ShortArticleWorkflow, get_workflow_class 122 | 123 | # Direct import should work 124 | self.assertIsNotNone(ShortArticleWorkflow) 125 | self.assertTrue(issubclass(ShortArticleWorkflow, AbstractWorkflow)) 126 | 127 | # get_workflow_class function should work 128 | workflow_class = get_workflow_class('ShortArticleWorkflow') 129 | self.assertIs(workflow_class, ShortArticleWorkflow) 130 | 131 | def test_workflow_class_access_missing(self): 132 | """ 133 | Test accessing non-existent workflow class. 134 | """ 135 | from aphra.workflows import get_workflow_class 136 | 137 | with self.assertRaises(AttributeError): 138 | get_workflow_class('NonexistentWorkflow') 139 | 140 | def test_discovery_integration_with_registry(self): 141 | """ 142 | Test that auto-discovery integrates correctly with registry. 
143 | """ 144 | from aphra.workflows import __all__ 145 | 146 | registry = get_registry() 147 | registry_workflows = registry.list_workflows() 148 | 149 | # Every discovered workflow should be registered 150 | # (Though names might be different - class name vs workflow name) 151 | self.assertGreater(len(registry_workflows), 0) 152 | 153 | # At least short_article should be present 154 | self.assertIn('short_article', registry_workflows) 155 | 156 | 157 | if __name__ == '__main__': 158 | unittest.main() -------------------------------------------------------------------------------- /aphra/workflows/short_article/examples/gradio_demo.py: -------------------------------------------------------------------------------- 1 | import os 2 | import tempfile 3 | import gradio as gr 4 | import toml 5 | import requests 6 | import logging 7 | 8 | from ..short_article_workflow import ShortArticleWorkflow 9 | from ....core.context import TranslationContext 10 | 11 | OPENROUTER_MODELS_URL="https://openrouter.ai/api/v1/models" 12 | 13 | theme = gr.themes.Soft( 14 | primary_hue="rose", 15 | secondary_hue="pink", 16 | spacing_size="lg", 17 | ) 18 | 19 | def fetch_openrouter_models(): 20 | """ 21 | Fetch available models from OpenRouter API. 22 | Returns a list of model IDs (names). 23 | """ 24 | try: 25 | response = requests.get(OPENROUTER_MODELS_URL, timeout=10) 26 | response.raise_for_status() 27 | data = response.json() 28 | 29 | # Extract model IDs from the response 30 | models = [model['id'] for model in data.get('data', [])] 31 | return sorted(models) 32 | except requests.RequestException as e: 33 | logging.warning(f"Failed to fetch models from OpenRouter: {e}") 34 | # Fallback to default models if API fails 35 | return [ 36 | "anthropic/claude-sonnet-4", 37 | "perplexity/sonar" 38 | ] 39 | 40 | def get_default_models(): 41 | """Get default model selections for different roles.""" 42 | models = fetch_openrouter_models() 43 | 44 | # Default selections based on common good models 45 | writer_default = "anthropic/claude-sonnet-4" 46 | searcher_default = "perplexity/sonar" 47 | critic_default = "anthropic/claude-sonnet-4" 48 | 49 | # Use fallbacks if defaults not available 50 | if writer_default not in models and models: 51 | writer_default = models[0] 52 | if searcher_default not in models and models: 53 | searcher_default = models[0] 54 | if critic_default not in models and models: 55 | critic_default = models[0] 56 | 57 | return models, writer_default, searcher_default, critic_default 58 | 59 | def create_config_file(api_key, writer_model, searcher_model, critic_model): 60 | config = { 61 | "openrouter": {"api_key": api_key}, 62 | "llms": { 63 | "writer": writer_model, 64 | "searcher": searcher_model, 65 | "critiquer": critic_model 66 | } 67 | } 68 | with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.toml') as tmp: 69 | toml.dump(config, tmp) 70 | return tmp.name 71 | 72 | def process_input(file, text_input, api_key, writer_model, searcher_model, critic_model, source_lang, target_lang): 73 | if file is not None: 74 | with open(file, 'r', encoding='utf-8') as file: 75 | text = file.read() 76 | else: 77 | text = text_input 78 | 79 | config_file = create_config_file(api_key, writer_model, searcher_model, critic_model) 80 | 81 | try: 82 | # Create translation context 83 | context = TranslationContext( 84 | source_language=source_lang, 85 | target_language=target_lang, 86 | text=text, 87 | config_file=config_file, 88 | log_calls=False 89 | ) 90 | 91 | # Use the specific Short Article workflow 
92 | workflow = ShortArticleWorkflow() 93 | translation = workflow.run(context) 94 | 95 | finally: 96 | os.unlink(config_file) 97 | 98 | return translation 99 | 100 | def create_interface(): 101 | # Get dynamic model list and defaults 102 | models, writer_default, searcher_default, critic_default = get_default_models() 103 | 104 | with gr.Blocks(theme=theme) as demo: 105 | gr.Markdown("
🌐💬 Aphra - Short Article Demo
") 106 | gr.Markdown( 107 | """
108 | [Project Page] | [Github]
109 | """ 110 | ) 111 | gr.Markdown("🌐💬 This demo shows the **Short Article Workflow** specifically - a 5-step translation process designed for articles and general text content.") 112 | 113 | with gr.Row(): 114 | api_key = gr.Textbox(label="OpenRouter API Key", type="password") 115 | 116 | writer_model = gr.Dropdown( 117 | models, 118 | label="Writer Model", 119 | value=writer_default, 120 | allow_custom_value=True 121 | ) 122 | searcher_model = gr.Dropdown( 123 | models, 124 | label="Searcher Model", 125 | value=searcher_default, 126 | allow_custom_value=True 127 | ) 128 | critic_model = gr.Dropdown( 129 | models, 130 | label="Critic Model", 131 | value=critic_default, 132 | allow_custom_value=True 133 | ) 134 | 135 | with gr.Row(): 136 | source_lang = gr.Dropdown( 137 | ["Spanish", "English", "French", "German"], 138 | label="Source Language", 139 | value="Spanish", 140 | allow_custom_value=True 141 | ) 142 | target_lang = gr.Dropdown( 143 | ["English", "Spanish", "French", "German"], 144 | label="Target Language", 145 | value="English", 146 | allow_custom_value=True 147 | ) 148 | 149 | with gr.Row(): 150 | file = gr.File(label="Upload .txt or .md file", file_types=[".txt", ".md"]) 151 | text_input = gr.Textbox(label="Or paste your text here", lines=5) 152 | 153 | translate_btn = gr.Button("Translate with Short Article Workflow") 154 | 155 | output = gr.Textbox(label="Translation by Short Article Workflow") 156 | 157 | translate_btn.click( 158 | process_input, 159 | inputs=[file, text_input, api_key, writer_model, searcher_model, critic_model, source_lang, target_lang], 160 | outputs=[output] 161 | ) 162 | 163 | return demo 164 | 165 | if __name__ == "__main__": 166 | interface = create_interface() 167 | interface.launch() 168 | -------------------------------------------------------------------------------- /aphra/core/registry.py: -------------------------------------------------------------------------------- 1 | """ 2 | Workflow registry for managing available translation workflows. 3 | 4 | This module provides a centralized registry for discovering and 5 | managing translation workflows. 6 | """ 7 | 8 | import logging 9 | from typing import Dict, List, Optional, Type 10 | from .workflow import AbstractWorkflow 11 | 12 | logger = logging.getLogger(__name__) 13 | 14 | class WorkflowRegistry: 15 | """ 16 | Registry for managing translation workflows. 17 | 18 | This class maintains a registry of available workflows and provides 19 | methods for workflow discovery and selection. Workflows are automatically 20 | discovered and registered from the workflows package. 21 | """ 22 | 23 | def __init__(self): 24 | """Initialize the workflow registry with auto-discovered workflows.""" 25 | self._workflows: Dict[str, Type[AbstractWorkflow]] = {} 26 | self._register_discovered_workflows() 27 | 28 | def _register_discovered_workflows(self): 29 | """Register all auto-discovered workflows from the workflows package.""" 30 | try: 31 | # Import workflows after the module is fully initialized 32 | from .. 
import workflows 33 | 34 | # Get all workflow classes that were auto-discovered 35 | for class_name in workflows.__all__: 36 | workflow_class = getattr(workflows, class_name, None) 37 | if workflow_class is not None: 38 | self.register_workflow(workflow_class) 39 | logger.debug("Auto-registered workflow: %s", class_name) 40 | else: 41 | logger.warning("Failed to get workflow class: %s", class_name) 42 | 43 | except ImportError as exc: 44 | logger.error("Failed to import workflows for auto-registration: %s", exc) 45 | except Exception as exc: 46 | logger.error("Unexpected error during workflow auto-registration: %s", exc) 47 | 48 | def register_workflow(self, workflow_class: Type[AbstractWorkflow]): 49 | """ 50 | Register a new workflow type. 51 | 52 | Args: 53 | workflow_class: The workflow class to register 54 | """ 55 | # Create temporary instance to get the workflow name 56 | temp_workflow = workflow_class() 57 | workflow_name = temp_workflow.get_workflow_name() 58 | self._workflows[workflow_name] = workflow_class 59 | 60 | def get_workflow(self, workflow_name: str) -> Optional[AbstractWorkflow]: 61 | """ 62 | Get a workflow instance by name. 63 | 64 | Args: 65 | workflow_name: The name of the workflow to retrieve 66 | 67 | Returns: 68 | AbstractWorkflow: An instance of the requested workflow, or None if not found 69 | """ 70 | workflow_class = self._workflows.get(workflow_name) 71 | if workflow_class: 72 | return workflow_class() 73 | return None 74 | 75 | def get_suitable_workflow(self, text: str, **kwargs) -> Optional[AbstractWorkflow]: 76 | """ 77 | Find the most suitable workflow for the given content. 78 | 79 | Args: 80 | text: The text content to analyze 81 | **kwargs: Additional parameters for workflow evaluation 82 | 83 | Returns: 84 | AbstractWorkflow: The most suitable workflow instance, or None if none found 85 | """ 86 | # For now, we check workflows in registration order 87 | # In the future, we could implement more sophisticated selection logic 88 | for workflow_class in self._workflows.values(): 89 | workflow = workflow_class() 90 | if workflow.is_suitable_for(text, **kwargs): 91 | return workflow 92 | 93 | return None 94 | 95 | def list_workflows(self) -> List[str]: 96 | """ 97 | Get a list of all registered workflow names. 98 | 99 | Returns: 100 | List[str]: Names of all registered workflows 101 | """ 102 | return list(self._workflows.keys()) 103 | 104 | def get_workflow_info(self, workflow_name: str) -> Optional[Dict[str, str]]: 105 | """ 106 | Get information about a specific workflow. 107 | 108 | Args: 109 | workflow_name: The name of the workflow 110 | 111 | Returns: 112 | Dict[str, str]: Information about the workflow, or None if not found 113 | """ 114 | workflow = self.get_workflow(workflow_name) 115 | if workflow: 116 | return { 117 | 'name': workflow.get_workflow_name(), 118 | 'class': workflow.__class__.__name__, 119 | 'module': workflow.__class__.__module__ 120 | } 121 | return None 122 | 123 | # Global registry instance 124 | _registry = WorkflowRegistry() 125 | 126 | def get_registry() -> WorkflowRegistry: 127 | """ 128 | Get the global workflow registry instance. 129 | 130 | Returns: 131 | WorkflowRegistry: The global registry instance 132 | """ 133 | return _registry 134 | 135 | def register_workflow(workflow_class: Type[AbstractWorkflow]): 136 | """ 137 | Convenient function to register a workflow with the global registry. 
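    For example, given a (hypothetical) MyWorkflow subclass of AbstractWorkflow,
    register_workflow(MyWorkflow) stores it under MyWorkflow().get_workflow_name(),
    so it can later be retrieved with get_workflow() by that name.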
138 | 139 | Args: 140 | workflow_class: The workflow class to register 141 | """ 142 | _registry.register_workflow(workflow_class) 143 | 144 | def get_workflow(workflow_name: str) -> Optional[AbstractWorkflow]: 145 | """ 146 | Convenient function to get a workflow from the global registry. 147 | 148 | Args: 149 | workflow_name: The name of the workflow to retrieve 150 | 151 | Returns: 152 | AbstractWorkflow: An instance of the requested workflow, or None if not found 153 | """ 154 | return _registry.get_workflow(workflow_name) 155 | 156 | def get_suitable_workflow(text: str, **kwargs) -> Optional[AbstractWorkflow]: 157 | """ 158 | Convenient function to find a suitable workflow from the global registry. 159 | 160 | Args: 161 | text: The text content to analyze 162 | **kwargs: Additional parameters for workflow evaluation 163 | 164 | Returns: 165 | AbstractWorkflow: The most suitable workflow instance, or None if none found 166 | """ 167 | return _registry.get_suitable_workflow(text, **kwargs) 168 | -------------------------------------------------------------------------------- /gradio-demo.py: -------------------------------------------------------------------------------- 1 | """ 2 | Gradio web interface demo for Aphra translation system. 3 | 4 | This module provides a user-friendly web interface for the Aphra translation 5 | system using Gradio, allowing users to configure models and translate text 6 | through a browser interface. 7 | """ 8 | import os 9 | import tempfile 10 | import gradio as gr 11 | import toml 12 | import requests 13 | import logging 14 | # Import the translate function 15 | from aphra import translate 16 | 17 | OPENROUTER_MODELS_URL="https://openrouter.ai/api/v1/models" 18 | 19 | theme = gr.themes.Soft( 20 | primary_hue="rose", 21 | secondary_hue="pink", 22 | spacing_size="lg", 23 | ) 24 | 25 | def fetch_openrouter_models(): 26 | """ 27 | Fetch available models from OpenRouter API. 28 | Returns a list of model IDs (names). 29 | """ 30 | try: 31 | response = requests.get(OPENROUTER_MODELS_URL, timeout=10) 32 | response.raise_for_status() 33 | data = response.json() 34 | 35 | # Extract model IDs from the response 36 | models = [model['id'] for model in data.get('data', [])] 37 | return sorted(models) 38 | except requests.RequestException as e: 39 | logging.warning(f"Failed to fetch models from OpenRouter: {e}") 40 | # Fallback to default models if API fails 41 | return [ 42 | "anthropic/claude-sonnet-4", 43 | "perplexity/sonar" 44 | ] 45 | 46 | def get_default_models(): 47 | """Get default model selections for different roles.""" 48 | models = fetch_openrouter_models() 49 | 50 | # Default selections based on common good models 51 | writer_default = "anthropic/claude-sonnet-4" 52 | searcher_default = "perplexity/sonar" 53 | critic_default = "anthropic/claude-sonnet-4" 54 | 55 | # Use fallbacks if defaults not available 56 | if writer_default not in models and models: 57 | writer_default = models[0] 58 | if searcher_default not in models and models: 59 | searcher_default = models[0] 60 | if critic_default not in models and models: 61 | critic_default = models[0] 62 | 63 | return models, writer_default, searcher_default, critic_default 64 | 65 | def create_config_file(api_key, writer_model, searcher_model, critic_model): 66 | """ 67 | Create a temporary TOML configuration file for Aphra. 
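    The generated file has this shape (all values are placeholders)::

        [openrouter]
        api_key = "..."

        [short_article]
        writer = "..."
        searcher = "..."
        critiquer = "..."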
68 | 69 | Args: 70 | api_key: OpenRouter API key 71 | writer_model: Model to use for writing/translation 72 | searcher_model: Model to use for searching/research 73 | critic_model: Model to use for criticism/review 74 | 75 | Returns: 76 | str: Path to the temporary configuration file 77 | """ 78 | config = { 79 | "openrouter": {"api_key": api_key}, 80 | "short_article": { 81 | "writer": writer_model, 82 | "searcher": searcher_model, 83 | "critiquer": critic_model 84 | } 85 | } 86 | with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.toml') as tmp: 87 | toml.dump(config, tmp) 88 | return tmp.name 89 | 90 | def process_input(file, text_input, api_key, writer_model, searcher_model, critic_model, source_lang, target_lang): 91 | """ 92 | Process translation input from either file or text input. 93 | 94 | Args: 95 | file: Uploaded file object (if any) 96 | text_input: Direct text input string 97 | api_key: OpenRouter API key 98 | writer_model: Model for writing/translation 99 | searcher_model: Model for searching/research 100 | critic_model: Model for criticism/review 101 | source_lang: Source language for translation 102 | target_lang: Target language for translation 103 | 104 | Returns: 105 | str: Translated text 106 | """ 107 | if file is not None: 108 | with open(file, 'r', encoding='utf-8') as file: 109 | text = file.read() 110 | else: 111 | text = text_input 112 | config_file = create_config_file(api_key, writer_model, searcher_model, critic_model) 113 | try: 114 | translation = translate( 115 | source_language=source_lang, 116 | target_language=target_lang, 117 | text=text, 118 | config_file=config_file, 119 | log_calls=False 120 | ) 121 | finally: 122 | os.unlink(config_file) 123 | 124 | return translation 125 | 126 | def create_interface(): 127 | """ 128 | Create and configure the Gradio web interface. 129 | 130 | Returns: 131 | gr.Blocks: Configured Gradio interface 132 | """ 133 | # Get dynamic model list and defaults 134 | models, writer_default, searcher_default, critic_default = get_default_models() 135 | 136 | with gr.Blocks(theme=theme) as demo: 137 | gr.Markdown("
🌐💬 Aphra
") 138 | gr.Markdown( 139 | """
140 | [Project Page] | [Github]
141 | """ 142 | ) 143 | gr.Markdown("🌐💬 Aphra is an open-source translation agent with a workflow architecture designed to enhance the quality of text translations by leveraging large language models (LLMs).") 144 | 145 | with gr.Row(): 146 | api_key = gr.Textbox(label="OpenRouter API Key", type="password") 147 | 148 | writer_model = gr.Dropdown( 149 | models, 150 | label="Writer Model", 151 | value=writer_default, 152 | allow_custom_value=True 153 | ) 154 | searcher_model = gr.Dropdown( 155 | models, 156 | label="Searcher Model", 157 | value=searcher_default, 158 | allow_custom_value=True 159 | ) 160 | critic_model = gr.Dropdown( 161 | models, 162 | label="Critic Model", 163 | value=critic_default, 164 | allow_custom_value=True 165 | ) 166 | 167 | with gr.Row(): 168 | source_lang = gr.Dropdown( 169 | ["Spanish", "English", "French", "German"], 170 | label="Source Language", 171 | value="Spanish", 172 | allow_custom_value=True 173 | ) 174 | target_lang = gr.Dropdown( 175 | ["English", "Spanish", "French", "German"], 176 | label="Target Language", 177 | value="English", 178 | allow_custom_value=True 179 | ) 180 | 181 | with gr.Row(): 182 | file = gr.File(label="Upload .txt or .md file", file_types=[".txt", ".md"]) 183 | text_input = gr.Textbox(label="Or paste your text here", lines=5) 184 | 185 | translate_btn = gr.Button("Translate with 🌐💬 Aphra") 186 | 187 | output = gr.Textbox(label="Translation by 🌐💬 Aphra") 188 | 189 | translate_btn.click( 190 | process_input, 191 | inputs=[file, text_input, api_key, writer_model, searcher_model, critic_model, source_lang, target_lang], 192 | outputs=[output] 193 | ) 194 | 195 | return demo 196 | 197 | if __name__ == "__main__": 198 | interface = create_interface() 199 | interface.launch() 200 | -------------------------------------------------------------------------------- /aphra/workflows/short_article/short_article_workflow.py: -------------------------------------------------------------------------------- 1 | """ 2 | Short Article workflow implementation. 3 | 4 | This workflow implements the 5-step translation process for articles 5 | and similar content types. 6 | """ 7 | 8 | from typing import List, Dict, Any 9 | from ...core.context import TranslationContext 10 | from ...core.prompts import get_prompt 11 | from ...core.workflow import AbstractWorkflow 12 | from .aux.parsers import parse_analysis, parse_translation 13 | 14 | class ShortArticleWorkflow(AbstractWorkflow): 15 | """ 16 | Workflow for translating articles and similar content. 17 | 18 | This workflow implements the proven 5-step process using direct methods: 19 | 1. analyze() - Identify key terms and concepts 20 | 2. search() - Generate contextual explanations with web search 21 | 3. translate() - Create initial translation 22 | 4. critique() - Evaluate translation quality 23 | 5. refine() - Produce final improved translation 24 | 25 | To customize: simply inherit from this class and override any method. 26 | """ 27 | 28 | def get_workflow_name(self) -> str: 29 | """Get the unique name of this workflow.""" 30 | return "short_article" 31 | 32 | def is_suitable_for(self, text: str, **_kwargs) -> bool: 33 | """ 34 | Determine if this workflow is suitable for the given content. 
35 | 36 | This workflow is suitable for: 37 | - Articles and blog posts 38 | - General text content 39 | - Serves as the default workflow when no other workflow matches 40 | 41 | Args: 42 | text: The text content to evaluate 43 | **kwargs: Additional evaluation parameters 44 | 45 | Returns: 46 | bool: True if this workflow is suitable 47 | """ 48 | # This workflow accepts any non-empty text 49 | return len(text.strip()) > 0 50 | 51 | def analyze(self, context: TranslationContext, text: str) -> List[Dict[str, Any]]: 52 | """ 53 | Analyze the source text to identify key terms and concepts. 54 | 55 | Args: 56 | context: The translation context 57 | text: The text to analyze 58 | 59 | Returns: 60 | List[Dict]: Parsed analysis results with term names and keywords 61 | """ 62 | # Get writer model from workflow configuration 63 | writer_model = context.get_workflow_config('writer') 64 | 65 | # Get prompts for analysis 66 | system_prompt = get_prompt( 67 | 'short_article', 68 | 'step1_system.txt', 69 | post_content=text, 70 | source_language=context.source_language, 71 | target_language=context.target_language 72 | ) 73 | user_prompt = get_prompt( 74 | 'short_article', 75 | 'step1_user.txt', 76 | post_content=text, 77 | source_language=context.source_language, 78 | target_language=context.target_language 79 | ) 80 | 81 | # Call LLM for analysis 82 | analysis_content = context.model_client.call_model( 83 | system_prompt, 84 | user_prompt, 85 | writer_model, 86 | log_call=context.log_calls 87 | ) 88 | 89 | # Parse and return analysis 90 | return parse_analysis(analysis_content) 91 | 92 | def search(self, context: TranslationContext, parsed_items: List[Dict[str, Any]]) -> str: 93 | """ 94 | Generate contextual explanations for analyzed terms using web search. 95 | 96 | Args: 97 | context: The translation context 98 | parsed_items: List of terms from analysis step 99 | 100 | Returns: 101 | str: Formatted glossary content 102 | """ 103 | if not parsed_items: 104 | return "" 105 | 106 | # Get searcher model from workflow configuration 107 | searcher_model = context.get_workflow_config('searcher') 108 | glossary = [] 109 | 110 | for item in parsed_items: 111 | # Generate explanation for each term using web search 112 | term_explanation = self._generate_term_explanation(context, item, searcher_model) 113 | 114 | # Format glossary entry 115 | glossary_entry = ( 116 | f"### {item['name']}\n\n**Keywords:** {', '.join(item['keywords'])}\n\n" 117 | f"**Explanation:**\n{term_explanation}\n" 118 | ) 119 | glossary.append(glossary_entry) 120 | 121 | return "\n".join(glossary) 122 | 123 | def translate(self, context: TranslationContext, text: str) -> str: 124 | """ 125 | Create the initial translation of the source text. 
126 | 127 | Args: 128 | context: The translation context 129 | text: The text to translate 130 | 131 | Returns: 132 | str: The initial translation 133 | """ 134 | # Get writer model from workflow configuration 135 | writer_model = context.get_workflow_config('writer') 136 | 137 | # Get prompts for translation 138 | system_prompt = get_prompt( 139 | 'short_article', 140 | 'step3_system.txt', 141 | text=text, 142 | source_language=context.source_language, 143 | target_language=context.target_language 144 | ) 145 | user_prompt = get_prompt( 146 | 'short_article', 147 | 'step3_user.txt', 148 | text=text, 149 | source_language=context.source_language, 150 | target_language=context.target_language 151 | ) 152 | 153 | # Call LLM for translation 154 | return context.model_client.call_model( 155 | system_prompt, 156 | user_prompt, 157 | writer_model, 158 | log_call=context.log_calls 159 | ) 160 | 161 | def critique(self, context: TranslationContext, text: str, 162 | translation: str, glossary: str) -> str: 163 | """ 164 | Evaluate the translation quality and provide feedback. 165 | 166 | Args: 167 | context: The translation context 168 | text: The original text 169 | translation: The initial translation 170 | glossary: The glossary from search step 171 | 172 | Returns: 173 | str: Critique and feedback 174 | """ 175 | # Get critiquer model from workflow configuration 176 | critiquer_model = context.get_workflow_config('critiquer') 177 | 178 | # Get prompts for critique 179 | system_prompt = get_prompt( 180 | 'short_article', 181 | 'step4_system.txt', 182 | text=text, 183 | translation=translation, 184 | glossary=glossary, 185 | source_language=context.source_language, 186 | target_language=context.target_language 187 | ) 188 | user_prompt = get_prompt( 189 | 'short_article', 190 | 'step4_user.txt', 191 | text=text, 192 | translation=translation, 193 | glossary=glossary, 194 | source_language=context.source_language, 195 | target_language=context.target_language 196 | ) 197 | 198 | # Call LLM for critique 199 | return context.model_client.call_model( 200 | system_prompt, 201 | user_prompt, 202 | critiquer_model, 203 | log_call=context.log_calls 204 | ) 205 | 206 | def refine(self, context: TranslationContext, text: str, *, 207 | translation: str, glossary: str, critique: str) -> str: 208 | """ 209 | Produce the final refined translation based on critique feedback. 
210 | 211 | Args: 212 | context: The translation context 213 | text: The original text 214 | translation: The initial translation 215 | glossary: The glossary from search step 216 | critique: The critique feedback 217 | 218 | Returns: 219 | str: The final refined translation 220 | """ 221 | # Get writer model from workflow configuration 222 | writer_model = context.get_workflow_config('writer') 223 | 224 | # Get prompts for refinement 225 | system_prompt = get_prompt( 226 | 'short_article', 227 | 'step5_system.txt', 228 | text=text, 229 | translation=translation, 230 | glossary=glossary, 231 | critique=critique, 232 | source_language=context.source_language, 233 | target_language=context.target_language 234 | ) 235 | user_prompt = get_prompt( 236 | 'short_article', 237 | 'step5_user.txt', 238 | text=text, 239 | translation=translation, 240 | glossary=glossary, 241 | critique=critique, 242 | source_language=context.source_language, 243 | target_language=context.target_language 244 | ) 245 | 246 | # Call LLM for refinement 247 | final_translation_content = context.model_client.call_model( 248 | system_prompt, 249 | user_prompt, 250 | writer_model, 251 | log_call=context.log_calls 252 | ) 253 | 254 | # Parse and return final translation 255 | return parse_translation(final_translation_content) 256 | 257 | def execute(self, context: TranslationContext, text: str) -> str: 258 | """ 259 | Execute the complete short article workflow. 260 | 261 | This method orchestrates the 5-step process in sequence. 262 | 263 | Args: 264 | context: The translation context 265 | text: The text to translate 266 | 267 | Returns: 268 | str: The final refined translation 269 | """ 270 | # Step 1: Analyze the text to identify key terms 271 | analysis = self.analyze(context, text) 272 | 273 | # Step 2: Search for contextual information about the terms 274 | glossary = self.search(context, analysis) 275 | 276 | # Step 3: Create initial translation 277 | translation = self.translate(context, text) 278 | 279 | # Step 4: Critique the translation 280 | critique = self.critique(context, text, translation, glossary) 281 | 282 | # Step 5: Refine the translation based on critique 283 | final_translation = self.refine(context, text, translation=translation, 284 | glossary=glossary, critique=critique) 285 | 286 | return final_translation 287 | 288 | def _generate_term_explanation(self, context: TranslationContext, 289 | item: Dict[str, Any], model: str) -> str: 290 | """ 291 | Generate explanation for a single term using web search. 
292 | 293 | Args: 294 | context: The translation context 295 | item: Dictionary with 'name' and 'keywords' keys 296 | model: The model to use for generation 297 | 298 | Returns: 299 | str: The generated explanation with web search results 300 | """ 301 | system_prompt = get_prompt( 302 | 'short_article', 303 | 'step2_system.txt', 304 | term=item['name'], 305 | keywords=", ".join(item['keywords']), 306 | source_language=context.source_language, 307 | target_language=context.target_language 308 | ) 309 | user_prompt = get_prompt( 310 | 'short_article', 311 | 'step2_user.txt', 312 | term=item['name'], 313 | keywords=", ".join(item['keywords']), 314 | source_language=context.source_language, 315 | target_language=context.target_language 316 | ) 317 | 318 | return context.model_client.call_model( 319 | system_prompt, 320 | user_prompt, 321 | model, 322 | log_call=context.log_calls, 323 | enable_web_search=True, 324 | web_search_context="high" 325 | ) 326 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | 2 | # Contributing to 🌐💬 Aphra 3 | 4 | First off, thanks for taking the time to contribute! Your help is greatly appreciated. 5 | 6 | All types of contributions are encouraged and valued, whether it's code, documentation, suggestions for new features, or bug reports. Please read through the following guidelines before contributing to ensure a smooth process for everyone involved. 7 | 8 | > And if you like the project, but just don't have time to contribute, that's fine. There are other easy ways to support the project and show your appreciation, which we would also be very happy about: 9 | > - Star the project 10 | > - Tweet about it 11 | > - Refer this project in your project's README 12 | > - Mention the project at local meetups and tell your friends/colleagues 13 | 14 | 15 | ## Table of Contents 16 | 17 | - [I Have a Question](#i-have-a-question) 18 | - [I Want To Contribute](#i-want-to-contribute) 19 | - [Reporting Bugs](#reporting-bugs) 20 | - [Suggesting Enhancements](#suggesting-enhancements) 21 | - [Your First Code Contribution](#your-first-code-contribution) 22 | - [Improving The Documentation](#improving-the-documentation) 23 | - [Styleguides](#styleguides) 24 | - [Commit Messages](#commit-messages) 25 | 26 | ## I Have a Question 27 | 28 | If you want to ask a question, we assume that you have read the available [Documentation](https://github.com/DavidLMS/aphra/blob/main/README.md). 29 | 30 | Before you ask a question, it is best to search for existing [Issues](https://github.com/DavidLMS/aphra/issues) that might help you. If you find a relevant issue but still need clarification, feel free to comment on it. Additionally, it’s a good idea to search the web for answers before asking. 31 | 32 | If you still need to ask a question, we recommend the following: 33 | 34 | - Open an [Issue](https://github.com/DavidLMS/aphra/issues/new). 35 | - Provide as much context as you can about what you're running into. 36 | - Provide project and platform versions (Python, OS, etc.), depending on what seems relevant. 37 | 38 | We (or someone in the community) will then take care of the issue as soon as possible. 
39 | 40 | ## I Want To Contribute 41 | 42 | > ### Legal Notice 43 | > When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content, and that the content you contribute may be provided under the project license. 44 | 45 | ### Reporting Bugs 46 | 47 | #### Before Submitting a Bug Report 48 | 49 | A good bug report shouldn't leave others needing to chase you up for more information. Please investigate carefully, collect information, and describe the issue in detail in your report. Follow these steps to help us fix any potential bugs as quickly as possible: 50 | 51 | - Ensure you are using the latest version. 52 | - Verify that your issue is not due to misconfiguration or environmental issues. Make sure you have read the [documentation](https://github.com/DavidLMS/aphra/blob/main/README.md). 53 | - Check if the issue has already been reported by searching the [bug tracker](https://github.com/DavidLMS/aphra/issues?q=label%3Abug). 54 | - Gather as much information as possible about the bug: 55 | - Stack trace (if applicable) 56 | - OS, platform, and version (Windows, Linux, macOS, etc.) 57 | - Python version and any relevant package versions 58 | - Steps to reliably reproduce the issue 59 | 60 | #### How Do I Submit a Good Bug Report? 61 | 62 | > Do not report security-related issues, vulnerabilities, or bugs with sensitive information in public forums. Instead, report these issues privately by emailing hola_at_davidlms.com. 63 | 64 | We use GitHub issues to track bugs and errors. If you run into an issue with the project: 65 | 66 | - Open an [Issue](https://github.com/DavidLMS/aphra/issues/new). (Since we can't be sure yet if it’s a bug, avoid labeling it as such until confirmed.) 67 | - Explain the behavior you expected and what actually happened. 68 | - Provide as much context as possible and describe the steps someone else can follow to recreate the issue. This usually includes a code snippet or an example project. 69 | 70 | Once it's filed: 71 | 72 | - The project team will label the issue accordingly. 73 | - A team member will try to reproduce the issue. If the issue cannot be reproduced, the team will ask for more information and label the issue as `needs-repro`. 74 | - If the issue is reproducible, it will be labeled `needs-fix` and potentially other relevant tags. 75 | 76 | ### Suggesting Enhancements 77 | 78 | This section guides you through submitting an enhancement suggestion for 🌐💬 Aphra, whether it's a new feature or an improvement to existing functionality. 79 | 80 | #### Before Submitting an Enhancement 81 | 82 | - Ensure you are using the latest version. 83 | - Check the [documentation](https://github.com/DavidLMS/aphra/blob/main/README.md) to see if your suggestion is already supported. 84 | - Search the [issue tracker](https://github.com/DavidLMS/aphra/issues) to see if the enhancement has already been suggested. If so, add a comment to the existing issue instead of opening a new one. 85 | - Make sure your suggestion aligns with the scope and aims of the project. It's important to suggest features that will be beneficial to the majority of users. 86 | 87 | #### How Do I Submit a Good Enhancement Suggestion? 88 | 89 | Enhancement suggestions are tracked as [GitHub issues](https://github.com/DavidLMS/aphra/issues). 90 | 91 | - Use a **clear and descriptive title** for the suggestion. 92 | - Provide a **detailed description** of the enhancement, including any relevant context. 
93 | - **Describe the current behavior** and **explain what you would expect instead**, along with reasons why the enhancement would be beneficial.
94 | - Include **screenshots or diagrams** if applicable to help illustrate the suggestion.
95 | - Explain why this enhancement would be useful to most `🌐💬 Aphra` users.
96 | 
97 | ### Your First Code Contribution
98 | 
99 | #### Pre-requisites
100 | 
101 | You should first [fork](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo) the `🌐💬 Aphra` repository and then clone your forked repository:
102 | 
103 | ```bash
104 | git clone https://github.com/<your-username>/aphra.git
105 | ```
106 | 
107 | Once in the cloned repository directory, create a new branch for your contribution:
108 | 
109 | ```bash
110 | git checkout -B <your-branch-name>
111 | ```
112 | 
113 | ### Understanding the Architecture
114 | 
115 | 🌐💬 Aphra uses a modern workflow architecture with auto-discovery and self-contained workflows:
116 | 
117 | - **Core System** (`aphra/core/`): Base classes, context management, configuration system, and auto-discovery registry
118 | - **Self-Contained Workflows** (`aphra/workflows/`): Complete workflow packages with their own prompts, configuration, parsers, examples, and tests
119 | - **Auto-Discovery**: Workflows are automatically detected and registered - no manual registration needed
120 | - **Workflow-Specific Configuration**: Each workflow has its own `config/default.toml` with LLM models and parameters
121 | 
122 | #### Current Workflow Structure:
123 | ```
124 | aphra/workflows/short_article/
125 | ├── config/
126 | │   └── default.toml             # Workflow-specific configuration
127 | ├── docs/                        # Optional workflow documentation
128 | │   ├── README.md                # Complete workflow documentation
129 | │   ├── workflow-diagram.md      # Mermaid diagram source
130 | │   └── workflow-diagram.png     # Generated diagram image
131 | ├── examples/
132 | │   ├── gradio_demo.py           # Interactive web demo
133 | │   └── simple_demo.py           # Basic usage example
134 | ├── prompts/
135 | │   ├── step1_system.txt         # Prompt templates
136 | │   └── ...
137 | ├── aux/
138 | │   └── parsers.py               # Workflow-specific utilities
139 | ├── tests/
140 | │   ├── test_parsers.py          # Workflow tests
141 | │   └── test_prompts.py
142 | └── short_article_workflow.py    # Main workflow implementation
143 | ```
144 | 
145 | ### Contributing Workflow
146 | 
147 | 1. **Understand the Component You're Modifying:**
148 |    - **Core System** (`aphra/core/`): Changes here affect the entire system. Use caution and ensure backwards compatibility.
149 |    - **Existing Workflows**: Modify within the workflow's own directory structure. All changes stay self-contained.
150 |    - **New Workflows**: Follow the auto-discovery structure - no manual registration required.
151 | 
152 | 2. **Development Guidelines:**
153 |    - Make sure your code follows the style guide and passes linting with `pylint`.
154 |    - Write tests for any new functionality you add.
155 |    - For new workflows, inherit from `AbstractWorkflow` and implement required methods.
156 |    - Use the workflow-specific configuration system (`config/default.toml`).
157 |    - Ensure all tests pass before submitting a pull request.
158 |    - Document any changes to APIs or core functionality.
159 | 
160 | 3. 
**Workflow Development:** 161 | - **Configuration**: Define your LLM models and parameters in `config/default.toml` 162 | - **API Access**: Use `context.get_workflow_config('writer')` to access model names 163 | - **Auto-Discovery**: Simply create the directory structure - no manual registration 164 | - **Self-Contained**: Keep all workflow-related code within the workflow directory 165 | 166 | 4. **Testing Guidelines:** 167 | - **Real API Calls**: Tests use real OpenRouter API calls with actual models 168 | - **Configuration**: Tests use `config.toml` with real API keys 169 | - **Structure**: Core tests in `tests/`, workflow tests in `workflows/[name]/tests/` 170 | - **Coverage**: Test individual methods AND complete workflow execution 171 | - **Run Tests**: `python -m pytest tests/ -v` (requires valid API key) 172 | 173 | 5. **Submission Process:** 174 | - Submit your pull request with a clear and descriptive title and description. 175 | - Explain how your changes fit into the self-contained workflow architecture. 176 | - Include examples of how to use any new components you've created. 177 | - Ensure your workflow is fully self-contained and follows the standard structure. 178 | 179 | ## Creating New Workflows 180 | 181 | Adding a new workflow to 🌐💬 Aphra is straightforward thanks to the auto-discovery system. Simply create the directory structure and your workflow will be automatically detected. 182 | 183 | ### Step-by-Step Guide 184 | 185 | #### 1. Create the Workflow Directory Structure 186 | 187 | ```bash 188 | mkdir -p aphra/workflows/my_workflow/{config,examples,prompts,aux,tests} 189 | ``` 190 | 191 | #### 2. Implement the Workflow Class 192 | 193 | Create `aphra/workflows/my_workflow/my_workflow.py`: 194 | 195 | ```python 196 | from typing import Dict, Any 197 | from ...core.context import TranslationContext 198 | from ...core.workflow import AbstractWorkflow 199 | 200 | class MyWorkflow(AbstractWorkflow): 201 | def get_workflow_name(self) -> str: 202 | return "my_workflow" 203 | 204 | def is_suitable_for(self, text: str, **kwargs) -> bool: 205 | # Define when this workflow should be used 206 | return "specific_condition" in text.lower() 207 | 208 | def execute(self, context: TranslationContext, text: str) -> str: 209 | # Access workflow configuration 210 | writer_model = context.get_workflow_config('writer') 211 | 212 | # Implement your translation logic 213 | # Call LLM: context.model_client.call_model(system, user, writer_model) 214 | 215 | return translated_text 216 | ``` 217 | 218 | #### 3. Create Configuration File 219 | 220 | Create `aphra/workflows/my_workflow/config/default.toml`: 221 | 222 | ```toml 223 | # Default configuration for My Workflow 224 | writer = "anthropic/claude-sonnet-4" 225 | searcher = "perplexity/sonar" 226 | critiquer = "anthropic/claude-sonnet-4" 227 | 228 | # Workflow-specific parameters 229 | custom_param = "default_value" 230 | max_retries = 3 231 | ``` 232 | 233 | #### 4. Add Prompt Templates 234 | 235 | Create prompt files in `aphra/workflows/my_workflow/prompts/`: 236 | 237 | ``` 238 | prompts/ 239 | ├── system_prompt.txt 240 | └── user_prompt.txt 241 | ``` 242 | 243 | #### 5. 
Create Examples 244 | 245 | Create usage examples in `aphra/workflows/my_workflow/examples/`: 246 | 247 | ```python 248 | # simple_demo.py 249 | from ..my_workflow import MyWorkflow 250 | from ....core.context import TranslationContext 251 | from ....core.llm_client import LLMModelClient 252 | 253 | def main(): 254 | workflow = MyWorkflow() 255 | model_client = LLMModelClient('config.toml') 256 | context = TranslationContext( 257 | model_client=model_client, 258 | source_language="Spanish", 259 | target_language="English", 260 | log_calls=False 261 | ) 262 | 263 | result = workflow.run(context, "Your text here") 264 | print(result) 265 | ``` 266 | 267 | #### 6. Write Tests 268 | 269 | Create tests in `aphra/workflows/my_workflow/tests/`: 270 | 271 | ```python 272 | # test_my_workflow.py 273 | import unittest 274 | from ..my_workflow import MyWorkflow 275 | 276 | class TestMyWorkflow(unittest.TestCase): 277 | def test_workflow_execution(self): 278 | workflow = MyWorkflow() 279 | # Test with real API calls 280 | # ... 281 | ``` 282 | 283 | #### 7. User Configuration Override 284 | 285 | Users can override your default configuration in their `config.toml`: 286 | 287 | ```toml 288 | [openrouter] 289 | api_key = "user_api_key" 290 | 291 | [my_workflow] 292 | writer = "different/model" 293 | custom_param = "user_value" 294 | ``` 295 | 296 | ### Best Practices 297 | 298 | - **Self-Contained**: Keep everything related to your workflow in its directory 299 | - **Clear Naming**: Use descriptive names for methods and configuration parameters 300 | - **Error Handling**: Handle API failures and malformed responses gracefully 301 | - **Documentation**: Add docstrings explaining what your workflow does and when to use it 302 | - **Real Tests**: Write tests that make actual API calls to validate functionality 303 | - **Examples**: Provide both simple and advanced usage examples 304 | - **Diagrams**: Consider adding workflow diagrams to visualize the process (optional but recommended) 305 | 306 | ### Auto-Discovery 307 | 308 | Once you create the directory structure and implement the class, your workflow is automatically: 309 | 310 | - **Discovered** by the system at runtime 311 | - **Available** through the standard `translate()` function 312 | - **Configurable** through the user's `config.toml` 313 | 314 | No manual registration or configuration is needed! 315 | 316 | ### Adding Workflow Documentation (Optional) 317 | 318 | To provide comprehensive documentation for your workflow, consider adding a `docs/` directory: 319 | 320 | #### Documentation Structure 321 | 322 | ```bash 323 | mkdir -p aphra/workflows/my_workflow/docs 324 | ``` 325 | 326 | Create the following files: 327 | 328 | 1. **`docs/README.md`** - Complete workflow documentation: 329 | ```markdown 330 | # My Workflow 331 | 332 | ## Overview 333 | Brief description of what this workflow does. 334 | 335 | ## When to Use 336 | Explain when this workflow should be used. 337 | 338 | ## Configuration 339 | Document configuration options and defaults. 340 | 341 | ## Usage Examples 342 | Show basic and advanced usage. 343 | ``` 344 | 345 | 2. **`docs/workflow-diagram.md`** (Optional) - Mermaid flowchart: 346 | ```markdown 347 | ```mermaid 348 | flowchart LR 349 | A[Input] --> B[Step 1] 350 | B --> C[Step 2] 351 | C --> D[Output] 352 | ``` 353 | ``` 354 | 355 | 3. **`docs/workflow-diagram.png`** (Optional) - Generated diagram image 356 | 357 | #### Creating Workflow Diagrams 358 | 359 | Workflow diagrams help users understand the process visually. 
Use [Mermaid](https://mermaid.js.org/) syntax: 360 | 361 | **Basic Template:** 362 | ```mermaid 363 | flowchart LR 364 | T[📄 Original Text] 365 | 366 | subgraph "Step 1: Your Process" 367 | A[🤖 LLM Model] --> B[📄 Output] 368 | end 369 | 370 | T --> A 371 | 372 | classDef default fill:#abb,stroke:#333,stroke-width:2px; 373 | classDef robot fill:#bbf,stroke:#333,stroke-width:2px; 374 | classDef document fill:#bfb,stroke:#333,stroke-width:2px; 375 | class A robot; 376 | class T,B document; 377 | ``` 378 | 379 | **Tools for PNG Generation:** 380 | - [Mermaid Live Editor](https://mermaid.live/) 381 | - [mermaid-cli](https://github.com/mermaid-js/mermaid-cli) 382 | - GitHub renders Mermaid automatically in markdown 383 | 384 | #### Documentation Best Practices 385 | 386 | - **Clear Overview**: Explain what problem your workflow solves 387 | - **Usage Guidelines**: When to use vs. when not to use this workflow 388 | - **Configuration Options**: Document all configurable parameters 389 | - **Examples**: Provide both simple and complex usage examples 390 | - **Performance Notes**: Mention typical execution times and API usage 391 | - **Limitations**: Be honest about what your workflow cannot do 392 | 393 | ### Improving The Documentation 394 | 395 | Contributions to documentation are welcome! Well-documented code is easier to understand and maintain. If you see areas where documentation can be improved, feel free to submit your suggestions. 396 | 397 | ### Regenerating API Documentation 398 | 399 | 🌐💬 Aphra uses `pdoc` to automatically generate HTML documentation from Python docstrings. The documentation is stored in the `docs/` directory and includes all modules and their APIs. 400 | 401 | #### Prerequisites 402 | 403 | Ensure `pdoc` is installed: 404 | ```bash 405 | pip install pdoc 406 | ``` 407 | 408 | #### How to regenerate documentation 409 | 410 | **Complete regeneration** (recommended): 411 | ```bash 412 | rm -rf docs && python -m pdoc --include-undocumented -o docs aphra 413 | ``` 414 | 415 | #### Why the flags are needed 416 | 417 | - `--include-undocumented`: Ensures all classes and methods are documented, even without docstrings 418 | - `-o docs`: Specifies the output directory 419 | - `aphra`: The main package to document 420 | 421 | #### Important notes 422 | 423 | **Module imports matter**: For a module to appear in the documentation index, it must be imported in the main `aphra/__init__.py` file and included in `__all__`. For example: 424 | ```python 425 | from . import workflows # Import the module 426 | from . import core 427 | 428 | __all__ = ['translate', 'workflows', 'core'] # Export it 429 | ``` 430 | 431 | **Module structure**: The generated documentation reflects the package structure: 432 | - `docs/aphra.html` - Main package documentation with submodules index 433 | - `docs/aphra/workflows.html` - Workflows module documentation 434 | - `docs/aphra/core.html` - Core components documentation 435 | - etc. 
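
A quick sanity check before regenerating is to confirm that the module you expect to see is actually exported. A minimal sketch, assuming the package is importable (installed, or run from the repository root):

```python
import aphra

# Modules that are not imported in aphra/__init__.py and listed in __all__
# will not appear in the generated documentation index.
print(aphra.__all__)  # expected to include 'translate', 'workflows', 'core' plus any module you added
```

If your new module is missing from this list, it will also be missing from `docs/aphra.html` after regeneration.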
436 | 437 | #### Common issues 438 | 439 | - **Missing modules in index**: Check that the module is imported and exported in `aphra/__init__.py` 440 | - **Outdated function signatures**: Regenerate after making changes to function parameters 441 | - **Empty documentation**: Ensure the `--include-undocumented` flag is used 442 | 443 | #### When to regenerate 444 | 445 | - After adding new workflows, core components, or modules 446 | - After modifying function signatures or class interfaces 447 | - After adding or updating docstrings 448 | - Before releasing new versions 449 | 450 | ## Styleguides 451 | 452 | ### Commit Messages 453 | 454 | - Use clear and descriptive commit messages. 455 | - Follow the general format: `Short summary (50 characters or less)` followed by an optional detailed explanation. 456 | 457 | ### Code Style 458 | 459 | - Ensure your code adheres to the project's coding standards and passes all linting checks with `pylint`. 460 | 461 | ## License 462 | 463 | By contributing to 🌐💬 Aphra, you agree that your contributions will be licensed under the MIT License. 464 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # 🌐💬 Aphra 2 | 3 |

4 | 5 | Pull Requests 6 | 7 | 8 | MIT License 9 | 10 | 11 | Linting: Pylint 12 | 13 | 14 | DeepWiki 15 | 16 |

17 | 18 | 🌐💬 Aphra is an open-source translation agent designed to enhance the quality of text translations by leveraging large language models (LLMs). Unlike traditional translation tools that rely solely on direct translations, Aphra introduces a multi-stage, context-aware process that includes glossary creation, contextual search, critique, and refinement. This approach aims to produce translations that not only retain the original meaning but also incorporate translator notes, contextual adjustments, and stylistic improvements. Whether you're translating blog posts, articles, or complex documents, Aphra ensures a more nuanced and accurate translation that respects the original content's integrity. 19 | 20 | > **Important Note:** 🌐💬 Aphra is not intended to replace the work of a professional translator. Instead, it aims to facilitate multilingual support in small projects where hiring a professional translator may not be feasible. Aphra offers a practical solution for achieving quality translations in contexts where a fully professional translation service is out of scope, ensuring that language barriers do not hinder the global reach of your content. 21 | 22 |

23 | Demo 24 | · 25 | Report Bug 26 | · 27 | Request Feature 28 | · 29 | Wiki 30 |

31 | 32 | ## Table of Contents 33 | 34 | [Motivation](#motivation) 35 | 36 | [Why Aphra?](#why-aphra) 37 | 38 | [How 🌐💬 Aphra Works](#how--aphra-works) 39 | 40 | [Demo](#demo) 41 | 42 | [Getting Started](#getting-started) 43 | 44 | [Customizability and Ideas for Extensions](#customizability-and-ideas-for-extensions) 45 | 46 | [License](#license) 47 | 48 | [Contributing](#contributing) 49 | 50 | [References](#references) 51 | 52 | ## Motivation 53 | 54 | The spark for starting this project came from a desire to challenge myself by designing a complex agentic workflow from scratch. The primary goal here is to learn and grow through the process of building something like this from the ground up. I chose the theme of translation because I've been toying with the idea of publishing [my personal blog](https://davidlms.com) in English as well. I have successfully integrated 🌐💬 Aphra into the publication pipeline, making translations a seamless part of the process. If you're interested in how this was achieved, you can find a detailed guide in the [Wiki](https://github.com/DavidLMS/aphra/wiki/Using-in-a-pipeline). 55 | 56 | As a computer science teacher, I also saw this as a great opportunity to create a small, yet complete, open-source project that follows best practices for publishing on GitHub. That's why there are so many options to get started, all designed with a didactic approach in mind. Any feedback on how to improve in that area is more than welcome. 57 | 58 | ## Why Aphra? 59 | 60 | The name "Aphra" is a tribute to [Aphra Behn](https://en.wikipedia.org/wiki/Aphra_Behn), one of the first English women to earn a living through writing in the 17th century. Behn was a playwright, poet, and translator who broke through significant cultural barriers, making her an early pioneer for women in literature. 61 | 62 | Naming this project after Aphra Behn is a way to honor her legacy of challenging the status quo and shaping the way we think about language and expression. Her influence reminds us of the importance of creating spaces where voices can be heard and ideas can flourish. 63 | 64 | As [Virginia Woolf](https://en.wikipedia.org/wiki/Virginia_Woolf) famously said, "All women together, ought to let flowers fall upon the grave of Aphra Behn... for it was she who earned them the right to speak their minds" (Woolf, Virginia. A Room of One's Own. 1928, at 65). 65 | 66 | ## How 🌐💬 Aphra Works 67 | 68 | 🌐💬 Aphra employs a multi-stage, agentic approach to translation using a **workflow architecture** designed to closely mimic the steps a human translator might take when working on a text. The system is built around workflows that orchestrate translation through specialized methods. 69 | 70 | ### Architecture 71 | 72 | Aphra's architecture consists of several key components: 73 | 74 | - **Workflows**: Self-contained classes that implement complete translation processes using simple methods. 75 | - **Context**: Shared state management across the entire translation process. 76 | - **Registry**: Central discovery and management system for available workflows. 77 | - **Core Components**: LLM client, parsers, and utilities that workflows use internally. 78 | 79 | ### Short Article Workflow (Default) 80 | 81 | ![aphra-article-diagram](aphra/workflows/short_article/docs/workflow-diagram.png) 82 | 83 | The default workflow implements the proven 5-step translation process using simple methods: 84 | 85 | 1. 
**analyze()**: The "LLM Writer" analyzes the original text, identifying key expressions, terms, and entities that may pose challenges in translation, such as culturally specific references or industry jargon. 86 | 87 | 2. **search()**: Any LLM model, enhanced with OpenRouter's web search capabilities, takes the identified terms and searches for real-time, up-to-date context. This includes current definitions, background information, or examples of usage in different contexts, ensuring that the translation is well-informed and accurate. 88 | 89 | 3. **translate()**: The "LLM Writer" produces an initial translation that preserves the original style and structure of the text, focusing on linguistic accuracy while preparing for contextual refinement. 90 | 91 | 4. **critique()**: The "LLM Critic" reviews the initial translation in light of the gathered context and original text, providing feedback on areas where the translation could be improved. The critique highlights potential misinterpretations, suggests alternative phrasings, or recommends adding translator notes for clarity. 92 | 93 | 5. **refine()**: Finally, the "LLM Writer" creates the final translation, incorporating the critic's feedback and the contextual information gathered earlier. The result is a polished, contextually aware translation that is more nuanced and accurate than a simple literal translation. 94 | 95 | ### Web Search Integration 96 | 97 | Aphra leverages OpenRouter's advanced web search capabilities: 98 | - **Universal Web Access**: Any model can now access real-time web information via OpenRouter's web plugin. 99 | - **High-Context Search**: Uses "high" search context for maximum information retrieval. 100 | - **Automatic Citations**: Web search results include proper source citations. 101 | - **Cost-Effective**: Powered by Exa search with transparent pricing ($4 per 1000 results). 102 | 103 | ### Extensible Design 104 | 105 | The workflow architecture enables: 106 | - **Custom Workflows**: Create specialized translation workflows by inheriting from base classes and overriding methods. 107 | - **Method Reusability**: Individual methods can be reused by inheritance or composition. 108 | - **Easy Testing**: Each method can be tested independently. 109 | - **Future Expansion**: New workflows can be added without modifying existing code. 110 | 111 | This structured approach enables 🌐💬 Aphra to produce translations that are not only linguistically accurate but also contextually rich, while providing a solid foundation for extending the system to handle various types of content and use cases. 112 | 113 | ## Demo 114 | 115 | You can test 🌐💬 Aphra here: [https://huggingface.co/spaces/davidlms/aphra](https://huggingface.co/spaces/davidlms/aphra). 116 | 117 | ## Getting Started 118 | 119 | To get started with 🌐💬 Aphra, follow these steps: 120 | 121 | ### Prerequisites 122 | 123 | Ensure you have the following installed on your system: 124 | - `git` (for cloning the repository) 125 | - Python 3.8 or higher 126 | - `pip` (Python package installer) 127 | - Docker (optional, for using Docker) 128 | 129 | ### Clone the Repository 130 | 131 | Before proceeding with the configuration or installation, you need to clone the repository. This is a common step required for all installation methods. 132 | 133 | 1. Clone the repository: 134 | ```bash 135 | git clone https://github.com/DavidLMS/aphra.git 136 | ``` 137 | 138 | 2. 
Navigate into the project directory: 139 | ```bash 140 | cd aphra 141 | ``` 142 | 143 | ### Configuration 144 | 145 | 1. Copy the example configuration file: 146 | ```bash 147 | cp config.example.toml config.toml 148 | ``` 149 | 150 | 2. Edit `config.toml` to add your [OpenRouter](https://openrouter.ai) API key and desired model names. 151 | 152 | After configuring the `config.toml` file, you can either: 153 | 154 | - **Use 🌐💬 Aphra directly in the current directory** of the repository (as explained in the [Usage section](#usage)), or 155 | - **Proceed with the installation** in the next section to make 🌐💬 Aphra accessible from any script on your system. 156 | 157 | > **Note:** If you choose to proceed with the installation, remember to move the `config.toml` file to the location of the script using 🌐💬 Aphra, or specify its path directly when calling the function. 158 | 159 | ### Installation 160 | 161 | #### Option 1: Install Locally with `pip` 162 | 163 | This option is the simplest way to install 🌐💬 Aphra if you don't need to isolate its dependencies from other projects. It directly installs the package on your system using `pip`, which is the standard package manager for Python. 164 | 165 | 1. Install the package locally: 166 | ```bash 167 | pip install . 168 | ``` 169 | 170 | #### Option 2: Install with Poetry 171 | 172 | Poetry is a dependency management and packaging tool for Python that helps you manage your project's dependencies more effectively. It also simplifies the process of packaging your Python projects. 173 | 174 | 1. Install Poetry if you haven't already: 175 | ```bash 176 | curl -sSL https://install.python-poetry.org | python3 - 177 | ``` 178 | 179 | 2. Install dependencies and the package: 180 | ```bash 181 | poetry install 182 | ``` 183 | 184 | 3. Activate the virtual environment created by Poetry: 185 | ```bash 186 | poetry shell 187 | ``` 188 | 189 | #### Option 3: Use a Virtual Environment 190 | 191 | A virtual environment is an isolated environment that allows you to install packages separately from your system's Python installation. This is particularly useful to avoid conflicts between packages required by different projects. 192 | 193 | 1. Create and activate a virtual environment: 194 | ```bash 195 | python -m venv aphra 196 | source aphra/bin/activate # On Windows: aphra\Scripts\activate 197 | ``` 198 | 199 | 2. Remove the file pyproject.toml: 200 | ```bash 201 | rm pyproject.toml 202 | ``` 203 | 204 | 3. Install the package locally: 205 | ```bash 206 | pip install . 207 | ``` 208 | 209 | #### Option 4: Use Docker 210 | 211 | Docker is a platform that allows you to package an application and its dependencies into a "container." This container can run consistently across different environments, making it ideal for ensuring that your project works the same way on any machine. 212 | 213 | 1. Build the Docker image: 214 | ```bash 215 | docker build -t aphra . 216 | ``` 217 | > **Note:** If you encounter permission errors during the build, try running the command with `sudo`. 218 | 219 | 2. Ensure the entry script has execution permissions. Run the following command: 220 | ```bash 221 | chmod +x entrypoint.sh 222 | ``` 223 | > **For Windows users:** You can add execute permissions using Git Bash or WSL (Windows Subsystem for Linux). If you’re using PowerShell or Command Prompt, you might not need to change permissions, but ensure the script is executable in your environment. 224 | 225 | 3. 
Understand the `docker run` command:
226 |    - `-v $(pwd):/workspace`: This option mounts your current directory (`$(pwd)` in Unix-like systems, `%cd%` in Windows) to the `/workspace` directory inside the container. This allows the container to access files in your current directory.
227 |    - `aphra`: This is the name of the Docker image you built in step 1.
228 |    - `English Spanish`: These are the source and target languages for translation. Replace them with the languages you need.
229 |    - `input.md`: This is the path to the input file on your host machine.
230 |    - `output.md`: This is the path where the translated output will be saved on your host machine.
231 | 
232 | 4. Run the Docker container:
233 |    ```bash
234 |    docker run -v $(pwd):/workspace aphra English Spanish input.md output.md
235 |    ```
236 | 
237 | 5. Display the translation by printing the content of the output file:
238 |    - On Unix-like systems (Linux, macOS, WSL):
239 |    ```bash
240 |    cat output.md
241 |    ```
242 |    - On Windows (PowerShell):
243 |    ```bash
244 |    Get-Content output.md
245 |    ```
246 |    - On Windows (Command Prompt):
247 |    ```cmd
248 |    type output.md
249 |    ```
250 | 
251 | ### Usage
252 | 
253 | #### Using Aphra from the Command Line
254 | 
255 | You can run Aphra directly from the terminal using the `aphra_runner.py` script. This is particularly useful for automating translations as part of a larger workflow or pipeline.
256 | 
257 | To translate a file from the command line, use the following syntax:
258 | 
259 | ```bash
260 | python aphra_runner.py <config_file> <source_language> <target_language> <input_file> <output_file>
261 | ```
262 | 
263 | - `<config_file>`: Path to the configuration file containing API keys and model settings (e.g., `config.toml`).
264 | - `<source_language>`: The language of the input text (e.g., "Spanish").
265 | - `<target_language>`: The language you want to translate the text into (e.g., "English").
266 | - `<input_file>`: Path to the input file containing the text you want to translate.
267 | - `<output_file>`: Path where the translated text will be saved.
268 | 
269 | **Example:**
270 | 
271 | ```bash
272 | python aphra_runner.py config.toml Spanish English input.md output.md
273 | ```
274 | 
275 | In this example:
276 | - The configuration file `config.toml` is used.
277 | - The text in `input.md` is translated from Spanish to English.
278 | - The translated content is saved to `output.md`.
279 | 
280 | #### Using Aphra as a Python Function
281 | 
282 | If you prefer to use Aphra directly in your Python code, the `translate` function allows you to translate text from one language to another using the configured language models. The function takes the following parameters:
283 | 
284 | - `source_language`: The language of the input text (e.g., "Spanish").
285 | - `target_language`: The language you want to translate the text into (e.g., "English").
286 | - `text`: The text you want to translate.
287 | - `config_file`: The path to the configuration file containing API keys and model settings. Defaults to "config.toml".
288 | - `log_calls`: A boolean indicating whether to log API calls for debugging purposes. Defaults to `False`.
289 | 
290 | Here is how you can use the `translate` function in a generic way:
291 | 
292 | ```python
293 | from aphra import translate
294 | 
295 | translation = translate(source_language='source_language',
296 |                         target_language='target_language',
297 |                         text='text_to_translate',
298 |                         config_file='config.toml',
299 |                         log_calls=False)
300 | print(translation)
301 | ```
302 | 
303 | ##### Example 1: Translating a Simple Sentence
304 | 
305 | Suppose you want to translate the sentence "Hola mundo" from Spanish to English.
The code would look like this: 306 | 307 | ```python 308 | from aphra import translate 309 | 310 | translation = translate(source_language='Spanish', 311 | target_language='English', 312 | text='Hola mundo', 313 | config_file='config.toml', 314 | log_calls=False) 315 | print(translation) 316 | ``` 317 | 318 | ##### Example 2: Translating Content from a Markdown File 319 | 320 | If you have a Markdown file (`input.md`) containing the text you want to translate, you can read the file, translate its content, and then print the result or save it to another file. Here's how: 321 | 322 | ```python 323 | from aphra import translate 324 | 325 | # Read the content from the Markdown file 326 | with open('input.md', 'r', encoding='utf-8') as file: 327 | text_to_translate = file.read() 328 | 329 | # Translate the content from Spanish to English 330 | translation = translate(source_language='Spanish', 331 | target_language='English', 332 | text=text_to_translate, 333 | config_file='config.toml', 334 | log_calls=False) 335 | 336 | # Print the translation or save it to a file 337 | print(translation) 338 | 339 | with open('output.md', 'w', encoding='utf-8') as output_file: 340 | output_file.write(translation) 341 | ``` 342 | 343 | In this example: 344 | 345 | - We first read the text from `input.md`. 346 | - Then, we translate the text from Spanish to English. 347 | - Finally, we print the translation to the console and save it to `output.md`. 348 | 349 | ## Customizability and Ideas for Extensions 350 | 351 | 🌐💬 Aphra's workflow architecture is designed with extensibility and customization at its core. The system provides multiple levels of customization, from simple prompt modifications to creating entirely new workflows. 352 | 353 | ### Customization Levels 354 | 355 | #### 1. Configuration Customization (Simplest) 356 | Override workflow configuration in your `config.toml` to use different models: 357 | 358 | ```toml 359 | [openrouter] 360 | api_key = "your_api_key" 361 | 362 | [short_article] 363 | writer = "anthropic/claude-sonnet-4" 364 | searcher = "perplexity/sonar" 365 | critiquer = "different/model" 366 | ``` 367 | 368 | #### 2. Prompt Customization (Simple) 369 | Modify the prompts within a workflow's `prompts/` folder to adapt the output for your specific use cases: 370 | - `aphra/workflows/short_article/prompts/step1_system.txt` - Analysis step prompts 371 | - `aphra/workflows/short_article/prompts/step2_system.txt` - Search step prompts 372 | - `aphra/workflows/short_article/prompts/step3_system.txt` - Translation step prompts 373 | - `aphra/workflows/short_article/prompts/step4_system.txt` - Critique step prompts 374 | - `aphra/workflows/short_article/prompts/step5_system.txt` - Refinement step prompts 375 | 376 | #### 3. Method Customization (Intermediate) 377 | Customize translation behavior by inheriting from `ShortArticleWorkflow` and overriding specific methods: 378 | 379 | ```python 380 | from aphra.workflows.short_article.short_article_workflow import ShortArticleWorkflow 381 | from aphra.core.context import TranslationContext 382 | 383 | class CustomWorkflow(ShortArticleWorkflow): 384 | def analyze(self, context: TranslationContext, text: str): 385 | # Your custom analysis logic here 386 | return super().analyze(context, text) 387 | 388 | def search(self, context: TranslationContext, parsed_items): 389 | # Your custom search logic here 390 | return super().search(context, parsed_items) 391 | ``` 392 | 393 | #### 4. 
Complete Workflow Creation (Advanced) 394 | Build entirely new workflows by creating a self-contained workflow directory with auto-discovery: 395 | 396 | ```bash 397 | mkdir -p aphra/workflows/my_workflow/{config,examples,prompts,aux,tests} 398 | ``` 399 | 400 | Then implement your workflow class: 401 | 402 | ```python 403 | # aphra/workflows/my_workflow/my_workflow.py 404 | from aphra.core.workflow import AbstractWorkflow 405 | from aphra.core.context import TranslationContext 406 | 407 | class MyWorkflow(AbstractWorkflow): 408 | def get_workflow_name(self) -> str: 409 | return "my_workflow" 410 | 411 | def is_suitable_for(self, text: str, **kwargs) -> bool: 412 | # Define suitability criteria 413 | return "specific_condition" in text.lower() 414 | 415 | def run(self, context: TranslationContext, text: str) -> str: 416 | # Access workflow configuration 417 | writer_model = context.get_workflow_config('writer') 418 | 419 | # Your complete workflow logic here 420 | return translated_text 421 | ``` 422 | 423 | Create configuration file `aphra/workflows/my_workflow/config/default.toml`: 424 | 425 | ```toml 426 | writer = "anthropic/claude-sonnet-4" 427 | searcher = "perplexity/sonar" 428 | custom_param = "default_value" 429 | ``` 430 | 431 | No manual registration needed - auto-discovery handles everything! 432 | 433 | ### Extension Ideas 434 | 435 | The workflow architecture opens up exciting possibilities: 436 | 437 | - **Specialized Content Workflows:** 438 | - **Academic Papers**: Enhanced terminology handling and citation preservation. 439 | - **Technical Documentation**: API reference translation with code preservation. 440 | - **Marketing Content**: Tone and brand voice adaptation across languages. 441 | - **Legal Documents**: Precision-focused translation with legal term verification. 442 | 443 | - **Enhanced Search Capabilities:** 444 | - **Agent-Based Web Search**: Replace LLM searcher with custom web search agents. 445 | - **Domain-Specific Databases**: Integrate specialized terminology databases. 446 | - **Visual Context**: Add image analysis for documents with visual elements. 447 | 448 | - **Local and Hybrid Operation:** 449 | - **Ollama Integration**: Run workflows entirely locally using open-source models. 450 | - **Hybrid Cloud-Local**: Use local models for sensitive content, cloud for complex analysis. 451 | - **Custom Model Integration**: Plug in specialized translation models. 452 | 453 | - **Quality Assurance Extensions:** 454 | - **Multiple Critic Workflow**: Use several specialized critics for different aspects. 455 | - **Human-in-the-Loop**: Add human review steps at critical points. 456 | - **Quality Metrics**: Automatic translation quality assessment. 457 | 458 | - **Performance Optimizations:** 459 | - **Parallel Step Execution**: Run independent steps concurrently. 460 | - **Caching System**: Cache analysis and search results for similar content. 461 | - **Streaming Translation**: Process large documents in chunks. 462 | 463 | ### Getting Started with Extensions 464 | 465 | 1. **Fork the Repository**: Start with your own copy of Aphra. 466 | 2. **Study the Existing Code**: Examine `aphra/workflows/short_article/` for examples of workflow implementation. 467 | 3. **Create Your Components**: Build your custom steps and workflows. 468 | 4. **Test Thoroughly**: Use the existing test framework as a guide. 469 | 5. **Share Your Work**: Consider contributing your extensions back to the community. 
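
As a concrete starting point for steps 3 and 4, here is a minimal sketch of exercising a custom workflow directly, mirroring what `translate()` does internally. It assumes the hypothetical `MyWorkflow` class from the example above (at `aphra/workflows/my_workflow/my_workflow.py`) and a valid `config.toml` in the working directory:

```python
from aphra.core.llm_client import LLMModelClient
from aphra.core.context import TranslationContext
# Hypothetical workflow from the example above; adjust the import to your own package path.
from aphra.workflows.my_workflow.my_workflow import MyWorkflow

# Build the shared context the same way translate() does internally.
model_client = LLMModelClient('config.toml')
context = TranslationContext(
    model_client=model_client,
    source_language="Spanish",
    target_language="English",
    log_calls=False
)

workflow = MyWorkflow()
print(workflow.run(context, "Texto de prueba para el nuevo workflow."))
```

This gives you a quick harness for iterating on a new workflow; once `is_suitable_for()` reports it as suitable, the same class is also reachable through the standard `translate()` call thanks to auto-discovery.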
470 | 
471 | The workflow design ensures that your extensions are isolated, testable, and maintainable, while the registry system makes them discoverable and reusable.
472 | 
473 | Feel free to experiment and extend 🌐💬 Aphra in ways that suit your projects and ideas. The architecture is built to grow with your needs!
474 | 
475 | ## License
476 | 
477 | 🌐💬 Aphra is released under the [MIT License](https://github.com/DavidLMS/aphra/blob/main/LICENSE). You are free to use, modify, and distribute the code for both commercial and non-commercial purposes.
478 | 
479 | ## Contributing
480 | 
481 | Contributions to 🌐💬 Aphra are welcome! Whether it's improving the code, enhancing the documentation, or suggesting new features, your input is valuable. Please check out the [CONTRIBUTING.md](https://github.com/DavidLMS/aphra/blob/main/CONTRIBUTING.md) file for guidelines on how to get started and make your contributions count.
482 | 
483 | ## References
484 | 
485 | - *Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models*, Shao et al. (2024), [https://arxiv.org/abs/2402.14207](https://arxiv.org/abs/2402.14207)
486 | - *Translation Agent*, Ng (2024), [https://github.com/andrewyng/translation-agent](https://github.com/andrewyng/translation-agent)
487 | 
--------------------------------------------------------------------------------
/docs/aphra.html:
--------------------------------------------------------------------------------
[pdoc-generated HTML API documentation; the HTML markup was lost in extraction. Recoverable content: the page documents the `aphra` package ("Aphra package initializer. This module exposes the main API components and modules.", exporting `translate`, `workflows`, and `core`) and the function `translate(source_language, target_language, text, config_file='config.toml', log_calls=False)`, which translates the provided text from the source language to the target language using workflows. Parameters: `source_language`, `target_language`, `text`, `config_file` (path to the TOML configuration), and `log_calls` (whether to log call details); it returns the improved translation. The embedded source view (aphra/translate.py, lines 22-56) shows that the function loads the model client from the config file, builds a TranslationContext, selects a workflow with get_suitable_workflow(text), raising ValueError if none is suitable, and returns workflow.run(context, text).]
--------------------------------------------------------------------------------