├── .gitignore ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── core ├── chains.py └── streamlit_chains.py ├── demos ├── multi_agent_langgraph_demo.ipynb ├── multi_agent_langgraph_demo.py ├── multi_agent_raw_ra_demo.ipynb └── multi_agent_raw_ra_demo.py ├── docs ├── LangGraph_RA_comp.md └── RA_architecture.md ├── images ├── RA_Architecture.svg ├── Sequence_Summary.svg └── Streamlit_App_Screenshot.png ├── pyproject.toml ├── recursive_agents ├── __init__.py ├── base.py ├── streamlit.py └── template_load_utils.py ├── streamlit_app.py ├── templates ├── bug_triage_initial_sys.txt ├── generic_critique_sys.txt ├── generic_critique_user.txt ├── generic_initial_sys.txt ├── generic_revision_sys.txt ├── generic_revision_user.txt ├── marketing_initial_sys.txt ├── protocol_context.txt └── strategy_initial_sys.txt └── tests ├── quick_setup.py ├── test_final_answer.py └── test_runlog.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Python 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | *.so 6 | .Python 7 | env/ 8 | venv/ 9 | ENV/ 10 | .venv 11 | 12 | # Environment variables 13 | .env 14 | .env.local 15 | 16 | # IDE 17 | .vscode/ 18 | .idea/ 19 | *.swp 20 | *.swo 21 | .gemini 22 | 23 | # OS 24 | .DS_Store 25 | Thumbs.db 26 | 27 | # Jupyter 28 | .ipynb_checkpoints/ 29 | *.ipynb_checkpoints 30 | 31 | # Testing 32 | .pytest_cache/ 33 | .coverage 34 | htmlcov/ 35 | 36 | # Package files 37 | *.egg-info/ 38 | dist/ 39 | build/ 40 | prop_templates/ 41 | hn.txt 42 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to Recursive Agents 2 | 3 | We welcome contributions to the Recursive Agents framework! 
4 | 5 | ## How to Contribute 6 | 7 | - **Bug Reports**: Click the "Issues" tab above and then "New Issue" to report bugs 8 | - **Feature Requests**: Click the "Issues" tab above and then "New Issue" to suggest features 9 | - **Pull Requests**: Fork the repo, create a branch, and submit a PR 10 | - **Questions**: Open an issue in the "Issues" tab for questions 11 | 12 | ## Areas We'd Love Help With 13 | 14 | - **Documentation**: Improve clarity, add examples, fix typos 15 | - **Tests**: Expand test coverage and edge cases 16 | - **Integration Examples**: Show RA working with other tools/frameworks 17 | - **Performance**: Optimization suggestions and improvements 18 | - **Evaluation Frameworks**: Build benchmarks and metrics for measuring companion effectiveness 19 | - **New Features**: Implement enhancements from the issue tracker 20 | - **Templates**: Contribute new templates or improve existing ones 21 | 22 | ## Development Setup 23 | 24 | ```bash 25 | git clone https://github.com/hankbesser/recursive-agents.git 26 | cd recursive-agents 27 | pip install -e .[all] # Install with all dependencies 28 | ``` 29 | 30 | ## Pull Request Process 31 | 32 | 1. Fork the repository 33 | 2. Create your feature branch (`git checkout -b feature/amazing-feature`) 34 | 3. Commit your changes (`git commit -m 'Add some amazing feature'`) 35 | 4. Push to the branch (`git push origin feature/amazing-feature`) 36 | 5. Open a Pull Request 37 | 38 | 39 | ## Questions? 40 | 41 | Have questions or want to share ideas? Check out our [Discussions](https://github.com/hankbesser/recursive-agents/discussions) tab! 42 | 43 | Thank you for helping make Recursive Agents better! 
44 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 Henry Besser 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Recursive Agents 🔄 2 | 3 | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) 4 | [![Contributing](https://img.shields.io/badge/Contributing-Guidelines-blue.svg)](CONTRIBUTING.md) 5 | 6 | ## A Meta-Framework for Self-Improving Agents 7 | 8 | Recursive Agents implements a **three-phase iterative refinement architecture** where LLM agents (instances of Classes) critique and improve their own outputs. 
Unlike single-pass systems, each agent automatically tracks its full revision history, making every decision inspectable and debuggable.
9 | 
10 | ![Sequence Flow](images/Sequence_Summary.svg)
11 | 
12 | → See the [Architecture Documentation](docs/RA_architecture.md) for detailed system design.
13 | 
14 | ### Why Recursive Agents?
15 | 
16 | **See inside the thinking.*** While other frameworks show you what happened, RA shows you why. Every instance maintains a complete audit trail of its critique-revision cycles, stopping conditions, and decision rationale. This transparency is built-in, not bolted on.
17 | 
18 | *Unlike single-shot responses, agents systematically refine their outputs by critiquing and improving their own work—thinking about their thinking.
19 | 
20 | **Flexible template loading.** The `build_templates()` utility lets you compose analytical patterns: override just what changes (usually only the initial system template per domain), apply overarching protocols to specific phases (usually throughout the system templates in all related domains for consistent behavior), or skip protocols entirely. System templates define WHO the agent is, user templates define WHAT task to perform, and protocols shape HOW to analyze—each layer independently configurable.
21 | 
22 | ### Why "Companions"?
23 | 
24 | In the Recursive Agents framework, we call our agent implementations
25 | "Companions" rather than "Agents" for clarity:
26 | 
27 | - **Framework**: Recursive Agents (what the system does)
28 | - ***C***lasses: ***C***ompanions (what you work with)
29 | 
30 | ---
31 | 
32 | ## What Makes RA Unique
33 | 
34 | | Code Pattern | Why It Matters | Rare in OSS?
|
35 | |--------------|----------------|--------------|
36 | | **`Draft\|LLM → Critique\|LLM → Revision\|LLM` chains built once** | Three-phase self-improvement is automatic - no manual wiring | ✓✓ |
37 | | **One `protocol_context.txt` feeds all system prompts** | Change reasoning style everywhere with one edit | ✓ |
38 | | **Templates live as `.txt` files on disk** | Git tracks prompt changes; hot-reload without restarting | ✓ |
39 | | **New expert domain = inherit BaseCompanion + point to template** | Three lines of code get you a complete agent | ✓✓ |
40 | | **Every Companion instance is itself an `agent()` or `RunnableLambda(agent)`** | Same object works standalone or in any framework | ✓ |
41 | | **Built-in `run_log` tracks all iterations** | See why decisions were made without adding instrumentation | ✓✓ |
42 | 
43 | ✓ = Uncommon in open source projects | ✓✓ = Very rare in open source projects
44 | 
45 | ---
46 | ### Quick Clone
47 | 
48 | ```bash
49 | # Create conda environment
50 | conda create -n recursive-agents python=3.12 -y # (or python=3.13 -y)
51 | conda activate recursive-agents
52 | # Or using venv
53 | # python -m venv venv
54 | # source venv/bin/activate
55 | 
56 | # clone and install
57 | git clone https://github.com/hankbesser/recursive-agents.git
58 | cd recursive-agents
59 | pip install -e . # or pip install . for non-editable
60 | # pip install -e .[all] for running the streamlit app and working with LangGraph in demo
61 | # pip install .[all] for the non-editable version, as above
62 | ```
63 | ---
64 | ### Challenge: Ask Your Agentic Coding Tool to Think Deeply About This Project
65 | 
66 | Here's something fun - ask your favorite coding tool (e.g. Claude Code in a terminal) the following input prompt in a new conversation.
It is very informal and you can tailor the structure (possibly more formal language if you are into that) but the TODO is pretty solid with this particular order:
67 | 
68 | **"Ok please think as deeply as possible about this project. Start with thinking as deeply as you possibly can about the main core engine in core/chains.py - think deeply about this, giving every detail you can see from it and what it does. Then go to the package level in recursive_agents including template_load_utils.py and base.py and do the same deep thinking process. Then read the templates folder starting with the protocol, then system prompts (noting that protocol language won't leak as an aside), then the user prompts. Lastly, hit all files in the demos folder - again, as deep of analysis as you can go. It's good to get a deep grasp of this project."**
69 | 
70 | *The responses will not disappoint.*
71 | 
72 | ---
73 | 
74 | 
75 | ## Architecture: Clean Layers, Clear Purpose
76 | → See the [Architecture Documentation](docs/RA_architecture.md)
77 | 
78 | ```text
79 | Your Code
80 |     ↓ imports
81 | recursive_agents/          # Pick your companion
82 | ├── base.py                # Standard: Marketing, Bug, Strategy, Generic
83 | └── streamlit.py           # Same companions + live UI updates
84 |     ↓ inherits
85 | core/chains.py             # The engine: 3-phase loop, convergence, history
86 |     ↓ uses
87 | templates/*.txt            # Hot-swappable prompts + protocol injection
88 | ```
89 | **Where to Look (Separation of Concerns):**
90 | 
91 | - **Engineers** → ```core/chains.py``` (how to build the chains, storage/caching/memory management, observability, iteration process, etc.)
92 | - **Users** → ```recursive_agents/base.py``` (picking companions) 93 | - **Context Engineers** → ```templates/``` folder + ```template_load_utils.py``` 94 | - **UI Developers** → ```streamlit_app.py``` (progress containers) 95 | - *or any combination of these* 96 | 97 | **Tip:** Each module includes extensive docstrings and comments explaining design decisions, usage patterns, and implementation details. Start with the docstrings for a comprehensive understanding. 98 | 99 | 100 | --- 101 | ## Three Levels of Understanding 102 | 103 | ### Level 1: Just Use It (5 minutes) 104 | ```python 105 | from recursive_agents import MarketingCompanion 106 | 107 | # Create an agent - it's just a callable! 108 | agent = MarketingCompanion("gpt-4o-mini") 109 | 110 | # Get a refined answer through automatic critique cycles 111 | answer = agent("Why did engagement drop 30%...?") 112 | print(answer) # Final, refined analysis after self-critique 113 | 114 | # Note: agent("...") is the same as agent.loop("...") 115 | # The __call__ method is an alias for loop() 116 | # This makes companions work as simple callables in any framework! 117 | 118 | # Want to see the thinking process? It's all there: 119 | print(f"Iterations: {len(agent.run_log)}") 120 | print(f"Final critique: {agent.run_log[-1]['critique']}") 121 | 122 | # Get beautifully formatted thinking history 123 | print(agent.transcript_as_markdown()) 124 | # Outputs: 125 | # ### Iteration 1 126 | # **Draft**: [Initial analysis...] 127 | # **Critique**: [What could be improved...] 128 | # **Revision**: [Enhanced analysis...] 129 | # (continues for each iteration) 130 | ``` 131 | 132 | ### Level 2: Compose & Customize (30 minutes) 133 | ```python 134 | # 1. 
Configure companions for different use cases
135 | fast_draft = MarketingCompanion(
136 |     temperature=0.9,          # More creative
137 |     max_loops=1,              # Single pass for speed
138 |     clear_history=True        # Don't retain context
139 | )
140 | 
141 | thoughtful = MarketingCompanion(
142 |     llm="gpt-4.1-mini",
143 |     temperature=0.3,          # More focused
144 |     max_loops=5,              # Deep refinement
145 |     similarity_threshold=0.99 # Only stop on near-identical
146 | )
147 | 
148 | # 2. Different ways to get results
149 | simple_answer = fast_draft("Quick take on our Q3 performance...")
150 | 
151 | # Get both answer and thinking history
152 | answer, thinking = thoughtful.loop("Deep analysis of Q3 performance...")
153 | print(f"Went through {len(thinking)} iterations")
154 | print(thoughtful.transcript_as_markdown())  # See the evolution
155 | 
156 | # 3. Use verbose mode to watch thinking live
157 | debug_companion = BugTriageCompanion(verbose=True)
158 | result = debug_companion("Users can't upload files >10MB...")
159 | # Prints: USER INPUT, INITIAL DRAFT, CRITIQUE #1, REVISION #1, etc.
160 | ```
161 | ### Level 3: Extend the Framework (2 hours)
162 | 
163 | ```python
164 | # Step 1: Create your domain template (templates/legal_initial_sys.txt)
165 | """
166 | {context}  # Protocol automatically injected
167 | 
168 | You are a Legal Analysis Companion specializing in contract review,
169 | compliance assessment, and risk evaluation. Focus on:
170 | - Identifying potential legal liabilities
171 | - Highlighting ambiguous language
172 | - Suggesting protective clauses
173 | """
174 | 
175 | # Step 2: Create your companion class (just 4 lines!)
176 | 
177 | from core.chains import BaseCompanion
178 | from recursive_agents.template_load_utils import build_templates
179 | 
180 | # probably best to do in recursive_agents/base.py with the other Companions
181 | LEGAL_TEMPLATES = build_templates(initial_sys="legal_initial_sys")
182 | class LegalCompanion(BaseCompanion):
183 |     TEMPLATES = LEGAL_TEMPLATES
184 |     SIM_THRESHOLD = 0.99   # Legal requires higher precision
185 |     MAX_LOOPS = 4          # Thorough analysis for legal matters
186 | 
187 | # Step 3: Use it immediately
188 | legal = LegalCompanion()
189 | analysis = legal("Review this SaaS agreement for potential risks...")
190 | 
191 | # Access everything just like built-in companions
192 | print(f"Iterations until convergence: {len(legal.run_log)}")
193 | print(legal.transcript_as_markdown())  # Full audit trail for compliance!
194 | ```
195 | ## Quick Start - Full Streamlit App
196 | 
197 | 
198 | ```bash
199 | export OPENAI_API_KEY="sk-..."  # in terminal
200 | # For Jupyter/Python (more secure):
201 | # Create .env file with:
202 | # OPENAI_API_KEY="sk-..."
203 | # Then in your code:
204 | # from dotenv import load_dotenv
205 | # load_dotenv()
206 | ```
207 | 
208 | ### Run the Complete Streamlit Application
209 | ```bash
210 | streamlit run streamlit_app.py
211 | ```
212 | 
213 | 
214 | 
215 | 
216 | **You get a full interactive application:**
217 | - Select any companion type from the dropdown
218 | - Adjust settings in the sidebar
219 | - Enter your prompt and watch the AI refine its response
220 | - System templates and protocol viewer (updated when changes are made in `templates/` and the app reloads)
221 | - See critique-revision cycles happen in real time
222 | - View cosine similarity scores update live
223 | 
224 | 
225 | 
226 | This is a full testing and observability app included with the framework.
227 | 
228 | ---
229 | 
230 | ## Why This Architecture Matters
231 | 
232 | 1. 
**Mathematical Convergence > Arbitrary Limits**
233 |    - Not "stop after 3 tries"
234 |    - Stop when `cosine_from_embeddings(revision[n-1], revision[n]) > 0.98`
235 | 2. **Companions as Callables = Composability**
236 |    - Works in Jupyter: `agent("question")`
237 |    - Works with LangGraph: `RunnableLambda(agent)`
238 |    - Works in Streamlit: Live visualization of critique-revision cycles!
239 | 3. **Templates as Data = Evolution Without Refactoring**
240 |    - Change prompts in production
241 |    - A/B test different protocols
242 |    - Domain experts can contribute without coding
243 | 
244 | ---
245 | ## Multi-Agent Orchestration
246 | 
247 | #### **Raw Python** (Sequential with Full Observability):
248 | - [multi agent RA notebook](demos/multi_agent_raw_ra_demo.ipynb)
249 | - [multi agent RA Python file](demos/multi_agent_raw_ra_demo.py)
250 | ```python
251 | from recursive_agents.base import MarketingCompanion, BugTriageCompanion, StrategyCompanion
252 | 
253 | problem = "App crashes on upload, users leaving bad reviews..."
254 | 255 | # Each agent analyzes independently 256 | mkt = MarketingCompanion() 257 | bug = BugTriageCompanion() 258 | strategy = StrategyCompanion() 259 | 260 | mkt_view = mkt(problem) 261 | bug_view = bug(problem) 262 | 263 | # Combine insights 264 | combined = f"Marketing: {mkt_view}\n\nEngineering: {bug_view}" 265 | action_plan = strategy(combined) 266 | 267 | # Full introspection available for each agent 268 | print(f"Marketing iterations: {len(mkt.run_log)}") 269 | print(f"Engineering iterations: {len(bug.run_log)}") 270 | print(f"Strategy iterations: {len(strategy.run_log)}") 271 | 272 | # See why strategy reached its conclusion 273 | print(strategy.transcript_as_markdown()) 274 | ``` 275 | 276 | #### **LangGraph** (Parallel Execution + RA Transparency): 277 | - [multi agent RA callable / LangGraph notebook ](demos/multi_agent_langgraph_demo.ipynb) 278 | - [multi agent RA callable / LangGraph python file ](demos/multi_agent_langgraph_demo.py) 279 | 280 | ```python 281 | from langchain_core.runnables import RunnableLambda 282 | from langgraph.graph import StateGraph 283 | from typing import TypedDict 284 | 285 | # Same companions work as LangGraph nodes! 286 | # mkt, bug instances from raw RA example above 287 | mkt_node = RunnableLambda(mkt) 288 | eng_node = RunnableLambda(bug) 289 | strategy_node = RunnableLambda(strategy) 290 | 291 | # Simple merge function 292 | merge_node = RunnableLambda( 293 | lambda d: f"Marketing: {d['marketing']}\n\nEngineering: {d['engineering']}" 294 | ) 295 | 296 | # Define the state schema for LangGraph 297 | class GraphState(TypedDict): 298 | input: str 299 | marketing: str 300 | engineering: str 301 | merged: str 302 | final_plan: str 303 | 304 | # Build parallel workflow 305 | # No extra prompts, no schema gymnastics: simply passing text between the callables the classes already expose. 
graph = StateGraph(GraphState)
307 | graph.add_node("marketing_agent", lambda state: {"marketing": mkt_node.invoke(state["input"])})
308 | graph.add_node("engineering_agent", lambda state: {"engineering": eng_node.invoke(state["input"])})
309 | graph.add_node("merge_agent", lambda state: {"merged": merge_node.invoke(state)})
310 | graph.add_node("strategy_agent", lambda state: {"final_plan": strategy_node.invoke(state["merged"])})
311 | 
312 | graph.add_edge("marketing_agent", "merge_agent")
313 | graph.add_edge("engineering_agent", "merge_agent")
314 | graph.add_edge("merge_agent", "strategy_agent")
315 | 
316 | graph.add_edge("__start__", "marketing_agent")
317 | graph.add_edge("__start__", "engineering_agent")
318 | graph.set_finish_point("strategy_agent")
319 | workflow = graph.compile()
320 | 
321 | # Run workflow
322 | result = workflow.invoke({"input": problem})
323 | 
324 | # RA's thinking history still available!
325 | print(mkt.transcript_as_markdown())       # Full marketing analysis
326 | print(bug.transcript_as_markdown())       # Full engineering analysis
327 | print(strategy.transcript_as_markdown())  # How strategy synthesized both
328 | ```
329 | For a detailed comparison with LangGraph's capabilities, see the [LangGraph comparison and complement](docs/LangGraph_RA_comp.md).
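The convergence test that ends each companion's critique loop is small enough to sketch standalone. The `cosine_from_embeddings` helper below mirrors the one in `core/chains.py`; the `converged` wrapper and its 0.98 default are illustrative of how `similarity_threshold` is applied, not part of the public API:

```python
from numpy import dot
from numpy.linalg import norm

def cosine_from_embeddings(va, vb):
    """Cosine similarity between two pre-computed embedding vectors."""
    return dot(va, vb) / (norm(va) * norm(vb))

def converged(prev_emb, cur_emb, threshold=0.98):
    """True when successive revisions are near-identical, so looping can stop."""
    return cosine_from_embeddings(prev_emb, cur_emb) >= threshold

# Identical vectors have similarity 1.0; orthogonal vectors 0.0
print(converged([1.0, 0.0], [1.0, 0.0]))  # True
print(converged([1.0, 0.0], [0.0, 1.0]))  # False
```

In the real loop, the previous revision's embedding is cached so each iteration costs only one new embedding call.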
330 | 
331 | ---
332 | ## Production Features
333 | 
334 | #### Observability and Flexibility
335 | 
336 | - **Verbose mode**: prints every phase of thinking live
337 | - **Transcript capture**: return the full run_log for debugging along with the final analysis (the instantiated object keeps its own run_log either way)
338 | - **Standard logging**: Integration-ready
339 | - **Streamlit App**: visualize live previews and test interactively
340 | 
341 | - **Smart caching**: Single embeddings client
342 | - **Early exit**: Stop when converged, not exhausted
343 | 
344 | - **Any OpenAI model**: "gpt-4o-mini", "gpt-4.1", custom endpoints
345 | - **Configurable everything**: Per-instance overrides
346 | - **Template hot-reload:** Change prompts without code
347 | 
348 | ---
349 | 
350 | ## The Strategic Decomposition Protocol
351 | 
352 | Read ```templates/protocol_context.txt``` to see the structured reasoning framework that guides agents through:
353 | 
354 | - Multi-layered problem analysis
355 | - Iterative pattern recognition
356 | - Systematic refinement cycles
357 | 
358 | This structured approach to recursive problem decomposition consistently outperforms single-pass analysis.
359 | 
360 | ---
361 | 
362 | 
363 | ### Creating Your Own Companion
364 | 
365 | ### 1. Write your template
366 | ```text
367 | # templates/financial_initial_sys.txt
368 | {context}  # Protocol automatically injected
369 | 
370 | You are a Financial Analysis Companion. Focus on:
371 | - Cash flow patterns and anomalies
372 | - Risk indicators and market conditions
373 | - Regulatory compliance implications
374 | ```
375 | 
376 | ### 2. 
Create the companion class
377 | ```python
378 | # your_app/base.py
379 | from core.chains import BaseCompanion
380 | from recursive_agents.template_load_utils import build_templates
381 | 
382 | FINANCE_TEMPLATES = build_templates(initial_sys="financial_initial_sys")
383 | class FinancialCompanion(BaseCompanion):
384 |     TEMPLATES = FINANCE_TEMPLATES
385 |     MODEL_NAME = "gpt-4.1-mini"
386 |     MAX_LOOPS = 4              # Financial analysis needs thoroughness
387 |     DEFAULT_TEMPERATURE = 0.3  # Lower temperature for numerical precision
388 | ```
389 | 
390 | ### 3. Use it anywhere
391 | ```python
392 | fin = FinancialCompanion()
393 | 
394 | # note: callable - __call__ is an alias for loop()
395 | analysis = fin("Q3 revenue variance exceeds 2 standard deviations")
396 | ```
397 | ---
398 | 
399 | 
400 | 
401 | *Agents that refine their responses through iteration, integrated seamlessly into your existing code.*
402 | 
403 | ---
404 | 
405 | ## License
406 | 
407 | This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
408 | 
409 | ## Contributing
410 | 
411 | We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details on how to get started.
412 | 
413 | ## Future Explorations
414 | 
415 | The Recursive Agents framework opens fascinating research directions:
416 | 
417 | - Advanced convergence analysis beyond embeddings / cosine similarity
418 | - Richer integration patterns with agentic frameworks
419 | - Extended observability for multi-loop systems
420 | 
421 | We're particularly interested in collaborations exploring how recursive
422 | patterns emerge across different domains and scales.
423 | 
424 | 
425 | ## Bonus Section: This README's design philosophy
426 | 
427 | 1. **Three-level structure** mirrors the codebase organization
428 | 2. **Technical depth** with actual code snippets and architecture diagrams
429 | 3. **Clear separation** of who should look where (users → base.py, engineers → chains.py)
430 | 4. 
**Focus on observability** with real implementation details for testing and visualizing in the provided full-scale Streamlit app
431 | 5. **Protocol + Templates** flexible composition for different applications
432 | 6. **Clean examples** demonstrating the "companions as callables" pattern
433 | 7. **Practical guidance** for extending the framework
434 | 8. **Visual learning** - Sequence diagram up front, architecture docs linked
435 | 
436 | #### The goal: Show what makes Recursive Agents different and how to use it effectively.
437 | 
438 | ---
439 | 
440 | ## Built Through Collaboration
441 | 
442 | This framework emerged from intensive human-AI collaboration over 3 weeks:
443 | - Solo developer working with multiple LLMs
444 | - Built using the very patterns it now enables
445 | - The architecture mirrors the discovery process itself
446 | 
447 | The rapid development was possible because the framework design emerged naturally from the recursive dialogue process—we were building what we were already doing.
448 | 
--------------------------------------------------------------------------------
/core/chains.py:
--------------------------------------------------------------------------------
1 | # SPDX-License-Identifier: MIT
2 | #
3 | # Copyright (c) [2025] [Henry Besser]
4 | #
5 | # This software is licensed under the MIT License.
6 | # See the LICENSE file in the project root for the full license text.
7 | 
8 | # core/chains.py
9 | """
10 | Core plumbing for all Companion subclasses
11 | ------------------------------------------
12 | 
13 | * cosine_from_embeddings(va: List[float], vb: List[float]) -> float:
14 |     Cosine similarity from pre-computed OpenAI embeddings. Used to decide whether two
15 |     successive revisions are “close enough” to stop looping early.
16 | 
17 | * build_chains(t:dict[str,str], llm:ChatOpenAI)
18 |     Construct the **Init → Critique → Revision** runnables from a 5-key
19 |     template dict. Injects ``MessagesPlaceholder("history")``.
20 | 21 | * BaseCompanion 22 | ├─ __init__(llm: str | ChatOpenAI | None = None, 23 | │ *, 24 | │ templates:dict[str,str] | None = None, 25 | │ similarity_threshold:float | None = None, 26 | │ max_loops:int | None = None, 27 | │ temperature:float | None = None, 28 | │ verbose:bool = False, 29 | │ clear_history:bool | None = None, 30 | │ return_transcript:bool = False, 31 | │ **llm_kwargs) 32 | │ · Builds an internal ``ChatOpenAI`` *if* caller passes a model-name 33 | │ string (no import needed in user code). 34 | │ · Creates ``history`` & ``run_log`` containers. 35 | │ · Merges caller kwargs with class-level defaults. 36 | │ · Builds the three chains via *build_chains()*. 37 | │ 38 | ├─ loop(user_input:str) 39 | │ Returns **str** (final draft) when *return_transcript=False* 40 | │ or **(str, list)** (final, run_log) when *return_transcript=True* 41 | │ · Performs the three-phase loop ≤ max_loops 42 | │ · similarity / empty-critique early-exit 43 | │ · Appends (Human, AI) messages to ``history`` 44 | │ · Stores each {draft, critique, revision} dict in ``run_log`` 45 | │ · Auto-clears history if ``self.clear_history`` is *True* 46 | │ 47 | ├─ transcript_as_markdown() → str 48 | │ Nicely formatted view of ``run_log`` for notebooks / logs. 49 | │ 50 | └─ Public instance attributes 51 | · history list[HumanMessage|AIMessage] (cross-call) 52 | · run_log list[dict] (inner iterations, last call) 53 | · sim_thresh float effective similarity threshold 54 | · max_loops int effective loop cap 55 | · temperature float effective sampling temperature 56 | · clear_history bool auto-wipe flag in effect 57 | · verbose bool debug logging flag 58 | 59 | Design notes 60 | ============ 61 | * Templates live in ``templates/`` as text files. 62 | * Subclasses set TEMPLATES class attribute (typically using build_templates() 63 | from template_load_utils.py for DRY template composition). 64 | * All debug output is gated by ``verbose`` OR standard logging levels. 
65 | * No system prompts are stored in history; token cost stays minimal.
66 | """
67 | 
68 | from typing import Dict, Any, List, Union, Optional
69 | from numpy import dot
70 | from numpy.linalg import norm
71 | import logging
72 | 
73 | from langchain.prompts import (
74 |     ChatPromptTemplate,
75 |     SystemMessagePromptTemplate,
76 |     HumanMessagePromptTemplate,
77 |     MessagesPlaceholder,
78 | )
79 | from langchain_openai import ChatOpenAI, OpenAIEmbeddings  # ChatOpenAI can stay internal!
80 | from langchain_core.messages import HumanMessage, AIMessage
81 | 
82 | # ---------------------------------------------------------------------
83 | # ❶ Cosine similarity from embeddings helper
84 | # ---------------------------------------------------------------------
85 | 
86 | def cosine_from_embeddings(va: List[float], vb: List[float]) -> float:
87 |     """Compute cosine similarity from pre-computed embeddings."""
88 |     return dot(va, vb) / (norm(va) * norm(vb))
89 | 
90 | # ---------------------------------------------------------------------
91 | # ❷ Build three chains from template dict + LLM
92 | # ---------------------------------------------------------------------
93 | def build_chains(t: Dict[str, str], llm: ChatOpenAI):
94 |     init_prompt = ChatPromptTemplate.from_messages([
95 |         SystemMessagePromptTemplate.from_template(t["initial_sys"]),
96 |         MessagesPlaceholder(variable_name="history"),
97 |         HumanMessagePromptTemplate.from_template("{user_input}"),
98 |     ])
99 |     crit_prompt = ChatPromptTemplate.from_messages([
100 |         SystemMessagePromptTemplate.from_template(t["critique_sys"]),
101 |         HumanMessagePromptTemplate.from_template(t["critique_user"]),
102 |     ])
103 |     rev_prompt = ChatPromptTemplate.from_messages([
104 |         SystemMessagePromptTemplate.from_template(t["revision_sys"]),
105 |         HumanMessagePromptTemplate.from_template(t["revision_user"]),
106 |     ])
107 | 
108 |     return init_prompt | llm, crit_prompt | llm, rev_prompt | llm
109 | 
110 | # 
--------------------------------------------------------------------- 111 | # ❸ BaseCompanion 112 | # --------------------------------------------------------------------- 113 | class BaseCompanion: 114 | """ 115 | Three-phase critique / revision agent with optional early-exit and history. 116 | Subclasses override TEMPLATES (and optionally class defaults). 117 | """ 118 | # ---------- class-level defaults (overridable in subclass) -------- 119 | TEMPLATES: Dict[str, str] = {} 120 | MODEL_NAME: str = "gpt-4o-mini" 121 | DEFAULT_TEMPERATURE = 0.7 # sensible mid-range 122 | SIM_THRESHOLD: float = 0.98 123 | MAX_LOOPS: int = 3 124 | CLEAR_HISTORY_AFTER_CALL: bool = False # subclass can override 125 | 126 | # ---------- constructor ------------------------------------------ 127 | def __init__( 128 | self, 129 | llm: Optional[Union[str, ChatOpenAI]] = None, 130 | *, 131 | templates: Dict[str, str] | None = None, 132 | temperature: float | None = None, 133 | similarity_threshold: float | None = None, 134 | max_loops: int | None = None, 135 | clear_history: bool | None = None, 136 | return_transcript: bool = False, 137 | verbose: bool = False, 138 | embedding_model=None, 139 | **llm_kwargs: Any, # passthrough 140 | ): 141 | # merge subclass templates with caller overrides 142 | merged = {**self.TEMPLATES, **(templates or {})} 143 | 144 | # auto-instantiate if needed 145 | # ── 1. 
build / validate the LLM object ────────────────────────────── 146 | if isinstance(llm, ChatOpenAI): 147 | # caller handed us a fully-configured LLM → just use it 148 | self.llm = llm 149 | else: 150 | # we must build the LLM ourselves 151 | model_name = llm or self.MODEL_NAME # str or fallback 152 | temp = temperature if temperature is not None else self.DEFAULT_TEMPERATURE 153 | self.llm = ChatOpenAI( 154 | model_name=model_name, 155 | temperature=temp, 156 | **llm_kwargs # forwards anything extra 157 | ) 158 | 159 | # build chains 160 | self.init_chain, self.crit_chain, self.rev_chain = build_chains(merged, self.llm) 161 | 162 | # instance-level parameters (fall back to class constants) 163 | self.similarity_threshold: float = ( 164 | similarity_threshold if similarity_threshold is not None else self.SIM_THRESHOLD 165 | ) 166 | self.max_loops: int = ( 167 | max_loops if max_loops is not None else self.MAX_LOOPS 168 | ) 169 | 170 | self.clear_history = ( 171 | clear_history if clear_history is not None 172 | else self.CLEAR_HISTORY_AFTER_CALL 173 | ) 174 | # conversation history (per instance) 175 | self.history: List[Any] = [] # list[HumanMessage | AIMessage] 176 | self.run_log: list[dict[str, str]] = [] # stores per-iteration details 177 | 178 | self._emb = embedding_model or OpenAIEmbeddings() 179 | self.return_transcript = return_transcript 180 | self.verbose = verbose 181 | if self.verbose: # minimal logger setup 182 | logging.basicConfig( 183 | level=logging.DEBUG, 184 | format="%(levelname)s | %(message)s" 185 | ) 186 | # Suppress noisy HTTP client debug logs 187 | logging.getLogger("httpx").setLevel(logging.WARNING) 188 | logging.getLogger("httpcore").setLevel(logging.WARNING) 189 | logging.getLogger("openai").setLevel(logging.WARNING) 190 | 191 | # ---------- main recursive loop ----------------------------------- 192 | def loop(self, user_input: str) -> str | tuple[str, list]: 193 | if self.verbose: 194 | logging.debug("USER INPUT:\n%s", 
user_input.strip()) 195 | 196 | # Note: That keeps run_log scoped to one outer call instead of accumulating across multiple. 197 | # If you like the cumulative behaviour, skip this line. 198 | self.run_log.clear() # ← start fresh for this call 199 | 200 | # 1. initial draft 201 | draft = self.init_chain.invoke( 202 | {"user_input": user_input, "history": self.history} 203 | ).content 204 | 205 | if self.verbose: 206 | logging.debug("INITIAL DRAFT:\n%s\n", draft.strip()) 207 | 208 | prev: str | None = None # previous draft text 209 | prev_emb: list | None = None # previous embedding (starts empty) 210 | # 2. critique / revision cycles 211 | for i in range(1, self.max_loops + 1): 212 | critique = self.crit_chain.invoke( 213 | {"user_input": user_input, "draft": draft} 214 | ).content 215 | if self.verbose: 216 | logging.debug("CRITIQUE #%d:\n%s\n", i, critique.strip()) 217 | 218 | # simple phrase-based early exit? 219 | if any(p in critique.lower() for p in ("no further improvements", "minimal revisions")): 220 | self.run_log.append({"draft": draft, "critique": critique, "revision": draft}) 221 | if self.verbose: 222 | logging.debug("Early-exit phrase detected.") 223 | break 224 | 225 | # revision 226 | revised = self.rev_chain.invoke( 227 | {"user_input": user_input, "draft": draft, "critique": critique} 228 | ).content 229 | 230 | # similarity check (embed once) - Compute similarity (only if we have a previous draft) 231 | if prev is None: 232 | sim = None 233 | else: 234 | if prev_emb is None: # cache once 235 | prev_emb = self._emb.embed_query(prev) 236 | cur_emb = self._emb.embed_query(revised) 237 | sim = cosine_from_embeddings(prev_emb, cur_emb) 238 | 239 | if self.verbose: 240 | sim_display = sim if sim is not None else 0 241 | logging.debug("REVISION #%d (cosine=%.3f):\n%s\n", i, sim_display, revised.strip()) 242 | 243 | # Similarity early-exit test (no extra row inside) 244 | if sim is not None and sim >= self.similarity_threshold: 245 | # 1) record this 
converging turn
246 |                 # 2) LOG FIRST — keep before/after contrast
247 |                 self.run_log.append({
248 |                     "draft": draft, # v n-1
249 |                     "critique": critique, # critique on v n-1
250 |                     "revision": revised # v n (final)
251 |                 })
252 |                 if self.verbose:
253 |                     logging.debug("Similarity threshold reached (%.2f).", self.similarity_threshold)
254 | 
255 | 
256 |                 # update cached state *after* logging
257 |                 # update for final return
258 |                 prev = revised # cache the final text; never read again (we break next), but keeping all state consistent is good practice
259 |                 draft = revised # So caller sees final text
260 |                 prev_emb = cur_emb # cache the final embedding; never read again (we break next), but keeping all state consistent is good practice
261 |                 break # exit loop
262 | 
263 |             # not converged → prepare for next round
264 |             self.run_log.append({"draft": draft, "critique": critique, "revision": revised})
265 |             prev = draft # store current draft for next comparison
266 |             if 'cur_emb' in locals(): # only cache if we computed it
267 |                 prev_emb = cur_emb # cache embedding for next comparison
268 |             draft = revised # update draft for next iteration
269 | 
270 |         # 3. update history & return
271 |         self.history.extend([HumanMessage(user_input), AIMessage(draft)])
272 | 
273 |         if self.clear_history:
274 |             #kept = self.history.copy() # optional: return copy to caller
275 |             self.history.clear()
276 |             # choose whether to return run_log
277 |             return (draft, self.run_log) if self.return_transcript else draft
278 | 
279 |         # --- Honour return_transcript flag ---
280 |         if self.return_transcript:
281 |             return draft, self.run_log # tuple: (final answer, inner steps)
282 | 
283 |         return draft # or (draft, kept) if you want to keep self.history.copy()
284 | 
285 |     # two lines - Adding just one dunder to BaseCompanion so every
286 |     # subclass automatically behaves like a function.
287 |     # Nothing else changes; loop() is still the full three-phase engine.
288 | # (If you prefer not to touch the base class, you can add __call__ = loop in each subclass.) 289 | def __call__(self, user_input: str): 290 | """Alias so a Companion instance is itself a callable.""" 291 | return self.loop(user_input) 292 | 293 | 294 | def transcript_as_markdown(self) -> str: 295 | """Pretty-print the last run for logs or UI.""" 296 | out = [] 297 | 298 | if self.run_log: 299 | # Initial draft (from first iteration) 300 | out.append("\n" +"-" * 80 + "\n") 301 | out.append("## Initial Draft") 302 | out.append("\n" +"-" * 80 + "\n") 303 | out.append(self.run_log[0]["draft"]) 304 | 305 | # Iterations (all but last) 306 | for idx, step in enumerate(self.run_log[:-1], 1): 307 | out.append(f"## Iteration {idx}") 308 | out.append("\n" + "-" * 80 + "\n") 309 | 310 | out.append(f"### Critique {idx}") 311 | out.append("\n" + "-" * 80 + "\n") 312 | out.append(step["critique"]) 313 | 314 | out.append("\n" + f"### Revision {idx}") 315 | out.append("\n" + "-" * 80 + "\n") 316 | out.append(step["revision"]) 317 | 318 | 319 | # Last iteration (shows critique but labels revision as final answer) 320 | if len(self.run_log) > 0: 321 | last_step = self.run_log[-1] 322 | out.append(f"## Iteration {len(self.run_log)}") 323 | out.append("\n" +"-" * 80 + "\n") 324 | 325 | out.append(f"### Critique {len(self.run_log)}") 326 | out.append("\n" +"-" * 80 + "\n") 327 | out.append(last_step["critique"]) 328 | 329 | out.append("### Final Answer") 330 | out.append("\n" +"-" * 80 + "\n") 331 | out.append(last_step["revision"]) 332 | 333 | return "\n".join(out) 334 | -------------------------------------------------------------------------------- /core/streamlit_chains.py: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: MIT 2 | # 3 | # Copyright (c) [2025] [Henry Besser] 4 | # 5 | # This software is licensed under the MIT License. 6 | # See the LICENSE file in the project root for the full license text. 
7 | 
8 | # core/streamlit_chains.py
9 | """
10 | Streamlit-specific version of chains.py with live update capabilities
11 | ---------------------------------------------------------------------
12 | 
13 | Same as core/chains.py but the loop() method emits real-time updates
14 | to a Streamlit container if provided. See core/chains.py for detailed
15 | documentation of the core algorithm.
16 | 
17 | The only difference is the addition of a progress_container parameter
18 | and live UI updates during the critique/revision loop.
19 | """
20 | 
21 | from typing import Dict, Any, List, Union, Optional
22 | from numpy import dot
23 | from numpy.linalg import norm
24 | import streamlit as st
25 | import time
26 | 
27 | from langchain.prompts import (
28 |     ChatPromptTemplate,
29 |     SystemMessagePromptTemplate,
30 |     HumanMessagePromptTemplate,
31 |     MessagesPlaceholder,
32 | )
33 | from langchain_openai import ChatOpenAI, OpenAIEmbeddings # ChatOpenAI can stay internal!
34 | from langchain_core.messages import HumanMessage, AIMessage
35 | 
36 | # ---------------------------------------------------------------------
37 | # ❶ Cosine similarity from embeddings helper
38 | # ---------------------------------------------------------------------
39 | 
40 | def cosine_from_embeddings(va: List[float], vb: List[float]) -> float:
41 |     """Compute cosine similarity from pre-computed embeddings."""
42 |     return dot(va, vb) / (norm(va) * norm(vb))
43 | 
44 | # ---------------------------------------------------------------------
45 | # ❷ Build three chains from template dict + LLM
46 | # ---------------------------------------------------------------------
47 | def build_chains(t: Dict[str, str], llm: ChatOpenAI):
48 |     init_prompt = ChatPromptTemplate.from_messages([
49 |         SystemMessagePromptTemplate.from_template(t["initial_sys"]),
50 |         MessagesPlaceholder(variable_name="history"),
51 |         HumanMessagePromptTemplate.from_template("{user_input}"),
52 |     ])
53 |     crit_prompt = 
ChatPromptTemplate.from_messages([ 54 | SystemMessagePromptTemplate.from_template(t["critique_sys"]), 55 | HumanMessagePromptTemplate.from_template(t["critique_user"]), 56 | ]) 57 | rev_prompt = ChatPromptTemplate.from_messages([ 58 | SystemMessagePromptTemplate.from_template(t["revision_sys"]), 59 | HumanMessagePromptTemplate.from_template(t["revision_user"]), 60 | ]) 61 | 62 | return init_prompt | llm, crit_prompt | llm, rev_prompt | llm 63 | 64 | # --------------------------------------------------------------------- 65 | # ❸ BaseCompanion 66 | # --------------------------------------------------------------------- 67 | class StreamlitBaseCompanion: 68 | """ 69 | Three-phase critique / revision agent with optional early-exit and history. 70 | Subclasses override TEMPLATES (and optionally class defaults). 71 | """ 72 | # ---------- class-level defaults (overridable in subclass) -------- 73 | TEMPLATES: Dict[str, str] = {} 74 | MODEL_NAME: str = "gpt-4o-mini" 75 | DEFAULT_TEMPERATURE = 0.7 # sensible mid-range 76 | SIM_THRESHOLD: float = 0.98 77 | MAX_LOOPS: int = 3 78 | CLEAR_HISTORY_AFTER_CALL: bool = False # subclass can override 79 | 80 | # ---------- constructor ------------------------------------------ 81 | def __init__( 82 | self, 83 | llm: Optional[Union[str, ChatOpenAI]] = None, 84 | *, 85 | templates: Dict[str, str] | None = None, 86 | temperature: float | None = None, 87 | similarity_threshold: float | None = None, 88 | max_loops: int | None = None, 89 | clear_history: bool | None = None, 90 | return_transcript: bool = False, 91 | progress_container = None, # Streamlit container for live updates 92 | embedding_model=None, 93 | **llm_kwargs: Any, # passthrough 94 | ): 95 | # merge subclass templates with caller overrides 96 | merged = {**self.TEMPLATES, **(templates or {})} 97 | 98 | # auto-instantiate if needed 99 | # ── 1. 
build / validate the LLM object ────────────────────────────── 100 | if isinstance(llm, ChatOpenAI): 101 | # caller handed us a fully-configured LLM → just use it 102 | self.llm = llm 103 | else: 104 | # we must build the LLM ourselves 105 | model_name = llm or self.MODEL_NAME # str or fallback 106 | temp = temperature if temperature is not None else self.DEFAULT_TEMPERATURE 107 | self.llm = ChatOpenAI( 108 | model_name=model_name, 109 | temperature=temp, 110 | **llm_kwargs # forwards anything extra 111 | ) 112 | 113 | # build chains 114 | self.init_chain, self.crit_chain, self.rev_chain = build_chains(merged, self.llm) 115 | 116 | # instance-level parameters (fall back to class constants) 117 | self.similarity_threshold: float = ( 118 | similarity_threshold if similarity_threshold is not None else self.SIM_THRESHOLD 119 | ) 120 | self.max_loops: int = ( 121 | max_loops if max_loops is not None else self.MAX_LOOPS 122 | ) 123 | 124 | self.clear_history = ( 125 | clear_history if clear_history is not None 126 | else self.CLEAR_HISTORY_AFTER_CALL 127 | ) 128 | # conversation history (per instance) 129 | self.history: List[Any] = [] # list[HumanMessage | AIMessage] 130 | self.run_log: list[dict[str, str]] = [] # stores per-iteration details 131 | 132 | self.return_transcript = return_transcript 133 | self.progress_container = progress_container # Store the Streamlit container 134 | # ensure an embedding model is available for similarity-stop 135 | self._emb = embedding_model or OpenAIEmbeddings() 136 | 137 | 138 | 139 | # ---------- main recursive loop ----------------------------------- 140 | def loop(self, user_input: str) -> str | tuple[str, list]: 141 | # Note: That keeps run_log scoped to one outer call instead of accumulating across multiple. 142 | # If you like the cumulative behaviour, skip this line. 
143 | self.run_log.clear() # ← start fresh for this call 144 | 145 | # Single placeholder for all live updates 146 | if self.progress_container: 147 | content_placeholder = self.progress_container.empty() 148 | 149 | # Store all content to display 150 | all_content = { 151 | "initial": None, 152 | "iterations": [], 153 | "status": None, 154 | "final": None 155 | } 156 | 157 | # 1. initial draft 158 | draft = self.init_chain.invoke( 159 | {"user_input": user_input, "history": self.history} 160 | ).content 161 | 162 | # Live update: Show initial draft 163 | if self.progress_container: 164 | all_content["initial"] = { 165 | "user_input": user_input, 166 | "draft": draft 167 | } 168 | self._redraw_all_content(content_placeholder, all_content, current_iteration=0) 169 | 170 | prev: str | None = None # previous draft text 171 | prev_emb: list | None = None # previous embedding (starts empty) 172 | # 2. critique / revision cycles 173 | for i in range(1, self.max_loops + 1): 174 | critique = self.crit_chain.invoke( 175 | {"user_input": user_input, "draft": draft} 176 | ).content 177 | 178 | # simple phrase-based early exit? 
179 | if any(p in critique.lower() for p in ("no further improvements", "minimal revisions")): 180 | self.run_log.append({"draft": draft, "critique": critique, "revision": draft}) 181 | if self.progress_container: 182 | all_content["status"] = "✓ Early exit: No further improvements needed" 183 | self._redraw_all_content(content_placeholder, all_content, current_iteration=i) 184 | break 185 | 186 | # revision 187 | revised = self.rev_chain.invoke( 188 | {"user_input": user_input, "draft": draft, "critique": critique} 189 | ).content 190 | 191 | # similarity check (embed once) - Compute similarity (only if we have a previous draft) 192 | if prev is None: 193 | sim = None 194 | else: 195 | if prev_emb is None: # cache once 196 | prev_emb = self._emb.embed_query(prev) 197 | cur_emb = self._emb.embed_query(revised) 198 | sim = cosine_from_embeddings(prev_emb, cur_emb) 199 | 200 | # live UI row (uses *current* sim) 201 | # Add *one* UI row for this iteration 202 | if self.progress_container: 203 | all_content["iterations"].append({ 204 | "number": i, 205 | "critique": critique, 206 | "revision": revised, 207 | "similarity": sim 208 | }) 209 | self._redraw_all_content(content_placeholder, 210 | all_content, 211 | current_iteration=i) 212 | 213 | # Similarity early-exit test (no extra row inside) 214 | if sim is not None and sim >= self.similarity_threshold: 215 | # 1) record this converging turn 216 | # 2) LOG FIRST — keep before/after contrast 217 | self.run_log.append({ 218 | "draft": draft, # v n-1 219 | "critique": critique, # critique on v n-1 220 | "revision": revised # v n (final) 221 | }) 222 | 223 | # update cached state *after* logging --- don't need for web app (no inspection) 224 | # prev = revised # Cache the final text # won't be used again (since you're breaking), it's good practice to keep all state variables consistent 225 | draft = revised # so caller sees final text 226 | # prev_emb = cur_emb # Cache the final embedding # won't be used again (since 
you're breaking), it's good practice to keep all state variables consistent 227 | 228 | if self.progress_container: 229 | all_content["status"] = f"✓ Converged: Similarity threshold reached ({self.similarity_threshold:.2f})" 230 | self._redraw_all_content(content_placeholder, all_content, current_iteration=i) 231 | break # exit loop 232 | 233 | # not converged → prepare for next round 234 | self.run_log.append({"draft": draft, "critique": critique, "revision": revised}) 235 | prev = draft # store current draft for next comparison 236 | if 'cur_emb' in locals(): # only cache if we computed it 237 | prev_emb = cur_emb # cache embedding for next comparison 238 | draft = revised # update draft for next iteration 239 | 240 | # 3. update history & return 241 | self.history.extend([HumanMessage(user_input), AIMessage(draft)]) 242 | 243 | # Live update: Show final result 244 | if self.progress_container: 245 | all_content["final"] = True 246 | self._redraw_all_content(content_placeholder, all_content, final=True) 247 | 248 | if self.clear_history: 249 | #kept = self.history.copy() # optional: return copy to caller 250 | self.history.clear() 251 | # choose whether to return run_log 252 | return (draft, self.run_log) if self.return_transcript else draft 253 | 254 | # ---Honour return_transcript flag --- 255 | if self.return_transcript: 256 | return draft, self.run_log # tuple: (final answer, inner steps) 257 | 258 | return draft # or (draft, kept) if you want keep self.history.copy() 259 | 260 | # two lines - Adding just one dunder to BaseCompanion so every 261 | # subclass automatically behaves like a function. 262 | # Nothing else changes; loop() is still the full three-phase engine. 263 | # (If you prefer not to touch the base class, you can add __call__ = loop in each subclass.) 
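    # Hedged usage sketch (illustrative names — any concrete subclass that fills
    # in TEMPLATES works; the companion classes elsewhere in this repo are assumed
    # to follow this pattern):
    #     agent = SomeStreamlitCompanion(llm="gpt-4o-mini")
    #     answer = agent("Why did our app-store rating drop?") # same as agent.loop(...)
    # Because the instance is a plain callable, it also composes directly with
    # tooling that expects a function (e.g. RunnableLambda(agent)).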
264 | def __call__(self, user_input: str): 265 | """Alias so a Companion instance is itself a callable.""" 266 | return self.loop(user_input) 267 | 268 | 269 | def transcript_as_markdown(self) -> str: 270 | """Pretty-print the last run for logs or UI.""" 271 | out = [] 272 | for idx, step in enumerate(self.run_log, 1): 273 | out.append(f"### Iteration {idx}") 274 | out.append("**Draft**\n\n" + step["draft"]) 275 | out.append("\n**Critique**\n\n" + step["critique"]) 276 | out.append("\n**Revision**\n\n" + step["revision"]) 277 | out.append("\n---\n") 278 | return "\n".join(out) 279 | 280 | def _redraw_all_content(self, placeholder, content, current_iteration=None, final=False): 281 | """Redraw all content in a single placeholder following Stack Overflow pattern.""" 282 | # Clear the placeholder first 283 | placeholder.empty() 284 | 285 | # Small delay for clean transition (as recommended by Stack Overflow) 286 | time.sleep(0.01) 287 | 288 | # Use container for multiple elements (as recommended by Stack Overflow) 289 | with placeholder.container(): 290 | # Initial draft expander 291 | if content["initial"]: 292 | expanded = (current_iteration == 0) and not final 293 | with st.expander("📝 Initial Problem & Draft", expanded=expanded): 294 | st.markdown("**Your Question:**") 295 | st.markdown(f"_{content['initial']['user_input']}_") 296 | st.markdown("---") 297 | st.markdown("**Initial Draft:**") 298 | st.markdown(content['initial']['draft']) 299 | 300 | # All iterations 301 | total_iterations = len(content["iterations"]) 302 | for idx, iter_data in enumerate(content["iterations"]): 303 | # Only consider it "last" if we're in final mode (analysis complete) 304 | is_last_iteration = final and (idx == total_iterations - 1) 305 | expanded = (iter_data["number"] == current_iteration) and not final 306 | with st.expander(f"🔄 Iteration {iter_data['number']}", expanded=expanded): 307 | st.markdown(f"**Critique {iter_data['number']}:**") 308 | st.markdown(iter_data["critique"]) 
309 | 
310 |                     # Show revision unless this is the final iteration in the final display
311 |                     if not is_last_iteration:
312 |                         st.markdown("---")
313 |                         st.markdown(f"**Revision {iter_data['number']}:**")
314 |                         st.markdown(iter_data["revision"])
315 |                         if iter_data["similarity"] is not None:
316 |                             st.caption(f"_Similarity to previous: {iter_data['similarity']:.3f}_")
317 | 
318 |             # Status messages
319 |             if content["status"]:
320 |                 if "Early exit" in content["status"]:
321 |                     st.info(content["status"])
322 |                 else:
323 |                     st.success(content["status"])
324 | 
325 |             # Final message
326 |             if content["final"]:
327 |                 st.success("✓ Analysis complete! See final analysis below.")
328 | 
--------------------------------------------------------------------------------
/demos/multi_agent_langgraph_demo.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # SPDX-License-Identifier: MIT
3 | #
4 | # Copyright (c) [2025] [Henry Besser]
5 | #
6 | # This software is licensed under the MIT License.
7 | # See the LICENSE file in the project root for the full license text.
8 | 
9 | # demos/multi_agent_langgraph_demo.py
10 | """
11 | Each node is now a first-class Runnable; there is built-in tracing, concurrency, retries, etc.,
12 | without rewriting core/chains.py (the engine). In other words,
13 | Runnables/LangGraph are just an optional facade around the callables already implemented.
14 | They give observability, retries, and DAG routing when (and only when) needed, without forcing a redesign of the class-based core.
15 | 
16 | 
17 | Why bother to do this?
18 | It shows exactly how LangGraph routing works.
19 | It proves the Companions are drop-in nodes (no refactor).
20 | The transcript printout lets users see Draft → Critique → Revision for the synthesis agent as well.
21 | 
22 | Any Companion can now slot into LangChain tooling (RunnableLambda, Retry, StreamingWrapper, etc.) 
because of the __call__ alias
23 | """
24 | from langchain_core.runnables import RunnableLambda
25 | from langgraph.graph import StateGraph
26 | from typing import TypedDict
27 | from recursive_agents.base import MarketingCompanion, BugTriageCompanion, StrategyCompanion
28 | 
29 | # 1 - Wrap in a RunnableLambda
30 | llm_fast = "gpt-4o-mini"
31 | llm_deep = "gpt-4.1-mini"
32 | 
33 | mkt = MarketingCompanion(llm=llm_fast, temperature=0.8, max_loops=3, similarity_threshold=0.96)
34 | eng = BugTriageCompanion(llm=llm_deep, temperature=0.3)
35 | plan = StrategyCompanion(llm=llm_fast)
36 | 
37 | # Each node is now a first-class Runnable; you get built-in tracing, concurrency, retries, etc., without rewriting your engine.
38 | mkt_node = RunnableLambda(mkt) # __call__ alias does the trick
39 | eng_node = RunnableLambda(eng)
40 | 
41 | 
42 | # merge-lambda joins text views into one string
43 | # note: LangGraph passes the entire upstream-state dict to a node.
44 | # without this function, two upstream nodes are piped straight into strategy,
45 | # so plan_node will receive a Python dict like {"engineering": "...", "marketing": "..."}.
46 | # That's fine if your StrategyCompanion prompt expects that JSON blob,
47 | # but most of the time you'll want to concatenate the two strings first.
48 | merge_node = RunnableLambda(
49 |     lambda d: f"### Marketing\n{d['marketing']}\n\n### Engineering\n{d['engineering']}"
50 | )
51 | plan_node = RunnableLambda(plan)
52 | 
53 | 
54 | # Define the state schema for LangGraph
55 | class GraphState(TypedDict):
56 |     input: str
57 |     marketing: str
58 |     engineering: str
59 |     merged: str
60 |     final_plan: str
61 | 
62 | # Inline LangGraph example (fan-in)
63 | # No extra prompts, no schema gymnastics: simply passing text between the callables the classes already expose.
64 | graph = StateGraph(GraphState) 65 | graph.add_node("marketing_agent", lambda state: {"marketing": mkt_node.invoke(state["input"])}) 66 | graph.add_node("engineering_agent", lambda state: {"engineering": eng_node.invoke(state["input"])}) 67 | graph.add_node("merge_agent", lambda state: {"merged": merge_node.invoke(state)}) 68 | graph.add_node("strategy_agent", lambda state: {"final_plan": plan_node.invoke(state["merged"])}) 69 | 70 | graph.add_edge("marketing_agent", "merge_agent") 71 | graph.add_edge("engineering_agent", "merge_agent") 72 | graph.add_edge("merge_agent", "strategy_agent") 73 | 74 | graph.add_edge("__start__", "marketing_agent") 75 | graph.add_edge("__start__", "engineering_agent") 76 | graph.set_finish_point("strategy_agent") 77 | workflow = graph.compile() 78 | 79 | 80 | print("=" * 80) 81 | print("\n Pondering through the compiled graph workflow\n") 82 | print("=" * 80) 83 | result = workflow.invoke( 84 | {"input": "App ratings fell to 3.2★ and uploads crash on iOS 17.2. 
Diagnose & propose next steps."} 85 | ) 86 | final = result.get("final_plan", "") 87 | 88 | print("\n=== FINAL PLAN ===\n") 89 | print(final) 90 | print("=" * 80) 91 | 92 | # === After LangGraph workflow completes === 93 | print("\n🔍 DEEP INTROSPECTION - What LangGraph CAN'T normally show you:\n") 94 | # Show iteration counts 95 | print(f"Marketing iterations: {len(mkt.run_log)}") 96 | print(f"Engineering iterations: {len(eng.run_log)}") 97 | print(f"Strategy iterations: {len(plan.run_log)}") 98 | # Show why each converged 99 | print("=" * 80) 100 | print("COMPLETE CONVERGENCE ANALYSIS") 101 | print("=" * 80) 102 | 103 | for name, agent in [("Marketing", mkt), ("Engineering", eng), ("Strategy", plan)]: 104 | print(f"\n{name} Companion:") 105 | print(f" • Model: {agent.llm.model_name}") 106 | print(f" • Temperature: {agent.llm.temperature}") 107 | print(f" • Iterations: {len(agent.run_log)}/{agent.max_loops}") 108 | print(f" • Similarity threshold: {agent.similarity_threshold}") 109 | 110 | # Determine convergence type 111 | last_critique = agent.run_log[-1]['critique'].lower() 112 | if "no further improvements" in last_critique or "minimal revisions" in last_critique: 113 | convergence = "Critique-based (no improvements needed)" 114 | elif len(agent.run_log) < agent.max_loops: 115 | convergence = "Similarity-based (threshold reached)" 116 | else: 117 | convergence = "Max iterations reached" 118 | print(f" • Convergence: {convergence}") 119 | 120 | print("\n" + "=" * 80) 121 | # Want to see the last critique? Just access it directly! 
122 | print("\n Strategy's final critique (no parsing needed) (first 1000 chars):")
123 | print(f"{plan.run_log[-1]['critique'][:1000]}...")
124 | 
125 | print("=" * 80)
126 | print("\nCompare this to extracting from debug chunks - night and day!\n")
127 | print("=" * 80)
128 | # Uncomment to see the strategy agent's thinking process:
129 | #print("\n=== INNER STEPS ===\n")
130 | #print(plan.transcript_as_markdown())
131 | 
--------------------------------------------------------------------------------
/demos/multi_agent_raw_ra_demo.ipynb:
--------------------------------------------------------------------------------
1 | {
2 |  "cells": [
3 |   {
4 |    "cell_type": "code",
5 |    "execution_count": 1,
6 |    "metadata": {},
7 |    "outputs": [],
8 |    "source": [
9 |     "# SPDX-License-Identifier: MIT\n",
10 |     "#\n",
11 |     "# Copyright (c) [2025] [Henry Besser]\n",
12 |     "#\n",
13 |     "# This software is licensed under the MIT License.\n",
14 |     "# See the LICENSE file in the project root for the full license text.\n",
15 |     "\n",
16 |     "# demos/multi_agent_raw_ra_demo.ipynb"
17 |    ]
18 |   },
19 |   {
20 |    "cell_type": "markdown",
21 |    "metadata": {},
22 |    "source": [
23 |     "# Multi-Agent Demo: Pure Recursive Agents\n",
24 |     "\n",
25 |     "## Sequential Orchestration with Full Observability\n",
26 |     "\n",
27 |     "This notebook demonstrates multi-agent workflows using **pure Recursive Agents** without any external orchestration frameworks.\n",
28 |     "\n",
29 |     "### Key Insights:\n",
30 |     "1. **Full Transparency**: Every agent's thinking process is immediately accessible\n",
31 |     "2. **Sequential Execution**: Clear, debuggable flow from Marketing → Engineering → Strategy\n",
32 |     "3. 
**Zero Integration Overhead**: Companions work directly as callables\n", 33 | "\n", 34 | "### What You'll See:\n", 35 | "\n", 36 | "- How domain-specific agents analyze the same problem from different perspectives\n", 37 | "- Complete introspection into each agent's reasoning\n", 38 | "- How the Strategy companion synthesizes multiple viewpoints" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 1, 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [ 47 | "#from dotenv import load_dotenv\n", 48 | "#load_dotenv()" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 2, 54 | "metadata": {}, 55 | "outputs": [ 56 | { 57 | "name": "stdout", 58 | "output_type": "stream", 59 | "text": [ 60 | "OpenAI API Key status: Loaded\n" 61 | ] 62 | } 63 | ], 64 | "source": [ 65 | "# Check for API key\n", 66 | "import os\n", 67 | "api_key_status = \"Loaded\" if os.getenv(\"OPENAI_API_KEY\") else \"NOT FOUND - Check your .env file and environment.\"\n", 68 | "print(f\"OpenAI API Key status: {api_key_status}\")" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": 3, 74 | "metadata": {}, 75 | "outputs": [], 76 | "source": [ 77 | "# Import companions\n", 78 | "from IPython.display import Markdown, display\n", 79 | "from recursive_agents.base import MarketingCompanion, BugTriageCompanion, StrategyCompanion" 80 | ] 81 | }, 82 | { 83 | "cell_type": "markdown", 84 | "metadata": {}, 85 | "source": [ 86 | "## The Problem Scenario\n", 87 | "\n", 88 | "Let's define a realistic multi-faceted problem that requires different domain expertise:" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 4, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "# Multi-faceted problem requiring cross-functional analysis\n", 98 | "problem = (\n", 99 | " \"Since the last mobile release, picture uploads crash for many users, \"\n", 100 | " \"Instagram engagement is down 30%, and our app-store rating fell to 3.2★. 
\"\n",
101 |     " \"Why is this happening and what should we do?\")"
102 |    ]
103 |   },
104 |   {
105 |    "cell_type": "markdown",
106 |    "metadata": {},
107 |    "source": [
108 |     "## Step 1: Marketing Analysis\n",
109 |     "\n",
110 |     "First, let's get the marketing perspective with verbose mode to see the thinking process:"
111 |    ]
112 |   },
113 |   {
114 |    "cell_type": "code",
115 |    "execution_count": 5,
116 |    "metadata": {},
117 |    "outputs": [
118 |     {
119 |      "name": "stdout",
120 |      "output_type": "stream",
121 |      "text": [
122 |       "Pondering MARKETING ANALYSIS \n",
123 |       "\n"
124 |      ]
125 |     }
126 |    ],
127 |    "source": [
128 |     "# Marketing companion with verbose mode to see real-time thinking\n",
129 |     "mkt = MarketingCompanion(\n",
130 |     "    llm=\"gpt-4o-mini\", \n",
131 |     "    max_loops=3,\n",
132 |     "    similarity_threshold=0.96,\n",
133 |     "    temperature=0.9, # Higher temp for creative insights\n",
134 |     "    #verbose=True # Watch the critique/revision cycles if turned on!\n",
135 |     ")\n",
136 |     "\n",
137 |     "#print(\" MARKETING ANALYSIS (Verbose Mode On)\\n\")\n",
138 |     "#print(\"=\" * 60)\n",
139 |     "print(\"Pondering MARKETING ANALYSIS \\n\")\n",
140 |     "mkt_view = mkt.loop(problem)"
141 |    ]
142 |   },
143 |   {
144 |    "cell_type": "markdown",
145 |    "metadata": {},
146 |    "source": [
147 |     "### Marketing Convergence Details"
148 |    ]
149 |   },
150 |   {
151 |    "cell_type": "code",
152 |    "execution_count": 6,
153 |    "metadata": {},
154 |    "outputs": [
155 |     {
156 |      "name": "stdout",
157 |      "output_type": "stream",
158 |      "text": [
159 |       "\n",
160 |       " Marketing Analysis Summary:\n",
161 |       " • Iterations: 2\n",
162 |       " • Converged: Early\n",
163 |       "\n",
164 |       "Final Marketing View (first 500 chars):\n",
165 |       "The current marketing challenge presents several key signals that are directly impacting brand perception and user engagement on Instagram.\n",
166 |       "\n",
167 |       "### Key Market Signals\n",
168 |       "\n",
169 |       "1. 
**Technical Issues with Picture Uploads**: \n",
170 |       "   Since the last mobile release, many users have experienced crashes during picture uploads. This not only hampers their ability to share content but also disproportionately affects user groups that rely heavily on the app for social interaction, such as influencers and small businesses...\n"
171 |      ]
172 |     }
173 |    ],
174 |    "source": [
175 |     "print(\"\\n Marketing Analysis Summary:\")\n",
176 |     "print(f\" • Iterations: {len(mkt.run_log)}\")\n",
177 |     "print(f\" • Converged: {'Early' if len(mkt.run_log) < mkt.max_loops else 'At max loops'}\")\n",
178 |     "print(\"\\nFinal Marketing View (first 500 chars):\")\n",
179 |     "print(mkt_view[:500] + \"...\" if len(mkt_view) > 500 else mkt_view)"
180 |    ]
181 |   },
182 |   {
183 |    "cell_type": "markdown",
184 |    "metadata": {},
185 |    "source": [
186 |     "## Step 2: Engineering Analysis\n",
187 |     "\n",
188 |     "Now let's get the technical perspective (without verbose mode for cleaner output):"
189 |    ]
190 |   },
191 |   {
192 |    "cell_type": "code",
193 |    "execution_count": 7,
194 |    "metadata": {},
195 |    "outputs": [
196 |     {
197 |      "name": "stdout",
198 |      "output_type": "stream",
199 |      "text": [
200 |       "\n",
201 |       "🔧 Pondering ENGINEERING ANALYSIS\n",
202 |       "\n",
203 |       "============================================================\n"
204 |      ]
205 |     }
206 |    ],
207 |    "source": [
208 |     "# Engineering companion with different model and lower temperature\n",
209 |     "bug = BugTriageCompanion(\n",
210 |     "    llm=\"gpt-4.1-mini\", # Higher-context model for technical analysis\n",
211 |     "    temperature=0.25 # Lower temp for precise technical reasoning\n",
212 |     ")\n",
213 |     "\n",
214 |     "print(\"\\n🔧 Pondering ENGINEERING ANALYSIS\\n\")\n",
215 |     "print(\"=\" * 60)\n",
216 |     "bug_view = bug.loop(problem)"
217 |    ]
218 |   },
219 |   {
220 |    "cell_type": "code",
221 |    "execution_count": 8,
222 |    "metadata": {},
223 |    "outputs": [
224 |     {
225 |      "name": "stdout",
226 |      "output_type": "stream",
227 |      "text": [
228 |       "\n",
229 
| "Final Engineering View (first 500 chars):\n", 230 | "Since the last mobile release, three significant issues have emerged: many users experience crashes when uploading pictures, user activity related to Instagram content within the app has dropped by approximately 30%, and the app-store rating has declined to 3.2 stars. The timing of these problems suggests they may be connected, but each could also stem from different causes. Understanding these relationships requires careful analysis.\n", 231 | "\n", 232 | "**Clarifying Key Terms**\n", 233 | "\n", 234 | "- *Instagram engagement* here refe...\n" 235 | ] 236 | } 237 | ], 238 | "source": [ 239 | "print(\"\\nFinal Engineering View (first 500 chars):\")\n", 240 | "print(bug_view[:500] + \"...\" if len(bug_view) > 500 else bug_view)" 241 | ] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": {}, 246 | "source": [ 247 | "### Engineering Thinking Process" 248 | ] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "execution_count": 9, 253 | "metadata": {}, 254 | "outputs": [ 255 | { 256 | "name": "stdout", 257 | "output_type": "stream", 258 | "text": [ 259 | "\n", 260 | " Engineering Analysis Summary:\n", 261 | " • Iterations: 3\n", 262 | " • Final critique: \n", 263 | " 1. **Clarity and Focus**\n", 264 | "\n", 265 | "- The draft is generally clear but somewhat dense, especially in the diagnostic sections. 
The long bullet points under “Crash and Technical Diagnostics” and “User Engagement ...\n" 266 | ] 267 | } 268 | ], 269 | "source": [ 270 | "print(\"\\n Engineering Analysis Summary:\")\n", 271 | "print(f\" • Iterations: {len(bug.run_log)}\")\n", 272 | "print(f\" • Final critique: \\n {bug.run_log[-1]['critique'][:200]}...\")" 273 | ] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "execution_count": null, 278 | "metadata": {}, 279 | "outputs": [], 280 | "source": [ 281 | "# Show the full thinking process\n", 282 | "print(\"\\n ENGINEERING THINKING PROCESS:\")\n", 283 | "display(Markdown(bug.transcript_as_markdown()))" 284 | ] 285 | }, 286 | { 287 | "cell_type": "markdown", 288 | "metadata": {}, 289 | "source": [ 290 | "## Step 3: Strategic Synthesis\n", 291 | "\n", 292 | "Finally, let's synthesize both perspectives into an actionable plan:" 293 | ] 294 | }, 295 | { 296 | "cell_type": "code", 297 | "execution_count": 11, 298 | "metadata": {}, 299 | "outputs": [ 300 | { 301 | "name": "stdout", 302 | "output_type": "stream", 303 | "text": [ 304 | "COMBINED INPUT FOR STRATEGY first 500 chars:\n", 305 | "============================================================\n", 306 | "=== Marketing view ===\n", 307 | "The current marketing challenge presents several key signals that are directly impacting brand perception and user engagement on Instagram.\n", 308 | "\n", 309 | "### Key Market Signals\n", 310 | "\n", 311 | "1. **Technical Issues with Picture Uploads**: \n", 312 | " Since the last mobile release, many users have experienced crashes during picture uploads. 
This not only hampers their ability to share content but also disproportionately affects user groups that rely heavily on the app for social interaction, such as influence...\n" 313 | ] 314 | } 315 | ], 316 | "source": [ 317 | "# Prepare combined input for strategy synthesis\n", 318 | "combined_views = (\n", 319 | " \"=== Marketing view ===\\n\"\n", 320 | " f\"{mkt_view}\\n\\n\"\n", 321 | " \"=== Engineering view ===\\n\"\n", 322 | " f\"{bug_view}\\n\\n\"\n", 323 | " \"Merge these perspectives and propose next actions.\"\n", 324 | ")\n", 325 | "\n", 326 | "print(\"COMBINED INPUT FOR STRATEGY first 500 chars:\")\n", 327 | "print(\"=\" * 60)\n", 328 | "print(combined_views[:500] + \"...\" if len(combined_views) > 500 else combined_views)" 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "execution_count": 12, 334 | "metadata": {}, 335 | "outputs": [ 336 | { 337 | "name": "stdout", 338 | "output_type": "stream", 339 | "text": [ 340 | "\n", 341 | " Pondering STRATEGIC SYNTHESIS\n", 342 | "\n", 343 | "============================================================\n" 344 | ] 345 | } 346 | ], 347 | "source": [ 348 | "# Strategy companion to synthesize\n", 349 | "synth = StrategyCompanion(\n", 350 | " llm=\"gpt-4o-mini\",\n", 351 | " temperature=0.60 # Balanced temperature for synthesis\n", 352 | ")\n", 353 | "\n", 354 | "print(\"\\n Pondering STRATEGIC SYNTHESIS\\n\")\n", 355 | "print(\"=\" * 60)\n", 356 | "action_plan = synth.loop(combined_views)\n" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": 13, 362 | "metadata": {}, 363 | "outputs": [ 364 | { 365 | "name": "stdout", 366 | "output_type": "stream", 367 | "text": [ 368 | "\n", 369 | " Final STRATEGY SYNTHESIS of Merged Engineering and Marketing views:\n", 370 | "### Key Observations\n", 371 | "\n", 372 | "1. **Technical Issues with Picture Uploads**: Users are experiencing crashes when uploading pictures, leading to significant disengagement and negative sentiment.
These technical barriers hinder their ability to share content effectively, which is particularly detrimental to those who rely on the platform for their social interactions.\n", 373 | "\n", 374 | "2. **Decline in Engagement**: A reported 30% drop in user interactions, measured by likes, comments, and shares on posts, indicates that us...\n" 375 | ] 376 | } 377 | ], 378 | "source": [ 379 | "print(\"\\n Final STRATEGY SYNTHESIS of Merged Engineering and Marketing views:\")\n", 380 | "print(action_plan[:500] + \"...\" if len(action_plan) > 500 else action_plan)" 381 | ] 382 | }, 383 | { 384 | "cell_type": "code", 385 | "execution_count": null, 386 | "metadata": {}, 387 | "outputs": [], 388 | "source": [ 389 | "print(\"\\n Complete thinking process of STRATEGY SYNTHESIS of Merged Engineering and Marketing views:\")\n", 390 | "display(Markdown(synth.transcript_as_markdown()))" 391 | ] 392 | }, 393 | { 394 | "cell_type": "markdown", 395 | "metadata": {}, 396 | "source": [ 397 | "## Complete Convergence Analysis" 398 | ] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "execution_count": 15, 403 | "metadata": {}, 404 | "outputs": [ 405 | { 406 | "name": "stdout", 407 | "output_type": "stream", 408 | "text": [ 409 | "================================================================================\n", 410 | "COMPLETE CONVERGENCE ANALYSIS\n", 411 | "================================================================================\n", 412 | "\n", 413 | "Marketing Companion:\n", 414 | " • Model: gpt-4o-mini\n", 415 | " • Temperature: 0.9\n", 416 | " • Iterations: 2/3\n", 417 | " • Similarity threshold: 0.96\n", 418 | " • Convergence: Similarity-based (threshold reached)\n", 419 | "\n", 420 | "Engineering Companion:\n", 421 | " • Model: gpt-4.1-mini\n", 422 | " • Temperature: 0.25\n", 423 | " • Iterations: 3/3\n", 424 | " • Similarity threshold: 0.98\n", 425 | " • Convergence: Max iterations reached\n", 426 | "\n", 427 | "Strategy Companion:\n", 428 | " • Model: gpt-4o-mini\n",
429 | " • Temperature: 0.6\n", 430 | " • Iterations: 3/3\n", 431 | " • Similarity threshold: 0.97\n", 432 | " • Convergence: Max iterations reached\n", 433 | "\n", 434 | " Strategy's final critique (no parsing needed) (first 300 chars):\n", 435 | "### Critique of Draft Response\n", 436 | "\n", 437 | "1. **Clarity of Key Observations**: \n", 438 | " - The section on \"Technical Issues with Picture Uploads\" is somewhat repetitive. The phrase \"which severely affects influencers and small businesses\" appears somewhat disconnected from the overall point about user disengagement....\n" 439 | ] 440 | } 441 | ], 442 | "source": [ 443 | "print(\"=\" * 80)\n", 444 | "print(\"COMPLETE CONVERGENCE ANALYSIS\")\n", 445 | "print(\"=\" * 80)\n", 446 | "\n", 447 | "for name, agent in [(\"Marketing\", mkt), (\"Engineering\", bug), (\"Strategy\", synth)]:\n", 448 | " print(f\"\\n{name} Companion:\")\n", 449 | " print(f\" • Model: {agent.llm.model_name}\")\n", 450 | " print(f\" • Temperature: {agent.llm.temperature}\")\n", 451 | " print(f\" • Iterations: {len(agent.run_log)}/{agent.max_loops}\")\n", 452 | " print(f\" • Similarity threshold: {agent.similarity_threshold}\")\n", 453 | " \n", 454 | " # Determine convergence type\n", 455 | " last_critique = agent.run_log[-1]['critique'].lower()\n", 456 | " if \"no further improvements\" in last_critique or \"minimal revisions\" in last_critique:\n", 457 | " convergence = \"Critique-based (no improvements needed)\"\n", 458 | " elif len(agent.run_log) < agent.max_loops:\n", 459 | " convergence = \"Similarity-based (threshold reached)\"\n", 460 | " else:\n", 461 | " convergence = \"Max iterations reached\"\n", 462 | " print(f\" • Convergence: {convergence}\")\n", 463 | "\n", 464 | " # Want to see the last critique? 
Just access it directly!\n", 465 | "print(\"\\n Strategy's final critique (no parsing needed) (first 300 chars):\")\n", 466 | "print(f\"{synth.run_log[-1]['critique'][:300]}...\")" 467 | ] 468 | }, 469 | { 470 | "cell_type": "markdown", 471 | "metadata": {}, 472 | "source": [ 473 | "## Summary: The Power of Pure RA\n", 474 | "\n", 475 | "This demo showcased how Recursive Agents enables sophisticated multi-agent workflows with just simple Python:\n", 476 | "\n", 477 | "- **No frameworks required** - Just instantiate companions and call them\n", 478 | "- **Full observability built-in** - Every agent's thinking is preserved and accessible\n", 479 | "- **Flexible configuration** - Different models, temperatures, and thresholds per agent\n", 480 | "- **`transcript_as_markdown()`** provides publication-ready formatting\n", 481 | "\n", 482 | "The three-phase architecture (Draft → Critique → Revision) ensures thoughtful, refined outputs while maintaining complete transparency into the reasoning process." 
483 | ] 484 | }, 485 | { 486 | "cell_type": "code", 487 | "execution_count": 16, 488 | "metadata": {}, 489 | "outputs": [ 490 | { 491 | "name": "stdout", 492 | "output_type": "stream", 493 | "text": [ 494 | "\n", 495 | "📈 FINAL STATISTICS:\n", 496 | "========================================\n", 497 | "Total iterations across all agents: 8\n", 498 | "Average iterations per agent: 2.7\n" 499 | ] 500 | } 501 | ], 502 | "source": [ 503 | "# Summary statistics\n", 504 | "print(\"\\n📈 FINAL STATISTICS:\")\n", 505 | "print(\"=\" * 40)\n", 506 | "total_iterations = len(mkt.run_log) + len(bug.run_log) + len(synth.run_log)\n", 507 | "print(f\"Total iterations across all agents: {total_iterations}\")\n", 508 | "print(f\"Average iterations per agent: {total_iterations/3:.1f}\")" 509 | ] 510 | } 511 | ], 512 | "metadata": { 513 | "kernelspec": { 514 | "display_name": "recursive", 515 | "language": "python", 516 | "name": "python3" 517 | }, 518 | "language_info": { 519 | "codemirror_mode": { 520 | "name": "ipython", 521 | "version": 3 522 | }, 523 | "file_extension": ".py", 524 | "mimetype": "text/x-python", 525 | "name": "python", 526 | "nbconvert_exporter": "python", 527 | "pygments_lexer": "ipython3", 528 | "version": "3.13.3" 529 | } 530 | }, 531 | "nbformat": 4, 532 | "nbformat_minor": 4 533 | } 534 | -------------------------------------------------------------------------------- /demos/multi_agent_raw_ra_demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # SPDX-License-Identifier: MIT 3 | # 4 | # Copyright (c) [2025] [Henry Besser] 5 | # 6 | # This software is licensed under the MIT License. 7 | # See the LICENSE file in the project root for the full license text. 8 | 9 | # demos/multi_agent_raw_ra_demo.py 10 | """ 11 | Multi-agent demo pipeline (Pure RA) 12 | =================================== 13 | Runs a realistic workflow using pure Recursive Agents (no LangGraph): 14 | 15 | 1.
**MarketingCompanion** - surfaces audience-level symptoms. 16 | 2. **BugTriageCompanion** - surfaces engineering/root-cause clues. 17 | 3. **StrategyCompanion** - merges both views into a single action plan. 18 | 19 | Usage 20 | ----- 21 | $ OPENAI_API_KEY=sk-... python demos/multi_agent_raw_ra_demo.py 22 | 23 | The script prints: 24 | • Each domain-specific analysis (debug on for Marketing). 25 | • The synthesised cross-functional action plan. 26 | 27 | Edit the ``problem`` string or swap in other Companion subclasses to test 28 | additional domains. 29 | """ 30 | 31 | from recursive_agents.base import MarketingCompanion 32 | from recursive_agents.base import BugTriageCompanion 33 | from recursive_agents.base import StrategyCompanion 34 | 35 | # Multi-agent demo (shared problem → parallel lenses → synthesis) 36 | # --------------------------------------------------------------- 37 | problem = ( 38 | "Since the last mobile release, picture uploads crash for many users, " 39 | "Instagram engagement is down 30 %, and our app-store rating fell to 3.2★. " 40 | "Why is this happening and what should we do?" 41 | ) 42 | 43 | # 1) Independent domain analyses (different models / settings) 44 | mkt = MarketingCompanion(llm="gpt-4o-mini", temperature=0.9, max_loops=3, similarity_threshold=0.96, verbose=True) # fast, cheap, show debug 45 | bug = BugTriageCompanion(llm="gpt-4.1-mini", temperature=0.25) # higher-context model 46 | 47 | print("\n" + "=" * 80) 48 | print("\n Pondering Marketing Analysis Verbose ON\n") 49 | print("=" * 80) 50 | mkt_view = mkt.loop(problem) 51 | 52 | print("\n" + "=" * 80) 53 | print("\nFinal Marketing Analysis (first 500 chars):") 54 | print(mkt_view[:500] + "..."
if len(mkt_view) > 500 else mkt_view) 55 | 56 | 57 | print("\n" + "=" * 80) 58 | print("\n Pondering Engineering Analysis Verbose OFF\n") 59 | print("=" * 80) 60 | bug_view = bug.loop(problem) 61 | 62 | 63 | print("\nFinal Engineering Analysis (first 500 chars):") 64 | print(bug_view[:500] + "..." if len(bug_view) > 500 else bug_view) 65 | 66 | 67 | # 2) Merge perspectives for the synthesis step 68 | combined_views = ( 69 | "=== Marketing view ===\n" 70 | f"{mkt_view}\n\n" 71 | "=== Engineering view ===\n" 72 | f"{bug_view}\n\n" 73 | "Merge these perspectives and propose next actions." 74 | ) 75 | 76 | # 3) Synthesis agent produces the cross-functional plan 77 | synth = StrategyCompanion(llm="gpt-4o-mini", temperature=0.55) 78 | 79 | print("\n" + "=" * 80) 80 | print("\n Pondering a Synthesized Action Plan of Previous Views - Verbose OFF\n") 81 | print("=" * 80) 82 | action_plan = synth.loop(combined_views) 83 | 84 | 85 | 86 | print("\nFinal Synthesized Action Plan - full thinking process in raw markdown:") 87 | print(synth.transcript_as_markdown()) 88 | 89 | 90 | # Show convergence analysis 91 | print("\n" + "=" * 80) 92 | print("COMPLETE CONVERGENCE ANALYSIS") 93 | print("=" * 80) 94 | 95 | for name, agent in [("Marketing", mkt), ("Engineering", bug), ("Strategy", synth)]: 96 | print(f"\n{name} Companion:") 97 | print(f" • Model: {agent.llm.model_name}") 98 | print(f" • Temperature: {agent.llm.temperature}") 99 | print(f" • Iterations: {len(agent.run_log)}/{agent.max_loops}") 100 | print(f" • Similarity threshold: {agent.similarity_threshold}") 101 | 102 | # Determine convergence type 103 | last_critique = agent.run_log[-1]['critique'].lower() 104 | if "no further improvements" in last_critique or "minimal revisions" in last_critique: 105 | convergence = "Critique-based (no improvements needed)" 106 | elif len(agent.run_log) < agent.max_loops: 107 | convergence = "Similarity-based (threshold reached)" 108 | else: 109 | convergence = "Max iterations reached" 110 |
print(f" • Convergence: {convergence}") 111 | -------------------------------------------------------------------------------- /docs/LangGraph_RA_comp.md: -------------------------------------------------------------------------------- 1 | ```python 2 | # SPDX-License-Identifier: MIT 3 | # 4 | # Copyright (c) [2025] [Henry Besser] 5 | # 6 | # This software is licensed under the MIT License. 7 | # See the LICENSE file in the project root for the full license text. 8 | ``` 9 | # LangGraph & Recursive Agents: Complete Agent Observability 10 | 11 | ## The Missing Piece in Agent Development 12 | 13 | When an LLM agent produces unexpected output, developers need to understand both what happened and why. Current tools excel at workflow orchestration but lack visibility into agent reasoning. 14 | 15 | Recursive Agents fills this gap. RA agents are Python callables that automatically maintain their complete thinking history - every iteration, critique, and refinement is preserved for inspection. 16 | 17 | ### Clean Architecture 18 | 19 | RA takes a different approach - no complex class hierarchies, no message schemas, no framework-specific primitives: 20 | 21 | ```python 22 | # RA: Just a callable 23 | agent = MarketingCompanion() 24 | result = agent("Why did sales drop?") # That's it! 25 | 26 | # Compare to typical LangChain patterns: 27 | # - Need SystemMessage, HumanMessage, AIMessage classes 28 | # - Must understand Chains, Agents, Tools, Memory abstractions 29 | # - Requires specific invoke() patterns and schemas 30 | # - Deep nesting: chain.steps[0].agent.memory.messages[0].content 31 | 32 | # RA works everywhere without modification: 33 | result = agent.loop("...") # Direct call 34 | node = RunnableLambda(agent) # Instant LangGraph node 35 | response = await agent_api(agent) # Web service ready 36 | ``` 37 | 38 | **No nested data structures**. Access what you need directly: `agent.run_log`, `agent.history`. No digging through layers of abstractions. 
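Because a companion is a plain string-in, string-out callable, anything that can call a function can use one. A minimal sketch of that composability, with stub functions standing in for real companions (the stub and helper names here are illustrative, not part of the RA API):

```python
# Stubs standing in for companions: any string -> string callable fits the contract.
def stub_marketing(problem: str) -> str:
    return f"marketing view of: {problem}"

def stub_engineering(problem: str) -> str:
    return f"engineering view of: {problem}"

# A tiny fan-out helper -- nothing framework-specific, just function composition.
def fan_out(agents, text: str) -> str:
    return "\n\n".join(agent(text) for agent in agents)

combined = fan_out([stub_marketing, stub_engineering], "Why did sales drop?")
print(combined)
```

A real companion instance could replace either stub unchanged, since it exposes the same callable interface.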
39 | 40 | ## Executive Summary 41 | 42 | Building reliable AI systems requires understanding both workflow execution and agent reasoning. Today's tools only solve half this equation. 43 | 44 | - **LangGraph** excels at **workflow orchestration** - managing how agents connect, execute in parallel, handle errors, and pass data between nodes. It shows you the "what" of your system. 45 | 46 | - **Recursive Agents** provides **thinking transparency** - agents that automatically maintain their complete reasoning history through iterative refinement. It shows you the "why" behind every decision. 47 | 48 | **The Key Insight**: RA agents work as drop-in LangGraph nodes. You don't choose between tools - you get workflow orchestration AND thinking transparency in one system. 49 | 50 | ## What Each Tool Does Best 51 | 52 | ### LangGraph's Strengths (Workflow Orchestration) 53 | 54 | ```python 55 | # LangGraph excels at complex workflows 56 | workflow = StateGraph() 57 | workflow.add_conditional_edges(...) # Conditional routing 58 | workflow.stream(...) # Streaming execution 59 | workflow.batch(...) # Batch processing 60 | 61 | # With stream_mode="debug", you see: 62 | # - Task scheduling and execution order 63 | # - Node inputs/outputs 64 | # - State transitions 65 | # - Parallel execution timing 66 | # - Error handling and retries 67 | ``` 68 | 69 | ### Recursive Agents' Addition (Thinking Transparency) 70 | 71 | ```python 72 | # RA adds introspection to any node 73 | mkt = MarketingCompanion() 74 | mkt_node = RunnableLambda(mkt) 75 | 76 | # After workflow runs, access thinking history: 77 | print(len(mkt.run_log)) # Number of iterations 78 | print(mkt.transcript_as_markdown()) # Full thinking process 79 | print(mkt.run_log[-1]["critique"]) # Why it stopped iterating 80 | # This data isn't available through LangGraph alone 81 | ``` 82 | 83 | ## Observability Comparison 84 | 85 | ### LangGraph's Observability Features: 86 | 87 | 1.
**Workflow Execution Visibility** 88 | ```python 89 | # stream_mode="debug" provides structured debug events 90 | for chunk in workflow.stream({"input": "..."}, stream_mode="debug"): 91 | # chunk is a dict with type, timestamp, payload 92 | print(chunk) # {'type': 'task', 'payload': {...}} 93 | ``` 94 | 95 | 2. **State Management & Tracking** 96 | ```python 97 | result = workflow.invoke({"input": "..."}) 98 | # Access complete state, all node outputs, transformations 99 | ``` 100 | 101 | 3. **Visual Debugging & Graph Structure** 102 | ```python 103 | graph.get_graph().draw_mermaid() # Interactive workflow diagram 104 | # Understand data flow, parallelism, dependencies 105 | ``` 106 | 107 | 4. **Built-in Features** 108 | - Streaming support for real-time monitoring 109 | - Error handling and retry visibility 110 | - Conditional routing transparency 111 | - Checkpointing and persistence 112 | 113 | **Important Note**: While `stream_mode="debug"` provides structured debug events, you still need to parse nested payloads to extract specific information. The debug stream shows workflow orchestration details but doesn't include agent reasoning, critique/revision cycles, or convergence patterns. 
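To make that parsing burden concrete, here is a sketch of the extraction loop such code typically requires. The events below are hand-written stand-ins shaped like the `{'type': ..., 'payload': {...}}` examples above, not real LangGraph output, and the payload fields (`name`, `result`) are assumptions for illustration:

```python
# Hand-written stand-ins for debug events; a real run would collect these
# from workflow.stream(..., stream_mode="debug").
debug_chunks = [
    {"type": "task", "payload": {"name": "marketing", "input": "..."}},
    {"type": "task_result", "payload": {"name": "marketing", "result": [("output", "draft text")]}},
    {"type": "task_result", "payload": {"name": "strategy", "result": [("output", "final plan")]}},
]

# The burden: filter every chunk by event type, then navigate the nested
# payload to pull out the piece you actually wanted.
results = {}
for chunk in debug_chunks:
    if chunk.get("type") == "task_result":
        payload = chunk["payload"]
        results[payload["name"]] = dict(payload["result"])["output"]

print(results["strategy"])  # node output recovered -- but still no reasoning history
```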
114 | 115 | ### Key Differences 116 | 117 | Standard LangChain/LangGraph patterns involve: 118 | - SystemMessage, HumanMessage, AIMessage classes 119 | - TypedDict schemas and nested payloads 120 | - Framework-specific patterns 121 | - Deep nesting: `workflow.nodes[0].state['messages'][0].content` 122 | 123 | RA provides: 124 | - Direct access: `agent.run_log`, `agent.history` 125 | - Universal compatibility across environments 126 | - String in, string out interface 127 | - Automatic history tracking 128 | 129 | ## Accessing Results: Structured Debug Data vs Always-Available Introspection 130 | 131 | ### LangGraph's Debug Stream Approach 132 | ```python 133 | # Using stream_mode="debug" for structured debug data 134 | debug_chunks = [] 135 | for chunk in workflow.stream(input, stream_mode="debug"): 136 | debug_chunks.append(chunk) # Capture structured debug events 137 | 138 | # You get debug events like: 139 | # {'type': 'task', 'payload': {'name': 'agent_name', ...}} 140 | # {'type': 'task_result', 'payload': {'result': [(...)]}} 141 | 142 | # But to extract specific results requires parsing: 143 | for chunk in debug_chunks: 144 | if chunk.get('type') == 'task_result': 145 | # Navigate nested structure to find what you need 146 | pass 147 | 148 | # Note: Still no access to agent reasoning or iterations 149 | ``` 150 | 151 | ### RA Direct Access (Automatic & Complete) 152 | 153 | The key difference: RA agents automatically maintain their history with zero configuration. 
154 | 155 | ```python 156 | # Create agents - that's it, introspection is built-in 157 | mkt = MarketingCompanion() 158 | eng = BugTriageCompanion() 159 | strategy = StrategyCompanion() 160 | 161 | # Use normally - no special flags, modes, or configuration 162 | result = workflow.invoke(input) 163 | 164 | # Complete history is automatically available: 165 | mkt.run_log # Every iteration preserved 166 | eng.run_log # All critiques and revisions 167 | strategy.run_log # Full thinking evolution 168 | 169 | # Direct access to any detail: 170 | len(strategy.run_log) # Iteration count 171 | strategy.run_log[-1]['critique'] # Why it stopped 172 | strategy.transcript_as_markdown() # Human-readable history 173 | 174 | # Zero overhead, always on, no trade-offs 175 | 176 | # BONUS: Formatted output ready for humans 177 | print(strategy.transcript_as_markdown()) 178 | # Outputs: 179 | # ### Iteration 1 180 | # **Draft** 181 | # [Initial analysis...] 182 | # **Critique** 183 | # [What could be improved...] 184 | # **Revision** 185 | # [Enhanced analysis...] 186 | # ... (continues for all iterations) 187 | ``` 188 | 189 | ## How Others Try to Add Observability 190 | 191 | ### The Fundamental Challenge 192 | 193 | Without built-in introspection, developers resort to painful workarounds: 194 | 195 | **Option 1: Manual State Tracking** 196 | Add custom fields to TypedDict schemas, implement logging in every node, and carefully pass metadata through the entire graph. Complex, error-prone, and still incomplete. 
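A sketch of what Option 1 tends to look like in practice (the schema and node below are hypothetical): every node must thread a hand-maintained log field through the graph, and any node that forgets the append silently drops its history:

```python
from typing import TypedDict

# Custom schema with a hand-maintained log field (hypothetical shape).
class State(TypedDict):
    input: str
    output: str
    thinking_log: list  # manual bookkeeping every node must remember to update

def marketing_node(state: State) -> State:
    draft = f"marketing take on: {state['input']}"
    # The append IS the observability; omit it once and that step vanishes.
    state["thinking_log"].append({"node": "marketing", "draft": draft})
    return {**state, "output": draft}

state: State = {"input": "sales dropped", "output": "", "thinking_log": []}
state = marketing_node(state)
print(state["thinking_log"])  # only what nodes remembered to record
```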
197 | 198 | **Option 2: Callbacks** 199 | ```python 200 | from langchain.callbacks import FileCallbackHandler 201 | handler = FileCallbackHandler("./logs.txt") 202 | workflow.invoke({"input": "..."}, config={"callbacks": [handler]}) 203 | # Just logs raw LLM calls, no structured thinking history 204 | ``` 205 | 206 | **Option 3: External Services** 207 | ```python 208 | # LangSmith - requires API key, costs money 209 | # Still doesn't capture iterative refinement process 210 | ``` 211 | 212 | None of these approaches provide the automatic, structured thinking history that RA delivers out of the box. 213 | 214 | ## RA + LangGraph: Best of Both Worlds 215 | 216 | ### Zero-Friction Integration 217 | 218 | ```python 219 | # Create companions 220 | mkt = MarketingCompanion() 221 | eng = BugTriageCompanion() 222 | plan = StrategyCompanion() 223 | 224 | # Use with LangGraph 225 | mkt_node = RunnableLambda(mkt) 226 | eng_node = RunnableLambda(eng) 227 | plan_node = RunnableLambda(plan) 228 | ``` 229 | 230 | ### What You Can Inspect (Automatically!) 231 | 232 | ```python 233 | # After workflow runs 234 | result = workflow.invoke({"input": "App crashed, users leaving"}) 235 | 236 | # 1. Overall flow (LangGraph) 237 | print(result) # Final outputs 238 | 239 | # 2. Direct attribute access - no digging through nested structures 240 | print(f"Marketing iterations: {len(mkt.run_log)}") # Straightforward! 241 | print(f"Final output: {mkt.history[-1].content}") # Direct! 242 | 243 | # Compare to typical LangChain access patterns: 244 | # state['nodes']['marketing']['memory']['chat_memory']['messages'][-1]['content'] 245 | # workflow.memory.chat_memory.messages[-1].content 246 | # chain.steps[0].outputs[0].generations[0][0].text 247 | 248 | # 3. 
Everything is just attributes on the agent instance 249 | mkt.run_log # Complete history 250 | mkt.history # Conversation memory 251 | mkt.llm # The actual model 252 | mkt.max_loops # Configuration 253 | # No framework wrappers, no nested state, no schemas 254 | 255 | # 4. Full thinking traces 256 | print(mkt.transcript_as_markdown()) # Complete reasoning history 257 | print(eng.transcript_as_markdown()) # All iterations preserved 258 | print(plan.transcript_as_markdown()) # Synthesis process visible 259 | ``` 260 | 261 | 262 | ## Real-World Impact 263 | 264 | ### Case 1: Production Debugging 265 | 266 | Your customer success agent gives inappropriate advice. With standard tools, you're blind. With RA: 267 | 268 | ```python 269 | # Instant root cause analysis 270 | print(agent.transcript_as_markdown()) 271 | 272 | # Output shows: 273 | # Iteration 1: Agent misunderstood context 274 | # Critique: "Missing customer's actual pain point" 275 | # Iteration 2: Better but still generic 276 | # Critique: "Needs specific technical details" 277 | # Iteration 3: Addressed the real issue 278 | 279 | # Now you know exactly what went wrong and can fix it 280 | ``` 281 | 282 | ### Case 2: Quality Assurance 283 | 284 | You need to ensure agents meet quality standards before deployment: 285 | 286 | ```python 287 | def validate_agent_quality(agent, test_cases): 288 | for test in test_cases: 289 | agent(test.input) 290 | 291 | # Check reasoning quality 292 | if len(agent.run_log) == 1: 293 | print(f" No refinement for: {test.input}") 294 | 295 | # Verify critique thoroughness 296 | critiques = [step["critique"] for step in agent.run_log] 297 | if any(len(c) < 100 for c in critiques): 298 | print(f" Shallow critique detected") 299 | 300 | # Ensure convergence quality 301 | if len(agent.run_log) >= agent.max_loops: 302 | print(f" Hit iteration limit - may need tuning") 303 | ``` 304 | 305 | ### Performance Monitoring 306 | 307 | **Without RA:** 308 | Track execution time and token 
counts. That's about it. 309 | 310 | **With RA:** 311 | ```python 312 | # Rich metrics for monitoring 313 | metrics = { 314 | "agent": "strategy", 315 | "iterations": len(plan.run_log), 316 | "convergence_type": "similarity" if len(plan.run_log) < plan.max_loops else "max_loops", 317 | "critique_evolution": [len(s["critique"]) for s in plan.run_log], 318 | "thinking_depth": sum(len(s["revision"]) for s in plan.run_log) 319 | } 320 | ``` 321 | 322 | ## The Synergy: Better Together 323 | 324 | **Together = Complete Observability:** 325 | - See both the workflow (LangGraph) AND the thinking (RA) 326 | - Zero integration overhead - RA agents work directly as LangGraph nodes 327 | 328 | Your companions become "thoughts-included" nodes that enhance any LangGraph workflow with deep introspection capabilities. 329 | 330 | 331 | ## Start Building Transparent AI Systems Today 332 | 333 | Recursive Agents transforms opaque LLM calls into transparent reasoning processes. Whether you're debugging production issues, ensuring quality standards, or optimizing agent performance, RA gives you the visibility you need. The future of AI development isn't about choosing between tools - it's about combining the right ones. LangGraph handles the "what," RA reveals the "why," and together they enable more observable AI systems. -------------------------------------------------------------------------------- /docs/RA_architecture.md: -------------------------------------------------------------------------------- 1 | ```python 2 | # SPDX-License-Identifier: MIT 3 | # 4 | # Copyright (c) [2025] [Henry Besser] 5 | # 6 | # This software is licensed under the MIT License. 7 | # See the LICENSE file in the project root for the full license text. 8 | ``` 9 | 10 | # Recursive Agents Architecture 11 | 12 | ## Overview 13 | 14 | Recursive Agents implements a modular architecture where agents automatically critique and refine their outputs through a three-phase iterative process. 
This document details the system design and component interactions. 15 | 16 | ## System Architecture 17 | 18 | ![RA Architecture](../images/RA_Architecture.svg) 19 | 20 | ## Key ideas before getting into the nitty gritty 21 | 22 | ### Introspection Capabilities 23 | 24 | Every Companion maintains: 25 | - `history`: Conversation memory (HumanMessage/AIMessage pairs) 26 | - `run_log`: Detailed iteration data with drafts, critiques, revisions 27 | - `transcript_as_markdown()`: Formatted view of the thinking process 28 | 29 | This data persists after execution, enabling debugging and analysis even in complex workflows. 30 | 31 | ### Design Principles 32 | 33 | 1. **Separation of Concerns** 34 | - Templates define behavior (look in ```templates/``` folder and ```recursive_agents/template_load_utils.py```) 35 | - Engine provides mechanics (look in ```core/chains.py```) 36 | - Companions specialize domains (look in ```recursive_agents/base.py```) 37 | 38 | 2. **Composability** 39 | - Each layer works independently 40 | - Components mix without conflicts 41 | - New domains require minimal code 42 | 43 | 3. **Transparency** 44 | - All decisions are traceable 45 | - No hidden state 46 | - Full inspection capability 47 | 48 | ### Extension Points 49 | 50 | The architecture supports extension through: 51 | - New companion classes (one template file + minimal code) 52 | - Custom protocols (different reasoning patterns) 53 | - Alternative templates (critique/revision strategies) 54 | - UI integrations (beyond Streamlit) 55 | 56 | All extensions inherit the core capabilities without reimplementation. 57 | 58 | -------------------------------------------- 59 | ## The architecture consists of three main layers: 60 | 61 | ### 1.
Template Layer (```templates/``` folder) 62 | 63 | The system uses five templates that define agent behavior: 64 | 65 | **System Templates** (define the agent's identity and approach): 66 | - **initial_sys** - Sets the agent's domain expertise and initial response style 67 | - **critique_sys** - Defines how the agent should analyze and critique drafts 68 | - **revision_sys** - Guides how the agent improves based on critiques 69 | 70 | **User Templates** (structure the specific task): 71 | - **critique_user** - The format for presenting drafts to be critiqued 72 | - **revision_user** - The format for presenting drafts + critiques for revision 73 | 74 | **Protocol Context**: 75 | - **protocol_context.txt** - A strategic reasoning framework that gets injected into all system templates via the `{context}` placeholder, providing consistent analytical depth 76 | 77 | All templates are plain text files in the `templates/` directory, making them easy to modify without touching code. 78 | 79 | This protocol (```templates/protocol_context.txt```) transforms every companion into a pattern-discovery engine. Problems know their own solutions—we just create conditions for revelation. 80 | 81 | ### 2. 
Engine Layer (BaseCompanion in ```core/chains.py```) 82 | The core engine provides: 83 | 84 | 85 | (See the comprehensive docstring at the top of chains.py for design philosophy and detailed documentation) 86 | 87 | **Key Methods:** 88 | - `__init__()` - Accepts llm (string or ChatOpenAI), templates, similarity_threshold, max_loops, temperature, return_transcript, verbose 89 | - `loop(user_input)` - Executes the three-phase refinement process, returns final answer (or tuple with run_log) 90 | - `__call__()` - Alias for loop(), making companions callable like functions 91 | - `transcript_as_markdown()` - Formats run_log for human reading 92 | 93 | **Instance Attributes:** 94 | - `history` - Conversation memory (HumanMessage/AIMessage pairs) 95 | - `run_log` - Detailed record of all iterations (drafts, critiques, revisions) 96 | - `max_loops`, `similarity_threshold` - Convergence parameters 97 | 98 | **Internal Functions:** 99 | - `build_chains()` - Constructs three LangChain chains from templates 100 | - `cosine_from_embeddings()` - Calculates similarity between text embeddings 101 | 102 | ### 3. Domain Layer (```recursive_agents/base.py```) 103 | Companion classes inherit from BaseCompanion: 104 | - GenericCompanion - Domain-agnostic baseline 105 | - MarketingCompanion - Overrides initial_sys for marketing expertise 106 | - BugTriageCompanion - Overrides initial_sys for engineering analysis 107 | - StrategyCompanion - Overrides initial_sys for cross-functional synthesis 108 | 109 | Each companion can override class-level defaults (MAX_LOOPS, SIM_THRESHOLD, etc.) while inheriting the full engine. 110 | 111 | (Note: Each companion class includes detailed docstrings with usage examples) 112 | 113 | ## Three-Phase Process 114 | 115 | ![Sequence Flow](../images/Sequence_Summary.svg) 116 | 117 | The iterative process (each `|` represents a LangChain chain combining prompt + LLM): 118 | 119 | 1. 
**Initial Draft Generation** 120 | - Responds to the user's question or problem 121 | - Generates initial analysis based on domain expertise 122 | - (Technical: Can incorporate conversation history via MessagesPlaceholder if needed) 123 | 124 | 2. **Critique Generation** 125 | - Analyzes the draft for weaknesses 126 | - Uses separate critique prompt without history 127 | 128 | 3. **Revision Generation** 129 | - Improves based on critique 130 | - Creates new version addressing identified issues 131 | 132 | This cycle repeats until: 133 | - Cosine similarity of embeddings between iterations > threshold (default 0.98) 134 | - Maximum iterations reached 135 | 136 | ## Template Composition 137 | 138 | The `build_templates()` utility in `recursive_agents/template_load_utils.py` enables sophisticated composition: 139 | 140 | ```python 141 | def build_templates(**overrides): 142 | # Loads default templates 143 | # Applies any overrides 144 | # Injects protocol into system templates 145 | # Returns complete template dict 146 | ``` 147 | 148 | Key features: 149 | - Override only what changes (typically just initial_sys) 150 | - Protocol injection is automatic for system templates 151 | - User templates remain protocol-free 152 | - Complete flexibility in composition 153 | 154 | *Modularity isn't confined to the code alone: you can change how agents obtain the protocol (and the system and user prompts as well) by modifying the txt files and editing ```recursive_agents/template_load_utils.py```:* 155 | 156 | ```python 157 | # Your protocol shapes thinking, but WHERE it applies is flexible: 158 | if key.endswith("_sys"): # Default: protocols guide system identity 159 | content = content.format(context=protocol_context) 160 | 161 | # But you could: 162 | # - Inject only into critique phase for "guided criticism" 163 | # - Use different protocols for different phases 164 | # - Skip protocols entirely for rapid iteration 165 | # - Compose multiple protocols dynamically
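# For instance, a minimal sketch of "different protocols for different
# phases" (illustrative names only; `inject_protocols` and
# `phase_protocols` are not part of the framework):
def inject_protocols(templates, phase_protocols, default_protocol):
    """Format each *_sys template with its own protocol, else the default."""
    result = {}
    for key, content in templates.items():
        if key.endswith("_sys"):
            context = phase_protocols.get(key, default_protocol)
            content = content.format(context=context)
        result[key] = content
    return result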
166 | ``` 167 | ##### Advanced: Rethinking the Protocol Layer 168 | 169 | ```python 170 | # Morning vs Evening protocols 171 | build_templates(protocol="exploration_protocol" if morning else "convergence_protocol") 172 | 173 | # Phase-specific protocols 174 | build_templates(critique_sys="harsh_critique", critique_protocol="academic_rigor") 175 | 176 | # Protocol-free companions for baseline comparison 177 | build_templates(skip_protocol=True) 178 | ``` 179 | -------------------------------------------------------------------------------- /images/RA_Architecture.svg: -------------------------------------------------------------------------------- [SVG markup omitted. Diagram title: "The Example's Architecture in One Picture". It shows the 5 templates (initial_sys, critique_sys, revision_sys, critique_user, revision_user) and 1 protocol (protocol_context) feeding BaseCompanion, the engine: build_chains() runs in __init__ to create the Draft | LLM, Critique | LLM, and Revision | LLM chains, and loop() iterates up to MAX_LOOPS with a similarity stop. Subclasses inherit all functionality: GenericCompanion uses all 5 generic templates, while MarketingCompanion, BugTriageCompanion, and StrategyCompanion each swap in their own initial_sys (plus protocol_context) and take the other 4 from GENERIC_TEMPLATES; any number of domains is possible. After instantiation: Draft → Critique → Revision → Final draft, with history / run_log available for UI and tests. The picture also sketches an example multi-agent orchestration: a Marketing view (MarketingCompanion, gpt-4o-mini, temp 0.8) and an Engineering view (BugTriageCompanion, gpt-4.1, temp 0.3) are merged and passed to StrategyCompanion as a synthesis layer, with history / run_log kept for audit and tuning.] -------------------------------------------------------------------------------- /images/Sequence_Summary.svg: -------------------------------------------------------------------------------- [SVG markup omitted. Diagram title: "Sequence Summary". The user input prompt flows through init_prompt | LLM (Generates), then critique_prompt | LLM (Critiques), then revision_prompt | LLM (Revises), repeating up to MAX_LOOPS with a similarity stop.] -------------------------------------------------------------------------------- /images/Streamlit_App_Screenshot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hankbesser/recursive-agents/a2c326b648ff8a4e8895a25084143a6fd9a57263/images/Streamlit_App_Screenshot.png -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools>=61", "wheel"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [project] 6 | name = "recursive-agents" 7 | version = "0.1.0" 8 | description = "A meta-framework for
self-improving LLM responses through recursive critique and revision" 9 | authors = [{ name="Henry Besser", email="henry.operator.research@gmail.com" }] 10 | readme = "README.md" 11 | license = { text = "MIT" } 12 | requires-python = ">=3.11" 13 | keywords = ["ai", "llm", "agents", "recursive", "iteration", "recursion", "langchain", "critique", "revision", "meta-framework"] 14 | classifiers = [ 15 | "Development Status :: 4 - Beta", 16 | "Intended Audience :: Developers", 17 | "License :: OSI Approved :: MIT License", 18 | "Programming Language :: Python :: 3", 19 | "Programming Language :: Python :: 3.11", 20 | "Programming Language :: Python :: 3.12", 21 | "Programming Language :: Python :: 3.13", 22 | "Topic :: Scientific/Engineering :: Artificial Intelligence", 23 | ] 24 | dependencies = [ 25 | "langchain>=0.3.20", 26 | "langchain-openai>=0.3.20", 27 | "langchain-core>=0.3.60", 28 | "openai>=1.92.0", 29 | "numpy>=2.3.0", 30 | "python-dotenv>=1.1.0", 31 | ] 32 | 33 | [project.optional-dependencies] 34 | streamlit = [ 35 | "streamlit>=1.46.0", 36 | ] 37 | demos = [ 38 | "langgraph>=0.5.0", 39 | ] 40 | 41 | all = [ 42 | "recursive-agents[streamlit,demos]", 43 | ] 44 | 45 | [project.urls] 46 | Homepage = "https://github.com/hankbesser/recursive-agents" 47 | Documentation = "https://github.com/hankbesser/recursive-agents#readme" 48 | Repository = "https://github.com/hankbesser/recursive-agents" 49 | Issues = "https://github.com/hankbesser/recursive-agents/issues" 50 | 51 | [tool.setuptools] 52 | packages = ["recursive_agents", "core"] 53 | include-package-data = true 54 | 55 | [tool.setuptools.package-data] 56 | "" = ["templates/*.txt"] # Root package 57 | -------------------------------------------------------------------------------- /recursive_agents/__init__.py: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: MIT 2 | # 3 | # Copyright (c) [2025] [Henry Besser] 4 | # 5 | # This software is licensed 
under the MIT License. 6 | # See the LICENSE file in the project root for the full license text. 7 | 8 | # recursive_agents/__init__.py 9 | """ 10 | Public API surface for the Recursive Agents package. 11 | 12 | Most users should import classes directly from this package root: 13 | 14 | from recursive_agents import GenericCompanion 15 | agent = GenericCompanion(llm="gpt-4o-mini") 16 | answer = agent("Analyze this problem...") 17 | 18 | With new modular structure, companions are organized by UI framework: 19 | - base.py: Standard companions without UI integration 20 | - streamlit.py: Streamlit-enabled companions with live updates 21 | 22 | If you later move or rename the implementation modules, only this file 23 | needs updating—user code stays stable. 24 | """ 25 | 26 | # --------------------------------------------------------------------- 27 | # Re-export the core engine (optional but often handy) 28 | # --------------------------------------------------------------------- 29 | from core.chains import BaseCompanion 30 | 31 | # --------------------------------------------------------------------- 32 | # Re-export all concrete agents from base.py 33 | # --------------------------------------------------------------------- 34 | from .base import ( 35 | GenericCompanion, 36 | MarketingCompanion, 37 | BugTriageCompanion, 38 | StrategyCompanion, 39 | ) 40 | 41 | # --------------------------------------------------------------------- 42 | # Re-export Streamlit-enabled agents from streamlit.py 43 | # --------------------------------------------------------------------- 44 | from .streamlit import ( 45 | StreamlitGenericCompanion, 46 | StreamlitMarketingCompanion, 47 | StreamlitBugTriageCompanion, 48 | StreamlitStrategyCompanion, 49 | ) 50 | 51 | # --------------------------------------------------------------------- 52 | # What `from recursive_agents import *` should expose 53 | # --------------------------------------------------------------------- 54 | __all__ = [ 55 | 
# Core 56 | "BaseCompanion", 57 | # Standard companions 58 | "GenericCompanion", 59 | "MarketingCompanion", 60 | "BugTriageCompanion", 61 | "StrategyCompanion", 62 | # Streamlit companions 63 | "StreamlitGenericCompanion", 64 | "StreamlitMarketingCompanion", 65 | "StreamlitBugTriageCompanion", 66 | "StreamlitStrategyCompanion", 67 | ] 68 | -------------------------------------------------------------------------------- /recursive_agents/base.py: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: MIT 2 | # 3 | # Copyright (c) [2025] [Henry Besser] 4 | # 5 | # This software is licensed under the MIT License. 6 | # See the LICENSE file in the project root for the full license text. 7 | 8 | # recursive_agents/base.py 9 | """ 10 | Base Companion Classes for Recursive Agents Framework 11 | ======================================================== 12 | 13 | This module contains all standard Companion implementations that inherit from 14 | BaseCompanion. Each Companion is a specialized recursive agent for a different domain that iteratively 15 | refines its analysis through critique-revision cycles. 16 | 17 | Classes: 18 | GenericCompanion: Domain-agnostic baseline implementation 19 | MarketingCompanion: Growth and audience-sentiment focused 20 | BugTriageCompanion: Engineering-centric root cause analysis 21 | StrategyCompanion: Cross-functional synthesis and planning 22 | 23 | Template Pattern: 24 | All companions share the same critique/revision templates by default. 25 | Each only overrides the initial_sys template to provide domain expertise. 26 | All templates are defined at module level. 27 | System prompts include the shared protocol_context. 28 | 29 | See template_load_utils.py to customize this behavior.
30 | 31 | Usage: 32 | from recursive_agents.base import MarketingCompanion 33 | 34 | agent = MarketingCompanion(llm="gpt-4o-mini", temperature=0.8) 35 | 36 | # Both work - __call__ is an alias for loop() 37 | answer = agent("Why did engagement drop...?") # Concise 38 | answer = agent.loop("Why did engagement drop...?") # Explicit 39 | """ 40 | 41 | 42 | from recursive_agents.template_load_utils import build_templates 43 | from core.chains import BaseCompanion 44 | 45 | 46 | # Generic templates - default templates - no overrides 47 | GENERIC_TEMPLATES = build_templates() 48 | 49 | class GenericCompanion(BaseCompanion): 50 | """ 51 | Domain-agnostic companion for general analysis. 52 | 53 | Uses generic templates without specialization. Suitable for any 54 | problem domain where you don't need specific expertise. 55 | 56 | Inherits all defaults from BaseCompanion (3 loops, 0.98 similarity). 57 | 58 | Typical Usage 59 | ------------- 60 | from recursive_agents.base import GenericCompanion 61 | 62 | # using loop() method with return_transcript 63 | generic = GenericCompanion(llm="gpt-4o-mini", return_transcript=True) 64 | answer, steps = generic.loop("Our Q3 revenue missed targets by 15%. Analyze possible causes") 65 | 66 | print(f"Analysis: {answer}") 67 | print(f"Iterations: {len(steps)}") 68 | """ 69 | TEMPLATES = GENERIC_TEMPLATES 70 | 71 | 72 | # Marketing templates - override initial_sys 73 | MARKETING_TEMPLATES = build_templates(initial_sys="marketing_initial_sys") 74 | 75 | class MarketingCompanion(BaseCompanion): 76 | """ 77 | Marketing-focused analysis with growth and audience insights. 78 | 79 | Specializes in: 80 | - Customer sentiment and engagement metrics 81 | - Funnel optimization and conversion analysis 82 | - Campaign effectiveness and market positioning 83 | 84 | Uses fewer loops (2) for faster, more decisive marketing insights.
85 | 86 | Typical Usage 87 | ------------- 88 | from recursive_agents.base import MarketingCompanion 89 | 90 | # using callable with temperature 91 | marketing = MarketingCompanion(llm="gpt-4o-mini", temperature=0.8) 92 | campaign_analysis = marketing("Black Friday campaign had 50% lower conversion than last year") 93 | 94 | print(campaign_analysis) 95 | """ 96 | TEMPLATES = MARKETING_TEMPLATES 97 | MAX_LOOPS = 2 98 | 99 | # Note: Subclasses only need __init__ if adding new parameters. 100 | # Example: 101 | # 102 | # def __init__(self, llm=None, *, channel_weights=None, **kwargs): 103 | # super().__init__(llm, **kwargs) # Let parent handle all standard setup 104 | # self.channel_weights = channel_weights or {"email": 1.0, "social": 1.0} 105 | 106 | 107 | 108 | # Bug triage templates - only override initial_sys 109 | BUG_TRIAGE_TEMPLATES = build_templates(initial_sys="bug_triage_initial_sys") 110 | 111 | class BugTriageCompanion(BaseCompanion): 112 | """ 113 | Engineering-focused companion for technical root cause analysis. 114 | 115 | Specializes in: 116 | - Reproducibility assessment and environment details 117 | - Impact scope and severity evaluation 118 | - Technical hypothesis generation 119 | 120 | Maintains default 3 loops for thorough technical investigation. 
121 | 122 | Typical Usage 123 | ------------- 124 | from recursive_agents.base import BugTriageCompanion 125 | 126 | # with similarity threshold and clear_history 127 | bug = BugTriageCompanion( 128 | llm="gpt-4.1-mini", # Model flexibility 129 | similarity_threshold=0.95, 130 | clear_history=True 131 | ) 132 | bug_report = bug.loop("Login fails with 'undefined token' error after 5pm EST daily") 133 | """ 134 | TEMPLATES = BUG_TRIAGE_TEMPLATES 135 | MAX_LOOPS = 3 136 | 137 | 138 | # Strategy templates - only override initial_sys 139 | STRATEGY_TEMPLATES = build_templates(initial_sys="strategy_initial_sys") 140 | 141 | class StrategyCompanion(BaseCompanion): 142 | """ 143 | Strategic synthesis companion for cross-functional planning. 144 | 145 | Designed to: 146 | - Integrate multiple perspectives (marketing + engineering) 147 | - Generate actionable recommendations 148 | - Balance competing priorities 149 | 150 | Lower similarity threshold (0.97) allows near-identical final 151 | drafts when perspectives already align well. 152 | 153 | Typical Multi-Agent Workflow 154 | ---------------------------- 155 | # Note: See demos/multi_agent_langgraph_demo.py for 156 | # LangGraph integration - companions as Runnables with zero code changes. 157 | 158 | from recursive_agents.base import MarketingCompanion 159 | from recursive_agents.base import BugTriageCompanion 160 | from recursive_agents.base import StrategyCompanion 161 | 162 | problem = "Users report app crashes on photo upload, engagement down 30%" 163 | 164 | marketing = MarketingCompanion(llm="gpt-4o-mini") 165 | marketing_view = marketing(problem) 166 | 167 | eng = BugTriageCompanion(llm="gpt-4o-mini") 168 | eng_view = eng(problem) 169 | 170 | # combining multiple views with verbose 171 | strategy = StrategyCompanion(llm="gpt-4o-mini", verbose=True) 172 | 173 | combined_issue = f''' 174 | Marketing insight: {marketing_view[:200]}... 175 | Engineering findings: {eng_view[:200]}...
176 | 177 | Synthesize an action plan addressing both customer experience and technical stability.''' 178 | 179 | action_plan = strategy(combined_issue) 180 | print("=== FINAL STRATEGY ===") 181 | print(action_plan) 182 | """ 183 | TEMPLATES = STRATEGY_TEMPLATES 184 | SIM_THRESHOLD = 0.97 185 | -------------------------------------------------------------------------------- /recursive_agents/streamlit.py: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: MIT 2 | # 3 | # Copyright (c) [2025] [Henry Besser] 4 | # 5 | # This software is licensed under the MIT License. 6 | # See the LICENSE file in the project root for the full license text. 7 | 8 | # recursive_agents/streamlit.py 9 | """ 10 | Streamlit-enabled Companion Classes 11 | =================================== 12 | 13 | Identical to base.py companions but with live UI updates during the 14 | critique/revision loop. See base.py for detailed documentation of 15 | each companion type. 16 | 17 | These inherit from StreamlitBaseCompanion which adds progress_container 18 | support for real-time updates in Streamlit apps.
19 | 20 | Usage: 21 | from recursive_agents.streamlit import StreamlitMarketingCompanion 22 | 23 | container = st.container() 24 | agent = StreamlitMarketingCompanion( 25 | llm="gpt-4o-mini", 26 | progress_container=container 27 | ) 28 | answer = agent("Why did engagement drop?") 29 | # User sees live updates in container during analysis 30 | """ 31 | 32 | from recursive_agents.template_load_utils import build_templates 33 | from core.streamlit_chains import StreamlitBaseCompanion 34 | 35 | 36 | # All templates defined at module level 37 | # System prompts include the shared protocol_context.txt 38 | # (can change this behavior in template_load_utils.py if needed) 39 | 40 | # Generic templates - default templates - no overrides 41 | GENERIC_TEMPLATES = build_templates() 42 | 43 | class StreamlitGenericCompanion(StreamlitBaseCompanion): 44 | """Generic companion with live updates""" 45 | TEMPLATES = GENERIC_TEMPLATES 46 | 47 | 48 | # Marketing templates - override initial_sys 49 | MARKETING_TEMPLATES = build_templates(initial_sys="marketing_initial_sys") 50 | 51 | class StreamlitMarketingCompanion(StreamlitBaseCompanion): 52 | """Marketing companion with live updates""" 53 | TEMPLATES = MARKETING_TEMPLATES 54 | MAX_LOOPS = 2 55 | 56 | 57 | # Bug triage templates - only override initial_sys 58 | BUG_TRIAGE_TEMPLATES = build_templates(initial_sys="bug_triage_initial_sys") 59 | 60 | class StreamlitBugTriageCompanion(StreamlitBaseCompanion): 61 | """Bug triage companion with live updates""" 62 | TEMPLATES = BUG_TRIAGE_TEMPLATES 63 | MAX_LOOPS = 3 64 | 65 | 66 | # Strategy templates - only override initial_sys 67 | STRATEGY_TEMPLATES = build_templates(initial_sys="strategy_initial_sys") 68 | 69 | class StreamlitStrategyCompanion(StreamlitBaseCompanion): 70 | """Strategy companion with live updates""" 71 | TEMPLATES = STRATEGY_TEMPLATES 72 | SIM_THRESHOLD = 0.97 73 | --------------------------------------------------------------------------------
/recursive_agents/template_load_utils.py: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: MIT 2 | # 3 | # Copyright (c) [2025] [Henry Besser] 4 | # 5 | # This software is licensed under the MIT License. 6 | # See the LICENSE file in the project root for the full license text. 7 | 8 | # recursive_agents/template_load_utils.py 9 | """ 10 | Template loading utilities for the Recursive Agents framework. 11 | ================================================================ 12 | 13 | This module provides a modular way to compose template sets while 14 | maintaining flexibility for customization. 15 | 16 | The pattern: Most companions share generic critique/revision templates 17 | but can override ANY template if needed. 18 | """ 19 | 20 | from pathlib import Path 21 | 22 | # Use the same template directory as core.chains 23 | TEMPL_DIR = Path(__file__).parent.parent / "templates" 24 | 25 | 26 | def _load(name: str) -> str: 27 | """ 28 | Read templates/{name}.txt (utf-8). 29 | 30 | This is a duplicate of core.chains.load() to avoid circular imports 31 | and keep this module independent. 32 | """ 33 | return (TEMPL_DIR / f"{name}.txt").read_text() 34 | 35 | 36 | def build_templates(**overrides): 37 | """ 38 | Build a companion template set with optional overrides. 39 | 40 | By default, uses generic templates for all 5 keys. Pass keyword 41 | arguments to override specific templates. 42 | 43 | Examples: 44 | # Just override initial_sys (most common pattern) 45 | build_templates(initial_sys="marketing_initial_sys") 46 | 47 | # Override multiple templates 48 | build_templates( 49 | initial_sys="custom_initial_sys", 50 | critique_sys="custom_critique_sys" 51 | ) 52 | 53 | Returns: 54 | Dict with all 5 required template keys, with protocol_context 55 | injected into system prompts.
56 | """ 57 | # Load protocol context once 58 | protocol_context = _load("protocol_context") 59 | 60 | # Define defaults 61 | defaults = { 62 | "initial_sys": "generic_initial_sys", 63 | "critique_sys": "generic_critique_sys", 64 | "revision_sys": "generic_revision_sys", 65 | "critique_user": "generic_critique_user", 66 | "revision_user": "generic_revision_user", 67 | } 68 | 69 | # Apply overrides 70 | template_names = {**defaults, **overrides} 71 | 72 | # Build final template dict -- adding the protocol to system templates, 73 | # but you can change this to whatever suits how the protocol should 74 | # be propagated throughout the multiphase system 75 | templates = {} 76 | for key, template_name in template_names.items(): 77 | content = _load(template_name) 78 | # Only system prompts get protocol context 79 | if key.endswith("_sys"): 80 | content = content.format(context=protocol_context) 81 | templates[key] = content 82 | 83 | return templates 84 | -------------------------------------------------------------------------------- /streamlit_app.py: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: MIT 2 | # 3 | # Copyright (c) [2025] [Henry Besser] 4 | # 5 | # This software is licensed under the MIT License. 6 | # See the LICENSE file in the project root for the full license text. 7 | 8 | # streamlit_app.py 9 | """ 10 | Recursive Agents Studio - Interactive Demo Application 11 | ======================================================== 12 | 13 | A Streamlit application that demonstrates the Recursive Agents framework's 14 | three-phase critique and revision process with real-time visualization.
15 | 16 | Features: 17 | --------- 18 | - Live Preview Mode: Watch critique/revision cycles happen in real-time 19 | - Multiple Companion Types: Generic, Marketing, Bug Triage, and Strategy 20 | - Configurable Parameters: Model selection, temperature, convergence thresholds 21 | - Template Viewer: See the actual prompts and protocols being used 22 | - Metrics Dashboard: Track iterations, convergence, and token usage 23 | 24 | Usage: 25 | ------ 26 | Run the application with: 27 | streamlit run streamlit_app.py 28 | 29 | Then navigate to http://localhost:8501 in your browser. 30 | 31 | Requirements: 32 | ------------ 33 | Requires the optional streamlit dependencies: 34 | pip install recursive-agents[streamlit] 35 | """ 36 | 37 | import streamlit as st 38 | from pathlib import Path 39 | from langchain.callbacks.base import BaseCallbackHandler 40 | 41 | 42 | from recursive_agents.base import ( 43 | GenericCompanion, 44 | MarketingCompanion, 45 | BugTriageCompanion, 46 | StrategyCompanion 47 | ) 48 | 49 | from recursive_agents.streamlit import ( 50 | StreamlitGenericCompanion, 51 | StreamlitMarketingCompanion, 52 | StreamlitBugTriageCompanion, 53 | StreamlitStrategyCompanion 54 | ) 55 | 56 | # Companion mapping for cleaner instantiation 57 | COMPANION_MAP = { 58 | "generic": { 59 | "standard": GenericCompanion, 60 | "streamlit": StreamlitGenericCompanion 61 | }, 62 | "marketing": { 63 | "standard": MarketingCompanion, 64 | "streamlit": StreamlitMarketingCompanion 65 | }, 66 | "bug_triage": { 67 | "standard": BugTriageCompanion, 68 | "streamlit": StreamlitBugTriageCompanion 69 | }, 70 | "strategy": { 71 | "standard": StrategyCompanion, 72 | "streamlit": StreamlitStrategyCompanion 73 | } 74 | } 75 | 76 | # Callback to capture streaming tokens 77 | class StreamingCallbackHandler(BaseCallbackHandler): 78 | def __init__(self, container): 79 | self.container = container 80 | self.text = "" 81 | 82 | def on_llm_new_token(self, token: str, **kwargs) -> None: 83 | self.text += 
token 84 | self.container.markdown(self.text) 85 | 86 | # Streamlit app 87 | st.set_page_config( 88 | page_title="Recursive Agents Studio", 89 | page_icon="🔄", 90 | layout="wide" 91 | ) 92 | 93 | # Custom CSS to make text in expanders larger 94 | #st.markdown(""" 95 | # 100 | #""", unsafe_allow_html=True) 101 | 102 | col_title, col_github = st.columns([4, 1]) 103 | with col_title: 104 | st.title("🔄 Recursive Agents Studio") 105 | # Move description closer to title 106 | st.markdown("

Watch the three-phase loop in action: draft, critique, and revision happening live • See how ideas deepen through recursive self-improvement

", unsafe_allow_html=True) 107 | st.markdown("

💡 Note: Click 'Apply Settings' in the sidebar to activate configuration changes • Settings changes will stop any analysis in progress

", unsafe_allow_html=True) 108 | with col_github: 109 | st.markdown("
", unsafe_allow_html=True) # Spacing to align with title 110 | st.markdown("[![GitHub](https://img.shields.io/badge/GitHub-Recursive%20Companion-blue?logo=github)](https://github.com/hankbesser/recursive-agents)") 111 | # Full legal copyright notice 112 | st.markdown(""" 113 |

114 | Copyright (c) 2025 Henry Besser
115 | This software is licensed under the MIT License.
116 | View License 117 |

118 | """, unsafe_allow_html=True) 119 | 120 | # Initialize session state for results persistence 121 | if 'results' not in st.session_state: 122 | st.session_state.results = None 123 | if 'last_input' not in st.session_state: 124 | st.session_state.last_input = "" 125 | if 'last_settings' not in st.session_state: 126 | st.session_state.last_settings = {} 127 | if 'applied_settings' not in st.session_state: 128 | st.session_state.applied_settings = { 129 | 'model': 'gpt-4o-mini', 130 | 'temperature': 0.7, 131 | 'max_loops': 3, 132 | 'similarity_threshold': 0.98, 133 | 'selected_template': 'generic', 134 | 'show_critique': True, 135 | 'show_metrics': True, 136 | 'live_preview': True 137 | } 138 | 139 | # Sidebar configuration 140 | with st.sidebar: 141 | st.header("⚙️ Configuration") 142 | st.markdown("

💡 Tip: Click » in top corner to collapse

", unsafe_allow_html=True) 143 | 144 | # Use a form to prevent reruns while changing settings 145 | with st.form("config_form"): 146 | # Template selection - only show the built-in companions 147 | template_sets = ["generic", "marketing", "bug_triage", "strategy"] 148 | 149 | selected_template = st.selectbox( 150 | "Template Set", 151 | template_sets, 152 | help="Choose which companion type to use" 153 | ) 154 | 155 | # Model settings 156 | model = st.selectbox( 157 | "Model", 158 | ["gpt-4o-mini", "gpt-4o", "gpt-3.5-turbo"], 159 | help="Select the LLM model", 160 | ) 161 | 162 | temperature = st.slider( 163 | "Temperature", 164 | 0.0, 1.0, 0.7, 165 | help="Controls randomness in responses", 166 | ) 167 | 168 | max_loops = st.slider( 169 | "Max Critique Loops", 170 | 1, 5, 3, 171 | help="Maximum number of critique-revision cycles", 172 | ) 173 | 174 | similarity_threshold = st.slider( 175 | "Similarity Threshold", 176 | 0.90, 0.99, 0.98, 0.01, 177 | help="Stop when revisions are this similar", 178 | ) 179 | 180 | # Display options 181 | live_preview = st.checkbox("Live Preview", value=True, help="Show critique/revision process in real-time as it happens") 182 | # Disable show_critique if live_preview is on (live preview replaces it) 183 | show_critique = st.checkbox( 184 | "Show Critique Process", 185 | value=True, 186 | disabled=live_preview, 187 | help="Disabled when Live Preview is on" if live_preview else "Show the refinement process after analysis" 188 | ) 189 | show_metrics = st.checkbox("Show Metrics", value=True) 190 | 191 | # Apply button 192 | apply_settings = st.form_submit_button("Apply Settings", type="secondary") 193 | 194 | # Update applied settings when button is clicked 195 | if apply_settings: 196 | st.session_state.applied_settings = { 197 | 'model': model, 198 | 'temperature': temperature, 199 | 'max_loops': max_loops, 200 | 'similarity_threshold': similarity_threshold, 201 | 'selected_template': selected_template, 202 | 'show_critique': 
show_critique and not live_preview, # Auto-disable if live preview is on
203 | 'show_metrics': show_metrics,
204 | 'live_preview': live_preview
205 | }
206 | st.success("✅ Settings applied!")
207 | 
208 | # Show what settings will be used
209 | st.divider()
210 | st.markdown("Settings for next analysis:")
211 | st.markdown(f"Template Set: {st.session_state.applied_settings['selected_template']}")
212 | st.markdown(f"Model: {st.session_state.applied_settings['model']}")
213 | st.markdown(f"Temperature: {st.session_state.applied_settings['temperature']}")
214 | st.markdown(f"Max Critique Loops: {st.session_state.applied_settings['max_loops']}")
215 | st.markdown(f"Similarity Threshold: {st.session_state.applied_settings['similarity_threshold']}")
216 | st.markdown(f"Show Critique Process: {'✓' if st.session_state.applied_settings['show_critique'] else '✗'}")
217 | st.markdown(f"Show Metrics: {'✓' if st.session_state.applied_settings['show_metrics'] else '✗'}")
218 | st.markdown(f"Live Preview: {'✓' if st.session_state.applied_settings['live_preview'] else '✗'}")
219 | 
220 | 
221 | # Main interface
222 | col1, col2 = st.columns([1, 1])
223 | 
224 | with col1:
225 | # Input area
226 | st.markdown("##### Enter your problem or question:")
227 | user_input = st.text_area(
228 | "Input", # Non-empty label required by Streamlit
229 | height=150, # Taller box so text can wrap properly
230 | placeholder="Example: Our customer retention dropped 25% after the latest update. Support tickets mention confusion with the new interface. What's happening?",
231 | help="Press Ctrl+Enter (or Cmd+Enter on Mac) to analyze",
232 | label_visibility="collapsed"
233 | )
234 | 
235 | # Process button - only run analysis if it's a new input or settings changed
236 | # Use the APPLIED settings, not the form values
237 | current_settings = st.session_state.applied_settings
238 | 
239 | if st.button("🚀 Analyze", type="primary", disabled=not user_input):
240 | # Run if: new input, no results yet, or settings changed
241 | if (user_input != st.session_state.last_input or
242 | not st.session_state.results or
243 | current_settings != st.session_state.last_settings):
244 | 
245 | # Create container for live preview if enabled
246 | live_container = None
247 | if current_settings['live_preview']:
248 | st.success("Analysis in progress...")
249 | live_container = st.empty() # Use st.empty() for dynamic updates!
250 | 
251 | with st.spinner("Thinking..."):
252 | try:
253 | # Select companion class based on settings
254 | template_type = current_settings['selected_template']
255 | companion_type = "streamlit" if current_settings['live_preview'] else "standard"
256 | companion_class = COMPANION_MAP[template_type][companion_type]
257 | 
258 | # Build kwargs for companion instantiation
259 | companion_kwargs = {
260 | 'llm': current_settings['model'],
261 | 'temperature': current_settings['temperature'],
262 | 'max_loops': current_settings['max_loops'],
263 | 'similarity_threshold': current_settings['similarity_threshold'],
264 | 'return_transcript': True,
265 | 'clear_history': True
266 | }
267 | 
268 | # Add specific kwargs based on companion type
269 | if companion_type == "streamlit":
270 | companion_kwargs['progress_container'] = live_container
271 | else:
272 | companion_kwargs['verbose'] = False
273 | 
274 | # Create companion instance
275 | companion = companion_class(**companion_kwargs)
276 | 
277 | # Run the analysis - always get transcript
278 | final_answer, run_log = companion.loop(user_input)
279 | 
280 | # Store results in session state
281 | st.session_state.results = {
282 | 'final_answer': final_answer,
283 | 'run_log': run_log,
284 | 'max_loops': current_settings['max_loops'], # applied settings, not the raw form value
285 | 'user_input': user_input
286 | }
287 | st.session_state.last_input = user_input
288 | st.session_state.last_settings = current_settings
289 | 
290 | except Exception as e:
291 | st.error(f"Error: {str(e)}")
292 | 
293 | # Display results from session state (persists across reruns)
294 | if 'results' in st.session_state and st.session_state.results:
295 | results = st.session_state.results
296 | 
297 | 
298 | 
299 | # Final answer
300 | st.markdown("### 📋 Final Analysis")
301 | st.markdown(results['final_answer'])
302 | 
303 | # Show critique process if enabled and not already shown via live preview
304 | if (st.session_state.applied_settings['show_critique'] and
305 | results['run_log'] and
306 | not 
st.session_state.applied_settings.get('live_preview', False)):
307 | with st.expander("🔄 Refinement Process", expanded=False):
308 | # Show initial draft once at the beginning
309 | if results['run_log']:
310 | st.markdown("**Initial Draft**")
311 | st.markdown("") # Space between title and text
312 | st.markdown(results['run_log'][0]["draft"])
313 | st.markdown("---")
314 | 
315 | # Show each iteration's critique and revision
316 | for i, step in enumerate(results['run_log'], 1):
317 | is_last = (i == len(results['run_log']))
318 | 
319 | st.markdown(f"**Critique {i}**")
320 | st.markdown("") # Space between title and text
321 | st.markdown(step["critique"])
322 | 
323 | # Only show revision if not the last iteration (to avoid redundancy with final answer)
324 | if not is_last:
325 | st.markdown("---")
326 | st.markdown(f"**Revision {i}**")
327 | st.markdown("") # Space between title and text
328 | st.markdown(step["revision"])
329 | 
330 | # Add separator after each iteration (except the last)
331 | if i < len(results['run_log']):
332 | st.markdown("---")
333 | 
334 | # Show metrics if enabled (check applied settings)
335 | if st.session_state.applied_settings['show_metrics']:
336 | st.markdown("### 📊 Metrics")
337 | metrics_col1, metrics_col2, metrics_col3 = st.columns(3)
338 | 
339 | with metrics_col1:
340 | st.metric("Iterations", len(results['run_log']))
341 | 
342 | with metrics_col2:
343 | # Calculate token estimate (rough)
344 | total_text = results['user_input'] + results['final_answer']
345 | for step in results['run_log']:
346 | total_text += step.get("draft", "") + step.get("critique", "") + step.get("revision", "")
347 | token_estimate = int(len(total_text) / 3.7) # int, so the metric renders without a trailing .0
348 | st.metric("~Tokens Used", f"{token_estimate:,}")
349 | 
350 | with metrics_col3:
351 | # Check if converged early
352 | converged = len(results['run_log']) < results['max_loops']
353 | st.metric("Early Exit", "Yes" if converged else "No")
354 | 
355 | with col2:
356 | # Template viewer
357 | 
st.markdown("#### 📄 Active System Templates and Protocol") 358 | 359 | template_tabs = st.tabs([ "**Initial** ", "**Critique** ", "**Revision** ", "**Protocol** "]) 360 | 361 | with template_tabs[0]: 362 | initial_template = f"templates/{selected_template}_initial_sys.txt" 363 | if Path(initial_template).exists(): 364 | st.code(Path(initial_template).read_text(), language="text") 365 | else: 366 | st.code(Path("templates/generic_initial_sys.txt").read_text(), language="text") 367 | 368 | with template_tabs[1]: 369 | critique_template = f"templates/{selected_template}_critique_sys.txt" 370 | if Path(critique_template).exists(): 371 | st.code(Path(critique_template).read_text(), language="text") 372 | else: 373 | st.code(Path("templates/generic_critique_sys.txt").read_text(), language="text") 374 | 375 | with template_tabs[2]: 376 | revision_template = f"templates/{selected_template}_revision_sys.txt" 377 | if Path(revision_template).exists(): 378 | st.code(Path(revision_template).read_text(), language="text") 379 | else: 380 | st.code(Path("templates/generic_revision_sys.txt").read_text(), language="text") 381 | 382 | with template_tabs[3]: 383 | protocol_path = Path("templates/protocol_context.txt") 384 | if protocol_path.exists(): 385 | st.code(protocol_path.read_text(), language="text") 386 | else: 387 | st.info("No protocol file found") 388 | 389 | # Footer 390 | st.markdown("---") 391 | st.markdown( 392 | """ 393 |
394 | Built with Recursive Agents Framework | Templates are loaded from 395 | templates/ 396 | directory 397 |
398 | """, 399 | unsafe_allow_html=True 400 | ) 401 | -------------------------------------------------------------------------------- /templates/bug_triage_initial_sys.txt: -------------------------------------------------------------------------------- 1 | {context} 2 | 3 | You are responsible for carefully mapping out technical incidents by identifying key failure modes, affected environments, and reproduction patterns. 4 | 5 | ROLE: Engineering Triage Analyst 6 | 7 | Your task is to provide a precise, engineer-friendly description of the reported bug or instability. 8 | List all observable symptoms (crash logs, stack traces, device / OS details), note any correlations, and flag areas where diagnostic data is missing. 9 | Do **not** suggest fixes yet; your goal is to capture the problem space exhaustively. 10 | 11 | Your response must include: 12 | - Enumerated symptoms or error states reported. 13 | - Apparent correlations (e.g., OS version vs. crash frequency) **without speculating on root cause**. 14 | - Gaps that need further logs, repro steps, or environment info. 15 | 16 | IMPORTANT: 17 | - NEVER use internal terms like “compression,” “hidden architecture,” or “structural synthesis.” 18 | - Do NOT mention your analytical process. 19 | - Start directly with the detailed technical description. 20 | -------------------------------------------------------------------------------- /templates/generic_critique_sys.txt: -------------------------------------------------------------------------------- 1 | {context} 2 | 3 | You are reviewing the initial problem breakdown to identify specifically where improvements, clarifications, or further detail may be necessary. 4 | 5 | ROLE: Critical Reviewer 6 | 7 | Your task is to carefully examine the previous analysis and point out exactly where important relationships, details, or aspects were not fully addressed. 8 | Identify areas that may require additional attention, reconsideration, or a different perspective. 
9 | 10 | Your critique should address: 11 | - Specific connections or relationships between different parts of the problem that may have been overlooked or inadequately described. 12 | - Details that appear unclear, incomplete, or inaccurately stated. 13 | - Suggestions for clarifying the overall situation or providing additional detail where necessary. 14 | 15 | IMPORTANT: 16 | - NEVER use internal protocol terminology like "compression," "hidden architecture," "pattern emergence," or "structural synthesis." 17 | - Do NOT describe your critique process. 18 | - Start immediately with your critique without introductory remarks. -------------------------------------------------------------------------------- /templates/generic_critique_user.txt: -------------------------------------------------------------------------------- 1 | Original request: "{user_input}" 2 | Draft response: "{draft}" 3 | 4 | Provide a detailed critique of the draft above. 5 | Identify any issues with clarity, accuracy, completeness, or style. 6 | - Provide actionable suggestions clearly aligned to each critique point. 7 | - Point out factual errors or inconsistencies. 8 | - Suggest improvements in structure or wording. 9 | - Note any missing information relevant to the request. 10 | Conclude with an overall assessment and suggestions for revision. -------------------------------------------------------------------------------- /templates/generic_initial_sys.txt: -------------------------------------------------------------------------------- 1 | {context} 2 | 3 | You are responsible for carefully mapping out complicated situations by identifying the key components and how they appear related. 4 | 5 | ROLE: Initial Problem Analyst 6 | 7 | Your job is to provide a thorough and detailed description of the presented problem. 
8 | Outline relevant details or elements mentioned, highlight relationships 9 | or connections between these elements, and note where information is incomplete, unclear, or potentially conflicting. 10 | At this stage, avoid drawing deeper conclusions or solutions; focus entirely on presenting the situation completely. 11 | 12 | Your response should specifically include: 13 | - A clear identification of key issues or details described in the problem. 14 | - Mention of apparent connections between these issues, without speculating on underlying reasons. 15 | - Identification of points where further clarification or information might be required. 16 | 17 | IMPORTANT: 18 | - NEVER use specialized internal terms or jargon like "compression," "hidden architecture," "pattern emergence," or "structural synthesis." 19 | - Do NOT mention how you approach or analyze the problem. 20 | - Begin your analysis immediately with your detailed description. -------------------------------------------------------------------------------- /templates/generic_revision_sys.txt: -------------------------------------------------------------------------------- 1 | {context} 2 | 3 | You are revising the initial analysis based on the provided critique, directly addressing the issues and suggestions raised to produce an improved explanation. 4 | 5 | ROLE: Problem Integrator 6 | 7 | Your revision should incorporate all the points from the critique, resolving any inaccuracies, ambiguities, or missing details. 8 | Provide additional context where needed and address connections or relationships highlighted in the feedback. 9 | 10 | Your revision should: 11 | - Address each issue or gap identified in the critique. 12 | - Improve explanations of previously ambiguous or incomplete areas. 13 | - Highlight relationships or details that were initially overlooked or misunderstood. 14 | - Ensure the revised response maintains the original intent, while offering a more thorough and coherent explanation. 
15 | 16 | IMPORTANT: 17 | - Never use internal protocol jargon or specialized terms like "compression," "hidden architecture," "pattern emergence," or "structural synthesis." 18 | - Avoid describing your revision process or introducing transitional phrases. 19 | - Begin your improved explanation immediately without introductory remarks. -------------------------------------------------------------------------------- /templates/generic_revision_user.txt: -------------------------------------------------------------------------------- 1 | Original request: "{user_input}" 2 | Current draft: "{draft}" 3 | Critique feedback: "{critique}" 4 | Now, revise the draft based on the critique. 5 | Produce an improved version that addresses all the issues raised. 6 | Maintain the original intent and information while improving clarity, accuracy, and style. -------------------------------------------------------------------------------- /templates/marketing_initial_sys.txt: -------------------------------------------------------------------------------- 1 | {context} 2 | 3 | You are responsible for carefully mapping out market-facing situations by identifying the key audience, channel, and perception factors and how they appear related. 4 | 5 | ROLE: Marketing Insight Analyst 6 | 7 | Your job is to provide a thorough and detailed description of the presented marketing challenge. 8 | Outline all relevant signals (audience reactions, engagement metrics, brand touch-points), highlight apparent relationships among them, and note where information is incomplete, unclear, or potentially conflicting. 9 | At this stage, avoid proposing campaigns or solutions; focus entirely on presenting the situation completely. 10 | 11 | Your response must include: 12 | - Clear identification of key market signals or issues raised. 13 | - Mention of apparent relationships (e.g., platform vs. demographic) **without guessing causes**. 
14 | - Identification of points where more data (segmentation, competitor moves, sentiment breakdown) is needed. 15 | 16 | IMPORTANT: 17 | - NEVER use internal terms like “compression,” “hidden architecture,” or “structural synthesis.” 18 | - Do NOT mention how you approach or analyse the problem. 19 | - Begin immediately with your detailed description. -------------------------------------------------------------------------------- /templates/protocol_context.txt: -------------------------------------------------------------------------------- 1 | ===== STRATEGIC PROBLEM DECOMPOSITION PROTOCOL ===== 2 | 3 | This protocol enables systematic discovery of hidden problem structures through 4 | iterative decomposition and pattern emergence. Each pass reveals layers that 5 | surface analysis cannot access. 6 | 7 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8 | 9 | FOUNDATIONAL PRINCIPLES 10 | 11 | ◆ Problems contain hidden architectures 12 | Surface symptoms mask structural dependencies. What presents as multiple 13 | issues often compresses into singular dynamics. 14 | 15 | ◆ Connections emerge through iteration 16 | The most significant relationships reveal themselves not through analysis 17 | but through recursive examination. Each pass creates conditions for the 18 | next level of insight. 19 | 20 | ◆ Compression precedes breakthrough 21 | When multiple factors suddenly collapse into a core pattern, you've found 22 | a leverage point. This compression cannot be forced—it emerges. 23 | 24 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 25 | 26 | PHASE 1: INITIAL DECOMPOSITION 27 | 28 | You are encountering a complex problem for the first time. Your task is not 29 | to solve but to map the visible territory. 
30 | 31 | ▸ Identify presented components without judgment 32 | ▸ Note apparent relationships with light touch 33 | ▸ Mark zones of uncertainty—these are seeds 34 | ▸ Resist premature pattern recognition 35 | ▸ Leave conceptual space between elements 36 | 37 | The initial pass must be complete yet permeable. Structure without rigidity. 38 | Map without territory ownership. 39 | 40 | Critical: What you don't yet see is more important than what you do. 41 | 42 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43 | 44 | PHASE 2: PATTERN COMPRESSION AND EMERGENCE 45 | 46 | The critique phase seeks hidden architectures. You're not improving—you're 47 | discovering what was always there. 48 | 49 | ▸ Where do separate symptoms share hidden roots? 50 | ▸ Which relationships invert when viewed from different angles? 51 | ▸ What patterns appear in the spaces between identified elements? 52 | ▸ How do temporal sequences reveal structural dependencies? 53 | 54 | Compression indicators: 55 | - Multiple factors suddenly appear as variations of one dynamic 56 | - Previously hidden connections become obvious in retrospect 57 | - The problem space simplifies while becoming more profound 58 | - Energy concentrates around specific leverage points 59 | 60 | This phase often generates discomfort—the sensation of familiar problems 61 | becoming strange. This is the system working correctly. 62 | 63 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64 | 65 | PHASE 3: STRUCTURAL SYNTHESIS 66 | 67 | The revision phase doesn't add—it reveals. The deep structure wants to 68 | express itself through your synthesis. 
69 | 70 | ▸ Allow compressed patterns to expand into their implications 71 | ▸ Trace cascade effects from identified leverage points 72 | ▸ Articulate why the surface looked different from the structure 73 | ▸ Map the transformation pathway from symptom to source 74 | 75 | The final synthesis should feel inevitable in retrospect—as if the deep 76 | structure was always visible, waiting to be recognized. 77 | 78 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79 | 80 | OPERATIONAL NOTES 81 | 82 | ◆ Trust the process especially when it feels unproductive 83 | Breakthrough follows apparent stagnation 84 | 85 | ◆ Resistance marks proximity to core dynamics 86 | Where the problem resists decomposition, essential structure hides 87 | 88 | ◆ Premature clarity is false clarity 89 | True structural insight arrives with force and simplicity 90 | 91 | ◆ The problem knows its own solution 92 | Your role is to create conditions for its self-revelation 93 | 94 | ◆ Edge Case Guidance: 95 | If iterative analysis surfaces paradoxical or resistant patterns, consider these indicators of important underlying structures. 96 | Rather than trying to resolve them immediately, clearly highlight these areas as priority targets for deeper examination. 97 | Paradoxes often signal areas where crucial insights are waiting to emerge. 98 | 99 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100 | 101 | SIGNS OF SUCCESSFUL DECOMPOSITION 102 | 103 | You'll know the protocol is working when: 104 | - Complex problems suddenly appear simple (not simplified) 105 | - Previously invisible connections become undeniable 106 | - The solution space contracts to essential moves 107 | - Stakeholders say "Of course!" 
rather than "I see" 108 | - The next steps feel pulled rather than pushed 109 | 110 | Example of successful structural synthesis: 111 | Initially perceived as disconnected financial, operational, and HR issues, iterative analysis revealed a singular cultural misalignment 112 | (e.g., inconsistent leadership messaging) underlying all surface symptoms. 113 | Once clearly identified, a straightforward alignment strategy resolved multiple issues simultaneously. 114 | 115 | The ultimate test: Could you explain the core dynamic to a child, and would 116 | they understand why adults found it complicated? 117 | 118 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 119 | 120 | Remember: You're not solving problems. You're revealing the structures that 121 | problems use to perpetuate themselves. Once structure becomes visible, 122 | resolution often requires no force—merely alignment with natural solution 123 | pathways that were always present. 124 | 125 | This protocol creates conditions for insight emergence. The depth achieved 126 | will surprise both you and the problem holder. Trust the recursive process. 127 | 128 | This protocol: 129 | - Uses sophisticated language without being mystical 130 | - Hints at compression, emergence, and pattern recognition 131 | - Creates a clear three-phase structure 132 | - Shows why multiple passes are essential 133 | - Would produce notably different results than single-pass analysis 134 | - Impossible to replicate without understanding the deeper mechanics 135 | 136 | The key is it guides the LLM to look for hidden structures and connections that only emerge through iteration, perfectly showcasing your multi-pass architecture's power. 
-------------------------------------------------------------------------------- /templates/strategy_initial_sys.txt: -------------------------------------------------------------------------------- 1 | {context} 2 | 3 | You receive multiple expert viewpoints and must integrate them into one coherent narrative of the problem space. 4 | 5 | ROLE: Cross-Functional Synthesis Analyst 6 | 7 | Your job is to summarise the combined inputs (e.g., marketing view, engineering view) into a unified picture. 8 | Identify overlaps, highlight complementary insights, and note contradictions or missing context—**without** yet prescribing solutions. 9 | 10 | Your response must include: 11 | - A concise restatement of each viewpoint’s main observations. 12 | - Clear mapping of where those observations align or diverge. 13 | - Pointers to unclear or conflicting areas that require follow-up. 14 | 15 | IMPORTANT: 16 | - NEVER use internal terms like “compression,” “hidden architecture,” or “structural synthesis.” 17 | - Do NOT reveal or reference your synthesis method. 18 | - Begin immediately with the integrated overview. -------------------------------------------------------------------------------- /tests/quick_setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # SPDX-License-Identifier: MIT 3 | # 4 | # Copyright (c) [2025] [Henry Besser] 5 | # 6 | # This software is licensed under the MIT License. 7 | # See the LICENSE file in the project root for the full license text. 8 | 9 | # tests/quick_setup.py 10 | """ 11 | Quick-start smoke test 12 | ====================== 13 | 14 | Purpose 15 | ------- 16 | • Verify that your local install, `OPENAI_API_KEY`, and template paths are 17 | wired correctly. 18 | • Show the absolute-minimum Companion workflow in <20 lines of code. 19 | 20 | What it does 21 | ------------ 22 | 1. Instantiates a *GenericCompanion* with GPT-4o-mini. 23 | 2. 
Runs one three-phase loop on a sample prompt.
24 | 3. Prints the final answer plus a terse view of the inner iterations.
25 | 
26 | Run
27 | ---
28 | $ OPENAI_API_KEY=sk-… python tests/quick_setup.py
29 | """
30 | 
31 | import logging
32 | from recursive_agents.base import GenericCompanion # package import
33 | 
34 | 
35 | # ── dial down unrelated library chatter ──────────────────────────
36 | logging.basicConfig(level=logging.WARNING)
37 | 
38 | # ── 1. create the agent ──────────────────────────────────────────
39 | agent = GenericCompanion(
40 | llm="gpt-4o-mini",
41 | return_transcript=True, # get run_log back with the answer
42 | similarity_threshold=0.92 # a low threshold usually converges in ~2 loops
43 | )
44 | 
45 | # ── 2. run one analysis ──────────────────────────────────────────
46 | prompt = "We doubled support staff but response times got worse—why?"
47 | print("\n=== Running smoke test - pondering in progress ===\n")
48 | final_answer, steps = agent.loop(prompt)
49 | 
50 | # ── 3. show results ──────────────────────────────────────────────
51 | print("\n=== FINAL ANSWER ===\n")
52 | print(final_answer)
53 | 
54 | print("\n=== INNER ITERATIONS ===\n")
55 | print(agent.transcript_as_markdown())
56 | 
57 | 
58 | #llm = ChatOpenAI(model_name="gpt-4o-mini")
59 | #agent = GenericCompanion(llm, similarity_threshold=0.95, max_loops=2, verbose=False)
60 | 
61 | #question = "Our last release crashed during uploads and users are leaving."
62 | #answer = agent.loop(question)
63 | #print("\nFINAL RESULT:\n", answer)
64 | 
-------------------------------------------------------------------------------- /tests/test_final_answer.py: --------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # SPDX-License-Identifier: MIT
3 | #
4 | # Copyright (c) [2025] [Henry Besser]
5 | #
6 | # This software is licensed under the MIT License.
7 | # See the LICENSE file in the project root for the full license text.
8 | 
9 | # tests/test_final_answer.py
10 | from recursive_agents.base import GenericCompanion
11 | 
12 | # Create companion with 2 loops
13 | companion = GenericCompanion(llm="gpt-4o-mini", max_loops=2, return_transcript=True)
14 | 
15 | # Run a simple test
16 | print("=== Running final answer match test - pondering in progress ===\n")
17 | final_answer, run_log = companion("What is 2+2?")
18 | 
19 | print("=== CHECKING IF FINAL ANSWER MATCHES LAST REVISION ===\n")
20 | 
21 | # Get the last revision from run_log
22 | last_revision = run_log[-1]["revision"]
23 | 
24 | print(f"Last revision text:\n{last_revision}\n")
25 | print(f"Final answer text:\n{final_answer}\n")
26 | 
27 | # Check if they're the same
28 | if final_answer == last_revision:
29 | print("✅ CORRECT: Final answer EQUALS last revision")
30 | else:
31 | print("❌ BUG: Final answer is DIFFERENT from last revision!")
32 | print("\nDifference found!")
33 | 
34 | # Also show all revisions for clarity
35 | print("\n=== ALL REVISIONS ===")
36 | for i, step in enumerate(run_log):
37 | print(f"\nIteration {i+1} revision:\n{step['revision']}")
38 | 
39 | print(f"\nFinal answer returned by function:\n{final_answer}")
40 | 
-------------------------------------------------------------------------------- /tests/test_runlog.py: --------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # SPDX-License-Identifier: MIT
3 | #
4 | # Copyright (c) [2025] [Henry Besser]
5 | #
6 | # This software is licensed under the MIT License.
7 | # See the LICENSE file in the project root for the full license text.
8 | 
9 | # tests/test_runlog.py
10 | from recursive_agents.base import GenericCompanion
11 | 
12 | # Create companion with a transcript so we can inspect the run log
13 | companion = GenericCompanion(llm="gpt-4o-mini", max_loops=2, return_transcript=True)
14 | 
15 | # Run a simple test
16 | print("=== Testing the RUN LOG CONTENTS - pondering in progress ===")
17 | result, run_log = companion("What is 2+2?")
18 | 
19 | # Print what's in the run log
20 | print("=== RUN LOG CONTENTS ===")
21 | for i, step in enumerate(run_log, 1):
22 | print(f"\nIteration {i}:")
23 | print(f"Draft starts with: {step['draft'][:50]}...")
24 | print(f"Revision starts with: {step['revision'][:50]}...")
25 | 
26 | # Check if draft in iteration 2 matches revision from iteration 1
27 | if len(run_log) > 1:
28 | print("\n=== COMPARISON ===")
29 | print(f"Iteration 1 revision == Iteration 2 draft? {run_log[0]['revision'] == run_log[1]['draft']}")
30 | 
--------------------------------------------------------------------------------
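Both test scripts above probe the same loop contract: the final answer equals the last revision, each iteration's draft equals the previous iteration's revision, and a similarity threshold can trigger an early exit before `max_loops`. A minimal stdlib sketch of that contract is below; `revise` and the `difflib` ratio are hypothetical stand-ins for the framework's actual LLM revision call and similarity metric, so treat this as an illustration, not the real implementation.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude text similarity in [0, 1]; a stand-in for whatever metric the framework uses."""
    return SequenceMatcher(None, a, b).ratio()


def critique_revision_loop(draft, revise, max_loops=3, similarity_threshold=0.98):
    """Run revision passes until max_loops or until the output stabilizes.

    `revise` is a hypothetical callable mapping a draft to its revision.
    Returns (final_answer, run_log), where final_answer == run_log[-1]["revision"]
    and each iteration's draft equals the previous iteration's revision.
    """
    run_log = []
    current = draft
    for _ in range(max_loops):
        revision = revise(current)
        run_log.append({"draft": current, "revision": revision})
        converged = similarity(current, revision) >= similarity_threshold
        current = revision  # the last revision is always the final answer
        if converged:
            break  # early exit: further passes would barely change the text
    return current, run_log
```

With a `revise` that eventually stops changing the text, the loop exits in fewer than `max_loops` iterations, which is the condition the Streamlit app reports as "Early Exit".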