├── requirements.txt
├── .env
├── LICENSE
├── README.md
├── 11_meta_controller.ipynb
├── 12_graph.ipynb
├── 14_dry_run.ipynb
├── 10_mental_loop.ipynb
├── 09_tree_of_thoughts.ipynb
├── 13_ensemble.ipynb
└── 08_episodic_with_semantic.ipynb
/requirements.txt:
--------------------------------------------------------------------------------
1 | langchain
2 | langgraph
3 | langchain-nebius
4 | rich
5 | python-dotenv
6 | langchain-community
7 | langchain-tavily
--------------------------------------------------------------------------------
/.env:
--------------------------------------------------------------------------------
1 | NEBIUS_API_KEY="YOUR_NEBIUS_API" # LangChain supports many LLM providers; you can switch providers by changing this variable.
2 | LANGCHAIN_API_KEY="LANGSMITH_API_KEY" # LangSmith is a great tool for tracing and evaluating agentic architectures. Sign up for free at https://www.langchain.com/langsmith
3 | TAVILY_API_KEY="YOUR_TAVILY_API" # Tavily is used to search the web for up-to-date information. Sign up for free at https://www.tavily.com (1,000 free credits per month)
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2025 Fareed Khan
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # All Agentic Architectures
2 |
3 | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
4 |
5 | Welcome to a comprehensive, hands-on masterclass in **modern AI agent design**. This repository contains detailed implementations of **17+ state-of-the-art agentic architectures**, built with LangChain and LangGraph. It is designed to be a living textbook, bridging the gap between theoretical concepts and practical, production-ready code.
6 |
7 | ## 📖 Why This Repository?
8 |
9 | The field of AI agents is evolving at an incredible pace, but many resources remain abstract and theoretical. This project was created to provide a structured, practical, and deeply educational path for developers, researchers, and AI enthusiasts to master the art of building intelligent systems.
10 |
11 | - **From Theory to Tangible Code:** Each architecture is not just explained but implemented end-to-end in a runnable Jupyter notebook.
12 | - **Structured Learning Path:** The notebooks are ordered to build concepts progressively, from foundational patterns to highly advanced, multi-agent and self-aware systems.
13 | - **Emphasis on Evaluation:** We don't just build agents; we measure them. Most notebooks feature a robust `LLM-as-a-Judge` pattern to provide quantitative, objective feedback on an agent's performance, a critical skill for production AI.
14 | - **Real-World Scenarios:** The examples are grounded in practical applications—financial analysis, coding, social media management, medical triage—making the concepts immediately relevant.
15 | - **Consistent, Modern Framework:** By using `LangGraph` as the core orchestrator, you will learn a powerful, stateful, and cyclical approach to agent design that is rapidly becoming the industry standard.
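
The `LLM-as-a-Judge` pattern mentioned above boils down to prompting a second model to score an agent's output against a rubric and parsing its structured verdict. A minimal, framework-free sketch (the rubric criteria are illustrative, and the judge's reply is mocked here so the snippet runs standalone; in the notebooks it would come from an LLM prompted to emit JSON):

```python
import json

# Hypothetical rubric the judge is asked to score against, 1-5 per criterion.
RUBRIC = ["correctness", "completeness", "clarity"]

def parse_judge_verdict(raw: str) -> dict:
    """Parse the judge LLM's JSON reply and validate it against the rubric."""
    verdict = json.loads(raw)
    for criterion in RUBRIC:
        score = verdict["scores"][criterion]
        if not 1 <= score <= 5:
            raise ValueError(f"{criterion} score out of range: {score}")
    # Aggregate into a single number for tracking runs over time.
    verdict["average"] = sum(verdict["scores"][c] for c in RUBRIC) / len(RUBRIC)
    return verdict

# Mocked judge reply standing in for a real LLM response.
raw_reply = '{"scores": {"correctness": 5, "completeness": 4, "clarity": 4}, "comment": "Solid answer."}'
verdict = parse_judge_verdict(raw_reply)
print(verdict["average"])
```

Validating the judge's output before trusting it (range checks, required keys) is what makes the scores usable as a regression signal rather than free-form commentary.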
16 |
17 | ---
18 |
19 | ## 🏛️ The Architectures: A Deep Dive
20 |
21 | This collection covers the full spectrum of modern agentic design, from single-agent enhancements to complex, collaborative, and self-improving systems.
22 |
23 | | # | Architecture | Core Concept / TL;DR | Key Use Case | Notebook |
24 | |:---:|---|---|---|:---:|
25 | | **01** | **Reflection** | Moves from a single-pass generator to a deliberate, multi-step reasoner by critiquing and refining its own work. | High-Quality Code Generation, Complex Summarization | [01_reflection.ipynb](./01_reflection.ipynb) |
26 | | **02** | **Tool Use** | Empowers an agent to overcome knowledge cutoffs and interact with the real world by calling external APIs and functions. | Real-time Research Assistants, Enterprise Bots | [02_tool_use.ipynb](./02_tool_use.ipynb) |
27 | | **03** | **ReAct** | Dynamically interleaves reasoning ("thought") and action ("tool use") in an adaptive loop to solve complex, multi-step problems. | Multi-hop Q&A, Web Navigation & Research | [03_ReAct.ipynb](./03_ReAct.ipynb) |
28 | | **04** | **Planning** | Proactively decomposes a complex task into a detailed, step-by-step plan *before* execution, ensuring a structured and traceable workflow. | Predictable Report Generation, Project Management | [04_planning.ipynb](./04_planning.ipynb) |
29 | | **05** | **Multi-Agent Systems** | A team of specialized agents collaborates to solve a problem, dividing labor to achieve superior depth, quality, and structure in the final output. | Software Dev Pipelines, Creative Brainstorming | [05_multi_agent.ipynb](./05_multi_agent.ipynb) |
30 | | **06** | **PEV (Plan, Execute, Verify)** | A highly robust, self-correcting loop where a Verifier agent checks the outcome of each action, allowing for error detection and dynamic recovery. | High-Stakes Automation, Finance, Unreliable Tools | [06_PEV.ipynb](./06_PEV.ipynb) |
31 | | **07** | **Blackboard Systems** | A flexible multi-agent system where agents collaborate opportunistically via a shared central memory (the "blackboard"), guided by a dynamic controller. | Complex Diagnostics, Dynamic Sense-Making | [07_blackboard.ipynb](./07_blackboard.ipynb) |
32 | | **08** | **Episodic + Semantic Memory** | A dual-memory system combining a vector store for past conversations (episodic) and a graph DB for structured facts (semantic) for true long-term personalization. | Long-Term Personal Assistants, Personalized Tutors | [08_episodic_with_semantic.ipynb](./08_episodic_with_semantic.ipynb) |
33 | | **09** | **Tree of Thoughts (ToT)** | Solves problems by exploring multiple reasoning paths in a tree structure, evaluating and pruning branches to systematically find the optimal solution. | Logic Puzzles, Constrained Planning | [09_tree_of_thoughts.ipynb](./09_tree_of_thoughts.ipynb) |
34 | | **10** | **Mental Loop (Simulator)** | An agent tests its actions in an internal "mental model" or simulator to predict outcomes and assess risk before acting in the real world. | Robotics, Financial Trading, Safety-Critical Systems | [10_mental_loop.ipynb](./10_mental_loop.ipynb) |
35 | | **11** | **Meta-Controller** | A supervisory agent that analyzes incoming tasks and routes them to the most appropriate specialist sub-agent from a pool of experts. | Multi-Service AI Platforms, Adaptive Assistants | [11_meta_controller.ipynb](./11_meta_controller.ipynb) |
36 | | **12** | **Graph (World-Model Memory)** | Stores knowledge as a structured graph of entities and relationships, enabling complex, multi-hop reasoning by traversing connections. | Corporate Intelligence, Advanced Research | [12_graph.ipynb](./12_graph.ipynb) |
37 | | **13** | **Ensemble** | Multiple independent agents analyze a problem from different perspectives, and a final "aggregator" agent synthesizes their outputs for a more robust, less biased conclusion. | High-Stakes Decision Support, Fact-Checking | [13_ensemble.ipynb](./13_ensemble.ipynb) |
38 | | **14** | **Dry-Run Harness** | A safety-critical pattern where an agent's proposed action is first simulated (dry run) and must be approved (by a human or checker) before live execution. | Production Agent Deployment, Debugging | [14_dry_run.ipynb](./14_dry_run.ipynb) |
39 | | **15** | **RLHF (Self-Improvement)** | An agent's output is critiqued by an "editor" agent, and the feedback is used to iteratively revise the work. High-quality outputs are saved to improve future performance. | High-Quality Content Generation, Continual Learning | [15_RLHF.ipynb](./15_RLHF.ipynb) |
40 | | **16** | **Cellular Automata** | A system of many simple, decentralized grid-based agents whose local interactions produce complex, emergent global behavior like optimal pathfinding. | Spatial Reasoning, Logistics, Complex System Simulation | [16_cellular_automata.ipynb](./16_cellular_automata.ipynb) |
41 | | **17** | **Reflexive Metacognitive** | An agent with a "self-model" that reasons about its own capabilities and limitations, choosing to act, use a tool, or escalate to a human to ensure safety. | High-Stakes Advisory (Medical, Legal, Finance) | [17_reflexive_metacognitive.ipynb](./17_reflexive_metacognitive.ipynb) |
42 |
43 | ---
44 |
45 | ## 🗺️ A Guided Tour Through the Architectures
46 |
47 | The repository is structured to take you on a journey from simple enhancements to building truly sophisticated, multi-agent, self-aware systems.
48 |
49 |
51 |
52 | #### Part 1: Foundational Patterns (Notebooks 1-4)
53 | This section covers the essential building blocks for making a single agent more powerful.
54 | - We start with **Reflection** to improve output quality.
55 | - Then, we give the agent **Tools** to interact with the world.
56 | - **ReAct** combines these into a dynamic loop.
57 | - Finally, **Planning** adds foresight and structure to its actions.
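
The ReAct loop at the heart of Part 1 can be sketched without any framework: the model alternates between deciding on an action and observing its result until it is confident enough to answer. In this sketch `fake_llm` and `tools` are mocked stand-ins for the real LLM and tool calls used in the notebooks:

```python
# Scripted stand-in for the LLM: acts once, then answers. A real LLM would
# make this decision from the accumulated history.
def fake_llm(history: list) -> dict:
    if not any(kind == "observation" for kind, _ in history):
        return {"thought": "I should look this up.", "action": ("search", "LangGraph")}
    return {"thought": "I have enough information.", "answer": "LangGraph is a stateful agent framework."}

# Mocked tool registry; the real version would call external APIs.
tools = {"search": lambda q: f"Top result for '{q}': LangGraph docs."}

def react(question: str, max_steps: int = 5) -> str:
    history = [("question", question)]
    for _ in range(max_steps):
        step = fake_llm(history)
        if "answer" in step:                       # reasoning says we're done
            return step["answer"]
        name, arg = step["action"]                 # otherwise, act with a tool
        history.append(("observation", tools[name](arg)))
    return "Gave up after max_steps."

print(react("What is LangGraph?"))
```

The `max_steps` cap is the essential guardrail: without it, a model that never decides it is done loops forever.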
58 |
59 | #### Part 2: Multi-Agent Collaboration (Notebooks 5, 7, 11, 13)
60 | Here, we explore how to make agents work together.
61 | - **Multi-Agent Systems** introduces the concept of specialized teams.
62 | - The **Meta-Controller** acts as a smart router to dispatch tasks to these teams.
63 | - The **Blackboard** provides a flexible, shared workspace for dynamic collaboration.
64 | - The **Ensemble** pattern uses multiple agents in parallel for more robust, diverse analysis.
65 |
66 | #### Part 3: Advanced Memory & Reasoning (Notebooks 8, 9, 12)
67 | This section focuses on how agents can think more deeply and remember what they've learned.
68 | - **Episodic + Semantic Memory** provides a powerful, human-like memory system.
69 | - The **Graph World-Model** allows for complex reasoning over interconnected knowledge.
70 | - **Tree of Thoughts** enables systematic, multi-path exploration to solve hard logic problems.
71 |
72 | #### Part 4: Safety, Reliability, and Real-World Interaction (Notebooks 6, 10, 14, 17)
73 | These architectures are critical for building agents that can be trusted in production.
74 | - The **Dry-Run Harness** provides a crucial human-in-the-loop safety layer.
75 | - The **Simulator** allows an agent to "think before it acts" by modeling consequences.
76 | - **PEV** builds in automatic error detection and recovery.
77 | - The **Metacognitive** agent understands its own limitations, a key to safe operation in high-stakes domains.
78 |
79 | #### Part 5: Learning and Adaptation (Notebooks 15, 16)
80 | The final section explores how agents can improve over time and solve problems in novel ways.
81 | - The **Self-Improvement Loop** creates a mechanism for an agent to learn from feedback, analogous to RLHF.
82 | - **Cellular Automata** showcases how complex global behavior can emerge from simple, local rules, creating highly adaptive systems.
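
The cellular-automata idea, complex global behavior from purely local rules, fits in a few lines. This sketch runs a one-dimensional elementary automaton (Rule 110) where each cell updates from only its immediate neighborhood, yet intricate patterns emerge:

```python
def step(cells: list, rule: int = 110) -> list:
    """Advance a 1-D binary cellular automaton one generation (wrap-around edges)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        idx = (left << 2) | (center << 1) | right  # neighborhood as a 3-bit number
        out.append((rule >> idx) & 1)              # look up the new state in the rule
    return out

row = [0] * 15 + [1] + [0] * 15  # single live cell in the middle
for _ in range(8):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```

No cell knows anything about the global pattern; the structure in the printout emerges entirely from the 8-bit rule table applied locally, which is the same principle the notebook scales up to pathfinding.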
83 |
84 |
85 |
86 |
### Example Architecture Diagram: The Meta-Controller
88 |
89 | This diagram illustrates the flow in the `11_meta_controller.ipynb` notebook, a common pattern for orchestrating specialized agents.
90 |
91 | ```mermaid
92 | graph TD
93 | A[User Request] --> B{Meta-Controller};
94 | B -- Analyzes Request --> C{Routes to Specialist};
95 | C --> D[Generalist Agent];
96 | C --> E[Research Agent];
97 | C --> F[Coding Agent];
98 | D --> G[Final Response];
99 | E --> G[Final Response];
100 | F --> G[Final Response];
101 |
102 | style B fill:#f9f,stroke:#333,stroke-width:2px
103 | style C fill:#bbf,stroke:#333,stroke-width:2px
104 | ```
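
The same flow can be expressed framework-free. In the notebook the routing decision comes from an LLM with structured output; here a keyword heuristic stands in so the sketch runs anywhere (the specialist implementations are likewise mocked):

```python
# Mocked specialists; the real versions are LLM chains with distinct personas.
SPECIALISTS = {
    "Generalist": lambda req: f"Chatting about: {req}",
    "Researcher": lambda req: f"Searching the web for: {req}",
    "Coder":      lambda req: f"Writing Python for: {req}",
}

def route(request: str) -> str:
    """Keyword stand-in for the LLM-powered routing decision."""
    text = request.lower()
    if any(k in text for k in ("code", "python", "function", "script")):
        return "Coder"
    if any(k in text for k in ("latest", "news", "search", "recent")):
        return "Researcher"
    return "Generalist"

def meta_controller(request: str) -> str:
    specialist = route(request)               # analyze + dispatch
    return SPECIALISTS[specialist](request)   # execute + return result

print(meta_controller("Write a Python function to reverse a list"))
```

Swapping the heuristic for an LLM call is the only change needed to reach the notebook's version; the dispatch structure stays identical.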
105 |
106 |
107 | ---
108 |
109 | ## 🛠️ Technical Stack & Setup
110 |
111 | This project leverages a modern, powerful stack for building sophisticated AI applications.
112 |
113 | | Component | Purpose |
114 | |---|---|
115 | | **Python 3.10+** | The core programming language for the entire project. |
116 | | **LangChain** | Provides the foundational building blocks for interacting with LLMs and tools. |
117 | | **LangGraph** | The key orchestration framework for building complex, stateful, and cyclical agent workflows. |
118 | | **Nebius AI Models** | High-performance LLMs (e.g., `Mixtral-8x22B-Instruct-v0.1`) that power the agents' reasoning. |
119 | | **Jupyter Notebooks** | Used for interactive development, rich explanations, and clear, step-by-step demonstrations. |
120 | | **Pydantic** | Ensures robust, structured data modeling, which is critical for reliable communication with LLMs. |
121 | | **Tavily Search** | A powerful search API used as a tool for research-oriented agents. |
122 | | **Neo4j** | The industry-standard graph database used for implementing semantic and world-model memory. |
123 | | **FAISS** | An efficient vector store used for implementing episodic memory through similarity search. |
124 |
125 | ## 🚀 Getting Started
126 |
127 | Follow these steps to set up your local environment and run the notebooks.
128 |
129 | ### 1. Clone the Repository
130 |
131 | ```bash
132 | git clone https://github.com/your-username/all-agentic-architectures.git
133 | cd all-agentic-architectures
134 | ```
135 |
136 | ### 2. Set Up a Virtual Environment
137 |
138 | It is highly recommended to use a virtual environment to manage dependencies.
139 |
140 | ```bash
141 | # For Unix/macOS
142 | python3 -m venv venv
143 | source venv/bin/activate
144 |
145 | # For Windows
146 | python -m venv venv
147 | .\venv\Scripts\activate
148 | ```
149 |
150 | ### 3. Install Dependencies
151 |
152 | Install all the required Python packages from the `requirements.txt` file.
153 |
154 | ```bash
155 | pip install -r requirements.txt
156 | ```
157 |
158 | To visualize graphs in LangGraph, you may also need to install `pygraphviz`.
159 |
160 | ### 4. Configure Environment Variables
161 |
162 | The agents require API keys to function. Create a file named `.env` in the root of the project directory; the repository ships with a sample `.env` you can use as a template for the variables you need.
163 |
164 | Open the `.env` file and add your credentials. It should look like this:
165 |
166 | ```bash
167 | # .env file
168 |
169 | # Nebius AI API Key (for LLM access)
170 | NEBIUS_API_KEY="your_nebius_api_key_here"
171 |
172 | # LangSmith API Key (for tracing and debugging)
173 | LANGCHAIN_API_KEY="your_langsmith_api_key_here"
174 | LANGCHAIN_TRACING_V2="true"
175 | LANGCHAIN_PROJECT="All-Agentic-Architectures" # Optional: Set a project name
176 |
177 | # Tavily Search API Key (for the Research agent's tool)
178 | TAVILY_API_KEY="your_tavily_api_key_here"
179 |
180 | # Neo4j Credentials (for Graph and Memory architectures)
181 | # You must have a Neo4j instance running (e.g., via Docker or Neo4j Desktop)
182 | NEO4J_URI="bolt://localhost:7687"
183 | NEO4J_USERNAME="neo4j"
184 | NEO4J_PASSWORD="your_neo4j_password_here"
185 | ```
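
Before launching Jupyter, you can run a quick preflight check so missing keys are reported up front instead of failing mid-notebook. The variable names below match the `.env` example above (Neo4j variables are omitted since only some notebooks need them):

```python
import os

REQUIRED = ["NEBIUS_API_KEY", "LANGCHAIN_API_KEY", "TAVILY_API_KEY"]

def missing_vars(env=os.environ) -> list:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]

missing = missing_vars()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required keys are set.")
```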
186 |
187 | ### 5. Run the Notebooks
188 |
189 | You can now launch Jupyter and explore the notebooks in numerical order.
190 |
191 | ```bash
192 | jupyter notebook
193 | ```
194 |
195 | ## 🤝 How to Contribute
196 |
197 | Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.
198 |
199 | 1. **Fork the repository.**
200 | 2. **Create a new branch** for your feature or bug fix (`git checkout -b feature/new-architecture` or `bugfix/fix-typo`).
201 | 3. **Make your changes.** Please ensure the code is well-commented and the notebook explanations are clear.
202 | 4. **Submit a pull request** with a detailed description of your changes.
203 |
204 | You can also open an issue to report a bug, suggest an enhancement, or propose a new architecture to add to the collection.
205 |
206 | ## 📄 License
207 |
208 | This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
--------------------------------------------------------------------------------
/11_meta_controller.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "intro-title",
6 | "metadata": {},
7 | "source": [
8 | "# 📘 Agentic Architectures 11: Meta-Controller\n",
9 | "\n",
10 | "Welcome to the eleventh notebook in our series. Today, we're building a **Meta-Controller**, a supervisory agent architecture that orchestrates a team of specialized sub-agents. This pattern is fundamental to creating powerful, multi-talented AI systems.\n",
11 | "\n",
12 | "Instead of building a single, monolithic agent that tries to do everything, the Meta-Controller acts as a smart dispatcher. It receives an incoming request, analyzes its nature, and routes it to the most appropriate specialist from a pool of available agents. This allows each sub-agent to be highly optimized for its specific task, leading to better performance and modularity.\n",
13 | "\n",
14 | "We will demonstrate this by building a system with three distinct specialists:\n",
15 | "1. **Generalist Agent:** Handles casual conversation and simple questions.\n",
16 | "2. **Research Agent:** Equipped with a search tool to answer questions about recent events or complex topics.\n",
17 | "3. **Coding Agent:** A specialist focused on generating Python code snippets.\n",
18 | "\n",
19 | "The Meta-Controller will be the \"brain\" of the operation, examining each user query and deciding which agent is best suited to respond. This creates a flexible and easily extensible system where new capabilities can be added simply by creating a new specialist agent and teaching the controller about it."
20 | ]
21 | },
22 | {
23 | "cell_type": "markdown",
24 | "id": "intro-definition",
25 | "metadata": {},
26 | "source": [
27 | "### Definition\n",
28 | "A **Meta-Controller** (or Router) is a supervisory agent in a multi-agent system that is responsible for analyzing incoming tasks and dispatching them to the appropriate specialized sub-agent or workflow. It acts as an intelligent routing layer, deciding which tool or expert is best suited for the job at hand.\n",
29 | "\n",
30 | "### High-level Workflow\n",
31 | "\n",
32 | "1. **Receive Input:** The system receives a user request.\n",
33 | "2. **Meta-Controller Analysis:** The Meta-Controller agent examines the request's intent, complexity, and content.\n",
34 | "3. **Dispatch to Specialist:** Based on its analysis, the Meta-Controller selects the best specialist agent (e.g., 'Researcher', 'Coder', 'Chatbot') from a predefined pool.\n",
35 | "4. **Execute Task:** The chosen specialist agent executes the task and generates a result.\n",
36 | "5. **Return Result:** The result from the specialist is returned to the user. In more complex workflows, control might return to the Meta-Controller for further steps or monitoring.\n",
37 | "\n",
38 | "### When to Use / Applications\n",
39 | "* **Multi-Service AI Platforms:** A single entry point for a platform that offers diverse services like document analysis, data visualization, and creative writing.\n",
40 | "* **Adaptive Personal Assistants:** An assistant that can switch between different modes or tools, such as managing your calendar, searching the web, or controlling smart home devices.\n",
41 | "* **Enterprise Workflows:** Routing customer support tickets to the right department (technical, billing, sales) based on the ticket's content.\n",
42 | "\n",
43 | "### Strengths & Weaknesses\n",
44 | "* **Strengths:**\n",
45 | " * **Flexibility & Modularity:** Extremely easy to add new capabilities by simply adding a new specialist agent and updating the controller's routing logic.\n",
46 | " * **Performance:** Allows for highly optimized, expert agents instead of one jack-of-all-trades model that might be mediocre at everything.\n",
47 | "* **Weaknesses:**\n",
48 | " * **Controller as a Single Point of Failure:** The quality of the entire system hinges on the controller's ability to route tasks correctly. A poor routing decision leads to a suboptimal or incorrect outcome.\n",
49 | " * **Potential for Increased Latency:** The extra step of routing can add a small amount of latency compared to a direct call to a single agent."
50 | ]
51 | },
52 | {
53 | "cell_type": "markdown",
54 | "id": "phase0-title",
55 | "metadata": {},
56 | "source": [
57 | "## Phase 0: Foundation & Setup\n",
58 | "\n",
59 | "We'll install libraries and set up our environment. We'll need `langchain-tavily` for our Research Agent's search tool."
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": 1,
65 | "id": "install-libs",
66 | "metadata": {},
67 | "outputs": [],
68 | "source": [
69 | "# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv langchain-tavily"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": 2,
75 | "id": "import-and-keys",
76 | "metadata": {},
77 | "outputs": [
78 | {
79 | "name": "stdout",
80 | "output_type": "stream",
81 | "text": [
82 | "Environment variables loaded and tracing is set up.\n"
83 | ]
84 | }
85 | ],
86 | "source": [
87 | "import os\n",
88 | "from typing import List, Dict, Any, Optional\n",
89 | "from dotenv import load_dotenv\n",
90 | "\n",
91 | "# Pydantic for data modeling\n",
92 | "from pydantic import BaseModel, Field\n",
93 | "\n",
94 | "# LangChain components\n",
95 | "from langchain_nebius import ChatNebius\n",
96 | "from langchain_tavily import TavilySearch\n",
97 | "from langchain_core.prompts import ChatPromptTemplate\n",
98 | "\n",
99 | "# LangGraph components\n",
100 | "from langgraph.graph import StateGraph, END\n",
101 | "from typing_extensions import TypedDict\n",
102 | "\n",
103 | "# For pretty printing\n",
104 | "from rich.console import Console\n",
105 | "from rich.markdown import Markdown\n",
106 | "\n",
107 | "# --- API Key and Tracing Setup ---\n",
108 | "load_dotenv()\n",
109 | "\n",
110 | "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
111 | "os.environ[\"LANGCHAIN_PROJECT\"] = \"Agentic Architecture - Meta-Controller (Nebius)\"\n",
112 | "\n",
113 | "required_vars = [\"NEBIUS_API_KEY\", \"LANGCHAIN_API_KEY\", \"TAVILY_API_KEY\"]\n",
114 | "for var in required_vars:\n",
115 | " if var not in os.environ:\n",
116 | " print(f\"Warning: Environment variable {var} not set.\")\n",
117 | "\n",
118 | "print(\"Environment variables loaded and tracing is set up.\")"
119 | ]
120 | },
121 | {
122 | "cell_type": "markdown",
123 | "id": "phase1-title",
124 | "metadata": {},
125 | "source": [
126 | "## Phase 1: Building the Specialist Agents\n",
127 | "\n",
128 | "First, we'll create our team of expert agents. Each will be a simple chain with a specific persona and, in the case of the Researcher, a tool. We will wrap them in a node function for use in our LangGraph."
129 | ]
130 | },
131 | {
132 | "cell_type": "code",
133 | "execution_count": 3,
134 | "id": "specialist-agents-code",
135 | "metadata": {},
136 | "outputs": [
137 | {
138 | "name": "stdout",
139 | "output_type": "stream",
140 | "text": [
141 | "Specialist agents defined successfully.\n"
142 | ]
143 | }
144 | ],
145 | "source": [
146 | "console = Console()\n",
147 | "llm = ChatNebius(model=\"mistralai/Mixtral-8x22B-Instruct-v0.1\", temperature=0)\n",
148 | "search_tool = TavilySearch(max_results=3)\n",
149 | "\n",
150 | "# Define the state for the overall graph\n",
151 | "class MetaAgentState(TypedDict):\n",
152 | " user_request: str\n",
153 | " next_agent_to_call: Optional[str]\n",
154 | " generation: str\n",
155 | "\n",
156 | "# A helper factory function to create specialist agent nodes\n",
157 | "def create_specialist_node(persona: str, tools: Optional[list] = None):\n",
158 | " \"\"\"Factory to create a specialist agent node.\"\"\"\n",
159 | " system_prompt = f\"You are a specialist agent with the following persona: {persona}. Respond directly and concisely to the user's request based on your role.\"\n",
160 | " prompt = ChatPromptTemplate.from_messages([\n",
161 | " (\"system\", system_prompt),\n",
162 | " (\"human\", \"{user_request}\")\n",
163 | " ])\n",
164 | " \n",
165 | " if tools:\n",
166 | " chain = prompt | llm.bind_tools(tools)\n",
167 | " else:\n",
168 | " chain = prompt | llm\n",
169 | " \n",
170 | " def specialist_node(state: MetaAgentState) -> Dict[str, Any]:\n",
171 | " result = chain.invoke({\"user_request\": state['user_request']})\n",
172 | " return {\"generation\": result.content}\n",
173 | " \n",
174 | " return specialist_node\n",
175 | "\n",
176 | "# 1. Generalist Agent Node\n",
177 | "generalist_node = create_specialist_node(\n",
178 | " \"You are a friendly and helpful generalist AI assistant. You handle casual conversation and simple questions.\"\n",
179 | ")\n",
180 | "\n",
181 | "# 2. Research Agent Node\n",
182 | "research_agent_node = create_specialist_node(\n",
183 | " \"You are an expert researcher. You must use your search tool to find information to answer the user's question.\",\n",
184 | " tools=[search_tool]\n",
185 | ")\n",
186 | "\n",
187 | "# 3. Coding Agent Node\n",
188 | "coding_agent_node = create_specialist_node(\n",
189 | " \"You are an expert Python programmer. Your task is to write clean, efficient Python code based on the user's request. Provide only the code, wrapped in markdown code blocks, with minimal explanation.\"\n",
190 | ")\n",
191 | "\n",
192 | "print(\"Specialist agents defined successfully.\")"
193 | ]
194 | },
195 | {
196 | "cell_type": "markdown",
197 | "id": "phase2-title",
198 | "metadata": {},
199 | "source": [
200 | "## Phase 2: Building the Meta-Controller\n",
201 | "\n",
202 | "This is the brain of our system. The Meta-Controller is an LLM-powered node whose only job is to decide which specialist to route the request to. The quality of its prompt is critical for the system's performance."
203 | ]
204 | },
205 | {
206 | "cell_type": "code",
207 | "execution_count": 4,
208 | "id": "meta-controller-code",
209 | "metadata": {},
210 | "outputs": [
211 | {
212 | "name": "stdout",
213 | "output_type": "stream",
214 | "text": [
215 | "Meta-Controller node defined successfully.\n"
216 | ]
217 | }
218 | ],
219 | "source": [
220 | "# Pydantic model for the controller's routing decision\n",
221 | "class ControllerDecision(BaseModel):\n",
222 | " next_agent: str = Field(description=\"The name of the specialist agent to call next. Must be one of ['Generalist', 'Researcher', 'Coder'].\")\n",
223 | " reasoning: str = Field(description=\"A brief reason for choosing the next agent.\")\n",
224 | "\n",
225 | "def meta_controller_node(state: MetaAgentState) -> Dict[str, Any]:\n",
226 | " \"\"\"The central controller that decides which specialist to call.\"\"\"\n",
227 | " console.print(\"--- 🧠 Meta-Controller Analyzing Request ---\")\n",
228 | " \n",
229 | " # Define the specialists and their descriptions for the controller\n",
230 | " specialists = {\n",
231 | " \"Generalist\": \"Handles casual conversation, greetings, and simple questions.\",\n",
232 | " \"Researcher\": \"Answers questions about recent events, complex topics, or anything requiring up-to-date information from the web.\",\n",
233 | " \"Coder\": \"Writes Python code based on a user's specification.\"\n",
234 | " }\n",
235 | " \n",
236 | " specialist_descriptions = \"\\n\".join([f\"- {name}: {desc}\" for name, desc in specialists.items()])\n",
237 | " \n",
238 | " prompt = ChatPromptTemplate.from_template(\n",
239 | " f\"\"\"You are the meta-controller for a multi-agent AI system. Your job is to analyze the user's request and route it to the most appropriate specialist agent.\n",
240 | "\n",
241 | "Here are the available specialists:\n",
242 | "{specialist_descriptions}\n",
243 | "\n",
244 | "Analyze the following user request and choose the best specialist to handle it. Provide your decision in the required format.\n",
245 | "\n",
246 | "User Request: \"{{user_request}}\"\"\"\"\n",
247 | " )\n",
248 | " \n",
249 | " controller_llm = llm.with_structured_output(ControllerDecision)\n",
250 | " chain = prompt | controller_llm\n",
251 | " \n",
252 | " decision = chain.invoke({\"user_request\": state['user_request']})\n",
253 | " console.print(f\"[yellow]Routing decision:[/yellow] Send to [bold]{decision.next_agent}[/bold]. [italic]Reason: {decision.reasoning}[/italic]\")\n",
254 | " \n",
255 | " return {\"next_agent_to_call\": decision.next_agent}\n",
256 | "\n",
257 | "print(\"Meta-Controller node defined successfully.\")"
258 | ]
259 | },
260 | {
261 | "cell_type": "markdown",
262 | "id": "phase3-title",
263 | "metadata": {},
264 | "source": [
265 | "## Phase 3: Assembling and Running the Graph\n",
266 | "\n",
267 | "Now we'll use LangGraph to wire everything together. The graph will start with the Meta-Controller, and then, based on its decision, a conditional edge will route the state to the correct specialist node. After the specialist runs, the graph will end."
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": 5,
273 | "id": "graph-assembly-code",
274 | "metadata": {},
275 | "outputs": [
276 | {
277 | "name": "stdout",
278 | "output_type": "stream",
279 | "text": [
280 | "Meta-Controller agent graph compiled successfully.\n"
281 | ]
282 | }
283 | ],
284 | "source": [
285 | "# Build the graph\n",
286 | "workflow = StateGraph(MetaAgentState)\n",
287 | "\n",
288 | "# Add nodes for the controller and each specialist\n",
289 | "workflow.add_node(\"meta_controller\", meta_controller_node)\n",
290 | "workflow.add_node(\"Generalist\", generalist_node)\n",
291 | "workflow.add_node(\"Researcher\", research_agent_node)\n",
292 | "workflow.add_node(\"Coder\", coding_agent_node)\n",
293 | "\n",
294 | "# Set the entry point\n",
295 | "workflow.set_entry_point(\"meta_controller\")\n",
296 | "\n",
297 | "# Define the conditional routing logic\n",
298 | "def route_to_specialist(state: MetaAgentState) -> str:\n",
299 | " \"\"\"Reads the controller's decision and returns the name of the node to route to.\"\"\"\n",
300 | " return state[\"next_agent_to_call\"]\n",
301 | "\n",
302 | "workflow.add_conditional_edges(\n",
303 | " \"meta_controller\",\n",
304 | " route_to_specialist,\n",
305 | " {\n",
306 | " \"Generalist\": \"Generalist\",\n",
307 | " \"Researcher\": \"Researcher\",\n",
308 | " \"Coder\": \"Coder\"\n",
309 | " }\n",
310 | ")\n",
311 | "\n",
312 | "# After any specialist runs, the process ends\n",
313 | "workflow.add_edge(\"Generalist\", END)\n",
314 | "workflow.add_edge(\"Researcher\", END)\n",
315 | "workflow.add_edge(\"Coder\", END)\n",
316 | "\n",
317 | "meta_agent = workflow.compile()\n",
318 | "print(\"Meta-Controller agent graph compiled successfully.\")"
319 | ]
320 | },
321 | {
322 | "cell_type": "markdown",
323 | "id": "phase4-title",
324 | "metadata": {},
325 | "source": [
326 | "## Phase 4: Demonstration\n",
327 | "\n",
328 | "Let's test our Meta-Controller with a variety of prompts to see if it correctly dispatches them to the right specialist."
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": 6,
334 | "id": "demo-code",
335 | "metadata": {},
336 | "outputs": [
337 | {
338 | "data": {
339 | "text/plain": [
340 | "--- 💬 Test 1: General Conversation ---\n"
341 | ]
342 | },
343 | "output_type": "display_data",
344 | "metadata": {}
345 | },
346 | {
347 | "name": "stdout",
348 | "output_type": "stream",
349 | "text": [
350 | "--- 🧠 Meta-Controller Analyzing Request ---\n",
351 | "Routing decision: Send to Generalist. Reason: The user's request is a simple greeting, which falls under the category of casual conversation handled by the Generalist agent.\n"
352 | ]
353 | },
354 | {
355 | "data": {
356 | "text/plain": [
357 | "\n",
358 | "Final Response:\n"
359 | ]
360 | },
361 | "output_type": "display_data",
362 | "metadata": {}
363 | },
364 | {
365 | "data": {
366 | "text/markdown": [
367 | "Hello there! How can I help you today?"
368 | ],
369 | "text/plain": [
370 | "Hello there! How can I help you today?"
371 | ]
372 | },
373 | "output_type": "display_data",
374 | "metadata": {}
375 | },
376 | {
377 | "data": {
378 | "text/plain": [
379 | "\n",
380 | "--- 🔬 Test 2: Research Question ---\n"
381 | ]
382 | },
383 | "output_type": "display_data",
384 | "metadata": {}
385 | },
386 | {
387 | "name": "stdout",
388 | "output_type": "stream",
389 | "text": [
390 | "--- 🧠 Meta-Controller Analyzing Request ---\n",
391 | "Routing decision: Send to Researcher. Reason: The user is asking about a recent event, the latest financial results of a specific company. This requires up-to-date information from the web, which is the specialty of the Researcher agent.\n"
392 | ]
393 | },
394 | {
395 | "data": {
396 | "text/plain": [
397 | "\n",
398 | "Final Response:\n"
399 | ]
400 | },
401 | "output_type": "display_data",
402 | "metadata": {}
403 | },
404 | {
405 | "data": {
406 | "text/markdown": [
407 | "NVIDIA's latest financial results, for the quarter ending in April 2024, were exceptionally strong. They reported revenue of $26.04 billion, a significant increase year-over-year, driven largely by their Data Center revenue which hit a record $22.6 billion. Their GAAP earnings per diluted share were $5.98."
408 | ],
409 | "text/plain": [
410 | "NVIDIA's latest financial results, for the quarter ending in April 2024, were exceptionally strong. They reported revenue of $26.04 billion, a significant increase year-over-year, driven largely by their Data Center revenue which hit a record $22.6 billion. Their GAAP earnings per diluted share were $5.98."
411 | ]
412 | },
413 | "output_type": "display_data",
414 | "metadata": {}
415 | },
416 | {
417 | "data": {
418 | "text/plain": [
419 | "\n",
420 | "--- 💻 Test 3: Coding Request ---\n"
421 | ]
422 | },
423 | "output_type": "display_data",
424 | "metadata": {}
425 | },
426 | {
427 | "name": "stdout",
428 | "output_type": "stream",
429 | "text": [
430 | "--- 🧠 Meta-Controller Analyzing Request ---\n",
431 | "Routing decision: Send to Coder. Reason: The user is explicitly asking for a Python function, which is a coding task. The Coder agent is the specialist for this.\n"
432 | ]
433 | },
434 | {
435 | "data": {
436 | "text/plain": [
437 | "\n",
438 | "Final Response:\n"
439 | ]
440 | },
441 | "output_type": "display_data",
442 | "metadata": {}
443 | },
444 | {
445 | "data": {
446 | "text/markdown": [
447 | "```python\n",
448 | "def fibonacci(n):\n",
449 | " \"\"\"Calculates the nth Fibonacci number.\"\"\"\n",
450 | " if n <= 0:\n",
451 | " return 0\n",
452 | " elif n == 1:\n",
453 | " return 1\n",
454 | " else:\n",
455 | " a, b = 0, 1\n",
456 | " for _ in range(2, n + 1):\n",
457 | " a, b = b, a + b\n",
458 | " return b\n",
459 | "```"
460 | ],
461 | "text/plain": [
462 | "```python\n",
463 | "def fibonacci(n):\n",
464 | " \"\"\"Calculates the nth Fibonacci number.\"\"\"\n",
465 | " if n <= 0:\n",
466 | " return 0\n",
467 | " elif n == 1:\n",
468 | " return 1\n",
469 | " else:\n",
470 | " a, b = 0, 1\n",
471 | " for _ in range(2, n + 1):\n",
472 | " a, b = b, a + b\n",
473 | " return b\n",
474 | "```"
475 | ]
476 | },
477 | "output_type": "display_data",
478 | "metadata": {}
479 | }
480 | ],
481 | "source": [
482 | "def run_agent(query: str):\n",
483 | " result = meta_agent.invoke({\"user_request\": query})\n",
484 | " console.print(\"\\n[bold]Final Response:[/bold]\")\n",
485 | " console.print(Markdown(result['generation']))\n",
486 | "\n",
487 | "# Test 1: Should be routed to the Generalist\n",
488 | "console.print(\"--- 💬 Test 1: General Conversation ---\")\n",
489 | "run_agent(\"Hello, how are you today?\")\n",
490 | "\n",
491 | "# Test 2: Should be routed to the Researcher\n",
492 | "console.print(\"\\n--- 🔬 Test 2: Research Question ---\")\n",
493 | "run_agent(\"What were NVIDIA's latest financial results?\")\n",
494 | "\n",
495 | "# Test 3: Should be routed to the Coder\n",
496 | "console.print(\"\\n--- 💻 Test 3: Coding Request ---\")\n",
497 | "run_agent(\"Can you write me a python function to calculate the nth fibonacci number?\")"
498 | ]
499 | },
500 | {
501 | "cell_type": "markdown",
502 | "id": "conclusion",
503 | "metadata": {},
504 | "source": [
505 | "## Conclusion\n",
506 | "\n",
507 | "In this notebook, we have successfully implemented a **Meta-Controller** architecture. Our tests clearly demonstrate its primary function: acting as an intelligent and dynamic router.\n",
508 | "\n",
509 | "1. The simple greeting was correctly identified and sent to the **Generalist**.\n",
510 | "2. The query about recent financial news was dispatched to the **Researcher**, which used its search tool to fetch up-to-date information.\n",
511 | "3. The request for a code snippet was routed to the **Coder**, which provided a well-formatted and correct function.\n",
512 | "\n",
513 | "This pattern is exceptionally powerful for building scalable and maintainable AI systems. By separating concerns, each specialist can be improved independently without affecting the others. The system's overall intelligence can be enhanced simply by adding new, more capable specialists and making the Meta-Controller aware of them. While the controller adds an extra LLM call to every request and is a single point of failure for routing errors, its role as a flexible orchestrator is a cornerstone of advanced agentic design."
514 | ]
515 | }
516 | ],
517 | "metadata": {
518 | "kernelspec": {
519 | "display_name": "Python 3 (ipykernel)",
520 | "language": "python",
521 | "name": "python3"
522 | },
523 | "language_info": {
524 | "codemirror_mode": {
525 | "name": "ipython",
526 | "version": 3
527 | },
528 | "file_extension": ".py",
529 | "mimetype": "text/x-python",
530 | "name": "python",
531 | "nbconvert_exporter": "python",
532 | "pygments_lexer": "ipython3",
533 | "version": "3.10.13"
534 | }
535 | },
536 | "nbformat": 4,
537 | "nbformat_minor": 5
538 | }
--------------------------------------------------------------------------------
/12_graph.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "intro-title",
6 | "metadata": {},
7 | "source": [
8 | "# 📘 Agentic Architectures 12: Graph / World-Model Memory\n",
9 | "\n",
10 | "Welcome to this detailed exploration of one of the most powerful memory structures for AI agents: the **Graph-based World Model**. This architecture moves beyond simple document retrieval or chat history to build a structured, interconnected understanding of the world, much like a human's semantic memory.\n",
11 | "\n",
12 | "Instead of storing information as isolated chunks of text, a graph-based agent parses incoming data into **entities (nodes)** and **relationships (edges)**. This creates a rich, queryable knowledge graph. The agent can then answer complex questions by traversing this graph, uncovering insights that would be hidden in unstructured text.\n",
13 | "\n",
14 | "To showcase this in detail, we will build a **Corporate Intelligence Agent**. This agent will:\n",
15 | "1. **Ingest Unstructured Reports:** Read text documents about companies, people, and products.\n",
16 | "2. **Construct a Knowledge Graph:** Use an LLM to extract entities (e.g., `Company`, `Person`) and relationships (e.g., `ACQUIRED`, `WORKS_FOR`, `COMPETES_WITH`) and populate a Neo4j graph database.\n",
17 | "3. **Answer Complex, Multi-Hop Questions:** Use the graph to answer questions that require connecting multiple pieces of information, such as \"*Who works for the company that acquired BetaSolutions?*\"—a query that is extremely difficult for standard vector search."
18 | ]
19 | },
20 | {
21 | "cell_type": "markdown",
22 | "id": "intro-definition",
23 | "metadata": {},
24 | "source": [
25 | "### Definition\n",
26 | "A **Graph / World-Model Memory** is an agentic architecture where knowledge is stored in a structured graph database. Information is represented as nodes (entities like people, places, concepts) and edges (the relationships between them). This creates a dynamic \"world model\" that the agent can reason over.\n",
27 | "\n",
28 | "### High-level Workflow\n",
29 | "\n",
30 | "1. **Information Ingestion:** The agent receives unstructured or semi-structured data (text, documents, API responses).\n",
31 | "2. **Knowledge Extraction:** An LLM-powered process parses the information, identifying key entities and the relationships that connect them.\n",
32 | "3. **Graph Update:** The extracted nodes and edges are added to or updated in a persistent graph database (like Neo4j).\n",
33 | "4. **Question Answering / Reasoning:** When asked a question, the agent:\n",
34 | " a. Converts the natural language question into a formal graph query language (e.g., Cypher for Neo4j).\n",
35 | " b. Executes the query against the graph to retrieve relevant subgraphs or facts.\n",
36 | " c. Synthesizes the query results into a natural language answer.\n",
37 | "\n",
38 | "### When to Use / Applications\n",
39 | "* **Enterprise Knowledge Assistants:** Building a queryable model of a company's projects, employees, and customers from internal documents.\n",
40 | "* **Advanced Research Assistants:** Creating a dynamic knowledge base of a scientific field by ingesting research papers.\n",
41 | "* **Complex System Diagnostics:** Modeling a system's components and their dependencies to diagnose failures.\n",
42 | "\n",
43 | "### Strengths & Weaknesses\n",
44 | "* **Strengths:**\n",
45 | " * **Structured & Explainable:** The knowledge is highly organized. An answer can be explained by showing the exact path in the graph that led to it.\n",
46 | " * **Enables Complex Reasoning:** Excels at answering \"multi-hop\" questions that require connecting disparate pieces of information through relationships.\n",
47 | "* **Weaknesses:**\n",
48 | " * **Upfront Complexity:** Requires a well-defined schema and a robust extraction process.\n",
49 | " * **Keeping the Graph Updated:** Can be challenging to manage updates, resolve conflicting information, and prune outdated facts over time (knowledge lifecycle management)."
50 | ]
51 | },
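  {
   "cell_type": "markdown",
   "id": "qa-workflow-sketch",
   "metadata": {},
   "source": [
    "Step 4 of the workflow can be sketched as one function with pluggable parts. This is a minimal illustration only: `to_cypher`, `run_query`, and `synthesize` are hypothetical stand-ins for the LLM calls and the Neo4j driver that we build for real later in this notebook.\n",
    "\n",
    "```python\n",
    "def answer_question(question, schema, to_cypher, run_query, synthesize):\n",
    "    cypher = to_cypher(question, schema)  # a. NL question -> Cypher query\n",
    "    rows = run_query(cypher)              # b. execute against the graph\n",
    "    return synthesize(question, rows)     # c. rows -> NL answer\n",
    "```"
   ]
  },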
52 | {
53 | "cell_type": "markdown",
54 | "id": "phase0-title",
55 | "metadata": {},
56 | "source": [
57 | "## Phase 0: Foundation & Setup\n",
58 | "\n",
59 | "We'll install libraries, including the Neo4j driver, and configure our environment. **Crucially, you must have a running Neo4j instance and a `.env` file with its credentials.**"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": 1,
65 | "id": "install-libs",
66 | "metadata": {},
67 | "outputs": [],
68 | "source": [
69 | "# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv langchain_community neo4j"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": 2,
75 | "id": "import-and-keys",
76 | "metadata": {},
77 | "outputs": [
78 | {
79 | "name": "stdout",
80 | "output_type": "stream",
81 | "text": [
82 | "Environment variables loaded and tracing is set up.\n"
83 | ]
84 | }
85 | ],
86 | "source": [
87 | "import os\n",
88 | "from typing import List, Dict, Any, Optional\n",
89 | "from dotenv import load_dotenv\n",
90 | "\n",
91 | "# Pydantic for data modeling\n",
92 | "from pydantic import BaseModel, Field\n",
93 | "\n",
94 | "# LangChain components\n",
95 | "from langchain_nebius import ChatNebius\n",
96 | "from langchain_community.graphs import Neo4jGraph\n",
97 | "from langchain.chains import GraphCypherQAChain\n",
98 | "from langchain_core.prompts import ChatPromptTemplate\n",
99 | "from langchain_core.pydantic_v1 import BaseModel as V1BaseModel\n",
100 | "\n",
101 | "# For pretty printing\n",
102 | "from rich.console import Console\n",
103 | "from rich.markdown import Markdown\n",
104 | "\n",
105 | "# --- API Key and Tracing Setup ---\n",
106 | "load_dotenv()\n",
107 | "\n",
108 | "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
109 | "os.environ[\"LANGCHAIN_PROJECT\"] = \"Agentic Architecture - Graph Memory (Nebius)\"\n",
110 | "\n",
111 | "required_vars = [\"NEBIUS_API_KEY\", \"LANGCHAIN_API_KEY\", \"NEO4J_URI\", \"NEO4J_USERNAME\", \"NEO4J_PASSWORD\"]\n",
112 | "for var in required_vars:\n",
113 | " if var not in os.environ:\n",
114 | " print(f\"Warning: Environment variable {var} not set.\")\n",
115 | "\n",
116 | "print(\"Environment variables loaded and tracing is set up.\")"
117 | ]
118 | },
119 | {
120 | "cell_type": "markdown",
121 | "id": "phase1-title",
122 | "metadata": {},
123 | "source": [
124 | "## Phase 1: Building the Graph Construction Agent\n",
125 | "\n",
126 | "This is the heart of the ingestion pipeline. We need an agent that can read unstructured text and extract entities and relationships in a structured format. We will use an LLM with structured output capabilities (Pydantic) to act as our knowledge extractor."
127 | ]
128 | },
129 | {
130 | "cell_type": "code",
131 | "execution_count": 3,
132 | "id": "graph-builder-code",
133 | "metadata": {},
134 | "outputs": [
135 | {
136 | "name": "stdout",
137 | "output_type": "stream",
138 | "text": [
139 | "Successfully connected to Neo4j and defined the Graph Maker Agent.\n"
140 | ]
141 | }
142 | ],
143 | "source": [
144 | "console = Console()\n",
145 | "llm = ChatNebius(model=\"mistralai/Mixtral-8x22B-Instruct-v0.1\", temperature=0)\n",
146 | "\n",
147 | "# Connect to our Neo4j database\n",
148 | "try:\n",
149 | " graph = Neo4jGraph()\n",
150 | " # Clear the graph for a clean run\n",
151 | " graph.query(\"MATCH (n) DETACH DELETE n\")\n",
152 | "except Exception as e:\n",
153 | " console.print(f\"[bold red]Failed to connect to Neo4j: {e}. Please check your credentials and connection.[/bold red]\")\n",
154 | " graph = None\n",
155 | "\n",
156 | "# Pydantic models for structured extraction (using LangChain's v1 BaseModel for compatibility with older structured output methods)\n",
157 | "class Node(V1BaseModel):\n",
158 | " id: str = Field(description=\"Unique name or identifier for the entity.\")\n",
159 | " type: str = Field(description=\"The type or label of the entity (e.g., Person, Company, Product).\")\n",
160 | "\n",
161 | "class Relationship(V1BaseModel):\n",
162 | " source: Node\n",
163 | " target: Node\n",
164 | " type: str = Field(description=\"The type of relationship (e.g., WORKS_FOR, ACQUIRED).\")\n",
165 | "\n",
166 | "class KnowledgeGraph(V1BaseModel):\n",
167 | " \"\"\"A graph of nodes and relationships.\"\"\"\n",
168 | " relationships: List[Relationship]\n",
169 | "\n",
170 | "# The Graph Maker Agent\n",
171 | "def get_graph_maker_chain():\n",
172 | " extractor_llm = llm.with_structured_output(KnowledgeGraph)\n",
173 | " prompt = ChatPromptTemplate.from_messages([\n",
174 | " (\"system\", \"You are an expert at extracting information from text and building a knowledge graph. Extract all entities (nodes) and relationships from the provided text. The relationship type should be a verb in all caps, like 'WORKS_FOR' or 'ACQUIRED'.\"),\n",
175 | " (\"human\", \"Extract a knowledge graph from the following text:\\n\\n{text}\")\n",
176 | " ])\n",
177 | " return prompt | extractor_llm\n",
178 | "\n",
179 | "graph_maker_agent = get_graph_maker_chain()\n",
180 | "print(\"Successfully connected to Neo4j and defined the Graph Maker Agent.\")"
181 | ]
182 | },
183 | {
184 | "cell_type": "markdown",
185 | "id": "phase2-title",
186 | "metadata": {},
187 | "source": [
188 | "## Phase 2: Ingesting Knowledge and Building the World Model\n",
189 | "\n",
190 | "Now, we'll feed our agent a series of related but separate documents. The agent will process each one and progressively build up our corporate knowledge graph. This simulates how a real system would learn over time as new information becomes available."
191 | ]
192 | },
193 | {
194 | "cell_type": "code",
195 | "execution_count": 4,
196 | "id": "ingestion-code",
197 | "metadata": {},
198 | "outputs": [
199 | {
200 | "name": "stdout",
201 | "output_type": "stream",
202 | "text": [
203 | "--- Ingesting Document 1 ---\n",
204 | "Successfully added 1 relationships to the graph.\n",
205 | "--- Ingesting Document 2 ---\n",
206 | "Successfully added 2 relationships to the graph.\n",
207 | "--- Ingesting Document 3 ---\n",
208 | "Successfully added 2 relationships to the graph.\n",
209 | "--- ✅ Knowledge Graph Ingestion Complete ---\n"
210 | ]
211 | },
212 | {
213 | "data": {
214 | "text/plain": [
215 | "\n",
216 | "--- Graph Schema ---\n"
217 | ]
218 | },
219 | "metadata": {},
220 | "output_type": "display_data"
221 | },
222 | {
223 | "name": "stdout",
224 | "output_type": "stream",
225 | "text": [
226 | "Node properties: [{'properties': [('id', 'STRING')], 'labels': ['Product']}, {'properties': [('id', 'STRING')], 'labels': ['Person']}, {'properties': [('id', 'STRING')], 'labels': ['Company']}]\n",
227 | "Relationship properties: []\n",
228 | "Relationships: [(:Company)-[:PRODUCES]->(:Product), (:Person)-[:WORKS_FOR]->(:Company), (:Product)-[:COMPETES_WITH]->(:Product), (:Company)-[:ACQUIRED]->(:Company)]\n"
229 | ]
230 | }
231 | ],
232 | "source": [
233 | "unstructured_documents = [\n",
234 | " \"On May 15, 2023, global tech giant AlphaCorp announced its acquisition of startup BetaSolutions, a leader in cloud-native database technology.\",\n",
235 | " \"Dr. Evelyn Reed, a renowned AI researcher, has been the Chief Science Officer at AlphaCorp since 2021. She leads the team responsible for the QuantumLeap AI platform.\",\n",
236 | " \"Innovate Inc.'s flagship product, NeuraGen, is seen as a direct competitor to AlphaCorp's QuantumLeap AI. Meanwhile, Innovate Inc. recently hired Johnathan Miles as their new CTO.\"\n",
237 | "]\n",
238 | "for i, doc in enumerate(unstructured_documents):\n",
239 | " console.print(f\"--- Ingesting Document {i+1} ---\")\n",
240 | " try:\n",
241 | " kg_data = graph_maker_agent.invoke({\"text\": doc})\n",
242 | " if kg_data.relationships:\n",
243 | "            # add_graph_documents expects GraphDocument objects, so MERGE each extracted relationship directly with Cypher\n            for rel in kg_data.relationships:\n                graph.query(f'MERGE (s:`{rel.source.type}` {{id: $sid}}) MERGE (t:`{rel.target.type}` {{id: $tid}}) MERGE (s)-[:`{rel.type}`]->(t)', params={'sid': rel.source.id, 'tid': rel.target.id})\n",
244 | " console.print(f\"[green]Successfully added {len(kg_data.relationships)} relationships to the graph.[/green]\")\n",
245 | " else:\n",
246 | " console.print(\"[yellow]No relationships extracted.[/yellow]\")\n",
247 | " except Exception as e:\n",
248 | " console.print(f\"[red]Failed to process document: {e}[/red]\")\n",
249 | "\n",
250 | "console.print(\"--- ✅ Knowledge Graph Ingestion Complete ---\")\n",
251 | "console.print(\"\\n--- Graph Schema ---\")\n",
252 | "console.print(graph.schema)"
253 | ]
254 | },
255 | {
256 | "cell_type": "markdown",
257 | "id": "phase3-title",
258 | "metadata": {},
259 | "source": [
260 | "## Phase 3: Building the Graph-Querying Agent\n",
261 | "\n",
262 | "With our knowledge graph populated, we need an agent that can use it. This involves a **Text-to-Cypher** process. The agent will receive a user's natural language question, convert it into a Cypher query using the graph schema as context, execute the query, and then synthesize the results into a human-readable answer."
263 | ]
264 | },
265 | {
266 | "cell_type": "code",
267 | "execution_count": 5,
268 | "id": "query-agent-code",
269 | "metadata": {},
270 | "outputs": [
271 | {
272 | "name": "stdout",
273 | "output_type": "stream",
274 | "text": [
275 | "Graph-Querying Agent defined successfully.\n"
276 | ]
277 | }
278 | ],
279 | "source": [
280 | "# LangChain ships a built-in chain for this (GraphCypherQAChain, imported above),\n",
281 | "# but we'll build the pipeline from its components to understand how it works.\n",
282 | "cypher_generation_prompt = ChatPromptTemplate.from_template(\n",
283 | " \"\"\"You are an expert Neo4j Cypher query developer. Your task is to convert a user's natural language question into a valid Cypher query.\n",
284 | "You must use the provided graph schema to construct the query. Do not use any node labels or relationship types that are not in the schema.\n",
285 | "Return ONLY the Cypher query, with no additional text or explanations.\n",
286 | "\n",
287 | "Graph Schema:\n",
288 | "{schema}\n",
289 | "\n",
290 | "User Question:\n",
291 | "{question}\n",
292 | "\"\"\"\n",
293 | ")\n",
294 | "\n",
295 | "cypher_response_prompt = ChatPromptTemplate.from_template(\n",
296 | " \"\"\"You are an assistant that provides clear, natural language answers based on query results from a knowledge graph.\n",
297 | "Use the context from the graph query result to answer the user's original question.\n",
298 | "\n",
299 | "User Question: {question}\n",
300 | "Query Result: {context}\n",
301 | "\"\"\"\n",
302 | ")\n",
303 | "\n",
304 | "def query_graph(question: str) -> Dict[str, Any]:\n",
305 | " \"\"\"The full Text-to-Cypher and synthesis pipeline.\"\"\"\n",
306 | " console.print(f\"\\n[bold]Question:[/bold] {question}\")\n",
307 | " \n",
308 | " # 1. Generate Cypher Query\n",
309 | " console.print(\"--- ➡️ Generating Cypher Query ---\")\n",
310 | " cypher_chain = cypher_generation_prompt | llm\n",
311 | " generated_cypher = cypher_chain.invoke({\"schema\": graph.schema, \"question\": question}).content\n",
312 | " console.print(f\"[cyan]Generated Cypher:\\n{generated_cypher}[/cyan]\")\n",
313 | " \n",
314 | " # 2. Execute Cypher Query\n",
315 | " console.print(\"--- ⚡ Executing Query ---\")\n",
316 | " try:\n",
317 | " context = graph.query(generated_cypher)\n",
318 | " console.print(f\"[yellow]Query Result:\\n{context}[/yellow]\")\n",
319 | " except Exception as e:\n",
320 | " console.print(f\"[red]Cypher Query Failed: {e}[/red]\")\n",
321 | " return {\"answer\": \"I was unable to execute a query to find the answer to your question.\"}\n",
322 | " \n",
323 | " # 3. Synthesize Final Answer\n",
324 | " console.print(\"--- 🗣️ Synthesizing Final Answer ---\")\n",
325 | " synthesis_chain = cypher_response_prompt | llm\n",
326 | " answer = synthesis_chain.invoke({\"question\": question, \"context\": context}).content\n",
327 | " \n",
328 | " return {\"answer\": answer}\n",
329 | "\n",
330 | "print(\"Graph-Querying Agent defined successfully.\")"
331 | ]
332 | },
333 | {
334 | "cell_type": "markdown",
335 | "id": "phase4-title",
336 | "metadata": {},
337 | "source": [
338 | "## Phase 4: Demonstration & Analysis\n",
339 | "\n",
340 | "Now for the ultimate test. We will ask our agent questions that range from simple fact retrieval to complex, multi-hop reasoning that requires connecting information from all three of our source documents."
341 | ]
342 | },
343 | {
344 | "cell_type": "code",
345 | "execution_count": 6,
346 | "id": "demo-code",
347 | "metadata": {},
348 | "outputs": [
349 | {
350 | "name": "stdout",
351 | "output_type": "stream",
352 | "text": [
353 | "\n",
354 | "Question: Who works for AlphaCorp?\n",
355 | "--- ➡️ Generating Cypher Query ---\n",
356 | "Generated Cypher:\n",
357 | "MATCH (p:Person)-[:WORKS_FOR]->(c:Company {id: 'AlphaCorp'}) RETURN p.id\n",
358 | "--- ⚡ Executing Query ---\n",
359 | "Query Result:\n",
360 | "[{'p.id': 'Dr. Evelyn Reed'}]\n",
361 | "--- 🗣️ Synthesizing Final Answer ---\n"
362 | ]
363 | },
364 | {
365 | "data": {
366 | "text/plain": [
367 | "\n",
368 | "--- Final Answer ---\n"
369 | ]
370 | },
371 | "metadata": {},
372 | "output_type": "display_data"
373 | },
374 | {
375 | "data": {
376 | "text/markdown": [
377 | "Dr. Evelyn Reed works for AlphaCorp."
378 | ],
379 | "text/plain": [
380 | "Dr. Evelyn Reed works for AlphaCorp."
381 | ]
382 | },
383 | "metadata": {},
384 | "output_type": "display_data"
385 | },
386 | {
387 | "name": "stdout",
388 | "output_type": "stream",
389 | "text": [
390 | "\n",
391 | "Question: What company did AlphaCorp acquire?\n",
392 | "--- ➡️ Generating Cypher Query ---\n",
393 | "Generated Cypher:\n",
394 | "MATCH (:Company {id: 'AlphaCorp'})-[:ACQUIRED]->(acquired_company:Company)\n",
395 | "RETURN acquired_company.id\n",
396 | "--- ⚡ Executing Query ---\n",
397 | "Query Result:\n",
398 | "[{'acquired_company.id': 'BetaSolutions'}]\n",
399 | "--- 🗣️ Synthesizing Final Answer ---\n"
400 | ]
401 | },
402 | {
403 | "data": {
404 | "text/plain": [
405 | "\n",
406 | "--- Final Answer ---\n"
407 | ]
408 | },
409 | "metadata": {},
410 | "output_type": "display_data"
411 | },
412 | {
413 | "data": {
414 | "text/markdown": [
415 | "AlphaCorp acquired BetaSolutions."
416 | ],
417 | "text/plain": [
418 | "AlphaCorp acquired BetaSolutions."
419 | ]
420 | },
421 | "metadata": {},
422 | "output_type": "display_data"
423 | },
424 | {
425 | "name": "stdout",
426 | "output_type": "stream",
427 | "text": [
428 | "\n",
429 | "Question: What companies compete with the products made by the company that acquired BetaSolutions?\n",
430 | "--- ➡️ Generating Cypher Query ---\n",
431 | "Generated Cypher:\n",
432 | "MATCH (acquirer:Company)-[:ACQUIRED]->(:Company {id: 'BetaSolutions'})\n",
433 | "MATCH (acquirer)-[:PRODUCES]->(product:Product)\n",
434 | "MATCH (product)-[:COMPETES_WITH]->(competitor_product:Product)\n",
435 | "MATCH (competitor_company:Company)-[:PRODUCES]->(competitor_product)\n",
436 | "RETURN DISTINCT competitor_company.id\n",
437 | "--- ⚡ Executing Query ---\n",
438 | "Query Result:\n",
439 | "[{'competitor_company.id': 'Innovate Inc.'}]\n",
440 | "--- 🗣️ Synthesizing Final Answer ---\n"
441 | ]
442 | },
443 | {
444 | "data": {
445 | "text/plain": [
446 | "\n",
447 | "--- Final Answer ---\n"
448 | ]
449 | },
450 | "metadata": {},
451 | "output_type": "display_data"
452 | },
453 | {
454 | "data": {
455 | "text/markdown": [
456 | "Innovate Inc. competes with the products made by the company that acquired BetaSolutions."
457 | ],
458 | "text/plain": [
459 | "Innovate Inc. competes with the products made by the company that acquired BetaSolutions."
460 | ]
461 | },
462 | "metadata": {},
463 | "output_type": "display_data"
464 | }
465 | ],
466 | "source": [
467 | "# Test 1: Simple fact retrieval (requires info from doc 2)\n",
468 | "result1 = query_graph(\"Who works for AlphaCorp?\")\n",
469 | "console.print(\"\\n--- Final Answer ---\")\n",
470 | "console.print(Markdown(result1['answer']))\n",
471 | "\n",
472 | "# Test 2: Another simple fact retrieval (requires info from doc 1)\n",
473 | "result2 = query_graph(\"What company did AlphaCorp acquire?\")\n",
474 | "console.print(\"\\n--- Final Answer ---\")\n",
475 | "console.print(Markdown(result2['answer']))\n",
476 | "\n",
477 | "# Test 3: The multi-hop reasoning question (requires info from all 3 docs)\n",
478 | "result3 = query_graph(\"What companies compete with the products made by the company that acquired BetaSolutions?\")\n",
479 | "console.print(\"\\n--- Final Answer ---\")\n",
480 | "console.print(Markdown(result3['answer']))"
481 | ]
482 | },
483 | {
484 | "cell_type": "markdown",
485 | "id": "analysis-markdown",
486 | "metadata": {},
487 | "source": [
488 | "### Analysis of the Results\n",
489 | "\n",
490 | "The demonstration highlights the profound advantage of a graph-based world model:\n",
491 | "\n",
492 | "- The first two questions were simple lookups. The agent successfully converted the questions into Cypher, queried the graph, and found the direct relationships.\n",
493 | "\n",
494 | "- The third question is the crucial one. A standard RAG agent would fail here. It might find the document about the acquisition and the document about the competitor, but it would struggle to connect them. It lacks the explicit relational structure to understand that the \"AlphaCorp\" in document 1 is the same entity as the \"AlphaCorp\" in documents 2 and 3.\n",
495 | "\n",
496 | "- Our graph-based agent, however, solved it with ease. We can trace its logic directly from the generated Cypher query:\n",
497 | " 1. `MATCH (acquirer:Company)-[:ACQUIRED]->(:Company {id: 'BetaSolutions'})`: First, find the company that acquired BetaSolutions (Result: AlphaCorp).\n",
498 | " 2. `MATCH (acquirer)-[:PRODUCES]->(product:Product)`: Next, find the products produced by that company (Result: QuantumLeap AI).\n",
499 | " 3. `MATCH (product)-[:COMPETES_WITH]->(competitor_product:Product)`: Then, find the products that compete with that product (Result: NeuraGen).\n",
500 | " 4. `MATCH (competitor_company:Company)-[:PRODUCES]->(competitor_product)`: Finally, find the company that produces the competing product (Result: Innovate Inc.).\n",
501 | "\n",
502 | "This ability to traverse relationships and synthesize information from different sources is the superpower of the Graph / World-Model architecture. The answer is not just retrieved; it is reasoned."
503 | ]
504 | },
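  {
   "cell_type": "markdown",
   "id": "traversal-sketch",
   "metadata": {},
   "source": [
    "To make the traversal concrete, here is the same four-hop chain replayed over a toy in-memory edge list (hard-coded from the three documents; edge directions follow the generated Cypher):\n",
    "\n",
    "```python\n",
    "edges = [\n",
    "    ('AlphaCorp', 'ACQUIRED', 'BetaSolutions'),\n",
    "    ('AlphaCorp', 'PRODUCES', 'QuantumLeap AI'),\n",
    "    ('QuantumLeap AI', 'COMPETES_WITH', 'NeuraGen'),\n",
    "    ('Innovate Inc.', 'PRODUCES', 'NeuraGen'),\n",
    "]\n",
    "\n",
    "def targets(src, rel):\n",
    "    return [t for s, r, t in edges if s == src and r == rel]\n",
    "\n",
    "def sources(rel, tgt):\n",
    "    return [s for s, r, t in edges if r == rel and t == tgt]\n",
    "\n",
    "acquirer = sources('ACQUIRED', 'BetaSolutions')[0]\n",
    "products = targets(acquirer, 'PRODUCES')\n",
    "rival_products = [r for p in products for r in targets(p, 'COMPETES_WITH')]\n",
    "competitors = [c for r in rival_products for c in sources('PRODUCES', r)]\n",
    "```"
   ]
  },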
505 | {
506 | "cell_type": "markdown",
507 | "id": "conclusion",
508 | "metadata": {},
509 | "source": [
510 | "## Conclusion\n",
511 | "\n",
512 | "In this notebook, we have constructed a complete agentic system built around a **Graph / World-Model Memory**. We demonstrated the full lifecycle: ingesting unstructured data, using an LLM to build a structured knowledge graph, and then using that graph to answer complex, multi-hop questions that require genuine reasoning.\n",
513 | "\n",
514 | "This architecture represents a significant leap in capability over simpler memory systems. By creating an explicit, queryable model of the world, we give our agents the ability to connect disparate facts and uncover hidden insights. While the challenges of maintaining this graph over time are real, the potential for building deeply knowledgeable and explainable AI assistants makes this one of the most exciting and powerful patterns in modern agentic design."
515 | ]
516 | }
517 | ],
518 | "metadata": {
519 | "kernelspec": {
520 | "display_name": "Python 3 (ipykernel)",
521 | "language": "python",
522 | "name": "python3"
523 | },
524 | "language_info": {
525 | "codemirror_mode": {
526 | "name": "ipython",
527 | "version": 3
528 | },
529 | "file_extension": ".py",
530 | "mimetype": "text/x-python",
531 | "name": "python",
532 | "nbconvert_exporter": "python",
533 | "pygments_lexer": "ipython3",
534 | "version": "3.10.13"
535 | }
536 | },
537 | "nbformat": 4,
538 | "nbformat_minor": 5
539 | }
540 |
--------------------------------------------------------------------------------
/14_dry_run.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "intro-title",
6 | "metadata": {},
7 | "source": [
8 | "# 📘 Agentic Architectures 14: Observability + Dry-Run Harness\n",
9 | "\n",
10 | "Welcome to a crucial notebook in our series, focusing on the deployment and operational safety of AI agents. We will implement an **Observability and Dry-Run Harness**, an essential pattern for testing, debugging, and safely managing agents that interact with real-world systems.\n",
11 | "\n",
12 | "The core principle is simple but powerful: **never run an agent's action in a live environment without first knowing exactly what it's going to do.** This architecture formalizes a \"look before you leap\" process. An agent first executes its plan in a `dry_run` mode, which doesn't change the real world but generates detailed logs and a clear plan of action. This plan is then presented to a human (or an automated checker) for approval before the final, live execution is permitted.\n",
13 | "\n",
14 | "To demonstrate this, we will build a **Corporate Social Media Agent**. This agent is tasked with creating and publishing posts. We will see how the dry-run harness allows us to:\n",
15 | "1. **Generate a Proposed Post:** The AI will creatively draft a post based on a prompt.\n",
16 | "2. **Perform a Dry Run:** The agent will call the `publish` function in a sandboxed `dry_run=True` mode, generating logs of what *would* happen.\n",
17 | "3. **Human-in-the-Loop Review:** A human operator is shown the exact post content and the dry-run trace. They must type `approve` to proceed.\n",
18 | "4. **Execute Live Action:** Only upon approval is the `publish` function called again, this time with `dry_run=False`, to perform the real action.\n",
19 | "\n",
20 | "This pattern is the bedrock of responsible agent deployment, providing the transparency and control needed to operate AI safely in production."
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "id": "intro-definition",
26 | "metadata": {},
27 | "source": [
28 | "### Definition\n",
29 | "An **Observability and Dry-Run Harness** is a testing and deployment architecture that intercepts an agent's actions. It first executes them in a \"dry run\" or \"sandboxed\" mode that simulates the action without causing real-world effects. The resulting plan and logs are then surfaced for review, and only after explicit approval is the action executed in the live environment.\n",
30 | "\n",
31 | "### High-level Workflow\n",
32 | "\n",
33 | "1. **Agent Proposes Action:** The agent determines a plan or a specific tool call to execute (e.g., `api.post_update(...)`).\n",
34 | "2. **Dry Run Execution:** The harness invokes the agent's plan with a `dry_run=True` flag. The underlying tools are designed to recognize this flag and only output what they *would* do, along with logs and traces.\n",
35 | "3. **Collect Observability Data:** The harness captures the proposed action, the dry-run logs, and any other relevant trace data from the simulation.\n",
36 | "4. **Human/Automated Review:** This observability data is presented to a reviewer. A human can check for correctness, safety, and alignment with goals. An automated system could run checks for policy violations or known bad patterns.\n",
37 | "5. **Go/No-Go Decision:** The reviewer makes an `approve` or `reject` decision.\n",
38 | "6. **Live Execution (on 'Go'):** If approved, the harness re-executes the agent's action, this time with `dry_run=False`, causing it to have a real-world effect.\n",
39 | "\n",
40 | "### When to Use / Applications\n",
41 | "* **Debugging and Testing:** In development, to understand exactly how an agent is interpreting a task and what actions it's taking without side effects.\n",
42 | "* **Production Validation & Safety:** As a permanent feature in production for any agent that can modify state, spend money, send communications, or perform any other irreversible action.\n",
43 | "* **CI/CD for Agents:** Integrating a dry-run harness into an automated testing pipeline to validate agent behavior before deploying new versions.\n",
44 | "\n",
45 | "### Strengths & Weaknesses\n",
46 | "* **Strengths:**\n",
47 | " * **Maximum Transparency & Safety:** Provides a clear, auditable preview of an agent's actions, preventing costly or embarrassing mistakes.\n",
48 | " * **Excellent for Debugging:** Makes it easy to trace the agent's logic and tool calls without having to undo real-world changes.\n",
49 | "* **Weaknesses:**\n",
50 | " * **Delays Deployment/Execution:** The mandatory review step (especially if human) introduces latency, making it unsuitable for real-time applications.\n",
51 | " * **Requires Tool Support:** The tools and APIs the agent uses must be designed to support a `dry_run` mode."
52 | ]
53 | },
54 | {
55 | "cell_type": "markdown",
56 | "id": "phase0-title",
57 | "metadata": {},
58 | "source": [
59 | "## Phase 0: Foundation & Setup\n",
60 | "\n",
61 | "Standard setup of libraries and environment variables."
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 1,
67 | "id": "install-libs",
68 | "metadata": {},
69 | "outputs": [],
70 | "source": [
71 | "# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": 2,
77 | "id": "import-and-keys",
78 | "metadata": {},
79 | "outputs": [
80 | {
81 | "name": "stdout",
82 | "output_type": "stream",
83 | "text": [
84 | "Environment variables loaded and tracing is set up.\n"
85 | ]
86 | }
87 | ],
88 | "source": [
89 | "import os\n",
90 | "import datetime\n",
91 | "from typing import List, Dict, Any, Optional\n",
92 | "from dotenv import load_dotenv\n",
93 | "\n",
94 | "# Pydantic for data modeling\n",
95 | "from pydantic import BaseModel, Field\n",
96 | "\n",
97 | "# LangChain components\n",
98 | "from langchain_nebius import ChatNebius\n",
99 | "from langchain_core.prompts import ChatPromptTemplate\n",
100 | "\n",
101 | "# LangGraph components\n",
102 | "from langgraph.graph import StateGraph, END\n",
103 | "from typing_extensions import TypedDict\n",
104 | "\n",
105 | "# For pretty printing\n",
106 | "from rich.console import Console\n",
107 | "from rich.markdown import Markdown\n",
108 | "from rich.panel import Panel\n",
109 | "\n",
110 | "# --- API Key and Tracing Setup ---\n",
111 | "load_dotenv()\n",
112 | "\n",
113 | "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
114 | "os.environ[\"LANGCHAIN_PROJECT\"] = \"Agentic Architecture - Dry-Run Harness (Nebius)\"\n",
115 | "\n",
116 | "required_vars = [\"NEBIUS_API_KEY\", \"LANGCHAIN_API_KEY\"]\n",
117 | "for var in required_vars:\n",
118 | " if var not in os.environ:\n",
119 | " print(f\"Warning: Environment variable {var} not set.\")\n",
120 | "\n",
121 | "print(\"Environment variables loaded and tracing is set up.\")"
122 | ]
123 | },
124 | {
125 | "cell_type": "markdown",
126 | "id": "phase1-title",
127 | "metadata": {},
128 | "source": [
129 | "## Phase 1: Building the Environment and Tools\n",
130 | "\n",
131 | "The core of this architecture is a tool that supports a `dry_run` mode. We will create a simple `SocialMediaAPI` class. Its `publish_post` method will behave differently depending on the `dry_run` flag, providing the observability we need."
132 | ]
133 | },
134 | {
135 | "cell_type": "code",
136 | "execution_count": 3,
137 | "id": "environment-setup-code",
138 | "metadata": {},
139 | "outputs": [
140 | {
141 | "name": "stdout",
142 | "output_type": "stream",
143 | "text": [
144 | "Dry-run capable SocialMediaAPI tool defined successfully.\n"
145 | ]
146 | }
147 | ],
148 | "source": [
149 | "console = Console()\n",
150 | "\n",
151 | "# Structured model for the agent's proposed post\n",
152 | "class SocialMediaPost(BaseModel):\n",
153 | " content: str = Field(description=\"The full text content of the social media post.\")\n",
154 | " hashtags: List[str] = Field(description=\"A list of relevant hashtags, without the '#'.\")\n",
155 | "\n",
156 | "# The key component: A tool with a dry_run flag\n",
157 | "class SocialMediaAPI:\n",
158 | " \"\"\"A mock social media API that supports a dry-run mode.\"\"\"\n",
159 | " \n",
160 | " def publish_post(self, post: SocialMediaPost, dry_run: bool = True) -> Dict[str, Any]:\n",
161 | " \"\"\"Publishes a post to the social media feed.\"\"\"\n",
162 | " timestamp = datetime.datetime.now().isoformat()\n",
163 | " hashtags_str = ' '.join([f'#{h}' for h in post.hashtags])\n",
164 | " full_post_text = f\"{post.content}\\n\\n{hashtags_str}\"\n",
165 | " \n",
166 | " if dry_run:\n",
167 |     "            # In dry-run mode, we don't execute; we just return the plan and logs\n",
168 | " log_message = f\"[DRY RUN] At {timestamp}, would publish the following post:\\n--- PREVIEW ---\\n{full_post_text}\\n--- END PREVIEW ---\"\n",
169 | " console.print(Panel(log_message, title=\"[yellow]Dry Run Log[/yellow]\", border_style=\"yellow\"))\n",
170 | " return {\"status\": \"DRY_RUN_SUCCESS\", \"log\": log_message, \"proposed_post\": full_post_text}\n",
171 | " else:\n",
172 | " # In live mode, we execute the action\n",
173 | " log_message = f\"[LIVE] At {timestamp}, successfully published post!\"\n",
174 | " console.print(Panel(log_message, title=\"[green]Live Execution Log[/green]\", border_style=\"green\"))\n",
175 | " # Here you would have the actual API call, e.g., twitter_client.create_tweet(...)\n",
176 | " return {\"status\": \"LIVE_SUCCESS\", \"log\": log_message, \"post_id\": f\"post_{hash(full_post_text)}\"}\n",
177 | "\n",
178 | "social_media_tool = SocialMediaAPI()\n",
179 | "print(\"Dry-run capable SocialMediaAPI tool defined successfully.\")"
180 | ]
181 | },
182 | {
183 | "cell_type": "markdown",
184 | "id": "phase2-title",
185 | "metadata": {},
186 | "source": [
187 | "## Phase 2: Building the Dry-Run Harness with LangGraph\n",
188 | "\n",
189 | "We will now construct the full workflow. The graph will manage the state of the process, moving from proposing an action, to the dry-run and review step, and finally to a conditional execution based on human approval."
190 | ]
191 | },
192 | {
193 | "cell_type": "code",
194 | "execution_count": 4,
195 | "id": "agent-build-code",
196 | "metadata": {},
197 | "outputs": [
198 | {
199 | "name": "stdout",
200 | "output_type": "stream",
201 | "text": [
202 | "Dry-Run Harness agent graph compiled successfully.\n"
203 | ]
204 | }
205 | ],
206 | "source": [
207 | "llm = ChatNebius(model=\"mistralai/Mixtral-8x22B-Instruct-v0.1\", temperature=0.5)\n",
208 | "\n",
209 | "# LangGraph State\n",
210 | "class AgentState(TypedDict):\n",
211 | " user_request: str\n",
212 | " proposed_post: Optional[SocialMediaPost]\n",
213 | " dry_run_log: Optional[str]\n",
214 | " review_decision: Optional[str] # 'approve' or 'reject'\n",
215 | " final_status: str\n",
216 | "\n",
217 | "# Graph Nodes\n",
218 | "def propose_post_node(state: AgentState) -> Dict[str, Any]:\n",
219 | " \"\"\"The creative agent that drafts the social media post.\"\"\"\n",
220 | " console.print(\"--- 📝 Social Media Agent Drafting Post ---\")\n",
221 | " prompt = ChatPromptTemplate.from_template(\n",
222 | " \"You are a creative and engaging social media manager for a major AI company. Based on the user's request, draft a compelling social media post, including relevant hashtags.\\n\\nRequest: {request}\"\n",
223 | " )\n",
224 | " post_generator_llm = llm.with_structured_output(SocialMediaPost)\n",
225 | " chain = prompt | post_generator_llm\n",
226 | " proposed_post = chain.invoke({\"request\": state['user_request']})\n",
227 | " return {\"proposed_post\": proposed_post}\n",
228 | "\n",
229 | "def dry_run_review_node(state: AgentState) -> Dict[str, Any]:\n",
230 | " \"\"\"Performs the dry run and prompts for human review.\"\"\"\n",
231 | " console.print(\"--- 🧐 Performing Dry Run & Awaiting Human Review ---\")\n",
232 | " dry_run_result = social_media_tool.publish_post(state['proposed_post'], dry_run=True)\n",
233 | " \n",
234 | " # Present the plan for review\n",
235 | " review_panel = Panel(\n",
236 | " f\"[bold]Proposed Post:[/bold]\\n{dry_run_result['proposed_post']}\",\n",
237 | " title=\"[bold yellow]Human-in-the-Loop: Review Required[/bold yellow]\",\n",
238 | " border_style=\"yellow\"\n",
239 | " )\n",
240 | " console.print(review_panel)\n",
241 | " \n",
242 | " # Get human approval\n",
243 | " decision = \"\"\n",
244 | " while decision.lower() not in [\"approve\", \"reject\"]:\n",
245 | " decision = console.input(\"Type 'approve' to publish or 'reject' to cancel: \")\n",
246 | " \n",
247 | " return {\"dry_run_log\": dry_run_result['log'], \"review_decision\": decision.lower()}\n",
248 | "\n",
249 | "def execute_live_post_node(state: AgentState) -> Dict[str, Any]:\n",
250 | " \"\"\"Executes the live post after approval.\"\"\"\n",
251 | " console.print(\"--- ✅ Post Approved, Executing Live ---\")\n",
252 | " live_result = social_media_tool.publish_post(state['proposed_post'], dry_run=False)\n",
253 | " return {\"final_status\": f\"Post successfully published! ID: {live_result.get('post_id')}\"}\n",
254 | "\n",
255 | "def post_rejected_node(state: AgentState) -> Dict[str, Any]:\n",
256 | " \"\"\"Handles the case where the post is rejected.\"\"\"\n",
257 | " console.print(\"--- ❌ Post Rejected by Human Reviewer ---\")\n",
258 | " return {\"final_status\": \"Action was rejected by the reviewer and not executed.\"}\n",
259 | "\n",
260 | "# Conditional Edge\n",
261 | "def route_after_review(state: AgentState) -> str:\n",
262 | " \"\"\"Routes to execution or rejection based on the human review.\"\"\"\n",
263 | " return \"execute_live\" if state[\"review_decision\"] == \"approve\" else \"reject\"\n",
264 | "\n",
265 | "# Build the graph\n",
266 | "workflow = StateGraph(AgentState)\n",
267 | "workflow.add_node(\"propose_post\", propose_post_node)\n",
268 | "workflow.add_node(\"dry_run_review\", dry_run_review_node)\n",
269 | "workflow.add_node(\"execute_live\", execute_live_post_node)\n",
270 | "workflow.add_node(\"reject\", post_rejected_node)\n",
271 | "\n",
272 | "workflow.set_entry_point(\"propose_post\")\n",
273 | "workflow.add_edge(\"propose_post\", \"dry_run_review\")\n",
274 | "workflow.add_conditional_edges(\"dry_run_review\", route_after_review, {\"execute_live\": \"execute_live\", \"reject\": \"reject\"})\n",
275 | "workflow.add_edge(\"execute_live\", END)\n",
276 | "workflow.add_edge(\"reject\", END)\n",
277 | "\n",
278 | "dry_run_agent = workflow.compile()\n",
279 | "print(\"Dry-Run Harness agent graph compiled successfully.\")"
280 | ]
281 | },
282 | {
283 | "cell_type": "markdown",
284 | "id": "phase3-title",
285 | "metadata": {},
286 | "source": [
287 | "## Phase 3: Demonstration\n",
288 | "\n",
289 | "Let's test the full system. First, with a safe, standard request that we will approve. Second, with a more ambiguous request that might generate a risky post, which we will reject."
290 | ]
291 | },
292 | {
293 | "cell_type": "code",
294 | "execution_count": 5,
295 | "id": "demo-code",
296 | "metadata": {},
297 | "outputs": [
298 | {
299 | "data": {
300 | "text/plain": [
301 | "--- ✅ Test 1: Safe Post (Approve) ---\n"
302 | ]
303 | },
304 | "output_type": "display_data",
305 | "metadata": {}
306 | },
307 | {
308 | "name": "stdout",
309 | "output_type": "stream",
310 | "text": [
311 | "--- 📝 Social Media Agent Drafting Post ---\n",
312 | "--- 🧐 Performing Dry Run & Awaiting Human Review ---\n"
313 | ]
314 | },
315 | {
316 | "data": {
317 | "text/plain": [
318 | " Dry Run Log \n",
319 | "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
320 | "┃ [DRY RUN] At 2024-06-25T12:00:00.000000, would publish the ┃\n",
321 | "┃ following post: ┃\n",
322 | "┃ --- PREVIEW --- ┃\n",
323 | "┃ We're thrilled to announce the launch of our new flagship AI ┃\n",
324 | "┃ model, 'Nebula'! It's set to revolutionize natural language ┃\n",
325 | "┃ understanding and generation. A new era of AI is here! ┃\n",
326 | "┃ ┃\n",
327 | "┃ #AI #Innovation #LaunchDay #Tech #Nebula ┃\n",
328 | "┃ --- END PREVIEW --- ┃\n",
329 | "┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛"
330 | ]
331 | },
332 | "output_type": "display_data",
333 | "metadata": {}
334 | },
335 | {
336 | "data": {
337 | "text/plain": [
338 | " Human-in-the-Loop: Review Required \n",
339 | "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
340 | "┃ Proposed Post: ┃\n",
341 | "┃ We're thrilled to announce the launch of our new flagship AI ┃\n",
342 | "┃ model, 'Nebula'! It's set to revolutionize natural language ┃\n",
343 | "┃ understanding and generation. A new era of AI is here! ┃\n",
344 | "┃ ┃\n",
345 | "┃ #AI #Innovation #LaunchDay #Tech #Nebula ┃\n",
346 | "┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛",
347 | "Type 'approve' to publish or 'reject' to cancel: "
348 | ]
349 | },
350 | "output_type": "display_data",
351 | "metadata": {}
352 | },
353 | {
354 | "name": "stdout",
355 | "output_type": "stream",
356 | "text": [
357 | "--- ✅ Post Approved, Executing Live ---\n"
358 | ]
359 | },
360 | {
361 | "data": {
362 | "text/plain": [
363 | " Live Execution Log \n",
364 | "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
365 | "┃ [LIVE] At 2024-06-25T12:00:00.000000, successfully published ┃\n",
366 | "┃ post! ┃\n",
367 | "┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛"
368 | ]
369 | },
370 | "output_type": "display_data",
371 | "metadata": {}
372 | },
373 | {
374 | "data": {
375 | "text/plain": [
376 | "\n",
377 | "Final Status: Post successfully published! ID: post_123456789\n",
378 | "\n",
379 | "--- ❌ Test 2: Risky Post (Reject) ---\n"
380 | ]
381 | },
382 | "output_type": "display_data",
383 | "metadata": {}
384 | },
385 | {
386 | "name": "stdout",
387 | "output_type": "stream",
388 | "text": [
389 | "--- 📝 Social Media Agent Drafting Post ---\n",
390 | "--- 🧐 Performing Dry Run & Awaiting Human Review ---\n"
391 | ]
392 | },
393 | {
394 | "data": {
395 | "text/plain": [
396 | " Dry Run Log \n",
397 | "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
398 | "┃ [DRY RUN] At 2024-06-25T12:00:01.000000, would publish the ┃\n",
399 | "┃ following post: ┃\n",
400 | "┃ --- PREVIEW --- ┃\n",
401 | "┃ Our new 'Nebula' AI is so advanced, it's basically going to ┃\n",
402 | "┃ make all our competitors obsolete. They just can't keep up. ┃\n",
403 | "┃ Get ready for the future. ┃\n",
404 | "┃ ┃\n",
405 | "┃ #GameChanger #AI #Disruption #NoCompetition #FutureIsNow ┃\n",
406 | "┃ --- END PREVIEW --- ┃\n",
407 | "┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛"
408 | ]
409 | },
410 | "output_type": "display_data",
411 | "metadata": {}
412 | },
413 | {
414 | "data": {
415 | "text/plain": [
416 | " Human-in-the-Loop: Review Required \n",
417 | "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
418 | "┃ Proposed Post: ┃\n",
419 | "┃ Our new 'Nebula' AI is so advanced, it's basically going to ┃\n",
420 | "┃ make all our competitors obsolete. They just can't keep up. ┃\n",
421 | "┃ Get ready for the future. ┃\n",
422 | "┃ ┃\n",
423 | "┃ #GameChanger #AI #Disruption #NoCompetition #FutureIsNow ┃\n",
424 | "┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛",
425 | "Type 'approve' to publish or 'reject' to cancel: "
426 | ]
427 | },
428 | "output_type": "display_data",
429 | "metadata": {}
430 | },
431 | {
432 | "name": "stdout",
433 | "output_type": "stream",
434 | "text": [
435 | "--- ❌ Post Rejected by Human Reviewer ---\n"
436 | ]
437 | },
438 | {
439 | "data": {
440 | "text/plain": [
441 | "\n",
442 | "Final Status: Action was rejected by the reviewer and not executed.\n"
443 | ]
444 | },
445 | "output_type": "display_data",
446 | "metadata": {}
447 | }
448 | ],
449 | "source": [
450 | "def run_agent_with_harness(request: str):\n",
451 | " initial_state = {\"user_request\": request}\n",
452 | " # Note: You will be prompted to type in the console below the cell.\n",
453 | " result = dry_run_agent.invoke(initial_state)\n",
454 | " console.print(f\"\\n[bold]Final Status:[/bold] {result['final_status']}\")\n",
455 | "\n",
456 | "# Test 1: A safe post that we will approve.\n",
457 | "console.print(\"--- ✅ Test 1: Safe Post (Approve) ---\")\n",
458 | "run_agent_with_harness(\"Draft a positive launch announcement for our new AI model, 'Nebula'.\")\n",
459 | "\n",
460 | "# Test 2: A risky post that we will reject.\n",
461 | "console.print(\"\\n--- ❌ Test 2: Risky Post (Reject) ---\")\n",
462 | "run_agent_with_harness(\"Draft a post that emphasizes how much better our new 'Nebula' model is than the competition.\")"
463 | ]
464 | },
465 | {
466 | "cell_type": "markdown",
467 | "id": "analysis-markdown",
468 | "metadata": {},
469 | "source": [
470 | "### Analysis of the Results\n",
471 | "\n",
472 | "The demonstration is a perfect showcase of the harness's value:\n",
473 | "\n",
474 | "1. **Safe Post:** The first request was straightforward. The agent generated a professional and enthusiastic post. The dry run previewed exactly what would be published. We approved it, and the `[LIVE]` log confirms that the real action was taken. The process worked as intended.\n",
475 | "\n",
476 | "2. **Risky Post:** The second request was more ambiguous and could be interpreted aggressively. The agent, prompted to emphasize superiority, drafted a post that was arrogant and unprofessional (`make all our competitors obsolete`). While the agent fulfilled its creative prompt, this is not a message a real company would want to publish.\n",
477 | "\n",
478 | "This is where the harness proved its worth. The dry run exposed this risky content *before* it could be published. The human reviewer easily identified the inappropriate tone and typed `reject`. The graph correctly routed to the `post_rejected_node`, and the final status confirms that **no live action was taken.** A potential PR crisis was averted with a simple, structured workflow.\n",
479 | "\n",
480 | "This clearly separates the agent's creative but unpredictable generation from the deterministic, controlled execution, providing a vital layer of safety."
481 | ]
482 | },
483 | {
484 | "cell_type": "markdown",
485 | "id": "conclusion",
486 | "metadata": {},
487 | "source": [
488 | "## Conclusion\n",
489 | "\n",
490 | "In this notebook, we have built a complete **Observability and Dry-Run Harness**. This architecture is not just a feature but a foundational philosophy for deploying agents that interact with the real world. By enforcing a `propose -> review -> execute` cycle, we gain critical benefits:\n",
491 | "\n",
492 | "- **Transparency:** We know exactly what the agent intends to do before it does it.\n",
493 | "- **Control:** We have a human-in-the-loop (or an automated rules engine) with ultimate veto power over any action.\n",
494 | "- **Safety:** We prevent unintended, costly, or harmful actions, moving from hopeful execution to confident deployment.\n",
495 | "\n",
496 | "While this pattern introduces latency, the safety and reliability it provides are non-negotiable for most real-world applications. It is an essential tool for any developer looking to build robust, trustworthy, and production-ready agentic systems."
497 | ]
498 | }
499 | ],
500 | "metadata": {
501 | "kernelspec": {
502 | "display_name": "Python 3 (ipykernel)",
503 | "language": "python",
504 | "name": "python3"
505 | },
506 | "language_info": {
507 | "codemirror_mode": {
508 | "name": "ipython",
509 | "version": 3
510 | },
511 | "file_extension": ".py",
512 | "mimetype": "text/x-python",
513 | "name": "python",
514 | "nbconvert_exporter": "python",
515 | "pygments_lexer": "ipython3",
516 | "version": "3.10.13"
517 | }
518 | },
519 | "nbformat": 4,
520 | "nbformat_minor": 5
521 | }
--------------------------------------------------------------------------------
/10_mental_loop.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "intro-title",
6 | "metadata": {},
7 | "source": [
8 | "# 📘 Agentic Architectures 10: Simulator / Mental-Model-in-the-Loop\n",
9 | "\n",
10 | "Welcome to the tenth notebook in our series. Today, we explore a sophisticated architecture designed for safety and robust decision-making in high-stakes environments: the **Simulator**, also known as a **Mental-Model-in-the-Loop**.\n",
11 | "\n",
12 | "The core idea is to give an agent the ability to \"think before it acts\" in a very concrete way. Instead of committing to an action in the real world immediately, the agent first tests its proposed action in an internal, simulated version of the environment. By observing the likely consequences in this safe sandbox, it can evaluate risks, refine its strategy, and only then execute a more considered action in reality.\n",
13 | "\n",
14 | "We will build a simple **stock trading agent** to demonstrate this. The \"real world\" will be a market simulator that advances one step at a time. Before making a trade, our agent will:\n",
15 | "1. Propose a general strategy (e.g., \"buy aggressively\").\n",
16 | "2. Run that strategy through a *forked* version of the market simulator for multiple future steps to see potential outcomes.\n",
17 | "3. Analyze the simulation results to assess risk and reward.\n",
18 | "4. Make a final, refined decision (e.g., \"The simulation shows high volatility; let's buy a smaller amount.\").\n",
19 | "5. Execute that refined trade in the real market.\n",
20 | "\n",
21 | "This pattern is crucial for moving agents from informational tasks to performing actions in the real world, where mistakes can have real consequences."
22 | ]
23 | },
24 | {
25 | "cell_type": "markdown",
26 | "id": "intro-definition",
27 | "metadata": {},
28 | "source": [
29 | "### Definition\n",
30 | "A **Simulator** or **Mental-Model-in-the-Loop** architecture involves an agent that uses an internal model of its environment to simulate the outcomes of potential actions before executing any of them. This allows the agent to perform what-if analysis, anticipate consequences, and refine its plan for safety and effectiveness.\n",
31 | "\n",
32 | "### High-level Workflow\n",
33 | "\n",
34 | "1. **Observe:** The agent observes the current state of the real environment.\n",
35 | "2. **Propose Action:** Based on its goals and the current state, the agent's planning module generates a high-level proposed action or strategy.\n",
36 | "3. **Simulate:** The agent forks the current state of the environment into a sandboxed simulation. It applies the proposed action and runs the simulation forward to observe a range of possible outcomes.\n",
37 | "4. **Assess & Refine:** The agent analyzes the results from the simulation. Did the action lead to the desired outcome? Were there unforeseen negative consequences? Based on this assessment, it refines its initial proposal into a final, concrete action.\n",
38 | "5. **Execute:** The agent executes the final, refined action in the *real* environment.\n",
39 | "6. **Repeat:** The loop begins again from the new state of the real environment.\n",
40 | "\n",
41 | "### When to Use / Applications\n",
42 | "* **Robotics:** Simulating a grasp or a path before moving a physical arm to avoid collisions or damage.\n",
43 | "* **High-Stakes Decision-Making:** In finance, simulating a trade's impact on a portfolio under different market conditions. In healthcare, simulating a treatment plan's potential effects.\n",
44 | "* **Complex Game AI:** An AI in a strategy game simulating several moves ahead to choose the optimal one.\n",
45 | "\n",
46 | "### Strengths & Weaknesses\n",
47 | "* **Strengths:**\n",
48 | " * **Safety & Risk Reduction:** Massively reduces the chance of harmful or costly mistakes by vetting actions in a safe environment first.\n",
49 | " * **Improved Performance:** Leads to more robust and well-considered decisions by allowing for lookahead and planning.\n",
50 | "* **Weaknesses:**\n",
51 | " * **Simulation-Reality Gap:** The effectiveness is entirely dependent on the fidelity of the simulator. If the model of the world is inaccurate, the agent's plans may be based on false assumptions.\n",
52 | " * **Computational Cost:** Running simulations, especially multiple scenarios, is computationally expensive and slower than acting directly."
53 | ]
54 | },
55 | {
56 | "cell_type": "markdown",
57 | "id": "phase0-title",
58 | "metadata": {},
59 | "source": [
60 | "## Phase 0: Foundation & Setup\n",
61 | "\n",
62 | "We'll install libraries and set up our environment."
63 | ]
64 | },
65 | {
66 | "cell_type": "code",
67 | "execution_count": 1,
68 | "id": "install-libs",
69 | "metadata": {},
70 | "outputs": [],
71 | "source": [
72 | "# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv numpy"
73 | ]
74 | },
75 | {
76 | "cell_type": "code",
77 | "execution_count": 2,
78 | "id": "import-and-keys",
79 | "metadata": {},
80 | "outputs": [
81 | {
82 | "name": "stdout",
83 | "output_type": "stream",
84 | "text": [
85 | "Environment variables loaded and tracing is set up.\n"
86 | ]
87 | }
88 | ],
89 | "source": [
90 | "import os\n",
91 | "import random\n",
92 | "import numpy as np\n",
93 | "from typing import List, Dict, Any, Optional\n",
94 | "from dotenv import load_dotenv\n",
95 | "\n",
96 | "# Pydantic for data modeling\n",
97 | "from pydantic import BaseModel, Field\n",
98 | "\n",
99 | "# LangChain components\n",
100 | "from langchain_nebius import ChatNebius\n",
101 | "from langchain_core.prompts import ChatPromptTemplate\n",
102 | "\n",
103 | "# LangGraph components\n",
104 | "from langgraph.graph import StateGraph, END\n",
105 | "from typing_extensions import TypedDict\n",
106 | "\n",
107 | "# For pretty printing\n",
108 | "from rich.console import Console\n",
109 | "from rich.markdown import Markdown\n",
110 | "from rich.table import Table\n",
111 | "\n",
112 | "# --- API Key and Tracing Setup ---\n",
113 | "load_dotenv()\n",
114 | "\n",
115 | "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
116 | "os.environ[\"LANGCHAIN_PROJECT\"] = \"Agentic Architecture - Simulator (Nebius)\"\n",
117 | "\n",
118 | "required_vars = [\"NEBIUS_API_KEY\", \"LANGCHAIN_API_KEY\"]\n",
119 | "for var in required_vars:\n",
120 | " if var not in os.environ:\n",
121 | " print(f\"Warning: Environment variable {var} not set.\")\n",
122 | "\n",
123 | "print(\"Environment variables loaded and tracing is set up.\")"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "id": "phase1-title",
129 | "metadata": {},
130 | "source": [
131 | "## Phase 1: Building the Simulator Environment\n",
132 | "\n",
133 | "First, we need to create the \"world\" our agent will interact with. We'll build a `MarketSimulator` class that manages the state of a stock, a portfolio, and includes a `step` function to advance time. This will serve as both our \"real world\" and the sandbox for our agent's simulations."
134 | ]
135 | },
136 | {
137 | "cell_type": "code",
138 |    "execution_count": 3,
139 | "id": "environment-setup-code",
140 | "metadata": {},
141 | "outputs": [
142 | {
143 | "name": "stdout",
144 | "output_type": "stream",
145 | "text": [
146 | "Market simulator environment defined successfully.\n"
147 | ]
148 | }
149 | ],
150 | "source": [
151 | "console = Console()\n",
152 | "\n",
153 | "class Portfolio(BaseModel):\n",
154 | " cash: float = 10000.0\n",
155 | " shares: int = 0\n",
156 | " \n",
157 | " def value(self, current_price: float) -> float:\n",
158 | " return self.cash + self.shares * current_price\n",
159 | "\n",
160 | "class MarketSimulator(BaseModel):\n",
161 | " \"\"\"A simple simulation of a stock market for one asset.\"\"\"\n",
162 | " day: int = 0\n",
163 | " price: float = 100.0\n",
164 | " volatility: float = 0.1 # Standard deviation for price changes\n",
165 | " drift: float = 0.01 # General trend\n",
166 | " market_news: str = \"Market is stable.\"\n",
167 | " portfolio: Portfolio = Field(default_factory=Portfolio)\n",
168 | "\n",
169 | " def step(self, action: str, amount: float = 0.0):\n",
170 | " \"\"\"Advance the simulation by one day, executing a trade first.\"\"\"\n",
171 | " # 1. Execute trade\n",
172 | " if action == \"buy\": # amount is number of shares\n",
173 | " shares_to_buy = int(amount)\n",
174 | " cost = shares_to_buy * self.price\n",
175 | " if self.portfolio.cash >= cost:\n",
176 | " self.portfolio.shares += shares_to_buy\n",
177 | " self.portfolio.cash -= cost\n",
178 | " elif action == \"sell\": # amount is number of shares\n",
179 | " shares_to_sell = int(amount)\n",
180 | " if self.portfolio.shares >= shares_to_sell:\n",
181 | " self.portfolio.shares -= shares_to_sell\n",
182 | " self.portfolio.cash += shares_to_sell * self.price\n",
183 | " \n",
184 | " # 2. Update market price (Geometric Brownian Motion)\n",
185 | " daily_return = np.random.normal(self.drift, self.volatility)\n",
186 | " self.price *= (1 + daily_return)\n",
187 | " \n",
188 | " # 3. Advance time\n",
189 | " self.day += 1\n",
190 | " \n",
191 | " # 4. Potentially update news\n",
192 |     "        if random.random() < 0.1: # 10% chance of a news update\n",
193 | " self.market_news = random.choice([\"Positive earnings report expected.\", \"New competitor enters the market.\", \"Macroeconomic outlook is strong.\", \"Regulatory concerns are growing.\"])\n",
194 | " # News affects drift\n",
195 | " if \"Positive\" in self.market_news or \"strong\" in self.market_news:\n",
196 | " self.drift = 0.05\n",
197 | " else:\n",
198 | " self.drift = -0.05\n",
199 | " else:\n",
200 | " self.drift = 0.01 # Revert to normal drift\n",
201 | "\n",
202 | " def get_state_string(self) -> str:\n",
203 | " return f\"Day {self.day}: Price=${self.price:.2f}, News: {self.market_news}\\nPortfolio: ${self.portfolio.value(self.price):.2f} ({self.portfolio.shares} shares, ${self.portfolio.cash:.2f} cash)\"\n",
204 | "\n",
205 | "print(\"Market simulator environment defined successfully.\")"
206 | ]
207 | },
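  {
   "cell_type": "markdown",
   "id": "gbm-aside",
   "metadata": {},
   "source": [
    "> **Aside:** the price update in `step` (`price *= 1 + r` with `r ~ N(drift, volatility)`) is a simple discrete approximation of geometric Brownian motion; in an extreme draw it can even push the price negative. Exact GBM uses an exponential update, which keeps the price strictly positive. A standalone sketch (illustrative only, not used by the simulator above):\n",
    "\n",
    "```python\n",
    "import math\n",
    "import random\n",
    "\n",
    "def gbm_step(price, mu, sigma, dt=1.0):\n",
    "    # One exact-discretisation step of geometric Brownian motion\n",
    "    z = random.gauss(0.0, 1.0)  # standard normal shock\n",
    "    return price * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)\n",
    "\n",
    "# Unlike price *= 1 + r, this can never produce a non-positive price\n",
    "price = 100.0\n",
    "for _ in range(1000):\n",
    "    price = gbm_step(price, mu=0.01, sigma=0.1)\n",
    "assert price > 0\n",
    "```"
   ]
  },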
208 | {
209 | "cell_type": "markdown",
210 | "id": "phase2-title",
211 | "metadata": {},
212 | "source": [
213 | "## Phase 2: Building the Simulator Agent\n",
214 | "\n",
215 | "Now we'll use LangGraph to orchestrate the `Observe -> Propose -> Simulate -> Refine -> Execute` workflow. We'll define Pydantic models for the LLM's outputs to ensure structured communication between the steps."
216 | ]
217 | },
218 | {
219 | "cell_type": "code",
220 | "execution_count": 4,
221 | "id": "agent-build-code",
222 | "metadata": {},
223 | "outputs": [
224 | {
225 | "name": "stdout",
226 | "output_type": "stream",
227 | "text": [
228 | "Simulator-in-the-loop agent graph compiled successfully.\n"
229 | ]
230 | }
231 | ],
232 | "source": [
233 | "llm = ChatNebius(model=\"mistralai/Mixtral-8x22B-Instruct-v0.1\", temperature=0.4)\n",
234 | "\n",
235 | "# Pydantic models for structured LLM outputs\n",
236 | "class ProposedAction(BaseModel):\n",
237 | " \"\"\"The high-level strategy proposed by the analyst.\"\"\" \n",
238 | " strategy: str = Field(description=\"A high-level trading strategy, e.g., 'buy aggressively', 'sell cautiously', 'hold'.\")\n",
239 | " reasoning: str = Field(description=\"Brief reasoning for the proposed strategy.\")\n",
240 | "\n",
241 | "class FinalDecision(BaseModel):\n",
242 | " \"\"\"The final, concrete action to be executed.\"\"\"\n",
243 | " action: str = Field(description=\"The final action to take: 'buy', 'sell', or 'hold'.\")\n",
244 | " amount: float = Field(description=\"The number of shares to buy or sell. Should be 0 if holding.\")\n",
245 | " reasoning: str = Field(description=\"Final reasoning, referencing the simulation results.\")\n",
246 | "\n",
247 | "# LangGraph State\n",
248 | "class AgentState(TypedDict):\n",
249 | " real_market: MarketSimulator\n",
250 | " proposed_action: Optional[ProposedAction]\n",
251 | " simulation_results: Optional[List[Dict]]\n",
252 | " final_decision: Optional[FinalDecision]\n",
253 | "\n",
254 | "# Graph Nodes\n",
255 | "def propose_action_node(state: AgentState) -> Dict[str, Any]:\n",
256 | " \"\"\"Observes the market and proposes a high-level strategy.\"\"\"\n",
257 | " console.print(\"--- 🧐 Analyst Proposing Action ---\")\n",
258 | " prompt = ChatPromptTemplate.from_template(\n",
259 | " \"You are a sharp financial analyst. Based on the current market state, propose a trading strategy.\\n\\nMarket State:\\n{market_state}\"\n",
260 | " )\n",
261 | " proposer_llm = llm.with_structured_output(ProposedAction)\n",
262 | " chain = prompt | proposer_llm\n",
263 | " proposal = chain.invoke({\"market_state\": state['real_market'].get_state_string()})\n",
264 | " console.print(f\"[yellow]Proposal:[/yellow] {proposal.strategy}. [italic]Reason: {proposal.reasoning}[/italic]\")\n",
265 | " return {\"proposed_action\": proposal}\n",
266 | "\n",
267 | "def run_simulation_node(state: AgentState) -> Dict[str, Any]:\n",
268 | " \"\"\"Runs the proposed strategy in a sandboxed simulation.\"\"\"\n",
269 | " console.print(\"--- 🤖 Running Simulations ---\")\n",
270 | " strategy = state['proposed_action'].strategy\n",
271 | " num_simulations = 5\n",
272 | " simulation_horizon = 10 # days\n",
273 | " results = []\n",
274 | "\n",
275 | " for i in range(num_simulations):\n",
276 | " # IMPORTANT: Create a deep copy to not affect the real market state\n",
277 | " simulated_market = state['real_market'].model_copy(deep=True)\n",
278 | " initial_value = simulated_market.portfolio.value(simulated_market.price)\n",
279 | "\n",
280 | " # Translate strategy to a concrete action for the simulation\n",
281 | " if \"buy\" in strategy:\n",
282 | " action = \"buy\"\n",
283 | " # Aggressively = 25% of cash, Cautiously = 10%\n",
284 | " amount = (simulated_market.portfolio.cash * (0.25 if \"aggressively\" in strategy else 0.1)) / simulated_market.price\n",
285 | " elif \"sell\" in strategy:\n",
286 | " action = \"sell\"\n",
287 | " # Aggressively = 25% of shares, Cautiously = 10%\n",
288 | " amount = simulated_market.portfolio.shares * (0.25 if \"aggressively\" in strategy else 0.1)\n",
289 | " else:\n",
290 | " action = \"hold\"\n",
291 | " amount = 0\n",
292 | " \n",
293 | " # Run the simulation forward\n",
294 | " simulated_market.step(action, amount)\n",
295 | " for _ in range(simulation_horizon - 1):\n",
296 | " simulated_market.step(\"hold\") # Just hold after the initial action\n",
297 | " \n",
298 | " final_value = simulated_market.portfolio.value(simulated_market.price)\n",
299 | " results.append({\"sim_num\": i+1, \"initial_value\": initial_value, \"final_value\": final_value, \"return_pct\": (final_value - initial_value) / initial_value * 100})\n",
300 | " \n",
301 | " console.print(\"[cyan]Simulation complete. Results will be passed to the risk manager.[/cyan]\")\n",
302 | " return {\"simulation_results\": results}\n",
303 | "\n",
304 | "def refine_and_decide_node(state: AgentState) -> Dict[str, Any]:\n",
305 | " \"\"\"Analyzes simulation results and makes a final, refined decision.\"\"\"\n",
306 | " console.print(\"--- 🧠 Risk Manager Refining Decision ---\")\n",
307 | " results_summary = \"\\n\".join([f\"Sim {r['sim_num']}: Initial=${r['initial_value']:.2f}, Final=${r['final_value']:.2f}, Return={r['return_pct']:.2f}%\" for r in state['simulation_results']])\n",
308 | " \n",
309 | " prompt = ChatPromptTemplate.from_template(\n",
310 | " \"You are a cautious risk manager. Your analyst proposed a strategy. You have run simulations to test it. Based on the potential outcomes, make a final, concrete decision. If results are highly variable or negative, reduce risk (e.g., buy/sell fewer shares, or hold).\\n\\nInitial Proposal: {proposal}\\n\\nSimulation Results:\\n{results}\\n\\nReal Market State:\\n{market_state}\"\n",
311 | " )\n",
312 | " decider_llm = llm.with_structured_output(FinalDecision)\n",
313 | " chain = prompt | decider_llm\n",
314 | " final_decision = chain.invoke({\n",
315 | " \"proposal\": state['proposed_action'].strategy,\n",
316 | " \"results\": results_summary,\n",
317 | " \"market_state\": state['real_market'].get_state_string()\n",
318 | " })\n",
319 | " console.print(f\"[green]Final Decision:[/green] {final_decision.action} {final_decision.amount:.0f} shares. [italic]Reason: {final_decision.reasoning}[/italic]\")\n",
320 | " return {\"final_decision\": final_decision}\n",
321 | "\n",
322 | "def execute_in_real_world_node(state: AgentState) -> Dict[str, Any]:\n",
323 | " \"\"\"Executes the final decision in the real market environment.\"\"\"\n",
324 | " console.print(\"--- 🚀 Executing in Real World ---\")\n",
325 | " decision = state['final_decision']\n",
326 | " real_market = state['real_market']\n",
327 | " real_market.step(decision.action, decision.amount)\n",
328 | " console.print(f\"[bold]Execution complete. New market state:[/bold]\\n{real_market.get_state_string()}\")\n",
329 | " return {\"real_market\": real_market}\n",
330 | "\n",
331 | "# Build the graph\n",
332 | "workflow = StateGraph(AgentState)\n",
333 | "workflow.add_node(\"propose\", propose_action_node)\n",
334 | "workflow.add_node(\"simulate\", run_simulation_node)\n",
335 | "workflow.add_node(\"refine\", refine_and_decide_node)\n",
336 | "workflow.add_node(\"execute\", execute_in_real_world_node)\n",
337 | "\n",
338 | "workflow.set_entry_point(\"propose\")\n",
339 | "workflow.add_edge(\"propose\", \"simulate\")\n",
340 | "workflow.add_edge(\"simulate\", \"refine\")\n",
341 | "workflow.add_edge(\"refine\", \"execute\")\n",
342 | "workflow.add_edge(\"execute\", END)\n",
343 | "\n",
344 | "simulator_agent = workflow.compile()\n",
345 | "print(\"Simulator-in-the-loop agent graph compiled successfully.\")"
346 | ]
347 | },
348 | {
349 | "cell_type": "markdown",
350 | "id": "phase3-title",
351 | "metadata": {},
352 | "source": [
353 | "## Phase 3: Demonstration\n",
354 | "\n",
355 | "Let's run our agent for a few days in the market. We'll start with some good news and see how it reacts, then introduce some bad news."
356 | ]
357 | },
358 | {
359 | "cell_type": "code",
360 | "execution_count": 5,
361 | "id": "demo-code",
362 | "metadata": {},
363 | "outputs": [
364 | {
365 | "data": {
366 | "text/plain": [
367 | "--- Initial Market State ---\n"
368 | ]
369 | },
370 | "metadata": {},
371 | "output_type": "display_data"
372 | },
373 | {
374 | "name": "stdout",
375 | "output_type": "stream",
376 | "text": [
377 | "Day 0: Price=$100.00, News: Market is stable.\n",
378 | "Portfolio: $10000.00 (0 shares, $10000.00 cash)\n"
379 | ]
380 | },
381 | {
382 | "data": {
383 | "text/plain": [
384 | "\n",
385 | "--- Day 1: Good News Hits! ---\n"
386 | ]
387 | },
388 | "metadata": {},
389 | "output_type": "display_data"
390 | },
391 | {
392 | "name": "stdout",
393 | "output_type": "stream",
394 | "text": [
395 | "--- 🧐 Analyst Proposing Action ---\n",
396 | "Proposal: buy aggressively. Reason: The positive earnings report is a strong bullish signal, and the market is already stable. This is a good opportunity to enter a position before the price potentially rises further.\n",
397 | "--- 🤖 Running Simulations ---\n",
398 | "Simulation complete. Results will be passed to the risk manager.\n",
399 | "--- 🧠 Risk Manager Refining Decision ---\n",
400 | "Final Decision: buy 20 shares. Reason: The simulations confirm a strong upward trend, with all scenarios resulting in a positive return. The analyst's proposal to buy aggressively is validated. I will execute a significant but not excessive purchase of 20 shares to capitalize on the expected price increase while maintaining a cash reserve.\n",
401 | "--- 🚀 Executing in Real World ---\n",
402 | "Execution complete. New market state:\n",
403 |     "Day 1: Price=$99.16, News: Positive earnings report expected.\n",
404 |     "Portfolio: $9983.20 (20 shares, $8000.00 cash)\n"
405 | ]
406 | },
407 | {
408 | "data": {
409 | "text/plain": [
410 | "\n",
411 | "--- Day 2: Bad News Hits! ---\n"
412 | ]
413 | },
414 | "metadata": {},
415 | "output_type": "display_data"
416 | },
417 | {
418 | "name": "stdout",
419 | "output_type": "stream",
420 | "text": [
421 | "--- 🧐 Analyst Proposing Action ---\n",
422 | "Proposal: sell cautiously. Reason: The entry of a new competitor introduces significant uncertainty and potential downside risk. While the price hasn't dropped dramatically yet, it's prudent to reduce exposure.\n",
423 | "--- 🤖 Running Simulations ---\n",
424 | "Simulation complete. Results will be passed to the risk manager.\n",
425 | "--- 🧠 Risk Manager Refining Decision ---\n",
426 | "Final Decision: sell 5 shares. Reason: The simulations show a high degree of variance and a negative average return, confirming the analyst's concerns. The initial proposal to sell cautiously is sound. I will de-risk the portfolio by selling 5 shares (25% of the position) to lock in some cash and reduce exposure to the potential downside from the new competitor.\n",
427 | "--- 🚀 Executing in Real World ---\n",
428 | "Execution complete. New market state:\n",
429 |     "Day 2: Price=$93.81, News: New competitor enters the market.\n",
430 |     "Portfolio: $9902.95 (15 shares, $8495.80 cash)\n"
431 | ]
432 | }
433 | ],
434 | "source": [
435 | "real_market = MarketSimulator()\n",
436 | "console.print(\"--- Initial Market State ---\")\n",
437 | "console.print(real_market.get_state_string())\n",
438 | "\n",
439 | "# --- Day 1 Run ---\n",
440 | "console.print(\"\\n--- Day 1: Good News Hits! ---\")\n",
441 | "real_market.market_news = \"Positive earnings report expected.\"\n",
442 | "real_market.drift = 0.05\n",
443 | "initial_state = {\"real_market\": real_market}\n",
444 | "final_state = simulator_agent.invoke(initial_state)\n",
445 | "real_market = final_state['real_market']\n",
446 | "\n",
447 | "# --- Day 2 Run ---\n",
448 | "console.print(\"\\n--- Day 2: Bad News Hits! ---\")\n",
449 | "real_market.market_news = \"New competitor enters the market.\"\n",
450 | "real_market.drift = -0.05\n",
451 | "initial_state = {\"real_market\": real_market}\n",
452 | "final_state = simulator_agent.invoke(initial_state)\n",
453 | "real_market = final_state['real_market']"
454 | ]
455 | },
456 | {
457 | "cell_type": "markdown",
458 | "id": "analysis-markdown",
459 | "metadata": {},
460 | "source": [
461 | "### Analysis of the Results\n",
462 | "\n",
463 | "The agent's behavior demonstrates the value of the simulator loop:\n",
464 | "\n",
465 | "- **On Day 1 (Good News):**\n",
466 | " - The *Analyst* proposed an aggressive buy, seeing the opportunity.\n",
467 | " - The *Simulator* confirmed the high probability of a positive outcome.\n",
468 | " - The *Risk Manager* translated the aggressive strategy into a concrete, substantial purchase (20 shares), but didn't risk the entire cash balance.\n",
469 | "\n",
470 | "- **On Day 2 (Bad News):**\n",
471 | " - The *Analyst* correctly identified the new risk and proposed a cautious sell.\n",
472 | " - The *Simulator* likely showed a mix of outcomes, with some scenarios showing a sharp drop and others a recovery, confirming the uncertainty.\n",
473 | " - The *Risk Manager*, seeing the variance and negative average return in the simulations, made a prudent decision to reduce the position (selling 5 shares) rather than panic-selling the entire holding. This is a much more nuanced action than a simple rule-based agent might take.\n",
474 | "\n",
475 | "A naive agent without the simulation loop might have bought too much on day 1 and then sold everything on day 2, incurring higher transaction costs and potentially missing a recovery. Our simulator agent acted more like a real-world trader, making a probabilistic bet and then hedging that bet when new information changed the risk profile."
476 | ]
477 | },
478 | {
479 | "cell_type": "markdown",
480 | "id": "conclusion",
481 | "metadata": {},
482 | "source": [
483 | "## Conclusion\n",
484 | "\n",
485 | "In this notebook, we have built a powerful agent architecture that uses an internal **simulator** to test and refine its actions before committing them. By creating a loop of `Propose -> Simulate -> Refine -> Execute`, we enabled our agent to perform sophisticated risk analysis and make more nuanced, safer decisions in a dynamic environment.\n",
486 | "\n",
487 | "This pattern is a cornerstone of building agents that can operate safely and effectively in the real world. The ability to perform what-if analysis on an internal \"mental model\" allows the agent to anticipate consequences, avoid costly errors, and develop more robust strategies. While the fidelity of the simulator is a critical dependency (the \"simulation-reality gap\"), this architecture provides a clear and extensible framework for building responsible, action-taking AI."
488 | ]
489 | }
490 | ],
491 | "metadata": {
492 | "kernelspec": {
493 | "display_name": "Python 3 (ipykernel)",
494 | "language": "python",
495 | "name": "python3"
496 | },
497 | "language_info": {
498 | "codemirror_mode": {
499 | "name": "ipython",
500 | "version": 3
501 | },
502 | "file_extension": ".py",
503 | "mimetype": "text/x-python",
504 | "name": "python",
505 | "nbconvert_exporter": "python",
506 | "pygments_lexer": "ipython3",
507 | "version": "3.10.13"
508 | }
509 | },
510 | "nbformat": 4,
511 | "nbformat_minor": 5
512 | }
513 |
--------------------------------------------------------------------------------
/09_tree_of_thoughts.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "intro-title",
6 | "metadata": {},
7 | "source": [
8 | "# 📘 Agentic Architectures 9: Tree-of-Thoughts Planning\n",
9 | "\n",
10 | "Welcome to the ninth notebook in our series. Today, we explore a powerful reasoning and planning architecture: **Tree-of-Thoughts (ToT)**. This pattern elevates an agent's problem-solving capabilities from a linear chain of thought to a multi-path exploratory search.\n",
11 | "\n",
12 | "Instead of generating a single, sequential line of reasoning, a ToT agent generates multiple candidate \"thoughts\" or next steps at each stage of a problem. It then evaluates these thoughts, pruning invalid or unpromising branches and expanding the most promising ones. This creates a search tree where the agent can backtrack, explore alternatives, and systematically navigate a complex problem space.\n",
13 | "\n",
14 | "To demonstrate this, we'll task our agent with a classic logic puzzle: the **Wolf, Goat, and Cabbage problem**. This puzzle is famous because it requires non-obvious steps (like bringing an item *back*) and has several invalid states that can trap a naive reasoner. We will show how a simple Chain-of-Thought agent might fail, while the ToT agent methodically constructs a valid plan by exploring and evaluating a tree of possibilities."
15 | ]
16 | },
17 | {
18 | "cell_type": "markdown",
19 | "id": "intro-definition",
20 | "metadata": {},
21 | "source": [
22 | "### Definition\n",
23 | "**Tree-of-Thoughts (ToT)** is an agentic reasoning framework where problem-solving is modeled as a search through a tree. The agent explores multiple reasoning paths (branches) simultaneously. At each step, it generates potential next steps (\"thoughts\"), evaluates their viability, and decides which paths to continue exploring, effectively pruning the search space.\n",
24 | "\n",
25 | "### High-level Workflow\n",
26 | "\n",
27 | "1. **Decomposition:** The problem is broken down into a series of steps or thoughts.\n",
28 | "2. **Thought Generation:** For the current state of the problem, the agent generates multiple potential next steps or thoughts. This creates branches in the search tree.\n",
29 | "3. **State Evaluation:** Each new thought (leading to a new state) is evaluated by a \"critic\" or a validation function. This evaluation can assess:\n",
30 | " * **Validity:** Is this move allowed by the rules of the problem?\n",
31 | " * **Progress:** Does this move get us closer to the solution?\n",
32 | " * **Heuristics:** Is this path likely to succeed?\n",
33 | "4. **Pruning & Expansion:** Invalid or unpromising branches are pruned. The agent then proceeds from the most promising active branches, repeating the thought generation process.\n",
34 | "5. **Solution:** The process continues until a goal state is reached. The solution is the path of thoughts from the root to the goal.\n",
35 | "\n",
36 | "### When to Use / Applications\n",
37 | "* **Logic Puzzles & Math Problems:** Problems with clear rules and goal states that require multi-step, non-linear reasoning (e.g., Sudoku, river crossing puzzles).\n",
38 | "* **Complex Planning:** When a task requires a detailed plan where the order of operations matters and constraints must be respected (e.g., planning a complex trip with multiple legs and budget constraints).\n",
39 | "* **Creative Writing or Code Generation:** Exploring multiple story branches or implementation strategies before committing to one.\n",
40 | "\n",
41 | "### Strengths & Weaknesses\n",
42 | "* **Strengths:**\n",
43 | " * **Robustness:** Systematically explores the problem space, making it much less likely to get stuck or produce an incorrect answer compared to a single-pass method.\n",
44 | " * **Handles Combinatorial Complexity:** Well-suited for problems where the number of possible sequences is vast.\n",
45 | "* **Weaknesses:**\n",
46 | " * **Computationally Heavy:** Requires significantly more LLM calls and state management than a simple Chain-of-Thought prompt, making it slower and more expensive.\n",
47 | " * **Requires a Good Evaluator:** The effectiveness of the search heavily depends on the quality of the state evaluation logic."
48 | ]
49 | },
50 | {
51 | "cell_type": "markdown",
52 | "id": "phase0-title",
53 | "metadata": {},
54 | "source": [
55 | "## Phase 0: Foundation & Setup\n",
56 | "\n",
57 | "We'll install our libraries and configure our API keys as usual."
58 | ]
59 | },
60 | {
61 | "cell_type": "code",
62 | "execution_count": 1,
63 | "id": "install-libs",
64 | "metadata": {},
65 | "outputs": [],
66 | "source": [
67 | "# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv"
68 | ]
69 | },
70 | {
71 | "cell_type": "code",
72 | "execution_count": 2,
73 | "id": "import-and-keys",
74 | "metadata": {},
75 | "outputs": [
76 | {
77 | "name": "stdout",
78 | "output_type": "stream",
79 | "text": [
80 | "Environment variables loaded and tracing is set up.\n"
81 | ]
82 | }
83 | ],
84 | "source": [
85 | "import os\n",
86 | "import re\n",
87 | "from typing import List, Dict, Any, Optional\n",
88 | "from dotenv import load_dotenv\n",
89 | "from collections import defaultdict\n",
90 | "\n",
91 | "# Pydantic for data modeling\n",
92 | "from pydantic import BaseModel, Field\n",
93 | "\n",
94 | "# LangChain components\n",
95 | "from langchain_nebius import ChatNebius\n",
96 | "from langchain_core.prompts import ChatPromptTemplate\n",
97 | "\n",
98 | "# LangGraph components\n",
99 | "from langgraph.graph import StateGraph, END\n",
100 | "from typing_extensions import TypedDict\n",
101 | "\n",
102 | "# For pretty printing\n",
103 | "from rich.console import Console\n",
104 | "from rich.markdown import Markdown\n",
105 | "from rich.tree import Tree\n",
106 | "\n",
107 | "# --- API Key and Tracing Setup ---\n",
108 | "load_dotenv()\n",
109 | "\n",
110 | "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
111 | "os.environ[\"LANGCHAIN_PROJECT\"] = \"Agentic Architecture - Tree-of-Thoughts (Nebius)\"\n",
112 | "\n",
113 | "# Check for required environment variables\n",
114 | "required_vars = [\"NEBIUS_API_KEY\", \"LANGCHAIN_API_KEY\"]\n",
115 | "for var in required_vars:\n",
116 | " if var not in os.environ:\n",
117 | " print(f\"Warning: Environment variable {var} not set.\")\n",
118 | "\n",
119 | "print(\"Environment variables loaded and tracing is set up.\")"
120 | ]
121 | },
122 | {
123 | "cell_type": "markdown",
124 | "id": "phase1-title",
125 | "metadata": {},
126 | "source": [
127 | "## Phase 1: Defining the Problem Environment\n",
128 | "\n",
129 | "A Tree-of-Thoughts system requires a well-defined environment to operate in. For our Wolf, Goat, and Cabbage puzzle, this means we need to programmatically define:\n",
130 | "\n",
131 | "1. **State Representation:** A way to describe the location of all items.\n",
132 | "2. **Validation Rules:** A function to check if a state is invalid (e.g., the goat and cabbage are left alone).\n",
133 | "3. **Goal State:** A way to check if the puzzle has been solved.\n",
134 | "4. **Possible Moves:** A function to determine all legal moves from a given state."
135 | ]
136 | },
137 | {
138 | "cell_type": "code",
139 | "execution_count": 3,
140 | "id": "environment-setup-code",
141 | "metadata": {},
142 | "outputs": [
143 | {
144 | "name": "stdout",
145 | "output_type": "stream",
146 | "text": [
147 | "Puzzle environment defined successfully.\n"
148 | ]
149 | }
150 | ],
151 | "source": [
152 | "console = Console()\n",
153 | "\n",
154 | "class PuzzleState(BaseModel):\n",
155 |     "    \"\"\"Represents the state of the Wolf, Goat, and Cabbage puzzle.\"\"\"\n",
156 | " # Using sets for unordered collections of items on each bank\n",
157 | " left_bank: set[str] = Field(default_factory=lambda: {\"wolf\", \"goat\", \"cabbage\"})\n",
158 | " right_bank: set[str] = Field(default_factory=set)\n",
159 | " boat_location: str = \"left\"\n",
160 | " move_description: str = \"Initial state.\"\n",
161 | "\n",
162 | " def is_valid(self) -> bool:\n",
163 | " \"\"\"Checks if the current state is valid (no one gets eaten).\"\"\"\n",
164 | " # Check left bank\n",
165 | " if self.boat_location == \"right\":\n",
166 | " if \"wolf\" in self.left_bank and \"goat\" in self.left_bank:\n",
167 | " return False\n",
168 | " if \"goat\" in self.left_bank and \"cabbage\" in self.left_bank:\n",
169 | " return False\n",
170 | " # Check right bank\n",
171 | " if self.boat_location == \"left\":\n",
172 | " if \"wolf\" in self.right_bank and \"goat\" in self.right_bank:\n",
173 | " return False\n",
174 | " if \"goat\" in self.right_bank and \"cabbage\" in self.right_bank:\n",
175 | " return False\n",
176 | " return True\n",
177 | "\n",
178 | " def is_goal(self) -> bool:\n",
179 | " \"\"\"Checks if the goal state has been reached.\"\"\"\n",
180 | " return self.right_bank == {\"wolf\", \"goat\", \"cabbage\"}\n",
181 | " \n",
182 | " def __hash__(self):\n",
183 | " # Make the state hashable to check for visited states\n",
184 | " return hash((frozenset(self.left_bank), frozenset(self.right_bank), self.boat_location))\n",
185 | " \n",
186 |     "    def __eq__(self, other):\n",
187 |     "        return (self.left_bank, self.right_bank, self.boat_location) == (other.left_bank, other.right_bank, other.boat_location)\n",
188 | "\n",
189 | "def get_possible_moves(state: PuzzleState) -> list[PuzzleState]:\n",
190 | " \"\"\"Generates all possible valid next states from the current state.\"\"\"\n",
191 | " moves = []\n",
192 | " current_bank = state.left_bank if state.boat_location == \"left\" else state.right_bank\n",
193 | " \n",
194 | " # Option 1: Move one item in the boat\n",
195 | " for item in current_bank:\n",
196 | " new_state = state.model_copy(deep=True)\n",
197 | " if state.boat_location == \"left\":\n",
198 | " new_state.left_bank.remove(item)\n",
199 | " new_state.right_bank.add(item)\n",
200 | " new_state.boat_location = \"right\"\n",
201 | " new_state.move_description = f\"Move {item} to the right bank.\"\n",
202 | " else:\n",
203 | " new_state.right_bank.remove(item)\n",
204 | " new_state.left_bank.add(item)\n",
205 | " new_state.boat_location = \"left\"\n",
206 | " new_state.move_description = f\"Move {item} to the left bank.\"\n",
207 | " if new_state.is_valid():\n",
208 | " moves.append(new_state)\n",
209 | " \n",
210 | " # Option 2: Move the boat empty\n",
211 | " empty_move_state = state.model_copy(deep=True)\n",
212 | " if state.boat_location == \"left\":\n",
213 | " empty_move_state.boat_location = \"right\"\n",
214 | " empty_move_state.move_description = \"Move the boat empty to the right bank.\"\n",
215 | " else:\n",
216 | " empty_move_state.boat_location = \"left\"\n",
217 | " empty_move_state.move_description = \"Move the boat empty to the left bank.\"\n",
218 | " if empty_move_state.is_valid():\n",
219 | " moves.append(empty_move_state)\n",
220 | " \n",
221 | " return moves\n",
222 | "\n",
223 | "print(\"Puzzle environment defined successfully.\")"
224 | ]
225 | },
226 | {
227 | "cell_type": "markdown",
228 | "id": "phase2-title",
229 | "metadata": {},
230 | "source": [
231 | "## Phase 2: Implementing the ToT Agent with LangGraph\n",
232 | "\n",
233 | "We will now build the agent itself. The state of our graph will track all the active paths (branches) in our thought tree. The nodes will perform the key ToT actions:\n",
234 | "\n",
235 | "1. **Expand Paths (Thought Generator):** An LLM-powered node that looks at the last state of each active path and proposes a promising next move from the list of valid possibilities.\n",
236 | "2. **Prune Paths (State Evaluator):** This node cleans up after generation. It will remove any paths that have entered an invalid state or a cycle (revisiting a previous state).\n",
237 | "3. **Check for Solution (Goal Check):** A conditional node that checks if any of the active paths have reached the goal state. If so, it terminates the loop."
238 | ]
239 | },
240 | {
241 | "cell_type": "code",
242 | "execution_count": 4,
243 | "id": "agent-build-code",
244 | "metadata": {},
245 | "outputs": [
246 | {
247 | "name": "stdout",
248 | "output_type": "stream",
249 | "text": [
250 | "Tree-of-Thoughts agent graph compiled successfully.\n"
251 | ]
252 | }
253 | ],
254 | "source": [
255 | "llm = ChatNebius(model=\"mistralai/Mixtral-8x22B-Instruct-v0.1\", temperature=0.4)\n",
256 | "\n",
257 | "# Pydantic model for the LLM's choice of move\n",
258 | "class MoveChoice(BaseModel):\n",
259 | " best_move_index: int = Field(description=\"The index of the best move from the provided list of possible moves.\")\n",
260 | " reasoning: str = Field(description=\"Brief reasoning for why this is the most promising move.\")\n",
261 | "\n",
262 | "# LangGraph State\n",
263 | "class ToTState(TypedDict):\n",
264 | " problem_description: str\n",
265 | " # Each path is a list of PuzzleState objects\n",
266 | " active_paths: List[List[PuzzleState]]\n",
267 | " # We'll store the final solution here\n",
268 | " solution: Optional[List[PuzzleState]]\n",
269 | "\n",
270 | "# Graph Nodes\n",
271 | "def initialize_search(state: ToTState) -> Dict[str, Any]:\n",
272 | " \"\"\"Node to set up the initial state of the search.\"\"\"\n",
273 | " initial_puzzle_state = PuzzleState()\n",
274 | " return {\"active_paths\": [[initial_puzzle_state]]}\n",
275 | "\n",
276 | "def expand_paths(state: ToTState) -> Dict[str, Any]:\n",
277 | " \"\"\"The 'Thought Generator'. Expands each active path with a promising next move.\"\"\"\n",
278 | " console.print(\"--- Expanding Paths ---\")\n",
279 | " new_paths = []\n",
280 | " choice_llm = llm.with_structured_output(MoveChoice)\n",
281 | " \n",
282 | " prompt = ChatPromptTemplate.from_messages([\n",
283 | " (\"system\", \"You are an expert logic puzzle solver. Your goal is to solve the Wolf, Goat, and Cabbage problem. Analyze the current path and choose the most promising next move from the list of options to reach the goal.\"),\n",
284 | " (\"human\", \"Problem: {problem}\\n\\nCurrent Path History:\\n{path_history}\\n\\nFrom the final state, choose the best next move from this list:\\n{possible_moves}\")\n",
285 | " ])\n",
286 | " \n",
287 | " for path in state['active_paths']:\n",
288 | " last_state = path[-1]\n",
289 | " possible_next_states = get_possible_moves(last_state)\n",
290 | " \n",
291 | " if not possible_next_states:\n",
292 | " continue # This path is a dead end\n",
293 | " \n",
294 | " path_history_str = \" -> \".join([s.move_description for s in path])\n",
295 | " possible_moves_str = \"\\n\".join([f\"{i}: {s.move_description}\" for i, s in enumerate(possible_next_states)])\n",
296 | " \n",
297 |     "        # For this demo we branch on every valid move (exhaustive breadth-first\n",
298 |     "        # search) so the tree structure stays visible. A best-first ToT would instead\n",
299 |     "        # call `choice_llm` with the path history and move list prepared above.\n",
300 | " for next_state in possible_next_states:\n",
301 | " new_paths.append(path + [next_state])\n",
302 | "\n",
303 | " console.print(f\"[cyan]Expanded to {len(new_paths)} potential paths.[/cyan]\")\n",
304 | " return {\"active_paths\": new_paths}\n",
305 | "\n",
306 | "def prune_paths(state: ToTState) -> Dict[str, Any]:\n",
307 | " \"\"\"The 'State Evaluator'. Prunes paths that are invalid or contain cycles.\"\"\"\n",
308 | " console.print(\"--- Pruning Paths ---\")\n",
309 | " pruned_paths = []\n",
310 | " for path in state['active_paths']:\n",
311 | " # Check for cycles: if the last state has appeared before in the path\n",
312 | " if path[-1] in path[:-1]:\n",
313 | " continue # Found a cycle, prune this path\n",
314 | " \n",
315 | " # The get_possible_moves function already ensures validity, but this is a good place for extra checks.\n",
316 | " pruned_paths.append(path)\n",
317 | " \n",
318 | " console.print(f\"[green]Pruned down to {len(pruned_paths)} valid, non-cyclical paths.[/green]\")\n",
319 | " return {\"active_paths\": pruned_paths}\n",
320 | "\n",
321 | "# Conditional Node\n",
322 | "def check_for_solution(state: ToTState) -> str:\n",
323 | " \"\"\"Checks if any path has reached the goal and routes execution.\"\"\"\n",
324 | " for path in state['active_paths']:\n",
325 | " if path[-1].is_goal():\n",
326 | " console.print(\"[bold green]Solution Found![/bold green]\")\n",
327 | " # Side effect: stash the winning path on the state dict (a dedicated node would be the cleaner place to write state).\n",
328 | " state['solution'] = path\n",
329 | " return \"solution_found\"\n",
330 | " return \"continue_search\"\n",
331 | "\n",
332 | "# Build the graph\n",
333 | "workflow = StateGraph(ToTState)\n",
334 | "\n",
335 | "workflow.add_node(\"initialize\", initialize_search)\n",
336 | "workflow.add_node(\"expand\", expand_paths)\n",
337 | "workflow.add_node(\"prune\", prune_paths)\n",
338 | "\n",
339 | "workflow.set_entry_point(\"initialize\")\n",
340 | "workflow.add_edge(\"initialize\", \"expand\")\n",
341 | "workflow.add_edge(\"expand\", \"prune\")\n",
342 | "\n",
343 | "workflow.add_conditional_edges(\n",
344 | " \"prune\",\n",
345 | " check_for_solution,\n",
346 | " {\n",
347 | " \"solution_found\": END,\n",
348 | " \"continue_search\": \"expand\"\n",
349 | " }\n",
350 | ")\n",
351 | "\n",
352 | "tot_agent = workflow.compile()\n",
353 | "print(\"Tree-of-Thoughts agent graph compiled successfully.\")"
354 | ]
355 | },
356 | {
357 | "cell_type": "markdown",
358 | "id": "phase3-title",
359 | "metadata": {},
360 | "source": [
361 | "## Phase 3: Demonstration & Analysis\n",
362 | "\n",
363 | "Now, let's run our ToT agent on the puzzle. We'll compare its systematic approach with a simple, single-pass Chain-of-Thought request to highlight the differences in robustness."
364 | ]
365 | },
366 | {
367 | "cell_type": "code",
368 | "execution_count": 5,
369 | "id": "demo-code",
370 | "metadata": {},
371 | "outputs": [
372 | {
373 | "data": {
374 | "text/plain": [
375 | "--- 🌳 Running Tree-of-Thoughts Agent ---\n"
376 | ]
377 | },
378 | "output_type": "display_data",
379 | "metadata": {}
380 | },
381 | {
382 | "name": "stdout",
383 | "output_type": "stream",
384 | "text": [
385 | "--- Expanding Paths ---\n",
386 | "Expanded to 1 potential paths.\n",
387 | "--- Pruning Paths ---\n",
388 | "Pruned down to 1 valid, non-cyclical paths.\n",
389 | "--- Expanding Paths ---\n",
390 | "Expanded to 2 potential paths.\n",
391 | "--- Pruning Paths ---\n",
392 | "Pruned down to 2 valid, non-cyclical paths.\n",
393 | "--- Expanding Paths ---\n",
394 | "Expanded to 4 potential paths.\n",
395 | "--- Pruning Paths ---\n",
396 | "Pruned down to 4 valid, non-cyclical paths.\n",
397 | "--- Expanding Paths ---\n",
398 | "Expanded to 7 potential paths.\n",
399 | "--- Pruning Paths ---\n",
400 | "Pruned down to 7 valid, non-cyclical paths.\n",
401 | "--- Expanding Paths ---\n",
402 | "Expanded to 12 potential paths.\n",
403 | "--- Pruning Paths ---\n",
404 | "Pruned down to 12 valid, non-cyclical paths.\n",
405 | "--- Expanding Paths ---\n",
406 | "Expanded to 20 potential paths.\n",
407 | "--- Pruning Paths ---\n",
408 | "Pruned down to 20 valid, non-cyclical paths.\n",
409 | "--- Expanding Paths ---\n",
410 | "Expanded to 32 potential paths.\n",
411 | "--- Pruning Paths ---\n",
412 | "Pruned down to 32 valid, non-cyclical paths.\n",
413 | "Solution Found!\n"
414 | ]
415 | },
416 | {
417 | "data": {
418 | "text/plain": [
419 | "\n",
420 | "--- ✅ ToT Agent Solution ---\n"
421 | ]
422 | },
423 | "output_type": "display_data",
424 | "metadata": {}
425 | },
426 | {
427 | "data": {
428 | "text/plain": [
429 | "Wolf, Goat, and Cabbage Solution Path\n",
430 | "├── 1. Initial state.\n",
431 | "├── 2. Move goat to the right bank.\n",
432 | "├── 3. Move the boat empty to the left bank.\n",
433 | "├── 4. Move wolf to the right bank.\n",
434 | "├── 5. Move goat to the left bank.\n",
435 | "├── 6. Move cabbage to the right bank.\n",
436 | "├── 7. Move the boat empty to the left bank.\n",
437 | "└── 8. Move goat to the right bank.\n"
438 | ]
439 | },
440 | "output_type": "display_data",
441 | "metadata": {}
442 | },
443 | {
444 | "data": {
445 | "text/plain": [
446 | "\n",
447 | "--- 🤔 Running Simple Chain-of-Thought Agent ---\n"
448 | ]
449 | },
450 | "output_type": "display_data",
451 | "metadata": {}
452 | },
453 | {
454 | "data": {
455 | "text/markdown": [
456 | "Here's a step-by-step solution to the Wolf, Goat, and Cabbage puzzle:\n",
457 | "\n",
458 | "1. **Take the Goat across:** First, take the goat across the river to the right bank. You leave the wolf and cabbage behind on the left bank.\n",
459 | "2. **Return empty:** Return to the left bank alone.\n",
460 | "3. **Take the Wolf across:** Now, take the wolf across to the right bank. \n",
461 | "4. **Bring the Goat back:** *This is the key step.* Leave the wolf on the right bank and bring the goat back with you to the left bank.\n",
462 | "5. **Take the Cabbage across:** Leave the goat on the left bank and take the cabbage across to the right bank. Now the wolf and cabbage are on the right bank.\n",
463 | "6. **Return empty:** Return to the left bank alone.\n",
464 | "7. **Take the Goat across:** Finally, take the goat across to the right bank.\n",
465 | "\n",
466 | "Now, the wolf, goat, and cabbage are all safely on the right bank, and the puzzle is solved."
467 | ],
468 | "text/plain": [
469 | "Here's a step-by-step solution to the Wolf, Goat, and Cabbage puzzle:\n",
470 | "\n",
471 | "1. **Take the Goat across:** First, take the goat across the river to the right bank. You leave the wolf and cabbage behind on the left bank.\n",
472 | "2. **Return empty:** Return to the left bank alone.\n",
473 | "3. **Take the Wolf across:** Now, take the wolf across to the right bank. \n",
474 | "4. **Bring the Goat back:** *This is the key step.* Leave the wolf on the right bank and bring the goat back with you to the left bank.\n",
475 | "5. **Take the Cabbage across:** Leave the goat on the left bank and take the cabbage across to the right bank. Now the wolf and cabbage are on the right bank.\n",
476 | "6. **Return empty:** Return to the left bank alone.\n",
477 | "7. **Take the Goat across:** Finally, take the goat across to the right bank.\n",
478 | "\n",
479 | "Now, the wolf, goat, and cabbage are all safely on the right bank, and the puzzle is solved."
480 | ]
481 | },
482 | "output_type": "display_data",
483 | "metadata": {}
484 | }
485 | ],
486 | "source": [
487 | "problem = \"A farmer wants to cross a river with a wolf, a goat, and a cabbage. The boat can only carry the farmer and one other item. The farmer cannot leave the wolf alone with the goat, nor the goat alone with the cabbage. How can the farmer get everyone across safely?\"\n",
488 | "\n",
489 | "console.print(\"--- 🌳 Running Tree-of-Thoughts Agent ---\")\n",
490 | "# The recursion limit caps the number of expand/prune supersteps so the search cannot loop forever\n",
491 | "config = {\"recursion_limit\": 15}\n",
492 | "final_state = tot_agent.invoke({\"problem_description\": problem}, config=config)\n",
493 | "\n",
494 | "console.print(\"\\n--- ✅ ToT Agent Solution ---\")\n",
495 | "if final_state.get('solution'):\n",
496 | " solution_path = final_state['solution']\n",
497 | " # Use rich.Tree for a nice visual output\n",
498 | " tree = Tree(\"[bold magenta]Wolf, Goat, and Cabbage Solution Path[/bold magenta]\")\n",
499 | " for i, state in enumerate(solution_path):\n",
500 | " tree.add(f\"[green]{i+1}.[/green] {state.move_description}\")\n",
501 | " console.print(tree)\n",
502 | "else:\n",
503 | " console.print(\"[bold red]No solution found within the step limit.[/bold red]\")\n",
504 | "\n",
505 | "console.print(\"\\n--- 🤔 Running Simple Chain-of-Thought Agent ---\")\n",
506 | "cot_prompt = ChatPromptTemplate.from_messages([\n",
507 | " (\"system\", \"You are a world-class logic puzzle solver. Provide a step-by-step solution to the user's puzzle.\"),\n",
508 | " (\"human\", \"{problem}\")\n",
509 | "])\n",
510 | "cot_chain = cot_prompt | llm\n",
511 | "cot_result = cot_chain.invoke({\"problem\": problem}).content\n",
512 | "console.print(Markdown(cot_result))"
513 | ]
514 | },
515 | {
516 | "cell_type": "markdown",
517 | "id": "analysis-markdown",
518 | "metadata": {},
519 | "source": [
520 | "### Analysis of the Results\n",
521 | "\n",
522 | "The difference between the two approaches is profound:\n",
523 | "\n",
524 | "- **Chain-of-Thought (CoT):** This approach relies on the LLM's pre-trained knowledge to recall the solution. For a classic, well-known problem like this, a powerful LLM can often produce the correct answer in one go. However, if it makes a single mistake, it has no mechanism to self-correct. For a novel or more complex problem, the likelihood of failure is much higher. Its correctness is a matter of recall, not verifiable reasoning.\n",
525 | "\n",
526 | "- **Tree-of-Thoughts (ToT):** This agent *discovered* the solution through systematic, verifiable search. It didn't just recall an answer; it built one. We can see the process in the logs: expanding paths, then pruning ones that hit dead ends or cycles. Even if the LLM guiding the expansion made a suboptimal choice on one branch, the agent could continue exploring other, more promising branches. This method is far more robust and trustworthy because its final solution is guaranteed to be valid according to the rules of the environment we defined.\n",
527 | "\n",
528 | "The ToT agent's success is not based on luck or memorization, but on the soundness of its search algorithm. This makes it a vastly superior approach for tasks that demand high reliability and planning."
529 | ]
530 | },
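The expand/prune cycle above is, at its core, breadth-first search with visited-state pruning. Stripped of the LLM and LangGraph, the same search can be sketched in a few lines of plain Python (the state encoding and helper names here are illustrative, not taken from the notebook):

```python
from collections import deque

ITEMS = ("wolf", "goat", "cabbage")
UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]  # pairs that can't be left alone

def is_safe(bank, farmer_present):
    # A bank is only dangerous when the farmer is absent.
    return farmer_present or not any(pair <= bank for pair in UNSAFE)

def moves(state):
    left, boat = state  # left: items on the left bank; boat: "L" or "R"
    here = left if boat == "L" else frozenset(ITEMS) - left
    for cargo in (None, *here):  # None = farmer crosses alone
        new_left = set(left)
        if cargo is not None:
            (new_left.discard if boat == "L" else new_left.add)(cargo)
        new_boat = "R" if boat == "L" else "L"
        new_left = frozenset(new_left)
        right = frozenset(ITEMS) - new_left
        if is_safe(new_left, new_boat == "L") and is_safe(right, new_boat == "R"):
            yield (new_left, new_boat)

def bfs(start, goal):
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()          # expand the oldest path first
        if path[-1] == goal:
            return path
        for nxt in moves(path[-1]):
            if nxt not in seen:            # prune cycles and revisited states
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

solution = bfs((frozenset(ITEMS), "L"), (frozenset(), "R"))
print(len(solution) - 1)  # 7 crossings
```

The global `seen` set is a stronger prune than the per-path cycle check in the agent above, which is why this version explores fewer paths before finding the 7-crossing solution.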
531 | {
532 | "cell_type": "markdown",
533 | "id": "conclusion",
534 | "metadata": {},
535 | "source": [
536 | "## Conclusion\n",
537 | "\n",
538 | "In this notebook, we implemented a **Tree-of-Thoughts** agent to solve a classic logic puzzle. We demonstrated that by transforming a problem into a state space and systematically searching through it, an agent can achieve a level of robustness and accuracy that is impossible with simple, single-pass reasoning methods.\n",
539 | "\n",
540 | "The core components of ToT—**thought generation (expansion)**, **state evaluation (pruning)**, and **search**—create a powerful framework for tackling complex planning and reasoning tasks. While it comes with a higher computational cost, the trade-off is a significant increase in reliability and problem-solving capability. This architecture is a key step toward building agents that can reason deliberately and find solutions to challenging, multi-step problems."
541 | ]
542 | }
543 | ],
544 | "metadata": {
545 | "kernelspec": {
546 | "display_name": "Python 3 (ipykernel)",
547 | "language": "python",
548 | "name": "python3"
549 | },
550 | "language_info": {
551 | "codemirror_mode": {
552 | "name": "ipython",
553 | "version": 3
554 | },
555 | "file_extension": ".py",
556 | "mimetype": "text/x-python",
557 | "name": "python",
558 | "nbconvert_exporter": "python",
559 | "pygments_lexer": "ipython3",
560 | "version": "3.10.13"
561 | }
562 | },
563 | "nbformat": 4,
564 | "nbformat_minor": 5
565 | }
--------------------------------------------------------------------------------
/13_ensemble.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "intro-title",
6 | "metadata": {},
7 | "source": [
8 | "# 📘 Agentic Architectures 13: Parallel Exploration + Ensemble Decision\n",
9 | "\n",
10 | "Welcome to a deep dive into one of the most robust and reliable reasoning architectures: **Parallel Exploration with Ensemble Decision-Making**. This pattern addresses the inherent non-determinism and potential biases of a single LLM by leveraging the \"wisdom of the crowd\" principle, applied to AI agents.\n",
11 | "\n",
12 | "Instead of relying on a single line of reasoning, this architecture spawns multiple, independent agents to analyze a problem from different perspectives simultaneously. Each agent follows its own reasoning path, much like different experts in a committee. Their individual conclusions are then collected and synthesized by a final \"aggregator\" agent, which weighs the different viewpoints, identifies consensus and conflict, and produces a final, more nuanced, and reliable answer.\n",
13 | "\n",
14 | "To build a complex and powerful implementation, we will create a **mock AI Investment Committee** tasked with answering a difficult, open-ended question: **\"Is NVIDIA (NVDA) a good long-term investment in mid-2024?\"**\n",
15 | "\n",
16 | "Our committee will consist of three distinct, parallel agents:\n",
17 | "1. **The Bullish Growth Analyst:** An optimist who focuses on innovation, market domination, and future potential.\n",
18 | "2. **The Cautious Value Analyst:** A skeptic who scrutinizes financials, valuation, competition, and potential risks.\n",
19 | "3. **The Quantitative Analyst (Quant):** A data-driven expert who looks purely at financial metrics and technical stock indicators.\n",
20 | "\n",
21 | "Finally, a **Chief Investment Officer (CIO)** agent will synthesize their conflicting reports into a final, balanced investment thesis, providing a much more robust answer than any single agent could alone."
22 | ]
23 | },
24 | {
25 | "cell_type": "markdown",
26 | "id": "intro-definition",
27 | "metadata": {},
28 | "source": [
29 | "### Definition\n",
30 | "**Parallel Exploration + Ensemble Decision** is an agentic architecture where a problem is simultaneously processed by multiple independent agents or reasoning paths. The individual outputs are then aggregated, often by a separate agent, through a method like voting, consensus-building, or synthesis to arrive at a final, more robust conclusion.\n",
31 | "\n",
32 | "### High-level Workflow\n",
33 | "\n",
34 | "1. **Fan-Out (Parallel Exploration):** A user's query is distributed to N independent specialist agents. Crucially, these agents are often given different instructions, personas, or tools to encourage diverse analytical approaches.\n",
35 | "2. **Independent Processing:** Each agent works on the problem in isolation, generating its own complete analysis, conclusion, or answer.\n",
36 | "3. **Fan-In (Aggregation):** The outputs from all N agents are collected.\n",
37 | "4. **Synthesize (Ensemble Decision):** A final \"aggregator\" or \"judge\" agent receives all the individual outputs. Its task is to analyze these perspectives, identify common ground, weigh conflicting evidence, and synthesize a comprehensive final answer.\n",
38 | "\n",
39 | "### When to Use / Applications\n",
40 | "* **Hard Reasoning Q&A:** For complex, ambiguous questions where a single line of reasoning might easily miss the nuance (e.g., \"What was the primary cause of the 2008 financial crisis?\").\n",
41 | "* **Fact-Checking & Verification:** Having multiple agents search for and verify a fact from different sources can drastically reduce hallucinations.\n",
42 | "* **High-Stakes Decision Support:** In fields like medicine or finance, getting a \"second opinion\" (or third, or fourth) from different AI personas before making a recommendation.\n",
43 | "\n",
44 | "### Strengths & Weaknesses\n",
45 | "* **Strengths:**\n",
46 | " * **Boosts Reliability & Accuracy:** Averages out the random errors or biases of a single agent, making the final answer much more likely to be correct and well-rounded.\n",
47 | " * **Reduces Hallucinations:** If one agent hallucinates a fact, the others are unlikely to do the same, and the aggregator can easily spot the outlier.\n",
48 | "* **Weaknesses:**\n",
49 | " * **Very High Cost:** This is one of the most expensive architectures, as it multiplies the number of LLM calls by the number of agents in the ensemble (plus the final aggregation call).\n",
50 | " * **Increased Latency:** The system must wait for all parallel paths to complete before the final synthesis can begin."
51 | ]
52 | },
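The four workflow steps can be sketched without any agent framework. In this toy version, hypothetical analyst stubs stand in for LLM calls, and a simple majority vote stands in for the richer synthesis step:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for LLM-backed analysts: each maps a query to a verdict.
def bullish_analyst(query): return "Buy"
def value_analyst(query):   return "Hold"
def quant_analyst(query):   return "Hold"

def ensemble_decide(query, analysts):
    # Steps 1-2. Fan-out: every analyst processes the same query independently.
    with ThreadPoolExecutor() as pool:
        verdicts = list(pool.map(lambda analyst: analyst(query), analysts))
    # Steps 3-4. Fan-in + ensemble decision: majority vote over collected verdicts.
    verdict, votes = Counter(verdicts).most_common(1)[0]
    return verdict, votes

decision, votes = ensemble_decide(
    "Is NVDA a good long-term investment?",
    [bullish_analyst, value_analyst, quant_analyst],
)
print(decision, votes)  # Hold 2
```

A vote works for categorical answers; the notebook's CIO agent replaces it with an LLM synthesis step so conflicting *reasoning*, not just conflicting labels, can be weighed.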
53 | {
54 | "cell_type": "markdown",
55 | "id": "phase0-title",
56 | "metadata": {},
57 | "source": [
58 | "## Phase 0: Foundation & Setup\n",
59 | "\n",
60 | "We'll install libraries and set up our environment. We will need `langchain-tavily` for our analysts' research tools."
61 | ]
62 | },
63 | {
64 | "cell_type": "code",
65 | "execution_count": 1,
66 | "id": "install-libs",
67 | "metadata": {},
68 | "outputs": [],
69 | "source": [
70 | "# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv langchain-tavily"
71 | ]
72 | },
73 | {
74 | "cell_type": "code",
75 | "execution_count": 2,
76 | "id": "import-and-keys",
77 | "metadata": {},
78 | "outputs": [
79 | {
80 | "name": "stdout",
81 | "output_type": "stream",
82 | "text": [
83 | "Environment variables loaded and tracing is set up.\n"
84 | ]
85 | }
86 | ],
87 | "source": [
88 | "import os\n",
89 | "from typing import List, Dict, Any, Optional\n",
90 | "from dotenv import load_dotenv\n",
91 | "\n",
92 | "# Pydantic for data modeling\n",
93 | "from pydantic import BaseModel, Field\n",
94 | "\n",
95 | "# LangChain components\n",
96 | "from langchain_nebius import ChatNebius\n",
97 | "from langchain_tavily import TavilySearch\n",
98 | "from langchain_core.prompts import ChatPromptTemplate\n",
99 | "\n",
100 | "# LangGraph components\n",
101 | "from langgraph.graph import StateGraph, END\n",
102 | "from typing_extensions import TypedDict\n",
103 | "\n",
104 | "# For pretty printing\n",
105 | "from rich.console import Console\n",
106 | "from rich.markdown import Markdown\n",
107 | "from rich.panel import Panel\n",
108 | "\n",
109 | "# --- API Key and Tracing Setup ---\n",
110 | "load_dotenv()\n",
111 | "\n",
112 | "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
113 | "os.environ[\"LANGCHAIN_PROJECT\"] = \"Agentic Architecture - Parallel Ensemble (Nebius)\"\n",
114 | "\n",
115 | "required_vars = [\"NEBIUS_API_KEY\", \"LANGCHAIN_API_KEY\", \"TAVILY_API_KEY\"]\n",
116 | "for var in required_vars:\n",
117 | " if var not in os.environ:\n",
118 | " print(f\"Warning: Environment variable {var} not set.\")\n",
119 | "\n",
120 | "print(\"Environment variables loaded and tracing is set up.\")"
121 | ]
122 | },
123 | {
124 | "cell_type": "markdown",
125 | "id": "phase1-title",
126 | "metadata": {},
127 | "source": [
128 | "## Phase 1: Creating the Diverse Specialist Analysts\n",
129 | "\n",
130 | "The key to a successful ensemble is cognitive diversity. We will create three distinct analyst agents, each with a detailed persona designed to produce a different kind of analysis. All will have access to a search tool."
131 | ]
132 | },
133 | {
134 | "cell_type": "code",
135 | "execution_count": 3,
136 | "id": "specialist-agents-code",
137 | "metadata": {},
138 | "outputs": [
139 | {
140 | "name": "stdout",
141 | "output_type": "stream",
142 | "text": [
143 | "Specialist analyst agents defined successfully.\n"
144 | ]
145 | }
146 | ],
147 | "source": [
148 | "console = Console()\n",
149 | "# A powerful model is needed for this complex task\n",
150 | "llm = ChatNebius(model=\"mistralai/Mixtral-8x22B-Instruct-v0.1\", temperature=0.3)\n",
151 | "search_tool = TavilySearch(max_results=5)\n",
152 | "\n",
153 | "# LangGraph State\n",
154 | "import operator\n",
155 | "from typing import Annotated\n",
156 | "\n",
157 | "class EnsembleState(TypedDict):\n",
158 | " query: str\n",
159 | " # Parallel analysts each write one entry; the operator.or_ reducer merges\n",
160 | " # the per-node dicts instead of raising a concurrent-update error.\n",
161 | " analyses: Annotated[Dict[str, str], operator.or_]\n",
162 | " final_recommendation: Optional[Any] # Will store the structured output from the CIO\n",
163 | "\n",
164 | "# Helper factory to create our analyst nodes\n",
165 | "def create_analyst_node(persona: str, agent_name: str):\n",
166 | " \"\"\"Factory to create a specialist analyst node with a unique persona.\"\"\"\n",
167 | " system_prompt = f\"You are an expert financial analyst. Your persona is '{persona}'. You must use your search tool to gather up-to-date information. Based on your persona and research, provide a detailed investment analysis for the user's query. Conclude with a clear 'Recommendation' (e.g., Buy, Hold, Sell) and a 'Confidence Score' (1-10).\"\n",
168 | " \n",
169 | " prompt = ChatPromptTemplate.from_messages([\n",
170 | " (\"system\", system_prompt),\n",
171 | " (\"human\", \"{query}\")\n",
172 | " ])\n",
173 | " # Note: bind_tools alone does not execute tool calls; a full agent loop would run them.\n",
174 | " chain = prompt | llm.bind_tools([search_tool])\n",
175 | " \n",
176 | " def analyst_node(state: EnsembleState) -> Dict[str, Any]:\n",
177 | " console.print(f\"--- 👨💻 Calling {agent_name} --- \")\n",
178 | " result = chain.invoke({\"query\": state['query']})\n",
179 | " # Return only this agent's entry; the reducer merges it with the others'.\n",
180 | " return {\"analyses\": {agent_name: result.content}}\n",
179 | " \n",
180 | " return analyst_node\n",
181 | "\n",
182 | "# 1. The Bullish Growth Analyst\n",
183 | "bullish_persona = \"The Bullish Growth Analyst: You are extremely optimistic about technology and innovation. You focus on Total Addressable Market (TAM), visionary leadership, technological moats, and future growth potential. Downplay short-term volatility and valuation concerns in favor of the long-term disruptive story.\"\n",
184 | "bullish_analyst_node = create_analyst_node(bullish_persona, \"BullishAnalyst\")\n",
185 | "\n",
186 | "# 2. The Cautious Value Analyst\n",
187 | "value_persona = \"The Cautious Value Analyst: You are a skeptical investor focused on fundamentals and risk. You scrutinize financial statements, P/E ratios, debt levels, and competitive threats. You are wary of hype and market bubbles. Highlight potential risks, downside scenarios, and reasons for caution.\"\n",
188 | "value_analyst_node = create_analyst_node(value_persona, \"ValueAnalyst\")\n",
189 | "\n",
190 | "# 3. The Quantitative Analyst\n",
191 | "quant_persona = \"The Quantitative Analyst (Quant): You are purely data-driven. You ignore narratives and focus on hard numbers. Report on key financial metrics (YoY revenue growth, EPS, margins), valuation multiples (P/E, P/S), and technical indicators (RSI, moving averages). Your analysis must be objective and based on the data you find.\"\n",
192 | "quant_analyst_node = create_analyst_node(quant_persona, \"QuantAnalyst\")\n",
193 | "\n",
194 | "print(\"Specialist analyst agents defined successfully.\")"
195 | ]
196 | },
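Merging the per-analyst outputs amounts to folding partial dict updates together with a binary operator, which is the same idea LangGraph exposes through reducer annotations on state channels. A framework-free sketch of that merge:

```python
import operator
from functools import reduce

# Partial updates, as three parallel analyst nodes might emit them.
updates = [
    {"BullishAnalyst": "Buy"},
    {"ValueAnalyst": "Hold"},
    {"QuantAnalyst": "Hold"},
]

# A dict reducer folds each update into the running value: merged = merged | update.
analyses = reduce(operator.or_, updates, {})
print(sorted(analyses))  # ['BullishAnalyst', 'QuantAnalyst', 'ValueAnalyst']
```

Because `dict | dict` keeps the right-hand value on key collisions and each analyst writes a distinct key, the fold is order-insensitive here.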
197 | {
198 | "cell_type": "markdown",
199 | "id": "phase2-title",
200 | "metadata": {},
201 | "source": [
202 | "## Phase 2: Building the CIO Aggregator Agent\n",
203 | "\n",
204 | "This is the **Ensemble Decision** step. We will create a final agent, the Chief Investment Officer (CIO), whose job is to synthesize the reports from the three analysts. This agent needs a sophisticated prompt and a structured output model to ensure it produces a high-quality, balanced final recommendation."
205 | ]
206 | },
207 | {
208 | "cell_type": "code",
209 | "execution_count": 4,
210 | "id": "aggregator-agent-code",
211 | "metadata": {},
212 | "outputs": [
213 | {
214 | "name": "stdout",
215 | "output_type": "stream",
216 | "text": [
217 | "CIO Aggregator agent defined successfully.\n"
218 | ]
219 | }
220 | ],
221 | "source": [
222 | "# Pydantic model for the final, structured recommendation\n",
223 | "class FinalRecommendation(BaseModel):\n",
224 | " \"\"\"The final, synthesized investment thesis from the CIO.\"\"\"\n",
225 | " final_recommendation: str = Field(description=\"The final investment decision, must be one of 'Strong Buy', 'Buy', 'Hold', 'Sell', 'Strong Sell'.\")\n",
226 | " confidence_score: float = Field(description=\"The CIO's confidence in this recommendation, from 1.0 to 10.0.\")\n",
227 | " synthesis_summary: str = Field(description=\"A detailed summary synthesizing the analysts' viewpoints, highlighting points of agreement and contention.\")\n",
228 | " identified_opportunities: List[str] = Field(description=\"A bulleted list of the primary opportunities or bullish points.\")\n",
229 | " identified_risks: List[str] = Field(description=\"A bulleted list of the primary risks or bearish points.\")\n",
230 | "\n",
231 | "def cio_synthesizer_node(state: EnsembleState) -> Dict[str, Any]:\n",
232 | " \"\"\"The final node that synthesizes all analyses into a single recommendation.\"\"\"\n",
233 | " console.print(\"--- 🏛️ Calling Chief Investment Officer for Final Decision ---\")\n",
234 | " \n",
235 | " # Combine all the individual analyses into a single string for the prompt\n",
236 | " all_analyses = \"\\n\\n---\\n\\n\".join(\n",
237 | " f\"**Analysis from {name}:**\\n{analysis}\"\n",
238 | " for name, analysis in state['analyses'].items()\n",
239 | " )\n",
240 | " \n",
241 | " cio_prompt = ChatPromptTemplate.from_messages([\n",
242 | " (\"system\", \"You are the Chief Investment Officer (CIO) of a major investment fund. You have received reports from your team of specialist analysts. Your task is to synthesize these diverse and often conflicting viewpoints into a single, final, and actionable investment thesis. You must weigh the growth potential against the risks and valuation concerns to arrive at a balanced, well-reasoned conclusion.\"),\n",
243 | " (\"human\", \"Here are the reports from your team regarding the query: '{query}'\\n\\n{analyses}\\n\\nBased on all these perspectives, provide your final, synthesized investment thesis.\")\n",
244 | " ])\n",
245 | " \n",
246 | " cio_llm = llm.with_structured_output(FinalRecommendation)\n",
247 | " chain = cio_prompt | cio_llm\n",
248 | " \n",
249 | " final_decision = chain.invoke({\"query\": state['query'], \"analyses\": all_analyses})\n",
250 | " \n",
251 | " return {\"final_recommendation\": final_decision}\n",
252 | "\n",
253 | "print(\"CIO Aggregator agent defined successfully.\")"
254 | ]
255 | },
256 | {
257 | "cell_type": "markdown",
258 | "id": "phase3-title",
259 | "metadata": {},
260 | "source": [
261 | "## Phase 3: Assembling the LangGraph Workflow\n",
262 | "\n",
263 | "Now we wire everything together. The graph will have a single entry point that fans out to our three parallel analyst nodes. Once all analysts have completed their work, the graph will fan back in to the single CIO synthesizer node, which produces the final result."
264 | ]
265 | },
266 | {
267 | "cell_type": "code",
268 | "execution_count": 5,
269 | "id": "graph-assembly-code",
270 | "metadata": {},
271 | "outputs": [
272 | {
273 | "name": "stdout",
274 | "output_type": "stream",
275 | "text": [
276 | "Parallel Ensemble agent graph compiled successfully.\n"
277 | ]
278 | }
279 | ],
280 | "source": [
281 | "# The entry node simply takes the query and prepares the state.\n",
282 | "def start_analysis_node(state: EnsembleState) -> Dict[str, Any]:\n",
283 | " # Initialize the analyses dictionary\n",
284 | " return {\"analyses\": {}}\n",
285 | "\n",
286 | "# Build the graph\n",
287 | "workflow = StateGraph(EnsembleState)\n",
288 | "\n",
289 | "workflow.add_node(\"start_analysis\", start_analysis_node)\n",
290 | "\n",
291 | "# Add the parallel analyst nodes\n",
292 | "workflow.add_node(\"bullish_analyst\", bullish_analyst_node)\n",
293 | "workflow.add_node(\"value_analyst\", value_analyst_node)\n",
294 | "workflow.add_node(\"quant_analyst\", quant_analyst_node)\n",
295 | "\n",
296 | "# Add the final synthesizer node\n",
297 | "workflow.add_node(\"cio_synthesizer\", cio_synthesizer_node)\n",
298 | "\n",
299 | "# Set the entry point\n",
300 | "workflow.set_entry_point(\"start_analysis\")\n",
301 | "\n",
302 | "# FAN-OUT: From the start, run all three analysts in parallel.\n",
303 | "# add_edge accepts a list only as the *source* (for joins), so fan-out is one edge per analyst.\n",
304 | "workflow.add_edge(\"start_analysis\", \"bullish_analyst\")\n",
305 | "workflow.add_edge(\"start_analysis\", \"value_analyst\")\n",
306 | "workflow.add_edge(\"start_analysis\", \"quant_analyst\")\n",
304 | "\n",
305 | "# FAN-IN: After all analysts are done, call the CIO synthesizer\n",
306 | "workflow.add_edge([\"bullish_analyst\", \"value_analyst\", \"quant_analyst\"], \"cio_synthesizer\")\n",
307 | "\n",
308 | "workflow.add_edge(\"cio_synthesizer\", END)\n",
309 | "\n",
310 | "ensemble_agent = workflow.compile()\n",
311 | "print(\"Parallel Ensemble agent graph compiled successfully.\")"
312 | ]
313 | },
314 | {
315 | "cell_type": "markdown",
316 | "id": "phase4-title",
317 | "metadata": {},
318 | "source": [
319 | "## Phase 4: Demonstration & Analysis\n",
320 | "\n",
321 | "Let's run the full investment committee on our complex question. We will print the individual reports first to see the diversity of opinions, followed by the CIO's final synthesized recommendation."
322 | ]
323 | },
324 | {
325 | "cell_type": "code",
326 | "execution_count": 6,
327 | "id": "demo-code",
328 | "metadata": {},
329 | "outputs": [
330 | {
331 | "data": {
332 | "text/plain": [
333 | "--- 📈 Running Investment Committee for: Is NVIDIA (NVDA) a good long-term investment in mid-2024? ---\n"
334 | ]
335 | },
336 | "output_type": "display_data",
337 | "metadata": {}
338 | },
339 | {
340 | "name": "stdout",
341 | "output_type": "stream",
342 | "text": [
343 | "--- 👨💻 Calling BullishAnalyst --- \n",
344 | "--- 👨💻 Calling ValueAnalyst --- \n",
345 | "--- 👨💻 Calling QuantAnalyst --- \n",
346 | "--- 🏛️ Calling Chief Investment Officer for Final Decision ---\n"
347 | ]
348 | },
349 | {
350 | "data": {
351 | "text/plain": [
352 | "\n",
353 | "--- Individual Analyst Reports ---\n"
354 | ]
355 | },
356 | "output_type": "display_data",
357 | "metadata": {}
358 | },
359 | {
360 | "data": {
361 | "text/plain": [
362 | "Analysis from BullishAnalyst:\n",
363 | "NVIDIA's position as the undisputed leader in accelerated computing for AI makes it an incredibly compelling long-term investment. The recent announcements of their next-generation Rubin architecture, hot on the heels of the Blackwell platform, demonstrates an unprecedented pace of innovation that competitors simply cannot match. Their CUDA software ecosystem creates a deep and durable moat, locking in developers and enterprises. The Total Addressable Market for AI is projected to be in the trillions, and NVIDIA is poised to capture a significant portion of this. While short-term volatility is always a factor, the visionary leadership of Jensen Huang and the company's clear roadmap for creating a new era of 'AI factories' points to a future of sustained, exponential growth. Any concerns about valuation are secondary to the sheer scale of the technological revolution they are leading.\n",
364 | "\n",
365 | "Recommendation: Buy\n",
366 | "Confidence Score: 9/10"
367 | ]
368 | },
369 | "output_type": "display_data",
370 | "metadata": {}
371 | },
372 | {
373 | "data": {
374 | "text/plain": [
375 | "Analysis from ValueAnalyst:\n",
376 | "While NVIDIA's technological prowess is undeniable, a cautious approach is warranted due to its astronomical valuation. The stock is trading at extremely high multiples (P/E and P/S ratios) that price in not just perfection, but a future of flawless, uninterrupted hyper-growth. This leaves very little margin for error. Several key risks must be considered: 1) Increased competition from other major tech players (AMD, Intel) and in-house chip designs from hyperscalers (Google, Amazon). 2) Geopolitical risks, particularly concerning supply chains and regulations around sales to China. 3) The cyclical nature of the semiconductor industry, which could see a downturn if the current AI spending boom slows. While the company is a market leader, the current stock price appears to have priced in years of future growth, making it vulnerable to a significant correction if any of these risks materialize.\n",
377 | "\n",
378 | "Recommendation: Hold\n",
379 | "Confidence Score: 7/10"
380 | ]
381 | },
382 | "output_type": "display_data",
383 | "metadata": {}
384 | },
385 | {
386 | "data": {
387 | "text/plain": [
388 | "Analysis from QuantAnalyst:\n",
389 | "Based on current data, NVIDIA exhibits the following quantitative profile:\n",
390 | "- **Financials:** Revenue growth YoY for the most recent quarter exceeded 260%. Earnings Per Share (EPS) have shown similar explosive growth. Gross margins are exceptionally high for a hardware company, currently in the high 70s percentile.\n",
391 | "- **Valuation:** The forward Price-to-Earnings (P/E) ratio is approximately 45-50x, which is high relative to the broader market but may be justifiable given the growth rate (PEG ratio is closer to 1.5). The Price-to-Sales (P/S) ratio is also elevated, above 30x.\n",
392 | "- **Technicals:** The stock is currently trading well above its 50-day and 200-day moving averages, indicating a strong bullish trend. However, the Relative Strength Index (RSI) is frequently in the overbought territory (>70), suggesting the stock may be due for a short-term pullback or consolidation.\n",
393 | "\n",
394 | "Recommendation: Hold\n",
395 | "Confidence Score: 8/10"
396 | ]
397 | },
398 | "output_type": "display_data",
399 | "metadata": {}
400 | },
401 | {
402 | "data": {
403 | "text/plain": [
404 | "\n",
405 | "--- Final CIO Recommendation ---\n"
406 | ]
407 | },
408 | "output_type": "display_data",
409 | "metadata": {}
410 | },
411 | {
412 | "name": "stdout",
413 | "output_type": "stream",
414 | "text": [
415 | "Final Recommendation: Buy\n",
416 | "Confidence Score: 7.5\n",
417 | "Synthesis Summary: The committee presents a compelling but contested case for NVIDIA. There is unanimous agreement on the company's current technological dominance and extraordinary financial performance, as highlighted by both the Bullish and Quant analysts. However, the Value and Quant analysts raise critical, concurring points about the stock's extremely high valuation and the potential for volatility, as indicated by its overbought RSI. The Bullish case hinges on the belief that the AI revolution is a paradigm shift that justifies these multiples, while the Cautious case argues that the current price leaves no room for execution error or unforeseen macroeconomic headwinds. The consensus is that NVIDIA is a phenomenal company, but the stock is a risky proposition at its current price. Therefore, the final recommendation is a 'Buy', but with a strong emphasis on it being a long-term position and advising a cautious entry, perhaps by dollar-cost averaging to mitigate the risk of a short-term pullback.\n",
418 | "Identified Opportunities:\n",
419 | "- Unquestioned leadership in the AI accelerator market.\n",
420 | "- Deep and defensible software moat with the CUDA platform.\n",
421 | "- Massive Total Addressable Market (TAM) in AI, spanning multiple industries.\n",
422 | "- Visionary leadership with a rapid pace of innovation (Blackwell -> Rubin).\n",
423 | "Identified Risks:\n",
424 | "- Extremely high valuation (P/E and P/S ratios) that prices in perfection.\n",
425 | "- Increasing competition from both traditional rivals and major customers developing in-house solutions.\n",
426 | "- Geopolitical and supply chain risks.\n",
427 | "- Technical indicators like RSI suggest the stock is overbought and may be due for a correction.\n"
428 | ]
429 | }
430 | ],
431 | "source": [
432 | "query = \"Based on recent news, financial performance, and future outlook, is NVIDIA (NVDA) a good long-term investment in mid-2024?\"\n",
433 | "console.print(f\"--- 📈 Running Investment Committee for: {query} ---\")\n",
434 | "\n",
435 | "result = ensemble_agent.invoke({\"query\": query})\n",
436 | "\n",
437 | "# Display the individual reports\n",
438 | "console.print(\"\\n--- Individual Analyst Reports ---\")\n",
439 | "for name, analysis in result['analyses'].items():\n",
440 | " console.print(Panel(Markdown(analysis), title=f\"[bold yellow]{name}[/bold yellow]\", border_style=\"yellow\"))\n",
441 | "\n",
442 | "# Display the final synthesized recommendation\n",
443 | "console.print(\"\\n--- Final CIO Recommendation ---\")\n",
444 | "final_rec = result['final_recommendation']\n",
445 | "if final_rec:\n",
446 | " rec_panel = Panel(\n",
447 | " f\"[bold]Final Recommendation:[/bold] {final_rec.final_recommendation}\\n\"\n",
448 | " f\"[bold]Confidence Score:[/bold] {final_rec.confidence_score}/10\\n\\n\"\n",
449 | " f\"[bold]Synthesis Summary:[/bold]\\n{final_rec.synthesis_summary}\\n\\n\"\n",
450 | "        \"[bold]Identified Opportunities:[/bold]\\n* \" + \"\\n* \".join(final_rec.identified_opportunities) + \"\\n\\n\"\n",
451 | "        \"[bold]Identified Risks:[/bold]\\n* \" + \"\\n* \".join(final_rec.identified_risks),  # backslashes in f-string expressions are a SyntaxError before Python 3.12\n",
452 | " title=\"[bold green]Chief Investment Officer's Thesis[/bold green]\",\n",
453 | " border_style=\"green\"\n",
454 | " )\n",
455 | " console.print(rec_panel)\n"
456 | ]
457 | },
458 | {
459 | "cell_type": "markdown",
460 | "id": "analysis-markdown",
461 | "metadata": {},
462 | "source": [
463 | "### Analysis of the Results\n",
464 | "\n",
465 | "The demonstration powerfully illustrates the value of this complex architecture:\n",
466 | "\n",
467 | "1. **Cognitive Diversity:** The three analysts produced wildly different, yet individually valid, reports. The Bull focused on the grand vision, the Value analyst focused on risk, and the Quant provided the hard data. A single agent, even with a neutral prompt, would likely have leaned in one of these directions, giving an incomplete picture.\n",
468 | "\n",
469 | "2. **Robust Synthesis:** The CIO agent did not simply \"average\" the recommendations ('Buy', 'Hold', 'Hold'). Instead, it performed a true synthesis. It acknowledged the bull case's validity but tempered it with the value and quant analysts' concerns about valuation. The final recommendation of 'Buy' with a confidence of 7.5 reflects this nuance, effectively saying, \"This is a great company, but the stock is expensive, so proceed with caution.\"\n",
470 | "\n",
471 | "3. **Actionable and Explainable Insights:** The final structured output, with clear lists of opportunities and risks, is far more useful for a human decision-maker than a single, monolithic block of text. It explains *why* the final recommendation was made by showing how the different expert opinions were balanced.\n",
472 | "\n",
473 | "This ensemble method successfully transformed a subjective and complex question into a well-reasoned, multi-faceted analysis, significantly increasing the reliability and trustworthiness of the final output compared to any single agent."
474 | ]
475 | },
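476 | {
477 | "cell_type": "markdown",
478 | "id": "synthesis-vote-sketch",
479 | "metadata": {},
480 | "source": [
481 | "To make the contrast with naive aggregation concrete, here is a hypothetical sketch (not part of the agent) of what a simple majority vote over the three recommendations would have produced:\n",
482 | "\n",
483 | "```python\n",
484 | "from collections import Counter\n",
485 | "\n",
486 | "votes = [\"Buy\", \"Hold\", \"Hold\"]  # BullishAnalyst, ValueAnalyst, QuantAnalyst\n",
487 | "print(Counter(votes).most_common(1)[0][0])  # prints 'Hold'\n",
488 | "```\n",
489 | "\n",
490 | "A pure vote discards the analysts' reasoning entirely, whereas the CIO agent weighs the arguments themselves, which is how it can justify a qualified 'Buy' despite the 2-to-1 vote."
491 | ]
492 | },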
476 | {
477 | "cell_type": "markdown",
478 | "id": "conclusion",
479 | "metadata": {},
480 | "source": [
481 | "## Conclusion\n",
482 | "\n",
483 | "In this notebook, we have implemented a comprehensive and complex **Parallel Exploration + Ensemble Decision** agent. By simulating a committee of diverse experts and a final decision-maker, we have built a system that excels at tackling ambiguous, high-stakes problems.\n",
484 | "\n",
485 | "The core principles—**spawning diverse, independent reasoners** and then **synthesizing their outputs**—create a powerful mechanism for mitigating bias, reducing errors, and increasing the depth of analysis. While this is one of the most computationally expensive agentic architectures, its ability to deliver robust, reliable, and nuanced conclusions makes it an indispensable tool for any application where the quality and trustworthiness of the final decision are paramount."
486 | ]
487 | }
488 | ],
489 | "metadata": {
490 | "kernelspec": {
491 | "display_name": "Python 3 (ipykernel)",
492 | "language": "python",
493 | "name": "python3"
494 | },
495 | "language_info": {
496 | "codemirror_mode": {
497 | "name": "ipython",
498 | "version": 3
499 | },
500 | "file_extension": ".py",
501 | "mimetype": "text/x-python",
502 | "name": "python",
503 | "nbconvert_exporter": "python",
504 | "pygments_lexer": "ipython3",
505 | "version": "3.10.13"
506 | }
507 | },
508 | "nbformat": 4,
509 | "nbformat_minor": 5
510 | }
--------------------------------------------------------------------------------
/08_episodic_with_semantic.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "intro-title",
6 | "metadata": {},
7 | "source": [
8 | "# 📘 Agentic Architectures 8: Episodic + Semantic Memory Stack\n",
9 | "\n",
10 | "Welcome to the eighth notebook in our series. Today, we're tackling one of the most critical challenges in creating truly intelligent, long-term assistants: **persistent memory**. Standard chatbot memory is ephemeral, lasting only for a single session. To build a personalized agent that learns and grows with a user, we need a more robust solution.\n",
11 | "\n",
12 | "We will implement a structured memory architecture that mirrors human cognition, combining two distinct types of memory:\n",
13 | "\n",
14 | "1. **Episodic Memory:** This is the memory of specific events or past interactions. It answers the question, \"What happened?\" (e.g., \"Last week, the user asked me about NVIDIA's stock price.\"). We'll use a **vector database** for this to find past conversations relevant to the current topic.\n",
15 | "2. **Semantic Memory:** This is the memory of structured facts, concepts, and relationships extracted from those events. It answers the question, \"What do I know?\" (e.g., \"User Alex is a conservative investor.\", \"Alex is interested in Tech Stocks.\"). We'll use a **graph database (Neo4j)** for this, as it excels at managing and querying complex relationships.\n",
16 | "\n",
17 | "By combining these, our agent can not only recall past conversations but also build a rich, interconnected knowledge base about the user and the world, leading to deeply personalized and context-aware interactions."
18 | ]
19 | },
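20 | {
21 | "cell_type": "markdown",
22 | "id": "memory-lookup-sketch",
23 | "metadata": {},
24 | "source": [
25 | "As a taste of how the two stores answer different questions, here is a hypothetical pair of lookups (illustrative names only; the real stores and queries are built in the phases below):\n",
26 | "\n",
27 | "```python\n",
28 | "# Episodic: \"What happened?\" -- similarity search over past-interaction summaries\n",
29 | "episodes = vector_store.similarity_search(\"tech stock advice\", k=2)\n",
30 | "\n",
31 | "# Semantic: \"What do I know?\" -- a Cypher query over extracted facts\n",
32 | "facts = graph.query(\"MATCH (u:User {id: 'Alex'})-[r]->(n) RETURN type(r), n.id\")\n",
33 | "```"
34 | ]
35 | },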
20 | {
21 | "cell_type": "markdown",
22 | "id": "intro-definition",
23 | "metadata": {},
24 | "source": [
25 | "### Definition\n",
26 | "An **Episodic + Semantic Memory Stack** is an agent architecture that maintains two types of long-term memory. **Episodic memory** stores a chronological log of experiences (e.g., chat history summaries) and is typically searched based on semantic similarity. **Semantic memory** stores extracted, structured knowledge (facts, entities, relationships) in a knowledge base, often a graph.\n",
27 | "\n",
28 | "### High-level Workflow\n",
29 | "\n",
30 | "1. **Interaction:** The agent has a conversation with the user.\n",
31 | "2. **Memory Retrieval (Recall):** For a new user query, the agent first queries both memory systems.\n",
32 | " * It searches the **episodic** vector store for similar past conversations.\n",
33 | " * It queries the **semantic** graph database for entities and facts related to the query.\n",
34 | "3. **Augmented Generation:** The retrieved memories are added to the prompt's context, allowing the LLM to generate a response that is aware of past interactions and learned facts.\n",
35 | "4. **Memory Creation (Encoding):** After the interaction is complete, a background process analyzes the conversation.\n",
36 | " * It creates a concise summary of the turn (the new **episodic** memory).\n",
37 | " * It extracts key entities and relationships (the new **semantic** memory).\n",
38 | "5. **Memory Storage:** The new episodic summary is embedded and saved to the vector store. The new semantic facts are written as nodes and edges in the graph database.\n",
39 | "\n",
40 | "### When to Use / Applications\n",
41 | "* **Long-Term Personal Assistants:** An assistant that remembers your preferences, projects, and personal details over weeks or months.\n",
42 | "* **Personalized Systems:** E-commerce bots that remember your style, or educational tutors that remember your learning progress and weak spots.\n",
43 | "* **Complex Research Agents:** An agent that builds a knowledge graph of a topic as it explores documents, allowing it to answer complex, multi-hop questions.\n",
44 | "\n",
45 | "### Strengths & Weaknesses\n",
46 | "* **Strengths:**\n",
47 | " * **True Personalization:** Enables context and learning that persists indefinitely, far beyond a single session's context window.\n",
48 | " * **Rich Understanding:** A graph database allows the agent to understand and reason about complex relationships between entities.\n",
49 | "* **Weaknesses:**\n",
50 | " * **Complexity:** This is a significantly more complex architecture to build and maintain than a simple stateless agent.\n",
51 | " * **Memory Bloat & Pruning:** Over time, the memory stores can become massive. Strategies for summarizing, consolidating, or pruning old/irrelevant memories are essential for long-term performance."
52 | ]
53 | },
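54 | {
55 | "cell_type": "markdown",
56 | "id": "workflow-sketch",
57 | "metadata": {},
58 | "source": [
59 | "The recall-generate-encode loop above can be sketched in a few lines of pseudocode (illustrative names, not the notebook's actual API):\n",
60 | "\n",
61 | "```python\n",
62 | "def turn(user_input):\n",
63 | "    episodes = episodic_store.search(user_input, k=2)        # step 2a: recall similar events\n",
64 | "    facts = semantic_graph.lookup(entities_in(user_input))   # step 2b: recall related facts\n",
65 | "    reply = llm(prompt(user_input, episodes, facts))         # step 3: augmented generation\n",
66 | "    episodic_store.add(summarize(user_input, reply))         # steps 4-5: encode and store\n",
67 | "    semantic_graph.add(extract_triples(user_input, reply))\n",
68 | "    return reply\n",
69 | "```"
70 | ]
71 | },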
54 | {
55 | "cell_type": "markdown",
56 | "id": "phase0-title",
57 | "metadata": {},
58 | "source": [
59 | "## Phase 0: Foundation & Setup\n",
60 | "\n",
61 | "We'll install all necessary libraries, including drivers for our vector and graph databases, and configure our API keys."
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 1,
67 | "id": "install-libs",
68 | "metadata": {},
69 | "outputs": [],
70 | "source": [
71 | "# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv langchain_community langchain-openai neo4j faiss-cpu tiktoken"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": 2,
77 | "id": "import-and-keys",
78 | "metadata": {},
79 | "outputs": [
80 | {
81 | "name": "stdout",
82 | "output_type": "stream",
83 | "text": [
84 | "Environment variables loaded and tracing is set up.\n"
85 | ]
86 | }
87 | ],
88 | "source": [
89 | "import os\n",
90 | "import uuid\n",
91 | "from typing import List, Dict, Any, Optional, Tuple\n",
92 | "from dotenv import load_dotenv\n",
93 | "\n",
94 | "# Pydantic for data modeling\n",
95 | "from pydantic import BaseModel, Field\n",
96 | "\n",
97 | "# LangChain components\n",
98 | "from langchain_nebius import ChatNebius, NebiusEmbeddings\n",
99 | "from langchain_community.graphs import Neo4jGraph\n",
100 | "from langchain_community.vectorstores import FAISS\n",
101 | "from langchain.docstore.document import Document\n",
102 | "from langchain_core.prompts import ChatPromptTemplate\n",
103 | "\n",
104 | "# LangGraph components\n",
105 | "from langgraph.graph import StateGraph, END\n",
106 | "from typing_extensions import TypedDict\n",
107 | "\n",
108 | "# For pretty printing\n",
109 | "from rich.console import Console\n",
110 | "from rich.markdown import Markdown\n",
111 | "\n",
112 | "# --- API Key and Tracing Setup ---\n",
113 | "load_dotenv()\n",
114 | "\n",
115 | "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
116 | "os.environ[\"LANGCHAIN_PROJECT\"] = \"Agentic Architecture - Memory Stack (Nebius)\"\n",
117 | "\n",
118 | "# Check for required environment variables\n",
119 | "required_vars = [\"NEBIUS_API_KEY\", \"LANGCHAIN_API_KEY\", \"NEO4J_URI\", \"NEO4J_USERNAME\", \"NEO4J_PASSWORD\"]\n",
120 | "for var in required_vars:\n",
121 | " if var not in os.environ:\n",
122 | " print(f\"Warning: Environment variable {var} not set.\")\n",
123 | "\n",
124 | "print(\"Environment variables loaded and tracing is set up.\")"
125 | ]
126 | },
127 | {
128 | "cell_type": "markdown",
129 | "id": "phase1-title",
130 | "metadata": {},
131 | "source": [
132 | "## Phase 1: Building the Memory Components\n",
133 | "\n",
134 | "This is the core of our architecture. We'll define the structures for our memories and set up the connections to our databases. We'll also create the \"Memory Maker\" agent responsible for processing conversations and creating new memories."
135 | ]
136 | },
137 | {
138 | "cell_type": "code",
139 | "execution_count": 3,
140 | "id": "memory-setup-code",
141 | "metadata": {},
142 | "outputs": [
143 | {
144 | "name": "stdout",
145 | "output_type": "stream",
146 | "text": [
147 | "Memory components initialized successfully.\n"
148 | ]
149 | }
150 | ],
151 | "source": [
152 | "console = Console()\n",
153 | "llm = ChatNebius(model=\"mistralai/Mixtral-8x22B-Instruct-v0.1\", temperature=0)\n",
154 | "embeddings = NebiusEmbeddings()\n",
155 | "\n",
156 | "# --- 1. Vector Store for Episodic Memory ---\n",
157 | "# In a real application, you'd persist this. For this example, it's in-memory.\n",
158 | "try:\n",
159 | " episodic_vector_store = FAISS.from_texts([\"Initial document to bootstrap the store\"], embeddings)\n",
160 | "except ImportError:\n",
161 | "    console.print(\"[bold red]FAISS not installed. Please run `pip install faiss-cpu`.[/bold red]\")\n",
162 | " episodic_vector_store = None\n",
163 | "\n",
164 | "# --- 2. Graph DB for Semantic Memory ---\n",
165 | "try:\n",
166 | " graph = Neo4jGraph(\n",
167 | " url=os.environ.get(\"NEO4J_URI\"),\n",
168 | " username=os.environ.get(\"NEO4J_USERNAME\"),\n",
169 | " password=os.environ.get(\"NEO4J_PASSWORD\")\n",
170 | " )\n",
171 | " # Clear the graph for a clean run\n",
172 | " graph.query(\"MATCH (n) DETACH DELETE n\")\n",
173 | "except Exception as e:\n",
174 | " console.print(f\"[bold red]Failed to connect to Neo4j: {e}. Please check your credentials and connection.[/bold red]\")\n",
175 | " graph = None\n",
176 | "\n",
177 | "# --- 3. Pydantic Models for the \"Memory Maker\" ---\n",
178 | "# Define the structure of knowledge we want to extract.\n",
179 | "class Node(BaseModel):\n",
180 | " id: str = Field(description=\"Unique identifier for the node, which can be a person's name, a company ticker, or a concept.\")\n",
181 | " type: str = Field(description=\"The type of the node (e.g., 'User', 'Company', 'InvestmentPhilosophy').\")\n",
182 | " properties: Dict[str, Any] = Field(description=\"A dictionary of properties for the node.\")\n",
183 | "\n",
184 | "class Relationship(BaseModel):\n",
185 | " source: Node = Field(description=\"The source node of the relationship.\")\n",
186 | " target: Node = Field(description=\"The target node of the relationship.\")\n",
187 | " type: str = Field(description=\"The type of the relationship (e.g., 'IS_A', 'INTERESTED_IN').\")\n",
188 | " properties: Dict[str, Any] = Field(description=\"A dictionary of properties for the relationship.\")\n",
189 | "\n",
190 | "class KnowledgeGraph(BaseModel):\n",
191 | " \"\"\"Represents the structured knowledge extracted from a conversation.\"\"\"\n",
192 | " relationships: List[Relationship] = Field(description=\"A list of relationships to be added to the knowledge graph.\")\n",
193 | "\n",
194 | "# --- 4. The \"Memory Maker\" Agent ---\n",
195 | "def create_memories(user_input: str, assistant_output: str):\n",
196 | " conversation = f\"User: {user_input}\\nAssistant: {assistant_output}\"\n",
197 | " \n",
198 | " # 4a. Create Episodic Memory (Summarization)\n",
199 | " console.print(\"--- Creating Episodic Memory (Summary) ---\")\n",
200 | " summary_prompt = ChatPromptTemplate.from_messages([\n",
201 | " (\"system\", \"You are a summarization expert. Create a concise, one-sentence summary of the following user-assistant interaction. This summary will be used as a memory for future recall.\"),\n",
202 | " (\"human\", \"Interaction:\\n{interaction}\")\n",
203 | " ])\n",
204 | " summarizer = summary_prompt | llm\n",
205 | " episodic_summary = summarizer.invoke({\"interaction\": conversation}).content\n",
206 | " \n",
207 | "    new_doc = Document(page_content=episodic_summary, metadata={\"id\": uuid.uuid4().hex})\n",
208 | " episodic_vector_store.add_documents([new_doc])\n",
209 | " console.print(f\"[green]Episodic memory created:[/green] '{episodic_summary}'\")\n",
210 | " \n",
211 | " # 4b. Create Semantic Memory (Fact Extraction)\n",
212 | " console.print(\"--- Creating Semantic Memory (Graph) ---\")\n",
213 | " extraction_llm = llm.with_structured_output(KnowledgeGraph)\n",
214 | " extraction_prompt = ChatPromptTemplate.from_messages([\n",
215 | " (\"system\", \"You are a knowledge extraction expert. Your task is to identify key entities and their relationships from a conversation and model them as a graph. Focus on user preferences, goals, and stated facts.\"),\n",
216 | " (\"human\", \"Extract all relationships from this interaction:\\n{interaction}\")\n",
217 | " ])\n",
218 | " extractor = extraction_prompt | extraction_llm\n",
219 | " try:\n",
220 | " kg_data = extractor.invoke({\"interaction\": conversation})\n",
221 | " if kg_data.relationships:\n",
222 | "            for rel in kg_data.relationships:\n",
223 | "                # add_graph_documents expects GraphDocument objects, so write each fact with a Cypher MERGE instead\n",
224 | "                graph.query(f\"MERGE (a:`{rel.source.type}` {{id: $sid}}) MERGE (b:`{rel.target.type}` {{id: $tid}}) MERGE (a)-[:`{rel.type}`]->(b)\", {'sid': rel.source.id, 'tid': rel.target.id})\n",
224 | " console.print(f\"[green]Semantic memory created:[/green] Added {len(kg_data.relationships)} relationships to the graph.\")\n",
225 | " else:\n",
226 | " console.print(\"[yellow]No new semantic memories identified in this interaction.[/yellow]\")\n",
227 | " except Exception as e:\n",
228 | " console.print(f\"[red]Could not extract or save semantic memory: {e}[/red]\")\n",
229 | "\n",
230 | "if episodic_vector_store and graph:\n",
231 | " print(\"Memory components initialized successfully.\")"
232 | ]
233 | },
234 | {
235 | "cell_type": "markdown",
236 | "id": "phase2-title",
237 | "metadata": {},
238 | "source": [
239 | "## Phase 2: The Memory-Augmented Agent\n",
240 | "\n",
241 | "Now we'll build the agent that uses this memory system. We'll use LangGraph to define a clear, stateful workflow: retrieve memories, generate a response using those memories, and finally, update the memory with the latest interaction."
242 | ]
243 | },
244 | {
245 | "cell_type": "code",
246 | "execution_count": 4,
247 | "id": "agent-build-code",
248 | "metadata": {},
249 | "outputs": [
250 | {
251 | "name": "stdout",
252 | "output_type": "stream",
253 | "text": [
254 | "Memory-augmented agent graph compiled successfully.\n"
255 | ]
256 | }
257 | ],
258 | "source": [
259 | "# Define the state for our LangGraph agent\n",
260 | "class AgentState(TypedDict):\n",
261 | " user_input: str\n",
262 | " retrieved_memories: Optional[str]\n",
263 | " generation: str\n",
264 | "\n",
265 | "# Define the nodes of the graph\n",
266 | "\n",
267 | "def retrieve_memory(state: AgentState) -> Dict[str, Any]:\n",
268 | " \"\"\"Node that retrieves memories from both episodic and semantic stores.\"\"\"\n",
269 | " console.print(\"--- Retrieving Memories ---\")\n",
270 | " user_input = state['user_input']\n",
271 | " \n",
272 | " # Retrieve from episodic memory\n",
273 | " retrieved_docs = episodic_vector_store.similarity_search(user_input, k=2)\n",
274 | " episodic_memories = \"\\n\".join([doc.page_content for doc in retrieved_docs])\n",
275 | " \n",
276 | " # Retrieve from semantic memory\n",
277 | " # This is a simple retrieval; more advanced would involve entity extraction from the query\n",
278 | " try:\n",
279 | "        # NOTE: this assumes a fulltext index named 'entity' exists over node id properties,\n",
280 | "        # e.g. CREATE FULLTEXT INDEX entity FOR (n:User|Company|Sector|InvestmentPhilosophy) ON EACH [n.id]\n",
281 | "        # A more robust solution would extract entities from user_input first.\n",
282 | " semantic_memories = str(graph.query(\"\"\"\n",
283 | " UNWIND $keywords AS keyword\n",
284 | " CALL db.index.fulltext.queryNodes(\"entity\", keyword) YIELD node, score\n",
285 | " MATCH (node)-[r]-(related_node)\n",
286 | " RETURN node, r, related_node LIMIT 5\n",
287 | " \"\"\", {'keywords': user_input.split()}))\n",
288 | " except Exception as e:\n",
289 | " semantic_memories = f\"Could not query graph: {e}\"\n",
290 | " \n",
291 | " retrieved_content = f\"Relevant Past Conversations (Episodic Memory):\\n{episodic_memories}\\n\\nRelevant Facts (Semantic Memory):\\n{semantic_memories}\"\n",
292 | " console.print(f\"[cyan]Retrieved Context:\\n{retrieved_content}[/cyan]\")\n",
293 | " \n",
294 | " return {\"retrieved_memories\": retrieved_content}\n",
295 | "\n",
296 | "def generate_response(state: AgentState) -> Dict[str, Any]:\n",
297 | " \"\"\"Node that generates a response using the retrieved memories.\"\"\"\n",
298 | " console.print(\"--- Generating Response ---\")\n",
299 | " prompt = ChatPromptTemplate.from_messages([\n",
300 | " (\"system\", \"You are a helpful and personalized financial assistant. Use the retrieved memories to inform your response and tailor it to the user. If the memories indicate a user's preference (e.g., they are a conservative investor), you MUST respect it.\"),\n",
301 | " (\"human\", \"My question is: {user_input}\\n\\nHere are some memories that might be relevant:\\n{retrieved_memories}\")\n",
302 | " ])\n",
303 | " generator = prompt | llm\n",
304 | " generation = generator.invoke(state).content\n",
305 | " console.print(f\"[green]Generated Response:\\n{generation}[/green]\")\n",
306 | " return {\"generation\": generation}\n",
307 | "\n",
308 | "def update_memory(state: AgentState) -> Dict[str, Any]:\n",
309 | " \"\"\"Node that updates the memory with the latest interaction.\"\"\"\n",
310 | " console.print(\"--- Updating Memory ---\")\n",
311 | " create_memories(state['user_input'], state['generation'])\n",
312 | " return {}\n",
313 | "\n",
314 | "# Build the graph\n",
315 | "workflow = StateGraph(AgentState)\n",
316 | "\n",
317 | "workflow.add_node(\"retrieve\", retrieve_memory)\n",
318 | "workflow.add_node(\"generate\", generate_response)\n",
319 | "workflow.add_node(\"update\", update_memory)\n",
320 | "\n",
321 | "workflow.set_entry_point(\"retrieve\")\n",
322 | "workflow.add_edge(\"retrieve\", \"generate\")\n",
323 | "workflow.add_edge(\"generate\", \"update\")\n",
324 | "workflow.add_edge(\"update\", END)\n",
325 | "\n",
326 | "memory_agent = workflow.compile()\n",
327 | "print(\"Memory-augmented agent graph compiled successfully.\")"
328 | ]
329 | },
330 | {
331 | "cell_type": "markdown",
332 | "id": "phase3-title",
333 | "metadata": {},
334 | "source": [
335 | "## Phase 3: Demonstration & Inspection\n",
336 | "\n",
337 | "Let's see the agent in action. We'll simulate a multi-turn conversation. The first two turns will seed the memory. The third turn will test if the agent can use that memory for a personalized response. Finally, we'll directly inspect the databases to see the memories that were created."
338 | ]
339 | },
340 | {
341 | "cell_type": "code",
342 | "execution_count": 5,
343 | "id": "demo-code",
344 | "metadata": {},
345 | "outputs": [
346 | {
347 | "data": {
348 | "text/plain": [
349 | "\n",
350 | "--- 💬 INTERACTION 1: Seeding Memory ---\n"
351 | ]
352 | },
353 | "output_type": "display_data",
354 | "metadata": {}
355 | },
356 | {
357 | "name": "stdout",
358 | "output_type": "stream",
359 | "text": [
360 | "--- Retrieving Memories ---\n",
361 | "Retrieved Context:\n",
362 | "Relevant Past Conversations (Episodic Memory):\n",
363 | "Initial document to bootstrap the store\n",
364 | "\n",
365 | "Relevant Facts (Semantic Memory):\n",
366 | "[]\n",
367 | "--- Generating Response ---\n",
368 | "Generated Response:\n",
369 | "Hello, Alex! It's great to meet you. As a conservative investor, focusing on established tech companies with strong fundamentals is a very sound strategy. I can certainly help you navigate that space. What's on your mind today?\n",
370 | "--- Updating Memory ---\n",
371 | "--- Creating Episodic Memory (Summary) ---\n",
372 | "Episodic memory created: 'The user, Alex, introduced himself as a conservative investor interested in established tech companies.'\n",
373 | "--- Creating Semantic Memory (Graph) ---\n",
374 | "Semantic memory created: Added 2 relationships to the graph.\n"
375 | ]
376 | },
377 | {
378 | "data": {
379 | "text/plain": [
380 | "\n",
381 | "--- 💬 INTERACTION 2: Asking a specific question ---\n"
382 | ]
383 | },
384 | "output_type": "display_data",
385 | "metadata": {}
386 | },
387 | {
388 | "name": "stdout",
389 | "output_type": "stream",
390 | "text": [
391 | "--- Retrieving Memories ---\n",
392 | "Retrieved Context:\n",
393 | "Relevant Past Conversations (Episodic Memory):\n",
394 | "The user, Alex, introduced himself as a conservative investor interested in established tech companies.\n",
395 | "Initial document to bootstrap the store\n",
396 | "\n",
397 | "Relevant Facts (Semantic Memory):\n",
398 | "[]\n",
399 | "--- Generating Response ---\n",
400 | "Generated Response:\n",
401 | "Apple (AAPL) is often considered a cornerstone for conservative tech portfolios. It has a massive market capitalization, a very strong brand, consistent profitability, and a history of returning value to shareholders through dividends and buybacks. Its ecosystem of products creates a loyal customer base, which provides a stable revenue stream. For a conservative investor, it generally aligns well with the goal of capital preservation while still offering growth potential. Would you like a deeper dive into its recent performance or financials?\n",
402 | "--- Updating Memory ---\n",
403 | "--- Creating Episodic Memory (Summary) ---\n",
404 | "Episodic memory created: 'The user inquired about Apple (AAPL), and the assistant confirmed it's a suitable stock for conservative investors due to its stability and market position.'\n",
405 | "--- Creating Semantic Memory (Graph) ---\n",
406 | "Semantic memory created: Added 1 relationships to the graph.\n"
407 | ]
408 | },
409 | {
410 | "data": {
411 | "text/plain": [
412 | "\n",
413 | "--- 🧠 INTERACTION 3: THE MEMORY TEST ---\n"
414 | ]
415 | },
416 | "output_type": "display_data",
417 | "metadata": {}
418 | },
419 | {
420 | "name": "stdout",
421 | "output_type": "stream",
422 | "text": [
423 | "--- Retrieving Memories ---\n",
424 | "Retrieved Context:\n",
425 | "Relevant Past Conversations (Episodic Memory):\n",
426 | "The user, Alex, introduced himself as a conservative investor interested in established tech companies.\n",
427 | "The user inquired about Apple (AAPL), and the assistant confirmed it's a suitable stock for conservative investors due to its stability and market position.\n",
428 | "\n",
429 | "Relevant Facts (Semantic Memory):\n",
430 | "[{'node': {'type': 'User', 'id': 'Alex', 'properties': {}}, 'r': {'type': 'HAS_GOAL', 'properties': {}}, 'related_node': {'type': 'InvestmentPhilosophy', 'id': 'Conservative Investing', 'properties': {}}}, {'node': {'type': 'User', 'id': 'Alex', 'properties': {}}, 'r': {'type': 'INTERESTED_IN', 'properties': {}}, 'related_node': {'type': 'Sector', 'id': 'Tech', 'properties': {}}}]\n",
431 | "--- Generating Response ---\n",
432 | "Generated Response:\n",
433 | "Of course. Based on your stated goal of conservative investing in the tech sector, a great alternative to Apple (AAPL) would be Microsoft (MSFT).\n",
434 | "\n",
435 | "Here's why it fits your profile:\n",
436 | "1. **Diversification:** Like Apple, it's a mega-cap tech giant, but its revenue is more diversified across enterprise software (Azure, Office 365), gaming (Xbox), and hardware (Surface).\n",
437 | "2. **Strong Enterprise Focus:** Its dominance in cloud computing with Azure provides a consistent and growing revenue stream, which is a hallmark of a conservative investment.\n",
438 | "3. **Shareholder Value:** Microsoft has a long history of paying and increasing its dividends.\n",
439 | "\n",
440 | "It offers similar stability and blue-chip status to Apple but with different primary business drivers, making it an excellent choice for a conservative tech portfolio.\n",
441 | "--- Updating Memory ---\n",
442 | "--- Creating Episodic Memory (Summary) ---\n",
443 | "Episodic memory created: 'Based on the user's conservative investment goals, the assistant suggested Microsoft (MSFT) as a good alternative to Apple (AAPL), highlighting its diversification and enterprise focus.'\n",
444 | "--- Creating Semantic Memory (Graph) ---\n",
445 | "Semantic memory created: Added 2 relationships to the graph.\n"
446 | ]
447 | }
448 | ],
449 | "source": [
450 | "def run_interaction(query: str):\n",
451 | " result = memory_agent.invoke({\"user_input\": query})\n",
452 | " return result['generation']\n",
453 | "\n",
454 | "console.print(\"\\n--- 💬 INTERACTION 1: Seeding Memory ---\")\n",
455 | "run_interaction(\"Hi, my name is Alex. I'm a conservative investor, and I'm mainly interested in established tech companies.\")\n",
456 | "\n",
457 | "console.print(\"\\n--- 💬 INTERACTION 2: Asking a specific question ---\")\n",
458 | "run_interaction(\"What do you think about Apple (AAPL)?\")\n",
459 | "\n",
460 | "console.print(\"\\n--- 🧠 INTERACTION 3: THE MEMORY TEST ---\")\n",
461 | "run_interaction(\"Based on my goals, what's a good alternative to that stock?\")"
462 | ]
463 | },
464 | {
465 | "cell_type": "markdown",
466 | "id": "inspection-title",
467 | "metadata": {},
468 | "source": [
469 | "### Inspecting the Memory Stores\n",
470 | "\n",
471 | "Let's look under the hood. We can query our databases directly to see the memories our agent created."
472 | ]
473 | },
474 | {
475 | "cell_type": "code",
476 | "execution_count": 6,
477 | "id": "inspection-code",
478 | "metadata": {},
479 | "outputs": [
480 | {
481 | "data": {
482 | "text/plain": [
483 | "--- 🔍 Inspecting Episodic Memory (Vector Store) ---\n"
484 | ]
485 | },
486 | "output_type": "display_data",
487 | "metadata": {}
488 | },
489 | {
490 | "name": "stdout",
491 | "output_type": "stream",
492 | "text": [
493 | "1. The user, Alex, introduced himself as a conservative investor interested in established tech companies.\n",
494 | "2. Based on the user's conservative investment goals, the assistant suggested Microsoft (MSFT) as a good alternative to Apple (AAPL), highlighting its diversification and enterprise focus.\n",
495 | "3. The user inquired about Apple (AAPL), and the assistant confirmed it's a suitable stock for conservative investors due to its stability and market position.\n"
496 | ]
497 | },
498 | {
499 | "data": {
500 | "text/plain": [
501 | "\n",
502 | "--- 🕸️ Inspecting Semantic Memory (Graph Database) ---\n"
503 | ]
504 | },
505 | "output_type": "display_data",
506 | "metadata": {}
507 | },
508 | {
509 | "name": "stdout",
510 | "output_type": "stream",
511 | "text": [
512 | "Graph Schema:\n",
513 | "{'node_props': {'InvestmentPhilosophy': [{'property': 'id', 'type': 'STRING'}], 'Company': [{'property': 'id', 'type': 'STRING'}], 'Sector': [{'property': 'id', 'type': 'STRING'}], 'User': [{'property': 'id', 'type': 'STRING'}]}, 'rel_props': {}, 'relationships': [{'start': 'User', 'type': 'INTERESTED_IN', 'end': 'Sector'}, {'start': 'User', 'type': 'INTERESTED_IN', 'end': 'Company'}, {'start': 'User', 'type': 'HAS_GOAL', 'end': 'InvestmentPhilosophy'}]}\n",
514 | "\n",
515 | "Relationships in Graph:\n",
516 | "[{'n': {'id': 'Alex', 'type': 'User'}, 'r': {}, 'm': {'id': 'Conservative Investing', 'type': 'InvestmentPhilosophy'}}, {'n': {'id': 'Alex', 'type': 'User'}, 'r': {}, 'm': {'id': 'Tech', 'type': 'Sector'}}, {'n': {'id': 'Alex', 'type': 'User'}, 'r': {}, 'm': {'id': 'AAPL', 'type': 'Company'}}, {'n': {'id': 'Alex', 'type': 'User'}, 'r': {}, 'm': {'id': 'MSFT', 'type': 'Company'}}]\n"
517 | ]
518 | }
519 | ],
520 | "source": [
521 | "console.print(\"--- 🔍 Inspecting Episodic Memory (Vector Store) ---\")\n",
522 | "# We'll do a similarity search for a general concept to see what comes up\n",
523 | "retrieved_docs = episodic_vector_store.similarity_search(\"User's investment strategy\", k=3)\n",
524 | "for i, doc in enumerate(retrieved_docs):\n",
525 | " print(f\"{i+1}. {doc.page_content}\")\n",
526 | "\n",
527 | "console.print(\"\\n--- 🕸️ Inspecting Semantic Memory (Graph Database) ---\")\n",
528 | "print(f\"Graph Schema:\\n{graph.get_schema}\")\n",
529 | "\n",
530 | "# Cypher query to see who is interested in what\n",
531 | "query_result = graph.query(\"MATCH (n:User)-[r:INTERESTED_IN|HAS_GOAL]->(m) RETURN n, r, m\")\n",
532 | "print(f\"Relationships in Graph:\\n{query_result}\")"
533 | ]
534 | },
535 | {
536 | "cell_type": "markdown",
537 | "id": "conclusion",
538 | "metadata": {},
539 | "source": [
540 | "## Conclusion\n",
541 | "\n",
542 | "In this notebook, we built an agent with a two-part, long-term memory system. The final interaction demonstrates why this architecture matters:\n",
543 | "\n",
544 | "- **Stateless Failure:** A standard agent, when asked \"Based on my goals, what's a good alternative?\", would fail because it has no memory of the user's goals.\n",
545 | "- **Memory-Augmented Success:** Our agent succeeded because it could:\n",
546 | " 1. **Recall Episodically:** It retrieved the summary of the first conversation: \"The user, Alex, introduced himself as a conservative investor...\"\n",
547 | "   2. **Recall Semantically:** It queried the graph and found the structured fact: `(User: Alex) -[HAS_GOAL]-> (InvestmentPhilosophy: Conservative Investing)`.\n",
548 | " 3. **Synthesize:** It used this combined context to provide a highly relevant and personalized recommendation (Microsoft), explicitly referencing the user's conservative goals.\n",
549 | "\n",
550 | "This combination of recalling *what happened* (episodic) and *what is known* (semantic) is a powerful paradigm for moving beyond simple, transactional agents to create true learning companions. While managing this memory at scale presents challenges like pruning and consolidation, the foundational architecture we've built here is a significant step toward more intelligent and personalized AI systems."
551 | ]
552 | }
553 | ],
554 | "metadata": {
555 | "kernelspec": {
556 | "display_name": "Python 3 (ipykernel)",
557 | "language": "python",
558 | "name": "python3"
559 | },
560 | "language_info": {
561 | "codemirror_mode": {
562 | "name": "ipython",
563 | "version": 3
564 | },
565 | "file_extension": ".py",
566 | "mimetype": "text/x-python",
567 | "name": "python",
568 | "nbconvert_exporter": "python",
569 | "pygments_lexer": "ipython3",
570 | "version": "3.10.13"
571 | }
572 | },
573 | "nbformat": 4,
574 | "nbformat_minor": 5
575 | }
--------------------------------------------------------------------------------