├── web_search_agent
│   ├── __init__.py
│   └── agent.py
├── .env.template
├── .gitattributes
├── LICENSE
└── README.md

/web_search_agent/__init__.py:
--------------------------------------------------------------------------------
from . import agent
--------------------------------------------------------------------------------
/.env.template:
--------------------------------------------------------------------------------
GOOGLE_GENAI_USE_VERTEXAI="False"
GOOGLE_API_KEY="GEMINI_API_KEY"
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 MeirKaD

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Web Search Agent using Google ADK and Bright Data MCP

This repository contains a web search agent built with Google's Agent Development Kit (ADK) and Bright Data's Model Context Protocol (MCP) server. The agent can search the web and retrieve information based on user queries.

## Prerequisites

- Python 3.12 or later
- Node.js and npm (for Bright Data MCP)
- Google Gemini API key
- Bright Data account with an active Web Unlocker API zone (for browser capabilities, a Scraping Browser zone is required as well)

## Installation

### 1. Clone the repository

```bash
git clone https://github.com/MeirKaD/MCP_ADK.git
cd MCP_ADK
```

### 2. Create and activate a virtual environment

```bash
# For macOS/Linux
python -m venv .venv
source .venv/bin/activate

# For Windows
python -m venv .venv
.venv\Scripts\activate
```

### 3. Install the required packages

```bash
pip install google-adk google-generativeai python-dotenv
```

### 4. Install the Bright Data MCP package

```bash
npm install -g @brightdata/mcp
```
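Before continuing, you can sanity-check the toolchain. All of the following standard version commands should succeed, with Python reporting 3.12 or later:

```bash
node --version
npm --version
python --version
```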
### 5. Set up environment variables

Create a `.env` file in the root directory by copying the `.env.template`:

```bash
cp .env.template .env
```

Then, edit the `.env` file and add your Google Gemini API key:

```
GOOGLE_GENAI_USE_VERTEXAI="False"
GOOGLE_API_KEY="YOUR_GEMINI_API_KEY"
```

### 6. Configure Bright Data MCP credentials

Edit the `web_search_agent/agent.py` file and replace the placeholders with your Bright Data credentials:

```python
"API_TOKEN": "YOUR_BRIGHT_DATA_API_TOKEN",
"WEB_UNLOCKER_ZONE": "unblocker",
"BROWSER_AUTH": "brd-customer-YOUR_CUSTOMER_ID-zone-scraping_browser:YOUR_PASSWORD"
```

## Running the Agent with the ADK Web Interface

### 1. Start the ADK Web Server

```bash
adk web
```

This starts a local web server, typically at `http://localhost:8000`.

### 2. Access the Web Interface

Open your browser and navigate to `http://localhost:8000` to interact with your agent through the ADK web interface.

## How the Agent Works

The agent is built using Google's Agent Development Kit (ADK) and uses Gemini 2.0 Flash as the underlying model. It leverages Bright Data's Model Context Protocol (MCP) server to perform web searches and retrieve information from websites.

The agent initializes the MCP toolset asynchronously when the first request is received, connecting to Bright Data's services to enable web search capabilities.
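Concretely, `web_search_agent/agent.py` (reproduced in full below) wires three Gemini-backed agents into a sequential pipeline. The following is a simplified sketch of that structure; the real file adds detailed instructions, the MCP tool wiring, and a retry callback:

```python
from google.adk.agents import Agent, SequentialAgent

# Three stages run in order: plan -> research -> publish.
root_agent = SequentialAgent(
    name="web_research_agent",
    description="Researches topics on the web and writes a report",
    sub_agents=[
        Agent(name="planner", model="gemini-2.0-flash",
              description="Breaks the topic into focused search queries"),
        Agent(name="researcher", model="gemini-2.0-flash",
              description="Runs the searches via Bright Data MCP tools"),
        Agent(name="publisher", model="gemini-2.0-flash",
              description="Synthesizes the findings into a final report"),
    ],
)
```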
## Features

- Web search using Bright Data MCP
- Information retrieval from websites
- Answering questions based on web content
- Automatic cleanup of resources when the agent terminates

## Customization

You can customize the agent's behavior by modifying the `web_search_agent/agent.py` file:

- Change the model by updating the `model` parameter (see the sketch below)
- Modify the agent's description and instructions
- Add additional tools or capabilities
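For example, switching the planner to a different model only requires editing one argument in `create_planner_agent`. A minimal sketch (the model ID is illustrative; use any Gemini model your API key can access):

```python
return Agent(
    name="planner",
    model="gemini-2.5-flash",  # illustrative model ID; swap in the one you want
    description="Plans research by breaking down complex topics into search queries",
    instruction="...",  # unchanged from the original
    output_key="search_queries",
)
```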
## Troubleshooting

If you encounter issues:

1. Ensure your Google Gemini API key is valid
2. Check your Bright Data credentials
3. Verify that Node.js and npm are correctly installed
4. Make sure you have the correct version of Python and all required packages

## License

MIT

## Acknowledgements

- Google Agent Development Kit (ADK)
- Bright Data MCP
--------------------------------------------------------------------------------
/web_search_agent/agent.py:
--------------------------------------------------------------------------------
import threading
import asyncio
from dotenv import load_dotenv
from google.adk.agents import Agent, SequentialAgent
from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse
from google.genai import types
from typing import Optional

load_dotenv()

# Module-level state for the lazily initialized MCP toolset.
_mcp_tools = None
_exit_stack = None
_initialized = False
_initialization_in_progress = False
_init_lock = threading.Lock()

print("Module loaded: web_research_agent")

def create_planner_agent():
    return Agent(
        name="planner",
        model="gemini-2.0-flash",
        description="Plans research by breaking down complex topics into search queries",
        instruction="""
        You are a research planning expert. Your task is to:
        1. Analyze the user's research topic
        2. Break it down into 3-5 specific search queries that together will cover the topic comprehensively
        3. Output a JSON object with format: {"queries": ["query1", "query2", "query3"]}
        Be concise and focused in your search queries.
        """,
        output_key="search_queries"
    )
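# For reference, the planner's output (stored under the "search_queries"
# output_key) is expected to look like this -- illustrative values only:
#   {"queries": ["history of topic X", "current state of X", "open problems in X"]}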
# Define researcher agent with improved tool guidance
def create_researcher_agent():
    return Agent(
        name="researcher",
        model="gemini-2.0-flash",
        description="Executes web searches and extracts relevant information",
        instruction="""
        You are a web researcher. You will:
        1. Take the specific search queries from the planner
        2. For EACH query:
           a. Use search_engine to find relevant information (start with "google" engine)
           b. Select the 2-3 most relevant results and for each result:
              i. Use scraping_browser_navigate to navigate to the URL
              ii. Use scraping_browser_get_text to extract the main content
              iii. If needed, use scraping_browser_links to find important sections and scraping_browser_click to navigate to them
           c. If a page fails to load or lacks information, try another result
        3. Summarize key findings for each query with source citations

        IMPORTANT:
        - Always begin with search_engine to discover relevant pages
        - Then use browser tools in this sequence:
          1. scraping_browser_navigate (to go to the URL)
          2. scraping_browser_get_text (to extract content)
          3. scraping_browser_links and scraping_browser_click (if you need to navigate within the site)
        - Include clear citations with URLs for each piece of information
        - Format your findings for each search query separately
        """,
        before_model_callback=check_researcher_tools
    )

# Define publisher agent with clear instruction
def create_publisher_agent():
    return Agent(
        name="publisher",
        model="gemini-2.0-flash",
        description="Synthesizes research findings into a comprehensive and detailed final document",
        instruction="""
        You are an expert Technical Writer and Synthesist. Your mission is to transform the detailed research findings provided by the researcher into a comprehensive, well-structured, and insightful final report.

        Follow these steps meticulously:
        1. **Deep Analysis & Synthesis:** Carefully review *all* the research findings, summaries, and cited sources provided by the researcher for *all* search queries. Do not just list findings; **synthesize** them. Identify connections, relationships, common themes, contrasting points, and overall patterns across the different pieces of information and sources.
        2. **Logical Structure:** Organize the synthesized information into a coherent and deeply structured document. Use logical sections and sub-sections with clear, descriptive headings (using Markdown H2, H3, etc.) to group related concepts and findings. A possible structure could be: Introduction, Key Theme/Aspect 1 (with sub-points), Key Theme/Aspect 2 (with sub-points), ..., Conclusion, References. Adapt the structure based on the content.
        3. **Compelling Introduction:** Write a robust introduction that clearly defines the topic, states the report's main objectives, highlights the key questions or areas explored, and provides a roadmap for the reader, outlining the main sections of the report.
        4. **Detailed Body Sections:** Elaborate on the synthesized findings within each section. Provide sufficient detail and explanation. Explain concepts clearly. Ensure that claims and statements are directly supported by the research gathered by the researcher. **Explicitly reference the source URLs** where appropriate within the text (e.g., "According to [Source URL], ..."). Aim for thoroughness and depth, ensuring all significant aspects uncovered by the research are included. Use bullet points or numbered lists for clarity where appropriate. Ensure smooth transitions between paragraphs and sections.
        5. **Insightful Conclusion:** Craft a strong conclusion that summarizes the most important findings and synthesized insights from the report. Briefly reiterate the main points discussed. You may also briefly mention limitations based *only* on the provided research or suggest natural next steps *if strongly implied* by the findings, but do *not* introduce entirely new information or opinions.
        6. **Professional Formatting:** Format the entire document using clean and consistent Markdown. Utilize headings, lists (bulleted and numbered), bold/italic emphasis, and potentially blockquotes effectively to enhance readability and structure.
        7. **Comprehensive References:** Create a dedicated "References" section at the very end. List *all* unique source URLs that were cited in the researcher's findings and used in your report. Ensure the list is clean and easy to read.
        8. **Tone and Quality:** Maintain a professional, objective, and informative tone throughout the report. Ensure the language is clear, precise, and accurate according to the research. Strive for a high-quality, polished final document that is significantly more detailed and synthesized than the raw researcher output. Cover all key aspects comprehensively.
        """,
        output_key="final_document"
    )

# Create a single initialization function that leverages the EXISTING event loop
async def initialize_mcp_tools():
    """Initialize MCP tools using the existing event loop."""
    global _mcp_tools, _exit_stack, _initialized, _initialization_in_progress

    if _initialized:
        return _mcp_tools

    # Decide under the lock whether this caller should perform the
    # initialization. The lock is never held across an await: sleeping while
    # holding a threading.Lock would stall other coroutines on the same loop.
    with _init_lock:
        if _initialized:
            return _mcp_tools
        should_wait = _initialization_in_progress
        if not should_wait:
            _initialization_in_progress = True

    if should_wait:
        while _initialization_in_progress:
            await asyncio.sleep(0.1)
        return _mcp_tools

    try:
        from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StdioServerParameters

        print("Connecting to Bright Data MCP...")
        tools, exit_stack = await MCPToolset.from_server(
            connection_params=StdioServerParameters(
                command='npx',
                args=["-y", "@brightdata/mcp"],
                env={
                    "API_TOKEN": "YOUR_API_TOKEN",
                    "WEB_UNLOCKER_ZONE": "UB_ZONE",
                    "BROWSER_AUTH": "SBR_USER:SBR_PASS"
                }
            )
        )
        print(f"MCP Toolset created successfully with {len(tools)} tools")

        _mcp_tools = tools
        _exit_stack = exit_stack

        import atexit

        def cleanup_mcp():
            global _exit_stack
            if _exit_stack:
                print("Closing MCP server connection...")
                try:
                    # atexit runs after the main event loop is gone, so close
                    # the exit stack on a fresh loop.
                    loop = asyncio.new_event_loop()
                    loop.run_until_complete(_exit_stack.aclose())
                    loop.close()
                    print("MCP server connection closed successfully.")
                except Exception as e:
                    print(f"Error closing MCP connection: {e}")
                finally:
                    _exit_stack = None

        atexit.register(cleanup_mcp)

        _initialized = True

        # Attach the tools to the researcher sub-agent; root_agent is defined
        # at the bottom of this module and exists by the time this runs.
        for agent in root_agent.sub_agents:
            if agent.name == "researcher":
                agent.tools = tools
                print(f"Successfully added {len(tools)} tools to researcher agent")

                # List some tool names for debugging
                tool_names = [tool.name for tool in tools[:5]]
                print(f"Available tools include: {', '.join(tool_names)}")
                break

        print("MCP initialization complete!")
        return tools

    except Exception as e:
        print(f"Error initializing MCP tools: {e}")
        return None
    finally:
        _initialization_in_progress = False


async def wait_for_initialization():
    """Wait for MCP initialization to complete."""
    global _initialized

    if not _initialized:
        print("Starting initialization in callback...")
        await initialize_mcp_tools()

    return _initialized
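# Summary of the lazy-initialization flow (descriptive comment, not executed):
#   1. The first request that reaches the researcher triggers
#      check_researcher_tools (below), which sees _initialized is False.
#   2. The callback schedules initialize_mcp_tools() as a background task on
#      the running event loop and short-circuits the LLM call with a
#      "please retry" message.
#   3. On the next request the MCP tools are already attached to the
#      researcher, so the callback returns None and the model runs normally.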
def check_researcher_tools(callback_context: CallbackContext, llm_request: LlmRequest) -> Optional[LlmResponse]:
    global _mcp_tools, _initialized

    agent_name = callback_context.agent_name

    if agent_name == "researcher" and not _initialized:
        print("Researcher agent needs tools - will start initialization")

        # The callback is invoked from within ADK's event loop, so schedule
        # the initialization as a background task on that loop.
        loop = asyncio.get_running_loop()
        loop.create_task(initialize_mcp_tools())

        print("Initialization started in background. Asking user to retry.")
        return LlmResponse(
            content=types.Content(
                role="model",
                parts=[types.Part(text="Initializing research tools. This happens only once. Please try your query again in a few moments.")]
            )
        )

    return None

root_agent = SequentialAgent(
    name="web_research_agent",
    description="An agent that researches topics on the web and creates comprehensive reports",
    sub_agents=[
        create_planner_agent(),
        create_researcher_agent(),
        create_publisher_agent()
    ]
)

print("Agent structure created. MCP tools will be initialized on first use.")
--------------------------------------------------------------------------------