├── .gitignore
├── Integrations
│   └── AutoGen
│       ├── .env.example
│       ├── autogen_flowise_ai_agent.ipynb
│       ├── autogen_langchain_uniswap_agent_speak.ipynb
│       ├── autogen_langchain_uniswap_ai_agent.ipynb
│       ├── autogen_with_open_source_llms.ipynb
│       ├── comprehensive_overview_of_large_language_models.pdf
│       └── uniswap_v3.pdf
├── LICENSE
├── README.md
├── Retrievers
│   ├── 01_MultiQuery_Retriever.ipynb
│   ├── 02_Contextual_Compression_Retriever.ipynb
│   ├── 03_Ensemble_Retriever.ipynb
│   ├── 04_MultiVector_Retriever
│   │   ├── 04_MultiVector_Retriever.ipynb
│   │   ├── README.md
│   │   ├── hypothetical_questions.png
│   │   ├── smaller_chunks.png
│   │   └── summary.png
│   ├── 05_SelfQuerying_Retriever
│   │   ├── 05_Self_Querying_Retriever.ipynb
│   │   └── README.md
│   ├── 06_Web_Research_Retriever
│   │   ├── 06_Web_Research_Retriever.ipynb
│   │   └── README.md
│   └── README.md
├── claude_3_xml_agent.ipynb
└── langchain_with_openai_assistants_api.ipynb
/.gitignore:
--------------------------------------------------------------------------------
1 | */**/OAI_CONFIG_LIST
2 | */**/.env
3 | */**/pyht
4 | */**/.cache
--------------------------------------------------------------------------------
/Integrations/AutoGen/.env.example:
--------------------------------------------------------------------------------
1 | PLAY_HT_USER_ID=
2 | PLAY_HT_API_KEY=
--------------------------------------------------------------------------------
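The speak notebook below reads these PlayHT credentials from the environment. A minimal sketch of loading them, assuming `.env.example` has been copied to `.env` and filled in, and that `python-dotenv` is installed:

```python
# Minimal sketch: load the PlayHT credentials declared in .env.example.
# Assumes .env.example was copied to .env and filled in with real values.
import os
from dotenv import load_dotenv

load_dotenv()  # reads PLAY_HT_USER_ID and PLAY_HT_API_KEY into os.environ
user_id = os.environ["PLAY_HT_USER_ID"]
api_key = os.environ["PLAY_HT_API_KEY"]
```

--------------------------------------------------------------------------------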
/Integrations/AutoGen/autogen_flowise_ai_agent.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "colab_type": "text",
7 | "id": "view-in-github"
8 | },
9 | "source": [
10 | "
"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {
16 | "id": "39_S0tWcB-oh"
17 | },
18 | "source": [
19 | "# AutoGen + Flowise Use Case\n",
20 | "\n",
21 | "**`AutoGen`** is a versatile framework that facilitates the creation of LLM applications by employing multiple agents capable of interacting with one another to tackle tasks. These AutoGen agents can be tailored to specific needs, engage in conversations, and seamlessly integrate human participation.\n",
22 | "\n",
23 | "**`Flowise`** is an open source UI visual tool to build customized LLM flows using LangchainJS, written in Node Typescript/Javascript.\n",
24 | "\n",
25 | "Flowise has been the leading no-code platform for Langchain based LLM workflow development. You may have already built great features on it.\n",
26 | "\n",
27 | "In this tutorial, I will show you how to integrate Flowise flows into AutoGen. You will be able to empower your AutoGen agents with the capability of your Flowise flow, with minimal effort."
28 | ]
29 | },
30 | {
31 | "cell_type": "markdown",
32 | "metadata": {
33 | "id": "NtvjgnBZZjUL"
34 | },
35 | "source": [
36 | "## Use Case - Uniswap Protocol AI Agent with Flowise Flow\n",
37 | "\n",
38 | "`Uniswap` is a decentralized exchange that allows users to trade Ethereum-based tokens.\n",
39 | "\n",
40 | "In this tutorial, I will walk you through the steps to simplify the AI agent we built in the previous tutorial - [AutoGen + LangChain Use Case - Uniswap Protocol AI Agent](./autogen_langchain_uniswap_ai_agent.ipynb). This agent can execute tasks that involve Uniswap knowledge.\n",
41 | "\n",
42 | "We will need much less steps if we have already implemented the QA chain as a Flowise flow.\n",
43 | "\n",
44 | "1. Fetch the Python code snippet for Flowise integration.\n",
45 | "2. Define a function `answer_flowise_uniswap_question`.\n",
46 | "3. Validate the `answer_flowise_uniswap_question` function.\n",
47 | "4. Set up AutoGen user agent and assistant agent with function calling enabled.\n",
48 | "\n",
49 | " In the function mapping, the function defined in step 3 is changed to `answer_flowise_uniswap_question`."
50 | ]
51 | },
52 | {
53 | "cell_type": "markdown",
54 | "metadata": {
55 | "id": "z73o7bmtb5LH"
56 | },
57 | "source": [
58 | "### Environment Preparation"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 1,
64 | "metadata": {
65 | "execution": {
66 | "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
67 | "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
68 | "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
69 | "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
70 | },
71 | "id": "1VRZnGGGgkhl"
72 | },
73 | "outputs": [
74 | {
75 | "name": "stdout",
76 | "output_type": "stream",
77 | "text": [
78 | "Note: you may need to restart the kernel to use updated packages.\n"
79 | ]
80 | }
81 | ],
82 | "source": [
83 | "%pip install pyautogen~=0.1.0 docker -q -U"
84 | ]
85 | },
86 | {
87 | "cell_type": "code",
88 | "execution_count": 2,
89 | "metadata": {
90 | "id": "HZ7w_A3nXU8-"
91 | },
92 | "outputs": [],
93 | "source": [
94 | "import autogen\n",
95 | "\n",
96 | "config_list = autogen.config_list_from_json(\n",
97 | " \"OAI_CONFIG_LIST\",\n",
98 | " filter_dict={\n",
99 | " \"model\": [\"gpt-4\"],\n",
100 | " },\n",
101 | ")\n",
102 | "#\n",
103 | "# Sample content of OAI_CONFIG_LIST file below:\n",
104 | "#\n",
105 | "# [\n",
106 | "# {\n",
107 | "# \"model\": \"gpt-4\",\n",
108 | "# \"api_key\": \"your openai api key\"\n",
109 | "# }\n",
110 | "# ]\n",
111 | "#"
112 | ]
113 | },
114 | {
115 | "cell_type": "markdown",
116 | "metadata": {
117 | "id": "BA48TH6Hc_3c"
118 | },
119 | "source": [
120 | "### Steps"
121 | ]
122 | },
123 | {
124 | "cell_type": "markdown",
125 | "metadata": {},
126 | "source": [
127 | "#### 1. Fetch the Python code snippet for Flowise integration\n",
128 | "\n",
129 | "Click the API Endpoint on the top right corner of your Flowise canvas, and find the Python code snippet similar as below.\n",
130 | "\n",
131 | "```python\n",
132 | "import requests\n",
133 | "\n",
134 | "API_URL = \"http://localhost:4000/api/v1/prediction/433ed37e-9546-4e73-a688-7352b78bf852\"\n",
135 | "\n",
136 | "def query(payload):\n",
137 | " response = requests.post(API_URL, json=payload)\n",
138 | " return response.json()\n",
139 | " \n",
140 | "output = query({\n",
141 | " \"question\": \"Hey, how are you?\",\n",
142 | "})\n",
143 | "```\n",
144 | "\n",
145 | "We will reuse it in the AutoGen code base."
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {
151 | "id": "O7VPXVI_coX4"
152 | },
153 | "source": [
154 | "#### 2. Define a function `answer_flowise_uniswap_question`\n",
155 | "\n",
156 | "It takes a parameter `question`, makes API call to the established Flowise flow, and answer it with the API response."
157 | ]
158 | },
159 | {
160 | "cell_type": "code",
161 | "execution_count": 3,
162 | "metadata": {
163 | "id": "yPThjcdFT1Lw"
164 | },
165 | "outputs": [],
166 | "source": [
167 | "import requests\n",
168 | "\n",
169 | "API_URL = \"http://localhost:4000/api/v1/prediction/433ed37e-9546-4e73-a688-7352b78bf852\"\n",
170 | "\n",
171 | "def answer_flowise_uniswap_question(question):\n",
172 | " response = requests.post(API_URL, json={ \"question\": question })\n",
173 | " return response.json()"
174 | ]
175 | },
176 | {
177 | "cell_type": "markdown",
178 | "metadata": {},
179 | "source": [
180 | "#### 3. Validate the `answer_flowise_uniswap_question` function"
181 | ]
182 | },
183 | {
184 | "cell_type": "code",
185 | "execution_count": 4,
186 | "metadata": {},
187 | "outputs": [
188 | {
189 | "data": {
190 | "text/plain": [
191 | "'The main changes in Uniswap v3 include the removal of the need for users to track previous values of the accumulator externally, the introduction of multiple pools for each pair of tokens with different swap fees, the implementation of concentrated liquidity within a price range, the use of geometric mean TWAP instead of arithmetic mean TWAP, and the addition of a liquidity accumulator for external contracts to implement liquidity mining.'"
192 | ]
193 | },
194 | "execution_count": 4,
195 | "metadata": {},
196 | "output_type": "execute_result"
197 | }
198 | ],
199 | "source": [
200 | "answer_flowise_uniswap_question(\"What are the main changes in Uniswap v3?\")"
201 | ]
202 | },
203 | {
204 | "cell_type": "markdown",
205 | "metadata": {
206 | "id": "Wu7gjAv-c4uP"
207 | },
208 | "source": [
209 | "#### 4. Set up AutoGen user agent and assistant agent with function calling enabled."
210 | ]
211 | },
212 | {
213 | "cell_type": "code",
214 | "execution_count": 5,
215 | "metadata": {
216 | "id": "ZsXuHf1fgkhl"
217 | },
218 | "outputs": [],
219 | "source": [
220 | "llm_config={\n",
221 | " \"request_timeout\": 600,\n",
222 | " \"seed\": 42,\n",
223 | " \"config_list\": config_list,\n",
224 | " \"temperature\": 0,\n",
225 | " \"functions\": [\n",
226 | " {\n",
227 | " \"name\": \"answer_flowise_uniswap_question\",\n",
228 | " \"description\": \"Answer any Uniswap related questions\",\n",
229 | " \"parameters\": {\n",
230 | " \"type\": \"object\",\n",
231 | " \"properties\": {\n",
232 | " \"question\": {\n",
233 | " \"type\": \"string\",\n",
234 | " \"description\": \"The question to ask in relation to Uniswap protocol\",\n",
235 | " }\n",
236 | " },\n",
237 | " \"required\": [\"question\"],\n",
238 | " },\n",
239 | " }\n",
240 | " ],\n",
241 | "}"
242 | ]
243 | },
244 | {
245 | "cell_type": "code",
246 | "execution_count": 6,
247 | "metadata": {
248 | "id": "eg9g65hOgkhm"
249 | },
250 | "outputs": [],
251 | "source": [
252 | "# create an AssistantAgent instance named \"assistant\"\n",
253 | "assistant = autogen.AssistantAgent(\n",
254 | " name=\"assistant\",\n",
255 | " llm_config=llm_config,\n",
256 | ")\n",
257 | "# create a UserProxyAgent instance named \"user_proxy\"\n",
258 | "user_proxy = autogen.UserProxyAgent(\n",
259 | " name=\"user_proxy\",\n",
260 | " human_input_mode=\"NEVER\",\n",
261 | " max_consecutive_auto_reply=10,\n",
262 | " code_execution_config={\"work_dir\": \".\"},\n",
263 | " llm_config=llm_config,\n",
264 | " system_message=\"\"\"Reply TERMINATE if the task has been solved at full satisfaction.\n",
265 | "Otherwise, reply CONTINUE, or the reason why the task is not solved yet.\"\"\",\n",
266 | " function_map={\"answer_flowise_uniswap_question\": answer_flowise_uniswap_question}\n",
267 | ")"
268 | ]
269 | },
270 | {
271 | "cell_type": "markdown",
272 | "metadata": {
273 | "id": "37cRtpqLdLSZ"
274 | },
275 | "source": [
276 | "### It's time to start a task for the agents.\n",
277 | "\n",
278 | "Now, let's user the user agent to ask the agents to write an introduction blog for `Uniswap` protocol v3."
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": 7,
284 | "metadata": {
285 | "colab": {
286 | "base_uri": "https://localhost:8080/"
287 | },
288 | "id": "aCdAqig3gkhn",
289 | "outputId": "0deb286b-a1e4-4b56-e8c4-4ab7ea33173f"
290 | },
291 | "outputs": [
292 | {
293 | "name": "stdout",
294 | "output_type": "stream",
295 | "text": [
296 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
297 | "\n",
298 | "\n",
299 | "I'm writing a blog to introduce the version 3 of Uniswap protocol. Find the answers to the 3 questions below and write an introduction based on them.\n",
300 | "\n",
301 | "1. What is Uniswap?\n",
302 | "2. What are the main changes in Uniswap version 3?\n",
303 | "3. How to use Uniswap?\n",
304 | "\n",
305 | "Start the work now.\n",
306 | "\n",
307 | "\n",
308 | "--------------------------------------------------------------------------------\n",
309 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
310 | "\n",
311 | "\u001b[32m***** Suggested function Call: answer_flowise_uniswap_question *****\u001b[0m\n",
312 | "Arguments: \n",
313 | "\n",
314 | "{\n",
315 | " \"question\": \"What is Uniswap?\"\n",
316 | "}\n",
317 | "\u001b[32m********************************************************************\u001b[0m\n",
318 | "\n",
319 | "--------------------------------------------------------------------------------\n",
320 | "\u001b[35m\n",
321 | ">>>>>>>> EXECUTING FUNCTION answer_flowise_uniswap_question...\u001b[0m\n",
322 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
323 | "\n",
324 | "\u001b[32m***** Response from calling function \"answer_flowise_uniswap_question\" *****\u001b[0m\n",
325 | "Uniswap is a decentralized exchange protocol that allows users to trade ERC-20 tokens directly from their wallets. In Uniswap version 3, the main change is the introduction of concentrated liquidity, which allows liquidity to be bounded within a specific price range. This is in contrast to earlier versions where liquidity was distributed uniformly along the reserves curve. Uniswap version 3 also introduces multiple pools for each pair of tokens, each with a different swap fee, and includes improvements to the time-weighted average price (TWAP) oracle.\n",
326 | "\u001b[32m****************************************************************************\u001b[0m\n",
327 | "\n",
328 | "--------------------------------------------------------------------------------\n",
329 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
330 | "\n",
331 | "\u001b[32m***** Suggested function Call: answer_flowise_uniswap_question *****\u001b[0m\n",
332 | "Arguments: \n",
333 | "\n",
334 | "{\n",
335 | " \"question\": \"What are the main changes in Uniswap version 3?\"\n",
336 | "}\n",
337 | "\u001b[32m********************************************************************\u001b[0m\n",
338 | "\n",
339 | "--------------------------------------------------------------------------------\n",
340 | "\u001b[35m\n",
341 | ">>>>>>>> EXECUTING FUNCTION answer_flowise_uniswap_question...\u001b[0m\n",
342 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
343 | "\n",
344 | "\u001b[32m***** Response from calling function \"answer_flowise_uniswap_question\" *****\u001b[0m\n",
345 | "The main changes in Uniswap version 3 include the removal of the need for users to track previous values of the accumulator externally, the introduction of multiple pools for each pair of tokens with different swap fees, the implementation of concentrated liquidity bounded within a price range, tracking the sum of log prices for computing geometric mean TWAP, and the addition of a liquidity accumulator for implementing liquidity mining.\n",
346 | "\u001b[32m****************************************************************************\u001b[0m\n",
347 | "\n",
348 | "--------------------------------------------------------------------------------\n",
349 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
350 | "\n",
351 | "\u001b[32m***** Suggested function Call: answer_flowise_uniswap_question *****\u001b[0m\n",
352 | "Arguments: \n",
353 | "\n",
354 | "{\n",
355 | " \"question\": \"How to use Uniswap?\"\n",
356 | "}\n",
357 | "\u001b[32m********************************************************************\u001b[0m\n",
358 | "\n",
359 | "--------------------------------------------------------------------------------\n",
360 | "\u001b[35m\n",
361 | ">>>>>>>> EXECUTING FUNCTION answer_flowise_uniswap_question...\u001b[0m\n",
362 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
363 | "\n",
364 | "\u001b[32m***** Response from calling function \"answer_flowise_uniswap_question\" *****\u001b[0m\n",
365 | "Based on the provided context, there are no specific instructions on how to use Uniswap.\n",
366 | "\u001b[32m****************************************************************************\u001b[0m\n",
367 | "\n",
368 | "--------------------------------------------------------------------------------\n",
369 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
370 | "\n",
371 | "Uniswap is a decentralized exchange protocol built on the Ethereum blockchain, allowing users to directly trade ERC-20 tokens from their wallets. It has revolutionized the way users interact with DeFi protocols by providing a platform that is permissionless, transparent, and open to all.\n",
372 | "\n",
373 | "The latest version, Uniswap V3, introduces significant changes to enhance the trading experience. The most notable change is the introduction of concentrated liquidity. Unlike previous versions where liquidity was distributed uniformly along the price curve, V3 allows liquidity providers to concentrate their capital within specific price ranges, increasing capital efficiency.\n",
374 | "\n",
375 | "Another major change in Uniswap V3 is the introduction of multiple fee tiers for each token pair. This allows liquidity providers to be compensated appropriately for taking on varying degrees of risk. Additionally, improvements have been made to the time-weighted average price (TWAP) oracle, removing the need for users to track previous values of the accumulator externally.\n",
376 | "\n",
377 | "To use Uniswap, users need to connect their Ethereum wallet (like MetaMask) to the Uniswap interface. They can then select the tokens they wish to swap, input the amount, and confirm the transaction. Liquidity providers can also add their tokens to the liquidity pool of their choice and earn fees from trades.\n",
378 | "\n",
379 | "Please note that while using Uniswap, users should be aware of potential risks such as impermanent loss and smart contract vulnerabilities. Always do your own research and consider your risk tolerance when interacting with DeFi protocols.\n",
380 | "\n",
381 | "TERMINATE\n",
382 | "\n",
383 | "--------------------------------------------------------------------------------\n",
384 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
385 | "\n",
386 | "TERMINATE\n",
387 | "\n",
388 | "--------------------------------------------------------------------------------\n"
389 | ]
390 | }
391 | ],
392 | "source": [
393 | "# the assistant receives a message from the user, which contains the task description\n",
394 | "user_proxy.initiate_chat(\n",
395 | " assistant,\n",
396 | " message=\"\"\"\n",
397 | "I'm writing a blog to introduce the version 3 of Uniswap protocol. Find the answers to the 3 questions below and write an introduction based on them.\n",
398 | "\n",
399 | "1. What is Uniswap?\n",
400 | "2. What are the main changes in Uniswap version 3?\n",
401 | "3. How to use Uniswap?\n",
402 | "\n",
403 | "Start the work now.\n",
404 | "\"\"\"\n",
405 | ")"
406 | ]
407 | }
408 | ],
409 | "metadata": {
410 | "colab": {
411 | "include_colab_link": true,
412 | "provenance": []
413 | },
414 | "kernelspec": {
415 | "display_name": "Python 3",
416 | "language": "python",
417 | "name": "python3"
418 | },
419 | "language_info": {
420 | "codemirror_mode": {
421 | "name": "ipython",
422 | "version": 3
423 | },
424 | "file_extension": ".py",
425 | "mimetype": "text/x-python",
426 | "name": "python",
427 | "nbconvert_exporter": "python",
428 | "pygments_lexer": "ipython3",
429 | "version": "3.11.4"
430 | },
431 | "vscode": {
432 | "interpreter": {
433 | "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
434 | }
435 | }
436 | },
437 | "nbformat": 4,
438 | "nbformat_minor": 0
439 | }
440 |
--------------------------------------------------------------------------------
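A note on the notebook above: its `answer_flowise_uniswap_question` calls the Flowise endpoint with no timeout or error handling. A slightly hardened sketch of the same call (the URL is the notebook's placeholder endpoint):

```python
# Sketch: the notebook's Flowise call with a timeout and explicit HTTP error
# handling added. The URL is the placeholder endpoint from the notebook.
import requests

API_URL = "http://localhost:4000/api/v1/prediction/433ed37e-9546-4e73-a688-7352b78bf852"

def answer_flowise_uniswap_question(question):
    response = requests.post(API_URL, json={"question": question}, timeout=60)
    response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
    return response.json()
```

--------------------------------------------------------------------------------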
/Integrations/AutoGen/autogen_langchain_uniswap_agent_speak.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "colab_type": "text",
7 | "id": "view-in-github"
8 | },
9 | "source": [
10 | "
"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {
16 | "id": "39_S0tWcB-oh"
17 | },
18 | "source": [
19 | "# AutoGen + LangChain + PlayHT Use Case - Super AI Agent that Speaks\n",
20 | "\n",
21 | "**`AutoGen`** is a versatile framework that facilitates the creation of LLM applications by employing multiple agents capable of interacting with one another to tackle tasks.\n",
22 | "\n",
23 | "**`LangChain`** is an open-source framework designed for software developers engaged in AI and ML. It enables them to seamlessly integrate LLM with external components, facilitating the creation of LLM-driven applications.\n",
24 | "\n",
25 | "**`PlayHT`** is a company serving the generative text to speech service.\n",
26 | "\n",
27 | "Integrating them together, we are able to build a super cool AI agent that,\n",
28 | "\n",
29 | "1. is knowledgeable in certain area\n",
30 | "2. can **SPEAK**\n",
31 | "\n",
32 | "This is the enhanced version of the AI Agent introduced in previous tutorial. We will build the audio feature on top of it. To learn more about it before starting this tutorial, please visit the following link:\n",
33 | "\n",
34 | "[AutoGen + LangChain Use Case - Uniswap Protocol AI Agent](https://github.com/sugarforever/LangChain-Advanced/blob/main/Integrations/AutoGen/autogen_langchain_uniswap_ai_agent.ipynb)\n",
35 | "\n"
36 | ]
37 | },
38 | {
39 | "cell_type": "markdown",
40 | "metadata": {
41 | "id": "NtvjgnBZZjUL"
42 | },
43 | "source": [
44 | "## Use Case - Uniswap Protocol AI Agent that Speaks\n",
45 | "\n",
46 | "`Uniswap` is a decentralized exchange that allows users to trade Ethereum-based tokens.\n",
47 | "\n",
48 | "In previous tutorial, we already built an AI Agent that can execute tasks require Uniswap protocol knowledge.\n",
49 | "\n",
50 | "In this tutorial, let's make the agents answer in **audio**."
51 | ]
52 | },
53 | {
54 | "cell_type": "markdown",
55 | "metadata": {
56 | "id": "z73o7bmtb5LH"
57 | },
58 | "source": [
59 | "### Environment Preparation"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "metadata": {
66 | "execution": {
67 | "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
68 | "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
69 | "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
70 | "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
71 | },
72 | "id": "1VRZnGGGgkhl"
73 | },
74 | "outputs": [],
75 | "source": [
76 | "%pip install pyautogen~=0.1.0 docker langchain openai tiktoken chromadb pypdf simpleaudio numpy -q"
77 | ]
78 | },
79 | {
80 | "cell_type": "code",
81 | "execution_count": null,
82 | "metadata": {},
83 | "outputs": [],
84 | "source": [
85 | "from dotenv import load_dotenv\n",
86 | "load_dotenv()"
87 | ]
88 | },
89 | {
90 | "cell_type": "code",
91 | "execution_count": null,
92 | "metadata": {
93 | "id": "HZ7w_A3nXU8-"
94 | },
95 | "outputs": [],
96 | "source": [
97 | "import autogen\n",
98 | "\n",
99 | "config_list = autogen.config_list_from_json(\n",
100 | " \"OAI_CONFIG_LIST\",\n",
101 | " filter_dict={\n",
102 | " \"model\": [\"gpt-4\"],\n",
103 | " },\n",
104 | ")\n",
105 | "#\n",
106 | "# Sample content of OAI_CONFIG_LIST file below:\n",
107 | "#\n",
108 | "# [\n",
109 | "# {\n",
110 | "# \"model\": \"gpt-4\",\n",
111 | "# \"api_key\": \"your openai api key\"\n",
112 | "# }\n",
113 | "# ]\n",
114 | "#"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": null,
120 | "metadata": {
121 | "id": "1WVvbWIDSWMR"
122 | },
123 | "outputs": [],
124 | "source": [
125 | "from langchain.vectorstores import Chroma\n",
126 | "from langchain.embeddings import OpenAIEmbeddings\n",
127 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
128 | "from langchain.document_loaders import PyPDFLoader\n",
129 | "from langchain.memory import ConversationBufferMemory\n",
130 | "from langchain.llms import OpenAI\n",
131 | "from langchain.chains import ConversationalRetrievalChain"
132 | ]
133 | },
134 | {
135 | "cell_type": "markdown",
136 | "metadata": {
137 | "id": "BA48TH6Hc_3c"
138 | },
139 | "source": [
140 | "### Steps"
141 | ]
142 | },
143 | {
144 | "cell_type": "markdown",
145 | "metadata": {
146 | "id": "rCrCnRC7cdC-"
147 | },
148 | "source": [
149 | "#### 1. Build up a vector store with Uniswap V3 whitepaper."
150 | ]
151 | },
152 | {
153 | "cell_type": "code",
154 | "execution_count": null,
155 | "metadata": {
156 | "id": "LcC3gnqXSZHs"
157 | },
158 | "outputs": [],
159 | "source": [
160 | "docs = PyPDFLoader('./uniswap_v3.pdf').load()\n",
161 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)\n",
162 | "docs = text_splitter.split_documents(docs)"
163 | ]
164 | },
165 | {
166 | "cell_type": "code",
167 | "execution_count": null,
168 | "metadata": {
169 | "id": "8VoyrUwTShHw"
170 | },
171 | "outputs": [],
172 | "source": [
173 | "vectorstore = Chroma(\n",
174 | " collection_name=\"full_documents\",\n",
175 | " embedding_function=OpenAIEmbeddings()\n",
176 | ")\n",
177 | "vectorstore.add_documents(docs)"
178 | ]
179 | },
180 | {
181 | "cell_type": "markdown",
182 | "metadata": {
183 | "id": "PxFsXiHVciOo"
184 | },
185 | "source": [
186 | "#### 2. Set up a conversational retrieval QA chain by LangChain, based on the vector store."
187 | ]
188 | },
189 | {
190 | "cell_type": "code",
191 | "execution_count": null,
192 | "metadata": {
193 | "id": "6eRvVjJITKfR"
194 | },
195 | "outputs": [],
196 | "source": [
197 | "qa = ConversationalRetrievalChain.from_llm(\n",
198 | " OpenAI(temperature=0),\n",
199 | " vectorstore.as_retriever(),\n",
200 | " memory=ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True)\n",
201 | ")"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": 29,
207 | "metadata": {
208 | "id": "sXST-2kRTUOs"
209 | },
210 | "outputs": [],
211 | "source": [
212 | "result = qa(({\"question\": \"What is uniswap?\"}))"
213 | ]
214 | },
215 | {
216 | "cell_type": "code",
217 | "execution_count": 30,
218 | "metadata": {
219 | "colab": {
220 | "base_uri": "https://localhost:8080/",
221 | "height": 54
222 | },
223 | "id": "WVi6rT78Tsu_",
224 | "outputId": "d1808979-3d18-4847-c45a-0f3864bc8819"
225 | },
226 | "outputs": [
227 | {
228 | "data": {
229 | "text/plain": [
230 | "' Uniswap is a noncustodial automated market maker implemented for the Ethereum Virtual Machine.'"
231 | ]
232 | },
233 | "execution_count": 30,
234 | "metadata": {},
235 | "output_type": "execute_result"
236 | }
237 | ],
238 | "source": [
239 | "result['answer']"
240 | ]
241 | },
242 | {
243 | "cell_type": "markdown",
244 | "metadata": {
245 | "id": "O7VPXVI_coX4"
246 | },
247 | "source": [
248 | "#### 3. Define a function `answer_uniswap_question`\n",
249 | "\n",
250 | "It takes a parameter `question`, calls the QA chain, and answer it by returning the answer from the chain response."
251 | ]
252 | },
253 | {
254 | "cell_type": "code",
255 | "execution_count": null,
256 | "metadata": {
257 | "id": "yPThjcdFT1Lw"
258 | },
259 | "outputs": [],
260 | "source": [
261 | "def answer_uniswap_question(question):\n",
262 | " response = qa({\"question\": question})\n",
263 | " return response[\"answer\"]"
264 | ]
265 | },
266 | {
267 | "cell_type": "markdown",
268 | "metadata": {},
269 | "source": [
270 | "#### 4. Define a function convert_text_to_audio"
271 | ]
272 | },
273 | {
274 | "cell_type": "code",
275 | "execution_count": null,
276 | "metadata": {},
277 | "outputs": [],
278 | "source": [
279 | "from typing import Generator, Iterable\n",
280 | "\n",
281 | "import time\n",
282 | "import threading\n",
283 | "import os\n",
284 | "import re\n",
285 | "import numpy as np\n",
286 | "import simpleaudio as sa\n",
287 | "\n",
288 | "from pyht.client import Client, TTSOptions\n",
289 | "from pyht.protos import api_pb2\n",
290 | "\n",
291 | "def play_audio(data: Generator[bytes, None, None] | Iterable[bytes]):\n",
292 | " buff_size = 10485760\n",
293 | " ptr = 0\n",
294 | " start_time = time.time()\n",
295 | " buffer = np.empty(buff_size, np.float16)\n",
296 | " audio = None\n",
297 | " for i, chunk in enumerate(data):\n",
298 | " if i == 0:\n",
299 | " start_time = time.time()\n",
300 | " continue # Drop the first response, we don't want a header.\n",
301 | " elif i == 1:\n",
302 | " print(\"First audio byte received in:\", time.time() - start_time)\n",
303 | " for sample in np.frombuffer(chunk, np.float16):\n",
304 | " buffer[ptr] = sample\n",
305 | " ptr += 1\n",
306 | " if i == 5:\n",
307 | " # Give a 4 sample worth of breathing room before starting\n",
308 | " # playback\n",
309 | " audio = sa.play_buffer(buffer, 1, 2, 24000)\n",
310 | " approx_run_time = ptr / 24_000\n",
311 | " time.sleep(max(approx_run_time - time.time() + start_time, 0))\n",
312 | " if audio is not None:\n",
313 | " audio.stop()\n",
314 | "\n",
315 | "\n",
316 | "def convert_text_to_audio(\n",
317 | " text: str\n",
318 | "):\n",
319 | " text_partitions = re.split(r'[,.]', text)\n",
320 | "\n",
321 | " # Setup the client\n",
322 | " client = Client(os.environ['PLAY_HT_USER_ID'], os.environ['PLAY_HT_API_KEY'])\n",
323 | "\n",
324 | " # Set the speech options\n",
325 | " voice = \"s3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json\"\n",
326 | " options = TTSOptions(voice=voice, format=api_pb2.FORMAT_WAV, quality=\"faster\")\n",
327 | "\n",
328 | " # Get the streams\n",
329 | " in_stream, out_stream = client.get_stream_pair(options)\n",
330 | "\n",
331 | " # Start a player thread.\n",
332 | " audio_thread = threading.Thread(None, play_audio, args=(out_stream,))\n",
333 | " audio_thread.start()\n",
334 | "\n",
335 | " # Send some text, play some audio.\n",
336 | " for t in text_partitions:\n",
337 | " in_stream(t)\n",
338 | " in_stream.done()\n",
339 | "\n",
340 | " # cleanup\n",
341 | " audio_thread.join()\n",
342 | " out_stream.close()\n",
343 | "\n",
344 | " # Cleanup.\n",
345 | " client.close()\n",
346 | " return 0"
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": null,
352 | "metadata": {},
353 | "outputs": [],
354 | "source": [
355 | "convert_text_to_audio(\"Welcome to the Uniswap V3 whitepaper.\")"
356 | ]
357 | },
358 | {
359 | "cell_type": "markdown",
360 | "metadata": {
361 | "id": "Wu7gjAv-c4uP"
362 | },
363 | "source": [
364 | "#### 5. Set up AutoGen agents with text-to-audio conversion function"
365 | ]
366 | },
367 | {
368 | "cell_type": "code",
369 | "execution_count": null,
370 | "metadata": {
371 | "id": "ZsXuHf1fgkhl"
372 | },
373 | "outputs": [],
374 | "source": [
375 | "llm_config={\n",
376 | " \"request_timeout\": 600,\n",
377 | " \"seed\": 42,\n",
378 | " \"config_list\": config_list,\n",
379 | " \"temperature\": 0,\n",
380 | " \"functions\": [\n",
381 | " {\n",
382 | " \"name\": \"answer_uniswap_question\",\n",
383 | " \"description\": \"Answer any Uniswap related questions\",\n",
384 | " \"parameters\": {\n",
385 | " \"type\": \"object\",\n",
386 | " \"properties\": {\n",
387 | " \"question\": {\n",
388 | " \"type\": \"string\",\n",
389 | " \"description\": \"The question to ask in relation to Uniswap protocol\",\n",
390 | " }\n",
391 | " },\n",
392 | " \"required\": [\"question\"],\n",
393 | " },\n",
394 | " },\n",
395 | " {\n",
396 | " \"name\": \"convert_text_to_audio\",\n",
397 | " \"description\": \"Convert text to audio and speak it out loud\",\n",
398 | " \"parameters\": {\n",
399 | " \"type\": \"object\",\n",
400 | " \"properties\": {\n",
401 | " \"text\": {\n",
402 | " \"type\": \"string\",\n",
403 | " \"description\": \"The text to be converted and spoken out loud\",\n",
404 | " }\n",
405 | " },\n",
406 | " \"required\": [\"text\"],\n",
407 | " },\n",
408 | " }\n",
409 | " ],\n",
410 | "}"
411 | ]
412 | },
413 | {
414 | "cell_type": "code",
415 | "execution_count": null,
416 | "metadata": {
417 | "id": "eg9g65hOgkhm"
418 | },
419 | "outputs": [],
420 | "source": [
421 | "# create an AssistantAgent instance named \"assistant\"\n",
422 | "assistant = autogen.AssistantAgent(\n",
423 | " name=\"assistant\",\n",
424 | " llm_config=llm_config,\n",
425 | ")\n",
426 | "# create a UserProxyAgent instance named \"user_proxy\"\n",
427 | "user_proxy = autogen.UserProxyAgent(\n",
428 | " name=\"user_proxy\",\n",
429 | " human_input_mode=\"NEVER\",\n",
430 | " max_consecutive_auto_reply=10,\n",
431 | " code_execution_config={\"work_dir\": \".\"},\n",
432 | " llm_config=llm_config,\n",
433 | " system_message=\"\"\"Reply TERMINATE if the task has been solved at full satisfaction.\n",
434 | "Otherwise, reply CONTINUE, or the reason why the task is not solved yet.\"\"\",\n",
435 | " function_map={\n",
436 | " \"answer_uniswap_question\": answer_uniswap_question,\n",
437 | " \"convert_text_to_audio\": convert_text_to_audio\n",
438 | " }\n",
439 | ")"
440 | ]
441 | },
442 | {
443 | "cell_type": "markdown",
444 | "metadata": {
445 | "id": "37cRtpqLdLSZ"
446 | },
447 | "source": [
448 | "### It's time to let the agents SPEAK.\n",
449 | "\n",
450 | "Now, let's user the user agent to ask the agents to write an introduction blog for `Uniswap` protocol v3, and **speak it out loudly**."
451 | ]
452 | },
453 | {
454 | "cell_type": "code",
455 | "execution_count": 32,
456 | "metadata": {
457 | "colab": {
458 | "base_uri": "https://localhost:8080/"
459 | },
460 | "id": "aCdAqig3gkhn",
461 | "outputId": "0deb286b-a1e4-4b56-e8c4-4ab7ea33173f"
462 | },
463 | "outputs": [
464 | {
465 | "name": "stdout",
466 | "output_type": "stream",
467 | "text": [
468 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
469 | "\n",
470 | "\n",
471 | "I'm writing a blog to introduce the version 3 of Uniswap protocol. \n",
472 | "Find the answers to the 2 questions below, write an introduction based on them and speak it out loudly.\n",
473 | "\n",
474 | "1. What is Uniswap?\n",
475 | "2. What are the main changes in Uniswap version 3?\n",
476 | "\n",
477 | "Start the work now.\n",
478 | "\n",
479 | "\n",
480 | "--------------------------------------------------------------------------------\n",
481 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
482 | "\n",
483 | "\u001b[32m***** Suggested function Call: answer_uniswap_question *****\u001b[0m\n",
484 | "Arguments: \n",
485 | "{\n",
486 | " \"question\": \"What is Uniswap?\"\n",
487 | "}\n",
488 | "\u001b[32m************************************************************\u001b[0m\n",
489 | "\n",
490 | "--------------------------------------------------------------------------------\n",
491 | "\u001b[35m\n",
492 | ">>>>>>>> EXECUTING FUNCTION answer_uniswap_question...\u001b[0m\n",
493 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
494 | "\n",
495 | "\u001b[32m***** Response from calling function \"answer_uniswap_question\" *****\u001b[0m\n",
496 | " Uniswap is a noncustodial automated market maker implemented for the Ethereum Virtual Machine.\n",
497 | "\u001b[32m********************************************************************\u001b[0m\n",
498 | "\n",
499 | "--------------------------------------------------------------------------------\n",
500 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
501 | "\n",
502 | "\u001b[32m***** Suggested function Call: answer_uniswap_question *****\u001b[0m\n",
503 | "Arguments: \n",
504 | "{\n",
505 | " \"question\": \"What are the main changes in Uniswap version 3?\"\n",
506 | "}\n",
507 | "\u001b[32m************************************************************\u001b[0m\n",
508 | "\n",
509 | "--------------------------------------------------------------------------------\n",
510 | "\u001b[35m\n",
511 | ">>>>>>>> EXECUTING FUNCTION answer_uniswap_question...\u001b[0m\n",
512 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
513 | "\n",
514 | "\u001b[32m***** Response from calling function \"answer_uniswap_question\" *****\u001b[0m\n",
515 | " Uniswap v3 provides increased capital efficiency and fine-tuned control to liquidity providers, improves the accuracy and convenience of the price oracle, and has a more flexible fee structure. It also introduces multiple pools for each pair of tokens, each with a different swap fee, and introduces the concept of concentrated liquidity.\n",
516 | "\u001b[32m********************************************************************\u001b[0m\n",
517 | "\n",
518 | "--------------------------------------------------------------------------------\n",
519 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
520 | "\n",
521 | "Based on the answers, here is an introduction for your blog:\n",
522 | "\n",
523 | "\"Uniswap is a noncustodial automated market maker implemented for the Ethereum Virtual Machine. It has revolutionized the way we trade cryptocurrencies by providing a decentralized platform for swapping tokens. The latest version, Uniswap v3, brings a host of improvements and changes. It provides increased capital efficiency and fine-tuned control to liquidity providers, making it more beneficial for them to participate. The accuracy and convenience of the price oracle have been improved, providing more reliable price feeds. The fee structure has become more flexible, catering to a wider range of use cases. Uniswap v3 also introduces multiple pools for each pair of tokens, each with a different swap fee, and introduces the concept of concentrated liquidity. This allows liquidity providers to concentrate their capital within specific price ranges, increasing their potential returns.\"\n",
524 | "\n",
525 | "Now, let's convert this text to audio and speak it out loud.\n",
526 | "\u001b[32m***** Suggested function Call: convert_text_to_audio *****\u001b[0m\n",
527 | "Arguments: \n",
528 | "{\n",
529 | " \"text\": \"Uniswap is a noncustodial automated market maker implemented for the Ethereum Virtual Machine. It has revolutionized the way we trade cryptocurrencies by providing a decentralized platform for swapping tokens. The latest version, Uniswap v3, brings a host of improvements and changes. It provides increased capital efficiency and fine-tuned control to liquidity providers, making it more beneficial for them to participate. The accuracy and convenience of the price oracle have been improved, providing more reliable price feeds. The fee structure has become more flexible, catering to a wider range of use cases. Uniswap v3 also introduces multiple pools for each pair of tokens, each with a different swap fee, and introduces the concept of concentrated liquidity. This allows liquidity providers to concentrate their capital within specific price ranges, increasing their potential returns.\"\n",
530 | "}\n",
531 | "\u001b[32m**********************************************************\u001b[0m\n",
532 | "\n",
533 | "--------------------------------------------------------------------------------\n",
534 | "\u001b[35m\n",
535 | ">>>>>>>> EXECUTING FUNCTION convert_text_to_audio...\u001b[0m\n",
536 | "First audio byte received in: 0.21162700653076172\n",
537 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
538 | "\n",
539 | "\u001b[32m***** Response from calling function \"convert_text_to_audio\" *****\u001b[0m\n",
540 | "0\n",
541 | "\u001b[32m******************************************************************\u001b[0m\n",
542 | "\n",
543 | "--------------------------------------------------------------------------------\n",
544 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
545 | "\n",
546 | "TERMINATE\n",
547 | "\n",
548 | "--------------------------------------------------------------------------------\n"
549 | ]
550 | }
551 | ],
552 | "source": [
553 | "# the assistant receives a message from the user, which contains the task description\n",
554 | "user_proxy.initiate_chat(\n",
555 | " assistant,\n",
556 | " message=\"\"\"\n",
557 | "I'm writing a blog to introduce the version 3 of Uniswap protocol. \n",
558 | "Find the answers to the 2 questions below, write an introduction based on them and speak it out loudly.\n",
559 | "\n",
560 | "1. What is Uniswap?\n",
561 | "2. What are the main changes in Uniswap version 3?\n",
562 | "\n",
563 | "Start the work now.\n",
564 | "\"\"\"\n",
565 | ")"
566 | ]
567 | }
568 | ],
569 | "metadata": {
570 | "colab": {
571 | "include_colab_link": true,
572 | "provenance": []
573 | },
574 | "kernelspec": {
575 | "display_name": "Python 3",
576 | "language": "python",
577 | "name": "python3"
578 | },
579 | "language_info": {
580 | "codemirror_mode": {
581 | "name": "ipython",
582 | "version": 3
583 | },
584 | "file_extension": ".py",
585 | "mimetype": "text/x-python",
586 | "name": "python",
587 | "nbconvert_exporter": "python",
588 | "pygments_lexer": "ipython3",
589 | "version": "3.11.4"
590 | },
591 | "vscode": {
592 | "interpreter": {
593 | "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
594 | }
595 | }
596 | },
597 | "nbformat": 4,
598 | "nbformat_minor": 0
599 | }
600 |
--------------------------------------------------------------------------------
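Before handing both functions to the agents, it can help to exercise them directly. A quick smoke test, reusing `answer_uniswap_question` and `convert_text_to_audio` as defined in the notebook above:

```python
# Smoke test: run the QA chain and the TTS pipeline end to end, without the
# AutoGen agents. Assumes the notebook's functions are already defined.
answer = answer_uniswap_question("What is concentrated liquidity?")
print(answer)
convert_text_to_audio(answer)  # should speak the answer out loud
```

--------------------------------------------------------------------------------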
/Integrations/AutoGen/autogen_langchain_uniswap_ai_agent.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "id": "view-in-github",
7 | "colab_type": "text"
8 | },
9 | "source": [
10 | "
"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "source": [
16 | "# AutoGen + LangChain Use Case - Uniswap Protocol AI Agent\n",
17 | "\n",
18 | "**`AutoGen`** is a versatile framework that facilitates the creation of LLM applications by employing multiple agents capable of interacting with one another to tackle tasks. These AutoGen agents can be tailored to specific needs, engage in conversations, and seamlessly integrate human participation. They are adaptable to different operation modes that encompass the utilization of LLMs, human inputs, and various tools.\n",
19 | "\n",
20 | "**`LangChain`** is an open-source framework designed for software developers engaged in AI and ML. It enables them to seamlessly integrate LLM with external components, facilitating the creation of LLM-driven applications. The primary aim of LangChain is to establish connections between LLMs such as OpenAI's GPT-3.5 and GPT-4 and various external data sources, enabling the development and utilization of NLP applications.\n",
21 | "\n",
22 | "Both of them are playing key roles in the LLM application development.\n",
23 | "\n",
24 | "`AutoGen` doesn't support connecting to various external data sources natively. This is exactly where `LangChain` can come into play."
25 | ],
26 | "metadata": {
27 | "id": "39_S0tWcB-oh"
28 | }
29 | },
30 | {
31 | "cell_type": "markdown",
32 | "source": [
33 | "## Use Case - Uniswap Protocol\n",
34 | "\n",
35 | "`Uniswap` is a decentralized exchange that allows users to trade Ethereum-based tokens.\n",
36 | "\n",
37 | "In this tutorial, I will walk you through the steps to follow to build an AI agent that can complete tasks that involve Uniswap knowledge.\n",
38 | "\n",
39 | "1. Build up a vector store with Uniswap V3 whitepaper.\n",
40 | "2. Set up a conversational retrieval QA chain by LangChain, based on the vector store.\n",
41 | "3. Define a function `answer_uniswap_question` by taking a parameter `question`, calling the QA chain to answer it.\n",
42 | "4. Set up AutoGen user agent and assistant agent with function calling enabled.\n",
43 | "\n",
44 | " - In the function mapping, the function defined in step 3 is included.\n",
45 | " - The assistant agent is instructed by the following message to call the function `answer_uniswap_question` to answer Uniswap related questions.\n",
46 | "\n",
47 | " ```\n",
48 | " I'm writing a blog to introduce the version 3 of Uniswap protocol. Find the answers to the 3 questions below and write an introduction based on them.\n",
49 | "\n",
50 | " 1. What is Uniswap?\n",
51 | " 2. What are the main changes in Uniswap version 3?\n",
52 | " 3. How to use Uniswap?\n",
53 | "\n",
54 | " Start the work now.\n",
55 | " ```\n",
56 | "\n",
57 | " - The function call is done by user agent."
58 | ],
59 | "metadata": {
60 | "id": "NtvjgnBZZjUL"
61 | }
62 | },
63 | {
64 | "cell_type": "markdown",
65 | "source": [
66 | "### Environment Preparation"
67 | ],
68 | "metadata": {
69 | "id": "z73o7bmtb5LH"
70 | }
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": null,
75 | "metadata": {
76 | "execution": {
77 | "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
78 | "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
79 | "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
80 | "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
81 | },
82 | "id": "1VRZnGGGgkhl"
83 | },
84 | "outputs": [],
85 | "source": [
86 | "%pip install pyautogen~=0.1.0 docker langchain openai tiktoken chromadb pypdf -q -U"
87 | ]
88 | },
89 | {
90 | "cell_type": "code",
91 | "source": [
92 | "!wget -O uniswap_v3.pdf https://uniswap.org/whitepaper-v3.pdf"
93 | ],
94 | "metadata": {
95 | "id": "YL7GbhDnSOFE"
96 | },
97 | "execution_count": null,
98 | "outputs": []
99 | },
100 | {
101 | "cell_type": "code",
102 | "source": [
103 | "import autogen\n",
104 | "\n",
105 | "config_list = autogen.config_list_from_json(\n",
106 | " \"OAI_CONFIG_LIST\",\n",
107 | " filter_dict={\n",
108 | " \"model\": [\"gpt-4\"],\n",
109 | " },\n",
110 | ")\n",
111 | "#\n",
112 | "# Sample content of OAI_CONFIG_LIST file below:\n",
113 | "#\n",
114 | "# [\n",
115 | "# {\n",
116 | "# \"model\": \"gpt-4\",\n",
117 | "# \"api_key\": \"your openai api key\"\n",
118 | "# }\n",
119 | "# ]\n",
120 | "#"
121 | ],
122 | "metadata": {
123 | "id": "HZ7w_A3nXU8-"
124 | },
125 | "execution_count": null,
126 | "outputs": []
127 | },
128 | {
129 | "cell_type": "code",
130 | "source": [
131 | "gpt4_api_key = config_list[0][\"api_key\"]"
132 | ],
133 | "metadata": {
134 | "id": "EFD1i82uXYhI"
135 | },
136 | "execution_count": null,
137 | "outputs": []
138 | },
139 | {
140 | "cell_type": "code",
141 | "source": [
142 | "import os\n",
143 | "os.environ['OPENAI_API_KEY'] = gpt4_api_key"
144 | ],
145 | "metadata": {
146 | "id": "ZhfDAOAeSUJg"
147 | },
148 | "execution_count": null,
149 | "outputs": []
150 | },
151 | {
152 | "cell_type": "code",
153 | "source": [
154 | "from langchain.vectorstores import Chroma\n",
155 | "from langchain.embeddings import OpenAIEmbeddings\n",
156 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
157 | "from langchain.document_loaders import PyPDFLoader\n",
158 | "from langchain.memory import ConversationBufferMemory\n",
159 | "from langchain.llms import OpenAI\n",
160 | "from langchain.chains import ConversationalRetrievalChain"
161 | ],
162 | "metadata": {
163 | "id": "1WVvbWIDSWMR"
164 | },
165 | "execution_count": null,
166 | "outputs": []
167 | },
168 | {
169 | "cell_type": "markdown",
170 | "source": [
171 | "### Steps"
172 | ],
173 | "metadata": {
174 | "id": "BA48TH6Hc_3c"
175 | }
176 | },
177 | {
178 | "cell_type": "markdown",
179 | "source": [
180 | "#### 1. Build up a vector store with Uniswap V3 whitepaper."
181 | ],
182 | "metadata": {
183 | "id": "rCrCnRC7cdC-"
184 | }
185 | },
186 | {
187 | "cell_type": "code",
188 | "source": [
189 | "loaders = [ PyPDFLoader('./uniswap_v3.pdf') ]\n",
190 | "docs = []\n",
191 | "for l in loaders:\n",
192 | " docs.extend(l.load())\n",
193 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)\n",
194 | "docs = text_splitter.split_documents(docs)"
195 | ],
196 | "metadata": {
197 | "id": "LcC3gnqXSZHs"
198 | },
199 | "execution_count": null,
200 | "outputs": []
201 | },
202 | {
203 | "cell_type": "code",
204 | "source": [
205 | "vectorstore = Chroma(\n",
206 | " collection_name=\"full_documents\",\n",
207 | " embedding_function=OpenAIEmbeddings()\n",
208 | ")\n",
209 | "vectorstore.add_documents(docs)"
210 | ],
211 | "metadata": {
212 | "id": "8VoyrUwTShHw"
213 | },
214 | "execution_count": null,
215 | "outputs": []
216 | },
217 | {
218 | "cell_type": "markdown",
219 | "source": [
220 | "#### 2. Set up a conversational retrieval QA chain by LangChain, based on the vector store."
221 | ],
222 | "metadata": {
223 | "id": "PxFsXiHVciOo"
224 | }
225 | },
226 | {
227 | "cell_type": "code",
228 | "source": [
229 | "qa = ConversationalRetrievalChain.from_llm(\n",
230 | " OpenAI(temperature=0),\n",
231 | " vectorstore.as_retriever(),\n",
232 | " memory=ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True)\n",
233 | ")"
234 | ],
235 | "metadata": {
236 | "id": "6eRvVjJITKfR"
237 | },
238 | "execution_count": null,
239 | "outputs": []
240 | },
241 | {
242 | "cell_type": "code",
243 | "source": [
244 | "result = qa(({\"question\": \"What is uniswap?\"}))"
245 | ],
246 | "metadata": {
247 | "id": "sXST-2kRTUOs"
248 | },
249 | "execution_count": null,
250 | "outputs": []
251 | },
252 | {
253 | "cell_type": "code",
254 | "source": [
255 | "result['answer']"
256 | ],
257 | "metadata": {
258 | "colab": {
259 | "base_uri": "https://localhost:8080/",
260 | "height": 54
261 | },
262 | "id": "WVi6rT78Tsu_",
263 | "outputId": "d1808979-3d18-4847-c45a-0f3864bc8819"
264 | },
265 | "execution_count": null,
266 | "outputs": [
267 | {
268 | "output_type": "execute_result",
269 | "data": {
270 | "text/plain": [
271 | "'\\nUniswap is a decentralized cryptocurrency exchange that allows users to trade Ethereum-based tokens without the need for a centralized exchange. It is built on the Ethereum blockchain and uses automated market makers to facilitate trades.'"
272 | ],
273 | "application/vnd.google.colaboratory.intrinsic+json": {
274 | "type": "string"
275 | }
276 | },
277 | "metadata": {},
278 | "execution_count": 32
279 | }
280 | ]
281 | },
282 | {
283 | "cell_type": "markdown",
284 | "source": [
285 | "#### 3. Define a function `answer_uniswap_question`\n",
286 | "\n",
287 | "It takes a parameter `question`, calls the QA chain, and answer it by returning the answer from the chain response."
288 | ],
289 | "metadata": {
290 | "id": "O7VPXVI_coX4"
291 | }
292 | },
293 | {
294 | "cell_type": "code",
295 | "source": [
296 | "def answer_uniswap_question(question):\n",
297 | " response = qa({\"question\": question})\n",
298 | " return response[\"answer\"]"
299 | ],
300 | "metadata": {
301 | "id": "yPThjcdFT1Lw"
302 | },
303 | "execution_count": null,
304 | "outputs": []
305 | },
306 | {
307 | "cell_type": "markdown",
308 | "source": [
309 | "#### 4. Set up AutoGen user agent and assistant agent with function calling enabled."
310 | ],
311 | "metadata": {
312 | "id": "Wu7gjAv-c4uP"
313 | }
314 | },
315 | {
316 | "cell_type": "code",
317 | "execution_count": null,
318 | "metadata": {
319 | "id": "ZsXuHf1fgkhl"
320 | },
321 | "outputs": [],
322 | "source": [
323 | "llm_config={\n",
324 | " \"request_timeout\": 600,\n",
325 | " \"seed\": 42,\n",
326 | " \"config_list\": config_list,\n",
327 | " \"temperature\": 0,\n",
328 | " \"functions\": [\n",
329 | " {\n",
330 | " \"name\": \"answer_uniswap_question\",\n",
331 | " \"description\": \"Answer any Uniswap related questions\",\n",
332 | " \"parameters\": {\n",
333 | " \"type\": \"object\",\n",
334 | " \"properties\": {\n",
335 | " \"question\": {\n",
336 | " \"type\": \"string\",\n",
337 | " \"description\": \"The question to ask in relation to Uniswap protocol\",\n",
338 | " }\n",
339 | " },\n",
340 | " \"required\": [\"question\"],\n",
341 | " },\n",
342 | " }\n",
343 | " ],\n",
344 | "}"
345 | ]
346 | },
347 | {
348 | "cell_type": "code",
349 | "execution_count": null,
350 | "metadata": {
351 | "id": "eg9g65hOgkhm"
352 | },
353 | "outputs": [],
354 | "source": [
355 | "# create an AssistantAgent instance named \"assistant\"\n",
356 | "assistant = autogen.AssistantAgent(\n",
357 | " name=\"assistant\",\n",
358 | " llm_config=llm_config,\n",
359 | ")\n",
360 | "# create a UserProxyAgent instance named \"user_proxy\"\n",
361 | "user_proxy = autogen.UserProxyAgent(\n",
362 | " name=\"user_proxy\",\n",
363 | " human_input_mode=\"NEVER\",\n",
364 | " max_consecutive_auto_reply=10,\n",
365 | " code_execution_config={\"work_dir\": \".\"},\n",
366 | " llm_config=llm_config,\n",
367 | " system_message=\"\"\"Reply TERMINATE if the task has been solved at full satisfaction.\n",
368 | "Otherwise, reply CONTINUE, or the reason why the task is not solved yet.\"\"\",\n",
369 | " function_map={\"answer_uniswap_question\": answer_uniswap_question}\n",
370 | ")"
371 | ]
372 | },
373 | {
374 | "cell_type": "markdown",
375 | "source": [
376 | "### It's time to start a task for the agents.\n",
377 | "\n",
378 | "Now, let's user the user agent to ask the agents to write an introduction blog for `Uniswap` protocol v3."
379 | ],
380 | "metadata": {
381 | "id": "37cRtpqLdLSZ"
382 | }
383 | },
384 | {
385 | "cell_type": "code",
386 | "execution_count": null,
387 | "metadata": {
388 | "id": "aCdAqig3gkhn",
389 | "outputId": "0deb286b-a1e4-4b56-e8c4-4ab7ea33173f",
390 | "colab": {
391 | "base_uri": "https://localhost:8080/"
392 | }
393 | },
394 | "outputs": [
395 | {
396 | "output_type": "stream",
397 | "name": "stdout",
398 | "text": [
399 | "user_proxy (to assistant):\n",
400 | "\n",
401 | "\n",
402 | "I'm writing a blog to introduce the version 3 of Uniswap protocol. Find the answers to the 3 questions below and write an introduction based on them.\n",
403 | "\n",
404 | "1. What is Uniswap?\n",
405 | "2. What are the main changes in Uniswap version 3?\n",
406 | "3. How to use Uniswap?\n",
407 | "\n",
408 | "Start the work now.\n",
409 | "\n",
410 | "\n",
411 | "--------------------------------------------------------------------------------\n",
412 | "assistant (to user_proxy):\n",
413 | "\n",
414 | "***** Suggested function Call: answer_uniswap_question *****\n",
415 | "Arguments: \n",
416 | "\n",
417 | "{\n",
418 | " \"question\": \"What is Uniswap?\"\n",
419 | "}\n",
420 | "************************************************************\n",
421 | "\n",
422 | "--------------------------------------------------------------------------------\n",
423 | "\n",
424 | ">>>>>>>> EXECUTING FUNCTION answer_uniswap_question...\n",
425 | "user_proxy (to assistant):\n",
426 | "\n",
427 | "***** Response from calling function \"answer_uniswap_question\" *****\n",
428 | "\n",
429 | "Uniswap is a decentralized cryptocurrency exchange that allows users to trade Ethereum-based tokens without the need for a centralized exchange. It is built on the Ethereum blockchain and uses automated market makers to facilitate trades.\n",
430 | "********************************************************************\n",
431 | "\n",
432 | "--------------------------------------------------------------------------------\n",
433 | "assistant (to user_proxy):\n",
434 | "\n",
435 | "***** Suggested function Call: answer_uniswap_question *****\n",
436 | "Arguments: \n",
437 | "\n",
438 | "{\n",
439 | " \"question\": \"What are the main changes in Uniswap version 3?\"\n",
440 | "}\n",
441 | "************************************************************\n",
442 | "\n",
443 | "--------------------------------------------------------------------------------\n",
444 | "\n",
445 | ">>>>>>>> EXECUTING FUNCTION answer_uniswap_question...\n",
446 | "user_proxy (to assistant):\n",
447 | "\n",
448 | "***** Response from calling function \"answer_uniswap_question\" *****\n",
449 | "\n",
450 | "\n",
451 | "Uniswap version 3 introduces a number of new features, including a new fee structure, improved liquidity provisioning, and a new liquidity mining program. It also includes a new user interface, improved gas efficiency, and a new governance system.\n",
452 | "********************************************************************\n",
453 | "\n",
454 | "--------------------------------------------------------------------------------\n",
455 | "assistant (to user_proxy):\n",
456 | "\n",
457 | "***** Suggested function Call: answer_uniswap_question *****\n",
458 | "Arguments: \n",
459 | "\n",
460 | "{\n",
461 | " \"question\": \"How to use Uniswap?\"\n",
462 | "}\n",
463 | "************************************************************\n",
464 | "\n",
465 | "--------------------------------------------------------------------------------\n",
466 | "\n",
467 | ">>>>>>>> EXECUTING FUNCTION answer_uniswap_question...\n",
468 | "user_proxy (to assistant):\n",
469 | "\n",
470 | "***** Response from calling function \"answer_uniswap_question\" *****\n",
471 | "\n",
472 | "\n",
473 | "To use Uniswap, you will need to have a web3 wallet such as MetaMask installed and connected to the Ethereum network. Once you have your wallet set up, you can go to the Uniswap website and connect your wallet. From there, you can select the tokens you want to trade and execute the swap.\n",
474 | "********************************************************************\n",
475 | "\n",
476 | "--------------------------------------------------------------------------------\n",
477 | "assistant (to user_proxy):\n",
478 | "\n",
479 | "Based on the answers to your questions, here is a draft introduction for your blog:\n",
480 | "\n",
481 | "Uniswap is a revolutionary decentralized cryptocurrency exchange that operates on the Ethereum blockchain. It allows users to trade Ethereum-based tokens directly from their wallets, bypassing the need for a centralized exchange. Uniswap uses automated market makers to facilitate trades, providing a seamless and efficient trading experience.\n",
482 | "\n",
483 | "The latest version, Uniswap version 3, brings a host of new features and improvements. It introduces a new fee structure, making transactions more cost-effective for users. The liquidity provisioning has been enhanced, allowing for more efficient use of capital. A new liquidity mining program has been introduced, providing incentives for users to provide liquidity to the platform. Additionally, Uniswap V3 boasts a new user interface, improved gas efficiency, and a revamped governance system, making it more user-friendly and democratic.\n",
484 | "\n",
485 | "Using Uniswap is straightforward. All you need is a web3 wallet, such as MetaMask, connected to the Ethereum network. Once your wallet is set up, you can visit the Uniswap website, connect your wallet, select the tokens you wish to trade, and execute the swap. It's as simple as that!\n",
486 | "\n",
487 | "Uniswap V3 is a significant step forward in the world of decentralized finance, providing users with more control, flexibility, and efficiency in their trading activities.\n",
488 | "\n",
489 | "TERMINATE\n",
490 | "\n",
491 | "--------------------------------------------------------------------------------\n",
492 | "user_proxy (to assistant):\n",
493 | "\n",
494 | "TERMINATE\n",
495 | "\n",
496 | "--------------------------------------------------------------------------------\n"
497 | ]
498 | }
499 | ],
500 | "source": [
501 | "# the assistant receives a message from the user, which contains the task description\n",
502 | "user_proxy.initiate_chat(\n",
503 | " assistant,\n",
504 | " message=\"\"\"\n",
505 | "I'm writing a blog to introduce the version 3 of Uniswap protocol. Find the answers to the 3 questions below and write an introduction based on them.\n",
506 | "\n",
507 | "1. What is Uniswap?\n",
508 | "2. What are the main changes in Uniswap version 3?\n",
509 | "3. How to use Uniswap?\n",
510 | "\n",
511 | "Start the work now.\n",
512 | "\"\"\"\n",
513 | ")"
514 | ]
515 | }
516 | ],
517 | "metadata": {
518 | "kernelspec": {
519 | "display_name": "Python 3",
520 | "language": "python",
521 | "name": "python3"
522 | },
523 | "language_info": {
524 | "codemirror_mode": {
525 | "name": "ipython",
526 | "version": 3
527 | },
528 | "file_extension": ".py",
529 | "mimetype": "text/x-python",
530 | "name": "python",
531 | "nbconvert_exporter": "python",
532 | "pygments_lexer": "ipython3",
533 | "version": "3.9.17"
534 | },
535 | "vscode": {
536 | "interpreter": {
537 | "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
538 | }
539 | },
540 | "colab": {
541 | "provenance": [],
542 | "include_colab_link": true
543 | }
544 | },
545 | "nbformat": 4,
546 | "nbformat_minor": 0
547 | }
--------------------------------------------------------------------------------
/Integrations/AutoGen/autogen_with_open_source_llms.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "colab_type": "text",
7 | "id": "view-in-github"
8 | },
9 | "source": [
10 | "
"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {
16 | "id": "39_S0tWcB-oh"
17 | },
18 | "source": [
19 | "# AutoGen Use Case - Open Source LLMs\n",
20 | "\n",
21 | "**`AutoGen`** is a versatile framework that facilitates the creation of LLM applications by employing multiple agents capable of interacting with one another to tackle tasks. These AutoGen agents can be tailored to specific needs, engage in conversations, and seamlessly integrate human participation. They are adaptable to different operation modes that encompass the utilization of LLMs, human inputs, and various tools.\n",
22 | "\n",
23 | "`AutoGen` works exceptionally well with the OpenAI GPT-4 model, but it can also be costly, since agent execution may involve considerable token consumption.\n",
24 | "\n",
25 | "Open source LLMs are great alternatives to OpenAI GPT models in some use cases. In this tutorial, let's see how to use one of the popular open source models [`TheBloke/dolphin-2.1-mistral-7B-GGUF`](https://huggingface.co/TheBloke/dolphin-2.1-mistral-7B-GGUF) to drive the AutoGen agents."
26 | ]
27 | },
28 | {
29 | "cell_type": "markdown",
30 | "metadata": {
31 | "id": "NtvjgnBZZjUL"
32 | },
33 | "source": [
34 | "## Use Case\n",
35 | "\n",
36 | "In this tutorial, we will instruct the AutoGen agents to use `TheBloke/dolphin-2.1-mistral-7B-GGUF`."
37 | ]
38 | },
39 | {
40 | "cell_type": "markdown",
41 | "metadata": {
42 | "id": "z73o7bmtb5LH"
43 | },
44 | "source": [
45 | "### Environment Preparation\n",
46 | "\n",
47 | "AutoGen agents are designed to interact with OpenAI models via APIs.\n",
48 | "\n",
49 | "In order to serve the agents with open source LLMs, we need to run an API on top of the underlying model that exposes the same endpoints as OpenAI:\n",
50 | "\n",
51 | "```shell\n",
52 | "/v1/models\n",
53 | "/v1/chat/completions\n",
54 | "/v1/completions\n",
55 | "```\n",
56 | "\n",
57 | "It can be done with [LM Studio](https://lmstudio.ai/)."
58 | ]
59 | },
60 | {
61 | "cell_type": "markdown",
62 | "metadata": {},
63 | "source": [
64 | "#### 1. Install LM Studio and start the HTTP server with the chosen model\n",
65 | "\n",
66 | "1. Download the LM Studio package, install and run it locally.\n",
67 | "2. In the LM Studio app, download the model to use.\n",
68 | "3. Choose the model and start the server.\n",
69 | "\n",
70 | "    Adjust the model settings as needed."
71 | ]
72 | },
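  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the server is running, it is worth sanity-checking the endpoint before wiring it into AutoGen. The cell below is a minimal sketch that assumes LM Studio's default address `http://localhost:1234`; adjust the port if you configured a different one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "# List the models served by the local OpenAI-compatible endpoint.\n",
    "# http://localhost:1234 is LM Studio's default address (an assumption here).\n",
    "response = requests.get(\"http://localhost:1234/v1/models\")\n",
    "print(response.json())"
   ]
  },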
73 | {
74 | "cell_type": "markdown",
75 | "metadata": {},
76 | "source": [
77 | "#### 2. Install AutoGen packages"
78 | ]
79 | },
80 | {
81 | "cell_type": "code",
82 | "execution_count": 1,
83 | "metadata": {
84 | "execution": {
85 | "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
86 | "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
87 | "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
88 | "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
89 | },
90 | "id": "1VRZnGGGgkhl"
91 | },
92 | "outputs": [
93 | {
94 | "name": "stdout",
95 | "output_type": "stream",
96 | "text": [
97 | "Note: you may need to restart the kernel to use updated packages.\n"
98 | ]
99 | }
100 | ],
101 | "source": [
102 | "%pip install pyautogen~=0.1.0 docker -q -U"
103 | ]
104 | },
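  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3. Configure the agents to use the local endpoint\n",
    "\n",
    "Point `api_base` at the local LM Studio server. The `api_key` is a placeholder; any non-empty value (such as `xyz`) works, since the local server does not validate it."
   ]
  },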
105 | {
106 | "cell_type": "code",
107 | "execution_count": 2,
108 | "metadata": {
109 | "id": "HZ7w_A3nXU8-"
110 | },
111 | "outputs": [],
112 | "source": [
113 | "import autogen\n",
114 | "\n",
115 | "config_list = [\n",
116 | " {\n",
117 | " \"api_base\": \"http://localhost:1234/v1\",\n",
118 | " \"api_key\": \"xyz\"\n",
119 | " }\n",
120 | "]"
121 | ]
122 | },
123 | {
124 | "cell_type": "markdown",
125 | "metadata": {
126 | "id": "Wu7gjAv-c4uP"
127 | },
128 | "source": [
129 | "#### 4. Set up AutoGen user agent and assistant agent"
130 | ]
131 | },
132 | {
133 | "cell_type": "code",
134 | "execution_count": 15,
135 | "metadata": {
136 | "id": "ZsXuHf1fgkhl"
137 | },
138 | "outputs": [],
139 | "source": [
140 | "llm_config={\n",
141 | " \"config_list\": config_list,\n",
142 | " \"temperature\": 0\n",
143 | "}"
144 | ]
145 | },
146 | {
147 | "cell_type": "code",
148 | "execution_count": 19,
149 | "metadata": {
150 | "id": "eg9g65hOgkhm"
151 | },
152 | "outputs": [],
162 | "source": [
163 | "# create an AssistantAgent instance named \"assistant\"\n",
164 | "assistant = autogen.AssistantAgent(\n",
165 | " name=\"assistant\",\n",
166 | " llm_config=llm_config,\n",
167 | " # system_message=\"AI agent\"\n",
168 | ")\n",
169 | "# create a UserProxyAgent instance named \"user_proxy\"\n",
170 | "user_proxy = autogen.UserProxyAgent(\n",
171 | " name=\"user_proxy\",\n",
172 | " human_input_mode=\"NEVER\",\n",
173 | " max_consecutive_auto_reply=10,\n",
174 | " code_execution_config={\"use_docker\": False },\n",
175 | ")"
176 | ]
177 | },
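  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: with `use_docker` set to `False`, the generated code blocks are executed directly on the host. If Docker-based execution is enabled and the Docker daemon is unreachable, code execution fails with a `DockerException` like the one recorded in the output below."
   ]
  },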
178 | {
179 | "cell_type": "markdown",
180 | "metadata": {
181 | "id": "37cRtpqLdLSZ"
182 | },
183 | "source": [
184 | "### Execution time!\n",
185 | "\n",
186 | "Now, let's run a simple task."
187 | ]
188 | },
189 | {
190 | "cell_type": "code",
191 | "execution_count": 18,
192 | "metadata": {},
193 | "outputs": [
194 | {
195 | "name": "stdout",
196 | "output_type": "stream",
197 | "text": [
198 | "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
199 | "\n",
200 | "Write a python function to calculate the square root of a number, and call it with the number 4.\n",
201 | "\n",
202 | "--------------------------------------------------------------------------------\n",
203 | "\u001b[33massistant\u001b[0m (to user_proxy):\n",
204 | "\n",
205 | "```python\n",
206 | "def sqrt(n):\n",
207 | " return n ** 0.5\n",
208 | "\n",
209 | "print(sqrt(4))\n",
210 | "```\n",
211 | "TERMINATE\n",
212 | "\n",
213 | "--------------------------------------------------------------------------------\n",
214 | "\u001b[31m\n",
215 | ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n"
216 | ]
217 | },
218 | {
219 | "ename": "DockerException",
220 | "evalue": "Error while fetching server API version: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)",
221 | "output_type": "error",
222 | "traceback": [
223 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
224 | "\u001b[0;31mTimeoutError\u001b[0m Traceback (most recent call last)",
225 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py:466\u001b[0m, in \u001b[0;36mHTTPConnectionPool._make_request\u001b[0;34m(self, conn, method, url, timeout, chunked, **httplib_request_kw)\u001b[0m\n\u001b[1;32m 462\u001b[0m \u001b[39mexcept\u001b[39;00m \u001b[39mBaseException\u001b[39;00m \u001b[39mas\u001b[39;00m e:\n\u001b[1;32m 463\u001b[0m \u001b[39m# Remove the TypeError from the exception chain in\u001b[39;00m\n\u001b[1;32m 464\u001b[0m \u001b[39m# Python 3 (including for exceptions like SystemExit).\u001b[39;00m\n\u001b[1;32m 465\u001b[0m \u001b[39m# Otherwise it looks like a bug in the code.\u001b[39;00m\n\u001b[0;32m--> 466\u001b[0m six\u001b[39m.\u001b[39mraise_from(e, \u001b[39mNone\u001b[39;00m)\n\u001b[1;32m 467\u001b[0m \u001b[39mexcept\u001b[39;00m (SocketTimeout, BaseSSLError, SocketError) \u001b[39mas\u001b[39;00m e:\n",
226 | "File \u001b[0;32m:3\u001b[0m, in \u001b[0;36mraise_from\u001b[0;34m(value, from_value)\u001b[0m\n",
227 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py:461\u001b[0m, in \u001b[0;36mHTTPConnectionPool._make_request\u001b[0;34m(self, conn, method, url, timeout, chunked, **httplib_request_kw)\u001b[0m\n\u001b[1;32m 460\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m--> 461\u001b[0m httplib_response \u001b[39m=\u001b[39m conn\u001b[39m.\u001b[39mgetresponse()\n\u001b[1;32m 462\u001b[0m \u001b[39mexcept\u001b[39;00m \u001b[39mBaseException\u001b[39;00m \u001b[39mas\u001b[39;00m e:\n\u001b[1;32m 463\u001b[0m \u001b[39m# Remove the TypeError from the exception chain in\u001b[39;00m\n\u001b[1;32m 464\u001b[0m \u001b[39m# Python 3 (including for exceptions like SystemExit).\u001b[39;00m\n\u001b[1;32m 465\u001b[0m \u001b[39m# Otherwise it looks like a bug in the code.\u001b[39;00m\n",
228 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/http/client.py:1378\u001b[0m, in \u001b[0;36mHTTPConnection.getresponse\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 1377\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m-> 1378\u001b[0m response\u001b[39m.\u001b[39mbegin()\n\u001b[1;32m 1379\u001b[0m \u001b[39mexcept\u001b[39;00m \u001b[39mConnectionError\u001b[39;00m:\n",
229 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/http/client.py:318\u001b[0m, in \u001b[0;36mHTTPResponse.begin\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 317\u001b[0m \u001b[39mwhile\u001b[39;00m \u001b[39mTrue\u001b[39;00m:\n\u001b[0;32m--> 318\u001b[0m version, status, reason \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_read_status()\n\u001b[1;32m 319\u001b[0m \u001b[39mif\u001b[39;00m status \u001b[39m!=\u001b[39m CONTINUE:\n",
230 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/http/client.py:279\u001b[0m, in \u001b[0;36mHTTPResponse._read_status\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 278\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39m_read_status\u001b[39m(\u001b[39mself\u001b[39m):\n\u001b[0;32m--> 279\u001b[0m line \u001b[39m=\u001b[39m \u001b[39mstr\u001b[39m(\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mfp\u001b[39m.\u001b[39mreadline(_MAXLINE \u001b[39m+\u001b[39m \u001b[39m1\u001b[39m), \u001b[39m\"\u001b[39m\u001b[39miso-8859-1\u001b[39m\u001b[39m\"\u001b[39m)\n\u001b[1;32m 280\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mlen\u001b[39m(line) \u001b[39m>\u001b[39m _MAXLINE:\n",
231 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/socket.py:706\u001b[0m, in \u001b[0;36mSocketIO.readinto\u001b[0;34m(self, b)\u001b[0m\n\u001b[1;32m 705\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m--> 706\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_sock\u001b[39m.\u001b[39mrecv_into(b)\n\u001b[1;32m 707\u001b[0m \u001b[39mexcept\u001b[39;00m timeout:\n",
232 | "\u001b[0;31mTimeoutError\u001b[0m: timed out",
233 | "\nDuring handling of the above exception, another exception occurred:\n",
234 | "\u001b[0;31mReadTimeoutError\u001b[0m Traceback (most recent call last)",
235 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/requests/adapters.py:486\u001b[0m, in \u001b[0;36mHTTPAdapter.send\u001b[0;34m(self, request, stream, timeout, verify, cert, proxies)\u001b[0m\n\u001b[1;32m 485\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m--> 486\u001b[0m resp \u001b[39m=\u001b[39m conn\u001b[39m.\u001b[39murlopen(\n\u001b[1;32m 487\u001b[0m method\u001b[39m=\u001b[39mrequest\u001b[39m.\u001b[39mmethod,\n\u001b[1;32m 488\u001b[0m url\u001b[39m=\u001b[39murl,\n\u001b[1;32m 489\u001b[0m body\u001b[39m=\u001b[39mrequest\u001b[39m.\u001b[39mbody,\n\u001b[1;32m 490\u001b[0m headers\u001b[39m=\u001b[39mrequest\u001b[39m.\u001b[39mheaders,\n\u001b[1;32m 491\u001b[0m redirect\u001b[39m=\u001b[39m\u001b[39mFalse\u001b[39;00m,\n\u001b[1;32m 492\u001b[0m assert_same_host\u001b[39m=\u001b[39m\u001b[39mFalse\u001b[39;00m,\n\u001b[1;32m 493\u001b[0m preload_content\u001b[39m=\u001b[39m\u001b[39mFalse\u001b[39;00m,\n\u001b[1;32m 494\u001b[0m decode_content\u001b[39m=\u001b[39m\u001b[39mFalse\u001b[39;00m,\n\u001b[1;32m 495\u001b[0m retries\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mmax_retries,\n\u001b[1;32m 496\u001b[0m timeout\u001b[39m=\u001b[39mtimeout,\n\u001b[1;32m 497\u001b[0m chunked\u001b[39m=\u001b[39mchunked,\n\u001b[1;32m 498\u001b[0m )\n\u001b[1;32m 500\u001b[0m \u001b[39mexcept\u001b[39;00m (ProtocolError, \u001b[39mOSError\u001b[39;00m) \u001b[39mas\u001b[39;00m err:\n",
236 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py:798\u001b[0m, in \u001b[0;36mHTTPConnectionPool.urlopen\u001b[0;34m(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)\u001b[0m\n\u001b[1;32m 796\u001b[0m e \u001b[39m=\u001b[39m ProtocolError(\u001b[39m\"\u001b[39m\u001b[39mConnection aborted.\u001b[39m\u001b[39m\"\u001b[39m, e)\n\u001b[0;32m--> 798\u001b[0m retries \u001b[39m=\u001b[39m retries\u001b[39m.\u001b[39mincrement(\n\u001b[1;32m 799\u001b[0m method, url, error\u001b[39m=\u001b[39me, _pool\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m, _stacktrace\u001b[39m=\u001b[39msys\u001b[39m.\u001b[39mexc_info()[\u001b[39m2\u001b[39m]\n\u001b[1;32m 800\u001b[0m )\n\u001b[1;32m 801\u001b[0m retries\u001b[39m.\u001b[39msleep()\n",
237 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/util/retry.py:550\u001b[0m, in \u001b[0;36mRetry.increment\u001b[0;34m(self, method, url, response, error, _pool, _stacktrace)\u001b[0m\n\u001b[1;32m 549\u001b[0m \u001b[39mif\u001b[39;00m read \u001b[39mis\u001b[39;00m \u001b[39mFalse\u001b[39;00m \u001b[39mor\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_is_method_retryable(method):\n\u001b[0;32m--> 550\u001b[0m \u001b[39mraise\u001b[39;00m six\u001b[39m.\u001b[39mreraise(\u001b[39mtype\u001b[39m(error), error, _stacktrace)\n\u001b[1;32m 551\u001b[0m \u001b[39melif\u001b[39;00m read \u001b[39mis\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mNone\u001b[39;00m:\n",
238 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/packages/six.py:770\u001b[0m, in \u001b[0;36mreraise\u001b[0;34m(tp, value, tb)\u001b[0m\n\u001b[1;32m 769\u001b[0m \u001b[39mraise\u001b[39;00m value\u001b[39m.\u001b[39mwith_traceback(tb)\n\u001b[0;32m--> 770\u001b[0m \u001b[39mraise\u001b[39;00m value\n\u001b[1;32m 771\u001b[0m \u001b[39mfinally\u001b[39;00m:\n",
239 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py:714\u001b[0m, in \u001b[0;36mHTTPConnectionPool.urlopen\u001b[0;34m(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)\u001b[0m\n\u001b[1;32m 713\u001b[0m \u001b[39m# Make the request on the httplib connection object.\u001b[39;00m\n\u001b[0;32m--> 714\u001b[0m httplib_response \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_make_request(\n\u001b[1;32m 715\u001b[0m conn,\n\u001b[1;32m 716\u001b[0m method,\n\u001b[1;32m 717\u001b[0m url,\n\u001b[1;32m 718\u001b[0m timeout\u001b[39m=\u001b[39mtimeout_obj,\n\u001b[1;32m 719\u001b[0m body\u001b[39m=\u001b[39mbody,\n\u001b[1;32m 720\u001b[0m headers\u001b[39m=\u001b[39mheaders,\n\u001b[1;32m 721\u001b[0m chunked\u001b[39m=\u001b[39mchunked,\n\u001b[1;32m 722\u001b[0m )\n\u001b[1;32m 724\u001b[0m \u001b[39m# If we're going to release the connection in ``finally:``, then\u001b[39;00m\n\u001b[1;32m 725\u001b[0m \u001b[39m# the response doesn't need to know about the connection. Otherwise\u001b[39;00m\n\u001b[1;32m 726\u001b[0m \u001b[39m# it will also try to release it and we'll have a double-release\u001b[39;00m\n\u001b[1;32m 727\u001b[0m \u001b[39m# mess.\u001b[39;00m\n",
240 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py:468\u001b[0m, in \u001b[0;36mHTTPConnectionPool._make_request\u001b[0;34m(self, conn, method, url, timeout, chunked, **httplib_request_kw)\u001b[0m\n\u001b[1;32m 467\u001b[0m \u001b[39mexcept\u001b[39;00m (SocketTimeout, BaseSSLError, SocketError) \u001b[39mas\u001b[39;00m e:\n\u001b[0;32m--> 468\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_raise_timeout(err\u001b[39m=\u001b[39me, url\u001b[39m=\u001b[39murl, timeout_value\u001b[39m=\u001b[39mread_timeout)\n\u001b[1;32m 469\u001b[0m \u001b[39mraise\u001b[39;00m\n",
241 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py:357\u001b[0m, in \u001b[0;36mHTTPConnectionPool._raise_timeout\u001b[0;34m(self, err, url, timeout_value)\u001b[0m\n\u001b[1;32m 356\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39misinstance\u001b[39m(err, SocketTimeout):\n\u001b[0;32m--> 357\u001b[0m \u001b[39mraise\u001b[39;00m ReadTimeoutError(\n\u001b[1;32m 358\u001b[0m \u001b[39mself\u001b[39m, url, \u001b[39m\"\u001b[39m\u001b[39mRead timed out. (read timeout=\u001b[39m\u001b[39m%s\u001b[39;00m\u001b[39m)\u001b[39m\u001b[39m\"\u001b[39m \u001b[39m%\u001b[39m timeout_value\n\u001b[1;32m 359\u001b[0m )\n\u001b[1;32m 361\u001b[0m \u001b[39m# See the above comment about EAGAIN in Python 3. In Python 2 we have\u001b[39;00m\n\u001b[1;32m 362\u001b[0m \u001b[39m# to specifically catch it and throw the timeout error\u001b[39;00m\n",
242 | "\u001b[0;31mReadTimeoutError\u001b[0m: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)",
243 | "\nDuring handling of the above exception, another exception occurred:\n",
244 | "\u001b[0;31mReadTimeout\u001b[0m Traceback (most recent call last)",
245 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/api/client.py:214\u001b[0m, in \u001b[0;36mAPIClient._retrieve_server_version\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 213\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m--> 214\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mversion(api_version\u001b[39m=\u001b[39m\u001b[39mFalse\u001b[39;00m)[\u001b[39m\"\u001b[39m\u001b[39mApiVersion\u001b[39m\u001b[39m\"\u001b[39m]\n\u001b[1;32m 215\u001b[0m \u001b[39mexcept\u001b[39;00m \u001b[39mKeyError\u001b[39;00m:\n",
246 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/api/daemon.py:181\u001b[0m, in \u001b[0;36mDaemonApiMixin.version\u001b[0;34m(self, api_version)\u001b[0m\n\u001b[1;32m 180\u001b[0m url \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_url(\u001b[39m\"\u001b[39m\u001b[39m/version\u001b[39m\u001b[39m\"\u001b[39m, versioned_api\u001b[39m=\u001b[39mapi_version)\n\u001b[0;32m--> 181\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_result(\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_get(url), json\u001b[39m=\u001b[39m\u001b[39mTrue\u001b[39;00m)\n",
247 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/utils/decorators.py:46\u001b[0m, in \u001b[0;36mupdate_headers..inner\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 45\u001b[0m kwargs[\u001b[39m'\u001b[39m\u001b[39mheaders\u001b[39m\u001b[39m'\u001b[39m]\u001b[39m.\u001b[39mupdate(\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_general_configs[\u001b[39m'\u001b[39m\u001b[39mHttpHeaders\u001b[39m\u001b[39m'\u001b[39m])\n\u001b[0;32m---> 46\u001b[0m \u001b[39mreturn\u001b[39;00m f(\u001b[39mself\u001b[39m, \u001b[39m*\u001b[39margs, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs)\n",
248 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/api/client.py:237\u001b[0m, in \u001b[0;36mAPIClient._get\u001b[0;34m(self, url, **kwargs)\u001b[0m\n\u001b[1;32m 235\u001b[0m \u001b[39m@update_headers\u001b[39m\n\u001b[1;32m 236\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39m_get\u001b[39m(\u001b[39mself\u001b[39m, url, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs):\n\u001b[0;32m--> 237\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mget(url, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_set_request_timeout(kwargs))\n",
249 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/requests/sessions.py:602\u001b[0m, in \u001b[0;36mSession.get\u001b[0;34m(self, url, **kwargs)\u001b[0m\n\u001b[1;32m 601\u001b[0m kwargs\u001b[39m.\u001b[39msetdefault(\u001b[39m\"\u001b[39m\u001b[39mallow_redirects\u001b[39m\u001b[39m\"\u001b[39m, \u001b[39mTrue\u001b[39;00m)\n\u001b[0;32m--> 602\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mrequest(\u001b[39m\"\u001b[39m\u001b[39mGET\u001b[39m\u001b[39m\"\u001b[39m, url, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs)\n",
250 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/requests/sessions.py:589\u001b[0m, in \u001b[0;36mSession.request\u001b[0;34m(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)\u001b[0m\n\u001b[1;32m 588\u001b[0m send_kwargs\u001b[39m.\u001b[39mupdate(settings)\n\u001b[0;32m--> 589\u001b[0m resp \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39msend(prep, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39msend_kwargs)\n\u001b[1;32m 591\u001b[0m \u001b[39mreturn\u001b[39;00m resp\n",
251 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/requests/sessions.py:703\u001b[0m, in \u001b[0;36mSession.send\u001b[0;34m(self, request, **kwargs)\u001b[0m\n\u001b[1;32m 702\u001b[0m \u001b[39m# Send the request\u001b[39;00m\n\u001b[0;32m--> 703\u001b[0m r \u001b[39m=\u001b[39m adapter\u001b[39m.\u001b[39msend(request, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs)\n\u001b[1;32m 705\u001b[0m \u001b[39m# Total elapsed time of the request (approximately)\u001b[39;00m\n",
252 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/requests/adapters.py:532\u001b[0m, in \u001b[0;36mHTTPAdapter.send\u001b[0;34m(self, request, stream, timeout, verify, cert, proxies)\u001b[0m\n\u001b[1;32m 531\u001b[0m \u001b[39melif\u001b[39;00m \u001b[39misinstance\u001b[39m(e, ReadTimeoutError):\n\u001b[0;32m--> 532\u001b[0m \u001b[39mraise\u001b[39;00m ReadTimeout(e, request\u001b[39m=\u001b[39mrequest)\n\u001b[1;32m 533\u001b[0m \u001b[39melif\u001b[39;00m \u001b[39misinstance\u001b[39m(e, _InvalidHeader):\n",
253 | "\u001b[0;31mReadTimeout\u001b[0m: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)",
254 | "\nDuring handling of the above exception, another exception occurred:\n",
255 | "\u001b[0;31mDockerException\u001b[0m Traceback (most recent call last)",
256 | "\u001b[1;32m/Users/wyang14/sugarforever/LangChain-Advanced/Integrations/AutoGen/autogen_with_open_source_llms.ipynb Cell 13\u001b[0m line \u001b[0;36m1\n\u001b[0;32m----> 1\u001b[0m user_proxy\u001b[39m.\u001b[39minitiate_chat(\n\u001b[1;32m 2\u001b[0m assistant,\n\u001b[1;32m 3\u001b[0m message\u001b[39m=\u001b[39m\u001b[39m\"\u001b[39m\u001b[39mWrite a python function to calculate the square root of a number, and call it with the number 4.\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m 4\u001b[0m )\n",
257 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:531\u001b[0m, in \u001b[0;36mConversableAgent.initiate_chat\u001b[0;34m(self, recipient, clear_history, silent, **context)\u001b[0m\n\u001b[1;32m 517\u001b[0m \u001b[39m\u001b[39m\u001b[39m\"\"\"Initiate a chat with the recipient agent.\u001b[39;00m\n\u001b[1;32m 518\u001b[0m \n\u001b[1;32m 519\u001b[0m \u001b[39mReset the consecutive auto reply counter.\u001b[39;00m\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 528\u001b[0m \u001b[39m \"message\" needs to be provided if the `generate_init_message` method is not overridden.\u001b[39;00m\n\u001b[1;32m 529\u001b[0m \u001b[39m\"\"\"\u001b[39;00m\n\u001b[1;32m 530\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_prepare_chat(recipient, clear_history)\n\u001b[0;32m--> 531\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39msend(\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mgenerate_init_message(\u001b[39m*\u001b[39m\u001b[39m*\u001b[39mcontext), recipient, silent\u001b[39m=\u001b[39msilent)\n",
258 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:334\u001b[0m, in \u001b[0;36mConversableAgent.send\u001b[0;34m(self, message, recipient, request_reply, silent)\u001b[0m\n\u001b[1;32m 332\u001b[0m valid \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_append_oai_message(message, \u001b[39m\"\u001b[39m\u001b[39massistant\u001b[39m\u001b[39m\"\u001b[39m, recipient)\n\u001b[1;32m 333\u001b[0m \u001b[39mif\u001b[39;00m valid:\n\u001b[0;32m--> 334\u001b[0m recipient\u001b[39m.\u001b[39mreceive(message, \u001b[39mself\u001b[39m, request_reply, silent)\n\u001b[1;32m 335\u001b[0m \u001b[39melse\u001b[39;00m:\n\u001b[1;32m 336\u001b[0m \u001b[39mraise\u001b[39;00m \u001b[39mValueError\u001b[39;00m(\n\u001b[1;32m 337\u001b[0m \u001b[39m\"\u001b[39m\u001b[39mMessage can\u001b[39m\u001b[39m'\u001b[39m\u001b[39mt be converted into a valid ChatCompletion message. Either content or function_call must be provided.\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m 338\u001b[0m )\n",
259 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:464\u001b[0m, in \u001b[0;36mConversableAgent.receive\u001b[0;34m(self, message, sender, request_reply, silent)\u001b[0m\n\u001b[1;32m 462\u001b[0m reply \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mgenerate_reply(messages\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mchat_messages[sender], sender\u001b[39m=\u001b[39msender)\n\u001b[1;32m 463\u001b[0m \u001b[39mif\u001b[39;00m reply \u001b[39mis\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mNone\u001b[39;00m:\n\u001b[0;32m--> 464\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39msend(reply, sender, silent\u001b[39m=\u001b[39msilent)\n",
260 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:334\u001b[0m, in \u001b[0;36mConversableAgent.send\u001b[0;34m(self, message, recipient, request_reply, silent)\u001b[0m\n\u001b[1;32m 332\u001b[0m valid \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_append_oai_message(message, \u001b[39m\"\u001b[39m\u001b[39massistant\u001b[39m\u001b[39m\"\u001b[39m, recipient)\n\u001b[1;32m 333\u001b[0m \u001b[39mif\u001b[39;00m valid:\n\u001b[0;32m--> 334\u001b[0m recipient\u001b[39m.\u001b[39mreceive(message, \u001b[39mself\u001b[39m, request_reply, silent)\n\u001b[1;32m 335\u001b[0m \u001b[39melse\u001b[39;00m:\n\u001b[1;32m 336\u001b[0m \u001b[39mraise\u001b[39;00m \u001b[39mValueError\u001b[39;00m(\n\u001b[1;32m 337\u001b[0m \u001b[39m\"\u001b[39m\u001b[39mMessage can\u001b[39m\u001b[39m'\u001b[39m\u001b[39mt be converted into a valid ChatCompletion message. Either content or function_call must be provided.\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m 338\u001b[0m )\n",
261 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:462\u001b[0m, in \u001b[0;36mConversableAgent.receive\u001b[0;34m(self, message, sender, request_reply, silent)\u001b[0m\n\u001b[1;32m 460\u001b[0m \u001b[39mif\u001b[39;00m request_reply \u001b[39mis\u001b[39;00m \u001b[39mFalse\u001b[39;00m \u001b[39mor\u001b[39;00m request_reply \u001b[39mis\u001b[39;00m \u001b[39mNone\u001b[39;00m \u001b[39mand\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mreply_at_receive[sender] \u001b[39mis\u001b[39;00m \u001b[39mFalse\u001b[39;00m:\n\u001b[1;32m 461\u001b[0m \u001b[39mreturn\u001b[39;00m\n\u001b[0;32m--> 462\u001b[0m reply \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mgenerate_reply(messages\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mchat_messages[sender], sender\u001b[39m=\u001b[39msender)\n\u001b[1;32m 463\u001b[0m \u001b[39mif\u001b[39;00m reply \u001b[39mis\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mNone\u001b[39;00m:\n\u001b[1;32m 464\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39msend(reply, sender, silent\u001b[39m=\u001b[39msilent)\n",
262 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:781\u001b[0m, in \u001b[0;36mConversableAgent.generate_reply\u001b[0;34m(self, messages, sender, exclude)\u001b[0m\n\u001b[1;32m 779\u001b[0m \u001b[39mcontinue\u001b[39;00m\n\u001b[1;32m 780\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_match_trigger(reply_func_tuple[\u001b[39m\"\u001b[39m\u001b[39mtrigger\u001b[39m\u001b[39m\"\u001b[39m], sender):\n\u001b[0;32m--> 781\u001b[0m final, reply \u001b[39m=\u001b[39m reply_func(\u001b[39mself\u001b[39m, messages\u001b[39m=\u001b[39mmessages, sender\u001b[39m=\u001b[39msender, config\u001b[39m=\u001b[39mreply_func_tuple[\u001b[39m\"\u001b[39m\u001b[39mconfig\u001b[39m\u001b[39m\"\u001b[39m])\n\u001b[1;32m 782\u001b[0m \u001b[39mif\u001b[39;00m final:\n\u001b[1;32m 783\u001b[0m \u001b[39mreturn\u001b[39;00m reply\n",
263 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:637\u001b[0m, in \u001b[0;36mConversableAgent.generate_code_execution_reply\u001b[0;34m(self, messages, sender, config)\u001b[0m\n\u001b[1;32m 634\u001b[0m \u001b[39mcontinue\u001b[39;00m\n\u001b[1;32m 636\u001b[0m \u001b[39m# found code blocks, execute code and push \"last_n_messages\" back\u001b[39;00m\n\u001b[0;32m--> 637\u001b[0m exitcode, logs \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mexecute_code_blocks(code_blocks)\n\u001b[1;32m 638\u001b[0m code_execution_config[\u001b[39m\"\u001b[39m\u001b[39mlast_n_messages\u001b[39m\u001b[39m\"\u001b[39m] \u001b[39m=\u001b[39m last_n_messages\n\u001b[1;32m 639\u001b[0m exitcode2str \u001b[39m=\u001b[39m \u001b[39m\"\u001b[39m\u001b[39mexecution succeeded\u001b[39m\u001b[39m\"\u001b[39m \u001b[39mif\u001b[39;00m exitcode \u001b[39m==\u001b[39m \u001b[39m0\u001b[39m \u001b[39melse\u001b[39;00m \u001b[39m\"\u001b[39m\u001b[39mexecution failed\u001b[39m\u001b[39m\"\u001b[39m\n",
264 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:908\u001b[0m, in \u001b[0;36mConversableAgent.execute_code_blocks\u001b[0;34m(self, code_blocks)\u001b[0m\n\u001b[1;32m 906\u001b[0m \u001b[39melse\u001b[39;00m:\n\u001b[1;32m 907\u001b[0m filename \u001b[39m=\u001b[39m \u001b[39mNone\u001b[39;00m\n\u001b[0;32m--> 908\u001b[0m exitcode, logs, image \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mrun_code(\n\u001b[1;32m 909\u001b[0m code,\n\u001b[1;32m 910\u001b[0m lang\u001b[39m=\u001b[39m\u001b[39m\"\u001b[39m\u001b[39mpython\u001b[39m\u001b[39m\"\u001b[39m,\n\u001b[1;32m 911\u001b[0m filename\u001b[39m=\u001b[39mfilename,\n\u001b[1;32m 912\u001b[0m \u001b[39m*\u001b[39m\u001b[39m*\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_code_execution_config,\n\u001b[1;32m 913\u001b[0m )\n\u001b[1;32m 914\u001b[0m \u001b[39melse\u001b[39;00m:\n\u001b[1;32m 915\u001b[0m \u001b[39m# In case the language is not supported, we return an error message.\u001b[39;00m\n\u001b[1;32m 916\u001b[0m exitcode, logs, image \u001b[39m=\u001b[39m (\n\u001b[1;32m 917\u001b[0m \u001b[39m1\u001b[39m,\n\u001b[1;32m 918\u001b[0m \u001b[39mf\u001b[39m\u001b[39m\"\u001b[39m\u001b[39munknown language \u001b[39m\u001b[39m{\u001b[39;00mlang\u001b[39m}\u001b[39;00m\u001b[39m\"\u001b[39m,\n\u001b[1;32m 919\u001b[0m \u001b[39mNone\u001b[39;00m,\n\u001b[1;32m 920\u001b[0m )\n",
265 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py:885\u001b[0m, in \u001b[0;36mConversableAgent.run_code\u001b[0;34m(self, code, **kwargs)\u001b[0m\n\u001b[1;32m 871\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mrun_code\u001b[39m(\u001b[39mself\u001b[39m, code, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs):\n\u001b[1;32m 872\u001b[0m \u001b[39m \u001b[39m\u001b[39m\"\"\"Run the code and return the result.\u001b[39;00m\n\u001b[1;32m 873\u001b[0m \n\u001b[1;32m 874\u001b[0m \u001b[39m Override this function to modify the way to run the code.\u001b[39;00m\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 883\u001b[0m \u001b[39m image (str or None): the docker image used for the code execution.\u001b[39;00m\n\u001b[1;32m 884\u001b[0m \u001b[39m \"\"\"\u001b[39;00m\n\u001b[0;32m--> 885\u001b[0m \u001b[39mreturn\u001b[39;00m execute_code(code, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs)\n",
266 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/autogen/code_utils.py:340\u001b[0m, in \u001b[0;36mexecute_code\u001b[0;34m(code, timeout, filename, work_dir, use_docker, lang)\u001b[0m\n\u001b[1;32m 337\u001b[0m \u001b[39mreturn\u001b[39;00m result\u001b[39m.\u001b[39mreturncode, logs, \u001b[39mNone\u001b[39;00m\n\u001b[1;32m 339\u001b[0m \u001b[39m# create a docker client\u001b[39;00m\n\u001b[0;32m--> 340\u001b[0m client \u001b[39m=\u001b[39m docker\u001b[39m.\u001b[39mfrom_env()\n\u001b[1;32m 341\u001b[0m image_list \u001b[39m=\u001b[39m (\n\u001b[1;32m 342\u001b[0m [\u001b[39m\"\u001b[39m\u001b[39mpython:3-alpine\u001b[39m\u001b[39m\"\u001b[39m, \u001b[39m\"\u001b[39m\u001b[39mpython:3\u001b[39m\u001b[39m\"\u001b[39m, \u001b[39m\"\u001b[39m\u001b[39mpython:3-windowsservercore\u001b[39m\u001b[39m\"\u001b[39m]\n\u001b[1;32m 343\u001b[0m \u001b[39mif\u001b[39;00m use_docker \u001b[39mis\u001b[39;00m \u001b[39mTrue\u001b[39;00m\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 346\u001b[0m \u001b[39melse\u001b[39;00m use_docker\n\u001b[1;32m 347\u001b[0m )\n\u001b[1;32m 348\u001b[0m \u001b[39mfor\u001b[39;00m image \u001b[39min\u001b[39;00m image_list:\n\u001b[1;32m 349\u001b[0m \u001b[39m# check if the image exists\u001b[39;00m\n",
267 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/client.py:96\u001b[0m, in \u001b[0;36mDockerClient.from_env\u001b[0;34m(cls, **kwargs)\u001b[0m\n\u001b[1;32m 94\u001b[0m version \u001b[39m=\u001b[39m kwargs\u001b[39m.\u001b[39mpop(\u001b[39m'\u001b[39m\u001b[39mversion\u001b[39m\u001b[39m'\u001b[39m, \u001b[39mNone\u001b[39;00m)\n\u001b[1;32m 95\u001b[0m use_ssh_client \u001b[39m=\u001b[39m kwargs\u001b[39m.\u001b[39mpop(\u001b[39m'\u001b[39m\u001b[39muse_ssh_client\u001b[39m\u001b[39m'\u001b[39m, \u001b[39mFalse\u001b[39;00m)\n\u001b[0;32m---> 96\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mcls\u001b[39m(\n\u001b[1;32m 97\u001b[0m timeout\u001b[39m=\u001b[39mtimeout,\n\u001b[1;32m 98\u001b[0m max_pool_size\u001b[39m=\u001b[39mmax_pool_size,\n\u001b[1;32m 99\u001b[0m version\u001b[39m=\u001b[39mversion,\n\u001b[1;32m 100\u001b[0m use_ssh_client\u001b[39m=\u001b[39muse_ssh_client,\n\u001b[1;32m 101\u001b[0m \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs_from_env(\u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs)\n\u001b[1;32m 102\u001b[0m )\n",
268 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/client.py:45\u001b[0m, in \u001b[0;36mDockerClient.__init__\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 44\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39m__init__\u001b[39m(\u001b[39mself\u001b[39m, \u001b[39m*\u001b[39margs, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs):\n\u001b[0;32m---> 45\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mapi \u001b[39m=\u001b[39m APIClient(\u001b[39m*\u001b[39margs, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs)\n",
269 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/api/client.py:197\u001b[0m, in \u001b[0;36mAPIClient.__init__\u001b[0;34m(self, base_url, version, timeout, tls, user_agent, num_pools, credstore_env, use_ssh_client, max_pool_size)\u001b[0m\n\u001b[1;32m 192\u001b[0m \u001b[39m# version detection needs to be after unix adapter mounting\u001b[39;00m\n\u001b[1;32m 193\u001b[0m \u001b[39mif\u001b[39;00m version \u001b[39mis\u001b[39;00m \u001b[39mNone\u001b[39;00m \u001b[39mor\u001b[39;00m (\u001b[39misinstance\u001b[39m(\n\u001b[1;32m 194\u001b[0m version,\n\u001b[1;32m 195\u001b[0m \u001b[39mstr\u001b[39m\n\u001b[1;32m 196\u001b[0m ) \u001b[39mand\u001b[39;00m version\u001b[39m.\u001b[39mlower() \u001b[39m==\u001b[39m \u001b[39m'\u001b[39m\u001b[39mauto\u001b[39m\u001b[39m'\u001b[39m):\n\u001b[0;32m--> 197\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_version \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_retrieve_server_version()\n\u001b[1;32m 198\u001b[0m \u001b[39melse\u001b[39;00m:\n\u001b[1;32m 199\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_version \u001b[39m=\u001b[39m version\n",
270 | "File \u001b[0;32m/opt/homebrew/anaconda3/lib/python3.11/site-packages/docker/api/client.py:221\u001b[0m, in \u001b[0;36mAPIClient._retrieve_server_version\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 216\u001b[0m \u001b[39mraise\u001b[39;00m DockerException(\n\u001b[1;32m 217\u001b[0m \u001b[39m'\u001b[39m\u001b[39mInvalid response from docker daemon: key \u001b[39m\u001b[39m\"\u001b[39m\u001b[39mApiVersion\u001b[39m\u001b[39m\"\u001b[39m\u001b[39m'\u001b[39m\n\u001b[1;32m 218\u001b[0m \u001b[39m'\u001b[39m\u001b[39m is missing.\u001b[39m\u001b[39m'\u001b[39m\n\u001b[1;32m 219\u001b[0m )\n\u001b[1;32m 220\u001b[0m \u001b[39mexcept\u001b[39;00m \u001b[39mException\u001b[39;00m \u001b[39mas\u001b[39;00m e:\n\u001b[0;32m--> 221\u001b[0m \u001b[39mraise\u001b[39;00m DockerException(\n\u001b[1;32m 222\u001b[0m \u001b[39mf\u001b[39m\u001b[39m'\u001b[39m\u001b[39mError while fetching server API version: \u001b[39m\u001b[39m{\u001b[39;00me\u001b[39m}\u001b[39;00m\u001b[39m'\u001b[39m\n\u001b[1;32m 223\u001b[0m )\n",
271 | "\u001b[0;31mDockerException\u001b[0m: Error while fetching server API version: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)"
272 | ]
273 | }
274 | ],
275 | "source": [
276 | "user_proxy.initiate_chat(\n",
277 | " assistant,\n",
278 | " message=\"Write a python function to calculate the square root of a number, and call it with the number 4.\"\n",
279 | ")"
280 | ]
281 | }
282 | ],
283 | "metadata": {
284 | "colab": {
285 | "include_colab_link": true,
286 | "provenance": []
287 | },
288 | "kernelspec": {
289 | "display_name": "Python 3",
290 | "language": "python",
291 | "name": "python3"
292 | },
293 | "language_info": {
294 | "codemirror_mode": {
295 | "name": "ipython",
296 | "version": 3
297 | },
298 | "file_extension": ".py",
299 | "mimetype": "text/x-python",
300 | "name": "python",
301 | "nbconvert_exporter": "python",
302 | "pygments_lexer": "ipython3",
303 | "version": "3.11.4"
304 | },
305 | "vscode": {
306 | "interpreter": {
307 | "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
308 | }
309 | }
310 | },
311 | "nbformat": 4,
312 | "nbformat_minor": 0
313 | }
314 |
--------------------------------------------------------------------------------
/Integrations/AutoGen/comprehensive_overview_of_large_language_models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Integrations/AutoGen/comprehensive_overview_of_large_language_models.pdf
--------------------------------------------------------------------------------
/Integrations/AutoGen/uniswap_v3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Integrations/AutoGen/uniswap_v3.pdf
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2023 sugarforever
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # LangChain-Advanced
--------------------------------------------------------------------------------
/Retrievers/01_MultiQuery_Retriever.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyO89L7nv9/pdmZVzUVpKS+b",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | }
17 | },
18 | "cells": [
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {
22 | "id": "view-in-github",
23 | "colab_type": "text"
24 | },
25 | "source": [
26 | "
"
27 | ]
28 | },
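  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MultiQuery Retriever\n",
    "\n",
    "The `MultiQueryRetriever` uses an LLM to generate multiple variants of a user question, runs a similarity search for each variant, and returns the unique union of the retrieved documents. Generating several perspectives on the same question helps overcome some of the limitations of a single distance-based similarity search."
   ]
  },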
29 | {
30 | "cell_type": "code",
31 | "execution_count": null,
32 | "metadata": {
33 | "colab": {
34 | "base_uri": "https://localhost:8080/"
35 | },
36 | "id": "xPWZLTgY5E_4",
37 | "outputId": "4e54d9d6-64dd-4aa6-d025-2af049efc249"
38 | },
39 | "outputs": [
40 | {
41 | "output_type": "stream",
42 | "name": "stdout",
43 | "text": [
44 | "\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/1.8 MB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[91m━━━━━━━\u001b[0m\u001b[90m╺\u001b[0m\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.3/1.8 MB\u001b[0m \u001b[31m9.6 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.8/1.8 MB\u001b[0m \u001b[31m25.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
45 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.0/77.0 kB\u001b[0m \u001b[31m8.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
46 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m437.8/437.8 kB\u001b[0m \u001b[31m32.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
47 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m66.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
48 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.4/2.4 MB\u001b[0m \u001b[31m65.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
49 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m66.3/66.3 kB\u001b[0m \u001b[31m6.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
50 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m59.5/59.5 kB\u001b[0m \u001b[31m6.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
51 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.4/5.4 MB\u001b[0m \u001b[31m64.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
52 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m6.2/6.2 MB\u001b[0m \u001b[31m64.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
53 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.8/3.8 MB\u001b[0m \u001b[31m65.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
54 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.3/67.3 kB\u001b[0m \u001b[31m5.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
55 | "\u001b[?25h Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
56 | " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
57 | " Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
58 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m593.7/593.7 kB\u001b[0m \u001b[31m43.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
59 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.4/49.4 kB\u001b[0m \u001b[31m4.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
60 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.0/67.0 kB\u001b[0m \u001b[31m6.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
61 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m46.0/46.0 kB\u001b[0m \u001b[31m3.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
62 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m268.8/268.8 kB\u001b[0m \u001b[31m23.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
63 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m6.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
64 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m428.8/428.8 kB\u001b[0m \u001b[31m39.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
65 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.1/4.1 MB\u001b[0m \u001b[31m87.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
66 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m46.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
67 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m129.9/129.9 kB\u001b[0m \u001b[31m12.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
68 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m86.8/86.8 kB\u001b[0m \u001b[31m7.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
69 | "\u001b[?25h Building wheel for pypika (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n"
70 | ]
71 | }
72 | ],
73 | "source": [
74 | "!pip install -q -U langchain openai chromadb tiktoken"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "source": [
80 | "import os\n",
81 | "os.environ['OPENAI_API_KEY'] = \"your valid openai api key\""
82 | ],
83 | "metadata": {
84 | "id": "eVQjKJn-5ZZZ"
85 | },
86 | "execution_count": null,
87 | "outputs": []
88 | },
89 | {
90 | "cell_type": "code",
91 | "source": [
92 | "import logging\n",
93 | "\n",
94 | "logging.basicConfig()\n",
95 | "logging.getLogger(\"langchain.retrievers.multi_query\").setLevel(logging.INFO)"
96 | ],
97 | "metadata": {
98 | "id": "hLkz1l-25xLY"
99 | },
100 | "execution_count": null,
101 | "outputs": []
102 | },
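  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Enabling `INFO` logging for `langchain.retrievers.multi_query` makes the retriever log the alternative queries it generates, as shown in the outputs below."
   ]
  },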
103 | {
104 | "cell_type": "code",
105 | "source": [
106 | "from langchain.vectorstores import Chroma\n",
107 | "from langchain.document_loaders import WebBaseLoader\n",
108 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
109 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
110 | "\n",
111 | "# Load blog post\n",
112 | "loader = WebBaseLoader(\"https://blog.langchain.dev/langchain-prompt-hub/\")\n",
113 | "data = loader.load()\n",
114 | "\n",
115 | "# Split\n",
116 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
117 | "splits = text_splitter.split_documents(data)\n",
118 | "\n",
119 | "# VectorDB\n",
120 | "embedding = OpenAIEmbeddings()\n",
121 | "vectordb = Chroma.from_documents(documents=splits, embedding=embedding)"
122 | ],
123 | "metadata": {
124 | "id": "wXOpXACj5YKS"
125 | },
126 | "execution_count": null,
127 | "outputs": []
128 | },
129 | {
130 | "cell_type": "code",
131 | "source": [
132 | "from langchain.chat_models import ChatOpenAI\n",
133 | "from langchain.retrievers.multi_query import MultiQueryRetriever\n",
134 | "\n",
135 | "question = \"What can we do with LangChain hub?\"\n",
136 | "llm = ChatOpenAI(temperature=0)\n",
137 | "retriever_from_llm = MultiQueryRetriever.from_llm(retriever=vectordb.as_retriever(), llm=llm)"
138 | ],
139 | "metadata": {
140 | "id": "G1dVQCZA5mMt"
141 | },
142 | "execution_count": null,
143 | "outputs": []
144 | },
145 | {
146 | "cell_type": "code",
147 | "source": [
148 | "unique_docs = retriever_from_llm.get_relevant_documents(query=question)\n",
149 | "unique_docs"
150 | ],
151 | "metadata": {
152 | "colab": {
153 | "base_uri": "https://localhost:8080/"
154 | },
155 | "id": "wJUcf3kW52es",
156 | "outputId": "682eccc9-8852-406e-f7db-59eee22e24ec"
157 | },
158 | "execution_count": null,
159 | "outputs": [
160 | {
161 | "output_type": "stream",
162 | "name": "stderr",
163 | "text": [
164 | "INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can the LangChain hub be utilized?', '2. What are the possible applications of the LangChain hub?', '3. In what ways can the LangChain hub be used effectively?']\n"
165 | ]
166 | },
167 | {
168 | "output_type": "execute_result",
169 | "data": {
170 | "text/plain": [
171 | "[Document(page_content=\"Today, we're excited to launch LangChain Hub‚Äìa home for uploading, browsing, pulling, and managing your prompts. (Soon, we'll be adding other artifacts like chains and agents).\\uf8ffüí°Explore the Hub hereLangChain Hub is built into LangSmith (more on that below) so there are 2 ways to start exploring LangChain Hub.With LangSmith access: Full read and write permissions. You can explore all existing prompts and upload your own by logging in and navigate to the Hub from your admin panel.Without\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
172 | " Document(page_content=\"way to facilitate this kind of collaboration.We're aiming to make LangChain Hub the best place for teams to write and manage prompts, together. The product isn't quite there today–this first iteration only supports personal accounts–but we're actively looking for organizations that are excited to explore an Alpha with us so if you want organizational support for the Hub, please reach out to us directly at support@langchain.dev with the subject [Hub: Orgs]4. Artifact Management and\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
173 | " Document(page_content=\"applications. It's about advancing our collective wisdom and translating that into knowledge we can all put to use now. We want to help make this easier on an individual, team, and organization scale, across any use-case and every industry. Our goal for LangChain Hub is that it becomes the go-to place for developers to discover new use cases and polished prompts.Today, polished prompts and the wisdom that comes with it are distributed across the web and all-too-often buried in the crannies of\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
174 | " Document(page_content='Announcing LangChain Hub\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n LangChain\\n \\n\\n\\n\\n\\n\\n\\nHome\\nBy LangChain\\nRelease Notes\\nGitHub\\nDocs\\nWrite with Us\\n\\n\\n\\n\\n\\nSign in\\nSubscribe\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n Sep 5, 2023\\n \\n\\n\\n 6 min read\\n \\n\\n\\n By LangChain\\n \\n\\n\\nAnnouncing LangChain Hub', metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
175 | " Document(page_content=\"syntax (e.g. SYS and INST for Llama2).As developers explore the wide variety of models, we hope the LangChain Hub can assist in that exploration by providing starter prompts for those models. We've added tags to prompts to indicate which model(s) they work best with.2. InspectabilityPrompts power the chains and agents in LangChain. Often times, the prompts are obfuscated away. We built LangChain Hub in a way that puts them front and center, so that anyone can see what's going on under the\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
176 | " Document(page_content=\"LangSmithFrom partnering with early LangSmith users, the tie-in between debugging, logging, testing, and evaluation and artifact management has become increasingly obvious. By making LangChain Hub a part of LangSmith, we knew we could help teams not only identify and collaborate on prompts, but also make informed decisions about how to implement them. Testing integrations with prompts aren't out yet but they are coming soon!Favorite FeaturesHome PageWe want to make discoverability and\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'})]"
177 | ]
178 | },
179 | "metadata": {},
180 | "execution_count": 7
181 | }
182 | ]
183 | },
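  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Supplying a custom prompt\n",
    "\n",
    "Instead of relying on the default prompt built into `MultiQueryRetriever.from_llm`, you can construct the retriever from an `LLMChain` with your own `PromptTemplate` and an output parser that splits the LLM result into a list of queries."
   ]
  },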
184 | {
185 | "cell_type": "code",
186 | "source": [
187 | "from typing import List\n",
188 | "from langchain.chains import LLMChain\n",
189 | "from pydantic import BaseModel, Field\n",
190 | "from langchain.prompts import PromptTemplate\n",
191 | "from langchain.output_parsers import PydanticOutputParser\n",
192 | "from langchain.retrievers.multi_query import LineListOutputParser\n",
193 | "\n",
194 | "\n",
195 | "# Output parser will split the LLM result into a list of queries\n",
196 | "'''\n",
197 | "class LineList(BaseModel):\n",
198 | " # \"lines\" is the key (attribute name) of the parsed output\n",
199 | " lines: List[str] = Field(description=\"Lines of text\")\n",
200 | "\n",
201 | "\n",
202 | "class LineListOutputParser(PydanticOutputParser):\n",
203 | " def __init__(self) -> None:\n",
204 | " super().__init__(pydantic_object=LineList)\n",
205 | "\n",
206 | " def parse(self, text: str) -> LineList:\n",
207 | " print(f\"Input text: {text}\")\n",
208 | " lines = text.strip().split(\"\\n\")\n",
209 | " return LineList(lines=lines)\n",
210 | "'''\n",
211 | "\n",
212 | "output_parser = LineListOutputParser()\n",
213 | "\n",
214 | "QUERY_PROMPT = PromptTemplate(\n",
215 | " input_variables=[\"question\"],\n",
216 | " template=\"\"\"You are an AI language model assistant. Your task is to generate five\n",
217 | " different versions of the given user question to retrieve relevant documents from a vector\n",
218 | " database. By generating multiple perspectives on the user question, your goal is to help\n",
219 | " the user overcome some of the limitations of the distance-based similarity search.\n",
220 | " Provide these alternative questions seperated by newlines.\n",
221 | " Original question: {question}\"\"\",\n",
222 | ")\n",
223 | "llm = ChatOpenAI(temperature=0)\n",
224 | "\n",
225 | "# Chain\n",
226 | "llm_chain = LLMChain(llm=llm, prompt=QUERY_PROMPT, output_parser=output_parser)"
227 | ],
228 | "metadata": {
229 | "id": "qbqSmDne57Qj"
230 | },
231 | "execution_count": null,
232 | "outputs": []
233 | },
234 | {
235 | "cell_type": "code",
236 | "source": [
237 | "# Run\n",
238 | "retriever = MultiQueryRetriever(\n",
239 | " retriever=vectordb.as_retriever(), llm_chain=llm_chain, parser_key=\"lines\"\n",
240 | ") # \"lines\" is the key (attribute name) of the parsed output\n",
241 | "\n",
242 | "unique_docs = retriever.get_relevant_documents(query=question)\n",
243 | "unique_docs"
244 | ],
245 | "metadata": {
246 | "colab": {
247 | "base_uri": "https://localhost:8080/"
248 | },
249 | "id": "b1Art5pt6I3k",
250 | "outputId": "8bb4db8a-36c2-4bf1-f05f-698358281336"
251 | },
252 | "execution_count": null,
253 | "outputs": [
254 | {
255 | "output_type": "stream",
256 | "name": "stderr",
257 | "text": [
258 | "INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can the LangChain hub be utilized?', '2. What are the possible applications of the LangChain hub?', '3. In what ways can the LangChain hub be used?', '4. What functionalities does the LangChain hub offer?', '5. What are the potential uses of the LangChain hub?']\n"
259 | ]
260 | },
261 | {
262 | "output_type": "execute_result",
263 | "data": {
264 | "text/plain": [
265 | "[Document(page_content=\"Today, we're excited to launch LangChain Hub‚Äìa home for uploading, browsing, pulling, and managing your prompts. (Soon, we'll be adding other artifacts like chains and agents).\\uf8ffüí°Explore the Hub hereLangChain Hub is built into LangSmith (more on that below) so there are 2 ways to start exploring LangChain Hub.With LangSmith access: Full read and write permissions. You can explore all existing prompts and upload your own by logging in and navigate to the Hub from your admin panel.Without\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
266 | " Document(page_content=\"way to facilitate this kind of collaboration.We're aiming to make LangChain Hub the best place for teams to write and manage prompts, together. The product isn't quite there today–this first iteration only supports personal accounts–but we're actively looking for organizations that are excited to explore an Alpha with us so if you want organizational support for the Hub, please reach out to us directly at support@langchain.dev with the subject [Hub: Orgs]4. Artifact Management and\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
267 | " Document(page_content=\"applications. It's about advancing our collective wisdom and translating that into knowledge we can all put to use now. We want to help make this easier on an individual, team, and organization scale, across any use-case and every industry. Our goal for LangChain Hub is that it becomes the go-to place for developers to discover new use cases and polished prompts.Today, polished prompts and the wisdom that comes with it are distributed across the web and all-too-often buried in the crannies of\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
268 | " Document(page_content='Announcing LangChain Hub\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n LangChain\\n \\n\\n\\n\\n\\n\\n\\nHome\\nBy LangChain\\nRelease Notes\\nGitHub\\nDocs\\nWrite with Us\\n\\n\\n\\n\\n\\nSign in\\nSubscribe\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n Sep 5, 2023\\n \\n\\n\\n 6 min read\\n \\n\\n\\n By LangChain\\n \\n\\n\\nAnnouncing LangChain Hub', metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
269 | " Document(page_content=\"syntax (e.g. SYS and INST for Llama2).As developers explore the wide variety of models, we hope the LangChain Hub can assist in that exploration by providing starter prompts for those models. We've added tags to prompts to indicate which model(s) they work best with.2. InspectabilityPrompts power the chains and agents in LangChain. Often times, the prompts are obfuscated away. We built LangChain Hub in a way that puts them front and center, so that anyone can see what's going on under the\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'}),\n",
270 | " Document(page_content=\"LangSmithFrom partnering with early LangSmith users, the tie-in between debugging, logging, testing, and evaluation and artifact management has become increasingly obvious. By making LangChain Hub a part of LangSmith, we knew we could help teams not only identify and collaborate on prompts, but also make informed decisions about how to implement them. Testing integrations with prompts aren't out yet but they are coming soon!Favorite FeaturesHome PageWe want to make discoverability and\", metadata={'language': 'en', 'source': 'https://blog.langchain.dev/langchain-prompt-hub/', 'title': 'Announcing LangChain Hub'})]"
271 | ]
272 | },
273 | "metadata": {},
274 | "execution_count": 9
275 | }
276 | ]
277 | },
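278 | {
279 | "cell_type": "markdown",
280 | "source": [
281 | "When the default prompt and output parser are sufficient, the custom chain above is unnecessary. A minimal one-line sketch, assuming the same `vectordb` and `ChatOpenAI` objects used in this notebook:\n",
282 | "\n",
283 | "```python\n",
284 | "retriever = MultiQueryRetriever.from_llm(retriever=vectordb.as_retriever(), llm=ChatOpenAI(temperature=0))\n",
285 | "```"
286 | ],
287 | "metadata": {}
288 | }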
289 | ]
290 | }
--------------------------------------------------------------------------------
/Retrievers/02_Contextual_Compression_Retriever.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyOj68rOFVk/fSt+6/mGaOcl",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | }
17 | },
18 | "cells": [
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {
22 | "id": "view-in-github",
23 | "colab_type": "text"
24 | },
25 | "source": [
26 | "
"
27 | ]
28 | },
29 | {
30 | "cell_type": "code",
31 | "execution_count": null,
32 | "metadata": {
33 | "colab": {
34 | "base_uri": "https://localhost:8080/"
35 | },
36 | "id": "xPWZLTgY5E_4",
37 | "outputId": "c87cf6f1-c676-4faf-8fb7-def934a3f5da"
38 | },
39 | "outputs": [
40 | {
41 | "output_type": "stream",
42 | "name": "stdout",
43 | "text": [
44 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.8/1.8 MB\u001b[0m \u001b[31m18.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
45 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.0/77.0 kB\u001b[0m \u001b[31m7.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
46 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m437.8/437.8 kB\u001b[0m \u001b[31m32.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
47 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m70.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
48 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.4/2.4 MB\u001b[0m \u001b[31m61.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
49 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m66.3/66.3 kB\u001b[0m \u001b[31m5.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
50 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m59.5/59.5 kB\u001b[0m \u001b[31m6.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
51 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.4/5.4 MB\u001b[0m \u001b[31m80.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
52 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m6.2/6.2 MB\u001b[0m \u001b[31m82.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
53 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.8/3.8 MB\u001b[0m \u001b[31m55.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
54 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.3/67.3 kB\u001b[0m \u001b[31m3.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
55 | "\u001b[?25h Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
56 | " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
57 | " Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
58 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m593.7/593.7 kB\u001b[0m \u001b[31m42.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
59 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.4/49.4 kB\u001b[0m \u001b[31m4.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
60 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.0/67.0 kB\u001b[0m \u001b[31m5.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
61 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m46.0/46.0 kB\u001b[0m \u001b[31m4.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
62 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m268.8/268.8 kB\u001b[0m \u001b[31m20.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
63 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m5.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
64 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m428.8/428.8 kB\u001b[0m \u001b[31m26.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
65 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.1/4.1 MB\u001b[0m \u001b[31m81.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
66 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m60.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
67 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m129.9/129.9 kB\u001b[0m \u001b[31m11.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
68 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m86.8/86.8 kB\u001b[0m \u001b[31m8.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
69 | "\u001b[?25h Building wheel for pypika (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n"
70 | ]
71 | }
72 | ],
73 | "source": [
74 | "!pip install -q -U langchain openai chromadb tiktoken"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "source": [
80 | "import os\n",
81 | "os.environ['OPENAI_API_KEY'] = \"your valid openai api key\""
82 | ],
83 | "metadata": {
84 | "id": "eVQjKJn-5ZZZ"
85 | },
86 | "execution_count": null,
87 | "outputs": []
88 | },
89 | {
90 | "cell_type": "code",
91 | "source": [
92 | "def pretty_print_docs(docs):\n",
93 | " print(f\"\\n{'-' * 100}\\n\".join([f\"Document {i+1}:\\n\\n\" + d.page_content for i, d in enumerate(docs)]))"
94 | ],
95 | "metadata": {
96 | "id": "gNt6WEub04Rz"
97 | },
98 | "execution_count": null,
99 | "outputs": []
100 | },
101 | {
102 | "cell_type": "code",
103 | "source": [
104 | "!mkdir data\n",
105 | "!wget https://raw.githubusercontent.com/jerryjliu/llama_index/main/examples/paul_graham_essay/data/paul_graham_essay.txt -O data/paul_graham_essay.txt"
106 | ],
107 | "metadata": {
108 | "colab": {
109 | "base_uri": "https://localhost:8080/"
110 | },
111 | "id": "7PyoqHpu6Ha2",
112 | "outputId": "940f029e-a555-4518-eb31-436ebfb3ad86"
113 | },
114 | "execution_count": null,
115 | "outputs": [
116 | {
117 | "output_type": "stream",
118 | "name": "stdout",
119 | "text": [
120 | "--2023-09-30 21:04:52-- https://raw.githubusercontent.com/jerryjliu/llama_index/main/examples/paul_graham_essay/data/paul_graham_essay.txt\n",
121 | "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
122 | "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
123 | "HTTP request sent, awaiting response... 200 OK\n",
124 | "Length: 75047 (73K) [text/plain]\n",
125 | "Saving to: ‘data/paul_graham_essay.txt’\n",
126 | "\n",
127 | "\r data/paul 0%[ ] 0 --.-KB/s \rdata/paul_graham_es 100%[===================>] 73.29K --.-KB/s in 0.03s \n",
128 | "\n",
129 | "2023-09-30 21:04:53 (2.72 MB/s) - ‘data/paul_graham_essay.txt’ saved [75047/75047]\n",
130 | "\n"
131 | ]
132 | }
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "source": [
138 | "question = \"Where did Paul Graham study?\""
139 | ],
140 | "metadata": {
141 | "id": "pDr0keyO1qIx"
142 | },
143 | "execution_count": null,
144 | "outputs": []
145 | },
146 | {
147 | "cell_type": "code",
148 | "source": [
149 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
150 | "from langchain.embeddings import OpenAIEmbeddings\n",
151 | "from langchain.document_loaders import WebBaseLoader\n",
152 | "from langchain.vectorstores import Chroma\n",
153 | "\n",
154 | "# Load blog post\n",
155 | "loader = WebBaseLoader(\"https://raw.githubusercontent.com/jerryjliu/llama_index/main/examples/paul_graham_essay/data/paul_graham_essay.txt\")\n",
156 | "data = loader.load()\n",
157 | "\n",
158 | "# Split\n",
159 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
160 | "splits = text_splitter.split_documents(data)\n",
161 | "\n",
162 | "retriever = Chroma.from_documents(splits, OpenAIEmbeddings(), collection_name=\"example\").as_retriever()\n",
163 | "\n",
164 | "docs = retriever.get_relevant_documents(question)\n",
165 | "pretty_print_docs(docs)"
166 | ],
167 | "metadata": {
168 | "colab": {
169 | "base_uri": "https://localhost:8080/"
170 | },
171 | "id": "Zlx1PgXj02Sl",
172 | "outputId": "ef146e52-97b7-4771-fc43-cc479292989a"
173 | },
174 | "execution_count": null,
175 | "outputs": [
176 | {
177 | "output_type": "stream",
178 | "name": "stdout",
179 | "text": [
180 | "Document 1:\n",
181 | "\n",
182 | "So I said no more essays till Bel was done. But I told few people about Bel while I was working on it. So for years it must have seemed that I was doing nothing, when in fact I was working harder than I'd ever worked on anything. Occasionally after wrestling for hours with some gruesome bug I'd check Twitter or HN and see someone asking \"Does Paul Graham still code?\"\n",
183 | "----------------------------------------------------------------------------------------------------\n",
184 | "Document 2:\n",
185 | "\n",
186 | "I applied to 3 grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which I'd visited because Rich Draves went there, and was also home to Bill Woods, who'd invented the type of parser I used in my SHRDLU clone. Only Harvard accepted me, so that was where I went.\n",
187 | "----------------------------------------------------------------------------------------------------\n",
188 | "Document 3:\n",
189 | "\n",
190 | "That fall I started taking art classes at Harvard. Grad students could take classes in any department, and my advisor, Tom Cheatham, was very easy going. If he even knew about the strange classes I was taking, he never said anything.\n",
191 | "----------------------------------------------------------------------------------------------------\n",
192 | "Document 4:\n",
193 | "\n",
194 | "Arc, in a house I bought in Cambridge.\n"
195 | ]
196 | }
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "source": [
202 | "from langchain.llms import OpenAI\n",
203 | "from langchain.retrievers import ContextualCompressionRetriever\n",
204 | "from langchain.retrievers.document_compressors import LLMChainExtractor\n",
205 | "\n",
206 | "llm = OpenAI(temperature=0)\n",
207 | "compressor = LLMChainExtractor.from_llm(llm)\n",
208 | "compression_retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)\n",
209 | "\n",
210 | "compressed_docs = compression_retriever.get_relevant_documents(question)\n",
211 | "pretty_print_docs(compressed_docs)"
212 | ],
213 | "metadata": {
214 | "colab": {
215 | "base_uri": "https://localhost:8080/"
216 | },
217 | "id": "zCvxJLSp1Ewa",
218 | "outputId": "ed90aea5-a18f-40fb-bd07-a1509f7a3855"
219 | },
220 | "execution_count": null,
221 | "outputs": [
222 | {
223 | "output_type": "stream",
224 | "name": "stderr",
225 | "text": [
226 | "/usr/local/lib/python3.10/dist-packages/langchain/chains/llm.py:280: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.\n",
227 | " warnings.warn(\n"
228 | ]
229 | },
230 | {
231 | "output_type": "stream",
232 | "name": "stdout",
233 | "text": [
234 | "Document 1:\n",
235 | "\n",
236 | "Harvard\n",
237 | "----------------------------------------------------------------------------------------------------\n",
238 | "Document 2:\n",
239 | "\n",
240 | "\"Harvard\"\n",
241 | "----------------------------------------------------------------------------------------------------\n",
242 | "Document 3:\n",
243 | "\n",
244 | "Arc, Cambridge.\n"
245 | ]
246 | }
247 | ]
248 | },
249 | {
250 | "cell_type": "code",
251 | "source": [
252 | "from langchain.retrievers.document_compressors import LLMChainFilter\n",
253 | "\n",
254 | "_filter = LLMChainFilter.from_llm(llm)\n",
255 | "compression_retriever = ContextualCompressionRetriever(base_compressor=_filter, base_retriever=retriever)\n",
256 | "\n",
257 | "compressed_docs = compression_retriever.get_relevant_documents(question)\n",
258 | "pretty_print_docs(compressed_docs)"
259 | ],
260 | "metadata": {
261 | "colab": {
262 | "base_uri": "https://localhost:8080/"
263 | },
264 | "id": "Xa-6AdPu1GwG",
265 | "outputId": "930bd734-522d-46ae-efea-32c7255ce189"
266 | },
267 | "execution_count": null,
268 | "outputs": [
269 | {
270 | "output_type": "stream",
271 | "name": "stdout",
272 | "text": [
273 | "Document 1:\n",
274 | "\n",
275 | "I applied to 3 grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which I'd visited because Rich Draves went there, and was also home to Bill Woods, who'd invented the type of parser I used in my SHRDLU clone. Only Harvard accepted me, so that was where I went.\n"
276 | ]
277 | }
278 | ]
279 | },
280 | {
281 | "cell_type": "code",
282 | "source": [
283 | "from langchain.embeddings import OpenAIEmbeddings\n",
284 | "from langchain.retrievers.document_compressors import EmbeddingsFilter\n",
285 | "\n",
286 | "embeddings = OpenAIEmbeddings()\n",
287 | "embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.8)\n",
288 | "compression_retriever = ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)\n",
289 | "\n",
290 | "compressed_docs = compression_retriever.get_relevant_documents(question)\n",
291 | "pretty_print_docs(compressed_docs)"
292 | ],
293 | "metadata": {
294 | "colab": {
295 | "base_uri": "https://localhost:8080/"
296 | },
297 | "id": "V-3OzgRT1IVP",
298 | "outputId": "56d5286c-fca2-444d-bb5b-5f28650ce4aa"
299 | },
300 | "execution_count": null,
301 | "outputs": [
302 | {
303 | "output_type": "stream",
304 | "name": "stdout",
305 | "text": [
306 | "Document 1:\n",
307 | "\n",
308 | "So I said no more essays till Bel was done. But I told few people about Bel while I was working on it. So for years it must have seemed that I was doing nothing, when in fact I was working harder than I'd ever worked on anything. Occasionally after wrestling for hours with some gruesome bug I'd check Twitter or HN and see someone asking \"Does Paul Graham still code?\"\n",
309 | "----------------------------------------------------------------------------------------------------\n",
310 | "Document 2:\n",
311 | "\n",
312 | "I applied to 3 grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which I'd visited because Rich Draves went there, and was also home to Bill Woods, who'd invented the type of parser I used in my SHRDLU clone. Only Harvard accepted me, so that was where I went.\n"
313 | ]
314 | }
315 | ]
316 | },
317 | {
318 | "cell_type": "code",
319 | "source": [
320 | "from langchain.document_transformers import EmbeddingsRedundantFilter\n",
321 | "from langchain.retrievers.document_compressors import DocumentCompressorPipeline\n",
322 | "from langchain.text_splitter import CharacterTextSplitter\n",
323 | "\n",
324 | "splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0, separator=\". \")\n",
325 | "redundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)\n",
326 | "relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)\n",
327 | "pipeline_compressor = DocumentCompressorPipeline(\n",
328 | " transformers=[splitter, redundant_filter, relevant_filter]\n",
329 | ")"
330 | ],
331 | "metadata": {
332 | "id": "JmesmHkY1KRg"
333 | },
334 | "execution_count": null,
335 | "outputs": []
336 | },
337 | {
338 | "cell_type": "code",
339 | "source": [
340 | "compression_retriever = ContextualCompressionRetriever(base_compressor=pipeline_compressor, base_retriever=retriever)\n",
341 | "\n",
342 | "compressed_docs = compression_retriever.get_relevant_documents(question)\n",
343 | "pretty_print_docs(compressed_docs)"
344 | ],
345 | "metadata": {
346 | "colab": {
347 | "base_uri": "https://localhost:8080/"
348 | },
349 | "id": "gLT2CB4h1MeS",
350 | "outputId": "4f2364cc-458e-43cd-857c-62ef1b8e48f2"
351 | },
352 | "execution_count": null,
353 | "outputs": [
354 | {
355 | "output_type": "stream",
356 | "name": "stdout",
357 | "text": [
358 | "Document 1:\n",
359 | "\n",
360 | "Occasionally after wrestling for hours with some gruesome bug I'd check Twitter or HN and see someone asking \"Does Paul Graham still code?\"\n",
361 | "----------------------------------------------------------------------------------------------------\n",
362 | "Document 2:\n",
363 | "\n",
364 | "I applied to 3 grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which I'd visited because Rich Draves went there, and was also home to Bill Woods, who'd invented the type of parser I used in my SHRDLU clone. Only Harvard accepted me, so that was where I went.\n",
365 | "----------------------------------------------------------------------------------------------------\n",
366 | "Document 3:\n",
367 | "\n",
368 | "That fall I started taking art classes at Harvard. Grad students could take classes in any department, and my advisor, Tom Cheatham, was very easy going. If he even knew about the strange classes I was taking, he never said anything.\n",
369 | "----------------------------------------------------------------------------------------------------\n",
370 | "Document 4:\n",
371 | "\n",
372 | "Arc, in a house I bought in Cambridge.\n"
373 | ]
374 | }
375 | ]
376 | }
377 | ]
378 | }
--------------------------------------------------------------------------------
/Retrievers/03_Ensemble_Retriever.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyMPAOeEbrTU6JWjzESEf3uo",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | }
17 | },
18 | "cells": [
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {
22 | "id": "view-in-github",
23 | "colab_type": "text"
24 | },
25 | "source": [
26 | "
"
27 | ]
28 | },
29 | {
30 | "cell_type": "code",
31 | "execution_count": null,
32 | "metadata": {
33 | "id": "xPWZLTgY5E_4"
34 | },
35 | "outputs": [],
36 | "source": [
37 | "!pip install -q -U langchain openai chromadb tiktoken rank_bm25"
38 | ]
39 | },
40 | {
41 | "cell_type": "code",
42 | "source": [
43 | "import os\n",
44 | "os.environ['OPENAI_API_KEY'] = \"your valid openai api key\""
45 | ],
46 | "metadata": {
47 | "id": "eVQjKJn-5ZZZ"
48 | },
49 | "execution_count": null,
50 | "outputs": []
51 | },
52 | {
53 | "cell_type": "code",
54 | "source": [
55 | "from langchain.embeddings import OpenAIEmbeddings\n",
56 | "from langchain.retrievers import BM25Retriever\n",
57 | "from langchain.vectorstores import Chroma"
58 | ],
59 | "metadata": {
60 | "id": "7qjgj56XSn5E"
61 | },
62 | "execution_count": null,
63 | "outputs": []
64 | },
65 | {
66 | "cell_type": "code",
67 | "source": [
68 | "from typing import Any, Dict, List\n",
69 | "\n",
70 | "from langchain.callbacks.manager import (\n",
71 | " AsyncCallbackManagerForRetrieverRun,\n",
72 | " CallbackManagerForRetrieverRun,\n",
73 | ")\n",
74 | "from langchain.pydantic_v1 import root_validator\n",
75 | "from langchain.schema import BaseRetriever, Document\n",
76 | "\n",
77 | "\n",
78 | "class EnsembleRetriever(BaseRetriever):\n",
79 | " \"\"\"Retriever that ensembles the multiple retrievers.\n",
80 | "\n",
81 | " It uses a rank fusion.\n",
82 | "\n",
83 | " Args:\n",
84 | " retrievers: A list of retrievers to ensemble.\n",
85 | " weights: A list of weights corresponding to the retrievers. Defaults to equal\n",
86 | " weighting for all retrievers.\n",
87 | " c: A constant added to the rank, controlling the balance between the importance\n",
88 | " of high-ranked items and the consideration given to lower-ranked items.\n",
89 | " Default is 60.\n",
90 | " \"\"\"\n",
91 | "\n",
92 | " retrievers: List[BaseRetriever]\n",
93 | " weights: List[float]\n",
94 | " c: int = 60\n",
95 | "\n",
96 | " @root_validator(pre=True)\n",
97 | " def set_weights(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n",
98 | " if not values.get(\"weights\"):\n",
99 | " n_retrievers = len(values[\"retrievers\"])\n",
100 | " values[\"weights\"] = [1 / n_retrievers] * n_retrievers\n",
101 | " return values\n",
102 | "\n",
103 | " def _get_relevant_documents(\n",
104 | " self,\n",
105 | " query: str,\n",
106 | " *,\n",
107 | " run_manager: CallbackManagerForRetrieverRun,\n",
108 | " ) -> List[Document]:\n",
109 | " \"\"\"\n",
110 | " Get the relevant documents for a given query.\n",
111 | "\n",
112 | " Args:\n",
113 | " query: The query to search for.\n",
114 | "\n",
115 | " Returns:\n",
116 | " A list of reranked documents.\n",
117 | " \"\"\"\n",
118 | "\n",
119 | " # Get fused result of the retrievers.\n",
120 | " fused_documents = self.rank_fusion(query, run_manager)\n",
121 | "\n",
122 | " return fused_documents\n",
123 | "\n",
124 | " async def _aget_relevant_documents(\n",
125 | " self,\n",
126 | " query: str,\n",
127 | " *,\n",
128 | " run_manager: AsyncCallbackManagerForRetrieverRun,\n",
129 | " ) -> List[Document]:\n",
130 | " \"\"\"\n",
131 | " Asynchronously get the relevant documents for a given query.\n",
132 | "\n",
133 | " Args:\n",
134 | " query: The query to search for.\n",
135 | "\n",
136 | " Returns:\n",
137 | " A list of reranked documents.\n",
138 | " \"\"\"\n",
139 | "\n",
140 | " # Get fused result of the retrievers.\n",
141 | " fused_documents = await self.arank_fusion(query, run_manager)\n",
142 | "\n",
143 | " return fused_documents\n",
144 | "\n",
145 | " def rank_fusion(\n",
146 | " self, query: str, run_manager: CallbackManagerForRetrieverRun\n",
147 | " ) -> List[Document]:\n",
148 | " \"\"\"\n",
149 | " Retrieve the results of the retrievers and use rank_fusion_func to get\n",
150 | " the final result.\n",
151 | "\n",
152 | " Args:\n",
153 | " query: The query to search for.\n",
154 | "\n",
155 | " Returns:\n",
156 | " A list of reranked documents.\n",
157 | " \"\"\"\n",
158 | "\n",
159 | " # Get the results of all retrievers.\n",
160 | " retriever_docs = [\n",
161 | " retriever.get_relevant_documents(\n",
162 | " query, callbacks=run_manager.get_child(tag=f\"retriever_{i+1}\")\n",
163 | " )\n",
164 | " for i, retriever in enumerate(self.retrievers)\n",
165 | " ]\n",
166 | "\n",
167 | " # apply rank fusion\n",
168 | " fused_documents = self.weighted_reciprocal_rank(retriever_docs)\n",
169 | "\n",
170 | " return fused_documents\n",
171 | "\n",
172 | " async def arank_fusion(\n",
173 | " self, query: str, run_manager: AsyncCallbackManagerForRetrieverRun\n",
174 | " ) -> List[Document]:\n",
175 | " \"\"\"\n",
176 | " Asynchronously retrieve the results of the retrievers\n",
177 | " and use rank_fusion_func to get the final result.\n",
178 | "\n",
179 | " Args:\n",
180 | " query: The query to search for.\n",
181 | "\n",
182 | " Returns:\n",
183 | " A list of reranked documents.\n",
184 | " \"\"\"\n",
185 | "\n",
186 | " # Get the results of all retrievers.\n",
187 | " retriever_docs = [\n",
188 | " await retriever.aget_relevant_documents(\n",
189 | " query, callbacks=run_manager.get_child(tag=f\"retriever_{i+1}\")\n",
190 | " )\n",
191 | " for i, retriever in enumerate(self.retrievers)\n",
192 | " ]\n",
193 | "\n",
194 | " # apply rank fusion\n",
195 | " fused_documents = self.weighted_reciprocal_rank(retriever_docs)\n",
196 | "\n",
197 | " return fused_documents\n",
198 | "\n",
199 | " def weighted_reciprocal_rank(\n",
200 | " self, doc_lists: List[List[Document]]\n",
201 | " ) -> List[Document]:\n",
202 | " \"\"\"\n",
203 | " Perform weighted Reciprocal Rank Fusion on multiple rank lists.\n",
204 | " You can find more details about RRF here:\n",
205 | " https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf\n",
206 | "\n",
207 | " Args:\n",
208 | " doc_lists: A list of rank lists, where each rank list contains unique items.\n",
209 | "\n",
210 | " Returns:\n",
211 | " list: The final aggregated list of items sorted by their weighted RRF\n",
212 | " scores in descending order.\n",
213 | " \"\"\"\n",
214 | " if len(doc_lists) != len(self.weights):\n",
215 | " raise ValueError(\n",
216 | " \"Number of rank lists must be equal to the number of weights.\"\n",
217 | " )\n",
218 | "\n",
219 | " # Create a union of all unique documents in the input doc_lists\n",
220 | " all_documents = set()\n",
221 | " for doc_list in doc_lists:\n",
222 | " for doc in doc_list:\n",
223 | " all_documents.add(doc.page_content)\n",
224 | "\n",
225 | " # Initialize the RRF score dictionary for each document\n",
226 | " rrf_score_dic = {doc: 0.0 for doc in all_documents}\n",
227 | "\n",
228 | " # Calculate RRF scores for each document\n",
229 | " for doc_list, weight in zip(doc_lists, self.weights):\n",
230 | " for rank, doc in enumerate(doc_list, start=1):\n",
231 | " rrf_score = weight * (1 / (rank + self.c))\n",
232 | " rrf_score_dic[doc.page_content] += rrf_score\n",
233 | "\n",
234 | " for key, value in rrf_score_dic.items():\n",
235 | " print(f'Key: {key}, Value: {value}')\n",
236 | "\n",
237 | " # Sort documents by their RRF scores in descending order\n",
238 | " sorted_documents = sorted(\n",
239 | " rrf_score_dic.keys(), key=lambda x: rrf_score_dic[x], reverse=True\n",
240 | " )\n",
241 | "\n",
242 | " # Map the sorted page_content back to the original document objects\n",
243 | " page_content_to_doc_map = {\n",
244 | " doc.page_content: doc for doc_list in doc_lists for doc in doc_list\n",
245 | " }\n",
246 | " sorted_docs = [\n",
247 | " page_content_to_doc_map[page_content] for page_content in sorted_documents\n",
248 | " ]\n",
249 | "\n",
250 | " return sorted_docs\n"
251 | ],
252 | "metadata": {
253 | "id": "33D_Wq_042Gu"
254 | },
255 | "execution_count": null,
256 | "outputs": []
257 | },
258 | {
259 | "cell_type": "code",
260 | "source": [
261 | "doc_list = [\n",
262 | " \"I like apples\",\n",
263 | " \"I like oranges\",\n",
264 | " \"Apples and oranges are fruits\"\n",
265 | "]\n",
266 | "\n",
267 | "# initialize the bm25 retriever and Chromadb retriever\n",
268 | "bm25_retriever = BM25Retriever.from_texts(doc_list)\n",
269 | "bm25_retriever.k = 2\n",
270 | "\n",
271 | "docs = bm25_retriever.get_relevant_documents('apple')\n",
272 | "print(docs)\n",
273 | "\n",
274 | "embedding = OpenAIEmbeddings()\n",
275 | "vectorstore = Chroma.from_texts(doc_list, embedding, collection_name=\"tutorial_2023\")\n",
276 | "vs_retriever = vectorstore.as_retriever(search_kwargs={\"k\": 2})\n",
277 | "\n",
278 | "vs_docs = vs_retriever.get_relevant_documents('apple')\n",
279 | "print(vs_docs)\n",
280 | "\n",
281 | "# initialize the ensemble retriever\n",
282 | "ensemble_retriever = EnsembleRetriever(retrievers=[bm25_retriever, vs_retriever], weights=[0.5, 0.5])"
283 | ],
284 | "metadata": {
285 | "colab": {
286 | "base_uri": "https://localhost:8080/"
287 | },
288 | "id": "lafBN9JvSqZi",
289 | "outputId": "12d16d4b-7cc2-4986-af57-bf274a8a0271"
290 | },
291 | "execution_count": null,
292 | "outputs": [
293 | {
294 | "output_type": "stream",
295 | "name": "stdout",
296 | "text": [
297 | "[Document(page_content='Apples and oranges are fruits'), Document(page_content='I like oranges')]\n",
298 | "[Document(page_content='I like apples'), Document(page_content='Apples and oranges are fruits')]\n"
299 | ]
300 | }
301 | ]
302 | },
303 | {
304 | "cell_type": "code",
305 | "source": [
306 | "docs = ensemble_retriever.get_relevant_documents(\"apple\")\n",
307 | "docs"
308 | ],
309 | "metadata": {
310 | "colab": {
311 | "base_uri": "https://localhost:8080/"
312 | },
313 | "id": "rmftLuuJStBg",
314 | "outputId": "d91be41d-9f90-4553-8a61-c1267ba06b03"
315 | },
316 | "execution_count": null,
317 | "outputs": [
318 | {
319 | "output_type": "stream",
320 | "name": "stdout",
321 | "text": [
322 | "Key: Apples and oranges are fruits, Value: 0.01626123744050767\n",
323 | "Key: I like oranges, Value: 0.008064516129032258\n",
324 | "Key: I like apples, Value: 0.00819672131147541\n"
325 | ]
326 | },
327 | {
328 | "output_type": "execute_result",
329 | "data": {
330 | "text/plain": [
331 | "[Document(page_content='Apples and oranges are fruits'),\n",
332 | " Document(page_content='I like apples'),\n",
333 | " Document(page_content='I like oranges')]"
334 | ]
335 | },
336 | "metadata": {},
337 | "execution_count": 34
338 | }
339 | ]
340 | },
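341 | {
342 | "cell_type": "markdown",
343 | "source": [
344 | "With weights `[0.5, 0.5]` and `c = 60`, the scores printed above can be verified by hand: *\"Apples and oranges are fruits\"* is ranked 1st by BM25 and 2nd by the vector retriever, so its fused score is `0.5/(1+60) + 0.5/(2+60) ≈ 0.00820 + 0.00806 ≈ 0.01626`, the top value. The other two documents each appear in only one list and receive a single term: `0.5/61 ≈ 0.00820` for *\"I like apples\"* and `0.5/62 ≈ 0.00806` for *\"I like oranges\"*."
345 | ],
346 | "metadata": {}
347 | }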
348 | ]
349 | }
--------------------------------------------------------------------------------
/Retrievers/04_MultiVector_Retriever/README.md:
--------------------------------------------------------------------------------
1 | # 04 MultiVector Retriever
2 |
3 | To improve retrieval effectiveness, in practice we can store multiple vectors per document. This approach has proven effective across many use cases.
4 |
5 | LangChain provides a retriever component, `MultiVectorRetriever`, which supports this mechanism. It can be implemented with the following methods (a code sketch of the first one appears at the end of this README):
6 |
7 | - Smaller Chunks
8 |
9 | Split a document into smaller chunks, and embed them.
10 |
11 | - Summary
12 |
13 | Create a summary for each document, embed that along with (or instead of) the document.
14 |
15 | - Hypothetical questions
16 |
17 | Create hypothetical questions that each document would be appropriate to answer, embed those along with (or instead of) the document.
18 |
19 | ## Methods in Brief
20 |
21 | ### Smaller Chunks
22 |
23 | 
24 |
25 | ### Summary
26 |
27 | 
28 |
29 | ### Hypothetical questions
30 |
31 | 
32 |
33 | ## Examples
34 |
35 | [Python Notebook](./04_MultiVector_Retriever.ipynb)
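36 |
37 | ## Quick Sketch: Smaller Chunks
38 |
39 | Below is a minimal sketch of the Smaller Chunks method, assuming the classic `langchain` API used in this repository's notebooks; the collection name, chunk size, and sample document are placeholders.
40 |
41 | ```python
42 | import uuid
43 |
44 | from langchain.embeddings import OpenAIEmbeddings
45 | from langchain.retrievers.multi_vector import MultiVectorRetriever
46 | from langchain.schema import Document
47 | from langchain.storage import InMemoryStore
48 | from langchain.text_splitter import RecursiveCharacterTextSplitter
49 | from langchain.vectorstores import Chroma
50 |
51 | # The vectorstore indexes the small chunks; the docstore holds the full parent documents.
52 | vectorstore = Chroma(collection_name="parent_documents", embedding_function=OpenAIEmbeddings())
53 | retriever = MultiVectorRetriever(
54 |     vectorstore=vectorstore,
55 |     docstore=InMemoryStore(),
56 |     id_key="doc_id",
57 | )
58 |
59 | docs = [Document(page_content="A long document ...")]  # your parent documents
60 | doc_ids = [str(uuid.uuid4()) for _ in docs]
61 |
62 | # Split each parent document into small chunks, tagging each chunk with its parent's id.
63 | child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)
64 | sub_docs = []
65 | for doc_id, doc in zip(doc_ids, docs):
66 |     for chunk in child_splitter.split_documents([doc]):
67 |         chunk.metadata["doc_id"] = doc_id
68 |         sub_docs.append(chunk)
69 |
70 | # Similarity search runs over the small chunks, but the retriever returns the parents.
71 | vectorstore.add_documents(sub_docs)
72 | retriever.docstore.mset(list(zip(doc_ids, docs)))
73 |
74 | relevant = retriever.get_relevant_documents("your question")
75 | ```
76 |
77 | The Summary and Hypothetical Questions methods follow the same pattern; only what gets embedded into the vectorstore changes, while the docstore still returns the original documents.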
--------------------------------------------------------------------------------
/Retrievers/04_MultiVector_Retriever/hypothetical_questions.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Retrievers/04_MultiVector_Retriever/hypothetical_questions.png
--------------------------------------------------------------------------------
/Retrievers/04_MultiVector_Retriever/smaller_chunks.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Retrievers/04_MultiVector_Retriever/smaller_chunks.png
--------------------------------------------------------------------------------
/Retrievers/04_MultiVector_Retriever/summary.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Retrievers/04_MultiVector_Retriever/summary.png
--------------------------------------------------------------------------------
/Retrievers/05_SelfQuerying_Retriever/05_Self_Querying_Retriever.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyP5ncXXfWeUVcVSos1Kg4fy",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | }
17 | },
18 | "cells": [
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {
22 | "id": "view-in-github",
23 | "colab_type": "text"
24 | },
25 | "source": [
26 | "
"
27 | ]
28 | },
29 | {
30 | "cell_type": "markdown",
31 | "source": [
32 | "# 05 Self-Querying Retriever\n",
33 | "\n",
34 | "A self-querying retriever, as the name implies, possesses the capability to initiate queries to itself.\n",
35 | "\n",
36 | "When presented with a natural language query, this retriever employs a query-constructing LLM chain to create a structured query. It then utilizes this structured query to interact with its VectorStore, enabling it to not just assess semantic similarity between the user-input query and stored documents but also to discern and execute filters based on user queries related to document metadata.\n",
37 | "\n"
38 | ],
39 | "metadata": {
40 | "id": "cUbNOf6aAD2w"
41 | }
42 | },
43 | {
44 | "cell_type": "markdown",
45 | "source": [
46 | "## Key Takeaway\n",
47 | "\n",
48 | "1. Self querying is done on the metadata of the documents\n",
49 | "2. Query-constructing LLM chain is used to generate the query parameters and translate it to underlying vector store specific query (structured)."
50 | ],
51 | "metadata": {
52 | "id": "mG_zHbCgiFMD"
53 | }
54 | },
55 | {
56 | "cell_type": "markdown",
57 | "source": [
58 | "## Example"
59 | ],
60 | "metadata": {
61 | "id": "ZCMb8QtOBawT"
62 | }
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 1,
67 | "metadata": {
68 | "id": "xPWZLTgY5E_4",
69 | "colab": {
70 | "base_uri": "https://localhost:8080/"
71 | },
72 | "outputId": "7e3fb26c-7eaa-4f9a-af72-1bc5da7f6d2b"
73 | },
74 | "outputs": [
75 | {
76 | "output_type": "stream",
77 | "name": "stdout",
78 | "text": [
79 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.9/1.9 MB\u001b[0m \u001b[31m19.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
80 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.0/77.0 kB\u001b[0m \u001b[31m9.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
81 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m448.1/448.1 kB\u001b[0m \u001b[31m40.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
82 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m67.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
83 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m108.9/108.9 kB\u001b[0m \u001b[31m12.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
84 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m40.1/40.1 kB\u001b[0m \u001b[31m4.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
85 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.4/2.4 MB\u001b[0m \u001b[31m86.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
86 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m66.3/66.3 kB\u001b[0m \u001b[31m8.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
87 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m59.5/59.5 kB\u001b[0m \u001b[31m5.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
88 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.4/5.4 MB\u001b[0m \u001b[31m99.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
89 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m6.2/6.2 MB\u001b[0m \u001b[31m101.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
90 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.8/3.8 MB\u001b[0m \u001b[31m104.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
91 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.3/67.3 kB\u001b[0m \u001b[31m9.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
92 | "\u001b[?25h Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
93 | " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
94 | " Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
95 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m593.7/593.7 kB\u001b[0m \u001b[31m49.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
96 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.4/49.4 kB\u001b[0m \u001b[31m5.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
97 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.0/67.0 kB\u001b[0m \u001b[31m7.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
98 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m46.0/46.0 kB\u001b[0m \u001b[31m4.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
99 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m295.0/295.0 kB\u001b[0m \u001b[31m32.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
100 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m7.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
101 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m341.4/341.4 kB\u001b[0m \u001b[31m33.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
102 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.2/4.2 MB\u001b[0m \u001b[31m107.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
103 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m79.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
104 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m129.9/129.9 kB\u001b[0m \u001b[31m14.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
105 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m86.8/86.8 kB\u001b[0m \u001b[31m10.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
106 | "\u001b[?25h Building wheel for pypika (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n"
107 | ]
108 | }
109 | ],
110 | "source": [
111 | "!pip install -q -U langchain openai chromadb tiktoken lark"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "source": [
117 | "import os\n",
118 | "os.environ['OPENAI_API_KEY'] = \"your valid openai api key\""
119 | ],
120 | "metadata": {
121 | "id": "eVQjKJn-5ZZZ"
122 | },
123 | "execution_count": 2,
124 | "outputs": []
125 | },
126 | {
127 | "cell_type": "code",
128 | "source": [
129 | "from typing import Collection\n",
130 | "from langchain.llms import OpenAI\n",
131 | "from langchain.schema import Document\n",
132 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
133 | "from langchain.vectorstores import Chroma\n",
134 | "from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
135 | "from langchain.chains.query_constructor.base import AttributeInfo\n",
136 | "\n",
137 | "docs = [\n",
138 | " Document(page_content=\"A bunch of scientists bring back dinosaurs and mayhem breaks loose\", metadata={\"year\": 1993, \"rating\": 7.7, \"genre\": \"action\"}),\n",
139 | " Document(page_content=\"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...\", metadata={\"year\": 2010, \"director\": \"Christopher Nolan\", \"rating\": 8.2}),\n",
140 | " Document(page_content=\"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea\", metadata={\"year\": 2006, \"director\": \"Satoshi Kon\", \"rating\": 8.6}),\n",
141 | " Document(page_content=\"A bunch of normal-sized women are supremely wholesome and some men pine after them\", metadata={\"year\": 2019, \"director\": \"Greta Gerwig\", \"rating\": 8.3}),\n",
142 | " Document(page_content=\"Toys come alive and have a blast doing so\", metadata={\"year\": 1995, \"genre\": \"animated\"}),\n",
143 | " Document(page_content=\"Three men walk into the Zone, three men walk out of the Zone\", metadata={\"year\": 1979, \"rating\": 9.9, \"director\": \"Andrei Tarkovsky\", \"genre\": \"thriller\", \"rating\": 9.9})\n",
144 | "]\n",
145 | "vectorstore = Chroma.from_documents(\n",
146 | " docs, OpenAIEmbeddings(), collection_name=\"self_querying\"\n",
147 | ")\n",
148 | "\n",
149 | "metadata_field_info=[\n",
150 | " AttributeInfo(\n",
151 | " name=\"genre\",\n",
152 | " description=\"The genre of the movie\",\n",
153 | " type=\"string\",\n",
154 | " ),\n",
155 | " AttributeInfo(\n",
156 | " name=\"year\",\n",
157 | " description=\"The year the movie was released\",\n",
158 | " type=\"integer\",\n",
159 | " ),\n",
160 | " AttributeInfo(\n",
161 | " name=\"director\",\n",
162 | " description=\"The name of the movie director\",\n",
163 | " type=\"string\",\n",
164 | " ),\n",
165 | " AttributeInfo(\n",
166 | " name=\"rating\",\n",
167 | " description=\"A 1-10 rating for the movie\",\n",
168 | " type=\"float\"\n",
169 | " ),\n",
170 | "]"
171 | ],
172 | "metadata": {
173 | "id": "7qjgj56XSn5E"
174 | },
175 | "execution_count": 3,
176 | "outputs": []
177 | },
178 | {
179 | "cell_type": "code",
180 | "source": [
181 | "document_content_description = \"Brief summary of a movie\"\n",
182 | "llm = OpenAI(temperature=0)\n",
183 | "retriever = SelfQueryRetriever.from_llm(llm, vectorstore, document_content_description, metadata_field_info, verbose=True)"
184 | ],
185 | "metadata": {
186 | "id": "S5NvjPdlhYDf"
187 | },
188 | "execution_count": 28,
189 | "outputs": []
190 | },
191 | {
192 | "cell_type": "markdown",
193 | "source": [
194 | "### Ask with a regular query"
195 | ],
196 | "metadata": {
197 | "id": "GDnavYa8hbY6"
198 | }
199 | },
200 | {
201 | "cell_type": "code",
202 | "source": [
203 | "retriever.get_relevant_documents(\"What are some movies about dinosaurs\")"
204 | ],
205 | "metadata": {
206 | "colab": {
207 | "base_uri": "https://localhost:8080/"
208 | },
209 | "id": "JcR0ioK_hauq",
210 | "outputId": "0e3b32bc-a82d-4471-9567-33aefce92a12"
211 | },
212 | "execution_count": 29,
213 | "outputs": [
214 | {
215 | "output_type": "execute_result",
216 | "data": {
217 | "text/plain": [
218 | "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'genre': 'action', 'rating': 7.7, 'year': 1993}),\n",
219 | " Document(page_content='Toys come alive and have a blast doing so', metadata={'genre': 'animated', 'year': 1995}),\n",
220 | " Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'director': 'Andrei Tarkovsky', 'genre': 'thriller', 'rating': 9.9, 'year': 1979}),\n",
221 | " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'director': 'Satoshi Kon', 'rating': 8.6, 'year': 2006})]"
222 | ]
223 | },
224 | "metadata": {},
225 | "execution_count": 29
226 | }
227 | ]
228 | },
229 | {
230 | "cell_type": "markdown",
231 | "source": [
232 | "### Ask with a filter"
233 | ],
234 | "metadata": {
235 | "id": "TcceqQVIhkSM"
236 | }
237 | },
238 | {
239 | "cell_type": "code",
240 | "source": [
241 | "retriever.get_relevant_documents(\"I want to watch a movie rated lower than 8\")"
242 | ],
243 | "metadata": {
244 | "colab": {
245 | "base_uri": "https://localhost:8080/"
246 | },
247 | "id": "Hb2lCXmfhnYh",
248 | "outputId": "9bc14370-6629-48bf-df70-4c9982bc81af"
249 | },
250 | "execution_count": 30,
251 | "outputs": [
252 | {
253 | "output_type": "execute_result",
254 | "data": {
255 | "text/plain": [
256 | "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'genre': 'action', 'rating': 7.7, 'year': 1993})]"
257 | ]
258 | },
259 | "metadata": {},
260 | "execution_count": 30
261 | }
262 | ]
263 | },
264 | {
265 | "cell_type": "markdown",
266 | "source": [
267 | "### Ask with a query containing a filter"
268 | ],
269 | "metadata": {
270 | "id": "zpoXP2C_htVy"
271 | }
272 | },
273 | {
274 | "cell_type": "code",
275 | "source": [
276 | "retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")"
277 | ],
278 | "metadata": {
279 | "colab": {
280 | "base_uri": "https://localhost:8080/"
281 | },
282 | "id": "uDh-KYGXhxIq",
283 | "outputId": "1b8dd19f-2cd5-4307-f6b6-8b958274d91b"
284 | },
285 | "execution_count": 31,
286 | "outputs": [
287 | {
288 | "output_type": "execute_result",
289 | "data": {
290 | "text/plain": [
291 | "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'director': 'Greta Gerwig', 'rating': 8.3, 'year': 2019})]"
292 | ]
293 | },
294 | "metadata": {},
295 | "execution_count": 31
296 | }
297 | ]
298 | },
299 | {
300 | "cell_type": "markdown",
301 | "source": [
302 | "### Ask with a composite filter"
303 | ],
304 | "metadata": {
305 | "id": "XeKaxVubh2yb"
306 | }
307 | },
308 | {
309 | "cell_type": "code",
310 | "source": [
311 | "retriever.get_relevant_documents(\"What's a highly rated (above 8.5) triller film?\")"
312 | ],
313 | "metadata": {
314 | "colab": {
315 | "base_uri": "https://localhost:8080/"
316 | },
317 | "id": "JGNgDZ8vh5YZ",
318 | "outputId": "e9ff4793-4f93-4dcb-a2f5-212fa1908229"
319 | },
320 | "execution_count": 9,
321 | "outputs": [
322 | {
323 | "output_type": "execute_result",
324 | "data": {
325 | "text/plain": [
326 | "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'director': 'Andrei Tarkovsky', 'genre': 'thriller', 'rating': 9.9, 'year': 1979})]"
327 | ]
328 | },
329 | "metadata": {},
330 | "execution_count": 9
331 | }
332 | ]
333 | },
334 | {
335 | "cell_type": "markdown",
336 | "source": [
337 | "### Ask with a query and composite filter"
338 | ],
339 | "metadata": {
340 | "id": "vSahhaO-h8q4"
341 | }
342 | },
343 | {
344 | "cell_type": "code",
345 | "source": [
346 | "retriever.get_relevant_documents(\"What's an animated movie that's all about toys and after 1990\")"
347 | ],
348 | "metadata": {
349 | "colab": {
350 | "base_uri": "https://localhost:8080/"
351 | },
352 | "id": "cMlp7yHBh_g7",
353 | "outputId": "b4efa47c-d3aa-4022-fb0d-eedae012da1a"
354 | },
355 | "execution_count": 32,
356 | "outputs": [
357 | {
358 | "output_type": "execute_result",
359 | "data": {
360 | "text/plain": [
361 | "[Document(page_content='Toys come alive and have a blast doing so', metadata={'genre': 'animated', 'year': 1995})]"
362 | ]
363 | },
364 | "metadata": {},
365 | "execution_count": 32
366 | }
367 | ]
368 | },
369 | {
370 | "cell_type": "code",
371 | "source": [
372 | "retriever.query_constructor.invoke({\"query\": \"What's an animated movie that's all about toys and after 1990\"})"
373 | ],
374 | "metadata": {
375 | "colab": {
376 | "base_uri": "https://localhost:8080/"
377 | },
378 | "id": "--RrtZcC-i9l",
379 | "outputId": "0a73ffc6-10dc-41f1-a489-809ae9cf823f"
380 | },
381 | "execution_count": 33,
382 | "outputs": [
383 | {
384 | "output_type": "execute_result",
385 | "data": {
386 | "text/plain": [
387 | "StructuredQuery(query='toys', filter=Operation(operator=, arguments=[Comparison(comparator=, attribute='genre', value='animated'), Comparison(comparator=, attribute='year', value=1990)]), limit=None)"
388 | ]
389 | },
390 | "metadata": {},
391 | "execution_count": 33
392 | }
393 | ]
394 | },
395 | {
396 | "cell_type": "code",
397 | "source": [
398 | "retriever.query_constructor.invoke({\"query\": \"Show me one movie that's rated higher than 8\"})"
399 | ],
400 | "metadata": {
401 | "id": "jXHAFlk5Dl9t"
402 | },
403 | "execution_count": null,
404 | "outputs": []
405 | },
406 | {
407 | "cell_type": "markdown",
408 | "source": [
409 | "### Enable the limit\n",
410 | "\n",
411 | "The parameter `enable_limit` in `SelfQueryRetriever.from_llm` can be used to enable limit, which can allow developers to specify how many records to retrieve"
412 | ],
413 | "metadata": {
414 | "id": "jwZBE-zWDJ6a"
415 | }
416 | },
417 | {
418 | "cell_type": "code",
419 | "source": [
420 | "retriever = SelfQueryRetriever.from_llm(\n",
421 | " llm,\n",
422 | " vectorstore,\n",
423 | " document_content_description,\n",
424 | " metadata_field_info,\n",
425 | " verbose=True,\n",
426 | " enable_limit=True)"
427 | ],
428 | "metadata": {
429 | "id": "ySeejTuyDeIq"
430 | },
431 | "execution_count": 34,
432 | "outputs": []
433 | },
434 | {
435 | "cell_type": "code",
436 | "source": [
437 | "retriever.query_constructor.invoke({\"query\": \"Show me one movie that's rated higher than 8\"})"
438 | ],
439 | "metadata": {
440 | "colab": {
441 | "base_uri": "https://localhost:8080/"
442 | },
443 | "id": "CF-_7XEfDmzV",
444 | "outputId": "b137af7e-a06c-4af8-9259-6685ae1fdea5"
445 | },
446 | "execution_count": 35,
447 | "outputs": [
448 | {
449 | "output_type": "execute_result",
450 | "data": {
451 | "text/plain": [
452 | "StructuredQuery(query=' ', filter=Comparison(comparator=, attribute='rating', value=8), limit=1)"
453 | ]
454 | },
455 | "metadata": {},
456 | "execution_count": 35
457 | }
458 | ]
459 | },
460 | {
461 | "cell_type": "code",
462 | "source": [
463 | "retriever.get_relevant_documents(\"Show me one movie that's rated higher than 8\")"
464 | ],
465 | "metadata": {
466 | "colab": {
467 | "base_uri": "https://localhost:8080/"
468 | },
469 | "id": "95VI28DVMZat",
470 | "outputId": "2f12d2b6-ea0c-4fa9-cbed-59a43004cfab"
471 | },
472 | "execution_count": 36,
473 | "outputs": [
474 | {
475 | "output_type": "execute_result",
476 | "data": {
477 | "text/plain": [
478 | "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'director': 'Greta Gerwig', 'rating': 8.3, 'year': 2019})]"
479 | ]
480 | },
481 | "metadata": {},
482 | "execution_count": 36
483 | }
484 | ]
485 | }
486 | ]
487 | }
--------------------------------------------------------------------------------
/Retrievers/05_SelfQuerying_Retriever/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Retrievers/05_SelfQuerying_Retriever/README.md
--------------------------------------------------------------------------------
/Retrievers/06_Web_Research_Retriever/06_Web_Research_Retriever.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyOo/yJhRQYcANIK1Z8U9k10",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | }
17 | },
18 | "cells": [
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {
22 | "id": "view-in-github",
23 | "colab_type": "text"
24 | },
25 | "source": [
26 | "
"
27 | ]
28 | },
29 | {
30 | "cell_type": "markdown",
31 | "source": [
32 | "# 06 Web Research Retriever\n"
33 | ],
34 | "metadata": {
35 | "id": "cUbNOf6aAD2w"
36 | }
37 | },
38 | {
39 | "cell_type": "markdown",
40 | "source": [
41 | "## Example"
42 | ],
43 | "metadata": {
44 | "id": "ZCMb8QtOBawT"
45 | }
46 | },
47 | {
48 | "cell_type": "code",
49 | "execution_count": null,
50 | "metadata": {
51 | "id": "xPWZLTgY5E_4"
52 | },
53 | "outputs": [],
54 | "source": [
55 | "!pip install -q -U langchain openai chromadb tiktoken"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "source": [
61 | "import os\n",
62 | "os.environ['OPENAI_API_KEY'] = \"your valid openai api key\""
63 | ],
64 | "metadata": {
65 | "id": "eVQjKJn-5ZZZ"
66 | },
67 | "execution_count": null,
68 | "outputs": []
69 | },
70 | {
71 | "cell_type": "code",
72 | "source": [
73 | "from langchain.retrievers.web_research import WebResearchRetriever"
74 | ],
75 | "metadata": {
76 | "id": "Wui4jqtAOFol"
77 | },
78 | "execution_count": null,
79 | "outputs": []
80 | },
81 | {
82 | "cell_type": "code",
83 | "source": [
84 | "import os\n",
85 | "from langchain.vectorstores import Chroma\n",
86 | "from langchain.embeddings import OpenAIEmbeddings\n",
87 | "from langchain.chat_models.openai import ChatOpenAI\n",
88 | "from langchain.utilities import GoogleSearchAPIWrapper\n",
89 | "\n",
90 | "# Vectorstore\n",
91 | "vectorstore = Chroma(embedding_function=OpenAIEmbeddings(),persist_directory=\"./chroma_db_oai\")\n",
92 | "\n",
93 | "# LLM\n",
94 | "llm = ChatOpenAI(temperature=0)\n",
95 | "\n",
96 | "# Request from https://programmablesearchengine.google.com/controlpanel/all\n",
97 | "os.environ[\"GOOGLE_CSE_ID\"] = \"yor cse id\"\n",
98 | "# Request from https://developers.google.com/custom-search/v1/introduction\n",
99 | "os.environ[\"GOOGLE_API_KEY\"] = \"your google api key\"\n",
100 | "search = GoogleSearchAPIWrapper()"
101 | ],
102 | "metadata": {
103 | "id": "UNMFu8SvOOW-"
104 | },
105 | "execution_count": null,
106 | "outputs": []
107 | },
108 | {
109 | "cell_type": "code",
110 | "source": [
111 | "search.run(\"What is vitamin?\")"
112 | ],
113 | "metadata": {
114 | "colab": {
115 | "base_uri": "https://localhost:8080/",
116 | "height": 128
117 | },
118 | "id": "6_eSDCxoOgpu",
119 | "outputId": "f499890c-9138-4da0-c1f9-e81b00e4336b"
120 | },
121 | "execution_count": 18,
122 | "outputs": [
123 | {
124 | "output_type": "execute_result",
125 | "data": {
126 | "text/plain": [
127 | "\"Dec 16, 2020 ... A vitamin is an organic compound, which means that it contains carbon. It is also an essential nutrient that the body may need to get from food. Sep 18, 2023 ... Total vitamin D intakes were three times higher with supplement use than with diet alone; the mean intake from foods and beverages alone for\\xa0... The two main forms of vitamin A in the human diet are preformed vitamin A (retinol, retinyl esters), and provitamin A carotenoids such as alpha-carotene and\\xa0... Nov 8, 2022 ... Vitamin D helps maintain strong bones. Learn how much you need, good sources, deficiency symptoms, and health effects here. Jan 19, 2023 ... Vitamins are a group of substances that are needed for normal cell function, growth, and development. There are 13 essential vitamins. This\\xa0... Aug 12, 2022 ... Vitamin A also helps your heart, lungs, and other organs work properly. Carotenoids are pigments that give yellow, orange, and red fruits and\\xa0... Aug 2, 2022 ... Vitamin D deficiency means that you don't have enough vitamin D in your body. It's common and primarily causes issues with your bones and\\xa0... Vitamin A, along with other vitamins, minerals and other compounds, is an essential micronutrient. This means that our bodies cannot manufacture it and\\xa0... Vitamin D is both a nutrient we eat and a hormone our bodies make. It is a fat-soluble vitamin that has long been known to help the body absorb and retain\\xa0... If you take vitamin A for its antioxidant properties, keep in mind that the supplement might not offer the same benefits as naturally occurring antioxidants in\\xa0...\""
128 | ],
129 | "application/vnd.google.colaboratory.intrinsic+json": {
130 | "type": "string"
131 | }
132 | },
133 | "metadata": {},
134 | "execution_count": 18
135 | }
136 | ]
137 | },
138 | {
139 | "cell_type": "code",
140 | "source": [
141 | "web_research_retriever = WebResearchRetriever.from_llm(\n",
142 | " vectorstore=vectorstore,\n",
143 | " llm=llm,\n",
144 | " search=search,\n",
145 | ")"
146 | ],
147 | "metadata": {
148 | "id": "jeT8hws8Op6V"
149 | },
150 | "execution_count": null,
151 | "outputs": []
152 | },
153 | {
154 | "cell_type": "code",
155 | "source": [
156 | "!pip install -q -U html2text"
157 | ],
158 | "metadata": {
159 | "id": "EAUtzZ8bOxyr"
160 | },
161 | "execution_count": null,
162 | "outputs": []
163 | },
164 | {
165 | "cell_type": "code",
166 | "source": [
167 | "from langchain.chains import RetrievalQAWithSourcesChain\n",
168 | "user_input = \"Who is the winner of FIFA world cup 2002?\"\n",
169 | "qa_chain = RetrievalQAWithSourcesChain.from_chain_type(llm,retriever=web_research_retriever)\n",
170 | "result = qa_chain({\"question\": user_input})\n",
171 | "result"
172 | ],
173 | "metadata": {
174 | "colab": {
175 | "base_uri": "https://localhost:8080/"
176 | },
177 | "id": "BbuktT3-OsRu",
178 | "outputId": "8242c64e-6ca2-45f3-a3a0-aa58bbcd249c"
179 | },
180 | "execution_count": 19,
181 | "outputs": [
182 | {
183 | "output_type": "stream",
184 | "name": "stderr",
185 | "text": [
186 | "INFO:langchain.retrievers.web_research:Generating questions for Google Search ...\n",
187 | "INFO:langchain.retrievers.web_research:Questions for Google Search (raw): {'question': 'Who is the winner of FIFA world cup 2002?', 'text': LineList(lines=['1. \"Which country won the FIFA World Cup in 2002?\"\\n', '2. \"Who was the champion of the FIFA World Cup held in 2002?\"\\n', '3. \"Who emerged as the victor in the FIFA World Cup 2002?\"'])}\n",
188 | "INFO:langchain.retrievers.web_research:Questions for Google Search: ['1. \"Which country won the FIFA World Cup in 2002?\"\\n', '2. \"Who was the champion of the FIFA World Cup held in 2002?\"\\n', '3. \"Who emerged as the victor in the FIFA World Cup 2002?\"']\n",
189 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
190 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
191 | "INFO:langchain.retrievers.web_research:Search results: [{'title': '2002 FIFA World Cup - Wikipedia', 'link': 'https://en.wikipedia.org/wiki/2002_FIFA_World_Cup', 'snippet': 'However, the most potent team at the tournament, Brazil, prevailed, winning the final against Germany 2–0, making them the first and only country to have won\\xa0...'}]\n",
192 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
193 | "INFO:langchain.retrievers.web_research:Search results: [{'title': '2002 FIFA World Cup - Wikipedia', 'link': 'https://en.wikipedia.org/wiki/2002_FIFA_World_Cup', 'snippet': \"... FIFA World Cup, the quadrennial football world championship for men's national teams organized by FIFA. It was held from 31 May to 30 June 2002 at sites in\\xa0...\"}]\n",
194 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
195 | "INFO:langchain.retrievers.web_research:Search results: [{'title': 'FIFA World Cup Host Countries 2023', 'link': 'https://worldpopulationreview.com/country-rankings/fifa-world-cup-host-countries', 'snippet': 'France emerged as the final victor of the 2018 FIFA World Cup, which took place in Russia. ... 2002: Japan/South Korea; 1998: France (2nd time); 1994: United\\xa0...'}]\n",
196 | "INFO:langchain.retrievers.web_research:New URLs to load: []\n",
197 | "INFO:langchain.retrievers.web_research:Grabbing most relevant splits from urls...\n"
198 | ]
199 | },
200 | {
201 | "output_type": "execute_result",
202 | "data": {
203 | "text/plain": [
204 | "{'question': 'Who is the winner of FIFA world cup 2002?',\n",
205 | " 'answer': 'Brazil is the winner of the FIFA World Cup 2002.\\n',\n",
206 | " 'sources': 'https://en.wikipedia.org/wiki/2002_FIFA_World_Cup'}"
207 | ]
208 | },
209 | "metadata": {},
210 | "execution_count": 19
211 | }
212 | ]
213 | },
214 | {
215 | "cell_type": "code",
216 | "source": [
217 | "import logging\n",
218 | "logging.basicConfig()\n",
219 | "logging.getLogger(\"langchain.retrievers.web_research\").setLevel(logging.INFO)\n",
220 | "user_input = \"What is Task Decomposition in LLM Powered Autonomous Agents?\"\n",
221 | "docs = web_research_retriever.get_relevant_documents(user_input)"
222 | ],
223 | "metadata": {
224 | "colab": {
225 | "base_uri": "https://localhost:8080/"
226 | },
227 | "id": "k8DHpRiMO--N",
228 | "outputId": "2fcb6704-61fd-4c76-8efa-76891293039c"
229 | },
230 | "execution_count": 20,
231 | "outputs": [
232 | {
233 | "output_type": "stream",
234 | "name": "stderr",
235 | "text": [
236 | "INFO:langchain.retrievers.web_research:Generating questions for Google Search ...\n",
237 | "INFO:langchain.retrievers.web_research:Questions for Google Search (raw): {'question': 'What is Task Decomposition in LLM Powered Autonomous Agents?', 'text': LineList(lines=['1. How does task decomposition work in LLM powered autonomous agents?\\n', '2. What is the role of task decomposition in LLM powered autonomous agents?\\n', '3. Can you explain the concept of task decomposition in LLM powered autonomous agents?'])}\n",
238 | "INFO:langchain.retrievers.web_research:Questions for Google Search: ['1. How does task decomposition work in LLM powered autonomous agents?\\n', '2. What is the role of task decomposition in LLM powered autonomous agents?\\n', '3. Can you explain the concept of task decomposition in LLM powered autonomous agents?']\n",
239 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
240 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
241 | "INFO:langchain.retrievers.web_research:Search results: [{'title': \"LLM Powered Autonomous Agents | Lil'Log\", 'link': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'snippet': 'Jun 23, 2023 ... Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\" , \"What are the subgoals for achieving XYZ?\" , (2)\\xa0...'}]\n",
242 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
243 | "INFO:langchain.retrievers.web_research:Search results: [{'title': \"LLM Powered Autonomous Agents | Lil'Log\", 'link': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'snippet': 'Jun 23, 2023 ... Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\" , \"What are the subgoals for achieving XYZ?\" , (2)\\xa0...'}]\n",
244 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
245 | "INFO:langchain.retrievers.web_research:Search results: [{'title': \"LLM Powered Autonomous Agents | Lil'Log\", 'link': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'snippet': 'Jun 23, 2023 ... Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\" , \"What are the subgoals for achieving XYZ?\" , (2)\\xa0...'}]\n",
246 | "INFO:langchain.retrievers.web_research:New URLs to load: []\n",
247 | "INFO:langchain.retrievers.web_research:Grabbing most relevant splits from urls...\n"
248 | ]
249 | }
250 | ]
251 | },
252 | {
253 | "cell_type": "code",
254 | "source": [
255 | "import os\n",
256 | "import re\n",
257 | "from typing import List\n",
258 | "from langchain.chains import LLMChain\n",
259 | "from pydantic import BaseModel, Field\n",
260 | "from langchain.prompts import PromptTemplate\n",
261 | "from langchain.output_parsers.pydantic import PydanticOutputParser\n",
262 | "\n",
263 | "# LLMChain\n",
264 | "search_prompt = PromptTemplate(\n",
265 | " input_variables=[\"question\"],\n",
266 | " template=\"\"\"You are an assistant tasked with improving Google search\n",
267 | " results. Generate 5 Google search queries that are similar to\n",
268 | " this question. The output should be a numbered list of questions and each\n",
269 | " should have a question mark at the end: {question}\"\"\",\n",
270 | ")\n",
271 | "\n",
272 | "class LineList(BaseModel):\n",
273 | " \"\"\"List of questions.\"\"\"\n",
274 | "\n",
275 | " lines: List[str] = Field(description=\"Questions\")\n",
276 | "\n",
277 | "class QuestionListOutputParser(PydanticOutputParser):\n",
278 | " \"\"\"Output parser for a list of numbered questions.\"\"\"\n",
279 | "\n",
280 | " def __init__(self) -> None:\n",
281 | " super().__init__(pydantic_object=LineList)\n",
282 | "\n",
283 | " def parse(self, text: str) -> LineList:\n",
284 | " lines = re.findall(r\"\\d+\\..*?\\n\", text)\n",
285 | " return LineList(lines=lines)\n",
286 | "\n",
287 | "llm_chain = LLMChain(llm=llm, prompt=search_prompt, output_parser=QuestionListOutputParser())"
288 | ],
289 | "metadata": {
290 | "id": "jNfh403xQZ2X"
291 | },
292 | "execution_count": null,
293 | "outputs": []
294 | },
295 | {
296 | "cell_type": "code",
297 | "source": [
298 | "# Initialize\n",
299 | "web_research_retriever_llm_chain = WebResearchRetriever(vectorstore=vectorstore, llm_chain=llm_chain, search=search)\n",
300 | "\n",
301 | "# Run\n",
302 | "docs = web_research_retriever_llm_chain.get_relevant_documents(\"What is the recommended way to recycle plastics?\")"
303 | ],
304 | "metadata": {
305 | "colab": {
306 | "base_uri": "https://localhost:8080/"
307 | },
308 | "id": "MXpFnjFDQjAe",
309 | "outputId": "7e8e2e86-41c0-432b-ccce-d1f4dc1248c3"
310 | },
311 | "execution_count": 21,
312 | "outputs": [
313 | {
314 | "output_type": "stream",
315 | "name": "stderr",
316 | "text": [
317 | "INFO:langchain.retrievers.web_research:Generating questions for Google Search ...\n",
318 | "INFO:langchain.retrievers.web_research:Questions for Google Search (raw): {'question': 'What is the recommended way to recycle plastics?', 'text': LineList(lines=['1. How can I recycle plastics effectively?\\n', '2. What are the best practices for recycling plastics?\\n', '3. Which methods are recommended for recycling plastics?\\n', '4. What is the most efficient way to recycle plastics?\\n'])}\n",
319 | "INFO:langchain.retrievers.web_research:Questions for Google Search: ['1. How can I recycle plastics effectively?\\n', '2. What are the best practices for recycling plastics?\\n', '3. Which methods are recommended for recycling plastics?\\n', '4. What is the most efficient way to recycle plastics?\\n']\n",
320 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
321 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
322 | "INFO:langchain.retrievers.web_research:Search results: [{'title': '7 Tips to Recycle Better - Earth Day', 'link': 'https://www.earthday.org/7-tips-to-recycle-better/', 'snippet': \"Feb 25, 2022 ... Solution: Just as the rule states, make sure your recyclables are clean, empty and dry. It'll take seconds and if everyone did it, it would save\\xa0...\"}]\n",
323 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
324 | "INFO:langchain.retrievers.web_research:Search results: [{'title': 'What Is Recycling & What to Recycle | WM', 'link': 'https://www.wm.com/us/en/recycle-right/recycling-101', 'snippet': 'What Can Be Recycled: Recycling Guide. Learn guidelines and common myths related to recycling. You can also download and print our Curbside Recycling Myths\\xa0...'}]\n",
325 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
326 | "INFO:langchain.retrievers.web_research:Search results: [{'title': 'Plastics recycling: challenges and opportunities - PMC', 'link': 'https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873020/', 'snippet': 'Table 3 provides data on some environmental impacts from production of virgin commodity plastics ... ACRR 2004Good practices guide on waste plastics recycling\\xa0...'}]\n",
327 | "INFO:langchain.retrievers.web_research:Searching for relevant urls...\n",
328 | "INFO:langchain.retrievers.web_research:Search results: [{'title': 'Frequent Questions on Recycling | US EPA', 'link': 'https://www.epa.gov/recycle/frequent-questions-recycling', 'snippet': 'May 29, 2023 ... Recycling just 10 plastic bottles saves enough energy to power a laptop for more than 25 hours. How does recycling save energy? When we make new\\xa0...'}]\n",
329 | "INFO:langchain.retrievers.web_research:New URLs to load: ['https://www.earthday.org/7-tips-to-recycle-better/', 'https://www.wm.com/us/en/recycle-right/recycling-101', 'https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873020/', 'https://www.epa.gov/recycle/frequent-questions-recycling']\n",
330 | "INFO:langchain.retrievers.web_research:Indexing new urls...\n",
331 | "Fetching pages: 100%|##########| 4/4 [00:01<00:00, 3.14it/s]\n",
332 | "INFO:langchain.retrievers.web_research:Grabbing most relevant splits from urls...\n"
333 | ]
334 | }
335 | ]
336 | },
337 | {
338 | "cell_type": "code",
339 | "source": [
340 | "docs"
341 | ],
342 | "metadata": {
343 | "colab": {
344 | "base_uri": "https://localhost:8080/"
345 | },
346 | "id": "cpCR8J5gQs7J",
347 | "outputId": "bb7a014a-35ac-4085-cb20-1f51c5e131b0"
348 | },
349 | "execution_count": 22,
350 | "outputs": [
351 | {
352 | "output_type": "execute_result",
353 | "data": {
354 | "text/plain": [
355 | "[Document(page_content='_Solution: Just as the rule states, make sure your recyclables are clean,\\nempty and dry. It’ll take seconds and if everyone did it, it would save tons\\nof recyclables going to the landfill._\\n\\n## **4\\\\. Combined materials are trash**\\n\\nRecycling only works when like materials are together. Unfortunately, items\\nlike plastic-coated coffee cups, laminated paper and paper-bubble wrap\\nenvelopes from the mail can’t ever be separated, which means they’re trash.\\n\\n_Solution: Try to avoid buying nonrecyclable materials that can’t be\\nseparated. And when you can, shop local to cut down on the carbon footprint of\\nyour products._\\n\\n## **5\\\\. Know your plastics**\\n\\nNot all plastics are treated equally. Rigid plastics are recyclable, labeled\\nby resin codes 1 through 7. Generally, the higher the number, the less\\nrecyclable it is. Most recycling centers will recycle plastics 1 and 2 without\\na problem. Past that, it gets tricky.\\n\\nFurthermore, a lot of plastic just isn’t recyclable curbside. As noted\\nearlier, you can’t recycle plastic bags or films. Additionally, you can’t\\nrecycle anything that can tear like paper. That means no cracker bags, chip\\nbags or cereal bags.\\n\\n“With plastics, it does get so confusing,” says Erin Hafner of Baltimore’s\\nrecycling program. “Clamshell containers, cutlery, plastic straws — all that\\nstuff that ends up in the [recycling] bin.” And it shouldn’t.\\n\\n_Solution: Check your city’s recycling website for the number the city takes._\\n\\n## **6\\\\. Stop wishcycling**', metadata={'source': 'https://www.earthday.org/7-tips-to-recycle-better/'}),\n",
356 | " Document(page_content='Effective recycling of mixed plastics waste is the next major challenge for\\nthe plastics recycling sector. The advantage is the ability to recycle a\\nlarger proportion of the plastic waste stream by expanding post-consumer\\ncollection of plastic packaging to cover a wider variety of materials and pack\\ntypes. Product design for recycling has strong potential to assist in such\\nrecycling efforts. A study carried out in the UK found that the amount of\\npackaging in a regular shopping basket that, even if collected, cannot be\\neffectively recycled, ranged from 21 to 40% (Local Government Association (UK)\\n2007). Hence, wider implementation of policies to promote the use of\\nenvironmental design principles by industry could have a large impact on\\nrecycling performance, increasing the proportion of packaging that can\\neconomically be collected and diverted from landfill (see Shaxson _et al._\\n2009). The same logic applies to durable consumer goods designing for\\ndisassembly, recycling and specifications for use of recycled resins are key\\nactions to increase recycling.', metadata={'source': 'https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873020/'}),\n",
357 | " Document(page_content='Recycling is one of the most important actions currently available to reduce\\nthese impacts and represents one of the most dynamic areas in the plastics\\nindustry today. Recycling provides opportunities to reduce oil usage, carbon\\ndioxide emissions and the quantities of waste requiring disposal. Here, we\\nbriefly set recycling into context against other waste-reduction strategies,\\nnamely reduction in material use through downgauging or product reuse, the use\\nof alternative biodegradable materials and energy recovery as fuel.\\n\\nWhile plastics have been recycled since the 1970s, the quantities that are\\nrecycled vary geographically, according to plastic type and application.\\nRecycling of packaging materials has seen rapid expansion over the last\\ndecades in a number of countries. Advances in technologies and systems for the\\ncollection, sorting and reprocessing of recyclable plastics are creating new\\nopportunities for recycling, and with the combined actions of the public,\\nindustry and governments it may be possible to divert the majority of plastic\\nwaste from landfills to recycling over the next decades.\\n\\n **Keywords:** plastics recycling, plastic packaging, environmental impacts,\\nwaste management, chemical recycling, energy recovery\\n\\n## 1\\\\. Introduction', metadata={'source': 'https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873020/'}),\n",
358 | " Document(page_content='The effectiveness of post-consumer packaging recycling could be dramatically\\nincreased if the diversity of materials were to be rationalized to a subset of\\ncurrent usage. For example, if rigid plastic containers ranging from bottles,\\njars to trays were all PET, HDPE and PP, without clear PVC or PS, which are\\nproblematic to sort from co-mingled recyclables, then all rigid plastic\\npackaging could be collected and sorted to make recycled resins with minimal\\ncross-contamination. The losses of rejected material and the value of the\\nrecycled resins would be enhanced. In addition, labels and adhesive materials\\nshould be selected to maximize recycling performance. Improvements in\\nsorting/separation within recycling plants give further potential for both\\nhigher recycling volumes, and better eco-efficiency by decreasing waste\\nfractions, energy and water use (see §3). The goals should be to maximize both\\nthe volume and quality of recycled resins.\\n\\n## 9\\\\. Conclusions\\n\\nIn summary, recycling is one strategy for end-of-life waste management of\\nplastic products. It makes increasing sense economically as well as\\nenvironmentally and recent trends demonstrate a substantial increase in the\\nrate of recovery and recycling of plastic wastes. These trends are likely to\\ncontinue, but some significant challenges still exist from both technological\\nfactors and from economic or social behaviour issues relating to the\\ncollection of recyclable wastes, and substitution for virgin material.', metadata={'source': 'https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873020/'})]"
359 | ]
360 | },
361 | "metadata": {},
362 | "execution_count": 22
363 | }
364 | ]
365 | }
366 | ]
367 | }
--------------------------------------------------------------------------------
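Pieced together, the happy path of this notebook fits in one short script. The following is a minimal sketch, assuming valid `OPENAI_API_KEY`, `GOOGLE_CSE_ID`, and `GOOGLE_API_KEY` credentials and the same dependencies the notebook installs (`langchain openai chromadb tiktoken html2text`).

```python
# Minimal end-to-end sketch of the web research retrieval flow above.
# Assumes OPENAI_API_KEY, GOOGLE_CSE_ID and GOOGLE_API_KEY are set in the env.
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chat_models.openai import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers.web_research import WebResearchRetriever
from langchain.utilities import GoogleSearchAPIWrapper
from langchain.vectorstores import Chroma

# Persistent vectorstore: page splits indexed for one question are reused later.
vectorstore = Chroma(
    embedding_function=OpenAIEmbeddings(), persist_directory="./chroma_db_oai"
)
llm = ChatOpenAI(temperature=0)  # generates search queries and the final answer
search = GoogleSearchAPIWrapper()

# Search -> load new URLs -> split -> embed -> retrieve the relevant splits.
web_research_retriever = WebResearchRetriever.from_llm(
    vectorstore=vectorstore, llm=llm, search=search
)

qa_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm, retriever=web_research_retriever
)
print(qa_chain({"question": "Who is the winner of FIFA world cup 2002?"}))
```

Note how the retriever logs `New URLs to load: []` once the relevant pages are already indexed in the persisted vectorstore; only unseen URLs trigger fetching and re-indexing.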
/Retrievers/06_Web_Research_Retriever/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Retrievers/06_Web_Research_Retriever/README.md
--------------------------------------------------------------------------------
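One subtlety in the notebook's custom `QuestionListOutputParser` is worth calling out: the regex `\d+\..*?\n` only matches numbered lines that end with a newline, so a final question without a trailing newline is silently dropped. A tiny standalone check (the sample text is illustrative):

```python
# Standalone check of the regex used by QuestionListOutputParser.parse above.
import re

text = (
    "1. How can I recycle plastics effectively?\n"
    "2. What are the best practices for recycling plastics?"  # no trailing newline
)
print(re.findall(r"\d+\..*?\n", text))
# ['1. How can I recycle plastics effectively?\n'] -- the last question is lost,
# so appending '\n' to the LLM output before parsing may be a worthwhile guard.
```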
/Retrievers/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sugarforever/LangChain-Advanced/42d3c7fa5b6acf52b9a7b60e47e58de66d64394a/Retrievers/README.md
--------------------------------------------------------------------------------
/langchain_with_openai_assistants_api.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyP3f0+ueKPgrKhFpzZWRC4C",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | }
17 | },
18 | "cells": [
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {
22 | "id": "view-in-github",
23 | "colab_type": "text"
24 | },
25 | "source": [
26 | "
"
27 | ]
28 | },
29 | {
30 | "cell_type": "markdown",
31 | "source": [
32 | "# OpenAI Assistants API with LangChain\n",
33 | "\n",
34 | "OpenAI finally released the **Assistants** API. It allows users to build AI assistants within the applications.\n",
35 | "\n",
36 | "An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries. The **Assistants** API currently supports three types of tools:\n",
37 | "- Code Interpreter\n",
38 | "- Retrieval\n",
39 | "- Function calling\n",
40 | "\n",
41 | "In this tutorial, I will share with you 2 ways of using **Assistants** API:\n",
42 | "\n",
43 | "1. Command line tool `curl`\n",
44 | "2. `LangChain`\n",
45 | "\n",
46 | "You can surely use the official OpenAI Python SDK. It's just not covered in this tutorial.\n",
47 | "\n",
48 | "In principle, to interact with an Assistant, the following steps should be followed:\n",
49 | "\n",
50 | "1. Create an `Assistant`\n",
51 | "2. Create a `Thread`\n",
52 | "3. Submit a `Run`\n",
53 | "4. Query the `Run` status\n",
54 | "5. Retrieve the `Assistant`'s response (message)"
55 | ],
56 | "metadata": {
57 | "id": "ykElVnxqDNO7"
58 | }
59 | },
60 | {
61 | "cell_type": "markdown",
62 | "source": [
63 | "## Play with `curl`\n",
64 | "\n",
65 | "Before you run the following commands, make sure you have a valid **OPENAI_API_KEY** environmental variable set."
66 | ],
67 | "metadata": {
68 | "id": "bpPPGpeuE3wj"
69 | }
70 | },
71 | {
72 | "cell_type": "markdown",
73 | "source": [
74 | "### 1. Create an Assistant\n",
75 | "\n",
76 | "```shell\n",
77 | "curl \"https://api.openai.com/v1/assistants\" \\\n",
78 | " -u:$OPENAI_API_KEY \\\n",
79 | " -H \"Content-Type: application/json\" \\\n",
80 | " -H \"OpenAI-Beta: assistants=v1\" \\\n",
81 | " -d '{\n",
82 | " \"instructions\": \"You are a personal math tutor. When asked a question, write and run Python code to answer the question.\",\n",
83 | " \"name\": \"Math Tutor\",\n",
84 | " \"tools\": [{\"type\": \"code_interpreter\"}],\n",
85 | " \"model\": \"gpt-4\"\n",
86 | " }'\n",
87 | "```\n",
88 | "\n",
89 | "You will be given an assistant id in this API response.\n",
90 | "\n",
91 | "```json\n",
92 | "{\n",
93 | " \"id\": \"asst_M1qNqUsUIeItZ8f7JG6FLHTy\",\n",
94 | " \"object\": \"assistant\",\n",
95 | " \"created_at\": 1699486594,\n",
96 | " \"name\": \"Math Tutor\",\n",
97 | " \"description\": null,\n",
98 | " \"model\": \"gpt-4\",\n",
99 | " \"instructions\": \"You are a personal math tutor. When asked a question, write and run Python code to answer the question.\",\n",
100 | " \"tools\": [\n",
101 | " {\n",
102 | " \"type\": \"code_interpreter\"\n",
103 | " }\n",
104 | " ],\n",
105 | " \"file_ids\": [],\n",
106 | " \"metadata\": {}\n",
107 | "}% \n",
108 | "```\n",
109 | "\n",
110 | "### 2. Create a Thread\n",
111 | "\n",
112 | "A `Thread` represents a conversation between an Assistant and a user.\n",
113 | "\n",
114 | "```shell\n",
115 | "curl https://api.openai.com/v1/threads \\\n",
116 | " -u:$OPENAI_API_KEY \\\n",
117 | " -H 'Content-Type: application/json' \\\n",
118 | " -H 'OpenAI-Beta: assistants=v1' \\\n",
119 | " -d '{\n",
120 | " \"messages\": [\n",
121 | " {\n",
122 | " \"role\": \"user\",\n",
123 | " \"content\": \"Work out the sum of 1, 2 and 3\"\n",
124 | " }\n",
125 | " ]\n",
126 | " }'\n",
127 | "```\n",
128 | "\n",
129 | "You will be given a Thread id in this API response.\n",
130 | "\n",
131 | "```json\n",
132 | "{\n",
133 | " \"id\": \"thread_lEH83Q4bKV3hv4Y9yECcjfgy\",\n",
134 | " \"object\": \"thread\",\n",
135 | " \"created_at\": 1699486702,\n",
136 | " \"metadata\": {}\n",
137 | "}\n",
138 | "```\n",
139 | "\n",
140 | "### 3. Submit a Run\n",
141 | "\n",
142 | "Submit a Run with the `thread id` and `assistant id` to `run` the conversation.\n",
143 | "\n",
144 | "```shell\n",
145 | "curl https://api.openai.com/v1/threads//runs \\\n",
146 | " -u:$OPENAI_API_KEY \\\n",
147 | " -H 'Content-Type: application/json' \\\n",
148 | " -H 'OpenAI-Beta: assistants=v1' \\\n",
149 | " -d '{\n",
150 | " \"assistant_id\": \"asst_M1qNqUsUIeItZ8f7JG6FLHTy\"\n",
151 | " }'\n",
152 | "```\n",
153 | "\n",
154 | "You will be given a Run id in this API response.\n",
155 | "\n",
156 | "```json\n",
157 | "{\n",
158 | " \"id\": \"run_in35OoYZNo0osZnQOQrAHaNo\",\n",
159 | " \"object\": \"thread.run\",\n",
160 | " \"created_at\": 1699486777,\n",
161 | " \"assistant_id\": \"asst_M1qNqUsUIeItZ8f7JG6FLHTy\",\n",
162 | " \"thread_id\": \"thread_lEH83Q4bKV3hv4Y9yECcjfgy\",\n",
163 | " \"status\": \"queued\",\n",
164 | " \"started_at\": null,\n",
165 | " \"expires_at\": 1699487377,\n",
166 | " \"cancelled_at\": null,\n",
167 | " \"failed_at\": null,\n",
168 | " \"completed_at\": null,\n",
169 | " \"last_error\": null,\n",
170 | " \"model\": \"gpt-4\",\n",
171 | " \"instructions\": \"You are a personal math tutor. When asked a question, write and run Python code to answer the question.\",\n",
172 | " \"tools\": [\n",
173 | " {\n",
174 | " \"type\": \"code_interpreter\"\n",
175 | " }\n",
176 | " ],\n",
177 | " \"file_ids\": [],\n",
178 | " \"metadata\": {}\n",
179 | "}\n",
180 | "```\n",
181 | "\n",
182 | "### 4. Query the Run status\n",
183 | "\n",
184 | "```shell\n",
185 | "curl https://api.openai.com/v1/threads//runs/ \\\n",
186 | " -u:$OPENAI_API_KEY \\\n",
187 | " -H 'OpenAI-Beta: assistants=v1'\n",
188 | "```\n",
189 | "\n",
190 | "You will see the status of the Run in the API response.\n",
191 | "\n",
192 | "```json\n",
193 | "{\n",
194 | " \"id\": \"run_in35OoYZNo0osZnQOQrAHaNo\",\n",
195 | " \"object\": \"thread.run\",\n",
196 | " \"created_at\": 1699486777,\n",
197 | " \"assistant_id\": \"asst_M1qNqUsUIeItZ8f7JG6FLHTy\",\n",
198 | " \"thread_id\": \"thread_lEH83Q4bKV3hv4Y9yECcjfgy\",\n",
199 | " \"status\": \"completed\",\n",
200 | " \"started_at\": 1699486777,\n",
201 | " \"expires_at\": null,\n",
202 | " \"cancelled_at\": null,\n",
203 | " \"failed_at\": null,\n",
204 | " \"completed_at\": 1699486783,\n",
205 | " \"last_error\": null,\n",
206 | " \"model\": \"gpt-4\",\n",
207 | " \"instructions\": \"You are a personal math tutor. When asked a question, write and run Python code to answer the question.\",\n",
208 | " \"tools\": [\n",
209 | " {\n",
210 | " \"type\": \"code_interpreter\"\n",
211 | " }\n",
212 | " ],\n",
213 | " \"file_ids\": [],\n",
214 | " \"metadata\": {}\n",
215 | "}%\n",
216 | "```\n",
217 | "\n",
218 | "### 5. Retrieve the Assistant's response\n",
219 | "\n",
220 | "The Assistant's response will be the latest message in the following API response:\n",
221 | "\n",
222 | "```shell\n",
223 | "curl https://api.openai.com/v1/threads//messages \\\n",
224 | " -u:$OPENAI_API_KEY \\\n",
225 | " -H \"Content-Type: application/json\" \\\n",
226 | " -H \"OpenAI-Beta: assistants=v1\"\n",
227 | "```\n",
228 | "\n",
229 | "Sample response:\n",
230 | "\n",
231 | "```json\n",
232 | "{\n",
233 | " \"object\": \"list\",\n",
234 | " \"data\": [\n",
235 | " {\n",
236 | " \"id\": \"msg_bCgflETMYexEUEGqHkwNgN6Z\",\n",
237 | " \"object\": \"thread.message\",\n",
238 | " \"created_at\": 1699486782,\n",
239 | " \"thread_id\": \"thread_lEH83Q4bKV3hv4Y9yECcjfgy\",\n",
240 | " \"role\": \"assistant\",\n",
241 | " \"content\": [\n",
242 | " {\n",
243 | " \"type\": \"text\",\n",
244 | " \"text\": {\n",
245 | " \"value\": \"The sum of 1, 2, and 3 is 6.\",\n",
246 | " \"annotations\": []\n",
247 | " }\n",
248 | " }\n",
249 | " ],\n",
250 | " \"file_ids\": [],\n",
251 | " \"assistant_id\": \"asst_M1qNqUsUIeItZ8f7JG6FLHTy\",\n",
252 | " \"run_id\": \"run_in35OoYZNo0osZnQOQrAHaNo\",\n",
253 | " \"metadata\": {}\n",
254 | " },\n",
255 | " {\n",
256 | " \"id\": \"msg_I4xv8bsfy5evvEC8K4MVZar3\",\n",
257 | " \"object\": \"thread.message\",\n",
258 | " \"created_at\": 1699486702,\n",
259 | " \"thread_id\": \"thread_lEH83Q4bKV3hv4Y9yECcjfgy\",\n",
260 | " \"role\": \"user\",\n",
261 | " \"content\": [\n",
262 | " {\n",
263 | " \"type\": \"text\",\n",
264 | " \"text\": {\n",
265 | " \"value\": \"Work out the sum of 1, 2 and 3\",\n",
266 | " \"annotations\": []\n",
267 | " }\n",
268 | " }\n",
269 | " ],\n",
270 | " \"file_ids\": [],\n",
271 | " \"assistant_id\": null,\n",
272 | " \"run_id\": null,\n",
273 | " \"metadata\": {}\n",
274 | " }\n",
275 | " ],\n",
276 | " \"first_id\": \"msg_bCgflETMYexEUEGqHkwNgN6Z\",\n",
277 | " \"last_id\": \"msg_I4xv8bsfy5evvEC8K4MVZar3\",\n",
278 | " \"has_more\": false\n",
279 | "}%\n",
280 | "```"
281 | ],
282 | "metadata": {
283 | "id": "CgoXjB6bFgUZ"
284 | }
285 | },
286 | {
287 | "cell_type": "markdown",
288 | "source": [
289 | "## Play with LangChain\n",
290 | "\n",
291 | "LangChain dev team has been responding to OpenAI changes proactively. The latest **RC** version of LangChain has already supported **Assistants** API.\n",
292 | "\n",
293 | "Make sure you use at least version `0.0.331rc2` of LangChain to work with Assistants API.\n",
294 | "\n",
295 | "Currently LangChain is delivering the Assistants support by `langchain-experimental` package.\n",
296 | "\n",
297 | "The only class you need is just **OpenAIAssistantRunnable**, which makes code much cleaner.\n",
298 | "\n",
299 | "Let's see below."
300 | ],
301 | "metadata": {
302 | "id": "1BVbUSW2IC51"
303 | }
304 | },
305 | {
306 | "cell_type": "code",
307 | "source": [
308 | "import os\n",
309 | "os.environ[\"OPENAI_API_KEY\"] = 'your valid OpenAI API key with GPT-4 enabled'"
310 | ],
311 | "metadata": {
312 | "id": "GjlOblb38jmA"
313 | },
314 | "execution_count": 17,
315 | "outputs": []
316 | },
317 | {
318 | "cell_type": "code",
319 | "execution_count": null,
320 | "metadata": {
321 | "id": "em823NSa5xOa"
322 | },
323 | "outputs": [],
324 | "source": [
325 | "!pip install -U -q \"langchain==0.0.331rc2\" langchain-experimental \"openai>=1.1\""
326 | ]
327 | },
328 | {
329 | "cell_type": "code",
330 | "source": [
331 | "from langchain_experimental.openai_assistant import OpenAIAssistantRunnable"
332 | ],
333 | "metadata": {
334 | "id": "4vxoGs1DBVnX"
335 | },
336 | "execution_count": 14,
337 | "outputs": []
338 | },
339 | {
340 | "cell_type": "code",
341 | "source": [
342 | "interpreter_assistant = OpenAIAssistantRunnable.create_assistant(\n",
343 | " name=\"langchain assistant\",\n",
344 | " instructions=\"You are a personal math tutor. When asked a question, write and run Python code to answer the question.\",\n",
345 | " tools=[{\"type\": \"code_interpreter\"}],\n",
346 | " model=\"gpt-4\"\n",
347 | ")"
348 | ],
349 | "metadata": {
350 | "id": "RvPrTrT3Cy6z"
351 | },
352 | "execution_count": 18,
353 | "outputs": []
354 | },
355 | {
356 | "cell_type": "code",
357 | "source": [
358 | "output = interpreter_assistant.invoke({\"content\": \"Work out the sum of 1, 2 and 3\"})\n",
359 | "output"
360 | ],
361 | "metadata": {
362 | "colab": {
363 | "base_uri": "https://localhost:8080/"
364 | },
365 | "id": "us5FDeYi8a25",
366 | "outputId": "bb930f15-9d8b-4f89-a3fa-b7c02a1ec0f1"
367 | },
368 | "execution_count": 19,
369 | "outputs": [
370 | {
371 | "output_type": "execute_result",
372 | "data": {
373 | "text/plain": [
374 | "[ThreadMessage(id='msg_RnANxbiBnHsm08qNLOZiMLgX', assistant_id='asst_ZYhTHQEq3oxx9EM8a7TyE4Ac', content=[MessageContentText(text=Text(annotations=[], value='The sum of 1, 2 and 3 is 6.'), type='text')], created_at=1699487072, file_ids=[], metadata={}, object='thread.message', role='assistant', run_id='run_x9f7CXuvuAKGbGwGktLf1ghY', thread_id='thread_Fe5tZZGTBhJShOaBPDAbqOjz')]"
375 | ]
376 | },
377 | "metadata": {},
378 | "execution_count": 19
379 | }
380 | ]
381 | }
382 | ]
383 | }
--------------------------------------------------------------------------------
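Finally, once an assistant exists there is no need to call `create_assistant` again. The following is a minimal sketch, assuming the same `0.0.331rc2`-era API as the notebook above: `OpenAIAssistantRunnable` is constructed from an existing assistant id (here, the id returned by the earlier curl call), and passing `thread_id` back in continues the same conversation.

```python
# Minimal sketch: reuse an existing Assistant instead of creating a new one.
# Assumes a valid OPENAI_API_KEY and the packages installed in the notebook above.
from langchain_experimental.openai_assistant import OpenAIAssistantRunnable

# Attach to the assistant created earlier (id from the curl walkthrough).
assistant = OpenAIAssistantRunnable(assistant_id="asst_M1qNqUsUIeItZ8f7JG6FLHTy")

# Without a thread_id, invoke() starts a fresh thread behind the scenes.
messages = assistant.invoke({"content": "Work out the sum of 1, 2 and 3"})
print(messages)

# Passing the thread_id back in continues the same conversation.
followup = assistant.invoke(
    {"content": "Now multiply that result by 10", "thread_id": messages[0].thread_id}
)
print(followup)
```

This mirrors steps 2 through 5 of the curl walkthrough: the runnable creates the thread, submits the run, polls its status, and returns the assistant's new messages in one `invoke` call.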