├── .env.example ├── .gitignore ├── LICENSE.txt ├── LangChainGuides ├── BasicAgent.ipynb ├── ChatWithObsidian.ipynb ├── ChatWithPodcast.ipynb ├── ContentIdeaGenerator.ipynb ├── PromptTemplates.ipynb ├── YouTubeLoader.ipynb ├── images │ ├── Chains_router.png │ ├── Chains_seq.png │ ├── Chains_simple_seq.png │ ├── Embeddings.png │ ├── TAO.png │ ├── VectorDatabaseCreate.png │ ├── VectorDatabaseProcess.png │ ├── qa_data_ecosystem.png │ └── qa_flow.png └── transcripts │ └── PT693-Transcript.pdf └── README.md /.env.example: -------------------------------------------------------------------------------- 1 | OPENAI_API_KEY="Your OPENAI API Key" -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | todo.md 2 | .env 3 | .vscode 4 | 5 | pg_files/ 6 | 7 | chroma/ 8 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Krzysztof Ograbek 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /LangChainGuides/BasicAgent.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Create Your First AI Agent with LangChain\n", 8 | "\n", 9 | "Large Language Models are powerful but they have some limitations. Let's focus on 2 of them:\n", 10 | "- the knowledge cut-off,\n", 11 | "- mistakes in math.\n", 12 | "\n", 13 | "The idea for this project is to use LangChain to build a basic AI Agent and give it 2 tools: Google search and a calculator.\n", 14 | "\n", 15 | "We'll use a prompting technique called **ReAct** (Reasoning & Acting).\n", 16 | "\n", 17 | "*Can our AI Agent take advantage of the toolkit we provided?*\n", 18 | "\n", 19 | "Let's find out!\n", 20 | "\n", 21 | "By following this tutorial, you'll learn how to:\n", 22 | "1. Create an AI agent using the LangChain library.\n", 23 | "2. Understand what ReAct prompting is.\n", 24 | "3. Use Google search with agents.\n", 25 | "4. 
Use the LLM math tool.\n" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": {}, 31 | "source": [ 32 | "## Import necessary libraries & Load Env Variables\n", 33 | "\n", 34 | "First, we need to install some packages. Just run `pip install openai langchain python-dotenv google-search-results` to install them.\n", 35 | "\n", 36 | "For the project, we'll need 2 API keys:\n", 37 | "1. OpenAI API key for the conversational agent. You can grab it here: https://platform.openai.com/account/api-keys\n", 38 | "2. SERP API key for Google search. You can grab it here: https://serpapi.com/" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 1, 44 | "metadata": {}, 45 | "outputs": [ 46 | { 47 | "data": { 48 | "text/plain": [ 49 | "True" 50 | ] 51 | }, 52 | "execution_count": 1, 53 | "metadata": {}, 54 | "output_type": "execute_result" 55 | } 56 | ], 57 | "source": [ 58 | "import os\n", 59 | "from dotenv import load_dotenv\n", 60 | "\n", 61 | "# load_dotenv()\n", 62 | "\n", 63 | "# Get the absolute path of the current script\n", 64 | "script_dir = os.path.abspath(os.getcwd())\n", 65 | "\n", 66 | "# Get the absolute path of the parent directory\n", 67 | "parent_dir = os.path.join(script_dir, os.pardir)\n", 68 | "\n", 69 | "dotenv_path = os.path.join(parent_dir, '.env')\n", 70 | "# Load the .env file from the parent directory\n", 71 | "load_dotenv(dotenv_path)" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 2, 77 | "metadata": {}, 78 | "outputs": [ 79 | { 80 | "data": { 81 | "text/plain": [ 82 | "'2a5096fc18b65b3fcab02a9bc50c45c5338d9895252875761d92b26dfb2d9181'" 83 | ] 84 | }, 85 | "execution_count": 2, 86 | "metadata": {}, 87 | "output_type": "execute_result" 88 | } 89 | ], 90 | "source": [ 91 | "os.getenv(\"SERPAPI_API_KEY\")" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": {}, 97 | "source": [ 98 | "## Using Models without Agents\n", 99 | "\n", 100 | "In this part, we'll quickly go over the standard way of using OpenAI models."
101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 2, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "from langchain.llms import OpenAI\n", 110 | "\n", 111 | "llm = OpenAI(temperature=0)" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": 3, 117 | "metadata": {}, 118 | "outputs": [ 119 | { 120 | "data": { 121 | "text/plain": [ 122 | "'text-davinci-003'" 123 | ] 124 | }, 125 | "execution_count": 3, 126 | "metadata": {}, 127 | "output_type": "execute_result" 128 | } 129 | ], 130 | "source": [ 131 | "llm.model_name" 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 4, 137 | "metadata": {}, 138 | "outputs": [ 139 | { 140 | "data": { 141 | "text/plain": [ 142 | "'\\n\\nThe Los Angeles Lakers are the current NBA champions.'" 143 | ] 144 | }, 145 | "execution_count": 4, 146 | "metadata": {}, 147 | "output_type": "execute_result" 148 | } 149 | ], 150 | "source": [ 151 | "llm.predict(\"Who is the current NBA champion?\")" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "metadata": {}, 157 | "source": [ 158 | "### Using Chat Models" 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": 5, 164 | "metadata": {}, 165 | "outputs": [], 166 | "source": [ 167 | "from langchain.chat_models import ChatOpenAI\n", 168 | "\n", 169 | "chat_llm = ChatOpenAI(temperature=0)" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": 6, 175 | "metadata": {}, 176 | "outputs": [ 177 | { 178 | "data": { 179 | "text/plain": [ 180 | "'gpt-3.5-turbo'" 181 | ] 182 | }, 183 | "execution_count": 6, 184 | "metadata": {}, 185 | "output_type": "execute_result" 186 | } 187 | ], 188 | "source": [ 189 | "chat_llm.model_name" 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": 7, 195 | "metadata": {}, 196 | "outputs": [ 197 | { 198 | "data": { 199 | "text/plain": [ 200 | "'As of September 2021, the current NBA champion is the Milwaukee Bucks. 
They won the championship in the 2020-2021 season.'" 201 | ] 202 | }, 203 | "execution_count": 7, 204 | "metadata": {}, 205 | "output_type": "execute_result" 206 | } 207 | ], 208 | "source": [ 209 | "chat_llm.predict(\"Who is the current NBA champion?\")" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": 8, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "gpt4 = ChatOpenAI(model=\"gpt-4\", temperature=0)" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": 9, 224 | "metadata": {}, 225 | "outputs": [ 226 | { 227 | "data": { 228 | "text/plain": [ 229 | "ChatOpenAI(client=, model_name='gpt-4', temperature=0.0, openai_api_key='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', openai_api_base='', openai_organization='', openai_proxy='')" 230 | ] 231 | }, 232 | "execution_count": 9, 233 | "metadata": {}, 234 | "output_type": "execute_result" 235 | } 236 | ], 237 | "source": [ 238 | "gpt4.model_name" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": 10, 244 | "metadata": {}, 245 | "outputs": [ 246 | { 247 | "data": { 248 | "text/plain": [ 249 | "'As of the end of the 2020-2021 season, the current NBA champion is the Milwaukee Bucks.'" 250 | ] 251 | }, 252 | "execution_count": 10, 253 | "metadata": {}, 254 | "output_type": "execute_result" 255 | } 256 | ], 257 | "source": [ 258 | "gpt4.predict(\"Who is the current NBA champion?\")" 259 | ] 260 | }, 261 | { 262 | "cell_type": "markdown", 263 | "metadata": {}, 264 | "source": [ 265 | "#### Finding the NBA all-time leading scorer.\n", 266 | "\n", 267 | "This is a great example to test both limitations: the knowledge cut-off & math mistakes. \n", 268 | "\n", 269 | "In 2023, LeBron James surpassed Kareem Abdul-Jabbar as the NBA all-time leading scorer. So we expect GPT-4 to answer Kareem. Then we want to take the total points to the power of 0.42 to make the calculation more challenging." 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": 11, 275 | "metadata": {}, 276 | "outputs": [ 277 | { 278 | "data": { 279 | "text/plain": [ 280 | "'The NBA all-time leading scorer is Kareem Abdul-Jabbar with a total of 38,387 points. His total points to the power of 0.42 is approximately 137.97.'" 281 | ] 282 | }, 283 | "execution_count": 11, 284 | "metadata": {}, 285 | "output_type": "execute_result" 286 | } 287 | ], 288 | "source": [ 289 | "gpt4.predict(\"Who is the NBA all-time leading scorer? What's his total points to the power of 0.42?\")" 290 | ] 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": {}, 295 | "source": [ 296 | "Both answers are wrong!\n", 297 | "1. It's LeBron James (the knowledge cut-off issue).\n", 298 | "2. It should be approximately 84 (bad math).\n", 299 | "\n", 300 | "Let's see if we can overcome the limitations!" 
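, "\n", "As a quick sanity check of the math (plain Python, no LLM involved), using the point totals quoted in this notebook (38,387 for Kareem, 38,652 for LeBron):\n", "\n", "```python\n", "# Both totals land near 84 when raised to the power of 0.42,\n", "# so the 137.97 above really was bad math.\n", "print(38387 ** 0.42)  # ~84.2\n", "print(38652 ** 0.42)  # ~84.45\n", "```"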
301 | ] 302 | }, 303 | { 304 | "cell_type": "markdown", 305 | "metadata": {}, 306 | "source": [ 307 | "### Adding the Agent\n", 308 | "\n", 309 | "We'll use an Agent that uses **Chain-of-Thought Reasoning (ReAct)**\n", 310 | "\n", 311 | "\"Image" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": 14, 317 | "metadata": {}, 318 | "outputs": [], 319 | "source": [ 320 | "from langchain.agents import load_tools\n", 321 | "from langchain.agents import initialize_agent\n", 322 | "from langchain.agents import AgentType" 323 | ] 324 | }, 325 | { 326 | "cell_type": "markdown", 327 | "metadata": {}, 328 | "source": [ 329 | "### Standard Agent with the DaVinci model\n", 330 | "\n", 331 | "First, we'll try to use the agent for the standard LLM: `text-davinci-003`" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": {}, 337 | "source": [ 338 | "We give our Agent 2 tools:\n", 339 | "1. SEPR API for Google Search so we get the answers that are true TODAY. *Note:* requires `pip install google-search-results`\n", 340 | "2. `llm-math` for calculations." 341 | ] 342 | }, 343 | { 344 | "cell_type": "code", 345 | "execution_count": 15, 346 | "metadata": {}, 347 | "outputs": [], 348 | "source": [ 349 | "llm = OpenAI(temperature=0)\n", 350 | "\n", 351 | "# adding the tools\n", 352 | "tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)" 353 | ] 354 | }, 355 | { 356 | "cell_type": "markdown", 357 | "metadata": {}, 358 | "source": [ 359 | "To apply ReAct prompting we use `agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION`" 360 | ] 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": 16, 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": [ 368 | "# initializing the agent\n", 369 | "agent_executor = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)" 370 | ] 371 | }, 372 | { 373 | "cell_type": "code", 374 | "execution_count": 17, 375 | "metadata": {}, 376 | "outputs": [ 377 | { 378 | "name": "stdout", 379 | "output_type": "stream", 380 | "text": [ 381 | "\n", 382 | "\n", 383 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 384 | "\u001b[32;1m\u001b[1;3m I need to find out who won the most recent NBA championship\n", 385 | "Action: Search\n", 386 | "Action Input: \"2020 NBA Champions\"\u001b[0m\n", 387 | "Observation: \u001b[36;1m\u001b[1;3m{'title': 'NBA', 'rankings': 'NBA Finals', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e3359b9d7a36ae41dca170eeb5f549ee64c2318754ac54440.png', 'games': [{'tournament': 'NBA', 'arena': 'AdventHealth Arena at ESPN Wide World of Sports Complex', 'status': 'Final', 'date': 'Oct 11, 20', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=czq7usfGiZY&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382125e827c5974d947a154b51dfed1f642f917191f83e557c7b70f070d3167b72fafd8cf3dbc9c5a246.jpeg', 'duration': '9:56'}, 'teams': [{'name': 'Lakers', 'score': '106', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382187637fc68e482d3f591c4784e21ee6742f0ddf745f5e59f7b7d8288de8aa6c1a.png'}, {'name': 'Heat', 'score': '93', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382187637fc68e482d3f5f1722c8a08cc0fdb42261d32d2de8e7408fa347b40c0a32.png'}]}, {'tournament': 'NBA', 'arena': 'AdventHealth Arena at ESPN Wide World of Sports Complex', 'status': 
'Final', 'date': 'Oct 9, 20', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=1S1cXaqkIRQ&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d3821f8d7339a8123548d29d179aa99e7ad3d95bca9c63cbd00b761edb95f68022b4cf9375ea351bc5918.jpeg', 'duration': '9:50'}, 'teams': [{'name': 'Heat', 'score': '111', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382196e401766691baaa194147fee974403dec7184ac912a099b06335d0a2e0fb411.png'}, {'name': 'Lakers', 'score': '108', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382196e401766691baaa00ed72f33a1886b23eb31c8f5483d9ec73ba112e50b4ccf2.png'}]}, {'tournament': 'NBA', 'arena': 'AdventHealth Arena at ESPN Wide World of Sports Complex', 'status': 'Final', 'date': 'Oct 6, 20', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=Bh90U367Ivc&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d38216fffefeff5e3ed1a7719963044fb177c0e6c849af4574c9462c50285abfdd09d1fc58e8ae41f5058.jpeg', 'duration': '9:47'}, 'teams': [{'name': 'Lakers', 'score': '102', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d3821df27767a43b56f33fa677f947e39cca6d6b3316b3853de4b3734d45083f54c00.png'}, {'name': 'Heat', 'score': '96', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d3821df27767a43b56f33236747433e4c00759563e7d95f2c6b080117c2ea5987b233.png'}]}, {'tournament': 'NBA', 'arena': 'AdventHealth Arena at ESPN Wide World of Sports Complex', 'status': 'Final', 'date': 'Oct 4, 20', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=yqiq97qlAwo&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d38213023f22c116a9ddc99bd3feaaff4472f5fde41bbc034cbafd40e3bba9fcdcdb0307f9419ef7830d3.jpeg', 'duration': '9:53'}, 'teams': [{'name': 'Lakers', 'score': '104', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382102d75fb5650b2b82e6d3f3109c9d4567bb4ce493c574d3c9f8f9253dc847c9c7.png'}, {'name': 'Heat', 'score': '115', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382102d75fb5650b2b82baffbf6cff3986bc8307a26ad2dbbbc5fd316340a5a02a13.png'}]}, {'tournament': 'NBA', 'arena': 'AdventHealth Arena at ESPN Wide World of Sports Complex', 'status': 'Final', 'date': 'Oct 2, 20', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=jUnCLlDTgq0&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d3821583cafab94c4e7f996b06a06c41ea4b2f94b60ee71d0a5ce7018544e98824af0d8aef55b6aa0d233.jpeg', 'duration': '9:42'}, 'teams': [{'name': 'Heat', 'score': '114', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382181d42c55d7f8a54a5083b596d3a358ab90a7292ae40698734af18cc52824ee5b.png'}, {'name': 'Lakers', 'score': '124', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d382181d42c55d7f8a54aaf29a095fe59273700bca46498ffd2314b1d72828fc0d5fb.png'}]}, {'tournament': 'NBA', 'arena': 'AdventHealth Arena at ESPN Wide World of Sports Complex', 'status': 'Final', 'date': 'Sep 30, 20', 'video_highlights': 
{'link': 'https://www.youtube.com/watch?v=AmwTYudvE80&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d3821a19ecf3ea3ffa1f597f157e0fa25d406e8138d9778eb3e2d20e9c85e0ac5824a5c2e0f036abe2991.jpeg', 'duration': '9:51'}, 'teams': [{'name': 'Heat', 'score': '98', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d3821fcefd7c6c1ec94641f40b81c836112f953fd15b1460708989659fb17e1f55aa9.png'}, {'name': 'Lakers', 'score': '116', 'thumbnail': 'https://serpapi.com/searches/651bf4a095bf92492e92fdaa/images/e568f1d4e23e3a0e84b774e26c3d3821fcefd7c6c1ec94644bc8623d53335aadde018fbe8f6f8f9dc075f1772cdc3782.png'}]}]}\u001b[0m\n", 388 | "Thought:\u001b[32;1m\u001b[1;3m The Lakers have won the most recent NBA championship.\n", 389 | "Final Answer: The Los Angeles Lakers are the 2020 NBA Champions.\u001b[0m\n", 390 | "\n", 391 | "\u001b[1m> Finished chain.\u001b[0m\n" 392 | ] 393 | }, 394 | { 395 | "data": { 396 | "text/plain": [ 397 | "{'input': 'Who is the current NBA Champion?',\n", 398 | " 'output': 'The Los Angeles Lakers are the 2020 NBA Champions.'}" 399 | ] 400 | }, 401 | "execution_count": 17, 402 | "metadata": {}, 403 | "output_type": "execute_result" 404 | } 405 | ], 406 | "source": [ 407 | "agent_executor.invoke({\"input\": \"Who is the current NBA Champion?\"})" 408 | ] 409 | }, 410 | { 411 | "cell_type": "markdown", 412 | "metadata": {}, 413 | "source": [ 414 | "**The answer is wrong!**\n", 415 | "\n", 416 | "Our Agent assumes it's 2020, so it took \"the current\" part of our prompt as 2020...\n", 417 | "\n", 418 | "Why does he assume it's 2020? Because that's when it had its knowledge cut-off." 419 | ] 420 | }, 421 | { 422 | "cell_type": "code", 423 | "execution_count": 18, 424 | "metadata": {}, 425 | "outputs": [ 426 | { 427 | "name": "stdout", 428 | "output_type": "stream", 429 | "text": [ 430 | "\n", 431 | "\n", 432 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 433 | "\u001b[32;1m\u001b[1;3m I need to find out who won the NBA title in 2023\n", 434 | "Action: Search\n", 435 | "Action Input: \"NBA title 2023\"\u001b[0m\n", 436 | "Observation: \u001b[36;1m\u001b[1;3m{'title': 'NBA', 'rankings': 'NBA Finals', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582e5669e7b39e1efa347aa52467bffd138153c2d71f3cb20a7.png', 'games': [{'tournament': 'NBA', 'arena': 'Ball Arena', 'status': 'Final', 'date': 'Jun 12, 23', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=ucZZdf94LbI&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb0844011134d085aa11b07b134734c796c8e77ae11baf6644cfb746b936a645036ce885d36f1b4380.jpeg', 'duration': '9:51'}, 'teams': [{'name': 'Heat', 'score': '89', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb811e2de0f99d7f4796efb9ecd0d2dd54a7c0c8b4ba9b7798e1459993ca3be27c.png'}, {'name': 'Nuggets', 'score': '94', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb811e2de0f99d7f47766149fdc081bf83ac1f2eb863f1d656048971ff14aad6ee.png'}]}, {'tournament': 'NBA', 'arena': 'FTX Arena', 'status': 'Final', 'date': 'Jun 9, 23', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=Y0p8PzJ2eMw&feature=onebox', 'thumbnail': 
'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bba083b6cb574b08f79db6117aa476e797cbe00aa1c381ddeb947e0ad9d9cee4ac83cfd646cb7c4eeb.jpeg', 'duration': '9:53'}, 'teams': [{'name': 'Nuggets', 'score': '108', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb1a49c22fc7661a5125e36ba033d89a6b4eecdfd342c7898543a80d16ac75d596.png'}, {'name': 'Heat', 'score': '95', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb1a49c22fc7661a5158d815bbc217939023d3464acea7122dc4ce153b9b614cd5.png'}]}, {'tournament': 'NBA', 'arena': 'FTX Arena', 'status': 'Final', 'date': 'Jun 7, 23', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=5_wL0QrJT_M&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb4f8f6ae81fcd5e5ad03df5bd239d178e7cd7430e82ce05df8426928aa1abaa69b3a0d6970ea7f256.jpeg', 'duration': '9:59'}, 'teams': [{'name': 'Nuggets', 'score': '109', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb3f8d7f8e7945f22a3531bb8ff2391dcd8dc22510a6ae5ed62cfc0cf32114ade8.png'}, {'name': 'Heat', 'score': '94', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb3f8d7f8e7945f22af21e5acc3b8a6bb9c1c113402b3d6ae9e6c7d982762eeda0.png'}]}, {'tournament': 'NBA', 'arena': 'Ball Arena', 'status': 'Final', 'date': 'Jun 4, 23', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=pjSflSwIDEc&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb201dbcfc9265790626fa169414dcea61041a320d05b0fdcbd62993a70b73140b9b7f59851336cbf1.jpeg', 'duration': '9:57'}, 'teams': [{'name': 'Heat', 'score': '111', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bbd66aa84c8fd7496c1c507c01c5c5c1af8647112dbf58eb1321262681e03510a1.png'}, {'name': 'Nuggets', 'score': '108', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bbd66aa84c8fd7496c158142910d9c79f08ddf443d58799f6fb268ebee3671e882.png'}]}, {'tournament': 'NBA', 'arena': 'Ball Arena', 'status': 'Final', 'date': 'Jun 1, 23', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=IQ3btTsFDTc&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb05256dae91a809e74394038770e3a89c39afc80f831d1a98368ab174c865d4a7df01e94c06473bb7.jpeg', 'duration': '9:59'}, 'teams': [{'name': 'Heat', 'score': '93', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb8bbb8c5a20bddf2ba2f750684c98ef9c3b82122f32f88060ac6a05176d1e29c2.png'}, {'name': 'Nuggets', 'score': '104', 'thumbnail': 'https://serpapi.com/searches/651bf508c9de45d6da429b22/images/ee44ec191643b582030df48d546e38bb8bbb8c5a20bddf2b83ad12805e0514161702168fb84d71a40ad4be1e9c27afe5.png'}]}]}\u001b[0m\n", 437 | "Thought:\u001b[32;1m\u001b[1;3m The Nuggets won the NBA title in 2023.\n", 438 | "Final Answer: The Nuggets won the NBA title in 2023.\u001b[0m\n", 439 | "\n", 440 | "\u001b[1m> Finished chain.\u001b[0m\n" 441 | ] 442 | }, 443 | { 444 | "data": { 445 | "text/plain": [ 446 | "{'input': 'Who won the NBA title in 2023?',\n", 447 | " 'output': 'The Nuggets won the NBA title in 2023.'}" 448 | ] 449 | }, 
450 | "execution_count": 18, 451 | "metadata": {}, 452 | "output_type": "execute_result" 453 | } 454 | ], 455 | "source": [ 456 | "agent_executor.invoke({\"input\": \"Who won the NBA title in 2023?\"})" 457 | ] 458 | }, 459 | { 460 | "cell_type": "markdown", 461 | "metadata": {}, 462 | "source": [ 463 | "Although the answer is good, we had to \"force\" the Agent to type 2023 in the search." 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": 19, 469 | "metadata": {}, 470 | "outputs": [ 471 | { 472 | "name": "stdout", 473 | "output_type": "stream", 474 | "text": [ 475 | "\n", 476 | "\n", 477 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 478 | "\u001b[32;1m\u001b[1;3m I need to find out who the NBA all-time leading scorer is and then calculate his total points to the power of 0.42\n", 479 | "Action: Search\n", 480 | "Action Input: NBA all-time leading scorer\u001b[0m\n", 481 | "Observation: \u001b[36;1m\u001b[1;3mLeBron James · 38,652\u001b[0m\n", 482 | "Thought:\u001b[32;1m\u001b[1;3m I need to calculate 38,652 to the power of 0.42\n", 483 | "Action: Calculator\n", 484 | "Action Input: 38,652^0.42\u001b[0m\n", 485 | "Observation: \u001b[33;1m\u001b[1;3mAnswer: 84.45244506356971\u001b[0m\n", 486 | "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n", 487 | "Final Answer: LeBron James is the NBA all-time leading scorer with a total points of 84.45244506356971 to the power of 0.42.\u001b[0m\n", 488 | "\n", 489 | "\u001b[1m> Finished chain.\u001b[0m\n" 490 | ] 491 | }, 492 | { 493 | "data": { 494 | "text/plain": [ 495 | "{'input': \"Who is the NBA all-time leading scorer? What's his total points to the power of 0.42?\",\n", 496 | " 'output': 'LeBron James is the NBA all-time leading scorer with a total points of 84.45244506356971 to the power of 0.42.'}" 497 | ] 498 | }, 499 | "execution_count": 19, 500 | "metadata": {}, 501 | "output_type": "execute_result" 502 | } 503 | ], 504 | "source": [ 505 | "agent_executor.invoke({\"input\": \"Who is the NBA all-time leading scorer? What's his total points to the power of 0.42?\"})" 506 | ] 507 | }, 508 | { 509 | "cell_type": "markdown", 510 | "metadata": {}, 511 | "source": [ 512 | "Cool, we've got the right answer about the all-time leading scorer and the math is also correct!\n", 513 | "\n", 514 | "However... Here's another issue! It took the calculation and assigned it to Lebron's total points, which is incorrect.\n", 515 | "\n", 516 | "Can the Chat Agent do better than this?" 517 | ] 518 | }, 519 | { 520 | "cell_type": "markdown", 521 | "metadata": {}, 522 | "source": [ 523 | "### Agent for Chat Models + Defining Tools\n", 524 | "\n", 525 | "This time we skip the math agent." 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": 9, 531 | "metadata": {}, 532 | "outputs": [], 533 | "source": [ 534 | "from langchain.utilities import SerpAPIWrapper\n", 535 | "from langchain.agents import initialize_agent, Tool, AgentType\n", 536 | "\n", 537 | "search = SerpAPIWrapper()\n", 538 | "new_tools = [\n", 539 | " Tool(\n", 540 | " name=\"Search\",\n", 541 | " func=search.run,\n", 542 | " description=\"useful for when you need to answer questions about current events. You should ask targeted questions\",\n", 543 | " ),\n", 544 | "]" 545 | ] 546 | }, 547 | { 548 | "cell_type": "markdown", 549 | "metadata": {}, 550 | "source": [ 551 | "To use the Chat Agent, we need to make 2 changes:\n", 552 | "1. Use a Chat model - `ChatOpenAI(temperature=0)`\n", 553 | "2. 
Change the agent type to include \"chat\" - `agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION`" 554 | ] 555 | }, 556 | { 557 | "cell_type": "code", 558 | "execution_count": 21, 559 | "metadata": {}, 560 | "outputs": [ 561 | { 562 | "name": "stdout", 563 | "output_type": "stream", 564 | "text": [ 565 | "\n", 566 | "\n", 567 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 568 | "\u001b[32;1m\u001b[1;3mThought: I can use the Search tool to find the answer to this question.\n", 569 | "\n", 570 | "Action:\n", 571 | "```\n", 572 | "{\n", 573 | " \"action\": \"Search\",\n", 574 | " \"action_input\": \"current FIFA men's world champion\"\n", 575 | "}\n", 576 | "```\u001b[0m\n", 577 | "Observation: \u001b[36;1m\u001b[1;3mThe reigning champions are Argentina, who won their third title at the 2022 tournament.\u001b[0m\n", 578 | "Thought:\u001b[32;1m\u001b[1;3mI now know the final answer.\n", 579 | "Final Answer: The current FIFA men's world champion is Argentina.\u001b[0m\n", 580 | "\n", 581 | "\u001b[1m> Finished chain.\u001b[0m\n" 582 | ] 583 | }, 584 | { 585 | "data": { 586 | "text/plain": [ 587 | "\"The current FIFA men's world champion is Argentina.\"" 588 | ] 589 | }, 590 | "execution_count": 21, 591 | "metadata": {}, 592 | "output_type": "execute_result" 593 | } 594 | ], 595 | "source": [ 596 | "from langchain.chat_models import ChatOpenAI\n", 597 | "\n", 598 | "agent = initialize_agent(\n", 599 | " new_tools, \n", 600 | " ChatOpenAI(temperature=0), \n", 601 | " agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, \n", 602 | " verbose=True, \n", 603 | " # handle_parsing_errors=\"Check your output and make sure it conforms!\"\n", 604 | " )\n", 605 | "\n", 606 | "agent.run(\"Who is the Current FIFA men's world champion?\")" 607 | ] 608 | }, 609 | { 610 | "cell_type": "markdown", 611 | "metadata": {}, 612 | "source": [ 613 | "No assumption about the year! Is it true for the NBA too?" 614 | ] 615 | }, 616 | { 617 | "cell_type": "code", 618 | "execution_count": 22, 619 | "metadata": {}, 620 | "outputs": [ 621 | { 622 | "name": "stdout", 623 | "output_type": "stream", 624 | "text": [ 625 | "\n", 626 | "\n", 627 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 628 | "\u001b[32;1m\u001b[1;3mI should use the Search tool to find the current NBA champion.\n", 629 | "\n", 630 | "Action:\n", 631 | "```\n", 632 | "{\n", 633 | " \"action\": \"Search\",\n", 634 | " \"action_input\": \"current NBA champion\"\n", 635 | "}\n", 636 | "```\u001b[0m\n", 637 | "Observation: \u001b[36;1m\u001b[1;3mDenver Nuggets\u001b[0m\n", 638 | "Thought:\u001b[32;1m\u001b[1;3mThe observation is incorrect. 
I need to refine my search query to get the correct answer.\n", 639 | "\n", 640 | "Action:\n", 641 | "```\n", 642 | "{\n", 643 | " \"action\": \"Search\",\n", 644 | " \"action_input\": \"2021 NBA champion\"\n", 645 | "}\n", 646 | "```\n", 647 | "\u001b[0m\n", 648 | "Observation: \u001b[36;1m\u001b[1;3m{'title': 'NBA', 'rankings': 'NBA Finals', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794aaa70568c619c75b84166ca3ceb52b96b37c42934cb077e43.png', 'games': [{'tournament': 'NBA', 'arena': 'Fiserv Forum', 'status': 'Final', 'date': 'Jul 20, 21', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=xxRQ5nne7fs&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584b085d5d0a2ae5b2447bb304bb85558f98be7045f3ecf4f090c70d1ea3b23c13d9bea1de32c6f6383.jpeg', 'duration': '9:57'}, 'teams': [{'name': 'Suns', 'score': '98', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584ed98abdba16de4babd3d18e4732f57990edd0a684f563c6e5579e13cbcb188fe.png'}, {'name': 'Bucks', 'score': '105', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584ed98abdba16de4ba010f00c0804ee9c2852c5cd0b4b40288b8c06abf2f1fa162.png'}]}, {'tournament': 'NBA', 'arena': 'Footprint Center', 'status': 'Final', 'date': 'Jul 17, 21', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=NQVD6ddpwco&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c88325847016242c23f12907bf16381f6fc5b4be51a07a38672c0a59e6f1f8948f12259cc71b2e6430951b77.jpeg', 'duration': '9:43'}, 'teams': [{'name': 'Bucks', 'score': '123', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584abb794ddbad3352dc0b458d8d17a2c94fd025c52fc2a058f9fd26298281c5d2c.png'}, {'name': 'Suns', 'score': '119', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584abb794ddbad3352ddef9103741314dff6d7778393e76d46f0e09ae4ae028a20a.png'}]}, {'tournament': 'NBA', 'arena': 'Fiserv Forum', 'status': 'Final', 'date': 'Jul 14, 21', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=a_39qOCpDco&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c88325849e1b5a9aa752b2c2ef4f45f32a9c68f3e67609ebccdee17779375f87c27c09333bd0ec95fcb23e08.jpeg', 'duration': '9:52'}, 'teams': [{'name': 'Suns', 'score': '103', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584b9f5331764c4e2144fdb881f18fdef22813d081a6496108970790df8ba153da5.png'}, {'name': 'Bucks', 'score': '109', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584b9f5331764c4e21488f463367e26217f2da4902422a10dcca39a754bca6d9196.png'}]}, {'tournament': 'NBA', 'arena': 'Fiserv Forum', 'status': 'Final', 'date': 'Jul 11, 21', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=61pqEPT3MfI&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c883258485e0d424a266c64a57b412ac25512e3205f8fab037c590ee67d81f22b8b8aa8ccb3071321e1d2ac4.jpeg', 'duration': '9:47'}, 'teams': [{'name': 'Suns', 'score': '100', 'thumbnail': 
'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584aaeec18136edd89eb5562522b6b727c28264c7fde651c1009d5b6d304b6371dc.png'}, {'name': 'Bucks', 'score': '120', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584aaeec18136edd89ea9f4c3e7f6aa170e0107aa2ce691fce941204905f8174b69.png'}]}, {'tournament': 'NBA', 'arena': 'Footprint Center', 'status': 'Final', 'date': 'Jul 8, 21', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=oJZrEOmctc4&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584cfa5949c0fc8e421aae4a88740bba7e02fabb3fa28dcd30e90ca4809a29f393ded5cedb18f99352c.jpeg', 'duration': '9:51'}, 'teams': [{'name': 'Bucks', 'score': '108', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c88325842fc2d354f35bcc85fbb4f72c610a68473d469965e3106a432e70156dd4f8427a.png'}, {'name': 'Suns', 'score': '118', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c88325842fc2d354f35bcc857226c45b5b32dd23c873e923f01d91721b5c94fb5ccd3183.png'}]}, {'tournament': 'NBA', 'arena': 'Footprint Center', 'status': 'Final', 'date': 'Jul 6, 21', 'video_highlights': {'link': 'https://www.youtube.com/watch?v=Fm5lzr1J25Y&feature=onebox', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584e48ec1615e597948d8ecf9b2fe8036d4de0b23efc2dbc36ab47aa72830c8cf14920c5006d453a0b7.jpeg', 'duration': '9:58'}, 'teams': [{'name': 'Bucks', 'score': '105', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584a93d29ce32b2cdaf84344ac3fa976d936e213c57a76046bcc173fbcb5185d723.png'}, {'name': 'Suns', 'score': '118', 'thumbnail': 'https://serpapi.com/searches/651bf635797ac6c57f903e77/images/3ab47b1fdc51794a018967e8c8832584a93d29ce32b2cdaff1ea95e65dac1b021c56abb3bb4442c320d2edae4cd452c0.png'}]}]}\u001b[0m\n", 649 | "Thought:\u001b[32;1m\u001b[1;3mThe observation shows that the Milwaukee Bucks are the current NBA champions. \n", 650 | "\n", 651 | "Final Answer: The Milwaukee Bucks are the current NBA champions.\u001b[0m\n", 652 | "\n", 653 | "\u001b[1m> Finished chain.\u001b[0m\n" 654 | ] 655 | }, 656 | { 657 | "data": { 658 | "text/plain": [ 659 | "'The Milwaukee Bucks are the current NBA champions.'" 660 | ] 661 | }, 662 | "execution_count": 22, 663 | "metadata": {}, 664 | "output_type": "execute_result" 665 | } 666 | ], 667 | "source": [ 668 | "agent.run(\"Who is the current NBA champion?\")" 669 | ] 670 | }, 671 | { 672 | "cell_type": "markdown", 673 | "metadata": {}, 674 | "source": [ 675 | "I absolutely love this example because it shows that Agent's aren't perfect!\n", 676 | "\n", 677 | "Look at the first \"Thought\" part. It says: **The observation is incorrect** although it was correct!\n", 678 | "\n", 679 | "I don't know why our model did it. I assume it was because it didn't match the answer from 2021...\n", 680 | "\n", 681 | "I've ran the code previously and it ended up giving me Denver Nuggets.\n", 682 | "\n", 683 | "But this time was different. The Agent started the second iteration typing \"2021 NBA champion\" in google search." 684 | ] 685 | }, 686 | { 687 | "cell_type": "markdown", 688 | "metadata": {}, 689 | "source": [ 690 | "Will it handle our math question?" 
691 | ] 692 | }, 693 | { 694 | "cell_type": "code", 695 | "execution_count": 24, 696 | "metadata": {}, 697 | "outputs": [ 698 | { 699 | "name": "stdout", 700 | "output_type": "stream", 701 | "text": [ 702 | "\n", 703 | "\n", 704 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 705 | "\u001b[32;1m\u001b[1;3mQuestion: Who is the NBA all-time leading scorer? What's his total points to the power of 0.42?\n", 706 | "Thought: I can use the Search tool to find the answer to the first question and then calculate the second part using a calculator.\n", 707 | "Action:\n", 708 | "```\n", 709 | "{\n", 710 | " \"action\": \"Search\",\n", 711 | " \"action_input\": \"NBA all-time leading scorer\"\n", 712 | "}\n", 713 | "```\n", 714 | "\u001b[0m\n", 715 | "Observation: \u001b[36;1m\u001b[1;3mLeBron James · 38,652\u001b[0m\n", 716 | "Thought:\u001b[32;1m\u001b[1;3mI have found that LeBron James is the NBA all-time leading scorer with 38,652 points. Now I can calculate his total points to the power of 0.42 using a calculator.\n", 717 | "Action:\n", 718 | "```\n", 719 | "{\n", 720 | " \"action\": \"Search\",\n", 721 | " \"action_input\": \"38652^0.42\"\n", 722 | "}\n", 723 | "```\n", 724 | "\n", 725 | "\u001b[0m\n", 726 | "Observation: \u001b[36;1m\u001b[1;3m84.45244506356968\u001b[0m\n", 727 | "Thought:\u001b[32;1m\u001b[1;3mI have calculated that LeBron James' total points to the power of 0.42 is approximately 84.45244506356968.\n", 728 | "Final Answer: LeBron James is the NBA all-time leading scorer with 38,652 points and his total points to the power of 0.42 is approximately 84.45244506356968.\u001b[0m\n", 729 | "\n", 730 | "\u001b[1m> Finished chain.\u001b[0m\n" 731 | ] 732 | }, 733 | { 734 | "data": { 735 | "text/plain": [ 736 | "'LeBron James is the NBA all-time leading scorer with 38,652 points and his total points to the power of 0.42 is approximately 84.45244506356968.'" 737 | ] 738 | }, 739 | "execution_count": 24, 740 | "metadata": {}, 741 | "output_type": "execute_result" 742 | } 743 | ], 744 | "source": [ 745 | "agent.run(\"Who is the NBA all-time leading scorer? What's his total points to the power of 0.42?\")" 746 | ] 747 | }, 748 | { 749 | "cell_type": "markdown", 750 | "metadata": {}, 751 | "source": [ 752 | "Awesome! Everything is great here!" 753 | ] 754 | }, 755 | { 756 | "cell_type": "markdown", 757 | "metadata": {}, 758 | "source": [ 759 | "### Creating GPT-4 AI Agent\n", 760 | "\n", 761 | "Let's test the same but for GPT-4. Let's try to add the calculator as a tool as well." 762 | ] 763 | }, 764 | { 765 | "cell_type": "code", 766 | "execution_count": 3, 767 | "metadata": {}, 768 | "outputs": [], 769 | "source": [ 770 | "from langchain.utilities import SerpAPIWrapper\n", 771 | "from langchain.agents import initialize_agent, Tool, AgentType, load_tools\n", 772 | "from langchain.chat_models import ChatOpenAI\n", 773 | "\n", 774 | "search = SerpAPIWrapper()\n", 775 | "\n", 776 | "gpt4 = ChatOpenAI(model=\"gpt-4\", temperature=0)\n", 777 | "\n", 778 | "# create the serp tool\n", 779 | "serp_tool = Tool(\n", 780 | " name=\"Search\",\n", 781 | " func=search.run,\n", 782 | " description=\"useful for when you need to answer questions about current events. 
You should ask targeted questions\",\n", 783 | " )\n", 784 | "\n", 785 | "# Initialize tools with calculator and the model\n", 786 | "gpt4_tools = load_tools([\"llm-math\"], llm=gpt4)\n", 787 | "# add the serp tool\n", 788 | "gpt4_tools = gpt4_tools + [serp_tool]" 789 | ] 790 | }, 791 | { 792 | "cell_type": "code", 793 | "execution_count": 4, 794 | "metadata": {}, 795 | "outputs": [ 796 | { 797 | "name": "stdout", 798 | "output_type": "stream", 799 | "text": [ 800 | "\n", 801 | "\n", 802 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 803 | "\u001b[32;1m\u001b[1;3mThought: I need to find out the current FIFA men's world champion. I can use the Search tool to find this information.\n", 804 | "Action:\n", 805 | "```\n", 806 | "{\n", 807 | " \"action\": \"Search\",\n", 808 | " \"action_input\": \"Current FIFA men's world champion\"\n", 809 | "}\n", 810 | "```\u001b[0m\n", 811 | "Observation: \u001b[33;1m\u001b[1;3mThe reigning champions are Argentina, who won their third title at the 2022 tournament.\u001b[0m\n", 812 | "Thought:\u001b[32;1m\u001b[1;3mI now know the final answer.\n", 813 | "Final Answer: The current FIFA men's world champion is Argentina.\u001b[0m\n", 814 | "\n", 815 | "\u001b[1m> Finished chain.\u001b[0m\n" 816 | ] 817 | }, 818 | { 819 | "data": { 820 | "text/plain": [ 821 | "\"The current FIFA men's world champion is Argentina.\"" 822 | ] 823 | }, 824 | "execution_count": 4, 825 | "metadata": {}, 826 | "output_type": "execute_result" 827 | } 828 | ], 829 | "source": [ 830 | "# initialize GPT-4 Agent\n", 831 | "gpt4agent = initialize_agent(\n", 832 | " gpt4_tools, \n", 833 | " ChatOpenAI(model=\"gpt-4\", temperature=0), \n", 834 | " agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, \n", 835 | " verbose=True, \n", 836 | " # handle_parsing_errors=\"Check your output and make sure it conforms!\"\n", 837 | " )\n", 838 | "\n", 839 | "gpt4agent.run(\"Who is the Current FIFA men's world champion?\")" 840 | ] 841 | }, 842 | { 843 | "cell_type": "code", 844 | "execution_count": 5, 845 | "metadata": {}, 846 | "outputs": [ 847 | { 848 | "name": "stdout", 849 | "output_type": "stream", 850 | "text": [ 851 | "\n", 852 | "\n", 853 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 854 | "\u001b[32;1m\u001b[1;3mThought: I need to find out who the current NBA champion is. 
I can use the Search tool to find this information.\n", 855 | "Action:\n", 856 | "```\n", 857 | "{\n", 858 | " \"action\": \"Search\",\n", 859 | " \"action_input\": \"current NBA champion\"\n", 860 | "}\n", 861 | "```\u001b[0m\n", 862 | "Observation: \u001b[33;1m\u001b[1;3mDenver Nuggets\u001b[0m\n", 863 | "Thought:\u001b[32;1m\u001b[1;3mThe search results show that the current NBA champion is the Denver Nuggets.\n", 864 | "Final Answer: The current NBA champion is the Denver Nuggets.\u001b[0m\n", 865 | "\n", 866 | "\u001b[1m> Finished chain.\u001b[0m\n" 867 | ] 868 | }, 869 | { 870 | "data": { 871 | "text/plain": [ 872 | "'The current NBA champion is the Denver Nuggets.'" 873 | ] 874 | }, 875 | "execution_count": 5, 876 | "metadata": {}, 877 | "output_type": "execute_result" 878 | } 879 | ], 880 | "source": [ 881 | "gpt4agent.run(\"Who is the current NBA champion?\")" 882 | ] 883 | }, 884 | { 885 | "cell_type": "code", 886 | "execution_count": 6, 887 | "metadata": {}, 888 | "outputs": [ 889 | { 890 | "name": "stdout", 891 | "output_type": "stream", 892 | "text": [ 893 | "\n", 894 | "\n", 895 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 896 | "\u001b[32;1m\u001b[1;3mThought: I need to find out who the NBA all-time leading scorer is and how many total points he scored. I will use the Search tool to find this information.\n", 897 | "Action:\n", 898 | "```\n", 899 | "{\n", 900 | " \"action\": \"Search\",\n", 901 | " \"action_input\": \"NBA all-time leading scorer and total points\"\n", 902 | "}\n", 903 | "```\u001b[0m\n", 904 | "Observation: \u001b[33;1m\u001b[1;3mLeBron James points tracker: NBA's all-time top scorer closing in on 47,000 career points. Team USA's two-time Olympic gold medallist broke the career NBA regular season points scoring record held by Kareem Abdul-Jabbar. Basketball hero LeBron James is now the NBA's all-time top scorer.\u001b[0m\n", 905 | "Thought:\u001b[32;1m\u001b[1;3mThe NBA all-time leading scorer is LeBron James. However, the total points mentioned is not specific and seems to be an approximation. I need to find the exact total points scored by LeBron James. I will use the Search tool again to find this information.\n", 906 | "Action:\n", 907 | "```\n", 908 | "{\n", 909 | " \"action\": \"Search\",\n", 910 | " \"action_input\": \"LeBron James total NBA points\"\n", 911 | "}\n", 912 | "```\u001b[0m\n", 913 | "Observation: \u001b[33;1m\u001b[1;3m38,652\u001b[0m\n", 914 | "Thought:\u001b[32;1m\u001b[1;3mLeBron James has scored a total of 38,652 points in the NBA. Now I need to calculate 38,652 to the power of 0.42. I will use the Calculator tool for this.\n", 915 | "Action:\n", 916 | "```\n", 917 | "{\n", 918 | " \"action\": \"Calculator\",\n", 919 | " \"action_input\": \"38652^0.42\"\n", 920 | "}\n", 921 | "```\n", 922 | "\u001b[0m\n", 923 | "Observation: \u001b[36;1m\u001b[1;3mAnswer: 84.45244506356971\u001b[0m\n", 924 | "Thought:\u001b[32;1m\u001b[1;3mI now know the final answer. LeBron James is the NBA all-time leading scorer and 38,652 points to the power of 0.42 is approximately 84.45. 
\n", 925 | "Final Answer: The NBA all-time leading scorer is LeBron James and 38,652 points to the power of 0.42 is approximately 84.45.\u001b[0m\n", 926 | "\n", 927 | "\u001b[1m> Finished chain.\u001b[0m\n" 928 | ] 929 | }, 930 | { 931 | "data": { 932 | "text/plain": [ 933 | "'The NBA all-time leading scorer is LeBron James and 38,652 points to the power of 0.42 is approximately 84.45.'" 934 | ] 935 | }, 936 | "execution_count": 6, 937 | "metadata": {}, 938 | "output_type": "execute_result" 939 | } 940 | ], 941 | "source": [ 942 | "gpt4agent.run(\"Who is the NBA all-time leading scorer? What's his total points to the power of 0.42?\")" 943 | ] 944 | }, 945 | { 946 | "cell_type": "markdown", 947 | "metadata": {}, 948 | "source": [ 949 | "### Asking about the Volleyball\n", 950 | "\n", 951 | "Volleyball is the second or third most popular sport in Poland. But it's a niche sport in other countries. So I was curious how our models will handle the question about the European Champions.\n", 952 | "\n", 953 | "And this question has 1 more interesting part. European Championships took place in September 2021. Which is very close to the knowledge cut-off for our GPT models..." 954 | ] 955 | }, 956 | { 957 | "cell_type": "code", 958 | "execution_count": 25, 959 | "metadata": {}, 960 | "outputs": [ 961 | { 962 | "data": { 963 | "text/plain": [ 964 | "\"As of September 2021, the current men's volleyball European champion is Serbia. They won the title in 2019 by defeating Slovenia in the final.\"" 965 | ] 966 | }, 967 | "execution_count": 25, 968 | "metadata": {}, 969 | "output_type": "execute_result" 970 | } 971 | ], 972 | "source": [ 973 | "chat_llm.predict(\"Who is the current men's volleyball european champion?\")" 974 | ] 975 | }, 976 | { 977 | "cell_type": "markdown", 978 | "metadata": {}, 979 | "source": [ 980 | "Cool, GPT-3.5 answered with the 2019 champions. So it has no information about the 2021 results. How about GPT-4?" 981 | ] 982 | }, 983 | { 984 | "cell_type": "code", 985 | "execution_count": 26, 986 | "metadata": {}, 987 | "outputs": [ 988 | { 989 | "data": { 990 | "text/plain": [ 991 | "\"As of my last update in October 2021, the current men's volleyball European champion is Slovenia. They won the 2021 Men's European Volleyball Championship.\"" 992 | ] 993 | }, 994 | "execution_count": 26, 995 | "metadata": {}, 996 | "output_type": "execute_result" 997 | } 998 | ], 999 | "source": [ 1000 | "gpt4.predict(\"Who is the current men's volleyball european champion?\")" 1001 | ] 1002 | }, 1003 | { 1004 | "cell_type": "markdown", 1005 | "metadata": {}, 1006 | "source": [ 1007 | "Wrong!\n", 1008 | "\n", 1009 | "Slovenia has lost against Italy in the finals...\n", 1010 | "\n", 1011 | "Can our Agent find the correct and updated answer??" 
1012 | ] 1013 | }, 1014 | { 1015 | "cell_type": "code", 1016 | "execution_count": 27, 1017 | "metadata": {}, 1018 | "outputs": [ 1019 | { 1020 | "name": "stdout", 1021 | "output_type": "stream", 1022 | "text": [ 1023 | "\n", 1024 | "\n", 1025 | "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", 1026 | "\u001b[32;1m\u001b[1;3mThought: I can use the Search tool to find the current men's volleyball European champion.\n", 1027 | "\n", 1028 | "Action:\n", 1029 | "```\n", 1030 | "{\n", 1031 | " \"action\": \"Search\",\n", 1032 | " \"action_input\": \"current men's volleyball European champion\"\n", 1033 | "}\n", 1034 | "```\u001b[0m\n", 1035 | "Observation: \u001b[36;1m\u001b[1;3mThe initial gap between championships was variable, but since 1975 they have been awarded every two years. The current champion is Poland, which won its second title at the 2023 tournament.\u001b[0m\n", 1036 | "Thought:\u001b[32;1m\u001b[1;3mThe current men's volleyball European champion is Poland.\n", 1037 | "\n", 1038 | "Final Answer: Poland\u001b[0m\n", 1039 | "\n", 1040 | "\u001b[1m> Finished chain.\u001b[0m\n" 1041 | ] 1042 | }, 1043 | { 1044 | "data": { 1045 | "text/plain": [ 1046 | "'Poland'" 1047 | ] 1048 | }, 1049 | "execution_count": 27, 1050 | "metadata": {}, 1051 | "output_type": "execute_result" 1052 | } 1053 | ], 1054 | "source": [ 1055 | "agent.run(\"Who is the current men's volleyball european champion?\")" 1056 | ] 1057 | }, 1058 | { 1059 | "cell_type": "markdown", 1060 | "metadata": {}, 1061 | "source": [ 1062 | "Yes, it can!" 1063 | ] 1064 | }, 1065 | { 1066 | "cell_type": "markdown", 1067 | "metadata": {}, 1068 | "source": [ 1069 | "## Conclusions\n", 1070 | "\n", 1071 | "So I hope this notebook gave you some hands-on experience with AI Agents that you can easily build with LangChain." 1072 | ] 1073 | } 1074 | ], 1075 | "metadata": { 1076 | "kernelspec": { 1077 | "display_name": "langchain", 1078 | "language": "python", 1079 | "name": "python3" 1080 | }, 1081 | "language_info": { 1082 | "codemirror_mode": { 1083 | "name": "ipython", 1084 | "version": 3 1085 | }, 1086 | "file_extension": ".py", 1087 | "mimetype": "text/x-python", 1088 | "name": "python", 1089 | "nbconvert_exporter": "python", 1090 | "pygments_lexer": "ipython3", 1091 | "version": "3.9.17" 1092 | }, 1093 | "orig_nbformat": 4 1094 | }, 1095 | "nbformat": 4, 1096 | "nbformat_minor": 2 1097 | } 1098 | -------------------------------------------------------------------------------- /LangChainGuides/ChatWithObsidian.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "### Introduction\n", 8 | "\n", 9 | "In this notebook, we aim to harness the power of advanced machine learning techniques, combined with the utility of personal knowledge management systems, to tap into a 'second brain'. Through the integration of Obsidian (a popular note-taking app) and OpenAI's latest models, we'll achieve this.\n", 10 | "\n", 11 | "By the end of this notebook, you'll be able to:\n", 12 | "\n", 13 | "1. **Interact with Your Second Brain**: Understand how to use your personal notes in tandem with AI to answer complex queries.\n", 14 | "2. **Load Notes from Obsidian**: Utilize LangChain's `ObsidianLoader` to import your notes into a Python environment.\n", 15 | "3. **Create a Vector Database**: Leverage LangChain's `VectorstoreIndexCreator` to create a searchable database of your notes." 
16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "\n", 23 | "## Setup and Prerequisites\n", 24 | "### Installing Essential Packages\n", 25 | "\n", 26 | "To begin, let's make sure we have all the necessary Python packages installed. You can do this by running the command below:\n", 27 | "\n", 28 | "```bash\n", 29 | "pip install openai langchain python-dotenv chromadb tiktoken\n", 30 | "```" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "### Loading API Keys\n", 38 | "\n", 39 | "It's always a good practice to keep sensitive data like API keys separate from your main codebase. Here, we'll be loading them using the `dotenv` library." 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 1, 45 | "metadata": {}, 46 | "outputs": [ 47 | { 48 | "data": { 49 | "text/plain": [ 50 | "True" 51 | ] 52 | }, 53 | "execution_count": 1, 54 | "metadata": {}, 55 | "output_type": "execute_result" 56 | } 57 | ], 58 | "source": [ 59 | "import os\n", 60 | "from dotenv import load_dotenv\n", 61 | "\n", 62 | "# load_dotenv()\n", 63 | "\n", 64 | "# Get the absolute path of the current script\n", 65 | "script_dir = os.path.abspath(os.getcwd())\n", 66 | "\n", 67 | "# Get the absolute path of the parent directory\n", 68 | "parent_dir = os.path.join(script_dir, os.pardir)\n", 69 | "\n", 70 | "dotenv_path = os.path.join(parent_dir, '.env')\n", 71 | "# Load the .env file from the parent directory\n", 72 | "load_dotenv(dotenv_path)" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "metadata": {}, 78 | "source": [ 79 | "### Loading Notes from Obsidian\n", 80 | "\n", 81 | "With just a simple path, LangChain's `ObsidianLoader` can help us load our Obsidian notes into our Python environment." 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": 2, 87 | "metadata": {}, 88 | "outputs": [], 89 | "source": [ 90 | "from langchain.document_loaders import ObsidianLoader\n", 91 | "\n", 92 | "loader = ObsidianLoader('/home/kris/Documents/SmartNotes/SecondBrain/Reference Notes/')\n", 93 | "docs = loader.load()" 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": null, 99 | "metadata": {}, 100 | "outputs": [], 101 | "source": [ 102 | "len(docs)" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "metadata": {}, 109 | "outputs": [], 110 | "source": [ 111 | "# docs[:5]" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": 4, 117 | "metadata": {}, 118 | "outputs": [ 119 | { 120 | "name": "stdout", 121 | "output_type": "stream", 122 | "text": [ 123 | "📚 Show Your Work.md\n", 124 | "📚 Storyworthy.md\n", 125 | "📚 The Subtle Art of Not Giving a F.ck.md\n", 126 | "📚 On Writing Well.md\n", 127 | "📚 Atomic Habits.md\n" 128 | ] 129 | } 130 | ], 131 | "source": [ 132 | "for doc in docs[:5]:\n", 133 | " print(doc.metadata[\"source\"])" 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "### Setting up the Vector Database with Chroma\n", 141 | "\n", 142 | "This step involves transforming our notes into vector embeddings so that we can efficiently search through them. This transformation facilitates the identification of notes that are relevant to our queries." 
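, "\n", "Under the hood, the `VectorstoreIndexCreator` used in the next cell roughly wraps three steps: split the notes into chunks, embed each chunk with OpenAI embeddings, and store the vectors in a Chroma database. Here is a minimal sketch of those pieces (an assumption-level approximation using default settings and the same persist directory as below, not the exact internals):\n", "\n", "```python\n", "# Rough equivalent of VectorstoreIndexCreator; defaults may differ slightly\n", "from langchain.embeddings import OpenAIEmbeddings\n", "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", "from langchain.vectorstores import Chroma\n", "\n", "# Split the notes into smaller chunks before embedding\n", "splits = RecursiveCharacterTextSplitter().split_documents(docs)\n", "\n", "# Embed each chunk and persist the vectors in a local Chroma DB\n", "vectordb = Chroma.from_documents(splits, OpenAIEmbeddings(), persist_directory=\"./chroma/obsidian\")\n", "\n", "# A query gets embedded too, and the closest chunks come back\n", "vectordb.similarity_search(\"What are 4 laws of behavior change?\", k=4)\n", "```"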
143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": 5, 148 | "metadata": {}, 149 | "outputs": [], 150 | "source": [ 151 | "from langchain.indexes import VectorstoreIndexCreator\n", 152 | "\n", 153 | "index = VectorstoreIndexCreator(\n", 154 | " vectorstore_kwargs={\"persist_directory\": \"./chroma/obsidian\"}\n", 155 | " ).from_documents(docs)" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": {}, 161 | "source": [ 162 | "### Querying Your Second Brain\n", 163 | "\n", 164 | "This is where the magic happens:\n", 165 | "\n", 166 | "- **Vector Transformation**: Our query is turned into vector embeddings.\n", 167 | "- **Vector Database Search**: The vector database is searched to identify relevant documents.\n", 168 | "- **Query Processing with LLM**: The query and the relevant documents are sent to our Large Language Model (LLM) for processing.\n", 169 | "- **Obtaining the Answer**: The LLM returns the final `answer`, together with its sources." 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": 6, 175 | "metadata": {}, 176 | "outputs": [ 177 | { 178 | "data": { 179 | "text/plain": [ 180 | "{'question': 'What are 4 laws of behavior change?',\n", 181 | " 'answer': ' The four laws of behavior change are: 1. Make it Obvious, 2. Make it Attractive, 3. Make it Easy, and 4. Make it Satisfying.\\n',\n", 182 | " 'sources': '📚 Atomic Habits.md'}" 183 | ] 184 | }, 185 | "execution_count": 6, 186 | "metadata": {}, 187 | "output_type": "execute_result" 188 | } 189 | ], 190 | "source": [ 191 | "query = \"What are 4 laws of behavior change?\"\n", 192 | "# expecting Atomic Habits\n", 193 | "index.query_with_sources(query)" 194 | ] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "metadata": {}, 199 | "source": [ 200 | "#### A Helper Function for Displaying the Results" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": 7, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "def display_answer(answer):\n", 210 | " print(\"Sources: \", answer[\"sources\"], \"\\n\\n\", answer[\"answer\"])" 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": 8, 216 | "metadata": {}, 217 | "outputs": [ 218 | { 219 | "name": "stdout", 220 | "output_type": "stream", 221 | "text": [ 222 | "Sources: 📚 $100M Offers.md \n", 223 | "\n", 224 | " The key takeaways from $100M Offers are:\n", 225 | "1. Sell your products based on VALUE not PRICE.\n", 226 | "2. Focus on the value you're providing.\n", 227 | "3. Create the \"Category of One.\"\n", 228 | "4. Sell in a vacuum.\n", 229 | "5. Three levers on Success: Market > Offer > Copywriting.\n", 230 | "\n" 231 | ] 232 | } 233 | ], 234 | "source": [ 235 | "query = \"Give me 5 key takeaways from $100M Offers\"\n", 236 | "display_answer(index.query_with_sources(query))" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": {}, 242 | "source": [ 243 | "Hallucinations?"
244 | ] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "execution_count": 9, 249 | "metadata": {}, 250 | "outputs": [ 251 | { 252 | "name": "stdout", 253 | "output_type": "stream", 254 | "text": [ 255 | "Sources: https://www.healthline.com/nutrition/10-brain-foods#section2\n", 256 | "https://www.webmd.com/diet/features/eat-smart-healthier-brain#1 \n", 257 | "\n", 258 | " Brain foods include foods high in omega-3 fatty acids, foods high in antioxidants, foods high in B vitamins, foods high in vitamin E, foods high in vitamin C, dark chocolate, blueberries, turmeric, green tea, and nuts.\n", 259 | "\n" 260 | ] 261 | } 262 | ], 263 | "source": [ 264 | "query = \"What are 10 brain foods?\"\n", 265 | "display_answer(index.query_with_sources(query))" 266 | ] 267 | }, 268 | { 269 | "cell_type": "markdown", 270 | "metadata": {}, 271 | "source": [ 272 | "Just wrong..." 273 | ] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "execution_count": 10, 278 | "metadata": {}, 279 | "outputs": [ 280 | { 281 | "name": "stdout", 282 | "output_type": "stream", 283 | "text": [ 284 | "Sources: 📚 Limitless.md, 📚 Building a Second Brain.md, 📚 Dopamine Nation.md, 📚 Keep Sharp.md \n", 285 | "\n", 286 | " 10 brain foods from Limitless are: exercise, nutrition, sleep, new learning, social interactions, interconectedness, neuroplasticity, FOMO, self-expression, and chasing what excites you.\n", 287 | "\n" 288 | ] 289 | } 290 | ], 291 | "source": [ 292 | "query = \"What are 10 brain foods from Limitless?\"\n", 293 | "display_answer(index.query_with_sources(query))" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": 11, 299 | "metadata": {}, 300 | "outputs": [ 301 | { 302 | "name": "stdout", 303 | "output_type": "stream", 304 | "text": [ 305 | "Sources: 📚 Keep Sharp.md \n", 306 | "\n", 307 | " The five pillars of a healthy brain are exercise, nutrition, sleep, new learning, and social interactions.\n", 308 | "\n" 309 | ] 310 | } 311 | ], 312 | "source": [ 313 | "query = \"What are the pillars of a healthy brain?\"\n", 314 | "# Expecting Keep Sharp\n", 315 | "display_answer(index.query_with_sources(query))" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": 12, 321 | "metadata": {}, 322 | "outputs": [ 323 | { 324 | "name": "stdout", 325 | "output_type": "stream", 326 | "text": [ 327 | "Sources: 📚 The Almanack Of Naval Ravikant.md \n", 328 | "\n", 329 | " According to Naval Ravikant, creating wealth involves finding and building specific knowledge, building or buying equity in a business, and finding a position of leverage.\n", 330 | "\n" 331 | ] 332 | } 333 | ], 334 | "source": [ 335 | "query = \"How to create wealth?\"\n", 336 | "display_answer(index.query_with_sources(query))" 337 | ] 338 | }, 339 | { 340 | "cell_type": "markdown", 341 | "metadata": {}, 342 | "source": [ 343 | "Simple, \"I don't know.\"" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": 13, 349 | "metadata": {}, 350 | "outputs": [ 351 | { 352 | "name": "stdout", 353 | "output_type": "stream", 354 | "text": [ 355 | "Sources: 📚 The Boron Letters.md, 📚 The Genius In All Of Us.md, 📚 Dopamine Nation.md \n", 356 | "\n", 357 | " Naval Ravikant does not mention raising kids.\n", 358 | "\n" 359 | ] 360 | } 361 | ], 362 | "source": [ 363 | "query = \"What does Naval Ravikant say about raising kids?\"\n", 364 | "display_answer(index.query_with_sources(query))" 365 | ] 366 | }, 367 | { 368 | "cell_type": "code", 369 | "execution_count": 14, 370 | "metadata": {}, 371 | "outputs": [ 
372 | { 373 | "name": "stdout", 374 | "output_type": "stream", 375 | "text": [ 376 | "Sources: 📚 The Subtle Art of Not Giving a F.ck.md, 📚 The Gap And The Gain.md, 📚 Dopamine Nation.md \n", 377 | "\n", 378 | " Hedonic adaptation is a phenomenon in which you achieve your goal only to feel unfulfilled and wanting more, and is caused by repeated exposure to the same or similar pleasure stimulus. It is also known as tolerance, and can lead to a dopamine deficit state.\n", 379 | "\n" 380 | ] 381 | } 382 | ], 383 | "source": [ 384 | "query = \"What is hedonic adaptation?\"\n", 385 | "display_answer(index.query_with_sources(query))" 386 | ] 387 | }, 388 | { 389 | "cell_type": "code", 390 | "execution_count": 15, 391 | "metadata": {}, 392 | "outputs": [ 393 | { 394 | "name": "stdout", 395 | "output_type": "stream", 396 | "text": [ 397 | "Sources: 📚 The Long Game.md, 📚 The Boron Letters.md, 📚 The 10X Rule.md \n", 398 | "\n", 399 | " People are often unsuccessful because they are afraid to set \"unrealistic\" goals, they fear failure, they are met with negativity from friends and family, and they depend on others instead of themselves.\n", 400 | "\n" 401 | ] 402 | } 403 | ], 404 | "source": [ 405 | "query = \"Why most people aren't successful?\"\n", 406 | "display_answer(index.query_with_sources(query))" 407 | ] 408 | }, 409 | { 410 | "cell_type": "markdown", 411 | "metadata": {}, 412 | "source": [ 413 | "### Final Words\n", 414 | "\n", 415 | "With this notebook, you now have a robust system to seamlessly query your personal knowledge base with the aid of state-of-the-art machine learning models. \n", 416 | "\n", 417 | "The convergence of your stored knowledge (in Obsidian) and real-time information retrieval (with OpenAI) empowers you to make well-informed decisions quickly." 418 | ] 419 | }, 420 | { 421 | "cell_type": "code", 422 | "execution_count": null, 423 | "metadata": {}, 424 | "outputs": [], 425 | "source": [] 426 | } 427 | ], 428 | "metadata": { 429 | "kernelspec": { 430 | "display_name": "langchain", 431 | "language": "python", 432 | "name": "python3" 433 | }, 434 | "language_info": { 435 | "codemirror_mode": { 436 | "name": "ipython", 437 | "version": 3 438 | }, 439 | "file_extension": ".py", 440 | "mimetype": "text/x-python", 441 | "name": "python", 442 | "nbconvert_exporter": "python", 443 | "pygments_lexer": "ipython3", 444 | "version": "3.9.17" 445 | }, 446 | "orig_nbformat": 4 447 | }, 448 | "nbformat": 4, 449 | "nbformat_minor": 2 450 | } 451 | -------------------------------------------------------------------------------- /LangChainGuides/ChatWithPodcast.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Ask the Expert: How to Chat with a Podcast\n", 8 | "Welcome to this interactive tutorial where we'll explore how to \"talk\" to our data using the power of Large Language Models (LLMs) like OpenAI's GPT-3.5-turbo and Vector Databases. 
\n", 9 | "\n", 10 | "This tutorial will guide you through an exciting process of transforming unstructured data (a PDF document in our case) into an interactive and smart knowledge base that can answer your questions!\n", 11 | "\n", 12 | "By following this tutorial, you will:\n", 13 | "\n", 14 | "- Learn how to load data from a PDF file and split it into smaller, manageable chunks.\n", 15 | "- Understand the concept of text embeddings and how we can utilize them to store our data in a Vector Database.\n", 16 | "- Discover how to ask questions and retrieve the most relevant information from your database.\n", 17 | "- Use a Large Language Model to generate answers based on the context we provide.\n", 18 | "\n", 19 | "You will experience the benefit of harnessing the power of language models and vector databases in extracting and utilizing information from large amounts of text data. \n", 20 | "\n", 21 | "The approach used in this tutorial can be applied to a wide range of tasks, from creating a smart Q&A system to building a personal digital assistant or even designing a conversational AI.\n", 22 | "\n", 23 | "\"Image" 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "#### Installing packages\n", 31 | "Before we start, you need to have a few Python libraries installed. You can install these libraries by running the following command in your terminal:\n", 32 | "```bash\n", 33 | "pip install openai langchain ipykernel python-dotenv chromadb pypdf tiktoken\n", 34 | "```" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": {}, 40 | "source": [ 41 | "### Loading OpenAI API Key\n", 42 | "\n", 43 | "We'll need it for Embeddings and the GPT-3.5 model." 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": 1, 49 | "metadata": {}, 50 | "outputs": [ 51 | { 52 | "data": { 53 | "text/plain": [ 54 | "True" 55 | ] 56 | }, 57 | "execution_count": 1, 58 | "metadata": {}, 59 | "output_type": "execute_result" 60 | } 61 | ], 62 | "source": [ 63 | "import os\n", 64 | "from dotenv import load_dotenv\n", 65 | "\n", 66 | "load_dotenv()" 67 | ] 68 | }, 69 | { 70 | "cell_type": "markdown", 71 | "metadata": {}, 72 | "source": [ 73 | "### Step 1: Loading PDFs\n", 74 | "In this step, we're using the PyPDFLoader class from the LangChain library to load our PDF file into standardized Document format.\n", 75 | "\n" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 2, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "from langchain.document_loaders import PyPDFLoader\n", 85 | "\n", 86 | "# create an instance of PyPDFLoader with the target PDF file\n", 87 | "loader = PyPDFLoader(\"transcripts/PT693-Transcript.pdf\")\n", 88 | "\n", 89 | "# load the PDF file into a variable named 'docs'\n", 90 | "docs = loader.load()" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 3, 96 | "metadata": {}, 97 | "outputs": [ 98 | { 99 | "data": { 100 | "text/plain": [ 101 | "(43, langchain.schema.document.Document)" 102 | ] 103 | }, 104 | "execution_count": 3, 105 | "metadata": {}, 106 | "output_type": "execute_result" 107 | } 108 | ], 109 | "source": [ 110 | "len(docs), type(docs[0])" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "### Step 2: Splitting Pages\n", 118 | "Next, we're using a `RecursiveCharacterTextSplitter` to break down the content of the PDF into smaller chunks.\n", 119 | "\n", 120 | "We define a chunk size and overlap to decide how 
large each slice should be and how much they should overlap with each other." 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 4, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", 130 | "\n", 131 | "# Constants: Define constants used in the code.\n", 132 | "CHUNK_SIZE = 1200\n", 133 | "CHUNK_OVERLAP = 200\n", 134 | "\n", 135 | "# Create an instance of RecursiveCharacterTextSplitter with the desired chunk size and overlap\n", 136 | "r_splitter = RecursiveCharacterTextSplitter(\n", 137 | " chunk_size=CHUNK_SIZE,\n", 138 | " chunk_overlap=CHUNK_OVERLAP, \n", 139 | " length_function=len # function used to measure chunk size\n", 140 | ")\n", 141 | "\n", 142 | "# split documents into smaller chunks\n", 143 | "splits = r_splitter.split_documents(docs)\n" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": 5, 149 | "metadata": {}, 150 | "outputs": [ 151 | { 152 | "data": { 153 | "text/plain": [ 154 | "84" 155 | ] 156 | }, 157 | "execution_count": 5, 158 | "metadata": {}, 159 | "output_type": "execute_result" 160 | } 161 | ], 162 | "source": [ 163 | "len(splits)" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": 6, 169 | "metadata": {}, 170 | "outputs": [ 171 | { 172 | "name": "stdout", 173 | "output_type": "stream", 174 | "text": [ 175 | "103\n", 176 | "1148\n", 177 | "960\n", 178 | "1157\n", 179 | "1014\n", 180 | "1193\n", 181 | "1002\n", 182 | "1161\n", 183 | "1091\n", 184 | "1152\n" 185 | ] 186 | } 187 | ], 188 | "source": [ 189 | "for doc in splits[:10]:\n", 190 | " print(len(doc.page_content))" 191 | ] 192 | }, 193 | { 194 | "cell_type": "markdown", 195 | "metadata": {}, 196 | "source": [ 197 | "### Step 3: Creating Embeddings for the Splits\n", 198 | "\n", 199 | "**Computers don't understand words. They only understand numbers.**\n", 200 | "\n", 201 | "Embeddings are a way of converting text into a numerical form that a machine can understand.\n", 202 | "\n", 203 | "Imagine trying to describe a movie scene to a friend - you would use words to describe what's happening, the mood, the characters, etc. In a similar way, embeddings capture the essence of the text, but in a format that the machine can work with, like a numerical vector.\n", 204 | "\n", 205 | "\n", 206 | "\n", 207 | "To create the embeddings, we're using OpenAIEmbeddings from LangChain. 
\n", 208 | "\n", 209 | "\"Image\n" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": 7, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "from langchain.embeddings.openai import OpenAIEmbeddings\n", 219 | "\n", 220 | "# Create an instance of OpenAIEmbeddings for embedding the chunks\n", 221 | "embedding = OpenAIEmbeddings()" 222 | ] 223 | }, 224 | { 225 | "cell_type": "markdown", 226 | "metadata": {}, 227 | "source": [ 228 | "#### Similarity Search and Cosine Similarity\n", 229 | "\n" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": 8, 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [ 238 | "sent1 = \"I love dogs\"\n", 239 | "sent2 = \"I love cats\"\n", 240 | "sent3 = \"Yesterday I played basketball\"\n", 241 | "sent4 = \"Yesterday I played football\"\n", 242 | "sent5 = \"Leonardo Di Caprio is an underrated actor\"" 243 | ] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": 9, 248 | "metadata": {}, 249 | "outputs": [], 250 | "source": [ 251 | "embedding1 = embedding.embed_query(sent1)\n", 252 | "embedding2 = embedding.embed_query(sent2)\n", 253 | "embedding3 = embedding.embed_query(sent3)\n", 254 | "embedding4 = embedding.embed_query(sent4)\n", 255 | "embedding5 = embedding.embed_query(sent5)" 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": 10, 261 | "metadata": {}, 262 | "outputs": [ 263 | { 264 | "name": "stdout", 265 | "output_type": "stream", 266 | "text": [ 267 | "0.9113747123840847\n", 268 | "0.9424351991605048\n", 269 | "---\n", 270 | "0.7609000588250058\n", 271 | "0.7659245559775505\n", 272 | "0.7612783173640613\n", 273 | "0.745260925638238\n", 274 | "0.7202260133268322\n" 275 | ] 276 | } 277 | ], 278 | "source": [ 279 | "import numpy as np\n", 280 | "\n", 281 | "def cosine_similarity(vec1, vec2):\n", 282 | " # Compute the dot product of vec1 and vec2\n", 283 | " dot_product = np.dot(vec1, vec2)\n", 284 | " \n", 285 | " # Compute the L2 norms (or magnitudes) of vec1 and vec2\n", 286 | " norm_vec1 = np.linalg.norm(vec1)\n", 287 | " norm_vec2 = np.linalg.norm(vec2)\n", 288 | " \n", 289 | " # Compute the cosine similarity\n", 290 | " cos_sim = dot_product / (norm_vec1 * norm_vec2)\n", 291 | " \n", 292 | " return cos_sim\n", 293 | "\n", 294 | "# Assuming vec1 and vec2 are your embeddings\n", 295 | "vec1 = np.array(embedding1)\n", 296 | "vec2 = np.array(embedding2)\n", 297 | "vec3 = np.array(embedding3)\n", 298 | "vec4 = np.array(embedding4)\n", 299 | "vec5 = np.array(embedding5)\n", 300 | "\n", 301 | "# More similar\n", 302 | "print(cosine_similarity(vec1, vec2))\n", 303 | "print(cosine_similarity(vec3, vec4))\n", 304 | "\n", 305 | "print(\"---\")\n", 306 | "# Less similar\n", 307 | "print(cosine_similarity(vec1, vec3))\n", 308 | "print(cosine_similarity(vec2, vec3))\n", 309 | "print(cosine_similarity(vec2, vec4))\n", 310 | "print(cosine_similarity(vec2, vec5))\n", 311 | "print(cosine_similarity(vec4, vec5))\n" 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": {}, 317 | "source": [ 318 | "### Step 4: Storing them into a Vector Database\n", 319 | "\n", 320 | "A vector database is like a library for these numerical vectors. 
We store these vectors in a structured manner so we can search and retrieve them efficiently later on.\n", 321 | "\n", 322 | "Once we have the embeddings (the 'numerical' form of our text), we're storing them in a vector database using the Chroma class from LangChain.\n", 323 | "\n", 324 | "\"Image" 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": 11, 330 | "metadata": {}, 331 | "outputs": [], 332 | "source": [ 333 | "from langchain.vectorstores import Chroma\n", 334 | "\n", 335 | "# Define directory to persist the embeddings\n", 336 | "persist_directory = 'chroma/sds/'\n", 337 | "\n", 338 | "# Create an instance of Chroma with the documents, embeddings, and the persist directory\n", 339 | "vectordb = Chroma.from_documents(\n", 340 | " documents=splits,\n", 341 | " embedding=embedding,\n", 342 | " persist_directory=persist_directory\n", 343 | ")\n" 344 | ] 345 | }, 346 | { 347 | "cell_type": "markdown", 348 | "metadata": {}, 349 | "source": [ 350 | "### Step 5: Retrieving Relevant Documents\n", 351 | "After storing the embeddings in the vector database (Chroma), we want to retrieve the most relevant ones based on our question. \n", 352 | "\n", 353 | "This is similar to asking a librarian for the most relevant books based on the topic you're interested in.\n", 354 | "\n", 355 | "This is what the `similarity_search` method does. It takes our question and returns the most related documents from our database." 356 | ] 357 | }, 358 | { 359 | "cell_type": "code", 360 | "execution_count": 12, 361 | "metadata": {}, 362 | "outputs": [], 363 | "source": [ 364 | "# Create an instance of Chroma with the persist directory and embeddings\n", 365 | "vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding)" 366 | ] 367 | }, 368 | { 369 | "cell_type": "markdown", 370 | "metadata": {}, 371 | "source": [ 372 | "Use the created vector database to find the most similar documents to a given question" 373 | ] 374 | }, 375 | { 376 | "cell_type": "code", 377 | "execution_count": 13, 378 | "metadata": {}, 379 | "outputs": [], 380 | "source": [ 381 | "question = \"Give me all books from the episode\"\n", 382 | "\n", 383 | "# Retrieve similar documents to a given question using the vector database\n", 384 | "docs = vectordb.similarity_search(question, k=5)" 385 | ] 386 | }, 387 | { 388 | "cell_type": "code", 389 | "execution_count": 14, 390 | "metadata": {}, 391 | "outputs": [ 392 | { 393 | "data": { 394 | "text/plain": [ 395 | "[Document(page_content=\"Harpreet Sahota: 01:15:00 Yeah, yeah, yeah. The Manga Guide to Calculus and the \\nManga Guide to Linear Algebra. Super good. \\nJon Krohn: 01:15:06 Awesome. So, near the end of every episode I ask people \\nfor book recommendations, but you have just given us a \\nton. So, I think we've covered that question, unless you \\nhave any other books you'd like to add. \\nHarpreet Sahota: 01:15:17 You know, I used to, I used to, I have, I've traded, when I \\nwas recording the Artist of Data Science podcast, I read a \\nlot of books lik e, cause I had so many authors on and \\nsince I kind of put the podcast on hold for now, I spent \\nmost of my time reading research papers in the morning \\nwhenever I have free mornings. I have not read a book in \\nlike six months, sadly. But the one that I have c urrently \\njust gone back to rereading is Deep Work by Cal Newport. \\nI think that's a good book. Important book for people who \\nare in roles like ours that are knowledge -intensive and \\nrequire a lot of thinking. 
\\nJon Krohn: 01:15:55 Yeah. So, important to be abl e to carve out a little bit of \\ntime every day to be able to work deeply. It's absolutely\", metadata={'page': 39, 'source': 'transcripts/PT693-Transcript.pdf'}),\n", 396 | " Document(page_content=\"podcast directly by asking Professor Wiggins your burning \\nquestions on stage. \\n 01:19:25 All right, thanks to my colleagues at Nebula for \\nsupporting me while I create content like this \\nSuperDataSc ience episode for you. And thanks of course \\nto Ivana, Mario, Natalie, Serg, Sylvia, Zara, and Kirill on \\nthe SuperDataScience team for producing another eye -\\nopening episode for us today. For enabling that super \\nteam to create this free podcast for you, we'r e of course, \\ndeeply grateful to our sponsors. Please consider \\nsupporting the show by checking out our sponsors' links, \\nwhich you can find in the show notes. Finally, thanks of \\ncourse to you for listening. I'm so grateful to have you \\ntuning in and I hope I can continue to make episodes you \\nlove for years and years to come. Well, until next time, my \\nfriend, keep on rocking it out there and I'm looking\", metadata={'page': 41, 'source': 'transcripts/PT693-Transcript.pdf'}),\n", 397 | " Document(page_content='forward to enjoying another round of the \\nSuperDataScience podcast with you very soon.', metadata={'page': 42, 'source': 'transcripts/PT693-Transcript.pdf'}),\n", 398 | " Document(page_content=\"time investment. So, like he wrote this book, the Deep \\nLearning Illustrated Guide, huge, huge, massive book, \\nright. No math - \\nJon Krohn: 01:08:48 Deep Learning: A Visual Approach. \\nHarpreet Sahota: 01:08:49 Oh yes. Deep Learning: A Visual Approach. Yes, Deep \\nLearning Illustrated is another book I'm about to talk \\nabout. Deep Learning Illustrated Approach. Great book. \\nAnd then once you do that start le arning some PyTorch, \\nright. Just, you need to move away from SciKit -Learn. \\nGoing from SciKit -Learn to PyTorch is a bit of a mental \\nshift. But you know, Daniel Bourke, I'm not sure if you've \\ninterviewed him on your podcast or not, he's awesome. \\nHe's based o ut of Australia. Highly recommend him. \\n@mrdburke on Twitter. But he's got this Zero to Mastery \\nPyTorch course. Go through that because you're gonna \\nget a bit of intuition about what's happening under the \\nhood. Then you're getting your fingers on the keyboa rd, \\nyou're getting your hands dirty, you're coding, right? This\", metadata={'page': 35, 'source': 'transcripts/PT693-Transcript.pdf'}),\n", 399 | " Document(page_content=\"curriculum. But yeah, whether you are yeah, wanting to \\nget just get started on the underly ing calculus to \\nunderstand backprop or you want to jump to later videos \\nand get deep into the weeds on how backprop works \\nusing calculus principles and do it in a hands -on Python -\\nbased way. Yeah, I think I I it's my own resource, but I \\nthink it's good. \\nHarpreet Sahota: 01:14:32 I've linked to, linked to that resource many times, like it's \\na great YouTube course. And another kind of interesting \\nresource I like is there's like this a series of manga books \\nthat touch on a wide range of topics. I've got the ent ire \\nset, but there's a book there on calculus and on linear \\nalgebra and they're like, you know, proper like comic \\nbooks, but it teaches you calculus and algebra. \\nJon Krohn: 01:14:58 Oh, manga. 
Oh really?\", metadata={'page': 38, 'source': 'transcripts/PT693-Transcript.pdf'})]" 400 | ] 401 | }, 402 | "execution_count": 14, 403 | "metadata": {}, 404 | "output_type": "execute_result" 405 | } 406 | ], 407 | "source": [ 408 | "docs" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": {}, 414 | "source": [ 415 | "Checking cosine similarity" 416 | ] 417 | }, 418 | { 419 | "cell_type": "code", 420 | "execution_count": 15, 421 | "metadata": {}, 422 | "outputs": [ 423 | { 424 | "name": "stdout", 425 | "output_type": "stream", 426 | "text": [ 427 | "0.7724149422983935\n", 428 | "0.7522458412266196\n", 429 | "0.7437202518308199\n", 430 | "0.7277780673958708\n", 431 | "0.7273885797063984\n" 432 | ] 433 | } 434 | ], 435 | "source": [ 436 | "q_emb = embedding.embed_query(question)\n", 437 | "q_vec = np.array(q_emb)\n", 438 | "\n", 439 | "for d in docs:\n", 440 | " emb = embedding.embed_query(d.page_content)\n", 441 | " vec = np.array(emb)\n", 442 | " cosine = cosine_similarity(q_vec, vec)\n", 443 | " print(cosine)" 444 | ] 445 | }, 446 | { 447 | "cell_type": "markdown", 448 | "metadata": {}, 449 | "source": [ 450 | "### Step 6: Generating the Answer\n", 451 | "In this step, we use our Large Language Model (LLM), to generate a response. \n", 452 | "\n", 453 | "We provide the model with 2 things:\n", 454 | "- our query\n", 455 | "- the most relevant documents retrieved in the previous step\n", 456 | "\n", 457 | "We use a PromptTemplate, which is a set of instructions for our LLM. It's like telling a story to a friend and then asking them a question about that story.\n", 458 | "\n", 459 | "In this case, the PromptTemplate instructs the LLM to use the documents (context) to answer the question at the end.\n", 460 | "\n", 461 | "\"Image" 462 | ] 463 | }, 464 | { 465 | "cell_type": "code", 466 | "execution_count": 16, 467 | "metadata": {}, 468 | "outputs": [], 469 | "source": [ 470 | "# Import the necessary classes from the langchain library.\n", 471 | "from langchain.chains import RetrievalQA\n", 472 | "from langchain.prompts import PromptTemplate\n", 473 | "from langchain.chat_models import ChatOpenAI\n", 474 | "\n", 475 | "# Define a prompt template. This is a format for the text input we'll give to our model.\n", 476 | "# It tells the model how to structure its response and what to do in different situations.\n", 477 | "template = \"\"\"I will provide you pieces of [Context] to answer the [Question]. \\\n", 478 | "If you don't know the answer based on [Context] just say that you don't know, don't try to make up an answer. \\\n", 479 | "[Context]: {context} \\\n", 480 | "[Question]: {question} \\\n", 481 | "Helpful Answer:\"\"\"\n", 482 | "\n", 483 | "# If your answer includes any sort of list, return it in bullets. \\\n", 484 | "# Format your answer to Markdown. \\\n", 485 | "\n", 486 | "# Create a PromptTemplate object from our string template.\n", 487 | "QA_CHAIN_PROMPT = PromptTemplate.from_template(template)\n", 488 | "\n", 489 | "# Initialize our language model. We're using OpenAI's GPT-3.5-turbo model here.\n", 490 | "llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n", 491 | "\n", 492 | "# Create a RetrievalQA object. This uses our language model (llm) and a retriever,\n", 493 | "# which is our vector database (vectordb). 
This object will handle asking our model questions\n", 494 | "# and retrieving relevant documents to help answer them.\n", 495 | "qa_chain = RetrievalQA.from_chain_type(\n", 496 | " llm,\n", 497 | " retriever=vectordb.as_retriever(),\n", 498 | " chain_type=\"stuff\",\n", 499 | " return_source_documents=True,\n", 500 | " chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT}\n", 501 | ")" 502 | ] 503 | }, 504 | { 505 | "cell_type": "code", 506 | "execution_count": 17, 507 | "metadata": {}, 508 | "outputs": [ 509 | { 510 | "name": "stdout", 511 | "output_type": "stream", 512 | "text": [ 513 | "Based on the context provided, here is a helpful answer to the question:\n", 514 | "\n", 515 | "To start your journey as an aspiring deep learning engineer, it is recommended to follow a top-down approach. This means focusing on applications and practical implementations of deep learning rather than getting overwhelmed by the mathematical equations initially. Look for courses or resources that provide hands-on experience and showcase real-world examples of deep learning in action. \n", 516 | "\n", 517 | "Here are some steps you can take:\n", 518 | "\n", 519 | "1. Start with a course like the one mentioned in the context, specifically designed for people who are comfortable with statistics, math, Python programming, and classical machine learning. This course will provide a structured introduction to deep learning, starting from pre-deep learning methods and gradually progressing towards more advanced concepts.\n", 520 | "\n", 521 | "2. Once you have a basic understanding, explore and experiment with popular deep learning models and frameworks. For computer vision, you can try YOLO-NAS for image classification and ChatGPT or other language models for natural language processing tasks. Interact with these models and try building something cool with them.\n", 522 | "\n", 523 | "3. To gain deeper insights, consider resources like Andrew Glassner's deep learning crash course, which provides proper intuition for understanding how deep learning works. Another recommendation is the book \"Deep Learning: A Visual Approach\" by Jon Krohn, which offers a visual perspective on deep learning concepts.\n", 524 | "\n", 525 | "4. Join communities and engage with other deep learning enthusiasts. Surrounding yourself with people who have different levels of experience, from beginners to experts, can provide valuable insights and support.\n", 526 | "\n", 527 | "5. Finally, work on projects to apply your knowledge and gain practical experience. Platforms like Kaggle offer datasets and competitions where you can participate and solve real-world problems using deep learning techniques.\n", 528 | "\n", 529 | "Remember, the key is to start with practical applications, gradually build your understanding, and continuously engage with the deep learning community. Good luck on your journey!\n" 530 | ] 531 | } 532 | ], 533 | "source": [ 534 | "question = \"I'm an aspiring deep learning engineer. How should I start?\"\n", 535 | "\n", 536 | "# Ask our question to the qa_chain, and store the result.\n", 537 | "result = qa_chain({\"query\": question})\n", 538 | "\n", 539 | "# Print out the result\n", 540 | "print(result[\"result\"])\n" 541 | ] 542 | }, 543 | { 544 | "cell_type": "code", 545 | "execution_count": 18, 546 | "metadata": {}, 547 | "outputs": [ 548 | { 549 | "data": { 550 | "text/plain": [ 551 | "[Document(page_content=\"Harpreet Sahota: 00:36:11 Yeah, yeah. 
Doing it on LinkedIn Learning, it's, it's, it's \\ngonna be a cool course. So, like it, the audience for this \\ncourse are people who are like me before I got into deep \\nlearning. So, if you're comfortable with statistics, math, \\nPython programming, classical ML, if like, you're good \\nwith all that, and you're like looking at this deep learning \\nthing and wondering like, okay, how, how can I get into \\nthis? Then this is the course that I made for you. I made \\nit for an earlier version of me. And it goes through, like I \\nstart with like a history of computer vision for im age \\nclassification, and I talk about, you know, important \\nconcepts like the things that I felt I needed to understand \\nbefore I got into deep learning. So, I kind of structured it \\nthat way. I start from pre -deep learning methods, just \\nbriefly touch on those .\", metadata={'page': 19, 'source': 'transcripts/PT693-Transcript.pdf'}),\n", 552 | " Document(page_content=\"know, pick up YOLO -NAS and run, run it on some image, \\nsee the power of it. Open up, you know, ChatGPT or any \\nof the other language models and start playing with it, \\nlike interacting with it. Start interacting with models, \\ntrying to build something with it, trying to do cool stuff \\nwith it, right? \\n 01:08:01 You know, open u p, learn some LangChain, and see what \\nyou can build, right? Just see the magic happen, get \\ninspired. Then once you're kind of inspired, right, if you \\nthink it's cool, right, some people probably won't think it's \\ncool, they'll just, you know, be like, okay, cool, whatever. \\nThat's fine. But if you think it's cool and you're interested, \\nthen dig a little bit deeper. And digging deeper, there's \\nlike a couple of places I recommend one of them: Andrew \\nGlassner has this deep learning crash course. It's like \\nthree and a half hours long. But it gives you just proper \\nintuition for how, how all this works. Very good return on \\ntime investment. So, like he wrote this book, the Deep \\nLearning Illustrated Guide, huge, huge, massive book, \\nright. No math - \\nJon Krohn: 01:08:48 Deep Learning: A Visual Approach.\", metadata={'page': 35, 'source': 'transcripts/PT693-Transcript.pdf'}),\n", 553 | " Document(page_content=\"to more refined use cases. T hat makes a lot of sense to \\nme. So obviously you're learning a lot about deep learning \\nin particular, and I know that you have a particular \\nphilosophy of learning deep learning that you describe as \\ntop-down. Do you want to describe for our listeners what \\nthat means and why it might also be the way that they \\nshould be learning complex concepts like deep learning? \\nHarpreet Sahota: 01:07:08 Yeah, yeah. So, I'll, I'll preface this by saying that like, \\nyou know, I've got a master's in mathematics and \\nstatistics. I was a biostatistician, I was the actuary, I was \\na data scientist. So, there is, this is coming from that \\nperspective, but even with, as somebody who has that \\nbackground, like my approach is to just skip the math. \\nFirst, skip the math, right? Ignore it w hen you're starting \\nout, because looking at equations is gonna demotivate \\nyou, right? So, what I instead implore people to do is just \\nlook for applications of deep learning, right? So, you\", metadata={'page': 34, 'source': 'transcripts/PT693-Transcript.pdf'}),\n", 554 | " Document(page_content=\"Neural Network Zero to Hero on YouTube. Great, great \\nresource for that. 
You actually end up building you end \\nup building like a mini version of PyTorch. I think he calls \\nit like minigrad or something like that. But it's amazing. \\nIt's great. \\n 01:10:47 Once you've done that, then get an understanding of \\nmore foundational architectures. You could, you know, \\nonce my LinkedIn learning course is out, go through that, \\ngo through some of the foundational computer vision \\narchitectures. Yan nic Kilcher has a great YouTube series \\non classical papers. It breaks it down in an easy -to-\\nunderstand manner. And then just join some community, \\nyou know, be around other people who are into the same \\nstuff. You want to be around people who have a broad \\nrange of experience from learners to experts. And then \\nfinally just projects. Just do projects, get on Kaggle, do a\", metadata={'page': 36, 'source': 'transcripts/PT693-Transcript.pdf'})]" 555 | ] 556 | }, 557 | "execution_count": 18, 558 | "metadata": {}, 559 | "output_type": "execute_result" 560 | } 561 | ], 562 | "source": [ 563 | "result[\"source_documents\"]" 564 | ] 565 | }, 566 | { 567 | "cell_type": "markdown", 568 | "metadata": {}, 569 | "source": [ 570 | "## Bonus Section\n", 571 | "\n", 572 | "### Understanding Retrieval from Vector Databases\n", 573 | "Vector databases work like a magical library. \n", 574 | "\n", 575 | "When you ask a question, the database doesn't read through all the books (or in our case, document splits). Instead, it translates your question into a special language (the embeddings) and then finds the books that speak the same language the closest.\n", 576 | "\n", 577 | "These databases use a measure of similarity, such as cosine similarity to find the most related vectors (or embeddings). In our project, `similarity_search()` does exactly this - it finds the most similar vectors (embeddings) to the query vector using cosine similarity.\n", 578 | "\n", 579 | "If you think of vectors as arrows in space, cosine similarity measures the cosine of the angle between them. \n", 580 | "\n", 581 | "When the vectors are close to each other, pointing in almost the same direction, the cosine of the angle between them is close to 1, meaning they're very similar. On the contrary, if the vectors point in completely different directions, the cosine is close to -1, meaning they're very dissimilar.\n", 582 | "\n", 583 | "To capture the essence:\n", 584 | "- **vectors are close** (pointing in almost the same direction) -> high cosine similarity (close to 1) -> similar meaning\n", 585 | "- **vectors are far apart** (pointing in almost the opposite direction) -> low cosine similarity (close to -1) -> dissimilar meaning\n", 586 | "\n", 587 | "This is how vector databases can quickly find the most related documents to your question!\n", 588 | "\n", 589 | "\n", 590 | "### Context Length in Large Language Models\n", 591 | "When a language model reads text, it has a limit to how much it can remember at once. \n", 592 | "\n", 593 | "This limit is called the \"context length\".\n", 594 | "\n", 595 | "Imagine you're reading a very long story but you have a memory limit. If the story exceeds this limit, you start to forget the earlier parts as you read further. \n", 596 | "\n", 597 | "The same happens with language models. \n", 598 | "\n", 599 | "For the standard GPT-3.5, the context length is 4096 tokens (~3000 words). 
\n", 600 | "\n", 601 | "If a text exceeds this limit, the model can't remember the initial parts while processing the later parts.\n", 602 | "\n", 603 | "Context length matters because the quality of the response can significantly depend on the provided context. If important information is outside of the model's context length, it won't be able to reference it in its response.\n", 604 | "\n", 605 | "### Vector Databases are the modern solution for the Context Length limits\n", 606 | "\n", 607 | "We can't feed our Large Language Models with a 300-page PDF.\n", 608 | "\n", 609 | "The Context Length is too short. The model won't \"remember\" most of it.\n", 610 | "\n", 611 | "But we can feed our Vector Database with a 300-page PDF!\n", 612 | "\n", 613 | "Thanks to similarity search, we can retrieve ONLY the relevant chunks from our PDF.\n", 614 | "\n", 615 | "Then, we just take our query together with the chunks without exceeding the context length." 616 | ] 617 | }, 618 | { 619 | "cell_type": "markdown", 620 | "metadata": {}, 621 | "source": [] 622 | } 623 | ], 624 | "metadata": { 625 | "kernelspec": { 626 | "display_name": "venv", 627 | "language": "python", 628 | "name": "python3" 629 | }, 630 | "language_info": { 631 | "codemirror_mode": { 632 | "name": "ipython", 633 | "version": 3 634 | }, 635 | "file_extension": ".py", 636 | "mimetype": "text/x-python", 637 | "name": "python", 638 | "nbconvert_exporter": "python", 639 | "pygments_lexer": "ipython3", 640 | "version": "3.9.17" 641 | }, 642 | "orig_nbformat": 4 643 | }, 644 | "nbformat": 4, 645 | "nbformat_minor": 2 646 | } 647 | -------------------------------------------------------------------------------- /LangChainGuides/ContentIdeaGenerator.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Step-by-Step Guide for Creating a Content Idea Generator\n", 8 | "\n", 9 | "## Introduction\n", 10 | "In this Python notebook, we'll create a Content Idea Generator using LangChain and the OpenAI API. \n", 11 | "\n", 12 | "This tool will summarize YouTube videos and then generate content ideas based on those summaries, taking into account:\n", 13 | "- specific information about the user\n", 14 | "- the target audience." 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "## Prerequisites\n", 22 | "To run this notebook, make sure you've installed requried packages:\n", 23 | "\n", 24 | "`pip install langchain openai gradio youtube-transcript-api pytube python-dotenv`\n" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "metadata": {}, 30 | "source": [ 31 | "## Step 1: Load Api keys .env\n", 32 | "\n", 33 | "First, let's load the .env file to get OPENAI_API_KEY" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": 1, 39 | "metadata": {}, 40 | "outputs": [], 41 | "source": [ 42 | "\n", 43 | "from dotenv import load_dotenv, find_dotenv\n", 44 | "\n", 45 | "_ = load_dotenv(find_dotenv())\n" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": {}, 51 | "source": [ 52 | "## Step 2: YouTube Transcript Loader Function\n", 53 | "Create a function that takes a YouTube URL, extracts the video transcript and title, and returns them.\n", 54 | "\n", 55 | "First, we need to extract the video ID from the YouTube URL because Langchain's `YoutubeLoader` requires a video ID to load the transcript." 
56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": 2, 61 | "metadata": {}, 62 | "outputs": [], 63 | "source": [ 64 | "from urllib.parse import urlparse, parse_qs\n", 65 | "\n", 66 | "def extract_video_id_from_url(url):\n", 67 | " \"\"\"\n", 68 | " Extract the YouTube video ID using urllib.parse.\n", 69 | " \"\"\"\n", 70 | " video_id = None\n", 71 | " parsed_url = urlparse(url)\n", 72 | " \n", 73 | " if \"youtube.com\" in parsed_url.netloc:\n", 74 | " parsed_query = parse_qs(parsed_url.query)\n", 75 | " video_id = parsed_query.get(\"v\", [None])[0]\n", 76 | " elif \"youtu.be\" in parsed_url.netloc:\n", 77 | " video_id = parsed_url.path[1:]\n", 78 | " \n", 79 | " return video_id\n" 80 | ] 81 | }, 82 | { 83 | "cell_type": "code", 84 | "execution_count": 3, 85 | "metadata": {}, 86 | "outputs": [], 87 | "source": [ 88 | "from typing import Optional, Tuple\n", 89 | "from langchain.document_loaders import YoutubeLoader\n", 90 | "\n", 91 | "def get_transcript_and_metadata(url: str) -> Tuple[Optional[str], Optional[str]]:\n", 92 | " \"\"\"\n", 93 | " Returns the transcript and title from a YouTube URL.\n", 94 | "\n", 95 | " Parameters:\n", 96 | " url (str): The YouTube URL from which the transcript and title will be extracted.\n", 97 | "\n", 98 | " Returns:\n", 99 | " transcript (str): The transcript of the video.\n", 100 | " title (str): The title of the video.\n", 101 | " \"\"\"\n", 102 | " try:\n", 103 | " vid_id = extract_video_id_from_url(url)\n", 104 | " loader = YoutubeLoader(vid_id, add_video_info=True)\n", 105 | " docs = loader.load()\n", 106 | " if docs:\n", 107 | " doc = docs[0]\n", 108 | " transcript = doc.page_content\n", 109 | " title = doc.metadata[\"title\"]\n", 110 | " return transcript, title\n", 111 | " else:\n", 112 | " return None, None\n", 113 | " except Exception as e:\n", 114 | " print(f\"Failed to load transcript and title from URL {url}: {e}\")\n", 115 | " return None, None\n" 116 | ] 117 | }, 118 | { 119 | "cell_type": "markdown", 120 | "metadata": {}, 121 | "source": [ 122 | "Testing the function" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": 4, 128 | "metadata": {}, 129 | "outputs": [], 130 | "source": [ 131 | "# get_transcript_and_metadata(\"https://youtu.be/Z6sCl6abJj4?si=627FWCed9VtYTcbR\")" 132 | ] 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "metadata": {}, 137 | "source": [ 138 | "## Step 3: Create Chain for Summary\n", 139 | "Use LangChain's Chain to create a chain that will summarize the YouTube transcript.\n", 140 | "\n", 141 | "- for summaries we'll use the `gpt-3.5-turbo-16k` model to handle longer transcripts\n", 142 | "- we'll also use a smaller temperature to increase reasoning\n", 143 | "\n", 144 | "\n", 145 | "\"Image" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": 5, 151 | "metadata": {}, 152 | "outputs": [], 153 | "source": [ 154 | "from langchain.chains import LLMChain\n", 155 | "from langchain.prompts import PromptTemplate\n", 156 | "from langchain.chat_models import ChatOpenAI\n", 157 | "\n", 158 | "llm_summary = ChatOpenAI(model_name='gpt-3.5-turbo-16k', temperature=.3)\n", 159 | "summary_template = \"\"\"Please summarize the following transcript in a form of a list with key takeaways.\\\n", 160 | "Tailor the summary for the person who is {info_about_me}.\\\n", 161 | "\n", 162 | "Transcript: {transcript}\n", 163 | "\"\"\"\n", 164 | "\n", 165 | "summary_prompt_template = PromptTemplate(input_variables=[\"transcript\", \"info_about_me\"], 
template=summary_template)\n", 166 | "summary_chain = LLMChain(llm=llm_summary, prompt=summary_prompt_template, output_key=\"summary\")\n" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "## Step 4: Create Chain for Idea Generation\n", 174 | "Create another Chain that will take the summary, your info, and your target audience info to generate content ideas." 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": 6, 180 | "metadata": {}, 181 | "outputs": [], 182 | "source": [ 183 | "llm_idea = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=.7)\n", 184 | "\n", 185 | "idea_template = \"\"\"Given the summarized content,\\\n", 186 | "and knowing that the creator is specialized in {info_about_me} and\\\n", 187 | "the target audience is interested in {info_about_audience},\\\n", 188 | "what are some content ideas that can be generated?\\\n", 189 | "Summary: {summary}\"\"\"\n", 190 | "\n", 191 | "idea_prompt_template = PromptTemplate(input_variables=[\"summary\", \"info_about_me\", \"info_about_audience\"], template=idea_template)\n", 192 | "idea_chain = LLMChain(llm=llm_idea, prompt=idea_prompt_template, output_key=\"content_ideas\")\n" 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": {}, 198 | "source": [ 199 | "## Step 5: Sequential Chain\n", 200 | "Create a SequentialChain that combines both the summary and idea generation Chains.\n", 201 | "\n", 202 | "Although we're using `SequentialChain`, our model is simple.\n", 203 | "\n", 204 | "\"Image" 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": 7, 210 | "metadata": {}, 211 | "outputs": [], 212 | "source": [ 213 | "from langchain.chains import SequentialChain\n", 214 | "\n", 215 | "overall_chain = SequentialChain(\n", 216 | " chains=[summary_chain, idea_chain],\n", 217 | " input_variables=[\"transcript\", \"info_about_me\", \"info_about_audience\"],\n", 218 | " output_variables=[\"summary\", \"content_ideas\"],\n", 219 | " verbose=True\n", 220 | ")\n" 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": {}, 226 | "source": [ 227 | "## Step 6: Gradio Interface with Additional Inputs\n", 228 | "Update the Gradio interface to include fields for entering information about you and your target audience." 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 8, 234 | "metadata": {}, 235 | "outputs": [], 236 | "source": [ 237 | "ABOUT_ME = \"\"\"An NLP Engineer with a background in Full-Stack Development,\\\n", 238 | "specialized in Large Language Models and Generative AI.\\\n", 239 | "Creates educational content and shares it on LinkedIn, YouTube and Medium.\"\"\"\n", 240 | "\n", 241 | "TARGET_AUDIENCE = \"\"\"Aspiring NLP engineers, data scientists, and tech enthusiasts who are interested in leveraging cutting-edge AI technologies.\\\n", 242 | "They look for practical guides and insights into building projects with Large Language Models.\"\"\"" 243 | ] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": 9, 248 | "metadata": {}, 249 | "outputs": [ 250 | { 251 | "name": "stderr", 252 | "output_type": "stream", 253 | "text": [ 254 | "/home/kris/anaconda3/envs/langchain/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", 255 | " from .autonotebook import tqdm as notebook_tqdm\n" 256 | ] 257 | }, 258 | { 259 | "name": "stdout", 260 | "output_type": "stream", 261 | "text": [ 262 | "Running on local URL: http://127.0.0.1:7860\n", 263 | "\n", 264 | "To create a public link, set `share=True` in `launch()`.\n" 265 | ] 266 | }, 267 | { 268 | "data": { 269 | "text/html": [ 270 | "
" 271 | ], 272 | "text/plain": [ 273 | "" 274 | ] 275 | }, 276 | "metadata": {}, 277 | "output_type": "display_data" 278 | }, 279 | { 280 | "name": "stdout", 281 | "output_type": "stream", 282 | "text": [ 283 | "Keyboard interruption in main thread... closing server.\n" 284 | ] 285 | }, 286 | { 287 | "data": { 288 | "text/plain": [] 289 | }, 290 | "execution_count": 9, 291 | "metadata": {}, 292 | "output_type": "execute_result" 293 | } 294 | ], 295 | "source": [ 296 | "import gradio as gr\n", 297 | "\n", 298 | "def execute_chain(url: str, info_about_me: str, info_about_audience: str):\n", 299 | " transcript = get_transcript_and_metadata(url)\n", 300 | " if transcript:\n", 301 | " inputs = {\n", 302 | " \"transcript\": transcript,\n", 303 | " \"info_about_me\": info_about_me,\n", 304 | " \"info_about_audience\": info_about_audience,\n", 305 | " }\n", 306 | " output = overall_chain(inputs)\n", 307 | " return output[\"summary\"], output[\"content_ideas\"]\n", 308 | " else:\n", 309 | " return \"Failed to load transcript.\", \"Cannot generate content ideas without transcript.\"\n", 310 | "\n", 311 | "demo = gr.Interface(\n", 312 | " fn=execute_chain,\n", 313 | " inputs=[\n", 314 | " \"text\",\n", 315 | " gr.Textbox(lines=4, value=ABOUT_ME, label=\"About Me\"),\n", 316 | " gr.Textbox(lines=2, value=TARGET_AUDIENCE, label=\"Target Audience\"),\n", 317 | " ],\n", 318 | " outputs=[\n", 319 | " gr.Textbox(label=\"Video Summary\"), \n", 320 | " gr.Textbox(label=\"Content Ideas\"),\n", 321 | " ],\n", 322 | ")\n", 323 | "\n", 324 | "demo.launch(debug=True)\n" 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": null, 330 | "metadata": {}, 331 | "outputs": [], 332 | "source": [] 333 | } 334 | ], 335 | "metadata": { 336 | "kernelspec": { 337 | "display_name": "langchain", 338 | "language": "python", 339 | "name": "python3" 340 | }, 341 | "language_info": { 342 | "codemirror_mode": { 343 | "name": "ipython", 344 | "version": 3 345 | }, 346 | "file_extension": ".py", 347 | "mimetype": "text/x-python", 348 | "name": "python", 349 | "nbconvert_exporter": "python", 350 | "pygments_lexer": "ipython3", 351 | "version": "3.9.17" 352 | }, 353 | "orig_nbformat": 4 354 | }, 355 | "nbformat": 4, 356 | "nbformat_minor": 2 357 | } 358 | -------------------------------------------------------------------------------- /LangChainGuides/PromptTemplates.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## LangChain Prompt Templates\n", 8 | "\n", 9 | "This notebook demonstrates how to use LangChain for creating chat-based prompt templates for the GPT-like language model. \n", 10 | "\n", 11 | "Utilizing these templates allows for dynamic and flexible interactions with large language models, ensuring we can harness their full potential for specific tasks.\n", 12 | "\n", 13 | "**Benefits of Using Prompt Templates**:\n", 14 | "1. *Consistency*: Same tasks can be executed across sessions with standardized prompts.\n", 15 | "2. *Efficiency*: Reduces the need for manually typing and formulating prompts.\n", 16 | "3. *Flexibility*: Easy to alter parameters to get varied responses.\n", 17 | "4. *Validation*: Helps in ensuring correct and expected variable inputs.\n", 18 | "5. *Model Agnostic*: Templates can be reused across different language models." 
19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "### Requirements\n", 26 | "To run this notebook successfully, make sure you've installed the following Python packages:\n", 27 | "\n", 28 | "- `langchain`: Provides the main functionality to interact with YouTube videos\n", 29 | "- `openai`: Allows us to use OpenAI LLM models like GPT-3.5\n", 30 | "- `python-dotenv`: Used to read the .env file containing the OpenAI API Key\n", 31 | "- `ipykernel`: Enables running this notebook in VSCode\n", 32 | "\n", 33 | "You can install all of these with a single pip command:\n", 34 | "\n", 35 | "```bash\n", 36 | "pip install langchain openai python-dotenv ipykernel\n", 37 | "```" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "### Loading API Key\n", 45 | "\n", 46 | "We need to load the OpenAI API key to utilize OpenAI's GPT models.\n", 47 | "\n", 48 | "*Note:* to run this code you'll need to sign up for an OpenAI API key and replace `YOUR_API_KEY` with your actual key." 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 1, 54 | "metadata": {}, 55 | "outputs": [ 56 | { 57 | "data": { 58 | "text/plain": [ 59 | "True" 60 | ] 61 | }, 62 | "execution_count": 1, 63 | "metadata": {}, 64 | "output_type": "execute_result" 65 | } 66 | ], 67 | "source": [ 68 | "import os\n", 69 | "from dotenv import load_dotenv\n", 70 | "\n", 71 | "# load_dotenv()\n", 72 | "\n", 73 | "# Get the absolute path of the current script\n", 74 | "script_dir = os.path.abspath(os.getcwd())\n", 75 | "\n", 76 | "# Get the absolute path of the parent directory\n", 77 | "parent_dir = os.path.join(script_dir, os.pardir)\n", 78 | "\n", 79 | "dotenv_path = os.path.join(parent_dir, '.env')\n", 80 | "# Load the .env file from the parent directory\n", 81 | "load_dotenv(dotenv_path)" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "### Imports & Initial Setup" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 2, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "from langchain.prompts import ChatPromptTemplate\n", 98 | "from langchain.chat_models import ChatOpenAI\n", 99 | "\n", 100 | "# Using the default GPT-3.5 model with temperature 0.7\n", 101 | "chat = ChatOpenAI()" 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": 3, 107 | "metadata": {}, 108 | "outputs": [ 109 | { 110 | "data": { 111 | "text/plain": [ 112 | "('gpt-3.5-turbo', 0.7)" 113 | ] 114 | }, 115 | "execution_count": 3, 116 | "metadata": {}, 117 | "output_type": "execute_result" 118 | } 119 | ], 120 | "source": [ 121 | "chat.model_name, chat.temperature" 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "### Tourist Guide Template\n", 129 | "\n", 130 | "This template provides a short trip guide based on:\n", 131 | "\n", 132 | "- **city_name**: Name of the city.\n", 133 | "- **interest**: The person's particular interest (e.g., art, history).\n", 134 | "- **stay_duration**: Duration of the person's stay (e.g., 3 days, a week).\n", 135 | "- **budget**: The person's budget (e.g., low, moderate, high).\n" 136 | ] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "execution_count": 4, 141 | "metadata": {}, 142 | "outputs": [ 143 | { 144 | "name": "stdout", 145 | "output_type": "stream", 146 | "text": [ 147 | "\n" 148 | ] 149 | } 150 | ], 151 | "source": [ 152 | "tourist_guide_string = \"\"\"Create a travel plan for in 
{city_name} \\\n", 153 | "for someone interested in {interest}, \\\n", 154 | "staying for {stay_duration}, \\\n", 155 | "and having a {budget} budget.\"\"\"\n", 156 | "\n", 157 | "tourist_guide_template = ChatPromptTemplate.from_template(tourist_guide_string)\n", 158 | "\n", 159 | "print(type(tourist_guide_template))" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": 5, 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "name": "stdout", 169 | "output_type": "stream", 170 | "text": [ 171 | "input_variables=['interest', 'city_name', 'stay_duration', 'budget'] output_parser=None partial_variables={} messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['budget', 'city_name', 'interest', 'stay_duration'], output_parser=None, partial_variables={}, template='Create a travel plan for in {city_name} for someone interested in {interest}, staying for {stay_duration}, and having a {budget} budget.', template_format='f-string', validate_template=True), additional_kwargs={})]\n" 172 | ] 173 | } 174 | ], 175 | "source": [ 176 | "print(tourist_guide_template)" 177 | ] 178 | }, 179 | { 180 | "cell_type": "code", 181 | "execution_count": 6, 182 | "metadata": {}, 183 | "outputs": [ 184 | { 185 | "name": "stdout", 186 | "output_type": "stream", 187 | "text": [ 188 | "[HumanMessage(content='Create a travel plan for in Paris for someone interested in art, staying for 3 days, and having a moderate budget.', additional_kwargs={}, example=False)]\n" 189 | ] 190 | } 191 | ], 192 | "source": [ 193 | "tourist_guide_messages = tourist_guide_template.format_messages(\n", 194 | " city_name=\"Paris\",\n", 195 | " interest=\"art\",\n", 196 | " stay_duration=\"3 days\",\n", 197 | " budget=\"moderate\",\n", 198 | ")\n", 199 | "\n", 200 | "print(tourist_guide_messages)" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": 7, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "response_travel = chat(tourist_guide_messages)\n" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": 8, 215 | "metadata": {}, 216 | "outputs": [ 217 | { 218 | "name": "stdout", 219 | "output_type": "stream", 220 | "text": [ 221 | "Day 1:\n", 222 | "- Morning: Start your day by visiting the Louvre Museum, which houses a vast collection of art from around the world, including the famous Mona Lisa. Consider purchasing tickets in advance to avoid long queues.\n", 223 | "- Afternoon: Explore the charming neighborhood of Montmartre, known for its bohemian atmosphere and artistic history. Visit the iconic Sacré-Cœur Basilica, stroll through the narrow streets, and discover local artists painting in Place du Tertre.\n", 224 | "- Evening: Enjoy dinner in a cozy bistro in Montmartre, savoring classic French cuisine and soaking in the artistic ambiance of the area.\n", 225 | "\n", 226 | "Day 2:\n", 227 | "- Morning: Head to the Musée d'Orsay, located in a former railway station, to admire its extensive collection of impressionist and post-impressionist masterpieces. Marvel at works by Monet, Renoir, Van Gogh, and many others.\n", 228 | "- Afternoon: Take a leisurely walk along the Seine River, crossing the Pont des Arts bridge adorned with love locks. Visit the Centre Pompidou, a modern art museum showcasing contemporary works, and enjoy panoramic views of Paris from its rooftop terrace.\n", 229 | "- Evening: Experience the vibrant nightlife of the Marais district. Explore its numerous art galleries, trendy boutiques, and stylish bars. 
Consider catching a show at one of the many theaters in the area.\n", 230 | "\n", 231 | "Day 3:\n", 232 | "- Morning: Visit the Musée de l'Orangerie, located in the beautiful Tuileries Garden, to admire Claude Monet's Water Lilies series. This tranquil museum provides a unique and immersive experience of Monet's extraordinary work.\n", 233 | "- Afternoon: Explore the artistic neighborhood of Saint-Germain-des-Prés, known for its literary and intellectual history. Visit the famous Café de Flore or Les Deux Magots, where famous writers and artists once gathered. Browse through bookshops, antique stores, and art galleries in the area.\n", 234 | "- Evening: End your trip with a cruise along the Seine River, enjoying breathtaking views of Paris illuminated at night. Consider booking a dinner cruise to combine a delicious meal with the sightseeing experience.\n", 235 | "\n", 236 | "Remember to check museum opening hours and consider purchasing a Paris Museum Pass if you plan to visit several museums, as it offers skip-the-line access and can help save money. Additionally, try local street food or opt for fixed-price menus in restaurants to manage your budget efficiently while enjoying delicious French cuisine.\n" 237 | ] 238 | } 239 | ], 240 | "source": [ 241 | "print(response_travel.content)" 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "metadata": {}, 247 | "source": [ 248 | "**Keep playing with it**\n", 249 | "\n", 250 | "Here are other potential `input_variables` you could try:\n", 251 | "```\n", 252 | "travel_variables = [\n", 253 | " {\"city_name\": \"Paris\", \"interest\": \"Renaissance art and museums\", \"stay_duration\": \"4 days\", \"budget\": \"3000\"},\n", 254 | " {\"city_name\": \"Tokyo\", \"interest\": \"modern technology and gadget shopping\", \"stay_duration\": \"1 week\", \"budget\": \"1500\"},\n", 255 | " {\"city_name\": \"Cairo\", \"interest\": \"ancient pyramids and Egyptian history\", \"stay_duration\": \"3 days\", \"budget\": \"500\"},\n", 256 | " {\"city_name\": \"Sydney\", \"interest\": \"coastal hikes and famous landmarks\", \"stay_duration\": \"5 days\", \"budget\": \"2500\"},\n", 257 | " {\"city_name\": \"Rio de Janeiro\", \"interest\": \"vibrant street festivals and samba dancing\", \"stay_duration\": \"2 days\", \"budget\": \"800\"},\n", 258 | " {\"city_name\": \"New York City\", \"interest\": \"Broadway shows and urban exploration\", \"stay_duration\": \"1 week\", \"budget\": \"2000\"},\n", 259 | " {\"city_name\": \"Bangkok\", \"interest\": \"street food markets and Thai culinary experiences\", \"stay_duration\": \"3 days\", \"budget\": \"1200\"},\n", 260 | " {\"city_name\": \"Venice\", \"interest\": \"gondola rides and historic architecture\", \"stay_duration\": \"4 days\", \"budget\": \"1800\"},\n", 261 | " {\"city_name\": \"Cape Town\", \"interest\": \"mountain hiking and scenic coastal views\", \"stay_duration\": \"1 week\", \"budget\": \"900\"},\n", 262 | " {\"city_name\": \"Beijing\", \"interest\": \"Imperial history and traditional Chinese culture\", \"stay_duration\": \"6 days\", \"budget\": \"2200\"}\n", 263 | "]\n", 264 | "```" 265 | ] 266 | }, 267 | { 268 | "cell_type": "markdown", 269 | "metadata": {}, 270 | "source": [ 271 | "### Story Writing Template\n", 272 | "Description: This template guides the writing of a story based on:\n", 273 | "\n", 274 | "- **word_limit**: Desired length of the story in words.\n", 275 | "- **writing_style**: Style of writing (e.g., narrative, descriptive).\n", 276 | "- **genre**: Genre of the story (e.g., 
mystery, romance).\n", 277 | "- **starting_sentence**: The sentence to start the story with." 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [ 286 | "story_writing_string = \"\"\"Write a {word_limit}-word \\\n", 287 | "{writing_style} {genre} story \\\n", 288 | "starting with the sentence: {starting_sentence}.\"\"\"\n", 289 | "\n", 290 | "story_writing_template = ChatPromptTemplate.from_template(story_writing_string)\n", 291 | "\n", 292 | "story_messages = story_writing_template.format_messages(\n", 293 | " word_limit=\"300\", \n", 294 | " writing_style=\"narrative\", \n", 295 | " genre=\"mystery\", \n", 296 | " starting_sentence=\"The boat had disappeared over the horizon.\"\n", 297 | ")\n", 298 | "\n", 299 | "print(story_messages)" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": null, 305 | "metadata": {}, 306 | "outputs": [], 307 | "source": [ 308 | "response_story = chat(story_messages)" 309 | ] 310 | }, 311 | { 312 | "cell_type": "code", 313 | "execution_count": null, 314 | "metadata": {}, 315 | "outputs": [], 316 | "source": [ 317 | "print(response_story.content)" 318 | ] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "metadata": {}, 323 | "source": [ 324 | "**Keep experimenting**\n", 325 | "\n", 326 | "Here are other `input_variables` you could try:\n", 327 | "\n", 328 | "```\n", 329 | "creative_writing_variables = [\n", 330 | " {\"word_limit\": \"500\", \"writing_style\": \"atmospheric and melancholic\", \"genre\": \"urban fantasy\", \"starting_sentence\": \"In the dim glow of twilight, the city revealed its secrets.\"},\n", 331 | " {\"word_limit\": \"400\", \"writing_style\": \"mysterious and intriguing\", \"genre\": \"mystery\", \"starting_sentence\": \"Every morning, I found a new letter on my doorstep, but no sign of the sender.\"},\n", 332 | " {\"word_limit\": \"600\", \"writing_style\": \"lighthearted and comedic\", \"genre\": \"romantic comedy\", \"starting_sentence\": \"The day started with coffee spilled all over her dress, and it only got weirder from there.\"},\n", 333 | " {\"word_limit\": \"300\", \"writing_style\": \"dark and suspenseful\", \"genre\": \"thriller\", \"starting_sentence\": \"The clock struck midnight, and a knock echoed through the silent house.\"},\n", 334 | " {\"word_limit\": \"550\", \"writing_style\": \"whimsical and dreamy\", \"genre\": \"fantasy\", \"starting_sentence\": \"Among the clouds, cities floated, and each had its own tale.\"},\n", 335 | " {\"word_limit\": \"450\", \"writing_style\": \"intense and gripping\", \"genre\": \"drama\", \"starting_sentence\": \"The promise made years ago was now coming to haunt them.\"},\n", 336 | " {\"word_limit\": \"350\", \"writing_style\": \"reflective and philosophical\", \"genre\": \"slice of life\", \"starting_sentence\": \"In the grand scheme of things, he pondered, what was the meaning of a single day?\"},\n", 337 | " {\"word_limit\": \"520\", \"writing_style\": \"adventurous and thrilling\", \"genre\": \"adventure\", \"starting_sentence\": \"With the map in one hand and a compass in another, she ventured into the unknown.\"},\n", 338 | " {\"word_limit\": \"480\", \"writing_style\": \"eerie and haunting\", \"genre\": \"horror\", \"starting_sentence\": \"Whispers echoed from the walls, but the house had been abandoned for decades.\"},\n", 339 | " {\"word_limit\": \"500\", \"writing_style\": \"inspiring and uplifting\", \"genre\": \"inspirational\", \"starting_sentence\": 
\"Every end is a new beginning, she reminded herself.\"}\n", 340 | "]\n", 341 | "```" 342 | ] 343 | }, 344 | { 345 | "cell_type": "markdown", 346 | "metadata": {}, 347 | "source": [ 348 | "## Conclusion\n", 349 | "\n", 350 | "LangChain offers a seamless approach to create and work with chat-based prompt templates, facilitating dynamic interactions with models like GPT-4 and simplifying the process of generating specific and contextualized prompts.\n", 351 | "\n", 352 | "Using prompt templates, especially in combination with powerful language models, has a broad range of practical applications. Here are 10 potential applications:\n", 353 | "\n", 354 | "\n", 355 | "1. **Content Generation**:\n", 356 | "\n", 357 | "Automate the creation of articles, blogs, or other content based on predefined themes, tones, or styles.\n", 358 | "\n", 359 | "\n", 360 | "2. **Customer Support**:\n", 361 | "\n", 362 | "Standardize initial queries or troubleshooting steps, enabling quicker issue identification and resolution.\n", 363 | "\n", 364 | "\n", 365 | "3. **Education & Tutoring**:\n", 366 | "\n", 367 | "Teachers or educational platforms can use templates to generate quiz questions, assignments, or even explanations on various topics in a consistent format.\n", 368 | "\n", 369 | "\n", 370 | "4. **Data Analysis**:\n", 371 | "\n", 372 | "Users can create templates to ask language models to perform specific types of data analyses, generating insights or visualizations on datasets.\n", 373 | "\n", 374 | "\n", 375 | "5. **Entertainment & Gaming**:\n", 376 | "\n", 377 | "Game developers can generate dynamic in-game dialogues, storylines, or character backgrounds using templates.\n", 378 | "\n", 379 | "\n", 380 | "6. **Research Assistance**:\n", 381 | "\n", 382 | "Researchers can use templates to scan or summarize vast amounts of literature based on specific research questions or themes.\n", 383 | "\n", 384 | "\n", 385 | "7. **Translation & Localization**:\n", 386 | "\n", 387 | "Generate translations for standardized messages or content across multiple languages.\n", 388 | "\n", 389 | "\n", 390 | "8. **E-commerce & Retail**:\n", 391 | "\n", 392 | "Retailers can use templates to generate product descriptions, reviews, or even marketing messages based on product attributes.\n", 393 | "\n", 394 | "\n", 395 | "9. **Recruitment & HR**:\n", 396 | "\n", 397 | "HR departments can use templates to draft standardized interview questions, job descriptions, or feedback forms tailored to specific roles or departments.\n", 398 | "\n", 399 | "\n", 400 | "10. **Healthcare**:\n", 401 | "\n", 402 | "Health practitioners can use templates for patient intake forms, diagnosis reports, or even to draft advice or recommendations based on specific symptoms or conditions." 
403 | ] 404 | }, 405 | { 406 | "cell_type": "code", 407 | "execution_count": null, 408 | "metadata": {}, 409 | "outputs": [], 410 | "source": [] 411 | } 412 | ], 413 | "metadata": { 414 | "kernelspec": { 415 | "display_name": "langchain", 416 | "language": "python", 417 | "name": "python3" 418 | }, 419 | "language_info": { 420 | "codemirror_mode": { 421 | "name": "ipython", 422 | "version": 3 423 | }, 424 | "file_extension": ".py", 425 | "mimetype": "text/x-python", 426 | "name": "python", 427 | "nbconvert_exporter": "python", 428 | "pygments_lexer": "ipython3", 429 | "version": "3.9.17" 430 | }, 431 | "orig_nbformat": 4 432 | }, 433 | "nbformat": 4, 434 | "nbformat_minor": 2 435 | } 436 | -------------------------------------------------------------------------------- /LangChainGuides/images/Chains_router.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/Chains_router.png -------------------------------------------------------------------------------- /LangChainGuides/images/Chains_seq.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/Chains_seq.png -------------------------------------------------------------------------------- /LangChainGuides/images/Chains_simple_seq.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/Chains_simple_seq.png -------------------------------------------------------------------------------- /LangChainGuides/images/Embeddings.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/Embeddings.png -------------------------------------------------------------------------------- /LangChainGuides/images/TAO.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/TAO.png -------------------------------------------------------------------------------- /LangChainGuides/images/VectorDatabaseCreate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/VectorDatabaseCreate.png -------------------------------------------------------------------------------- /LangChainGuides/images/VectorDatabaseProcess.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/VectorDatabaseProcess.png -------------------------------------------------------------------------------- /LangChainGuides/images/qa_data_ecosystem.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/qa_data_ecosystem.png 
-------------------------------------------------------------------------------- /LangChainGuides/images/qa_flow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/images/qa_flow.png -------------------------------------------------------------------------------- /LangChainGuides/transcripts/PT693-Transcript.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/krisograbek/LangChain_Tutorials/bc51827d798390675a2f3159076f05fbd558dfb7/LangChainGuides/transcripts/PT693-Transcript.pdf -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # LangChain Tutorial Repository 2 | 3 | Welcome to the LangChain Tutorial Repository! This repository contains a collection of tutorials and examples to help you get started with the LangChain library, a powerful Python framework for building applications with large language models (LLMs). 4 | 5 | ## Table of Contents 6 | 7 | - [LangChain Tutorial Repository](#langchain-tutorial-repository) 8 | - [Table of Contents](#table-of-contents) 9 | - [Introduction](#introduction) 10 | - [Installation](#installation) 11 | - [Getting Started](#getting-started) 12 | - [Tutorials](#tutorials) 13 | - [License](#license) 14 | 15 | ## Introduction 16 | 17 | LangChain is an open-source Python library designed to simplify and accelerate the development of applications powered by large language models. Whether you're a beginner or an experienced developer, these tutorials will walk you through the basics of using LangChain to process and analyze text data effectively. 18 | 19 | ## Installation 20 | 21 | Before diving into the tutorials, make sure you have installed the LangChain and OpenAI libraries. You can install them using pip: 22 | 23 | ```bash 24 | pip install langchain openai 25 | ``` 26 | 27 | Please refer to the official [LangChain documentation](https://python.langchain.com/docs/get_started/introduction.html) for more detailed installation instructions and library features. 28 | 29 | Depending on the tutorial you run, you may need to install the following libraries: 30 | 31 | - `python-dotenv`: Used to read the .env file containing the OpenAI API Key 32 | - `ipykernel`: Enables running the notebooks in VSCode 33 | - `youtube-transcript-api`: Fetches YouTube video transcripts 34 | - `pytube`: Fetches YouTube video metadata 35 | - `tiktoken`: Counts tokens in a text 36 | 37 | 38 | ## Getting Started 39 | 40 | If you are new to LangChain, we recommend starting with the `Getting Started` section of the documentation. There, you will learn the fundamentals of the library and the basic concepts required for the tutorials. 41 | 42 | ## Tutorials 43 | 44 | The tutorials in this repository cover a range of topics and use cases to demonstrate how to use LangChain for various LLM-powered tasks. Each tutorial is contained in a separate Jupyter Notebook for easy viewing and execution. 46 | | Tutorial Name | Description | 47 | | ------------------------------------------- | ------------------------------------------------ | 48 | | [YouTube Loader](LangChainGuides/YouTubeLoader.ipynb) | Analyze YouTube Videos with LangChain and GPT-3.5. 
| 49 | | [Podcast transcript QA](LangChainGuides/ChatWithPodcast.ipynb) | Chat with your favorite podcast using GPT-3.5. | 50 | | [Second Brain (Obsidian) QA](LangChainGuides/ChatWithObsidian.ipynb) | QA over your second brain with LangChain | 51 | | [LangChain Prompt Templates](LangChainGuides/PromptTemplates.ipynb) | How to use LangChain's prompt templates | 52 | | [LangChain Chains](LangChainGuides/ContentIdeaGenerator.ipynb) | How to use LangChain's chains | 53 | | [Basic LangChain Agents](LangChainGuides/BasicAgent.ipynb) | The basic usage of LangChain Agents | 54 | 55 | Feel free to explore the tutorials in any order you prefer, depending on your interests and prior experience with the LangChain library. 56 | 57 | 58 | ## License 59 | 60 | This project is licensed under the [MIT License](https://github.com/krisograbek/LangChain_Tutorials/blob/main/LICENSE.txt). You are free to use, modify, and distribute this code in your projects. We appreciate attribution if you use these tutorials in your work. 61 | 62 | --------------------------------------------------------------------------------