├── .github
│   └── FUNDING.yml
├── LICENSE
├── README.md
├── data
│   ├── attention-is-all-you-need.pdf
│   ├── deniro.csv
│   └── lorem-ipsum.txt
├── images
│   └── colab.svg
└── notebooks
    ├── langchain_decoded.ipynb
    ├── langchain_decoded_1_models.ipynb
    ├── langchain_decoded_2_embeddings.ipynb
    ├── langchain_decoded_3_prompts.ipynb
    ├── langchain_decoded_4_indexes.ipynb
    └── langchain_decoded_5_memory.ipynb
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
1 | ko_fi: alphasec
2 | buy_me_a_coffee: alphasec
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2023 Alphasec
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # LangChain Decoded
2 |
19 |
20 | A companion guide for the blog post series, [LangChain Decoded](https://alphasec.io/langchain-decoded-the-muggles-guide-to-langchain).
21 |
22 | [LangChain](https://langchain.readthedocs.io/en/latest/) is an open-source framework for building applications powered by large language models (LLMs). It can be used for chatbots, text summarisation, data generation, code understanding, question answering, evaluation, and more. In this multi-part series, I explore various LangChain modules and use cases, and document my journey via Python notebooks. Feel free to follow along and fork the repository, or use individual notebooks on [Google Colab](https://colab.research.google.com).
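
For a quick taste of what the notebooks cover, here is a minimal sketch of the kind of call the series builds on, assuming the `langchain` and `openai` packages are installed and an `OPENAI_API_KEY` environment variable is set:

```python
import os
from langchain.llms import OpenAI

# Assumes OPENAI_API_KEY is set in the environment
llm = OpenAI(model_name="text-davinci-003", openai_api_key=os.environ["OPENAI_API_KEY"])

# Generate a simple text response
print(llm("Why is the sky blue?"))
```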
23 |
24 | ## [Part 1: Models](notebooks/langchain_decoded_1_models.ipynb)
25 | This notebook is an exploration of LangChain Models. Read [this post](https://alphasec.io/langchain-decoded-part-1-models) and follow along!
26 |
27 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_1_models.ipynb)
28 |
29 | ## [Part 2: Embeddings](notebooks/langchain_decoded_2_embeddings.ipynb)
30 | This notebook is an exploration of LangChain Embeddings. Read [this post](https://alphasec.io/langchain-decoded-part-2-embeddings) and follow along!
31 |
32 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_2_embeddings.ipynb)
33 |
34 | ## [Part 3: Prompts](notebooks/langchain_decoded_3_prompts.ipynb)
35 | This notebook is an exploration of LangChain Prompts. Read [this post](https://alphasec.io/langchain-decoded-part-3-prompts) and follow along!
36 |
37 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_3_prompts.ipynb)
38 |
39 | ## [Part 4: Indexes](notebooks/langchain_decoded_4_indexes.ipynb)
40 | This notebook is an exploration of LangChain Indexes. Read [this post](https://alphasec.io/langchain-decoded-part-4-indexes) and follow along!
41 |
42 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_4_indexes.ipynb)
43 |
44 | ## [Part 5: Memory](notebooks/langchain_decoded_5_memory.ipynb)
45 | This notebook is an exploration of LangChain Memory. Read [this post](https://alphasec.io/langchain-decoded-part-5-memory) and follow along!
46 |
47 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_5_memory.ipynb)
48 |
49 | ## [Part 6: Chains](notebooks/langchain_decoded_6_chains.ipynb) (coming soon)
50 | This notebook is an exploration of LangChain Chains. Read [this post](https://alphasec.io/langchain-decoded-part-6-chains) and follow along!
51 |
52 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_6_chains.ipynb)
53 |
54 | ## [Part 7: Agents](notebooks/langchain_decoded_7_agents.ipynb) (coming soon)
55 | This notebook is an exploration of LangChain Agents. Read [this post](https://alphasec.io/langchain-decoded-part-7-agents) and follow along!
56 |
57 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_7_agents.ipynb)
58 |
59 | ## [Part 8: Callbacks](notebooks/langchain_decoded_8_callbacks.ipynb) (coming soon)
60 | This notebook is an exploration of LangChain Callbacks. Read [this post](https://alphasec.io/langchain-decoded-part-8-callbacks) and follow along!
61 |
62 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded_8_callbacks.ipynb)
63 |
64 | ## [All-in-One](notebooks/langchain_decoded.ipynb)
65 | This notebook is a consolidation of the individual notebooks above.
66 |
67 | [](https://colab.research.google.com/github/alphasecio/langchain-decoded/blob/main/notebooks/langchain_decoded.ipynb)
68 |
--------------------------------------------------------------------------------
/data/attention-is-all-you-need.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alphasecio/langchain-decoded/1da0bcaf2cf7a334e25a9bae151ca22dfe2d0537/data/attention-is-all-you-need.pdf
--------------------------------------------------------------------------------
/data/deniro.csv:
--------------------------------------------------------------------------------
1 | "Year", "Score", "Title"
2 | 1968, 86, "Greetings"
3 | 1970, 17, "Bloody Mama"
4 | 1970, 73, "Hi, Mom!"
5 | 1971, 40, "Born to Win"
6 | 1973, 98, "Mean Streets"
7 | 1973, 88, "Bang the Drum Slowly"
8 | 1974, 97, "The Godfather, Part II"
9 | 1976, 41, "The Last Tycoon"
10 | 1976, 99, "Taxi Driver"
11 | 1977, 47, "1900"
12 | 1977, 67, "New York, New York"
13 | 1978, 93, "The Deer Hunter"
14 | 1980, 97, "Raging Bull"
15 | 1981, 75, "True Confessions"
16 | 1983, 90, "The King of Comedy"
17 | 1984, 89, "Once Upon a Time in America"
18 | 1984, 60, "Falling in Love"
19 | 1985, 98, "Brazil"
20 | 1986, 65, "The Mission"
21 | 1987, 100, "Dear America: Letters Home From Vietnam"
22 | 1987, 80, "The Untouchables"
23 | 1987, 78, "Angel Heart"
24 | 1988, 96, "Midnight Run"
25 | 1989, 64, "Jacknife"
26 | 1989, 47, "We're No Angels"
27 | 1990, 88, "Awakenings"
28 | 1990, 29, "Stanley & Iris"
29 | 1990, 96, "Goodfellas"
30 | 1991, 76, "Cape Fear"
31 | 1991, 69, "Mistress"
32 | 1991, 65, "Guilty by Suspicion"
33 | 1991, 71, "Backdraft"
34 | 1992, 87, "Thunderheart"
35 | 1992, 67, "Night and the City"
36 | 1993, 75, "This Boy's Life"
37 | 1993, 78, "Mad Dog and Glory"
38 | 1993, 96, "A Bronx Tale"
39 | 1994, 39, "Mary Shelley's Frankenstein"
40 | 1995, 80, "Casino"
41 | 1995, 86, "Heat"
42 | 1996, 74, "Sleepers"
43 | 1996, 38, "The Fan"
44 | 1996, 80, "Marvin's Room"
45 | 1997, 85, "Wag the Dog"
46 | 1997, 87, "Jackie Brown"
47 | 1997, 72, "Cop Land"
48 | 1998, 68, "Ronin"
49 | 1998, 38, "Great Expectations"
50 | 1999, 69, "Analyze This"
51 | 1999, 43, "Flawless"
52 | 2000, 43, "The Adventures of Rocky & Bullwinkle"
53 | 2000, 84, "Meet the Parents"
54 | 2000, 41, "Men of Honor"
55 | 2001, 73, "The Score"
56 | 2001, 33, "15 Minutes"
57 | 2002, 48, "City by the Sea"
58 | 2002, 27, "Analyze That"
59 | 2003, 4, "Godsend"
60 | 2004, 35, "Shark Tale"
61 | 2004, 38, "Meet the Fockers"
62 | 2005, 4, "The Bridge of San Luis Rey"
63 | 2005, 46, "Rent"
64 | 2005, 13, "Hide and Seek"
65 | 2006, 54, "The Good Shepherd"
66 | 2007, 21, "Arthur and the Invisibles"
67 | 2007, 76, "Captain Shakespeare"
68 | 2008, 19, "Righteous Kill"
69 | 2008, 51, "What Just Happened?"
70 | 2009, 46, "Everybody's Fine"
71 | 2010, 72, "Machete"
72 | 2010, 10, "Little Fockers"
73 | 2010, 50, "Stone"
74 | 2011, 25, "Killer Elite"
75 | 2011, 7, "New Year's Eve"
76 | 2011, 70, "Limitless"
77 | 2012, 92, "Silver Linings Playbook"
78 | 2012, 51, "Being Flynn"
79 | 2012, 29, "Red Lights"
80 | 2013, 46, "Last Vegas"
81 | 2013, 7, "The Big Wedding"
82 | 2013, 29, "Grudge Match"
83 | 2013, 11, "Killing Season"
84 | 2014, 9, "The Bag Man"
85 | 2015, 60, "Joy"
86 | 2015, 26, "Heist"
87 | 2015, 61, "The Intern"
88 | 2016, 11, "Dirty Grandpa"
89 |
90 |
--------------------------------------------------------------------------------
/data/lorem-ipsum.txt:
--------------------------------------------------------------------------------
1 | Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed ut dignissim tortor, sit amet malesuada nibh. Mauris eleifend volutpat libero finibus volutpat. In sagittis mi id fringilla hendrerit. Phasellus ut leo fringilla, consequat diam ut, tincidunt nulla. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse aliquet ut metus at tristique. Proin ornare laoreet metus, ac venenatis lectus congue non. Donec sagittis, metus ac tristique maximus, augue massa facilisis nisi, id maximus tellus dui eu sem. Sed luctus ac tellus ut auctor.
2 |
3 | Vestibulum congue convallis finibus. Morbi sit amet ex eget risus hendrerit congue id ac sem. Nulla sagittis nec quam id volutpat. Nullam suscipit posuere pellentesque. Suspendisse condimentum ex nec dolor feugiat, eu ullamcorper enim pulvinar. Sed pretium lacus eu ipsum auctor, id ullamcorper libero ultrices. Maecenas in nisl tincidunt, bibendum eros vitae, pellentesque urna. Integer pharetra lacus justo, non imperdiet ligula fringilla eget.
4 |
5 | Suspendisse placerat quam at eros scelerisque, ut dapibus erat mollis. Suspendisse posuere malesuada aliquam. Suspendisse tincidunt consequat pulvinar. Curabitur maximus ipsum arcu, venenatis efficitur turpis malesuada quis. Nunc ac urna at sapien suscipit pretium sit amet malesuada tortor. Nullam bibendum sit amet neque eu fermentum. Sed vel elementum lacus, sit amet luctus eros. Pellentesque laoreet leo enim, eget porttitor ante rhoncus ut. Morbi ac enim non est aliquet venenatis eu eget odio. Vestibulum a pellentesque mi, a posuere eros. In elementum accumsan nisl. Aliquam tellus enim, luctus id libero quis, consectetur sagittis lacus. In tellus ipsum, semper ac accumsan vitae, consequat sit amet nisl. Morbi ornare, risus eget hendrerit euismod, enim nunc tempor orci, vel molestie arcu felis vitae velit. Morbi risus erat, commodo sed arcu id, vehicula egestas ex.
6 |
7 | Suspendisse quis rutrum lectus, ut malesuada elit. Proin eget tellus vitae nunc lacinia sollicitudin quis vel felis. Praesent viverra dui vel elementum sagittis. Sed vehicula metus ullamcorper, malesuada nunc ac, imperdiet ex. Ut turpis est, posuere at libero vel, elementum ultricies nisl. Praesent dignissim euismod egestas. Proin justo ex, laoreet a sodales vitae, ullamcorper a neque. Aenean pharetra ornare ipsum ut tempus. Aliquam nunc sem, hendrerit non augue ac, mollis hendrerit libero. Morbi quis hendrerit leo. Nullam vel ex consequat, tempus turpis sit amet, tristique tortor. Aenean eleifend, lacus vel dignissim pellentesque, tellus eros tincidunt arcu, sit amet bibendum erat turpis id est. Nullam commodo in nisi nec dictum. Sed eget condimentum dolor. Aenean id leo venenatis, condimentum turpis a, posuere lorem. Aliquam tincidunt ante ut dui gravida pretium.
8 |
9 | Suspendisse facilisis sit amet lacus ut varius. Nullam sed odio turpis. Etiam rutrum ex a iaculis efficitur. Donec tincidunt magna at est euismod, non placerat eros tempus. Ut libero diam, auctor eu turpis sit amet, lobortis vehicula turpis. Vestibulum at sodales tortor. Nunc consequat, mi ut semper ultricies, massa eros maximus mauris, vel bibendum quam libero eu purus. Vivamus dapibus nisi a dolor tincidunt, vel malesuada metus sollicitudin. Aliquam erat volutpat. Donec nisl tellus, pharetra eu elit feugiat, luctus malesuada nulla. Duis ipsum erat, ultrices non molestie ac, bibendum pellentesque turpis. Nam tincidunt volutpat ipsum, quis aliquam est. Phasellus lectus eros, imperdiet non porta nec, accumsan ut augue. Curabitur ultricies fermentum enim, suscipit elementum justo finibus vitae. Nullam molestie egestas erat ut vehicula. Proin et dui et orci vehicula ultrices.
10 |
11 | In mattis tincidunt ex ac dictum. Duis ultricies sem tincidunt odio sagittis egestas. Proin vitae risus eu mauris rhoncus iaculis quis ac dolor. Phasellus euismod enim ligula, a porttitor elit tincidunt vel. Donec ac tempus felis. Curabitur sed posuere ipsum. Morbi scelerisque volutpat magna a vestibulum. Donec tincidunt ultrices ipsum sit amet ultricies. Vivamus eleifend massa et nibh vulputate, vel semper nulla viverra. Mauris cursus accumsan molestie. Praesent mattis vestibulum feugiat. Vestibulum lobortis ornare sapien, eget porttitor mauris rhoncus quis. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam vehicula convallis sollicitudin. Nulla rhoncus dapibus purus, ut vulputate nibh efficitur vitae. Mauris vulputate purus eu tellus rutrum dictum.
12 |
13 | Sed molestie massa sit amet pharetra maximus. Aliquam erat volutpat. Etiam faucibus risus at pretium suscipit. In semper vitae magna vitae tristique. Sed mauris orci, tincidunt vitae mollis volutpat, molestie vitae mi. Aliquam faucibus sagittis arcu et ornare. Nunc sit amet mattis erat. Fusce erat urna, egestas sed consequat et, dignissim quis lacus. Quisque semper lacinia semper. Etiam consectetur eu odio sit amet mollis.
14 |
15 | Duis ac mollis massa, tincidunt gravida elit. Sed vel est vel diam tristique ullamcorper. Mauris volutpat venenatis justo, quis fermentum neque gravida aliquet. Nam dignissim sagittis augue, sed rutrum tellus imperdiet eu. Aenean sed urna elementum, auctor sem id, malesuada tortor. Pellentesque eget elementum ipsum. Phasellus vulputate dapibus aliquet. Proin ultricies pulvinar risus eget venenatis. Vestibulum nec dolor lobortis, tempor lectus at, iaculis eros.
16 |
17 | Duis in quam a massa elementum congue. Sed nec nulla hendrerit augue sodales suscipit non vitae orci. Fusce tincidunt ligula urna, vel imperdiet nisi euismod sagittis. Fusce tincidunt leo tempus ante aliquam vulputate. Phasellus rutrum blandit dictum. Maecenas aliquet tristique semper. Aenean sem ex, consectetur in justo ac, consectetur efficitur justo. Integer interdum cursus imperdiet. Sed vitae feugiat ipsum. Aliquam eget velit sit amet leo viverra consequat. Phasellus eu elit et dolor bibendum vestibulum. Donec a venenatis elit.
18 |
19 | In hac habitasse platea dictumst. Vivamus vitae congue nisl, id tincidunt ipsum. Fusce interdum quam turpis, ac tempor metus accumsan a. Praesent suscipit tincidunt elit, sit amet iaculis nunc porta vel. Sed vehicula augue vel nulla viverra, vel tincidunt dui egestas. Donec commodo maximus nisl laoreet bibendum. Ut euismod lectus ut eleifend bibendum. Etiam pellentesque finibus consequat. Suspendisse gravida massa nulla, ut consequat risus pharetra a.
20 |
--------------------------------------------------------------------------------
/images/colab.svg:
--------------------------------------------------------------------------------
1 | Open in Colab
2 |
--------------------------------------------------------------------------------
/notebooks/langchain_decoded.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "collapsed_sections": [
8 | "zgFXxUsp5pbu",
9 | "d_YGHcob5Rhs",
10 | "3MvgsUgi-yQD",
11 | "vpCYayVyWqyr",
12 | "oH043Uvlr95a",
13 | "U8kUt5Jm5Y6L",
14 | "MTkGMnWG_lyh",
15 | "Pc8d5uA7_4u_",
16 | "dIEPnFzw_7Rx",
17 | "yF32kJni__hY",
18 | "1RpFQvgAACR6",
19 | "m0fTtYd30V7O"
20 | ]
21 | },
22 | "kernelspec": {
23 | "name": "python3",
24 | "display_name": "Python 3"
25 | },
26 | "language_info": {
27 | "name": "python"
28 | }
29 | },
30 | "cells": [
31 | {
32 | "cell_type": "markdown",
33 | "source": [
34 | "# **LangChain Decoded**"
35 | ],
36 | "metadata": {
37 | "id": "JQxt18IfZR40"
38 | }
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "source": [
43 | "## Getting Started"
44 | ],
45 | "metadata": {
46 | "id": "zgFXxUsp5pbu"
47 | }
48 | },
49 | {
50 | "cell_type": "code",
51 | "source": [
52 | "# Install the LangChain package\n",
53 | "!pip install langchain"
54 | ],
55 | "metadata": {
56 | "id": "fRUe2wO95xoR"
57 | },
58 | "execution_count": null,
59 | "outputs": []
60 | },
61 | {
62 | "cell_type": "code",
63 | "source": [
64 | "# Install the OpenAI package\n",
65 | "!pip install openai"
66 | ],
67 | "metadata": {
68 | "id": "T7TwICBe9yy4"
69 | },
70 | "execution_count": null,
71 | "outputs": []
72 | },
73 | {
74 | "cell_type": "code",
75 | "source": [
76 | "# Configure the API key\n",
77 | "import os\n",
78 | "\n",
79 | "openai_api_key = os.environ.get('OPENAI_API_KEY', 'sk-XXX')"
80 | ],
81 | "metadata": {
82 | "id": "I-ZUDlva6m3Y"
83 | },
84 | "execution_count": null,
85 | "outputs": []
86 | },
87 | {
88 | "cell_type": "markdown",
89 | "source": [
90 | "## Part 1: Models"
91 | ],
92 | "metadata": {
93 | "id": "d_YGHcob5Rhs"
94 | }
95 | },
96 | {
97 | "cell_type": "markdown",
98 | "source": [
99 | "### Large Language Models (LLMs)"
100 | ],
101 | "metadata": {
102 | "id": "3MvgsUgi-yQD"
103 | }
104 | },
105 | {
106 | "cell_type": "code",
107 | "source": [
108 | "# Use the OpenAI LLM wrapper and text-davinci-003 model\n",
109 | "from langchain.llms import OpenAI\n",
110 | "\n",
111 | "llm = OpenAI(model_name=\"text-davinci-003\", openai_api_key=openai_api_key)"
112 | ],
113 | "metadata": {
114 | "id": "9gqrb7Rh53uF"
115 | },
116 | "execution_count": null,
117 | "outputs": []
118 | },
119 | {
120 | "cell_type": "code",
121 | "source": [
122 | "# Generate a simple text response\n",
123 | "llm(\"Why is the sky blue?\")"
124 | ],
125 | "metadata": {
126 | "id": "1oQ3UzyEAnsL"
127 | },
128 | "execution_count": null,
129 | "outputs": []
130 | },
131 | {
132 | "cell_type": "code",
133 | "source": [
134 | "# Show the generation output instead\n",
135 | "llm_result = llm.generate([\"Why is the sky blue?\"])\n",
136 | "llm_result.llm_output"
137 | ],
138 | "metadata": {
139 | "id": "wpYhYbKOC1_L"
140 | },
141 | "execution_count": null,
142 | "outputs": []
143 | },
144 | {
145 | "cell_type": "code",
146 | "source": [
147 | "# Track OpenAI token usage for a single API call\n",
148 | "from langchain.callbacks import get_openai_callback\n",
149 | "\n",
150 | "with get_openai_callback() as cb:\n",
151 | " result = llm(\"Why is the sky blue?\")\n",
152 | "\n",
153 | " print(f\"Total Tokens: {cb.total_tokens}\")\n",
154 | " print(f\"\\tPrompt Tokens: {cb.prompt_tokens}\")\n",
155 | " print(f\"\\tCompletion Tokens: {cb.completion_tokens}\")\n",
156 | " print(f\"Total Cost (USD): ${cb.total_cost}\")"
157 | ],
158 | "metadata": {
159 | "id": "LsE4Euf2DeW9"
160 | },
161 | "execution_count": null,
162 | "outputs": []
163 | },
164 | {
165 | "cell_type": "markdown",
166 | "source": [
167 | "### Chat Models"
168 | ],
169 | "metadata": {
170 | "id": "vpCYayVyWqyr"
171 | }
172 | },
173 | {
174 | "cell_type": "code",
175 | "source": [
176 | "# Define system message for the chatbot, and pass human message\n",
177 | "from langchain.chat_models import ChatOpenAI\n",
178 | "from langchain.schema import (\n",
179 | " AIMessage,\n",
180 | " HumanMessage,\n",
181 | " SystemMessage\n",
182 | ")\n",
183 | "\n",
184 | "chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)\n",
185 | "\n",
186 | "messages = [\n",
187 | " SystemMessage(content=\"You are a helpful assistant that translates English to Spanish.\"),\n",
188 | " HumanMessage(content=\"Translate this sentence from English to Spanish. I'm hungry, give me food.\")\n",
189 | "]\n",
190 | "\n",
191 | "chat(messages)"
192 | ],
193 | "metadata": {
194 | "id": "Bm_9BTO1Wsz4"
195 | },
196 | "execution_count": null,
197 | "outputs": []
198 | },
199 | {
200 | "cell_type": "markdown",
201 | "source": [
202 | "## Part 2: Embeddings"
203 | ],
204 | "metadata": {
205 | "id": "oH043Uvlr95a"
206 | }
207 | },
208 | {
209 | "cell_type": "code",
210 | "source": [
211 | "# Use OpenAI text embeddings for a text input\n",
212 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
213 | "\n",
214 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
215 | "\n",
216 | "text = \"This is a sample query.\"\n",
217 | "\n",
218 | "query_result = embeddings.embed_query(text)\n",
219 | "print(query_result)\n",
220 | "print(len(query_result))"
221 | ],
222 | "metadata": {
223 | "id": "ozw7kkLosUqy"
224 | },
225 | "execution_count": null,
226 | "outputs": []
227 | },
228 | {
229 | "cell_type": "code",
230 | "source": [
231 | "# Use OpenAI text embeddings for multiple text/document inputs\n",
232 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
233 | "\n",
234 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
235 | "\n",
236 | "text = [\"This is a sample query.\", \"This is another sample query.\", \"This is yet another sample query.\"]\n",
237 | "\n",
238 | "doc_result = embeddings.embed_documents(text)\n",
239 | "print(doc_result)\n",
240 | "print(len(doc_result))"
241 | ],
242 | "metadata": {
243 | "id": "NRH8asFsMtZB"
244 | },
245 | "execution_count": null,
246 | "outputs": []
247 | },
248 | {
249 | "cell_type": "code",
250 | "source": [
251 | "# Use fake embeddings to test your pipeline\n",
252 | "from langchain.embeddings import FakeEmbeddings\n",
253 | "\n",
254 | "embeddings = FakeEmbeddings(size=1481)\n",
255 | "\n",
256 | "text = \"This is a sample query.\"\n",
257 | "\n",
258 | "query_result = embeddings.embed_query(text)\n",
259 | "print(query_result)\n",
260 | "print(len(query_result))"
261 | ],
262 | "metadata": {
263 | "id": "-1VMjbchOZZG"
264 | },
265 | "execution_count": null,
266 | "outputs": []
267 | },
268 | {
269 | "cell_type": "code",
270 | "source": [
271 | "# Request with context length > 8191 throws an error\n",
272 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
273 | "\n",
274 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
275 | "\n",
276 | "long_text = 'Hello ' * 10000\n",
277 | "\n",
278 | "query_result = embeddings.embed_query(long_text)\n",
279 | "print(query_result)"
280 | ],
281 | "metadata": {
282 | "id": "9SYhZJyQTiZZ"
283 | },
284 | "execution_count": null,
285 | "outputs": []
286 | },
287 | {
288 | "cell_type": "code",
289 | "source": [
290 | "!pip install tiktoken"
291 | ],
292 | "metadata": {
293 | "id": "t0blTkHpVOjl"
294 | },
295 | "execution_count": null,
296 | "outputs": []
297 | },
298 | {
299 | "cell_type": "code",
300 | "source": [
301 | "# Truncate input text length using tiktoken\n",
302 | "import tiktoken\n",
303 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
304 | "\n",
305 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
306 | "\n",
307 | "max_tokens = 8191\n",
308 | "encoding_name = 'cl100k_base'\n",
309 | "\n",
310 | "long_text = 'Hello ' * 10000\n",
311 | "\n",
312 | "# Tokenize the input text before truncating it\n",
313 | "encoding = tiktoken.get_encoding(encoding_name)\n",
314 | "tokens = encoding.encode(long_text)[:max_tokens]\n",
315 | "\n",
316 | "# Re-convert the tokens to a string before embedding\n",
317 | "truncated_text = encoding.decode(tokens)\n",
318 | "\n",
319 | "query_result = embeddings.embed_query(truncated_text)\n",
320 | "print(query_result)\n",
321 | "print(len(query_result))"
322 | ],
323 | "metadata": {
324 | "id": "reeAzDpqXXoC"
325 | },
326 | "execution_count": null,
327 | "outputs": []
328 | },
329 | {
330 | "cell_type": "markdown",
331 | "source": [
332 | "## Part 3: Prompts"
333 | ],
334 | "metadata": {
335 | "id": "U8kUt5Jm5Y6L"
336 | }
337 | },
338 | {
339 | "cell_type": "code",
340 | "source": [
341 | "# Ask the LLM about a recent event/occurence\n",
342 | "from langchain.llms.openai import OpenAI\n",
343 | "\n",
344 | "llm = OpenAI(model_name='text-davinci-003', openai_api_key=openai_api_key)\n",
345 | "\n",
346 | "print(llm(\"What is LangChain useful for? Answer in one sentence.\"))"
347 | ],
348 | "metadata": {
349 | "id": "lOwGPXanZevQ"
350 | },
351 | "execution_count": null,
352 | "outputs": []
353 | },
354 | {
355 | "cell_type": "code",
356 | "source": [
357 | "# Ask the same question again, but with relevant context\n",
358 | "prompt = \"\"\"You are a helpful assistant, who can explain concepts in an easy-to-understand manner. Answer the following question succintly.\n",
359 | " Context: There are six main areas that LangChain is designed to help with. These are, in increasing order of complexity:\n",
360 | " LLMs and Prompts: This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.\n",
361 | " Chains: Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.\n",
362 | " Data Augmented Generation: Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.\n",
363 | " Agents: Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.\n",
364 | " Memory: Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.\n",
365 | " Evaluation: Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.\n",
366 | " Question: What is LangChain useful for?\n",
367 | " Answer: \"\"\"\n",
368 | "\n",
369 | "print(llm(prompt))"
370 | ],
371 | "metadata": {
372 | "id": "lHi9-vCAasK5"
373 | },
374 | "execution_count": null,
375 | "outputs": []
376 | },
377 | {
378 | "cell_type": "code",
379 | "source": [
380 | "# Use a template to structure the prompt\n",
381 | "from langchain import PromptTemplate\n",
382 | "\n",
383 | "template = \"\"\"You are a helpful assistant, who is good at general knowledge trivia. Answer the following question succintly.\n",
384 | " Question: {question}\n",
385 | " Answer:\"\"\"\n",
386 | "\n",
387 | "prompt = PromptTemplate(template=template, input_variables=['question'])\n",
388 | "\n",
389 | "question = \"Who won the first football World Cup?\"\n",
390 | "\n",
391 | "print(llm(question))"
392 | ],
393 | "metadata": {
394 | "id": "E4e782fpa-av"
395 | },
396 | "execution_count": null,
397 | "outputs": []
398 | },
399 | {
400 | "cell_type": "code",
401 | "source": [
402 | "# Use a chain to execute the prompt\n",
403 | "from langchain.chains import LLMChain\n",
404 | "\n",
405 | "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
406 | "\n",
407 | "print(llm_chain.run(question))"
408 | ],
409 | "metadata": {
410 | "id": "YwlnLs8cb62N"
411 | },
412 | "execution_count": null,
413 | "outputs": []
414 | },
415 | {
416 | "cell_type": "code",
417 | "source": [
418 | "# Save prompt template to JSON file\n",
419 | "prompt.save(\"myprompt.json\")\n",
420 | "\n",
421 | "# Load prompt template from JSON file\n",
422 | "from langchain.prompts import load_prompt\n",
423 | "\n",
424 | "saved_prompt = load_prompt(\"myprompt.json\")\n",
425 | "assert prompt == saved_prompt\n",
426 | "\n",
427 | "print(llm(question))"
428 | ],
429 | "metadata": {
430 | "id": "lRCAqLneb8m9"
431 | },
432 | "execution_count": null,
433 | "outputs": []
434 | },
435 | {
436 | "cell_type": "code",
437 | "source": [
438 | "# Guide the model using few shot examples in the prompt\n",
439 | "from langchain import PromptTemplate, FewShotPromptTemplate\n",
440 | "\n",
441 | "examples = [\n",
442 | " { \"question\": \"How can we extend our lifespan?\",\n",
443 | " \"answer\": \"Just freeze yourself and wait for technology to catch up.\"},\n",
444 | " { \"question\": \"Does red wine help you live longer?\",\n",
445 | " \"answer\": \"I don't know about that, but it does make the time pass more quickly.\"},\n",
446 | " { \"question\": \"How can we slow down the aging process?\",\n",
447 | " \"answer\": \"Simple, just stop having birthdays.\"}\n",
448 | "]\n",
449 | "\n",
450 | "template = \"\"\"\n",
451 | " Question: {question}\n",
452 | " Answer: {answer}\n",
453 | " \"\"\"\n",
454 | "\n",
455 | "prompt = PromptTemplate(input_variables=[\"question\", \"answer\"], template=template)\n",
456 | "\n",
457 | "few_shot_prompt = FewShotPromptTemplate(\n",
458 | " examples=examples,\n",
459 | " example_prompt=prompt,\n",
460 | " prefix=\"Respond with a funny and witty remark.\",\n",
461 | " suffix=\"Question: {question}\\nAnswer:\",\n",
462 | " input_variables=[\"question\"],\n",
463 | " example_separator=\"\"\n",
464 | ")\n",
465 | "\n",
466 | "print(few_shot_prompt.format(question=\"How can I eat healthy?\"))\n",
467 | "print(llm(few_shot_prompt.format(question=\"How can I eat healthy?\")))"
468 | ],
469 | "metadata": {
470 | "id": "Z7tTrZJ7b-Zc"
471 | },
472 | "execution_count": null,
473 | "outputs": []
474 | },
475 | {
476 | "cell_type": "code",
477 | "source": [
478 | "# Use prompt templates with chat models\n",
479 | "from langchain.chat_models import ChatOpenAI\n",
480 | "from langchain.prompts import (\n",
481 | " ChatPromptTemplate,\n",
482 | " PromptTemplate,\n",
483 | " SystemMessagePromptTemplate,\n",
484 | " AIMessagePromptTemplate,\n",
485 | " HumanMessagePromptTemplate,\n",
486 | ")\n",
487 | "\n",
488 | "chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)\n",
489 | "\n",
490 | "system_message=\"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
491 | "system_message_prompt = SystemMessagePromptTemplate.from_template(system_message)\n",
492 | "\n",
493 | "human_message=\"{text}\"\n",
494 | "human_message_prompt = HumanMessagePromptTemplate.from_template(human_message)\n",
495 | "\n",
496 | "chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])\n",
497 | "\n",
498 | "messages = chat_prompt.format_prompt(input_language=\"English\", output_language=\"Spanish\", text=\"I'm hungry, give me food.\").to_messages()\n",
499 | "\n",
500 | "chat(messages)"
501 | ],
502 | "metadata": {
503 | "id": "0CD1ssN8cBgK"
504 | },
505 | "execution_count": null,
506 | "outputs": []
507 | },
508 | {
509 | "cell_type": "markdown",
510 | "source": [
511 | "## Part 4: Indexes"
512 | ],
513 | "metadata": {
514 | "id": "MTkGMnWG_lyh"
515 | }
516 | },
517 | {
518 | "cell_type": "markdown",
519 | "source": [
520 | "### Document Loaders"
521 | ],
522 | "metadata": {
523 | "id": "Pc8d5uA7_4u_"
524 | }
525 | },
526 | {
527 | "cell_type": "code",
528 | "source": [
529 | "!pip install unstructured tabulate pdf2image pytesseract"
530 | ],
531 | "metadata": {
532 | "id": "i6zYaxTl_rlU"
533 | },
534 | "execution_count": null,
535 | "outputs": []
536 | },
537 | {
538 | "cell_type": "code",
539 | "source": [
540 | "# URL Loader\n",
541 | "from langchain.document_loaders import UnstructuredURLLoader\n",
542 | "\n",
543 | "urls = [\"https://alphasec.io/summarize-text-with-langchain-and-openai\"]\n",
544 | "loader = UnstructuredURLLoader(urls=urls)\n",
545 | "data = loader.load()\n",
546 | "print(data)"
547 | ],
548 | "metadata": {
549 | "id": "LiLWQ_tF_686"
550 | },
551 | "execution_count": null,
552 | "outputs": []
553 | },
554 | {
555 | "cell_type": "code",
556 | "source": [
557 | "!pip install pypdf"
558 | ],
559 | "metadata": {
560 | "id": "U_uGRyko_Hbk"
561 | },
562 | "execution_count": null,
563 | "outputs": []
564 | },
565 | {
566 | "cell_type": "code",
567 | "source": [
568 | "# PDF Loader\n",
569 | "from langchain.document_loaders import PyPDFLoader\n",
570 | "\n",
571 | "loader = PyPDFLoader(\"./data/attention-is-all-you-need.pdf\")\n",
572 | "pages = loader.load_and_split()\n",
573 | "pages[0]"
574 | ],
575 | "metadata": {
576 | "id": "EdpCr38O-olp"
577 | },
578 | "execution_count": null,
579 | "outputs": []
580 | },
581 | {
582 | "cell_type": "code",
583 | "source": [
584 | "# File Directory Loader\n",
585 | "from langchain.document_loaders import DirectoryLoader\n",
586 | "\n",
587 | "loader = DirectoryLoader('data', glob=\"**/*.csv\")\n",
588 | "docs = loader.load()\n",
589 | "len(docs)"
590 | ],
591 | "metadata": {
592 | "id": "dJIv52X3EbyD"
593 | },
594 | "execution_count": null,
595 | "outputs": []
596 | },
597 | {
598 | "cell_type": "code",
599 | "source": [
600 | "!pip install pytube youtube-transcript-api"
601 | ],
602 | "metadata": {
603 | "id": "080GK3Q3Iv5_"
604 | },
605 | "execution_count": null,
606 | "outputs": []
607 | },
608 | {
609 | "cell_type": "code",
610 | "source": [
611 | "# YouTube Transcripts Loader\n",
612 | "from langchain.document_loaders import YoutubeLoader\n",
613 | "\n",
614 | "loader = YoutubeLoader.from_youtube_url(\"https://www.youtube.com/watch?v=yEgHrxvLsz0\", add_video_info=True)\n",
615 | "data = loader.load()\n",
616 | "print(data)"
617 | ],
618 | "metadata": {
619 | "id": "B13lagzPGn1n"
620 | },
621 | "execution_count": null,
622 | "outputs": []
623 | },
624 | {
625 | "cell_type": "code",
626 | "source": [
627 | "!pip install google-cloud-storage"
628 | ],
629 | "metadata": {
630 | "id": "Hwab0v2IbBYb"
631 | },
632 | "execution_count": null,
633 | "outputs": []
634 | },
635 | {
636 | "cell_type": "code",
637 | "source": [
638 | "# Google Cloud Storage File Loader\n",
639 | "from langchain.document_loaders import GCSFileLoader\n",
640 | "\n",
641 | "loader = GCSFileLoader(project_name=\"langchain-gcs\", bucket=\"langchain-gcs\", blob=\"lorem-ipsum.txt\")\n",
642 | "data = loader.load()\n",
643 | "print(data)"
644 | ],
645 | "metadata": {
646 | "id": "WO7Vsn_4bB7d"
647 | },
648 | "execution_count": null,
649 | "outputs": []
650 | },
651 | {
652 | "cell_type": "markdown",
653 | "source": [
654 | "### Text Splitters"
655 | ],
656 | "metadata": {
657 | "id": "dIEPnFzw_7Rx"
658 | }
659 | },
660 | {
661 | "cell_type": "code",
662 | "source": [
663 | "# Character Text Splitter\n",
664 | "from langchain.text_splitter import CharacterTextSplitter\n",
665 | "from google.colab import files\n",
666 | "\n",
667 | "uploaded = files.upload()\n",
668 | "filename = next(iter(uploaded))\n",
669 | "text = uploaded[filename].decode(\"utf-8\")\n",
670 | "\n",
671 | "text_splitter = CharacterTextSplitter(\n",
672 | " separator = \"\\n\\n\",\n",
673 | " chunk_size = 1000,\n",
674 | " chunk_overlap = 200,\n",
675 | " length_function = len,\n",
676 | ")\n",
677 | "\n",
678 | "texts = text_splitter.create_documents([text])\n",
679 | "print(texts[0])\n",
680 | "print(texts[1])\n",
681 | "print(texts[2])"
682 | ],
683 | "metadata": {
684 | "id": "WGb0P3fS__BD"
685 | },
686 | "execution_count": null,
687 | "outputs": []
688 | },
689 | {
690 | "cell_type": "code",
691 | "source": [
692 | "# Recursive Character Text Splitter\n",
693 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
694 | "from google.colab import files\n",
695 | "\n",
696 | "uploaded = files.upload()\n",
697 | "filename = next(iter(uploaded))\n",
698 | "text = uploaded[filename].decode(\"utf-8\")\n",
699 | "\n",
700 | "text_splitter = RecursiveCharacterTextSplitter(\n",
701 | " chunk_size = 100,\n",
702 | " chunk_overlap = 20,\n",
703 | " length_function = len,\n",
704 | ")\n",
705 | "\n",
706 | "texts = text_splitter.create_documents([text])\n",
707 | "print(texts[0])\n",
708 | "print(texts[1])\n",
709 | "print(texts[2])"
710 | ],
711 | "metadata": {
712 | "id": "NmYW0cN0UhxP"
713 | },
714 | "execution_count": null,
715 | "outputs": []
716 | },
717 | {
718 | "cell_type": "markdown",
719 | "source": [
720 | "### Vector Stores"
721 | ],
722 | "metadata": {
723 | "id": "yF32kJni__hY"
724 | }
725 | },
726 | {
727 | "cell_type": "code",
728 | "source": [
729 | "!pip install chromadb tiktoken"
730 | ],
731 | "metadata": {
732 | "id": "rMtIR-tRX0Qt"
733 | },
734 | "execution_count": null,
735 | "outputs": []
736 | },
737 | {
738 | "cell_type": "code",
739 | "source": [
740 | "# Chroma Vector Store\n",
741 | "import os, tiktoken\n",
742 | "from langchain.document_loaders import TextLoader\n",
743 | "from langchain.text_splitter import CharacterTextSplitter\n",
744 | "from langchain.vectorstores import Chroma\n",
745 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
746 | "\n",
747 | "OPENAI_API_KEY = '' # @param {type:\"string\"}\n",
748 | "os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY\n",
749 | "\n",
750 | "from google.colab import files\n",
751 | "\n",
752 | "uploaded = files.upload()\n",
753 | "filename = next(iter(uploaded))\n",
754 | "\n",
755 | "loader = TextLoader(filename)\n",
756 | "data = loader.load()\n",
757 | "\n",
758 | "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
759 | "docs = text_splitter.split_documents(data)\n",
760 | "\n",
761 | "embeddings = OpenAIEmbeddings()\n",
762 | "db = Chroma.from_documents(docs, embeddings)\n",
763 | "\n",
764 | "query = \"What comes after 'Vestibulum congue convallis finibus'?\"\n",
765 | "docs = db.similarity_search(query)\n",
766 | "\n",
767 | "print(docs[0].page_content)"
768 | ],
769 | "metadata": {
770 | "id": "0XUsuto_ABze"
771 | },
772 | "execution_count": null,
773 | "outputs": []
774 | },
775 | {
776 | "cell_type": "markdown",
777 | "source": [
778 | "### Retrievers"
779 | ],
780 | "metadata": {
781 | "id": "1RpFQvgAACR6"
782 | }
783 | },
784 | {
785 | "cell_type": "code",
786 | "source": [
787 | "!pip install arxiv pymupdf"
788 | ],
789 | "metadata": {
790 | "id": "ZYC12MjFPQTo"
791 | },
792 | "execution_count": null,
793 | "outputs": []
794 | },
795 | {
796 | "cell_type": "code",
797 | "source": [
798 | "# Arxiv Retriever\n",
799 | "from langchain.retrievers import ArxivRetriever\n",
800 | "\n",
801 | "retriever = ArxivRetriever(load_max_docs=2)\n",
802 | "docs = retriever.get_relevant_documents(query='2203.15556')\n",
803 | "\n",
804 | "docs[0].metadata"
805 | ],
806 | "metadata": {
807 | "id": "mwVrbBUfAEaG"
808 | },
809 | "execution_count": null,
810 | "outputs": []
811 | },
812 | {
813 | "cell_type": "code",
814 | "source": [
815 | "!pip install wikipedia"
816 | ],
817 | "metadata": {
818 | "id": "26WMZAxdTEzs"
819 | },
820 | "execution_count": null,
821 | "outputs": []
822 | },
823 | {
824 | "cell_type": "code",
825 | "source": [
826 | "# Wikipedia Retriever\n",
827 | "from langchain.retrievers import WikipediaRetriever\n",
828 | "\n",
829 | "retriever = WikipediaRetriever()\n",
830 | "docs = retriever.get_relevant_documents(query='large language models')\n",
831 | "\n",
832 | "docs[0].metadata"
833 | ],
834 | "metadata": {
835 | "id": "-pY7LCZ8TG-8"
836 | },
837 | "execution_count": null,
838 | "outputs": []
839 | },
840 | {
841 | "cell_type": "code",
842 | "source": [
843 | "!pip install chromadb tiktoken"
844 | ],
845 | "metadata": {
846 | "id": "8LrQSBVaUFmX"
847 | },
848 | "execution_count": null,
849 | "outputs": []
850 | },
851 | {
852 | "cell_type": "code",
853 | "source": [
854 | "# Chroma Vector Store Retriever\n",
855 | "import os, tiktoken\n",
856 | "from langchain.document_loaders import TextLoader\n",
857 | "from langchain.text_splitter import CharacterTextSplitter\n",
858 | "from langchain.vectorstores import Chroma\n",
859 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
860 | "\n",
861 | "OPENAI_API_KEY = '' # @param {type:\"string\"}\n",
862 | "os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY\n",
863 | "\n",
864 | "from google.colab import files\n",
865 | "\n",
866 | "uploaded = files.upload()\n",
867 | "filename = next(iter(uploaded))\n",
868 | "\n",
869 | "loader = TextLoader(filename)\n",
870 | "data = loader.load()\n",
871 | "\n",
872 | "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
873 | "docs = text_splitter.split_documents(data)\n",
874 | "\n",
875 | "embeddings = OpenAIEmbeddings()\n",
876 | "db = Chroma.from_documents(docs, embeddings)\n",
877 | "\n",
878 | "retriever = db.as_retriever()\n",
879 | "query = \"What comes after 'Vestibulum congue convallis finibus'?\"\n",
880 | "docs = retriever.get_relevant_documents(query)\n",
881 | "\n",
882 | "print(docs[0].page_content)"
883 | ],
884 | "metadata": {
885 | "id": "S0ZiBwNKSaqS"
886 | },
887 | "execution_count": null,
888 | "outputs": []
889 | },
890 | {
891 | "cell_type": "markdown",
892 | "source": [
893 | "## Part 5: Memory"
894 | ],
895 | "metadata": {
896 | "id": "m0fTtYd30V7O"
897 | }
898 | },
899 | {
900 | "cell_type": "code",
901 | "source": [
902 | "# Store and retrieve chat messages with ChatMessageHistory\n",
903 | "from langchain.memory import ChatMessageHistory\n",
904 | "\n",
905 | "history = ChatMessageHistory()\n",
906 | "history.add_user_message(\"Hello\")\n",
907 | "history.add_ai_message(\"Hi, how can I help you?\")\n",
908 | "history.add_user_message(\"I want to write Python code.\")\n",
909 | "history.add_ai_message(\"Sure, I can help with that. What do you want to code?\")\n",
910 | "\n",
911 | "history.messages"
912 | ],
913 | "metadata": {
914 | "id": "MeseqbKK0V7P"
915 | },
916 | "execution_count": null,
917 | "outputs": []
918 | },
919 | {
920 | "cell_type": "code",
921 | "source": [
922 | "# Retrieve chat messages with ConversationBufferHistory (as a variable)\n",
923 | "from langchain.memory import ConversationBufferMemory\n",
924 | "\n",
925 | "memory = ConversationBufferMemory()\n",
926 | "memory.chat_memory.add_user_message(\"Hello\")\n",
927 | "memory.chat_memory.add_ai_message(\"Hi, how can I help you?\")\n",
928 | "memory.chat_memory.add_user_message(\"I want to write Python code.\")\n",
929 | "memory.chat_memory.add_ai_message(\"Sure, I can help with that. What do you want to code?\")\n",
930 | "\n",
931 | "memory.load_memory_variables({})"
932 | ],
933 | "metadata": {
934 | "id": "RMKPUXnKpj-u"
935 | },
936 | "execution_count": null,
937 | "outputs": []
938 | },
939 | {
940 | "cell_type": "code",
941 | "source": [
942 | "# Retrieve chat messages with ConversationBufferHistory (as a list of messages)\n",
943 | "from langchain.memory import ConversationBufferMemory\n",
944 | "\n",
945 | "memory = ConversationBufferMemory(return_messages=True)\n",
946 | "memory.chat_memory.add_user_message(\"Hello\")\n",
947 | "memory.chat_memory.add_ai_message(\"Hi, how can I help you?\")\n",
948 | "memory.chat_memory.add_user_message(\"I want to write Python code.\")\n",
949 | "memory.chat_memory.add_ai_message(\"Sure, I can help with that. What do you want to code?\")\n",
950 | "\n",
951 | "memory.load_memory_variables({})"
952 | ],
953 | "metadata": {
954 | "id": "WCY2tsblqprw"
955 | },
956 | "execution_count": null,
957 | "outputs": []
958 | },
959 | {
960 | "cell_type": "code",
961 | "source": [
962 | "# Use ConversationBufferMemory in a chain\n",
963 | "from langchain.llms.openai import OpenAI\n",
964 | "from langchain.chains import ConversationChain\n",
965 | "\n",
966 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
967 | "conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())\n",
968 | "\n",
969 | "conversation.predict(input=\"Hello\")"
970 | ],
971 | "metadata": {
972 | "id": "m2Ged561sLhm"
973 | },
974 | "execution_count": null,
975 | "outputs": []
976 | },
977 | {
978 | "cell_type": "code",
979 | "source": [
980 | "conversation.predict(input=\"I want to write Python code.\")"
981 | ],
982 | "metadata": {
983 | "id": "MS07RHCCtJSL"
984 | },
985 | "execution_count": null,
986 | "outputs": []
987 | },
988 | {
989 | "cell_type": "code",
990 | "source": [
991 | "# Store a conversation summary with ConversationSummaryMemory\n",
992 | "from langchain.llms.openai import OpenAI\n",
993 | "from langchain.memory import ChatMessageHistory, ConversationSummaryMemory\n",
994 | "\n",
995 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
996 | "memory = ConversationSummaryMemory(llm=llm)\n",
997 | "memory.save_context({\"input\": \"Hello\"}, {\"output\": \"Hi, how can I help you?\"})\n",
998 | "\n",
999 | "memory.load_memory_variables({})"
1000 | ],
1001 | "metadata": {
1002 | "id": "2f9H0r1XuBAm"
1003 | },
1004 | "execution_count": null,
1005 | "outputs": []
1006 | },
1007 | {
1008 | "cell_type": "code",
1009 | "source": [
1010 | "conversation.predict(input=\"I want to write Python code.\")"
1011 | ],
1012 | "metadata": {
1013 | "id": "Ne2bNr9_xNFj"
1014 | },
1015 | "execution_count": null,
1016 | "outputs": []
1017 | },
1018 | {
1019 | "cell_type": "code",
1020 | "source": [
1021 | "# Use ConversationSummaryMemory in a chain\n",
1022 | "from langchain.llms.openai import OpenAI\n",
1023 | "from langchain.chains import ConversationChain\n",
1024 | "\n",
1025 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
1026 | "memory = ConversationSummaryMemory(llm=llm)\n",
1027 | "conversation = ConversationChain(llm=llm, verbose=True, memory=memory)\n",
1028 | "\n",
1029 | "conversation.predict(input=\"Hello\")"
1030 | ],
1031 | "metadata": {
1032 | "id": "jxmlmYBhwzPT"
1033 | },
1034 | "execution_count": null,
1035 | "outputs": []
1036 | },
1037 | {
1038 | "cell_type": "code",
1039 | "source": [
1040 | "conversation.predict(input=\"I want to write Python code.\")"
1041 | ],
1042 | "metadata": {
1043 | "id": "FW8h7QFFoLN-"
1044 | },
1045 | "execution_count": null,
1046 | "outputs": []
1047 | },
1048 | {
1049 | "cell_type": "code",
1050 | "source": [
1051 | "conversation.predict(input=\"No, I'm a beginner.\")"
1052 | ],
1053 | "metadata": {
1054 | "id": "gzl4GUCmxaEp"
1055 | },
1056 | "execution_count": null,
1057 | "outputs": []
1058 | },
1059 | {
1060 | "cell_type": "code",
1061 | "source": [
1062 | "# # Memory management using Motorhead (managed)\n",
1063 | "from langchain.memory.motorhead_memory import MotorheadMemory\n",
1064 | "from langchain import OpenAI, LLMChain, PromptTemplate\n",
1065 | "\n",
1066 | "template = \"\"\"You are a chatbot having a conversation with a human.\n",
1067 | "\n",
1068 | "{chat_history}\n",
1069 | "Human: {human_input}\n",
1070 | "AI:\"\"\"\n",
1071 | "\n",
1072 | "prompt = PromptTemplate(input_variables=[\"chat_history\", \"human_input\"], template=template)\n",
1073 | "\n",
1074 | "memory = MotorheadMemory(\n",
1075 | " api_key=\"API_KEY\",\n",
1076 | " client_id=\"CLIENT_ID\",\n",
1077 | " session_id=\"langchain-1\",\n",
1078 | " memory_key=\"chat_history\",\n",
1079 | ")\n",
1080 | "\n",
1081 | "await memory.init();\n",
1082 | "\n",
1083 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
1084 | "llm_chain = LLMChain(llm=llm, prompt=prompt, memory=memory)\n",
1085 | "\n",
1086 | "llm_chain.run(\"Hello, I'm Motorhead.\")"
1087 | ],
1088 | "metadata": {
1089 | "id": "WGTiwbGVYJsP"
1090 | },
1091 | "execution_count": null,
1092 | "outputs": []
1093 | },
1094 | {
1095 | "cell_type": "code",
1096 | "source": [
1097 | "llm_chain.run(\"What's my name?\")"
1098 | ],
1099 | "metadata": {
1100 | "id": "FT0GzIbCZBeb"
1101 | },
1102 | "execution_count": null,
1103 | "outputs": []
1104 | },
1105 | {
1106 | "cell_type": "code",
1107 | "source": [
1108 | "# Memory management using Motorhead (self-hosted)\n",
1109 | "from langchain import OpenAI, LLMChain, PromptTemplate\n",
1110 | "from langchain.memory.motorhead_memory import MotorheadMemory\n",
1111 | "\n",
1112 | "template = \"\"\"You are a chatbot having a conversation with a human.\n",
1113 | "\n",
1114 | "{chat_history}\n",
1115 | "Human: {human_input}\n",
1116 | "AI:\"\"\"\n",
1117 | "\n",
1118 | "prompt = PromptTemplate(input_variables=[\"chat_history\", \"human_input\"], template=template)\n",
1119 | "\n",
1120 | "memory = MotorheadMemory(\n",
1121 | " url=\"URL\",\n",
1122 | " session_id=\"langchain-1\",\n",
1123 | " memory_key=\"chat_history\",\n",
1124 | ")\n",
1125 | "\n",
1126 | "await memory.init();\n",
1127 | "\n",
1128 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
1129 | "llm_chain = LLMChain(llm=llm, prompt=prompt, memory=memory)\n",
1130 | "\n",
1131 | "llm_chain.run(\"Hello, I'm Motorhead.\")"
1132 | ],
1133 | "metadata": {
1134 | "id": "TOnq4fuXg1qF"
1135 | },
1136 | "execution_count": null,
1137 | "outputs": []
1138 | },
1139 | {
1140 | "cell_type": "code",
1141 | "source": [
1142 | "llm_chain.run(\"What's my name?\")"
1143 | ],
1144 | "metadata": {
1145 | "id": "Po6Ce7HRipd1"
1146 | },
1147 | "execution_count": null,
1148 | "outputs": []
1149 | }
1150 | ]
1151 | }
--------------------------------------------------------------------------------
/notebooks/langchain_decoded_1_models.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": []
7 | },
8 | "kernelspec": {
9 | "name": "python3",
10 | "display_name": "Python 3"
11 | },
12 | "language_info": {
13 | "name": "python"
14 | }
15 | },
16 | "cells": [
17 | {
18 | "cell_type": "markdown",
19 | "source": [
20 | "# **LangChain Decoded**"
21 | ],
22 | "metadata": {
23 | "id": "JQxt18IfZR40"
24 | }
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "source": [
29 | "## Getting Started"
30 | ],
31 | "metadata": {
32 | "id": "zgFXxUsp5pbu"
33 | }
34 | },
35 | {
36 | "cell_type": "code",
37 | "source": [
38 | "# Install the LangChain package\n",
39 | "!pip install langchain"
40 | ],
41 | "metadata": {
42 | "id": "fRUe2wO95xoR"
43 | },
44 | "execution_count": null,
45 | "outputs": []
46 | },
47 | {
48 | "cell_type": "code",
49 | "source": [
50 | "# Install the OpenAI package\n",
51 | "!pip install openai"
52 | ],
53 | "metadata": {
54 | "id": "T7TwICBe9yy4"
55 | },
56 | "execution_count": null,
57 | "outputs": []
58 | },
59 | {
60 | "cell_type": "code",
61 | "source": [
62 | "# Configure the API key\n",
63 | "import os\n",
64 | "\n",
65 | "openai_api_key = os.environ.get('OPENAI_API_KEY', 'sk-XXX')"
66 | ],
67 | "metadata": {
68 | "id": "I-ZUDlva6m3Y"
69 | },
70 | "execution_count": null,
71 | "outputs": []
72 | },
73 | {
74 | "cell_type": "markdown",
75 | "source": [
76 | "## Part 1: Models"
77 | ],
78 | "metadata": {
79 | "id": "d_YGHcob5Rhs"
80 | }
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "source": [
85 | "### Large Language Models (LLMs)"
86 | ],
87 | "metadata": {
88 | "id": "3MvgsUgi-yQD"
89 | }
90 | },
91 | {
92 | "cell_type": "code",
93 | "source": [
94 | "# Use the OpenAI LLM wrapper and text-davinci-003 model\n",
95 | "from langchain.llms import OpenAI\n",
96 | "\n",
97 | "llm = OpenAI(model_name=\"text-davinci-003\", openai_api_key=openai_api_key)"
98 | ],
99 | "metadata": {
100 | "id": "9gqrb7Rh53uF"
101 | },
102 | "execution_count": null,
103 | "outputs": []
104 | },
105 | {
106 | "cell_type": "code",
107 | "source": [
108 | "# Generate a simple text response\n",
109 | "llm(\"Why is the sky blue?\")"
110 | ],
111 | "metadata": {
112 | "id": "1oQ3UzyEAnsL"
113 | },
114 | "execution_count": null,
115 | "outputs": []
116 | },
117 | {
118 | "cell_type": "code",
119 | "source": [
120 | "# Show the generation output instead\n",
121 | "llm_result = llm.generate([\"Why is the sky blue?\"])\n",
122 | "llm_result.llm_output"
123 | ],
124 | "metadata": {
125 | "id": "wpYhYbKOC1_L"
126 | },
127 | "execution_count": null,
128 | "outputs": []
129 | },
130 | {
131 | "cell_type": "code",
132 | "source": [
133 | "# Track OpenAI token usage for a single API call\n",
134 | "from langchain.callbacks import get_openai_callback\n",
135 | "\n",
136 | "with get_openai_callback() as cb:\n",
137 | " result = llm(\"Why is the sky blue?\")\n",
138 | "\n",
139 | " print(f\"Total Tokens: {cb.total_tokens}\")\n",
140 | " print(f\"\\tPrompt Tokens: {cb.prompt_tokens}\")\n",
141 | " print(f\"\\tCompletion Tokens: {cb.completion_tokens}\")\n",
142 | " print(f\"Total Cost (USD): ${cb.total_cost}\")"
143 | ],
144 | "metadata": {
145 | "id": "LsE4Euf2DeW9"
146 | },
147 | "execution_count": null,
148 | "outputs": []
149 | },
150 | {
151 | "cell_type": "markdown",
152 | "source": [
153 | "### Chat Models"
154 | ],
155 | "metadata": {
156 | "id": "vpCYayVyWqyr"
157 | }
158 | },
159 | {
160 | "cell_type": "code",
161 | "source": [
162 | "# Define system message for the chatbot, and pass human message\n",
163 | "from langchain.chat_models import ChatOpenAI\n",
164 | "from langchain.schema import (\n",
165 | " AIMessage,\n",
166 | " HumanMessage,\n",
167 | " SystemMessage\n",
168 | ")\n",
169 | "\n",
170 | "chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)\n",
171 | "\n",
172 | "messages = [\n",
173 | " SystemMessage(content=\"You are a helpful assistant that translates English to Spanish.\"),\n",
174 | " HumanMessage(content=\"Translate this sentence from English to Spanish. I'm hungry, give me food.\")\n",
175 | "]\n",
176 | "\n",
177 | "chat(messages)"
178 | ],
179 | "metadata": {
180 | "id": "Bm_9BTO1Wsz4"
181 | },
182 | "execution_count": null,
183 | "outputs": []
184 | }
185 | ]
186 | }
187 |
--------------------------------------------------------------------------------
/notebooks/langchain_decoded_2_embeddings.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": []
7 | },
8 | "kernelspec": {
9 | "name": "python3",
10 | "display_name": "Python 3"
11 | },
12 | "language_info": {
13 | "name": "python"
14 | }
15 | },
16 | "cells": [
17 | {
18 | "cell_type": "markdown",
19 | "source": [
20 | "# **LangChain Decoded**"
21 | ],
22 | "metadata": {
23 | "id": "JQxt18IfZR40"
24 | }
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "source": [
29 | "## Getting Started"
30 | ],
31 | "metadata": {
32 | "id": "zgFXxUsp5pbu"
33 | }
34 | },
35 | {
36 | "cell_type": "code",
37 | "source": [
38 | "# Install the LangChain package\n",
39 | "!pip install langchain"
40 | ],
41 | "metadata": {
42 | "id": "fRUe2wO95xoR"
43 | },
44 | "execution_count": null,
45 | "outputs": []
46 | },
47 | {
48 | "cell_type": "code",
49 | "source": [
50 | "# Install the OpenAI package\n",
51 | "!pip install openai"
52 | ],
53 | "metadata": {
54 | "id": "T7TwICBe9yy4"
55 | },
56 | "execution_count": null,
57 | "outputs": []
58 | },
59 | {
60 | "cell_type": "code",
61 | "source": [
62 | "# Configure the API key\n",
63 | "import os\n",
64 | "\n",
65 | "openai_api_key = os.environ.get('OPENAI_API_KEY', 'sk-XXX')"
66 | ],
67 | "metadata": {
68 | "id": "I-ZUDlva6m3Y"
69 | },
70 | "execution_count": null,
71 | "outputs": []
72 | },
73 | {
74 | "cell_type": "markdown",
75 | "source": [
76 | "## Part 2: Embeddings"
77 | ],
78 | "metadata": {
79 | "id": "oH043Uvlr95a"
80 | }
81 | },
82 | {
83 | "cell_type": "code",
84 | "source": [
85 | "# Use OpenAI text embeddings for a text input\n",
86 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
87 | "\n",
88 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
89 | "\n",
90 | "text = \"This is a sample query.\"\n",
91 | "\n",
92 | "query_result = embeddings.embed_query(text)\n",
93 | "print(query_result)\n",
94 | "print(len(query_result))"
95 | ],
96 | "metadata": {
97 | "id": "ozw7kkLosUqy"
98 | },
99 | "execution_count": null,
100 | "outputs": []
101 | },
102 | {
103 | "cell_type": "code",
104 | "source": [
105 | "# Use OpenAI text embeddings for multiple text/document inputs\n",
106 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
107 | "\n",
108 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
109 | "\n",
110 |         "texts = [\"This is a sample document.\", \"This is another sample document.\", \"This is yet another sample document.\"]\n",
111 |         "\n",
112 |         "doc_result = embeddings.embed_documents(texts)\n",
113 | "print(doc_result)\n",
114 | "print(len(doc_result))"
115 | ],
116 | "metadata": {
117 | "id": "NRH8asFsMtZB"
118 | },
119 | "execution_count": null,
120 | "outputs": []
121 | },
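122 |     {
123 |       "cell_type": "markdown",
124 |       "source": [
125 |         "A quick way to see what these vectors are good for is to compare them. The cell below is a minimal sketch: it assumes `query_result` and `doc_result` from the cells above are still in scope, and uses NumPy (pre-installed on Colab) to compute cosine similarity."
126 |       ],
127 |       "metadata": {}
128 |     },
129 |     {
130 |       "cell_type": "code",
131 |       "source": [
132 |         "# Compare the query embedding with each document embedding\n",
133 |         "# (a minimal sketch; assumes query_result and doc_result from the cells above are in scope)\n",
134 |         "import numpy as np\n",
135 |         "\n",
136 |         "def cosine_similarity(a, b):\n",
137 |         "    # Cosine similarity: dot product divided by the product of the vector norms\n",
138 |         "    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))\n",
139 |         "\n",
140 |         "for i, doc_embedding in enumerate(doc_result):\n",
141 |         "    print(f\"Similarity with document {i}: {cosine_similarity(query_result, doc_embedding):.4f}\")"
142 |       ],
143 |       "metadata": {},
144 |       "execution_count": null,
145 |       "outputs": []
146 |     },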
122 | {
123 | "cell_type": "code",
124 | "source": [
125 | "# Use fake embeddings to test your pipeline\n",
126 | "from langchain.embeddings import FakeEmbeddings\n",
127 | "\n",
128 | "embeddings = FakeEmbeddings(size=1481)\n",
129 | "\n",
130 | "text = \"This is a sample query.\"\n",
131 | "\n",
132 | "query_result = embeddings.embed_query(text)\n",
133 | "print(query_result)\n",
134 | "print(len(query_result))"
135 | ],
136 | "metadata": {
137 | "id": "-1VMjbchOZZG"
138 | },
139 | "execution_count": null,
140 | "outputs": []
141 | },
142 | {
143 | "cell_type": "code",
144 | "source": [
145 |         "# A request with context length > 8191 tokens throws an error\n",
146 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
147 | "\n",
148 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
149 | "\n",
150 | "long_text = 'Hello ' * 10000\n",
151 | "\n",
152 | "query_result = embeddings.embed_query(long_text)\n",
153 | "print(query_result)"
154 | ],
155 | "metadata": {
156 | "id": "9SYhZJyQTiZZ"
157 | },
158 | "execution_count": null,
159 | "outputs": []
160 | },
161 | {
162 | "cell_type": "code",
163 | "source": [
164 | "!pip install tiktoken"
165 | ],
166 | "metadata": {
167 | "id": "t0blTkHpVOjl"
168 | },
169 | "execution_count": null,
170 | "outputs": []
171 | },
172 | {
173 | "cell_type": "code",
174 | "source": [
175 | "# Truncate input text length using tiktoken\n",
176 | "import tiktoken\n",
177 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
178 | "\n",
179 | "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
180 | "\n",
181 | "max_tokens = 8191\n",
182 | "encoding_name = 'cl100k_base'\n",
183 | "\n",
184 | "long_text = 'Hello ' * 10000\n",
185 | "\n",
186 | "# Tokenize the input text before truncating it\n",
187 | "encoding = tiktoken.get_encoding(encoding_name)\n",
188 | "tokens = encoding.encode(long_text)[:max_tokens]\n",
189 | "\n",
190 |         "# Convert the tokens back to a string before embedding\n",
191 | "truncated_text = encoding.decode(tokens)\n",
192 | "\n",
193 | "query_result = embeddings.embed_query(truncated_text)\n",
194 | "print(query_result)\n",
195 | "print(len(query_result))"
196 | ],
197 | "metadata": {
198 | "id": "reeAzDpqXXoC"
199 | },
200 | "execution_count": null,
201 | "outputs": []
202 | }
203 | ]
204 | }
--------------------------------------------------------------------------------
/notebooks/langchain_decoded_3_prompts.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": []
7 | },
8 | "kernelspec": {
9 | "name": "python3",
10 | "display_name": "Python 3"
11 | },
12 | "language_info": {
13 | "name": "python"
14 | }
15 | },
16 | "cells": [
17 | {
18 | "cell_type": "markdown",
19 | "source": [
20 | "# **LangChain Decoded**"
21 | ],
22 | "metadata": {
23 | "id": "JQxt18IfZR40"
24 | }
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "source": [
29 | "## Getting Started"
30 | ],
31 | "metadata": {
32 | "id": "zgFXxUsp5pbu"
33 | }
34 | },
35 | {
36 | "cell_type": "code",
37 | "source": [
38 | "# Install the LangChain package\n",
39 | "!pip install langchain"
40 | ],
41 | "metadata": {
42 | "id": "fRUe2wO95xoR"
43 | },
44 | "execution_count": null,
45 | "outputs": []
46 | },
47 | {
48 | "cell_type": "code",
49 | "source": [
50 | "# Install the OpenAI package\n",
51 | "!pip install openai"
52 | ],
53 | "metadata": {
54 | "id": "T7TwICBe9yy4"
55 | },
56 | "execution_count": null,
57 | "outputs": []
58 | },
59 | {
60 | "cell_type": "code",
61 | "source": [
62 | "# Configure the API key\n",
63 | "import os\n",
64 | "\n",
65 | "openai_api_key = os.environ.get('OPENAI_API_KEY', 'sk-XXX')"
66 | ],
67 | "metadata": {
68 | "id": "I-ZUDlva6m3Y"
69 | },
70 | "execution_count": 4,
71 | "outputs": []
72 | },
73 | {
74 | "cell_type": "markdown",
75 | "source": [
76 | "## Part 3: Prompts"
77 | ],
78 | "metadata": {
79 | "id": "U8kUt5Jm5Y6L"
80 | }
81 | },
82 | {
83 | "cell_type": "code",
84 | "source": [
85 |         "# Ask the LLM about a recent event/occurrence (beyond the model's training cutoff)\n",
86 | "from langchain.llms.openai import OpenAI\n",
87 | "\n",
88 | "llm = OpenAI(model_name='text-davinci-003', openai_api_key=openai_api_key)\n",
89 | "\n",
90 | "print(llm(\"What is LangChain useful for? Answer in one sentence.\"))"
91 | ],
92 | "metadata": {
93 | "id": "GN9NfbApagwW"
94 | },
95 | "execution_count": null,
96 | "outputs": []
97 | },
98 | {
99 | "cell_type": "code",
100 | "source": [
101 | "# Ask the same question again, but with relevant context\n",
102 |         "prompt = \"\"\"You are a helpful assistant, who can explain concepts in an easy-to-understand manner. Answer the following question succinctly.\n",
103 | " Context: There are six main areas that LangChain is designed to help with. These are, in increasing order of complexity:\n",
104 | " LLMs and Prompts: This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.\n",
105 | " Chains: Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.\n",
106 | " Data Augmented Generation: Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.\n",
107 | " Agents: Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.\n",
108 | " Memory: Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.\n",
109 | " Evaluation: Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.\n",
110 | " Question: What is LangChain useful for?\n",
111 | " Answer: \"\"\"\n",
112 | "\n",
113 | "print(llm(prompt))"
114 | ],
115 | "metadata": {
116 | "id": "lHi9-vCAasK5"
117 | },
118 | "execution_count": null,
119 | "outputs": []
120 | },
121 | {
122 | "cell_type": "code",
123 | "source": [
124 | "# Use a template to structure the prompt\n",
125 | "from langchain import PromptTemplate\n",
126 | "\n",
127 |         "template = \"\"\"You are a helpful assistant, who is good at general knowledge trivia. Answer the following question succinctly.\n",
128 | " Question: {question}\n",
129 | " Answer:\"\"\"\n",
130 | "\n",
131 | "prompt = PromptTemplate(template=template, input_variables=['question'])\n",
132 | "\n",
133 | "question = \"Who won the first football World Cup?\"\n",
134 | "\n",
135 |         "print(llm(prompt.format(question=question)))"
136 | ],
137 | "metadata": {
138 | "id": "lOwGPXanZevQ"
139 | },
140 | "execution_count": null,
141 | "outputs": []
142 | },
143 | {
144 | "cell_type": "code",
145 | "source": [
146 | "# Use a chain to execute the prompt\n",
147 | "from langchain.chains import LLMChain\n",
148 | "\n",
149 | "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
150 | "\n",
151 | "print(llm_chain.run(question))"
152 | ],
153 | "metadata": {
154 | "id": "E4e782fpa-av"
155 | },
156 | "execution_count": null,
157 | "outputs": []
158 | },
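159 |     {
160 |       "cell_type": "markdown",
161 |       "source": [
162 |         "Chains also accept batches of inputs. The sketch below assumes the `llm_chain` from the previous cell is in scope; `apply` runs the chain once per input dictionary and returns one output dictionary per input."
163 |       ],
164 |       "metadata": {}
165 |     },
166 |     {
167 |       "cell_type": "code",
168 |       "source": [
169 |         "# Run the same chain over several questions in one call\n",
170 |         "# (a minimal sketch; assumes llm_chain from the previous cell is in scope)\n",
171 |         "questions = [\n",
172 |         "    {\"question\": \"Who won the first football World Cup?\"},\n",
173 |         "    {\"question\": \"Which country has won the most World Cups?\"}\n",
174 |         "]\n",
175 |         "\n",
176 |         "for result in llm_chain.apply(questions):\n",
177 |         "    print(result[\"text\"])"
178 |       ],
179 |       "metadata": {},
180 |       "execution_count": null,
181 |       "outputs": []
182 |     },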
159 | {
160 | "cell_type": "code",
161 | "source": [
162 | "# Save prompt template to JSON file\n",
163 | "prompt.save(\"myprompt.json\")\n",
164 | "\n",
165 | "# Load prompt template from JSON file\n",
166 | "from langchain.prompts import load_prompt\n",
167 | "\n",
168 | "saved_prompt = load_prompt(\"myprompt.json\")\n",
169 | "assert prompt == saved_prompt\n",
170 | "\n",
171 |         "print(llm(saved_prompt.format(question=question)))"
172 | ],
173 | "metadata": {
174 | "id": "BGgfhiQ692V7"
175 | },
176 | "execution_count": null,
177 | "outputs": []
178 | },
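179 |     {
180 |       "cell_type": "markdown",
181 |       "source": [
182 |         "If you are curious what was serialized, the file can be inspected directly (a shell command; assumes the cell above has already written `myprompt.json`)."
183 |       ],
184 |       "metadata": {}
185 |     },
186 |     {
187 |       "cell_type": "code",
188 |       "source": [
189 |         "# Inspect the serialized prompt template\n",
190 |         "!cat myprompt.json"
191 |       ],
192 |       "metadata": {},
193 |       "execution_count": null,
194 |       "outputs": []
195 |     },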
179 | {
180 | "cell_type": "code",
181 | "source": [
182 | "# Guide the model using few shot examples in the prompt\n",
183 | "from langchain import PromptTemplate, FewShotPromptTemplate\n",
184 | "\n",
185 | "examples = [\n",
186 | " { \"question\": \"How can we extend our lifespan?\", \n",
187 | " \"answer\": \"Just freeze yourself and wait for technology to catch up.\"},\n",
188 | " { \"question\": \"Does red wine help you live longer?\", \n",
189 | " \"answer\": \"I don't know about that, but it does make the time pass more quickly.\"},\n",
190 | " { \"question\": \"How can we slow down the aging process?\", \n",
191 | " \"answer\": \"Simple, just stop having birthdays.\"}\n",
192 | "]\n",
193 | "\n",
194 | "template = \"\"\"\n",
195 | " Question: {question}\n",
196 | " Answer: {answer}\n",
197 | " \"\"\"\n",
198 | "\n",
199 | "prompt = PromptTemplate(input_variables=[\"question\", \"answer\"], template=template)\n",
200 | "\n",
201 | "few_shot_prompt = FewShotPromptTemplate(\n",
202 | " examples=examples,\n",
203 | " example_prompt=prompt,\n",
204 | " prefix=\"Respond with a funny and witty remark.\",\n",
205 | " suffix=\"Question: {question}\\nAnswer:\",\n",
206 | " input_variables=[\"question\"],\n",
207 | " example_separator=\"\"\n",
208 | ")\n",
209 | "\n",
210 | "print(few_shot_prompt.format(question=\"How can I eat healthy?\"))\n",
211 | "print(llm(few_shot_prompt.format(question=\"How can I eat healthy?\")))"
212 | ],
213 | "metadata": {
214 | "id": "IzKztT8_AIiI"
215 | },
216 | "execution_count": null,
217 | "outputs": []
218 | },
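219 |     {
220 |       "cell_type": "markdown",
221 |       "source": [
222 |         "As the example list grows, you may not want every example in every prompt. The sketch below uses LangChain's `LengthBasedExampleSelector` to keep only as many examples as fit a rough word budget; it assumes the `examples` and `prompt` variables from the previous cell are in scope."
223 |       ],
224 |       "metadata": {}
225 |     },
226 |     {
227 |       "cell_type": "code",
228 |       "source": [
229 |         "# Select only as many few-shot examples as fit a length budget\n",
230 |         "# (a minimal sketch; assumes examples and prompt from the previous cell are in scope)\n",
231 |         "from langchain.prompts.example_selector import LengthBasedExampleSelector\n",
232 |         "\n",
233 |         "example_selector = LengthBasedExampleSelector(\n",
234 |         "    examples=examples,\n",
235 |         "    example_prompt=prompt,\n",
236 |         "    max_length=40  # approximate word budget for the selected examples\n",
237 |         ")\n",
238 |         "\n",
239 |         "dynamic_prompt = FewShotPromptTemplate(\n",
240 |         "    example_selector=example_selector,\n",
241 |         "    example_prompt=prompt,\n",
242 |         "    prefix=\"Respond with a funny and witty remark.\",\n",
243 |         "    suffix=\"Question: {question}\\nAnswer:\",\n",
244 |         "    input_variables=[\"question\"]\n",
245 |         ")\n",
246 |         "\n",
247 |         "print(dynamic_prompt.format(question=\"How can I eat healthy?\"))"
248 |       ],
249 |       "metadata": {},
250 |       "execution_count": null,
251 |       "outputs": []
252 |     },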
219 | {
220 | "cell_type": "code",
221 | "source": [
222 | "# Use prompt templates with chat models\n",
223 | "from langchain.chat_models import ChatOpenAI\n",
224 | "from langchain.prompts import (\n",
225 | " ChatPromptTemplate,\n",
226 | " PromptTemplate,\n",
227 | " SystemMessagePromptTemplate,\n",
228 | " AIMessagePromptTemplate,\n",
229 | " HumanMessagePromptTemplate,\n",
230 | ")\n",
231 | "\n",
232 | "chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)\n",
233 | "\n",
234 | "system_message=\"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
235 | "system_message_prompt = SystemMessagePromptTemplate.from_template(system_message)\n",
236 | "\n",
237 | "human_message=\"{text}\"\n",
238 | "human_message_prompt = HumanMessagePromptTemplate.from_template(human_message)\n",
239 | "\n",
240 | "chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])\n",
241 | "\n",
242 | "messages = chat_prompt.format_prompt(input_language=\"English\", output_language=\"Spanish\", text=\"I'm hungry, give me food.\").to_messages()\n",
243 | "\n",
244 | "chat(messages)"
245 | ],
246 | "metadata": {
247 | "id": "uIXTlQ4dKNL-"
248 | },
249 | "execution_count": null,
250 | "outputs": []
251 | }
252 | ]
253 | }
--------------------------------------------------------------------------------
/notebooks/langchain_decoded_4_indexes.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": []
7 | },
8 | "kernelspec": {
9 | "name": "python3",
10 | "display_name": "Python 3"
11 | },
12 | "language_info": {
13 | "name": "python"
14 | }
15 | },
16 | "cells": [
17 | {
18 | "cell_type": "markdown",
19 | "source": [
20 | "# **LangChain Decoded**"
21 | ],
22 | "metadata": {
23 | "id": "JQxt18IfZR40"
24 | }
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "source": [
29 | "## Getting Started"
30 | ],
31 | "metadata": {
32 | "id": "zgFXxUsp5pbu"
33 | }
34 | },
35 | {
36 | "cell_type": "code",
37 | "source": [
38 | "# Install the LangChain package\n",
39 | "!pip install langchain"
40 | ],
41 | "metadata": {
42 | "id": "fRUe2wO95xoR"
43 | },
44 | "execution_count": null,
45 | "outputs": []
46 | },
47 | {
48 | "cell_type": "code",
49 | "source": [
50 | "# Install the OpenAI package\n",
51 | "!pip install openai"
52 | ],
53 | "metadata": {
54 | "id": "T7TwICBe9yy4"
55 | },
56 | "execution_count": null,
57 | "outputs": []
58 | },
59 | {
60 | "cell_type": "code",
61 | "source": [
62 | "# Configure the API key\n",
63 | "import os\n",
64 | "\n",
65 | "openai_api_key = os.environ.get('OPENAI_API_KEY', 'sk-XXX')"
66 | ],
67 | "metadata": {
68 | "id": "I-ZUDlva6m3Y"
69 | },
70 | "execution_count": null,
71 | "outputs": []
72 | },
73 | {
74 | "cell_type": "markdown",
75 | "source": [
76 | "## Part 4: Indexes"
77 | ],
78 | "metadata": {
79 | "id": "MTkGMnWG_lyh"
80 | }
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "source": [
85 | "### Document Loaders"
86 | ],
87 | "metadata": {
88 | "id": "Pc8d5uA7_4u_"
89 | }
90 | },
91 | {
92 | "cell_type": "code",
93 | "source": [
94 | "!pip install unstructured tabulate pdf2image pytesseract"
95 | ],
96 | "metadata": {
97 | "id": "i6zYaxTl_rlU"
98 | },
99 | "execution_count": null,
100 | "outputs": []
101 | },
102 | {
103 | "cell_type": "code",
104 | "source": [
105 | "# URL Loader\n",
106 | "from langchain.document_loaders import UnstructuredURLLoader\n",
107 | "\n",
108 | "urls = [\"https://alphasec.io/summarize-text-with-langchain-and-openai\"]\n",
109 | "loader = UnstructuredURLLoader(urls=urls)\n",
110 | "data = loader.load()\n",
111 | "print(data)"
112 | ],
113 | "metadata": {
114 | "id": "LiLWQ_tF_686"
115 | },
116 | "execution_count": null,
117 | "outputs": []
118 | },
119 | {
120 | "cell_type": "code",
121 | "source": [
122 | "!pip install pypdf"
123 | ],
124 | "metadata": {
125 | "id": "U_uGRyko_Hbk"
126 | },
127 | "execution_count": null,
128 | "outputs": []
129 | },
130 | {
131 | "cell_type": "code",
132 | "source": [
133 | "# PDF Loader\n",
134 | "from langchain.document_loaders import PyPDFLoader\n",
135 | "\n",
136 | "loader = PyPDFLoader(\"./data/attention-is-all-you-need.pdf\")\n",
137 | "pages = loader.load_and_split()\n",
138 | "pages[0]"
139 | ],
140 | "metadata": {
141 | "id": "EdpCr38O-olp"
142 | },
143 | "execution_count": null,
144 | "outputs": []
145 | },
146 | {
147 | "cell_type": "code",
148 | "source": [
149 | "# File Directory Loader\n",
150 | "from langchain.document_loaders import DirectoryLoader\n",
151 | "\n",
152 | "loader = DirectoryLoader('data', glob=\"**/*.csv\")\n",
153 | "docs = loader.load()\n",
154 | "len(docs)"
155 | ],
156 | "metadata": {
157 | "id": "dJIv52X3EbyD"
158 | },
159 | "execution_count": null,
160 | "outputs": []
161 | },
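162 |     {
163 |       "cell_type": "markdown",
164 |       "source": [
165 |         "The directory loader above parses the CSV through `unstructured`; there is also a dedicated CSV loader that yields one document per row. A minimal sketch, assuming the repository's `data/deniro.csv` file is available at this relative path."
166 |       ],
167 |       "metadata": {}
168 |     },
169 |     {
170 |       "cell_type": "code",
171 |       "source": [
172 |         "# CSV Loader (one document per row)\n",
173 |         "# (a minimal sketch; assumes data/deniro.csv is available at this relative path)\n",
174 |         "from langchain.document_loaders import CSVLoader\n",
175 |         "\n",
176 |         "loader = CSVLoader(file_path=\"./data/deniro.csv\")\n",
177 |         "docs = loader.load()\n",
178 |         "print(len(docs))\n",
179 |         "print(docs[0].page_content)"
180 |       ],
181 |       "metadata": {},
182 |       "execution_count": null,
183 |       "outputs": []
184 |     },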
162 | {
163 | "cell_type": "code",
164 | "source": [
165 | "!pip install pytube youtube-transcript-api"
166 | ],
167 | "metadata": {
168 | "id": "080GK3Q3Iv5_"
169 | },
170 | "execution_count": null,
171 | "outputs": []
172 | },
173 | {
174 | "cell_type": "code",
175 | "source": [
176 | "# YouTube Transcripts Loader\n",
177 | "from langchain.document_loaders import YoutubeLoader\n",
178 | "\n",
179 | "loader = YoutubeLoader.from_youtube_url(\"https://www.youtube.com/watch?v=yEgHrxvLsz0\", add_video_info=True)\n",
180 | "data = loader.load()\n",
181 | "print(data)"
182 | ],
183 | "metadata": {
184 | "id": "B13lagzPGn1n"
185 | },
186 | "execution_count": null,
187 | "outputs": []
188 | },
189 | {
190 | "cell_type": "code",
191 | "source": [
192 | "!pip install google-cloud-storage"
193 | ],
194 | "metadata": {
195 | "id": "nte5y7xxa9We"
196 | },
197 | "execution_count": null,
198 | "outputs": []
199 | },
200 | {
201 | "cell_type": "code",
202 | "source": [
203 | "# Google Cloud Storage File Loader\n",
204 | "from langchain.document_loaders import GCSFileLoader\n",
205 | "\n",
206 | "loader = GCSFileLoader(project_name=\"langchain-gcs\", bucket=\"langchain-gcs\", blob=\"lorem-ipsum.txt\")\n",
207 | "data = loader.load()\n",
208 | "print(data)"
209 | ],
210 | "metadata": {
211 | "id": "kmpul4UqbXuJ"
212 | },
213 | "execution_count": null,
214 | "outputs": []
215 | },
216 | {
217 | "cell_type": "markdown",
218 | "source": [
219 | "### Text Splitters"
220 | ],
221 | "metadata": {
222 | "id": "dIEPnFzw_7Rx"
223 | }
224 | },
225 | {
226 | "cell_type": "code",
227 | "source": [
228 | "# Character Text Splitter\n",
229 | "from langchain.text_splitter import CharacterTextSplitter\n",
230 | "from google.colab import files\n",
231 | "\n",
232 | "uploaded = files.upload()\n",
233 | "filename = next(iter(uploaded))\n",
234 | "text = uploaded[filename].decode(\"utf-8\")\n",
235 | "\n",
236 | "text_splitter = CharacterTextSplitter( \n",
237 | " separator = \"\\n\\n\",\n",
238 | " chunk_size = 1000,\n",
239 | " chunk_overlap = 200,\n",
240 | " length_function = len,\n",
241 | ")\n",
242 | "\n",
243 | "texts = text_splitter.create_documents([text])\n",
244 | "print(texts[0])\n",
245 | "print(texts[1])\n",
246 | "print(texts[2])"
247 | ],
248 | "metadata": {
249 | "id": "WGb0P3fS__BD"
250 | },
251 | "execution_count": null,
252 | "outputs": []
253 | },
254 | {
255 | "cell_type": "code",
256 | "source": [
257 | "# Recursive Character Text Splitter\n",
258 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
259 | "from google.colab import files\n",
260 | "\n",
261 | "uploaded = files.upload()\n",
262 | "filename = next(iter(uploaded))\n",
263 | "text = uploaded[filename].decode(\"utf-8\")\n",
264 | "\n",
265 | "text_splitter = RecursiveCharacterTextSplitter( \n",
266 | " chunk_size = 100,\n",
267 | " chunk_overlap = 20,\n",
268 | " length_function = len,\n",
269 | ")\n",
270 | "\n",
271 | "texts = text_splitter.create_documents([text])\n",
272 | "print(texts[0])\n",
273 | "print(texts[1])\n",
274 | "print(texts[2])"
275 | ],
276 | "metadata": {
277 | "id": "NmYW0cN0UhxP"
278 | },
279 | "execution_count": null,
280 | "outputs": []
281 | },
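282 |     {
283 |       "cell_type": "markdown",
284 |       "source": [
285 |         "Both splitters above measure chunk size in characters. If you are budgeting for a model's context window, splitting by tokens can be a closer match; the sketch below uses `TokenTextSplitter`, which relies on tiktoken, and assumes the uploaded `text` from the previous cell is still in scope."
286 |       ],
287 |       "metadata": {}
288 |     },
289 |     {
290 |       "cell_type": "code",
291 |       "source": [
292 |         "# Token Text Splitter (counts tokens rather than characters)\n",
293 |         "# (a minimal sketch; assumes text from the previous cell is in scope)\n",
294 |         "!pip install tiktoken\n",
295 |         "from langchain.text_splitter import TokenTextSplitter\n",
296 |         "\n",
297 |         "text_splitter = TokenTextSplitter(\n",
298 |         "    chunk_size = 100,\n",
299 |         "    chunk_overlap = 20,\n",
300 |         ")\n",
301 |         "\n",
302 |         "texts = text_splitter.create_documents([text])\n",
303 |         "print(texts[0])"
304 |       ],
305 |       "metadata": {},
306 |       "execution_count": null,
307 |       "outputs": []
308 |     },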
282 | {
283 | "cell_type": "markdown",
284 | "source": [
285 | "### Vector Stores"
286 | ],
287 | "metadata": {
288 | "id": "yF32kJni__hY"
289 | }
290 | },
291 | {
292 | "cell_type": "code",
293 | "source": [
294 | "!pip install chromadb tiktoken"
295 | ],
296 | "metadata": {
297 | "id": "rMtIR-tRX0Qt"
298 | },
299 | "execution_count": null,
300 | "outputs": []
301 | },
302 | {
303 | "cell_type": "code",
304 | "source": [
305 | "# Chroma Vector Store\n",
306 | "import os, tiktoken\n",
307 | "from langchain.document_loaders import TextLoader\n",
308 | "from langchain.text_splitter import CharacterTextSplitter\n",
309 | "from langchain.vectorstores import Chroma\n",
310 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
311 | "\n",
312 | "OPENAI_API_KEY = '' # @param {type:\"string\"}\n",
313 | "os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY\n",
314 | "\n",
315 | "from google.colab import files\n",
316 | "\n",
317 | "uploaded = files.upload()\n",
318 | "filename = next(iter(uploaded))\n",
319 | "\n",
320 | "loader = TextLoader(filename)\n",
321 | "data = loader.load()\n",
322 | "\n",
323 | "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
324 | "docs = text_splitter.split_documents(data)\n",
325 | "\n",
326 | "embeddings = OpenAIEmbeddings()\n",
327 | "db = Chroma.from_documents(docs, embeddings)\n",
328 | "\n",
329 | "query = \"What comes after 'Vestibulum congue convallis finibus'?\"\n",
330 | "docs = db.similarity_search(query)\n",
331 | "\n",
332 | "print(docs[0].page_content)"
333 | ],
334 | "metadata": {
335 | "id": "0XUsuto_ABze"
336 | },
337 | "execution_count": null,
338 | "outputs": []
339 | },
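340 |     {
341 |       "cell_type": "markdown",
342 |       "source": [
343 |         "A relevance score helps when deciding whether a match is good enough to use. The sketch below assumes `db` and `query` from the cell above are in scope; with Chroma the returned score is a distance, so lower means more similar."
344 |       ],
345 |       "metadata": {}
346 |     },
347 |     {
348 |       "cell_type": "code",
349 |       "source": [
350 |         "# Similarity search with scores\n",
351 |         "# (a minimal sketch; assumes db and query from the previous cell are in scope)\n",
352 |         "docs_and_scores = db.similarity_search_with_score(query)\n",
353 |         "\n",
354 |         "for doc, score in docs_and_scores:\n",
355 |         "    # For Chroma the score is a distance: lower means more similar\n",
356 |         "    print(score, doc.page_content[:80])"
357 |       ],
358 |       "metadata": {},
359 |       "execution_count": null,
360 |       "outputs": []
361 |     },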
340 | {
341 | "cell_type": "markdown",
342 | "source": [
343 | "### Retrievers"
344 | ],
345 | "metadata": {
346 | "id": "1RpFQvgAACR6"
347 | }
348 | },
349 | {
350 | "cell_type": "code",
351 | "source": [
352 | "!pip install arxiv pymupdf"
353 | ],
354 | "metadata": {
355 | "id": "ZYC12MjFPQTo"
356 | },
357 | "execution_count": null,
358 | "outputs": []
359 | },
360 | {
361 | "cell_type": "code",
362 | "source": [
363 | "# Arxiv Retriever\n",
364 | "from langchain.retrievers import ArxivRetriever\n",
365 | "\n",
366 | "retriever = ArxivRetriever(load_max_docs=2)\n",
367 | "docs = retriever.get_relevant_documents(query='2203.15556')\n",
368 | "\n",
369 | "docs[0].metadata"
370 | ],
371 | "metadata": {
372 | "id": "mwVrbBUfAEaG"
373 | },
374 | "execution_count": null,
375 | "outputs": []
376 | },
377 | {
378 | "cell_type": "code",
379 | "source": [
380 | "!pip install wikipedia"
381 | ],
382 | "metadata": {
383 | "id": "26WMZAxdTEzs"
384 | },
385 | "execution_count": null,
386 | "outputs": []
387 | },
388 | {
389 | "cell_type": "code",
390 | "source": [
391 | "# Wikipedia Retriever\n",
392 | "from langchain.retrievers import WikipediaRetriever\n",
393 | "\n",
394 | "retriever = WikipediaRetriever()\n",
395 | "docs = retriever.get_relevant_documents(query='large language models')\n",
396 | "\n",
397 | "docs[0].metadata"
398 | ],
399 | "metadata": {
400 | "id": "-pY7LCZ8TG-8"
401 | },
402 | "execution_count": null,
403 | "outputs": []
404 | },
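405 |     {
406 |       "cell_type": "markdown",
407 |       "source": [
408 |         "Retrievers plug straight into question-answering chains. The sketch below stuffs the retrieved Wikipedia pages into a single prompt; it assumes the `retriever` from the previous cell and the `openai_api_key` from Getting Started are in scope."
409 |       ],
410 |       "metadata": {}
411 |     },
412 |     {
413 |       "cell_type": "code",
414 |       "source": [
415 |         "# Answer a question over the retrieved documents with RetrievalQA\n",
416 |         "# (a minimal sketch; assumes retriever and openai_api_key are in scope)\n",
417 |         "from langchain.llms.openai import OpenAI\n",
418 |         "from langchain.chains import RetrievalQA\n",
419 |         "\n",
420 |         "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
421 |         "qa = RetrievalQA.from_chain_type(llm=llm, chain_type=\"stuff\", retriever=retriever)\n",
422 |         "\n",
423 |         "print(qa.run(\"What is a large language model?\"))"
424 |       ],
425 |       "metadata": {},
426 |       "execution_count": null,
427 |       "outputs": []
428 |     },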
405 | {
406 | "cell_type": "code",
407 | "source": [
408 | "!pip install chromadb tiktoken"
409 | ],
410 | "metadata": {
411 | "id": "8LrQSBVaUFmX"
412 | },
413 | "execution_count": null,
414 | "outputs": []
415 | },
416 | {
417 | "cell_type": "code",
418 | "source": [
419 | "# Chroma Vector Store Retriever\n",
420 | "import os, tiktoken\n",
421 | "from langchain.document_loaders import TextLoader\n",
422 | "from langchain.text_splitter import CharacterTextSplitter\n",
423 | "from langchain.vectorstores import Chroma\n",
424 | "from langchain.embeddings.openai import OpenAIEmbeddings\n",
425 | "\n",
426 | "OPENAI_API_KEY = '' # @param {type:\"string\"}\n",
427 | "os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY\n",
428 | "\n",
429 | "from google.colab import files\n",
430 | "\n",
431 | "uploaded = files.upload()\n",
432 | "filename = next(iter(uploaded))\n",
433 | "\n",
434 | "loader = TextLoader(filename)\n",
435 | "data = loader.load()\n",
436 | "\n",
437 | "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
438 | "docs = text_splitter.split_documents(data)\n",
439 | "\n",
440 | "embeddings = OpenAIEmbeddings()\n",
441 | "db = Chroma.from_documents(docs, embeddings)\n",
442 | "\n",
443 | "retriever = db.as_retriever()\n",
444 | "query = \"What comes after 'Vestibulum congue convallis finibus'?\"\n",
445 | "docs = retriever.get_relevant_documents(query)\n",
446 | "\n",
447 | "print(docs[0].page_content)"
448 | ],
449 | "metadata": {
450 | "id": "S0ZiBwNKSaqS"
451 | },
452 | "execution_count": null,
453 | "outputs": []
454 | }
455 | ]
456 | }
--------------------------------------------------------------------------------
/notebooks/langchain_decoded_5_memory.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": []
7 | },
8 | "kernelspec": {
9 | "name": "python3",
10 | "display_name": "Python 3"
11 | },
12 | "language_info": {
13 | "name": "python"
14 | }
15 | },
16 | "cells": [
17 | {
18 | "cell_type": "markdown",
19 | "source": [
20 | "# **LangChain Decoded**"
21 | ],
22 | "metadata": {
23 | "id": "JQxt18IfZR40"
24 | }
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "source": [
29 | "## Getting Started"
30 | ],
31 | "metadata": {
32 | "id": "zgFXxUsp5pbu"
33 | }
34 | },
35 | {
36 | "cell_type": "code",
37 | "source": [
38 | "# Install the LangChain package\n",
39 | "!pip install langchain"
40 | ],
41 | "metadata": {
42 | "id": "fRUe2wO95xoR"
43 | },
44 | "execution_count": null,
45 | "outputs": []
46 | },
47 | {
48 | "cell_type": "code",
49 | "source": [
50 | "# Install the OpenAI package\n",
51 | "!pip install openai"
52 | ],
53 | "metadata": {
54 | "id": "T7TwICBe9yy4"
55 | },
56 | "execution_count": null,
57 | "outputs": []
58 | },
59 | {
60 | "cell_type": "code",
61 | "source": [
62 | "# Configure the API key\n",
63 | "import os\n",
64 | "\n",
65 | "openai_api_key = os.environ.get('OPENAI_API_KEY', 'sk-XXX')"
66 | ],
67 | "metadata": {
68 | "id": "I-ZUDlva6m3Y"
69 | },
70 | "execution_count": null,
71 | "outputs": []
72 | },
73 | {
74 | "cell_type": "markdown",
75 | "source": [
76 | "## Part 5: Memory"
77 | ],
78 | "metadata": {
79 | "id": "DQ6IfBHr_nhl"
80 | }
81 | },
82 | {
83 | "cell_type": "code",
84 | "source": [
85 | "# Store and retrieve chat messages with ChatMessageHistory\n",
86 | "from langchain.memory import ChatMessageHistory\n",
87 | "\n",
88 | "history = ChatMessageHistory()\n",
89 | "history.add_user_message(\"Hello\")\n",
90 | "history.add_ai_message(\"Hi, how can I help you?\")\n",
91 | "history.add_user_message(\"I want to write Python code.\")\n",
92 | "history.add_ai_message(\"Sure, I can help with that. What do you want to code?\")\n",
93 | "\n",
94 | "history.messages"
95 | ],
96 | "metadata": {
97 | "id": "gYMhOgj5_x1K"
98 | },
99 | "execution_count": null,
100 | "outputs": []
101 | },
102 | {
103 | "cell_type": "code",
104 | "source": [
105 |         "# Retrieve chat messages with ConversationBufferMemory (as a string)\n",
106 | "from langchain.memory import ConversationBufferMemory\n",
107 | "\n",
108 | "memory = ConversationBufferMemory()\n",
109 | "memory.chat_memory.add_user_message(\"Hello\")\n",
110 | "memory.chat_memory.add_ai_message(\"Hi, how can I help you?\")\n",
111 | "memory.chat_memory.add_user_message(\"I want to write Python code.\")\n",
112 | "memory.chat_memory.add_ai_message(\"Sure, I can help with that. What do you want to code?\")\n",
113 | "\n",
114 | "memory.load_memory_variables({})"
115 | ],
116 | "metadata": {
117 | "id": "RMKPUXnKpj-u"
118 | },
119 | "execution_count": null,
120 | "outputs": []
121 | },
122 | {
123 | "cell_type": "code",
124 | "source": [
125 |         "# Retrieve chat messages with ConversationBufferMemory (as a list of messages)\n",
126 | "from langchain.memory import ConversationBufferMemory\n",
127 | "\n",
128 | "memory = ConversationBufferMemory(return_messages=True)\n",
129 | "memory.chat_memory.add_user_message(\"Hello\")\n",
130 | "memory.chat_memory.add_ai_message(\"Hi, how can I help you?\")\n",
131 | "memory.chat_memory.add_user_message(\"I want to write Python code.\")\n",
132 | "memory.chat_memory.add_ai_message(\"Sure, I can help with that. What do you want to code?\")\n",
133 | "\n",
134 | "memory.load_memory_variables({})"
135 | ],
136 | "metadata": {
137 | "id": "WCY2tsblqprw"
138 | },
139 | "execution_count": null,
140 | "outputs": []
141 | },
142 | {
143 | "cell_type": "code",
144 | "source": [
145 | "# Use ConversationBufferMemory in a chain\n",
146 | "from langchain.llms.openai import OpenAI\n",
147 | "from langchain.chains import ConversationChain\n",
148 | "\n",
149 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
150 | "conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())\n",
151 | "\n",
152 | "conversation.predict(input=\"Hello\")"
153 | ],
154 | "metadata": {
155 | "id": "m2Ged561sLhm"
156 | },
157 | "execution_count": null,
158 | "outputs": []
159 | },
160 | {
161 | "cell_type": "code",
162 | "source": [
163 | "conversation.predict(input=\"I want to write Python code.\")"
164 | ],
165 | "metadata": {
166 | "id": "MS07RHCCtJSL"
167 | },
168 | "execution_count": null,
169 | "outputs": []
170 | },
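171 |     {
172 |       "cell_type": "markdown",
173 |       "source": [
174 |         "The full buffer grows with every turn, which eventually exceeds the context window. A windowed variant keeps only the last `k` exchanges; the sketch below assumes the `llm` from the previous cells is in scope."
175 |       ],
176 |       "metadata": {}
177 |     },
178 |     {
179 |       "cell_type": "code",
180 |       "source": [
181 |         "# Keep only the last k exchanges with ConversationBufferWindowMemory\n",
182 |         "# (a minimal sketch; assumes llm from the previous cells is in scope)\n",
183 |         "from langchain.memory import ConversationBufferWindowMemory\n",
184 |         "from langchain.chains import ConversationChain\n",
185 |         "\n",
186 |         "conversation = ConversationChain(llm=llm, memory=ConversationBufferWindowMemory(k=1))\n",
187 |         "\n",
188 |         "conversation.predict(input=\"Hello\")\n",
189 |         "conversation.predict(input=\"I want to write Python code.\")\n",
190 |         "\n",
191 |         "# Only the most recent exchange remains in the prompt\n",
192 |         "conversation.memory.load_memory_variables({})"
193 |       ],
194 |       "metadata": {},
195 |       "execution_count": null,
196 |       "outputs": []
197 |     },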
171 | {
172 | "cell_type": "code",
173 | "source": [
174 | "# Store a conversation summary with ConversationSummaryMemory\n",
175 | "from langchain.llms.openai import OpenAI\n",
176 |         "from langchain.memory import ConversationSummaryMemory\n",
177 | "\n",
178 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
179 | "memory = ConversationSummaryMemory(llm=llm)\n",
180 | "memory.save_context({\"input\": \"Hello\"}, {\"output\": \"Hi, how can I help you?\"})\n",
181 | "\n",
182 | "memory.load_memory_variables({})"
183 | ],
184 | "metadata": {
185 | "id": "2f9H0r1XuBAm"
186 | },
187 | "execution_count": null,
188 | "outputs": []
189 | },
201 | {
202 | "cell_type": "code",
203 | "source": [
204 | "# Use ConversationSummaryMemory in a chain\n",
205 | "from langchain.llms.openai import OpenAI\n",
206 | "from langchain.chains import ConversationChain\n",
207 | "\n",
208 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
209 | "memory = ConversationSummaryMemory(llm=llm)\n",
210 | "conversation = ConversationChain(llm=llm, verbose=True, memory=memory)\n",
211 | "\n",
212 | "conversation.predict(input=\"Hello\")"
213 | ],
214 | "metadata": {
215 | "id": "jxmlmYBhwzPT"
216 | },
217 | "execution_count": null,
218 | "outputs": []
219 | },
220 | {
221 | "cell_type": "code",
222 | "source": [
223 | "conversation.predict(input=\"I want to write Python code.\")"
224 | ],
225 | "metadata": {
226 | "id": "FW8h7QFFoLN-"
227 | },
228 | "execution_count": null,
229 | "outputs": []
230 | },
231 | {
232 | "cell_type": "code",
233 | "source": [
234 | "conversation.predict(input=\"No, I'm a beginner.\")"
235 | ],
236 | "metadata": {
237 | "id": "gzl4GUCmxaEp"
238 | },
239 | "execution_count": null,
240 | "outputs": []
241 | },
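242 |     {
243 |       "cell_type": "markdown",
244 |       "source": [
245 |         "To see what the model actually receives, you can print the summary the memory has accumulated so far; `buffer` holds the summary text (assuming the `memory` from the summary chain above is in scope)."
246 |       ],
247 |       "metadata": {}
248 |     },
249 |     {
250 |       "cell_type": "code",
251 |       "source": [
252 |         "# Inspect the running conversation summary\n",
253 |         "# (a minimal sketch; assumes memory from the summary chain above is in scope)\n",
254 |         "print(memory.buffer)"
255 |       ],
256 |       "metadata": {},
257 |       "execution_count": null,
258 |       "outputs": []
259 |     },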
242 | {
243 | "cell_type": "code",
244 | "source": [
245 | "# Memory management using Motorhead (managed)\n",
246 | "from langchain import OpenAI, LLMChain, PromptTemplate\n",
247 | "from langchain.memory.motorhead_memory import MotorheadMemory\n",
248 | "\n",
249 | "template = \"\"\"You are a chatbot having a conversation with a human.\n",
250 | "\n",
251 | "{chat_history}\n",
252 | "Human: {human_input}\n",
253 | "AI:\"\"\"\n",
254 | "\n",
255 | "prompt = PromptTemplate(input_variables=[\"chat_history\", \"human_input\"], template=template)\n",
256 | "\n",
257 | "memory = MotorheadMemory(\n",
258 | " api_key=\"API_KEY\",\n",
259 | " client_id=\"CLIENT_ID\",\n",
260 | " session_id=\"langchain-1\",\n",
261 | " memory_key=\"chat_history\",\n",
262 | ")\n",
263 | "\n",
264 |         "await memory.init()\n",
265 | "\n",
266 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
267 | "llm_chain = LLMChain(llm=llm, prompt=prompt, memory=memory)\n",
268 | "\n",
269 | "llm_chain.run(\"Hello, I'm Motorhead.\")"
270 | ],
271 | "metadata": {
272 | "id": "WGTiwbGVYJsP"
273 | },
274 | "execution_count": null,
275 | "outputs": []
276 | },
277 | {
278 | "cell_type": "code",
279 | "source": [
280 | "llm_chain.run(\"What's my name?\")"
281 | ],
282 | "metadata": {
283 | "id": "W-4D47gqg5Ze"
284 | },
285 | "execution_count": null,
286 | "outputs": []
287 | },
288 | {
289 | "cell_type": "code",
290 | "source": [
291 | "# Memory management using Motorhead (self-hosted)\n",
292 | "from langchain import OpenAI, LLMChain, PromptTemplate\n",
293 | "from langchain.memory.motorhead_memory import MotorheadMemory\n",
294 | "\n",
295 | "template = \"\"\"You are a chatbot having a conversation with a human.\n",
296 | "\n",
297 | "{chat_history}\n",
298 | "Human: {human_input}\n",
299 | "AI:\"\"\"\n",
300 | "\n",
301 | "prompt = PromptTemplate(input_variables=[\"chat_history\", \"human_input\"], template=template)\n",
302 | "\n",
303 | "memory = MotorheadMemory(\n",
304 | " url=\"URL\",\n",
305 | " session_id=\"langchain-1\",\n",
306 | " memory_key=\"chat_history\",\n",
307 | ")\n",
308 | "\n",
309 |         "await memory.init()\n",
310 | "\n",
311 | "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
312 | "llm_chain = LLMChain(llm=llm, prompt=prompt, memory=memory)\n",
313 | "\n",
314 | "llm_chain.run(\"Hello, I'm Motorhead.\")"
315 | ],
316 | "metadata": {
317 | "id": "TOnq4fuXg1qF"
318 | },
319 | "execution_count": null,
320 | "outputs": []
321 | },
322 | {
323 | "cell_type": "code",
324 | "source": [
325 | "llm_chain.run(\"What's my name?\")"
326 | ],
327 | "metadata": {
328 | "id": "FT0GzIbCZBeb"
329 | },
330 | "execution_count": null,
331 | "outputs": []
332 | }
333 | ]
334 | }
335 |
--------------------------------------------------------------------------------