├── README.md
├── .gitignore
├── chinese
│   ├── LangChain_Conclusion.srt
│   ├── LangChain_Intro.srt
│   ├── LangChain_L6.srt
│   ├── LangChain_L2.srt
│   ├── LangChain_L4.srt
│   ├── LangChain_L3.srt
│   ├── LangChain_L5.srt
│   └── LangChain_L1.srt
├── LangChain_Conclusion.srt
├── english
│   ├── LangChain_Conclusion.srt
│   ├── LangChain_Intro.srt
│   ├── LangChain_L6.srt
│   ├── LangChain_L2.srt
│   ├── LangChain_L4.srt
│   └── LangChain_L3.srt
├── LangChain_Intro.srt
└── LangChain_L6.srt

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# [新]吴恩达 LangChain 课程中英字幕

使用whisper + gpt 识别翻译,勘误请提issues,谢谢。

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
video
audio
.env
translator.py
main.py
merge.mjs
package.json
node_modules
pnpm-lock.yaml

--------------------------------------------------------------------------------
/chinese/LangChain_Conclusion.srt:
--------------------------------------------------------------------------------

1
00:00:00,000 --> 00:00:10,600
在这个短课程中,您看到了一系列的应用,包括处理客户评论、构建一个应用程序来回答文档问题,甚至使用LLM来决定何时调用外部工具,如Web搜索来回答复杂问题。

2
00:00:10,600 --> 00:00:16,840
如果一两周前,有人问你建立所有这些应用需要多少工作量,我想很多人会觉得,哇,这听起来像是几周,甚至更长时间的工作。

3
00:00:16,840 --> 00:00:22,080
但是您在这个短课程中看到了,只需要相当数量的代码行,您就可以使用LangChain相当高效地构建所有这些应用。

4
00:00:22,080 --> 00:00:26,400
希望您能够借鉴这些想法,也许您可以使用Jupyter笔记本中看到的一些代码片段在自己的应用程序中使用它们。

5
00:00:26,400 --> 00:00:30,800
而这些想法只是一个开始。

6
00:00:30,800 --> 00:00:32,200
您可以使用语言模型进行许多其他应用。

7
00:00:32,200 --> 00:00:36,640
这些模型非常强大,因为它们适用于如此广泛的任务,无论是关于CSV的问题,查询SQL数据库,还是与API交互。

8
00:00:36,640 --> 00:00:41,600
有许多不同的使用链和提示组合以及输出解析器和更多链来在LangChain中完成所有这些事情的示例。

9
00:00:41,600 --> 00:00:45,600
而这大部分归功于LangChain社区。

10
00:00:45,600 --> 00:00:49,120
因此,我还要非常感谢社区中的每个人的贡献,无论是改进文档,使其更容易入门,还是新类型的链,开启了一个全新的可能性世界。

11
00:00:49,120 --> 00:00:51,320
因此,如果您还没有这样做,请去您的笔记本电脑、台式机上运行pip install langchain。

12
00:00:51,320 --> 00:00:54,920
然后使用它来构建一些惊人的应用程序。

--------------------------------------------------------------------------------
/LangChain_Conclusion.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:10,600
在这个短课程中,您看到了一系列的应用,包括处理客户评论、构建一个应用程序来回答文档问题,甚至使用LLM来决定何时调用外部工具,如Web搜索来回答复杂问题。
In this short course, you saw a range of applications, including processing customer reviews and

2
00:00:10,600 --> 00:00:16,840
如果一两周前,有人问你建立所有这些应用需要多少工作量,我想很多人会觉得,哇,这听起来像是几周,甚至更长时间的工作。
building an application to answer questions over documents, and even using an llm to decide

3
00:00:16,840 --> 00:00:22,080
但是您在这个短课程中看到了,只需要相当数量的代码行,您就可以使用LangChain相当高效地构建所有这些应用。
when to make a call to an external tool like web search to answer complex questions.

4
00:00:22,080 --> 00:00:26,400
希望您能够借鉴这些想法,也许您可以使用Jupyter笔记本中看到的一些代码片段在自己的应用程序中使用它们。
If a week or two ago, someone had asked you how much work would it be to build all of

5
00:00:26,400 --> 00:00:30,800
而这些想法只是一个开始。
these applications, I think a lot of people thought, boy, this sounds like weeks, maybe

6
00:00:30,800 --> 00:00:32,200
您可以使用语言模型进行许多其他应用。
even longer of work.

7
00:00:32,200 --> 00:00:36,640
这些模型非常强大,因为它们适用于如此广泛的任务,无论是关于CSV的问题,查询SQL数据库,还是与API交互。
But you saw in this short course how with just a pretty reasonable number of lines of

8
00:00:36,640 --> 00:00:41,600
有许多不同的使用链和提示组合以及输出解析器和更多链来在LangChain中完成所有这些事情的示例。
code, you can use LangChain to build all of these applications pretty efficiently.

9
00:00:41,600 --> 00:00:45,600
而这大部分归功于LangChain社区。
And I hope you take these ideas, maybe you can take some code snippets that you saw in

10
00:00:45,600 --> 00:00:49,120
因此,我还要非常感谢社区中的每个人的贡献,无论是改进文档,使其更容易入门,还是新类型的链,开启了一个全新的可能性世界。
the Jupyter notebooks and use them in your own applications.

11
00:00:49,120 --> 00:00:51,320
因此,如果您还没有这样做,请去您的笔记本电脑、台式机上运行pip install langchain。
And these ideas are really just the start.

12
00:00:51,320 --> 00:00:54,920
然后使用它来构建一些惊人的应用程序。
There's a lot of other applications that you can use language models for.

--------------------------------------------------------------------------------
/english/LangChain_Conclusion.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:10,600
In this short course, you saw a range of applications, including processing customer reviews and

2
00:00:10,600 --> 00:00:16,840
building an application to answer questions over documents, and even using an llm to decide

3
00:00:16,840 --> 00:00:22,080
when to make a call to an external tool like web search to answer complex questions.

4
00:00:22,080 --> 00:00:26,400
If a week or two ago, someone had asked you how much work would it be to build all of

5
00:00:26,400 --> 00:00:30,800
these applications, I think a lot of people thought, boy, this sounds like weeks, maybe

6
00:00:30,800 --> 00:00:32,200
even longer of work.

7
00:00:32,200 --> 00:00:36,640
But you saw in this short course how with just a pretty reasonable number of lines of

8
00:00:36,640 --> 00:00:41,600
code, you can use LangChain to build all of these applications pretty efficiently.

9
00:00:41,600 --> 00:00:45,600
And I hope you take these ideas, maybe you can take some code snippets that you saw in

10
00:00:45,600 --> 00:00:49,120
the Jupyter notebooks and use them in your own applications.

11
00:00:49,120 --> 00:00:51,320
And these ideas are really just the start.

12
00:00:51,320 --> 00:00:54,920
There's a lot of other applications that you can use language models for.

13
00:00:54,920 --> 00:00:59,680
These models are so powerful because they're applicable to such a wide range of tasks,

14
00:00:59,680 --> 00:01:06,600
whether it be answering questions about CSVs, querying SQL databases, interacting with APIs.

15
00:01:06,600 --> 00:01:11,740
There's a lot of different examples of using chains and the combinations of prompts and

16
00:01:11,740 --> 00:01:16,040
output parsers and then more chains to do all these things in LangChain.

17
00:01:16,040 --> 00:01:18,200
And most of that is due to the LangChain community.

18
00:01:18,200 --> 00:01:22,360
So I want to also give a really big thank you to everyone in the community who's contributed,

19
00:01:22,360 --> 00:01:26,560
whether it be improvements in the documentation, making it easier for others to get started,

20
00:01:26,560 --> 00:01:30,360
or new types of chains, opening up a whole new world of possibilities.

21
00:01:30,360 --> 00:01:35,560
And so with that, if you have not yet already done so, I hope you go to your laptop, your

22
00:01:35,560 --> 00:01:38,720
desktop and run pip install langchain.

23
00:01:38,720 --> 00:01:54,000
And then go use this too to go build some amazing applications.

--------------------------------------------------------------------------------
/chinese/LangChain_Intro.srt:
--------------------------------------------------------------------------------

1
00:00:00,000 --> 00:00:06,440
欢迎来到这个关于LangChain大语言模型应用开发的短期课程。

2
00:00:06,440 --> 00:00:10,040
通过提示llm或大型语言模型,

3
00:00:10,040 --> 00:00:13,480
现在可以比以往更快地开发AI应用程序。

4
00:00:13,480 --> 00:00:18,600
但一个应用程序可能需要多次提示llm并解析输出。

5
00:00:18,600 --> 00:00:24,800
因此需要编写大量的粘合代码。

6
00:00:24,800 --> 00:00:28,220
Harrison Chase创建的LangChain使得这个开发过程更加容易。

7
00:00:28,220 --> 00:00:34,360
我很高兴有Harrison在这里,

8
00:00:34,360 --> 00:00:36,280
他与deeplearning.ai合作建立了这个短期课程,教大家如何使用这个神奇的工具。

9
00:00:36,280 --> 00:00:38,920
感谢你的邀请。我很高兴能来这里。

10
00:00:38,920 --> 00:00:42,680
LangChain起初是一个用于构建LLM应用程序的开源框架。

11
00:00:42,680 --> 00:00:45,400
谢谢你。我很高兴能来这里。

12
00:00:45,400 --> 00:00:49,760
当我与该领域的一些人交谈时,他们正在构建更复杂的应用程序,并看到了

13
00:00:49,760 --> 00:00:53,360
一些共同的抽象,以及它们如何被开发。

14
00:00:53,360 --> 00:00:56,200
到目前为止,我们非常高兴地看到LangChain在社区中的采用。

15
00:00:56,200 --> 00:01:04,320
因此,期待与大家分享它,并期待看到人们用它构建的东西。

16
00:01:04,320 --> 00:01:06,400
事实上,作为LangChain动力的一个标志,

17
00:01:06,400 --> 00:01:08,960
它不仅有众多用户,

18
00:01:08,960 --> 00:01:12,700
而且还有许多对开源的贡献者。

19
00:01:12,700 --> 00:01:19,000
这对于它的快速发展至关重要。

20
00:01:19,000 --> 00:01:22,760
这个团队以惊人的速度推出代码和功能。

21
00:01:22,760 --> 00:01:28,960
因此,希望在这个短期课程之后,

22
00:01:28,960 --> 00:01:33,720
你将能够快速地使用LangChain组合一些非常酷的应用程序。

23
00:01:33,720 --> 00:01:36,280
谁知道,也许你甚至决定

24
00:01:36,280 --> 00:01:39,800
回馈开源LangChain的努力。

25
00:01:39,800 --> 00:01:45,360
LangChain是一个用于构建应用程序的开源开发框架。

26
00:01:45,360 --> 00:01:47,280
我们有两个不同的包,

27
00:01:47,280 --> 00:01:49,520
一个是Python,一个是JavaScript。

28
00:01:49,520 --> 00:01:54,960
它们专注于组合和模块化。

29
00:01:54,960 --> 00:01:58,320
因此,它们有许多单独的组件,可以一起使用或单独使用。

34
00:01:58,320 --> 00:02:00,080
因此,这是其中一个关键的附加值。

35
00:02:00,080 --> 00:02:03,680
另一个关键的附加值是许多不同的用例。

36
00:02:03,680 --> 00:02:07,280
因此,将这些模块化组件组合成链式方式,形成更多端到端的应用程序,并使其易于开始使用这些用例。

37
00:02:07,280 --> 00:02:09,680
在本课程中,我们将介绍LangChain的常见组件。

38
00:02:09,680 --> 00:02:12,640
因此,我们将讨论模型。

39
00:02:12,640 --> 00:02:16,080
我们将讨论提示,这是您使模型执行有用和有趣操作的方式。

40
00:02:16,080 --> 00:02:17,520
我们将讨论索引,

41
00:02:17,520 --> 00:02:19,560
这是一种摄取数据的方式,以便您可以将其与模型结合使用。

42
00:02:19,560 --> 00:02:22,080
然后,我们将讨论链式,

43
00:02:22,080 --> 00:02:27,920
这是更多的端到端用例,以及代理人,

44
00:02:27,920 --> 00:02:29,480
这是一种非常令人兴奋的端到端用例,

45
00:02:29,480 --> 00:02:32,280
它使用模型作为推理引擎。

46
00:02:32,280 --> 00:02:34,920
我们还感谢Ankush Gholar,

47
00:02:34,920 --> 00:02:37,680
他是LangChain的联合创始人,与Harrison Chase一起,

48
00:02:37,680 --> 00:02:40,320
也为这些材料投入了很多思考,并帮助创建了这个短期课程。

49
00:02:40,320 --> 00:02:44,600
在deep learning.ai方面,

50
00:02:44,600 --> 00:02:46,840
Jeff Ludwig,Eddie Hsu和Diala Ezzedine,

51
00:02:46,840 --> 00:02:50,480
也为这些材料做出了贡献。

52
00:02:50,480 --> 00:02:52,840
因此,让我们进入下一个视频,了解LangChain的模型,提示和解析器。

--------------------------------------------------------------------------------
/english/LangChain_Intro.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:06,440
Welcome to this short course on

2
00:00:06,440 --> 00:00:10,040
LangChain for large language model application development.

3
00:00:10,040 --> 00:00:13,480
By prompting an llm or large language model,

4
00:00:13,480 --> 00:00:18,600
it's now possible to develop AI applications much faster than ever before.

5
00:00:18,600 --> 00:00:24,800
But an application can require prompting an llm multiple times and parsing its output.

6
00:00:24,800 --> 00:00:28,220
So there's a lot of glue code that needs to be written.

7
00:00:28,220 --> 00:00:34,360
LangChain created by Harrison Chase makes this development process much easier.

8
00:00:34,360 --> 00:00:36,280
I'm thrilled to have Harrison here,

9
00:00:36,280 --> 00:00:38,920
who has built this short course in collaboration with

10
00:00:38,920 --> 00:00:42,680
deeplearning.ai to teach how to use this amazing tool.

11
00:00:42,680 --> 00:00:45,400
Thanks for having me. I'm really excited to be here.

12
00:00:45,400 --> 00:00:49,760
LangChain started as an open source framework for building LLM applications.

13
00:00:49,760 --> 00:00:53,360
It came about when I was talking to a bunch of folks in the field who were

14
00:00:53,360 --> 00:00:56,200
building more complex applications and saw

15
00:00:56,200 --> 00:00:59,760
some common abstractions in terms of how they were being developed.

16
00:00:59,760 --> 00:01:04,320
We've been really thrilled at the community adoption of LangChain so far.

17
00:01:04,320 --> 00:01:06,400
So look forward to sharing it with everyone

18
00:01:06,400 --> 00:01:08,960
here and look forward to seeing what people build with it.

19
00:01:08,960 --> 00:01:12,700
In fact, as a sign of LangChain's momentum,

20
00:01:12,700 --> 00:01:14,920
not only does it have numerous users,

21
00:01:14,920 --> 00:01:19,000
but there are also many hundreds of contributors to the open source.

22
00:01:19,000 --> 00:01:22,760
This has been instrumental for its rapid rate of development.

23
00:01:22,760 --> 00:01:26,440
This team really ships code and features at an amazing pace.

24
00:01:26,440 --> 00:01:28,960
So hopefully, after this short course,

25
00:01:28,960 --> 00:01:33,720
you'll be able to quickly put together some really cool applications using LangChain.

26
00:01:33,720 --> 00:01:36,280
And who knows, maybe you even decide to

27
00:01:36,280 --> 00:01:39,800
contribute back to the open source LangChain effort.

28
00:01:39,800 --> 00:01:45,360
LangChain is an open source development framework for building applications.

29
00:01:45,360 --> 00:01:47,280
We have two different packages,

30
00:01:47,280 --> 00:01:49,520
a Python one and a JavaScript one.

31
00:01:49,520 --> 00:01:52,600
They're focused on composition and modularity.

32
00:01:52,600 --> 00:01:54,960
So they have a lot of individual components that can be

33
00:01:54,960 --> 00:01:58,320
used in conjunction with each other or by themselves.

34
00:01:58,320 --> 00:02:00,080
And so that's one of the key value adds.

35
00:02:00,080 --> 00:02:03,680
And then the other key value add is a bunch of different use cases.

36
00:02:03,680 --> 00:02:07,280
So chains are ways of combining these modular components into

37
00:02:07,280 --> 00:02:09,680
more end-to-end applications and making it

38
00:02:09,680 --> 00:02:12,640
very easy to get started with those use cases.

39
00:02:12,640 --> 00:02:16,080
In this class, we'll cover the common components of LangChain.

40
00:02:16,080 --> 00:02:17,520
So we'll talk about models.

41
00:02:17,520 --> 00:02:19,560
We'll talk about prompts, which are how you get

42
00:02:19,560 --> 00:02:22,080
models to do useful and interesting things.

43
00:02:22,080 --> 00:02:23,480
We'll talk about indexes,

44
00:02:23,480 --> 00:02:27,920
which are ways of ingesting data so that you can combine it with models.

45
00:02:27,920 --> 00:02:29,480
And then we'll talk about chains,

46
00:02:29,480 --> 00:02:32,280
which are more end-to-end use cases along with agents,

47
00:02:32,280 --> 00:02:34,920
which are a very exciting type of end-to-end use case,

48
00:02:34,920 --> 00:02:37,680
which uses the model as a reasoning engine.

49
00:02:37,680 --> 00:02:40,320
We're also grateful to Ankush Gholar,

50
00:02:40,320 --> 00:02:44,600
who is the co-founder of LangChain alongside Harrison Chase,

51
00:02:44,600 --> 00:02:46,840
for also putting a lot of thoughts into

52
00:02:46,840 --> 00:02:50,480
these materials and helping with the creation of this short course.

53
00:02:50,480 --> 00:02:52,840
And on the deep learning.ai side,

54
00:02:52,840 --> 00:02:56,080
Jeff Ludwig, Eddie Hsu, and Diala Ezzedine,

55
00:02:56,080 --> 00:02:58,840
have also contributed to these materials.

56
00:02:58,840 --> 00:03:02,040
And so with that, let's go on to the next video where we'll learn

57
00:03:02,040 --> 00:03:21,040
about LangChain's models, prompts, and parsers.

--------------------------------------------------------------------------------
/LangChain_Intro.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:06,440
欢迎来到这个关于LangChain大语言模型应用开发的短期课程。
Welcome to this short course on

2
00:00:06,440 --> 00:00:10,040
通过提示llm或大型语言模型,
LangChain for large language model application development.

3
00:00:10,040 --> 00:00:13,480
现在可以比以往更快地开发AI应用程序。
By prompting an llm or large language model,

4
00:00:13,480 --> 00:00:18,600
但一个应用程序可能需要多次提示llm并解析输出。
it's now possible to develop AI applications much faster than ever before.

5
00:00:18,600 --> 00:00:24,800
因此需要编写大量的粘合代码。
But an application can require prompting an llm multiple times and parsing its output.

6
00:00:24,800 --> 00:00:28,220
Harrison Chase创建的LangChain使得这个开发过程更加容易。
So there's a lot of glue code that needs to be written.

7
00:00:28,220 --> 00:00:34,360
我很高兴有Harrison在这里,
LangChain created by Harrison Chase makes this development process much easier.

8
00:00:34,360 --> 00:00:36,280
他与deeplearning.ai合作建立了这个短期课程,教大家如何使用这个神奇的工具。
I'm thrilled to have Harrison here,

9
00:00:36,280 --> 00:00:38,920
感谢你的邀请。我很高兴能来这里。
who has built this short course in collaboration with

10
00:00:38,920 --> 00:00:42,680
LangChain起初是一个用于构建LLM应用程序的开源框架。
deeplearning.ai to teach how to use this amazing tool.

11
00:00:42,680 --> 00:00:45,400
谢谢你。我很高兴能来这里。
Thanks for having me. I'm really excited to be here.

12
00:00:45,400 --> 00:00:49,760
当我与该领域的一些人交谈时,他们正在构建更复杂的应用程序,并看到了
LangChain started as an open source framework for building LLM applications.

13
00:00:49,760 --> 00:00:53,360
一些共同的抽象,以及它们如何被开发。
It came about when I was talking to a bunch of folks in the field who were

14
00:00:53,360 --> 00:00:56,200
到目前为止,我们非常高兴地看到LangChain在社区中的采用。
building more complex applications and saw

15
00:00:56,200 --> 00:01:04,320
因此,期待与大家分享它,并期待看到人们用它构建的东西。
some common abstractions in terms of how they were being developed.

16
00:01:04,320 --> 00:01:06,400
事实上,作为LangChain动力的一个标志,
We've been really thrilled at the community adoption of LangChain so far.

17
00:01:06,400 --> 00:01:08,960
它不仅有众多用户,
So look forward to sharing it with everyone

18
00:01:08,960 --> 00:01:12,700
而且还有许多对开源的贡献者。
here and look forward to seeing what people build with it.

19
00:01:12,700 --> 00:01:19,000
这对于它的快速发展至关重要。
In fact, as a sign of LangChain's momentum,

20
00:01:19,000 --> 00:01:22,760
这个团队以惊人的速度推出代码和功能。
not only does it have numerous users,

21
00:01:22,760 --> 00:01:28,960
因此,希望在这个短期课程之后,
but there are also many hundreds of contributors to the open source.

22
00:01:28,960 --> 00:01:33,720
你将能够快速地使用LangChain组合一些非常酷的应用程序。
This has been instrumental for its rapid rate of development.

23
00:01:33,720 --> 00:01:36,280
谁知道,也许你甚至决定
This team really ships code and features at an amazing pace.

24
00:01:36,280 --> 00:01:39,800
回馈开源LangChain的努力。
So hopefully, after this short course,

25
00:01:39,800 --> 00:01:45,360
LangChain是一个用于构建应用程序的开源开发框架。
you'll be able to quickly put together some really cool applications using LangChain.

26
00:01:45,360 --> 00:01:47,280
我们有两个不同的包,
And who knows, maybe you even decide to

27
00:01:47,280 --> 00:01:49,520
一个是Python,一个是JavaScript。
contribute back to the open source LangChain effort.

28
00:01:49,520 --> 00:01:54,960
它们专注于组合和模块化。
LangChain is an open source development framework for building applications.

29
00:01:54,960 --> 00:01:58,320
因此,它们有许多单独的组件,可以一起使用或单独使用。
We have two different packages,

30
00:01:58,320 --> 00:02:00,080
因此,这是其中一个关键的附加值。
a Python one and a JavaScript one.

31
00:02:00,080 --> 00:02:03,680
另一个关键的附加值是许多不同的用例。
They're focused on composition and modularity.

32
00:02:03,680 --> 00:02:07,280
因此,将这些模块化组件组合成链式方式,形成更多端到端的应用程序,并使其易于开始使用这些用例。
So they have a lot of individual components that can be

33
00:02:07,280 --> 00:02:09,680
在本课程中,我们将介绍LangChain的常见组件。
used in conjunction with each other or by themselves.

34
00:02:09,680 --> 00:02:12,640
因此,我们将讨论模型。
And so that's one of the key value adds.

35
00:02:12,640 --> 00:02:16,080
我们将讨论提示,这是您使模型执行有用和有趣操作的方式。
And then the other key value add is a bunch of different use cases.

36
00:02:16,080 --> 00:02:17,520
我们将讨论索引,
So chains are ways of combining these modular components into

37
00:02:17,520 --> 00:02:19,560
这是一种摄取数据的方式,以便您可以将其与模型结合使用。
more end-to-end applications and making it

38
00:02:19,560 --> 00:02:22,080
然后,我们将讨论链式,
very easy to get started with those use cases.

39
00:02:22,080 --> 00:02:27,920
这是更多的端到端用例,以及代理人,
In this class, we'll cover the common components of LangChain.

40
00:02:27,920 --> 00:02:29,480
这是一种非常令人兴奋的端到端用例,
So we'll talk about models.

41
00:02:29,480 --> 00:02:32,280
它使用模型作为推理引擎。
We'll talk about prompts, which are how you get

42
00:02:32,280 --> 00:02:34,920
我们还感谢Ankush Gholar,
models to do useful and interesting things.

43
00:02:34,920 --> 00:02:37,680
他是LangChain的联合创始人,与Harrison Chase一起,
We'll talk about indexes,

44
00:02:37,680 --> 00:02:40,320
也为这些材料投入了很多思考,并帮助创建了这个短期课程。
which are ways of ingesting data so that you can combine it with models.

45
00:02:40,320 --> 00:02:44,600
在deep learning.ai方面,
And then we'll talk about chains,

46
00:02:44,600 --> 00:02:46,840
Jeff Ludwig,Eddie Hsu和Diala Ezzedine,
which are more end-to-end use cases along with agents,

47
00:02:46,840 --> 00:02:50,480
也为这些材料做出了贡献。
which are a very exciting type of end-to-end use case,

48
00:02:50,480 --> 00:02:52,840
因此,让我们进入下一个视频,了解LangChain的模型,提示和解析器。
which uses the model as a reasoning engine.

--------------------------------------------------------------------------------
/chinese/LangChain_L6.srt:
--------------------------------------------------------------------------------

1
00:00:00,000 --> 00:00:08,920
有时人们认为大型语言模型是一个知识库,

2
00:00:08,920 --> 00:00:11,920
好像它已经学会了记忆大量信息,

3
00:00:11,920 --> 00:00:14,880
也许是从互联网上获取的,所以当你问它一个问题时,

4
00:00:14,880 --> 00:00:16,380
它可以回答这个问题。

5
00:00:16,380 --> 00:00:19,340
但我认为,将大型语言模型视为推理引擎更加有用,

6
00:00:19,340 --> 00:00:22,980
你可以给它一些文本块或其他信息来源。

7
00:00:22,980 --> 00:00:27,140
然后大型语言模型,LLM,

8
00:00:27,140 --> 00:00:29,460
可能会使用从互联网上学到的背景知识,

9
00:00:29,460 --> 00:00:33,000
但是使用你提供的新信息来帮助你回答问题或推理内容或甚至决定下一步该做什么。

10
00:00:33,000 --> 00:00:36,620
这就是LangChain的代理框架帮助你做的事情。

11
00:00:36,620 --> 00:00:41,100
代理可能是我最喜欢的LangChain部分。

12
00:00:41,100 --> 00:00:45,180
我认为它们也是最强大的部分之一,

13
00:00:45,180 --> 00:00:48,340
但它们也是最新的部分之一。

14
00:00:48,340 --> 00:00:50,320
我们正在看到很多新的东西出现在这里,对于该领域的每个人来说都是新的。

15
00:00:50,320 --> 00:00:52,140
这应该是一个非常令人兴奋的课程,因为我们深入探讨

16
00:00:52,140 --> 00:00:56,020
代理是什么,如何创建代理,

17
00:00:56,020 --> 00:00:58,940
以及如何使用代理,

18
00:00:58,940 --> 00:01:01,180
如何为它们配备不同类型的工具,如

19
00:01:01,180 --> 00:01:02,500
内置于LangChain中的搜索引擎,

20
00:01:02,500 --> 00:01:04,860
以及如何创建自己的工具,以便让代理与

21
00:01:04,860 --> 00:01:07,180
任何数据存储,任何API,

22
00:01:07,180 --> 00:01:11,480
任何你想让它们与之交互的函数。

23
00:01:11,480 --> 00:01:14,780
这是令人兴奋的前沿技术,

24
00:01:14,780 --> 00:01:16,860
但已经出现了一些重要的用例。

25
00:01:16,860 --> 00:01:19,460
因此,让我们开始吧。

26
00:01:19,460 --> 00:01:23,060
要开始使用代理,

27
00:01:23,060 --> 00:01:25,620
我们将像往常一样导入正确的环境变量。

28
00:01:25,620 --> 00:01:27,500
我们还需要安装一些软件包。

29
00:01:27,500 --> 00:01:32,420
因此,我们将使用DuckDuckGo搜索引擎和维基百科。

30
00:01:32,420 --> 00:01:35,100
因此,我们将要安装这些。

31
00:01:35,100 --> 00:01:39,020
我已经在我的环境中安装了这些,所以我不会运行这行。

32
00:01:39,020 --> 00:01:40,780
但如果你们没有,

36
00:01:46,360 --> 00:01:48,580
你应该取消注释那一行,

37
00:01:48,580 --> 00:01:51,300
运行它,然后你就可以开始了。

38
00:01:51,300 --> 00:01:56,060
然后我们将从LangChain导入一些我们需要的方法和类。

39
00:01:56,060 --> 00:01:59,060
所以我们要导入一些加载工具的方法,

40
00:01:59,060 --> 00:02:02,340
这些是我们将连接语言模型的东西。

41
00:02:02,340 --> 00:02:05,020
我们将加载一个初始化代理的方法。

42
00:02:05,020 --> 00:02:07,820
我们将加载聊天Open AI包装器,

43
00:02:07,820 --> 00:02:09,500
我们将加载代理类型。

44
00:02:09,500 --> 00:02:14,220
所以代理类型将用于指定我们要使用的代理类型。

45
00:02:14,220 --> 00:02:16,540
LangChain中有许多不同类型的代理。

46
00:02:16,540 --> 00:02:18,780
我们现在不会详细介绍所有这些。

47
00:02:18,780 --> 00:02:21,420
我们将选择一个并运行它。

48
00:02:21,420 --> 00:02:24,700
然后我们将初始化我们要使用的语言模型。

49
00:02:24,700 --> 00:02:30,500
同样,我们将使用它作为我们用来驱动代理的推理引擎。

50
00:02:30,500 --> 00:02:33,740
然后我们将加载我们要使用的工具。

51
00:02:33,740 --> 00:02:37,020
所以我们将加载DuckDuckGo搜索和维基百科,

52
00:02:37,020 --> 00:02:40,140
这些都是内置的LangChain工具。

53
00:02:40,140 --> 00:02:42,980
最后,我们将初始化代理。

54
00:02:42,980 --> 00:02:44,780
我们将传递工具,

55
00:02:44,780 --> 00:02:47,700
语言模型和代理类型。
211 | 56 212 | 00:02:47,700 --> 00:02:49,340 213 | 所以这里我们使用聊天, 214 | 215 | 57 216 | 00:02:49,340 --> 00:02:51,460 217 | 零射击,反应,描述。 218 | 219 | 58 220 | 00:02:51,460 --> 00:02:54,060 221 | 我不会详细介绍这意味着什么。 222 | 223 | 59 224 | 00:02:54,060 --> 00:02:56,220 225 | 需要注意的重要事项是聊天。 226 | 227 | 60 228 | 00:02:56,220 --> 00:03:00,540 229 | 这是针对聊天模型进行优化的,然后是反应。 230 | 231 | 61 232 | 00:03:00,540 --> 00:03:05,620 233 | 反应是一种提示策略,可以从语言模型中引出更好的想法。 234 | 235 | 62 236 | 00:03:05,620 --> 00:03:09,220 237 | 我们还将设置处理解析错误等于true。 238 | 239 | 63 240 | 00:03:09,220 --> 00:03:11,620 241 | 如果您还记得第一课, 242 | 243 | 64 244 | 00:03:11,620 --> 00:03:17,140 245 | 我们谈论了输出解析器以及如何使用它们将LLM输出, 246 | 247 | 65 248 | 00:03:17,140 --> 00:03:22,060 249 | 这是一个字符串,并将其解析为我们可以在下游使用的特定格式。 250 | 251 | 66 252 | 00:03:22,060 --> 00:03:23,740 253 | 这在这里非常重要。 254 | 255 | 67 256 | 00:03:23,740 --> 00:03:25,620 257 | 当我们将LLM的输出, 258 | 259 | 68 260 | 00:03:25,620 --> 00:03:28,940 261 | 这是文本,并将其解析为特定的操作, 262 | 263 | 69 264 | 00:03:28,940 --> 00:03:32,700 265 | 以及语言模型应该采取的特定操作输入。 266 | 267 | 70 268 | 00:03:32,700 --> 00:03:34,300 269 | 现在让我们使用这个代理。 270 | 271 | 71 272 | 00:03:34,300 --> 00:03:38,940 273 | 让我们问一个关于最近事件的问题,这个模型在训练时不知道。 274 | 275 | 72 276 | 00:03:38,940 --> 00:03:41,060 277 | 所以让我们问一下2022年世界杯的情况。 278 | 279 | 73 280 | 00:03:41,060 --> 00:03:43,860 281 | 这里的模型是根据2021年左右的数据进行训练的。 282 | 283 | 74 284 | 00:03:43,860 --> 00:03:47,660 285 | 所以它不应该知道这个问题的答案。 286 | 287 | 75 288 | 00:03:47,660 --> 00:03:49,820 289 | 因此,它应该意识到需要使用工具来查找这个最近的信息。 290 | 291 | 76 292 | 00:03:49,820 --> 00:03:55,580 293 | 所以我们可以看到这里的代理意识到它需要使用DuckDuckGo搜索,然后查找2022年世界杯的获胜者。 294 | 295 | 77 296 | 00:04:05,340 --> 00:04:10,620 297 | 因此,它得到了一些信息。 298 | 299 | 78 300 | 00:04:10,620 --> 00:04:14,900 301 | 然后我们可以看到代理认为2022年世界杯还没有发生。 302 | 303 | 79 304 | 00:04:14,900 --> 00:04:18,060 305 | 所以这是一个很好的例子,说明代理仍然具有探索性。 306 | 307 | 80 308 | 00:04:18,060 --> 00:04:23,940 309 | 我们可以看到这里有很多关于2022年世界杯的信息,但它并没有完全意识到所有的事情都已经发生了。 310 | 311 | 81 312 | 00:04:23,940 --> 
00:04:28,060 313 | 因此,它需要查找更多的信息。 314 | 315 | 82 316 | 00:04:28,060 --> 00:04:32,020 317 | 然后基于这些信息,它可以回答正确的答案,即阿根廷赢得了2022年世界杯。 318 | 319 | 87 320 | 00:04:47,380 --> 00:04:52,500 321 | 然后让我们问一个问题,它应该意识到需要使用维基百科。 322 | 323 | 88 324 | 00:04:52,500 --> 00:04:58,740 325 | 维基百科有很多关于特定人物和特定实体的信息,这些信息可以是很久以前的,不需要是当前的信息。 326 | 327 | 89 328 | 00:04:58,740 --> 00:05:02,980 329 | 所以让我们问一下美国计算机科学家Tom M. Mitchell写了哪本书。 330 | 331 | 90 332 | 00:05:02,980 --> 00:05:06,540 333 | 我们可以看到它意识到应该使用维基百科来查找答案。 334 | 335 | 91 336 | 00:05:06,540 --> 00:05:08,420 337 | 所以它搜索Tom M. Mitchell维基百科。 338 | 339 | 92 340 | 00:05:08,420 --> 00:05:12,700 341 | 然后再进行另一个跟进搜索,以确保它得到了正确的答案。 342 | 343 | 93 344 | 00:05:12,700 --> 00:05:16,020 345 | 所以它搜索Tom M. Mitchell机器学习,并得到更多的信息。 346 | 347 | 94 348 | 00:05:16,020 --> 00:05:19,460 349 | 然后基于这些信息,它最终能够回答Tom M. Mitchell写的教科书是《机器学习》。 350 | 351 | 98 352 | 00:05:29,660 --> 00:05:33,580 353 | 你可以在这里暂停视频,尝试输入不同的内容。 354 | 355 | 99 356 | 00:05:33,580 --> 00:05:38,380 357 | 到目前为止,我们已经使用了LinkedIn中预定义的工具。 358 | 359 | 100 360 | 00:05:38,380 --> 00:05:42,820 361 | 但代理的一个重要功能是可以将其连接到您自己的信息源、API和数据。 362 | 363 | 101 364 | 00:05:42,820 --> 00:05:45,100 365 | 您可以创建一个自定义工具,将其连接到您想要的任何内容。 366 | 367 | 102 368 | 00:05:45,100 --> 00:05:47,700 369 | 现在我们来创建一个工具,它将告诉我们当前的日期。 370 | 371 | 103 372 | 00:05:47,700 --> 00:05:50,700 373 | 首先,我们要导入这个工具装饰器。 374 | 375 | 106 376 | 00:05:57,500 --> 00:06:03,100 377 | 接下来,我们将编写一个名为“time”的函数,它接受任何文本字符串。 378 | 379 | 109 380 | 00:06:09,900 --> 00:06:15,540 381 | 它将通过调用日期时间返回今天的日期。 382 | 383 | 110 384 | 00:06:15,540 --> 00:06:20,660 385 | 除了函数的名称,我们还将编写一个非常详细的文档字符串。 386 | 387 | 111 388 | 00:06:20,660 --> 00:06:25,100 389 | 这是代理将用来知道何时调用此工具以及如何调用此工具的方式。 390 | 391 | 113 392 | 00:06:28,500 --> 00:06:32,060 393 | 例如,在这里,我们说输入应始终为空字符串。 394 | 395 | 116 396 | 00:06:37,460 --> 00:06:42,940 397 | 如果我们对输入有更严格的要求,例如,如果我们有一个应始终接受搜索查询或SQL语句的函数, 398 | 399 | 118 400 | 00:06:47,340 --> 00:06:49,060 401 | 现在我们将创建另一个代理。 402 | 403 | 119 404 | 00:06:49,060 --> 
00:06:55,660 405 | 这次,我们将时间工具添加到现有工具列表中。 406 | 407 | 121 408 | 00:07:03,660 --> 00:07:08,140 409 | 它识别出需要使用时间工具,并在此指定。 410 | 411 | 126 412 | 00:07:18,740 --> 00:07:22,540 413 | 今天的日期是2023年5月21日。 414 | 415 | 128 416 | 00:07:26,860 --> 00:07:29,340 417 | 这就是代理的全部内容。 418 | 419 | 129 420 | 00:07:29,340 --> 00:07:34,740 421 | 这是LangChain中较新、更令人兴奋和更具实验性的部分之一。 422 | 423 | 130 424 | 00:07:34,740 --> 00:07:36,540 425 | 希望您喜欢使用它。 426 | 427 | 131 428 | 00:07:36,540 --> 00:07:40,540 429 | 希望它向您展示了如何使用语言模型作为推理引擎 430 | 431 | 132 432 | 00:07:40,540 --> 00:08:00,540 433 | 以执行不同的操作并连接到其他功能和数据源。 -------------------------------------------------------------------------------- /chinese/LangChain_L2.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:00,000 --> 00:00:18,000 5 | 当你与这些模型交互时,它们自然而然地不会记得你之前说过的话或任何以前的对话,这在构建一些应用程序(如聊天机器人)并希望与它们进行对话时是一个问题。 6 | 7 | 2 8 | 00:00:18,000 --> 00:00:31,000 9 | 因此,在本节中,我们将介绍记忆,即如何记住先前对话的部分并将其馈入语言模型中,以便在与它们交互时具有这种对话流。 10 | 11 | 3 12 | 00:00:31,000 --> 00:00:38,000 13 | 没错。因此,LangChain提供了多种复杂的选项来管理这些记忆。让我们跳进来看看。 14 | 15 | 4 16 | 00:00:38,000 --> 00:00:48,000 17 | 因此,让我首先导入我的OpenAI API密钥,然后让我导入我需要的一些工具。 18 | 19 | 5 20 | 00:00:48,000 --> 00:00:55,000 21 | 让我们以使用LangChain来管理聊天或聊天机器人对话为记忆的动机示例。 22 | 23 | 6 24 | 00:00:55,000 --> 00:01:09,000 25 | 因此,为此,我将把llm设置为OpenAI的聊天界面,温度为零。我将使用内存作为对话缓冲区内存。 26 | 27 | 7 28 | 00:01:09,000 --> 00:01:15,000 29 | 稍后您将看到这意味着什么。我将构建一个对话链。 30 | 31 | 8 32 | 00:01:15,000 --> 00:01:26,000 33 | 同样,在这个短期课程中,Harrison将更深入地探讨LangChain中的链是什么。所以现在不要太担心语法的细节。 34 | 35 | 9 36 | 00:01:26,000 --> 00:01:36,000 37 | 但是这构建了一个llm。如果我开始交谈,conversation.predict,给出输入。嗨,我的名字是安德鲁。 38 | 39 | 10 40 | 00:01:36,000 --> 00:01:47,000 41 | 让我们看看它说什么。你好,安德鲁,很高兴见到你。对吧?等等。然后让我们说我问它,1加1等于多少?
42 | 43 | 11 44 | 00:01:47,000 --> 00:01:55,000 45 | 嗯,1加1等于2。然后再问一遍,你知道我的名字是什么吗?你的名字是安德鲁,正如你之前提到的那样。 46 | 47 | 12 48 | 00:01:55,000 --> 00:02:06,000 49 | 嗯,那里有很多讽刺的痕迹。不确定。因此,如果您愿意,可以将此verbose变量更改为true,以查看LangChain实际上正在做什么。 50 | 51 | 13 52 | 00:02:06,000 --> 00:02:11,000 53 | 当您运行predict,hi,my name is Andrew时,这是LangChain正在生成的提示。 54 | 55 | 14 56 | 00:02:11,000 --> 00:02:16,000 57 | 它说,以下是人类和AI之间友好的对话,AI健谈等等。 58 | 59 | 15 60 | 00:02:16,000 --> 00:02:26,000 61 | 因此,这是LangChain生成的提示,以使系统进行希望和友好的对话,并且必须保存对话,这是响应。 62 | 63 | 16 64 | 00:02:26,000 --> 00:02:35,000 65 | 当您在第二和第三部分对话上执行此操作时,它会保留提示如下。 66 | 67 | 17 68 | 00:02:35,000 --> 00:02:43,000 69 | 请注意,到我说出“我的名字是什么?”这是第三轮,那是我的第三个输入。 70 | 71 | 18 72 | 00:02:43,000 --> 00:02:50,000 73 | 它已将当前对话存储如下。嗨,我的名字是安德鲁,一加一等于多少,等等。 74 | 75 | 19 76 | 00:02:50,000 --> 00:02:57,000 77 | 因此,这个对话的记忆或历史变得越来越长。 78 | 79 | 20 80 | 00:02:57,000 --> 00:03:02,000 81 | 实际上,在顶部,我使用了内存变量来存储内存。 82 | 83 | 21 84 | 00:03:02,000 --> 00:03:08,000 85 | 因此,如果我要打印memory.buffer,它已经存储了到目前为止的对话。 86 | 87 | 22 88 | 00:03:08,000 --> 00:03:14,000 89 | 您还可以打印出这个,memory.loadMemoryVariables。 90 | 91 | 23 92 | 00:03:14,000 --> 00:03:18,000 93 | 这里的花括号实际上是一个空字典。 94 | 95 | 24 96 | 00:03:18,000 --> 00:03:25,000 97 | 有一些更高级的功能,您可以使用更复杂的输入,但我们不会在这个短期课程中讨论它们。 98 | 99 | 25 100 | 00:03:25,000 --> 00:03:28,000 101 | 所以不要担心为什么这里有一个空的花括号。 102 | 103 | 26 104 | 00:03:28,000 --> 00:03:33,000 105 | 但这就是LangChain到目前为止在对话记忆中记住的一切。 106 | 107 | 27 108 | 00:03:33,000 --> 00:03:38,000 109 | 这只是AI或人类说的一切。 110 | 111 | 28 112 | 00:03:38,000 --> 00:03:41,000 113 | 我鼓励您暂停视频并运行代码。 114 | 115 | 29 116 | 00:03:41,000 --> 00:03:49,000 117 | 因此,LangChain存储对话的方式是使用这个对话缓冲区内存。 118 | 119 | 30 120 | 00:03:49,000 --> 00:03:55,000 121 | 如果我使用对话缓冲区内存来指定一些输入和输出, 122 | 123 | 31 124 | 00:03:55,000 --> 00:03:59,000 125 | 如果您希望明确地这样做,这是添加新内容到内存中的方法。 126 | 127 | 32 128 | 00:03:59,000 --> 00:04:03,000 129 | Memory.saveContext说,嗨,怎么样? 
130 | 131 | 33 132 | 00:04:03,000 --> 00:04:09,000 133 | 我知道这不是最令人兴奋的对话,但我想举一个简短的例子。 134 | 135 | 34 136 | 00:04:09,000 --> 00:04:15,000 137 | 嗯,这就是内存现在的状态。 138 | 139 | 35 140 | 00:04:15,000 --> 00:04:22,000 141 | 再次,让我显示一下内存变量。 142 | 143 | 36 144 | 00:04:22,000 --> 00:04:29,000 145 | 现在,如果您想向内存添加其他数据,您可以继续保存其他上下文。 146 | 147 | 37 148 | 00:04:29,000 --> 00:04:34,000 149 | 因此,对话继续,没什么,只是闲逛,很酷。 150 | 151 | 38 152 | 00:04:34,000 --> 00:04:38,000 153 | 如果您打印出内存,您会发现现在有更多的东西。 154 | 155 | 39 156 | 00:04:38,000 --> 00:04:46,000 157 | 因此,当您使用大型语言模型进行聊天对话时,大型语言模型本身实际上是无状态的。 158 | 159 | 40 160 | 00:04:46,000 --> 00:04:51,000 161 | 语言模型本身不记得到目前为止的对话。 162 | 163 | 41 164 | 00:04:51,000 --> 00:04:55,000 165 | 每个交易,每个调用API端点都是独立的。 166 | 167 | 42 168 | 00:04:55,000 --> 00:05:07,000 169 | 聊天机器人似乎只有记忆,因为通常会提供完整的对话作为上下文,以提供给LLM。 170 | 171 | 43 172 | 00:05:07,000 --> 00:05:14,000 173 | 因此,内存可以明确地存储到目前为止的对话轮次或话语。 174 | 175 | 44 176 | 00:05:14,000 --> 00:05:16,000 177 | 嗨,我叫安德鲁。你好,很高兴认识你等等。 178 | 179 | 45 180 | 00:05:16,000 --> 00:05:30,000 181 | 这个内存存储器被用作输入或附加上下文到LLM中,以便它可以生成一个输出,就好像它只是在进行下一轮对话,知道之前说过什么。 182 | 183 | 46 184 | 00:05:30,000 --> 00:05:37,000 185 | 随着对话变得越来越长,所需的内存量也变得非常长。 186 | 187 | 47 188 | 00:05:37,000 --> 00:05:46,000 189 | 因此,将大量的令牌发送到LLM的成本,通常是基于它需要处理的令牌数量而收费,也会变得更加昂贵。 190 | 191 | 48 192 | 00:05:46,000 --> 00:05:54,000 193 | 因此,LangChain提供了几种方便的内存来存储和累积对话。 194 | 195 | 49 196 | 00:05:54,000 --> 00:06:00,000 197 | 到目前为止,我们一直在看对话缓冲区内存。让我们看看另一种类型的内存。 198 | 199 | 50 200 | 00:06:00,000 --> 00:06:09,000 201 | 我将导入只保留一个窗口内存的对话缓冲区窗口内存。 202 | 203 | 51 204 | 00:06:09,000 --> 00:06:20,000 205 | 如果我将内存设置为具有k等于1的对话缓冲区窗口内存,则变量k等于1指定我只想记住一个对话交换。 206 | 207 | 52 208 | 00:06:20,000 --> 00:06:25,000 209 | 也就是我和聊天机器人的一次发言。 210 | 211 | 53 212 | 00:06:25,000 --> 00:06:31,000 213 | 所以现在,如果我让它保存上下文,嗨,怎么样,没什么,只是闲逛。 214 | 215 | 54 216 | 00:06:31,000 --> 00:06:38,000 217 | 如果我查看memory.loadMemoryVariables,它只记住最近的话语。 218 | 219 | 55 220 | 00:06:38,000 --> 00:06:45,000 221 | 请注意,它已经删除了。嗨,怎么样?它只是说,人类说没什么,只是闲逛,AI说很酷。 222 |
223 | 56 224 | 00:06:45,000 --> 00:06:48,000 225 | 所以这是一个很好的功能,因为它可以让你跟踪最近的几轮对话。 226 | 227 | 57 228 | 00:06:48,000 --> 00:06:56,000 229 | 在实践中,您可能不会使用k等于1。您将使用k设置为更大的数字。 230 | 231 | 58 232 | 00:06:56,000 --> 00:07:03,000 233 | 但是,这仍然可以防止随着对话的进行,内存无限增长。 234 | 235 | 59 236 | 00:07:03,000 --> 00:07:10,000 237 | 所以,如果我重新运行我们刚才的对话,我们会说,嗨,我叫安德鲁。 238 | 239 | 60 240 | 00:07:10,000 --> 00:07:23,000 241 | 1加1等于多少?现在我问它,我的名字是什么? 242 | 243 | 61 244 | 00:07:23,000 --> 00:07:32,000 245 | 因为k等于1,它只记得最后一次交流,而不是1加1等于什么? 246 | 247 | 62 248 | 00:07:32,000 --> 00:07:37,000 249 | 答案是1加1等于2,它已经忘记了这个早期的交流,现在说, 250 | 251 | 63 252 | 00:07:37,000 --> 00:07:42,000 253 | 抱歉,我无法访问那些信息。 254 | 255 | 64 256 | 00:07:42,000 --> 00:07:46,000 257 | 我希望你能做的一件事是暂停视频,在左侧的代码中将其更改为true, 258 | 259 | 66 260 | 00:07:53,000 --> 00:07:57,000 261 | 并使用verbose等于true重新运行此对话。 262 | 263 | 67 264 | 00:07:57,000 --> 00:08:00,000 265 | 然后您将看到实际用于生成此内容的提示。 266 | 267 | 68 268 | 00:08:00,000 --> 00:08:08,000 269 | 希望您能看到当您在调用LLM时,询问“我的名字是什么”时, 270 | 271 | 69 272 | 00:08:08,000 --> 00:08:11,000 273 | 内存已删除了我学习“我的名字是什么”的交换, 274 | 275 | 70 276 | 00:08:11,000 --> 00:08:17,000 277 | 这就是为什么现在它说不知道我的名字。 278 | 279 | 71 280 | 00:08:17,000 --> 00:08:28,000 281 | 使用对话令牌缓冲器内存,内存将限制保存的令牌数量。 282 | 283 | 72 284 | 00:08:28,000 --> 00:08:39,000 285 | 由于LLM定价的很多是基于令牌的,因此这更直接地映射到LLM调用的成本。 286 | 287 | 73 288 | 00:08:39,000 --> 00:08:47,000 289 | 因此,如果我说最大令牌限制等于50,实际上让我注入一些评论。 290 | 291 | 74 292 | 00:08:47,000 --> 00:08:51,000 293 | 所以让我们说,对话是,AI是什么?惊人。 294 | 295 | 75 296 | 00:08:51,000 --> 00:08:54,000 297 | 反向传播是什么?美丽。聊天机器人是什么?迷人。 298 | 299 | 76 300 | 00:08:54,000 --> 00:08:58,000 301 | 我使用ABC作为所有这些对话轮次的第一个字母。 302 | 303 | 77 304 | 00:08:58,000 --> 00:09:02,000 305 | 我们可以跟踪,嗯,什么时候说了什么。 306 | 307 | 78 308 | 00:09:02,000 --> 00:09:08,000 309 | 如果我使用高令牌限制运行它,它几乎包含了整个对话。 310 | 311 | 79 312 | 00:09:08,000 --> 00:09:14,000 313 | 如果我将令牌限制增加到100,它现在包含了整个对话。 314 | 315 | 80 316 | 00:09:14,000 --> 00:09:24,000 317 |
所以我有AI是什么?如果我减少它,那么,您知道,它会切掉这个对话的早期部分 318 | 319 | 81 320 | 00:09:24,000 --> 00:09:28,000 321 | 以保留与最近的交流相对应的令牌数量, 322 | 323 | 82 324 | 00:09:28,000 --> 00:09:32,000 325 | 但不超过令牌限制。 326 | 327 | 83 328 | 00:09:32,000 --> 00:09:35,000 329 | 如果您想知道为什么我们需要指定LLM, 330 | 331 | 84 332 | 00:09:35,000 --> 00:09:39,000 333 | 那是因为不同的LLM使用不同的令牌计数方式。 334 | 335 | 85 336 | 00:09:39,000 --> 00:09:46,000 337 | 因此,这告诉它使用聊天OpenAI LLM使用的令牌计数方式。 338 | 339 | 86 340 | 00:09:46,000 --> 00:09:49,000 341 | 我鼓励您暂停视频并运行代码, 342 | 343 | 87 344 | 00:09:49,000 --> 00:09:54,000 345 | 并尝试修改提示以查看是否可以获得不同的输出。 346 | 347 | 88 348 | 00:09:54,000 --> 00:09:58,000 349 | 最后,我想在这里说明的最后一种记忆类型是对话摘要缓冲区内存。 350 | 351 | 89 352 | 00:10:04,000 --> 00:10:12,000 353 | 这个想法是,不是将内存限制为基于最近话语的固定数量的令牌 354 | 355 | 90 356 | 00:10:12,000 --> 00:10:15,000 357 | 或固定数量的对话交换, 358 | 359 | 91 360 | 00:10:15,000 --> 00:10:24,000 361 | 让我们使用LLM编写对话摘要,并让其成为内存。 362 | 363 | 00:10:24,000 --> 00:10:29,000 364 | 这里有一个例子,我将创建一个长字符串,其中包含某人的日程安排。 365 | 366 | 00:10:29,000 --> 00:10:31,000 367 | 你知道,早上8点与产品团队开会, 368 | 369 | 00:10:31,000 --> 00:10:33,000 370 | 你需要准备好你的PowerPoint演示文稿等等。 371 | 372 | 00:10:33,000 --> 00:10:38,000 373 | 所以这是一个长字符串,说出你的日程安排,你知道的, 374 | 375 | 00:10:38,000 --> 00:10:42,000 376 | 也许以与客户在意大利餐厅的午餐结束, 377 | 378 | 00:10:42,000 --> 00:10:46,000 379 | 带上你的笔记本电脑,展示LLM,最新的LLM演示。 380 | 381 | 00:10:46,000 --> 00:10:53,000 382 | 所以让我使用一个对话摘要缓冲区内存,嗯, 383 | 384 | 00:10:53,000 --> 00:10:58,000 385 | 在这种情况下,最大令牌限制为400,令牌限制相当高。 386 | 387 | 00:10:58,000 --> 00:11:05,000 388 | 我将插入几轮对话,以我们开始的方式, 389 | 390 | 00:11:05,000 --> 00:11:10,000 391 | 你好,怎么了,没什么,只是闲逛,嗯,酷。 392 | 393 | 00:11:10,000 --> 00:11:17,000 394 | 然后今天的日程安排是什么,回答是,你知道,那个长长的日程安排。 395 | 396 | 00:11:17,000 --> 00:11:22,000 397 | 所以这个内存现在有相当多的文本。 398 | 399 | 00:11:22,000 --> 00:11:26,000 400 | 事实上,让我们看看内存变量。 401 | 402 | 00:11:26,000 --> 00:11:37,000 403 | 它包含整个文本,因为400个令牌足以存储所有这些文本。 404 | 405 | 00:11:37,000 --> 00:11:43,000 406 | 但是现在,如果我将最大令牌限制减少,比如将其减少到100个令牌, 407 | 408 |
00:11:43,000 --> 00:11:46,000 409 | 请记住,这存储了整个对话历史记录。 410 | 411 | 00:11:46,000 --> 00:11:50,000 412 | 如果我将令牌数减少到100个, 413 | 414 | 00:11:50,000 --> 00:11:57,000 415 | 那么对话摘要缓冲区内存实际上已经使用了LLM, 416 | 417 | 00:11:57,000 --> 00:12:01,000 418 | 在这种情况下,我们已经将LLM设置为OpenAI端点, 419 | 420 | 00:12:01,000 --> 00:12:05,000 421 | 以生成到目前为止对话的摘要。 422 | 423 | 00:12:05,000 --> 00:12:09,000 424 | 因此,摘要是人工智能在计划日程之前进行了闲聊, 425 | 426 | 00:12:09,000 --> 00:12:12,000 427 | 并在早晨会议上通知人类,等等, 428 | 429 | 00:12:12,000 --> 00:12:17,000 430 | 午餐会议与对人工智能感兴趣的客户, 431 | 432 | 00:12:17,000 --> 00:12:28,000 433 | 最新的人工智能发展。如果我们使用这个LLM进行对话, 434 | 435 | 00:12:28,000 --> 00:12:33,000 436 | 然后创建一个对话链,与之前相同。 437 | 438 | 00:12:33,000 --> 00:12:41,000 439 | 如果我们问,你知道什么是一个好的演示文稿吗? 440 | 441 | 00:12:41,000 --> 00:12:43,000 442 | 嗯,我说verbose=true。 443 | 444 | 00:12:43,000 --> 00:12:53,000 445 | 所以这里是提示,LLM认为当前的对话已经讨论过这个问题了, 446 | 447 | 00:12:53,000 --> 00:12:56,000 448 | 因为这是对话的摘要。 449 | 450 | 122 451 | 00:12:56,000 --> 00:13:03,000 452 | 还有一点需要注意,如果你熟悉OpenAI聊天API端点, 453 | 454 | 123 455 | 00:13:03,000 --> 00:13:07,000 456 | 有一个特定的系统消息。 457 | 458 | 124 459 | 00:13:07,000 --> 00:13:11,000 460 | 在这个例子中,这并没有使用官方的OpenAI系统消息。 461 | 462 | 125 463 | 00:13:11,000 --> 00:13:14,000 464 | 它只是将其作为提示的一部分包含在内。 465 | 466 | 126 467 | 00:13:14,000 --> 00:13:16,000 468 | 但它仍然运行得相当不错。 469 | 470 | 127 471 | 00:13:16,000 --> 00:13:21,000 472 | 鉴于这个提示,你知道,LLM输出基本的对AI发展感兴趣的客户, 473 | 474 | 128 475 | 00:13:21,000 --> 00:13:24,000 476 | 或者建议展示我们最新的NLP能力。 477 | 478 | 129 479 | 00:13:24,000 --> 00:13:26,000 480 | 好的,很酷。 481 | 482 | 130 483 | 00:13:26,000 --> 00:13:31,000 484 | 嗯,它正在做一些向酷炫演示的建议, 485 | 486 | 131 487 | 00:13:31,000 --> 00:13:35,000 488 | 并让你想到如果我遇到一个客户,我会说, 489 | 490 | 132 491 | 00:13:35,000 --> 00:13:43,000 492 | 天哪,如果有一个开源框架可以帮助我使用LLMs构建酷炫的NLP应用程序。 493 | 494 | 133 495 | 00:13:43,000 --> 00:13:46,000 496 | 好事正在发生。 497 | 498 | 134 499 | 00:13:46,000 --> 00:13:58,000 500 | 有趣的是,如果你现在看看记忆发生了什么。 501 | 502 | 135 503 | 00:13:58,000 -->
00:14:04,000 504 | 请注意,这里已经合并了最近的AI系统输出, 505 | 506 | 136 507 | 00:14:04,000 --> 00:14:11,000 508 | 而我的话问它是否是一个好的演示已经被合并到系统消息中。 509 | 510 | 137 511 | 00:14:11,000 --> 00:14:14,000 512 | 你知道,到目前为止的对话总结。 513 | 514 | 138 515 | 00:14:14,000 --> 00:14:17,000 516 | 通过对话摘要缓冲区内存, 517 | 518 | 139 519 | 00:14:17,000 --> 00:14:27,000 520 | 它试图保持消息的显式存储,直到我们指定的令牌数为止。 521 | 522 | 140 523 | 00:14:27,000 --> 00:14:30,000 524 | 所以,你知道,这部分,显式存储, 525 | 526 | 141 527 | 00:14:30,000 --> 00:14:34,000 528 | 我们试图将其限制在100个令牌,因为这是我们要求的。 529 | 530 | 142 531 | 00:14:34,000 --> 00:14:38,000 532 | 然后,任何超过这个限制的内容,它都将使用LLM生成一个摘要, 533 | 534 | 143 535 | 00:14:38,000 --> 00:14:41,000 536 | 这就是上面看到的内容。 537 | 538 | 144 539 | 00:14:41,000 --> 00:14:46,000 540 | 即使我使用聊天作为一个运行示例来说明这些不同的记忆, 541 | 542 | 145 543 | 00:14:46,000 --> 00:14:49,000 544 | 这些记忆也对其他应用程序有用, 545 | 546 | 146 547 | 00:14:49,000 --> 00:14:54,000 548 | 在这些应用程序中,你可能会不断地获得新的文本片段或新的信息, 549 | 550 | 147 551 | 00:14:54,000 --> 00:14:59,000 552 | 比如如果你的系统反复在线搜索事实, 553 | 554 | 148 555 | 00:14:59,000 --> 00:15:04,000 556 | 但你希望保持用于存储这个不断增长的事实列表的总内存大小为, 557 | 558 | 149 559 | 00:15:04,000 --> 00:15:07,000 560 | 有限的,而不是任意增长。 561 | 562 | 150 563 | 00:15:07,000 --> 00:15:11,000 564 | 我鼓励你暂停视频并运行代码。 565 | 566 | 151 567 | 00:15:11,000 --> 00:15:15,000 568 | 在这个视频中,你看到了一些类型的内存,包括基于对话交换或令牌数量限制的缓冲内存, 569 | 570 | 152 571 | 00:15:15,000 --> 00:15:21,000 572 | 或者可以总结超过一定限制的令牌的内存。 573 | 574 | 153 575 | 00:15:21,000 --> 00:15:26,000 576 | LangChain实际上还支持其他类型的内存。 577 | 578 | 155 579 | 00:15:30,000 --> 00:15:33,000 580 | 其中最强大的之一是向量数据内存。 581 | 582 | 156 583 | 00:15:33,000 --> 00:15:36,000 584 | 如果你熟悉单词嵌入和文本嵌入, 585 | 586 | 157 587 | 00:15:36,000 --> 00:15:39,000 588 | 向量数据库实际上存储这样的嵌入。 589 | 590 | 158 591 | 00:15:39,000 --> 00:15:41,000 592 | 如果你不知道这是什么意思,不用担心。 593 | 594 | 159 595 | 00:15:41,000 --> 00:15:43,000 596 | 哈里森会在后面解释。 597 | 598 | 160 599 | 00:15:43,000 --> 00:15:51,000 600 | 然后,它可以使用这种类型的向量数据库检索最相关的文本块作为其内存。 601 | 602 | 161 603 | 00:15:51,000 -->
00:15:54,000 604 | LangChain还支持实体内存, 605 | 606 | 162 607 | 00:15:54,000 --> 00:15:58,000 608 | 当你想让它记住特定人物的细节时,这是适用的, 609 | 610 | 163 611 | 00:15:58,000 --> 00:16:04,000 612 | 特定的其他实体,比如如果你谈论一个特定的朋友, 613 | 614 | 164 615 | 00:16:04,000 --> 00:16:08,000 616 | 你可以让LangChain记住关于那个朋友的事实, 617 | 618 | 165 619 | 00:16:08,000 --> 00:16:12,000 620 | 这将是一种明确的实体。 621 | 622 | 166 623 | 00:16:12,000 --> 00:16:14,000 624 | 当你使用LangChain实现应用程序时, 625 | 626 | 167 627 | 00:16:14,000 --> 00:16:17,000 628 | 你还可以使用多种类型的内存, 629 | 630 | 168 631 | 00:16:17,000 --> 00:16:22,000 632 | 比如使用你在这个视频中看到的一种对话内存类型。 633 | 634 | 169 635 | 00:16:22,000 --> 00:16:26,000 636 | 此外,还可以使用实体内存来回忆个人。 637 | 638 | 170 639 | 00:16:26,000 --> 00:16:30,000 640 | 这样它就可以记住对话的摘要, 641 | 642 | 171 643 | 00:16:30,000 --> 00:16:35,000 644 | 再加上以明确的方式存储重要人物的重要事实。 645 | 646 | 172 647 | 00:16:35,000 --> 00:16:38,000 648 | 当然,除了使用这些内存类型之外, 649 | 650 | 173 651 | 00:16:38,000 --> 00:16:43,000 652 | 开发人员还经常将整个对话存储在传统数据库中, 653 | 654 | 174 655 | 00:16:43,000 --> 00:16:46,000 656 | 某种键值存储或SQL数据库。 657 | 658 | 175 659 | 00:16:46,000 --> 00:16:51,000 660 | 因此,你可以回顾整个对话以进行审计或进一步改进系统。 661 | 662 | 176 663 | 00:16:51,000 --> 00:16:53,000 664 | 这就是内存类型。 665 | 666 | 177 667 | 00:16:53,000 --> 00:16:57,000 668 | 我希望你在构建自己的应用程序时会发现这个视频有用。 669 | 670 | 178 671 | 00:16:57,000 --> 00:17:21,000 672 | 现在,让我们继续下一个视频,了解LangChain的关键构建块,即链。 -------------------------------------------------------------------------------- /english/LangChain_L6.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:08,920 3 | Sometimes people think of a large language model as a knowledge store, 4 | 5 | 2 6 | 00:00:08,920 --> 00:00:11,920 7 | as if it's learned to memorize a lot of information, 8 | 9 | 3 10 | 00:00:11,920 --> 00:00:14,880 11 | maybe off the internet, so when you ask it a question, 12 | 13 | 4 14 | 00:00:14,880 --> 00:00:16,380 15 | it can answer the question.
16 | 17 | 5 18 | 00:00:16,380 --> 00:00:19,340 19 | But I think an even more useful way to think of 20 | 21 | 6 22 | 00:00:19,340 --> 00:00:22,980 23 | a large language model is sometimes as a reasoning engine, 24 | 25 | 7 26 | 00:00:22,980 --> 00:00:27,140 27 | in which you can give it chunks of text or other sources of information. 28 | 29 | 8 30 | 00:00:27,140 --> 00:00:29,460 31 | Then the large language model, llm, 32 | 33 | 9 34 | 00:00:29,460 --> 00:00:33,000 35 | will maybe use this background knowledge that's learned off the internet, 36 | 37 | 10 38 | 00:00:33,000 --> 00:00:36,620 39 | but to use the new information you give it to help you answer 40 | 41 | 11 42 | 00:00:36,620 --> 00:00:41,100 43 | questions or reason through content or decide even what to do next. 44 | 45 | 12 46 | 00:00:41,100 --> 00:00:45,180 47 | That's what LangChain's agents framework helps you to do. 48 | 49 | 13 50 | 00:00:45,180 --> 00:00:48,340 51 | Agents are probably my favorite part of LangChain. 52 | 53 | 14 54 | 00:00:48,340 --> 00:00:50,320 55 | I think they're also one of the most powerful parts, 56 | 57 | 15 58 | 00:00:50,320 --> 00:00:52,140 59 | but they're also one of the newer parts. 60 | 61 | 16 62 | 00:00:52,140 --> 00:00:56,020 63 | We're seeing a lot of stuff emerge here that's really new to everyone in the field.
64 | 65 | 17 66 | 00:00:56,020 --> 00:00:58,940 67 | This should be a very exciting lesson as we dive 68 | 69 | 18 70 | 00:00:58,940 --> 00:01:01,180 71 | into what agents are, how to create, 72 | 73 | 19 74 | 00:01:01,180 --> 00:01:02,500 75 | and how to use agents, 76 | 77 | 20 78 | 00:01:02,500 --> 00:01:04,860 79 | how to equip them with different types of tools like 80 | 81 | 21 82 | 00:01:04,860 --> 00:01:07,180 83 | search engines that come built into LangChain, 84 | 85 | 22 86 | 00:01:07,180 --> 00:01:11,480 87 | and then also how to create your own tools so that you can let agents interact with 88 | 89 | 23 90 | 00:01:11,480 --> 00:01:14,780 91 | any data stores, any APIs, 92 | 93 | 24 94 | 00:01:14,780 --> 00:01:16,860 95 | any functions that you might want them to. 96 | 97 | 25 98 | 00:01:16,860 --> 00:01:19,460 99 | This is exciting, cutting-edge stuff, 100 | 101 | 26 102 | 00:01:19,460 --> 00:01:23,060 103 | but already with emerging important use cases. 104 | 105 | 27 106 | 00:01:23,060 --> 00:01:25,620 107 | So with that, let's dive in. 108 | 109 | 28 110 | 00:01:25,620 --> 00:01:27,500 111 | To get started with agents, 112 | 113 | 29 114 | 00:01:27,500 --> 00:01:32,420 115 | we're going to start as we always do by importing the correct environment variables. 116 | 117 | 30 118 | 00:01:32,420 --> 00:01:35,100 119 | We're also going to need to install a few packages here. 120 | 121 | 31 122 | 00:01:35,100 --> 00:01:39,020 123 | So we're going to use the DuckDuckGo search engine and Wikipedia. 124 | 125 | 32 126 | 00:01:39,020 --> 00:01:40,780 127 | So we're going to want to pip install those. 128 | 129 | 33 130 | 00:01:40,780 --> 00:01:42,940 131 | I've already installed those in my environment, 132 | 133 | 34 134 | 00:01:42,940 --> 00:01:44,540 135 | so I'm not going to run this line.
136 | 137 | 35 138 | 00:01:44,540 --> 00:01:46,360 139 | But if you guys have not, 140 | 141 | 36 142 | 00:01:46,360 --> 00:01:48,580 143 | you should uncomment that line, 144 | 145 | 37 146 | 00:01:48,580 --> 00:01:51,300 147 | run it, and then you're good to go. 148 | 149 | 38 150 | 00:01:51,300 --> 00:01:56,060 151 | We're then going to import some methods and classes that we need from LangChain. 152 | 153 | 39 154 | 00:01:56,060 --> 00:01:59,060 155 | So we're going to import some methods to load tools, 156 | 157 | 40 158 | 00:01:59,060 --> 00:02:02,340 159 | and these are things that we're going to connect the language model to. 160 | 161 | 41 162 | 00:02:02,340 --> 00:02:05,020 163 | We're going to load a method to initialize the agent. 164 | 165 | 42 166 | 00:02:05,020 --> 00:02:07,820 167 | We're going to load the chat OpenAI wrapper, 168 | 169 | 43 170 | 00:02:07,820 --> 00:02:09,500 171 | and we're going to load agent type. 172 | 173 | 44 174 | 00:02:09,500 --> 00:02:14,220 175 | So agent type will be used to specify what type of agent we want to use. 176 | 177 | 45 178 | 00:02:14,220 --> 00:02:16,540 179 | There are a bunch of different types of agents in LangChain. 180 | 181 | 46 182 | 00:02:16,540 --> 00:02:18,780 183 | We're not going to go over all of them right now. 184 | 185 | 47 186 | 00:02:18,780 --> 00:02:21,420 187 | We'll just choose one and run with that. 188 | 189 | 48 190 | 00:02:21,420 --> 00:02:24,700 191 | We're then going to initialize the language model that we're going to use. 192 | 193 | 49 194 | 00:02:24,700 --> 00:02:30,500 195 | Again, we're using this as the reasoning engine that we're using to drive the agent. 196 | 197 | 50 198 | 00:02:30,500 --> 00:02:33,740 199 | We'll then load the tools that we're going to use. 200 | 201 | 51 202 | 00:02:33,740 --> 00:02:37,020 203 | So we're going to load DuckDuckGo search and Wikipedia, 204 | 205 | 52 206 | 00:02:37,020 --> 00:02:40,140 207 | and these are built-in LangChain tools.
208 | 209 | 53 210 | 00:02:40,140 --> 00:02:42,980 211 | Finally, we're going to initialize the agent. 212 | 213 | 54 214 | 00:02:42,980 --> 00:02:44,780 215 | We pass it the tools, 216 | 217 | 55 218 | 00:02:44,780 --> 00:02:47,700 219 | the language model, and agent type. 220 | 221 | 56 222 | 00:02:47,700 --> 00:02:49,340 223 | So here we're using chat, 224 | 225 | 57 226 | 00:02:49,340 --> 00:02:51,460 227 | zero shot, react, description. 228 | 229 | 58 230 | 00:02:51,460 --> 00:02:54,060 231 | I'm not going to go over in too much detail what this means. 232 | 233 | 59 234 | 00:02:54,060 --> 00:02:56,220 235 | The important things to note are chat. 236 | 237 | 60 238 | 00:02:56,220 --> 00:03:00,540 239 | This is optimized to work with chat models, and then react. 240 | 241 | 61 242 | 00:03:00,540 --> 00:03:05,620 243 | React is a prompting strategy that elicits better thoughts from a language model. 244 | 245 | 62 246 | 00:03:05,620 --> 00:03:09,220 247 | We're also going to set handle parsing errors equals true. 248 | 249 | 63 250 | 00:03:09,220 --> 00:03:11,620 251 | If you remember from the first lesson, 252 | 253 | 64 254 | 00:03:11,620 --> 00:03:17,140 255 | we chatted a bunch about output parsers and how those can be used to take the LLM output, 256 | 257 | 65 258 | 00:03:17,140 --> 00:03:22,060 259 | which is a string, and parse it into a specific format that we can use downstream. 260 | 261 | 66 262 | 00:03:22,060 --> 00:03:23,740 263 | That's extremely important here. 264 | 265 | 67 266 | 00:03:23,740 --> 00:03:25,620 267 | When we take the output of the LLM, 268 | 269 | 68 270 | 00:03:25,620 --> 00:03:28,940 271 | which is text, and parse it into the specific action, 272 | 273 | 69 274 | 00:03:28,940 --> 00:03:32,700 275 | and the specific action input that the language model should take. 276 | 277 | 70 278 | 00:03:32,700 --> 00:03:34,300 279 | Let's now use this agent. 
280 | 281 | 71 282 | 00:03:34,300 --> 00:03:38,940 283 | Let's ask it a question about a recent event that the model, 284 | 285 | 72 286 | 00:03:38,940 --> 00:03:41,060 287 | when it was trained, didn't know. 288 | 289 | 73 290 | 00:03:41,060 --> 00:03:43,860 291 | So let's ask it about the 2022 World Cup. 292 | 293 | 74 294 | 00:03:43,860 --> 00:03:47,660 295 | The models here were trained on data up to around 2021. 296 | 297 | 75 298 | 00:03:47,660 --> 00:03:49,820 299 | So it shouldn't know the answer to this question. 300 | 301 | 76 302 | 00:03:49,820 --> 00:03:55,580 303 | And so it should realize that it needs to use a tool to look up this recent piece of information. 304 | 305 | 77 306 | 00:04:05,340 --> 00:04:10,620 307 | So we can see here that the agent realizes that it needs to use DuckDuckGo search, 308 | 309 | 78 310 | 00:04:10,620 --> 00:04:14,900 311 | and then looks up the 2022 World Cup winner. 312 | 313 | 79 314 | 00:04:14,900 --> 00:04:18,060 315 | And so it gets back a bunch of information. 316 | 317 | 80 318 | 00:04:18,060 --> 00:04:23,940 319 | We can then see that the agent thinks that the 2022 World Cup has not happened yet. 320 | 321 | 81 322 | 00:04:23,940 --> 00:04:28,060 323 | So this is a good example of why agents are still pretty exploratory. 324 | 325 | 82 326 | 00:04:28,060 --> 00:04:32,020 327 | You can see that there's a bunch of information here about the 2022 World Cup, 328 | 329 | 83 330 | 00:04:32,020 --> 00:04:34,180 331 | but it doesn't quite realize all that has happened. 332 | 333 | 84 334 | 00:04:34,180 --> 00:04:38,180 335 | And so it needs to look up more things and get more information. 336 | 337 | 85 338 | 00:04:38,180 --> 00:04:40,220 339 | And so then based on that information, 340 | 341 | 86 342 | 00:04:40,220 --> 00:04:47,380 343 | it can respond with the correct answer that Argentina won the 2022 World Cup. 
344 | 345 | 87 346 | 00:04:47,380 --> 00:04:52,500 347 | Let's then ask it a question where it should recognize that it needs to use Wikipedia. 348 | 349 | 88 350 | 00:04:52,500 --> 00:04:58,740 351 | So Wikipedia has a lot of information about specific people and specific entities 352 | 353 | 89 354 | 00:04:58,740 --> 00:05:02,980 355 | from a long time ago. It doesn't need to be current information. 356 | 357 | 90 358 | 00:05:02,980 --> 00:05:06,540 359 | So let's ask it about Tom M. Mitchell, an American computer scientist, 360 | 361 | 91 362 | 00:05:06,540 --> 00:05:08,420 363 | and what book did he write? 364 | 365 | 92 366 | 00:05:08,420 --> 00:05:12,700 367 | We can see here that it recognizes that it should use Wikipedia to look up the answer. 368 | 369 | 93 370 | 00:05:12,700 --> 00:05:16,020 371 | So it searches for Tom M. Mitchell Wikipedia. 372 | 373 | 94 374 | 00:05:16,020 --> 00:05:19,460 375 | And then does another follow-up search just to make sure that it's got the right answer. 376 | 377 | 95 378 | 00:05:19,460 --> 00:05:24,340 379 | So it searches for Tom M. Mitchell machine learning and gets back more information. 380 | 381 | 96 382 | 00:05:24,340 --> 00:05:27,060 383 | And then based on that, it's able to finally answer with 384 | 385 | 97 386 | 00:05:27,060 --> 00:05:29,660 387 | Tom M. Mitchell wrote the textbook, Machine Learning. 388 | 389 | 98 390 | 00:05:29,660 --> 00:05:33,580 391 | You should pause the video here and try putting in different inputs. 392 | 393 | 99 394 | 00:05:33,580 --> 00:05:38,380 395 | So far, we've used tools that come defined in LangChain already, 396 | 397 | 100 398 | 00:05:38,380 --> 00:05:42,820 399 | but a big power of agents is that you can connect it to your own sources of information, 400 | 401 | 101 402 | 00:05:42,820 --> 00:05:45,100 403 | your own APIs, your own data.
404 | 405 | 102 406 | 00:05:45,100 --> 00:05:47,700 407 | So here we're going to go over how you can create a custom tool 408 | 409 | 103 410 | 00:05:47,700 --> 00:05:50,700 411 | so that you can connect it to whatever you want. 412 | 413 | 104 414 | 00:05:50,700 --> 00:05:54,660 415 | Let's make a tool that's going to tell us what the current date is. 416 | 417 | 105 418 | 00:05:54,660 --> 00:05:57,500 419 | First, we're going to import this tool decorator. 420 | 421 | 106 422 | 00:05:57,500 --> 00:06:03,100 423 | This can be applied to any function, and it turns it into a tool that LangChain can use. 424 | 425 | 107 426 | 00:06:03,100 --> 00:06:08,500 427 | Next, we're going to write a function called time, which takes in any text string. 428 | 429 | 108 430 | 00:06:08,500 --> 00:06:09,900 431 | We're not really going to use that. 432 | 433 | 109 434 | 00:06:09,900 --> 00:06:15,540 435 | And it's going to return today's date by calling date time. 436 | 437 | 110 438 | 00:06:15,540 --> 00:06:20,660 439 | In addition to the name of the function, we're also going to write a really detailed doc string. 440 | 441 | 111 442 | 00:06:20,660 --> 00:06:25,100 443 | That's because this is what the agent will use to know when it should call this tool 444 | 445 | 112 446 | 00:06:25,100 --> 00:06:28,500 447 | and how it should call this tool. 448 | 449 | 113 450 | 00:06:28,500 --> 00:06:32,060 451 | For example, here we say that the input should always be an empty string. 452 | 453 | 114 454 | 00:06:32,060 --> 00:06:33,860 455 | That's because we don't use it. 456 | 457 | 115 458 | 00:06:33,860 --> 00:06:37,460 459 | If we have more stringent requirements on what the input should be, 460 | 461 | 116 462 | 00:06:37,460 --> 00:06:42,940 463 | for example, if we have a function that should always take in a search query or a SQL statement, 464 | 465 | 117 466 | 00:06:42,940 --> 00:06:47,340 467 | you'll want to make sure to mention that here.
468 | 469 | 118 470 | 00:06:47,340 --> 00:06:49,060 471 | We're now going to create another agent. 472 | 473 | 119 474 | 00:06:49,060 --> 00:06:55,660 475 | This time we're adding the time tool to the list of existing tools. 476 | 477 | 120 478 | 00:06:55,660 --> 00:07:03,660 479 | And finally, let's call the agent and ask it what the date today is. 480 | 481 | 121 482 | 00:07:03,660 --> 00:07:08,140 483 | It recognizes that it needs to use the time tool, which it specifies here. 484 | 485 | 122 486 | 00:07:08,140 --> 00:07:10,340 487 | It has the action input as an empty string. 488 | 489 | 123 490 | 00:07:10,340 --> 00:07:12,540 491 | This is great. This is what we told it to do. 492 | 493 | 124 494 | 00:07:12,540 --> 00:07:14,660 495 | And then it returns with an observation. 496 | 497 | 125 498 | 00:07:14,660 --> 00:07:18,740 499 | And then finally, the language model takes that observation and responds to the user. 500 | 501 | 126 502 | 00:07:18,740 --> 00:07:22,540 503 | Today's date is 2023-05-21. 504 | 505 | 127 506 | 00:07:22,540 --> 00:07:26,860 507 | You should pause the video here and try putting in different inputs. 508 | 509 | 128 510 | 00:07:26,860 --> 00:07:29,340 511 | This wraps up the lesson on agents. 512 | 513 | 129 514 | 00:07:29,340 --> 00:07:34,740 515 | This is one of the newer and more exciting and more experimental pieces of LangChain. 516 | 517 | 130 518 | 00:07:34,740 --> 00:07:36,540 519 | So I hope you enjoy using it. 520 | 521 | 131 522 | 00:07:36,540 --> 00:07:40,540 523 | Hopefully it showed you how you can use a language model as a reasoning engine 524 | 525 | 132 526 | 00:07:40,540 --> 00:08:00,540 527 | to take different actions and connect to other functions and data sources. 
528 | 529 | 530 | -------------------------------------------------------------------------------- /LangChain_L6.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:08,920 3 | 有时人们认为大型语言模型是一个知识库, 4 | Sometimes people think of a large language model as a knowledge store, 5 | 6 | 2 7 | 00:00:08,920 --> 00:00:11,920 8 | 好像它已经学会了记忆大量信息, 9 | as if it's learned to memorize a lot of information, 10 | 11 | 3 12 | 00:00:11,920 --> 00:00:14,880 13 | 也许是从互联网上获取的,所以当你问它一个问题时, 14 | maybe off the internet, so when you ask it a question, 15 | 16 | 4 17 | 00:00:14,880 --> 00:00:16,380 18 | 它可以回答这个问题。 19 | it can answer the question. 20 | 21 | 5 22 | 00:00:16,380 --> 00:00:19,340 23 | 但我认为,将大型语言模型视为推理引擎更加有用, 24 | But I think an even more useful way to think of 25 | 26 | 6 27 | 00:00:19,340 --> 00:00:22,980 28 | 你可以给它一些文本块或其他信息来源。 29 | a large language model is sometimes as a reasoning engine, 30 | 31 | 7 32 | 00:00:22,980 --> 00:00:27,140 33 | 然后大型语言模型,LLM, 34 | in which you can give it chunks of text or other sources of information. 35 | 36 | 8 37 | 00:00:27,140 --> 00:00:29,460 38 | 可能会使用从互联网上学到的背景知识, 39 | Then the large language model, llm, 40 | 41 | 9 42 | 00:00:29,460 --> 00:00:33,000 43 | 但是使用你提供的新信息来帮助你回答问题或推理内容或甚至决定下一步该做什么。 44 | will maybe use this background knowledge that's learned off the internet, 45 | 46 | 10 47 | 00:00:33,000 --> 00:00:36,620 48 | 这就是LangChain的代理框架帮助你做的事情。 49 | but to use the new information you give it to help you answer 50 | 51 | 11 52 | 00:00:36,620 --> 00:00:41,100 53 | 代理可能是我最喜欢的LangChain部分。 54 | questions or reason through content or decide even what to do next. 55 | 56 | 12 57 | 00:00:41,100 --> 00:00:45,180 58 | 我认为它们也是最强大的部分之一, 59 | That's what LangChain's agents framework helps you to do. 60 | 61 | 13 62 | 00:00:45,180 --> 00:00:48,340 63 | 但它们也是最新的部分之一。 64 | Agents are probably my favorite part of LangChain.
65 | 66 | 14 67 | 00:00:48,340 --> 00:00:50,320 68 | 我们正在看到很多新的东西出现在这里,对于该领域的每个人来说都是新的。 69 | I think they're also one of the most powerful parts, 70 | 71 | 15 72 | 00:00:50,320 --> 00:00:52,140 73 | 这应该是一个非常令人兴奋的课程,因为我们深入探讨 74 | but they're also one of the newer parts. 75 | 76 | 16 77 | 00:00:52,140 --> 00:00:56,020 78 | 代理是什么,如何创建代理, 79 | We're seeing a lot of stuff emerge here that's really new to everyone in the field. 80 | 81 | 17 82 | 00:00:56,020 --> 00:00:58,940 83 | 以及如何使用代理, 84 | This should be a very exciting lesson as we dive 85 | 86 | 18 87 | 00:00:58,940 --> 00:01:01,180 88 | 如何为它们配备不同类型的工具,如 89 | into what agents are, how to create, 90 | 91 | 19 92 | 00:01:01,180 --> 00:01:02,500 93 | 内置于LangChain中的搜索引擎, 94 | and how to use agents, 95 | 96 | 20 97 | 00:01:02,500 --> 00:01:04,860 98 | 以及如何创建自己的工具,以便让代理与 99 | how to equip them with different types of tools like 100 | 101 | 21 102 | 00:01:04,860 --> 00:01:07,180 103 | 任何数据存储,任何API, 104 | search engines that come built into LangChain, 105 | 106 | 22 107 | 00:01:07,180 --> 00:01:11,480 108 | 任何你想让它们与之交互的函数。 109 | and then also how to create your own tools so that you can let agents interact with 110 | 111 | 23 112 | 00:01:11,480 --> 00:01:14,780 113 | 这是令人兴奋的前沿技术, 114 | any data stores, any APIs, 115 | 116 | 24 117 | 00:01:14,780 --> 00:01:16,860 118 | 但已经出现了一些重要的用例。 119 | any functions that you might want them to. 120 | 121 | 25 122 | 00:01:16,860 --> 00:01:19,460 123 | 因此,让我们开始吧。 124 | This is exciting, cutting-edge stuff, 125 | 126 | 26 127 | 00:01:19,460 --> 00:01:23,060 128 | 要开始使用代理, 129 | but already with emerging important use cases. 130 | 131 | 27 132 | 00:01:23,060 --> 00:01:25,620 133 | 我们将像往常一样导入正确的环境变量。 134 | So with that, let's dive in.
135 | 136 | 28 137 | 00:01:25,620 --> 00:01:27,500 138 | 我们还需要安装一些软件包。 139 | To get started with agents, 140 | 141 | 29 142 | 00:01:27,500 --> 00:01:32,420 143 | 因此,我们将使用DuckDuckGo搜索引擎和维基百科。 144 | we're going to start as we always do by importing the correct environment variables. 145 | 146 | 30 147 | 00:01:32,420 --> 00:01:35,100 148 | 因此,我们将要安装这些。 149 | We're also going to need to install a few packages here. 150 | 151 | 31 152 | 00:01:35,100 --> 00:01:39,020 153 | 我已经在我的环境中安装了这些,所以我不会运行这行。 154 | So we're going to use the DuckDuckGo search engine and Wikipedia. 155 | 156 | 32 157 | 00:01:39,020 --> 00:01:40,780 158 | 但如果你们没有, 159 | 160 | So we're going to want to pip install those. 161 | 162 | 33 163 | 00:01:46,360 --> 00:01:48,580 164 | 你应该取消注释那一行, 165 | I've already installed those in my environment, 166 | 167 | 34 168 | 00:01:48,580 --> 00:01:51,300 169 | 运行它,然后你就可以开始了。 170 | so I'm not going to run this line. 171 | 172 | 35 173 | 00:01:51,300 --> 00:01:56,060 174 | 然后我们将从LangChain导入一些我们需要的方法和类。 175 | But if you guys have not, 176 | 177 | 36 178 | 00:01:56,060 --> 00:01:59,060 179 | 所以我们要导入一些加载工具的方法, 180 | you should uncomment that line, 181 | 182 | 37 183 | 00:01:59,060 --> 00:02:02,340 184 | 这些是我们将连接语言模型的东西。 185 | run it, and then you're good to go. 186 | 187 | 38 188 | 00:02:02,340 --> 00:02:05,020 189 | 我们将加载一个初始化代理的方法。 190 | We're then going to import some methods and classes that we need from LangChain. 191 | 192 | 39 193 | 00:02:05,020 --> 00:02:07,820 194 | 我们将加载聊天Open AI包装器, 195 | So we're going to import some methods to load tools, 196 | 197 | 40 198 | 00:02:07,820 --> 00:02:09,500 199 | 我们将加载代理类型。 200 | and these are things that we're going to connect the language model to. 201 | 202 | 41 203 | 00:02:09,500 --> 00:02:14,220 204 | 所以代理类型将用于指定我们要使用的代理类型。 205 | We're going to load a method to initialize the agent.
206 | 207 | 42 208 | 00:02:14,220 --> 00:02:16,540 209 | LangChain中有许多不同类型的代理。 210 | We're going to load the chat open AI wrapper, 211 | 212 | 43 213 | 00:02:16,540 --> 00:02:18,780 214 | 我们现在不会详细介绍所有这些。 215 | and we're going to load agent type. 216 | 217 | 44 218 | 00:02:18,780 --> 00:02:21,420 219 | 我们将选择一个并运行它。 220 | So agent type will be used to specify what type of agent we want to use. 221 | 222 | 45 223 | 00:02:21,420 --> 00:02:24,700 224 | 然后我们将初始化我们要使用的语言模型。 225 | There are a bunch of different types of agents in LangChain. 226 | 227 | 46 228 | 00:02:24,700 --> 00:02:30,500 229 | 同样,我们将使用它作为我们用来驱动代理的推理引擎。 230 | We're not going to go over all of them right now. 231 | 232 | 47 233 | 00:02:30,500 --> 00:02:33,740 234 | 然后我们将加载我们要使用的工具。 235 | We'll just choose one and run with that. 236 | 237 | 48 238 | 00:02:33,740 --> 00:02:37,020 239 | 所以我们将加载DuckDuckGo搜索和维基百科, 240 | We're then going to initialize the language model that we're going to use. 241 | 242 | 49 243 | 00:02:37,020 --> 00:02:40,140 244 | 这些都是内置的LangChain工具。 245 | Again, we're using this as the reasoning engine that we're using to drive the agent. 246 | 247 | 50 248 | 00:02:40,140 --> 00:02:42,980 249 | 最后,我们将初始化代理。 250 | We'll then load the tools that we're going to use. 251 | 252 | 51 253 | 00:02:42,980 --> 00:02:44,780 254 | 我们将传递工具, 255 | So we're going to load DuckDuckGo search and Wikipedia, 256 | 257 | 52 258 | 00:02:44,780 --> 00:02:47,700 259 | 语言模型和代理类型。 260 | and these are built-in LangChain tools. 261 | 262 | 53 263 | 00:02:47,700 --> 00:02:49,340 264 | 所以这里我们使用聊天, 265 | Finally, we're going to initialize the agent. 266 | 267 | 54 268 | 00:02:49,340 --> 00:02:51,460 269 | 零射击,反应,描述。 270 | We pass it the tools, 271 | 272 | 55 273 | 00:02:51,460 --> 00:02:54,060 274 | 我不会详细介绍这意味着什么。 275 | the language model, and agent type.
276 | 277 | 56 278 | 00:02:54,060 --> 00:02:56,220 279 | 需要注意的重要事项是聊天。 280 | So here we're using chat, 281 | 282 | 57 283 | 00:02:56,220 --> 00:03:00,540 284 | 这是针对聊天模型进行优化的,然后是反应。 285 | zero shot, react, description. 286 | 287 | 58 288 | 00:03:00,540 --> 00:03:05,620 289 | 反应是一种提示策略,可以从语言模型中引出更好的想法。 290 | I'm not going to go over in too much detail what this means. 291 | 292 | 59 293 | 00:03:05,620 --> 00:03:09,220 294 | 我们还将设置处理解析错误等于true。 295 | The important things to note are chat. 296 | 297 | 60 298 | 00:03:09,220 --> 00:03:11,620 299 | 如果您还记得第一课, 300 | This is optimized to work with chat models, and then react. 301 | 302 | 61 303 | 00:03:11,620 --> 00:03:17,140 304 | 我们谈论了输出解析器以及如何使用它们将LLM输出, 305 | React is a prompting strategy that elicits better thoughts from a language model. 306 | 307 | 62 308 | 00:03:17,140 --> 00:03:22,060 309 | 这是一个字符串,并将其解析为我们可以在下游使用的特定格式。 310 | We're also going to set handle parsing errors equals true. 311 | 312 | 63 313 | 00:03:22,060 --> 00:03:23,740 314 | 这在这里非常重要。 315 | If you remember from the first lesson, 316 | 317 | 64 318 | 00:03:23,740 --> 00:03:25,620 319 | 当我们将LLM的输出, 320 | we chatted a bunch about output parsers and how those can be used to take the LLM output, 321 | 322 | 65 323 | 00:03:25,620 --> 00:03:28,940 324 | 这是文本,并将其解析为特定的操作, 325 | which is a string, and parse it into a specific format that we can use downstream. 326 | 327 | 66 328 | 00:03:28,940 --> 00:03:32,700 329 | 以及语言模型应该采取的特定操作输入。 330 | 331 | That's extremely important here. 332 | 333 | 67 334 | 00:03:32,700 --> 00:03:34,300 335 | 现在让我们使用这个代理。 336 | When we take the output of the LLM, 337 | 338 | 68 339 | 00:03:34,300 --> 00:03:38,940 340 | 让我们问一个关于最近事件的问题,这个模型在训练时不知道。 341 | which is text, and parse it into the specific action, 342 | 343 | 69 344 | 00:03:38,940 --> 00:03:41,060 345 | 所以让我们问一下2022年世界杯的情况。 346 | and the specific action input that the language model should take. 
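The narration above describes how the LLM's raw text output must be parsed into a specific action and action input before the agent can execute it. As a rough illustration of that idea only — this is a toy sketch, not LangChain's actual output parser, which also handles final answers, JSON blobs, and retry logic — a minimal ReAct-style parser:

```python
import re

def parse_react_output(text: str):
    """Toy parser for a ReAct-style completion: pull out the tool name
    ("Action") and its argument ("Action Input") from the raw LLM text."""
    action = re.search(r"Action:\s*(.+)", text)
    action_input = re.search(r"Action Input:\s*(.*)", text)
    if action is None:
        raise ValueError(f"Could not parse agent output: {text!r}")
    name = action.group(1).strip()
    arg = action_input.group(1).strip() if action_input else ""
    return name, arg

completion = """Thought: I need current information.
Action: duckduckgo_search
Action Input: 2022 World Cup winner"""
print(parse_react_output(completion))  # ('duckduckgo_search', '2022 World Cup winner')
```

This is why `handle_parsing_errors=True` matters in the lesson: when the model's text does not match the expected format, the real chain can feed the error back to the model instead of raising, as the toy parser does.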
347 | 348 | 70 349 | 00:03:41,060 --> 00:03:43,860 350 | 这里的模型是根据2021年左右的数据进行训练的。 351 | Let's now use this agent. 352 | 353 | 71 354 | 00:03:43,860 --> 00:03:47,660 355 | 所以它不应该知道这个问题的答案。 356 | Let's ask it a question about a recent event that the model, 357 | 358 | 72 359 | 00:03:47,660 --> 00:03:49,820 360 | 因此,它应该意识到需要使用工具来查找这个最近的信息。 361 | when it was trained, didn't know. 362 | 363 | 73 364 | 00:03:49,820 --> 00:03:55,580 365 | 所以我们可以看到这里的代理意识到它需要使用DuckDuckGo搜索,然后查找2022年世界杯的获胜者。 366 | So let's ask it about the 2022 World Cup. 367 | 368 | 74 369 | 00:04:05,340 --> 00:04:10,620 370 | 因此,它得到了一些信息。 371 | The models here were trained on data up to around 2021. 372 | 373 | 75 374 | 00:04:10,620 --> 00:04:14,900 375 | 然后我们可以看到代理认为2022年世界杯还没有发生。 376 | So it shouldn't know the answer to this question. 377 | 378 | 76 379 | 00:04:14,900 --> 00:04:18,060 380 | 所以这是一个很好的例子,说明代理仍然具有探索性。 381 | And so it should realize that it needs to use a tool to look up this recent piece of information. 382 | 383 | 77 384 | 00:04:18,060 --> 00:04:23,940 385 | 我们可以看到这里有很多关于2022年世界杯的信息,但它并没有完全意识到所有的事情都已经发生了。 386 | So we can see here that the agent realizes that it needs to use DuckDuckGo search, 387 | 388 | 78 389 | 00:04:23,940 --> 00:04:28,060 390 | 因此,它需要查找更多的信息。 391 | and then looks up the 2022 World Cup winner. 392 | 393 | 79 394 | 00:04:28,060 --> 00:04:32,020 395 | 然后基于这些信息,它可以回答正确的答案,即阿根廷赢得了2022年世界杯。 396 | And so it gets back a bunch of information. 397 | 398 | 80 399 | 00:04:47,380 --> 00:04:52,500 400 | 然后让我们问一个问题,它应该意识到需要使用维基百科。 401 | We can then see that the agent thinks that the 2022 World Cup has not happened yet. 402 | 403 | 81 404 | 00:04:52,500 --> 00:04:58,740 405 | 维基百科有很多关于特定人物和特定实体的信息,这些信息可以是很久以前的,不需要是当前的信息。 406 | So this is a good example of why agents are still pretty exploratory. 407 | 408 | 82 409 | 00:04:58,740 --> 00:05:02,980 410 | 所以让我们问一下美国计算机科学家Tom M. 
Mitchell写了哪本书。 411 | You can see that there's a bunch of information here about the 2022 World Cup, 412 | 413 | 83 414 | 00:05:02,980 --> 00:05:06,540 415 | 我们可以看到它意识到应该使用维基百科来查找答案。 416 | but it doesn't quite realize all that has happened. 417 | 418 | 84 419 | 00:05:06,540 --> 00:05:08,420 420 | 所以它搜索Tom M. Mitchell维基百科。 421 | And so it needs to look up more things and get more information. 422 | 423 | 85 424 | 00:05:08,420 --> 00:05:12,700 425 | 然后再进行另一个跟进搜索,以确保它得到了正确的答案。 426 | And so then based on that information, 427 | 428 | 86 429 | 00:05:12,700 --> 00:05:16,020 430 | 所以它搜索Tom M. Mitchell机器学习,并得到更多的信息。 431 | it can respond with the correct answer that Argentina won the 2022 World Cup. 432 | 433 | 87 434 | 00:05:16,020 --> 00:05:19,460 435 | 然后基于这些信息,它最终能够回答Tom M. Mitchell写的教科书是《机器学习》。 436 | Let's then ask it a question where it should recognize that it needs to use Wikipedia. 437 | 438 | 88 439 | 00:05:29,660 --> 00:05:33,580 440 | 你可以在这里暂停视频,尝试输入不同的内容。 441 | So Wikipedia has a lot of information about specific people and specific entities 442 | 443 | 89 444 | 00:05:33,580 --> 00:05:38,380 445 | 到目前为止,我们已经使用了LangChain中预定义的工具。 446 | 447 | from a long time ago. It doesn't need to be current information. 448 | 449 | 90 450 | 00:05:38,380 --> 00:05:42,820 451 | 但代理的一个重要功能是可以将其连接到您自己的信息源、API和数据。 452 | So let's ask it about Tom M. Mitchell, an American computer scientist, 453 | 454 | 91 455 | 00:05:42,820 --> 00:05:45,100 456 | 您可以创建一个自定义工具,将其连接到您想要的任何内容。 457 | and what book did he write? 458 | 459 | 92 460 | 00:05:45,100 --> 00:05:47,700 461 | 现在我们来创建一个工具,它将告诉我们当前的日期。 462 | We can see here that it recognizes that it should use Wikipedia to look up the answer. 463 | 464 | 93 465 | 00:05:47,700 --> 00:05:50,700 466 | 首先,我们要导入这个工具装饰器。 467 | So it searches for Tom M. Mitchell Wikipedia. 468 | 469 | 94 470 | 00:05:57,500 --> 00:06:03,100 471 | 接下来,我们将编写一个名为“time”的函数,它接受任何文本字符串。 472 | And then does another follow-up search just to make sure that it's got the right answer.
473 | 474 | 95 475 | 00:06:09,900 --> 00:06:15,540 476 | 它将通过调用日期时间返回今天的日期。 477 | So it searches for Tom M. Mitchell machine learning and gets back more information. 478 | 479 | 96 480 | 00:06:15,540 --> 00:06:20,660 481 | 除了函数的名称,我们还将编写一个非常详细的文档字符串。 482 | And then based on that, it's able to finally answer with 483 | 484 | 97 485 | 00:06:20,660 --> 00:06:25,100 486 | 这是代理将用来知道何时调用此工具以及如何调用此工具的方式。 487 | Tom M. Mitchell wrote the textbook, Machine Learning. 488 | 489 | 98 490 | 00:06:28,500 --> 00:06:32,060 491 | 例如,在这里,我们说输入应始终为空字符串。 492 | You should pause the video here and try putting in different inputs. 493 | 494 | 99 495 | 00:06:37,460 --> 00:06:42,940 496 | 如果我们对输入有更严格的要求,例如,如果我们有一个应始终接受搜索查询或SQL语句的函数, 497 | So far, we've used tools that come defined in LangChain already, 498 | 499 | 100 500 | 00:06:47,340 --> 00:06:49,060 501 | 现在我们将创建另一个代理。 502 | but a big power of agents is that you can connect it to your own sources of information, 503 | 504 | 101 505 | 00:06:49,060 --> 00:06:55,660 506 | 这次,我们将时间工具添加到现有工具列表中。 507 | your own APIs, your own data. 508 | 509 | 102 510 | 00:07:03,660 --> 00:07:08,140 511 | 它识别出需要使用时间工具,并在此指定。 512 | So here we're going to go over how you can create a custom tool 513 | 514 | 103 515 | 00:07:18,740 --> 00:07:22,540 516 | 今天的日期是2023年5月21日。 517 | so that you can connect it to whatever you want. 518 | 519 | 104 520 | 00:07:26,860 --> 00:07:29,340 521 | 这就是代理的全部内容。 522 | Let's make a tool that's going to tell us what the current date is. 523 | 524 | 105 525 | 00:07:29,340 --> 00:07:34,740 526 | 这是LangChain中较新、更令人兴奋和更具实验性的部分之一。 527 | First, we're going to import this tool decorator. 528 | 529 | 106 530 | 00:07:34,740 --> 00:07:36,540 531 | 希望您喜欢使用它。 532 | 533 | This can be applied to any function, and it turns it into a tool that LangChain can use. 534 | 535 | 107 536 | 00:07:36,540 --> 00:07:40,540 537 | 希望它向您展示了如何使用语言模型作为推理引擎 538 | Next, we're going to write a function called time, which takes in any text string.
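The custom `time` tool the narration is building can be sketched without LangChain at all: a toy `@tool` decorator that registers a function and keeps its docstring as the description the agent reasons over. The `TOOLS` registry here is an illustrative stand-in for the agent's tool list, not LangChain's real decorator:

```python
import datetime

TOOLS = {}  # illustrative registry; stands in for the agent's tool list

def tool(fn):
    """Toy stand-in for LangChain's @tool decorator: register the function
    and keep its docstring as the description the agent uses to decide
    when to call the tool and what input to give it."""
    TOOLS[fn.__name__] = (fn, fn.__doc__)
    return fn

@tool
def time(text: str) -> str:
    """Returns today's date. Use this for any questions about today's date.
    The input should always be an empty string."""
    return str(datetime.date.today())

print(time(""))          # today's date in YYYY-MM-DD form (2023-05-21 in the lesson)
print(TOOLS["time"][1])  # the docstring the agent reasons over
```

The detailed docstring is the whole interface: the language model never sees the function body, only the name and description, which is why the narration stresses writing it carefully.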
539 | 540 | 108 541 | 00:07:40,540 --> 00:08:00,540 542 | 以执行不同的操作并连接到其他功能和数据源。 543 | We're not really going to use that. 544 | -------------------------------------------------------------------------------- /chinese/LangChain_L4.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:01,000 --> 00:00:15,000 5 | 使用llm构建的最常见的复杂应用程序之一是一个系统,可以在文档上方或关于文档回答问题。 6 | 7 | 2 8 | 00:00:15,000 --> 00:00:24,000 9 | 因此,给定从PDF文件、网页或某些公司的内部文档收集中提取的文本, 10 | 11 | 3 12 | 00:00:24,000 --> 00:00:33,000 13 | 您可以使用llm回答有关这些文档内容的问题,以帮助用户获得更深入的理解并获得所需的信息吗? 14 | 15 | 4 16 | 00:00:33,000 --> 00:00:39,000 17 | 这真的很强大,因为它开始将这些语言模型与它们最初没有接受培训的数据结合起来。 18 | 19 | 5 20 | 00:00:39,000 --> 00:00:42,000 21 | 因此,它使它们更加灵活和适应您的用例。 22 | 23 | 6 24 | 00:00:42,000 --> 00:00:48,000 25 | 这也非常令人兴奋,因为我们将开始超越语言模型、提示和输出解析器, 26 | 27 | 7 28 | 00:00:48,000 --> 00:00:54,000 29 | 并开始引入链式的一些关键组件,例如嵌入模型和向量存储。 30 | 31 | 8 32 | 00:00:54,000 --> 00:00:58,000 33 | 正如安德鲁所提到的,这是我们拥有的最受欢迎的链之一,所以我希望你很兴奋。 34 | 35 | 9 36 | 00:00:58,000 --> 00:01:03,000 37 | 实际上,嵌入和向量存储是一些最强大的现代技术。 38 | 39 | 10 40 | 00:01:03,000 --> 00:01:08,000 41 | 因此,如果您还没有看到它们,那么了解它们非常值得。 42 | 43 | 11 44 | 00:01:08,000 --> 00:01:10,000 45 | 那么,让我们开始吧。 46 | 47 | 12 48 | 00:01:10,000 --> 00:01:11,000 49 | 开始吧。 50 | 51 | 13 52 | 00:01:11,000 --> 00:01:16,000 53 | 因此,我们将从像往常一样导入环境变量开始。 54 | 55 | 14 56 | 00:01:16,000 --> 00:01:20,000 57 | 现在,我们将导入一些在构建此链时将有所帮助的东西。 58 | 59 | 15 60 | 00:01:20,000 --> 00:01:22,000 61 | 我们将导入检索QA链。 62 | 63 | 16 64 | 00:01:22,000 --> 00:01:24,000 65 | 这将在一些文档上进行检索。 66 | 67 | 17 68 | 00:01:24,000 --> 00:01:28,000 69 | 我们将导入我们最喜欢的聊天Open AI语言模型。 70 | 71 | 18 72 | 00:01:28,000 --> 00:01:29,000 73 | 我们将导入文档加载器。 74 | 75 | 19 76 | 00:01:29,000 --> 00:01:34,000 77 | 这将用于加载一些专有数据,我们将与语言模型结合使用。 78 | 79 | 20 80 | 00:01:34,000 --> 00:01:36,000 81 | 在这种情况下,它将在CSV中。 82 | 83 | 21 84 | 00:01:36,000 --> 00:01:39,000 85 | 因此,我们将导入CSV加载器。 86 | 87 | 22 88 | 00:01:39,000 --> 00:01:41,000 89 | 
最后,我们将导入向量存储。 90 | 91 | 23 92 | 00:01:41,000 --> 00:01:45,000 93 | 有许多不同类型的向量存储,我们将在稍后介绍它们的确切含义。 94 | 95 | 24 96 | 00:01:45,000 --> 00:01:49,000 97 | 但是,我们将从Dock Array内存搜索向量存储开始。 98 | 99 | 25 100 | 00:01:49,000 --> 00:01:51,000 101 | 这非常好,因为它是一个内存向量存储, 102 | 103 | 26 104 | 00:01:51,000 --> 00:01:55,000 105 | 并且不需要连接到任何外部数据库, 106 | 107 | 27 108 | 00:01:55,000 --> 00:01:57,000 109 | 所以它使得入门变得非常容易。 110 | 111 | 28 112 | 00:01:57,000 --> 00:01:59,000 113 | 我们还将导入显示和markdown两个常见的在Jupyter Notebooks中显示信息的工具。 114 | 115 | 29 116 | 00:01:59,000 --> 00:02:04,000 117 | 我们提供了一个户外服装的CSV文件,我们将使用它与语言模型结合使用。 118 | 119 | 30 120 | 00:02:04,000 --> 00:02:10,000 121 | 在这里,我们将使用该文件的路径初始化一个加载器,即CSV加载器。 122 | 123 | 31 124 | 00:02:10,000 --> 00:02:18,000 125 | 接下来,我们将导入一个索引,即向量存储索引创建器。 126 | 127 | 32 128 | 00:02:18,000 --> 00:02:22,000 129 | 这将帮助我们非常容易地创建一个向量存储。 130 | 131 | 33 132 | 00:02:22,000 --> 00:02:26,000 133 | 如下所示,只需要几行代码就可以创建它。 134 | 135 | 34 136 | 00:02:26,000 --> 00:02:34,000 137 | 为了创建它,我们将指定两件事。 138 | 139 | 35 140 | 00:02:34,000 --> 00:02:37,000 141 | 首先,我们将指定向量存储类。 142 | 143 | 36 144 | 00:02:37,000 --> 00:02:40,000 145 | 如前所述,我们将使用这个向量存储, 146 | 147 | 37 148 | 00:02:40,000 --> 00:02:46,000 149 | 因为它是一个特别容易入门的向量存储。 150 | 151 | 38 152 | 00:02:46,000 --> 00:02:49,000 153 | 创建完成后,我们将从加载器中调用, 154 | 155 | 39 156 | 00:02:49,000 --> 00:02:51,000 157 | 它接受一个文档加载器列表。 158 | 159 | 40 160 | 00:02:51,000 --> 00:02:58,000 161 | 我们只有一个我们真正关心的加载器,所以这就是我们在这里传递的。 162 | 163 | 41 164 | 00:02:58,000 --> 00:03:02,000 165 | 现在它已经被创建了,我们可以开始询问它的问题了。 166 | 167 | 42 168 | 00:03:02,000 --> 00:03:07,000 169 | 下面我们将介绍发生了什么,所以现在不要担心这个。 170 | 171 | 43 172 | 00:03:07,000 --> 00:03:09,000 173 | 在这里我们将从一个查询开始。 174 | 175 | 44 176 | 00:03:09,000 --> 00:03:17,000 177 | 然后我们将使用索引查询创建一个响应,并传入这个查询。 178 | 179 | 45 180 | 00:03:17,000 --> 00:03:21,000 181 | 同样,我们将在下面介绍发生了什么。 182 | 183 | 46 184 | 00:03:21,000 --> 00:03:30,000 185 | 现在,我们只需要等待它的响应。 186 | 187 | 47 188 | 00:03:30,000 --> 00:03:34,000 189 | 
完成后,我们现在可以看看到底返回了什么。 190 | 191 | 48 192 | 00:03:34,000 --> 00:03:41,000 193 | 我们得到了一个Markdown表格,其中包含所有带有防晒衣的衬衫的名称和描述。 194 | 195 | 49 196 | 00:03:41,000 --> 00:03:45,000 197 | 我们还得到了一个语言模型提供的不错的小总结。 198 | 199 | 50 200 | 00:03:45,000 --> 00:03:48,000 201 | 所以我们已经介绍了如何在您的文档中进行问答, 202 | 203 | 51 204 | 00:03:48,000 --> 00:03:52,000 205 | 但是到底在幕后发生了什么呢? 206 | 207 | 52 208 | 00:03:52,000 --> 00:03:54,000 209 | 首先,让我们考虑一般的想法。 210 | 211 | 53 212 | 00:03:54,000 --> 00:03:58,000 213 | 我们想要使用语言模型并将其与我们的许多文档结合使用, 214 | 215 | 54 216 | 00:03:58,000 --> 00:04:03,000 217 | 但是有一个关键问题。语言模型一次只能检查几千个单词。 218 | 219 | 56 220 | 00:04:03,000 --> 00:04:10,000 221 | 如果我们有非常大的文档,如何让语言模型回答关于其中所有内容的问题呢? 222 | 223 | 57 224 | 00:04:10,000 --> 00:04:14,000 225 | 这就是嵌入和向量存储发挥作用的地方。 226 | 227 | 58 228 | 00:04:14,000 --> 00:04:17,000 229 | 首先,让我们谈谈嵌入。 230 | 231 | 59 232 | 00:04:17,000 --> 00:04:21,000 233 | 嵌入为文本片段创建数字表示。 234 | 235 | 60 236 | 00:04:21,000 --> 00:04:27,000 237 | 这种数字表示捕捉了它所运行的文本片段的语义含义。 238 | 239 | 61 240 | 00:04:27,000 --> 00:04:31,000 241 | 相似内容的文本片段将具有相似的向量。 242 | 243 | 62 244 | 00:04:31,000 --> 00:04:35,000 245 | 这使我们可以在向量空间中比较文本片段。 246 | 247 | 63 248 | 00:04:35,000 --> 00:04:38,000 249 | 在下面的示例中,我们可以看到我们有三个句子。 250 | 251 | 64 252 | 00:04:38,000 --> 00:04:43,000 253 | 前两个是关于宠物的,而第三个是关于汽车的。 254 | 255 | 65 256 | 00:04:43,000 --> 00:04:46,000 257 | 如果我们看一下数字空间中的表示, 258 | 259 | 66 260 | 00:04:46,000 --> 00:04:54,000 261 | 我们可以看到当我们比较与宠物句子相对应的文本片段上的两个向量时,它们非常相似。 262 | 263 | 67 264 | 00:04:54,000 --> 00:04:58,000 265 | 而如果我们将其与谈论汽车的那个进行比较,它们根本不相似。 266 | 267 | 68 268 | 00:04:58,000 --> 00:05:02,000 269 | 这将让我们轻松地找出哪些文本片段彼此相似, 270 | 271 | 69 272 | 00:05:02,000 --> 00:05:10,000 273 | 这在我们考虑要包含哪些文本片段传递给语言模型以回答问题时非常有用。 274 | 275 | 70 276 | 00:05:10,000 --> 00:05:13,000 277 | 我们要介绍的下一个组件是向量数据库。 278 | 279 | 71 280 | 00:05:13,000 --> 00:05:18,000 281 | 向量数据库是存储我们在上一步中创建的这些向量表示的一种方式。 282 | 283 | 72 284 | 00:05:18,000 --> 00:05:24,000 285 | 我们创建这个向量数据库的方式是用来自传入文档的文本块填充它。 286 | 287 | 73 288 | 
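The similarity comparison described above — two pet sentences close together in vector space, a car sentence far away — can be made concrete with cosine similarity. The 3-dimensional vectors below are made up for illustration (real embeddings have a thousand or more dimensions, produced by a model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1 for similar directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Made-up 3-d vectors standing in for real embeddings.
pet_1 = [0.9, 0.1, 0.1]  # a sentence about a dog
pet_2 = [0.8, 0.2, 0.1]  # a sentence about a cat
car = [0.1, 0.2, 0.9]    # a sentence about a car

print(cosine_similarity(pet_1, pet_2))  # high: the two pet sentences are close
print(cosine_similarity(pet_1, car))    # low: pets vs. cars
```

This is the comparison a vector store performs at query time: embed the query, then rank the stored chunks by similarity to it.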
00:05:24,000 --> 00:05:28,000 289 | 当我们获得一个大的传入文档时,我们首先将其分成较小的块。 290 | 291 | 74 292 | 00:05:28,000 --> 00:05:33,000 293 | 这有助于创建比原始文档小的文本片段, 294 | 295 | 75 296 | 00:05:33,000 --> 00:05:37,000 297 | 这很有用,因为我们可能无法将整个文档传递给语言模型。 298 | 299 | 76 300 | 00:05:37,000 --> 00:05:43,000 301 | 因此,我们想创建这些小块,以便只传递最相关的块给语言模型。 302 | 303 | 77 304 | 00:05:43,000 --> 00:05:48,000 305 | 然后,我们为每个这些块创建一个嵌入,然后将它们存储在向量数据库中。 306 | 307 | 78 308 | 00:05:48,000 --> 00:05:51,000 309 | 这就是我们创建索引时发生的事情。 310 | 311 | 79 312 | 00:05:51,000 --> 00:05:58,000 313 | 现在我们有了这个索引,我们可以在运行时使用它来查找与传入查询最相关的文本片段。 314 | 315 | 80 316 | 00:05:58,000 --> 00:06:02,000 317 | 当查询进来时,我们首先为该查询创建一个嵌入。 318 | 319 | 81 320 | 00:06:02,000 --> 00:06:07,000 321 | 然后我们将其与向量数据库中的所有向量进行比较,并选择最相似的n个。 322 | 323 | 82 324 | 00:06:07,000 --> 00:06:14,000 325 | 然后将它们返回,我们可以将它们传递到语言模型中,以获得最终答案。 326 | 327 | 83 328 | 00:06:14,000 --> 00:06:17,000 329 | 因此,我们创建了这个链,只需要几行代码。 330 | 331 | 84 332 | 00:06:17,000 --> 00:06:19,000 333 | 这对于快速入门非常有用。 334 | 335 | 85 336 | 00:06:19,000 --> 00:06:25,000 337 | 好的,现在让我们逐步进行,并了解底层到底发生了什么。 338 | 339 | 86 340 | 00:06:25,000 --> 00:06:27,000 341 | 第一步与上面类似。 342 | 343 | 87 344 | 00:06:27,000 --> 00:06:36,000 345 | 我们将创建一个文档加载器,从包含我们要进行问题回答的所有产品描述的CSV中加载。 346 | 347 | 88 348 | 00:06:36,000 --> 00:06:41,000 349 | 然后我们可以从这个文档加载器中加载文档。 350 | 351 | 89 352 | 00:06:41,000 --> 00:06:50,000 353 | 如果我们查看单个文档,我们可以看到每个文档对应于CSV中的一个产品。 354 | 355 | 90 356 | 00:06:50,000 --> 00:06:53,000 357 | 之前,我们谈到了创建块。 358 | 359 | 91 360 | 00:06:53,000 --> 00:07:01,000 361 | 因为这些文档已经非常小了,所以我们实际上不需要在这里进行任何分块,因此我们可以直接创建嵌入。 362 | 363 | 92 364 | 00:07:01,000 --> 00:07:05,000 365 | 要创建嵌入,我们将使用OpenAI的嵌入类。 366 | 367 | 93 368 | 00:07:05,000 --> 00:07:08,000 369 | 我们可以在这里导入它并初始化它。 370 | 371 | 94 372 | 00:07:08,000 --> 00:07:21,000 373 | 如果我们想看看这些嵌入是如何工作的,我们实际上可以看一下嵌入特定文本时会发生什么。 374 | 375 | 95 376 | 00:07:21,000 --> 00:07:26,000 377 | 让我们使用嵌入对象上的嵌入查询方法为特定文本创建嵌入。 378 | 379 | 96 380 | 00:07:26,000 --> 00:07:31,000 381 | 在这种情况下,句子是“嗨,我的名字是哈里森”。 382 
| 383 | 97 384 | 00:07:31,000 --> 00:07:41,000 385 | 如果我们查看这个嵌入,我们可以看到有超过一千个不同的元素。 386 | 387 | 98 388 | 00:07:41,000 --> 00:07:44,000 389 | 每个元素都是不同的数字值。 390 | 391 | 99 392 | 00:07:44,000 --> 00:07:51,000 393 | 组合起来,这就创建了这段文本的总体数值表示。 394 | 395 | 100 396 | 00:07:51,000 --> 00:07:58,000 397 | 我们想为刚刚加载的所有文本创建嵌入,然后我们还想将它们存储在向量存储中。 398 | 399 | 101 400 | 00:07:58,000 --> 00:08:03,000 401 | 我们可以使用向量存储上的from documents方法来实现这一点。 402 | 403 | 102 404 | 00:08:03,000 --> 00:08:12,000 405 | 该方法接受文档列表、嵌入对象,然后我们将创建一个总体向量存储。 406 | 407 | 103 408 | 00:08:12,000 --> 00:08:18,000 409 | 现在,我们可以使用这个向量存储来查找与传入查询类似的文本。 410 | 411 | 104 412 | 00:08:18,000 --> 00:08:23,000 413 | 因此,让我们看一下查询,请建议一件带有防晒功能的衬衫。 414 | 415 | 105 416 | 00:08:23,000 --> 00:08:36,000 417 | 如果我们在向量存储中使用相似性搜索方法并传入一个查询,我们将得到一个文档列表。 418 | 419 | 106 420 | 00:08:36,000 --> 00:08:48,000 421 | 我们可以看到它返回了四个文档,如果我们看第一个文档,我们可以看到它确实是一件关于防晒的衬衫。 422 | 423 | 107 424 | 00:08:48,000 --> 00:08:52,000 425 | 那么我们如何使用它来回答我们自己的文档问题呢? 426 | 427 | 108 428 | 00:08:52,000 --> 00:08:57,000 429 | 首先,我们需要从这个向量存储中创建一个检索器。 430 | 431 | 109 432 | 00:08:57,000 --> 00:09:03,000 433 | 检索器是一个通用接口,可以由任何接受查询并返回文档的方法支持。 434 | 435 | 110 436 | 00:09:03,000 --> 00:09:11,000 437 | 向量存储和嵌入是一种这样的方法,尽管有许多不同的方法,有些不太先进,有些更先进。 438 | 439 | 111 440 | 00:09:11,000 --> 00:09:20,000 441 | 接下来,因为我们想要进行文本生成并返回自然语言响应,我们将导入一个语言模型,我们将使用聊天开放AI。 442 | 443 | 112 444 | 00:09:20,000 --> 00:09:28,000 445 | 如果我们手动进行此操作,我们将合并文档中的所有页面内容到一个变量中。 446 | 447 | 113 448 | 00:09:28,000 --> 00:09:37,000 449 | 因此,我们会做一些像这样的事情,将所有页面内容连接到一个变量中。 450 | 451 | 114 452 | 00:09:37,000 --> 00:09:48,000 453 | 然后,我们将传递此变量或问题的变体,例如请列出所有具有防晒功能的衬衫并在Markdown表格中总结每个衬衫的语言模型。 454 | 455 | 115 456 | 00:09:48,000 --> 00:09:55,000 457 | 如果我们在此处打印响应,我们可以看到我们得到了一个表格,正如我们所要求的那样。 458 | 459 | 116 460 | 00:09:55,000 --> 00:09:59,000 461 | 所有这些步骤都可以用LangChain链封装起来。 462 | 463 | 117 464 | 00:09:59,000 --> 00:10:02,000 465 | 因此,我们可以创建一个检索QA链。 466 | 467 | 118 468 | 00:10:02,000 --> 00:10:06,000 469 | 这将进行检索,然后对检索到的文档进行问题回答。 470 | 
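The manual steps the narration just walked through — embed the query, similarity-search the vector store, then "stuff" the retrieved page contents into one prompt — can be sketched end to end with stub components. The bag-of-words `embed` below is a toy stand-in for OpenAIEmbeddings, and `answer` stops at building the prompt instead of calling a real chat model; only the data flow matches the lesson:

```python
# Toy "retrieve then stuff" pipeline with illustrative names throughout.
VOCAB = ["sun", "shirt", "rain", "jacket"]

def embed(text):
    # Stub embedding: word counts over a tiny vocabulary.
    words = text.lower().split()
    return [words.count(w) for w in VOCAB]

def similarity(a, b):
    return sum(x * y for x, y in zip(a, b))

docs = [
    "sun shield shirt with upf 50 sun protection",
    "rain tech jacket waterproof shell",
    "tropical tee lightweight shirt with sun protection",
]
index = [(embed(d), d) for d in docs]  # the "vector store": one vector per chunk

def retrieve(query, k=2):
    q = embed(query)
    return [d for _, d in sorted(index, key=lambda p: -similarity(p[0], q))[:k]]

def answer(query):
    context = "\n".join(retrieve(query))        # "stuff" the top chunks...
    return f"{context}\n\nQuestion: {query}"    # ...into one prompt for the LLM

print(answer("shirt with sun protection"))
```

A retriever in the lesson's sense is exactly the `retrieve` interface here: anything that takes a query and returns documents, however it finds them.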
471 | 119 472 | 00:10:06,000 --> 00:10:09,000 473 | 要创建这样的链,我们将传入几个不同的东西。 474 | 475 | 120 476 | 00:10:09,000 --> 00:10:12,000 477 | 首先,我们将传入语言模型。 478 | 479 | 121 480 | 00:10:12,000 --> 00:10:15,000 481 | 这将用于在最后进行文本生成。 482 | 483 | 122 484 | 00:10:15,000 --> 00:10:17,000 485 | 接下来,我们将传入链类型。 486 | 487 | 123 488 | 00:10:17,000 --> 00:10:18,000 489 | 我们将使用stuff。 490 | 491 | 124 492 | 00:10:18,000 --> 00:10:25,000 493 | 这是最简单的方法,因为它只是将所有文档塞入上下文并对语言模型进行一次调用。 494 | 495 | 125 496 | 00:10:25,000 --> 00:10:32,000 497 | 还有一些其他方法可以用来进行问题回答,我可能会在最后提及,但我们不会详细讨论。 498 | 499 | 126 500 | 00:10:32,000 --> 00:10:34,000 501 | 第三,我们将传入一个检索器。 502 | 503 | 127 504 | 00:10:34,000 --> 00:10:38,000 505 | 我们上面创建的检索器只是一个获取文档的接口。 506 | 507 | 128 508 | 00:10:38,000 --> 00:10:41,000 509 | 这将用于获取文档并将其传递给语言模型。 510 | 511 | 129 512 | 00:10:41,000 --> 00:10:46,000 513 | 最后,我们将设置 verbose 等于 true。 514 | 515 | 130 516 | 00:10:46,000 --> 00:11:08,000 517 | 现在我们可以创建一个查询并在此查询上运行链。 518 | 519 | 131 520 | 00:11:08,000 --> 00:11:14,000 521 | 当我们获得响应时,我们可以再次使用 display 和 markdown 实用程序显示它。 522 | 523 | 132 524 | 00:11:14,000 --> 00:11:20,000 525 | 您可以在此暂停视频并尝试使用一堆不同的查询。 526 | 527 | 133 528 | 00:11:20,000 --> 00:11:26,000 529 | 所以这就是您详细了解它的方式,但请记住,我们仍然可以轻松地使用我们上面的一行来完成它。 530 | 531 | 134 532 | 00:11:26,000 --> 00:11:30,000 533 | 因此,这两个东西等同于相同的结果。 534 | 535 | 135 536 | 00:11:30,000 --> 00:11:32,000 537 | 这就是 LangChain 的有趣之处。 538 | 539 | 136 540 | 00:11:32,000 --> 00:11:38,000 541 | 您可以在一行中完成它,也可以查看各个内容并将其分解为更详细的五个内容。 542 | 543 | 137 544 | 00:11:38,000 --> 00:11:44,000 545 | 五个更详细的内容让您设置更多关于正在发生的确切内容的细节,但一行代码很容易入手。 546 | 547 | 138 548 | 00:11:44,000 --> 00:11:48,000 549 | 所以由您决定如何继续前进。 550 | 551 | 139 552 | 00:11:48,000 --> 00:11:51,000 553 | 我们还可以在创建索引时自定义索引。 554 | 555 | 140 556 | 00:11:51,000 --> 00:11:55,000 557 | 因此,如果您记得,当我们手动创建它时,我们指定了一个嵌入。 558 | 559 | 141 560 | 00:11:55,000 --> 00:11:57,000 561 | 我们也可以在这里指定一个嵌入。 562 | 563 | 142 564 | 00:11:57,000 --> 00:12:01,000 565 | 这将使我们能够灵活地创建嵌入本身。 566 | 567 | 143 568 |
00:12:01,000 --> 00:12:06,000 569 | 我们还可以在此处替换向量存储器以获取不同类型的向量存储器。 570 | 571 | 144 572 | 00:12:06,000 --> 00:12:15,000 573 | 因此,在创建索引时,您可以进行与手动创建时相同级别的自定义。 574 | 575 | 145 576 | 00:12:15,000 --> 00:12:17,000 577 | 在这个笔记本中,我们使用了 stuff 方法。 578 | 579 | 146 580 | 00:12:17,000 --> 00:12:19,000 581 | stuff 方法非常好,因为它非常简单。 582 | 583 | 147 584 | 00:12:19,000 --> 00:12:25,000 585 | 您只需将所有内容放入一个提示符中,然后将其发送到语言模型并获取一个响应。 586 | 587 | 148 588 | 00:12:25,000 --> 00:12:27,000 589 | 因此,很容易理解正在发生什么。 590 | 591 | 149 592 | 00:12:27,000 --> 00:12:30,000 593 | 它非常便宜,而且效果很好。 594 | 595 | 150 596 | 00:12:30,000 --> 00:12:32,000 597 | 但是,这并不总是可以正常工作。 598 | 599 | 151 600 | 00:12:32,000 --> 00:12:37,000 601 | 因此,如果您记得,在笔记本中获取文档时,我们只返回了四个文档。 602 | 603 | 152 604 | 00:12:37,000 --> 00:12:39,000 605 | 它们相对较小。 606 | 607 | 153 608 | 00:12:39,000 --> 00:12:44,000 609 | 但是,如果您想在许多不同类型的块上执行相同类型的问答,该怎么办? 610 | 611 | 154 612 | 00:12:44,000 --> 00:12:47,000 613 | 那么我们可以使用几种不同的方法。 614 | 615 | 155 616 | 00:12:47,000 --> 00:12:48,000 617 | 第一个是Map Reduce。 618 | 619 | 156 620 | 00:12:48,000 --> 00:12:55,000 621 | 这基本上是将所有块与问题一起传递给语言模型,获取回复, 622 | 623 | 157 624 | 00:12:55,000 --> 00:13:02,000 625 | 然后使用另一个语言模型调用将所有单独的回复总结成最终答案。 626 | 627 | 158 628 | 00:13:02,000 --> 00:13:06,000 629 | 这非常强大,因为它可以在任意数量的文档上运行。 630 | 631 | 159 632 | 00:13:06,000 --> 00:13:11,000 633 | 而且它也非常强大,因为您可以并行处理单个问题。 634 | 635 | 160 636 | 00:13:11,000 --> 00:13:13,000 637 | 但是它需要更多的调用。 638 | 639 | 161 640 | 00:13:13,000 --> 00:13:19,000 641 | 它将所有文档视为独立的,这可能并不总是最理想的事情。 642 | 643 | 162 644 | 00:13:19,000 --> 00:13:24,000 645 | Refine是另一种方法,再次用于循环许多文档。 646 | 647 | 163 648 | 00:13:24,000 --> 00:13:25,000 649 | 但它实际上是迭代的。 650 | 651 | 164 652 | 00:13:25,000 --> 00:13:28,000 653 | 它建立在先前文档的答案之上。 654 | 655 | 165 656 | 00:13:28,000 --> 00:13:33,000 657 | 因此,这非常适合组合信息并随时间逐步构建答案。 658 | 659 | 166 660 | 00:13:33,000 --> 00:13:36,000 661 | 它通常会导致更长的答案。 662 | 663 | 167 664 | 00:13:36,000 --> 00:13:39,000 665 | 而且它也不太快,因为现在调用不是独立的。 666 | 667 | 168 668 | 
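The narration contrasts Map Reduce (independent, parallelizable calls per chunk plus one combining call) with Refine (each call builds on the previous answer, so calls are sequential). A toy sketch with a stub LLM that just counts calls makes the call pattern concrete; every name here is illustrative, not LangChain's API:

```python
calls = {"n": 0}

def llm(prompt: str) -> str:
    """Stub LLM that just counts how many times it is called."""
    calls["n"] += 1
    return f"answer#{calls['n']}"

chunks = ["chunk one", "chunk two", "chunk three"]

def map_reduce(chunks, question):
    # Map: one independent call per chunk (these could run in parallel)...
    partials = [llm(f"{c}\n\nQ: {question}") for c in chunks]
    # ...Reduce: one final call combines the per-chunk answers.
    return llm("Combine:\n" + "\n".join(partials))

def refine(chunks, question):
    # Each call depends on the previous answer, so calls cannot be parallelized.
    answer = llm(f"{chunks[0]}\n\nQ: {question}")
    for c in chunks[1:]:
        answer = llm(f"Existing answer: {answer}\nNew context: {c}\nRefine it.")
    return answer

print(map_reduce(chunks, "q"))  # answer#4: 3 map calls + 1 reduce call
calls["n"] = 0
print(refine(chunks, "q"))      # answer#3: one call per chunk, strictly sequential
```

Both strategies make roughly one call per chunk; the difference the lesson emphasizes is whether those calls are independent (batchable) or chained.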
00:13:39,000 --> 00:13:43,000 669 | 它们依赖于先前调用的结果。 670 | 671 | 169 672 | 00:13:43,000 --> 00:13:49,000 673 | 这意味着它通常需要更长的时间,并且基本上需要与Map Reduce一样多的调用。 674 | 675 | 170 676 | 00:13:49,000 --> 00:13:57,000 677 | Map Re-rank是一种相当有趣且更为实验性的方法,其中您对每个文档进行单个语言模型调用。 678 | 679 | 171 680 | 00:13:57,000 --> 00:14:00,000 681 | 然后您还要求它返回一个分数。 682 | 683 | 172 684 | 00:14:00,000 --> 00:14:02,000 685 | 然后您选择最高分。 686 | 687 | 173 688 | 00:14:02,000 --> 00:14:06,000 689 | 这依赖于语言模型知道分数应该是什么。 690 | 691 | 174 692 | 00:14:06,000 --> 00:14:12,000 693 | 因此,您经常需要告诉它,嘿,如果它与文档相关,则应该是高分,并在那里精细调整说明。 694 | 695 | 175 696 | 00:14:12,000 --> 00:14:15,000 697 | 与Map Reduce类似,所有调用都是独立的。 698 | 699 | 176 700 | 00:14:15,000 --> 00:14:16,000 701 | 所以您可以批量处理它们。 702 | 703 | 177 704 | 00:14:16,000 --> 00:14:18,000 705 | 而且它相对较快。 706 | 707 | 178 708 | 00:14:18,000 --> 00:14:20,000 709 | 但是,您正在进行大量的语言模型调用。 710 | 711 | 179 712 | 00:14:20,000 --> 00:14:22,000 713 | 因此,它会更加昂贵。 714 | 715 | 180 716 | 00:14:22,000 --> 00:14:29,000 717 | 这些方法中最常见的是Stuff方法,我们在笔记本中使用它将所有内容组合成一个文档。 718 | 719 | 181 720 | 00:14:29,000 --> 00:14:35,000 721 | 第二种最常见的方法是Map Reduce方法,它将这些块发送到语言模型。 722 | 723 | 182 724 | 00:14:35,000 --> 00:14:42,000 725 | 这里的这些方法,如stuff、map reduce、refine和re-rank,也可以用于除了问答之外的许多其他链。 726 | 727 | 183 728 | 00:14:42,000 --> 00:14:53,000 729 | 例如,map reduce链的一个非常常见的用例是摘要,其中您有一个非常长的文档,您想要递归地摘要其中的信息片段。 730 | 731 | 184 732 | 00:14:53,000 --> 00:14:56,000 733 | 这就是关于文档问答的全部内容。 734 | 735 | 185 736 | 00:14:56,000 --> 00:15:00,000 737 | 正如您可能已经注意到的那样,我们这里有许多不同的链条。 738 | 739 | 186 740 | 00:15:00,000 --> 00:15:12,000 741 | 因此,在下一节中,我们将介绍更好地了解所有这些链条内部究竟发生了什么的方法。 -------------------------------------------------------------------------------- /chinese/LangChain_L3.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:00,000 --> 00:00:09,560 5 | 在这节课中,哈里森将教授最重要的关键构建块,即链。 6 | 7 | 2 8 | 00:00:09,560 --> 00:00:11,960 9 | 链通常将一个llm大型语言模型与提示结合在一起。 10 | 11 | 3 12 | 00:00:12,680 --> 
00:00:17,440 13 | 使用这个构建块,您还可以将一堆这些构建块组合在一起,对您的文本或其他数据进行一系列操作。 14 | 15 | 4 16 | 00:00:17,720 --> 00:00:21,720 17 | 我很兴奋地深入研究它。 18 | 19 | 5 20 | 00:00:21,720 --> 00:00:26,040 21 | 好的,首先,我们要加载环境变量,就像以前一样。 22 | 23 | 6 24 | 00:00:26,040 --> 00:00:26,540 25 | 然后我们还要加载一些我们要使用的数据。 26 | 27 | 7 28 | 00:00:27,040 --> 00:00:28,400 29 | 这些链的一部分的强大之处在于您可以一次运行它们在许多输入上。 30 | 31 | 8 32 | 00:00:28,400 --> 00:00:33,080 33 | 因此,我们将加载一个pandas数据框架。 34 | 35 | 9 36 | 00:00:33,080 --> 00:00:33,580 37 | pandas数据框架只是包含许多不同数据元素的数据结构。 38 | 39 | 10 40 | 00:00:34,440 --> 00:00:37,240 41 | 如果您不熟悉pandas,请不要担心。 42 | 43 | 11 44 | 00:00:38,040 --> 00:00:43,480 45 | 这里的主要观点是我们正在加载一些数据,稍后可以使用。 46 | 47 | 12 48 | 00:00:43,800 --> 00:00:44,320 49 | 因此,如果我们查看这个pandas数据框架,我们可以看到有一个 50 | 51 | 13 52 | 00:00:44,760 --> 00:00:46,600 53 | 产品列,然后是一个评论列。 54 | 55 | 14 56 | 00:00:47,240 --> 00:00:50,840 57 | 每一行都是一个不同的数据点,我们可以开始通过我们的链传递。 58 | 59 | 15 60 | 00:00:50,840 --> 00:00:52,440 61 | 因此,我们要介绍的第一个链是llm链。 62 | 63 | 16 64 | 00:00:52,520 --> 00:00:54,480 65 | 这是一个简单但非常强大的链,是未来我们将要介绍的许多链的基础。 66 | 67 | 17 68 | 00:00:54,480 --> 00:00:58,560 69 | 因此,我们将导入三个不同的东西。 70 | 71 | 18 72 | 00:00:58,600 --> 00:01:02,000 73 | 我们将导入OpenAI模型。 74 | 75 | 19 76 | 00:01:02,000 --> 00:01:04,200 77 | 所以llm,我们将导入聊天提示模板。 78 | 79 | 20 80 | 00:01:04,280 --> 00:01:07,640 81 | 这是提示,然后我们将导入llm链。 82 | 83 | 21 84 | 00:01:07,640 --> 00:01:08,480 85 | 首先,我们要做的是初始化我们要使用的语言模型。 86 | 87 | 22 88 | 00:01:09,680 --> 00:01:12,120 89 | 因此,我们将使用高温度初始化聊天OpenAI。 90 | 91 | 23 92 | 00:01:12,200 --> 00:01:16,400 93 | 现在,我们将初始化提示,这个提示将接受一个 94 | 95 | 34 96 | 00:01:47,640 --> 00:01:48,920 97 | 名为product的变量。 98 | 99 | 35 100 | 00:01:49,240 --> 00:01:53,000 101 | 它将要求LLM生成描述制造该产品的最佳名称。 102 | 103 | 36 104 | 00:01:53,000 --> 00:01:54,680 105 | 公司的名称。 106 | 107 | 37 108 | 00:01:55,520 --> 00:01:59,120 109 | 最后,我们将把这两个东西组合成一个链。 110 | 111 | 38 112 | 00:01:59,760 --> 00:02:01,880 113 | 这就是我们所谓的LLM链。 114 | 115 | 39 116 | 00:02:02,000 --> 00:02:02,840 117 | 它非常简单。 118 | 119 
| 40 120 | 00:02:02,840 --> 00:02:06,120 121 | 它只是LLM和提示的组合。 122 | 123 | 41 124 | 00:02:06,720 --> 00:02:10,640 125 | 但现在,这个链将让我们按顺序运行提示和LLM。 126 | 127 | 42 128 | 00:02:10,640 --> 00:02:11,400 129 | 因此,如果我们有一个名为“queen size sheet set”的产品,我们可以通过使用chain.run将其运行通过这个链。 130 | 131 | 43 132 | 00:02:11,680 --> 00:02:16,120 133 | 在幕后,它将格式化提示,然后将整个提示传递到LLM中。 134 | 135 | 44 136 | 00:02:16,120 --> 00:02:17,840 137 | 因此,我们可以看到我们得到了这个假想公司的名称,叫做Royal Beddings。 138 | 139 | 45 140 | 00:02:18,240 --> 00:02:21,440 141 | 因此,现在是暂停的好时机,您可以输入任何产品描述,然后查看链将输出什么结果。 142 | 143 | 46 144 | 00:02:21,440 --> 00:02:24,080 145 | LLM链是最基本的链类型,将在未来经常使用。 146 | 147 | 47 148 | 00:02:24,400 --> 00:02:28,440 149 | 因此,我们可以看到这将如何在下一个类型的链中使用,即顺序链。 150 | 151 | 48 152 | 00:02:28,440 --> 00:02:29,320 153 | 因此,顺序链依次运行一系列链。 154 | 155 | 49 156 | 00:02:30,360 --> 00:02:33,440 157 | 因此,首先,您将导入简单的顺序链。 158 | 159 | 50 160 | 00:02:33,440 --> 00:02:36,720 161 | 当我们有只期望一个输入并返回一个输出的子链时,这很有效。 162 | 163 | 51 164 | 00:02:36,720 --> 00:02:37,160 165 | 结果。 166 | 167 | 52 168 | 00:02:38,120 --> 00:02:42,880 169 | 因此,我们可以看到这将如何在下一个类型的链中使用,即顺序链。 170 | 171 | 53 172 | 00:02:42,880 --> 00:02:43,680 173 | 将经常使用LLM链。 174 | 175 | 54 176 | 00:02:43,880 --> 00:02:47,320 177 | 因此,我们可以看到这将如何在下一个类型的链中使用,即顺序链。 178 | 179 | 55 180 | 00:02:47,320 --> 00:02:48,440 181 | 因此,顺序链依次运行一系列链。 182 | 183 | 56 184 | 00:02:48,440 --> 00:02:52,520 185 | 因此,顺序链依次运行一系列链。 186 | 187 | 57 188 | 00:02:52,960 --> 00:02:56,800 189 | 因此,首先,您将导入简单的顺序链。 190 | 191 | 58 192 | 00:02:57,240 --> 00:03:00,880 193 | 当我们有只期望一个输入并返回一个输出的子链时,这很有效。 194 | 195 | 59 196 | 00:03:00,880 --> 00:03:02,000 197 | 结果。 198 | 199 | 60 200 | 00:03:02,760 --> 00:03:07,600 201 | 因此,这里我们将首先创建一个链,它使用LLM和提示。 202 | 203 | 61 204 | 00:03:07,600 --> 00:03:08,160 205 | 这个提示将接受产品并返回最佳名称来描述该公司。 206 | 207 | 62 208 | 00:03:08,560 --> 00:03:13,640 209 | 那将是第一个链。 210 | 211 | 63 212 | 00:03:13,640 --> 00:03:14,640 213 | 然后,我们将创建第二个链。 214 | 215 | 64 216 | 00:03:14,840 --> 00:03:16,080 217 | 
在第二个链中,我们将接受公司名称,然后输出该公司的20个单词的描述。 218 | 219 | 65 220 | 00:03:16,080 --> 00:03:18,280 221 | 因此,您可以想象这些链如何一个接一个地运行, 222 | 235 | 69 236 | 00:03:30,160 --> 00:03:33,600 237 | 第一个链的输出,即公司名称,然后传递到第二个链中。 238 | 239 | 70 240 | 00:03:33,600 --> 00:03:34,200 241 | 第二个链。 242 | 243 | 71 244 | 00:03:35,800 --> 00:03:39,960 245 | 我们可以通过创建一个简单的顺序链来轻松实现这一点,在这个链中,我们有 246 | 247 | 72 248 | 00:03:39,960 --> 00:03:41,640 249 | 那里描述的两个链。 250 | 251 | 73 252 | 00:03:42,240 --> 00:03:44,240 253 | 我们将称之为整体简单链。 254 | 255 | 74 256 | 00:03:44,240 --> 00:03:49,720 257 | 现在,您可以在任何产品描述上运行此链。 258 | 259 | 75 260 | 00:03:50,600 --> 00:03:54,920 261 | 因此,如果我们将其与上面的产品一起使用,即女王尺寸床单套装,我们可以 262 | 263 | 76 264 | 00:03:54,920 --> 00:03:58,840 265 | 运行它,我们可以看到首先输出Royal Beddings,然后将其传递到第二个链中。 266 | 267 | 77 268 | 00:03:58,840 --> 00:04:00,200 269 | 然后它提出了这家公司可能涉及的描述。 270 | 271 | 78 272 | 00:04:00,200 --> 00:04:03,400 273 | 当只有一个输入和一个输出时,简单的顺序链运作良好。 274 | 275 | 79 276 | 00:04:05,680 --> 00:04:09,160 277 | 但是,当不只是单个输入 278 | 279 | 80 280 | 00:04:09,160 --> 00:04:09,840 281 | 和单个输出,而是有多个输入或多个输出时呢? 282 | 283 | 81 284 | 00:04:10,320 --> 00:04:12,120 285 | 那么我们可以使用普通的顺序链来实现这一点。 286 | 
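(补充示例)上面介绍的 LLM 链与简单顺序链的数据流,可以用一段不依赖任何库的纯 Python 粗略示意;其中 fake_llm、chain_one 等名称均为假设性占位,并非课程中 LangChain 的真实 API:

```python
# 纯 Python 示意:每个子链只接受一个输入、返回一个输出,
# 前一个链的输出直接作为下一个链的输入。
def fake_llm(prompt: str) -> str:
    """假设性的占位模型:用固定规则代替真实语言模型,仅作演示。"""
    if "最佳名称" in prompt:
        return "Royal Beddings"
    return "Royal Beddings:专注于优质床品的公司(约 20 个单词的描述)。"

def chain_one(product: str) -> str:
    # 第一个链的提示:描述制造该产品的公司的最佳名称
    return fake_llm(f"描述制造{product}的公司的最佳名称是什么?")

def chain_two(company_name: str) -> str:
    # 第二个链的提示:为该公司写一个 20 个单词的描述
    return fake_llm(f"为公司{company_name}写一个 20 个单词的描述。")

def simple_sequential_chain(product: str) -> str:
    # 依次运行:产品 -> 公司名称 -> 公司描述
    return chain_two(chain_one(product))

print(simple_sequential_chain("queen size sheet set"))
```

课程中真实使用的是 LangChain 的 LLMChain 与 SimpleSequentialChain:前一个链的单一输出会自动作为后一个链的单一输入。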
294 | 299 | 85 300 | 00:04:21,240 --> 00:04:22,160 301 | 所以让我们导入它。 302 | 303 | 86 304 | 00:04:22,160 --> 00:04:25,280 305 | 然后,您将创建一堆链,我们将一个接一个地使用它们。 306 | 307 | 87 308 | 00:04:26,200 --> 00:04:29,200 309 | 我们将使用上面的数据,其中有一个评论。 310 | 311 | 88 312 | 00:04:29,640 --> 00:04:34,320 313 | 因此,第一个链,我们将采取评论并将其翻译成 314 | 315 | 89 316 | 00:04:34,320 --> 00:04:34,840 317 | 英语。 318 | 319 | 90 320 | 00:04:37,240 --> 00:04:41,200 321 | 第二个链,我们将创建一个摘要 322 | 323 | 91 324 | 00:04:41,200 --> 00:04:46,840 325 | 一句话。这将使用先前生成的英文评论。 326 | 327 | 93 328 | 00:04:54,560 --> 00:04:58,720 329 | 因此,如果您注意到,这是使用来自 330 | 331 | 96 332 | 00:04:58,720 --> 00:05:00,320 333 | 原始评论的评论变量。 334 | 335 | 97 336 | 00:05:02,400 --> 00:05:05,480 337 | 最后,第四个链将接受多个输入。 338 | 339 | 98 340 | 00:05:05,840 --> 00:05:09,560 341 | 因此,这将接受我们使用第二个 342 | 343 | 99 344 | 00:05:09,560 --> 00:05:13,320 345 | 链计算的摘要变量和我们使用第三个链计算的语言变量。 346 | 347 | 100 348 | 00:05:13,760 --> 00:05:17,720 349 | 它将要求在指定的 350 | 351 | 101 352 | 00:05:17,720 --> 00:05:18,240 353 | 语言中对摘要进行跟进回复。 354 | 355 | 102 356 | 00:05:19,800 --> 00:05:23,640 357 | 关于所有这些子链的一个重要事项是输入键 358 | 359 | 103 360 | 00:05:23,720 --> 00:05:25,960 361 | 和输出键需要非常精确。 362 | 363 | 104 364 | 00:05:26,680 --> 00:05:28,520 365 | 所以在这里,我们接收评论作为输入。 366 | 367 | 105 368 | 00:05:28,600 --> 00:05:31,120 369 | 这是一个在开始时将传递的变量。 370 | 371 | 106 372 | 00:05:31,760 --> 00:05:35,320 373 | 我们可以看到我们明确将输出键设置为英文评论。 374 | 375 | 107 376 | 00:05:35,320 --> 00:05:39,840 377 | 然后在下面的下一个提示中使用它,我们接受英文评论 378 | 379 | 108 380 | 00:05:39,840 --> 00:05:44,240 381 | 使用相同的变量名,并将该链的输出键设置为摘要, 382 | 383 | 109 384 | 00:05:44,680 --> 00:05:46,920 385 | 我们可以看到它在最终链中使用。 386 | 387 | 110 388 | 00:05:47,800 --> 00:05:52,760 389 | 第三个提示接受原始的评论变量,并将输出键设为语言, 390 | 391 | 111 392 | 00:05:53,160 --> 00:05:55,040 393 | 这又在最终提示中使用。 394 | 395 | 112 396 | 00:05:56,040 --> 00:05:59,760 397 | 确保这些变量名称完全正确非常重要, 398 | 399 | 113 400 | 00:05:59,920 --> 00:06:02,400 401 | 
因为有很多不同的输入和输出。 402 | 403 | 114 404 | 00:06:02,400 --> 00:06:06,160 405 | 如果您遇到任何键错误,一定要检查它们是否对齐。 406 | 407 | 115 408 | 00:06:06,160 --> 00:06:12,040 409 | 因此,简单的顺序链接收多个链,其中每个链具有一个 410 | 411 | 116 412 | 00:06:12,040 --> 00:06:13,680 413 | 单个输入和单个输出。 414 | 415 | 117 416 | 00:06:14,560 --> 00:06:19,080 417 | 要查看其可视化表示,我们可以查看幻灯片,其中有一个 418 | 419 | 118 420 | 00:06:19,080 --> 00:06:22,760 421 | 链一个接一个地进入另一个链。 422 | 423 | 119 424 | 00:06:24,080 --> 00:06:28,000 425 | 在这里,我们可以看到顺序链的视觉描述,将其与 426 | 427 | 120 428 | 00:06:28,000 --> 00:06:32,920 429 | 上面的链进行比较,您可以注意到链中的任何步骤都可以接受多个输入 430 | 431 | 121 432 | 00:06:32,920 --> 00:06:33,720 433 | 变量。 434 | 435 | 122 436 | 00:06:34,280 --> 00:06:38,400 437 | 当您拥有更复杂的下游链,需要 438 | 439 | 123 440 | 00:06:38,400 --> 00:06:41,400 441 | 组合多个先前链的输出时,这非常有用。 442 | 443 | 124 444 | 00:06:42,840 --> 00:06:46,400 445 | 现在我们拥有所有这些链,我们可以轻松地将它们组合在顺序中 446 | 447 | 125 448 | 00:06:46,400 --> 00:06:46,920 449 | 链。 450 | 451 | 126 452 | 00:06:47,360 --> 00:06:51,880 453 | 因此,您会注意到我们将创建的四个链传递到 454 | 455 | 127 456 | 00:06:51,880 --> 00:06:52,760 457 | 链变量。 458 | 459 | 128 460 | 00:06:52,760 --> 00:06:57,280 461 | 我们将创建具有一个人类输入的输入变量,即 462 | 463 | 129 464 | 00:06:57,280 --> 00:06:58,000 465 | 评论。 466 | 467 | 130 468 | 00:06:58,400 --> 00:07:02,200 469 | 然后我们想返回所有中间输出。 470 | 471 | 131 472 | 00:07:02,200 --> 00:07:05,080 473 | 英文评论,摘要,然后是后续消息。 474 | 475 | 132 476 | 00:07:07,320 --> 00:07:10,080 477 | 现在我们可以在一些数据上运行它。 478 | 479 | 133 480 | 00:07:10,120 --> 00:07:14,800 481 | 所以让我们选择一篇评论并通过整个链传递它。 482 | 483 | 134 484 | 00:07:20,000 --> 00:07:24,920 485 | 我们可以看到这里的原始评论似乎是法语。 486 | 487 | 135 488 | 00:07:24,920 --> 00:07:27,680 489 | 我们可以把英文评论看作是一种翻译。 490 | 491 | 136 492 | 00:07:27,680 --> 00:07:31,880 493 | 我们可以看到该评论的摘要,然后我们可以看到一条用法语原文写的跟进信息。 494 | 495 | 137 496 | 00:07:31,880 --> 00:07:34,240 497 | 您应该在此暂停视频并尝试输入不同的输入。 498 | 499 | 138 500 | 00:07:34,800 --> 00:07:38,320 501 | 到目前为止,我们已经涵盖了LLM链和顺序链。 502 | 503 | 139 504 | 00:07:39,040 --> 00:07:42,560 505 | 
但是,如果您想做一些更复杂的事情怎么办? 506 | 507 | 140 508 | 00:07:43,080 --> 00:07:45,600 509 | 一个相当常见但基本的操作是根据输入将其路由到链中。 510 | 511 | 141 512 | 00:07:46,200 --> 00:07:50,440 513 | 一个很好的想象方式是,如果您有多个子链,每个子链都专门用于特定类型的输入,您可以有一个路由器链, 514 | 515 | 142 516 | 00:07:50,440 --> 00:07:52,400 517 | 首先决定要将其传递到哪个子链,然后将其传递到该链。 518 | 519 | 143 520 | 00:07:52,400 --> 00:07:57,200 521 | 一个具体的例子是,让我们看看在不同类型的链之间路由的情况,具体取决于似乎出现的主题。 522 | 523 | 144 524 | 00:07:57,200 --> 00:08:01,720 525 | 所以我们这里有不同的提示。 526 | 527 | 145 528 | 00:08:01,760 --> 00:08:06,000 529 | 一个提示适合回答物理问题。 530 | 531 | 146 532 | 00:08:06,000 --> 00:08:06,480 533 | 第二个提示适合回答数学问题。 534 | 535 | 147 536 | 00:08:07,360 --> 00:08:11,520 537 | 第三个适用于历史,第四个适用于计算机科学。 538 | 539 | 148 540 | 00:08:11,520 --> 00:08:15,720 541 | 让我们定义所有这些提示模板。 542 | 543 | 149 544 | 00:08:16,440 --> 00:08:18,640 545 | 在我们拥有这些提示模板之后,我们可以提供更多信息 546 | 547 | 150 548 | 00:08:18,800 --> 00:08:21,280 549 | 关于它们。 550 | 551 | 151 552 | 00:08:21,280 --> 00:08:23,680 553 | 我们可以为每个模板命名,然后提供描述。 554 | 555 | 152 556 | 00:08:23,680 --> 00:08:26,960 557 | 这个物理学的描述适合回答关于物理学的问题。 558 | 559 | 153 560 | 00:08:27,280 --> 00:08:29,440 561 | 这些信息将传递给路由器链。 562 | 563 | 154 564 | 00:08:33,320 --> 00:08:36,800 565 | 因此,路由器链可以决定何时使用此子链。 566 | 567 | 155 568 | 00:08:36,800 --> 00:08:37,320 569 | 我们现在可以导入我们需要的其他类型的链。 570 | 571 | 156 572 | 00:08:37,760 --> 00:08:40,600 573 | 在这里,我们需要一个多提示链。 574 | 575 | 157 576 | 00:08:41,160 --> 00:08:44,280 577 | 这是一种特定类型的链,用于在多个不同的提示模板之间进行路由。 578 | 579 | 158 580 | 00:08:44,280 --> 00:08:44,800 581 | 但是,这只是您可以路由的一种类型。 582 | 583 | 159 584 | 00:08:45,480 --> 00:08:48,560 585 | 您可以在任何类型的链之间进行路由。 586 | 587 | 168 588 | 00:09:19,000 --> 00:09:22,400 589 | 这里我们要实现的另外几个类是LLM路由器链。 590 | 591 | 169 592 | 00:09:22,880 --> 00:09:26,840 593 | 这个类本身使用语言模型来在不同的子链之间进行路由。 594 | 595 | 170 596 | 00:09:26,880 --> 00:09:30,360 597 | 这就是上面提供的描述和名称将被使用的地方。 598 | 599 | 171 600 | 00:09:31,160 --> 00:09:33,360 601 | 我们还将导入一个路由器输出解析器。 602 | 603 | 172 604 | 00:09:33,920 --> 00:09:38,080 605 | 
这将LLM输出解析为可在下游使用的字典,以确定要使用哪个链以及该链的输入应该是什么。 606 | 607 | 173 608 | 00:09:38,440 --> 00:09:42,440 609 | 现在我们可以开始使用它了。 610 | 611 | 174 612 | 00:09:42,440 --> 00:09:44,120 613 | 首先,让我们导入并定义要使用的语言模型。 614 | 615 | 175 616 | 00:09:44,160 --> 00:09:48,680 617 | 现在我们创建目标链。 618 | 619 | 176 620 | 00:09:52,000 --> 00:09:54,240 621 | 这些是由路由器链调用的链。 622 | 623 | 177 624 | 00:09:54,400 --> 00:09:57,080 625 | 正如您所看到的,每个目标链本身都是一个语言模型链, 626 | 627 | 178 628 | 00:09:57,560 --> 00:10:01,400 629 | 除了目标链之外,我们还需要一个默认链。 630 | 631 | 179 632 | 00:10:01,400 --> 00:10:02,360 633 | 这是一个当路由器无法决定使用哪个子链时调用的链。 634 | 635 | 180 636 | 00:10:04,240 --> 00:10:08,640 637 | 在上面的示例中,当输入问题与物理、数学、历史或计算机科学无关时,可能会调用它。 638 | 651 | 184 652 | 00:10:18,080 --> 00:10:22,000 653 | 现在我们定义了LLM用于在不同链之间进行路由的模板。 654 | 655 | 185 656 | 00:10:22,000 --> 00:10:25,800 657 | 这包括要完成的任务的说明以及输出应该采用的特定格式。 658 | 659 | 186 660 | 00:10:28,120 --> 00:10:31,760 661 | 让我们将其中一些部分组合起来构建路由器链。 662 | 663 | 187 664 | 00:10:31,760 --> 00:10:33,800 665 | 首先,我们通过格式化上面定义的目标创建完整的路由器模板。 666 | 667 | 188 668 | 00:10:34,720 --> 00:10:37,000 669 | 这个模板可以适应许多不同类型的目标。 670 | 671 | 189 672 | 00:10:37,000 --> 00:10:40,440 673 | 在这里,您可以暂停并添加不同类型的目标。 674 | 675 | 190 676 | 00:10:41,680 --> 00:10:44,840 677 | 因此,在这里,您可以添加一个不同的学科,如英语或拉丁语,而不仅仅是物理、数学、历史和计算机科学。 678 | 699 | 196 700 | 00:11:02,160 
--> 00:11:04,960 701 | 接下来,我们从这个模板创建提示模板, 702 | 703 | 197 704 | 00:11:04,960 --> 00:11:07,760 705 | 然后通过传入LLM和整个路由器提示来创建路由器链。 706 | 707 | 200 708 | 00:11:13,960 --> 00:11:16,360 709 | 请注意,这里有路由器输出解析器。 710 | 711 | 201 712 | 00:11:16,720 --> 00:11:19,320 713 | 这很重要,因为它将帮助这个链路决定 714 | 715 | 202 716 | 00:11:19,720 --> 00:11:22,160 717 | 在哪些子链路之间进行路由。 718 | 719 | 203 720 | 00:11:24,760 --> 00:11:28,920 721 | 最后,将所有内容整合在一起,我们可以创建整体链路。 722 | 723 | 204 724 | 00:11:29,240 --> 00:11:32,400 725 | 它有一个路由器链路,在这里定义。 726 | 727 | 205 728 | 00:11:32,400 --> 00:11:35,200 729 | 它有目标链路,我们在这里传递。 730 | 731 | 206 732 | 00:11:35,400 --> 00:11:37,200 733 | 然后我们还传递默认链路。 734 | 735 | 207 736 | 00:11:38,880 --> 00:11:40,200 737 | 现在我们可以使用这个链路。 738 | 739 | 208 740 | 00:11:40,520 --> 00:11:41,960 741 | 所以让我们问一些问题。 742 | 743 | 209 744 | 00:11:42,560 --> 00:11:45,320 745 | 如果我们问一个物理问题, 746 | 747 | 210 748 | 00:11:45,520 --> 00:11:48,920 749 | 我们应该希望看到它被路由到物理链路 750 | 751 | 211 752 | 00:11:49,640 --> 00:11:52,560 753 | 输入是什么,黑体辐射是什么? 
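(补充示例)用纯 Python 粗略示意上面的路由逻辑:路由器先为输入选择一个目标子链,选不中时落入默认链。其中 router 用关键词匹配代替真实实现中由语言模型做出的决定,所有名称与关键词均为假设性占位:

```python
# 纯 Python 示意:根据输入把问题路由到不同的"提示 + 模型"子链。
destination_chains = {
    "physics": lambda q: "[物理提示] " + q,
    "math": lambda q: "[数学提示] " + q,
    "history": lambda q: "[历史提示] " + q,
    "computer science": lambda q: "[计算机科学提示] " + q,
}

def default_chain(q: str) -> str:
    # 对语言模型的通用调用(这里同样用占位文本示意)
    return "[默认链] " + q

def router(question: str) -> str:
    # 真实实现中由语言模型阅读每个目标的名称和描述来做决定;
    # 这里用简单的关键词匹配近似。
    keywords = {"黑体辐射": "physics", "2 + 2": "math", "战争": "history", "DNA": "biology"}
    for kw, destination in keywords.items():
        if kw in question:
            return destination
    return "None"

def multi_prompt_chain(question: str) -> str:
    # 路由结果不在目标链中(例如生物学问题)时,落入默认链
    chain = destination_chains.get(router(question), default_chain)
    return chain(question)

print(multi_prompt_chain("什么是黑体辐射?"))
print(multi_prompt_chain("什么是DNA?"))
```

这与课程中的行为一致:生物学问题不匹配任何目标,于是落入默认链,由通用的语言模型调用来回答。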
754 | 755 | 212 756 | 00:11:52,800 --> 00:11:55,480 757 | 然后它被传递到下面的链路中。 758 | 759 | 213 760 | 00:11:55,480 --> 00:11:59,080 761 | 我们可以看到回答非常详细 762 | 763 | 214 764 | 00:11:59,080 --> 00:12:01,080 765 | 有很多物理细节。 766 | 767 | 215 768 | 00:12:01,080 --> 00:12:04,600 769 | 您应该在此暂停视频并尝试输入不同的内容。 770 | 771 | 216 772 | 00:12:04,920 --> 00:12:08,520 773 | 您可以尝试使用我们上面定义的所有其他类型的特殊链路 774 | 775 | 217 776 | 00:12:08,520 --> 00:12:09,920 777 | 。 778 | 779 | 218 780 | 00:12:10,440 --> 00:12:13,240 781 | 因此,例如,如果我们问一个数学问题, 782 | 783 | 219 784 | 00:12:21,600 --> 00:12:23,680 785 | 我们应该看到它被路由到数学链路 786 | 787 | 220 788 | 00:12:24,040 --> 00:12:25,120 789 | 然后传递到那里。 790 | 791 | 221 792 | 00:12:25,120 --> 00:12:27,720 793 | 我们还可以看到当我们传递一个问题时会发生什么 794 | 795 | 222 796 | 00:12:27,920 --> 00:12:30,320 797 | 与任何子链路都无关的。 798 | 799 | 223 800 | 00:12:30,720 --> 00:12:33,480 801 | 所以在这里,我们问一个关于生物学的问题 802 | 803 | 224 804 | 00:12:33,760 --> 00:12:35,880 805 | 我们可以看到它选择的链路是无。 806 | 807 | 225 808 | 00:12:36,520 --> 00:12:38,400 809 | 这意味着它将被传递到默认链路, 810 | 811 | 226 812 | 00:12:38,400 --> 00:12:41,360 813 | 它本身只是对语言模型的通用调用。 814 | 815 | 227 816 | 00:12:41,560 --> 00:12:43,680 817 | 语言模型幸运地对生物学知道很多, 818 | 819 | 228 820 | 00:12:43,680 --> 00:12:44,880 821 | 所以它可以帮助我们。 822 | 823 | 229 824 | 00:12:46,080 --> 00:12:48,120 825 | 现在我们已经涵盖了这些基本问题, 826 | 827 | 230 828 | 00:12:48,120 --> 00:12:50,120 829 | 让我们继续视频的下一部分。 830 | 831 | 231 832 | 00:12:50,120 --> 00:12:52,120 833 | 那就是如何创建一个新的链路。 834 | 835 | 232 836 | 00:12:52,120 --> 00:12:55,120 837 | 因此,例如,在下一部分中, 838 | 839 | 233 840 | 00:12:55,120 --> 00:12:57,120 841 | 我们将介绍如何创建一个链路 842 | 843 | 234 844 | 00:12:57,120 --> 00:13:22,120 845 | 可以对您的文档进行问答。 -------------------------------------------------------------------------------- /chinese/LangChain_L5.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:01,000 --> 00:00:16,000 5 | 当使用llm构建复杂应用程序时,评估应用程序的表现是一个重要但有时棘手的步骤。它是否满足某些准确性标准? 
6 | 7 | 2 8 | 00:00:16,000 --> 00:00:33,000 9 | 此外,如果您决定更改实现,可能会交换不同的llm或更改如何使用矢量数据库或其他内容检索通道或更改系统的某些其他参数的策略,那么如何知道您是在使其变得更好还是更糟? 10 | 11 | 3 12 | 00:00:34,000 --> 00:00:42,000 13 | 在本视频中,哈里森将深入探讨一些框架,以思考如何评估基于llm的应用程序以及一些工具来帮助您做到这一点。 14 | 15 | 4 16 | 00:00:42,000 --> 00:00:53,000 17 | 这些应用程序实际上是许多不同步骤的链和序列。因此,老实说,您应该做的第一件事就是了解每个步骤的具体情况。 18 | 19 | 5 20 | 00:00:54,000 --> 00:00:58,000 21 | 因此,一些工具实际上可以被视为可视化器或调试器。 22 | 23 | 6 24 | 00:00:59,000 --> 00:01:04,000 25 | 但是,通常更有用的是从许多不同的数据点中获得更全面的模型表现情况。 26 | 27 | 7 28 | 00:01:04,000 --> 00:01:15,000 29 | 一种方法是通过肉眼观察来做到这一点。但是,还有这个非常酷的想法,即使用语言模型本身和链本身来评估其他语言模型、其他链和其他应用程序。 30 | 31 | 8 32 | 00:01:16,000 --> 00:01:17,000 33 | 我们也将深入探讨这个想法。 34 | 35 | 9 36 | 00:01:18,000 --> 00:01:30,000 37 | 因此,有很多很酷的主题。我发现随着许多开发转向基于提示的开发,使用llm开发应用程序,整个工作流程评估过程正在被重新思考。 38 | 39 | 10 40 | 00:01:30,000 --> 00:01:34,000 41 | 因此,在本视频中有许多令人兴奋的概念。让我们开始吧。 42 | 43 | 11 44 | 00:01:35,000 --> 00:01:38,000 45 | 好的。那么,让我们开始评估。 46 | 47 | 12 48 | 00:01:39,000 --> 00:01:44,000 49 | 首先,我们需要有我们要评估的链或应用程序。 50 | 51 | 13 52 | 00:01:45,000 --> 00:01:49,000 53 | 我们将使用上一课的文档问答链。 54 | 55 | 14 56 | 00:01:50,000 --> 00:01:55,000 57 | 因此,我们将导入我们需要的所有内容。我们将加载相同的数据。 58 | 59 | 15 60 | 00:01:55,000 --> 00:01:58,000 61 | 我们将用一行代码创建该索引。 62 | 63 | 16 64 | 00:01:59,000 --> 00:02:11,000 65 | 然后,我们将通过指定语言模型、链类型、检索器和我们要打印的详细程度来创建检索QA链。 66 | 67 | 17 68 | 00:02:13,000 --> 00:02:14,000 69 | 因此,我们有了这个应用程序。 70 | 71 | 18 72 | 00:02:14,000 --> 00:02:25,000 73 | 我们需要做的第一件事是真正弄清楚我们想要评估它的一些数据点。 74 | 75 | 19 76 | 00:02:26,000 --> 00:02:29,000 77 | 因此,我们将介绍几种不同的方法来完成这个任务。 78 | 79 | 20 80 | 00:02:30,000 --> 00:02:37,000 81 | 第一种方法是最简单的,基本上我们将自己想出好的数据点作为例子。 82 | 83 | 21 84 | 00:02:37,000 --> 00:02:47,000 85 | 为此,我们可以查看一些数据,然后想出例子问题和答案,以便以后用于评估。 86 | 87 | 22 88 | 00:02:48,000 --> 00:02:55,000 89 | 因此,如果我们查看这里的一些文档,我们可以对其中发生的事情有所了解。 90 | 91 | 23 92 | 00:02:56,000 --> 00:03:01,000 93 | 看起来第一个文档中有这个套头衫,第二个文档中有这个夹克。 94 | 95 | 24 96 | 00:03:02,000 --> 00:03:05,000 97 | 它们都有很多细节。 98 | 99 | 25 100 | 
00:03:05,000 --> 00:03:10,000 101 | 从这些细节中,我们可以创建一些例子查询和答案对。 102 | 103 | 26 104 | 00:03:11,000 --> 00:03:16,000 105 | 因此,我们可以问一个简单的问题,这个舒适的套头衫套装有侧口袋吗? 106 | 107 | 27 108 | 00:03:17,000 --> 00:03:22,000 109 | 我们可以通过上面的内容看到,它确实有一些侧口袋。 110 | 111 | 28 112 | 00:03:23,000 --> 00:03:29,000 113 | 然后对于第二个文档,我们可以看到这件夹克来自某个系列,即down tech系列。 114 | 115 | 29 116 | 00:03:30,000 --> 00:03:32,000 117 | 因此,我们可以问一个问题,这件夹克来自哪个系列? 118 | 119 | 30 120 | 00:03:32,000 --> 00:03:35,000 121 | 答案是down tech系列。 122 | 123 | 31 124 | 00:03:36,000 --> 00:03:38,000 125 | 因此,我们创建了两个例子。 126 | 127 | 32 128 | 00:03:39,000 --> 00:03:41,000 129 | 但这并不是很可扩展。 130 | 131 | 33 132 | 00:03:42,000 --> 00:03:45,000 133 | 需要花费一些时间查看每个例子并弄清楚发生了什么。 134 | 135 | 34 136 | 00:03:46,000 --> 00:03:48,000 137 | 因此,有没有一种方法可以自动化? 138 | 139 | 35 140 | 00:03:49,000 --> 00:03:53,000 141 | 我们认为可以使用语言模型本身来实现这一点。 142 | 143 | 36 144 | 00:03:54,000 --> 00:03:57,000 145 | 因此,我们在LangChain中有一个链可以做到这一点。 146 | 147 | 37 148 | 00:03:58,000 --> 00:04:00,000 149 | 因此,我们可以导入QA生成链。 150 | 151 | 38 152 | 00:04:00,000 --> 00:04:05,000 153 | 它将接收文档,并从每个文档中创建一个问题答案对。 154 | 155 | 39 156 | 00:04:06,000 --> 00:04:08,000 157 | 它将使用语言模型本身来完成这一点。 158 | 159 | 40 160 | 00:04:09,000 --> 00:04:13,000 161 | 因此,我们需要通过传递chat open AI语言模型来创建这个链。 162 | 163 | 41 164 | 00:04:14,000 --> 00:04:16,000 165 | 然后,我们可以创建许多例子。 166 | 167 | 42 168 | 00:04:17,000 --> 00:04:23,000 169 | 因此,我们将使用apply和parse方法,因为这将应用输出解析器到结果。 170 | 171 | 43 172 | 00:04:24,000 --> 00:04:27,000 173 | 因为我们想要得到一个具有查询和答案对的字典, 174 | 175 | 44 176 | 00:04:27,000 --> 00:04:29,000 177 | 而不仅仅是一个字符串。 178 | 179 | 45 180 | 00:04:36,000 --> 00:04:39,000 181 | 因此,现在如果我们看看这里返回了什么, 182 | 183 | 46 184 | 00:04:40,000 --> 00:04:42,000 185 | 我们可以看到一个查询和一个答案。 186 | 187 | 47 188 | 00:04:43,000 --> 00:04:46,000 189 | 让我们检查一下这是一个问题和答案的文档。 190 | 191 | 48 192 | 00:04:47,000 --> 00:04:49,000 193 | 我们可以看到它正在询问这个的重量。 194 | 195 | 49 196 | 00:04:50,000 --> 00:04:52,000 197 | 我们可以看到它正在从这里获取重量。 198 | 199 | 50 200 | 00:04:53,000 
--> 00:04:54,000 201 | 看看那个。 202 | 203 | 51 204 | 00:04:54,000 --> 00:04:56,000 205 | 我们刚刚生成了一堆问题答案对。 206 | 207 | 52 208 | 00:04:57,000 --> 00:04:58,000 209 | 我们不必自己编写它们。 210 | 211 | 53 212 | 00:04:59,000 --> 00:05:02,000 213 | 节省了我们很多时间,我们可以做更有趣的事情。 214 | 215 | 54 216 | 00:05:03,000 --> 00:05:08,000 217 | 因此,现在让我们将这些示例添加到我们已经创建的示例中。 218 | 219 | 55 220 | 00:05:09,000 --> 00:05:14,000 221 | 所以,我们现在有了这些示例,但是我们如何评估正在发生的事情呢? 222 | 223 | 56 224 | 00:05:15,000 --> 00:05:18,000 225 | 我们想做的第一件事就是运行一个示例通过链 226 | 227 | 57 228 | 00:05:19,000 --> 00:05:21,000 229 | 并查看它产生的输出。 230 | 231 | 58 232 | 00:05:21,000 --> 00:05:25,000 233 | 因此,在这里我们传递一个查询,然后我们得到一个答案。 234 | 235 | 59 236 | 00:05:26,000 --> 00:05:29,000 237 | 但是这在了解链中实际发生的事情方面有点受限 238 | 239 | 60 240 | 00:05:30,000 --> 00:05:31,000 241 | 实际上正在发生的事情。 242 | 243 | 61 244 | 00:05:32,000 --> 00:05:34,000 245 | 进入语言模型的实际提示是什么? 246 | 247 | 62 248 | 00:05:35,000 --> 00:05:37,000 249 | 它检索的文档是什么? 250 | 251 | 63 252 | 00:05:38,000 --> 00:05:40,000 253 | 如果这是一个更复杂的链,其中有多个步骤, 254 | 255 | 64 256 | 00:05:41,000 --> 00:05:42,000 257 | 中间结果是什么? 
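(补充示例)"用语言模型从文档自动生成问答对"的流程可以用纯 Python 示意如下;fake_qa_generate 用硬编码规则代替真实的 QA 生成链,文档内容与名称均为假设:

```python
# 纯 Python 示意:为每个文档生成一个 {"query": ..., "answer": ...} 对,
# 再与手工编写的例子合并成评估集。
docs = [
    "舒适套头衫套装:条纹款,带侧口袋。",
    "这件夹克来自 Down Tech 系列,轻盈保暖。",
]

def fake_qa_generate(doc: str) -> dict:
    # 真实实现中由语言模型阅读文档并同时生成问题和答案
    if "套头衫" in doc:
        return {"query": "这个舒适的套头衫套装有侧口袋吗?", "answer": "有侧口袋。"}
    return {"query": "这件夹克来自哪个系列?", "answer": "Down Tech 系列。"}

manual_examples = [
    {"query": "这个舒适的套头衫套装有侧口袋吗?", "answer": "有。"},
]
generated_examples = [fake_qa_generate(d) for d in docs]
examples = manual_examples + generated_examples
print(len(examples))
```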
258 | 259 | 65 260 | 00:05:43,000 --> 00:05:46,000 261 | 仅仅查看最终答案通常不足以了解链中出现了什么问题或可能出现了什么问题。 262 | 263 | 66 264 | 00:05:46,000 --> 00:05:50,000 265 | 为了帮助解决这个问题,我们在LangChain中有一个有趣的小工具,称为langchain.debug。 266 | 267 | 67 268 | 00:05:51,000 --> 00:05:56,000 269 | 因此,如果我们将langchain.debug设置为true 270 | 271 | 68 272 | 00:06:03,000 --> 00:06:06,000 273 | 现在我们重新运行与上面相同的示例, 274 | 275 | 69 276 | 00:06:07,000 --> 00:06:11,000 277 | 我们可以看到它开始打印出更多的信息。 278 | 279 | 70 280 | 00:06:12,000 --> 00:06:14,000 281 | 因此,如果我们看看它到底打印了什么, 282 | 283 | 71 284 | 00:06:14,000 --> 00:06:18,000 285 | 我们可以看到它首先深入到检索QA链中 286 | 287 | 72 288 | 00:06:19,000 --> 00:06:21,000 289 | 然后它进入了一些文档链。 290 | 291 | 73 292 | 00:06:22,000 --> 00:06:24,000 293 | 因此,如上所述,我们正在使用stuff方法。 294 | 295 | 74 296 | 00:06:25,000 --> 00:06:28,000 297 | 现在它正在进入LLM链,我们有几个不同的输入。 298 | 299 | 75 300 | 00:06:29,000 --> 00:06:31,000 301 | 因此,我们可以看到原始问题就在那里。 302 | 303 | 76 304 | 00:06:32,000 --> 00:06:33,000 305 | 现在我们正在传递这个上下文。 306 | 307 | 78 308 | 00:06:34,000 --> 00:06:36,000 309 | 我们可以看到,这个上下文是由我们检索到的不同文档创建的。 310 | 311 | 79 312 | 00:06:37,000 --> 00:06:40,000 313 | 因此,在进行问答时,当返回错误结果时, 314 | 315 | 80 316 | 00:06:41,000 --> 00:06:42,000 317 | 通常不是语言模型本身出错了。 318 | 319 | 81 320 | 00:06:42,000 --> 00:06:44,000 321 | 实际上是检索步骤出错了。 322 | 323 | 82 324 | 00:06:45,000 --> 00:06:48,000 325 | 因此,仔细查看问题的确切内容和上下文可以帮助调试出错的原因。 326 | 327 | 83 328 | 00:06:49,000 --> 00:06:50,000 329 | 然后,我们可以再向下一级 330 | 331 | 84 332 | 00:06:51,000 --> 00:06:54,000 333 | 看看进入语言模型的确切内容, 334 | 335 | 85 336 | 00:06:55,000 --> 00:06:58,000 337 | 以及 OpenAI 自身。 338 | 339 | 86 340 | 00:06:59,000 --> 00:07:01,000 341 | 因此,在这里,我们可以看到传递的完整提示。 342 | 343 | 87 344 | 00:07:02,000 --> 00:07:04,000 345 | 所以,我们有一个系统消息。 346 | 347 | 88 348 | 00:07:05,000 --> 00:07:06,000 349 | 我们有所使用的提示的描述。 350 | 351 | 89 352 | 00:07:07,000 --> 00:07:09,000 353 | 因此,这是问题回答链使用的提示, 354 | 355 | 90 356 | 00:07:09,000 --> 00:07:11,000 357 | 我们甚至直到现在都没有看过。 358 | 359 | 91 360 
| 因此,我们可以看到提示打印出来, 362 | 363 | 92 364 | 00:07:15,000 --> 00:07:17,000 365 | 使用以下上下文片段回答用户的问题。 366 | 367 | 93 368 | 00:07:18,000 --> 00:07:19,000 369 | 如果您不知道答案,只需说您不知道即可。 370 | 371 | 94 372 | 00:07:20,000 --> 00:07:21,000 373 | 不要试图编造答案。 374 | 375 | 95 376 | 00:07:22,000 --> 00:07:23,000 377 | 然后我们看到一堆之前插入的上下文, 378 | 379 | 96 380 | 00:07:24,000 --> 00:07:26,000 381 | 然后我们看到一个人类问题, 382 | 383 | 97 384 | 00:07:27,000 --> 00:07:28,000 385 | 也就是我们问它的问题。 386 | 387 | 98 388 | 00:07:29,000 --> 00:07:30,000 389 | 我们还可以看到有关实际返回类型的更多信息。 390 | 391 | 99 392 | 00:07:31,000 --> 00:07:33,000 393 | 因此,我们不仅仅返回一个字符串, 394 | 395 | 100 396 | 00:07:34,000 --> 00:07:35,000 397 | 我们还返回了许多信息,如令牌使用情况, 398 | 399 | 101 400 | 00:07:36,000 --> 00:07:37,000 401 | 因此,提示令牌、完成令牌、 402 | 403 | 102 404 | 00:07:38,000 --> 00:07:40,000 405 | 总令牌和模型名称。 406 | 407 | 103 408 | 00:07:41,000 --> 00:07:42,000 409 | 这可以非常有用地跟踪您在链中使用的令牌 410 | 411 | 104 412 | 00:07:43,000 --> 00:07:45,000 413 | 或随时间调用语言模型的令牌 414 | 415 | 105 416 | 00:07:46,000 --> 00:07:47,000 417 | 并跟踪总令牌数, 418 | 419 | 106 420 | 00:07:48,000 --> 00:07:50,000 421 | 这与总成本非常接近。 422 | 423 | 107 424 | 00:07:51,000 --> 00:07:53,000 425 | 由于这是一个相对简单的链, 426 | 427 | 108 428 | 00:07:54,000 --> 00:07:55,000 429 | 我们现在可以看到最终的响应, 430 | 431 | 114 432 | 00:08:07,000 --> 00:08:09,000 433 | 舒适的毛衣套装,条纹款, 434 | 435 | 115 436 | 00:08:10,000 --> 00:08:11,000 437 | 确实有侧口袋,这个最终答案 438 | 439 | 116 440 | 00:08:12,000 --> 00:08:14,000 441 | 通过链条返回给用户。 442 | 443 | 117 444 | 00:08:15,000 --> 00:08:17,000 445 | 因此,我们刚刚讲解了如何查看和调试单个输入到该链的情况。 446 | 447 | 118 448 | 00:08:18,000 --> 00:08:21,000 449 | 但是我们创建的所有示例呢? 450 | 451 | 119 452 | 00:08:22,000 --> 00:08:23,000 453 | 我们该如何评估它们? 
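(补充示例)调试输出中看到的提示结构——系统指令加上下文片段,再加人类问题——可以用纯 Python 示意如下;消息格式参照常见的聊天模型接口,仅作说明:

```python
# 纯 Python 示意:把检索到的文档"塞进"(stuff)同一个提示,
# 形成 系统消息 + 人类问题 的消息列表。
def build_stuff_qa_prompt(context_docs, question):
    system = (
        "使用以下上下文片段回答用户的问题。"
        "如果您不知道答案,只需说您不知道,不要试图编造答案。\n\n"
        + "\n---\n".join(context_docs)
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_stuff_qa_prompt(
    ["舒适套头衫套装:条纹款,带侧口袋。"],
    "这个舒适的套头衫套装有侧口袋吗?",
)
print(messages[0]["role"])
```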
454 | 455 | 120 456 | 00:08:24,000 --> 00:08:25,000 457 | 与创建它们类似, 458 | 459 | 121 460 | 00:08:26,000 --> 00:08:28,000 461 | 一种方法是手动进行。 462 | 463 | 122 464 | 00:08:29,000 --> 00:08:30,000 465 | 我们可以运行链条来处理所有示例, 466 | 467 | 123 468 | 00:08:31,000 --> 00:08:32,000 469 | 然后查看输出并尝试弄清楚 470 | 471 | 124 472 | 00:08:32,000 --> 00:08:34,000 473 | 发生了什么,它是否正确, 474 | 475 | 125 476 | 00:08:35,000 --> 00:08:36,000 477 | 不正确,部分正确。 478 | 479 | 126 480 | 00:08:37,000 --> 00:08:38,000 481 | 与创建示例类似, 482 | 483 | 127 484 | 00:08:39,000 --> 00:08:40,000 485 | 随着时间的推移,这开始变得有点乏味。 486 | 487 | 128 488 | 00:08:41,000 --> 00:08:42,000 489 | 因此,让我们回到我们最喜欢的解决方案。 490 | 491 | 129 492 | 00:08:43,000 --> 00:08:45,000 493 | 我们可以要求语言模型来做吗? 494 | 495 | 130 496 | 00:08:46,000 --> 00:08:48,000 497 | 首先,我们需要为所有示例创建预测。 498 | 499 | 131 500 | 00:08:49,000 --> 00:08:51,000 501 | 在这之前,我要关闭 502 | 503 | 132 504 | 00:08:52,000 --> 00:08:54,000 505 | 调试模式,以便不将所有内容打印到屏幕上。 506 | 507 | 133 508 | 00:08:55,000 --> 00:08:57,000 509 | 然后我将为所有不同的示例创建预测。 510 | 511 | 134 512 | 00:08:58,000 --> 00:08:59,000 513 | 因此,我认为我们总共有七个示例。 514 | 515 | 135 516 | 00:09:00,000 --> 00:09:01,000 517 | 现在我们有了这些示例, 518 | 519 | 136 520 | 00:09:02,000 --> 00:09:03,000 521 | 我们可以考虑评估它们。 522 | 523 | 137 524 | 00:09:04,000 --> 00:09:06,000 525 | 因此,我们将导入QA, 526 | 527 | 138 528 | 00:09:07,000 --> 00:09:09,000 529 | 问题回答,评估链。 530 | 531 | 139 532 | 00:09:09,000 --> 00:09:29,000 533 | 我们将通过语言模型创建此链, 534 | 535 | 140 536 | 00:09:30,000 --> 00:09:31,000 537 | 因为我们将使用语言模型 538 | 539 | 141 540 | 00:09:32,000 --> 00:09:33,000 541 | 来帮助进行评估。 542 | 543 | 142 544 | 00:09:34,000 --> 00:09:35,000 545 | 然后我们将在此链上调用evaluate。 546 | 547 | 143 548 | 00:09:35,000 --> 00:09:38,000 549 | 我们将传入示例和预测, 550 | 551 | 144 552 | 00:09:39,000 --> 00:09:40,000 553 | 然后我们将得到一堆分级输出。 554 | 555 | 145 556 | 00:09:41,000 --> 00:09:42,000 557 | 因此,为了看到每个示例的情况, 558 | 559 | 153 560 | 00:10:07,000 --> 00:10:08,000 561 | 我们将循环遍历它们。 562 | 563 | 154 564 | 00:10:09,000 --> 00:10:10,000 565 
| 我们将打印出问题。 566 | 567 | 155 568 | 00:10:11,000 --> 00:10:12,000 569 | 而且,这是由语言模型生成的。 570 | 571 | 156 572 | 00:10:13,000 --> 00:10:14,000 573 | 我们将打印出真正的答案。 574 | 575 | 157 576 | 00:10:15,000 --> 00:10:17,000 577 | 而且,这也是由语言模型生成的。 578 | 579 | 158 580 | 00:10:18,000 --> 00:10:19,000 581 | 当它面前有整个文档时,它可以生成一个真实的答案。 582 | 583 | 159 584 | 00:10:20,000 --> 00:10:22,000 585 | 因此,它可以生成一个真实的答案。 586 | 587 | 160 588 | 00:10:23,000 --> 00:10:24,000 589 | 我们将打印出预测的答案。 590 | 591 | 161 592 | 00:10:25,000 --> 00:10:26,000 593 | 这是由语言模型生成的。 594 | 595 | 162 596 | 00:10:27,000 --> 00:10:28,000 597 | 当它进行QA链时, 598 | 599 | 163 600 | 00:10:29,000 --> 00:10:30,000 601 | 当它使用嵌入和向量数据库进行检索时, 602 | 603 | 164 604 | 00:10:30,000 --> 00:10:32,000 605 | 然后将其传递到语言模型中, 606 | 607 | 165 608 | 00:10:33,000 --> 00:10:35,000 609 | 然后尝试猜测预测的答案。 610 | 611 | 166 612 | 00:10:36,000 --> 00:10:37,000 613 | 然后我们还将打印出成绩。 614 | 615 | 167 616 | 00:10:38,000 --> 00:10:40,000 617 | 而且,这也是由语言模型生成的 618 | 619 | 168 620 | 00:10:41,000 --> 00:10:43,000 621 | 当它要求评估链评估正在发生的事情时, 622 | 623 | 169 624 | 00:10:44,000 --> 00:10:45,000 625 | 以及它是否正确或不正确。 626 | 627 | 170 628 | 00:10:46,000 --> 00:10:47,000 629 | 因此,当我们循环遍历所有这些示例并将它们打印出来时, 630 | 631 | 171 632 | 00:10:48,000 --> 00:10:49,000 633 | 我们可以详细了解每个示例。 634 | 635 | 172 636 | 00:10:50,000 --> 00:10:53,000 637 | 对于每个示例,它看起来都是正确的。 638 | 639 | 173 640 | 00:10:54,000 --> 00:10:56,000 641 | 这是一个相对简单的检索问题, 642 | 643 | 174 644 | 00:11:00,000 --> 00:11:01,000 645 | 所以这是令人放心的。 646 | 647 | 175 648 | 00:11:02,000 --> 00:11:04,000 649 | 那么,让我们看看第一个例子。 650 | 651 | 176 652 | 00:11:05,000 --> 00:11:07,000 653 | 这里的问题是,舒适的套头衫套装 654 | 655 | 178 656 | 00:11:08,000 --> 00:11:09,000 657 | 有侧口袋吗? 
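(补充示例)上面评估循环的骨架可以用纯 Python 示意如下;fake_grader 用朴素的子串包含来近似真实实现中由语言模型完成的语义比较,这也正好说明了为什么精确字符串匹配并不够用:

```python
# 纯 Python 示意:对每个示例打印 问题 / 真实答案 / 预测答案 / 成绩。
examples = [
    {"query": "这个舒适的套头衫套装有侧口袋吗?", "answer": "有侧口袋。"},
]
predictions = [
    {"result": "舒适的套头衫套装(条纹款)确实有侧口袋。"},
]

def fake_grader(truth: str, predicted: str) -> str:
    # 占位打分器:真实实现中由语言模型判断语义是否一致
    return "CORRECT" if truth.strip("。") in predicted else "INCORRECT"

for ex, pred in zip(examples, predictions):
    print("问题:", ex["query"])
    print("真实答案:", ex["answer"])
    print("预测答案:", pred["result"])
    print("成绩:", fake_grader(ex["answer"], pred["result"]))
```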
658 | 659 | 179 660 | 00:11:10,000 --> 00:11:11,000 661 | 真正的答案,我们创建了这个,是肯定的。 662 | 663 | 180 664 | 00:11:12,000 --> 00:11:14,000 665 | 预测的答案,语言模型产生的, 666 | 667 | 181 668 | 00:11:15,000 --> 00:11:17,000 669 | 是舒适的套头衫套装条纹 670 | 671 | 182 672 | 00:11:18,000 --> 00:11:19,000 673 | 确实有侧口袋。 674 | 675 | 183 676 | 00:11:20,000 --> 00:11:22,000 677 | 因此,我们可以理解这是一个正确的答案。 678 | 679 | 184 680 | 00:11:22,000 --> 00:11:25,000 681 | 实际上,语言模型也是这样, 682 | 683 | 185 684 | 00:11:26,000 --> 00:11:27,000 685 | 它将其评为正确。 686 | 687 | 186 688 | 00:11:28,000 --> 00:11:29,000 689 | 但是让我们想想为什么我们需要使用 690 | 691 | 187 692 | 00:11:30,000 --> 00:11:31,000 693 | 语言模型首先。 694 | 695 | 696 | 697 | 191 698 | 00:11:40,000 --> 00:11:41,000 699 | 我甚至认为“是”的字眼都没有出现过。 700 | 701 | 192 702 | 00:11:42,000 --> 00:11:43,000 703 | 在这个字符串中。 704 | 705 | 193 706 | 00:11:44,000 --> 00:11:45,000 707 | 因此,如果我们尝试进行一些字符串匹配 708 | 709 | 194 710 | 00:11:46,000 --> 00:11:47,000 711 | 或精确匹配,甚至在这里使用一些正则表达式, 712 | 713 | 195 714 | 00:11:48,000 --> 00:11:49,000 715 | 它就不知道该怎么做了。 716 | 717 | 196 718 | 00:11:49,000 --> 00:11:51,000 719 | 它们不是同一件事。 720 | 721 | 197 722 | 00:11:52,000 --> 00:11:53,000 723 | 这显示了使用语言模型进行评估的重要性。 724 | 725 | 198 726 | 00:11:54,000 --> 00:11:55,000 727 | 你有这些答案,它们是任意的字符串。 728 | 729 | 199 730 | 00:11:56,000 --> 00:12:00,000 731 | 没有单一的真实字符串是最好的可能答案。 732 | 733 | 200 734 | 00:12:01,000 --> 00:12:03,000 735 | 有许多不同的变体。 736 | 737 | 201 738 | 00:12:04,000 --> 00:12:05,000 739 | 只要它们具有相同的语义意义, 740 | 741 | 202 742 | 00:12:06,000 --> 00:12:07,000 743 | 它们应该被评为相似。 744 | 745 | 203 746 | 00:12:08,000 --> 00:12:09,000 747 | 这就是语言模型的帮助所在, 748 | 749 | 204 750 | 00:12:10,000 --> 00:12:12,000 751 | 而不是仅仅进行精确匹配。 752 | 753 | 207 754 | 00:12:16,000 --> 00:12:19,000 755 | 这种比较字符串的困难是使语言模型评估变得如此困难的原因。 756 | 757 | 208 758 | 00:12:20,000 --> 00:12:23,000 759 | 我们正在将它们用于这些非常开放的任务 760 | 761 | 209 762 | 00:12:24,000 --> 00:12:26,000 763 | 其中它们被要求生成文本。 764 | 765 | 210 766 | 00:12:27,000 --> 00:12:28,000 767 | 这以前并没有真正做过, 768 | 769 | 
211 770 | 00:12:29,000 --> 00:12:30,000 771 | 因为直到最近的模型还不够好 772 | 773 | 212 774 | 00:12:31,000 --> 00:12:32,000 775 | 来做到这一点。 776 | 777 | 213 778 | 00:12:33,000 --> 00:12:34,000 779 | 因此,到目前为止存在的许多评估指标都不够好。 780 | 781 | 214 782 | 00:12:35,000 --> 00:12:36,000 783 | 我们不得不发明新的指标 784 | 785 | 215 786 | 00:12:37,000 --> 00:12:38,000 787 | 和发明新的启发式方法来做到这一点。 788 | 789 | 216 790 | 00:12:39,000 --> 00:12:40,000 791 | 目前最有趣和最受欢迎的 792 | 793 | 219 794 | 00:12:44,000 --> 00:12:46,000 795 | 这些启发式方法之一 796 | 797 | 220 798 | 00:12:47,000 --> 00:12:49,000 799 | 实际上是使用语言模型进行评估。 800 | 801 | 221 802 | 00:12:50,000 --> 00:12:51,000 803 | 这结束了评估课程, 804 | 805 | 222 806 | 00:12:52,000 --> 00:12:53,000 807 | 但我想向您展示的最后一件事 808 | 809 | 223 810 | 00:12:54,000 --> 00:12:55,000 811 | 是LangChain评估平台。 812 | 813 | 224 814 | 00:12:56,000 --> 00:12:58,000 815 | 这是一种在笔记本中执行我们刚刚执行的所有操作的方法,但是将其持久化并在UI中显示。 816 | 817 | 225 818 | 00:12:59,000 --> 00:13:01,000 819 | 因此,让我们来看看。 820 | 821 | 230 822 | 00:13:09,000 --> 00:13:13,000 823 | 我们在笔记本中运行的所有运行。 824 | 825 | 231 826 | 00:13:14,000 --> 00:13:16,000 827 | 因此,这是跟踪输入和输出的好方法 828 | 829 | 232 830 | 00:13:17,000 --> 00:13:18,000 831 | 在高层次上,但这也是一种非常好的方式 832 | 833 | 233 834 | 00:13:19,000 --> 00:13:21,000 835 | 看看底层到底发生了什么。 836 | 837 | 234 838 | 00:13:22,000 --> 00:13:24,000 839 | 因此,这是在打开调试模式时打印出的相同信息 840 | 841 | 235 842 | 00:13:25,000 --> 00:13:26,000 843 | 在一个 UI 中可视化 844 | 845 | 236 846 | 00:13:27,000 --> 00:13:28,000 847 | 在一个更好的方式。 848 | 849 | 237 850 | 00:13:29,000 --> 00:13:31,000 851 | 因此,我们可以看到链的输入 852 | 853 | 238 854 | 00:13:32,000 --> 00:13:33,000 855 | 和每个步骤的链的输出。 856 | 857 | 239 858 | 00:13:34,000 --> 00:13:35,000 859 | 然后我们可以点击越来越深 860 | 861 | 240 862 | 00:13:36,000 --> 00:13:37,000 863 | 进入链并查看更多信息 864 | 865 | 241 866 | 00:13:37,000 --> 00:13:39,000 867 | 关于实际传递的内容。 868 | 869 | 242 870 | 00:13:40,000 --> 00:13:41,000 871 | 因此,如果我们一直走到底部, 872 | 873 | 243 874 | 00:13:42,000 --> 00:13:43,000 875 | 我们现在可以看到正在传递什么 876 | 877 | 244 878 | 00:13:44,000 --> 
00:13:45,000 879 | 确切地到聊天模型。 880 | 881 | 245 882 | 00:13:46,000 --> 00:13:47,000 883 | 我们在这里有系统消息。 884 | 885 | 246 886 | 00:13:48,000 --> 00:13:49,000 887 | 我们在这里有人类问题。 888 | 889 | 247 890 | 00:13:50,000 --> 00:13:51,000 891 | 我们在这里有聊天模型的响应。 892 | 893 | 248 894 | 00:13:52,000 --> 00:13:53,000 895 | 我们有一些输出元数据。 896 | 897 | 249 898 | 00:13:54,000 --> 00:13:55,000 899 | 我们在这里添加的另一件事 900 | 901 | 250 902 | 00:13:56,000 --> 00:13:58,000 903 | 是能够将这些示例添加到数据集中。 904 | 905 | 251 906 | 00:13:59,000 --> 00:14:01,000 907 | 因此,如果您记得,当我们创建时 908 | 909 | 252 910 | 00:14:02,000 --> 00:14:03,000 911 | 那些示例数据集在开始时, 912 | 913 | 253 914 | 00:14:04,000 --> 00:14:05,000 915 | 我们部分手动创建, 916 | 917 | 254 918 | 00:14:05,000 --> 00:14:07,000 919 | 部分使用语言模型。 920 | 921 | 255 922 | 00:14:08,000 --> 00:14:09,000 923 | 在这里,我们可以通过单击此小按钮将其添加到数据集中, 924 | 925 | 256 926 | 00:14:10,000 --> 00:14:11,000 927 | 现在我们有输入查询 928 | 929 | 257 930 | 00:14:12,000 --> 00:14:13,000 931 | 和输出结果。 932 | 933 | 258 934 | 00:14:14,000 --> 00:14:15,000 935 | 因此,我们可以创建一个数据集。 936 | 937 | 261 938 | 00:14:20,000 --> 00:14:22,000 939 | 我们可以称其为深度学习。 940 | 941 | 262 942 | 00:14:25,000 --> 00:14:26,000 943 | 然后我们可以开始添加示例 944 | 945 | 263 946 | 00:14:27,000 --> 00:14:28,000 947 | 到这个数据集中。 948 | 949 | 264 950 | 00:14:29,000 --> 00:14:30,000 951 | 因此,回到最初的事情 952 | 953 | 265 954 | 00:14:31,000 --> 00:14:32,000 955 | 我们在课程开始时解决的问题, 956 | 957 | 266 958 | 00:14:32,000 --> 00:14:34,000 959 | 我们需要创建这些数据集 960 | 961 | 267 962 | 00:14:35,000 --> 00:14:36,000 963 | 以便我们可以进行评估。 964 | 965 | 268 966 | 00:14:37,000 --> 00:14:38,000 967 | 这是一种非常好的方式 968 | 969 | 269 970 | 00:14:39,000 --> 00:14:40,000 971 | 只是在后台运行。 972 | 973 | 270 974 | 00:14:41,000 --> 00:14:42,000 975 | 随着时间的推移,不断添加示例数据集 976 | 977 | 271 978 | 00:14:43,000 --> 00:14:44,000 979 | 开始建立这些示例 980 | 981 | 272 982 | 00:14:45,000 --> 00:14:46,000 983 | 您可以开始用于评估的示例 984 | 985 | 273 986 | 00:14:46,000 --> 00:15:06,000 987 | 并启动评估的飞轮。 
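(补充示例)"随时间把运行中的输入/输出积累成示例数据集"的思路,可以用纯 Python 的本地 JSONL 文件粗略示意;文件名与字段均为假设,并非评估平台的真实接口:

```python
# 纯 Python 示意:把每次运行的输入/输出追加到本地 JSONL 数据集。
import json
import os
import tempfile

def add_to_dataset(path: str, query: str, result: str) -> None:
    # 每行一个 JSON 对象,便于随时间不断追加示例
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"query": query, "result": result}, ensure_ascii=False) + "\n")

path = os.path.join(tempfile.gettempdir(), "deep_learning_examples.jsonl")
open(path, "w", encoding="utf-8").close()  # 演示前清空文件
add_to_dataset(path, "这个舒适的套头衫套装有侧口袋吗?", "有侧口袋。")
add_to_dataset(path, "这件夹克来自哪个系列?", "Down Tech 系列。")

with open(path, encoding="utf-8") as f:
    dataset = [json.loads(line) for line in f]
print(len(dataset))
```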
-------------------------------------------------------------------------------- /chinese/LangChain_L1.srt: 1 | 2 | 3 | 1 4 | 00:00:00,000 --> 00:00:08,720 5 | 在第一课中,我们将涵盖模型、提示和解析器。 6 | 7 | 2 8 | 00:00:08,720 --> 00:00:13,320 9 | 因此,模型是指支撑许多内容的语言模型。 10 | 11 | 3 12 | 00:00:13,320 --> 00:00:18,680 13 | 提示是指创建输入以传递到模型的样式。 14 | 15 | 4 16 | 00:00:18,680 --> 00:00:20,400 17 | 然后解析器位于相反的端点。 18 | 19 | 5 20 | 00:00:20,400 --> 00:00:24,360 21 | 它涉及将这些模型的输出解析为更结构化的格式,以便您可以在下游执行操作。 22 | 23 | 6 24 | 00:00:24,360 --> 00:00:27,360 25 | 因此,当您使用llm构建应用程序时, 26 | 27 | 7 28 | 00:00:27,360 --> 00:00:30,640 29 | 它们通常是可重用的模型。 30 | 31 | 8 32 | 00:00:30,640 --> 00:00:32,600 33 | 我们反复提示模型, 34 | 35 | 9 36 | 00:00:32,600 --> 00:00:34,160 37 | 解析输出,因此LangChain提供了一组易于使用的抽象来执行此类操作。 38 | 39 | 10 40 | 00:00:34,160 --> 00:00:36,520 41 | 因此,让我们开始看一下模型、提示和解析器。 42 | 43 | 11 44 | 00:00:36,520 --> 00:00:39,760 45 | 因此,为了开始, 46 | 47 | 12 48 | 00:00:39,760 --> 00:00:44,640 49 | 这里有一些起始代码。 50 | 51 | 13 52 | 00:00:44,640 --> 00:00:46,160 53 | 我要导入OS, 54 | 55 | 14 56 | 00:00:46,160 --> 00:00:47,640 57 | 导入OpenAI,并加载我的OpenAI密钥。 58 | 59 | 15 60 | 00:00:47,640 --> 00:00:48,840 61 | OpenAI库已经安装在 62 | 63 | 16 64 | 00:00:48,840 --> 00:00:53,240 65 | 我的Jupyter笔记本环境中,因此如果您在本地运行此代码, 66 | 67 | 17 68 | 00:00:53,240 --> 00:00:56,400 69 | 并且您尚未安装OpenAI, 70 | 71 | 18 72 | 00:00:56,400 --> 00:00:59,920 73 | 您可能需要运行它。 74 | 75 | 19 76 | 00:00:59,920 --> 00:01:01,960 77 | !pip install openai,但我不会在这里这样做。 78 | 79 | 20 80 | 00:01:01,960 --> 00:01:04,000 81 | 然后这是一个辅助函数。 82 | 83 | 21 84 | 00:01:04,000 --> 00:01:06,840 85 | 这实际上与您可能在 86 | 87 | 22 88 | 00:01:06,840 --> 00:01:08,720 89 | 《ChatGPT Prompt Engineering for Developers》课程中看到的辅助函数非常相似, 90 | 91 | 23 92 | 00:01:08,720 --> 00:01:10,280 93 | 所以使用这个辅助函数, 94 | 95 | 24 96 | 00:01:10,280 --> 00:01:13,960 97 | 您可以说get completion on, 98 | 99 | 25 100 | 00:01:13,960 --> 00:01:17,240 101 | 什么是1加1,这将调用ChatGPT或技术上的模型, 102 | 103 | 26 
104 | 00:01:17,240 --> 00:01:20,360 105 | GPT 3.5 Turbo,以便您可以得到这样的答案。 106 | 107 | 27 108 | 00:01:20,360 --> 00:01:22,160 109 | 因此,为了引出LangChain中模型、提示和解析器的抽象, 110 | 111 | 28 112 | 00:01:22,160 --> 00:01:25,200 113 | 让我们假设您收到一封来自客户的电子邮件,该电子邮件不是英语。 114 | 115 | 29 116 | 00:01:25,200 --> 00:01:31,120 117 | 为了确保这是可访问的,我将使用英语海盗语。 118 | 119 | 30 120 | 00:01:31,120 --> 00:01:35,400 121 | 当客户说Arrr时, 122 | 123 | 36 124 | 00:01:57,120 --> 00:02:02,280 125 | 我会因为搅拌机盖子飞出去,把我的厨房墙壁弄得满是果汁而感到愤怒。 126 | 127 | 37 128 | 00:02:02,280 --> 00:02:06,120 129 | 更糟糕的是,保修不包括清洁厨房的费用。 130 | 131 | 38 132 | 00:02:06,120 --> 00:02:08,000 133 | 我现在需要你的帮助,伙计。 134 | 135 | 39 136 | 00:02:08,000 --> 00:02:12,400 137 | 所以我们将要做的是请求这个LLM以平静和尊重的口吻将文本翻译成美式英语。 138 | 139 | 40 140 | 00:02:12,400 --> 00:02:18,080 141 | 所以我将把风格设置为平静和尊重的美式英语。 142 | 143 | 41 144 | 00:02:18,080 --> 00:02:22,520 145 | 为了实现这一点,我将使用一个f字符串来指定提示和指令。 146 | 147 | 42 148 | 00:02:22,520 --> 00:02:26,080 149 | 如果你之前看过一些提示,这可能看起来很熟悉。 150 | 151 | 43 152 | 00:02:26,080 --> 00:02:29,080 153 | 我将指定把用三个反引号括起来的文本翻译成指定的风格, 154 | 155 | 44 156 | 00:02:29,080 --> 00:02:33,040 157 | 然后插入这两个变量。 158 | 159 | 45 160 | 00:02:33,040 --> 00:02:38,160 161 | 这样就生成了一个提示,说要翻译文本等等。 162 | 163 | 46 164 | 00:02:38,160 --> 00:02:39,880 165 | 我鼓励你暂停视频并运行代码,尝试修改提示以查看是否可以获得不同的输出。 166 | 167 | 47 168 | 00:02:39,880 --> 00:02:46,200 169 | 然后,你可以提示大型语言模型以获得响应。 170 | 171 | 48 172 | 00:02:46,200 --> 00:02:49,840 173 | 让我们看看响应是什么。 174 | 175 | 53 176 | 00:03:04,320 --> 00:03:07,880 177 | 说将英语海盗的消息翻译成这个非常礼貌的语气。 178 | 179 | 54 180 | 00:03:07,880 --> 00:03:10,680 181 | 我真的很沮丧,因为我的搅拌机盖子飞了出去, 182 | 183 | 55 184 | 00:03:10,680 --> 00:03:13,480 185 | 把我的厨房墙壁弄得满是果汁等等。 186 | 187 | 56 188 | 00:03:13,480 --> 00:03:18,120 189 | 嗯,我现在真的需要你的帮助,我的朋友。听起来非常好。 190 | 191 | 57 192 | 00:03:18,120 --> 00:03:23,160 193 | 因此,如果你有不同语言的不同客户撰写评论, 194 | 195 | 58 196 | 00:03:23,160 --> 00:03:26,880 197 | 不仅仅是英语海盗,还有法语、德语、日语等等, 198 | 199 | 59 200 | 00:03:26,880 --> 00:03:29,800 201 | 你可以想象需要生成一整个提示序列来生成这样的翻译。 202
| 203 | 60 204 | 00:03:29,800 --> 00:03:33,920 205 | 让我们看看如何使用LangChain更方便地完成这项工作。 206 | 207 | 61 208 | 00:03:33,920 --> 00:03:39,360 209 | 我将导入ChatOpenAI。这是LangChain对ChatGPT API端点的抽象。 210 | 211 | 62 212 | 00:03:39,360 --> 00:03:44,360 213 | 因此,如果我设置chat等于ChatOpenAI并查看chat是什么, 214 | 215 | 63 216 | 00:03:44,360 --> 00:03:49,320 217 | 它将创建这个对象,使用ChatGPT模型,也称为GPT 3.5 Turbo。 218 | 219 | 64 220 | 00:03:49,320 --> 00:03:53,840 221 | 当我构建应用程序时, 222 | 223 | 69 224 | 00:04:04,560 --> 00:04:09,560 225 | 我经常做的一件事是将温度参数设置为0。 226 | 227 | 70 228 | 00:04:09,560 --> 00:04:11,800 229 | 所以默认温度为0.7。 230 | 231 | 71 232 | 00:04:11,800 --> 00:04:20,040 233 | 但让我重新设置温度为0.0。 234 | 235 | 72 236 | 00:04:20,040 --> 00:04:25,400 237 | 现在将温度设置为0,以使我们输出的随机性稍微减少一些。 238 | 239 | 73 240 | 00:04:26,800 --> 00:04:30,960 241 | 现在让我按如下方式定义模板字符串。 242 | 243 | 74 244 | 00:04:30,960 --> 00:04:35,000 245 | 将由三个反引号分隔的文本翻译成指定样式。 246 | 247 | 75 248 | 00:04:35,000 --> 00:04:36,800 249 | 然后这里是文本。 250 | 251 | 76 252 | 00:04:36,800 --> 00:04:40,320 253 | 为了反复重用这个模板, 254 | 255 | 77 256 | 00:04:40,320 --> 00:04:46,200 257 | 让我们导入LangChain的ChatPromptTemplate(聊天提示模板)。 258 | 259 | 78 260 | 00:04:46,200 --> 00:04:54,840 261 | 然后让我使用我们刚刚编写的模板字符串创建一个提示模板。 262 | 263 | 79 264 | 00:04:54,840 --> 00:05:01,560 265 | 从提示模板中, 266 | 267 | 80 268 | 00:05:01,560 --> 00:05:06,120 269 | 你实际上可以提取原始提示,并意识到 270 | 271 | 81 272 | 00:05:06,120 --> 00:05:10,520 273 | 这个提示有两个输入变量,样式和文本, 274 | 275 | 82 276 | 00:05:10,520 --> 00:05:16,040 277 | 这些用花括号表示。 278 | 279 | 83 280 | 00:05:16,040 --> 00:05:20,200 281 | 这里也是我们指定的原始模板。 282 | 283 | 84 284 | 00:05:20,200 --> 00:05:22,760 285 | 事实上,如果我打印出来, 286 | 287 | 85 288 | 00:05:22,760 --> 00:05:27,800 289 | 它意识到它有两个输入变量,样式和文本。 290 | 291 | 86 292 | 00:05:27,800 --> 00:05:30,200 293 | 现在让我们指定样式。 294 | 295 | 87 296 | 00:05:30,200 --> 00:05:31,960 297 | 这是我想要的样式, 298 | 299 | 88 300 | 00:05:31,960 --> 00:05:33,800 301 | 将客户消息翻译成该样式。 302 | 303 | 89 304 | 00:05:33,800 --> 00:05:36,320 305 | 所以我要称之为客户样式。 306 | 307 | 90 308
| 00:05:36,320 --> 00:05:44,120 309 | 这是我之前的同一个客户电子邮件。 310 | 311 | 91 312 | 00:05:44,120 --> 00:05:50,960 313 | 现在,如果我创建 314 | 315 | 92 316 | 00:05:50,960 --> 00:05:55,520 317 | 客户消息,这将生成提示, 318 | 319 | 93 320 | 00:05:55,520 --> 00:05:59,560 321 | 并将在一分钟内传递给大型语言模型以获得响应。 322 | 323 | 94 324 | 00:05:59,560 --> 00:06:01,880 325 | 所以如果你想看看类型, 326 | 327 | 95 328 | 00:06:01,880 --> 00:06:04,400 329 | 客户消息实际上是一个列表。 330 | 331 | 96 332 | 00:06:04,400 --> 00:06:10,760 333 | 如果你看一下列表的第一个元素, 334 | 335 | 97 336 | 00:06:10,760 --> 00:06:16,880 337 | 这或多或少就是你期望它创建的提示。 338 | 339 | 98 340 | 00:06:16,880 --> 00:06:20,440 341 | 最后,让我们将此提示传递给LLM。 342 | 343 | 99 344 | 00:06:20,440 --> 00:06:22,560 345 | 所以我要调用聊天, 346 | 347 | 100 348 | 00:06:22,560 --> 00:06:25,040 349 | 我们之前设置的, 350 | 351 | 101 352 | 00:06:25,040 --> 00:06:28,480 353 | 作为OpenAI ChatGPT端点的引用。 354 | 355 | 102 356 | 00:06:28,480 --> 00:06:36,320 357 | 如果我们打印出客户响应的内容, 358 | 359 | 103 360 | 00:06:36,320 --> 00:06:38,800 361 | 那么它会给你返回,um, 362 | 363 | 104 364 | 00:06:38,800 --> 00:06:45,000 365 | 这段文本是从英语海盗语翻译成礼貌的美式英语。 366 | 367 | 105 368 | 00:06:45,000 --> 00:06:47,840 369 | 当然,你可以想象其他使用情况, 370 | 371 | 106 372 | 00:06:47,840 --> 00:06:53,400 373 | 客户的电子邮件是其他语言,这也可以用来 374 | 375 | 107 376 | 00:06:53,400 --> 00:06:58,400 377 | 翻译消息,以便英语为母语的人理解并回复。 378 | 379 | 108 380 | 00:06:58,400 --> 00:07:02,280 381 | 我鼓励你暂停视频并运行代码,还可以 382 | 383 | 109 384 | 00:07:02,280 --> 00:07:06,280 385 | 尝试修改提示,看看是否可以获得不同的输出。 386 | 387 | 110 388 | 00:07:06,280 --> 00:07:09,240 389 | 现在,让我们希望我们的客服代表 390 | 391 | 111 392 | 00:07:09,240 --> 00:07:11,800 393 | 用他们的原始语言回复客户。 394 | 395 | 112 396 | 00:07:11,800 --> 00:07:16,160 397 | 所以,让我们假设英语为母语的客服代表写了这个并说, 398 | 399 | 113 400 | 00:07:16,160 --> 00:07:18,240 401 | 嘿,客户,保修不包括, 402 | 403 | 114 404 | 00:07:18,240 --> 00:07:20,280 405 | 你的厨房清洁费,因为这是你的错, 406 | 407 | 115 408 | 00:07:20,280 --> 00:07:23,520 409 | 你忘记盖上盖子,误用了你的搅拌机。 410 | 411 | 116 412 | 00:07:23,520 --> 00:07:24,920 413 | 很遗憾,再见。 414 | 415 | 117 416 |
00:07:24,920 --> 00:07:26,320 417 | 不是很礼貌的消息, 418 | 419 | 118 420 | 00:07:26,320 --> 00:07:31,560 421 | 但是,让我们假设这是客服代表想要的。 422 | 423 | 119 424 | 00:07:31,720 --> 00:07:36,040 425 | 我们将指定 426 | 427 | 120 428 | 00:07:36,040 --> 00:07:39,480 429 | 服务消息将被翻译成这种海盗风格。 430 | 431 | 121 432 | 00:07:39,480 --> 00:07:45,120 433 | 所以我们希望它以礼貌的语气用英语海盗语说话。 434 | 435 | 122 436 | 00:07:45,120 --> 00:07:48,080 437 | 因为我们之前创建了那个提示模板, 438 | 439 | 123 440 | 00:07:48,080 --> 00:07:52,520 441 | 很酷的是我们现在可以重复使用那个提示模板并指定 442 | 443 | 124 444 | 00:07:52,520 --> 00:07:58,240 445 | 我们想要的输出样式是这个服务风格的海盗和这个服务回复的文本。 446 | 447 | 125 448 | 00:07:58,240 --> 00:08:01,240 449 | 如果我们这样做, 450 | 451 | 126 452 | 00:08:01,800 --> 00:08:05,200 453 | 那就是提示。 454 | 455 | 127 456 | 00:08:05,760 --> 00:08:09,160 457 | 如果我们提示, 458 | 459 | 128 460 | 00:08:09,160 --> 00:08:13,040 461 | 聊天GPT,这是它给我们的回应。 462 | 463 | 129 464 | 00:08:13,040 --> 00:08:18,080 465 | 啊,那里的伙计,我必须友好地告诉你,保修不包括 466 | 467 | 130 468 | 00:08:18,080 --> 00:08:21,200 469 | 你的厨房清洁费等等。 470 | 471 | 131 472 | 00:08:21,200 --> 00:08:23,520 473 | 是的,很遗憾,再见我的心爱的。 474 | 475 | 132 476 | 00:08:23,520 --> 00:08:27,600 477 | 所以你可能会想知道为什么我们使用提示模板而不是, 478 | 479 | 133 480 | 00:08:27,600 --> 00:08:28,920 481 | 你知道,只是一个F字符串? 
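上面提出的"为什么用提示模板而不是f字符串"这个问题,可以用纯Python做一个概念性示意。注意:这只是示意,并非LangChain中ChatPromptTemplate的真实实现;模板文字沿用课程中的翻译示例,format_prompt是这里虚构的辅助函数:

```python
# 概念示意:可复用的提示模板 vs 一次性的f字符串。
# 用str.format模拟模板填充;LangChain的ChatPromptTemplate
# 还会跟踪输入变量并区分消息角色(system/human等)。

TRANSLATE_TEMPLATE = (
    "Translate the text that is delimited by triple backticks "
    "into a style that is {style}. text: ```{text}```"
)

def format_prompt(template: str, **kwargs) -> str:
    """填充模板中的命名变量(此辅助函数为本示例虚构)。"""
    return template.format(**kwargs)

# 同一个模板被复用于两个不同方向的翻译任务,
# 这正是模板相对于散落各处的f字符串的优势。
prompt_1 = format_prompt(
    TRANSLATE_TEMPLATE,
    style="American English in a calm and respectful tone",
    text="Arrr, me blender lid flew off!",
)
prompt_2 = format_prompt(
    TRANSLATE_TEMPLATE,
    style="a polite tone that speaks in English Pirate",
    text="The warranty does not cover cleaning expenses.",
)
```

提示一旦变长、变复杂,把它集中放在一个命名模板里,就比在代码各处重复拼f字符串容易维护得多。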
482 | 483 | 134 484 | 00:08:28,920 --> 00:08:32,480 485 | 答案是,随着你构建复杂的应用程序, 486 | 487 | 135 488 | 00:08:32,480 --> 00:08:35,360 489 | 提示可能会非常长和详细。 490 | 491 | 136 492 | 00:08:35,360 --> 00:08:42,440 493 | 因此,提示模板是一个有用的抽象,可以帮助您在可以重复使用好的提示时重复使用它们。 494 | 495 | 137 496 | 00:08:42,440 --> 00:08:46,760 497 | 嗯,这是一个相对较长的提示示例,用于在线学习应用程序中对学生提交的作业进行评分。 498 | 499 | 138 500 | 00:08:46,760 --> 00:08:52,000 501 | 像这样的提示可能会很长,您可以要求LLM首先解决问题,然后以特定格式输出。 502 | 503 | 139 504 | 00:08:52,000 --> 00:08:57,560 505 | 将其包装在LangChain提示中可以更轻松地重用此类提示。 506 | 507 | 140 508 | 00:08:57,560 --> 00:09:02,600 509 | 此外,您稍后会看到LangChain为一些常见操作提供提示, 510 | 511 | 141 512 | 00:09:02,600 --> 00:09:08,720 513 | 例如摘要或问题回答或连接到SQL数据库, 514 | 515 | 142 516 | 00:09:08,720 --> 00:09:14,640 517 | 或连接到不同的API。 518 | 519 | 143 520 | 00:09:14,640 --> 00:09:20,520 521 | 因此,通过使用一些LangChain内置的提示, 522 | 523 | 144 524 | 00:09:20,520 --> 00:09:22,280 525 | 您可以快速使应用程序运行而无需自己设计提示。 526 | 527 | 145 528 | 00:09:22,280 --> 00:09:25,880 529 | LangChain提示库的另一个方面是它还支持输出解析, 530 | 531 | 146 532 | 00:09:25,880 --> 00:09:29,640 533 | 我们将在一分钟内介绍。 534 | 535 | 147 536 | 00:09:29,640 --> 00:09:31,880 537 | 但是,当您使用LLM构建复杂应用程序时, 538 | 539 | 148 540 | 00:09:31,880 --> 00:09:37,840 541 | 通常会指示LLM以特定格式生成其输出, 542 | 543 | 149 544 | 00:09:37,840 --> 00:09:40,600 545 | 例如使用特定关键字。 546 | 547 | 150 548 | 00:09:40,600 --> 00:09:42,920 549 | 左侧的示例说明了使用LLM执行一种称为思维链推理的东西, 550 | 551 | 151 552 | 00:09:42,920 --> 00:09:46,840 553 | 使用ReAct框架。 554 | 555 | 152 556 | 00:09:46,840 --> 00:09:52,600 557 | 但是不要担心技术细节, 558 | 559 | 153 560 | 00:09:52,600 --> 00:09:55,240 561 | 但关键是LLM正在思考什么, 562 | 563 | 154 564 | 00:09:55,240 --> 00:10:00,680 565 | 因为给LLM思考的空间, 566 | 567 | 155 568 | 00:10:00,680 --> 00:10:06,160 569 | 它通常可以得出更准确的结论。 570 | 571 | 156 572 | 00:10:06,160 --> 00:10:09,280 573 | 然后将动作作为关键字执行特定操作, 574 | 575 | 157 576 | 00:10:09,280 --> 00:10:15,560 577 | 然后观察以显示从该操作中学到的内容,依此类推。 578 | 579 | 158 580 | 00:10:15,560 --> 00:10:18,160 581 | 如果您有一个提示,指示LLM使用这些特定关键字, 582
| 583 | 159 584 | 00:10:18,160 --> 00:10:21,240 585 | 思考,动作和观察, 586 | 587 | 160 588 | 00:10:21,240 --> 00:10:25,520 589 | 那么这个提示可以与解析器配合使用, 590 | 591 | 161 592 | 00:10:25,520 --> 00:10:31,240 593 | 以提取已标记为这些特定关键字的文本。 594 | 595 | 162 596 | 00:10:31,240 --> 00:10:37,480 597 | 因此,这一起为指定LLM的输入提供了非常好的抽象, 598 | 599 | 163 600 | 00:10:37,480 --> 00:10:39,920 601 | 然后还可以使用解析器正确解释LLM给出的输出。 602 | 603 | 168 604 | 00:11:01,040 --> 00:11:08,680 605 | 因此,让我们回到使用LangChain的输出解析器的示例。 606 | 607 | 169 608 | 00:11:08,680 --> 00:11:14,160 609 | 在这个例子中,让我们看一下如何使用LLM输出JSON, 610 | 611 | 170 612 | 00:11:14,160 --> 00:11:17,280 613 | 并使用LangChain解析该输出。 614 | 615 | 171 616 | 00:11:17,280 --> 00:11:23,440 617 | 我将使用一个产品评论的运行示例来提取信息并以JSON格式格式化输出。 618 | 619 | 172 620 | 00:11:23,440 --> 00:11:28,800 621 | 这是一个期望的输出示例。 622 | 623 | 173 624 | 00:11:28,800 --> 00:11:33,920 625 | 这实际上是一个Python字典, 626 | 627 | 174 628 | 00:11:33,920 --> 00:11:36,720 629 | 其中gift,即是否为礼物, 630 | 631 | 175 632 | 00:11:36,720 --> 00:11:38,960 633 | 为False,交付所需的天数为五天, 634 | 635 | 176 636 | 00:11:38,960 --> 00:11:41,840 637 | 价格值相当实惠。 638 | 639 | 177 640 | 00:11:41,840 --> 00:11:44,440 641 | 这是一个期望的输出示例。 642 | 643 | 178 644 | 00:11:44,440 --> 00:11:48,280 645 | 以下是客户评论以及尝试获得JSON输出的模板。 646 | 647 | 179 648 | 00:11:48,280 --> 00:11:50,720 649 | 以下是一个客户评论。 650 | 651 | 180 652 | 00:11:50,720 --> 00:11:57,160 653 | 它说,这个吹叶机非常惊人。 654 | 655 | 181 656 | 00:11:57,160 --> 00:11:58,520 657 | 它有四个设置,蜡烛吹风机, 658 | 659 | 182 660 | 00:11:58,520 --> 00:12:00,600 661 | 温柔的微风,风城和龙卷风。 662 | 663 | 183 664 | 00:12:00,600 --> 00:12:02,480 665 | 它在两天内到达,正好是我妻子的周年礼物。 666 | 667 | 184 668 | 00:12:02,480 --> 00:12:04,800 669 | 我认为我妻子非常喜欢它,喜欢得说不出话来。 670 | 671 | 185 672 | 00:12:04,800 --> 00:12:08,640 673 | 到目前为止,我是唯一使用它的人,等等。 674 | 675 | 186 676 | 00:12:08,640 --> 00:12:15,520 677 | 以下是评论模板。 678 | 679 | 187 680 | 00:12:15,520 --> 00:12:18,080 681 | 对于以下文本,提取以下信息, 682 | 683 | 188 684 | 00:12:18,080 --> 00:12:20,040 685 | 指明它是否是一个礼物(gift)。 686 | 687 | 189 688 |
00:12:20,040 --> 00:12:21,680 689 | 在这种情况下,是的, 690 | 691 | 190 692 | 00:12:21,680 --> 00:12:22,840 693 | 因为这是一个礼物。 694 | 695 | 191 696 | 00:12:22,840 --> 00:12:25,640 697 | 还有交付天数。 698 | 699 | 192 700 | 00:12:25,640 --> 00:12:27,160 701 | 需要多长时间才能交付? 702 | 703 | 193 704 | 00:12:27,160 --> 00:12:29,640 705 | 看起来在这种情况下,它在两天内到达。 706 | 707 | 194 708 | 00:12:29,640 --> 00:12:32,040 709 | 还有,价格值是多少? 710 | 711 | 195 712 | 00:12:32,040 --> 00:12:35,640 713 | 比其他吹叶机稍微贵一些,等等。 714 | 715 | 196 716 | 00:12:35,640 --> 00:12:42,920 717 | 因此,评论模板要求LLM以客户评论作为输入, 718 | 719 | 197 720 | 00:12:42,920 --> 00:12:45,920 721 | 并提取这三个字段, 722 | 723 | 198 724 | 00:12:45,920 --> 00:12:48,360 725 | 然后将输出格式化为JSON, 726 | 727 | 199 728 | 00:12:48,360 --> 00:12:52,000 729 | 使用以下键。 730 | 731 | 200 732 | 00:12:52,000 --> 00:12:56,400 733 | 好的。 734 | 735 | 201 736 | 00:12:56,400 --> 00:13:01,080 737 | 以下是如何在LangChain中包装它。 738 | 739 | 204 740 | 00:13:01,080 --> 00:13:02,920 741 | 让我们导入聊天提示模板。 742 | 743 | 205 744 | 00:13:02,920 --> 00:13:04,760 745 | 实际上我们之前已经导入过了。 746 | 747 | 206 748 | 00:13:04,760 --> 00:13:06,520 749 | 所以从技术上讲,这一行是多余的, 750 | 751 | 207 752 | 00:13:06,520 --> 00:13:08,360 753 | 但我会再次导入它, 754 | 755 | 208 756 | 00:13:08,360 --> 00:13:10,680 757 | 然后从上面的评论模板创建提示模板。 758 | 759 | 209 760 | 00:13:10,680 --> 00:13:16,040 761 | 这是提示模板。 762 | 763 | 210 764 | 00:13:16,040 --> 00:13:19,480 765 | 现在,类似于我们早期使用提示模板的方式, 766 | 767 | 211 768 | 00:13:19,480 --> 00:13:23,680 769 | 让我们创建要传递给OpenAI端点的消息。 770 | 771 | 212 772 | 00:13:23,680 --> 00:13:32,000 773 | 创建OpenAI端点,调用该端点,然后让我们打印出响应。 774 | 775 | 213 776 | 00:13:32,000 --> 00:13:34,760 777 | 我鼓励您暂停视频并运行代码。 778 | 779 | 214 780 | 00:13:34,760 --> 00:13:39,000 781 | 然后就是这样了。 782 | 783 | 215 784 | 00:13:39,000 --> 00:13:42,960 785 | 它说gift为True,交货时间为两天, 786 | 787 | 216 788 | 00:13:44,440 --> 00:13:46,520 789 | 价格值看起来也相当准确。 790 | 791 | 217 792 | 00:13:46,520 --> 00:13:49,000 793 | 但请注意,如果我们检查响应的类型, 794 | 795 | 218 796 | 00:13:49,000 --> 00:13:52,920 797 |
这实际上是一个字符串。 798 | 799 | 219 800 | 00:13:52,920 --> 00:14:02,280 801 | 看起来像JSON,看起来有键值对, 802 | 803 | 220 804 | 00:14:02,280 --> 00:14:04,040 805 | 但它实际上不是一个字典。 806 | 807 | 221 808 | 00:14:04,040 --> 00:14:07,960 809 | 这只是一个很长的字符串。 810 | 811 | 222 812 | 00:14:07,960 --> 00:14:09,480 813 | 所以我真正想做的是去响应内容, 814 | 815 | 223 816 | 00:14:09,480 --> 00:14:11,920 817 | 并从gift键中获取值,这应该是True。 818 | 819 | 224 820 | 00:14:11,920 --> 00:14:14,720 821 | 但如果我运行这个,这应该会生成一个错误,因为,嗯, 822 | 823 | 225 824 | 00:14:14,720 --> 00:14:17,560 825 | 这实际上是一个字符串,这不是一个Python字典。 826 | 827 | 226 828 | 00:14:17,560 --> 00:14:21,080 829 | 所以让我们看看如何使用LangChain的解析器来做到这一点。 830 | 831 | 227 832 | 00:14:21,080 --> 00:14:24,040 833 | 我将从LangChain导入ResponseSchema(响应模式)和StructuredOutputParser(结构化输出解析器)。 834 | 835 | 228 836 | 00:14:24,040 --> 00:14:27,680 837 | 并且,我将通过指定这些响应模式来告诉它我想要解析什么。 838 | 839 | 229 840 | 00:14:27,680 --> 00:14:31,200 841 | 所以gift模式被命名为gift, 842 | 843 | 230 844 | 00:14:31,200 --> 00:14:39,560 845 | 这是描述:购买的物品是否作为礼物送给别人?如果是,回答True;如果不是或未知,回答False,等等。 846 | 847 | 231 848 | 00:14:39,560 --> 00:14:46,400 849 | 所以我有一个gift模式, 850 | 851 | 232 852 | 00:14:46,400 --> 00:14:49,080 853 | 交货日期模式,价格值模式, 854 | 855 | 233 856 | 00:14:49,080 --> 00:14:50,200 857 | 然后让我们将它们全部放入列表中。 858 | 859 | 234 860 | 00:14:50,200 --> 00:14:52,720 861 | 现在我已经指定了这些模式, 862 | 863 | 241 864 | 00:15:08,760 --> 00:15:12,880 865 | LangChain实际上可以通过输出解析器,直接给你要发送给LLM的格式指令。 866 | 867 | 242 868 | 00:15:12,880 --> 00:15:20,080 869 | 这样,如果我要打印格式指令, 870 | 871 | 243 872 | 00:15:20,080 --> 00:15:24,320 873 | 它给出一套非常精确的格式指令,让LLM生成输出解析器可以处理的输出。 874 | 875 | 244 876 | 00:15:24,840 --> 00:15:29,200 877 | 所以这是新的评论模板。 878 | 879 | 245 880 | 00:15:29,200 --> 00:15:33,640 881 | 评论模板包括LangChain生成的格式指令,因此也可以从评论模板中创建提示, 882 | 883 | 246 884 | 00:15:33,880 --> 00:15:37,400 885 | 然后创建将传递到OpenAI端点的消息。 886 | 887 | 247 888 | 00:15:37,400 --> 00:15:41,440 889 | 如果您想的话,可以查看实际提示, 890 | 891 | 248 892 | 00:15:41,440 --> 00:15:50,720 893 | 它会告诉您如何提取字段,gift、交货天数、价格值。 894 | 895 | 249 896
| 00:15:50,720 --> 00:15:57,960 897 | 这是文本,这是格式化指令。 898 | 899 | 250 900 | 00:15:57,960 --> 00:16:02,240 901 | 最后,如果我们调用OpenAI端点, 902 | 903 | 251 904 | 00:16:02,240 --> 00:16:07,440 905 | 让我们看看我们得到了什么响应。 906 | 907 | 252 908 | 00:16:07,440 --> 00:16:10,400 909 | 现在是这样的。 910 | 911 | 253 912 | 00:16:10,400 --> 00:16:15,400 913 | 现在,如果我们使用之前创建的输出解析器, 914 | 915 | 254 916 | 00:16:17,400 --> 00:16:25,240 917 | 您可以将其解析为输出字典。 918 | 919 | 255 920 | 00:16:25,240 --> 00:16:29,120 921 | 我们的打印应该是这样的。 922 | 923 | 256 924 | 00:16:29,120 --> 00:16:32,520 925 | 请注意,这是字典类型,而不是字符串。 926 | 927 | 257 928 | 00:16:33,320 --> 00:16:38,760 929 | 这就是为什么我现在可以提取与gift键相关联的值并获得True, 930 | 931 | 258 932 | 00:16:38,760 --> 00:16:46,040 933 | 或提取与交货天数相关联的值并获得2。 934 | 935 | 259 936 | 00:16:46,040 --> 00:16:49,080 937 | 或者您还可以提取与价格值相关联的值。 938 | 939 | 260 940 | 00:16:49,080 --> 00:16:57,000 941 | 因此,这是一种巧妙的方法,可以将您的LLM输出解析为Python字典,以使输出在下游处理中更容易使用。 942 | 943 | 261 944 | 00:16:57,000 --> 00:17:03,920 945 | 我鼓励您暂停视频并运行代码。 946 | 947 | 262 948 | 00:17:03,920 --> 00:17:08,640 949 | 这就是模型、提示和解析器。 950 | 951 | 263 952 | 00:17:08,640 --> 00:17:10,800 953 | 有了这些工具,希望您能轻松地重用自己的提示模板, 954 | 955 | 264 956 | 00:17:10,800 --> 00:17:14,040 957 | 与您合作的其他人分享提示模板, 958 | 959 | 265 960 | 00:17:14,040 --> 00:17:20,160 961 | 甚至使用LangChain内置的提示模板,正如您刚才看到的, 962 | 987 | 275 988 | 00:17:45,040 --> 00:17:47,920 989 | 通常可以与输出解析器配对使用。 990 | 991 | 276 992 | 00:17:47,920 --> 00:17:53,280 993 |
这样,提示可以要求LLM以特定格式输出,然后解析器解析该输出,以将数据存储在Python字典或其他数据结构中,以便于下游处理。 994 | 995 | 277 996 | 00:17:53,280 --> 00:17:57,800 997 | 我希望这在您的许多应用程序中都很有用。 998 | 999 | 280 1000 | 00:18:06,080 --> 00:18:10,400 1001 | 有了这个,让我们进入下一个视频,看看LangChain如何帮助您构建更好的聊天机器人,或使LLM的聊天更有效, 1002 | 1003 | 282 1004 | 00:18:16,400 --> 00:18:36,400 1005 | 通过更好地管理它到目前为止所记住的对话。 -------------------------------------------------------------------------------- /english/LangChain_L2.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:18,000 3 | When you interact with these models, naturally, they don't remember what you said before or any of the previous conversation, which is an issue when you're building some applications, like a chatbot, and you want to have a conversation with them. 4 | 5 | 2 6 | 00:00:18,000 --> 00:00:31,000 7 | And so in this section, we'll cover memory, which is basically how do you remember previous parts of the conversation and feed that into the language model so that they can have this conversational flow as you're interacting with them. 8 | 9 | 3 10 | 00:00:31,000 --> 00:00:38,000 11 | Yep. So, LangChain offers multiple sophisticated options for managing these memories. Let's jump in and take a look. 12 | 13 | 4 14 | 00:00:38,000 --> 00:00:48,000 15 | So, let me start off by importing my OpenAI API key and then let me import a few tools that I'll need. 16 | 17 | 5 18 | 00:00:48,000 --> 00:00:55,000 19 | Let's use as the motivating example for memory, using LangChain to manage a chat or a chatbot conversation. 20 | 21 | 6 22 | 00:00:55,000 --> 00:01:09,000 23 | So, to do that, I'm going to set the llm as a chat interface of OpenAI with temperature equals zero. And I'm going to use the memory as a conversation buffer memory. 24 | 25 | 7 26 | 00:01:09,000 --> 00:01:15,000 27 | And you'll see later what this means. And I'm going to build a conversation chain.
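The mechanism being set up here — a stateless LLM that only appears to remember things because a chain feeds the stored history back in — can be sketched in plain Python. This is a conceptual sketch, not LangChain's actual ConversationChain or ConversationBufferMemory implementation, and fake_llm is a hypothetical stand-in for a real ChatOpenAI call:

```python
# Conceptual sketch: a chain that wraps a stateless LLM with buffer memory.
# Every predict() call prepends the full transcript to the prompt, which is
# how the model can "remember" earlier turns.

def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for an OpenAI call: answers the name question
    # only if the name actually appears somewhere in the prompt's history.
    if "what is my name" in prompt.lower() and "Andrew" in prompt:
        return "Your name is Andrew."
    return "Hello! How can I help?"

class BufferMemoryChain:
    def __init__(self, llm):
        self.llm = llm
        self.buffer = []  # full transcript, like a conversation buffer memory

    def predict(self, user_input: str) -> str:
        history = "\n".join(self.buffer)
        prompt = (
            "The following is a friendly conversation between a human and an AI.\n"
            f"Current conversation:\n{history}\nHuman: {user_input}\nAI:"
        )
        reply = self.llm(prompt)
        self.buffer.append(f"Human: {user_input}")
        self.buffer.append(f"AI: {reply}")
        return reply

chain = BufferMemoryChain(fake_llm)
chain.predict("Hi, my name is Andrew")
answer = chain.predict("What is my name?")  # history now contains the name
```

The second call only succeeds because the first exchange is still in the buffer — exactly the behavior the lesson demonstrates with verbose=True.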
28 | 29 | 8 30 | 00:01:15,000 --> 00:01:26,000 31 | Again, later in this short course, Harrison will dive much more deeply into what exactly is a chain in LangChain. So, don't worry too much about the details of the syntax for now. 32 | 33 | 9 34 | 00:01:26,000 --> 00:01:36,000 35 | But this builds an llm. And if I start to have a conversation, conversation.predict, give the input. Hi, my name is Andrew. 36 | 37 | 10 38 | 00:01:36,000 --> 00:01:47,000 39 | Let's see what it says. Hello, Andrew, it's nice to meet you. Right? And so on. And then let's say I ask it, what is one plus one? 40 | 41 | 11 42 | 00:01:47,000 --> 00:01:55,000 43 | Um, one plus one is two. And then ask it again, you know, what's my name? Your name is Andrew, as you mentioned earlier. 44 | 45 | 12 46 | 00:01:55,000 --> 00:02:06,000 47 | Hmm, was there a hint of sarcasm there? Not sure. And so if you want, you can change this verbose variable to true to see what LangChain is actually doing. 48 | 49 | 13 50 | 00:02:06,000 --> 00:02:11,000 51 | When you run predict, hi, my name is Andrew, this is the prompt that LangChain is generating. 52 | 53 | 14 54 | 00:02:11,000 --> 00:02:16,000 55 | It says, the following is a friendly conversation between a human and an AI, the AI is talkative, and so on. 56 | 57 | 15 58 | 00:02:16,000 --> 00:02:26,000 59 | So this is a prompt that LangChain has generated to have the system have a helpful and friendly conversation, and it has the conversation saved so far, and here's the response. 60 | 61 | 16 62 | 00:02:26,000 --> 00:02:35,000 63 | And when you execute this on the second and third parts of the conversations, it keeps the prompt as follows. 64 | 65 | 17 66 | 00:02:35,000 --> 00:02:43,000 67 | And notice that by the time I'm uttering, what is my name? This is the third turn, that's my third input. 68 | 69 | 18 70 | 00:02:43,000 --> 00:02:50,000 71 | It has stored the current conversation as follows. Hi, my name is Andrew, what is one plus one, and so on.
72 | 73 | 19 74 | 00:02:50,000 --> 00:02:57,000 75 | And so this memory or this history of the conversation gets longer and longer. 76 | 77 | 20 78 | 00:02:57,000 --> 00:03:02,000 79 | In fact, up on top, I had used the memory variable to store the memory. 80 | 81 | 21 82 | 00:03:02,000 --> 00:03:08,000 83 | So if I were to print memory.buffer, it has stored the conversation so far. 84 | 85 | 22 86 | 00:03:08,000 --> 00:03:14,000 87 | You can also print this out with memory.load_memory_variables({}). 88 | 89 | 23 90 | 00:03:14,000 --> 00:03:18,000 91 | The curly braces here are actually an empty dictionary. 92 | 93 | 24 94 | 00:03:18,000 --> 00:03:25,000 95 | There's some more advanced features that you can use with a more sophisticated input, but we won't talk about them in this short course. 96 | 97 | 25 98 | 00:03:25,000 --> 00:03:28,000 99 | So don't worry about why there are empty curly braces here. 100 | 101 | 26 102 | 00:03:28,000 --> 00:03:33,000 103 | But this is what LangChain has remembered in the memory of the conversation so far. 104 | 105 | 27 106 | 00:03:33,000 --> 00:03:38,000 107 | It's just everything that the AI or that the human has said. 108 | 109 | 28 110 | 00:03:38,000 --> 00:03:41,000 111 | I encourage you to pause the video and run the code. 112 | 113 | 29 114 | 00:03:41,000 --> 00:03:49,000 115 | So the way that LangChain is storing the conversation is with this conversation buffer memory. 116 | 117 | 30 118 | 00:03:49,000 --> 00:03:55,000 119 | If I were to use the conversation buffer memory to specify a couple of inputs and outputs, 120 | 121 | 31 122 | 00:03:55,000 --> 00:03:59,000 123 | this is how you add new things to the memory if you wish to do so explicitly. 124 | 125 | 32 126 | 00:03:59,000 --> 00:04:03,000 127 | memory.save_context says, hi, what's up? 128 | 129 | 33 130 | 00:04:03,000 --> 00:04:09,000 131 | I know this is not the most exciting conversation, but I wanted to have a short example.
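What save_context and load_memory_variables are doing can be mimicked in a few lines of plain Python. The method names mirror LangChain's ConversationBufferMemory, but this is a conceptual sketch, not the real implementation:

```python
# Conceptual sketch of a conversation buffer memory: save_context appends
# one human/AI exchange; load_memory_variables returns the transcript.

class BufferMemory:
    def __init__(self):
        self._turns = []

    def save_context(self, inputs: dict, outputs: dict) -> None:
        # One exchange = one human utterance plus one AI utterance.
        self._turns.append(f"Human: {inputs['input']}")
        self._turns.append(f"AI: {outputs['output']}")

    def load_memory_variables(self, _: dict) -> dict:
        # The empty dict argument mirrors the {} seen in the lesson.
        return {"history": "\n".join(self._turns)}

memory = BufferMemory()
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
history = memory.load_memory_variables({})["history"]
```

Each additional save_context call simply makes the stored history longer, which is why an unbounded buffer eventually becomes expensive to send to the LLM.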
132 | 133 | 34 134 | 00:04:09,000 --> 00:04:15,000 135 | Um, and with that, this is what the status of the memory is. 136 | 137 | 35 138 | 00:04:15,000 --> 00:04:22,000 139 | And once again, let me actually show the, uh, memory variables. 140 | 141 | 36 142 | 00:04:22,000 --> 00:04:29,000 143 | Now, if you want to add additional, um, data to the memory, you can keep on saving additional context. 144 | 145 | 37 146 | 00:04:29,000 --> 00:04:34,000 147 | So conversation goes on, not much, just hanging, cool. 148 | 149 | 38 150 | 00:04:34,000 --> 00:04:38,000 151 | And if you print out the memory, you know, there's now more stuff in it. 152 | 153 | 39 154 | 00:04:38,000 --> 00:04:46,000 155 | So when you use a large language model for a chat conversation, um, the large language model itself is actually stateless. 156 | 157 | 40 158 | 00:04:46,000 --> 00:04:51,000 159 | The language model itself does not remember the conversation you've had so far. 160 | 161 | 41 162 | 00:04:51,000 --> 00:04:55,000 163 | And each transaction, each call to the API endpoint is independent. 164 | 165 | 42 166 | 00:04:55,000 --> 00:05:07,000 167 | And chatbots appear to have memory only because there's usually wrapper code that provides the full conversation that's been had so far as context to the LLM. 168 | 169 | 43 170 | 00:05:07,000 --> 00:05:14,000 171 | And so the memory can store explicitly the turns or the utterances so far. 172 | 173 | 44 174 | 00:05:14,000 --> 00:05:16,000 175 | Hi, my name is Andrew. Hello Andrew, it's nice to meet you, and so on. 176 | 177 | 45 178 | 00:05:16,000 --> 00:05:30,000 179 | And this memory storage is used as input or additional context to the LLM so that it can generate an output as if it's just having the next conversational turn, knowing what's been said before. 180 | 181 | 46 182 | 00:05:30,000 --> 00:05:37,000 183 | And as the conversation becomes long, the amount of memory needed becomes really, really long.
184 | 185 | 47 186 | 00:05:37,000 --> 00:05:46,000 187 | And thus the cost of sending a lot of tokens to the LLM, which usually charges based on the number of tokens it needs to process, will also become more expensive. 188 | 189 | 48 190 | 00:05:46,000 --> 00:05:54,000 191 | So LangChain provides several convenient kinds of memory to store and accumulate the conversation. 192 | 193 | 49 194 | 00:05:54,000 --> 00:06:00,000 195 | So far, we've been looking at the conversation buffer memory. Let's look at a different type of memory. 196 | 197 | 50 198 | 00:06:00,000 --> 00:06:09,000 199 | I'm going to import the conversation buffer window memory that only keeps a window of memory. 200 | 201 | 51 202 | 00:06:09,000 --> 00:06:20,000 203 | If I set memory to conversation buffer window memory with k equals one, the variable k equals one specifies that I want it to remember just one conversational exchange. 204 | 205 | 52 206 | 00:06:20,000 --> 00:06:25,000 207 | That is one utterance from me and one utterance from the chatbot. 208 | 209 | 53 210 | 00:06:25,000 --> 00:06:31,000 211 | So now if I were to have it save context, hi, what's up, not much, just hanging. 212 | 213 | 54 214 | 00:06:31,000 --> 00:06:38,000 215 | If I were to look at memory dot load memory variables, it only remembers the most recent utterance. 216 | 217 | 55 218 | 00:06:38,000 --> 00:06:45,000 219 | Notice it's dropped. Hi, what's up? It's just saying, human says not much, just hanging, and the AI says cool. 220 | 221 | 56 222 | 00:06:45,000 --> 00:06:48,000 223 | So that's because k was equal to one. 224 | 225 | 57 226 | 00:06:48,000 --> 00:06:56,000 227 | So this is a nice feature because it lets you keep track of just the most recent few conversational turns. 228 | 229 | 58 230 | 00:06:56,000 --> 00:07:03,000 231 | In practice, you probably won't use this with k equals one. You use this with k set to a larger number.
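The window behavior described here — keep only the last k exchanges, drop everything older — can be sketched in plain Python. This is a conceptual sketch of the idea behind ConversationBufferWindowMemory, not the real class:

```python
# Conceptual sketch of a buffer *window* memory: only the last k
# human/AI exchanges are retained; older turns fall out of the window.

class WindowMemory:
    def __init__(self, k: int):
        self.k = k
        self._exchanges = []  # list of (human, ai) pairs

    def save_context(self, inputs: dict, outputs: dict) -> None:
        self._exchanges.append((inputs["input"], outputs["output"]))
        self._exchanges = self._exchanges[-self.k:]  # keep last k only

    def load_memory_variables(self, _: dict) -> dict:
        lines = []
        for human, ai in self._exchanges:
            lines += [f"Human: {human}", f"AI: {ai}"]
        return {"history": "\n".join(lines)}

memory = WindowMemory(k=1)
memory.save_context({"input": "Hi, my name is Andrew"},
                    {"output": "Nice to meet you"})
memory.save_context({"input": "What is 1+1?"}, {"output": "1+1 is 2"})
history = memory.load_memory_variables({})["history"]
# With k=1, only the 1+1 exchange survives; the name has been dropped,
# which is why the chatbot later can't answer "what is my name?".
```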
232 | 233 | 59 234 | 00:07:03,000 --> 00:07:10,000 235 | But still, this prevents the memory from growing without limit as the conversation goes longer. 236 | 237 | 60 238 | 00:07:10,000 --> 00:07:23,000 239 | And so if I were to rerun the conversation that we had just now, we'll say, hi, my name is Andrew. 240 | 241 | 61 242 | 00:07:23,000 --> 00:07:32,000 243 | What is one plus one? And now I ask it, what is my name? 244 | 245 | 62 246 | 00:07:32,000 --> 00:07:37,000 247 | Because k equals one, it only remembers the last exchange, namely: what is one plus one? 248 | 249 | 63 250 | 00:07:37,000 --> 00:07:42,000 251 | The answer is one plus one equals two. And it's forgotten this earlier exchange, so it now says, 252 | 253 | 64 254 | 00:07:42,000 --> 00:07:46,000 255 | sorry, I don't have access to that information. 256 | 257 | 65 258 | 00:07:46,000 --> 00:07:53,000 259 | One thing I hope you will do is pause the video, change this to true in the code on the left, 260 | 261 | 66 262 | 00:07:53,000 --> 00:07:57,000 263 | and rerun this conversation with verbose equals true. 264 | 265 | 67 266 | 00:07:57,000 --> 00:08:00,000 267 | And then you will see the prompts actually used to generate this. 268 | 269 | 68 270 | 00:08:00,000 --> 00:08:08,000 271 | And hopefully you see that the memory, when you're calling the LLM on what is my name, 272 | 273 | 69 274 | 00:08:08,000 --> 00:08:11,000 275 | that the memory has dropped the exchange where it learned what my name is, 276 | 277 | 70 278 | 00:08:11,000 --> 00:08:17,000 279 | which is why it now says it doesn't know what my name is. 280 | 281 | 71 282 | 00:08:17,000 --> 00:08:28,000 283 | With the conversation token buffer memory, the memory will limit the number of tokens saved. 284 | 285 | 72 286 | 00:08:28,000 --> 00:08:39,000 287 | And because a lot of LLM pricing is based on tokens, this maps more directly to the cost of the LLM calls.
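The token-limited trimming just described can be sketched in plain Python. This is a conceptual sketch of the idea behind ConversationTokenBufferMemory, not the real class: the real memory counts tokens with the LLM's own tokenizer (which is why it takes an llm argument), whereas here a "token" is crudely approximated as a whitespace-separated word:

```python
# Conceptual sketch of a token buffer memory: drop the oldest utterances
# until the stored transcript fits under max_token_limit.

def count_tokens(text: str) -> int:
    # Crude stand-in for a model tokenizer: one word = one token.
    return len(text.split())

class TokenBufferMemory:
    def __init__(self, max_token_limit: int):
        self.max_token_limit = max_token_limit
        self._turns = []  # each entry is one "Human: ..." or "AI: ..." line

    def save_context(self, inputs: dict, outputs: dict) -> None:
        self._turns.append(f"Human: {inputs['input']}")
        self._turns.append(f"AI: {outputs['output']}")
        # Trim from the front (oldest first) until under the limit.
        while count_tokens("\n".join(self._turns)) > self.max_token_limit:
            self._turns.pop(0)

    def load_memory_variables(self, _: dict) -> dict:
        return {"history": "\n".join(self._turns)}

memory = TokenBufferMemory(max_token_limit=8)
memory.save_context({"input": "AI is what?"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})
history = memory.load_memory_variables({})["history"]
# The earliest "A" exchange has been chopped off; the most recent
# turns survive, subject to not exceeding the token limit.
```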
288 | 289 | 73 290 | 00:08:39,000 --> 00:08:47,000 291 | So if I were to say the max token limit is equal to 50, and actually let me inject a few comments. 292 | 293 | 74 294 | 00:08:47,000 --> 00:08:51,000 295 | So let's say the conversation is, AI is what? Amazing. 296 | 297 | 75 298 | 00:08:51,000 --> 00:08:54,000 299 | Backpropagation is what? Beautiful. Chatbot is what? Charming. 300 | 301 | 76 302 | 00:08:54,000 --> 00:08:58,000 303 | I use ABC as the first letter of all of these conversational turns. 304 | 305 | 77 306 | 00:08:58,000 --> 00:09:02,000 307 | We can keep track of, um, what was said when. 308 | 309 | 78 310 | 00:09:02,000 --> 00:09:08,000 311 | If I run this with a high token limit, um, it has almost the whole conversation. 312 | 313 | 79 314 | 00:09:08,000 --> 00:09:14,000 315 | If I increase the token limit to 100, it now has the whole conversation. 316 | 317 | 80 318 | 00:09:14,000 --> 00:09:24,000 319 | So I have AI is what? If I decrease it, then, you know, it chops off the earlier parts of this conversation 320 | 321 | 81 322 | 00:09:24,000 --> 00:09:28,000 323 | to retain the number of tokens corresponding to the most recent exchanges, 324 | 325 | 82 326 | 00:09:28,000 --> 00:09:32,000 327 | um, but subject to not exceeding the token limit. 328 | 329 | 83 330 | 00:09:32,000 --> 00:09:35,000 331 | And in case you're wondering why we needed to specify an LLM, 332 | 333 | 84 334 | 00:09:35,000 --> 00:09:39,000 335 | it's because different LLMs use different ways of counting tokens. 336 | 337 | 85 338 | 00:09:39,000 --> 00:09:46,000 339 | So this tells it to use the way of counting tokens that the, um, ChatOpenAI LLM uses. 340 | 341 | 86 342 | 00:09:46,000 --> 00:09:49,000 343 | I encourage you to pause the video and run the code, 344 | 345 | 87 346 | 00:09:49,000 --> 00:09:54,000 347 | and also try modifying the prompt to see if you can get a different output.
348 | 349 | 88 350 | 00:09:54,000 --> 00:09:58,000 351 | Finally, there's one last type of memory I want to illustrate here, 352 | 353 | 89 354 | 00:09:58,000 --> 00:10:04,000 355 | which is the conversation summary buffer memory. 356 | 357 | 90 358 | 00:10:04,000 --> 00:10:12,000 359 | And the idea is instead of limiting the memory to a fixed number of tokens based on the most recent utterances 360 | 361 | 91 362 | 00:10:12,000 --> 00:10:15,000 363 | or a fixed number of conversational exchanges, 364 | 365 | 92 366 | 00:10:15,000 --> 00:10:24,000 367 | let's use an LLM to write a summary of the conversation so far and let that be the memory. 368 | 369 | 93 370 | 00:10:24,000 --> 00:10:29,000 371 | So here's an example where I'm going to create a long string with someone's schedule. 372 | 373 | 94 374 | 00:10:29,000 --> 00:10:31,000 375 | You know, there's a meeting at 8am with your product team, 376 | 377 | 95 378 | 00:10:31,000 --> 00:10:33,000 379 | you'll need your PowerPoint presentation and so on and so on. 380 | 381 | 96 382 | 00:10:33,000 --> 00:10:38,000 383 | So this is a long string saying what's your schedule, you know, 384 | 385 | 97 386 | 00:10:38,000 --> 00:10:42,000 387 | maybe ending with a noon lunch at the Italian restaurant with a customer, 388 | 389 | 98 390 | 00:10:42,000 --> 00:10:46,000 391 | bring your laptop to show the latest LLM demo. 392 | 393 | 99 394 | 00:10:46,000 --> 00:10:53,000 395 | And so let me use a conversation summary buffer memory, um, 396 | 397 | 100 398 | 00:10:53,000 --> 00:10:58,000 399 | with a max token limit of 400 in this case, pretty high token limit. 400 | 401 | 101 402 | 00:10:58,000 --> 00:11:05,000 403 | And I'm going to insert a few conversational turns in which we start with, 404 | 405 | 102 406 | 00:11:05,000 --> 00:11:10,000 407 | hello, what's up, not much, just hanging, uh, cool.
408 | 409 | 103 410 | 00:11:10,000 --> 00:11:17,000 411 | And then what is on the schedule today and the response is, you know, that long schedule. 412 | 413 | 104 414 | 00:11:17,000 --> 00:11:22,000 415 | So this memory now has quite a lot of text in it. 416 | 417 | 105 418 | 00:11:22,000 --> 00:11:26,000 419 | In fact, let's take a look at the memory variables. 420 | 421 | 106 422 | 00:11:26,000 --> 00:11:37,000 423 | It contains that entire, um, piece of text because 400 tokens was sufficient to store all this text. 424 | 425 | 107 426 | 00:11:37,000 --> 00:11:43,000 427 | But now if I were to reduce the max token limit, say to 100 tokens, 428 | 429 | 108 430 | 00:11:43,000 --> 00:11:46,000 431 | remember this stores the entire conversational history. 432 | 433 | 109 434 | 00:11:46,000 --> 00:11:50,000 435 | If I reduce the number of tokens to 100, 436 | 437 | 110 438 | 00:11:50,000 --> 00:11:57,000 439 | then the conversation summary buffer memory has actually used an LLM, 440 | 441 | 111 442 | 00:11:57,000 --> 00:12:01,000 443 | the OpenAI endpoint in this case, because that's what we have set the LLM to, 444 | 445 | 112 446 | 00:12:01,000 --> 00:12:05,000 447 | to actually generate a summary of the conversation so far. 448 | 449 | 113 450 | 00:12:05,000 --> 00:12:09,000 451 | So the summary is: the human and AI engaged in small talk before discussing the day's schedule, 452 | 453 | 114 454 | 00:12:09,000 --> 00:12:12,000 455 | and it informs the human of the morning meeting, blah, blah, blah, 456 | 457 | 115 458 | 00:12:12,000 --> 00:12:17,000 459 | um, lunch meeting with customer interested in AI, latest AI developments. 460 | 461 | 116 462 | 00:12:17,000 --> 00:12:28,000 463 | And so if we were to have a conversation using this LLM, 464 | 465 | 117 466 | 00:12:28,000 --> 00:12:33,000 467 | then create a conversation chain, same as before.
468 | 469 | 118 470 | 00:12:33,000 --> 00:12:41,000 471 | And, um, let's say that we were to ask, you know, input, what would be a good demo to show? 472 | 473 | 119 474 | 00:12:41,000 --> 00:12:43,000 475 | Um, I said verbose equals true. 476 | 477 | 120 478 | 00:12:43,000 --> 00:12:53,000 479 | So here's the prompt, the LLM thinks the current conversation has had this discussion so far, 480 | 481 | 121 482 | 00:12:53,000 --> 00:12:56,000 483 | because that's the summary of the conversation. 484 | 485 | 122 486 | 00:12:56,000 --> 00:13:03,000 487 | And just one note, if you're familiar with the OpenAI chat API endpoint, 488 | 489 | 123 490 | 00:13:03,000 --> 00:13:07,000 491 | there is a specific system message. 492 | 493 | 124 494 | 00:13:07,000 --> 00:13:11,000 495 | In this example, this is not using the official OpenAI system message. 496 | 497 | 125 498 | 00:13:11,000 --> 00:13:14,000 499 | It's just including it as part of the prompt here. 500 | 501 | 126 502 | 00:13:14,000 --> 00:13:16,000 503 | But it nonetheless works pretty well. 504 | 505 | 127 506 | 00:13:16,000 --> 00:13:21,000 507 | And given this prompt, you know, the LLM outputs: based on the customer being interested in AI developments, 508 | 509 | 128 510 | 00:13:21,000 --> 00:13:24,000 511 | it suggests showcasing our latest NLP capabilities. 512 | 513 | 129 514 | 00:13:24,000 --> 00:13:26,000 515 | Okay, that's cool. 516 | 517 | 130 518 | 00:13:26,000 --> 00:13:31,000 519 | Um, well, it's, you know, making some suggestions for cool demos, 520 | 521 | 131 522 | 00:13:31,000 --> 00:13:35,000 523 | and makes you think if I was meeting a customer, I would say, 524 | 525 | 132 526 | 00:13:35,000 --> 00:13:43,000 527 | boy, if only there were an open source framework available to help me build cool NLP applications using LLMs. 528 | 529 | 133 530 | 00:13:43,000 --> 00:13:46,000 531 | Good thing there's LangChain.
532 | 533 | 134 534 | 00:13:46,000 --> 00:13:58,000 535 | Um, and the interesting thing is, if you now look at what has happened to the memory. 536 | 537 | 135 538 | 00:13:58,000 --> 00:14:04,000 539 | So notice that, um, here it has incorporated the most recent AI system output, 540 | 541 | 136 542 | 00:14:04,000 --> 00:14:11,000 543 | whereas my utterance asking what would be a good demo to show has been incorporated into the system message. 544 | 545 | 137 546 | 00:14:11,000 --> 00:14:14,000 547 | Um, you know, the overall summary of the conversation so far. 548 | 549 | 138 550 | 00:14:14,000 --> 00:14:17,000 551 | With the conversation summary buffer memory, 552 | 553 | 139 554 | 00:14:17,000 --> 00:14:27,000 555 | what it tries to do is keep the explicit storage of the messages up to the number of tokens we have specified as a limit. 556 | 557 | 140 558 | 00:14:27,000 --> 00:14:30,000 559 | So, you know, this part, the explicit storage, 560 | 561 | 141 562 | 00:14:30,000 --> 00:14:34,000 563 | we're trying to cap at 100 tokens because that's what we're asking for. 564 | 565 | 142 566 | 00:14:34,000 --> 00:14:38,000 567 | And then anything beyond that, it will use the LLM to generate a summary, 568 | 569 | 143 570 | 00:14:38,000 --> 00:14:41,000 571 | which is what is seen up here.
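A rough sketch of the summary-buffer idea above: explicit message storage is capped at a token limit, and anything that overflows is folded into a running summary. The `summarize` function here is a crude placeholder (it just concatenates and truncates); in LangChain the summary is generated by the LLM you pass in, and the word-count tokenizer is likewise a stand-in assumption.

```python
# Conceptual sketch of a summary-buffer memory: explicit storage is capped,
# and overflow messages are folded into a running "summary".
# summarize() is a placeholder -- LangChain would call an LLM here.

def count_tokens(text):
    return len(text.split())

def summarize(previous_summary, dropped_messages):
    # Stand-in for an LLM call that condenses the dropped turns.
    dropped = " ".join(text for _, text in dropped_messages)
    return (previous_summary + " " + dropped).strip()[:80]

class SummaryBufferMemory:
    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.messages = []

    def save_context(self, human, ai):
        self.messages.append(("Human", human))
        self.messages.append(("AI", ai))
        dropped = []
        # Evict the oldest explicit messages once over the token cap...
        while sum(count_tokens(t) for _, t in self.messages) > self.max_token_limit and self.messages:
            dropped.append(self.messages.pop(0))
        # ...and fold them into the summary instead of discarding them.
        if dropped:
            self.summary = summarize(self.summary, dropped)

    def load_memory_variables(self):
        history = "\n".join(f"{r}: {t}" for r, t in self.messages)
        if self.summary:
            history = f"System: {self.summary}\n" + history
        return {"history": history}

memory = SummaryBufferMemory(max_token_limit=6)
memory.save_context("Hello", "What's up")
memory.save_context("Not much, just hanging", "Cool")
print(memory.load_memory_variables()["history"])
```

With a generous cap everything stays verbatim; with a small cap the early small talk migrates into the `System:` summary line while the latest exchange stays explicit, mirroring the behavior described above.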
572 | 573 | 144 574 | 00:14:41,000 --> 00:14:46,000 575 | And even though I've illustrated these different memories using a chat as a running example, 576 | 577 | 145 578 | 00:14:46,000 --> 00:14:49,000 579 | these memories are useful for other applications too, 580 | 581 | 146 582 | 00:14:49,000 --> 00:14:54,000 583 | where you might keep on getting new snippets of text or keep on getting new information, 584 | 585 | 147 586 | 00:14:54,000 --> 00:14:59,000 587 | such as if your system repeatedly goes online to search for facts, 588 | 589 | 148 590 | 00:14:59,000 --> 00:15:04,000 591 | but you want to keep the total memory used to store this growing list of facts as, you know, 592 | 593 | 149 594 | 00:15:04,000 --> 00:15:07,000 595 | capped and not growing arbitrarily long. 596 | 597 | 150 598 | 00:15:07,000 --> 00:15:11,000 599 | I encourage you to pause the video and run the code. 600 | 601 | 151 602 | 00:15:11,000 --> 00:15:15,000 603 | In this video, you saw a few types of memory, um, 604 | 605 | 152 606 | 00:15:15,000 --> 00:15:21,000 607 | including buffer memories that limit based on number of conversation exchanges or tokens, 608 | 609 | 153 610 | 00:15:21,000 --> 00:15:26,000 611 | or a memory that can summarize tokens above a certain limit. 612 | 613 | 154 614 | 00:15:26,000 --> 00:15:30,000 615 | LangChain actually supports additional memory types as well. 616 | 617 | 155 618 | 00:15:30,000 --> 00:15:33,000 619 | One of the most powerful is vector data memory. 620 | 621 | 156 622 | 00:15:33,000 --> 00:15:36,000 623 | If you're familiar with word embeddings and text embeddings, 624 | 625 | 157 626 | 00:15:36,000 --> 00:15:39,000 627 | the vector database actually stores such embeddings. 628 | 629 | 158 630 | 00:15:39,000 --> 00:15:41,000 631 | If you don't know what that means, don't worry about it. 632 | 633 | 159 634 | 00:15:41,000 --> 00:15:43,000 635 | Harrison will explain it later.
636 | 637 | 160 638 | 00:15:43,000 --> 00:15:51,000 639 | And it can then retrieve the most relevant blocks of text using this type of vector database for its memory. 640 | 641 | 161 642 | 00:15:51,000 --> 00:15:54,000 643 | And LangChain also supports entity memories, 644 | 645 | 162 646 | 00:15:54,000 --> 00:15:58,000 647 | which is applicable when you want it to remember details about specific people, 648 | 649 | 163 650 | 00:15:58,000 --> 00:16:04,000 651 | specific other entities, such as if you talk about a specific friend, 652 | 653 | 164 654 | 00:16:04,000 --> 00:16:08,000 655 | you can have LangChain remember facts about that friend, 656 | 657 | 165 658 | 00:16:08,000 --> 00:16:12,000 659 | which would be an entity in an explicit way. 660 | 661 | 166 662 | 00:16:12,000 --> 00:16:14,000 663 | When you're implementing applications using LangChain, 664 | 665 | 167 666 | 00:16:14,000 --> 00:16:17,000 667 | you can also use multiple types of memories, 668 | 669 | 168 670 | 00:16:17,000 --> 00:16:22,000 671 | such as using one of the types of conversation memory that you saw in this video. 672 | 673 | 169 674 | 00:16:22,000 --> 00:16:26,000 675 | Plus additionally, entity memory to recall individuals. 676 | 677 | 170 678 | 00:16:26,000 --> 00:16:30,000 679 | So this way it can remember maybe a summary of the conversation, 680 | 681 | 171 682 | 00:16:30,000 --> 00:16:35,000 683 | plus an explicit way of storing important facts about important people in the conversation. 684 | 685 | 172 686 | 00:16:35,000 --> 00:16:38,000 687 | And of course, in addition to using these memory types, 688 | 689 | 173 690 | 00:16:38,000 --> 00:16:43,000 691 | it's also not uncommon for developers to store the entire conversation in a conventional database, 692 | 693 | 174 694 | 00:16:43,000 --> 00:16:46,000 695 | some sort of key value store or SQL database.
696 | 697 | 175 698 | 00:16:46,000 --> 00:16:51,000 699 | So you could refer back to the whole conversation for auditing or for improving the system further. 700 | 701 | 176 702 | 00:16:51,000 --> 00:16:53,000 703 | And so that's memory types. 704 | 705 | 177 706 | 00:16:53,000 --> 00:16:57,000 707 | I hope you find this useful building your own applications. 708 | 709 | 178 710 | 00:16:57,000 --> 00:17:21,000 711 | And now let's go on to the next video to learn about the key building block of LangChain, namely the chain. 712 | 713 | 714 | -------------------------------------------------------------------------------- /english/LangChain_L4.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:01,000 --> 00:00:15,000 3 | One of the most common complex applications that people are building using an llm is a system that can answer questions on top of or about a document. 4 | 5 | 2 6 | 00:00:15,000 --> 00:00:24,000 7 | So given a piece of text, maybe extracted from a PDF file or from a webpage or from some company's intranet internal document collection, 8 | 9 | 3 10 | 00:00:24,000 --> 00:00:33,000 11 | can you use an llm to answer questions about the content of those documents to help users gain a deeper understanding and get access to the information that they need? 12 | 13 | 4 14 | 00:00:33,000 --> 00:00:39,000 15 | This is really powerful because it starts to combine these language models with data that they weren't originally trained on. 16 | 17 | 5 18 | 00:00:39,000 --> 00:00:42,000 19 | So it makes them much more flexible and adaptable to your use case. 20 | 21 | 6 22 | 00:00:42,000 --> 00:00:48,000 23 | It's also really exciting because we'll start to move beyond language models, prompts, and output parsers, 24 | 25 | 7 26 | 00:00:48,000 --> 00:00:54,000 27 | and start introducing some more of the key components of LangChain, such as embedding models and vector stores.
28 | 29 | 8 30 | 00:00:54,000 --> 00:00:58,000 31 | As Andrew mentioned, this is one of the more popular chains that we've got, so I hope you're excited. 32 | 33 | 9 34 | 00:00:58,000 --> 00:01:03,000 35 | In fact, embeddings and vector stores are some of the most powerful modern techniques. 36 | 37 | 10 38 | 00:01:03,000 --> 00:01:08,000 39 | So if you have not seen them yet, they are very much worth learning about. 40 | 41 | 11 42 | 00:01:08,000 --> 00:01:10,000 43 | So with that, let's dive in. 44 | 45 | 12 46 | 00:01:10,000 --> 00:01:11,000 47 | Let's do it. 48 | 49 | 13 50 | 00:01:11,000 --> 00:01:16,000 51 | So we're going to start by importing the environment variables as we always do. 52 | 53 | 14 54 | 00:01:16,000 --> 00:01:20,000 55 | Now we're going to import some things that will help us when building this chain. 56 | 57 | 15 58 | 00:01:20,000 --> 00:01:22,000 59 | We're going to import the retrieval QA chain. 60 | 61 | 16 62 | 00:01:22,000 --> 00:01:24,000 63 | This will do retrieval over some documents. 64 | 65 | 17 66 | 00:01:24,000 --> 00:01:28,000 67 | We're going to import our favorite ChatOpenAI language model. 68 | 69 | 18 70 | 00:01:28,000 --> 00:01:29,000 71 | We're going to import a document loader. 72 | 73 | 19 74 | 00:01:29,000 --> 00:01:34,000 75 | This is going to be used to load some proprietary data that we're going to combine with the language model. 76 | 77 | 20 78 | 00:01:34,000 --> 00:01:36,000 79 | In this case, it's going to be in a CSV. 80 | 81 | 21 82 | 00:01:36,000 --> 00:01:39,000 83 | So we're going to import the CSV loader. 84 | 85 | 22 86 | 00:01:39,000 --> 00:01:41,000 87 | Finally, we're going to import a vector store. 88 | 89 | 23 90 | 00:01:41,000 --> 00:01:45,000 91 | There are many different types of vector stores, and we'll cover what exactly these are later on. 92 | 93 | 24 94 | 00:01:45,000 --> 00:01:49,000 95 | But we're going to get started with the DocArray in-memory search vector store.
96 | 97 | 25 98 | 00:01:49,000 --> 00:01:51,000 99 | This is really nice because it's an in-memory vector store, 100 | 101 | 26 102 | 00:01:51,000 --> 00:01:55,000 103 | and it doesn't require connecting to an external database of any kind, 104 | 105 | 27 106 | 00:01:55,000 --> 00:01:57,000 107 | so it makes it really easy to get started. 108 | 109 | 28 110 | 00:01:57,000 --> 00:01:59,000 111 | We're also going to import display and markdown, 112 | 113 | 29 114 | 00:01:59,000 --> 00:02:04,000 115 | two common utilities for displaying information in Jupyter Notebooks. 116 | 117 | 30 118 | 00:02:04,000 --> 00:02:10,000 119 | We've provided a CSV of outdoor clothing that we're going to use to combine with the language model. 120 | 121 | 31 122 | 00:02:10,000 --> 00:02:18,000 123 | Here we're going to initialize a loader, the CSV loader, with a path to this file. 124 | 125 | 32 126 | 00:02:18,000 --> 00:02:22,000 127 | We're next going to import an index, the vector store index creator. 128 | 129 | 33 130 | 00:02:22,000 --> 00:02:26,000 131 | This will help us create a vector store really easily. 132 | 133 | 34 134 | 00:02:26,000 --> 00:02:34,000 135 | As we can see below, it will only be a few lines of code to create this. 136 | 137 | 35 138 | 00:02:34,000 --> 00:02:37,000 139 | To create it, we're going to specify two things. 140 | 141 | 36 142 | 00:02:37,000 --> 00:02:40,000 143 | First, we're going to specify the vector store class. 144 | 145 | 37 146 | 00:02:40,000 --> 00:02:42,000 147 | As mentioned before, we're going to use this vector store, 148 | 149 | 38 150 | 00:02:42,000 --> 00:02:46,000 151 | as it's a particularly easy one to get started with. 152 | 153 | 39 154 | 00:02:46,000 --> 00:02:49,000 155 | After it's been created, we're then going to call from loaders, 156 | 157 | 40 158 | 00:02:49,000 --> 00:02:51,000 159 | which takes in a list of document loaders. 
160 | 161 | 41 162 | 00:02:51,000 --> 00:02:58,000 163 | We've only got one loader that we really care about, so that's what we're passing in here. 164 | 165 | 42 166 | 00:02:58,000 --> 00:03:02,000 167 | It's now been created, and we can start to ask questions about it. 168 | 169 | 43 170 | 00:03:02,000 --> 00:03:07,000 171 | Below we'll cover what exactly happened under the hood, so let's not worry about that for now. 172 | 173 | 44 174 | 00:03:07,000 --> 00:03:09,000 175 | Here we'll start with a query. 176 | 177 | 45 178 | 00:03:09,000 --> 00:03:17,000 179 | We'll then create a response using index query and pass in this query. 180 | 181 | 46 182 | 00:03:17,000 --> 00:03:21,000 183 | Again, we'll cover what's going on under the hood down below. 184 | 185 | 47 186 | 00:03:21,000 --> 00:03:30,000 187 | For now, we'll just wait for it to respond. 188 | 189 | 48 190 | 00:03:30,000 --> 00:03:34,000 191 | After it finishes, we can now take a look at what exactly was returned. 192 | 193 | 49 194 | 00:03:34,000 --> 00:03:41,000 195 | We've gotten back a table in Markdown with names and descriptions for all shirts with sun protection. 196 | 197 | 50 198 | 00:03:41,000 --> 00:03:45,000 199 | We've also got a nice little summary that the language model has provided us. 200 | 201 | 51 202 | 00:03:45,000 --> 00:03:48,000 203 | So we've gone over how to do question answering over your documents, 204 | 205 | 52 206 | 00:03:48,000 --> 00:03:52,000 207 | but what exactly is going on underneath the hood? 208 | 209 | 53 210 | 00:03:52,000 --> 00:03:54,000 211 | First, let's think about the general idea. 212 | 213 | 54 214 | 00:03:54,000 --> 00:03:58,000 215 | We want to use language models and combine it with a lot of our documents, 216 | 217 | 55 218 | 00:03:58,000 --> 00:04:03,000 219 | but there's a key issue. Language models can only inspect a few thousand words at a time. 
220 | 221 | 56 222 | 00:04:03,000 --> 00:04:10,000 223 | So if we have really large documents, how can we get the language model to answer questions about everything that's in there? 224 | 225 | 57 226 | 00:04:10,000 --> 00:04:14,000 227 | This is where embeddings and vector stores come into play. 228 | 229 | 58 230 | 00:04:14,000 --> 00:04:17,000 231 | First, let's talk about embeddings. 232 | 233 | 59 234 | 00:04:17,000 --> 00:04:21,000 235 | Embeddings create numerical representations for pieces of text. 236 | 237 | 60 238 | 00:04:21,000 --> 00:04:27,000 239 | This numerical representation captures the semantic meaning of the piece of text that it's been run over. 240 | 241 | 61 242 | 00:04:27,000 --> 00:04:31,000 243 | Pieces of text with similar content will have similar vectors. 244 | 245 | 62 246 | 00:04:31,000 --> 00:04:35,000 247 | This lets us compare pieces of text in the vector space. 248 | 249 | 63 250 | 00:04:35,000 --> 00:04:38,000 251 | In the example below, we can see that we have three sentences. 252 | 253 | 64 254 | 00:04:38,000 --> 00:04:43,000 255 | The first two are about pets, while the third is about a car. 256 | 257 | 65 258 | 00:04:43,000 --> 00:04:46,000 259 | If we look at the representation in the numeric space, 260 | 261 | 66 262 | 00:04:46,000 --> 00:04:54,000 263 | we can see that when we compare the two vectors on the pieces of text corresponding to the sentences about pets, they're very similar. 264 | 265 | 67 266 | 00:04:54,000 --> 00:04:58,000 267 | While if we compare it to the one that talks about a car, they're not similar at all. 268 | 269 | 68 270 | 00:04:58,000 --> 00:05:02,000 271 | This will let us easily figure out which pieces of text are like each other, 272 | 273 | 69 274 | 00:05:02,000 --> 00:05:10,000 275 | which will be very useful as we think about which pieces of text we want to include when passing to the language model to answer a question. 
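The pets-versus-car comparison above can be made concrete with cosine similarity. The tiny four-dimensional vectors below are hand-made stand-ins for real embedding-model output (real embeddings have hundreds or thousands of dimensions); only the sentences come from the example above.

```python
# Conceptual sketch of comparing text embeddings: sentences with similar
# content get similar vectors, so their cosine similarity is high.
# The 4-d vectors are hand-made stand-ins for a real embedding model.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

embeddings = {
    "My dog Rover likes to chase squirrels.": [0.9, 0.8, 0.1, 0.0],
    "Fluffy, my cat, refuses to eat from a can.": [0.8, 0.9, 0.0, 0.1],
    "The Chevy Bolt accelerates to 60 mph in 6.7 seconds.": [0.1, 0.0, 0.9, 0.8],
}

pets1, pets2, car = embeddings.values()
print(cosine_similarity(pets1, pets2))  # high: both sentences are about pets
print(cosine_similarity(pets1, car))    # low: a pet sentence vs. a car sentence
```

This "similar content, similar vectors" property is what lets a query vector pick out the relevant documents in the next step.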
276 | 277 | 70 278 | 00:05:10,000 --> 00:05:13,000 279 | The next component that we're going to cover is the vector database. 280 | 281 | 71 282 | 00:05:13,000 --> 00:05:18,000 283 | A vector database is a way to store these vector representations that we created in the previous step. 284 | 285 | 72 286 | 00:05:18,000 --> 00:05:24,000 287 | The way that we create this vector database is we populate it with chunks of text coming from incoming documents. 288 | 289 | 73 290 | 00:05:24,000 --> 00:05:28,000 291 | When we get a big incoming document, we're first going to break it up into smaller chunks. 292 | 293 | 74 294 | 00:05:28,000 --> 00:05:33,000 295 | This helps create pieces of text that are smaller than the original document, 296 | 297 | 75 298 | 00:05:33,000 --> 00:05:37,000 299 | which is useful because we may not be able to pass the whole document to the language model. 300 | 301 | 76 302 | 00:05:37,000 --> 00:05:43,000 303 | So we want to create these small chunks so we can only pass the most relevant ones to the language model. 304 | 305 | 77 306 | 00:05:43,000 --> 00:05:48,000 307 | We then create an embedding for each of these chunks, and then we store those in a vector database. 308 | 309 | 78 310 | 00:05:48,000 --> 00:05:51,000 311 | That's what happens when we create the index. 312 | 313 | 79 314 | 00:05:51,000 --> 00:05:58,000 315 | Now that we've got this index, we can use it during runtime to find the pieces of text most relevant to an incoming query. 316 | 317 | 80 318 | 00:05:58,000 --> 00:06:02,000 319 | When a query comes in, we first create an embedding for that query. 320 | 321 | 81 322 | 00:06:02,000 --> 00:06:07,000 323 | We then compare it to all the vectors in the vector database, and we pick the n most similar. 324 | 325 | 82 326 | 00:06:07,000 --> 00:06:14,000 327 | These are then returned, and we can pass those in the prompt to the language model to get back a final answer. 
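The whole flow just described — embed the chunks, store them, then embed the incoming query and return the n most similar chunks — fits in a small sketch. The bag-of-words "embedding" and the product snippets here are toy stand-ins for a real embedding model and catalog.

```python
# Conceptual sketch of a vector store: chunks are embedded and stored; a
# query is embedded the same way and the n most similar chunks come back.
# The "embedding" is a toy bag-of-words vector, not a real model.
import math
from collections import Counter

VOCAB = ["shirt", "sun", "protection", "fleece", "warm", "pants", "trail"]

def embed(text):
    counts = Counter(text.lower().split())
    return [counts[w] for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryVectorStore:
    def __init__(self):
        self.entries = []  # (vector, chunk) pairs

    def add_texts(self, chunks):
        for chunk in chunks:
            self.entries.append((embed(chunk), chunk))

    def similarity_search(self, query, n=2):
        qv = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(qv, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:n]]

store = InMemoryVectorStore()
store.add_texts([
    "sun shield shirt with upf sun protection",
    "cozy fleece pullover warm and soft",
    "trail pants durable for the trail",
])
print(store.similarity_search("shirt with sun protection", n=1))
```

The returned chunks are exactly what would then be passed, inside the prompt, to the language model to produce the final answer.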
328 | 329 | 83 330 | 00:06:14,000 --> 00:06:17,000 331 | So above, we created this chain in only a few lines of code. 332 | 333 | 84 334 | 00:06:17,000 --> 00:06:19,000 335 | That's great for getting started quickly. 336 | 337 | 85 338 | 00:06:19,000 --> 00:06:25,000 339 | Well, let's now do it a bit more step by step and understand what exactly is going on under the hood. 340 | 341 | 86 342 | 00:06:25,000 --> 00:06:27,000 343 | The first step is similar to above. 344 | 345 | 87 346 | 00:06:27,000 --> 00:06:36,000 347 | We're going to create a document loader, loading from that CSV with all the descriptions of the products that we want to do question answering over. 348 | 349 | 88 350 | 00:06:36,000 --> 00:06:41,000 351 | We can then load documents from this document loader. 352 | 353 | 89 354 | 00:06:41,000 --> 00:06:50,000 355 | If we look at the individual documents, we can see that each document corresponds to one of the products in the CSV. 356 | 357 | 90 358 | 00:06:50,000 --> 00:06:53,000 359 | Previously, we talked about creating chunks. 360 | 361 | 91 362 | 00:06:53,000 --> 00:07:01,000 363 | Because these documents are already so small, we actually don't need to do any chunking here, and so we can create embeddings directly. 364 | 365 | 92 366 | 00:07:01,000 --> 00:07:05,000 367 | To create embeddings, we're going to use OpenAI's embedding class. 368 | 369 | 93 370 | 00:07:05,000 --> 00:07:08,000 371 | We can import it and initialize it here. 372 | 373 | 94 374 | 00:07:08,000 --> 00:07:21,000 375 | If we want to see what these embeddings do, we can actually take a look at what happens when we embed a particular piece of text. 376 | 377 | 95 378 | 00:07:21,000 --> 00:07:26,000 379 | Let's use the embed query method on the embeddings object to create embeddings for a particular piece of text. 380 | 381 | 96 382 | 00:07:26,000 --> 00:07:31,000 383 | In this case, the sentence, hi, my name is Harrison.
384 | 385 | 97 386 | 00:07:31,000 --> 00:07:41,000 387 | If we take a look at this embedding, we can see that there are over a thousand different elements. 388 | 389 | 98 390 | 00:07:41,000 --> 00:07:44,000 391 | Each of these elements is a different numerical value. 392 | 393 | 99 394 | 00:07:44,000 --> 00:07:51,000 395 | Combined, this creates the overall numerical representation for this piece of text. 396 | 397 | 100 398 | 00:07:51,000 --> 00:07:58,000 399 | We want to create embeddings for all the pieces of text that we just loaded, and then we also want to store them in a vector store. 400 | 401 | 101 402 | 00:07:58,000 --> 00:08:03,000 403 | We can do that by using the from documents method on the vector store. 404 | 405 | 102 406 | 00:08:03,000 --> 00:08:12,000 407 | This method takes in a list of documents, an embedding object, and then we'll create an overall vector store. 408 | 409 | 103 410 | 00:08:12,000 --> 00:08:18,000 411 | We can now use this vector store to find pieces of text similar to an incoming query. 412 | 413 | 104 414 | 00:08:18,000 --> 00:08:23,000 415 | So let's look at the query, please suggest a shirt with sunblocking. 416 | 417 | 105 418 | 00:08:23,000 --> 00:08:36,000 419 | If we use the similarity search method on the vector store and pass in a query, we will get back a list of documents. 420 | 421 | 106 422 | 00:08:36,000 --> 00:08:48,000 423 | We can see that it returns four documents, and if we look at the first one, we can see that it is indeed a shirt about sunblocking. 424 | 425 | 107 426 | 00:08:48,000 --> 00:08:52,000 427 | So how do we use this to do question answering over our own documents? 428 | 429 | 108 430 | 00:08:52,000 --> 00:08:57,000 431 | First, we need to create a retriever from this vector store. 432 | 433 | 109 434 | 00:08:57,000 --> 00:09:03,000 435 | A retriever is a generic interface that can be underpinned by any method that takes in a query and returns documents. 
436 | 437 | 110 438 | 00:09:03,000 --> 00:09:11,000 439 | Vector stores and embeddings are one such method to do so, although there are plenty of different methods, some less advanced, some more advanced. 440 | 441 | 111 442 | 00:09:11,000 --> 00:09:20,000 443 | Next, because we want to do text generation and return a natural language response, we're going to import a language model and we're going to use ChatOpenAI. 444 | 445 | 112 446 | 00:09:20,000 --> 00:09:28,000 447 | If we were doing this by hand, what we would do is we would combine the documents into a single piece of text. 448 | 449 | 113 450 | 00:09:28,000 --> 00:09:37,000 451 | So we'd do something like this, where we join all the page content in the documents into a variable. 452 | 453 | 114 454 | 00:09:37,000 --> 00:09:48,000 455 | And then we'd pass this variable or a variant on the question, like please list all your shirts with sun protection in a table in markdown and summarize each one, into the language model. 456 | 457 | 115 458 | 00:09:48,000 --> 00:09:55,000 459 | And if we print out the response here, we can see that we get back a table exactly as we asked for. 460 | 461 | 116 462 | 00:09:55,000 --> 00:09:59,000 463 | All of those steps can be encapsulated with the LangChain chain. 464 | 465 | 117 466 | 00:09:59,000 --> 00:10:02,000 467 | So here we can create a retrieval QA chain. 468 | 469 | 118 470 | 00:10:02,000 --> 00:10:06,000 471 | This does retrieval and then does question answering over the retrieved documents. 472 | 473 | 119 474 | 00:10:06,000 --> 00:10:09,000 475 | To create such a chain, we'll pass in a few different things. 476 | 477 | 120 478 | 00:10:09,000 --> 00:10:12,000 479 | First, we'll pass in the language model. 480 | 481 | 121 482 | 00:10:12,000 --> 00:10:15,000 483 | This will be used for doing the text generation at the end. 484 | 485 | 122 486 | 00:10:15,000 --> 00:10:17,000 487 | Next, we'll pass in the chain type.
488 | 489 | 123 490 | 00:10:17,000 --> 00:10:18,000 491 | We're going to use stuff. 492 | 493 | 124 494 | 00:10:18,000 --> 00:10:25,000 495 | This is the simplest method as it just stuffs all the documents into context and makes one call to a language model. 496 | 497 | 125 498 | 00:10:25,000 --> 00:10:32,000 499 | There are a few other methods that you can use to do question answering that I'll maybe touch on at the end, but we're not going to look at in detail. 500 | 501 | 126 502 | 00:10:32,000 --> 00:10:34,000 503 | Third, we're going to pass in a retriever. 504 | 505 | 127 506 | 00:10:34,000 --> 00:10:38,000 507 | The retriever we created above is just an interface for fetching documents. 508 | 509 | 128 510 | 00:10:38,000 --> 00:10:41,000 511 | This will be used to fetch documents and pass it to the language model. 512 | 513 | 129 514 | 00:10:41,000 --> 00:10:46,000 515 | And then finally, we're going to set verbose equals to true. 516 | 517 | 130 518 | 00:10:46,000 --> 00:11:08,000 519 | Now we can create a query and we can run the chain on this query. 520 | 521 | 131 522 | 00:11:08,000 --> 00:11:14,000 523 | When we get the response, we can again display it using the display and markdown utilities. 524 | 525 | 132 526 | 00:11:14,000 --> 00:11:20,000 527 | You can pause the video here and try it out with a bunch of different queries. 528 | 529 | 133 530 | 00:11:20,000 --> 00:11:26,000 531 | So that's how you do it in detail, but remember that we can still do it pretty easily with just the one line that we had up above. 532 | 533 | 134 534 | 00:11:26,000 --> 00:11:30,000 535 | So these two things equate to the same result. 536 | 537 | 135 538 | 00:11:30,000 --> 00:11:32,000 539 | And that's part of the interesting stuff about LangChain. 540 | 541 | 136 542 | 00:11:32,000 --> 00:11:38,000 543 | You can do it in one line or you can look at the individual things and break it down into five more detailed ones.
544 | 545 | 137 546 | 00:11:38,000 --> 00:11:44,000 547 | The five more detailed ones let you set more specifics about what exactly is going on, but the one-liner is easy to get started. 548 | 549 | 138 550 | 00:11:44,000 --> 00:11:48,000 551 | So up to you as to how you'd prefer to go forward. 552 | 553 | 139 554 | 00:11:48,000 --> 00:11:51,000 555 | We can also customize the index when we're creating it. 556 | 557 | 140 558 | 00:11:51,000 --> 00:11:55,000 559 | And so if you remember, when we created it by hand, we specified an embedding. 560 | 561 | 141 562 | 00:11:55,000 --> 00:11:57,000 563 | And we can specify an embedding here as well. 564 | 565 | 142 566 | 00:11:57,000 --> 00:12:01,000 567 | And so this will give us flexibility over how the embeddings themselves are created. 568 | 569 | 143 570 | 00:12:01,000 --> 00:12:06,000 571 | And we can also swap out the vector store here for a different type of vector store. 572 | 573 | 144 574 | 00:12:06,000 --> 00:12:15,000 575 | So there's the same level of customization that you did when you created it by hand that's also available when you create the index here. 576 | 577 | 145 578 | 00:12:15,000 --> 00:12:17,000 579 | We use the stuff method in this notebook. 580 | 581 | 146 582 | 00:12:17,000 --> 00:12:19,000 583 | The stuff method is really nice because it's pretty simple. 584 | 585 | 147 586 | 00:12:19,000 --> 00:12:25,000 587 | You just put all of it into one prompt and send that to the language model and get back one response. 588 | 589 | 148 590 | 00:12:25,000 --> 00:12:27,000 591 | So it's quite simple to understand what's going on. 592 | 593 | 149 594 | 00:12:27,000 --> 00:12:30,000 595 | It's quite cheap and it works pretty well. 596 | 597 | 150 598 | 00:12:30,000 --> 00:12:32,000 599 | But that doesn't always work okay. 600 | 601 | 151 602 | 00:12:32,000 --> 00:12:37,000 603 | So if you remember, when we fetched the documents in the notebook, we only got four documents back. 
604 | 605 | 152 606 | 00:12:37,000 --> 00:12:39,000 607 | And they were relatively small. 608 | 609 | 153 610 | 00:12:39,000 --> 00:12:44,000 611 | But what if you wanted to do the same type of question answering over lots of different types of chunks? 612 | 613 | 154 614 | 00:12:44,000 --> 00:12:47,000 615 | Then there are a few different methods that we can use. 616 | 617 | 155 618 | 00:12:47,000 --> 00:12:48,000 619 | The first is map reduce. 620 | 621 | 156 622 | 00:12:48,000 --> 00:12:55,000 623 | This basically takes all the chunks, passes them along with the question to a language model, gets back a response, 624 | 625 | 157 626 | 00:12:55,000 --> 00:13:02,000 627 | and then uses another language model call to summarize all of the individual responses into a final answer. 628 | 629 | 158 630 | 00:13:02,000 --> 00:13:06,000 631 | This is really powerful because it can operate over any number of documents. 632 | 633 | 159 634 | 00:13:06,000 --> 00:13:11,000 635 | And it's also really powerful because you can do the individual questions in parallel. 636 | 637 | 160 638 | 00:13:11,000 --> 00:13:13,000 639 | But it does take a lot more calls. 640 | 641 | 161 642 | 00:13:13,000 --> 00:13:19,000 643 | And it does treat all the documents as independent, which may not always be the most desired thing. 644 | 645 | 162 646 | 00:13:19,000 --> 00:13:24,000 647 | Refine, which is another method, is again used to loop over many documents. 648 | 649 | 163 650 | 00:13:24,000 --> 00:13:25,000 651 | But it actually does it iteratively. 652 | 653 | 164 654 | 00:13:25,000 --> 00:13:28,000 655 | It builds upon the answer from the previous document. 656 | 657 | 165 658 | 00:13:28,000 --> 00:13:33,000 659 | So this is really good for combining information and building up an answer over time. 660 | 661 | 166 662 | 00:13:33,000 --> 00:13:36,000 663 | It will generally lead to longer answers. 
664 | 665 | 167 666 | 00:13:36,000 --> 00:13:39,000 667 | And it's also not as fast because now the calls aren't independent. 668 | 669 | 168 670 | 00:13:39,000 --> 00:13:43,000 671 | They depend on the result of previous calls. 672 | 673 | 169 674 | 00:13:43,000 --> 00:13:49,000 675 | This means that it often takes a good while longer and takes just as many calls as map reduce, basically. 676 | 677 | 170 678 | 00:13:49,000 --> 00:13:57,000 679 | Map re-rank is a pretty interesting and a bit more experimental one where you do a single call to the language model for each document. 680 | 681 | 171 682 | 00:13:57,000 --> 00:14:00,000 683 | And you also ask it to return a score. 684 | 685 | 172 686 | 00:14:00,000 --> 00:14:02,000 687 | And then you select the highest score. 688 | 689 | 173 690 | 00:14:02,000 --> 00:14:06,000 691 | This relies on the language model to know what the score should be. 692 | 693 | 174 694 | 00:14:06,000 --> 00:14:12,000 695 | So you often have to tell it, hey, it should be a high score if it's relevant to the document and really refine the instructions there. 696 | 697 | 175 698 | 00:14:12,000 --> 00:14:15,000 699 | Similar to map reduce, all the calls are independent. 700 | 701 | 176 702 | 00:14:15,000 --> 00:14:16,000 703 | So you can batch them. 704 | 705 | 177 706 | 00:14:16,000 --> 00:14:18,000 707 | And it's relatively fast. 708 | 709 | 178 710 | 00:14:18,000 --> 00:14:20,000 711 | But again, you're making a bunch of language model calls. 712 | 713 | 179 714 | 00:14:20,000 --> 00:14:22,000 715 | So it will be a bit more expensive. 716 | 717 | 180 718 | 00:14:22,000 --> 00:14:29,000 719 | The most common of these methods is the stuff method, which we used in the notebook to combine it all into one document. 720 | 721 | 181 722 | 00:14:29,000 --> 00:14:35,000 723 | The second most common is the map reduce method, which takes these chunks and sends them to the language model. 
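The trade-offs among the four methods come down to their call patterns. The sketch below is a toy illustration, not the LangChain implementation: a counting fake LLM stands in for a real model so the number of calls and their dependencies are visible.

```python
# Toy illustration of the four combine-documents strategies: stuff,
# map_reduce, refine, and map_rerank. "fake_llm" just counts calls, so
# the call patterns (one call, parallel calls, dependent calls) show up
# directly. None of this is LangChain's actual implementation.
calls = []

def fake_llm(prompt):
    calls.append(prompt)
    return f"answer#{len(calls)}"

def stuff(docs, question):
    # everything into one prompt -> a single call
    return fake_llm(question + "\n" + "\n".join(docs))

def map_reduce(docs, question):
    # one independent call per chunk, plus one final call to combine
    partials = [fake_llm(f"{question}\n{d}") for d in docs]
    return fake_llm("combine: " + "; ".join(partials))

def refine(docs, question):
    # iterative: each call builds on the answer from the previous document
    answer = ""
    for d in docs:
        answer = fake_llm(f"{question}\nanswer so far: {answer}\ndoc: {d}")
    return answer

def map_rerank(docs, question):
    # one scored call per chunk; keep the answer with the highest score
    # (score here is just document length, a stand-in for the LLM's score)
    scored = [(len(d), fake_llm(f"{question}\n{d}")) for d in docs]
    return max(scored)[1]

docs = ["chunk one", "chunk two longer", "chunk three"]
stuff(docs, "q")        # 1 call
map_reduce(docs, "q")   # 3 parallel-capable calls + 1 combine call
refine(docs, "q")       # 3 sequential, dependent calls
map_rerank(docs, "q")   # 3 parallel-capable calls
print(len(calls))       # 11
```

The counts match the transcript's comparison: stuff is the cheapest, map reduce and map rerank can be batched because their per-chunk calls are independent, and refine cannot because each call depends on the previous answer.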
724 | 725 | 182 726 | 00:14:35,000 --> 00:14:42,000 727 | These methods here, stuff, map reduce, refine, and re-rank can also be used for lots of other chains besides just question answering. 728 | 729 | 183 730 | 00:14:42,000 --> 00:14:53,000 731 | For example, a really common use case of the map reduce chain is for summarization, where you have a really long document and you want to recursively summarize pieces of information in it. 732 | 733 | 184 734 | 00:14:53,000 --> 00:14:56,000 735 | That's it for question answering over documents. 736 | 737 | 185 738 | 00:14:56,000 --> 00:15:00,000 739 | As you may have noticed, there's a lot going on in the different chains that we have here. 740 | 741 | 186 742 | 00:15:00,000 --> 00:15:12,000 743 | And so in the next section, we'll cover ways to better understand what exactly is going on inside all of these chains. 744 | 745 | 746 | -------------------------------------------------------------------------------- /english/LangChain_L3.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:09,560 3 | In this lesson, Harrison will teach the most important key building block of 4 | 5 | 2 6 | 00:00:09,560 --> 00:00:11,960 7 | LangChain, namely the chain. 8 | 9 | 3 10 | 00:00:12,680 --> 00:00:17,440 11 | The chain usually combines an LLM, a large language model, together with a prompt. 12 | 13 | 4 14 | 00:00:17,720 --> 00:00:21,720 15 | And with this building block, you can also put a bunch of these building blocks 16 | 17 | 5 18 | 00:00:21,720 --> 00:00:26,040 19 | together to carry out a sequence of operations on your texts or on your other 20 | 21 | 6 22 | 00:00:26,040 --> 00:00:26,540 23 | data. 24 | 25 | 7 26 | 00:00:27,040 --> 00:00:28,400 27 | I'm excited to dive into it. 28 | 29 | 8 30 | 00:00:28,400 --> 00:00:33,080 31 | All right, to start, we're going to load the environment variables as we have 32 | 33 | 9 34 | 00:00:33,080 --> 00:00:33,580 35 | before.
36 | 37 | 10 38 | 00:00:34,440 --> 00:00:37,240 39 | And then we're also going to load some data that we're going to use. 40 | 41 | 11 42 | 00:00:38,040 --> 00:00:43,480 43 | So part of the power of these chains is that you can run them over many inputs 44 | 45 | 12 46 | 00:00:43,800 --> 00:00:44,320 47 | at a time. 48 | 49 | 13 50 | 00:00:44,760 --> 00:00:46,600 51 | Here, we're going to load a pandas data frame. 52 | 53 | 14 54 | 00:00:47,240 --> 00:00:50,840 55 | A pandas data frame is just a data structure that contains a bunch of 56 | 57 | 15 58 | 00:00:50,840 --> 00:00:52,440 59 | different elements of data. 60 | 61 | 16 62 | 00:00:52,520 --> 00:00:54,480 63 | If you're not familiar with pandas, don't worry about it. 64 | 65 | 17 66 | 00:00:54,480 --> 00:00:58,560 67 | The main point here is that we're loading some data that we can then use later on. 68 | 69 | 18 70 | 00:00:58,600 --> 00:01:02,000 71 | And so if we look inside this pandas data frame, we can see that there is a 72 | 73 | 19 74 | 00:01:02,000 --> 00:01:04,200 75 | product column and then a review column. 76 | 77 | 20 78 | 00:01:04,280 --> 00:01:07,640 79 | And each of these rows is a different data point that we can start passing 80 | 81 | 21 82 | 00:01:07,640 --> 00:01:08,480 83 | through our chains. 84 | 85 | 22 86 | 00:01:09,680 --> 00:01:12,120 87 | So the first chain we're going to cover is the llm chain. 88 | 89 | 23 90 | 00:01:12,200 --> 00:01:16,400 91 | And this is a simple but really powerful chain that underpins a lot of the chains 92 | 93 | 24 94 | 00:01:16,400 --> 00:01:18,240 95 | that we'll go over in the future. 96 | 97 | 25 98 | 00:01:18,640 --> 00:01:21,200 99 | And so we're going to import three different things. 100 | 101 | 26 102 | 00:01:21,480 --> 00:01:24,040 103 | We're going to import the OpenAI model. 104 | 105 | 27 106 | 00:01:24,040 --> 00:01:27,360 107 | So the llm, we're going to import the chat prompt template. 
108 | 109 | 28 110 | 00:01:27,400 --> 00:01:29,920 111 | And so this is the prompt and then we're going to import the llm chain. 112 | 113 | 29 114 | 00:01:31,040 --> 00:01:34,520 115 | And so first, what we're going to do is we're going to initialize the language 116 | 117 | 30 118 | 00:01:34,520 --> 00:01:35,960 119 | model that we want to use. 120 | 121 | 31 122 | 00:01:36,040 --> 00:01:39,840 123 | So we're going to initialize the chat OpenAI with a temperature, with a high 124 | 125 | 32 126 | 00:01:39,840 --> 00:01:43,840 127 | temperature so that we can get some fun descriptions. 128 | 129 | 33 130 | 00:01:44,560 --> 00:01:47,640 131 | Now we're going to initialize a prompt and this prompt is going to take in a 132 | 133 | 34 134 | 00:01:47,640 --> 00:01:48,920 135 | variable called product. 136 | 137 | 35 138 | 00:01:49,240 --> 00:01:53,000 139 | It's going to ask the llm to generate what the best name is to describe a 140 | 141 | 36 142 | 00:01:53,000 --> 00:01:54,680 143 | company that makes that product. 144 | 145 | 37 146 | 00:01:55,520 --> 00:01:59,120 147 | And then finally, we're going to combine these two things into a chain. 148 | 149 | 38 150 | 00:01:59,760 --> 00:02:01,880 151 | And so this is what we call an llm chain. 152 | 153 | 39 154 | 00:02:02,000 --> 00:02:02,840 155 | And it's quite simple. 156 | 157 | 40 158 | 00:02:02,840 --> 00:02:06,120 159 | It's just the combination of the llm and the prompt. 160 | 161 | 41 162 | 00:02:06,720 --> 00:02:10,640 163 | But now this chain will let us run through the prompt and into the llm in a 164 | 165 | 42 166 | 00:02:10,640 --> 00:02:11,400 167 | sequential manner. 168 | 169 | 43 170 | 00:02:11,680 --> 00:02:16,120 171 | And so if we have a product called queen size sheet set, we can run this through 172 | 173 | 44 174 | 00:02:16,120 --> 00:02:17,840 175 | the chain by using chain.run. 
176 | 177 | 45 178 | 00:02:18,240 --> 00:02:21,440 179 | And what this will do is it will format the prompt under the hood and then it 180 | 181 | 46 182 | 00:02:21,440 --> 00:02:24,080 183 | will pass the whole prompt into the llm. 184 | 185 | 47 186 | 00:02:24,400 --> 00:02:28,440 187 | And so we can see that we get back the name of this hypothetical company called 188 | 189 | 48 190 | 00:02:28,440 --> 00:02:29,320 191 | Royal Beddings. 192 | 193 | 49 194 | 00:02:30,360 --> 00:02:33,440 195 | And so here would be a good time to pause and you can input any product 196 | 197 | 50 198 | 00:02:33,440 --> 00:02:36,720 199 | descriptions that you would want and you can see what the chain will output as a 200 | 201 | 51 202 | 00:02:36,720 --> 00:02:37,160 203 | result. 204 | 205 | 52 206 | 00:02:38,120 --> 00:02:42,880 207 | So the llm chain is the most basic type of chain and that's going to be used a 208 | 209 | 53 210 | 00:02:42,880 --> 00:02:43,680 211 | lot in the future. 212 | 213 | 54 214 | 00:02:43,880 --> 00:02:47,320 215 | And so we can see how this will be used in the next type of chain, which will be 216 | 217 | 55 218 | 00:02:47,320 --> 00:02:48,440 219 | sequential chains. 220 | 221 | 56 222 | 00:02:48,440 --> 00:02:52,520 223 | And so sequential chains run a sequence of chains one after another. 224 | 225 | 57 226 | 00:02:52,960 --> 00:02:56,800 227 | So to start, you're going to import the simple sequential chain. 228 | 229 | 58 230 | 00:02:57,240 --> 00:03:00,880 231 | And this works well when we have sub chains that expect only one input and 232 | 233 | 59 234 | 00:03:00,880 --> 00:03:02,000 235 | return only one output. 236 | 237 | 60 238 | 00:03:02,760 --> 00:03:07,600 239 | And so here we're going to first create one chain, which uses an llm and a 240 | 241 | 61 242 | 00:03:07,600 --> 00:03:08,160 243 | prompt. 
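The mechanics of the llm chain described above, formatting the prompt under the hood and then passing it to the model, can be sketched with stubs in place of the real `LLMChain` and `ChatOpenAI`:

```python
# Stub sketch of what the llm chain does when you call chain.run: format
# the prompt template with the inputs, then pass the formatted prompt to
# the model. An echoing function stands in for ChatOpenAI, so this shows
# the mechanics only, not the real LLMChain class.
class ToyLLMChain:
    def __init__(self, llm, prompt_template):
        self.llm = llm
        self.prompt_template = prompt_template

    def run(self, **inputs):
        prompt = self.prompt_template.format(**inputs)  # format under the hood
        return self.llm(prompt)                         # pass it to the model

echo_llm = lambda prompt: f"LLM saw: {prompt}"

chain = ToyLLMChain(
    llm=echo_llm,
    prompt_template="What is the best name to describe a company that makes {product}?",
)
print(chain.run(product="queen size sheet set"))
```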
244 | 245 | 62 246 | 00:03:08,560 --> 00:03:13,640 247 | And this prompt is going to take in the product and will return the best name to 248 | 249 | 63 250 | 00:03:13,640 --> 00:03:14,640 251 | describe that company. 252 | 253 | 64 254 | 00:03:14,840 --> 00:03:16,080 255 | So that will be the first chain. 256 | 257 | 65 258 | 00:03:16,080 --> 00:03:18,280 259 | Then we're going to create a second chain. 260 | 261 | 66 262 | 00:03:18,680 --> 00:03:23,280 263 | In the second chain, we'll take in the company name and then output a 20 word 264 | 265 | 67 266 | 00:03:23,320 --> 00:03:25,240 267 | description of that company. 268 | 269 | 68 270 | 00:03:26,320 --> 00:03:29,920 271 | And so you can imagine how these chains might want to be run one after another, 272 | 273 | 69 274 | 00:03:30,160 --> 00:03:33,600 275 | where the output of the first chain, the company name is then passed into the 276 | 277 | 70 278 | 00:03:33,600 --> 00:03:34,200 279 | second chain. 280 | 281 | 71 282 | 00:03:35,800 --> 00:03:39,960 283 | We can easily do this by creating a simple sequential chain where we have the 284 | 285 | 72 286 | 00:03:39,960 --> 00:03:41,640 287 | two chains described there. 288 | 289 | 73 290 | 00:03:42,240 --> 00:03:44,240 291 | And we'll call this overall simple chain. 292 | 293 | 74 294 | 00:03:44,240 --> 00:03:49,720 295 | Now, what you can do is run this chain over any product description. 296 | 297 | 75 298 | 00:03:50,600 --> 00:03:54,920 299 | And so if we use it with the product above, the queen size sheet set, we can 300 | 301 | 76 302 | 00:03:54,920 --> 00:03:58,840 303 | run it over and we can see that it first outputs Royal Beddings, and then it 304 | 305 | 77 306 | 00:03:58,840 --> 00:04:00,200 307 | passes it into the second chain. 308 | 309 | 78 310 | 00:04:00,200 --> 00:04:03,400 311 | And it comes up with this description of what that company could be about.
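The run just described, where the company name produced by the first chain flows straight into the second, can be sketched with plain functions standing in for the two LLM chains (not the real `SimpleSequentialChain` class):

```python
# Stub sketch of the simple sequential chain: each sub-chain has one input
# and one output, and each output becomes the next chain's input. Plain
# functions stand in for the two LLM chains from the lesson.
class ToySimpleSequentialChain:
    def __init__(self, chains):
        self.chains = chains

    def run(self, value):
        for chain in self.chains:  # one after another
            value = chain(value)   # output of one feeds the next
        return value

# chain one: product -> company name (canned, echoing the lesson's output)
name_chain = lambda product: "Royal Beddings"
# chain two: company name -> short description
describe_chain = lambda name: f"{name}: luxury sheets."

overall_simple_chain = ToySimpleSequentialChain([name_chain, describe_chain])
print(overall_simple_chain.run("queen size sheet set"))  # Royal Beddings: luxury sheets.
```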
312 | 313 | 79 314 | 00:04:05,680 --> 00:04:09,160 315 | The simple sequential chain works well when there's only a single input and a 316 | 317 | 80 318 | 00:04:09,160 --> 00:04:09,840 319 | single output. 320 | 321 | 81 322 | 00:04:10,320 --> 00:04:12,120 323 | But what about when there are 324 | 325 | 82 326 | 00:04:12,120 --> 00:04:16,200 327 | multiple inputs or multiple 328 | 329 | 83 330 | 00:04:16,200 --> 00:04:16,680 331 | outputs? 332 | 333 | 84 334 | 00:04:16,920 --> 00:04:20,080 335 | And so we can do this by using just the regular sequential chain. 336 | 337 | 85 338 | 00:04:21,240 --> 00:04:22,160 339 | So let's import that. 340 | 341 | 86 342 | 00:04:22,160 --> 00:04:25,280 343 | And then you're going to create a bunch of chains that we're going to use one 344 | 345 | 87 346 | 00:04:25,280 --> 00:04:25,920 347 | after another. 348 | 349 | 88 350 | 00:04:26,200 --> 00:04:29,200 351 | We're going to be using the data from above, which has a review. 352 | 353 | 89 354 | 00:04:29,640 --> 00:04:34,320 355 | And so the first chain, we're going to take the review and translate it into 356 | 357 | 90 358 | 00:04:34,320 --> 00:04:34,840 359 | English. 360 | 361 | 91 362 | 00:04:37,240 --> 00:04:41,200 363 | With the second chain, we're going to create a summary of that review in one 364 | 365 | 92 366 | 00:04:41,200 --> 00:04:46,840 367 | sentence. And this will use the previously generated English review. 368 | 369 | 93 370 | 00:04:49,520 --> 00:04:53,560 371 | The third chain is going to detect what the language of the review was in the 372 | 373 | 94 374 | 00:04:53,560 --> 00:04:54,200 375 | first place. 376 | 377 | 95 378 | 00:04:54,560 --> 00:04:58,720 379 | And so if you notice, this is using the review variable that is coming from the 380 | 381 | 96 382 | 00:04:58,720 --> 00:05:00,320 383 | original review.
384 | 385 | 97 386 | 00:05:02,400 --> 00:05:05,480 387 | And finally, the fourth chain will take in multiple inputs. 388 | 389 | 98 390 | 00:05:05,840 --> 00:05:09,560 391 | So this will take in the summary variable, which we calculated with the second 392 | 393 | 99 394 | 00:05:09,560 --> 00:05:13,320 395 | chain and the language variable, which we calculated with the third chain. 396 | 397 | 100 398 | 00:05:13,760 --> 00:05:17,720 399 | And it's going to ask for a follow up response to the summary in the specified 400 | 401 | 101 402 | 00:05:17,720 --> 00:05:18,240 403 | language. 404 | 405 | 102 406 | 00:05:19,800 --> 00:05:23,640 407 | One important thing to note about all these sub chains is that the input keys 408 | 409 | 103 410 | 00:05:23,720 --> 00:05:25,960 411 | and output keys need to be pretty precise. 412 | 413 | 104 414 | 00:05:26,680 --> 00:05:28,520 415 | So here we're taking in review. 416 | 417 | 105 418 | 00:05:28,600 --> 00:05:31,120 419 | This is a variable that will be passed in at the start. 420 | 421 | 106 422 | 00:05:31,760 --> 00:05:35,320 423 | We can see that we explicitly set the output key to English review. 424 | 425 | 107 426 | 00:05:35,320 --> 00:05:39,840 427 | This is then used in the next prompt down below where we take in English review 428 | 429 | 108 430 | 00:05:39,840 --> 00:05:44,240 431 | with that same variable name and we set the output key of that chain to summary, 432 | 433 | 109 434 | 00:05:44,680 --> 00:05:46,920 435 | which we can see is used in the final chain. 436 | 437 | 110 438 | 00:05:47,800 --> 00:05:52,760 439 | The third prompt takes in review the original variable and output language, 440 | 441 | 111 442 | 00:05:53,160 --> 00:05:55,040 443 | which is again used in the final prompt. 
444 | 445 | 112 446 | 00:05:56,040 --> 00:05:59,760 447 | It's really important to get these variable names lined up exactly right, 448 | 449 | 113 450 | 00:05:59,920 --> 00:06:02,400 451 | because there's so many different inputs and outputs going on. 452 | 453 | 114 454 | 00:06:02,400 --> 00:06:06,160 455 | And if you get any key errors, you should definitely check that they are lined up. 456 | 457 | 115 458 | 00:06:06,160 --> 00:06:12,040 459 | So the simple sequential chain takes in multiple chains where each one has a 460 | 461 | 116 462 | 00:06:12,040 --> 00:06:13,680 463 | single input and a single output. 464 | 465 | 117 466 | 00:06:14,560 --> 00:06:19,080 467 | To see a visual representation of this, we can look at the slide where it has one 468 | 469 | 118 470 | 00:06:19,080 --> 00:06:22,760 471 | chain feeding into the other chain one after another. 472 | 473 | 119 474 | 00:06:24,080 --> 00:06:28,000 475 | Here we can see a visual description of the sequential chain, comparing it to the 476 | 477 | 120 478 | 00:06:28,000 --> 00:06:32,920 479 | above chain, you can notice that any step in the chain can take in multiple input 480 | 481 | 121 482 | 00:06:32,920 --> 00:06:33,720 483 | variables. 484 | 485 | 122 486 | 00:06:34,280 --> 00:06:38,400 487 | This is useful when you have more complicated downstream chains that need 488 | 489 | 123 490 | 00:06:38,400 --> 00:06:41,400 491 | to be a composition of multiple previous chains. 492 | 493 | 124 494 | 00:06:42,840 --> 00:06:46,400 495 | Now that we have all these chains, we can easily combine them in the sequential 496 | 497 | 125 498 | 00:06:46,400 --> 00:06:46,920 499 | chain. 500 | 501 | 126 502 | 00:06:47,360 --> 00:06:51,880 503 | So you'll notice here that we'll pass in the four chains we created into the 504 | 505 | 127 506 | 00:06:51,880 --> 00:06:52,760 507 | chains variable. 
508 | 509 | 128 510 | 00:06:52,760 --> 00:06:57,280 511 | We'll create the inputs variable with the one human input, which is the 512 | 513 | 129 514 | 00:06:57,280 --> 00:06:58,000 515 | review. 516 | 517 | 130 518 | 00:06:58,400 --> 00:07:02,200 519 | And then we want to return all the intermediate outputs. 520 | 521 | 131 522 | 00:07:02,200 --> 00:07:05,080 523 | So the English review, the summary, and then the follow up message. 524 | 525 | 132 526 | 00:07:07,320 --> 00:07:10,080 527 | Now we can run this over some of the data. 528 | 529 | 133 530 | 00:07:10,120 --> 00:07:14,800 531 | So let's choose a review and pass it in through the overall chain. 532 | 533 | 134 534 | 00:07:20,000 --> 00:07:24,920 535 | We can see here that the original review looks like it was in French. 536 | 537 | 135 538 | 00:07:24,920 --> 00:07:27,680 539 | We can see the English review as a translation. 540 | 541 | 136 542 | 00:07:27,680 --> 00:07:31,880 543 | We can see a summary of that review, and then we can see a follow up message in 544 | 545 | 137 546 | 00:07:31,880 --> 00:07:34,240 547 | the original language of French. 548 | 549 | 138 550 | 00:07:34,800 --> 00:07:38,320 551 | You should pause the video here and try putting in different inputs. 552 | 553 | 139 554 | 00:07:39,040 --> 00:07:42,560 555 | So far we've covered the LLM chain and then a sequential chain. 556 | 557 | 140 558 | 00:07:43,080 --> 00:07:45,600 559 | But what if you want to do something more complicated? 560 | 561 | 141 562 | 00:07:46,200 --> 00:07:50,440 563 | A pretty common, but basic operation is to route an input to a chain, depending 564 | 565 | 142 566 | 00:07:50,440 --> 00:07:52,400 567 | on what exactly that input is. 
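The sequential chain just run, with its named input and output keys, can be sketched as sub-chains reading and writing a shared dictionary of variables. The stub functions below stand in for the four LLM chains; the key names follow the lesson, and a misnamed key raises exactly the kind of key error the transcript warns about:

```python
# Stub sketch of the full sequential chain: sub-chains read and write
# named keys in a shared dict, which is why the input and output key
# names must line up exactly. The lambdas stand in for the four LLM
# chains; key names follow the lesson.
def run_sequential_chain(chains, inputs, output_keys):
    state = dict(inputs)
    for input_keys, output_key, fn in chains:
        args = [state[k] for k in input_keys]  # a misnamed key raises KeyError
        state[output_key] = fn(*args)
    return {k: state[k] for k in output_keys}

chains = [
    (["Review"], "English_Review", lambda r: f"(english) {r}"),
    (["English_Review"], "summary", lambda er: f"(summary) {er[:20]}"),
    (["Review"], "language", lambda r: "French"),
    (["summary", "language"], "followup_message",
     lambda s, lang: f"(reply in {lang}) {s}"),
]

result = run_sequential_chain(
    chains,
    inputs={"Review": "Je trouve le gout mediocre."},
    output_keys=["English_Review", "summary", "followup_message"],
)
print(result["followup_message"])
```

Note how the fourth stub takes two inputs, `summary` and `language`, mirroring the final prompt in the lesson that composes outputs from two earlier chains.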
568 | 569 | 143 570 | 00:07:52,400 --> 00:07:57,200 571 | A good way to imagine this is if you have multiple sub chains, each of which 572 | 573 | 144 574 | 00:07:57,200 --> 00:08:01,720 575 | is specialized for a particular type of input, you could have a router chain, 576 | 577 | 145 578 | 00:08:01,760 --> 00:08:06,000 579 | which first decides which sub chain to pass it to, and then passes it to that 580 | 581 | 146 582 | 00:08:06,000 --> 00:08:06,480 583 | chain. 584 | 585 | 147 586 | 00:08:07,360 --> 00:08:11,520 587 | For a concrete example, let's look at where we are routing between different 588 | 589 | 148 590 | 00:08:11,520 --> 00:08:15,720 591 | types of chains, depending on the subject of the incoming question. 592 | 593 | 149 594 | 00:08:16,440 --> 00:08:18,640 595 | So we have here different prompts. 596 | 597 | 150 598 | 00:08:18,800 --> 00:08:21,280 599 | One prompt is good for answering physics questions. 600 | 601 | 151 602 | 00:08:21,280 --> 00:08:23,680 603 | The second prompt is good for answering math questions. 604 | 605 | 152 606 | 00:08:23,680 --> 00:08:26,960 607 | The third for history, and then a fourth for computer science. 608 | 609 | 153 610 | 00:08:27,280 --> 00:08:29,440 611 | Let's define all these prompt templates. 612 | 613 | 154 614 | 00:08:33,320 --> 00:08:36,800 615 | After we have these prompt templates, we can then provide more information 616 | 617 | 155 618 | 00:08:36,800 --> 00:08:37,320 619 | about them. 620 | 621 | 156 622 | 00:08:37,760 --> 00:08:40,600 623 | We can give each one a name and then a description. 624 | 625 | 157 626 | 00:08:41,160 --> 00:08:44,280 627 | The description for the physics one is "good for answering questions about 628 | 629 | 158 630 | 00:08:44,280 --> 00:08:44,800 631 | physics." 632 | 633 | 159 634 | 00:08:45,480 --> 00:08:48,560 635 | This information is going to be passed to the router chain.
636 | 637 | 160 638 | 00:08:48,560 --> 00:08:52,000 639 | So the router chain can decide when to use this sub chain. 640 | 641 | 161 642 | 00:08:58,080 --> 00:09:00,880 643 | Let's now import the other types of chains that we need. 644 | 645 | 162 646 | 00:09:01,360 --> 00:09:03,080 647 | Here we need a multi-prompt chain. 648 | 649 | 163 650 | 00:09:03,560 --> 00:09:07,400 651 | This is a specific type of chain that is used when routing between multiple 652 | 653 | 164 654 | 00:09:07,400 --> 00:09:08,600 655 | different prompt templates. 656 | 657 | 165 658 | 00:09:09,200 --> 00:09:12,680 659 | As you can see, all the options we have are prompt templates themselves. 660 | 661 | 166 662 | 00:09:13,280 --> 00:09:15,760 663 | But this is just one type of thing that you can route between. 664 | 665 | 167 666 | 00:09:15,760 --> 00:09:18,520 667 | You can route between any type of chain. 668 | 669 | 168 670 | 00:09:19,000 --> 00:09:22,400 671 | The other classes that we'll implement here are an LLM router chain. 672 | 673 | 169 674 | 00:09:22,880 --> 00:09:26,840 675 | This uses a language model itself to route between the different sub chains. 676 | 677 | 170 678 | 00:09:26,880 --> 00:09:30,360 679 | This is where the description and the name provided above will be used. 680 | 681 | 171 682 | 00:09:31,160 --> 00:09:33,360 683 | We'll also import a router output parser. 684 | 685 | 172 686 | 00:09:33,920 --> 00:09:38,080 687 | This parses the LLM output into a dictionary that can be used downstream 688 | 689 | 173 690 | 00:09:38,440 --> 00:09:42,440 691 | to determine which chain to use and what the input to that chain should be. 692 | 693 | 174 694 | 00:09:42,440 --> 00:09:44,120 695 | Now we can get around to using it. 696 | 697 | 175 698 | 00:09:44,160 --> 00:09:48,680 699 | First, let's import and define the language model that we will use. 700 | 701 | 176 702 | 00:09:52,000 --> 00:09:54,240 703 | We now create the destination chains. 
704 | 705 | 177 706 | 00:09:54,400 --> 00:09:57,080 707 | These are the chains that will be called by the router chain. 708 | 709 | 178 710 | 00:09:57,560 --> 00:10:01,400 711 | As you can see, each destination chain itself is a language model chain, 712 | 713 | 179 714 | 00:10:01,400 --> 00:10:02,360 715 | an LLM chain. 716 | 717 | 180 718 | 00:10:04,240 --> 00:10:08,640 719 | In addition to the destination chains, 720 | 721 | 181 722 | 00:10:08,640 --> 00:10:12,800 723 | we also need a default chain. 724 | 725 | 182 726 | 00:10:13,320 --> 00:10:15,880 727 | This is a chain that's called when the router can't decide 728 | 729 | 183 730 | 00:10:16,080 --> 00:10:17,800 731 | which of the sub chains to use. 732 | 733 | 184 734 | 00:10:18,080 --> 00:10:22,000 735 | In the example above, this might be called when the input question 736 | 737 | 185 738 | 00:10:22,000 --> 00:10:25,800 739 | has nothing to do with physics, math, history or computer science. 740 | 741 | 186 742 | 00:10:28,120 --> 00:10:31,760 743 | Now we define the template that is used by the LLM 744 | 745 | 187 746 | 00:10:31,760 --> 00:10:33,800 747 | to route between the different chains. 748 | 749 | 188 750 | 00:10:34,720 --> 00:10:37,000 751 | This has instructions of the task to be done, 752 | 753 | 189 754 | 00:10:37,000 --> 00:10:40,440 755 | as well as the specific formatting that the output should be in. 756 | 757 | 190 758 | 00:10:41,680 --> 00:10:44,840 759 | Let's put a few of these pieces together to build the router chain. 760 | 761 | 191 762 | 00:10:45,600 --> 00:10:48,520 763 | First, we create the full router template by formatting it 764 | 765 | 192 766 | 00:10:48,520 --> 00:10:50,480 767 | with the destinations that we defined above. 768 | 769 | 193 770 | 00:10:50,920 --> 00:10:54,280 771 | This template is flexible to a bunch of different types of destinations.
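The formatting step just described, where each destination's name and description are substituted into the router template, can be sketched roughly like this. The names and descriptions follow the lesson; the template wording itself is a paraphrase, not LangChain's actual routing prompt.

```python
# Sketch of the formatting step: each destination's name and description
# are joined into one string and substituted into the router template.
# The names and descriptions follow the lesson; the template wording is a
# paraphrase, not LangChain's actual routing prompt.
prompt_infos = [
    {"name": "physics", "description": "Good for answering questions about physics"},
    {"name": "math", "description": "Good for answering math questions"},
    {"name": "history", "description": "Good for answering history questions"},
    {"name": "computer science", "description": "Good for answering computer science questions"},
]

destinations_str = "\n".join(
    f"{p['name']}: {p['description']}" for p in prompt_infos
)

# {{input}} survives .format() as a literal {input}, left for the router
# chain to fill in later with the user's question.
router_template = (
    "Given a raw text input, select the prompt best suited for it.\n"
    "<< CANDIDATE PROMPTS >>\n"
    "{destinations}\n"
    "<< INPUT >>\n"
    "{{input}}"
).format(destinations=destinations_str)

print(router_template)
```

Adding a new subject such as English or Latin, as the transcript suggests, only means appending one more entry to `prompt_infos`.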
772 | 773 | 194 774 | 00:10:54,720 --> 00:10:58,520 775 | One thing you can do here is pause and add different types of destinations. 776 | 777 | 195 778 | 00:10:59,000 --> 00:11:02,160 779 | So up here, rather than just physics, math, history and computer science, 780 | 781 | 196 782 | 00:11:02,160 --> 00:11:04,960 783 | you could add a different subject like English or Latin. 784 | 785 | 197 786 | 00:11:04,960 --> 00:11:07,760 787 | Next, we create the prompt template from this template, 788 | 789 | 198 790 | 00:11:08,080 --> 00:11:11,280 791 | and then we create the router chain by passing in the LLM 792 | 793 | 199 794 | 00:11:11,280 --> 00:11:13,120 795 | and the overall router prompt. 796 | 797 | 200 798 | 00:11:13,960 --> 00:11:16,360 799 | Note that here we have the router output parser. 800 | 801 | 201 802 | 00:11:16,720 --> 00:11:19,320 803 | This is important as it will help this chain decide 804 | 805 | 202 806 | 00:11:19,720 --> 00:11:22,160 807 | which sub chains to route between. 808 | 809 | 203 810 | 00:11:24,760 --> 00:11:28,920 811 | And finally, putting it all together, we can create the overall chain. 812 | 813 | 204 814 | 00:11:29,240 --> 00:11:32,400 815 | This has a router chain, which is defined here. 816 | 817 | 205 818 | 00:11:32,400 --> 00:11:35,200 819 | It has destination chains, which we pass in here. 820 | 821 | 206 822 | 00:11:35,400 --> 00:11:37,200 823 | And then we also pass in the default chain. 824 | 825 | 207 826 | 00:11:38,880 --> 00:11:40,200 827 | We can now use this chain. 828 | 829 | 208 830 | 00:11:40,520 --> 00:11:41,960 831 | So let's ask it some questions. 832 | 833 | 209 834 | 00:11:42,560 --> 00:11:45,320 835 | If we ask it a question about physics, 836 | 837 | 210 838 | 00:11:45,520 --> 00:11:48,920 839 | we should hopefully see that it is routed to the physics chain 840 | 841 | 211 842 | 00:11:49,640 --> 00:11:52,560 843 | with the input, what is blackbody radiation? 
844 | 845 | 212 846 | 00:11:52,800 --> 00:11:55,480 847 | And then that is passed into the chain down below. 848 | 849 | 213 850 | 00:11:55,480 --> 00:11:59,080 851 | And we can see that the response is very detailed 852 | 853 | 214 854 | 00:11:59,080 --> 00:12:01,080 855 | with lots of physics details. 856 | 857 | 215 858 | 00:12:01,080 --> 00:12:04,600 859 | You should pause the video here and try putting in different inputs. 860 | 861 | 216 862 | 00:12:04,920 --> 00:12:08,520 863 | You can try with all the other types of special chains 864 | 865 | 217 866 | 00:12:08,520 --> 00:12:09,920 867 | that we have defined above. 868 | 869 | 218 870 | 00:12:10,440 --> 00:12:13,240 871 | So, for example, if we ask it a math question, 872 | 873 | 219 874 | 00:12:21,600 --> 00:12:23,680 875 | we should see that it's routed to the math chain 876 | 877 | 220 878 | 00:12:24,040 --> 00:12:25,120 879 | and then passed into that. 880 | 881 | 221 882 | 00:12:25,120 --> 00:12:27,720 883 | We can also see what happens when we pass in a question 884 | 885 | 222 886 | 00:12:27,920 --> 00:12:30,320 887 | that is not related to any of the subchains. 888 | 889 | 223 890 | 00:12:30,720 --> 00:12:33,480 891 | So here we ask it a question about biology 892 | 893 | 224 894 | 00:12:33,760 --> 00:12:35,880 895 | and we can see the chain that it chooses is none. 896 | 897 | 225 898 | 00:12:36,520 --> 00:12:38,400 899 | This means that it will be passed to the default chain, 900 | 901 | 226 902 | 00:12:38,400 --> 00:12:41,360 903 | which itself is just a generic call to the language model. 904 | 905 | 227 906 | 00:12:41,560 --> 00:12:43,680 907 | The language model luckily knows a lot about biology, 908 | 909 | 228 910 | 00:12:43,680 --> 00:12:44,880 911 | so it can help us out here. 912 | 913 | 229 914 | 00:12:46,080 --> 00:12:48,120 915 | Now that we've covered these basic questions, 916 | 917 | 230 918 | 00:12:48,120 --> 00:12:50,120 919 | let's move on to the next part of the video. 
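The routing behavior demonstrated above, a physics question going to the physics chain and a biology question falling through to the default chain, can be sketched with a keyword lookup standing in for the LLM router. The real multi-prompt chain asks the language model to choose a destination and parses its output; here the control flow is the whole point.

```python
# Toy sketch of the router pattern: pick a destination chain by subject,
# fall back to the default chain when the router returns none. The real
# router is an LLM call whose output is parsed into a destination name;
# a naive keyword lookup stands in for it here (e.g. "war" matching
# inside another word would misroute; it's only an illustration).
destination_chains = {
    "physics": lambda q: f"[physics expert] {q}",
    "math": lambda q: f"[math expert] {q}",
    "history": lambda q: f"[history expert] {q}",
    "computer science": lambda q: f"[cs expert] {q}",
}
default_chain = lambda q: f"[generic model] {q}"  # plain call to the LLM

def toy_router(question):
    keywords = {"radiation": "physics", "prime": "math", "war": "history"}
    for word, destination in keywords.items():
        if word in question.lower():
            return destination
    return None  # "none" -> default chain

def multi_prompt_chain(question):
    destination = toy_router(question)
    chain = destination_chains.get(destination, default_chain)
    return chain(question)

print(multi_prompt_chain("What is blackbody radiation?"))  # routed to physics
print(multi_prompt_chain("Tell me about DNA"))             # falls back to default
```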
920 | 921 | 231 922 | 00:12:50,120 --> 00:12:52,120 923 | And that is how we can create a new chain. 924 | 925 | 232 926 | 00:12:52,120 --> 00:12:55,120 927 | So, for example, in the next section, 928 | 929 | 233 930 | 00:12:55,120 --> 00:12:57,120 931 | we're going to cover how to create a chain 932 | 933 | 234 934 | 00:12:57,120 --> 00:13:22,120 935 | that can do question answering over your documents. 936 | 937 | 938 | --------------------------------------------------------------------------------