├── README.md
├── .gitignore
├── chinese
│   ├── LangChain_Conclusion.srt
│   ├── LangChain_Intro.srt
│   ├── LangChain_L6.srt
│   ├── LangChain_L2.srt
│   ├── LangChain_L4.srt
│   ├── LangChain_L3.srt
│   ├── LangChain_L5.srt
│   └── LangChain_L1.srt
├── LangChain_Conclusion.srt
├── english
│   ├── LangChain_Conclusion.srt
│   ├── LangChain_Intro.srt
│   ├── LangChain_L6.srt
│   ├── LangChain_L2.srt
│   ├── LangChain_L4.srt
│   └── LangChain_L3.srt
├── LangChain_Intro.srt
└── LangChain_L6.srt

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# [新]吴恩达 LangChain 课程中英字幕

使用whisper + gpt 识别翻译,勘误请提issues,谢谢。

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
video
audio
.env
translator.py
main.py
merge.mjs
package.json
node_modules
pnpm-lock.yaml

--------------------------------------------------------------------------------
/chinese/LangChain_Conclusion.srt:
--------------------------------------------------------------------------------

1
00:00:00,000 --> 00:00:10,600
在这个短课程中,您看到了一系列的应用,包括处理客户评论、构建一个应用程序来回答文档问题,甚至使用LLM来决定何时调用外部工具,如Web搜索来回答复杂问题。

2
00:00:10,600 --> 00:00:16,840
如果一两周前,有人问你建立所有这些应用需要多少工作量,我想很多人会觉得,哇,这听起来像是几周,甚至更长时间的工作。

3
00:00:16,840 --> 00:00:22,080
但是您在这个短课程中看到了,只需要相当数量的代码行,您就可以使用LangChain相当高效地构建所有这些应用。

4
00:00:22,080 --> 00:00:26,400
希望您能够借鉴这些想法,也许您可以使用Jupyter笔记本中看到的一些代码片段在自己的应用程序中使用它们。

5
00:00:26,400 --> 00:00:30,800
而这些想法只是一个开始。

6
00:00:30,800 --> 00:00:32,200
您可以使用语言模型进行许多其他应用。

7
00:00:32,200 --> 00:00:36,640
这些模型非常强大,因为它们适用于如此广泛的任务,无论是关于CSV的问题,查询SQL数据库,还是与API交互。

8
00:00:36,640 --> 00:00:41,600
有许多不同的使用链和提示组合以及输出解析器和更多链来在LangChain中完成所有这些事情的示例。

9
00:00:41,600 --> 00:00:45,600
而这大部分归功于LangChain社区。

10
00:00:45,600 --> 00:00:49,120
因此,我还要非常感谢社区中的每个人的贡献,无论是改进文档,使其更容易入门,还是新类型的链,开启了一个全新的可能性世界。

11
00:00:49,120 --> 00:00:51,320
因此,如果您还没有这样做,请去您的笔记本电脑、台式机上运行pip install langchain。

12
00:00:51,320 --> 00:00:54,920
然后使用它来构建一些惊人的应用程序。

--------------------------------------------------------------------------------
/LangChain_Conclusion.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:10,600
在这个短课程中,您看到了一系列的应用,包括处理客户评论、构建一个应用程序来回答文档问题,甚至使用LLM来决定何时调用外部工具,如Web搜索来回答复杂问题。
In this short course, you saw a range of applications, including processing customer reviews and

2
00:00:10,600 --> 00:00:16,840
如果一两周前,有人问你建立所有这些应用需要多少工作量,我想很多人会觉得,哇,这听起来像是几周,甚至更长时间的工作。
building an application to answer questions over documents, and even using an llm to decide

3
00:00:16,840 --> 00:00:22,080
但是您在这个短课程中看到了,只需要相当数量的代码行,您就可以使用LangChain相当高效地构建所有这些应用。
when to make a call to an external tool like web search to answer complex questions.

4
00:00:22,080 --> 00:00:26,400
希望您能够借鉴这些想法,也许您可以使用Jupyter笔记本中看到的一些代码片段在自己的应用程序中使用它们。
If a week or two ago, someone had asked you how much work would it be to build all of

5
00:00:26,400 --> 00:00:30,800
而这些想法只是一个开始。
these applications, I think a lot of people thought, boy, this sounds like weeks, maybe

6
00:00:30,800 --> 00:00:32,200
您可以使用语言模型进行许多其他应用。
even longer of work.

7
00:00:32,200 --> 00:00:36,640
这些模型非常强大,因为它们适用于如此广泛的任务,无论是关于CSV的问题,查询SQL数据库,还是与API交互。
But you saw in this short course how with just a pretty reasonable number of lines of

8
00:00:36,640 --> 00:00:41,600
有许多不同的使用链和提示组合以及输出解析器和更多链来在LangChain中完成所有这些事情的示例。
code, you can use LangChain to build all of these applications pretty efficiently.

9
00:00:41,600 --> 00:00:45,600
而这大部分归功于LangChain社区。
And I hope you take these ideas, maybe you can take some code snippets that you saw in

10
00:00:45,600 --> 00:00:49,120
因此,我还要非常感谢社区中的每个人的贡献,无论是改进文档,使其更容易入门,还是新类型的链,开启了一个全新的可能性世界。
the Jupyter notebooks and use them in your own applications.

11
00:00:49,120 --> 00:00:51,320
因此,如果您还没有这样做,请去您的笔记本电脑、台式机上运行pip install langchain。
And these ideas are really just the start.

12
00:00:51,320 --> 00:00:54,920
然后使用它来构建一些惊人的应用程序。
There's a lot of other applications that you can use language models for.

--------------------------------------------------------------------------------
/english/LangChain_Conclusion.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:10,600
In this short course, you saw a range of applications, including processing customer reviews and

2
00:00:10,600 --> 00:00:16,840
building an application to answer questions over documents, and even using an llm to decide

3
00:00:16,840 --> 00:00:22,080
when to make a call to an external tool like web search to answer complex questions.

4
00:00:22,080 --> 00:00:26,400
If a week or two ago, someone had asked you how much work would it be to build all of

5
00:00:26,400 --> 00:00:30,800
these applications, I think a lot of people thought, boy, this sounds like weeks, maybe

6
00:00:30,800 --> 00:00:32,200
even longer of work.

7
00:00:32,200 --> 00:00:36,640
But you saw in this short course how with just a pretty reasonable number of lines of

8
00:00:36,640 --> 00:00:41,600
code, you can use LangChain to build all of these applications pretty efficiently.

9
00:00:41,600 --> 00:00:45,600
And I hope you take these ideas, maybe you can take some code snippets that you saw in

10
00:00:45,600 --> 00:00:49,120
the Jupyter notebooks and use them in your own applications.

11
00:00:49,120 --> 00:00:51,320
And these ideas are really just the start.

12
00:00:51,320 --> 00:00:54,920
There's a lot of other applications that you can use language models for.

13
00:00:54,920 --> 00:00:59,680
These models are so powerful because they're applicable to such a wide range of tasks,

14
00:00:59,680 --> 00:01:06,600
whether it be answering questions about CSVs, querying SQL databases, interacting with APIs.

15
00:01:06,600 --> 00:01:11,740
There's a lot of different examples of using chains and the combinations of prompts and

16
00:01:11,740 --> 00:01:16,040
output parsers and then more chains to do all these things in LangChain.

17
00:01:16,040 --> 00:01:18,200
And most of that is due to the LangChain community.

18
00:01:18,200 --> 00:01:22,360
So I want to also give a really big thank you to everyone in the community who's contributed,

19
00:01:22,360 --> 00:01:26,560
whether it be improvements in the documentation, making it easier for others to get started,

20
00:01:26,560 --> 00:01:30,360
or new types of chains, opening up a whole new world of possibilities.

21
00:01:30,360 --> 00:01:35,560
And so with that, if you have not yet already done so, I hope you go to your laptop, your

22
00:01:35,560 --> 00:01:38,720
desktop and run pip install langchain.

23
00:01:38,720 --> 00:01:54,000
And then go use this too to go build some amazing applications.

--------------------------------------------------------------------------------
/chinese/LangChain_Intro.srt:
--------------------------------------------------------------------------------

1
00:00:00,000 --> 00:00:06,440
欢迎来到这个关于LangChain大语言模型应用开发的短期课程。

2
00:00:06,440 --> 00:00:10,040
通过提示llm或大型语言模型,

3
00:00:10,040 --> 00:00:13,480
现在可以比以往更快地开发AI应用程序。

4
00:00:13,480 --> 00:00:18,600
但一个应用程序可能需要多次提示llm并解析输出。

5
00:00:18,600 --> 00:00:24,800
因此需要编写大量的粘合代码。

6
00:00:24,800 --> 00:00:28,220
Harrison Chase创建的LangChain使得这个开发过程更加容易。

7
00:00:28,220 --> 00:00:34,360
我很高兴有Harrison在这里,

8
00:00:34,360 --> 00:00:36,280
他与deeplearning.ai合作建立了这个短期课程,教大家如何使用这个神奇的工具。

9
00:00:36,280 --> 00:00:38,920
感谢你的邀请。我很高兴能来这里。

10
00:00:38,920 --> 00:00:42,680
LangChain起初是一个用于构建LLM应用程序的开源框架。

11
00:00:42,680 --> 00:00:45,400
谢谢你。我很高兴能来这里。

12
00:00:45,400 --> 00:00:49,760
当我与该领域的一些人交谈时,他们正在构建更复杂的应用程序,并看到了

13
00:00:49,760 --> 00:00:53,360
一些共同的抽象,以及它们如何被开发。

14
00:00:53,360 --> 00:00:56,200
到目前为止,我们非常高兴地看到LangChain在社区中的采用。

15
00:00:56,200 --> 00:01:04,320
因此,期待与大家分享它,并期待看到人们用它构建的东西。

16
00:01:04,320 --> 00:01:06,400
事实上,作为LangChain动力的一个标志,

17
00:01:06,400 --> 00:01:08,960
它不仅有众多用户,

18
00:01:08,960 --> 00:01:12,700
而且还有许多对开源的贡献者。

19
00:01:12,700 --> 00:01:19,000
这对于它的快速发展至关重要。

20
00:01:19,000 --> 00:01:22,760
这个团队以惊人的速度推出代码和功能。

21
00:01:22,760 --> 00:01:28,960
因此,希望在这个短期课程之后,

22
00:01:28,960 --> 00:01:33,720
你将能够快速地使用LangChain组合一些非常酷的应用程序。

23
00:01:33,720 --> 00:01:36,280
谁知道,也许你甚至决定

24
00:01:36,280 --> 00:01:39,800
回馈开源LangChain的努力。

25
00:01:39,800 --> 00:01:45,360
LangChain是一个用于构建应用程序的开源开发框架。

26
00:01:45,360 --> 00:01:47,280
我们有两个不同的包,

27
00:01:47,280 --> 00:01:49,520
一个是Python,一个是JavaScript。

28
00:01:49,520 --> 00:01:54,960
它们专注于组合和模块化。

29
00:01:54,960 --> 00:01:58,320
因此,它们有许多单独的组件,可以一起使用或单独使用。

34
00:01:58,320 --> 00:02:00,080
因此,这是其中一个关键的附加值。

35
00:02:00,080 --> 00:02:03,680
另一个关键的附加值是许多不同的用例。

36
00:02:03,680 --> 00:02:07,280
因此,将这些模块化组件组合成链式方式,形成更多端到端的应用程序,并使其易于开始使用这些用例。

37
00:02:07,280 --> 00:02:09,680
在本课程中,我们将介绍LangChain的常见组件。

38
00:02:09,680 --> 00:02:12,640
因此,我们将讨论模型。

39
00:02:12,640 --> 00:02:16,080
我们将讨论提示,这是您使模型执行有用和有趣操作的方式。

40
00:02:16,080 --> 00:02:17,520
我们将讨论索引,

41
00:02:17,520 --> 00:02:19,560
这是一种摄取数据的方式,以便您可以将其与模型结合使用。

42
00:02:19,560 --> 00:02:22,080
然后,我们将讨论链式,

43
00:02:22,080 --> 00:02:27,920
这是更多的端到端用例,以及代理人,

44
00:02:27,920 --> 00:02:29,480
这是一种非常令人兴奋的端到端用例,

45
00:02:29,480 --> 00:02:32,280
它使用模型作为推理引擎。

46
00:02:32,280 --> 00:02:34,920
我们还感谢Ankush Gholar,

47
00:02:34,920 --> 00:02:37,680
他是LangChain的联合创始人,与Harrison Chase一起,

48
00:02:37,680 --> 00:02:40,320
也为这些材料投入了很多思考,并帮助创建了这个短期课程。

49
00:02:40,320 --> 00:02:44,600
在deep learning.ai方面,

50
00:02:44,600 --> 00:02:46,840
Jeff Ludwig,Eddie Hsu和Diala Ezzedine,

51
00:02:46,840 --> 00:02:50,480
也为这些材料做出了贡献。

52
00:02:50,480 --> 00:02:52,840
因此,让我们进入下一个视频,了解LangChain的模型,提示和解析器。

--------------------------------------------------------------------------------
/english/LangChain_Intro.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:06,440
Welcome to this short course on

2
00:00:06,440 --> 00:00:10,040
LangChain for large language model application development.

3
00:00:10,040 --> 00:00:13,480
By prompting an llm or large language model,

4
00:00:13,480 --> 00:00:18,600
it's now possible to develop AI applications much faster than ever before.

5
00:00:18,600 --> 00:00:24,800
But an application can require prompting an llm multiple times and parsing its output.

6
00:00:24,800 --> 00:00:28,220
So there's a lot of glue code that needs to be written.

7
00:00:28,220 --> 00:00:34,360
LangChain created by Harrison Chase makes this development process much easier.

8
00:00:34,360 --> 00:00:36,280
I'm thrilled to have Harrison here,

9
00:00:36,280 --> 00:00:38,920
who has built this short course in collaboration with

10
00:00:38,920 --> 00:00:42,680
deeplearning.ai to teach how to use this amazing tool.

11
00:00:42,680 --> 00:00:45,400
Thanks for having me. I'm really excited to be here.

12
00:00:45,400 --> 00:00:49,760
LangChain started as an open source framework for building LLM applications.

13
00:00:49,760 --> 00:00:53,360
It came about when I was talking to a bunch of folks in the field who were

14
00:00:53,360 --> 00:00:56,200
building more complex applications and saw

15
00:00:56,200 --> 00:00:59,760
some common abstractions in terms of how they were being developed.

16
00:00:59,760 --> 00:01:04,320
We've been really thrilled at the community adoption of LangChain so far.

17
00:01:04,320 --> 00:01:06,400
So look forward to sharing it with everyone

18
00:01:06,400 --> 00:01:08,960
here and look forward to seeing what people build with it.

19
00:01:08,960 --> 00:01:12,700
In fact, as a sign of LangChain's momentum,

20
00:01:12,700 --> 00:01:14,920
not only does it have numerous users,

21
00:01:14,920 --> 00:01:19,000
but there are also many hundreds of contributors to the open source.

22
00:01:19,000 --> 00:01:22,760
This has been instrumental for its rapid rate of development.

23
00:01:22,760 --> 00:01:26,440
This team really ships code and features at an amazing pace.

24
00:01:26,440 --> 00:01:28,960
So hopefully, after this short course,

25
00:01:28,960 --> 00:01:33,720
you'll be able to quickly put together some really cool applications using LangChain.

26
00:01:33,720 --> 00:01:36,280
And who knows, maybe you even decide to

27
00:01:36,280 --> 00:01:39,800
contribute back to the open source LangChain effort.

28
00:01:39,800 --> 00:01:45,360
LangChain is an open source development framework for building applications.

29
00:01:45,360 --> 00:01:47,280
We have two different packages,

30
00:01:47,280 --> 00:01:49,520
a Python one and a JavaScript one.

31
00:01:49,520 --> 00:01:52,600
They're focused on composition and modularity.

32
00:01:52,600 --> 00:01:54,960
So they have a lot of individual components that can be

33
00:01:54,960 --> 00:01:58,320
used in conjunction with each other or by themselves.

34
00:01:58,320 --> 00:02:00,080
And so that's one of the key value adds.

35
00:02:00,080 --> 00:02:03,680
And then the other key value add is a bunch of different use cases.

36
00:02:03,680 --> 00:02:07,280
So chains are ways of combining these modular components into

37
00:02:07,280 --> 00:02:09,680
more end-to-end applications and making it

38
00:02:09,680 --> 00:02:12,640
very easy to get started with those use cases.

39
00:02:12,640 --> 00:02:16,080
In this class, we'll cover the common components of LangChain.

40
00:02:16,080 --> 00:02:17,520
So we'll talk about models.

41
00:02:17,520 --> 00:02:19,560
We'll talk about prompts, which are how you get

42
00:02:19,560 --> 00:02:22,080
models to do useful and interesting things.

43
00:02:22,080 --> 00:02:23,480
We'll talk about indexes,

44
00:02:23,480 --> 00:02:27,920
which are ways of ingesting data so that you can combine it with models.

45
00:02:27,920 --> 00:02:29,480
And then we'll talk about chains,

46
00:02:29,480 --> 00:02:32,280
which are more end-to-end use cases along with agents,

47
00:02:32,280 --> 00:02:34,920
which are a very exciting type of end-to-end use case,

48
00:02:34,920 --> 00:02:37,680
which uses the model as a reasoning engine.

49
00:02:37,680 --> 00:02:40,320
We're also grateful to Ankush Gholar,

50
00:02:40,320 --> 00:02:44,600
who is the co-founder of LangChain alongside Harrison Chase,

51
00:02:44,600 --> 00:02:46,840
for also putting a lot of thoughts into

52
00:02:46,840 --> 00:02:50,480
these materials and helping with the creation of this short course.

53
00:02:50,480 --> 00:02:52,840
And on the deep learning.ai side,

54
00:02:52,840 --> 00:02:56,080
Jeff Ludwig, Eddie Hsu, and Diala Ezzedine,

55
00:02:56,080 --> 00:02:58,840
have also contributed to these materials.

56
00:02:58,840 --> 00:03:02,040
And so with that, let's go on to the next video where we'll learn

57
00:03:02,040 --> 00:03:21,040
about LangChain's models, prompts, and parsers.

--------------------------------------------------------------------------------
/LangChain_Intro.srt:
--------------------------------------------------------------------------------
1
00:00:00,000 --> 00:00:06,440
欢迎来到这个关于LangChain大语言模型应用开发的短期课程。
Welcome to this short course on

2
00:00:06,440 --> 00:00:10,040
通过提示llm或大型语言模型,
LangChain for large language model application development.

3
00:00:10,040 --> 00:00:13,480
现在可以比以往更快地开发AI应用程序。
By prompting an llm or large language model,

4
00:00:13,480 --> 00:00:18,600
但一个应用程序可能需要多次提示llm并解析输出。
it's now possible to develop AI applications much faster than ever before.

5
00:00:18,600 --> 00:00:24,800
因此需要编写大量的粘合代码。
But an application can require prompting an llm multiple times and parsing its output.

6
00:00:24,800 --> 00:00:28,220
Harrison Chase创建的LangChain使得这个开发过程更加容易。
So there's a lot of glue code that needs to be written.

7
00:00:28,220 --> 00:00:34,360
我很高兴有Harrison在这里,
LangChain created by Harrison Chase makes this development process much easier.

8
00:00:34,360 --> 00:00:36,280
他与deeplearning.ai合作建立了这个短期课程,教大家如何使用这个神奇的工具。
I'm thrilled to have Harrison here,

9
00:00:36,280 --> 00:00:38,920
感谢你的邀请。我很高兴能来这里。
who has built this short course in collaboration with

10
00:00:38,920 --> 00:00:42,680
LangChain起初是一个用于构建LLM应用程序的开源框架。
deeplearning.ai to teach how to use this amazing tool.

11
00:00:42,680 --> 00:00:45,400
谢谢你。我很高兴能来这里。
Thanks for having me. I'm really excited to be here.

12
00:00:45,400 --> 00:00:49,760
当我与该领域的一些人交谈时,他们正在构建更复杂的应用程序,并看到了
LangChain started as an open source framework for building LLM applications.

13
00:00:49,760 --> 00:00:53,360
一些共同的抽象,以及它们如何被开发。
It came about when I was talking to a bunch of folks in the field who were

14
00:00:53,360 --> 00:00:56,200
到目前为止,我们非常高兴地看到LangChain在社区中的采用。
building more complex applications and saw

15
00:00:56,200 --> 00:01:04,320
因此,期待与大家分享它,并期待看到人们用它构建的东西。
some common abstractions in terms of how they were being developed.

16
00:01:04,320 --> 00:01:06,400
事实上,作为LangChain动力的一个标志,
We've been really thrilled at the community adoption of LangChain so far.

17
00:01:06,400 --> 00:01:08,960
它不仅有众多用户,
So look forward to sharing it with everyone

18
00:01:08,960 --> 00:01:12,700
而且还有许多对开源的贡献者。
here and look forward to seeing what people build with it.

19
00:01:12,700 --> 00:01:19,000
这对于它的快速发展至关重要。
In fact, as a sign of LangChain's momentum,

20
00:01:19,000 --> 00:01:22,760
这个团队以惊人的速度推出代码和功能。
not only does it have numerous users,

21
00:01:22,760 --> 00:01:28,960
因此,希望在这个短期课程之后,
but there are also many hundreds of contributors to the open source.

22
00:01:28,960 --> 00:01:33,720
你将能够快速地使用LangChain组合一些非常酷的应用程序。
This has been instrumental for its rapid rate of development.

23
00:01:33,720 --> 00:01:36,280
谁知道,也许你甚至决定
This team really ships code and features at an amazing pace.

24
00:01:36,280 --> 00:01:39,800
回馈开源LangChain的努力。
So hopefully, after this short course,

25
00:01:39,800 --> 00:01:45,360
LangChain是一个用于构建应用程序的开源开发框架。
you'll be able to quickly put together some really cool applications using LangChain.

26
00:01:45,360 --> 00:01:47,280
我们有两个不同的包,
And who knows, maybe you even decide to

27
00:01:47,280 --> 00:01:49,520
一个是Python,一个是JavaScript。
contribute back to the open source LangChain effort.

28
00:01:49,520 --> 00:01:54,960
它们专注于组合和模块化。
LangChain is an open source development framework for building applications.

29
00:01:54,960 --> 00:01:58,320
因此,它们有许多单独的组件,可以一起使用或单独使用。
We have two different packages,

30
00:01:58,320 --> 00:02:00,080
因此,这是其中一个关键的附加值。
a Python one and a JavaScript one.

31
00:02:00,080 --> 00:02:03,680
另一个关键的附加值是许多不同的用例。
They're focused on composition and modularity.

32
00:02:03,680 --> 00:02:07,280
因此,将这些模块化组件组合成链式方式,形成更多端到端的应用程序,并使其易于开始使用这些用例。
So they have a lot of individual components that can be

33
00:02:07,280 --> 00:02:09,680
在本课程中,我们将介绍LangChain的常见组件。
used in conjunction with each other or by themselves.

34
00:02:09,680 --> 00:02:12,640
因此,我们将讨论模型。
And so that's one of the key value adds.

35
00:02:12,640 --> 00:02:16,080
我们将讨论提示,这是您使模型执行有用和有趣操作的方式。
And then the other key value add is a bunch of different use cases.

36
00:02:16,080 --> 00:02:17,520
我们将讨论索引,
So chains are ways of combining these modular components into

37
00:02:17,520 --> 00:02:19,560
这是一种摄取数据的方式,以便您可以将其与模型结合使用。
more end-to-end applications and making it

38
00:02:19,560 --> 00:02:22,080
然后,我们将讨论链式,
very easy to get started with those use cases.

39
00:02:22,080 --> 00:02:27,920
这是更多的端到端用例,以及代理人,
In this class, we'll cover the common components of LangChain.

40
00:02:27,920 --> 00:02:29,480
这是一种非常令人兴奋的端到端用例,
So we'll talk about models.

41
00:02:29,480 --> 00:02:32,280
它使用模型作为推理引擎。
We'll talk about prompts, which are how you get

42
00:02:32,280 --> 00:02:34,920
我们还感谢Ankush Gholar,
models to do useful and interesting things.

43
00:02:34,920 --> 00:02:37,680
他是LangChain的联合创始人,与Harrison Chase一起,
We'll talk about indexes,

44
00:02:37,680 --> 00:02:40,320
也为这些材料投入了很多思考,并帮助创建了这个短期课程。
which are ways of ingesting data so that you can combine it with models.

45
00:02:40,320 --> 00:02:44,600
在deep learning.ai方面,
And then we'll talk about chains,

46
00:02:44,600 --> 00:02:46,840
Jeff Ludwig,Eddie Hsu和Diala Ezzedine,
which are more end-to-end use cases along with agents,

47
00:02:46,840 --> 00:02:50,480
也为这些材料做出了贡献。
which are a very exciting type of end-to-end use case,

48
00:02:50,480 --> 00:02:52,840
因此,让我们进入下一个视频,了解LangChain的模型,提示和解析器。
which uses the model as a reasoning engine.

--------------------------------------------------------------------------------
/chinese/LangChain_L6.srt:
--------------------------------------------------------------------------------

1
00:00:00,000 --> 00:00:08,920
有时人们认为大型语言模型是一个知识库,

2
00:00:08,920 --> 00:00:11,920
好像它已经学会了记忆大量信息,

3
00:00:11,920 --> 00:00:14,880
也许是从互联网上获取的,所以当你问它一个问题时,

4
00:00:14,880 --> 00:00:16,380
它可以回答这个问题。

5
00:00:16,380 --> 00:00:19,340
但我认为,将大型语言模型视为推理引擎更加有用,

6
00:00:19,340 --> 00:00:22,980
你可以给它一些文本块或其他信息来源。

7
00:00:22,980 --> 00:00:27,140
然后大型语言模型,LLM,

8
00:00:27,140 --> 00:00:29,460
可能会使用从互联网上学到的背景知识,

9
00:00:29,460 --> 00:00:33,000
但是使用你提供的新信息来帮助你回答问题或推理内容或甚至决定下一步该做什么。

10
00:00:33,000 --> 00:00:36,620
这就是LangChain的代理框架帮助你做的事情。

11
00:00:36,620 --> 00:00:41,100
代理可能是我最喜欢的LangChain部分。

12
00:00:41,100 --> 00:00:45,180
我认为它们也是最强大的部分之一,

13
00:00:45,180 --> 00:00:48,340
但它们也是最新的部分之一。

14
00:00:48,340 --> 00:00:50,320
我们正在看到很多新的东西出现在这里,对于该领域的每个人来说都是新的。

15
00:00:50,320 --> 00:00:52,140
这应该是一个非常令人兴奋的课程,因为我们深入探讨

16
00:00:52,140 --> 00:00:56,020
代理是什么,如何创建代理,

17
00:00:56,020 --> 00:00:58,940
以及如何使用代理,

18
00:00:58,940 --> 00:01:01,180
如何为它们配备不同类型的工具,如

19
00:01:01,180 --> 00:01:02,500
内置于LangChain中的搜索引擎,

20
00:01:02,500 --> 00:01:04,860
以及如何创建自己的工具,以便让代理与

21
00:01:04,860 --> 00:01:07,180
任何数据存储,任何API,

22
00:01:07,180 --> 00:01:11,480
任何你想让它们与之交互的函数。

23
00:01:11,480 --> 00:01:14,780
这是令人兴奋的前沿技术,

24
00:01:14,780 --> 00:01:16,860
但已经出现了一些重要的用例。

25
00:01:16,860 --> 00:01:19,460
因此,让我们开始吧。

26
00:01:19,460 --> 00:01:23,060
要开始使用代理,

27
00:01:23,060 --> 00:01:25,620
我们将像往常一样导入正确的环境变量。

28
00:01:25,620 --> 00:01:27,500
我们还需要安装一些软件包。

29
00:01:27,500 --> 00:01:32,420
因此,我们将使用DuckDuckGo搜索引擎和维基百科。

30
00:01:32,420 --> 00:01:35,100
因此,我们将要安装这些。

31
00:01:35,100 --> 00:01:39,020
我已经在我的环境中安装了这些,所以我不会运行这行。

32
00:01:39,020 --> 00:01:40,780
但如果你们没有,

36
00:01:46,360 --> 00:01:48,580
你应该取消注释那一行,

37
00:01:48,580 --> 00:01:51,300
运行它,然后你就可以开始了。

38
00:01:51,300 --> 00:01:56,060
然后我们将从LangChain导入一些我们需要的方法和类。

39
00:01:56,060 --> 00:01:59,060
所以我们要导入一些加载工具的方法,

40
00:01:59,060 --> 00:02:02,340
这些是我们将连接语言模型的东西。

41
00:02:02,340 --> 00:02:05,020
我们将加载一个初始化代理的方法。

42
00:02:05,020 --> 00:02:07,820
我们将加载聊天Open AI包装器,

43
00:02:07,820 --> 00:02:09,500
我们将加载代理类型。

44
00:02:09,500 --> 00:02:14,220
所以代理类型将用于指定我们要使用的代理类型。

45
00:02:14,220 --> 00:02:16,540
LangChain中有许多不同类型的代理。

46
00:02:16,540 --> 00:02:18,780
我们现在不会详细介绍所有这些。

47
00:02:18,780 --> 00:02:21,420
我们将选择一个并运行它。

48
00:02:21,420 --> 00:02:24,700
然后我们将初始化我们要使用的语言模型。

49
00:02:24,700 --> 00:02:30,500
同样,我们将使用它作为我们用来驱动代理的推理引擎。

50
00:02:30,500 --> 00:02:33,740
然后我们将加载我们要使用的工具。

51
00:02:33,740 --> 00:02:37,020
所以我们将加载DuckDuckGo搜索和维基百科,

52
00:02:37,020 --> 00:02:40,140
这些都是内置的LangChain工具。

53
00:02:40,140 --> 00:02:42,980
最后,我们将初始化代理。

54
00:02:42,980 --> 00:02:44,780
我们将传递工具,

55
00:02:44,780 --> 00:02:47,700
语言模型和代理类型。
211 | 56 212 | 00:02:47,700 --> 00:02:49,340 213 | 所以这里我们使用聊天, 214 | 215 | 57 216 | 00:02:49,340 --> 00:02:51,460 217 | 零射击,反应,描述。 218 | 219 | 58 220 | 00:02:51,460 --> 00:02:54,060 221 | 我不会详细介绍这意味着什么。 222 | 223 | 59 224 | 00:02:54,060 --> 00:02:56,220 225 | 需要注意的重要事项是聊天。 226 | 227 | 60 228 | 00:02:56,220 --> 00:03:00,540 229 | 这是针对聊天模型进行优化的,然后是反应。 230 | 231 | 61 232 | 00:03:00,540 --> 00:03:05,620 233 | 反应是一种提示策略,可以从语言模型中引出更好的想法。 234 | 235 | 62 236 | 00:03:05,620 --> 00:03:09,220 237 | 我们还将设置处理解析错误等于true。 238 | 239 | 63 240 | 00:03:09,220 --> 00:03:11,620 241 | 如果您还记得第一课, 242 | 243 | 64 244 | 00:03:11,620 --> 00:03:17,140 245 | 我们谈论了输出解析器以及如何使用它们将LLM输出, 246 | 247 | 65 248 | 00:03:17,140 --> 00:03:22,060 249 | 这是一个字符串,并将其解析为我们可以在下游使用的特定格式。 250 | 251 | 66 252 | 00:03:22,060 --> 00:03:23,740 253 | 这在这里非常重要。 254 | 255 | 67 256 | 00:03:23,740 --> 00:03:25,620 257 | 当我们将LLM的输出, 258 | 259 | 68 260 | 00:03:25,620 --> 00:03:28,940 261 | 这是文本,并将其解析为特定的操作, 262 | 263 | 69 264 | 00:03:28,940 --> 00:03:32,700 265 | 以及语言模型应该采取的特定操作输入。 266 | 267 | 70 268 | 00:03:32,700 --> 00:03:34,300 269 | 现在让我们使用这个代理。 270 | 271 | 71 272 | 00:03:34,300 --> 00:03:38,940 273 | 让我们问一个关于最近事件的问题,这个模型在训练时不知道。 274 | 275 | 72 276 | 00:03:38,940 --> 00:03:41,060 277 | 所以让我们问一下2022年世界杯的情况。 278 | 279 | 73 280 | 00:03:41,060 --> 00:03:43,860 281 | 这里的模型是根据2021年左右的数据进行训练的。 282 | 283 | 74 284 | 00:03:43,860 --> 00:03:47,660 285 | 所以它不应该知道这个问题的答案。 286 | 287 | 75 288 | 00:03:47,660 --> 00:03:49,820 289 | 因此,它应该意识到需要使用工具来查找这个最近的信息。 290 | 291 | 76 292 | 00:03:49,820 --> 00:03:55,580 293 | 所以我们可以看到这里的代理意识到它需要使用DuckDuckGo搜索,然后查找2022年世界杯的获胜者。 294 | 295 | 77 296 | 00:04:05,340 --> 00:04:10,620 297 | 因此,它得到了一些信息。 298 | 299 | 78 300 | 00:04:10,620 --> 00:04:14,900 301 | 然后我们可以看到代理认为2022年世界杯还没有发生。 302 | 303 | 79 304 | 00:04:14,900 --> 00:04:18,060 305 | 所以这是一个很好的例子,说明代理仍然具有探索性。 306 | 307 | 80 308 | 00:04:18,060 --> 00:04:23,940 309 | 我们可以看到这里有很多关于2022年世界杯的信息,但它并没有完全意识到所有的事情都已经发生了。 310 | 311 | 81 312 | 00:04:23,940 --> 
00:04:28,060 313 | 因此,它需要查找更多的信息。 314 | 315 | 82 316 | 00:04:28,060 --> 00:04:32,020 317 | 然后基于这些信息,它可以回答正确的答案,即阿根廷赢得了2022年世界杯。 318 | 319 | 87 320 | 00:04:47,380 --> 00:04:52,500 321 | 然后让我们问一个问题,它应该意识到需要使用维基百科。 322 | 323 | 88 324 | 00:04:52,500 --> 00:04:58,740 325 | 维基百科有很多关于特定人物和特定实体的信息,这些信息可以是很久以前的,不需要是当前的信息。 326 | 327 | 89 328 | 00:04:58,740 --> 00:05:02,980 329 | 所以让我们问一下美国计算机科学家Tom M. Mitchell写了哪本书。 330 | 331 | 90 332 | 00:05:02,980 --> 00:05:06,540 333 | 我们可以看到它意识到应该使用维基百科来查找答案。 334 | 335 | 91 336 | 00:05:06,540 --> 00:05:08,420 337 | 所以它搜索Tom M. Mitchell维基百科。 338 | 339 | 92 340 | 00:05:08,420 --> 00:05:12,700 341 | 然后再进行另一个跟进搜索,以确保它得到了正确的答案。 342 | 343 | 93 344 | 00:05:12,700 --> 00:05:16,020 345 | 所以它搜索Tom M. Mitchell机器学习,并得到更多的信息。 346 | 347 | 94 348 | 00:05:16,020 --> 00:05:19,460 349 | 然后基于这些信息,它最终能够回答Tom M. Mitchell写的教科书是《机器学习》。 350 | 351 | 98 352 | 00:05:29,660 --> 00:05:33,580 353 | 你可以在这里暂停视频,尝试输入不同的内容。 354 | 355 | 99 356 | 00:05:33,580 --> 00:05:38,380 357 | 到目前为止,我们已经使用了LinkedIn中预定义的工具。 358 | 359 | 100 360 | 00:05:38,380 --> 00:05:42,820 361 | 但代理的一个重要功能是可以将其连接到您自己的信息源、API和数据。 362 | 363 | 101 364 | 00:05:42,820 --> 00:05:45,100 365 | 您可以创建一个自定义工具,将其连接到您想要的任何内容。 366 | 367 | 102 368 | 00:05:45,100 --> 00:05:47,700 369 | 现在我们来创建一个工具,它将告诉我们当前的日期。 370 | 371 | 103 372 | 00:05:47,700 --> 00:05:50,700 373 | 首先,我们要导入这个工具装饰器。 374 | 375 | 106 376 | 00:05:57,500 --> 00:06:03,100 377 | 接下来,我们将编写一个名为“time”的函数,它接受任何文本字符串。 378 | 379 | 109 380 | 00:06:09,900 --> 00:06:15,540 381 | 它将通过调用日期时间返回今天的日期。 382 | 383 | 110 384 | 00:06:15,540 --> 00:06:20,660 385 | 除了函数的名称,我们还将编写一个非常详细的文档字符串。 386 | 387 | 111 388 | 00:06:20,660 --> 00:06:25,100 389 | 这是代理将用来知道何时调用此工具以及如何调用此工具的方式。 390 | 391 | 113 392 | 00:06:28,500 --> 00:06:32,060 393 | 例如,在这里,我们说输入应始终为空字符串。 394 | 395 | 116 396 | 00:06:37,460 --> 00:06:42,940 397 | 如果我们对输入有更严格的要求,例如,如果我们有一个应始终接受搜索查询或SQL语句的函数, 398 | 399 | 118 400 | 00:06:47,340 --> 00:06:49,060 401 | 现在我们将创建另一个代理。 402 | 403 | 119 404 | 00:06:49,060 --> 
00:06:55,660 405 | 这次,我们将时间工具添加到现有工具列表中。 406 | 407 | 121 408 | 00:07:03,660 --> 00:07:08,140 409 | 它识别出需要使用时间工具,并在此指定。 410 | 411 | 126 412 | 00:07:18,740 --> 00:07:22,540 413 | 今天的日期是2023年5月21日。 414 | 415 | 128 416 | 00:07:26,860 --> 00:07:29,340 417 | 这就是代理的全部内容。 418 | 419 | 129 420 | 00:07:29,340 --> 00:07:34,740 421 | 这是LangChain中较新、更令人兴奋和更具实验性的部分之一。 422 | 423 | 130 424 | 00:07:34,740 --> 00:07:36,540 425 | 希望您喜欢使用它。 426 | 427 | 131 428 | 00:07:36,540 --> 00:07:40,540 429 | 希望它向您展示了如何使用语言模型作为推理引擎 430 | 431 | 132 432 | 00:07:40,540 --> 00:08:00,540 433 | 以执行不同的操作并连接到其他功能和数据源。 -------------------------------------------------------------------------------- /chinese/LangChain_L2.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:00,000 --> 00:00:18,000 5 | 当你与这些模型交互时,它们自然而然地不会记得你之前说过的话或任何以前的对话,这在构建一些应用程序(如聊天机器人)并希望与它们进行对话时是一个问题。 6 | 7 | 2 8 | 00:00:18,000 --> 00:00:31,000 9 | 因此,在本节中,我们将介绍记忆,即如何记住先前对话的部分并将其馈入语言模型中,以便在与它们交互时具有这种对话流。 10 | 11 | 3 12 | 00:00:31,000 --> 00:00:38,000 13 | 没错。因此,LangChain提供了多种复杂的选项来管理这些记忆。让我们跳进来看看。 14 | 15 | 4 16 | 00:00:38,000 --> 00:00:48,000 17 | 因此,让我首先导入我的OpenAI API密钥,然后让我导入我需要的一些工具。 18 | 19 | 5 20 | 00:00:48,000 --> 00:00:55,000 21 | 让我们以使用LangChain来管理聊天或聊天机器人对话为记忆的动机示例。 22 | 23 | 6 24 | 00:00:55,000 --> 00:01:09,000 25 | 因此,为此,我将把llm设置为OpenAI的聊天界面,温度为零。我将使用内存作为对话缓冲区内存。 26 | 27 | 7 28 | 00:01:09,000 --> 00:01:15,000 29 | 稍后您将看到这意味着什么。我将构建一个对话链。 30 | 31 | 8 32 | 00:01:15,000 --> 00:01:26,000 33 | 同样,在这个短期课程中,Harrison将更深入地探讨LangChain中的链是什么。所以现在不要太担心语法的细节。 34 | 35 | 9 36 | 00:01:26,000 --> 00:01:36,000 37 | 但是这构建了一个llm。如果我开始交谈,conversation.predict,给出输入。嗨,我的名字是安德鲁。 38 | 39 | 10 40 | 00:01:36,000 --> 00:01:47,000 41 | 让我们看看它说什么。你好,安德鲁,很高兴见到你。对吧?等等。然后让我们说我问它,1加1等于多少?
42 | 43 | 11 44 | 00:01:47,000 --> 00:01:55,000 45 | 嗯,1加1等于2。然后再问一遍,你知道我的名字是什么吗?你的名字是安德鲁,正如你之前提到的那样。 46 | 47 | 12 48 | 00:01:55,000 --> 00:02:06,000 49 | 嗯,那里有很多讽刺的痕迹。不确定。因此,如果您愿意,可以将此verbose变量更改为true,以查看LangChain实际上正在做什么。 50 | 51 | 13 52 | 00:02:06,000 --> 00:02:11,000 53 | 当您运行predict,hi,my name is Andrew时,这是LangChain正在生成的提示。 54 | 55 | 14 56 | 00:02:11,000 --> 00:02:16,000 57 | 它说,以下是人类和AI之间友好的对话,AI健谈等等。 58 | 59 | 15 60 | 00:02:16,000 --> 00:02:26,000 61 | 因此,这是LangChain生成的提示,以使系统进行希望和友好的对话,并且必须保存对话,这是响应。 62 | 63 | 16 64 | 00:02:26,000 --> 00:02:35,000 65 | 当您在第二和第三部分对话上执行此操作时,它会保留提示如下。 66 | 67 | 17 68 | 00:02:35,000 --> 00:02:43,000 69 | 请注意,到我说出“我的名字是什么?”这是第三轮,那是我的第三个输入。 70 | 71 | 18 72 | 00:02:43,000 --> 00:02:50,000 73 | 它已将当前对话存储如下。嗨,我的名字是安德鲁,一加一等于多少,等等。 74 | 75 | 19 76 | 00:02:50,000 --> 00:02:57,000 77 | 因此,这个对话的记忆或历史变得越来越长。 78 | 79 | 20 80 | 00:02:57,000 --> 00:03:02,000 81 | 实际上,在顶部,我使用了内存变量来存储内存。 82 | 83 | 21 84 | 00:03:02,000 --> 00:03:08,000 85 | 因此,如果我要打印memory.buffer,它已经存储了到目前为止的对话。 86 | 87 | 22 88 | 00:03:08,000 --> 00:03:14,000 89 | 您还可以打印出这个,memory.loadMemoryVariables。 90 | 91 | 23 92 | 00:03:14,000 --> 00:03:18,000 93 | 这里的花括号实际上是一个空字典。 94 | 95 | 24 96 | 00:03:18,000 --> 00:03:25,000 97 | 有一些更高级的功能,您可以使用更复杂的输入,但我们不会在这个短期课程中讨论它们。 98 | 99 | 25 100 | 00:03:25,000 --> 00:03:28,000 101 | 所以不要担心为什么这里有一个空的花括号。 102 | 103 | 26 104 | 00:03:28,000 --> 00:03:33,000 105 | 但这就是LangChain到目前为止在对话记忆中记住的一切。 106 | 107 | 27 108 | 00:03:33,000 --> 00:03:38,000 109 | 这只是AI或人类说的一切。 110 | 111 | 28 112 | 00:03:38,000 --> 00:03:41,000 113 | 我鼓励您暂停视频并运行代码。 114 | 115 | 29 116 | 00:03:41,000 --> 00:03:49,000 117 | 因此,LangChain存储对话的方式是使用这个对话缓冲区内存。 118 | 119 | 30 120 | 00:03:49,000 --> 00:03:55,000 121 | 如果我使用对话缓冲区内存来指定一些输入和输出, 122 | 123 | 31 124 | 00:03:55,000 --> 00:03:59,000 125 | 如果您希望明确地这样做,这是添加新内容到内存中的方法。 126 | 127 | 32 128 | 00:03:59,000 --> 00:04:03,000 129 | Memory.saveContext说,嗨,怎么样? 
130 | 131 | 33 132 | 00:04:03,000 --> 00:04:09,000 133 | 我知道这不是最令人兴奋的对话,但我想举一个简短的例子。 134 | 135 | 34 136 | 00:04:09,000 --> 00:04:15,000 137 | 嗯,这就是内存现在的状态。 138 | 139 | 35 140 | 00:04:15,000 --> 00:04:22,000 141 | 再次,让我显示一下内存变量。 142 | 143 | 36 144 | 00:04:22,000 --> 00:04:29,000 145 | 现在,如果您想向内存添加其他数据,您可以继续保存其他上下文。 146 | 147 | 37 148 | 00:04:29,000 --> 00:04:34,000 149 | 因此,对话继续,没什么,只是闲逛,很酷。 150 | 151 | 38 152 | 00:04:34,000 --> 00:04:38,000 153 | 如果您打印出内存,您会发现现在有更多的东西。 154 | 155 | 39 156 | 00:04:38,000 --> 00:04:46,000 157 | 因此,当您使用大型语言模型进行聊天对话时,大型语言模型本身实际上是无状态的。 158 | 159 | 40 160 | 00:04:46,000 --> 00:04:51,000 161 | 语言模型本身不记得到目前为止的对话。 162 | 163 | 41 164 | 00:04:51,000 --> 00:04:55,000 165 | 每个交易,每个调用API端点都是独立的。 166 | 167 | 42 168 | 00:04:55,000 --> 00:05:07,000 169 | 聊天机器人似乎只有记忆,因为通常会提供完整的对话作为上下文,以提供给LLM。 170 | 171 | 43 172 | 00:05:07,000 --> 00:05:14,000 173 | 因此,内存可以明确地存储到目前为止的对话轮次或话语。 174 | 175 | 44 176 | 00:05:14,000 --> 00:05:16,000 177 | 嗨,我叫安德鲁。你好,很高兴认识你等等。 178 | 179 | 45 180 | 00:05:16,000 --> 00:05:30,000 181 | 这个内存存储器被用作输入或附加上下文到LLM中,以便它可以生成一个输出,就好像它只是在进行下一轮对话,知道之前说过什么。 182 | 183 | 46 184 | 00:05:30,000 --> 00:05:37,000 185 | 随着对话变得越来越长,所需的内存量也变得非常长。 186 | 187 | 47 188 | 00:05:37,000 --> 00:05:46,000 189 | 因此,将大量的令牌发送到LLM的成本,通常是基于它需要处理的令牌数量而收费,也会变得更加昂贵。 190 | 191 | 48 192 | 00:05:46,000 --> 00:05:54,000 193 | 因此,LangChain提供了几种方便的内存来存储和累积对话。 194 | 195 | 49 196 | 00:05:54,000 --> 00:06:00,000 197 | 到目前为止,我们一直在看对话缓冲区内存。让我们看看另一种类型的内存。 198 | 199 | 50 200 | 00:06:00,000 --> 00:06:09,000 201 | 我将导入只保留一个窗口内存的对话缓冲区窗口内存。 202 | 203 | 51 204 | 00:06:09,000 --> 00:06:20,000 205 | 如果我将内存设置为具有k等于1的对话缓冲区窗口内存,则变量k等于1指定我只想记住一个对话交换。 206 | 207 | 52 208 | 00:06:20,000 --> 00:06:25,000 209 | 也就是我和聊天机器人的一次发言。 210 | 211 | 53 212 | 00:06:25,000 --> 00:06:31,000 213 | 所以现在,如果我让它保存上下文,嗨,怎么样,没什么,只是闲逛。 214 | 215 | 54 216 | 00:06:31,000 --> 00:06:38,000 217 | 如果我查看memory.loadMemoryVariables,它只记住最近的话语。 218 | 219 | 55 220 | 00:06:38,000 --> 00:06:45,000 221 | 请注意,它已经删除了。嗨,怎么样?它只是说,人类说没什么,只是闲逛,AI说很酷。 222 |
223 | 56 224 | 00:06:45,000 --> 00:06:48,000 225 | 所以这是一个很好的功能,因为它可以让你跟踪最近的几轮对话。 226 | 227 | 57 228 | 00:06:48,000 --> 00:06:56,000 229 | 在实践中,您可能不会使用k等于1。您将使用k设置为更大的数字。 230 | 231 | 58 232 | 00:06:56,000 --> 00:07:03,000 233 | 但是,这仍然可以防止随着对话的进行,内存无限增长。 234 | 235 | 59 236 | 00:07:03,000 --> 00:07:10,000 237 | 所以,如果我重新运行我们刚才的对话,我们会说,嗨,我叫安德鲁。 238 | 239 | 60 240 | 00:07:10,000 --> 00:07:23,000 241 | 1加1等于多少?现在我问它,我的名字是什么? 242 | 243 | 61 244 | 00:07:23,000 --> 00:07:32,000 245 | 因为k等于1,它只记得最后一次交流,而不是1加1等于什么? 246 | 247 | 62 248 | 00:07:32,000 --> 00:07:37,000 249 | 答案是1加1等于2,它已经忘记了这个早期的交流,现在说, 250 | 251 | 63 252 | 00:07:37,000 --> 00:07:42,000 253 | 抱歉,我无法访问那些信息。 254 | 255 | 64 256 | 00:07:42,000 --> 00:07:46,000 257 | 我希望你能做的一件事是暂停视频,在左侧的代码中将其更改为true, 258 | 259 | 66 260 | 00:07:53,000 --> 00:07:57,000 261 | 并使用verbose等于true重新运行此对话。 262 | 263 | 67 264 | 00:07:57,000 --> 00:08:00,000 265 | 然后您将看到实际用于生成此内容的提示。 266 | 267 | 68 268 | 00:08:00,000 --> 00:08:08,000 269 | 希望您能看到当您在调用LLM时,询问“我的名字是什么”时, 270 | 271 | 69 272 | 00:08:08,000 --> 00:08:11,000 273 | 内存已删除了我学习“我的名字是什么”的交换, 274 | 275 | 70 276 | 00:08:11,000 --> 00:08:17,000 277 | 这就是为什么现在它说不知道我的名字。 278 | 279 | 71 280 | 00:08:17,000 --> 00:08:28,000 281 | 使用对话令牌缓冲器内存,内存将限制保存的令牌数量。 282 | 283 | 72 284 | 00:08:28,000 --> 00:08:39,000 285 | 由于LLM定价的很多是基于令牌的,因此这更直接地映射到LLM调用的成本。 286 | 287 | 73 288 | 00:08:39,000 --> 00:08:47,000 289 | 因此,如果我说最大令牌限制等于50,实际上让我注入一些评论。 290 | 291 | 74 292 | 00:08:47,000 --> 00:08:51,000 293 | 所以让我们说,对话是,AI是什么?惊人。 294 | 295 | 75 296 | 00:08:51,000 --> 00:08:54,000 297 | 反向传播是什么?美丽。聊天机器人是什么?迷人。 298 | 299 | 76 300 | 00:08:54,000 --> 00:08:58,000 301 | 我使用ABC作为所有这些对话轮次的第一个字母。 302 | 303 | 77 304 | 00:08:58,000 --> 00:09:02,000 305 | 我们可以跟踪,嗯,什么时候说了什么。 306 | 307 | 78 308 | 00:09:02,000 --> 00:09:08,000 309 | 如果我使用高令牌限制运行它,它几乎包含了整个对话。 310 | 311 | 79 312 | 00:09:08,000 --> 00:09:14,000 313 | 如果我将令牌限制增加到100,它现在包含了整个对话。 314 | 315 | 80 316 | 00:09:14,000 --> 00:09:24,000 317 |
所以我有AI是什么?如果我减少它,那么,您知道,它会切掉这个对话的早期部分 318 | 319 | 81 320 | 00:09:24,000 --> 00:09:28,000 321 | 以保留与最近的交流相对应的令牌数量, 322 | 323 | 82 324 | 00:09:28,000 --> 00:09:32,000 325 | 但不超过令牌限制。 326 | 327 | 83 328 | 00:09:32,000 --> 00:09:35,000 329 | 如果您想知道为什么我们需要指定LLM, 330 | 331 | 84 332 | 00:09:35,000 --> 00:09:39,000 333 | 那是因为不同的LLM使用不同的令牌计数方式。 334 | 335 | 85 336 | 00:09:39,000 --> 00:09:46,000 337 | 因此,这告诉它使用聊天OpenAI LLM使用的令牌计数方式。 338 | 339 | 86 340 | 00:09:46,000 --> 00:09:49,000 341 | 我鼓励您暂停视频并运行代码, 342 | 343 | 87 344 | 00:09:49,000 --> 00:09:54,000 345 | 并尝试修改提示以查看是否可以获得不同的输出。 346 | 347 | 88 348 | 00:09:54,000 --> 00:09:58,000 349 | 最后,我想在这里说明的最后一种记忆类型是对话摘要缓冲区内存。 350 | 351 | 89 352 | 00:10:04,000 --> 00:10:12,000 353 | 这个想法是,不是将内存限制为基于最近话语的固定数量的令牌 354 | 355 | 90 356 | 00:10:12,000 --> 00:10:15,000 357 | 或固定数量的对话交换, 358 | 359 | 91 360 | 00:10:15,000 --> 00:10:24,000 361 | 让我们使用LLM编写对话摘要,并让其成为内存。 362 | 363 | 00:10:24,000 --> 00:10:29,000 364 | 这里有一个例子,我将创建一个长字符串,其中包含某人的日程安排。 365 | 366 | 00:10:29,000 --> 00:10:31,000 367 | 你知道,早上8点与产品团队开会, 368 | 369 | 00:10:31,000 --> 00:10:33,000 370 | 你需要准备好你的PowerPoint演示文稿等等。 371 | 372 | 00:10:33,000 --> 00:10:38,000 373 | 所以这是一个长字符串,说出你的日程安排,你知道的, 374 | 375 | 00:10:38,000 --> 00:10:42,000 376 | 也许以与客户在意大利餐厅的午餐结束, 377 | 378 | 00:10:42,000 --> 00:10:46,000 379 | 带上你的笔记本电脑,展示LLM,最新的LLM演示。 380 | 381 | 00:10:46,000 --> 00:10:53,000 382 | 所以让我使用一个对话摘要缓冲区内存,嗯, 383 | 384 | 00:10:53,000 --> 00:10:58,000 385 | 在这种情况下,最大令牌限制为400,令牌限制相当高。 386 | 387 | 00:10:58,000 --> 00:11:05,000 388 | 我将插入几轮对话,以我们开始的方式, 389 | 390 | 00:11:05,000 --> 00:11:10,000 391 | 你好,怎么了,没什么,只是闲逛,嗯,酷。 392 | 393 | 00:11:10,000 --> 00:11:17,000 394 | 然后今天的日程安排是什么,回答是,你知道,那个长长的日程安排。 395 | 396 | 00:11:17,000 --> 00:11:22,000 397 | 所以这个内存现在有相当多的文本。 398 | 399 | 00:11:22,000 --> 00:11:26,000 400 | 事实上,让我们看看内存变量。 401 | 402 | 00:11:26,000 --> 00:11:37,000 403 | 它包含整个文本,因为400个令牌足以存储所有这些文本。 404 | 405 | 00:11:37,000 --> 00:11:43,000 406 | 但是现在,如果我将最大令牌限制减少,比如将其减少到100个令牌, 407 | 408 |
00:11:43,000 --> 00:11:46,000 409 | 请记住,这存储了整个对话历史记录。 410 | 411 | 00:11:46,000 --> 00:11:50,000 412 | 如果我将令牌数减少到100个, 413 | 414 | 00:11:50,000 --> 00:11:57,000 415 | 那么对话摘要缓冲区内存实际上已经使用了LLM, 416 | 417 | 00:11:57,000 --> 00:12:01,000 418 | 在这种情况下,我们已经将LLM设置为OpenAI端点, 419 | 420 | 00:12:01,000 --> 00:12:05,000 421 | 以生成到目前为止对话的摘要。 422 | 423 | 00:12:05,000 --> 00:12:09,000 424 | 因此,摘要是人工智能在计划日程之前进行了闲聊, 425 | 426 | 00:12:09,000 --> 00:12:12,000 427 | 并在早晨会议上通知人类,等等, 428 | 429 | 00:12:12,000 --> 00:12:17,000 430 | 午餐会议与对人工智能感兴趣的客户, 431 | 432 | 00:12:17,000 --> 00:12:28,000 433 | 最新的人工智能发展。如果我们使用这个LLM进行对话, 434 | 435 | 00:12:28,000 --> 00:12:33,000 436 | 然后创建一个对话链,与之前相同。 437 | 438 | 00:12:33,000 --> 00:12:41,000 439 | 如果我们问,你知道什么是一个好的演示文稿吗? 440 | 441 | 00:12:41,000 --> 00:12:43,000 442 | 嗯,我说verbose=true。 443 | 444 | 00:12:43,000 --> 00:12:53,000 445 | 所以这里是提示,LLM认为当前的对话已经讨论过这个问题了, 446 | 447 | 00:12:53,000 --> 00:12:56,000 448 | 因为这是对话的摘要。 449 | 450 | 122 451 | 00:12:56,000 --> 00:13:03,000 452 | 还有一点需要注意,如果你熟悉OpenAI聊天API端点, 453 | 454 | 123 455 | 00:13:03,000 --> 00:13:07,000 456 | 有一个特定的系统消息。 457 | 458 | 124 459 | 00:13:07,000 --> 00:13:11,000 460 | 在这个例子中,这并没有使用官方的OpenAI系统消息。 461 | 462 | 125 463 | 00:13:11,000 --> 00:13:14,000 464 | 它只是将其作为提示的一部分包含在内。 465 | 466 | 126 467 | 00:13:14,000 --> 00:13:16,000 468 | 但它仍然运行得相当不错。 469 | 470 | 127 471 | 00:13:16,000 --> 00:13:21,000 472 | 鉴于这个提示,你知道,LLM输出基本的对AI发展感兴趣的客户, 473 | 474 | 128 475 | 00:13:21,000 --> 00:13:24,000 476 | 或者建议展示我们最新的NLP能力。 477 | 478 | 129 479 | 00:13:24,000 --> 00:13:26,000 480 | 好的,很酷。 481 | 482 | 130 483 | 00:13:26,000 --> 00:13:31,000 484 | 嗯,它正在做一些向酷炫演示的建议, 485 | 486 | 131 487 | 00:13:31,000 --> 00:13:35,000 488 | 并让你想到如果我遇到一个客户,我会说, 489 | 490 | 132 491 | 00:13:35,000 --> 00:13:43,000 492 | 天哪,如果有一个开源框架可以帮助我使用LLMs构建酷炫的NLP应用程序。 493 | 494 | 133 495 | 00:13:43,000 --> 00:13:46,000 496 | 好事正在发生。 497 | 498 | 134 499 | 00:13:46,000 --> 00:13:58,000 500 | 有趣的是,如果你现在看看记忆发生了什么。 501 | 502 | 135 503 | 00:13:58,000 -->
00:14:04,000 504 | 请注意,这里已经合并了最近的AI系统输出, 505 | 506 | 136 507 | 00:14:04,000 --> 00:14:11,000 508 | 而我的话问它是否是一个好的演示已经被合并到系统消息中。 509 | 510 | 137 511 | 00:14:11,000 --> 00:14:14,000 512 | 你知道,到目前为止的对话总结。 513 | 514 | 138 515 | 00:14:14,000 --> 00:14:17,000 516 | 通过对话摘要缓冲区内存, 517 | 518 | 139 519 | 00:14:17,000 --> 00:14:27,000 520 | 它试图保持消息的显式存储,直到我们指定的令牌数为止。 521 | 522 | 140 523 | 00:14:27,000 --> 00:14:30,000 524 | 所以,你知道,这部分,显式存储, 525 | 526 | 141 527 | 00:14:30,000 --> 00:14:34,000 528 | 我们试图将其限制在100个令牌,因为这是我们要求的。 529 | 530 | 142 531 | 00:14:34,000 --> 00:14:38,000 532 | 然后,任何超过这个限制的内容,它都将使用LLM生成一个摘要, 533 | 534 | 143 535 | 00:14:38,000 --> 00:14:41,000 536 | 这就是上面看到的内容。 537 | 538 | 144 539 | 00:14:41,000 --> 00:14:46,000 540 | 即使我使用聊天作为一个运行示例来说明这些不同的记忆, 541 | 542 | 145 543 | 00:14:46,000 --> 00:14:49,000 544 | 这些记忆也对其他应用程序有用, 545 | 546 | 146 547 | 00:14:49,000 --> 00:14:54,000 548 | 在这些应用程序中,你可能会不断地获得新的文本片段或新的信息, 549 | 550 | 147 551 | 00:14:54,000 --> 00:14:59,000 552 | 比如如果你的系统反复在线搜索事实, 553 | 554 | 148 555 | 00:14:59,000 --> 00:15:04,000 556 | 但你希望保持用于存储这个不断增长的事实列表的总内存大小为, 557 | 558 | 149 559 | 00:15:04,000 --> 00:15:07,000 560 | 有限的,而不是任意增长。 561 | 562 | 150 563 | 00:15:07,000 --> 00:15:11,000 564 | 我鼓励你暂停视频并运行代码。 565 | 566 | 151 567 | 00:15:11,000 --> 00:15:15,000 568 | 在这个视频中,你看到了一些类型的内存,包括基于对话交换或令牌数量限制的缓冲内存, 569 | 570 | 152 571 | 00:15:15,000 --> 00:15:21,000 572 | 或者可以总结超过一定限制的令牌的内存。 573 | 574 | 153 575 | 00:15:21,000 --> 00:15:26,000 576 | LangChain实际上还支持其他类型的内存。 577 | 578 | 155 579 | 00:15:30,000 --> 00:15:33,000 580 | 其中最强大的之一是向量数据内存。 581 | 582 | 156 583 | 00:15:33,000 --> 00:15:36,000 584 | 如果你熟悉单词嵌入和文本嵌入, 585 | 586 | 157 587 | 00:15:36,000 --> 00:15:39,000 588 | 向量数据库实际上存储这样的嵌入。 589 | 590 | 158 591 | 00:15:39,000 --> 00:15:41,000 592 | 如果你不知道这是什么意思,不用担心。 593 | 594 | 159 595 | 00:15:41,000 --> 00:15:43,000 596 | 哈里森会在后面解释。 597 | 598 | 160 599 | 00:15:43,000 --> 00:15:51,000 600 | 然后,它可以使用这种类型的向量数据库检索最相关的文本块作为其内存。 601 | 602 | 161 603 | 00:15:51,000 -->
00:15:54,000 604 | LangChain还支持实体内存, 605 | 606 | 162 607 | 00:15:54,000 --> 00:15:58,000 608 | 当你想让它记住特定人物的细节时,这是适用的, 609 | 610 | 163 611 | 00:15:58,000 --> 00:16:04,000 612 | 特定的其他实体,比如如果你谈论一个特定的朋友, 613 | 614 | 164 615 | 00:16:04,000 --> 00:16:08,000 616 | 你可以让LangChain记住关于那个朋友的事实, 617 | 618 | 165 619 | 00:16:08,000 --> 00:16:12,000 620 | 这将是一种明确的实体。 621 | 622 | 166 623 | 00:16:12,000 --> 00:16:14,000 624 | 当你使用LangChain实现应用程序时, 625 | 626 | 167 627 | 00:16:14,000 --> 00:16:17,000 628 | 你还可以使用多种类型的内存, 629 | 630 | 168 631 | 00:16:17,000 --> 00:16:22,000 632 | 比如使用你在这个视频中看到的一种对话内存类型。 633 | 634 | 169 635 | 00:16:22,000 --> 00:16:26,000 636 | 此外,还可以使用实体内存来回忆个人。 637 | 638 | 170 639 | 00:16:26,000 --> 00:16:30,000 640 | 这样它就可以记住对话的摘要, 641 | 642 | 171 643 | 00:16:30,000 --> 00:16:35,000 644 | 再加上以明确的方式存储重要人物的重要事实。 645 | 646 | 172 647 | 00:16:35,000 --> 00:16:38,000 648 | 当然,除了使用这些内存类型之外, 649 | 650 | 173 651 | 00:16:38,000 --> 00:16:43,000 652 | 开发人员还经常将整个对话存储在传统数据库中, 653 | 654 | 174 655 | 00:16:43,000 --> 00:16:46,000 656 | 某种键值存储或SQL数据库。 657 | 658 | 175 659 | 00:16:46,000 --> 00:16:51,000 660 | 因此,你可以回顾整个对话以进行审计或进一步改进系统。 661 | 662 | 176 663 | 00:16:51,000 --> 00:16:53,000 664 | 这就是内存类型。 665 | 666 | 177 667 | 00:16:53,000 --> 00:16:57,000 668 | 我希望你在构建自己的应用程序时会发现这个视频有用。 669 | 670 | 178 671 | 00:16:57,000 --> 00:17:21,000 672 | 现在,让我们继续下一个视频,了解LangChain的关键构建块,即链。 -------------------------------------------------------------------------------- /english/LangChain_L6.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:08,920 3 | Sometimes people think of a large language model as a knowledge store, 4 | 5 | 2 6 | 00:00:08,920 --> 00:00:11,920 7 | as if it's learned to memorize a lot of information, 8 | 9 | 3 10 | 00:00:11,920 --> 00:00:14,880 11 | maybe off the internet, so when you ask it a question, 12 | 13 | 4 14 | 00:00:14,880 --> 00:00:16,380 15 | it can answer the question.
16 | 17 | 5 18 | 00:00:16,380 --> 00:00:19,340 19 | But I think an even more useful way to think of 20 | 21 | 6 22 | 00:00:19,340 --> 00:00:22,980 23 | a large language model is sometimes as a reasoning engine, 24 | 25 | 7 26 | 00:00:22,980 --> 00:00:27,140 27 | in which you can give it chunks of text or other sources of information. 28 | 29 | 8 30 | 00:00:27,140 --> 00:00:29,460 31 | Then the large language model, llm, 32 | 33 | 9 34 | 00:00:29,460 --> 00:00:33,000 35 | will maybe use this background knowledge that's learned off the internet, 36 | 37 | 10 38 | 00:00:33,000 --> 00:00:36,620 39 | but to use the new information you give it to help you answer 40 | 41 | 11 42 | 00:00:36,620 --> 00:00:41,100 43 | questions or reason through content or decide even what to do next. 44 | 45 | 12 46 | 00:00:41,100 --> 00:00:45,180 47 | That's what LangChain's agents framework helps you to do. 48 | 49 | 13 50 | 00:00:45,180 --> 00:00:48,340 51 | Agents are probably my favorite part of LangChain. 52 | 53 | 14 54 | 00:00:48,340 --> 00:00:50,320 55 | I think they're also one of the most powerful parts, 56 | 57 | 15 58 | 00:00:50,320 --> 00:00:52,140 59 | but they're also one of the newer parts. 60 | 61 | 16 62 | 00:00:52,140 --> 00:00:56,020 63 | We're seeing a lot of stuff emerge here that's really new to everyone in the field.
64 | 65 | 17 66 | 00:00:56,020 --> 00:00:58,940 67 | This should be a very exciting lesson as we dive 68 | 69 | 18 70 | 00:00:58,940 --> 00:01:01,180 71 | into what agents are, how to create, 72 | 73 | 19 74 | 00:01:01,180 --> 00:01:02,500 75 | and how to use agents, 76 | 77 | 20 78 | 00:01:02,500 --> 00:01:04,860 79 | how to equip them with different types of tools like 80 | 81 | 21 82 | 00:01:04,860 --> 00:01:07,180 83 | search engines that come built into LangChain, 84 | 85 | 22 86 | 00:01:07,180 --> 00:01:11,480 87 | and then also how to create your own tools so that you can let agents interact with 88 | 89 | 23 90 | 00:01:11,480 --> 00:01:14,780 91 | any data stores, any APIs, 92 | 93 | 24 94 | 00:01:14,780 --> 00:01:16,860 95 | any functions that you might want them to. 96 | 97 | 25 98 | 00:01:16,860 --> 00:01:19,460 99 | This is exciting, cutting-edge stuff, 100 | 101 | 26 102 | 00:01:19,460 --> 00:01:23,060 103 | but already with emerging important use cases. 104 | 105 | 27 106 | 00:01:23,060 --> 00:01:25,620 107 | So with that, let's dive in. 108 | 109 | 28 110 | 00:01:25,620 --> 00:01:27,500 111 | To get started with agents, 112 | 113 | 29 114 | 00:01:27,500 --> 00:01:32,420 115 | we're going to start as we always do by importing the correct environment variables. 116 | 117 | 30 118 | 00:01:32,420 --> 00:01:35,100 119 | We're also going to need to install a few packages here. 120 | 121 | 31 122 | 00:01:35,100 --> 00:01:39,020 123 | So we're going to use the DuckDuckGo search engine and Wikipedia. 124 | 125 | 32 126 | 00:01:39,020 --> 00:01:40,780 127 | So we're going to want to pip install those. 128 | 129 | 33 130 | 00:01:40,780 --> 00:01:42,940 131 | I've already installed those in my environment, 132 | 133 | 34 134 | 00:01:42,940 --> 00:01:44,540 135 | so I'm not going to run this line.
136 | 137 | 35 138 | 00:01:44,540 --> 00:01:46,360 139 | But if you guys have not, 140 | 141 | 36 142 | 00:01:46,360 --> 00:01:48,580 143 | you should uncomment that line, 144 | 145 | 37 146 | 00:01:48,580 --> 00:01:51,300 147 | run it, and then you're good to go. 148 | 149 | 38 150 | 00:01:51,300 --> 00:01:56,060 151 | We're then going to import some methods and classes that we need from LangChain. 152 | 153 | 39 154 | 00:01:56,060 --> 00:01:59,060 155 | So we're going to import some methods to load tools, 156 | 157 | 40 158 | 00:01:59,060 --> 00:02:02,340 159 | and these are things that we're going to connect the language model to. 160 | 161 | 41 162 | 00:02:02,340 --> 00:02:05,020 163 | We're going to load a method to initialize the agent. 164 | 165 | 42 166 | 00:02:05,020 --> 00:02:07,820 167 | We're going to load the chat OpenAI wrapper, 168 | 169 | 43 170 | 00:02:07,820 --> 00:02:09,500 171 | and we're going to load agent type. 172 | 173 | 44 174 | 00:02:09,500 --> 00:02:14,220 175 | So agent type will be used to specify what type of agent we want to use. 176 | 177 | 45 178 | 00:02:14,220 --> 00:02:16,540 179 | There are a bunch of different types of agents in LangChain. 180 | 181 | 46 182 | 00:02:16,540 --> 00:02:18,780 183 | We're not going to go over all of them right now. 184 | 185 | 47 186 | 00:02:18,780 --> 00:02:21,420 187 | We'll just choose one and run with that. 188 | 189 | 48 190 | 00:02:21,420 --> 00:02:24,700 191 | We're then going to initialize the language model that we're going to use. 192 | 193 | 49 194 | 00:02:24,700 --> 00:02:30,500 195 | Again, we're using this as the reasoning engine that we're using to drive the agent. 196 | 197 | 50 198 | 00:02:30,500 --> 00:02:33,740 199 | We'll then load the tools that we're going to use. 200 | 201 | 51 202 | 00:02:33,740 --> 00:02:37,020 203 | So we're going to load DuckDuckGo search and Wikipedia, 204 | 205 | 52 206 | 00:02:37,020 --> 00:02:40,140 207 | and these are built-in LangChain tools.
208 | 209 | 53 210 | 00:02:40,140 --> 00:02:42,980 211 | Finally, we're going to initialize the agent. 212 | 213 | 54 214 | 00:02:42,980 --> 00:02:44,780 215 | We pass it the tools, 216 | 217 | 55 218 | 00:02:44,780 --> 00:02:47,700 219 | the language model, and agent type. 220 | 221 | 56 222 | 00:02:47,700 --> 00:02:49,340 223 | So here we're using chat, 224 | 225 | 57 226 | 00:02:49,340 --> 00:02:51,460 227 | zero shot, react, description. 228 | 229 | 58 230 | 00:02:51,460 --> 00:02:54,060 231 | I'm not going to go over in too much detail what this means. 232 | 233 | 59 234 | 00:02:54,060 --> 00:02:56,220 235 | The important things to note are chat. 236 | 237 | 60 238 | 00:02:56,220 --> 00:03:00,540 239 | This is optimized to work with chat models, and then react. 240 | 241 | 61 242 | 00:03:00,540 --> 00:03:05,620 243 | React is a prompting strategy that elicits better thoughts from a language model. 244 | 245 | 62 246 | 00:03:05,620 --> 00:03:09,220 247 | We're also going to set handle parsing errors equals true. 248 | 249 | 63 250 | 00:03:09,220 --> 00:03:11,620 251 | If you remember from the first lesson, 252 | 253 | 64 254 | 00:03:11,620 --> 00:03:17,140 255 | we chatted a bunch about output parsers and how those can be used to take the LLM output, 256 | 257 | 65 258 | 00:03:17,140 --> 00:03:22,060 259 | which is a string, and parse it into a specific format that we can use downstream. 260 | 261 | 66 262 | 00:03:22,060 --> 00:03:23,740 263 | That's extremely important here. 264 | 265 | 67 266 | 00:03:23,740 --> 00:03:25,620 267 | When we take the output of the LLM, 268 | 269 | 68 270 | 00:03:25,620 --> 00:03:28,940 271 | which is text, and parse it into the specific action, 272 | 273 | 69 274 | 00:03:28,940 --> 00:03:32,700 275 | and the specific action input that the language model should take. 276 | 277 | 70 278 | 00:03:32,700 --> 00:03:34,300 279 | Let's now use this agent. 
280 | 281 | 71 282 | 00:03:34,300 --> 00:03:38,940 283 | Let's ask it a question about a recent event that the model, 284 | 285 | 72 286 | 00:03:38,940 --> 00:03:41,060 287 | when it was trained, didn't know. 288 | 289 | 73 290 | 00:03:41,060 --> 00:03:43,860 291 | So let's ask it about the 2022 World Cup. 292 | 293 | 74 294 | 00:03:43,860 --> 00:03:47,660 295 | The models here were trained on data up to around 2021. 296 | 297 | 75 298 | 00:03:47,660 --> 00:03:49,820 299 | So it shouldn't know the answer to this question. 300 | 301 | 76 302 | 00:03:49,820 --> 00:03:55,580 303 | And so it should realize that it needs to use a tool to look up this recent piece of information. 304 | 305 | 77 306 | 00:04:05,340 --> 00:04:10,620 307 | So we can see here that the agent realizes that it needs to use DuckDuckGo search, 308 | 309 | 78 310 | 00:04:10,620 --> 00:04:14,900 311 | and then looks up the 2022 World Cup winner. 312 | 313 | 79 314 | 00:04:14,900 --> 00:04:18,060 315 | And so it gets back a bunch of information. 316 | 317 | 80 318 | 00:04:18,060 --> 00:04:23,940 319 | We can then see that the agent thinks that the 2022 World Cup has not happened yet. 320 | 321 | 81 322 | 00:04:23,940 --> 00:04:28,060 323 | So this is a good example of why agents are still pretty exploratory. 324 | 325 | 82 326 | 00:04:28,060 --> 00:04:32,020 327 | You can see that there's a bunch of information here about the 2022 World Cup, 328 | 329 | 83 330 | 00:04:32,020 --> 00:04:34,180 331 | but it doesn't quite realize all that has happened. 332 | 333 | 84 334 | 00:04:34,180 --> 00:04:38,180 335 | And so it needs to look up more things and get more information. 336 | 337 | 85 338 | 00:04:38,180 --> 00:04:40,220 339 | And so then based on that information, 340 | 341 | 86 342 | 00:04:40,220 --> 00:04:47,380 343 | it can respond with the correct answer that Argentina won the 2022 World Cup. 
344 | 345 | 87 346 | 00:04:47,380 --> 00:04:52,500 347 | Let's then ask it a question where it should recognize that it needs to use Wikipedia. 348 | 349 | 88 350 | 00:04:52,500 --> 00:04:58,740 351 | So Wikipedia has a lot of information about specific people and specific entities 352 | 353 | 89 354 | 00:04:58,740 --> 00:05:02,980 355 | from a long time ago. It doesn't need to be current information. 356 | 357 | 90 358 | 00:05:02,980 --> 00:05:06,540 359 | So let's ask it about Tom M. Mitchell, an American computer scientist, 360 | 361 | 91 362 | 00:05:06,540 --> 00:05:08,420 363 | and what book did he write? 364 | 365 | 92 366 | 00:05:08,420 --> 00:05:12,700 367 | We can see here that it recognizes that it should use Wikipedia to look up the answer. 368 | 369 | 93 370 | 00:05:12,700 --> 00:05:16,020 371 | So it searches for Tom M. Mitchell Wikipedia. 372 | 373 | 94 374 | 00:05:16,020 --> 00:05:19,460 375 | And then does another follow-up search just to make sure that it's got the right answer. 376 | 377 | 95 378 | 00:05:19,460 --> 00:05:24,340 379 | So it searches for Tom M. Mitchell machine learning and gets back more information. 380 | 381 | 96 382 | 00:05:24,340 --> 00:05:27,060 383 | And then based on that, it's able to finally answer with 384 | 385 | 97 386 | 00:05:27,060 --> 00:05:29,660 387 | Tom M. Mitchell wrote the textbook, Machine Learning. 388 | 389 | 98 390 | 00:05:29,660 --> 00:05:33,580 391 | You should pause the video here and try putting in different inputs. 392 | 393 | 99 394 | 00:05:33,580 --> 00:05:38,380 395 | So far, we've used tools that come defined in LangChain already, 396 | 397 | 100 398 | 00:05:38,380 --> 00:05:42,820 399 | but a big power of agents is that you can connect it to your own sources of information, 400 | 401 | 101 402 | 00:05:42,820 --> 00:05:45,100 403 | your own APIs, your own data.
404 | 405 | 102 406 | 00:05:45,100 --> 00:05:47,700 407 | So here we're going to go over how you can create a custom tool 408 | 409 | 103 410 | 00:05:47,700 --> 00:05:50,700 411 | so that you can connect it to whatever you want. 412 | 413 | 104 414 | 00:05:50,700 --> 00:05:54,660 415 | Let's make a tool that's going to tell us what the current date is. 416 | 417 | 105 418 | 00:05:54,660 --> 00:05:57,500 419 | First, we're going to import this tool decorator. 420 | 421 | 106 422 | 00:05:57,500 --> 00:06:03,100 423 | This can be applied to any function, and it turns it into a tool that LangChain can use. 424 | 425 | 107 426 | 00:06:03,100 --> 00:06:08,500 427 | Next, we're going to write a function called time, which takes in any text string. 428 | 429 | 108 430 | 00:06:08,500 --> 00:06:09,900 431 | We're not really going to use that. 432 | 433 | 109 434 | 00:06:09,900 --> 00:06:15,540 435 | And it's going to return today's date by calling date time. 436 | 437 | 110 438 | 00:06:15,540 --> 00:06:20,660 439 | In addition to the name of the function, we're also going to write a really detailed doc string. 440 | 441 | 111 442 | 00:06:20,660 --> 00:06:25,100 443 | That's because this is what the agent will use to know when it should call this tool 444 | 445 | 112 446 | 00:06:25,100 --> 00:06:28,500 447 | and how it should call this tool. 448 | 449 | 113 450 | 00:06:28,500 --> 00:06:32,060 451 | For example, here we say that the input should always be an empty string. 452 | 453 | 114 454 | 00:06:32,060 --> 00:06:33,860 455 | That's because we don't use it. 456 | 457 | 115 458 | 00:06:33,860 --> 00:06:37,460 459 | If we have more stringent requirements on what the input should be, 460 | 461 | 116 462 | 00:06:37,460 --> 00:06:42,940 463 | for example, if we have a function that should always take in a search query or a SQL statement, 464 | 465 | 117 466 | 00:06:42,940 --> 00:06:47,340 467 | you'll want to make sure to mention that here.
468 | 469 | 118 470 | 00:06:47,340 --> 00:06:49,060 471 | We're now going to create another agent. 472 | 473 | 119 474 | 00:06:49,060 --> 00:06:55,660 475 | This time we're adding the time tool to the list of existing tools. 476 | 477 | 120 478 | 00:06:55,660 --> 00:07:03,660 479 | And finally, let's call the agent and ask it what the date today is. 480 | 481 | 121 482 | 00:07:03,660 --> 00:07:08,140 483 | It recognizes that it needs to use the time tool, which it specifies here. 484 | 485 | 122 486 | 00:07:08,140 --> 00:07:10,340 487 | It has the action input as an empty string. 488 | 489 | 123 490 | 00:07:10,340 --> 00:07:12,540 491 | This is great. This is what we told it to do. 492 | 493 | 124 494 | 00:07:12,540 --> 00:07:14,660 495 | And then it returns with an observation. 496 | 497 | 125 498 | 00:07:14,660 --> 00:07:18,740 499 | And then finally, the language model takes that observation and responds to the user. 500 | 501 | 126 502 | 00:07:18,740 --> 00:07:22,540 503 | Today's date is 2023-05-21. 504 | 505 | 127 506 | 00:07:22,540 --> 00:07:26,860 507 | You should pause the video here and try putting in different inputs. 508 | 509 | 128 510 | 00:07:26,860 --> 00:07:29,340 511 | This wraps up the lesson on agents. 512 | 513 | 129 514 | 00:07:29,340 --> 00:07:34,740 515 | This is one of the newer and more exciting and more experimental pieces of LangChain. 516 | 517 | 130 518 | 00:07:34,740 --> 00:07:36,540 519 | So I hope you enjoy using it. 520 | 521 | 131 522 | 00:07:36,540 --> 00:07:40,540 523 | Hopefully it showed you how you can use a language model as a reasoning engine 524 | 525 | 132 526 | 00:07:40,540 --> 00:08:00,540 527 | to take different actions and connect to other functions and data sources. 
528 | 529 | 530 | -------------------------------------------------------------------------------- /LangChain_L6.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:08,920 3 | 有时人们认为大型语言模型是一个知识库, 4 | Sometimes people think of a large language model as a knowledge store, 5 | 6 | 2 7 | 00:00:08,920 --> 00:00:11,920 8 | 好像它已经学会了记忆大量信息, 9 | as if it's learned to memorize a lot of information, 10 | 11 | 3 12 | 00:00:11,920 --> 00:00:14,880 13 | 也许是从互联网上获取的,所以当你问它一个问题时, 14 | maybe off the internet, so when you ask it a question, 15 | 16 | 4 17 | 00:00:14,880 --> 00:00:16,380 18 | 它可以回答这个问题。 19 | it can answer the question. 20 | 21 | 5 22 | 00:00:16,380 --> 00:00:19,340 23 | 但我认为,将大型语言模型视为推理引擎更加有用, 24 | But I think an even more useful way to think of 25 | 26 | 6 27 | 00:00:19,340 --> 00:00:22,980 28 | 你可以给它一些文本块或其他信息来源。 29 | a large language model is sometimes as a reasoning engine, 30 | 31 | 7 32 | 00:00:22,980 --> 00:00:27,140 33 | 然后大型语言模型,LLM, 34 | in which you can give it chunks of text or other sources of information. 35 | 36 | 8 37 | 00:00:27,140 --> 00:00:29,460 38 | 可能会使用从互联网上学到的背景知识, 39 | Then the large language model, llm, 40 | 41 | 9 42 | 00:00:29,460 --> 00:00:33,000 43 | 但是使用你提供的新信息来帮助你回答问题或推理内容或甚至决定下一步该做什么。 44 | will maybe use this background knowledge that's learned off the internet, 45 | 46 | 10 47 | 00:00:33,000 --> 00:00:36,620 48 | 这就是LangChain的代理框架帮助你做的事情。 49 | but to use the new information you give it to help you answer 50 | 51 | 11 52 | 00:00:36,620 --> 00:00:41,100 53 | 代理可能是我最喜欢的LangChain部分。 54 | questions or reason through content or decide even what to do next. 55 | 56 | 12 57 | 00:00:41,100 --> 00:00:45,180 58 | 我认为它们也是最强大的部分之一, 59 | That's what LangChain's agents framework helps you to do. 60 | 61 | 13 62 | 00:00:45,180 --> 00:00:48,340 63 | 但它们也是最新的部分之一。 64 | Agents are probably my favorite part of LangChain.
65 | 66 | 14 67 | 00:00:48,340 --> 00:00:50,320 68 | 我们正在看到很多新的东西出现在这里,对于该领域的每个人来说都是新的。 69 | I think they're also one of the most powerful parts, 70 | 71 | 15 72 | 00:00:50,320 --> 00:00:52,140 73 | 这应该是一个非常令人兴奋的课程,因为我们深入探讨 74 | but they're also one of the newer parts. 75 | 76 | 16 77 | 00:00:52,140 --> 00:00:56,020 78 | 代理是什么,如何创建代理, 79 | We're seeing a lot of stuff emerge here that's really new to everyone in the field. 80 | 81 | 17 82 | 00:00:56,020 --> 00:00:58,940 83 | 以及如何使用代理, 84 | This should be a very exciting lesson as we dive 85 | 86 | 18 87 | 00:00:58,940 --> 00:01:01,180 88 | 如何为它们配备不同类型的工具,如 89 | into what agents are, how to create, 90 | 91 | 19 92 | 00:01:01,180 --> 00:01:02,500 93 | 内置于LangChain中的搜索引擎, 94 | and how to use agents, 95 | 96 | 20 97 | 00:01:02,500 --> 00:01:04,860 98 | 以及如何创建自己的工具,以便让代理与 99 | how to equip them with different types of tools like 100 | 101 | 21 102 | 00:01:04,860 --> 00:01:07,180 103 | 任何数据存储,任何API, 104 | search engines that come built into LangChain, 105 | 106 | 22 107 | 00:01:07,180 --> 00:01:11,480 108 | 任何你想让它们与之交互的函数。 109 | and then also how to create your own tools so that you can let agents interact with 110 | 111 | 23 112 | 00:01:11,480 --> 00:01:14,780 113 | 这是令人兴奋的前沿技术, 114 | any data stores, any APIs, 115 | 116 | 24 117 | 00:01:14,780 --> 00:01:16,860 118 | 但已经出现了一些重要的用例。 119 | any functions that you might want them to. 120 | 121 | 25 122 | 00:01:16,860 --> 00:01:19,460 123 | 因此,让我们开始吧。 124 | This is exciting, cutting-edge stuff, 125 | 126 | 26 127 | 00:01:19,460 --> 00:01:23,060 128 | 要开始使用代理, 129 | but already with emerging important use cases. 130 | 131 | 27 132 | 00:01:23,060 --> 00:01:25,620 133 | 我们将像往常一样导入正确的环境变量。 134 | So with that, let's dive in.
135 | 136 | 28 137 | 00:01:25,620 --> 00:01:27,500 138 | 我们还需要安装一些软件包。 139 | To get started with agents, 140 | 141 | 29 142 | 00:01:27,500 --> 00:01:32,420 143 | 因此,我们将使用DuckDuckGo搜索引擎和维基百科。 144 | we're going to start as we always do by importing the correct environment variables. 145 | 146 | 30 147 | 00:01:32,420 --> 00:01:35,100 148 | 因此,我们将要安装这些。 149 | We're also going to need to install a few packages here. 150 | 151 | 31 152 | 00:01:35,100 --> 00:01:39,020 153 | 我已经在我的环境中安装了这些,所以我不会运行这行。 154 | So we're going to use the DuckDuckGo search engine and Wikipedia. 155 | 156 | 32 157 | 00:01:39,020 --> 00:01:40,780 158 | 但如果你们没有, 159 | 160 | So we're going to want to pip install those. 161 | 162 | 33 163 | 00:01:46,360 --> 00:01:48,580 164 | 你应该取消注释那一行, 165 | I've already installed those in my environment, 166 | 167 | 34 168 | 00:01:48,580 --> 00:01:51,300 169 | 运行它,然后你就可以开始了。 170 | so I'm not going to run this line. 171 | 172 | 35 173 | 00:01:51,300 --> 00:01:56,060 174 | 然后我们将从LangChain导入一些我们需要的方法和类。 175 | But if you guys have not, 176 | 177 | 36 178 | 00:01:56,060 --> 00:01:59,060 179 | 所以我们要导入一些加载工具的方法, 180 | you should uncomment that line, 181 | 182 | 37 183 | 00:01:59,060 --> 00:02:02,340 184 | 这些是我们将连接语言模型的东西。 185 | run it, and then you're good to go. 186 | 187 | 38 188 | 00:02:02,340 --> 00:02:05,020 189 | 我们将加载一个初始化代理的方法。 190 | We're then going to import some methods and classes that we need from LangChain. 191 | 192 | 39 193 | 00:02:05,020 --> 00:02:07,820 194 | 我们将加载聊天Open AI包装器, 195 | So we're going to import some methods to load tools, 196 | 197 | 40 198 | 00:02:07,820 --> 00:02:09,500 199 | 我们将加载代理类型。 200 | and these are things that we're going to connect the language model to. 201 | 202 | 41 203 | 00:02:09,500 --> 00:02:14,220 204 | 所以代理类型将用于指定我们要使用的代理类型。 205 | We're going to load a method to initialize the agent.
206 | 207 | 42 208 | 00:02:14,220 --> 00:02:16,540 209 | LangChain中有许多不同类型的代理。 210 | We're going to load the chat open AI wrapper, 211 | 212 | 43 213 | 00:02:16,540 --> 00:02:18,780 214 | 我们现在不会详细介绍所有这些。 215 | and we're going to load agent type. 216 | 217 | 44 218 | 00:02:18,780 --> 00:02:21,420 219 | 我们将选择一个并运行它。 220 | So agent type will be used to specify what type of agent we want to use. 221 | 222 | 45 223 | 00:02:21,420 --> 00:02:24,700 224 | 然后我们将初始化我们要使用的语言模型。 225 | There are a bunch of different types of agents in LangChain. 226 | 227 | 46 228 | 00:02:24,700 --> 00:02:30,500 229 | 同样,我们将使用它作为我们用来驱动代理的推理引擎。 230 | We're not going to go over all of them right now. 231 | 232 | 47 233 | 00:02:30,500 --> 00:02:33,740 234 | 然后我们将加载我们要使用的工具。 235 | We'll just choose one and run with that. 236 | 237 | 48 238 | 00:02:33,740 --> 00:02:37,020 239 | 所以我们将加载DuckDuckGo搜索和维基百科, 240 | We're then going to initialize the language model that we're going to use. 241 | 242 | 49 243 | 00:02:37,020 --> 00:02:40,140 244 | 这些都是内置的LangChain工具。 245 | Again, we're using this as the reasoning engine that we're using to drive the agent. 246 | 247 | 50 248 | 00:02:40,140 --> 00:02:42,980 249 | 最后,我们将初始化代理。 250 | We'll then load the tools that we're going to use. 251 | 252 | 51 253 | 00:02:42,980 --> 00:02:44,780 254 | 我们将传递工具, 255 | So we're going to load DuckDuckGo search and Wikipedia, 256 | 257 | 52 258 | 00:02:44,780 --> 00:02:47,700 259 | 语言模型和代理类型。 260 | and these are built-in LangChain tools. 261 | 262 | 53 263 | 00:02:47,700 --> 00:02:49,340 264 | 所以这里我们使用聊天, 265 | Finally, we're going to initialize the agent. 266 | 267 | 54 268 | 00:02:49,340 --> 00:02:51,460 269 | 零射击,反应,描述。 270 | We pass it the tools, 271 | 272 | 55 273 | 00:02:51,460 --> 00:02:54,060 274 | 我不会详细介绍这意味着什么。 275 | the language model, and agent type.
276 | 277 | 56 278 | 00:02:54,060 --> 00:02:56,220 279 | 需要注意的重要事项是聊天。 280 | So here we're using chat, 281 | 282 | 57 283 | 00:02:56,220 --> 00:03:00,540 284 | 这是针对聊天模型进行优化的,然后是反应。 285 | zero shot, react, description. 286 | 287 | 58 288 | 00:03:00,540 --> 00:03:05,620 289 | 反应是一种提示策略,可以从语言模型中引出更好的想法。 290 | I'm not going to go over in too much detail what this means. 291 | 292 | 59 293 | 00:03:05,620 --> 00:03:09,220 294 | 我们还将设置处理解析错误等于true。 295 | The important things to note are chat. 296 | 297 | 60 298 | 00:03:09,220 --> 00:03:11,620 299 | 如果您还记得第一课, 300 | This is optimized to work with chat models, and then react. 301 | 302 | 61 303 | 00:03:11,620 --> 00:03:17,140 304 | 我们谈论了输出解析器以及如何使用它们将LLM输出, 305 | React is a prompting strategy that elicits better thoughts from a language model. 306 | 307 | 62 308 | 00:03:17,140 --> 00:03:22,060 309 | 这是一个字符串,并将其解析为我们可以在下游使用的特定格式。 310 | We're also going to set handle parsing errors equals true. 311 | 312 | 63 313 | 00:03:22,060 --> 00:03:23,740 314 | 这在这里非常重要。 315 | If you remember from the first lesson, 316 | 317 | 64 318 | 00:03:23,740 --> 00:03:25,620 319 | 当我们将LLM的输出, 320 | we chatted a bunch about output parsers and how those can be used to take the LLM output, 321 | 322 | 65 323 | 00:03:25,620 --> 00:03:28,940 324 | 这是文本,并将其解析为特定的操作, 325 | which is a string, and parse it into a specific format that we can use downstream. 326 | 327 | 66 328 | 00:03:28,940 --> 00:03:32,700 329 | 以及语言模型应该采取的特定操作输入。 330 | 331 | That's extremely important here. 332 | 333 | 67 334 | 00:03:32,700 --> 00:03:34,300 335 | 现在让我们使用这个代理。 336 | When we take the output of the LLM, 337 | 338 | 68 339 | 00:03:34,300 --> 00:03:38,940 340 | 让我们问一个关于最近事件的问题,这个模型在训练时不知道。 341 | which is text, and parse it into the specific action, 342 | 343 | 69 344 | 00:03:38,940 --> 00:03:41,060 345 | 所以让我们问一下2022年世界杯的情况。 346 | and the specific action input that the language model should take. 
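The narration above describes how the LLM's raw text output must be parsed into a specific action and action input before the agent can execute it. As a rough illustration of that idea only — this is a toy sketch, not LangChain's actual output parser, which also handles final answers, JSON blobs, and retry logic — a minimal ReAct-style parser:

```python
import re

def parse_react_output(text: str):
    """Toy parser for a ReAct-style completion: pull out the tool name
    ("Action") and its argument ("Action Input") from the raw LLM text."""
    action = re.search(r"Action:\s*(.+)", text)
    action_input = re.search(r"Action Input:\s*(.*)", text)
    if action is None:
        raise ValueError(f"Could not parse agent output: {text!r}")
    name = action.group(1).strip()
    arg = action_input.group(1).strip() if action_input else ""
    return name, arg

completion = """Thought: I need current information.
Action: duckduckgo_search
Action Input: 2022 World Cup winner"""
print(parse_react_output(completion))  # ('duckduckgo_search', '2022 World Cup winner')
```

This is why `handle_parsing_errors=True` matters in the lesson: when the model's text does not match the expected format, the real chain can feed the error back to the model instead of raising, as the toy parser does.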
347 | 348 | 70 349 | 00:03:41,060 --> 00:03:43,860 350 | 这里的模型是根据2021年左右的数据进行训练的。 351 | Let's now use this agent. 352 | 353 | 71 354 | 00:03:43,860 --> 00:03:47,660 355 | 所以它不应该知道这个问题的答案。 356 | Let's ask it a question about a recent event that the model, 357 | 358 | 72 359 | 00:03:47,660 --> 00:03:49,820 360 | 因此,它应该意识到需要使用工具来查找这个最近的信息。 361 | when it was trained, didn't know. 362 | 363 | 73 364 | 00:03:49,820 --> 00:03:55,580 365 | 所以我们可以看到这里的代理意识到它需要使用DuckDuckGo搜索,然后查找2022年世界杯的获胜者。 366 | So let's ask it about the 2022 World Cup. 367 | 368 | 74 369 | 00:04:05,340 --> 00:04:10,620 370 | 因此,它得到了一些信息。 371 | The models here were trained on data up to around 2021. 372 | 373 | 75 374 | 00:04:10,620 --> 00:04:14,900 375 | 然后我们可以看到代理认为2022年世界杯还没有发生。 376 | So it shouldn't know the answer to this question. 377 | 378 | 76 379 | 00:04:14,900 --> 00:04:18,060 380 | 所以这是一个很好的例子,说明代理仍然具有探索性。 381 | And so it should realize that it needs to use a tool to look up this recent piece of information. 382 | 383 | 77 384 | 00:04:18,060 --> 00:04:23,940 385 | 我们可以看到这里有很多关于2022年世界杯的信息,但它并没有完全意识到所有的事情都已经发生了。 386 | So we can see here that the agent realizes that it needs to use DuckDuckGo search, 387 | 388 | 78 389 | 00:04:23,940 --> 00:04:28,060 390 | 因此,它需要查找更多的信息。 391 | and then looks up the 2022 World Cup winner. 392 | 393 | 79 394 | 00:04:28,060 --> 00:04:32,020 395 | 然后基于这些信息,它可以回答正确的答案,即阿根廷赢得了2022年世界杯。 396 | And so it gets back a bunch of information. 397 | 398 | 80 399 | 00:04:47,380 --> 00:04:52,500 400 | 然后让我们问一个问题,它应该意识到需要使用维基百科。 401 | We can then see that the agent thinks that the 2022 World Cup has not happened yet. 402 | 403 | 81 404 | 00:04:52,500 --> 00:04:58,740 405 | 维基百科有很多关于特定人物和特定实体的信息,这些信息可以是很久以前的,不需要是当前的信息。 406 | So this is a good example of why agents are still pretty exploratory. 407 | 408 | 82 409 | 00:04:58,740 --> 00:05:02,980 410 | 所以让我们问一下美国计算机科学家Tom M. 
Mitchell写了哪本书。 411 | You can see that there's a bunch of information here about the 2022 World Cup, 412 | 413 | 83 414 | 00:05:02,980 --> 00:05:06,540 415 | 我们可以看到它意识到应该使用维基百科来查找答案。 416 | but it doesn't quite realize all that has happened. 417 | 418 | 84 419 | 00:05:06,540 --> 00:05:08,420 420 | 所以它搜索Tom M. Mitchell维基百科。 421 | And so it needs to look up more things and get more information. 422 | 423 | 85 424 | 00:05:08,420 --> 00:05:12,700 425 | 然后再进行另一个跟进搜索,以确保它得到了正确的答案。 426 | And so then based on that information, 427 | 428 | 86 429 | 00:05:12,700 --> 00:05:16,020 430 | 所以它搜索Tom M. Mitchell机器学习,并得到更多的信息。 431 | it can respond with the correct answer that Argentina won the 2022 World Cup. 432 | 433 | 87 434 | 00:05:16,020 --> 00:05:19,460 435 | 然后基于这些信息,它最终能够回答Tom M. Mitchell写的教科书是《机器学习》。 436 | Let's then ask it a question where it should recognize that it needs to use Wikipedia. 437 | 438 | 88 439 | 00:05:29,660 --> 00:05:33,580 440 | 你可以在这里暂停视频,尝试输入不同的内容。 441 | So Wikipedia has a lot of information about specific people and specific entities 442 | 443 | 89 444 | 00:05:33,580 --> 00:05:38,380 445 | 到目前为止,我们已经使用了LangChain中预定义的工具。 446 | 447 | from a long time ago. It doesn't need to be current information. 448 | 449 | 90 450 | 00:05:38,380 --> 00:05:42,820 451 | 但代理的一个重要功能是可以将其连接到您自己的信息源、API和数据。 452 | So let's ask it about Tom M. Mitchell, an American computer scientist, 453 | 454 | 91 455 | 00:05:42,820 --> 00:05:45,100 456 | 您可以创建一个自定义工具,将其连接到您想要的任何内容。 457 | and what book did he write? 458 | 459 | 92 460 | 00:05:45,100 --> 00:05:47,700 461 | 现在我们来创建一个工具,它将告诉我们当前的日期。 462 | We can see here that it recognizes that it should use Wikipedia to look up the answer. 463 | 464 | 93 465 | 00:05:47,700 --> 00:05:50,700 466 | 首先,我们要导入这个工具装饰器。 467 | So it searches for Tom M. Mitchell Wikipedia. 468 | 469 | 94 470 | 00:05:57,500 --> 00:06:03,100 471 | 接下来,我们将编写一个名为“time”的函数,它接受任何文本字符串。 472 | And then does another follow-up search just to make sure that it's got the right answer.
473 | 474 | 95 475 | 00:06:09,900 --> 00:06:15,540 476 | 它将通过调用日期时间返回今天的日期。 477 | So it searches for Tom M. Mitchell machine learning and gets back more information. 478 | 479 | 96 480 | 00:06:15,540 --> 00:06:20,660 481 | 除了函数的名称,我们还将编写一个非常详细的文档字符串。 482 | And then based on that, it's able to finally answer with 483 | 484 | 97 485 | 00:06:20,660 --> 00:06:25,100 486 | 这是代理将用来知道何时调用此工具以及如何调用此工具的方式。 487 | Tom M. Mitchell wrote the textbook, Machine Learning. 488 | 489 | 98 490 | 00:06:28,500 --> 00:06:32,060 491 | 例如,在这里,我们说输入应始终为空字符串。 492 | You should pause the video here and try putting in different inputs. 493 | 494 | 99 495 | 00:06:37,460 --> 00:06:42,940 496 | 如果我们对输入有更严格的要求,例如,如果我们有一个应始终接受搜索查询或SQL语句的函数, 497 | So far, we've used tools that come defined in LangChain already, 498 | 499 | 100 500 | 00:06:47,340 --> 00:06:49,060 501 | 现在我们将创建另一个代理。 502 | but a big power of agents is that you can connect it to your own sources of information, 503 | 504 | 101 505 | 00:06:49,060 --> 00:06:55,660 506 | 这次,我们将时间工具添加到现有工具列表中。 507 | your own APIs, your own data. 508 | 509 | 102 510 | 00:07:03,660 --> 00:07:08,140 511 | 它识别出需要使用时间工具,并在此指定。 512 | So here we're going to go over how you can create a custom tool 513 | 514 | 103 515 | 00:07:18,740 --> 00:07:22,540 516 | 今天的日期是2023年5月21日。 517 | so that you can connect it to whatever you want. 518 | 519 | 104 520 | 00:07:26,860 --> 00:07:29,340 521 | 这就是代理的全部内容。 522 | Let's make a tool that's going to tell us what the current date is. 523 | 524 | 105 525 | 00:07:29,340 --> 00:07:34,740 526 | 这是LangChain中较新、更令人兴奋和更具实验性的部分之一。 527 | First, we're going to import this tool decorator. 528 | 529 | 106 530 | 00:07:34,740 --> 00:07:36,540 531 | 希望您喜欢使用它。 532 | 533 | This can be applied to any function, and it turns it into a tool that LangChain can use. 534 | 535 | 107 536 | 00:07:36,540 --> 00:07:40,540 537 | 希望它向您展示了如何使用语言模型作为推理引擎 538 | Next, we're going to write a function called time, which takes in any text string.
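The custom `time` tool the narration is building can be sketched without LangChain at all: a toy `@tool` decorator that registers a function and keeps its docstring as the description the agent reasons over. The `TOOLS` registry here is an illustrative stand-in for the agent's tool list, not LangChain's real decorator:

```python
import datetime

TOOLS = {}  # illustrative registry; stands in for the agent's tool list

def tool(fn):
    """Toy stand-in for LangChain's @tool decorator: register the function
    and keep its docstring as the description the agent uses to decide
    when to call the tool and what input to give it."""
    TOOLS[fn.__name__] = (fn, fn.__doc__)
    return fn

@tool
def time(text: str) -> str:
    """Returns today's date. Use this for any questions about today's date.
    The input should always be an empty string."""
    return str(datetime.date.today())

print(time(""))          # today's date in YYYY-MM-DD form (2023-05-21 in the lesson)
print(TOOLS["time"][1])  # the docstring the agent reasons over
```

The detailed docstring is the whole interface: the language model never sees the function body, only the name and description, which is why the narration stresses writing it carefully.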
539 | 540 | 108 541 | 00:07:40,540 --> 00:08:00,540 542 | 以执行不同的操作并连接到其他功能和数据源。 543 | We're not really going to use that. 544 | -------------------------------------------------------------------------------- /chinese/LangChain_L4.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:01,000 --> 00:00:15,000 5 | 使用llm构建的最常见的复杂应用程序之一是一个系统,可以在文档上方或关于文档回答问题。 6 | 7 | 2 8 | 00:00:15,000 --> 00:00:24,000 9 | 因此,给定从PDF文件、网页或某些公司的内部文档收集中提取的文本, 10 | 11 | 3 12 | 00:00:24,000 --> 00:00:33,000 13 | 您可以使用llm回答有关这些文档内容的问题,以帮助用户获得更深入的理解并获得所需的信息吗? 14 | 15 | 4 16 | 00:00:33,000 --> 00:00:39,000 17 | 这真的很强大,因为它开始将这些语言模型与它们最初没有接受培训的数据结合起来。 18 | 19 | 5 20 | 00:00:39,000 --> 00:00:42,000 21 | 因此,它使它们更加灵活和适应您的用例。 22 | 23 | 6 24 | 00:00:42,000 --> 00:00:48,000 25 | 这也非常令人兴奋,因为我们将开始超越语言模型、提示和输出解析器, 26 | 27 | 7 28 | 00:00:48,000 --> 00:00:54,000 29 | 并开始引入链式的一些关键组件,例如嵌入模型和向量存储。 30 | 31 | 8 32 | 00:00:54,000 --> 00:00:58,000 33 | 正如安德鲁所提到的,这是我们拥有的最受欢迎的链之一,所以我希望你很兴奋。 34 | 35 | 9 36 | 00:00:58,000 --> 00:01:03,000 37 | 实际上,嵌入和向量存储是一些最强大的现代技术。 38 | 39 | 10 40 | 00:01:03,000 --> 00:01:08,000 41 | 因此,如果您还没有看到它们,那么了解它们非常值得。 42 | 43 | 11 44 | 00:01:08,000 --> 00:01:10,000 45 | 那么,让我们开始吧。 46 | 47 | 12 48 | 00:01:10,000 --> 00:01:11,000 49 | 开始吧。 50 | 51 | 13 52 | 00:01:11,000 --> 00:01:16,000 53 | 因此,我们将从像往常一样导入环境变量开始。 54 | 55 | 14 56 | 00:01:16,000 --> 00:01:20,000 57 | 现在,我们将导入一些在构建此链时将有所帮助的东西。 58 | 59 | 15 60 | 00:01:20,000 --> 00:01:22,000 61 | 我们将导入检索QA链。 62 | 63 | 16 64 | 00:01:22,000 --> 00:01:24,000 65 | 这将在一些文档上进行检索。 66 | 67 | 17 68 | 00:01:24,000 --> 00:01:28,000 69 | 我们将导入我们最喜欢的聊天Open AI语言模型。 70 | 71 | 18 72 | 00:01:28,000 --> 00:01:29,000 73 | 我们将导入文档加载器。 74 | 75 | 19 76 | 00:01:29,000 --> 00:01:34,000 77 | 这将用于加载一些专有数据,我们将与语言模型结合使用。 78 | 79 | 20 80 | 00:01:34,000 --> 00:01:36,000 81 | 在这种情况下,它将在CSV中。 82 | 83 | 21 84 | 00:01:36,000 --> 00:01:39,000 85 | 因此,我们将导入CSV加载器。 86 | 87 | 22 88 | 00:01:39,000 --> 00:01:41,000 89 | 
最后,我们将导入向量存储。 90 | 91 | 23 92 | 00:01:41,000 --> 00:01:45,000 93 | 有许多不同类型的向量存储,我们将在稍后介绍它们的确切含义。 94 | 95 | 24 96 | 00:01:45,000 --> 00:01:49,000 97 | 但是,我们将从Dock Array内存搜索向量存储开始。 98 | 99 | 25 100 | 00:01:49,000 --> 00:01:51,000 101 | 这非常好,因为它是一个内存向量存储, 102 | 103 | 26 104 | 00:01:51,000 --> 00:01:55,000 105 | 并且不需要连接到任何外部数据库, 106 | 107 | 27 108 | 00:01:55,000 --> 00:01:57,000 109 | 所以它使得入门变得非常容易。 110 | 111 | 28 112 | 00:01:57,000 --> 00:01:59,000 113 | 我们还将导入显示和markdown两个常见的在Jupyter Notebooks中显示信息的工具。 114 | 115 | 29 116 | 00:01:59,000 --> 00:02:04,000 117 | 我们提供了一个户外服装的CSV文件,我们将使用它与语言模型结合使用。 118 | 119 | 30 120 | 00:02:04,000 --> 00:02:10,000 121 | 在这里,我们将使用该文件的路径初始化一个加载器,即CSV加载器。 122 | 123 | 31 124 | 00:02:10,000 --> 00:02:18,000 125 | 接下来,我们将导入一个索引,即向量存储索引创建器。 126 | 127 | 32 128 | 00:02:18,000 --> 00:02:22,000 129 | 这将帮助我们非常容易地创建一个向量存储。 130 | 131 | 33 132 | 00:02:22,000 --> 00:02:26,000 133 | 如下所示,只需要几行代码就可以创建它。 134 | 135 | 34 136 | 00:02:26,000 --> 00:02:34,000 137 | 为了创建它,我们将指定两件事。 138 | 139 | 35 140 | 00:02:34,000 --> 00:02:37,000 141 | 首先,我们将指定向量存储类。 142 | 143 | 36 144 | 00:02:37,000 --> 00:02:40,000 145 | 如前所述,我们将使用这个向量存储, 146 | 147 | 37 148 | 00:02:40,000 --> 00:02:46,000 149 | 因为它是一个特别容易入门的向量存储。 150 | 151 | 38 152 | 00:02:46,000 --> 00:02:49,000 153 | 创建完成后,我们将从加载器中调用, 154 | 155 | 39 156 | 00:02:49,000 --> 00:02:51,000 157 | 它接受一个文档加载器列表。 158 | 159 | 40 160 | 00:02:51,000 --> 00:02:58,000 161 | 我们只有一个我们真正关心的加载器,所以这就是我们在这里传递的。 162 | 163 | 41 164 | 00:02:58,000 --> 00:03:02,000 165 | 现在它已经被创建了,我们可以开始询问它的问题了。 166 | 167 | 42 168 | 00:03:02,000 --> 00:03:07,000 169 | 下面我们将介绍发生了什么,所以现在不要担心这个。 170 | 171 | 43 172 | 00:03:07,000 --> 00:03:09,000 173 | 在这里我们将从一个查询开始。 174 | 175 | 44 176 | 00:03:09,000 --> 00:03:17,000 177 | 然后我们将使用索引查询创建一个响应,并传入这个查询。 178 | 179 | 45 180 | 00:03:17,000 --> 00:03:21,000 181 | 同样,我们将在下面介绍发生了什么。 182 | 183 | 46 184 | 00:03:21,000 --> 00:03:30,000 185 | 现在,我们只需要等待它的响应。 186 | 187 | 47 188 | 00:03:30,000 --> 00:03:34,000 189 | 
完成后,我们现在可以看看到底返回了什么。 190 | 191 | 48 192 | 00:03:34,000 --> 00:03:41,000 193 | 我们得到了一个Markdown表格,其中包含所有带有防晒衣的衬衫的名称和描述。 194 | 195 | 49 196 | 00:03:41,000 --> 00:03:45,000 197 | 我们还得到了一个语言模型提供的不错的小总结。 198 | 199 | 50 200 | 00:03:45,000 --> 00:03:48,000 201 | 所以我们已经介绍了如何在您的文档中进行问答, 202 | 203 | 51 204 | 00:03:48,000 --> 00:03:52,000 205 | 但是到底在幕后发生了什么呢? 206 | 207 | 52 208 | 00:03:52,000 --> 00:03:54,000 209 | 首先,让我们考虑一般的想法。 210 | 211 | 53 212 | 00:03:54,000 --> 00:03:58,000 213 | 我们想要使用语言模型并将其与我们的许多文档结合使用, 214 | 215 | 54 216 | 00:03:58,000 --> 00:04:03,000 217 | 但是有一个关键问题。语言模型一次只能检查几千个单词。 218 | 219 | 56 220 | 00:04:03,000 --> 00:04:10,000 221 | 如果我们有非常大的文档,如何让语言模型回答关于其中所有内容的问题呢? 222 | 223 | 57 224 | 00:04:10,000 --> 00:04:14,000 225 | 这就是嵌入和向量存储发挥作用的地方。 226 | 227 | 58 228 | 00:04:14,000 --> 00:04:17,000 229 | 首先,让我们谈谈嵌入。 230 | 231 | 59 232 | 00:04:17,000 --> 00:04:21,000 233 | 嵌入为文本片段创建数字表示。 234 | 235 | 60 236 | 00:04:21,000 --> 00:04:27,000 237 | 这种数字表示捕捉了它所运行的文本片段的语义含义。 238 | 239 | 61 240 | 00:04:27,000 --> 00:04:31,000 241 | 相似内容的文本片段将具有相似的向量。 242 | 243 | 62 244 | 00:04:31,000 --> 00:04:35,000 245 | 这使我们可以在向量空间中比较文本片段。 246 | 247 | 63 248 | 00:04:35,000 --> 00:04:38,000 249 | 在下面的示例中,我们可以看到我们有三个句子。 250 | 251 | 64 252 | 00:04:38,000 --> 00:04:43,000 253 | 前两个是关于宠物的,而第三个是关于汽车的。 254 | 255 | 65 256 | 00:04:43,000 --> 00:04:46,000 257 | 如果我们看一下数字空间中的表示, 258 | 259 | 66 260 | 00:04:46,000 --> 00:04:54,000 261 | 我们可以看到当我们比较与宠物句子相对应的文本片段上的两个向量时,它们非常相似。 262 | 263 | 67 264 | 00:04:54,000 --> 00:04:58,000 265 | 而如果我们将其与谈论汽车的那个进行比较,它们根本不相似。 266 | 267 | 68 268 | 00:04:58,000 --> 00:05:02,000 269 | 这将让我们轻松地找出哪些文本片段彼此相似, 270 | 271 | 69 272 | 00:05:02,000 --> 00:05:10,000 273 | 这在我们考虑要包含哪些文本片段传递给语言模型以回答问题时非常有用。 274 | 275 | 70 276 | 00:05:10,000 --> 00:05:13,000 277 | 我们要介绍的下一个组件是向量数据库。 278 | 279 | 71 280 | 00:05:13,000 --> 00:05:18,000 281 | 向量数据库是存储我们在上一步中创建的这些向量表示的一种方式。 282 | 283 | 72 284 | 00:05:18,000 --> 00:05:24,000 285 | 我们创建这个向量数据库的方式是用来自传入文档的文本块填充它。 286 | 287 | 73 288 | 
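The similarity comparison described above — two pet sentences close together in vector space, a car sentence far away — can be made concrete with cosine similarity. The 3-dimensional vectors below are made up for illustration (real embeddings have a thousand or more dimensions, produced by a model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1 for similar directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Made-up 3-d vectors standing in for real embeddings.
pet_1 = [0.9, 0.1, 0.1]  # a sentence about a dog
pet_2 = [0.8, 0.2, 0.1]  # a sentence about a cat
car = [0.1, 0.2, 0.9]    # a sentence about a car

print(cosine_similarity(pet_1, pet_2))  # high: the two pet sentences are close
print(cosine_similarity(pet_1, car))    # low: pets vs. cars
```

This is the comparison a vector store performs at query time: embed the query, then rank the stored chunks by similarity to it.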
00:05:24,000 --> 00:05:28,000 289 | 当我们获得一个大的传入文档时,我们首先将其分成较小的块。 290 | 291 | 74 292 | 00:05:28,000 --> 00:05:33,000 293 | 这有助于创建比原始文档小的文本片段, 294 | 295 | 75 296 | 00:05:33,000 --> 00:05:37,000 297 | 这很有用,因为我们可能无法将整个文档传递给语言模型。 298 | 299 | 76 300 | 00:05:37,000 --> 00:05:43,000 301 | 因此,我们想创建这些小块,以便只传递最相关的块给语言模型。 302 | 303 | 77 304 | 00:05:43,000 --> 00:05:48,000 305 | 然后,我们为每个这些块创建一个嵌入,然后将它们存储在向量数据库中。 306 | 307 | 78 308 | 00:05:48,000 --> 00:05:51,000 309 | 这就是我们创建索引时发生的事情。 310 | 311 | 79 312 | 00:05:51,000 --> 00:05:58,000 313 | 现在我们有了这个索引,我们可以在运行时使用它来查找与传入查询最相关的文本片段。 314 | 315 | 80 316 | 00:05:58,000 --> 00:06:02,000 317 | 当查询进来时,我们首先为该查询创建一个嵌入。 318 | 319 | 81 320 | 00:06:02,000 --> 00:06:07,000 321 | 然后我们将其与向量数据库中的所有向量进行比较,并选择最相似的n个。 322 | 323 | 82 324 | 00:06:07,000 --> 00:06:14,000 325 | 然后将它们返回,我们可以将它们传递到语言模型中,以获得最终答案。 326 | 327 | 83 328 | 00:06:14,000 --> 00:06:17,000 329 | 因此,我们创建了这个链,只需要几行代码。 330 | 331 | 84 332 | 00:06:17,000 --> 00:06:19,000 333 | 这对于快速入门非常有用。 334 | 335 | 85 336 | 00:06:19,000 --> 00:06:25,000 337 | 好的,现在让我们逐步进行,并了解底层到底发生了什么。 338 | 339 | 86 340 | 00:06:25,000 --> 00:06:27,000 341 | 第一步与上面类似。 342 | 343 | 87 344 | 00:06:27,000 --> 00:06:36,000 345 | 我们将创建一个文档加载器,从包含我们要进行问题回答的所有产品描述的CSV中加载。 346 | 347 | 88 348 | 00:06:36,000 --> 00:06:41,000 349 | 然后我们可以从这个文档加载器中加载文档。 350 | 351 | 89 352 | 00:06:41,000 --> 00:06:50,000 353 | 如果我们查看单个文档,我们可以看到每个文档对应于CSV中的一个产品。 354 | 355 | 90 356 | 00:06:50,000 --> 00:06:53,000 357 | 之前,我们谈到了创建块。 358 | 359 | 91 360 | 00:06:53,000 --> 00:07:01,000 361 | 因为这些文档已经非常小了,所以我们实际上不需要在这里进行任何分块,因此我们可以直接创建嵌入。 362 | 363 | 92 364 | 00:07:01,000 --> 00:07:05,000 365 | 要创建嵌入,我们将使用OpenAI的嵌入类。 366 | 367 | 93 368 | 00:07:05,000 --> 00:07:08,000 369 | 我们可以在这里导入它并初始化它。 370 | 371 | 94 372 | 00:07:08,000 --> 00:07:21,000 373 | 如果我们想看看这些嵌入是如何工作的,我们实际上可以看一下嵌入特定文本时会发生什么。 374 | 375 | 95 376 | 00:07:21,000 --> 00:07:26,000 377 | 让我们使用嵌入对象上的嵌入查询方法为特定文本创建嵌入。 378 | 379 | 96 380 | 00:07:26,000 --> 00:07:31,000 381 | 在这种情况下,句子是“嗨,我的名字是哈里森”。 382 
| 383 | 97 384 | 00:07:31,000 --> 00:07:41,000 385 | 如果我们查看这个嵌入,我们可以看到有超过一千个不同的元素。 386 | 387 | 98 388 | 00:07:41,000 --> 00:07:44,000 389 | 每个元素都是不同的数字值。 390 | 391 | 99 392 | 00:07:44,000 --> 00:07:51,000 393 | 组合起来,这就创建了这段文本的总体数值表示。 394 | 395 | 100 396 | 00:07:51,000 --> 00:07:58,000 397 | 我们想为刚刚加载的所有文本创建嵌入,然后我们还想将它们存储在向量存储中。 398 | 399 | 101 400 | 00:07:58,000 --> 00:08:03,000 401 | 我们可以使用向量存储上的from documents方法来实现这一点。 402 | 403 | 102 404 | 00:08:03,000 --> 00:08:12,000 405 | 该方法接受文档列表、嵌入对象,然后我们将创建一个总体向量存储。 406 | 407 | 103 408 | 00:08:12,000 --> 00:08:18,000 409 | 现在,我们可以使用这个向量存储来查找与传入查询类似的文本。 410 | 411 | 104 412 | 00:08:18,000 --> 00:08:23,000 413 | 因此,让我们看一下查询,请建议一件带有防晒功能的衬衫。 414 | 415 | 105 416 | 00:08:23,000 --> 00:08:36,000 417 | 如果我们在向量存储中使用相似性搜索方法并传入一个查询,我们将得到一个文档列表。 418 | 419 | 106 420 | 00:08:36,000 --> 00:08:48,000 421 | 我们可以看到它返回了四个文档,如果我们看第一个文档,我们可以看到它确实是一件关于防晒的衬衫。 422 | 423 | 107 424 | 00:08:48,000 --> 00:08:52,000 425 | 那么我们如何使用它来回答我们自己的文档问题呢? 426 | 427 | 108 428 | 00:08:52,000 --> 00:08:57,000 429 | 首先,我们需要从这个向量存储中创建一个检索器。 430 | 431 | 109 432 | 00:08:57,000 --> 00:09:03,000 433 | 检索器是一个通用接口,可以由任何接受查询并返回文档的方法支持。 434 | 435 | 110 436 | 00:09:03,000 --> 00:09:11,000 437 | 向量存储和嵌入是一种这样的方法,尽管有许多不同的方法,有些不太先进,有些更先进。 438 | 439 | 111 440 | 00:09:11,000 --> 00:09:20,000 441 | 接下来,因为我们想要进行文本生成并返回自然语言响应,我们将导入一个语言模型,我们将使用聊天开放AI。 442 | 443 | 112 444 | 00:09:20,000 --> 00:09:28,000 445 | 如果我们手动进行此操作,我们将合并文档中的所有页面内容到一个变量中。 446 | 447 | 113 448 | 00:09:28,000 --> 00:09:37,000 449 | 因此,我们会做一些像这样的事情,将所有页面内容连接到一个变量中。 450 | 451 | 114 452 | 00:09:37,000 --> 00:09:48,000 453 | 然后,我们将传递此变量或问题的变体,例如请列出所有具有防晒功能的衬衫并在Markdown表格中总结每个衬衫的语言模型。 454 | 455 | 115 456 | 00:09:48,000 --> 00:09:55,000 457 | 如果我们在此处打印响应,我们可以看到我们得到了一个表格,正如我们所要求的那样。 458 | 459 | 116 460 | 00:09:55,000 --> 00:09:59,000 461 | 所有这些步骤都可以用LangChain链封装起来。 462 | 463 | 117 464 | 00:09:59,000 --> 00:10:02,000 465 | 因此,我们可以创建一个检索QA链。 466 | 467 | 118 468 | 00:10:02,000 --> 00:10:06,000 469 | 这将进行检索,然后对检索到的文档进行问题回答。 470 | 
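The manual steps the narration just walked through — embed the query, similarity-search the vector store, then "stuff" the retrieved page contents into one prompt — can be sketched end to end with stub components. The bag-of-words `embed` below is a toy stand-in for OpenAIEmbeddings, and `answer` stops at building the prompt instead of calling a real chat model; only the data flow matches the lesson:

```python
# Toy "retrieve then stuff" pipeline with illustrative names throughout.
VOCAB = ["sun", "shirt", "rain", "jacket"]

def embed(text):
    # Stub embedding: word counts over a tiny vocabulary.
    words = text.lower().split()
    return [words.count(w) for w in VOCAB]

def similarity(a, b):
    return sum(x * y for x, y in zip(a, b))

docs = [
    "sun shield shirt with upf 50 sun protection",
    "rain tech jacket waterproof shell",
    "tropical tee lightweight shirt with sun protection",
]
index = [(embed(d), d) for d in docs]  # the "vector store": one vector per chunk

def retrieve(query, k=2):
    q = embed(query)
    return [d for _, d in sorted(index, key=lambda p: -similarity(p[0], q))[:k]]

def answer(query):
    context = "\n".join(retrieve(query))        # "stuff" the top chunks...
    return f"{context}\n\nQuestion: {query}"    # ...into one prompt for the LLM

print(answer("shirt with sun protection"))
```

A retriever in the lesson's sense is exactly the `retrieve` interface here: anything that takes a query and returns documents, however it finds them.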
471 | 119 472 | 00:10:06,000 --> 00:10:09,000 473 | 要创建这样的链,我们将传入几个不同的东西。 474 | 475 | 120 476 | 00:10:09,000 --> 00:10:12,000 477 | 首先,我们将传入语言模型。 478 | 479 | 121 480 | 00:10:12,000 --> 00:10:15,000 481 | 这将用于在最后进行文本生成。 482 | 483 | 122 484 | 00:10:15,000 --> 00:10:17,000 485 | 接下来,我们将传入链类型。 486 | 487 | 123 488 | 00:10:17,000 --> 00:10:18,000 489 | 我们将使用stuff。 490 | 491 | 124 492 | 00:10:18,000 --> 00:10:25,000 493 | 这是最简单的方法,因为它只是将所有文档塞入上下文并对语言模型进行一次调用。 494 | 495 | 125 496 | 00:10:25,000 --> 00:10:32,000 497 | 还有一些其他方法可以用来进行问题回答,我可能会在最后提及,但我们不会详细讨论。 498 | 499 | 126 500 | 00:10:32,000 --> 00:10:34,000 501 | 第三,我们将传入一个检索器。 502 | 503 | 127 504 | 00:10:34,000 --> 00:10:38,000 505 | 我们上面创建的检索器只是一个获取文档的接口。 506 | 507 | 128 508 | 00:10:38,000 --> 00:10:41,000 509 | 这将用于获取文档并将其传递给语言模型。 510 | 511 | 129 512 | 00:10:41,000 --> 00:10:46,000 513 | 最后,我们将设置 verbose 等于 true。 514 | 515 | 130 516 | 00:10:46,000 --> 00:11:08,000 517 | 现在我们可以创建一个查询并在此查询上运行链。 518 | 519 | 131 520 | 00:11:08,000 --> 00:11:14,000 521 | 当我们获得响应时,我们可以再次使用 display 和 markdown 实用程序显示它。 522 | 523 | 132 524 | 00:11:14,000 --> 00:11:20,000 525 | 您可以在此暂停视频并尝试使用一堆不同的查询。 526 | 527 | 133 528 | 00:11:20,000 --> 00:11:26,000 529 | 所以这就是您详细了解它的方式,但请记住,我们仍然可以轻松地使用我们上面的一行来完成它。 530 | 531 | 134 532 | 00:11:26,000 --> 00:11:30,000 533 | 因此,这两个东西等同于相同的结果。 534 | 535 | 135 536 | 00:11:30,000 --> 00:11:32,000 537 | 这就是 LangChain 的有趣之处。 538 | 539 | 136 540 | 00:11:32,000 --> 00:11:38,000 541 | 您可以在一行中完成它,也可以查看各个内容并将其分解为更详细的五个内容。 542 | 543 | 137 544 | 00:11:38,000 --> 00:11:44,000 545 | 五个更详细的内容让您设置更多关于正在发生的确切内容的细节,但一行代码很容易入手。 546 | 547 | 138 548 | 00:11:44,000 --> 00:11:48,000 549 | 所以由您决定如何继续前进。 550 | 551 | 139 552 | 00:11:48,000 --> 00:11:51,000 553 | 我们还可以在创建索引时自定义索引。 554 | 555 | 140 556 | 00:11:51,000 --> 00:11:55,000 557 | 因此,如果您记得,当我们手动创建它时,我们指定了一个嵌入。 558 | 559 | 141 560 | 00:11:55,000 --> 00:11:57,000 561 | 我们也可以在这里指定一个嵌入。 562 | 563 | 142 564 | 00:11:57,000 --> 00:12:01,000 565 | 这将使我们能够灵活地创建嵌入本身。 566 | 567 | 143 568 |
00:12:01,000 --> 00:12:06,000 569 | 我们还可以在此处替换向量存储器以获取不同类型的向量存储器。 570 | 571 | 144 572 | 00:12:06,000 --> 00:12:15,000 573 | 因此,在创建索引时,您可以进行与手动创建时相同级别的自定义。 574 | 575 | 145 576 | 00:12:15,000 --> 00:12:17,000 577 | 在这个笔记本中,我们使用了 stuff 方法。 578 | 579 | 146 580 | 00:12:17,000 --> 00:12:19,000 581 | stuff 方法非常好,因为它非常简单。 582 | 583 | 147 584 | 00:12:19,000 --> 00:12:25,000 585 | 您只需将所有内容放入一个提示符中,然后将其发送到语言模型并获取一个响应。 586 | 587 | 148 588 | 00:12:25,000 --> 00:12:27,000 589 | 因此,很容易理解正在发生什么。 590 | 591 | 149 592 | 00:12:27,000 --> 00:12:30,000 593 | 它非常便宜,而且效果很好。 594 | 595 | 150 596 | 00:12:30,000 --> 00:12:32,000 597 | 但是,这并不总是可以正常工作。 598 | 599 | 151 600 | 00:12:32,000 --> 00:12:37,000 601 | 因此,如果您记得,在笔记本中获取文档时,我们只返回了四个文档。 602 | 603 | 152 604 | 00:12:37,000 --> 00:12:39,000 605 | 它们相对较小。 606 | 607 | 153 608 | 00:12:39,000 --> 00:12:44,000 609 | 但是,如果您想在许多不同类型的块上执行相同类型的问答,该怎么办? 610 | 611 | 154 612 | 00:12:44,000 --> 00:12:47,000 613 | 那么我们可以使用几种不同的方法。 614 | 615 | 155 616 | 00:12:47,000 --> 00:12:48,000 617 | 第一个是Map Reduce。 618 | 619 | 156 620 | 00:12:48,000 --> 00:12:55,000 621 | 这基本上是将所有块与问题一起传递给语言模型,获取回复, 622 | 623 | 157 624 | 00:12:55,000 --> 00:13:02,000 625 | 然后使用另一个语言模型调用将所有单独的回复总结成最终答案。 626 | 627 | 158 628 | 00:13:02,000 --> 00:13:06,000 629 | 这非常强大,因为它可以在任意数量的文档上运行。 630 | 631 | 159 632 | 00:13:06,000 --> 00:13:11,000 633 | 而且它也非常强大,因为您可以并行处理单个问题。 634 | 635 | 160 636 | 00:13:11,000 --> 00:13:13,000 637 | 但是它需要更多的调用。 638 | 639 | 161 640 | 00:13:13,000 --> 00:13:19,000 641 | 它将所有文档视为独立的,这可能并不总是最理想的事情。 642 | 643 | 162 644 | 00:13:19,000 --> 00:13:24,000 645 | Refine是另一种方法,再次用于循环许多文档。 646 | 647 | 163 648 | 00:13:24,000 --> 00:13:25,000 649 | 但它实际上是迭代的。 650 | 651 | 164 652 | 00:13:25,000 --> 00:13:28,000 653 | 它建立在先前文档的答案之上。 654 | 655 | 165 656 | 00:13:28,000 --> 00:13:33,000 657 | 因此,这非常适合组合信息并随时间逐步构建答案。 658 | 659 | 166 660 | 00:13:33,000 --> 00:13:36,000 661 | 它通常会导致更长的答案。 662 | 663 | 167 664 | 00:13:36,000 --> 00:13:39,000 665 | 而且它也不太快,因为现在调用不是独立的。 666 | 667 | 168 668 | 
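The narration contrasts Map Reduce (independent, parallelizable calls per chunk plus one combining call) with Refine (each call builds on the previous answer, so calls are sequential). A toy sketch with a stub LLM that just counts calls makes the call pattern concrete; every name here is illustrative, not LangChain's API:

```python
calls = {"n": 0}

def llm(prompt: str) -> str:
    """Stub LLM that just counts how many times it is called."""
    calls["n"] += 1
    return f"answer#{calls['n']}"

chunks = ["chunk one", "chunk two", "chunk three"]

def map_reduce(chunks, question):
    # Map: one independent call per chunk (these could run in parallel)...
    partials = [llm(f"{c}\n\nQ: {question}") for c in chunks]
    # ...Reduce: one final call combines the per-chunk answers.
    return llm("Combine:\n" + "\n".join(partials))

def refine(chunks, question):
    # Each call depends on the previous answer, so calls cannot be parallelized.
    answer = llm(f"{chunks[0]}\n\nQ: {question}")
    for c in chunks[1:]:
        answer = llm(f"Existing answer: {answer}\nNew context: {c}\nRefine it.")
    return answer

print(map_reduce(chunks, "q"))  # answer#4: 3 map calls + 1 reduce call
calls["n"] = 0
print(refine(chunks, "q"))      # answer#3: one call per chunk, strictly sequential
```

Both strategies make roughly one call per chunk; the difference the lesson emphasizes is whether those calls are independent (batchable) or chained.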
00:13:39,000 --> 00:13:43,000 669 | 它们依赖于先前调用的结果。 670 | 671 | 169 672 | 00:13:43,000 --> 00:13:49,000 673 | 这意味着它通常需要更长的时间,并且基本上需要与Map Reduce一样多的调用。 674 | 675 | 170 676 | 00:13:49,000 --> 00:13:57,000 677 | Map Re-rank是一种相当有趣且更为实验性的方法,其中您对每个文档进行单个语言模型调用。 678 | 679 | 171 680 | 00:13:57,000 --> 00:14:00,000 681 | 然后您还要求它返回一个分数。 682 | 683 | 172 684 | 00:14:00,000 --> 00:14:02,000 685 | 然后您选择最高分。 686 | 687 | 173 688 | 00:14:02,000 --> 00:14:06,000 689 | 这依赖于语言模型知道分数应该是什么。 690 | 691 | 174 692 | 00:14:06,000 --> 00:14:12,000 693 | 因此,您经常需要告诉它,嘿,如果它与文档相关,则应该是高分,并在那里精细调整说明。 694 | 695 | 175 696 | 00:14:12,000 --> 00:14:15,000 697 | 与Map Reduce类似,所有调用都是独立的。 698 | 699 | 176 700 | 00:14:15,000 --> 00:14:16,000 701 | 所以您可以批量处理它们。 702 | 703 | 177 704 | 00:14:16,000 --> 00:14:18,000 705 | 而且它相对较快。 706 | 707 | 178 708 | 00:14:18,000 --> 00:14:20,000 709 | 但是,您正在进行大量的语言模型调用。 710 | 711 | 179 712 | 00:14:20,000 --> 00:14:22,000 713 | 因此,它会更加昂贵。 714 | 715 | 180 716 | 00:14:22,000 --> 00:14:29,000 717 | 这些方法中最常见的是Stuff方法,我们在笔记本中使用它将所有内容组合成一个文档。 718 | 719 | 181 720 | 00:14:29,000 --> 00:14:35,000 721 | 第二种最常见的方法是Map Reduce方法,它将这些块发送到语言模型。 722 | 723 | 182 724 | 00:14:35,000 --> 00:14:42,000 725 | 这里的这些方法,如stuff、map reduce、refine和re-rank,也可以用于除了问答之外的许多其他链。 726 | 727 | 183 728 | 00:14:42,000 --> 00:14:53,000 729 | 例如,map reduce链的一个非常常见的用例是摘要,其中您有一个非常长的文档,您想要递归地摘要其中的信息片段。 730 | 731 | 184 732 | 00:14:53,000 --> 00:14:56,000 733 | 这就是关于文档问答的全部内容。 734 | 735 | 185 736 | 00:14:56,000 --> 00:15:00,000 737 | 正如您可能已经注意到的那样,我们这里有许多不同的链条。 738 | 739 | 186 740 | 00:15:00,000 --> 00:15:12,000 741 | 因此,在下一节中,我们将介绍更好地了解所有这些链条内部究竟发生了什么的方法。 -------------------------------------------------------------------------------- /chinese/LangChain_L3.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:00,000 --> 00:00:09,560 5 | 在这节课中,哈里森将教授最重要的关键构建块,即链。 6 | 7 | 2 8 | 00:00:09,560 --> 00:00:11,960 9 | 链通常将一个llm大型语言模型与提示结合在一起。 10 | 11 | 3 12 | 00:00:12,680 --> 
00:00:17,440 13 | 使用这个构建块,您还可以将一堆这些构建块组合在一起,对您的文本或其他数据进行一系列操作。 14 | 15 | 4 16 | 00:00:17,720 --> 00:00:21,720 17 | 我很兴奋地深入研究它。 18 | 19 | 5 20 | 00:00:21,720 --> 00:00:26,040 21 | 好的,首先,我们要加载环境变量,就像以前一样。 22 | 23 | 6 24 | 00:00:26,040 --> 00:00:26,540 25 | 然后我们还要加载一些我们要使用的数据。 26 | 27 | 7 28 | 00:00:27,040 --> 00:00:28,400 29 | 这些链的一部分的强大之处在于您可以一次运行它们在许多输入上。 30 | 31 | 8 32 | 00:00:28,400 --> 00:00:33,080 33 | 因此,我们将加载一个pandas数据框架。 34 | 35 | 9 36 | 00:00:33,080 --> 00:00:33,580 37 | pandas数据框架只是包含许多不同数据元素的数据结构。 38 | 39 | 10 40 | 00:00:34,440 --> 00:00:37,240 41 | 如果您不熟悉pandas,请不要担心。 42 | 43 | 11 44 | 00:00:38,040 --> 00:00:43,480 45 | 这里的主要观点是我们正在加载一些数据,稍后可以使用。 46 | 47 | 12 48 | 00:00:43,800 --> 00:00:44,320 49 | 因此,如果我们查看这个pandas数据框架,我们可以看到有一个 50 | 51 | 13 52 | 00:00:44,760 --> 00:00:46,600 53 | 产品列,然后是一个评论列。 54 | 55 | 14 56 | 00:00:47,240 --> 00:00:50,840 57 | 每一行都是一个不同的数据点,我们可以开始通过我们的链传递。 58 | 59 | 15 60 | 00:00:50,840 --> 00:00:52,440 61 | 因此,我们要介绍的第一个链是llm链。 62 | 63 | 16 64 | 00:00:52,520 --> 00:00:54,480 65 | 这是一个简单但非常强大的链,是未来我们将要介绍的许多链的基础。 66 | 67 | 17 68 | 00:00:54,480 --> 00:00:58,560 69 | 因此,我们将导入三个不同的东西。 70 | 71 | 18 72 | 00:00:58,600 --> 00:01:02,000 73 | 我们将导入OpenAI模型。 74 | 75 | 19 76 | 00:01:02,000 --> 00:01:04,200 77 | 所以llm,我们将导入聊天提示模板。 78 | 79 | 20 80 | 00:01:04,280 --> 00:01:07,640 81 | 这是提示,然后我们将导入llm链。 82 | 83 | 21 84 | 00:01:07,640 --> 00:01:08,480 85 | 首先,我们要做的是初始化我们要使用的语言模型。 86 | 87 | 22 88 | 00:01:09,680 --> 00:01:12,120 89 | 因此,我们将使用高温度初始化聊天OpenAI。 90 | 91 | 23 92 | 00:01:12,200 --> 00:01:16,400 93 | 现在,我们将初始化提示,这个提示将接受一个 94 | 95 | 34 96 | 00:01:47,640 --> 00:01:48,920 97 | 名为product的变量。 98 | 99 | 35 100 | 00:01:49,240 --> 00:01:53,000 101 | 它将要求LLM生成描述制造该产品的最佳名称。 102 | 103 | 36 104 | 00:01:53,000 --> 00:01:54,680 105 | 公司的名称。 106 | 107 | 37 108 | 00:01:55,520 --> 00:01:59,120 109 | 最后,我们将把这两个东西组合成一个链。 110 | 111 | 38 112 | 00:01:59,760 --> 00:02:01,880 113 | 这就是我们所谓的LLM链。 114 | 115 | 39 116 | 00:02:02,000 --> 00:02:02,840 117 | 它非常简单。 118 | 119 
| 40 120 | 00:02:02,840 --> 00:02:06,120 121 | 它只是LLM和提示的组合。 122 | 123 | 41 124 | 00:02:06,720 --> 00:02:10,640 125 | 但现在,这个链将让我们按顺序运行提示和LLM。 126 | 127 | 42 128 | 00:02:10,640 --> 00:02:11,400 129 | 因此,如果我们有一个名为“queen size sheet set”的产品,我们可以通过使用chain.run将其运行通过这个链。 130 | 131 | 43 132 | 00:02:11,680 --> 00:02:16,120 133 | 在幕后,它将格式化提示,然后将整个提示传递到LLM中。 134 | 135 | 44 136 | 00:02:16,120 --> 00:02:17,840 137 | 因此,我们可以看到我们得到了这个假想公司的名称,叫做Royal Beddings。 138 | 139 | 45 140 | 00:02:18,240 --> 00:02:21,440 141 | 因此,现在是暂停的好时机,您可以输入任何产品描述,然后查看链将输出什么结果。 142 | 143 | 46 144 | 00:02:21,440 --> 00:02:24,080 145 | LLM链是最基本的链类型,将在未来经常使用。 146 | 147 | 47 148 | 00:02:24,400 --> 00:02:28,440 149 | 因此,我们可以看到这将如何在下一个类型的链中使用,即顺序链。 150 | 151 | 48 152 | 00:02:28,440 --> 00:02:29,320 153 | 因此,顺序链依次运行一系列链。 154 | 155 | 49 156 | 00:02:30,360 --> 00:02:33,440 157 | 因此,首先,您将导入简单的顺序链。 158 | 159 | 50 160 | 00:02:33,440 --> 00:02:36,720 161 | 当我们有只期望一个输入并返回一个输出的子链时,这很有效。 162 | 163 | 51 164 | 00:02:36,720 --> 00:02:37,160 165 | 结果。 166 | 167 | 52 168 | 00:02:38,120 --> 00:02:42,880 169 | 因此,我们可以看到这将如何在下一个类型的链中使用,即顺序链。 170 | 171 | 53 172 | 00:02:42,880 --> 00:02:43,680 173 | 将经常使用LLM链。 174 | 175 | 54 176 | 00:02:43,880 --> 00:02:47,320 177 | 因此,我们可以看到这将如何在下一个类型的链中使用,即顺序链。 178 | 179 | 55 180 | 00:02:47,320 --> 00:02:48,440 181 | 因此,顺序链依次运行一系列链。 182 | 183 | 56 184 | 00:02:48,440 --> 00:02:52,520 185 | 因此,顺序链依次运行一系列链。 186 | 187 | 57 188 | 00:02:52,960 --> 00:02:56,800 189 | 因此,首先,您将导入简单的顺序链。 190 | 191 | 58 192 | 00:02:57,240 --> 00:03:00,880 193 | 当我们有只期望一个输入并返回一个输出的子链时,这很有效。 194 | 195 | 59 196 | 00:03:00,880 --> 00:03:02,000 197 | 结果。 198 | 199 | 60 200 | 00:03:02,760 --> 00:03:07,600 201 | 因此,这里我们将首先创建一个链,它使用LLM和提示。 202 | 203 | 61 204 | 00:03:07,600 --> 00:03:08,160 205 | 这个提示将接受产品并返回最佳名称来描述该公司。 206 | 207 | 62 208 | 00:03:08,560 --> 00:03:13,640 209 | 那将是第一个链。 210 | 211 | 63 212 | 00:03:13,640 --> 00:03:14,640 213 | 然后,我们将创建第二个链。 214 | 215 | 64 216 | 00:03:14,840 --> 00:03:16,080 217 | 
在第二个链中,我们将接受公司名称,然后输出该公司的20个单词的描述。 218 | 219 | 65 220 | 00:03:16,080 --> 00:03:18,280 221 | 因此,您可以想象这些链如何一个接一个地运行, 222 | 235 | 69 236 | 00:03:30,160 --> 00:03:33,600 237 | 第一个链的输出,即公司名称,然后传递到第二个链中。 238 | 239 | 70 240 | 00:03:33,600 --> 00:03:34,200 241 | 第二个链。 242 | 243 | 71 244 | 00:03:35,800 --> 00:03:39,960 245 | 我们可以通过创建一个简单的顺序链来轻松实现这一点,在这个链中,我们有 246 | 247 | 72 248 | 00:03:39,960 --> 00:03:41,640 249 | 那里描述的两个链。 250 | 251 | 73 252 | 00:03:42,240 --> 00:03:44,240 253 | 我们将称之为整体简单链。 254 | 255 | 74 256 | 00:03:44,240 --> 00:03:49,720 257 | 现在,您可以在任何产品描述上运行此链。 258 | 259 | 75 260 | 00:03:50,600 --> 00:03:54,920 261 | 因此,如果我们将其与上面的产品一起使用,即女王尺寸床单套装,我们可以 262 | 263 | 76 264 | 00:03:54,920 --> 00:03:58,840 265 | 运行它,我们可以看到首先输出Royal Beddings,然后将其传递到第二个链中。 266 | 267 | 77 268 | 00:03:58,840 --> 00:04:00,200 269 | 然后它提出了这家公司可能涉及的描述。 270 | 271 | 78 272 | 00:04:00,200 --> 00:04:03,400 273 | 当只有一个输入和一个输出时,简单的顺序链运作良好。 274 | 275 | 79 276 | 00:04:05,680 --> 00:04:09,160 277 | 但是,当不只是单个输入 278 | 279 | 80 280 | 00:04:09,160 --> 00:04:09,840 281 | 和单个输出,而是有多个输入或多个输出时呢? 282 | 283 | 81 284 | 00:04:10,320 --> 00:04:12,120 285 | 那么我们可以使用普通的顺序链来实现这一点。 286 | 
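(补充示例)上面介绍的 LLM 链与简单顺序链的数据流,可以用一段不依赖任何库的纯 Python 粗略示意;其中 fake_llm、chain_one 等名称均为假设性占位,并非课程中 LangChain 的真实 API:

```python
# 纯 Python 示意:每个子链只接受一个输入、返回一个输出,
# 前一个链的输出直接作为下一个链的输入。
def fake_llm(prompt: str) -> str:
    """假设性的占位模型:用固定规则代替真实语言模型,仅作演示。"""
    if "最佳名称" in prompt:
        return "Royal Beddings"
    return "Royal Beddings:专注于优质床品的公司(约 20 个单词的描述)。"

def chain_one(product: str) -> str:
    # 第一个链的提示:描述制造该产品的公司的最佳名称
    return fake_llm(f"描述制造{product}的公司的最佳名称是什么?")

def chain_two(company_name: str) -> str:
    # 第二个链的提示:为该公司写一个 20 个单词的描述
    return fake_llm(f"为公司{company_name}写一个 20 个单词的描述。")

def simple_sequential_chain(product: str) -> str:
    # 依次运行:产品 -> 公司名称 -> 公司描述
    return chain_two(chain_one(product))

print(simple_sequential_chain("queen size sheet set"))
```

课程中真实使用的是 LangChain 的 LLMChain 与 SimpleSequentialChain:前一个链的单一输出会自动作为后一个链的单一输入。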
294 | 299 | 85 300 | 00:04:21,240 --> 00:04:22,160 301 | 所以让我们导入它。 302 | 303 | 86 304 | 00:04:22,160 --> 00:04:25,280 305 | 然后,您将创建一堆链,我们将一个接一个地使用它们。 306 | 307 | 87 308 | 00:04:26,200 --> 00:04:29,200 309 | 我们将使用上面的数据,其中有一个评论。 310 | 311 | 88 312 | 00:04:29,640 --> 00:04:34,320 313 | 因此,第一个链,我们将采取评论并将其翻译成 314 | 315 | 89 316 | 00:04:34,320 --> 00:04:34,840 317 | 英语。 318 | 319 | 90 320 | 00:04:37,240 --> 00:04:41,200 321 | 第二个链,我们将创建一个摘要 322 | 323 | 91 324 | 00:04:41,200 --> 00:04:46,840 325 | 一句话。这将使用先前生成的英文评论。 326 | 327 | 93 328 | 00:04:54,560 --> 00:04:58,720 329 | 因此,如果您注意到,这是使用来自 330 | 331 | 96 332 | 00:04:58,720 --> 00:05:00,320 333 | 原始评论的评论变量。 334 | 335 | 97 336 | 00:05:02,400 --> 00:05:05,480 337 | 最后,第四个链将接受多个输入。 338 | 339 | 98 340 | 00:05:05,840 --> 00:05:09,560 341 | 因此,这将接受我们使用第二个 342 | 343 | 99 344 | 00:05:09,560 --> 00:05:13,320 345 | 链计算的摘要变量和我们使用第三个链计算的语言变量。 346 | 347 | 100 348 | 00:05:13,760 --> 00:05:17,720 349 | 它将要求在指定的 350 | 351 | 101 352 | 00:05:17,720 --> 00:05:18,240 353 | 语言中对摘要进行跟进回复。 354 | 355 | 102 356 | 00:05:19,800 --> 00:05:23,640 357 | 关于所有这些子链的一个重要事项是输入键 358 | 359 | 103 360 | 00:05:23,720 --> 00:05:25,960 361 | 和输出键需要非常精确。 362 | 363 | 104 364 | 00:05:26,680 --> 00:05:28,520 365 | 所以在这里,我们接收评论作为输入。 366 | 367 | 105 368 | 00:05:28,600 --> 00:05:31,120 369 | 这是一个在开始时将传递的变量。 370 | 371 | 106 372 | 00:05:31,760 --> 00:05:35,320 373 | 我们可以看到我们明确将输出键设置为英文评论。 374 | 375 | 107 376 | 00:05:35,320 --> 00:05:39,840 377 | 然后在下面的下一个提示中使用它,我们接受英文评论 378 | 379 | 108 380 | 00:05:39,840 --> 00:05:44,240 381 | 使用相同的变量名,并将该链的输出键设置为摘要, 382 | 383 | 109 384 | 00:05:44,680 --> 00:05:46,920 385 | 我们可以看到它在最终链中使用。 386 | 387 | 110 388 | 00:05:47,800 --> 00:05:52,760 389 | 第三个提示接受原始的评论变量,并将输出键设为语言, 390 | 391 | 111 392 | 00:05:53,160 --> 00:05:55,040 393 | 这又在最终提示中使用。 394 | 395 | 112 396 | 00:05:56,040 --> 00:05:59,760 397 | 确保这些变量名称完全正确非常重要, 398 | 399 | 113 400 | 00:05:59,920 --> 00:06:02,400 401 | 
因为有很多不同的输入和输出。 402 | 403 | 114 404 | 00:06:02,400 --> 00:06:06,160 405 | 如果您遇到任何键错误,一定要检查它们是否对齐。 406 | 407 | 115 408 | 00:06:06,160 --> 00:06:12,040 409 | 因此,简单的顺序链接收多个链,其中每个链具有一个 410 | 411 | 116 412 | 00:06:12,040 --> 00:06:13,680 413 | 单个输入和单个输出。 414 | 415 | 117 416 | 00:06:14,560 --> 00:06:19,080 417 | 要查看其可视化表示,我们可以查看幻灯片,其中有一个 418 | 419 | 118 420 | 00:06:19,080 --> 00:06:22,760 421 | 链一个接一个地进入另一个链。 422 | 423 | 119 424 | 00:06:24,080 --> 00:06:28,000 425 | 在这里,我们可以看到顺序链的视觉描述,将其与 426 | 427 | 120 428 | 00:06:28,000 --> 00:06:32,920 429 | 上面的链进行比较,您可以注意到链中的任何步骤都可以接受多个输入 430 | 431 | 121 432 | 00:06:32,920 --> 00:06:33,720 433 | 变量。 434 | 435 | 122 436 | 00:06:34,280 --> 00:06:38,400 437 | 当您拥有更复杂的下游链,需要 438 | 439 | 123 440 | 00:06:38,400 --> 00:06:41,400 441 | 组合多个先前链的输出时,这非常有用。 442 | 443 | 124 444 | 00:06:42,840 --> 00:06:46,400 445 | 现在我们拥有所有这些链,我们可以轻松地将它们组合在顺序中 446 | 447 | 125 448 | 00:06:46,400 --> 00:06:46,920 449 | 链。 450 | 451 | 126 452 | 00:06:47,360 --> 00:06:51,880 453 | 因此,您会注意到我们将创建的四个链传递到 454 | 455 | 127 456 | 00:06:51,880 --> 00:06:52,760 457 | 链变量。 458 | 459 | 128 460 | 00:06:52,760 --> 00:06:57,280 461 | 我们将创建具有一个人类输入的输入变量,即 462 | 463 | 129 464 | 00:06:57,280 --> 00:06:58,000 465 | 评论。 466 | 467 | 130 468 | 00:06:58,400 --> 00:07:02,200 469 | 然后我们想返回所有中间输出。 470 | 471 | 131 472 | 00:07:02,200 --> 00:07:05,080 473 | 英文评论,摘要,然后是后续消息。 474 | 475 | 132 476 | 00:07:07,320 --> 00:07:10,080 477 | 现在我们可以在一些数据上运行它。 478 | 479 | 133 480 | 00:07:10,120 --> 00:07:14,800 481 | 所以让我们选择一篇评论并通过整个链传递它。 482 | 483 | 134 484 | 00:07:20,000 --> 00:07:24,920 485 | 我们可以看到这里的原始评论似乎是法语。 486 | 487 | 135 488 | 00:07:24,920 --> 00:07:27,680 489 | 我们可以把英文评论看作是一种翻译。 490 | 491 | 136 492 | 00:07:27,680 --> 00:07:31,880 493 | 我们可以看到该评论的摘要,然后我们可以看到一条用法语原文写的跟进信息。 494 | 495 | 137 496 | 00:07:31,880 --> 00:07:34,240 497 | 您应该在此暂停视频并尝试输入不同的输入。 498 | 499 | 138 500 | 00:07:34,800 --> 00:07:38,320 501 | 到目前为止,我们已经涵盖了LLM链和顺序链。 502 | 503 | 139 504 | 00:07:39,040 --> 00:07:42,560 505 | 
但是,如果您想做一些更复杂的事情怎么办? 506 | 507 | 140 508 | 00:07:43,080 --> 00:07:45,600 509 | 一个相当常见但基本的操作是根据输入将其路由到链中。 510 | 511 | 141 512 | 00:07:46,200 --> 00:07:50,440 513 | 一个很好的想象方式是,如果您有多个子链,每个子链都专门用于特定类型的输入,您可以有一个路由器链, 514 | 515 | 142 516 | 00:07:50,440 --> 00:07:52,400 517 | 首先决定要将其传递到哪个子链,然后将其传递到该链。 518 | 519 | 143 520 | 00:07:52,400 --> 00:07:57,200 521 | 一个具体的例子是,让我们看看在不同类型的链之间路由的情况,具体取决于似乎出现的主题。 522 | 523 | 144 524 | 00:07:57,200 --> 00:08:01,720 525 | 所以我们这里有不同的提示。 526 | 527 | 145 528 | 00:08:01,760 --> 00:08:06,000 529 | 一个提示适合回答物理问题。 530 | 531 | 146 532 | 00:08:06,000 --> 00:08:06,480 533 | 第二个提示适合回答数学问题。 534 | 535 | 147 536 | 00:08:07,360 --> 00:08:11,520 537 | 第三个适用于历史,第四个适用于计算机科学。 538 | 539 | 148 540 | 00:08:11,520 --> 00:08:15,720 541 | 让我们定义所有这些提示模板。 542 | 543 | 149 544 | 00:08:16,440 --> 00:08:18,640 545 | 在我们拥有这些提示模板之后,我们可以提供更多信息 546 | 547 | 150 548 | 00:08:18,800 --> 00:08:21,280 549 | 关于它们。 550 | 551 | 151 552 | 00:08:21,280 --> 00:08:23,680 553 | 我们可以为每个模板命名,然后提供描述。 554 | 555 | 152 556 | 00:08:23,680 --> 00:08:26,960 557 | 这个物理学的描述适合回答关于物理学的问题。 558 | 559 | 153 560 | 00:08:27,280 --> 00:08:29,440 561 | 这些信息将传递给路由器链。 562 | 563 | 154 564 | 00:08:33,320 --> 00:08:36,800 565 | 因此,路由器链可以决定何时使用此子链。 566 | 567 | 155 568 | 00:08:36,800 --> 00:08:37,320 569 | 我们现在可以导入我们需要的其他类型的链。 570 | 571 | 156 572 | 00:08:37,760 --> 00:08:40,600 573 | 在这里,我们需要一个多提示链。 574 | 575 | 157 576 | 00:08:41,160 --> 00:08:44,280 577 | 这是一种特定类型的链,用于在多个不同的提示模板之间进行路由。 578 | 579 | 158 580 | 00:08:44,280 --> 00:08:44,800 581 | 但是,这只是您可以路由的一种类型。 582 | 583 | 159 584 | 00:08:45,480 --> 00:08:48,560 585 | 您可以在任何类型的链之间进行路由。 586 | 587 | 168 588 | 00:09:19,000 --> 00:09:22,400 589 | 这里我们要实现的另外几个类是LLM路由器链。 590 | 591 | 169 592 | 00:09:22,880 --> 00:09:26,840 593 | 这个类本身使用语言模型来在不同的子链之间进行路由。 594 | 595 | 170 596 | 00:09:26,880 --> 00:09:30,360 597 | 这就是上面提供的描述和名称将被使用的地方。 598 | 599 | 171 600 | 00:09:31,160 --> 00:09:33,360 601 | 我们还将导入一个路由器输出解析器。 602 | 603 | 172 604 | 00:09:33,920 --> 00:09:38,080 605 | 
这将LLM输出解析为可在下游使用的字典,以确定要使用哪个链以及该链的输入应该是什么。 606 | 607 | 173 608 | 00:09:38,440 --> 00:09:42,440 609 | 现在我们可以开始使用它了。 610 | 611 | 174 612 | 00:09:42,440 --> 00:09:44,120 613 | 首先,让我们导入并定义要使用的语言模型。 614 | 615 | 175 616 | 00:09:44,160 --> 00:09:48,680 617 | 现在我们创建目标链。 618 | 619 | 176 620 | 00:09:52,000 --> 00:09:54,240 621 | 这些是由路由器链调用的链。 622 | 623 | 177 624 | 00:09:54,400 --> 00:09:57,080 625 | 正如您所看到的,每个目标链本身都是一个语言模型链, 626 | 627 | 178 628 | 00:09:57,560 --> 00:10:01,400 629 | 除了目标链之外,我们还需要一个默认链。 630 | 631 | 179 632 | 00:10:01,400 --> 00:10:02,360 633 | 这是一个当路由器无法决定使用哪个子链时调用的链。 634 | 635 | 180 636 | 00:10:04,240 --> 00:10:08,640 637 | 在上面的示例中,当输入问题与物理、数学、历史或计算机科学无关时,可能会调用它。 638 | 651 | 184 652 | 00:10:18,080 --> 00:10:22,000 653 | 现在我们定义了LLM用于在不同链之间进行路由的模板。 654 | 655 | 185 656 | 00:10:22,000 --> 00:10:25,800 657 | 这包括要完成的任务的说明以及输出应该采用的特定格式。 658 | 659 | 186 660 | 00:10:28,120 --> 00:10:31,760 661 | 让我们将其中一些部分组合起来构建路由器链。 662 | 663 | 187 664 | 00:10:31,760 --> 00:10:33,800 665 | 首先,我们通过格式化上面定义的目标创建完整的路由器模板。 666 | 667 | 188 668 | 00:10:34,720 --> 00:10:37,000 669 | 这个模板可以适应许多不同类型的目标。 670 | 671 | 189 672 | 00:10:37,000 --> 00:10:40,440 673 | 在这里,您可以暂停并添加不同类型的目标。 674 | 675 | 190 676 | 00:10:41,680 --> 00:10:44,840 677 | 因此,在这里,您可以添加一个不同的学科,如英语或拉丁语,而不仅仅是物理、数学、历史和计算机科学。 678 | 699 | 196 700 | 00:11:02,160 
--> 00:11:04,960 701 | 接下来,我们从这个模板创建提示模板, 702 | 703 | 197 704 | 00:11:04,960 --> 00:11:07,760 705 | 然后通过传入LLM和整个路由器提示来创建路由器链。 706 | 707 | 200 708 | 00:11:13,960 --> 00:11:16,360 709 | 请注意,这里有路由器输出解析器。 710 | 711 | 201 712 | 00:11:16,720 --> 00:11:19,320 713 | 这很重要,因为它将帮助这个链路决定 714 | 715 | 202 716 | 00:11:19,720 --> 00:11:22,160 717 | 在哪些子链路之间进行路由。 718 | 719 | 203 720 | 00:11:24,760 --> 00:11:28,920 721 | 最后,将所有内容整合在一起,我们可以创建整体链路。 722 | 723 | 204 724 | 00:11:29,240 --> 00:11:32,400 725 | 它有一个路由器链路,在这里定义。 726 | 727 | 205 728 | 00:11:32,400 --> 00:11:35,200 729 | 它有目标链路,我们在这里传递。 730 | 731 | 206 732 | 00:11:35,400 --> 00:11:37,200 733 | 然后我们还传递默认链路。 734 | 735 | 207 736 | 00:11:38,880 --> 00:11:40,200 737 | 现在我们可以使用这个链路。 738 | 739 | 208 740 | 00:11:40,520 --> 00:11:41,960 741 | 所以让我们问一些问题。 742 | 743 | 209 744 | 00:11:42,560 --> 00:11:45,320 745 | 如果我们问一个物理问题, 746 | 747 | 210 748 | 00:11:45,520 --> 00:11:48,920 749 | 我们应该希望看到它被路由到物理链路 750 | 751 | 211 752 | 00:11:49,640 --> 00:11:52,560 753 | 输入是什么,黑体辐射是什么? 
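(补充示例)用纯 Python 粗略示意上面的路由逻辑:路由器先为输入选择一个目标子链,选不中时落入默认链。其中 router 用关键词匹配代替真实实现中由语言模型做出的决定,所有名称与关键词均为假设性占位:

```python
# 纯 Python 示意:根据输入把问题路由到不同的"提示 + 模型"子链。
destination_chains = {
    "physics": lambda q: "[物理提示] " + q,
    "math": lambda q: "[数学提示] " + q,
    "history": lambda q: "[历史提示] " + q,
    "computer science": lambda q: "[计算机科学提示] " + q,
}

def default_chain(q: str) -> str:
    # 对语言模型的通用调用(这里同样用占位文本示意)
    return "[默认链] " + q

def router(question: str) -> str:
    # 真实实现中由语言模型阅读每个目标的名称和描述来做决定;
    # 这里用简单的关键词匹配近似。
    keywords = {"黑体辐射": "physics", "2 + 2": "math", "战争": "history", "DNA": "biology"}
    for kw, destination in keywords.items():
        if kw in question:
            return destination
    return "None"

def multi_prompt_chain(question: str) -> str:
    # 路由结果不在目标链中(例如生物学问题)时,落入默认链
    chain = destination_chains.get(router(question), default_chain)
    return chain(question)

print(multi_prompt_chain("什么是黑体辐射?"))
print(multi_prompt_chain("什么是DNA?"))
```

这与课程中的行为一致:生物学问题不匹配任何目标,于是落入默认链,由通用的语言模型调用来回答。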
754 | 755 | 212 756 | 00:11:52,800 --> 00:11:55,480 757 | 然后它被传递到下面的链路中。 758 | 759 | 213 760 | 00:11:55,480 --> 00:11:59,080 761 | 我们可以看到回答非常详细 762 | 763 | 214 764 | 00:11:59,080 --> 00:12:01,080 765 | 有很多物理细节。 766 | 767 | 215 768 | 00:12:01,080 --> 00:12:04,600 769 | 您应该在此暂停视频并尝试输入不同的内容。 770 | 771 | 216 772 | 00:12:04,920 --> 00:12:08,520 773 | 您可以尝试使用我们上面定义的所有其他类型的特殊链路 774 | 775 | 217 776 | 00:12:08,520 --> 00:12:09,920 777 | 。 778 | 779 | 218 780 | 00:12:10,440 --> 00:12:13,240 781 | 因此,例如,如果我们问一个数学问题, 782 | 783 | 219 784 | 00:12:21,600 --> 00:12:23,680 785 | 我们应该看到它被路由到数学链路 786 | 787 | 220 788 | 00:12:24,040 --> 00:12:25,120 789 | 然后传递到那里。 790 | 791 | 221 792 | 00:12:25,120 --> 00:12:27,720 793 | 我们还可以看到当我们传递一个问题时会发生什么 794 | 795 | 222 796 | 00:12:27,920 --> 00:12:30,320 797 | 与任何子链路都无关的。 798 | 799 | 223 800 | 00:12:30,720 --> 00:12:33,480 801 | 所以在这里,我们问一个关于生物学的问题 802 | 803 | 224 804 | 00:12:33,760 --> 00:12:35,880 805 | 我们可以看到它选择的链路是无。 806 | 807 | 225 808 | 00:12:36,520 --> 00:12:38,400 809 | 这意味着它将被传递到默认链路, 810 | 811 | 226 812 | 00:12:38,400 --> 00:12:41,360 813 | 它本身只是对语言模型的通用调用。 814 | 815 | 227 816 | 00:12:41,560 --> 00:12:43,680 817 | 语言模型幸运地对生物学知道很多, 818 | 819 | 228 820 | 00:12:43,680 --> 00:12:44,880 821 | 所以它可以帮助我们。 822 | 823 | 229 824 | 00:12:46,080 --> 00:12:48,120 825 | 现在我们已经涵盖了这些基本问题, 826 | 827 | 230 828 | 00:12:48,120 --> 00:12:50,120 829 | 让我们继续视频的下一部分。 830 | 831 | 231 832 | 00:12:50,120 --> 00:12:52,120 833 | 那就是如何创建一个新的链路。 834 | 835 | 232 836 | 00:12:52,120 --> 00:12:55,120 837 | 因此,例如,在下一部分中, 838 | 839 | 233 840 | 00:12:55,120 --> 00:12:57,120 841 | 我们将介绍如何创建一个链路 842 | 843 | 234 844 | 00:12:57,120 --> 00:13:22,120 845 | 可以对您的文档进行问答。 -------------------------------------------------------------------------------- /chinese/LangChain_L5.srt: -------------------------------------------------------------------------------- 1 | 2 | 3 | 1 4 | 00:00:01,000 --> 00:00:16,000 5 | 当使用llm构建复杂应用程序时,评估应用程序的表现是一个重要但有时棘手的步骤。它是否满足某些准确性标准? 
6 | 7 | 2 8 | 00:00:16,000 --> 00:00:33,000 9 | 此外,如果您决定更改实现,可能会交换不同的llm或更改如何使用矢量数据库或其他内容检索通道或更改系统的某些其他参数的策略,那么如何知道您是在使其变得更好还是更糟? 10 | 11 | 3 12 | 00:00:34,000 --> 00:00:42,000 13 | 在本视频中,哈里森将深入探讨一些框架,以思考如何评估基于llm的应用程序以及一些工具来帮助您做到这一点。 14 | 15 | 4 16 | 00:00:42,000 --> 00:00:53,000 17 | 这些应用程序实际上是许多不同步骤的链和序列。因此,老实说,您应该做的第一件事就是了解每个步骤的具体情况。 18 | 19 | 5 20 | 00:00:54,000 --> 00:00:58,000 21 | 因此,一些工具实际上可以被视为可视化器或调试器。 22 | 23 | 6 24 | 00:00:59,000 --> 00:01:04,000 25 | 但是,通常更有用的是从许多不同的数据点中获得更全面的模型表现情况。 26 | 27 | 7 28 | 00:01:04,000 --> 00:01:15,000 29 | 一种方法是通过肉眼观察来做到这一点。但是,还有这个非常酷的想法,即使用语言模型本身和链本身来评估其他语言模型、其他链和其他应用程序。 30 | 31 | 8 32 | 00:01:16,000 --> 00:01:17,000 33 | 我们也将深入探讨这个想法。 34 | 35 | 9 36 | 00:01:18,000 --> 00:01:30,000 37 | 因此,有很多很酷的主题。我发现随着许多开发转向基于提示的开发,使用llm开发应用程序,整个工作流程评估过程正在被重新思考。 38 | 39 | 10 40 | 00:01:30,000 --> 00:01:34,000 41 | 因此,在本视频中有许多令人兴奋的概念。让我们开始吧。 42 | 43 | 11 44 | 00:01:35,000 --> 00:01:38,000 45 | 好的。那么,让我们开始评估。 46 | 47 | 12 48 | 00:01:39,000 --> 00:01:44,000 49 | 首先,我们需要有我们要评估的链或应用程序。 50 | 51 | 13 52 | 00:01:45,000 --> 00:01:49,000 53 | 我们将使用上一课的文档问答链。 54 | 55 | 14 56 | 00:01:50,000 --> 00:01:55,000 57 | 因此,我们将导入我们需要的所有内容。我们将加载相同的数据。 58 | 59 | 15 60 | 00:01:55,000 --> 00:01:58,000 61 | 我们将用一行代码创建该索引。 62 | 63 | 16 64 | 00:01:59,000 --> 00:02:11,000 65 | 然后,我们将通过指定语言模型、链类型、检索器和我们要打印的详细程度来创建检索QA链。 66 | 67 | 17 68 | 00:02:13,000 --> 00:02:14,000 69 | 因此,我们有了这个应用程序。 70 | 71 | 18 72 | 00:02:14,000 --> 00:02:25,000 73 | 我们需要做的第一件事是真正弄清楚我们想要评估它的一些数据点。 74 | 75 | 19 76 | 00:02:26,000 --> 00:02:29,000 77 | 因此,我们将介绍几种不同的方法来完成这个任务。 78 | 79 | 20 80 | 00:02:30,000 --> 00:02:37,000 81 | 第一种方法是最简单的,基本上我们将自己想出好的数据点作为例子。 82 | 83 | 21 84 | 00:02:37,000 --> 00:02:47,000 85 | 为此,我们可以查看一些数据,然后想出例子问题和答案,以便以后用于评估。 86 | 87 | 22 88 | 00:02:48,000 --> 00:02:55,000 89 | 因此,如果我们查看这里的一些文档,我们可以对其中发生的事情有所了解。 90 | 91 | 23 92 | 00:02:56,000 --> 00:03:01,000 93 | 看起来第一个文档中有这个套头衫,第二个文档中有这个夹克。 94 | 95 | 24 96 | 00:03:02,000 --> 00:03:05,000 97 | 它们都有很多细节。 98 | 99 | 25 100 | 
00:03:05,000 --> 00:03:10,000 101 | 从这些细节中,我们可以创建一些例子查询和答案对。 102 | 103 | 26 104 | 00:03:11,000 --> 00:03:16,000 105 | 因此,我们可以问一个简单的问题,这个舒适的套头衫套装有侧口袋吗? 106 | 107 | 27 108 | 00:03:17,000 --> 00:03:22,000 109 | 我们可以通过上面的内容看到,它确实有一些侧口袋。 110 | 111 | 28 112 | 00:03:23,000 --> 00:03:29,000 113 | 然后对于第二个文档,我们可以看到这件夹克来自某个系列,即down tech系列。 114 | 115 | 29 116 | 00:03:30,000 --> 00:03:32,000 117 | 因此,我们可以问一个问题,这件夹克来自哪个系列? 118 | 119 | 30 120 | 00:03:32,000 --> 00:03:35,000 121 | 答案是down tech系列。 122 | 123 | 31 124 | 00:03:36,000 --> 00:03:38,000 125 | 因此,我们创建了两个例子。 126 | 127 | 32 128 | 00:03:39,000 --> 00:03:41,000 129 | 但这并不是很可扩展。 130 | 131 | 33 132 | 00:03:42,000 --> 00:03:45,000 133 | 需要花费一些时间查看每个例子并弄清楚发生了什么。 134 | 135 | 34 136 | 00:03:46,000 --> 00:03:48,000 137 | 因此,有没有一种方法可以自动化? 138 | 139 | 35 140 | 00:03:49,000 --> 00:03:53,000 141 | 我们认为可以使用语言模型本身来实现这一点。 142 | 143 | 36 144 | 00:03:54,000 --> 00:03:57,000 145 | 因此,我们在LangChain中有一个链可以做到这一点。 146 | 147 | 37 148 | 00:03:58,000 --> 00:04:00,000 149 | 因此,我们可以导入QA生成链。 150 | 151 | 38 152 | 00:04:00,000 --> 00:04:05,000 153 | 它将接收文档,并从每个文档中创建一个问题答案对。 154 | 155 | 39 156 | 00:04:06,000 --> 00:04:08,000 157 | 它将使用语言模型本身来完成这一点。 158 | 159 | 40 160 | 00:04:09,000 --> 00:04:13,000 161 | 因此,我们需要通过传递chat open AI语言模型来创建这个链。 162 | 163 | 41 164 | 00:04:14,000 --> 00:04:16,000 165 | 然后,我们可以创建许多例子。 166 | 167 | 42 168 | 00:04:17,000 --> 00:04:23,000 169 | 因此,我们将使用apply和parse方法,因为这将应用输出解析器到结果。 170 | 171 | 43 172 | 00:04:24,000 --> 00:04:27,000 173 | 因为我们想要得到一个具有查询和答案对的字典, 174 | 175 | 44 176 | 00:04:27,000 --> 00:04:29,000 177 | 而不仅仅是一个字符串。 178 | 179 | 45 180 | 00:04:36,000 --> 00:04:39,000 181 | 因此,现在如果我们看看这里返回了什么, 182 | 183 | 46 184 | 00:04:40,000 --> 00:04:42,000 185 | 我们可以看到一个查询和一个答案。 186 | 187 | 47 188 | 00:04:43,000 --> 00:04:46,000 189 | 让我们检查一下这是一个问题和答案的文档。 190 | 191 | 48 192 | 00:04:47,000 --> 00:04:49,000 193 | 我们可以看到它正在询问这个的重量。 194 | 195 | 49 196 | 00:04:50,000 --> 00:04:52,000 197 | 我们可以看到它正在从这里获取重量。 198 | 199 | 50 200 | 00:04:53,000 
--> 00:04:54,000 201 | 看看那个。 202 | 203 | 51 204 | 00:04:54,000 --> 00:04:56,000 205 | 我们刚刚生成了一堆问题答案对。 206 | 207 | 52 208 | 00:04:57,000 --> 00:04:58,000 209 | 我们不必自己编写它们。 210 | 211 | 53 212 | 00:04:59,000 --> 00:05:02,000 213 | 节省了我们很多时间,我们可以做更有趣的事情。 214 | 215 | 54 216 | 00:05:03,000 --> 00:05:08,000 217 | 因此,现在让我们将这些示例添加到我们已经创建的示例中。 218 | 219 | 55 220 | 00:05:09,000 --> 00:05:14,000 221 | 所以,我们现在有了这些示例,但是我们如何评估正在发生的事情呢? 222 | 223 | 56 224 | 00:05:15,000 --> 00:05:18,000 225 | 我们想做的第一件事就是运行一个示例通过链 226 | 227 | 57 228 | 00:05:19,000 --> 00:05:21,000 229 | 并查看它产生的输出。 230 | 231 | 58 232 | 00:05:21,000 --> 00:05:25,000 233 | 因此,在这里我们传递一个查询,然后我们得到一个答案。 234 | 235 | 59 236 | 00:05:26,000 --> 00:05:29,000 237 | 但是这在了解链中实际发生的事情方面有点受限 238 | 239 | 60 240 | 00:05:30,000 --> 00:05:31,000 241 | 实际上正在发生的事情。 242 | 243 | 61 244 | 00:05:32,000 --> 00:05:34,000 245 | 进入语言模型的实际提示是什么? 246 | 247 | 62 248 | 00:05:35,000 --> 00:05:37,000 249 | 它检索的文档是什么? 250 | 251 | 63 252 | 00:05:38,000 --> 00:05:40,000 253 | 如果这是一个更复杂的链,其中有多个步骤, 254 | 255 | 64 256 | 00:05:41,000 --> 00:05:42,000 257 | 中间结果是什么? 
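(补充示例)"用语言模型从文档自动生成问答对"的流程可以用纯 Python 示意如下;fake_qa_generate 用硬编码规则代替真实的 QA 生成链,文档内容与名称均为假设:

```python
# 纯 Python 示意:为每个文档生成一个 {"query": ..., "answer": ...} 对,
# 再与手工编写的例子合并成评估集。
docs = [
    "舒适套头衫套装:条纹款,带侧口袋。",
    "这件夹克来自 Down Tech 系列,轻盈保暖。",
]

def fake_qa_generate(doc: str) -> dict:
    # 真实实现中由语言模型阅读文档并同时生成问题和答案
    if "套头衫" in doc:
        return {"query": "这个舒适的套头衫套装有侧口袋吗?", "answer": "有侧口袋。"}
    return {"query": "这件夹克来自哪个系列?", "answer": "Down Tech 系列。"}

manual_examples = [
    {"query": "这个舒适的套头衫套装有侧口袋吗?", "answer": "有。"},
]
generated_examples = [fake_qa_generate(d) for d in docs]
examples = manual_examples + generated_examples
print(len(examples))
```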
258 | 259 | 65 260 | 00:05:43,000 --> 00:05:46,000 261 | 仅仅查看最终答案通常不足以了解链中出现了什么问题或可能出现了什么问题。 262 | 263 | 66 264 | 00:05:46,000 --> 00:05:50,000 265 | 为了帮助解决这个问题,我们在LangChain中有一个有趣的小工具,称为langchain.debug。 266 | 267 | 67 268 | 00:05:51,000 --> 00:05:56,000 269 | 因此,如果我们将langchain.debug设置为true 270 | 271 | 68 272 | 00:06:03,000 --> 00:06:06,000 273 | 现在我们重新运行与上面相同的示例, 274 | 275 | 69 276 | 00:06:07,000 --> 00:06:11,000 277 | 我们可以看到它开始打印出更多的信息。 278 | 279 | 70 280 | 00:06:12,000 --> 00:06:14,000 281 | 因此,如果我们看看它到底打印了什么, 282 | 283 | 71 284 | 00:06:14,000 --> 00:06:18,000 285 | 我们可以看到它首先深入到检索QA链中 286 | 287 | 72 288 | 00:06:19,000 --> 00:06:21,000 289 | 然后它进入了一些文档链。 290 | 291 | 73 292 | 00:06:22,000 --> 00:06:24,000 293 | 因此,如上所述,我们正在使用stuff方法。 294 | 295 | 74 296 | 00:06:25,000 --> 00:06:28,000 297 | 现在它正在进入LLM链,我们有几个不同的输入。 298 | 299 | 75 300 | 00:06:29,000 --> 00:06:31,000 301 | 因此,我们可以看到原始问题就在那里。 302 | 303 | 76 304 | 00:06:32,000 --> 00:06:33,000 305 | 现在我们正在传递这个上下文。 306 | 307 | 78 308 | 00:06:34,000 --> 00:06:36,000 309 | 我们可以看到,这个上下文是由我们检索到的不同文档创建的。 310 | 311 | 79 312 | 00:06:37,000 --> 00:06:40,000 313 | 因此,在进行问答时,当返回错误结果时, 314 | 315 | 80 316 | 00:06:41,000 --> 00:06:42,000 317 | 通常不是语言模型本身出错了。 318 | 319 | 81 320 | 00:06:42,000 --> 00:06:44,000 321 | 实际上是检索步骤出错了。 322 | 323 | 82 324 | 00:06:45,000 --> 00:06:48,000 325 | 因此,仔细查看问题的确切内容和上下文可以帮助调试出错的原因。 326 | 327 | 83 328 | 00:06:49,000 --> 00:06:50,000 329 | 然后,我们可以再向下一级 330 | 331 | 84 332 | 00:06:51,000 --> 00:06:54,000 333 | 看看进入语言模型的确切内容, 334 | 335 | 85 336 | 00:06:55,000 --> 00:06:58,000 337 | 以及 OpenAI 自身。 338 | 339 | 86 340 | 00:06:59,000 --> 00:07:01,000 341 | 因此,在这里,我们可以看到传递的完整提示。 342 | 343 | 87 344 | 00:07:02,000 --> 00:07:04,000 345 | 所以,我们有一个系统消息。 346 | 347 | 88 348 | 00:07:05,000 --> 00:07:06,000 349 | 我们有所使用的提示的描述。 350 | 351 | 89 352 | 00:07:07,000 --> 00:07:09,000 353 | 因此,这是问题回答链使用的提示, 354 | 355 | 90 356 | 00:07:09,000 --> 00:07:11,000 357 | 我们甚至直到现在都没有看过。 358 | 359 | 91 360 
| 因此,我们可以看到提示打印出来, 362 | 363 | 92 364 | 00:07:15,000 --> 00:07:17,000 365 | 使用以下上下文片段回答用户的问题。 366 | 367 | 93 368 | 00:07:18,000 --> 00:07:19,000 369 | 如果您不知道答案,只需说您不知道即可。 370 | 371 | 94 372 | 00:07:20,000 --> 00:07:21,000 373 | 不要试图编造答案。 374 | 375 | 95 376 | 00:07:22,000 --> 00:07:23,000 377 | 然后我们看到一堆之前插入的上下文, 378 | 379 | 96 380 | 00:07:24,000 --> 00:07:26,000 381 | 然后我们看到一个人类问题, 382 | 383 | 97 384 | 00:07:27,000 --> 00:07:28,000 385 | 也就是我们问它的问题。 386 | 387 | 98 388 | 00:07:29,000 --> 00:07:30,000 389 | 我们还可以看到有关实际返回类型的更多信息。 390 | 391 | 99 392 | 00:07:31,000 --> 00:07:33,000 393 | 因此,我们不仅仅返回一个字符串, 394 | 395 | 100 396 | 00:07:34,000 --> 00:07:35,000 397 | 我们还返回了许多信息,如令牌使用情况, 398 | 399 | 101 400 | 00:07:36,000 --> 00:07:37,000 401 | 因此,提示令牌、完成令牌、 402 | 403 | 102 404 | 00:07:38,000 --> 00:07:40,000 405 | 总令牌和模型名称。 406 | 407 | 103 408 | 00:07:41,000 --> 00:07:42,000 409 | 这可以非常有用地跟踪您在链中使用的令牌 410 | 411 | 104 412 | 00:07:43,000 --> 00:07:45,000 413 | 或随时间调用语言模型的令牌 414 | 415 | 105 416 | 00:07:46,000 --> 00:07:47,000 417 | 并跟踪总令牌数, 418 | 419 | 106 420 | 00:07:48,000 --> 00:07:50,000 421 | 这与总成本非常接近。 422 | 423 | 107 424 | 00:07:51,000 --> 00:07:53,000 425 | 由于这是一个相对简单的链, 426 | 427 | 108 428 | 00:07:54,000 --> 00:07:55,000 429 | 我们现在可以看到最终的响应, 430 | 431 | 114 432 | 00:08:07,000 --> 00:08:09,000 433 | 舒适的毛衣套装,条纹款, 434 | 435 | 115 436 | 00:08:10,000 --> 00:08:11,000 437 | 确实有侧口袋,这个最终答案 438 | 439 | 116 440 | 00:08:12,000 --> 00:08:14,000 441 | 通过链条返回给用户。 442 | 443 | 117 444 | 00:08:15,000 --> 00:08:17,000 445 | 因此,我们刚刚讲解了如何查看和调试单个输入到该链的情况。 446 | 447 | 118 448 | 00:08:18,000 --> 00:08:21,000 449 | 但是我们创建的所有示例呢? 450 | 451 | 119 452 | 00:08:22,000 --> 00:08:23,000 453 | 我们该如何评估它们? 
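(补充示例)调试输出中看到的提示结构——系统指令加上下文片段,再加人类问题——可以用纯 Python 示意如下;消息格式参照常见的聊天模型接口,仅作说明:

```python
# 纯 Python 示意:把检索到的文档"塞进"(stuff)同一个提示,
# 形成 系统消息 + 人类问题 的消息列表。
def build_stuff_qa_prompt(context_docs, question):
    system = (
        "使用以下上下文片段回答用户的问题。"
        "如果您不知道答案,只需说您不知道,不要试图编造答案。\n\n"
        + "\n---\n".join(context_docs)
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_stuff_qa_prompt(
    ["舒适套头衫套装:条纹款,带侧口袋。"],
    "这个舒适的套头衫套装有侧口袋吗?",
)
print(messages[0]["role"])
```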
454 | 455 | 120 456 | 00:08:24,000 --> 00:08:25,000 457 | 与创建它们类似, 458 | 459 | 121 460 | 00:08:26,000 --> 00:08:28,000 461 | 一种方法是手动进行。 462 | 463 | 122 464 | 00:08:29,000 --> 00:08:30,000 465 | 我们可以运行链条来处理所有示例, 466 | 467 | 123 468 | 00:08:31,000 --> 00:08:32,000 469 | 然后查看输出并尝试弄清楚 470 | 471 | 124 472 | 00:08:32,000 --> 00:08:34,000 473 | 发生了什么,它是否正确, 474 | 475 | 125 476 | 00:08:35,000 --> 00:08:36,000 477 | 不正确,部分正确。 478 | 479 | 126 480 | 00:08:37,000 --> 00:08:38,000 481 | 与创建示例类似, 482 | 483 | 127 484 | 00:08:39,000 --> 00:08:40,000 485 | 随着时间的推移,这开始变得有点乏味。 486 | 487 | 128 488 | 00:08:41,000 --> 00:08:42,000 489 | 因此,让我们回到我们最喜欢的解决方案。 490 | 491 | 129 492 | 00:08:43,000 --> 00:08:45,000 493 | 我们可以要求语言模型来做吗? 494 | 495 | 130 496 | 00:08:46,000 --> 00:08:48,000 497 | 首先,我们需要为所有示例创建预测。 498 | 499 | 131 500 | 00:08:49,000 --> 00:08:51,000 501 | 在这之前,我要关闭 502 | 503 | 132 504 | 00:08:52,000 --> 00:08:54,000 505 | 调试模式,以便不将所有内容打印到屏幕上。 506 | 507 | 133 508 | 00:08:55,000 --> 00:08:57,000 509 | 然后我将为所有不同的示例创建预测。 510 | 511 | 134 512 | 00:08:58,000 --> 00:08:59,000 513 | 因此,我认为我们总共有七个示例。 514 | 515 | 135 516 | 00:09:00,000 --> 00:09:01,000 517 | 现在我们有了这些示例, 518 | 519 | 136 520 | 00:09:02,000 --> 00:09:03,000 521 | 我们可以考虑评估它们。 522 | 523 | 137 524 | 00:09:04,000 --> 00:09:06,000 525 | 因此,我们将导入QA, 526 | 527 | 138 528 | 00:09:07,000 --> 00:09:09,000 529 | 问题回答,评估链。 530 | 531 | 139 532 | 00:09:09,000 --> 00:09:29,000 533 | 我们将通过语言模型创建此链, 534 | 535 | 140 536 | 00:09:30,000 --> 00:09:31,000 537 | 因为我们将使用语言模型 538 | 539 | 141 540 | 00:09:32,000 --> 00:09:33,000 541 | 来帮助进行评估。 542 | 543 | 142 544 | 00:09:34,000 --> 00:09:35,000 545 | 然后我们将在此链上调用evaluate。 546 | 547 | 143 548 | 00:09:35,000 --> 00:09:38,000 549 | 我们将传入示例和预测, 550 | 551 | 144 552 | 00:09:39,000 --> 00:09:40,000 553 | 然后我们将得到一堆分级输出。 554 | 555 | 145 556 | 00:09:41,000 --> 00:09:42,000 557 | 因此,为了看到每个示例的情况, 558 | 559 | 153 560 | 00:10:07,000 --> 00:10:08,000 561 | 我们将循环遍历它们。 562 | 563 | 154 564 | 00:10:09,000 --> 00:10:10,000 565 
| 我们将打印出问题。 566 | 567 | 155 568 | 00:10:11,000 --> 00:10:12,000 569 | 而且,这是由语言模型生成的。 570 | 571 | 156 572 | 00:10:13,000 --> 00:10:14,000 573 | 我们将打印出真正的答案。 574 | 575 | 157 576 | 00:10:15,000 --> 00:10:17,000 577 | 而且,这也是由语言模型生成的。 578 | 579 | 158 580 | 00:10:18,000 --> 00:10:19,000 581 | 当它面前有整个文档时,它可以生成一个真实的答案。 582 | 583 | 159 584 | 00:10:20,000 --> 00:10:22,000 585 | 因此,它可以生成一个真实的答案。 586 | 587 | 160 588 | 00:10:23,000 --> 00:10:24,000 589 | 我们将打印出预测的答案。 590 | 591 | 161 592 | 00:10:25,000 --> 00:10:26,000 593 | 这是由语言模型生成的。 594 | 595 | 162 596 | 00:10:27,000 --> 00:10:28,000 597 | 当它进行QA链时, 598 | 599 | 163 600 | 00:10:29,000 --> 00:10:30,000 601 | 当它使用嵌入和向量数据库进行检索时, 602 | 603 | 164 604 | 00:10:30,000 --> 00:10:32,000 605 | 然后将其传递到语言模型中, 606 | 607 | 165 608 | 00:10:33,000 --> 00:10:35,000 609 | 然后尝试猜测预测的答案。 610 | 611 | 166 612 | 00:10:36,000 --> 00:10:37,000 613 | 然后我们还将打印出成绩。 614 | 615 | 167 616 | 00:10:38,000 --> 00:10:40,000 617 | 而且,这也是由语言模型生成的 618 | 619 | 168 620 | 00:10:41,000 --> 00:10:43,000 621 | 当它要求评估链评估正在发生的事情时, 622 | 623 | 169 624 | 00:10:44,000 --> 00:10:45,000 625 | 以及它是否正确或不正确。 626 | 627 | 170 628 | 00:10:46,000 --> 00:10:47,000 629 | 因此,当我们循环遍历所有这些示例并将它们打印出来时, 630 | 631 | 171 632 | 00:10:48,000 --> 00:10:49,000 633 | 我们可以详细了解每个示例。 634 | 635 | 172 636 | 00:10:50,000 --> 00:10:53,000 637 | 对于每个示例,它看起来都是正确的。 638 | 639 | 173 640 | 00:10:54,000 --> 00:10:56,000 641 | 这是一个相对简单的检索问题, 642 | 643 | 174 644 | 00:11:00,000 --> 00:11:01,000 645 | 所以这是令人放心的。 646 | 647 | 175 648 | 00:11:02,000 --> 00:11:04,000 649 | 那么,让我们看看第一个例子。 650 | 651 | 176 652 | 00:11:05,000 --> 00:11:07,000 653 | 这里的问题是,舒适的套头衫套装 654 | 655 | 178 656 | 00:11:08,000 --> 00:11:09,000 657 | 有侧口袋吗? 
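(补充示例)上面评估循环的骨架可以用纯 Python 示意如下;fake_grader 用朴素的子串包含来近似真实实现中由语言模型完成的语义比较,这也正好说明了为什么精确字符串匹配并不够用:

```python
# 纯 Python 示意:对每个示例打印 问题 / 真实答案 / 预测答案 / 成绩。
examples = [
    {"query": "这个舒适的套头衫套装有侧口袋吗?", "answer": "有侧口袋。"},
]
predictions = [
    {"result": "舒适的套头衫套装(条纹款)确实有侧口袋。"},
]

def fake_grader(truth: str, predicted: str) -> str:
    # 占位打分器:真实实现中由语言模型判断语义是否一致
    return "CORRECT" if truth.strip("。") in predicted else "INCORRECT"

for ex, pred in zip(examples, predictions):
    print("问题:", ex["query"])
    print("真实答案:", ex["answer"])
    print("预测答案:", pred["result"])
    print("成绩:", fake_grader(ex["answer"], pred["result"]))
```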
658 | 659 | 179 660 | 00:11:10,000 --> 00:11:11,000 661 | 真正的答案,我们创建了这个,是肯定的。 662 | 663 | 180 664 | 00:11:12,000 --> 00:11:14,000 665 | 预测的答案,语言模型产生的, 666 | 667 | 181 668 | 00:11:15,000 --> 00:11:17,000 669 | 是舒适的套头衫套装条纹 670 | 671 | 182 672 | 00:11:18,000 --> 00:11:19,000 673 | 确实有侧口袋。 674 | 675 | 183 676 | 00:11:20,000 --> 00:11:22,000 677 | 因此,我们可以理解这是一个正确的答案。 678 | 679 | 184 680 | 00:11:22,000 --> 00:11:25,000 681 | 实际上,语言模型也是这样, 682 | 683 | 185 684 | 00:11:26,000 --> 00:11:27,000 685 | 它将其评为正确。 686 | 687 | 186 688 | 00:11:28,000 --> 00:11:29,000 689 | 但是让我们想想为什么我们需要使用 690 | 691 | 187 692 | 00:11:30,000 --> 00:11:31,000 693 | 语言模型首先。 694 | 695 | 696 | 697 | 191 698 | 00:11:40,000 --> 00:11:41,000 699 | 我甚至认为“是”的字眼都没有出现过。 700 | 701 | 192 702 | 00:11:42,000 --> 00:11:43,000 703 | 在这个字符串中。 704 | 705 | 193 706 | 00:11:44,000 --> 00:11:45,000 707 | 因此,如果我们尝试进行一些字符串匹配 708 | 709 | 194 710 | 00:11:46,000 --> 00:11:47,000 711 | 或精确匹配,甚至在这里使用一些正则表达式, 712 | 713 | 195 714 | 00:11:48,000 --> 00:11:49,000 715 | 它就不知道该怎么做了。 716 | 717 | 196 718 | 00:11:49,000 --> 00:11:51,000 719 | 它们不是同一件事。 720 | 721 | 197 722 | 00:11:52,000 --> 00:11:53,000 723 | 这显示了使用语言模型进行评估的重要性。 724 | 725 | 198 726 | 00:11:54,000 --> 00:11:55,000 727 | 你有这些答案,它们是任意的字符串。 728 | 729 | 199 730 | 00:11:56,000 --> 00:12:00,000 731 | 没有单一的真实字符串是最好的可能答案。 732 | 733 | 200 734 | 00:12:01,000 --> 00:12:03,000 735 | 有许多不同的变体。 736 | 737 | 201 738 | 00:12:04,000 --> 00:12:05,000 739 | 只要它们具有相同的语义意义, 740 | 741 | 202 742 | 00:12:06,000 --> 00:12:07,000 743 | 它们应该被评为相似。 744 | 745 | 203 746 | 00:12:08,000 --> 00:12:09,000 747 | 这就是语言模型的帮助所在, 748 | 749 | 204 750 | 00:12:10,000 --> 00:12:12,000 751 | 而不是仅仅进行精确匹配。 752 | 753 | 207 754 | 00:12:16,000 --> 00:12:19,000 755 | 这种比较字符串的困难是使语言模型评估变得如此困难的原因。 756 | 757 | 208 758 | 00:12:20,000 --> 00:12:23,000 759 | 我们正在将它们用于这些非常开放的任务 760 | 761 | 209 762 | 00:12:24,000 --> 00:12:26,000 763 | 其中它们被要求生成文本。 764 | 765 | 210 766 | 00:12:27,000 --> 00:12:28,000 767 | 这以前并没有真正做过, 768 | 769 | 
211 770 | 00:12:29,000 --> 00:12:30,000 771 | 因为直到最近的模型还不够好 772 | 773 | 212 774 | 00:12:31,000 --> 00:12:32,000 775 | 来做到这一点。 776 | 777 | 213 778 | 00:12:33,000 --> 00:12:34,000 779 | 因此,到目前为止存在的许多评估指标都不够好。 780 | 781 | 214 782 | 00:12:35,000 --> 00:12:36,000 783 | 我们不得不发明新的指标 784 | 785 | 215 786 | 00:12:37,000 --> 00:12:38,000 787 | 和发明新的启发式方法来做到这一点。 788 | 789 | 216 790 | 00:12:39,000 --> 00:12:40,000 791 | 目前最有趣和最受欢迎的 792 | 793 | 219 794 | 00:12:44,000 --> 00:12:46,000 795 | 这些启发式方法之一 796 | 797 | 220 798 | 00:12:47,000 --> 00:12:49,000 799 | 实际上是使用语言模型进行评估。 800 | 801 | 221 802 | 00:12:50,000 --> 00:12:51,000 803 | 这结束了评估课程, 804 | 805 | 222 806 | 00:12:52,000 --> 00:12:53,000 807 | 但我想向您展示的最后一件事 808 | 809 | 223 810 | 00:12:54,000 --> 00:12:55,000 811 | 是LangChain评估平台。 812 | 813 | 224 814 | 00:12:56,000 --> 00:12:58,000 815 | 这是一种在笔记本中执行我们刚刚执行的所有操作的方法,但是将其持久化并在UI中显示。 816 | 817 | 225 818 | 00:12:59,000 --> 00:13:01,000 819 | 因此,让我们来看看。 820 | 821 | 230 822 | 00:13:09,000 --> 00:13:13,000 823 | 我们在笔记本中运行的所有运行。 824 | 825 | 231 826 | 00:13:14,000 --> 00:13:16,000 827 | 因此,这是跟踪输入和输出的好方法 828 | 829 | 232 830 | 00:13:17,000 --> 00:13:18,000 831 | 在高层次上,但这也是一种非常好的方式 832 | 833 | 233 834 | 00:13:19,000 --> 00:13:21,000 835 | 看看底层到底发生了什么。 836 | 837 | 234 838 | 00:13:22,000 --> 00:13:24,000 839 | 因此,这是在打开调试模式时打印出的相同信息 840 | 841 | 235 842 | 00:13:25,000 --> 00:13:26,000 843 | 在一个 UI 中可视化 844 | 845 | 236 846 | 00:13:27,000 --> 00:13:28,000 847 | 在一个更好的方式。 848 | 849 | 237 850 | 00:13:29,000 --> 00:13:31,000 851 | 因此,我们可以看到链的输入 852 | 853 | 238 854 | 00:13:32,000 --> 00:13:33,000 855 | 和每个步骤的链的输出。 856 | 857 | 239 858 | 00:13:34,000 --> 00:13:35,000 859 | 然后我们可以点击越来越深 860 | 861 | 240 862 | 00:13:36,000 --> 00:13:37,000 863 | 进入链并查看更多信息 864 | 865 | 241 866 | 00:13:37,000 --> 00:13:39,000 867 | 关于实际传递的内容。 868 | 869 | 242 870 | 00:13:40,000 --> 00:13:41,000 871 | 因此,如果我们一直走到底部, 872 | 873 | 243 874 | 00:13:42,000 --> 00:13:43,000 875 | 我们现在可以看到正在传递什么 876 | 877 | 244 878 | 00:13:44,000 --> 
00:13:45,000 879 | 确切地到聊天模型。 880 | 881 | 245 882 | 00:13:46,000 --> 00:13:47,000 883 | 我们在这里有系统消息。 884 | 885 | 246 886 | 00:13:48,000 --> 00:13:49,000 887 | 我们在这里有人类问题。 888 | 889 | 247 890 | 00:13:50,000 --> 00:13:51,000 891 | 我们在这里有聊天模型的响应。 892 | 893 | 248 894 | 00:13:52,000 --> 00:13:53,000 895 | 我们有一些输出元数据。 896 | 897 | 249 898 | 00:13:54,000 --> 00:13:55,000 899 | 我们在这里添加的另一件事 900 | 901 | 250 902 | 00:13:56,000 --> 00:13:58,000 903 | 是能够将这些示例添加到数据集中。 904 | 905 | 251 906 | 00:13:59,000 --> 00:14:01,000 907 | 因此,如果您记得,当我们创建时 908 | 909 | 252 910 | 00:14:02,000 --> 00:14:03,000 911 | 那些示例数据集在开始时, 912 | 913 | 253 914 | 00:14:04,000 --> 00:14:05,000 915 | 我们部分手动创建, 916 | 917 | 254 918 | 00:14:05,000 --> 00:14:07,000 919 | 部分使用语言模型。 920 | 921 | 255 922 | 00:14:08,000 --> 00:14:09,000 923 | 在这里,我们可以通过单击此小按钮将其添加到数据集中, 924 | 925 | 256 926 | 00:14:10,000 --> 00:14:11,000 927 | 现在我们有输入查询 928 | 929 | 257 930 | 00:14:12,000 --> 00:14:13,000 931 | 和输出结果。 932 | 933 | 258 934 | 00:14:14,000 --> 00:14:15,000 935 | 因此,我们可以创建一个数据集。 936 | 937 | 261 938 | 00:14:20,000 --> 00:14:22,000 939 | 我们可以称其为深度学习。 940 | 941 | 262 942 | 00:14:25,000 --> 00:14:26,000 943 | 然后我们可以开始添加示例 944 | 945 | 263 946 | 00:14:27,000 --> 00:14:28,000 947 | 到这个数据集中。 948 | 949 | 264 950 | 00:14:29,000 --> 00:14:30,000 951 | 因此,回到最初的事情 952 | 953 | 265 954 | 00:14:31,000 --> 00:14:32,000 955 | 我们在课程开始时解决的问题, 956 | 957 | 266 958 | 00:14:32,000 --> 00:14:34,000 959 | 我们需要创建这些数据集 960 | 961 | 267 962 | 00:14:35,000 --> 00:14:36,000 963 | 以便我们可以进行评估。 964 | 965 | 268 966 | 00:14:37,000 --> 00:14:38,000 967 | 这是一种非常好的方式 968 | 969 | 269 970 | 00:14:39,000 --> 00:14:40,000 971 | 只是在后台运行。 972 | 973 | 270 974 | 00:14:41,000 --> 00:14:42,000 975 | 随着时间的推移,不断添加示例数据集 976 | 977 | 271 978 | 00:14:43,000 --> 00:14:44,000 979 | 开始建立这些示例 980 | 981 | 272 982 | 00:14:45,000 --> 00:14:46,000 983 | 您可以开始用于评估的示例 984 | 985 | 273 986 | 00:14:46,000 --> 00:15:06,000 987 | 并启动评估的飞轮。 
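(补充示例)"随时间把运行中的输入/输出积累成示例数据集"的思路,可以用纯 Python 的本地 JSONL 文件粗略示意;文件名与字段均为假设,并非评估平台的真实接口:

```python
# 纯 Python 示意:把每次运行的输入/输出追加到本地 JSONL 数据集。
import json
import os
import tempfile

def add_to_dataset(path: str, query: str, result: str) -> None:
    # 每行一个 JSON 对象,便于随时间不断追加示例
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"query": query, "result": result}, ensure_ascii=False) + "\n")

path = os.path.join(tempfile.gettempdir(), "deep_learning_examples.jsonl")
open(path, "w", encoding="utf-8").close()  # 演示前清空文件
add_to_dataset(path, "这个舒适的套头衫套装有侧口袋吗?", "有侧口袋。")
add_to_dataset(path, "这件夹克来自哪个系列?", "Down Tech 系列。")

with open(path, encoding="utf-8") as f:
    dataset = [json.loads(line) for line in f]
print(len(dataset))
```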
-------------------------------------------------------------------------------- /chinese/LangChain_L1.srt: 1 | 2 | 3 | 1 4 | 00:00:00,000 --> 00:00:08,720 5 | 在第一课中,我们将涵盖模型、提示和解析器。 6 | 7 | 2 8 | 00:00:08,720 --> 00:00:13,320 9 | 因此,模型是指支撑许多内容的语言模型。 10 | 11 | 3 12 | 00:00:13,320 --> 00:00:18,680 13 | 提示是指创建输入以传递到模型的样式。 14 | 15 | 4 16 | 00:00:18,680 --> 00:00:20,400 17 | 然后解析器位于相反的端点。 18 | 19 | 5 20 | 00:00:20,400 --> 00:00:24,360 21 | 它涉及将这些模型的输出解析为更结构化的格式,以便您可以在下游执行操作。 22 | 23 | 6 24 | 00:00:24,360 --> 00:00:27,360 25 | 因此,当您使用llm构建应用程序时, 26 | 27 | 7 28 | 00:00:27,360 --> 00:00:30,640 29 | 它们通常是可重用的模型。 30 | 31 | 8 32 | 00:00:30,640 --> 00:00:32,600 33 | 我们反复提示模型, 34 | 35 | 9 36 | 00:00:32,600 --> 00:00:34,160 37 | 解析输出,因此LangChain提供了一组易于使用的抽象来执行此类操作。 38 | 39 | 10 40 | 00:00:34,160 --> 00:00:36,520 41 | 因此,让我们开始看一下模型、提示和解析器。 42 | 43 | 11 44 | 00:00:36,520 --> 00:00:39,760 45 | 因此,为了开始, 46 | 47 | 12 48 | 00:00:39,760 --> 00:00:44,640 49 | 这里有一些起始代码。 50 | 51 | 13 52 | 00:00:44,640 --> 00:00:46,160 53 | 我要导入OS, 54 | 55 | 14 56 | 00:00:46,160 --> 00:00:47,640 57 | 导入OpenAI,并加载我的OpenAI密钥。 58 | 59 | 15 60 | 00:00:47,640 --> 00:00:48,840 61 | OpenAI库已经安装在 62 | 63 | 16 64 | 00:00:48,840 --> 00:00:53,240 65 | 我的Jupyter笔记本环境中,因此如果您在本地运行此代码, 66 | 67 | 17 68 | 00:00:53,240 --> 00:00:56,400 69 | 并且您尚未安装OpenAI, 70 | 71 | 18 72 | 00:00:56,400 --> 00:00:59,920 73 | 您可能需要运行它。 74 | 75 | 19 76 | 00:00:59,920 --> 00:01:01,960 77 | !pip install openai,但我不会在这里这样做。 78 | 79 | 20 80 | 00:01:01,960 --> 00:01:04,000 81 | 然后这是一个辅助函数。 82 | 83 | 21 84 | 00:01:04,000 --> 00:01:06,840 85 | 这实际上与您可能在 86 | 87 | 22 88 | 00:01:06,840 --> 00:01:08,720 89 | 《ChatGPT Prompt Engineering for Developers》课程中看到的辅助函数非常相似, 90 | 91 | 23 92 | 00:01:08,720 --> 00:01:10,280 93 | 所以使用这个辅助函数, 94 | 95 | 24 96 | 00:01:10,280 --> 00:01:13,960 97 | 您可以说get completion on, 98 | 99 | 25 100 | 00:01:13,960 --> 00:01:17,240 101 | 什么是1加1,这将调用ChatGPT或技术上的模型, 102 | 103 | 26 
104 | 00:01:17,240 --> 00:01:20,360 105 | GPT 3.5 Turbo,以便您可以得到这样的答案。 106 | 107 | 27 108 | 00:01:20,360 --> 00:01:22,160 109 | 因此,为了引出LangChain中模型、提示和解析器的抽象, 110 | 111 | 28 112 | 00:01:22,160 --> 00:01:25,200 113 | 让我们假设您收到一封来自客户的电子邮件,该电子邮件不是英语。 114 | 115 | 29 116 | 00:01:25,200 --> 00:01:31,120 117 | 为了确保这是可访问的,我将使用英语海盗语。 118 | 119 | 30 120 | 00:01:31,120 --> 00:01:35,400 121 | 当客户说Arrr时, 122 | 123 | 36 124 | 00:01:57,120 --> 00:02:02,280 125 | 我会因为搅拌机盖子飞出去,把我的厨房墙壁弄得满是果汁而感到愤怒。 126 | 127 | 37 128 | 00:02:02,280 --> 00:02:06,120 129 | 更糟糕的是,保修不包括清洁厨房的费用。 130 | 131 | 38 132 | 00:02:06,120 --> 00:02:08,000 133 | 我现在需要你的帮助,伙计。 134 | 135 | 39 136 | 00:02:08,000 --> 00:02:12,400 137 | 所以我们将要做的是请求这个LLM以平静和尊重的口吻将文本翻译成美式英语。 138 | 139 | 40 140 | 00:02:12,400 --> 00:02:18,080 141 | 所以我将把风格设置为平静和尊重的美式英语。 142 | 143 | 41 144 | 00:02:18,080 --> 00:02:22,520 145 | 为了实现这一点,我将使用一个f字符串来指定提示和指令。 146 | 147 | 42 148 | 00:02:22,520 --> 00:02:26,080 149 | 如果你之前看过一些提示,这可能看起来很熟悉。 150 | 151 | 43 152 | 00:02:26,080 --> 00:02:29,080 153 | 我将指定把用三个反引号括起来的文本翻译成指定的风格, 154 | 155 | 44 156 | 00:02:29,080 --> 00:02:33,040 157 | 然后插入这两个变量。 158 | 159 | 45 160 | 00:02:33,040 --> 00:02:38,160 161 | 这样就生成了一个提示,说要翻译文本等等。 162 | 163 | 46 164 | 00:02:38,160 --> 00:02:39,880 165 | 我鼓励你暂停视频并运行代码,尝试修改提示以查看是否可以获得不同的输出。 166 | 167 | 47 168 | 00:02:39,880 --> 00:02:46,200 169 | 然后,你可以提示大型语言模型以获得响应。 170 | 171 | 48 172 | 00:02:46,200 --> 00:02:49,840 173 | 让我们看看响应是什么。 174 | 175 | 53 176 | 00:03:04,320 --> 00:03:07,880 177 | 说将英语海盗的消息翻译成这个非常礼貌的语气。 178 | 179 | 54 180 | 00:03:07,880 --> 00:03:10,680 181 | 我真的很沮丧,因为我的搅拌机盖子飞了出去, 182 | 183 | 55 184 | 00:03:10,680 --> 00:03:13,480 185 | 把我的厨房墙壁弄得满是果汁等等。 186 | 187 | 56 188 | 00:03:13,480 --> 00:03:18,120 189 | 嗯,我现在真的需要你的帮助,我的朋友。听起来非常好。 190 | 191 | 57 192 | 00:03:18,120 --> 00:03:23,160 193 | 因此,如果你有不同语言的不同客户撰写评论, 194 | 195 | 58 196 | 00:03:23,160 --> 00:03:26,880 197 | 不仅仅是英语海盗,还有法语、德语、日语等等, 198 | 199 | 59 200 | 00:03:26,880 --> 00:03:29,800 201 | 你可以想象需要生成一整个提示序列来生成这样的翻译。 202
| 203 | 60 204 | 00:03:29,800 --> 00:03:33,920 205 | 让我们看看如何使用LangChain更方便地完成这项工作。 206 | 207 | 61 208 | 00:03:33,920 --> 00:03:39,360 209 | 我将导入ChatOpenAI。这是LangChain对ChatGPT API端点的抽象。 210 | 211 | 62 212 | 00:03:39,360 --> 00:03:44,360 213 | 因此,如果我设置chat等于ChatOpenAI并查看chat是什么, 214 | 215 | 63 216 | 00:03:44,360 --> 00:03:49,320 217 | 它将创建这个对象,使用ChatGPT模型,也称为GPT 3.5 Turbo。 218 | 219 | 64 220 | 00:03:49,320 --> 00:03:53,840 221 | 当我构建应用程序时, 222 | 223 | 69 224 | 00:04:04,560 --> 00:04:09,560 225 | 我经常做的一件事是将温度参数设置为0。 226 | 227 | 70 228 | 00:04:09,560 --> 00:04:11,800 229 | 所以默认温度为0.7。 230 | 231 | 71 232 | 00:04:11,800 --> 00:04:20,040 233 | 但让我重新设置温度为0.0。 234 | 235 | 72 236 | 00:04:20,040 --> 00:04:25,400 237 | 现在将温度设置为0,以使我们输出的随机性稍微减少一些。 238 | 239 | 73 240 | 00:04:26,800 --> 00:04:30,960 241 | 现在让我按如下方式定义模板字符串。 242 | 243 | 74 244 | 00:04:30,960 --> 00:04:35,000 245 | 将由三个反引号分隔的文本翻译成指定样式。 246 | 247 | 75 248 | 00:04:35,000 --> 00:04:36,800 249 | 然后这里是文本。 250 | 251 | 76 252 | 00:04:36,800 --> 00:04:40,320 253 | 为了反复重用这个模板, 254 | 255 | 77 256 | 00:04:40,320 --> 00:04:46,200 257 | 让我们导入LangChain的ChatPromptTemplate(聊天提示模板)。 258 | 259 | 78 260 | 00:04:46,200 --> 00:04:54,840 261 | 然后让我使用我们刚刚编写的模板字符串创建一个提示模板。 262 | 263 | 79 264 | 00:04:54,840 --> 00:05:01,560 265 | 从提示模板中, 266 | 267 | 80 268 | 00:05:01,560 --> 00:05:06,120 269 | 你实际上可以提取原始提示,并意识到 270 | 271 | 81 272 | 00:05:06,120 --> 00:05:10,520 273 | 这个提示有两个输入变量,样式和文本, 274 | 275 | 82 276 | 00:05:10,520 --> 00:05:16,040 277 | 这些用花括号表示。 278 | 279 | 83 280 | 00:05:16,040 --> 00:05:20,200 281 | 这里也是我们指定的原始模板。 282 | 283 | 84 284 | 00:05:20,200 --> 00:05:22,760 285 | 事实上,如果我打印出来, 286 | 287 | 85 288 | 00:05:22,760 --> 00:05:27,800 289 | 它意识到它有两个输入变量,样式和文本。 290 | 291 | 86 292 | 00:05:27,800 --> 00:05:30,200 293 | 现在让我们指定样式。 294 | 295 | 87 296 | 00:05:30,200 --> 00:05:31,960 297 | 这是我想要的样式, 298 | 299 | 88 300 | 00:05:31,960 --> 00:05:33,800 301 | 将客户消息翻译成该样式。 302 | 303 | 89 304 | 00:05:33,800 --> 00:05:36,320 305 | 所以我要称之为客户样式。 306 | 307 | 90 308
| 00:05:36,320 --> 00:05:44,120 309 | 这是我之前的同一个客户电子邮件。 310 | 311 | 91 312 | 00:05:44,120 --> 00:05:50,960 313 | 现在,如果我创建 314 | 315 | 92 316 | 00:05:50,960 --> 00:05:55,520 317 | 客户消息,这将生成提示, 318 | 319 | 93 320 | 00:05:55,520 --> 00:05:59,560 321 | 并将在一分钟内传递给大型语言模型以获得响应。 322 | 323 | 94 324 | 00:05:59,560 --> 00:06:01,880 325 | 所以如果你想看看类型, 326 | 327 | 95 328 | 00:06:01,880 --> 00:06:04,400 329 | 客户消息实际上是一个列表。 330 | 331 | 96 332 | 00:06:04,400 --> 00:06:10,760 333 | 如果你看一下列表的第一个元素, 334 | 335 | 97 336 | 00:06:10,760 --> 00:06:16,880 337 | 这或多或少就是你期望它创建的提示。 338 | 339 | 98 340 | 00:06:16,880 --> 00:06:20,440 341 | 最后,让我们将此提示传递给LLM。 342 | 343 | 99 344 | 00:06:20,440 --> 00:06:22,560 345 | 所以我要调用聊天, 346 | 347 | 100 348 | 00:06:22,560 --> 00:06:25,040 349 | 我们之前设置的, 350 | 351 | 101 352 | 00:06:25,040 --> 00:06:28,480 353 | 作为OpenAI ChatGPT端点的引用。 354 | 355 | 102 356 | 00:06:28,480 --> 00:06:36,320 357 | 如果我们打印出客户响应的内容, 358 | 359 | 103 360 | 00:06:36,320 --> 00:06:38,800 361 | 那么它会给你返回,um, 362 | 363 | 104 364 | 00:06:38,800 --> 00:06:45,000 365 | 这段文本是从英语海盗语翻译成礼貌的美式英语。 366 | 367 | 105 368 | 00:06:45,000 --> 00:06:47,840 369 | 当然,你可以想象其他使用情况, 370 | 371 | 106 372 | 00:06:47,840 --> 00:06:53,400 373 | 客户的电子邮件是其他语言,这也可以用来 374 | 375 | 107 376 | 00:06:53,400 --> 00:06:58,400 377 | 翻译消息,以便英语为母语的人理解并回复。 378 | 379 | 108 380 | 00:06:58,400 --> 00:07:02,280 381 | 我鼓励你暂停视频并运行代码,还可以 382 | 383 | 109 384 | 00:07:02,280 --> 00:07:06,280 385 | 尝试修改提示,看看是否可以获得不同的输出。 386 | 387 | 110 388 | 00:07:06,280 --> 00:07:09,240 389 | 现在,让我们希望我们的客服代表 390 | 391 | 111 392 | 00:07:09,240 --> 00:07:11,800 393 | 用他们的原始语言回复客户。 394 | 395 | 112 396 | 00:07:11,800 --> 00:07:16,160 397 | 所以,让我们假设英语为母语的客服代表写了这个并说, 398 | 399 | 113 400 | 00:07:16,160 --> 00:07:18,240 401 | 嘿,客户,保修不包括, 402 | 403 | 114 404 | 00:07:18,240 --> 00:07:20,280 405 | 你的厨房清洁费,因为这是你的错, 406 | 407 | 115 408 | 00:07:20,280 --> 00:07:23,520 409 | 你忘记盖上盖子,误用了你的搅拌机。 410 | 411 | 116 412 | 00:07:23,520 --> 00:07:24,920 413 | 很遗憾,再见。 414 | 415 | 117 416 |
00:07:24,920 --> 00:07:26,320 417 | 不是很礼貌的消息, 418 | 419 | 118 420 | 00:07:26,320 --> 00:07:31,560 421 | 但是,让我们假设这是客服代表想要的。 422 | 423 | 119 424 | 00:07:31,720 --> 00:07:36,040 425 | 我们将指定 426 | 427 | 120 428 | 00:07:36,040 --> 00:07:39,480 429 | 服务消息将被翻译成这种海盗风格。 430 | 431 | 121 432 | 00:07:39,480 --> 00:07:45,120 433 | 所以我们希望它以礼貌的语气用英语海盗语说话。 434 | 435 | 122 436 | 00:07:45,120 --> 00:07:48,080 437 | 因为我们之前创建了那个提示模板, 438 | 439 | 123 440 | 00:07:48,080 --> 00:07:52,520 441 | 很酷的是我们现在可以重复使用那个提示模板并指定 442 | 443 | 124 444 | 00:07:52,520 --> 00:07:58,240 445 | 我们想要的输出样式是这个服务风格的海盗和这个服务回复的文本。 446 | 447 | 125 448 | 00:07:58,240 --> 00:08:01,240 449 | 如果我们这样做, 450 | 451 | 126 452 | 00:08:01,800 --> 00:08:05,200 453 | 那就是提示。 454 | 455 | 127 456 | 00:08:05,760 --> 00:08:09,160 457 | 如果我们提示, 458 | 459 | 128 460 | 00:08:09,160 --> 00:08:13,040 461 | 聊天GPT,这是它给我们的回应。 462 | 463 | 129 464 | 00:08:13,040 --> 00:08:18,080 465 | 啊,那里的伙计,我必须友好地告诉你,保修不包括 466 | 467 | 130 468 | 00:08:18,080 --> 00:08:21,200 469 | 你的厨房清洁费等等。 470 | 471 | 131 472 | 00:08:21,200 --> 00:08:23,520 473 | 是的,很遗憾,再见我的心爱的。 474 | 475 | 132 476 | 00:08:23,520 --> 00:08:27,600 477 | 所以你可能会想知道为什么我们使用提示模板而不是, 478 | 479 | 133 480 | 00:08:27,600 --> 00:08:28,920 481 | 你知道,只是一个F字符串? 
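上面提出的"为什么用提示模板而不是f字符串"这个问题,可以用纯Python做一个概念性示意。注意:这只是示意,并非LangChain中ChatPromptTemplate的真实实现;模板文字沿用课程中的翻译示例,format_prompt是这里虚构的辅助函数:

```python
# 概念示意:可复用的提示模板 vs 一次性的f字符串。
# 用str.format模拟模板填充;LangChain的ChatPromptTemplate
# 还会跟踪输入变量并区分消息角色(system/human等)。

TRANSLATE_TEMPLATE = (
    "Translate the text that is delimited by triple backticks "
    "into a style that is {style}. text: ```{text}```"
)

def format_prompt(template: str, **kwargs) -> str:
    """填充模板中的命名变量(此辅助函数为本示例虚构)。"""
    return template.format(**kwargs)

# 同一个模板被复用于两个不同方向的翻译任务,
# 这正是模板相对于散落各处的f字符串的优势。
prompt_1 = format_prompt(
    TRANSLATE_TEMPLATE,
    style="American English in a calm and respectful tone",
    text="Arrr, me blender lid flew off!",
)
prompt_2 = format_prompt(
    TRANSLATE_TEMPLATE,
    style="a polite tone that speaks in English Pirate",
    text="The warranty does not cover cleaning expenses.",
)
```

提示一旦变长、变复杂,把它集中放在一个命名模板里,就比在代码各处重复拼f字符串容易维护得多。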
482 | 483 | 134 484 | 00:08:28,920 --> 00:08:32,480 485 | 答案是,随着你构建复杂的应用程序, 486 | 487 | 135 488 | 00:08:32,480 --> 00:08:35,360 489 | 提示可能会非常长和详细。 490 | 491 | 136 492 | 00:08:35,360 --> 00:08:42,440 493 | 因此,提示模板是一个有用的抽象,可以帮助您在可以重复使用好的提示时重复使用它们。 494 | 495 | 137 496 | 00:08:42,440 --> 00:08:46,760 497 | 嗯,这是一个相对较长的提示示例,用于在线学习应用程序中对学生提交的作业进行评分。 498 | 499 | 138 500 | 00:08:46,760 --> 00:08:52,000 501 | 像这样的提示可能会很长,您可以要求LLM首先解决问题,然后以特定格式输出。 502 | 503 | 139 504 | 00:08:52,000 --> 00:08:57,560 505 | 将其包装在LangChain提示中可以更轻松地重用此类提示。 506 | 507 | 140 508 | 00:08:57,560 --> 00:09:02,600 509 | 此外,您稍后会看到LangChain为一些常见操作提供提示, 510 | 511 | 141 512 | 00:09:02,600 --> 00:09:08,720 513 | 例如摘要或问题回答或连接到SQL数据库, 514 | 515 | 142 516 | 00:09:08,720 --> 00:09:14,640 517 | 或连接到不同的API。 518 | 519 | 143 520 | 00:09:14,640 --> 00:09:20,520 521 | 因此,通过使用一些LangChain内置的提示, 522 | 523 | 144 524 | 00:09:20,520 --> 00:09:22,280 525 | 您可以快速使应用程序运行而无需自己设计提示。 526 | 527 | 145 528 | 00:09:22,280 --> 00:09:25,880 529 | LangChain提示库的另一个方面是它还支持输出解析, 530 | 531 | 146 532 | 00:09:25,880 --> 00:09:29,640 533 | 我们将在一分钟内介绍。 534 | 535 | 147 536 | 00:09:29,640 --> 00:09:31,880 537 | 但是,当您使用LLM构建复杂应用程序时, 538 | 539 | 148 540 | 00:09:31,880 --> 00:09:37,840 541 | 通常会指示LLM以特定格式生成其输出, 542 | 543 | 149 544 | 00:09:37,840 --> 00:09:40,600 545 | 例如使用特定关键字。 546 | 547 | 150 548 | 00:09:40,600 --> 00:09:42,920 549 | 左侧的示例说明了使用LLM执行一种称为思维链推理的东西, 550 | 551 | 151 552 | 00:09:42,920 --> 00:09:46,840 553 | 使用ReAct框架。 554 | 555 | 152 556 | 00:09:46,840 --> 00:09:52,600 557 | 但是不要担心技术细节, 558 | 559 | 153 560 | 00:09:52,600 --> 00:09:55,240 561 | 但关键是LLM正在思考什么, 562 | 563 | 154 564 | 00:09:55,240 --> 00:10:00,680 565 | 因为给LLM思考的空间, 566 | 567 | 155 568 | 00:10:00,680 --> 00:10:06,160 569 | 它通常可以得出更准确的结论。 570 | 571 | 156 572 | 00:10:06,160 --> 00:10:09,280 573 | 然后将动作作为关键字执行特定操作, 574 | 575 | 157 576 | 00:10:09,280 --> 00:10:15,560 577 | 然后观察以显示从该操作中学到的内容,依此类推。 578 | 579 | 158 580 | 00:10:15,560 --> 00:10:18,160 581 | 如果您有一个提示,指示LLM使用这些特定关键字, 582
| 583 | 159 584 | 00:10:18,160 --> 00:10:21,240 585 | 思考,动作和观察, 586 | 587 | 160 588 | 00:10:21,240 --> 00:10:25,520 589 | 那么这个提示可以与解析器配合使用, 590 | 591 | 161 592 | 00:10:25,520 --> 00:10:31,240 593 | 以提取已标记为这些特定关键字的文本。 594 | 595 | 162 596 | 00:10:31,240 --> 00:10:37,480 597 | 因此,这一起为指定LLM的输入提供了非常好的抽象, 598 | 599 | 163 600 | 00:10:37,480 --> 00:10:39,920 601 | 然后还可以使用解析器正确解释LLM给出的输出。 602 | 603 | 168 604 | 00:11:01,040 --> 00:11:08,680 605 | 因此,让我们回到使用LangChain的输出解析器的示例。 606 | 607 | 169 608 | 00:11:08,680 --> 00:11:14,160 609 | 在这个例子中,让我们看一下如何使用LLM输出JSON, 610 | 611 | 170 612 | 00:11:14,160 --> 00:11:17,280 613 | 并使用LangChain解析该输出。 614 | 615 | 171 616 | 00:11:17,280 --> 00:11:23,440 617 | 我将使用一个产品评论的运行示例来提取信息并以JSON格式格式化输出。 618 | 619 | 172 620 | 00:11:23,440 --> 00:11:28,800 621 | 这是一个期望的输出示例。 622 | 623 | 173 624 | 00:11:28,800 --> 00:11:33,920 625 | 这实际上是一个Python字典, 626 | 627 | 174 628 | 00:11:33,920 --> 00:11:36,720 629 | 其中gift,即是否为礼物, 630 | 631 | 175 632 | 00:11:36,720 --> 00:11:38,960 633 | 为False,交付所需的天数为五天, 634 | 635 | 176 636 | 00:11:38,960 --> 00:11:41,840 637 | 价格值相当实惠。 638 | 639 | 177 640 | 00:11:41,840 --> 00:11:44,440 641 | 这是一个期望的输出示例。 642 | 643 | 178 644 | 00:11:44,440 --> 00:11:48,280 645 | 以下是客户评论以及尝试获得JSON输出的模板。 646 | 647 | 179 648 | 00:11:48,280 --> 00:11:50,720 649 | 以下是一个客户评论。 650 | 651 | 180 652 | 00:11:50,720 --> 00:11:57,160 653 | 它说,这个吹叶机非常惊人。 654 | 655 | 181 656 | 00:11:57,160 --> 00:11:58,520 657 | 它有四个设置,蜡烛吹风机, 658 | 659 | 182 660 | 00:11:58,520 --> 00:12:00,600 661 | 温柔的微风,风城和龙卷风。 662 | 663 | 183 664 | 00:12:00,600 --> 00:12:02,480 665 | 它在两天内到达,正好是我妻子的周年礼物。 666 | 667 | 184 668 | 00:12:02,480 --> 00:12:04,800 669 | 我认为我妻子非常喜欢它,喜欢得说不出话来。 670 | 671 | 185 672 | 00:12:04,800 --> 00:12:08,640 673 | 到目前为止,我是唯一使用它的人,等等。 674 | 675 | 186 676 | 00:12:08,640 --> 00:12:15,520 677 | 以下是评论模板。 678 | 679 | 187 680 | 00:12:15,520 --> 00:12:18,080 681 | 对于以下文本,提取以下信息, 682 | 683 | 188 684 | 00:12:18,080 --> 00:12:20,040 685 | 指明它是否是一个礼物(gift)。 686 | 687 | 189 688 |
00:12:20,040 --> 00:12:21,680 689 | 在这种情况下,是的, 690 | 691 | 190 692 | 00:12:21,680 --> 00:12:22,840 693 | 因为这是一个礼物。 694 | 695 | 191 696 | 00:12:22,840 --> 00:12:25,640 697 | 还有交付天数。 698 | 699 | 192 700 | 00:12:25,640 --> 00:12:27,160 701 | 需要多长时间才能交付? 702 | 703 | 193 704 | 00:12:27,160 --> 00:12:29,640 705 | 看起来在这种情况下,它在两天内到达。 706 | 707 | 194 708 | 00:12:29,640 --> 00:12:32,040 709 | 还有,价格值是多少? 710 | 711 | 195 712 | 00:12:32,040 --> 00:12:35,640 713 | 比其他吹叶机稍微贵一些,等等。 714 | 715 | 196 716 | 00:12:35,640 --> 00:12:42,920 717 | 因此,评论模板要求LLM以客户评论作为输入, 718 | 719 | 197 720 | 00:12:42,920 --> 00:12:45,920 721 | 并提取这三个字段, 722 | 723 | 198 724 | 00:12:45,920 --> 00:12:48,360 725 | 然后将输出格式化为JSON, 726 | 727 | 199 728 | 00:12:48,360 --> 00:12:52,000 729 | 使用以下键。 730 | 731 | 200 732 | 00:12:52,000 --> 00:12:56,400 733 | 好的。 734 | 735 | 201 736 | 00:12:56,400 --> 00:13:01,080 737 | 以下是如何在LangChain中包装它。 738 | 739 | 204 740 | 00:13:01,080 --> 00:13:02,920 741 | 让我们导入聊天提示模板。 742 | 743 | 205 744 | 00:13:02,920 --> 00:13:04,760 745 | 实际上我们之前已经导入过了。 746 | 747 | 206 748 | 00:13:04,760 --> 00:13:06,520 749 | 所以从技术上讲,这一行是多余的, 750 | 751 | 207 752 | 00:13:06,520 --> 00:13:08,360 753 | 但我会再次导入它, 754 | 755 | 208 756 | 00:13:08,360 --> 00:13:10,680 757 | 然后从上面的评论模板创建提示模板。 758 | 759 | 209 760 | 00:13:10,680 --> 00:13:16,040 761 | 这是提示模板。 762 | 763 | 210 764 | 00:13:16,040 --> 00:13:19,480 765 | 现在,类似于我们早期使用提示模板的方式, 766 | 767 | 211 768 | 00:13:19,480 --> 00:13:23,680 769 | 让我们创建要传递给OpenAI端点的消息。 770 | 771 | 212 772 | 00:13:23,680 --> 00:13:32,000 773 | 创建OpenAI端点,调用该端点,然后让我们打印出响应。 774 | 775 | 213 776 | 00:13:32,000 --> 00:13:34,760 777 | 我鼓励您暂停视频并运行代码。 778 | 779 | 214 780 | 00:13:34,760 --> 00:13:39,000 781 | 然后就是这样了。 782 | 783 | 215 784 | 00:13:39,000 --> 00:13:42,960 785 | 它说gift为True,交货时间为两天, 786 | 787 | 216 788 | 00:13:44,440 --> 00:13:46,520 789 | 价格值看起来也相当准确。 790 | 791 | 217 792 | 00:13:46,520 --> 00:13:49,000 793 | 但请注意,如果我们检查响应的类型, 794 | 795 | 218 796 | 00:13:49,000 --> 00:13:52,920 797 |
这实际上是一个字符串。 798 | 799 | 219 800 | 00:13:52,920 --> 00:14:02,280 801 | 看起来像JSON,看起来有键值对, 802 | 803 | 220 804 | 00:14:02,280 --> 00:14:04,040 805 | 但它实际上不是一个字典。 806 | 807 | 221 808 | 00:14:04,040 --> 00:14:07,960 809 | 这只是一个很长的字符串。 810 | 811 | 222 812 | 00:14:07,960 --> 00:14:09,480 813 | 所以我真正想做的是去响应内容, 814 | 815 | 223 816 | 00:14:09,480 --> 00:14:11,920 817 | 并从gift键中获取值,这应该是True。 818 | 819 | 224 820 | 00:14:11,920 --> 00:14:14,720 821 | 但如果我运行这个,这应该会生成一个错误,因为,嗯, 822 | 823 | 225 824 | 00:14:14,720 --> 00:14:17,560 825 | 这实际上是一个字符串,这不是一个Python字典。 826 | 827 | 226 828 | 00:14:17,560 --> 00:14:21,080 829 | 所以让我们看看如何使用LangChain的解析器来做到这一点。 830 | 831 | 227 832 | 00:14:21,080 --> 00:14:24,040 833 | 我将从LangChain导入ResponseSchema(响应模式)和StructuredOutputParser(结构化输出解析器)。 834 | 835 | 228 836 | 00:14:24,040 --> 00:14:27,680 837 | 并且,我将通过指定这些响应模式来告诉它我想要解析什么。 838 | 839 | 229 840 | 00:14:27,680 --> 00:14:31,200 841 | 所以gift模式被命名为gift, 842 | 843 | 230 844 | 00:14:31,200 --> 00:14:39,560 845 | 这是描述:购买的物品是否作为礼物送给别人?如果是,回答True;如果不是或未知,回答False,等等。 846 | 847 | 231 848 | 00:14:39,560 --> 00:14:46,400 849 | 所以我有一个gift模式, 850 | 851 | 232 852 | 00:14:46,400 --> 00:14:49,080 853 | 交货日期模式,价格值模式, 854 | 855 | 233 856 | 00:14:49,080 --> 00:14:50,200 857 | 然后让我们将它们全部放入列表中。 858 | 859 | 234 860 | 00:14:50,200 --> 00:14:52,720 861 | 现在我已经指定了这些模式, 862 | 863 | 241 864 | 00:15:08,760 --> 00:15:12,880 865 | LangChain实际上可以通过输出解析器,直接给你要发送给LLM的格式指令。 866 | 867 | 242 868 | 00:15:12,880 --> 00:15:20,080 869 | 这样,如果我要打印格式指令, 870 | 871 | 243 872 | 00:15:20,080 --> 00:15:24,320 873 | 它给出一套非常精确的格式指令,让LLM生成输出解析器可以处理的输出。 874 | 875 | 244 876 | 00:15:24,840 --> 00:15:29,200 877 | 所以这是新的评论模板。 878 | 879 | 245 880 | 00:15:29,200 --> 00:15:33,640 881 | 评论模板包括LangChain生成的格式指令,因此也可以从评论模板中创建提示, 882 | 883 | 246 884 | 00:15:33,880 --> 00:15:37,400 885 | 然后创建将传递到OpenAI端点的消息。 886 | 887 | 247 888 | 00:15:37,400 --> 00:15:41,440 889 | 如果您想的话,可以查看实际提示, 890 | 891 | 248 892 | 00:15:41,440 --> 00:15:50,720 893 | 它会告诉您如何提取字段,gift、交货天数、价格值。 894 | 895 | 249 896
| 00:15:50,720 --> 00:15:57,960 897 | 这是文本,这是格式化指令。 898 | 899 | 250 900 | 00:15:57,960 --> 00:16:02,240 901 | 最后,如果我们调用OpenAI端点, 902 | 903 | 251 904 | 00:16:02,240 --> 00:16:07,440 905 | 让我们看看我们得到了什么响应。 906 | 907 | 252 908 | 00:16:07,440 --> 00:16:10,400 909 | 现在是这样的。 910 | 911 | 253 912 | 00:16:10,400 --> 00:16:15,400 913 | 现在,如果我们使用之前创建的输出解析器, 914 | 915 | 254 916 | 00:16:17,400 --> 00:16:25,240 917 | 您可以将其解析为输出字典。 918 | 919 | 255 920 | 00:16:25,240 --> 00:16:29,120 921 | 我们的打印应该是这样的。 922 | 923 | 256 924 | 00:16:29,120 --> 00:16:32,520 925 | 请注意,这是字典类型,而不是字符串。 926 | 927 | 257 928 | 00:16:33,320 --> 00:16:38,760 929 | 这就是为什么我现在可以提取与gift键相关联的值并获得True, 930 | 931 | 258 932 | 00:16:38,760 --> 00:16:46,040 933 | 或提取与交货天数相关联的值并获得2。 934 | 935 | 259 936 | 00:16:46,040 --> 00:16:49,080 937 | 或者您还可以提取与价格值相关联的值。 938 | 939 | 260 940 | 00:16:49,080 --> 00:16:57,000 941 | 因此,这是一种巧妙的方法,可以将您的LLM输出解析为Python字典,以使输出在下游处理中更容易使用。 942 | 943 | 261 944 | 00:16:57,000 --> 00:17:03,920 945 | 我鼓励您暂停视频并运行代码。 946 | 947 | 262 948 | 00:17:03,920 --> 00:17:08,640 949 | 这就是模型、提示和解析器。 950 | 951 | 263 952 | 00:17:08,640 --> 00:17:10,800 953 | 有了这些工具,希望您能轻松地重用自己的提示模板, 954 | 955 | 264 956 | 00:17:10,800 --> 00:17:14,040 957 | 与您合作的其他人分享提示模板, 958 | 959 | 265 960 | 00:17:14,040 --> 00:17:20,160 961 | 甚至使用LangChain内置的提示模板,正如您刚才看到的, 962 | 987 | 275 988 | 00:17:45,040 --> 00:17:47,920 989 | 通常可以与输出解析器配对使用。 990 | 991 | 276 992 | 00:17:47,920 --> 00:17:53,280 993 |
这样,提示可以要求LLM以特定格式输出,然后解析器解析该输出,以将数据存储在Python字典或其他数据结构中,以便于下游处理。 994 | 995 | 277 996 | 00:17:53,280 --> 00:17:57,800 997 | 我希望这在您的许多应用程序中都很有用。 998 | 999 | 280 1000 | 00:18:06,080 --> 00:18:10,400 1001 | 有了这个,让我们进入下一个视频,看看LangChain如何帮助您构建更好的聊天机器人,或使LLM的聊天更有效, 1002 | 1003 | 282 1004 | 00:18:16,400 --> 00:18:36,400 1005 | 通过更好地管理它到目前为止所记住的对话。 -------------------------------------------------------------------------------- /english/LangChain_L2.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:18,000 3 | When you interact with these models, naturally, they don't remember what you said before or any of the previous conversation, which is an issue when you're building some applications, like a chatbot, and you want to have a conversation with them. 4 | 5 | 2 6 | 00:00:18,000 --> 00:00:31,000 7 | And so in this section, we'll cover memory, which is basically how do you remember previous parts of the conversation and feed that into the language model so that they can have this conversational flow as you're interacting with them. 8 | 9 | 3 10 | 00:00:31,000 --> 00:00:38,000 11 | Yep. So, LangChain offers multiple sophisticated options for managing these memories. Let's jump in and take a look. 12 | 13 | 4 14 | 00:00:38,000 --> 00:00:48,000 15 | So, let me start off by importing my OpenAI API key and then let me import a few tools that I'll need. 16 | 17 | 5 18 | 00:00:48,000 --> 00:00:55,000 19 | Let's use as the motivating example for memory, using LangChain to manage a chat or a chatbot conversation. 20 | 21 | 6 22 | 00:00:55,000 --> 00:01:09,000 23 | So, to do that, I'm going to set the llm as a chat interface of OpenAI with temperature equals zero. And I'm going to use the memory as a conversation buffer memory. 24 | 25 | 7 26 | 00:01:09,000 --> 00:01:15,000 27 | And you'll see later what this means. And I'm going to build a conversation chain.
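The mechanism being set up here — a stateless LLM that only appears to remember things because a chain feeds the stored history back in — can be sketched in plain Python. This is a conceptual sketch, not LangChain's actual ConversationChain or ConversationBufferMemory implementation, and fake_llm is a hypothetical stand-in for a real ChatOpenAI call:

```python
# Conceptual sketch: a chain that wraps a stateless LLM with buffer memory.
# Every predict() call prepends the full transcript to the prompt, which is
# how the model can "remember" earlier turns.

def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for an OpenAI call: answers the name question
    # only if the name actually appears somewhere in the prompt's history.
    if "what is my name" in prompt.lower() and "Andrew" in prompt:
        return "Your name is Andrew."
    return "Hello! How can I help?"

class BufferMemoryChain:
    def __init__(self, llm):
        self.llm = llm
        self.buffer = []  # full transcript, like a conversation buffer memory

    def predict(self, user_input: str) -> str:
        history = "\n".join(self.buffer)
        prompt = (
            "The following is a friendly conversation between a human and an AI.\n"
            f"Current conversation:\n{history}\nHuman: {user_input}\nAI:"
        )
        reply = self.llm(prompt)
        self.buffer.append(f"Human: {user_input}")
        self.buffer.append(f"AI: {reply}")
        return reply

chain = BufferMemoryChain(fake_llm)
chain.predict("Hi, my name is Andrew")
answer = chain.predict("What is my name?")  # history now contains the name
```

The second call only succeeds because the first exchange is still in the buffer — exactly the behavior the lesson demonstrates with verbose=True.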
28 | 29 | 8 30 | 00:01:15,000 --> 00:01:26,000 31 | Again, later in this short course, Harrison will dive much more deeply into what exactly is a chain in LangChain. So, don't worry too much about the details of the syntax for now. 32 | 33 | 9 34 | 00:01:26,000 --> 00:01:36,000 35 | But this builds an llm. And if I start to have a conversation, conversation.predict, give the input. Hi, my name is Andrew. 36 | 37 | 10 38 | 00:01:36,000 --> 00:01:47,000 39 | Let's see what it says. Hello, Andrew, it's nice to meet you. Right? And so on. And then let's say I ask it, what is one plus one? 40 | 41 | 11 42 | 00:01:47,000 --> 00:01:55,000 43 | Um, one plus one is two. And then ask it again, you know, what's my name? Your name is Andrew, as you mentioned earlier. 44 | 45 | 12 46 | 00:01:55,000 --> 00:02:06,000 47 | Hmm, was there a hint of sarcasm there? Not sure. And so if you want, you can change this verbose variable to true to see what LangChain is actually doing. 48 | 49 | 13 50 | 00:02:06,000 --> 00:02:11,000 51 | When you run predict, hi, my name is Andrew, this is the prompt that LangChain is generating. 52 | 53 | 14 54 | 00:02:11,000 --> 00:02:16,000 55 | It says, the following is a friendly conversation between a human and an AI, the AI is talkative, and so on. 56 | 57 | 15 58 | 00:02:16,000 --> 00:02:26,000 59 | So this is a prompt that LangChain has generated to have the system have a helpful and friendly conversation, and it has the conversation saved so far, and here's the response. 60 | 61 | 16 62 | 00:02:26,000 --> 00:02:35,000 63 | And when you execute this on the second and third parts of the conversations, it keeps the prompt as follows. 64 | 65 | 17 66 | 00:02:35,000 --> 00:02:43,000 67 | And notice that by the time I'm uttering, what is my name? This is the third turn, that's my third input. 68 | 69 | 18 70 | 00:02:43,000 --> 00:02:50,000 71 | It has stored the current conversation as follows. Hi, my name is Andrew, what is one plus one, and so on.
72 | 73 | 19 74 | 00:02:50,000 --> 00:02:57,000 75 | And so this memory or this history of the conversation gets longer and longer. 76 | 77 | 20 78 | 00:02:57,000 --> 00:03:02,000 79 | In fact, up on top, I had used the memory variable to store the memory. 80 | 81 | 21 82 | 00:03:02,000 --> 00:03:08,000 83 | So if I were to print memory.buffer, it has stored the conversation so far. 84 | 85 | 22 86 | 00:03:08,000 --> 00:03:14,000 87 | You can also print this out with memory.load_memory_variables({}). 88 | 89 | 23 90 | 00:03:14,000 --> 00:03:18,000 91 | The curly braces here are actually an empty dictionary. 92 | 93 | 24 94 | 00:03:18,000 --> 00:03:25,000 95 | There's some more advanced features that you can use with a more sophisticated input, but we won't talk about them in this short course. 96 | 97 | 25 98 | 00:03:25,000 --> 00:03:28,000 99 | So don't worry about why there are empty curly braces here. 100 | 101 | 26 102 | 00:03:28,000 --> 00:03:33,000 103 | But this is what LangChain has remembered in the memory of the conversation so far. 104 | 105 | 27 106 | 00:03:33,000 --> 00:03:38,000 107 | It's just everything that the AI or that the human has said. 108 | 109 | 28 110 | 00:03:38,000 --> 00:03:41,000 111 | I encourage you to pause the video and run the code. 112 | 113 | 29 114 | 00:03:41,000 --> 00:03:49,000 115 | So the way that LangChain is storing the conversation is with this conversation buffer memory. 116 | 117 | 30 118 | 00:03:49,000 --> 00:03:55,000 119 | If I were to use the conversation buffer memory to specify a couple of inputs and outputs, 120 | 121 | 31 122 | 00:03:55,000 --> 00:03:59,000 123 | this is how you add new things to the memory if you wish to do so explicitly. 124 | 125 | 32 126 | 00:03:59,000 --> 00:04:03,000 127 | memory.save_context says, hi, what's up? 128 | 129 | 33 130 | 00:04:03,000 --> 00:04:09,000 131 | I know this is not the most exciting conversation, but I wanted to have a short example.
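What save_context and load_memory_variables are doing can be mimicked in a few lines of plain Python. The method names mirror LangChain's ConversationBufferMemory, but this is a conceptual sketch, not the real implementation:

```python
# Conceptual sketch of a conversation buffer memory: save_context appends
# one human/AI exchange; load_memory_variables returns the transcript.

class BufferMemory:
    def __init__(self):
        self._turns = []

    def save_context(self, inputs: dict, outputs: dict) -> None:
        # One exchange = one human utterance plus one AI utterance.
        self._turns.append(f"Human: {inputs['input']}")
        self._turns.append(f"AI: {outputs['output']}")

    def load_memory_variables(self, _: dict) -> dict:
        # The empty dict argument mirrors the {} seen in the lesson.
        return {"history": "\n".join(self._turns)}

memory = BufferMemory()
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
history = memory.load_memory_variables({})["history"]
```

Each additional save_context call simply makes the stored history longer, which is why an unbounded buffer eventually becomes expensive to send to the LLM.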
132 | 133 | 34 134 | 00:04:09,000 --> 00:04:15,000 135 | Um, and with that, this is what the status of the memory is. 136 | 137 | 35 138 | 00:04:15,000 --> 00:04:22,000 139 | And once again, let me actually show the, uh, memory variables. 140 | 141 | 36 142 | 00:04:22,000 --> 00:04:29,000 143 | Now, if you want to add additional, um, data to the memory, you can keep on saving additional context. 144 | 145 | 37 146 | 00:04:29,000 --> 00:04:34,000 147 | So conversation goes on, not much, just hanging, cool. 148 | 149 | 38 150 | 00:04:34,000 --> 00:04:38,000 151 | And if you print out the memory, you know, there's now more stuff in it. 152 | 153 | 39 154 | 00:04:38,000 --> 00:04:46,000 155 | So when you use a large language model for a chat conversation, um, the large language model itself is actually stateless. 156 | 157 | 40 158 | 00:04:46,000 --> 00:04:51,000 159 | The language model itself does not remember the conversation you've had so far. 160 | 161 | 41 162 | 00:04:51,000 --> 00:04:55,000 163 | And each transaction, each call to the API endpoint is independent. 164 | 165 | 42 166 | 00:04:55,000 --> 00:05:07,000 167 | And chatbots appear to have memory only because there's usually wrapper code that provides the full conversation that's been had so far as context to the LLM. 168 | 169 | 43 170 | 00:05:07,000 --> 00:05:14,000 171 | And so the memory can store explicitly the turns or the utterances so far. 172 | 173 | 44 174 | 00:05:14,000 --> 00:05:16,000 175 | Hi, my name is Andrew. Hello Andrew, it's nice to meet you, and so on. 176 | 177 | 45 178 | 00:05:16,000 --> 00:05:30,000 179 | And this memory storage is used as input or additional context to the LLM so that it can generate an output as if it's just having the next conversational turn, knowing what's been said before. 180 | 181 | 46 182 | 00:05:30,000 --> 00:05:37,000 183 | And as the conversation becomes long, the amount of memory needed becomes really, really long.
184 | 185 | 47 186 | 00:05:37,000 --> 00:05:46,000 187 | And thus the cost of sending a lot of tokens to the LLM, which usually charges based on the number of tokens it needs to process, will also become more expensive. 188 | 189 | 48 190 | 00:05:46,000 --> 00:05:54,000 191 | So LangChain provides several convenient kinds of memory to store and accumulate the conversation. 192 | 193 | 49 194 | 00:05:54,000 --> 00:06:00,000 195 | So far, we've been looking at the conversation buffer memory. Let's look at a different type of memory. 196 | 197 | 50 198 | 00:06:00,000 --> 00:06:09,000 199 | I'm going to import the conversation buffer window memory that only keeps a window of memory. 200 | 201 | 51 202 | 00:06:09,000 --> 00:06:20,000 203 | If I set memory to conversation buffer window memory with k equals one, the variable k equals one specifies that I want it to remember just one conversational exchange. 204 | 205 | 52 206 | 00:06:20,000 --> 00:06:25,000 207 | That is one utterance from me and one utterance from the chatbot. 208 | 209 | 53 210 | 00:06:25,000 --> 00:06:31,000 211 | So now if I were to have it save context, hi, what's up, not much, just hanging. 212 | 213 | 54 214 | 00:06:31,000 --> 00:06:38,000 215 | If I were to look at memory dot load memory variables, it only remembers the most recent utterance. 216 | 217 | 55 218 | 00:06:38,000 --> 00:06:45,000 219 | Notice it's dropped. Hi, what's up? It's just saying, human says not much, just hanging, and the AI says cool. 220 | 221 | 56 222 | 00:06:45,000 --> 00:06:48,000 223 | So that's because k was equal to one. 224 | 225 | 57 226 | 00:06:48,000 --> 00:06:56,000 227 | So this is a nice feature because it lets you keep track of just the most recent few conversational turns. 228 | 229 | 58 230 | 00:06:56,000 --> 00:07:03,000 231 | In practice, you probably won't use this with k equals one. You use this with k set to a larger number.
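The window behavior described here — keep only the last k exchanges, drop everything older — can be sketched in plain Python. This is a conceptual sketch of the idea behind ConversationBufferWindowMemory, not the real class:

```python
# Conceptual sketch of a buffer *window* memory: only the last k
# human/AI exchanges are retained; older turns fall out of the window.

class WindowMemory:
    def __init__(self, k: int):
        self.k = k
        self._exchanges = []  # list of (human, ai) pairs

    def save_context(self, inputs: dict, outputs: dict) -> None:
        self._exchanges.append((inputs["input"], outputs["output"]))
        self._exchanges = self._exchanges[-self.k:]  # keep last k only

    def load_memory_variables(self, _: dict) -> dict:
        lines = []
        for human, ai in self._exchanges:
            lines += [f"Human: {human}", f"AI: {ai}"]
        return {"history": "\n".join(lines)}

memory = WindowMemory(k=1)
memory.save_context({"input": "Hi, my name is Andrew"},
                    {"output": "Nice to meet you"})
memory.save_context({"input": "What is 1+1?"}, {"output": "1+1 is 2"})
history = memory.load_memory_variables({})["history"]
# With k=1, only the 1+1 exchange survives; the name has been dropped,
# which is why the chatbot later can't answer "what is my name?".
```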
232 | 233 | 59 234 | 00:07:03,000 --> 00:07:10,000 235 | But still, this prevents the memory from growing without limit as the conversation goes longer. 236 | 237 | 60 238 | 00:07:10,000 --> 00:07:23,000 239 | And so if I were to rerun the conversation that we had just now, we'll say, hi, my name is Andrew. 240 | 241 | 61 242 | 00:07:23,000 --> 00:07:32,000 243 | What is one plus one? And now I ask it, what is my name? 244 | 245 | 62 246 | 00:07:32,000 --> 00:07:37,000 247 | Because k equals one, it only remembers the last exchange, namely: what is one plus one? 248 | 249 | 63 250 | 00:07:37,000 --> 00:07:42,000 251 | The answer is one plus one equals two. And it's forgotten this earlier exchange, so it now says, 252 | 253 | 64 254 | 00:07:42,000 --> 00:07:46,000 255 | sorry, I don't have access to that information. 256 | 257 | 65 258 | 00:07:46,000 --> 00:07:53,000 259 | One thing I hope you will do is pause the video, change this to true in the code on the left, 260 | 261 | 66 262 | 00:07:53,000 --> 00:07:57,000 263 | and rerun this conversation with verbose equals true. 264 | 265 | 67 266 | 00:07:57,000 --> 00:08:00,000 267 | And then you will see the prompts actually used to generate this. 268 | 269 | 68 270 | 00:08:00,000 --> 00:08:08,000 271 | And hopefully you see that the memory, when you're calling the LLM on what is my name, 272 | 273 | 69 274 | 00:08:08,000 --> 00:08:11,000 275 | that the memory has dropped the exchange where it learned what my name is, 276 | 277 | 70 278 | 00:08:11,000 --> 00:08:17,000 279 | which is why it now says it doesn't know what my name is. 280 | 281 | 71 282 | 00:08:17,000 --> 00:08:28,000 283 | With the conversation token buffer memory, the memory will limit the number of tokens saved. 284 | 285 | 72 286 | 00:08:28,000 --> 00:08:39,000 287 | And because a lot of LLM pricing is based on tokens, this maps more directly to the cost of the LLM calls.
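The token-limited trimming just described can be sketched in plain Python. This is a conceptual sketch of the idea behind ConversationTokenBufferMemory, not the real class: the real memory counts tokens with the LLM's own tokenizer (which is why it takes an llm argument), whereas here a "token" is crudely approximated as a whitespace-separated word:

```python
# Conceptual sketch of a token buffer memory: drop the oldest utterances
# until the stored transcript fits under max_token_limit.

def count_tokens(text: str) -> int:
    # Crude stand-in for a model tokenizer: one word = one token.
    return len(text.split())

class TokenBufferMemory:
    def __init__(self, max_token_limit: int):
        self.max_token_limit = max_token_limit
        self._turns = []  # each entry is one "Human: ..." or "AI: ..." line

    def save_context(self, inputs: dict, outputs: dict) -> None:
        self._turns.append(f"Human: {inputs['input']}")
        self._turns.append(f"AI: {outputs['output']}")
        # Trim from the front (oldest first) until under the limit.
        while count_tokens("\n".join(self._turns)) > self.max_token_limit:
            self._turns.pop(0)

    def load_memory_variables(self, _: dict) -> dict:
        return {"history": "\n".join(self._turns)}

memory = TokenBufferMemory(max_token_limit=8)
memory.save_context({"input": "AI is what?"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})
history = memory.load_memory_variables({})["history"]
# The earliest "A" exchange has been chopped off; the most recent
# turns survive, subject to not exceeding the token limit.
```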
288 | 289 | 73 290 | 00:08:39,000 --> 00:08:47,000 291 | So if I were to say the max token limit is equal to 50, and actually let me inject a few comments. 292 | 293 | 74 294 | 00:08:47,000 --> 00:08:51,000 295 | So let's say the conversation is, AI is what? Amazing. 296 | 297 | 75 298 | 00:08:51,000 --> 00:08:54,000 299 | Backpropagation is what? Beautiful. Chatbot is what? Charming. 300 | 301 | 76 302 | 00:08:54,000 --> 00:08:58,000 303 | I use ABC as the first letter of all of these conversational turns. 304 | 305 | 77 306 | 00:08:58,000 --> 00:09:02,000 307 | We can keep track of, um, what was said when. 308 | 309 | 78 310 | 00:09:02,000 --> 00:09:08,000 311 | If I run this with a high token limit, um, it has almost the whole conversation. 312 | 313 | 79 314 | 00:09:08,000 --> 00:09:14,000 315 | If I increase the token limit to 100, it now has the whole conversation. 316 | 317 | 80 318 | 00:09:14,000 --> 00:09:24,000 319 | So I have AI is what? If I decrease it, then, you know, it chops off the earlier parts of this conversation 320 | 321 | 81 322 | 00:09:24,000 --> 00:09:28,000 323 | to retain the number of tokens corresponding to the most recent exchanges, 324 | 325 | 82 326 | 00:09:28,000 --> 00:09:32,000 327 | um, but subject to not exceeding the token limit. 328 | 329 | 83 330 | 00:09:32,000 --> 00:09:35,000 331 | And in case you're wondering why we needed to specify an LLM, 332 | 333 | 84 334 | 00:09:35,000 --> 00:09:39,000 335 | it's because different LLMs use different ways of counting tokens. 336 | 337 | 85 338 | 00:09:39,000 --> 00:09:46,000 339 | So this tells it to use the way of counting tokens that the, um, ChatOpenAI LLM uses. 340 | 341 | 86 342 | 00:09:46,000 --> 00:09:49,000 343 | I encourage you to pause the video and run the code, 344 | 345 | 87 346 | 00:09:49,000 --> 00:09:54,000 347 | and also try modifying the prompt to see if you can get a different output.
348 | 349 | 88 350 | 00:09:54,000 --> 00:09:58,000 351 | Finally, there's one last type of memory I want to illustrate here, 352 | 353 | 89 354 | 00:09:58,000 --> 00:10:04,000 355 | which is the conversation summary buffer memory. 356 | 357 | 90 358 | 00:10:04,000 --> 00:10:12,000 359 | And the idea is instead of limiting the memory to a fixed number of tokens based on the most recent utterances 360 | 361 | 91 362 | 00:10:12,000 --> 00:10:15,000 363 | or a fixed number of conversational exchanges, 364 | 365 | 92 366 | 00:10:15,000 --> 00:10:24,000 367 | let's use an LLM to write a summary of the conversation so far and let that be the memory. 368 | 369 | 93 370 | 00:10:24,000 --> 00:10:29,000 371 | So here's an example where I'm going to create a long string with someone's schedule. 372 | 373 | 94 374 | 00:10:29,000 --> 00:10:31,000 375 | You know, there's a meeting at 8am with your product team, 376 | 377 | 95 378 | 00:10:31,000 --> 00:10:33,000 379 | you'll need your PowerPoint presentation and so on and so on. 380 | 381 | 96 382 | 00:10:33,000 --> 00:10:38,000 383 | So this is a long string saying what's your schedule, you know, 384 | 385 | 97 386 | 00:10:38,000 --> 00:10:42,000 387 | maybe ending with a noon lunch at the Italian restaurant with a customer, 388 | 389 | 98 390 | 00:10:42,000 --> 00:10:46,000 391 | bring your laptop to show the latest LLM demo. 392 | 393 | 99 394 | 00:10:46,000 --> 00:10:53,000 395 | And so let me use a conversation summary buffer memory, um, 396 | 397 | 100 398 | 00:10:53,000 --> 00:10:58,000 399 | with a max token limit of 400 in this case, pretty high token limit. 400 | 401 | 101 402 | 00:10:58,000 --> 00:11:05,000 403 | And I'm going to insert a few conversational turns in which we start with, 404 | 405 | 102 406 | 00:11:05,000 --> 00:11:10,000 407 | hello, what's up, not much, just hanging, uh, cool.
408 | 409 | 103 410 | 00:11:10,000 --> 00:11:17,000 411 | And then what is on the schedule today and the response is, you know, that long schedule. 412 | 413 | 104 414 | 00:11:17,000 --> 00:11:22,000 415 | So this memory now has quite a lot of text in it. 416 | 417 | 105 418 | 00:11:22,000 --> 00:11:26,000 419 | In fact, let's take a look at the memory variables. 420 | 421 | 106 422 | 00:11:26,000 --> 00:11:37,000 423 | It contains that entire, um, piece of text because 400 tokens was sufficient to store all this text. 424 | 425 | 107 426 | 00:11:37,000 --> 00:11:43,000 427 | But now if I were to reduce the max token limit, say to 100 tokens, 428 | 429 | 108 430 | 00:11:43,000 --> 00:11:46,000 431 | remember this stores the entire conversational history. 432 | 433 | 109 434 | 00:11:46,000 --> 00:11:50,000 435 | If I reduce the number of tokens to 100, 436 | 437 | 110 438 | 00:11:50,000 --> 00:11:57,000 439 | then the conversation summary buffer memory has actually used an LLM, 440 | 441 | 111 442 | 00:11:57,000 --> 00:12:01,000 443 | the OpenAI endpoint in this case, because that's what we have set the LLM to, 444 | 445 | 112 446 | 00:12:01,000 --> 00:12:05,000 447 | to actually generate a summary of the conversation so far. 448 | 449 | 113 450 | 00:12:05,000 --> 00:12:09,000 451 | So the summary is: the human and AI engaged in small talk before discussing the day's schedule, 452 | 453 | 114 454 | 00:12:09,000 --> 00:12:12,000 455 | and it informs the human of the morning meeting, blah, blah, blah, 456 | 457 | 115 458 | 00:12:12,000 --> 00:12:17,000 459 | um, lunch meeting with customer interested in AI, latest AI developments. 460 | 461 | 116 462 | 00:12:17,000 --> 00:12:28,000 463 | And so if we were to have a conversation using this LLM, 464 | 465 | 117 466 | 00:12:28,000 --> 00:12:33,000 467 | then create a conversation chain, same as before.
468 | 469 | 118 470 | 00:12:33,000 --> 00:12:41,000 471 | And, um, let's say that we were to ask, you know, input, what would be a good demo to show? 472 | 473 | 119 474 | 00:12:41,000 --> 00:12:43,000 475 | Um, I said verbose equals true. 476 | 477 | 120 478 | 00:12:43,000 --> 00:12:53,000 479 | So here's the prompt, the LLM thinks the current conversation has had this discussion so far, 480 | 481 | 121 482 | 00:12:53,000 --> 00:12:56,000 483 | because that's the summary of the conversation. 484 | 485 | 122 486 | 00:12:56,000 --> 00:13:03,000 487 | And just one note, if you're familiar with the OpenAI chat API endpoint, 488 | 489 | 123 490 | 00:13:03,000 --> 00:13:07,000 491 | there is a specific system message. 492 | 493 | 124 494 | 00:13:07,000 --> 00:13:11,000 495 | In this example, this is not using the official OpenAI system message. 496 | 497 | 125 498 | 00:13:11,000 --> 00:13:14,000 499 | It's just including it as part of the prompt here. 500 | 501 | 126 502 | 00:13:14,000 --> 00:13:16,000 503 | But it nonetheless works pretty well. 504 | 505 | 127 506 | 00:13:16,000 --> 00:13:21,000 507 | And given this prompt, you know, the LLM outputs: based on the customer being interested in AI developments, 508 | 509 | 128 510 | 00:13:21,000 --> 00:13:24,000 511 | it suggests showcasing our latest NLP capabilities. 512 | 513 | 129 514 | 00:13:24,000 --> 00:13:26,000 515 | Okay, that's cool. 516 | 517 | 130 518 | 00:13:26,000 --> 00:13:31,000 519 | Um, well, it's, you know, making some suggestions for cool demos, 520 | 521 | 131 522 | 00:13:31,000 --> 00:13:35,000 523 | and makes you think if I was meeting a customer, I would say, 524 | 525 | 132 526 | 00:13:35,000 --> 00:13:43,000 527 | boy, if only there were an open source framework available to help me build cool NLP applications using LLMs. 528 | 529 | 133 530 | 00:13:43,000 --> 00:13:46,000 531 | Good thing there's LangChain.
532 | 533 | 134 534 | 00:13:46,000 --> 00:13:58,000 535 | Um, and the interesting thing is, if you now look at what has happened to the memory. 536 | 537 | 135 538 | 00:13:58,000 --> 00:14:04,000 539 | So notice that, um, here it has incorporated the most recent AI system output, 540 | 541 | 136 542 | 00:14:04,000 --> 00:14:11,000 543 | whereas my utterance asking what would be a good demo to show has been incorporated into the system message. 544 | 545 | 137 546 | 00:14:11,000 --> 00:14:14,000 547 | Um, you know, the overall summary of the conversation so far. 548 | 549 | 138 550 | 00:14:14,000 --> 00:14:17,000 551 | With the conversation summary buffer memory, 552 | 553 | 139 554 | 00:14:17,000 --> 00:14:27,000 555 | what it tries to do is keep the explicit storage of the messages up to the number of tokens we have specified as a limit. 556 | 557 | 140 558 | 00:14:27,000 --> 00:14:30,000 559 | So, you know, this part, the explicit storage, 560 | 561 | 141 562 | 00:14:30,000 --> 00:14:34,000 563 | we're trying to cap at 100 tokens because that's what we're asking for. 564 | 565 | 142 566 | 00:14:34,000 --> 00:14:38,000 567 | And then anything beyond that, it will use the LLM to generate a summary, 568 | 569 | 143 570 | 00:14:38,000 --> 00:14:41,000 571 | which is what is seen up here.
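A rough sketch of the summary-buffer idea above: explicit message storage is capped at a token limit, and anything that overflows is folded into a running summary. The `summarize` function here is a crude placeholder (it just concatenates and truncates); in LangChain the summary is generated by the LLM you pass in, and the word-count tokenizer is likewise a stand-in assumption.

```python
# Conceptual sketch of a summary-buffer memory: explicit storage is capped,
# and overflow messages are folded into a running "summary".
# summarize() is a placeholder -- LangChain would call an LLM here.

def count_tokens(text):
    return len(text.split())

def summarize(previous_summary, dropped_messages):
    # Stand-in for an LLM call that condenses the dropped turns.
    dropped = " ".join(text for _, text in dropped_messages)
    return (previous_summary + " " + dropped).strip()[:80]

class SummaryBufferMemory:
    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.messages = []

    def save_context(self, human, ai):
        self.messages.append(("Human", human))
        self.messages.append(("AI", ai))
        dropped = []
        # Evict the oldest explicit messages once over the token cap...
        while sum(count_tokens(t) for _, t in self.messages) > self.max_token_limit and self.messages:
            dropped.append(self.messages.pop(0))
        # ...and fold them into the summary instead of discarding them.
        if dropped:
            self.summary = summarize(self.summary, dropped)

    def load_memory_variables(self):
        history = "\n".join(f"{r}: {t}" for r, t in self.messages)
        if self.summary:
            history = f"System: {self.summary}\n" + history
        return {"history": history}

memory = SummaryBufferMemory(max_token_limit=6)
memory.save_context("Hello", "What's up")
memory.save_context("Not much, just hanging", "Cool")
print(memory.load_memory_variables()["history"])
```

With a generous cap everything stays verbatim; with a small cap the early small talk migrates into the `System:` summary line while the latest exchange stays explicit, mirroring the behavior described above.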
572 | 573 | 144 574 | 00:14:41,000 --> 00:14:46,000 575 | And even though I've illustrated these different memories using a chat as a running example, 576 | 577 | 145 578 | 00:14:46,000 --> 00:14:49,000 579 | these memories are useful for other applications too, 580 | 581 | 146 582 | 00:14:49,000 --> 00:14:54,000 583 | where you might keep on getting new snippets of text or keep on getting new information, 584 | 585 | 147 586 | 00:14:54,000 --> 00:14:59,000 587 | such as if your system repeatedly goes online to search for facts, 588 | 589 | 148 590 | 00:14:59,000 --> 00:15:04,000 591 | but you want to keep the total memory used to store this growing list of facts as, you know, 592 | 593 | 149 594 | 00:15:04,000 --> 00:15:07,000 595 | capped and not growing arbitrarily long. 596 | 597 | 150 598 | 00:15:07,000 --> 00:15:11,000 599 | I encourage you to pause the video and run the code. 600 | 601 | 151 602 | 00:15:11,000 --> 00:15:15,000 603 | In this video, you saw a few types of memory, um, 604 | 605 | 152 606 | 00:15:15,000 --> 00:15:21,000 607 | including buffer memories that limit based on number of conversation exchanges or tokens, 608 | 609 | 153 610 | 00:15:21,000 --> 00:15:26,000 611 | or a memory that can summarize tokens above a certain limit. 612 | 613 | 154 614 | 00:15:26,000 --> 00:15:30,000 615 | LangChain actually supports additional memory types as well. 616 | 617 | 155 618 | 00:15:30,000 --> 00:15:33,000 619 | One of the most powerful is vector data memory. 620 | 621 | 156 622 | 00:15:33,000 --> 00:15:36,000 623 | If you're familiar with word embeddings and text embeddings, 624 | 625 | 157 626 | 00:15:36,000 --> 00:15:39,000 627 | the vector database actually stores such embeddings. 628 | 629 | 158 630 | 00:15:39,000 --> 00:15:41,000 631 | If you don't know what that means, don't worry about it. 632 | 633 | 159 634 | 00:15:41,000 --> 00:15:43,000 635 | Harrison will explain it later.
636 | 637 | 160 638 | 00:15:43,000 --> 00:15:51,000 639 | And it can then retrieve the most relevant blocks of text using this type of vector database for its memory. 640 | 641 | 161 642 | 00:15:51,000 --> 00:15:54,000 643 | And LangChain also supports entity memories, 644 | 645 | 162 646 | 00:15:54,000 --> 00:15:58,000 647 | which is applicable when you want it to remember details about specific people, 648 | 649 | 163 650 | 00:15:58,000 --> 00:16:04,000 651 | specific other entities, such as if you talk about a specific friend, 652 | 653 | 164 654 | 00:16:04,000 --> 00:16:08,000 655 | you can have LangChain remember facts about that friend, 656 | 657 | 165 658 | 00:16:08,000 --> 00:16:12,000 659 | which would be an entity in an explicit way. 660 | 661 | 166 662 | 00:16:12,000 --> 00:16:14,000 663 | When you're implementing applications using LangChain, 664 | 665 | 167 666 | 00:16:14,000 --> 00:16:17,000 667 | you can also use multiple types of memories, 668 | 669 | 168 670 | 00:16:17,000 --> 00:16:22,000 671 | such as using one of the types of conversation memory that you saw in this video. 672 | 673 | 169 674 | 00:16:22,000 --> 00:16:26,000 675 | Plus additionally, entity memory to recall individuals. 676 | 677 | 170 678 | 00:16:26,000 --> 00:16:30,000 679 | So this way it can remember maybe a summary of the conversation, 680 | 681 | 171 682 | 00:16:30,000 --> 00:16:35,000 683 | plus an explicit way of storing important facts about important people in the conversation. 684 | 685 | 172 686 | 00:16:35,000 --> 00:16:38,000 687 | And of course, in addition to using these memory types, 688 | 689 | 173 690 | 00:16:38,000 --> 00:16:43,000 691 | it's also not uncommon for developers to store the entire conversation in a conventional database, 692 | 693 | 174 694 | 00:16:43,000 --> 00:16:46,000 695 | some sort of key value store or SQL database.
696 | 697 | 175 698 | 00:16:46,000 --> 00:16:51,000 699 | So you could refer back to the whole conversation for auditing or for improving the system further. 700 | 701 | 176 702 | 00:16:51,000 --> 00:16:53,000 703 | And so that's memory types. 704 | 705 | 177 706 | 00:16:53,000 --> 00:16:57,000 707 | I hope you find this useful building your own applications. 708 | 709 | 178 710 | 00:16:57,000 --> 00:17:21,000 711 | And now let's go on to the next video to learn about the key building block of LangChain, namely the chain. 712 | 713 | 714 | -------------------------------------------------------------------------------- /english/LangChain_L4.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:01,000 --> 00:00:15,000 3 | One of the most common complex applications that people are building using an llm is a system that can answer questions on top of or about a document. 4 | 5 | 2 6 | 00:00:15,000 --> 00:00:24,000 7 | So given a piece of text, maybe extracted from a PDF file or from a webpage or from some company's intranet internal document collection, 8 | 9 | 3 10 | 00:00:24,000 --> 00:00:33,000 11 | can you use an llm to answer questions about the content of those documents to help users gain a deeper understanding and get access to the information that they need? 12 | 13 | 4 14 | 00:00:33,000 --> 00:00:39,000 15 | This is really powerful because it starts to combine these language models with data that they weren't originally trained on. 16 | 17 | 5 18 | 00:00:39,000 --> 00:00:42,000 19 | So it makes them much more flexible and adaptable to your use case. 20 | 21 | 6 22 | 00:00:42,000 --> 00:00:48,000 23 | It's also really exciting because we'll start to move beyond language models, prompts, and output parsers, 24 | 25 | 7 26 | 00:00:48,000 --> 00:00:54,000 27 | and start introducing some more of the key components of LangChain, such as embedding models and vector stores.
28 | 29 | 8 30 | 00:00:54,000 --> 00:00:58,000 31 | As Andrew mentioned, this is one of the more popular chains that we've got, so I hope you're excited. 32 | 33 | 9 34 | 00:00:58,000 --> 00:01:03,000 35 | In fact, embeddings and vector stores are some of the most powerful modern techniques. 36 | 37 | 10 38 | 00:01:03,000 --> 00:01:08,000 39 | So if you have not seen them yet, they are very much worth learning about. 40 | 41 | 11 42 | 00:01:08,000 --> 00:01:10,000 43 | So with that, let's dive in. 44 | 45 | 12 46 | 00:01:10,000 --> 00:01:11,000 47 | Let's do it. 48 | 49 | 13 50 | 00:01:11,000 --> 00:01:16,000 51 | So we're going to start by importing the environment variables as we always do. 52 | 53 | 14 54 | 00:01:16,000 --> 00:01:20,000 55 | Now we're going to import some things that will help us when building this chain. 56 | 57 | 15 58 | 00:01:20,000 --> 00:01:22,000 59 | We're going to import the retrieval QA chain. 60 | 61 | 16 62 | 00:01:22,000 --> 00:01:24,000 63 | This will do retrieval over some documents. 64 | 65 | 17 66 | 00:01:24,000 --> 00:01:28,000 67 | We're going to import our favorite ChatOpenAI language model. 68 | 69 | 18 70 | 00:01:28,000 --> 00:01:29,000 71 | We're going to import a document loader. 72 | 73 | 19 74 | 00:01:29,000 --> 00:01:34,000 75 | This is going to be used to load some proprietary data that we're going to combine with the language model. 76 | 77 | 20 78 | 00:01:34,000 --> 00:01:36,000 79 | In this case, it's going to be in a CSV. 80 | 81 | 21 82 | 00:01:36,000 --> 00:01:39,000 83 | So we're going to import the CSV loader. 84 | 85 | 22 86 | 00:01:39,000 --> 00:01:41,000 87 | Finally, we're going to import a vector store. 88 | 89 | 23 90 | 00:01:41,000 --> 00:01:45,000 91 | There are many different types of vector stores, and we'll cover what exactly these are later on. 92 | 93 | 24 94 | 00:01:45,000 --> 00:01:49,000 95 | But we're going to get started with the DocArray in-memory search vector store.
96 | 97 | 25 98 | 00:01:49,000 --> 00:01:51,000 99 | This is really nice because it's an in-memory vector store, 100 | 101 | 26 102 | 00:01:51,000 --> 00:01:55,000 103 | and it doesn't require connecting to an external database of any kind, 104 | 105 | 27 106 | 00:01:55,000 --> 00:01:57,000 107 | so it makes it really easy to get started. 108 | 109 | 28 110 | 00:01:57,000 --> 00:01:59,000 111 | We're also going to import display and markdown, 112 | 113 | 29 114 | 00:01:59,000 --> 00:02:04,000 115 | two common utilities for displaying information in Jupyter Notebooks. 116 | 117 | 30 118 | 00:02:04,000 --> 00:02:10,000 119 | We've provided a CSV of outdoor clothing that we're going to use to combine with the language model. 120 | 121 | 31 122 | 00:02:10,000 --> 00:02:18,000 123 | Here we're going to initialize a loader, the CSV loader, with a path to this file. 124 | 125 | 32 126 | 00:02:18,000 --> 00:02:22,000 127 | We're next going to import an index, the vector store index creator. 128 | 129 | 33 130 | 00:02:22,000 --> 00:02:26,000 131 | This will help us create a vector store really easily. 132 | 133 | 34 134 | 00:02:26,000 --> 00:02:34,000 135 | As we can see below, it will only be a few lines of code to create this. 136 | 137 | 35 138 | 00:02:34,000 --> 00:02:37,000 139 | To create it, we're going to specify two things. 140 | 141 | 36 142 | 00:02:37,000 --> 00:02:40,000 143 | First, we're going to specify the vector store class. 144 | 145 | 37 146 | 00:02:40,000 --> 00:02:42,000 147 | As mentioned before, we're going to use this vector store, 148 | 149 | 38 150 | 00:02:42,000 --> 00:02:46,000 151 | as it's a particularly easy one to get started with. 152 | 153 | 39 154 | 00:02:46,000 --> 00:02:49,000 155 | After it's been created, we're then going to call from loaders, 156 | 157 | 40 158 | 00:02:49,000 --> 00:02:51,000 159 | which takes in a list of document loaders. 
160 | 161 | 41 162 | 00:02:51,000 --> 00:02:58,000 163 | We've only got one loader that we really care about, so that's what we're passing in here. 164 | 165 | 42 166 | 00:02:58,000 --> 00:03:02,000 167 | It's now been created, and we can start to ask questions about it. 168 | 169 | 43 170 | 00:03:02,000 --> 00:03:07,000 171 | Below we'll cover what exactly happened under the hood, so let's not worry about that for now. 172 | 173 | 44 174 | 00:03:07,000 --> 00:03:09,000 175 | Here we'll start with a query. 176 | 177 | 45 178 | 00:03:09,000 --> 00:03:17,000 179 | We'll then create a response using index query and pass in this query. 180 | 181 | 46 182 | 00:03:17,000 --> 00:03:21,000 183 | Again, we'll cover what's going on under the hood down below. 184 | 185 | 47 186 | 00:03:21,000 --> 00:03:30,000 187 | For now, we'll just wait for it to respond. 188 | 189 | 48 190 | 00:03:30,000 --> 00:03:34,000 191 | After it finishes, we can now take a look at what exactly was returned. 192 | 193 | 49 194 | 00:03:34,000 --> 00:03:41,000 195 | We've gotten back a table in Markdown with names and descriptions for all shirts with sun protection. 196 | 197 | 50 198 | 00:03:41,000 --> 00:03:45,000 199 | We've also got a nice little summary that the language model has provided us. 200 | 201 | 51 202 | 00:03:45,000 --> 00:03:48,000 203 | So we've gone over how to do question answering over your documents, 204 | 205 | 52 206 | 00:03:48,000 --> 00:03:52,000 207 | but what exactly is going on underneath the hood? 208 | 209 | 53 210 | 00:03:52,000 --> 00:03:54,000 211 | First, let's think about the general idea. 212 | 213 | 54 214 | 00:03:54,000 --> 00:03:58,000 215 | We want to use language models and combine it with a lot of our documents, 216 | 217 | 55 218 | 00:03:58,000 --> 00:04:03,000 219 | but there's a key issue. Language models can only inspect a few thousand words at a time. 
220 | 221 | 56 222 | 00:04:03,000 --> 00:04:10,000 223 | So if we have really large documents, how can we get the language model to answer questions about everything that's in there? 224 | 225 | 57 226 | 00:04:10,000 --> 00:04:14,000 227 | This is where embeddings and vector stores come into play. 228 | 229 | 58 230 | 00:04:14,000 --> 00:04:17,000 231 | First, let's talk about embeddings. 232 | 233 | 59 234 | 00:04:17,000 --> 00:04:21,000 235 | Embeddings create numerical representations for pieces of text. 236 | 237 | 60 238 | 00:04:21,000 --> 00:04:27,000 239 | This numerical representation captures the semantic meaning of the piece of text that it's been run over. 240 | 241 | 61 242 | 00:04:27,000 --> 00:04:31,000 243 | Pieces of text with similar content will have similar vectors. 244 | 245 | 62 246 | 00:04:31,000 --> 00:04:35,000 247 | This lets us compare pieces of text in the vector space. 248 | 249 | 63 250 | 00:04:35,000 --> 00:04:38,000 251 | In the example below, we can see that we have three sentences. 252 | 253 | 64 254 | 00:04:38,000 --> 00:04:43,000 255 | The first two are about pets, while the third is about a car. 256 | 257 | 65 258 | 00:04:43,000 --> 00:04:46,000 259 | If we look at the representation in the numeric space, 260 | 261 | 66 262 | 00:04:46,000 --> 00:04:54,000 263 | we can see that when we compare the two vectors on the pieces of text corresponding to the sentences about pets, they're very similar. 264 | 265 | 67 266 | 00:04:54,000 --> 00:04:58,000 267 | While if we compare it to the one that talks about a car, they're not similar at all. 268 | 269 | 68 270 | 00:04:58,000 --> 00:05:02,000 271 | This will let us easily figure out which pieces of text are like each other, 272 | 273 | 69 274 | 00:05:02,000 --> 00:05:10,000 275 | which will be very useful as we think about which pieces of text we want to include when passing to the language model to answer a question. 
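The pets-versus-car comparison above can be made concrete with cosine similarity. The tiny four-dimensional vectors below are hand-made stand-ins for real embedding-model output (real embeddings have hundreds or thousands of dimensions); only the sentences come from the example above.

```python
# Conceptual sketch of comparing text embeddings: sentences with similar
# content get similar vectors, so their cosine similarity is high.
# The 4-d vectors are hand-made stand-ins for a real embedding model.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

embeddings = {
    "My dog Rover likes to chase squirrels.": [0.9, 0.8, 0.1, 0.0],
    "Fluffy, my cat, refuses to eat from a can.": [0.8, 0.9, 0.0, 0.1],
    "The Chevy Bolt accelerates to 60 mph in 6.7 seconds.": [0.1, 0.0, 0.9, 0.8],
}

pets1, pets2, car = embeddings.values()
print(cosine_similarity(pets1, pets2))  # high: both sentences are about pets
print(cosine_similarity(pets1, car))    # low: a pet sentence vs. a car sentence
```

This "similar content, similar vectors" property is what lets a query vector pick out the relevant documents in the next step.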
276 | 277 | 70 278 | 00:05:10,000 --> 00:05:13,000 279 | The next component that we're going to cover is the vector database. 280 | 281 | 71 282 | 00:05:13,000 --> 00:05:18,000 283 | A vector database is a way to store these vector representations that we created in the previous step. 284 | 285 | 72 286 | 00:05:18,000 --> 00:05:24,000 287 | The way that we create this vector database is we populate it with chunks of text coming from incoming documents. 288 | 289 | 73 290 | 00:05:24,000 --> 00:05:28,000 291 | When we get a big incoming document, we're first going to break it up into smaller chunks. 292 | 293 | 74 294 | 00:05:28,000 --> 00:05:33,000 295 | This helps create pieces of text that are smaller than the original document, 296 | 297 | 75 298 | 00:05:33,000 --> 00:05:37,000 299 | which is useful because we may not be able to pass the whole document to the language model. 300 | 301 | 76 302 | 00:05:37,000 --> 00:05:43,000 303 | So we want to create these small chunks so we can only pass the most relevant ones to the language model. 304 | 305 | 77 306 | 00:05:43,000 --> 00:05:48,000 307 | We then create an embedding for each of these chunks, and then we store those in a vector database. 308 | 309 | 78 310 | 00:05:48,000 --> 00:05:51,000 311 | That's what happens when we create the index. 312 | 313 | 79 314 | 00:05:51,000 --> 00:05:58,000 315 | Now that we've got this index, we can use it during runtime to find the pieces of text most relevant to an incoming query. 316 | 317 | 80 318 | 00:05:58,000 --> 00:06:02,000 319 | When a query comes in, we first create an embedding for that query. 320 | 321 | 81 322 | 00:06:02,000 --> 00:06:07,000 323 | We then compare it to all the vectors in the vector database, and we pick the n most similar. 324 | 325 | 82 326 | 00:06:07,000 --> 00:06:14,000 327 | These are then returned, and we can pass those in the prompt to the language model to get back a final answer. 
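The whole flow just described — embed the chunks, store them, then embed the incoming query and return the n most similar chunks — fits in a small sketch. The bag-of-words "embedding" and the product snippets here are toy stand-ins for a real embedding model and catalog.

```python
# Conceptual sketch of a vector store: chunks are embedded and stored; a
# query is embedded the same way and the n most similar chunks come back.
# The "embedding" is a toy bag-of-words vector, not a real model.
import math
from collections import Counter

VOCAB = ["shirt", "sun", "protection", "fleece", "warm", "pants", "trail"]

def embed(text):
    counts = Counter(text.lower().split())
    return [counts[w] for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryVectorStore:
    def __init__(self):
        self.entries = []  # (vector, chunk) pairs

    def add_texts(self, chunks):
        for chunk in chunks:
            self.entries.append((embed(chunk), chunk))

    def similarity_search(self, query, n=2):
        qv = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(qv, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:n]]

store = InMemoryVectorStore()
store.add_texts([
    "sun shield shirt with upf sun protection",
    "cozy fleece pullover warm and soft",
    "trail pants durable for the trail",
])
print(store.similarity_search("shirt with sun protection", n=1))
```

The returned chunks are exactly what would then be passed, inside the prompt, to the language model to produce the final answer.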
328 | 329 | 83 330 | 00:06:14,000 --> 00:06:17,000 331 | So above, we created this chain in only a few lines of code. 332 | 333 | 84 334 | 00:06:17,000 --> 00:06:19,000 335 | That's great for getting started quickly. 336 | 337 | 85 338 | 00:06:19,000 --> 00:06:25,000 339 | Well, let's now do it a bit more step by step and understand what exactly is going on under the hood. 340 | 341 | 86 342 | 00:06:25,000 --> 00:06:27,000 343 | The first step is similar to above. 344 | 345 | 87 346 | 00:06:27,000 --> 00:06:36,000 347 | We're going to create a document loader, loading from that CSV with all the descriptions of the products that we want to do question answering over. 348 | 349 | 88 350 | 00:06:36,000 --> 00:06:41,000 351 | We can then load documents from this document loader. 352 | 353 | 89 354 | 00:06:41,000 --> 00:06:50,000 355 | If we look at the individual documents, we can see that each document corresponds to one of the products in the CSV. 356 | 357 | 90 358 | 00:06:50,000 --> 00:06:53,000 359 | Previously, we talked about creating chunks. 360 | 361 | 91 362 | 00:06:53,000 --> 00:07:01,000 363 | Because these documents are already so small, we actually don't need to do any chunking here, and so we can create embeddings directly. 364 | 365 | 92 366 | 00:07:01,000 --> 00:07:05,000 367 | To create embeddings, we're going to use OpenAI's embedding class. 368 | 369 | 93 370 | 00:07:05,000 --> 00:07:08,000 371 | We can import it and initialize it here. 372 | 373 | 94 374 | 00:07:08,000 --> 00:07:21,000 375 | If we want to see what these embeddings do, we can actually take a look at what happens when we embed a particular piece of text. 376 | 377 | 95 378 | 00:07:21,000 --> 00:07:26,000 379 | Let's use the embed query method on the embeddings object to create embeddings for a particular piece of text. 380 | 381 | 96 382 | 00:07:26,000 --> 00:07:31,000 383 | In this case, the sentence, hi, my name is Harrison.
384 | 385 | 97 386 | 00:07:31,000 --> 00:07:41,000 387 | If we take a look at this embedding, we can see that there are over a thousand different elements. 388 | 389 | 98 390 | 00:07:41,000 --> 00:07:44,000 391 | Each of these elements is a different numerical value. 392 | 393 | 99 394 | 00:07:44,000 --> 00:07:51,000 395 | Combined, this creates the overall numerical representation for this piece of text. 396 | 397 | 100 398 | 00:07:51,000 --> 00:07:58,000 399 | We want to create embeddings for all the pieces of text that we just loaded, and then we also want to store them in a vector store. 400 | 401 | 101 402 | 00:07:58,000 --> 00:08:03,000 403 | We can do that by using the from documents method on the vector store. 404 | 405 | 102 406 | 00:08:03,000 --> 00:08:12,000 407 | This method takes in a list of documents, an embedding object, and then we'll create an overall vector store. 408 | 409 | 103 410 | 00:08:12,000 --> 00:08:18,000 411 | We can now use this vector store to find pieces of text similar to an incoming query. 412 | 413 | 104 414 | 00:08:18,000 --> 00:08:23,000 415 | So let's look at the query, please suggest a shirt with sunblocking. 416 | 417 | 105 418 | 00:08:23,000 --> 00:08:36,000 419 | If we use the similarity search method on the vector store and pass in a query, we will get back a list of documents. 420 | 421 | 106 422 | 00:08:36,000 --> 00:08:48,000 423 | We can see that it returns four documents, and if we look at the first one, we can see that it is indeed a shirt about sunblocking. 424 | 425 | 107 426 | 00:08:48,000 --> 00:08:52,000 427 | So how do we use this to do question answering over our own documents? 428 | 429 | 108 430 | 00:08:52,000 --> 00:08:57,000 431 | First, we need to create a retriever from this vector store. 432 | 433 | 109 434 | 00:08:57,000 --> 00:09:03,000 435 | A retriever is a generic interface that can be underpinned by any method that takes in a query and returns documents. 
436 | 437 | 110 438 | 00:09:03,000 --> 00:09:11,000 439 | Vector stores and embeddings are one such method to do so, although there are plenty of different methods, some less advanced, some more advanced. 440 | 441 | 111 442 | 00:09:11,000 --> 00:09:20,000 443 | Next, because we want to do text generation and return a natural language response, we're going to import a language model and we're going to use ChatOpenAI. 444 | 445 | 112 446 | 00:09:20,000 --> 00:09:28,000 447 | If we were doing this by hand, what we would do is we would combine the documents into a single piece of text. 448 | 449 | 113 450 | 00:09:28,000 --> 00:09:37,000 451 | So we'd do something like this, where we join all the page content in the documents into a variable. 452 | 453 | 114 454 | 00:09:37,000 --> 00:09:48,000 455 | And then we'd pass this variable or a variant on the question, like please list all your shirts with sun protection in a table in markdown and summarize each one, into the language model. 456 | 457 | 115 458 | 00:09:48,000 --> 00:09:55,000 459 | And if we print out the response here, we can see that we get back a table exactly as we asked for. 460 | 461 | 116 462 | 00:09:55,000 --> 00:09:59,000 463 | All of those steps can be encapsulated with the LangChain chain. 464 | 465 | 117 466 | 00:09:59,000 --> 00:10:02,000 467 | So here we can create a retrieval QA chain. 468 | 469 | 118 470 | 00:10:02,000 --> 00:10:06,000 471 | This does retrieval and then does question answering over the retrieved documents. 472 | 473 | 119 474 | 00:10:06,000 --> 00:10:09,000 475 | To create such a chain, we'll pass in a few different things. 476 | 477 | 120 478 | 00:10:09,000 --> 00:10:12,000 479 | First, we'll pass in the language model. 480 | 481 | 121 482 | 00:10:12,000 --> 00:10:15,000 483 | This will be used for doing the text generation at the end. 484 | 485 | 122 486 | 00:10:15,000 --> 00:10:17,000 487 | Next, we'll pass in the chain type.
488 | 489 | 123 490 | 00:10:17,000 --> 00:10:18,000 491 | We're going to use stuff. 492 | 493 | 124 494 | 00:10:18,000 --> 00:10:25,000 495 | This is the simplest method as it just stuffs all the documents into context and makes one call to a language model. 496 | 497 | 125 498 | 00:10:25,000 --> 00:10:32,000 499 | There are a few other methods that you can use to do question answering that I'll maybe touch on at the end, but we're not going to look at in detail. 500 | 501 | 126 502 | 00:10:32,000 --> 00:10:34,000 503 | Third, we're going to pass in a retriever. 504 | 505 | 127 506 | 00:10:34,000 --> 00:10:38,000 507 | The retriever we created above is just an interface for fetching documents. 508 | 509 | 128 510 | 00:10:38,000 --> 00:10:41,000 511 | This will be used to fetch documents and pass it to the language model. 512 | 513 | 129 514 | 00:10:41,000 --> 00:10:46,000 515 | And then finally, we're going to set verbose equals to true. 516 | 517 | 130 518 | 00:10:46,000 --> 00:11:08,000 519 | Now we can create a query and we can run the chain on this query. 520 | 521 | 131 522 | 00:11:08,000 --> 00:11:14,000 523 | When we get the response, we can again display it using the display and markdown utilities. 524 | 525 | 132 526 | 00:11:14,000 --> 00:11:20,000 527 | You can pause the video here and try it out with a bunch of different queries. 528 | 529 | 133 530 | 00:11:20,000 --> 00:11:26,000 531 | So that's how you do it in detail, but remember that we can still do it pretty easily with just the one line that we had up above. 532 | 533 | 134 534 | 00:11:26,000 --> 00:11:30,000 535 | So these two things equate to the same result. 536 | 537 | 135 538 | 00:11:30,000 --> 00:11:32,000 539 | And that's part of the interesting stuff about LangChain. 540 | 541 | 136 542 | 00:11:32,000 --> 00:11:38,000 543 | You can do it in one line or you can look at the individual things and break it down into five more detailed ones.
544 | 545 | 137 546 | 00:11:38,000 --> 00:11:44,000 547 | The five more detailed ones let you set more specifics about what exactly is going on, but the one-liner is easy to get started. 548 | 549 | 138 550 | 00:11:44,000 --> 00:11:48,000 551 | So up to you as to how you'd prefer to go forward. 552 | 553 | 139 554 | 00:11:48,000 --> 00:11:51,000 555 | We can also customize the index when we're creating it. 556 | 557 | 140 558 | 00:11:51,000 --> 00:11:55,000 559 | And so if you remember, when we created it by hand, we specified an embedding. 560 | 561 | 141 562 | 00:11:55,000 --> 00:11:57,000 563 | And we can specify an embedding here as well. 564 | 565 | 142 566 | 00:11:57,000 --> 00:12:01,000 567 | And so this will give us flexibility over how the embeddings themselves are created. 568 | 569 | 143 570 | 00:12:01,000 --> 00:12:06,000 571 | And we can also swap out the vector store here for a different type of vector store. 572 | 573 | 144 574 | 00:12:06,000 --> 00:12:15,000 575 | So there's the same level of customization that you did when you created it by hand that's also available when you create the index here. 576 | 577 | 145 578 | 00:12:15,000 --> 00:12:17,000 579 | We use the stuff method in this notebook. 580 | 581 | 146 582 | 00:12:17,000 --> 00:12:19,000 583 | The stuff method is really nice because it's pretty simple. 584 | 585 | 147 586 | 00:12:19,000 --> 00:12:25,000 587 | You just put all of it into one prompt and send that to the language model and get back one response. 588 | 589 | 148 590 | 00:12:25,000 --> 00:12:27,000 591 | So it's quite simple to understand what's going on. 592 | 593 | 149 594 | 00:12:27,000 --> 00:12:30,000 595 | It's quite cheap and it works pretty well. 596 | 597 | 150 598 | 00:12:30,000 --> 00:12:32,000 599 | But that doesn't always work okay. 600 | 601 | 151 602 | 00:12:32,000 --> 00:12:37,000 603 | So if you remember, when we fetched the documents in the notebook, we only got four documents back. 
604 | 605 | 152 606 | 00:12:37,000 --> 00:12:39,000 607 | And they were relatively small. 608 | 609 | 153 610 | 00:12:39,000 --> 00:12:44,000 611 | But what if you wanted to do the same type of question answering over lots of different types of chunks? 612 | 613 | 154 614 | 00:12:44,000 --> 00:12:47,000 615 | Then there are a few different methods that we can use. 616 | 617 | 155 618 | 00:12:47,000 --> 00:12:48,000 619 | The first is map reduce. 620 | 621 | 156 622 | 00:12:48,000 --> 00:12:55,000 623 | This basically takes all the chunks, passes them along with the question to a language model, gets back a response, 624 | 625 | 157 626 | 00:12:55,000 --> 00:13:02,000 627 | and then uses another language model call to summarize all of the individual responses into a final answer. 628 | 629 | 158 630 | 00:13:02,000 --> 00:13:06,000 631 | This is really powerful because it can operate over any number of documents. 632 | 633 | 159 634 | 00:13:06,000 --> 00:13:11,000 635 | And it's also really powerful because you can do the individual questions in parallel. 636 | 637 | 160 638 | 00:13:11,000 --> 00:13:13,000 639 | But it does take a lot more calls. 640 | 641 | 161 642 | 00:13:13,000 --> 00:13:19,000 643 | And it does treat all the documents as independent, which may not always be the most desired thing. 644 | 645 | 162 646 | 00:13:19,000 --> 00:13:24,000 647 | Refine, which is another method, is again used to loop over many documents. 648 | 649 | 163 650 | 00:13:24,000 --> 00:13:25,000 651 | But it actually does it iteratively. 652 | 653 | 164 654 | 00:13:25,000 --> 00:13:28,000 655 | It builds upon the answer from the previous document. 656 | 657 | 165 658 | 00:13:28,000 --> 00:13:33,000 659 | So this is really good for combining information and building up an answer over time. 660 | 661 | 166 662 | 00:13:33,000 --> 00:13:36,000 663 | It will generally lead to longer answers. 
664 | 665 | 167 666 | 00:13:36,000 --> 00:13:39,000 667 | And it's also not as fast because now the calls aren't independent. 668 | 669 | 168 670 | 00:13:39,000 --> 00:13:43,000 671 | They depend on the result of previous calls. 672 | 673 | 169 674 | 00:13:43,000 --> 00:13:49,000 675 | This means that it often takes a good while longer and takes just as many calls as map reduce, basically. 676 | 677 | 170 678 | 00:13:49,000 --> 00:13:57,000 679 | Map re-rank is a pretty interesting and a bit more experimental one where you do a single call to the language model for each document. 680 | 681 | 171 682 | 00:13:57,000 --> 00:14:00,000 683 | And you also ask it to return a score. 684 | 685 | 172 686 | 00:14:00,000 --> 00:14:02,000 687 | And then you select the highest score. 688 | 689 | 173 690 | 00:14:02,000 --> 00:14:06,000 691 | This relies on the language model to know what the score should be. 692 | 693 | 174 694 | 00:14:06,000 --> 00:14:12,000 695 | So you often have to tell it, hey, it should be a high score if it's relevant to the document and really refine the instructions there. 696 | 697 | 175 698 | 00:14:12,000 --> 00:14:15,000 699 | Similar to map reduce, all the calls are independent. 700 | 701 | 176 702 | 00:14:15,000 --> 00:14:16,000 703 | So you can batch them. 704 | 705 | 177 706 | 00:14:16,000 --> 00:14:18,000 707 | And it's relatively fast. 708 | 709 | 178 710 | 00:14:18,000 --> 00:14:20,000 711 | But again, you're making a bunch of language model calls. 712 | 713 | 179 714 | 00:14:20,000 --> 00:14:22,000 715 | So it will be a bit more expensive. 716 | 717 | 180 718 | 00:14:22,000 --> 00:14:29,000 719 | The most common of these methods is the stuff method, which we used in the notebook to combine it all into one document. 720 | 721 | 181 722 | 00:14:29,000 --> 00:14:35,000 723 | The second most common is the map reduce method, which takes these chunks and sends them to the language model. 
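The trade-offs among the four methods come down to their call patterns. The sketch below is a toy illustration, not the LangChain implementation: a counting fake LLM stands in for a real model so the number of calls and their dependencies are visible.

```python
# Toy illustration of the four combine-documents strategies: stuff,
# map_reduce, refine, and map_rerank. "fake_llm" just counts calls, so
# the call patterns (one call, parallel calls, dependent calls) show up
# directly. None of this is LangChain's actual implementation.
calls = []

def fake_llm(prompt):
    calls.append(prompt)
    return f"answer#{len(calls)}"

def stuff(docs, question):
    # everything into one prompt -> a single call
    return fake_llm(question + "\n" + "\n".join(docs))

def map_reduce(docs, question):
    # one independent call per chunk, plus one final call to combine
    partials = [fake_llm(f"{question}\n{d}") for d in docs]
    return fake_llm("combine: " + "; ".join(partials))

def refine(docs, question):
    # iterative: each call builds on the answer from the previous document
    answer = ""
    for d in docs:
        answer = fake_llm(f"{question}\nanswer so far: {answer}\ndoc: {d}")
    return answer

def map_rerank(docs, question):
    # one scored call per chunk; keep the answer with the highest score
    # (score here is just document length, a stand-in for the LLM's score)
    scored = [(len(d), fake_llm(f"{question}\n{d}")) for d in docs]
    return max(scored)[1]

docs = ["chunk one", "chunk two longer", "chunk three"]
stuff(docs, "q")        # 1 call
map_reduce(docs, "q")   # 3 parallel-capable calls + 1 combine call
refine(docs, "q")       # 3 sequential, dependent calls
map_rerank(docs, "q")   # 3 parallel-capable calls
print(len(calls))       # 11
```

The counts match the transcript's comparison: stuff is the cheapest, map reduce and map rerank can be batched because their per-chunk calls are independent, and refine cannot because each call depends on the previous answer.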
724 | 725 | 182 726 | 00:14:35,000 --> 00:14:42,000 727 | These methods here, stuff, map reduce, refine, and re-rank can also be used for lots of other chains besides just question answering. 728 | 729 | 183 730 | 00:14:42,000 --> 00:14:53,000 731 | For example, a really common use case of the map reduce chain is for summarization, where you have a really long document and you want to recursively summarize pieces of information in it. 732 | 733 | 184 734 | 00:14:53,000 --> 00:14:56,000 735 | That's it for question answering over documents. 736 | 737 | 185 738 | 00:14:56,000 --> 00:15:00,000 739 | As you may have noticed, there's a lot going on in the different chains that we have here. 740 | 741 | 186 742 | 00:15:00,000 --> 00:15:12,000 743 | And so in the next section, we'll cover ways to better understand what exactly is going on inside all of these chains. 744 | 745 | 746 | -------------------------------------------------------------------------------- /english/LangChain_L3.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:00,000 --> 00:00:09,560 3 | In this lesson, Harrison will teach the most important key building block of 4 | 5 | 2 6 | 00:00:09,560 --> 00:00:11,960 7 | LangChain, namely the chain. 8 | 9 | 3 10 | 00:00:12,680 --> 00:00:17,440 11 | The chain usually combines an LLM, a large language model, together with a prompt. 12 | 13 | 4 14 | 00:00:17,720 --> 00:00:21,720 15 | And with this building block, you can also put a bunch of these building blocks 16 | 17 | 5 18 | 00:00:21,720 --> 00:00:26,040 19 | together to carry out a sequence of operations on your texts or on your other 20 | 21 | 6 22 | 00:00:26,040 --> 00:00:26,540 23 | data. 24 | 25 | 7 26 | 00:00:27,040 --> 00:00:28,400 27 | I'm excited to dive into it. 28 | 29 | 8 30 | 00:00:28,400 --> 00:00:33,080 31 | All right, to start, we're going to load the environment variables as we have 32 | 33 | 9 34 | 00:00:33,080 --> 00:00:33,580 35 | before.
36 | 37 | 10 38 | 00:00:34,440 --> 00:00:37,240 39 | And then we're also going to load some data that we're going to use. 40 | 41 | 11 42 | 00:00:38,040 --> 00:00:43,480 43 | So part of the power of these chains is that you can run them over many inputs 44 | 45 | 12 46 | 00:00:43,800 --> 00:00:44,320 47 | at a time. 48 | 49 | 13 50 | 00:00:44,760 --> 00:00:46,600 51 | Here, we're going to load a pandas data frame. 52 | 53 | 14 54 | 00:00:47,240 --> 00:00:50,840 55 | A pandas data frame is just a data structure that contains a bunch of 56 | 57 | 15 58 | 00:00:50,840 --> 00:00:52,440 59 | different elements of data. 60 | 61 | 16 62 | 00:00:52,520 --> 00:00:54,480 63 | If you're not familiar with pandas, don't worry about it. 64 | 65 | 17 66 | 00:00:54,480 --> 00:00:58,560 67 | The main point here is that we're loading some data that we can then use later on. 68 | 69 | 18 70 | 00:00:58,600 --> 00:01:02,000 71 | And so if we look inside this pandas data frame, we can see that there is a 72 | 73 | 19 74 | 00:01:02,000 --> 00:01:04,200 75 | product column and then a review column. 76 | 77 | 20 78 | 00:01:04,280 --> 00:01:07,640 79 | And each of these rows is a different data point that we can start passing 80 | 81 | 21 82 | 00:01:07,640 --> 00:01:08,480 83 | through our chains. 84 | 85 | 22 86 | 00:01:09,680 --> 00:01:12,120 87 | So the first chain we're going to cover is the llm chain. 88 | 89 | 23 90 | 00:01:12,200 --> 00:01:16,400 91 | And this is a simple but really powerful chain that underpins a lot of the chains 92 | 93 | 24 94 | 00:01:16,400 --> 00:01:18,240 95 | that we'll go over in the future. 96 | 97 | 25 98 | 00:01:18,640 --> 00:01:21,200 99 | And so we're going to import three different things. 100 | 101 | 26 102 | 00:01:21,480 --> 00:01:24,040 103 | We're going to import the OpenAI model. 104 | 105 | 27 106 | 00:01:24,040 --> 00:01:27,360 107 | So the llm, we're going to import the chat prompt template. 
108 | 109 | 28 110 | 00:01:27,400 --> 00:01:29,920 111 | And so this is the prompt and then we're going to import the llm chain. 112 | 113 | 29 114 | 00:01:31,040 --> 00:01:34,520 115 | And so first, what we're going to do is we're going to initialize the language 116 | 117 | 30 118 | 00:01:34,520 --> 00:01:35,960 119 | model that we want to use. 120 | 121 | 31 122 | 00:01:36,040 --> 00:01:39,840 123 | So we're going to initialize the chat OpenAI with a temperature, with a high 124 | 125 | 32 126 | 00:01:39,840 --> 00:01:43,840 127 | temperature so that we can get some fun descriptions. 128 | 129 | 33 130 | 00:01:44,560 --> 00:01:47,640 131 | Now we're going to initialize a prompt and this prompt is going to take in a 132 | 133 | 34 134 | 00:01:47,640 --> 00:01:48,920 135 | variable called product. 136 | 137 | 35 138 | 00:01:49,240 --> 00:01:53,000 139 | It's going to ask the llm to generate what the best name is to describe a 140 | 141 | 36 142 | 00:01:53,000 --> 00:01:54,680 143 | company that makes that product. 144 | 145 | 37 146 | 00:01:55,520 --> 00:01:59,120 147 | And then finally, we're going to combine these two things into a chain. 148 | 149 | 38 150 | 00:01:59,760 --> 00:02:01,880 151 | And so this is what we call an llm chain. 152 | 153 | 39 154 | 00:02:02,000 --> 00:02:02,840 155 | And it's quite simple. 156 | 157 | 40 158 | 00:02:02,840 --> 00:02:06,120 159 | It's just the combination of the llm and the prompt. 160 | 161 | 41 162 | 00:02:06,720 --> 00:02:10,640 163 | But now this chain will let us run through the prompt and into the llm in a 164 | 165 | 42 166 | 00:02:10,640 --> 00:02:11,400 167 | sequential manner. 168 | 169 | 43 170 | 00:02:11,680 --> 00:02:16,120 171 | And so if we have a product called queen size sheet set, we can run this through 172 | 173 | 44 174 | 00:02:16,120 --> 00:02:17,840 175 | the chain by using chain.run. 
176 | 177 | 45 178 | 00:02:18,240 --> 00:02:21,440 179 | And what this will do is it will format the prompt under the hood and then it 180 | 181 | 46 182 | 00:02:21,440 --> 00:02:24,080 183 | will pass the whole prompt into the llm. 184 | 185 | 47 186 | 00:02:24,400 --> 00:02:28,440 187 | And so we can see that we get back the name of this hypothetical company called 188 | 189 | 48 190 | 00:02:28,440 --> 00:02:29,320 191 | Royal Beddings. 192 | 193 | 49 194 | 00:02:30,360 --> 00:02:33,440 195 | And so here would be a good time to pause and you can input any product 196 | 197 | 50 198 | 00:02:33,440 --> 00:02:36,720 199 | descriptions that you would want and you can see what the chain will output as a 200 | 201 | 51 202 | 00:02:36,720 --> 00:02:37,160 203 | result. 204 | 205 | 52 206 | 00:02:38,120 --> 00:02:42,880 207 | So the llm chain is the most basic type of chain and that's going to be used a 208 | 209 | 53 210 | 00:02:42,880 --> 00:02:43,680 211 | lot in the future. 212 | 213 | 54 214 | 00:02:43,880 --> 00:02:47,320 215 | And so we can see how this will be used in the next type of chain, which will be 216 | 217 | 55 218 | 00:02:47,320 --> 00:02:48,440 219 | sequential chains. 220 | 221 | 56 222 | 00:02:48,440 --> 00:02:52,520 223 | And so sequential chains run a sequence of chains one after another. 224 | 225 | 57 226 | 00:02:52,960 --> 00:02:56,800 227 | So to start, you're going to import the simple sequential chain. 228 | 229 | 58 230 | 00:02:57,240 --> 00:03:00,880 231 | And this works well when we have sub chains that expect only one input and 232 | 233 | 59 234 | 00:03:00,880 --> 00:03:02,000 235 | return only one output. 236 | 237 | 60 238 | 00:03:02,760 --> 00:03:07,600 239 | And so here we're going to first create one chain, which uses an llm and a 240 | 241 | 61 242 | 00:03:07,600 --> 00:03:08,160 243 | prompt. 
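The mechanics of the llm chain described above, formatting the prompt under the hood and then passing it to the model, can be sketched with stubs in place of the real `LLMChain` and `ChatOpenAI`:

```python
# Stub sketch of what the llm chain does when you call chain.run: format
# the prompt template with the inputs, then pass the formatted prompt to
# the model. An echoing function stands in for ChatOpenAI, so this shows
# the mechanics only, not the real LLMChain class.
class ToyLLMChain:
    def __init__(self, llm, prompt_template):
        self.llm = llm
        self.prompt_template = prompt_template

    def run(self, **inputs):
        prompt = self.prompt_template.format(**inputs)  # format under the hood
        return self.llm(prompt)                         # pass it to the model

echo_llm = lambda prompt: f"LLM saw: {prompt}"

chain = ToyLLMChain(
    llm=echo_llm,
    prompt_template="What is the best name to describe a company that makes {product}?",
)
print(chain.run(product="queen size sheet set"))
```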
244 | 245 | 62 246 | 00:03:08,560 --> 00:03:13,640 247 | And this prompt is going to take in the product and will return the best name to 248 | 249 | 63 250 | 00:03:13,640 --> 00:03:14,640 251 | describe that company. 252 | 253 | 64 254 | 00:03:14,840 --> 00:03:16,080 255 | So that will be the first chain. 256 | 257 | 65 258 | 00:03:16,080 --> 00:03:18,280 259 | Then we're going to create a second chain. 260 | 261 | 66 262 | 00:03:18,680 --> 00:03:23,280 263 | In the second chain, we'll take in the company name and then output a 20 word 264 | 265 | 67 266 | 00:03:23,320 --> 00:03:25,240 267 | description of that company. 268 | 269 | 68 270 | 00:03:26,320 --> 00:03:29,920 271 | And so you can imagine how these chains might want to be run one after another, 272 | 273 | 69 274 | 00:03:30,160 --> 00:03:33,600 275 | where the output of the first chain, the company name is then passed into the 276 | 277 | 70 278 | 00:03:33,600 --> 00:03:34,200 279 | second chain. 280 | 281 | 71 282 | 00:03:35,800 --> 00:03:39,960 283 | We can easily do this by creating a simple sequential chain where we have the 284 | 285 | 72 286 | 00:03:39,960 --> 00:03:41,640 287 | two chains described there. 288 | 289 | 73 290 | 00:03:42,240 --> 00:03:44,240 291 | And we'll call this overall simple chain. 292 | 293 | 74 294 | 00:03:44,240 --> 00:03:49,720 295 | Now, what you can do is run this chain over any product description. 296 | 297 | 75 298 | 00:03:50,600 --> 00:03:54,920 299 | And so if we use it with the product above, the queen size sheet set, we can 300 | 301 | 76 302 | 00:03:54,920 --> 00:03:58,840 303 | run it over and we can see that it first outputs Royal Beddings, and then it 304 | 305 | 77 306 | 00:03:58,840 --> 00:04:00,200 307 | passes it into the second chain. 308 | 309 | 78 310 | 00:04:00,200 --> 00:04:03,400 311 | And it comes up with this description of what that company could be about.
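The run just described, where the company name produced by the first chain flows straight into the second, can be sketched with plain functions standing in for the two LLM chains (not the real `SimpleSequentialChain` class):

```python
# Stub sketch of the simple sequential chain: each sub-chain has one input
# and one output, and each output becomes the next chain's input. Plain
# functions stand in for the two LLM chains from the lesson.
class ToySimpleSequentialChain:
    def __init__(self, chains):
        self.chains = chains

    def run(self, value):
        for chain in self.chains:  # one after another
            value = chain(value)   # output of one feeds the next
        return value

# chain one: product -> company name (canned, echoing the lesson's output)
name_chain = lambda product: "Royal Beddings"
# chain two: company name -> short description
describe_chain = lambda name: f"{name}: luxury sheets."

overall_simple_chain = ToySimpleSequentialChain([name_chain, describe_chain])
print(overall_simple_chain.run("queen size sheet set"))  # Royal Beddings: luxury sheets.
```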
312 | 313 | 79 314 | 00:04:05,680 --> 00:04:09,160 315 | The simple sequential chain works well when there's only a single input and a 316 | 317 | 80 318 | 00:04:09,160 --> 00:04:09,840 319 | single output. 320 | 321 | 81 322 | 00:04:10,320 --> 00:04:12,120 323 | But what about when there are 324 | 325 | 82 326 | 00:04:12,120 --> 00:04:16,200 327 | multiple inputs or multiple 328 | 329 | 83 330 | 00:04:16,200 --> 00:04:16,680 331 | outputs? 332 | 333 | 84 334 | 00:04:16,920 --> 00:04:20,080 335 | And so we can do this by using just the regular sequential chain. 336 | 337 | 85 338 | 00:04:21,240 --> 00:04:22,160 339 | So let's import that. 340 | 341 | 86 342 | 00:04:22,160 --> 00:04:25,280 343 | And then you're going to create a bunch of chains that we're going to use one 344 | 345 | 87 346 | 00:04:25,280 --> 00:04:25,920 347 | after another. 348 | 349 | 88 350 | 00:04:26,200 --> 00:04:29,200 351 | We're going to be using the data from above, which has a review. 352 | 353 | 89 354 | 00:04:29,640 --> 00:04:34,320 355 | And so the first chain, we're going to take the review and translate it into 356 | 357 | 90 358 | 00:04:34,320 --> 00:04:34,840 359 | English. 360 | 361 | 91 362 | 00:04:37,240 --> 00:04:41,200 363 | With the second chain, we're going to create a summary of that review in one 364 | 365 | 92 366 | 00:04:41,200 --> 00:04:46,840 367 | sentence. And this will use the previously generated English review. 368 | 369 | 93 370 | 00:04:49,520 --> 00:04:53,560 371 | The third chain is going to detect what the language of the review was in the 372 | 373 | 94 374 | 00:04:53,560 --> 00:04:54,200 375 | first place. 376 | 377 | 95 378 | 00:04:54,560 --> 00:04:58,720 379 | And so if you notice, this is using the review variable that is coming from the 380 | 381 | 96 382 | 00:04:58,720 --> 00:05:00,320 383 | original review.
384 | 385 | 97 386 | 00:05:02,400 --> 00:05:05,480 387 | And finally, the fourth chain will take in multiple inputs. 388 | 389 | 98 390 | 00:05:05,840 --> 00:05:09,560 391 | So this will take in the summary variable, which we calculated with the second 392 | 393 | 99 394 | 00:05:09,560 --> 00:05:13,320 395 | chain and the language variable, which we calculated with the third chain. 396 | 397 | 100 398 | 00:05:13,760 --> 00:05:17,720 399 | And it's going to ask for a follow up response to the summary in the specified 400 | 401 | 101 402 | 00:05:17,720 --> 00:05:18,240 403 | language. 404 | 405 | 102 406 | 00:05:19,800 --> 00:05:23,640 407 | One important thing to note about all these sub chains is that the input keys 408 | 409 | 103 410 | 00:05:23,720 --> 00:05:25,960 411 | and output keys need to be pretty precise. 412 | 413 | 104 414 | 00:05:26,680 --> 00:05:28,520 415 | So here we're taking in review. 416 | 417 | 105 418 | 00:05:28,600 --> 00:05:31,120 419 | This is a variable that will be passed in at the start. 420 | 421 | 106 422 | 00:05:31,760 --> 00:05:35,320 423 | We can see that we explicitly set the output key to English review. 424 | 425 | 107 426 | 00:05:35,320 --> 00:05:39,840 427 | This is then used in the next prompt down below where we take in English review 428 | 429 | 108 430 | 00:05:39,840 --> 00:05:44,240 431 | with that same variable name and we set the output key of that chain to summary, 432 | 433 | 109 434 | 00:05:44,680 --> 00:05:46,920 435 | which we can see is used in the final chain. 436 | 437 | 110 438 | 00:05:47,800 --> 00:05:52,760 439 | The third prompt takes in review the original variable and output language, 440 | 441 | 111 442 | 00:05:53,160 --> 00:05:55,040 443 | which is again used in the final prompt. 
444 | 445 | 112 446 | 00:05:56,040 --> 00:05:59,760 447 | It's really important to get these variable names lined up exactly right, 448 | 449 | 113 450 | 00:05:59,920 --> 00:06:02,400 451 | because there's so many different inputs and outputs going on. 452 | 453 | 114 454 | 00:06:02,400 --> 00:06:06,160 455 | And if you get any key errors, you should definitely check that they are lined up. 456 | 457 | 115 458 | 00:06:06,160 --> 00:06:12,040 459 | So the simple sequential chain takes in multiple chains where each one has a 460 | 461 | 116 462 | 00:06:12,040 --> 00:06:13,680 463 | single input and a single output. 464 | 465 | 117 466 | 00:06:14,560 --> 00:06:19,080 467 | To see a visual representation of this, we can look at the slide where it has one 468 | 469 | 118 470 | 00:06:19,080 --> 00:06:22,760 471 | chain feeding into the other chain one after another. 472 | 473 | 119 474 | 00:06:24,080 --> 00:06:28,000 475 | Here we can see a visual description of the sequential chain, comparing it to the 476 | 477 | 120 478 | 00:06:28,000 --> 00:06:32,920 479 | above chain, you can notice that any step in the chain can take in multiple input 480 | 481 | 121 482 | 00:06:32,920 --> 00:06:33,720 483 | variables. 484 | 485 | 122 486 | 00:06:34,280 --> 00:06:38,400 487 | This is useful when you have more complicated downstream chains that need 488 | 489 | 123 490 | 00:06:38,400 --> 00:06:41,400 491 | to be a composition of multiple previous chains. 492 | 493 | 124 494 | 00:06:42,840 --> 00:06:46,400 495 | Now that we have all these chains, we can easily combine them in the sequential 496 | 497 | 125 498 | 00:06:46,400 --> 00:06:46,920 499 | chain. 500 | 501 | 126 502 | 00:06:47,360 --> 00:06:51,880 503 | So you'll notice here that we'll pass in the four chains we created into the 504 | 505 | 127 506 | 00:06:51,880 --> 00:06:52,760 507 | chains variable. 
508 | 509 | 128 510 | 00:06:52,760 --> 00:06:57,280 511 | We'll create the inputs variable with the one human input, which is the 512 | 513 | 129 514 | 00:06:57,280 --> 00:06:58,000 515 | review. 516 | 517 | 130 518 | 00:06:58,400 --> 00:07:02,200 519 | And then we want to return all the intermediate outputs. 520 | 521 | 131 522 | 00:07:02,200 --> 00:07:05,080 523 | So the English review, the summary, and then the follow up message. 524 | 525 | 132 526 | 00:07:07,320 --> 00:07:10,080 527 | Now we can run this over some of the data. 528 | 529 | 133 530 | 00:07:10,120 --> 00:07:14,800 531 | So let's choose a review and pass it in through the overall chain. 532 | 533 | 134 534 | 00:07:20,000 --> 00:07:24,920 535 | We can see here that the original review looks like it was in French. 536 | 537 | 135 538 | 00:07:24,920 --> 00:07:27,680 539 | We can see the English review as a translation. 540 | 541 | 136 542 | 00:07:27,680 --> 00:07:31,880 543 | We can see a summary of that review, and then we can see a follow up message in 544 | 545 | 137 546 | 00:07:31,880 --> 00:07:34,240 547 | the original language of French. 548 | 549 | 138 550 | 00:07:34,800 --> 00:07:38,320 551 | You should pause the video here and try putting in different inputs. 552 | 553 | 139 554 | 00:07:39,040 --> 00:07:42,560 555 | So far we've covered the LLM chain and then a sequential chain. 556 | 557 | 140 558 | 00:07:43,080 --> 00:07:45,600 559 | But what if you want to do something more complicated? 560 | 561 | 141 562 | 00:07:46,200 --> 00:07:50,440 563 | A pretty common, but basic operation is to route an input to a chain, depending 564 | 565 | 142 566 | 00:07:50,440 --> 00:07:52,400 567 | on what exactly that input is. 
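The sequential chain just run, with its named input and output keys, can be sketched as sub-chains reading and writing a shared dictionary of variables. The stub functions below stand in for the four LLM chains; the key names follow the lesson, and a misnamed key raises exactly the kind of key error the transcript warns about:

```python
# Stub sketch of the full sequential chain: sub-chains read and write
# named keys in a shared dict, which is why the input and output key
# names must line up exactly. The lambdas stand in for the four LLM
# chains; key names follow the lesson.
def run_sequential_chain(chains, inputs, output_keys):
    state = dict(inputs)
    for input_keys, output_key, fn in chains:
        args = [state[k] for k in input_keys]  # a misnamed key raises KeyError
        state[output_key] = fn(*args)
    return {k: state[k] for k in output_keys}

chains = [
    (["Review"], "English_Review", lambda r: f"(english) {r}"),
    (["English_Review"], "summary", lambda er: f"(summary) {er[:20]}"),
    (["Review"], "language", lambda r: "French"),
    (["summary", "language"], "followup_message",
     lambda s, lang: f"(reply in {lang}) {s}"),
]

result = run_sequential_chain(
    chains,
    inputs={"Review": "Je trouve le gout mediocre."},
    output_keys=["English_Review", "summary", "followup_message"],
)
print(result["followup_message"])
```

Note how the fourth stub takes two inputs, `summary` and `language`, mirroring the final prompt in the lesson that composes outputs from two earlier chains.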
568 | 569 | 143 570 | 00:07:52,400 --> 00:07:57,200 571 | A good way to imagine this is if you have multiple sub chains, each of which 572 | 573 | 144 574 | 00:07:57,200 --> 00:08:01,720 575 | is specialized for a particular type of input, you could have a router chain, 576 | 577 | 145 578 | 00:08:01,760 --> 00:08:06,000 579 | which first decides which sub chain to pass it to, and then passes it to that 580 | 581 | 146 582 | 00:08:06,000 --> 00:08:06,480 583 | chain. 584 | 585 | 147 586 | 00:08:07,360 --> 00:08:11,520 587 | For a concrete example, let's look at where we are routing between different 588 | 589 | 148 590 | 00:08:11,520 --> 00:08:15,720 591 | types of chains, depending on the subject of the incoming question. 592 | 593 | 149 594 | 00:08:16,440 --> 00:08:18,640 595 | So we have here different prompts. 596 | 597 | 150 598 | 00:08:18,800 --> 00:08:21,280 599 | One prompt is good for answering physics questions. 600 | 601 | 151 602 | 00:08:21,280 --> 00:08:23,680 603 | The second prompt is good for answering math questions. 604 | 605 | 152 606 | 00:08:23,680 --> 00:08:26,960 607 | The third for history, and then a fourth for computer science. 608 | 609 | 153 610 | 00:08:27,280 --> 00:08:29,440 611 | Let's define all these prompt templates. 612 | 613 | 154 614 | 00:08:33,320 --> 00:08:36,800 615 | After we have these prompt templates, we can then provide more information 616 | 617 | 155 618 | 00:08:36,800 --> 00:08:37,320 619 | about them. 620 | 621 | 156 622 | 00:08:37,760 --> 00:08:40,600 623 | We can give each one a name and then a description. 624 | 625 | 157 626 | 00:08:41,160 --> 00:08:44,280 627 | The description for the physics one is "good for answering questions about 628 | 629 | 158 630 | 00:08:44,280 --> 00:08:44,800 631 | physics." 632 | 633 | 159 634 | 00:08:45,480 --> 00:08:48,560 635 | This information is going to be passed to the router chain.
636 | 637 | 160 638 | 00:08:48,560 --> 00:08:52,000 639 | So the router chain can decide when to use this sub chain. 640 | 641 | 161 642 | 00:08:58,080 --> 00:09:00,880 643 | Let's now import the other types of chains that we need. 644 | 645 | 162 646 | 00:09:01,360 --> 00:09:03,080 647 | Here we need a multi-prompt chain. 648 | 649 | 163 650 | 00:09:03,560 --> 00:09:07,400 651 | This is a specific type of chain that is used when routing between multiple 652 | 653 | 164 654 | 00:09:07,400 --> 00:09:08,600 655 | different prompt templates. 656 | 657 | 165 658 | 00:09:09,200 --> 00:09:12,680 659 | As you can see, all the options we have are prompt templates themselves. 660 | 661 | 166 662 | 00:09:13,280 --> 00:09:15,760 663 | But this is just one type of thing that you can route between. 664 | 665 | 167 666 | 00:09:15,760 --> 00:09:18,520 667 | You can route between any type of chain. 668 | 669 | 168 670 | 00:09:19,000 --> 00:09:22,400 671 | The other classes that we'll implement here are an LLM router chain. 672 | 673 | 169 674 | 00:09:22,880 --> 00:09:26,840 675 | This uses a language model itself to route between the different sub chains. 676 | 677 | 170 678 | 00:09:26,880 --> 00:09:30,360 679 | This is where the description and the name provided above will be used. 680 | 681 | 171 682 | 00:09:31,160 --> 00:09:33,360 683 | We'll also import a router output parser. 684 | 685 | 172 686 | 00:09:33,920 --> 00:09:38,080 687 | This parses the LLM output into a dictionary that can be used downstream 688 | 689 | 173 690 | 00:09:38,440 --> 00:09:42,440 691 | to determine which chain to use and what the input to that chain should be. 692 | 693 | 174 694 | 00:09:42,440 --> 00:09:44,120 695 | Now we can get around to using it. 696 | 697 | 175 698 | 00:09:44,160 --> 00:09:48,680 699 | First, let's import and define the language model that we will use. 700 | 701 | 176 702 | 00:09:52,000 --> 00:09:54,240 703 | We now create the destination chains. 
704 | 705 | 177 706 | 00:09:54,400 --> 00:09:57,080 707 | These are the chains that will be called by the router chain. 708 | 709 | 178 710 | 00:09:57,560 --> 00:10:01,400 711 | As you can see, each destination chain itself is a language model chain, 712 | 713 | 179 714 | 00:10:01,400 --> 00:10:02,360 715 | an LLM chain. 716 | 717 | 180 718 | 00:10:04,240 --> 00:10:08,640 719 | In addition to the destination chains, 720 | 721 | 181 722 | 00:10:08,640 --> 00:10:12,800 723 | we also need a default chain. 724 | 725 | 182 726 | 00:10:13,320 --> 00:10:15,880 727 | This is a chain that's called when the router can't decide 728 | 729 | 183 730 | 00:10:16,080 --> 00:10:17,800 731 | which of the sub chains to use. 732 | 733 | 184 734 | 00:10:18,080 --> 00:10:22,000 735 | In the example above, this might be called when the input question 736 | 737 | 185 738 | 00:10:22,000 --> 00:10:25,800 739 | has nothing to do with physics, math, history or computer science. 740 | 741 | 186 742 | 00:10:28,120 --> 00:10:31,760 743 | Now we define the template that is used by the LLM 744 | 745 | 187 746 | 00:10:31,760 --> 00:10:33,800 747 | to route between the different chains. 748 | 749 | 188 750 | 00:10:34,720 --> 00:10:37,000 751 | This has instructions of the task to be done, 752 | 753 | 189 754 | 00:10:37,000 --> 00:10:40,440 755 | as well as the specific formatting that the output should be in. 756 | 757 | 190 758 | 00:10:41,680 --> 00:10:44,840 759 | Let's put a few of these pieces together to build the router chain. 760 | 761 | 191 762 | 00:10:45,600 --> 00:10:48,520 763 | First, we create the full router template by formatting it 764 | 765 | 192 766 | 00:10:48,520 --> 00:10:50,480 767 | with the destinations that we defined above. 768 | 769 | 193 770 | 00:10:50,920 --> 00:10:54,280 771 | This template is flexible to a bunch of different types of destinations.
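The formatting step just described, where each destination's name and description are substituted into the router template, can be sketched roughly like this. The names and descriptions follow the lesson; the template wording itself is a paraphrase, not LangChain's actual routing prompt.

```python
# Sketch of the formatting step: each destination's name and description
# are joined into one string and substituted into the router template.
# The names and descriptions follow the lesson; the template wording is a
# paraphrase, not LangChain's actual routing prompt.
prompt_infos = [
    {"name": "physics", "description": "Good for answering questions about physics"},
    {"name": "math", "description": "Good for answering math questions"},
    {"name": "history", "description": "Good for answering history questions"},
    {"name": "computer science", "description": "Good for answering computer science questions"},
]

destinations_str = "\n".join(
    f"{p['name']}: {p['description']}" for p in prompt_infos
)

# {{input}} survives .format() as a literal {input}, left for the router
# chain to fill in later with the user's question.
router_template = (
    "Given a raw text input, select the prompt best suited for it.\n"
    "<< CANDIDATE PROMPTS >>\n"
    "{destinations}\n"
    "<< INPUT >>\n"
    "{{input}}"
).format(destinations=destinations_str)

print(router_template)
```

Adding a new subject such as English or Latin, as the transcript suggests, only means appending one more entry to `prompt_infos`.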
772 | 773 | 194 774 | 00:10:54,720 --> 00:10:58,520 775 | One thing you can do here is pause and add different types of destinations. 776 | 777 | 195 778 | 00:10:59,000 --> 00:11:02,160 779 | So up here, rather than just physics, math, history and computer science, 780 | 781 | 196 782 | 00:11:02,160 --> 00:11:04,960 783 | you could add a different subject like English or Latin. 784 | 785 | 197 786 | 00:11:04,960 --> 00:11:07,760 787 | Next, we create the prompt template from this template, 788 | 789 | 198 790 | 00:11:08,080 --> 00:11:11,280 791 | and then we create the router chain by passing in the LLM 792 | 793 | 199 794 | 00:11:11,280 --> 00:11:13,120 795 | and the overall router prompt. 796 | 797 | 200 798 | 00:11:13,960 --> 00:11:16,360 799 | Note that here we have the router output parser. 800 | 801 | 201 802 | 00:11:16,720 --> 00:11:19,320 803 | This is important as it will help this chain decide 804 | 805 | 202 806 | 00:11:19,720 --> 00:11:22,160 807 | which sub chains to route between. 808 | 809 | 203 810 | 00:11:24,760 --> 00:11:28,920 811 | And finally, putting it all together, we can create the overall chain. 812 | 813 | 204 814 | 00:11:29,240 --> 00:11:32,400 815 | This has a router chain, which is defined here. 816 | 817 | 205 818 | 00:11:32,400 --> 00:11:35,200 819 | It has destination chains, which we pass in here. 820 | 821 | 206 822 | 00:11:35,400 --> 00:11:37,200 823 | And then we also pass in the default chain. 824 | 825 | 207 826 | 00:11:38,880 --> 00:11:40,200 827 | We can now use this chain. 828 | 829 | 208 830 | 00:11:40,520 --> 00:11:41,960 831 | So let's ask it some questions. 832 | 833 | 209 834 | 00:11:42,560 --> 00:11:45,320 835 | If we ask it a question about physics, 836 | 837 | 210 838 | 00:11:45,520 --> 00:11:48,920 839 | we should hopefully see that it is routed to the physics chain 840 | 841 | 211 842 | 00:11:49,640 --> 00:11:52,560 843 | with the input, what is blackbody radiation? 
844 | 845 | 212 846 | 00:11:52,800 --> 00:11:55,480 847 | And then that is passed into the chain down below. 848 | 849 | 213 850 | 00:11:55,480 --> 00:11:59,080 851 | And we can see that the response is very detailed 852 | 853 | 214 854 | 00:11:59,080 --> 00:12:01,080 855 | with lots of physics details. 856 | 857 | 215 858 | 00:12:01,080 --> 00:12:04,600 859 | You should pause the video here and try putting in different inputs. 860 | 861 | 216 862 | 00:12:04,920 --> 00:12:08,520 863 | You can try with all the other types of special chains 864 | 865 | 217 866 | 00:12:08,520 --> 00:12:09,920 867 | that we have defined above. 868 | 869 | 218 870 | 00:12:10,440 --> 00:12:13,240 871 | So, for example, if we ask it a math question, 872 | 873 | 219 874 | 00:12:21,600 --> 00:12:23,680 875 | we should see that it's routed to the math chain 876 | 877 | 220 878 | 00:12:24,040 --> 00:12:25,120 879 | and then passed into that. 880 | 881 | 221 882 | 00:12:25,120 --> 00:12:27,720 883 | We can also see what happens when we pass in a question 884 | 885 | 222 886 | 00:12:27,920 --> 00:12:30,320 887 | that is not related to any of the subchains. 888 | 889 | 223 890 | 00:12:30,720 --> 00:12:33,480 891 | So here we ask it a question about biology 892 | 893 | 224 894 | 00:12:33,760 --> 00:12:35,880 895 | and we can see the chain that it chooses is none. 896 | 897 | 225 898 | 00:12:36,520 --> 00:12:38,400 899 | This means that it will be passed to the default chain, 900 | 901 | 226 902 | 00:12:38,400 --> 00:12:41,360 903 | which itself is just a generic call to the language model. 904 | 905 | 227 906 | 00:12:41,560 --> 00:12:43,680 907 | The language model luckily knows a lot about biology, 908 | 909 | 228 910 | 00:12:43,680 --> 00:12:44,880 911 | so it can help us out here. 912 | 913 | 229 914 | 00:12:46,080 --> 00:12:48,120 915 | Now that we've covered these basic questions, 916 | 917 | 230 918 | 00:12:48,120 --> 00:12:50,120 919 | let's move on to the next part of the video. 
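The routing behavior demonstrated above, a physics question going to the physics chain and a biology question falling through to the default chain, can be sketched with a keyword lookup standing in for the LLM router. The real multi-prompt chain asks the language model to choose a destination and parses its output; here the control flow is the whole point.

```python
# Toy sketch of the router pattern: pick a destination chain by subject,
# fall back to the default chain when the router returns none. The real
# router is an LLM call whose output is parsed into a destination name;
# a naive keyword lookup stands in for it here (e.g. "war" matching
# inside another word would misroute; it's only an illustration).
destination_chains = {
    "physics": lambda q: f"[physics expert] {q}",
    "math": lambda q: f"[math expert] {q}",
    "history": lambda q: f"[history expert] {q}",
    "computer science": lambda q: f"[cs expert] {q}",
}
default_chain = lambda q: f"[generic model] {q}"  # plain call to the LLM

def toy_router(question):
    keywords = {"radiation": "physics", "prime": "math", "war": "history"}
    for word, destination in keywords.items():
        if word in question.lower():
            return destination
    return None  # "none" -> default chain

def multi_prompt_chain(question):
    destination = toy_router(question)
    chain = destination_chains.get(destination, default_chain)
    return chain(question)

print(multi_prompt_chain("What is blackbody radiation?"))  # routed to physics
print(multi_prompt_chain("Tell me about DNA"))             # falls back to default
```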
920 | 921 | 231 922 | 00:12:50,120 --> 00:12:52,120 923 | And that is how we can create a new chain. 924 | 925 | 232 926 | 00:12:52,120 --> 00:12:55,120 927 | So, for example, in the next section, 928 | 929 | 233 930 | 00:12:55,120 --> 00:12:57,120 931 | we're going to cover how to create a chain 932 | 933 | 234 934 | 00:12:57,120 --> 00:13:22,120 935 | that can do question answering over your documents. 936 | 937 | 938 | --------------------------------------------------------------------------------