├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── blog_post.docx
├── blog_post.html
├── blog_post.md
├── blog_post.qmd
├── img
│   ├── ML-15729-Amit.png
│   ├── ML-15729-agt-console.png
│   ├── ML-15729-agt-iam.png
│   ├── ML-15729-agt-select-model.png
│   ├── ML-15729-agt-trace1.png
│   ├── ML-15729-agt-trace2.png
│   ├── ML-15729-agt1.png
│   ├── ML-15729-agt2-s1-1.png
│   ├── ML-15729-agt2-s1-2.png
│   ├── ML-15729-agt2.png
│   ├── ML-15729-agt3-s1.png
│   ├── ML-15729-agt3.png
│   ├── ML-15729-agt4.png
│   ├── ML-15729-agt5-s1.png
│   ├── ML-15729-agt5.png
│   ├── ML-15729-agt6.png
│   ├── ML-15729-agt7.png
│   ├── ML-15729-aoss-cv.jpg
│   ├── ML-15729-aoss.jpg
│   ├── ML-15729-aoss1.jpg
│   ├── ML-15729-aoss2.jpg
│   ├── ML-15729-bedrock-agents-kb.png
│   ├── ML-15729-blog_post.png
│   ├── ML-15729-cf-outputs.jpg
│   ├── ML-15729-cf1.jpg
│   ├── ML-15729-cloudformation-launch-stack.png
│   ├── ML-15729-kb1.jpg
│   ├── ML-15729-kb10.jpg
│   ├── ML-15729-kb11-w-context.png
│   ├── ML-15729-kb11-wo-context.png
│   ├── ML-15729-kb11.png
│   ├── ML-15729-kb2.jpg
│   ├── ML-15729-kb3.jpg
│   ├── ML-15729-kb4.jpg
│   ├── ML-15729-kb5.jpg
│   ├── ML-15729-kb6.png
│   ├── ML-15729-kb7.png
│   ├── ML-15729-kb8.jpg
│   ├── ML-15729-kb9.jpg
│   ├── ML-15729-kb_02.png
│   ├── ML-15729-os-vi-1.1.png
│   ├── ML-15729-os-vi-1.png
│   ├── ML-15729-os-vi-2.1.png
│   ├── ML-15729-os-vi-2.png
│   ├── ML-15729-rag_1.png
│   ├── ML-15729-sm1.jpg
│   ├── ML-15729-sm2.jpg
│   ├── ML-15729-sm3.jpg
│   └── bedrock-agents-kb.drawio
├── rag_w_bedrock_agent.ipynb
├── rag_w_bedrock_and_aoss.ipynb
├── setup_bedrock_conda.sh
└── template.yml
/.gitignore:
--------------------------------------------------------------------------------
1 | docs/
2 | agent-sdk/
3 | python-sdk-test-runtime-trace/
4 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing Guidelines
2 |
3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
4 | documentation, we greatly value feedback and contributions from our community.
5 |
6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
7 | information to effectively respond to your bug report or contribution.
8 |
9 |
10 | ## Reporting Bugs/Feature Requests
11 |
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 |
14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 |
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 |
22 |
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 |
26 | 1. You are working against the latest source on the *main* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 |
30 | To send us a pull request, please:
31 |
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 |
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 |
42 |
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. As our projects use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
45 |
46 |
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 |
52 |
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 |
56 |
57 | ## Licensing
58 |
59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT No Attribution
2 |
3 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy of
6 | this software and associated documentation files (the "Software"), to deal in
7 | the Software without restriction, including without limitation the rights to
8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
9 | the Software, and to permit persons to whom the Software is furnished to do so.
10 |
11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
12 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
13 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
14 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
15 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
16 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
17 |
18 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Retrieval Augmented Generation using Amazon Bedrock
2 |
3 | This repository provides sample code for implementing a question answering application using the Retrieval Augmented Generation (RAG) technique with Amazon Bedrock. A RAG implementation consists of two parts:
4 |
5 | 1. A data pipeline that ingests data from documents (typically stored in Amazon S3) into a knowledge base, i.e., a vector database such as Amazon OpenSearch Service Serverless (AOSS), so that it is available for lookup when a question is received.
6 |
7 | 1. An application that receives a question from the user, looks up the knowledge base for relevant pieces of information (context) and then creates a prompt that includes the question and the context and provides it to an LLM for generating a response.
8 |
9 | The data pipeline represents undifferentiated heavy lifting and can be implemented using an Amazon Bedrock Knowledge Base. We can now connect an S3 bucket to a vector database such as AOSS and have a Bedrock Agent read the objects (HTML, PDF, text, etc.), chunk them, convert these chunks into embeddings using the Amazon Titan Embeddings model, and store these embeddings in AOSS, all without having to build, deploy, and manage the data pipeline.
10 |
11 | Once the data is available in the Bedrock Knowledge Base, a question answering application can be built using the following architectural pattern.
12 |
13 | 
14 |
15 | ## Installation
16 |
17 | Follow the steps listed below to create and run the RAG solution. The [blog_post.md](./blog_post.md) describes this solution in detail.
18 |
19 | 1. Launch the AWS CloudFormation template included in this repository using one of the buttons from the table below. The CloudFormation template creates the following resources within your AWS account: an Amazon OpenSearch Service Serverless (AOSS) collection, an Amazon S3 bucket, IAM roles for the Amazon Bedrock Knowledge Base Agent and the notebook, and an Amazon SageMaker Notebook with this repository cloned for running the next steps.
20 |
21 |
22 | |AWS Region | Link |
23 | |:------------------------:|:-----------:|
24 | |us-east-1 (N. Virginia)   | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) |
25 | |us-west-2 (Oregon)        | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) |
26 |
27 | 1. Follow the instructions in [Build a RAG based question answer solution using Amazon Bedrock Knowledge Base and Amazon OpenSearch Service Serverless](./blog_post.md).
28 |
29 | ## Security
30 |
31 | See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
32 |
33 | ## License
34 |
35 | This library is licensed under the MIT-0 License. See the [LICENSE](./LICENSE) file.
36 |
37 |
38 |
--------------------------------------------------------------------------------
/blog_post.docx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/blog_post.docx
--------------------------------------------------------------------------------
/blog_post.md:
--------------------------------------------------------------------------------
1 | Build a RAG based question answer solution using Amazon Bedrock
2 | Knowledge Base, vector engine for Amazon OpenSearch Service Serverless
3 | and LangChain
4 | ================
5 |
6 | *Amit Arora*
7 |
8 | One of the most common applications of generative AI and Foundation
9 | Models (FMs) in an enterprise environment is answering questions based
10 | on the enterprise’s knowledge corpus. [Amazon
11 | Lex](https://aws.amazon.com/lex/) provides the framework for building
12 | [AI based
13 | chatbots](https://aws.amazon.com/solutions/retail/ai-for-chatbots).
14 | Pre-trained foundation models (FMs) perform well at natural language
15 | understanding (NLU) tasks such as summarization, text generation and
16 | question answering on a broad variety of topics but either struggle to
17 | provide accurate (without hallucinations) answers or completely fail at
18 | answering questions about content that they haven’t seen as part of
19 | their training data. Furthermore, FMs are trained with a point in time
20 | snapshot of data and have no inherent ability to access fresh data at
21 | inference time; without this ability they might provide responses that
22 | are potentially incorrect or inadequate.
23 |
24 | A commonly used approach to address this problem is to use a technique
25 | called Retrieval Augmented Generation (RAG). In the RAG-based approach
26 | we convert the user question into vector embeddings using an FM and then
27 | do a similarity search for these embeddings in a pre-populated vector
28 | database holding the embeddings for the enterprise knowledge corpus. A
29 | small number of similar documents (typically three) is added as context
30 | along with the user question to the “prompt” provided to another FM and
31 | then that FM generates an answer to the user question using information
32 | provided as context in the prompt. RAG models were introduced by [Lewis
33 | et al.](https://arxiv.org/abs/2005.11401) in 2020 as a model where
34 | parametric memory is a pre-trained seq2seq model and the non-parametric
35 | memory is a dense vector index of Wikipedia, accessed with a pre-trained
36 | neural retriever. To understand the overall structure of a RAG-based
37 | approach, refer to [Build a powerful question answering bot with Amazon
38 | SageMaker, Amazon OpenSearch Service, Streamlit, and
39 | LangChain](https://aws.amazon.com/blogs/machine-learning/build-a-powerful-question-answering-bot-with-amazon-sagemaker-amazon-opensearch-service-streamlit-and-langchain/).
40 |
41 | In this post we provide a step-by-step guide with all the building
42 | blocks for creating a *Low Code No Code* (LCNC) enterprise-ready RAG
43 | application such as a question answering solution. We use FMs available
44 | through Amazon Bedrock for the embeddings model (Amazon Titan Text
45 | Embeddings v2), the text generation model (Anthropic Claude v2), the
46 | Amazon Bedrock Knowledge Base and Amazon Bedrock Agents for this
47 | solution. The text corpus representing an enterprise knowledge base is
48 | stored as HTML files in Amazon S3 and is ingested in the form of text
49 | embeddings into an index in an Amazon OpenSearch Service Serverless
50 | collection using the Bedrock Knowledge Base Agent in a fully managed
51 | serverless fashion.
52 |
53 | We provide an AWS CloudFormation template to stand up all the resources
54 | required for building this solution. We then demonstrate how to use
55 | [LangChain](https://www.langchain.com) to interface with Bedrock and
56 | [opensearch-py](https://pypi.org/project/opensearch-py/) to interface
57 | with OpenSearch Service Serverless and build a RAG based question answer
58 | workflow.
59 |
60 | ## Solution overview
61 |
62 | We use a subset of [SageMaker docs](https://sagemaker.readthedocs.io) as
63 | the knowledge corpus for this post. The data is available in the form of
64 | HTML files in an S3 bucket; a Bedrock Knowledge Base Agent then reads
65 | these files, converts them into smaller chunks, encodes these chunks
66 | into vectors (embeddings) and then ingests these embeddings into an
67 | OpenSearch Service Serverless collection index. We implement the RAG
68 | functionality in a notebook: a set of SageMaker-related questions is
69 | first asked of the Claude model without providing any additional
70 | context, and then the same questions are asked again, this time with
71 | context based on similar documents retrieved from OpenSearch Service
72 | Serverless, i.e., using the RAG approach. We demonstrate that the
73 | responses generated without RAG can be factually inaccurate whereas the
74 | RAG-based responses are accurate and more useful.
75 |
76 | All the code for this post is available in the [GitHub
77 | repo](https://github.com/aws-samples/bedrock-kb-rag/tree/main/blogs/rag).
78 |
79 | The following figure represents the high-level architecture of the
80 | proposed solution.
81 |
82 |
83 |
85 | Figure 1: Architecture
86 |
87 |
88 | Step-by-step explanation:
89 |
90 | 1. The user provides a question via the Jupyter notebook.
91 | 2. The question is converted into an embedding using Bedrock via the
92 | Titan embeddings v2 model.
93 | 3. The embedding is used to find similar documents from an OpenSearch
94 | Service Serverless index.
95 | 4. The similar documents along with the user question are used to
96 | create a “prompt”.
97 | 5. The prompt is provided to Bedrock to generate a response using the
98 | Claude v2 model.
99 | 6. The response along with the context is printed out in a notebook
100 | cell.
101 |
102 | As illustrated in the architecture diagram, we use the following AWS
103 | services:
104 |
105 | - [Bedrock](https://aws.amazon.com/bedrock/) for access to the FMs for
106 | embedding and text generation as well as for the knowledge base agent.
107 | - [OpenSearch Service Serverless with vector
108 | search](https://aws.amazon.com/opensearch-service/serverless-vector-engine/)
109 | for storing the embeddings of the enterprise knowledge corpus and
110 | doing similarity search with user questions.
111 | - [S3](https://aws.amazon.com/pm/serv-s3/) for storing the raw knowledge
112 | corpus data (HTML files).
113 | - [AWS Identity and Access Management](https://aws.amazon.com/iam/)
114 | roles and policies for access management.
115 | - [AWS CloudFormation](https://aws.amazon.com/cloudformation/) for
116 | creating the entire solution stack through infrastructure as code.
117 |
118 | In terms of open-source packages used in this solution, we use
119 | [LangChain](https://python.langchain.com/en/latest/index.html) for
120 | interfacing with Bedrock and
121 | [opensearch-py](https://pypi.org/project/opensearch-py/) to interface
122 | with OpenSearch Service Serverless.
123 |
124 | The workflow for instantiating the solution presented in this post in
125 | your own AWS account is as follows:
126 |
127 | 1. Run the CloudFormation template provided with this post in your
128 | account. This will create all the necessary infrastructure resources
129 | needed for this solution:
130 |
131 | 1. OpenSearch Service Serverless collection
132 | 2. SageMaker Notebook
133 | 3. IAM roles
134 |
135 | 2. Create a vector index in the OpenSearch Service Serverless
136 | collection. This is done through the OpenSearch Service Serverless
137 | console.
138 |
139 | 3. Create a knowledge base in Bedrock and sync data from the S3 bucket
140 | to the OpenSearch Service Serverless collection index. This is done
141 | through the Bedrock console.
142 |
143 | 4. Create a Bedrock Agent and connect it to the knowledge base and use
144 | the Agent console for question answering *without having to write
145 | any code*.
146 |
147 | 5. Run the
148 | [`rag_w_bedrock_and_aoss.ipynb`](./rag_w_bedrock_and_aoss.ipynb)
149 | notebook in the SageMaker notebook to ask questions based on the
150 | data ingested in OpenSearch Service Serverless collection index.
151 |
152 | These steps are discussed in detail in the following sections.
153 |
154 | ### Prerequisites
155 |
156 | To implement the solution provided in this post, you should have an [AWS
157 | account](https://signin.aws.amazon.com/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&client_id=signup)
158 | and familiarity with FMs, OpenSearch Service and Bedrock.
159 |
160 | #### Use AWS CloudFormation to create the solution stack
161 |
162 | Choose **Launch Stack** for the Region you want to deploy resources to.
163 | All parameters needed by the CloudFormation template have default values
164 | already filled in, except for the ARN of the IAM role with which you are
165 | currently logged into your AWS account, which you have to provide. Make
166 | a note of the OpenSearch Service collection ARN; we use it in
167 | subsequent steps. **This template takes about 10 minutes to complete**.
168 |
169 | | AWS Region | Link |
170 | |:-----------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
171 | | us-east-1 (N. Virginia) | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) |
172 | | us-west-2 (Oregon) | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) |
173 |
174 | After the stack is created successfully, navigate to the stack’s
175 | `Outputs` tab on the AWS CloudFormation console and note the values for
176 | `CollectionARN` and `AOSSVectorIndexName`. We use those in the
177 | subsequent steps.
178 |
179 |
180 |
182 | Figure 2: CloudFormation Stack
183 | Outputs
184 |
185 |
186 | #### Create an OpenSearch Service Serverless vector index
187 |
188 | The CloudFormation stack creates an OpenSearch Service Serverless
189 | collection; the next step is to create a vector index. This is done
190 | through the OpenSearch Service Serverless console as described below.
191 |
192 | 1. Navigate to the OpenSearch Service console and click on `Collections`.
193 | The `sagemaker-kb` collection created by the CloudFormation stack
194 | will be listed there.
195 |
196 |
197 |
199 | Figure 3: SageMaker Knowledge Base
200 | Collection
201 |
202 |
203 | 2. Click on the `sagemaker-kb` link to create a vector index for
204 | storing the embeddings from the documents in S3.
205 |
206 |
207 |
210 | Figure 4: SageMaker Knowledge Base Vector
211 | Index
212 |
213 |
214 | 3. Set the vector index name to `sagemaker-readthedocs-io`, the vector
215 | field name to `vector`, dimensions to `1536`, the engine type to
216 | `FAISS` and the distance metric to `Euclidean`. **It is required that
217 | you set these parameters exactly as mentioned here because the
218 | Bedrock Knowledge Base Agent is going to use these same values**.
219 |
220 |
221 |
224 | Figure 5: SageMaker Knowledge Base Vector
225 | Index Parameters
226 |
227 |
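If you prefer to script this step, the same vector index can also be created programmatically with the opensearch-py package. The following is a minimal sketch under the assumption that you already have an authenticated `aoss_client` for the collection (its construction is shown in the notebook section later in this post); `l2` is the OpenSearch name for the Euclidean distance metric:

``` python
# Minimal sketch: create the vector index programmatically instead of via
# the console. Assumes an authenticated opensearch-py client (aoss_client)
# pointed at the collection endpoint.
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            # these values must match the console values listed above exactly
            "vector": {
                "type": "knn_vector",
                "dimension": 1536,
                "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
            },
            "text": {"type": "text"},
            "metadata": {"type": "text"},
        }
    },
}
aoss_client.indices.create(index="sagemaker-readthedocs-io", body=index_body)
```
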
228 | 4. Once created, the vector index is listed as part of the collection.
229 |
230 |
231 |
234 | Figure 6: SageMaker Knowledge Base Vector
235 | Index Created
236 |
237 |
238 | #### Create a Bedrock knowledge base
239 |
240 | Once the OpenSearch Service Serverless collection and vector index have
241 | been created, it is time to set up the Bedrock knowledge base.
242 |
243 | 1. Navigate to the Bedrock console, click on `Knowledge base`, and then
244 | click on the `Create knowledge base` button.
245 |
246 |
247 |
249 | Figure 7: Bedrock Knowledge
250 | Base
251 |
252 |
253 | 2. Fill out the details for creating the knowledge base as shown in the
254 | screenshots below.
255 |
256 |
257 |
259 | Figure 8: Bedrock Knowledge
260 | Base
261 |
262 |
263 | 3. Select the S3 bucket.
264 |
265 |
266 |
268 | Figure 9: Bedrock Knowledge Base S3
269 | bucket
270 |
271 |
272 | 4. The Titan embeddings model is automatically selected.
273 |
274 |
275 |
277 | Figure 10: Bedrock Knowledge Base
278 | embeddings model
279 |
280 |
281 | 5. Select Amazon OpenSearch Service Serverless from the vector database
282 | options available.
283 |
284 |
285 |
287 | Figure 11: Bedrock Knowledge Base
288 | OpenSearch Service Serverless
289 |
290 |
291 | 6. Review and create the knowledge base by clicking the
292 | `Create knowledge base` button.
293 |
294 |
295 |
297 | Figure 12: Bedrock Knowledge Base Review
298 | & Create
299 |
300 |
301 | 7. The knowledge base should be created now.
302 |
303 |
304 |
306 | Figure 13: Bedrock Knowledge Base create
307 | complete
308 |
309 |
310 | ##### Sync the Bedrock knowledge base
311 |
312 | Once the Bedrock knowledge base is created, we are ready to sync the
313 | data (raw documents) in S3 to embeddings in the OpenSearch Service
314 | Serverless collection vector index.
315 |
316 | 1. Start the `Sync` by pressing the `Sync` button; the button label
317 | changes to `Syncing`.
318 |
319 |
320 |
322 | Figure 14: Bedrock Knowledge Base
323 | sync
324 |
325 |
326 | 2. Once the `Sync` completes the status changes to `Ready`.
327 |
328 |
329 |
331 | Figure 15: Bedrock Knowledge Base sync
332 | completed
333 |
334 |
335 | #### Create a Bedrock Agent for question answering
336 |
337 | Now we are all set to ask some questions of our newly created knowledge
338 | base. In this step we do this in a no-code way by creating a Bedrock
339 | Agent.
340 |
341 | 1. Create a new Bedrock agent, call it `sagemaker-qa` and use the
342 | `AmazonBedrockExecutionRoleForAgent_SageMakerQA` IAM role; this role
343 | is created automatically via CloudFormation.
344 |
345 |
346 |
348 | Figure 16: Provide agent details - agent
349 | name
350 |
351 |
352 |
353 |
355 | Figure 17: Provide agent details - IAM
356 | role
357 |
358 |
359 | 2. Provide the following as the instructions for the agent:
360 | `You are a Q&A agent that politely answers questions from a knowledge base named sagemaker-docs.`.
361 | The `Anthropic Claude V2` model is selected as the model for the
362 | agent.
363 |
364 |
365 |
367 | Figure 18: Select model
368 |
369 |
370 | 3. Click `Next` on the `Add Action groups - optional` page; there are
371 | no action groups needed for this agent.
372 |
373 | 4. Select the `sagemaker-docs` knowledge base and, in the knowledge
374 | base instructions for agent field, enter
375 | `Answer questions about Amazon SageMaker based only on the information contained in the knowledge base.`.
376 |
377 |
378 |
380 | Figure 19: Add knowledge
381 | base
382 |
383 |
384 | 5. Click the `Create Agent` button on the `Review and create` screen.
385 |
386 |
387 |
389 | Figure 20: Review and create
390 |
391 |
392 | 6. Once the agent is ready, we can ask questions to our agent using the
393 | Agent console.
394 |
395 |
396 |
398 | Figure 21: Agent console
399 |
400 |
401 | 7. We ask the agent some questions such as
402 | `What are the XGBoost versions supported in Amazon SageMaker`.
403 | Notice that we not only get the correct answer but also a link to
404 | the source of the answer, i.e., the original document stored in
405 | S3 that has been used as context to provide this answer!
406 |
407 |
408 |
410 | Figure 22: Q&A with Bedrock
411 | Agent
412 |
413 |
414 | 8. The agent also provides a *trace* feature that shows the steps
415 | the agent takes to come up with the final answer. The steps
416 | include the prompt used and the text from the retrieved documents
417 | from the knowledge base.
418 |
419 |
420 |
422 | Figure 23: Bedrock Agent Trace Step
423 | 1
424 |
425 |
426 |
427 |
429 | Figure 24: Bedrock Agent Trace Step
430 | 2
431 |
432 |
433 | #### Run the RAG notebook
434 |
435 | Now we will interact with our knowledge base through code. The
436 | CloudFormation template creates a SageMaker Notebook that contains the
437 | code to demonstrate this.
438 |
439 | 1. Navigate to SageMaker Notebooks, find the notebook instance named
440 | `bedrock-kb-rag-workshop`, and click on `Open JupyterLab`.
441 |
442 |
443 |
445 | Figure 25: RAG with Bedrock KB
446 | notebook
447 |
448 |
449 | 2. Open a new `Terminal` from `File -> New -> Terminal` and run the
450 | following commands to install the Bedrock SDK in a new conda kernel
451 | called `bedrock_py39`.
452 |
453 | ``` bash
454 | chmod +x /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh
455 | /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh
456 | ```
457 |
458 | 3. Wait for one minute after completing the previous step and then
459 | click on `rag_w_bedrock_and_aoss.ipynb` to open the notebook.
460 | *Confirm that the notebook is using the newly created `bedrock_py39`
461 | kernel, otherwise the code will not work. If the kernel is not set
462 | to `bedrock_py39`, refresh the page; the `bedrock_py39` kernel
463 | should then be selected*.
464 |
465 | 4. The notebook code demonstrates the use of the Bedrock, LangChain and
466 | opensearch-py packages for implementing the RAG technique for
467 | question answering.
468 |
469 | 5. We access the models available via Bedrock using the `Bedrock` and
470 | `BedrockEmbeddings` classes from the LangChain package (imported from `langchain.llms` and `langchain.embeddings`, respectively).
471 |
472 | ``` python
473 | # we will use Anthropic Claude for text generation
474 | claude_llm = Bedrock(model_id= "anthropic.claude-v2")
475 | claude_llm.model_kwargs = dict(temperature=0.5, max_tokens_to_sample=300, top_k=250, top_p=1, stop_sequences=[])
476 |
477 | # we will be using the Titan Embeddings Model to generate our Embeddings.
478 | embeddings = BedrockEmbeddings(model_id = "amazon.titan-embed-g1-text-02")
479 | ```
480 |
481 | 6. The interface to OpenSearch Service Serverless is through the
482 | opensearch-py package.
483 |
484 | ``` python
485 | # Functions to talk to OpenSearch
486 |
487 | # Define queries for OpenSearch
488 | def query_docs(query: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, index: str, k: int = 3) -> Dict:
489 | """
490 | Convert the query into embedding and then find similar documents from OpenSearch Service Serverless
491 | """
492 |
493 | # embedding
494 | query_embedding = embeddings.embed_query(query)
495 |
496 | # query to lookup OpenSearch kNN vector. Can add any metadata fields based filtering
497 | # here as part of this query.
498 | query_qna = {
499 | "size": k,
500 | "query": {
501 | "knn": {
502 | "vector": {
503 | "vector": query_embedding,
504 | "k": k
505 | }
506 | }
507 | }
508 | }
509 |
510 | # OpenSearch API call
511 | relevant_documents = aoss_client.search(
512 | body = query_qna,
513 | index = index
514 | )
515 | return relevant_documents
516 | ```
517 |
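The `aoss_client` passed to `query_docs` is a regular opensearch-py client signed with SigV4 credentials. A minimal construction sketch is shown below; the collection endpoint is a placeholder that you would replace with the value from the CloudFormation outputs:

``` python
# imports used here and by query_docs above
from typing import Dict

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

region = "us-east-1"
credentials = boto3.Session().get_credentials()
# "aoss" is the service name used to sign OpenSearch Service Serverless requests
auth = AWSV4SignerAuth(credentials, region, "aoss")

# placeholder: replace with your collection endpoint (without https://)
host = "<collection-id>.us-east-1.aoss.amazonaws.com"

aoss_client = OpenSearch(
    hosts=[{"host": host, "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=300,
)
```
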
518 | 7. We combine the prompt and the documents retrieved from OpenSearch
519 | Service Serverless as follows.
520 |
521 | ``` python
522 | def create_context_for_query(q: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, vector_index: str) -> str:
523 | """
524 | Create a context out of the similar docs retrieved from the vector database
525 | by concatenating the text from the similar documents.
526 | """
527 | print(f"query -> {q}")
528 | aoss_response = query_docs(q, embeddings, aoss_client, vector_index)
529 | context = ""
530 | for r in aoss_response['hits']['hits']:
531 | s = r['_source']
532 | print(f"{s['metadata']}\n{s['text']}")
533 | context += f"{s['text']}\n"
534 | print("----------------")
535 | return context
536 | ```
537 |
538 | 8. Combining everything, the RAG workflow works as shown below.
539 |
540 | ``` python
541 | # 1. Start with the query
542 | q = "What versions of XGBoost are supported by Amazon SageMaker?"
543 |
544 | # 2. Create the context by finding similar documents from the knowledge base
545 | context = create_context_for_query(q, embeddings, client, aoss_vector_index)
546 |
547 | # 3. Now create a prompt by combining the query and the context
548 | prompt = PROMPT_TEMPLATE.format(context, q)
549 |
550 | # 4. Provide the prompt to the FM to generate an answer to the query based on context provided
551 | response = claude_llm(prompt)
552 | ```
553 |
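The `PROMPT_TEMPLATE` used in step 3 is defined earlier in the notebook and is not shown in this snippet. As a rough sketch, such a template could look like the following; the exact wording here is an assumption, the essentials being the two positional placeholders (context first, then the question, matching `PROMPT_TEMPLATE.format(context, q)`) and the `Human:`/`Assistant:` framing that Claude models on Bedrock expect:

``` python
# Hypothetical template; the notebook's actual wording may differ.
PROMPT_TEMPLATE = """

Human: Use the following pieces of context to answer the question at the end.
If the answer is not in the context, say that you do not know; do not make up an answer.

{}

Question: {}

Assistant:"""
```
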
554 | 9. Here is an example of a question answered first with just the
555 | question in the prompt, i.e., without providing any additional
556 | context. The answer without context is inaccurate.
557 |
558 |
559 |
561 | Figure 26: Answer with prompt
562 | alone
563 |
564 |
565 | 10. We then ask the same question but this time with the additional
566 | context retrieved from the knowledge base included in the prompt.
567 | Now the inaccuracy in the earlier response is addressed and we also
568 | have attribution as to the source of this answer (notice the
569 | underlined text for the filename and the actual answer)!
570 |
571 |
572 |
574 | Figure 27: Answer with prompt and
575 | context
576 |
577 |
578 | ## Clean up
579 |
580 | To avoid incurring future charges, delete the resources. You can do this
581 | by first deleting all the files from the S3 bucket created by the
582 | CloudFormation template and then deleting the CloudFormation stack.
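
If you prefer to script the cleanup, a minimal boto3 sketch is shown below; the bucket name is a placeholder that you would replace with the bucket created by the stack:

``` python
import boto3

# placeholder: replace with the bucket created by the CloudFormation stack
bucket_name = "<bucket-created-by-the-stack>"
stack_name = "rag-w-bedrock-kb"

# empty the bucket first; CloudFormation cannot delete a non-empty bucket
boto3.resource("s3").Bucket(bucket_name).objects.all().delete()

# then delete the stack and all the resources it created
boto3.client("cloudformation").delete_stack(StackName=stack_name)
```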
583 |
584 | ## Conclusion
585 |
586 | In this post, we showed how to create an enterprise-ready RAG solution
587 | using a combination of AWS services and open-source Python packages.
588 |
589 | We encourage you to learn more by exploring [Amazon
590 | Titan](https://aws.amazon.com/bedrock/titan/) models, [Amazon
591 | Bedrock](https://aws.amazon.com/bedrock/), and [OpenSearch
592 | Service](https://aws.amazon.com/opensearch-service/) and building a
593 | solution using the sample implementation provided in this post and a
594 | dataset relevant to your business. If you have questions or suggestions,
595 | leave a comment.
596 |
597 | ------------------------------------------------------------------------
598 |
599 | ## Author bio
600 |
601 | ![Amit](img/ML-15729-Amit.png) Amit
602 | Arora is an AI and ML Specialist Architect at Amazon Web Services,
603 | helping enterprise customers use cloud-based machine learning services
604 | to rapidly scale their innovations. He is also an adjunct lecturer in
605 | the MS data science and analytics program at Georgetown University in
606 | Washington D.C.
607 |
--------------------------------------------------------------------------------
/blog_post.qmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Build a RAG based question answer solution using Amazon Bedrock Knowledge Base, vector engine for Amazon OpenSearch Service Serverless and LangChain"
3 | format:
4 | html:
5 | embed-resources: true
6 | output-file: blog_post.html
7 | theme: cosmo
8 | code-copy: true
9 | code-line-numbers: true
10 | highlight-style: github
11 | docx:
12 | embed-resources: true
13 | output-file: blog_post.docx
14 | theme: cosmo
15 | code-copy: true
16 | code-line-numbers: true
17 | highlight-style: github
18 | gfm:
19 | output-file: blog_post.md
20 | ---
21 |
22 | _Amit Arora_
23 |
24 | One of the most common applications of generative AI and Foundation Models (FMs) in an enterprise environment is answering questions based on the enterprise’s knowledge corpus. [Amazon Lex](https://aws.amazon.com/lex/) provides the framework for building [AI based chatbots](https://aws.amazon.com/solutions/retail/ai-for-chatbots). Pre-trained foundation models (FMs) perform well at natural language understanding (NLU) tasks such as summarization, text generation and question answering on a broad variety of topics but either struggle to provide accurate (without hallucinations) answers or completely fail at answering questions about content that they haven't seen as part of their training data. Furthermore, FMs are trained with a point in time snapshot of data and have no inherent ability to access fresh data at inference time; without this ability they might provide responses that are potentially incorrect or inadequate.
25 |
26 | A commonly used approach to address this problem is to use a technique called Retrieval Augmented Generation (RAG). In the RAG-based approach we convert the user question into vector embeddings using an FM and then do a similarity search for these embeddings in a pre-populated vector database holding the embeddings for the enterprise knowledge corpus. A small number of similar documents (typically three) is added as context along with the user question to the "prompt" provided to another FM and then that FM generates an answer to the user question using information provided as context in the prompt. RAG models were introduced by [Lewis et al.](https://arxiv.org/abs/2005.11401) in 2020 as a model where parametric memory is a pre-trained seq2seq model and the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever. To understand the overall structure of a RAG-based approach, refer to [Build a powerful question answering bot with Amazon SageMaker, Amazon OpenSearch Service, Streamlit, and LangChain](https://aws.amazon.com/blogs/machine-learning/build-a-powerful-question-answering-bot-with-amazon-sagemaker-amazon-opensearch-service-streamlit-and-langchain/).
27 |
28 | In this post we provide a step-by-step guide with all the building blocks for creating a _Low Code No Code_ (LCNC) enterprise-ready RAG application such as a question answering solution. We use FMs available through Amazon Bedrock for the embeddings model (Amazon Titan Text Embeddings v2), the text generation model (Anthropic Claude v2), the Amazon Bedrock Knowledge Base and Amazon Bedrock Agents for this solution. The text corpus representing an enterprise knowledge base is stored as HTML files in Amazon S3 and is ingested in the form of text embeddings into an index in an Amazon OpenSearch Service Serverless collection using the Bedrock Knowledge Base Agent in a fully managed serverless fashion.
29 |
30 | We provide an AWS CloudFormation template to stand up all the resources required for building this solution. We then demonstrate how to use [LangChain](https://www.langchain.com) to interface with Bedrock and [opensearch-py](https://pypi.org/project/opensearch-py/) to interface with OpenSearch Service Serverless and build a RAG based question answer workflow.
31 |
32 | ## Solution overview
33 |
34 | We use a subset of [SageMaker docs](https://sagemaker.readthedocs.io) as the knowledge corpus for this post. The data is available in the form of HTML files in an S3 bucket; a Bedrock Knowledge Base Agent then reads these files, converts them into smaller chunks, encodes these chunks into vectors (embeddings) and then ingests these embeddings into an OpenSearch Service Serverless collection index. We implement the RAG functionality in a notebook: a set of SageMaker-related questions is first asked of the Claude model without providing any additional context, and then the same questions are asked again, this time with context based on similar documents retrieved from OpenSearch Service Serverless, i.e., using the RAG approach. We demonstrate that the responses generated without RAG can be factually inaccurate whereas the RAG-based responses are accurate and more useful.
35 |
36 | All the code for this post is available in the [GitHub repo](https://github.com/aws-samples/bedrock-kb-rag/tree/main/blogs/rag).
37 |
38 |
39 | The following figure represents the high-level architecture of the proposed solution.
40 |
41 | {#fig-architecture}
42 |
43 | Step-by-step explanation:
44 |
45 | 1. The user provides a question via the Jupyter notebook.
46 | 1. The question is converted into an embedding using Bedrock via the Titan embeddings v2 model.
47 | 1. The embedding is used to find similar documents from an OpenSearch Service Serverless index.
48 | 1. The similar documents along with the user question are used to create a "prompt".
49 | 1. The prompt is provided to Bedrock to generate a response using the Claude v2 model.
50 | 1. The response along with the context is printed out in a notebook cell.
51 |
52 | As illustrated in the architecture diagram, we use the following AWS services:
53 |
54 | - [Bedrock](https://aws.amazon.com/bedrock/) for access to the FMs for embedding and text generation as well as for the knowledge base agent.
55 | - [OpenSearch Service Serverless with vector search](https://aws.amazon.com/opensearch-service/serverless-vector-engine/) for storing the embeddings of the enterprise knowledge corpus and doing similarity search with user questions.
56 | - [S3](https://aws.amazon.com/pm/serv-s3/) for storing the raw knowledge corpus data (HTML files).
57 | - [AWS Identity and Access Management](https://aws.amazon.com/iam/) roles and policies for access management.
58 | - [AWS CloudFormation](https://aws.amazon.com/cloudformation/) for creating the entire solution stack through infrastructure as code.
59 |
60 | In terms of open-source packages used in this solution, we use [LangChain](https://python.langchain.com/en/latest/index.html) for interfacing with Bedrock and [opensearch-py](https://pypi.org/project/opensearch-py/) to interface with OpenSearch Service Serverless.
61 |
62 | The workflow for instantiating the solution presented in this post in your own AWS account is as follows:
63 |
64 | 1. Run the CloudFormation template provided with this post in your account. This will create all the necessary infrastructure resources needed for this solution:
65 | a. OpenSearch Service Serverless collection
66 | a. SageMaker Notebook
67 | a. IAM roles
68 |
69 | 1. Create a vector index in the OpenSearch Service Serverless collection. This is done through the OpenSearch Service Serverless console.
70 |
71 | 1. Create a knowledge base in Bedrock and sync data from the S3 bucket to the OpenSearch Service Serverless collection index. This is done through the Bedrock console.
72 |
73 | 1. Create a Bedrock Agent and connect it to the knowledge base and use the Agent console for question answering _without having to write any code_.
74 |
75 | 1. Run the [`rag_w_bedrock_and_aoss.ipynb`](./rag_w_bedrock_and_aoss.ipynb) notebook in the SageMaker notebook to ask questions based on the data ingested in OpenSearch Service Serverless collection index.
76 |
77 | These steps are discussed in detail in the following sections.
78 |
79 | ### Prerequisites
80 |
81 | To implement the solution provided in this post, you should have an [AWS account](https://signin.aws.amazon.com/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&client_id=signup) and familiarity with FMs, OpenSearch Service and Bedrock.
82 |
83 | #### Use AWS CloudFormation to create the solution stack
84 |
85 | Choose **Launch Stack** for the Region you want to deploy resources to. All parameters needed by the CloudFormation template have default values already filled in, except for the ARN of the IAM role with which you are currently logged into your AWS account, which you have to provide. Make a note of the OpenSearch Service collection ARN; we use it in subsequent steps. **This template takes about 10 minutes to complete**.
86 |
87 | |AWS Region | Link |
88 | |:------------------------:|:-----------:|
89 | |us-east-1 (N. Virginia) | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) |
90 | |us-west-2 (Oregon) | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) |
91 |
92 | After the stack is created successfully, navigate to the stack's `Outputs` tab on the AWS CloudFormation console and note the values for `CollectionARN` and `AOSSVectorIndexName`. We use those in the subsequent steps.
93 |
94 | {#fig-cfn-outputs}
95 |
96 | #### Create an OpenSearch Service Serverless vector index
97 |
98 | The CloudFormation stack creates an OpenSearch Service Serverless collection; the next step is to create a vector index. This is done through the OpenSearch Service Serverless console as described below.
99 |
100 | 1. Navigate to the OpenSearch Service console and click on `Collections`. The `sagemaker-kb` collection created by the CloudFormation stack will be listed there.
101 |
102 | {#fig-aoss-collections}
103 |
104 | 1. Click on the `sagemaker-kb` link to create a vector index for storing the embeddings from the documents in S3.
105 |
106 | {#fig-aoss-collection-vector-index}
107 |
108 | 1. Set the vector index name to `sagemaker-readthedocs-io`, the vector field name to `vector`, dimensions to `1536`, the engine type to `FAISS`, and the distance metric to `Euclidean`. **It is required that you set these parameters exactly as mentioned here because the Bedrock Knowledge Base Agent is going to use these same values**.
109 |
110 | {#fig-aoss-collection-vector-index-parameters}
111 |
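If you prefer to script this step, the same vector index can also be created programmatically with the opensearch-py package. The following is a minimal sketch under the assumption that you already have an authenticated `aoss_client` for the collection (its construction is shown in the notebook section later in this post); `l2` is the OpenSearch name for the Euclidean distance metric:

```{.python}
# Minimal sketch: create the vector index programmatically instead of via
# the console. Assumes an authenticated opensearch-py client (aoss_client)
# pointed at the collection endpoint.
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            # these values must match the console values listed above exactly
            "vector": {
                "type": "knn_vector",
                "dimension": 1536,
                "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
            },
            "text": {"type": "text"},
            "metadata": {"type": "text"},
        }
    },
}
aoss_client.indices.create(index="sagemaker-readthedocs-io", body=index_body)
```
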
112 | 1. Once created, the vector index is listed as part of the collection.
113 |
114 | {#fig-aoss-collection-vector-index-created}
115 |
116 |
117 | #### Create a Bedrock knowledge base
118 |
119 | Once the OpenSearch Service Serverless collection and vector index have been created, it is time to set up the Bedrock knowledge base.
120 |
121 | 1. Navigate to the Bedrock console, click on `Knowledge base`, and then click on the `Create knowledge base` button.
122 |
123 | {#fig-br-kb-list}
124 |
125 | 1. Fill out the details for creating the knowledge base as shown in the screenshots below.
126 |
127 | {#fig-br-kb-list}
128 |
129 | 1. Select the S3 bucket.
130 |
131 | {#fig-br-kb-s3-bucket}
132 |
133 | 1. The Titan embeddings model is automatically selected.
134 |
135 | {#fig-br-kb-titan}
136 |
137 |
138 | 1. Select Amazon OpenSearch Service Serverless from the vector database options available.
139 |
140 | {#fig-br-kb-aoss}
141 |
142 | 1. Review and create the knowledge base by clicking the `Create knowledge base` button.
143 |
144 | {#fig-br-kb-review-and-create}
145 |
146 | 1. The knowledge base should be created now.
147 |
148 | {#fig-br-kb-create-complete}
149 |
150 |
151 | ##### Sync the Bedrock knowledge base
152 |
153 | Once the Bedrock knowledge base is created, we are ready to sync the data (raw documents) in S3 to embeddings in the OpenSearch Service Serverless collection vector index.
154 |
155 | 1. Start the `Sync` by pressing the `Sync` button; the button label changes to `Syncing`.
156 |
157 | {#fig-br-kb-sync-in-progress}
158 |
159 | 1. Once the `Sync` completes the status changes to `Ready`.
160 |
161 | {#fig-br-kb-sync-done}
162 |
163 | #### Create a Bedrock Agent for question answering
164 |
165 | Now we are all set to ask some questions of our newly created knowledge base. In this step we do this in a no-code way by creating a Bedrock Agent.
166 |
167 | 1. Create a new Bedrock agent, call it `sagemaker-qa` and use the `AmazonBedrockExecutionRoleForAgent_SageMakerQA` IAM role; this role is created automatically via CloudFormation.
168 |
169 | {#fig-br-agt-create-step1-1}
170 |
171 | {#fig-br-agt-create-step1-2}
172 |
173 | 1. Provide the following as the instructions for the agent: `You are a Q&A agent that politely answers questions from a knowledge base.`. The `Anthropic Claude V2` model is selected as the model for the agent.
174 |
175 | {#fig-br-agt-select-model}
176 |
177 | 1. Click `Next` on the `Add Action groups - optional` page; there are no action groups needed for this agent.
178 |
179 | 1. Select the `sagemaker-docs` knowledge base and, in the knowledge base instructions for agent field, enter `Answer questions about Amazon SageMaker based only on the information contained in the knowledge base.`.
180 |
181 | {#fig-br-agt-add-kb}
182 |
183 | 1. Click the `Create Agent` button on the `Review and create` screen.
184 |
185 | {#fig-br-agt-review-and-create}
186 |
187 | 1. Once the agent is ready, we can ask questions to our agent using the Agent console.
188 |
189 | {#fig-br-agt-console}
190 |
191 | 1. We ask the agent some questions such as `What are the XGBoost versions supported in Amazon SageMaker`. Notice that we not only get the correct answer but also a link to the source of the answer, i.e., the original document stored in S3 that has been used as context to provide this answer!
192 |
193 | {#fig-br-agt-qna}
194 |
195 | 1. The agent also provides a _trace_ feature that shows the steps the agent takes to come up with the final answer. The steps include the prompt used and the text from the retrieved documents from the knowledge base.
196 |
197 | {#fig-br-agt-trace-step1}
198 |
199 | {#fig-br-agt-trace-step2}
200 |
201 | #### Run the RAG notebook
202 |
203 | Now we will interact with our knowledge base through code. The CloudFormation template creates a SageMaker Notebook that contains the code to demonstrate this.
204 |
205 | 1. Navigate to SageMaker Notebooks, find the notebook instance named `bedrock-kb-rag-workshop`, and click on `Open JupyterLab`.
206 |
207 | {#fig-rag-w-br-nb}
208 |
209 | 1. Open a new `Terminal` from `File -> New -> Terminal` and run the following commands to install the Bedrock SDK in a new conda kernel called `bedrock_py39`.
210 |
211 | ```{.bash}
212 | chmod +x /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh
213 | /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh
214 | ```
215 |
216 | 1. Wait for one minute after completing the previous step and then click on `rag_w_bedrock_and_aoss.ipynb` to open the notebook. *Confirm that the notebook is using the newly created `bedrock_py39` kernel, otherwise the code will not work. If the kernel is not set to `bedrock_py39`, refresh the page; the `bedrock_py39` kernel should then be selected*.
217 |
218 | 1. The notebook code demonstrates the use of the Bedrock, LangChain and opensearch-py packages for implementing the RAG technique for question answering.
219 |
220 | 1. We access the models available via Bedrock using the `Bedrock` and `BedrockEmbeddings` classes from the LangChain package (imported from `langchain.llms` and `langchain.embeddings`, respectively).
221 |
222 | ```{.python}
223 | # we will use Anthropic Claude for text generation
224 | claude_llm = Bedrock(model_id= "anthropic.claude-v2")
225 | claude_llm.model_kwargs = dict(temperature=0.5, max_tokens_to_sample=300, top_k=250, top_p=1, stop_sequences=[])
226 |
227 | # we will be using the Titan Embeddings Model to generate our Embeddings.
228 | embeddings = BedrockEmbeddings(model_id = "amazon.titan-embed-g1-text-02")
229 | ```
230 |
231 | 1. The interface to OpenSearch Service Serverless is through the opensearch-py package.
232 |
233 | ```{.python}
234 | # Functions to talk to OpenSearch
235 |
236 | # Define queries for OpenSearch
237 | def query_docs(query: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, index: str, k: int = 3) -> Dict:
238 | """
239 | Convert the query into embedding and then find similar documents from OpenSearch Service Serverless
240 | """
241 |
242 | # embedding
243 | query_embedding = embeddings.embed_query(query)
244 |
245 | # query to lookup OpenSearch kNN vector. Can add any metadata fields based filtering
246 | # here as part of this query.
247 | query_qna = {
248 | "size": k,
249 | "query": {
250 | "knn": {
251 | "vector": {
252 | "vector": query_embedding,
253 | "k": k
254 | }
255 | }
256 | }
257 | }
258 |
259 | # OpenSearch API call
260 | relevant_documents = aoss_client.search(
261 | body = query_qna,
262 | index = index
263 | )
264 | return relevant_documents
265 | ```
266 |
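The `aoss_client` passed to `query_docs` is a regular opensearch-py client signed with SigV4 credentials. A minimal construction sketch is shown below; the collection endpoint is a placeholder that you would replace with the value from the CloudFormation outputs:

```{.python}
# imports used here and by query_docs above
from typing import Dict

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

region = "us-east-1"
credentials = boto3.Session().get_credentials()
# "aoss" is the service name used to sign OpenSearch Service Serverless requests
auth = AWSV4SignerAuth(credentials, region, "aoss")

# placeholder: replace with your collection endpoint (without https://)
host = "<collection-id>.us-east-1.aoss.amazonaws.com"

aoss_client = OpenSearch(
    hosts=[{"host": host, "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=300,
)
```
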
267 | 1. We combine the prompt and the documents retrieved from OpenSearch Service Serverless as follows.
268 |
269 | ```{.python}
270 | def create_context_for_query(q: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, vector_index: str) -> str:
271 | """
272 | Create a context out of the similar docs retrieved from the vector database
273 | by concatenating the text from the similar documents.
274 | """
275 | print(f"query -> {q}")
276 | aoss_response = query_docs(q, embeddings, aoss_client, vector_index)
277 | context = ""
278 | for r in aoss_response['hits']['hits']:
279 | s = r['_source']
280 | print(f"{s['metadata']}\n{s['text']}")
281 | context += f"{s['text']}\n"
282 | print("----------------")
283 | return context
284 | ```
285 |
286 | 1. Combining everything, the RAG workflow works as shown below.
287 |
288 | ```{.python}
289 | # 1. Start with the query
290 | q = "What versions of XGBoost are supported by Amazon SageMaker?"
291 |
292 | # 2. Create the context by finding similar documents from the knowledge base
293 | context = create_context_for_query(q, embeddings, client, aoss_vector_index)
294 |
295 | # 3. Now create a prompt by combining the query and the context
296 | prompt = PROMPT_TEMPLATE.format(context, q)
297 |
298 | # 4. Provide the prompt to the FM to generate an answer to the query based on context provided
299 | response = claude_llm(prompt)
300 | ```
301 |
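The `PROMPT_TEMPLATE` used in step 3 is defined earlier in the notebook and is not shown in this snippet. As a rough sketch, such a template could look like the following; the exact wording here is an assumption, the essentials being the two positional placeholders (context first, then the question, matching `PROMPT_TEMPLATE.format(context, q)`) and the `Human:`/`Assistant:` framing that Claude models on Bedrock expect:

```{.python}
# Hypothetical template; the notebook's actual wording may differ.
PROMPT_TEMPLATE = """

Human: Use the following pieces of context to answer the question at the end.
If the answer is not in the context, say that you do not know; do not make up an answer.

{}

Question: {}

Assistant:"""
```
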
302 | 1. Here is an example of a question answered first with just the question in the prompt, i.e., without providing any additional context. The answer without context is inaccurate.
303 |
304 | {#fig-rag-wo-context}
305 |
306 |
307 | 1. We then ask the same question but this time with the additional context retrieved from the knowledge base included in the prompt. Now the inaccuracy in the earlier response is addressed and we also have attribution as to the source of this answer (notice the underlined text for the filename and the actual answer)!
308 |
309 | {#fig-answer-w-context}
310 |
311 | ## Clean up
312 |
313 | To avoid incurring future charges, delete the resources. You can do this by first deleting all the files from the S3 bucket created by the CloudFormation template and then deleting the CloudFormation stack.
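
If you prefer to script the cleanup, a minimal boto3 sketch is shown below; the bucket name is a placeholder that you would replace with the bucket created by the stack:

```{.python}
import boto3

# placeholder: replace with the bucket created by the CloudFormation stack
bucket_name = "<bucket-created-by-the-stack>"
stack_name = "rag-w-bedrock-kb"

# empty the bucket first; CloudFormation cannot delete a non-empty bucket
boto3.resource("s3").Bucket(bucket_name).objects.all().delete()

# then delete the stack and all the resources it created
boto3.client("cloudformation").delete_stack(StackName=stack_name)
```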
314 |
315 | ## Conclusion
316 |
317 | In this post, we showed how to create an enterprise-ready RAG solution using a combination of AWS services and open-source Python packages.
318 |
319 | We encourage you to learn more by exploring [Amazon Titan](https://aws.amazon.com/bedrock/titan/) models, [Amazon Bedrock](https://aws.amazon.com/bedrock/), and [OpenSearch Service](https://aws.amazon.com/opensearch-service/) and building a solution using the sample implementation provided in this post and a dataset relevant to your business. If you have questions or suggestions, leave a comment.
320 |
321 | * * * * *
322 |
323 | ## Author bio
324 |
325 | ![Amit Arora](img/ML-15729-Amit.png) Amit Arora is an AI and ML Specialist Architect at Amazon Web Services, helping enterprise customers use cloud-based machine learning services to rapidly scale their innovations. He is also an adjunct lecturer in the MS data science and analytics program at Georgetown University in Washington D.C.
326 |
--------------------------------------------------------------------------------
/img/ML-15729-Amit.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-Amit.png
--------------------------------------------------------------------------------
/img/ML-15729-agt-console.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-console.png
--------------------------------------------------------------------------------
/img/ML-15729-agt-iam.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-iam.png
--------------------------------------------------------------------------------
/img/ML-15729-agt-select-model.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-select-model.png
--------------------------------------------------------------------------------
/img/ML-15729-agt-trace1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-trace1.png
--------------------------------------------------------------------------------
/img/ML-15729-agt-trace2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-trace2.png
--------------------------------------------------------------------------------
/img/ML-15729-agt1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt1.png
--------------------------------------------------------------------------------
/img/ML-15729-agt2-s1-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt2-s1-1.png
--------------------------------------------------------------------------------
/img/ML-15729-agt2-s1-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt2-s1-2.png
--------------------------------------------------------------------------------
/img/ML-15729-agt2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt2.png
--------------------------------------------------------------------------------
/img/ML-15729-agt3-s1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt3-s1.png
--------------------------------------------------------------------------------
/img/ML-15729-agt3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt3.png
--------------------------------------------------------------------------------
/img/ML-15729-agt4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt4.png
--------------------------------------------------------------------------------
/img/ML-15729-agt5-s1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt5-s1.png
--------------------------------------------------------------------------------
/img/ML-15729-agt5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt5.png
--------------------------------------------------------------------------------
/img/ML-15729-agt6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt6.png
--------------------------------------------------------------------------------
/img/ML-15729-agt7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt7.png
--------------------------------------------------------------------------------
/img/ML-15729-aoss-cv.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss-cv.jpg
--------------------------------------------------------------------------------
/img/ML-15729-aoss.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss.jpg
--------------------------------------------------------------------------------
/img/ML-15729-aoss1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss1.jpg
--------------------------------------------------------------------------------
/img/ML-15729-aoss2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss2.jpg
--------------------------------------------------------------------------------
/img/ML-15729-bedrock-agents-kb.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-bedrock-agents-kb.png
--------------------------------------------------------------------------------
/img/ML-15729-blog_post.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-blog_post.png
--------------------------------------------------------------------------------
/img/ML-15729-cf-outputs.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-cf-outputs.jpg
--------------------------------------------------------------------------------
/img/ML-15729-cf1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-cf1.jpg
--------------------------------------------------------------------------------
/img/ML-15729-cloudformation-launch-stack.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-cloudformation-launch-stack.png
--------------------------------------------------------------------------------
/img/ML-15729-kb1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb1.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb10.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb10.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb11-w-context.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb11-w-context.png
--------------------------------------------------------------------------------
/img/ML-15729-kb11-wo-context.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb11-wo-context.png
--------------------------------------------------------------------------------
/img/ML-15729-kb11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb11.png
--------------------------------------------------------------------------------
/img/ML-15729-kb2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb2.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb3.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb4.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb4.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb5.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb5.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb6.png
--------------------------------------------------------------------------------
/img/ML-15729-kb7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb7.png
--------------------------------------------------------------------------------
/img/ML-15729-kb8.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb8.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb9.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb9.jpg
--------------------------------------------------------------------------------
/img/ML-15729-kb_02.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb_02.png
--------------------------------------------------------------------------------
/img/ML-15729-os-vi-1.1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-1.1.png
--------------------------------------------------------------------------------
/img/ML-15729-os-vi-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-1.png
--------------------------------------------------------------------------------
/img/ML-15729-os-vi-2.1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-2.1.png
--------------------------------------------------------------------------------
/img/ML-15729-os-vi-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-2.png
--------------------------------------------------------------------------------
/img/ML-15729-rag_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-rag_1.png
--------------------------------------------------------------------------------
/img/ML-15729-sm1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-sm1.jpg
--------------------------------------------------------------------------------
/img/ML-15729-sm2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-sm2.jpg
--------------------------------------------------------------------------------
/img/ML-15729-sm3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-sm3.jpg
--------------------------------------------------------------------------------
/img/bedrock-agents-kb.drawio:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/rag_w_bedrock_agent.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 30,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import uuid\n",
10 | "import boto3\n",
11 | "import pprint\n",
12 | "import botocore\n",
13 | "import logging\n",
14 | "\n",
15 | "logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)\n",
16 | "logger = logging.getLogger(__name__)\n"
17 | ]
18 | },
19 | {
20 | "cell_type": "code",
21 | "execution_count": 31,
22 | "metadata": {},
23 | "outputs": [],
24 | "source": [
25 | "# global constants\n",
26 | "ENDPOINT_URL: str = None\n"
27 | ]
28 | },
29 | {
30 | "cell_type": "code",
31 | "execution_count": 32,
32 | "metadata": {},
33 | "outputs": [],
34 | "source": [
35 | "# you want to make sure that install sequence is as follows\n",
36 | "# %pip install boto3-1.28.54-py3-none-any.whl\n",
37 | "# %pip install botocore-1.31.54-py3-none-any.whl\n",
38 | "# %pip install awscli-1.29.54-py3-none-any.whl\n",
39 | "\n",
40 | "# exit out if the Boto3 (Python) SDK versions are not correct\n",
41 | "assert boto3.__version__ == \"1.28.73\"\n",
42 | "assert botocore.__version__ == \"1.31.73\"\n"
43 | ]
44 | },
45 | {
46 | "cell_type": "code",
47 | "execution_count": 33,
48 | "metadata": {},
49 | "outputs": [],
50 | "source": [
51 | "input_text:str = \"What are the XGBoost versions supported in Amazon SageMaker?\" # replace this with a prompt relevant to your agent\n",
52 | "agent_id:str = 'J0TEWQNZ89' # note this from the agent console on Bedrock\n",
53 | "agent_alias_id:str = 'TSTALIASID' # fixed for draft version of the agent\n",
54 | "session_id:str = str(uuid.uuid1()) # random identifier\n",
55 | "enable_trace:bool = True\n"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": 34,
61 | "metadata": {},
62 | "outputs": [
63 | {
64 | "name": "stderr",
65 | "output_type": "stream",
66 | "text": [
67 | "[2023-11-02 15:01:55,355] p43792 {691742557.py:3} INFO - \n"
68 | ]
69 | }
70 | ],
71 | "source": [
72 | "# create an boto3 bedrock agent client\n",
73 | "client = boto3.client(\"bedrock-agent-runtime\", endpoint_url=ENDPOINT_URL)\n",
74 | "logger.info(client)\n"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "execution_count": 35,
80 | "metadata": {},
81 | "outputs": [
82 | {
83 | "name": "stderr",
84 | "output_type": "stream",
85 | "text": [
86 | "[2023-11-02 15:01:55,849] p43792 {4226590062.py:9} INFO - None\n"
87 | ]
88 | },
89 | {
90 | "name": "stdout",
91 | "output_type": "stream",
92 | "text": [
93 | "{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',\n",
94 | " 'content-type': 'application/json',\n",
95 | " 'date': 'Thu, 02 Nov 2023 19:01:55 GMT',\n",
96 | " 'transfer-encoding': 'chunked',\n",
97 | " 'x-amz-bedrock-agent-session-id': '4a5687bd-79b2-11ee-943b-846a79be0989',\n",
98 | " 'x-amzn-bedrock-agent-content-type': 'application/json',\n",
99 | " 'x-amzn-requestid': 'b76650eb-ee2d-4120-a091-d680ea4c588f'},\n",
100 | " 'HTTPStatusCode': 200,\n",
101 | " 'RequestId': 'b76650eb-ee2d-4120-a091-d680ea4c588f',\n",
102 | " 'RetryAttempts': 0},\n",
103 | " 'completion': ,\n",
104 | " 'contentType': 'application/json',\n",
105 | " 'sessionId': '4a5687bd-79b2-11ee-943b-846a79be0989'}\n"
106 | ]
107 | }
108 | ],
109 | "source": [
110 | "# invoke the agent API\n",
111 | "response = client.invoke_agent(inputText=input_text,\n",
112 | " agentId=agent_id,\n",
113 | " agentAliasId=agent_alias_id,\n",
114 | " sessionId=session_id,\n",
115 | " enableTrace=enable_trace\n",
116 | ")\n",
117 | "\n",
118 | "logger.info(pprint.pprint(response))\n"
119 | ]
120 | },
121 | {
122 | "cell_type": "code",
123 | "execution_count": 36,
124 | "metadata": {},
125 | "outputs": [
126 | {
127 | "name": "stderr",
128 | "output_type": "stream",
129 | "text": [
130 | "[2023-11-02 15:02:04,472] p43792 {:11} INFO - {\n",
131 | " \"agentId\": \"J0TEWQNZ89\",\n",
132 | " \"agentAliasId\": \"TSTALIASID\",\n",
133 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n",
134 | " \"trace\": {\n",
135 | " \"rationaleTrace\": {\n",
136 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-0\",\n",
137 | " \"text\": \"Review the \\\"User Input\\\", \\\"Conversation History\\\", \\\"Attributes\\\", \\\"APIs\\\" and always think about what to do\"\n",
138 | " }\n",
139 | " }\n",
140 | "}\n",
141 | "[2023-11-02 15:02:19,633] p43792 {:11} INFO - {\n",
142 | " \"agentId\": \"J0TEWQNZ89\",\n",
143 | " \"agentAliasId\": \"TSTALIASID\",\n",
144 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n",
145 | " \"trace\": {\n",
146 | " \"invocationInputTrace\": {\n",
147 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-0\",\n",
148 | " \"invocationType\": \"KNOWLEDGE_BASE\",\n",
149 | " \"knowledgeBaseLookupInput\": {\n",
150 | " \"text\": \"What are the XGBoost versions supported in Amazon SageMaker?\",\n",
151 | " \"knowledgeBaseId\": \"HK7XZ6KQYP\"\n",
152 | " }\n",
153 | " }\n",
154 | " }\n",
155 | "}\n",
156 | "[2023-11-02 15:02:19,634] p43792 {:11} INFO - {\n",
157 | " \"agentId\": \"J0TEWQNZ89\",\n",
158 | " \"agentAliasId\": \"TSTALIASID\",\n",
159 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n",
160 | " \"trace\": {\n",
161 | " \"observationTrace\": {\n",
162 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-0\",\n",
163 | " \"invocationType\": \"KNOWLEDGE_BASE\",\n",
164 | " \"knowledgeBaseLookupOutput\": {\n",
165 | " \"sourceReferences\": {\n",
166 | " \"textSourceReferences\": [\n",
167 | " {\n",
168 | " \"sourceLocation\": {\n",
169 | " \"s3SourceLocation\": {\n",
170 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"\n",
171 | " }\n",
172 | " },\n",
173 | " \"referenceText\": \"see Extending our PyTorch containers. Use XGBoost as a Built-in Algortihm\\u00b6 Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don\\u2019t have to write a training script. If you don\\u2019t need the features and flexibility of open source XGBoost, consider using the built-in version. For information about using the Amazon SageMaker XGBoost built-in algorithm, see XGBoost Algorithm in the Amazon SageMaker Developer Guide. Use the Open Source XGBoost Algorithm\\u00b6 If you want the flexibility and additional features that it provides, use the SageMaker open source XGBoost algorithm. For which XGBoost versions are supported, see the AWS documentation. We recommend that you use the latest supported version because that\\u2019s where we focus most of our development efforts. For a complete example of using the open source XGBoost algorithm, see the sample notebook at https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb. For more information about XGBoost, see the XGBoost documentation. Train a Model with Open Source XGBoost\\u00b6 To train a model by using the Amazon SageMaker open source XGBoost algorithm: Prepare a training script Create a sagemaker.xgboost.XGBoost estimator Call the estimator\\u2019s fit method Prepare a Training Script\\u00b6 A typical training script loads data from the input channels, configures training with hyperparameters, trains a model,\"\n",
174 | " },\n",
175 | " {\n",
176 | " \"sourceLocation\": {\n",
177 | " \"s3SourceLocation\": {\n",
178 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"\n",
179 | " }\n",
180 | " },\n",
181 | " \"referenceText\": \"see Extending our PyTorch containers. Use XGBoost as a Built-in Algortihm\\u00b6 Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don\\u2019t have to write a training script. If you don\\u2019t need the features and flexibility of open source XGBoost, consider using the built-in version. For information about using the Amazon SageMaker XGBoost built-in algorithm, see XGBoost Algorithm in the Amazon SageMaker Developer Guide. Use the Open Source XGBoost Algorithm\\u00b6 If you want the flexibility and additional features that it provides, use the SageMaker open source XGBoost algorithm. For which XGBoost versions are supported, see the AWS documentation. We recommend that you use the latest supported version because that\\u2019s where we focus most of our development efforts. For a complete example of using the open source XGBoost algorithm, see the sample notebook at https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb. For more information about XGBoost, see the XGBoost documentation. Train a Model with Open Source XGBoost\\u00b6 To train a model by using the Amazon SageMaker open source XGBoost algorithm: Prepare a training script Create a sagemaker.xgboost.XGBoost estimator Call the estimator\\u2019s fit method Prepare a Training Script\\u00b6 A typical training script loads data from the input channels, configures training with hyperparameters, trains a model,\"\n",
182 | " },\n",
183 | " {\n",
184 | " \"sourceLocation\": {\n",
185 | " \"s3SourceLocation\": {\n",
186 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_algorithms_tabular_xgboost.html\"\n",
187 | " }\n",
188 | " },\n",
189 | " \"referenceText\": \"an expanded set of metrics than the original versions. It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5. The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker XGBoost algorithm. Notebook Title Description How to Create a Custom XGBoost container? This notebook shows you how to build a custom XGBoost Container with Amazon SageMaker Batch Transform. Regression with XGBoost using Parquet This notebook shows you how to use the Abalone dataset in Parquet to train a XGBoost model. How to Train and Host a Multiclass Classification Model? This notebook shows how to use the MNIST dataset to train and host a multiclass classification model. How to train a Model for Customer Churn Prediction? This notebook shows you how to train a model to Predict Mobile Customer Departure in an effort to identify unhappy customers. An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training This notebook shows you how to use Spot Instances for training with a XGBoost Container. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running. For instructions on how to create and access Jupyter\"\n",
190 | " },\n",
191 | " {\n",
192 | " \"sourceLocation\": {\n",
193 | " \"s3SourceLocation\": {\n",
194 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_algorithms_tabular_xgboost.html\"\n",
195 | " }\n",
196 | " },\n",
197 | " \"referenceText\": \"an expanded set of metrics than the original versions. It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5. The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker XGBoost algorithm. Notebook Title Description How to Create a Custom XGBoost container? This notebook shows you how to build a custom XGBoost Container with Amazon SageMaker Batch Transform. Regression with XGBoost using Parquet This notebook shows you how to use the Abalone dataset in Parquet to train a XGBoost model. How to Train and Host a Multiclass Classification Model? This notebook shows how to use the MNIST dataset to train and host a multiclass classification model. How to train a Model for Customer Churn Prediction? This notebook shows you how to train a model to Predict Mobile Customer Departure in an effort to identify unhappy customers. An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training This notebook shows you how to use Spot Instances for training with a XGBoost Container. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running. For instructions on how to create and access Jupyter\"\n",
198 | " },\n",
199 | " {\n",
200 | " \"sourceLocation\": {\n",
201 | " \"s3SourceLocation\": {\n",
202 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"\n",
203 | " }\n",
204 | " },\n",
205 | " \"referenceText\": \"the permissions necessary to run an Amazon SageMaker training job, the type and number of instances to use for the training job, and a dictionary of the hyperparameters to pass to the training script. from sagemaker.xgboost.estimator import XGBoost xgb_estimator = XGBoost( entry_point=\\\"abalone.py\\\", hyperparameters=hyperparameters, role=role, instance_count=1, instance_type=\\\"ml.m5.2xlarge\\\", framework_version=\\\"1.0-1\\\", ) Call the fit Method\\u00b6 After you create an estimator, call the fit method to run the training job. xgb_script_mode_estimator.fit({\\\"train\\\": train_input}) Deploy Open Source XGBoost Models\\u00b6 After you fit an XGBoost Estimator, you can host the newly created model in SageMaker. After you call fit, you can call deploy on an XGBoost estimator to create a SageMaker endpoint. The endpoint runs a SageMaker-provided XGBoost model server and hosts the model produced by your training script, which was run when you called fit. This was the model you saved to model_dir. deploy returns a Predictor object, which you can use to do inference on the Endpoint hosting your XGBoost model. Each Predictor provides a predict method which can do inference with numpy arrays, Python lists, or strings. After inference arrays or lists are serialized and sent to the XGBoost model server, predict returns the result of inference against your model. serializer = StringSerializer(content_type=\\\"text/libsvm\\\") predictor = estimator.deploy(\"\n",
206 | " }\n",
207 | " ]\n",
208 | " }\n",
209 | " }\n",
210 | " }\n",
211 | " }\n",
212 | "}\n",
213 | "[2023-11-02 15:02:26,296] p43792 {:11} INFO - {\n",
214 | " \"agentId\": \"J0TEWQNZ89\",\n",
215 | " \"agentAliasId\": \"TSTALIASID\",\n",
216 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n",
217 | " \"trace\": {\n",
218 | " \"rationaleTrace\": {\n",
219 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-1\",\n",
220 | " \"text\": \"Based on the previous \\\"Observation\\\" I am able to provide \\\"Final Answer\\\"\"\n",
221 | " }\n",
222 | " }\n",
223 | "}\n",
224 | "[2023-11-02 15:02:37,746] p43792 {:11} INFO - {\n",
225 | " \"agentId\": \"J0TEWQNZ89\",\n",
226 | " \"agentAliasId\": \"TSTALIASID\",\n",
227 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n",
228 | " \"trace\": {\n",
229 | " \"observationTrace\": {\n",
230 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-1\",\n",
231 | " \"invocationType\": \"FINISH\",\n",
232 | " \"finalResponse\": {\n",
233 | " \"text\": \"The XGBoost versions supported in Amazon SageMaker are 1.0, 1.2, 1.3 and 1.5. The latest version 1.5 is recommended.\"\n",
234 | " }\n",
235 | " }\n",
236 | " }\n",
237 | "}\n",
238 | "[2023-11-02 15:02:37,747] p43792 {:7} INFO - Final answer ->\n",
239 | "Amazon SageMaker supports XGBoost versions 1.0, 1.2, 1.3, and 1.5. The latest supported version is recommended because that is where most development efforts are focused. Amazon SageMaker supports XGBoost versions 1.0, 1.2, 1.3, and 1.5. The latest supported version is recommended because that is where most development efforts are focused.\n"
240 | ]
241 | },
242 | {
243 | "name": "stdout",
244 | "output_type": "stream",
245 | "text": [
246 | "CPU times: total: 15.6 ms\n",
247 | "Wall time: 41.9 s\n"
248 | ]
249 | }
250 | ],
251 | "source": [
252 | "%%time\n",
253 | "import json\n",
254 | "event_stream = response['completion']\n",
255 | "try:\n",
256 | " for event in event_stream: \n",
257 | " if 'chunk' in event:\n",
258 | " data = event['chunk']['bytes']\n",
259 | " logger.info(f\"Final answer ->\\n{data.decode('utf8')}\") \n",
260 | " end_event_received = True\n",
261 | " # End event indicates that the request finished successfully\n",
262 | " elif 'trace' in event:\n",
263 | " logger.info(json.dumps(event['trace'], indent=2))\n",
264 | " else:\n",
265 | " raise Exception(\"unexpected event.\", event)\n",
266 | "except Exception as e:\n",
267 | " raise Exception(\"unexpected event.\", e)\n"
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": null,
273 | "metadata": {},
274 | "outputs": [],
275 | "source": []
276 | }
277 | ],
278 | "metadata": {
279 | "kernelspec": {
280 | "display_name": "python-sdk-test-runtime-trace",
281 | "language": "python",
282 | "name": "python3"
283 | },
284 | "language_info": {
285 | "codemirror_mode": {
286 | "name": "ipython",
287 | "version": 3
288 | },
289 | "file_extension": ".py",
290 | "mimetype": "text/x-python",
291 | "name": "python",
292 | "nbconvert_exporter": "python",
293 | "pygments_lexer": "ipython3",
294 | "version": "3.11.5"
295 | }
296 | },
297 | "nbformat": 4,
298 | "nbformat_minor": 2
299 | }
300 |
--------------------------------------------------------------------------------
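
A quick consolidation of the notebook above: the whole invoke-and-stream flow fits in one helper function. This is a minimal sketch, not part of the repository; `MY_AGENT_ID` is a placeholder for the agent ID shown in your own Bedrock Agents console, while `TSTALIASID` is the fixed alias of a draft agent, as noted in the notebook.

```python
# Minimal sketch (assumes boto3/botocore versions with bedrock-agent-runtime
# support, as pinned in the notebook above). MY_AGENT_ID is a placeholder.
import json
import uuid

import boto3


def ask_agent(question: str, agent_id: str, agent_alias_id: str = "TSTALIASID") -> str:
    """Invoke a Bedrock agent and assemble its final answer from the event stream."""
    client = boto3.client("bedrock-agent-runtime")
    response = client.invoke_agent(
        inputText=question,
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid1()),  # fresh session per call
        enableTrace=True,
    )
    answer_chunks = []
    for event in response["completion"]:
        if "chunk" in event:
            # chunk events carry pieces of the final answer as raw bytes
            answer_chunks.append(event["chunk"]["bytes"].decode("utf8"))
        elif "trace" in event:
            # trace events show the agent's rationale and knowledge base lookups
            print(json.dumps(event["trace"], indent=2))
    return "".join(answer_chunks)


# Usage (hypothetical agent ID):
# print(ask_agent("What are the XGBoost versions supported in Amazon SageMaker?", "MY_AGENT_ID"))
```
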
/rag_w_bedrock_and_aoss.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "c2dc3fcb-ae4f-48e6-9b1c-71b002e0fe1b",
6 | "metadata": {
7 | "tags": []
8 | },
9 | "source": [
10 | "# RAG with Amazon Bedrock Knowledge Base\n",
11 | "\n",
12 | "In this notebook we use the information ingested in the Bedrock knowledge base to answer user queries."
13 | ]
14 | },
15 | {
16 | "cell_type": "markdown",
17 | "id": "a59d4975",
18 | "metadata": {},
19 | "source": [
20 | "## Import packages and utility functions\n",
21 | "Import packages, setup utility functions, interface with Amazon OpenSearch Service Serverless (AOSS)."
22 | ]
23 | },
24 | {
25 | "cell_type": "code",
26 | "execution_count": 1,
27 | "id": "85ce61b6-795b-488c-b400-1ac80d355162",
28 | "metadata": {
29 | "tags": []
30 | },
31 | "outputs": [],
32 | "source": [
33 | "import os\n",
34 | "import sys\n",
35 | "import json\n",
36 | "import boto3\n",
37 | "from typing import Dict\n",
38 | "from urllib.request import urlretrieve\n",
39 | "from langchain.llms.bedrock import Bedrock\n",
40 | "from IPython.display import Markdown, display\n",
41 | "from langchain.embeddings import BedrockEmbeddings\n",
42 | "from opensearchpy import OpenSearch, RequestsHttpConnection\n",
43 | "from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth\n"
44 | ]
45 | },
46 | {
47 | "cell_type": "code",
48 | "execution_count": 2,
49 | "id": "79eb7df4",
50 | "metadata": {},
51 | "outputs": [
52 | {
53 | "name": "stdout",
54 | "output_type": "stream",
55 | "text": [
56 | "Requirement already satisfied: opensearch-py in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (2.3.2)\n",
57 | "Requirement already satisfied: urllib3>=1.26.9 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (1.26.17)\n",
58 | "Requirement already satisfied: requests<3.0.0,>=2.4.0 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (2.31.0)\n",
59 | "Requirement already satisfied: six in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (1.16.0)\n",
60 | "Requirement already satisfied: python-dateutil in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (2.8.2)\n",
61 | "Requirement already satisfied: certifi>=2022.12.07 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (2023.7.22)\n",
62 | "Requirement already satisfied: charset-normalizer<4,>=2 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from requests<3.0.0,>=2.4.0->opensearch-py) (3.3.0)\n",
63 | "Requirement already satisfied: idna<4,>=2.5 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from requests<3.0.0,>=2.4.0->opensearch-py) (3.4)\n",
64 | "Note: you may need to restart the kernel to use updated packages.\n"
65 | ]
66 | }
67 | ],
68 | "source": [
69 | "%pip install opensearch-py\n"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": 3,
75 | "id": "4c1ea784-37bc-4a3f-84e3-1047f7e5cfd9",
76 | "metadata": {
77 | "tags": []
78 | },
79 | "outputs": [],
80 | "source": [
81 | "# global constants\n",
82 | "SERVICE = 'aoss'\n",
83 | "\n",
84 | "# do not change the name of the CFN stack, we assume that the \n",
85 | "# blog post creates a stack by this name and read output values\n",
86 | "# from the stack.\n",
87 | "CFN_STACK_NAME = \"rag-w-bedrock-kb\"\n"
88 | ]
89 | },
90 | {
91 | "cell_type": "code",
92 | "execution_count": 4,
93 | "id": "59d559b7",
94 | "metadata": {
95 | "tags": []
96 | },
97 | "outputs": [],
98 | "source": [
99 | "# Anthropic models need the Human/Assistant terminology used in the prompts, \n",
100 | "# they work better with XML style tags.\n",
101 | "PROMPT_TEMPLATE = \"\"\"Human: Answer the question based only on the information provided in few sentences.\n",
102 | "\n",
103 | "{}\n",
104 | "\n",
105 | "Include your answer in the tags. Do not include any preamble in your answer.\n",
106 | "\n",
107 | "{}\n",
108 | "\n",
109 | "Assistant:\"\"\"\n"
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": 5,
115 | "id": "61c6f5cc-2384-4f18-8add-418b258e8ab5",
116 | "metadata": {
117 | "tags": []
118 | },
119 | "outputs": [],
120 | "source": [
121 | "# utility functions\n",
122 | "\n",
123 | "def get_cfn_outputs(stackname: str) -> str:\n",
124 | " cfn = boto3.client('cloudformation')\n",
125 | " outputs = {}\n",
126 | " for output in cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']:\n",
127 | " outputs[output['OutputKey']] = output['OutputValue']\n",
128 | " return outputs\n",
129 | "\n",
130 | "def printmd(string: str):\n",
131 | " display(Markdown(string))\n"
132 | ]
133 | },
134 | {
135 | "cell_type": "code",
136 | "execution_count": 6,
137 | "id": "326c8d7f",
138 | "metadata": {
139 | "tags": []
140 | },
141 | "outputs": [],
142 | "source": [
143 | "# Functions to talk to OpenSearch\n",
144 | "\n",
145 | "# Define queries for OpenSearch\n",
146 | "def query_docs(query: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, index: str, k: int = 3) -> Dict:\n",
147 | " \"\"\"\n",
148 | " Convert the query into embedding and then find similar documents from AOSS\n",
149 | " \"\"\"\n",
150 | "\n",
151 | " # embedding\n",
152 | " query_embedding = embeddings.embed_query(query)\n",
153 | "\n",
154 | " # query to lookup OpenSearch kNN vector. Can add any metadata fields based filtering\n",
155 | " # here as part of this query.\n",
156 | " query_qna = {\n",
157 | " \"size\": k,\n",
158 | " \"query\": {\n",
159 | " \"knn\": {\n",
160 | " \"vector\": {\n",
161 | " \"vector\": query_embedding,\n",
162 | " \"k\": k\n",
163 | " }\n",
164 | " }\n",
165 | " }\n",
166 | " }\n",
167 | "\n",
168 | " # OpenSearch API call\n",
169 | " relevant_documents = aoss_client.search(\n",
170 | " body = query_qna,\n",
171 | " index = index\n",
172 | " )\n",
173 | " return relevant_documents\n"
174 | ]
175 | },
176 | {
177 | "cell_type": "code",
178 | "execution_count": 7,
179 | "id": "1d011b20",
180 | "metadata": {
181 | "tags": []
182 | },
183 | "outputs": [],
184 | "source": [
185 | "def create_context_for_query(q: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, vector_index: str) -> str:\n",
186 | " \"\"\"\n",
187 | " Create a context out of the similar docs retrieved from the vector database\n",
188 | " by concatenating the text from the similar documents.\n",
189 | " \"\"\"\n",
190 | " print(f\"query -> {q}\")\n",
191 | " aoss_response = query_docs(q, embeddings, aoss_client, vector_index)\n",
192 | " context = \"\"\n",
193 | " for r in aoss_response['hits']['hits']:\n",
194 | " s = r['_source']\n",
195 | " print(f\"{s['metadata']}\\n{s['text']}\")\n",
196 | " context += f\"{s['text']}\\n\"\n",
197 | " print(\"----------------\")\n",
198 | " return context\n"
199 | ]
200 | },
201 | {
202 | "cell_type": "markdown",
203 | "id": "6adf61b1",
204 | "metadata": {},
205 | "source": [
206 | "## Retrieve parameters needed from the AWS CloudFormation stack"
207 | ]
208 | },
209 | {
210 | "cell_type": "code",
211 | "execution_count": 8,
212 | "id": "10051806",
213 | "metadata": {
214 | "tags": []
215 | },
216 | "outputs": [
217 | {
218 | "name": "stdout",
219 | "output_type": "stream",
220 | "text": [
221 | "aoss_collection_arn=arn:aws:aoss:us-east-1:015469603702:collection/sip67bzp3hoel0x7crh3\n",
222 | "aoss_host=sip67bzp3hoel0x7crh3.us-east-1.aoss.amazonaws.com\n",
223 | "aoss_vector_index=sagemaker-readthedocs-io\n",
224 | "aws_region=us-east-1\n"
225 | ]
226 | }
227 | ],
228 | "source": [
229 | "\n",
230 | "outputs = get_cfn_outputs(CFN_STACK_NAME)\n",
231 | "\n",
232 | "region = outputs[\"Region\"]\n",
233 | "aoss_collection_arn = outputs['CollectionARN']\n",
234 | "aoss_host = f\"{os.path.basename(aoss_collection_arn)}.{region}.aoss.amazonaws.com\"\n",
235 | "aoss_vector_index = outputs['AOSSVectorIndexName']\n",
236 | "print(f\"aoss_collection_arn={aoss_collection_arn}\\naoss_host={aoss_host}\\naoss_vector_index={aoss_vector_index}\\naws_region={region}\")\n"
237 | ]
238 | },
239 | {
240 | "cell_type": "markdown",
241 | "id": "1b4a5e9e",
242 | "metadata": {},
243 | "source": [
244 | "## Setup Embeddings and Text Generation model\n",
245 | "\n",
246 | "We can use LangChain to setup the embeddings and text generation models provided via Amazon Bedrock."
247 | ]
248 | },
249 | {
250 | "cell_type": "code",
251 | "execution_count": 9,
252 | "id": "cf6613d2-aae8-48e5-adfb-0ea7fb75f2dd",
253 | "metadata": {
254 | "tags": []
255 | },
256 | "outputs": [],
257 | "source": [
258 | "# create a boto3 bedrock client\n",
259 | "bedrock_client = boto3.client('bedrock-runtime')\n",
260 | "\n",
261 | "# we will use Anthropic Claude for text generation\n",
262 | "claude_llm = Bedrock(model_id= \"anthropic.claude-v2\", client=bedrock_client)\n",
263 | "claude_llm.model_kwargs = dict(temperature=0.5, max_tokens_to_sample=300, top_k=250, top_p=1, stop_sequences=[])\n",
264 | "\n",
265 | "# we will be using the Titan Embeddings Model to generate our Embeddings.\n",
266 | "embeddings = BedrockEmbeddings(model_id=\"amazon.titan-embed-g1-text-02\", client=bedrock_client)\n"
267 | ]
268 | },
269 | {
270 | "cell_type": "markdown",
271 | "id": "64f0166a",
272 | "metadata": {},
273 | "source": [
274 | "## Interface with Amazon OpenSearch Service Serverless\n",
275 | "We use the open-source [opensearch-py](https://pypi.org/project/opensearch-py/) package to talk to AOSS."
276 | ]
277 | },
278 | {
279 | "cell_type": "code",
280 | "execution_count": 10,
281 | "id": "5d36f340-81ea-4617-b37d-57bf7669c9ac",
282 | "metadata": {
283 | "tags": []
284 | },
285 | "outputs": [],
286 | "source": [
287 | "credentials = boto3.Session().get_credentials()\n",
288 | "auth = AWSV4SignerAuth(credentials, region, SERVICE)\n",
289 | "\n",
290 | "client = OpenSearch(\n",
291 | " hosts = [{'host': aoss_host, 'port': 443}],\n",
292 | " http_auth = auth,\n",
293 | " use_ssl = True,\n",
294 | " verify_certs = True,\n",
295 | " connection_class = RequestsHttpConnection,\n",
296 | " pool_maxsize = 20\n",
297 | ")\n"
298 | ]
299 | },
300 | {
301 | "cell_type": "markdown",
302 | "id": "3e383e23",
303 | "metadata": {},
304 | "source": [
305 | "## Use Retrieval Augumented Generation (RAG) for answering queries\n",
306 | "\n",
307 | "Now that we have setup the LLMs through Bedrock and vector database through AOSS, we are ready to answer queries using RAG. The workflow is as follows:\n",
308 | "\n",
309 | "1. Convert the user query into embeddings.\n",
310 | "\n",
311 | "1. Use the embeddings to find similar documents from the vector database.\n",
312 | "\n",
313 | "1. Create a prompt using the user query and similar documents (retrieved from the vector db) to create a prompt.\n",
314 | "\n",
315 | "1. Provide the prompt to the LLM to create an answer to the user query."
316 | ]
317 | },
318 | {
319 | "cell_type": "markdown",
320 | "id": "0224f2c4-b725-4f3a-84ac-914c4eba8a94",
321 | "metadata": {},
322 | "source": [
323 | "## Query 1\n",
324 | "\n",
325 | "Let us first ask the our question to the model without providing any context, see the result and then ask the same question with context provided using document retrieved from AOSS and see if the answer improves!"
326 | ]
327 | },
328 | {
329 | "cell_type": "code",
330 | "execution_count": 11,
331 | "id": "d4be3215-3dde-4abd-8c38-45871e63d058",
332 | "metadata": {
333 | "tags": []
334 | },
335 | "outputs": [
336 | {
337 | "data": {
338 | "text/markdown": [
339 | "question=What versions of XGBoost are supported by Amazon SageMaker?
answer=\n",
340 | "Amazon SageMaker supports XGBoost versions 0.90-1, 0.90-2, 1.0-1, 1.2-1, 1.3-1, and 1.5-1.\n",
341 | "\n"
342 | ],
343 | "text/plain": [
344 | ""
345 | ]
346 | },
347 | "metadata": {},
348 | "output_type": "display_data"
349 | }
350 | ],
351 | "source": [
352 | "# 1. Start with the query\n",
353 | "q = \"What versions of XGBoost are supported by Amazon SageMaker?\"\n",
354 | "\n",
355 | "# 2. Now create a prompt by combining the query and the context (which is empty at this time)\n",
356 | "context = \"\"\n",
357 | "prompt = PROMPT_TEMPLATE.format(context, q)\n",
358 | "\n",
359 | "# 3. Provide the prompt to the LLM to generate an answer to the query without any additional context provided\n",
360 | "response = claude_llm(prompt)\n",
361 | "printmd(f\"question={q.strip()}
answer={response.strip()}\\n\")\n"
362 | ]
363 | },
364 | {
365 | "cell_type": "markdown",
366 | "id": "a1f429bb-050d-4c81-b532-aa5b8e531990",
367 | "metadata": {},
368 | "source": [
369 | "**The answer provided above is incorrect**, as can be seen from the [SageMaker XGBoost Algorithm page](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html). The supported version numbers are \"1.0, 1.2, 1.3, 1.5, and 1.7\".\n",
370 | "\n",
371 | "Now, let us see if we can improve upon this answer by using additional information that is available to use in the vector database. **Also notice in the response below that the source of the documents that are being used as context is also being called out (the name of the file in the S3 bucket), this helps create confidence in the response generated by the LLM**."
372 | ]
373 | },
374 | {
375 | "cell_type": "code",
376 | "execution_count": 12,
377 | "id": "371f86e8-157f-41b0-88a4-59a56f5507c9",
378 | "metadata": {
379 | "tags": []
380 | },
381 | "outputs": [
382 | {
383 | "name": "stdout",
384 | "output_type": "stream",
385 | "text": [
386 | "query -> What versions of XGBoost are supported by Amazon SageMaker?\n",
387 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"}\n",
388 | "see Extending our PyTorch containers. Use XGBoost as a Built-in Algortihm¶ Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don’t have to write a training script. If you don’t need the features and flexibility of open source XGBoost, consider using the built-in version. For information about using the Amazon SageMaker XGBoost built-in algorithm, see XGBoost Algorithm in the Amazon SageMaker Developer Guide. Use the Open Source XGBoost Algorithm¶ If you want the flexibility and additional features that it provides, use the SageMaker open source XGBoost algorithm. For which XGBoost versions are supported, see the AWS documentation. We recommend that you use the latest supported version because that’s where we focus most of our development efforts. For a complete example of using the open source XGBoost algorithm, see the sample notebook at https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb. For more information about XGBoost, see the XGBoost documentation. Train a Model with Open Source XGBoost¶ To train a model by using the Amazon SageMaker open source XGBoost algorithm: Prepare a training script Create a sagemaker.xgboost.XGBoost estimator Call the estimator’s fit method Prepare a Training Script¶ A typical training script loads data from the input channels, configures training with hyperparameters, trains a model,\n",
389 | "----------------\n",
390 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"}\n",
391 | "Models with Multi-Model Endpoints SageMaker XGBoost Classes SageMaker XGBoost Docker Containers eXtreme Gradient Boosting (XGBoost) is a popular and efficient machine learning algorithm used for regression and classification tasks on tabular datasets. It implements a technique known as gradient boosting on trees, which performs remarkably well in machine learning competitions. Amazon SageMaker supports two ways to use the XGBoost algorithm: XGBoost built-in algorithm XGBoost open source algorithm The XGBoost open source algorithm provides the following benefits over the built-in algorithm: Latest version - The open source XGBoost algorithm typically supports a more recent version of XGBoost. To see the XGBoost version that is currently supported, see XGBoost SageMaker Estimators and Models. Flexibility - Take advantage of the full range of XGBoost functionality, such as cross-validation support. You can add custom pre- and post-processing logic and run additional code after training. Scalability - The XGBoost open source algorithm has a more efficient implementation of distributed training, which enables it to scale out to more instances and reduce out-of-memory errors. Extensibility - Because the open source XGBoost container is open source, you can extend the container to install additional libraries and change the version of XGBoost that the container uses. For an example notebook that shows how to extend SageMaker containers, see Extending our PyTorch containers. Use XGBoost as a Built-in Algortihm¶ Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don’t have to write\n",
392 | "----------------\n",
393 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_algorithms_tabular_xgboost.html\"}\n",
394 | "an expanded set of metrics than the original versions. It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5. The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker XGBoost algorithm. Notebook Title Description How to Create a Custom XGBoost container? This notebook shows you how to build a custom XGBoost Container with Amazon SageMaker Batch Transform. Regression with XGBoost using Parquet This notebook shows you how to use the Abalone dataset in Parquet to train a XGBoost model. How to Train and Host a Multiclass Classification Model? This notebook shows how to use the MNIST dataset to train and host a multiclass classification model. How to train a Model for Customer Churn Prediction? This notebook shows you how to train a model to Predict Mobile Customer Departure in an effort to identify unhappy customers. An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training This notebook shows you how to use Spot Instances for training with a XGBoost Container. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running. For instructions on how to create and access Jupyter\n",
395 | "----------------\n"
396 | ]
397 | },
398 | {
399 | "data": {
400 | "text/markdown": [
401 | "question=What versions of XGBoost are supported by Amazon SageMaker?
answer=\n",
402 | "The XGBoost open source algorithm in Amazon SageMaker supports the latest version of XGBoost. The built-in XGBoost algorithm is based on XGBoost versions 1.0, 1.2, 1.3, and 1.5.\n",
403 | "\n"
404 | ],
405 | "text/plain": [
406 | ""
407 | ]
408 | },
409 | "metadata": {},
410 | "output_type": "display_data"
411 | }
412 | ],
413 | "source": [
414 | "# 1. Start with the query\n",
415 | "q = \"What versions of XGBoost are supported by Amazon SageMaker?\"\n",
416 | "\n",
417 | "# 2. Create the context by finding similar documents from the knowledge base\n",
418 | "context = create_context_for_query(q, embeddings, client, aoss_vector_index)\n",
419 | "\n",
420 | "# 3. Now create a prompt by combining the query and the context\n",
421 | "prompt = PROMPT_TEMPLATE.format(context, q)\n",
422 | "\n",
423 | "# 4. Provide the prompt to the LLM to generate an answer to the query based on context provided\n",
424 | "response = claude_llm(prompt)\n",
425 | "\n",
426 | "printmd(f\"question={q.strip()}
answer={response.strip()}\\n\")\n"
427 | ]
428 | },
429 | {
430 | "cell_type": "markdown",
431 | "id": "0ec1bd68-f61d-4f15-b152-3f9f54305fa8",
432 | "metadata": {},
433 | "source": [
434 | "## Query 2\n",
435 | "\n",
436 | "For the subsequent queries we use RAG directly."
437 | ]
438 | },
439 | {
440 | "cell_type": "code",
441 | "execution_count": 13,
442 | "id": "2ffbe92d-5fcd-480d-a239-0c461f61f4a0",
443 | "metadata": {
444 | "tags": []
445 | },
446 | "outputs": [
447 | {
448 | "name": "stdout",
449 | "output_type": "stream",
450 | "text": [
451 | "query -> What are the different types of distributed training supported by SageMaker. Give a short summary of each.\n",
452 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_api_training_distributed.html\"}\n",
453 | "Archive Launch a Distributed Training Job Using the SageMaker Python SDK Release Notes SageMaker Distributed Data Parallel 1.8.0 Release Notes Release History The SageMaker Distributed Model Parallel Library¶ The SageMaker Distributed Model Parallel Library Overview Use the Library’s API to Adapt Training Scripts Version 1.11.0, 1.13.0, 1.14.0, 1.15.0 (Latest) Documentation Archive Run a Distributed Training Job Using the SageMaker Python SDK Configuration Parameters for distribution Ranking Basics without Tensor Parallelism Placement Strategy with Tensor Parallelism Prescaled Batch Release Notes SageMaker Distributed Model Parallel 1.15.0 Release Notes Release History Next Previous © Copyright 2023, Amazon Revision af4d7949. Built with Sphinx using a theme provided by Read the Docs. Read the Docs v: stable Versions stable v2.167.0 v2.166.0 v2.165.0 v2.164.0 v2.163.0 v2.162.0 v2.161.0 v2.160.0 v2.159.0 v2.158.0 v2.157.0 v2.156.0 v2.155.0 v2.154.0 v2.153.0 v2.152.0 v2.151.0 v2.150.0 v2.149.0 v2.148.0 v2.147.0 v2.146.1 v2.146.0 v2.145.0 v2.144.0 v2.143.0 v2.142.0 v2.141.0 v2.140.1 v2.140.0\n",
454 | "----------------\n",
455 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_api_training_distributed.html\"}\n",
456 | "sagemaker stable Filters: Example Dev Guide SDK Guide Using the SageMaker Python SDK Use Version 2.x of the SageMaker Python SDK APIs Feature Store APIs Training APIs Distributed Training APIs The SageMaker Distributed Data Parallel Library The SageMaker Distributed Data Parallel Library Overview Use the Library to Adapt Your Training Script Launch a Distributed Training Job Using the SageMaker Python SDK Release Notes The SageMaker Distributed Model Parallel Library The SageMaker Distributed Model Parallel Library Overview Use the Library’s API to Adapt Training Scripts Run a Distributed Training Job Using the SageMaker Python SDK Release Notes Inference APIs Governance APIs Utility APIs Frameworks Built-in Algorithms Workflows Amazon SageMaker Experiments Amazon SageMaker Debugger Amazon SageMaker Feature Store Amazon SageMaker Model Monitor Amazon SageMaker Processing Amazon SageMaker Model Building Pipeline sagemaker » APIs » Distributed Training APIs Edit on GitHub Distributed Training APIs¶ SageMaker distributed training libraries offer both data parallel and model parallel training strategies. They combine software and hardware technologies to improve inter-GPU and inter-node communications. They extend SageMaker’s training capabilities with built-in options that require only small code changes to your training scripts. The SageMaker Distributed Data Parallel Library¶ The SageMaker Distributed Data Parallel Library Overview Use the Library to Adapt Your Training Script For versions between 1.4.0 and 1.8.0 (Latest) Documentation Archive Launch a Distributed Training Job Using the SageMaker Python SDK Release Notes SageMaker Distributed Data Parallel 1.8.0 Release Notes Release History The SageMaker Distributed Model Parallel Library¶ The SageMaker Distributed Model Parallel Library Overview Use the Library’s API to Adapt Training Scripts Version 1.11.0, 1.13.0,\n",
457 | "----------------\n",
458 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n",
459 | "training by calling fit # Setting the wait to `False` would make the fit asynchronous estimator.fit(wait=False) # Get a list of S3 URIs S3Downloader.list(estimator.latest_job_debugger_artifacts_path()) Continuous analyses through rules¶ In addition to collecting the debugging data, Amazon SageMaker Debugger provides the capability for you to analyze it in a streaming fashion using “rules”. A SageMaker Debugger “rule” is a piece of code which encapsulates the logic for analyzing debugging data. SageMaker Debugger provides a set of built-in rules curated by data scientists and engineers at Amazon to identify common problems while training machine learning models. There is also support for using custom rule source codes for evaluation. In the following sections, you’ll learn how to use both the built-in and custom rules while training your model. Relationship between debugger hook and rules¶ Using SageMaker Debugger is, broadly, a two-pronged approach. On one hand you have the production of debugging data, which is done through the Debugger Hook, and on the other hand you have the consumption of this data, which can be with rules (for continuous analyses) or by using the SageMaker Debugger SDK (for interactive analyses). The production and consumption of data are defined independently. For example, you could configure the debugging hook to store only the collection “gradients” and then configure the rules to operate on some other collection, say, “weights”. While this is possible, it’s quite useless as it gives you no meaningful insight into the training process. This is because the rule will do nothing in this example scenario since it will wait\n",
460 | "----------------\n"
461 | ]
462 | },
463 | {
464 | "data": {
465 | "text/markdown": [
466 | "question=What are the different types of distributed training supported by SageMaker. Give a short summary of each.\nanswer=\n",
467 | "SageMaker supports two main types of distributed training:\n",
468 | "\n",
469 | "1. SageMaker Distributed Data Parallel: This allows scaling model training across multiple GPUs and nodes by splitting the training data. It reduces training time by parallelizing computation.\n",
470 | "\n",
471 | "2. SageMaker Distributed Model Parallel: This allows training very large models that don't fit on a single GPU. It splits the model itself across multiple GPUs and synchronizes gradients during training. It removes memory constraints for large models.\n",
472 | "\n"
473 | ],
474 | "text/plain": [
475 | ""
476 | ]
477 | },
478 | "metadata": {},
479 | "output_type": "display_data"
480 | }
481 | ],
482 | "source": [
483 | "# 1. Start with the query\n",
484 | "q = \"What are the different types of distributed training supported by SageMaker. Give a short summary of each.\"\n",
485 | "\n",
486 | "# 2. Create the context by finding similar documents from the knowledge base\n",
487 | "context = create_context_for_query(q, embeddings, client, aoss_vector_index)\n",
488 | "\n",
489 | "# 3. Now create a prompt by combining the query and the context\n",
490 | "prompt = PROMPT_TEMPLATE.format(context, q)\n",
491 | "\n",
492 | "# 4. Provide the prompt to the LLM to generate an answer to the query based on context provided\n",
493 | "response = claude_llm(prompt)\n",
494 | "printmd(f\"question={q.strip()}\\nanswer={response.strip()}\\n\")\n"
495 | ]
496 | },
497 | {
498 | "cell_type": "markdown",
499 | "id": "d8024b1f-3f99-406c-be1d-9368cd1440f4",
500 | "metadata": {},
501 | "source": [
502 | "## Query 3"
503 | ]
504 | },
505 | {
506 | "cell_type": "code",
507 | "execution_count": 14,
508 | "id": "5444ae8c-0377-46ad-8d4e-2d41f575c289",
509 | "metadata": {
510 | "tags": []
511 | },
512 | "outputs": [
513 | {
514 | "name": "stdout",
515 | "output_type": "stream",
516 | "text": [
517 | "query -> What advantages does SageMaker debugger provide?\n",
518 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n",
519 | "having the TensorBoard data emitted from the hook in addition to the tensors will incur a cost to the training and may slow it down. Interactive analysis using SageMaker Debugger SDK and visualizations¶ Amazon SageMaker Debugger SDK also allows you to do interactive analyses on the debugging data produced from a training job run and to render visualizations of it. After calling fit() on the estimator, you can use the SDK to load the saved data in a SageMaker Debugger trial and do an analysis on the data: from smdebug.trials import create_trial s3_output_path = estimator.latest_job_debugger_artifacts_path() trial = create_trial(s3_output_path) To learn more about the programming model for analysis using the SageMaker Debugger SDK, see SageMaker Debugger Analysis. For a tutorial on what you can do after creating the trial and how to visualize the results, see SageMaker Debugger - Visualizing Debugging Results. Default behavior and opting out¶ For TensorFlow, Keras, MXNet, PyTorch and XGBoost estimators, the DebuggerHookConfig is always initialized regardless of specification while initializing the estimator. This is done to minimize code changes needed to get useful debugging information. To disable the hook initialization, you can do so by specifying False for value of debugger_hook_config in your framework estimator’s initialization: estimator = TensorFlow( role=role, instance_count=1, instance_type=instance_type, debugger_hook_config=False ) Learn More¶ Further documentation¶ API documentation: https://sagemaker.readthedocs.io/en/stable/debugger.html AWS\n",
520 | "----------------\n",
521 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n",
522 | "debugging hook to store only the collection “gradients” and then configure the rules to operate on some other collection, say, “weights”. While this is possible, it’s quite useless as it gives you no meaningful insight into the training process. This is because the rule will do nothing in this example scenario since it will wait for the tensors in the collection “gradients” which are never be emitted. For more useful and efficient debugging, configure your debugging hook to produce and store the debugging data that you care about and employ rules that operate on that particular data. This way, you ensure that the Debugger is utilized to its maximum potential in detecting anomalies. In this sense, there is a loose binding between the hook and the rules. Normally, you’d achieve this binding for a training job by providing values for both debugger_hook_config and rules in your estimator. However, SageMaker Debugger simplifies this by allowing you to specify the collection configuration within the Rule object itself. This way, you don’t have to specify the debugger_hook_config in your estimator separately. Using built-in rules¶ SageMaker Debugger comes with a set of built-in rules which can be used to identify common problems in model training, for example vanishing gradients or exploding tensors. You can choose to evaluate one or more of these rules while training your model to obtain meaningful insight into the training process. To learn more about these built in rules, see SageMaker Debugger Built-in Rules. Pre-defined debugger hook configuration for built-in rules¶ As mentioned earlier, for efficient\n",
523 | "----------------\n",
524 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n",
525 | "Specifying configurations for collections Collection Name Collection Parameters Hook Parameters Begin model training Continuous analyses through rules Relationship between debugger hook and rules Using built-in rules Pre-defined debugger hook configuration for built-in rules Sample Usages Using custom rules Sample Usage Capture real-time TensorBoard data from the debugging hook Interactive analysis using SageMaker Debugger SDK and visualizations Default behavior and opting out Learn More Further documentation Notebook examples Background¶ Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that encompasses the entire machine learning workflow. You can label and prepare your data, choose an algorithm, train a model, and then tune and optimize it for deployment. You can deploy your models to production with Amazon SageMaker to make predictions at lower costs than was previously possible. SageMaker Debugger provides a way to hook into the training process and emit debug artifacts (a.k.a. “tensors”) that represent the training state at each point in the training lifecycle. Debugger then stores the data in real time and uses rules that encapsulate logic to analyze tensors and react to anomalies. Debugger provides built-in rules and allows you to write custom rules for analysis. Setup¶ To get started, you must satisfy the following prerequisites: Specify an AWS Region where you’ll train your model. Give Amazon SageMaker the access to your data in Amazon Simple Storage Service (Amazon S3) needed to train your model by creating an IAM role ARN. See the AWS IAM documentation for how to fine tune the permissions needed. Capture\n",
526 | "----------------\n"
527 | ]
528 | },
529 | {
530 | "data": {
531 | "text/markdown": [
532 | "question=What advantages does SageMaker debugger provide?\nanswer=\n",
533 | "SageMaker debugger provides the following advantages:\n",
534 | "\n",
535 | "- It allows you to hook into the training process and emit debug artifacts (tensors) that represent the training state at each point in the training lifecycle. \n",
536 | "\n",
537 | "- It stores the debug data in real time and uses rules to analyze tensors and react to anomalies.\n",
538 | "\n",
539 | "- It provides built-in rules and allows you to write custom rules for analysis.\n",
540 | "\n",
541 | "- It allows interactive analysis on the debugging data and visualization of results.\n",
542 | "\n",
543 | "- It minimizes code changes needed to get useful debugging information by automatically initializing the debugger hook for frameworks like TensorFlow, Keras, MXNet, PyTorch and XGBoost.\n",
544 | "\n",
545 | "\n"
546 | ],
547 | "text/plain": [
548 | ""
549 | ]
550 | },
551 | "metadata": {},
552 | "output_type": "display_data"
553 | }
554 | ],
555 | "source": [
556 | "# 1. Start with the query\n",
557 | "q = \"What advantages does SageMaker debugger provide?\"\n",
558 | "\n",
559 | "# 2. Create the context by finding similar documents from the knowledge base\n",
560 | "context = create_context_for_query(q, embeddings, client, aoss_vector_index)\n",
561 | "\n",
562 | "# 3. Now create a prompt by combining the query and the context\n",
563 | "prompt = PROMPT_TEMPLATE.format(context, q)\n",
564 | "\n",
565 | "# 4. Provide the prompt to the LLM to generate an answer to the query based on context provided\n",
566 | "response = claude_llm(prompt)\n",
567 | "\n",
568 | "printmd(f\"question={q.strip()}\\nanswer={response.strip()}\\n\")\n"
569 | ]
570 | },
571 | {
572 | "cell_type": "code",
573 | "execution_count": null,
574 | "id": "f2e7ac93-f5ed-4c0c-99bf-03fa1ab7cf7e",
575 | "metadata": {},
576 | "outputs": [],
577 | "source": []
578 | }
579 | ],
580 | "metadata": {
581 | "kernelspec": {
582 | "display_name": "Python 3.9.18 ('bedrock_py39')",
583 | "language": "python",
584 | "name": "python3"
585 | },
586 | "language_info": {
587 | "codemirror_mode": {
588 | "name": "ipython",
589 | "version": 3
590 | },
591 | "file_extension": ".py",
592 | "mimetype": "text/x-python",
593 | "name": "python",
594 | "nbconvert_exporter": "python",
595 | "pygments_lexer": "ipython3",
596 | "version": "3.9.18"
597 | },
598 | "vscode": {
599 | "interpreter": {
600 | "hash": "3ac4445fedcc02e0ec010c021cc980cd9c85bdedf3d57447a4cb4e8d37edc5f0"
601 | }
602 | }
603 | },
604 | "nbformat": 4,
605 | "nbformat_minor": 5
606 | }
607 |
--------------------------------------------------------------------------------
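The query cells above all follow the same retrieve-then-generate pattern: embed the question, pull similar chunks out of the OpenSearch Serverless vector index, stuff them into `PROMPT_TEMPLATE`, and hand the prompt to `claude_llm`. The helpers themselves (`create_context_for_query`, `claude_llm`, `printmd`) are defined earlier in the notebook; as a rough sketch of what the retrieval step amounts to — the field names `vector_field` and `text` and the top-k of 3 are illustrative assumptions, not necessarily the notebook's actual values:

```python
# Hypothetical reconstruction of the retrieval helper used in the query cells;
# the index field names and k value are assumptions.
def create_context_for_query(q: str, embeddings, client, index_name: str, k: int = 3) -> str:
    """Embed the query and pull the k nearest chunks from the AOSS vector index."""
    query_embedding = embeddings.embed_query(q)  # LangChain embeddings object
    search_body = {
        "size": k,
        "query": {"knn": {"vector_field": {"vector": query_embedding, "k": k}}},
    }
    resp = client.search(body=search_body, index=index_name)  # opensearch-py client
    # Join the retrieved chunks with the same separator the stdout above shows
    chunks = [hit["_source"]["text"] for hit in resp["hits"]["hits"]]
    return "\n----------------\n".join(chunks)
```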
/setup_bedrock_conda.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # constants
4 | BEDROCK_CONDA_ENV=bedrock_py39
5 | PY_VER=3.9
6 |
7 | # create conda env
8 | conda remove -n $BEDROCK_CONDA_ENV --all -y
9 | conda create --name $BEDROCK_CONDA_ENV -y python=$PY_VER ipykernel
10 | source activate $BEDROCK_CONDA_ENV
11 |
12 | # all set to pip install the bedrock packages, these are awscli, boto3 and botocore
13 | pip install --no-build-isolation --force-reinstall "boto3>=1.28.57" "awscli>=1.29.57" "botocore>=1.31.57"
14 | pip install langchain==0.0.304
15 | pip install opensearch-py==2.3.1
16 |
17 | echo "all done"
18 |
--------------------------------------------------------------------------------
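The pinned boto3/botocore/awscli versions are the first releases that ship the Bedrock APIs, which is why the script force-reinstalls them (and quotes the version specifiers so `>` is not treated as shell redirection). Once the environment is active, the install can be smoke-tested against the `bedrock-runtime` client; a minimal sketch, assuming AWS credentials, a Bedrock-enabled default region, and granted access to the Claude v2 model:

```python
# Quick smoke test for the bedrock_py39 environment; assumes credentials,
# a Bedrock-enabled default region, and Claude v2 model access.
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")
body = json.dumps({
    "prompt": "\n\nHuman: Say hello in one word.\n\nAssistant:",
    "max_tokens_to_sample": 50,
})
response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-v2",
    body=body,
    accept="application/json",
    contentType="application/json",
)
print(json.loads(response["body"].read())["completion"])
```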
/template.yml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: 2010-09-09
2 | Description: 'Amazon OpenSearch Serverless template to create an encryption policy, network policy, data access policy and collection, along with supporting IAM roles, an S3 bucket and a SageMaker notebook instance'
3 |
4 | Metadata:
5 | AWS::CloudFormation::Interface:
6 | ParameterGroups:
7 | - Label:
8 | default: Required Parameters
9 | Parameters:
10 | - IAMUserArn
11 | ParameterLabels:
12 | IAMUserArn:
13 | default: ARN of the IAM user (or assumed role) running this template
14 |
15 | Parameters:
16 | IAMUserArn:
17 | AllowedPattern: "^arn:aws:iam::\\d{12}:user/[\\w+=,.@-]+|arn:aws:sts::\\d{12}:assumed-role/[\\w+=,.@-]+/[\\w+=,.@-]+$"
18 | Description: The ARN of the IAM user (or assumed role) running this CloudFormation template.
19 | Type: String
20 | AOSSCollectionName:
21 | Default: sagemaker-kb
22 | Type: String
23 | Description: Name of the Amazon OpenSearch Service Serverless (AOSS) collection.
24 | MinLength: 1
25 | MaxLength: 21
26 | AllowedPattern: ^[a-z0-9](-*[a-z0-9])*
27 | ConstraintDescription: Must be lowercase letters, numbers, or hyphens, 1-21 characters long.
28 | AOSSIndexName:
29 | Default: sagemaker-readthedocs-io
30 | Type: String
31 | Description: Name of the vector index in the Amazon OpenSearch Service Serverless (AOSS) collection.
32 |
33 |
34 | Resources:
35 |
36 | CodeRepository:
37 | Type: AWS::SageMaker::CodeRepository
38 | Properties:
39 | GitConfig:
40 | RepositoryUrl: https://github.com/aws-samples/bedrock-kb-rag-workshop
41 |
42 | S3Bucket:
43 | Type: AWS::S3::Bucket
44 | Description: Creating Amazon S3 bucket to hold source data for knowledge base
45 | Properties:
46 | BucketName: !Join
47 | - '-'
48 | - - !Ref AOSSCollectionName
49 | - !Sub ${AWS::AccountId}
50 |
51 | cleanupBucketOnDelete:
52 | Type: Custom::cleanupbucket
53 | Properties:
54 | ServiceToken: !GetAtt 'DeleteS3Bucket.Arn'
55 | BucketName: !Ref S3Bucket
56 | DependsOn: S3Bucket
57 |
58 | DeleteS3Bucket:
59 | Type: AWS::Lambda::Function
60 | Properties:
61 | Handler: index.lambda_handler
62 | Description: "Delete all objects in S3 bucket"
63 | Timeout: 30
64 | Role: !GetAtt 'LambdaBasicExecutionRole.Arn'
65 | Runtime: python3.9
66 | Environment:
67 | Variables:
68 | BUCKET_NAME: !Ref S3Bucket
69 | Code:
70 | ZipFile: |
71 | import json, boto3, logging
72 | import cfnresponse
73 | logger = logging.getLogger()
74 | logger.setLevel(logging.INFO)
75 |
76 | def lambda_handler(event, context):
77 | logger.info("event: {}".format(event))
78 | try:
79 | bucket = event['ResourceProperties']['BucketName']
80 | logger.info("bucket: {}, event['RequestType']: {}".format(bucket,event['RequestType']))
81 | if event['RequestType'] == 'Delete':
82 | s3 = boto3.resource('s3')
83 | bucket = s3.Bucket(bucket)
84 | for obj in bucket.objects.filter():
85 | logger.info("delete obj: {}".format(obj))
86 | s3.Object(bucket.name, obj.key).delete()
87 |
88 | sendResponseCfn(event, context, cfnresponse.SUCCESS)
89 | except Exception as e:
90 | logger.info("Exception: {}".format(e))
91 | sendResponseCfn(event, context, cfnresponse.FAILED)
92 |
93 | def sendResponseCfn(event, context, responseStatus):
94 | responseData = {}
95 | responseData['Data'] = {}
96 | cfnresponse.send(event, context, responseStatus, responseData, "CustomResourcePhysicalID")
97 |
98 | CustomSGResource:
99 | Type: AWS::CloudFormation::CustomResource
100 | Properties:
101 | ServiceToken: !GetAtt 'CustomFunctionCopyContentsToS3Bucket.Arn'
102 |
103 |
104 | LambdaBasicExecutionRole:
105 | Type: AWS::IAM::Role
106 | Properties:
107 | AssumeRolePolicyDocument:
108 | Statement:
109 | - Effect: Allow
110 | Principal:
111 | Service: lambda.amazonaws.com
112 | Action: sts:AssumeRole
113 | Path: /
114 | Policies:
115 | - PolicyName: S3Access
116 | PolicyDocument:
117 | Version: '2012-10-17'
118 | Statement:
119 | - Effect: Allow
120 | Action:
121 | - logs:CreateLogGroup
122 | - logs:CreateLogStream
123 | - logs:PutLogEvents
124 | Resource: arn:aws:logs:*:*:*
125 | - Effect: Allow
126 | Action:
127 | - s3:*
128 | Resource: '*'
129 |
130 |
131 | CustomFunctionCopyContentsToS3Bucket:
132 | Type: AWS::Lambda::Function
133 | Properties:
134 | Handler: index.lambda_handler
135 | Description: "Copies files from the public blog artifacts bucket to the bucket in this account"
136 | Timeout: 30
137 | Role: !GetAtt 'LambdaBasicExecutionRole.Arn'
138 | Runtime: python3.9
139 | Environment:
140 | Variables:
141 | AOSS_COLLECTION_NAME: !Ref AOSSCollectionName
142 | Code:
143 | ZipFile: |
144 | import os
145 | import json
146 | import boto3
147 | import logging
148 | import cfnresponse
149 |
150 | logger = logging.getLogger()
151 | logger.setLevel(logging.INFO)
152 | DATA_BUCKET = "aws-blogs-artifacts-public"
153 | SRC_PREFIX = "artifacts/ML-15729"
154 | MANIFEST = os.path.join(SRC_PREFIX, "manifest.txt")
155 | # s3://aws-blogs-artifacts-public/artifacts/ML-15729/manifest.txt
156 | def lambda_handler(event, context):
157 | logger.info('got event {}'.format(event))
158 | if event['RequestType'] == 'Delete':
159 | logger.info(f"copy files function called at the time of stack deletion, skipping")
160 | response = dict(files_copied=0, error=None)
161 | cfnresponse.send(event, context, cfnresponse.SUCCESS, response)
162 | return
163 | try:
164 | s3 = boto3.client('s3')
165 | obj = s3.get_object(Bucket=DATA_BUCKET, Key=MANIFEST)
166 | manifest_data = obj['Body'].iter_lines()
167 | ctr = 0
168 | for f in manifest_data:
169 | fname = f.decode()
170 | key = os.path.join(SRC_PREFIX, fname)
171 | logger.info(f"going to read {key} from bucket={DATA_BUCKET}")
172 | copy_source = { 'Bucket': DATA_BUCKET, 'Key': key }
173 | account_id = boto3.client('sts').get_caller_identity().get('Account')
174 | bucket = boto3.resource('s3').Bucket(f"{os.environ.get('AOSS_COLLECTION_NAME')}-{account_id}")
175 | dst_key = fname
176 | logger.info(f"going to copy {copy_source} -> s3://{bucket.name}/{dst_key}")
177 | bucket.copy(copy_source, dst_key)
178 | ctr += 1
179 | response = dict(files_copied=ctr, error=None)
180 | cfnresponse.send(event, context, cfnresponse.SUCCESS, response)
181 | except Exception as e:
182 | logger.error(e)
183 | response = dict(files_copied=0, error=str(e))
184 | cfnresponse.send(event, context, cfnresponse.FAILED, response)
185 |
186 | return
187 |
188 | AmazonBedrockExecutionRoleForKnowledgeBase:
189 | Type: AWS::IAM::Role
190 | Properties:
191 | RoleName: !Join
192 | - '-'
193 | - - AmazonBedrockExecutionRoleForKnowledgeBase
194 | - !Ref AOSSCollectionName
195 | AssumeRolePolicyDocument:
196 | Statement:
197 | - Effect: Allow
198 | Principal:
199 | Service: bedrock.amazonaws.com
200 | Action: sts:AssumeRole
201 | Condition:
202 | StringEquals:
203 | "aws:SourceAccount": !Sub "${AWS::AccountId}"
204 | ArnLike:
205 | "AWS:SourceArn": !Sub "arn:aws:bedrock:${AWS::Region}:${AWS::AccountId}:knowledge-base/*"
206 | Path: /
207 | Policies:
208 | - PolicyName: S3ReadOnlyAccess
209 | PolicyDocument:
210 | Version: '2012-10-17'
211 | Statement:
212 | - Effect: Allow
213 | Action:
214 | - s3:Get*
215 | - s3:List*
216 | - s3:Describe*
217 | - s3-object-lambda:Get*
218 | - s3-object-lambda:List*
219 | Resource: '*'
220 | - PolicyName: AOSSAPIAccessAll
221 | PolicyDocument:
222 | Version: '2012-10-17'
223 | Statement:
224 | - Effect: Allow
225 | Action:
226 | - aoss:APIAccessAll
227 | Resource: !Sub arn:aws:aoss:${AWS::Region}:${AWS::AccountId}:collection/*
228 | - PolicyName: BedrockListAndInvokeModel
229 | PolicyDocument:
230 | Version: '2012-10-17'
231 | Statement:
232 | - Effect: Allow
233 | Action:
234 | - bedrock:ListCustomModels
235 | Resource: '*'
236 | - Effect: Allow
237 | Action:
238 | - bedrock:InvokeModel
239 | Resource: !Sub arn:aws:bedrock:${AWS::Region}::foundation-model/*
240 |
241 | AmazonBedrockExecutionRoleForAgentsQA:
242 | Type: AWS::IAM::Role
243 | Properties:
244 | RoleName: AmazonBedrockExecutionRoleForAgents_SageMakerQA
245 | AssumeRolePolicyDocument:
246 | Statement:
247 | - Effect: Allow
248 | Principal:
249 | Service: bedrock.amazonaws.com
250 | Action: sts:AssumeRole
251 | ManagedPolicyArns:
252 | - arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
253 | - arn:aws:iam::aws:policy/AmazonBedrockFullAccess
254 |
255 | NotebookInstance:
256 | Type: AWS::SageMaker::NotebookInstance
257 | Properties:
258 | NotebookInstanceName: !Sub ${AWS::StackName}-notebook
259 | InstanceType: ml.t3.xlarge
260 | RoleArn: !GetAtt NotebookRole.Arn
261 | DefaultCodeRepository: !GetAtt CodeRepository.CodeRepositoryName
262 |
263 | NotebookRole:
264 | Type: AWS::IAM::Role
265 | Properties:
266 | RoleName: !Join
267 | - '-'
268 | - - !Ref AOSSCollectionName
269 | - NoteBookRole
270 | Policies:
271 | - PolicyName: CustomNotebookAccess
272 | PolicyDocument:
273 | Version: 2012-10-17
274 | Statement:
275 | - Sid: BedrockFullAccess
276 | Effect: Allow
277 | Action:
278 | - "bedrock:*"
279 | Resource: "*"
280 | - PolicyName: AOSSAPIAccessAll
281 | PolicyDocument:
282 | Version: '2012-10-17'
283 | Statement:
284 | - Effect: Allow
285 | Action:
286 | - aoss:APIAccessAll
287 | Resource: !Sub arn:aws:aoss:${AWS::Region}:${AWS::AccountId}:collection/*
288 | ManagedPolicyArns:
289 | - arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
290 | - arn:aws:iam::aws:policy/AmazonS3FullAccess
291 | - arn:aws:iam::aws:policy/AWSCloudFormationReadOnlyAccess
292 | AssumeRolePolicyDocument:
293 | Version: 2012-10-17
294 | Statement:
295 | - Effect: Allow
296 | Principal:
297 | Service:
298 | - sagemaker.amazonaws.com
299 | Action:
300 | - 'sts:AssumeRole'
301 | - Effect: Allow
302 | Principal:
303 | Service:
304 | - bedrock.amazonaws.com
305 | Action:
306 | - 'sts:AssumeRole'
307 |
308 | DataAccessPolicy:
309 | Type: 'AWS::OpenSearchServerless::AccessPolicy'
310 | Properties:
311 | Name: !Join
312 | - '-'
313 | - - !Ref AOSSCollectionName
314 | - access-policy
315 | Type: data
316 | Description: Access policy for AOSS collection
317 | Policy: !Sub >-
318 | [{"Description":"Access for cfn user","Rules":[{"ResourceType":"index","Resource":["index/*/*"],"Permission":["aoss:*"]},
319 | {"ResourceType":"collection","Resource":["collection/${AOSSCollectionName}"],"Permission":["aoss:*"]}],
320 | "Principal":["${IAMUserArn}", "${AmazonBedrockExecutionRoleForKnowledgeBase.Arn}", "${NotebookRole.Arn}"]}]
321 | NetworkPolicy:
322 | Type: 'AWS::OpenSearchServerless::SecurityPolicy'
323 | Properties:
324 | Name: !Join
325 | - '-'
326 | - - !Ref AOSSCollectionName
327 | - network-policy
328 | Type: network
329 | Description: Network policy for AOSS collection
330 | Policy: !Sub >-
331 | [{"Rules":[{"ResourceType":"collection","Resource":["collection/${AOSSCollectionName}"]}, {"ResourceType":"dashboard","Resource":["collection/${AOSSCollectionName}"]}],"AllowFromPublic":true}]
332 | EncryptionPolicy:
333 | Type: 'AWS::OpenSearchServerless::SecurityPolicy'
334 | Properties:
335 | Name: !Join
336 | - '-'
337 | - - !Ref AOSSCollectionName
338 | - security-policy
339 | Type: encryption
340 | Description: Encryption policy for AOSS collection
341 | Policy: !Sub >-
342 | {"Rules":[{"ResourceType":"collection","Resource":["collection/${AOSSCollectionName}"]}],"AWSOwnedKey":true}
343 | Collection:
344 | Type: 'AWS::OpenSearchServerless::Collection'
345 | Properties:
346 | Name: !Ref AOSSCollectionName
347 | Type: VECTORSEARCH
348 | Description: Collection to hold vector search data
349 | DependsOn: EncryptionPolicy
350 |
351 | Outputs:
352 | S3Bucket:
353 | Value: !GetAtt S3Bucket.Arn
354 | DashboardURL:
355 | Value: !GetAtt Collection.DashboardEndpoint
356 | CollectionARN:
357 | Value: !GetAtt Collection.Arn
358 | FilesCopied:
359 | Description: Files copied
360 | Value: !GetAtt 'CustomSGResource.files_copied'
361 | FileCopyError:
362 | Description: File copy error
363 | Value: !GetAtt 'CustomSGResource.error'
364 | AOSSVectorIndexName:
365 | Description: Name of the vector index
366 | Value: !Ref AOSSIndexName
367 | Region:
368 | Description: Deployed Region
369 | Value: !Ref AWS::Region
370 |
--------------------------------------------------------------------------------
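The template provisions the collection and its encryption, network, and data access policies, but the vector index named by `AOSSIndexName` is created separately (the notebooks handle this). For reference, a minimal sketch of connecting to the deployed collection with the opensearch-py version pinned in `setup_bedrock_conda.sh` and declaring a k-NN index — the endpoint placeholder, field names, and the 1536 embedding dimension (Amazon Titan-style embeddings) are assumptions:

```python
# Hypothetical sketch: connect to the AOSS collection created by template.yml
# and create the vector index. Endpoint, field names, and dimension are assumptions.
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

region = "us-east-1"  # match the region the stack was deployed in
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region, "aoss")  # sign requests for OpenSearch Serverless

client = OpenSearch(
    hosts=[{"host": "<collection-id>.us-east-1.aoss.amazonaws.com", "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# k-NN index with a 1536-dimensional vector field (Titan embeddings) plus the raw text
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "vector_field": {"type": "knn_vector", "dimension": 1536},
            "text": {"type": "text"},
        }
    },
}
client.indices.create(index="sagemaker-readthedocs-io", body=index_body)
```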