├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── blog_post.docx ├── blog_post.html ├── blog_post.md ├── blog_post.qmd ├── img ├── ML-15729-Amit.png ├── ML-15729-agt-console.png ├── ML-15729-agt-iam.png ├── ML-15729-agt-select-model.png ├── ML-15729-agt-trace1.png ├── ML-15729-agt-trace2.png ├── ML-15729-agt1.png ├── ML-15729-agt2-s1-1.png ├── ML-15729-agt2-s1-2.png ├── ML-15729-agt2.png ├── ML-15729-agt3-s1.png ├── ML-15729-agt3.png ├── ML-15729-agt4.png ├── ML-15729-agt5-s1.png ├── ML-15729-agt5.png ├── ML-15729-agt6.png ├── ML-15729-agt7.png ├── ML-15729-aoss-cv.jpg ├── ML-15729-aoss.jpg ├── ML-15729-aoss1.jpg ├── ML-15729-aoss2.jpg ├── ML-15729-bedrock-agents-kb.png ├── ML-15729-blog_post.png ├── ML-15729-cf-outputs.jpg ├── ML-15729-cf1.jpg ├── ML-15729-cloudformation-launch-stack.png ├── ML-15729-kb1.jpg ├── ML-15729-kb10.jpg ├── ML-15729-kb11-w-context.png ├── ML-15729-kb11-wo-context.png ├── ML-15729-kb11.png ├── ML-15729-kb2.jpg ├── ML-15729-kb3.jpg ├── ML-15729-kb4.jpg ├── ML-15729-kb5.jpg ├── ML-15729-kb6.png ├── ML-15729-kb7.png ├── ML-15729-kb8.jpg ├── ML-15729-kb9.jpg ├── ML-15729-kb_02.png ├── ML-15729-os-vi-1.1.png ├── ML-15729-os-vi-1.png ├── ML-15729-os-vi-2.1.png ├── ML-15729-os-vi-2.png ├── ML-15729-rag_1.png ├── ML-15729-sm1.jpg ├── ML-15729-sm2.jpg ├── ML-15729-sm3.jpg └── bedrock-agents-kb.drawio ├── rag_w_bedrock_agent.ipynb ├── rag_w_bedrock_and_aoss.ipynb ├── setup_bedrock_conda.sh └── template.yml /.gitignore: -------------------------------------------------------------------------------- 1 | docs/ 2 | agent-sdk/ 3 | python-sdk-test-runtime-trace/ 4 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *main* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. 
Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute to. As our projects use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 
60 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT No Attribution 2 | 3 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so. 10 | 11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 13 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 14 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 15 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 16 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 17 | 18 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Retrieval Augmented Generation using Amazon Bedrock 2 | 3 | This repository provides sample code for implementing a question answering application using the Retrieval Augmented Generation (RAG) technique with Amazon Bedrock. A RAG implementation consists of two parts: 4 | 5 | 1. A data pipeline that ingests data from documents (typically stored in Amazon S3) into a knowledge base, i.e. a vector database such as Amazon OpenSearch Service Serverless (AOSS), so that it is available for lookup when a question is received. 6 | 7 | 1. 
An application that receives a question from the user, searches the knowledge base for relevant pieces of information (context), and then creates a prompt that includes the question and the context and provides it to an LLM to generate a response. 8 | 9 | The data pipeline represents undifferentiated heavy lifting and can be implemented using an Amazon Bedrock Knowledge Base. We can now connect an S3 bucket to a vector database such as AOSS and have a Bedrock Agent read the objects (HTML, PDF, text, etc.), chunk them, convert these chunks into embeddings using the Amazon Titan Embeddings model, and then store these embeddings in AOSS. All of this without having to build, deploy, and manage the data pipeline. 10 | 11 | Once the data is available in the Bedrock Knowledge Base, a question answering application can be built using the following architectural pattern. 12 | 13 | ![KB Agent](img/ML-15729-bedrock-agents-kb.png) 14 | 15 | ## Installation 16 | 17 | Follow the steps listed below to create and run the RAG solution. The [blog_post.md](./blog_post.md) describes this solution in detail. 18 | 19 | 1. Launch the AWS CloudFormation template included in this repository using one of the buttons from the table below. The CloudFormation template creates the following resources within your AWS account: an Amazon OpenSearch Service Serverless (AOSS) Collection, an Amazon S3 bucket, IAM roles for the Amazon Bedrock Knowledge Base Agent and Notebook, and an Amazon SageMaker Notebook with this repository cloned to run the next steps. 20 | 21 | 22 | |AWS Region | Link | 23 | |:------------------------:|:-----------:| 24 | |us-east-1 (N. 
Virginia) | [](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) | 25 | |us-west-2 (Oregon) | [](https://console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) | 26 | 27 | 1. Follow instructions in [Build a RAG based question answer solution using Amazon Bedrock Knowledge Base and Amazon OpenSearch Service Serverless](./blog_post.md) 28 | 29 | ## Security 30 | 31 | See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information. 32 | 33 | ## License 34 | 35 | This library is licensed under the MIT-0 License. See the [LICENSE](./LICENSE) file. 36 | 37 | 38 | -------------------------------------------------------------------------------- /blog_post.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/blog_post.docx -------------------------------------------------------------------------------- /blog_post.md: -------------------------------------------------------------------------------- 1 | Build a RAG based question answer solution using Amazon Bedrock 2 | Knowledge Base, vector engine for Amazon OpenSearch Service Serverless 3 | and LangChain 4 | ================ 5 | 6 | *Amit Arora* 7 | 8 | One of the most common applications of generative AI and Foundation 9 | Models (FMs) in an enterprise environment is answering questions based 10 | on the enterprise’s knowledge corpus. [Amazon 11 | Lex](https://aws.amazon.com/lex/) provides the framework for building 12 | [AI based 13 | chatbots](https://aws.amazon.com/solutions/retail/ai-for-chatbots). 
14 | Pre-trained foundation models (FMs) perform well at natural language 15 | understanding (NLU) tasks such as summarization, text generation and 16 | question answering on a broad variety of topics but either struggle to 17 | provide accurate answers (without hallucinations) or completely fail at 18 | answering questions about content that they haven’t seen as part of 19 | their training data. Furthermore, FMs are trained with a point-in-time 20 | snapshot of data and have no inherent ability to access fresh data at 21 | inference time; without this ability they might provide responses that 22 | are potentially incorrect or inadequate. 23 | 24 | A commonly used approach to address this problem is to use a technique 25 | called Retrieval Augmented Generation (RAG). In the RAG-based approach 26 | we convert the user question into vector embeddings using an FM and then 27 | do a similarity search for these embeddings in a pre-populated vector 28 | database holding the embeddings for the enterprise knowledge corpus. A 29 | small number of similar documents (typically three) is added as context 30 | along with the user question to the “prompt” provided to another FM and 31 | then that FM generates an answer to the user question using information 32 | provided as context in the prompt. RAG models were introduced by [Lewis 33 | et al.](https://arxiv.org/abs/2005.11401) in 2020 as a model where 34 | parametric memory is a pre-trained seq2seq model and the non-parametric 35 | memory is a dense vector index of Wikipedia, accessed with a pre-trained 36 | neural retriever. To understand the overall structure of a RAG-based 37 | approach, refer to [Build a powerful question answering bot with Amazon 38 | SageMaker, Amazon OpenSearch Service, Streamlit, and 39 | LangChain](https://aws.amazon.com/blogs/machine-learning/build-a-powerful-question-answering-bot-with-amazon-sagemaker-amazon-opensearch-service-streamlit-and-langchain/). 
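The retrieve-then-generate flow just described can be sketched in a few lines of Python; the function names (`embed`, `search`, `generate`) are illustrative placeholders, not APIs used later in this post:

``` python
def answer_with_rag(question, embed, search, generate, k=3):
    """Minimal RAG loop: embed the question, retrieve the k most similar
    passages from the vector database, and let an FM answer from that context."""
    query_embedding = embed(question)       # question -> vector embedding
    passages = search(query_embedding, k)   # similarity search, typically k=3
    context = "\n\n".join(passages)         # retrieved documents become context
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return generate(prompt)                 # a second FM generates the final answer
```

The rest of this post fills in each placeholder with a managed AWS service.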
40 | 41 | In this post we provide a step-by-step guide with all the building 42 | blocks for creating a *Low Code No Code* (LCNC) enterprise-ready RAG 43 | application such as a question answering solution. We use FMs available 44 | through Amazon Bedrock for the embeddings model (Amazon Titan Text 45 | Embeddings v2), the text generation model (Anthropic Claude v2), the 46 | Amazon Bedrock Knowledge Base and Amazon Bedrock Agents for this 47 | solution. The text corpus representing an enterprise knowledge base is 48 | stored as HTML files in Amazon S3 and is ingested in the form of text 49 | embeddings into an index in an Amazon OpenSearch Service Serverless 50 | collection using the Bedrock Knowledge Base agent in a fully-managed 51 | serverless fashion. 52 | 53 | We provide an AWS CloudFormation template to stand up all the resources 54 | required for building this solution. We then demonstrate how to use 55 | [LangChain](https://www.langchain.com) to interface with Bedrock and 56 | [opensearch-py](https://pypi.org/project/opensearch-py/) to interface 57 | with OpenSearch Service Serverless and build a RAG-based question answering 58 | workflow. 59 | 60 | ## Solution overview 61 | 62 | We use a subset of [SageMaker docs](https://sagemaker.readthedocs.io) as 63 | the knowledge corpus for this post. The data is available in the form of 64 | HTML files in an S3 bucket; a Bedrock Knowledge Base Agent then reads 65 | these files, converts them into smaller chunks, encodes these chunks 66 | into vectors (embeddings) and then ingests these embeddings into an 67 | OpenSearch Service Serverless collection index. We implement the RAG 68 | functionality in a notebook: a set of SageMaker-related questions is 69 | asked of the Claude model without providing any additional context and 70 | then the same questions are asked again but this time with context based 71 | on similar documents retrieved from OpenSearch Service Serverless 72 | i.e. using the RAG approach. 
We demonstrate that the responses generated 73 | without RAG could be factually inaccurate, whereas the RAG-based 74 | responses are accurate and more useful. 75 | 76 | All the code for this post is available in the [GitHub 77 | repo](https://github.com/aws-samples/bedrock-kb-rag/tree/main/blogs/rag). 78 | 79 | The following figure represents the high-level architecture of the 80 | proposed solution. 81 | 82 |
83 | Figure 1: Architecture 85 | 86 |
87 | 88 | Step-by-step explanation: 89 | 90 | 1. The user provides a question via the Jupyter notebook. 91 | 2. The question is converted into an embedding using Bedrock via the Titan 92 | embeddings v2 model. 93 | 3. The embedding is used to find similar documents from an OpenSearch 94 | Service Serverless index. 95 | 4. The similar documents along with the user question are used to create 96 | a “prompt”. 97 | 5. The prompt is provided to Bedrock to generate a response using the 98 | Claude v2 model. 99 | 6. The response along with the context is printed out in a notebook 100 | cell. 101 | 102 | As illustrated in the architecture diagram, we use the following AWS 103 | services: 104 | 105 | - [Bedrock](https://aws.amazon.com/bedrock/) for access to the FMs for 106 | embedding and text generation as well as for the knowledge base agent. 107 | - [OpenSearch Service Serverless with vector 108 | search](https://aws.amazon.com/opensearch-service/serverless-vector-engine/) 109 | for storing the embeddings of the enterprise knowledge corpus and 110 | doing similarity search with user questions. 111 | - [S3](https://aws.amazon.com/pm/serv-s3/) for storing the raw knowledge 112 | corpus data (HTML files). 113 | - [AWS Identity and Access Management](https://aws.amazon.com/iam/) 114 | roles and policies for access management. 115 | - [AWS CloudFormation](https://aws.amazon.com/cloudformation/) for 116 | creating the entire solution stack through infrastructure as code. 117 | 118 | In terms of open-source packages used in this solution, we use 119 | [LangChain](https://python.langchain.com/en/latest/index.html) for 120 | interfacing with Bedrock and 121 | [opensearch-py](https://pypi.org/project/opensearch-py/) to interface 122 | with OpenSearch Service Serverless. 123 | 124 | The workflow for instantiating the solution presented in this post in 125 | your own AWS account is as follows: 126 | 127 | 1. 
Run the CloudFormation template provided with this post in your 128 | account. This will create all the infrastructure resources 129 | needed for this solution: 130 | 131 | 1. OpenSearch Service Serverless collection 132 | 2. SageMaker Notebook 133 | 3. IAM roles 134 | 135 | 2. Create a vector index in the OpenSearch Service Serverless 136 | collection. This is done through the OpenSearch Service Serverless 137 | console. 138 | 139 | 3. Create a knowledge base in Bedrock and sync data from the S3 bucket 140 | to the OpenSearch Service Serverless collection index. This is done 141 | through the Bedrock console. 142 | 143 | 4. Create a Bedrock Agent, connect it to the knowledge base, and use 144 | the Agent console for question answering *without having to write 145 | any code*. 146 | 147 | 5. Run the 148 | [`rag_w_bedrock_and_aoss.ipynb`](./rag_w_bedrock_and_aoss.ipynb) 149 | notebook in the SageMaker notebook to ask questions based on the 150 | data ingested in the OpenSearch Service Serverless collection index. 151 | 152 | These steps are discussed in detail in the following sections. 153 | 154 | ### Prerequisites 155 | 156 | To implement the solution provided in this post, you should have an [AWS 157 | account](https://signin.aws.amazon.com/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&client_id=signup) 158 | and familiarity with FMs, OpenSearch Service and Bedrock. 159 | 160 | #### Use AWS CloudFormation to create the solution stack 161 | 162 | Choose **Launch Stack** for the Region you want to deploy resources to. 163 | All parameters needed by the CloudFormation template have default values 164 | already filled in, except for the ARN of the IAM role with which you are 165 | currently logged into your AWS account, which you must provide. Make 166 | a note of the OpenSearch Service collection ARN; we use this in 167 | subsequent steps. **This template takes about 10 minutes to complete**. 
168 | 169 | | AWS Region | Link | 170 | |:-----------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:| 171 | | us-east-1 (N. Virginia) | [](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) | 172 | | us-west-2 (Oregon) | [](https://console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) | 173 | 174 | After the stack is created successfully, navigate to the stack’s 175 | `Outputs` tab on the AWS CloudFormation console and note the values for 176 | `CollectionARN` and `AOSSVectorIndexName`. We use those in the 177 | subsequent steps. 178 | 179 |
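If you prefer not to copy these values from the console, the same outputs can be fetched with boto3. The sketch below is illustrative; `rag-w-bedrock-kb` is the default stack name from the Launch Stack links above, and credentials/region are assumed to come from your environment:

``` python
def get_stack_outputs(stack_name="rag-w-bedrock-kb", client=None):
    """Return a CloudFormation stack's outputs (e.g. CollectionARN,
    AOSSVectorIndexName) as a {OutputKey: OutputValue} dict."""
    if client is None:
        import boto3  # region and credentials are taken from the environment
        client = boto3.client("cloudformation")
    stack = client.describe_stacks(StackName=stack_name)["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}
```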
180 | Figure 2: CloudFormation Stack Outputs 182 | 184 |
185 | 186 | #### Create an OpenSearch Service Serverless vector index 187 | 188 | The CloudFormation stack creates an OpenSearch Service Serverless 189 | collection; the next step is to create a vector index. This is done 190 | through the OpenSearch Service Serverless console as described below. 191 | 192 | 1. Navigate to the OpenSearch Service console and click on `Collections`. 193 | The `sagemaker-kb` collection created by the CloudFormation stack 194 | will be listed there. 195 | 196 |
197 | Figure 3: SageMaker Knowledge Base Collection 199 | 201 |
202 | 203 | 2. Click on the `sagemaker-kb` link to create a vector index for 204 | storing the embeddings from the documents in S3. 205 | 206 |
207 | Figure 4: SageMaker Knowledge Base Vector Index 210 | 212 |
213 | 214 | 3. Set the vector index name to `sagemaker-readthedocs-io`, the vector 215 | field name to `vector`, the dimensions to `1536`, the engine type to `FAISS` and the distance metric to 216 | `Euclidean`. **It is required that you set these parameters exactly 217 | as mentioned here because the Bedrock Knowledge Base Agent is going 218 | to use these same values**. 219 | 220 |
221 | Figure 5: SageMaker Knowledge Base Vector Index Parameters 224 | 226 |
227 | 228 | 4. Once created, the vector index is listed as part of the collection. 229 | 230 |
231 | Figure 6: SageMaker Knowledge Base Vector Index Created 234 | 236 |
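For reference, the same index can also be created programmatically; the request body below mirrors the console settings (`vector` field, 1536 dimensions, FAISS engine, Euclidean i.e. `l2` distance). The opensearch-py call is left as a comment because it needs your collection endpoint and SigV4 authentication:

``` python
# knn_vector mapping matching the console settings described above
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "vector": {
                "type": "knn_vector",
                "dimension": 1536,  # Titan embeddings dimension
                "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
            },
            "text": {"type": "text"},      # raw chunk text
            "metadata": {"type": "text"},  # source document metadata
        }
    },
}

# With an opensearch-py client authenticated against the collection endpoint:
# client.indices.create(index="sagemaker-readthedocs-io", body=index_body)
```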
237 | 238 | #### Create a Bedrock knowledge base 239 | 240 | Once the OpenSearch Service Serverless collection and vector index have 241 | been created, it is time to set up the Bedrock knowledge base. 242 | 243 | 1. Navigate to the Bedrock Console, click on `Knowledge Base`, and then 244 | click on the `Create knowledge base` button. 245 | 246 |
247 | Figure 7: Bedrock Knowledge Base 249 | 251 |
252 | 253 | 2. Fill out the details for creating the knowledge base as shown in the 254 | screenshots below. 255 | 256 |
257 | Figure 8: Bedrock Knowledge Base 259 | 261 |
262 | 263 | 3. Select the S3 bucket. 264 | 265 |
266 | Figure 9: Bedrock Knowledge Base S3 bucket 268 | 270 |
271 | 272 | 4. The Titan embeddings model is automatically selected. 273 | 274 |
275 | Figure 10: Bedrock Knowledge Base embeddings model 277 | 279 |
280 | 281 | 5. Select Amazon OpenSearch Service Serverless from the vector database 282 | options available. 283 | 284 |
285 | Figure 11: Bedrock Knowledge Base OpenSearch Service Serverless 287 | 289 |
290 | 291 | 6. Review and create the knowledge base by clicking the 292 | `Create knowledge base` button. 293 | 294 |
295 | Figure 12: Bedrock Knowledge Base Review & Create 297 | 299 |
300 | 301 | 7. The knowledge base should be created now. 302 | 303 |
304 | Figure 13: Bedrock Knowledge Base create complete 306 | 308 |
309 | 310 | ##### Sync the Bedrock knowledge base 311 | 312 | Once the Bedrock knowledge base is created, we are ready to sync the 313 | data (raw documents) in S3 to embeddings in the OpenSearch Service 314 | Serverless collection vector index. 315 | 316 | 1. Start the sync by pressing the `Sync` button; the button label 317 | changes to `Syncing`. 318 | 319 |
320 | Figure 14: Bedrock Knowledge Base sync 322 | 324 |
325 | 326 | 2. Once the `Sync` completes, the status changes to `Ready`. 327 | 328 |
329 | Figure 15: Bedrock Knowledge Base sync completed 331 | 333 |
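The sync step is also exposed as the `StartIngestionJob` operation in the `bedrock-agent` API, so it can be automated. A sketch, where the knowledge base and data source IDs are placeholders you would take from the console:

``` python
def sync_knowledge_base(kb_id, data_source_id, client=None):
    """Trigger the same ingestion that the console 'Sync' button starts."""
    if client is None:
        import boto3  # requires a recent boto3 that includes the bedrock-agent service
        client = boto3.client("bedrock-agent")
    resp = client.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=data_source_id
    )
    # the job then progresses asynchronously until the data is embedded
    return resp["ingestionJob"]["status"]
```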
334 | 335 | #### Create a Bedrock Agent for question answering 336 | 337 | Now we are all set to ask some questions of our newly created knowledge 338 | base. In this step we do this in a no-code way by creating a Bedrock 339 | Agent. 340 | 341 | 1. Create a new Bedrock agent, call it `sagemaker-qa` and use the 342 | `AmazonBedrockExecutionRoleForAgent_SageMakerQA` IAM role; this role 343 | is created automatically via CloudFormation. 344 | 345 |
346 | Figure 16: Provide agent details - agent name 348 | 350 |
351 | 352 |
353 | Figure 17: Provide agent details - IAM role 355 | 357 |
358 | 359 | 2. Provide the following as the instructions for the agent: 360 | `You are a Q&A agent that politely answers questions from a knowledge base named sagemaker-docs.`. 361 | The `Anthropic Claude V2` model is selected as the model for the 362 | agent. 363 | 364 |
365 | Figure 18: Select model 367 | 368 |
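For completeness, this same configuration can be expressed through the `bedrock-agent` `CreateAgent` API. A sketch, with the role ARN taken from the CloudFormation outputs:

``` python
def create_qa_agent(role_arn, client=None):
    """Create the sagemaker-qa agent programmatically (mirrors the console steps)."""
    if client is None:
        import boto3  # requires a recent boto3 that includes the bedrock-agent service
        client = boto3.client("bedrock-agent")
    resp = client.create_agent(
        agentName="sagemaker-qa",
        agentResourceRoleArn=role_arn,
        foundationModel="anthropic.claude-v2",
        instruction=(
            "You are a Q&A agent that politely answers questions "
            "from a knowledge base named sagemaker-docs."
        ),
    )
    return resp["agent"]["agentId"]
```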
369 | 370 | 3. Click `Next` on the `Add Action groups - optional` page; there are 371 | no action groups needed for this agent. 372 | 373 | 4. Select the `sagemaker-docs` knowledge base and, in the knowledge base 374 | instructions for agent field, enter 375 | `Answer questions about Amazon SageMaker based only on the information contained in the knowledge base.`. 376 | 377 |
378 | Figure 19: Add knowledge base 380 | 382 |
383 | 384 | 5. Click the `Create Agent` button on the `Review and create` screen. 385 | 386 |
387 | Figure 20: Review and create 389 | 390 |
391 | 392 | 6. Once the agent is ready, we can ask questions to our agent using the 393 | Agent console. 394 | 395 |
396 | Figure 21: Agent console 398 | 399 |
400 | 401 | 7. We ask the agent some questions such as 402 | `What are the XGBoost versions supported in Amazon SageMaker`; 403 | notice that we not only get the correct answer but also a link to 404 | the source of the answer: the original document stored in 405 | S3 that was used as context to provide this answer! 406 | 407 |
408 | Figure 22: Q&A with Bedrock Agent 410 | 412 |
413 | 414 | 8. The agent also provides a *trace* feature which can show the steps 415 | the agent undertakes to come up with the final answer. The steps 416 | include the prompt used and the text from the retrieved documents 417 | from the knowledge base. 418 | 419 |
420 | Figure 23: Bedrock Agent Trace Step 1 422 | 424 |
425 | 426 |
427 | Figure 24: Bedrock Agent Trace Step 2 429 | 431 |
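Beyond the console, the agent can also be queried programmatically through the `bedrock-agent-runtime` `InvokeAgent` API, which streams the answer back in chunks. A sketch; the agent ID and alias ID are placeholders taken from the agent's console page:

``` python
def ask_agent(agent_id, agent_alias_id, question, client=None):
    """Send one question to the Bedrock Agent and assemble the streamed answer."""
    if client is None:
        import boto3
        client = boto3.client("bedrock-agent-runtime")
    resp = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId="demo-session",  # reuse the same id to keep conversation state
        inputText=question,
    )
    # the completion is an event stream; answer text arrives in "chunk" events
    return "".join(
        event["chunk"]["bytes"].decode("utf-8")
        for event in resp["completion"]
        if "chunk" in event
    )
```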
432 | 433 | #### Run the RAG notebook 434 | 435 | Now we will interact with our knowledge base through code. The 436 | CloudFormation template creates a SageMaker Notebook that contains the 437 | code to demonstrate this. 438 | 439 | 1. Navigate to SageMaker Notebooks and find the notebook named 440 | `bedrock-kb-rag-workshop` and click on `Open Jupyter Lab`. 441 | 442 |
443 | Figure 25: RAG with Bedrock KB notebook 445 | 447 |
448 | 449 | 2. Open a new `Terminal` from `File -> New -> Terminal` and run the 450 | following commands to install the Bedrock SDK in a new conda kernel 451 | called `bedrock_py39`. 452 | 453 | ``` bash 454 | chmod +x /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh 455 | /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh 456 | ``` 457 | 458 | 3. Wait for one minute after completing the previous step and then click 459 | on `rag_w_bedrock_and_aoss.ipynb` to open the notebook. *Confirm 460 | that the notebook is using the newly created `bedrock_py39` kernel, 461 | otherwise the code will not work. If the kernel is not set to 462 | `bedrock_py39`, refresh the page and the 463 | `bedrock_py39` kernel should then be selected*. 464 | 465 | 4. The notebook code demonstrates the use of the Bedrock, LangChain and 466 | opensearch-py packages for implementing the RAG technique for 467 | question answering. 468 | 469 | 5. We access the models available via Bedrock using the `Bedrock` and 470 | `BedrockEmbeddings` classes from the LangChain package. 471 | 472 | ``` python 473 | from langchain.llms import Bedrock  # we will use Anthropic Claude for text generation 474 | claude_llm = Bedrock(model_id="anthropic.claude-v2") 475 | claude_llm.model_kwargs = dict(temperature=0.5, max_tokens_to_sample=300, top_k=250, top_p=1, stop_sequences=[]) 476 | from langchain.embeddings import BedrockEmbeddings 477 | # we will be using the Titan Embeddings Model to generate our Embeddings. 478 | embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-g1-text-02") 479 | ``` 480 | 481 | 6. Interface to OpenSearch Service Serverless is through the 482 | opensearch-py package. 
483 | 484 | ``` python 485 | # Functions to talk to OpenSearch 486 | from typing import Dict 487 | from opensearchpy import OpenSearch 488 | def query_docs(query: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, index: str, k: int = 3) -> Dict: 489 | """ 490 | Convert the query into an embedding and then find similar documents from OpenSearch Service Serverless 491 | """ 492 | 493 | # embedding 494 | query_embedding = embeddings.embed_query(query) 495 | 496 | # query to lookup OpenSearch kNN vector. Can add any metadata fields based filtering 497 | # here as part of this query. 498 | query_qna = { 499 | "size": k, 500 | "query": { 501 | "knn": { 502 | "vector": { 503 | "vector": query_embedding, 504 | "k": k 505 | } 506 | } 507 | } 508 | } 509 | 510 | # OpenSearch API call 511 | relevant_documents = aoss_client.search( 512 | body=query_qna, 513 | index=index 514 | ) 515 | return relevant_documents 516 | ``` 517 | 518 | 7. We combine the prompt and the documents retrieved from OpenSearch 519 | Service Serverless as follows. 520 | 521 | ``` python 522 | def create_context_for_query(q: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, vector_index: str) -> str: 523 | """ 524 | Create a context out of the similar docs retrieved from the vector database 525 | by concatenating the text from the similar documents. 526 | """ 527 | print(f"query -> {q}") 528 | aoss_response = query_docs(q, embeddings, aoss_client, vector_index) 529 | context = "" 530 | for r in aoss_response['hits']['hits']: 531 | s = r['_source'] 532 | print(f"{s['metadata']}\n{s['text']}") 533 | context += f"{s['text']}\n" 534 | print("----------------") 535 | return context 536 | ``` 537 | 538 | 8. Combining everything, the RAG workflow works as shown below. 539 | 540 | ``` python 541 | # 1. Start with the query 542 | q = "What versions of XGBoost are supported by Amazon SageMaker?" 543 | 544 | # 2. 
Create the context by finding similar documents from the knowledge base 545 | context = create_context_for_query(q, embeddings, client, aoss_vector_index) 546 | 547 | # 3. Now create a prompt by combining the query and the context 548 | prompt = PROMPT_TEMPLATE.format(context, q) 549 | 550 | # 4. Provide the prompt to the FM to generate an answer to the query based on the context provided 551 | response = claude_llm(prompt) 552 | ``` 553 | 554 | 9. Here is a sample question answered first with just the 555 | question in the prompt, i.e. without providing any additional 556 | context. The answer without context is inaccurate. 557 | 558 |
559 | Figure 26: Answer with prompt alone 561 | 563 |
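Note that the `PROMPT_TEMPLATE` used in step 8 is defined in the notebook; a plausible minimal stand-in (illustrative, not the notebook's exact template), with the two positional slots filled as `PROMPT_TEMPLATE.format(context, q)`, looks like this:

``` python
# Illustrative stand-in for the notebook's PROMPT_TEMPLATE: the first slot
# receives the retrieved context, the second the user question.
PROMPT_TEMPLATE = """Use the following pieces of context to answer the question at the end.
If the answer is not contained in the context, say that you don't know.

{}

Question: {}
Answer:"""
```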
564 | 565 | 10. We then ask the same question but this time with the additional 566 | context retrieved from the knowledge base included in the prompt. 567 | Now the inaccuracy in the earlier response is addressed and we also 568 | have attribution as to the source of this answer (notice the 569 | underlined text for the filename and the actual answer)! 570 | 571 |
572 | Figure 27: Answer with prompt and context 574 | 576 |
577 | 578 | ## Clean up 579 | 580 | To avoid incurring future charges, delete the resources. You can do this 581 | by first deleting all the files from the S3 bucket created by the 582 | CloudFormation template and then deleting the CloudFormation stack. 583 | 584 | ## Conclusion 585 | 586 | In this post, we showed how to create an enterprise ready RAG solution 587 | using a combination of AWS services and open-source Python packages. 588 | 589 | We encourage you to learn more by exploring [Amazon 590 | Titan](https://aws.amazon.com/bedrock/titan/) models, [Amazon 591 | Bedrock](https://aws.amazon.com/bedrock/), and [OpenSearch 592 | Service](https://aws.amazon.com/opensearch-service/) and building a 593 | solution using the sample implementation provided in this post and a 594 | dataset relevant to your business. If you have questions or suggestions, 595 | leave a comment. 596 | 597 | ------------------------------------------------------------------------ 598 | 599 | ## Author bio 600 | 601 | Amit 602 | Arora is an AI and ML Specialist Architect at Amazon Web Services, 603 | helping enterprise customers use cloud-based machine learning services 604 | to rapidly scale their innovations. He is also an adjunct lecturer in 605 | the MS data science and analytics program at Georgetown University in 606 | Washington D.C. 
607 | -------------------------------------------------------------------------------- /blog_post.qmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Build a RAG based question answer solution using Amazon Bedrock Knowledge Base, vector engine for Amazon OpenSearch Service Serverless and LangChain" 3 | format: 4 | html: 5 | embed-resources: true 6 | output-file: blog_post.html 7 | theme: cosmo 8 | code-copy: true 9 | code-line-numbers: true 10 | highlight-style: github 11 | docx: 12 | embed-resources: true 13 | output-file: blog_post.docx 14 | theme: cosmo 15 | code-copy: true 16 | code-line-numbers: true 17 | highlight-style: github 18 | gfm: 19 | output-file: blog_post.md 20 | --- 21 | 22 | _Amit Arora_ 23 | 24 | One of the most common applications of generative AI and Foundation Models (FMs) in an enterprise environment is answering questions based on the enterprise’s knowledge corpus. [Amazon Lex](https://aws.amazon.com/lex/) provides the framework for building [AI-based chatbots](https://aws.amazon.com/solutions/retail/ai-for-chatbots). Pre-trained foundation models (FMs) perform well at natural language understanding (NLU) tasks such as summarization, text generation, and question answering on a broad variety of topics, but either struggle to provide accurate answers (without hallucinations) or completely fail at answering questions about content they haven't seen as part of their training data. Furthermore, FMs are trained on a point-in-time snapshot of data and have no inherent ability to access fresh data at inference time; without this ability, they might provide responses that are incorrect or inadequate. 25 | 26 | A commonly used approach to address this problem is to use a technique called Retrieval Augmented Generation (RAG). 
In the RAG-based approach, we convert the user question into vector embeddings using an FM and then do a similarity search for these embeddings in a pre-populated vector database holding the embeddings for the enterprise knowledge corpus. A small number of similar documents (typically three) are added as context along with the user question to the "prompt" provided to another FM, which then generates an answer to the user question using the information provided as context in the prompt. RAG was introduced by [Lewis et al.](https://arxiv.org/abs/2005.11401) in 2020 as a model in which the parametric memory is a pre-trained seq2seq model and the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever. To understand the overall structure of a RAG-based approach, refer to [Build a powerful question answering bot with Amazon SageMaker, Amazon OpenSearch Service, Streamlit, and LangChain](https://aws.amazon.com/blogs/machine-learning/build-a-powerful-question-answering-bot-with-amazon-sagemaker-amazon-opensearch-service-streamlit-and-langchain/). 27 | 28 | In this post, we provide a step-by-step guide with all the building blocks for creating a _Low Code No Code_ (LCNC) enterprise-ready RAG application such as a question answering solution. We use FMs available through Amazon Bedrock for the embeddings model (Amazon Titan Text Embeddings v2), the text generation model (Anthropic Claude v2), the Amazon Bedrock Knowledge Base, and Amazon Bedrock Agents for this solution. The text corpus representing an enterprise knowledge base is stored as HTML files in Amazon S3 and is ingested in the form of text embeddings into an index in an Amazon OpenSearch Service Serverless collection using the Bedrock knowledge base agent in a fully managed, serverless fashion. 29 | 30 | We provide an AWS CloudFormation template to stand up all the resources required for building this solution. 
We then demonstrate how to use [LangChain](https://www.langchain.com) to interface with Bedrock and [opensearch-py](https://pypi.org/project/opensearch-py/) to interface with OpenSearch Service Serverless, and build a RAG-based question answering workflow. 31 | 32 | ## Solution overview 33 | 34 | We use a subset of [SageMaker docs](https://sagemaker.readthedocs.io) as the knowledge corpus for this post. The data is available in the form of HTML files in an S3 bucket; a Bedrock Knowledge Base Agent reads these files, converts them into smaller chunks, encodes the chunks into vectors (embeddings), and ingests these embeddings into an OpenSearch Service Serverless collection index. We implement the RAG functionality in a notebook: a set of SageMaker-related questions is asked of the Claude model without providing any additional context, and then the same questions are asked again, this time with context based on similar documents retrieved from OpenSearch Service Serverless, i.e., using the RAG approach. We demonstrate that the responses generated without RAG can be factually inaccurate, whereas the RAG-based responses are accurate and more useful. 35 | 36 | All the code for this post is available in the [GitHub repo](https://github.com/aws-samples/bedrock-kb-rag/tree/main/blogs/rag). 37 | 38 | 39 | The following figure represents the high-level architecture of the proposed solution. 40 | 41 | ![Architecture](img/ML-15729-bedrock-agents-kb.png){#fig-architecture} 42 | 43 | Step-by-step explanation: 44 | 45 | 1. The user provides a question via the Jupyter notebook. 46 | 1. The question is converted into an embedding using Bedrock via the Titan embeddings v2 model. 47 | 1. The embedding is used to find similar documents from an OpenSearch Service Serverless index. 48 | 1. The similar documents along with the user question are used to create a "prompt". 49 | 1. The prompt is provided to Bedrock to generate a response using the Claude v2 model. 50 | 1. 
The response along with the context is printed in a notebook cell. 51 | 52 | As illustrated in the architecture diagram, we use the following AWS services: 53 | 54 | - [Bedrock](https://aws.amazon.com/bedrock/) for access to the FMs for embedding and text generation as well as for the knowledge base agent. 55 | - [OpenSearch Service Serverless with vector search](https://aws.amazon.com/opensearch-service/serverless-vector-engine/) for storing the embeddings of the enterprise knowledge corpus and doing similarity search with user questions. 56 | - [S3](https://aws.amazon.com/pm/serv-s3/) for storing the raw knowledge corpus data (HTML files). 57 | - [AWS Identity and Access Management](https://aws.amazon.com/iam/) roles and policies for access management. 58 | - [AWS CloudFormation](https://aws.amazon.com/cloudformation/) for creating the entire solution stack through infrastructure as code. 59 | 60 | In terms of open-source packages used in this solution, we use [LangChain](https://python.langchain.com/en/latest/index.html) for interfacing with Bedrock and [opensearch-py](https://pypi.org/project/opensearch-py/) to interface with OpenSearch Service Serverless. 61 | 62 | The workflow for instantiating the solution presented in this post in your own AWS account is as follows: 63 | 64 | 1. Run the CloudFormation template provided with this post in your account. This will create all the infrastructure resources needed for this solution: 65 | a. OpenSearch Service Serverless collection 66 | a. SageMaker Notebook 67 | a. IAM roles 68 | 69 | 1. Create a vector index in the OpenSearch Service Serverless collection. This is done through the OpenSearch Service Serverless console. 70 | 71 | 1. Create a knowledge base in Bedrock and sync data from the S3 bucket to the OpenSearch Service Serverless collection index. This is done through the Bedrock console. 72 | 73 | 1. 
Create a Bedrock Agent, connect it to the knowledge base, and use the Agent console for question answering _without having to write any code_. 74 | 75 | 1. Run the [`rag_w_bedrock_and_aoss.ipynb`](./rag_w_bedrock_and_aoss.ipynb) notebook in the SageMaker notebook to ask questions based on the data ingested in the OpenSearch Service Serverless collection index. 76 | 77 | These steps are discussed in detail in the following sections. 78 | 79 | ### Prerequisites 80 | 81 | To implement the solution provided in this post, you should have an [AWS account](https://signin.aws.amazon.com/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&client_id=signup) and familiarity with FMs, OpenSearch Service, and Bedrock. 82 | 83 | #### Use AWS CloudFormation to create the solution stack 84 | 85 | Choose **Launch Stack** for the Region you want to deploy resources to. All parameters needed by the CloudFormation template have default values already filled in, except for the ARN of the IAM role you are currently logged in with, which you must provide. Make a note of the OpenSearch Service collection ARN; we use this in subsequent steps. **This template takes about 10 minutes to complete**. 86 | 87 | |AWS Region | Link | 88 | |:------------------------:|:-----------:| 89 | |us-east-1 (N. Virginia) | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) | 90 | |us-west-2 (Oregon) | [![Launch Stack](img/ML-15729-cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/new?stackName=rag-w-bedrock-kb&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-15729/template.yml) | 91 | 92 | After the stack is created successfully, navigate to the stack's `Outputs` tab on the AWS CloudFormation console and note the values for `CollectionARN` and `AOSSVectorIndexName`. 
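If you prefer to capture these stack outputs programmatically instead of copying them from the console, a minimal boto3 sketch along the following lines could work (the `rag-w-bedrock-kb` stack name comes from the launch links above; `CollectionARN` and `AOSSVectorIndexName` are the output keys):

```{.python}
def get_stack_outputs(stack_name: str = "rag-w-bedrock-kb") -> dict:
    """Return the CloudFormation stack outputs as a {OutputKey: OutputValue} dict."""
    import boto3  # imported here so the sketch can be loaded without AWS credentials configured

    cfn = boto3.client("cloudformation")
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack["Outputs"]}
```

With the outputs in hand, `get_stack_outputs()["CollectionARN"]` and `get_stack_outputs()["AOSSVectorIndexName"]` can be fed directly into the notebook code shown later.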
We use these values in the subsequent steps. 93 | 94 | ![CloudFormation Stack Outputs](img/ML-15729-cf-outputs.jpg){#fig-cfn-outputs} 95 | 96 | #### Create an OpenSearch Service Serverless vector index 97 | 98 | The CloudFormation stack creates an OpenSearch Service Serverless collection; the next step is to create a vector index. This is done through the OpenSearch Service Serverless console, as described below. 99 | 100 | 1. Navigate to the OpenSearch Service console and click on `Collections`. The `sagemaker-kb` collection created by the CloudFormation stack will be listed there. 101 | 102 | ![SageMaker Knowledge Base Collection](img/ML-15729-aoss.jpg){#fig-aoss-collections} 103 | 104 | 1. Click on the `sagemaker-kb` link to create a vector index for storing the embeddings from the documents in S3. 105 | 106 | ![SageMaker Knowledge Base Vector Index](img/ML-15729-aoss-cv.jpg){#fig-aoss-collection-vector-index} 107 | 108 | 1. Set the vector index name as `sagemaker-readthedocs-io`, vector field name as `vector`, dimensions as `1536`, and distance metric as `Euclidean`. **You must set these parameters exactly as mentioned here because the Bedrock Knowledge Base Agent uses these same values**. 109 | 110 | ![SageMaker Knowledge Base Vector Index Parameters](img/ML-15729-os-vi-1.1.png){#fig-aoss-collection-vector-index-parameters} 111 | 112 | 1. Once created, the vector index is listed as part of the collection. 113 | 114 | ![SageMaker Knowledge Base Vector Index Created](img/ML-15729-os-vi-2.1.png){#fig-aoss-collection-vector-index-created} 115 | 116 | 117 | #### Create a Bedrock knowledge base 118 | 119 | Once the OpenSearch Service Serverless collection and vector index have been created, it is time to set up the Bedrock knowledge base. 120 | 121 | 1. Navigate to the Bedrock console, click on `Knowledge base`, and then click the `Create knowledge base` button. 122 | 123 | ![Bedrock Knowledge Base](img/ML-15729-kb1.jpg){#fig-br-kb-list} 124 | 125 | 1. 
Fill out the details for creating the knowledge base as shown in the screenshots below. 126 | 127 | ![Bedrock Knowledge Base details](img/ML-15729-kb2.jpg){#fig-br-kb-details} 128 | 129 | 1. Select the S3 bucket. 130 | 131 | ![Bedrock Knowledge Base S3 bucket](img/ML-15729-kb4.jpg){#fig-br-kb-s3-bucket} 132 | 133 | 1. The Titan embeddings model is automatically selected. 134 | 135 | ![Bedrock Knowledge Base embeddings model](img/ML-15729-kb5.jpg){#fig-br-kb-titan} 136 | 137 | 138 | 1. Select Amazon OpenSearch Service Serverless from the vector database options available. 139 | 140 | ![Bedrock Knowledge Base OpenSearch Service Serverless](img/ML-15729-kb6.png){#fig-br-kb-aoss} 141 | 142 | 1. Review and create the knowledge base by clicking the `Create knowledge base` button. 143 | 144 | ![Bedrock Knowledge Base Review & Create](img/ML-15729-kb7.png){#fig-br-kb-review-and-create} 145 | 146 | 1. The knowledge base is now created. 147 | 148 | ![Bedrock Knowledge Base create complete](img/ML-15729-kb8.jpg){#fig-br-kb-create-complete} 149 | 150 | 151 | ##### Sync the Bedrock knowledge base 152 | 153 | Once the Bedrock knowledge base is created, we are ready to sync the data (raw documents) in S3 to embeddings in the OpenSearch Service Serverless collection vector index. 154 | 155 | 1. Start the sync by pressing the `Sync` button; the button label changes to `Syncing`. 156 | 157 | ![Bedrock Knowledge Base sync](img/ML-15729-kb9.jpg){#fig-br-kb-sync-in-progress} 158 | 159 | 1. Once the sync completes, the status changes to `Ready`. 160 | 161 | ![Bedrock Knowledge Base sync completed](img/ML-15729-kb10.jpg){#fig-br-kb-sync-done} 162 | 163 | #### Create a Bedrock Agent for question answering 164 | 165 | Now we are all set to ask some questions of our newly created knowledge base. In this step we do this in a no-code way by creating a Bedrock Agent. 166 | 167 | 1. 
Create a new Bedrock agent, call it `sagemaker-qa`, and use the `AmazonBedrockExecutionRoleForAgent_SageMakerQA` IAM role; this role is created automatically by the CloudFormation stack. 168 | 169 | ![Provide agent details - agent name](img/ML-15729-agt2-s1-1.png){#fig-br-agt-create-step1-1} 170 | 171 | ![Provide agent details - IAM role](img/ML-15729-agt-iam.png){#fig-br-agt-create-step1-2} 172 | 173 | 1. Provide the following as the instructions for the agent: `You are a Q&A agent that politely answers questions from a knowledge base.`. The `Anthropic Claude V2` model is selected as the model for the agent. 174 | 175 | ![Select model](img/ML-15729-agt-select-model.png){#fig-br-agt-select-model} 176 | 177 | 1. Click `Next` on the `Add Action groups - optional` page; no action groups are needed for this agent. 178 | 179 | 1. Select the `sagemaker-docs` knowledge base and, in the knowledge base instructions for agent field, enter `Answer questions about Amazon SageMaker based only on the information contained in the knowledge base.`. 180 | 181 | ![Add knowledge base](img/ML-15729-agt5-s1.png){#fig-br-agt-add-kb} 182 | 183 | 1. Click the `Create Agent` button on the `Review and create` screen. 184 | 185 | ![Review and create](img/ML-15729-agt6.png){#fig-br-agt-review-and-create} 186 | 187 | 1. Once the agent is ready, we can ask it questions using the Agent console. 188 | 189 | ![Agent console](img/ML-15729-agt-console.png){#fig-br-agt-console} 190 | 191 | 1. We ask the agent questions such as `What are the XGBoost versions supported in Amazon SageMaker`. Notice that we not only get the correct answer but also a link to the original document stored in S3 that was used as context to provide this answer! 192 | 193 | ![Q&A with Bedrock Agent](img/ML-15729-agt1.png){#fig-br-agt-qna} 194 | 195 | 1. The agent also provides a _trace_ feature that shows the steps the agent takes to arrive at the final answer. 
The steps include the prompt used and the text of the documents retrieved from the knowledge base. 196 | 197 | ![Bedrock Agent Trace Step 1](img/ML-15729-agt-trace1.png){#fig-br-agt-trace-step1} 198 | 199 | ![Bedrock Agent Trace Step 2](img/ML-15729-agt-trace2.png){#fig-br-agt-trace-step2} 200 | 201 | #### Run the RAG notebook 202 | 203 | Now we will interact with our knowledge base through code. The CloudFormation template creates a SageMaker Notebook that contains the code to demonstrate this. 204 | 205 | 1. Navigate to SageMaker Notebooks, find the notebook named `bedrock-kb-rag-workshop`, and click on `Open JupyterLab`. 206 | 207 | ![RAG with Bedrock KB notebook](img/ML-15729-sm1.jpg){#fig-rag-w-br-nb} 208 | 209 | 1. Open a new `Terminal` from `File -> New -> Terminal` and run the following commands to install the Bedrock SDK in a new conda kernel called `bedrock_py39`. 210 | 211 | ```{.bash} 212 | chmod +x /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh 213 | /home/ec2-user/SageMaker/bedrock-kb-rag-workshop/setup_bedrock_conda.sh 214 | ``` 215 | 216 | 1. Wait about a minute after the previous step completes, then click on `rag_w_bedrock_and_aoss.ipynb` to open the notebook. *Confirm that the notebook is using the newly created `bedrock_py39` kernel; otherwise the code will not work. If the kernel is not set to `bedrock_py39`, refresh the page and the `bedrock_py39` kernel should be selected*. 217 | 218 | 1. The notebook code demonstrates the use of the Bedrock, LangChain, and opensearch-py packages for implementing the RAG technique for question answering. 219 | 220 | 1. We access the models available via Bedrock using the `Bedrock` and `BedrockEmbeddings` classes from the LangChain package. 
221 | 222 | ```{.python} 223 | # we will use Anthropic Claude for text generation (assumes `from langchain.llms.bedrock import Bedrock` and `from langchain.embeddings import BedrockEmbeddings` have been imported) 224 | claude_llm = Bedrock(model_id="anthropic.claude-v2") 225 | claude_llm.model_kwargs = dict(temperature=0.5, max_tokens_to_sample=300, top_k=250, top_p=1, stop_sequences=[]) 226 | 227 | # we will be using the Titan Embeddings Model to generate our Embeddings. 228 | embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-g1-text-02") 229 | ``` 230 | 231 | 1. The interface to OpenSearch Service Serverless is through the opensearch-py package. 232 | 233 | ```{.python} 234 | # Functions to talk to OpenSearch 235 | 236 | # Define queries for OpenSearch 237 | def query_docs(query: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, index: str, k: int = 3) -> Dict: 238 | """ 239 | Convert the query into an embedding and then find similar documents from OpenSearch Service Serverless 240 | """ 241 | 242 | # embedding 243 | query_embedding = embeddings.embed_query(query) 244 | 245 | # query to look up the OpenSearch kNN vector. Any metadata field-based filtering 246 | # can be added here as part of this query. 247 | query_qna = { 248 | "size": k, 249 | "query": { 250 | "knn": { 251 | "vector": { 252 | "vector": query_embedding, 253 | "k": k 254 | } 255 | } 256 | } 257 | } 258 | 259 | # OpenSearch API call 260 | relevant_documents = aoss_client.search( 261 | body = query_qna, 262 | index = index 263 | ) 264 | return relevant_documents 265 | ``` 266 | 267 | 1. We combine the prompt and the documents retrieved from OpenSearch Service Serverless as follows. 268 | 269 | ```{.python} 270 | def create_context_for_query(q: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, vector_index: str) -> str: 271 | """ 272 | Create a context out of the similar docs retrieved from the vector database 273 | by concatenating the text from the similar documents. 
274 | """ 275 | print(f"query -> {q}") 276 | aoss_response = query_docs(q, embeddings, aoss_client, vector_index) 277 | context = "" 278 | for r in aoss_response['hits']['hits']: 279 | s = r['_source'] 280 | print(f"{s['metadata']}\n{s['text']}") 281 | context += f"{s['text']}\n" 282 | print("----------------") 283 | return context 284 | ``` 285 | 286 | 1. Combining everything, the RAG workflow works as shown below. 287 | 288 | ```{.python} 289 | # 1. Start with the query 290 | q = "What versions of XGBoost are supported by Amazon SageMaker?" 291 | 292 | # 2. Create the context by finding similar documents from the knowledge base 293 | context = create_context_for_query(q, embeddings, client, aoss_vector_index) 294 | 295 | # 3. Now create a prompt by combining the query and the context 296 | prompt = PROMPT_TEMPLATE.format(context, q) 297 | 298 | # 4. Provide the prompt to the FM to generate an answer to the query based on the context provided 299 | response = claude_llm(prompt) 300 | ``` 301 | 302 | 1. Here is an example of a question answered first with just the question in the prompt, i.e., without providing any additional context. The answer without context is inaccurate. 303 | 304 | ![Answer with prompt alone](img/ML-15729-kb11-wo-context.png){#fig-rag-wo-context} 305 | 306 | 307 | 1. We then ask the same question, this time with the additional context retrieved from the knowledge base included in the prompt. Now the inaccuracy in the earlier response is addressed, and we also have attribution for the source of this answer (notice the underlined text for the filename and the actual answer)! 308 | 309 | ![Answer with prompt and context](img/ML-15729-kb11-w-context.png){#fig-answer-w-context} 310 | 311 | ## Clean up 312 | 313 | To avoid incurring future charges, delete the resources. You can do this by first deleting all the files from the S3 bucket created by the CloudFormation template and then deleting the CloudFormation stack. 
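The clean-up can also be scripted. A minimal boto3 sketch is shown below; the bucket name is the one created by the stack (passed in as an argument here), and `rag-w-bedrock-kb` is the stack name from the launch links above:

```{.python}
def delete_solution_resources(bucket_name: str, stack_name: str = "rag-w-bedrock-kb") -> None:
    """Empty the S3 bucket created by the stack, then delete the CloudFormation stack."""
    import boto3  # imported here so the sketch can be loaded without AWS credentials configured

    # CloudFormation cannot delete a non-empty bucket, so empty it first
    boto3.resource("s3").Bucket(bucket_name).objects.all().delete()

    cfn = boto3.client("cloudformation")
    cfn.delete_stack(StackName=stack_name)
    # optionally block until the deletion finishes
    cfn.get_waiter("stack_delete_complete").wait(StackName=stack_name)
```

Note that this deletes every object in the bucket as well as all resources created by the stack, so double-check the bucket and stack names before running it.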
314 | 315 | ## Conclusion 316 | 317 | In this post, we showed how to create an enterprise ready RAG solution using a combination of AWS services and open-source Python packages. 318 | 319 | We encourage you to learn more by exploring [Amazon Titan](https://aws.amazon.com/bedrock/titan/) models, [Amazon Bedrock](https://aws.amazon.com/bedrock/), and [OpenSearch Service](https://aws.amazon.com/opensearch-service/) and building a solution using the sample implementation provided in this post and a dataset relevant to your business. If you have questions or suggestions, leave a comment. 320 | 321 | * * * * * 322 | 323 | ## Author bio 324 | 325 | Amit Arora is an AI and ML Specialist Architect at Amazon Web Services, helping enterprise customers use cloud-based machine learning services to rapidly scale their innovations. He is also an adjunct lecturer in the MS data science and analytics program at Georgetown University in Washington D.C. 326 | -------------------------------------------------------------------------------- /img/ML-15729-Amit.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-Amit.png -------------------------------------------------------------------------------- /img/ML-15729-agt-console.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-console.png -------------------------------------------------------------------------------- /img/ML-15729-agt-iam.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-iam.png 
-------------------------------------------------------------------------------- /img/ML-15729-agt-select-model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-select-model.png -------------------------------------------------------------------------------- /img/ML-15729-agt-trace1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-trace1.png -------------------------------------------------------------------------------- /img/ML-15729-agt-trace2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt-trace2.png -------------------------------------------------------------------------------- /img/ML-15729-agt1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt1.png -------------------------------------------------------------------------------- /img/ML-15729-agt2-s1-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt2-s1-1.png -------------------------------------------------------------------------------- /img/ML-15729-agt2-s1-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt2-s1-2.png 
-------------------------------------------------------------------------------- /img/ML-15729-agt2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt2.png -------------------------------------------------------------------------------- /img/ML-15729-agt3-s1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt3-s1.png -------------------------------------------------------------------------------- /img/ML-15729-agt3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt3.png -------------------------------------------------------------------------------- /img/ML-15729-agt4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt4.png -------------------------------------------------------------------------------- /img/ML-15729-agt5-s1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt5-s1.png -------------------------------------------------------------------------------- /img/ML-15729-agt5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt5.png 
-------------------------------------------------------------------------------- /img/ML-15729-agt6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt6.png -------------------------------------------------------------------------------- /img/ML-15729-agt7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-agt7.png -------------------------------------------------------------------------------- /img/ML-15729-aoss-cv.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss-cv.jpg -------------------------------------------------------------------------------- /img/ML-15729-aoss.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss.jpg -------------------------------------------------------------------------------- /img/ML-15729-aoss1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss1.jpg -------------------------------------------------------------------------------- /img/ML-15729-aoss2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-aoss2.jpg 
-------------------------------------------------------------------------------- /img/ML-15729-bedrock-agents-kb.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-bedrock-agents-kb.png -------------------------------------------------------------------------------- /img/ML-15729-blog_post.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-blog_post.png -------------------------------------------------------------------------------- /img/ML-15729-cf-outputs.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-cf-outputs.jpg -------------------------------------------------------------------------------- /img/ML-15729-cf1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-cf1.jpg -------------------------------------------------------------------------------- /img/ML-15729-cloudformation-launch-stack.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-cloudformation-launch-stack.png -------------------------------------------------------------------------------- /img/ML-15729-kb1.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb1.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb10.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb10.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb11-w-context.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb11-w-context.png -------------------------------------------------------------------------------- /img/ML-15729-kb11-wo-context.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb11-wo-context.png -------------------------------------------------------------------------------- /img/ML-15729-kb11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb11.png -------------------------------------------------------------------------------- /img/ML-15729-kb2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb2.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb3.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb3.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb4.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb4.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb5.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb5.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb6.png -------------------------------------------------------------------------------- /img/ML-15729-kb7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb7.png -------------------------------------------------------------------------------- /img/ML-15729-kb8.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb8.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb9.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb9.jpg -------------------------------------------------------------------------------- /img/ML-15729-kb_02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-kb_02.png -------------------------------------------------------------------------------- /img/ML-15729-os-vi-1.1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-1.1.png -------------------------------------------------------------------------------- /img/ML-15729-os-vi-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-1.png -------------------------------------------------------------------------------- /img/ML-15729-os-vi-2.1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-2.1.png -------------------------------------------------------------------------------- /img/ML-15729-os-vi-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-os-vi-2.png -------------------------------------------------------------------------------- /img/ML-15729-rag_1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-rag_1.png -------------------------------------------------------------------------------- /img/ML-15729-sm1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-sm1.jpg -------------------------------------------------------------------------------- /img/ML-15729-sm2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-sm2.jpg -------------------------------------------------------------------------------- /img/ML-15729-sm3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/bedrock-kb-rag-workshop/aa4da9859092313551b0a32cdabdd6cf636cf20a/img/ML-15729-sm3.jpg -------------------------------------------------------------------------------- /img/bedrock-agents-kb.drawio: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | -------------------------------------------------------------------------------- /rag_w_bedrock_agent.ipynb: 
-------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 30, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import uuid\n", 10 | "import boto3\n", 11 | "import pprint\n", 12 | "import botocore\n", 13 | "import logging\n", 14 | "\n", 15 | "logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)\n", 16 | "logger = logging.getLogger(__name__)\n" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 31, 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "# global constants\n", 26 | "ENDPOINT_URL: str = None\n" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 32, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "# make sure that the install sequence is as follows\n", 36 | "# %pip install boto3-1.28.54-py3-none-any.whl\n", 37 | "# %pip install botocore-1.31.54-py3-none-any.whl\n", 38 | "# %pip install awscli-1.29.54-py3-none-any.whl\n", 39 | "\n", 40 | "# exit out if the Boto3 (Python) SDK versions are not correct\n", 41 | "assert boto3.__version__ == \"1.28.73\"\n", 42 | "assert botocore.__version__ == \"1.31.73\"\n" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 33, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "input_text:str = \"What are the XGBoost versions supported in Amazon SageMaker?\" # replace this with a prompt relevant to your agent\n", 52 | "agent_id:str = 'J0TEWQNZ89' # note this from the agent console on Bedrock\n", 53 | "agent_alias_id:str = 'TSTALIASID' # fixed for draft version of the agent\n", 54 | "session_id:str = str(uuid.uuid1()) # random identifier\n", 55 | "enable_trace:bool = True\n" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": 34, 61 | "metadata": {}, 62 | "outputs": [ 63 | { 64 | "name": "stderr", 65 | "output_type":
"stream", 66 | "text": [ 67 | "[2023-11-02 15:01:55,355] p43792 {691742557.py:3} INFO - \n" 68 | ] 69 | } 70 | ], 71 | "source": [ 72 | "# create a boto3 bedrock agent client\n", 73 | "client = boto3.client(\"bedrock-agent-runtime\", endpoint_url=ENDPOINT_URL)\n", 74 | "logger.info(client)\n" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": 35, 80 | "metadata": {}, 81 | "outputs": [ 82 | { 83 | "name": "stderr", 84 | "output_type": "stream", 85 | "text": [ 86 | "[2023-11-02 15:01:55,849] p43792 {4226590062.py:9} INFO - None\n" 87 | ] 88 | }, 89 | { 90 | "name": "stdout", 91 | "output_type": "stream", 92 | "text": [ 93 | "{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',\n", 94 | " 'content-type': 'application/json',\n", 95 | " 'date': 'Thu, 02 Nov 2023 19:01:55 GMT',\n", 96 | " 'transfer-encoding': 'chunked',\n", 97 | " 'x-amz-bedrock-agent-session-id': '4a5687bd-79b2-11ee-943b-846a79be0989',\n", 98 | " 'x-amzn-bedrock-agent-content-type': 'application/json',\n", 99 | " 'x-amzn-requestid': 'b76650eb-ee2d-4120-a091-d680ea4c588f'},\n", 100 | " 'HTTPStatusCode': 200,\n", 101 | " 'RequestId': 'b76650eb-ee2d-4120-a091-d680ea4c588f',\n", 102 | " 'RetryAttempts': 0},\n", 103 | " 'completion': <botocore.eventstream.EventStream object at 0x...>,\n", 104 | " 'contentType': 'application/json',\n", 105 | " 'sessionId': '4a5687bd-79b2-11ee-943b-846a79be0989'}\n" 106 | ] 107 | } 108 | ], 109 | "source": [ 110 | "# invoke the agent API\n", 111 | "response = client.invoke_agent(inputText=input_text,\n", 112 | " agentId=agent_id,\n", 113 | " agentAliasId=agent_alias_id,\n", 114 | " sessionId=session_id,\n", 115 | " enableTrace=enable_trace\n", 116 | ")\n", 117 | "\n", 118 | "logger.info(pprint.pprint(response))\n" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": 36, 124 | "metadata": {}, 125 | "outputs": [ 126 | { 127 | "name": "stderr", 128 | "output_type": "stream", 129 | "text": [ 130 | "[2023-11-02 15:02:04,472] p43792 {:11} INFO - {\n", 131 | " \"agentId\":
\"J0TEWQNZ89\",\n", 132 | " \"agentAliasId\": \"TSTALIASID\",\n", 133 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n", 134 | " \"trace\": {\n", 135 | " \"rationaleTrace\": {\n", 136 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-0\",\n", 137 | " \"text\": \"Review the \\\"User Input\\\", \\\"Conversation History\\\", \\\"Attributes\\\", \\\"APIs\\\" and always think about what to do\"\n", 138 | " }\n", 139 | " }\n", 140 | "}\n", 141 | "[2023-11-02 15:02:19,633] p43792 {:11} INFO - {\n", 142 | " \"agentId\": \"J0TEWQNZ89\",\n", 143 | " \"agentAliasId\": \"TSTALIASID\",\n", 144 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n", 145 | " \"trace\": {\n", 146 | " \"invocationInputTrace\": {\n", 147 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-0\",\n", 148 | " \"invocationType\": \"KNOWLEDGE_BASE\",\n", 149 | " \"knowledgeBaseLookupInput\": {\n", 150 | " \"text\": \"What are the XGBoost versions supported in Amazon SageMaker?\",\n", 151 | " \"knowledgeBaseId\": \"HK7XZ6KQYP\"\n", 152 | " }\n", 153 | " }\n", 154 | " }\n", 155 | "}\n", 156 | "[2023-11-02 15:02:19,634] p43792 {:11} INFO - {\n", 157 | " \"agentId\": \"J0TEWQNZ89\",\n", 158 | " \"agentAliasId\": \"TSTALIASID\",\n", 159 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n", 160 | " \"trace\": {\n", 161 | " \"observationTrace\": {\n", 162 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-0\",\n", 163 | " \"invocationType\": \"KNOWLEDGE_BASE\",\n", 164 | " \"knowledgeBaseLookupOutput\": {\n", 165 | " \"sourceReferences\": {\n", 166 | " \"textSourceReferences\": [\n", 167 | " {\n", 168 | " \"sourceLocation\": {\n", 169 | " \"s3SourceLocation\": {\n", 170 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"\n", 171 | " }\n", 172 | " },\n", 173 | " \"referenceText\": \"see Extending our PyTorch containers. 
Use XGBoost as a Built-in Algortihm\\u00b6 Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don\\u2019t have to write a training script. If you don\\u2019t need the features and flexibility of open source XGBoost, consider using the built-in version. For information about using the Amazon SageMaker XGBoost built-in algorithm, see XGBoost Algorithm in the Amazon SageMaker Developer Guide. Use the Open Source XGBoost Algorithm\\u00b6 If you want the flexibility and additional features that it provides, use the SageMaker open source XGBoost algorithm. For which XGBoost versions are supported, see the AWS documentation. We recommend that you use the latest supported version because that\\u2019s where we focus most of our development efforts. For a complete example of using the open source XGBoost algorithm, see the sample notebook at https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb. For more information about XGBoost, see the XGBoost documentation. Train a Model with Open Source XGBoost\\u00b6 To train a model by using the Amazon SageMaker open source XGBoost algorithm: Prepare a training script Create a sagemaker.xgboost.XGBoost estimator Call the estimator\\u2019s fit method Prepare a Training Script\\u00b6 A typical training script loads data from the input channels, configures training with hyperparameters, trains a model,\"\n", 174 | " },\n", 175 | " {\n", 176 | " \"sourceLocation\": {\n", 177 | " \"s3SourceLocation\": {\n", 178 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"\n", 179 | " }\n", 180 | " },\n", 181 | " \"referenceText\": \"see Extending our PyTorch containers. 
Use XGBoost as a Built-in Algortihm\\u00b6 Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don\\u2019t have to write a training script. If you don\\u2019t need the features and flexibility of open source XGBoost, consider using the built-in version. For information about using the Amazon SageMaker XGBoost built-in algorithm, see XGBoost Algorithm in the Amazon SageMaker Developer Guide. Use the Open Source XGBoost Algorithm\\u00b6 If you want the flexibility and additional features that it provides, use the SageMaker open source XGBoost algorithm. For which XGBoost versions are supported, see the AWS documentation. We recommend that you use the latest supported version because that\\u2019s where we focus most of our development efforts. For a complete example of using the open source XGBoost algorithm, see the sample notebook at https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb. For more information about XGBoost, see the XGBoost documentation. Train a Model with Open Source XGBoost\\u00b6 To train a model by using the Amazon SageMaker open source XGBoost algorithm: Prepare a training script Create a sagemaker.xgboost.XGBoost estimator Call the estimator\\u2019s fit method Prepare a Training Script\\u00b6 A typical training script loads data from the input channels, configures training with hyperparameters, trains a model,\"\n", 182 | " },\n", 183 | " {\n", 184 | " \"sourceLocation\": {\n", 185 | " \"s3SourceLocation\": {\n", 186 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_algorithms_tabular_xgboost.html\"\n", 187 | " }\n", 188 | " },\n", 189 | " \"referenceText\": \"an expanded set of metrics than the original versions. 
It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5. The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker XGBoost algorithm. Notebook Title Description How to Create a Custom XGBoost container? This notebook shows you how to build a custom XGBoost Container with Amazon SageMaker Batch Transform. Regression with XGBoost using Parquet This notebook shows you how to use the Abalone dataset in Parquet to train a XGBoost model. How to Train and Host a Multiclass Classification Model? This notebook shows how to use the MNIST dataset to train and host a multiclass classification model. How to train a Model for Customer Churn Prediction? This notebook shows you how to train a model to Predict Mobile Customer Departure in an effort to identify unhappy customers. An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training This notebook shows you how to use Spot Instances for training with a XGBoost Container. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running. For instructions on how to create and access Jupyter\"\n", 190 | " },\n", 191 | " {\n", 192 | " \"sourceLocation\": {\n", 193 | " \"s3SourceLocation\": {\n", 194 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_algorithms_tabular_xgboost.html\"\n", 195 | " }\n", 196 | " },\n", 197 | " \"referenceText\": \"an expanded set of metrics than the original versions. 
It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5. The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker XGBoost algorithm. Notebook Title Description How to Create a Custom XGBoost container? This notebook shows you how to build a custom XGBoost Container with Amazon SageMaker Batch Transform. Regression with XGBoost using Parquet This notebook shows you how to use the Abalone dataset in Parquet to train a XGBoost model. How to Train and Host a Multiclass Classification Model? This notebook shows how to use the MNIST dataset to train and host a multiclass classification model. How to train a Model for Customer Churn Prediction? This notebook shows you how to train a model to Predict Mobile Customer Departure in an effort to identify unhappy customers. An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training This notebook shows you how to use Spot Instances for training with a XGBoost Container. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running. 
For instructions on how to create and access Jupyter\"\n", 198 | " },\n", 199 | " {\n", 200 | " \"sourceLocation\": {\n", 201 | " \"s3SourceLocation\": {\n", 202 | " \"s3Uri\": \"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"\n", 203 | " }\n", 204 | " },\n", 205 | " \"referenceText\": \"the permissions necessary to run an Amazon SageMaker training job, the type and number of instances to use for the training job, and a dictionary of the hyperparameters to pass to the training script. from sagemaker.xgboost.estimator import XGBoost xgb_estimator = XGBoost( entry_point=\\\"abalone.py\\\", hyperparameters=hyperparameters, role=role, instance_count=1, instance_type=\\\"ml.m5.2xlarge\\\", framework_version=\\\"1.0-1\\\", ) Call the fit Method\\u00b6 After you create an estimator, call the fit method to run the training job. xgb_script_mode_estimator.fit({\\\"train\\\": train_input}) Deploy Open Source XGBoost Models\\u00b6 After you fit an XGBoost Estimator, you can host the newly created model in SageMaker. After you call fit, you can call deploy on an XGBoost estimator to create a SageMaker endpoint. The endpoint runs a SageMaker-provided XGBoost model server and hosts the model produced by your training script, which was run when you called fit. This was the model you saved to model_dir. deploy returns a Predictor object, which you can use to do inference on the Endpoint hosting your XGBoost model. Each Predictor provides a predict method which can do inference with numpy arrays, Python lists, or strings. After inference arrays or lists are serialized and sent to the XGBoost model server, predict returns the result of inference against your model. 
serializer = StringSerializer(content_type=\\\"text/libsvm\\\") predictor = estimator.deploy(\"\n", 206 | " }\n", 207 | " ]\n", 208 | " }\n", 209 | " }\n", 210 | " }\n", 211 | " }\n", 212 | "}\n", 213 | "[2023-11-02 15:02:26,296] p43792 {:11} INFO - {\n", 214 | " \"agentId\": \"J0TEWQNZ89\",\n", 215 | " \"agentAliasId\": \"TSTALIASID\",\n", 216 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n", 217 | " \"trace\": {\n", 218 | " \"rationaleTrace\": {\n", 219 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-1\",\n", 220 | " \"text\": \"Based on the previous \\\"Observation\\\" I am able to provide \\\"Final Answer\\\"\"\n", 221 | " }\n", 222 | " }\n", 223 | "}\n", 224 | "[2023-11-02 15:02:37,746] p43792 {:11} INFO - {\n", 225 | " \"agentId\": \"J0TEWQNZ89\",\n", 226 | " \"agentAliasId\": \"TSTALIASID\",\n", 227 | " \"sessionId\": \"4a5687bd-79b2-11ee-943b-846a79be0989\",\n", 228 | " \"trace\": {\n", 229 | " \"observationTrace\": {\n", 230 | " \"traceId\": \"c5e00690-fbdc-4823-a9ac-5ba9ba27c90a-1\",\n", 231 | " \"invocationType\": \"FINISH\",\n", 232 | " \"finalResponse\": {\n", 233 | " \"text\": \"The XGBoost versions supported in Amazon SageMaker are 1.0, 1.2, 1.3 and 1.5. The latest version 1.5 is recommended.\"\n", 234 | " }\n", 235 | " }\n", 236 | " }\n", 237 | "}\n", 238 | "[2023-11-02 15:02:37,747] p43792 {:7} INFO - Final answer ->\n", 239 | "Amazon SageMaker supports XGBoost versions 1.0, 1.2, 1.3, and 1.5. The latest supported version is recommended because that is where most development efforts are focused. Amazon SageMaker supports XGBoost versions 1.0, 1.2, 1.3, and 1.5. 
The latest supported version is recommended because that is where most development efforts are focused.\n" 240 | ] 241 | }, 242 | { 243 | "name": "stdout", 244 | "output_type": "stream", 245 | "text": [ 246 | "CPU times: total: 15.6 ms\n", 247 | "Wall time: 41.9 s\n" 248 | ] 249 | } 250 | ], 251 | "source": [ 252 | "%%time\n", 253 | "import json\n", 254 | "event_stream = response['completion']\n", 255 | "try:\n", 256 | " for event in event_stream: \n", 257 | " if 'chunk' in event:\n", 258 | " data = event['chunk']['bytes']\n", 259 | " logger.info(f\"Final answer ->\\n{data.decode('utf8')}\") \n", 260 | " end_event_received = True\n", 261 | " # End event indicates that the request finished successfully\n", 262 | " elif 'trace' in event:\n", 263 | " logger.info(json.dumps(event['trace'], indent=2))\n", 264 | " else:\n", 265 | " raise Exception(\"unexpected event.\", event)\n", 266 | "except Exception as e:\n", 267 | " raise Exception(\"unexpected event.\", e)\n" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "metadata": {}, 274 | "outputs": [], 275 | "source": [] 276 | } 277 | ], 278 | "metadata": { 279 | "kernelspec": { 280 | "display_name": "python-sdk-test-runtime-trace", 281 | "language": "python", 282 | "name": "python3" 283 | }, 284 | "language_info": { 285 | "codemirror_mode": { 286 | "name": "ipython", 287 | "version": 3 288 | }, 289 | "file_extension": ".py", 290 | "mimetype": "text/x-python", 291 | "name": "python", 292 | "nbconvert_exporter": "python", 293 | "pygments_lexer": "ipython3", 294 | "version": "3.11.5" 295 | } 296 | }, 297 | "nbformat": 4, 298 | "nbformat_minor": 2 299 | } 300 | -------------------------------------------------------------------------------- /rag_w_bedrock_and_aoss.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "c2dc3fcb-ae4f-48e6-9b1c-71b002e0fe1b", 6 | "metadata": { 7 | "tags": [] 
8 | }, 9 | "source": [ 10 | "# RAG with Amazon Bedrock Knowledge Base\n", 11 | "\n", 12 | "In this notebook we use the information ingested in the Bedrock knowledge base to answer user queries." 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "id": "a59d4975", 18 | "metadata": {}, 19 | "source": [ 20 | "## Import packages and utility functions\n", 21 | "Import packages, set up utility functions, and interface with Amazon OpenSearch Service Serverless (AOSS)." 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 1, 27 | "id": "85ce61b6-795b-488c-b400-1ac80d355162", 28 | "metadata": { 29 | "tags": [] 30 | }, 31 | "outputs": [], 32 | "source": [ 33 | "import os\n", 34 | "import sys\n", 35 | "import json\n", 36 | "import boto3\n", 37 | "from typing import Dict\n", 38 | "from urllib.request import urlretrieve\n", 39 | "from langchain.llms.bedrock import Bedrock\n", 40 | "from IPython.display import Markdown, display\n", 41 | "from langchain.embeddings import BedrockEmbeddings\n", 42 | "from opensearchpy import OpenSearch, RequestsHttpConnection\n", 43 | "from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth\n" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": 2, 49 | "id": "79eb7df4", 50 | "metadata": {}, 51 | "outputs": [ 52 | { 53 | "name": "stdout", 54 | "output_type": "stream", 55 | "text": [ 56 | "Requirement already satisfied: opensearch-py in c:\users\aroraai\appdata\local\miniconda3\envs\bedrock_new\lib\site-packages (2.3.2)\n", 57 | "Requirement already satisfied: urllib3>=1.26.9 in c:\users\aroraai\appdata\local\miniconda3\envs\bedrock_new\lib\site-packages (from opensearch-py) (1.26.17)\n", 58 | "Requirement already satisfied: requests<3.0.0,>=2.4.0 in c:\users\aroraai\appdata\local\miniconda3\envs\bedrock_new\lib\site-packages (from opensearch-py) (2.31.0)\n", 59 | "Requirement already satisfied: six in
c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (1.16.0)\n", 60 | "Requirement already satisfied: python-dateutil in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (2.8.2)\n", 61 | "Requirement already satisfied: certifi>=2022.12.07 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from opensearch-py) (2023.7.22)\n", 62 | "Requirement already satisfied: charset-normalizer<4,>=2 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from requests<3.0.0,>=2.4.0->opensearch-py) (3.3.0)\n", 63 | "Requirement already satisfied: idna<4,>=2.5 in c:\\users\\aroraai\\appdata\\local\\miniconda3\\envs\\bedrock_new\\lib\\site-packages (from requests<3.0.0,>=2.4.0->opensearch-py) (3.4)\n", 64 | "Note: you may need to restart the kernel to use updated packages.\n" 65 | ] 66 | } 67 | ], 68 | "source": [ 69 | "%pip install opensearch-py\n" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": 3, 75 | "id": "4c1ea784-37bc-4a3f-84e3-1047f7e5cfd9", 76 | "metadata": { 77 | "tags": [] 78 | }, 79 | "outputs": [], 80 | "source": [ 81 | "# global constants\n", 82 | "SERVICE = 'aoss'\n", 83 | "\n", 84 | "# do not change the name of the CFN stack, we assume that the \n", 85 | "# blog post creates a stack by this name and read output values\n", 86 | "# from the stack.\n", 87 | "CFN_STACK_NAME = \"rag-w-bedrock-kb\"\n" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": 4, 93 | "id": "59d559b7", 94 | "metadata": { 95 | "tags": [] 96 | }, 97 | "outputs": [], 98 | "source": [ 99 | "# Anthropic models need the Human/Assistant terminology used in the prompts, \n", 100 | "# they work better with XML style tags.\n", 101 | "PROMPT_TEMPLATE = \"\"\"Human: Answer the question based only on the information provided in few sentences.\n", 102 | "\n", 103 | "{}\n", 104 | "\n", 
105 | "Include your answer in the <answer></answer> tags. Do not include any preamble in your answer.\n", 106 | "\n", 107 | "{}\n", 108 | "\n", 109 | "Assistant:\"\"\"\n" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": 5, 115 | "id": "61c6f5cc-2384-4f18-8add-418b258e8ab5", 116 | "metadata": { 117 | "tags": [] 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "# utility functions\n", 122 | "\n", 123 | "def get_cfn_outputs(stackname: str) -> Dict:\n", 124 | " cfn = boto3.client('cloudformation')\n", 125 | " outputs = {}\n", 126 | " for output in cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']:\n", 127 | " outputs[output['OutputKey']] = output['OutputValue']\n", 128 | " return outputs\n", 129 | "\n", 130 | "def printmd(string: str):\n", 131 | " display(Markdown(string))\n" 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 6, 137 | "id": "326c8d7f", 138 | "metadata": { 139 | "tags": [] 140 | }, 141 | "outputs": [], 142 | "source": [ 143 | "# Functions to talk to OpenSearch\n", 144 | "\n", 145 | "# Define queries for OpenSearch\n", 146 | "def query_docs(query: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, index: str, k: int = 3) -> Dict:\n", 147 | " \"\"\"\n", 148 | " Convert the query into embedding and then find similar documents from AOSS\n", 149 | " \"\"\"\n", 150 | "\n", 151 | " # embedding\n", 152 | " query_embedding = embeddings.embed_query(query)\n", 153 | "\n", 154 | " # query to lookup OpenSearch kNN vector.
Can add any metadata fields based filtering\n", 155 | " # here as part of this query.\n", 156 | " query_qna = {\n", 157 | " \"size\": k,\n", 158 | " \"query\": {\n", 159 | " \"knn\": {\n", 160 | " \"vector\": {\n", 161 | " \"vector\": query_embedding,\n", 162 | " \"k\": k\n", 163 | " }\n", 164 | " }\n", 165 | " }\n", 166 | " }\n", 167 | "\n", 168 | " # OpenSearch API call\n", 169 | " relevant_documents = aoss_client.search(\n", 170 | " body = query_qna,\n", 171 | " index = index\n", 172 | " )\n", 173 | " return relevant_documents\n" 174 | ] 175 | }, 176 | { 177 | "cell_type": "code", 178 | "execution_count": 7, 179 | "id": "1d011b20", 180 | "metadata": { 181 | "tags": [] 182 | }, 183 | "outputs": [], 184 | "source": [ 185 | "def create_context_for_query(q: str, embeddings: BedrockEmbeddings, aoss_client: OpenSearch, vector_index: str) -> str:\n", 186 | " \"\"\"\n", 187 | " Create a context out of the similar docs retrieved from the vector database\n", 188 | " by concatenating the text from the similar documents.\n", 189 | " \"\"\"\n", 190 | " print(f\"query -> {q}\")\n", 191 | " aoss_response = query_docs(q, embeddings, aoss_client, vector_index)\n", 192 | " context = \"\"\n", 193 | " for r in aoss_response['hits']['hits']:\n", 194 | " s = r['_source']\n", 195 | " print(f\"{s['metadata']}\\n{s['text']}\")\n", 196 | " context += f\"{s['text']}\\n\"\n", 197 | " print(\"----------------\")\n", 198 | " return context\n" 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "id": "6adf61b1", 204 | "metadata": {}, 205 | "source": [ 206 | "## Retrieve parameters needed from the AWS CloudFormation stack" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": 8, 212 | "id": "10051806", 213 | "metadata": { 214 | "tags": [] 215 | }, 216 | "outputs": [ 217 | { 218 | "name": "stdout", 219 | "output_type": "stream", 220 | "text": [ 221 | "aoss_collection_arn=arn:aws:aoss:us-east-1:015469603702:collection/sip67bzp3hoel0x7crh3\n", 222 | 
"aoss_host=sip67bzp3hoel0x7crh3.us-east-1.aoss.amazonaws.com\n", 223 | "aoss_vector_index=sagemaker-readthedocs-io\n", 224 | "aws_region=us-east-1\n" 225 | ] 226 | } 227 | ], 228 | "source": [ 229 | "\n", 230 | "outputs = get_cfn_outputs(CFN_STACK_NAME)\n", 231 | "\n", 232 | "region = outputs[\"Region\"]\n", 233 | "aoss_collection_arn = outputs['CollectionARN']\n", 234 | "aoss_host = f\"{os.path.basename(aoss_collection_arn)}.{region}.aoss.amazonaws.com\"\n", 235 | "aoss_vector_index = outputs['AOSSVectorIndexName']\n", 236 | "print(f\"aoss_collection_arn={aoss_collection_arn}\\naoss_host={aoss_host}\\naoss_vector_index={aoss_vector_index}\\naws_region={region}\")\n" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "id": "1b4a5e9e", 242 | "metadata": {}, 243 | "source": [ 244 | "## Set up the embeddings and text generation models\n", 245 | "\n", 246 | "We can use LangChain to set up the embeddings and text generation models provided via Amazon Bedrock." 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": 9, 252 | "id": "cf6613d2-aae8-48e5-adfb-0ea7fb75f2dd", 253 | "metadata": { 254 | "tags": [] 255 | }, 256 | "outputs": [], 257 | "source": [ 258 | "# create a boto3 bedrock client\n", 259 | "bedrock_client = boto3.client('bedrock-runtime')\n", 260 | "\n", 261 | "# we will use Anthropic Claude for text generation\n", 262 | "claude_llm = Bedrock(model_id=\"anthropic.claude-v2\", client=bedrock_client)\n", 263 | "claude_llm.model_kwargs = dict(temperature=0.5, max_tokens_to_sample=300, top_k=250, top_p=1, stop_sequences=[])\n", 264 | "\n", 265 | "# we will be using the Titan Embeddings Model to generate our Embeddings.\n", 266 | "embeddings = BedrockEmbeddings(model_id=\"amazon.titan-embed-g1-text-02\", client=bedrock_client)\n" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "id": "64f0166a", 272 | "metadata": {}, 273 | "source": [ 274 | "## Interface with Amazon OpenSearch Service Serverless\n", 275 | "We use the 
open-source [opensearch-py](https://pypi.org/project/opensearch-py/) package to talk to AOSS." 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": 10, 281 | "id": "5d36f340-81ea-4617-b37d-57bf7669c9ac", 282 | "metadata": { 283 | "tags": [] 284 | }, 285 | "outputs": [], 286 | "source": [ 287 | "credentials = boto3.Session().get_credentials()\n", 288 | "auth = AWSV4SignerAuth(credentials, region, SERVICE)\n", 289 | "\n", 290 | "client = OpenSearch(\n", 291 | "    hosts = [{'host': aoss_host, 'port': 443}],\n", 292 | "    http_auth = auth,\n", 293 | "    use_ssl = True,\n", 294 | "    verify_certs = True,\n", 295 | "    connection_class = RequestsHttpConnection,\n", 296 | "    pool_maxsize = 20\n", 297 | ")\n" 298 | ] 299 | }, 300 | { 301 | "cell_type": "markdown", 302 | "id": "3e383e23", 303 | "metadata": {}, 304 | "source": [ 305 | "## Use Retrieval Augmented Generation (RAG) for answering queries\n", 306 | "\n", 307 | "Now that we have set up the LLMs through Bedrock and the vector database through AOSS, we are ready to answer queries using RAG. The workflow is as follows:\n", 308 | "\n", 309 | "1. Convert the user query into embeddings.\n", 310 | "\n", 311 | "1. Use the embeddings to find similar documents from the vector database.\n", 312 | "\n", 313 | "1. Create a prompt by combining the user query and the similar documents retrieved from the vector database.\n", 314 | "\n", 315 | "1. Provide the prompt to the LLM to generate an answer to the user query." 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "id": "0224f2c4-b725-4f3a-84ac-914c4eba8a94", 321 | "metadata": {}, 322 | "source": [ 323 | "## Query 1\n", 324 | "\n", 325 | "Let us first ask our question to the model without providing any context, see the result, and then ask the same question with context provided using documents retrieved from AOSS to see if the answer improves!" 
326 | ] 327 | }, 328 | { 329 | "cell_type": "code", 330 | "execution_count": 11, 331 | "id": "d4be3215-3dde-4abd-8c38-45871e63d058", 332 | "metadata": { 333 | "tags": [] 334 | }, 335 | "outputs": [ 336 | { 337 | "data": { 338 | "text/markdown": [ 339 | "question=What versions of XGBoost are supported by Amazon SageMaker?
<br>answer=\n", 340 | "Amazon SageMaker supports XGBoost versions 0.90-1, 0.90-2, 1.0-1, 1.2-1, 1.3-1, and 1.5-1.\n", 341 | "<br>
\n" 342 | ], 343 | "text/plain": [ 344 | "" 345 | ] 346 | }, 347 | "metadata": {}, 348 | "output_type": "display_data" 349 | } 350 | ], 351 | "source": [ 352 | "# 1. Start with the query\n", 353 | "q = \"What versions of XGBoost are supported by Amazon SageMaker?\"\n", 354 | "\n", 355 | "# 2. Now create a prompt by combining the query and the context (which is empty at this time)\n", 356 | "context = \"\"\n", 357 | "prompt = PROMPT_TEMPLATE.format(context, q)\n", 358 | "\n", 359 | "# 3. Provide the prompt to the LLM to generate an answer to the query without any additional context provided\n", 360 | "response = claude_llm(prompt)\n", 361 | "printmd(f\"question={q.strip()}
<br>answer={response.strip()}<br>
\\n\")\n" 362 | ] 363 | }, 364 | { 365 | "cell_type": "markdown", 366 | "id": "a1f429bb-050d-4c81-b532-aa5b8e531990", 367 | "metadata": {}, 368 | "source": [ 369 | "**The answer provided above is incorrect**, as can be seen from the [SageMaker XGBoost Algorithm page](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html). The supported version numbers are \"1.0, 1.2, 1.3, 1.5, and 1.7\".\n", 370 | "\n", 371 | "Now, let us see if we can improve upon this answer by using additional information that is available to us in the vector database. **Also notice in the response below that the source of the documents being used as context is called out (the name of the file in the S3 bucket); this helps create confidence in the response generated by the LLM**." 372 | ] 373 | }, 374 | { 375 | "cell_type": "code", 376 | "execution_count": 12, 377 | "id": "371f86e8-157f-41b0-88a4-59a56f5507c9", 378 | "metadata": { 379 | "tags": [] 380 | }, 381 | "outputs": [ 382 | { 383 | "name": "stdout", 384 | "output_type": "stream", 385 | "text": [ 386 | "query -> What versions of XGBoost are supported by Amazon SageMaker?\n", 387 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"}\n", 388 | "see Extending our PyTorch containers. Use XGBoost as a Built-in Algortihm¶ Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don’t have to write a training script. If you don’t need the features and flexibility of open source XGBoost, consider using the built-in version. For information about using the Amazon SageMaker XGBoost built-in algorithm, see XGBoost Algorithm in the Amazon SageMaker Developer Guide. 
Use the Open Source XGBoost Algorithm¶ If you want the flexibility and additional features that it provides, use the SageMaker open source XGBoost algorithm. For which XGBoost versions are supported, see the AWS documentation. We recommend that you use the latest supported version because that’s where we focus most of our development efforts. For a complete example of using the open source XGBoost algorithm, see the sample notebook at https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb. For more information about XGBoost, see the XGBoost documentation. Train a Model with Open Source XGBoost¶ To train a model by using the Amazon SageMaker open source XGBoost algorithm: Prepare a training script Create a sagemaker.xgboost.XGBoost estimator Call the estimator’s fit method Prepare a Training Script¶ A typical training script loads data from the input channels, configures training with hyperparameters, trains a model,\n", 389 | "----------------\n", 390 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_frameworks_xgboost_using_xgboost.html\"}\n", 391 | "Models with Multi-Model Endpoints SageMaker XGBoost Classes SageMaker XGBoost Docker Containers eXtreme Gradient Boosting (XGBoost) is a popular and efficient machine learning algorithm used for regression and classification tasks on tabular datasets. It implements a technique known as gradient boosting on trees, which performs remarkably well in machine learning competitions. Amazon SageMaker supports two ways to use the XGBoost algorithm: XGBoost built-in algorithm XGBoost open source algorithm The XGBoost open source algorithm provides the following benefits over the built-in algorithm: Latest version - The open source XGBoost algorithm typically supports a more recent version of XGBoost. 
To see the XGBoost version that is currently supported, see XGBoost SageMaker Estimators and Models. Flexibility - Take advantage of the full range of XGBoost functionality, such as cross-validation support. You can add custom pre- and post-processing logic and run additional code after training. Scalability - The XGBoost open source algorithm has a more efficient implementation of distributed training, which enables it to scale out to more instances and reduce out-of-memory errors. Extensibility - Because the open source XGBoost container is open source, you can extend the container to install additional libraries and change the version of XGBoost that the container uses. For an example notebook that shows how to extend SageMaker containers, see Extending our PyTorch containers. Use XGBoost as a Built-in Algortihm¶ Amazon SageMaker provides XGBoost as a built-in algorithm that you can use like other built-in algorithms. Using the built-in algorithm version of XGBoost is simpler than using the open source version, because you don’t have to write\n", 392 | "----------------\n", 393 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_algorithms_tabular_xgboost.html\"}\n", 394 | "an expanded set of metrics than the original versions. It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5. The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker XGBoost algorithm. Notebook Title Description How to Create a Custom XGBoost container? This notebook shows you how to build a custom XGBoost Container with Amazon SageMaker Batch Transform. Regression with XGBoost using Parquet This notebook shows you how to use the Abalone dataset in Parquet to train a XGBoost model. How to Train and Host a Multiclass Classification Model? 
This notebook shows how to use the MNIST dataset to train and host a multiclass classification model. How to train a Model for Customer Churn Prediction? This notebook shows you how to train a model to Predict Mobile Customer Departure in an effort to identify unhappy customers. An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training This notebook shows you how to use Spot Instances for training with a XGBoost Container. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies. How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running. For instructions on how to create and access Jupyter\n", 395 | "----------------\n" 396 | ] 397 | }, 398 | { 399 | "data": { 400 | "text/markdown": [ 401 | "question=What versions of XGBoost are supported by Amazon SageMaker?
<br>answer=\n", 402 | "The XGBoost open source algorithm in Amazon SageMaker supports the latest version of XGBoost. The built-in XGBoost algorithm is based on XGBoost versions 1.0, 1.2, 1.3, and 1.5.\n", 403 | "<br>
\n" 404 | ], 405 | "text/plain": [ 406 | "" 407 | ] 408 | }, 409 | "metadata": {}, 410 | "output_type": "display_data" 411 | } 412 | ], 413 | "source": [ 414 | "# 1. Start with the query\n", 415 | "q = \"What versions of XGBoost are supported by Amazon SageMaker?\"\n", 416 | "\n", 417 | "# 2. Create the context by finding similar documents from the knowledge base\n", 418 | "context = create_context_for_query(q, embeddings, client, aoss_vector_index)\n", 419 | "\n", 420 | "# 3. Now create a prompt by combining the query and the context\n", 421 | "prompt = PROMPT_TEMPLATE.format(context, q)\n", 422 | "\n", 423 | "# 4. Provide the prompt to the LLM to generate an answer to the query based on context provided\n", 424 | "response = claude_llm(prompt)\n", 425 | "\n", 426 | "printmd(f\"question={q.strip()}
<br>answer={response.strip()}<br>
\\n\")\n" 427 | ] 428 | }, 429 | { 430 | "cell_type": "markdown", 431 | "id": "0ec1bd68-f61d-4f15-b152-3f9f54305fa8", 432 | "metadata": {}, 433 | "source": [ 434 | "## Query 2\n", 435 | "\n", 436 | "For the subsequent queries we use RAG directly." 437 | ] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": 13, 442 | "id": "2ffbe92d-5fcd-480d-a239-0c461f61f4a0", 443 | "metadata": { 444 | "tags": [] 445 | }, 446 | "outputs": [ 447 | { 448 | "name": "stdout", 449 | "output_type": "stream", 450 | "text": [ 451 | "query -> What are the different types of distributed training supported by SageMaker. Give a short summary of each.\n", 452 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_api_training_distributed.html\"}\n", 453 | "Archive Launch a Distributed Training Job Using the SageMaker Python SDK Release Notes SageMaker Distributed Data Parallel 1.8.0 Release Notes Release History The SageMaker Distributed Model Parallel Library¶ The SageMaker Distributed Model Parallel Library Overview Use the Library’s API to Adapt Training Scripts Version 1.11.0, 1.13.0, 1.14.0, 1.15.0 (Latest) Documentation Archive Run a Distributed Training Job Using the SageMaker Python SDK Configuration Parameters for distribution Ranking Basics without Tensor Parallelism Placement Strategy with Tensor Parallelism Prescaled Batch Release Notes SageMaker Distributed Model Parallel 1.15.0 Release Notes Release History Next Previous © Copyright 2023, Amazon Revision af4d7949. Built with Sphinx using a theme provided by Read the Docs. 
Read the Docs v: stable Versions stable v2.167.0 v2.166.0 v2.165.0 v2.164.0 v2.163.0 v2.162.0 v2.161.0 v2.160.0 v2.159.0 v2.158.0 v2.157.0 v2.156.0 v2.155.0 v2.154.0 v2.153.0 v2.152.0 v2.151.0 v2.150.0 v2.149.0 v2.148.0 v2.147.0 v2.146.1 v2.146.0 v2.145.0 v2.144.0 v2.143.0 v2.142.0 v2.141.0 v2.140.1 v2.140.0\n", 454 | "----------------\n", 455 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_api_training_distributed.html\"}\n", 456 | "sagemaker stable Filters: Example Dev Guide SDK Guide Using the SageMaker Python SDK Use Version 2.x of the SageMaker Python SDK APIs Feature Store APIs Training APIs Distributed Training APIs The SageMaker Distributed Data Parallel Library The SageMaker Distributed Data Parallel Library Overview Use the Library to Adapt Your Training Script Launch a Distributed Training Job Using the SageMaker Python SDK Release Notes The SageMaker Distributed Model Parallel Library The SageMaker Distributed Model Parallel Library Overview Use the Library’s API to Adapt Training Scripts Run a Distributed Training Job Using the SageMaker Python SDK Release Notes Inference APIs Governance APIs Utility APIs Frameworks Built-in Algorithms Workflows Amazon SageMaker Experiments Amazon SageMaker Debugger Amazon SageMaker Feature Store Amazon SageMaker Model Monitor Amazon SageMaker Processing Amazon SageMaker Model Building Pipeline sagemaker » APIs » Distributed Training APIs Edit on GitHub Distributed Training APIs¶ SageMaker distributed training libraries offer both data parallel and model parallel training strategies. They combine software and hardware technologies to improve inter-GPU and inter-node communications. They extend SageMaker’s training capabilities with built-in options that require only small code changes to your training scripts. 
The SageMaker Distributed Data Parallel Library¶ The SageMaker Distributed Data Parallel Library Overview Use the Library to Adapt Your Training Script For versions between 1.4.0 and 1.8.0 (Latest) Documentation Archive Launch a Distributed Training Job Using the SageMaker Python SDK Release Notes SageMaker Distributed Data Parallel 1.8.0 Release Notes Release History The SageMaker Distributed Model Parallel Library¶ The SageMaker Distributed Model Parallel Library Overview Use the Library’s API to Adapt Training Scripts Version 1.11.0, 1.13.0,\n", 457 | "----------------\n", 458 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n", 459 | "training by calling fit # Setting the wait to `False` would make the fit asynchronous estimator.fit(wait=False) # Get a list of S3 URIs S3Downloader.list(estimator.latest_job_debugger_artifacts_path()) Continuous analyses through rules¶ In addition to collecting the debugging data, Amazon SageMaker Debugger provides the capability for you to analyze it in a streaming fashion using “rules”. A SageMaker Debugger “rule” is a piece of code which encapsulates the logic for analyzing debugging data. SageMaker Debugger provides a set of built-in rules curated by data scientists and engineers at Amazon to identify common problems while training machine learning models. There is also support for using custom rule source codes for evaluation. In the following sections, you’ll learn how to use both the built-in and custom rules while training your model. Relationship between debugger hook and rules¶ Using SageMaker Debugger is, broadly, a two-pronged approach. On one hand you have the production of debugging data, which is done through the Debugger Hook, and on the other hand you have the consumption of this data, which can be with rules (for continuous analyses) or by using the SageMaker Debugger SDK (for interactive analyses). 
The production and consumption of data are defined independently. For example, you could configure the debugging hook to store only the collection “gradients” and then configure the rules to operate on some other collection, say, “weights”. While this is possible, it’s quite useless as it gives you no meaningful insight into the training process. This is because the rule will do nothing in this example scenario since it will wait\n", 460 | "----------------\n" 461 | ] 462 | }, 463 | { 464 | "data": { 465 | "text/markdown": [ 466 | "question=What are the different types of distributed training supported by SageMaker. Give a short summary of each.
<br>answer=\n", 467 | "SageMaker supports two main types of distributed training:\n", 468 | "\n", 469 | "1. SageMaker Distributed Data Parallel: This allows scaling model training across multiple GPUs and nodes by splitting the training data. It reduces training time by parallelizing computation.\n", 470 | "\n", 471 | "2. SageMaker Distributed Model Parallel: This allows training very large models that don't fit on a single GPU. It splits the model itself across multiple GPUs and synchronizes gradients during training. It removes memory constraints for large models.\n", 472 | "<br>
\n" 473 | ], 474 | "text/plain": [ 475 | "" 476 | ] 477 | }, 478 | "metadata": {}, 479 | "output_type": "display_data" 480 | } 481 | ], 482 | "source": [ 483 | "# 1. Start with the query\n", 484 | "q = \"What are the different types of distributed training supported by SageMaker. Give a short summary of each.\"\n", 485 | "\n", 486 | "# 2. Create the context by finding similar documents from the knowledge base\n", 487 | "context = create_context_for_query(q, embeddings, client, aoss_vector_index)\n", 488 | "\n", 489 | "# 3. Now create a prompt by combining the query and the context\n", 490 | "prompt = PROMPT_TEMPLATE.format(context, q)\n", 491 | "\n", 492 | "# 4. Provide the prompt to the LLM to generate an answer to the query based on context provided\n", 493 | "response = claude_llm(prompt)\n", 494 | "printmd(f\"question={q.strip()}
<br>answer={response.strip()}<br>
\\n\")\n" 495 | ] 496 | }, 497 | { 498 | "cell_type": "markdown", 499 | "id": "d8024b1f-3f99-406c-be1d-9368cd1440f4", 500 | "metadata": {}, 501 | "source": [ 502 | "## Query 3" 503 | ] 504 | }, 505 | { 506 | "cell_type": "code", 507 | "execution_count": 14, 508 | "id": "5444ae8c-0377-46ad-8d4e-2d41f575c289", 509 | "metadata": { 510 | "tags": [] 511 | }, 512 | "outputs": [ 513 | { 514 | "name": "stdout", 515 | "output_type": "stream", 516 | "text": [ 517 | "query -> What advantages does SageMaker debugger provide?\n", 518 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n", 519 | "having the TensorBoard data emitted from the hook in addition to the tensors will incur a cost to the training and may slow it down. Interactive analysis using SageMaker Debugger SDK and visualizations¶ Amazon SageMaker Debugger SDK also allows you to do interactive analyses on the debugging data produced from a training job run and to render visualizations of it. After calling fit() on the estimator, you can use the SDK to load the saved data in a SageMaker Debugger trial and do an analysis on the data: from smdebug.trials import create_trial s3_output_path = estimator.latest_job_debugger_artifacts_path() trial = create_trial(s3_output_path) To learn more about the programming model for analysis using the SageMaker Debugger SDK, see SageMaker Debugger Analysis. For a tutorial on what you can do after creating the trial and how to visualize the results, see SageMaker Debugger - Visualizing Debugging Results. Default behavior and opting out¶ For TensorFlow, Keras, MXNet, PyTorch and XGBoost estimators, the DebuggerHookConfig is always initialized regardless of specification while initializing the estimator. This is done to minimize code changes needed to get useful debugging information. 
To disable the hook initialization, you can do so by specifying False for value of debugger_hook_config in your framework estimator’s initialization: estimator = TensorFlow( role=role, instance_count=1, instance_type=instance_type, debugger_hook_config=False ) Learn More¶ Further documentation¶ API documentation: https://sagemaker.readthedocs.io/en/stable/debugger.html AWS\n", 520 | "----------------\n", 521 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n", 522 | "debugging hook to store only the collection “gradients” and then configure the rules to operate on some other collection, say, “weights”. While this is possible, it’s quite useless as it gives you no meaningful insight into the training process. This is because the rule will do nothing in this example scenario since it will wait for the tensors in the collection “gradients” which are never be emitted. For more useful and efficient debugging, configure your debugging hook to produce and store the debugging data that you care about and employ rules that operate on that particular data. This way, you ensure that the Debugger is utilized to its maximum potential in detecting anomalies. In this sense, there is a loose binding between the hook and the rules. Normally, you’d achieve this binding for a training job by providing values for both debugger_hook_config and rules in your estimator. However, SageMaker Debugger simplifies this by allowing you to specify the collection configuration within the Rule object itself. This way, you don’t have to specify the debugger_hook_config in your estimator separately. Using built-in rules¶ SageMaker Debugger comes with a set of built-in rules which can be used to identify common problems in model training, for example vanishing gradients or exploding tensors. You can choose to evaluate one or more of these rules while training your model to obtain meaningful insight into the training process. 
To learn more about these built in rules, see SageMaker Debugger Built-in Rules. Pre-defined debugger hook configuration for built-in rules¶ As mentioned earlier, for efficient\n", 523 | "----------------\n", 524 | "{\"source\":\"s3://sagemaker-kb-015469603702/sagemaker.readthedocs.io_en_stable_amazon_sagemaker_debugger.html\"}\n", 525 | "Specifying configurations for collections Collection Name Collection Parameters Hook Parameters Begin model training Continuous analyses through rules Relationship between debugger hook and rules Using built-in rules Pre-defined debugger hook configuration for built-in rules Sample Usages Using custom rules Sample Usage Capture real-time TensorBoard data from the debugging hook Interactive analysis using SageMaker Debugger SDK and visualizations Default behavior and opting out Learn More Further documentation Notebook examples Background¶ Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that encompasses the entire machine learning workflow. You can label and prepare your data, choose an algorithm, train a model, and then tune and optimize it for deployment. You can deploy your models to production with Amazon SageMaker to make predictions at lower costs than was previously possible. SageMaker Debugger provides a way to hook into the training process and emit debug artifacts (a.k.a. “tensors”) that represent the training state at each point in the training lifecycle. Debugger then stores the data in real time and uses rules that encapsulate logic to analyze tensors and react to anomalies. Debugger provides built-in rules and allows you to write custom rules for analysis. Setup¶ To get started, you must satisfy the following prerequisites: Specify an AWS Region where you’ll train your model. 
Give Amazon SageMaker the access to your data in Amazon Simple Storage Service (Amazon S3) needed to train your model by creating an IAM role ARN. See the AWS IAM documentation for how to fine tune the permissions needed. Capture\n", 526 | "----------------\n" 527 | ] 528 | }, 529 | { 530 | "data": { 531 | "text/markdown": [ 532 | "question=What advantages does SageMaker debugger provide?
<br>answer=\n", 533 | "SageMaker debugger provides the following advantages:\n", 534 | "\n", 535 | "- It allows you to hook into the training process and emit debug artifacts (tensors) that represent the training state at each point in the training lifecycle. \n", 536 | "\n", 537 | "- It stores the debug data in real time and uses rules to analyze tensors and react to anomalies.\n", 538 | "\n", 539 | "- It provides built-in rules and allows you to write custom rules for analysis.\n", 540 | "\n", 541 | "- It allows interactive analysis on the debugging data and visualization of results.\n", 542 | "\n", 543 | "- It minimizes code changes needed to get useful debugging information by automatically initializing the debugger hook for frameworks like TensorFlow, Keras, MXNet, PyTorch and XGBoost.\n", 544 | "\n", 545 | "<br>
\n" 546 | ], 547 | "text/plain": [ 548 | "" 549 | ] 550 | }, 551 | "metadata": {}, 552 | "output_type": "display_data" 553 | } 554 | ], 555 | "source": [ 556 | "# 1. Start with the query\n", 557 | "q = \"What advantages does SageMaker debugger provide?\"\n", 558 | "\n", 559 | "# 2. Create the context by finding similar documents from the knowledge base\n", 560 | "context = create_context_for_query(q, embeddings, client, aoss_vector_index)\n", 561 | "\n", 562 | "# 3. Now create a prompt by combining the query and the context\n", 563 | "prompt = PROMPT_TEMPLATE.format(context, q)\n", 564 | "\n", 565 | "# 4. Provide the prompt to the LLM to generate an answer to the query based on context provided\n", 566 | "response = claude_llm(prompt)\n", 567 | "\n", 568 | "printmd(f\"question={q.strip()}
<br>answer={response.strip()}<br>
\\n\")\n" 569 | ] 570 | }, 571 | { 572 | "cell_type": "code", 573 | "execution_count": null, 574 | "id": "f2e7ac93-f5ed-4c0c-99bf-03fa1ab7cf7e", 575 | "metadata": {}, 576 | "outputs": [], 577 | "source": [] 578 | } 579 | ], 580 | "metadata": { 581 | "kernelspec": { 582 | "display_name": "Python 3.9.18 ('bedrock_py39')", 583 | "language": "python", 584 | "name": "python3" 585 | }, 586 | "language_info": { 587 | "codemirror_mode": { 588 | "name": "ipython", 589 | "version": 3 590 | }, 591 | "file_extension": ".py", 592 | "mimetype": "text/x-python", 593 | "name": "python", 594 | "nbconvert_exporter": "python", 595 | "pygments_lexer": "ipython3", 596 | "version": "3.9.18" 597 | }, 598 | "vscode": { 599 | "interpreter": { 600 | "hash": "3ac4445fedcc02e0ec010c021cc980cd9c85bdedf3d57447a4cb4e8d37edc5f0" 601 | } 602 | } 603 | }, 604 | "nbformat": 4, 605 | "nbformat_minor": 5 606 | } 607 | -------------------------------------------------------------------------------- /setup_bedrock_conda.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # constants 4 | BEDROCK_CONDA_ENV=bedrock_py39 5 | PY_VER=3.9 6 | 7 | # create conda env 8 | conda remove -n $BEDROCK_CONDA_ENV --all -y 9 | conda create --name $BEDROCK_CONDA_ENV -y python=$PY_VER ipykernel 10 | source activate $BEDROCK_CONDA_ENV 11 | 12 | # all set to pip install the bedrock packages: awscli, boto3, and botocore 13 | # (the version specifiers are quoted so that the shell does not treat ">" as output redirection) 14 | pip install --no-build-isolation --force-reinstall "boto3>=1.28.57" "awscli>=1.29.57" "botocore>=1.31.57" 15 | pip install langchain==0.0.304 16 | pip install opensearch-py==2.3.1 17 | 18 | echo "all done" 19 | -------------------------------------------------------------------------------- /template.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | Description: 'Amazon OpenSearch Serverless template to create an IAM user, encryption policy, data access policy and 
collection' 3 | 4 | Metadata: 5 | AWS::CloudFormation::Interface: 6 | ParameterGroups: 7 | - Label: 8 | default: Required Parameters 9 | Parameters: 10 | - BedrockNotebookName 11 | ParameterLabels: 12 | BedrockNotebookName: 13 | default: Name of SageMaker Notebook Instance 14 | 15 | Parameters: 16 | IAMUserArn: 17 | AllowedPattern: "^arn:aws:iam::\\d{12}:user/[\\w+=,.@-]+|arn:aws:sts::\\d{12}:assumed-role/[\\w+=,.@-]+/[\\w+=,.@-]+$" 18 | Description: The ARN of the IAM user (or assumed role) running this CloudFormation template. 19 | Type: String 20 | AOSSCollectionName: 21 | Default: sagemaker-kb 22 | Type: String 23 | Description: Name of the Amazon OpenSearch Service Serverless (AOSS) collection. 24 | MinLength: 1 25 | MaxLength: 21 26 | AllowedPattern: ^[a-z0-9](-*[a-z0-9])* 27 | ConstraintDescription: Must be lowercase letters or numbers with a length of 1-21 characters. 28 | AOSSIndexName: 29 | Default: sagemaker-readthedocs-io 30 | Type: String 31 | Description: Name of the vector index in the Amazon OpenSearch Service Serverless (AOSS) collection. 
32 | 33 | 34 | Resources: 35 | 36 | CodeRepository: 37 | Type: AWS::SageMaker::CodeRepository 38 | Properties: 39 | GitConfig: 40 | RepositoryUrl: https://github.com/aws-samples/bedrock-kb-rag-workshop 41 | 42 | S3Bucket: 43 | Type: AWS::S3::Bucket 44 | Description: Creating Amazon S3 bucket to hold source data for knowledge base 45 | Properties: 46 | BucketName: !Join 47 | - '-' 48 | - - !Ref AOSSCollectionName 49 | - !Sub ${AWS::AccountId} 50 | 51 | cleanupBucketOnDelete: 52 | Type: Custom::cleanupbucket 53 | Properties: 54 | ServiceToken: !GetAtt 'DeleteS3Bucket.Arn' 55 | BucketName: !Ref S3Bucket 56 | DependsOn: S3Bucket 57 | 58 | DeleteS3Bucket: 59 | Type: AWS::Lambda::Function 60 | Properties: 61 | Handler: index.lambda_handler 62 | Description: "Delete all objects in S3 bucket" 63 | Timeout: 30 64 | Role: !GetAtt 'LambdaBasicExecutionRole.Arn' 65 | Runtime: python3.9 66 | Environment: 67 | Variables: 68 | BUCKET_NAME: !Ref S3Bucket 69 | Code: 70 | ZipFile: | 71 | import json, boto3, logging 72 | import cfnresponse 73 | logger = logging.getLogger() 74 | logger.setLevel(logging.INFO) 75 | 76 | def lambda_handler(event, context): 77 | logger.info("event: {}".format(event)) 78 | try: 79 | bucket = event['ResourceProperties']['BucketName'] 80 | logger.info("bucket: {}, event['RequestType']: {}".format(bucket,event['RequestType'])) 81 | if event['RequestType'] == 'Delete': 82 | s3 = boto3.resource('s3') 83 | bucket = s3.Bucket(bucket) 84 | for obj in bucket.objects.filter(): 85 | logger.info("delete obj: {}".format(obj)) 86 | s3.Object(bucket.name, obj.key).delete() 87 | 88 | sendResponseCfn(event, context, cfnresponse.SUCCESS) 89 | except Exception as e: 90 | logger.info("Exception: {}".format(e)) 91 | sendResponseCfn(event, context, cfnresponse.FAILED) 92 | 93 | def sendResponseCfn(event, context, responseStatus): 94 | responseData = {} 95 | responseData['Data'] = {} 96 | cfnresponse.send(event, context, responseStatus, responseData, "CustomResourcePhysicalID") 

  # Custom resource that invokes the file-copy Lambda at stack creation
  CustomSGResource:
    Type: AWS::CloudFormation::CustomResource
    Properties:
      ServiceToken: !GetAtt 'CustomFunctionCopyContentsToS3Bucket.Arn'
    # ensure the destination bucket exists before the copy runs
    DependsOn: S3Bucket

  LambdaBasicExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: S3Access
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: arn:aws:logs:*:*:*
              - Effect: Allow
                Action:
                  - s3:*
                Resource: '*'

  # Copies the sample documents from the public blog artifacts bucket into this account's bucket
  CustomFunctionCopyContentsToS3Bucket:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.lambda_handler
      Description: "Copies files from the Blog bucket to the bucket in this account"
      Timeout: 30
      Role: !GetAtt 'LambdaBasicExecutionRole.Arn'
      Runtime: python3.9
      Environment:
        Variables:
          AOSS_COLLECTION_NAME: !Ref AOSSCollectionName
      Code:
        ZipFile: |
          import os
          import json
          import boto3
          import logging
          import cfnresponse

          logger = logging.getLogger()
          logger.setLevel(logging.INFO)
          DATA_BUCKET = "aws-blogs-artifacts-public"
          SRC_PREFIX = "artifacts/ML-15729"
          MANIFEST = os.path.join(SRC_PREFIX, "manifest.txt")
          # s3://aws-blogs-artifacts-public/artifacts/ML-15729/docs/manifest.txt

          def lambda_handler(event, context):
              logger.info('got event {}'.format(event))
              if event['RequestType'] == 'Delete':
                  logger.info("copy files function called at the time of stack deletion, skipping")
                  response = dict(files_copied=0, error=None)
                  cfnresponse.send(event, context, cfnresponse.SUCCESS, response)
                  return
              try:
                  s3 = boto3.client('s3')
                  obj = s3.get_object(Bucket=DATA_BUCKET, Key=MANIFEST)
                  manifest_data = obj['Body'].iter_lines()
                  ctr = 0
                  # copy each file listed in the manifest into this account's bucket
                  for f in manifest_data:
                      fname = f.decode()
                      key = os.path.join(SRC_PREFIX, fname)
                      logger.info(f"going to read {key} from bucket={DATA_BUCKET}")
                      copy_source = {'Bucket': DATA_BUCKET, 'Key': key}
                      account_id = boto3.client('sts').get_caller_identity().get('Account')
                      bucket = boto3.resource('s3').Bucket(f"{os.environ.get('AOSS_COLLECTION_NAME')}-{account_id}")
                      dst_key = fname
                      logger.info(f"going to copy {copy_source} -> s3://{bucket.name}/{dst_key}")
                      bucket.copy(copy_source, dst_key)
                      ctr += 1
                  response = dict(files_copied=ctr, error=None)
                  cfnresponse.send(event, context, cfnresponse.SUCCESS, response)
              except Exception as e:
                  logger.error(e)
                  response = dict(files_copied=0, error=str(e))
                  cfnresponse.send(event, context, cfnresponse.FAILED, response)

              return

  AmazonBedrockExecutionRoleForKnowledgeBase:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Join
        - '-'
        - - AmazonBedrockExecutionRoleForKnowledgeBase
          - !Ref AOSSCollectionName
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: bedrock.amazonaws.com
            Action: sts:AssumeRole
            Condition:
              StringEquals:
                "aws:SourceAccount": !Sub "${AWS::AccountId}"
              ArnLike:
                "aws:SourceArn": !Sub "arn:aws:bedrock:${AWS::Region}:${AWS::AccountId}:knowledge-base/*"
      Path: /
      Policies:
        - PolicyName: S3ReadOnlyAccess
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - s3:Get*
                  - s3:List*
                  - s3:Describe*
                  - s3-object-lambda:Get*
                  - s3-object-lambda:List*
                Resource: '*'
        - PolicyName: AOSSAPIAccessAll
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - aoss:APIAccessAll
                Resource: !Sub arn:aws:aoss:${AWS::Region}:${AWS::AccountId}:collection/*
        - PolicyName: BedrockListAndInvokeModel
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - bedrock:ListCustomModels
                Resource: '*'
              - Effect: Allow
                Action:
                  - bedrock:InvokeModel
                Resource: !Sub arn:aws:bedrock:${AWS::Region}::foundation-model/*

  AmazonBedrockExecutionRoleForAgentsQA:
    Type: AWS::IAM::Role
    Properties:
      RoleName: AmazonBedrockExecutionRoleForAgents_SageMakerQA
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: bedrock.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
        - arn:aws:iam::aws:policy/AmazonBedrockFullAccess

  NotebookInstance:
    Type: AWS::SageMaker::NotebookInstance
    Properties:
      NotebookInstanceName: !Sub ${AWS::StackName}-notebook
      InstanceType: ml.t3.xlarge
      RoleArn: !GetAtt NotebookRole.Arn
      DefaultCodeRepository: !GetAtt CodeRepository.CodeRepositoryName

  NotebookRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Join
        - '-'
        - - !Ref AOSSCollectionName
          - NoteBookRole
      Policies:
        - PolicyName: CustomNotebookAccess
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Sid: BedrockFullAccess
                Effect: Allow
                Action:
                  - "bedrock:*"
                Resource: "*"
        - PolicyName: AOSSAPIAccessAll
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - aoss:APIAccessAll
                Resource: !Sub arn:aws:aoss:${AWS::Region}:${AWS::AccountId}:collection/*
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
        - arn:aws:iam::aws:policy/AmazonS3FullAccess
        - arn:aws:iam::aws:policy/AWSCloudFormationReadOnlyAccess
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - sagemaker.amazonaws.com
            Action:
              - 'sts:AssumeRole'
          - Effect: Allow
            Principal:
              Service:
                - bedrock.amazonaws.com
            Action:
              - 'sts:AssumeRole'

  # Grants the stack owner, the knowledge base role, and the notebook role access to the collection
  DataAccessPolicy:
    Type: 'AWS::OpenSearchServerless::AccessPolicy'
    Properties:
      Name: !Join
        - '-'
        - - !Ref AOSSCollectionName
          - access-policy
      Type: data
      Description: Access policy for AOSS collection
      Policy: !Sub >-
        [{"Description":"Access for cfn user","Rules":[{"ResourceType":"index","Resource":["index/*/*"],"Permission":["aoss:*"]},
        {"ResourceType":"collection","Resource":["collection/${AOSSCollectionName}"],"Permission":["aoss:*"]}],
        "Principal":["${IAMUserArn}", "${AmazonBedrockExecutionRoleForKnowledgeBase.Arn}", "${NotebookRole.Arn}"]}]

  NetworkPolicy:
    Type: 'AWS::OpenSearchServerless::SecurityPolicy'
    Properties:
      Name: !Join
        - '-'
        - - !Ref AOSSCollectionName
          - network-policy
      Type: network
      Description: Network policy for AOSS collection
      Policy: !Sub >-
        [{"Rules":[{"ResourceType":"collection","Resource":["collection/${AOSSCollectionName}"]}, {"ResourceType":"dashboard","Resource":["collection/${AOSSCollectionName}"]}],"AllowFromPublic":true}]

  EncryptionPolicy:
    Type: 'AWS::OpenSearchServerless::SecurityPolicy'
    Properties:
      Name: !Join
        - '-'
        - - !Ref AOSSCollectionName
          - security-policy
      Type: encryption
      Description: Encryption policy for AOSS collection
      Policy: !Sub >-
        {"Rules":[{"ResourceType":"collection","Resource":["collection/${AOSSCollectionName}"]}],"AWSOwnedKey":true}

  Collection:
    Type: 'AWS::OpenSearchServerless::Collection'
    Properties:
      Name: !Ref AOSSCollectionName
      Type: VECTORSEARCH
      Description: Collection to hold vector search data
    # an encryption policy must exist before the collection can be created
    DependsOn: EncryptionPolicy

Outputs:
  S3Bucket:
    Value: !GetAtt S3Bucket.Arn
  DashboardURL:
    Value: !GetAtt Collection.DashboardEndpoint
  CollectionARN:
    Value: !GetAtt Collection.Arn
  FilesCopied:
    Description: Files copied
    Value: !GetAtt 'CustomSGResource.files_copied'
  FileCopyError:
    Description: Files copy error
    Value: !GetAtt 'CustomSGResource.error'
  AOSSVectorIndexName:
    Description: vector index
    Value: !Ref AOSSIndexName
  Region:
    Description: Deployed Region
    Value: !Ref AWS::Region