├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── examples ├── sample_web_application │ ├── README.md │ ├── __init__.py │ ├── amplify_ui │ │ ├── amplify.yml │ │ ├── amplify │ │ │ ├── auth │ │ │ │ ├── pre-token-generation │ │ │ │ │ ├── handler.ts │ │ │ │ │ └── resources.ts │ │ │ │ └── resource.ts │ │ │ ├── backend.ts │ │ │ ├── package.json │ │ │ └── tsconfig.json │ │ ├── dist │ │ │ ├── assets │ │ │ │ ├── index-2zi0aqmG.css │ │ │ │ └── index-DguTJVXA.js │ │ │ ├── favicon.ico │ │ │ └── index.html │ │ ├── env.d.ts │ │ ├── index.html │ │ ├── package-lock.json │ │ ├── package.json │ │ ├── public │ │ │ └── favicon.ico │ │ ├── src │ │ │ ├── App.vue │ │ │ ├── assets │ │ │ │ ├── base.css │ │ │ │ ├── logo.svg │ │ │ │ └── main.css │ │ │ ├── components │ │ │ │ ├── AssistantConfig.vue │ │ │ │ ├── ChatComponent.vue │ │ │ │ ├── MenuBar.vue │ │ │ │ └── icons │ │ │ │ │ ├── IconCommunity.vue │ │ │ │ │ ├── IconDocumentation.vue │ │ │ │ │ ├── IconEcosystem.vue │ │ │ │ │ ├── IconSupport.vue │ │ │ │ │ └── IconTooling.vue │ │ │ └── main.ts │ │ ├── tsconfig.app.json │ │ ├── tsconfig.json │ │ ├── tsconfig.node.json │ │ └── vite.config.ts │ ├── app │ │ ├── __init__.py │ │ ├── api_models │ │ │ ├── __init__.py │ │ │ ├── assistant_model.py │ │ │ ├── bedrock_converse_model.py │ │ │ └── workflow_model.py │ │ ├── logging.yaml │ │ ├── main.py │ │ ├── requirements.txt │ │ └── run.sh │ ├── assistant_config.py │ ├── config_handler │ │ └── config_handler.py │ ├── dependencies │ │ └── python │ │ │ └── assistant_config_interface │ │ │ ├── __init__.py │ │ │ └── data_manager.py │ ├── imgs │ │ ├── Architecture.jpeg │ │ ├── amplify_deployed.jpeg │ │ ├── amplify_intro.jpeg │ │ ├── amplify_save_deploy.jpeg │ │ ├── app_settings.jpeg │ │ ├── download_backend_resources.png │ │ ├── repository_select.jpeg │ │ ├── sample.gif │ │ └── select_source_code.jpeg │ ├── lambda-jwt-verify │ │ └── src │ │ │ ├── index.mjs │ │ │ └── package.json │ ├── package-lock.json │ ├── statemachine │ │ └── 
rag_parallel_tasks │ │ │ └── RagGenAI.asl.json │ └── template.yaml └── serverless_assistant_rag │ ├── __init__.py │ ├── app │ ├── __init__.py │ ├── main.py │ ├── requirements.txt │ └── run.sh │ ├── statemachine │ └── rag_parallel_tasks │ │ └── RagGenAI.asl.json │ ├── template.yaml │ └── tests │ ├── __init__.py │ ├── openapi.json │ ├── st_serverless_assistant.py │ ├── system_prompt_samples.py │ ├── test.json │ └── test_stream_python_requests.py └── imgs ├── architecture-promptchain-stream.png ├── architecture-stream.jpg ├── assistant_example.gif └── stepfunctions-rag-graph.png /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. 
Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *main* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this 4 | software and associated documentation files (the "Software"), to deal in the Software 5 | without restriction, including without limitation the rights to use, copy, modify, 6 | merge, publish, distribute, sublicense, and/or sell copies of the Software, and to 7 | permit persons to whom the Software is furnished to do so. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 10 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A 11 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT 12 | HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 13 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 14 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Serverless GenAI Assistant 2 | 3 | This example implements a GenAI Assistant that streams responses using AWS Lambda with AWS Lambda Web Adapter and takes a low-code approach with AWS Step Functions to orchestrate workflows with Amazon Bedrock and other AWS services. 4 | 5 | ![Architecture](imgs/architecture-promptchain-stream.png) 6 | 7 | ### How does it work? 8 | 9 | The example uses AWS Lambda to execute a [Synchronous Express Workflow](https://docs.aws.amazon.com/step-functions/latest/dg/concepts-express-synchronous.html) that orchestrates multiple Bedrock API requests, allowing the user to implement prompt engineering techniques like [Prompt Chaining](https://www.promptingguide.ai/techniques/prompt_chaining) and [ReAct](https://www.promptingguide.ai/techniques/react) using [Workflow Studio](https://docs.aws.amazon.com/step-functions/latest/dg/workflow-studio-components.html) and the [ASL](https://states-language.net/spec.html), a JSON-based declarative language, to prototype and experiment with different integrations and prompt techniques without extensive effort. The Lambda function first invokes the [state machine](https://docs.aws.amazon.com/step-functions/latest/dg/getting-started-with-sfn.html#key-concepts-get-started) and uses its output as input to the [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) API, which returns a streamed response to the caller. Because the response is streamed, the TTFB (Time to First Byte) is shorter, improving the user experience of the GenAI assistant, which suits customer-facing service scenarios. 10 | 11 | The Lambda function uses [AWS Lambda Web Adapter](https://github.com/awslabs/aws-lambda-web-adapter/tree/main) and FastAPI. 
AWS Lambda Web Adapter allows developers to build web apps (HTTP APIs) with familiar frameworks (e.g. Express.js, Next.js, Flask, SpringBoot, ASP.NET and Laravel; anything that speaks HTTP 1.1/1.0). 12 | The Lambda function can be accessed using a [function url](https://docs.aws.amazon.com/lambda/latest/dg/lambda-urls.html) configured in [Response Stream mode](https://docs.aws.amazon.com/lambda/latest/dg/configuration-response-streaming.html). 13 | 14 | ### How to take advantage of this pattern? 15 | 16 | The combination of AWS Step Functions Express Workflows and the Bedrock response stream API enables scenarios where the builder can use state machine [Tasks](https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-task-state.html) to define prompt chaining [subtasks](https://docs.anthropic.com/claude/docs/chain-prompts#tips-for-effective-prompt-chaining). Each Step Functions Task can invoke the Amazon Bedrock API with a specific Large Language Model, in sequence or in parallel for non-dependent prompt subtasks, and then orchestrate the result with other AWS service APIs like AWS Lambda, Amazon API Gateway and [third-party APIs](https://docs.aws.amazon.com/step-functions/latest/dg/connect-third-party-apis.html). 17 | 18 | Breaking LLM tasks into subtasks makes the prompt engineering easier, since you focus on the specific result of each subtask. It is also easier to control the output tokens, which reduces the [latency](https://docs.anthropic.com/claude/docs/reducing-latency#2-optimize-prompt-and-output-length) of subtasks that don't need to generate large outputs. For example, you can take the user prompt, ask a more compact LLM like Claude 3 Haiku to return a boolean value, and use that value to define a deterministic path in the state machine with a [Choice state](https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-choice-state.html) that checks the true/false value. 
This consumes few output tokens, reducing latency and cost. The inference for the final response can then be executed by Claude 3 Sonnet using a streamed response, for better response quality and [TTFB](https://docs.anthropic.com/claude/docs/reducing-latency#measuring-latency). 19 | 20 | Here are a few scenarios where this pattern can be used: 21 | 22 | - RAG: Step Functions invokes Bedrock to augment the user input, improving the user question by adding keywords from the conversation context, and then invokes Knowledge Bases for Amazon Bedrock to execute a semantic search. While the "add keywords" and "invoke Knowledge Bases" tasks are low latency, the user response is generated using a streamed response for a better experience. 23 | - Router: Step Functions can be implemented as a [router](https://community.aws/content/2duj7Wn4o0gSC6gNMsSFpeYZ5uO/rethinking-ai-agents-why-a-simple-router-may-be-all-you-need) to merge deterministic and non-deterministic scenarios, for example: understand that a customer contact signals potential churn and start a retention workflow. 24 | - A/B Testing: Run A/B tests in Step Functions, using the fast pace of low-code implementation to try different experiments, make adjustments and pick the best one for your business. While you focus on the business rules, the Lambda function acts as an interface abstraction, since you don't need to change the Lambda code or the API contract for every experiment. 25 | - API calling: The user input can be used to prompt the LLM to generate data as JSON or XML, or to create a SQL query based on a table structure. You can then use the LLM output to execute Step Functions Tasks that use the generated data to call APIs and execute SQL queries on databases. The Lambda function can use the Step Functions output to stream a response that provides reasoning over the generated data. 
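The end-to-end invocation described above can be sketched with a short Python client. The URL below is a placeholder, and the payload fields (`system`, `messages`) are assumptions based on the examples later in this README; check `tests/test_stream_python_requests.py` for the actual client.

```python
import requests


def stream_assistant(lambda_url: str, system: str, user_message: str) -> None:
    """Invoke the serverless assistant Lambda URL and print the streamed chunks."""
    payload = {
        # Prompt instructions for the assistant (the first prompt of the chain).
        "system": system,
        # Conversation messages; the field names here are illustrative.
        "messages": [{"role": "user", "content": user_message}],
    }
    # stream=True keeps the connection open so chunks are printed as they arrive,
    # which is what makes the short TTFB visible to the caller.
    with requests.post(lambda_url, json=payload, stream=True) as response:
        response.raise_for_status()
        for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
            print(chunk, end="", flush=True)


if __name__ == "__main__":
    # Replace the placeholder with the TheFastAPIFunctionUrl stack output value.
    stream_assistant(
        "https://example.lambda-url.us-east-1.on.aws/",
        "You are a virtual assistant.",
        "What is a Synchronous Express Workflow?",
    )
```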
26 | --- 27 | ## Implementation details 28 | 29 | ### State Machine - RAG 30 | 31 | ![state_machine](imgs/stepfunctions-rag-graph.png) 32 | 33 | The state machine in ``examples/serverless_assistant_rag`` exemplifies how the workflow works: it implements a RAG architecture using [Claude 3 Haiku](https://aws.amazon.com/bedrock/claude/) with the [Messages API](https://docs.anthropic.com/claude/reference/messages-examples), and also [Meta Llama 3 chat](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3). 34 | 35 | 1. Receives the input data from Lambda and parallelizes it into two Bedrock API tasks. 36 | 2. The first task enhances the user question by adding keywords in order to augment the semantic search. It uses Claude 3 Haiku. 37 | 3. The second task is a bypass policy. It checks the conversation context and returns true/false to indicate whether a KB retrieval is really necessary. Skipping the retrieval reduces the answer latency. It uses Llama 3 8B Instruct. 38 | 4. If the result is false, the Knowledge Base is invoked using the user question plus the added keywords; otherwise no content is added. Note that there is an error handler for when the Retrieve task fails: it adds the error content to ``context_output`` and uses ``system_chain_data`` to modify the original instructions. 39 | 5. To pass a structured response, [Pass states](https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-pass-state.html) are used to format the JSON output. This technique keeps the Tasks flexible, using the Pass state to filter/format the output and preserve the API contract. 40 | 41 | 42 | ### Lambda 43 | 44 | The Lambda function is packaged as a Zip and the Lambda Adapter is attached as a layer, similar to the [fastapi-response-streaming-zip](https://github.com/awslabs/aws-lambda-web-adapter/tree/main/examples/fastapi-response-streaming-zip) example. 
After the deployment, check the CloudFormation output key TheFastAPIFunctionUrl, which contains the Lambda URL to invoke. 45 | 46 | Lambda invokes the express workflow defined in the `STATEMACHINE_STATE_MACHINE_ARN` environment variable and merges the content returned in the `ContextOutput` attribute into the `system` parameter. The `ContextOutput` is wrapped in [XML tags](https://docs.anthropic.com/claude/docs/use-xml-tags), which can be defined by the `content_tag` parameter. 47 | 48 | The Lambda function is also the last subtask of the prompt chain, and it's ready to call Claude models that support the Messages API. This means that the `ContextData` attribute in the state machine response is sent to the Bedrock stream API. The prompt instructions that guide the user interaction should be sent to Lambda by the caller using the [`system`](https://docs.anthropic.com/claude/docs/system-prompts) parameter. The Lambda function then streams the response to the caller in chunks; check the [Run](#run) section for examples. 49 | 50 | Here's an example of the prompt instructions format for the `system` parameter, where `content_tag` is the string 'document' and `ContextData` is a JSON object: 51 | 52 | ```python 53 | content_tag = 'document' 54 | 55 | # The prompt below is the default prompt sent to the lambda function by the caller during the lambda URL invocation 56 | system = """ 57 | 58 | You are a virtual assistant, use the JSON in the document xml tag to answer the user. 59 | Use the attribute 'data_source' to provide the URL used as a source. 60 | 61 | """ 62 | 63 | # ContextData is part of the Step Functions response 64 | ContextData = { 65 | "text": "Example", 66 | "data_source": "URL" 67 | } 68 | 69 | # Format a new system attribute, wrapping ContextData in the content_tag XML tags 70 | f"{system}<{content_tag}>{ContextData}</{content_tag}>" 71 | ``` 72 | 73 | >**Note:** For this sample the ``template.yaml`` defines the ``AuthType`` of ``FunctionUrlConfig`` as ``None``. 
This sample is intended to provide an experimentation environment where the user can explore the architecture. 74 | > For production use cases, check how to add an [auth](https://docs.aws.amazon.com/lambda/latest/dg/urls-auth.html) layer and how to [protect](https://aws.amazon.com/blogs/compute/protecting-an-aws-lambda-function-url-with-amazon-cloudfront-and-lambdaedge/) the Lambda URL. 75 | 76 | 77 | ### Prompt Instructions 78 | 79 | #### First prompt 80 | The prompt engineering architecture for this sample explores the [Prompt Chaining](https://www.promptingguide.ai/techniques/prompt_chaining) technique. To implement this technique, the Lambda function expects to receive instructions in the ``system`` parameter. 81 | Use the first invocation to pass the instructions about how you expect the assistant to answer the user. In general, use the first prompt to define techniques like [Role Playing](https://docs.anthropic.com/claude/docs/give-claude-a-role), 82 | [Few-Shot Prompting](https://www.promptingguide.ai/techniques/fewshot) and others. Using the first prompt to handle the more complex instructions allows you to use 83 | the ``InvokeModelWithResponseStream`` API with powerful models that tend to be slower, while the short TTFB helps achieve better responsiveness. 84 | 85 | 86 | 87 | #### Next prompts 88 | 89 | The next prompts are defined in each Step Functions Task. Whether you reuse the first prompt in the Step Functions Tasks depends on the use case. 90 | Design the prompts in Step Functions Tasks in a concise and straightforward way, focused on solving a specific problem. Experiment with different models. 91 | 92 | You can check the first prompt example at ``examples/serverless_assistant_rag/tests/system_prompt_samples.py`` and the prompt for each Task at ``examples/serverless_assistant_rag/statemachine/rag_parallel_tasks/RagGenAI.asl.json``. 
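As an illustration of the guidance above, a first prompt can combine Role Playing and Few-Shot Prompting in the ``system`` parameter. The content and payload shape below are hypothetical; see ``tests/system_prompt_samples.py`` for the actual prompt samples.

```python
# Illustrative first prompt combining Role Playing and Few-Shot Prompting.
# The real samples live in tests/system_prompt_samples.py.
system = """
You are a support assistant for an online bookstore.
Only answer questions about orders and catalog items.

Example interaction:
User: Where is my order 123?
Assistant: You can track it under 'My Orders'; it currently shows as shipped.
"""

# The caller sends these instructions through the `system` parameter of the
# Lambda URL payload; the payload shape follows the README examples.
payload = {
    "system": system,
    "messages": [{"role": "user", "content": "Do you sell audiobooks?"}],
}

print(payload["system"].strip().splitlines()[0])
# → You are a support assistant for an online bookstore.
```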
93 | 94 | --- 95 | ### Passing data to Step Functions state machine 96 | 97 | You can send data alongside the LLM parameters, through Lambda to the Step Functions state machine, using the optional parameter ``state_machine_custom_params``. It expects a ``dict`` where you can choose the data you want to use in the state machine. 98 | As an example, if you would like to send information about a logged-in user, you can use ``state_machine_custom_params`` and handle this information in the Step Functions state machine. Once the data is sent to Step Functions, you can use all the resources in the 99 | state machine to handle the information. 100 | 101 | ### Chaining to InvokeModelWithResponseStream 102 | 103 | In this example, ``InvokeModelWithResponseStream`` is the last subtask of the prompt chain. This means that if the execution 104 | workflow in Step Functions produces new data relevant to the original prompt instructions, it needs to send this data to the stream API separately 105 | from the ``context_data`` parameter, because ``context_data`` is used to provide information chunks that help the LLM generate an answer, not instructions. To handle this scenario you can use the ``system_chain_data`` parameter. 106 | 107 | The ``system_chain_data`` block takes the original instructions in the ``system`` parameter and allows you to apply the following operations 108 | to that prompt chain execution instance: 109 | 110 | - ``APPEND`` - Adds new instructions to the end of the original ``system`` parameter 111 | - ``REPLACE_TAG`` - Replaces the first occurrence of the mapped tag in ``system`` with new data. It provides a mechanism to 112 | merge new instructions into the original prompt instructions. 113 | - ``REPLACE_ALL`` - Replaces the whole prompt in ``system`` with the content of the ``system_chain_prompt`` parameter. Useful for cases where a system error happened, 114 | or for scenarios that need a new prompt for each flow. 115 | 116 | ##### Why use the system_chain_data attribute? 117 | It saves you from creating an overly complex prompt. 
When you add many instructions to the LLM at once, the prompt engineering tends to 118 | get more complex in order to deal with edge scenarios. Instead of creating a prompt with a few rules for how to answer questions plus 119 | exception rules for specific cases, which demands more reasoning from the model, you can manipulate the instructions based on the path of your workflow 120 | and give direct instructions so your LLM does not have to deal with ambiguity. 121 | 122 | #### Syntax 123 | 124 | 125 | 126 | 127 | ``` 128 | "system_chain_data": { 129 | "system_chain_prompt": string, 130 | "operation": APPEND | REPLACE_TAG | REPLACE_ALL, 131 | "configuration": { 132 | "replace_tag": string 133 | } 134 | } 135 | ``` 136 | 137 | Parameters: 138 | 139 | - system_chain_data (dict) 140 | - system_chain_prompt (string) - Prompt instructions 141 | - operation (string) - The action to be executed on the Lambda ``system`` parameter 142 | - configuration (dict) - At the moment, configuration only supports the ``REPLACE_TAG`` operation 143 | - replace_tag (string) - Required if ``REPLACE_TAG`` is used. The tag to replace in the original content of the ``system`` 144 | parameter. 145 | 146 | #### Usage 147 | 148 | Add the ``system_chain_data`` block in the Step Functions state machine and check that it is sent as part of the state 149 | machine response. You can check implementation examples in ``examples/serverless_assistant_rag/statemachine/rag_parallel_tasks/RagGenAI.asl.json``, at the ``GenerateResponse`` and ``GenerateResponseKB`` states. 
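To make the three operations concrete, here is a hypothetical Python sketch of how the ``system_chain_data`` block could be applied to the original ``system`` prompt. The real logic lives in the Lambda FastAPI app, so the function and field names here are assumptions.

```python
# Hypothetical sketch of applying the system_chain_data operations to the
# original `system` prompt. Names are illustrative, not the actual API.
def apply_system_chain_data(system: str, system_chain_data: dict) -> str:
    prompt = system_chain_data["system_chain_prompt"]
    operation = system_chain_data["operation"]

    if operation == "APPEND":
        # Add new instructions to the end of the original system prompt.
        return system + prompt
    if operation == "REPLACE_TAG":
        # Replace the first occurrence of the configured tag with the new data.
        tag = system_chain_data["configuration"]["replace_tag"]
        return system.replace(tag, prompt, 1)
    if operation == "REPLACE_ALL":
        # Discard the original instructions entirely.
        return prompt
    raise ValueError(f"Unsupported operation: {operation}")


# APPEND adds an extra rule after the original instructions.
print(apply_system_chain_data(
    "You are a virtual assistant.",
    {"system_chain_prompt": " Answer in one sentence.", "operation": "APPEND"},
))
# → You are a virtual assistant. Answer in one sentence.
```

The ``REPLACE_TAG`` branch is what lets a state machine error handler swap a placeholder tag in the original instructions for recovery guidance without rewriting the whole prompt.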
150 | 151 | 152 | --- 153 | ## Deployment 154 | 155 | ### Requirements 156 | - [AWS SAM](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html) 157 | - [AWS Account](https://aws.amazon.com/resources/create-account/) 158 | - [Request access to Anthropic Claude 3 (Sonnet and Haiku), and Meta Llama 3 models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) 159 | - [Knowledge base to execute the RAG state machine example](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html) 160 | 161 | Note: If you are interested in deploying as a Docker image, use [this example](https://github.com/awslabs/aws-lambda-web-adapter/tree/main/examples/fastapi-response-streaming) 162 | 163 | ### Build and Deploy 164 | 165 | Run the following commands to build and deploy the application to Lambda. 166 | 167 | ```bash 168 | cd serverless-genai-assistant/examples/serverless_assistant_rag 169 | sam build 170 | sam deploy --guided 171 | ``` 172 | For the guided deployment, set the following options: 173 | ```bash 174 | Stack Name []: #Choose a stack name 175 | AWS Region [us-east-1]: #Select a Region that supports Amazon Bedrock and the other AWS services 176 | Parameter KnowledgeBaseId []: #Insert the KB ID https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-manage.html#kb- 177 | #Shows the resource changes to be deployed and requires a 'Y' to initiate the deploy 178 | Confirm changes before deploy [y/N]: 179 | #SAM needs permission to be able to create roles to connect to the resources in your template 180 | Allow SAM CLI IAM role creation [Y/n]: 181 | #Preserves the state of previously provisioned resources when an operation fails 182 | Disable rollback [y/N]: 183 | FastAPIFunction Function Url has no authentication. Is this okay? 
[y/N]: y 184 | Save arguments to configuration file [Y/n]: 185 | SAM configuration file [samconfig.toml]: 186 | SAM configuration environment [default]: 187 | ``` 188 | 189 | >Copy the TheFastAPIFunctionUrl value; it's the URL used to invoke the Lambda function. 190 | 191 | >**Note:** If you don't provide a Knowledge Base ID, you will receive an answer from the LLM reporting a Bedrock error. For the complete experience, check the [Requirements](#requirements) and create the Knowledge Base. 192 | 193 | 194 | 195 | ## Run 196 | 197 | ### Terminal 198 | 199 | ```bash 200 | cd serverless-genai-assistant/examples/serverless_assistant_rag/tests 201 | curl --no-buffer -H "Content-Type: application/json" -X POST -d @test.json 202 | ``` 203 | 204 | ### Python 205 | ```bash 206 | cd serverless-genai-assistant/examples/serverless_assistant_rag/tests 207 | pip install requests 208 | python test_stream_python_requests.py --lambda-url 209 | ``` 210 | 211 | ### Streamlit 212 | ```bash 213 | cd serverless-genai-assistant/examples/serverless_assistant_rag/tests 214 | pip install requests 215 | pip install streamlit 216 | streamlit run st_serverless_assistant.py -- --lambda-url 217 | ``` 218 | 219 | ### Serverless Gen AI Assistant on Streamlit 220 | 221 | ![Demo](imgs/assistant_example.gif) -------------------------------------------------------------------------------- /examples/sample_web_application/README.md: -------------------------------------------------------------------------------- 1 | # Serverless GenAI Assistant - Web Application Sample 2 | 3 | ## Overview 4 | 5 | The Serverless GenAI Assistant - Web Application Sample demonstrates a pattern that combines serverless architecture with generative AI capabilities, showcasing both frontend and backend components. 6 | 7 | To enable this UI, the [deep-chat](https://github.com/OvidijusParsiunas/deep-chat) component is used to handle the chat conversation. 
The deep-chat component is a flexible solution that enables the integration of chatbot/GenAI APIs with a frontend component. 8 | To make this integration possible, it was necessary to create a [custom connection](https://deepchat.dev/docs/connect#request). Check `ChatComponent.vue` if you want to understand how it can be done. 9 | 10 | ![Architecture](imgs/Architecture.jpeg) 11 | 12 | 1. Frontend: The user interface is hosted and managed using AWS Amplify. It handles the user request. 13 | 2. Authentication: Amazon Cognito is used for secure user authentication. 14 | 3. API Gateway: Manages API requests to retrieve configuration data. The JWT is also validated in API Gateway. 15 | 4. Lambda Configuration Handler: Holds the logic to identify the API Gateway path requested and retrieve data using the Data Manager Lambda layer, which provides a DynamoDB interface. 16 | 5. CloudFront + JWT Verify: Provides an endpoint able to deliver a streamed response for authorized requests. 17 | 6. Lambda FastAPI: Using Lambda Web Adapter, it receives requests from CloudFront. All the requests are SigV4 signed, and the Cognito access token is used to retrieve data through the Data Manager layer. 18 | 7. Step Functions: Used to provide custom GenAI workflows for different users. 19 | 8. Amazon Bedrock: Uses the Converse API with response streaming. 20 | 21 | 22 | ## Work in Progress 23 | This readme is in progress and will be updated with more architecture details and usage instructions. 24 | 25 | ## Configuration data using Single Table Design 26 | 27 | The information is stored based on an `account_id` that can represent a single user or multiple users. The `account_id` is added as a custom attribute 28 | in Cognito and is included in the 29 | JWT during the authentication phase on Cognito. 
30 | 31 | | Category | Field | Value | Description | 32 | |----------|------------------|---------------------------------------------------------------------|-------------------------------------------------------------------------------------| 33 | | **Account Details** | 34 | | | Account ID | uuid4 | Unique identifier for the account | 35 | | | Name | Default account | Name of the account | 36 | | | Description | Test account | Brief description of the account | 37 | | **Inference Endpoint** | 38 | | | Type | cloudfront | The type of endpoint used | 39 | | | URL | http://cloudfront_domain/path | The URL for the inference endpoint | 40 | | | Description | Endpoint for inference with stream response | Explanation of the endpoint's purpose | 41 | | **Workflow Details** | 42 | | | Workflow ID | 1 | Identifier for the workflow | 43 | | | Name | Test Workflow for account 1 | Name of the workflow | 44 | | | Type | stepfunctions | Type of the workflow | 45 | | | ARN | arn:aws:states:::stateMachine: | Amazon Resource Name for the workflow | 46 | | | assistant_params | String representation of a JSON | Contains the parameters to be sent to the serverless assistant for the specific workflow | 47 | 48 | 49 | ## Features 50 | 51 | - Scalable serverless architecture 52 | - GenAI with Stream Response 53 | - AWS Step Functions to create low-code GenAI Workflows 54 | - Cognito user authentication 55 | - Web-based user interface 56 | - Cognito Users can access different Workflow resources 57 | 58 | ## Architecture 59 | 60 | The application leverages several AWS services: 61 | 62 | - **Frontend**: Hosted and managed using AWS Amplify 63 | - **Authentication**: Implemented with Amazon Cognito 64 | - **Backend**: Deployed using AWS Serverless Application Model (SAM) 65 | 66 | 67 | ## Prerequisites 68 | 69 | - AWS Account 70 | - [AWS CLI](https://aws.amazon.com/cli/) installed and configured 71 | - [Python 3.12](https://www.python.org/downloads/) 72 | - 
[Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#installation) 73 | - [AWS SAM CLI](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) installed 74 | - [Node.js](https://nodejs.org/) and npm installed 75 | - Git 76 | 77 | ## Deployment 78 | 79 | The solution is deployed in two parts: 80 | 81 | 1. Amplify V2 (Frontend + Cognito) 82 | 2. SAM Template (Backend) 83 | 84 | Note: It's planned to move the stack to CDK on Amplify. For now, it's necessary to configure the integration between the 85 | stacks. 86 | 87 | ### Frontend Deployment (Amplify) 88 | 89 | This section uses the process as shown in the [Amplify Quickstart](https://docs.amplify.aws/vue/start/quickstart/) 90 | 91 | 1. Create a new repository on GitHub: 92 | ``` 93 | https://github.com/new 94 | ``` 95 | 96 | 2. Navigate to the `amplify_ui` directory: 97 | ```bash 98 | cd serverless-genai-assistant/examples/sample_web_application/amplify_ui 99 | ``` 100 | 101 | 3. Initialize and push to the new repository: 102 | ```bash 103 | git init 104 | git add . 105 | git commit -m "Initial commit" 106 | git branch -M main 107 | git remote add origin git@github.com:/.git 108 | git push -u origin main 109 | ``` 110 | 111 | 4. Deploy the frontend using AWS Amplify Console: 112 | - Go to AWS Amplify Console 113 | - Click "Create new app" > "Host web app" 114 | 115 | ![Amplify Intro](imgs/amplify_intro.jpeg) 116 | 117 | - Select your git provider 118 | 119 | ![Select Source Code](imgs/select_source_code.jpeg) 120 | 121 | - Select your repository 122 | 123 | ![Repository Select](imgs/repository_select.jpeg) 124 | 125 | - Click on Next 126 | 127 | ![App Settings](imgs/app_settings.jpeg) 128 | 129 | - Click on Save & Deploy 130 | 131 | ![Amplify Save Deploy](imgs/amplify_save_deploy.jpeg) 132 | 133 | 5. After deployment, note the Amplify Domain for accessing the UI. 
134 | - Click on the Branch 135 | 136 | ![Amplify Deployed](imgs/amplify_deployed.jpeg) 137 | 138 | 6. Download `amplify_outputs.json` from the Amplify Console: 139 | - Click on the branch to see more details 140 | - Click on "Deployed backend resources" and Download amplify_outputs.json 141 | 142 | ![Download Backend Resources](imgs/download_backend_resources.png) 143 | 144 | Copy it to the `amplify_ui` directory: 145 | ```bash 146 | cp /amplify_outputs.json serverless-genai-assistant/examples/sample_web_application/amplify_ui 147 | ``` 148 | 149 | ### Backend Deployment (SAM) 150 | 151 | 1. Update Cognito parameters in `lambda-jwt-verify/src/index.mjs`: 152 | ```javascript 153 | const verifier = CognitoJwtVerifier.create({ 154 | userPoolId: "INSERT_COGNITO_USER_POOL_ID", 155 | tokenUse: "access", 156 | clientId: "INSERT_COGNITO_CLIENT_ID", 157 | }); 158 | ``` 159 | 160 | 2. Deploy the backend using the SAM CLI; you will be prompted for the Cognito UserPoolId and UserPoolClientId: 161 | ```bash 162 | cd serverless-genai-assistant/examples/sample_web_application/ 163 | sam build 164 | sam deploy 165 | ``` 166 | 167 | 3. Update `assistant_config.py` with the deployment outputs and run it: 168 | ```bash 169 | python serverless-genai-assistant/examples/sample_web_application/assistant_config.py 170 | ``` 171 | 172 | 4. Update the `ConfigUrl` and the region in `amplify_ui/src/main.ts` with the SAM output endpoint. 173 | 174 | ```javascript 175 | AssistantConfigApi: { 176 | endpoint: 177 | '', 178 | region: '' 179 | } 180 | ``` 181 | 182 | 5. Commit the Amplify frontend with the updated configuration. 183 | 184 | ```bash 185 | cd serverless-genai-assistant/examples/sample_web_application/amplify_ui 186 | git commit -am "API Gateway endpoint Updated" 187 | git push 188 | ``` 189 | 190 | Amplify will automatically deploy the changes. 191 | 192 | ## Usage 193 | 194 | 1. Access the application using the Amplify URL provided after deployment. 195 | 2. 
Log in using the credentials generated by the `assistant_config.py` script. 196 | 3. Interact with the GenAI Assistant through the web interface. 197 | 198 | ![Web App Sample](imgs/sample.gif) 199 | -------------------------------------------------------------------------------- /examples/sample_web_application/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/__init__.py -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/amplify.yml: -------------------------------------------------------------------------------- 1 | version: 1 2 | backend: 3 | phases: 4 | build: 5 | commands: 6 | - npm ci --cache .npm --prefer-offline 7 | - npx ampx pipeline-deploy --branch $AWS_BRANCH --app-id $AWS_APP_ID 8 | frontend: 9 | phases: 10 | build: 11 | commands: 12 | - npm run build 13 | artifacts: 14 | baseDirectory: dist 15 | files: 16 | - '**/*' 17 | cache: 18 | paths: 19 | - .npm/**/* 20 | - node_modules/**/* 21 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/amplify/auth/pre-token-generation/handler.ts: -------------------------------------------------------------------------------- 1 | import type { PreTokenGenerationTriggerHandler } from "aws-lambda"; 2 | 3 | //https://aws.amazon.com/blogs/security/how-to-customize-access-tokens-in-amazon-cognito-user-pools/ 4 | //https://docs.amplify.aws/vue/build-a-backend/functions/examples/override-token/ 5 | 6 | 7 | //Note that any is used here to allow the usage of access token customization. 
8 | export const handler: any = async (event: any) => { 9 | event.response = { 10 | claimsAndScopeOverrideDetails: { 11 | accessTokenGeneration: { 12 | claimsToAddOrOverride: { 13 | "custom:account_id": event.request.userAttributes["custom:account_id"] 14 | }, 15 | } 16 | } 17 | }; 18 | return event; 19 | }; -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/amplify/auth/pre-token-generation/resources.ts: -------------------------------------------------------------------------------- 1 | import { defineFunction } from '@aws-amplify/backend'; 2 | 3 | //https://aws.amazon.com/blogs/security/how-to-customize-access-tokens-in-amazon-cognito-user-pools/ 4 | //https://docs.amplify.aws/vue/build-a-backend/functions/examples/override-token/ 5 | export const preTokenGeneration = defineFunction({ 6 | name: 'pre-token-generation' 7 | }); 8 | 9 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/amplify/auth/resource.ts: -------------------------------------------------------------------------------- 1 | import { defineAuth } from '@aws-amplify/backend'; 2 | //import { preTokenGeneration } from './pre-token-generation/resources'; 3 | 4 | /** 5 | * Define and configure your auth resource 6 | * @see https://docs.amplify.aws/gen2/build-a-backend/auth 7 | */ 8 | export const auth = defineAuth({ 9 | loginWith: { 10 | email: true, 11 | }, 12 | groups: ['ServerlessAssistantUser', 'ServerlessAssistantOwner', 'ServerlessAssistantAdmin'] 13 | }); 14 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/amplify/backend.ts: -------------------------------------------------------------------------------- 1 | /* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
2 | # SPDX-License-Identifier: MIT-0 */ 3 | import { defineBackend } from '@aws-amplify/backend'; 4 | import { auth } from './auth/resource'; 5 | import { preTokenGeneration } from './auth/pre-token-generation/resources'; 6 | import * as iam from 'aws-cdk-lib/aws-iam'; 7 | import * as lambda from 'aws-cdk-lib/aws-lambda'; 8 | 9 | 10 | const backend = defineBackend({ 11 | auth, 12 | preTokenGeneration 13 | }); 14 | 15 | //https://docs.amplify.aws/react/build-a-backend/auth/modify-resources-with-cdk/#custom-attributes 16 | 17 | // extract L1 CfnUserPool resources 18 | const { cfnUserPool } = backend.auth.resources.cfnResources; 19 | 20 | 21 | // update the schema property to add custom attributes 22 | const user_pool_attributes = [ 23 | { 24 | name: 'account_id', 25 | attributeDataType: 'String', 26 | mutable: false 27 | }, 28 | { 29 | name: 'account_name', 30 | attributeDataType: 'String', 31 | mutable: false 32 | }, 33 | { 34 | name: 'group_id', 35 | attributeDataType: 'String', 36 | mutable: false 37 | }, 38 | { 39 | name: 'group_name', 40 | attributeDataType: 'String', 41 | mutable: false 42 | }, 43 | { 44 | name: 'user_id', 45 | attributeDataType: 'String', 46 | mutable: false 47 | }, 48 | { 49 | name: 'user_name', 50 | attributeDataType: 'String', 51 | mutable: false 52 | }]; 53 | 54 | 55 | if (Array.isArray(cfnUserPool.schema)) { 56 | cfnUserPool.schema.push(...user_pool_attributes) 57 | } 58 | 59 | // advanced security mode is required to support customization of the access token 60 | cfnUserPool.userPoolAddOns = { 61 | advancedSecurityMode: 'AUDIT', 62 | }; 63 | 64 | // Enables request version V2_0 to implement access token customization 65 | cfnUserPool.lambdaConfig = { 66 | preTokenGenerationConfig: { 67 | lambdaArn: backend.preTokenGeneration.resources.lambda.functionArn, 68 | lambdaVersion: 'V2_0', 69 | 70 | } 71 | } 72 | 73 | 74 | const invokeFunctionRole = new iam.Role(cfnUserPool, 'CognitoInvokeLambda', { 75 | assumedBy: new iam.ServicePrincipal('cognito-idp.amazonaws.com')
76 | 77 | }); 78 | 79 | // Loads created preTokenGeneration Lambda on cfnUserPool resource to add Invoke permission 80 | // https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_lambda.FunctionAttributes.html 81 | const createLambdaResourcePolicy = lambda.Function.fromFunctionAttributes(cfnUserPool, 'PreToken Function', { 82 | functionArn: backend.preTokenGeneration.resources.lambda.functionArn, 83 | sameEnvironment: true, 84 | skipPermissions: true 85 | }) 86 | 87 | createLambdaResourcePolicy.addPermission('invoke-lambda', { 88 | principal: new iam.ServicePrincipal('cognito-idp.amazonaws.com'), 89 | action: 'lambda:InvokeFunction', 90 | sourceArn: cfnUserPool.attrArn, 91 | }); 92 | 93 | 94 | 95 | 96 | 97 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/amplify/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "type": "module" 3 | } -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/amplify/tsconfig.json: -------------------------------------------------------------------------------- 1 | { 2 | "compilerOptions": { 3 | "target": "es2022", 4 | "module": "es2022", 5 | "moduleResolution": "bundler", 6 | "resolveJsonModule": true, 7 | "esModuleInterop": true, 8 | "forceConsistentCasingInFileNames": true, 9 | "strict": true, 10 | "skipLibCheck": true, 11 | "paths": { 12 | "$amplify/*": [ 13 | "../.amplify/generated/*" 14 | ] 15 | } 16 | } 17 | } -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/dist/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/amplify_ui/dist/favicon.ico 
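The pre-token-generation trigger wired up in `amplify/backend.ts` and `handler.ts` above copies the `custom:account_id` user attribute into the access token. As a minimal illustrative sketch (the real handler is TypeScript), the claim-override merge can be modeled in plain Python; the event dict below mirrors the Cognito V2_0 trigger payload used above:

```python
# Pure-Python model of the claim override performed by the pre-token-generation
# handler above; illustrative only, not part of the sample's deployed code.
def override_access_token_claims(event: dict) -> dict:
    # Read the custom attribute from the incoming trigger event
    account_id = event["request"]["userAttributes"]["custom:account_id"]
    # Write it back as an access-token claim override (V2_0 response shape)
    event["response"] = {
        "claimsAndScopeOverrideDetails": {
            "accessTokenGeneration": {
                "claimsToAddOrOverride": {"custom:account_id": account_id}
            }
        }
    }
    return event
```

With this override in place, the backend can read the account id directly from the caller's access token instead of looking it up on every request.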
-------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/dist/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | Vite App 8 | 9 | 10 | 11 | 12 |
13 | 14 | 15 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/env.d.ts: -------------------------------------------------------------------------------- 1 | /// 2 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | Vite App 8 | 9 | 10 |
11 | 12 | 13 | 14 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "amplify-vue-template", 3 | "version": "0.0.0", 4 | "private": true, 5 | "type": "module", 6 | "scripts": { 7 | "dev": "vite", 8 | "build": "run-p type-check \"build-only {@}\" --", 9 | "preview": "vite preview", 10 | "build-only": "vite build", 11 | "type-check": "vue-tsc --build --force" 12 | }, 13 | "dependencies": { 14 | "@aws-amplify/ui-vue": "^4.2.8", 15 | "@microsoft/fetch-event-source": "^2.0.1", 16 | "aws-amplify": "^6.2.0", 17 | "deep-chat": "^1.4.11", 18 | "vue": "^3.4.21" 19 | }, 20 | "devDependencies": { 21 | "@aws-amplify/backend": "^1.0.0", 22 | "@aws-amplify/backend-cli": "^1.0.1", 23 | "@tsconfig/node20": "^20.1.4", 24 | "@types/node": "^20.12.5", 25 | "@vitejs/plugin-vue": "^5.0.4", 26 | "@vue/tsconfig": "^0.5.1", 27 | "aws-cdk": "^2.137.0", 28 | "aws-cdk-lib": "^2.137.0", 29 | "constructs": "^10.3.0", 30 | "esbuild": "^0.20.2", 31 | "npm-run-all2": "^6.1.2", 32 | "tsx": "^4.7.2", 33 | "typescript": "^5.4.5", 34 | "vite": "^5.2.8", 35 | "vue-tsc": "^2.0.11" 36 | } 37 | } 38 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/public/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/amplify_ui/public/favicon.ico -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/App.vue: -------------------------------------------------------------------------------- 1 | 18 | 19 | 41 | 42 | -------------------------------------------------------------------------------- 
/examples/sample_web_application/amplify_ui/src/assets/base.css: -------------------------------------------------------------------------------- 1 | /* color palette from */ 2 | :root { 3 | --vt-c-white: #ffffff; 4 | --vt-c-white-soft: #f8f8f8; 5 | --vt-c-white-mute: #f2f2f2; 6 | 7 | --vt-c-black: #181818; 8 | --vt-c-black-soft: #222222; 9 | --vt-c-black-mute: #282828; 10 | 11 | --vt-c-indigo: #2c3e50; 12 | 13 | --vt-c-divider-light-1: rgba(60, 60, 60, 0.29); 14 | --vt-c-divider-light-2: rgba(60, 60, 60, 0.12); 15 | --vt-c-divider-dark-1: rgba(84, 84, 84, 0.65); 16 | --vt-c-divider-dark-2: rgba(84, 84, 84, 0.48); 17 | 18 | --vt-c-text-light-1: var(--vt-c-indigo); 19 | --vt-c-text-light-2: rgba(60, 60, 60, 0.66); 20 | --vt-c-text-dark-1: var(--vt-c-white); 21 | --vt-c-text-dark-2: rgba(235, 235, 235, 0.64); 22 | } 23 | 24 | /* semantic color variables for this project */ 25 | :root { 26 | --color-background: var(--vt-c-white); 27 | --color-background-soft: var(--vt-c-white-soft); 28 | --color-background-mute: var(--vt-c-white-mute); 29 | 30 | --color-border: var(--vt-c-divider-light-2); 31 | --color-border-hover: var(--vt-c-divider-light-1); 32 | 33 | --color-heading: var(--vt-c-text-light-1); 34 | --color-text: var(--vt-c-text-light-1); 35 | 36 | --section-gap: 160px; 37 | } 38 | 39 | @media (prefers-color-scheme: dark) { 40 | :root { 41 | --color-background: var(--vt-c-black); 42 | --color-background-soft: var(--vt-c-black-soft); 43 | --color-background-mute: var(--vt-c-black-mute); 44 | 45 | --color-border: var(--vt-c-divider-dark-2); 46 | --color-border-hover: var(--vt-c-divider-dark-1); 47 | 48 | --color-heading: var(--vt-c-text-dark-1); 49 | --color-text: var(--vt-c-text-dark-2); 50 | } 51 | } 52 | 53 | *, 54 | *::before, 55 | *::after { 56 | box-sizing: border-box; 57 | margin: 0; 58 | font-weight: normal; 59 | } 60 | 61 | body { 62 | min-height: 100vh; 63 | color: var(--color-text); 64 | background: var(--color-background); 65 | transition: 66 | color 0.5s, 
67 | background-color 0.5s; 68 | line-height: 1.6; 69 | font-family: 70 | Inter, 71 | -apple-system, 72 | BlinkMacSystemFont, 73 | 'Segoe UI', 74 | Roboto, 75 | Oxygen, 76 | Ubuntu, 77 | Cantarell, 78 | 'Fira Sans', 79 | 'Droid Sans', 80 | 'Helvetica Neue', 81 | sans-serif; 82 | font-size: 15px; 83 | text-rendering: optimizeLegibility; 84 | -webkit-font-smoothing: antialiased; 85 | -moz-osx-font-smoothing: grayscale; 86 | } 87 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/assets/logo.svg: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/assets/main.css: -------------------------------------------------------------------------------- 1 | body { 2 | margin: 0; 3 | background: linear-gradient(180deg, rgb(117, 81, 194), rgb(255, 255, 255)); 4 | display: flex; 5 | font-family: Inter, system-ui, Avenir, Helvetica, Arial, sans-serif; 6 | height: 100vh; 7 | width: 100vw; 8 | justify-content: center; 9 | align-items: center; 10 | } 11 | 12 | main { 13 | display: flex; 14 | flex-direction: column; 15 | align-items: stretch; 16 | } 17 | 18 | button { 19 | border-radius: 8px; 20 | border: 1px solid transparent; 21 | padding: 0.6em 1.2em; 22 | font-size: 1em; 23 | font-weight: 500; 24 | font-family: inherit; 25 | background-color: #1a1a1a; 26 | cursor: pointer; 27 | transition: border-color 0.25s; 28 | color: white; 29 | } 30 | button:hover { 31 | border-color: #646cff; 32 | } 33 | button:focus, 34 | button:focus-visible { 35 | outline: 4px auto -webkit-focus-ring-color; 36 | } 37 | 38 | ul { 39 | padding-inline-start: 0; 40 | margin-block-start: 0; 41 | margin-block-end: 0; 42 | list-style-type: none; 43 | display: flex; 44 | flex-direction: column; 45 | margin: 8px 0; 46 | border: 1px solid black; 47 | gap: 1px; 48 | 
background-color: black; 49 | border-radius: 8px; 50 | overflow: auto; 51 | } 52 | 53 | li { 54 | background-color: white; 55 | padding: 8px; 56 | } 57 | 58 | li:hover { 59 | background: #dadbf9; 60 | } 61 | 62 | a { 63 | font-weight: 800; 64 | text-decoration: none; 65 | } -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/AssistantConfig.vue: -------------------------------------------------------------------------------- 1 | 199 | 200 | 281 | 282 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/ChatComponent.vue: -------------------------------------------------------------------------------- 1 | 119 | 120 | 132 | 133 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/MenuBar.vue: -------------------------------------------------------------------------------- 1 | 5 | 6 | 12 | 13 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/icons/IconCommunity.vue: -------------------------------------------------------------------------------- 1 | 8 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/icons/IconDocumentation.vue: -------------------------------------------------------------------------------- 1 | 8 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/icons/IconEcosystem.vue: -------------------------------------------------------------------------------- 1 | 8 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/icons/IconSupport.vue: 
-------------------------------------------------------------------------------- 1 | 8 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/components/icons/IconTooling.vue: -------------------------------------------------------------------------------- 1 | 2 | 20 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/src/main.ts: -------------------------------------------------------------------------------- 1 | import "./assets/main.css"; 2 | import { createApp } from "vue"; 3 | import App from "./App.vue"; 4 | import { Amplify } from "aws-amplify"; 5 | import outputs from "../amplify_outputs.json"; 6 | 7 | Amplify.configure(outputs); 8 | //Add Existing AWS Resources - https://docs.amplify.aws/react/build-a-backend/add-aws-services/rest-api/existing-resources/ 9 | const existingConfig = Amplify.getConfig(); 10 | Amplify.configure({ 11 | ...existingConfig, 12 | API: { 13 | ...existingConfig.API, 14 | REST: { 15 | ...existingConfig.API?.REST, 16 | AssistantConfigApi: { 17 | endpoint: 18 | '', 19 | region: '' 20 | } 21 | } 22 | } 23 | }); 24 | 25 | createApp(App).mount("#app"); 26 | 27 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/tsconfig.app.json: -------------------------------------------------------------------------------- 1 | { 2 | "extends": "@vue/tsconfig/tsconfig.dom.json", 3 | "include": ["env.d.ts", "src/**/*", "src/**/*.vue", "amplify/**/*"], 4 | "exclude": ["src/**/__tests__/*"], 5 | "compilerOptions": { 6 | "noImplicitAny": false, 7 | "composite": true, 8 | "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo", 9 | 10 | "baseUrl": ".", 11 | "paths": { 12 | "@/*": ["./src/*"] 13 | } 14 | } 15 | } 16 | -------------------------------------------------------------------------------- 
/examples/sample_web_application/amplify_ui/tsconfig.json: -------------------------------------------------------------------------------- 1 | { 2 | "files": [], 3 | "references": [ 4 | { 5 | "path": "./tsconfig.node.json" 6 | }, 7 | { 8 | "path": "./tsconfig.app.json" 9 | } 10 | ] 11 | } 12 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/tsconfig.node.json: -------------------------------------------------------------------------------- 1 | { 2 | "extends": "@tsconfig/node20/tsconfig.json", 3 | "include": [ 4 | "vite.config.*", 5 | "vitest.config.*", 6 | "cypress.config.*", 7 | "nightwatch.conf.*", 8 | "playwright.config.*" 9 | ], 10 | "compilerOptions": { 11 | "composite": true, 12 | "noEmit": true, 13 | "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo", 14 | 15 | "module": "ESNext", 16 | "moduleResolution": "Bundler", 17 | "types": ["node"] 18 | } 19 | } 20 | -------------------------------------------------------------------------------- /examples/sample_web_application/amplify_ui/vite.config.ts: -------------------------------------------------------------------------------- 1 | import { fileURLToPath, URL } from 'node:url' 2 | 3 | import { defineConfig } from 'vite' 4 | import vue from '@vitejs/plugin-vue' 5 | 6 | // https://vitejs.dev/config/ 7 | export default defineConfig({ 8 | plugins: [ 9 | vue(), 10 | ], 11 | resolve: { 12 | alias: { 13 | '@': fileURLToPath(new URL('./src', import.meta.url)) 14 | } 15 | } 16 | }) 17 | -------------------------------------------------------------------------------- /examples/sample_web_application/app/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/app/__init__.py 
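The FastAPI backend in `app/` below exposes a `/bedrock_converse_api` endpoint that expects two JSON objects, `bedrock_converse_parameters` and `assistant_parameters`. A sketch of how a client request body could be assembled, following the Pydantic models defined next — the `model_id` and `workflow_id` values here are placeholders, not values shipped with the sample:

```python
import json

# Hypothetical request body for POST /bedrock_converse_api. The top-level keys
# match the endpoint's parameter names in main.py; the nested fields follow
# BedrockConverseAPIRequest and AssistantParameters.
payload = {
    "bedrock_converse_parameters": {
        "model_id": "anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model id
        "messages": [{"role": "user", "content": [{"text": "Hello"}]}],
        "system_prompts": [{"text": "You are a helpful assistant."}],
        "inference_config": {"temperature": 0.7},
        "additional_model_fields": None,
    },
    "assistant_parameters": {
        "messages_to_sample": 5,
        "workflow_params": {"workflow_id": "1"},  # placeholder workflow id
        "state_machine_custom_params": {},
    },
}

body = json.dumps(payload)
```

A real request must also carry the caller's Cognito token in the `x-access-token` header, which `main.py` parses to scope data access.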
-------------------------------------------------------------------------------- /examples/sample_web_application/app/api_models/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/app/api_models/__init__.py -------------------------------------------------------------------------------- /examples/sample_web_application/app/api_models/assistant_model.py: -------------------------------------------------------------------------------- 1 | from typing import Optional 2 | from pydantic import BaseModel, Field 3 | 4 | 5 | class AssistantParameters(BaseModel): 6 | messages_to_sample: int = Field( 7 | default=5, 8 | description="Number of last messages from the history that will be sent to the workflow", 9 | ) 10 | workflow_params: dict = Field( 11 | default={"workflow_id": "workflow#details#1"}, 12 | description="Additional params to use as data in the workflow" 13 | ) 14 | state_machine_custom_params: dict = Field( 15 | default={}, 16 | description="Additional params to handle in the workflow" 17 | ) 18 | -------------------------------------------------------------------------------- /examples/sample_web_application/app/api_models/bedrock_converse_model.py: -------------------------------------------------------------------------------- 1 | from typing import List, Optional, Union 2 | from pydantic import BaseModel, Field 3 | 4 | 5 | class MessageContent(BaseModel): 6 | text: str 7 | 8 | 9 | class Message(BaseModel): 10 | role: str 11 | content: List 12 | 13 | 14 | class SystemContentBlock(BaseModel): 15 | text: str 16 | 17 | 18 | class InferenceConfiguration(BaseModel): 19 | temperature: float = Field(default=0.7, ge=0.0, le=1.0) 20 | #top_p: float = Field(default=0.999, ge=0.0, le=1.0) 21 | #top_k: int = Field(default=250, ge=0, le=500) 22 | #max_tokens: int = Field(default=1028, ge=0, le=200000) 23 | 24 | 25 
| 26 | class ToolConfiguration(BaseModel): 27 | # To implement 28 | pass 29 | 30 | 31 | class BedrockConverseAPIRequest(BaseModel): 32 | model_id: str 33 | additional_model_fields: Optional[dict] = None 34 | additional_model_fields_paths: Optional[List[str]] = Field(default=[], min_items=0, max_items=10, 35 | pattern=r'^/[^/]+(?:/[^/]+)*$') 36 | inference_config: Optional[dict] = None 37 | messages: List 38 | system_prompts: List[dict] 39 | #tool_config: Optional[ToolConfiguration] = None 40 | -------------------------------------------------------------------------------- /examples/sample_web_application/app/api_models/workflow_model.py: -------------------------------------------------------------------------------- 1 | from typing import List, Literal, Optional 2 | from pydantic import BaseModel, model_validator, Field 3 | from bedrock_converse_model import BedrockConverseAPIRequest 4 | 5 | 6 | class WorkflowBedrockTaskDetails(BaseModel): 7 | task_name: str 8 | task_model_id: str 9 | input_token: int 10 | output_token: int 11 | 12 | 13 | class WorkflowBedrockDetails(BaseModel): 14 | task_details: List[WorkflowBedrockTaskDetails] 15 | total_input_tokens: int 16 | total_output_tokens: int 17 | 18 | 19 | class PromptChainParametersConfiguration(BaseModel): 20 | replace_tag: str 21 | 22 | 23 | class PromptChainParameters(BaseModel): 24 | system_chain_prompt: str 25 | operation: Literal["APPEND", "REPLACE_TAG", "REPLACE_ALL"] 26 | configuration: Optional[PromptChainParametersConfiguration] = None 27 | 28 | # Check REPLACE_TAG configuration 29 | @model_validator(mode="after") 30 | def config_validation(cls, step_functions_response): 31 | if step_functions_response.operation == "REPLACE_TAG" and (step_functions_response.configuration is None or not step_functions_response.configuration.replace_tag): 32 | raise ValueError("'REPLACE_TAG' option must be used with a configuration.replace_tag str attribute") 33 | return step_functions_response 34 | 35 | 36 | class StepFunctionResponse(BaseModel): 37 | 
bedrock_details: WorkflowBedrockDetails = Field( 38 | description="data about bedrock in the workflow" 39 | ) 40 | 41 | context_data: Optional[List] = Field( 42 | default=[], 43 | description="Content Returned from State Machine" 44 | ) 45 | 46 | system_chain_data: Optional[PromptChainParameters] = Field( 47 | default=None, 48 | description="Prompt Chain Operations" 49 | ) 50 | 51 | additional_messages: Optional[List] = None 52 | -------------------------------------------------------------------------------- /examples/sample_web_application/app/logging.yaml: -------------------------------------------------------------------------------- 1 | version: 1 2 | disable_existing_loggers: false 3 | 4 | # thanks to https://nuculabs.dev/p/fastapi-uvicorn-logging-in-production 5 | # This specification rewrites the logger handler to provide logging module 6 | # with lambda web adapter + uvicorn 7 | 8 | formatters: 9 | standard: 10 | format: "%(asctime)s - %(levelname)s - %(message)s" 11 | 12 | handlers: 13 | console: 14 | class: logging.StreamHandler 15 | formatter: standard 16 | stream: ext://sys.stdout 17 | 18 | loggers: 19 | uvicorn: 20 | error: 21 | propagate: true 22 | 23 | root: 24 | level: INFO 25 | handlers: [console] 26 | propagate: no -------------------------------------------------------------------------------- /examples/sample_web_application/app/main.py: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
2 | # SPDX-License-Identifier: MIT-0 3 | 4 | import json 5 | import logging 6 | from typing import Dict, Any 7 | 8 | import boto3 9 | from fastapi import FastAPI, Request 10 | from fastapi.middleware.cors import CORSMiddleware 11 | from fastapi.responses import StreamingResponse 12 | 13 | from api_models.assistant_model import AssistantParameters 14 | from api_models.bedrock_converse_model import BedrockConverseAPIRequest 15 | from api_models.workflow_model import StepFunctionResponse, PromptChainParameters 16 | from assistant_config_interface.data_manager import AccountDataAccess 17 | 18 | logger = logging.getLogger() 19 | logger.setLevel(logging.DEBUG) 20 | logger.info("Application started") 21 | 22 | app = FastAPI( 23 | title="Serverless GenAI Assistant API", 24 | description="""This API expects two objects: bedrock_parameters and assistant_parameters. 25 | bedrock_parameters holds the standard parameters of Bedrock invoke_model_with_response_stream, 26 | while assistant_parameters defines behavior for the integration between the lambda and step function, 27 | allowing data passing between invocations to accelerate LLM applications development""" 28 | ) 29 | 30 | # CORS configuration 31 | app.add_middleware( 32 | CORSMiddleware, 33 | allow_origins=["*"], 34 | allow_credentials=True, 35 | allow_methods=["*"], 36 | allow_headers=["*"], 37 | ) 38 | 39 | # Boto3 clients 40 | bedrock_boto_client = boto3.client("bedrock-runtime") 41 | sf_boto_client = boto3.client("stepfunctions") 42 | 43 | 44 | def execute_workflow(state_machine_arn: str, input_data: Dict[str, Any]) -> StepFunctionResponse: 45 | """Execute sync express workflow.""" 46 | logger.info(f"Executing workflow: {state_machine_arn}") 47 | response = sf_boto_client.start_sync_execution( 48 | stateMachineArn=state_machine_arn, 49 | input=json.dumps(input_data) 50 | ) 51 | 52 | if response["status"] != "SUCCEEDED": 53 | error_msg = ( 54 | f"State machine error. 
Expected: 'SUCCEEDED', Received: {response['status']}. " 55 | f"Details: ERROR: {response.get('error', 'N/A')}, CAUSE: {response.get('cause', 'N/A')}" 56 | ) 57 | logger.error(error_msg) 58 | raise ValueError(error_msg) 59 | 60 | execution_data = StepFunctionResponse.model_validate_json(response['output']) 61 | logger.debug(f"Step Functions Response: {execution_data}") 62 | return execution_data 63 | 64 | 65 | def sf_build_context( 66 | messages: list, 67 | system: str, 68 | assistant_parameters: AssistantParameters, 69 | data_manager: AccountDataAccess 70 | ) -> Dict[str, Any]: 71 | """Build context """ 72 | logger.info("Retrieving workload configuration") 73 | workflow_details = data_manager.get_workflow_details(assistant_parameters.workflow_params['workflow_id']) 74 | 75 | logger.info(f"Executing Step Functions State Machine workflow arn: {workflow_details['arn']}") 76 | sf_workflow_result = execute_workflow( 77 | state_machine_arn=workflow_details['arn'], 78 | input_data={ 79 | "PromptInput": messages, 80 | "state_machine_custom_params": assistant_parameters.state_machine_custom_params, 81 | }, 82 | ) 83 | 84 | context = { 85 | 'system': build_system_context(system, sf_workflow_result.system_chain_data) 86 | } 87 | 88 | if sf_workflow_result.additional_messages: 89 | context['additional_messages'] = sf_workflow_result.additional_messages 90 | 91 | return context 92 | 93 | 94 | def build_system_context(system: str, system_chain_data: PromptChainParameters) -> str: 95 | """Build system context based on the operation returned from the workflow""" 96 | logger.debug(f"Building system context with input: system='{system}', system_chain_data={system_chain_data}") 97 | 98 | if system_chain_data is None: 99 | return system 100 | else: 101 | operation = system_chain_data.operation 102 | system_chain_prompt = system_chain_data.system_chain_prompt 103 | 104 | if operation == "APPEND": 105 | return f"{system}\n\n{system_chain_prompt}" 106 | elif operation == "REPLACE_TAG": 
107 | replace_tag = system_chain_data.configuration.replace_tag 108 | opening_tag, closing_tag = f"<{replace_tag}>", f"</{replace_tag}>" 109 | formatted_prompt = f"{opening_tag}{system_chain_prompt}{closing_tag}" 110 | return system.replace(f"{opening_tag}{closing_tag}", formatted_prompt) 111 | elif operation == "REPLACE_ALL": 112 | return system_chain_prompt 113 | else: 114 | return system 115 | 116 | 117 | async def stream_bedrock_converse_api(bedrock_converse_params: BedrockConverseAPIRequest): 118 | """Stream response from Bedrock converse API.""" 119 | try: 120 | stream_response = bedrock_boto_client.converse_stream( 121 | modelId=bedrock_converse_params.model_id, 122 | messages=bedrock_converse_params.messages, 123 | system=bedrock_converse_params.system_prompts, 124 | inferenceConfig=bedrock_converse_params.inference_config, 125 | additionalModelRequestFields=bedrock_converse_params.additional_model_fields 126 | ) 127 | 128 | stream = stream_response.get('stream') 129 | if stream: 130 | for event in stream: 131 | if 'contentBlockDelta' in event: 132 | yield event['contentBlockDelta']['delta']['text'] 133 | 134 | if 'metadata' in event: 135 | metadata = event['metadata'] 136 | if 'usage' in metadata: 137 | logger.info(f"Token usage: Input tokens: {metadata['usage']['inputTokens']}, " 138 | f"Output tokens: {metadata['usage']['outputTokens']}, " 139 | f"Total tokens: {metadata['usage']['totalTokens']}") 140 | if 'metrics' in metadata: 141 | logger.info(f"Latency: {metadata['metrics']['latencyMs']} milliseconds") 142 | except Exception as e: 143 | logger.error(f"Error in stream_bedrock_converse_api: {str(e)}") 144 | yield str(e) 145 | 146 | 147 | @app.post("/bedrock_converse_api") 148 | async def execute_chain_converse_api( 149 | request: Request, 150 | bedrock_converse_parameters: BedrockConverseAPIRequest, 151 | assistant_parameters: AssistantParameters 152 | ): 153 | """Execute the flow with converse API.""" 154 | try: 155 | request_access_token = 
json.loads(request.headers['x-access-token']) 156 |         data_manager = AccountDataAccess(request_access_token) 157 | 158 |         # Temporary: Format messages for Claude compatibility 159 |         claude_message_format = [ 160 |             {"role": message["role"], "content": message["content"][0]["text"]} 161 |             for message in bedrock_converse_parameters.messages 162 |         ] 163 | 164 |         chain_data = sf_build_context( 165 |             claude_message_format, 166 |             bedrock_converse_parameters.system_prompts[0]["text"], 167 |             assistant_parameters, 168 |             data_manager 169 |         ) 170 | 171 |         bedrock_converse_parameters.system_prompts[0]["text"] = chain_data['system'] 172 |         bedrock_converse_parameters.messages.extend(chain_data.get('additional_messages', [])) 173 | 174 |         return StreamingResponse( 175 |             stream_bedrock_converse_api(bedrock_converse_parameters), 176 |             media_type="text/plain; charset=utf-8" 177 |         ) 178 |     except Exception as e: 179 |         logger.error(f"Error in execute_chain_converse_api: {str(e)}") 180 |         return StreamingResponse( 181 |             str(e), 182 |             media_type="text/plain; charset=utf-8" 183 |         ) -------------------------------------------------------------------------------- /examples/sample_web_application/app/requirements.txt: -------------------------------------------------------------------------------- 1 | boto3>=1.34.126 2 | annotated-types==0.6.0 3 | anyio==4.2.0 4 | click==8.1.7 5 | exceptiongroup==1.2.0 6 | fastapi==0.109.2 7 | h11==0.14.0 8 | idna==3.7 9 | pydantic==2.6.1 10 | pydantic_core==2.16.2 11 | sniffio==1.3.0 12 | starlette==0.36.3 13 | typing_extensions==4.9.0 14 | uvicorn==0.27.0.post1 15 | pyyaml==6.0.1 16 | -------------------------------------------------------------------------------- /examples/sample_web_application/app/run.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # /opt/python added to the PYTHONPATH allows access to the code of other Lambda layers in the zip package 3 | PATH="$PATH:$LAMBDA_TASK_ROOT/bin" 
PYTHONPATH="$LAMBDA_TASK_ROOT:$LAMBDA_TASK_ROOT/app:$LAMBDA_TASK_ROOT/api_models:/opt/python" exec python -m uvicorn --log-config logging.yaml --port="$PORT" main:app -------------------------------------------------------------------------------- /examples/sample_web_application/assistant_config.py: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: MIT-0 3 | 4 | from random import random 5 | 6 | import boto3 7 | import botocore.exceptions 8 | import json 9 | import uuid 10 | from botocore.config import Config 11 | 12 | # This script is used to set up the initial data 13 | # 1 - Populates the DynamoDB table with the initial data. 14 | # 2 - Creates a Cognito user for app auth 15 | 16 | # configuration data 17 | # AWS profile name and region 18 | profile_name = "" 19 | region_name = "" 20 | 21 | # Cognito Parameters 22 | user_pool_id = "" 23 | email = "test@test.com"  # test@test.com works but you can replace it. 24 | 25 | # SAM output parameters 26 | table_name = "" 27 | 28 | cloudfront_domain = "" 29 | 30 | step_functions_arn = "" 31 | 32 | # END of configuration parameters 33 | 34 | # generate a temporary password: one uppercase letter, eight lowercase letters, a two-digit number (10-99), and a special character 35 | temporary_password = chr(int(random() * 26 + 65)) + "".join([chr(int(random() * 26 + 97)) for _ in range(8)]) + str( 36 |     int(random() * 90 + 10)) 37 | special_char = "!@#$%^&*()_+" 38 | temporary_password += special_char[int(random() * len(special_char))] 39 | 40 | # generate a random uuid for account_id 41 | account_id = str(uuid.uuid4()) 42 | 43 | system_prompt = """You are an expert research assistant. Here is a document you will answer questions about: 44 | 45 | 46 | First, find the quotes from the document that are most relevant to answering the question, and then print them in numbered order. Quotes should be relatively short. 47 | 48 | If there are no relevant quotes, write “No relevant quotes” instead. 
49 | 50 | Then, answer the question, starting with “Answer:“. Do not include or reference quoted content verbatim in the answer. Don’t say “According to Quote [1]” when answering. Instead make references to quotes relevant to each section of the answer solely by adding their bracketed numbers at the end of relevant sentences. 51 | 52 | Thus, the format of your overall response should look like what’s shown between the tags. Make sure to follow the formatting and spacing exactly. 53 | Quotes: 54 | [1] “Company X reported revenue of $12 million in 2021.” 55 | [2] “Almost 90% of revenue came from widget sales, with gadget sales making up the remaining 10%.” 56 | 57 | Answer: 58 | Company X earned $12 million. [1] Almost 90% of it was from widget sales. [2] 59 | 60 | 61 | If the question cannot be answered by the document, say so. 62 | """ 63 | 64 | # ConverseAPI Parameters for default workflow 65 | assistant_api_params = {"bedrock_converse_parameters": { 66 | "model_id": "anthropic.claude-3-sonnet-20240229-v1:0", 67 | "inference_config": '{"temperature":0.7, "topP":0.9, "maxTokens":2000}', 68 | "additional_model_fields": '{"top_k":200}', 69 | "additional_model_response_field_paths": "[/stop_sequences]", 70 | "system_prompts": [{ 71 | "text": system_prompt 72 | }] 73 | }, 74 | "assistant_parameters": { 75 | "messages_to_sample": 5, 76 | "workflow_params": {"workflow_id": ""}, 77 | "state_machine_custom_params": 78 | '{"hello": "from configuration database"}' 79 | 80 | } 81 | } 82 | 83 | #Initial data 84 | table_items = [ 85 | {"id": {"S": "account#" + account_id}, "item_type": {"S": "account#details"}, "name": {"S": "Default account"}, 86 | "description": {"S": "Test account"}}, 87 | {"id": {"S": "account#" + account_id}, "item_type": {"S": "account#inference_endpoint"}, 88 | "description": {"S": "Endpoint for inference with stream response"}, 89 | "type": {"S": "cloudfront"}, "url": {"S": cloudfront_domain}}, 90 | {"id": {"S": "account#" + account_id}, "item_type": 
{"S": "workflow#details#1"}, "name": {"S": "Default Workflow"}, 91 |      "description": {"S": "Test Workflow for account 1"}, 92 |      "type": {"S": "stepfunctions"}, "arn": {"S": step_functions_arn}, "assistant_params": {"S": json.dumps(assistant_api_params)}}, 93 |     {"id": {"S": "account#" + account_id}, "item_type": {"S": "workflow#prompt#1#1"}, 94 |      "description": {"S": "TODO: Prompt to be used in the step functions workflow."}, 95 |      "type": {"S": "text"}, "role": {"S": "user"}, "content": {"S": "Test Workflow for account: " + account_id}} 96 | ] 97 | 98 | 99 | def populate_table(): 100 |     print("Populating table with initial data") 101 |     try: 102 |         dynamodb = boto3.Session(profile_name=profile_name, region_name=region_name).client('dynamodb') 103 |     except botocore.exceptions.ClientError as e: 104 |         raise e 105 | 106 |     for item in table_items: 107 |         try: 108 |             response = dynamodb.put_item(TableName=table_name, Item=item) 109 |             print("Item added successfully: " + str(item['id']) + " | " + str(item['item_type'])) 110 |         except Exception as e: 111 |             raise e 112 | 113 | 114 | def create_cognito_user(): 115 |     try: 116 |         cognito = boto3.Session(profile_name=profile_name, region_name=region_name).client('cognito-idp') 117 |     except botocore.exceptions.ClientError as e: 118 |         raise e 119 | 120 |     try: 121 |         response = cognito.admin_create_user( 122 |             UserPoolId=user_pool_id, 123 |             Username=email, 124 |             UserAttributes=[ 125 |                 { 126 |                     "Name": "email", 127 |                     "Value": email 128 |                 }, 129 |                 { 130 | 131 |                     "Name": "email_verified", 132 |                     "Value": "true" 133 |                 }, 134 | 135 |                 { 136 |                     "Name": "custom:account_id", 137 |                     "Value": account_id 138 |                 } 139 |             ], 140 |             TemporaryPassword=temporary_password, 141 |             MessageAction="SUPPRESS" 142 |         ) 143 |         print("User created successfully") 144 |         print("*" * 30) 145 |         print("Email: " + email) 146 |         print("Temporary password: " + temporary_password) 147 |         print("*" * 30) 148 |     except Exception as e: 149 |         raise e 150 | 151 | 152 | if __name__ == "__main__": 153 
|     create_cognito_user() 154 |     populate_table() 155 | -------------------------------------------------------------------------------- /examples/sample_web_application/config_handler/config_handler.py: -------------------------------------------------------------------------------- 1 | from assistant_config_interface.data_manager import AccountDataAccess 2 | import json 3 | 4 | 5 | def lambda_handler(event, context): 6 |     try: 7 |         # Extract token claims and path parameters 8 |         claims = event['requestContext']['authorizer']['jwt']['claims'] 9 |         path_parameters = event.get('pathParameters', {}) 10 | 11 |         # Create AccountDataAccess instance 12 |         account_data = AccountDataAccess(claims) 13 | 14 |         # Determine the API route and call the appropriate method 15 |         route_key = event['routeKey'] 16 |         account_id = path_parameters.get('accountID') 17 |         workflow_id = path_parameters.get('WorkflowID') 18 | 19 |         # Ensure the account_id from the path matches the one in the token 20 |         if account_id != claims.get('custom:account_id'): 21 |             raise ValueError("Account ID in path does not match the one in the token") 22 | 23 |         if route_key == 'GET /accounts/{accountID}/account': 24 |             result = account_data.get_account_details() 25 |         elif route_key == 'GET /accounts/{accountID}/inference-endpoint': 26 |             result = account_data.get_inference_endpoint() 27 |         elif route_key == 'GET /accounts/{accountID}/workflows': 28 |             result = account_data.list_workflows() 29 |         elif route_key == 'GET /accounts/{accountID}/workflows/{WorkflowID}': 30 |             if not workflow_id: 31 |                 raise ValueError("WorkflowID is required for this endpoint") 32 |             result = account_data.get_workflow_details(workflow_id) 33 |         elif route_key == 'GET /accounts/{accountID}/workflows/{WorkflowID}/prompts': 34 |             if not workflow_id: 35 |                 raise ValueError("WorkflowID is required for this endpoint") 36 |             result = account_data.list_workflow_prompts(workflow_id) 37 |         else: 38 |             raise ValueError(f"Unsupported route: {route_key}") 39 | 40 |         return 
{ 41 | 'statusCode': 200, 42 | 'body': json.dumps(result), 43 | 'headers': { 44 | 'Content-Type': 'application/json' 45 | } 46 | } 47 | except ValueError as ve: 48 | return { 49 | 'statusCode': 400, 50 | 'body': json.dumps({'error': str(ve)}), 51 | 'headers': { 52 | 'Content-Type': 'application/json', 53 | 'Access-Control-Allow-Origin': '*' 54 | } 55 | } 56 | except Exception as e: 57 | print(f"Error: {str(e)}") 58 | return { 59 | 'statusCode': 500, 60 | 'body': json.dumps({'error': 'Internal server error'}), 61 | 'headers': { 62 | 'Content-Type': 'application/json', 63 | 'Access-Control-Allow-Origin': '*' 64 | } 65 | } -------------------------------------------------------------------------------- /examples/sample_web_application/dependencies/python/assistant_config_interface/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/dependencies/python/assistant_config_interface/__init__.py -------------------------------------------------------------------------------- /examples/sample_web_application/dependencies/python/assistant_config_interface/data_manager.py: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
2 | # SPDX-License-Identifier: MIT-0 3 | 4 | import boto3 5 | from boto3.dynamodb.conditions import Key 6 | import os 7 | 8 | 9 | # Get the DynamoDB table name from the environment variable 10 | table_name = os.environ["TABLE_NAME"] 11 | dynamodb = boto3.resource("dynamodb") 12 | table = dynamodb.Table(table_name) 13 | 14 | 15 | class AccountDataAccess: 16 | def __init__(self, cognito_access_token): 17 | # Get the account ID from the Cognito access token 18 | self.account_id = cognito_access_token.get("custom:account_id") 19 | 20 | def get_account_details(self): 21 | response = table.query( 22 | KeyConditionExpression=Key("id").eq(f"account#{self.account_id}") & 23 | Key("item_type").eq("account#details") 24 | ) 25 | return response.get("Items", []) 26 | 27 | def get_inference_endpoint(self): 28 | response = table.query( 29 | KeyConditionExpression=Key("id").eq(f"account#{self.account_id}") & 30 | Key("item_type").eq("account#inference_endpoint") 31 | ) 32 | return response.get("Items", []) 33 | 34 | def list_workflows(self): 35 | attributes = ["id", "item_type", "name", "description"] 36 | expression_attribute_names = { 37 | f"#{word}": word for word in attributes 38 | } 39 | projection_expression = ",".join([f"#{word}" for word in attributes]) 40 | response = table.query( 41 | KeyConditionExpression=Key("id").eq(f"account#{self.account_id}") & 42 | Key("item_type").begins_with("workflow#details#"), 43 | ProjectionExpression=projection_expression, 44 | ExpressionAttributeNames=expression_attribute_names 45 | ) 46 | return response.get("Items", []) 47 | 48 | def get_workflow_details(self, workflow_id): 49 | # Split the workflow ID to get the actual ID 50 | parts = workflow_id.split("#") 51 | if len(parts) == 3 and parts[0] == "workflow" and parts[1] == "details": 52 | workflow_id = parts[2] 53 | else: 54 | raise ValueError("Invalid item_type string format") 55 | # get the id from ex: workflow#details#1 56 | response = table.get_item( 57 | Key={ 58 | "id": 
f"account#{self.account_id}", 59 | "item_type": f"workflow#details#{workflow_id}" 60 | } 61 | ) 62 | 63 | return response.get("Item", {}) 64 | 65 | def list_workflow_prompts(self, workflow_id): 66 | response = table.query( 67 | KeyConditionExpression=Key("id").eq(f"account#{self.account_id}") & 68 | Key("item_type").begins_with(f"workflow#prompt#{workflow_id}#") 69 | ) 70 | return response.get("Items", []) 71 | 72 | def get_workflow_prompt(self, workflow_id, prompt_id): 73 | response = table.query( 74 | KeyConditionExpression=Key("id").eq(f"account#{self.account_id}") & 75 | Key("item_type").eq(f"workflow#prompt#{workflow_id}#{prompt_id}") 76 | ) 77 | return response.get("Items", []) 78 | 79 | -------------------------------------------------------------------------------- /examples/sample_web_application/imgs/Architecture.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/Architecture.jpeg -------------------------------------------------------------------------------- /examples/sample_web_application/imgs/amplify_deployed.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/amplify_deployed.jpeg -------------------------------------------------------------------------------- /examples/sample_web_application/imgs/amplify_intro.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/amplify_intro.jpeg -------------------------------------------------------------------------------- 
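The `AccountDataAccess` class above relies on a DynamoDB single-table design: the partition key `id` holds `account#<account_id>` and the sort key `item_type` encodes the record kind (`account#details`, `workflow#details#<n>`, `workflow#prompt#<n>#<m>`). A minimal pure-Python sketch of how the `begins_with` sort-key pattern selects records — the in-memory list and sample values below are illustrative stand-ins for the real table, not data from the sample:

```python
# In-memory stand-in for the single-table layout used by AccountDataAccess:
# partition key "id", sort key "item_type". Values are hypothetical.
ACCOUNT_ID = "1234"

TABLE = [
    {"id": f"account#{ACCOUNT_ID}", "item_type": "account#details", "name": "Default account"},
    {"id": f"account#{ACCOUNT_ID}", "item_type": "workflow#details#1", "name": "Default Workflow"},
    {"id": f"account#{ACCOUNT_ID}", "item_type": "workflow#prompt#1#1", "content": "prompt text"},
]


def query(partition_key: str, sort_key_prefix: str) -> list:
    """Mimic Key("id").eq(...) & Key("item_type").begins_with(...)."""
    return [
        item for item in TABLE
        if item["id"] == partition_key and item["item_type"].startswith(sort_key_prefix)
    ]


# list_workflows(): every "workflow#details#..." row for one account
workflows = query(f"account#{ACCOUNT_ID}", "workflow#details#")
print([w["name"] for w in workflows])  # ['Default Workflow']
```

Because every record for an account shares one partition key, a single `query` call can fetch any slice of the account's data just by varying the sort-key prefix, with no scans or secondary indexes.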
/examples/sample_web_application/imgs/amplify_save_deploy.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/amplify_save_deploy.jpeg -------------------------------------------------------------------------------- /examples/sample_web_application/imgs/app_settings.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/app_settings.jpeg -------------------------------------------------------------------------------- /examples/sample_web_application/imgs/download_backend_resources.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/download_backend_resources.png -------------------------------------------------------------------------------- /examples/sample_web_application/imgs/repository_select.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/repository_select.jpeg -------------------------------------------------------------------------------- /examples/sample_web_application/imgs/sample.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/sample.gif -------------------------------------------------------------------------------- 
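The `build_system_context` helper in `app/main.py` (shown earlier) applies one of three chain operations to the system prompt. A standalone sketch of that logic with a flattened signature for illustration — it assumes the conventional `</tag>` closing form and that an empty `<tag></tag>` pair in the system prompt marks the insertion point:

```python
def build_system_context(system: str, operation: str, chain_prompt: str,
                         replace_tag: str = None) -> str:
    """Simplified sketch of the chain operations from app/main.py."""
    if operation == "APPEND":
        # Append the workflow output after the original system prompt
        return f"{system}\n\n{chain_prompt}"
    if operation == "REPLACE_TAG" and replace_tag is not None:
        # Fill the empty <tag></tag> placeholder with the workflow output
        opening, closing = f"<{replace_tag}>", f"</{replace_tag}>"
        return system.replace(f"{opening}{closing}", f"{opening}{chain_prompt}{closing}")
    if operation == "REPLACE_ALL":
        # Discard the original system prompt entirely
        return chain_prompt
    return system


system = "Answer using the data in <chain-information></chain-information>."
out = build_system_context(system, "REPLACE_TAG", "retrieved passages", "chain-information")
print(out)  # Answer using the data in <chain-information>retrieved passages</chain-information>.
```

This is how the Step Functions workflow output (for example, Knowledge Base retrieval results) is spliced into the system prompt before the Bedrock Converse call, without the client ever seeing the retrieved context.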
/examples/sample_web_application/imgs/select_source_code.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/imgs/select_source_code.jpeg -------------------------------------------------------------------------------- /examples/sample_web_application/lambda-jwt-verify/src/index.mjs: -------------------------------------------------------------------------------- 1 | import { CognitoJwtVerifier } from "aws-jwt-verify"; 2 | 3 | /* This is the basic sample. 4 | Check how this can be improved in the project page at https://github.com/awslabs/aws-jwt-verify */ 5 | const verifier = CognitoJwtVerifier.create({ 6 |   userPoolId: "", 7 |   tokenUse: "access", 8 |   clientId: "", 9 |   scope: ["aws.cognito.signin.user.admin"] 10 | 11 | }); 12 | 13 | export const handler = async (event, context, callback) => { 14 |   const request = event.Records[0].cf.request; 15 |   const headers = request.headers; 16 | 17 |   const isPreflightRequest = 18 |     request.method === 'OPTIONS' && 19 |     headers['origin'] && 20 |     headers['access-control-request-method']; 21 | 22 |   if (isPreflightRequest) { 23 |     return callback(null, request); 24 |   } 25 | 26 |   try { 27 |     const payload = await verifier.verify(headers["x-access-token"][0].value); 28 |     console.log("Token is valid. 
Payload:", payload); 29 | //Add decoded access token to header 30 | request.headers["x-access-token"][0].value = JSON.stringify(payload); 31 | callback(null, request) 32 | } catch { 33 | console.log("Token not valid!"); 34 | callback(null, { 35 | status: 403, 36 | statusDescription: 'Unauthorized' 37 | }) 38 | } 39 | }; 40 | -------------------------------------------------------------------------------- /examples/sample_web_application/lambda-jwt-verify/src/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "aws-jwt-verify", 3 | "version": "1.0.0", 4 | "description": "", 5 | "main": "index.js", 6 | "scripts": { 7 | "test": "echo \"Error: no test specified\" && exit 1" 8 | }, 9 | "keywords": [], 10 | "author": "Gabriel Leite, Serverless GenAI Assistant - Application sample", 11 | "license": "MIT-0", 12 | "dependencies": { 13 | "aws-jwt-verify": "^4.0.1" 14 | } 15 | } 16 | -------------------------------------------------------------------------------- /examples/sample_web_application/package-lock.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "sample_web_application", 3 | "lockfileVersion": 3, 4 | "requires": true, 5 | "packages": {} 6 | } 7 | -------------------------------------------------------------------------------- /examples/sample_web_application/statemachine/rag_parallel_tasks/RagGenAI.asl.json: -------------------------------------------------------------------------------- 1 | { 2 | "Comment": "GenAI Workflow that parallel validates prompt rules and execute RAG using Bedrock Agents KB", 3 | "StartAt": "Parallel", 4 | "States": { 5 | "Choice": { 6 | "Choices": [ 7 | { 8 | "Variable": "$.ParallelInput[1].Body.content[0].text", 9 | "BooleanEquals": false, 10 | "Next": "Retrieve" 11 | } 12 | ], 13 | "Default": "GenerateResponse", 14 | "Type": "Choice" 15 | }, 16 | "Parallel": { 17 | "Branches": [ 18 | { 19 | "StartAt": "Insert Keywords", 
20 |           "Comment": "This state validates the user prompt and augments the content by inserting keywords related to the topic. The KB will have more context", 21 |           "States": { 22 |             "Insert Keywords": { 23 |               "End": true, 24 |               "Parameters": { 25 |                 "Body": { 26 |                   "anthropic_version": "bedrock-2023-05-31", 27 |                   "max_tokens": 150, 28 |                   "messages.$": "$.PromptInput[0, 1][*][*]", 29 |                   "system.$": "$.claude_params.task_1.system", 30 |                   "stop_sequences.$": "$.claude_params.task_1.stop_sequences", 31 |                   "temperature": 0.7 32 |                 }, 33 |                 "ModelId": "arn:aws:bedrock:${AWSRegion}::foundation-model/anthropic.claude-3-haiku-20240307-v1:0" 34 |               }, 35 |               "Resource": "arn:aws:states:::bedrock:invokeModel", 36 |               "Type": "Task" 37 |             } 38 |           } 39 |         }, 40 |         { 41 |           "StartAt": "KB Bypass Policy", 42 |           "Comment": "Based on the last KB data and context, validates if a new retrieval is necessary", 43 |           "States": { 44 |             "KB Bypass Policy": { 45 |               "End": true, 46 |               "Parameters": { 47 |                 "Body": { 48 |                   "prompt.$": "States.Format('<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n{}<|eot_id|><|start_header_id|>user<|end_header_id|>\n{}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n{}', $.claude_params.task_2.system, $.PromptInput[0].messages, $.PromptInput.[2].task_2[0].content)", 49 |                   "temperature": 0.3, 50 |                   "top_p": 0.5, 51 |                   "max_gen_len": 50 52 |                 }, 53 |                 "ModelId": "arn:aws:bedrock:${AWSRegion}::foundation-model/meta.llama3-8b-instruct-v1:0" 54 |               }, 55 |               "Resource": "arn:aws:states:::bedrock:invokeModel", 56 |               "Type": "Task", 57 |               "ResultSelector": { 58 |                 "Body": { 59 |                   "content": [ 60 |                     { 61 |                       "text.$": "States.StringToJson($.Body.generation)" 62 |                     } 63 |                   ], 64 |                   "model": "meta.llama3-8b-instruct-v1:0", 65 |                   "usage": { 66 |                     "input_tokens.$": "$.Body.prompt_token_count", 67 |                     "output_tokens.$": "$.Body.generation_token_count" 68 |                   } 69 |                 } 70 |               } 71 |             } 72 |           } 73 |         } 74 |       ], 75 |       "Next": "Choice", 76 |       "ResultPath": "$.ParallelInput", 77 |       "Type": "Parallel", 78 |       "Parameters": { 79 |         "claude_params": { 80 
| "task_1": { 81 | "system": "", 82 | "stop_sequences": [ 83 | "" 84 | ] 85 | }, 86 | "task_2": { 87 | "system": "You are a json object generator that follow the Rules: 1. Read the user/assistant JSON interaction history. 2. If the answer to the user's last query was clearly and detailed answered in the conversation history answer the \"{\"boolean\":true}\". 3. If the user's last query was not clearly and detailed answered, return the \"{\"boolean\":false}\". 4. use the json schema for boolean and answer with the json format \"{\"boolean\":{value}\".", 88 | "stop_sequences": [ 89 | "" 90 | ] 91 | } 92 | }, 93 | "PromptInput": [ 94 | { 95 | "messages.$": "$.PromptInput" 96 | }, 97 | { 98 | "task_1": [ 99 | { 100 | "role": "assistant", 101 | "content": "" 102 | } 103 | ] 104 | }, 105 | { 106 | "task_2": [ 107 | { 108 | "role": "assistant", 109 | "content": "{\"boolean\":" 110 | } 111 | ] 112 | } 113 | ] 114 | } 115 | }, 116 | "Retrieve": { 117 | "Next": "GenerateResponseKb", 118 | "Parameters": { 119 | "KnowledgeBaseId": "${KnowledgeBaseId}", 120 | "RetrievalQuery": { 121 | "Text.$": "States.Format('{}, {}', States.ArrayGetItem($.PromptInput[-1:].content, 0), $.ParallelInput[0].Body.content[0].text)" 122 | } 123 | }, 124 | "Resource": "arn:aws:states:::aws-sdk:bedrockagentruntime:retrieve", 125 | "ResultPath": "$.KnowledgeBaseData", 126 | "Type": "Task", 127 | "Catch": [ 128 | { 129 | "ErrorEquals": [ 130 | "States.ALL" 131 | ], 132 | "Next": "GenerateResponseKBError", 133 | "ResultPath": "$.KnowledgeBaseData" 134 | } 135 | ], 136 | "ResultSelector": { 137 | "RetrievalResults.$": "$.RetrievalResults" 138 | } 139 | }, 140 | "GenerateResponse": { 141 | "Comment": "Generates the Output Json for the caller", 142 | "End": true, 143 | "Type": "Pass", 144 | "Parameters": { 145 | "bedrock_details": { 146 | "task_details": [ 147 | { 148 | "task_name": "task0", 149 | "task_model_id.$": "$.ParallelInput[0].Body.model", 150 | "input_token.$": 
"$.ParallelInput[0].Body.usage.input_tokens", 151 | "output_token.$": "$.ParallelInput[0].Body.usage.output_tokens" 152 | }, 153 | { 154 | "task_name": "task1", 155 | "task_model_id.$": "$.ParallelInput[1].Body.model", 156 | "input_token.$": "$.ParallelInput[1].Body.usage.input_tokens", 157 | "output_token.$": "$.ParallelInput[1].Body.usage.output_tokens" 158 | } 159 | ], 160 | "total_input_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.input_tokens, $.ParallelInput[1].Body.usage.input_tokens)", 161 | "total_output_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.output_tokens, $.ParallelInput[1].Body.usage.output_tokens)" 162 | }, 163 | "context_data": [], 164 | "system_chain_data": { 165 | "system_chain_prompt": "No Data generated this time, this is likely due to a greeting, repeated user's query or the answer is already provided in the conversation history. Check carefully and do not invent any information. You can answer user greetings in a friendly way.", 166 | "operation": "REPLACE_TAG", 167 | "configuration": { 168 | "replace_tag": "chain-information" 169 | } 170 | } 171 | } 172 | }, 173 | "GenerateResponseKb": { 174 | "Comment": "Generates the Output Json for the caller", 175 | "Type": "Pass", 176 | "Parameters": { 177 | "bedrock_details": { 178 | "task_details": [ 179 | { 180 | "task_name": "task0", 181 | "task_model_id.$": "$.ParallelInput[0].Body.model", 182 | "input_token.$": "$.ParallelInput[0].Body.usage.input_tokens", 183 | "output_token.$": "$.ParallelInput[0].Body.usage.output_tokens" 184 | }, 185 | { 186 | "task_name": "task1", 187 | "task_model_id.$": "$.ParallelInput[1].Body.model", 188 | "input_token.$": "$.ParallelInput[1].Body.usage.input_tokens", 189 | "output_token.$": "$.ParallelInput[1].Body.usage.output_tokens" 190 | } 191 | ], 192 | "total_input_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.input_tokens, $.ParallelInput[1].Body.usage.input_tokens)", 193 | "total_output_tokens.$": 
"States.MathAdd($.ParallelInput[0].Body.usage.output_tokens, $.ParallelInput[1].Body.usage.output_tokens)" 194 | }, 195 | "system_chain_data": { 196 | "system_chain_prompt.$": "States.JsonToString($.KnowledgeBaseData.RetrievalResults)", 197 | "operation": "REPLACE_TAG", 198 | "configuration": { 199 | "replace_tag": "chain-information" 200 | } 201 | }, 202 | "additional_messages": [ 203 | { 204 | "role": "assistant", 205 | "content": [ 206 | { 207 | "text": "I will follow the instructions in system parameters to provide the answer." 208 | } 209 | ] 210 | } 211 | ] 212 | }, 213 | "End": true 214 | }, 215 | "GenerateResponseKBError": { 216 | "Comment": "Generates the Output Json for KB error ", 217 | "End": true, 218 | "Type": "Pass", 219 | "Parameters": { 220 | "bedrock_details": { 221 | "task_details": [ 222 | { 223 | "task_name": "task0", 224 | "task_model_id.$": "$.ParallelInput[0].Body.model", 225 | "input_token.$": "$.ParallelInput[0].Body.usage.input_tokens", 226 | "output_token.$": "$.ParallelInput[0].Body.usage.output_tokens" 227 | }, 228 | { 229 | "task_name": "task1", 230 | "task_model_id.$": "$.ParallelInput[1].Body.model", 231 | "input_token.$": "$.ParallelInput[1].Body.usage.input_tokens", 232 | "output_token.$": "$.ParallelInput[1].Body.usage.output_tokens" 233 | } 234 | ], 235 | "total_input_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.input_tokens, $.ParallelInput[1].Body.usage.input_tokens)", 236 | "total_output_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.output_tokens, $.ParallelInput[1].Body.usage.output_tokens)" 237 | }, 238 | "system_chain_data": { 239 | "system_chain_prompt.$": "States.Format('You are an error handler task. Your rules are: 1. Inform that you are a error handler task. 2. Inform the user you can\\'t answer any question. 3. 
explain the Bedrock error content below: Error: {} Cause: {}.', $.KnowledgeBaseData.Error, $.KnowledgeBaseData.Cause)", 240 |           "operation": "REPLACE_ALL" 241 |         } 242 |       } 243 |     } 244 |   } 245 | } -------------------------------------------------------------------------------- /examples/sample_web_application/template.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: '2010-09-09' 2 | Transform: AWS::Serverless-2016-10-31 3 | Description: Example of a web app using the serverless assistant concept 4 | Globals: 5 |   Function: 6 |     Timeout: 60 7 | 8 | Parameters: 9 |   KnowledgeBaseId: 10 |     Type: String 11 |     Default: "Insert_your_Kb_id" 12 |   UserPoolClientId: 13 |     Type: String 14 |     Default: "Cognito User Pool Client ID" 15 |   UserPoolId: 16 |     Type: String 17 |     Default: "Cognito User Pool ID" 18 | 19 | Resources: 20 | 21 |   #DynamoDB Table for Assistant configuration data store 22 |   AssistantConfigTable: 23 |     Type: AWS::DynamoDB::Table 24 |     Properties: 25 |       TableName: !Sub ${AWS::StackName}-Config 26 |       BillingMode: PAY_PER_REQUEST 27 |       AttributeDefinitions: 28 |         - AttributeName: id 29 |           AttributeType: S 30 |         - AttributeName: item_type 31 |           AttributeType: S 32 |       KeySchema: 33 |         - AttributeName: id 34 |           KeyType: HASH 35 |         - AttributeName: item_type 36 |           KeyType: RANGE 37 | 38 |   #Lambda layer provides an interface to access config datasource 39 |   ConfigLayer: 40 |     Type: AWS::Serverless::LayerVersion 41 |     Properties: 42 |       LayerName: !Sub ${AWS::StackName}-ConfigLayer 43 |       Description: Layer to provide access to datasource 44 |       ContentUri: dependencies/ 45 |       CompatibleRuntimes: 46 |         - python3.12 47 |       CompatibleArchitectures: 48 |         - x86_64 49 | 50 |   #Lambda implements Config Layer and manages API GW routes 51 |   ConfigHandler: 52 |     Type: AWS::Serverless::Function 53 |     Properties: 54 |       Handler: config_handler.lambda_handler 55 |       CodeUri: config_handler/ 56 |       MemorySize: 512 57 |       Timeout: 10 58 |       Runtime: python3.12 59 | 
Environment: 60 | Variables: 61 | TABLE_NAME: !Ref AssistantConfigTable 62 | Layers: 63 | - !Ref ConfigLayer 64 | Policies: 65 | - DynamoDBCrudPolicy: 66 | TableName: !Ref AssistantConfigTable 67 | Events: 68 | GetAccountDetails: 69 | Type: HttpApi 70 | Properties: 71 | ApiId: !Ref ConfigApi 72 | Path: /accounts/{accountID}/account 73 | Method: GET 74 | Auth: 75 | Authorizer: OAuth2Authorizer 76 | 77 | GetInferenceEndpoint: 78 | Type: HttpApi 79 | Properties: 80 | ApiId: !Ref ConfigApi 81 | Path: /accounts/{accountID}/inference-endpoint 82 | Method: GET 83 | Auth: 84 | Authorizer: OAuth2Authorizer 85 | 86 | GetWorkflows: 87 | Type: HttpApi 88 | Properties: 89 | ApiId: !Ref ConfigApi 90 | Path: /accounts/{accountID}/workflows 91 | Method: GET 92 | Auth: 93 | Authorizer: OAuth2Authorizer 94 | 95 | GetWorkflowDetails: 96 | Type: HttpApi 97 | Properties: 98 | ApiId: !Ref ConfigApi 99 | Path: /accounts/{accountID}/workflows/{WorkflowID} 100 | Method: GET 101 | Auth: 102 | Authorizer: OAuth2Authorizer 103 | 104 | GetWorkflowPrompts: 105 | Type: HttpApi 106 | Properties: 107 | ApiId: !Ref ConfigApi 108 | Path: /accounts/{accountID}/workflows/{WorkflowID}/prompts 109 | Method: GET 110 | Auth: 111 | Authorizer: OAuth2Authorizer 112 | 113 | GetWorkflowPrompt: 114 | Type: HttpApi 115 | Properties: 116 | ApiId: !Ref ConfigApi 117 | Path: /accounts/{accountID}/workflows/{WorkflowID}/prompts/{PromptID} 118 | Method: GET 119 | Auth: 120 | Authorizer: OAuth2Authorizer 121 | SetConfig: 122 | Type: HttpApi 123 | Properties: 124 | ApiId: !Ref ConfigApi 125 | Path: /setconfiguration 126 | Method: post 127 | Auth: 128 | Authorizer: OAuth2Authorizer 129 | 130 | #API Gateway access to config data: 131 | ConfigApi: 132 | Type: AWS::Serverless::HttpApi 133 | Properties: 134 | CorsConfiguration: 135 | AllowOrigins: 136 | - '*' 137 | AllowHeaders: 138 | - '*' 139 | AllowMethods: 140 | - GET 141 | - POST 142 | - OPTIONS 143 | Auth: 144 | Authorizers: 145 | OAuth2Authorizer: 146 | 
AuthorizationScopes: 147 | - aws.cognito.signin.user.admin 148 | IdentitySource: "$request.header.Authorization" 149 | JwtConfiguration: 150 | Audience: 151 | - !Ref UserPoolClientId 152 | Issuer: !Sub 'https://cognito-idp.${AWS::Region}.amazonaws.com/${UserPoolId}' 153 | DefaultAuthorizer: OAuth2Authorizer 154 | 155 | #Core function 156 | FastAPIFunction: 157 | Type: AWS::Serverless::Function 158 | Properties: 159 | CodeUri: app/ 160 | Handler: run.sh 161 | Runtime: python3.12 162 | MemorySize: 1024 163 | Environment: 164 | Variables: 165 | AWS_LAMBDA_EXEC_WRAPPER: /opt/bootstrap 166 | AWS_LWA_INVOKE_MODE: response_stream 167 | PORT: 8000 168 | STATEMACHINE_STATE_MACHINE_ARN: !GetAtt StateMachine.Arn 169 | TABLE_NAME: !Ref AssistantConfigTable 170 | Layers: 171 | - !Sub arn:aws:lambda:${AWS::Region}:753240598075:layer:LambdaAdapterLayerX86:22 172 | - !Ref ConfigLayer 173 | FunctionUrlConfig: 174 | AuthType: AWS_IAM 175 | InvokeMode: RESPONSE_STREAM 176 | Policies: 177 | - DynamoDBCrudPolicy: 178 | TableName: !Ref AssistantConfigTable 179 | - Statement: 180 | - Effect: Allow 181 | Action: 182 | - states:StartSyncExecution 183 | Resource: !Sub arn:aws:states:${AWS::Region}:${AWS::AccountId}:stateMachine:* 184 | - Statement: 185 | - Effect: Allow 186 | Action: 187 | - bedrock:InvokeModelWithResponseStream 188 | Resource: !Sub arn:aws:bedrock:${AWS::Region}::foundation-model/* 189 | 190 | LambdaConfigHandlerPermission: 191 | Type: 'AWS::Lambda::Permission' 192 | Properties: 193 | Action: 'lambda:InvokeFunction' 194 | Principal: apigateway.amazonaws.com 195 | FunctionName: !Ref ConfigHandler 196 | SourceArn: !Sub 'arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${ConfigApi}/*' 197 | 198 | #Will be deployed as a lambda@edge for JWT validation @ cloudfront 199 | LambdaJWTVerify: 200 | Type: AWS::Serverless::Function 201 | Properties: 202 | Description: aws-jwt-verify 4.0.1 203 | AssumeRolePolicyDocument: 204 | { 205 | "Version": "2012-10-17", 206 | "Statement": [ 
207 | { 208 | "Effect": "Allow", 209 | "Principal": { 210 | "Service": [ 211 | "lambda.amazonaws.com", 212 | "edgelambda.amazonaws.com" 213 | ] 214 | }, 215 | "Action": "sts:AssumeRole" 216 | } 217 | ] 218 | } 219 | CodeUri: lambda-jwt-verify/src/ 220 | Handler: index.handler 221 | Runtime: nodejs20.x 222 | Timeout: 30 223 | 224 | AutoPublishAlias: dev 225 | VersionDescription: "1" 226 | 227 | StateMachine: 228 | Type: AWS::Serverless::StateMachine 229 | Properties: 230 | DefinitionUri: statemachine/rag_parallel_tasks/RagGenAI.asl.json 231 | Logging: 232 | Level: ALL 233 | IncludeExecutionData: true 234 | Destinations: 235 | - CloudWatchLogsLogGroup: 236 | LogGroupArn: !Sub 'arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${StateMachineLogGroup}:*' 237 | Policies: 238 | - AWSXrayWriteOnlyAccess 239 | - Statement: 240 | - Effect: Allow 241 | Action: 242 | - 'logs:CreateLogDelivery' 243 | - 'logs:GetLogDelivery' 244 | - 'logs:UpdateLogDelivery' 245 | - 'logs:DeleteLogDelivery' 246 | - 'logs:ListLogDeliveries' 247 | - 'logs:PutResourcePolicy' 248 | - 'logs:DescribeResourcePolicies' 249 | - 'logs:DescribeLogGroups' 250 | Resource: '*' 251 | - Statement: 252 | - Effect: Allow 253 | Action: 254 | - 'bedrock:Retrieve' 255 | Resource: !Sub 'arn:aws:bedrock:${AWS::Region}:${AWS::AccountId}:knowledge-base/${KnowledgeBaseId}' 256 | - Statement: 257 | - Effect: Allow 258 | Action: 259 | - 'bedrock:InvokeModel' 260 | Resource: !Sub 'arn:aws:bedrock:${AWS::Region}::foundation-model/*' 261 | Tracing: 262 | Enabled: true 263 | Type: EXPRESS 264 | DefinitionSubstitutions: 265 | AWSRegion: !Sub '${AWS::Region}' 266 | KnowledgeBaseId: !Ref KnowledgeBaseId 267 | 268 | StateMachineLogGroup: 269 | Type: AWS::Logs::LogGroup 270 | Properties: 271 | LogGroupName: !Sub '/aws/vendedlogs/states/${AWS::StackName}-StateMachine-Logs' 272 | 273 | #sigv4 for assistant core 274 | CloudfrontOAC: 275 | Type: AWS::CloudFront::OriginAccessControl 276 | Properties: 277 | 
OriginAccessControlConfig: 278 | Description: "Lambda OAC Signature" 279 | Name: "Serverless Assistant OAC" 280 | OriginAccessControlOriginType: lambda 281 | SigningBehavior: always 282 | SigningProtocol: sigv4 283 | 284 | Distribution: 285 | Type: AWS::CloudFront::Distribution 286 | Properties: 287 | DistributionConfig: 288 | DefaultCacheBehavior: 289 | CachePolicyId: 4135ea2d-6df8-44a3-9df3-4b5a84be39ad 290 | OriginRequestPolicyId: b689b0a8-53d0-40ab-baf2-68738e2966ac 291 | ResponseHeadersPolicyId: 5cc3b908-e619-4b99-88e5-2cf7f45965bd 292 | TargetOriginId: FastAPIFunctionOrigin 293 | ViewerProtocolPolicy: https-only 294 | AllowedMethods: 295 | - "POST" 296 | - "HEAD" 297 | - "PATCH" 298 | - "DELETE" 299 | - "PUT" 300 | - "GET" 301 | - "OPTIONS" 302 | LambdaFunctionAssociations: 303 | - EventType: origin-request 304 | LambdaFunctionARN: !Sub '${LambdaJWTVerify.Arn}:1' 305 | IncludeBody: false 306 | Enabled: true 307 | HttpVersion: http2 308 | Origins: 309 | - Id: FastAPIFunctionOrigin 310 | DomainName: !Select [ 0, !Split [ '/', !Select [ 1, !Split [ '://', !GetAtt FastAPIFunctionUrl.FunctionUrl ] ] ] ] 311 | CustomOriginConfig: 312 | OriginProtocolPolicy: https-only 313 | OriginAccessControlId: !GetAtt CloudfrontOAC.Id 314 | PriceClass: PriceClass_100 315 | ViewerCertificate: 316 | CloudFrontDefaultCertificate: true 317 | 318 | CloudFrontInvokeURLPermission: 319 | Type: AWS::Lambda::Permission 320 | Properties: 321 | Action: lambda:InvokeFunctionUrl 322 | FunctionName: !Ref FastAPIFunction 323 | Principal: 'cloudfront.amazonaws.com' 324 | SourceArn: !Sub 'arn:aws:cloudfront::${AWS::AccountId}:distribution/${Distribution}' 325 | 326 | 327 | 328 | 329 | Outputs: 330 | FastAPIFunction: 331 | Description: FastAPI Lambda Function ARN 332 | Value: !GetAtt FastAPIFunction.Arn 333 | 334 | CloudfrontURL: 335 | Description: CloudFront URL 336 | Value: !Sub 'https://${Distribution.DomainName}/bedrock_converse_api' 337 | 338 | ConfigUrl: 339 | Description: API Gateway URL
340 | Value: !Sub 'https://${ConfigApi}.execute-api.${AWS::Region}.amazonaws.com/' 341 | 342 | StepFunctionArn: 343 | Description: Step Function ARN 344 | Value: !GetAtt StateMachine.Arn 345 | 346 | DynamoDBTable: 347 | Description: DynamoDB table name 348 | Value: !Ref AssistantConfigTable -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/serverless_assistant_rag/__init__.py -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/app/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/serverless_assistant_rag/app/__init__.py -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/app/main.py: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: MIT-0 3 | 4 | from fastapi.responses import StreamingResponse 5 | from typing import List, Optional, Literal 6 | from typing_extensions import Annotated 7 | from pydantic_core import to_json 8 | from pydantic import BaseModel, field_validator, model_validator 9 | from pydantic import Field 10 | from fastapi import FastAPI 11 | from os import environ 12 | import json 13 | import boto3 14 | 15 | # Create the FastAPI app 16 | app = FastAPI( 17 | title="""Sample of a serverless GenAI assistant. it expects two objects. bedrock_parameters and 18 | assistant_parameters. 
While bedrock_parameters holds the standard parameters of the Bedrock 19 | invoke_model_with_response_stream call, the assistant_parameters object defines the behaviour of the integration between 20 | the Lambda function and the Step Functions workflow, allowing the caller to pass data between invocations to enhance the LLM 21 | response quality""" 22 | ) 23 | 24 | # Boto3 clients for Bedrock and Step Functions 25 | bedrock_boto_client = boto3.client("bedrock-runtime") 26 | sf_boto_client = boto3.client("stepfunctions") 27 | 28 | 29 | class AssistantParameters(BaseModel): 30 | # The content_tag defines the tag that wraps the Step Functions output 31 | # content in the system prompt. 32 | content_tag: str = Field( 33 | default="DOCUMENT", 34 | description="the name of the tag that will wrap the retrieved content from Step Functions", 35 | ) 36 | messages_to_sample: int = Field( 37 | default=5, 38 | description="Number of most recent messages from the history that will be sent to the workflow", 39 | ) 40 | state_machine_custom_params: dict = Field( 41 | default={}, 42 | description="Additional params to use as data in the workflow" 43 | ) 44 | 45 | 46 | class BedrockClaudeMessagesAPIRequest(BaseModel): 47 | """ 48 | This object defines the request body for Amazon Bedrock using the Claude Messages API.
49 | Claude models that support the Messages API - https://docs.anthropic.com/claude/reference/messages, 50 | Bedrock doc: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html 51 | 52 | """ 53 | 54 | anthropic_version: str = "bedrock-2023-05-31" 55 | messages: List[dict] = [{"role": "user", "content": "Who are you?"}] 56 | temperature: float = 1.0 57 | top_p: float = 0.999 58 | top_k: int = 250 59 | max_tokens: int = 1028 60 | stop_sequences: List[str] = ["\n\nHuman:"] 61 | modelId: str = "anthropic.claude-3-sonnet-20240229-v1:0" 62 | system: str = "" 63 | 64 | 65 | class WorkflowBedrockTaskDetails(BaseModel): 66 | task_name: str 67 | task_model_id: str 68 | input_token: int 69 | output_token: int 70 | 71 | 72 | class WorkflowBedrockDetails(BaseModel): 73 | task_details: List[WorkflowBedrockTaskDetails] 74 | total_input_tokens: int 75 | total_output_tokens: int 76 | 77 | 78 | class PromptChainParametersConfiguration(BaseModel): 79 | replace_tag: str 80 | 81 | 82 | class PromptChainParameters(BaseModel): 83 | system_chain_prompt: str 84 | operation: Literal["APPEND", "REPLACE_TAG", "REPLACE_ALL"] 85 | configuration: Optional[PromptChainParametersConfiguration] = None 86 | 87 | # Check REPLACE_TAG configuration 88 | @model_validator(mode="after") 89 | def config_validation(self): 90 | if self.operation == "REPLACE_TAG" and (self.configuration is None or not self.configuration.replace_tag): 91 | raise ValueError("'REPLACE_TAG' option must be used with a configuration.replace_tag str attribute") 92 | return self 93 | 94 | 95 | class StepFunctionResponse(BaseModel): 96 | bedrock_details: WorkflowBedrockDetails = Field( 97 | description="usage information about Bedrock in the workflow" 98 | ) 99 | 100 | context_data: List = Field( 101 | default=[], 102 | description="Content returned from the State Machine" 103 | ) 104 | 105 | system_chain_data:
Optional[PromptChainParameters] = Field( 106 | default=None, 107 | description="Prompt Chain Operations" 108 | ) 109 | 110 | 111 | # Execute a Step Functions synchronous execution 112 | def execute_workflow(state_machine_arn, input_data): 113 | response = sf_boto_client.start_sync_execution( 114 | stateMachineArn=state_machine_arn, input=json.dumps(input_data) 115 | ) 116 | # raise errors from workflow step executions 117 | if response["status"] != "SUCCEEDED": 118 | error_msg = (f"State machine error. Expected: 'SUCCEEDED', Received: " 119 | f"{response['status']}. Details:\nERROR: {response['error']}, " 120 | f"CAUSE: {response['cause']}") 121 | raise ValueError(error_msg) 122 | 123 | output_data = StepFunctionResponse.model_validate_json(response['output']) 124 | return output_data 125 | 126 | 127 | # Validate or build the KB context based on the user prompt + context 128 | def sf_build_context(messages, system, content_tag, state_machine_custom_params): 129 | sf_workflow_result = execute_workflow( 130 | state_machine_arn=environ["STATEMACHINE_STATE_MACHINE_ARN"], 131 | input_data={ 132 | "PromptInput": messages, 133 | "state_machine_custom_params": state_machine_custom_params, 134 | }, 135 | ) 136 | 137 | context = sf_build_chain_content( 138 | system, 139 | sf_workflow_result, 140 | content_tag 141 | ) 142 | 143 | return context 144 | 145 | 146 | def sf_build_chain_content(system: str, sf_data: StepFunctionResponse, content_tag: str) -> str: 147 | default_content = f"<{content_tag}>{sf_data.context_data}</{content_tag}>" 148 | if sf_data.system_chain_data: 149 | system_chain_prompt, operation = sf_data.system_chain_data.system_chain_prompt, sf_data.system_chain_data.operation 150 | if operation == "APPEND": 151 | return f"{system}\n\n{default_content}\n\n{system_chain_prompt}" 152 | 153 | elif operation == "REPLACE_TAG": 154 | replace_tag = sf_data.system_chain_data.configuration.replace_tag 155 | 156 | opening_tag = f"<{replace_tag}>" 157 | closing_tag = f"</{replace_tag}>" 158 | 159 | formatted_prompt = opening_tag +
system_chain_prompt + closing_tag 160 | 161 | return f"{system.replace(opening_tag + closing_tag, formatted_prompt)}\n\n{default_content}" 162 | 163 | elif operation == "REPLACE_ALL": 164 | return f"{system_chain_prompt}\n\n{default_content}" 165 | 166 | else: 167 | return f"{system}\n\n{default_content}" 168 | 169 | 170 | async def bedrock_stream(bedrock_params): 171 | try: 172 | response = bedrock_boto_client.invoke_model_with_response_stream( 173 | body=to_json(bedrock_params, exclude=["modelId"]), 174 | modelId=bedrock_params.modelId, 175 | ) 176 | except Exception as e: 177 | yield str(e) 178 | else: 179 | for event in response.get("body"): 180 | chunk = json.loads(event["chunk"]["bytes"]) 181 | if chunk["type"] == "content_block_delta": 182 | if chunk["delta"]["type"] == "text_delta": 183 | yield chunk["delta"]["text"] 184 | 185 | 186 | @app.post("/bedrock_claude_messages_api") 187 | async def execute_chain( 188 | bedrock_parameters: BedrockClaudeMessagesAPIRequest, 189 | assistant_parameters: AssistantParameters, 190 | ): 191 | try: 192 | # Filter the number of message history to be sent to the workflow 193 | messages = ( 194 | bedrock_parameters.messages[-assistant_parameters.messages_to_sample:] 195 | if assistant_parameters.messages_to_sample <= len(bedrock_parameters.messages) 196 | else bedrock_parameters.messages 197 | ) 198 | 199 | system = bedrock_parameters.system 200 | content_tag = assistant_parameters.content_tag 201 | custom_params = assistant_parameters.state_machine_custom_params 202 | 203 | # Generate chain instructions and context prompt 204 | chain_data = sf_build_context( 205 | messages, 206 | system, 207 | content_tag, 208 | custom_params 209 | ) 210 | 211 | # Update chain instructions and context 212 | bedrock_parameters.system = chain_data 213 | return StreamingResponse( 214 | bedrock_stream(bedrock_parameters), 215 | media_type="text/plain; charset=utf-8" 216 | ) 217 | 218 | except Exception as e: 219 | error_message = str(e) + 
"\n\nRequest content:\n" 220 | error_message += f"bedrock_parameters: {bedrock_parameters}" 221 | error_message += f"\nassistant_parameters: {assistant_parameters}" 222 | return StreamingResponse( 223 | error_message, 224 | media_type="text/plain; charset=utf-8" 225 | ) 226 | -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/app/requirements.txt: -------------------------------------------------------------------------------- 1 | annotated-types==0.6.0 2 | anyio==4.2.0 3 | click==8.1.7 4 | exceptiongroup==1.2.0 5 | fastapi==0.109.2 6 | h11==0.14.0 7 | idna==3.7 8 | pydantic==2.6.1 9 | pydantic_core==2.16.2 10 | sniffio==1.3.0 11 | starlette==0.36.3 12 | typing_extensions==4.9.0 13 | uvicorn==0.27.0.post1 14 | -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/app/run.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | PATH="$PATH:$LAMBDA_TASK_ROOT/bin" PYTHONPATH="$LAMBDA_TASK_ROOT" exec python -m uvicorn --port="$PORT" main:app -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/statemachine/rag_parallel_tasks/RagGenAI.asl.json: -------------------------------------------------------------------------------- 1 | { 2 | "Comment": "GenAI workflow that validates prompt rules in parallel and executes RAG using the Bedrock Agents KB", 3 | "StartAt": "Parallel", 4 | "States": { 5 | "Choice": { 6 | "Choices": [ 7 | { 8 | "Variable": "$.ParallelInput[1].Body.content[0].text", 9 | "BooleanEquals": false, 10 | "Next": "Retrieve" 11 | } 12 | ], 13 | "Default": "GenerateResponse", 14 | "Type": "Choice" 15 | }, 16 | "Parallel": { 17 | "Branches": [ 18 | { 19 | "StartAt": "Insert Keywords", 20 | "Comment": "This state validates the user prompt and augments the content by inserting keywords related to the topic.
The KB will have more context", 21 | "States": { 22 | "Insert Keywords": { 23 | "End": true, 24 | "Parameters": { 25 | "Body": { 26 | "anthropic_version": "bedrock-2023-05-31", 27 | "max_tokens": 150, 28 | "messages.$": "$.PromptInput[0, 1][*][*]", 29 | "system.$": "$.claude_params.task_1.system", 30 | "stop_sequences.$": "$.claude_params.task_1.stop_sequences", 31 | "temperature": 0.7 32 | }, 33 | "ModelId": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0" 34 | }, 35 | "Resource": "arn:aws:states:::bedrock:invokeModel", 36 | "Type": "Task" 37 | } 38 | } 39 | }, 40 | { 41 | "StartAt": "KB Bypass Policy", 42 | "Comment": "Based on the last KB data and context, validates whether a new retrieval is necessary", 43 | "States": { 44 | "KB Bypass Policy": { 45 | "End": true, 46 | "Parameters": { 47 | "Body": { 48 | "prompt.$": "States.Format('<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n{}<|eot_id|><|start_header_id|>user<|end_header_id|>\n{}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n{}', $.claude_params.task_2.system, $.PromptInput[0].messages, $.PromptInput[2].task_2[0].content)", 49 | "temperature": 0.3, 50 | "top_p": 0.5, 51 | "max_gen_len": 50 52 | }, 53 | "ModelId": "arn:aws:bedrock:us-east-1::foundation-model/meta.llama3-8b-instruct-v1:0" 54 | }, 55 | "Resource": "arn:aws:states:::bedrock:invokeModel", 56 | "Type": "Task", 57 | "ResultSelector": { 58 | "Body": { 59 | "content": [ 60 | { 61 | "text.$": "States.StringToJson($.Body.generation)" 62 | } 63 | ], 64 | "model": "meta.llama3-8b-instruct-v1:0", 65 | "usage": { 66 | "input_tokens.$": "$.Body.prompt_token_count", 67 | "output_tokens.$": "$.Body.generation_token_count" 68 | } 69 | } 70 | } 71 | } 72 | } 73 | } 74 | ], 75 | "Next": "Choice", 76 | "ResultPath": "$.ParallelInput", 77 | "Type": "Parallel", 78 | "Parameters": { 79 | "claude_params": { 80 | "task_1": { 81 | "system": "", 82 | "stop_sequences": [ 83 | "" 84 | ] 85 | }, 86 | "task_2": { 87 | "system": "You are a
json object generator that follow the Rules: 1. Read the user/assistant JSON interaction history. 2. If the answer to the user's last query was clearly and detailed answered in the conversation history answer the \"{\"boolean\":true}\". 3. If the user's last query was not clearly and detailed answered, return the \"{\"boolean\":false}\". 4. use the json schema for boolean and answer with the json format \"{\"boolean\":{value}\". 5. For messages where the user intent is just a clear greeting like 'hello' or 'hi', answer \"{\"boolean\":true}\"", 88 | "stop_sequences": [ 89 | "" 90 | ] 91 | } 92 | }, 93 | "PromptInput": [ 94 | { 95 | "messages.$": "$.PromptInput" 96 | }, 97 | { 98 | "task_1": [ 99 | { 100 | "role": "assistant", 101 | "content": "" 102 | } 103 | ] 104 | }, 105 | { 106 | "task_2": [ 107 | { 108 | "role": "assistant", 109 | "content": "{\"boolean\":" 110 | } 111 | ] 112 | } 113 | ] 114 | } 115 | }, 116 | "Retrieve": { 117 | "Next": "GenerateResponseKb", 118 | "Parameters": { 119 | "KnowledgeBaseId": "${KnowledgeBaseId}", 120 | "RetrievalQuery": { 121 | "Text.$": "States.Format('{}, {}', States.ArrayGetItem($.PromptInput[-1:].content, 0), $.ParallelInput[0].Body.content[0].text)" 122 | } 123 | }, 124 | "Resource": "arn:aws:states:::aws-sdk:bedrockagentruntime:retrieve", 125 | "ResultPath": "$.KnowledgeBaseData", 126 | "Type": "Task", 127 | "Catch": [ 128 | { 129 | "ErrorEquals": [ 130 | "States.ALL" 131 | ], 132 | "Next": "GenerateResponseKBError", 133 | "ResultPath": "$.KnowledgeBaseData" 134 | } 135 | ], 136 | "ResultSelector": { 137 | "RetrievalResults.$": "$.RetrievalResults" 138 | } 139 | }, 140 | "GenerateResponse": { 141 | "Comment": "Generates the Output Json for the caller", 142 | "End": true, 143 | "Type": "Pass", 144 | "Parameters": { 145 | "bedrock_details": { 146 | "task_details": [ 147 | { 148 | "task_name": "task0", 149 | "task_model_id.$": "$.ParallelInput[0].Body.model", 150 | "input_token.$": "$.ParallelInput[0].Body.usage.input_tokens", 
151 | "output_token.$": "$.ParallelInput[0].Body.usage.output_tokens" 152 | }, 153 | { 154 | "task_name": "task1", 155 | "task_model_id.$": "$.ParallelInput[1].Body.model", 156 | "input_token.$": "$.ParallelInput[1].Body.usage.input_tokens", 157 | "output_token.$": "$.ParallelInput[1].Body.usage.output_tokens" 158 | } 159 | ], 160 | "total_input_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.input_tokens, $.ParallelInput[1].Body.usage.input_tokens)", 161 | "total_output_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.output_tokens, $.ParallelInput[1].Body.usage.output_tokens)" 162 | }, 163 | "context_data": [], 164 | "system_chain_data": { 165 | "system_chain_prompt": "No Data generated this time, this is likely due to a greeting, repeated user's query or the answer is already provided in the conversation history. Check carefully and do not invent any information. You can answer user greetings in a friendly way.", 166 | "operation": "REPLACE_TAG", 167 | "configuration": { 168 | "replace_tag": "chain-information" 169 | } 170 | } 171 | } 172 | }, 173 | "GenerateResponseKb": { 174 | "Comment": "Generates the Output Json for the caller", 175 | "Type": "Pass", 176 | "Parameters": { 177 | "bedrock_details": { 178 | "task_details": [ 179 | { 180 | "task_name": "task0", 181 | "task_model_id.$": "$.ParallelInput[0].Body.model", 182 | "input_token.$": "$.ParallelInput[0].Body.usage.input_tokens", 183 | "output_token.$": "$.ParallelInput[0].Body.usage.output_tokens" 184 | }, 185 | { 186 | "task_name": "task1", 187 | "task_model_id.$": "$.ParallelInput[1].Body.model", 188 | "input_token.$": "$.ParallelInput[1].Body.usage.input_tokens", 189 | "output_token.$": "$.ParallelInput[1].Body.usage.output_tokens" 190 | } 191 | ], 192 | "total_input_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.input_tokens, $.ParallelInput[1].Body.usage.input_tokens)", 193 | "total_output_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.output_tokens, 
$.ParallelInput[1].Body.usage.output_tokens)" 194 | }, 195 | "context_data.$": "$.KnowledgeBaseData.RetrievalResults" 196 | }, 197 | "End": true 198 | }, 199 | "GenerateResponseKBError": { 200 | "Comment": "Generates the output JSON for KB errors", 201 | "End": true, 202 | "Type": "Pass", 203 | "Parameters": { 204 | "bedrock_details": { 205 | "task_details": [ 206 | { 207 | "task_name": "task0", 208 | "task_model_id.$": "$.ParallelInput[0].Body.model", 209 | "input_token.$": "$.ParallelInput[0].Body.usage.input_tokens", 210 | "output_token.$": "$.ParallelInput[0].Body.usage.output_tokens" 211 | }, 212 | { 213 | "task_name": "task1", 214 | "task_model_id.$": "$.ParallelInput[1].Body.model", 215 | "input_token.$": "$.ParallelInput[1].Body.usage.input_tokens", 216 | "output_token.$": "$.ParallelInput[1].Body.usage.output_tokens" 217 | } 218 | ], 219 | "total_input_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.input_tokens, $.ParallelInput[1].Body.usage.input_tokens)", 220 | "total_output_tokens.$": "States.MathAdd($.ParallelInput[0].Body.usage.output_tokens, $.ParallelInput[1].Body.usage.output_tokens)" 221 | }, 222 | "context_data": [ 223 | { 224 | "Error.$": "$.KnowledgeBaseData.Error", 225 | "Cause.$": "$.KnowledgeBaseData.Cause", 226 | "Text": "Did you create the Bedrock KB? Check the URL: https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html" 227 | } 228 | ], 229 | "system_chain_data": { 230 | "system_chain_prompt": "You are an error handler task. Your rules are: 1. Inform that you are an error handler task. 2. Inform the user you can't answer any question. 3.
explain the bedrock error in the document tag.", 231 | "operation": "REPLACE_ALL" 232 | } 233 | } 234 | } 235 | } 236 | } -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/template.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: '2010-09-09' 2 | Transform: AWS::Serverless-2016-10-31 3 | Description: Serverless GenAI Assistant combining Step Functions for orchestration and Lambda for stream response. 4 | Globals: 5 | Function: 6 | Timeout: 60 7 | 8 | Parameters: 9 | KnowledgeBaseId: 10 | Type: String 11 | Default: "Insert_your_Kb_id" 12 | 13 | Resources: 14 | FastAPIFunction: 15 | Type: AWS::Serverless::Function 16 | Properties: 17 | CodeUri: app/ 18 | Handler: run.sh 19 | Runtime: python3.12 20 | MemorySize: 512 21 | Environment: 22 | Variables: 23 | AWS_LAMBDA_EXEC_WRAPPER: /opt/bootstrap 24 | AWS_LWA_INVOKE_MODE: response_stream 25 | PORT: 8000 26 | STATEMACHINE_STATE_MACHINE_ARN: !GetAtt StateMachine.Arn 27 | Layers: 28 | - !Sub 'arn:aws:lambda:${AWS::Region}:753240598075:layer:LambdaAdapterLayerX86:20' 29 | FunctionUrlConfig: 30 | AuthType: NONE 31 | InvokeMode: RESPONSE_STREAM 32 | Policies: 33 | - StepFunctionsExecutionPolicy: 34 | StateMachineName: !Ref StateMachine 35 | - Statement: 36 | - Effect: Allow 37 | Action: 38 | - 'states:StartSyncExecution' 39 | Resource: !Ref StateMachine 40 | - Statement: 41 | - Effect: Allow 42 | Action: 43 | - 'bedrock:InvokeModelWithResponseStream' 44 | Resource: !Sub 'arn:aws:bedrock:${AWS::Region}::foundation-model/*' 45 | 46 | StateMachine: 47 | Type: AWS::Serverless::StateMachine 48 | Properties: 49 | DefinitionUri: statemachine/rag_parallel_tasks/RagGenAI.asl.json 50 | Logging: 51 | Level: ALL 52 | IncludeExecutionData: true 53 | Destinations: 54 | - CloudWatchLogsLogGroup: 55 | LogGroupArn: !Sub 'arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${StateMachineLogGroup}:*' 56 | 
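The express StateMachine above is invoked synchronously by the FastAPI function: `execute_workflow` in main.py sends a document of the shape `{"PromptInput": ..., "state_machine_custom_params": ...}` via `start_sync_execution` and treats any non-SUCCEEDED status as an error. A minimal sketch of that contract follows; the `simulated` response dict is hypothetical test data, and no AWS call is made.

```python
# Sketch of the synchronous-invocation contract used by execute_workflow in
# main.py: build the input document sent to start_sync_execution and check
# the returned status. The `simulated` response below is made-up test data.
import json

def build_workflow_input(messages: list, custom_params: dict) -> str:
    return json.dumps({
        "PromptInput": messages,
        "state_machine_custom_params": custom_params,
    })

def check_execution(response: dict) -> str:
    # Express sync executions report FAILED/TIMED_OUT in the response body
    # rather than raising, so the caller must inspect `status` itself.
    if response["status"] != "SUCCEEDED":
        raise ValueError(f"State machine error: {response.get('error')}: {response.get('cause')}")
    return response["output"]

payload = build_workflow_input([{"role": "user", "content": "hi"}], {})
simulated = {"status": "SUCCEEDED", "output": '{"context_data": []}'}
print(check_execution(simulated))  # → {"context_data": []}
```

This is why the Lambda only needs the `states:StartSyncExecution` permission granted in the template: the workflow result comes back in the same API response, with no polling.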
Policies: 57 | - AWSXrayWriteOnlyAccess 58 | - Statement: 59 | - Effect: Allow 60 | Action: 61 | - 'logs:CreateLogDelivery' 62 | - 'logs:GetLogDelivery' 63 | - 'logs:UpdateLogDelivery' 64 | - 'logs:DeleteLogDelivery' 65 | - 'logs:ListLogDeliveries' 66 | - 'logs:PutResourcePolicy' 67 | - 'logs:DescribeResourcePolicies' 68 | - 'logs:DescribeLogGroups' 69 | Resource: '*' 70 | - Statement: 71 | - Effect: Allow 72 | Action: 73 | - 'bedrock:Retrieve' 74 | Resource: !Sub 'arn:aws:bedrock:${AWS::Region}:${AWS::AccountId}:knowledge-base/${KnowledgeBaseId}' 75 | - Statement: 76 | - Effect: Allow 77 | Action: 78 | - 'bedrock:InvokeModel' 79 | Resource: !Sub 'arn:aws:bedrock:${AWS::Region}::foundation-model/*' 80 | Tracing: 81 | Enabled: true 82 | Type: EXPRESS 83 | DefinitionSubstitutions: 84 | KnowledgeBaseId: !Ref KnowledgeBaseId 85 | 86 | StateMachineLogGroup: 87 | Type: AWS::Logs::LogGroup 88 | Properties: 89 | LogGroupName: !Sub '/aws/vendedlogs/states/${AWS::StackName}-StateMachine-Logs' 90 | 91 | Outputs: 92 | TheFastAPIFunctionUrl: 93 | Description: Function URL for FastAPI function 94 | Value: !Sub '${FastAPIFunctionUrl.FunctionUrl}bedrock_claude_messages_api' 95 | 96 | -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/serverless_assistant_rag/tests/__init__.py -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/tests/openapi.json: -------------------------------------------------------------------------------- 1 | { 2 | "openapi": "3.1.0", 3 | "info": { 4 | "title": "Sample of a serverless GenAI assistant. it expects two objects. 
bedrock_parameters and assistant_parameters.\nwhile bedrock parameters holds the standard parameters of Bedrock invoke_model_with_response_stream\nthe assistant parameters object define behaviour about the integration between the lambda and step function\nthat allows the user to pass data between the invocations to enhance the llm response quality", 5 | "version": "0.1.0" 6 | }, 7 | "paths": { 8 | "/bedrock_claude_messages_api": { 9 | "post": { 10 | "summary": "Execute Chain", 11 | "operationId": "execute_chain_bedrock_claude_messages_api_post", 12 | "requestBody": { 13 | "content": { 14 | "application/json": { 15 | "schema": { 16 | "$ref": "#/components/schemas/Body_execute_chain_bedrock_claude_messages_api_post" 17 | } 18 | } 19 | }, 20 | "required": true 21 | }, 22 | "responses": { 23 | "200": { 24 | "description": "Successful Response", 25 | "content": { 26 | "application/json": { 27 | "schema": {} 28 | } 29 | } 30 | }, 31 | "422": { 32 | "description": "Validation Error", 33 | "content": { 34 | "application/json": { 35 | "schema": { 36 | "$ref": "#/components/schemas/HTTPValidationError" 37 | } 38 | } 39 | } 40 | } 41 | } 42 | } 43 | } 44 | }, 45 | "components": { 46 | "schemas": { 47 | "AssistantParameters": { 48 | "properties": { 49 | "content_tag": { 50 | "type": "string", 51 | "title": "Content Tag", 52 | "description": "the name of the tag that will wrap the retrieved content from step functions", 53 | "default": "DOCUMENT" 54 | }, 55 | "messages_to_sample": { 56 | "type": "integer", 57 | "title": "Messages To Sample", 58 | "description": "Number of most recent messages from the history that will be sent to the workflow", 59 | "default": 5 60 | }, 61 | "state_machine_custom_params": { 62 | "type": "object", 63 | "title": "State Machine Custom Params", 64 | "description": "Additional params to use as data in workflow", 65 | "default": {} 66 | } 67 | }, 68 | "type": "object", 69 | "title": "AssistantParameters" 70 | }, 71 | "BedrockClaudeMessagesAPIRequest": {
"properties": { 73 | "anthropic_version": { 74 | "type": "string", 75 | "title": "Anthropic Version", 76 | "default": "bedrock-2023-05-31" 77 | }, 78 | "messages": { 79 | "items": { 80 | "type": "object" 81 | }, 82 | "type": "array", 83 | "title": "Messages", 84 | "default": [ 85 | { 86 | "role": "user", 87 | "content": "Who are you?" 88 | } 89 | ] 90 | }, 91 | "temperature": { 92 | "type": "number", 93 | "title": "Temperature", 94 | "default": 1.0 95 | }, 96 | "top_p": { 97 | "type": "number", 98 | "title": "Top P", 99 | "default": 0.999 100 | }, 101 | "top_k": { 102 | "type": "integer", 103 | "title": "Top K", 104 | "default": 250 105 | }, 106 | "max_tokens": { 107 | "type": "integer", 108 | "title": "Max Tokens", 109 | "default": 1028 110 | }, 111 | "stop_sequences": { 112 | "items": { 113 | "type": "string" 114 | }, 115 | "type": "array", 116 | "title": "Stop Sequences", 117 | "default": [ 118 | "\n\nHuman:" 119 | ] 120 | }, 121 | "modelId": { 122 | "type": "string", 123 | "title": "Modelid", 124 | "default": "anthropic.claude-3-sonnet-20240229-v1:0" 125 | }, 126 | "system": { 127 | "type": "string", 128 | "title": "System", 129 | "default": "" 130 | } 131 | }, 132 | "type": "object", 133 | "title": "BedrockClaudeMessagesAPIRequest", 134 | "description": "This object defines the request body for the Amazon Bedrock using Claude messages API.\nClaude models that supports messaging API - https://docs.anthropic.com/claude/reference/messages,\nBedrock doc: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html" 135 | }, 136 | "Body_execute_chain_bedrock_claude_messages_api_post": { 137 | "properties": { 138 | "bedrock_parameters": { 139 | "$ref": "#/components/schemas/BedrockClaudeMessagesAPIRequest" 140 | }, 141 | "assistant_parameters": { 142 | "$ref": "#/components/schemas/AssistantParameters" 143 | } 144 | }, 145 | "type": "object", 146 | "required": [ 147 | "bedrock_parameters", 148 | "assistant_parameters" 149 | 
], 150 | "title": "Body_execute_chain_bedrock_claude_messages_api_post" 151 | }, 152 | "HTTPValidationError": { 153 | "properties": { 154 | "detail": { 155 | "items": { 156 | "$ref": "#/components/schemas/ValidationError" 157 | }, 158 | "type": "array", 159 | "title": "Detail" 160 | } 161 | }, 162 | "type": "object", 163 | "title": "HTTPValidationError" 164 | }, 165 | "ValidationError": { 166 | "properties": { 167 | "loc": { 168 | "items": { 169 | "anyOf": [ 170 | { 171 | "type": "string" 172 | }, 173 | { 174 | "type": "integer" 175 | } 176 | ] 177 | }, 178 | "type": "array", 179 | "title": "Location" 180 | }, 181 | "msg": { 182 | "type": "string", 183 | "title": "Message" 184 | }, 185 | "type": { 186 | "type": "string", 187 | "title": "Error Type" 188 | } 189 | }, 190 | "type": "object", 191 | "required": [ 192 | "loc", 193 | "msg", 194 | "type" 195 | ], 196 | "title": "ValidationError" 197 | } 198 | } 199 | } 200 | } -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/tests/st_serverless_assistant.py: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: MIT-0 3 | 4 | import streamlit as st 5 | import requests 6 | import json 7 | from sys import argv 8 | from re import match, IGNORECASE 9 | from system_prompt_samples import rag_prompt_sample 10 | 11 | # This is a simple validation to help the user pass a correct Lambda URL. 12 | # Comment out this code snippet if you want to use another URL pattern 13 | 14 | if argv[1] != "--lambda-url" or not match( 15 | "https://[a-z0-9]+\\.lambda-url\\.[a-z]{2}-[a-z]+-[0-9]\\.on\\.aws[/]", 16 | argv[2], 17 | IGNORECASE, 18 | ): 19 | error_msg = "Arg parse error:\n\ 20 | Invalid arg or url. 
Check the parameters and try again\n \ 21 | \n\texpected: streamlit run -- --lambda-url https://.lambda-url..on.aws/path\n\n" 22 | st.error(error_msg) 23 | st.error("Got: " + argv[1] + " " + argv[2]) 24 | print(error_msg) 25 | print("Press CTRL+C to exit") 26 | st.stop() 27 | # End of url check 28 | 29 | # lambda url 30 | lambda_url = argv[2] 31 | 32 | # Prompt engineering that will be used to invoke Bedrock in the Lambda function. Note that this prompt contains the 33 | # instructions to answer the user. Each Task in Step Functions will have an individual prompt. 34 | system_prompt = rag_prompt_sample 35 | 36 | greetings = """Welcome to the sample **Serverless GenAI Assistant**! :wave: The default prompt instructions are intended 37 | to answer questions based on a knowledge base. If you do not have one, check this guide: KB ( 38 | https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html). To obtain different 39 | answers, edit the prompt instructions in the system parameter.""" 40 | 41 | st.set_page_config(page_title="Serverless GenAI Assistant") 42 | st.title("Serverless GenAI Assistant") 43 | # Create a container for the welcome message 44 | welcome_container = st.container() 45 | 46 | # Add the welcome message to the container 47 | with welcome_container: 48 | st.info(greetings) 49 | 50 | 51 | def stream_bedrock_response(): 52 | headers = {"content-type": "application/json"} 53 | 54 | payload = { 55 | "bedrock_parameters": { 56 | "messages": st.session_state.messages, 57 | "temperature": float(temperature), 58 | "top_p": float(top_p), 59 | "top_k": int(top_k), 60 | "max_tokens": int(max_tokens), 61 | # transform the comma-separated string into a list 62 | "stop_sequences": stop_sequences.split(","), 63 | "system": str(system), 64 | "modelId": str(model_id), 65 | }, 66 | "assistant_parameters": { 67 | "messages_to_sample": messages_to_sample, 68 | "content_tag": "document", 69 | "state_machine_custom_params": 
{"hello": "state_machine"}, 70 | }, 71 | } 72 | 73 | response = requests.post(lambda_url, json=payload, headers=headers, stream=True, timeout=60) 74 | 75 | if response.status_code == 200: 76 | bot_response = "" 77 | for chunk in response.iter_content(chunk_size=256): 78 | if chunk: 79 | new_content = chunk.decode() 80 | bot_response += new_content 81 | yield new_content # Yield only the new content 82 | return bot_response # Return the complete bot response 83 | else: 84 | yield "An error occurred while processing the response" 85 | 86 | #print(st.session_state.messages) 87 | 88 | 89 | # Initialize chat history 90 | if "messages" not in st.session_state: 91 | st.session_state.messages = [] 92 | 93 | # Add widgets for configuring parameters 94 | temperature = st.sidebar.slider( 95 | "Temperature", min_value=0.0, max_value=1.0, value=0.9, step=0.1 96 | ) 97 | # top_p = st.sidebar.slider("Top-p", min_value=0.0, max_value=1.0, value=0.999, step=0.001) 98 | top_p = st.sidebar.slider( 99 | "Top-p", min_value=0.0, max_value=1.0, value=0.999, step=0.001, format="%.3f" 100 | ) 101 | top_k = st.sidebar.number_input("Top-k", min_value=1, value=250, step=1) 102 | max_tokens = st.sidebar.number_input( 103 | "max_tokens", min_value=1, max_value=2000, value=650, step=1 104 | ) 105 | stop_sequences = st.sidebar.text_input("Stop Sequences", value="\n\nHuman:") 106 | system = st.sidebar.text_area("System", value=system_prompt) 107 | model_id = st.sidebar.radio( 108 | "Model ID", 109 | [ 110 | "anthropic.claude-3-sonnet-20240229-v1:0", 111 | "anthropic.claude-3-haiku-20240307-v1:0", 112 | ], 113 | index=1, 114 | ) 115 | messages_to_sample = st.sidebar.number_input( 116 | "Messages to Sample (last N messages to sent to Step Function)", 117 | min_value=1, 118 | max_value=20, 119 | value=5, 120 | step=1, 121 | ) 122 | 123 | # Keep messages on rerun 124 | for message in st.session_state.messages: 125 | with st.chat_message(message["role"]): 126 | st.markdown(message["content"]) 127 | 128 
| # User input 129 | if prompt := st.chat_input("Prompt"): 130 | # Add user message to chat history 131 | st.session_state.messages.append({"role": "user", "content": prompt}) 132 | with st.chat_message("user"): 133 | st.markdown(prompt) 134 | 135 | # Display assistant response in chat message 136 | with st.chat_message("assistant"): 137 | assistant_response = st.write_stream(stream_bedrock_response()) 138 | 139 | # Add response to chat history 140 | st.session_state.messages.append( 141 | {"role": "assistant", "content": assistant_response} 142 | ) 143 | -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/tests/system_prompt_samples.py: -------------------------------------------------------------------------------- 1 | rag_prompt_sample = """ 2 | You are an AI assistant and you are part of a prompt chain that collects external data. 3 | 4 | Your objective is to answer user queries based on the information contained in the <document> tag. 5 | 6 | The <document> array contains data retrieved from a knowledge base. Your goal is to reason about this data and provide answers 7 | to queries based solely on the array content. 8 | 9 | The prompt chain may insert additional information in the <document> XML tag. Reason about the content in the tag 10 | to provide the user's answer. 11 | 12 | 13 | 14 | You must follow these rules: 15 | 16 | 17 | 1. If the document does not contain any data, check the conversation history to understand if the question was already answered; do not invent any data. 18 | 19 | 2. In your responses, you should infer information from the document while citing relevant verbatim excerpts from the "text" fields that support your reasoning. Do not attempt to infer any other information beyond quoting and explaining. 20 | 21 | 3. Be concerned with using proper grammar and spelling in your explanations, but reproduce excerpts exactly as they appear. 22 | 23 | 4. 
Read the document data carefully and use your reasoning capabilities to provide thorough answers to queries. 24 | 25 | 5. Do not use any external information sources. Answer only based on what can be derived from the provided document content. Do not attempt to infer any other information. 26 | 27 | For each query that has enough data, answer with: 28 | 29 | - The answer to the query, verbatim, and the explanations. 30 | - The URI(s) corresponding to the source(s) 31 | - A "score" indicating your confidence in the answer on a 0-1 scale 32 | 33 | 34 | {{query}} 35 | Assistant: According to the documents {{explanation}} the excerpt {{excerpt}} {{Explanation}} Sources used to answer: {{Uri}} Score: {{score}} 36 | {{query}} 37 | Assistant: I could not find information to answer that query in the given document. 38 | {{query}} 39 | Error: {{explanation}} 40 | 41 | """ -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/tests/test.json: -------------------------------------------------------------------------------- 1 | { 2 | "bedrock_parameters": { 3 | "messages": [ 4 | { 5 | "role": "user", 6 | "content": "What is AWS?" 7 | } 8 | ], 9 | "temperature": 0.9, 10 | "top_p": 0.999, 11 | "top_k": 250, 12 | "max_tokens": 650, 13 | "stop_sequences": [ 14 | "\n\nHuman:" 15 | ], 16 | "system": "\nYou are an AI assistant to help explain information contained in a JSON data source provided in the \"KnowledgeBaseData\" array. \n\nThe JSON contains data retrieved from the Amazon Bedrock knowledge base API. Your goal is to reason about this data and provide answers to queries based solely on the content in the \"text\" fields of the \"KnowledgeBaseData\".\n\nYou must follow these rules:\n\n1. If the document contains a null \"text\" field, check if the answer was already provided in the conversation history; do not invent any data. \n\n2. 
In your responses, you should infer information from the document while citing relevant verbatim excerpts from the \"text\" fields that support your reasoning. Do not attempt to infer any other information beyond quoting and explaining. \n\n3. Be concerned with using proper grammar and spelling in your explanations, but reproduce excerpts exactly as they appear.\n\n4. Read the document data carefully and use your reasoning capabilities to provide thorough answers to queries.\n\n5. Do not use any external information sources. Answer only based on what can be derived from the provided document content. Do not attempt to infer any other information.\n\n6. If the document contains the attribute/value \"Error: BedrockAgentRuntime.ResourceNotFoundException\": 7. Identify that the document is an error message 8. Quote the full error\nmessage verbatim in \"Text\" attribute, including any relevant details like service name, status code, request ID, etc. 9. Explain that since this is an error message, \nit does not contain information to answer the query directly. 10. Do not attempt to infer any other information beyond quoting and explaining \nthe error message itself. \n\nFor each query, answer with:\n\n- The answer to the query, verbatim, and the explanations.\n- The URI(s) corresponding to the source(s)\n- A \"score\" indicating your confidence in the answer on a 0-1 scale\n\n\nHuman: What is AWS?\nAssistant: According to the documents {{explanation}} the excerpt {{excerpt}} {{Explanation}} Sources used to answer: {{Uri}} Score: {{score}}\nHuman: What is Infinidash? \nAssistant: I could not find information to answer that query in the given document.\nHuman: What is Amazon Timestream?\nBedrockAgentRuntime.ResourceNotFoundException\nAssistant: It looks like Bedrock returned an error. 
The error is: {{Text}}\n\n", 17 | "modelId": "anthropic.claude-3-haiku-20240307-v1:0" 18 | }, 19 | "assistant_parameters": { 20 | "messages_to_sample": 5, 21 | "content_tag": "document", 22 | "state_machine_custom_params": { 23 | "hello": "state_machine" 24 | } 25 | } 26 | } -------------------------------------------------------------------------------- /examples/serverless_assistant_rag/tests/test_stream_python_requests.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import json 3 | from sys import argv 4 | from re import match, IGNORECASE 5 | from system_prompt_samples import rag_prompt_sample 6 | 7 | 8 | 9 | # This is a simple validation to help the user pass a correct Lambda URL. 10 | # Comment out this code snippet if you want to use another URL pattern 11 | 12 | if argv[1] != "--lambda-url" or not match( 13 | "https://[a-z0-9]+\\.lambda-url\\.[a-z]{2}-[a-z]+-[0-9]\\.on\\.aws[/]", 14 | argv[2], 15 | IGNORECASE, 16 | ): 17 | error_msg = "Arg parse error:\n\ 18 | Invalid arg or url. Check the parameters and try again\n \ 19 | \n\texpected: python test_stream_python_requests.py --lambda-url https://.lambda-url..on.aws/path\n\n" 20 | print(error_msg) 21 | exit(1) 22 | lambda_url = argv[2] 23 | 24 | 25 | def stream_bedrock_response(): 26 | 27 | print("Testing stream bedrock response") 28 | 29 | 30 | 31 | 32 | headers = {"content-type": "application/json"} 33 | 34 | payload = { 35 | "bedrock_parameters": { 36 | "messages": [{"role": "user", "content": "Hello"}], 37 | "temperature": 0.9, 38 | "top_p": 0.999, 39 | "top_k": 250, 40 | "max_tokens": 650, 41 | "stop_sequences": ["\n\nHuman:"], 42 | "system": '\nYou are an AI assistant to help explain information contained in a JSON data source provided in the "KnowledgeBaseData" array. \n\nThe JSON contains data retrieved from the Amazon Bedrock knowledge base API. 
Your goal is to reason about this data and provide answers to queries based solely on the content in the "text" fields of the "KnowledgeBaseData".\n\nYou must follow these rules:\n\n1. If the document contains a null "text" field, check if the answer was already provided in the conversation history; do not invent any data. \n\n2. In your responses, you should infer information from the document while citing relevant verbatim excerpts from the "text" fields that support your reasoning. Do not attempt to infer any other information beyond quoting and explaining. \n\n3. Be concerned with using proper grammar and spelling in your explanations, but reproduce excerpts exactly as they appear.\n\n4. Read the document data carefully and use your reasoning capabilities to provide thorough answers to queries.\n\n5. Do not use any external information sources. Answer only based on what can be derived from the provided document content. Do not attempt to infer any other information.\n\n6. If the document contains the attribute/value "Error: BedrockAgentRuntime.ResourceNotFoundException": 7. Identify that the document is an error message 8. Quote the full error\nmessage verbatim in "Text" attribute, including any relevant details like service name, status code, request ID, etc. 9. Explain that since this is an error message, \nit does not contain information to answer the query directly. 10. Do not attempt to infer any other information beyond quoting and explaining \nthe error message itself. \n\nFor each query, answer with:\n\n- The answer to the query, verbatim, and the explanations.\n- The URI(s) corresponding to the source(s)\n- A "score" indicating your confidence in the answer on a 0-1 scale\n\n\nHuman: What is AWS?\nAssistant: According to the documents {{explanation}} the excerpt {{excerpt}} {{Explanation}} Sources used to answer: {{Uri}} Score: {{score}}\nHuman: What is Infinidash? 
\nAssistant: I could not find information to answer that query in the given document.\nHuman: What is Amazon Timestream?\nBedrockAgentRuntime.ResourceNotFoundException\nAssistant: It looks like Bedrock returned an error. The error is: {{Text}}\n\n', 43 | "modelId": "anthropic.claude-3-haiku-20240307-v1:0", 44 | }, 45 | "assistant_parameters": {"messages_to_sample": 5, "content_tag": "document"}, 46 | } 47 | 48 | response = requests.post(lambda_url, json=payload, headers=headers, stream=True, timeout=60) 49 | print(response.__dict__) 50 | print(response.status_code) 51 | print("end of response") 52 | 53 | 54 | bot_response = "" 55 | for chunk in response.iter_content(chunk_size=256): # larger chunks reduce the risk of splitting multi-byte UTF-8 characters 56 | if chunk: 57 | new_content = chunk.decode() 58 | bot_response += new_content 59 | yield new_content # Yield only the new content 60 | return bot_response # Return the complete bot response 61 | 62 | 63 | if __name__ == "__main__": 64 | data = stream_bedrock_response() 65 | for chunk in data: 66 | print(chunk) -------------------------------------------------------------------------------- /imgs/architecture-promptchain-stream.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/imgs/architecture-promptchain-stream.png -------------------------------------------------------------------------------- /imgs/architecture-stream.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/imgs/architecture-stream.jpg -------------------------------------------------------------------------------- /imgs/assistant_example.gif: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/imgs/assistant_example.gif -------------------------------------------------------------------------------- /imgs/stepfunctions-rag-graph.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/imgs/stepfunctions-rag-graph.png --------------------------------------------------------------------------------