├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── examples
│   ├── sample_web_application
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── amplify_ui
│   │   │   ├── amplify.yml
│   │   │   ├── amplify
│   │   │   │   ├── auth
│   │   │   │   │   ├── pre-token-generation
│   │   │   │   │   │   ├── handler.ts
│   │   │   │   │   │   └── resources.ts
│   │   │   │   │   └── resource.ts
│   │   │   │   ├── backend.ts
│   │   │   │   ├── package.json
│   │   │   │   └── tsconfig.json
│   │   │   ├── dist
│   │   │   │   ├── assets
│   │   │   │   │   ├── index-2zi0aqmG.css
│   │   │   │   │   └── index-DguTJVXA.js
│   │   │   │   ├── favicon.ico
│   │   │   │   └── index.html
│   │   │   ├── env.d.ts
│   │   │   ├── index.html
│   │   │   ├── package-lock.json
│   │   │   ├── package.json
│   │   │   ├── public
│   │   │   │   └── favicon.ico
│   │   │   ├── src
│   │   │   │   ├── App.vue
│   │   │   │   ├── assets
│   │   │   │   │   ├── base.css
│   │   │   │   │   ├── logo.svg
│   │   │   │   │   └── main.css
│   │   │   │   ├── components
│   │   │   │   │   ├── AssistantConfig.vue
│   │   │   │   │   ├── ChatComponent.vue
│   │   │   │   │   ├── MenuBar.vue
│   │   │   │   │   └── icons
│   │   │   │   │       ├── IconCommunity.vue
│   │   │   │   │       ├── IconDocumentation.vue
│   │   │   │   │       ├── IconEcosystem.vue
│   │   │   │   │       ├── IconSupport.vue
│   │   │   │   │       └── IconTooling.vue
│   │   │   │   └── main.ts
│   │   │   ├── tsconfig.app.json
│   │   │   ├── tsconfig.json
│   │   │   ├── tsconfig.node.json
│   │   │   └── vite.config.ts
│   │   ├── app
│   │   │   ├── __init__.py
│   │   │   ├── api_models
│   │   │   │   ├── __init__.py
│   │   │   │   ├── assistant_model.py
│   │   │   │   ├── bedrock_converse_model.py
│   │   │   │   └── workflow_model.py
│   │   │   ├── logging.yaml
│   │   │   ├── main.py
│   │   │   ├── requirements.txt
│   │   │   └── run.sh
│   │   ├── assistant_config.py
│   │   ├── config_handler
│   │   │   └── config_handler.py
│   │   ├── dependencies
│   │   │   └── python
│   │   │       └── assistant_config_interface
│   │   │           ├── __init__.py
│   │   │           └── data_manager.py
│   │   ├── imgs
│   │   │   ├── Architecture.jpeg
│   │   │   ├── amplify_deployed.jpeg
│   │   │   ├── amplify_intro.jpeg
│   │   │   ├── amplify_save_deploy.jpeg
│   │   │   ├── app_settings.jpeg
│   │   │   ├── download_backend_resources.png
│   │   │   ├── repository_select.jpeg
│   │   │   ├── sample.gif
│   │   │   └── select_source_code.jpeg
│   │   ├── lambda-jwt-verify
│   │   │   └── src
│   │   │       ├── index.mjs
│   │   │       └── package.json
│   │   ├── package-lock.json
│   │   ├── statemachine
│   │   │   └── rag_parallel_tasks
│   │   │       └── RagGenAI.asl.json
│   │   └── template.yaml
│   └── serverless_assistant_rag
│       ├── __init__.py
│       ├── app
│       │   ├── __init__.py
│       │   ├── main.py
│       │   ├── requirements.txt
│       │   └── run.sh
│       ├── statemachine
│       │   └── rag_parallel_tasks
│       │       └── RagGenAI.asl.json
│       ├── template.yaml
│       └── tests
│           ├── __init__.py
│           ├── openapi.json
│           ├── st_serverless_assistant.py
│           ├── system_prompt_samples.py
│           ├── test.json
│           └── test_stream_python_requests.py
└── imgs
    ├── architecture-promptchain-stream.png
    ├── architecture-stream.jpg
    ├── assistant_example.gif
    └── stepfunctions-rag-graph.png
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing Guidelines
2 |
3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
4 | documentation, we greatly value feedback and contributions from our community.
5 |
6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
7 | information to effectively respond to your bug report or contribution.
8 |
9 |
10 | ## Reporting Bugs/Feature Requests
11 |
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 |
14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 |
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 |
22 |
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 |
26 | 1. You are working against the latest source on the *main* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 |
30 | To send us a pull request, please:
31 |
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 |
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 |
42 |
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
45 |
46 |
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 |
52 |
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 |
56 |
57 | ## Licensing
58 |
59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
2 |
3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this
4 | software and associated documentation files (the "Software"), to deal in the Software
5 | without restriction, including without limitation the rights to use, copy, modify,
6 | merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
7 | permit persons to whom the Software is furnished to do so.
8 |
9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
10 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
11 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
12 | HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
13 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
14 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Serverless GenAI Assistant
2 |
3 | This example implements a GenAI assistant that streams responses using AWS Lambda with AWS Lambda Web Adapter, and takes advantage of the low-code approach of AWS Step Functions to orchestrate workflows with Amazon Bedrock and other AWS services.
4 |
5 | ![Serverless GenAI Assistant architecture](imgs/architecture-promptchain-stream.png)
6 |
7 | ### How does it work?
8 |
9 | The example uses AWS Lambda to execute a [Synchronous Express Workflow](https://docs.aws.amazon.com/step-functions/latest/dg/concepts-express-synchronous.html) that orchestrates multiple Bedrock API requests, allowing the user to implement prompt engineering techniques like [Prompt Chaining](https://www.promptingguide.ai/techniques/prompt_chaining) and [ReAct](https://www.promptingguide.ai/techniques/react) using [Workflow Studio](https://docs.aws.amazon.com/step-functions/latest/dg/workflow-studio-components.html) and [ASL](https://states-language.net/spec.html), a JSON-based declarative language, to prototype and experiment with different integrations and prompt techniques without extensive effort. Lambda first invokes the [state machine](https://docs.aws.amazon.com/step-functions/latest/dg/getting-started-with-sfn.html#key-concepts-get-started) and uses its output as the input to the [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) API, which returns a streamed response to the caller. Because the response is streamed, the TTFB (Time to First Byte) is shorter, improving the user experience of the GenAI assistant, which fits customer-facing service scenarios.
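Below is a minimal sketch of this flow using boto3. The ARN, model ID and payload fields are placeholders for illustration; the sample's actual implementation lives in ``app/main.py``:

```python
import json
import boto3

state_machine_arn = "<STATEMACHINE_STATE_MACHINE_ARN>"  # placeholder

sfn = boto3.client("stepfunctions")
bedrock = boto3.client("bedrock-runtime")

# 1. Run the Synchronous Express Workflow and wait for its output
workflow = sfn.start_sync_execution(
    stateMachineArn=state_machine_arn,
    input=json.dumps({"user_input": "Hello"}),
)
workflow_output = json.loads(workflow["output"])

# 2. Use the workflow output as input for the streamed inference
stream = bedrock.invoke_model_with_response_stream(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "system": f"Answer using this context: {json.dumps(workflow_output)}",
        "messages": [{"role": "user", "content": "Hello"}],
    }),
)

# 3. Forward the chunks to the caller as they arrive (shorter TTFB)
for event in stream["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    if chunk.get("type") == "content_block_delta":
        print(chunk["delta"].get("text", ""), end="", flush=True)
```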
10 |
11 | The Lambda function uses [AWS Lambda Web Adapter](https://github.com/awslabs/aws-lambda-web-adapter/tree/main) and FastAPI. AWS Lambda Web Adapter allows developers to build web apps (HTTP API) with familiar frameworks (e.g. Express.js, Next.js, Flask, SpringBoot, ASP.NET and Laravel - anything that speaks HTTP 1.1/1.0).
12 | The Lambda function can be accessed using a [function url](https://docs.aws.amazon.com/lambda/latest/dg/lambda-urls.html) configured in [Response Stream mode](https://docs.aws.amazon.com/lambda/latest/dg/configuration-response-streaming.html).
13 |
14 | ### How to take advantage of this pattern?
15 |
16 | The combination of AWS Step Functions using Express Workflows and the Bedrock response stream API allows scenarios where the builder can use state machine [Tasks](https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-task-state.html) to define prompt chaining [subtasks](https://docs.anthropic.com/claude/docs/chain-prompts#tips-for-effective-prompt-chaining). Each Step Functions Task can invoke the Amazon Bedrock API with a specific large language model, in sequence or in parallel for non-dependent prompt subtasks, and then orchestrate the result with other AWS service APIs like AWS Lambda, Amazon API Gateway and [third-party APIs](https://docs.aws.amazon.com/step-functions/latest/dg/connect-third-party-apis.html).
17 |
18 | Breaking LLM tasks into subtasks makes prompt engineering easier, since you focus on the specific result of each subtask. It's also easier to control the output tokens, which reduces the [latency](https://docs.anthropic.com/claude/docs/reducing-latency#2-optimize-prompt-and-output-length) of subtasks that don't need to generate large outputs. For example, you can take the user prompt and ask a more compact LLM like Claude 3 Haiku to return a boolean value, then use it to define a deterministic path in the state machine by using a [Choice state](https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-choice-state.html) to check the true/false value, as in the sketch below. It will consume a small number of output tokens, reducing latency and cost. The inference for the final response can be executed by Claude 3 Sonnet using a stream response for better response quality and [TTFB](https://docs.anthropic.com/claude/docs/reducing-latency#measuring-latency).
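For illustration, such a Choice state could look like the hypothetical ASL fragment below (the state and variable names are invented for this example; the sample's real state machine is in ``statemachine/rag_parallel_tasks/RagGenAI.asl.json``):

```json
"CheckBypassRetrieval": {
  "Type": "Choice",
  "Choices": [
    {
      "Variable": "$.bypass_retrieval",
      "BooleanEquals": true,
      "Next": "GenerateResponse"
    }
  ],
  "Default": "RetrieveFromKnowledgeBase"
}
```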
19 |
20 | Here are a few scenarios where this pattern can be used:
21 |
22 | - RAG: Step Functions invokes Bedrock to augment the user input, improving the user question by adding keywords based on the conversation context, and then invokes Knowledge Bases for Amazon Bedrock to execute a semantic search. While the "add keywords" and "invoke Knowledge Bases" tasks are low-latency tasks, the user response is generated using a stream response for a better experience.
23 | - Router: Step Functions can be implemented as a [router](https://community.aws/content/2duj7Wn4o0gSC6gNMsSFpeYZ5uO/rethinking-ai-agents-why-a-simple-router-may-be-all-you-need) to merge deterministic and non-deterministic scenarios, for example: understand that the customer contact is a potential churn and start a retention workflow.
24 | - A/B Testing: Use A/B testing in Step Functions, taking advantage of the low-code fast pace of implementation to test different experiments, make adjustments, and elect the best one for your business. While you focus on the business rule, the Lambda function acts as an interface abstraction, since you don't need to change the Lambda code or the API contract for every experiment.
25 | - API calling: The user input can be used to prompt the LLM to generate data as JSON or XML, or to create a SQL query based on a table structure. Then you can use the LLM output in Step Functions Tasks that use the generated data to call APIs and execute SQL queries on databases. The Lambda function can use the Step Functions output to stream a response and provide reasoning over the generated data.
26 | ---
27 | ## Implementation details
28 |
29 | ### State Machine - RAG
30 |
31 | ![RagGenAI state machine graph](imgs/stepfunctions-rag-graph.png)
32 |
33 | Using the state machine in ``examples/serverless_assistant_rag`` to exemplify how the workflow works: it implements a RAG architecture using [Claude 3 Haiku](https://aws.amazon.com/bedrock/claude/) with the [Messages API](https://docs.anthropic.com/claude/reference/messages-examples), and also [Meta Llama 3 chat](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3).
34 |
35 | 1. Receives the input data from Lambda and parallelizes it into two Bedrock API tasks.
36 | 2. The first task enhances the user question by adding keywords in order to augment the semantic search. It uses Claude 3 Haiku.
37 | 3. The second task is a bypass policy. It checks the conversation context and returns true/false to validate whether a KB retrieval is really necessary. Skipping the retrieval reduces the answer latency. It uses Llama 3 8B Instruct.
38 | 4. If the result is false, the Knowledge Base is invoked using the user question + the added keywords; otherwise no content is added. Note that there is an error handler if the Retrieve task fails: it adds the error content to the ``context_output`` and uses the ``system_chain_data`` to modify the original instructions.
39 | 5. In order to return a structured response, [Pass states](https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-pass-state.html) are used to format the JSON output. This technique keeps the Tasks flexible while the Pass state filters/formats the output to keep the API contract.
40 |
41 |
42 | ### Lambda
43 |
44 | The Lambda function is packaged as a Zip and the Lambda Adapter is attached as a layer, similar to the [fastapi-response-streaming-zip](https://github.com/awslabs/aws-lambda-web-adapter/tree/main/examples/fastapi-response-streaming-zip) example. After the deployment you can check the CloudFormation output key TheFastAPIFunctionUrl, which contains the Lambda URL to invoke.
45 |
46 | Lambda invokes the express workflow defined in the `STATEMACHINE_STATE_MACHINE_ARN` environment variable and merges the content returned in the `ContextOutput` attribute into the `system` parameter. The ContextOutput will be wrapped in [XML tags](https://docs.anthropic.com/claude/docs/use-xml-tags), which can be defined by the parameter `content_tag`.
47 |
48 | The Lambda function is also the last subtask of the prompt chaining, and it's ready to call Claude models that support the Messages API. It means that the `ContextData` attribute in the state machine response is sent to the Bedrock stream API. The prompt instructions that guide the user interaction should be sent to Lambda by the caller using the [`system`](https://docs.anthropic.com/claude/docs/system-prompts) parameter. The Lambda function will then stream the response to the caller in chunks; check the [Run](#run) section to see examples.
49 |
50 | Here's an example of the prompt instructions format for the `system` parameter, where the `content_tag` is the string 'document' and the `ContextData` is a JSON:
51 |
52 | ```python
53 | content_tag = 'document'
54 |
55 | # The prompt below is the default prompt sent to the Lambda function by the caller during the Lambda URL invocation
56 | system = """
57 |
58 | You are a virtual assistant, use the JSON in the document xml tag to answer the user.
59 | Use the attribute 'data_source' to provide the URL used as a source.
60 |
61 | """
62 |
63 | # ContextData is part of the Step Functions response
64 | ContextData = {
65 |     "text": "Example",
66 |     "data_source": "URL"
67 | }
68 |
69 | # Format a new system attribute
70 | f"{system}<{content_tag}>{ContextData}</{content_tag}>"
71 | ```
72 |
73 | >**Note:** For this sample the ``template.yaml`` defines the ``AuthType`` of ``FunctionUrlConfig`` as ``None``. This sample is intended to provide an experimentation environment where the user can explore the architecture.
74 | > For production use cases, check how you can add an [auth](https://docs.aws.amazon.com/lambda/latest/dg/urls-auth.html) layer and how to [protect](https://aws.amazon.com/blogs/compute/protecting-an-aws-lambda-function-url-with-amazon-cloudfront-and-lambdaedge/) the Lambda URL.
75 |
76 |
77 | ### Prompt Instructions
78 |
79 | #### First prompt
80 | The prompt engineering architecture for this sample explores the [Prompt Chaining](https://www.promptingguide.ai/techniques/prompt_chaining) technique. To implement this technique the Lambda function expects to receive instructions in the ``system`` parameter.
81 | Use the first invocation to pass the instructions about how you expect the assistant to answer the user. In general, use the first prompt to define techniques like [Role Playing](https://docs.anthropic.com/claude/docs/give-claude-a-role),
82 | [Few-Shot Prompting](https://www.promptingguide.ai/techniques/fewshot) and others. Using the first prompt to handle the more complex instructions allows you to use
83 | the ``InvokeModelWithResponseStream`` API with powerful models that tend to be slower, while the shorter TTFB helps achieve better responsiveness.
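For example, a first prompt could look like the hypothetical one below (the real samples are in ``examples/serverless_assistant_rag/tests/system_prompt_samples.py``):

```python
# Hypothetical first prompt -- see tests/system_prompt_samples.py for the real samples.
system = """
You are a support assistant for the Example Corp store (role playing).
Only answer questions related to the store and its products.

Example interaction (few-shot):
User: Where is my order?
Assistant: I can check that for you. Could you share your order number?
"""
```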
84 |
85 |
86 |
87 | #### Next prompts
88 |
89 | The next prompts are defined in each Step Functions Task. Whether to reuse the first prompt in the Step Functions tasks depends on the use case.
90 | Design the prompts in Step Functions Tasks in a concise and straightforward way, focusing on solving a specific problem. Experiment with different models.
91 |
92 | You can check the first prompt example at ``examples/serverless_assistant_rag/tests/system_prompt_samples.py`` and the prompt of each Task at ``examples/serverless_assistant_rag/statemachine/rag_parallel_tasks/RagGenAI.asl.json``.
93 |
94 | ---
95 | ### Passing data to Step Functions state machine
96 |
97 | You can send data besides the LLM parameters through Lambda to the Step Functions state machine using the optional parameter ``state_machine_custom_params``. It expects a ``dict`` where you can choose the data you want to use in the state machine.
98 | As an example, if you would like to send information about a logged user, you can use ``state_machine_custom_params`` and handle this information in the Step Functions state machine. Once the data is sent to Step Functions, you can use all the resources in the
99 | state machine to handle the information.
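As a sketch, a caller could send custom data like below. The field names are assumptions for illustration; the actual API contract is defined by the FastAPI app in ``app/main.py``:

```python
import requests

lambda_url = "<TheFastAPIFunctionUrl>"  # from the CloudFormation output

payload = {
    "system": "You are a virtual assistant...",
    "messages": [{"role": "user", "content": "What is the status of my order?"}],
    # Optional data forwarded to the state machine, e.g. logged user info
    "state_machine_custom_params": {"logged_user": "jane.doe", "user_locale": "en-US"},
}

with requests.post(lambda_url, json=payload, stream=True) as response:
    for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, end="", flush=True)
```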
100 |
101 | ### Chaining to InvokeModelWithResponseStream
102 |
103 | In this example the ``InvokeModelWithResponseStream`` call is the last subtask of the prompt chaining. It means that if the execution
104 | workflow in Step Functions gets new data relevant to the original prompt instructions, it needs to send this data to the stream API separately
105 | from the ``context_data`` parameter, because ``context_data`` is used to provide information chunks that help the LLM generate the answer, not instructions. To handle this scenario you can use the ``system_chain_data`` parameter.
106 |
107 | The ``system_chain_prompt`` takes the original instructions in the ``system`` parameter and allows you to apply the following operations
108 | to that prompt chain execution instance (sketched in the code below):
109 |
110 | - ``APPEND`` - Adds new instructions to the end of the original ``system`` parameter.
111 | - ``REPLACE_TAG`` - Replaces the first mapped tag in ``system`` with new data. It provides a mechanism to
112 | merge new instructions into the original prompt instructions.
113 | - ``REPLACE_ALL`` - Replaces the whole prompt in ``system`` with the content of the ``system_chain_prompt`` parameter. Useful for cases where a system error happened,
114 | or for scenarios that need a new prompt for each flow.
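A minimal sketch of how these operations could be applied (illustrative logic only; the actual handling is implemented in the sample's Lambda code):

```python
def apply_system_chain_data(system: str, system_chain_data: dict) -> str:
    """Apply a system_chain_data operation to the original system prompt."""
    prompt = system_chain_data["system_chain_prompt"]
    operation = system_chain_data["operation"]
    if operation == "APPEND":
        return f"{system}\n{prompt}"  # add new instructions to the end
    if operation == "REPLACE_TAG":
        tag = system_chain_data["configuration"]["replace_tag"]
        return system.replace(tag, prompt, 1)  # merge new data into the first mapped tag
    if operation == "REPLACE_ALL":
        return prompt  # discard the original instructions entirely
    raise ValueError(f"Unknown operation: {operation}")
```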
115 |
116 | ##### Why use the system_chain_data attribute?
117 | It avoids creating complex prompts. When you add many instructions to an LLM at once, the prompt engineering tends to
118 | get more complex to deal with edge scenarios. Instead of creating a prompt with a few rules that define how to answer questions plus
119 | exception rules for specific cases, which demands more reasoning from the model, you can manipulate the instructions based on the path of your workflow
120 | and give direct instructions, so your LLM will not deal with ambiguity.
121 |
122 | #### Syntax
123 |
124 |
125 |
126 |
127 | ```
128 | "system_chain_data": {
129 |     "system_chain_prompt": string,
130 |     "operation": "APPEND" | "REPLACE_TAG" | "REPLACE_ALL",
131 |     "configuration": {
132 |         "replace_tag": string
133 |     }
134 | }
135 | ```
136 |
137 | Parameters:
138 |
139 | - system_chain_data (dict)
140 |   - system_chain_prompt (string) - Prompt instructions.
141 |   - operation (string) - The action to be executed on the Lambda ``system`` parameter.
142 |   - configuration (dict) - At the moment, configuration is intended to support the ``REPLACE_TAG`` operation.
143 |     - replace_tag (string) - Required if ``REPLACE_TAG`` is used. The tag in the original content of the ``system``
144 | parameter to be replaced.
145 |
146 | #### Usage
147 |
148 | Add the ``system_chain_data`` block in the Step Functions state machine and check that it is being sent as part of the state
149 | machine response. You can check implementation examples in ``examples/serverless_assistant_rag/statemachine/rag_parallel_tasks/RagGenAI.asl.json``, in the ``GenerateResponse`` and ``GenerateResponseKB`` states.
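For illustration only, a state could return the block in a shape like the hypothetical Pass state below (the state name and parameter wiring are invented for this example):

```json
"FormatErrorOutput": {
  "Type": "Pass",
  "Parameters": {
    "context_output.$": "$.retrieval_error",
    "system_chain_data": {
      "system_chain_prompt": "Inform the user that the knowledge base is unavailable and answer only from the conversation context.",
      "operation": "REPLACE_ALL"
    }
  },
  "End": true
}
```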
150 |
151 |
152 | ---
153 | ## Deployment
154 |
155 | ### Requirements
156 | - [AWS SAM](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html)
157 | - [AWS Account](https://aws.amazon.com/resources/create-account/)
158 | - [Request access to Anthropic Claude 3 (Sonnet and Haiku), and Meta Llama 3 models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html)
159 | - [Knowledge Base to execute the RAG state machine example](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html)
160 |
161 | Note: If you are interested in deploying as a Docker image, use [this example](https://github.com/awslabs/aws-lambda-web-adapter/tree/main/examples/fastapi-response-streaming)
162 |
163 | ### Build and Deploy
164 |
165 | Run the following commands to build and deploy the application to Lambda.
166 |
167 | ```bash
168 | cd serverless-genai-assistant/examples/serverless_assistant_rag
169 | sam build
170 | sam deploy --guided
171 | ```
172 | During the guided deployment, define the following options:
173 | ```bash
174 | Stack Name []: #Choose a stack name
175 | AWS Region [us-east-1]: #Select a Region that supports Amazon Bedrock and the other AWS Services
176 | Parameter KnowledgeBaseId []: #Insert the KB ID https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-manage.html#kb-
177 | #Shows the resource changes to be deployed and requires a 'Y' to initiate the deploy
178 | Confirm changes before deploy [y/N]:
179 | #SAM needs permission to be able to create roles to connect to the resources in your template
180 | Allow SAM CLI IAM role creation [Y/n]:
181 | #Preserves the state of previously provisioned resources when an operation fails
182 | Disable rollback [y/N]:
183 | FastAPIFunction Function Url has no authentication. Is this okay? [y/N]: y
184 | Save arguments to configuration file [Y/n]:
185 | SAM configuration file [samconfig.toml]:
186 | SAM configuration environment [default]:
187 | ```
188 |
189 | >Copy the TheFastAPIFunctionUrl value, it's the URL to invoke the Lambda function.
190 |
191 | >**Note:** If you don't provide a Knowledge Base ID, you will receive an answer from the LLM reporting a Bedrock error. For the complete experience, check the [Requirements](#requirements) and create the Knowledge Base.
192 |
193 |
194 |
195 | ## Run
196 |
197 | ### Terminal
198 |
199 | ```bash
200 | cd serverless-genai-assistant/examples/serverless_assistant_rag/tests
201 | curl --no-buffer -H "Content-Type: application/json" -X POST -d @test.json <TheFastAPIFunctionUrl>
202 | ```
203 |
204 | ### Python
205 | ```bash
206 | cd serverless-genai-assistant/examples/serverless_assistant_rag/tests
207 | pip install requests
208 | python test_stream_python_requests.py --lambda-url <TheFastAPIFunctionUrl>
209 | ```
210 |
211 | ### Streamlit
212 | ```bash
213 | cd serverless-genai-assistant/examples/serverless_assistant_rag/tests
214 | pip install requests
215 | pip install streamlit
216 | streamlit run st_serverless_assistant.py -- --lambda-url <TheFastAPIFunctionUrl>
217 | ```
218 |
219 | ### Serverless Gen AI Assistant on Streamlit
220 |
221 | ![Serverless GenAI Assistant on Streamlit](imgs/assistant_example.gif)
--------------------------------------------------------------------------------
/examples/sample_web_application/README.md:
--------------------------------------------------------------------------------
1 | # Serverless GenAI Assistant - Web Application Sample
2 |
3 | ## Overview
4 |
5 | The Serverless GenAI Assistant - Web Application Sample demonstrates a pattern that combines serverless architecture with generative AI capabilities, showcasing both frontend and backend components.
6 |
7 | To enable this UI, the [deep-chat](https://github.com/OvidijusParsiunas/deep-chat) component is used to handle the chat conversation. The deep-chat component is a flexible solution that enables the integration of chatbot/GenAI APIs with a frontend component.
8 | To make this integration possible it was necessary to create a [custom connection](https://deepchat.dev/docs/connect#request). Check `ChatComponent.vue` if you want to understand how it can be done.
9 |
10 | ![Architecture](imgs/Architecture.jpeg)
11 |
12 | 1. Frontend: The user interface is hosted and managed using AWS Amplify. It handles the user request.
13 | 2. Authentication: Amazon Cognito is used for secure user authentication.
14 | 3. API Gateway: Manages API requests to retrieve configuration data. The JWT is also validated in API Gateway.
15 | 4. Lambda Configuration Handler: Holds the logic to identify the API Gateway request path and executes data retrieval using the Data Manager Lambda layer, which provides a DynamoDB interface.
16 | 5. CloudFront + JWT Verify: Provides an endpoint able to deliver streamed responses for authorized requests.
17 | 6. Lambda FastAPI: Using Lambda Web Adapter, it receives requests from CloudFront. All requests are SigV4 signed, and the Cognito access token is used to retrieve data through the Data Manager layer.
18 | 7. Step Functions: Used to provide custom GenAI workflows for different users.
19 | 8. Amazon Bedrock: Uses the Converse API with response streaming.
20 |
21 |
22 | ## Work in Progress
23 | This README is a work in progress and will be updated with more architecture details and usage instructions.
24 |
25 | ## Configuration data using Single Table Design
26 |
27 | The information is stored based on an `account_id` that can represent a user or multiple users. The `account_id` is
28 | added as a custom attribute in Cognito and is inserted into the JWT during the authentication
29 | phase.
30 |
31 | | Category | Field | Value | Description |
32 | |----------|------------------|---------------------------------------------------------------------|-------------------------------------------------------------------------------------|
33 | | **Account Details** | | | |
34 | | | Account ID | uuid4 | Unique identifier for the account |
35 | | | Name | Default account | Name of the account |
36 | | | Description | Test account | Brief description of the account |
37 | | **Inference Endpoint** | | | |
38 | | | Type | cloudfront | The type of endpoint used |
39 | | | URL | http://cloudfront_domain/path | The URL for the inference endpoint |
40 | | | Description | Endpoint for inference with stream response | Explanation of the endpoint's purpose |
41 | | **Workflow Details** | | | |
42 | | | Workflow ID | 1 | Identifier for the workflow |
43 | | | Name | Test Workflow for account 1 | Name of the workflow |
44 | | | Type | stepfunctions | Type of the workflow |
45 | | | ARN | arn:aws:states:::stateMachine: | Amazon Resource Name for the workflow |
46 | | | assistant_params | String representation of a JSON | Contains the parameters to be sent to the serverless assistant for the specific workflow |
47 |
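As a hedged sketch, the configuration above could be read with a single-table key design like the one below. The table name and key attributes are assumptions; the real access patterns live in the ``data_manager`` layer (``dependencies/python/assistant_config_interface/data_manager.py``) and the SAM template:

```python
import boto3

# Assumed table and key names -- placeholders for illustration.
table = boto3.resource("dynamodb").Table("ServerlessAssistantConfig")

def get_workflow(account_id: str, workflow_id: str) -> dict:
    """Fetch one workflow configuration item for an account."""
    response = table.get_item(
        Key={"PK": f"ACCOUNT#{account_id}", "SK": f"WORKFLOW#{workflow_id}"}
    )
    return response.get("Item", {})
```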
48 |
49 | ## Features
50 |
51 | - Scalable serverless architecture
52 | - GenAI with Stream Response
53 | - AWS Step Functions to create low code GenAI Workflows
54 | - Cognito user authentication
55 | - Web-based user interface
56 | - Cognito Users can access different Workflow resources
57 |
58 | ## Architecture
59 |
60 | The application leverages several AWS services:
61 |
62 | - **Frontend**: Hosted and managed using AWS Amplify
63 | - **Authentication**: Implemented with Amazon Cognito
64 | - **Backend**: Deployed using AWS Serverless Application Model (SAM)
65 |
66 |
67 | ## Prerequisites
68 |
69 | - AWS Account
70 | - [AWS CLI](https://aws.amazon.com/cli/) installed and configured
71 | - [Python 3.12](https://www.python.org/downloads/)
72 | - [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#installation)
73 | - [AWS SAM CLI](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) installed
74 | - [Node.js](https://nodejs.org/) and npm installed
75 | - Git
76 |
77 | ## Deployment
78 |
79 | The solution is deployed in two parts:
80 |
81 | 1. Amplify V2 (Frontend + Cognito)
82 | 2. SAM Template (Backend)
83 |
84 | Note: It's planned to move the stack to CDK on Amplify. For now, it's necessary to configure the integration between the
85 | stacks.
86 |
87 | ### Frontend Deployment (Amplify)
88 |
89 | This section follows the process shown in the [Amplify Quickstart](https://docs.amplify.aws/vue/start/quickstart/).
90 |
91 | 1. Create a new repository on GitHub:
92 | ```
93 | https://github.com/new
94 | ```
95 |
96 | 2. Navigate to the `amplify_ui` directory:
97 | ```bash
98 | cd serverless-genai-assistant/examples/sample_web_application/amplify_ui
99 | ```
100 |
101 | 3. Initialize and push to the new repository:
102 | ```bash
103 | git init
104 | git add .
105 | git commit -m "Initial commit"
106 | git branch -M main
107 | git remote add origin git@github.com:<your-username>/<your-repository>.git
108 | git push -u origin main
109 | ```
110 |
111 | 4. Deploy the frontend using AWS Amplify Console:
112 | - Go to AWS Amplify Console
113 | - Click "Create new app" > "Host web app"
114 |
115 | ![Create new app](imgs/amplify_intro.jpeg)
116 |
117 | - Select your git provider
118 |
119 | ![Select git provider](imgs/select_source_code.jpeg)
120 |
121 | - Select your repository
122 |
123 | ![Select repository](imgs/repository_select.jpeg)
124 |
125 | - Click on Next
126 |
127 | ![App settings](imgs/app_settings.jpeg)
128 |
129 | - Click on Save & Deploy
130 |
131 | ![Save and deploy](imgs/amplify_save_deploy.jpeg)
132 |
133 | 5. After deployment, note the Amplify Domain for accessing the UI.
134 | - Click on the Branch
135 |
136 | ![Amplify deployed](imgs/amplify_deployed.jpeg)
137 |
138 | 6. Download `amplify_outputs.json` from the Amplify Console:
139 | - Click on the branch to see more details
140 | - Click on "Deployed backend resources" and download `amplify_outputs.json`
141 |
142 | ![Download backend resources](imgs/download_backend_resources.png)
143 |
144 | Copy it to the `amplify_ui` directory:
145 | ```bash
146 | cp <download-path>/amplify_outputs.json serverless-genai-assistant/examples/sample_web_application/amplify_ui
147 | ```
148 |
149 | ### Backend Deployment (SAM)
150 |
151 | 1. Update Cognito parameters in `lambda-jwt-verify/src/index.mjs`:
152 | ```javascript
153 | const verifier = CognitoJwtVerifier.create({
154 |   userPoolId: "INSERT_COGNITO_USER_POOL_ID",
155 |   tokenUse: "access",
156 |   clientId: "INSERT_COGNITO_CLIENT_ID",
157 | });
158 | ```
159 |
160 | 2. Deploy the backend using the SAM CLI; the Cognito UserPoolId and UserPoolClientId will be requested:
161 | ```bash
162 | cd serverless-genai-assistant/examples/sample_web_application/
163 | sam build
164 | sam deploy
165 | ```
166 |
167 | 3. Update `assistant_config.py` with the deployment outputs and run it:
168 | ```bash
169 | python serverless-genai-assistant/examples/sample_web_application/assistant_config.py
170 | ```
171 |
172 | 4. Update the `ConfigUrl` and the region in `amplify_ui/src/main.ts` with the SAM output endpoint.
173 |
174 | ```javascript
175 | AssistantConfigApi: {
176 |   endpoint: '',
177 |   region: ''
178 | }
179 | ```
181 |
182 | 5. Commit the Amplify frontend with the updated configuration.
183 |
184 | ```bash
185 | cd serverless-genai-assistant/examples/sample_web_application/amplify_ui
186 | git commit -am "API Gateway endpoint Updated"
187 | git push
188 | ```
189 |
190 | Amplify will automatically deploy the changes.
191 |
192 | ## Usage
193 |
194 | 1. Access the application using the Amplify URL provided after deployment.
195 | 2. Log in using the credentials generated by the `assistant_config.py` script.
196 | 3. Interact with the GenAI Assistant through the web interface.
197 |
198 | ![Web application sample](imgs/sample.gif)
199 |
--------------------------------------------------------------------------------
/examples/sample_web_application/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/__init__.py
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/amplify.yml:
--------------------------------------------------------------------------------
1 | version: 1
2 | backend:
3 |   phases:
4 |     build:
5 |       commands:
6 |         - npm ci --cache .npm --prefer-offline
7 |         - npx ampx pipeline-deploy --branch $AWS_BRANCH --app-id $AWS_APP_ID
8 | frontend:
9 |   phases:
10 |     build:
11 |       commands:
12 |         - npm run build
13 |   artifacts:
14 |     baseDirectory: dist
15 |     files:
16 |       - '**/*'
17 |   cache:
18 |     paths:
19 |       - .npm/**/*
20 |       - node_modules/**/*
21 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/amplify/auth/pre-token-generation/handler.ts:
--------------------------------------------------------------------------------
1 | import type { PreTokenGenerationTriggerHandler } from "aws-lambda";
2 |
3 | // https://aws.amazon.com/blogs/security/how-to-customize-access-tokens-in-amazon-cognito-user-pools/
4 | // https://docs.amplify.aws/vue/build-a-backend/functions/examples/override-token/
5 |
6 |
7 | // Note that `any` is used here to allow the usage of access token customization.
8 | export const handler: any = async (event: any) => {
9 |   event.response = {
10 |     claimsAndScopeOverrideDetails: {
11 |       accessTokenGeneration: {
12 |         claimsToAddOrOverride: {
13 |           "custom:account_id": event.request.userAttributes["custom:account_id"]
14 |         },
15 |       }
16 |     }
17 |   };
18 |   return event;
19 | };
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/amplify/auth/pre-token-generation/resources.ts:
--------------------------------------------------------------------------------
1 | import { defineFunction } from '@aws-amplify/backend';
2 |
3 | // https://aws.amazon.com/blogs/security/how-to-customize-access-tokens-in-amazon-cognito-user-pools/
4 | // https://docs.amplify.aws/vue/build-a-backend/functions/examples/override-token/
5 | export const preTokenGeneration = defineFunction({
6 |   name: 'pre-token-generation'
7 | });
8 |
9 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/amplify/auth/resource.ts:
--------------------------------------------------------------------------------
1 | import { defineAuth } from '@aws-amplify/backend';
2 | //import { preTokenGeneration } from './pre-token-generation/resources';
3 |
4 | /**
5 |  * Define and configure your auth resource
6 |  * @see https://docs.amplify.aws/gen2/build-a-backend/auth
7 |  */
8 | export const auth = defineAuth({
9 |   loginWith: {
10 |     email: true,
11 |   },
12 |   groups: ['ServerlessAssistantUser', 'ServerlessAssistantOwner', 'ServerlessAssistantAdmin']
13 | });
14 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/amplify/backend.ts:
--------------------------------------------------------------------------------
1 | /* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
2 | # SPDX-License-Identifier: MIT-0 */
3 | import { defineBackend } from '@aws-amplify/backend';
4 | import { auth } from './auth/resource';
5 | import { preTokenGeneration } from './auth/pre-token-generation/resources';
6 | import * as iam from 'aws-cdk-lib/aws-iam';
7 | import * as lambda from 'aws-cdk-lib/aws-lambda';
8 |
9 |
10 | const backend = defineBackend({
11 |   auth,
12 |   preTokenGeneration
13 | });
14 |
15 | // https://docs.amplify.aws/react/build-a-backend/auth/modify-resources-with-cdk/#custom-attributes
16 |
17 | // extract L1 CfnUserPool resources
18 | const { cfnUserPool } = backend.auth.resources.cfnResources;
19 |
20 |
21 | // update the schema property to add custom attributes
22 | const user_pool_attributes = [
23 |   {
24 |     name: 'account_id',
25 |     attributeDataType: 'String',
26 |     mutable: false
27 |   },
28 |   {
29 |     name: 'account_name',
30 |     attributeDataType: 'String',
31 |     mutable: false
32 |   },
33 |   {
34 |     name: 'group_id',
35 |     attributeDataType: 'String',
36 |     mutable: false
37 |   },
38 |   {
39 |     name: 'group_name',
40 |     attributeDataType: 'String',
41 |     mutable: false
42 |   },
43 |   {
44 |     name: 'user_id',
45 |     attributeDataType: 'String',
46 |     mutable: false
47 |   },
48 |   {
49 |     name: 'user_name',
50 |     attributeDataType: 'String',
51 |     mutable: false
52 |   }];
53 |
54 |
55 | if (Array.isArray(cfnUserPool.schema)) {
56 |   cfnUserPool.schema.push(...user_pool_attributes)
57 | }
58 |
59 | // advanced security mode to support customization of access token
60 | cfnUserPool.userPoolAddOns = {
61 |   advancedSecurityMode: 'AUDIT',
62 | };
63 |
64 | // Enables request version V2_0 to implement access token customization
65 | cfnUserPool.lambdaConfig = {
66 |   preTokenGenerationConfig: {
67 |     lambdaArn: backend.preTokenGeneration.resources.lambda.functionArn,
68 |     lambdaVersion: 'V2_0',
69 |   }
70 | }
71 |
72 |
73 | const invokeFunctionRole = new iam.Role(cfnUserPool, 'CognitoInvokeLambda', {
74 |   assumedBy: new iam.ServicePrincipal('cognito-idp.amazonaws.com')
75 | });
76 |
77 | // Loads the created preTokenGeneration Lambda on the cfnUserPool resource to add Invoke permission
78 | // https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_lambda.FunctionAttributes.html
79 | const createLambdaResourcePolicy = lambda.Function.fromFunctionAttributes(cfnUserPool, 'PreToken Function', {
80 |   functionArn: backend.preTokenGeneration.resources.lambda.functionArn,
81 |   sameEnvironment: true,
82 |   skipPermissions: true
83 | })
84 |
85 | createLambdaResourcePolicy.addPermission('invoke-lambda', {
86 |   principal: new iam.ServicePrincipal('cognito-idp.amazonaws.com'),
87 |   action: 'lambda:InvokeFunction',
88 |   sourceArn: cfnUserPool.attrArn,
89 | });
92 |
93 |
94 |
95 |
96 |
97 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/amplify/package.json:
--------------------------------------------------------------------------------
1 | {
2 |   "type": "module"
3 | }
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/amplify/tsconfig.json:
--------------------------------------------------------------------------------
1 | {
2 |   "compilerOptions": {
3 |     "target": "es2022",
4 |     "module": "es2022",
5 |     "moduleResolution": "bundler",
6 |     "resolveJsonModule": true,
7 |     "esModuleInterop": true,
8 |     "forceConsistentCasingInFileNames": true,
9 |     "strict": true,
10 |     "skipLibCheck": true,
11 |     "paths": {
12 |       "$amplify/*": [
13 |         "../.amplify/generated/*"
14 |       ]
15 |     }
16 |   }
17 | }
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/dist/favicon.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/amplify_ui/dist/favicon.ico
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/dist/index.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html lang="">
3 |   <head>
4 |     <meta charset="UTF-8">
5 |     <link rel="icon" href="/favicon.ico">
6 |     <meta name="viewport" content="width=device-width, initial-scale=1.0">
7 |     <title>Vite App</title>
8 |     <script type="module" crossorigin src="/assets/index-DguTJVXA.js"></script>
9 |     <link rel="stylesheet" crossorigin href="/assets/index-2zi0aqmG.css">
10 |   </head>
11 |   <body>
12 |     <div id="app"></div>
13 |   </body>
14 | </html>
15 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/env.d.ts:
--------------------------------------------------------------------------------
1 | /// <reference types="vite/client" />
2 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/index.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html lang="">
3 |   <head>
4 |     <meta charset="UTF-8">
5 |     <link rel="icon" href="/favicon.ico">
6 |     <meta name="viewport" content="width=device-width, initial-scale=1.0">
7 |     <title>Vite App</title>
8 |   </head>
9 |   <body>
10 |     <div id="app"></div>
11 |     <script type="module" src="/src/main.ts"></script>
12 |   </body>
13 | </html>
14 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/package.json:
--------------------------------------------------------------------------------
1 | {
2 |   "name": "amplify-vue-template",
3 |   "version": "0.0.0",
4 |   "private": true,
5 |   "type": "module",
6 |   "scripts": {
7 |     "dev": "vite",
8 |     "build": "run-p type-check \"build-only {@}\" --",
9 |     "preview": "vite preview",
10 |     "build-only": "vite build",
11 |     "type-check": "vue-tsc --build --force"
12 |   },
13 |   "dependencies": {
14 |     "@aws-amplify/ui-vue": "^4.2.8",
15 |     "@microsoft/fetch-event-source": "^2.0.1",
16 |     "aws-amplify": "^6.2.0",
17 |     "deep-chat": "^1.4.11",
18 |     "vue": "^3.4.21"
19 |   },
20 |   "devDependencies": {
21 |     "@aws-amplify/backend": "^1.0.0",
22 |     "@aws-amplify/backend-cli": "^1.0.1",
23 |     "@tsconfig/node20": "^20.1.4",
24 |     "@types/node": "^20.12.5",
25 |     "@vitejs/plugin-vue": "^5.0.4",
26 |     "@vue/tsconfig": "^0.5.1",
27 |     "aws-cdk": "^2.137.0",
28 |     "aws-cdk-lib": "^2.137.0",
29 |     "constructs": "^10.3.0",
30 |     "esbuild": "^0.20.2",
31 |     "npm-run-all2": "^6.1.2",
32 |     "tsx": "^4.7.2",
33 |     "typescript": "^5.4.5",
34 |     "vite": "^5.2.8",
35 |     "vue-tsc": "^2.0.11"
36 |   }
37 | }
38 |
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/public/favicon.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/serverless-genai-assistant/2ed6d6ed71b75a0b7a3260885541a7e5e690b2dc/examples/sample_web_application/amplify_ui/public/favicon.ico
--------------------------------------------------------------------------------
/examples/sample_web_application/amplify_ui/src/App.vue:
--------------------------------------------------------------------------------
1 |
18 |
19 |
20 |
21 |
22 |
23 |
24 | {{ user.signInDetails.loginId }} | Sign Out
25 |