├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── DEVELOPMENT.md ├── LICENSE ├── README.md ├── diagram.png └── src └── lex-gen-ai-demo-cdk ├── app.py ├── cdk.json ├── create_web_crawler_lambda.py ├── endpoint_handler.py ├── index-creation-docker-image ├── Dockerfile ├── index_creation_app.py └── index_creation_requirements.txt ├── lex-gen-ai-demo-docker-image ├── Dockerfile ├── runtime_lambda_app.py └── runtime_lambda_requirements.txt ├── lex_gen_ai_demo_cdk_files ├── __init__.py └── lex_gen_ai_demo_cdk_files_stack.py ├── requirements.txt ├── shut_down_endpoint.py ├── source.bat ├── upload_file_to_s3.py ├── web-crawler-docker-image ├── Dockerfile ├── web_crawler_app.py └── web_crawler_requirements.txt └── web_crawl.py /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *main* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 
38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | -------------------------------------------------------------------------------- /DEVELOPMENT.md: -------------------------------------------------------------------------------- 1 | 2 | ### Project Structure within src/lex-gen-ai-demo-cdk: 3 | ``` 4 | AWSLexKoiosBlogDemo/src/lex-gen-ai-demo-cdk 5 | - app.py 6 | - cdk.json 7 | - endpoint_handler.py 8 | - upload_file_to_s3.py 9 | - shut_down_endpoint.py 10 | - index-creation-docker-image/ 11 | - index_creation_app.py 12 | - Dockerfile 13 | - index_creation_requirements.txt 14 | - lex_gen_ai_demo_cdk_files/ 15 | - __init__.py 16 | - lex_gen_ai_demo_cdk_files_stack.py 17 | - lex-gen-ai-demo-docker-image/ 18 | - runtime_lambda_app.py 19 | - Dockerfile 20 | - runtime_lambda_requirements.txt 21 | - requirements.txt 22 | - source.bat 23 | ``` 24 | 25 | ## Common Errors & Troubleshooting 26 | 27 | ### "ValueError: Must setup local AWS configuration with a region supported by SageMaker." 28 | Solution: You must set an AWS region, e.g. with `export AWS_DEFAULT_REGION=<your-region>` 29 | 30 | ### Error creating role 31 | ``` 32 | botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the CreateRole operation: User: is not authorized to perform: iam:CreateRole on resource: because no identity-based policy allows the iam:CreateRole action 33 | ``` 34 | Solution: You must ensure the IAM role or user you are deploying with has sufficient permissions to create IAM roles 35 | 36 | ### Error LexGenAIDemoFilesStack: fail: docker push exited with error code 1: tag does not exist 37 | Issue: Error while building the image.
Here are some common causes: 38 | 39 | #### Error processing tar file(exit status 1): write /path/libcublas.so.11: no space left on device 40 | Issue: Docker has run out of disk space due to too many images 41 | Solution: Delete unused images in the Docker application and then [prune docker](https://docs.docker.com/config/pruning/) on the command line 42 | 43 | #### ConnectionResetError: [Errno 104] Connection reset by peer 44 | Issue: A pip download/cache issue 45 | Solution: Clear the pip cache (`python3 -m pip cache purge`) and run again 46 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT No Attribution 2 | 3 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so. 10 | 11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 13 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 14 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 15 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 16 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 17 | 18 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # AWS Lex Conversational FAQ Demo 2 | 3 | Demonstration of LLM integration into an Amazon Lex bot using Lambda codehooks and a SageMaker endpoint. 4 | 5 | ![Diagram](diagram.png) 6 | 7 | ### What resources will be created?
8 | This CDK code will create the following: 9 | - 1 SageMaker endpoint hosting a model (the default configuration is falcon-7b-instruct on ml.g5.8xlarge, but you can configure the model and hardware) 10 | - 1 Lex bot 11 | - 2 S3 buckets (one for your uploaded source material, one for the created index) 12 | - 2 Lambda functions (one to ingest the source material and create an index, one to be invoked as a codehook by the Lex bot and provide an FAQ answer when needed) 13 | - 1 Event listener attached to an S3 bucket to call the index-creation Lambda automatically when a file is uploaded 14 | - 2 IAM roles (one for the Lex bot to call Lambda, one for the Lambdas to call SageMaker and S3) 15 | 16 | ## Requirements 17 | 18 | ### AWS setup 19 | **Region** 20 | 21 | If you have not yet run `aws configure` and set a default region, you must do so, or alternatively run `export AWS_DEFAULT_REGION=<your-region>` 22 | 23 | **Authorization** 24 | 25 | You must use a role that has sufficient permissions to create IAM roles as well as CloudFormation resources 26 | 27 | #### Python >=3.7 28 | Make sure you have [python3](https://www.python.org/downloads/) installed at a version >=3.7.x 29 | 30 | #### Docker 31 | Make sure you have [Docker](https://www.docker.com/products/docker-desktop/) installed on your machine and running in the background 32 | 33 | #### AWS CDK 34 | Make sure you have the [AWS CDK](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html#getting_started_install) installed on your machine 35 | 36 | 37 | ## Setup 38 | 39 | ### Set up virtual environment and gather packages 40 | 41 | ``` 42 | cd src/lex-gen-ai-demo-cdk 43 | ``` 44 | 45 | Install the required dependencies (aws-cdk-lib, constructs, sagemaker, and boto3) into your Python environment 46 | ``` 47 | pip install -r requirements.txt 48 | ``` 49 | 50 | ### Gather and deploy resources with the CDK 51 | 52 | First synthesize, which executes the application, defines which resources will be created, and translates this into a CloudFormation template 53 | ``` 54 | cdk synth 55 | ``` 56 | Now bootstrap, which provisions the resources you'll use when deploying the application 57 | ``` 58 | cdk bootstrap 59 | ``` 60 | and deploy with 61 | ``` 62 | cdk deploy LexGenAIDemoFilesStack 63 | ``` 64 | 65 | The deployment will create the Lex bot and the S3 buckets, dockerize the code in the `lex-gen-ai-demo-cdk/index-creation-docker-image` and `lex-gen-ai-demo-cdk/lex-gen-ai-demo-docker-image` directories, and push those images to ECR so they can run in Lambda. Don't worry if this step takes a long time while pushing to ECR; we are bundling up and uploading two Docker images, so it will take some time. 66 | 67 | ## Usage 68 | Once `cdk deploy` finishes running, you must upload a .pdf or .txt file at least once so an index can be created. You can use our upload script (`python3 upload_file_to_s3.py path/to/your/file`) or navigate to the S3 console and upload a file manually. On upload, the ingestion Lambda reads the file and creates an embedding, which it uploads to the index S3 bucket. Once an embedding exists you can go to your bot and begin using it. To update the embedding, upload a new file and the new embedding will overwrite the old one. You must then restart the runtime Lambda function for it to start using the new embedding, as in the sketch below.
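One way to force that restart without redeploying is to touch the function's configuration so Lambda recycles its cached execution environments; a minimal sketch, assuming the runtime function keeps the stack's default name `lex-codehook-fn` (the description is just a convenient field to bump):

```python
import time
import boto3

# Touching any part of the function configuration makes Lambda create fresh
# execution environments on subsequent invocations, which re-download the
# new index from S3 instead of reusing the copy cached under /tmp.
lambda_client = boto3.client("lambda")
lambda_client.update_function_configuration(
    FunctionName="lex-codehook-fn",
    Description=f"index refreshed {time.strftime('%Y-%m-%dT%H:%M:%S')}",
)
```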
69 | 70 | Note: the first time the indexing Lambda and the runtime Lambda are called, latency will be much higher, because each must load resources and save them in the Lambda environment. Once loaded, these resources will stay in the environment as long as the ECR image is not deleted. This means your first request will be slow, but subsequent requests will be faster because the resources are cached. 71 | 72 | ### Uploading files 73 | Now you have to upload your source file so the indexing Lambda can create an index for the runtime Lambda to use. You can use our script with any .pdf or .txt file by running 74 | ``` 75 | python3 upload_file_to_s3.py path/to/your/file 76 | ``` 77 | or you can open the S3 bucket in the console and manually upload a file. On upload an index will automatically be generated. 78 | Note: If you upload a large file, the index will be large and the S3 read time on cold start may grow accordingly. 79 | 80 | Once you've uploaded your file, wait a little for your index to be created, then go into the Lex console and test your bot (there is no need to build the bot unless you've made changes after creation). The first time you create an index and the first time you query the bot will take a little longer (around 90 seconds) because the models must be loaded and cached in the Lambda/ECR environment; once they are cached there is no need to download them again and latency will be much lower. These resources will remain cached as long as the ECR image is not deleted. Additionally, for better cold-start performance you can provision an instance for your runtime Lambda function; there are directions to do so below. 81 | 82 | ### Configurations 83 | 84 | 🚨 **Remember to shut down your endpoint if you're done using it!** 🚨 | 85 | 86 | We have provided a script that deactivates the endpoint and endpoint configuration matching whatever name is set in the endpoint creation script. To run: 87 | ``` 88 | python3 shut_down_endpoint.py 89 | ``` 90 | 91 | #### Custom model and instance type configuration: 92 | 93 | The function `create_endpoint_from_HF_image()` is called in `app.py`. This function accepts the following arguments: 94 | - hf_model_id (required): For the purposes of the demo we have this set to [tiiuae/falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct). You can find any model on https://huggingface.co/ and feed it in 95 | - instance_type (optional, default is ml.g5.8xlarge): If you don't give an argument we'll use ml.g5.8xlarge. You can use any [SageMaker instance type](https://aws.amazon.com/sagemaker/pricing/) that supports endpoints 96 | - endpoint_name (optional, default is whatever SAGEMAKER_ENDPOINT_NAME is set to in the file endpoint_handler.py): You can give your endpoint a custom name. It is recommended that you don't do this, but if you do, you have to change it in the Lambda images as well (the constant is called ENDPOINT_NAME in index_creation_app.py and runtime_lambda_app.py) 97 | - number_of_gpu (optional, default is 1): Set this to any number of GPUs the hardware you chose allows. 98 | 99 | If you have an invalid configuration the endpoint will fail to create. You can see the specific error in the CloudWatch logs.
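Putting the arguments above together, a sketch of a customized call in `app.py` (the instance type and GPU count here are illustrative, not recommendations):

```python
from endpoint_handler import create_endpoint_from_HF_image

# Example only: any Hugging Face Hub model id can be passed in; larger models
# generally need a larger instance type and more GPUs.
create_endpoint_from_HF_image(
    hf_model_id="tiiuae/falcon-7b-instruct",
    instance_type="ml.g5.12xlarge",  # example alternative to the ml.g5.8xlarge default
    number_of_gpu=4,                 # ml.g5.12xlarge has 4 GPUs
)
```

This would replace the default `create_endpoint_from_HF_image(hf_model_id="tiiuae/falcon-7b-instruct")` call in `app.py`.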
If creation fails you can run `python3 shut_down_endpoint.py` to clean up the endpoint, but if you clean up manually in the console **you must delete both the endpoint and the endpoint configuration** 100 | 101 | #### Further configuration 102 | If you would like to further configure the endpoint you can change the specific code in `endpoint_handler.py` 103 | 104 | The LLM is hosted on a SageMaker endpoint and deployed as a SageMaker [HuggingFaceModel](https://sagemaker.readthedocs.io/en/stable/frameworks/huggingface/sagemaker.huggingface.html). We are also using the Hugging Face LLM inference container image. You can read more about it [here](https://aws.amazon.com/blogs/machine-learning/announcing-the-launch-of-new-hugging-face-llm-inference-containers-on-amazon-sagemaker/). For further model configuration you can read about SageMaker model deployments [here](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints-deployment.html). 105 | 106 | For our indexing and retrieval we are using [llama-index](https://github.com/jerryjliu/llama_index). If you would like to configure the index retriever you can do so in the `runtime_lambda_app.py` file in the `VectorIndexRetriever` object on line 70. If you want to update index creation you can update the constants defined at the top of the index-creation and runtime Lambdas (`index_creation_app.py` and `runtime_lambda_app.py`). Make sure to familiarize yourself with [llama-index terms](https://gpt-index.readthedocs.io/en/latest/guides/tutorials/terms_definitions_tutorial.html) and the [llama-index prompthelper](https://gpt-index.readthedocs.io/en/latest/reference/service_context/prompt_helper.html) for best results. 107 | 108 | ### Tips for best results 109 | 110 | **Keep your Lambda perpetually warm by provisioning an instance for the runtime Lambda (lex-codehook-fn)** 111 | 112 | Go to Lambda console > select the function lex-codehook-fn 113 | 114 | Versions > Publish new version 115 | 116 | Under this version 117 | - Provisioned Concurrency > set value to 1 118 | - Permissions > Resource based policy statements > Add Permissions > AWS Service > Other, your-policy-name, lexv2.amazonaws.com, your-lex-bot-arn, lambda:InvokeFunction 119 | 120 | Go to your Lex Bot (LexGenAIDemoBotCfn) 121 | 122 | Aliases > your-alias > your-language > change lambda function version or alias > change to your-version 123 | 124 | This will keep an instance running at all times and keep your Lambda ready so that you won't have cold-start latency. This will cost a bit extra (https://aws.amazon.com/lambda/pricing/) so use it thoughtfully. A scripted version of the console steps above is sketched below.
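If you prefer to script those console steps, a hedged boto3 sketch (you still have to point your Lex bot alias at the published version in the Lex console, and the bot alias ARN below is a placeholder you must fill in):

```python
import boto3

lambda_client = boto3.client("lambda")

# Publish an immutable version of the runtime function
version = lambda_client.publish_version(FunctionName="lex-codehook-fn")["Version"]

# Keep one instance of that version warm at all times
lambda_client.put_provisioned_concurrency_config(
    FunctionName="lex-codehook-fn",
    Qualifier=version,
    ProvisionedConcurrentExecutions=1,
)

# Allow Lex V2 to invoke the published version (replace the placeholder ARN)
lambda_client.add_permission(
    FunctionName="lex-codehook-fn",
    Qualifier=version,
    StatementId="lex-invoke-provisioned",
    Action="lambda:InvokeFunction",
    Principal="lexv2.amazonaws.com",
    SourceArn="arn:aws:lex:<region>:<account-id>:bot-alias/<bot-id>/<alias-id>",
)
```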
125 | -------------------------------------------------------------------------------- /diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-lex-conv-faq/e6be217505bc3c8f1422639f289a1a537d24fe3d/diagram.png -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/app.py: -------------------------------------------------------------------------------- 1 | 2 | import aws_cdk as cdk 3 | 4 | from lex_gen_ai_demo_cdk_files.lex_gen_ai_demo_cdk_files_stack import LexGenAIDemoFilesStack 5 | from create_web_crawler_lambda import LambdaStack 6 | from endpoint_handler import create_endpoint_from_HF_image 7 | 8 | # create_endpoint_from_HF_image(hf_model_id, instance_type="ml.g5.8xlarge", endpoint_name=SAGEMAKER_ENDPOINT_NAME, number_of_gpu=1) 9 | # hf_model_id is required; the other arguments fall back to the defaults above (ml.g5.8xlarge, SAGEMAKER_ENDPOINT_NAME, 1 GPU) if omitted 10 | create_endpoint_from_HF_image(hf_model_id="tiiuae/falcon-7b-instruct") 11 | 12 | app = cdk.App() 13 | filestack = LexGenAIDemoFilesStack(app, "LexGenAIDemoFilesStack") 14 | web_crawler_lambda_stack = LambdaStack(app, 'LexGenAIDemoFilesStack-Webcrawler') 15 | 16 | app.synth() 17 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/cdk.json: -------------------------------------------------------------------------------- 1 | { 2 | "app": "python3 app.py", 3 | "watch": { 4 | "include": [ 5 | "**" 6 | ], 7 | "exclude": [ 8 | "README.md", 9 | "cdk*.json", 10 | "requirements*.txt", 11 | "source.bat", 12 | "**/__init__.py", 13 | "python/__pycache__", 14 | "tests" 15 | ] 16 | }, 17 | "context": { 18 | "@aws-cdk/aws-lambda:recognizeLayerVersion": true, 19 | "@aws-cdk/core:checkSecretUsage": true, 20 | "@aws-cdk/core:target-partitions": [ 21 | "aws", 22 | "aws-cn" 23 | ], 24 | "@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true, 25 | "@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true, 26 | "@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true, 27 | "@aws-cdk/aws-iam:minimizePolicies": true, 28 | "@aws-cdk/core:validateSnapshotRemovalPolicy": true, 29 | "@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true, 30 | "@aws-cdk/aws-s3:createDefaultLoggingPolicy": true, 31 | "@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true, 32 | "@aws-cdk/aws-apigateway:disableCloudWatchRole": true, 33 | "@aws-cdk/core:enablePartitionLiterals": true, 34 | "@aws-cdk/aws-events:eventsTargetQueueSameAccount": true, 35 | "@aws-cdk/aws-iam:standardizedServicePrincipals": true, 36 | "@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true, 37 | "@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true, 38 | "@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true, 39 | "@aws-cdk/aws-route53-patters:useCertificate": true, 40 | "@aws-cdk/customresources:installLatestAwsSdkDefault": false, 41 | "@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true, 42 | "@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true, 43 | "@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true, 44 | "@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true, 45 | "@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true, 46 | "@aws-cdk/aws-redshift:columnId": true, 47 | "@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true, 48 |
"@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true, 49 | "@aws-cdk/aws-apigateway:requestValidatorUniqueId": true 50 | } 51 | } -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/create_web_crawler_lambda.py: -------------------------------------------------------------------------------- 1 | from aws_cdk import ( 2 | Duration, Stack, 3 | aws_lambda as lambda_, 4 | aws_s3 as s3, 5 | aws_iam as iam 6 | ) 7 | 8 | from constructs import Construct 9 | 10 | class LambdaStack(Stack): 11 | 12 | def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None: 13 | super().__init__(scope, construct_id, **kwargs) 14 | # Iam role for lambda to invoke sagemaker 15 | web_crawl_lambda_cfn_role = iam.Role(self, "Cfn-gen-ai-demo-web-crawler", 16 | assumed_by=iam.ServicePrincipal("lambda.amazonaws.com") 17 | ) 18 | web_crawl_lambda_cfn_role.add_managed_policy(iam.ManagedPolicy.from_aws_managed_policy_name("AmazonS3FullAccess")) 19 | web_crawl_lambda_cfn_role.add_to_policy( 20 | iam.PolicyStatement( 21 | actions=[ 22 | "logs:CreateLogGroup", 23 | "logs:CreateLogStream", 24 | "logs:PutLogEvents" 25 | ], 26 | resources=["*"] 27 | ) 28 | ) 29 | # Lambda function 30 | lambda_function= lambda_.DockerImageFunction(self, "web-crawler-docker-image-CFN", 31 | function_name="WebCrawlerLambda", 32 | code=lambda_.DockerImageCode.from_image_asset("web-crawler-docker-image"), 33 | role=web_crawl_lambda_cfn_role, 34 | memory_size=1024, 35 | timeout=Duration.minutes(5) 36 | ) 37 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/endpoint_handler.py: -------------------------------------------------------------------------------- 1 | import json 2 | import boto3 3 | import time 4 | from sagemaker.huggingface import get_huggingface_llm_image_uri 5 | from sagemaker.huggingface import HuggingFaceModel 6 | 7 | # get image from huggingface 8 | llm_image = get_huggingface_llm_image_uri( 9 | "huggingface", 10 | version="0.8.2" 11 | ) 12 | 13 | assume_role_policy_document = json.dumps({ 14 | "Version": "2012-10-17", 15 | "Statement": [ 16 | { 17 | "Effect": "Allow", 18 | "Principal": { 19 | "Service": [ 20 | "sagemaker.amazonaws.com", 21 | "ecs.amazonaws.com" 22 | ] 23 | }, 24 | "Action": "sts:AssumeRole" 25 | } 26 | ] 27 | }) 28 | 29 | # editable to whatever you want your endpoint and role to be. 
You can use an existing role or a new one 30 | # IMPORTANT: make sure the ENDPOINT_NAME constant in the Lambda apps (index_creation_app.py and runtime_lambda_app.py) stays consistent if you change it here 31 | SAGEMAKER_IAM_ROLE_NAME = 'Sagemaker-Endpoint-Creation-Role' 32 | SAGEMAKER_ENDPOINT_NAME = "huggingface-pytorch-sagemaker-endpoint" 33 | 34 | # Create role and give sagemaker permissions 35 | def get_iam_role(role_name=SAGEMAKER_IAM_ROLE_NAME): 36 | iam_client = boto3.client('iam') 37 | 38 | try: 39 | role = iam_client.get_role(RoleName=role_name) 40 | role_arn = role['Role']['Arn'] 41 | print(f"Role {role_arn} found!") 42 | return role_arn 43 | 44 | except Exception: # role does not exist yet, create it 45 | role_arn = iam_client.create_role( 46 | RoleName=SAGEMAKER_IAM_ROLE_NAME, 47 | AssumeRolePolicyDocument=assume_role_policy_document 48 | )['Role']['Arn'] 49 | 50 | time.sleep(10) # give the policy some time to properly create 51 | 52 | response = iam_client.attach_role_policy( 53 | PolicyArn='arn:aws:iam::aws:policy/AmazonSageMakerFullAccess', 54 | RoleName=SAGEMAKER_IAM_ROLE_NAME, 55 | ) 56 | print(f"Creating {role_arn}") 57 | time.sleep(20) # give iam time to let the role create 58 | return role_arn 59 | 60 | 61 | # Define Model and Endpoint configuration parameter 62 | 63 | health_check_timeout = 300 64 | trust_remote_code = True 65 | 66 | # Create SageMaker endpoint; hf_model_id is required, defaults are ml.g5.8xlarge, SAGEMAKER_ENDPOINT_NAME and 1 GPU 67 | def create_endpoint_from_HF_image(hf_model_id, instance_type="ml.g5.8xlarge", endpoint_name=SAGEMAKER_ENDPOINT_NAME, number_of_gpu=1): 68 | sagemaker_client = boto3.client('sagemaker') 69 | 70 | try: # check if endpoint already exists 71 | sagemaker_client.describe_endpoint(EndpointName=endpoint_name) 72 | print(f"Endpoint with name {endpoint_name} found!") 73 | return 74 | 75 | except Exception: # endpoint does not exist yet, create it 76 | print(f"Creating endpoint with model {hf_model_id} on {instance_type}...") 77 | 78 | # create HuggingFaceModel with the image uri 79 | llm_model = HuggingFaceModel( 80 | role=get_iam_role(), 81 | image_uri=llm_image, 82 | env={ 83 | 'HF_MODEL_ID': hf_model_id, 84 | 'SM_NUM_GPUS': json.dumps(number_of_gpu), 85 | 'HF_MODEL_TRUST_REMOTE_CODE': json.dumps(trust_remote_code) 86 | } 87 | ) 88 | 89 | # Deploy model to an endpoint 90 | # https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#sagemaker.model.Model.deploy 91 | llm = llm_model.deploy( 92 | endpoint_name=endpoint_name, 93 | initial_instance_count=1, 94 | instance_type=instance_type, 95 | # volume_size=400, # If using an instance with local SSD storage, volume_size must be None, e.g. p4 but not p3 96 | container_startup_health_check_timeout=health_check_timeout # 5 minutes for the container to load the model 97 | ) 98 | 99 | print(f"\nEndpoint created ({endpoint_name})") 100 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/index-creation-docker-image/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM public.ecr.aws/lambda/python:3.8 2 | 3 | COPY index_creation_requirements.txt .
4 | RUN pip3 install -r index_creation_requirements.txt --target "${LAMBDA_TASK_ROOT}" 5 | 6 | # Copy function code 7 | COPY *.py ${LAMBDA_TASK_ROOT} 8 | 9 | # Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile) 10 | CMD [ "index_creation_app.handler" ] 11 | 12 | # Set cache to a location lambda can write to 13 | ENV TRANSFORMERS_CACHE="/tmp/TRANSFORMERS_CACHE" 14 | 15 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/index-creation-docker-image/index_creation_app.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | import json 3 | from pathlib import Path 4 | 5 | import logging 6 | from langchain.llms.base import LLM 7 | from typing import Optional, List, Mapping, Any 8 | import os 9 | from llama_index import ( 10 | LangchainEmbedding, 11 | GPTVectorStoreIndex, 12 | LLMPredictor, 13 | ServiceContext, 14 | Document, 15 | PromptHelper, 16 | download_loader 17 | ) 18 | 19 | from langchain.embeddings import HuggingFaceEmbeddings 20 | 21 | import logging 22 | from botocore.exceptions import ClientError 23 | 24 | logger = logging.getLogger() 25 | logger.setLevel(logging.INFO) 26 | 27 | ACCOUNT_ID = boto3.client('sts').get_caller_identity().get('Account') 28 | INDEX_BUCKET = "lexgenaistack-created-index-bucket-"+ACCOUNT_ID 29 | S3_BUCKET = "lexgenaistack-source-materials-bucket-"+ACCOUNT_ID 30 | ENDPOINT_NAME = "huggingface-pytorch-sagemaker-endpoint" 31 | DELIMITER = "\n\n\n" 32 | LOCAL_INDEX_LOC = "/tmp/index_files" 33 | 34 | def handler(event, context): 35 | event_record = event['Records'][0] 36 | if event_record['eventName'] == "ObjectCreated:Put": 37 | if ".txt" in event_record['s3']['object']['key'].lower() or ".pdf" in event_record['s3']['object']['key'].lower(): 38 | source_material_key = event_record['s3']['object']['key'] 39 | logger.info(f"Source file {source_material_key} found") 40 | else: 41 | logger.error("INVALID FILE, MUST END IN .TXT or .PDF") 42 | return 43 | else: 44 | logger.error("NON OBJECTCREATION INVOCATION") 45 | return 46 | 47 | s3_client = boto3.client('s3') 48 | try: 49 | s3_client.download_file(S3_BUCKET, source_material_key, "/tmp/"+source_material_key) 50 | logger.info(f"Downloaded {source_material_key}") 51 | except ClientError as e: 52 | logger.error(e) 53 | return "ERROR READING FILE" 54 | 55 | if ".pdf" in source_material_key.lower(): 56 | PDFReader = download_loader("PDFReader", custom_path="/tmp/llama_cache") 57 | loader = PDFReader() 58 | documents = loader.load_data(file=Path("/tmp/"+source_material_key)) 59 | else: 60 | with open("/tmp/"+source_material_key) as f: 61 | text_list = f.read().split(DELIMITER) 62 | logger.info(f"Reading text with delimiter {repr(DELIMITER)}") 63 | documents = [Document(t) for t in text_list] 64 | 65 | # define prompt helper 66 | max_input_size = 400 # set maximum input size 67 | num_output = 50 # set number of output tokens 68 | max_chunk_overlap = 0 # set maximum chunk overlap 69 | prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap) 70 | 71 | # define our LLM 72 | llm_predictor = LLMPredictor(llm=CustomLLM()) 73 | embed_model = LangchainEmbedding(HuggingFaceEmbeddings(cache_folder="/tmp/HF_CACHE")) 74 | service_context = ServiceContext.from_defaults( 75 | llm_predictor=llm_predictor, prompt_helper=prompt_helper, embed_model=embed_model, 76 | ) 77 | 78 | index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context) 79 | 
index.storage_context.persist(persist_dir=LOCAL_INDEX_LOC) 80 | 81 | for file in os.listdir(LOCAL_INDEX_LOC): 82 | s3_client.upload_file(LOCAL_INDEX_LOC+"/"+file, INDEX_BUCKET, file) # ASSUMES IT CAN OVERWRITE, I.E. S3 OBJECT LOCK MUST BE OFF 83 | 84 | logger.info("Index successfully created") 85 | return 86 | 87 | def call_sagemaker(prompt, endpoint_name=ENDPOINT_NAME): 88 | payload = { 89 | "inputs": prompt, 90 | "parameters": { 91 | "do_sample": False, 92 | # "top_p": 0.9, 93 | "temperature": 0.1, 94 | "max_new_tokens": 200, 95 | "repetition_penalty": 1.03, 96 | "stop": ["\nUser:", "<|endoftext|>", ""] 97 | } 98 | } 99 | 100 | sagemaker_client = boto3.client("sagemaker-runtime") 101 | payload = json.dumps(payload) 102 | response = sagemaker_client.invoke_endpoint( 103 | EndpointName=endpoint_name, ContentType="application/json", Body=payload 104 | ) 105 | response_string = response["Body"].read().decode() 106 | return response_string 107 | 108 | def get_response_sagemaker_inference(prompt, endpoint_name=ENDPOINT_NAME): 109 | resp = call_sagemaker(prompt, endpoint_name) 110 | resp = json.loads(resp)[0]["generated_text"][len(prompt):] 111 | return resp 112 | 113 | class CustomLLM(LLM): 114 | model_name = "tiiuae/falcon-7b-instruct" 115 | 116 | def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str: 117 | response = get_response_sagemaker_inference(prompt, ENDPOINT_NAME) 118 | return response 119 | 120 | @property 121 | def _identifying_params(self) -> Mapping[str, Any]: 122 | return {"name_of_model": self.model_name} 123 | 124 | @property 125 | def _llm_type(self) -> str: 126 | return "custom" -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/index-creation-docker-image/index_creation_requirements.txt: -------------------------------------------------------------------------------- 1 | transformers==4.25.1 2 | langchain 3 | llama-index==0.6.20 4 | sentence-transformers 5 | pypdf 6 | typing_extensions -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/lex-gen-ai-demo-docker-image/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM public.ecr.aws/lambda/python:3.8 2 | 3 | COPY runtime_lambda_requirements.txt . 
4 | RUN pip3 install -r runtime_lambda_requirements.txt --target "${LAMBDA_TASK_ROOT}" 5 | 6 | # Copy function code 7 | COPY *.py ${LAMBDA_TASK_ROOT} 8 | 9 | # Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile) 10 | CMD [ "runtime_lambda_app.handler" ] 11 | 12 | # Set cache to a location lambda can write to 13 | ENV TRANSFORMERS_CACHE="/tmp/TRANSFORMERS_CACHE" 14 | 15 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/lex-gen-ai-demo-docker-image/runtime_lambda_app.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | from botocore.exceptions import ClientError 3 | import logging 4 | import json 5 | import os 6 | from typing import Optional, List, Mapping, Any 7 | from langchain.llms.base import LLM 8 | from llama_index import ( 9 | LangchainEmbedding, 10 | PromptHelper, 11 | ResponseSynthesizer, 12 | LLMPredictor, 13 | ServiceContext, 14 | Prompt, 15 | ) 16 | 17 | from langchain.embeddings import HuggingFaceEmbeddings 18 | from llama_index.query_engine import RetrieverQueryEngine 19 | from llama_index.retrievers import VectorIndexRetriever 20 | from llama_index.vector_stores.types import VectorStoreQueryMode 21 | from llama_index import StorageContext, load_index_from_storage 22 | 23 | s3_client = boto3.client('s3') 24 | 25 | logger = logging.getLogger() 26 | logger.setLevel(logging.INFO) 27 | 28 | ENDPOINT_NAME = "huggingface-pytorch-sagemaker-endpoint" 29 | OUT_OF_DOMAIN_RESPONSE = "I'm sorry, but I am only able to give responses regarding the source topic" 30 | INDEX_WRITE_LOCATION = "/tmp/index" 31 | ACCOUNT_ID = boto3.client('sts').get_caller_identity().get('Account') 32 | INDEX_BUCKET = "lexgenaistack-created-index-bucket-"+ACCOUNT_ID 33 | RETRIEVAL_THRESHOLD = 0.4 34 | 35 | # define prompt helper 36 | max_input_size = 400 # set maximum input size 37 | num_output = 50 # set number of output tokens 38 | max_chunk_overlap = 0 # set maximum chunk overlap 39 | prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap) 40 | 41 | 42 | def handler(event, context): 43 | 44 | # lamda can only write to /tmp/ 45 | initialize_cache() 46 | 47 | # define our LLM 48 | llm_predictor = LLMPredictor(llm=CustomLLM()) 49 | embed_model = LangchainEmbedding(HuggingFaceEmbeddings(cache_folder="/tmp/HF_CACHE")) 50 | service_context = ServiceContext.from_defaults( 51 | llm_predictor=llm_predictor, prompt_helper=prompt_helper, embed_model=embed_model, 52 | ) 53 | 54 | ### Download index here 55 | if not os.path.exists(INDEX_WRITE_LOCATION): 56 | os.mkdir(INDEX_WRITE_LOCATION) 57 | try: 58 | s3_client.download_file(INDEX_BUCKET, "docstore.json", INDEX_WRITE_LOCATION + "/docstore.json") 59 | s3_client.download_file(INDEX_BUCKET, "index_store.json", INDEX_WRITE_LOCATION + "/index_store.json") 60 | s3_client.download_file(INDEX_BUCKET, "vector_store.json", INDEX_WRITE_LOCATION + "/vector_store.json") 61 | 62 | # load index 63 | storage_context = StorageContext.from_defaults(persist_dir=INDEX_WRITE_LOCATION) 64 | index = load_index_from_storage(storage_context, service_context=service_context) 65 | logger.info("Index successfully loaded") 66 | except ClientError as e: 67 | logger.error(e) 68 | return "ERROR LOADING/READING INDEX" 69 | 70 | retriever = VectorIndexRetriever( 71 | service_context=service_context, 72 | index=index, 73 | similarity_top_k=5, 74 | vector_store_query_mode=VectorStoreQueryMode.DEFAULT, # doesn't work with simple 75 | 
alpha=0.5, 76 | ) 77 | 78 | # configure response synthesizer 79 | synth = ResponseSynthesizer.from_args( 80 | response_mode="simple_summarize", 81 | service_context=service_context 82 | ) 83 | 84 | query_engine = RetrieverQueryEngine(retriever=retriever, response_synthesizer=synth) 85 | query_input = event["inputTranscript"] 86 | 87 | try: 88 | answer = query_engine.query(query_input) 89 | if answer.source_nodes[0].score < RETRIEVAL_THRESHOLD: 90 | answer = OUT_OF_DOMAIN_RESPONSE 91 | except: 92 | answer = OUT_OF_DOMAIN_RESPONSE 93 | 94 | response = generate_lex_response(event, {}, "Fulfilled", answer) 95 | jsonified_resp = json.loads(json.dumps(response, default=str)) 96 | return jsonified_resp 97 | 98 | def generate_lex_response(intent_request, session_attributes, fulfillment_state, message): 99 | intent_request['sessionState']['intent']['state'] = fulfillment_state 100 | return { 101 | 'sessionState': { 102 | 'sessionAttributes': session_attributes, 103 | 'dialogAction': { 104 | 'type': 'Close' 105 | }, 106 | 'intent': intent_request['sessionState']['intent'] 107 | }, 108 | 'messages': [ 109 | { 110 | "contentType": "PlainText", 111 | "content": message 112 | } 113 | ], 114 | 'requestAttributes': intent_request['requestAttributes'] if 'requestAttributes' in intent_request else None 115 | } 116 | 117 | # define prompt template 118 | template = ( 119 | "We have provided context information below. \n" 120 | "---------------------\n" 121 | "CONTEXT1:\n" 122 | "{context_str}\n\n" 123 | "CONTEXT2:\n" 124 | "CANNOTANSWER" 125 | "\n---------------------\n" 126 | 'Given this context, please answer the question if answerable based on on the CONTEXT1 and CONTEXT2: "{query_str}"\n; ' # otherwise specify it as CANNOTANSWER 127 | ) 128 | my_qa_template = Prompt(template) 129 | 130 | def call_sagemaker(prompt, endpoint_name=ENDPOINT_NAME): 131 | payload = { 132 | "inputs": prompt, 133 | "parameters": { 134 | "do_sample": False, 135 | # "top_p": 0.9, 136 | "temperature": 0.1, 137 | "max_new_tokens": 200, 138 | "repetition_penalty": 1.03, 139 | "stop": ["\nUser:", "<|endoftext|>", ""] 140 | } 141 | } 142 | 143 | sagemaker_client = boto3.client("sagemaker-runtime") 144 | payload = json.dumps(payload) 145 | response = sagemaker_client.invoke_endpoint( 146 | EndpointName=endpoint_name, ContentType="application/json", Body=payload 147 | ) 148 | response_string = response["Body"].read().decode() 149 | return response_string 150 | 151 | def get_response_sagemaker_inference(prompt, endpoint_name=ENDPOINT_NAME): 152 | resp = call_sagemaker(prompt, endpoint_name) 153 | resp = json.loads(resp)[0]["generated_text"][len(prompt):] 154 | return resp 155 | 156 | class CustomLLM(LLM): 157 | model_name = "tiiuae/falcon-7b-instruct" 158 | 159 | def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str: 160 | response = get_response_sagemaker_inference(prompt, ENDPOINT_NAME) 161 | return response 162 | 163 | @property 164 | def _identifying_params(self) -> Mapping[str, Any]: 165 | return {"name_of_model": self.model_name} 166 | 167 | @property 168 | def _llm_type(self) -> str: 169 | return "custom" 170 | 171 | def initialize_cache(): 172 | if not os.path.exists("/tmp/TRANSFORMERS_CACHE"): 173 | os.mkdir("/tmp/TRANSFORMERS_CACHE") 174 | 175 | if not os.path.exists("/tmp/HF_CACHE"): 176 | os.mkdir("/tmp/HF_CACHE") -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/lex-gen-ai-demo-docker-image/runtime_lambda_requirements.txt: 
-------------------------------------------------------------------------------- 1 | transformers==4.25.1 2 | langchain 3 | llama-index==0.6.20 4 | sentence-transformers -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/lex_gen_ai_demo_cdk_files/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-lex-conv-faq/e6be217505bc3c8f1422639f289a1a537d24fe3d/src/lex-gen-ai-demo-cdk/lex_gen_ai_demo_cdk_files/__init__.py -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/lex_gen_ai_demo_cdk_files/lex_gen_ai_demo_cdk_files_stack.py: -------------------------------------------------------------------------------- 1 | from aws_cdk import ( 2 | Duration, App, Stack, CfnResource, 3 | aws_lex as lex, 4 | aws_s3 as s3, 5 | aws_s3_notifications as s3n, 6 | aws_s3_deployment as s3deploy, 7 | aws_iam as iam, 8 | aws_lambda as lambda_ 9 | ) 10 | 11 | from constructs import Construct 12 | 13 | class LexGenAIDemoFilesStack(Stack): 14 | 15 | def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None: 16 | super().__init__(scope, construct_id, **kwargs) 17 | 18 | # Iam role for bot to invoke lambda 19 | lex_cfn_role = iam.Role(self, "CfnLexGenAIDemoRole", 20 | assumed_by=iam.ServicePrincipal("lexv2.amazonaws.com") 21 | ) 22 | lex_cfn_role.add_managed_policy(iam.ManagedPolicy.from_aws_managed_policy_name("AWSLambdaExecute")) 23 | 24 | # Iam role for lambda to invoke sagemaker 25 | lambda_cfn_role = iam.Role(self, "CfnLambdaGenAIDemoRole", 26 | assumed_by=iam.ServicePrincipal("lambda.amazonaws.com") 27 | ) 28 | lambda_cfn_role.add_managed_policy(iam.ManagedPolicy.from_aws_managed_policy_name("AmazonSageMakerFullAccess")) 29 | lambda_cfn_role.add_managed_policy(iam.ManagedPolicy.from_aws_managed_policy_name("AmazonS3FullAccess")) 30 | 31 | # will append account id to this string to avoid in region collisions 32 | source_bucket_name = "lexgenaistack-source-materials-bucket-" 33 | index_bucket_name = "lexgenaistack-created-index-bucket-" 34 | 35 | # S3 Buckets for materials to index and for the resulting indexes 36 | source_bucket = s3.Bucket(self, "SourceMatBucketID-CFN", 37 | bucket_name=source_bucket_name+lex_cfn_role.principal_account, 38 | block_public_access=s3.BlockPublicAccess.BLOCK_ALL, 39 | encryption=s3.BucketEncryption.S3_MANAGED, 40 | enforce_ssl=True, 41 | versioned=True) 42 | 43 | index_bucket = s3.Bucket(self, "IndexBucket-CFN", 44 | bucket_name=index_bucket_name+lex_cfn_role.principal_account, 45 | block_public_access=s3.BlockPublicAccess.BLOCK_ALL, 46 | encryption=s3.BucketEncryption.S3_MANAGED, 47 | enforce_ssl=True, 48 | versioned=True) 49 | 50 | # create lambda image for on demand index creation 51 | read_source_and_build_index_function = lambda_.DockerImageFunction(self, "read-source-and-build-index-function-CFN", function_name="read-source-and-build-index-fn", 52 | code=lambda_.DockerImageCode.from_image_asset("index-creation-docker-image"), 53 | role=lambda_cfn_role, 54 | memory_size=10240, 55 | timeout=Duration.minutes(5) 56 | ) 57 | source_bucket.add_event_notification(s3.EventType.OBJECT_CREATED, s3n.LambdaDestination(read_source_and_build_index_function)) 58 | 59 | # create image of lex-gen-ai-demo-docker-image, push to ECR and into a lambda function 60 | runtime_function = lambda_.DockerImageFunction(self, "CFN-runtime-fn", function_name="lex-codehook-fn", 61 | 
code=lambda_.DockerImageCode.from_image_asset("lex-gen-ai-demo-docker-image"), 62 | role=lambda_cfn_role, 63 | memory_size=10240, 64 | timeout=Duration.minutes(5) 65 | ) 66 | runtime_function.grant_invoke(iam.ServicePrincipal("lexv2.amazonaws.com")) 67 | 68 | ### BOT SETUP 69 | 70 | # alias settings, where we define the lambda function with the ECR container with our LLM dialog code (defined in the lex-gen-ai-demo-docker-image directory) 71 | # test bot alias for demo, create a dedicated alias for serving traffic 72 | bot_alias_settings = lex.CfnBot.TestBotAliasSettingsProperty( 73 | bot_alias_locale_settings=[lex.CfnBot.BotAliasLocaleSettingsItemProperty( 74 | bot_alias_locale_setting=lex.CfnBot.BotAliasLocaleSettingsProperty( 75 | enabled=True, 76 | code_hook_specification=lex.CfnBot.CodeHookSpecificationProperty( 77 | lambda_code_hook=lex.CfnBot.LambdaCodeHookProperty( 78 | code_hook_interface_version="1.0", 79 | lambda_arn=runtime_function.function_arn 80 | ) 81 | ) 82 | ), 83 | locale_id="en_US" 84 | )]) 85 | 86 | # lambda itself is tied to alias but codehook settings are intent specific 87 | initial_response_codehook_settings = lex.CfnBot.InitialResponseSettingProperty( 88 | code_hook=lex.CfnBot.DialogCodeHookInvocationSettingProperty( 89 | enable_code_hook_invocation=True, 90 | is_active=True, 91 | post_code_hook_specification=lex.CfnBot.PostDialogCodeHookInvocationSpecificationProperty() 92 | ) 93 | ) 94 | 95 | # placeholder intent to be missed for this demo 96 | placeholder_intent = lex.CfnBot.IntentProperty( 97 | name="placeHolderIntent", 98 | initial_response_setting=initial_response_codehook_settings, 99 | sample_utterances=[lex.CfnBot.SampleUtteranceProperty( 100 | utterance="utterance" 101 | )] 102 | ) 103 | 104 | fallback_intent = lex.CfnBot.IntentProperty( 105 | name="FallbackIntent", 106 | parent_intent_signature="AMAZON.FallbackIntent", 107 | initial_response_setting=initial_response_codehook_settings, 108 | fulfillment_code_hook=lex.CfnBot.FulfillmentCodeHookSettingProperty( 109 | enabled=True, 110 | is_active=True, 111 | post_fulfillment_status_specification=lex.CfnBot.PostFulfillmentStatusSpecificationProperty() 112 | ) 113 | ) 114 | 115 | # Create actual Lex Bot 116 | cfn_bot = lex.CfnBot(self, "LexGenAIDemoCfnBot", 117 | data_privacy={"ChildDirected":"false"}, 118 | idle_session_ttl_in_seconds=300, 119 | name="LexGenAIDemoBotCfn", 120 | description="Bot created for blog post: Enhance Amazon Lex with conversational FAQ features using LLMs", 121 | role_arn=lex_cfn_role.role_arn, 122 | bot_locales=[lex.CfnBot.BotLocaleProperty( 123 | locale_id="en_US", 124 | nlu_confidence_threshold=0.4, 125 | intents=[placeholder_intent, fallback_intent]) 126 | ], 127 | test_bot_alias_settings = bot_alias_settings, 128 | auto_build_bot_locales=True 129 | ) 130 | 131 | 132 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/requirements.txt: -------------------------------------------------------------------------------- 1 | aws-cdk-lib==2.80.0 2 | constructs>=10.0.0,<11.0.0 3 | sagemaker==2.163.0 4 | boto3 -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/shut_down_endpoint.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | from botocore.exceptions import ClientError 3 | 4 | from endpoint_handler import SAGEMAKER_ENDPOINT_NAME 5 | 6 | 7 | sagemaker_client = boto3.client('sagemaker') 8 | 9 | 10 | try: 11 | # verify 
endpoint exists 12 | endpoint = sagemaker_client.describe_endpoint(EndpointName=SAGEMAKER_ENDPOINT_NAME) 13 | print(f"Endpoint {endpoint['EndpointName']} found, shutting down") 14 | 15 | try: # delete both endpoint and configuration 16 | sagemaker_client.delete_endpoint( 17 | EndpointName=SAGEMAKER_ENDPOINT_NAME 18 | ) 19 | sagemaker_client.delete_endpoint_config( 20 | EndpointConfigName=SAGEMAKER_ENDPOINT_NAME 21 | ) 22 | print(f"Endpoint {SAGEMAKER_ENDPOINT_NAME} shut down") 23 | except ClientError as e: 24 | print(e) 25 | except: 26 | print(f"Endpoint {SAGEMAKER_ENDPOINT_NAME} does not exist in account {boto3.client('sts').get_caller_identity().get('Account')}") -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/source.bat: -------------------------------------------------------------------------------- 1 | @echo off 2 | 3 | rem The sole purpose of this script is to make the command 4 | rem 5 | rem source .venv/bin/activate 6 | rem 7 | rem (which activates a Python virtualenv on Linux or Mac OS X) work on Windows. 8 | rem On Windows, this command just runs this batch file (the argument is ignored). 9 | rem 10 | rem Now we don't need to document a Windows command for activating a virtualenv. 11 | 12 | echo Executing .venv\Scripts\activate.bat for you 13 | .venv\Scripts\activate.bat -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/upload_file_to_s3.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import boto3 3 | from botocore.exceptions import ClientError 4 | import logging 5 | 6 | ACCOUNT_ID = boto3.client('sts').get_caller_identity().get('Account') 7 | S3_BUCKET = "lexgenaistack-source-materials-bucket-"+ACCOUNT_ID 8 | s3_client = boto3.client("s3") 9 | 10 | def main(): 11 | if len(sys.argv) == 1: 12 | print(f"[ERROR] You must specify file to upload") 13 | elif len(sys.argv) == 2: 14 | filepath = sys.argv[1] 15 | upload(filepath) 16 | elif len(sys.argv) == 3: 17 | filepath = sys.argv[2] 18 | upload(filepath) 19 | else: 20 | print("[ERROR] Too many arguments, only include /path/to/your/file") 21 | 22 | 23 | def upload(filepath): 24 | if filepath[-4:].lower() == '.txt' or filepath[-4:].lower() == '.pdf': 25 | print(f"Uploading file at {filepath}") 26 | try: 27 | upload_name = filepath.split("/")[-1].replace(" ","").replace("/","") 28 | s3_client.upload_file(filepath, S3_BUCKET, upload_name) 29 | print(f"Successfully uploaded file at {filepath}, creating index...") 30 | except ClientError as e: 31 | logging.error(e) 32 | else: 33 | print("[ERROR] File must be txt or PDF") 34 | 35 | 36 | if __name__ == "__main__": 37 | main() -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/web-crawler-docker-image/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM public.ecr.aws/lambda/python:3.8 2 | 3 | COPY web_crawler_requirements.txt . 
4 | RUN pip3 install -r web_crawler_requirements.txt --target "${LAMBDA_TASK_ROOT}" 5 | 6 | # Copy function code 7 | COPY *.py ${LAMBDA_TASK_ROOT} 8 | 9 | # Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile) 10 | CMD [ "web_crawler_app.handler" ] 11 | 12 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/web-crawler-docker-image/web_crawler_app.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | import requests 3 | import html2text 4 | from typing import List 5 | import re 6 | import logging 7 | import json 8 | import traceback 9 | 10 | logger = logging.getLogger() 11 | logger.setLevel(logging.INFO) 12 | 13 | 14 | def find_http_urls_in_parentheses(s: str, prefix: str = None): 15 | pattern = r'\((https?://[^)]+)\)' 16 | urls = re.findall(pattern, s) 17 | 18 | matched = [] 19 | if prefix is not None: 20 | for url in urls: 21 | if str(url).startswith(prefix): 22 | matched.append(url) 23 | else: 24 | matched = urls 25 | 26 | return list(set(matched)) # remove duplicates by converting to set, then convert back to list 27 | 28 | 29 | 30 | class EZWebLoader: 31 | 32 | def __init__(self, default_header: str = None): 33 | self._html_to_text_parser = html2text 34 | if default_header is None: 35 | self._default_header = {"User-agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36"} 36 | else: 37 | self._default_header = default_header 38 | 39 | def load_data(self, 40 | urls: List[str], 41 | num_levels: int = 0, 42 | level_prefix: str = None, 43 | headers: str = None) -> List[str]: 44 | 45 | logging.info(f"Number of urls: {len(urls)}.") 46 | 47 | if headers is None: 48 | headers = self._default_header 49 | 50 | documents = [] 51 | visited = {} 52 | for url in urls: 53 | q = [url] 54 | depth = num_levels 55 | for page in q: 56 | if page not in visited: #prevent cycles by checking to see if we already crawled a link 57 | logging.info(f"Crawling {page}") 58 | visited[page] = True #add entry to visited to prevent re-crawling pages 59 | response = requests.get(page, headers=headers).text 60 | response = self._html_to_text_parser.html2text(response) #reduce html to text 61 | documents.append(response) 62 | if depth > 0: 63 | #crawl linked pages 64 | ingest_urls = find_http_urls_in_parentheses(response, level_prefix) 65 | logging.info(f"Found {len(ingest_urls)} pages to crawl.") 66 | q.extend(ingest_urls) 67 | depth -= 1 #reduce the depth counter so we go only num_levels deep in our crawl 68 | else: 69 | logging.info(f"Skipping {page} as it has already been crawled") 70 | logging.info(f"Number of documents: {len(documents)}.") 71 | return documents 72 | 73 | ACCOUNT_ID = boto3.client('sts').get_caller_identity().get('Account') 74 | S3_BUCKET = "lexgenaistack-source-materials-bucket-" + ACCOUNT_ID 75 | FILE_NAME = 'web-crawl-results.txt' 76 | 77 | 78 | def handler(event, context): 79 | url = "http://www.zappos.com/general-questions" 80 | depth = 1 81 | level_prefix = "https://www.zappos.com/" 82 | 83 | if event is not None: 84 | if "url" in event: 85 | url = event["url"] 86 | if "depth" in event: 87 | depth = int(event["depth"]) 88 | if "level_prefix" in event: 89 | level_prefix = event["level_prefix"] 90 | 91 | # crawl the website 92 | try: 93 | logger.info(f"Crawling {url} to depth of {depth}...") 94 | loader = EZWebLoader() 95 | documents = loader.load_data([url], depth, level_prefix) 96 | 
doc_string = json.dumps(documents, indent=1) 97 | logger.info(f"Crawling {url} to depth of {depth} succeeded") 98 | except Exception as e: 99 | # If there's an error, print the error message 100 | logging.error(f"An error occurred during the crawl of {url}.") 101 | exception_traceback = traceback.format_exc() 102 | logger.error(exception_traceback) 103 | return { 104 | "status": 500, 105 | "message": exception_traceback 106 | } 107 | # save the results for indexing 108 | try: 109 | # Use the S3 client to write the string to S3 110 | s3 = boto3.client('s3') 111 | s3.put_object(Body=doc_string, Bucket=S3_BUCKET, Key=FILE_NAME) 112 | success_msg = f'Successfully put {FILE_NAME} to {S3_BUCKET}' 113 | logging.info(success_msg) 114 | return { 115 | "status": 200, 116 | "message": success_msg 117 | } 118 | except Exception as e: 119 | # If there's an error, print the error message 120 | exception_traceback = traceback.format_exc() 121 | logger.error(exception_traceback) 122 | return { 123 | "status": 500, 124 | "message": exception_traceback 125 | } 126 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/web-crawler-docker-image/web_crawler_requirements.txt: -------------------------------------------------------------------------------- 1 | requests 2 | html2text 3 | accelerate 4 | boto3 5 | -------------------------------------------------------------------------------- /src/lex-gen-ai-demo-cdk/web_crawl.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | import argparse 3 | import json 4 | 5 | 6 | def invoke_lambda(url=None, depth="1", level_prefix=None): 7 | client = boto3.client('lambda') 8 | 9 | # Prepare the payload 10 | payload = {} 11 | if url is not None: 12 | payload["url"] = url 13 | if depth is not None: 14 | payload["depth"] = depth 15 | if level_prefix is not None: 16 | payload["level_prefix"] = level_prefix 17 | 18 | try: 19 | response = client.invoke( 20 | FunctionName='WebCrawlerLambda', 21 | InvocationType='RequestResponse', 22 | LogType='Tail', 23 | # The payload must be a JSON-formatted string 24 | Payload=json.dumps(payload) 25 | ) 26 | 27 | # The response from Lambda will be a JSON string, so you need to parse it 28 | result = response['Payload'].read().decode('utf-8') 29 | 30 | print("Response: " + result) 31 | 32 | except Exception as e: 33 | print(e) 34 | 35 | 36 | # Parse command-line arguments 37 | parser = argparse.ArgumentParser() 38 | parser.add_argument('--url', type=str, help='The URL to process.', required=False, default=None) 39 | parser.add_argument('--depth', type=int, help='The depth of the crawl.', required=False, default="1") 40 | parser.add_argument('--level_prefix', type=str, help='The prefix that any links must contain to crawl.', required=False, default=None) 41 | args = parser.parse_args() 42 | 43 | invoke_lambda(args.url, args.depth, args.level_prefix) 44 | --------------------------------------------------------------------------------