├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── Module 1 - Difference between BM25 similarity and Semantic similarity.ipynb ├── Module 2 - Text Search.ipynb ├── Module 3 - Semantic Search.ipynb ├── Module 4 - Fullstack Semantic Search.ipynb ├── Module 5 - Semantic Search with Fine Tuned Model.ipynb ├── Module 6 - Lab1 - Sematic Search with Neural Search Local Model.ipynb ├── Module 6 - Lab2 - Sematic Search with Neural Search Remote Model.ipynb ├── Module 7 - Retrieval Augmented Generation.ipynb ├── Module 8 - Conversational Search.ipynb ├── Module_7A_Conversational_Search_with_GenAI.ipynb ├── README.md ├── backend ├── lambda │ ├── app.py │ ├── build-lambda.sh │ └── requirements.txt └── template.yaml ├── blog ├── LLM-Based-Agent │ ├── Generative AI with LLM based autonomous agents augmented with structured and unstructured data.ipynb │ └── stock-price │ │ ├── CRM.csv │ │ ├── MSFT.csv │ │ ├── ORCL.csv │ │ ├── PANW.csv │ │ ├── SNOW.csv │ │ └── stock_symbol.csv └── NeuralSearch │ └── Semantic Search with OpenSearch.ipynb ├── code ├── inference.py └── requirements.txt ├── converstational-search.png ├── convert_pqa.py ├── deploy-semantic-search-backend-cloudformation.jpg ├── deploy-to-aws.png ├── download-dependencies.sh ├── frontend ├── package-lock.json ├── package.json ├── public │ ├── favicon.ico │ ├── index.html │ ├── logo192.png │ ├── logo512.png │ ├── manifest.json │ └── robots.txt └── src │ ├── App.css │ ├── App.js │ ├── config │ └── index.js │ ├── images │ └── header.jpg │ ├── index.css │ ├── index.js │ ├── logo.svg │ └── serviceWorker.js ├── full-stack-semantic-search-ui-2.jpg ├── full-stack-semantic-search-ui.jpg ├── generative-ai ├── Module_1_Build_Conversational_Search │ ├── README.md │ ├── __pycache__ │ │ ├── create_IAM_role.cpython-310.pyc │ │ ├── create_IAMrole.cpython-310.pyc │ │ ├── create_role.cpython-310.pyc │ │ ├── lambda_URL.cpython-310.pyc │ │ ├── lambda_exec_role.cpython-310.pyc │ │ ├── lambda_function.cpython-310.pyc │ │ ├── lambda_functions.cpython-310.pyc │ │ ├── lambda_functionz.cpython-310.pyc │ │ └── lambda_role.cpython-310.pyc │ ├── app.py │ ├── bundle.sh │ ├── bundle_lambda.sh │ ├── cdk.json │ ├── chain_documentEncoder.py │ ├── chain_queryEncoder.py │ ├── config.py │ ├── conversational_search_full_stack_with_gpu.yaml │ ├── environment.yml │ ├── images │ │ └── service-design.svg │ ├── lambda_URL.py │ ├── lambda_exec_role.py │ ├── lambda_function.py │ ├── main_documentEncoder.py │ ├── main_queryEncoder.py │ ├── module1 │ │ ├── all_components.png │ │ ├── encoders.png │ │ ├── memory.png │ │ ├── ml_models.png │ │ ├── module1.gif │ │ ├── vectordb.png │ │ └── webserver.png │ ├── requirements.txt │ ├── sample_pdfs │ │ ├── OpenSearch_best_practices.pdf │ │ └── Sizing_Amazon_OpenSearch_Service_domains.pdf │ └── webapp │ │ ├── api.py │ │ ├── app.py │ │ ├── images │ │ ├── ai-icon.png │ │ ├── opensearch-twitter-card.png │ │ ├── regenerate.png │ │ ├── user-icon.png │ │ └── user.png │ │ └── pdfs │ │ ├── Amazon_OpenSearch_best_practices.pdf │ │ ├── Sizing_Amazon_OpenSearch_Service_domain.pdf │ │ └── Vector_database_capabilities_OpenSearch_Service.pdf ├── Module_1_Build_Conversational_Search_Components.ipynb ├── Module_2_Conversational_Search_with_GenAI.ipynb └── Module_3_Exercise_Conversational_Search_with_GenAI.ipynb ├── image ├── module3 │ ├── document_loader │ └── workflow └── module8 │ ├── conversation-final-answer.png │ ├── conversation-new-question.png │ ├── document-loader.png │ ├── enable-models-bedrock.gif │ ├── rag-with-memory.png │ └── 
rag.png ├── inference.py ├── keyword_search.png ├── model └── all-MiniLM-L6-v2_torchscript.json ├── nlp_bert.png ├── rag.png ├── requirements.txt ├── semantic_search.png ├── semantic_search_fullstack.jpg ├── semantic_search_with_fine_tuning.png ├── static ├── conversation-final-answer.png ├── conversation-new-question.png ├── data_ingestion.png ├── document-loader.png ├── enable-models-bedrock.gif ├── huggingface.png ├── huggingfact-SBERT.jpeg ├── rag-with-memory.png ├── rag.png ├── retrieveAPI.png ├── retrieveAndGenerate.png └── sbert.jpeg └── word2vec.png /.gitignore: -------------------------------------------------------------------------------- 1 | **/.DS_Store 2 | .ipynb_checkpoints/* 3 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *master* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 
38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | 61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes. 62 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of 4 | this software and associated documentation files (the "Software"), to deal in 5 | the Software without restriction, including without limitation the rights to 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 7 | the Software, and to permit persons to whom the Software is furnished to do so. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 10 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 11 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 12 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 13 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 14 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 15 | 16 | -------------------------------------------------------------------------------- /Module 1 - Difference between BM25 similarity and Semantic similarity.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "c229af9c", 6 | "metadata": {}, 7 | "source": [ 8 | "# Module 1 : Exploring BM25 similarity and Semantic similarity\n", 9 | "\n", 10 | "Before we get started with Amazon OpenSearch and our search web app, let's explore some of the core concepts in search. Below, we'll demonstrate the difference between algorithms for matching data using BM25 similarity (keyword matching) and Cosine similarity (semantic vector matching)." 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "id": "ae752e16", 16 | "metadata": {}, 17 | "source": [ 18 | "### 1. 
Check PyTorch version\n", 19 | "\n", 20 | "Before we begin, we need to make sure the PyTorch version is 2.2.0 or above" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": null, 26 | "id": "f1dd8491", 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import torch\n", 31 | "print(torch.__version__)" 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "id": "311ff552", 37 | "metadata": {}, 38 | "source": [ 39 | "### 2. Install Pre-Requisites\n", 40 | "\n", 41 | "Before we can experiment with different searches, we need to install some required libraries." 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": null, 47 | "id": "30e40333", 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "!pip install transformers\n", 52 | "!pip install transformers[torch]\n", 53 | "!pip install sentence-transformers \n", 54 | "!pip install rank_bm25\n", 55 | "!pip install tqdm" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "id": "1b126cc3", 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "from rank_bm25 import BM25Okapi\n", 66 | "from sklearn.feature_extraction import _stop_words\n", 67 | "import string\n", 68 | "from tqdm.autonotebook import tqdm\n", 69 | "import numpy as np" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "id": "72e83e3b", 75 | "metadata": {}, 76 | "source": [ 77 | "### 3. Create a sample dataset\n", 78 | "\n", 79 | "Let's now create a very simple dataset as an array of 4 questions." 80 | ] 81 | }, 82 | { 83 | "cell_type": "code", 84 | "execution_count": null, 85 | "id": "52ac493e", 86 | "metadata": {}, 87 | "outputs": [], 88 | "source": [ 89 | "passages=[\n", 90 | " \"does this work with xbox?\",\n", 91 | " \"does the M70 work with Android phones?\", \n", 92 | " \"does this work with iphone?\",\n", 93 | " \"can this work with an xbox \"\n", 94 | " ]" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "id": "09b8e8b6", 100 | "metadata": {}, 101 | "source": [ 102 | "### 4. Explore BM25 similarity \n", 103 | "\n", 104 | "Execute the following to explore BM25 similarity. First, we'll tokenize the data set, then use BM25 similarity to compare the phrase \"does this work with xbox?\" with our above 4 questions. " 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": null, 110 | "id": "af5409ce", 111 | "metadata": {}, 112 | "outputs": [], 113 | "source": [ 114 | "# split the sentence into words and remove stop words. For example, sentence \"does this work with xbox\" \n", 115 | "# will be converted into words \"does\", \"work\", \"xbox\"\n", 116 | "def bm25_tokenizer(text):\n", 117 | " tokenized_doc = []\n", 118 | " for token in text.lower().split():\n", 119 | " token = token.strip(string.punctuation)\n", 120 | "\n", 121 | " if len(token) > 0 and token not in _stop_words.ENGLISH_STOP_WORDS:\n", 122 | " tokenized_doc.append(token)\n", 123 | " return tokenized_doc\n", 124 | "\n", 125 | "\n", 126 | "tokenized_corpus = []\n", 127 | "for passage in tqdm(passages):\n", 128 | " tokenized_corpus.append(bm25_tokenizer(passage))\n", 129 | "\n", 130 | "#get the BM25 score between the query \"does this work with xbox?\" and the existing 4 questions above. 
\n", 131 | "#If the score is high, it means BM25 think the two sentences are similiar.\n", 132 | "bm25 = BM25Okapi(tokenized_corpus)\n", 133 | "bm25_scores = bm25.get_scores(bm25_tokenizer(\"does this work with xbox?\"))\n", 134 | "\n", 135 | "all_sentence_combinations = []\n", 136 | "for i in range(len(bm25_scores)):\n", 137 | " all_sentence_combinations.append([bm25_scores[i], i])\n", 138 | "\n", 139 | "#sort the score descending\n", 140 | "all_sentence_combinations = sorted(all_sentence_combinations, key=lambda x: x[0], reverse=True)\n", 141 | "\n", 142 | "# print the 4 sentences, tokens and the BM25 score with query \"does this work with xbox?\"\n", 143 | "# You can \"does this work with iphone?\" has high BM25 score with \"does this work with xbox?\" \n", 144 | "# even though the semantics meaning is different. While \"Can this work with an xbox\" has low BM25 score \n", 145 | "# with \"does this work with xbox?\" even though the two sentence has same semantic meaning. \n", 146 | "# This is the drawback of BM25.\n", 147 | "print(\"Top most similar pairs:\")\n", 148 | "for score, i in all_sentence_combinations[0:4]:\n", 149 | " print(\"{} \\t {} \\t {:.4f}\".format(passages[i],bm25_tokenizer(passages[i]),bm25_scores[i]))\n", 150 | " \n", 151 | " " 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "id": "b7d61730", 157 | "metadata": {}, 158 | "source": [ 159 | "Check the result. You can see sentence \"does this work with xbox?\" and \"does this work with iphone?\" has the same BM25 similarity score with sentence \"does this work with xbox?\". However \"does this work with iphone?\" is totally different meaning.\n", 160 | "\n", 161 | "Meanwhile, sentence \"can this work with an xbox\" has the same meaning with query question \"does this work with xbox?\". However the BM25 score is low." 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "id": "1ed5375b", 167 | "metadata": {}, 168 | "source": [ 169 | "### 5. Semantic Similarities\n", 170 | "\n", 171 | "\n", 172 | "Execute the following to explore semantic similarity with cosine similarity. In this code, we'll use the same dataset as above, but using cosine similarity. Compare the differences in how similarity is measured.\n", 173 | "\n", 174 | "The 'all-MiniLM-L6-v2' is a sentence transformer from HuggingFace that maps sentences & paragraphs to a 384 dimensional dense vector space. We'll use this library to translate our sample data set to a vector set. We'll then use util.cos_sim to provide a cosine similarity between each combination of records in the dataset. Finally, we'll print the cosine similarity score of the first record (\"does this work with xbox?\") with records in the dataset." 
175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": null, 180 | "id": "17db6997", 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [ 184 | "from sentence_transformers import SentenceTransformer, util\n", 185 | "model = SentenceTransformer('all-MiniLM-L6-v2')\n", 186 | "\n", 187 | "#Encode all sentences\n", 188 | "embeddings = model.encode(passages)\n", 189 | "\n", 190 | "#Compute cosine similarity between all pairs\n", 191 | "cos_sim = util.cos_sim(embeddings, embeddings)\n", 192 | "\n", 193 | "#cosine similarity score with query\n", 194 | "all_sentence_combinations = []\n", 195 | "for i in range(len(cos_sim)):\n", 196 | " all_sentence_combinations.append([cos_sim[0][i], i])\n", 197 | "\n", 198 | "#Sort list by the highest cosine similarity score\n", 199 | "all_sentence_combinations = sorted(all_sentence_combinations, key=lambda x: x[0], reverse=True)\n", 200 | "\n", 201 | "# You see \"does this work with xbox?\" has the same meaning as the query \"does this work with xbox?\", \n", 202 | "# so its semantic score is the highest of course. \"Can this work with an xbox\" has a similar meaning \n", 203 | "# to \"does this work with xbox?\", so its semantic score ranks second.\n", 204 | "print(\"Top most similar pairs:\")\n", 205 | "for score, i in all_sentence_combinations[0:4]:\n", 206 | " print(\"{} \\t {:.4f}\".format(passages[i],cos_sim[0][i]))" 207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "id": "f6c6e0a3", 212 | "metadata": {}, 213 | "source": [ 214 | "### 6. Compare the differences.\n", 215 | "\n", 216 | "As you can see, the similarity is significantly different, even with a trivial data set. In particular, observe the differences between the text for \"Can this work with an xbox\" - with BM25 keyword search, the word \"can\" doesn't match the original question, and so it's given a low rating. Cosine similarity semantic search provides a much better match in this case.\n", 217 | "\n", 218 | "In this module, we've used fairly simple steps with a very small dataset to demonstrate the difference between BM25 and cosine similarity. In the following modules, we'll demonstrate these same concepts using OpenSearch and a larger and more complex dataset." 219 | ] 220 | } 221 | ], 222 | "metadata": { 223 | "kernelspec": { 224 | "display_name": "conda_pytorch_p310", 225 | "language": "python", 226 | "name": "conda_pytorch_p310" 227 | }, 228 | "language_info": { 229 | "codemirror_mode": { 230 | "name": "ipython", 231 | "version": 3 232 | }, 233 | "file_extension": ".py", 234 | "mimetype": "text/x-python", 235 | "name": "python", 236 | "nbconvert_exporter": "python", 237 | "pygments_lexer": "ipython3", 238 | "version": "3.10.14" 239 | } 240 | }, 241 | "nbformat": 4, 242 | "nbformat_minor": 5 243 | } 244 | -------------------------------------------------------------------------------- /Module 2 - Text Search.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "7a260e21", 6 | "metadata": {}, 7 | "source": [ 8 | "# Module 2: Text Search with Amazon OpenSearch Service \n", 9 | "\n", 10 | "In this module, we are going to perform a simple search in OpenSearch by matching the individual words in our search query. We will:\n", 11 | "1. Load data into OpenSearch from the Amazon Product Question and Answer (PQA) dataset. This dataset contains a list of common questions and answers related to products.\n", 12 | "2. 
Query the data using a simple query search to find potentially matching questions. We will search the PQA dataset for questions similar to our sample question \"does this work with xbox?\". We expect to find matches in the dataset based on the individual words such as \"xbox\" and \"work\".\n", 13 | "\n", 14 | "In subsequent modules, we will then demonstrate how to use semantic search to improve the relevance of the query results." 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "id": "3df71ad6", 20 | "metadata": {}, 21 | "source": [ 22 | "### 1. Install required libraries" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "id": "a1582d18", 28 | "metadata": {}, 29 | "source": [ 30 | "Before we begin, we need to install some required libraries." 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": null, 36 | "id": "bf023c28", 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "!pip install boto3\n", 41 | "!pip install requests\n", 42 | "!pip install requests-aws4auth\n", 43 | "!pip install opensearch-py" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "id": "467dd76e", 49 | "metadata": {}, 50 | "source": [ 51 | "### 2. Get CloudFormation stack output variables\n", 52 | "\n", 53 | "We also need to grab some key values from the infrastructure we provisioned using CloudFormation. To do this, we will list the outputs from the stack and store this in \"outputs\" to be used later.\n", 54 | "\n", 55 | "You can ignore any \"PythonDeprecationWarning\" warnings." 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "id": "23e3ac3e", 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "import boto3\n", 66 | "import json\n", 67 | "\n", 68 | "cfn = boto3.client('cloudformation')\n", 69 | "kms = boto3.client('secretsmanager')\n", 70 | "\n", 71 | "\n", 72 | "def get_cfn_outputs(stackname):\n", 73 | " outputs = {}\n", 74 | " for output in cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']:\n", 75 | " outputs[output['OutputKey']] = output['OutputValue']\n", 76 | " return outputs\n", 77 | "\n", 78 | "## Setup variables to use for the rest of the demo\n", 79 | "cloudformation_stack_name = \"semantic-search\"\n", 80 | "\n", 81 | "outputs = get_cfn_outputs(cloudformation_stack_name)\n", 82 | "bucket = outputs['s3BucketTraining']\n", 83 | "aos_host = outputs['OpenSearchDomainEndpoint']\n", 84 | "aos_credentials = json.loads(kms.get_secret_value(SecretId=outputs['OpenSearchSecret'])['SecretString'])\n", 85 | "\n", 86 | "outputs" 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "id": "192599d8", 92 | "metadata": {}, 93 | "source": [ 94 | "### 3. Copy the data set locally\n", 95 | "Before we can run any queries, we need to download the Amazon Product Question and Answer data from: https://registry.opendata.aws/amazon-pqa/" 96 | ] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "id": "bcd10e11", 101 | "metadata": {}, 102 | "source": [ 103 | "Let's start by having a look at all the files in the dataset." 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "id": "8a00371e", 110 | "metadata": {}, 111 | "outputs": [], 112 | "source": [ 113 | "!aws s3 ls --no-sign-request s3://amazon-pqa/" 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "id": "38f90419", 119 | "metadata": {}, 120 | "source": [ 121 | "There are a lot of files here, so for the purposes of this demo, we focus on just the headset data. 
Let's download the amazon_pqa_headsets.json data locally. " 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": null, 127 | "id": "e74b8c61", 128 | "metadata": {}, 129 | "outputs": [], 130 | "source": [ 131 | "!aws s3 cp --no-sign-request s3://amazon-pqa/amazon_pqa_headsets.json ./amazon-pqa/amazon_pqa_headsets.json" 132 | ] 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "id": "a157a87c", 137 | "metadata": {}, 138 | "source": [ 139 | "### 4. Create an OpenSearch cluster connection.\n", 140 | "Next, we'll use the Python API to set up a connection with the Amazon OpenSearch Service domain.\n", 141 | "\n", 142 | "Note: if you're using a region other than us-east-1, please update the region in the code below. " 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": null, 148 | "id": "de2bbf1d", 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth\n", 153 | "import boto3\n", 154 | "\n", 155 | "#update the region if you're working other than us-east-1\n", 156 | "region = 'us-east-1' \n", 157 | "\n", 158 | "print (aos_host)\n", 159 | "\n", 160 | "#credentials = boto3.Session().get_credentials()\n", 161 | "#auth = AWSV4SignerAuth(credentials, region)\n", 162 | "auth = (aos_credentials['username'], aos_credentials['password'])\n", 163 | "\n", 164 | "aos_client = OpenSearch(\n", 165 | " hosts = [{'host': aos_host, 'port': 443}],\n", 166 | " http_auth = auth,\n", 167 | " use_ssl = True,\n", 168 | " verify_certs = True,\n", 169 | " connection_class = RequestsHttpConnection\n", 170 | ")" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "id": "9f324821", 176 | "metadata": {}, 177 | "source": [ 178 | "### 5. Create an index in Amazon OpenSearch Service \n", 179 | "We are defining an index with an English analyzer which will strip common stopwords like `the`, `is`, `a`, `an`, etc.\n", 180 | "\n", 181 | "We will use the aos_client connection we initiated earlier to create an index in Amazon OpenSearch Service" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": null, 187 | "id": "b052b473", 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "headset_default_index = {\n", 192 | " \"settings\": {\n", 193 | " \"number_of_replicas\": 1,\n", 194 | " \"number_of_shards\": 1,\n", 195 | " \"analysis\": {\n", 196 | " \"analyzer\": {\n", 197 | " \"default\": {\n", 198 | " \"type\": \"standard\",\n", 199 | " \"stopwords\": \"_english_\"\n", 200 | " }\n", 201 | " }\n", 202 | " }\n", 203 | " }\n", 204 | " \n", 205 | "}" 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "id": "f54cbfdf", 211 | "metadata": {}, 212 | "source": [ 213 | "If for any reason you need to recreate your dataset, you can uncomment and execute the following to delete any previously created indexes. If this is the first time you're running this, you can skip this step." 
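, "\n", "If you are curious how the stopword filtering behaves, the _analyze API lets you preview tokenization without indexing anything. A quick sanity check, using the aos_client connection created above (the exact token list may vary by OpenSearch version):\n", "\n", "```python\n", "# Preview how a standard analyzer with English stopwords tokenizes a question;\n", "# common words such as \"this\" and \"with\" are stripped before matching.\n", "tokens = aos_client.indices.analyze(body={\n", "    \"tokenizer\": \"standard\",\n", "    \"filter\": [{\"type\": \"stop\", \"stopwords\": \"_english_\"}],\n", "    \"text\": \"does this work with xbox?\"\n", "})\n", "print([t[\"token\"] for t in tokens[\"tokens\"]])\n", "```"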
214 | ] 215 | }, 216 | { 217 | "cell_type": "code", 218 | "execution_count": null, 219 | "id": "573cf751", 220 | "metadata": {}, 221 | "outputs": [], 222 | "source": [ 223 | "#aos_client.indices.delete(index=\"headset_pqa\")" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "id": "5f5bd09f", 229 | "metadata": {}, 230 | "source": [ 231 | "Using the above index definition, we now need to create the index in Amazon OpenSearch" 232 | ] 233 | }, 234 | { 235 | "cell_type": "code", 236 | "execution_count": null, 237 | "id": "8b86ecec", 238 | "metadata": {}, 239 | "outputs": [], 240 | "source": [ 241 | "aos_client.indices.create(index=\"headset_pqa\",body=headset_default_index,ignore=400)\n" 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "id": "4a041f9e", 247 | "metadata": {}, 248 | "source": [ 249 | "Let's verify the created index information" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "id": "5ae5f0f1", 256 | "metadata": {}, 257 | "outputs": [], 258 | "source": [ 259 | "aos_client.indices.get(index=\"headset_pqa\")" 260 | ] 261 | }, 262 | { 263 | "cell_type": "markdown", 264 | "id": "037debef", 265 | "metadata": {}, 266 | "source": [ 267 | "### 6. Load the raw data into the Index\n", 268 | "Next, let's load the headset PQA data we copied locally into the index we've just created." 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "id": "f84c7e5c", 275 | "metadata": {}, 276 | "outputs": [], 277 | "source": [ 278 | "import json\n", 279 | "from tqdm.contrib.concurrent import process_map\n", 280 | "from multiprocessing import cpu_count\n", 281 | "\n", 282 | "def load_pqa_as_json(file_name,number_rows=1000):\n", 283 | " result=[]\n", 284 | " with open(file_name) as f:\n", 285 | " i=0\n", 286 | " for line in f:\n", 287 | " data = json.loads(line)\n", 288 | " result.append(data)\n", 289 | " i+=1\n", 290 | " if(i == number_rows):\n", 291 | " break\n", 292 | " return result\n", 293 | "\n", 294 | "\n", 295 | "qa_list_json = load_pqa_as_json('amazon-pqa/amazon_pqa_headsets.json',number_rows=1000)\n", 296 | "\n", 297 | "\n", 298 | "def es_import(question):\n", 299 | " aos_client.index(index='headset_pqa', body=question)\n", 300 | " \n", 301 | "workers = 4 * cpu_count()\n", 302 | " \n", 303 | "process_map(es_import, qa_list_json,chunksize=1000)" 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "id": "78ddf7aa", 309 | "metadata": {}, 310 | "source": [ 311 | "To validate the load, we'll query the number of documents in the index. We should have 1000 hits in the index." 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": null, 317 | "id": "348f45e6", 318 | "metadata": {}, 319 | "outputs": [], 320 | "source": [ 321 | "res = aos_client.search(index=\"headset_pqa\", body={\"query\": {\"match_all\": {}}})\n", 322 | "print(\"Records found: %d \" % res['hits']['total']['value'])" 323 | ] 324 | }, 325 | { 326 | "cell_type": "markdown", 327 | "id": "5564e873", 328 | "metadata": {}, 329 | "source": [ 330 | "### 7. Run a \"Simple Text Search\"\n", 331 | "\n", 332 | "Now that we've loaded our data, let's run a keyword search for the question \"does this work with xbox?\", using the default OpenSearch query, and display the results." 
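, "\n", "By default, a match query ORs the analyzed terms together, so a document only needs one matching term to be returned. As a variation to experiment with afterwards (a hypothetical tweak, not part of the original lab), you can require every term instead:\n", "\n", "```python\n", "# Variation to try: require all analyzed terms (\"and\") instead of any of them.\n", "strict_query = {\n", "    \"size\": 10,\n", "    \"query\": {\n", "        \"match\": {\n", "            \"question_text\": {\n", "                \"query\": \"does this work with xbox?\",\n", "                \"operator\": \"and\"\n", "            }\n", "        }\n", "    }\n", "}\n", "# res = aos_client.search(index=\"headset_pqa\", body=strict_query)\n", "```"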
333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": null, 338 | "id": "98b17174", 339 | "metadata": {}, 340 | "outputs": [], 341 | "source": [ 342 | "import pandas as pd\n", 343 | "query={\n", 344 | " \"size\": 10,\n", 345 | " \"query\": {\n", 346 | " \"match\": {\n", 347 | " \"question_text\": \"does this work with xbox?\"\n", 348 | " }\n", 349 | " }\n", 350 | "}\n", 351 | "res = aos_client.search(index=\"headset_pqa\", body=query)\n", 352 | "query_result=[]\n", 353 | "for hit in res['hits']['hits']:\n", 354 | " row=[hit['_id'],hit['_score'],hit['_source']['question_text'],hit['_source']['answers'][0]['answer_text']]\n", 355 | " query_result.append(row)\n", 356 | "\n", 357 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 358 | "display(query_result_df)" 359 | ] 360 | }, 361 | { 362 | "cell_type": "markdown", 363 | "id": "8182469e", 364 | "metadata": {}, 365 | "source": [ 366 | "### 8. Search across multiple fields\n", 367 | "\n", 368 | "Searching across multiple fields can bring back more results, scored based on BM25 relevance " 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": null, 374 | "id": "3aabf931", 375 | "metadata": {}, 376 | "outputs": [], 377 | "source": [ 378 | "import pandas as pd\n", 379 | "query={\n", 380 | " \"size\": 10,\n", 381 | " \"query\": {\n", 382 | " \"multi_match\": {\n", 383 | " \"query\": \"does this work with xbox?\",\n", 384 | " \"fields\": [\"question_text\",\"bullet_point*\", \"answers.answer_text\", \"item_name\"]\n", 385 | " }\n", 386 | " }\n", 387 | "}\n", 388 | "res = aos_client.search(index=\"headset_pqa\", body=query)\n", 389 | "\n", 390 | "query_result=[]\n", 391 | "for hit in res['hits']['hits']:\n", 392 | " row=[hit['_id'],hit['_score'],hit['_source']['question_text'],hit['_source']['answers'][0]['answer_text']]\n", 393 | " query_result.append(row)\n", 394 | "\n", 395 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 396 | "display(query_result_df)" 397 | ] 398 | }, 399 | { 400 | "cell_type": "markdown", 401 | "id": "90bf288f", 402 | "metadata": {}, 403 | "source": [ 404 | "### 9. Search with Field preference or boosting\n", 405 | "\n", 406 | "When searching across fields, all fields are given the same priority by default. 
But you can control the preference by giving a static boost score to each field" 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": null, 412 | "id": "50e50bd6", 413 | "metadata": {}, 414 | "outputs": [], 415 | "source": [ 416 | "import pandas as pd\n", 417 | "query={\n", 418 | " \"size\": 10,\n", 419 | " \"query\": {\n", 420 | " \"multi_match\": {\n", 421 | " \"query\": \"does this work with xbox?\",\n", 422 | " \"fields\": [\"question_text^2\", \"bullet_point*\", \"answers.answer_text^2\", \"item_name^1.5\"]\n", 423 | " }\n", 424 | " }\n", 425 | "}\n", 426 | "res = aos_client.search(index=\"headset_pqa\", body=query)\n", 427 | "\n", 428 | "query_result=[]\n", 429 | "for hit in res['hits']['hits']:\n", 430 | " row=[hit['_id'],hit['_score'],hit['_source']['question_text'],hit['_source']['answers'][0]['answer_text']]\n", 431 | " query_result.append(row)\n", 432 | "\n", 433 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 434 | "display(query_result_df)" 435 | ] 436 | }, 437 | { 438 | "cell_type": "markdown", 439 | "id": "f72a86ad", 440 | "metadata": {}, 441 | "source": [ 442 | "### 10. Compound queries with `bool`\n", 443 | "\n", 444 | "With `bool` queries, you can give more preference based on other field values or their existence. In the query below, a document will get a higher score if `answer_aggregated` is `neutral`" 445 | ] 446 | }, 447 | { 448 | "cell_type": "code", 449 | "execution_count": null, 450 | "id": "06c982cb", 451 | "metadata": {}, 452 | "outputs": [], 453 | "source": [ 454 | "import pandas as pd\n", 455 | "query={\n", 456 | " \"query\": {\n", 457 | " \"bool\": {\n", 458 | " \"must\": [\n", 459 | " {\n", 460 | " \"multi_match\": {\n", 461 | " \"query\": \"does this work with xbox?\",\n", 462 | " \"fields\": [ \"question_text^2\", \"bullet_point*\", \"answers.answer_text^2\",\"item_name^2\"]\n", 463 | " }\n", 464 | " }\n", 465 | " ],\n", 466 | " \"should\": [\n", 467 | " {\n", 468 | " \"term\": {\n", 469 | " \"answer_aggregated.keyword\": {\n", 470 | " \"value\": \"neutral\"\n", 471 | " }\n", 472 | " }\n", 473 | " }\n", 474 | " ]\n", 475 | " }\n", 476 | " }\n", 477 | "}\n", 478 | "res = aos_client.search(index=\"headset_pqa\", body=query)\n", 479 | "query_result=[]\n", 480 | "for hit in res['hits']['hits']:\n", 481 | " row=[hit['_id'],hit['_score'],hit['_source']['question_text'],hit['_source']['answers'][0]['answer_text']]\n", 482 | " query_result.append(row)\n", 483 | "\n", 484 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 485 | "display(query_result_df)" 486 | ] 487 | }, 488 | { 489 | "cell_type": "markdown", 490 | "id": "8b62e7a7", 491 | "metadata": {}, 492 | "source": [ 493 | "### 11. Use custom scoring with function score queries\n", 494 | "\n", 495 | "Function score queries are handy for overriding the default BM25 scoring. The query below recalculates the score based on how many times the question was answered before." 
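, "\n", "To make the rescoring concrete: with the script `_score * 0.25 * doc['answers.answer_text.keyword'].length` used below, a question whose BM25 `_score` is 8.0 and which has 3 stored answers is rescored to 8.0 * 0.25 * 3 = 6.0, while the same score with only 1 answer becomes 8.0 * 0.25 * 1 = 2.0, so frequently answered questions rise in the ranking."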
496 | ] 497 | }, 498 | { 499 | "cell_type": "code", 500 | "execution_count": null, 501 | "id": "a22cdba9", 502 | "metadata": {}, 503 | "outputs": [], 504 | "source": [ 505 | "import pandas as pd\n", 506 | "query={\n", 507 | " \"query\": {\n", 508 | " \"function_score\": {\n", 509 | " \"query\": {\n", 510 | " \"bool\": {\n", 511 | " \"must\": [\n", 512 | " {\n", 513 | " \"multi_match\": {\n", 514 | " \"query\": \"does this work with xbox?\",\n", 515 | " \"fields\": [\"question_text^5\",\"bullet_point*\",\"answers.answer_text^2\", \"item_name^2\" ]\n", 516 | " }\n", 517 | " }\n", 518 | " ],\n", 519 | " \"should\": [\n", 520 | " {\n", 521 | " \"term\": {\n", 522 | " \"answer_aggregated.keyword\": {\n", 523 | " \"value\": \"neutral\"\n", 524 | " }\n", 525 | " }\n", 526 | " }\n", 527 | " ]\n", 528 | " }\n", 529 | " },\n", 530 | " \"functions\": [\n", 531 | " {\n", 532 | " \"script_score\": {\n", 533 | " \"script\": \"_score * 0.25 * doc['answers.answer_text.keyword'].length\"\n", 534 | " }\n", 535 | " }\n", 536 | " ]\n", 537 | " }\n", 538 | " }\n", 539 | "}\n", 540 | "res = aos_client.search(index=\"headset_pqa\", body=query)\n", 541 | "print(\"Got %d Hits:\" % res['hits']['total']['value'])\n", 542 | "query_result=[]\n", 543 | "for hit in res['hits']['hits']:\n", 544 | " row=[hit['_id'],hit['_score'],hit['_source']['question_text'],hit['_source']['answers'][0]['answer_text']]\n", 545 | " query_result.append(row)\n", 546 | "\n", 547 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 548 | "display(query_result_df)" 549 | ] 550 | }, 551 | { 552 | "cell_type": "markdown", 553 | "id": "f58971cb", 554 | "metadata": {}, 555 | "source": [ 556 | "### 12. Observe The Results and Refine\n", 557 | "\n", 558 | "Congratulations, you've now explored the possibilities of text search on the data in OpenSearch.\n", 559 | "\n", 560 | "If you take a look at the results above, you'll notice that the results match one or more of the key words from our question, most commonly the words \"work\" and \"xbox\". You'll also notice that a lot of these results aren't relevant to our original question, such as \"Does it work on PS3?\" and \"Does it work for computers\". In Module 3, we'll instead use semantic search to make the results more relevant." 561 | ] 562 | }, 563 | { 564 | "cell_type": "markdown", 565 | "id": "c0baab8a", 566 | "metadata": {}, 567 | "source": [ 568 | "### Store Variables Used for the Next Notebook\n", 569 | "\n", 570 | "There are a few values you will need for the next notebook, execute the cells below to store them so they can be copied and pasted into the next part of the exercise." 
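, "\n", "In the next notebook, these values are read back with IPython's `%store -r` magic, for example:\n", "\n", "```python\n", "%store -r outputs\n", "%store -r bucket\n", "%store -r aos_host\n", "```"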
571 | ] 572 | }, 573 | { 574 | "cell_type": "code", 575 | "execution_count": null, 576 | "id": "6ecad485", 577 | "metadata": {}, 578 | "outputs": [], 579 | "source": [ 580 | "%store outputs\n", 581 | "%store bucket\n", 582 | "%store aos_host" 583 | ] 584 | } 585 | ], 586 | "metadata": { 587 | "kernelspec": { 588 | "display_name": "conda_pytorch_p310", 589 | "language": "python", 590 | "name": "conda_pytorch_p310" 591 | }, 592 | "language_info": { 593 | "codemirror_mode": { 594 | "name": "ipython", 595 | "version": 3 596 | }, 597 | "file_extension": ".py", 598 | "mimetype": "text/x-python", 599 | "name": "python", 600 | "nbconvert_exporter": "python", 601 | "pygments_lexer": "ipython3", 602 | "version": "3.10.14" 603 | }, 604 | "vscode": { 605 | "interpreter": { 606 | "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" 607 | } 608 | } 609 | }, 610 | "nbformat": 4, 611 | "nbformat_minor": 5 612 | } 613 | -------------------------------------------------------------------------------- /Module 3 - Semantic Search.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "e87dc259", 6 | "metadata": {}, 7 | "source": [ 8 | "# Semantic Search with Amazon OpenSearch Service " 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "b0cfd51d", 14 | "metadata": {}, 15 | "source": [ 16 | "Now that we've been able to search the data set with a keyword search, let's see how we can use Semantic Search to improve the matches. To do this, we will add a vector representation of the questions to our data set in OpenSearch, then do the same with our sample query \"Does this work with xbox?\". In OpenSearch, we'll use a KNN search to find matches based on a cosine similarity rating on the vector.\n", 17 | "\n", 18 | "![word vector](word2vec.png)\n", 19 | "\n", 20 | "\n", 21 | "We will:\n", 22 | "1. Use a HuggingFace BERT model to generate vectors for the PQA dataset\n", 23 | "2. Upload the dataset to OpenSearch, with the original question and answer text combined with the vector representation of the questions.\n", 24 | "3. Translate the query question to a vector.\n", 25 | "4. Perform a KNN search in OpenSearch to run the semantic search" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "id": "31703e3d", 31 | "metadata": {}, 32 | "source": [ 33 | "### 1. Check PyTorch Version\n" 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "id": "9ac12126", 39 | "metadata": {}, 40 | "source": [ 41 | "As in the previous modules, let's import PyTorch and confirm that we have the latest version of PyTorch. The version should already be 2.2.0 or higher. " 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": null, 47 | "id": "0b532987", 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "import torch\n", 52 | "print(torch.__version__)" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "id": "f2f1cc51", 58 | "metadata": {}, 59 | "source": [ 60 | "### 2. Retrieve notebook variables\n", 61 | "\n", 62 | "The line below will retrieve your shared variables from the previous notebook." 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "id": "18a0e06e", 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "%store -r" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "id": "0aa614bc", 78 | "metadata": {}, 79 | "source": [ 80 | "### 3. 
Import library\n", 81 | "\n" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": null, 87 | "id": "1688f4e9", 88 | "metadata": {}, 89 | "outputs": [], 90 | "source": [ 91 | "import boto3\n", 92 | "import re\n", 93 | "import time\n", 94 | "import sagemaker" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "id": "9510e820", 100 | "metadata": {}, 101 | "source": [ 102 | "### 4. Prepare BERT Model \n", 103 | "\n", 104 | "For this module, we will be using the HuggingFace BERT model to generate vectors, where every sentence becomes a 768-dimensional vector. Let's create some helper functions we'll use later on.\n", 105 | "![BERT](nlp_bert.png)\n", 106 | "\n", 107 | "We are creating 2 functions:\n", 108 | "1. mean_pooling\n", 109 | "2. sentence_to_vector - this is the key function we'll use to generate our vector for the headset PQA dataset." 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": null, 115 | "id": "32964815", 116 | "metadata": {}, 117 | "outputs": [], 118 | "source": [ 119 | "import torch\n", 120 | "from transformers import AutoTokenizer, AutoModel\n", 121 | "from transformers import DistilBertTokenizer, DistilBertModel\n", 122 | "\n", 123 | "#model_name = \"distilbert-base-uncased\"\n", 124 | "#model_name = \"sentence-transformers/msmarco-distilbert-base-dot-prod-v3\"\n", 125 | "model_name = \"sentence-transformers/distilbert-base-nli-stsb-mean-tokens\"\n", 126 | "\n", 127 | "\n", 128 | "#Mean Pooling - Take attention mask into account for correct averaging\n", 129 | "def mean_pooling(model_output, attention_mask):\n", 130 | " token_embeddings = model_output[0] #First element of model_output contains all token embeddings\n", 131 | " input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()\n", 132 | " sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)\n", 133 | " sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)\n", 134 | " return sum_embeddings / sum_mask\n", 135 | "\n", 136 | "\n", 137 | "def sentence_to_vector(raw_inputs):\n", 138 | " tokenizer = DistilBertTokenizer.from_pretrained(model_name)\n", 139 | " model = DistilBertModel.from_pretrained(model_name)\n", 140 | " inputs_tokens = tokenizer(raw_inputs, padding=True, return_tensors=\"pt\")\n", 141 | " \n", 142 | " with torch.no_grad():\n", 143 | " outputs = model(**inputs_tokens)\n", 144 | "\n", 145 | " sentence_embeddings = mean_pooling(outputs, inputs_tokens['attention_mask'])\n", 146 | " return sentence_embeddings\n" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "id": "c6607721", 152 | "metadata": {}, 153 | "source": [ 154 | "### 5. Prepare Headset PQA data\n", 155 | "We have already downloaded the dataset in Module 2, so let's start by ingesting 1000 rows of the data into a Pandas data frame. 
" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "id": "9b1cf47b", 162 | "metadata": {}, 163 | "outputs": [], 164 | "source": [ 165 | "import json\n", 166 | "import pandas as pd\n", 167 | "\n", 168 | "def load_pqa(file_name,number_rows=1000):\n", 169 | " qa_list = []\n", 170 | " df = pd.DataFrame(columns=('question', 'answer'))\n", 171 | " with open(file_name) as f:\n", 172 | " i=0\n", 173 | " for line in f:\n", 174 | " data = json.loads(line)\n", 175 | " df.loc[i] = [data['question_text'],data['answers'][0]['answer_text']]\n", 176 | " i+=1\n", 177 | " if(i == number_rows):\n", 178 | " break\n", 179 | " return df\n", 180 | "\n", 181 | "\n", 182 | "qa_list = load_pqa('amazon-pqa/amazon_pqa_headsets.json',number_rows=1000)\n", 183 | "\n" 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "id": "295df3f6", 189 | "metadata": {}, 190 | "source": [ 191 | "### 6. Convert the text data into vector\n", 192 | "Using the helper function we created earlier, let's convert the questions from the Headset PQA dataset into vectors." 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "id": "11405a76", 199 | "metadata": {}, 200 | "outputs": [], 201 | "source": [ 202 | "vector_sentences = sentence_to_vector(qa_list[\"question\"].tolist())" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "id": "2f54a349", 208 | "metadata": {}, 209 | "source": [ 210 | "### 7. Create an OpenSearch cluster connection.\n", 211 | "Next, we'll use Python API to set up connection with OpenSearch Cluster.\n", 212 | "\n", 213 | "Note: if you're using a region other than us-east-1, please update the region in the code below. " 214 | ] 215 | }, 216 | { 217 | "cell_type": "code", 218 | "execution_count": null, 219 | "id": "405e0e52", 220 | "metadata": {}, 221 | "outputs": [], 222 | "source": [ 223 | "from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth\n", 224 | "import boto3\n", 225 | "import json\n", 226 | "\n", 227 | "kms = boto3.client('secretsmanager')\n", 228 | "aos_credentials = json.loads(kms.get_secret_value(SecretId=outputs['OpenSearchSecret'])['SecretString'])\n", 229 | "\n", 230 | "region = 'us-east-1' \n", 231 | "\n", 232 | "#credentials = boto3.Session().get_credentials()\n", 233 | "#auth = AWSV4SignerAuth(credentials, region)\n", 234 | "auth = (aos_credentials['username'], aos_credentials['password'])\n", 235 | "\n", 236 | "index_name = 'nlp_pqa'\n", 237 | "\n", 238 | "aos_client = OpenSearch(\n", 239 | " hosts = [{'host': aos_host, 'port': 443}],\n", 240 | " http_auth = auth,\n", 241 | " use_ssl = True,\n", 242 | " verify_certs = True,\n", 243 | " connection_class = RequestsHttpConnection\n", 244 | ")" 245 | ] 246 | }, 247 | { 248 | "cell_type": "markdown", 249 | "id": "beaabc1e", 250 | "metadata": {}, 251 | "source": [ 252 | "### 8. Create a index in Amazon Opensearch Service \n", 253 | "Whereas we previously created an index with 2 fields, this time we'll define the index with 3 fields: the first field ' question_vector' holds the vector representation of the question, the second is the \"question\" for raw sentence and the third field is \"answer\" for the raw answer data.\n", 254 | "\n", 255 | "To create the index, we first define the index in JSON, then use the aos_client connection we initiated ealier to create the index in OpenSearch." 
256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": null, 261 | "id": "5eba5754", 262 | "metadata": {}, 263 | "outputs": [], 264 | "source": [ 265 | "knn_index = {\n", 266 | " \"settings\": {\n", 267 | " \"index.knn\": True,\n", 268 | " \"index.knn.space_type\": \"cosinesimil\",\n", 269 | " \"analysis\": {\n", 270 | " \"analyzer\": {\n", 271 | " \"default\": {\n", 272 | " \"type\": \"standard\",\n", 273 | " \"stopwords\": \"_english_\"\n", 274 | " }\n", 275 | " }\n", 276 | " }\n", 277 | " },\n", 278 | " \"mappings\": {\n", 279 | " \"properties\": {\n", 280 | " \"question_vector\": {\n", 281 | " \"type\": \"knn_vector\",\n", 282 | " \"dimension\": 768,\n", 283 | " \"store\": True\n", 284 | " },\n", 285 | " \"question\": {\n", 286 | " \"type\": \"text\",\n", 287 | " \"store\": True\n", 288 | " },\n", 289 | " \"answer\": {\n", 290 | " \"type\": \"text\",\n", 291 | " \"store\": True\n", 292 | " }\n", 293 | " }\n", 294 | " }\n", 295 | "}\n" 296 | ] 297 | }, 298 | { 299 | "cell_type": "markdown", 300 | "id": "1330502a", 301 | "metadata": {}, 302 | "source": [ 303 | "If for any reason you need to recreate your dataset, you can uncomment and execute the following to delete any previously created indexes. If this is the first time you're running this, you can skip this step." 304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "id": "a835b9fb", 310 | "metadata": {}, 311 | "outputs": [], 312 | "source": [ 313 | "#aos_client.indices.delete(index=\"nlp_pqa\")\n" 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "id": "c6de634d", 319 | "metadata": {}, 320 | "source": [ 321 | "Using the above index definition, we now need to create the index in Amazon OpenSearch" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": null, 327 | "id": "715b751d", 328 | "metadata": {}, 329 | "outputs": [], 330 | "source": [ 331 | "aos_client.indices.create(index=\"nlp_pqa\",body=knn_index,ignore=400)\n" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "id": "a7007735", 337 | "metadata": {}, 338 | "source": [ 339 | "Let's verify the created index information" 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": null, 345 | "id": "1f71659d", 346 | "metadata": {}, 347 | "outputs": [], 348 | "source": [ 349 | "aos_client.indices.get(index=\"nlp_pqa\")" 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "id": "0040992c", 355 | "metadata": {}, 356 | "source": [ 357 | "### 9. Load the raw data into the Index\n", 358 | "Next, let's load the enhanced headset PQA data into the index we've just created." 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": null, 364 | "id": "7e55e6a6", 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": [ 368 | "i = 0\n", 369 | "for c in qa_list[\"question\"].tolist():\n", 370 | " content=c\n", 371 | " vector=vector_sentences[i].tolist()\n", 372 | " answer=qa_list[\"answer\"][i]\n", 373 | " i+=1\n", 374 | " aos_client.index(index='nlp_pqa',body={\"question_vector\": vector, \"question\": content,\"answer\":answer})" 375 | ] 376 | }, 377 | { 378 | "cell_type": "markdown", 379 | "id": "67fad674", 380 | "metadata": {}, 381 | "source": [ 382 | "To validate the load, we'll query the number of documents in the index. We should have 1000 hits in the index." 
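, "\n", "As a lighter-weight alternative to a match_all search, the count API returns just the document total:\n", "\n", "```python\n", "# Count documents without retrieving any hits.\n", "print(aos_client.count(index=\"nlp_pqa\")[\"count\"])  # expect 1000\n", "```"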
383 | ] 384 | }, 385 | { 386 | "cell_type": "code", 387 | "execution_count": null, 388 | "id": "05ed0b71", 389 | "metadata": {}, 390 | "outputs": [], 391 | "source": [ 392 | "res = aos_client.search(index=\"nlp_pqa\", body={\"query\": {\"match_all\": {}}})\n", 393 | "print(\"Records found: %d.\" % res['hits']['total']['value'])" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "id": "93888b33", 399 | "metadata": {}, 400 | "source": [ 401 | "### 10. Generate vector for user input query \n", 402 | "\n", 403 | "Next, we'll use the same helper function to translate our input question \"does this work with xbox?\" into a vector. " 404 | ] 405 | }, 406 | { 407 | "cell_type": "code", 408 | "execution_count": null, 409 | "id": "ec91279d", 410 | "metadata": {}, 411 | "outputs": [], 412 | "source": [ 413 | "query_raw_sentences = ['does this work with xbox?']\n", 414 | "search_vector = sentence_to_vector(query_raw_sentences)[0].tolist()\n", 415 | "search_vector" 416 | ] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "id": "de9b827c", 421 | "metadata": {}, 422 | "source": [ 423 | "### 11. Search vector with \"Semantic Search\" \n", 424 | "\n", 425 | "Now that we have vectors in OpenSearch and a vector for our query question, let's perform a KNN search in OpenSearch.\n" 426 | ] 427 | }, 428 | { 429 | "cell_type": "code", 430 | "execution_count": null, 431 | "id": "5c5f4e81", 432 | "metadata": {}, 433 | "outputs": [], 434 | "source": [ 435 | "\n", 436 | "query={\n", 437 | " \"size\": 30,\n", 438 | " \"query\": {\n", 439 | " \"knn\": {\n", 440 | " \"question_vector\":{\n", 441 | " \"vector\":search_vector,\n", 442 | " \"k\":30\n", 443 | " }\n", 444 | " }\n", 445 | " }\n", 446 | "}\n", 447 | "\n", 448 | "res = aos_client.search(index=\"nlp_pqa\", \n", 449 | " body=query,\n", 450 | " stored_fields=[\"question\",\"answer\"])\n", 451 | "#print(\"Got %d Hits:\" % res['hits']['total']['value'])\n", 452 | "query_result=[]\n", 453 | "for hit in res['hits']['hits']:\n", 454 | " row=[hit['_id'],hit['_score'],hit['fields']['question'][0],hit['fields']['answer'][0]]\n", 455 | " query_result.append(row)\n", 456 | "\n", 457 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 458 | "display(query_result_df)" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "id": "0abddaa4", 464 | "metadata": {}, 465 | "source": [ 466 | "### 12. Search the same query with \"Text Search\"\n", 467 | "\n", 468 | "Let's repeat the same query with a keyword search and compare the differences." 
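, "\n", "As you compare the two result sets, keep in mind that the scores are on different scales: BM25 scores are unbounded, while for the cosinesimil space OpenSearch maps the k-NN distance into a bounded relevance score (roughly 1 / (1 + (1 - cosine similarity)), so scores near 1.0 mean near-identical vectors -- check the k-NN scoring documentation for your OpenSearch version)."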
469 | ] 470 | }, 471 | { 472 | "cell_type": "code", 473 | "execution_count": null, 474 | "id": "8c652c52", 475 | "metadata": {}, 476 | "outputs": [], 477 | "source": [ 478 | "query={\n", 479 | " \"size\": 30,\n", 480 | " \"query\": {\n", 481 | " \"match\": {\n", 482 | " \"question\":\"does this work with xbox?\"\n", 483 | " }\n", 484 | " }\n", 485 | "}\n", 486 | "\n", 487 | "res = aos_client.search(index=\"nlp_pqa\", \n", 488 | " body=query,\n", 489 | " stored_fields=[\"question\",\"answer\"])\n", 490 | "#print(\"Got %d Hits:\" % res['hits']['total']['value'])\n", 491 | "query_result=[]\n", 492 | "for hit in res['hits']['hits']:\n", 493 | " row=[hit['_id'],hit['_score'],hit['fields']['question'][0],hit['fields']['answer'][0]]\n", 494 | " query_result.append(row)\n", 495 | "\n", 496 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 497 | "display(query_result_df)" 498 | ] 499 | }, 500 | { 501 | "cell_type": "markdown", 502 | "id": "fb777d3d", 503 | "metadata": {}, 504 | "source": [ 505 | "### 13. Observe The Results\n", 506 | "\n", 507 | "Compare the first few records in the two searches above. For the Semantic search, the first 10 or so results are very similar to our input question, as we expect. Compare this to keyword search, where the results quickly start to deviate from our search query (e.g. \"it shows xbox 360. Does it work for ps3 as well?\" - this matches on keywords but has a different meaning)." 508 | ] 509 | }, 510 | { 511 | "cell_type": "markdown", 512 | "id": "607e909b", 513 | "metadata": {}, 514 | "source": [ 515 | "### 14. Store Variables Used for the Next Notebook\n", 516 | "\n", 517 | "There are a few values you will need for the next notebook, execute the cells below to store them so they can be copied and pasted into the next part of the exercise." 518 | ] 519 | }, 520 | { 521 | "cell_type": "code", 522 | "execution_count": null, 523 | "id": "8db65388", 524 | "metadata": {}, 525 | "outputs": [], 526 | "source": [ 527 | "%store qa_list" 528 | ] 529 | } 530 | ], 531 | "metadata": { 532 | "kernelspec": { 533 | "display_name": "conda_pytorch_p310", 534 | "language": "python", 535 | "name": "conda_pytorch_p310" 536 | }, 537 | "language_info": { 538 | "codemirror_mode": { 539 | "name": "ipython", 540 | "version": 3 541 | }, 542 | "file_extension": ".py", 543 | "mimetype": "text/x-python", 544 | "name": "python", 545 | "nbconvert_exporter": "python", 546 | "pygments_lexer": "ipython3", 547 | "version": "3.10.14" 548 | } 549 | }, 550 | "nbformat": 4, 551 | "nbformat_minor": 5 552 | } 553 | -------------------------------------------------------------------------------- /Module 6 - Lab1 - Sematic Search with Neural Search Local Model.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "e87dc259", 6 | "metadata": {}, 7 | "source": [ 8 | "# Semantic Search with OpenSearch Neural Search " 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "b0cfd51d", 14 | "metadata": {}, 15 | "source": [ 16 | "We will use the Neural Search plugin in OpenSearch to implement semantic search" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "31703e3d", 22 | "metadata": {}, 23 | "source": [ 24 | "### 1. 
Check PyTorch Version\n" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "id": "9ac12126", 30 | "metadata": {}, 31 | "source": [ 32 | "As in the previous modules, let's import PyTorch and confirm that we have the latest version of PyTorch. " 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": null, 38 | "id": "0b532987", 39 | "metadata": {}, 40 | "outputs": [], 41 | "source": [ 42 | "import torch\n", 43 | "print(torch.__version__)" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "id": "f2f1cc51", 49 | "metadata": {}, 50 | "source": [ 51 | "### 2. Retrieve notebook variables\n", 52 | "\n", 53 | "The line below will retrieve your shared variables from the previous notebook." 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": null, 59 | "id": "18a0e06e", 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "%store -r" 64 | ] 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "id": "4a3fa4b0", 69 | "metadata": {}, 70 | "source": [ 71 | "### 3. Install OpenSearch ML Python library" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": null, 77 | "id": "58a1c491", 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "!pip install opensearch-py-ml\n", 82 | "!pip install accelerate\n", 83 | "!pip install deprecated" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "id": "05c00375", 89 | "metadata": {}, 90 | "source": [ 91 | "Now we need to restart the kernel by running the cell below." 92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": null, 97 | "id": "f94df946", 98 | "metadata": {}, 99 | "outputs": [], 100 | "source": [ 101 | "from IPython.display import display_html\n", 102 | "def restartkernel() :\n", 103 | " display_html(\"\",raw=True)\n", 104 | "restartkernel()" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "id": "0aa614bc", 110 | "metadata": {}, 111 | "source": [ 112 | "### 4. Import libraries\n", 113 | "\n" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "id": "1688f4e9", 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "import boto3\n", 124 | "import re\n", 125 | "import time" 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "id": "c6607721", 131 | "metadata": {}, 132 | "source": [ 133 | "### 5. Prepare Headset PQA data\n", 134 | "We have already downloaded the dataset in Module 2, so let's start by ingesting 1000 rows of the data into a Pandas data frame.
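Each line of the PQA file is a standalone JSON record. The sketch below is a hedged illustration; the sample values are hypothetical, but the field names match the `load_pqa` helper that follows, and it shows the shape of one record and the two fields we keep.

```python
import json

# Hypothetical sample line, shaped like the records in amazon_pqa_headsets.json.
sample = ('{"question_text": "Does this headset work with a PS4?", '
          '"answers": [{"answer_text": "Yes, it plugs into the controller."}]}')

record = json.loads(sample)
question = record['question_text']
answer = record['answers'][0]['answer_text']  # only the first answer is kept
print(question, '->', answer)
```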
\n", 135 | "\n", 136 | "Before we can run any queries, we need to download the Amazon Product Question and Answer data from : https://registry.opendata.aws/amazon-pqa/" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": null, 142 | "id": "58fca957", 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "!aws s3 cp --no-sign-request s3://amazon-pqa/amazon_pqa_headsets.json ./amazon-pqa/amazon_pqa_headsets.json" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "id": "9b1cf47b", 153 | "metadata": {}, 154 | "outputs": [], 155 | "source": [ 156 | "import json\n", 157 | "import pandas as pd\n", 158 | "\n", 159 | "def load_pqa(file_name,number_rows=1000):\n", 160 | " qa_list = []\n", 161 | " df = pd.DataFrame(columns=('question', 'answer'))\n", 162 | " with open(file_name) as f:\n", 163 | " i=0\n", 164 | " for line in f:\n", 165 | " data = json.loads(line)\n", 166 | " df.loc[i] = [data['question_text'],data['answers'][0]['answer_text']]\n", 167 | " i+=1\n", 168 | " if(i == number_rows):\n", 169 | " break\n", 170 | " return df\n", 171 | "\n", 172 | "\n", 173 | "qa_list = load_pqa('amazon-pqa/amazon_pqa_headsets.json',number_rows=1000)\n", 174 | "\n" 175 | ] 176 | }, 177 | { 178 | "cell_type": "markdown", 179 | "id": "2f54a349", 180 | "metadata": {}, 181 | "source": [ 182 | "### 6. Create an OpenSearch cluster connection.\n", 183 | "Next, we'll use Python API to set up connection with OpenSearch Cluster.\n", 184 | "\n", 185 | "Note: if you're using a region other than us-east-1, please update the region in the code below.\n", 186 | "\n", 187 | "#### Get Cloud Formation stack output variables\n", 188 | "\n", 189 | "We also need to grab some key values from the infrastructure we provisioned using CloudFormation. To do this, we will list the outputs from the stack and store this in \"outputs\" to be used later.\n", 190 | "\n", 191 | "You can ignore any \"PythonDeprecationWarning\" warnings." 
192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "id": "81dc45a2", 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "import boto3\n", 202 | "\n", 203 | "cfn = boto3.client('cloudformation')\n", 204 | "\n", 205 | "def get_cfn_outputs(stackname):\n", 206 | " outputs = {}\n", 207 | " for output in cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']:\n", 208 | " outputs[output['OutputKey']] = output['OutputValue']\n", 209 | " return outputs\n", 210 | "\n", 211 | "## Setup variables to use for the rest of the demo\n", 212 | "cloudformation_stack_name = \"semantic-search\"\n", 213 | "\n", 214 | "outputs = get_cfn_outputs(cloudformation_stack_name)\n", 215 | "\n", 216 | "bucket = outputs['s3BucketTraining']\n", 217 | "aos_host = outputs['OpenSearchDomainEndpoint']\n", 218 | "\n", 219 | "outputs" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "id": "0cee3dd0", 226 | "metadata": {}, 227 | "outputs": [], 228 | "source": [ 229 | "from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth\n", 230 | "import boto3\n", 231 | "import json\n", 232 | "\n", 233 | "kms = boto3.client('secretsmanager')\n", 234 | "aos_credentials = json.loads(kms.get_secret_value(SecretId=outputs['OpenSearchSecret'])['SecretString'])\n", 235 | "\n", 236 | "region = 'us-east-1' \n", 237 | "\n", 238 | "#credentials = boto3.Session().get_credentials()\n", 239 | "#auth = AWSV4SignerAuth(credentials, region)\n", 240 | "auth = (aos_credentials['username'], aos_credentials['password'])\n", 241 | "\n", 242 | "index_name = 'nlp_pqa'\n", 243 | "\n", 244 | "aos_client = OpenSearch(\n", 245 | " hosts = [{'host': aos_host, 'port': 443}],\n", 246 | " http_auth = auth,\n", 247 | " use_ssl = True,\n", 248 | " verify_certs = True,\n", 249 | " connection_class = RequestsHttpConnection\n", 250 | ")" 251 | ] 252 | }, 253 | { 254 | "cell_type": "markdown", 255 | "id": "0173ff23", 256 | "metadata": {}, 257 | "source": [ 258 | "### 7. Configure the OpenSearch domain to enable running Machine Learning code on data nodes" 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "execution_count": null, 264 | "id": "9e4080d5", 265 | "metadata": {}, 266 | "outputs": [], 267 | "source": [ 268 | "s = b'{\"transient\":{\"plugins.ml_commons.only_run_on_ml_node\": false}}'\n", 269 | "aos_client.cluster.put_settings(body=s)" 270 | ] 271 | }, 272 | { 273 | "cell_type": "markdown", 274 | "id": "e53a9cd9", 275 | "metadata": {}, 276 | "source": [ 277 | "Verify that `plugins.ml_commons.only_run_on_ml_node` is set to `false`." 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "id": "cce6a646", 284 | "metadata": {}, 285 | "outputs": [], 286 | "source": [ 287 | "aos_client.cluster.get_settings(flat_settings=True)" 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "id": "62421acd", 293 | "metadata": {}, 294 | "source": [ 295 | "### 8. 
Register the pre-trained model to the OpenSearch domain" 296 | ] 297 | }, 298 | { 299 | "cell_type": "code", 300 | "execution_count": null, 301 | "id": "bda0e264", 302 | "metadata": {}, 303 | "outputs": [], 304 | "source": [ 305 | "from opensearch_py_ml.ml_models import SentenceTransformerModel\n", 306 | "from opensearch_py_ml.ml_commons import MLCommonClient\n", 307 | "\n", 308 | "ml_client = MLCommonClient(aos_client)\n", 309 | "model_id = ml_client.register_pretrained_model(model_name = \"huggingface/sentence-transformers/all-MiniLM-L12-v2\", model_version = \"1.0.1\", model_format = \"TORCH_SCRIPT\", deploy_model=False, wait_until_deployed=False)\n", 310 | "print(model_id)" 311 | ] 312 | }, 313 | { 314 | "cell_type": "markdown", 315 | "id": "5dc9d104", 316 | "metadata": {}, 317 | "source": [ 318 | "### 9. Load the model for inference." 319 | ] 320 | }, 321 | { 322 | "cell_type": "code", 323 | "execution_count": null, 324 | "id": "52cba357", 325 | "metadata": {}, 326 | "outputs": [], 327 | "source": [ 328 | "load_model_output = ml_client.deploy_model(model_id)\n", 329 | "\n", 330 | "print(load_model_output)" 331 | ] 332 | }, 333 | { 334 | "cell_type": "markdown", 335 | "id": "1c54be9c", 336 | "metadata": {}, 337 | "source": [ 338 | "### 10. Get the model's detailed information." 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "execution_count": null, 344 | "id": "1211a76d", 345 | "metadata": {}, 346 | "outputs": [], 347 | "source": [ 348 | "model_info = ml_client.get_model_info(model_id)\n", 349 | "\n", 350 | "print(model_info)" 351 | ] 352 | }, 353 | { 354 | "cell_type": "markdown", 355 | "id": "3625b5cf", 356 | "metadata": {}, 357 | "source": [ 358 | "### 11. Create a pipeline to convert text into vectors with the BERT model\n", 359 | "We will use the just-uploaded model to convert the `question` field into a vector (embedding) stored in the `question_vector` field." 360 | ] 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": null, 365 | "id": "dc810643", 366 | "metadata": {}, 367 | "outputs": [], 368 | "source": [ 369 | "pipeline={\n", 370 | " \"description\": \"An example neural search pipeline\",\n", 371 | " \"processors\" : [\n", 372 | " {\n", 373 | " \"text_embedding\": {\n", 374 | " \"model_id\": model_id,\n", 375 | " \"field_map\": {\n", 376 | " \"question\": \"question_vector\"\n", 377 | " }\n", 378 | " }\n", 379 | " }\n", 380 | " ]\n", 381 | "}\n", 382 | "pipeline_id = 'nlp_pipeline'\n", 383 | "aos_client.ingest.put_pipeline(id=pipeline_id,body=pipeline)" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "id": "c431a804", 389 | "metadata": {}, 390 | "source": [ 391 | "Verify the pipeline was created successfully." 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "execution_count": null, 397 | "id": "13ff2f91", 398 | "metadata": {}, 399 | "outputs": [], 400 | "source": [ 401 | "aos_client.ingest.get_pipeline(id=pipeline_id)" 402 | ] 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "id": "beaabc1e", 407 | "metadata": {}, 408 | "source": [ 409 | "### 12. 
Create an index in Amazon OpenSearch Service \n", 410 | "Whereas we previously created an index with 2 fields, this time we'll define the index with 3 fields: the first field `question_vector` holds the vector representation of the question, the second field `question` holds the raw question text, and the third field `answer` holds the raw answer text.\n", 411 | "\n", 412 | "To create the index, we first define the index in JSON, then use the aos_client connection we initiated earlier to create the index in OpenSearch." 413 | ] 414 | }, 415 | { 416 | "cell_type": "code", 417 | "execution_count": null, 418 | "id": "5eba5754", 419 | "metadata": {}, 420 | "outputs": [], 421 | "source": [ 422 | "knn_index = {\n", 423 | " \"settings\": {\n", 424 | " \"index.knn\": True,\n", 425 | " \"index.knn.space_type\": \"cosinesimil\",\n", 426 | " \"default_pipeline\": pipeline_id,\n", 427 | " \"analysis\": {\n", 428 | " \"analyzer\": {\n", 429 | " \"default\": {\n", 430 | " \"type\": \"standard\",\n", 431 | " \"stopwords\": \"_english_\"\n", 432 | " }\n", 433 | " }\n", 434 | " }\n", 435 | " },\n", 436 | " \"mappings\": {\n", 437 | " \"properties\": {\n", 438 | " \"question_vector\": {\n", 439 | " \"type\": \"knn_vector\",\n", 440 | " \"dimension\": 384,\n", 441 | " \"method\": {\n", 442 | " \"name\": \"hnsw\",\n", 443 | " \"space_type\": \"l2\",\n", 444 | " \"engine\": \"faiss\"\n", 445 | " },\n", 446 | " \"store\": True\n", 447 | " },\n", 448 | " \"question\": {\n", 449 | " \"type\": \"text\",\n", 450 | " \"store\": True\n", 451 | " },\n", 452 | " \"answer\": {\n", 453 | " \"type\": \"text\",\n", 454 | " \"store\": True\n", 455 | " }\n", 456 | " }\n", 457 | " }\n", 458 | "}\n" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "id": "1330502a", 464 | "metadata": {}, 465 | "source": [ 466 | "If for any reason you need to recreate your dataset, you can uncomment and execute the following to delete any previously created indexes. If this is the first time you're running this, you can skip this step." 467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": null, 472 | "id": "a835b9fb", 473 | "metadata": {}, 474 | "outputs": [], 475 | "source": [ 476 | "aos_client.indices.delete(index=\"nlp_pqa\")\n" 477 | ] 478 | }, 479 | { 480 | "cell_type": "markdown", 481 | "id": "c6de634d", 482 | "metadata": {}, 483 | "source": [ 484 | "Using the above index definition, we now need to create the index in Amazon OpenSearch." 485 | ] 486 | }, 487 | { 488 | "cell_type": "code", 489 | "execution_count": null, 490 | "id": "715b751d", 491 | "metadata": {}, 492 | "outputs": [], 493 | "source": [ 494 | "aos_client.indices.create(index=\"nlp_pqa\",body=knn_index,ignore=400)\n" 495 | ] 496 | }, 497 | { 498 | "cell_type": "markdown", 499 | "id": "a7007735", 500 | "metadata": {}, 501 | "source": [ 502 | "Let's verify the created index information." 503 | ] 504 | }, 505 | { 506 | "cell_type": "code", 507 | "execution_count": null, 508 | "id": "1f71659d", 509 | "metadata": {}, 510 | "outputs": [], 511 | "source": [ 512 | "aos_client.indices.get(index=\"nlp_pqa\")" 513 | ] 514 | }, 515 | { 516 | "cell_type": "markdown", 517 | "id": "0040992c", 518 | "metadata": {}, 519 | "source": [ 520 | "### 13. Load the raw data into the Index\n", 521 | "Next, let's load the headset PQA data into the index we've just created. During ingestion, the `question` field will also be converted to a vector (embedding) by the `nlp_pipeline` we defined."
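The next cell indexes documents one at a time, which is easy to follow but slow for larger datasets. As an alternative, here is a hedged sketch using opensearch-py's `helpers.bulk`; because `nlp_pipeline` is set as the index's `default_pipeline`, the embeddings are still generated at ingest time.

```python
from opensearchpy import helpers

def pqa_actions(df, index_name="nlp_pqa"):
    # One bulk action per question/answer pair; the index's default_pipeline
    # computes question_vector for each document during ingestion.
    for _, row in df.iterrows():
        yield {
            "_index": index_name,
            "_source": {"question": row["question"], "answer": row["answer"]},
        }

# Usage sketch: helpers.bulk(aos_client, pqa_actions(qa_list))
```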
522 | ] 523 | }, 524 | { 525 | "cell_type": "code", 526 | "execution_count": null, 527 | "id": "7e55e6a6", 528 | "metadata": {}, 529 | "outputs": [], 530 | "source": [ 531 | "i = 0\n", 532 | "for c in qa_list[\"question\"].tolist():\n", 533 | " content=c\n", 534 | " answer=qa_list[\"answer\"][i]\n", 535 | " i+=1\n", 536 | " aos_client.index(index='nlp_pqa',body={\"question\": content,\"answer\":answer})" 537 | ] 538 | }, 539 | { 540 | "cell_type": "markdown", 541 | "id": "67fad674", 542 | "metadata": {}, 543 | "source": [ 544 | "To validate the load, we'll query the number of documents in the index. We should have 1000 hits in the index." 545 | ] 546 | }, 547 | { 548 | "cell_type": "code", 549 | "execution_count": null, 550 | "id": "05ed0b71", 551 | "metadata": {}, 552 | "outputs": [], 553 | "source": [ 554 | "res = aos_client.search(index=\"nlp_pqa\", body={\"query\": {\"match_all\": {}}})\n", 555 | "print(\"Records found: %d.\" % res['hits']['total']['value'])\n" 556 | ] 557 | }, 558 | { 559 | "cell_type": "markdown", 560 | "id": "de9b827c", 561 | "metadata": {}, 562 | "source": [ 563 | "### 14. Search vector with \"Semantic Search\" \n", 564 | "\n", 565 | "We can now search the data with neural search.\n" 566 | ] 567 | }, 568 | { 569 | "cell_type": "code", 570 | "execution_count": null, 571 | "id": "5c5f4e81", 572 | "metadata": {}, 573 | "outputs": [], 574 | "source": [ 575 | "query={\n", 576 | " \"_source\": {\n", 577 | " \"exclude\": [ \"question_vector\" ]\n", 578 | " },\n", 579 | " \"size\": 30,\n", 580 | " \"query\": {\n", 581 | " \"neural\": {\n", 582 | " \"question_vector\": {\n", 583 | " \"query_text\": \"does this work with xbox?\",\n", 584 | " \"model_id\": model_id,\n", 585 | " \"k\": 30\n", 586 | " }\n", 587 | " }\n", 588 | " }\n", 589 | "}\n", 590 | "\n", 591 | "res = aos_client.search(index=\"nlp_pqa\", \n", 592 | " body=query,\n", 593 | " stored_fields=[\"question\",\"answer\"])\n", 594 | "print(\"Got %d Hits:\" % res['hits']['total']['value'])\n", 595 | "query_result=[]\n", 596 | "for hit in res['hits']['hits']:\n", 597 | " row=[hit['_id'],hit['_score'],hit['_source']['question'],hit['_source']['answer']]\n", 598 | " query_result.append(row)\n", 599 | "\n", 600 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 601 | "display(query_result_df)" 602 | ] 603 | }, 604 | { 605 | "cell_type": "markdown", 606 | "id": "0abddaa4", 607 | "metadata": {}, 608 | "source": [ 609 | "### 15. Search the same query with \"Text Search\"\n", 610 | "\n", 611 | "Let's repeat the same query with a keyword search and compare the differences."
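Both the neural-search cell above and the keyword-search cell below write their output to a DataFrame named `query_result_df`, so save a copy of the semantic results first (e.g. `semantic_df = query_result_df.copy()`). Here is a hedged sketch (the `keyword_result_df` name is hypothetical) for lining the two result sets up rank by rank with pandas:

```python
import pandas as pd

def side_by_side(semantic_df, keyword_df, n=10):
    # Align the top-n questions from each search by rank for a quick visual diff.
    n = min(n, len(semantic_df), len(keyword_df))
    return pd.DataFrame({
        "rank": range(1, n + 1),
        "semantic_question": semantic_df["question"].head(n).tolist(),
        "keyword_question": keyword_df["question"].head(n).tolist(),
    })

# Usage sketch, after the keyword cell below has produced its DataFrame:
# display(side_by_side(semantic_df, keyword_result_df))
```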
612 | ] 613 | }, 614 | { 615 | "cell_type": "code", 616 | "execution_count": null, 617 | "id": "8c652c52", 618 | "metadata": {}, 619 | "outputs": [], 620 | "source": [ 621 | "query={\n", 622 | " \"size\": 30,\n", 623 | " \"query\": {\n", 624 | " \"match\": {\n", 625 | " \"question\":\"does this work with xbox?\"\n", 626 | " }\n", 627 | " }\n", 628 | "}\n", 629 | "\n", 630 | "res = aos_client.search(index=\"nlp_pqa\", \n", 631 | " body=query,\n", 632 | " stored_fields=[\"question\",\"answer\"])\n", 633 | "#print(\"Got %d Hits:\" % res['hits']['total']['value'])\n", 634 | "query_result=[]\n", 635 | "for hit in res['hits']['hits']:\n", 636 | " row=[hit['_id'],hit['_score'],hit['fields']['question'][0],hit['fields']['answer'][0]]\n", 637 | " query_result.append(row)\n", 638 | "\n", 639 | "query_result_df = pd.DataFrame(data=query_result,columns=[\"_id\",\"_score\",\"question\",\"answer\"])\n", 640 | "display(query_result_df)" 641 | ] 642 | }, 643 | { 644 | "cell_type": "markdown", 645 | "id": "fb777d3d", 646 | "metadata": {}, 647 | "source": [ 648 | "### 16. Observe The Results\n", 649 | "\n", 650 | "Compare the first few records in the two searches above. For the semantic search, the first 10 or so results are very similar to our input question, as we expect. Compare this to the keyword search, where the results quickly start to deviate from our search query (e.g. \"it shows xbox 360. Does it work for ps3 as well?\" - this matches on keywords but has a different meaning).\n", 651 | "\n", 652 | "You can also use \"Compare search results\" in the Search Relevance plugin to compare search relevance side by side. Please refer to the lab \"Option 2: OpenSearch Dashboard Dev Tools\" to compare search results." 653 | ] 654 | }, 655 | { 656 | "cell_type": "markdown", 657 | "id": "9c8d80ff", 658 | "metadata": {}, 659 | "source": [ 660 | "### 17. Summary\n", 661 | "With the OpenSearch Neural Search plugin, embeddings are automatically generated with the model we uploaded. We no longer need to manage a separate inference pipeline, which makes the semantic search solution simpler to develop and maintain. " 662 | ] 663 | } 664 | ], 665 | "metadata": { 666 | "kernelspec": { 667 | "display_name": "conda_pytorch_p310", 668 | "language": "python", 669 | "name": "conda_pytorch_p310" 670 | }, 671 | "language_info": { 672 | "codemirror_mode": { 673 | "name": "ipython", 674 | "version": 3 675 | }, 676 | "file_extension": ".py", 677 | "mimetype": "text/x-python", 678 | "name": "python", 679 | "nbconvert_exporter": "python", 680 | "pygments_lexer": "ipython3", 681 | "version": "3.10.14" 682 | } 683 | }, 684 | "nbformat": 4, 685 | "nbformat_minor": 5 686 | } 687 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Improve search relevance with machine learning in Amazon OpenSearch Service 2 | 3 | This repository guides users through creating a semantic search solution using Amazon SageMaker and Amazon OpenSearch Service. 4 | 5 | 6 | ## How does it work? 7 | 8 | This code repository is for [Semantic and Vector Search with Amazon OpenSearch Service Workshop](https://catalog.workshops.aws/semantic-search/en-US). For more information about semantic search, please refer to the workshop content. 
9 | 10 | ### Semantic Search Architecture 11 | ![semantic_search_fullstack](./semantic_search_fullstack.jpg) 12 | 13 | ### Retrieval Augmented Generation Architecture 14 | ![rag](./rag.png) 15 | 16 | ### Conversational Search Architecture 17 | ![converstational-search](./converstational-search.png) 18 | 19 | 20 | ### CloudFormation Deployment 21 | 22 | 1. The workshop can only be deployed in the us-east-1 region 23 | 2. Use the CloudFormation template `cfn/semantic-search.yaml` to create the CF stack 24 | 3. The CloudFormation stack name must be `semantic-search`, as we use this stack name in our lab 25 | 4. You can click the following link to deploy the CloudFormation stack 26 | 27 | | Region | Launch Template | 28 | | --------------------------- | ----------------------- | 29 | | **US East (N. Virginia)** | [![Deploy to AWS](deploy-to-aws.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/quickcreate?templateUrl=https://ws-assets-prod-iad-r-iad-ed304a55c2ca1aee.s3.us-east-1.amazonaws.com/df655552-1e61-4a6b-9dc4-c03eb94c6f75/semantic-search.yaml&stackName=semantic-search) | 30 | 31 | 32 | ### Lab Instruction 33 | There are 8 modules in this workshop: 34 | * **Module 1 - Search basics**: You will learn the fundamentals of text search and semantic search. This section also introduces the differences between a best-matching algorithm, popularly known as BM25 similarity, and semantic similarity. 35 | 36 | * **Module 2 - Text search**: You will learn text search with Amazon OpenSearch Service. In information retrieval, this type of searching is traditionally called 'keyword' search. 37 | 38 | * **Module 3 - Semantic search**: You will learn semantic search with Amazon OpenSearch Service and Amazon SageMaker. You will use a machine learning technique called Bidirectional Encoder Representations from Transformers, popularly known as BERT. BERT is a pre-trained natural language processing (NLP) model that represents text in the form of numbers or, in other words, vectors. You will learn to use vectors with the k-NN feature in Amazon OpenSearch Service. 39 | 40 | * **Module 4 - Fullstack semantic search**: You will bring together all the concepts learnt earlier with a user interface that shows the advantages of using semantic search with text search. You will be using Amazon OpenSearch Service, Amazon SageMaker, AWS Lambda, Amazon API Gateway and Amazon S3 for this purpose. 41 | 42 | * **Module 5 - Fine tuning semantic search**: Large language models like BERT show better results when they are trained in-domain, which means fine tuning the general model to fit one's particular business requirements in the domain of its application. You will learn how to fine tune the model for semantic search with the chosen data set. 43 | 44 | * **Module 6 - Neural Search**: Implement semantic search with the [OpenSearch Neural Search Plugin](https://opensearch.org/docs/latest/search-plugins/neural-search/). 45 | 46 | * **Module 7 - Retrieval Augmented Generation**: Use semantic search results as context, and combine the user input and context as a prompt for large language models to generate factual content for knowledge-intensive applications. 47 | 48 | * **Module 8 - Conversational Search**: Search with conversation history as context while leveraging RAG. 49 | 50 | 51 | Please refer to the [Semantic Search Workshop](https://catalog.workshops.aws/semantic-search/en-US) for lab instructions. 52 | 53 | ### Note 54 | In this workshop, we use the OpenSearch internal user database to store the username and password to simplify the lab. 
However, in a production environment, you should design your security solution per your requirements. For more information, please refer to [Fine-grained access control](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/fgac.html) and [Identity and Access Management](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/ac.html). 55 | 56 | ## Feedback 57 | 58 | If you have any questions or feedback, please reach us by sending an email to [semantic-search@amazon.com](mailto:semantic-search@amazon.com). 59 | 60 | ## License 61 | 62 | This library is licensed under the MIT-0 License. See the LICENSE file. 63 | 64 | -------------------------------------------------------------------------------- /backend/lambda/app.py: -------------------------------------------------------------------------------- 1 | import json 2 | from os import environ 3 | 4 | import boto3 5 | from urllib.parse import urlparse 6 | 7 | from elasticsearch import Elasticsearch, RequestsHttpConnection 8 | from requests_aws4auth import AWS4Auth 9 | 10 | # Global variables that are reused 11 | sm_runtime_client = boto3.client('sagemaker-runtime') 12 | s3_client = boto3.client('s3') 13 | 14 | 15 | def get_features(sm_runtime_client, sagemaker_endpoint, payload): 16 | response = sm_runtime_client.invoke_endpoint( 17 | EndpointName=sagemaker_endpoint, 18 | ContentType='text/plain', 19 | Body=payload) 20 | response_body = json.loads((response['Body'].read())) 21 | features = response_body 22 | 23 | return features 24 | 25 | 26 | def get_neighbors(features, es, k_neighbors=30): 27 | idx_name = 'nlp_pqa' 28 | res = es.search( 29 | request_timeout=30, index=idx_name, 30 | body={ 31 | 'size': k_neighbors, 32 | 'query': {'knn': {'question_vector': {'vector': features, 'k': k_neighbors}}}}, 33 | stored_fields=["question","answer"] 34 | ) 35 | results = [{'question':res['hits']['hits'][x]['fields']['question'][0], 36 | 'answer':res['hits']['hits'][x]['fields']['answer'][0]} for x in range(k_neighbors)] 37 | return results 38 | 39 | 40 | def es_match_query(payload, es, k=30): 41 | idx_name = 'nlp_pqa' 42 | search_body = { 43 | "size": 30, 44 | "_source": { 45 | "excludes": ["question_vector"] 46 | }, 47 | "highlight": { 48 | "fields": { 49 | "question": {} 50 | } 51 | }, 52 | "query": { 53 | "match": { 54 | "question": { 55 | "query": payload 56 | } 57 | } 58 | } 59 | } 60 | 61 | search_response = es.search(request_timeout=30, index=idx_name, 62 | body=search_body)['hits']['hits'][:k] 63 | response = [{'question': x['highlight']['question'], 'answer': x['_source']['answer']} for x in search_response] 64 | return response 65 | 66 | 67 | 68 | def lambda_handler(event, context): 69 | 70 | # elasticsearch variables 71 | service = 'es' 72 | region = environ['AWS_REGION'] 73 | elasticsearch_endpoint = environ['ES_ENDPOINT'] 74 | 75 | session = boto3.session.Session() 76 | credentials = session.get_credentials() 77 | # awsauth = AWS4Auth( 78 | # credentials.access_key, 79 | # credentials.secret_key, 80 | # region, 81 | # service, 82 | # session_token=credentials.token 83 | # ) 84 | awsauth=("master","Semantic123!") 85 | 86 | es = Elasticsearch( 87 | hosts=[{'host': elasticsearch_endpoint, 'port': 443}], 88 | http_auth=awsauth, 89 | use_ssl=True, 90 | verify_certs=True, 91 | connection_class=RequestsHttpConnection 92 | ) 93 | 94 | # sagemaker variables 95 | sagemaker_endpoint = environ['SM_ENDPOINT'] 96 | 97 | api_payload = json.loads(event['body']) 98 | k = 30 99 | payload = api_payload['searchString'] 100 | 101 | if 
event['path'] == '/postText': 102 | features = get_features(sm_runtime_client, sagemaker_endpoint, payload) 103 | similar_questions = get_neighbors(features, es, k_neighbors=k) 104 | return { 105 | "statusCode": 200, 106 | "headers": { 107 | "Access-Control-Allow-Origin": "*", 108 | "Access-Control-Allow-Headers": "*", 109 | "Access-Control-Allow-Methods": "*" 110 | }, 111 | "body": json.dumps({ 112 | "semantics": similar_questions, 113 | }), 114 | } 115 | else: 116 | search = es_match_query(payload, es, k) 117 | 118 | for i in range(len(search)): 119 | search[i]['question'][0] = search[i]['question'][0].replace("<em>", "").replace("</em>", "")  # strip highlight tags (assumed: the original <em> markup was lost in the HTML export) 120 | return { 121 | "statusCode": 200, 122 | "headers": { 123 | "Access-Control-Allow-Origin": "*", 124 | "Access-Control-Allow-Headers": "*", 125 | "Access-Control-Allow-Methods": "*" 126 | }, 127 | "body": json.dumps(search), 128 | } 129 | -------------------------------------------------------------------------------- /backend/lambda/build-lambda.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | pip install --target ./package -r requirements.txt 4 | 5 | cd package 6 | zip -r ../lambda.zip . 7 | 8 | cd .. 9 | zip -g lambda.zip app.py 10 | 11 | -------------------------------------------------------------------------------- /backend/lambda/requirements.txt: -------------------------------------------------------------------------------- 1 | requests 2 | elasticsearch==7.10 3 | requests-aws4auth -------------------------------------------------------------------------------- /backend/template.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: '2010-09-09' 2 | Transform: AWS::Serverless-2016-10-31 3 | Description: 'backend 4 | 5 | Sample SAM Template for backend 6 | 7 | ' 8 | Parameters: 9 | BucketName: 10 | Type: String 11 | DomainName: 12 | Type: String 13 | ElasticSearchURL: 14 | Type: String 15 | SagemakerEndpoint: 16 | Type: String 17 | LambdaZipFile: 18 | Type: String 19 | Globals: 20 | Function: 21 | Timeout: 60 22 | MemorySize: 512 23 | Api: 24 | Cors: 25 | AllowMethods: '''*''' 26 | AllowHeaders: '''*''' 27 | AllowOrigin: '''*''' 28 | Resources: 29 | PostGetSimilarTextFunction: 30 | Type: AWS::Serverless::Function 31 | Properties: 32 | CodeUri: 33 | Bucket: 34 | Ref: LambdaZipFile 35 | Key: 'lambda/lambda.zip' 36 | Handler: app.lambda_handler 37 | Runtime: python3.8 38 | Environment: 39 | Variables: 40 | ES_ENDPOINT: 41 | Ref: ElasticSearchURL 42 | SM_ENDPOINT: 43 | Ref: SagemakerEndpoint 44 | Policies: 45 | - Version: '2012-10-17' 46 | Statement: 47 | - Sid: AllowSagemakerInvokeEndpoint 48 | Effect: Allow 49 | Action: 50 | - sagemaker:InvokeEndpoint 51 | Resource: 52 | - Fn::Sub: arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:endpoint/${SagemakerEndpoint} 53 | - Version: '2012-10-17' 54 | Statement: 55 | - Sid: AllowESS 56 | Effect: Allow 57 | Action: 58 | - es:* 59 | Resource: 60 | - Fn::Sub: arn:aws:es:${AWS::Region}:${AWS::AccountId}:domain/${DomainName}/* 61 | - S3ReadPolicy: 62 | BucketName: 63 | Ref: BucketName 64 | Events: 65 | PostText: 66 | Type: Api 67 | Properties: 68 | Path: /postText 69 | Method: post 70 | PostMatch: 71 | Type: Api 72 | Properties: 73 | Method: post 74 | Path: /postMatch 75 | Outputs: 76 | TextSimilarityApi: 77 | Description: API Gateway endpoint URL for Prod stage for GetSimilarText function 78 | Value: 79 | Fn::Sub: https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/ 80 | 
PostGetSimilarTextFunctionArn: 81 | Description: GetSimilarText Lambda Function ARN 82 | Value: 83 | Fn::GetAtt: 84 | - PostGetSimilarTextFunction 85 | - Arn 86 | PostGetSimilarTextLambdaIamRole: 87 | Description: Implicit IAM Role created for GetSimilarText function 88 | Value: 89 | Fn::GetAtt: 90 | - PostGetSimilarTextFunctionRole 91 | - Arn 92 | -------------------------------------------------------------------------------- /blog/LLM-Based-Agent/stock-price/stock_symbol.csv: -------------------------------------------------------------------------------- 1 | stock_symbol,company_name 2 | AMZN,"Amazon.com, Inc." 3 | MSFT,Microsoft Corp. 4 | AYX,"Alteryx, Inc." 5 | MSTR,MICROSTRATEGY Inc 6 | ESTC,Elastic N.V. 7 | MDB,"MongoDB, Inc." 8 | PANW,Palo Alto Networks Inc 9 | OKTA,"Okta, Inc." 10 | DDOG,"Datadog, Inc." 11 | SNOW,Snowflake Inc. 12 | CRM,"SALESFORCE.COM, INC." 13 | ORCL,ORACLE CORP 14 | MSFT,Microsoft Corp. 15 | PLTR,Palantir Technologies Inc. -------------------------------------------------------------------------------- /code/inference.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import logging 3 | import sagemaker_containers 4 | import requests 5 | 6 | import os 7 | import json 8 | import io 9 | import time 10 | import torch 11 | from transformers import AutoTokenizer, AutoModel 12 | # from sentence_transformers import models, losses, SentenceTransformer 13 | 14 | logger = logging.getLogger(__name__) 15 | logger.setLevel(logging.DEBUG) 16 | 17 | #Mean Pooling - Take attention mask into account for correct averaging 18 | def mean_pooling(model_output, attention_mask): 19 | token_embeddings = model_output[0] #First element of model_output contains all token embeddings 20 | input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float() 21 | sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1) 22 | sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9) 23 | return sum_embeddings / sum_mask 24 | 25 | def embed_tformer(model, tokenizer, sentences): 26 | encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=256, return_tensors='pt') 27 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 28 | encoded_input.to(device) 29 | 30 | #Compute token embeddings 31 | with torch.no_grad(): 32 | model_output = model(**encoded_input) 33 | 34 | sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask']) 35 | return sentence_embeddings 36 | 37 | def model_fn(model_dir): 38 | logger.info('model_fn') 39 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 40 | logger.info(model_dir) 41 | tokenizer = AutoTokenizer.from_pretrained(model_dir) 42 | nlp_model = AutoModel.from_pretrained(model_dir) 43 | nlp_model.to(device) 44 | model = {'model':nlp_model, 'tokenizer':tokenizer} 45 | 46 | return model 47 | 48 | # Deserialize the Invoke request body into an object we can perform prediction on 49 | def input_fn(serialized_input_data, content_type='text/plain'): 50 | logger.info('Deserializing the input data.') 51 | try: 52 | data = [serialized_input_data.decode('utf-8')] 53 | return data 54 | except: 55 | raise Exception('Requested unsupported ContentType in content_type: {}'.format(content_type)) 56 | 57 | # Perform prediction on the deserialized object, with the loaded model 58 | def predict_fn(input_object, model): 59 | logger.info("Calling model") 60 | start_time = time.time() 61 | sentence_embeddings = 
embed_tformer(model['model'], model['tokenizer'], input_object) 62 | print("--- Inference time: %s seconds ---" % (time.time() - start_time)) 63 | response = sentence_embeddings[0].tolist() 64 | return response 65 | 66 | # Serialize the prediction result into the desired response content type 67 | def output_fn(prediction, accept): 68 | logger.info('Serializing the generated output.') 69 | if accept == 'application/json': 70 | output = json.dumps(prediction) 71 | return output 72 | raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept)) 73 | -------------------------------------------------------------------------------- /code/requirements.txt: -------------------------------------------------------------------------------- 1 | sentence-transformers 2 | sagemaker-containers 3 | numpy>=1.17 4 | -------------------------------------------------------------------------------- /converstational-search.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/converstational-search.png -------------------------------------------------------------------------------- /convert_pqa.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | with open("amazon-pqa/amazon_pqa_headsets.json") as input: 4 | with open("amazon-pqa/converted_amazon_pqa_headsets_1000.json","w") as output: 5 | i=0 6 | for line in input: 7 | line_data = json.loads(line) 8 | meta = '{"index":{"_index":"nlp_pqa"}}\n' 9 | output.write(meta) 10 | data = '{"question":"' 11 | data = data + line_data['question_text'] 12 | data = data + '","answer":"' 13 | data = data + line_data['answers'][0]['answer_text'] 14 | data = data + '"}\n' 15 | output.write(data) 16 | i+=1 17 | if(i == 1000): 18 | break 19 | 20 | print("converted file is saved to 'amazon-pqa/converted_amazon_pqa_headsets_1000.json'") -------------------------------------------------------------------------------- /deploy-semantic-search-backend-cloudformation.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/deploy-semantic-search-backend-cloudformation.jpg -------------------------------------------------------------------------------- /deploy-to-aws.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/deploy-to-aws.png -------------------------------------------------------------------------------- /download-dependencies.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | set -e 4 | 5 | echo "(Re)-creating directory" 6 | rm -rf ./dependencies 7 | mkdir ./dependencies 8 | cd ./dependencies 9 | echo "Downloading dependencies" 10 | curl -sS https://d2eo22ngex1n9g.cloudfront.net/Documentation/SDK/bedrock-python-sdk.zip > sdk.zip 11 | echo "Unpacking dependencies" 12 | # (SageMaker Studio system terminals don't have `unzip` utility installed) 13 | if command -v unzip &> /dev/null 14 | then 15 | unzip sdk.zip && rm sdk.zip && echo "Done" 16 | else 17 | echo "'unzip' command not found: Trying to unzip via Python" 18 | python -m zipfile -e sdk.zip . 
&& rm sdk.zip && echo "Done" 19 | fi 20 | -------------------------------------------------------------------------------- /frontend/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "frontend", 3 | "version": "0.1.0", 4 | "private": true, 5 | "dependencies": { 6 | "@material-ui/core": "^4.12.4", 7 | "@material-ui/icons": "^4.11.3", 8 | "aws-amplify": "^4.3.34", 9 | "react": "^17.0.2", 10 | "react-dom": "^17.0.2", 11 | "typeface-roboto": "^1.1.13" 12 | }, 13 | "devDependencies": { 14 | "react-scripts": "^5.0.1" 15 | }, 16 | "scripts": { 17 | "start": "react-scripts start", 18 | "build": "react-scripts build", 19 | "eject": "react-scripts eject" 20 | }, 21 | "eslintConfig": { 22 | "extends": "react-app" 23 | }, 24 | "browserslist": { 25 | "production": [ 26 | ">0.2%", 27 | "not dead", 28 | "not op_mini all" 29 | ], 30 | "development": [ 31 | "last 1 chrome version", 32 | "last 1 firefox version", 33 | "last 1 safari version" 34 | ] 35 | } 36 | } 37 | -------------------------------------------------------------------------------- /frontend/public/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/frontend/public/favicon.ico -------------------------------------------------------------------------------- /frontend/public/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 9 | 10 | 11 | 12 | 16 | 17 | 21 | 22 | 31 | AWS Natural Language Search - AWS Samples 32 | 33 | 34 | 35 |
36 | 46 | 47 | 48 | -------------------------------------------------------------------------------- /frontend/public/logo192.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/frontend/public/logo192.png -------------------------------------------------------------------------------- /frontend/public/logo512.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/frontend/public/logo512.png -------------------------------------------------------------------------------- /frontend/public/manifest.json: -------------------------------------------------------------------------------- 1 | { 2 | "short_name": "React App", 3 | "name": "Create React App Sample", 4 | "icons": [ 5 | { 6 | "src": "favicon.ico", 7 | "sizes": "64x64 32x32 24x24 16x16", 8 | "type": "image/x-icon" 9 | }, 10 | { 11 | "src": "logo192.png", 12 | "type": "image/png", 13 | "sizes": "192x192" 14 | }, 15 | { 16 | "src": "logo512.png", 17 | "type": "image/png", 18 | "sizes": "512x512" 19 | } 20 | ], 21 | "start_url": ".", 22 | "display": "standalone", 23 | "theme_color": "#000000", 24 | "background_color": "#ffffff" 25 | } 26 | -------------------------------------------------------------------------------- /frontend/public/robots.txt: -------------------------------------------------------------------------------- 1 | # https://www.robotstxt.org/robotstxt.html 2 | User-agent: * 3 | Disallow: 4 | -------------------------------------------------------------------------------- /frontend/src/App.css: -------------------------------------------------------------------------------- 1 | .App { 2 | text-align: center; 3 | } 4 | 5 | .App-logo { 6 | height: 40vmin; 7 | pointer-events: none; 8 | } 9 | 10 | @media (prefers-reduced-motion: no-preference) { 11 | .App-logo { 12 | animation: App-logo-spin infinite 20s linear; 13 | } 14 | } 15 | 16 | .App-header { 17 | background-color: #282c34; 18 | min-height: 100vh; 19 | display: flex; 20 | flex-direction: column; 21 | align-items: center; 22 | justify-content: center; 23 | font-size: calc(10px + 2vmin); 24 | color: white; 25 | } 26 | 27 | .App-link { 28 | color: #61dafb; 29 | } 30 | 31 | @keyframes App-logo-spin { 32 | from { 33 | transform: rotate(0deg); 34 | } 35 | to { 36 | transform: rotate(360deg); 37 | } 38 | } 39 | -------------------------------------------------------------------------------- /frontend/src/App.js: -------------------------------------------------------------------------------- 1 | import React from 'react'; 2 | import './App.css'; 3 | import 'typeface-roboto'; 4 | import { Button, Input, FormControl, Select, MenuItem } from '@material-ui/core'; 5 | import { withStyles, lighten } from "@material-ui/core/styles"; 6 | import InputAdornment from '@material-ui/core/InputAdornment'; 7 | import SearchIcon from '@material-ui/icons/Search'; 8 | import Paper from '@material-ui/core/Paper'; 9 | import Grid from '@material-ui/core/Grid'; 10 | import GridList from '@material-ui/core/GridList'; 11 | import GridListTile from '@material-ui/core/GridListTile'; 12 | import GridListTileBar from '@material-ui/core/GridListTileBar'; 13 | import LinearProgress from '@material-ui/core/LinearProgress'; 14 | import Typography from '@material-ui/core/Typography'; 15 | import 
Amplify, { API } from "aws-amplify"; 16 | import '@aws-amplify/ui/dist/style.css'; 17 | import Config from './config'; 18 | 19 | 20 | Amplify.configure({ 21 | API: { 22 | endpoints: [ 23 | { 24 | name: "NluSearch", 25 | endpoint: Config.apiEndpoint 26 | } 27 | ] 28 | } 29 | }); 30 | 31 | const styles = theme => ({ 32 | root: { 33 | flexGrow: 1, 34 | }, 35 | paper: { 36 | padding: theme.spacing(2), 37 | textAlign: 'center', 38 | height: "100%", 39 | color: theme.palette.text.secondary 40 | }, 41 | paper2: { 42 | padding: theme.spacing(2), 43 | textAlign: 'center', 44 | color: theme.palette.text.secondary 45 | }, 46 | em: { 47 | backgroundColor: "#f18973" 48 | } 49 | }); 50 | 51 | const BorderLinearProgress = withStyles({ 52 | root: { 53 | height: 10, 54 | backgroundColor: lighten('#ff6c5c', 0.5), 55 | }, 56 | bar: { 57 | borderRadius: 20, 58 | backgroundColor: '#ff6c5c', 59 | }, 60 | })(LinearProgress); 61 | 62 | // const classes = useStyles(); 63 | 64 | class App extends React.Component { 65 | 66 | constructor(props) { 67 | super(props); 68 | this.state = { 69 | semantics: [], 70 | results: [], 71 | completed: 0, 72 | k: 3 73 | }; 74 | this.handleSearchSubmit = this.handleSearchSubmit.bind(this); 75 | this.handleFormChange = this.handleFormChange.bind(this); 76 | } 77 | 78 | handleSearchSubmit(event) { 79 | // runs when a user submits a search 80 | // if the search field is empty, it clears previous results from state 81 | console.log(this.state.searchText); 82 | if (this.state.searchText === undefined || this.state.searchText === "") { 83 | console.log("Empty Text field"); 84 | this.setState({ semantics: [], completed: 0, results: [] }); 85 | } else { 86 | const myInit = { 87 | body: { "searchString": this.state.searchText } 88 | }; 89 | 90 | this.setState({ completed: 66 }); 91 | 92 | API.post('NluSearch', '/postText', myInit) 93 | .then(response => { 94 | this.setState({ 95 | semantics: response.semantics.map(function (elem) { 96 | let semantic = {}; 97 | semantic.question = elem.question; 98 | semantic.answer = elem.answer; 99 | return semantic; 100 | }) 101 | }); 102 | console.log(this.state.semantics); 103 | }) 104 | .catch(error => { 105 | console.log(error); 106 | }); 107 | 108 | this.setState({ completed: 85 }); 109 | 110 | console.log(this.state.results); 111 | API.post('NluSearch', '/postMatch', myInit) 112 | .then(response => { 113 | // this.setState({results: []}); 114 | this.setState({ 115 | results: response.map(function (elem) { 116 | let result = {}; 117 | result.question = elem.question; 118 | result.answer = elem.answer; 119 | return result; 120 | }) 121 | }); 122 | console.log(this.state.results); 123 | this.setState({ completed: 100 }); 124 | }) 125 | .catch(error => { 126 | console.log(error); 127 | }); 128 | 129 | 130 | }; 131 | event.preventDefault(); 132 | } 133 | 134 | handleFormChange(event) { 135 | this.setState({ searchText: event.target.value }); 136 | } 137 | 138 | 139 | render() { 140 | const { classes } = this.props; 141 | const createMarkup = htmlString => ({ __html: htmlString }); 142 | 143 | return ( 
145 | 146 | 147 | {/* 148 | Header 149 | */} 150 | 151 | 152 | Semantic Search with Amazon OpenSearch 153 | 154 | 155 | 156 | 157 | 158 | 159 | Enter a question you have about the product, for example "Does this work with xbox?" 160 |

161 |

162 | 173 | 174 | 175 | } 176 | /> 177 | 183 | 184 |
185 |
186 | 187 | 188 | 189 | 190 | 191 | 192 | Semantic Search 193 | 194 | {this.state.semantics.map((tile) => ( 195 | 196 | 197 |

198 | 199 | 200 | ) 201 | ) 202 | } 203 | 204 | 205 | 206 | 207 | 208 | Keyword Search 209 | 210 | {this.state.results.map((tile) => ( 211 | 212 | 213 |

214 | 215 | 216 | ))} 217 | 218 | 219 | 220 | 221 | 222 | 223 |

224 | ); 225 | } 226 | } 227 | 228 | export default withStyles(styles, { withTheme: true })(App); 229 | 230 | -------------------------------------------------------------------------------- /frontend/src/config/index.js: -------------------------------------------------------------------------------- 1 | import Config from './config.json' 2 | 3 | export default { 4 | apiEndpoint: Config.apiEndpoint 5 | }; -------------------------------------------------------------------------------- /frontend/src/images/header.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/frontend/src/images/header.jpg -------------------------------------------------------------------------------- /frontend/src/index.css: -------------------------------------------------------------------------------- 1 | body { 2 | margin: 0; 3 | font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Oxygen', 4 | 'Ubuntu', 'Cantarell', 'Fira Sans', 'Droid Sans', 'Helvetica Neue', 5 | sans-serif; 6 | -webkit-font-smoothing: antialiased; 7 | -moz-osx-font-smoothing: grayscale; 8 | } 9 | 10 | code { 11 | font-family: source-code-pro, Menlo, Monaco, Consolas, 'Courier New', 12 | monospace; 13 | } 14 | -------------------------------------------------------------------------------- /frontend/src/index.js: -------------------------------------------------------------------------------- 1 | import React from 'react'; 2 | import ReactDOM from 'react-dom'; 3 | import './index.css'; 4 | import App from './App'; 5 | import * as serviceWorker from './serviceWorker'; 6 | 7 | ReactDOM.render( 8 | 9 | 10 | , 11 | document.getElementById('root') 12 | ); 13 | 14 | // If you want your app to work offline and load faster, you can change 15 | // unregister() to register() below. Note this comes with some pitfalls. 16 | // Learn more about service workers: https://bit.ly/CRA-PWA 17 | serviceWorker.unregister(); 18 | -------------------------------------------------------------------------------- /frontend/src/logo.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /frontend/src/serviceWorker.js: -------------------------------------------------------------------------------- 1 | // This optional code is used to register a service worker. 2 | // register() is not called by default. 3 | 4 | // This lets the app load faster on subsequent visits in production, and gives 5 | // it offline capabilities. However, it also means that developers (and users) 6 | // will only see deployed updates on subsequent visits to a page, after all the 7 | // existing tabs open on the page have been closed, since previously cached 8 | // resources are updated in the background. 9 | 10 | // To learn more about the benefits of this model and instructions on how to 11 | // opt-in, read https://bit.ly/CRA-PWA 12 | 13 | const isLocalhost = Boolean( 14 | window.location.hostname === 'localhost' || 15 | // [::1] is the IPv6 localhost address. 16 | window.location.hostname === '[::1]' || 17 | // 127.0.0.0/8 are considered localhost for IPv4. 
18 | window.location.hostname.match( 19 | /^127(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}$/ 20 | ) 21 | ); 22 | 23 | export function register(config) { 24 | if (process.env.NODE_ENV === 'production' && 'serviceWorker' in navigator) { 25 | // The URL constructor is available in all browsers that support SW. 26 | const publicUrl = new URL(process.env.PUBLIC_URL, window.location.href); 27 | if (publicUrl.origin !== window.location.origin) { 28 | // Our service worker won't work if PUBLIC_URL is on a different origin 29 | // from what our page is served on. This might happen if a CDN is used to 30 | // serve assets; see https://github.com/facebook/create-react-app/issues/2374 31 | return; 32 | } 33 | 34 | window.addEventListener('load', () => { 35 | const swUrl = `${process.env.PUBLIC_URL}/service-worker.js`; 36 | 37 | if (isLocalhost) { 38 | // This is running on localhost. Let's check if a service worker still exists or not. 39 | checkValidServiceWorker(swUrl, config); 40 | 41 | // Add some additional logging to localhost, pointing developers to the 42 | // service worker/PWA documentation. 43 | navigator.serviceWorker.ready.then(() => { 44 | console.log( 45 | 'This web app is being served cache-first by a service ' + 46 | 'worker. To learn more, visit https://bit.ly/CRA-PWA' 47 | ); 48 | }); 49 | } else { 50 | // Is not localhost. Just register service worker 51 | registerValidSW(swUrl, config); 52 | } 53 | }); 54 | } 55 | } 56 | 57 | function registerValidSW(swUrl, config) { 58 | navigator.serviceWorker 59 | .register(swUrl) 60 | .then(registration => { 61 | registration.onupdatefound = () => { 62 | const installingWorker = registration.installing; 63 | if (installingWorker == null) { 64 | return; 65 | } 66 | installingWorker.onstatechange = () => { 67 | if (installingWorker.state === 'installed') { 68 | if (navigator.serviceWorker.controller) { 69 | // At this point, the updated precached content has been fetched, 70 | // but the previous service worker will still serve the older 71 | // content until all client tabs are closed. 72 | console.log( 73 | 'New content is available and will be used when all ' + 74 | 'tabs for this page are closed. See https://bit.ly/CRA-PWA.' 75 | ); 76 | 77 | // Execute callback 78 | if (config && config.onUpdate) { 79 | config.onUpdate(registration); 80 | } 81 | } else { 82 | // At this point, everything has been precached. 83 | // It's the perfect time to display a 84 | // "Content is cached for offline use." message. 85 | console.log('Content is cached for offline use.'); 86 | 87 | // Execute callback 88 | if (config && config.onSuccess) { 89 | config.onSuccess(registration); 90 | } 91 | } 92 | } 93 | }; 94 | }; 95 | }) 96 | .catch(error => { 97 | console.error('Error during service worker registration:', error); 98 | }); 99 | } 100 | 101 | function checkValidServiceWorker(swUrl, config) { 102 | // Check if the service worker can be found. If it can't reload the page. 103 | fetch(swUrl, { 104 | headers: { 'Service-Worker': 'script' }, 105 | }) 106 | .then(response => { 107 | // Ensure service worker exists, and that we really are getting a JS file. 108 | const contentType = response.headers.get('content-type'); 109 | if ( 110 | response.status === 404 || 111 | (contentType != null && contentType.indexOf('javascript') === -1) 112 | ) { 113 | // No service worker found. Probably a different app. Reload the page. 
114 | navigator.serviceWorker.ready.then(registration => { 115 | registration.unregister().then(() => { 116 | window.location.reload(); 117 | }); 118 | }); 119 | } else { 120 | // Service worker found. Proceed as normal. 121 | registerValidSW(swUrl, config); 122 | } 123 | }) 124 | .catch(() => { 125 | console.log( 126 | 'No internet connection found. App is running in offline mode.' 127 | ); 128 | }); 129 | } 130 | 131 | export function unregister() { 132 | if ('serviceWorker' in navigator) { 133 | navigator.serviceWorker.ready 134 | .then(registration => { 135 | registration.unregister(); 136 | }) 137 | .catch(error => { 138 | console.error(error.message); 139 | }); 140 | } 141 | } 142 | -------------------------------------------------------------------------------- /full-stack-semantic-search-ui-2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/full-stack-semantic-search-ui-2.jpg -------------------------------------------------------------------------------- /full-stack-semantic-search-ui.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/full-stack-semantic-search-ui.jpg -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/README.md: -------------------------------------------------------------------------------- 1 | ## Code organization 2 | 3 | ### main_queryEncoder.py 4 | Lambda handler that processes the incoming request and calls the LLM chain to generate a reply. 5 | 6 | ### chain_queryEncoder.py 7 | The LLM chain code that calls the LLM with the input from the user. 8 | 9 | ### main_documentEncoder.py 10 | Lambda handler that processes the document chunks. 11 | 12 | ### chain_documentEncoder.py 13 | The LangChain code that inserts documents into OpenSearch from S3. 14 | 15 | ### conversational_search_full_stack_with_gpu.yaml 16 | This is the full stack that deploys the entire chat application in your own account. 17 | 18 | ### webapp directory 19 | This folder has all the code to deploy the front end for the chat interface on EC2 using the Streamlit module. 20 | 21 | ## Packaging the Lambda functions 22 | 23 | Clone the repository 24 | ```bash 25 | git clone https://github.com/aws-samples/semantic-search-with-amazon-opensearch.git 26 | ``` 27 | 28 | Move to the package directory 29 | ```bash 30 | cd generative-ai/Module_1_Build_Conversational_Search 31 | ``` 32 | 33 | Install the dependencies; this creates a Conda env named `conversational-search-with-opensearch-service` and activates it. 34 | ```bash 35 | conda deactivate 36 | conda env create -f environment.yml # only needed once 37 | conda activate conversational-search-with-opensearch-service 38 | ``` 39 | 40 | Bundle the code for Lambda deployment. 41 | ```bash 42 | ./bundle.sh 43 | ``` 44 | This will create two Lambda .zip packages and a webapp .zip package inside the generative-ai/Module_1_Build_Conversational_Search directory, as queryEncoder.zip, documentEncoder.zip and webapp.zip respectively. 
45 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/create_IAM_role.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/create_IAM_role.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/create_IAMrole.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/create_IAMrole.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/create_role.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/create_role.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_URL.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_URL.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_exec_role.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_exec_role.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_function.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_function.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_functions.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_functions.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_functionz.cpython-310.pyc: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_functionz.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_role.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/__pycache__/lambda_role.cpython-310.pyc -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/app.py: -------------------------------------------------------------------------------- 1 | from aws_cdk import ( 2 | App, Duration, Stack, 3 | aws_apigateway as apigateway, 4 | aws_lambda as lambda_, 5 | aws_secretsmanager as secretsmanager, 6 | aws_dynamodb as dynamodb 7 | ) 8 | import config 9 | 10 | 11 | class LangChainApp(Stack): 12 | def __init__(self, app: App, id: str) -> None: 13 | super().__init__(app, id) 14 | 15 | table = dynamodb.Table(self, "table", table_name=config.config.DYNAMODB_TABLE_NAME, 16 | partition_key=dynamodb.Attribute(name="SessionId", type=dynamodb.AttributeType.STRING) 17 | ) 18 | 19 | handler = lambda_.Function(self, "LangChainHandler", 20 | runtime=lambda_.Runtime.PYTHON_3_9, 21 | code=lambda_.Code.from_asset("dist/lambda.zip"), 22 | handler="main.handler", 23 | layers=[ 24 | lambda_.LayerVersion.from_layer_version_arn( 25 | self, 26 | "SecretsExtensionLayer", 27 | layer_version_arn=config.config.SECRETS_EXTENSION_ARN 28 | ) 29 | ], 30 | timeout=Duration.minutes(5) 31 | ) 32 | 33 | table.grant_read_write_data(handler) 34 | 35 | secret = secretsmanager.Secret.from_secret_name_v2(self, 'secret', config.config.API_KEYS_SECRET_NAME) 36 | secret.grant_read(handler) 37 | secret.grant_write(handler) 38 | 39 | 40 | 41 | api = apigateway.RestApi(self, "langchain-api", 42 | rest_api_name="LangChain Service Api", 43 | description="Showcases langchain use of llm models" 44 | ) 45 | 46 | request_model = api.add_model("RequestModel", content_type="application/json", 47 | model_name="RequestModel", 48 | description="Schema for request payload", 49 | schema={ 50 | "title": "requestParameters", 51 | "type": apigateway.JsonSchemaType.OBJECT, 52 | "properties": { 53 | "prompt": { 54 | "type": apigateway.JsonSchemaType.STRING 55 | }, 56 | "session_id": { 57 | "type": apigateway.JsonSchemaType.STRING 58 | } 59 | } 60 | } 61 | ) 62 | 63 | post_integration = apigateway.LambdaIntegration(handler) 64 | 65 | api.root.add_method( 66 | "POST", 67 | post_integration, 68 | authorization_type=apigateway.AuthorizationType.IAM, 69 | request_models={ 70 | "application/json": request_model 71 | }, 72 | request_validator_options={ 73 | "request_validator_name": 'request-validator', 74 | "validate_request_body": True, 75 | "validate_request_parameters": False 76 | } 77 | ) 78 | 79 | app = App() 80 | LangChainApp(app, "LangChainApp") 81 | app.synth() -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/bundle.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | 
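# bundle.sh builds the Lambda deployment packages:
#   1. install requirements.txt into dist/ as manylinux wheels (the Lambda platform)
#   2. copy the handler and chain modules into dist/
#   3. zip everything into queryEncoder.zip and documentEncoder.zip
#   4. zip the webapp/ directory into webapp.zip for the EC2 front end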
rm -rf dist 4 | 5 | pip install --platform manylinux2014_x86_64 --implementation cp --only-binary=:all: -r requirements.txt -t dist 6 | 7 | # (optional when openai is used) openai doesn't have a binary distribution, so need a separate install 8 | pip install -I openai -t dist 9 | 10 | # remove extraneous bits from installed packages 11 | rm -r dist/*.dist-info 12 | cp config.py chain_queryEncoder.py main_queryEncoder.py chain_documentEncoder.py main_documentEncoder.py dist/ 13 | cd dist && zip -r ../queryEncoder.zip * 14 | zip -r ../documentEncoder.zip * -x queryEncoder.zip 15 | rm -rf ../dist 16 | cd ../webapp && zip -r ../webapp.zip * -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/bundle_lambda.sh: -------------------------------------------------------------------------------- 1 | # Download the Langchain module 2 | cd Module_1_Build_Conversational_Search 3 | 4 | aws s3 cp s3://ws-assets-prod-iad-r-pdx-f3b3f9f1a7d6a3d0/2108cfcf-6cd6-4613-83c0-db4e55998757/Langchain.zip . 5 | 6 | aws s3 cp s3://ws-assets-prod-iad-r-pdx-f3b3f9f1a7d6a3d0/2108cfcf-6cd6-4613-83c0-db4e55998757/documentEncoder.zip . 7 | 8 | aws s3 cp s3://ws-assets-prod-iad-r-pdx-f3b3f9f1a7d6a3d0/2108cfcf-6cd6-4613-83c0-db4e55998757/queryEncoder.zip . -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/cdk.json: -------------------------------------------------------------------------------- 1 | { 2 | "app": "python3 app.py" 3 | } -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/chain_documentEncoder.py: -------------------------------------------------------------------------------- 1 | from typing import Tuple 2 | from uuid import uuid4 3 | from langchain.docstore.document import Document 4 | from langchain import ConversationChain,PromptTemplate, LLMChain 5 | from langchain.memory import ConversationBufferMemory, DynamoDBChatMessageHistory,ConversationBufferWindowMemory 6 | from langchain.prompts import ( 7 | ChatPromptTemplate, 8 | MessagesPlaceholder, 9 | SystemMessagePromptTemplate, 10 | HumanMessagePromptTemplate 11 | ) 12 | from langchain.chains import ConversationalRetrievalChain 13 | from langchain.chat_models import ChatOpenAI 14 | from langchain.vectorstores import OpenSearchVectorSearch 15 | from langchain.embeddings import OpenAIEmbeddings 16 | from langchain.schema import messages_to_dict 17 | import config 18 | from langchain.chains import RetrievalQA 19 | from langchain.llms import OpenAI 20 | import requests 21 | from typing import Dict 22 | from langchain import PromptTemplate, SagemakerEndpoint 23 | import json 24 | import time 25 | import logging 26 | from typing import List 27 | from langchain.embeddings import SagemakerEndpointEmbeddings 28 | from langchain.embeddings.sagemaker_endpoint import EmbeddingsContentHandler 29 | import os 30 | # from langchain.tools import Tool, tool 31 | 32 | 33 | EMBEDDING_ENDPOINT = os.environ['EmbeddingEndpointName'] 34 | LLM_ENDPOINT = os.environ['LLMEndpointName'] 35 | DOMAIN_ENDPOINT = os.environ['OpenSearchDomainEndpoint'] 36 | DOMAIN_INDEX = "llm_apps_workshop_embeddings" 37 | DYNAMO_DB_TABLE = os.environ['DynamoDBTableName'] 38 | REGION = os.environ['aws_region'] 39 | 40 | import boto3 41 | 42 | kms = boto3.client('secretsmanager') 43 | aos_credentials = kms.get_secret_value(SecretId=os.environ['OpenSearchSecret']) 44 | 
DOMAIN_ADMIN_UNAME = json.loads(aos_credentials['SecretString'])['username'] 45 | DOMAIN_ADMIN_PW = json.loads(aos_credentials['SecretString'])['password'] 46 | 47 | 48 | def run(bucket_: str, key_: str) -> Tuple[str, str]: 49 | 50 | print('embedding model initialisation') 51 | 52 | #Sagemaker embedding model 53 | class SagemakerEndpointEmbeddingsJumpStart(SagemakerEndpointEmbeddings): 54 | def embed_documents( 55 | self, texts: List[str], chunk_size: int = 5 56 | ) -> List[List[float]]: 57 | """Compute doc embeddings using a SageMaker Inference Endpoint. 58 | 59 | Args: 60 | texts: The list of texts to embed. 61 | chunk_size: The chunk size defines how many input texts will 62 | be grouped together as request. If None, will use the 63 | chunk size specified by the class. 64 | 65 | Returns: 66 | List of embeddings, one for each text. 67 | """ 68 | results = [] 69 | _chunk_size = len(texts) if chunk_size > len(texts) else chunk_size 70 | st = time.time() 71 | for i in range(0, len(texts), _chunk_size): 72 | response = self._embedding_func(texts[i:i + _chunk_size]) 73 | results.extend(response) 74 | time_taken = time.time() - st 75 | #logger.info(f"got results for {len(texts)} in {time_taken}s, length of embeddings list is {len(results)}") 76 | return results 77 | 78 | class ContentHandler(EmbeddingsContentHandler): 79 | content_type = "application/json" 80 | accepts = "application/json" 81 | 82 | def transform_input(self, prompt: str, model_kwargs={}) -> bytes: 83 | 84 | input_str = json.dumps({"text_inputs": prompt, **model_kwargs}) 85 | return input_str.encode('utf-8') 86 | 87 | def transform_output(self, output: bytes) -> str: 88 | 89 | response_json = json.loads(output.read().decode("utf-8")) 90 | embeddings = response_json["embedding"] 91 | if len(embeddings) == 1: 92 | return [embeddings[0]] 93 | return embeddings 94 | 95 | embeddings = SagemakerEndpointEmbeddingsJumpStart( 96 | endpoint_name=EMBEDDING_ENDPOINT, 97 | region_name=REGION, 98 | content_handler=ContentHandler() 99 | ) 100 | 101 | 102 | import boto3 103 | import os 104 | s3 = boto3.client('s3') 105 | 106 | local_path = '/tmp/' 107 | 108 | 109 | with open(os.path.join(local_path, key_), 'wb') as file: 110 | s3.download_file(bucket_, key_, file.name) 111 | 112 | from PyPDF2 import PdfReader 113 | pdf_reader = PdfReader("/tmp/"+key_) 114 | text = "" 115 | for page in pdf_reader.pages: 116 | text += page.extract_text() 117 | 118 | 119 | from langchain.text_splitter import CharacterTextSplitter 120 | text_splitter = CharacterTextSplitter( 121 | separator="\n", 122 | chunk_size=500, 123 | chunk_overlap=200, 124 | length_function=len 125 | ) 126 | chunks = text_splitter.split_text(text) 127 | print(len(chunks)) 128 | 129 | 130 | os_domain_ep = 'https://'+DOMAIN_ENDPOINT 131 | 132 | logger = logging.getLogger() 133 | docsearch = OpenSearchVectorSearch.from_texts(index_name = DOMAIN_INDEX, 134 | texts=chunks, 135 | embedding=embeddings, 136 | opensearch_url=os_domain_ep, 137 | http_auth=(DOMAIN_ADMIN_UNAME, DOMAIN_ADMIN_PW) ) 138 | 139 | print("docs inserted into opensearch") 140 | 141 | 142 | response = "docs inserted into opensearch" 143 | 144 | print(response) 145 | return response 146 | 147 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/chain_queryEncoder.py: -------------------------------------------------------------------------------- 1 | from typing import Tuple 2 | from uuid import uuid4 3 | from langchain.docstore.document import 
Document 4 | import os 5 | 6 | 7 | from langchain import ConversationChain,PromptTemplate, LLMChain 8 | from langchain.memory import ConversationBufferMemory, DynamoDBChatMessageHistory,ConversationBufferWindowMemory 9 | from langchain.prompts import (PromptTemplate, 10 | ChatPromptTemplate, 11 | MessagesPlaceholder, 12 | SystemMessagePromptTemplate, 13 | HumanMessagePromptTemplate 14 | ) 15 | from langchain.chains import ConversationalRetrievalChain 16 | from langchain.chat_models import ChatOpenAI 17 | from langchain.vectorstores import OpenSearchVectorSearch 18 | from langchain.embeddings import OpenAIEmbeddings 19 | from langchain.schema import messages_to_dict 20 | import config 21 | from langchain.chains import RetrievalQA 22 | from langchain.llms import OpenAI 23 | import requests 24 | from typing import Dict 25 | 26 | from langchain import PromptTemplate, SagemakerEndpoint 27 | from langchain.llms.sagemaker_endpoint import LLMContentHandler 28 | from langchain.chains.question_answering import load_qa_chain 29 | from typing import Any, Dict, Iterable, List, Optional, Tuple, Callable 30 | import json 31 | # from langchain.tools import Tool, tool 32 | 33 | 34 | 35 | EMBEDDING_ENDPOINT = os.environ['EmbeddingEndpointName'] 36 | LLM_ENDPOINT = os.environ['LLMEndpointName'] 37 | DOMAIN_ENDPOINT = os.environ['OpenSearchDomainEndpoint'] 38 | DOMAIN_INDEX = "llm_apps_workshop_embeddings" 39 | DYNAMO_DB_TABLE = os.environ['DynamoDBTableName'] 40 | REGION = os.environ['aws_region'] 41 | 42 | import boto3 43 | 44 | kms = boto3.client('secretsmanager') 45 | aos_credentials = kms.get_secret_value(SecretId=os.environ['OpenSearchSecret']) 46 | DOMAIN_ADMIN_UNAME = json.loads(aos_credentials['SecretString'])['username'] 47 | DOMAIN_ADMIN_PW = json.loads(aos_credentials['SecretString'])['password'] 48 | 49 | def run(api_key: str, session_id: str, prompt: str) -> Tuple[str, str]: 50 | """This is the main function that executes the prediction chain. 51 | Updating this code will change the predictions of the service. 52 | Current implementation creates a new session id for each run, client 53 | should pass the returned session id in the next execution run, so the 54 | conversation chain can load message context from previous execution. 55 | 56 | Args: 57 | api_key: api key for the LLM service, OpenAI used here 58 | session_id: session id from the previous execution run, pass blank for first execution 59 | prompt: prompt question entered by the user 60 | 61 | Returns: 62 | The prediction from LLM 63 | """ 64 | import json 65 | input_ = json.loads(prompt) 66 | question_ = input_["text"] 67 | searchType_ = input_["searchType"] 68 | temperature_ = float(input_["temperature"]) 69 | topK_ = input_["topK"] 70 | topP_ = input_["topP"] 71 | maxTokens_ = input_["maxTokens"] 72 | 73 | if not session_id.strip(): 74 | print('no session id') 75 | session_id = str(uuid4()) 76 | 77 | chat_memory = DynamoDBChatMessageHistory( 78 | table_name=DYNAMO_DB_TABLE, 79 | session_id=session_id 80 | ) 81 | messages = chat_memory.messages 82 | 83 | # Maintains immutable sessions 84 | # If previous session was present, create 85 | # a new session and copy messages, and 86 | # generate a new session_id 87 | 88 | if messages: 89 | session_id = str(uuid4()) 90 | chat_memory = DynamoDBChatMessageHistory( 91 | table_name=DYNAMO_DB_TABLE, 92 | session_id=session_id 93 | ) 94 | 95 | # This is a workaround at the moment. 
Ideally, this should 96 | # be added to the DynamoDBChatMessageHistory class 97 | 98 | try: 99 | messages = messages_to_dict(messages) 100 | chat_memory.table.put_item( 101 | Item={"SessionId": session_id, "History": messages} 102 | ) 103 | except Exception as e: 104 | print(e) 105 | 106 | memory = ConversationBufferMemory(chat_memory=chat_memory, return_messages=True) 107 | 108 | # session memory WO DynamoDB 109 | # memory_ = ConversationBufferWindowMemory( 110 | # memory_key='chat_history', 111 | # k=5, 112 | # return_messages=True 113 | # ) 114 | 115 | #Using prompt template instead of STUFF and conversational chain 116 | 117 | # prompt_template = ChatPromptTemplate.from_messages([ 118 | # SystemMessagePromptTemplate.from_template("The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know."), 119 | # MessagesPlaceholder(variable_name="history"), 120 | # HumanMessagePromptTemplate.from_template("{input}") 121 | # ]) 122 | # conversation = ConversationChain( 123 | # llm=llm, 124 | # prompt=prompt_template, 125 | # verbose=True, 126 | # memory=memory 127 | # ) 128 | 129 | 130 | #Sagemaker embedding model 131 | import time 132 | import json 133 | import logging 134 | from typing import List 135 | from langchain.embeddings import SagemakerEndpointEmbeddings 136 | from langchain.embeddings.sagemaker_endpoint import EmbeddingsContentHandler 137 | 138 | class SagemakerEndpointEmbeddingsJumpStart(SagemakerEndpointEmbeddings): 139 | def embed_documents( 140 | self, texts: List[str], chunk_size: int = 5 141 | ) -> List[List[float]]: 142 | """Compute doc embeddings using a SageMaker Inference Endpoint. 143 | 144 | Args: 145 | texts: The list of texts to embed. 146 | chunk_size: The chunk size defines how many input texts will 147 | be grouped together as request. If None, will use the 148 | chunk size specified by the class. 149 | 150 | Returns: 151 | List of embeddings, one for each text. 
152 | """ 153 | results = [] 154 | _chunk_size = len(texts) if chunk_size > len(texts) else chunk_size 155 | st = time.time() 156 | for i in range(0, len(texts), _chunk_size): 157 | response = self._embedding_func(texts[i:i + _chunk_size]) 158 | results.extend(response) 159 | time_taken = time.time() - st 160 | #logger.info(f"got results for {len(texts)} in {time_taken}s, length of embeddings list is {len(results)}") 161 | return results 162 | 163 | class ContentHandler(EmbeddingsContentHandler): 164 | content_type = "application/json" 165 | accepts = "application/json" 166 | 167 | def transform_input(self, prompt: str, model_kwargs={}) -> bytes: 168 | 169 | input_str = json.dumps({"text_inputs": prompt, **model_kwargs}) 170 | return input_str.encode('utf-8') 171 | 172 | def transform_output(self, output: bytes) -> str: 173 | 174 | response_json = json.loads(output.read().decode("utf-8")) 175 | embeddings = response_json["embedding"] 176 | if len(embeddings) == 1: 177 | return [embeddings[0]] 178 | return embeddings 179 | 180 | embeddings = SagemakerEndpointEmbeddingsJumpStart( 181 | endpoint_name=EMBEDDING_ENDPOINT, 182 | region_name=REGION, 183 | content_handler=ContentHandler() 184 | ) 185 | 186 | class SimiliarOpenSearchVectorSearch(OpenSearchVectorSearch): 187 | def relevance_score(self, distance: float) -> float: 188 | return distance 189 | def _select_relevance_score_fn(self) -> Callable[[float], float]: 190 | return self.relevance_score 191 | 192 | os_domain_ep = 'https://'+DOMAIN_ENDPOINT 193 | 194 | openSearch_ = SimiliarOpenSearchVectorSearch(index_name=DOMAIN_INDEX, 195 | embedding_function=embeddings, 196 | opensearch_url=os_domain_ep, 197 | http_auth=(DOMAIN_ADMIN_UNAME, DOMAIN_ADMIN_PW) ) 198 | 199 | openSearch_retriever = openSearch_.as_retriever() 200 | # search_type="similarity_score_threshold", 201 | # search_kwargs={ 202 | # 'k': 2, 203 | # 'score_threshold': 0.7 204 | # } 205 | 206 | 207 | 208 | 209 | 210 | #openAI LLM 211 | #llm = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY) 212 | 213 | # prompt_template = """Use the following pieces of context to answer the question at the end. 
214 | 215 | # {context} 216 | 217 | # Question: {question} 218 | # Answer:""" 219 | # PROMPT = PromptTemplate( 220 | # template=prompt_template, input_variables=["context", "question"] 221 | # ) 222 | 223 | #Sagemaker Falcon XL LLM 224 | class ContentHandler(LLMContentHandler): 225 | content_type = "application/json" 226 | accepts = "application/json" 227 | 228 | def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes: 229 | input_str = json.dumps({"inputs": prompt, "parameters":model_kwargs}) 230 | return input_str.encode("utf-8") 231 | 232 | def transform_output(self, output: bytes) -> str: 233 | #response_json = json.loads(output.read().decode("utf-8")) 234 | decode_str_output = output.read().decode("utf-8") 235 | print(type(decode_str_output)) 236 | print(len(decode_str_output)) 237 | print(decode_str_output) 238 | #return output.read().decode("utf-8") 239 | response_json = json.loads(decode_str_output) 240 | print("LLM generated text:\n" + response_json[0]["generated_text"]) 241 | return response_json[0]["generated_text"] 242 | 243 | 244 | content_handler = ContentHandler() 245 | 246 | 247 | if(('cpu'.upper()) in (LLM_ENDPOINT.upper())): 248 | params = { 249 | "max_new_tokens": 128, 250 | "num_return_sequences": 1, 251 | "top_k": 200, 252 | "top_p": 0.9, 253 | "do_sample": False, 254 | "return_full_text": False, 255 | "temperature": 0.0001 256 | } 257 | else: 258 | params = { 259 | "max_new_tokens": maxTokens_, 260 | "num_return_sequences": 1, 261 | "top_k": topK_, 262 | "top_p": topP_, 263 | "do_sample": False, 264 | "return_full_text": False, 265 | "temperature": temperature_ 266 | } 267 | 268 | 269 | llm=SagemakerEndpoint( 270 | endpoint_name=LLM_ENDPOINT, 271 | #credentials_profile_name="credentials-profile-name", 272 | region_name=REGION, 273 | model_kwargs=params, 274 | content_handler=content_handler, 275 | ) 276 | 277 | # using STUFF instead of prompt templates 278 | 279 | qa = RetrievalQA.from_chain_type( 280 | llm=llm, 281 | chain_type="stuff", 282 | retriever=openSearch_retriever, 283 | memory = memory 284 | ) 285 | 286 | # using prompt templates instead of STUFF 287 | 288 | # chain = load_qa_chain( 289 | # llm=llm, 290 | # prompt=PROMPT, 291 | # ) 292 | 293 | prompt_template = """Answer the question with the best of your knowledge, if there are no relevant answers, please answer as 'I don't know' 294 | 295 | 296 | Question: {question} 297 | Answer:""" 298 | PROMPT = PromptTemplate( 299 | template=prompt_template, input_variables=[ "question"] 300 | ) 301 | 302 | #PROMPT=PromptTemplate.from_template(prompt_template) 303 | 304 | 305 | if(searchType_ == "OpenSearch vector search"): 306 | print("Only OpenSearch as retriever=true") 307 | docs_ = openSearch_.similarity_search(json.loads(prompt)["text"]) 308 | print("opensearch results:"+docs_[0].page_content) 309 | response = docs_[0].page_content 310 | else: 311 | if(searchType_ == "Conversational Search (RAG)"): 312 | print("OpenSearch and LLM as RAG") 313 | response = qa.run( json.loads(prompt)["text"]) 314 | else: 315 | print("Only LLM as generator=true") 316 | chain = LLMChain( 317 | llm=llm, 318 | prompt=PROMPT, 319 | ) 320 | 321 | result = chain({ "question": json.loads(prompt)["text"]}, return_only_outputs=True) 322 | response = result['text'] 323 | print("responsetype") 324 | print(response) 325 | 326 | 327 | 328 | 329 | # docs_[0].page_content # chain({"input_documents": docs_, "question": prompt}, return_only_outputs=True) 330 | print("response from agent") 331 | print("response:"+response) 332 | 
print("response:"+session_id) 333 | return response, session_id 334 | 335 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/config.py: -------------------------------------------------------------------------------- 1 | 2 | from dataclasses import dataclass 3 | import boto3 4 | 5 | 6 | @dataclass(frozen=True) 7 | class Config: 8 | # openai key is expected to be saved in SecretsManager under openai-api-key name 9 | API_KEYS_SECRET_NAME = "api-keys" 10 | 11 | # Needed for reading secrets from SecretManager 12 | # See https://docs.aws.amazon.com/systems-manager/latest/userguide/ps-integration-lambda-extensions.html#ps-integration-lambda-extensions-add 13 | # for extension arn in other regions 14 | SECRETS_EXTENSION_ARN = 'arn:aws:lambda:us-east-1:177933569100:layer:AWS-Parameters-and-Secrets-Lambda-Extension:4' 15 | 16 | # Dynamo db table that stores the conversation history 17 | DYNAMODB_TABLE_NAME = "conversation-history-store" 18 | 19 | 20 | config = Config() -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/environment.yml: -------------------------------------------------------------------------------- 1 | name: conversational-search-with-opensearch-services 2 | channels: 3 | - https://conda.anaconda.org/conda-forge 4 | dependencies: 5 | - python=3.9 6 | - pip 7 | - pip: 8 | - aws-cdk-lib 9 | - streamlit 10 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/lambda_URL.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | lambda_ = boto3.client('lambda') 3 | 4 | def createLambdaURL(function_name, account_id): 5 | response_ = lambda_.add_permission( 6 | FunctionName=function_name, 7 | StatementId=function_name+'_permissions', 8 | Action="lambda:InvokeFunctionUrl", 9 | Principal=account_id, 10 | FunctionUrlAuthType='AWS_IAM' 11 | ) 12 | 13 | response = lambda_.create_function_url_config( 14 | FunctionName=function_name, 15 | AuthType='AWS_IAM', 16 | Cors={ 17 | 'AllowCredentials': True, 18 | 19 | 'AllowMethods':["*"], 20 | 'AllowOrigins': ["*"] 21 | 22 | }, 23 | InvokeMode='RESPONSE_STREAM' 24 | ) 25 | 26 | query_invoke_URL = response['FunctionUrl'] 27 | return query_invoke_URL -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/lambda_exec_role.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | 3 | def createLambdaRole(roleName, policies): 4 | 5 | iam_ = boto3.client('iam') 6 | 7 | lambda_iam_role = iam_.create_role( 8 | RoleName=roleName, 9 | AssumeRolePolicyDocument='{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Principal": {"Service": "lambda.amazonaws.com"}, "Action": "sts:AssumeRole"}]}', 10 | Description='LLMApp Lambda Permissions', 11 | 12 | ) 13 | waiter = iam_.get_waiter('role_exists') 14 | 15 | waiter.wait( 16 | RoleName=roleName, 17 | WaiterConfig={ 18 | 'Delay': 2, 19 | 'MaxAttempts': 5 20 | } 21 | ) 22 | 23 | for policy in policies: 24 | iam_.attach_role_policy( 25 | RoleName=roleName, 26 | PolicyArn='arn:aws:iam::aws:policy/'+policy 27 | ) 28 | 29 | lambda_iam_role_arn = lambda_iam_role['Role']['Arn'] 30 | return lambda_iam_role_arn -------------------------------------------------------------------------------- 
/generative-ai/Module_1_Build_Conversational_Search/lambda_function.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | lambda_ = boto3.client('lambda') 3 | 4 | def createLambdaFunction(encoders, roleARN, env_variables): 5 | 6 | iam_ = boto3.client('iam') 7 | waiter = iam_.get_waiter('role_exists') 8 | 9 | waiter.wait( 10 | RoleName=roleARN.split("/")[1], 11 | WaiterConfig={ 12 | 'Delay': 2, 13 | 'MaxAttempts': 5 14 | } 15 | ) 16 | 17 | for encoder in encoders: 18 | response = lambda_.create_function( 19 | FunctionName=encoder, 20 | Runtime='python3.9', 21 | Role=roleARN, 22 | Handler='main_'+encoder+'.handler', 23 | Code={ 24 | 25 | 'S3Bucket': 'ws-assets-prod-iad-r-pdx-f3b3f9f1a7d6a3d0', 26 | 'S3Key': '2108cfcf-6cd6-4613-83c0-db4e55998757/'+encoder+'.zip', 27 | 28 | }, 29 | Timeout=900, 30 | MemorySize=512, 31 | Environment={ 32 | 'Variables': env_variables 33 | } 34 | ) 35 | print("\n"+encoder +" Lambda Function created, ARN: "+response['FunctionArn']) 36 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/main_documentEncoder.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | from typing import Dict 4 | import requests 5 | import chain_documentEncoder 6 | import config 7 | import json 8 | import urllib.parse 9 | import boto3 10 | 11 | def handler(event, context): 12 | 13 | # body = json.loads(event["body"]) 14 | 15 | # validate_response = validate_inputs(body) 16 | # if validate_response: 17 | # return validate_response 18 | 19 | bucket = event['bucket']#event['Records'][0]['s3']['bucket']['name'] 20 | key = event['key']#urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8') 21 | 22 | 23 | # bucket = 'lambda-artifact-opensearch' 24 | # key = 'sample.pdf' 25 | 26 | print(bucket) 27 | print(key) 28 | 29 | response = chain_documentEncoder.run( 30 | 31 | bucket_ = bucket, 32 | key_=key 33 | ) 34 | 35 | print(response) 36 | 37 | 38 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/main_queryEncoder.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | from typing import Dict 4 | import requests 5 | import chain_queryEncoder 6 | import config 7 | 8 | 9 | def handler(event, context): 10 | 11 | body = json.loads(event["body"]) 12 | 13 | # validate_response = validate_inputs(body) 14 | # if validate_response: 15 | # return validate_response 16 | 17 | 18 | 19 | # prompt = "what is the recommended shard size?" 
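# A sketch of the request body this handler expects, based on
# chain_queryEncoder.run and webapp/api.py: 'prompt' is itself a JSON
# string carrying text, searchType, temperature, topK, topP and maxTokens;
# 'session_id' is "" on the first call (reuse the returned id afterwards).
# e.g. {"prompt": "{\"text\": \"...\", \"searchType\": \"Conversational Search (RAG)\", ...}", "session_id": ""}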
20 | # session_id = "7859484" 21 | 22 | prompt = body['prompt'] 23 | session_id = (body["session_id"]).strip() 24 | 25 | 26 | 27 | 28 | 29 | print(f"prompt is {prompt}") 30 | print(f"session_id is {session_id}") 31 | 32 | print(type(prompt)) 33 | 34 | if isinstance(prompt, dict): 35 | prompt = json.dumps(prompt) 36 | 37 | response, session_id = chain_queryEncoder.run( 38 | api_key=get_api_key(), 39 | session_id=session_id, 40 | prompt=prompt 41 | ) 42 | 43 | print("response inside main") 44 | print( { 45 | "response": response, 46 | "session_id": session_id 47 | }) 48 | return build_response({ 49 | "response": response, 50 | "session_id": session_id 51 | }) 52 | 53 | 54 | def validate_inputs(body: Dict): 55 | for input_name in ['prompt', 'session_id']: 56 | if input_name not in body: 57 | return build_response({ 58 | "status": "error", 59 | "message": f"{input_name} missing in payload" 60 | }) 61 | return "" 62 | 63 | def build_response(body: Dict): 64 | return { 65 | "statusCode": 200, 66 | "headers": { 67 | "Content-Type": "application/json" 68 | }, 69 | "body": json.dumps(body) 70 | } 71 | 72 | 73 | def get_api_key(): 74 | """Fetches the api keys saved in Secrets Manager""" 75 | 76 | # headers = {"X-Aws-Parameters-Secrets-Token": os.environ.get('AWS_SESSION_TOKEN')} 77 | # secrets_extension_endpoint = "http://localhost:2773" + \ 78 | # "/secretsmanager/get?secretId=" + \ 79 | # config.config.API_KEYS_SECRET_NAME 80 | 81 | # r = requests.get(secrets_extension_endpoint, headers=headers) 82 | # secret = json.loads(json.loads(r.text)["SecretString"]) 83 | 84 | return "openAI key" 85 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/module1/all_components.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/module1/all_components.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/module1/encoders.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/module1/encoders.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/module1/memory.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/module1/memory.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/module1/ml_models.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/module1/ml_models.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/module1/module1.gif:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/module1/module1.gif -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/module1/vectordb.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/module1/vectordb.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/module1/webserver.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/module1/webserver.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/requirements.txt: -------------------------------------------------------------------------------- 1 | langchain==0.0.308 2 | requests 3 | boto3==1.28.59 4 | opensearch-py 5 | PyPDF2 6 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/sample_pdfs/OpenSearch_best_practices.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/sample_pdfs/OpenSearch_best_practices.pdf -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/sample_pdfs/Sizing_Amazon_OpenSearch_Service_domains.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/sample_pdfs/Sizing_Amazon_OpenSearch_Service_domains.pdf -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/api.py: -------------------------------------------------------------------------------- 1 | import json 2 | from urllib.parse import urlparse, urlencode, parse_qs 3 | 4 | import re 5 | 6 | import requests 7 | import boto3 8 | from boto3 import Session 9 | from botocore.auth import SigV4Auth 10 | from botocore.awsrequest import AWSRequest 11 | import botocore.session 12 | 13 | 14 | def signing_headers(method, url_string, body): 15 | region = url_string.split(".")[2] 16 | url = urlparse(url_string) 17 | path = url.path or '/' 18 | querystring = '' 19 | if url.query: 20 | querystring = '?' 
+ urlencode( 21 | parse_qs(url.query, keep_blank_values=True), doseq=True) 22 | 23 | safe_url = url.scheme + '://' + url.netloc.split( 24 | ':')[0] + path + querystring 25 | request = AWSRequest(method=method.upper(), url=safe_url, data=body) 26 | 27 | session = botocore.session.Session() 28 | sigv4 = SigV4Auth(session.get_credentials(), "lambda", region) 29 | sigv4.add_auth(request) 30 | 31 | header_ = dict(request.headers.items()) 32 | 33 | header_["Content-Type"] = "application/json; charset=utf-8" 34 | 35 | print(header_) 36 | 37 | return dict(header_) 38 | 39 | def call(prompt: str, session_id: str): 40 | body = json.dumps({ 41 | "prompt": prompt, 42 | "session_id": session_id 43 | }) 44 | method = "post" 45 | url = "API_URL_TO_BE_REPLACED" 46 | #https://$query_invoke_URL_cmd.execute-api.us-east-1.amazonaws.com/prod/lambda 47 | r = requests.post(url, headers= signing_headers(method,url,body), data=body) 48 | #{"Content-Type": "application/json; charset=utf-8"} 49 | #signing_headers(method,url,body) 50 | response = json.loads(r.text) 51 | print(response) 52 | return response 53 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/app.py: -------------------------------------------------------------------------------- 1 | import streamlit as st 2 | import uuid 3 | import os 4 | import boto3 5 | import requests 6 | import api 7 | from boto3 import Session 8 | import botocore.session 9 | import json 10 | import random 11 | import string 12 | #from langchain.callbacks.base import BaseCallbackHandler 13 | 14 | 15 | USER_ICON = "/home/ec2-user/images/user.png" 16 | AI_ICON = "/home/ec2-user/images/opensearch-twitter-card.png" 17 | REGENERATE_ICON = "/home/ec2-user/images/regenerate.png" 18 | s3_bucket_ = "pdf-repo-uploads" 19 | #"pdf-repo-uploads" 20 | 21 | # Check if the user ID is already stored in the session state 22 | if 'user_id' in st.session_state: 23 | user_id = st.session_state['user_id'] 24 | print(f"User ID: {user_id}") 25 | 26 | # If the user ID is not yet stored in the session state, generate a random UUID 27 | else: 28 | user_id = str(uuid.uuid4()) 29 | st.session_state['user_id'] = user_id 30 | 31 | 32 | if 'session_id' not in st.session_state: 33 | st.session_state['session_id'] = "" 34 | 35 | if "chats" not in st.session_state: 36 | st.session_state.chats = [ 37 | { 38 | 'id': 0, 39 | 'question': '', 40 | 'answer': '' 41 | } 42 | ] 43 | 44 | if "questions" not in st.session_state: 45 | st.session_state.questions = [] 46 | 47 | if "answers" not in st.session_state: 48 | st.session_state.answers = [] 49 | 50 | if "input_text" not in st.session_state: 51 | st.session_state.input_text="" 52 | 53 | if "input_searchType" not in st.session_state: 54 | st.session_state.input_searchType = "Conversational Search (RAG)" 55 | 56 | # if "input_temperature" not in st.session_state: 57 | # st.session_state.input_temperature = "0.001" 58 | 59 | # if "input_topK" not in st.session_state: 60 | # st.session_state.input_topK = 200 61 | 62 | # if "input_topP" not in st.session_state: 63 | # st.session_state.input_topP = 0.95 64 | 65 | # if "input_maxTokens" not in st.session_state: 66 | # st.session_state.input_maxTokens = 1024 67 | 68 | 69 | def write_logo(): 70 | col1, col2, col3 = st.columns([5, 1, 5]) 71 | with col2: 72 | st.image(AI_ICON, use_column_width='always') 73 | 74 | def write_top_bar(): 75 | col1, col2, col3 = st.columns([1,10,2]) 76 | with col1: 77 | st.image(AI_ICON, 
use_column_width='always') 78 | with col2: 79 | st.subheader("Chat with your PDF using OpenSearch") 80 | with col3: 81 | clear = st.button("Clear Chat") 82 | return clear 83 | 84 | clear = write_top_bar() 85 | 86 | if clear: 87 | st.session_state.questions = [] 88 | st.session_state.answers = [] 89 | st.session_state.input_text="" 90 | # st.session_state.input_searchType="Conversational Search (RAG)" 91 | # st.session_state.input_temperature = "0.001" 92 | # st.session_state.input_topK = 200 93 | # st.session_state.input_topP = 0.95 94 | # st.session_state.input_maxTokens = 1024 95 | 96 | st.markdown('---') 97 | 98 | def handle_input(): 99 | inputs = {} 100 | for key in st.session_state: 101 | if key.startswith('input_'): 102 | inputs[key.removeprefix('input_')] = st.session_state[key] 103 | st.session_state.inputs_ = inputs 104 | 105 | #st.write(inputs) 106 | question_with_id = { 107 | 'question': inputs["text"], 108 | 'id': len(st.session_state.questions) 109 | } 110 | st.session_state.questions.append(question_with_id) 111 | st.session_state.answers.append({ 112 | 'answer': api.call(json.dumps(inputs), st.session_state['session_id']), 113 | 'search_type':inputs['searchType'], 114 | 'id': len(st.session_state.questions) 115 | }) 116 | st.session_state.input_text="" 117 | #st.session_state.input_searchType=st.session_state.input_searchType 118 | 119 | 120 | 121 | search_type = st.selectbox('Select the Search type', 122 | ('Conversational Search (RAG)', 123 | 'OpenSearch vector search', 124 | 'LLM Text Generation' 125 | ), 126 | 127 | key = 'input_searchType', 128 | help = "Select the type of retriever\n1. Conversational Search (Recommended) - This will include both OpenSearch and the LLM in the retrieval pipeline \n (note: This will put the OpenSearch response as context to the LLM to answer) \n2. OpenSearch vector search - This will put only OpenSearch's vector search in the pipeline, \n(Warning: this will lead to unformatted results)\n3. LLM Text Generation - This will include only the LLM in the pipeline, \n(Warning: This will give hallucinated and out-of-context answers)" 129 | ) 130 | 131 | col1, col2, col3, col4 = st.columns(4) 132 | 133 | with col1: 134 | st.text_input('Temperature', value = "0.001", placeholder='LLM Temperature', key = 'input_temperature',help = "Set the temperature of the Large Language model. \n Note: 1. Set this to values lower than 1, in the order of 0.001 or 0.0001; such low values reduce hallucination and creativity in the LLM response; 2. This applies only when the LLM is a part of the retriever pipeline") 135 | with col2: 136 | st.number_input('Top K', value = 200, placeholder='Top K', key = 'input_topK', step = 50, help = "This limits the LLM's predictions to the top k most probable tokens at each step of generation; this applies only when the LLM is a part of the retriever pipeline") 137 | with col3: 138 | st.number_input('Top P', value = 0.95, placeholder='Top P', key = 'input_topP', step = 0.05, help = "This sets a threshold probability and selects the top tokens whose cumulative probability exceeds the threshold while the tokens are generated by the LLM") 139 | with col4: 140 | st.number_input('Max Output Tokens', value = 500, placeholder='Max Output Tokens', key = 'input_maxTokens', step = 100, help = "This decides the total number of tokens generated as the final response. Note: Values greater than 1000 take longer response times") 141 | 142 | st.markdown('---') 143 | 144 | 145 | def write_user_message(md): 146 | col1, col2 = st.columns([0.60,12]) 147 | 148 | with col1: 149 | st.image(USER_ICON, use_column_width='always') 150 | with col2: 151 | #st.warning(md['question']) 152 | 153 | st.markdown("<div>"+md['question']+"</div>", unsafe_allow_html = True) 154 | 155 | def render_answer(answer,search_type,index): 156 | col1, col2, col3, col4 = st.columns([2,20,6,1]) 157 | with col1: 158 | st.image(AI_ICON, use_column_width='always') 159 | with col2: 160 | # chat_box=st.empty() 161 | # self.text+=token+"/" 162 | # self.container.markdown(self.text) 163 | #st.markdown(answer,unsafe_allow_html = True) 164 | st.markdown("<div>"+answer+"</div>", unsafe_allow_html = True) 165 | 166 | with col3: 167 | if(search_type== 'Conversational Search (RAG)'): 168 | st.markdown("<span><b>RAG</b></span>", unsafe_allow_html = True, help = "Retriever type of the response") 169 | if(search_type== 'OpenSearch vector search'): 170 | st.markdown("<span><b>OpenSearch</b></span>", unsafe_allow_html = True, help = "Retriever type of the response") 171 | if(search_type== 'LLM Text Generation'): 172 | st.markdown("<span><b>LLM</b></span>", unsafe_allow_html = True, help = "Retriever type of the response") 173 | 174 | print("------------------------") 175 | print(type(st.session_state)) 176 | print("------------------------") 177 | print(st.session_state) 178 | print("------------------------") 179 | 180 | with col4: 181 | if(index == len(st.session_state.questions)): 182 | 183 | rdn_key = ''.join([random.choice(string.ascii_letters) 184 | for _ in range(10)]) 185 | currentValue = st.session_state.input_searchType+st.session_state.input_temperature+str(st.session_state.input_topK)+str(st.session_state.input_topP)+str(st.session_state.input_maxTokens) 186 | oldValue = st.session_state.inputs_["searchType"]+st.session_state.inputs_["temperature"]+str(st.session_state.inputs_["topK"])+str(st.session_state.inputs_["topP"])+str(st.session_state.inputs_["maxTokens"]) 187 | 188 | def on_button_click(): 189 | if(currentValue!=oldValue): 190 | st.session_state.input_text = st.session_state.questions[-1]["question"] 191 | st.session_state.answers.pop() 192 | st.session_state.questions.pop() 193 | 194 | handle_input() 195 | with placeholder.container(): 196 | render_all() 197 | 198 | if("currentValue" in st.session_state): 199 | del st.session_state["currentValue"] 200 | 201 | try: 202 | del regenerate 203 | except: 204 | pass 205 | 206 | print("------------------------") 207 | print(st.session_state) 208 | 209 | placeholder__ = st.empty() 210 | 211 | placeholder__.button("🔄",key=rdn_key,on_click=on_button_click, help = "This will regenerate the last response with the new settings that you entered. Note: This applies only to the last response, and to see a difference in responses you should change one of the settings above")#,type="primary",use_container_width=True) 212 | 213 | #Each answer will have context of the question asked in order to associate the provided feedback with the respective question 214 | def write_chat_message(md, q,index): 215 | if('body' in md['answer']): 216 | res = json.loads(md['answer']['body']) 217 | else: 218 | res = md['answer'] 219 | st.session_state['session_id'] = res['session_id'] 220 | chat = st.container() 221 | with chat: 222 | render_answer(res["response"],md['search_type'],index) 223 | 224 | def render_all(): 225 | index = 0 226 | for (q, a) in zip(st.session_state.questions, st.session_state.answers): 227 | index = index +1 228 | print("answers----") 229 | print(a) 230 | write_user_message(q) 231 | write_chat_message(a, q,index) 232 | 233 | placeholder = st.empty() 234 | with placeholder.container(): 235 | render_all() 236 | 237 | st.markdown("") 238 | col_1, col_2, col_3 = st.columns([6,50,5]) 239 | with col_1: 240 | st.markdown("<span><b>Ask:</b></span>",unsafe_allow_html=True, help = 'Enter the questions and click on "GO"') 241 | 242 | with col_2: 243 | #st.markdown("") 244 | input = st.text_input( "Ask here",label_visibility = "collapsed",key="input_text") 245 | with col_3: 246 | #hidden = st.button("RUN",disabled=True,key = "hidden") 247 | play = st.button("GO",on_click=handle_input,key = "play") 248 | with st.sidebar: 249 | st.subheader("Sample PDF(s)") 250 | # Initialize boto3 to use the S3 client. 251 | s3_client = boto3.resource('s3') 252 | bucket=s3_client.Bucket(s3_bucket_) 253 | 254 | objects = bucket.objects.filter(Prefix="sample_pdfs/") 255 | urls = [] 256 | 257 | client = boto3.client('s3') 258 | 259 | for obj in objects: 260 | if obj.key.endswith('.pdf'): 261 | 262 | # Generate the S3 presigned URL 263 | s3_presigned_url = client.generate_presigned_url( 264 | ClientMethod='get_object', 265 | Params={ 266 | 'Bucket': s3_bucket_, 267 | 'Key': obj.key 268 | }, 269 | ExpiresIn=3600 270 | ) 271 | 272 | # Print the created S3 presigned URL 273 | print(s3_presigned_url) 274 | urls.append(s3_presigned_url) 275 | #st.write("["+obj.key.split('/')[1]+"]("+s3_presigned_url+")") 276 | st.link_button(obj.key.split('/')[1], s3_presigned_url) 277 | 278 | st.subheader("Your documents") 279 | pdf_docs = st.file_uploader( 280 | "Upload your PDFs here and click on 'Process'", accept_multiple_files=True) 281 | if st.button("Process"): 282 | with st.spinner("Processing"): 283 | for pdf_doc in pdf_docs: 284 | print(type(pdf_doc)) 285 | pdf_doc_name = (pdf_doc.name).replace(" ","_") 286 | print("aws s3 cp pdfs"+pdf_doc_name+" s3://"+s3_bucket_) 287 | with open(os.path.join("/home/ec2-user/pdfs",pdf_doc_name),"wb") as f: 288 | f.write(pdf_doc.getbuffer()) 289 | os.system("aws s3 cp /home/ec2-user/pdfs/"+pdf_doc_name+" s3://"+s3_bucket_) 290 | request_ = '{ "bucket": "'+s3_bucket_+'","key": "'+pdf_doc_name+'" }' 291 | os.system("aws lambda invoke --function-name documentEncoder --cli-binary-format raw-in-base64-out --payload '"+request_+"' response.json") 292 | print('lambda done') 293 | st.success('you can start searching on your PDF') 294 | -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/images/ai-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/images/ai-icon.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/images/opensearch-twitter-card.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/images/opensearch-twitter-card.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/images/regenerate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/images/regenerate.png
-------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/images/user-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/images/user-icon.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/images/user.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/images/user.png -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/pdfs/Amazon_OpenSearch_best_practices.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/pdfs/Amazon_OpenSearch_best_practices.pdf -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/pdfs/Sizing_Amazon_OpenSearch_Service_domain.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/pdfs/Sizing_Amazon_OpenSearch_Service_domain.pdf -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search/webapp/pdfs/Vector_database_capabilities_OpenSearch_Service.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/generative-ai/Module_1_Build_Conversational_Search/webapp/pdfs/Vector_database_capabilities_OpenSearch_Service.pdf -------------------------------------------------------------------------------- /generative-ai/Module_1_Build_Conversational_Search_Components.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "aede8483", 6 | "metadata": {}, 7 | "source": [ 8 | "# Build the Conversational Search Building Blocks" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "e93f41d1", 14 | "metadata": {}, 15 | "source": [ 16 | "
\n", 17 | "\n", 18 | "
" 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "id": "7c21bd0b", 24 | "metadata": {}, 25 | "source": [ 26 | "\n", 27 | "\n", 28 | "### In this lab, We will build the above components one by one to design an end to end conversational search application where you can simply upload a pdf and ask questions over the pdf content. The components include:\n", 29 | "\n", 30 | "* **OpenSearch** as the Vector Database\n", 31 | "* **SageMaker endpoints** to host Embedding and the large language models\n", 32 | "* **DynamoDB** as the memory store\n", 33 | "* **Lambda functions** as the Document and Query Enoders\n", 34 | "* **Ec2 instance** to host the web application\n", 35 | "\n", 36 | "---\n", 37 | "\n", 38 | "The lab includes the following steps:\n", 39 | "\n", 40 | "1. [Get the Cloudformation outputs](#Get-the-Cloudformation-outputs)\n", 41 | "2. [Component 1 : OpenSearch Vector DB](#Component-1-:-OpenSearch-Vector-DB)\n", 42 | "3. [Component 2 : Embedding and LLM Endpoints](#Component-2-:--Embedding-and-LLM-Endpoints)\n", 43 | "4. [Component 3 : Memory Store](#Component-3-:--Memory-Store)\n", 44 | "5. [Component 4 : Document and Query Encoder](#Component-4-:--Document-and-Query-Encoder)\n", 45 | "6. [Component 5 : Client WebServer](#Component-5-:-Client-WebServer)\n" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "id": "c01dbffe", 51 | "metadata": {}, 52 | "source": [ 53 | "## Get the Cloudformation outputs\n", 54 | "\n", 55 | "Here, we retrieve the services that are already deployed as a part of the cloudformation template to reduce the deployemnt time for the purpose of this lab. These services include OpenSearch cluster and the Sagemaker endpoints for the LLM and the embedding models.\n", 56 | "\n", 57 | "We also create a **env_variables** dictionary to store the parameters needed to passed onto Lambda functions (Encoders) as environment variables." 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "id": "8e34e89e", 64 | "metadata": {}, 65 | "outputs": [], 66 | "source": [ 67 | "import sagemaker, boto3, json, time\n", 68 | "from sagemaker.session import Session\n", 69 | "import subprocess\n", 70 | "from IPython.utils import io\n", 71 | "from Module_1_Build_Conversational_Search import lambda_URL, lambda_exec_role as createRole, lambda_function as createLambda\n", 72 | "\n", 73 | "cfn = boto3.client('cloudformation')\n", 74 | "response = cfn.list_stacks(StackStatusFilter=['CREATE_COMPLETE'])\n", 75 | "for cfns in response['StackSummaries']:\n", 76 | " if('semantic-search' in cfns['StackName']):\n", 77 | " stackname = cfns['StackName']\n", 78 | "\n", 79 | "cfn_outputs = cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']\n", 80 | "\n", 81 | "for output in cfn_outputs:\n", 82 | " if('s3' in output['OutputKey'].lower()):\n", 83 | " s3_bucket = output['OutputValue']\n", 84 | "\n", 85 | "aws_region = boto3.Session().region_name \n", 86 | "env_variables = {\"aws_region\":aws_region}\n", 87 | "\n", 88 | "cfn_outputs" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "316ab957", 94 | "metadata": {}, 95 | "source": [ 96 | "## Component 1 : OpenSearch Vector DB" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "e9c3a13e", 102 | "metadata": {}, 103 | "source": [ 104 | "
\n", 105 | "\n", 106 | "
" 107 | ] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "id": "806982f4", 112 | "metadata": {}, 113 | "source": [ 114 | "Here, we retrieve the Endpoint of the OpenSearch cluster from the CloudFormation outputs, pass it to the `env_variables` dictionary and also describe the cluster to see the highlevel configuration quickly." 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "id": "59187261", 121 | "metadata": {}, 122 | "outputs": [], 123 | "source": [ 124 | "for output in cfn_outputs:\n", 125 | " if('opensearch' in output['OutputKey'].lower()):\n", 126 | " env_variables[output['OutputKey']] = output['OutputValue']\n", 127 | " \n", 128 | "opensearch_ = boto3.client('opensearch')\n", 129 | "\n", 130 | "response = opensearch_.describe_domain(\n", 131 | " DomainName=env_variables['OpenSearchDomainName']\n", 132 | ")\n", 133 | "\n", 134 | "print(\"OpenSearch Version: \"+response['DomainStatus']['EngineVersion']+\"\\n\")\n", 135 | "print(\"OpenSearch Configuration\\n------------------------\\n\")\n", 136 | "print(json.dumps(response['DomainStatus']['ClusterConfig'], indent=4)) " 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "id": "b5358cdb", 142 | "metadata": {}, 143 | "source": [ 144 | "## Component 2 : Embedding and LLM Endpoints" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "id": "24f2cf0f", 150 | "metadata": {}, 151 | "source": [ 152 | "\n", 153 | "
\n", 154 | "\n", 155 | "
" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "id": "a1f87311", 161 | "metadata": {}, 162 | "source": [ 163 | "Now, we retrieve the endpoints of the LLM and the embedding models from the cloudformation outputs, pass it to the `env_variables` dictionary and also describe the endpoints to see the highlevel configuration quickly.\n" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": null, 169 | "id": "d0c828c7", 170 | "metadata": {}, 171 | "outputs": [], 172 | "source": [ 173 | "sagemaker_ = boto3.client('sagemaker')\n", 174 | "\n", 175 | "for output in cfn_outputs:\n", 176 | " if('endpointname' in output['OutputKey'].lower()):\n", 177 | " env_variables[output['OutputKey']] = output['OutputValue']\n", 178 | " print(output['OutputKey'] + \" : \"+output['OutputValue']+\"\\n\"+\"------------------------------------------------\")\n", 179 | " print(json.dumps(sagemaker_.describe_endpoint_config(EndpointConfigName = sagemaker_.describe_endpoint(\n", 180 | " EndpointName=output['OutputValue']\n", 181 | " )['EndpointConfigName'])['ProductionVariants'][0],indent = 4))\n", 182 | " " 183 | ] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "id": "b69420c4", 188 | "metadata": {}, 189 | "source": [ 190 | "## Component 3 : Memory Store\n", 191 | "\n", 192 | " " 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "id": "6a789f24", 198 | "metadata": {}, 199 | "source": [ 200 | "
\n", 201 | "\n", 202 | "
" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "id": "83bb5e2f", 208 | "metadata": {}, 209 | "source": [ 210 | "Here, We establish a DynamoDB table(`conversation-history-memory`) designated as a memory store to retain the history of conversations within the application. The `SessionId` serves as the unique identifier for each conversation entry in the table, functioning as the partition column. " 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "id": "616b4a1d", 217 | "metadata": {}, 218 | "outputs": [], 219 | "source": [ 220 | "dynamo = boto3.client('dynamodb')\n", 221 | "\n", 222 | "response = dynamo.create_table(\n", 223 | " TableName='conversation-history-memory',\n", 224 | " AttributeDefinitions=[\n", 225 | " {'AttributeName': 'SessionId', 'AttributeType': 'S'}\n", 226 | " ],\n", 227 | " KeySchema=[\n", 228 | " { 'AttributeName': 'SessionId', 'KeyType': 'HASH'}\n", 229 | " ],\n", 230 | " ProvisionedThroughput={'ReadCapacityUnits': 5,'WriteCapacityUnits': 5}\n", 231 | ")\n", 232 | "env_variables['DynamoDBTableName'] = response['TableDescription']['TableName']\n", 233 | "\n", 234 | "print(\"dynamo DB Table, '\"+response['TableDescription']['TableName']+\"' is created\")" 235 | ] 236 | }, 237 | { 238 | "cell_type": "markdown", 239 | "id": "780d8486", 240 | "metadata": {}, 241 | "source": [ 242 | "## Component 4 : Document and Query Encoder" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "id": "3163438a", 248 | "metadata": {}, 249 | "source": [ 250 | "
\n", 251 | "\n", 252 | "
" 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "id": "78aaacc9", 258 | "metadata": {}, 259 | "source": [ 260 | "In the next step, We generate Lambda functions for the document and query encoder, incorporating the Langchain module. The process involves the following steps:\n", 261 | "1. Package the dependant libraries (Langchain) and handler files for lambda functions into zip files and upload them to S3.\n", 262 | "2. Create the IAM role with sufficient permissions that can be assumed by the lambda functions.\n", 263 | "3. Construct the Python 3.9 Lambda functions, passing the previously created `env_variables` as environment variables for these functions.\n", 264 | "```\n", 265 | " { \n", 266 | " 'aws_region': 'us-west-2',\n", 267 | " 'OpenSearchDomainEndpoint': 'xxxx',\n", 268 | " 'OpenSearchDomainName': 'opensearchservi-xxxxxx',\n", 269 | " 'OpenSearchSecret': 'xxxx',\n", 270 | " 'EmbeddingEndpointName': 'opensearch-gen-ai-embedding-gpt-j-xx-xxxxx',\n", 271 | " 'LLMEndpointName': 'opensearch-gen-ai-llm-falcon-7b-xx-xx',\n", 272 | " 'DynamoDBTableName': 'conversation-history-memory'\n", 273 | " }\n", 274 | "```\n", 275 | "4. Create external Lambda URL for queryEncoder lambda to be called from the outside world" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "id": "6939b677", 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "#Get the ARN of the IAM role (deployed in cloud formation) for the lambda to assume.\n", 286 | "\n", 287 | "iam_ = boto3.client('iam')\n", 288 | "response = iam_.get_role(\n", 289 | " RoleName='LambdaRoleForEncoders'\n", 290 | ")\n", 291 | "\n", 292 | "roleARN = response['Role']['Arn']\n", 293 | "\n", 294 | "#Create Lambda functions\n", 295 | "encoders = ['queryEncoder','documentEncoder']\n", 296 | "createLambda.createLambdaFunction(encoders,roleARN,env_variables)\n", 297 | "\n", 298 | "#Create Lambda URL\n", 299 | "account_id=roleARN.split(':')[4]\n", 300 | "query_invoke_URL = lambda_URL.createLambdaURL('queryEncoder',account_id)\n", 301 | "print(\"\\nLambdaURL created, URL: \"+query_invoke_URL)" 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "id": "8fc4b86f", 307 | "metadata": {}, 308 | "source": [ 309 | "## Component 5 : Client WebServer" 310 | ] 311 | }, 312 | { 313 | "cell_type": "markdown", 314 | "id": "e9170aa9", 315 | "metadata": {}, 316 | "source": [ 317 | "
\n", 318 | "\n", 319 | "
" 320 | ] 321 | }, 322 | { 323 | "cell_type": "markdown", 324 | "id": "a9c07a61", 325 | "metadata": {}, 326 | "source": [ 327 | "## Notice\n", 328 | "\n", 329 | "To ensure security access to the provisioned resources, we use EC2 security group to limit access scope. Before you go into the final step, you need to add your current **PUBLIC IP** address to the ec2 security group so that you are able to access the web application (chat interface) that you are going to host in the next step.\n", 330 | "\n", 331 | "

Warning

\n", 332 | "

Without doing the below steps, you will not be able to proceed further.

\n", 333 | "\n", 334 | "
\n", 335 | "

Enter your IP address

\n", 336 | "

STEP 1. Get your IP address HERE. If you are connecting with VPN, we recommend you disconnect VPN first.

\n", 337 | "
\n", 338 | "\n", 339 | "

STEP 2. Run the below cell

\n", 340 | "

STEP 3. Paste the IP address in the input box that prompts you to enter your IP

\n", 341 | "

STEP 4. Press ENTER

" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": null, 347 | "id": "fc899b4a", 348 | "metadata": {}, 349 | "outputs": [], 350 | "source": [ 351 | "my_ip = (input(\"Enter your IP : \")).split(\".\")\n", 352 | "my_ip.pop()\n", 353 | "IP = \".\".join(my_ip)+\".0/24\"\n", 354 | "\n", 355 | "port_protocol = {443:'HTTPS',80:'HTTP',8501:'streamlit'}\n", 356 | "\n", 357 | "IpPermissions = []\n", 358 | "\n", 359 | "for port in port_protocol.keys():\n", 360 | " IpPermissions.append({\n", 361 | " 'FromPort': port,\n", 362 | " 'IpProtocol': 'tcp',\n", 363 | " 'IpRanges': [\n", 364 | " {\n", 365 | " 'CidrIp': IP,\n", 366 | " 'Description': port_protocol[port]+' access',\n", 367 | " },\n", 368 | " ],\n", 369 | " 'ToPort': port,\n", 370 | " })\n", 371 | "\n", 372 | "IpPermissions\n", 373 | "\n", 374 | "for output in cfn_outputs:\n", 375 | " if('securitygroupid' in output['OutputKey'].lower()):\n", 376 | " sg_id = output['OutputValue']\n", 377 | " \n", 378 | "#sg_id = 'sg-0e0d72baa90696638'\n", 379 | "\n", 380 | "ec2_ = boto3.client('ec2') \n", 381 | "\n", 382 | "response = ec2_.authorize_security_group_ingress(\n", 383 | " GroupId=sg_id,\n", 384 | " IpPermissions=IpPermissions,\n", 385 | ")\n", 386 | "\n", 387 | "print(\"\\nIngress rules added for the security group, ports:protocol - \"+json.dumps(port_protocol)+\" with my ip - \"+IP)" 388 | ] 389 | }, 390 | { 391 | "cell_type": "markdown", 392 | "id": "fb338301", 393 | "metadata": {}, 394 | "source": [ 395 | "Finally, We are ready to host our conversational search application, here we perform the following steps, Steps 2-5 are achieved by executing the terminal commands in the ec2 instance using a SSM client.\n", 396 | "1. Update the web application code files with lambda url (in [api.py](https://github.com/aws-samples/semantic-search-with-amazon-opensearch/blob/main/generative-ai/Module_1_Build_Conversational_Search/webapp/api.py)) and s3 bucket name (in [app.py](https://github.com/aws-samples/semantic-search-with-amazon-opensearch/blob/main/generative-ai/Module_1_Build_Conversational_Search/webapp/app.py))\n", 397 | "2. Archieve the application files and push to the configured s3 bucket.\n", 398 | "3. Download the application (.zip) from s3 bucket into ec2 instance (/home/ec2-user/), and uncompress it.\n", 399 | "4. We install the streamlit and boto3 dependencies inside a virtual environment inside the ec2 instance.\n", 400 | "5. Start the streamlit application." 
401 | ] 402 | }, 403 | { 404 | "cell_type": "code", 405 | "execution_count": null, 406 | "id": "5b3f35a8", 407 | "metadata": {}, 408 | "outputs": [], 409 | "source": [ 410 | "#modify the code files with lambda url and s3 bucket names\n", 411 | "query_invoke_URL_cmd = query_invoke_URL.replace(\"/\",\"\\/\")\n", 412 | "\n", 413 | "with io.capture_output() as captured:\n", 414 | " #Update the webapp files to include the s3 bucket name and the LambdaURL\n", 415 | " !sed -i 's/API_URL_TO_BE_REPLACED/{query_invoke_URL_cmd}/g' Module_1_Build_Conversational_Search/webapp/api.py\n", 416 | " !sed -i 's/pdf-repo-uploads/{s3_bucket}/g' Module_1_Build_Conversational_Search/webapp/app.py\n", 417 | " #Push the WebAPP code artefacts to s3\n", 418 | " !cd Module_1_Build_Conversational_Search/webapp && zip -r ../webapp.zip *\n", 419 | " !aws s3 cp Module_1_Build_Conversational_Search/webapp.zip s3://$s3_bucket\n", 420 | " \n", 421 | "#Get the Ec2 instance ID which is already deployed\n", 422 | "response = cfn.describe_stack_resources(\n", 423 | " StackName=stackname\n", 424 | ")\n", 425 | "for resource in response['StackResources']:\n", 426 | " if(resource['ResourceType'] == 'AWS::EC2::Instance'):\n", 427 | " ec2_instance_id = resource['PhysicalResourceId']\n", 428 | " \n", 429 | "# function to execute commands in ec2 terminal\n", 430 | "def execute_commands_on_linux_instances(client, commands):\n", 431 | " resp = client.send_command(\n", 432 | " DocumentName=\"AWS-RunShellScript\", # One of AWS' preconfigured documents\n", 433 | " Parameters={'commands': commands},\n", 434 | " InstanceIds=[ec2_instance_id],\n", 435 | " )\n", 436 | " return resp['Command']['CommandId']\n", 437 | "\n", 438 | "ssm_client = boto3.client('ssm') \n", 439 | "\n", 440 | "commands = [\n", 441 | " 'aws s3 cp s3://'+s3_bucket+'/webapp.zip /home/ec2-user/',\n", 442 | " 'unzip -o /home/ec2-user/webapp.zip -d /home/ec2-user/' , \n", 443 | " 'sudo chmod -R 0777 /home/ec2-user/',\n", 444 | " 'aws s3 cp /home/ec2-user/pdfs s3://'+s3_bucket+'/sample_pdfs/ --recursive',\n", 445 | " 'python3 -m venv /home/ec2-user/.myenv',\n", 446 | " 'source /home/ec2-user/.myenv/bin/activate',\n", 447 | " 'pip install streamlit',\n", 448 | " 'pip install boto3',\n", 449 | " \n", 450 | " #start the web applicaiton\n", 451 | " 'streamlit run /home/ec2-user/app.py',\n", 452 | " ]\n", 453 | "\n", 454 | "command_id = execute_commands_on_linux_instances(ssm_client, commands)\n", 455 | "\n", 456 | "ec2_ = boto3.client('ec2')\n", 457 | "response = ec2_.describe_instances(\n", 458 | " InstanceIds=[ec2_instance_id]\n", 459 | ")\n", 460 | "public_ip = response['Reservations'][0]['Instances'][0]['PublicIpAddress']\n", 461 | "print(\"Please wait while the application is being hosted . . .\")\n", 462 | "time.sleep(10)\n", 463 | "print(\"\\nApplication hosted successfully\")\n", 464 | "print(\"\\nClick the below URL to open the application. It may take up to a minute or two to start the application, Please keep refreshing the page if you are seeing connection error.\\n\")\n", 465 | "print('http://'+public_ip+\":8501\")\n", 466 | "print(\"\\nCheck the below video on how to interact with the application\")" 467 | ] 468 | }, 469 | { 470 | "cell_type": "markdown", 471 | "id": "ead82474", 472 | "metadata": {}, 473 | "source": [ 474 | "## Search Type\n", 475 | "\n", 476 | "You can use 3 different search type in this lab, you can compare the difference of search type:\n", 477 | "* **LLM Text Generation**: The search result is purely from LLM generation. 
Without domain knowledge, the LLM may generate plausible-sounding but incorrect results; this is the hallucination limitation of LLMs.\n", 478 | "* **OpenSearch vector search**: The search result comes from OpenSearch semantic search. It matches the most semantically relevant documents in the vector DB and returns the original documents from the OpenSearch vector store.\n", 479 | "* **Conversational Search**: The search result is RAG-generated content. It first uses semantic search to match documents in the OpenSearch vector store, then combines the relevant documents and the original question into a prompt for the LLM to generate the result.\n", 480 | "\n", 481 | "## Hyperparameters\n", 482 | "\n", 483 | "Several hyperparameters for large language models (LLMs) can be adjusted to tune content generation to specific requirements:\n", 484 | "\n", 485 | "* **Temperature**: Large language models use probability to construct the words in a sequence. For any given sequence, there is a probability distribution of options for the next word in the sequence. When you set the temperature closer to zero, the model tends to select the higher-probability words. When you set the temperature further away from zero, the model may select a lower-probability word.\n", 486 | "\n", 487 | "* **Top K**: Temperature defines the probability distribution of potential words, and Top K defines the cutoff where the model no longer selects words. For example, if K=50, the model selects from the 50 most probable words that could be next in a given sequence. Lowering the Top K value reduces the probability that an unusual word gets selected next in a sequence.\n", 488 | "\n", 489 | "* **Top P**: Top P defines a cutoff based on the sum of probabilities of the potential choices. If you set Top P below 1.0, the model considers only the most probable options and ignores less probable ones. Top P is similar to Top K, but instead of capping the number of choices, it caps choices based on the sum of their probabilities.\n", 490 | "\n", 491 | "* **Max Output Tokens**: Configures the maximum number of tokens to use in the generated response.\n", 492 | "\n" 493 | ] 494 | }, 495 | { 496 | "cell_type": "markdown", 497 | "id": "3609bde4", 498 | "metadata": {}, 499 | "source": [ 500 | "
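To make these knobs concrete, the toy sampler below applies temperature scaling, then Top K and Top P truncation, to a hand-made next-token distribution. It illustrates the general sampling technique only; it is not the code any particular model endpoint runs:

```python
import math, random

def sample(logits, temperature=1.0, top_k=0, top_p=1.0):
    # Temperature: divide logits, then softmax. Lower temperature sharpens
    # the distribution toward the highest-probability tokens.
    scaled = {t: l / max(temperature, 1e-6) for t, l in logits.items()}
    m = max(scaled.values())
    probs = {t: math.exp(l - m) for t, l in scaled.items()}
    total = sum(probs.values())
    probs = {t: p / total for t, p in probs.items()}

    # Rank candidates by probability, most likely first.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # Top K: keep only the k most probable tokens.
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top P: keep the smallest prefix whose probabilities sum to top_p.
    if top_p < 1.0:
        kept, cum = [], 0.0
        for t, p in ranked:
            kept.append((t, p))
            cum += p
            if cum >= top_p:
                break
        ranked = kept

    # Renormalize the surviving candidates and sample one token.
    total = sum(p for _, p in ranked)
    tokens = [t for t, _ in ranked]
    weights = [p / total for _, p in ranked]
    return random.choices(tokens, weights=weights)[0]

toy_logits = {"the": 3.0, "a": 2.0, "shard": 1.0, "banana": -1.0}
print(sample(toy_logits, temperature=0.5, top_k=3, top_p=0.9))
```

With temperature 0.5, "the" dominates; raising the temperature or loosening Top K/Top P lets less likely tokens such as "banana" through occasionally.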

### Play with the chat application\n", 501 | "\n", 502 | "\n", 503 | "![Demo: interacting with the chat application](module1/module1.gif)
" 504 | ] 505 | }, 506 | { 507 | "cell_type": "code", 508 | "execution_count": null, 509 | "id": "e52b700b", 510 | "metadata": {}, 511 | "outputs": [], 512 | "source": [] 513 | } 514 | ], 515 | "metadata": { 516 | "kernelspec": { 517 | "display_name": "conda_pytorch_p310", 518 | "language": "python", 519 | "name": "conda_pytorch_p310" 520 | }, 521 | "language_info": { 522 | "codemirror_mode": { 523 | "name": "ipython", 524 | "version": 3 525 | }, 526 | "file_extension": ".py", 527 | "mimetype": "text/x-python", 528 | "name": "python", 529 | "nbconvert_exporter": "python", 530 | "pygments_lexer": "ipython3", 531 | "version": "3.10.13" 532 | } 533 | }, 534 | "nbformat": 4, 535 | "nbformat_minor": 5 536 | } 537 | -------------------------------------------------------------------------------- /image/module3/document_loader: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module3/document_loader -------------------------------------------------------------------------------- /image/module3/workflow: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module3/workflow -------------------------------------------------------------------------------- /image/module8/conversation-final-answer.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module8/conversation-final-answer.png -------------------------------------------------------------------------------- /image/module8/conversation-new-question.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module8/conversation-new-question.png -------------------------------------------------------------------------------- /image/module8/document-loader.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module8/document-loader.png -------------------------------------------------------------------------------- /image/module8/enable-models-bedrock.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module8/enable-models-bedrock.gif -------------------------------------------------------------------------------- /image/module8/rag-with-memory.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module8/rag-with-memory.png -------------------------------------------------------------------------------- /image/module8/rag.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/image/module8/rag.png -------------------------------------------------------------------------------- /inference.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import logging 3 | import sagemaker_containers 4 | import requests 5 | 6 | import os 7 | import json 8 | import io 9 | import time 10 | import torch 11 | from transformers import AutoTokenizer, AutoModel 12 | # from sentence_transformers import models, losses, SentenceTransformer 13 | 14 | logger = logging.getLogger(__name__) 15 | logger.setLevel(logging.DEBUG) 16 | 17 | #Mean Pooling - Take attention mask into account for correct averaging 18 | def mean_pooling(model_output, attention_mask): 19 | token_embeddings = model_output[0] #First element of model_output contains all token embeddings 20 | input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float() 21 | sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1) 22 | sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9) 23 | return sum_embeddings / sum_mask 24 | 25 | def embed_tformer(model, tokenizer, sentences): 26 | encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=256, return_tensors='pt') 27 | 28 | #Compute token embeddings 29 | with torch.no_grad(): 30 | model_output = model(**encoded_input) 31 | 32 | sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask']) 33 | return sentence_embeddings 34 | 35 | def model_fn(model_dir): 36 | logger.info('model_fn') 37 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 38 | logger.info(model_dir) 39 | tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens") 40 | nlp_model = AutoModel.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens") 41 | nlp_model.to(device) 42 | model = {'model':nlp_model, 'tokenizer':tokenizer} 43 | 44 | # model = SentenceTransformer(model_dir + '/transformer/') 45 | # logger.info(model) 46 | return model 47 | 48 | # Deserialize the Invoke request body into an object we can perform prediction on 49 | def input_fn(serialized_input_data, content_type='text/plain'): 50 | logger.info('Deserializing the input data.') 51 | try: 52 | data = [serialized_input_data.decode('utf-8')] 53 | return data 54 | except: 55 | raise Exception('Requested unsupported ContentType in content_type: {}'.format(content_type)) 56 | 57 | # Perform prediction on the deserialized object, with the loaded model 58 | def predict_fn(input_object, model): 59 | logger.info("Calling model") 60 | start_time = time.time() 61 | sentence_embeddings = embed_tformer(model['model'], model['tokenizer'], input_object) 62 | print("--- Inference time: %s seconds ---" % (time.time() - start_time)) 63 | response = sentence_embeddings[0].tolist() 64 | return response 65 | 66 | # Serialize the prediction result into the desired response content type 67 | def output_fn(prediction, accept): 68 | logger.info('Serializing the generated output.') 69 | if accept == 'application/json': 70 | output = json.dumps(prediction) 71 | return output 72 | raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept)) 73 | -------------------------------------------------------------------------------- /keyword_search.png: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/keyword_search.png -------------------------------------------------------------------------------- /model/all-MiniLM-L6-v2_torchscript.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "all-MiniLM-L6-v2", 3 | "version": "1.1.0", 4 | "description": "semantic search model", 5 | "model_format": "TORCH_SCRIPT", 6 | "model_config": { 7 | "model_type": "bert", 8 | "embedding_dimension": 384, 9 | "framework_type": "sentence_transformers" 10 | } 11 | } -------------------------------------------------------------------------------- /nlp_bert.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/nlp_bert.png -------------------------------------------------------------------------------- /rag.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/rag.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | sentence-transformers 2 | -------------------------------------------------------------------------------- /semantic_search.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/semantic_search.png -------------------------------------------------------------------------------- /semantic_search_fullstack.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/semantic_search_fullstack.jpg -------------------------------------------------------------------------------- /semantic_search_with_fine_tuning.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/semantic_search_with_fine_tuning.png -------------------------------------------------------------------------------- /static/conversation-final-answer.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/conversation-final-answer.png -------------------------------------------------------------------------------- /static/conversation-new-question.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/conversation-new-question.png -------------------------------------------------------------------------------- /static/data_ingestion.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/data_ingestion.png -------------------------------------------------------------------------------- /static/document-loader.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/document-loader.png -------------------------------------------------------------------------------- /static/enable-models-bedrock.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/enable-models-bedrock.gif -------------------------------------------------------------------------------- /static/huggingface.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/huggingface.png -------------------------------------------------------------------------------- /static/huggingfact-SBERT.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/huggingfact-SBERT.jpeg -------------------------------------------------------------------------------- /static/rag-with-memory.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/rag-with-memory.png -------------------------------------------------------------------------------- /static/rag.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/rag.png -------------------------------------------------------------------------------- /static/retrieveAPI.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/retrieveAPI.png -------------------------------------------------------------------------------- /static/retrieveAndGenerate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/retrieveAndGenerate.png -------------------------------------------------------------------------------- /static/sbert.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/static/sbert.jpeg -------------------------------------------------------------------------------- /word2vec.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/semantic-search-with-amazon-opensearch/6d7f5d0e517ae7f1a8f58144096dd807cead81be/word2vec.png 
--------------------------------------------------------------------------------