├── DotNet ├── SqlClient │ ├── .gitignore │ ├── Properties │ │ └── launchSettings.json │ ├── SqlServer.NativeVectorSearch.Samples.csproj │ ├── .env.example │ ├── SqlServer.NativeVectorSearch.Samples.sln │ └── ReadMe.md ├── Dapper │ ├── .vscode │ │ └── extensions.json │ ├── app │ │ ├── .env.sample │ │ ├── TypeHandler.cs │ │ ├── DapperVectors.csproj │ │ ├── Model.cs │ │ ├── content.json │ │ ├── EmbeddingClient.cs │ │ └── Program.cs │ ├── db │ │ ├── Blogs.sql │ │ ├── Posts.sql │ │ └── BlogDB.sqlproj │ └── README.md ├── EF-Core-10 │ ├── .env.sample │ ├── .vscode │ │ └── launch.json │ ├── README.md │ ├── EFCore10Vectors.csproj │ ├── EFCore10Vectors.sln │ ├── EmbeddingClient.cs │ ├── content.json │ ├── Model.cs │ ├── Program.cs │ └── Migrations │ │ ├── 20250910190225_InitialCreate.cs │ │ ├── BloggingContextModelSnapshot.cs │ │ └── 20250910190225_InitialCreate.Designer.cs ├── EF-Core-9 │ ├── .env.sample │ ├── .vscode │ │ └── launch.json │ ├── README.md │ ├── EFCore9Vectors.csproj │ ├── EFCore9Vectors.sln │ ├── EmbeddingClient.cs │ ├── content.json │ ├── Model.cs │ ├── Program.cs │ └── Migrations │ │ ├── 20241030213329_InitialCreate.cs │ │ ├── BloggingContextModelSnapshot.cs │ │ └── 20241030213329_InitialCreate.Designer.cs ├── SqlBulkCopy │ ├── .env.sample │ ├── script.sql │ ├── SqlBulkCopy.csproj │ ├── Program.cs │ └── EmbeddingClient.cs └── ReadMe.md ├── DiskANN ├── .gitignore ├── .vscode │ └── extensions.json ├── Wikipedia │ ├── 003-wikipedia-fulltext-setup.sql │ ├── 999-wikipedia-vectorizer.sql │ ├── 001-wikipedia-diskann-setup.sql │ ├── 004-wikipedia-hybrid-search.sql │ ├── 005-wikipedia-fp16.sql │ └── 002-wikipedia-diskann-test.sql ├── README.md ├── diskann-quickstart-sql-server-2025.sql └── diskann-quickstart-azure-sql.sql ├── .gitattributes ├── Datasets ├── .gitattributes ├── ResumeData │ ├── 10089434.pdf │ ├── 10247517.pdf │ ├── 10265057.pdf │ ├── 10553553.pdf │ ├── 10641230.pdf │ ├── 10839851.pdf │ ├── 10840430.pdf │ ├── 11580408.pdf │ ├── 11584809.pdf │ ├── 11957080.pdf │ └── Readme.md ├── Reviews.csv ├── FineFoodEmbeddings.csv ├── Rapid RAG for your existing Applications.docx └── Rapid RAG for Unstructured Data - SQL Demo 2.docx ├── Hybrid-Search ├── requirements.txt ├── .env.sample ├── 00-setup-database.sql ├── utilities.py ├── README.md └── hybrid_search.py ├── Assets ├── endpoint.png ├── NVIDIA_embed.png ├── importwizard.png ├── importsuccess.png ├── modeldeployment.png ├── wizardpreview.png ├── docintelendpoint.png ├── NVIDIA_key_endpoint.png ├── embedding-deployment.png └── importwizarddatatypes.png ├── CHANGELOG.md ├── 5-Min-RAG-SQL-Accelerator ├── Step2-Deploy-RAG-App │ ├── RAG_structured-docs │ │ ├── requirements.txt │ │ └── Readme.md │ ├── RAG-unstructured-docs │ │ ├── requirements.txt │ │ └── Readme.md │ └── README.md ├── Readme.md └── Step1-OneClick-Deployment │ ├── SQL_deployment.json │ ├── OpenAI_deployment.json │ ├── README.md │ └── RAG_deployment.json ├── NVIDIA-AI-SQL ├── CreateTable.sql ├── .gitignore ├── .env.sample └── README.md ├── RAG-with-Documents ├── CreateTable.sql ├── .gitignore ├── requirements.txt ├── .env.sample └── Readme.md ├── Langchain-SQL-RAG ├── .gitignore ├── .env.sample └── readme.md ├── Retrieval-Augmented-Generation ├── requirements.txt ├── CreateTable.sql ├── .env.sample └── Readme.md ├── SemanticKernel └── dotnet │ ├── VectorStoreSample │ ├── .env.example │ ├── VectorStoreSample.csproj │ ├── VectorStoreSample.sln │ └── Program.cs │ ├── MemoryStoreSample │ ├── .env.example │ ├── MemoryStoreSample.csproj │ ├── MemoryStoreSample.sln │ └── 
Program.cs │ └── README.md ├── .github ├── CODE_OF_CONDUCT.md ├── ISSUE_TEMPLATE.md └── PULL_REQUEST_TEMPLATE.md ├── Vector-Search └── Readme.md ├── Embeddings └── T-SQL │ ├── 01-store-openai-credentials.sql │ ├── 03-get-embedding-sample.sql │ ├── 02-create-get-embedding-procedure.sql │ ├── 04-update-embeddings-with-trigger.sql │ └── README.md ├── LICENSE.md ├── CONTRIBUTING.md └── README.md /DotNet/SqlClient/.gitignore: -------------------------------------------------------------------------------- 1 | *.txt -------------------------------------------------------------------------------- /DiskANN/.gitignore: -------------------------------------------------------------------------------- 1 | *.csv 2 | *.local.sql -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | *.csv filter=lfs diff=lfs merge=lfs -text 2 | -------------------------------------------------------------------------------- /Datasets/.gitattributes: -------------------------------------------------------------------------------- 1 | *.csv filter=lfs diff=lfs merge=lfs -text 2 | -------------------------------------------------------------------------------- /Hybrid-Search/requirements.txt: -------------------------------------------------------------------------------- 1 | python-dotenv 2 | pyodbc 3 | azure-identity 4 | sentence-transformers 5 | -------------------------------------------------------------------------------- /DiskANN/.vscode/extensions.json: -------------------------------------------------------------------------------- 1 | { 2 | "recommendations": [ 3 | "ms-mssql.mssql" 4 | ] 5 | } -------------------------------------------------------------------------------- /Assets/endpoint.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/endpoint.png -------------------------------------------------------------------------------- /Assets/NVIDIA_embed.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/NVIDIA_embed.png -------------------------------------------------------------------------------- /Assets/importwizard.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/importwizard.png -------------------------------------------------------------------------------- /Assets/importsuccess.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/importsuccess.png -------------------------------------------------------------------------------- /Assets/modeldeployment.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/modeldeployment.png -------------------------------------------------------------------------------- /Assets/wizardpreview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/wizardpreview.png 
-------------------------------------------------------------------------------- /Assets/docintelendpoint.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/docintelendpoint.png -------------------------------------------------------------------------------- /Assets/NVIDIA_key_endpoint.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/NVIDIA_key_endpoint.png -------------------------------------------------------------------------------- /Assets/embedding-deployment.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/embedding-deployment.png -------------------------------------------------------------------------------- /Assets/importwizarddatatypes.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Assets/importwizarddatatypes.png -------------------------------------------------------------------------------- /Datasets/ResumeData/10089434.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/10089434.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/10247517.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/10247517.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/10265057.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/10265057.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/10553553.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/10553553.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/10641230.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/10641230.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/10839851.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/10839851.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/10840430.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/10840430.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/11580408.pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/11580408.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/11584809.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/11584809.pdf -------------------------------------------------------------------------------- /Datasets/ResumeData/11957080.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/ResumeData/11957080.pdf -------------------------------------------------------------------------------- /DotNet/Dapper/.vscode/extensions.json: -------------------------------------------------------------------------------- 1 | { 2 | "recommendations": [ 3 | "ms-mssql.mssql", 4 | "ms-mssql.sql-database-projects-vscode" 5 | ] 6 | } -------------------------------------------------------------------------------- /Datasets/Reviews.csv: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:a88348c20ed3f85d647e8fbaac0a730ab2f09f95e5d1f4bcf1f9e3650ef624d7 3 | size 300904694 4 | -------------------------------------------------------------------------------- /Datasets/FineFoodEmbeddings.csv: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:fc3fb797e3a8520a06a8293dc1009bd3a529253b4959a0472a57dbf6f9eeb1a9 3 | size 217236142 4 | -------------------------------------------------------------------------------- /Datasets/Rapid RAG for your existing Applications.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/Rapid RAG for your existing Applications.docx -------------------------------------------------------------------------------- /Datasets/Rapid RAG for Unstructured Data - SQL Demo 2.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/HEAD/Datasets/Rapid RAG for Unstructured Data - SQL Demo 2.docx -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## [project-title] Changelog 2 | 3 | 4 | # x.y.z (yyyy-mm-dd) 5 | 6 | *Features* 7 | * ... 8 | 9 | *Bug Fixes* 10 | * ... 11 | 12 | *Breaking Changes* 13 | * ... 
14 | -------------------------------------------------------------------------------- /Hybrid-Search/.env.sample: -------------------------------------------------------------------------------- 1 | MSSQL='Driver={ODBC Driver 18 for SQL Server};Server=tcp:.database.windows.net,1433;Database=;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;LongAsMax=yes;' 2 | -------------------------------------------------------------------------------- /5-Min-RAG-SQL-Accelerator/Step2-Deploy-RAG-App/RAG_structured-docs/requirements.txt: -------------------------------------------------------------------------------- 1 | streamlit==1.45.0 2 | pandas==2.2.3 3 | requests==2.32.3 4 | python-dotenv==1.1.0 5 | openai==1.72.0 6 | pyodbc==5.2.0 7 | azure-identity==1.21.0 8 | -------------------------------------------------------------------------------- /NVIDIA-AI-SQL/CreateTable.sql: -------------------------------------------------------------------------------- 1 | CREATE TABLE resumedocs ( 2 | id INT IDENTITY(1,1) PRIMARY KEY, 3 | chunkid NVARCHAR(255), 4 | filename NVARCHAR(255), 5 | chunk NVARCHAR(MAX), 6 | embedding VECTOR(1536) 7 | ); 8 | -------------------------------------------------------------------------------- /RAG-with-Documents/CreateTable.sql: -------------------------------------------------------------------------------- 1 | CREATE TABLE resumedocs ( 2 | id INT IDENTITY(1,1) PRIMARY KEY, 3 | chunkid NVARCHAR(255), 4 | filename NVARCHAR(255), 5 | chunk NVARCHAR(MAX), 6 | embedding VECTOR(1536) 7 | ); 8 | -------------------------------------------------------------------------------- /Langchain-SQL-RAG/.gitignore: -------------------------------------------------------------------------------- 1 | *.pdf 2 | *.docx 3 | *.doc 4 | *.xls 5 | *.xlsx 6 | *.ppt 7 | *.pptx 8 | *.txt 9 | *.csv 10 | *.jpg 11 | *.jpeg 12 | *.png 13 | *.gif 14 | *.bmp 15 | *.tif 16 | *.tiff 17 | *.svg 18 | *.eps 19 | *.ai 20 | *.psd 21 | *.env -------------------------------------------------------------------------------- /RAG-with-Documents/.gitignore: -------------------------------------------------------------------------------- 1 | *.pdf 2 | *.docx 3 | *.doc 4 | *.xls 5 | *.xlsx 6 | *.ppt 7 | *.pptx 8 | *.txt 9 | *.csv 10 | *.jpg 11 | *.jpeg 12 | *.png 13 | *.gif 14 | *.bmp 15 | *.tif 16 | *.tiff 17 | *.svg 18 | *.eps 19 | *.ai 20 | *.psd 21 | -------------------------------------------------------------------------------- /Retrieval-Augmented-Generation/requirements.txt: -------------------------------------------------------------------------------- 1 | pyodbc 2 | python-dotenv 3 | openai 4 | num2words 5 | matplotlib 6 | plotly 7 | scipy 8 | scikit-learn 9 | pandas 10 | tiktoken 11 | tokenizer 12 | azure-identity 13 | PrettyTable 14 | nltk 15 | -------------------------------------------------------------------------------- /DotNet/Dapper/app/.env.sample: -------------------------------------------------------------------------------- 1 | MSSQL="Data Source=.database.windows.net;Initial Catalog=;Authentication=Active Directory Default;Connection Timeout=30" 2 | OPENAI_KEY="" 3 | OPENAI_URL="https://.openai.azure.com/" 4 | OPENAI_DEPLOYMENT_NAME="text-embedding-3-small" -------------------------------------------------------------------------------- /DotNet/EF-Core-10/.env.sample: -------------------------------------------------------------------------------- 1 | MSSQL="Data Source=.database.windows.net;Initial Catalog=;Authentication=Active Directory Default;Connection Timeout=30" 2 | OPENAI_KEY="" 
3 | OPENAI_URL="https://.openai.azure.com/" 4 | OPENAI_DEPLOYMENT_NAME="text-embedding-3-small" -------------------------------------------------------------------------------- /DotNet/EF-Core-9/.env.sample: -------------------------------------------------------------------------------- 1 | MSSQL="Data Source=.database.windows.net;Initial Catalog=;Authentication=Active Directory Default;Connection Timeout=30" 2 | OPENAI_KEY="" 3 | OPENAI_URL="https://.openai.azure.com/" 4 | OPENAI_DEPLOYMENT_NAME="text-embedding-3-small" -------------------------------------------------------------------------------- /NVIDIA-AI-SQL/.gitignore: -------------------------------------------------------------------------------- 1 | *.pdf 2 | *.docx 3 | *.doc 4 | *.xls 5 | *.xlsx 6 | *.ppt 7 | *.pptx 8 | *.txt 9 | *.csv 10 | *.jpg 11 | *.jpeg 12 | *.png 13 | *.gif 14 | *.bmp 15 | *.tif 16 | *.tiff 17 | *.svg 18 | *.eps 19 | *.ai 20 | *.psd 21 | *.env 22 | -------------------------------------------------------------------------------- /DotNet/SqlBulkCopy/.env.sample: -------------------------------------------------------------------------------- 1 | MSSQL="Data Source=.database.windows.net;Initial Catalog=;Authentication=Active Directory Default;Connection Timeout=30" 2 | OPENAI_KEY="" 3 | OPENAI_URL="https://.openai.azure.com/" 4 | OPENAI_DEPLOYMENT_NAME="text-embedding-3-small" -------------------------------------------------------------------------------- /DotNet/Dapper/db/Blogs.sql: -------------------------------------------------------------------------------- 1 | CREATE TABLE [dbo].[Blogs] 2 | ( 3 | [BlogId] INT IDENTITY(1,1) NOT NULL PRIMARY KEY, 4 | [Name] NVARCHAR(200) NOT NULL, 5 | [Url] NVARCHAR(400) NOT NULL 6 | ); 7 | GO 8 | 9 | CREATE UNIQUE INDEX IX_Blogs_Name ON [dbo].[Blogs]([Name]); 10 | GO -------------------------------------------------------------------------------- /5-Min-RAG-SQL-Accelerator/Step2-Deploy-RAG-App/RAG-unstructured-docs/requirements.txt: -------------------------------------------------------------------------------- 1 | streamlit==1.45.0 2 | pandas==2.2.3 3 | requests==2.32.3 4 | tiktoken==0.9.0 5 | pyodbc==5.2.0 6 | azure-ai-formrecognizer==3.3.3 7 | azure-core==1.33.0 8 | azure-identity==1.21.0 9 | openai==1.72.0 10 | -------------------------------------------------------------------------------- /DotNet/SqlClient/Properties/launchSettings.json: -------------------------------------------------------------------------------- 1 | { 2 | "profiles": { 3 | "SqlServer.NativeVectorSearch.Samples": { 4 | "commandName": "Project", 5 | "environmentVariables": { 6 | "ApiKey": "", 7 | "EmbeddingModelName": "", 8 | "SqlConnStr": "" 9 | } 10 | } 11 | } 12 | } -------------------------------------------------------------------------------- /DotNet/SqlBulkCopy/script.sql: -------------------------------------------------------------------------------- 1 | drop table if exists dbo.[SqlBulkCopyEmbedding]; 2 | create table dbo.[SqlBulkCopyEmbedding] 3 | ( 4 | Id int, 5 | Embedding vector(1536), 6 | [Description] nvarchar(max) 7 | ) 8 | go 9 | 10 | -- Run the SqlBulkCopySample then 11 | select * from dbo.[SqlBulkCopyEmbedding] 12 | go 13 | 14 | -------------------------------------------------------------------------------- /RAG-with-Documents/requirements.txt: -------------------------------------------------------------------------------- 1 | tiktoken 2 | tokenizer 3 | azure-ai-documentintelligence 4 | azure-ai-formrecognizer 5 | azure-identity 6 | azure-core 7 | 
azure-search-documents==11.6.0b3 8 | python-dotenv 9 | openai 10 | numpy 11 | pyodbc 12 | num2words 13 | matplotlib 14 | plotly 15 | scipy 16 | scikit-learn 17 | pandas 18 | tokenizer 19 | PrettyTable 20 | nltk 21 | -------------------------------------------------------------------------------- /SemanticKernel/dotnet/VectorStoreSample/.env.example: -------------------------------------------------------------------------------- 1 | AZURE_OPENAI_ENDPOINT="https://.openai.azure.com/" 2 | AZURE_OPENAI_API_KEY="" 3 | AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-3-small" 4 | AZURE_SQL_CONNECTION_STRING="Data Source=.database.windows.net;Initial Catalog=;Authentication=Active Directory Default;Connection Timeout=30" 5 | -------------------------------------------------------------------------------- /Retrieval-Augmented-Generation/CreateTable.sql: -------------------------------------------------------------------------------- 1 | CREATE TABLE [dbo].[embeddings] 2 | ( 3 | [Id] [bigint] NULL, 4 | [ProductId] [nvarchar](500) NULL, 5 | [UserId] [nvarchar](50) NULL, 6 | [Score] [bigint] null, 7 | [Summary] [nvarchar](max) NULL, 8 | [Text] [nvarchar](max) NULL, 9 | [Combined] [nvarchar](max) NULL, 10 | [Vector] [vector](1536) NULL 11 | ) 12 | GO 13 | -------------------------------------------------------------------------------- /SemanticKernel/dotnet/MemoryStoreSample/.env.example: -------------------------------------------------------------------------------- 1 | AZURE_OPENAI_ENDPOINT="https://.openai.azure.com/" 2 | AZURE_OPENAI_API_KEY="" 3 | AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-3-small" 4 | AZURE_OPENAI_CHAT_MODEL="gpt-4o" 5 | AZURE_SQL_CONNECTION_STRING="Data Source=.database.windows.net;Initial Catalog=;Authentication=Active Directory Default;Connection Timeout=30" 6 | -------------------------------------------------------------------------------- /Datasets/ResumeData/Readme.md: -------------------------------------------------------------------------------- 1 | ## Dataset Information 2 | 3 | The original dataset is available on https://www.kaggle.com/datasets/snehaanbhawal/resume-dataset and contains **more than 120 resumes** in the Information Technology sector. 4 | 5 | For this project, we have included **10 sample resumes** from the original dataset for **testing purposes**. These samples are intended to demonstrate the functionality of the application without requiring the full dataset. 
6 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/.vscode/launch.json: -------------------------------------------------------------------------------- 1 | { 2 | "version": "0.2.0", 3 | "configurations": [ 4 | { 5 | "name": "C#: EFCore10Vectors", 6 | "type": "dotnet", 7 | "request": "launch", 8 | "projectPath": "${workspaceFolder}\\EFCore10Vectors.csproj", 9 | "launchConfigurationId": "TargetFramework=;EFCore10Vectors", 10 | "envFile": "${workspaceFolder}\\.env" 11 | } 12 | ] 13 | } -------------------------------------------------------------------------------- /DotNet/EF-Core-9/.vscode/launch.json: -------------------------------------------------------------------------------- 1 | { 2 | "version": "0.2.0", 3 | "configurations": [ 4 | { 5 | "name": "C#: EFCore9Vectors", 6 | "type": "dotnet", 7 | "request": "launch", 8 | "projectPath": "${workspaceFolder}\\EFCore9Vectors.csproj", 9 | "launchConfigurationId": "TargetFramework=;EFCore9Vectors", 10 | "envFile": "${workspaceFolder}\\.env" 11 | } 12 | ] 13 | } -------------------------------------------------------------------------------- /DotNet/ReadMe.md: -------------------------------------------------------------------------------- 1 | # Welcome to .NET/C# Native Vector Search Samples 2 | 3 | This folder contains samples demonstrating how to use Native Vector Search in C#. 4 | 5 | - [SQL Server Native Vector Search with Entity Framework 9](./EF-Core-9/README.md) 6 | - [SQL Server Native Vector Search with Entity Framework 10](./EF-Core-10/README.md) 7 | - [SQL Server Native Vector Search with Dapper](./Dapper/README.md) 8 | - [SQL Server Native Vector Search with SqlClient](./SqlClient/README.md) 9 | -------------------------------------------------------------------------------- /DotNet/Dapper/db/Posts.sql: -------------------------------------------------------------------------------- 1 | CREATE TABLE [dbo].[Posts] 2 | ( 3 | [PostId] INT IDENTITY(1,1) NOT NULL PRIMARY KEY, 4 | [Title] NVARCHAR(400) NOT NULL, 5 | [Content] NVARCHAR(MAX) NOT NULL, 6 | [Embedding] VECTOR(1536) NOT NULL, 7 | [BlogId] INT NOT NULL 8 | ); 9 | GO 10 | 11 | ALTER TABLE [dbo].[Posts] ADD CONSTRAINT FK_Posts_Blogs_BlogId FOREIGN KEY (BlogId) REFERENCES dbo.Blogs(BlogId); 12 | GO 13 | 14 | CREATE UNIQUE INDEX IX_Posts_Title ON [dbo].[Posts]([Title]); 15 | GO -------------------------------------------------------------------------------- /.github/CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Microsoft Open Source Code of Conduct 2 | 3 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 4 | 5 | Resources: 6 | 7 | - [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/) 8 | - [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) 9 | - Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns 10 | -------------------------------------------------------------------------------- /Vector-Search/Readme.md: -------------------------------------------------------------------------------- 1 | # Store and Query OpenAI Embeddings in Azure SQL DB (T-SQL) 2 | 3 | ## Vector Similarity Search in SQL DB 4 | 5 | Learn how to store and query vector embeddings in Azure SQL Database using T-SQL with this step-by-step tutorial. The sample is available in the [VectorSearch.ipynb](VectorSearch.ipynb) notebook.
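At its core, the pattern the notebook walks through looks like this. This is a minimal, self-contained sketch: 3-dimensional vectors are used so the literals stay readable, while real embedding models such as `text-embedding-3-small` need `VECTOR(1536)`; table and column names are illustrative.

```sql
-- Table with a native vector column
CREATE TABLE dbo.notes
(
    id INT IDENTITY PRIMARY KEY,
    content NVARCHAR(MAX),
    embedding VECTOR(3) -- use VECTOR(1536) for real embedding models
);

-- Vectors can be inserted by casting a JSON array string to VECTOR
INSERT INTO dbo.notes (content, embedding)
VALUES (N'sample text', CAST('[0.10, 0.20, 0.30]' AS VECTOR(3)));

-- Exact nearest-neighbor query: the smaller the cosine distance, the more similar
DECLARE @q VECTOR(3) = CAST('[0.10, 0.20, 0.25]' AS VECTOR(3));

SELECT TOP (10)
    id,
    content,
    VECTOR_DISTANCE('cosine', embedding, @q) AS distance
FROM dbo.notes
ORDER BY distance;
```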
6 | 7 | ## Running the Notebook 8 | 9 | Execute the notebook using the [SQL Kernel for Notebooks in Azure Data Studio](https://learn.microsoft.com/azure-data-studio/notebooks/notebooks-guidance#connect-to-a-kernel). -------------------------------------------------------------------------------- /DotNet/SqlBulkCopy/SqlBulkCopy.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net9.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /Hybrid-Search/00-setup-database.sql: -------------------------------------------------------------------------------- 1 | drop table if exists dbo.documents 2 | go 3 | 4 | create table dbo.documents 5 | ( 6 | id int constraint pk__documents primary key, 7 | content nvarchar(max), 8 | embedding vector(384) 9 | ) 10 | 11 | if not exists(select * from sys.fulltext_catalogs where [name] = 'FullTextCatalog') 12 | begin 13 | create fulltext catalog [FullTextCatalog] as default; 14 | end 15 | go 16 | 17 | create fulltext index on dbo.documents (content) key index pk__documents; 18 | go 19 | 20 | alter fulltext index on dbo.documents enable; 21 | go 22 | -------------------------------------------------------------------------------- /DotNet/Dapper/db/BlogDB.sqlproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | BlogDB 6 | {226C56DA-3984-48B5-A2C2-5B8BEF57C309} 7 | Microsoft.Data.Tools.Schema.Sql.SqlAzureV12DatabaseSchemaProvider 8 | 1033, CI 9 | 10 | 11 | 12 | 13 | -------------------------------------------------------------------------------- /DotNet/Dapper/README.md: -------------------------------------------------------------------------------- 1 | # Dapper vector search sample 2 | 3 | This folder contains a sample that demonstrates how to store and query vectors in Azure SQL using Dapper. 4 | 5 | Make sure to create the tables using the scripts in the `db` folder before running the sample. 6 | 7 | Create a `.env` file from `.env.sample` and fill in the required values, then run the application 8 | 9 | ```bash 10 | cd app 11 | dotnet run 12 | ``` 13 | 14 | The sample will populate the `Blogs` and `Posts` tables with some data. It will then query the database using the vector functions to find similar blog content to a given topic. -------------------------------------------------------------------------------- /DotNet/SqlClient/SqlServer.NativeVectorSearch.Samples.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net8.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | -------------------------------------------------------------------------------- /DotNet/Dapper/app/TypeHandler.cs: -------------------------------------------------------------------------------- 1 | using Microsoft.Data.SqlClient; 2 | using Dapper; 3 | using Microsoft.Data.SqlTypes; 4 | using Microsoft.Data; 5 | 6 | namespace DapperVectors; 7 | 8 | public class VectorTypeHandler : SqlMapper.TypeHandler 9 | { 10 | public override float[] Parse(object value) 11 | { 12 | return ((SqlVector)value).Memory.ToArray(); 13 | } 14 | 15 | public override void SetValue(System.Data.IDbDataParameter parameter, float[]? value) 16 | { 17 | parameter.Value = value is not null ?
new SqlVector(value) : DBNull.Value; 18 | ((SqlParameter)parameter).SqlDbType = SqlDbTypeExtensions.Vector; 19 | } 20 | } 21 | -------------------------------------------------------------------------------- /DotNet/Dapper/app/DapperVectors.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net9.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | -------------------------------------------------------------------------------- /DotNet/Dapper/app/Model.cs: -------------------------------------------------------------------------------- 1 | using Microsoft.Data.SqlTypes; 2 | 3 | namespace DapperVectors; 4 | 5 | public class Blog 6 | { 7 | public int BlogId { get; set; } 8 | public string Name { get; set; } = string.Empty; 9 | public string Url { get; set; } = string.Empty; 10 | } 11 | 12 | public class Post 13 | { 14 | public int PostId { get; set; } 15 | public string Title { get; set; } = string.Empty; 16 | public string Content { get; set; } = string.Empty; 17 | public float[]? Embedding { get; set; } 18 | public int BlogId { get; set; } 19 | } 20 | 21 | public record SavedPost 22 | { 23 | public string Title { get; init; } = string.Empty; 24 | public string Content { get; init; } = string.Empty; 25 | } 26 | -------------------------------------------------------------------------------- /Retrieval-Augmented-Generation/.env.sample: -------------------------------------------------------------------------------- 1 | AZURE_OPENAI_ENDPOINT="https://.openai.azure.com/" 2 | AZURE_OPENAI_API_KEY="" 3 | AZURE_OPENAI_EMBEDDING_MODEL_DEPLOYMENT_NAME="text-embedding-3-small" 4 | 5 | # Use only one of the below. The one you are not using should be commented out. 6 | 7 | # For Entra ID Authentication 8 | ENTRAID_CONNECTION_STRING="Driver={ODBC Driver 18 for SQL Server};Server=tcp:.database.windows.net;Database=test;LongAsMax=yes;" 9 | 10 | # For SQL Authentication 11 | SQL_CONNECTION_STRING="Driver={ODBC Driver 18 for SQL Server};Server=tcp:.database.windows.net;Database=;Uid=;Pwd=;LongAsMax=yes;TrustServerCertificate=yes;Connection Timeout=30;" 12 | -------------------------------------------------------------------------------- /SemanticKernel/dotnet/MemoryStoreSample/MemoryStoreSample.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net8.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | PreserveNewest 19 | 20 | 21 | 22 | 23 | -------------------------------------------------------------------------------- /Embeddings/T-SQL/01-store-openai-credentials.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Create database credentials to store API key 3 | 4 | Replace with the name of your Azure OpenAI service and with the API key for the Azure OpenAI API 5 | */ 6 | if not exists(select * from sys.symmetric_keys where [name] = '##MS_DatabaseMasterKey##') 7 | begin 8 | create master key encryption by password = N'V3RYStr0NGP@ssw0rd!'; 9 | end 10 | go 11 | if exists(select * from sys.[database_scoped_credentials] where name = 'https://.openai.azure.com') 12 | begin 13 | drop database scoped credential [https://.openai.azure.com]; 14 | end 15 | create database scoped credential [https://.openai.azure.com] 16 | with identity = 'HTTPEndpointHeaders', secret = '{"api-key": ""}'; 17 | go -------------------------------------------------------------------------------- 
/NVIDIA-AI-SQL/.env.sample: -------------------------------------------------------------------------------- 1 | NVIDIA_API_KEY="NVIDIA AI embedding model API Key" 2 | NVIDIA_CHAT_API_KEY="NVIDIA AI chat model API Key" 3 | 4 | AZUREDOCINTELLIGENCE_ENDPOINT = "https://.cognitiveservices.azure.com/" 5 | AZUREDOCINTELLIGENCE_API_KEY = "" 6 | 7 | FILE_PATH="Path to the resume dataset" 8 | 9 | # Use only one of the below. The one you are not using should be commented out. 10 | # For Entra ID Service Principal Authentication 11 | ENTRA_CONNECTION_STRING="Driver={ODBC Driver 18 for SQL Server};LongAsMax=yes;Server=tcp:.database.windows.net;Database=;" 12 | 13 | # For SQL Authentication 14 | SQL_CONNECTION_STRING="Driver={ODBC Driver 18 for SQL Server};LongAsMax=yes;Server=tcp:.database.windows.net;Database=;Uid=;Pwd=;" -------------------------------------------------------------------------------- /DiskANN/Wikipedia/003-wikipedia-fulltext-setup.sql: -------------------------------------------------------------------------------- 1 | use WikipediaTest 2 | go 3 | 4 | if not exists(select * from sys.fulltext_catalogs where [name] = 'FullTextCatalog') 5 | begin 6 | create fulltext catalog [FullTextCatalog] as default; 7 | end 8 | go 9 | 10 | create fulltext index on dbo.wikipedia_articles_embeddings ([text]) key index pk__wikipedia_articles_embeddings; 11 | go 12 | 13 | alter fulltext index on dbo.wikipedia_articles_embeddings enable; 14 | go 15 | 16 | select * from sys.fulltext_catalogs 17 | go 18 | 19 | -- Wait ~15 seconds for FT to start and process all the documents, then 20 | waitfor delay '00:00:15' 21 | go 22 | 23 | -- Check how many documents have been indexed so far (final count must be 25000) 24 | select count(distinct document_id) 25 | from sys.dm_fts_index_keywords_by_document(db_id(), object_id('dbo.wikipedia_articles_embeddings')) 26 | go 27 | 28 | 29 | 30 | 31 | 32 | -------------------------------------------------------------------------------- /DotNet/EF-Core-9/README.md: -------------------------------------------------------------------------------- 1 | # EF Core 9 Vector Sample 2 | 3 | This sample shows how to use the vector functions in EF Core to store and query vector data. It is an end-to-end sample using the extension [`EFCore.SqlServer.VectorSearch`](https://github.com/efcore/EFCore.SqlServer.VectorSearch) package. 4 | 5 | You need to have an Azure OpenAI embedding endpoint to run this sample. 6 | 7 | Create a `.env` file from `.env.sample` and fill in the required values, then run the database migration to create the database tables 8 | 9 | ```bash 10 | dotnet tool install --global dotnet-ef 11 | dotnet build 12 | dotnet ef database update 13 | ``` 14 | 15 | Run the application 16 | 17 | ```bash 18 | dotnet run 19 | ``` 20 | 21 | The sample will create a database with a `Blogs` and `Posts` table and seed it with some data. It will then query the database using the vector functions to find similar blog content to a given topic.
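For reference, the similarity query expressed in LINQ corresponds conceptually to the following T-SQL. This is a sketch only: the exact SQL generated by `EFCore.SqlServer.VectorSearch` may differ, and the table and column names follow the sample's `Posts` model.

```sql
-- Probe with an existing post's embedding and list its nearest neighbors
DECLARE @q VECTOR(1536) =
    (SELECT TOP (1) [Embedding] FROM [dbo].[Posts] WHERE [Title] = N'Pizza or Focaccia?');

SELECT TOP (5)
    p.[PostId],
    p.[Title],
    VECTOR_DISTANCE('cosine', p.[Embedding], @q) AS [Distance]
FROM [dbo].[Posts] AS p
ORDER BY [Distance];
```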
-------------------------------------------------------------------------------- /Embeddings/T-SQL/03-get-embedding-sample.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Get the embeddings for the input text by calling the OpenAI API 3 | */ 4 | declare @king vector(1536), @queen vector(1536), @pizza vector(1536); 5 | 6 | exec dbo.get_embedding @deployedModelName = '', @inputText = 'King', @embedding = @king output; 7 | exec dbo.get_embedding @deployedModelName = '', @inputText = 'Queen', @embedding = @queen output; 8 | exec dbo.get_embedding @deployedModelName = '', @inputText = 'Pizza', @embedding = @pizza output; 9 | 10 | -- Find distance between vectorized concepts. 11 | -- The smaller the distance, the more similar the concepts are. 12 | select 13 | vector_distance('cosine', @king, @king) as 'King vs King', 14 | vector_distance('cosine', @king, @queen) as 'King vs Queen', 15 | vector_distance('cosine', @king, @pizza) as 'King vs Pizza' 16 | 17 | 18 | 19 | -------------------------------------------------------------------------------- /RAG-with-Documents/.env.sample: -------------------------------------------------------------------------------- 1 | AZOPENAI_ENDPOINT="https://.openai.azure.com/" 2 | AZOPENAI_API_KEY="" 3 | AZOPENAI_EMBEDDING_MODEL_DEPLOYMENT_NAME="" 4 | AZOPENAI_CHAT_MODEL_DEPLOYMENT_NAME="" 5 | 6 | AZUREDOCINTELLIGENCE_ENDPOINT = "https://.cognitiveservices.azure.com/" 7 | AZUREDOCINTELLIGENCE_API_KEY = "" 8 | 9 | # Use only one of the below. The one you are not using should be commented out. 10 | # For Entra ID Service Principal Authentication 11 | ENTRA_CONNECTION_STRING="Driver={ODBC Driver 18 for SQL Server};LongAsMax=yes;Server=tcp:.database.windows.net;Database=;" 12 | 13 | # For SQL Authentication 14 | SQL_CONNECTION_STRING="Driver={ODBC Driver 18 for SQL Server};LongAsMax=yes;Server=tcp:.database.windows.net;Database=;Uid=;Pwd=;" 15 | -------------------------------------------------------------------------------- /SemanticKernel/dotnet/VectorStoreSample/VectorStoreSample.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net9.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | Always 21 | 22 | 23 | 24 | 25 | -------------------------------------------------------------------------------- /DotNet/SqlClient/.env.example: -------------------------------------------------------------------------------- 1 | # SQL Server connection string 2 | SqlConnStr="Data Source=your-server.database.windows.net;Initial Catalog=your-database;Authentication=Active Directory Default;Connection Timeout=30" 3 | 4 | # Embedding model name (e.g., text-embedding-3-small, text-embedding-ada-002) 5 | EmbeddingModelName="text-embedding-3-small" 6 | 7 | # Set to "true" to use Azure OpenAI, "false" or leave empty for OpenAI 8 | UseAzureOpenAI="false" 9 | 10 | # ======================================== 11 | # Azure OpenAI Configuration (when UseAzureOpenAI=true) 12 | # ======================================== 13 | # Uncomment and set these values when using Azure OpenAI 14 | # AzureOpenAIEndpoint="https://your-resource.openai.azure.com/" 15 | # AzureOpenAIKey="your-azure-openai-api-key" 16 | 17 | # ======================================== 18 | # OpenAI Configuration (when UseAzureOpenAI=false) 19 | # ======================================== 20 | # Uncomment and set this value when using OpenAI 21 | # ApiKey="sk-your-openai-api-key-here" 22 | 
-------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | 4 | > Please provide us with the following information: 5 | > --------------------------------------------------------------- 6 | 7 | ### This issue is for a: (mark with an `x`) 8 | ``` 9 | - [ ] bug report -> please search issues before submitting 10 | - [ ] feature request 11 | - [ ] documentation issue or request 12 | - [ ] regression (a behavior that used to work and stopped in a new release) 13 | ``` 14 | 15 | ### Minimal steps to reproduce 16 | > 17 | 18 | ### Any log messages given by the failure 19 | > 20 | 21 | ### Expected/desired behavior 22 | > 23 | 24 | ### OS and Version? 25 | > Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?) 26 | 27 | ### Versions 28 | > 29 | 30 | ### Mention any other details that might be useful 31 | 32 | > --------------------------------------------------------------- 33 | > Thanks! We'll be in touch soon. 34 | -------------------------------------------------------------------------------- /Langchain-SQL-RAG/.env.sample: -------------------------------------------------------------------------------- 1 | 2 | # Azure OpenAI Service details 3 | AZURE_ENDPOINT="https://.openai.azure.com/" 4 | AZURE_DEPLOYMENT_EMBEDDING_NAME=")); 11 | table.Columns.Add("Description", typeof(string)); 12 | 13 | var embeddingClient = new MockEmbeddingClient(); 14 | 15 | for (int i = 0; i < 10; i++) 16 | { 17 | table.Rows.Add(i, new SqlVector(embeddingClient.GetEmbedding($"This is a test {i}")), $"This is a test {i}"); 18 | } 19 | 20 | Console.WriteLine("Inserting rows with SqlBulkCopy..."); 21 | using SqlConnection connection = new(Env.GetString("MSSQL")); 22 | { 23 | Console.WriteLine("-> Opening connection..."); 24 | connection.Open(); 25 | 26 | Console.WriteLine("-> Inserting rows..."); 27 | using SqlBulkCopy bulkCopy = new(connection) 28 | { 29 | DestinationTableName = "dbo.SqlBulkCopyEmbedding" 30 | }; 31 | bulkCopy.WriteToServer(table); 32 | } 33 | 34 | Console.WriteLine("Done."); 35 | 36 | -------------------------------------------------------------------------------- /DotNet/EF-Core-9/EFCore9Vectors.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net9.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | runtime; build; native; contentfiles; analyzers; buildtransitive 17 | all 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Purpose 2 | 3 | * ... 4 | 5 | ## Does this introduce a breaking change? 6 | 7 | ``` 8 | [ ] Yes 9 | [ ] No 10 | ``` 11 | 12 | ## Pull Request Type 13 | What kind of change does this Pull Request introduce? 14 | 15 | 16 | ``` 17 | [ ] Bugfix 18 | [ ] Feature 19 | [ ] Code style update (formatting, local variables) 20 | [ ] Refactoring (no functional changes, no api changes) 21 | [ ] Documentation content changes 22 | [ ] Other... 
Please describe: 23 | ``` 24 | 25 | ## How to Test 26 | * Get the code 27 | 28 | ``` 29 | git clone [repo-address] 30 | cd [repo-name] 31 | git checkout [branch-name] 32 | npm install 33 | ``` 34 | 35 | * Test the code 36 | 37 | ``` 38 | ``` 39 | 40 | ## What to Check 41 | Verify that the following are valid 42 | * ... 43 | 44 | ## Other Information 45 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/EFCore10Vectors.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net10.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | runtime; build; native; contentfiles; analyzers; buildtransitive 17 | all 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | -------------------------------------------------------------------------------- /Hybrid-Search/utilities.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pyodbc 3 | import struct 4 | import logging 5 | from azure import identity 6 | 7 | def get_mssql_connection(): 8 | print('Getting MSSQL connection') 9 | mssql_connection_string = os.environ["MSSQL"] 10 | if any(s in mssql_connection_string.lower() for s in ["uid"]): 11 | print(' - Using SQL Server authentication') 12 | attrs_before = None 13 | else: 14 | print(' - Getting EntraID credentials...') 15 | mssql_connection_string = os.environ["MSSQL"] 16 | credential = identity.DefaultAzureCredential(exclude_interactive_browser_credential=False) 17 | token_bytes = credential.get_token("https://database.windows.net/.default").token.encode("UTF-16-LE") 18 | token_struct = struct.pack(f' 1 35 | go 36 | 37 | select * from [dbo].[wikipedia_articles_text_embeddings] where parent_id = 3796 38 | order by id 39 | go 40 | -------------------------------------------------------------------------------- /DiskANN/README.md: -------------------------------------------------------------------------------- 1 | # Approximate Nearest Neighbor Search 2 | 3 | SQL Server 2025 introduces a new `VECTOR_SEARCH` function that allows you to perform approximate nearest neighbor search using the DiskANN algorithm. This function is designed to work with vector columns in SQL Server, enabling efficient similarity search on high-dimensional data. 4 | 5 | The samples in this folder demonstrate how to use the `VECTOR_SEARCH` function with DiskANN. The samples include: 6 | 7 | - Creating a table with a vector column, importing data from a CSV file, and inserting data into the table. 8 | - Creating an approximate vector index on the table using the `CREATE VECTOR INDEX` statement (see the sketch after this list). 9 | - Performing approximate nearest neighbor search using the `VECTOR_SEARCH` function. 10 | - Performing hybrid search using the `VECTOR_SEARCH` function along with full-text search. 11 | - Using half-precision floating point values to store embeddings for a more compact representation of vectors. 12 | - Using the Vectorizer to generate embeddings for text data.
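A minimal sketch of the index-and-search pattern, distilled from the quickstart scripts in this folder. Table and column names are illustrative (they follow the Wikipedia sample dataset), and the preview syntax may change:

```sql
-- Build a DiskANN index over the vector column
CREATE VECTOR INDEX vec_idx
ON dbo.wikipedia_articles_embeddings (content_vector)
WITH (METRIC = 'cosine', TYPE = 'diskann');
GO

-- Approximate nearest-neighbor search through the index
DECLARE @q VECTOR(1536) =
    (SELECT TOP (1) content_vector
     FROM dbo.wikipedia_articles_embeddings
     WHERE title = N'Alan Turing');

SELECT t.id, t.title, r.distance
FROM VECTOR_SEARCH(
         TABLE      = dbo.wikipedia_articles_embeddings AS t,
         COLUMN     = content_vector,
         SIMILAR_TO = @q,
         METRIC     = 'cosine',
         TOP_N      = 10
     ) AS r
ORDER BY r.distance;
```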
13 | 14 | ## Vectorizer 15 | 16 | To quickly generate embeddings for existing text data, you can use the Vectorizer, which is available as a sample open-source project here: [azure-sql-db-vectorizer](https://github.com/Azure-Samples/azure-sql-db-vectorizer) 17 | 18 | ## End-To-End sample 19 | 20 | A full end-to-end sample using Streamlit is available here: https://github.com/Azure-Samples/azure-sql-diskann 21 | -------------------------------------------------------------------------------- /SemanticKernel/dotnet/README.md: -------------------------------------------------------------------------------- 1 | ## Semantic Kernel 2 | 3 | Semantic Kernel is an SDK that integrates Large Language Models (LLMs) like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. Semantic Kernel achieves this by allowing you to define plugins that can be chained together in just a few lines of code. 4 | 5 | A plugin to use SQL Server and Azure SQL as a vector store is available here: 6 | 7 | [Microsoft.SemanticKernel.Connectors.SqlServer](https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/VectorData/SqlServer) 8 | 9 | Semantic Kernel provides native SQL Server support both for the [legacy Memory Store](https://learn.microsoft.com/semantic-kernel/concepts/vector-store-connectors/memory-stores/?pivots=programming-language-csharp) and the new [Vector Store](https://learn.microsoft.com/semantic-kernel/concepts/vector-store-connectors/?pivots=programming-language-csharp). Samples on how to use both are available in this repository: 10 | 11 | - [Vector Store sample](./VectorStoreSample) 12 | - [Memory Store (legacy) sample](./MemoryStoreSample) 13 | 14 | ## Getting Started 15 | 16 | Create a `.env` file using the provided `.env.example` file as a template. Replace the values in `<>` with your own values.
Then move into the folder with the sample you want to try and run the application using the following command: 17 | 18 | ```bash 19 | dotnet run 20 | ``` 21 | 22 | -------------------------------------------------------------------------------- /Embeddings/T-SQL/02-create-get-embedding-procedure.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Create a procedure to get the embeddings for the input text by calling the OpenAI API 3 | 4 | Replace with the name of your Azure OpenAI service 5 | */ 6 | create or alter procedure dbo.get_embedding 7 | @deployedModelName nvarchar(1000), 8 | @inputText nvarchar(max), 9 | @embedding vector(1536) output 10 | as 11 | declare @retval int, @response nvarchar(max); 12 | declare @payload nvarchar(max) = json_object('input': @inputText); 13 | declare @url nvarchar(1000) = 'https://.openai.azure.com/openai/deployments/' + @deployedModelName + '/embeddings?api-version=2023-03-15-preview' 14 | exec @retval = sp_invoke_external_rest_endpoint 15 | @url = @url, 16 | @method = 'POST', 17 | @credential = [https://.openai.azure.com], 18 | @payload = @payload, 19 | @response = @response output; 20 | 21 | declare @re vector(1536); 22 | if (@retval = 0) begin 23 | set @re = cast(json_query(@response, '$.result.data[0].embedding') as vector(1536)) 24 | end else begin 25 | declare @msg nvarchar(max) = 26 | 'Error calling OpenAI API' + char(13) + char(10) + 27 | '[HTTP Status: ' + json_value(@response, '$.response.status.http.code') + '] ' + 28 | json_value(@response, '$.result.error.message'); 29 | throw 50000, @msg, 1; 30 | end 31 | 32 | set @embedding = @re; 33 | 34 | return @retval 35 | go 36 | -------------------------------------------------------------------------------- /DotNet/SqlBulkCopy/EmbeddingClient.cs: -------------------------------------------------------------------------------- 1 | using Azure; 2 | using Azure.AI.OpenAI; 3 | using DotNetEnv; 4 | 5 | public interface IEmbeddingClient 6 | { 7 | float[] GetEmbedding(string text, int dimensions); 8 | } 9 | 10 | public class AzureOpenAIEmbeddingClient: IEmbeddingClient 11 | { 12 | static readonly AzureKeyCredential credentials = new(Env.GetString("OPENAI_KEY")); 13 | static readonly AzureOpenAIClient aiClient = new(new Uri(Env.GetString("OPENAI_URL")), credentials); 14 | 15 | public float[] GetEmbedding(string text, int dimensions = 1536) 16 | { 17 | Console.WriteLine($"-> Getting embedding for: {text}"); 18 | 19 | var embeddingClient = aiClient.GetEmbeddingClient(Env.GetString("OPENAI_DEPLOYMENT_NAME")); 20 | 21 | var embedding = embeddingClient.GenerateEmbedding(text, new() { Dimensions = dimensions }); 22 | 23 | var vector = embedding.Value.ToFloats().ToArray(); 24 | if (vector.Length != dimensions) { 25 | throw new Exception($"Expected {dimensions} dimensions, but got {vector.Length}"); 26 | } 27 | 28 | return vector; 29 | } 30 | } 31 | 32 | public class MockEmbeddingClient: IEmbeddingClient 33 | { 34 | public float[] GetEmbedding(string text, int dimensions = 1536) 35 | { 36 | Random random = new(); 37 | return [.. Enumerable.Range(0, dimensions).Select(_ => (float)random.NextDouble())]; 38 | } 39 | } 40 | -------------------------------------------------------------------------------- /DotNet/Dapper/app/content.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "Title": "My first Dapper app!", 4 | "Content": "I wrote an app using Dapper!"
5 | }, 6 | { 7 | "Title": "Vectors with Azure SQL and Dapper", 8 | "Content": "You can store vectors in Azure SQL and query them from Dapper by defining a type handler that uses the new SqlVector type from Microsoft.Data.SqlClient." 9 | }, 10 | { 11 | "Title": "Dapper and vector search", 12 | "Content": "This sample shows how to use Dapper for vector-enabled queries against Azure SQL (vector column type)." 13 | }, 14 | { 15 | "Title": "SQL Server Best Practices", 16 | "Content": "Here are some best practices for using SQL Server in your applications." 17 | }, 18 | { 19 | "Title": "Python and Flask", 20 | "Content": "Learn how to build a web app using Python and Flask!" 21 | }, 22 | { 23 | "Title": "Django for REST APIs", 24 | "Content": "Create a REST API using Django!" 25 | }, 26 | { 27 | "Title": "JavaScript for Beginners", 28 | "Content": "Learn JavaScript from scratch!" 29 | }, 30 | { 31 | "Title": "Node vs Rust", 32 | "Content": "Which one should you choose for your next project?" 33 | }, 34 | { 35 | "Title": "Pizza or Focaccia?", 36 | "Content": "What's the difference between pizza and focaccia. Learn everything you need to know!" 37 | }, 38 | { 39 | "Title": "Chocolate Eggs for your next dessert", 40 | "Content": "Try this delicious recipe for chocolate eggs!" 41 | } 42 | ] 43 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/EmbeddingClient.cs: -------------------------------------------------------------------------------- 1 | using Azure; 2 | using Azure.AI.OpenAI; 3 | using DotNetEnv; 4 | 5 | namespace EFCoreVectors; 6 | 7 | public interface IEmbeddingClient 8 | { 9 | float[] GetEmbedding(string text, int dimensions); 10 | } 11 | 12 | public class AzureOpenAIEmbeddingClient: IEmbeddingClient 13 | { 14 | static readonly AzureKeyCredential credentials = new(Env.GetString("OPENAI_KEY")); 15 | static readonly AzureOpenAIClient aiClient = new(new Uri(Env.GetString("OPENAI_URL")), credentials); 16 | 17 | public float[] GetEmbedding(string text, int dimensions = 1536) 18 | { 19 | Console.WriteLine($"-> Getting embedding for: {text}"); 20 | 21 | var embeddingClient = aiClient.GetEmbeddingClient(Env.GetString("OPENAI_DEPLOYMENT_NAME")); 22 | 23 | var embedding = embeddingClient.GenerateEmbedding(text, new() { Dimensions = dimensions }); 24 | 25 | var vector = embedding.Value.ToFloats().ToArray(); 26 | if (vector.Length != dimensions) { 27 | throw new Exception($"Expected {dimensions} dimensions, but got {vector.Length}"); 28 | } 29 | 30 | return vector; 31 | } 32 | } 33 | 34 | public class MockEmbeddingClient: IEmbeddingClient 35 | { 36 | public float[] GetEmbedding(string text, int dimensions = 1536) 37 | { 38 | Random random = new(); 39 | return [.. 
Enumerable.Range(0, dimensions).Select(_ => (float)random.NextDouble())]; 40 | } 41 | } 42 | -------------------------------------------------------------------------------- /DotNet/EF-Core-9/EmbeddingClient.cs: -------------------------------------------------------------------------------- 1 | using Azure; 2 | using Azure.AI.OpenAI; 3 | using DotNetEnv; 4 | 5 | namespace EFCoreVectors; 6 | 7 | public interface IEmbeddingClient 8 | { 9 | float[] GetEmbedding(string text, int dimensions); 10 | } 11 | 12 | public class AzureOpenAIEmbeddingClient: IEmbeddingClient 13 | { 14 | static AzureKeyCredential credentials = new(Env.GetString("OPENAI_KEY")); 15 | static AzureOpenAIClient aiClient = new(new Uri(Env.GetString("OPENAI_URL")), credentials); 16 | 17 | public float[] GetEmbedding(string text, int dimensions = 1536) 18 | { 19 | Console.WriteLine($"-> Getting embedding for: {text}"); 20 | 21 | var embeddingClient = aiClient.GetEmbeddingClient(Env.GetString("OPENAI_DEPLOYMENT_NAME")); 22 | 23 | var embedding = embeddingClient.GenerateEmbedding(text, new() { Dimensions = dimensions }); 24 | 25 | var vector = embedding.Value.ToFloats().ToArray(); 26 | if (vector.Length != dimensions) { 27 | throw new Exception($"Expected {dimensions} dimensions, but got {vector.Length}"); 28 | } 29 | 30 | return vector; 31 | } 32 | } 33 | 34 | public class MockEmbeddingClient: IEmbeddingClient 35 | { 36 | public float[] GetEmbedding(string text, int dimensions = 1536) 37 | { 38 | Random random = new(); 39 | return Enumerable.Range(0, dimensions) 40 | .Select(_ => (float)random.NextDouble()) 41 | .ToArray(); 42 | } 43 | } 44 | -------------------------------------------------------------------------------- /DotNet/EF-Core-9/content.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "Title": "My first EF Core app!", 4 | "Content": "I wrote an app using EF Core!" 5 | }, 6 | { 7 | "Title": "Vectors with Azure SQL and EF Core", 8 | "Content": "You can use and store vectors easily with Azure SQL and EF Core" 9 | }, 10 | { 11 | "Title": "EFCore.SqlServer.VectorSearch is available", 12 | "Content": "The NuGet package EFCore.SqlServer.VectorSearch is now available for EF Core 9! With this package you can use vector search functions in your LINQ queries." 13 | }, 14 | { 15 | "Title": "Entity Framework Core 10 natively supports Vectors in the SQL Server engine", 16 | "Content": "With EF Core 10, support for Azure SQL and SQL Server engine vectors is native." 17 | }, 18 | { 19 | "Title": "SQL Server Best Practices", 20 | "Content": "Here are some best practices for using SQL Server in your applications." 21 | }, 22 | { 23 | "Title": "Python and Flask", 24 | "Content": "Learn how to build a web app using Python and Flask!" 25 | }, 26 | { 27 | "Title": "Django for REST APIs", 28 | "Content": "Create a REST API using Django!" 29 | }, 30 | { 31 | "Title": "JavaScript for Beginners", 32 | "Content": "Learn JavaScript from scratch!" 33 | }, 34 | { 35 | "Title": "Node vs Rust", 36 | "Content": "Which one should you choose for your next project?" 37 | }, 38 | { 39 | "Title": "Pizza or Focaccia?", 40 | "Content": "What's the difference between pizza and focaccia? Learn everything you need to know!" 41 | }, 42 | { 43 | "Title": "Chocolate Eggs for your next dessert", 44 | "Content": "Try this delicious recipe for chocolate eggs!"
45 | } 46 | ] -------------------------------------------------------------------------------- /NVIDIA-AI-SQL/README.md: -------------------------------------------------------------------------------- 1 | # RAG with NVIDIA AI and Azure SQL 2 | 3 | This Jupyter Notebook implements **Retrieval-Augmented Generation (RAG)** using **NVIDIA AI** models for embeddings and chat, and **Azure SQL Database** for storing and retrieving resume embeddings. 4 | 5 | ## Features 6 | - Uses **NVIDIA AI models** (`meta/llama-3.3-70b-instruct` and `nvidia/embed-qa-4`). 7 | - Stores resume embeddings in **Azure SQL Database**. 8 | - Supports **optimized vector search** to find relevant candidates. 9 | - Implements **streaming responses** for a better chatbot experience. 10 | 11 | ## Setup Instructions 12 | 1. Install dependencies: 13 | ```bash 14 | pip install -r requirements.txt 15 | ``` 16 | 2. Set up your `.env` file with API keys: 17 | ```bash 18 | NVIDIA_API_KEY=your_nvidia_api_key_here 19 | NVIDIA_CHAT_API_KEY=your_nvidia_chat_model_api_key_here 20 | AZUREDOCINTELLIGENCE_ENDPOINT=your_azure_doc_intelligence_endpoint_here 21 | AZUREDOCINTELLIGENCE_API_KEY=your_azure_doc_intelligence_api_key_here 22 | AZURE_SQL_CONNECTION_STRING=your_azure_sql_connection_string_here 23 | FILE_PATH=Path to the resume dataset 24 | ``` 25 | 3. Run the notebook. 26 | 27 | ## File Structure 28 | - `NVIDIA-RAG-with-resumes.ipynb` → Main Jupyter Notebook 29 | - `.env` → Environment variables for API keys 30 | - `README.md` → Documentation 31 | - `CreateTable.sql` → Create Table for Azure SQL Database 32 | 33 | ## Dataset 34 | 35 | We use a sample dataset from [Kaggle](https://www.kaggle.com/datasets/snehaanbhawal/resume-dataset) containing PDF resumes. For this tutorial we use 120 resumes from the **Information-Technology** folder. 36 | 37 | 38 | 39 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/content.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "Title": "My first EF Core app!", 4 | "Content": "I wrote an app using EF Core!" 5 | }, 6 | { 7 | "Title": "Vectors with Azure SQL and EF Core", 8 | "Content": "You can use and store vectors easily with Azure SQL and EF Core" 9 | }, 10 | { 11 | "Title": "EFCore.SqlServer.VectorSearch is available", 12 | "Content": "The NuGet package EFCore.SqlServer.VectorSearch is now available for EF Core 9! With this package you can use vector search functions in your LINQ queries." 13 | }, 14 | { 15 | "Title": "Entity Framework Core 10 natively supports vectors in the SQL Server engine", 16 | "Content": "With EF Core 10, support for Azure SQL and SQL Server engine vectors is native." 17 | }, 18 | { 19 | "Title": "SQL Server Best Practices", 20 | "Content": "Here are some best practices for using SQL Server in your applications." 21 | }, 22 | { 23 | "Title": "Python and Flask", 24 | "Content": "Learn how to build a web app using Python and Flask!" 25 | }, 26 | { 27 | "Title": "Django for REST APIs", 28 | "Content": "Create a REST API using Django!" 29 | }, 30 | { 31 | "Title": "JavaScript for Beginners", 32 | "Content": "Learn JavaScript from scratch!" 33 | }, 34 | { 35 | "Title": "Node vs Rust", 36 | "Content": "Which one should you choose for your next project?" 37 | }, 38 | { 39 | "Title": "Pizza or Focaccia?", 40 | "Content": "What's the difference between pizza and focaccia? Learn everything you need to know!"
41 | }, 42 | { 43 | "Title": "Chocolate Eggs for your next dessert", 44 | "Content": "Try this delicious recipe for chocolate eggs!" 45 | } 46 | ] -------------------------------------------------------------------------------- /DotNet/Dapper/app/EmbeddingClient.cs: -------------------------------------------------------------------------------- 1 | using Azure; 2 | using Azure.AI.OpenAI; 3 | using Azure.Identity; 4 | using DotNetEnv; 5 | 6 | namespace DapperVectors; 7 | 8 | public interface IEmbeddingClient 9 | { 10 | float[] GetEmbedding(string text, int dimensions = 1536); 11 | } 12 | 13 | public class AzureOpenAIEmbeddingClient : IEmbeddingClient 14 | { 15 | static readonly AzureOpenAIClient aiClient; 16 | 17 | static AzureOpenAIEmbeddingClient() 18 | { 19 | var endpoint = new Uri(Env.GetString("OPENAI_URL")); 20 | 21 | aiClient = Env.GetString("OPENAI_KEY") switch 22 | { 23 | null or "" => new AzureOpenAIClient(endpoint, new DefaultAzureCredential()), 24 | string key => new AzureOpenAIClient(endpoint, new AzureKeyCredential(key)), 25 | }; 26 | } 27 | 28 | public float[] GetEmbedding(string text, int dimensions = 1536) 29 | { 30 | Console.WriteLine($"-> Getting embedding for: {text}"); 31 | 32 | var embeddingClient = aiClient.GetEmbeddingClient(Env.GetString("OPENAI_DEPLOYMENT_NAME")); 33 | var embedding = embeddingClient.GenerateEmbedding(text, new() { Dimensions = dimensions }); 34 | var vector = embedding.Value.ToFloats().ToArray(); 35 | 36 | if (vector.Length != dimensions) 37 | { 38 | throw new Exception($"Expected {dimensions} dimensions, but got {vector.Length}"); 39 | } 40 | 41 | return vector; 42 | } 43 | } 44 | 45 | public class MockEmbeddingClient : IEmbeddingClient 46 | { 47 | public float[] GetEmbedding(string text, int dimensions = 1536) 48 | { 49 | Random random = new(); 50 | return Enumerable.Range(0, dimensions).Select(_ => (float)random.NextDouble()).ToArray(); 51 | } 52 | } 53 | -------------------------------------------------------------------------------- /Embeddings/T-SQL/04-update-embeddings-with-trigger.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Create sample table 3 | */ 4 | drop table if exists dbo.sample_text; 5 | create table dbo.sample_text 6 | ( 7 | id int identity not null primary key, 8 | content nvarchar(max) null, 9 | embedding vector(1536) null, 10 | [vectors_update_info] nvarchar(max) null 11 | ) 12 | go 13 | 14 | /* 15 | Create trigger to update embeddings 16 | when "content" column is changed 17 | */ 18 | create or alter trigger sample_text_generate_embeddings 19 | on dbo.sample_text 20 | after insert, update 21 | as 22 | set nocount on; 23 | 24 | if not(update(content)) return; 25 | 26 | declare c cursor fast_forward read_only 27 | for select [id], [content] from inserted 28 | order by id; 29 | 30 | declare @id int, @content nvarchar(max); 31 | 32 | open c; 33 | fetch next from c into @id, @content 34 | while @@fetch_status = 0 35 | begin 36 | begin try 37 | declare @retval int; 38 | 39 | if update(content) begin 40 | declare @embedding vector(1536); 41 | exec @retval = [dbo].[get_embedding] '', @content, @embedding output with result sets none 42 | update [dbo].[sample_text] set embedding = @embedding where id = @id 43 | end 44 | 45 | update [dbo].[sample_text] set [vectors_update_info] = json_object('status':'updated', 'timestamp':CURRENT_TIMESTAMP) where id = @id 46 | end try 47 | begin catch 48 | update [dbo].[sample_text] set [vectors_update_info] = json_object('status':'error',
'timestamp':CURRENT_TIMESTAMP) where id = @id 49 | end catch 50 | fetch next from c into @id, @content 51 | 52 | end 53 | close c 54 | deallocate c 55 | go 56 | 57 | /* 58 | Test trigger 59 | */ 60 | insert into dbo.sample_text (content) values ('The foundation series from Isaac Asimov') 61 | go 62 | 63 | select * from dbo.sample_text 64 | go 65 | -------------------------------------------------------------------------------- /DiskANN/Wikipedia/001-wikipedia-diskann-setup.sql: -------------------------------------------------------------------------------- 1 | /* 2 | This script requires SQL Server 2025 RC1 3 | */ 4 | 5 | create database WikipediaTest 6 | go 7 | 8 | use WikipediaTest 9 | go 10 | 11 | drop table if exists [dbo].[wikipedia_articles_embeddings]; 12 | create table [dbo].[wikipedia_articles_embeddings] 13 | ( 14 | [id] [int] not null, 15 | [url] [varchar](1000) not null, 16 | [title] [varchar](1000) not null, 17 | [text] [varchar](max) not null, 18 | [title_vector] [vector](1536) not null, 19 | [content_vector] [vector](1536) not null, 20 | [vector_id] [int] not null 21 | ) 22 | go 23 | 24 | /* 25 | Import data. File taken from 26 | https://cdn.openai.com/API/examples/data/vector_database_wikipedia_articles_embedded.zip 27 | */ 28 | bulk insert dbo.[wikipedia_articles_embeddings] 29 | from 'C:\samples\rc1\datasets\vector_database_wikipedia_articles_embedded.csv' 30 | with ( 31 | format = 'csv', 32 | firstrow = 2, 33 | codepage = '65001', -- comment this line out if using MSSQL on Linux 34 | fieldterminator = ',', 35 | rowterminator = '0x0a', 36 | fieldquote = '"', 37 | batchsize = 1000, 38 | tablock 39 | ) 40 | go 41 | select row_count from sys.dm_db_partition_stats 42 | where object_id = OBJECT_ID('[dbo].[wikipedia_articles_embeddings]') and index_id in (0, 1) 43 | go 44 | 45 | /* 46 | Add primary key 47 | */ 48 | alter table [dbo].[wikipedia_articles_embeddings] 49 | add constraint pk__wikipedia_articles_embeddings primary key clustered (id) 50 | go 51 | 52 | /* 53 | Add index on title 54 | */ 55 | create index [ix_title] on [dbo].[wikipedia_articles_embeddings](title) 56 | go 57 | 58 | /* 59 | Verify data 60 | */ 61 | select top (10) * from [dbo].[wikipedia_articles_embeddings] 62 | go 63 | 64 | select *, 65 | DATALENGTH(content_vector) as bytes, 66 | DATALENGTH(CAST(content_vector as varchar(max))) as chars 67 | from 68 | [dbo].[wikipedia_articles_embeddings] where title like 'Philosoph%' 69 | go 70 | -------------------------------------------------------------------------------- /DotNet/EF-Core-9/Model.cs: -------------------------------------------------------------------------------- 1 | using Microsoft.EntityFrameworkCore; 2 | using System.ComponentModel.DataAnnotations; 3 | 4 | namespace EFCoreVectors; 5 | 6 | public class BloggingContext : DbContext 7 | { 8 | public DbSet<Blog> Blogs { get; set; } 9 | public DbSet<Post> Posts { get; set; } 10 | 11 | protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) 12 | { 13 | DotNetEnv.Env.Load(); 14 | 15 | // Enable vector search 16 | optionsBuilder.UseSqlServer( 17 | Environment.GetEnvironmentVariable("MSSQL"), 18 | o => o.UseVectorSearch() 19 | ); 20 | 21 | //optionsBuilder.LogTo(Console.WriteLine, [ DbLoggerCategory.Database.Command.Name ]) 22 | // .EnableSensitiveDataLogging() 23 | // .EnableDetailedErrors(); 24 | } 25 | 26 | protected override void OnModelCreating(ModelBuilder modelBuilder) 27 | { 28 | // Configure the float[] property as a vector: 29 | modelBuilder.Entity<Post>().Property(b => b.Embedding).HasColumnType("vector(1536)"); 30 | }
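// Note: the o.UseVectorSearch() call in OnConfiguring comes from the
// EFCore.SqlServer.VectorSearch NuGet package; it is what enables
// EF.Functions.VectorDistance(...) in LINQ queries over the vector(1536) column configured above.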
31 | } 32 | 33 | [Index(nameof(Name), IsUnique = true)] 34 | public class Blog 35 | { 36 | [Key] 37 | public int BlogId { get; set; } 38 | public string Name { get; set; } = string.Empty; 39 | public string Url { get; set; } = string.Empty; 40 | public List<Post> Posts { get; } = []; 41 | } 42 | 43 | [Index(nameof(Title), IsUnique = true)] 44 | public class Post 45 | { 46 | [Key] 47 | public int PostId { get; set; } 48 | public string Title { get; set; } = string.Empty; 49 | public string Content { get; set; } = string.Empty; 50 | public float[] Embedding { get; set; } = []; 51 | public int BlogId { get; set; } 52 | public Blog Blog { get; set; } = null!; 53 | } 54 | 55 | public class SavedPost { 56 | public string Title { get; set; } = string.Empty; 57 | public string Content { get; set; } = string.Empty; 58 | } -------------------------------------------------------------------------------- /5-Min-RAG-SQL-Accelerator/Readme.md: -------------------------------------------------------------------------------- 1 | # 5-Min RAG SQL Accelerator 2 | 3 | Welcome! 🎉 This repository contains two solution accelerators. These tools demonstrate the ease of deployment and the powerful vector search capabilities of Azure SQL. 4 | 5 | Credits: Developed by Kushagra Agarwal - Original [Repo](https://github.com/Kushagra-2000/ARM_SQL_OpenAI) 6 | --- 7 | 8 | ## 🚀 Step 1: One-Click Deployment Accelerator 9 | 10 | **Purpose**: Automatically deploy the entire Azure SQL + OpenAI setup for a Retrieval-Augmented Generation (RAG) application using an ARM template. 11 | 12 | **Highlights**: 13 | - One-click deployment via Azure Resource Manager (ARM) 14 | - Sets up Azure SQL Database, OpenAI integration, and supporting resources 15 | - Reduces manual configuration to just click and go 16 | - Completes setup in under 5 minutes 17 | 18 | **Setup Instructions**: 19 | 1. Navigate to the folder [Step1-OneClick-Deployment](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/5-Min-RAG-SQL-Accelerator/Step1-OneClick-Deployment) 20 | 2. Follow the instructions in the README to deploy using the ARM template. 21 | 3. Once deployed, your backend RAG infrastructure is ready to use. 22 | 23 | --- 24 | 25 | ## 🌐 Step 2: Web-Based Demo for Vector Search 26 | 27 | **Purpose**: Launch a Streamlit-based web UI that showcases Azure SQL's native vector search capabilities. 28 | 29 | **Highlights**: 30 | - Visual demo for semantic search scenarios (e.g., product search, resume matching) 31 | - No coding required to explore and validate vector search 32 | - Helps users understand and interact with vector search features in Azure SQL 33 | 34 | **Setup Instructions**: 35 | 1. Navigate to the [Step2-Deploy-RAG-App](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/5-Min-RAG-SQL-Accelerator/Step2-Deploy-RAG-App) 36 | 2. Follow the README to install dependencies and run the Streamlit app. 37 | 3. Use the web UI to explore vector search on sample data.
38 | 39 | --- 40 | 41 | 42 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/Model.cs: -------------------------------------------------------------------------------- 1 | using Microsoft.EntityFrameworkCore; 2 | using System; 3 | using System.Collections.Generic; 4 | using System.ComponentModel.DataAnnotations; 5 | using Microsoft.Data.SqlTypes; 6 | 7 | namespace EFCoreVectors; 8 | 9 | public class BloggingContext : DbContext 10 | { 11 | public DbSet<Blog> Blogs { get; set; } 12 | public DbSet<Post> Posts { get; set; } 13 | 14 | protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) 15 | { 16 | DotNetEnv.Env.Load(); 17 | 18 | // Enable vector search 19 | optionsBuilder.UseSqlServer( 20 | Environment.GetEnvironmentVariable("MSSQL") 21 | ); 22 | 23 | // optionsBuilder.LogTo(Console.WriteLine, [ DbLoggerCategory.Database.Command.Name ]) 24 | // .EnableSensitiveDataLogging() 25 | // .EnableDetailedErrors(); 26 | } 27 | 28 | protected override void OnModelCreating(ModelBuilder modelBuilder) 29 | { 30 | // Configure the embedding property as a vector: 31 | modelBuilder.Entity<Post>().Property(b => b.Embedding).HasColumnType("vector(1536)"); 32 | } 33 | } 34 | 35 | [Index(nameof(Name), IsUnique = true)] 36 | public class Blog 37 | { 38 | [Key] 39 | public int BlogId { get; set; } 40 | public string Name { get; set; } = string.Empty; 41 | public string Url { get; set; } = string.Empty; 42 | public List<Post> Posts { get; } = []; 43 | } 44 | 45 | [Index(nameof(Title), IsUnique = true)] 46 | public class Post 47 | { 48 | [Key] 49 | public int PostId { get; set; } 50 | public string Title { get; set; } = string.Empty; 51 | public string Content { get; set; } = string.Empty; 52 | public SqlVector<float> Embedding { get; set; } 53 | public int BlogId { get; set; } 54 | public Blog Blog { get; set; } = null!; 55 | } 56 | 57 | public class SavedPost 58 | { 59 | public string Title { get; set; } = string.Empty; 60 | public string Content { get; set; } = string.Empty; 61 | } -------------------------------------------------------------------------------- /Hybrid-Search/README.md: -------------------------------------------------------------------------------- 1 | # Hybrid Search 2 | 3 | This sample shows how to combine Fulltext search in Azure SQL database with BM25 ranking and cosine similarity ranking to do hybrid search. 4 | 5 | In this sample the local model [multi-qa-MiniLM-L6-cos-v1](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1) is used to generate embeddings. The Python script `hybrid_search.py` shows how to 6 | 7 | - use Python to generate the embeddings 8 | - do similarity search in Azure SQL database 9 | - use [Fulltext search in Azure SQL database with BM25 ranking](https://learn.microsoft.com/en-us/sql/relational-databases/search/limit-search-results-with-rank?view=sql-server-ver16#ranking-of-freetexttable) 10 | - do re-ranking applying Reciprocal Rank Fusion (RRF) to combine the BM25 ranking with the cosine similarity ranking 11 | 12 | Make sure to set up the database for this sample using the `00-setup-database.sql` script. The database can be either an Azure SQL DB or a SQL Server database.
Once the database has been created, you can run the `hybrid_search.py` script to do the hybrid search: 13 | 14 | First, set up the virtual environment and install the required packages: 15 | 16 | ```bash 17 | python -m venv .venv 18 | ``` 19 | 20 | Activate the virtual environment and then install the required packages: 21 | 22 | ```bash 23 | pip install -r requirements.txt 24 | ``` 25 | 26 | Create an environment file `.env` with the connection string to Azure SQL database. You can use the `.env.sample` as a starting point. The sample `.env` file shows how to use Entra ID to connect to the database, which looks like: 27 | 28 | ```text 29 | MSSQL='Driver={ODBC Driver 18 for SQL Server};Server=tcp:<server-name>,1433;Database=<database-name>;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;LongAsMax=yes;' 30 | ``` 31 | 32 | If you want to use SQL Authentication the connection string would instead look like the following: 33 | 34 | ``` 35 | MSSQL='Driver={ODBC Driver 18 for SQL Server};Server=tcp:<server-name>,1433;Database=<database-name>;UID=<user-id>;PWD=<password>;Encrypt=yes;TrustServerCertificate=yes;Connection Timeout=30;LongAsMax=yes;' 36 | ``` 37 | 38 | Then run the script: 39 | 40 | ```bash 41 | python hybrid_search.py 42 | ``` -------------------------------------------------------------------------------- /DotNet/EF-Core-9/Program.cs: -------------------------------------------------------------------------------- 1 | using System.Text.Json; 2 | using Microsoft.EntityFrameworkCore; 3 | using DotNetEnv; 4 | using EFCoreVectors; 5 | 6 | // Load .env 7 | Env.Load(); 8 | 9 | // Create EF Core context 10 | using var db = new BloggingContext(); 11 | 12 | // Create blog 13 | Console.WriteLine("Getting sample blog..."); 14 | var blog = db.Blogs 15 | .Include(blog => blog.Posts) 16 | .FirstOrDefault(b => b.Name == "Sample blog"); 17 | 18 | if (blog == null) { 19 | Console.WriteLine("Creating 'Sample blog'..."); 20 | blog = new Blog { Name = "Sample blog", Url = "https://devblogs.microsoft.com" }; 21 | db.Add(blog); 22 | db.SaveChanges(); 23 | } 24 | 25 | // Add posts 26 | Console.WriteLine("Adding posts..."); 27 | var content = File.ReadAllText("content.json"); 28 | var newPosts = JsonSerializer.Deserialize<List<SavedPost>>(content)!; 29 | 30 | // Console.WriteLine("Adding embeddings..."); 31 | var embeddingClient = new AzureOpenAIEmbeddingClient(); 32 | //var embeddingClient = new MockEmbeddingClient(); 33 | 34 | newPosts.ForEach(np => { 35 | var p = blog.Posts.FirstOrDefault(p => p.Title == np.Title); 36 | if (p == null) { 37 | blog.Posts.Add(new Post { 38 | Title = np.Title, 39 | Content = np.Content, 40 | Embedding = embeddingClient.GetEmbedding(np.Content) 41 | }); 42 | } else { 43 | p.Title = np.Title; 44 | p.Content = np.Content; 45 | p.Embedding = embeddingClient.GetEmbedding(np.Content); 46 | } 47 | }); 48 | 49 | // Adding posts to database 50 | db.SaveChanges(); 51 | 52 | // Find similar post 53 | Console.WriteLine("\n----------\n"); 54 | 55 | // Querying 56 | string searchPhrase = "I want to use Azure SQL, EF Core and vectors in my app!"; 57 | Console.WriteLine($"Search phrase is: '{searchPhrase}'..."); 58 | 59 | Console.WriteLine("Querying for similar posts..."); 60 | float[] vector = embeddingClient.GetEmbedding(searchPhrase); 61 | var relatedPosts = await db.Posts 62 | .Where(p => p.Blog.Name == "Sample blog") 63 | .OrderBy(p => EF.Functions.VectorDistance("cosine", p.Embedding, vector)) 64 | .Select(p => new { p.Title, Distance = EF.Functions.VectorDistance("cosine", p.Embedding, vector)}) 65 | .Take(5) 66 | .ToListAsync(); 67 | 68 |
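// Rough sketch of the T-SQL the LINQ query above translates to (assumed shape, for illustration only):
//   SELECT TOP(5) [p].[Title],
//          VECTOR_DISTANCE('cosine', [p].[Embedding], @vector) AS [Distance]
//   FROM [Posts] AS [p]
//   INNER JOIN [Blogs] AS [b] ON [p].[BlogId] = [b].[BlogId]
//   WHERE [b].[Name] = N'Sample blog'
//   ORDER BY VECTOR_DISTANCE('cosine', [p].[Embedding], @vector)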
Console.WriteLine("Similar posts found:"); 69 | relatedPosts.ForEach(rp => { 70 | Console.WriteLine($"Post: {rp.Title}, Distance: {rp.Distance}"); 71 | }); 72 | 73 | -------------------------------------------------------------------------------- /DiskANN/diskann-quickstart-sql-server-2025.sql: -------------------------------------------------------------------------------- 1 | CREATE DATABASE DiskANNQuickstart; 2 | GO 3 | 4 | USE DiskANNQuickstart; 5 | GO 6 | 7 | -- Step 0: Enable Preview Feature 8 | ALTER DATABASE SCOPED CONFIGURATION 9 | SET PREVIEW_FEATURES = ON; 10 | GO 11 | SELECT * FROM sys.database_scoped_configurations WHERE [name] = 'PREVIEW_FEATURES' 12 | GO 13 | 14 | -- Step 1: Create a sample table with a VECTOR(5) column 15 | DROP TABLE IF EXISTS dbo.Articles; 16 | CREATE TABLE dbo.Articles 17 | ( 18 | id INT PRIMARY KEY, 19 | title NVARCHAR(100), 20 | content NVARCHAR(MAX), 21 | embedding VECTOR(5) 22 | ); 23 | 24 | -- Step 2: Insert sample data 25 | INSERT INTO Articles (id, title, content, embedding) 26 | VALUES 27 | (1, 'Intro to AI', 'This article introduces AI concepts.', '[0.1, 0.2, 0.3, 0.4, 0.5]'), 28 | (2, 'Deep Learning', 'Deep learning is a subset of ML.', '[0.2, 0.1, 0.4, 0.3, 0.6]'), 29 | (3, 'Neural Networks', 'Neural networks are powerful models.', '[0.3, 0.3, 0.3, 0.5, 0.1]'), 30 | (4, 'Machine Learning Basics', 'ML basics for beginners.', '[0.4, 0.5, 0.1, 0.7, 0.3]'), 31 | (5, 'Advanced AI', 'Exploring advanced AI techniques.', '[0.5, 0.4, 0.1, 0.1, 0.2]'), 32 | (6, 'AI in Healthcare', 'AI applications in healthcare.', '[0.6, 0.3, 0.4, 0.3, 0.2]'), 33 | (7, 'AI Ethics', 'Ethical considerations in AI.', '[0.1, 0.9, 0.5, 0.4, 0.3]'), 34 | (8, 'AI and Society', 'Impact of AI on society.', '[0.2, 0.3, 0.5, 0.5, 0.4]'), 35 | (9, 'Future of AI', 'Predictions for the future of AI.', '[0.8, 0.4, 0.5, 0.1, 0.2]'), 36 | (10, 'AI Innovations', 'Latest innovations in AI.', '[0.4, 0.7, 0.2, 0.3, 0.1]'); 37 | GO 38 | 39 | -- Step 3: Create a vector index on the embedding column 40 | CREATE VECTOR INDEX vec_idx ON Articles(embedding) 41 | WITH (METRIC = 'Cosine', TYPE = 'DiskANN') 42 | ON [PRIMARY]; 43 | GO 44 | 45 | -- Step 4: Perform a vector similarity search 46 | DECLARE @qv VECTOR(5) = (SELECT TOP(1) embedding FROM Articles WHERE id = 1); 47 | SELECT 48 | t.id, 49 | t.title, 50 | t.content, 51 | s.distance 52 | FROM 53 | VECTOR_SEARCH( 54 | TABLE = Articles AS t, 55 | COLUMN = embedding, 56 | SIMILAR_TO = @qv, 57 | METRIC = 'Cosine', 58 | TOP_N = 3 59 | ) AS s 60 | ORDER BY s.distance, t.title; 61 | GO 62 | 63 | -- Step 5: View index details 64 | SELECT index_id, [type], [type_desc], vector_index_type, distance_metric, build_parameters FROM sys.vector_indexes WHERE [name] = 'vec_idx'; 65 | GO 66 | 67 | -- Step 6: Clean up by dropping the table 68 | DROP INDEX vec_idx ON Articles; -------------------------------------------------------------------------------- /DotNet/EF-Core-10/Program.cs: -------------------------------------------------------------------------------- 1 | using System.Text.Json; 2 | using Microsoft.EntityFrameworkCore; 3 | using DotNetEnv; 4 | using EFCoreVectors; 5 | using Microsoft.Data.SqlTypes; 6 | 7 | // Load .env 8 | Env.Load(); 9 | 10 | // Create EF Core context 11 | using var db = new BloggingContext(); 12 | 13 | // Create blog 14 | Console.WriteLine("Getting sample blog..."); 15 | var blog = db.Blogs 16 | .Include(blog => blog.Posts) 17 | .FirstOrDefault(b => b.Name == "Sample blog"); 18 | 19 | if (blog == null) { 20 | 
Console.WriteLine("Creating 'Sample blog'..."); 21 | blog = new Blog { Name = "Sample blog", Url = "https://devblogs.microsoft.com" }; 22 | db.Add(blog); 23 | db.SaveChanges(); 24 | } 25 | 26 | // Add posts 27 | Console.WriteLine("Adding posts..."); 28 | var content = File.ReadAllText("content.json"); 29 | var newPosts = JsonSerializer.Deserialize>(content)!; 30 | 31 | // Console.WriteLine("Adding embeddings..."); 32 | var embeddingClient = new AzureOpenAIEmbeddingClient(); 33 | //var embeddingClient = new MockEmbeddingClient(); 34 | 35 | newPosts.ForEach(np => { 36 | var p = blog.Posts.FirstOrDefault(p => p.Title == np.Title); 37 | if (p == null) { 38 | blog.Posts.Add(new Post { 39 | Title = np.Title, 40 | Content = np.Content, 41 | Embedding = new SqlVector(embeddingClient.GetEmbedding(np.Content)) 42 | }); 43 | } else { 44 | p.Title = np.Title; 45 | p.Content = np.Content; 46 | p.Embedding = new SqlVector(embeddingClient.GetEmbedding(np.Content)); 47 | } 48 | }); 49 | 50 | // Adding posts to database 51 | db.SaveChanges(); 52 | 53 | // Find similar post 54 | Console.WriteLine("\n----------\n"); 55 | 56 | // Querying 57 | string searchPhrase = "I want to use Azure SQL, EF Core and vectors in my app!"; 58 | Console.WriteLine($"Search phrase is: '{searchPhrase}'..."); 59 | 60 | Console.WriteLine("Querying for similar posts..."); 61 | var vector = new SqlVector(embeddingClient.GetEmbedding(searchPhrase)); 62 | var relatedPosts = await db.Posts 63 | .Where(p => p.Blog.Name == "Sample Blog") 64 | .OrderBy(p => EF.Functions.VectorDistance("cosine", p.Embedding, vector)) 65 | .Select(p => new { p.Title, Distance = EF.Functions.VectorDistance("cosine", p.Embedding, vector) }) 66 | .Take(5) 67 | .ToListAsync(); 68 | 69 | Console.WriteLine("Similar posts found:"); 70 | relatedPosts.ForEach(rp => { 71 | Console.WriteLine($"Post: {rp.Title}, Distance: {rp.Distance}"); 72 | }); 73 | 74 | -------------------------------------------------------------------------------- /DiskANN/Wikipedia/004-wikipedia-hybrid-search.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Run Hybrid Search using Vector Search and FullText Search and then 3 | using Reciprocal Ranking Fusion to calculate the final rank score 4 | 5 | This script requires SQL Server 2025 RC0 6 | */ 7 | USE WikipediaTest 8 | go 9 | 10 | SET STATISTICS TIME ON 11 | SET STATISTICS IO ON 12 | GO 13 | 14 | DECLARE @q NVARCHAR(1000) = 'the foundation series by isaac asimov'; 15 | DECLARE @k INT = 10 16 | 17 | DECLARE @r INT, @e VECTOR(1536); 18 | 19 | SELECT @e = ai_generate_embeddings(@q use model Ada2Embeddings); 20 | IF (@r != 0) SELECT @r; 21 | 22 | WITH keyword_search AS ( 23 | SELECT TOP(@k) 24 | id, 25 | RANK() OVER (ORDER BY ft_rank DESC) AS [rank], 26 | title, 27 | [text] 28 | FROM 29 | ( 30 | SELECT TOP(@k) 31 | id, 32 | ftt.[RANK] AS ft_rank, 33 | title, 34 | [text] 35 | FROM 36 | dbo.wikipedia_articles_embeddings w 37 | INNER JOIN 38 | FREETEXTTABLE(dbo.wikipedia_articles_embeddings, *, @q) AS ftt ON w.id = ftt.[KEY] -- FREETEXTTABLE returns BM25 rank 39 | ORDER BY 40 | ft_rank DESC 41 | ) AS freetext_documents 42 | ORDER BY 43 | rank ASC 44 | ), 45 | semantic_search AS 46 | ( 47 | SELECT TOP(@k) 48 | id, 49 | RANK() OVER (ORDER BY cosine_distance) AS [rank] 50 | FROM 51 | ( 52 | SELECT TOP(@k) 53 | t.id, s.distance as cosine_distance 54 | FROM 55 | VECTOR_SEARCH( 56 | TABLE = [dbo].[wikipedia_articles_embeddings] as t, 57 | COLUMN = [content_vector], 58 | SIMILAR_TO = @e, 59 | METRIC = 
'cosine', 60 | TOP_N = @k 61 | ) AS s 62 | ORDER BY cosine_distance 63 | ) AS similar_documents 64 | ), 65 | result AS ( 66 | SELECT TOP(@k) 67 | COALESCE(ss.id, ks.id) AS id, 68 | ss.[rank] AS semantic_rank, 69 | ks.[rank] AS keyword_rank, 70 | COALESCE(1.0 / (@k + ss.[rank]), 0.0) + 71 | COALESCE(1.0 / (@k + ks.[rank]), 0.0) AS score -- Reciprocal Rank Fusion (RRF) 72 | FROM 73 | semantic_search ss 74 | FULL OUTER JOIN 75 | keyword_search ks ON ss.id = ks.id 76 | ORDER BY 77 | score DESC 78 | ) 79 | SELECT 80 | w.id, 81 | cast(score * 1000 as int) as rrf_score, 82 | rank() OVER(ORDER BY cast(score * 1000 AS INT) DESC) AS rrf_rank, 83 | semantic_rank, 84 | keyword_rank, 85 | w.title, 86 | w.[text] 87 | FROM 88 | result AS r 89 | INNER JOIN 90 | dbo.wikipedia_articles_embeddings AS w ON r.id = w.id 91 | ORDER BY 92 | rrf_rank -------------------------------------------------------------------------------- /Embeddings/T-SQL/README.md: -------------------------------------------------------------------------------- 1 | # Get OpenAI Embeddings from Azure SQL 2 | 3 | In this sample you'll be creating a stored procedure to easily transform text into a vector using an OpenAI embedding model. 4 | 5 | ## Create the Embedding Model 6 | 7 | Make sure you can access the Azure OpenAI service by following the documentation here: [How do I get access to Azure OpenAI?](https://learn.microsoft.com/azure/ai-services/openai/overview#how-do-i-get-access-to-azure-openai). 8 | 9 | Deploy an embedding model - for example the `text-embedding-3-small` - following the [Create and deploy an Azure OpenAI Service resource](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource). Please note that the sample assumes that the chosen embedding model returns a 1536-dimensional vector, so if you choose another embedding model, you may need to adjust the sample accordingly. 10 | 11 | Then retrieve the Azure OpenAI *endpoint* and *key*: 12 | 13 | ![Azure OpenAI Endpoint and Key](../../Assets/endpoint.png) 14 | 15 | ## Store the credentials into Azure SQL 16 | 17 | Connect to Azure SQL database and run the `01-store-openai-credentials.sql` script to store the Azure OpenAI endpoint and secret so that they can be used later. 18 | 19 | ## Create the `get_embedding` Stored Procedure 20 | 21 | Use the `02-create-get-embedding-procedure.sql` script to create a stored procedure that will call the OpenAI embedding model you have deployed before. The stored procedure uses the [`sp_invoke_external_rest_endpoint`](https://learn.microsoft.com/sql/relational-databases/system-stored-procedures/sp-invoke-external-rest-endpoint-transact-sql) system stored procedure to call the Azure OpenAI REST endpoint. 22 | 23 | ## Transform text into embedding 24 | 25 | Use the `03-get-embedding-sample.sql` script to call OpenAI to transform sample text into embeddings. Make sure to use the deployed model *name*: 26 | 27 | ![Deployed OpenAI Models](../../Assets/embedding-deployment.png) 28 | 29 | And then use 30 | 31 | ```sql 32 | declare @king vector(1536); 33 | exec dbo.get_embedding @deployedModelName = 'text-embedding-3-small', @inputText = 'King', @embedding = @king output; 34 | ``` 35 | 36 | The resulting vector will be stored in the `@king` variable.
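If you haven't generated a second vector yet, you can obtain one the same way (a minimal sketch, assuming the same `text-embedding-3-small` deployment and the `get_embedding` procedure created above):

```sql
-- Generate a second embedding to compare against (illustrative input text)
declare @queen vector(1536);
exec dbo.get_embedding @deployedModelName = 'text-embedding-3-small', @inputText = 'Queen', @embedding = @queen output;
```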
37 | You can now compare it with another variable to find how similar they are: 38 | 39 | ```sql 40 | select 41 | vector_distance('cosine', @king, @queen) as 'King vs Queen' 42 | ``` 43 | 44 | ## Use triggers to automatically update embeddings 45 | 46 | Use the `04-update-embeddings-with-trigger.sql` script to create a trigger that will automatically update the embeddings when a new row is inserted into the table or when the content is updated. 47 | 48 | To learn about other ways to keep embeddings updated, read the post here: [Storing, querying and keeping embeddings updated: options and best practices](https://devblogs.microsoft.com/azure-sql/storing-querying-and-keeping-embeddings-updated-options-and-best-practices/) -------------------------------------------------------------------------------- /DotNet/EF-Core-9/Migrations/20241030213329_InitialCreate.cs: -------------------------------------------------------------------------------- 1 | using Microsoft.EntityFrameworkCore.Migrations; 2 | 3 | #nullable disable 4 | 5 | namespace EFCoreVectors.Migrations 6 | { 7 | /// <inheritdoc /> 8 | public partial class InitialCreate : Migration 9 | { 10 | /// <inheritdoc /> 11 | protected override void Up(MigrationBuilder migrationBuilder) 12 | { 13 | migrationBuilder.CreateTable( 14 | name: "Blogs", 15 | columns: table => new 16 | { 17 | BlogId = table.Column<int>(type: "int", nullable: false) 18 | .Annotation("SqlServer:Identity", "1, 1"), 19 | Name = table.Column<string>(type: "nvarchar(450)", nullable: false), 20 | Url = table.Column<string>(type: "nvarchar(max)", nullable: false) 21 | }, 22 | constraints: table => 23 | { 24 | table.PrimaryKey("PK_Blogs", x => x.BlogId); 25 | }); 26 | 27 | migrationBuilder.CreateTable( 28 | name: "Posts", 29 | columns: table => new 30 | { 31 | PostId = table.Column<int>(type: "int", nullable: false) 32 | .Annotation("SqlServer:Identity", "1, 1"), 33 | Title = table.Column<string>(type: "nvarchar(450)", nullable: false), 34 | Content = table.Column<string>(type: "nvarchar(max)", nullable: false), 35 | Embedding = table.Column<float[]>(type: "vector(1536)", nullable: false), 36 | BlogId = table.Column<int>(type: "int", nullable: false) 37 | }, 38 | constraints: table => 39 | { 40 | table.PrimaryKey("PK_Posts", x => x.PostId); 41 | table.ForeignKey( 42 | name: "FK_Posts_Blogs_BlogId", 43 | column: x => x.BlogId, 44 | principalTable: "Blogs", 45 | principalColumn: "BlogId", 46 | onDelete: ReferentialAction.Cascade); 47 | }); 48 | 49 | migrationBuilder.CreateIndex( 50 | name: "IX_Blogs_Name", 51 | table: "Blogs", 52 | column: "Name", 53 | unique: true); 54 | 55 | migrationBuilder.CreateIndex( 56 | name: "IX_Posts_BlogId", 57 | table: "Posts", 58 | column: "BlogId"); 59 | 60 | migrationBuilder.CreateIndex( 61 | name: "IX_Posts_Title", 62 | table: "Posts", 63 | column: "Title", 64 | unique: true); 65 | } 66 | 67 | /// <inheritdoc /> 68 | protected override void Down(MigrationBuilder migrationBuilder) 69 | { 70 | migrationBuilder.DropTable( 71 | name: "Posts"); 72 | 73 | migrationBuilder.DropTable( 74 | name: "Blogs"); 75 | } 76 | } 77 | } 78 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/Migrations/20250910190225_InitialCreate.cs: -------------------------------------------------------------------------------- 1 | using Microsoft.Data.SqlTypes; 2 | using Microsoft.EntityFrameworkCore.Migrations; 3 | 4 | #nullable disable 5 | 6 | namespace EFCore10Vectors.Migrations 7 | { 8 | /// <inheritdoc /> 9 | public partial class InitialCreate : Migration 10 | { 11 | /// <inheritdoc /> 12 | protected override void Up(MigrationBuilder migrationBuilder) 13 | {
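// Creates the Blogs and Posts tables; the Posts.Embedding column is mapped
// to the native SQL Server vector(1536) type via SqlVector<float>.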
14 | migrationBuilder.CreateTable( 15 | name: "Blogs", 16 | columns: table => new 17 | { 18 | BlogId = table.Column<int>(type: "int", nullable: false) 19 | .Annotation("SqlServer:Identity", "1, 1"), 20 | Name = table.Column<string>(type: "nvarchar(450)", nullable: false), 21 | Url = table.Column<string>(type: "nvarchar(max)", nullable: false) 22 | }, 23 | constraints: table => 24 | { 25 | table.PrimaryKey("PK_Blogs", x => x.BlogId); 26 | }); 27 | 28 | migrationBuilder.CreateTable( 29 | name: "Posts", 30 | columns: table => new 31 | { 32 | PostId = table.Column<int>(type: "int", nullable: false) 33 | .Annotation("SqlServer:Identity", "1, 1"), 34 | Title = table.Column<string>(type: "nvarchar(450)", nullable: false), 35 | Content = table.Column<string>(type: "nvarchar(max)", nullable: false), 36 | Embedding = table.Column<SqlVector<float>>(type: "vector(1536)", nullable: false), 37 | BlogId = table.Column<int>(type: "int", nullable: false) 38 | }, 39 | constraints: table => 40 | { 41 | table.PrimaryKey("PK_Posts", x => x.PostId); 42 | table.ForeignKey( 43 | name: "FK_Posts_Blogs_BlogId", 44 | column: x => x.BlogId, 45 | principalTable: "Blogs", 46 | principalColumn: "BlogId", 47 | onDelete: ReferentialAction.Cascade); 48 | }); 49 | 50 | migrationBuilder.CreateIndex( 51 | name: "IX_Blogs_Name", 52 | table: "Blogs", 53 | column: "Name", 54 | unique: true); 55 | 56 | migrationBuilder.CreateIndex( 57 | name: "IX_Posts_BlogId", 58 | table: "Posts", 59 | column: "BlogId"); 60 | 61 | migrationBuilder.CreateIndex( 62 | name: "IX_Posts_Title", 63 | table: "Posts", 64 | column: "Title", 65 | unique: true); 66 | } 67 | 68 | /// <inheritdoc /> 69 | protected override void Down(MigrationBuilder migrationBuilder) 70 | { 71 | migrationBuilder.DropTable( 72 | name: "Posts"); 73 | 74 | migrationBuilder.DropTable( 75 | name: "Blogs"); 76 | } 77 | } 78 | } 79 | -------------------------------------------------------------------------------- /5-Min-RAG-SQL-Accelerator/Step1-OneClick-Deployment/SQL_deployment.json: -------------------------------------------------------------------------------- 1 | { 2 | "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#", 3 | "contentVersion": "1.0.0.0", 4 | "metadata": { 5 | "_generator": { 6 | "name": "bicep", 7 | "version": "0.12.40.16777", 8 | "templateHash": "16856611863128783179" 9 | } 10 | }, 11 | "parameters": { 12 | "serverName": { 13 | "type": "string", 14 | "defaultValue": "SampleServer", 15 | "metadata": { 16 | "description": "The name of the SQL logical server." 17 | } 18 | }, 19 | "sqlDBName": { 20 | "type": "string", 21 | "defaultValue": "SampleDB", 22 | "metadata": { 23 | "description": "The name of the SQL Database." 24 | } 25 | }, 26 | "location": { 27 | "type": "string", 28 | "defaultValue": "eastus2", 29 | "allowedValues": [ 30 | "eastus", 31 | "eastus2", 32 | "westus", 33 | "centralus", 34 | "northcentralus", 35 | "southcentralus", 36 | "westus2", 37 | "westus3", 38 | "australiaeast", 39 | "australiasoutheast", 40 | "brazilsouth", 41 | "canadacentral", 42 | "canadaeast", 43 | "centralindia", 44 | "eastasia", 45 | "japaneast", 46 | "japanwest", 47 | "koreacentral", 48 | "koreasouth", 49 | "northeurope", 50 | "southafricanorth", 51 | "southindia", 52 | "southeastasia", 53 | "uksouth", 54 | "ukwest", 55 | "westeurope" 56 | ], 57 | "metadata": { 58 | "description": "Location for all resources." 59 | } 60 | }, 61 | "administratorLogin": { 62 | "type": "string", 63 | "metadata": { 64 | "description": "The administrator username of the SQL logical server."
65 | } 66 | }, 67 | "administratorLoginPassword": { 68 | "type": "secureString", 69 | "metadata": { 70 | "description": "The administrator password of the SQL logical server." 71 | } 72 | } 73 | }, 74 | "resources": [ 75 | { 76 | "type": "Microsoft.Sql/servers", 77 | "apiVersion": "2022-05-01-preview", 78 | "name": "[parameters('serverName')]", 79 | "location": "[parameters('location')]", 80 | "properties": { 81 | "administratorLogin": "[parameters('administratorLogin')]", 82 | "administratorLoginPassword": "[parameters('administratorLoginPassword')]", 83 | "publicNetworkAccess": "Enabled" 84 | } 85 | }, 86 | { 87 | "type": "Microsoft.Sql/servers/databases", 88 | "apiVersion": "2022-05-01-preview", 89 | "name": "[format('{0}/{1}', parameters('serverName'), parameters('sqlDBName'))]", 90 | "location": "[parameters('location')]", 91 | "sku": { 92 | "name": "GP_S_Gen5", 93 | "tier": "GeneralPurpose", 94 | "family": "Gen5", 95 | "capacity": 2 96 | }, 97 | "kind": "v12.0,user,vcore,serverless,freelimit", 98 | "properties": { 99 | "useFreeLimit": "true", 100 | "freeLimitExhaustionBehavior": "AutoPause" 101 | }, 102 | "dependsOn": [ 103 | "[resourceId('Microsoft.Sql/servers', parameters('serverName'))]" 104 | ] 105 | } 106 | ] 107 | } 108 | -------------------------------------------------------------------------------- /DotNet/EF-Core-9/Migrations/BloggingContextModelSnapshot.cs: -------------------------------------------------------------------------------- 1 | // <auto-generated /> 2 | using EFCoreVectors; 3 | using Microsoft.EntityFrameworkCore; 4 | using Microsoft.EntityFrameworkCore.Infrastructure; 5 | using Microsoft.EntityFrameworkCore.Metadata; 6 | using Microsoft.EntityFrameworkCore.Storage.ValueConversion; 7 | 8 | #nullable disable 9 | 10 | namespace EFCoreVectors.Migrations 11 | { 12 | [DbContext(typeof(BloggingContext))] 13 | partial class BloggingContextModelSnapshot : ModelSnapshot 14 | { 15 | protected override void BuildModel(ModelBuilder modelBuilder) 16 | { 17 | #pragma warning disable 612, 618 18 | modelBuilder 19 | .HasAnnotation("ProductVersion", "8.0.10") 20 | .HasAnnotation("Relational:MaxIdentifierLength", 128); 21 | 22 | SqlServerModelBuilderExtensions.UseIdentityColumns(modelBuilder); 23 | 24 | modelBuilder.Entity("EFCoreVectors.Blog", b => 25 | { 26 | b.Property<int>("BlogId") 27 | .ValueGeneratedOnAdd() 28 | .HasColumnType("int"); 29 | 30 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property<int>("BlogId")); 31 | 32 | b.Property<string>("Name") 33 | .IsRequired() 34 | .HasColumnType("nvarchar(450)"); 35 | 36 | b.Property<string>("Url") 37 | .IsRequired() 38 | .HasColumnType("nvarchar(max)"); 39 | 40 | b.HasKey("BlogId"); 41 | 42 | b.HasIndex("Name") 43 | .IsUnique(); 44 | 45 | b.ToTable("Blogs"); 46 | }); 47 | 48 | modelBuilder.Entity("EFCoreVectors.Post", b => 49 | { 50 | b.Property<int>("PostId") 51 | .ValueGeneratedOnAdd() 52 | .HasColumnType("int"); 53 | 54 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property<int>("PostId")); 55 | 56 | b.Property<int>("BlogId") 57 | .HasColumnType("int"); 58 | 59 | b.Property<string>("Content") 60 | .IsRequired() 61 | .HasColumnType("nvarchar(max)"); 62 | 63 | b.Property<float[]>("Embedding") 64 | .IsRequired() 65 | .HasColumnType("vector(1536)"); 66 | 67 | b.Property<string>("Title") 68 | .IsRequired() 69 | .HasColumnType("nvarchar(450)"); 70 | 71 | b.HasKey("PostId"); 72 | 73 | b.HasIndex("BlogId"); 74 | 75 | b.HasIndex("Title") 76 | .IsUnique(); 77 | 78 | b.ToTable("Posts"); 79 | }); 80 | 81 | modelBuilder.Entity("EFCoreVectors.Post", b => 82 | { 83 | b.HasOne("EFCoreVectors.Blog",
"Blog") 84 | .WithMany("Posts") 85 | .HasForeignKey("BlogId") 86 | .OnDelete(DeleteBehavior.Cascade) 87 | .IsRequired(); 88 | 89 | b.Navigation("Blog"); 90 | }); 91 | 92 | modelBuilder.Entity("EFCoreVectors.Blog", b => 93 | { 94 | b.Navigation("Posts"); 95 | }); 96 | #pragma warning restore 612, 618 97 | } 98 | } 99 | } 100 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/Migrations/BloggingContextModelSnapshot.cs: -------------------------------------------------------------------------------- 1 | // 2 | using EFCoreVectors; 3 | using Microsoft.Data.SqlTypes; 4 | using Microsoft.EntityFrameworkCore; 5 | using Microsoft.EntityFrameworkCore.Infrastructure; 6 | using Microsoft.EntityFrameworkCore.Metadata; 7 | using Microsoft.EntityFrameworkCore.Storage.ValueConversion; 8 | 9 | #nullable disable 10 | 11 | namespace EFCore10Vectors.Migrations 12 | { 13 | [DbContext(typeof(BloggingContext))] 14 | partial class BloggingContextModelSnapshot : ModelSnapshot 15 | { 16 | protected override void BuildModel(ModelBuilder modelBuilder) 17 | { 18 | #pragma warning disable 612, 618 19 | modelBuilder 20 | .HasAnnotation("ProductVersion", "10.0.0-rc.1.25451.107") 21 | .HasAnnotation("Relational:MaxIdentifierLength", 128); 22 | 23 | SqlServerModelBuilderExtensions.UseIdentityColumns(modelBuilder); 24 | 25 | modelBuilder.Entity("EFCoreVectors.Blog", b => 26 | { 27 | b.Property("BlogId") 28 | .ValueGeneratedOnAdd() 29 | .HasColumnType("int"); 30 | 31 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property("BlogId")); 32 | 33 | b.Property("Name") 34 | .IsRequired() 35 | .HasColumnType("nvarchar(450)"); 36 | 37 | b.Property("Url") 38 | .IsRequired() 39 | .HasColumnType("nvarchar(max)"); 40 | 41 | b.HasKey("BlogId"); 42 | 43 | b.HasIndex("Name") 44 | .IsUnique(); 45 | 46 | b.ToTable("Blogs"); 47 | }); 48 | 49 | modelBuilder.Entity("EFCoreVectors.Post", b => 50 | { 51 | b.Property("PostId") 52 | .ValueGeneratedOnAdd() 53 | .HasColumnType("int"); 54 | 55 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property("PostId")); 56 | 57 | b.Property("BlogId") 58 | .HasColumnType("int"); 59 | 60 | b.Property("Content") 61 | .IsRequired() 62 | .HasColumnType("nvarchar(max)"); 63 | 64 | b.Property>("Embedding") 65 | .HasColumnType("vector(1536)"); 66 | 67 | b.Property("Title") 68 | .IsRequired() 69 | .HasColumnType("nvarchar(450)"); 70 | 71 | b.HasKey("PostId"); 72 | 73 | b.HasIndex("BlogId"); 74 | 75 | b.HasIndex("Title") 76 | .IsUnique(); 77 | 78 | b.ToTable("Posts"); 79 | }); 80 | 81 | modelBuilder.Entity("EFCoreVectors.Post", b => 82 | { 83 | b.HasOne("EFCoreVectors.Blog", "Blog") 84 | .WithMany("Posts") 85 | .HasForeignKey("BlogId") 86 | .OnDelete(DeleteBehavior.Cascade) 87 | .IsRequired(); 88 | 89 | b.Navigation("Blog"); 90 | }); 91 | 92 | modelBuilder.Entity("EFCoreVectors.Blog", b => 93 | { 94 | b.Navigation("Posts"); 95 | }); 96 | #pragma warning restore 612, 618 97 | } 98 | } 99 | } 100 | -------------------------------------------------------------------------------- /5-Min-RAG-SQL-Accelerator/Step1-OneClick-Deployment/OpenAI_deployment.json: -------------------------------------------------------------------------------- 1 | { 2 | "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#", 3 | "contentVersion": "1.0.0.0", 4 | "parameters": { 5 | "account_name": { 6 | "type": "String", 7 | "metadata": { 8 | "description": "The name of Azure OpenAI resource." 
9 | } 10 | }, 11 | "account_location": { 12 | "defaultValue": "eastus2", 13 | "type": "String", 14 | "metadata": { 15 | "description": "The location of Azure OpenAI resource." 16 | } 17 | }, 18 | "deployment_name": { 19 | "defaultValue": "gpt-4.1", 20 | "type": "String", 21 | "metadata": { 22 | "description": "The name of Azure OpenAI chat completion model." 23 | } 24 | }, 25 | "embedding_model": { 26 | "defaultValue": "text-embedding-ada-002", 27 | "type": "String", 28 | "metadata": { 29 | "description": "The name of Azure OpenAI embedding model." 30 | } 31 | } 32 | }, 33 | "variables": {}, 34 | "resources": [ 35 | { 36 | "type": "Microsoft.CognitiveServices/accounts", 37 | "apiVersion": "2024-10-01", 38 | "name": "[parameters('account_name')]", 39 | "location": "[parameters('account_location')]", 40 | "sku": { 41 | "name": "S0" 42 | }, 43 | "kind": "OpenAI", 44 | "properties": { 45 | "networkAcls": { 46 | "defaultAction": "Allow", 47 | "virtualNetworkRules": [], 48 | "ipRules": [] 49 | }, 50 | "publicNetworkAccess": "Enabled" 51 | } 52 | }, 53 | { 54 | "type": "Microsoft.CognitiveServices/accounts/deployments", 55 | "apiVersion": "2024-10-01", 56 | "name": "[concat(parameters('account_name'), '/', parameters('deployment_name'))]", 57 | "dependsOn": [ 58 | "[resourceId('Microsoft.CognitiveServices/accounts', parameters('account_name'))]" 59 | ], 60 | "sku": { 61 | "name": "GlobalStandard", 62 | "capacity": 50 63 | }, 64 | "properties": { 65 | "model": { 66 | "format": "OpenAI", 67 | "name": "gpt-4.1", 68 | "version": "2025-04-14" 69 | }, 70 | "versionUpgradeOption": "OnceNewDefaultVersionAvailable", 71 | "currentCapacity": 50, 72 | "raiPolicyName": "Microsoft.DefaultV2" 73 | } 74 | }, 75 | { 76 | "type": "Microsoft.CognitiveServices/accounts/deployments", 77 | "apiVersion": "2024-10-01", 78 | "name": "[concat(parameters('account_name'), '/', parameters('embedding_model'))]", 79 | "dependsOn": [ 80 | "[resourceId('Microsoft.CognitiveServices/accounts', parameters('account_name'))]" 81 | ], 82 | "sku": { 83 | "name": "GlobalStandard", 84 | "capacity": 150 85 | }, 86 | "properties": { 87 | "model": { 88 | "format": "OpenAI", 89 | "name": "text-embedding-ada-002", 90 | "version": "1" 91 | }, 92 | "versionUpgradeOption": "NoAutoUpgrade", 93 | "currentCapacity": 150, 94 | "raiPolicyName": "Microsoft.DefaultV2" 95 | } 96 | } 97 | ] 98 | } 99 | -------------------------------------------------------------------------------- /DotNet/EF-Core-9/Migrations/20241030213329_InitialCreate.Designer.cs: -------------------------------------------------------------------------------- 1 | // <auto-generated /> 2 | using EFCoreVectors; 3 | using Microsoft.EntityFrameworkCore; 4 | using Microsoft.EntityFrameworkCore.Infrastructure; 5 | using Microsoft.EntityFrameworkCore.Metadata; 6 | using Microsoft.EntityFrameworkCore.Migrations; 7 | using Microsoft.EntityFrameworkCore.Storage.ValueConversion; 8 | 9 | #nullable disable 10 | 11 | namespace EFCoreVectors.Migrations 12 | { 13 | [DbContext(typeof(BloggingContext))] 14 | [Migration("20241030213329_InitialCreate")] 15 | partial class InitialCreate 16 | { 17 | /// <inheritdoc /> 18 | protected override void BuildTargetModel(ModelBuilder modelBuilder) 19 | { 20 | #pragma warning disable 612, 618 21 | modelBuilder 22 | .HasAnnotation("ProductVersion", "8.0.10") 23 | .HasAnnotation("Relational:MaxIdentifierLength", 128); 24 | 25 | SqlServerModelBuilderExtensions.UseIdentityColumns(modelBuilder); 26 | 27 | modelBuilder.Entity("EFCoreVectors.Blog", b => 28 | { 29 | b.Property<int>("BlogId") 30 |
.ValueGeneratedOnAdd() 31 | .HasColumnType("int"); 32 | 33 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property<int>("BlogId")); 34 | 35 | b.Property<string>("Name") 36 | .IsRequired() 37 | .HasColumnType("nvarchar(450)"); 38 | 39 | b.Property<string>("Url") 40 | .IsRequired() 41 | .HasColumnType("nvarchar(max)"); 42 | 43 | b.HasKey("BlogId"); 44 | 45 | b.HasIndex("Name") 46 | .IsUnique(); 47 | 48 | b.ToTable("Blogs"); 49 | }); 50 | 51 | modelBuilder.Entity("EFCoreVectors.Post", b => 52 | { 53 | b.Property<int>("PostId") 54 | .ValueGeneratedOnAdd() 55 | .HasColumnType("int"); 56 | 57 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property<int>("PostId")); 58 | 59 | b.Property<int>("BlogId") 60 | .HasColumnType("int"); 61 | 62 | b.Property<string>("Content") 63 | .IsRequired() 64 | .HasColumnType("nvarchar(max)"); 65 | 66 | b.Property<float[]>("Embedding") 67 | .IsRequired() 68 | .HasColumnType("vector(1536)"); 69 | 70 | b.Property<string>("Title") 71 | .IsRequired() 72 | .HasColumnType("nvarchar(450)"); 73 | 74 | b.HasKey("PostId"); 75 | 76 | b.HasIndex("BlogId"); 77 | 78 | b.HasIndex("Title") 79 | .IsUnique(); 80 | 81 | b.ToTable("Posts"); 82 | }); 83 | 84 | modelBuilder.Entity("EFCoreVectors.Post", b => 85 | { 86 | b.HasOne("EFCoreVectors.Blog", "Blog") 87 | .WithMany("Posts") 88 | .HasForeignKey("BlogId") 89 | .OnDelete(DeleteBehavior.Cascade) 90 | .IsRequired(); 91 | 92 | b.Navigation("Blog"); 93 | }); 94 | 95 | modelBuilder.Entity("EFCoreVectors.Blog", b => 96 | { 97 | b.Navigation("Posts"); 98 | }); 99 | #pragma warning restore 612, 618 100 | } 101 | } 102 | } 103 | -------------------------------------------------------------------------------- /DotNet/EF-Core-10/Migrations/20250910190225_InitialCreate.Designer.cs: -------------------------------------------------------------------------------- 1 | // <auto-generated /> 2 | using EFCoreVectors; 3 | using Microsoft.Data.SqlTypes; 4 | using Microsoft.EntityFrameworkCore; 5 | using Microsoft.EntityFrameworkCore.Infrastructure; 6 | using Microsoft.EntityFrameworkCore.Metadata; 7 | using Microsoft.EntityFrameworkCore.Migrations; 8 | using Microsoft.EntityFrameworkCore.Storage.ValueConversion; 9 | 10 | #nullable disable 11 | 12 | namespace EFCore10Vectors.Migrations 13 | { 14 | [DbContext(typeof(BloggingContext))] 15 | [Migration("20250910190225_InitialCreate")] 16 | partial class InitialCreate 17 | { 18 | /// <inheritdoc /> 19 | protected override void BuildTargetModel(ModelBuilder modelBuilder) 20 | { 21 | #pragma warning disable 612, 618 22 | modelBuilder 23 | .HasAnnotation("ProductVersion", "10.0.0-rc.1.25451.107") 24 | .HasAnnotation("Relational:MaxIdentifierLength", 128); 25 | 26 | SqlServerModelBuilderExtensions.UseIdentityColumns(modelBuilder); 27 | 28 | modelBuilder.Entity("EFCoreVectors.Blog", b => 29 | { 30 | b.Property<int>("BlogId") 31 | .ValueGeneratedOnAdd() 32 | .HasColumnType("int"); 33 | 34 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property<int>("BlogId")); 35 | 36 | b.Property<string>("Name") 37 | .IsRequired() 38 | .HasColumnType("nvarchar(450)"); 39 | 40 | b.Property<string>("Url") 41 | .IsRequired() 42 | .HasColumnType("nvarchar(max)"); 43 | 44 | b.HasKey("BlogId"); 45 | 46 | b.HasIndex("Name") 47 | .IsUnique(); 48 | 49 | b.ToTable("Blogs"); 50 | }); 51 | 52 | modelBuilder.Entity("EFCoreVectors.Post", b => 53 | { 54 | b.Property<int>("PostId") 55 | .ValueGeneratedOnAdd() 56 | .HasColumnType("int"); 57 | 58 | SqlServerPropertyBuilderExtensions.UseIdentityColumn(b.Property<int>("PostId")); 59 | 60 | b.Property<int>("BlogId") 61 | .HasColumnType("int"); 62 | 63 | b.Property<string>("Content") 64 | .IsRequired() 65 |
.HasColumnType("nvarchar(max)"); 66 | 67 | b.Property>("Embedding") 68 | .HasColumnType("vector(1536)"); 69 | 70 | b.Property("Title") 71 | .IsRequired() 72 | .HasColumnType("nvarchar(450)"); 73 | 74 | b.HasKey("PostId"); 75 | 76 | b.HasIndex("BlogId"); 77 | 78 | b.HasIndex("Title") 79 | .IsUnique(); 80 | 81 | b.ToTable("Posts"); 82 | }); 83 | 84 | modelBuilder.Entity("EFCoreVectors.Post", b => 85 | { 86 | b.HasOne("EFCoreVectors.Blog", "Blog") 87 | .WithMany("Posts") 88 | .HasForeignKey("BlogId") 89 | .OnDelete(DeleteBehavior.Cascade) 90 | .IsRequired(); 91 | 92 | b.Navigation("Blog"); 93 | }); 94 | 95 | modelBuilder.Entity("EFCoreVectors.Blog", b => 96 | { 97 | b.Navigation("Posts"); 98 | }); 99 | #pragma warning restore 612, 618 100 | } 101 | } 102 | } 103 | -------------------------------------------------------------------------------- /DiskANN/diskann-quickstart-azure-sql.sql: -------------------------------------------------------------------------------- 1 | -- Step 1: Create a sample table with a VECTOR(5) column 2 | DROP TABLE IF EXISTS dbo.Articles; 3 | CREATE TABLE dbo.Articles 4 | ( 5 | id INT PRIMARY KEY, 6 | title NVARCHAR(100), 7 | content NVARCHAR(MAX), 8 | embedding VECTOR(5) 9 | ); 10 | 11 | -- Step 2: Insert sample data 12 | INSERT INTO Articles (id, title, content, embedding) 13 | VALUES 14 | (1, 'Intro to AI', 'This article introduces AI concepts.', '[0.1, 0.2, 0.3, 0.4, 0.5]'), 15 | (2, 'Deep Learning', 'Deep learning is a subset of ML.', '[0.2, 0.1, 0.4, 0.3, 0.6]'), 16 | (3, 'Neural Networks', 'Neural networks are powerful models.', '[0.3, 0.3, 0.3, 0.5, 0.1]'), 17 | (4, 'Machine Learning Basics', 'ML basics for beginners.', '[0.4, 0.5, 0.1, 0.7, 0.3]'), 18 | (5, 'Advanced AI', 'Exploring advanced AI techniques.', '[0.5, 0.4, 0.1, 0.1, 0.2]'), 19 | (6, 'AI in Healthcare', 'AI applications in healthcare.', '[0.6, 0.3, 0.4, 0.3, 0.2]'), 20 | (7, 'AI Ethics', 'Ethical considerations in AI.', '[0.1, 0.9, 0.5, 0.4, 0.3]'), 21 | (8, 'AI and Society', 'Impact of AI on society.', '[0.2, 0.3, 0.5, 0.5, 0.4]'), 22 | (9, 'Future of AI', 'Predictions for the future of AI.', '[0.8, 0.4, 0.5, 0.1, 0.2]'), 23 | (10, 'AI Innovations', 'Latest innovations in AI.', '[0.4, 0.7, 0.2, 0.3, 0.1]'); 24 | GO 25 | 26 | -- Step 3: Create a vector index on the embedding column 27 | CREATE VECTOR INDEX vec_idx ON Articles(embedding) 28 | WITH (METRIC = 'Cosine', TYPE = 'DiskANN') 29 | ON [PRIMARY]; 30 | GO 31 | 32 | -- Step 4: Perform a vector similarity search 33 | DECLARE @qv VECTOR(5) = (SELECT TOP(1) embedding FROM Articles WHERE id = 1); 34 | SELECT 35 | t.id, 36 | t.title, 37 | t.content, 38 | s.distance 39 | FROM 40 | VECTOR_SEARCH( 41 | TABLE = Articles AS t, 42 | COLUMN = embedding, 43 | SIMILAR_TO = @qv, 44 | METRIC = 'Cosine', 45 | TOP_N = 3 46 | ) AS s 47 | ORDER BY s.distance, t.title; 48 | GO 49 | 50 | -- Step 5: View index details 51 | SELECT index_id, [type], [type_desc], vector_index_type, distance_metric, build_parameters FROM sys.vector_indexes WHERE [name] = 'vec_idx'; 52 | GO 53 | 54 | -- Step 6a: Data modification is disabled when DiskANN exist on a table 55 | INSERT INTO Articles (id, title, content, embedding) 56 | VALUES 57 | (11, 'Vectors and Embeddings', 'Everything about vectors and embeddings.', '[0.1, 0.2, 0.3, 0.4, 0.6]'); 58 | GO 59 | 60 | -- Step 6b: Allow index to go stale 61 | ALTER DATABASE SCOPED CONFIGURATION 62 | SET ALLOW_STALE_VECTOR_INDEX = ON 63 | GO 64 | 65 | -- Step 6c: Data modification is now works 66 | INSERT INTO Articles (id, title, content, 
embedding) 67 | VALUES 68 | (11, 'Vectors and Embeddings', 'Everything about vectors and embeddings.', '[0.1, 0.2, 0.3, 0.4, 0.6]'); 69 | GO 70 | 71 | -- Step 7: Perform a vector similarity search; new data is not visible yet 72 | DECLARE @qv VECTOR(5) = (SELECT TOP(1) embedding FROM Articles WHERE id = 1); 73 | SELECT 74 | t.id, 75 | t.title, 76 | t.content, 77 | s.distance 78 | FROM 79 | VECTOR_SEARCH( 80 | TABLE = Articles AS t, 81 | COLUMN = embedding, 82 | SIMILAR_TO = @qv, 83 | METRIC = 'Cosine', 84 | TOP_N = 3 85 | ) AS s 86 | ORDER BY s.distance, t.title; 87 | GO 88 | 89 | -- Step 8: Re-create the vector index on the embedding column 90 | DROP INDEX vec_idx ON Articles; 91 | CREATE VECTOR INDEX vec_idx ON Articles(embedding) 92 | WITH (METRIC = 'Cosine', TYPE = 'DiskANN') 93 | ON [PRIMARY]; 94 | GO 95 | 96 | -- Step 9: New data is now visible 97 | DECLARE @qv VECTOR(5) = (SELECT TOP(1) embedding FROM Articles WHERE id = 1); 98 | SELECT 99 | t.id, 100 | t.title, 101 | t.content, 102 | s.distance 103 | FROM 104 | VECTOR_SEARCH( 105 | TABLE = Articles AS t, 106 | COLUMN = embedding, 107 | SIMILAR_TO = @qv, 108 | METRIC = 'Cosine', 109 | TOP_N = 3 110 | ) AS s 111 | ORDER BY s.distance; 112 | GO 113 | 114 | -- Step 10: Clean up by dropping the vector index 115 | DROP INDEX vec_idx ON Articles; -------------------------------------------------------------------------------- /Retrieval-Augmented-Generation/Readme.md: -------------------------------------------------------------------------------- 1 | # Create, Store, and Query OpenAI Embeddings in Azure SQL DB 2 | 3 | Learn how to integrate the Azure OpenAI API with Azure SQL DB to create, store, and query embeddings for advanced similarity searches and LLM generation augmentation. 4 | 5 | ## Tutorial Overview 6 | 7 | This Python [notebook](RetrievalAugmentedGeneration.ipynb) will teach you to: 8 | 9 | - **Create Embeddings**: Generate embeddings from content using the Azure OpenAI API. 10 | - **Vector Database Utilization**: Use Azure SQL DB to store embeddings and perform similarity searches. 11 | - **LLM Generation Augmentation**: Enhance language model generation with embeddings from a vector database. In this case we use the embeddings to inform a GPT-4 chat model, enabling it to provide rich, context-aware answers about products based on past customer reviews. 12 | 13 | ## Dataset 14 | 15 | We use the Fine Foods Review Dataset from Kaggle, which contains Amazon reviews of fine foods. 16 | 17 | - For simplicity, this tutorial uses a smaller sample [Fine Foods Review Dataset](../Datasets/Reviews.csv) to demonstrate embedding generation. 18 | - Alternatively, if **you wish to bypass embedding generation** and jump straight to similarity search in Azure SQL DB,
19 | 
20 | ## Prerequisites
21 | 
22 | - **Azure Subscription**: [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
23 | - **Azure SQL Database**: [Set up your database for free](https://learn.microsoft.com/azure/azure-sql/database/free-offer?view=azuresql)
24 | - **Azure Data Studio**: Download [here](https://azure.microsoft.com/products/data-studio) to manage your Azure SQL database and [execute the notebook](https://learn.microsoft.com/azure-data-studio/notebooks/notebooks-python-kernel)
25 | 
26 | ## Additional Requirements for Embedding Generation
27 | 
28 | - **Azure OpenAI Access**: Apply for access in the desired Azure subscription at [https://aka.ms/oai/access](https://aka.ms/oai/access)
29 | - **Azure OpenAI Resource**: Deploy an embeddings model (e.g., `text-embedding-3-small` or `text-embedding-ada-002`) and a `GPT-4` model for chat completion. Refer to the [resource deployment guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource)
30 | - **Python**: Version 3.7.1 or later from Python.org.
31 | - **Python Libraries**: Install the required libraries: openai, num2words, matplotlib, plotly, scipy, scikit-learn, pandas, tiktoken, and pyodbc.
32 | - **Jupyter Notebooks**: Use within [Azure Data Studio](https://learn.microsoft.com/en-us/azure-data-studio/notebooks/notebooks-guidance) or Visual Studio Code.
33 | 
34 | Code snippets are adapted from the [Azure OpenAI Service embeddings Tutorial](https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/embeddings?tabs=python-new%2Ccommand-line&pivots=programming-language-python).
35 | 
36 | ## Getting Started
37 | 
38 | 1. **Database Setup**: Execute the SQL commands from the `createtable.sql` script to create the necessary table in your database.
39 | 2. **Model Deployment**: Deploy an embeddings model (`text-embedding-3-small` or `text-embedding-ada-002`) and a `GPT-4` model for chat completion. Note the two model deployment names for later use.
40 | ![Deployed OpenAI Models](../Assets/modeldeployment.png)
41 | 3. **Connection String**: Find your Azure SQL DB connection string in the Azure portal under your database settings.
42 | 4. **Configuration**: Populate the `.env` file with your SQL server connection details, Azure OpenAI key, and endpoint values.
43 | 
44 | You can retrieve the Azure OpenAI *endpoint* and *key* here:
45 | 
46 | ![Azure OpenAI Endpoint and Key](../Assets/endpoint.png)
47 | 
48 | ## Running the Notebook
49 | 
50 | To [execute the notebook](https://learn.microsoft.com/azure-data-studio/notebooks/notebooks-python-kernel), connect to your Azure SQL database using Azure Data Studio, which can be downloaded [here](https://azure.microsoft.com/products/data-studio).
51 | 
52 | Then open the notebook [RetrievalAugmentedGeneration.ipynb](./RetrievalAugmentedGeneration.ipynb).
53 | 
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing to [project-title]
2 | 
3 | This project welcomes contributions and suggestions. Most contributions require you to agree to a
4 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
5 | the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
6 | 
7 | When you submit a pull request, a CLA bot will automatically determine whether you need to provide
8 | a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
9 | provided by the bot. You will only need to do this once across all repos using our CLA.
10 | 
11 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
12 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
13 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
14 | 
15 | - [Code of Conduct](#coc)
16 | - [Issues and Bugs](#issue)
17 | - [Feature Requests](#feature)
18 | - [Submission Guidelines](#submit)
19 | 
20 | ## Code of Conduct
21 | Help us keep this project open and inclusive. Please read and follow our [Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
22 | 
23 | ## Found an Issue?
24 | If you find a bug in the source code or a mistake in the documentation, you can help us by
25 | [submitting an issue](#submit-issue) to the GitHub Repository. Even better, you can
26 | [submit a Pull Request](#submit-pr) with a fix.
27 | 
28 | ## Want a Feature?
29 | You can *request* a new feature by [submitting an issue](#submit-issue) to the GitHub
30 | Repository. If you would like to *implement* a new feature, please submit an issue with
31 | a proposal for your work first, to be sure that we can use it.
32 | 
33 | * **Small Features** can be crafted and directly [submitted as a Pull Request](#submit-pr).
34 | 
35 | ## Submission Guidelines
36 | 
37 | ### Submitting an Issue
38 | Before you submit an issue, search the archive; your question may already have been answered.
39 | 
40 | If your issue appears to be a bug, and it hasn't been reported, open a new issue.
41 | Help us to maximize the effort we can spend fixing issues and adding new
42 | features by not reporting duplicate issues. Providing the following information will increase the
43 | chances of your issue being dealt with quickly:
44 | 
45 | * **Overview of the Issue** - if an error is being thrown, a non-minified stack trace helps
46 | * **Version** - what version is affected (e.g. 0.1.2)
47 | * **Motivation for or Use Case** - explain what you are trying to do and why the current behavior is a bug for you
48 | * **Browsers and Operating System** - is this a problem with all browsers?
49 | * **Reproduce the Error** - provide a live example or an unambiguous set of steps
50 | * **Related Issues** - has a similar issue been reported before?
51 | * **Suggest a Fix** - if you can't fix the bug yourself, perhaps you can point to what might be
52 |   causing the problem (line of code or commit)
53 | 
54 | You can file new issues by providing the above information at the corresponding repository's issues link: https://github.com/[organization-name]/[repository-name]/issues/new.
55 | 
56 | ### Submitting a Pull Request (PR)
57 | Before you submit your Pull Request (PR) consider the following guidelines:
58 | 
59 | * Search the repository (https://github.com/[organization-name]/[repository-name]/pulls) for an open or closed PR
60 |   that relates to your submission. You don't want to duplicate effort.
61 | 
62 | * Make your changes in a new git fork:
63 | 
64 |   * Commit your changes using a descriptive commit message
65 |   * Push your fork to GitHub
66 |   * In GitHub, create a pull request
67 |   * If we suggest changes then:
68 |     * Make the required updates.
69 |     * Rebase your fork and force push to your GitHub repository (this will update your Pull Request):
70 | 
71 | ```shell
72 | git rebase master -i
73 | git push -f
74 | ```
75 | 
76 | That's it! Thank you for your contribution!
77 | 
--------------------------------------------------------------------------------
/DotNet/Dapper/app/Program.cs:
--------------------------------------------------------------------------------
1 | using System.Text.Json;
2 | using Dapper;
3 | using DotNetEnv;
4 | using Microsoft.Data.SqlClient;
5 | using Microsoft.Data.SqlTypes;
6 | using DapperVectors;
7 | 
8 | // Load .env
9 | Env.Load();
10 | 
11 | // Update Dapper Type Handler to handle SqlVector
12 | SqlMapper.AddTypeHandler(new VectorTypeHandler());
13 | 
14 | // Get connection string from environment variable
15 | var connectionString = Environment.GetEnvironmentVariable("MSSQL");
16 | if (string.IsNullOrWhiteSpace(connectionString))
17 | {
18 |     Console.WriteLine("Please set the MSSQL environment variable (for example in a .env file). Exiting.");
19 |     return;
20 | }
21 | 
22 | // Create SQL connection
23 | Console.WriteLine("Connecting to database...");
24 | await using var connection = new SqlConnection(connectionString);
25 | 
26 | // Confirm database project objects exist
27 | var tableExists = await connection.ExecuteScalarAsync<int?>(
28 |     "SELECT COUNT(*) FROM sys.tables WHERE (name = 'Blogs' or name = 'Posts') AND schema_id = SCHEMA_ID('dbo')");
29 | 
30 | if (tableExists.GetValueOrDefault() != 2)
31 | {
32 |     Console.WriteLine("The database schema does not appear to be deployed. Please publish the database project in the `db` folder before running this sample.");
33 |     return;
34 | }
35 | 
36 | // Ensure Sample blog exists
37 | const string sampleBlogName = "Sample blog";
38 | var blogId = await connection.QuerySingleOrDefaultAsync<int?>(
39 |     "SELECT BlogId FROM dbo.Blogs WHERE Name = @Name", new { Name = sampleBlogName });
40 | 
41 | if (!blogId.HasValue)
42 | {
43 |     Console.WriteLine("Creating 'Sample blog'...");
44 |     blogId = await connection.ExecuteScalarAsync<int>(
45 |         "INSERT INTO dbo.Blogs (Name, Url) OUTPUT INSERTED.BlogId VALUES (@Name, @Url)",
46 |         new { Name = sampleBlogName, Url = "https://devblogs.microsoft.com" }
47 |     );
48 | }
49 | 
50 | // Choose real or mock embedding client
51 | Console.WriteLine("Creating embedding client...");
52 | var embeddingClient = new AzureOpenAIEmbeddingClient();
53 | // var embeddingClient = new MockEmbeddingClient();
54 | 
55 | Console.WriteLine("Adding posts...");
56 | var content = File.ReadAllText("content.json");
57 | var newPosts = JsonSerializer.Deserialize<List<Post>>(content)!;
58 | foreach (var np in newPosts)
59 | {
60 |     // compute embedding
61 |     var vector = embeddingClient.GetEmbedding(np.Content);
62 | 
63 |     // check if post exists for this blog
64 |     var existingPostId = await connection.QuerySingleOrDefaultAsync<int?>(
65 |         "SELECT PostId FROM dbo.Posts WHERE BlogId = @BlogId and Title = @Title",
66 |         new { BlogId = blogId, Title = np.Title });
67 | 
68 |     if (existingPostId.HasValue)
69 |     {
70 |         await connection.ExecuteAsync(
71 |             "UPDATE dbo.Posts SET Content = @Content, Embedding = @Embedding WHERE PostId = @PostId",
72 |             new { np.Content, Embedding = vector, PostId = existingPostId.Value }
73 |         );
74 |     }
75 |     else
76 |     {
77 |         await connection.ExecuteAsync(
78 |             "INSERT INTO dbo.Posts (Title, Content, Embedding, BlogId) VALUES (@Title, @Content, @Embedding, @BlogId)",
79 |             new { np.Title, np.Content, Embedding = vector, BlogId = blogId }
80 |         );
81 |     }
82 | }
83 | 
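// For reference, the VectorTypeHandler registered at startup lives in TypeHandler.cs.
// A minimal sketch of what such a Dapper type handler can look like (the actual file
// may differ; SqlDbTypeExtensions.Vector assumes Microsoft.Data.SqlClient 6.1+):
//
//   public class VectorTypeHandler : SqlMapper.TypeHandler<SqlVector<float>>
//   {
//       public override void SetValue(IDbDataParameter parameter, SqlVector<float> value)
//       {
//           var p = (SqlParameter)parameter;
//           p.SqlDbType = SqlDbTypeExtensions.Vector; // map to the SQL `vector` type
//           p.Value = value;
//       }
//
//       public override SqlVector<float> Parse(object value) => (SqlVector<float>)value;
//   }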
// Query for similar posts
85 | Console.WriteLine("\n----------\n");
86 | string searchPhrase = "I want to use Azure SQL, Dapper and vectors in my app!";
87 | Console.WriteLine($"Search phrase is: '{searchPhrase}'...");
88 | 
89 | var queryVector = embeddingClient.GetEmbedding(searchPhrase);
90 | 
91 | var sql = @"
92 |     SELECT TOP (5) p.Title,
93 |            VECTOR_DISTANCE('cosine', p.Embedding, @vector) AS Distance
94 |     FROM dbo.Posts p
95 |     JOIN dbo.Blogs b ON p.BlogId = b.BlogId
96 |     WHERE b.Name = @BlogName
97 |     ORDER BY Distance ASC;";
98 | 
99 | var results = await connection.QueryAsync(sql, new { vector = queryVector, BlogName = sampleBlogName });
100 | 
101 | Console.WriteLine("Similar posts found:");
102 | foreach (var r in results)
103 | {
104 |     Console.WriteLine($"Post: {r.Title}, Distance: {r.Distance}");
105 | }
106 | 
--------------------------------------------------------------------------------
/SemanticKernel/dotnet/MemoryStoreSample/Program.cs:
--------------------------------------------------------------------------------
1 | using System.Text;
2 | using Microsoft.SemanticKernel;
3 | using Microsoft.SemanticKernel.ChatCompletion;
4 | using Microsoft.SemanticKernel.Connectors.OpenAI;
5 | using Microsoft.SemanticKernel.Connectors.SqlServer;
6 | using Microsoft.SemanticKernel.Memory;
7 | using DotNetEnv;
8 | using System.Data.Common;
9 | using Microsoft.SemanticKernel.Embeddings;
10 | 
11 | Env.Load();
12 | 
13 | #pragma warning disable SKEXP0001, SKEXP0010, SKEXP0020
14 | 
15 | string azureOpenAIEndpoint = Env.GetString("AZURE_OPENAI_ENDPOINT")!;
16 | string azureOpenAIApiKey = Env.GetString("AZURE_OPENAI_API_KEY")!;
17 | string embeddingModelDeploymentName = Env.GetString("AZURE_OPENAI_EMBEDDING_MODEL")!;
18 | string chatModelDeploymentName = Env.GetString("AZURE_OPENAI_CHAT_MODEL")!;
19 | string connectionString = Env.GetString("AZURE_SQL_CONNECTION_STRING")!;
20 | string tableName = "SemanticKernel_Memory";
21 | 
22 | Console.WriteLine("Creating Semantic Kernel services...");
23 | var kernel = Kernel.CreateBuilder()
24 |     .AddAzureOpenAIChatCompletion(chatModelDeploymentName, azureOpenAIEndpoint, azureOpenAIApiKey)
25 |     .AddAzureOpenAITextEmbeddingGeneration(embeddingModelDeploymentName, azureOpenAIEndpoint, azureOpenAIApiKey)
26 |     .Build();
27 | 
28 | var textEmbeddingGenerationService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
29 | 
30 | Console.WriteLine("Connecting to Memory Store...");
31 | var memory = new MemoryBuilder()
32 |     .WithSqlServerMemoryStore(connectionString)
33 |     .WithTextEmbeddingGeneration(textEmbeddingGenerationService)
34 |     .Build();
35 | 
36 | Console.WriteLine("Adding memories...");
37 | await memory.SaveInformationAsync(tableName, id: "semantic-kernel-mssql", text: "With the new connector Microsoft.SemanticKernel.Connectors.SqlServer it is possible to efficiently store and retrieve memories thanks to the newly added vector support");
38 | await memory.SaveInformationAsync(tableName, id: "semantic-kernel-azuresql", text: "At the moment Microsoft.SemanticKernel.Connectors.SqlServer can be used only with Azure SQL");
39 | await memory.SaveInformationAsync(tableName, id: "azuresql-vector-1", text: "Azure SQL support for vectors is in Public Preview and can be used by anyone in Azure right away");
40 | await memory.SaveInformationAsync(tableName, id: "pizza-favourite-food", text: "Pizza is one of the favourite foods in the world.");
41 | 
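// The console loop below implements a simple RAG pattern: for each question, the
// top-3 semantically closest memories are retrieved from the SQL memory store and
// injected into the chat history as extra context before asking the model; the
// injected context message is removed again after each turn to keep the history lean.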
Console.WriteLine(""); 45 | 46 | var ai = kernel.GetRequiredService(); 47 | var chatHistory = new ChatHistory("You are an AI assistant that helps people find information. Only use the information provided in the memory to answer the questions. Do not make up any information."); 48 | var consoleMessages = new StringBuilder(); 49 | while (true) 50 | { 51 | Console.Write("Question: "); 52 | var question = Console.ReadLine()!; 53 | 54 | Console.WriteLine("\nSearching information from the memory..."); 55 | consoleMessages.Clear(); 56 | await foreach (var result in memory.SearchAsync(tableName, question, limit: 3)) 57 | { 58 | consoleMessages.AppendLine(result.Metadata.Text); 59 | } 60 | if (consoleMessages.Length != 0) { 61 | Console.WriteLine("\nFound information from the memory:"); 62 | Console.WriteLine(consoleMessages.ToString()); 63 | } 64 | 65 | Console.WriteLine("Answer: "); 66 | var contextToRemove = -1; 67 | if (consoleMessages.Length != 0) 68 | { 69 | consoleMessages.Insert(0, "Here's some additional information: "); 70 | contextToRemove = chatHistory.Count; 71 | chatHistory.AddUserMessage(consoleMessages.ToString()); 72 | } 73 | 74 | chatHistory.AddUserMessage(question); 75 | 76 | consoleMessages.Clear(); 77 | await foreach (var message in ai.GetStreamingChatMessageContentsAsync(chatHistory)) 78 | { 79 | Console.Write(message); 80 | consoleMessages.Append(message.Content); 81 | } 82 | Console.WriteLine(); 83 | chatHistory.AddAssistantMessage(consoleMessages.ToString()); 84 | 85 | if (contextToRemove >= 0) 86 | chatHistory.RemoveAt(contextToRemove); 87 | 88 | Console.WriteLine(); 89 | } 90 | -------------------------------------------------------------------------------- /Hybrid-Search/hybrid_search.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pyodbc 3 | import logging 4 | import json 5 | from sentence_transformers import SentenceTransformer 6 | from dotenv import load_dotenv 7 | from utilities import get_mssql_connection 8 | 9 | load_dotenv() 10 | 11 | if __name__ == '__main__': 12 | print('Initializing sample...') 13 | print('Getting embeddings...') 14 | sentences = [ 15 | 'The dog is barking', 16 | 'The cat is purring', 17 | 'The bear is growling', 18 | 'The bear roars' 19 | ] 20 | model = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1') # returns a 384-dimensional vector 21 | embeddings = model.encode(sentences) 22 | 23 | conn = get_mssql_connection() 24 | 25 | print('Cleaning up the database...') 26 | try: 27 | cursor = conn.cursor() 28 | cursor.execute("DELETE FROM dbo.documents;") 29 | cursor.commit(); 30 | finally: 31 | cursor.close() 32 | 33 | print('Saving documents and embeddings in the database...') 34 | try: 35 | cursor = conn.cursor() 36 | 37 | for id, (content, embedding) in enumerate(zip(sentences, embeddings)): 38 | cursor.execute(f""" 39 | INSERT INTO dbo.documents (id, content, embedding) VALUES (?, ?, CAST(? AS VECTOR(384))); 40 | """, 41 | id, 42 | content, 43 | json.dumps(embedding.tolist()) 44 | ) 45 | 46 | cursor.commit() 47 | finally: 48 | cursor.close() 49 | 50 | print('Searching for similar documents...') 51 | print('Getting embeddings...') 52 | query = 'growling bear' 53 | embedding = model.encode(query) 54 | 55 | print(f'Querying database for "{query}"...') 56 | k = 5 57 | try: 58 | cursor = conn.cursor() 59 | 60 | results = cursor.execute(f""" 61 | DECLARE @k INT = ?; 62 | DECLARE @q NVARCHAR(1000) = ?; 63 | DECLARE @v VECTOR(384) = CAST(? 
63 |             DECLARE @v VECTOR(384) = CAST(? AS VECTOR(384));
64 |             WITH keyword_search AS (
65 |                 SELECT TOP(@k)
66 |                     id,
67 |                     RANK() OVER (ORDER BY ft_rank DESC) AS rank
68 |                 FROM
69 |                 (
70 |                     SELECT TOP(@k)
71 |                         id,
72 |                         ftt.[RANK] AS ft_rank
73 |                     FROM
74 |                         dbo.documents
75 |                     INNER JOIN
76 |                         FREETEXTTABLE(dbo.documents, *, @q) AS ftt ON dbo.documents.id = ftt.[KEY]
77 |                     ORDER BY
78 |                         ft_rank DESC
79 |                 ) AS freetext_documents
80 |             ),
81 |             semantic_search AS
82 |             (
83 |                 SELECT TOP(@k)
84 |                     id,
85 |                     RANK() OVER (ORDER BY cosine_distance) AS rank
86 |                 FROM
87 |                 (
88 |                     SELECT
89 |                         id,
90 |                         VECTOR_DISTANCE('cosine', @v, embedding) AS cosine_distance
91 |                     FROM
92 |                         dbo.documents
93 |                 ) AS similar_documents
94 |             ),
95 |             result AS (
96 |                 SELECT TOP(@k)
97 |                     COALESCE(ss.id, ks.id) AS id,
98 |                     ss.rank AS semantic_rank,
99 |                     ks.rank AS keyword_rank,
100 |                     COALESCE(1.0 / (@k + ss.rank), 0.0) +
101 |                     COALESCE(1.0 / (@k + ks.rank), 0.0) AS score -- Reciprocal Rank Fusion (RRF)
102 |                 FROM
103 |                     semantic_search ss
104 |                 FULL OUTER JOIN
105 |                     keyword_search ks ON ss.id = ks.id
106 |                 ORDER BY
107 |                     score DESC
108 |             )
109 |             SELECT
110 |                 d.id,
111 |                 semantic_rank,
112 |                 keyword_rank,
113 |                 score,
114 |                 content
115 |             FROM
116 |                 result AS r
117 |             INNER JOIN
118 |                 dbo.documents AS d ON r.id = d.id
119 |             """,
120 |             k,
121 |             query,
122 |             json.dumps(embedding.tolist()),
123 |         )
124 | 
125 |         for row in results:
126 |             print(f'Document: {row[0]} (content: {row[4]}) -> RRF score: {row[3]:0.4} (Semantic Rank: {row[1]}, Keyword Rank: {row[2]})')
127 | 
128 |     finally:
129 |         cursor.close()
130 | 
--------------------------------------------------------------------------------
/Langchain-SQL-RAG/readme.md:
--------------------------------------------------------------------------------
1 | # Building AI-powered apps on Azure SQL Database using LLMs and LangChain
2 | 
3 | Azure SQL Database now supports native vector search capabilities, bringing the power of vector search operations directly to your SQL databases. You can read the full announcement of the public preview [here](https://devblogs.microsoft.com/azure-sql/exciting-announcement-public-preview-of-native-vector-support-in-azure-sql-database).
4 | 
5 | We are also thrilled to announce the release of [langchain-sqlserver](https://pypi.org/project/langchain-sqlserver) version 0.1.1. You can use this package to manage LangChain vector stores in SQL Server. This new release brings enhanced capabilities by parsing both ODBC connection strings and SQLAlchemy-format connection strings, making it easier than ever to integrate with Azure SQL DB.
6 | 
7 | In this step-by-step tutorial, we will show you how to add generative AI features to your own applications with just a few lines of code using Azure SQL DB, [LangChain](https://pypi.org/project/langchain-sqlserver), and LLMs.
8 | 
9 | ## Dataset
10 | 
11 | The Harry Potter series, written by J.K. Rowling, is a globally beloved collection of seven books that follow the journey of a young wizard, Harry Potter, and his friends as they battle the dark forces led by the evil Voldemort. Its captivating plot, rich characters, and imaginative world have made it one of the most famous and cherished series in literary history.
12 | 
13 | This sample dataset from [Kaggle](https://www.kaggle.com/datasets/shubhammaindola/harry-potter-books) contains 7 .txt files for the 7 Harry Potter books. For this demo we will only be using the first book - Harry Potter and the Sorcerer's Stone.
14 | 
15 | In this notebook, we will showcase two exciting use cases:
16 | 1. A sample Python application that can understand and respond to human-language queries about the data stored in your Azure SQL Database. This **Q&A system** leverages the power of the SQL Server vector store in LangChain to provide accurate and context-rich answers from the Harry Potter book.
17 | 1. Next, we will push the creative limits of the application by teaching it to generate new AI-driven **Harry Potter fan fiction** based on our existing dataset of Harry Potter books. This feature is sure to delight Potterheads, allowing them to explore new adventures and create their own magical stories.
18 | 
19 | ## Prerequisites
20 | 
21 | - **Azure Subscription**: [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
22 | 
23 | - **Azure SQL Database**: [Set up your database for free](https://learn.microsoft.com/azure/azure-sql/database/free-offer?view=azuresql)
24 | 
25 | - **Azure OpenAI Access**: Apply for access in the desired Azure subscription at [https://aka.ms/oai/access](https://aka.ms/oai/access)
26 | 
27 | - **Azure OpenAI Resource**: Deploy an embeddings model (e.g., `text-embedding-3-small` or `text-embedding-ada-002`) and a `GPT-4` model for chat completion. Refer to the [resource deployment guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource)
28 | 
29 | - **Azure Blob Storage**: Deploy an Azure [Blob Storage account](https://learn.microsoft.com/azure/storage/blobs/storage-quickstart-blobs-portal) to upload your dataset
30 | 
31 | - **Python**: Version 3.7.1 or later from Python.org. (Sample has been tested with Python 3.11)
32 | 
33 | - **Python Libraries**: Install the required libraries from requirements.txt
34 | 
35 | - **Jupyter Notebooks**: Use within [Azure Data Studio](https://learn.microsoft.com/azure-data-studio/notebooks/notebooks-guidance) or Visual Studio Code.
36 | 
37 | 
38 | ## Getting Started
39 | 
40 | 1. **Model Deployment**: Deploy an embeddings model (`text-embedding-3-small` or `text-embedding-ada-002`) and a `GPT-4` model for chat completion. Note the two model deployment names for use in the `.env` file.
41 | 
42 | ![Deployed OpenAI Models](../Assets/modeldeployment.png)
43 | 
44 | 2. **Connection String**: Find your Azure SQL DB connection string in the Azure portal under your database settings.
45 | 3. **Configuration**: Populate the `.env` file with your SQL server connection details, Azure OpenAI key and endpoint, API version, and model deployment names.
46 | 
47 | You can retrieve the Azure OpenAI _endpoint_ and _key_ here:
48 | 
49 | ![Azure OpenAI Endpoint and Key](../Assets/endpoint.png)
50 | 
51 | 
4. **Upload dataset**: In your [Blob Storage account](https://learn.microsoft.com/azure/storage/blobs/storage-quickstart-blobs-portal), create a container and upload the .txt file using the steps [here](https://learn.microsoft.com/azure/storage/blobs/storage-quickstart-blobs-portal)
52 | 
53 | ## Running the Notebook
54 | 
55 | To [execute the notebook](https://learn.microsoft.com/azure-data-studio/notebooks/notebooks-python-kernel), connect to your Azure SQL database using Azure Data Studio, which can be downloaded [here](https://azure.microsoft.com/products/data-studio).
--------------------------------------------------------------------------------
/5-Min-RAG-SQL-Accelerator/Step1-OneClick-Deployment/README.md:
--------------------------------------------------------------------------------
1 | # Azure Resource Deployment for Retrieval-Augmented Generation (RAG)
2 | 
3 | This repository provides ARM templates to deploy all required Azure resources for building Retrieval-Augmented Generation (RAG) pipelines. It supports both structured and unstructured data scenarios and is designed to simplify user onboarding when creating GenAI applications. The template includes the deployment of Azure SQL Database, an Azure Document Intelligence resource, Azure OpenAI resources, and models from AI Foundry.
4 | 
5 | ## Prerequisites
6 | Before deploying the template, ensure you have the following:
7 | 
8 | - An active **Azure for Students** subscription
9 | - Permissions to create resources in the selected resource group
10 | - Sufficient quota for Azure OpenAI resources
11 | 
12 | 
13 | ## Deploy to Azure
14 | 
15 | ### 🔹 RAG for Structured Data
16 | 
17 | Use this template if your data is stored in structured formats such as a .csv file.
18 | [![Deploy to Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FKushagra-2000%2FARM_SQL_OpenAI%2Frefs%2Fheads%2Fmain%2FRAG_deployment.json)
19 | 
20 | ---
21 | 
22 | ### 🔸 RAG for Unstructured Documents
23 | 
24 | Use this template if your data consists of PDFs, scanned documents, or other unstructured formats.
25 | [![Deploy to Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FKushagra-2000%2FARM_SQL_OpenAI%2Frefs%2Fheads%2Fmain%2FRAG_unstructured_deployment.json)
26 | 
27 | ### Parameters Description
28 | | Parameter Name | Description | Example |
29 | | :---------------: | :------------- | :-------: |
30 | | `serverName` | Name of the Azure SQL Server (logical server) to host your database. | Sample-SQL-server |
31 | | `sqlDBName` | Name of the SQL database to be created. | Sample-SQL-Database |
32 | | `location` | Azure region for all resources. Recommended to keep the default. | eastus2 |
33 | | `administratorLogin` | The administrator username of the SQL server for SQL authentication. | |
34 | | `administratorLoginPassword` | The administrator password of the SQL server for SQL authentication. | |
35 | | `OpenAI_account_name` | Name of the Azure OpenAI resource. | Sample-OpenAI-resource |
36 | | `OpenAI_account_location` | Region for the Azure OpenAI resource. | eastus2 |
37 | | `OpenAI_chat_completion_model` | Chat model to deploy. | gpt-4.1 |
38 | | `embedding_model` | Embedding model for vector search. | text-embedding-3-small |
39 | | `Document_Intelligence_account_name` | Name of the Azure Document Intelligence (Form Recognizer) resource. | sample-doc-intel |
40 | 
41 | ⚠️ Note: This is a demo deployment.
The default server name (`sample-sqlserver`) may already exist in your region, which can cause deployment errors. To avoid this, please customize the server (or resource) name by appending your name or initials.
42 | If you encounter an error, simply re-deploy with a unique resource name.
43 | 
44 | ## Firewall Configuration
45 | **Note:** After the deployment is completed successfully, you need to configure the firewall settings for the **SQL server** separately to allow access from your client IP addresses.
46 | 
47 | 1. Go to the deployed SQL Server in the Azure Portal.
48 | 2. Navigate to **Security > Networking**.
49 | 3. Under the firewall rules, add your client IP and click Save.
50 | 
51 | ## Clean Up Resources
52 | When you're finished using these resources, or if you want to start over again with a new free database (limit 10 per subscription), you can delete the resource group you created, which deletes all the resources within it.
53 | 
54 | To delete `myResourceGroup` and all its resources using the Azure portal:
55 | 
56 | 1. In the Azure portal, search for and select Resource groups, and then select `myResourceGroup` from the list.
57 | 2. On the Resource group page, select Delete resource group.
58 | 3. Under Type the resource group name, enter `myResourceGroup`, and then select Delete.
59 | 
60 | ## Troubleshooting
61 | If you encounter any issues during deployment, check the following:
62 | 
63 | - Ensure you have sufficient quota for Azure OpenAI resources.
64 | - Verify that all parameters are correctly specified.
65 | - Check the deployment logs in the Azure Portal for detailed error messages.
66 | 
67 | ## Resources
68 | For guidelines and information on any specific resource, check out the following Microsoft documentation:
69 | 
70 | - 📄 [Deploy Azure SQL Database for free](https://learn.microsoft.com/en-us/azure/azure-sql/database/free-offer?view=azuresql)
71 | - 📄 [Create a Document Intelligence resource](https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/how-to-guides/create-document-intelligence-resource?view=doc-intel-4.0.0#get-endpoint-url-and-keys)
72 | - 📄 [Create and deploy an Azure OpenAI Service resource](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal)
73 | 
--------------------------------------------------------------------------------
/SemanticKernel/dotnet/VectorStoreSample/Program.cs:
--------------------------------------------------------------------------------
1 | using Microsoft.Extensions.VectorData;
2 | using Microsoft.SemanticKernel;
3 | using Kernel = Microsoft.SemanticKernel.Kernel;
4 | using Microsoft.SemanticKernel.Connectors.SqlServer;
5 | using Microsoft.SemanticKernel.Embeddings;
6 | using System.Text.Json;
7 | using DotNetEnv;
8 | 
9 | #pragma warning disable SKEXP0001, SKEXP0010, SKEXP0020, CS8620
10 | 
11 | Env.Load();
12 | 
13 | // Get parameters from environment
14 | string azureOpenAIEndpoint = Env.GetString("AZURE_OPENAI_ENDPOINT")!;
15 | string azureOpenAIApiKey = Env.GetString("AZURE_OPENAI_API_KEY")!;
16 | string embeddingModelDeploymentName = Env.GetString("AZURE_OPENAI_EMBEDDING_MODEL")!;
17 | string connectionString = Env.GetString("AZURE_SQL_CONNECTION_STRING")!;
18 | 
19 | // Sample Data
20 | var glossaryEntries = new List<Glossary>()
21 | {
22 |     new()
23 |     {
24 |         Key = 1,
25 |         Term = "API",
26 |         Definition = "Application Programming Interface. A set of rules and specifications that allow software components to communicate and exchange data."
27 |     },
28 |     new()
29 |     {
30 |         Key = 2,
31 |         Term = "Connectors",
32 |         Definition = "Connectors allow you to integrate with various services that provide AI capabilities, including LLM, AudioToText, TextToAudio, Embedding generation, etc."
33 |     },
34 |     new()
35 |     {
36 |         Key = 3,
37 |         Term = "RAG",
38 |         Definition = "Retrieval Augmented Generation - a term that refers to the process of retrieving additional data to provide as context to an LLM to use when generating a response (completion) to a user's question (prompt)."
39 |     }
40 | };
41 | 
42 | /*
43 |  * Set up Semantic Kernel
44 |  */
45 | Console.WriteLine("Creating Semantic Kernel services...");
46 | 
47 | // Build the kernel and configure the embedding provider
48 | var builder = Kernel.CreateBuilder();
49 | builder.AddAzureOpenAITextEmbeddingGeneration(embeddingModelDeploymentName, azureOpenAIEndpoint, azureOpenAIApiKey);
50 | var kernel = builder.Build();
51 | 
52 | // Define vector store
53 | var vectorStore = new SqlServerVectorStore(connectionString);
54 | 
55 | // Get a collection instance using vector store
56 | // IMPORTANT: Make sure to use the same data type for the key here and for the VectorStoreRecordKey property
57 | var collection = vectorStore.GetCollection<int, Glossary>("SemanticKernel_VectorStore");
58 | await collection.CreateCollectionIfNotExistsAsync();
59 | 
60 | // Get embedding service
61 | var textEmbeddingGenerationService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
62 | 
63 | /*
64 |  * Generate embeddings for each glossary item
65 |  */
66 | Console.WriteLine("\nGenerating embeddings...");
67 | 
68 | var tasks = glossaryEntries.Select(entry => Task.Run(async () =>
69 | {
70 |     entry.DefinitionEmbedding = await textEmbeddingGenerationService.GenerateEmbeddingAsync(entry.Definition);
71 | }));
72 | 
73 | await Task.WhenAll(tasks);
74 | 
75 | /*
76 |  * Upsert the data into the vector store
77 |  */
78 | Console.WriteLine("\nUpserting data into vector store...");
79 | 
80 | await foreach (var key in collection.UpsertBatchAsync(glossaryEntries))
81 | {
82 |     Console.WriteLine(key);
83 | }
84 | 
85 | /*
86 |  * Read back the inserted data
87 |  */
88 | Console.WriteLine("\nReturn the inserted data...");
89 | 
90 | var options = new GetRecordOptions() { IncludeVectors = false };
91 | 
92 | await foreach (var record in collection.GetBatchAsync(keys: [1, 2, 3], options))
93 | {
94 |     Console.WriteLine($"Key: {record.Key}");
95 |     Console.WriteLine($"Term: {record.Term}");
96 |     Console.WriteLine($"Definition: {record.Definition}");
97 | }
98 | 
99 | /*
100 |  * Run a vector search
101 |  */
102 | Console.WriteLine("\nRun vector search...");
103 | 
104 | var searchString = "I want to learn more about Connectors";
105 | 
106 | Console.WriteLine($"Search string: '{searchString}'");
107 | 
108 | var searchVector = await textEmbeddingGenerationService.GenerateEmbeddingAsync(searchString);
109 | var searchResult = await collection.VectorizedSearchAsync(searchVector);
110 | 
111 | Console.WriteLine($"Results:");
112 | 
113 | await foreach (var result in searchResult.Results)
114 | {
115 |     Console.WriteLine($"Search score: {result.Score}");
116 |     Console.WriteLine($"Key: {result.Record.Key}");
117 |     Console.WriteLine($"Term: {result.Record.Term}");
118 |     Console.WriteLine($"Definition: {result.Record.Definition}");
119 |     Console.WriteLine("=========");
120 | }
121 | 
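// The Glossary record below defines the schema the SQL Server vector store maps to:
// [VectorStoreRecordKey] marks the key column, [VectorStoreRecordData] the plain data
// columns, and [VectorStoreRecordVector] the vector(1536) column used for similarity search.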
122 | public sealed class Glossary
123 | {
124 |     [VectorStoreRecordKey]
125 |     public int Key { get; set; }
126 | 
127 |     [VectorStoreRecordData]
128 |     public string? Term { get; set; }
129 | 
130 |     [VectorStoreRecordData]
131 |     public string? Definition { get; set; }
132 | 
133 |     [VectorStoreRecordVector(Dimensions: 1536)]
134 |     public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }
135 | }
--------------------------------------------------------------------------------
/RAG-with-Documents/Readme.md:
--------------------------------------------------------------------------------
1 | # Leveraging Azure SQL DB’s Native Vector Capabilities for Enhanced Resume Matching with Azure Document Intelligence and RAG
2 | 
3 | In this tutorial, we will explore how to leverage Azure SQL DB’s new vector data type to store embeddings and perform similarity searches using built-in vector functions, enabling advanced resume matching to identify the most suitable candidates.
4 | 
5 | By extracting and chunking content from PDF resumes using Azure Document Intelligence, generating embeddings with Azure OpenAI, and storing these embeddings in Azure SQL DB, we can perform sophisticated vector similarity searches and retrieval-augmented generation (RAG) to identify the most suitable candidates based on their resumes.
6 | 
7 | ### **Tutorial Overview**
8 | 
9 | - This Python notebook will teach you to:
10 |   1. **Chunk PDF Resumes**: Use **`Azure Document Intelligence`** to extract and chunk content from PDF resumes.
11 |   2. **Create Embeddings**: Generate embeddings from the chunked content using the **`Azure OpenAI API`**.
12 |   3. **Vector Database Utilization**: Store embeddings in **`Azure SQL DB`** utilizing the **`new Vector Data Type`** and perform similarity searches using built-in vector functions to find the most suitable candidates.
13 |   4. **LLM Generation Augmentation**: Enhance language model generation with embeddings from a vector database. In this case, we use the embeddings to inform a GPT-4 chat model, enabling it to provide rich, context-aware answers about candidates based on their resumes.
14 | 
15 | ## Dataset
16 | 
17 | We use a sample dataset from [Kaggle](https://www.kaggle.com/datasets/snehaanbhawal/resume-dataset) containing PDF resumes for this tutorial. For the purposes of this tutorial we will use 120 resumes from the **Information-Technology** folder.
18 | 
19 | ## Prerequisites
20 | 
21 | - **Azure Subscription**: [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
22 | - **Azure SQL Database**: [Set up your database for free](https://learn.microsoft.com/azure/azure-sql/database/free-offer?view=azuresql)
23 | - **Azure Document Intelligence**: [Create a free Azure Document Intelligence resource](https://learn.microsoft.com/azure/ai-services/document-intelligence/create-document-intelligence-resource?view=doc-intel-4.0.0)
24 | - **Azure Data Studio**: Download [here](https://azure.microsoft.com/products/data-studio) to manage your Azure SQL database and [execute the notebook](https://learn.microsoft.com/azure-data-studio/notebooks/notebooks-python-kernel)
25 | 
26 | ## Additional Requirements for Embedding Generation
27 | 
28 | - **Azure OpenAI Access**: Apply for access in the desired Azure subscription at [https://aka.ms/oai/access](https://aka.ms/oai/access)
29 | - **Azure OpenAI Resource**: Deploy an embeddings model (e.g., `text-embedding-3-small` or `text-embedding-ada-002`) and a `GPT-4` model for chat completion. Refer to the [resource deployment guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource)
30 | - **Python**: Version 3.7.1 or later from Python.org.
(Sample has been tested with Python 3.11)
31 | - **Python Libraries**: Install the required libraries: openai, num2words, matplotlib, plotly, scipy, scikit-learn, pandas, tiktoken, and pyodbc.
32 | - **Jupyter Notebooks**: Use within [Azure Data Studio](https://learn.microsoft.com/en-us/azure-data-studio/notebooks/notebooks-guidance) or Visual Studio Code.
33 | 
34 | Code snippets are adapted from the [Azure OpenAI Service embeddings Tutorial](https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/embeddings?tabs=python-new%2Ccommand-line&pivots=programming-language-python).
35 | 
36 | ## Getting Started
37 | 
38 | 1. **Database Setup**: Execute the SQL commands from the `createtable.sql` script to create the necessary table in your database.
39 | 2. **Model Deployment**: Deploy an embeddings model (`text-embedding-3-small` or `text-embedding-ada-002`) and a `GPT-4` model for chat completion. Note the two model deployment names for later use.
40 | 
41 | ![Deployed OpenAI Models](../Assets/modeldeployment.png)
42 | 
43 | 3. **Connection String**: Find your Azure SQL DB connection string in the Azure portal under your database settings.
44 | 4. **Configuration**: Populate the `.env` file with your SQL server connection details, Azure OpenAI key and endpoint, and Azure Document Intelligence key and endpoint values.
45 | 
46 | You can retrieve the Azure OpenAI _endpoint_ and _key_ here:
47 | 
48 | ![Azure OpenAI Endpoint and Key](../Assets/endpoint.png)
49 | 
50 | You can [retrieve](https://learn.microsoft.com/azure/ai-services/document-intelligence/create-document-intelligence-resource?view=doc-intel-4.0.0#get-endpoint-url-and-keys) the Document Intelligence _endpoint_ and _key_ here:
51 | 
52 | ![Azure Document Intelligence Endpoint and Key](../Assets/docintelendpoint.png)
53 | 
54 | ## Running the Notebook
55 | 
56 | To [execute the notebook](https://learn.microsoft.com/azure-data-studio/notebooks/notebooks-python-kernel), connect to your Azure SQL database using Azure Data Studio, which can be downloaded [here](https://azure.microsoft.com/products/data-studio).
--------------------------------------------------------------------------------
/DotNet/SqlClient/ReadMe.md:
--------------------------------------------------------------------------------
1 | # .NET/C# samples
2 | 
3 | This folder contains samples that demonstrate how to use Native Vector Search in C#/.NET. The sample solution contains
4 | examples for different use cases that demonstrate how to work with vectors. The following use cases are implemented in the solution:
5 | 
6 | 1. Create and insert vectors in the SQL table
7 | 2. Create and insert embeddings in the SQL table
8 | 3. Reading vectors
9 | 4. Find Similar vectors
10 | 5. Document Classification
11 | 
12 | ## Configuration
13 | 
14 | The samples support both **OpenAI** and **Azure OpenAI** embedding services. You can configure the samples using either:
15 | 
16 | - Environment variables in the `launchSettings.json` file, or
17 | - A `.env` file in the project root
18 | 
19 | ### Using OpenAI
20 | 
21 | ~~~json
22 | {
23 |   "profiles": {
24 |     "SqlServer.NativeVectorSearch.Samples": {
25 |       "commandName": "Project",
26 |       "environmentVariables": {
27 |         "UseAzureOpenAI": "false",
28 |         "ApiKey": "sk-your-openai-api-key",
29 |         "EmbeddingModelName": "text-embedding-3-small",
30 |         "SqlConnStr": "Data Source=your-server;Initial Catalog=your-db;..."
31 |       }
32 |     }
33 |   }
34 | }
35 | ~~~
36 | 
37 | ### Using Azure OpenAI
38 | 
39 | ~~~json
40 | {
41 |   "profiles": {
42 |     "SqlServer.NativeVectorSearch.Samples": {
43 |       "commandName": "Project",
44 |       "environmentVariables": {
45 |         "UseAzureOpenAI": "true",
46 |         "AzureOpenAIEndpoint": "https://your-resource.openai.azure.com/",
47 |         "AzureOpenAIKey": "your-azure-openai-key",
48 |         "EmbeddingModelName": "text-embedding-3-small",
49 |         "SqlConnStr": "Data Source=your-server;Initial Catalog=your-db;..."
50 |       }
51 |     }
52 |   }
53 | }
54 | ~~~
55 | 
56 | ### Using .env file
57 | 
58 | Alternatively, you can create a `.env` file in the project root. See `.env.example` for a template.
59 | 
60 | ### Required Environment Variables
61 | 
62 | | Variable | Required For | Description |
63 | |----------|-------------|-------------|
64 | | `SqlConnStr` | All | SQL Server connection string |
65 | | `EmbeddingModelName` | All | Name of the embedding model (e.g., text-embedding-3-small) |
66 | | `UseAzureOpenAI` | All | Set to "true" for Azure OpenAI, "false" or omit for OpenAI (default: false) |
67 | | `ApiKey` | OpenAI | Your OpenAI API key |
68 | | `AzureOpenAIEndpoint` | Azure OpenAI | Your Azure OpenAI endpoint |
69 | | `AzureOpenAIKey` | Azure OpenAI | Your Azure OpenAI API key |
70 | 
71 | You also need to create the following table (note: the solution uses the schema 'test'):
72 | 
73 | ~~~sql
74 | CREATE TABLE test.Vectors
75 | (
76 |     [Id] INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
77 |     [Text] NVARCHAR(MAX) NULL,
78 |     [VectorShort] VECTOR(3) NULL,
79 |     [Vector] VECTOR(1536) NULL
80 | ) ON [PRIMARY];
81 | ~~~
82 | 
83 | ## 1. Create and insert vectors in the SQL table
84 | 
85 | This example (CreateAndInsertVectorsAsync) demonstrates creating vectors and inserting them into a database table. For clarity and simplicity, the example inserts two 3-dimensional vectors.
86 | 
87 | - `CreateAndInsertVectorsAsync()`
88 | 
89 | ## 2. Create and insert embeddings in the SQL table
90 | 
91 | This example (CreateAndInsertEmbeddingAsync) illustrates the process of generating an embedding vector from a string using a pre-defined embedding model and subsequently inserting the resulting vector into a database table.
92 | The sample provides a step-by-step approach, highlighting how to utilize the embedding model to transform textual input into a high-dimensional vector representation. It further details the process of storing this vector efficiently within a table, ensuring compatibility with advanced search and semantic matching capabilities.
93 | 
94 | - `CreateAndInsertEmbeddingAsync()`
95 | 
96 | ## 3. Reading vectors
97 | 
98 | This example (ReadVectorsAsync) demonstrates the process of retrieving vectors stored in a database table.
99 | It provides a step-by-step guide on how to read rows that contain vector columns. The example focuses on best practices for handling vector data, including correct casting.
100 | 
101 | - `ReadVectorsAsync()`
102 | 
103 | ## 4. Find Similar Vectors
104 | 
105 | The example (FindSimilarAsync) demonstrates how to calculate the distance between vectors and how to look up the top best-matching vectors; a sketch of the underlying query follows.
106 | 
107 | - `FindSimilarAsync()`
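
Conceptually, such a lookup boils down to a `VECTOR_DISTANCE` query against the `test.Vectors` table defined above. A minimal sketch (the sample's actual query may differ):

~~~sql
-- Sketch: the 3 stored vectors closest (by cosine distance) to the vector of row 1
DECLARE @v VECTOR(1536) = (SELECT TOP (1) [Vector] FROM test.Vectors WHERE [Id] = 1);

SELECT TOP (3)
    [Id],
    [Text],
    VECTOR_DISTANCE('cosine', [Vector], @v) AS Distance
FROM test.Vectors
ORDER BY Distance;
~~~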
108 | 
109 | ## 5. Document Classification
110 | 
111 | The example (ClassifyDocumentsAsync) presents a high-level scenario to demonstrate the implementation of document classification.
112 | It begins by generating sample documents of two distinct types: invoices and delivery documents (shipment statements).
113 | To create test documents for both types, the GenerateTestDocumentsAsync method is utilized, producing 10 simulated invoices and 10 shipment statements. Once the documents are generated, the method (ClassifyDocumentsAsync) leverages the embedding model to create the corresponding embedding vectors.
114 | These vectors, along with the associated document text, are then inserted into the database to facilitate further classification and semantic analysis.
--------------------------------------------------------------------------------
/5-Min-RAG-SQL-Accelerator/Step2-Deploy-RAG-App/RAG_structured-docs/Readme.md:
--------------------------------------------------------------------------------
1 | # RAG Streamlit App with Azure SQL DB on Structured Dataset
2 | 
3 | ## Objective
4 | This Streamlit app demonstrates how to build a Retrieval-Augmented Generation (RAG) solution for product review search and Q&A using Azure SQL Database's native vector capabilities and Azure OpenAI. The app enables users to upload a dataset (e.g., the Fine Foods Review dataset), generate embeddings, store and query them in Azure SQL DB, and perform LLM-powered Q&A for product discovery and recommendations.
5 | 
6 | ## Key Features
7 | - **CSV Upload & Processing:** Upload a CSV file (e.g., the Fine Foods Review dataset) and process it for embedding generation.
8 | - **Dataset Cleaning:** Clean and combine review text and summary for optimal embedding.
9 | - **Embedding Generation:** Generate semantic embeddings for each review using Azure OpenAI's embedding models.
10 | - **Vector Storage in Azure SQL DB:** Store embeddings in Azure SQL DB using the VECTOR data type for efficient similarity search.
11 | - **Vector Search:** Query the database for the most relevant reviews based on a user query using built-in vector distance functions.
12 | - **LLM Q&A:** Augment search results with GPT-4.1-based recommendations and summaries, grounded in the retrieved data.
13 | - **Simple UI:** Interactive, step-by-step workflow with persistent results and clear progress indicators.
14 | 
15 | ## Prerequisites
16 | - **Azure Subscription** - an Azure Free Trial subscription or an Azure for Students subscription would also work
17 | - **Python 3.8+**
18 | - **Required Python packages** (see `requirements.txt`)
19 | - **ODBC Driver 18+ for SQL Server** (for pyodbc)
20 | - **Fabric Subscription** (optional)
21 | 
22 | ## Products & Services Used
23 | - Azure SQL Database
24 | - Azure OpenAI Service
25 | - Streamlit
26 | - Python (pandas, pyodbc, etc.)
27 | - SQL Database on Fabric *(an alternative to Azure SQL Database)*
28 | 
29 | ## Automated Deployments
30 | - **ARM Template Scripts:**
31 |   - ARM templates are provided separately to automate the deployment of required resources. Please refer to [this repository](https://github.com/Kushagra-2000/ARM_SQL_OpenAI) for scripts and detailed instructions.
32 |   - Follow the RAG for Structured Data template for this particular demo.
33 | - **SQL on Fabric:**
34 |   - Create a workspace in Fabric, if one does not already exist.
35 |   - New Item > SQL Database (preview) > Provide DB name > Create
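
Before walking through the steps, one detail worth knowing: when the app sends a query embedding to Azure SQL DB as JSON text, the value must be double-cast into the `VECTOR` type before `VECTOR_DISTANCE` can use it. This is the casting issue referenced under Troubleshooting below; a sketch with illustrative values:

```sql
-- Sketch: @json holds the embedding returned by Azure OpenAI as a JSON array.
-- A toy 3-dimensional vector is used here; the app's real vectors are 1536-dimensional.
DECLARE @json NVARCHAR(MAX) = N'[0.1, 0.2, 0.3]';
DECLARE @q VECTOR(3) = CAST(CAST(@json AS NVARCHAR(MAX)) AS VECTOR(3));
SELECT @q AS QueryVector;  -- ready to be used in VECTOR_DISTANCE('cosine', Embedding, @q)
```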
36 | 
37 | ## Steps to Execute
38 | 1. **Clone the Repository**
39 | 2. **Install Requirements:**
40 |    ```
41 |    pip install -r requirements.txt
42 |    ```
43 | 3. **Deploy Azure Resources:**
44 |    - Use the provided ARM templates or the Azure Portal to deploy the SQL DB, Document Intelligence, and OpenAI resources.
45 |    - Set up SQL on Fabric if using that as the database
46 | 4. **Setup Firewall Configuration:** (skip this step if using SQL on Fabric)
47 |    - Configure the firewall settings for the SQL server separately to allow access from your client IP addresses.
48 |    - Go to the deployed SQL Server in the Azure Portal.
49 |    - Navigate to Security > Networking.
50 |    - Under the firewall rules, add your client IP and click Save.
51 | 5. **Run the Streamlit App:**
52 |    - Navigate to the cloned repository destination and then run the command below to start the app on `localhost:8501`
53 |    ```
54 |    streamlit run streamlit_app.py --server.maxUploadSize 500
55 |    ```
56 | 6. **Configure Credentials:**
57 |    - Launch the app and enter your Azure endpoints, API keys, and SQL connection string in the sidebar.
58 |    - Credentials for Document Intelligence: Document Intelligence resource > Overview > Keys and endpoint
59 |    - Credentials for OpenAI: OpenAI resource > Overview > Develop > Keys and endpoint
60 |    - Connection String for Azure SQL DB: Azure SQL DB resource > Overview > Show database connection strings > ODBC > {change the Pwd parameter to the admin password you set during deployment}
61 |    - Connection String for SQL on Fabric: SQL DB > Settings > Connection Strings > ODBC > Copy the string as-is > an authentication window will pop up > provide authentication details
62 | 7. **Talk to your docs:**
63 |    - Refer to the following datasets for use. They contain both the reviews CSV file and the embeddings CSV file.
64 |      [Datasets](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/Datasets)
65 | 
66 | 
67 | ## Troubleshooting
68 | - **Connection Errors:**
69 |   - Ensure your SQL connection string is correct and the ODBC driver is installed.
70 |   - Verify API keys and endpoints for Azure services.
71 | - **Table Creation Issues:**
72 |   - Confirm your user has permissions to create tables in the target database.
73 |   - If you are using the database for the first time or it is in a Paused state, try creating the table again once the DB is in the Running state (a paused DB takes 1-2 minutes to become ready).
74 | - **Embedding/Vector Errors:**
75 |   - Use the correct double-casting in SQL queries as shown in the app.
76 | - **Performance:**
77 |   - Large PDFs or many files may take time to process and embed. Monitor resource usage.
78 | - **Streamlit UI Issues:**
79 |   - Refresh the page or restart the app if UI elements do not update as expected.
80 | 
81 | ## Resources
82 | - [Azure SQL DB Vector Support](https://devblogs.microsoft.com/azure-sql/eap-for-vector-support-refresh-introducing-vector-type/)
83 | - [Azure OpenAI Service](https://learn.microsoft.com/azure/ai-services/openai/)
84 | - [Streamlit Documentation](https://docs.streamlit.io/)
85 | - [Project GitHub Repository](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/Retrieval-Augmented-Generation)
86 | 
87 | ---
88 | 
--------------------------------------------------------------------------------
/5-Min-RAG-SQL-Accelerator/Step2-Deploy-RAG-App/RAG-unstructured-docs/Readme.md:
--------------------------------------------------------------------------------
1 | # RAG Streamlit App with Azure SQL DB on Unstructured Documents
2 | 
3 | ## Objective
4 | This Streamlit app demonstrates how to build a Retrieval-Augmented Generation (RAG) solution for resume matching using Azure SQL Database's native vector capabilities, Azure Document Intelligence, and Azure OpenAI.
The app enables users to upload PDF resumes, extract and chunk their content, generate embeddings, store and query them in Azure SQL DB, and perform LLM-powered Q&A for candidate search and recommendations.
5 | 
6 | ## Key Features
7 | - **PDF Resume Upload & Processing:** Upload multiple PDF resumes and extract text using Azure Document Intelligence.
8 | - **Text Chunking:** Automatically split extracted text into manageable chunks (500 tokens each) for embedding.
9 | - **Embedding Generation:** Generate semantic embeddings for each chunk using Azure OpenAI's embedding models.
10 | - **Vector Storage in Azure SQL DB:** Store embeddings in Azure SQL DB using the new VECTOR data type for efficient similarity search.
11 | - **Vector Search:** Query the database for the most relevant resume chunks based on a user query using built-in vector distance functions.
12 | - **LLM Q&A:** Augment search results with GPT-4.1-based recommendations and summaries, grounded in the retrieved data.
13 | - **Simple UI:** Interactive, step-by-step workflow with persistent results and clear progress indicators.
14 | 
15 | ## Prerequisites
16 | - **Azure Subscription** - an Azure Free Trial subscription or an Azure for Students subscription would also work
17 | - **Python 3.8+**
18 | - **Required Python packages** (see `requirements.txt`)
19 | - **ODBC Driver 18+ for SQL Server** (for pyodbc)
20 | - **Fabric Subscription** (optional)
21 | 
22 | ## Products & Services Used
23 | - Azure SQL Database
24 | - Azure Document Intelligence (Form Recognizer)
25 | - Azure OpenAI Service
26 | - Streamlit
27 | - Python (pandas, tiktoken, pyodbc, etc.)
28 | - SQL Database on Fabric *(an alternative to Azure SQL Database)*
29 | 
30 | ## Automated Deployments
31 | - **ARM Template Scripts:**
32 |   - ARM templates are provided separately to automate the deployment of required resources. Please refer to [this repository](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/5-Min-RAG-SQL-Accelerator/Step1-OneClick-Deployment) for scripts and detailed instructions.
33 |   - Follow the RAG for Unstructured Docs template for this particular demo.
34 | - **SQL on Fabric:**
35 |   - Create a workspace in Fabric, if one does not already exist.
36 |   - New Item > SQL Database (preview) > Provide DB name > Create
37 | 
38 | ## Steps to Execute
39 | 1. **Clone the Repository**
40 | 2. **Install Requirements:**
41 |    ```
42 |    pip install -r requirements.txt
43 |    ```
44 | 3. **Deploy Azure Resources:**
45 |    - Use the provided ARM templates or the Azure Portal to deploy the SQL DB, Document Intelligence, and OpenAI resources.
46 |    - Set up SQL on Fabric if using that as the database
47 | 4. **Setup Firewall Configuration:** (skip this step if using SQL on Fabric)
48 |    - Configure the firewall settings for the SQL server separately to allow access from your client IP addresses.
49 |    - Go to the deployed SQL Server in the Azure Portal.
50 |    - Navigate to Security > Networking.
51 |    - Under the firewall rules, add your client IP and click Save.
52 | 5. **Run the Streamlit App:**
53 |    - Navigate to the cloned repository destination and then run the command below to start the app on `localhost:8501`
54 |    ```
55 |    streamlit run streamlit_app.py --server.maxUploadSize 500
56 |    ```
57 | 6. **Configure Credentials:**
58 |    - Launch the app and enter your Azure endpoints, API keys, and SQL connection string in the sidebar.
59 |    - Credentials for Document Intelligence: Document Intelligence resource > Overview > Keys and endpoint
60 |    - Credentials for OpenAI: OpenAI resource > Overview > Develop > Keys and endpoint
61 |    - Connection String for Azure SQL DB: Azure SQL DB resource > Overview > Show database connection strings > ODBC > {change the Pwd parameter to the admin password you set during deployment}
62 |    - Connection String for SQL on Fabric: SQL DB > Settings > Connection Strings > ODBC > Copy the string as-is > an authentication window will pop up > provide authentication details
63 | 7. **Talk to your docs:**
64 |    - Upload [resume docs](https://www.kaggle.com/datasets/snehaanbhawal/resume-dataset) or upload your own docs and query them to see RAG in action in real time
65 | 
66 | ## Troubleshooting
67 | - **Connection Errors:**
68 |   - Ensure your SQL connection string is correct and the ODBC driver is installed.
69 |   - Verify API keys and endpoints for Azure services.
70 | - **Table Creation Issues:**
71 |   - Confirm your user has permission to create tables in the target database.
72 |   - If you are using the database for the first time or it is in a Paused state, try creating the table again once the DB is in the Running state (a paused DB takes 1-2 minutes to become ready).
73 | - **Performance:**
74 |   - Large PDFs or many files may take time to process and embed. Monitor resource usage.
75 | - **Streamlit UI Issues:**
76 |   - Refresh the page or restart the app if UI elements do not update as expected.
77 | 
78 | ## Resources
79 | - [Azure SQL DB Vector Support](https://devblogs.microsoft.com/azure-sql/eap-for-vector-support-refresh-introducing-vector-type/)
80 | - [Azure Document Intelligence](https://learn.microsoft.com/azure/ai-services/document-intelligence/)
81 | - [Azure OpenAI Service](https://learn.microsoft.com/azure/ai-services/openai/)
82 | - [Streamlit Documentation](https://docs.streamlit.io/)
83 | - [Project GitHub Repository](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/RAG-with-Documents)
84 | 
85 | ---
86 | 
--------------------------------------------------------------------------------
/DiskANN/Wikipedia/005-wikipedia-fp16.sql:
--------------------------------------------------------------------------------
1 | /*
2 |     Test the new half-precision support for vectors
3 |     by converting existing single-precision vectors to half-precision vectors
4 |     and then running a test with both, to see whether there are any differences in the outcome.
5 | 
6 |     This script requires SQL Server 2025 RC1.
7 | */
8 | use WikipediaTest
9 | go
10 | 
11 | -- Enable the preview_features configuration for vector index features
12 | alter database scoped configuration
13 | set preview_features = on;
14 | go
15 | select * from sys.database_scoped_configurations where [name] = 'preview_features'
16 | go
17 | 
18 | -- Add a half-precision vector column
19 | alter table [dbo].[wikipedia_articles_embeddings]
20 | add content_vector_fp16 vector(1536, float16)
21 | go
22 | 
23 | -- View the metadata
24 | select
25 |     [name] AS column_name,
26 |     system_type_id,
27 |     user_type_id,
28 |     vector_dimensions,
29 |     vector_base_type,
30 |     vector_base_type_desc
31 | from
32 |     sys.columns
33 | where
34 |     object_id = object_id('[dbo].[wikipedia_articles_embeddings]')
35 | go
36 | 
37 | -- Remove existing vector indexes
38 | select * from sys.vector_indexes
39 | go
40 | drop index if exists vec_idx on [dbo].[wikipedia_articles_embeddings]
41 | drop index if exists vec_idx2 on [dbo].[wikipedia_articles_embeddings]
42 | go
43 | select
44 | go
45 | 
46 | -- Copy the existing single-precision embeddings to the half-precision vector column
47 | update [dbo].[wikipedia_articles_embeddings]
48 | --set content_vector_fp16 = cast(content_vector as vector(1536, float16)) -- Not working at the moment
49 | set content_vector_fp16 = cast(cast(content_vector as json) as vector(1536, float16))
50 | go
51 | 
52 | -- Compare storage space for single-precision (fp32) vs half-precision (fp16) floating-point vectors
53 | select
54 | id, title,
55 | DATALENGTH(content_vector) as fp32_bytes,
56 | DATALENGTH(content_vector_fp16) as fp16_bytes
57 | from
58 | [dbo].[wikipedia_articles_embeddings] where title like 'Philosoph%'
59 | go
60 | 
61 | -- Generate query embeddings
62 | drop table if exists #t;
63 | create table #t (id int, q nvarchar(max), v32 vector(1536, float32), v16 vector(1536, float16))
64 | 
65 | insert into #t (id, q, v32)
66 | select
67 | id, q, ai_generate_embeddings(q use model Ada2Embeddings)
68 | from
69 | (values
70 | (1, N'four legged furry animal'),
71 | (2, N'pink floyd music style')
72 | ) S(id, q)
73 | go
74 | update #t set v16 = cast(cast(v32 as json) as vector(1536, float16));
75 | select * from #t
76 | go
77 | 
78 | -- Create vector index on single-precision vectors
79 | -- Should take ~30 seconds on a 16 vCore server
80 | create vector index vec_idx32 on [dbo].[wikipedia_articles_embeddings]([content_vector])
81 | with (metric = 'cosine', type = 'diskann');
82 | go
83 | 
84 | -- Create vector index on half-precision vectors
85 | -- Should take ~22 seconds on a 16 vCore server
86 | create vector index vec_idx16 on [dbo].[wikipedia_articles_embeddings]([content_vector_fp16])
87 | with (metric = 'cosine', type = 'diskann');
88 | go
89 | 
90 | select * from sys.vector_indexes
91 | go
92 | 
93 | set statistics time on
94 | set statistics io on
95 | go
96 | 
97 | /*
98 | RUN KNN (Exact) VECTOR SEARCH
99 | */
100 | declare @qv vector(1536, float16) = (select top(1) v16 from #t where id=2);
101 | select top (50) id, vector_distance('cosine', @qv, [content_vector_fp16]) as distance, title
102 | from [dbo].[wikipedia_articles_embeddings]
103 | order by distance;
104 | go
105 | 
106 | /*
107 | RUN ANN (Approximate) VECTOR SEARCH
108 | */
109 | declare @qv vector(1536, float16) = (select top(1) v16 from #t where id = 2);
110 | select
111 | t.id, s.distance, t.title
112 | from
113 | vector_search(
114 | table = [dbo].[wikipedia_articles_embeddings] as t,
115 | column = [content_vector_fp16],
116 | similar_to = @qv,
117 | metric = 'cosine',
118 | top_n = 50
119 | ) as s
120 | order by s.distance, title
121 | ;
122 | go
123 | 
124 | /*
125 | Calculate Recall and compare fp16 vs fp32
126 | */
127 | declare @n int = 100;
128 | declare @qv32 vector(1536, float32), @qv16 vector(1536, float16);
129 | select top(1) @qv32 = v32, @qv16 = v16 from #t where id = 1;
130 | with cteANN32 as
131 | (
132 | select top (@n)
133 | t.id, s.distance, t.title
134 | from
135 | vector_search(
136 | table = [dbo].[wikipedia_articles_embeddings] as t,
137 | column = [content_vector],
138 | similar_to = @qv32,
139 | metric = 'cosine',
140 | top_n = @n
141 | ) as s
142 | order by s.distance, id
143 | ),
144 | cteANN16 as
145 | (
146 | select top (@n)
147 | t.id, s.distance, t.title
148 | from
149 | vector_search(
150 | table = [dbo].[wikipedia_articles_embeddings] as t,
151 | column = [content_vector_fp16],
152 | similar_to = @qv16,
153 | metric = 'cosine',
154 | top_n = @n
155 | ) as s
156 | order by s.distance, id
157 | ),
158 | 
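-- (added note) cteKNN32 below computes the exact (brute-force) top-@n result, used as ground truth:
-- the running recall at each rank is the fraction of the exact neighbors that the fp32 and fp16
-- ANN result sets (cteANN32 and cteANN16) have recovered so far.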
cteKNN32 as
159 | (
160 | select top (@n) id, vector_distance('cosine', @qv32, [content_vector]) as distance, title
161 | from [dbo].[wikipedia_articles_embeddings]
162 | order by distance, id
163 | )
164 | select
165 | k32.id as id_knn,
166 | a32.id as id_ann_fp32,
167 | a16.id as id_ann_fp16,
168 | k32.distance as distance_knn,
169 | a32.distance as distance_ann_fp32,
170 | a16.distance as distance_ann_fp16,
171 | running_recall_fp32 = cast(cast(count(a32.id) over (order by k32.distance) as float)
172 | / cast(count(k32.id) over (order by k32.distance) as float) as decimal(6,3)),
173 | running_recall_fp16 = cast(cast(count(a16.id) over (order by k32.distance) as float)
174 | / cast(count(k32.id) over (order by k32.distance) as float) as decimal(6,3))
175 | from
176 | cteKNN32 k32
177 | left outer join
178 | cteANN32 a32 on k32.id = a32.id
179 | left outer join
180 | cteANN16 a16 on k32.id = a16.id
181 | order by
182 | k32.distance
183 | go
184 | 
185 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Native Vector Support in Azure SQL and SQL Server
2 | 
3 | This repo hosts samples that help you use the new [**Native Vector Support in Azure SQL DB**](https://devblogs.microsoft.com/azure-sql/announcing-general-availability-of-native-vector-type-functions-in-azure-sql/) feature. We illustrate key technical concepts and demonstrate how you can store and query embeddings in Azure SQL to enhance your applications with AI capabilities.
4 | 
5 | ## Prerequisites
6 | 
7 | To use the provided samples, make sure you have the following prerequisites:
8 | 
9 | 1. An Azure subscription - [Create one for free](https://azure.microsoft.com/pricing/purchase-options/azure-account)
10 | 
11 | 1. Azure SQL Database - [Create one for free](https://learn.microsoft.com/azure/azure-sql/database/free-offer?view=azuresql) or [SQL Server 2025 RC1](https://www.microsoft.com/en-us/evalcenter/sql-server-2025-downloads) if you want to test DiskANN.
12 | 
13 | 1. Make sure you have an [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/overview) resource created in your Azure subscription.
14 | 
15 | 1. Visual Studio Code with MSSQL Extension
16 | - Download it for [Free](https://code.visualstudio.com/) and then install the [MSSQL](https://marketplace.visualstudio.com/items?itemName=ms-mssql.mssql) extension, or
17 | - [SQL Server Management Studio](https://learn.microsoft.com/sql/ssms/download-sql-server-management-studio-ssms).
18 | 
19 | 1. If you are going to clone this repository on your machine, make sure you have installed the `git-lfs` extension: [Git Large File Storage](https://git-lfs.com/)
20 | 
21 | 1. Testing DiskANN currently requires SQL Server 2025. See the announcement here: [Announcing Public Preview of DiskANN in SQL Server 2025](https://techcommunity.microsoft.com/blog/sqlserver/announcing-public-preview-of-diskann-in-sql-server-2025/4414683).
22 | 
23 | ## Samples
24 | 
25 | ### Getting Started
26 | 
27 | A simple getting-started notebook to get familiar with common vector functions is available here: [Getting-Started](./Getting-Started/getting-started.ipynb)
28 | 
29 | ### Embeddings
30 | 
31 | Learn how to get embeddings from OpenAI directly from Azure SQL using the sample available in the [Embeddings/T-SQL](./Embeddings/T-SQL) folder.
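For a quick taste of the pattern before opening the scripts, here is a minimal sketch using the `ai_generate_embeddings` function that the DiskANN scripts in this repo rely on; it assumes you have already registered an external model (named `Ada2Embeddings` here) pointing at your Azure OpenAI embeddings deployment:

```
-- Minimal sketch: assumes an external model named Ada2Embeddings has already
-- been created against your Azure OpenAI embeddings deployment
declare @v vector(1536);
select @v = ai_generate_embeddings(N'a sample sentence to embed' use model Ada2Embeddings);
select @v as embedding;
```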
32 | 
33 | ### Exact Vector Search
34 | 
35 | The [Vector-Search](./Vector-Search) example illustrates the implementation of Vector Similarity Search within an SQL database, highlighting the capabilities of semantic search. By leveraging vector representations of text, the system can identify reviews that are contextually similar to a given search query, going beyond the limitations of exact keyword matches. Additionally, it demonstrates the integration of Keyword Search to guarantee the inclusion of specific terms within the search results.
36 | 
37 | #### Hybrid Search
38 | 
39 | The Python sample in the [Hybrid-Search](./Hybrid-Search/) folder shows how to combine full-text search in Azure SQL Database with BM25 ranking and cosine similarity ranking to do hybrid search.
40 | 
41 | ### Retrieval Augmented Generation
42 | 
43 | The RAG pattern is a powerful way to generate text using a pre-trained language model and a retrieval mechanism. The [Retrieval Augmented Generation](./Retrieval-Augmented-Generation) folder contains a sample that demonstrates how to use the RAG pattern with Azure SQL and Azure OpenAI, using Python notebooks.
44 | 
45 | ### Approximate Vector Search
46 | 
47 | The [DiskANN](./DiskANN/) folder contains a sample that demonstrates how to use the new `VECTOR_SEARCH` function with DiskANN. The sample uses a subset of Wikipedia data to create a table with a vector column, insert data, and perform approximate nearest neighbor search using the `VECTOR_SEARCH` function (a minimal sketch of the query shape appears at the end of this section).
48 | 
49 | This sample currently requires SQL Server 2025. See the announcement here: [Announcing Public Preview of DiskANN in SQL Server 2025](https://techcommunity.microsoft.com/blog/sqlserver/announcing-public-preview-of-diskann-in-sql-server-2025/4414683).
50 | 
51 | #### DiskANN and Hybrid Search
52 | 
53 | Using DiskANN together with FullText enables you to do hybrid search. The [DiskANN](./DiskANN/) folder contains the file `004-wikipedia-hybrid-search.sql`, which demonstrates how to use the new `VECTOR_SEARCH` function along with `FREETEXTTABLE` to implement hybrid search with Reciprocal Rank Fusion (RRF) and BM25 ranking.
54 | 
55 | ### SQL Client
56 | 
57 | If you are using SQL Client directly in your applications, you can use the [SqlClient](./DotNet) folder to see how to use Native Vector Search in C#/.NET.
58 | 
59 | ### Entity Framework Core
60 | 
61 | If you are using .NET EF Core, you can use the [EF-Core](./DotNet) sample to see how to use the new vector functions in your application.
62 | 
63 | ### Dapper
64 | 
65 | If you are using the Dapper micro-ORM, you can use the [Dapper](./DotNet) sample to see how to use the new vector functions in your application.
66 | 
67 | ### Semantic Kernel
68 | 
69 | [Semantic Kernel](https://github.com/microsoft/semantic-kernel) is an SDK that simplifies the creation of enterprise AI-enabled applications. Details on support for SQL Server and Azure SQL as vector stores are available in the [SemanticKernel](./SemanticKernel) folder.
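As promised above, here is the overall shape of the approximate (ANN) query that the DiskANN samples build up to. The table, column, and model names are the ones used by the Wikipedia sample, and the external model is assumed to already exist:

```
-- ANN query shape from the Wikipedia DiskANN sample; assumes a DiskANN index
-- on [content_vector] and an external model named Ada2Embeddings
declare @qv vector(1536);
select @qv = ai_generate_embeddings(N'four legged furry animal' use model Ada2Embeddings);
select t.id, s.distance, t.title
from vector_search(
    table = [dbo].[wikipedia_articles_embeddings] as t,
    column = [content_vector],
    similar_to = @qv,
    metric = 'cosine',
    top_n = 10
) as s
order by s.distance;
```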
70 | 
71 | ## Resources
72 | 
73 | - [Create and deploy an Azure OpenAI Service resource](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource?pivots=web-portal)
74 | - [Embeddings models](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#embeddings-models)
75 | - [SQL AI Samples and Examples](https://aka.ms/sqlaisamples)
76 | - [Frequently asked questions about Copilot in Azure SQL Database (preview)](https://learn.microsoft.com/azure/azure-sql/copilot/copilot-azure-sql-faq?view=azuresql)
77 | - [Responsible AI FAQ for Microsoft Copilot for Azure (preview)](https://learn.microsoft.com/azure/copilot/responsible-ai-faq)
78 | 
79 | ## Trademarks
80 | 
81 | This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/legal/intellectualproperty/trademarks/usage/general). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
82 | -------------------------------------------------------------------------------- /DiskANN/Wikipedia/002-wikipedia-diskann-test.sql: -------------------------------------------------------------------------------- 1 | /*
2 | This script requires SQL Server 2025 RC0
3 | */
4 | 
5 | use WikipediaTest
6 | go
7 | 
8 | set statistics time off
9 | go
10 | 
11 | select db_id(), @@spid
12 | go
13 | 
14 | -- Enable preview_features configuration for vector index features
15 | alter database scoped configuration
16 | set preview_features = on;
17 | go
18 | select * from sys.database_scoped_configurations where [name] = 'preview_features'
19 | go
20 | 
21 | --- Create Indexes
22 | --- (with 16 vCores, creation time is expected to be 30 seconds for each index)
23 | --- Monitor index creation progress using:
24 | --- select session_id, status, command, percent_complete from sys.dm_exec_requests where session_id = <spid returned above>
25 | create vector index vec_idx on [dbo].[wikipedia_articles_embeddings]([title_vector])
26 | with (metric = 'cosine', type = 'diskann');
27 | go
28 | 
29 | create vector index vec_idx2 on [dbo].[wikipedia_articles_embeddings]([content_vector])
30 | with (metric = 'cosine', type = 'diskann');
31 | go
32 | 
33 | -- View created vector indexes
34 | select * from sys.vector_indexes
35 | go
36 | 
37 | -- Enable time statistics
38 | set statistics time on
39 | go
40 | 
41 | /*
42 | Option 1: LOAD A PRE-GENERATED EMBEDDING
43 | 
44 | The following code loads a pre-generated embedding for the text
45 | "The foundation series by Isaac Asimov" using the "ada2-text-embedding" model.
46 | Uncomment the following code if you don't have access to an OpenAI model;
47 | otherwise it is recommended to use the new "ai_generate_embeddings" function
48 | via the code in the "Option 3" section below
49 | */
50 | /*
51 | declare @j json = (select BulkColumn from
52 | openrowset(bulk 'C:\samples\rc1\datasets\reference-embedding.json', single_clob) as j)
53 | declare @qv vector(1536) = json_query(@j, '$."embedding-vector"')
54 | drop table if exists #t;
55 | create table #t (v vector(1536));
56 | insert into #t values (@qv);
57 | select * from #t;
58 | go
59 | */
60 | 
61 | /*
62 | Option 2: GET THE PRE-CALCULATED EMBEDDING VIA REST CALL
63 | 
64 | The following code loads a pre-generated embedding for the text
65 | "The foundation series by
Isaac Asimov" using the "ada2-text-embedding" model.
66 | Uncomment the following code if you don't have access to an OpenAI model;
67 | otherwise it is recommended to use the new "ai_generate_embeddings" function
68 | via the code in the "Option 3" section below*/
69 | /*
70 | -- Enable external rest endpoint used by sp_invoke_external_rest_endpoint procedure
71 | exec sp_configure 'external rest endpoint enabled', 1
72 | reconfigure
73 | go
74 | 
75 | declare @response nvarchar(max)
76 | exec sp_invoke_external_rest_endpoint
77 | @url = 'https://raw.githubusercontent.com/Azure-Samples/azure-sql-db-vector-search/refs/heads/main/DiskANN/Wikipedia/reference-embedding.json',
78 | @method = 'GET',
79 | @response = @response output
80 | 
81 | declare @qv vector(1536) = json_query(@response, '$.result."embedding-vector"')
82 | drop table if exists #t;
83 | create table #t (v vector(1536));
84 | insert into #t values (@qv);
85 | select * from #t;
86 | */
87 | 
88 | /*
89 | Option 3: GENERATE EMBEDDINGS USING OPENAI
90 | 
91 | The following code uses the new ai_generate_embeddings function to generate
92 | embeddings for the requested text. Make sure to have an OpenAI "ada2-text-embedding"
93 | model deployed in OpenAI or Azure OpenAI.
94 | */
95 | 
96 | -- Create database credentials to store API key
97 | if not exists(select * from sys.symmetric_keys where [name] = '##MS_DatabaseMasterKey##')
98 | begin
99 | create master key encryption by password = 'Pa$$_w0rd!ThatIS_L0Ng'
100 | end
101 | go
102 | if exists(select * from sys.[database_scoped_credentials] where name = 'https://xyz.openai.azure.com/') -- use your Azure OpenAI endpoint
103 | begin
104 | drop database scoped credential [https://xyz.openai.azure.com/];
105 | end
106 | create database scoped credential [https://xyz.openai.azure.com/]
107 | with identity = 'HTTPEndpointHeaders', secret = '{"api-key": ""}'; -- Add your Azure OpenAI Key
108 | go
109 | select * from sys.[database_scoped_credentials]
110 | go
111 | 
112 | -- Create reference to OpenAI model
113 | --drop external model Ada2Embeddings
114 | --go
115 | create external model Ada2Embeddings
116 | with (
117 | location = 'https://xyz.openai.azure.com/openai/deployments/<deployment-name>/embeddings?api-version=2024-08-01-preview', -- add your deployment name
118 | credential = [https://xyz.openai.azure.com/],
119 | api_format = 'Azure OpenAI',
120 | model_type = embeddings,
121 | model = 'embeddings'
122 | );
123 | go
124 | select * from sys.external_models
125 | go
126 | 
127 | -- Enable external rest endpoint used by ai_generate_embeddings function
128 | exec sp_configure 'external rest endpoint enabled', 1
129 | reconfigure
130 | go
131 | 
132 | -- Generate embeddings and save them for future use
133 | declare @qv vector(1536)
134 | drop table if exists #t;
135 | create table #t (v vector(1536))
136 | insert into #t
137 | select ai_generate_embeddings(N'The foundation series by Isaac Asimov' use model Ada2Embeddings);
138 | select * from #t;
139 | go
140 | 
141 | /*
142 | RUN ANN (Approximate) VECTOR SEARCH
143 | */
144 | declare @qv vector(1536) = (select top(1) v from #t);
145 | select
146 | t.id, s.distance, t.title
147 | from
148 | vector_search(
149 | table = [dbo].[wikipedia_articles_embeddings] as t,
150 | column = [content_vector],
151 | similar_to = @qv,
152 | metric = 'cosine',
153 | top_n = 50
154 | ) as s
155 | order by s.distance, title;
156 | go
157 | 
158 | /*
159 | RUN KNN (Exact) VECTOR SEARCH
160 | */
161 | declare @qv vector(1536) = (select top(1) v from #t);
162 | select top (50) id, vector_distance('cosine', @qv,
[content_vector]) as distance, title 163 | from [dbo].[wikipedia_articles_embeddings] 164 | order by distance; 165 | go 166 | 167 | /* 168 | Calculate Recall 169 | */ 170 | declare @n int = 100; 171 | declare @qv vector(1536) = (select top(1) v from #t); 172 | with cteANN as 173 | ( 174 | select top (@n) 175 | t.id, s.distance, t.title 176 | from 177 | vector_search( 178 | table = [dbo].[wikipedia_articles_embeddings] as t, 179 | column = [content_vector], 180 | similar_to = @qv, 181 | metric = 'cosine', 182 | top_n = @n 183 | ) as s 184 | order by s.distance, id 185 | ), 186 | cteKNN as 187 | ( 188 | select top (@n) id, vector_distance('cosine', @qv, [content_vector]) as distance, title 189 | from [dbo].[wikipedia_articles_embeddings] 190 | order by distance, id 191 | ) 192 | select 193 | k.id as id_knn, 194 | a.id as id_ann, 195 | k.title, 196 | k.distance as distance_knn, 197 | a.distance as distance_ann, 198 | running_recall = cast(cast(count(a.id) over (order by k.distance) as float) 199 | / cast(count(k.id) over (order by k.distance) as float) as decimal(6,3)) 200 | from 201 | cteKNN k 202 | left outer join 203 | cteANN a on k.id = a.id 204 | order by 205 | k.distance -------------------------------------------------------------------------------- /5-Min-RAG-SQL-Accelerator/Step1-OneClick-Deployment/RAG_deployment.json: -------------------------------------------------------------------------------- 1 | { 2 | "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#", 3 | "contentVersion": "1.0.0.0", 4 | "parameters": { 5 | "serverName": { 6 | "defaultValue": "Sample-SQL-server-", 7 | "type": "String", 8 | "metadata": { 9 | "description": "The name of the SQL server that will host the SQL Database." 10 | } 11 | }, 12 | "sqlDBName": { 13 | "defaultValue": "Sample-SQL-Database-", 14 | "type": "String", 15 | "metadata": { 16 | "description": "The name of the SQL Database." 17 | } 18 | }, 19 | "location": { 20 | "defaultValue": "eastus2", 21 | "allowedValues": [ 22 | "eastus", 23 | "eastus2", 24 | "westus", 25 | "centralus", 26 | "northcentralus", 27 | "southcentralus", 28 | "westus2", 29 | "westus3", 30 | "australiaeast", 31 | "australiasoutheast", 32 | "brazilsouth", 33 | "canadacentral", 34 | "canadaeast", 35 | "centralindia", 36 | "eastasia", 37 | "japaneast", 38 | "japanwest", 39 | "koreacentral", 40 | "koreasouth", 41 | "northeurope", 42 | "southafricanorth", 43 | "southindia", 44 | "southeastasia", 45 | "uksouth", 46 | "ukwest", 47 | "westeurope" 48 | ], 49 | "type": "String", 50 | "metadata": { 51 | "description": "Location for SQL resources. Recommended to keep location same as default value to ensure compatibility and lower latency." 52 | } 53 | }, 54 | "administratorLogin": { 55 | "type": "String", 56 | "metadata": { 57 | "description": "Username for the SQL Server admin account. You'll use this to log in and manage the database" 58 | } 59 | }, 60 | "administratorLoginPassword": { 61 | "type": "SecureString", 62 | "metadata": { 63 | "description": "Secure password for the SQL Server admin account." 64 | } 65 | }, 66 | "OpenAI_account_name": { 67 | "defaultValue": "Sample-OpenAI-", 68 | "type": "String", 69 | "metadata": { 70 | "description": "Name of the Azure OpenAI resource used for deploying language models like GPT." 71 | } 72 | }, 73 | "OpenAI_account_location": { 74 | "defaultValue": "eastus2", 75 | "type": "String", 76 | "metadata": { 77 | "description": "Azure region where the OpenAI resource will be deployed. 
Keeping the default value ensures model availability."
78 | }
79 | },
80 | "OpenAI_chat_completion_model": {
81 | "defaultValue": "gpt-4.1",
82 | "type": "String",
83 | "metadata": {
84 | "description": "Name of the Azure OpenAI chat model to be deployed for generating conversational responses. Recommended to keep the same as the default value."
85 | }
86 | },
87 | "embedding_model": {
88 | "defaultValue": "text-embedding-ada-002",
89 | "type": "String",
90 | "metadata": {
91 | "description": "Name of the Azure OpenAI embedding model to be deployed for generating embeddings. Recommended to keep the same as the default value."
92 | }
93 | }
94 | },
95 | "resources": [
96 | {
97 | "type": "Microsoft.CognitiveServices/accounts",
98 | "apiVersion": "2024-10-01",
99 | "name": "[parameters('OpenAI_account_name')]",
100 | "location": "[parameters('OpenAI_account_location')]",
101 | "sku": {
102 | "name": "S0"
103 | },
104 | "kind": "OpenAI",
105 | "properties": {
106 | "customSubDomainName": "[parameters('OpenAI_account_name')]",
107 | "networkAcls": {
108 | "defaultAction": "Allow",
109 | "virtualNetworkRules": [],
110 | "ipRules": []
111 | },
112 | "publicNetworkAccess": "Enabled"
113 | }
114 | },
115 | {
116 | "type": "Microsoft.CognitiveServices/accounts/deployments",
117 | "apiVersion": "2024-10-01",
118 | "name": "[concat(parameters('OpenAI_account_name'), '/', parameters('OpenAI_chat_completion_model'))]",
119 | "dependsOn": [
120 | "[resourceId('Microsoft.CognitiveServices/accounts', parameters('OpenAI_account_name'))]"
121 | ],
122 | "sku": {
123 | "name": "GlobalStandard",
124 | "capacity": 50
125 | },
126 | "properties": {
127 | "model": {
128 | "format": "OpenAI",
129 | "name": "gpt-4.1",
130 | "version": "2025-04-14"
131 | },
132 | "versionUpgradeOption": "OnceNewDefaultVersionAvailable",
133 | "currentCapacity": 50,
134 | "raiPolicyName": "Microsoft.DefaultV2"
135 | }
136 | },
137 | {
138 | "type": "Microsoft.CognitiveServices/accounts/deployments",
139 | "apiVersion": "2024-10-01",
140 | "name": "[concat(parameters('OpenAI_account_name'), '/', parameters('embedding_model'))]",
141 | "dependsOn": [
142 | "[resourceId('Microsoft.CognitiveServices/accounts', parameters('OpenAI_account_name'))]"
143 | ],
144 | "sku": {
145 | "name": "GlobalStandard",
146 | "capacity": 150
147 | },
148 | "properties": {
149 | "model": {
150 | "format": "OpenAI",
151 | "name": "text-embedding-3-small",
152 | "version": "1"
153 | },
154 | "versionUpgradeOption": "NoAutoUpgrade",
155 | "currentCapacity": 150,
156 | "raiPolicyName": "Microsoft.DefaultV2"
157 | }
158 | },
159 | {
160 | "type": "Microsoft.Sql/servers",
161 | "apiVersion": "2022-05-01-preview",
162 | "name": "[parameters('serverName')]",
163 | "location": "[parameters('location')]",
164 | "properties": {
165 | "administratorLogin": "[parameters('administratorLogin')]",
166 | "administratorLoginPassword": "[parameters('administratorLoginPassword')]",
167 | "publicNetworkAccess": "Enabled"
168 | }
169 | },
170 | {
171 | "type": "Microsoft.Sql/servers/databases",
172 | "apiVersion": "2022-05-01-preview",
173 | "name": "[format('{0}/{1}', parameters('serverName'), parameters('sqlDBName'))]",
174 | "location": "[parameters('location')]",
175 | "dependsOn": [
176 | "[resourceId('Microsoft.Sql/servers', parameters('serverName'))]"
177 | ],
178 | "sku": {
179 | "name": "GP_S_Gen5",
180 | "tier": "GeneralPurpose",
181 | "family": "Gen5",
182 | "capacity": 2
183 | },
184 | "kind": "v12.0,user,vcore,serverless,freelimit",
185 | "properties": {
186 | "useFreeLimit": "true",
187 | 
"freeLimitExhaustionBehavior": "AutoPause" 188 | } 189 | } 190 | ] 191 | } 192 | -------------------------------------------------------------------------------- /5-Min-RAG-SQL-Accelerator/Step2-Deploy-RAG-App/README.md: -------------------------------------------------------------------------------- 1 | # SQL-Vector-Search Solution Accelerator 2 | 3 | ## Objective 4 | 5 | This repository showcases two Streamlit applications that demonstrate how to build Retrieval-Augmented Generation (RAG) solutions using Azure SQL Database's native vector capabilities and Azure OpenAI. The apps are designed for: 6 | - **Product Review Search**: Upload and semantically search product reviews using CSV data. 7 | - **Resume Matching**: Upload and semantically match resumes using PDF documents. 8 | 9 | Both apps allow users to generate embeddings, store/query them in Azure SQL DB, and perform LLM-powered Q&A for intelligent retrieval and recommendations. 10 | These demo follow the Jupyter notebook provided in the [azure-sql-db-vector-search repo](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main) under the Azure-Samples GitHub repo. 11 | 12 | --- 13 | 14 | ## Products & Services Used 15 | 16 | - Azure SQL Database (VECTOR data type) 17 | - Azure OpenAI Service 18 | - Azure Document Intelligence (for resume parsing) 19 | - SQL on Fabric *(alternative to Azure SQL DB)* 20 | - Streamlit 21 | - Python (pandas, pyodbc, tiktoken, etc.) 22 | 23 | --- 24 | 25 | ## Resources Deployment 26 | 27 | ### One Click Deployment 28 | Automated deployment scripts are available in a separate [GitHub repository](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/5-Min-RAG-SQL-Accelerator/Step1-OneClick-Deployment) that will help to deploy required component. 29 | 30 | This includes ARM templates to provision: 31 | - Azure SQL DB 32 | - Azure OpenAI 33 | - Azure Document Intelligence (for resume app) 34 | 35 | ### Steps to Deploy SQL DB on Fabric 36 | - Create a workspace in Fabric, if not existed before. 37 | - New Item > SQL Database (preview) > Provide DB name > Create 38 | 39 | --- 40 | 41 | ## Run the Application 42 | 43 | 1. **Clone the Repository** 44 | 2. **Navigate** to corresponding repository depending upon structured or unstructured dataset. 45 | 3. **Install Requirements:** 46 | ``` 47 | pip install -r requirements.txt 48 | ``` 49 | 4. **Deploy Azure Resources:** 50 | - Use the provided ARM templates or Azure Portal to deploy SQL DB, Document Intelligence, and OpenAI resources. 51 | - Setup SQL on Fabric if using that as the database 52 | 5. **Setup Firewall Configuration:** (Skip this step if using SQL on Fabric) 53 | - Configure the firewall settings for the SQL server separately to allow access from your client IP addresses. 54 | - Go to the deployed SQL Server in the Azure Portal. 55 | - Navigate to Security > Networking > Virtual networks. 56 | - Add your client IP and click Save. 57 | 6. **Run the Streamlit App:** 58 | - Navigate to the cloned repository destination and then run the below command to start the app on `localhost:8501` 59 | ``` 60 | streamlit run --server.maxUploadSize 500 61 | ``` 62 | 7. **Configure Credentials:** 63 | - Launch the app and enter your Azure endpoints, API keys, and SQL connection string in the sidebar. 
64 | - Credentials for Document Intelligence: Document Intelligence resource > Overview > Keys and endpoint
65 | - Credentials for OpenAI: OpenAI resource > Overview > Develop > Keys and endpoint
66 | - Connection String for Azure SQL DB: Azure SQL DB resource > Overview > Show database connection strings > ODBC > {Replace the Pwd parameter with the admin password you set during deployment}
67 | - Connection String for SQL on Fabric: SQL DB > Settings > Connection Strings > ODBC > Copy the string as is > An authentication window will pop up > Provide authentication details
68 | 
69 | ---
70 | 
71 | ## Deploy the App on Azure App Services
72 | 
73 | To deploy a Streamlit application on Azure App Service, follow these steps:
74 | 1. Create an Azure App Service with B1 SKU or higher, as the free version does not support Streamlit.
75 | 2. Choose Python v3.10 or above for Streamlit in the App Service.
76 | 3. Choose Linux as the operating system for the App Service.
77 | 4. Make sure your code folder has a `requirements.txt` file with all the dependencies.
78 | 5. Add two files, `streamlit.sh` and `.deployment`, to the root directory of your project.
79 | - streamlit.sh
80 | ```
81 | pip install -r requirements.txt
82 | python -m streamlit run app.py --server.port 8000 --server.address 0.0.0.0
83 | ```
84 | - .deployment
85 | ```
86 | [config]
87 | SCM_DO_BUILD_DURING_DEPLOYMENT=false
88 | ```
89 | - Replace `app.py` with your application name.
90 | - Use port 8000 because Azure App Service exposes only ports 8000 and 443 by default.
91 | 6. Open Visual Studio Code and install the Azure Extension Pack.
92 | 7. Log in to Visual Studio Code with your Azure account.
93 | 8. Use the `Azure App Service: Deploy to Web App` command in Visual Studio Code and select your App Service name.
94 | 9. Wait for the deployment to finish.
95 | 10. Go to the Azure portal and update the `Startup Command` configuration for the App Service, setting the value to `bash /home/site/wwwroot/streamlit.sh`.
96 | - You can find this configuration inside `App Service > Settings > Configurations > General settings`.
97 | 11. Wait a few seconds, then visit the application URL. Congratulations! You have successfully deployed your Streamlit application to Azure App Service.
98 | 
99 | Refer to the following resources for further clarification:
100 | - [Learn Microsoft | Answers](https://learn.microsoft.com/en-us/answers/questions/1470782/how-to-deploy-a-streamlit-application-on-azure-app)
101 | - [Tech Community Microsoft](https://techcommunity.microsoft.com/blog/appsonazureblog/deploy-streamlit-on-azure-web-app/4276108)
102 | 
103 | 
104 | ---
105 | 
106 | ## What Next?
107 | 
108 | Once you've successfully deployed and explored the Streamlit applications, here are some next steps to deepen your understanding and expand your solution:
109 | 
110 | - **Explore the Jupyter Notebook Project**: Visit the companion GitHub repository that includes Jupyter notebooks to:
111 | - Understand the underlying code and logic behind embedding generation, vector storage, and querying.
112 | - Experiment with different datasets and prompts.
113 | - Learn how to customize the pipeline for your own use case.
114 | - Check out the implementations using Semantic Kernel or LangChain.
115 | 
116 | - **Dive into Azure Portal**:
117 | - Monitor and manage your deployed resources.
118 | - Explore Azure SQL DB’s vector capabilities and performance tuning.
119 | - Review usage metrics and cost estimates.
120 | 
121 | - **Customize the Apps**: Modify the Streamlit apps to:
122 | - Add new data sources (e.g., JSON, web scraping).
123 | - Enhance the UI/UX for specific business scenarios.
124 | 
125 | - **Build Your Own Use Case**:
126 | - Use Azure SQL DB or SQL on Fabric as your vector store.
127 | - Design a RAG solution tailored to your domain, whether legal, healthcare, customer support, or internal knowledge bases.
128 | - Share your solution with the community.
129 | 
130 | ---
131 | 
132 | ## Troubleshooting
133 | - **Connection Errors:**
134 | - Ensure your SQL connection string is correct and the ODBC driver is installed.
135 | - Verify API keys and endpoints for Azure services.
136 | - **Table Creation Issues:**
137 | - Confirm your user has permissions to create tables in the target database.
138 | - If you are using the database for the first time or it is in a Paused state, try creating the table again once the database is in the Running state; a paused database takes 1-2 minutes to become ready.
139 | - **Embedding/Vector Errors:**
140 | - Use the correct double-casting in SQL queries as shown in the app.
141 | - **Performance:**
142 | - Large PDFs or many files may take time to process and embed. Monitor resource usage.
143 | - **Streamlit UI Issues:**
144 | - Refresh the page or restart the app if UI elements do not update as expected.
145 | 
146 | ---
147 | 
148 | ## 📚 Resources
149 | 
150 | - [Azure SQL DB Vector Support](https://devblogs.microsoft.com/azure-sql/eap-for-vector-support-refresh-introducing-vector-type/)
151 | - [Azure Document Intelligence](https://learn.microsoft.com/azure/ai-services/document-intelligence/)
152 | - [Azure OpenAI Service](https://learn.microsoft.com/azure/ai-services/openai/)
153 | - [Streamlit Documentation](https://docs.streamlit.io/)
154 | - [Project GitHub Repository](https://github.com/Azure-Samples/azure-sql-db-vector-search/tree/main/RAG-with-Documents)
155 | 
156 | ---
157 | --------------------------------------------------------------------------------