├── LICENSE.txt ├── README.md ├── advanced_rag_techniques ├── basic_unstructured_rag.ipynb ├── contextual_rag.ipynb ├── fusion_rag.ipynb ├── hybrid_rag.ipynb ├── hyde_rag.ipynb ├── naive_rag.ipynb ├── parent_document_retriever.ipynb └── rewrite_retrieve_read.ipynb ├── agent_techniques ├── react.ipynb ├── reflexion.ipynb └── rewoo.ipynb ├── agentic_rag_techniques ├── adaptive_rag.ipynb ├── agentic_rag_using_deepseek_qdrant_and_langchain.ipynb ├── basic_agentic_rag.ipynb ├── corrective_rag.ipynb └── self_rag.ipynb └── data ├── context.csv └── tesla_q3.pdf /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Athin AI 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![LinkedIn](https://img.shields.io/badge/LinkedIn-follow-blue)](https://www.linkedin.com/company/athina-ai/posts/?feedView=all)  2 | [![Twitter](https://img.shields.io/twitter/follow/AthinaAI?label=Follow%20@AthinaAI&style=social)](https://x.com/AthinaAI)  3 | [![Share](https://img.shields.io/badge/share-000000?logo=x&logoColor=white)](https://x.com/intent/tweet?text=Check%20out%20this%20project%20on%20GitHub:%20https://github.com/athina-ai/rag-cookbooks)  4 | [![Share](https://img.shields.io/badge/share-0A66C2?logo=linkedin&logoColor=white)](https://www.linkedin.com/sharing/share-offsite/?url=https://github.com/athina-ai/rag-cookbooks)  5 | [![Share](https://img.shields.io/badge/share-FF4500?logo=reddit&logoColor=white)](https://www.reddit.com/submit?title=Check%20out%20this%20project%20on%20GitHub:%20https://github.com/athina-ai/rag-cookbooks) 6 | 7 | >If you find this repository helpful, please consider giving it a star⭐️ 8 | 9 | # Advanced + Agentic RAG Cookbooks👨🏻‍💻 10 | Welcome to the comprehensive collection of advanced + agentic Retrieval-Augmented Generation (RAG) techniques. 11 | 12 | ## Introduction🚀 13 | RAG is a popular method that improves accuracy and relevance by finding the right information from reliable sources and transforming it into useful answers. This repository covers the most effective advanced + agentic RAG techniques with clear implementations and explanations. 
   14 | 
   15 | The main goal of this repository is to provide a helpful resource for researchers and developers looking to use advanced RAG techniques in their projects. Building these techniques from scratch takes time, and finding proper evaluation methods can be challenging. This repository simplifies the process by offering ready-to-use implementations and guidance on how to evaluate them.
   16 | >[!NOTE]
   17 | >This repository starts with naive RAG as a foundation and progresses to advanced and agentic techniques. It also includes research papers/references for each RAG technique, which you can explore for further reading.
   18 | 
   19 | ## Introduction to RAG💡
   20 | Large Language Models are trained on a fixed dataset, which limits their ability to handle private or recent information. They can sometimes "hallucinate", providing incorrect yet believable answers. Fine-tuning can help, but it is expensive and impractical to repeat every time new data arrives. The Retrieval-Augmented Generation (RAG) framework addresses this by supplying external documents to the LLM through in-context learning, so that its responses are not only contextually relevant but also accurate and up-to-date.
   21 | 
   22 | ![final diagram](https://github.com/user-attachments/assets/508b3a87-ac46-4bf7-b849-145c5465a6c0)
   23 | 
   24 | There are four main components in RAG:
   25 | 
   26 | **Indexing:** First, documents (in any format) are split into chunks, and embeddings for these chunks are created. These embeddings are then added to a vector store.
   27 | 
   28 | **Retriever:** Then, the retriever finds the most relevant documents for the user's query, using techniques such as vector similarity search over the vector store.
   29 | 
   30 | **Augment:** After that, the augmentation step combines the user's query with the retrieved context into a single prompt, ensuring the LLM has the information it needs to generate an accurate response.
   31 | 
   32 | **Generate:** Finally, the augmented prompt is passed to the model, which generates the final response to the user's query.
   33 | 
   34 | Together, these components let the model access up-to-date, accurate information and generate responses grounded in external knowledge (a minimal code sketch of this flow is shown just before the techniques table below). However, to ensure RAG systems are functioning effectively, it's essential to evaluate their performance.
   35 | 
   36 | ## RAG Evaluation📊
   37 | Evaluating RAG applications is important for understanding how well these systems work. By checking accuracy and relevance, we can see how effectively they combine information retrieval with generative models. Evaluation helps improve RAG applications in tasks like text summarization, chatbots, and question-answering, and it identifies areas for improvement so that these systems keep providing trustworthy responses as information changes. Overall, effective evaluation optimizes performance and builds confidence in RAG applications for real-world use. Each notebook in this repository contains an end-to-end RAG implementation plus a RAG evaluation step in Athina AI.
   38 | 
   39 | ![evals diagram](https://github.com/user-attachments/assets/65c2b5af-a931-40c5-b006-87567aef019f)
   40 | 
   41 | 
   42 | 
   43 | ## Advanced RAG Techniques⚙️
   44 | Here are the details of all the Advanced RAG techniques covered in this repository.
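To make the four RAG components described above (index, retrieve, augment, generate) concrete, here is a minimal naive RAG sketch. It is an illustration only, not code copied from a specific notebook: it assumes the same LangChain + OpenAI + FAISS stack used throughout this repository, and `./context.csv` is the sample data file from the `data` folder, copied into the working directory as in the notebooks.

```python
# Minimal naive RAG pipeline (illustrative sketch; requires OPENAI_API_KEY in the environment).
from langchain.document_loaders import CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

# 1. Indexing: load documents, split them into chunks, and embed the chunks into a vector store.
documents = CSVLoader("./context.csv").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0).split_documents(documents)
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())

# 2. Retriever: fetch the chunks most similar to the user's query.
retriever = vectorstore.as_retriever()

# 3. Augment: combine the query and the retrieved context into a single prompt.
prompt = ChatPromptTemplate.from_template(
    "You are a helpful assistant that answers questions based on the provided context.\n"
    "Context: {context}\n\nQuestion: {input}\nAnswer:"
)

# 4. Generate: pass the augmented prompt to the LLM and parse the reply into a string.
rag_chain = (
    {"context": retriever, "input": RunnablePassthrough()}
    | prompt
    | ChatOpenAI()
    | StrOutputParser()
)

print(rag_chain.invoke("when did ww1 end?"))
```

Every technique in the tables below keeps this overall shape and swaps in a smarter retriever, prompt construction, or control flow.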
   45 | 
   46 | | Technique                       | Tools                        | Description                                                  | Notebooks |
   47 | |---------------------------------|------------------------------|--------------------------------------------------------------|-----------|
   48 | | Naive RAG | LangChain, Pinecone, Athina AI | Combines retrieved data with LLMs for simple and effective responses.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/naive_rag.ipynb) |
   49 | | Hybrid RAG | LangChain, Chromadb, Athina AI | Combines vector search and traditional methods like BM25 for better information retrieval.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/hybrid_rag.ipynb) |
   50 | | Hyde RAG | LangChain, Weaviate, Athina AI | Creates hypothetical document embeddings to find relevant information for a query.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/hyde_rag.ipynb) |
   51 | | Parent Document Retriever | LangChain, Chromadb, Athina AI | Breaks large documents into small parts and retrieves the full document if a part matches the query.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/parent_document_retriever.ipynb) |
   52 | | RAG Fusion | LangChain, LangSmith, Qdrant, Athina AI | Generates sub-queries, ranks documents with Reciprocal Rank Fusion, and uses top results for accurate responses.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/fusion_rag.ipynb) |
   53 | | Contextual RAG | LangChain, Chromadb, Athina AI | Compresses retrieved documents to keep only relevant details for concise and accurate responses.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/contextual_rag.ipynb) |
   54 | | Rewrite Retrieve Read | LangChain, Chromadb, Athina AI | Rewrites the query, retrieves better data, and generates accurate answers.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/rewrite_retrieve_read.ipynb) |
   55 | | Unstructured RAG | LangChain, LangGraph, FAISS, Athina AI, Unstructured | Handles documents that combine text, tables, and images.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/advanced_rag_techniques/basic_unstructured_rag.ipynb) |
   56 | 
   57 | ## Agentic RAG Techniques⚙️
   58 | Here are the details of all the Agentic RAG techniques covered in this repository.
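Before the table, here is a simplified sketch of the control flow these agentic techniques share: an agent (or graph) checks whether the indexed context is good enough and, if not, falls back to another tool such as web search. This is a hypothetical illustration rather than code from the notebooks; it reuses the `retriever` built in the naive RAG sketch above, and `web_search` and `is_relevant` are stand-ins for a real search tool and an LLM-based relevance grader.

```python
# Simplified agentic RAG loop (hypothetical sketch; not the notebook implementations).
from typing import List
from langchain_openai import ChatOpenAI

def web_search(query: str) -> List[str]:
    """Placeholder for an external web-search tool (e.g., a search API wrapper)."""
    return [f"(web search result for: {query})"]

def is_relevant(query: str, docs: List[str]) -> bool:
    """Crude keyword check; the notebooks typically ask an LLM to grade document relevance."""
    return any(word.lower() in doc.lower() for doc in docs for word in query.split())

def agentic_rag(query: str) -> str:
    # 1. Try the indexed knowledge base first (retriever from the naive RAG sketch above).
    docs = [d.page_content for d in retriever.get_relevant_documents(query)]

    # 2. If the retrieved context looks irrelevant, fall back to web search
    #    (the core idea behind corrective and adaptive RAG).
    if not is_relevant(query, docs):
        docs = web_search(query)

    # 3. Generate the answer from whichever context was kept.
    context = "\n\n".join(docs)
    prompt = (
        "Answer the question using only the provided context.\n"
        f"Context: {context}\n\nQuestion: {query}\nAnswer:"
    )
    return ChatOpenAI().invoke(prompt).content

print(agentic_rag("when did ww1 end?"))
```

The notebooks in the table below implement variations of this idea, typically as LangGraph graphs with LLM-based graders and real web-search tools.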
59 | 60 | | Technique | Tools | Description | Notebooks | 61 | |---------------------------------|------------------------------|--------------------------------------------------------------|-----------| 62 | | Basic Agentic RAG | LangChain, FAISS, Athina AI | Agentic RAG uses AI agents to find and generate answers using tools like vectordb and web searches.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/agentic_rag_techniques/basic_agentic_rag.ipynb) | 63 | | Corrective RAG | LangChain, LangGraph, Chromadb, Athina AI | Refines relevant documents, removes irrelevant ones or does the web search.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/agentic_rag_techniques/corrective_rag.ipynb) | 64 | | Self RAG | LangChain, LangGraph, FAISS, Athina AI | Reflects on retrieved data to ensure accurate and complete responses.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/agentic_rag_techniques/self_rag.ipynb) | 65 | | Adaptive RAG | LangChain, LangGraph, FAISS, Athina AI | Adjusts retrieval methods based on query type, using indexed data or web search.| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/agentic_rag_techniques/adaptive_rag.ipynb) | 66 | | ReAct RAG | LangChain, LangGraph, FAISS, Athina AI | System combining reasoning and retrieval for context-aware responses| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/athina-ai/rag-cookbooks/blob/main/agentic_rag_techniques/react_rag.ipynb) | 67 | 68 | ## Demo🎬 69 | A quick demo of how each notebook works: 70 | 71 | https://github.com/user-attachments/assets/c6f17961-40a1-4cca-ab1f-2c8fa3d71a7a 72 | 73 | ## Getting Started🛠️ 74 | First, clone this repository by using the following command: 75 | ```bash 76 | git clone https://github.com/athina-ai/rag-cookbooks.git 77 | ``` 78 | Next, navigate to the project directory: 79 | ```bash 80 | cd rag-cookbooks 81 | ``` 82 | Once you are in the 'rag-cookbooks' directory, follow the detailed implementation for each technique. 83 | 84 | ## Creators + Contributors👨🏻‍💻 85 | [![Contributors](https://contrib.rocks/image?repo=athina-ai/cookbooks)](https://github.com/athina-ai/cookbooks/graphs/contributors) 86 | 87 | ## Contributing🤝 88 | If you have a new technique or improvement to suggest, we welcome contributions from the community! 89 | 90 | ## License📝 91 | This project is licensed under [MIT License](LICENSE) 92 | 93 | 94 | 95 | 96 | 97 | 98 | -------------------------------------------------------------------------------- /advanced_rag_techniques/basic_unstructured_rag.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "id": "OkpPD_RfNYkg" 7 | }, 8 | "source": [ 9 | "# **Unstructured RAG**\n", 10 | "Unstructured or (Semi-Structured) RAG is a method designed to handle documents that combine text, tables, and images. 
It addresses challenges like broken tables caused by text splitting and the difficulty of embedding tables for semantic search.\n", 11 | "\n", 12 | "Here we are using unstructured.io to parse and separate text, tables, and images.\n", 13 | "\n", 14 | "Tool Reference: [Unstructured](https://unstructured.io/)" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": { 20 | "id": "urjXoWDk9rg5" 21 | }, 22 | "source": [ 23 | "## **Initial Setup**" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": null, 29 | "metadata": { 30 | "id": "oYh10OBF6sUe" 31 | }, 32 | "outputs": [], 33 | "source": [ 34 | "! pip install --q athina faiss-gpu pytesseract unstructured-client \"unstructured[all-docs]\"" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": null, 40 | "metadata": { 41 | "id": "F0bKsHlVQqqc" 42 | }, 43 | "outputs": [], 44 | "source": [ 45 | "!apt-get install poppler-utils\n", 46 | "!apt-get install tesseract-ocr\n", 47 | "!apt-get install libtesseract-dev" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 3, 53 | "metadata": { 54 | "id": "8AHlmfoP6t21" 55 | }, 56 | "outputs": [], 57 | "source": [ 58 | "import os\n", 59 | "from google.colab import userdata\n", 60 | "os.environ[\"OPENAI_API_KEY\"] = userdata.get('OPENAI_API_KEY')\n", 61 | "os.environ['ATHINA_API_KEY'] = userdata.get('ATHINA_API_KEY')" 62 | ] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": { 67 | "id": "9ObH1Z5s9svo" 68 | }, 69 | "source": [ 70 | "## **Indexing**" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": 4, 76 | "metadata": { 77 | "id": "Ougq8dbPFvO4" 78 | }, 79 | "outputs": [], 80 | "source": [ 81 | "# load embedding model\n", 82 | "from langchain_openai import OpenAIEmbeddings\n", 83 | "embeddings = OpenAIEmbeddings()" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": null, 89 | "metadata": { 90 | "id": "03zjZ3WF7LoE" 91 | }, 92 | "outputs": [], 93 | "source": [ 94 | " # load and extract images, tables, and chunk text\n", 95 | "from unstructured.partition.pdf import partition_pdf\n", 96 | "\n", 97 | "filename = \"/content/sample.pdf\"\n", 98 | "\n", 99 | "pdf_elements = partition_pdf(\n", 100 | " filename=filename,\n", 101 | " extract_images_in_pdf=True,\n", 102 | " strategy = \"hi_res\",\n", 103 | " hi_res_model_name=\"yolox\",\n", 104 | " infer_table_structure=True,\n", 105 | " chunking_strategy=\"by_title\",\n", 106 | " max_characters=3000,\n", 107 | " combine_text_under_n_chars=200,\n", 108 | ")" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": 7, 114 | "metadata": { 115 | "colab": { 116 | "base_uri": "https://localhost:8080/" 117 | }, 118 | "id": "QHk-K9voHeea", 119 | "outputId": "1e1246b1-9e3e-48a5-a101-8530f5ce4ae0" 120 | }, 121 | "outputs": [ 122 | { 123 | "data": { 124 | "text/plain": [ 125 | "Counter({\"\": 14,\n", 126 | " \"\": 2})" 127 | ] 128 | }, 129 | "execution_count": 7, 130 | "metadata": {}, 131 | "output_type": "execute_result" 132 | } 133 | ], 134 | "source": [ 135 | "# check unique categories\n", 136 | "from collections import Counter\n", 137 | "category_counts = Counter(str(type(element)) for element in pdf_elements)\n", 138 | "unique_categories = set(category_counts)\n", 139 | "category_counts" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": 8, 145 | "metadata": { 146 | "colab": { 147 | "base_uri": "https://localhost:8080/" 148 | }, 149 | "id": "E7jI_njBE5kj", 150 | "outputId": 
"9f9eea79-a762-4c7a-b431-cc97869d66d9" 151 | }, 152 | "outputs": [ 153 | { 154 | "data": { 155 | "text/plain": [ 156 | "{'CompositeElement', 'Table'}" 157 | ] 158 | }, 159 | "execution_count": 8, 160 | "metadata": {}, 161 | "output_type": "execute_result" 162 | } 163 | ], 164 | "source": [ 165 | "# extract unique types\n", 166 | "unique_types = {el.to_dict()['type'] for el in pdf_elements}\n", 167 | "unique_types" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 9, 173 | "metadata": { 174 | "id": "aZXVtfYfKUJz" 175 | }, 176 | "outputs": [], 177 | "source": [ 178 | "# # display images from pdf\n", 179 | "# from IPython.display import Image, display\n", 180 | "# image_files = os.listdir('/content/figures')\n", 181 | "# image_files = [os.path.join('/content/figures', image_file) for image_file in image_files]\n", 182 | "\n", 183 | "# for image_file in image_files:\n", 184 | "# display(Image(filename=image_file))" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 10, 190 | "metadata": { 191 | "id": "ArQxTM8pHDWl" 192 | }, 193 | "outputs": [], 194 | "source": [ 195 | "# convert pdf_elements to langchain documents\n", 196 | "from langchain.schema import Document\n", 197 | "documents = [Document(page_content=el.text, metadata={\"source\": filename}) for el in pdf_elements]" 198 | ] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "metadata": { 203 | "id": "NLvE3-BXNMde" 204 | }, 205 | "source": [ 206 | "## **Vector Store**" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": 11, 212 | "metadata": { 213 | "id": "R-4GZaxzGx9E" 214 | }, 215 | "outputs": [], 216 | "source": [ 217 | "# create vectorstore\n", 218 | "from langchain.vectorstores import FAISS\n", 219 | "vectorstore = FAISS.from_documents(documents, embeddings)" 220 | ] 221 | }, 222 | { 223 | "cell_type": "markdown", 224 | "metadata": { 225 | "id": "VRiczRDVNKzR" 226 | }, 227 | "source": [ 228 | "## **Retriever**" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 12, 234 | "metadata": { 235 | "id": "YlliENVAGczz" 236 | }, 237 | "outputs": [], 238 | "source": [ 239 | "# create retriever\n", 240 | "retriever = vectorstore.as_retriever()" 241 | ] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": { 246 | "id": "KbYdQT9xNKRT" 247 | }, 248 | "source": [ 249 | "## **RAG Chain**" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": 13, 255 | "metadata": { 256 | "id": "c6CIteJzNWCk" 257 | }, 258 | "outputs": [], 259 | "source": [ 260 | "# load llm\n", 261 | "from langchain_openai import ChatOpenAI\n", 262 | "llm = ChatOpenAI()" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": 14, 268 | "metadata": { 269 | "id": "r7LT4JSCNYEM" 270 | }, 271 | "outputs": [], 272 | "source": [ 273 | "# create document chain\n", 274 | "from langchain.prompts import ChatPromptTemplate\n", 275 | "from langchain.schema.runnable import RunnablePassthrough\n", 276 | "from langchain.schema.output_parser import StrOutputParser\n", 277 | "\n", 278 | "template = \"\"\"\"\n", 279 | "You are a helpful assistant that answers questions based on the provided context, which can include text and tables.\n", 280 | "Use the provided context to answer the question.\n", 281 | "Question: {input}\n", 282 | "Context: {context}\n", 283 | "Answer:\n", 284 | "\"\"\"\n", 285 | "prompt = ChatPromptTemplate.from_template(template)\n", 286 | "\n", 287 | "# Setup RAG pipeline\n", 288 | "rag_chain = (\n", 289 | " 
{\"context\": retriever, \"input\": RunnablePassthrough()}\n", 290 | " | prompt\n", 291 | " | llm\n", 292 | " | StrOutputParser()\n", 293 | ")" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": 15, 299 | "metadata": { 300 | "colab": { 301 | "base_uri": "https://localhost:8080/", 302 | "height": 105 303 | }, 304 | "id": "1v4A3yzHNbJ-", 305 | "outputId": "0a960628-82a8-4159-867a-27b8fb1d9107" 306 | }, 307 | "outputs": [ 308 | { 309 | "data": { 310 | "application/vnd.google.colaboratory.intrinsic+json": { 311 | "type": "string" 312 | }, 313 | "text/plain": [ 314 | "'To compare all the Training Results on the MATH Test Set, we can look at the results from Table 6 in the provided context. The results are as follows:\\n\\n- deepseek-sft-abel:\\n - SFT-phase1: 0.372\\n - SFT-phase2-shortcutLearning: 0.386\\n - SFT-phase2-journeyLearining: 0.470\\n - DPO: 0.472\\n\\n- deepseek-sft-prm800k:\\n - SFT-phase1: 0.290\\n - SFT-phase2-shortcutLearning: 0.348\\n - SFT-phase2-journeyLearining: 0.428\\n - DPO: 0.440\\n\\nBased on these results, we can see that Journey Learning led to significant improvements compared to Shortcut Learning on both models, with gains of +8.4 and +8.0 on deepseek-sft-abel and deepseek-sft-prm800k, respectively. The DPO results were also provided for comparison.'" 315 | ] 316 | }, 317 | "execution_count": 15, 318 | "metadata": {}, 319 | "output_type": "execute_result" 320 | } 321 | ], 322 | "source": [ 323 | "# response\n", 324 | "response = rag_chain.invoke(\"Compare all the Training Results on MATH Test Set\")\n", 325 | "response" 326 | ] 327 | }, 328 | { 329 | "cell_type": "markdown", 330 | "metadata": { 331 | "id": "lq3KgKOKPi-J" 332 | }, 333 | "source": [ 334 | "## **Preparing Data for Evaluation**" 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": 16, 340 | "metadata": { 341 | "colab": { 342 | "base_uri": "https://localhost:8080/" 343 | }, 344 | "id": "YlaS-vyfPx1G", 345 | "outputId": "e428c7fb-e611-4df5-f989-628d3ffadc06" 346 | }, 347 | "outputs": [ 348 | { 349 | "name": "stderr", 350 | "output_type": "stream", 351 | "text": [ 352 | "/usr/local/lib/python3.10/dist-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The method `BaseRetriever.get_relevant_documents` was deprecated in langchain-core 0.1.46 and will be removed in 0.3.0. 
Use invoke instead.\n", 353 | " warn_deprecated(\n" 354 | ] 355 | } 356 | ], 357 | "source": [ 358 | "# create dataset\n", 359 | "question = [\"Compare all the Training Results on MATH Test Set\"]\n", 360 | "response = []\n", 361 | "contexts = []\n", 362 | "\n", 363 | "# Inference\n", 364 | "for query in question:\n", 365 | " response.append(rag_chain.invoke(query))\n", 366 | " contexts.append([docs.page_content for docs in retriever.get_relevant_documents(query)])\n", 367 | "\n", 368 | "# To dict\n", 369 | "data = {\n", 370 | " \"query\": question,\n", 371 | " \"response\": response,\n", 372 | " \"context\": contexts,\n", 373 | "}" 374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "execution_count": 17, 379 | "metadata": { 380 | "id": "yJwNmnwwQFcZ" 381 | }, 382 | "outputs": [], 383 | "source": [ 384 | "# create dataset\n", 385 | "from datasets import Dataset\n", 386 | "dataset = Dataset.from_dict(data)" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": 18, 392 | "metadata": { 393 | "id": "ht2xXRiTQHBu" 394 | }, 395 | "outputs": [], 396 | "source": [ 397 | "# create dataframe\n", 398 | "import pandas as pd\n", 399 | "df = pd.DataFrame(dataset)" 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": 19, 405 | "metadata": { 406 | "colab": { 407 | "base_uri": "https://localhost:8080/", 408 | "height": 89 409 | }, 410 | "id": "nZDNxwXbQI8K", 411 | "outputId": "da52093d-2e02-40d0-8f68-842892259bc2" 412 | }, 413 | "outputs": [ 414 | { 415 | "data": { 416 | "application/vnd.google.colaboratory.intrinsic+json": { 417 | "summary": "{\n \"name\": \"df\",\n \"rows\": 1,\n \"fields\": [\n {\n \"column\": \"query\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"Compare all the Training Results on MATH Test Set\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"response\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"To compare all the Training Results on the MATH Test Set, we can look at the results provided in Table 6 from the context. The results for the different models on the MATH test set are as follows:\\n\\n- deepseek-sft-abel: SFT-phase1 = 0.372, SFT-phase2-shortcutLearning = 0.386, SFT-phase2-journeyLearining = 0.470, DPO = 0.472\\n- deepseek-sft-prm800k: SFT-phase1 = 0.290, SFT-phase2-shortcutLearning = 0.348, SFT-phase2-journeyLearining = 0.428, DPO = 0.440\\n\\nFrom these results, we can see that the Journey Learning method led to significant improvements compared to Shortcut Learning for both models on the MATH test set. The gains were +8.4 and +8.0 for the deepseek-sft-abel and deepseek-sft-prm800k models, respectively. The improvement from DPO was more modest in comparison.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}", 418 | "type": "dataframe", 419 | "variable_name": "df" 420 | }, 421 | "text/html": [ 422 | "\n", 423 | "
\n", 424 | "
\n", 425 | "\n", 438 | "\n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | "
queryresponsecontext
0Compare all the Training Results on MATH Test SetTo compare all the Training Results on the MAT...[The results of our experiments are shown in T...
\n", 456 | "
\n", 457 | "
\n", 458 | "\n", 459 | "
\n", 460 | " \n", 468 | "\n", 469 | " \n", 509 | "\n", 510 | " \n", 534 | "
\n", 535 | "\n", 536 | "\n", 537 | "
\n", 538 | " \n", 569 | " \n", 578 | " \n", 590 | "
\n", 591 | "\n", 592 | "
\n", 593 | "
\n" 594 | ], 595 | "text/plain": [ 596 | " query \\\n", 597 | "0 Compare all the Training Results on MATH Test Set \n", 598 | "\n", 599 | " response \\\n", 600 | "0 To compare all the Training Results on the MAT... \n", 601 | "\n", 602 | " context \n", 603 | "0 [The results of our experiments are shown in T... " 604 | ] 605 | }, 606 | "execution_count": 19, 607 | "metadata": {}, 608 | "output_type": "execute_result" 609 | } 610 | ], 611 | "source": [ 612 | "df" 613 | ] 614 | }, 615 | { 616 | "cell_type": "code", 617 | "execution_count": 20, 618 | "metadata": { 619 | "id": "GnKSiZ1yQVQu" 620 | }, 621 | "outputs": [], 622 | "source": [ 623 | "# Convert to dictionary\n", 624 | "df_dict = df.to_dict(orient='records')\n", 625 | "\n", 626 | "# Convert context to list\n", 627 | "for record in df_dict:\n", 628 | " if not isinstance(record.get('context'), list):\n", 629 | " if record.get('context') is None:\n", 630 | " record['context'] = []\n", 631 | " else:\n", 632 | " record['context'] = [record['context']]" 633 | ] 634 | }, 635 | { 636 | "cell_type": "markdown", 637 | "metadata": { 638 | "id": "Gt_JpGdnQa5v" 639 | }, 640 | "source": [ 641 | "## **Evaluation in Athina AI**\n", 642 | "\n", 643 | "We will use **Does Response Answer Query** eval here. It Checks if the response answer the user's query. To learn more about this. Please refer to our [documentation](https://docs.athina.ai/api-reference/evals/preset-evals/overview) for further details." 644 | ] 645 | }, 646 | { 647 | "cell_type": "code", 648 | "execution_count": 21, 649 | "metadata": { 650 | "colab": { 651 | "base_uri": "https://localhost:8080/" 652 | }, 653 | "id": "i_JqLJ6eQbaZ", 654 | "outputId": "1e4ccb88-1a88-48a7-b809-64d9832fd15c" 655 | }, 656 | "outputs": [ 657 | { 658 | "name": "stderr", 659 | "output_type": "stream", 660 | "text": [ 661 | "/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_generate_schema.py:547: UserWarning: is not a Python type (it may be an instance of an object), Pydantic will allow any object with no validation since we cannot even enforce that the input is an instance of the given type. To get rid of this error wrap the type with `pydantic.SkipValidation`.\n", 662 | " warn(\n" 663 | ] 664 | } 665 | ], 666 | "source": [ 667 | "# set api keys for Athina evals\n", 668 | "from athina.keys import AthinaApiKey, OpenAiApiKey\n", 669 | "OpenAiApiKey.set_key(os.getenv('OPENAI_API_KEY'))\n", 670 | "AthinaApiKey.set_key(os.getenv('ATHINA_API_KEY'))" 671 | ] 672 | }, 673 | { 674 | "cell_type": "code", 675 | "execution_count": 22, 676 | "metadata": { 677 | "id": "8vd7LKlWQfUg" 678 | }, 679 | "outputs": [], 680 | "source": [ 681 | "# load dataset\n", 682 | "from athina.loaders import Loader\n", 683 | "dataset = Loader().load_dict(df_dict)" 684 | ] 685 | }, 686 | { 687 | "cell_type": "code", 688 | "execution_count": 23, 689 | "metadata": { 690 | "colab": { 691 | "base_uri": "https://localhost:8080/", 692 | "height": 445 693 | }, 694 | "id": "lvTOKtOSQhKk", 695 | "outputId": "61f63043-a5b8-4e79-f97d-55fdea34e231" 696 | }, 697 | "outputs": [ 698 | { 699 | "name": "stdout", 700 | "output_type": "stream", 701 | "text": [ 702 | "You can view your dataset at: https://app.athina.ai/develop/e5dec38c-c58c-412d-b910-588d97ccd090\n" 703 | ] 704 | }, 705 | { 706 | "data": { 707 | "application/vnd.google.colaboratory.intrinsic+json": { 708 | "repr_error": "Out of range float values are not JSON compliant: nan", 709 | "type": "dataframe" 710 | }, 711 | "text/html": [ 712 | "\n", 713 | "
\n", 714 | "
\n", 715 | "\n", 728 | "\n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | " \n", 737 | " \n", 738 | " \n", 739 | " \n", 740 | " \n", 741 | " \n", 742 | " \n", 743 | " \n", 744 | " \n", 745 | " \n", 746 | " \n", 747 | " \n", 748 | " \n", 749 | " \n", 750 | " \n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | " \n", 757 | " \n", 758 | " \n", 759 | "
querycontextresponseexpected_responsedisplay_namefailedgrade_reasonruntimemodelpassed
0Compare all the Training Results on MATH Test Set[The results of our experiments are shown in Table 6. All results are tested on the MATH test set, using a re-divided subset from PRM800K, which includes 500 examples. The results show that Journey Learning led to significant improvements compared to Shortcut Learning, with gains of +8.4 and +8.0 on the deepseek-sft-abel and deepseek-sft-prm800k models, respectively, demonstrating the effectiveness of our proposed Journey Learning method. However, the improvement from DPO was more modest, an...To compare all the Training Results on the MATH Test Set, we can look at the results provided in Table 6 from the context. The results for the different models on the MATH test set are as follows:\\n\\n- deepseek-sft-abel: SFT-phase1 = 0.372, SFT-phase2-shortcutLearning = 0.386, SFT-phase2-journeyLearining = 0.470, DPO = 0.472\\n- deepseek-sft-prm800k: SFT-phase1 = 0.290, SFT-phase2-shortcutLearning = 0.348, SFT-phase2-journeyLearining = 0.428, DPO = 0.440\\n\\nFrom these results, we can see that...NoneDoes Response Answer QueryFalseThe response provides a detailed comparison of the training results on the MATH test set for two models, deepseek-sft-abel and deepseek-sft-prm800k. It includes specific performance metrics for different phases and methods, such as SFT-phase1, SFT-phase2-shortcutLearning, SFT-phase2-journeyLearning, and DPO. Additionally, it highlights the improvements observed with the Journey Learning method compared to Shortcut Learning, which directly addresses the user's query about comparing training r...1910gpt-4o1.0
\n", 760 | "
\n", 761 | "
\n", 762 | "\n", 763 | "
\n", 764 | " \n", 772 | "\n", 773 | " \n", 813 | "\n", 814 | " \n", 838 | "
\n", 839 | "\n", 840 | "\n", 841 | "
\n", 842 | "
\n" 843 | ], 844 | "text/plain": [ 845 | " query \\\n", 846 | "0 Compare all the Training Results on MATH Test Set \n", 847 | "\n", 848 | " context \\\n", 849 | "0 [The results of our experiments are shown in Table 6. All results are tested on the MATH test set, using a re-divided subset from PRM800K, which includes 500 examples. The results show that Journey Learning led to significant improvements compared to Shortcut Learning, with gains of +8.4 and +8.0 on the deepseek-sft-abel and deepseek-sft-prm800k models, respectively, demonstrating the effectiveness of our proposed Journey Learning method. However, the improvement from DPO was more modest, an... \n", 850 | "\n", 851 | " response \\\n", 852 | "0 To compare all the Training Results on the MATH Test Set, we can look at the results provided in Table 6 from the context. The results for the different models on the MATH test set are as follows:\\n\\n- deepseek-sft-abel: SFT-phase1 = 0.372, SFT-phase2-shortcutLearning = 0.386, SFT-phase2-journeyLearining = 0.470, DPO = 0.472\\n- deepseek-sft-prm800k: SFT-phase1 = 0.290, SFT-phase2-shortcutLearning = 0.348, SFT-phase2-journeyLearining = 0.428, DPO = 0.440\\n\\nFrom these results, we can see that... \n", 853 | "\n", 854 | " expected_response display_name failed \\\n", 855 | "0 None Does Response Answer Query False \n", 856 | "\n", 857 | " grade_reason \\\n", 858 | "0 The response provides a detailed comparison of the training results on the MATH test set for two models, deepseek-sft-abel and deepseek-sft-prm800k. It includes specific performance metrics for different phases and methods, such as SFT-phase1, SFT-phase2-shortcutLearning, SFT-phase2-journeyLearning, and DPO. Additionally, it highlights the improvements observed with the Journey Learning method compared to Shortcut Learning, which directly addresses the user's query about comparing training r... \n", 859 | "\n", 860 | " runtime model passed \n", 861 | "0 1910 gpt-4o 1.0 " 862 | ] 863 | }, 864 | "execution_count": 23, 865 | "metadata": {}, 866 | "output_type": "execute_result" 867 | } 868 | ], 869 | "source": [ 870 | "# evaluate\n", 871 | "from athina.evals import DoesResponseAnswerQuery\n", 872 | "DoesResponseAnswerQuery(model=\"gpt-4o\").run_batch(data=dataset).to_df()" 873 | ] 874 | } 875 | ], 876 | "metadata": { 877 | "colab": { 878 | "provenance": [] 879 | }, 880 | "kernelspec": { 881 | "display_name": "Python 3", 882 | "name": "python3" 883 | }, 884 | "language_info": { 885 | "name": "python" 886 | } 887 | }, 888 | "nbformat": 4, 889 | "nbformat_minor": 0 890 | } 891 | -------------------------------------------------------------------------------- /advanced_rag_techniques/contextual_rag.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [] 7 | }, 8 | "kernelspec": { 9 | "name": "python3", 10 | "display_name": "Python 3" 11 | }, 12 | "language_info": { 13 | "name": "python" 14 | } 15 | }, 16 | "cells": [ 17 | { 18 | "cell_type": "markdown", 19 | "source": [ 20 | "# **Contextual RAG**\n", 21 | "\n", 22 | "Contextual Retrieval-Augmented Generation (RAG) is an advanced RAG technique that improves response relevance and efficiency by incorporating contextual compression during the retrieval process. 
Traditional RAG retrieves and sends full documents to the generation model, which may include irrelevant information, leading to higher costs and less accurate responses.\n", 23 | "\n", 24 | "In Contextual RAG, the retrieved documents are processed through a Document Compressor before being passed to the language model. This compressor extracts and retains only the most relevant information for the query, or even discards entire irrelevant documents. This approach reduces the noise in the retrieved context, resulting in more precise, concise, and cost-effective responses from the generation model.\n", 25 | "\n", 26 | "Reference: [Contextual RAG](https://python.langchain.com/docs/how_to/contextual_compression/)" 27 | ], 28 | "metadata": { 29 | "id": "uSh0jjVbzHCz" 30 | } 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "source": [ 35 | "## **Initial Setup**" 36 | ], 37 | "metadata": { 38 | "id": "IDIEkHsh1Eka" 39 | } 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": null, 44 | "metadata": { 45 | "id": "MrZhFJX_yBzX" 46 | }, 47 | "outputs": [], 48 | "source": [ 49 | "!pip install --q athina chromadb" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "source": [ 55 | "import os\n", 56 | "from google.colab import userdata\n", 57 | "os.environ[\"OPENAI_API_KEY\"] = userdata.get('OPENAI_API_KEY')\n", 58 | "os.environ['ATHINA_API_KEY'] = userdata.get('ATHINA_API_KEY')" 59 | ], 60 | "metadata": { 61 | "id": "tg6l6UKW06dY" 62 | }, 63 | "execution_count": null, 64 | "outputs": [] 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "source": [ 69 | "## **Indexing**" 70 | ], 71 | "metadata": { 72 | "id": "AO7zLKC_1HSm" 73 | } 74 | }, 75 | { 76 | "cell_type": "code", 77 | "source": [ 78 | "# load embedding model\n", 79 | "from langchain_openai import OpenAIEmbeddings\n", 80 | "embeddings = OpenAIEmbeddings()" 81 | ], 82 | "metadata": { 83 | "id": "SS6M8Fvl1HH0" 84 | }, 85 | "execution_count": null, 86 | "outputs": [] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "source": [ 91 | "# load data\n", 92 | "from langchain.document_loaders import CSVLoader\n", 93 | "loader = CSVLoader(\"./context.csv\")\n", 94 | "documents = loader.load()" 95 | ], 96 | "metadata": { 97 | "id": "iR30q3uZ1YjU" 98 | }, 99 | "execution_count": null, 100 | "outputs": [] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "source": [ 105 | "# split documents\n", 106 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", 107 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n", 108 | "documents = text_splitter.split_documents(documents)" 109 | ], 110 | "metadata": { 111 | "id": "WDEHMq-I1dEE" 112 | }, 113 | "execution_count": null, 114 | "outputs": [] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "source": [ 119 | "# create vectorstore\n", 120 | "from langchain.vectorstores import Chroma\n", 121 | "vectorstore = Chroma.from_documents(documents, embeddings)" 122 | ], 123 | "metadata": { 124 | "id": "cCQ6DKuf1fkf" 125 | }, 126 | "execution_count": null, 127 | "outputs": [] 128 | }, 129 | { 130 | "cell_type": "markdown", 131 | "source": [ 132 | "## **Retriever**" 133 | ], 134 | "metadata": { 135 | "id": "jC31p6zQ1gRZ" 136 | } 137 | }, 138 | { 139 | "cell_type": "code", 140 | "source": [ 141 | "# create retriever\n", 142 | "retriever = vectorstore.as_retriever()" 143 | ], 144 | "metadata": { 145 | "id": "fu0TnRi41iUD" 146 | }, 147 | "execution_count": null, 148 | "outputs": [] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "source": [ 153 | "## **Contextual 
Retriever**" 154 | ], 155 | "metadata": { 156 | "id": "pbNId_f-1u9O" 157 | } 158 | }, 159 | { 160 | "cell_type": "code", 161 | "source": [ 162 | "# create llm\n", 163 | "from langchain_openai import ChatOpenAI\n", 164 | "llm = ChatOpenAI()" 165 | ], 166 | "metadata": { 167 | "id": "SZfW0rkT2G4x" 168 | }, 169 | "execution_count": null, 170 | "outputs": [] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "source": [ 175 | "# create compression retriever\n", 176 | "from langchain.retrievers import ContextualCompressionRetriever\n", 177 | "from langchain.retrievers.document_compressors import LLMChainExtractor\n", 178 | "\n", 179 | "compressor = LLMChainExtractor.from_llm(llm)\n", 180 | "compression_retriever = ContextualCompressionRetriever(\n", 181 | " base_compressor=compressor, base_retriever=retriever\n", 182 | ")" 183 | ], 184 | "metadata": { 185 | "id": "SXpNy93c3F_Y" 186 | }, 187 | "execution_count": null, 188 | "outputs": [] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "source": [ 193 | "# checking compressed doc\n", 194 | "compressed_docs = compression_retriever.invoke(\"what are points on a mortgage\")\n", 195 | "compressed_docs" 196 | ], 197 | "metadata": { 198 | "colab": { 199 | "base_uri": "https://localhost:8080/" 200 | }, 201 | "id": "i3r6ULBq1-76", 202 | "outputId": "c9b64510-3d33-4b19-b3b4-983859a9fefb" 203 | }, 204 | "execution_count": null, 205 | "outputs": [ 206 | { 207 | "output_type": "execute_result", 208 | "data": { 209 | "text/plain": [ 210 | "[Document(page_content='Discount points, also called mortgage points or simply points, are a form of pre-paid interest available in the United States when arranging a mortgage. One point equals one percent of the loan amount. By charging a borrower points, a lender effectively increases the yield on the loan above the amount of the stated interest rate. Borrowers can offer to pay a lender points as a method to reduce the interest rate on the loan, thus obtaining a lower monthly payment in exchange for this', metadata={'row': 1, 'source': './context.csv'}),\n", 211 | " Document(page_content=\"points is the concept of the 'no closing cost loan', in which the consumer accepts a higher interest rate in return for the lender paying the loan's closing costs up front. In some cases a purchaser can negotiate with the seller to get them to pay seller's points which can be used to pay mortgage points.\", metadata={'row': 1, 'source': './context.csv'}),\n", 212 | " Document(page_content='Points may also be purchased to reduce the monthly payment for the purpose of qualifying for a loan. Loan qualification based on monthly income versus the monthly loan payment may sometimes only be achievable by reducing the monthly payment through the purchasing of points to buy down the interest rate, thereby reducing the monthly loan payment. Discount points may be different from origination fee, mortgage arrangement fee or broker fee. 
Discount points are always used to buy down the', metadata={'row': 1, 'source': './context.csv'}),\n", 213 | " Document(page_content='paying points will cost more than just paying the higher interest', metadata={'row': 1, 'source': './context.csv'})]" 214 | ] 215 | }, 216 | "metadata": {}, 217 | "execution_count": 17 218 | } 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "source": [ 224 | "## **RAG Chain**" 225 | ], 226 | "metadata": { 227 | "id": "znrnrY073o82" 228 | } 229 | }, 230 | { 231 | "cell_type": "code", 232 | "source": [ 233 | "# create document chain\n", 234 | "from langchain.prompts import ChatPromptTemplate\n", 235 | "from langchain.schema.runnable import RunnablePassthrough\n", 236 | "from langchain.schema.output_parser import StrOutputParser\n", 237 | "\n", 238 | "template = \"\"\"\"\n", 239 | "You are a helpful assistant that answers questions based on the following context.\n", 240 | "If you don't find the answer in the context, just say that you don't know.\n", 241 | "Context: {context}\n", 242 | "\n", 243 | "Question: {input}\n", 244 | "\n", 245 | "Answer:\n", 246 | "\n", 247 | "\"\"\"\n", 248 | "prompt = ChatPromptTemplate.from_template(template)\n", 249 | "\n", 250 | "# Setup RAG pipeline\n", 251 | "rag_chain = (\n", 252 | " {\"context\": compression_retriever, \"input\": RunnablePassthrough()}\n", 253 | " | prompt\n", 254 | " | llm\n", 255 | " | StrOutputParser()\n", 256 | ")" 257 | ], 258 | "metadata": { 259 | "id": "DC_S8t0f3pV5" 260 | }, 261 | "execution_count": null, 262 | "outputs": [] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "source": [ 267 | "# response\n", 268 | "response = rag_chain.invoke(\"what are points on a mortgage\")\n", 269 | "response" 270 | ], 271 | "metadata": { 272 | "colab": { 273 | "base_uri": "https://localhost:8080/", 274 | "height": 87 275 | }, 276 | "id": "RnXKAPED3teL", 277 | "outputId": "478e08df-62ef-4ef1-8edb-2b1c7840cf8d" 278 | }, 279 | "execution_count": null, 280 | "outputs": [ 281 | { 282 | "output_type": "execute_result", 283 | "data": { 284 | "text/plain": [ 285 | "'Points on a mortgage, also known as discount points or mortgage points, are a form of pre-paid interest that borrowers can pay to lenders when arranging a mortgage in the United States. One point equals one percent of the loan amount. By paying points, borrowers can effectively reduce the interest rate on the loan, resulting in a lower monthly payment. Points can also be used to qualify for a loan or to have the lender pay the closing costs upfront. Points are different from origination fees, mortgage arrangement fees, or broker fees. 
The loan rate is typically reduced by a certain percentage when points are paid.'" 286 | ], 287 | "application/vnd.google.colaboratory.intrinsic+json": { 288 | "type": "string" 289 | } 290 | }, 291 | "metadata": {}, 292 | "execution_count": 19 293 | } 294 | ] 295 | }, 296 | { 297 | "cell_type": "markdown", 298 | "source": [ 299 | "## **Preparing Data for Evaluation**" 300 | ], 301 | "metadata": { 302 | "id": "EOz3Wp0Q31ln" 303 | } 304 | }, 305 | { 306 | "cell_type": "code", 307 | "source": [ 308 | "# create dataset\n", 309 | "questions = [\"what are points on a mortgage\"]\n", 310 | "response = []\n", 311 | "contexts = []\n", 312 | "\n", 313 | "# Inference\n", 314 | "for query in questions:\n", 315 | " response.append(rag_chain.invoke(query))\n", 316 | " contexts.append([docs.page_content for docs in compression_retriever.get_relevant_documents(query)])\n", 317 | "\n", 318 | "# To dict\n", 319 | "data = {\n", 320 | " \"query\": questions,\n", 321 | " \"response\": response,\n", 322 | " \"context\": contexts,\n", 323 | "}" 324 | ], 325 | "metadata": { 326 | "id": "pR8aRn_032H8" 327 | }, 328 | "execution_count": null, 329 | "outputs": [] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "source": [ 334 | "# create dataset\n", 335 | "from datasets import Dataset\n", 336 | "dataset = Dataset.from_dict(data)" 337 | ], 338 | "metadata": { 339 | "id": "NQ_kFqDa4fN2" 340 | }, 341 | "execution_count": null, 342 | "outputs": [] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "source": [ 347 | "# create dataframe\n", 348 | "import pandas as pd\n", 349 | "df = pd.DataFrame(dataset)" 350 | ], 351 | "metadata": { 352 | "id": "YvdWjllI4l5y" 353 | }, 354 | "execution_count": null, 355 | "outputs": [] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "source": [ 360 | "df" 361 | ], 362 | "metadata": { 363 | "colab": { 364 | "base_uri": "https://localhost:8080/", 365 | "height": 89 366 | }, 367 | "id": "fK4P2QEG4oIu", 368 | "outputId": "26669cc6-1f72-4aef-d7a9-2a563e4eaa2d" 369 | }, 370 | "execution_count": null, 371 | "outputs": [ 372 | { 373 | "output_type": "execute_result", 374 | "data": { 375 | "text/plain": [ 376 | " query \\\n", 377 | "0 what are points on a mortgage \n", 378 | "\n", 379 | " response \\\n", 380 | "0 Points on a mortgage are a form of pre-paid in... \n", 381 | "\n", 382 | " context \n", 383 | "0 [Discount points, also called mortgage points ... " 384 | ], 385 | "text/html": [ 386 | "\n", 387 | "
\n", 388 | "
\n", 389 | "\n", 402 | "\n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | "
queryresponsecontext
0what are points on a mortgagePoints on a mortgage are a form of pre-paid in...[Discount points, also called mortgage points ...
\n", 420 | "
\n", 421 | "
\n", 422 | "\n", 423 | "
\n", 424 | " \n", 432 | "\n", 433 | " \n", 473 | "\n", 474 | " \n", 498 | "
\n", 499 | "\n", 500 | "\n", 501 | "
\n", 502 | " \n", 533 | " \n", 542 | " \n", 554 | "
\n", 555 | "\n", 556 | "
\n", 557 | "
\n" 558 | ], 559 | "application/vnd.google.colaboratory.intrinsic+json": { 560 | "type": "dataframe", 561 | "variable_name": "df", 562 | "summary": "{\n \"name\": \"df\",\n \"rows\": 1,\n \"fields\": [\n {\n \"column\": \"query\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"what are points on a mortgage\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"response\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"Points on a mortgage are a form of pre-paid interest that a borrower can offer to pay a lender in order to reduce the interest rate on the loan. One point equals one percent of the loan amount. By paying points, a borrower can obtain a lower monthly payment in exchange for this. Additionally, points can also be used to reduce the monthly payment to qualify for a loan. It is important to note that discount points may be different from origination fee, mortgage arrangement fee, or broker fee.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}" 563 | } 564 | }, 565 | "metadata": {}, 566 | "execution_count": 26 567 | } 568 | ] 569 | }, 570 | { 571 | "cell_type": "code", 572 | "source": [ 573 | "# Convert to dictionary\n", 574 | "df_dict = df.to_dict(orient='records')\n", 575 | "\n", 576 | "# Convert context to list\n", 577 | "for record in df_dict:\n", 578 | " if not isinstance(record.get('context'), list):\n", 579 | " if record.get('context') is None:\n", 580 | " record['context'] = []\n", 581 | " else:\n", 582 | " record['context'] = [record['context']]" 583 | ], 584 | "metadata": { 585 | "id": "CjNrTs7l4qg_" 586 | }, 587 | "execution_count": null, 588 | "outputs": [] 589 | }, 590 | { 591 | "cell_type": "markdown", 592 | "source": [ 593 | "## **Evaluation in Athina AI**\n", 594 | "\n", 595 | "We will use **Context Relevancy** eval here. It Measures the relevancy of the retrieved context, calculated based on both the query and contexts. To learn more about this. 
Please refer to our [documentation](https://docs.athina.ai/api-reference/evals/preset-evals/overview) for further details" 596 | ], 597 | "metadata": { 598 | "id": "JZFlEciE4yFO" 599 | } 600 | }, 601 | { 602 | "cell_type": "code", 603 | "source": [ 604 | "# set api keys for Athina evals\n", 605 | "from athina.keys import AthinaApiKey, OpenAiApiKey\n", 606 | "OpenAiApiKey.set_key(os.getenv('OPENAI_API_KEY'))\n", 607 | "AthinaApiKey.set_key(os.getenv('ATHINA_API_KEY'))" 608 | ], 609 | "metadata": { 610 | "id": "vCwCaKjl4ywl" 611 | }, 612 | "execution_count": null, 613 | "outputs": [] 614 | }, 615 | { 616 | "cell_type": "code", 617 | "source": [ 618 | "# load dataset\n", 619 | "from athina.loaders import Loader\n", 620 | "dataset = Loader().load_dict(df_dict)" 621 | ], 622 | "metadata": { 623 | "id": "ivAlmBPc42LB" 624 | }, 625 | "execution_count": null, 626 | "outputs": [] 627 | }, 628 | { 629 | "cell_type": "code", 630 | "source": [ 631 | "# evaluate\n", 632 | "from athina.evals import RagasContextRelevancy\n", 633 | "RagasContextRelevancy(model=\"gpt-4o\").run_batch(data=dataset).to_df()" 634 | ], 635 | "metadata": { 636 | "colab": { 637 | "base_uri": "https://localhost:8080/", 638 | "height": 480 639 | }, 640 | "id": "l3l78eoF43y1", 641 | "outputId": "11d88637-b234-425a-8f94-279f7bd9c29f" 642 | }, 643 | "execution_count": null, 644 | "outputs": [ 645 | { 646 | "output_type": "stream", 647 | "name": "stdout", 648 | "text": [ 649 | "evaluating with [context_relevancy]\n" 650 | ] 651 | }, 652 | { 653 | "output_type": "stream", 654 | "name": "stderr", 655 | "text": [ 656 | "100%|██████████| 1/1 [00:01<00:00, 1.42s/it]\n" 657 | ] 658 | }, 659 | { 660 | "output_type": "stream", 661 | "name": "stdout", 662 | "text": [ 663 | "You can view your dataset at: https://app.athina.ai/develop/76c73e9b-7e13-4e2e-9cde-37565deefa56\n" 664 | ] 665 | }, 666 | { 667 | "output_type": "execute_result", 668 | "data": { 669 | "text/plain": [ 670 | " query \\\n", 671 | "0 what are points on a mortgage \n", 672 | "\n", 673 | " context \\\n", 674 | "0 [Discount points, also called mortgage points or simply points, are a form of pre-paid interest available in the United States when arranging a mortgage. One point equals one percent of the loan amount. Borrowers can offer to pay a lender points as a method to reduce the interest rate on the loan., points is the concept of the 'no closing cost loan', in which the consumer accepts a higher interest rate in return for the lender paying the loan's closing costs up front. In some cases a purchas... \n", 675 | "\n", 676 | " response \\\n", 677 | "0 Points on a mortgage are a form of pre-paid interest that a borrower can offer to pay a lender in order to reduce the interest rate on the loan. One point equals one percent of the loan amount. By paying points, a borrower can obtain a lower monthly payment in exchange for this. Additionally, points can also be used to reduce the monthly payment to qualify for a loan. It is important to note that discount points may be different from origination fee, mortgage arrangement fee, or broker fee. 
\n", 678 | "\n", 679 | " expected_response display_name failed \\\n", 680 | "0 None Ragas Context Relevancy None \n", 681 | "\n", 682 | " grade_reason \\\n", 683 | "0 This metric is calulated by dividing the number of sentences in context that are relevant for answering the given query by the total number of sentences in the retrieved context \n", 684 | "\n", 685 | " runtime model ragas_context_relevancy \n", 686 | "0 1679 gpt-4o 0.454545 " 687 | ], 688 | "text/html": [ 689 | "\n", 690 | "
\n", 691 | "
\n", 692 | "\n", 705 | "\n", 706 | " \n", 707 | " \n", 708 | " \n", 709 | " \n", 710 | " \n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | " \n", 718 | " \n", 719 | " \n", 720 | " \n", 721 | " \n", 722 | " \n", 723 | " \n", 724 | " \n", 725 | " \n", 726 | " \n", 727 | " \n", 728 | " \n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | "
querycontextresponseexpected_responsedisplay_namefailedgrade_reasonruntimemodelragas_context_relevancy
0what are points on a mortgage[Discount points, also called mortgage points or simply points, are a form of pre-paid interest available in the United States when arranging a mortgage. One point equals one percent of the loan amount. Borrowers can offer to pay a lender points as a method to reduce the interest rate on the loan., points is the concept of the 'no closing cost loan', in which the consumer accepts a higher interest rate in return for the lender paying the loan's closing costs up front. In some cases a purchas...Points on a mortgage are a form of pre-paid interest that a borrower can offer to pay a lender in order to reduce the interest rate on the loan. One point equals one percent of the loan amount. By paying points, a borrower can obtain a lower monthly payment in exchange for this. Additionally, points can also be used to reduce the monthly payment to qualify for a loan. It is important to note that discount points may be different from origination fee, mortgage arrangement fee, or broker fee.NoneRagas Context RelevancyNoneThis metric is calulated by dividing the number of sentences in context that are relevant for answering the given query by the total number of sentences in the retrieved context1679gpt-4o0.454545
\n", 737 | "
\n", 738 | "
\n", 739 | "\n", 740 | "
\n", 741 | " \n", 749 | "\n", 750 | " \n", 790 | "\n", 791 | " \n", 815 | "
\n", 816 | "\n", 817 | "\n", 818 | "
\n", 819 | "
\n" 820 | ], 821 | "application/vnd.google.colaboratory.intrinsic+json": { 822 | "type": "dataframe", 823 | "repr_error": "Out of range float values are not JSON compliant: nan" 824 | } 825 | }, 826 | "metadata": {}, 827 | "execution_count": 30 828 | } 829 | ] 830 | } 831 | ] 832 | } -------------------------------------------------------------------------------- /advanced_rag_techniques/naive_rag.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [] 7 | }, 8 | "kernelspec": { 9 | "name": "python3", 10 | "display_name": "Python 3" 11 | }, 12 | "language_info": { 13 | "name": "python" 14 | } 15 | }, 16 | "cells": [ 17 | { 18 | "cell_type": "markdown", 19 | "source": [ 20 | "# **Naive RAG**\n", 21 | "The Naive RAG is the simplest technique in the RAG ecosystem, providing a straightforward approach to combining retrieved data with LLM models for efficient user responses.\n", 22 | "\n", 23 | "Research Paper: [RAG](https://arxiv.org/pdf/2005.11401)" 24 | ], 25 | "metadata": { 26 | "id": "h3Crc_qvjnc3" 27 | } 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "## **Initial Setup**" 33 | ], 34 | "metadata": { 35 | "id": "ZQuZ2ARQks63" 36 | } 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": null, 41 | "metadata": { 42 | "id": "XktcxUHHi0FF" 43 | }, 44 | "outputs": [], 45 | "source": [ 46 | "! pip install --q athina" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "source": [ 52 | "import os\n", 53 | "from google.colab import userdata\n", 54 | "os.environ[\"OPENAI_API_KEY\"] = userdata.get('OPENAI_API_KEY')\n", 55 | "os.environ['ATHINA_API_KEY'] = userdata.get('ATHINA_API_KEY')\n", 56 | "os.environ['PINECONE_API_KEY'] = userdata.get('PINECONE_API_KEY')\n" 57 | ], 58 | "metadata": { 59 | "id": "nAzEvcXulGYW" 60 | }, 61 | "execution_count": null, 62 | "outputs": [] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "source": [ 67 | "## **Indexing**" 68 | ], 69 | "metadata": { 70 | "id": "BT6jhNgulNV3" 71 | } 72 | }, 73 | { 74 | "cell_type": "code", 75 | "source": [ 76 | "# load embedding model\n", 77 | "from langchain_openai import OpenAIEmbeddings\n", 78 | "embeddings = OpenAIEmbeddings()" 79 | ], 80 | "metadata": { 81 | "id": "DmXqyqQ5lMrk" 82 | }, 83 | "execution_count": null, 84 | "outputs": [] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "source": [ 89 | "# load data\n", 90 | "from langchain.document_loaders import CSVLoader\n", 91 | "loader = CSVLoader(\"./context.csv\")\n", 92 | "documents = loader.load()" 93 | ], 94 | "metadata": { 95 | "id": "lv9mslL9lWdJ" 96 | }, 97 | "execution_count": null, 98 | "outputs": [] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "source": [ 103 | "# split documents\n", 104 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", 105 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n", 106 | "documents = text_splitter.split_documents(documents)" 107 | ], 108 | "metadata": { 109 | "id": "dzQlFFoWlZj1" 110 | }, 111 | "execution_count": null, 112 | "outputs": [] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "source": [ 117 | "## **Pinecone Vector Database**" 118 | ], 119 | "metadata": { 120 | "id": "Wk9-1mcEsPa8" 121 | } 122 | }, 123 | { 124 | "cell_type": "code", 125 | "source": [ 126 | "# initialize pinecone client\n", 127 | "from pinecone import Pinecone as PineconeClient, ServerlessSpec\n", 128 | "pc = PineconeClient(\n", 129 | " 
api_key=os.environ.get(\"PINECONE_API_KEY\"),\n", 130 | ")" 131 | ], 132 | "metadata": { 133 | "id": "oCKff2pOritW" 134 | }, 135 | "execution_count": null, 136 | "outputs": [] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "source": [ 141 | "# create index\n", 142 | "pc.create_index(\n", 143 | " name='my-index',\n", 144 | " dimension=1536,\n", 145 | " metric=\"cosine\",\n", 146 | " spec=ServerlessSpec(\n", 147 | " cloud=\"aws\",\n", 148 | " region=\"us-east-1\"\n", 149 | " )\n", 150 | " )" 151 | ], 152 | "metadata": { 153 | "id": "2do7kkOrsZvR" 154 | }, 155 | "execution_count": null, 156 | "outputs": [] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "source": [ 161 | "# load index\n", 162 | "index_name = \"my-index\"" 163 | ], 164 | "metadata": { 165 | "id": "AzqQvvRpsf4l" 166 | }, 167 | "execution_count": null, 168 | "outputs": [] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "source": [ 173 | "# create vectorstore\n", 174 | "from langchain.vectorstores import Pinecone\n", 175 | "vectorstore = Pinecone.from_documents(\n", 176 | " documents=documents,\n", 177 | " embedding=embeddings,\n", 178 | " index_name=index_name\n", 179 | ")" 180 | ], 181 | "metadata": { 182 | "id": "G7YSkjF5slji" 183 | }, 184 | "execution_count": null, 185 | "outputs": [] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "source": [ 190 | "## **FAISS (Optional)**" 191 | ], 192 | "metadata": { 193 | "id": "En1bq6B0srDA" 194 | } 195 | }, 196 | { 197 | "cell_type": "code", 198 | "source": [ 199 | "# # optional vectorstore\n", 200 | "# !pip install --q faiss-gpu\n", 201 | "\n", 202 | "# # create vectorstore\n", 203 | "# from langchain.vectorstores import FAISS\n", 204 | "# vectorstore = FAISS.from_documents(documents, embeddings)" 205 | ], 206 | "metadata": { 207 | "id": "l1SQyeIrlcDk" 208 | }, 209 | "execution_count": null, 210 | "outputs": [] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "source": [ 215 | "## **Retriever**" 216 | ], 217 | "metadata": { 218 | "id": "yHq5hAhZls9s" 219 | } 220 | }, 221 | { 222 | "cell_type": "code", 223 | "source": [ 224 | "# create retriever\n", 225 | "retriever = vectorstore.as_retriever()" 226 | ], 227 | "metadata": { 228 | "id": "wazZm3Hzltj0" 229 | }, 230 | "execution_count": null, 231 | "outputs": [] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "source": [ 236 | "## **RAG Chain**" 237 | ], 238 | "metadata": { 239 | "id": "nwDXLi4elyxP" 240 | } 241 | }, 242 | { 243 | "cell_type": "code", 244 | "source": [ 245 | "# load llm\n", 246 | "from langchain_openai import ChatOpenAI\n", 247 | "llm = ChatOpenAI()" 248 | ], 249 | "metadata": { 250 | "id": "KbFmmqZBlzCN" 251 | }, 252 | "execution_count": null, 253 | "outputs": [] 254 | }, 255 | { 256 | "cell_type": "code", 257 | "source": [ 258 | "# create document chain\n", 259 | "from langchain.prompts import ChatPromptTemplate\n", 260 | "from langchain.schema.runnable import RunnablePassthrough\n", 261 | "from langchain.schema.output_parser import StrOutputParser\n", 262 | "\n", 263 | "template = \"\"\"\"\n", 264 | "You are a helpful assistant that answers questions based on the provided context.\n", 265 | "Use the provided context to answer the question.\n", 266 | "Question: {input}\n", 267 | "Context: {context}\n", 268 | "Answer:\n", 269 | "\"\"\"\n", 270 | "prompt = ChatPromptTemplate.from_template(template)\n", 271 | "\n", 272 | "# Setup RAG pipeline\n", 273 | "rag_chain = (\n", 274 | " {\"context\": retriever, \"input\": RunnablePassthrough()}\n", 275 | " | prompt\n", 276 | " | llm\n", 277 | " | 
StrOutputParser()\n", 278 | ")" 279 | ], 280 | "metadata": { 281 | "id": "XTbjjl9Ml4JT" 282 | }, 283 | "execution_count": null, 284 | "outputs": [] 285 | }, 286 | { 287 | "cell_type": "code", 288 | "source": [ 289 | "# response\n", 290 | "response = rag_chain.invoke(\"when did ww1 end?\")\n", 291 | "response" 292 | ], 293 | "metadata": { 294 | "colab": { 295 | "base_uri": "https://localhost:8080/", 296 | "height": 35 297 | }, 298 | "id": "KCmVRhwkl-L4", 299 | "outputId": "013b60e8-79f2-4a34-a6f9-025d22a0040e" 300 | }, 301 | "execution_count": null, 302 | "outputs": [ 303 | { 304 | "output_type": "execute_result", 305 | "data": { 306 | "text/plain": [ 307 | "'World War I ended on November 11, 1918.'" 308 | ], 309 | "application/vnd.google.colaboratory.intrinsic+json": { 310 | "type": "string" 311 | } 312 | }, 313 | "metadata": {}, 314 | "execution_count": 13 315 | } 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "source": [ 321 | "## **Preparing Data for Evaluation**" 322 | ], 323 | "metadata": { 324 | "id": "ydmLJTKcmN-L" 325 | } 326 | }, 327 | { 328 | "cell_type": "code", 329 | "source": [ 330 | "# create dataset\n", 331 | "question = [\"when did ww1 end?\"]\n", 332 | "response = []\n", 333 | "contexts = []\n", 334 | "\n", 335 | "# Inference\n", 336 | "for query in question:\n", 337 | " response.append(rag_chain.invoke(query))\n", 338 | " contexts.append([docs.page_content for docs in retriever.get_relevant_documents(query)])\n", 339 | "\n", 340 | "# To dict\n", 341 | "data = {\n", 342 | " \"query\": question,\n", 343 | " \"response\": response,\n", 344 | " \"context\": contexts,\n", 345 | "}" 346 | ], 347 | "metadata": { 348 | "id": "8mYI56TXmNMo" 349 | }, 350 | "execution_count": null, 351 | "outputs": [] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "source": [ 356 | "# create dataset\n", 357 | "from datasets import Dataset\n", 358 | "dataset = Dataset.from_dict(data)" 359 | ], 360 | "metadata": { 361 | "id": "fsvXVMDZmlYC" 362 | }, 363 | "execution_count": null, 364 | "outputs": [] 365 | }, 366 | { 367 | "cell_type": "code", 368 | "source": [ 369 | "# create dataframe\n", 370 | "import pandas as pd\n", 371 | "df = pd.DataFrame(dataset)" 372 | ], 373 | "metadata": { 374 | "id": "c6ENh8IImn8X" 375 | }, 376 | "execution_count": null, 377 | "outputs": [] 378 | }, 379 | { 380 | "cell_type": "code", 381 | "source": [ 382 | "df" 383 | ], 384 | "metadata": { 385 | "colab": { 386 | "base_uri": "https://localhost:8080/", 387 | "height": 89 388 | }, 389 | "id": "jbBOytBzmp03", 390 | "outputId": "295f18de-9923-4a49-8c0c-4fa950620a79" 391 | }, 392 | "execution_count": null, 393 | "outputs": [ 394 | { 395 | "output_type": "execute_result", 396 | "data": { 397 | "text/plain": [ 398 | " query response \\\n", 399 | "0 when did ww1 end? World War I ended on November 11, 1918. \n", 400 | "\n", 401 | " context \n", 402 | "0 [context: ['World War I or the First World War... " 403 | ], 404 | "text/html": [ 405 | "\n", 406 | "
\n", 407 | "
\n", 408 | "\n", 421 | "\n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | "
queryresponsecontext
0when did ww1 end?World War I ended on November 11, 1918.[context: ['World War I or the First World War...
\n", 439 | "
\n", 440 | "
\n", 441 | "\n", 442 | "
\n", 443 | " \n", 451 | "\n", 452 | " \n", 492 | "\n", 493 | " \n", 517 | "
\n", 518 | "\n", 519 | "\n", 520 | "
\n", 521 | " \n", 552 | " \n", 561 | " \n", 573 | "
\n", 574 | "\n", 575 | "
\n", 576 | "
\n" 577 | ], 578 | "application/vnd.google.colaboratory.intrinsic+json": { 579 | "type": "dataframe", 580 | "variable_name": "df", 581 | "summary": "{\n \"name\": \"df\",\n \"rows\": 1,\n \"fields\": [\n {\n \"column\": \"query\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"when did ww1 end?\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"response\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"World War I ended on November 11, 1918.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}" 582 | } 583 | }, 584 | "metadata": {}, 585 | "execution_count": 17 586 | } 587 | ] 588 | }, 589 | { 590 | "cell_type": "code", 591 | "source": [ 592 | "# Convert to dictionary\n", 593 | "df_dict = df.to_dict(orient='records')\n", 594 | "\n", 595 | "# Convert context to list\n", 596 | "for record in df_dict:\n", 597 | " if not isinstance(record.get('context'), list):\n", 598 | " if record.get('context') is None:\n", 599 | " record['context'] = []\n", 600 | " else:\n", 601 | " record['context'] = [record['context']]" 602 | ], 603 | "metadata": { 604 | "id": "qRQUgNeemx1S" 605 | }, 606 | "execution_count": null, 607 | "outputs": [] 608 | }, 609 | { 610 | "cell_type": "markdown", 611 | "source": [ 612 | "## **Evaluation in Athina AI**\n", 613 | "\n", 614 | "We will use **Does Response Answer Query** eval here. It Checks if the response answer the user's query. To learn more about this. Please refer to our [documentation](https://docs.athina.ai/api-reference/evals/preset-evals/overview) for further details." 615 | ], 616 | "metadata": { 617 | "id": "-r2bDl3-m0UU" 618 | } 619 | }, 620 | { 621 | "cell_type": "code", 622 | "source": [ 623 | "# set api keys for Athina evals\n", 624 | "from athina.keys import AthinaApiKey, OpenAiApiKey\n", 625 | "OpenAiApiKey.set_key(os.getenv('OPENAI_API_KEY'))\n", 626 | "AthinaApiKey.set_key(os.getenv('ATHINA_API_KEY'))" 627 | ], 628 | "metadata": { 629 | "id": "rZJLIyV0m1As" 630 | }, 631 | "execution_count": null, 632 | "outputs": [] 633 | }, 634 | { 635 | "cell_type": "code", 636 | "source": [ 637 | "# load dataset\n", 638 | "from athina.loaders import Loader\n", 639 | "dataset = Loader().load_dict(df_dict)" 640 | ], 641 | "metadata": { 642 | "id": "8hKmTTy9m4mh" 643 | }, 644 | "execution_count": null, 645 | "outputs": [] 646 | }, 647 | { 648 | "cell_type": "code", 649 | "source": [ 650 | "# evaluate\n", 651 | "from athina.evals import DoesResponseAnswerQuery\n", 652 | "DoesResponseAnswerQuery(model=\"gpt-4o\").run_batch(data=dataset).to_df()" 653 | ], 654 | "metadata": { 655 | "colab": { 656 | "base_uri": "https://localhost:8080/", 657 | "height": 254 658 | }, 659 | "id": "_c6tIOXXm6rC", 660 | "outputId": "9d4af1ae-6a35-4093-9b14-50c0ee13863b" 661 | }, 662 | "execution_count": null, 663 | "outputs": [ 664 | { 665 | "output_type": "stream", 666 | "name": "stdout", 667 | "text": [ 668 | "You can view your dataset at: https://app.athina.ai/develop/80872384-24ac-4ad9-824d-74dc02cb7cca\n" 669 | ] 670 | }, 671 | { 672 | "output_type": "execute_result", 673 | "data": { 674 | "text/plain": [ 675 | " query \\\n", 676 | "0 when did ww1 end? 
\n", 677 | "\n", 678 | " context \\\n", 679 | "0 [context: ['World War I or the First World War (28 July 1914 – 11 November 1918), often abbreviated as WWI, was one of the deadliest global conflicts in history. It was fought between two coalitions, the Allies and the Central Powers. Fighting occurred throughout Europe, the Middle East, Africa, the Pacific, and parts of Asia. An estimated 9 million soldiers were killed in combat, plus another 23 million wounded, while 5 million civilians died as a result of military action, hunger, and dise... \n", 680 | "\n", 681 | " response expected_response \\\n", 682 | "0 World War I ended on November 11, 1918. None \n", 683 | "\n", 684 | " display_name failed \\\n", 685 | "0 Does Response Answer Query False \n", 686 | "\n", 687 | " grade_reason \\\n", 688 | "0 The response directly answers the user's query by providing the specific date on which World War I ended, which is November 11, 1918. This sufficiently covers all aspects of the user's query. \n", 689 | "\n", 690 | " runtime model passed \n", 691 | "0 787 gpt-4o 1.0 " 692 | ], 693 | "text/html": [ 694 | "\n", 695 | "
\n", 696 | "
\n", 697 | "\n", 710 | "\n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | " \n", 718 | " \n", 719 | " \n", 720 | " \n", 721 | " \n", 722 | " \n", 723 | " \n", 724 | " \n", 725 | " \n", 726 | " \n", 727 | " \n", 728 | " \n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | " \n", 737 | " \n", 738 | " \n", 739 | " \n", 740 | " \n", 741 | "
querycontextresponseexpected_responsedisplay_namefailedgrade_reasonruntimemodelpassed
0when did ww1 end?[context: ['World War I or the First World War (28 July 1914 – 11 November 1918), often abbreviated as WWI, was one of the deadliest global conflicts in history. It was fought between two coalitions, the Allies and the Central Powers. Fighting occurred throughout Europe, the Middle East, Africa, the Pacific, and parts of Asia. An estimated 9 million soldiers were killed in combat, plus another 23 million wounded, while 5 million civilians died as a result of military action, hunger, and dise...World War I ended on November 11, 1918.NoneDoes Response Answer QueryFalseThe response directly answers the user's query by providing the specific date on which World War I ended, which is November 11, 1918. This sufficiently covers all aspects of the user's query.787gpt-4o1.0
\n", 742 | "
\n", 743 | "
\n", 744 | "\n", 745 | "
\n", 746 | " \n", 754 | "\n", 755 | " \n", 795 | "\n", 796 | " \n", 820 | "
\n", 821 | "\n", 822 | "\n", 823 | "
\n", 824 | "
\n" 825 | ], 826 | "application/vnd.google.colaboratory.intrinsic+json": { 827 | "type": "dataframe", 828 | "repr_error": "Out of range float values are not JSON compliant: nan" 829 | } 830 | }, 831 | "metadata": {}, 832 | "execution_count": 21 833 | } 834 | ] 835 | } 836 | ] 837 | } -------------------------------------------------------------------------------- /advanced_rag_techniques/parent_document_retriever.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [] 7 | }, 8 | "kernelspec": { 9 | "name": "python3", 10 | "display_name": "Python 3" 11 | }, 12 | "language_info": { 13 | "name": "python" 14 | } 15 | }, 16 | "cells": [ 17 | { 18 | "cell_type": "markdown", 19 | "source": [ 20 | "# **Parent Document Retriver**\n", 21 | "\n", 22 | "Parent Document retriever is a technique where large documents are split into smaller pieces, called \"child chunks.\" These chunks are stored in a way that lets the system find and compare specific parts of a document with a user’s query. The large document, or \"parent,\" is still kept but is only retrieved if one of its child chunks is relevant to the query.\n", 23 | "\n", 24 | "Reference: [Parent Document Retriver](https://python.langchain.com/docs/how_to/parent_document_retriever/)" 25 | ], 26 | "metadata": { 27 | "id": "CgZgNXqy1Rka" 28 | } 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "source": [ 33 | "## **Initial Setup**" 34 | ], 35 | "metadata": { 36 | "id": "eAvUVSNs09wu" 37 | } 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": { 43 | "id": "o_ghPLog0cH4" 44 | }, 45 | "outputs": [], 46 | "source": [ 47 | "! pip install --q athina chromadb" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "source": [ 53 | "import os\n", 54 | "from google.colab import userdata\n", 55 | "os.environ[\"OPENAI_API_KEY\"] = userdata.get('OPENAI_API_KEY')\n", 56 | "os.environ['ATHINA_API_KEY'] = userdata.get('ATHINA_API_KEY')" 57 | ], 58 | "metadata": { 59 | "id": "qqJOrbvt1RAd" 60 | }, 61 | "execution_count": null, 62 | "outputs": [] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "source": [ 67 | "## **Indexing**" 68 | ], 69 | "metadata": { 70 | "id": "RK8GP6MJ1mMo" 71 | } 72 | }, 73 | { 74 | "cell_type": "code", 75 | "source": [ 76 | "# load embedding model\n", 77 | "from langchain_openai import OpenAIEmbeddings\n", 78 | "embeddings = OpenAIEmbeddings()" 79 | ], 80 | "metadata": { 81 | "id": "zjR7SDSHvC1I" 82 | }, 83 | "execution_count": null, 84 | "outputs": [] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "source": [ 89 | "# load data\n", 90 | "from langchain.document_loaders import CSVLoader\n", 91 | "loader = CSVLoader(\"./context.csv\")\n", 92 | "documents = loader.load()" 93 | ], 94 | "metadata": { 95 | "id": "SSQFfZPwvJbi" 96 | }, 97 | "execution_count": null, 98 | "outputs": [] 99 | }, 100 | { 101 | "cell_type": "markdown", 102 | "source": [ 103 | "### **Parent Child Text Spliting**" 104 | ], 105 | "metadata": { 106 | "id": "rT1Qpkr6vMUk" 107 | } 108 | }, 109 | { 110 | "cell_type": "code", 111 | "source": [ 112 | "# split pages content\n", 113 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", 114 | "\n", 115 | "# create the parent documents - The big chunks\n", 116 | "parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)\n", 117 | "\n", 118 | "# create the child documents - The small chunks\n", 119 | "child_splitter = 
RecursiveCharacterTextSplitter(chunk_size=400)\n", 120 | "\n", 121 | "# The storage layer for the parent chunks\n", 122 | "from langchain.storage import InMemoryStore\n", 123 | "store = InMemoryStore()" 124 | ], 125 | "metadata": { 126 | "id": "HE2h73vovL4H" 127 | }, 128 | "execution_count": null, 129 | "outputs": [] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "source": [ 134 | "from langchain.vectorstores import Chroma\n", 135 | "vectorstore = Chroma(collection_name=\"split_parents\", embedding_function=embeddings)" 136 | ], 137 | "metadata": { 138 | "id": "rUu68fgQvZpw" 139 | }, 140 | "execution_count": null, 141 | "outputs": [] 142 | }, 143 | { 144 | "cell_type": "markdown", 145 | "source": [ 146 | "## **Retriever**" 147 | ], 148 | "metadata": { 149 | "id": "kl6uKAG75XKU" 150 | } 151 | }, 152 | { 153 | "cell_type": "code", 154 | "source": [ 155 | "# create retriever\n", 156 | "from langchain.retrievers import ParentDocumentRetriever\n", 157 | "retriever = ParentDocumentRetriever(\n", 158 | " vectorstore=vectorstore,\n", 159 | " docstore=store,\n", 160 | " child_splitter=child_splitter,\n", 161 | " parent_splitter=parent_splitter,\n", 162 | ")" 163 | ], 164 | "metadata": { 165 | "id": "od7NwTTBveJN" 166 | }, 167 | "execution_count": null, 168 | "outputs": [] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "source": [ 173 | "# add documents to vectorstore\n", 174 | "retriever.add_documents(documents)" 175 | ], 176 | "metadata": { 177 | "id": "i0r_3ynrvg_0" 178 | }, 179 | "execution_count": null, 180 | "outputs": [] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "source": [ 185 | "## **RAG Chain**" 186 | ], 187 | "metadata": { 188 | "id": "x3s1p56DvnWO" 189 | } 190 | }, 191 | { 192 | "cell_type": "code", 193 | "source": [ 194 | "# create llm\n", 195 | "from langchain_openai import ChatOpenAI\n", 196 | "llm = ChatOpenAI()" 197 | ], 198 | "metadata": { 199 | "id": "PmT_vPA4vuN6" 200 | }, 201 | "execution_count": null, 202 | "outputs": [] 203 | }, 204 | { 205 | "cell_type": "code", 206 | "source": [ 207 | "# create document chain\n", 208 | "from langchain.prompts import ChatPromptTemplate\n", 209 | "from langchain.schema.runnable import RunnablePassthrough\n", 210 | "from langchain.schema.output_parser import StrOutputParser\n", 211 | "\n", 212 | "template = \"\"\"\"\n", 213 | "You are a helpful assistant that answers questions based on the following context\n", 214 | "Context: {context}\n", 215 | "\n", 216 | "Question: {input}\n", 217 | "\n", 218 | "Answer:\n", 219 | "\n", 220 | "\"\"\"\n", 221 | "prompt = ChatPromptTemplate.from_template(template)\n", 222 | "\n", 223 | "# Setup RAG pipeline\n", 224 | "rag_chain = (\n", 225 | " {\"context\": retriever, \"input\": RunnablePassthrough()}\n", 226 | " | prompt\n", 227 | " | llm\n", 228 | " | StrOutputParser()\n", 229 | ")" 230 | ], 231 | "metadata": { 232 | "id": "EGXrA8PNvnFB" 233 | }, 234 | "execution_count": null, 235 | "outputs": [] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "source": [ 240 | "# response\n", 241 | "response = rag_chain.invoke(\"who played the lead roles in the movie leaving las vegas\")\n", 242 | "response" 243 | ], 244 | "metadata": { 245 | "colab": { 246 | "base_uri": "https://localhost:8080/", 247 | "height": 53 248 | }, 249 | "id": "kKdSj2Vdvz7c", 250 | "outputId": "f9703edb-4b91-40d7-eabe-728d5229bd2a" 251 | }, 252 | "execution_count": null, 253 | "outputs": [ 254 | { 255 | "output_type": "execute_result", 256 | "data": { 257 | "text/plain": [ 258 | "'Nicolas Cage played the role of Ben Sanderson, 
the alcoholic screenwriter, and Elisabeth Shue played the role of Sera, the sex worker, in the movie \"Leaving Las Vegas.\"'" 259 | ], 260 | "application/vnd.google.colaboratory.intrinsic+json": { 261 | "type": "string" 262 | } 263 | }, 264 | "metadata": {}, 265 | "execution_count": 12 266 | } 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "source": [ 272 | "## **Preparing Data for Evaluation**" 273 | ], 274 | "metadata": { 275 | "id": "dXvs0qf7wFLX" 276 | } 277 | }, 278 | { 279 | "cell_type": "code", 280 | "source": [ 281 | "question = [\"who played the lead roles in the movie leaving las vegas\"]\n", 282 | "response = []\n", 283 | "contexts = []\n", 284 | "ground_truth = [\"Nicolas Cage stars as a suicidal alcoholic who has ended his personal and professional life to drink himself to death in Las Vegas .\"]\n", 285 | "# Inference\n", 286 | "for query in question:\n", 287 | " response.append(rag_chain.invoke(query))\n", 288 | " contexts.append([docs.page_content for docs in retriever.get_relevant_documents(query)])\n", 289 | "\n", 290 | "# To dict\n", 291 | "data = {\n", 292 | " \"query\": question,\n", 293 | " \"response\": response,\n", 294 | " \"context\": contexts,\n", 295 | " \"expected_response\": ground_truth\n", 296 | "}" 297 | ], 298 | "metadata": { 299 | "id": "yUDUy3HFwJ4A" 300 | }, 301 | "execution_count": null, 302 | "outputs": [] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "source": [ 307 | "# create dataset\n", 308 | "from datasets import Dataset\n", 309 | "dataset = Dataset.from_dict(data)" 310 | ], 311 | "metadata": { 312 | "id": "Gc5A1Qtywxi5" 313 | }, 314 | "execution_count": null, 315 | "outputs": [] 316 | }, 317 | { 318 | "cell_type": "code", 319 | "source": [ 320 | "# create dataframe\n", 321 | "import pandas as pd\n", 322 | "df = pd.DataFrame(dataset)" 323 | ], 324 | "metadata": { 325 | "id": "fa9mopKbw0Vc" 326 | }, 327 | "execution_count": null, 328 | "outputs": [] 329 | }, 330 | { 331 | "cell_type": "code", 332 | "source": [ 333 | "df" 334 | ], 335 | "metadata": { 336 | "colab": { 337 | "base_uri": "https://localhost:8080/", 338 | "height": 150 339 | }, 340 | "id": "G1L2EKpow2GN", 341 | "outputId": "ccff1986-f268-4e43-8d00-d7f152afe229" 342 | }, 343 | "execution_count": null, 344 | "outputs": [ 345 | { 346 | "output_type": "execute_result", 347 | "data": { 348 | "text/plain": [ 349 | " query \\\n", 350 | "0 who played the lead roles in the movie leaving las vegas \n", 351 | "\n", 352 | " response \\\n", 353 | "0 Nicolas Cage and Elisabeth Shue played the lead roles in the movie Leaving Las Vegas. \n", 354 | "\n", 355 | " context \\\n", 356 | "0 ['Leaving Las Vegas is a 1995 American drama film written and directed by Mike Figgis and based on the semi-autobiographical 1990 novel of the same name by John O\\'Brien. Nicolas Cage stars as a suicidal alcoholic in Los Angeles who, having lost his family and been recently fired, has decided to move to Las Vegas and drink himself to death. He loads a supply of liquor and beer into his BMW and gets drunk as he drives from Los Angeles to Las Vegas. Once there, he develops a romantic relations... \n", 357 | "\n", 358 | " expected_response \n", 359 | "0 Nicolas Cage stars as a suicidal alcoholic who has ended his personal and professional life to drink himself to death in Las Vegas . " 360 | ], 361 | "text/html": [ 362 | "\n", 363 | "
\n", 364 | "
\n", 365 | "\n", 378 | "\n", 379 | " \n", 380 | " \n", 381 | " \n", 382 | " \n", 383 | " \n", 384 | " \n", 385 | " \n", 386 | " \n", 387 | " \n", 388 | " \n", 389 | " \n", 390 | " \n", 391 | " \n", 392 | " \n", 393 | " \n", 394 | " \n", 395 | " \n", 396 | " \n", 397 | "
queryresponsecontextexpected_response
0who played the lead roles in the movie leaving las vegasNicolas Cage and Elisabeth Shue played the lead roles in the movie Leaving Las Vegas.['Leaving Las Vegas is a 1995 American drama film written and directed by Mike Figgis and based on the semi-autobiographical 1990 novel of the same name by John O\\'Brien. Nicolas Cage stars as a suicidal alcoholic in Los Angeles who, having lost his family and been recently fired, has decided to move to Las Vegas and drink himself to death. He loads a supply of liquor and beer into his BMW and gets drunk as he drives from Los Angeles to Las Vegas. Once there, he develops a romantic relations...Nicolas Cage stars as a suicidal alcoholic who has ended his personal and professional life to drink himself to death in Las Vegas .
\n", 398 | "
\n", 399 | "
\n", 400 | "\n", 401 | "
\n", 402 | " \n", 410 | "\n", 411 | " \n", 451 | "\n", 452 | " \n", 476 | "
\n", 477 | "\n", 478 | "\n", 479 | "
\n", 480 | " \n", 511 | " \n", 520 | " \n", 532 | "
\n", 533 | "\n", 534 | "
\n", 535 | "
\n" 536 | ], 537 | "application/vnd.google.colaboratory.intrinsic+json": { 538 | "type": "dataframe", 539 | "variable_name": "df", 540 | "summary": "{\n \"name\": \"df\",\n \"rows\": 1,\n \"fields\": [\n {\n \"column\": \"query\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"who played the lead roles in the movie leaving las vegas\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"response\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"Nicolas Cage and Elisabeth Shue played the lead roles in the movie Leaving Las Vegas.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"expected_response\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"Nicolas Cage stars as a suicidal alcoholic who has ended his personal and professional life to drink himself to death in Las Vegas .\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}" 541 | } 542 | }, 543 | "metadata": {}, 544 | "execution_count": 25 545 | } 546 | ] 547 | }, 548 | { 549 | "cell_type": "code", 550 | "source": [ 551 | "# Convert to dictionary\n", 552 | "df_dict = df.to_dict(orient='records')\n", 553 | "\n", 554 | "# Convert context to list\n", 555 | "for record in df_dict:\n", 556 | " if not isinstance(record.get('context'), list):\n", 557 | " if record.get('context') is None:\n", 558 | " record['context'] = []\n", 559 | " else:\n", 560 | " record['context'] = [record['context']]" 561 | ], 562 | "metadata": { 563 | "id": "K8ku0-Mhw4BG" 564 | }, 565 | "execution_count": null, 566 | "outputs": [] 567 | }, 568 | { 569 | "cell_type": "markdown", 570 | "source": [ 571 | "## **Evaluation in Athina AI**\n", 572 | "\n", 573 | "We will use **Context Recall** eval here. It Measures the extent to which the retrieved context aligns with the expected response. 
Please refer to our [documentation](https://docs.athina.ai/api-reference/evals/preset-evals/overview) for further details" 574 | ], 575 | "metadata": { 576 | "id": "a9dgLw0FxBcO" 577 | } 578 | }, 579 | { 580 | "cell_type": "code", 581 | "source": [ 582 | "# set api keys for Athina evals\n", 583 | "from athina.keys import AthinaApiKey, OpenAiApiKey\n", 584 | "OpenAiApiKey.set_key(os.getenv('OPENAI_API_KEY'))\n", 585 | "AthinaApiKey.set_key(os.getenv('ATHINA_API_KEY'))" 586 | ], 587 | "metadata": { 588 | "id": "YeH78_zMxA-w" 589 | }, 590 | "execution_count": null, 591 | "outputs": [] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "source": [ 596 | "# load dataset\n", 597 | "from athina.loaders import Loader\n", 598 | "dataset = Loader().load_dict(df_dict)" 599 | ], 600 | "metadata": { 601 | "id": "zo415QJGxFyU" 602 | }, 603 | "execution_count": null, 604 | "outputs": [] 605 | }, 606 | { 607 | "cell_type": "code", 608 | "source": [ 609 | "# evaluate\n", 610 | "from athina.evals import RagasContextRecall\n", 611 | "RagasContextRecall(model=\"gpt-4o\").run_batch(data=dataset).to_df()" 612 | ], 613 | "metadata": { 614 | "colab": { 615 | "base_uri": "https://localhost:8080/", 616 | "height": 375 617 | }, 618 | "id": "dBXSUZGOxIer", 619 | "outputId": "a44496e4-e3a0-4068-dcce-64a6128136f6" 620 | }, 621 | "execution_count": null, 622 | "outputs": [ 623 | { 624 | "output_type": "stream", 625 | "name": "stdout", 626 | "text": [ 627 | "evaluating with [context_recall]\n" 628 | ] 629 | }, 630 | { 631 | "output_type": "stream", 632 | "name": "stderr", 633 | "text": [ 634 | "100%|██████████| 1/1 [00:01<00:00, 1.49s/it]\n" 635 | ] 636 | }, 637 | { 638 | "output_type": "stream", 639 | "name": "stdout", 640 | "text": [ 641 | "You can view your dataset at: https://app.athina.ai/develop/3e8a5c23-5ddc-4dd3-ae9a-0790587da1f5\n" 642 | ] 643 | }, 644 | { 645 | "output_type": "execute_result", 646 | "data": { 647 | "text/plain": [ 648 | " query \\\n", 649 | "0 who played the lead roles in the movie leaving las vegas \n", 650 | "\n", 651 | " context \\\n", 652 | "0 ['Leaving Las Vegas is a 1995 American drama film written and directed by Mike Figgis and based on the semi-autobiographical 1990 novel of the same name by John O\\'Brien. Nicolas Cage stars as a suicidal alcoholic in Los Angeles who, having lost his family and been recently fired, has decided to move to Las Vegas and drink himself to death. He loads a supply of liquor and beer into his BMW and gets drunk as he drives from Los Angeles to Las Vegas. Once there, he develops a romantic relations... \n", 653 | "\n", 654 | " response \\\n", 655 | "0 Nicolas Cage and Elisabeth Shue played the lead roles in the movie Leaving Las Vegas. \n", 656 | "\n", 657 | " expected_response \\\n", 658 | "0 Nicolas Cage stars as a suicidal alcoholic who has ended his personal and professional life to drink himself to death in Las Vegas . \n", 659 | "\n", 660 | " display_name failed \\\n", 661 | "0 Ragas Context Recall None \n", 662 | "\n", 663 | " grade_reason \\\n", 664 | "0 Context Recall metric is calculated by dividing the number of sentences in the ground truth that can be attributed to retrieved context by the total number of sentences in the grouund truth \n", 665 | "\n", 666 | " runtime model ragas_context_recall \n", 667 | "0 2316 gpt-4o 1.0 " 668 | ], 669 | "text/html": [ 670 | "\n", 671 | "
\n", 672 | "
\n", 673 | "\n", 686 | "\n", 687 | " \n", 688 | " \n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | " \n", 695 | " \n", 696 | " \n", 697 | " \n", 698 | " \n", 699 | " \n", 700 | " \n", 701 | " \n", 702 | " \n", 703 | " \n", 704 | " \n", 705 | " \n", 706 | " \n", 707 | " \n", 708 | " \n", 709 | " \n", 710 | " \n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | "
querycontextresponseexpected_responsedisplay_namefailedgrade_reasonruntimemodelragas_context_recall
0who played the lead roles in the movie leaving las vegas['Leaving Las Vegas is a 1995 American drama film written and directed by Mike Figgis and based on the semi-autobiographical 1990 novel of the same name by John O\\'Brien. Nicolas Cage stars as a suicidal alcoholic in Los Angeles who, having lost his family and been recently fired, has decided to move to Las Vegas and drink himself to death. He loads a supply of liquor and beer into his BMW and gets drunk as he drives from Los Angeles to Las Vegas. Once there, he develops a romantic relations...Nicolas Cage and Elisabeth Shue played the lead roles in the movie Leaving Las Vegas.Nicolas Cage stars as a suicidal alcoholic who has ended his personal and professional life to drink himself to death in Las Vegas .Ragas Context RecallNoneContext Recall metric is calculated by dividing the number of sentences in the ground truth that can be attributed to retrieved context by the total number of sentences in the grouund truth2316gpt-4o1.0
\n", 718 | "
\n", 719 | "
\n", 720 | "\n", 721 | "
\n", 722 | " \n", 730 | "\n", 731 | " \n", 771 | "\n", 772 | " \n", 796 | "
\n", 797 | "\n", 798 | "\n", 799 | "
\n", 800 | "
\n" 801 | ], 802 | "application/vnd.google.colaboratory.intrinsic+json": { 803 | "type": "dataframe", 804 | "repr_error": "Out of range float values are not JSON compliant: nan" 805 | } 806 | }, 807 | "metadata": {}, 808 | "execution_count": 30 809 | } 810 | ] 811 | } 812 | ] 813 | } -------------------------------------------------------------------------------- /advanced_rag_techniques/rewrite_retrieve_read.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [] 7 | }, 8 | "kernelspec": { 9 | "name": "python3", 10 | "display_name": "Python 3" 11 | }, 12 | "language_info": { 13 | "name": "python" 14 | } 15 | }, 16 | "cells": [ 17 | { 18 | "cell_type": "markdown", 19 | "source": [ 20 | "# **Rewrite-Retrieve-Read (RRR)**\n", 21 | "Rewrite-Retrieve-Read is a three-step framework for tasks that involve retrieval augmentation, such as open-domain question answering. It focuses on improving the quality of retrieved information and generating accurate outputs by refining the input query.\n", 22 | "\n", 23 | "Research Paper: [Rewrite-Retrieve-Read](https://arxiv.org/pdf/2305.14283)" 24 | ], 25 | "metadata": { 26 | "id": "SO0vrYMzBAEt" 27 | } 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "## **Initial Setup**" 33 | ], 34 | "metadata": { 35 | "id": "C9ppe73RTGSc" 36 | } 37 | }, 38 | { 39 | "cell_type": "code", 40 | "source": [ 41 | "! pip install --q athina chromadb" 42 | ], 43 | "metadata": { 44 | "id": "7HE6-T-8TFQo" 45 | }, 46 | "execution_count": null, 47 | "outputs": [] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "source": [ 52 | "# ! pip install --q athina datasets langchain_community langchain-openai langchainhub chromadb langchain" 53 | ], 54 | "metadata": { 55 | "id": "tN9wwP2cCFmW" 56 | }, 57 | "execution_count": null, 58 | "outputs": [] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "source": [ 63 | "import os\n", 64 | "from google.colab import userdata\n", 65 | "os.environ[\"OPENAI_API_KEY\"] = userdata.get('OPENAI_API_KEY')\n", 66 | "os.environ['ATHINA_API_KEY'] = userdata.get('ATHINA_API_KEY')" 67 | ], 68 | "metadata": { 69 | "id": "HD6QjehgA_P5" 70 | }, 71 | "execution_count": null, 72 | "outputs": [] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "source": [ 77 | "## **Indexing**" 78 | ], 79 | "metadata": { 80 | "id": "tP1ZFteNTN7K" 81 | } 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": { 87 | "id": "TS-Th7YUAVWG" 88 | }, 89 | "outputs": [], 90 | "source": [ 91 | "# load embedding model\n", 92 | "from langchain_openai import OpenAIEmbeddings\n", 93 | "embeddings = OpenAIEmbeddings()" 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "source": [ 99 | "# load data\n", 100 | "from langchain.document_loaders import CSVLoader\n", 101 | "loader = CSVLoader(\"./context.csv\")\n", 102 | "documents = loader.load()" 103 | ], 104 | "metadata": { 105 | "id": "EqCSPjoLChwv" 106 | }, 107 | "execution_count": null, 108 | "outputs": [] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "source": [ 113 | "# split documents\n", 114 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", 115 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n", 116 | "documents = text_splitter.split_documents(documents)" 117 | ], 118 | "metadata": { 119 | "id": "sPS52vUdCjqt" 120 | }, 121 | "execution_count": null, 122 | "outputs": [] 123 | }, 124 | { 125 | 
"cell_type": "code", 126 | "source": [ 127 | "# create vectorstore\n", 128 | "from langchain.vectorstores import Chroma\n", 129 | "vectorstore = Chroma.from_documents(documents, embeddings)" 130 | ], 131 | "metadata": { 132 | "id": "3h0Ui6uGClnW" 133 | }, 134 | "execution_count": null, 135 | "outputs": [] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "source": [ 140 | "## **Retriever**" 141 | ], 142 | "metadata": { 143 | "id": "0U0xQG8jT2l9" 144 | } 145 | }, 146 | { 147 | "cell_type": "code", 148 | "source": [ 149 | "# create retriever\n", 150 | "retriever = vectorstore.as_retriever()" 151 | ], 152 | "metadata": { 153 | "id": "L20M2_CCCn49" 154 | }, 155 | "execution_count": null, 156 | "outputs": [] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "source": [ 161 | "## **RAG Chain**" 162 | ], 163 | "metadata": { 164 | "id": "5OyGAQhTT9UK" 165 | } 166 | }, 167 | { 168 | "cell_type": "code", 169 | "source": [ 170 | "# load llm\n", 171 | "from langchain_openai import ChatOpenAI\n", 172 | "llm = ChatOpenAI()" 173 | ], 174 | "metadata": { 175 | "id": "bo5gAxDADFmn" 176 | }, 177 | "execution_count": null, 178 | "outputs": [] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "source": [ 183 | "# create document chain\n", 184 | "from langchain.prompts import ChatPromptTemplate\n", 185 | "from langchain.schema.runnable import RunnablePassthrough\n", 186 | "from langchain.schema.output_parser import StrOutputParser\n", 187 | "\n", 188 | "template = \"\"\"\"\n", 189 | "You are a helpful assistant that answers questions based on the following context.\n", 190 | "If you don't find the answer in the context, just say that you don't know.\n", 191 | "Context: {context}\n", 192 | "\n", 193 | "Question: {input}\n", 194 | "\n", 195 | "Answer:\n", 196 | "\n", 197 | "\"\"\"\n", 198 | "prompt = ChatPromptTemplate.from_template(template)\n", 199 | "\n", 200 | "\n", 201 | "rag_chain = (\n", 202 | " {\"context\": retriever, \"input\": RunnablePassthrough()}\n", 203 | " | prompt\n", 204 | " | llm\n", 205 | " | StrOutputParser()\n", 206 | ")" 207 | ], 208 | "metadata": { 209 | "id": "fXtQRxFSDNec" 210 | }, 211 | "execution_count": null, 212 | "outputs": [] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "source": [ 217 | "## **Simple Query**" 218 | ], 219 | "metadata": { 220 | "id": "xRmlBMZFVg5H" 221 | } 222 | }, 223 | { 224 | "cell_type": "code", 225 | "source": [ 226 | "# define simple query\n", 227 | "simple_query = \"who directed the matrix\"" 228 | ], 229 | "metadata": { 230 | "id": "T3Bst4XBDlZ8" 231 | }, 232 | "execution_count": null, 233 | "outputs": [] 234 | }, 235 | { 236 | "cell_type": "code", 237 | "source": [ 238 | "# response\n", 239 | "response = rag_chain.invoke(simple_query)\n", 240 | "response" 241 | ], 242 | "metadata": { 243 | "colab": { 244 | "base_uri": "https://localhost:8080/", 245 | "height": 35 246 | }, 247 | "id": "0o_gT5PPDgql", 248 | "outputId": "2f17dbfa-7a84-4379-d55f-c6357170ec17" 249 | }, 250 | "execution_count": null, 251 | "outputs": [ 252 | { 253 | "output_type": "execute_result", 254 | "data": { 255 | "text/plain": [ 256 | "'The Matrix was directed by the Wachowskis.'" 257 | ], 258 | "application/vnd.google.colaboratory.intrinsic+json": { 259 | "type": "string" 260 | } 261 | }, 262 | "metadata": {}, 263 | "execution_count": 11 264 | } 265 | ] 266 | }, 267 | { 268 | "cell_type": "markdown", 269 | "source": [ 270 | "## **Distracted Query**" 271 | ], 272 | "metadata": { 273 | "id": "V3JS_nBmVmwT" 274 | } 275 | }, 276 | { 277 | "cell_type": "code", 278 | "source": 
[ 279 | "# define distracted query\n", 280 | "distracted_query = \"who create the matrics\"" 281 | ], 282 | "metadata": { 283 | "id": "SIwphWquDv9E" 284 | }, 285 | "execution_count": null, 286 | "outputs": [] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "source": [ 291 | "# response\n", 292 | "response = rag_chain.invoke(distracted_query)\n", 293 | "response" 294 | ], 295 | "metadata": { 296 | "colab": { 297 | "base_uri": "https://localhost:8080/", 298 | "height": 35 299 | }, 300 | "id": "p-2Rq06TEALm", 301 | "outputId": "59bc1a76-edc1-4262-a717-0fd7ff64e6fd" 302 | }, 303 | "execution_count": null, 304 | "outputs": [ 305 | { 306 | "output_type": "execute_result", 307 | "data": { 308 | "text/plain": [ 309 | "\"I don't know.\"" 310 | ], 311 | "application/vnd.google.colaboratory.intrinsic+json": { 312 | "type": "string" 313 | } 314 | }, 315 | "metadata": {}, 316 | "execution_count": 13 317 | } 318 | ] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "source": [ 323 | "## **Rewrite Retrieve Read**" 324 | ], 325 | "metadata": { 326 | "id": "e-4zJU8LVrJz" 327 | } 328 | }, 329 | { 330 | "cell_type": "code", 331 | "source": [ 332 | "# define rewrite prompt for distracted query\n", 333 | "\n", 334 | "template = \"\"\"Provide a better search query for \\\n", 335 | "web search engine to answer the given question, end \\\n", 336 | "the queries with ’**’. Question: \\\n", 337 | "{x} Answer:\"\"\"\n", 338 | "\n", 339 | "rewrite_prompt = ChatPromptTemplate.from_template(template)" 340 | ], 341 | "metadata": { 342 | "id": "82XnIBLQE3r_" 343 | }, 344 | "execution_count": null, 345 | "outputs": [] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "source": [ 350 | "# parse response\n", 351 | "def _parse(text):\n", 352 | " return text.strip('\"').strip(\"**\")" 353 | ], 354 | "metadata": { 355 | "id": "eE8VflngFcRu" 356 | }, 357 | "execution_count": null, 358 | "outputs": [] 359 | }, 360 | { 361 | "cell_type": "code", 362 | "source": [ 363 | "# create rewriter chain\n", 364 | "rewriter = rewrite_prompt | ChatOpenAI(temperature=0) | StrOutputParser() | _parse" 365 | ], 366 | "metadata": { 367 | "id": "_Cg_zYy5FAkB" 368 | }, 369 | "execution_count": null, 370 | "outputs": [] 371 | }, 372 | { 373 | "cell_type": "code", 374 | "source": [ 375 | "# updated query\n", 376 | "rewriter.invoke({\"x\": distracted_query})" 377 | ], 378 | "metadata": { 379 | "colab": { 380 | "base_uri": "https://localhost:8080/", 381 | "height": 35 382 | }, 383 | "id": "KOFlwsq3FKWT", 384 | "outputId": "026bcbe4-9219-4c76-c870-3031cb95e1c2" 385 | }, 386 | "execution_count": null, 387 | "outputs": [ 388 | { 389 | "output_type": "execute_result", 390 | "data": { 391 | "text/plain": [ 392 | "'Who is the creator of the Matrix film series?'" 393 | ], 394 | "application/vnd.google.colaboratory.intrinsic+json": { 395 | "type": "string" 396 | } 397 | }, 398 | "metadata": {}, 399 | "execution_count": 17 400 | } 401 | ] 402 | }, 403 | { 404 | "cell_type": "code", 405 | "source": [ 406 | "# create rewrite retrieve read chain\n", 407 | "rewrite_retrieve_read_chain = (\n", 408 | " {\n", 409 | " \"context\": {\"x\": RunnablePassthrough()} | rewriter | retriever,\n", 410 | " \"input\": RunnablePassthrough(),\n", 411 | " }\n", 412 | " | prompt\n", 413 | " | llm\n", 414 | " | StrOutputParser()\n", 415 | ")" 416 | ], 417 | "metadata": { 418 | "id": "DsfoEcsJGBsJ" 419 | }, 420 | "execution_count": null, 421 | "outputs": [] 422 | }, 423 | { 424 | "cell_type": "code", 425 | "source": [ 426 | "# final response\n", 427 | 
"rewrite_retrieve_read_chain.invoke(distracted_query)" 428 | ], 429 | "metadata": { 430 | "colab": { 431 | "base_uri": "https://localhost:8080/", 432 | "height": 35 433 | }, 434 | "id": "NMDFiHBKFlsz", 435 | "outputId": "20b89c91-96b0-4a3b-92b2-ce0c32b0489e" 436 | }, 437 | "execution_count": null, 438 | "outputs": [ 439 | { 440 | "output_type": "execute_result", 441 | "data": { 442 | "text/plain": [ 443 | "'The Matrix was created by the Wachowskis.'" 444 | ], 445 | "application/vnd.google.colaboratory.intrinsic+json": { 446 | "type": "string" 447 | } 448 | }, 449 | "metadata": {}, 450 | "execution_count": 19 451 | } 452 | ] 453 | }, 454 | { 455 | "cell_type": "markdown", 456 | "source": [ 457 | "## **Preparing Data for Evaluation**" 458 | ], 459 | "metadata": { 460 | "id": "-H_IJfdsYIRv" 461 | } 462 | }, 463 | { 464 | "cell_type": "code", 465 | "source": [ 466 | "# create data\n", 467 | "response = []\n", 468 | "contexts = []\n", 469 | "questions = []\n", 470 | "\n", 471 | "rewritten_query = rewriter.invoke({\"x\": distracted_query})\n", 472 | "questions.append(rewritten_query)\n", 473 | "response.append(rewrite_retrieve_read_chain.invoke(distracted_query))\n", 474 | "contexts.append([docs.page_content for docs in retriever.get_relevant_documents(rewritten_query)])\n", 475 | "\n", 476 | "\n", 477 | "data = {\n", 478 | " \"query\": questions,\n", 479 | " \"response\": response,\n", 480 | " \"context\": contexts,\n", 481 | "}" 482 | ], 483 | "metadata": { 484 | "id": "RdR9KRSCWm5T" 485 | }, 486 | "execution_count": null, 487 | "outputs": [] 488 | }, 489 | { 490 | "cell_type": "code", 491 | "source": [ 492 | "# create dataset\n", 493 | "from datasets import Dataset\n", 494 | "dataset = Dataset.from_dict(data)" 495 | ], 496 | "metadata": { 497 | "id": "gBS2PKAgXXWc" 498 | }, 499 | "execution_count": null, 500 | "outputs": [] 501 | }, 502 | { 503 | "cell_type": "code", 504 | "source": [ 505 | "# create dataframe\n", 506 | "import pandas as pd\n", 507 | "df = pd.DataFrame(dataset)" 508 | ], 509 | "metadata": { 510 | "id": "fEAxNCj4XdgW" 511 | }, 512 | "execution_count": null, 513 | "outputs": [] 514 | }, 515 | { 516 | "cell_type": "code", 517 | "source": [ 518 | "df" 519 | ], 520 | "metadata": { 521 | "colab": { 522 | "base_uri": "https://localhost:8080/", 523 | "height": 132 524 | }, 525 | "id": "gNyS_qn-XeME", 526 | "outputId": "476fc3ff-bc7d-4457-f0b6-c54c450b8ae9" 527 | }, 528 | "execution_count": null, 529 | "outputs": [ 530 | { 531 | "output_type": "execute_result", 532 | "data": { 533 | "text/plain": [ 534 | " query \\\n", 535 | "0 Who is the creator of the Matrix film series? \n", 536 | "\n", 537 | " response \\\n", 538 | "0 The Matrix was created by the Wachowskis. \n", 539 | "\n", 540 | " context \n", 541 | "0 ['The Matrix is a 1999 science fiction action film written and directed by the Wachowskis. It is the first installment in the Matrix film series, starring Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, Hugo Weaving, and Joe Pantoliano, and depicts a dystopian future in which humanity is unknowingly trapped inside the Matrix, a simulated reality that intelligent machines have created to distract humans while using their bodies as an energy source. When computer programmer Thomas Anderson... " 542 | ], 543 | "text/html": [ 544 | "\n", 545 | "
\n", 546 | "
\n", 547 | "\n", 560 | "\n", 561 | " \n", 562 | " \n", 563 | " \n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | " \n", 571 | " \n", 572 | " \n", 573 | " \n", 574 | " \n", 575 | " \n", 576 | " \n", 577 | "
queryresponsecontext
0Who is the creator of the Matrix film series?The Matrix was created by the Wachowskis.['The Matrix is a 1999 science fiction action film written and directed by the Wachowskis. It is the first installment in the Matrix film series, starring Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, Hugo Weaving, and Joe Pantoliano, and depicts a dystopian future in which humanity is unknowingly trapped inside the Matrix, a simulated reality that intelligent machines have created to distract humans while using their bodies as an energy source. When computer programmer Thomas Anderson...
\n", 578 | "
\n", 579 | "
\n", 580 | "\n", 581 | "
\n", 582 | " \n", 590 | "\n", 591 | " \n", 631 | "\n", 632 | " \n", 656 | "
\n", 657 | "\n", 658 | "\n", 659 | "
\n", 660 | " \n", 691 | " \n", 700 | " \n", 712 | "
\n", 713 | "\n", 714 | "
\n", 715 | "
\n" 716 | ], 717 | "application/vnd.google.colaboratory.intrinsic+json": { 718 | "type": "dataframe", 719 | "variable_name": "df", 720 | "summary": "{\n \"name\": \"df\",\n \"rows\": 1,\n \"fields\": [\n {\n \"column\": \"query\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"Who is the creator of the Matrix film series?\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"response\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"The Matrix was created by the Wachowskis.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}" 721 | } 722 | }, 723 | "metadata": {}, 724 | "execution_count": 38 725 | } 726 | ] 727 | }, 728 | { 729 | "cell_type": "code", 730 | "source": [ 731 | "# Convert to dictionary\n", 732 | "df_dict = df.to_dict(orient='records')\n", 733 | "\n", 734 | "# Convert context to list\n", 735 | "for record in df_dict:\n", 736 | " if not isinstance(record.get('context'), list):\n", 737 | " if record.get('context') is None:\n", 738 | " record['context'] = []\n", 739 | " else:\n", 740 | " record['context'] = [record['context']]" 741 | ], 742 | "metadata": { 743 | "id": "26r0OY_YYGfw" 744 | }, 745 | "execution_count": null, 746 | "outputs": [] 747 | }, 748 | { 749 | "cell_type": "markdown", 750 | "source": [ 751 | "## **Evaluation in Athina AI**\n", 752 | "\n", 753 | "We will use **Answer Relevancy** eval here. It Measures how pertinent the generated response is to the given prompt. Please refer to our [documentation](https://docs.athina.ai/api-reference/evals/preset-evals/overview) for further details" 754 | ], 755 | "metadata": { 756 | "id": "HaBz2noKZP7C" 757 | } 758 | }, 759 | { 760 | "cell_type": "code", 761 | "source": [ 762 | "# set api keys for Athina evals\n", 763 | "from athina.keys import AthinaApiKey, OpenAiApiKey\n", 764 | "OpenAiApiKey.set_key(os.getenv('OPENAI_API_KEY'))\n", 765 | "AthinaApiKey.set_key(os.getenv('ATHINA_API_KEY'))" 766 | ], 767 | "metadata": { 768 | "id": "s0nAgnCtYw_s" 769 | }, 770 | "execution_count": null, 771 | "outputs": [] 772 | }, 773 | { 774 | "cell_type": "code", 775 | "source": [ 776 | "# load dataset\n", 777 | "from athina.loaders import Loader\n", 778 | "dataset = Loader().load_dict(df_dict)" 779 | ], 780 | "metadata": { 781 | "id": "XTnWXK9_ZlR6" 782 | }, 783 | "execution_count": null, 784 | "outputs": [] 785 | }, 786 | { 787 | "cell_type": "code", 788 | "source": [ 789 | "# evaluate\n", 790 | "from athina.evals import RagasAnswerRelevancy\n", 791 | "RagasAnswerRelevancy(model=\"gpt-4o\").run_batch(data=dataset).to_df()" 792 | ], 793 | "metadata": { 794 | "colab": { 795 | "base_uri": "https://localhost:8080/", 796 | "height": 393 797 | }, 798 | "id": "7l5rGh5LZnA8", 799 | "outputId": "d1310618-80c4-4437-86e2-a4326010b46b" 800 | }, 801 | "execution_count": null, 802 | "outputs": [ 803 | { 804 | "output_type": "stream", 805 | "name": "stdout", 806 | "text": [ 807 | "evaluating with [answer_relevancy]\n" 808 | ] 809 | }, 810 | { 811 | "output_type": "stream", 812 | "name": "stderr", 813 | "text": [ 814 | "100%|██████████| 1/1 [00:01<00:00, 1.08s/it]\n" 815 | ] 816 | }, 817 | { 818 | "output_type": "stream", 819 | "name": "stdout", 820 | "text": [ 821 | "You can view your dataset at: https://app.athina.ai/develop/ddec3010-12e6-4f5e-bbc0-188934bc90dc\n" 822 | ] 823 
| }, 824 | { 825 | "output_type": "execute_result", 826 | "data": { 827 | "text/plain": [ 828 | " query \\\n", 829 | "0 Who is the creator of the Matrix film series? \n", 830 | "\n", 831 | " context \\\n", 832 | "0 ['The Matrix is a 1999 science fiction action film written and directed by the Wachowskis. It is the first installment in the Matrix film series, starring Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, Hugo Weaving, and Joe Pantoliano, and depicts a dystopian future in which humanity is unknowingly trapped inside the Matrix, a simulated reality that intelligent machines have created to distract humans while using their bodies as an energy source. When computer programmer Thomas Anderson... \n", 833 | "\n", 834 | " response expected_response \\\n", 835 | "0 The Matrix was created by the Wachowskis. None \n", 836 | "\n", 837 | " display_name failed \\\n", 838 | "0 Ragas Answer Relevancy None \n", 839 | "\n", 840 | " grade_reason \\\n", 841 | "0 A response is deemed relevant when it directly and appropriately addresses the original query. Importantly, our assessment of answer relevance does not consider factuality but instead penalizes cases where the response lacks completeness or contains redundant details \n", 842 | "\n", 843 | " runtime model ragas_answer_relevancy \n", 844 | "0 1493 gpt-4o 0.961197 " 845 | ], 846 | "text/html": [ 847 | "\n", 848 | "
\n", 849 | "
\n", 850 | "\n", 863 | "\n", 864 | " \n", 865 | " \n", 866 | " \n", 867 | " \n", 868 | " \n", 869 | " \n", 870 | " \n", 871 | " \n", 872 | " \n", 873 | " \n", 874 | " \n", 875 | " \n", 876 | " \n", 877 | " \n", 878 | " \n", 879 | " \n", 880 | " \n", 881 | " \n", 882 | " \n", 883 | " \n", 884 | " \n", 885 | " \n", 886 | " \n", 887 | " \n", 888 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 892 | " \n", 893 | " \n", 894 | "
querycontextresponseexpected_responsedisplay_namefailedgrade_reasonruntimemodelragas_answer_relevancy
0Who is the creator of the Matrix film series?['The Matrix is a 1999 science fiction action film written and directed by the Wachowskis. It is the first installment in the Matrix film series, starring Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, Hugo Weaving, and Joe Pantoliano, and depicts a dystopian future in which humanity is unknowingly trapped inside the Matrix, a simulated reality that intelligent machines have created to distract humans while using their bodies as an energy source. When computer programmer Thomas Anderson...The Matrix was created by the Wachowskis.NoneRagas Answer RelevancyNoneA response is deemed relevant when it directly and appropriately addresses the original query. Importantly, our assessment of answer relevance does not consider factuality but instead penalizes cases where the response lacks completeness or contains redundant details1493gpt-4o0.961197
\n", 895 | "
\n", 896 | "
\n", 897 | "\n", 898 | "
\n", 899 | " \n", 907 | "\n", 908 | " \n", 948 | "\n", 949 | " \n", 973 | "
\n", 974 | "\n", 975 | "\n", 976 | "
\n", 977 | "
\n" 978 | ], 979 | "application/vnd.google.colaboratory.intrinsic+json": { 980 | "type": "dataframe", 981 | "repr_error": "Out of range float values are not JSON compliant: nan" 982 | } 983 | }, 984 | "metadata": {}, 985 | "execution_count": 41 986 | } 987 | ] 988 | } 989 | ] 990 | } -------------------------------------------------------------------------------- /data/tesla_q3.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/athina-ai/rag-cookbooks/ab087e8ff35f39c0f7136e292836f28c9454591c/data/tesla_q3.pdf --------------------------------------------------------------------------------