├── README.md ├── ghraphrag_viz.pdf ├── graph_examples.ipynb └── media ├── basic_retrieval.png ├── coffee_graph_ex.png ├── communities.png ├── drift_search.png ├── entities.png ├── ghraphrag_viz.svg ├── global_search.png ├── graph_building.png ├── graph_start.png ├── graphrag_data_flow.png ├── kg_retrieval.png ├── leidan.png ├── local_search.png ├── relationship.png └── table_comp.png /README.md: -------------------------------------------------------------------------------- 1 | # Knowledge Graph RAG 2 | 3 | 4 | 5 | *[Improving Knowledge Graph Completion with Generative LM and neighbors](https://deeppavlov.ai/research/tpost/bn15u1y4v1-improving-knowledge-graph-completion-wit)* 6 | 7 | In the evolving landscape of AI and information retrieval, knowledge graphs have emerged as a powerful way to represent complex, interconnected information. A knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data. Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the free-form semantics or relationships underlying these entities. [Source: Wikipedia](https://en.wikipedia.org/wiki/Knowledge_graph) 8 | 9 | What makes knowledge graphs particularly powerful is their ability to mirror human cognition in data. They more explicitly map the relationships between objects, concepts, or ideas together through both their semantic and relational connections. This approach closely parallels how our brains naturally understand and internalize information – not as isolated facts, but as a web of interconnected concepts and relationships. 10 | 11 | 12 | 13 | Looking at a concept like "coffee," we don't just know it's a beverage; we automatically connect it to related concepts like beans, brewing methods, caffeine, morning routines, and social interactions. Knowledge graphs capture these natural associations in a structured way. 14 | 15 | Traditional RAG systems, while effective at semantic similarity-based retrieval, often struggle to capture broader conceptual relationships across text chunks. Knowledge Graph RAG addresses this limitation by introducing a structured, hierarchical approach to information organization and retrieval. By representing data in a graph format, these systems can traverse relationships between concepts, enabling more sophisticated query understanding and response generation. This approach allows for targeted querying along specific relationship paths, handles complex multi-hop questions, and provides clearer reasoning through explicit connection paths. The result is a more nuanced and interpretable system that combines the structured reasoning of knowledge graphs with the natural language capabilities of large language models. 16 | 17 | While [knowledge graphs are not a new concept](https://blog.google/products/search/introducing-knowledge-graph-things-not/), their creation has traditionally been a resource-intensive process. Early knowledge graphs were built either through manual curation by domain experts or by converting existing structured data from relational databases. This limited both their scale and adaptability to new domains. 18 | 19 | 20 | 21 | *[What is a Knowledge Graph (KG)?](https://zilliz.com/learn/what-is-knowledge-graph)* 22 | 23 | The introduction of LLMs has transformed this landscape. 
LLMs' capabilities in NLP, reasoning, and relationship extraction now enable automated construction of knowledge graphs from unstructured text. These models can identify entities, infer relationships, and structure information in ways that previously required extensive manual labor. As a plus, this allows knowledge graphs to be dynamically updated and expanded as new information becomes available, making them more practical and scalable for real-world applications. 24 | 25 | To see this in action ourselves, and compare it to traditional vector similarity techniques, we'll take a look at Microsoft's Open Source [GraphRAG](https://microsoft.github.io/graphrag/) and how it works behind the scenes. 26 | -------------------------------------------------------------------------------- /ghraphrag_viz.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/ghraphrag_viz.pdf -------------------------------------------------------------------------------- /graph_examples.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "d362591e-9f54-46f9-b703-49062ac3072f", 6 | "metadata": {}, 7 | "source": [ 8 | "# Knowledge Graph RAG\n", 9 | "\n", 10 | "\n", 11 | "\n", 12 | "*[Improving Knowledge Graph Completion with Generative LM and neighbors](https://deeppavlov.ai/research/tpost/bn15u1y4v1-improving-knowledge-graph-completion-wit)*\n", 13 | "\n", 14 | "In the evolving landscape of AI and information retrieval, knowledge graphs have emerged as a powerful way to represent complex, interconnected information. A knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data. Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the free-form semantics or relationships underlying these entities. [Source: Wikipedia](https://en.wikipedia.org/wiki/Knowledge_graph)\n", 15 | "\n", 16 | "What makes knowledge graphs particularly powerful is their ability to mirror human cognition in data. They more explicitly map the relationships between objects, concepts, or ideas together through both their semantic and relational connections. This approach closely parallels how our brains naturally understand and internalize information – not as isolated facts, but as a web of interconnected concepts and relationships.\n", 17 | "\n", 18 | "\n", 19 | "\n", 20 | "Looking at a concept like \"coffee,\" we don't just know it's a beverage; we automatically connect it to related concepts like beans, brewing methods, caffeine, morning routines, and social interactions. Knowledge graphs capture these natural associations in a structured way.\n", 21 | "\n", 22 | "Traditional RAG systems, while effective at semantic similarity-based retrieval, often struggle to capture broader conceptual relationships across text chunks. Knowledge Graph RAG addresses this limitation by introducing a structured, hierarchical approach to information organization and retrieval. By representing data in a graph format, these systems can traverse relationships between concepts, enabling more sophisticated query understanding and response generation. 
This approach allows for targeted querying along specific relationship paths, handles complex multi-hop questions, and provides clearer reasoning through explicit connection paths. The result is a more nuanced and interpretable system that combines the structured reasoning of knowledge graphs with the natural language capabilities of large language models.\n", 23 | "\n", 24 | "While [knowledge graphs are not a new concept](https://blog.google/products/search/introducing-knowledge-graph-things-not/), their creation has traditionally been a resource-intensive process. Early knowledge graphs were built either through manual curation by domain experts or by converting existing structured data from relational databases. This limited both their scale and adaptability to new domains.\n", 25 | "\n", 26 | "\n", 27 | "\n", 28 | "*[What is a Knowledge Graph (KG)?](https://zilliz.com/learn/what-is-knowledge-graph)*\n", 29 | "\n", 30 | "The introduction of LLMs has transformed this landscape. LLMs' capabilities in NLP, reasoning, and relationship extraction now enable automated construction of knowledge graphs from unstructured text. These models can identify entities, infer relationships, and structure information in ways that previously required extensive manual labor. As a plus, this allows knowledge graphs to be dynamically updated and expanded as new information becomes available, making them more practical and scalable for real-world applications.\n", 31 | "\n", 32 | "To see this in action ourselves, and compare it to traditional vector similarity techniques, we'll take a look at Microsoft's Open Source [GraphRAG](https://microsoft.github.io/graphrag/) and how it works behind the scenes." 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "id": "cb752dad-d6bf-436f-a175-03a1d491bb3e", 38 | "metadata": {}, 39 | "source": [ 40 | "---\n", 41 | "## 3 Main Components of Knowledge Graphs\n", 42 | "\n", 43 | "**Entity**\n", 44 | "\n", 45 | "\n", 46 | "\n", 47 | "An Entity is a distinct object, person, place, event, or concept that has been extracted from a chunk of text through LLM analysis. Entities form the nodes of the knowledge graph. During the creation of the knowledge graph, when duplicate entities are found they are merged while preserving their various descriptions, creating a comprehensive representation of each unique entity.\n", 48 | "\n", 49 | "**Relationship**\n", 50 | "\n", 51 | "\n", 52 | "\n", 53 | "A Relationship defines a connection between two entities in the knowledge graph. These connections are extracted directly from text units through LLM analysis, alongside entities. Each relationship includes a source entity, target entity, and descriptive information about their connection. When duplicate relationships are found between the same entities, they are merged by combining their descriptions to create a more complete understanding of the connection.\n", 54 | "\n", 55 | "**Community**\n", 56 | "\n", 57 | "\n", 58 | "\n", 59 | "A Community is a cluster of related entities and relationships identified through hierarchical community detection, generally using the [Leiden Algorithm](https://en.wikipedia.org/wiki/Leiden_algorithm). Communities create a structured way to understand different levels of granularity within the knowledge graph, from broad overviews at the top level to detailed local clusters at lower levels. This hierarchical structure helps in organizing and navigating complex knowledge graphs." 
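To make these three components concrete, here is a minimal, self-contained sketch (plain Python; the coffee-themed entities and the `Entity`/`Relationship` classes are illustrative assumptions, not GraphRAG's internal data model):

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str                  # e.g. "COFFEE", capitalized entity name
    type: str                  # e.g. "BEVERAGE"
    descriptions: list = field(default_factory=list)  # merged across chunks

@dataclass
class Relationship:
    source: str                # source entity name
    target: str                # target entity name
    description: str           # why the two entities are connected
    strength: int              # integer weight, e.g. 1-10

# A community is simply a cluster of related entity names at some hierarchy level
community = {"level": 0, "title": "Coffee Basics",
             "members": ["COFFEE", "CAFFEINE", "ESPRESSO"]}

coffee = Entity("COFFEE", "BEVERAGE", ["A brewed drink made from roasted beans"])
contains = Relationship("COFFEE", "CAFFEINE", "Coffee naturally contains caffeine", 9)
```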
60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "id": "f5d7e77f-a543-49e2-813e-c1305f7a058d", 65 | "metadata": {}, 66 | "source": [ 67 | "---\n", 68 | "## GraphRAG Creation Data Flow\n", 69 | "\n", 70 | "\n", 71 | "\n", 72 | "Indexing in GraphRAG is an extensive process: we load the document, split it into chunks, create subgraphs at the chunk level, combine these subgraphs into our final graph, algorithmically identify communities, and then document each community's main features." 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "id": "19896a38-bcf7-4664-9f57-cb12cf15cdea", 78 | "metadata": {}, 79 | "source": [ 80 | "### **Loading and Splitting Our Text**\n", 81 | "\n", 82 | "For our example, we'll be using [The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities](https://arxiv.org/pdf/2408.13296).\n", 83 | "\n", 84 | "This will be loaded as a text file (with the index, glossary, and references removed) and split into 1,200-token chunks with a 100-token overlap." 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "id": "97e6a581-74a1-455e-8634-2c94d6496e45", 91 | "metadata": { 92 | "scrolled": true 93 | }, 94 | "outputs": [], 95 | "source": [ 96 | "from langchain_text_splitters import TokenTextSplitter\n", 97 | "\n", 98 | "with open(\"./ragtest/input/ft_guide.txt\", 'r') as file:\n", 99 | " content = file.read()\n", 100 | "\n", 101 | "text_splitter = TokenTextSplitter(chunk_size=1200, chunk_overlap=100)\n", 102 | "\n", 103 | "texts = text_splitter.split_text(content)" 104 | ] 105 | }, 106 | { 107 | "cell_type": "markdown", 108 | "id": "2df9e65e-531c-4e94-9808-4ede865621d7", 109 | "metadata": {}, 110 | "source": [ 111 | "**Entity and Relationship Extraction Prompt**\n", 112 | "\n", 113 | "This is a [tuned](https://microsoft.github.io/graphrag/prompt_tuning/auto_prompt_tuning/) entity extraction prompt used in our real GraphRAG implementation, extracted in this format to see what's happening." 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 74, 119 | "id": "b6236c47-8584-41d6-8553-f516e282d319", 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "from langchain_core.prompts import ChatPromptTemplate\n", 124 | "from langchain_core.output_parsers import StrOutputParser\n", 125 | "from langchain_openai import ChatOpenAI\n", 126 | "\n", 127 | "llm = ChatOpenAI(temperature=0.0, model=\"gpt-4o\")\n", 128 | "\n", 129 | "prompt_template = \"\"\"\n", 130 | "-Goal-\n", 131 | "Given a text document that is potentially relevant to this activity and a list of entity types, identify all entities of those types from the text and all relationships among the identified entities.\n", 132 | "\n", 133 | "-Steps-\n", 134 | "1. Identify all entities. 
For each identified entity, extract the following information:\n", 135 | "- entity_name: Name of the entity, capitalized\n", 136 | "- entity_type: One of the following types: [large language model, differential privacy, federated learning, healthcare, adversarial training, security measures, open-source tool, dataset, learning rate, AdaGrad, RMSprop, adapter architecture, LoRA, API, model support, evaluation metrics, deployment, Python library, hardware accelerators, hyperparameters, data preprocessing, data imbalance, GPU-based deployment, distributed inference]\n", 137 | "- entity_description: Comprehensive description of the entity's attributes and activities\n", 138 | "Format each entity as (\"entity\"{{tuple_delimiter}}<entity_name>{{tuple_delimiter}}<entity_type>{{tuple_delimiter}}<entity_description>)\n", 139 | "\n", 140 | "2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.\n", 141 | "For each pair of related entities, extract the following information:\n", 142 | "- source_entity: name of the source entity, as identified in step 1\n", 143 | "- target_entity: name of the target entity, as identified in step 1\n", 144 | "- relationship_description: explanation as to why you think the source entity and the target entity are related to each other\n", 145 | "- relationship_strength: an integer score between 1 to 10, indicating strength of the relationship between the source entity and target entity\n", 146 | "Format each relationship as (\"relationship\"{{tuple_delimiter}}<source_entity>{{tuple_delimiter}}<target_entity>{{tuple_delimiter}}<relationship_description>{{tuple_delimiter}}<relationship_strength>)\n", 147 | "\n", 148 | "3. Return output in The primary language of the provided text is \"English.\" as a single list of all the entities and relationships identified in steps 1 and 2. Use **{{record_delimiter}}** as the list delimiter.\n", 149 | "\n", 150 | "4. If you have to translate into The primary language of the provided text is \"English.\", just translate the descriptions, nothing else!\n", 151 | "\n", 152 | "5. When finished, output {{completion_delimiter}}.\n", 153 | "\n", 154 | "-Examples-\n", 155 | "######################\n", 156 | "\n", 157 | "Example 1:\n", 158 | "\n", 159 | "entity_types: [large language model, differential privacy, federated learning, healthcare, adversarial training, security measures, open-source tool, dataset, learning rate, AdaGrad, RMSprop, adapter architecture, LoRA, API, model support, evaluation metrics, deployment, Python library, hardware accelerators, hyperparameters, data preprocessing, data imbalance, GPU-based deployment, distributed inference]\n", 160 | "text:\n", 161 | " LLMs to create synthetic samples that mimic clients’ private data distribution using\n", 162 | "differential privacy. This approach significantly boosts SLMs’ performance by approximately 5% while\n", 163 | "maintaining data privacy with a minimal privacy budget, outperforming traditional methods relying\n", 164 | "solely on local private data.\n", 165 | "In healthcare, federated fine-tuning can allow hospitals to collaboratively train models on patient data\n", 166 | "without transferring sensitive information. 
This approach ensures data privacy while enabling the de-\n", 167 | "velopment of robust, generalisable AI systems.\n", 168 | "8https://ai.meta.com/responsible-ai/\n", 169 | "9https://huggingface.co/docs/hub/en/model-cards\n", 170 | "10https://www.tensorflow.org/responsible_ai/privacy/guide\n", 171 | "101 Frameworks for Enhancing Security\n", 172 | "Adversarial training and robust security measures[111] are essential for protecting fine-tuned models\n", 173 | "against attacks. The adversarial training approach involves training models with adversarial examples\n", 174 | "to improve their resilience against malicious inputs. Microsoft Azure’s\n", 175 | "------------------------\n", 176 | "output:\n", 177 | "(\"entity\"{{tuple_delimiter}}DIFFERENTIAL PRIVACY{{tuple_delimiter}}differential privacy{{tuple_delimiter}}Differential privacy is a technique used to create synthetic samples that mimic clients' private data distribution while maintaining data privacy with a minimal privacy budget{{record_delimiter}}\n", 178 | "(\"entity\"{{tuple_delimiter}}HEALTHCARE{{tuple_delimiter}}healthcare{{tuple_delimiter}}In healthcare, federated fine-tuning allows hospitals to collaboratively train models on patient data without transferring sensitive information, ensuring data privacy{{record_delimiter}}\n", 179 | "(\"entity\"{{tuple_delimiter}}FEDERATED LEARNING{{tuple_delimiter}}federated learning{{tuple_delimiter}}Federated learning is a method that enables collaborative model training on decentralized data sources, such as hospitals, without sharing sensitive information{{record_delimiter}}\n", 180 | "(\"entity\"{{tuple_delimiter}}ADVERSARIAL TRAINING{{tuple_delimiter}}adversarial training{{tuple_delimiter}}Adversarial training involves training models with adversarial examples to improve their resilience against malicious inputs{{record_delimiter}}\n", 181 | "(\"entity\"{{tuple_delimiter}}SECURITY MEASURES{{tuple_delimiter}}security measures{{tuple_delimiter}}Robust security measures are essential for protecting fine-tuned models against attacks{{record_delimiter}}\n", 182 | "(\"relationship\"{{tuple_delimiter}}DIFFERENTIAL PRIVACY{{tuple_delimiter}}FEDERATED LEARNING{{tuple_delimiter}}Differential privacy is used in federated learning to maintain data privacy while training models collaboratively{{tuple_delimiter}}8{{record_delimiter}}\n", 183 | "(\"relationship\"{{tuple_delimiter}}HEALTHCARE{{tuple_delimiter}}FEDERATED LEARNING{{tuple_delimiter}}Federated learning is applied in healthcare to train models on patient data without transferring sensitive information{{tuple_delimiter}}9{{record_delimiter}}\n", 184 | "(\"relationship\"{{tuple_delimiter}}ADVERSARIAL TRAINING{{tuple_delimiter}}SECURITY MEASURES{{tuple_delimiter}}Adversarial training is a security measure used to protect models against attacks by improving their resilience{{tuple_delimiter}}8{{completion_delimiter}}\n", 185 | "#############################\n", 186 | "\n", 187 | "\n", 188 | "Example 2:\n", 189 | "\n", 190 | "entity_types: [large language model, differential privacy, federated learning, healthcare, adversarial training, security measures, open-source tool, dataset, learning rate, AdaGrad, RMSprop, adapter architecture, LoRA, API, model support, evaluation metrics, deployment, Python library, hardware accelerators, hyperparameters, data preprocessing, data imbalance, GPU-based deployment, distributed inference]\n", 191 | "text:\n", 192 | "ARD [82] is an innovative open-source tool developed to enhance the safety of 
interactions\n", 193 | "with large language models (LLMs). This tool addresses three critical moderation tasks: detecting\n", 194 | "2https://huggingface.co/docs/transformers/en/model_doc/auto#transformers.AutoModelForCausalLM\n", 195 | "63 harmful intent in user prompts, identifying safety risks in model responses, and determining when a\n", 196 | "model appropriately refuses unsafe requests. Central to its development is WILDGUARD MIX3, a\n", 197 | "meticulously curated dataset comprising 92,000 labelled examples that include both benign prompts and\n", 198 | "adversarial attempts to bypass safety measures. The dataset is divided into WILDGUARD TRAIN, used\n", 199 | "for training the model, and WILDGUARD TEST, consisting of high-quality human-annotated examples\n", 200 | "for evaluation.\n", 201 | "The WILDGUARD model itself is fine-tuned on the Mistral-7B language model using the WILDGUARD\n", 202 | "TRAIN dataset, enabling it to perform all\n", 203 | "------------------------\n", 204 | "output:\n", 205 | "```plaintext\n", 206 | "(\"entity\"{{tuple_delimiter}}ARD{{tuple_delimiter}}open-source tool{{tuple_delimiter}}ARD is an innovative open-source tool developed to enhance the safety of interactions with large language models by addressing moderation tasks such as detecting harmful intent, identifying safety risks, and determining appropriate refusals of unsafe requests)\n", 207 | "{{record_delimiter}}\n", 208 | "(\"entity\"{{tuple_delimiter}}LARGE LANGUAGE MODELS{{tuple_delimiter}}large language model{{tuple_delimiter}}Large language models (LLMs) are advanced AI models designed to understand and generate human-like text, which ARD aims to interact with safely)\n", 209 | "{{record_delimiter}}\n", 210 | "(\"entity\"{{tuple_delimiter}}WILDGUARD MIX3{{tuple_delimiter}}dataset{{tuple_delimiter}}WILDGUARD MIX3 is a meticulously curated dataset comprising 92,000 labeled examples, including benign prompts and adversarial attempts, used for training and evaluating safety measures in language models)\n", 211 | "{{record_delimiter}}\n", 212 | "(\"entity\"{{tuple_delimiter}}WILDGUARD TRAIN{{tuple_delimiter}}dataset{{tuple_delimiter}}WILDGUARD TRAIN is a subset of the WILDGUARD MIX3 dataset used specifically for training the model on safety measures)\n", 213 | "{{record_delimiter}}\n", 214 | "(\"entity\"{{tuple_delimiter}}WILDGUARD TEST{{tuple_delimiter}}dataset{{tuple_delimiter}}WILDGUARD TEST is a subset of the WILDGUARD MIX3 dataset consisting of high-quality human-annotated examples used for evaluating the model's performance)\n", 215 | "{{record_delimiter}}\n", 216 | "(\"entity\"{{tuple_delimiter}}MISTRAL-7B{{tuple_delimiter}}large language model{{tuple_delimiter}}Mistral-7B is a language model that the WILDGUARD model is fine-tuned on using the WILDGUARD TRAIN dataset to enhance its safety performance)\n", 217 | "{{record_delimiter}}\n", 218 | "(\"entity\"{{tuple_delimiter}}ADVERSARIAL ATTEMPTS{{tuple_delimiter}}adversarial training{{tuple_delimiter}}Adversarial attempts are part of the WILDGUARD MIX3 dataset, used to test and improve the model's ability to handle unsafe or harmful inputs)\n", 219 | "{{record_delimiter}}\n", 220 | "(\"entity\"{{tuple_delimiter}}SAFETY MEASURES{{tuple_delimiter}}security measures{{tuple_delimiter}}Safety measures are protocols and techniques implemented to ensure that large language models interact safely with users, which ARD and the WILDGUARD dataset aim to enhance)\n", 221 | "{{record_delimiter}}\n", 222 | 
"(\"relationship\"{{tuple_delimiter}}ARD{{tuple_delimiter}}LARGE LANGUAGE MODELS{{tuple_delimiter}}ARD is designed to enhance the safety of interactions with large language models by addressing critical moderation tasks{{tuple_delimiter}}8)\n", 223 | "{{record_delimiter}}\n", 224 | "(\"relationship\"{{tuple_delimiter}}ARD{{tuple_delimiter}}WILDGUARD MIX3{{tuple_delimiter}}ARD uses the WILDGUARD MIX3 dataset to train and evaluate its moderation capabilities{{tuple_delimiter}}7)\n", 225 | "{{record_delimiter}}\n", 226 | "(\"relationship\"{{tuple_delimiter}}WILDGUARD MIX3{{tuple_delimiter}}WILDGUARD TRAIN{{tuple_delimiter}}WILDGUARD TRAIN is a subset of the WILDGUARD MIX3 dataset used for training{{tuple_delimiter}}9)\n", 227 | "{{record_delimiter}}\n", 228 | "(\"relationship\"{{tuple_delimiter}}WILDGUARD MIX3{{tuple_delimiter}}WILDGUARD TEST{{tuple_delimiter}}WILDGUARD TEST is a subset of the WILDGUARD MIX3 dataset used for evaluation{{tuple_delimiter}}9)\n", 229 | "{{record_delimiter}}\n", 230 | "(\"relationship\"{{tuple_delimiter}}WILDGUARD TRAIN{{tuple_delimiter}}MISTRAL-7B{{tuple_delimiter}}The WILDGUARD TRAIN dataset is used to fine-tune the Mistral-7B language model{{tuple_delimiter}}8)\n", 231 | "{{record_delimiter}}\n", 232 | "(\"relationship\"{{tuple_delimiter}}ADVERSARIAL ATTEMPTS{{tuple_delimiter}}SAFETY MEASURES{{tuple_delimiter}}Adversarial attempts are used to test and improve safety measures in language models{{tuple_delimiter}}7)\n", 233 | "{{completion_delimiter}}\n", 234 | "```\n", 235 | "#############################\n", 236 | "\n", 237 | "\n", 238 | "\n", 239 | "-Real Data-\n", 240 | "######################\n", 241 | "entity_types: [large language model, differential privacy, federated learning, healthcare, adversarial training, security measures, open-source tool, dataset, learning rate, AdaGrad, RMSprop, adapter architecture, LoRA, API, model support, evaluation metrics, deployment, Python library, hardware accelerators, hyperparameters, data preprocessing, data imbalance, GPU-based deployment, distributed inference]\n", 242 | "text: {input_text}\n", 243 | "######################\n", 244 | "output:\n", 245 | "\"\"\"\n", 246 | "\n", 247 | "prompt = ChatPromptTemplate.from_template(prompt_template)\n", 248 | "\n", 249 | "chain = prompt | llm | StrOutputParser()" 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "id": "0742c48f-043a-4653-ae9a-550d0b929386", 255 | "metadata": {}, 256 | "source": [ 257 | "**Creating a Response**" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 75, 263 | "id": "81a8e812-85b4-446b-9e54-ca8a227947f2", 264 | "metadata": {}, 265 | "outputs": [ 266 | { 267 | "data": { 268 | "text/html": [ 269 | "
HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
 270 |        "
\n" 271 | ], 272 | "text/plain": [ 273 | "HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" 274 | ] 275 | }, 276 | "metadata": {}, 277 | "output_type": "display_data" 278 | } 279 | ], 280 | "source": [ 281 | "response = chain.invoke({\"input_text\": texts[25]})" 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": 76, 287 | "id": "1176f935-4ffb-4e37-8daa-505edced7bc1", 288 | "metadata": {}, 289 | "outputs": [ 290 | { 291 | "name": "stdout", 292 | "output_type": "stream", 293 | "text": [ 294 | "```plaintext\n", 295 | "(\"entity\"{tuple_delimiter}EVALUATION METRICS{tuple_delimiter}evaluation metrics{tuple_delimiter}Evaluation metrics are used to measure the performance of AI models, including metrics like cross-entropy, perplexity, factuality, and context relevance)\n", 296 | "{record_delimiter}\n", 297 | "(\"entity\"{tuple_delimiter}HYPERPARAMETERS{tuple_delimiter}hyperparameters{tuple_delimiter}Hyperparameters are key settings in model training, such as learning rate, batch size, and number of training epochs, which are adjusted to optimize model performance)\n", 298 | "{record_delimiter}\n", 299 | "(\"entity\"{tuple_delimiter}CROSS-ENTROPY{tuple_delimiter}evaluation metrics{tuple_delimiter}Cross-entropy is a key metric for evaluating large language models (LLMs) during training or fine-tuning, quantifying the difference between predicted and actual data distributions)\n", 300 | "{record_delimiter}\n", 301 | "(\"entity\"{tuple_delimiter}PERPLEXITY{tuple_delimiter}evaluation metrics{tuple_delimiter}Perplexity measures how well a probability distribution or model predicts a sample, indicating the model's uncertainty about the next word in a sequence)\n", 302 | "{record_delimiter}\n", 303 | "(\"entity\"{tuple_delimiter}FACTUALITY{tuple_delimiter}evaluation metrics{tuple_delimiter}Factuality assesses the accuracy of the information produced by the LLM, important for applications where misinformation could have serious consequences)\n", 304 | "{record_delimiter}\n", 305 | "(\"entity\"{tuple_delimiter}LLM UNCERTAINTY{tuple_delimiter}evaluation metrics{tuple_delimiter}LLM uncertainty is measured using log probability to identify low-quality generations, with lower uncertainty indicating higher output quality)\n", 306 | "{record_delimiter}\n", 307 | "(\"entity\"{tuple_delimiter}PROMPT PERPLEXITY{tuple_delimiter}evaluation metrics{tuple_delimiter}Prompt perplexity evaluates how well the model understands the input prompt, with lower values indicating clearer and more comprehensible prompts)\n", 308 | "{record_delimiter}\n", 309 | "(\"entity\"{tuple_delimiter}CONTEXT RELEVANCE{tuple_delimiter}evaluation metrics{tuple_delimiter}Context relevance measures how pertinent the retrieved context is to the user query in retrieval-augmented generation systems, improving response quality)\n", 310 | "{record_delimiter}\n", 311 | "(\"relationship\"{tuple_delimiter}CROSS-ENTROPY{tuple_delimiter}PERPLEXITY{tuple_delimiter}Both cross-entropy and perplexity are metrics used to evaluate the performance of large language models, focusing on prediction accuracy and uncertainty{tuple_delimiter}7)\n", 312 | "{record_delimiter}\n", 313 | "(\"relationship\"{tuple_delimiter}HYPERPARAMETERS{tuple_delimiter}EVALUATION METRICS{tuple_delimiter}Hyperparameters are adjusted based on evaluation metrics to optimize model performance and prevent overfitting{tuple_delimiter}8)\n", 314 | "{completion_delimiter}\n", 315 | "```\n" 316 | ] 317 | } 318 | ], 319 | "source": [ 320 
| "print(response)" 321 | ] 322 | }, 323 | { 324 | "cell_type": "markdown", 325 | "id": "55fd5fa6-626b-4c6b-ad17-84d4d3bc5bf4", 326 | "metadata": {}, 327 | "source": [ 328 | "We see the extraction of **entities**:\n", 329 | "\n", 330 | "`(\"entity\"{tuple_delimiter}EVALUATION METRICS{tuple_delimiter}evaluation metrics{tuple_delimiter}Evaluation metrics are criteria used to assess the performance of AI models, including metrics like cross-entropy, perplexity, factuality, and context relevance)`\n", 331 | "\n", 332 | "As well as **relationships**:\n", 333 | "\n", 334 | "`(\"relationship\"{tuple_delimiter}EVALUATION METRICS{tuple_delimiter}CONTEXT RELEVANCE{tuple_delimiter}Context relevance is an evaluation metric that ensures the model uses the most pertinent information for generating responses{tuple_delimiter}8)`\n", 335 | "\n", 336 | "Following this, these per chunk subgraphs are merged together - any entities with the same name and type are merged by creating an array of their descriptions. Similarly, any relationships with the same source and target are merged by creating an array of their descriptions. These lists are then summarized one more time " 337 | ] 338 | }, 339 | { 340 | "cell_type": "markdown", 341 | "id": "ad41d036-00da-4dba-8fcf-1cee5b683d52", 342 | "metadata": {}, 343 | "source": [ 344 | "### **Looking at Final Entities and Relationships**" 345 | ] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "execution_count": 77, 350 | "id": "26613855-7891-4b16-ad84-758f8a0ed8fd", 351 | "metadata": {}, 352 | "outputs": [ 353 | { 354 | "data": { 355 | "text/html": [ 356 | "
\n", 357 | "\n", 370 | "\n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 378 | " \n", 379 | " \n", 380 | " \n", 381 | " \n", 382 | " \n", 383 | " \n", 384 | " \n", 385 | " \n", 386 | " \n", 387 | " \n", 388 | " \n", 389 | " \n", 390 | " \n", 391 | " \n", 392 | " \n", 393 | " \n", 394 | " \n", 395 | " \n", 396 | " \n", 397 | " \n", 398 | " \n", 399 | " \n", 400 | " \n", 401 | " \n", 402 | " \n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | "
idhuman_readable_idtitletypedescriptiontext_unit_ids
0e3a7f24b-88b6-4481-b3a7-c35075a9671f0GPT-3ORGANIZATIONGPT-3 is a large language model developed by O...[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
1f55cae4e-dd0d-47a2-912b-f7680147dd311GPT-4ORGANIZATIONGPT-4 is an advanced large language model deve...[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
2f3e3e46b-6746-45a7-9a26-1432f14c45e42BERTORGANIZATIONBERT, which stands for Bidirectional Encoder R...[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
30491a417-2e18-41c4-ae1e-3e39bf2eb98f3PALMORGANIZATIONPaLM is a large language model developed by Go...[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
42b7f14f5-d1d5-49f6-bace-46fd1767f99e4LLAMAORGANIZATIONLLAMA is a versatile and advanced model known ...[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
\n", 430 | "
" 431 | ], 432 | "text/plain": [ 433 | " id human_readable_id title \\\n", 434 | "0 e3a7f24b-88b6-4481-b3a7-c35075a9671f 0 GPT-3 \n", 435 | "1 f55cae4e-dd0d-47a2-912b-f7680147dd31 1 GPT-4 \n", 436 | "2 f3e3e46b-6746-45a7-9a26-1432f14c45e4 2 BERT \n", 437 | "3 0491a417-2e18-41c4-ae1e-3e39bf2eb98f 3 PALM \n", 438 | "4 2b7f14f5-d1d5-49f6-bace-46fd1767f99e 4 LLAMA \n", 439 | "\n", 440 | " type description \\\n", 441 | "0 ORGANIZATION GPT-3 is a large language model developed by O... \n", 442 | "1 ORGANIZATION GPT-4 is an advanced large language model deve... \n", 443 | "2 ORGANIZATION BERT, which stands for Bidirectional Encoder R... \n", 444 | "3 ORGANIZATION PaLM is a large language model developed by Go... \n", 445 | "4 ORGANIZATION LLAMA is a versatile and advanced model known ... \n", 446 | "\n", 447 | " text_unit_ids \n", 448 | "0 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 449 | "1 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 450 | "2 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 451 | "3 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 452 | "4 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... " 453 | ] 454 | }, 455 | "execution_count": 77, 456 | "metadata": {}, 457 | "output_type": "execute_result" 458 | } 459 | ], 460 | "source": [ 461 | "import pandas as pd\n", 462 | "\n", 463 | "entities = pd.read_parquet('./ragtest/output/create_final_entities.parquet')\n", 464 | "\n", 465 | "entities.head()" 466 | ] 467 | }, 468 | { 469 | "cell_type": "code", 470 | "execution_count": 78, 471 | "id": "9c4bf9fc-bb4e-4546-897b-991236079323", 472 | "metadata": {}, 473 | "outputs": [ 474 | { 475 | "data": { 476 | "text/html": [ 477 | "
\n", 478 | "\n", 491 | "\n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 506 | " \n", 507 | " \n", 508 | " \n", 509 | " \n", 510 | " \n", 511 | " \n", 512 | " \n", 513 | " \n", 514 | " \n", 515 | " \n", 516 | " \n", 517 | " \n", 518 | " \n", 519 | " \n", 520 | " \n", 521 | " \n", 522 | " \n", 523 | " \n", 524 | " \n", 525 | " \n", 526 | " \n", 527 | " \n", 528 | " \n", 529 | " \n", 530 | " \n", 531 | " \n", 532 | " \n", 533 | " \n", 534 | " \n", 535 | " \n", 536 | " \n", 537 | " \n", 538 | " \n", 539 | " \n", 540 | " \n", 541 | " \n", 542 | " \n", 543 | " \n", 544 | " \n", 545 | " \n", 546 | " \n", 547 | " \n", 548 | " \n", 549 | " \n", 550 | " \n", 551 | " \n", 552 | " \n", 553 | " \n", 554 | " \n", 555 | " \n", 556 | " \n", 557 | " \n", 558 | " \n", 559 | " \n", 560 | " \n", 561 | " \n", 562 | "
idhuman_readable_idsourcetargetdescriptionweightcombined_degreetext_unit_ids
0b895553a-f860-4d15-bba2-a42f1464e8100GPT-3GPT-4GPT-4 is an advanced version of GPT-3, buildin...8.020[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
11548feb2-5a6a-43e6-ab44-81252056193e1GPT-3CHATGPTChatGPT is based on the GPT architecture, spec...7.014[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
2e538721d-c023-4994-b918-1efece80ea7e2GPT-3BERTBoth BERT and GPT-3 are pre-trained language m...6.018[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
3063a2941-df53-4b5c-a66a-45e60bdba6043GPT-3REINFORCEMENT LEARNING FROM HUMAN FEEDBACK (RLHF)RLHF is used in training GPT-3 to refine its o...7.013[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
47e0cd0b4-4688-434f-a194-f2b9ce397a944GPT-3PROMPT ENGINEERINGPrompt engineering is a technique used to guid...6.014[ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41...
\n", 563 | "
" 564 | ], 565 | "text/plain": [ 566 | " id human_readable_id source \\\n", 567 | "0 b895553a-f860-4d15-bba2-a42f1464e810 0 GPT-3 \n", 568 | "1 1548feb2-5a6a-43e6-ab44-81252056193e 1 GPT-3 \n", 569 | "2 e538721d-c023-4994-b918-1efece80ea7e 2 GPT-3 \n", 570 | "3 063a2941-df53-4b5c-a66a-45e60bdba604 3 GPT-3 \n", 571 | "4 7e0cd0b4-4688-434f-a194-f2b9ce397a94 4 GPT-3 \n", 572 | "\n", 573 | " target \\\n", 574 | "0 GPT-4 \n", 575 | "1 CHATGPT \n", 576 | "2 BERT \n", 577 | "3 REINFORCEMENT LEARNING FROM HUMAN FEEDBACK (RLHF) \n", 578 | "4 PROMPT ENGINEERING \n", 579 | "\n", 580 | " description weight combined_degree \\\n", 581 | "0 GPT-4 is an advanced version of GPT-3, buildin... 8.0 20 \n", 582 | "1 ChatGPT is based on the GPT architecture, spec... 7.0 14 \n", 583 | "2 Both BERT and GPT-3 are pre-trained language m... 6.0 18 \n", 584 | "3 RLHF is used in training GPT-3 to refine its o... 7.0 13 \n", 585 | "4 Prompt engineering is a technique used to guid... 6.0 14 \n", 586 | "\n", 587 | " text_unit_ids \n", 588 | "0 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 589 | "1 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 590 | "2 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 591 | "3 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... \n", 592 | "4 [ca73c495111f5cadd87e6a7a01aed66647ae6623fdf41... " 593 | ] 594 | }, 595 | "execution_count": 78, 596 | "metadata": {}, 597 | "output_type": "execute_result" 598 | } 599 | ], 600 | "source": [ 601 | "relationships = pd.read_parquet('./ragtest/output/create_final_relationships.parquet')\n", 602 | "\n", 603 | "relationships.head()" 604 | ] 605 | }, 606 | { 607 | "cell_type": "markdown", 608 | "id": "d0c77836-f9ae-4602-a4d0-760915248e0a", 609 | "metadata": {}, 610 | "source": [ 611 | "### **Community Detection & Node Embedding**\n", 612 | "\n", 613 | "\n", 614 | "\n", 615 | "After we have our basic graph with entities and relationships, we analyze its structure in two ways. Community Detection uses the [Leiden algorithm](https://en.wikipedia.org/wiki/Leiden_algorithm) to find explicit groupings in the graph, creating a hierarchy of related entities. The lower in the hierarchy, the more granular the community. Node Embedding uses [Node2Vec](https://arxiv.org/abs/1607.00653) to create vector representations of each entity, capturing implicit relationships in the graph structure. These complementary approaches let us understand both obvious connections through communities and subtle patterns through embeddings.\n", 616 | "\n", 617 | "Combining all of this with our relationships gives us our final nodes." 618 | ] 619 | }, 620 | { 621 | "cell_type": "code", 622 | "execution_count": 80, 623 | "id": "058d8f6d-6eb7-45fe-bf1c-e28055a683c5", 624 | "metadata": {}, 625 | "outputs": [ 626 | { 627 | "data": { 628 | "text/html": [ 629 | "
\n", 630 | "\n", 643 | "\n", 644 | " \n", 645 | " \n", 646 | " \n", 647 | " \n", 648 | " \n", 649 | " \n", 650 | " \n", 651 | " \n", 652 | " \n", 653 | " \n", 654 | " \n", 655 | " \n", 656 | " \n", 657 | " \n", 658 | " \n", 659 | " \n", 660 | " \n", 661 | " \n", 662 | " \n", 663 | " \n", 664 | " \n", 665 | " \n", 666 | " \n", 667 | " \n", 668 | " \n", 669 | " \n", 670 | " \n", 671 | " \n", 672 | " \n", 673 | " \n", 674 | " \n", 675 | " \n", 676 | " \n", 677 | " \n", 678 | " \n", 679 | " \n", 680 | " \n", 681 | " \n", 682 | " \n", 683 | " \n", 684 | " \n", 685 | " \n", 686 | " \n", 687 | " \n", 688 | " \n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | " \n", 695 | " \n", 696 | " \n", 697 | " \n", 698 | " \n", 699 | " \n", 700 | " \n", 701 | " \n", 702 | " \n", 703 | " \n", 704 | " \n", 705 | " \n", 706 | " \n", 707 | " \n", 708 | " \n", 709 | " \n", 710 | " \n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | " \n", 718 | " \n", 719 | " \n", 720 | " \n", 721 | " \n", 722 | " \n", 723 | " \n", 724 | " \n", 725 | " \n", 726 | " \n", 727 | " \n", 728 | " \n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | " \n", 737 | " \n", 738 | " \n", 739 | " \n", 740 | " \n", 741 | " \n", 742 | " \n", 743 | " \n", 744 | " \n", 745 | " \n", 746 | " \n", 747 | " \n", 748 | " \n", 749 | " \n", 750 | " \n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | " \n", 757 | " \n", 758 | " \n", 759 | " \n", 760 | " \n", 761 | " \n", 762 | " \n", 763 | " \n", 764 | " \n", 765 | " \n", 766 | " \n", 767 | " \n", 768 | " \n", 769 | "
idhuman_readable_idtitlecommunityleveldegreexy
0e3a7f24b-88b6-4481-b3a7-c35075a9671f0GPT-38012-4.8755454.017587
1e3a7f24b-88b6-4481-b3a7-c35075a9671f0GPT-343112-4.8755454.017587
2f55cae4e-dd0d-47a2-912b-f7680147dd311GPT-4808-4.5610641.505724
3f55cae4e-dd0d-47a2-912b-f7680147dd311GPT-44618-4.5610641.505724
4f3e3e46b-6746-45a7-9a26-1432f14c45e42BERT806-5.7105803.546957
5f3e3e46b-6746-45a7-9a26-1432f14c45e42BERT4416-5.7105803.546957
60491a417-2e18-41c4-ae1e-3e39bf2eb98f3PALM803-5.3093921.548029
70491a417-2e18-41c4-ae1e-3e39bf2eb98f3PALM4613-5.3093921.548029
82b7f14f5-d1d5-49f6-bace-46fd1767f99e4LLAMA304-6.6445730.421999
92b7f14f5-d1d5-49f6-bace-46fd1767f99e4LLAMA2714-6.6445730.421999
\n", 770 | "
" 771 | ], 772 | "text/plain": [ 773 | " id human_readable_id title community \\\n", 774 | "0 e3a7f24b-88b6-4481-b3a7-c35075a9671f 0 GPT-3 8 \n", 775 | "1 e3a7f24b-88b6-4481-b3a7-c35075a9671f 0 GPT-3 43 \n", 776 | "2 f55cae4e-dd0d-47a2-912b-f7680147dd31 1 GPT-4 8 \n", 777 | "3 f55cae4e-dd0d-47a2-912b-f7680147dd31 1 GPT-4 46 \n", 778 | "4 f3e3e46b-6746-45a7-9a26-1432f14c45e4 2 BERT 8 \n", 779 | "5 f3e3e46b-6746-45a7-9a26-1432f14c45e4 2 BERT 44 \n", 780 | "6 0491a417-2e18-41c4-ae1e-3e39bf2eb98f 3 PALM 8 \n", 781 | "7 0491a417-2e18-41c4-ae1e-3e39bf2eb98f 3 PALM 46 \n", 782 | "8 2b7f14f5-d1d5-49f6-bace-46fd1767f99e 4 LLAMA 3 \n", 783 | "9 2b7f14f5-d1d5-49f6-bace-46fd1767f99e 4 LLAMA 27 \n", 784 | "\n", 785 | " level degree x y \n", 786 | "0 0 12 -4.875545 4.017587 \n", 787 | "1 1 12 -4.875545 4.017587 \n", 788 | "2 0 8 -4.561064 1.505724 \n", 789 | "3 1 8 -4.561064 1.505724 \n", 790 | "4 0 6 -5.710580 3.546957 \n", 791 | "5 1 6 -5.710580 3.546957 \n", 792 | "6 0 3 -5.309392 1.548029 \n", 793 | "7 1 3 -5.309392 1.548029 \n", 794 | "8 0 4 -6.644573 0.421999 \n", 795 | "9 1 4 -6.644573 0.421999 " 796 | ] 797 | }, 798 | "execution_count": 80, 799 | "metadata": {}, 800 | "output_type": "execute_result" 801 | } 802 | ], 803 | "source": [ 804 | "nodes = pd.read_parquet('./ragtest/output/create_final_nodes.parquet')\n", 805 | "\n", 806 | "nodes.head(10)" 807 | ] 808 | }, 809 | { 810 | "cell_type": "markdown", 811 | "id": "457b12c9-fc5c-47cf-a597-6c6bfb3177ee", 812 | "metadata": {}, 813 | "source": [ 814 | "At this step the graph is effectively created, however we can introduce a few extra steps that will allow us to do some advanced retrieval." 815 | ] 816 | }, 817 | { 818 | "cell_type": "markdown", 819 | "id": "62d8ed95-677b-405e-82d3-6fb1a3f1917c", 820 | "metadata": {}, 821 | "source": [ 822 | "### Community Report Generation & Summarization\n", 823 | "\n", 824 | "Now that we have clear community grouping, we can aggregate the main concepts across hierarchical node communities with another generation step, and a shorthand summary of that summary. Similar to the nodes, these summaries are also ran through an embedding model and stored in a vector store." 825 | ] 826 | }, 827 | { 828 | "cell_type": "code", 829 | "execution_count": 81, 830 | "id": "702e5559-ed4e-4964-bf72-113912974102", 831 | "metadata": {}, 832 | "outputs": [ 833 | { 834 | "data": { 835 | "text/html": [ 836 | "
\n", 837 | "\n", 850 | "\n", 851 | " \n", 852 | " \n", 853 | " \n", 854 | " \n", 855 | " \n", 856 | " \n", 857 | " \n", 858 | " \n", 859 | " \n", 860 | " \n", 861 | " \n", 862 | " \n", 863 | " \n", 864 | " \n", 865 | " \n", 866 | " \n", 867 | " \n", 868 | " \n", 869 | " \n", 870 | " \n", 871 | " \n", 872 | " \n", 873 | " \n", 874 | " \n", 875 | " \n", 876 | " \n", 877 | " \n", 878 | " \n", 879 | " \n", 880 | " \n", 881 | " \n", 882 | " \n", 883 | " \n", 884 | " \n", 885 | " \n", 886 | " \n", 887 | " \n", 888 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 892 | " \n", 893 | " \n", 894 | " \n", 895 | " \n", 896 | " \n", 897 | " \n", 898 | " \n", 899 | " \n", 900 | " \n", 901 | " \n", 902 | " \n", 903 | " \n", 904 | " \n", 905 | " \n", 906 | " \n", 907 | " \n", 908 | " \n", 909 | " \n", 910 | " \n", 911 | " \n", 912 | " \n", 913 | " \n", 914 | " \n", 915 | " \n", 916 | " \n", 917 | " \n", 918 | " \n", 919 | " \n", 920 | " \n", 921 | " \n", 922 | " \n", 923 | " \n", 924 | " \n", 925 | " \n", 926 | " \n", 927 | " \n", 928 | " \n", 929 | " \n", 930 | " \n", 931 | " \n", 932 | " \n", 933 | " \n", 934 | " \n", 935 | " \n", 936 | " \n", 937 | " \n", 938 | " \n", 939 | " \n", 940 | " \n", 941 | " \n", 942 | " \n", 943 | " \n", 944 | " \n", 945 | " \n", 946 | " \n", 947 | " \n", 948 | " \n", 949 | " \n", 950 | " \n", 951 | " \n", 952 | " \n", 953 | " \n", 954 | " \n", 955 | " \n", 956 | " \n", 957 | "
idhuman_readable_idcommunityparentleveltitlesummaryfull_contentrankrank_explanationfindingsfull_content_jsonperiodsize
0a85d59a64a054114982b1ce6e1ced5916161322Amazon Bedrock and AI Model ProvidersThe community is centered around Amazon Bedroc...# Amazon Bedrock and AI Model Providers\\n\\nThe...8.5The impact severity rating is high due to Amaz...[{'explanation': 'Amazon Bedrock is a pivotal ...{\\n \"title\": \"Amazon Bedrock and AI Model P...2024-12-189
16aafc6eeddd848bc8ffbfb9177790c266262322AWS and SageMaker JumpStartThe community is centered around Amazon Web Se...# AWS and SageMaker JumpStart\\n\\nThe community...8.5The impact severity rating is high due to AWS'...[{'explanation': 'Amazon Web Services (AWS) is...{\\n \"title\": \"AWS and SageMaker JumpStart\",...2024-12-182
2e13e3ed0a0b74fd090319957ae9f3e1e141401PPO for LLM Alignment and Reinforcement Learni...The community centers around the study 'PPO fo...# PPO for LLM Alignment and Reinforcement Lear...7.5The impact severity rating is high due to the ...[{'explanation': 'The study 'PPO for LLM Align...{\\n \"title\": \"PPO for LLM Alignment and Rei...2024-12-187
3828baab1461b439ea71203ad8fd0aae5151501HuggingFace and Advanced NLP ToolsThe community is centered around HuggingFace, ...# HuggingFace and Advanced NLP Tools\\n\\nThe co...8.5The impact severity rating is high due to Hugg...[{'explanation': 'HuggingFace is a prominent e...{\\n \"title\": \"HuggingFace and Advanced NLP ...2024-12-187
4791da6e7031e45228442b277e7d912c6161601OpenAI and AI Development PlatformsThe community is centered around OpenAI, a lea...# OpenAI and AI Development Platforms\\n\\nThe c...8.5The impact severity rating is high due to the ...[{'explanation': 'OpenAI is a central entity i...{\\n \"title\": \"OpenAI and AI Development Pla...2024-12-187
\n", 958 | "
" 959 | ], 960 | "text/plain": [ 961 | " id human_readable_id community parent \\\n", 962 | "0 a85d59a64a054114982b1ce6e1ced591 61 61 32 \n", 963 | "1 6aafc6eeddd848bc8ffbfb9177790c26 62 62 32 \n", 964 | "2 e13e3ed0a0b74fd090319957ae9f3e1e 14 14 0 \n", 965 | "3 828baab1461b439ea71203ad8fd0aae5 15 15 0 \n", 966 | "4 791da6e7031e45228442b277e7d912c6 16 16 0 \n", 967 | "\n", 968 | " level title \\\n", 969 | "0 2 Amazon Bedrock and AI Model Providers \n", 970 | "1 2 AWS and SageMaker JumpStart \n", 971 | "2 1 PPO for LLM Alignment and Reinforcement Learni... \n", 972 | "3 1 HuggingFace and Advanced NLP Tools \n", 973 | "4 1 OpenAI and AI Development Platforms \n", 974 | "\n", 975 | " summary \\\n", 976 | "0 The community is centered around Amazon Bedroc... \n", 977 | "1 The community is centered around Amazon Web Se... \n", 978 | "2 The community centers around the study 'PPO fo... \n", 979 | "3 The community is centered around HuggingFace, ... \n", 980 | "4 The community is centered around OpenAI, a lea... \n", 981 | "\n", 982 | " full_content rank \\\n", 983 | "0 # Amazon Bedrock and AI Model Providers\\n\\nThe... 8.5 \n", 984 | "1 # AWS and SageMaker JumpStart\\n\\nThe community... 8.5 \n", 985 | "2 # PPO for LLM Alignment and Reinforcement Lear... 7.5 \n", 986 | "3 # HuggingFace and Advanced NLP Tools\\n\\nThe co... 8.5 \n", 987 | "4 # OpenAI and AI Development Platforms\\n\\nThe c... 8.5 \n", 988 | "\n", 989 | " rank_explanation \\\n", 990 | "0 The impact severity rating is high due to Amaz... \n", 991 | "1 The impact severity rating is high due to AWS'... \n", 992 | "2 The impact severity rating is high due to the ... \n", 993 | "3 The impact severity rating is high due to Hugg... \n", 994 | "4 The impact severity rating is high due to the ... \n", 995 | "\n", 996 | " findings \\\n", 997 | "0 [{'explanation': 'Amazon Bedrock is a pivotal ... \n", 998 | "1 [{'explanation': 'Amazon Web Services (AWS) is... \n", 999 | "2 [{'explanation': 'The study 'PPO for LLM Align... \n", 1000 | "3 [{'explanation': 'HuggingFace is a prominent e... \n", 1001 | "4 [{'explanation': 'OpenAI is a central entity i... \n", 1002 | "\n", 1003 | " full_content_json period size \n", 1004 | "0 {\\n \"title\": \"Amazon Bedrock and AI Model P... 2024-12-18 9 \n", 1005 | "1 {\\n \"title\": \"AWS and SageMaker JumpStart\",... 2024-12-18 2 \n", 1006 | "2 {\\n \"title\": \"PPO for LLM Alignment and Rei... 2024-12-18 7 \n", 1007 | "3 {\\n \"title\": \"HuggingFace and Advanced NLP ... 2024-12-18 7 \n", 1008 | "4 {\\n \"title\": \"OpenAI and AI Development Pla... 2024-12-18 7 " 1009 | ] 1010 | }, 1011 | "execution_count": 81, 1012 | "metadata": {}, 1013 | "output_type": "execute_result" 1014 | } 1015 | ], 1016 | "source": [ 1017 | "community_reports = pd.read_parquet('./ragtest/output/create_final_community_reports.parquet')\n", 1018 | "\n", 1019 | "community_reports.head()" 1020 | ] 1021 | }, 1022 | { 1023 | "cell_type": "code", 1024 | "execution_count": 83, 1025 | "id": "0ca51c33-92db-47c1-9f00-0a8e934332fe", 1026 | "metadata": {}, 1027 | "outputs": [ 1028 | { 1029 | "name": "stdout", 1030 | "output_type": "stream", 1031 | "text": [ 1032 | "# Amazon Bedrock and AI Model Providers\n", 1033 | "\n", 1034 | "The community is centered around Amazon Bedrock, a service by AWS that facilitates access to foundation models from various AI innovators. Key entities include AI21 Labs, Anthropic, Cohere, Mistral AI, and Stability AI, all of which provide models accessible through Amazon Bedrock. 
The service integrates with AWS infrastructure, including AWS Lambda and AWS SageMaker, to support scalable AI model deployment.\n", 1035 | "\n", 1036 | "## Amazon Bedrock as a central service\n", 1037 | "\n", 1038 | "Amazon Bedrock is a pivotal service within the AWS ecosystem, designed to simplify access to high-performing foundation models for generative AI applications. It integrates seamlessly with other AWS services, such as Amazon S3, AWS Lambda, and AWS SageMaker, to facilitate the fine-tuning and deployment of AI models. This integration underscores its importance in the AI landscape, providing a comprehensive suite of tools for scalable AI model deployment [Data: Entities (206); Relationships (281, 326, 327)].\n", 1039 | "\n", 1040 | "## AI21 Labs' contribution to Amazon Bedrock\n", 1041 | "\n", 1042 | "AI21 Labs is one of the key providers of foundation models available through Amazon Bedrock. The company's advanced AI models contribute significantly to the capabilities offered by Amazon Bedrock, enhancing its utility for natural language processing tasks. This partnership highlights the collaborative nature of the AI community in advancing technology through shared resources and expertise [Data: Entities (267); Relationships (318)].\n", 1043 | "\n", 1044 | "## Anthropic's role in AI safety and research\n", 1045 | "\n", 1046 | "Anthropic is an AI safety and research company that provides foundation models accessible through Amazon Bedrock. Its focus on AI safety is crucial in the development of responsible AI technologies, ensuring that the models deployed via Amazon Bedrock adhere to ethical standards. This relationship emphasizes the importance of integrating safety considerations into AI development [Data: Entities (268); Relationships (319)].\n", 1047 | "\n", 1048 | "## Cohere's NLP models in Amazon Bedrock\n", 1049 | "\n", 1050 | "Cohere offers natural language processing models that are part of the foundation models available through Amazon Bedrock. These models enhance the service's capabilities in processing and understanding human language, making it a valuable tool for businesses and developers seeking to implement NLP solutions. Cohere's involvement underscores the diversity of AI models supported by Amazon Bedrock [Data: Entities (269); Relationships (320)].\n", 1051 | "\n", 1052 | "## Integration with AWS Lambda and AWS SageMaker\n", 1053 | "\n", 1054 | "Amazon Bedrock's integration with AWS Lambda and AWS SageMaker is a key feature that supports the deployment and management of AI models. AWS Lambda provides serverless computing capabilities, allowing for efficient resource management, while AWS SageMaker offers tools for building, training, and deploying machine learning models. This integration facilitates a seamless workflow for AI model development and deployment, enhancing the overall efficiency and scalability of AI applications [Data: Entities (276, 277); Relationships (326, 327)].\n" 1055 | ] 1056 | } 1057 | ], 1058 | "source": [ 1059 | "print(community_reports[\"full_content\"][0])" 1060 | ] 1061 | }, 1062 | { 1063 | "cell_type": "code", 1064 | "execution_count": 82, 1065 | "id": "42c3e76b-e59f-4577-b1a0-3b07f0ebb499", 1066 | "metadata": {}, 1067 | "outputs": [ 1068 | { 1069 | "name": "stdout", 1070 | "output_type": "stream", 1071 | "text": [ 1072 | "The community is centered around Amazon Bedrock, a service by AWS that facilitates access to foundation models from various AI innovators. 
Key entities include AI21 Labs, Anthropic, Cohere, Mistral AI, and Stability AI, all of which provide models accessible through Amazon Bedrock. The service integrates with AWS infrastructure, including AWS Lambda and AWS SageMaker, to support scalable AI model deployment.\n" 1073 | ] 1074 | } 1075 | ], 1076 | "source": [ 1077 | "print(community_reports[\"summary\"][0])" 1078 | ] 1079 | }, 1080 | { 1081 | "cell_type": "markdown", 1082 | "id": "fb5243d1-e1d2-4525-ab3c-e1457db2eea7", 1083 | "metadata": {}, 1084 | "source": [ 1085 | "### The Final Graph!\n", 1086 | "\n", 1087 | "\n", 1088 | "\n", 1089 | "*[Full Size PDF](./ghraphrag_viz.pdf)*" 1090 | ] 1091 | }, 1092 | { 1093 | "cell_type": "markdown", 1094 | "id": "172bf775-ee3b-4d4a-a973-317f1681b8af", 1095 | "metadata": {}, 1096 | "source": [ 1097 | "---\n", 1098 | "\n", 1099 | "## GraphRAG Retrieval\n", 1100 | "\n", 1101 | "\n", 1102 | "\n", 1103 | "*[Unifying Large Language Models and Knowledge Graphs: A Roadmap](https://arxiv.org/pdf/2306.08302)*\n", 1104 | "\n", 1105 | "With our knowledge graph constructed and hierarchical communities delineated, we can now perform multiple types of search that take advantage of both the graph structure and the multiple levels of specificity across our communities. Specifically:\n", 1106 | "\n", 1107 | "1. **Global Search**: Uses the LLM-generated community reports from a specified level of the graph's community hierarchy as context data to generate a response.\n", 1108 | "2. **Local Search**: Combines structured data from the knowledge graph with unstructured data from the input document(s) to augment the LLM context with relevant entity information.\n", 1109 | "3. **Drift Search**: Dynamic Reasoning and Inference with Flexible Traversal, an approach that augments local search queries by including community information in the search process, thus combining global and local search." 1110 | ] 1111 | }, 1112 | { 1113 | "cell_type": "markdown", 1114 | "id": "f14a5823-048b-4a54-9da5-51305b8a1c7a", 1115 | "metadata": {}, 1116 | "source": [ 1117 | "**GraphRAG Retrieval Function**\n", 1118 | "\n", 1119 | "*Note: Wrapping the [GraphRAG CLI tool](https://microsoft.github.io/graphrag/cli/) as a function here instead of using their [library](https://microsoft.github.io/graphrag/examples_notebooks/api_overview/) for an easier example. 
As such, the notebook needs to be running in the same GraphRAG environment/kernel.*" 1120 | ] 1121 | }, 1122 | { 1123 | "cell_type": "code", 1124 | "execution_count": 84, 1125 | "id": "ae01778e-518f-42fe-929e-dd2ef63a8753", 1126 | "metadata": {}, 1127 | "outputs": [], 1128 | "source": [ 1129 | "import subprocess\n", 1130 | "import shlex\n", 1131 | "from typing import Optional\n", 1132 | "\n", 1133 | "def query_graphrag(\n", 1134 | " query: str,\n", 1135 | " method: str = \"global\",\n", 1136 | " root_path: str = \"./ragtest\",\n", 1137 | " timeout: Optional[int] = None,\n", 1138 | " community_level: int = 2,\n", 1139 | " dynamic_community_selection: bool = False\n", 1140 | ") -> str:\n", 1141 | " \"\"\"\n", 1142 | " Execute a GraphRAG query using the CLI tool.\n", 1143 | " \n", 1144 | " Args:\n", 1145 | " query (str): The query string to process\n", 1146 | " method (str): Query method (e.g., \"global\", \"local\", or \"drift\")\n", 1147 | " root_path (str): Path to the root directory\n", 1148 | " timeout (int, optional): Timeout in seconds for the command\n", 1149 | " community_level (int): The community level in the Leiden community hierarchy (default: 2)\n", 1150 | " dynamic_community_selection (bool): Whether to use global search with dynamic community selection (default: False)\n", 1151 | " \n", 1152 | " Returns:\n", 1153 | " str: The output from GraphRAG\n", 1154 | " \n", 1155 | " Raises:\n", 1156 | " subprocess.CalledProcessError: If the command fails\n", 1157 | " subprocess.TimeoutExpired: If the command times out\n", 1158 | " ValueError: If community_level is negative\n", 1159 | " \"\"\"\n", 1160 | " # Validate community level\n", 1161 | " if community_level < 0:\n", 1162 | " raise ValueError(\"Community level must be non-negative\")\n", 1163 | " \n", 1164 | " # Construct the base command\n", 1165 | " command = [\n", 1166 | " 'graphrag', 'query',\n", 1167 | " '--root', root_path,\n", 1168 | " '--method', method,\n", 1169 | " '--query', query,\n", 1170 | " '--community-level', str(community_level)\n", 1171 | " ]\n", 1172 | " \n", 1173 | " # Add dynamic community selection flag if enabled\n", 1174 | " if dynamic_community_selection:\n", 1175 | " command.append('--dynamic-community-selection')\n", 1176 | " \n", 1177 | " try:\n", 1178 | " # Execute the command and capture output\n", 1179 | " result = subprocess.run(\n", 1180 | " command,\n", 1181 | " capture_output=True,\n", 1182 | " text=True,\n", 1183 | " timeout=timeout\n", 1184 | " )\n", 1185 | " \n", 1186 | " # Check if the command was successful\n", 1187 | " result.check_returncode()\n", 1188 | " \n", 1189 | " return result.stdout.strip()\n", 1190 | " \n", 1191 | " except subprocess.CalledProcessError as e:\n", 1192 | " error_message = f\"Command failed with exit code {e.returncode}\\nError: {e.stderr}\"\n", 1193 | " raise subprocess.CalledProcessError(\n", 1194 | " e.returncode,\n", 1195 | " e.cmd,\n", 1196 | " output=e.output,\n", 1197 | " stderr=error_message\n", 1198 | " )" 1199 | ] 1200 | }, 1201 | { 1202 | "cell_type": "markdown", 1203 | "id": "c02da6ea-e0ad-41d4-982a-f9058184add0", 1204 | "metadata": {}, 1205 | "source": [ 1206 | "### Local Search\n", 1207 | "\n", 1208 | "\n", 1209 | "\n", 1210 | "The GraphRAG approach to local search is the most similar to regular semantic RAG search. It combines structured data from the knowledge graph with unstructured data from the input documents to augment the LLM context with relevant entity information. 
In essence, we first use semantic search to find the entities most relevant to the query. These become the entry points on our graph that we can then traverse. Starting at these points, we look at connected chunks of text, community reports, other entities, and the relationships between them. All of the retrieved data is then filtered and ranked to fit into a pre-defined context window." 1211 | ] 1212 | }, 1213 | { 1214 | "cell_type": "code", 1215 | "execution_count": 88, 1216 | "id": "ad477f20-1019-4da6-a70a-9cc9e362fb03", 1217 | "metadata": {}, 1218 | "outputs": [ 1219 | { 1220 | "name": "stderr", 1221 | "output_type": "stream", 1222 | "text": [ 1223 | "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n", 1224 | "To disable this warning, you can either:\n", 1225 | "\t- Avoid using `tokenizers` before the fork if possible\n", 1226 | "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n" 1227 | ] 1228 | }, 1229 | { 1230 | "name": "stdout", 1231 | "output_type": "stream", 1232 | "text": [ 1233 | "Query result:\n", 1234 | "INFO: Vector Store Args: {\n", 1235 | "    \"type\": \"lancedb\",\n", 1236 | "    \"db_uri\": \"/Users/adamlucek/Desktop/github/GraphRAG/ragtest/output/lancedb\",\n", 1237 | "    \"container_name\": \"==== REDACTED ====\",\n", 1238 | "    \"overwrite\": true\n", 1239 | "}\n", 1240 | "creating llm client with {'api_key': 'REDACTED,len=51', 'type': \"openai_chat\", 'encoding_model': 'cl100k_base', 'model': 'gpt-4o', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'audience': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25, 'responses': None}\n", 1241 | "creating embedding llm client with {'api_key': 'REDACTED,len=51', 'type': \"openai_embedding\", 'encoding_model': 'cl100k_base', 'model': 'text-embedding-3-small', 'max_tokens': 4000, 'temperature': 0, 'top_p': 1, 'n': 1, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'audience': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25, 'responses': None}\n", 1242 | "\n", 1243 | "SUCCESS: Local Search Response:\n", 1244 | "When a company is deciding between Retrieval-Augmented Generation (RAG), fine-tuning, and various Parameter-Efficient Fine-Tuning (PEFT) approaches, several factors must be considered, including the specific application requirements, available resources, and desired outcomes. Each method offers distinct advantages and is suited to different scenarios.\n", 1245 | "\n", 1246 | "### Retrieval-Augmented Generation (RAG)\n", 1247 | "\n", 1248 | "RAG is particularly beneficial for applications that require integrating external data to enhance the accuracy and relevance of generated content. It is ideal for scenarios where the model needs to access up-to-date or domain-specific information without extensive retraining. 
For instance, RAG is effectively used in question-and-answer systems, customer support automation, and summarization tasks, where the ability to retrieve and incorporate external data can significantly improve performance [Data: Reports (20); Entities (20); Relationships (14, 21, 22, 23)]. Companies might choose RAG when they need to maintain high precision and adaptability in dynamic environments, such as customer service or technical support, where the context and information are constantly evolving.\n", 1249 | "\n", 1250 | "### Fine-Tuning\n", 1251 | "\n", 1252 | "Fine-tuning involves adapting a pre-trained model to perform better on specific tasks by updating its parameters with domain-specific data. This approach is essential when a company needs to tailor a model to a particular application or domain, ensuring that it can handle specific tasks with high accuracy. Fine-tuning is a comprehensive process that includes stages such as dataset preparation, model initialization, and evaluation, making it suitable for applications where precise model adaptation is crucial [Data: Reports (19); Entities (17); Relationships (10, 12, 24, 27, 29, 274, 275, 276, 277); Sources (3, 12, 13)]. Companies might opt for fine-tuning when they have the resources to manage the computational demands and when the task requires a high degree of customization.\n", 1253 | "\n", 1254 | "### Parameter-Efficient Fine-Tuning (PEFT)\n", 1255 | "\n", 1256 | "PEFT methods, such as LoRA and QLoRA, offer a more resource-efficient alternative to traditional fine-tuning by updating only a small subset of model parameters. This approach is particularly advantageous in scenarios where computational resources are limited, as it reduces the memory and processing requirements while maintaining performance levels comparable to full fine-tuning [Data: Entities (25, 104, 110, 130); Relationships (24, 137, 140, 104); Sources (6, 15)]. PEFT is suitable for companies looking to optimize the financial and environmental costs associated with fine-tuning large language models, especially in low-data scenarios or when deploying models on less powerful hardware.\n", 1257 | "\n", 1258 | "### Decision-Making Considerations\n", 1259 | "\n", 1260 | "In choosing between these methods, companies should evaluate the specific needs of their application, the availability of computational resources, and the importance of model adaptability and precision. RAG is ideal for applications requiring real-time data integration, fine-tuning is best for highly customized tasks, and PEFT is suitable for resource-constrained environments. By aligning the choice of method with their strategic goals and operational constraints, companies can effectively enhance their AI capabilities and achieve desired outcomes.\n" 1261 | ] 1262 | } 1263 | ], 1264 | "source": [ 1265 | "result = query_graphrag(\n", 1266 | "    query=\"How does a company choose between RAG, fine-tuning, and different PEFT approaches?\",\n", 1267 | "    method=\"local\"\n", 1268 | ")\n", 1269 | "print(\"Query result:\")\n", 1270 | "print(result)" 1271 | ] 1272 | }, 1273 | { 1274 | "cell_type": "markdown", 1275 | "id": "6b8368cd-25bb-4460-802f-8ed76a39bb2d", 1276 | "metadata": {}, 1277 | "source": [ 1278 | "### Global Search\n", 1279 | "\n", 1280 | "\n", 1281 | "\n", 1282 | "Through the semantic clustering of communities during the indexing process outlined above, we created community reports as summaries of high-level themes across these groupings. 
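Conceptually, assembling the candidate context for a global query starts as a simple filter over these reports by hierarchy level. A minimal sketch, assuming the `community_reports` DataFrame loaded earlier carries a `level` column alongside `summary` (as GraphRAG's community reports output does):

```python
# Collect the community report summaries at one hierarchy level; these are
# the candidate context for a global query at that level.
level = 2
reports_at_level = community_reports[community_reports["level"] == level]
candidate_context = reports_at_level["summary"].tolist()
print(f"{len(candidate_context)} community summaries at level {level}")
```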
Having this community summary data at various levels allows us to do something that traditional RAG performs poorly at: answering queries about broad themes and ideas across our unstructured data.\n", 1283 | "\n", 1284 | "To capture as much broad information as possible in an efficient manner, GraphRAG implements a [map reduce](https://en.wikipedia.org/wiki/MapReduce) approach. Given a query, relevant community node reports at a specific hierarchical level are retrieved. These are shuffled and chunked, and each chunk is used to generate a list of points, each with its own \"importance score\". These intermediate points are ranked and filtered to keep the most important ones; the survivors form an aggregate intermediate response, which is passed to the LLM as the context for the final response." 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "code", 1289 | "execution_count": 86, 1290 | "id": "77b59560-c02f-4c2e-a969-8042caef03bf", 1291 | "metadata": {}, 1292 | "outputs": [ 1293 | { 1294 | "name": "stderr", 1295 | "output_type": "stream", 1296 | "text": [ 1297 | "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n", 1298 | "To disable this warning, you can either:\n", 1299 | "\t- Avoid using `tokenizers` before the fork if possible\n", 1300 | "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n" 1301 | ] 1302 | }, 1303 | { 1304 | "name": "stdout", 1305 | "output_type": "stream", 1306 | "text": [ 1307 | "Query result:\n", 1308 | "creating llm client with {'api_key': 'REDACTED,len=51', 'type': \"openai_chat\", 'encoding_model': 'cl100k_base', 'model': 'gpt-4o', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'audience': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25, 'responses': None}\n", 1309 | "\n", 1310 | "SUCCESS: Global Search Response:\n", 1311 | "### Choosing Between RAG, Fine-Tuning, and PEFT Approaches\n", 1312 | "\n", 1313 | "When a company is deciding between Retrieval-Augmented Generation (RAG), fine-tuning, and Parameter-Efficient Fine-Tuning (PEFT) approaches, several key factors must be considered. These factors include the specific requirements of the application, the need for external data integration, computational resource constraints, and the desired level of model adaptation and performance.\n", 1314 | "\n", 1315 | "#### Application Requirements\n", 1316 | "\n", 1317 | "The choice largely depends on the specific needs of the application. For instance, if the application requires high precision and adaptability, such as in customer support or information retrieval systems, RAG may be the preferred choice. RAG enhances large language models by integrating external data, allowing for more accurate and contextually relevant content generation without extensive fine-tuning [Data: Reports (19, 20, 35, 21, 36, 38, 60)].\n", 1318 | "\n", 1319 | "#### Computational Resources\n", 1320 | "\n", 1321 | "Computational resources are another critical consideration. 
Fine-tuning involves further training pre-trained models to improve their performance on specific tasks by updating the model's parameters using a smaller, domain-specific dataset. This process can be resource-intensive but is essential for adapting models to targeted applications and domains [Data: Reports (19, 45, 31, 60)].\n", 1322 | "\n", 1323 | "In contrast, PEFT approaches, such as LoRA and QLoRA, are designed to enhance memory efficiency and reduce computational costs during the fine-tuning process. These techniques allow for fine-tuning on less powerful hardware while maintaining performance levels comparable to traditional methods. PEFT is particularly useful for fine-tuning multimodal models and deploying NLP models on mobile devices [Data: Reports (36, 38, 19, 35)].\n", 1324 | "\n", 1325 | "#### Model Customization and Contextual Relevance\n", 1326 | "\n", 1327 | "The desired level of model customization and contextual relevance also plays a significant role. Fine-tuning is crucial for enhancing the performance of large language models (LLMs) on specific tasks, addressing challenges such as scalability, memory requirements, and resource efficiency. It allows for the customization of models to meet specialized needs and ensures they perform optimally in their intended environments [Data: Reports (45, 31, 60)].\n", 1328 | "\n", 1329 | "RAG systems, on the other hand, improve the accuracy and contextuality of the outputs produced by LLMs, making them suitable for applications where context is crucial [Data: Reports (20, 21, 53)].\n", 1330 | "\n", 1331 | "### Conclusion\n", 1332 | "\n", 1333 | "Ultimately, the decision between RAG, fine-tuning, and PEFT approaches should be guided by the specific needs and constraints of the company, such as the importance of context relevance, the need for task-specific optimization, and resource availability. By carefully evaluating these factors, companies can select the most appropriate method to achieve their desired outcomes [Data: Reports (19, 20, 35, 21, 36, 38, 60, 53, 45)].\n" 1334 | ] 1335 | } 1336 | ], 1337 | "source": [ 1338 | "result = query_graphrag(\n", 1339 | "    query=\"How does a company choose between RAG, fine-tuning, and different PEFT approaches?\",\n", 1340 | "    method=\"global\"\n", 1341 | ")\n", 1342 | "print(\"Query result:\")\n", 1343 | "print(result)" 1344 | ] 1345 | }, 1346 | { 1347 | "cell_type": "markdown", 1348 | "id": "c35d9948-ba7e-49af-ac49-62b835d62174", 1349 | "metadata": {}, 1350 | "source": [ 1351 | "### DRIFT Search\n", 1352 | "\n", 1353 | "\n", 1354 | "\n", 1355 | "[Dynamic Reasoning and Inference with Flexible Traversal](https://www.microsoft.com/en-us/research/blog/introducing-drift-search-combining-global-and-local-search-methods-to-improve-quality-and-efficiency/), or DRIFT, is a novel GraphRAG concept introduced by Microsoft: an approach to local search that incorporates community information into the search process.\n", 1356 | "\n", 1357 | "The user's query is first processed through [Hypothetical Document Embedding (HyDE)](https://arxiv.org/pdf/2212.10496), which generates a hypothetical document on the query's topic, written in the style of documents already in the graph. This document is embedded and used for semantic retrieval of the top-k relevant community reports. From these matches, we generate an initial answer along with several follow-up questions, as a lightweight version of global search. 
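As a rough, runnable toy of that first phase (the fake LLM and report store below are invented stand-ins, not Microsoft's implementation):

```python
def fake_llm(prompt: str) -> str:
    # Stand-in for a chat model call.
    return f"[LLM output for: {prompt[:60]}...]"

REPORTS = [
    "Community 19: fine-tuning pipelines and stages.",
    "Community 20: retrieval-augmented generation systems.",
    "Community 36: parameter-efficient fine-tuning methods.",
]

def top_k_reports(text: str, k: int = 2) -> list[str]:
    # Stand-in for embedding similarity over community reports.
    def score(report: str) -> int:
        return sum(word.lower() in report.lower() for word in text.split())
    return sorted(REPORTS, key=score, reverse=True)[:k]

def drift_primer(query: str) -> dict:
    # 1. HyDE: draft a hypothetical document that could answer the query.
    hyde_doc = fake_llm(f"Write a passage answering: {query}")
    # 2. Retrieve the top-k community reports against that document.
    reports = top_k_reports(hyde_doc + " " + query)
    # 3. Generate an initial answer plus follow-up questions from them.
    answer = fake_llm(f"Answer '{query}' using: {reports}")
    follow_ups = [f"Follow-up question about {r.split(':')[0]}" for r in reports]
    return {"initial_answer": answer, "follow_up_questions": follow_ups}

print(drift_primer("How do PEFT methods compare to full fine-tuning?"))
```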
Microsoft refers to this first phase as the primer.\n", 1358 | "\n", 1359 | "Once this primer phase is complete, we execute local searches for each follow-up question generated. Each local search produces both intermediate answers and new follow-up questions, creating a refinement loop. This loop runs for two iterations (Microsoft notes planned future research into reward functions for smarter termination). What makes these local searches unique is that they are informed by both community-level knowledge and detailed entity/relationship data. This allows the DRIFT process to find relevant information even when the initial query diverges from the indexing persona, and to adapt its approach based on information that emerges during the search.\n", 1360 | "\n", 1361 | "The final output is structured as a hierarchy of questions and answers, ranked by their relevance to the original query. Map reduce is applied again, weighting all intermediate answers equally, and the result is passed to the language model for the final response. DRIFT cleverly combines global and local search with guided exploration to provide both broad context and specific details in its responses." 1362 | ] 1363 | }, 1364 | { 1365 | "cell_type": "code", 1366 | "execution_count": 89, 1367 | "id": "0ffcac67-ab84-4dd7-bf7e-cf4b2a4e7cbd", 1368 | "metadata": {}, 1369 | "outputs": [ 1370 | { 1371 | "name": "stderr", 1372 | "output_type": "stream", 1373 | "text": [ 1374 | "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n", 1375 | "To disable this warning, you can either:\n", 1376 | "\t- Avoid using `tokenizers` before the fork if possible\n", 1377 | "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n" 1378 | ] 1379 | }, 1380 | { 1381 | "name": "stdout", 1382 | "output_type": "stream", 1383 | "text": [ 1384 | "Query result:\n", 1385 | "INFO: Vector Store Args: {\n", 1386 | "    \"type\": \"lancedb\",\n", 1387 | "    \"db_uri\": \"/Users/adamlucek/Desktop/github/GraphRAG/ragtest/output/lancedb\",\n", 1388 | "    \"container_name\": \"==== REDACTED ====\",\n", 1389 | "    \"overwrite\": true\n", 1390 | "}\n", 1391 | "creating llm client with {'api_key': 'REDACTED,len=51', 'type': \"openai_chat\", 'encoding_model': 'cl100k_base', 'model': 'gpt-4o', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'audience': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25, 'responses': None}\n", 1392 | "creating embedding llm client with {'api_key': 'REDACTED,len=51', 'type': \"openai_embedding\", 'encoding_model': 'cl100k_base', 'model': 'text-embedding-3-small', 'max_tokens': 4000, 'temperature': 0, 'top_p': 1, 'n': 1, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'audience': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25, 'responses': None}\n", 1393 | "\n", 1394 | "SUCCESS: DRIFT Search Response:\n", 1395 | "# Understanding the Choice 
Between RAG, Fine-Tuning, and PEFT Approaches\n", 1396 | "\n", 1397 | "When a company is faced with the decision to choose between Retrieval-Augmented Generation (RAG), fine-tuning, and various Parameter-Efficient Fine-Tuning (PEFT) approaches, several factors must be taken into account, determined by the specific application requirements and resource availability.\n", 1398 | "\n", 1399 | "## Retrieving Capabilities with RAG\n", 1400 | "RAG is particularly advantageous in scenarios where it is crucial to integrate external information into language models for enhanced accuracy and relevance. It is highly recommended for applications like question and answer systems, customer support, and summarization tasks, where real-time context retrieval from large datasets improves response precision without needing extensive and expensive model retraining. The strength of RAG lies in its ability to quickly adapt to various knowledge domains by leveraging existing data without modifying the underlying language model significantly.\n", 1401 | "\n", 1402 | "## The Focus and Depth of Fine-Tuning\n", 1403 | "Traditional fine-tuning is the go-to choice when a clear, defined improvement in task performance is needed using domain-specific datasets. It involves re-calibrating a pre-trained model’s parameters to cater precisely to industry-specific or task-specific requirements. This approach can enhance performance metrics significantly but incurs high computational costs and demands substantial amounts of fine-tuning data, making it ideal for well-funded projects aiming for precise task optimization.\n", 1404 | "\n", 1405 | "## Resource-Efficiency with PEFT Techniques\n", 1406 | "On the other end of the spectrum, PEFT approaches such as DORA and LoRA present themselves as highly resource-efficient alternatives to standard fine-tuning. They are suitable for companies with limited computational resources but still prioritize performance enhancements. By selectively adjusting additional smaller model parameters, PEFT methods allow substantial flexibility and significant model adaptability for various applications without the full overhead of conventional fine-tuning.\n", 1407 | "\n", 1408 | "Ultimately, the decision for a company involves balancing resource availability (both in terms of data and computing power), desired performance efficiency, and the specific features of the task at hand. Companies with broad application needs across different domains might favor RAG's flexibility; domain-specific improvements would benefit more from deep fine-tuning, while constrained environments should consider PEFT methods for their adaptability and efficiency.\n", 1409 | "\n", 1410 | "# Understanding the Choice Between RAG, Fine-Tuning, and PEFT Approaches\n", 1411 | "\n", 1412 | "When choosing between Retrieval-Augmented Generation (RAG), fine-tuning, and Parameter-Efficient Fine-Tuning (PEFT) approaches, businesses must evaluate their specific needs in terms of model performance, resource constraints, and flexibility. Each approach involves distinct methods for optimally leveraging machine learning models and NLP systems.\n", 1413 | "\n", 1414 | "**Retrieval-Augmented Generation (RAG):** RAG is a strategy combining retrieval and generative models to enhance the context provided to language models like GPT-3. This approach allows models to answer queries more accurately by accessing external knowledge bases. It is particularly beneficial when dealing with tasks requiring extensive context or up-to-date information. 
Companies would favor RAG when the model needs to maintain relevance in dynamic information environments or when the contextual accuracy is crucial.\n", 1415 | "\n", 1416 | "**Fine-Tuning:** This traditional method involves adjusting the entire model's parameters to optimize performance on a specific task. Fine-tuning is often resource-intensive, requiring significant computational power and time, which can prove a bottleneck for large models or organizations with limited resources. It produces highly task-specific outcomes, making it an ideal choice when high accuracy is paramount and resources are abundant. Fine-tuning is suitable for specialized tasks where the nuances of data need careful embedding into the model.\n", 1417 | "\n", 1418 | "**Parameter-Efficient Fine-Tuning (PEFT):** Techniques like QLoRA and LoRA are designed to address the resource constraints associated with fine-tuning. By adjusting only a subset of parameters, these methods provide a more efficient approach, maintaining model performance while requiring less computational power. Companies might choose PEFT when deploying large models on resource-constrained devices, like mobile platforms, or when exploring multiple tasks with shared resources. PEFT's importance is growing in real-world applications where cost and efficiency are central concerns.\n", 1419 | "\n", 1420 | "In summary, choosing between these approaches depends on an organization’s priorities: RAG offers dynamic and contextual adaptability; fine-tuning promises high precision at the cost of resources; and PEFT balances resource demands with performance, making it suitable for scalable deployment. Organizations are encouraged to deeply understand their specific contextual and technical requirements to select the most appropriate approach.\n", 1421 | "\n", 1422 | "## How Companies Choose Between RAG, Fine-Tuning, and PEFT Approaches\n", 1423 | "\n", 1424 | "When deciding between Retrieval-Augmented Generation (RAG), fine-tuning, and Parameter-Efficient Fine-Tuning (PEFT) techniques, companies must consider a variety of factors related to their unique needs and resources. Each approach serves distinct purposes and offers specific advantages, making them suitable for different scenarios.\n", 1425 | "\n", 1426 | "### Retrieval-Augmented Generation (RAG)\n", 1427 | "RAG is particularly beneficial when a company requires a model to access a vast external data set to create highly contextual and informed responses. This method is preferred when the task involves processing or retrieving large amounts of dynamic information. The power of RAG lies in its ability to provide more comprehensive answers by leveraging external data sources, enhancing the model's output quality. As such, organizations where information retrieval and integration are crucial components of the application, such as in customer support systems or knowledge-based industries, may find RAG an optimal choice.\n", 1428 | "\n", 1429 | "### Fine-Tuning\n", 1430 | "Fine-tuning a large language model involves adjusting its parameters based on specific tasks or domain data to improve its performance on said tasks. This method is ideal for companies aiming to develop a bespoke solution tailored to precise needs or niche applications. Fine-tuning a model can significantly increase its utility in specialized fields such as legal research or medical analysis, where the accuracy and specificity of information are critical. 
Though resource-intensive, the specificity in outcome makes traditional fine-tuning valuable for businesses that require high degrees of customization.\n", 1431 | "\n", 1432 | "### Parameter-Efficient Fine-Tuning (PEFT)\n", 1433 | "PEFT techniques, such as LoRA and adapter-based methods, offer a way to enhance large language models without the typical computational cost associated with traditional fine-tuning. This strategy updates only a small subset of a model's parameters, thus making it a cost-effective alternative for companies with limited resources or those that need to frequently update models with new data. PEFT is advantageous for businesses that need flexibility without a significant investment in computational power, popular among startups or organizations exploring new avenues without extensive AI infrastructure.\n", 1434 | "\n", 1435 | "In summary, a company's choice between RAG, fine-tuning, and PEFT depends on their specific requirements, the nature and volume of data they interact with, and the computational resources at their disposal. By aligning their strategy with the strengths of each approach, businesses can optimize their use of AI technologies to meet their unique operational goals.\n", 1436 | "\n", 1437 | "# Evaluating Company Choices Among RAG, Fine-Tuning, and PEFT Approaches\n", 1438 | "\n", 1439 | "When a company considers enhancing its machine learning models, particularly large language models (LLMs), it encounters a range of techniques, including Retrieval-Augmented Generation (RAG), fine-tuning, and Parameter-Efficient Fine-Tuning (PEFT) methods like Half Fine-Tuning (HFT). Each of these techniques offers unique benefits and is suitable for different objectives and scenarios. Understanding the distinction and potential applications is crucial for making an informed decision.\n", 1440 | "\n", 1441 | "RAG is highly effective in situations where models need to provide accurate responses based on external information not inherently within the model's parameters. By integrating information retrieval techniques, RAG allows models to access a database or document store to augment their generated text, making it ideal for applications where dynamic content access and up-to-date information are critical, such as live customer service or research queries.\n", 1442 | "\n", 1443 | "In contrast, fine-tuning broadly involves adjusting an already pre-trained model to suit new, specific tasks or datasets. Traditional supervised fine-tuning is useful when the primary goal is to improve a model’s performance on specific tasks while leveraging existing knowledge. For example, in supervised fine-tuning, Half Fine-Tuning (HFT) offers an innovative balance by freezing a portion of the model's parameters, thus maintaining foundational knowledge while developing new competencies. This efficiency is especially important when handling models like the LLAMA 2-7B, where sustaining knowledge over several iterations is vital.\n", 1444 | "\n", 1445 | "PEFT techniques, such as HFT, provide a focused approach to fine-tuning by optimizing resource usage and reducing computational demands. 
They ensure that changes to the model preserve core functionalities while adapting to new requirements, which is critically important in environments where computational resources and model interpretability are constrained.\n", 1446 | "\n", 1447 | "The choice among RAG, fine-tuning, and PEFT approaches depends on factors such as the company's resource availability, existing model performance, specific use-cases, and the need for adaptability versus resource conservation. Companies prioritizing task-specific improvements with limited computational power might lean toward PEFT or HFT, whereas RAG would be preferable for scenarios needing comprehensive and current information access capabilities.\n", 1448 | "\n", 1449 | "Ultimately, the decision must align with strategic objectives, model lifecycle considerations, and the specific complexities of the data and questions the model is expected to handle.\n", 1450 | "\n", 1451 | "# Factors in Choosing Between RAG, Fine-Tuning, and PEFT Approaches\n", 1452 | "\n", 1453 | "When a company is deciding between Retrieval-Augmented Generation (RAG), fine-tuning, and Parameter-Efficient Fine-Tuning (PEFT) methods for large language models (LLMs), several important factors must be considered. These methods each have unique advantages and challenges that make them suitable for different scenarios.\n", 1454 | "\n", 1455 | "Retrieval-Augmented Generation (RAG) enhances the performance of LLMs by integrating retrieval mechanisms that bring relevant information into the context before generating results. This approach is particularly useful when the task requires the contextually relevant retrieval of information from extensive databases or when the generation task is highly dependent on external knowledge. RAG systems improve both the accuracy and the contextual relevance of LLM outputs, making them valuable for applications where precision in response to contextual queries is critical.\n", 1456 | "\n", 1457 | "Fine-tuning involves adjusting the LLM using a carefully prepared dataset tailored to a specific task. This process includes essential stages such as dataset preparation, instruction tuning, and topic mapping. Fine-tuning is especially advantageous when the intention is to specialize a model for a specific application, ensuring nuanced comprehension and performance improvement for the target task. It provides more control over the model's behavior, as specific datasets align the model with intended outcomes, offering enhanced reliability.\n", 1458 | "\n", 1459 | "Parameter-Efficient Fine-Tuning (PEFT) approaches are designed to fine-tune LLMs with fewer resource requirements compared to traditional fine-tuning. Methods like LoRA (Low-Rank Adaptation), Adapter modules, and prefix tuning are examples of PEFT techniques. They focus on adapting a small set of parameters while maintaining the majority of the LLM fixed, thus reducing computational costs and the necessary dataset size. 
PEFT is particularly beneficial for scenarios with constrained computational resources or when rapidly deploying updates or adaptations is necessary without retraining the entire model.\n", 1460 | "\n", 1461 | "Ultimately, the choice between these approaches depends on the specific requirements and constraints of the task at hand, including the need for real-time retrieval capabilities (favoring RAG), the specificity and quality of output (favoring fine-tuning), or capacity and resource efficiency (favoring PEFT).\n" 1462 | ] 1463 | } 1464 | ], 1465 | "source": [ 1466 | "result = query_graphrag(\n", 1467 | "    query=\"How does a company choose between RAG, fine-tuning, and different PEFT approaches?\",\n", 1468 | "    method=\"drift\"\n", 1469 | ")\n", 1470 | "print(\"Query result:\")\n", 1471 | "print(result)" 1472 | ] 1473 | }, 1474 | { 1475 | "cell_type": "markdown", 1476 | "id": "7b2e04a5-aa55-4955-b2c0-b86dd462f549", 1477 | "metadata": {}, 1478 | "source": [ 1479 | "---\n", 1480 | "\n", 1481 | "## Comparing to Regular Vector Database Retrieval\n", 1482 | "\n", 1483 | "\n", 1484 | " \n", 1485 | "To give some comparison, let's look back at traditional chunking, embedding, and similarity-retrieval RAG." 1486 | ] 1487 | }, 1488 | { 1489 | "cell_type": "markdown", 1490 | "id": "8e12fa20-81d9-43cc-a641-6e8ba7cb5e9b", 1491 | "metadata": {}, 1492 | "source": [ 1493 | "**Instantiate our Database**\n", 1494 | "\n", 1495 | "For this we'll be using [ChromaDB](https://www.trychroma.com) with the same chunks that were loaded into our graph." 1496 | ] 1497 | }, 1498 | { 1499 | "cell_type": "code", 1500 | "execution_count": null, 1501 | "id": "a361a631-4e98-47d5-9b2d-48810c2eab41", 1502 | "metadata": {}, 1503 | "outputs": [], 1504 | "source": [ 1505 | "import chromadb\n", 1506 | "\n", 1507 | "chroma_client = chromadb.PersistentClient(path=\"./notebook/chromadb\")\n", 1508 | "paper_collection = chroma_client.get_or_create_collection(name=\"paper_collection\")" 1509 | ] 1510 | }, 1511 | { 1512 | "cell_type": "markdown", 1513 | "id": "6a160165-6ed9-4ac9-be3b-db5bff88beb4", 1514 | "metadata": {}, 1515 | "source": [ 1516 | "**Embed Chunks Into Collection**" 1517 | ] 1518 | }, 1519 | { 1520 | "cell_type": "code", 1521 | "execution_count": null, 1522 | "id": "24350f0f-32ef-41ea-9d7c-5e970ca172f7", 1523 | "metadata": {}, 1524 | "outputs": [], 1525 | "source": [ 1526 | "# Add each chunk to the collection with a stable, position-based id.\n", 1527 | "# enumerate() keeps the ids aligned with chunk order.\n", 1528 | "for i, text in enumerate(texts):\n", 1529 | "    paper_collection.add(\n", 1530 | "        documents=[text],\n", 1531 | "        ids=[f\"chunk_{i}\"]\n", 1532 | "    )" 1533 | ] 1534 | }, 1535 | { 1536 | "cell_type": "markdown", 1537 | "id": "1f3961f0-bdce-4d1d-8d49-86355bb7d505", 1538 | "metadata": {}, 1539 | "source": [ 1540 | "**Retrieval Function**" 1541 | ] 1542 | }, 1543 | { 1544 | "cell_type": "code", 1545 | "execution_count": null, 1546 | "id": "841f2276-fc27-4008-9e63-6ec9d4db66ce", 1547 | "metadata": {}, 1548 | "outputs": [], 1549 | "source": [ 1550 | "def chroma_retrieval(query, num_results=5):\n", 1551 | "    results = paper_collection.query(\n", 1552 | "        query_texts=[query],\n", 1553 | "        n_results=num_results\n", 1554 | "    )\n", 1555 | "    return results" 1556 | ] 1557 | }, 1558 | { 1559 | "cell_type": "markdown", 1560 | "id": "ac2bb61e-5d36-47fc-b2ec-6557293f5ef4", 1561 | "metadata": {}, 1562 | "source": [ 1563 | "**RAG Prompt & Chain**" 1564 | ] 1565 | }, 1566 | { 1567 | "cell_type": "code", 1568 | "execution_count": 90, 1569 | "id": "593a23ff-c34c-47b4-b6d8-c2ca59fad69f", 1570 | "metadata": {}, 1571 | "outputs": [], 1572 | "source": [ 1573 
| "rag_prompt_template = \"\"\"\n", 1574 | "Generate a response of the target length and format that responds to the user's question, summarizing all information in the input data tables appropriate for the response length and format, and incorporating any relevant general knowledge.\n", 1575 | "\n", 1576 | "If you don't know the answer, just say so. Do not make anything up.\n", 1577 | "\n", 1578 | "Do not include information where the supporting evidence for it is not provided.\n", 1579 | "\n", 1580 | "Context: {retrieved_docs}\n", 1581 | "\n", 1582 | "User Question: {query}\n", 1583 | "\n", 1584 | "\"\"\"\n", 1585 | "\n", 1586 | "rag_prompt = ChatPromptTemplate.from_template(rag_prompt_template)\n", 1587 | "\n", 1588 | "rag_chain = rag_prompt | llm | StrOutputParser()" 1589 | ] 1590 | }, 1591 | { 1592 | "cell_type": "markdown", 1593 | "id": "3211bbff-9927-464b-942d-964005707475", 1594 | "metadata": {}, 1595 | "source": [ 1596 | "**RAG Function**" 1597 | ] 1598 | }, 1599 | { 1600 | "cell_type": "code", 1601 | "execution_count": 91, 1602 | "id": "e654479e-0d40-4de3-8d03-1de08b54a3fb", 1603 | "metadata": {}, 1604 | "outputs": [], 1605 | "source": [ 1606 | "def chroma_rag(query):\n", 1607 | " retrieved_docs = chroma_retrieval(query)[\"documents\"][0]\n", 1608 | " response = rag_chain.invoke({\"retrieved_docs\": retrieved_docs, \"query\": query})\n", 1609 | " return response" 1610 | ] 1611 | }, 1612 | { 1613 | "cell_type": "markdown", 1614 | "id": "9fba855e-b29c-4169-8302-b1b5cceee76c", 1615 | "metadata": {}, 1616 | "source": [ 1617 | "**RAG Response**" 1618 | ] 1619 | }, 1620 | { 1621 | "cell_type": "code", 1622 | "execution_count": 92, 1623 | "id": "72201a27-6dac-4f3b-8ff9-717f4998470c", 1624 | "metadata": {}, 1625 | "outputs": [ 1626 | { 1627 | "data": { 1628 | "text/html": [ 1629 | "
HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
1630 |        "
\n" 1631 | ], 1632 | "text/plain": [ 1633 | "HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" 1634 | ] 1635 | }, 1636 | "metadata": {}, 1637 | "output_type": "display_data" 1638 | }, 1639 | { 1640 | "name": "stdout", 1641 | "output_type": "stream", 1642 | "text": [ 1643 | "When choosing between Retrieval-Augmented Generation (RAG), fine-tuning, and different Parameter-Efficient Fine-Tuning (PEFT) approaches, a company should consider several factors:\n", 1644 | "\n", 1645 | "1. **Data Access and Updates**: RAG is preferable for applications requiring access to external data sources or environments where data frequently updates. It provides dynamic data retrieval capabilities and is less prone to generating incorrect information.\n", 1646 | "\n", 1647 | "2. **Model Behavior and Domain-Specific Knowledge**: Fine-tuning is suitable when the model needs to adjust its behavior, writing style, or incorporate domain-specific knowledge. It is effective if there is ample domain-specific, labeled training data available.\n", 1648 | "\n", 1649 | "3. **Resource Constraints and Efficiency**: PEFT approaches like LoRA and DEFT are designed to reduce computational and resource requirements. LoRA focuses on low-rank matrices to reduce memory usage and computational load, while DEFT optimizes the fine-tuning process by focusing on the most critical data samples.\n", 1650 | "\n", 1651 | "4. **Task-Specific Adaptation**: If the goal is to adapt a model for specific tasks with minimal data, PEFT methods like DEFT and adapter-based techniques can be beneficial. They allow for efficient fine-tuning with fewer resources.\n", 1652 | "\n", 1653 | "5. **Transparency and Interpretability**: RAG systems offer more transparency and interpretability in the model’s decision-making process compared to solely fine-tuned models.\n", 1654 | "\n", 1655 | "6. 
**Scalability and Deployment**: For large-scale deployments, PEFT methods can significantly reduce computational costs by focusing on influential data samples and using surrogate models.\n", 1656 | "\n", 1657 | "In summary, the choice depends on the specific needs of the application, such as data access, model behavior, resource availability, and the importance of transparency and scalability.\n" 1658 | ] 1659 | } 1660 | ], 1661 | "source": [ 1662 | "response = chroma_rag(\"How does a company choose between RAG, fine-tuning, and different PEFT approaches?\")\n", 1663 | "print(response)" 1664 | ] 1665 | }, 1666 | { 1667 | "cell_type": "markdown", 1668 | "id": "0c6a630f-3268-45ec-b58f-dbb42106864a", 1669 | "metadata": {}, 1670 | "source": [ 1671 | "---\n", 1672 | "## Discussion\n", 1673 | "\n", 1674 | "**Traditional/Naive RAG:**\n", 1675 | "\n", 1676 | "Benefits:\n", 1677 | "- Simpler implementation and deployment\n", 1678 | "- Works well for straightforward information retrieval tasks\n", 1679 | "- Good at handling unstructured text data\n", 1680 | "- Lower computational overhead\n", 1681 | "\n", 1682 | "Drawbacks:\n", 1683 | "- Loses structural information when chunking documents\n", 1684 | "- Can break up related content during text segmentation\n", 1685 | "- Limited ability to capture relationships between different pieces of information\n", 1686 | "- May struggle with complex reasoning tasks requiring connecting multiple facts\n", 1687 | "- Potential for incomplete or fragmented answers due to chunking boundaries\n", 1688 | "\n", 1689 | "**GraphRAG:**\n", 1690 | "\n", 1691 | "Benefits:\n", 1692 | "- Preserves structural relationships and hierarchies in the knowledge\n", 1693 | "- Better at capturing connections between related information\n", 1694 | "- Can provide more complete and contextual answers\n", 1695 | "- Improved retrieval accuracy by leveraging graph structure\n", 1696 | "- Better supports complex reasoning across multiple facts\n", 1697 | "- Can maintain document coherence better than chunk-based approaches\n", 1698 | "- More interpretable due to explicit knowledge representation\n", 1699 | "\n", 1700 | "Drawbacks:\n", 1701 | "- More complex to implement and maintain\n", 1702 | "- Requires additional processing to construct and update knowledge graphs\n", 1703 | "- Higher computational overhead for graph operations\n", 1704 | "- May require domain expertise to define graph schema/structure\n", 1705 | "- More challenging to scale to very large datasets\n", 1706 | "- Additional storage requirements for graph structure\n", 1707 | "\n", 1708 | "**Key Differentiators:**\n", 1709 | "1. Knowledge Representation: Traditional RAG treats everything as flat text chunks, while GraphRAG maintains structured relationships in a graph format\n", 1710 | "\n", 1711 | "2. Context Preservation: GraphRAG better preserves context and relationships between different pieces of information compared to the chunking approach of traditional RAG\n", 1712 | "\n", 1713 | "3. Reasoning Capability: GraphRAG enables better multi-hop reasoning and connection of related facts through graph traversal, while traditional RAG is more limited to direct retrieval\n", 1714 | "\n", 1715 | "4. 
Answer Quality: GraphRAG tends to produce more complete and coherent answers since it can access related information through graph connections rather than being limited by chunk boundaries\n", 1716 | "\n", 1717 | "The choice between traditional RAG and GraphRAG often depends on the specific use case, with GraphRAG being particularly valuable when maintaining relationships between information is important or when complex reasoning is required. It's also worth noting that GraphRAG approaches still rely on regular embedding and retrieval methods themselves. The two complement each other!" 1718 | ] 1719 | }, 1720 | { 1721 | "cell_type": "code", 1722 | "execution_count": null, 1723 | "id": "6b8c5872-2185-46dc-aaef-2108bc490a80", 1724 | "metadata": {}, 1725 | "outputs": [], 1726 | "source": [] 1727 | } 1728 | ], 1729 | "metadata": { 1730 | "kernelspec": { 1731 | "display_name": "graphrag", 1732 | "language": "python", 1733 | "name": "graphrag" 1734 | }, 1735 | "language_info": { 1736 | "codemirror_mode": { 1737 | "name": "ipython", 1738 | "version": 3 1739 | }, 1740 | "file_extension": ".py", 1741 | "mimetype": "text/x-python", 1742 | "name": "python", 1743 | "nbconvert_exporter": "python", 1744 | "pygments_lexer": "ipython3", 1745 | "version": "3.12.8" 1746 | } 1747 | }, 1748 | "nbformat": 4, 1749 | "nbformat_minor": 5 1750 | } 1751 | -------------------------------------------------------------------------------- /media/basic_retrieval.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/basic_retrieval.png -------------------------------------------------------------------------------- /media/coffee_graph_ex.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/coffee_graph_ex.png -------------------------------------------------------------------------------- /media/communities.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/communities.png -------------------------------------------------------------------------------- /media/drift_search.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/drift_search.png -------------------------------------------------------------------------------- /media/entities.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/entities.png -------------------------------------------------------------------------------- /media/global_search.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/global_search.png -------------------------------------------------------------------------------- /media/graph_building.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/graph_building.png 
-------------------------------------------------------------------------------- /media/graph_start.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/graph_start.png -------------------------------------------------------------------------------- /media/graphrag_data_flow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/graphrag_data_flow.png -------------------------------------------------------------------------------- /media/kg_retrieval.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/kg_retrieval.png -------------------------------------------------------------------------------- /media/leidan.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/leidan.png -------------------------------------------------------------------------------- /media/local_search.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/local_search.png -------------------------------------------------------------------------------- /media/relationship.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/relationship.png -------------------------------------------------------------------------------- /media/table_comp.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ALucek/GraphRAG-Breakdown/bf302b676ea1dce29b8319b5dd8f28509bedce1d/media/table_comp.png --------------------------------------------------------------------------------