├── .github └── pull_request_template.md ├── README-zh.md ├── README.md ├── docs ├── Algorithm.md └── DockerDeployment.md ├── examples ├── openai_README.md └── openai_README_zh.md ├── lightrag ├── api │ ├── README-zh.md │ ├── README.md │ └── docs │ │ └── LightRagWithPostGRESQL.md ├── llm │ └── Readme.md └── tools │ └── lightrag_visualizer │ ├── README-zh.md │ └── README.md └── lightrag_webui └── README.md /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | 10 | 11 | ## Description 12 | 13 | [Briefly describe the changes made in this pull request.] 14 | 15 | ## Related Issues 16 | 17 | [Reference any related issues or tasks addressed by this pull request.] 18 | 19 | ## Changes Made 20 | 21 | [List the specific changes made in this pull request.] 22 | 23 | ## Checklist 24 | 25 | - [ ] Changes tested locally 26 | - [ ] Code reviewed 27 | - [ ] Documentation updated (if necessary) 28 | - [ ] Unit tests added (if applicable) 29 | 30 | ## Additional Notes 31 | 32 | [Add any additional notes or context for the reviewer(s).] 33 | -------------------------------------------------------------------------------- /README-zh.md: -------------------------------------------------------------------------------- 1 | # LightRAG: Simple and Fast Retrieval-Augmented Generation 2 | 3 | LightRAG Diagram 4 | 5 | ## 🎉 新闻 6 | 7 | - [X] [2025.03.18]🎯📢LightRAG现已支持引文功能。 8 | - [X] [2025.02.05]🎯📢我们团队发布了[VideoRAG](https://github.com/HKUDS/VideoRAG),用于理解超长上下文视频。 9 | - [X] [2025.01.13]🎯📢我们团队发布了[MiniRAG](https://github.com/HKUDS/MiniRAG),使用小型模型简化RAG。 10 | - [X] [2025.01.06]🎯📢现在您可以[使用PostgreSQL进行存储](#using-postgresql-for-storage)。 11 | - [X] [2024.12.31]🎯📢LightRAG现在支持[通过文档ID删除](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete)。 12 | - [X] [2024.11.25]🎯📢LightRAG现在支持无缝集成[自定义知识图谱](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#insert-custom-kg),使用户能够用自己的领域专业知识增强系统。 13 | - [X] [2024.11.19]🎯📢LightRAG的综合指南现已在[LearnOpenCV](https://learnopencv.com/lightrag)上发布。非常感谢博客作者。 14 | - [X] [2024.11.11]🎯📢LightRAG现在支持[通过实体名称删除实体](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete)。 15 | - [X] [2024.11.09]🎯📢推出[LightRAG Gui](https://lightrag-gui.streamlit.app),允许您插入、查询、可视化和下载LightRAG知识。 16 | - [X] [2024.11.04]🎯📢现在您可以[使用Neo4J进行存储](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage)。 17 | - [X] [2024.10.29]🎯📢LightRAG现在通过`textract`支持多种文件类型,包括PDF、DOC、PPT和CSV。 18 | - [X] [2024.10.20]🎯📢我们为LightRAG添加了一个新功能:图形可视化。 19 | - [X] [2024.10.18]🎯📢我们添加了[LightRAG介绍视频](https://youtu.be/oageL-1I0GE)的链接。感谢作者! 20 | - [X] [2024.10.17]🎯📢我们创建了一个[Discord频道](https://discord.gg/yF2MmDJyGJ)!欢迎加入分享和讨论!🎉🎉 21 | - [X] [2024.10.16]🎯📢LightRAG现在支持[Ollama模型](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)! 22 | - [X] [2024.10.15]🎯📢LightRAG现在支持[Hugging Face模型](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)! 23 | 24 |
25 | 26 | 算法流程图 27 | 28 | 29 | ![LightRAG索引流程图](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-VectorDB-Json-KV-Store-Indexing-Flowchart-scaled.jpg) 30 | *图1:LightRAG索引流程图 - 图片来源:[Source](https://learnopencv.com/lightrag/)* 31 | ![LightRAG检索和查询流程图](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-Querying-Flowchart-Dual-Level-Retrieval-Generation-Knowledge-Graphs-scaled.jpg) 32 | *图2:LightRAG检索和查询流程图 - 图片来源:[Source](https://learnopencv.com/lightrag/)* 33 | 34 |
35 | 36 | ## 安装 37 | 38 | ### 安装LightRAG服务器 39 | 40 | LightRAG服务器旨在提供Web UI和API支持。Web UI便于文档索引、知识图谱探索和简单的RAG查询界面。LightRAG服务器还提供兼容Ollama的接口,旨在将LightRAG模拟为Ollama聊天模型。这使得AI聊天机器人(如Open WebUI)可以轻松访问LightRAG。 41 | 42 | * 从PyPI安装 43 | 44 | ```bash 45 | pip install "lightrag-hku[api]" 46 | ``` 47 | 48 | * 从源代码安装 49 | 50 | ```bash 51 | # 如有必要,创建Python虚拟环境 52 | # 以可编辑模式安装并支持API 53 | pip install -e ".[api]" 54 | ``` 55 | 56 | ### 安装LightRAG Core 57 | 58 | * 从源代码安装(推荐) 59 | 60 | ```bash 61 | cd LightRAG 62 | pip install -e . 63 | ``` 64 | 65 | * 从PyPI安装 66 | 67 | ```bash 68 | pip install lightrag-hku 69 | ``` 70 | 71 | ## 快速开始 72 | 73 | ### 使用LightRAG服务器 74 | 75 | **有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。** 76 | 77 | ### 使用LightRAG Core 78 | 79 | LightRAG核心功能的示例代码请参见`examples`目录。您还可参照[视频](https://www.youtube.com/watch?v=g21royNJ4fw)视频完成环境配置。若已持有OpenAI API密钥,可以通过以下命令运行演示代码: 80 | 81 | ```bash 82 | ### you should run the demo code with project folder 83 | cd LightRAG 84 | ### provide your API-KEY for OpenAI 85 | export OPENAI_API_KEY="sk-...your_opeai_key..." 86 | ### download the demo document of "A Christmas Carol" by Charles Dickens 87 | curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt 88 | ### run the demo code 89 | python examples/lightrag_openai_demo.py 90 | ``` 91 | 92 | 如需流式响应示例的实现代码,请参阅 `examples/lightrag_openai_compatible_demo.py`。运行前,请确保根据需求修改示例代码中的LLM及嵌入模型配置。 93 | 94 | **注意事项**:在运行demo程序的时候需要注意,不同的测试程序可能使用的是不同的embedding模型,更换不同的embeding模型的时候需要把清空数据目录(`./dickens`),否则层序执行会出错。如果你想保留LLM缓存,可以在清除数据目录是保留`kv_store_llm_response_cache.json`文件。 95 | 96 | ## 使用LightRAG Core进行编程 97 | 98 | ### 一个简单程序 99 | 100 | 以下Python代码片段演示了如何初始化LightRAG、插入文本并进行查询: 101 | 102 | ```python 103 | import os 104 | import asyncio 105 | from lightrag import LightRAG, QueryParam 106 | from lightrag.llm.openai import gpt_4o_mini_complete, gpt_4o_complete, openai_embed 107 | from lightrag.kg.shared_storage import initialize_pipeline_status 108 | from lightrag.utils import setup_logger 109 | 110 | setup_logger("lightrag", level="INFO") 111 | 112 | WORKING_DIR = "./rag_storage" 113 | if not os.path.exists(WORKING_DIR): 114 | os.mkdir(WORKING_DIR) 115 | 116 | async def initialize_rag(): 117 | rag = LightRAG( 118 | working_dir=WORKING_DIR, 119 | embedding_func=openai_embed, 120 | llm_model_func=gpt_4o_mini_complete, 121 | ) 122 | await rag.initialize_storages() 123 | await initialize_pipeline_status() 124 | return rag 125 | 126 | async def main(): 127 | try: 128 | # 初始化RAG实例 129 | rag = await initialize_rag() 130 | # 插入文本 131 | await rag.insert("Your text") 132 | 133 | # 执行混合检索 134 | mode = "hybrid" 135 | print( 136 | await rag.query( 137 | "这个故事的主要主题是什么?", 138 | param=QueryParam(mode=mode) 139 | ) 140 | ) 141 | 142 | except Exception as e: 143 | print(f"发生错误: {e}") 144 | finally: 145 | if rag: 146 | await rag.finalize_storages() 147 | 148 | if __name__ == "__main__": 149 | asyncio.run(main()) 150 | ``` 151 | 152 | 重要说明: 153 | - 运行脚本前请先导出你的OPENAI_API_KEY环境变量。 154 | - 该程序使用LightRAG的默认存储设置,所有数据将持久化在WORKING_DIR/rag_storage目录下。 155 | - 该示例仅展示了初始化LightRAG对象的最简单方式:注入embedding和LLM函数,并在创建LightRAG对象后初始化存储和管道状态。 156 | 157 | ### LightRAG初始化参数 158 | 159 | 以下是完整的LightRAG对象初始化参数清单: 160 | 161 |
162 | 参数 163 | 164 | | **参数** | **类型** | **说明** | **默认值** | 165 | |--------------|----------|-----------------|-------------| 166 | | **working_dir** | `str` | 存储缓存的目录 | `lightrag_cache+timestamp` | 167 | | **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` | 168 | | **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` | 169 | | **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` | 170 | | **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` | 171 | | **chunk_token_size** | `int` | 拆分文档时每个块的最大令牌大小 | `1200` | 172 | | **chunk_overlap_token_size** | `int` | 拆分文档时两个块之间的重叠令牌大小 | `100` | 173 | | **tokenizer** | `Tokenizer` | 用于将文本转换为 tokens(数字)以及使用遵循 TokenizerInterface 协议的 .encode() 和 .decode() 函数将 tokens 转换回文本的函数。 如果您不指定,它将使用默认的 Tiktoken tokenizer。 | `TiktokenTokenizer` | 174 | | **tiktoken_model_name** | `str` | 如果您使用的是默认的 Tiktoken tokenizer,那么这是要使用的特定 Tiktoken 模型的名称。如果您提供自己的 tokenizer,则忽略此设置。 | `gpt-4o-mini` | 175 | | **entity_extract_max_gleaning** | `int` | 实体提取过程中的循环次数,附加历史消息 | `1` | 176 | | **entity_summary_to_max_tokens** | `int` | 每个实体摘要的最大令牌大小 | `500` | 177 | | **node_embedding_algorithm** | `str` | 节点嵌入算法(当前未使用) | `node2vec` | 178 | | **node2vec_params** | `dict` | 节点嵌入的参数 | `{"dimensions": 1536,"num_walks": 10,"walk_length": 40,"window_size": 2,"iterations": 3,"random_seed": 3,}` | 179 | | **embedding_func** | `EmbeddingFunc` | 从文本生成嵌入向量的函数 | `openai_embed` | 180 | | **embedding_batch_num** | `int` | 嵌入过程的最大批量大小(每批发送多个文本) | `32` | 181 | | **embedding_func_max_async** | `int` | 最大并发异步嵌入进程数 | `16` | 182 | | **llm_model_func** | `callable` | LLM生成的函数 | `gpt_4o_mini_complete` | 183 | | **llm_model_name** | `str` | 用于生成的LLM模型名称 | `meta-llama/Llama-3.2-1B-Instruct` | 184 | | **llm_model_max_token_size** | `int` | LLM生成的最大令牌大小(影响实体关系摘要) | `32768`(默认值由环境变量MAX_TOKENS更改) | 185 | | **llm_model_max_async** | `int` | 最大并发异步LLM进程数 | `4`(默认值由环境变量MAX_ASYNC更改) | 186 | | **llm_model_kwargs** | `dict` | LLM生成的附加参数 | | 187 | | **vector_db_storage_cls_kwargs** | `dict` | 向量数据库的附加参数,如设置节点和关系检索的阈值 | cosine_better_than_threshold: 0.2(默认值由环境变量COSINE_THRESHOLD更改) | 188 | | **enable_llm_cache** | `bool` | 如果为`TRUE`,将LLM结果存储在缓存中;重复的提示返回缓存的响应 | `TRUE` | 189 | | **enable_llm_cache_for_entity_extract** | `bool` | 如果为`TRUE`,将实体提取的LLM结果存储在缓存中;适合初学者调试应用程序 | `TRUE` | 190 | | **addon_params** | `dict` | 附加参数,例如`{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`:设置示例限制、输出语言和文档处理的批量大小 | `example_number: 所有示例, language: English` | 191 | | **convert_response_to_json_func** | `callable` | 未使用 | `convert_response_to_json` | 192 | | **embedding_cache_config** | `dict` | 问答缓存的配置。包含三个参数:`enabled`:布尔值,启用/禁用缓存查找功能。启用时,系统将在生成新答案之前检查缓存的响应。`similarity_threshold`:浮点值(0-1),相似度阈值。当新问题与缓存问题的相似度超过此阈值时,将直接返回缓存的答案而不调用LLM。`use_llm_check`:布尔值,启用/禁用LLM相似度验证。启用时,在返回缓存答案之前,将使用LLM作为二次检查来验证问题之间的相似度。 | 默认:`{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` | 193 | 194 |
195 | 196 | ### 查询参数 197 | 198 | 使用QueryParam控制你的查询行为: 199 | 200 | ```python 201 | class QueryParam: 202 | mode: Literal["local", "global", "hybrid", "naive", "mix"] = "global" 203 | """指定检索模式: 204 | - "local":专注于上下文相关信息。 205 | - "global":利用全局知识。 206 | - "hybrid":结合本地和全局检索方法。 207 | - "naive":执行基本搜索,不使用高级技术。 208 | - "mix":集成知识图谱和向量检索。混合模式结合知识图谱和向量搜索: 209 | - 同时使用结构化(KG)和非结构化(向量)信息 210 | - 通过分析关系和上下文提供全面的答案 211 | - 通过HTML img标签支持图像内容 212 | - 允许通过top_k参数控制检索深度 213 | """ 214 | only_need_context: bool = False 215 | """如果为True,仅返回检索到的上下文而不生成响应。""" 216 | response_type: str = "Multiple Paragraphs" 217 | """定义响应格式。示例:'Multiple Paragraphs'(多段落), 'Single Paragraph'(单段落), 'Bullet Points'(要点列表)。""" 218 | top_k: int = 60 219 | """要检索的顶部项目数量。在'local'模式下代表实体,在'global'模式下代表关系。""" 220 | max_token_for_text_unit: int = 4000 221 | """每个检索文本块允许的最大令牌数。""" 222 | max_token_for_global_context: int = 4000 223 | """全局检索中关系描述的最大令牌分配。""" 224 | max_token_for_local_context: int = 4000 225 | """本地检索中实体描述的最大令牌分配。""" 226 | ids: list[str] | None = None # 仅支持PG向量数据库 227 | """用于过滤RAG的ID列表。""" 228 | model_func: Callable[..., object] | None = None 229 | """查询使用的LLM模型函数。如果提供了此选项,它将代替LightRAG全局模型函数。 230 | 这允许为不同的查询模式使用不同的模型。 231 | """ 232 | ... 233 | ``` 234 | 235 | > top_k的默认值可以通过环境变量TOP_K更改。 236 | 237 | ### LLM and Embedding注入 238 | 239 | LightRAG 需要利用LLM和Embeding模型来完成文档索引和知识库查询工作。在初始化LightRAG的时候需要把阶段,需要把LLM和Embedding的操作函数注入到对象中: 240 | 241 |
242 | 使用类OpenAI的API 243 | 244 | * LightRAG还支持类OpenAI的聊天/嵌入API: 245 | 246 | ```python 247 | async def llm_model_func( 248 | prompt, system_prompt=None, history_messages=[], keyword_extraction=False, **kwargs 249 | ) -> str: 250 | return await openai_complete_if_cache( 251 | "solar-mini", 252 | prompt, 253 | system_prompt=system_prompt, 254 | history_messages=history_messages, 255 | api_key=os.getenv("UPSTAGE_API_KEY"), 256 | base_url="https://api.upstage.ai/v1/solar", 257 | **kwargs 258 | ) 259 | 260 | async def embedding_func(texts: list[str]) -> np.ndarray: 261 | return await openai_embed( 262 | texts, 263 | model="solar-embedding-1-large-query", 264 | api_key=os.getenv("UPSTAGE_API_KEY"), 265 | base_url="https://api.upstage.ai/v1/solar" 266 | ) 267 | 268 | async def initialize_rag(): 269 | rag = LightRAG( 270 | working_dir=WORKING_DIR, 271 | llm_model_func=llm_model_func, 272 | embedding_func=EmbeddingFunc( 273 | embedding_dim=4096, 274 | max_token_size=8192, 275 | func=embedding_func 276 | ) 277 | ) 278 | 279 | await rag.initialize_storages() 280 | await initialize_pipeline_status() 281 | 282 | return rag 283 | ``` 284 | 285 |
286 | 287 |
288 | 使用Hugging Face模型 289 | 290 | * 如果您想使用Hugging Face模型,只需要按如下方式设置LightRAG: 291 | 292 | 参见`lightrag_hf_demo.py` 293 | 294 | ```python 295 | # 使用Hugging Face模型初始化LightRAG 296 | rag = LightRAG( 297 | working_dir=WORKING_DIR, 298 | llm_model_func=hf_model_complete, # 使用Hugging Face模型进行文本生成 299 | llm_model_name='meta-llama/Llama-3.1-8B-Instruct', # Hugging Face的模型名称 300 | # 使用Hugging Face嵌入函数 301 | embedding_func=EmbeddingFunc( 302 | embedding_dim=384, 303 | max_token_size=5000, 304 | func=lambda texts: hf_embed( 305 | texts, 306 | tokenizer=AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2"), 307 | embed_model=AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2") 308 | ) 309 | ), 310 | ) 311 | ``` 312 | 313 |
314 | 315 |
316 | 使用Ollama模型 317 | 如果您想使用Ollama模型,您需要拉取计划使用的模型和嵌入模型,例如`nomic-embed-text`。 318 | 319 | 然后您只需要按如下方式设置LightRAG: 320 | 321 | ```python 322 | # 使用Ollama模型初始化LightRAG 323 | rag = LightRAG( 324 | working_dir=WORKING_DIR, 325 | llm_model_func=ollama_model_complete, # 使用Ollama模型进行文本生成 326 | llm_model_name='your_model_name', # 您的模型名称 327 | # 使用Ollama嵌入函数 328 | embedding_func=EmbeddingFunc( 329 | embedding_dim=768, 330 | max_token_size=8192, 331 | func=lambda texts: ollama_embed( 332 | texts, 333 | embed_model="nomic-embed-text" 334 | ) 335 | ), 336 | ) 337 | ``` 338 | 339 | * **增加上下文大小** 340 | 341 | 为了使LightRAG正常工作,上下文应至少为32k令牌。默认情况下,Ollama模型的上下文大小为8k。您可以通过以下两种方式之一实现这一点: 342 | 343 | * **在Modelfile中增加`num_ctx`参数** 344 | 345 | 1. 拉取模型: 346 | 347 | ```bash 348 | ollama pull qwen2 349 | ``` 350 | 351 | 2. 显示模型文件: 352 | 353 | ```bash 354 | ollama show --modelfile qwen2 > Modelfile 355 | ``` 356 | 357 | 3. 编辑Modelfile,添加以下行: 358 | 359 | ```bash 360 | PARAMETER num_ctx 32768 361 | ``` 362 | 363 | 4. 创建修改后的模型: 364 | 365 | ```bash 366 | ollama create -f Modelfile qwen2m 367 | ``` 368 | 369 | * **通过Ollama API设置`num_ctx`** 370 | 371 | 您可以使用`llm_model_kwargs`参数配置ollama: 372 | 373 | ```python 374 | rag = LightRAG( 375 | working_dir=WORKING_DIR, 376 | llm_model_func=ollama_model_complete, # 使用Ollama模型进行文本生成 377 | llm_model_name='your_model_name', # 您的模型名称 378 | llm_model_kwargs={"options": {"num_ctx": 32768}}, 379 | # 使用Ollama嵌入函数 380 | embedding_func=EmbeddingFunc( 381 | embedding_dim=768, 382 | max_token_size=8192, 383 | func=lambda texts: ollama_embedding( 384 | texts, 385 | embed_model="nomic-embed-text" 386 | ) 387 | ), 388 | ) 389 | ``` 390 | 391 | * **低RAM GPU** 392 | 393 | 为了在低RAM GPU上运行此实验,您应该选择小型模型并调整上下文窗口(增加上下文会增加内存消耗)。例如,在6Gb RAM的改装挖矿GPU上运行这个ollama示例需要将上下文大小设置为26k,同时使用`gemma2:2b`。它能够在`book.txt`中找到197个实体和19个关系。 394 | 395 |
396 |
397 | LlamaIndex 398 | 399 | LightRAG支持与LlamaIndex集成 (`llm/llama_index_impl.py`): 400 | 401 | - 通过LlamaIndex与OpenAI和其他提供商集成 402 | - 详细设置和示例请参见[LlamaIndex文档](lightrag/llm/Readme.md) 403 | 404 | **使用示例:** 405 | 406 | ```python 407 | # 使用LlamaIndex直接访问OpenAI 408 | import asyncio 409 | from lightrag import LightRAG 410 | from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed 411 | from llama_index.embeddings.openai import OpenAIEmbedding 412 | from llama_index.llms.openai import OpenAI 413 | from lightrag.kg.shared_storage import initialize_pipeline_status 414 | from lightrag.utils import setup_logger 415 | 416 | # 为LightRAG设置日志处理程序 417 | setup_logger("lightrag", level="INFO") 418 | 419 | async def initialize_rag(): 420 | rag = LightRAG( 421 | working_dir="your/path", 422 | llm_model_func=llama_index_complete_if_cache, # LlamaIndex兼容的完成函数 423 | embedding_func=EmbeddingFunc( # LlamaIndex兼容的嵌入函数 424 | embedding_dim=1536, 425 | max_token_size=8192, 426 | func=lambda texts: llama_index_embed(texts, embed_model=embed_model) 427 | ), 428 | ) 429 | 430 | await rag.initialize_storages() 431 | await initialize_pipeline_status() 432 | 433 | return rag 434 | 435 | def main(): 436 | # 初始化RAG实例 437 | rag = asyncio.run(initialize_rag()) 438 | 439 | with open("./book.txt", "r", encoding="utf-8") as f: 440 | rag.insert(f.read()) 441 | 442 | # 执行朴素搜索 443 | print( 444 | rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="naive")) 445 | ) 446 | 447 | # 执行本地搜索 448 | print( 449 | rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="local")) 450 | ) 451 | 452 | # 执行全局搜索 453 | print( 454 | rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="global")) 455 | ) 456 | 457 | # 执行混合搜索 458 | print( 459 | rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="hybrid")) 460 | ) 461 | 462 | if __name__ == "__main__": 463 | main() 464 | ``` 465 | 466 | **详细文档和示例,请参见:** 467 | 468 | - [LlamaIndex文档](lightrag/llm/Readme.md) 469 | - [直接OpenAI示例](examples/lightrag_llamaindex_direct_demo.py) 470 | - [LiteLLM代理示例](examples/lightrag_llamaindex_litellm_demo.py) 471 | 472 |
473 | 474 | ### 对话历史 475 | 476 | LightRAG现在通过对话历史功能支持多轮对话。以下是使用方法: 477 | 478 | ```python 479 | # 创建对话历史 480 | conversation_history = [ 481 | {"role": "user", "content": "主角对圣诞节的态度是什么?"}, 482 | {"role": "assistant", "content": "在故事开始时,埃比尼泽·斯克鲁奇对圣诞节持非常消极的态度..."}, 483 | {"role": "user", "content": "他的态度是如何改变的?"} 484 | ] 485 | 486 | # 创建带有对话历史的查询参数 487 | query_param = QueryParam( 488 | mode="mix", # 或其他模式:"local"、"global"、"hybrid" 489 | conversation_history=conversation_history, # 添加对话历史 490 | history_turns=3 # 考虑最近的对话轮数 491 | ) 492 | 493 | # 进行考虑对话历史的查询 494 | response = rag.query( 495 | "是什么导致了他性格的这种变化?", 496 | param=query_param 497 | ) 498 | ``` 499 | 500 | ### 自定义提示词 501 | 502 | LightRAG现在支持自定义提示,以便对系统行为进行精细控制。以下是使用方法: 503 | 504 | ```python 505 | # 创建查询参数 506 | query_param = QueryParam( 507 | mode="hybrid", # 或其他模式:"local"、"global"、"hybrid"、"mix"和"naive" 508 | ) 509 | 510 | # 示例1:使用默认系统提示 511 | response_default = rag.query( 512 | "可再生能源的主要好处是什么?", 513 | param=query_param 514 | ) 515 | print(response_default) 516 | 517 | # 示例2:使用自定义提示 518 | custom_prompt = """ 519 | 您是环境科学领域的专家助手。请提供详细且结构化的答案,并附带示例。 520 | ---对话历史--- 521 | {history} 522 | 523 | ---知识库--- 524 | {context_data} 525 | 526 | ---响应规则--- 527 | 528 | - 目标格式和长度:{response_type} 529 | """ 530 | response_custom = rag.query( 531 | "可再生能源的主要好处是什么?", 532 | param=query_param, 533 | system_prompt=custom_prompt # 传递自定义提示 534 | ) 535 | print(response_custom) 536 | ``` 537 | 538 | ### 关键词提取 539 | 540 | 我们引入了新函数`query_with_separate_keyword_extraction`来增强关键词提取功能。该函数将关键词提取过程与用户提示分开,专注于查询以提高提取关键词的相关性。 541 | 542 | * 工作原理 543 | 544 | 该函数将输入分为两部分: 545 | 546 | - `用户查询` 547 | - `提示` 548 | 549 | 然后仅对`用户查询`执行关键词提取。这种分离确保提取过程是集中和相关的,不受`提示`中任何额外语言的影响。它还允许`提示`纯粹用于响应格式化,保持用户原始问题的意图和清晰度。 550 | 551 | * 使用示例 552 | 553 | 这个`示例`展示了如何为教育内容定制函数,专注于为高年级学生提供详细解释。 554 | 555 | ```python 556 | rag.query_with_separate_keyword_extraction( 557 | query="解释重力定律", 558 | prompt="提供适合学习物理的高中生的详细解释。", 559 | param=QueryParam(mode="hybrid") 560 | ) 561 | ``` 562 | 563 | ### 插入自定义知识 564 | 565 | ```python 566 | custom_kg = { 567 | "chunks": [ 568 | { 569 | "content": "Alice和Bob正在合作进行量子计算研究。", 570 | "source_id": "doc-1" 571 | } 572 | ], 573 | "entities": [ 574 | { 575 | "entity_name": "Alice", 576 | "entity_type": "person", 577 | "description": "Alice是一位专门研究量子物理的研究员。", 578 | "source_id": "doc-1" 579 | }, 580 | { 581 | "entity_name": "Bob", 582 | "entity_type": "person", 583 | "description": "Bob是一位数学家。", 584 | "source_id": "doc-1" 585 | }, 586 | { 587 | "entity_name": "量子计算", 588 | "entity_type": "technology", 589 | "description": "量子计算利用量子力学现象进行计算。", 590 | "source_id": "doc-1" 591 | } 592 | ], 593 | "relationships": [ 594 | { 595 | "src_id": "Alice", 596 | "tgt_id": "Bob", 597 | "description": "Alice和Bob是研究伙伴。", 598 | "keywords": "合作 研究", 599 | "weight": 1.0, 600 | "source_id": "doc-1" 601 | }, 602 | { 603 | "src_id": "Alice", 604 | "tgt_id": "量子计算", 605 | "description": "Alice进行量子计算研究。", 606 | "keywords": "研究 专业", 607 | "weight": 1.0, 608 | "source_id": "doc-1" 609 | }, 610 | { 611 | "src_id": "Bob", 612 | "tgt_id": "量子计算", 613 | "description": "Bob研究量子计算。", 614 | "keywords": "研究 应用", 615 | "weight": 1.0, 616 | "source_id": "doc-1" 617 | } 618 | ] 619 | } 620 | 621 | rag.insert_custom_kg(custom_kg) 622 | ``` 623 | 624 | ### 插入 625 | 626 |
627 | 基本插入 628 | 629 | ```python 630 | # 基本插入 631 | rag.insert("文本") 632 | ``` 633 | 634 |
635 | 636 |
637 | 批量插入 638 | 639 | ```python 640 | # 基本批量插入:一次插入多个文本 641 | rag.insert(["文本1", "文本2",...]) 642 | 643 | # 带有自定义批量大小配置的批量插入 644 | rag = LightRAG( 645 | ... 646 | working_dir=WORKING_DIR, 647 | max_parallel_insert = 4 648 | ) 649 | 650 | rag.insert(["文本1", "文本2", "文本3", ...]) # 文档将以4个为一批进行处理 651 | ``` 652 | 653 | 参数 `max_parallel_insert` 用于控制文档索引流水线中并行处理的文档数量。若未指定,默认值为 **2**。建议将该参数设置为 **10 以下**,因为性能瓶颈通常出现在大语言模型(LLM)的处理环节。 654 | 655 |
656 | 657 |
658 | 带ID插入 659 | 660 | 如果您想为文档提供自己的ID,文档数量和ID数量必须相同。 661 | 662 | ```python 663 | # 插入单个文本,并为其提供ID 664 | rag.insert("文本1", ids=["文本1的ID"]) 665 | 666 | # 插入多个文本,并为它们提供ID 667 | rag.insert(["文本1", "文本2",...], ids=["文本1的ID", "文本2的ID"]) 668 | ``` 669 | 670 |
671 | 672 |
673 | 使用管道插入 674 | 675 | `apipeline_enqueue_documents`和`apipeline_process_enqueue_documents`函数允许您对文档进行增量插入到图中。 676 | 677 | 这对于需要在后台处理文档的场景很有用,同时仍允许主线程继续执行。 678 | 679 | 并使用例程处理新文档。 680 | 681 | ```python 682 | rag = LightRAG(..) 683 | 684 | await rag.apipeline_enqueue_documents(input) 685 | # 您的循环例程 686 | await rag.apipeline_process_enqueue_documents(input) 687 | ``` 688 | 689 |
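下面是一个最小示意,演示如何把入队和处理放到后台任务中执行,从而让主协程继续其他工作(文档列表为假设值,调用方式沿用上文示例,具体签名请以源码为准):

```python
import asyncio

async def background_indexing(rag, docs):
    # 将新文档加入处理队列(调用方式沿用上文示例)
    await rag.apipeline_enqueue_documents(docs)
    # 处理队列中待索引的文档
    await rag.apipeline_process_enqueue_documents(docs)

async def main():
    rag = await initialize_rag()  # 按前文方式初始化
    # 以后台任务方式执行索引,主协程可以继续处理其他逻辑
    task = asyncio.create_task(background_indexing(rag, ["文本1", "文本2"]))
    # ... 这里继续执行其他工作 ...
    await task
```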
690 | 691 |
692 | 插入多文件类型支持 693 | 694 | `textract`支持读取TXT、DOCX、PPTX、CSV和PDF等文件类型。 695 | 696 | ```python 697 | import textract 698 | 699 | file_path = 'TEXT.pdf' 700 | text_content = textract.process(file_path) 701 | 702 | rag.insert(text_content.decode('utf-8')) 703 | ``` 704 | 705 |
706 | 707 |
708 | 引文功能 709 | 710 | 通过提供文件路径,系统确保可以将来源追溯到其原始文档。 711 | 712 | ```python 713 | # 定义文档及其文件路径 714 | documents = ["文档内容1", "文档内容2"] 715 | file_paths = ["path/to/doc1.txt", "path/to/doc2.txt"] 716 | 717 | # 插入带有文件路径的文档 718 | rag.insert(documents, file_paths=file_paths) 719 | ``` 720 | 721 |
722 | 723 | ### 存储 724 | 725 | LightRAG使用到4种类型的存储,每一种存储都有多种实现方案。在初始化LightRAG的时候可以通过参数设定这四类存储的实现方案。详情请参看前面的LightRAG初始化参数。 726 | 727 |
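例如,下面是一个最小示意,展示如何在初始化时通过这四个参数选择存储实现(参数名与可选值见上文"LightRAG初始化参数"表,此处取各自的默认实现;`llm_model_func`与`embedding_func`沿用前文定义):

```python
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=embedding_func,
    kv_storage="JsonKVStorage",                 # 文档与文本块的KV存储
    vector_storage="NanoVectorDBStorage",       # 向量存储
    graph_storage="NetworkXStorage",            # 图存储
    doc_status_storage="JsonDocStatusStorage",  # 文档处理状态存储
)
```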
728 | 使用Neo4J进行存储 729 | 730 | * 对于生产级场景,您很可能想要利用企业级解决方案 731 | * 进行KG存储。推荐在Docker中运行Neo4J以进行无缝本地测试。 732 | * 参见:https://hub.docker.com/_/neo4j 733 | 734 | ```python 735 | export NEO4J_URI="neo4j://localhost:7687" 736 | export NEO4J_USERNAME="neo4j" 737 | export NEO4J_PASSWORD="password" 738 | 739 | # 为LightRAG设置日志记录器 740 | setup_logger("lightrag", level="INFO") 741 | 742 | # 当您启动项目时,请确保通过指定kg="Neo4JStorage"来覆盖默认的KG:NetworkX。 743 | 744 | # 注意:默认设置使用NetworkX 745 | # 使用Neo4J实现初始化LightRAG。 746 | async def initialize_rag(): 747 | rag = LightRAG( 748 | working_dir=WORKING_DIR, 749 | llm_model_func=gpt_4o_mini_complete, # 使用gpt_4o_mini_complete LLM模型 750 | graph_storage="Neo4JStorage", #<-----------覆盖KG默认值 751 | ) 752 | 753 | # 初始化数据库连接 754 | await rag.initialize_storages() 755 | # 初始化文档处理的管道状态 756 | await initialize_pipeline_status() 757 | 758 | return rag 759 | ``` 760 | 761 | 参见test_neo4j.py获取工作示例。 762 | 763 |
764 | 765 |
766 | 使用Faiss进行存储 767 | 768 | - 安装所需依赖: 769 | 770 | ``` 771 | pip install faiss-cpu 772 | ``` 773 | 774 | 如果您有GPU支持,也可以安装`faiss-gpu`。 775 | 776 | - 这里我们使用`sentence-transformers`,但您也可以使用维度为`3072`的`OpenAIEmbedding`模型。 777 | 778 | ```python 779 | async def embedding_func(texts: list[str]) -> np.ndarray: 780 | model = SentenceTransformer('all-MiniLM-L6-v2') 781 | embeddings = model.encode(texts, convert_to_numpy=True) 782 | return embeddings 783 | 784 | # 使用LLM模型函数和嵌入函数初始化LightRAG 785 | rag = LightRAG( 786 | working_dir=WORKING_DIR, 787 | llm_model_func=llm_model_func, 788 | embedding_func=EmbeddingFunc( 789 | embedding_dim=384, 790 | max_token_size=8192, 791 | func=embedding_func, 792 | ), 793 | vector_storage="FaissVectorDBStorage", 794 | vector_db_storage_cls_kwargs={ 795 | "cosine_better_than_threshold": 0.3 # 您期望的阈值 796 | } 797 | ) 798 | ``` 799 | 800 |
801 | 802 |
803 | 使用PostgreSQL进行存储 804 | 805 | 对于生产级场景,您很可能想要利用企业级解决方案。PostgreSQL可以为您提供一站式解决方案,作为KV存储、向量数据库(pgvector)和图数据库(apache AGE)。 806 | 807 | * PostgreSQL很轻量,整个二进制发行版包括所有必要的插件可以压缩到40MB:参考[Windows发布版](https://github.com/ShanGor/apache-age-windows/releases/tag/PG17%2Fv1.5.0-rc0),它在Linux/Mac上也很容易安装。 808 | * 如果您是初学者并想避免麻烦,推荐使用docker,请从这个镜像开始(请务必阅读概述):https://hub.docker.com/r/shangor/postgres-for-rag 809 | * 如何开始?参考:[examples/lightrag_zhipu_postgres_demo.py](https://github.com/HKUDS/LightRAG/blob/main/examples/lightrag_zhipu_postgres_demo.py) 810 | * 为AGE创建索引示例:(如有必要,将下面的`dickens`改为您的图名) 811 | ```sql 812 | load 'age'; 813 | SET search_path = ag_catalog, "$user", public; 814 | CREATE INDEX CONCURRENTLY entity_p_idx ON dickens."Entity" (id); 815 | CREATE INDEX CONCURRENTLY vertex_p_idx ON dickens."_ag_label_vertex" (id); 816 | CREATE INDEX CONCURRENTLY directed_p_idx ON dickens."DIRECTED" (id); 817 | CREATE INDEX CONCURRENTLY directed_eid_idx ON dickens."DIRECTED" (end_id); 818 | CREATE INDEX CONCURRENTLY directed_sid_idx ON dickens."DIRECTED" (start_id); 819 | CREATE INDEX CONCURRENTLY directed_seid_idx ON dickens."DIRECTED" (start_id,end_id); 820 | CREATE INDEX CONCURRENTLY edge_p_idx ON dickens."_ag_label_edge" (id); 821 | CREATE INDEX CONCURRENTLY edge_sid_idx ON dickens."_ag_label_edge" (start_id); 822 | CREATE INDEX CONCURRENTLY edge_eid_idx ON dickens."_ag_label_edge" (end_id); 823 | CREATE INDEX CONCURRENTLY edge_seid_idx ON dickens."_ag_label_edge" (start_id,end_id); 824 | create INDEX CONCURRENTLY vertex_idx_node_id ON dickens."_ag_label_vertex" (ag_catalog.agtype_access_operator(properties, '"node_id"'::agtype)); 825 | create INDEX CONCURRENTLY entity_idx_node_id ON dickens."Entity" (ag_catalog.agtype_access_operator(properties, '"node_id"'::agtype)); 826 | CREATE INDEX CONCURRENTLY entity_node_id_gin_idx ON dickens."Entity" using gin(properties); 827 | ALTER TABLE dickens."DIRECTED" CLUSTER ON directed_sid_idx; 828 | 829 | -- 如有必要可以删除 830 | drop INDEX entity_p_idx; 831 | drop INDEX vertex_p_idx; 832 | drop INDEX directed_p_idx; 833 | drop INDEX directed_eid_idx; 834 | drop INDEX directed_sid_idx; 835 | drop INDEX directed_seid_idx; 836 | drop INDEX edge_p_idx; 837 | drop INDEX edge_sid_idx; 838 | drop INDEX edge_eid_idx; 839 | drop INDEX edge_seid_idx; 840 | drop INDEX vertex_idx_node_id; 841 | drop INDEX entity_idx_node_id; 842 | drop INDEX entity_node_id_gin_idx; 843 | ``` 844 | * Apache AGE的已知问题:发布版本存在以下问题: 845 | > 您可能会发现节点/边的属性是空的。 846 | > 这是发布版本的已知问题:https://github.com/apache/age/pull/1721 847 | > 848 | > 您可以从源代码编译AGE来修复它。 849 | > 850 | 851 |
852 | 853 | ## 编辑实体和关系 854 | 855 | LightRAG现在支持全面的知识图谱管理功能,允许您在知识图谱中创建、编辑和删除实体和关系。 856 | 857 |
858 | 创建实体和关系 859 | 860 | ```python 861 | # 创建新实体 862 | entity = rag.create_entity("Google", { 863 | "description": "Google是一家专注于互联网相关服务和产品的跨国科技公司。", 864 | "entity_type": "company" 865 | }) 866 | 867 | # 创建另一个实体 868 | product = rag.create_entity("Gmail", { 869 | "description": "Gmail是由Google开发的电子邮件服务。", 870 | "entity_type": "product" 871 | }) 872 | 873 | # 创建实体之间的关系 874 | relation = rag.create_relation("Google", "Gmail", { 875 | "description": "Google开发和运营Gmail。", 876 | "keywords": "开发 运营 服务", 877 | "weight": 2.0 878 | }) 879 | ``` 880 | 881 |
882 | 883 |
884 | 编辑实体和关系 885 | 886 | ```python 887 | # 编辑现有实体 888 | updated_entity = rag.edit_entity("Google", { 889 | "description": "Google是Alphabet Inc.的子公司,成立于1998年。", 890 | "entity_type": "tech_company" 891 | }) 892 | 893 | # 重命名实体(所有关系都会正确迁移) 894 | renamed_entity = rag.edit_entity("Gmail", { 895 | "entity_name": "Google Mail", 896 | "description": "Google Mail(前身为Gmail)是一项电子邮件服务。" 897 | }) 898 | 899 | # 编辑实体之间的关系 900 | updated_relation = rag.edit_relation("Google", "Google Mail", { 901 | "description": "Google创建并维护Google Mail服务。", 902 | "keywords": "创建 维护 电子邮件服务", 903 | "weight": 3.0 904 | }) 905 | ``` 906 | 907 |
908 | 909 | 所有操作都有同步和异步版本。异步版本带有前缀"a"(例如,`acreate_entity`,`aedit_relation`)。 910 | 911 | #### 实体操作 912 | 913 | - **create_entity**:创建具有指定属性的新实体 914 | - **edit_entity**:更新现有实体的属性或重命名它 915 | 916 | #### 关系操作 917 | 918 | - **create_relation**:在现有实体之间创建新关系 919 | - **edit_relation**:更新现有关系的属性 920 | 921 | 这些操作在图数据库和向量数据库组件之间保持数据一致性,确保您的知识图谱保持连贯。 922 | 923 | ## Token统计功能 924 |
925 | 概述和使用 926 | 927 | LightRAG提供了TokenTracker工具来跟踪和管理大模型的token消耗。这个功能对于控制API成本和优化性能特别有用。 928 | 929 | ### 使用方法 930 | 931 | ```python 932 | from lightrag.utils import TokenTracker 933 | 934 | # 创建TokenTracker实例 935 | token_tracker = TokenTracker() 936 | 937 | # 方法1:使用上下文管理器(推荐) 938 | # 适用于需要自动跟踪token使用的场景 939 | with token_tracker: 940 | result1 = await llm_model_func("你的问题1") 941 | result2 = await llm_model_func("你的问题2") 942 | 943 | # 方法2:手动添加token使用记录 944 | # 适用于需要更精细控制token统计的场景 945 | token_tracker.reset() 946 | 947 | rag.insert() 948 | 949 | rag.query("你的问题1", param=QueryParam(mode="naive")) 950 | rag.query("你的问题2", param=QueryParam(mode="mix")) 951 | 952 | # 显示总token使用量(包含插入和查询操作) 953 | print("Token usage:", token_tracker.get_usage()) 954 | ``` 955 | 956 | ### 使用建议 957 | - 在长会话或批量操作中使用上下文管理器,可以自动跟踪所有token消耗 958 | - 对于需要分段统计的场景,使用手动模式并适时调用reset() 959 | - 定期检查token使用情况,有助于及时发现异常消耗 960 | - 在开发测试阶段积极使用此功能,以便优化生产环境的成本 961 | 962 | ### 实际应用示例 963 | 您可以参考以下示例来实现token统计: 964 | - `examples/lightrag_gemini_track_token_demo.py`:使用Google Gemini模型的token统计示例 965 | - `examples/lightrag_siliconcloud_track_token_demo.py`:使用SiliconCloud模型的token统计示例 966 | 967 | 这些示例展示了如何在不同模型和场景下有效地使用TokenTracker功能。 968 | 969 |
970 | 971 | ## 数据导出功能 972 | 973 | ### 概述 974 | 975 | LightRAG允许您以各种格式导出知识图谱数据,用于分析、共享和备份目的。系统支持导出实体、关系和关系数据。 976 | 977 | ### 导出功能 978 | 979 | #### 基本用法 980 | 981 | ```python 982 | # 基本CSV导出(默认格式) 983 | rag.export_data("knowledge_graph.csv") 984 | 985 | # 指定任意格式 986 | rag.export_data("output.xlsx", file_format="excel") 987 | ``` 988 | 989 | #### 支持的不同文件格式 990 | 991 | ```python 992 | # 以CSV格式导出数据 993 | rag.export_data("graph_data.csv", file_format="csv") 994 | 995 | # 导出数据到Excel表格 996 | rag.export_data("graph_data.xlsx", file_format="excel") 997 | 998 | # 以markdown格式导出数据 999 | rag.export_data("graph_data.md", file_format="md") 1000 | 1001 | # 导出数据为文本 1002 | rag.export_data("graph_data.txt", file_format="txt") 1003 | ``` 1004 | 1005 | #### 附加选项 1006 | 1007 | 在导出中包含向量嵌入(可选): 1008 | 1009 | ```python 1010 | rag.export_data("complete_data.csv", include_vector_data=True) 1011 | ``` 1012 | 1013 | ### 导出数据包括 1014 | 1015 | 所有导出包括: 1016 | 1017 | * 实体信息(名称、ID、元数据) 1018 | * 关系数据(实体之间的连接) 1019 | * 来自向量数据库的关系信息 1020 | 1021 | ## 实体合并 1022 | 1023 |
1024 | 合并实体及其关系 1025 | 1026 | LightRAG现在支持将多个实体合并为单个实体,自动处理所有关系: 1027 | 1028 | ```python 1029 | # 基本实体合并 1030 | rag.merge_entities( 1031 | source_entities=["人工智能", "AI", "机器智能"], 1032 | target_entity="AI技术" 1033 | ) 1034 | ``` 1035 | 1036 | 使用自定义合并策略: 1037 | 1038 | ```python 1039 | # 为不同字段定义自定义合并策略 1040 | rag.merge_entities( 1041 | source_entities=["约翰·史密斯", "史密斯博士", "J·史密斯"], 1042 | target_entity="约翰·史密斯", 1043 | merge_strategy={ 1044 | "description": "concatenate", # 组合所有描述 1045 | "entity_type": "keep_first", # 保留第一个实体的类型 1046 | "source_id": "join_unique" # 组合所有唯一的源ID 1047 | } 1048 | ) 1049 | ``` 1050 | 1051 | 使用自定义目标实体数据: 1052 | 1053 | ```python 1054 | # 为合并后的实体指定确切值 1055 | rag.merge_entities( 1056 | source_entities=["纽约", "NYC", "大苹果"], 1057 | target_entity="纽约市", 1058 | target_entity_data={ 1059 | "entity_type": "LOCATION", 1060 | "description": "纽约市是美国人口最多的城市。", 1061 | } 1062 | ) 1063 | ``` 1064 | 1065 | 结合两种方法的高级用法: 1066 | 1067 | ```python 1068 | # 使用策略和自定义数据合并公司实体 1069 | rag.merge_entities( 1070 | source_entities=["微软公司", "Microsoft Corporation", "MSFT"], 1071 | target_entity="微软", 1072 | merge_strategy={ 1073 | "description": "concatenate", # 组合所有描述 1074 | "source_id": "join_unique" # 组合源ID 1075 | }, 1076 | target_entity_data={ 1077 | "entity_type": "ORGANIZATION", 1078 | } 1079 | ) 1080 | ``` 1081 | 1082 | 合并实体时: 1083 | 1084 | * 所有来自源实体的关系都会重定向到目标实体 1085 | * 重复的关系会被智能合并 1086 | * 防止自我关系(循环) 1087 | * 合并后删除源实体 1088 | * 保留关系权重和属性 1089 | 1090 |
1091 | 1092 | ## 缓存 1093 | 1094 |
1095 | 清除缓存 1096 | 1097 | 您可以使用不同模式清除LLM响应缓存: 1098 | 1099 | ```python 1100 | # 清除所有缓存 1101 | await rag.aclear_cache() 1102 | 1103 | # 清除本地模式缓存 1104 | await rag.aclear_cache(modes=["local"]) 1105 | 1106 | # 清除提取缓存 1107 | await rag.aclear_cache(modes=["default"]) 1108 | 1109 | # 清除多个模式 1110 | await rag.aclear_cache(modes=["local", "global", "hybrid"]) 1111 | 1112 | # 同步版本 1113 | rag.clear_cache(modes=["local"]) 1114 | ``` 1115 | 1116 | 有效的模式包括: 1117 | 1118 | - `"default"`:提取缓存 1119 | - `"naive"`:朴素搜索缓存 1120 | - `"local"`:本地搜索缓存 1121 | - `"global"`:全局搜索缓存 1122 | - `"hybrid"`:混合搜索缓存 1123 | - `"mix"`:混合搜索缓存 1124 | 1125 |
1126 | 1127 | ## LightRAG API 1128 | 1129 | LightRAG服务器旨在提供Web UI和API支持。**有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。** 1130 | 1131 | ## 知识图谱可视化 1132 | 1133 | LightRAG服务器提供全面的知识图谱可视化功能。它支持各种重力布局、节点查询、子图过滤等。**有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。** 1134 | 1135 | ![iShot_2025-03-23_12.40.08](./README.assets/iShot_2025-03-23_12.40.08.png) 1136 | 1137 | ## 评估 1138 | 1139 | ### 数据集 1140 | 1141 | LightRAG使用的数据集可以从[TommyChien/UltraDomain](https://huggingface.co/datasets/TommyChien/UltraDomain)下载。 1142 | 1143 | ### 生成查询 1144 | 1145 | LightRAG使用以下提示生成高级查询,相应的代码在`example/generate_query.py`中。 1146 | 1147 |
1148 | 提示 1149 | 1150 | ```python 1151 | 给定以下数据集描述: 1152 | 1153 | {description} 1154 | 1155 | 请识别5个可能会使用此数据集的潜在用户。对于每个用户,列出他们会使用此数据集执行的5个任务。然后,对于每个(用户,任务)组合,生成5个需要对整个数据集有高级理解的问题。 1156 | 1157 | 按以下结构输出结果: 1158 | - 用户1:[用户描述] 1159 | - 任务1:[任务描述] 1160 | - 问题1: 1161 | - 问题2: 1162 | - 问题3: 1163 | - 问题4: 1164 | - 问题5: 1165 | - 任务2:[任务描述] 1166 | ... 1167 | - 任务5:[任务描述] 1168 | - 用户2:[用户描述] 1169 | ... 1170 | - 用户5:[用户描述] 1171 | ... 1172 | ``` 1173 | 1174 |
1175 | 1176 | ### 批量评估 1177 | 1178 | 为了评估两个RAG系统在高级查询上的性能,LightRAG使用以下提示,具体代码可在`example/batch_eval.py`中找到。 1179 | 1180 |
1181 | 提示 1182 | 1183 | ```python 1184 | ---角色--- 1185 | 您是一位专家,负责根据三个标准评估同一问题的两个答案:**全面性**、**多样性**和**赋能性**。 1186 | ---目标--- 1187 | 您将根据三个标准评估同一问题的两个答案:**全面性**、**多样性**和**赋能性**。 1188 | 1189 | - **全面性**:答案提供了多少细节来涵盖问题的所有方面和细节? 1190 | - **多样性**:答案在提供关于问题的不同视角和见解方面有多丰富多样? 1191 | - **赋能性**:答案在多大程度上帮助读者理解并对主题做出明智判断? 1192 | 1193 | 对于每个标准,选择更好的答案(答案1或答案2)并解释原因。然后,根据这三个类别选择总体赢家。 1194 | 1195 | 这是问题: 1196 | {query} 1197 | 1198 | 这是两个答案: 1199 | 1200 | **答案1:** 1201 | {answer1} 1202 | 1203 | **答案2:** 1204 | {answer2} 1205 | 1206 | 使用上述三个标准评估两个答案,并为每个标准提供详细解释。 1207 | 1208 | 以下列JSON格式输出您的评估: 1209 | 1210 | {{ 1211 | "全面性": {{ 1212 | "获胜者": "[答案1或答案2]", 1213 | "解释": "[在此提供解释]" 1214 | }}, 1215 | "赋能性": {{ 1216 | "获胜者": "[答案1或答案2]", 1217 | "解释": "[在此提供解释]" 1218 | }}, 1219 | "总体获胜者": {{ 1220 | "获胜者": "[答案1或答案2]", 1221 | "解释": "[根据三个标准总结为什么这个答案是总体获胜者]" 1222 | }} 1223 | }} 1224 | ``` 1225 | 1226 |
1227 | 1228 | ### 总体性能表 1229 | 1230 | | |**农业**| |**计算机科学**| |**法律**| |**混合**| | 1231 | |----------------------|---------------|------------|------|------------|---------|------------|-------|------------| 1232 | | |NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**| 1233 | |**全面性**|32.4%|**67.6%**|38.4%|**61.6%**|16.4%|**83.6%**|38.8%|**61.2%**| 1234 | |**多样性**|23.6%|**76.4%**|38.0%|**62.0%**|13.6%|**86.4%**|32.4%|**67.6%**| 1235 | |**赋能性**|32.4%|**67.6%**|38.8%|**61.2%**|16.4%|**83.6%**|42.8%|**57.2%**| 1236 | |**总体**|32.4%|**67.6%**|38.8%|**61.2%**|15.2%|**84.8%**|40.0%|**60.0%**| 1237 | | |RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**| 1238 | |**全面性**|31.6%|**68.4%**|38.8%|**61.2%**|15.2%|**84.8%**|39.2%|**60.8%**| 1239 | |**多样性**|29.2%|**70.8%**|39.2%|**60.8%**|11.6%|**88.4%**|30.8%|**69.2%**| 1240 | |**赋能性**|31.6%|**68.4%**|36.4%|**63.6%**|15.2%|**84.8%**|42.4%|**57.6%**| 1241 | |**总体**|32.4%|**67.6%**|38.0%|**62.0%**|14.4%|**85.6%**|40.0%|**60.0%**| 1242 | | |HyDE|**LightRAG**|HyDE|**LightRAG**|HyDE|**LightRAG**|HyDE|**LightRAG**| 1243 | |**全面性**|26.0%|**74.0%**|41.6%|**58.4%**|26.8%|**73.2%**|40.4%|**59.6%**| 1244 | |**多样性**|24.0%|**76.0%**|38.8%|**61.2%**|20.0%|**80.0%**|32.4%|**67.6%**| 1245 | |**赋能性**|25.2%|**74.8%**|40.8%|**59.2%**|26.0%|**74.0%**|46.0%|**54.0%**| 1246 | |**总体**|24.8%|**75.2%**|41.6%|**58.4%**|26.4%|**73.6%**|42.4%|**57.6%**| 1247 | | |GraphRAG|**LightRAG**|GraphRAG|**LightRAG**|GraphRAG|**LightRAG**|GraphRAG|**LightRAG**| 1248 | |**全面性**|45.6%|**54.4%**|48.4%|**51.6%**|48.4%|**51.6%**|**50.4%**|49.6%| 1249 | |**多样性**|22.8%|**77.2%**|40.8%|**59.2%**|26.4%|**73.6%**|36.0%|**64.0%**| 1250 | |**赋能性**|41.2%|**58.8%**|45.2%|**54.8%**|43.6%|**56.4%**|**50.8%**|49.2%| 1251 | |**总体**|45.2%|**54.8%**|48.0%|**52.0%**|47.2%|**52.8%**|**50.4%**|49.6%| 1252 | 1253 | ## 复现 1254 | 1255 | 所有代码都可以在`./reproduce`目录中找到。 1256 | 1257 | ### 步骤0 提取唯一上下文 1258 | 1259 | 首先,我们需要提取数据集中的唯一上下文。 1260 | 1261 |
1262 | 代码 1263 | 1264 | ```python 1265 | def extract_unique_contexts(input_directory, output_directory): 1266 | 1267 | os.makedirs(output_directory, exist_ok=True) 1268 | 1269 | jsonl_files = glob.glob(os.path.join(input_directory, '*.jsonl')) 1270 | print(f"找到{len(jsonl_files)}个JSONL文件。") 1271 | 1272 | for file_path in jsonl_files: 1273 | filename = os.path.basename(file_path) 1274 | name, ext = os.path.splitext(filename) 1275 | output_filename = f"{name}_unique_contexts.json" 1276 | output_path = os.path.join(output_directory, output_filename) 1277 | 1278 | unique_contexts_dict = {} 1279 | 1280 | print(f"处理文件:{filename}") 1281 | 1282 | try: 1283 | with open(file_path, 'r', encoding='utf-8') as infile: 1284 | for line_number, line in enumerate(infile, start=1): 1285 | line = line.strip() 1286 | if not line: 1287 | continue 1288 | try: 1289 | json_obj = json.loads(line) 1290 | context = json_obj.get('context') 1291 | if context and context not in unique_contexts_dict: 1292 | unique_contexts_dict[context] = None 1293 | except json.JSONDecodeError as e: 1294 | print(f"文件{filename}第{line_number}行JSON解码错误:{e}") 1295 | except FileNotFoundError: 1296 | print(f"未找到文件:{filename}") 1297 | continue 1298 | except Exception as e: 1299 | print(f"处理文件{filename}时发生错误:{e}") 1300 | continue 1301 | 1302 | unique_contexts_list = list(unique_contexts_dict.keys()) 1303 | print(f"文件{filename}中有{len(unique_contexts_list)}个唯一的`context`条目。") 1304 | 1305 | try: 1306 | with open(output_path, 'w', encoding='utf-8') as outfile: 1307 | json.dump(unique_contexts_list, outfile, ensure_ascii=False, indent=4) 1308 | print(f"唯一的`context`条目已保存到:{output_filename}") 1309 | except Exception as e: 1310 | print(f"保存到文件{output_filename}时发生错误:{e}") 1311 | 1312 | print("所有文件已处理完成。") 1313 | 1314 | ``` 1315 | 1316 |
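用法示意(目录路径为假设值,请替换为实际的数据集目录):

```python
extract_unique_contexts("./datasets", "./datasets/unique_contexts")
```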
1317 | 1318 | ### 步骤1 插入上下文 1319 | 1320 | 对于提取的上下文,我们将它们插入到LightRAG系统中。 1321 | 1322 |
1323 | 代码 1324 | 1325 | ```python 1326 | def insert_text(rag, file_path): 1327 | with open(file_path, mode='r') as f: 1328 | unique_contexts = json.load(f) 1329 | 1330 | retries = 0 1331 | max_retries = 3 1332 | while retries < max_retries: 1333 | try: 1334 | rag.insert(unique_contexts) 1335 | break 1336 | except Exception as e: 1337 | retries += 1 1338 | print(f"插入失败,重试({retries}/{max_retries}),错误:{e}") 1339 | time.sleep(10) 1340 | if retries == max_retries: 1341 | print("超过最大重试次数后插入失败") 1342 | ``` 1343 | 1344 |
1345 | 1346 | ### 步骤2 生成查询 1347 | 1348 | 我们从数据集中每个上下文的前半部分和后半部分提取令牌,然后将它们组合为数据集描述以生成查询。 1349 | 1350 |
代码

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

def get_summary(context, tot_tokens=2000):
    tokens = tokenizer.tokenize(context)
    half_tokens = tot_tokens // 2

    # 跳过文档开头的1000个token,取前半部分的half_tokens个token
    start_tokens = tokens[1000:1000 + half_tokens]
    # 跳过文档结尾的1000个token,取后半部分的half_tokens个token
    end_tokens = tokens[-(1000 + half_tokens):-1000]

    summary_tokens = start_tokens + end_tokens
    summary = tokenizer.convert_tokens_to_string(summary_tokens)

    return summary
```

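按上文思路,可以像下面这样将若干上下文的摘要拼接成数据集描述(上下文列表与拼接方式均为示意):

```python
# unique_contexts 为步骤0中导出的唯一上下文列表
summaries = [get_summary(ctx) for ctx in unique_contexts]
dataset_description = "\n\n".join(summaries)
```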
1370 | 1371 | ### 步骤3 查询 1372 | 1373 | 对于步骤2中生成的查询,我们将提取它们并查询LightRAG。 1374 | 1375 |
1376 | 代码 1377 | 1378 | ```python 1379 | def extract_queries(file_path): 1380 | with open(file_path, 'r') as f: 1381 | data = f.read() 1382 | 1383 | data = data.replace('**', '') 1384 | 1385 | queries = re.findall(r'- Question \d+: (.+)', data) 1386 | 1387 | return queries 1388 | ``` 1389 | 1390 |
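提取查询后,可以参照下面的示意逐条向LightRAG发起查询(文件路径与查询模式为假设值,`rag`按前文方式初始化):

```python
queries = extract_queries("./queries.txt")
for q in queries:
    print(rag.query(q, param=QueryParam(mode="hybrid")))
```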
1391 | 1392 | ## Star历史 1393 | 1394 | 1395 | 1396 | 1397 | 1398 | Star History Chart 1399 | 1400 | 1401 | 1402 | ## 贡献 1403 | 1404 | 感谢所有贡献者! 1405 | 1406 | 1407 | 1408 | 1409 | 1410 | ## 🌟引用 1411 | 1412 | ```python 1413 | @article{guo2024lightrag, 1414 | title={LightRAG: Simple and Fast Retrieval-Augmented Generation}, 1415 | author={Zirui Guo and Lianghao Xia and Yanhua Yu and Tu Ao and Chao Huang}, 1416 | year={2024}, 1417 | eprint={2410.05779}, 1418 | archivePrefix={arXiv}, 1419 | primaryClass={cs.IR} 1420 | } 1421 | ``` 1422 | 1423 | **感谢您对我们工作的关注!** 1424 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

🚀 LightRAG: Simple and Fast Retrieval-Augmented Generation

*lightrag logo, badges, LightRAG Diagram, and Trendshift badge*
40 | 41 | ## 🎉 News 42 | 43 | - [X] [2025.03.18]🎯📢LightRAG now supports citation functionality, enabling proper source attribution. 44 | - [X] [2025.02.05]🎯📢Our team has released [VideoRAG](https://github.com/HKUDS/VideoRAG) understanding extremely long-context videos. 45 | - [X] [2025.01.13]🎯📢Our team has released [MiniRAG](https://github.com/HKUDS/MiniRAG) making RAG simpler with small models. 46 | - [X] [2025.01.06]🎯📢You can now [use PostgreSQL for Storage](#using-postgresql-for-storage). 47 | - [X] [2024.12.31]🎯📢LightRAG now supports [deletion by document ID](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete). 48 | - [X] [2024.11.25]🎯📢LightRAG now supports seamless integration of [custom knowledge graphs](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#insert-custom-kg), empowering users to enhance the system with their own domain expertise. 49 | - [X] [2024.11.19]🎯📢A comprehensive guide to LightRAG is now available on [LearnOpenCV](https://learnopencv.com/lightrag). Many thanks to the blog author. 50 | - [X] [2024.11.11]🎯📢LightRAG now supports [deleting entities by their names](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete). 51 | - [X] [2024.11.09]🎯📢Introducing the [LightRAG Gui](https://lightrag-gui.streamlit.app), which allows you to insert, query, visualize, and download LightRAG knowledge. 52 | - [X] [2024.11.04]🎯📢You can now [use Neo4J for Storage](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage). 53 | - [X] [2024.10.29]🎯📢LightRAG now supports multiple file types, including PDF, DOC, PPT, and CSV via `textract`. 54 | - [X] [2024.10.20]🎯📢We've added a new feature to LightRAG: Graph Visualization. 55 | - [X] [2024.10.18]🎯📢We've added a link to a [LightRAG Introduction Video](https://youtu.be/oageL-1I0GE). Thanks to the author! 56 | - [X] [2024.10.17]🎯📢We have created a [Discord channel](https://discord.gg/yF2MmDJyGJ)! Welcome to join for sharing and discussions! 🎉🎉 57 | - [X] [2024.10.16]🎯📢LightRAG now supports [Ollama models](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)! 58 | - [X] [2024.10.15]🎯📢LightRAG now supports [Hugging Face models](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)! 59 | 60 |
61 | 62 | Algorithm Flowchart 63 | 64 | 65 | ![LightRAG Indexing Flowchart](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-VectorDB-Json-KV-Store-Indexing-Flowchart-scaled.jpg) 66 | *Figure 1: LightRAG Indexing Flowchart - Img Caption : [Source](https://learnopencv.com/lightrag/)* 67 | ![LightRAG Retrieval and Querying Flowchart](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-Querying-Flowchart-Dual-Level-Retrieval-Generation-Knowledge-Graphs-scaled.jpg) 68 | *Figure 2: LightRAG Retrieval and Querying Flowchart - Img Caption : [Source](https://learnopencv.com/lightrag/)* 69 | 70 |
## Installation

### Install LightRAG Server

The LightRAG Server is designed to provide Web UI and API support. The Web UI facilitates document indexing, knowledge graph exploration, and a simple RAG query interface. The LightRAG Server also provides an Ollama-compatible interface, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat bots, such as Open WebUI, to access LightRAG easily.

* Install from PyPI

```bash
pip install "lightrag-hku[api]"
```

* Install from source

```bash
# create a Python virtual environment if necessary
# install in editable mode with API support
pip install -e ".[api]"
```

### Install LightRAG Core

* Install from source (recommended)

```bash
cd LightRAG
pip install -e .
```

* Install from PyPI

```bash
pip install lightrag-hku
```

## Quick Start

### Quick Start for LightRAG Server

For more information about the LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).

### Quick Start for LightRAG Core

To get started with LightRAG Core, refer to the sample code available in the `examples` folder. A [video demo](https://www.youtube.com/watch?v=g21royNJ4fw) is also provided to guide you through the local setup process. If you already have an OpenAI API key, you can run the demo right away:

```bash
### run the demo code from the project folder
cd LightRAG
### provide your API key for OpenAI
export OPENAI_API_KEY="sk-...your_openai_key..."
### download the demo document of "A Christmas Carol" by Charles Dickens
curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt
### run the demo code
python examples/lightrag_openai_demo.py
```

For a streaming response implementation example, please see `examples/lightrag_openai_compatible_demo.py`. Before running it, make sure to adjust the sample code's LLM and embedding configurations to match your setup.

**Note**: When running the demo program, be aware that different test scripts may use different embedding models. If you switch to a different embedding model, you must clear the data directory (`./dickens`); otherwise, the program may encounter errors. If you wish to retain the LLM cache, you can preserve the `kv_store_llm_response_cache.json` file while clearing the rest of the data directory.
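As a convenience, here is a minimal sketch (not part of LightRAG) for clearing the demo data directory while keeping the LLM response cache; the `./dickens` path follows the note above and the flat file layout of the default storages is assumed:

```python
import os

data_dir = "./dickens"
keep = {"kv_store_llm_response_cache.json"}

for name in os.listdir(data_dir):
    path = os.path.join(data_dir, name)
    # remove every data file except the LLM response cache
    if name not in keep and os.path.isfile(path):
        os.remove(path)
```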
131 | 132 | Integrate Using LightRAG core object 133 | 134 | ## Programing with LightRAG Core 135 | 136 | ### A Simple Program 137 | 138 | Use the below Python snippet to initialize LightRAG, insert text to it, and perform queries: 139 | 140 | ```python 141 | import os 142 | import asyncio 143 | from lightrag import LightRAG, QueryParam 144 | from lightrag.llm.openai import gpt_4o_mini_complete, gpt_4o_complete, openai_embed 145 | from lightrag.kg.shared_storage import initialize_pipeline_status 146 | from lightrag.utils import setup_logger 147 | 148 | setup_logger("lightrag", level="INFO") 149 | 150 | WORKING_DIR = "./rag_storage" 151 | if not os.path.exists(WORKING_DIR): 152 | os.mkdir(WORKING_DIR) 153 | 154 | async def initialize_rag(): 155 | rag = LightRAG( 156 | working_dir=WORKING_DIR, 157 | embedding_func=openai_embed, 158 | llm_model_func=gpt_4o_mini_complete, 159 | ) 160 | await rag.initialize_storages() 161 | await initialize_pipeline_status() 162 | return rag 163 | 164 | async def main(): 165 | try: 166 | # Initialize RAG instance 167 | rag = await initialize_rag() 168 | rag.insert("Your text") 169 | 170 | # Perform hybrid search 171 | mode="hybrid" 172 | print( 173 | await rag.query( 174 | "What are the top themes in this story?", 175 | param=QueryParam(mode=mode) 176 | ) 177 | ) 178 | 179 | except Exception as e: 180 | print(f"An error occurred: {e}") 181 | finally: 182 | if rag: 183 | await rag.finalize_storages() 184 | 185 | if __name__ == "__main__": 186 | asyncio.run(main()) 187 | ``` 188 | 189 | Important notes for the above snippet: 190 | 191 | - Export your OPENAI_API_KEY environment variable before running the script. 192 | - This program uses the default storage settings for LightRAG, so all data will be persisted to WORKING_DIR/rag_storage. 193 | - This program demonstrates only the simplest way to initialize a LightRAG object: Injecting the embedding and LLM functions, and initializing storage and pipeline status after creating the LightRAG object. 194 | 195 | ### LightRAG init parameters 196 | 197 | A full list of LightRAG init parameters: 198 | 199 |
200 | Parameters 201 | 202 | | **Parameter** | **Type** | **Explanation** | **Default** | 203 | |--------------|----------|-----------------|-------------| 204 | | **working_dir** | `str` | Directory where the cache will be stored | `lightrag_cache+timestamp` | 205 | | **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` | 206 | | **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` | 207 | | **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` | 208 | | **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` | 209 | | **chunk_token_size** | `int` | Maximum token size per chunk when splitting documents | `1200` | 210 | | **chunk_overlap_token_size** | `int` | Overlap token size between two chunks when splitting documents | `100` | 211 | | **tokenizer** | `Tokenizer` | The function used to convert text into tokens (numbers) and back using .encode() and .decode() functions following `TokenizerInterface` protocol. If you don't specify one, it will use the default Tiktoken tokenizer. | `TiktokenTokenizer` | 212 | | **tiktoken_model_name** | `str` | If you're using the default Tiktoken tokenizer, this is the name of the specific Tiktoken model to use. This setting is ignored if you provide your own tokenizer. 
| `gpt-4o-mini` | 213 | | **entity_extract_max_gleaning** | `int` | Number of loops in the entity extraction process, appending history messages | `1` | 214 | | **entity_summary_to_max_tokens** | `int` | Maximum token size for each entity summary | `500` | 215 | | **node_embedding_algorithm** | `str` | Algorithm for node embedding (currently not used) | `node2vec` | 216 | | **node2vec_params** | `dict` | Parameters for node embedding | `{"dimensions": 1536,"num_walks": 10,"walk_length": 40,"window_size": 2,"iterations": 3,"random_seed": 3,}` | 217 | | **embedding_func** | `EmbeddingFunc` | Function to generate embedding vectors from text | `openai_embed` | 218 | | **embedding_batch_num** | `int` | Maximum batch size for embedding processes (multiple texts sent per batch) | `32` | 219 | | **embedding_func_max_async** | `int` | Maximum number of concurrent asynchronous embedding processes | `16` | 220 | | **llm_model_func** | `callable` | Function for LLM generation | `gpt_4o_mini_complete` | 221 | | **llm_model_name** | `str` | LLM model name for generation | `meta-llama/Llama-3.2-1B-Instruct` | 222 | | **llm_model_max_token_size** | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768`(default value changed by env var MAX_TOKENS) | 223 | | **llm_model_max_async** | `int` | Maximum number of concurrent asynchronous LLM processes | `4`(default value changed by env var MAX_ASYNC) | 224 | | **llm_model_kwargs** | `dict` | Additional parameters for LLM generation | | 225 | | **vector_db_storage_cls_kwargs** | `dict` | Additional parameters for vector database, like setting the threshold for nodes and relations retrieval | cosine_better_than_threshold: 0.2(default value changed by env var COSINE_THRESHOLD) | 226 | | **enable_llm_cache** | `bool` | If `TRUE`, stores LLM results in cache; repeated prompts return cached responses | `TRUE` | 227 | | **enable_llm_cache_for_entity_extract** | `bool` | If `TRUE`, stores LLM results in cache for entity extraction; Good for beginners to debug your application | `TRUE` | 228 | | **addon_params** | `dict` | Additional parameters, e.g., `{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`: sets example limit, entiy/relation extraction output language | `example_number: all examples, language: English` | 229 | | **convert_response_to_json_func** | `callable` | Not used | `convert_response_to_json` | 230 | | **embedding_cache_config** | `dict` | Configuration for question-answer caching. Contains three parameters: `enabled`: Boolean value to enable/disable cache lookup functionality. When enabled, the system will check cached responses before generating new answers. `similarity_threshold`: Float value (0-1), similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer will be returned directly without calling the LLM. `use_llm_check`: Boolean value to enable/disable LLM similarity verification. When enabled, LLM will be used as a secondary check to verify the similarity between questions before returning cached answers. | Default: `{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` | 231 | 232 |
### Query Param

Use QueryParam to control the behavior of your query:

```python
class QueryParam:
    mode: Literal["local", "global", "hybrid", "naive", "mix"] = "global"
    """Specifies the retrieval mode:
    - "local": Focuses on context-dependent information.
    - "global": Utilizes global knowledge.
    - "hybrid": Combines local and global retrieval methods.
    - "naive": Performs a basic search without advanced techniques.
    - "mix": Integrates knowledge graph and vector retrieval. Mix mode combines knowledge graph and vector search:
      - Uses both structured (KG) and unstructured (vector) information
      - Provides comprehensive answers by analyzing relationships and context
      - Supports image content through HTML img tags
      - Allows control over retrieval depth via top_k parameter
    """
    only_need_context: bool = False
    """If True, only returns the retrieved context without generating a response."""
    response_type: str = "Multiple Paragraphs"
    """Defines the response format. Examples: 'Multiple Paragraphs', 'Single Paragraph', 'Bullet Points'."""
    top_k: int = 60
    """Number of top items to retrieve. Represents entities in 'local' mode and relationships in 'global' mode."""
    max_token_for_text_unit: int = 4000
    """Maximum number of tokens allowed for each retrieved text chunk."""
    max_token_for_global_context: int = 4000
    """Maximum number of tokens allocated for relationship descriptions in global retrieval."""
    max_token_for_local_context: int = 4000
    """Maximum number of tokens allocated for entity descriptions in local retrieval."""
    ids: list[str] | None = None  # ONLY SUPPORTED FOR PG VECTOR DBs
    """List of ids to filter the RAG."""
    model_func: Callable[..., object] | None = None
    """Optional override for the LLM model function to use for this specific query.
    If provided, this will be used instead of the global model function.
    This allows using different models for different query modes.
    """
    ...
```

> The default value of `top_k` can be changed via the `TOP_K` environment variable.

### LLM and Embedding Injection

LightRAG requires LLM and embedding models to accomplish document indexing and querying tasks. During initialization, you need to inject the invocation functions of the relevant models into LightRAG:

281 | Using Open AI-like APIs 282 | 283 | * LightRAG also supports Open AI-like chat/embeddings APIs: 284 | 285 | ```python 286 | async def llm_model_func( 287 | prompt, system_prompt=None, history_messages=[], keyword_extraction=False, **kwargs 288 | ) -> str: 289 | return await openai_complete_if_cache( 290 | "solar-mini", 291 | prompt, 292 | system_prompt=system_prompt, 293 | history_messages=history_messages, 294 | api_key=os.getenv("UPSTAGE_API_KEY"), 295 | base_url="https://api.upstage.ai/v1/solar", 296 | **kwargs 297 | ) 298 | 299 | async def embedding_func(texts: list[str]) -> np.ndarray: 300 | return await openai_embed( 301 | texts, 302 | model="solar-embedding-1-large-query", 303 | api_key=os.getenv("UPSTAGE_API_KEY"), 304 | base_url="https://api.upstage.ai/v1/solar" 305 | ) 306 | 307 | async def initialize_rag(): 308 | rag = LightRAG( 309 | working_dir=WORKING_DIR, 310 | llm_model_func=llm_model_func, 311 | embedding_func=EmbeddingFunc( 312 | embedding_dim=4096, 313 | max_token_size=8192, 314 | func=embedding_func 315 | ) 316 | ) 317 | 318 | await rag.initialize_storages() 319 | await initialize_pipeline_status() 320 | 321 | return rag 322 | ``` 323 | 324 |
325 | 326 |
327 | Using Hugging Face Models 328 | 329 | * If you want to use Hugging Face models, you only need to set LightRAG as follows: 330 | 331 | See `lightrag_hf_demo.py` 332 | 333 | ```python 334 | # Initialize LightRAG with Hugging Face model 335 | rag = LightRAG( 336 | working_dir=WORKING_DIR, 337 | llm_model_func=hf_model_complete, # Use Hugging Face model for text generation 338 | llm_model_name='meta-llama/Llama-3.1-8B-Instruct', # Model name from Hugging Face 339 | # Use Hugging Face embedding function 340 | embedding_func=EmbeddingFunc( 341 | embedding_dim=384, 342 | max_token_size=5000, 343 | func=lambda texts: hf_embed( 344 | texts, 345 | tokenizer=AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2"), 346 | embed_model=AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2") 347 | ) 348 | ), 349 | ) 350 | ``` 351 | 352 |
353 | 354 |
355 | Using Ollama Models 356 | **Overview** 357 | 358 | If you want to use Ollama models, you need to pull the model you plan to use as well as an embedding model, for example `nomic-embed-text`. 359 | 360 | Then you only need to set LightRAG as follows: 361 | 362 | ```python 363 | # Initialize LightRAG with Ollama model 364 | rag = LightRAG( 365 | working_dir=WORKING_DIR, 366 | llm_model_func=ollama_model_complete, # Use Ollama model for text generation 367 | llm_model_name='your_model_name', # Your model name 368 | # Use Ollama embedding function 369 | embedding_func=EmbeddingFunc( 370 | embedding_dim=768, 371 | max_token_size=8192, 372 | func=lambda texts: ollama_embed( 373 | texts, 374 | embed_model="nomic-embed-text" 375 | ) 376 | ), 377 | ) 378 | ``` 379 | 380 | * **Increasing context size** 381 | 382 | For LightRAG to work properly, the context size should be at least 32k tokens. By default, Ollama models have a context size of 8k. You can increase it in one of two ways: 383 | 384 | * **Increasing the `num_ctx` parameter in Modelfile** 385 | 386 | 1. Pull the model: 387 | 388 | ```bash 389 | ollama pull qwen2 390 | ``` 391 | 392 | 2. Display the model file: 393 | 394 | ```bash 395 | ollama show --modelfile qwen2 > Modelfile 396 | ``` 397 | 398 | 3. Edit the Modelfile by adding the following line: 399 | 400 | ```bash 401 | PARAMETER num_ctx 32768 402 | ``` 403 | 404 | 4. Create the modified model: 405 | 406 | ```bash 407 | ollama create -f Modelfile qwen2m 408 | ``` 409 | 410 | * **Setup `num_ctx` via Ollama API** 411 | 412 | You can use the `llm_model_kwargs` parameter to configure Ollama: 413 | 414 | ```python 415 | rag = LightRAG( 416 | working_dir=WORKING_DIR, 417 | llm_model_func=ollama_model_complete, # Use Ollama model for text generation 418 | llm_model_name='your_model_name', # Your model name 419 | llm_model_kwargs={"options": {"num_ctx": 32768}}, 420 | # Use Ollama embedding function 421 | embedding_func=EmbeddingFunc( 422 | embedding_dim=768, 423 | max_token_size=8192, 424 | func=lambda texts: ollama_embed( 425 | texts, 426 | embed_model="nomic-embed-text" 427 | ) 428 | ), 429 | ) 430 | ``` 431 | 432 | * **Low RAM GPUs** 433 | 434 | To run this example on a low-RAM GPU, select a small model and tune the context window (increasing the context increases memory consumption). For example, running this Ollama example on a repurposed mining GPU with 6GB of RAM required setting the context size to 26k while using `gemma2:2b`. It was able to find 197 entities and 19 relations in `book.txt`. 435 | 436 |
437 |
438 | LlamaIndex 439 | 440 | LightRAG supports integration with LlamaIndex (`llm/llama_index_impl.py`): 441 | 442 | - Integrates with OpenAI and other providers through LlamaIndex 443 | - See [LlamaIndex Documentation](lightrag/llm/Readme.md) for detailed setup and examples 444 | 445 | **Example Usage** 446 | 447 | ```python 448 | # Using LlamaIndex with direct OpenAI access 449 | import asyncio 450 | from lightrag import LightRAG 451 | from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed 452 | from llama_index.embeddings.openai import OpenAIEmbedding 453 | from llama_index.llms.openai import OpenAI 454 | from lightrag.kg.shared_storage import initialize_pipeline_status 455 | from lightrag.utils import setup_logger 456 | 457 | # Setup log handler for LightRAG 458 | setup_logger("lightrag", level="INFO") 459 | 460 | async def initialize_rag(): 461 | rag = LightRAG( 462 | working_dir="your/path", 463 | llm_model_func=llama_index_complete_if_cache, # LlamaIndex-compatible completion function 464 | embedding_func=EmbeddingFunc( # LlamaIndex-compatible embedding function 465 | embedding_dim=1536, 466 | max_token_size=8192, 467 | func=lambda texts: llama_index_embed(texts, embed_model=embed_model) 468 | ), 469 | ) 470 | 471 | await rag.initialize_storages() 472 | await initialize_pipeline_status() 473 | 474 | return rag 475 | 476 | def main(): 477 | # Initialize RAG instance 478 | rag = asyncio.run(initialize_rag()) 479 | 480 | with open("./book.txt", "r", encoding="utf-8") as f: 481 | rag.insert(f.read()) 482 | 483 | # Perform naive search 484 | print( 485 | rag.query("What are the top themes in this story?", param=QueryParam(mode="naive")) 486 | ) 487 | 488 | # Perform local search 489 | print( 490 | rag.query("What are the top themes in this story?", param=QueryParam(mode="local")) 491 | ) 492 | 493 | # Perform global search 494 | print( 495 | rag.query("What are the top themes in this story?", param=QueryParam(mode="global")) 496 | ) 497 | 498 | # Perform hybrid search 499 | print( 500 | rag.query("What are the top themes in this story?", param=QueryParam(mode="hybrid")) 501 | ) 502 | 503 | if __name__ == "__main__": 504 | main() 505 | ``` 506 | 507 | **For detailed documentation and examples, see:** 508 | 509 | - [LlamaIndex Documentation](lightrag/llm/Readme.md) 510 | - [Direct OpenAI Example](examples/lightrag_llamaindex_direct_demo.py) 511 | - [LiteLLM Proxy Example](examples/lightrag_llamaindex_litellm_demo.py) 512 | 513 |
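Note that the LlamaIndex snippet above references an `embed_model` that is not defined in the excerpt. A minimal sketch of how it might be created with LlamaIndex's OpenAI wrappers is shown below; the model names are assumptions chosen for illustration, and wiring the LLM instance into `llama_index_complete_if_cache` follows the linked demo scripts.

```python
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Illustrative model choices; adjust to your provider and quota
embed_model = OpenAIEmbedding(model="text-embedding-3-large")
llm = OpenAI(model="gpt-4o-mini")
```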
514 | 515 | ### Conversation History Support 516 | 517 | 518 | LightRAG now supports multi-turn dialogue through the conversation history feature. Here's how to use it: 519 | 520 |
521 | Usage Example 522 | 523 | ```python 524 | # Create conversation history 525 | conversation_history = [ 526 | {"role": "user", "content": "What is the main character's attitude towards Christmas?"}, 527 | {"role": "assistant", "content": "At the beginning of the story, Ebenezer Scrooge has a very negative attitude towards Christmas..."}, 528 | {"role": "user", "content": "How does his attitude change?"} 529 | ] 530 | 531 | # Create query parameters with conversation history 532 | query_param = QueryParam( 533 | mode="mix", # or any other mode: "local", "global", "hybrid" 534 | conversation_history=conversation_history, # Add the conversation history 535 | history_turns=3 # Number of recent conversation turns to consider 536 | ) 537 | 538 | # Make a query that takes into account the conversation history 539 | response = rag.query( 540 | "What causes this change in his character?", 541 | param=query_param 542 | ) 543 | ``` 544 | 545 |
546 | 547 | ### Custom Prompt Support 548 | 549 | LightRAG now supports custom prompts for fine-tuned control over the system's behavior. Here's how to use it: 550 | 551 |
552 | Usage Example 553 | 554 | ```python 555 | # Create query parameters 556 | query_param = QueryParam( 557 | mode="hybrid", # or other mode: "local", "global", "hybrid", "mix" and "naive" 558 | ) 559 | 560 | # Example 1: Using the default system prompt 561 | response_default = rag.query( 562 | "What are the primary benefits of renewable energy?", 563 | param=query_param 564 | ) 565 | print(response_default) 566 | 567 | # Example 2: Using a custom prompt 568 | custom_prompt = """ 569 | You are an expert assistant in environmental science. Provide detailed and structured answers with examples. 570 | ---Conversation History--- 571 | {history} 572 | 573 | ---Knowledge Base--- 574 | {context_data} 575 | 576 | ---Response Rules--- 577 | 578 | - Target format and length: {response_type} 579 | """ 580 | response_custom = rag.query( 581 | "What are the primary benefits of renewable energy?", 582 | param=query_param, 583 | system_prompt=custom_prompt # Pass the custom prompt 584 | ) 585 | print(response_custom) 586 | ``` 587 | 588 |
589 | 590 | ### Separate Keyword Extraction 591 | 592 | We've introduced a new function `query_with_separate_keyword_extraction` to enhance the keyword extraction capabilities. This function separates the keyword extraction process from the user's prompt, focusing solely on the query to improve the relevance of extracted keywords. 593 | 594 | **How It Works?** 595 | 596 | The function operates by dividing the input into two parts: 597 | 598 | - `User Query` 599 | - `Prompt` 600 | 601 | It then performs keyword extraction exclusively on the `user query`. This separation ensures that the extraction process is focused and relevant, unaffected by any additional language in the `prompt`. It also allows the `prompt` to serve purely for response formatting, maintaining the intent and clarity of the user's original question. 602 | 603 |
604 | Usage Example 605 | 606 | This `example` shows how to tailor the function for educational content, focusing on detailed explanations for older students. 607 | 608 | ```python 609 | rag.query_with_separate_keyword_extraction( 610 | query="Explain the law of gravity", 611 | prompt="Provide a detailed explanation suitable for high school students studying physics.", 612 | param=QueryParam(mode="hybrid") 613 | ) 614 | ``` 615 | 616 |
617 | 618 | ### Insert 619 | 620 |
621 | Basic Insert 622 | 623 | ```python 624 | # Basic Insert 625 | rag.insert("Text") 626 | ``` 627 | 628 |
629 | 630 |
631 | Batch Insert 632 | 633 | ```python 634 | # Basic Batch Insert: Insert multiple texts at once 635 | rag.insert(["TEXT1", "TEXT2",...]) 636 | 637 | # Batch Insert with custom batch size configuration 638 | rag = LightRAG( 639 | ... 640 | working_dir=WORKING_DIR, 641 | max_parallel_insert = 4 642 | ) 643 | 644 | rag.insert(["TEXT1", "TEXT2", "TEXT3", ...]) # Documents will be processed in batches of 4 645 | ``` 646 | 647 | The `max_parallel_insert` parameter determines the number of documents processed concurrently in the document indexing pipeline. If unspecified, the default value is **2**. We recommend keeping this setting **below 10**, as the performance bottleneck typically lies with the LLM (Large Language Model) processing. 648 |
650 | 651 |
652 | Insert with ID 653 | 654 | If you want to provide your own IDs for your documents, the number of documents and the number of IDs must be the same. 655 | 656 | ```python 657 | # Insert single text, and provide ID for it 658 | rag.insert("TEXT1", ids=["ID_FOR_TEXT1"]) 659 | 660 | # Insert multiple texts, and provide IDs for them 661 | rag.insert(["TEXT1", "TEXT2",...], ids=["ID_FOR_TEXT1", "ID_FOR_TEXT2"]) 662 | ``` 663 | 664 |
665 | 666 |
667 | Insert using Pipeline 668 | 669 | The `apipeline_enqueue_documents` and `apipeline_process_enqueue_documents` functions allow you to perform incremental insertion of documents into the graph. 670 | 671 | This is useful for scenarios where you want to process documents in the background while still allowing the main thread to continue executing. 672 | 673 | A typical pattern is to use a background routine to process newly enqueued documents. 674 | 675 | ```python 676 | rag = LightRAG(...) 677 | 678 | await rag.apipeline_enqueue_documents(input) 679 | # Your routine in loop: process whatever is currently queued 680 | await rag.apipeline_process_enqueue_documents() 681 | ``` 682 |
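Below is a slightly fuller sketch of that pattern, assuming an already-initialized `rag` instance: documents are enqueued without blocking, and a background task periodically drains the queue. The polling interval and loop structure are illustrative assumptions rather than a prescribed design.

```python
import asyncio

async def background_indexer(rag, interval_seconds: int = 10):
    """Periodically process whatever documents are waiting in the pipeline queue."""
    while True:
        await rag.apipeline_process_enqueue_documents()
        await asyncio.sleep(interval_seconds)

async def run(rag):
    # Enqueue new documents; this returns quickly and does not index them yet
    await rag.apipeline_enqueue_documents(["TEXT1", "TEXT2"])

    # Start the indexer in the background while the main task keeps working
    indexer = asyncio.create_task(background_indexer(rag))
    try:
        # ... main application logic (queries, further enqueues) goes here ...
        await asyncio.sleep(30)
    finally:
        indexer.cancel()
```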
684 | 685 |
686 | Insert Multi-file Type Support 687 | 688 | The `textract` library supports reading file types such as TXT, DOCX, PPTX, CSV, and PDF. 689 | 690 | ```python 691 | import textract 692 | 693 | file_path = 'TEXT.pdf' 694 | text_content = textract.process(file_path) 695 | 696 | rag.insert(text_content.decode('utf-8')) 697 | ``` 698 |
700 | 701 |
702 | Insert Custom KG 703 | 704 | ```python 705 | custom_kg = { 706 | "chunks": [ 707 | { 708 | "content": "Alice and Bob are collaborating on quantum computing research.", 709 | "source_id": "doc-1" 710 | } 711 | ], 712 | "entities": [ 713 | { 714 | "entity_name": "Alice", 715 | "entity_type": "person", 716 | "description": "Alice is a researcher specializing in quantum physics.", 717 | "source_id": "doc-1" 718 | }, 719 | { 720 | "entity_name": "Bob", 721 | "entity_type": "person", 722 | "description": "Bob is a mathematician.", 723 | "source_id": "doc-1" 724 | }, 725 | { 726 | "entity_name": "Quantum Computing", 727 | "entity_type": "technology", 728 | "description": "Quantum computing utilizes quantum mechanical phenomena for computation.", 729 | "source_id": "doc-1" 730 | } 731 | ], 732 | "relationships": [ 733 | { 734 | "src_id": "Alice", 735 | "tgt_id": "Bob", 736 | "description": "Alice and Bob are research partners.", 737 | "keywords": "collaboration research", 738 | "weight": 1.0, 739 | "source_id": "doc-1" 740 | }, 741 | { 742 | "src_id": "Alice", 743 | "tgt_id": "Quantum Computing", 744 | "description": "Alice conducts research on quantum computing.", 745 | "keywords": "research expertise", 746 | "weight": 1.0, 747 | "source_id": "doc-1" 748 | }, 749 | { 750 | "src_id": "Bob", 751 | "tgt_id": "Quantum Computing", 752 | "description": "Bob researches quantum computing.", 753 | "keywords": "research application", 754 | "weight": 1.0, 755 | "source_id": "doc-1" 756 | } 757 | ] 758 | } 759 | 760 | rag.insert_custom_kg(custom_kg) 761 | ``` 762 | 763 |
764 | 765 |
766 | Citation Functionality 767 | 768 | By providing file paths, the system ensures that sources can be traced back to their original documents. 769 | 770 | ```python 771 | # Define documents and their file paths 772 | documents = ["Document content 1", "Document content 2"] 773 | file_paths = ["path/to/doc1.txt", "path/to/doc2.txt"] 774 | 775 | # Insert documents with file paths 776 | rag.insert(documents, file_paths=file_paths) 777 | ``` 778 | 779 |
780 | 781 | ### Storage 782 | 783 | LightRAG uses four types of storage, each of which has multiple implementation options. When initializing LightRAG, the implementation schemes for these four types of storage can be set through parameters. For details, please refer to the previous LightRAG initialization parameters. 784 | 785 |
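For example, a sketch of selecting non-default implementations for each of the four storage types when constructing `LightRAG` might look like the following. The backend choices are illustrative, the selection parameters mirror the pattern used elsewhere in this README (e.g., `graph_storage="Neo4JStorage"`), and the corresponding services and connection settings are assumed to be available in your environment.

```python
from lightrag import LightRAG
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=gpt_4o_mini_complete,
    embedding_func=openai_embed,
    kv_storage="RedisKVStorage",                # LLM cache, text chunks, document info
    vector_storage="FaissVectorDBStorage",      # entity/relation/chunk vectors
    graph_storage="Neo4JStorage",               # entity-relation graph
    doc_status_storage="JsonDocStatusStorage",  # document indexing status
)
# As in the earlier examples, call await rag.initialize_storages() and
# await initialize_pipeline_status() inside an async context before use.
```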
786 | Using Neo4J for Storage 787 | 788 | * For production-level scenarios you will most likely want to leverage an enterprise solution for KG storage. Running Neo4J in Docker is recommended for seamless local testing. 789 | * See: https://hub.docker.com/_/neo4j 790 | 791 | ```bash 792 | export NEO4J_URI="neo4j://localhost:7687" 793 | export NEO4J_USERNAME="neo4j" 794 | export NEO4J_PASSWORD="password" 795 | ``` 796 | 797 | ```python 798 | # Setup logger for LightRAG 799 | setup_logger("lightrag", level="INFO") 800 | 801 | # When you launch the project, be sure to override the default KG (NetworkX) 802 | # by specifying graph_storage="Neo4JStorage". 803 | 804 | # Note: Default settings use NetworkX 805 | # Initialize LightRAG with the Neo4J implementation. 806 | async def initialize_rag(): 807 | rag = LightRAG( 808 | working_dir=WORKING_DIR, 809 | llm_model_func=gpt_4o_mini_complete, # Use gpt_4o_mini_complete LLM model 810 | graph_storage="Neo4JStorage", # <----------- override KG default 811 | ) 812 | 813 | # Initialize database connections 814 | await rag.initialize_storages() 815 | # Initialize pipeline status for document processing 816 | await initialize_pipeline_status() 817 | 818 | return rag 819 | ``` 820 | 821 | See `test_neo4j.py` for a working example. 822 |
823 | 824 |
825 | Using PostgreSQL for Storage 826 | 827 | For production-level scenarios you will most likely want to leverage an enterprise solution. PostgreSQL can provide a one-stop solution, serving as the KV store, vector database (pgvector), and graph database (Apache AGE). 828 | 829 | * PostgreSQL is lightweight; the whole binary distribution, including all necessary plugins, can be zipped to 40MB. For Windows, refer to the [Windows Release](https://github.com/ShanGor/apache-age-windows/releases/tag/PG17%2Fv1.5.0-rc0); on Linux/Mac it is easy to install. 830 | * If you prefer Docker, please start with this image if you are a beginner to avoid hiccups (do read the overview): https://hub.docker.com/r/shangor/postgres-for-rag 831 | * How to start? Refer to: [examples/lightrag_zhipu_postgres_demo.py](https://github.com/HKUDS/LightRAG/blob/main/examples/lightrag_zhipu_postgres_demo.py) 832 | * Create indexes for AGE, for example: (change `dickens` below to your graph name if necessary) 833 | ```sql 834 | load 'age'; 835 | SET search_path = ag_catalog, "$user", public; 836 | CREATE INDEX CONCURRENTLY entity_p_idx ON dickens."Entity" (id); 837 | CREATE INDEX CONCURRENTLY vertex_p_idx ON dickens."_ag_label_vertex" (id); 838 | CREATE INDEX CONCURRENTLY directed_p_idx ON dickens."DIRECTED" (id); 839 | CREATE INDEX CONCURRENTLY directed_eid_idx ON dickens."DIRECTED" (end_id); 840 | CREATE INDEX CONCURRENTLY directed_sid_idx ON dickens."DIRECTED" (start_id); 841 | CREATE INDEX CONCURRENTLY directed_seid_idx ON dickens."DIRECTED" (start_id,end_id); 842 | CREATE INDEX CONCURRENTLY edge_p_idx ON dickens."_ag_label_edge" (id); 843 | CREATE INDEX CONCURRENTLY edge_sid_idx ON dickens."_ag_label_edge" (start_id); 844 | CREATE INDEX CONCURRENTLY edge_eid_idx ON dickens."_ag_label_edge" (end_id); 845 | CREATE INDEX CONCURRENTLY edge_seid_idx ON dickens."_ag_label_edge" (start_id,end_id); 846 | create INDEX CONCURRENTLY vertex_idx_node_id ON dickens."_ag_label_vertex" (ag_catalog.agtype_access_operator(properties, '"node_id"'::agtype)); 847 | create INDEX CONCURRENTLY entity_idx_node_id ON dickens."Entity" (ag_catalog.agtype_access_operator(properties, '"node_id"'::agtype)); 848 | CREATE INDEX CONCURRENTLY entity_node_id_gin_idx ON dickens."Entity" using gin(properties); 849 | ALTER TABLE dickens."DIRECTED" CLUSTER ON directed_sid_idx; 850 | 851 | -- drop if necessary 852 | drop INDEX entity_p_idx; 853 | drop INDEX vertex_p_idx; 854 | drop INDEX directed_p_idx; 855 | drop INDEX directed_eid_idx; 856 | drop INDEX directed_sid_idx; 857 | drop INDEX directed_seid_idx; 858 | drop INDEX edge_p_idx; 859 | drop INDEX edge_sid_idx; 860 | drop INDEX edge_eid_idx; 861 | drop INDEX edge_seid_idx; 862 | drop INDEX vertex_idx_node_id; 863 | drop INDEX entity_idx_node_id; 864 | drop INDEX entity_node_id_gin_idx; 865 | ``` 866 | * Known issue with Apache AGE: the released versions have the following issue: 867 | > You might find that the properties of the nodes/edges are empty. 868 | > It is a known issue of the release version: https://github.com/apache/age/pull/1721 869 | > 870 | > You can compile AGE from source code to fix it. 871 | > 872 | 873 |
874 | 875 |
876 | Using Faiss for Storage 877 | 878 | - Install the required dependencies: 879 | 880 | ``` 881 | pip install faiss-cpu 882 | ``` 883 | 884 | You can also install `faiss-gpu` if you have GPU support. 885 | 886 | - Here we are using `sentence-transformers` but you can also use `OpenAIEmbedding` model with `3072` dimensions. 887 | 888 | ```python 889 | async def embedding_func(texts: list[str]) -> np.ndarray: 890 | model = SentenceTransformer('all-MiniLM-L6-v2') 891 | embeddings = model.encode(texts, convert_to_numpy=True) 892 | return embeddings 893 | 894 | # Initialize LightRAG with the LLM model function and embedding function 895 | rag = LightRAG( 896 | working_dir=WORKING_DIR, 897 | llm_model_func=llm_model_func, 898 | embedding_func=EmbeddingFunc( 899 | embedding_dim=384, 900 | max_token_size=8192, 901 | func=embedding_func, 902 | ), 903 | vector_storage="FaissVectorDBStorage", 904 | vector_db_storage_cls_kwargs={ 905 | "cosine_better_than_threshold": 0.3 # Your desired threshold 906 | } 907 | ) 908 | ``` 909 | 910 |
911 | 912 | ## Edit Entities and Relations 913 | 914 | LightRAG now supports comprehensive knowledge graph management capabilities, allowing you to create, edit, and delete entities and relationships within your knowledge graph. 915 | 916 |
917 | Create Entities and Relations 918 | 919 | ```python 920 | # Create new entity 921 | entity = rag.create_entity("Google", { 922 | "description": "Google is a multinational technology company specializing in internet-related services and products.", 923 | "entity_type": "company" 924 | }) 925 | 926 | # Create another entity 927 | product = rag.create_entity("Gmail", { 928 | "description": "Gmail is an email service developed by Google.", 929 | "entity_type": "product" 930 | }) 931 | 932 | # Create relation between entities 933 | relation = rag.create_relation("Google", "Gmail", { 934 | "description": "Google develops and operates Gmail.", 935 | "keywords": "develops operates service", 936 | "weight": 2.0 937 | }) 938 | ``` 939 | 940 |
941 | 942 |
943 | Edit Entities and Relations 944 | 945 | ```python 946 | # Edit an existing entity 947 | updated_entity = rag.edit_entity("Google", { 948 | "description": "Google is a subsidiary of Alphabet Inc., founded in 1998.", 949 | "entity_type": "tech_company" 950 | }) 951 | 952 | # Rename an entity (with all its relationships properly migrated) 953 | renamed_entity = rag.edit_entity("Gmail", { 954 | "entity_name": "Google Mail", 955 | "description": "Google Mail (formerly Gmail) is an email service." 956 | }) 957 | 958 | # Edit a relation between entities 959 | updated_relation = rag.edit_relation("Google", "Google Mail", { 960 | "description": "Google created and maintains Google Mail service.", 961 | "keywords": "creates maintains email service", 962 | "weight": 3.0 963 | }) 964 | ``` 965 | 966 | All operations are available in both synchronous and asynchronous versions. The asynchronous versions have the prefix "a" (e.g., `acreate_entity`, `aedit_relation`). 967 | 968 | #### Entity Operations 969 | 970 | - **create_entity**: Creates a new entity with specified attributes 971 | - **edit_entity**: Updates an existing entity's attributes or renames it 972 | 973 | #### Relation Operations 974 | 975 | - **create_relation**: Creates a new relation between existing entities 976 | - **edit_relation**: Updates an existing relation's attributes 977 | 978 | These operations maintain data consistency across both the graph database and vector database components, ensuring your knowledge graph remains coherent. 979 | 980 |
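For instance, inside an async context the same operations might look like the sketch below; the entity names and attribute values are purely illustrative.

```python
async def update_graph(rag):
    # Async variants of the operations above (note the "a" prefix)
    await rag.acreate_entity("Google", {
        "description": "Google is a multinational technology company.",
        "entity_type": "company",
    })
    await rag.acreate_entity("Gmail", {
        "description": "Gmail is an email service developed by Google.",
        "entity_type": "product",
    })
    await rag.acreate_relation("Google", "Gmail", {
        "description": "Google develops and operates Gmail.",
        "keywords": "develops operates service",
        "weight": 2.0,
    })
    await rag.aedit_relation("Google", "Gmail", {
        "description": "Google created and maintains the Gmail service.",
        "keywords": "creates maintains email service",
        "weight": 3.0,
    })
```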
981 | 982 | ## Token Usage Tracking 983 | 984 |
985 | Overview and Usage 986 | 987 | LightRAG provides a TokenTracker tool to monitor and manage token consumption by large language models. This feature is particularly useful for controlling API costs and optimizing performance. 988 | 989 | ### Usage 990 | 991 | ```python 992 | from lightrag.utils import TokenTracker 993 | 994 | # Create TokenTracker instance 995 | token_tracker = TokenTracker() 996 | 997 | # Method 1: Using context manager (Recommended) 998 | # Suitable for scenarios requiring automatic token usage tracking 999 | with token_tracker: 1000 | result1 = await llm_model_func("your question 1") 1001 | result2 = await llm_model_func("your question 2") 1002 | 1003 | # Method 2: Manually adding token usage records 1004 | # Suitable for scenarios requiring more granular control over token statistics 1005 | token_tracker.reset() 1006 | 1007 | rag.insert() 1008 | 1009 | rag.query("your question 1", param=QueryParam(mode="naive")) 1010 | rag.query("your question 2", param=QueryParam(mode="mix")) 1011 | 1012 | # Display total token usage (including insert and query operations) 1013 | print("Token usage:", token_tracker.get_usage()) 1014 | ``` 1015 | 1016 | ### Usage Tips 1017 | - Use context managers for long sessions or batch operations to automatically track all token consumption 1018 | - For scenarios requiring segmented statistics, use manual mode and call reset() when appropriate 1019 | - Regular checking of token usage helps detect abnormal consumption early 1020 | - Actively use this feature during development and testing to optimize production costs 1021 | 1022 | ### Practical Examples 1023 | You can refer to these examples for implementing token tracking: 1024 | - `examples/lightrag_gemini_track_token_demo.py`: Token tracking example using Google Gemini model 1025 | - `examples/lightrag_siliconcloud_track_token_demo.py`: Token tracking example using SiliconCloud model 1026 | 1027 | These examples demonstrate how to effectively use the TokenTracker feature with different models and scenarios. 1028 | 1029 |
1030 | 1031 | ## Data Export Functions 1032 | 1033 | ### Overview 1034 | 1035 | LightRAG allows you to export your knowledge graph data in various formats for analysis, sharing, and backup purposes. The system supports exporting entities, relations, and relationship data. 1036 | 1037 | ### Export Functions 1038 | 1039 |
1040 | Basic Usage 1041 | 1042 | ```python 1043 | # Basic CSV export (default format) 1044 | rag.export_data("knowledge_graph.csv") 1045 | 1046 | # Specify any format 1047 | rag.export_data("output.xlsx", file_format="excel") 1048 | ``` 1049 | 1050 |
1051 | 1052 |
1053 | Different File Formats supported 1054 | 1055 | ```python 1056 | #Export data in CSV format 1057 | rag.export_data("graph_data.csv", file_format="csv") 1058 | 1059 | # Export data in Excel sheet 1060 | rag.export_data("graph_data.xlsx", file_format="excel") 1061 | 1062 | # Export data in markdown format 1063 | rag.export_data("graph_data.md", file_format="md") 1064 | 1065 | # Export data in Text 1066 | rag.export_data("graph_data.txt", file_format="txt") 1067 | ``` 1068 |
1069 | 1070 |
1071 | Additional Options 1072 | 1073 | Include vector embeddings in the export (optional): 1074 | 1075 | ```python 1076 | rag.export_data("complete_data.csv", include_vector_data=True) 1077 | ``` 1078 |
1079 | 1080 | ### Data Included in Export 1081 | 1082 | All exports include: 1083 | 1084 | * Entity information (names, IDs, metadata) 1085 | * Relation data (connections between entities) 1086 | * Relationship information from vector database 1087 | 1088 | 1089 | ## Entity Merging 1090 | 1091 |
1092 | Merge Entities and Their Relationships 1093 | 1094 | LightRAG now supports merging multiple entities into a single entity, automatically handling all relationships: 1095 | 1096 | ```python 1097 | # Basic entity merging 1098 | rag.merge_entities( 1099 | source_entities=["Artificial Intelligence", "AI", "Machine Intelligence"], 1100 | target_entity="AI Technology" 1101 | ) 1102 | ``` 1103 | 1104 | With custom merge strategy: 1105 | 1106 | ```python 1107 | # Define custom merge strategy for different fields 1108 | rag.merge_entities( 1109 | source_entities=["John Smith", "Dr. Smith", "J. Smith"], 1110 | target_entity="John Smith", 1111 | merge_strategy={ 1112 | "description": "concatenate", # Combine all descriptions 1113 | "entity_type": "keep_first", # Keep the entity type from the first entity 1114 | "source_id": "join_unique" # Combine all unique source IDs 1115 | } 1116 | ) 1117 | ``` 1118 | 1119 | With custom target entity data: 1120 | 1121 | ```python 1122 | # Specify exact values for the merged entity 1123 | rag.merge_entities( 1124 | source_entities=["New York", "NYC", "Big Apple"], 1125 | target_entity="New York City", 1126 | target_entity_data={ 1127 | "entity_type": "LOCATION", 1128 | "description": "New York City is the most populous city in the United States.", 1129 | } 1130 | ) 1131 | ``` 1132 | 1133 | Advanced usage combining both approaches: 1134 | 1135 | ```python 1136 | # Merge company entities with both strategy and custom data 1137 | rag.merge_entities( 1138 | source_entities=["Microsoft Corp", "Microsoft Corporation", "MSFT"], 1139 | target_entity="Microsoft", 1140 | merge_strategy={ 1141 | "description": "concatenate", # Combine all descriptions 1142 | "source_id": "join_unique" # Combine source IDs 1143 | }, 1144 | target_entity_data={ 1145 | "entity_type": "ORGANIZATION", 1146 | } 1147 | ) 1148 | ``` 1149 | 1150 | When merging entities: 1151 | 1152 | * All relationships from source entities are redirected to the target entity 1153 | * Duplicate relationships are intelligently merged 1154 | * Self-relationships (loops) are prevented 1155 | * Source entities are removed after merging 1156 | * Relationship weights and attributes are preserved 1157 | 1158 |
1159 | 1160 | ## Cache 1161 | 1162 |
1163 | Clear Cache 1164 | 1165 | You can clear the LLM response cache with different modes: 1166 | 1167 | ```python 1168 | # Clear all cache 1169 | await rag.aclear_cache() 1170 | 1171 | # Clear local mode cache 1172 | await rag.aclear_cache(modes=["local"]) 1173 | 1174 | # Clear extraction cache 1175 | await rag.aclear_cache(modes=["default"]) 1176 | 1177 | # Clear multiple modes 1178 | await rag.aclear_cache(modes=["local", "global", "hybrid"]) 1179 | 1180 | # Synchronous version 1181 | rag.clear_cache(modes=["local"]) 1182 | ``` 1183 | 1184 | Valid modes are: 1185 | 1186 | - `"default"`: Extraction cache 1187 | - `"naive"`: Naive search cache 1188 | - `"local"`: Local search cache 1189 | - `"global"`: Global search cache 1190 | - `"hybrid"`: Hybrid search cache 1191 | - `"mix"`: Mix search cache 1192 | 1193 |
1194 | 1195 | ## LightRAG API 1196 | 1197 | The LightRAG Server is designed to provide Web UI and API support. **For more information about LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).** 1198 | 1199 | ## Graph Visualization 1200 | 1201 | The LightRAG Server offers a comprehensive knowledge graph visualization feature. It supports various gravity layouts, node queries, subgraph filtering, and more. **For more information about LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).** 1202 | 1203 | ![iShot_2025-03-23_12.40.08](./README.assets/iShot_2025-03-23_12.40.08.png) 1204 | 1205 | ## Evaluation 1206 | 1207 | ### Dataset 1208 | 1209 | The dataset used in LightRAG can be downloaded from [TommyChien/UltraDomain](https://huggingface.co/datasets/TommyChien/UltraDomain). 1210 | 1211 | ### Generate Query 1212 | 1213 | LightRAG uses the following prompt to generate high-level queries, with the corresponding code in `example/generate_query.py`. 1214 | 1215 |
1216 | Prompt 1217 | 1218 | ```python 1219 | Given the following description of a dataset: 1220 | 1221 | {description} 1222 | 1223 | Please identify 5 potential users who would engage with this dataset. For each user, list 5 tasks they would perform with this dataset. Then, for each (user, task) combination, generate 5 questions that require a high-level understanding of the entire dataset. 1224 | 1225 | Output the results in the following structure: 1226 | - User 1: [user description] 1227 | - Task 1: [task description] 1228 | - Question 1: 1229 | - Question 2: 1230 | - Question 3: 1231 | - Question 4: 1232 | - Question 5: 1233 | - Task 2: [task description] 1234 | ... 1235 | - Task 5: [task description] 1236 | - User 2: [user description] 1237 | ... 1238 | - User 5: [user description] 1239 | ... 1240 | ``` 1241 | 1242 |
1243 | 1244 | ### Batch Eval 1245 | 1246 | To evaluate the performance of two RAG systems on high-level queries, LightRAG uses the following prompt, with the specific code available in `example/batch_eval.py`. 1247 | 1248 |
1249 | Prompt 1250 | 1251 | ```python 1252 | ---Role--- 1253 | You are an expert tasked with evaluating two answers to the same question based on three criteria: **Comprehensiveness**, **Diversity**, and **Empowerment**. 1254 | ---Goal--- 1255 | You will evaluate two answers to the same question based on three criteria: **Comprehensiveness**, **Diversity**, and **Empowerment**. 1256 | 1257 | - **Comprehensiveness**: How much detail does the answer provide to cover all aspects and details of the question? 1258 | - **Diversity**: How varied and rich is the answer in providing different perspectives and insights on the question? 1259 | - **Empowerment**: How well does the answer help the reader understand and make informed judgments about the topic? 1260 | 1261 | For each criterion, choose the better answer (either Answer 1 or Answer 2) and explain why. Then, select an overall winner based on these three categories. 1262 | 1263 | Here is the question: 1264 | {query} 1265 | 1266 | Here are the two answers: 1267 | 1268 | **Answer 1:** 1269 | {answer1} 1270 | 1271 | **Answer 2:** 1272 | {answer2} 1273 | 1274 | Evaluate both answers using the three criteria listed above and provide detailed explanations for each criterion. 1275 | 1276 | Output your evaluation in the following JSON format: 1277 | 1278 | {{ 1279 | "Comprehensiveness": {{ 1280 | "Winner": "[Answer 1 or Answer 2]", 1281 | "Explanation": "[Provide explanation here]" 1282 | }}, 1283 | "Empowerment": {{ 1284 | "Winner": "[Answer 1 or Answer 2]", 1285 | "Explanation": "[Provide explanation here]" 1286 | }}, 1287 | "Overall Winner": {{ 1288 | "Winner": "[Answer 1 or Answer 2]", 1289 | "Explanation": "[Summarize why this answer is the overall winner based on the three criteria]" 1290 | }} 1291 | }} 1292 | ``` 1293 | 1294 |
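As a rough sketch of how this prompt might be driven programmatically (the actual script is `example/batch_eval.py`; the client setup and judge model below are assumptions for illustration only):

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def evaluate_pair(eval_prompt_template: str, query: str, answer1: str, answer2: str) -> dict:
    """Fill the evaluation prompt and ask a judge model for the JSON verdict."""
    prompt = eval_prompt_template.format(query=query, answer1=answer1, answer2=answer2)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative judge model
        messages=[{"role": "user", "content": prompt}],
    )
    # The prompt requests JSON output; parsing may need guarding in practice
    return json.loads(response.choices[0].message.content)
```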
1295 | 1296 | ### Overall Performance Table 1297 | 1298 | | |**Agriculture**| |**CS**| |**Legal**| |**Mix**| | 1299 | |----------------------|---------------|------------|------|------------|---------|------------|-------|------------| 1300 | | |NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**| 1301 | |**Comprehensiveness**|32.4%|**67.6%**|38.4%|**61.6%**|16.4%|**83.6%**|38.8%|**61.2%**| 1302 | |**Diversity**|23.6%|**76.4%**|38.0%|**62.0%**|13.6%|**86.4%**|32.4%|**67.6%**| 1303 | |**Empowerment**|32.4%|**67.6%**|38.8%|**61.2%**|16.4%|**83.6%**|42.8%|**57.2%**| 1304 | |**Overall**|32.4%|**67.6%**|38.8%|**61.2%**|15.2%|**84.8%**|40.0%|**60.0%**| 1305 | | |RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**| 1306 | |**Comprehensiveness**|31.6%|**68.4%**|38.8%|**61.2%**|15.2%|**84.8%**|39.2%|**60.8%**| 1307 | |**Diversity**|29.2%|**70.8%**|39.2%|**60.8%**|11.6%|**88.4%**|30.8%|**69.2%**| 1308 | |**Empowerment**|31.6%|**68.4%**|36.4%|**63.6%**|15.2%|**84.8%**|42.4%|**57.6%**| 1309 | |**Overall**|32.4%|**67.6%**|38.0%|**62.0%**|14.4%|**85.6%**|40.0%|**60.0%**| 1310 | | |HyDE|**LightRAG**|HyDE|**LightRAG**|HyDE|**LightRAG**|HyDE|**LightRAG**| 1311 | |**Comprehensiveness**|26.0%|**74.0%**|41.6%|**58.4%**|26.8%|**73.2%**|40.4%|**59.6%**| 1312 | |**Diversity**|24.0%|**76.0%**|38.8%|**61.2%**|20.0%|**80.0%**|32.4%|**67.6%**| 1313 | |**Empowerment**|25.2%|**74.8%**|40.8%|**59.2%**|26.0%|**74.0%**|46.0%|**54.0%**| 1314 | |**Overall**|24.8%|**75.2%**|41.6%|**58.4%**|26.4%|**73.6%**|42.4%|**57.6%**| 1315 | | |GraphRAG|**LightRAG**|GraphRAG|**LightRAG**|GraphRAG|**LightRAG**|GraphRAG|**LightRAG**| 1316 | |**Comprehensiveness**|45.6%|**54.4%**|48.4%|**51.6%**|48.4%|**51.6%**|**50.4%**|49.6%| 1317 | |**Diversity**|22.8%|**77.2%**|40.8%|**59.2%**|26.4%|**73.6%**|36.0%|**64.0%**| 1318 | |**Empowerment**|41.2%|**58.8%**|45.2%|**54.8%**|43.6%|**56.4%**|**50.8%**|49.2%| 1319 | |**Overall**|45.2%|**54.8%**|48.0%|**52.0%**|47.2%|**52.8%**|**50.4%**|49.6%| 1320 | 1321 | ## Reproduce 1322 | 1323 | All the code can be found in the `./reproduce` directory. 1324 | 1325 | ### Step-0 Extract Unique Contexts 1326 | 1327 | First, we need to extract unique contexts in the datasets. 1328 | 1329 |
1330 | Code 1331 | 1332 | ```python 1333 | def extract_unique_contexts(input_directory, output_directory): 1334 | 1335 | os.makedirs(output_directory, exist_ok=True) 1336 | 1337 | jsonl_files = glob.glob(os.path.join(input_directory, '*.jsonl')) 1338 | print(f"Found {len(jsonl_files)} JSONL files.") 1339 | 1340 | for file_path in jsonl_files: 1341 | filename = os.path.basename(file_path) 1342 | name, ext = os.path.splitext(filename) 1343 | output_filename = f"{name}_unique_contexts.json" 1344 | output_path = os.path.join(output_directory, output_filename) 1345 | 1346 | unique_contexts_dict = {} 1347 | 1348 | print(f"Processing file: {filename}") 1349 | 1350 | try: 1351 | with open(file_path, 'r', encoding='utf-8') as infile: 1352 | for line_number, line in enumerate(infile, start=1): 1353 | line = line.strip() 1354 | if not line: 1355 | continue 1356 | try: 1357 | json_obj = json.loads(line) 1358 | context = json_obj.get('context') 1359 | if context and context not in unique_contexts_dict: 1360 | unique_contexts_dict[context] = None 1361 | except json.JSONDecodeError as e: 1362 | print(f"JSON decoding error in file {filename} at line {line_number}: {e}") 1363 | except FileNotFoundError: 1364 | print(f"File not found: {filename}") 1365 | continue 1366 | except Exception as e: 1367 | print(f"An error occurred while processing file {filename}: {e}") 1368 | continue 1369 | 1370 | unique_contexts_list = list(unique_contexts_dict.keys()) 1371 | print(f"There are {len(unique_contexts_list)} unique `context` entries in the file {filename}.") 1372 | 1373 | try: 1374 | with open(output_path, 'w', encoding='utf-8') as outfile: 1375 | json.dump(unique_contexts_list, outfile, ensure_ascii=False, indent=4) 1376 | print(f"Unique `context` entries have been saved to: {output_filename}") 1377 | except Exception as e: 1378 | print(f"An error occurred while saving to the file {output_filename}: {e}") 1379 | 1380 | print("All files have been processed.") 1381 | 1382 | ``` 1383 | 1384 |
1385 | 1386 | ### Step-1 Insert Contexts 1387 | 1388 | For the extracted contexts, we insert them into the LightRAG system. 1389 | 1390 |
1391 | Code 1392 | 1393 | ```python 1394 | def insert_text(rag, file_path): 1395 | with open(file_path, mode='r') as f: 1396 | unique_contexts = json.load(f) 1397 | 1398 | retries = 0 1399 | max_retries = 3 1400 | while retries < max_retries: 1401 | try: 1402 | rag.insert(unique_contexts) 1403 | break 1404 | except Exception as e: 1405 | retries += 1 1406 | print(f"Insertion failed, retrying ({retries}/{max_retries}), error: {e}") 1407 | time.sleep(10) 1408 | if retries == max_retries: 1409 | print("Insertion failed after exceeding the maximum number of retries") 1410 | ``` 1411 | 1412 |
1413 | 1414 | ### Step-2 Generate Queries 1415 | 1416 | We extract tokens from the first and the second half of each context in the dataset, then combine them as dataset descriptions to generate queries. 1417 | 1418 |
1419 | Code 1420 | 1421 | ```python 1422 | tokenizer = GPT2Tokenizer.from_pretrained('gpt2') 1423 | 1424 | def get_summary(context, tot_tokens=2000): 1425 | tokens = tokenizer.tokenize(context) 1426 | half_tokens = tot_tokens // 2 1427 | 1428 | start_tokens = tokens[1000:1000 + half_tokens] 1429 | end_tokens = tokens[-(1000 + half_tokens):1000] 1430 | 1431 | summary_tokens = start_tokens + end_tokens 1432 | summary = tokenizer.convert_tokens_to_string(summary_tokens) 1433 | 1434 | return summary 1435 | ``` 1436 | 1437 |
1438 | 1439 | ### Step-3 Query 1440 | 1441 | For the queries generated in Step-2, we will extract them and query LightRAG. 1442 | 1443 |
1444 | Code 1445 | 1446 | ```python 1447 | def extract_queries(file_path): 1448 | with open(file_path, 'r') as f: 1449 | data = f.read() 1450 | 1451 | data = data.replace('**', '') 1452 | 1453 | queries = re.findall(r'- Question \d+: (.+)', data) 1454 | 1455 | return queries 1456 | ``` 1457 | 1458 |
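A small sketch of the querying step itself, reusing `extract_queries` above with an initialized `rag` instance (the file paths and output format are assumptions; see the scripts in `./reproduce` for the actual implementation):

```python
import json
from lightrag import QueryParam

def run_queries(rag, query_file: str, output_file: str, mode: str = "hybrid"):
    queries = extract_queries(query_file)
    results = []
    for i, query in enumerate(queries, start=1):
        answer = rag.query(query, param=QueryParam(mode=mode))
        results.append({"query": query, "result": answer})
        print(f"[{i}/{len(queries)}] queries answered")
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(results, f, ensure_ascii=False, indent=4)
```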
1459 | 1460 | ## Star History 1461 | 1462 | 1463 | 1464 | 1465 | 1466 | Star History Chart 1467 | 1468 | 1469 | 1470 | ## Contribution 1471 | 1472 | Thank you to all our contributors! 1473 | 1474 | 1475 | 1476 | 1477 | 1478 | ## 🌟Citation 1479 | 1480 | ```python 1481 | @article{guo2024lightrag, 1482 | title={LightRAG: Simple and Fast Retrieval-Augmented Generation}, 1483 | author={Zirui Guo and Lianghao Xia and Yanhua Yu and Tu Ao and Chao Huang}, 1484 | year={2024}, 1485 | eprint={2410.05779}, 1486 | archivePrefix={arXiv}, 1487 | primaryClass={cs.IR} 1488 | } 1489 | ``` 1490 | 1491 | **Thank you for your interest in our work!** 1492 | -------------------------------------------------------------------------------- /docs/Algorithm.md: -------------------------------------------------------------------------------- 1 | ![LightRAG Indexing Flowchart](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-VectorDB-Json-KV-Store-Indexing-Flowchart-scaled.jpg) 2 | *Figure 1: LightRAG Indexing Flowchart - Img Caption : [Source](https://learnopencv.com/lightrag/)* 3 | ![LightRAG Retrieval and Querying Flowchart](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-Querying-Flowchart-Dual-Level-Retrieval-Generation-Knowledge-Graphs-scaled.jpg) 4 | *Figure 2: LightRAG Retrieval and Querying Flowchart - Img Caption : [Source](https://learnopencv.com/lightrag/)* 5 | -------------------------------------------------------------------------------- /docs/DockerDeployment.md: -------------------------------------------------------------------------------- 1 | # LightRAG 2 | 3 | A lightweight Knowledge Graph Retrieval-Augmented Generation system with multiple LLM backend support. 4 | 5 | ## 🚀 Installation 6 | 7 | ### Prerequisites 8 | - Python 3.10+ 9 | - Git 10 | - Docker (optional for Docker deployment) 11 | 12 | ### Native Installation 13 | 14 | 1. Clone the repository: 15 | ```bash 16 | # Linux/MacOS 17 | git clone https://github.com/HKUDS/LightRAG.git 18 | cd LightRAG 19 | ``` 20 | ```powershell 21 | # Windows PowerShell 22 | git clone https://github.com/HKUDS/LightRAG.git 23 | cd LightRAG 24 | ``` 25 | 26 | 2. Configure your environment: 27 | ```bash 28 | # Linux/MacOS 29 | cp .env.example .env 30 | # Edit .env with your preferred configuration 31 | ``` 32 | ```powershell 33 | # Windows PowerShell 34 | Copy-Item .env.example .env 35 | # Edit .env with your preferred configuration 36 | ``` 37 | 38 | 3. Create and activate virtual environment: 39 | ```bash 40 | # Linux/MacOS 41 | python -m venv venv 42 | source venv/bin/activate 43 | ``` 44 | ```powershell 45 | # Windows PowerShell 46 | python -m venv venv 47 | .\venv\Scripts\Activate 48 | ``` 49 | 50 | 4. Install dependencies: 51 | ```bash 52 | # Both platforms 53 | pip install -r requirements.txt 54 | ``` 55 | 56 | ## 🐳 Docker Deployment 57 | 58 | Docker instructions work the same on all platforms with Docker Desktop installed. 59 | 60 | 1. 
Build and start the container: 61 | ```bash 62 | docker-compose up -d 63 | ``` 64 | 65 | ### Configuration Options 66 | 67 | LightRAG can be configured using environment variables in the `.env` file: 68 | 69 | #### Server Configuration 70 | - `HOST`: Server host (default: 0.0.0.0) 71 | - `PORT`: Server port (default: 9621) 72 | 73 | #### LLM Configuration 74 | - `LLM_BINDING`: LLM backend to use (lollms/ollama/openai) 75 | - `LLM_BINDING_HOST`: LLM server host URL 76 | - `LLM_MODEL`: Model name to use 77 | 78 | #### Embedding Configuration 79 | - `EMBEDDING_BINDING`: Embedding backend (lollms/ollama/openai) 80 | - `EMBEDDING_BINDING_HOST`: Embedding server host URL 81 | - `EMBEDDING_MODEL`: Embedding model name 82 | 83 | #### RAG Configuration 84 | - `MAX_ASYNC`: Maximum async operations 85 | - `MAX_TOKENS`: Maximum token size 86 | - `EMBEDDING_DIM`: Embedding dimensions 87 | - `MAX_EMBED_TOKENS`: Maximum embedding token size 88 | 89 | #### Security 90 | - `LIGHTRAG_API_KEY`: API key for authentication 91 | 92 | ### Data Storage Paths 93 | 94 | The system uses the following paths for data storage: 95 | ``` 96 | data/ 97 | ├── rag_storage/ # RAG data persistence 98 | └── inputs/ # Input documents 99 | ``` 100 | 101 | ### Example Deployments 102 | 103 | 1. Using with Ollama: 104 | ```env 105 | LLM_BINDING=ollama 106 | LLM_BINDING_HOST=http://host.docker.internal:11434 107 | LLM_MODEL=mistral 108 | EMBEDDING_BINDING=ollama 109 | EMBEDDING_BINDING_HOST=http://host.docker.internal:11434 110 | EMBEDDING_MODEL=bge-m3 111 | ``` 112 | 113 | you can't just use localhost from docker, that's why you need to use host.docker.internal which is defined in the docker compose file and should allow you to access the localhost services. 114 | 115 | 2. Using with OpenAI: 116 | ```env 117 | LLM_BINDING=openai 118 | LLM_MODEL=gpt-3.5-turbo 119 | EMBEDDING_BINDING=openai 120 | EMBEDDING_MODEL=text-embedding-ada-002 121 | OPENAI_API_KEY=your-api-key 122 | ``` 123 | 124 | ### API Usage 125 | 126 | Once deployed, you can interact with the API at `http://localhost:9621` 127 | 128 | Example query using PowerShell: 129 | ```powershell 130 | $headers = @{ 131 | "X-API-Key" = "your-api-key" 132 | "Content-Type" = "application/json" 133 | } 134 | $body = @{ 135 | query = "your question here" 136 | } | ConvertTo-Json 137 | 138 | Invoke-RestMethod -Uri "http://localhost:9621/query" -Method Post -Headers $headers -Body $body 139 | ``` 140 | 141 | Example query using curl: 142 | ```bash 143 | curl -X POST "http://localhost:9621/query" \ 144 | -H "X-API-Key: your-api-key" \ 145 | -H "Content-Type: application/json" \ 146 | -d '{"query": "your question here"}' 147 | ``` 148 | 149 | ## 🔒 Security 150 | 151 | Remember to: 152 | 1. Set a strong API key in production 153 | 2. Use SSL in production environments 154 | 3. 
Configure proper network security 155 | 156 | ## 📦 Updates 157 | 158 | To update the Docker container: 159 | ```bash 160 | docker-compose pull 161 | docker-compose up -d --build 162 | ``` 163 | 164 | To update native installation: 165 | ```bash 166 | # Linux/MacOS 167 | git pull 168 | source venv/bin/activate 169 | pip install -r requirements.txt 170 | ``` 171 | ```powershell 172 | # Windows PowerShell 173 | git pull 174 | .\venv\Scripts\Activate 175 | pip install -r requirements.txt 176 | ``` 177 | -------------------------------------------------------------------------------- /examples/openai_README.md: -------------------------------------------------------------------------------- 1 | 2 | ## API Server Implementation 3 | 4 | LightRAG also provides a FastAPI-based server implementation for RESTful API access to RAG operations. This allows you to run LightRAG as a service and interact with it through HTTP requests. 5 | 6 | ### Setting up the API Server 7 |
8 | Click to expand setup instructions 9 | 10 | 1. First, ensure you have the required dependencies: 11 | ```bash 12 | pip install fastapi uvicorn pydantic 13 | ``` 14 | 15 | 2. Set up your environment variables: 16 | ```bash 17 | export RAG_DIR="your_index_directory" # Optional: Defaults to "index_default" 18 | export OPENAI_BASE_URL="Your OpenAI API base URL" # Optional: Defaults to "https://api.openai.com/v1" 19 | export OPENAI_API_KEY="Your OpenAI API key" # Required 20 | export LLM_MODEL="Your LLM model" # Optional: Defaults to "gpt-4o-mini" 21 | export EMBEDDING_MODEL="Your embedding model" # Optional: Defaults to "text-embedding-3-large" 22 | ``` 23 | 24 | 3. Run the API server: 25 | ```bash 26 | python examples/lightrag_api_openai_compatible_demo.py 27 | ``` 28 | 29 | The server will start on `http://0.0.0.0:8020`. 30 |
31 | 32 | ### API Endpoints 33 | 34 | The API server provides the following endpoints: 35 | 36 | #### 1. Query Endpoint 37 |
38 | Click to view Query endpoint details 39 | 40 | - **URL:** `/query` 41 | - **Method:** POST 42 | - **Body:** 43 | ```json 44 | { 45 | "query": "Your question here", 46 | "mode": "hybrid", // Can be "naive", "local", "global", or "hybrid" 47 | "only_need_context": true // Optional: Defaults to false, if true, only the referenced context will be returned, otherwise the llm answer will be returned 48 | } 49 | ``` 50 | - **Example:** 51 | ```bash 52 | curl -X POST "http://127.0.0.1:8020/query" \ 53 | -H "Content-Type: application/json" \ 54 | -d '{"query": "What are the main themes?", "mode": "hybrid"}' 55 | ``` 56 |
57 | 58 | #### 2. Insert Text Endpoint 59 |
60 | Click to view Insert Text endpoint details 61 | 62 | - **URL:** `/insert` 63 | - **Method:** POST 64 | - **Body:** 65 | ```json 66 | { 67 | "text": "Your text content here" 68 | } 69 | ``` 70 | - **Example:** 71 | ```bash 72 | curl -X POST "http://127.0.0.1:8020/insert" \ 73 | -H "Content-Type: application/json" \ 74 | -d '{"text": "Content to be inserted into RAG"}' 75 | ``` 76 |
77 | 78 | #### 3. Insert File Endpoint 79 |
80 | Click to view Insert File endpoint details 81 | 82 | - **URL:** `/insert_file` 83 | - **Method:** POST 84 | - **Body:** 85 | ```json 86 | { 87 | "file_path": "path/to/your/file.txt" 88 | } 89 | ``` 90 | - **Example:** 91 | ```bash 92 | curl -X POST "http://127.0.0.1:8020/insert_file" \ 93 | -H "Content-Type: application/json" \ 94 | -d '{"file_path": "./book.txt"}' 95 | ``` 96 |
97 | 98 | #### 4. Health Check Endpoint 99 |
100 | Click to view Health Check endpoint details 101 | 102 | - **URL:** `/health` 103 | - **Method:** GET 104 | - **Example:** 105 | ```bash 106 | curl -X GET "http://127.0.0.1:8020/health" 107 | ``` 108 |
109 | 110 | ### Configuration 111 | 112 | The API server can be configured using environment variables: 113 | - `RAG_DIR`: Directory for storing the RAG index (default: "index_default") 114 | - API keys and base URLs should be configured in the code for your specific LLM and embedding model providers 115 | -------------------------------------------------------------------------------- /examples/openai_README_zh.md: -------------------------------------------------------------------------------- 1 | 2 | ## API 服务器实现 3 | 4 | LightRAG also provides a FastAPI-based server implementation for RESTful API access to RAG operations. This allows you to run LightRAG as a service and interact with it through HTTP requests. 5 | LightRAG 还提供基于 FastAPI 的服务器实现,用于对 RAG 操作进行 RESTful API 访问。这允许您将 LightRAG 作为服务运行并通过 HTTP 请求与其交互。 6 | 7 | ### 设置 API 服务器 8 |
9 | 单击展开设置说明 10 | 11 | 1. 首先,确保您具有所需的依赖项: 12 | ```bash 13 | pip install fastapi uvicorn pydantic 14 | ``` 15 | 16 | 2. 设置您的环境变量: 17 | ```bash 18 | export RAG_DIR="your_index_directory" # Optional: Defaults to "index_default" 19 | export OPENAI_BASE_URL="Your OpenAI API base URL" # Optional: Defaults to "https://api.openai.com/v1" 20 | export OPENAI_API_KEY="Your OpenAI API key" # Required 21 | export LLM_MODEL="Your LLM model" # Optional: Defaults to "gpt-4o-mini" 22 | export EMBEDDING_MODEL="Your embedding model" # Optional: Defaults to "text-embedding-3-large" 23 | ``` 24 | 25 | 3. 运行API服务器: 26 | ```bash 27 | python examples/lightrag_api_openai_compatible_demo.py 28 | ``` 29 | 30 | 服务器将启动于 `http://0.0.0.0:8020`. 31 |
32 | 33 | ### API端点 34 | 35 | API服务器提供以下端点: 36 | 37 | #### 1. 查询端点 38 |
39 | 点击查看查询端点详情 40 | 41 | - **URL:** `/query` 42 | - **Method:** POST 43 | - **Body:** 44 | ```json 45 | { 46 | "query": "Your question here", 47 | "mode": "hybrid", // Can be "naive", "local", "global", or "hybrid" 48 | "only_need_context": true // Optional: Defaults to false, if true, only the referenced context will be returned, otherwise the llm answer will be returned 49 | } 50 | ``` 51 | - **Example:** 52 | ```bash 53 | curl -X POST "http://127.0.0.1:8020/query" \ 54 | -H "Content-Type: application/json" \ 55 | -d '{"query": "What are the main themes?", "mode": "hybrid"}' 56 | ``` 57 |
58 | 59 | #### 2. 插入文本端点 60 |
61 | 单击可查看插入文本端点详细信息 62 | 63 | - **URL:** `/insert` 64 | - **Method:** POST 65 | - **Body:** 66 | ```json 67 | { 68 | "text": "Your text content here" 69 | } 70 | ``` 71 | - **Example:** 72 | ```bash 73 | curl -X POST "http://127.0.0.1:8020/insert" \ 74 | -H "Content-Type: application/json" \ 75 | -d '{"text": "Content to be inserted into RAG"}' 76 | ``` 77 |
78 | 79 | #### 3. 插入文件端点 80 |
81 | 单击查看插入文件端点详细信息 82 | 83 | - **URL:** `/insert_file` 84 | - **Method:** POST 85 | - **Body:** 86 | ```json 87 | { 88 | "file_path": "path/to/your/file.txt" 89 | } 90 | ``` 91 | - **Example:** 92 | ```bash 93 | curl -X POST "http://127.0.0.1:8020/insert_file" \ 94 | -H "Content-Type: application/json" \ 95 | -d '{"file_path": "./book.txt"}' 96 | ``` 97 |
98 | 99 | #### 4. 健康检查端点 100 |
101 | 点击查看健康检查端点详细信息 102 | 103 | - **URL:** `/health` 104 | - **Method:** GET 105 | - **Example:** 106 | ```bash 107 | curl -X GET "http://127.0.0.1:8020/health" 108 | ``` 109 |
110 | 111 | ### 配置 112 | 113 | 可以使用环境变量配置API服务器: 114 | - `RAG_DIR`: 存放RAG索引的目录 (default: "index_default") 115 | - 应在代码中为您的特定 LLM 和嵌入模型提供商配置 API 密钥和基本 URL 116 | -------------------------------------------------------------------------------- /lightrag/api/README-zh.md: -------------------------------------------------------------------------------- 1 | # LightRAG 服务器和 Web 界面 2 | 3 | LightRAG 服务器旨在提供 Web 界面和 API 支持。Web 界面便于文档索引、知识图谱探索和简单的 RAG 查询界面。LightRAG 服务器还提供了与 Ollama 兼容的接口,旨在将 LightRAG 模拟为 Ollama 聊天模型。这使得 AI 聊天机器人(如 Open WebUI)可以轻松访问 LightRAG。 4 | 5 | ![image-20250323122538997](./README.assets/image-20250323122538997.png) 6 | 7 | ![image-20250323122754387](./README.assets/image-20250323122754387.png) 8 | 9 | ![image-20250323123011220](./README.assets/image-20250323123011220.png) 10 | 11 | ## 入门指南 12 | 13 | ### 安装 14 | 15 | * 从 PyPI 安装 16 | 17 | ```bash 18 | pip install "lightrag-hku[api]" 19 | ``` 20 | 21 | * 从源代码安装 22 | 23 | ```bash 24 | # 克隆仓库 25 | git clone https://github.com/HKUDS/lightrag.git 26 | 27 | # 切换到仓库目录 28 | cd lightrag 29 | 30 | # 如有必要,创建 Python 虚拟环境 31 | # 以可编辑模式安装并支持 API 32 | pip install -e ".[api]" 33 | ``` 34 | 35 | ### 启动 LightRAG 服务器前的准备 36 | 37 | LightRAG 需要同时集成 LLM(大型语言模型)和嵌入模型以有效执行文档索引和查询操作。在首次部署 LightRAG 服务器之前,必须配置 LLM 和嵌入模型的设置。LightRAG 支持绑定到各种 LLM/嵌入后端: 38 | 39 | * ollama 40 | * lollms 41 | * openai 或 openai 兼容 42 | * azure_openai 43 | 44 | 建议使用环境变量来配置 LightRAG 服务器。项目根目录中有一个名为 `env.example` 的示例环境变量文件。请将此文件复制到启动目录并重命名为 `.env`。之后,您可以在 `.env` 文件中修改与 LLM 和嵌入模型相关的参数。需要注意的是,LightRAG 服务器每次启动时都会将 `.env` 中的环境变量加载到系统环境变量中。由于 LightRAG 服务器会优先使用系统环境变量中的设置,如果您在通过命令行启动 LightRAG 服务器后修改了 `.env` 文件,则需要执行 `source .env` 使新设置生效。 45 | 46 | 以下是 LLM 和嵌入模型的一些常见设置示例: 47 | 48 | * OpenAI LLM + Ollama 嵌入 49 | 50 | ``` 51 | LLM_BINDING=openai 52 | LLM_MODEL=gpt-4o 53 | LLM_BINDING_HOST=https://api.openai.com/v1 54 | LLM_BINDING_API_KEY=your_api_key 55 | ### 发送给 LLM 的最大 token 数(小于模型上下文大小) 56 | MAX_TOKENS=32768 57 | 58 | EMBEDDING_BINDING=ollama 59 | EMBEDDING_BINDING_HOST=http://localhost:11434 60 | EMBEDDING_MODEL=bge-m3:latest 61 | EMBEDDING_DIM=1024 62 | # EMBEDDING_BINDING_API_KEY=your_api_key 63 | ``` 64 | 65 | * Ollama LLM + Ollama 嵌入 66 | 67 | ``` 68 | LLM_BINDING=ollama 69 | LLM_MODEL=mistral-nemo:latest 70 | LLM_BINDING_HOST=http://localhost:11434 71 | # LLM_BINDING_API_KEY=your_api_key 72 | ### 发送给 LLM 的最大 token 数(基于您的 Ollama 服务器容量) 73 | MAX_TOKENS=8192 74 | 75 | EMBEDDING_BINDING=ollama 76 | EMBEDDING_BINDING_HOST=http://localhost:11434 77 | EMBEDDING_MODEL=bge-m3:latest 78 | EMBEDDING_DIM=1024 79 | # EMBEDDING_BINDING_API_KEY=your_api_key 80 | ``` 81 | 82 | ### 启动 LightRAG 服务器 83 | 84 | LightRAG 服务器支持两种运行模式: 85 | * 简单高效的 Uvicorn 模式 86 | 87 | ``` 88 | lightrag-server 89 | ``` 90 | * 多进程 Gunicorn + Uvicorn 模式(生产模式,不支持 Windows 环境) 91 | 92 | ``` 93 | lightrag-gunicorn --workers 4 94 | ``` 95 | `.env` 文件必须放在启动目录中。启动时,LightRAG 服务器将创建一个文档目录(默认为 `./inputs`)和一个数据目录(默认为 `./rag_storage`)。这允许您从不同目录启动多个 LightRAG 服务器实例,每个实例配置为监听不同的网络端口。 96 | 97 | 以下是一些常用的启动参数: 98 | 99 | - `--host`:服务器监听地址(默认:0.0.0.0) 100 | - `--port`:服务器监听端口(默认:9621) 101 | - `--timeout`:LLM 请求超时时间(默认:150 秒) 102 | - `--log-level`:日志级别(默认:INFO) 103 | - --input-dir:指定要扫描文档的目录(默认:./input) 104 | 105 | > ** 要求将.env文件置于启动目录中是经过特意设计的**。 这样做的目的是支持用户同时启动多个LightRAG实例,并为不同实例配置不同的.env文件。 106 | 107 | > **修改.env文件后,您需要重新打开终端以使新设置生效**。 这是因为每次启动时,LightRAG Server会将.env文件中的环境变量加载至系统环境变量,且系统环境变量的设置具有更高优先级。 108 | 109 | ### 启动时自动扫描 110 | 111 | 当使用 `--auto-scan-at-startup` 参数启动任何服务器时,系统将自动: 112 | 113 | 1. 扫描输入目录中的新文件 114 | 2. 
为尚未在数据库中的新文档建立索引 115 | 3. 使所有内容立即可用于 RAG 查询 116 | 117 | > `--input-dir` 参数指定要扫描的输入目录。您可以从 webui 触发输入目录扫描。 118 | 119 | ### Gunicorn + Uvicorn 的多工作进程 120 | 121 | LightRAG 服务器可以在 `Gunicorn + Uvicorn` 预加载模式下运行。Gunicorn 的多工作进程(多进程)功能可以防止文档索引任务阻塞 RAG 查询。使用 CPU 密集型文档提取工具(如 docling)在纯 Uvicorn 模式下可能会导致整个系统被阻塞。 122 | 123 | 虽然 LightRAG 服务器使用一个工作进程来处理文档索引流程,但通过 Uvicorn 的异步任务支持,可以并行处理多个文件。文档索引速度的瓶颈主要在于 LLM。如果您的 LLM 支持高并发,您可以通过增加 LLM 的并发级别来加速文档索引。以下是几个与并发处理相关的环境变量及其默认值: 124 | 125 | ``` 126 | ### 工作进程数,数字不大于 (2 x 核心数) + 1 127 | WORKERS=2 128 | ### 一批中并行处理的文件数 129 | MAX_PARALLEL_INSERT=2 130 | # LLM 的最大并发请求数 131 | MAX_ASYNC=4 132 | ``` 133 | 134 | ### 将 Lightrag 安装为 Linux 服务 135 | 136 | 从示例文件 `lightrag.sevice.example` 创建您的服务文件 `lightrag.sevice`。修改服务文件中的 WorkingDirectory 和 ExecStart: 137 | 138 | ```text 139 | Description=LightRAG Ollama Service 140 | WorkingDirectory= 141 | ExecStart=/lightrag/api/lightrag-api 142 | ``` 143 | 144 | 修改您的服务启动脚本:`lightrag-api`。根据需要更改 python 虚拟环境激活命令: 145 | 146 | ```shell 147 | #!/bin/bash 148 | 149 | # 您的 python 虚拟环境激活命令 150 | source /home/netman/lightrag-xyj/venv/bin/activate 151 | # 启动 lightrag api 服务器 152 | lightrag-server 153 | ``` 154 | 155 | 安装 LightRAG 服务。如果您的系统是 Ubuntu,以下命令将生效: 156 | 157 | ```shell 158 | sudo cp lightrag.service /etc/systemd/system/ 159 | sudo systemctl daemon-reload 160 | sudo systemctl start lightrag.service 161 | sudo systemctl status lightrag.service 162 | sudo systemctl enable lightrag.service 163 | ``` 164 | 165 | ## Ollama 模拟 166 | 167 | 我们为 LightRAG 提供了 Ollama 兼容接口,旨在将 LightRAG 模拟为 Ollama 聊天模型。这使得支持 Ollama 的 AI 聊天前端(如 Open WebUI)可以轻松访问 LightRAG。 168 | 169 | ### 将 Open WebUI 连接到 LightRAG 170 | 171 | 启动 lightrag-server 后,您可以在 Open WebUI 管理面板中添加 Ollama 类型的连接。然后,一个名为 `lightrag:latest` 的模型将出现在 Open WebUI 的模型管理界面中。用户随后可以通过聊天界面向 LightRAG 发送查询。对于这种用例,最好将 LightRAG 安装为服务。 172 | 173 | Open WebUI 使用 LLM 来执行会话标题和会话关键词生成任务。因此,Ollama 聊天补全 API 会检测并将 OpenWebUI 会话相关请求直接转发给底层 LLM。Open WebUI 的截图: 174 | 175 | ![image-20250323194750379](./README.assets/image-20250323194750379.png) 176 | 177 | ### 在聊天中选择查询模式 178 | 179 | 如果您从 LightRAG 的 Ollama 接口发送消息(查询),默认查询模式是 `hybrid`。您可以通过发送带有查询前缀的消息来选择查询模式。 180 | 181 | 查询字符串中的查询前缀可以决定使用哪种 LightRAG 查询模式来生成响应。支持的前缀包括: 182 | 183 | ``` 184 | /local 185 | /global 186 | /hybrid 187 | /naive 188 | /mix 189 | 190 | /bypass 191 | /context 192 | /localcontext 193 | /globalcontext 194 | /hybridcontext 195 | /naivecontext 196 | /mixcontext 197 | ``` 198 | 199 | 例如,聊天消息 "/mix 唐僧有几个徒弟" 将触发 LightRAG 的混合模式查询。没有查询前缀的聊天消息默认会触发混合模式查询。 200 | 201 | "/bypass" 不是 LightRAG 查询模式,它会告诉 API 服务器将查询连同聊天历史直接传递给底层 LLM。因此用户可以使用 LLM 基于聊天历史回答问题。如果您使用 Open WebUI 作为前端,您可以直接切换到普通 LLM 模型,而不是使用 /bypass 前缀。 202 | 203 | "/context" 也不是 LightRAG 查询模式,它会告诉 LightRAG 只返回为 LLM 准备的上下文信息。您可以检查上下文是否符合您的需求,或者自行处理上下文。 204 | 205 | ## API 密钥和认证 206 | 207 | 默认情况下,LightRAG 服务器可以在没有任何认证的情况下访问。我们可以使用 API 密钥或账户凭证配置服务器以确保其安全。 208 | 209 | * API 密钥 210 | 211 | ``` 212 | LIGHTRAG_API_KEY=your-secure-api-key-here 213 | WHITELIST_PATHS=/health,/api/* 214 | ``` 215 | 216 | > 健康检查和 Ollama 模拟端点默认不进行 API 密钥检查。 217 | 218 | * 账户凭证(Web 界面需要登录后才能访问) 219 | 220 | LightRAG API 服务器使用基于 HS256 算法的 JWT 认证。要启用安全访问控制,需要以下环境变量: 221 | 222 | ```bash 223 | # JWT 认证 224 | AUTH_ACCOUNTS='admin:admin123,user1:pass456' 225 | TOKEN_SECRET='your-key' 226 | TOKEN_EXPIRE_HOURS=4 227 | ``` 228 | 229 | > 目前仅支持配置一个管理员账户和密码。尚未开发和实现完整的账户系统。 230 | 231 | 如果未配置账户凭证,Web 界面将以访客身份访问系统。因此,即使仅配置了 API 密钥,所有 API 仍然可以通过访客账户访问,这仍然不安全。因此,要保护 API,需要同时配置这两种认证方法。 232 | 233 | ## Azure OpenAI 后端配置 234 | 235 | 可以使用以下 Azure CLI 命令创建 Azure OpenAI 
API(您需要先从 [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) 安装 Azure CLI): 236 | 237 | ```bash 238 | # 根据需要更改资源组名称、位置和 OpenAI 资源名称 239 | RESOURCE_GROUP_NAME=LightRAG 240 | LOCATION=swedencentral 241 | RESOURCE_NAME=LightRAG-OpenAI 242 | 243 | az login 244 | az group create --name $RESOURCE_GROUP_NAME --location $LOCATION 245 | az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location swedencentral 246 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard" 247 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard" 248 | az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint" 249 | az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME 250 | ``` 251 | 252 | 最后一个命令的输出将提供 OpenAI API 的端点和密钥。您可以使用这些值在 `.env` 文件中设置环境变量。 253 | 254 | ``` 255 | # .env 中的 Azure OpenAI 配置 256 | LLM_BINDING=azure_openai 257 | LLM_BINDING_HOST=your-azure-endpoint 258 | LLM_MODEL=your-model-deployment-name 259 | LLM_BINDING_API_KEY=your-azure-api-key 260 | ### API Version可选,默认为最新版本 261 | AZURE_OPENAI_API_VERSION=2024-08-01-preview 262 | 263 | ### 如果使用 Azure OpenAI 进行嵌入 264 | EMBEDDING_BINDING=azure_openai 265 | EMBEDDING_MODEL=your-embedding-deployment-name 266 | ``` 267 | 268 | ## LightRAG 服务器详细配置 269 | 270 | API 服务器可以通过三种方式配置(优先级从高到低): 271 | 272 | * 命令行参数 273 | * 环境变量或 .env 文件 274 | * Config.ini(仅用于存储配置) 275 | 276 | 大多数配置都有默认设置,详细信息请查看示例文件:`.env.example`。数据存储配置也可以通过 config.ini 设置。为方便起见,提供了示例文件 `config.ini.example`。 277 | 278 | ### 支持的 LLM 和嵌入后端 279 | 280 | LightRAG 支持绑定到各种 LLM/嵌入后端: 281 | 282 | * ollama 283 | * lollms 284 | * openai 和 openai 兼容 285 | * azure_openai 286 | 287 | 使用环境变量 `LLM_BINDING` 或 CLI 参数 `--llm-binding` 选择 LLM 后端类型。使用环境变量 `EMBEDDING_BINDING` 或 CLI 参数 `--embedding-binding` 选择嵌入后端类型。 288 | 289 | ### 实体提取配置 290 | * ENABLE_LLM_CACHE_FOR_EXTRACT:为实体提取启用 LLM 缓存(默认:true) 291 | 292 | 在测试环境中将 `ENABLE_LLM_CACHE_FOR_EXTRACT` 设置为 true 以减少 LLM 调用成本是很常见的做法。 293 | 294 | ### 支持的存储类型 295 | 296 | LightRAG 使用 4 种类型的存储用于不同目的: 297 | 298 | * KV_STORAGE:llm 响应缓存、文本块、文档信息 299 | * VECTOR_STORAGE:实体向量、关系向量、块向量 300 | * GRAPH_STORAGE:实体关系图 301 | * DOC_STATUS_STORAGE:文档索引状态 302 | 303 | 每种存储类型都有几种实现: 304 | 305 | * KV_STORAGE 支持的实现名称 306 | 307 | ``` 308 | JsonKVStorage JsonFile(默认) 309 | PGKVStorage Postgres 310 | RedisKVStorage Redis 311 | MongoKVStorage MogonDB 312 | ``` 313 | 314 | * GRAPH_STORAGE 支持的实现名称 315 | 316 | ``` 317 | NetworkXStorage NetworkX(默认) 318 | Neo4JStorage Neo4J 319 | PGGraphStorage PostgreSQL with AGE plugin 320 | ``` 321 | 322 | > 在测试中Neo4j图形数据库相比PostgreSQL AGE有更好的性能表现。 323 | 324 | * VECTOR_STORAGE 支持的实现名称 325 | 326 | ``` 327 | NanoVectorDBStorage NanoVector(默认) 328 | PGVectorStorage Postgres 329 | MilvusVectorDBStorge Milvus 330 | ChromaVectorDBStorage Chroma 331 | FaissVectorDBStorage Faiss 332 | QdrantVectorDBStorage Qdrant 333 | MongoVectorDBStorage MongoDB 334 | ``` 335 | 336 | * DOC_STATUS_STORAGE 支持的实现名称 337 | 338 | ``` 339 | JsonDocStatusStorage JsonFile(默认) 340 | 
PGDocStatusStorage Postgres 341 | MongoDocStatusStorage MongoDB 342 | ``` 343 | 344 | ### 如何选择存储实现 345 | 346 | 您可以通过环境变量选择存储实现。在首次启动 API 服务器之前,您可以将以下环境变量设置为特定的存储实现名称: 347 | 348 | ``` 349 | LIGHTRAG_KV_STORAGE=PGKVStorage 350 | LIGHTRAG_VECTOR_STORAGE=PGVectorStorage 351 | LIGHTRAG_GRAPH_STORAGE=PGGraphStorage 352 | LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage 353 | ``` 354 | 355 | 在向 LightRAG 添加文档后,您不能更改存储实现选择。目前尚不支持从一个存储实现迁移到另一个存储实现。更多信息请阅读示例 env 文件或 config.ini 文件。 356 | 357 | ### LightRag API 服务器命令行选项 358 | 359 | | 参数 | 默认值 | 描述 | 360 | |-----------|---------|-------------| 361 | | --host | 0.0.0.0 | 服务器主机 | 362 | | --port | 9621 | 服务器端口 | 363 | | --working-dir | ./rag_storage | RAG 存储的工作目录 | 364 | | --input-dir | ./inputs | 包含输入文档的目录 | 365 | | --max-async | 4 | 最大异步操作数 | 366 | | --max-tokens | 32768 | 最大 token 大小 | 367 | | --timeout | 150 | 超时时间(秒)。None 表示无限超时(不推荐) | 368 | | --log-level | INFO | 日志级别(DEBUG、INFO、WARNING、ERROR、CRITICAL) | 369 | | --verbose | - | 详细调试输出(True、False) | 370 | | --key | None | 用于认证的 API 密钥。保护 lightrag 服务器免受未授权访问 | 371 | | --ssl | False | 启用 HTTPS | 372 | | --ssl-certfile | None | SSL 证书文件路径(如果启用 --ssl 则必需) | 373 | | --ssl-keyfile | None | SSL 私钥文件路径(如果启用 --ssl 则必需) | 374 | | --top-k | 50 | 要检索的 top-k 项目数;在"local"模式下对应实体,在"global"模式下对应关系。 | 375 | | --cosine-threshold | 0.4 | 节点和关系检索的余弦阈值,与 top-k 一起控制节点和关系的检索。 | 376 | | --llm-binding | ollama | LLM 绑定类型(lollms、ollama、openai、openai-ollama、azure_openai) | 377 | | --embedding-binding | ollama | 嵌入绑定类型(lollms、ollama、openai、azure_openai) | 378 | | auto-scan-at-startup | - | 扫描输入目录中的新文件并开始索引 | 379 | 380 | ### .env 文件示例 381 | 382 | ```bash 383 | ### Server Configuration 384 | # HOST=0.0.0.0 385 | PORT=9621 386 | WORKERS=2 387 | 388 | ### Settings for document indexing 389 | ENABLE_LLM_CACHE_FOR_EXTRACT=true 390 | SUMMARY_LANGUAGE=Chinese 391 | MAX_PARALLEL_INSERT=2 392 | 393 | ### LLM Configuration (Use valid host. For local services installed with docker, you can use host.docker.internal) 394 | TIMEOUT=200 395 | TEMPERATURE=0.0 396 | MAX_ASYNC=4 397 | MAX_TOKENS=32768 398 | 399 | LLM_BINDING=openai 400 | LLM_MODEL=gpt-4o-mini 401 | LLM_BINDING_HOST=https://api.openai.com/v1 402 | LLM_BINDING_API_KEY=your-api-key 403 | 404 | ### Embedding Configuration (Use valid host. 
For local services installed with docker, you can use host.docker.internal) 405 | EMBEDDING_MODEL=bge-m3:latest 406 | EMBEDDING_DIM=1024 407 | EMBEDDING_BINDING=ollama 408 | EMBEDDING_BINDING_HOST=http://localhost:11434 409 | 410 | ### For JWT Auth 411 | # AUTH_ACCOUNTS='admin:admin123,user1:pass456' 412 | # TOKEN_SECRET=your-key-for-LightRAG-API-Server-xxx 413 | # TOKEN_EXPIRE_HOURS=48 414 | 415 | # LIGHTRAG_API_KEY=your-secure-api-key-here-123 416 | # WHITELIST_PATHS=/api/* 417 | # WHITELIST_PATHS=/health,/api/* 418 | ``` 419 | 420 | 421 | 422 | #### 使用 ollama 默认本地服务器作为 llm 和嵌入后端运行 Lightrag 服务器 423 | 424 | Ollama 是 llm 和嵌入的默认后端,因此默认情况下您可以不带参数运行 lightrag-server,将使用默认值。确保已安装 ollama 并且正在运行,且默认模型已安装在 ollama 上。 425 | 426 | ```bash 427 | # 使用 ollama 运行 lightrag,llm 使用 mistral-nemo:latest,嵌入使用 bge-m3:latest 428 | lightrag-server 429 | 430 | # 使用认证密钥 431 | lightrag-server --key my-key 432 | ``` 433 | 434 | #### 使用 lollms 默认本地服务器作为 llm 和嵌入后端运行 Lightrag 服务器 435 | 436 | ```bash 437 | # 使用 lollms 运行 lightrag,llm 使用 mistral-nemo:latest,嵌入使用 bge-m3:latest 438 | # 在 .env 或 config.ini 中配置 LLM_BINDING=lollms 和 EMBEDDING_BINDING=lollms 439 | lightrag-server 440 | 441 | # 使用认证密钥 442 | lightrag-server --key my-key 443 | ``` 444 | 445 | #### 使用 openai 服务器作为 llm 和嵌入后端运行 Lightrag 服务器 446 | 447 | ```bash 448 | # 使用 openai 运行 lightrag,llm 使用 GPT-4o-mini,嵌入使用 text-embedding-3-small 449 | # 在 .env 或 config.ini 中配置: 450 | # LLM_BINDING=openai 451 | # LLM_MODEL=GPT-4o-mini 452 | # EMBEDDING_BINDING=openai 453 | # EMBEDDING_MODEL=text-embedding-3-small 454 | lightrag-server 455 | 456 | # 使用认证密钥 457 | lightrag-server --key my-key 458 | ``` 459 | 460 | #### 使用 azure openai 服务器作为 llm 和嵌入后端运行 Lightrag 服务器 461 | 462 | ```bash 463 | # 使用 azure_openai 运行 lightrag 464 | # 在 .env 或 config.ini 中配置: 465 | # LLM_BINDING=azure_openai 466 | # LLM_MODEL=your-model 467 | # EMBEDDING_BINDING=azure_openai 468 | # EMBEDDING_MODEL=your-embedding-model 469 | lightrag-server 470 | 471 | # 使用认证密钥 472 | lightrag-server --key my-key 473 | ``` 474 | 475 | **重要说明:** 476 | - 对于 LoLLMs:确保指定的模型已安装在您的 LoLLMs 实例中 477 | - 对于 Ollama:确保指定的模型已安装在您的 Ollama 实例中 478 | - 对于 OpenAI:确保您已设置 OPENAI_API_KEY 环境变量 479 | - 对于 Azure OpenAI:按照先决条件部分所述构建和配置您的服务器 480 | 481 | 要获取任何服务器的帮助,使用 --help 标志: 482 | ```bash 483 | lightrag-server --help 484 | ``` 485 | 486 | 注意:如果您不需要 API 功能,可以使用以下命令安装不带 API 支持的基本包: 487 | ```bash 488 | pip install lightrag-hku 489 | ``` 490 | 491 | ## API 端点 492 | 493 | 所有服务器(LoLLMs、Ollama、OpenAI 和 Azure OpenAI)都为 RAG 功能提供相同的 REST API 端点。当 API 服务器运行时,访问: 494 | 495 | - Swagger UI:http://localhost:9621/docs 496 | - ReDoc:http://localhost:9621/redoc 497 | 498 | 您可以使用提供的 curl 命令或通过 Swagger UI 界面测试 API 端点。确保: 499 | 500 | 1. 启动适当的后端服务(LoLLMs、Ollama 或 OpenAI) 501 | 2. 启动 RAG 服务器 502 | 3. 使用文档管理端点上传一些文档 503 | 4. 使用查询端点查询系统 504 | 5. 
如果在输入目录中放入新文件,触发文档扫描 505 | 506 | ### 查询端点 507 | 508 | #### POST /query 509 | 使用不同搜索模式查询 RAG 系统。 510 | 511 | ```bash 512 | curl -X POST "http://localhost:9621/query" \ 513 | -H "Content-Type: application/json" \ 514 | -d '{"query": "您的问题", "mode": "hybrid", ""}' 515 | ``` 516 | 517 | #### POST /query/stream 518 | 从 RAG 系统流式获取响应。 519 | 520 | ```bash 521 | curl -X POST "http://localhost:9621/query/stream" \ 522 | -H "Content-Type: application/json" \ 523 | -d '{"query": "您的问题", "mode": "hybrid"}' 524 | ``` 525 | 526 | ### 文档管理端点 527 | 528 | #### POST /documents/text 529 | 直接将文本插入 RAG 系统。 530 | 531 | ```bash 532 | curl -X POST "http://localhost:9621/documents/text" \ 533 | -H "Content-Type: application/json" \ 534 | -d '{"text": "您的文本内容", "description": "可选描述"}' 535 | ``` 536 | 537 | #### POST /documents/file 538 | 向 RAG 系统上传单个文件。 539 | 540 | ```bash 541 | curl -X POST "http://localhost:9621/documents/file" \ 542 | -F "file=@/path/to/your/document.txt" \ 543 | -F "description=可选描述" 544 | ``` 545 | 546 | #### POST /documents/batch 547 | 一次上传多个文件。 548 | 549 | ```bash 550 | curl -X POST "http://localhost:9621/documents/batch" \ 551 | -F "files=@/path/to/doc1.txt" \ 552 | -F "files=@/path/to/doc2.txt" 553 | ``` 554 | 555 | #### POST /documents/scan 556 | 557 | 触发输入目录中新文件的文档扫描。 558 | 559 | ```bash 560 | curl -X POST "http://localhost:9621/documents/scan" --max-time 1800 561 | ``` 562 | 563 | > 根据所有新文件的预计索引时间调整 max-time。 564 | 565 | #### DELETE /documents 566 | 567 | 从 RAG 系统中清除所有文档。 568 | 569 | ```bash 570 | curl -X DELETE "http://localhost:9621/documents" 571 | ``` 572 | 573 | ### Ollama 模拟端点 574 | 575 | #### GET /api/version 576 | 577 | 获取 Ollama 版本信息。 578 | 579 | ```bash 580 | curl http://localhost:9621/api/version 581 | ``` 582 | 583 | #### GET /api/tags 584 | 585 | 获取 Ollama 可用模型。 586 | 587 | ```bash 588 | curl http://localhost:9621/api/tags 589 | ``` 590 | 591 | #### POST /api/chat 592 | 593 | 处理聊天补全请求。通过根据查询前缀选择查询模式将用户查询路由到 LightRAG。检测并将 OpenWebUI 会话相关请求(用于元数据生成任务)直接转发给底层 LLM。 594 | 595 | ```shell 596 | curl -N -X POST http://localhost:9621/api/chat -H "Content-Type: application/json" -d \ 597 | '{"model":"lightrag:latest","messages":[{"role":"user","content":"猪八戒是谁"}],"stream":true}' 598 | ``` 599 | 600 | > 有关 Ollama API 的更多信息,请访问:[Ollama API 文档](https://github.com/ollama/ollama/blob/main/docs/api.md) 601 | 602 | #### POST /api/generate 603 | 604 | 处理生成补全请求。为了兼容性目的,该请求不由 LightRAG 处理,而是由底层 LLM 模型处理。 605 | 606 | ### 实用工具端点 607 | 608 | #### GET /health 609 | 检查服务器健康状况和配置。 610 | 611 | ```bash 612 | curl "http://localhost:9621/health" 613 | 614 | ``` 615 | -------------------------------------------------------------------------------- /lightrag/api/README.md: -------------------------------------------------------------------------------- 1 | # LightRAG Server and WebUI 2 | 3 | The LightRAG Server is designed to provide a Web UI and API support. The Web UI facilitates document indexing, knowledge graph exploration, and a simple RAG query interface. LightRAG Server also provides an Ollama-compatible interface, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat bots, such as Open WebUI, to access LightRAG easily. 
4 | 5 | ![image-20250323122538997](./README.assets/image-20250323122538997.png) 6 | 7 | ![image-20250323122754387](./README.assets/image-20250323122754387.png) 8 | 9 | ![image-20250323123011220](./README.assets/image-20250323123011220.png) 10 | 11 | ## Getting Started 12 | 13 | ### Installation 14 | 15 | * Install from PyPI 16 | 17 | ```bash 18 | pip install "lightrag-hku[api]" 19 | ``` 20 | 21 | * Installation from Source 22 | 23 | ```bash 24 | # Clone the repository 25 | git clone https://github.com/HKUDS/lightrag.git 26 | 27 | # Change to the repository directory 28 | cd lightrag 29 | 30 | # create a Python virtual environment if necessary 31 | # Install in editable mode with API support 32 | pip install -e ".[api]" 33 | ``` 34 | 35 | ### Before Starting LightRAG Server 36 | 37 | LightRAG necessitates the integration of both an LLM (Large Language Model) and an Embedding Model to effectively execute document indexing and querying operations. Prior to the initial deployment of the LightRAG server, it is essential to configure the settings for both the LLM and the Embedding Model. LightRAG supports binding to various LLM/Embedding backends: 38 | 39 | * ollama 40 | * lollms 41 | * openai or openai compatible 42 | * azure_openai 43 | 44 | It is recommended to use environment variables to configure the LightRAG Server. There is an example environment variable file named `env.example` in the root directory of the project. Please copy this file to the startup directory and rename it to `.env`. After that, you can modify the parameters related to the LLM and Embedding models in the `.env` file. It is important to note that the LightRAG Server will load the environment variables from `.env` into the system environment variables each time it starts. Since the LightRAG Server will prioritize the settings in the system environment variables, if you modify the `.env` file after starting the LightRAG Server via the command line, you need to execute `source .env` to make the new settings take effect. 45 | 46 | Here are some examples of common settings for LLM and Embedding models: 47 | 48 | * OpenAI LLM + Ollama Embedding: 49 | 50 | ``` 51 | LLM_BINDING=openai 52 | LLM_MODEL=gpt-4o 53 | LLM_BINDING_HOST=https://api.openai.com/v1 54 | LLM_BINDING_API_KEY=your_api_key 55 | ### Max tokens sent to LLM (less than model context size) 56 | MAX_TOKENS=32768 57 | 58 | EMBEDDING_BINDING=ollama 59 | EMBEDDING_BINDING_HOST=http://localhost:11434 60 | EMBEDDING_MODEL=bge-m3:latest 61 | EMBEDDING_DIM=1024 62 | # EMBEDDING_BINDING_API_KEY=your_api_key 63 | ``` 64 | 65 | * Ollama LLM + Ollama Embedding: 66 | 67 | ``` 68 | LLM_BINDING=ollama 69 | LLM_MODEL=mistral-nemo:latest 70 | LLM_BINDING_HOST=http://localhost:11434 71 | # LLM_BINDING_API_KEY=your_api_key 72 | ### Max tokens sent to LLM (based on your Ollama Server capacity) 73 | MAX_TOKENS=8192 74 | 75 | EMBEDDING_BINDING=ollama 76 | EMBEDDING_BINDING_HOST=http://localhost:11434 77 | EMBEDDING_MODEL=bge-m3:latest 78 | EMBEDDING_DIM=1024 79 | # EMBEDDING_BINDING_API_KEY=your_api_key 80 | ``` 81 | 82 | ### Starting LightRAG Server 83 | 84 | The LightRAG Server supports two operational modes: 85 | * The simple and efficient Uvicorn mode: 86 | 87 | ``` 88 | lightrag-server 89 | ``` 90 | * The multiprocess Gunicorn + Uvicorn mode (production mode, not supported on Windows environments): 91 | 92 | ``` 93 | lightrag-gunicorn --workers 4 94 | ``` 95 | The `.env` file **must be placed in the startup directory**. 
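Putting the steps above together, a minimal first-start sequence could look like the sketch below (the clone path and the exact binding values are placeholders; set the LLM and Embedding settings that match your environment before launching):

```bash
# Copy the sample environment file into your startup directory and edit it
cp /path/to/LightRAG/env.example .env   # "/path/to/LightRAG" is a placeholder for your clone
# Edit .env: set LLM_BINDING, LLM_MODEL, EMBEDDING_BINDING and the related hosts/keys

# Simple and efficient Uvicorn mode
lightrag-server

# Or multiprocess Gunicorn + Uvicorn mode (production, not supported on Windows)
# lightrag-gunicorn --workers 4
```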
96 | 97 | Upon launching, the LightRAG Server will create a documents directory (default is `./inputs`) and a data directory (default is `./rag_storage`). This allows you to initiate multiple instances of LightRAG Server from different directories, with each instance configured to listen on a distinct network port. 98 | 99 | Here are some commonly used startup parameters: 100 | 101 | - `--host`: Server listening address (default: 0.0.0.0) 102 | - `--port`: Server listening port (default: 9621) 103 | - `--timeout`: LLM request timeout (default: 150 seconds) 104 | - `--log-level`: Logging level (default: INFO) 105 | - `--input-dir`: Specifying the directory to scan for documents (default: ./inputs) 106 | 107 | > The requirement for the .env file to be in the startup directory is intentionally designed this way. The purpose is to support users in launching multiple LightRAG instances simultaneously, allowing different .env files for different instances. 108 | 109 | > **After changing the .env file, you need to open a new terminal to make the new settings take effect.** This because the LightRAG Server will load the environment variables from .env into the system environment variables each time it starts, and LightRAG Server will prioritize the settings in the system environment variables. 110 | 111 | ### Auto scan on startup 112 | 113 | When starting any of the servers with the `--auto-scan-at-startup` parameter, the system will automatically: 114 | 115 | 1. Scan for new files in the input directory 116 | 2. Index new documents that aren't already in the database 117 | 3. Make all content immediately available for RAG queries 118 | 119 | > The `--input-dir` parameter specifies the input directory to scan. You can trigger the input directory scan from the Web UI. 120 | 121 | ### Multiple workers for Gunicorn + Uvicorn 122 | 123 | The LightRAG Server can operate in the `Gunicorn + Uvicorn` preload mode. Gunicorn's multiple worker (multiprocess) capability prevents document indexing tasks from blocking RAG queries. Using CPU-exhaustive document extraction tools, such as docling, can lead to the entire system being blocked in pure Uvicorn mode. 124 | 125 | Though LightRAG Server uses one worker to process the document indexing pipeline, with the async task support of Uvicorn, multiple files can be processed in parallel. The bottleneck of document indexing speed mainly lies with the LLM. If your LLM supports high concurrency, you can accelerate document indexing by increasing the concurrency level of the LLM. Below are several environment variables related to concurrent processing, along with their default values: 126 | 127 | ``` 128 | ### Number of worker processes, not greater than (2 x number_of_cores) + 1 129 | WORKERS=2 130 | ### Number of parallel files to process in one batch 131 | MAX_PARALLEL_INSERT=2 132 | ### Max concurrent requests to the LLM 133 | MAX_ASYNC=4 134 | ``` 135 | 136 | ### Install LightRAG as a Linux Service 137 | 138 | Create your service file `lightrag.service` from the sample file: `lightrag.service.example`. Modify the `WorkingDirectory` and `ExecStart` in the service file: 139 | 140 | ```text 141 | Description=LightRAG Ollama Service 142 | WorkingDirectory= 143 | ExecStart=/lightrag/api/lightrag-api 144 | ``` 145 | 146 | Modify your service startup script: `lightrag-api`. 
Change your Python virtual environment activation command as needed: 147 | 148 | ```shell 149 | #!/bin/bash 150 | 151 | # your python virtual environment activation 152 | source /home/netman/lightrag-xyj/venv/bin/activate 153 | # start lightrag api server 154 | lightrag-server 155 | ``` 156 | 157 | Install LightRAG service. If your system is Ubuntu, the following commands will work: 158 | 159 | ```shell 160 | sudo cp lightrag.service /etc/systemd/system/ 161 | sudo systemctl daemon-reload 162 | sudo systemctl start lightrag.service 163 | sudo systemctl status lightrag.service 164 | sudo systemctl enable lightrag.service 165 | ``` 166 | 167 | ## Ollama Emulation 168 | 169 | We provide Ollama-compatible interfaces for LightRAG, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat frontends supporting Ollama, such as Open WebUI, to access LightRAG easily. 170 | 171 | ### Connect Open WebUI to LightRAG 172 | 173 | After starting the lightrag-server, you can add an Ollama-type connection in the Open WebUI admin panel. And then a model named `lightrag:latest` will appear in Open WebUI's model management interface. Users can then send queries to LightRAG through the chat interface. You should install LightRAG as a service for this use case. 174 | 175 | Open WebUI uses an LLM to do the session title and session keyword generation task. So the Ollama chat completion API detects and forwards OpenWebUI session-related requests directly to the underlying LLM. Screenshot from Open WebUI: 176 | 177 | ![image-20250323194750379](./README.assets/image-20250323194750379.png) 178 | 179 | ### Choose Query mode in chat 180 | 181 | The default query mode is `hybrid` if you send a message (query) from the Ollama interface of LightRAG. You can select query mode by sending a message with a query prefix. 182 | 183 | A query prefix in the query string can determine which LightRAG query mode is used to generate the response for the query. The supported prefixes include: 184 | 185 | ``` 186 | /local 187 | /global 188 | /hybrid 189 | /naive 190 | /mix 191 | 192 | /bypass 193 | /context 194 | /localcontext 195 | /globalcontext 196 | /hybridcontext 197 | /naivecontext 198 | /mixcontext 199 | ``` 200 | 201 | For example, the chat message `/mix What's LightRAG?` will trigger a mix mode query for LightRAG. A chat message without a query prefix will trigger a hybrid mode query by default. 202 | 203 | `/bypass` is not a LightRAG query mode; it will tell the API Server to pass the query directly to the underlying LLM, including the chat history. So the user can use the LLM to answer questions based on the chat history. If you are using Open WebUI as a front end, you can just switch the model to a normal LLM instead of using the `/bypass` prefix. 204 | 205 | `/context` is also not a LightRAG query mode; it will tell LightRAG to return only the context information prepared for the LLM. You can check the context if it's what you want, or process the context by yourself. 206 | 207 | ## API Key and Authentication 208 | 209 | By default, the LightRAG Server can be accessed without any authentication. We can configure the server with an API Key or account credentials to secure it. 210 | 211 | * API Key: 212 | 213 | ``` 214 | LIGHTRAG_API_KEY=your-secure-api-key-here 215 | WHITELIST_PATHS=/health,/api/* 216 | ``` 217 | 218 | > Health check and Ollama emulation endpoints are excluded from API Key check by default. 
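With an API Key configured, requests to non-whitelisted endpoints must present the key. A hedged example is shown below; it assumes the key is passed in an `X-API-Key` header, so verify the header name against your server version before relying on it:

```bash
# Query the protected /query endpoint with the configured API Key (header name assumed)
curl -X POST "http://localhost:9621/query" \
     -H "Content-Type: application/json" \
     -H "X-API-Key: your-secure-api-key-here" \
     -d '{"query": "What is LightRAG?", "mode": "hybrid"}'
```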
219 | 220 | * Account credentials (the Web UI requires login before access can be granted): 221 | 222 | LightRAG API Server implements JWT-based authentication using the HS256 algorithm. To enable secure access control, the following environment variables are required: 223 | 224 | ```bash 225 | # For jwt auth 226 | AUTH_ACCOUNTS='admin:admin123,user1:pass456' 227 | TOKEN_SECRET='your-key' 228 | TOKEN_EXPIRE_HOURS=4 229 | ``` 230 | 231 | > Currently, only the configuration of an administrator account and password is supported. A comprehensive account system is yet to be developed and implemented. 232 | 233 | If Account credentials are not configured, the Web UI will access the system as a Guest. Therefore, even if only an API Key is configured, all APIs can still be accessed through the Guest account, which remains insecure. Hence, to safeguard the API, it is necessary to configure both authentication methods simultaneously. 234 | 235 | ## For Azure OpenAI Backend 236 | 237 | Azure OpenAI API can be created using the following commands in Azure CLI (you need to install Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)): 238 | 239 | ```bash 240 | # Change the resource group name, location, and OpenAI resource name as needed 241 | RESOURCE_GROUP_NAME=LightRAG 242 | LOCATION=swedencentral 243 | RESOURCE_NAME=LightRAG-OpenAI 244 | 245 | az login 246 | az group create --name $RESOURCE_GROUP_NAME --location $LOCATION 247 | az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location swedencentral 248 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard" 249 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard" 250 | az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint" 251 | az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME 252 | 253 | ``` 254 | 255 | The output of the last command will give you the endpoint and the key for the OpenAI API. You can use these values to set the environment variables in the `.env` file. 256 | 257 | ``` 258 | # Azure OpenAI Configuration in .env: 259 | LLM_BINDING=azure_openai 260 | LLM_BINDING_HOST=your-azure-endpoint 261 | LLM_MODEL=your-model-deployment-name 262 | LLM_BINDING_API_KEY=your-azure-api-key 263 | ### API version is optional, defaults to latest version 264 | AZURE_OPENAI_API_VERSION=2024-08-01-preview 265 | 266 | ### If using Azure OpenAI for embeddings 267 | EMBEDDING_BINDING=azure_openai 268 | EMBEDDING_MODEL=your-embedding-deployment-name 269 | ``` 270 | 271 | ## LightRAG Server Configuration in Detail 272 | 273 | The API Server can be configured in three ways (highest priority first): 274 | 275 | * Command line arguments 276 | * Environment variables or .env file 277 | * Config.ini (Only for storage configuration) 278 | 279 | Most of the configurations come with default settings; check out the details in the sample file: `.env.example`. 
Data storage configuration can also be set by config.ini. A sample file `config.ini.example` is provided for your convenience. 280 | 281 | ### LLM and Embedding Backend Supported 282 | 283 | LightRAG supports binding to various LLM/Embedding backends: 284 | 285 | * ollama 286 | * lollms 287 | * openai & openai compatible 288 | * azure_openai 289 | 290 | Use environment variables `LLM_BINDING` or CLI argument `--llm-binding` to select the LLM backend type. Use environment variables `EMBEDDING_BINDING` or CLI argument `--embedding-binding` to select the Embedding backend type. 291 | 292 | ### Entity Extraction Configuration 293 | * ENABLE_LLM_CACHE_FOR_EXTRACT: Enable LLM cache for entity extraction (default: true) 294 | 295 | It's very common to set `ENABLE_LLM_CACHE_FOR_EXTRACT` to true for a test environment to reduce the cost of LLM calls. 296 | 297 | ### Storage Types Supported 298 | 299 | LightRAG uses 4 types of storage for different purposes: 300 | 301 | * KV_STORAGE: llm response cache, text chunks, document information 302 | * VECTOR_STORAGE: entities vectors, relation vectors, chunks vectors 303 | * GRAPH_STORAGE: entity relation graph 304 | * DOC_STATUS_STORAGE: document indexing status 305 | 306 | Each storage type has several implementations: 307 | 308 | * KV_STORAGE supported implementations: 309 | 310 | ``` 311 | JsonKVStorage JsonFile (default) 312 | PGKVStorage Postgres 313 | RedisKVStorage Redis 314 | MongoKVStorage MongoDB 315 | ``` 316 | 317 | * GRAPH_STORAGE supported implementations: 318 | 319 | ``` 320 | NetworkXStorage NetworkX (default) 321 | Neo4JStorage Neo4J 322 | PGGraphStorage PostgreSQL with AGE plugin 323 | ``` 324 | 325 | > Testing has shown that Neo4J delivers superior performance in production environments compared to PostgreSQL with AGE plugin. 326 | 327 | * VECTOR_STORAGE supported implementations: 328 | 329 | ``` 330 | NanoVectorDBStorage NanoVector (default) 331 | PGVectorStorage Postgres 332 | MilvusVectorDBStorage Milvus 333 | ChromaVectorDBStorage Chroma 334 | FaissVectorDBStorage Faiss 335 | QdrantVectorDBStorage Qdrant 336 | MongoVectorDBStorage MongoDB 337 | ``` 338 | 339 | * DOC_STATUS_STORAGE: supported implementations: 340 | 341 | ``` 342 | JsonDocStatusStorage JsonFile (default) 343 | PGDocStatusStorage Postgres 344 | MongoDocStatusStorage MongoDB 345 | ``` 346 | 347 | ### How to Select Storage Implementation 348 | 349 | You can select storage implementation by environment variables. You can set the following environment variables to a specific storage implementation name before the first start of the API Server: 350 | 351 | ``` 352 | LIGHTRAG_KV_STORAGE=PGKVStorage 353 | LIGHTRAG_VECTOR_STORAGE=PGVectorStorage 354 | LIGHTRAG_GRAPH_STORAGE=PGGraphStorage 355 | LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage 356 | ``` 357 | 358 | You cannot change storage implementation selection after adding documents to LightRAG. Data migration from one storage implementation to another is not supported yet. For further information, please read the sample env file or config.ini file. 
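If you prefer `config.ini` for the storage settings, a minimal sketch for a PostgreSQL-backed setup is shown below. The section and key names mirror the PostgreSQL guide later in this document; treat them as an illustration and check `config.ini.example` for the authoritative list:

```bash
# Write a config.ini in your startup directory (all values are placeholders)
cat > config.ini <<'EOF'
[postgres]
host = localhost
port = 5432
user = your_role_name
password = your_password
database = your_database
workspace = default
EOF
```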
359 | 360 | ### LightRAG API Server Command Line Options 361 | 362 | | Parameter | Default | Description | 363 | | --------------------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------- | 364 | | --host | 0.0.0.0 | Server host | 365 | | --port | 9621 | Server port | 366 | | --working-dir | ./rag_storage | Working directory for RAG storage | 367 | | --input-dir | ./inputs | Directory containing input documents | 368 | | --max-async | 4 | Maximum number of async operations | 369 | | --max-tokens | 32768 | Maximum token size | 370 | | --timeout | 150 | Timeout in seconds. None for infinite timeout (not recommended) | 371 | | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | 372 | | --verbose | - | Verbose debug output (True, False) | 373 | | --key | None | API key for authentication. Protects the LightRAG server against unauthorized access | 374 | | --ssl | False | Enable HTTPS | 375 | | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) | 376 | | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) | 377 | | --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode. | 378 | | --cosine-threshold | 0.4 | The cosine threshold for nodes and relation retrieval, works with top-k to control the retrieval of nodes and relations. | 379 | | --llm-binding | ollama | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai) | 380 | | --embedding-binding | ollama | Embedding binding type (lollms, ollama, openai, azure_openai) | 381 | | --auto-scan-at-startup| - | Scan input directory for new files and start indexing | 382 | 383 | ### .env Examples 384 | 385 | ```bash 386 | ### Server Configuration 387 | # HOST=0.0.0.0 388 | PORT=9621 389 | WORKERS=2 390 | 391 | ### Settings for document indexing 392 | ENABLE_LLM_CACHE_FOR_EXTRACT=true 393 | SUMMARY_LANGUAGE=Chinese 394 | MAX_PARALLEL_INSERT=2 395 | 396 | ### LLM Configuration (Use valid host. For local services installed with docker, you can use host.docker.internal) 397 | TIMEOUT=200 398 | TEMPERATURE=0.0 399 | MAX_ASYNC=4 400 | MAX_TOKENS=32768 401 | 402 | LLM_BINDING=openai 403 | LLM_MODEL=gpt-4o-mini 404 | LLM_BINDING_HOST=https://api.openai.com/v1 405 | LLM_BINDING_API_KEY=your-api-key 406 | 407 | ### Embedding Configuration (Use valid host. For local services installed with docker, you can use host.docker.internal) 408 | EMBEDDING_MODEL=bge-m3:latest 409 | EMBEDDING_DIM=1024 410 | EMBEDDING_BINDING=ollama 411 | EMBEDDING_BINDING_HOST=http://localhost:11434 412 | 413 | ### For JWT Auth 414 | # AUTH_ACCOUNTS='admin:admin123,user1:pass456' 415 | # TOKEN_SECRET=your-key-for-LightRAG-API-Server-xxx 416 | # TOKEN_EXPIRE_HOURS=48 417 | 418 | # LIGHTRAG_API_KEY=your-secure-api-key-here-123 419 | # WHITELIST_PATHS=/api/* 420 | # WHITELIST_PATHS=/health,/api/* 421 | 422 | ``` 423 | 424 | 425 | ## API Endpoints 426 | 427 | All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality. When the API Server is running, visit: 428 | 429 | - Swagger UI: http://localhost:9621/docs 430 | - ReDoc: http://localhost:9621/redoc 431 | 432 | You can test the API endpoints using the provided curl commands or through the Swagger UI interface. Make sure to: 433 | 434 | 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI) 435 | 2. 
Start the RAG server 436 | 3. Upload some documents using the document management endpoints 437 | 4. Query the system using the query endpoints 438 | 5. Trigger document scan if new files are put into the inputs directory 439 | 440 | ### Query Endpoints: 441 | 442 | #### POST /query 443 | Query the RAG system with options for different search modes. 444 | 445 | ```bash 446 | curl -X POST "http://localhost:9621/query" \ 447 | -H "Content-Type: application/json" \ 448 | -d '{"query": "Your question here", "mode": "hybrid"}' 449 | ``` 450 | 451 | #### POST /query/stream 452 | Stream responses from the RAG system. 453 | 454 | ```bash 455 | curl -X POST "http://localhost:9621/query/stream" \ 456 | -H "Content-Type: application/json" \ 457 | -d '{"query": "Your question here", "mode": "hybrid"}' 458 | ``` 459 | 460 | ### Document Management Endpoints: 461 | 462 | #### POST /documents/text 463 | Insert text directly into the RAG system. 464 | 465 | ```bash 466 | curl -X POST "http://localhost:9621/documents/text" \ 467 | -H "Content-Type: application/json" \ 468 | -d '{"text": "Your text content here", "description": "Optional description"}' 469 | ``` 470 | 471 | #### POST /documents/file 472 | Upload a single file to the RAG system. 473 | 474 | ```bash 475 | curl -X POST "http://localhost:9621/documents/file" \ 476 | -F "file=@/path/to/your/document.txt" \ 477 | -F "description=Optional description" 478 | ``` 479 | 480 | #### POST /documents/batch 481 | Upload multiple files at once. 482 | 483 | ```bash 484 | curl -X POST "http://localhost:9621/documents/batch" \ 485 | -F "files=@/path/to/doc1.txt" \ 486 | -F "files=@/path/to/doc2.txt" 487 | ``` 488 | 489 | #### POST /documents/scan 490 | 491 | Trigger document scan for new files in the input directory. 492 | 493 | ```bash 494 | curl -X POST "http://localhost:9621/documents/scan" --max-time 1800 495 | ``` 496 | 497 | > Adjust max-time according to the estimated indexing time for all new files. 498 | 499 | #### DELETE /documents 500 | 501 | Clear all documents from the RAG system. 502 | 503 | ```bash 504 | curl -X DELETE "http://localhost:9621/documents" 505 | ``` 506 | 507 | ### Ollama Emulation Endpoints: 508 | 509 | #### GET /api/version 510 | 511 | Get Ollama version information. 512 | 513 | ```bash 514 | curl http://localhost:9621/api/version 515 | ``` 516 | 517 | #### GET /api/tags 518 | 519 | Get available Ollama models. 520 | 521 | ```bash 522 | curl http://localhost:9621/api/tags 523 | ``` 524 | 525 | #### POST /api/chat 526 | 527 | Handle chat completion requests. Routes user queries through LightRAG by selecting query mode based on query prefix. Detects and forwards OpenWebUI session-related requests (for metadata generation task) directly to the underlying LLM. 528 | 529 | ```shell 530 | curl -N -X POST http://localhost:9621/api/chat -H "Content-Type: application/json" -d \ 531 | '{"model":"lightrag:latest","messages":[{"role":"user","content":"猪八戒是谁"}],"stream":true}' 532 | ``` 533 | 534 | > For more information about Ollama API, please visit: [Ollama API documentation](https://github.com/ollama/ollama/blob/main/docs/api.md) 535 | 536 | #### POST /api/generate 537 | 538 | Handle generate completion requests. For compatibility purposes, the request is not processed by LightRAG, and will be handled by the underlying LLM model. 539 | 540 | ### Utility Endpoints: 541 | 542 | #### GET /health 543 | Check server health and configuration. 
544 | 545 | ```bash 546 | curl "http://localhost:9621/health" 547 | ``` 548 | -------------------------------------------------------------------------------- /lightrag/api/docs/LightRagWithPostGRESQL.md: -------------------------------------------------------------------------------- 1 | # Installing and Using PostgreSQL with LightRAG 2 | 3 | This guide provides step-by-step instructions on setting up PostgreSQL for use with LightRAG, a tool designed to enhance large language model (LLM) performance using retrieval-augmented generation techniques. 4 | 5 | ## Prerequisites 6 | 7 | Before beginning this setup, ensure that you have administrative access to your server or local machine and can install software packages. 8 | 9 | ### 1. Install PostgreSQL 10 | 11 | First, update your package list and install PostgreSQL: 12 | 13 | ```bash 14 | sudo apt update 15 | sudo apt install postgresql postgresql-contrib 16 | ``` 17 | 18 | Start the PostgreSQL service if it isn’t already running: 19 | 20 | ```bash 21 | sudo systemctl start postgresql 22 | ``` 23 | 24 | Ensure that PostgreSQL starts on boot: 25 | 26 | ```bash 27 | sudo systemctl enable postgresql 28 | ``` 29 | 30 | ### 2. Set a Password for Your Postgres Role 31 | 32 | By default, PostgreSQL creates a user named `postgres`. You'll need to set a password for this role or create another role with a password. 33 | 34 | To set a password for the `postgres` user: 35 | 36 | ```bash 37 | sudo -u postgres psql 38 | ``` 39 | 40 | Inside the PostgreSQL shell, run: 41 | 42 | ```sql 43 | ALTER USER postgres WITH PASSWORD 'your_secure_password'; 44 | \q 45 | ``` 46 | 47 | Alternatively, to create a new role with a password: 48 | 49 | ```bash 50 | sudo -u postgres createuser --interactive 51 | ``` 52 | 53 | You'll be prompted for the name of the new role and whether it should have superuser permissions. Then set a password: 54 | 55 | ```sql 56 | ALTER USER your_new_role WITH PASSWORD 'your_secure_password'; 57 | \q 58 | ``` 59 | 60 | ### 3. Install PGVector and Age Extensions 61 | 62 | Install PGVector: 63 | ```bash 64 | sudo apt install postgresql-server-dev-all 65 | cd /tmp 66 | git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git 67 | cd pgvector 68 | make 69 | sudo make install 70 | ``` 71 | Install age: 72 | ```bash 73 | sudo apt-get install build-essential libpq-dev 74 | cd /tmp 75 | git clone https://github.com/apache/age.git 76 | cd age 77 | make 78 | sudo make install 79 | ``` 80 | 81 | ### 4. Create a Database for LightRAG 82 | 83 | Create an empty database to store your data: 84 | 85 | ```bash 86 | sudo -u postgres createdb your_database 87 | ``` 88 | 89 | ### 5. Activate PGVector Extension in the Database 90 | 91 | Switch to the newly created database and enable the `pgvector` extension: 92 | 93 | ```bash 94 | sudo -u postgres psql -d your_database 95 | ``` 96 | 97 | Inside the PostgreSQL shell, run: 98 | 99 | ```sql 100 | CREATE EXTENSION vector; 101 | ``` 102 | 103 | Verify installation by checking the extension version within this specific database: 104 | 105 | ```sql 106 | SELECT extversion FROM pg_extension WHERE extname = 'vector'; 107 | \q 108 | ``` 109 | 110 | ### 6. Install LightRAG with API Access 111 | 112 | Install LightRAG using pip, targeting the API package for server-side use: 113 | 114 | ```bash 115 | pip install "lightrag-hku[api]" 116 | ``` 117 | 118 | ### 7. 
Configure `config.ini` 119 | 120 | Create a configuration file to specify PostgreSQL connection details and other settings: 121 | 122 | In your project directory, create a `config.ini` file with the following content: 123 | 124 | ```ini 125 | [postgres] 126 | host = localhost 127 | port = 5432 128 | user = your_role_name 129 | password = your_password 130 | database = your_database 131 | workspace = default 132 | ``` 133 | 134 | Replace placeholders like `your_role_name`, `your_password`, and `your_database` with actual values. 135 | 136 | ### 8. Run LightRAG Server 137 | 138 | Start the LightRAG server using specified options: 139 | 140 | ```bash 141 | lightrag-server --port 9621 --key sk-somepassword --kv-storage PGKVStorage --graph-storage PGGraphStorage --vector-storage PGVectorStorage --doc-status-storage PGDocStatusStorage 142 | ``` 143 | 144 | Replace the `port` number with your desired port number (default is 9621) and `your-secret-key` with a secure key. 145 | 146 | ## Conclusion 147 | 148 | With PostgreSQL set up to work with LightRAG, you can now leverage vector storage and retrieval-augmented capabilities for enhanced language model operations. Adjust configurations as needed based on your specific environment and use case requirements. 149 | -------------------------------------------------------------------------------- /lightrag/llm/Readme.md: -------------------------------------------------------------------------------- 1 | 2 | 1. **LlamaIndex** (`llm/llama_index.py`): 3 | - Provides integration with OpenAI and other providers through LlamaIndex 4 | - Supports both direct API access and proxy services like LiteLLM 5 | - Handles embeddings and completions with consistent interfaces 6 | - See example implementations: 7 | - [Direct OpenAI Usage](../../examples/lightrag_llamaindex_direct_demo.py) 8 | - [LiteLLM Proxy Usage](../../examples/lightrag_llamaindex_litellm_demo.py) 9 | 10 |
11 | Using LlamaIndex 12 | 13 | LightRAG supports LlamaIndex for embeddings and completions in two ways: direct OpenAI usage or through LiteLLM proxy. 14 | 15 | ### Setup 16 | 17 | First, install the required dependencies: 18 | ```bash 19 | pip install llama-index-llms-litellm llama-index-embeddings-litellm 20 | ``` 21 | 22 | ### Standard OpenAI Usage 23 | 24 | ```python 25 | from lightrag import LightRAG 26 | from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed 27 | from llama_index.embeddings.openai import OpenAIEmbedding 28 | from llama_index.llms.openai import OpenAI 29 | from lightrag.utils import EmbeddingFunc 30 | 31 | # Initialize with direct OpenAI access 32 | async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs): 33 | try: 34 | # Initialize OpenAI if not in kwargs 35 | if 'llm_instance' not in kwargs: 36 | llm_instance = OpenAI( 37 | model="gpt-4", 38 | api_key="your-openai-key", 39 | temperature=0.7, 40 | ) 41 | kwargs['llm_instance'] = llm_instance 42 | 43 | response = await llama_index_complete_if_cache( 44 | kwargs['llm_instance'], 45 | prompt, 46 | system_prompt=system_prompt, 47 | history_messages=history_messages, 48 | **kwargs, 49 | ) 50 | return response 51 | except Exception as e: 52 | logger.error(f"LLM request failed: {str(e)}") 53 | raise 54 | 55 | # Initialize LightRAG with OpenAI 56 | rag = LightRAG( 57 | working_dir="your/path", 58 | llm_model_func=llm_model_func, 59 | embedding_func=EmbeddingFunc( 60 | embedding_dim=1536, 61 | max_token_size=8192, 62 | func=lambda texts: llama_index_embed( 63 | texts, 64 | embed_model=OpenAIEmbedding( 65 | model="text-embedding-3-large", 66 | api_key="your-openai-key" 67 | ) 68 | ), 69 | ), 70 | ) 71 | ``` 72 | 73 | ### Using LiteLLM Proxy 74 | 75 | 1. Use any LLM provider through LiteLLM 76 | 2. Leverage LlamaIndex's embedding and completion capabilities 77 | 3. 
Maintain consistent configuration across services 78 | 79 | ```python 80 | from lightrag import LightRAG 81 | from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed 82 | from llama_index.llms.litellm import LiteLLM 83 | from llama_index.embeddings.litellm import LiteLLMEmbedding 84 | from lightrag.utils import EmbeddingFunc 85 | 86 | # Initialize with LiteLLM proxy 87 | async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs): 88 | try: 89 | # Initialize LiteLLM if not in kwargs 90 | if 'llm_instance' not in kwargs: 91 | llm_instance = LiteLLM( 92 | model=f"openai/{settings.LLM_MODEL}", # Format: "provider/model_name" 93 | api_base=settings.LITELLM_URL, 94 | api_key=settings.LITELLM_KEY, 95 | temperature=0.7, 96 | ) 97 | kwargs['llm_instance'] = llm_instance 98 | 99 | response = await llama_index_complete_if_cache( 100 | kwargs['llm_instance'], 101 | prompt, 102 | system_prompt=system_prompt, 103 | history_messages=history_messages, 104 | **kwargs, 105 | ) 106 | return response 107 | except Exception as e: 108 | logger.error(f"LLM request failed: {str(e)}") 109 | raise 110 | 111 | # Initialize LightRAG with LiteLLM 112 | rag = LightRAG( 113 | working_dir="your/path", 114 | llm_model_func=llm_model_func, 115 | embedding_func=EmbeddingFunc( 116 | embedding_dim=1536, 117 | max_token_size=8192, 118 | func=lambda texts: llama_index_embed( 119 | texts, 120 | embed_model=LiteLLMEmbedding( 121 | model_name=f"openai/{settings.EMBEDDING_MODEL}", 122 | api_base=settings.LITELLM_URL, 123 | api_key=settings.LITELLM_KEY, 124 | ) 125 | ), 126 | ), 127 | ) 128 | ``` 129 | 130 | ### Environment Variables 131 | 132 | For OpenAI direct usage: 133 | ```bash 134 | OPENAI_API_KEY=your-openai-key 135 | ``` 136 | 137 | For LiteLLM proxy: 138 | ```bash 139 | # LiteLLM Configuration 140 | LITELLM_URL=http://litellm:4000 141 | LITELLM_KEY=your-litellm-key 142 | 143 | # Model Configuration 144 | LLM_MODEL=gpt-4 145 | EMBEDDING_MODEL=text-embedding-3-large 146 | EMBEDDING_MAX_TOKEN_SIZE=8192 147 | ``` 148 | 149 | ### Key Differences 150 | 1. **Direct OpenAI**: 151 | - Simpler setup 152 | - Direct API access 153 | - Requires OpenAI API key 154 | 155 | 2. **LiteLLM Proxy**: 156 | - Model provider agnostic 157 | - Centralized API key management 158 | - Support for multiple providers 159 | - Better cost control and monitoring 160 | 161 |
162 | -------------------------------------------------------------------------------- /lightrag/tools/lightrag_visualizer/README-zh.md: -------------------------------------------------------------------------------- 1 | # 3D GraphML Viewer 2 | 3 | 一个基于 Dear ImGui 和 ModernGL 的交互式 3D 图可视化工具。 4 | 5 | ## 功能特点 6 | 7 | - **3D 交互式可视化**: 使用 ModernGL 实现高性能的 3D 图形渲染 8 | - **多种布局算法**: 支持多种图布局方式 9 | - Spring 布局 10 | - Circular 布局 11 | - Shell 布局 12 | - Random 布局 13 | - **社区检测**: 支持图社区结构的自动检测和可视化 14 | - **交互控制**: 15 | - WASD + QE 键控制相机移动 16 | - 鼠标右键拖拽控制视角 17 | - 节点选择和高亮 18 | - 可调节节点大小和边宽度 19 | - 可控制标签显示 20 | - 可在节点的Connections间快速跳转 21 | - **社区检测**: 支持图社区结构的自动检测和可视化 22 | - **交互控制**: 23 | - WASD + QE 键控制相机移动 24 | - 鼠标右键拖拽控制视角 25 | - 节点选择和高亮 26 | - 可调节节点大小和边宽度 27 | - 可控制标签显示 28 | 29 | ## 技术栈 30 | 31 | - **imgui_bundle**: 用户界面 32 | - **ModernGL**: OpenGL 图形渲染 33 | - **NetworkX**: 图数据结构和算法 34 | - **NumPy**: 数值计算 35 | - **community**: 社区检测 36 | 37 | ## 使用方法 38 | 39 | 1. **启动程序**: 40 | ```bash 41 | pip install lightrag-hku[tools] 42 | lightrag-viewer 43 | ``` 44 | 45 | 2. **加载字体**: 46 | - 将中文字体文件 `font.ttf` 放置在 `assets` 目录下 47 | - 或者修改 `CUSTOM_FONT` 常量来使用其他字体文件 48 | 49 | 3. **加载图文件**: 50 | - 点击界面上的 "Load GraphML" 按钮 51 | - 选择 GraphML 格式的图文件 52 | 53 | 4. **交互控制**: 54 | - **相机移动**: 55 | - W: 前进 56 | - S: 后退 57 | - A: 左移 58 | - D: 右移 59 | - Q: 上升 60 | - E: 下降 61 | - **视角控制**: 62 | - 按住鼠标右键拖动来旋转视角 63 | - **节点交互**: 64 | - 鼠标悬停可高亮节点 65 | - 点击可选中节点 66 | 67 | 5. **可视化设置**: 68 | - 可通过 UI 控制面板调整: 69 | - 布局类型 70 | - 节点大小 71 | - 边的宽度 72 | - 标签显示 73 | - 标签大小 74 | - 背景颜色 75 | 76 | ## 自定义设置 77 | 78 | - **节点缩放**: 通过 `node_scale` 参数调整节点大小 79 | - **边宽度**: 通过 `edge_width` 参数调整边的宽度 80 | - **标签显示**: 可通过 `show_labels` 开关标签显示 81 | - **标签大小**: 使用 `label_size` 调整标签大小 82 | - **标签颜色**: 通过 `label_color` 设置标签颜色 83 | - **视距控制**: 使用 `label_culling_distance` 控制标签显示的最大距离 84 | 85 | ## 性能优化 86 | 87 | - 使用 ModernGL 进行高效的图形渲染 88 | - 视距裁剪优化标签显示 89 | - 社区检测算法优化大规模图的可视化效果 90 | 91 | ## 系统要求 92 | 93 | - Python 3.10+ 94 | - OpenGL 3.3+ 兼容的显卡 95 | - 支持的操作系统:Windows/Linux/MacOS 96 | -------------------------------------------------------------------------------- /lightrag/tools/lightrag_visualizer/README.md: -------------------------------------------------------------------------------- 1 | # LightRAG 3D Graph Viewer 2 | 3 | An interactive 3D graph visualization tool included in the LightRAG package for visualizing and analyzing RAG (Retrieval-Augmented Generation) graphs and other graph structures. 
4 | 5 | ![image](https://github.com/user-attachments/assets/b0d86184-99fc-468c-96ed-c611f14292bf) 6 | 7 | ## Installation 8 | 9 | ### Quick Install 10 | ```bash 11 | pip install lightrag-hku[tools] # Install with visualization tool only 12 | # or 13 | pip install lightrag-hku[api,tools] # Install with both API and visualization tools 14 | ``` 15 | 16 | ## Launch the Viewer 17 | ```bash 18 | lightrag-viewer 19 | ``` 20 | 21 | ## Features 22 | 23 | - **3D Interactive Visualization**: High-performance 3D graphics rendering using ModernGL 24 | - **Multiple Layout Algorithms**: Support for various graph layouts 25 | - Spring layout 26 | - Circular layout 27 | - Shell layout 28 | - Random layout 29 | - **Community Detection**: Automatic detection and visualization of graph community structures 30 | - **Interactive Controls**: 31 | - WASD + QE keys for camera movement 32 | - Right mouse drag for view angle control 33 | - Node selection and highlighting 34 | - Adjustable node size and edge width 35 | - Configurable label display 36 | - Quick navigation between node connections 37 | 38 | ## Tech Stack 39 | 40 | - **imgui_bundle**: User interface 41 | - **ModernGL**: OpenGL graphics rendering 42 | - **NetworkX**: Graph data structures and algorithms 43 | - **NumPy**: Numerical computations 44 | - **community**: Community detection 45 | 46 | ## Interactive Controls 47 | 48 | ### Camera Movement 49 | - W: Move forward 50 | - S: Move backward 51 | - A: Move left 52 | - D: Move right 53 | - Q: Move up 54 | - E: Move down 55 | 56 | ### View Control 57 | - Hold right mouse button and drag to rotate view 58 | 59 | ### Node Interaction 60 | - Hover mouse to highlight nodes 61 | - Click to select nodes 62 | 63 | ## Visualization Settings 64 | 65 | Adjustable via UI control panel: 66 | - Layout type 67 | - Node size 68 | - Edge width 69 | - Label visibility 70 | - Label size 71 | - Background color 72 | 73 | ## Customization Options 74 | 75 | - **Node Scaling**: Adjust node size via `node_scale` parameter 76 | - **Edge Width**: Modify edge width using `edge_width` parameter 77 | - **Label Display**: Toggle label visibility with `show_labels` 78 | - **Label Size**: Adjust label size using `label_size` 79 | - **Label Color**: Set label color through `label_color` 80 | - **View Distance**: Control maximum label display distance with `label_culling_distance` 81 | 82 | ## System Requirements 83 | 84 | - Python 3.9+ 85 | - Graphics card with OpenGL 3.3+ support 86 | - Supported Operating Systems: Windows/Linux/MacOS 87 | 88 | ## Troubleshooting 89 | 90 | ### Common Issues 91 | 92 | 1. **Command Not Found** 93 | ```bash 94 | # Make sure you installed with the 'tools' option 95 | pip install lightrag-hku[tools] 96 | 97 | # Verify installation 98 | pip list | grep lightrag-hku 99 | ``` 100 | 101 | 2. **ModernGL Initialization Failed** 102 | ```bash 103 | # Check OpenGL version 104 | glxinfo | grep "OpenGL version" 105 | 106 | # Update graphics drivers if needed 107 | ``` 108 | 109 | 3. 
**Font Loading Issues** 110 | - The required fonts are included in the package 111 | - If issues persist, check your graphics drivers 112 | 113 | ## Usage with LightRAG 114 | 115 | The viewer is particularly useful for: 116 | - Visualizing RAG knowledge graphs 117 | - Analyzing document relationships 118 | - Exploring semantic connections 119 | - Debugging retrieval patterns 120 | 121 | ## Performance Optimizations 122 | 123 | - Efficient graphics rendering using ModernGL 124 | - View distance culling for label display optimization 125 | - Community detection algorithms for optimized visualization of large-scale graphs 126 | 127 | ## Support 128 | 129 | - GitHub Issues: [LightRAG Repository](https://github.com/HKUDS/LightRAG) 130 | - Documentation: [LightRAG Docs](https://URL-to-docs) 131 | 132 | ## License 133 | 134 | This tool is part of LightRAG and is distributed under the MIT License. See `LICENSE` for more information. 135 | 136 | Note: This visualization tool is an optional component of the LightRAG package. Install with the [tools] option to access the viewer functionality. 137 | -------------------------------------------------------------------------------- /lightrag_webui/README.md: -------------------------------------------------------------------------------- 1 | # LightRAG WebUI 2 | 3 | LightRAG WebUI is a React-based web interface for interacting with the LightRAG system. It provides a user-friendly interface for querying, managing, and exploring LightRAG's functionalities. 4 | 5 | ## Installation 6 | 7 | 1. **Install Bun:** 8 | 9 | If you haven't already installed Bun, follow the official documentation: [https://bun.sh/docs/installation](https://bun.sh/docs/installation) 10 | 11 | 2. **Install Dependencies:** 12 | 13 | In the `lightrag_webui` directory, run the following command to install project dependencies: 14 | 15 | ```bash 16 | bun install --frozen-lockfile 17 | ``` 18 | 19 | 3. **Build the Project:** 20 | 21 | Run the following command to build the project: 22 | 23 | ```bash 24 | bun run build --emptyOutDir 25 | ``` 26 | 27 | This command will bundle the project and output the built files to the `lightrag/api/webui` directory. 28 | 29 | ## Development 30 | 31 | - **Start the Development Server:** 32 | 33 | If you want to run the WebUI in development mode, use the following command: 34 | 35 | ```bash 36 | bun run dev 37 | ``` 38 | 39 | ## Script Commands 40 | 41 | The following are some commonly used script commands defined in `package.json`: 42 | 43 | - `bun install`: Installs project dependencies. 44 | - `bun run dev`: Starts the development server. 45 | - `bun run build`: Builds the project. 46 | - `bun run lint`: Runs the linter. 47 | --------------------------------------------------------------------------------