├── .github └── pull_request_template.md ├── README-zh.md ├── README.md ├── SECURITY.md ├── docs ├── Algorithm.md ├── DockerDeployment.md ├── LightRAG_concurrent_explain.md └── rerank_integration.md ├── k8s-deploy ├── README-zh.md ├── README.md └── databases │ └── README.md ├── lightrag ├── api │ ├── README-zh.md │ └── README.md ├── llm │ └── Readme.md └── tools │ └── lightrag_visualizer │ ├── README-zh.md │ └── README.md ├── lightrag_webui └── README.md └── paging.md /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | 10 | 11 | ## Description 12 | 13 | [Briefly describe the changes made in this pull request.] 14 | 15 | ## Related Issues 16 | 17 | [Reference any related issues or tasks addressed by this pull request.] 18 | 19 | ## Changes Made 20 | 21 | [List the specific changes made in this pull request.] 22 | 23 | ## Checklist 24 | 25 | - [ ] Changes tested locally 26 | - [ ] Code reviewed 27 | - [ ] Documentation updated (if necessary) 28 | - [ ] Unit tests added (if applicable) 29 | 30 | ## Additional Notes 31 | 32 | [Add any additional notes or context for the reviewer(s).] 33 | -------------------------------------------------------------------------------- /README-zh.md: -------------------------------------------------------------------------------- 1 |
2 | 3 |
4 | LightRAG Logo 5 |
6 | 7 | # 🚀 LightRAG: Simple and Fast Retrieval-Augmented Generation 8 | 9 |
10 | HKUDS%2FLightRAG | Trendshift 11 |
12 | 13 |
14 |
15 |
16 | 17 |
18 |
19 |

20 | 21 | 22 | 23 |

24 |

25 | 26 | 27 |

28 |

29 | 30 | 31 |

32 |

33 | 34 | 35 |

36 |
37 |
38 | 39 |
40 | 41 |
42 | 43 |
44 | 45 |
46 | LightRAG Diagram 47 |
48 | 49 | --- 50 | 51 | ## 🎉 新闻 52 | 53 | - [X] [2025.06.05]🎯📢LightRAG现已集成RAG-Anything,支持全面的多模态文档解析与RAG能力(PDF、图片、Office文档、表格、公式等)。详见下方[多模态处理模块](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#多模态文档处理rag-anything集成)。 54 | - [X] [2025.03.18]🎯📢LightRAG现已支持引文功能。 55 | - [X] [2025.02.05]🎯📢我们团队发布了[VideoRAG](https://github.com/HKUDS/VideoRAG),用于理解超长上下文视频。 56 | - [X] [2025.01.13]🎯📢我们团队发布了[MiniRAG](https://github.com/HKUDS/MiniRAG),使用小型模型简化RAG。 57 | - [X] [2025.01.06]🎯📢现在您可以[使用PostgreSQL进行存储](#using-postgresql-for-storage)。 58 | - [X] [2024.12.31]🎯📢LightRAG现在支持[通过文档ID删除](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete)。 59 | - [X] [2024.11.25]🎯📢LightRAG现在支持无缝集成[自定义知识图谱](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#insert-custom-kg),使用户能够用自己的领域专业知识增强系统。 60 | - [X] [2024.11.19]🎯📢LightRAG的综合指南现已在[LearnOpenCV](https://learnopencv.com/lightrag)上发布。非常感谢博客作者。 61 | - [X] [2024.11.11]🎯📢LightRAG现在支持[通过实体名称删除实体](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete)。 62 | - [X] [2024.11.09]🎯📢推出[LightRAG Gui](https://lightrag-gui.streamlit.app),允许您插入、查询、可视化和下载LightRAG知识。 63 | - [X] [2024.11.04]🎯📢现在您可以[使用Neo4J进行存储](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage)。 64 | - [X] [2024.10.29]🎯📢LightRAG现在通过`textract`支持多种文件类型,包括PDF、DOC、PPT和CSV。 65 | - [X] [2024.10.20]🎯📢我们为LightRAG添加了一个新功能:图形可视化。 66 | - [X] [2024.10.18]🎯📢我们添加了[LightRAG介绍视频](https://youtu.be/oageL-1I0GE)的链接。感谢作者! 67 | - [X] [2024.10.17]🎯📢我们创建了一个[Discord频道](https://discord.gg/yF2MmDJyGJ)!欢迎加入分享和讨论!🎉🎉 68 | - [X] [2024.10.16]🎯📢LightRAG现在支持[Ollama模型](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)! 69 | - [X] [2024.10.15]🎯📢LightRAG现在支持[Hugging Face模型](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#quick-start)! 70 | 71 |
72 | 73 | 算法流程图 74 | 75 | 76 | ![LightRAG索引流程图](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-VectorDB-Json-KV-Store-Indexing-Flowchart-scaled.jpg) 77 | *图1:LightRAG索引流程图 - 图片来源:[Source](https://learnopencv.com/lightrag/)* 78 | ![LightRAG检索和查询流程图](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-Querying-Flowchart-Dual-Level-Retrieval-Generation-Knowledge-Graphs-scaled.jpg) 79 | *图2:LightRAG检索和查询流程图 - 图片来源:[Source](https://learnopencv.com/lightrag/)* 80 | 81 |
## 安装

### 安装LightRAG服务器

LightRAG服务器旨在提供Web UI和API支持。Web UI便于文档索引、知识图谱探索和简单的RAG查询界面。LightRAG服务器还提供兼容Ollama的接口,旨在将LightRAG模拟为Ollama聊天模型,使得AI聊天机器人(如Open WebUI)可以轻松访问LightRAG。

* 从PyPI安装

```bash
pip install "lightrag-hku[api]"
cp env.example .env
lightrag-server
```

* 从源代码安装

```bash
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
# 如有必要,创建Python虚拟环境
# 以可编辑模式安装并支持API
pip install -e ".[api]"
cp env.example .env
lightrag-server
```

* 使用 Docker Compose 启动 LightRAG 服务器

```bash
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
cp env.example .env
# 在 .env 中修改LLM和Embedding配置
docker compose up
```

> 在此获取LightRAG docker镜像历史版本:[LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)

### 安装LightRAG Core

* 从源代码安装(推荐)

```bash
cd LightRAG
pip install -e .
```

* 从PyPI安装

```bash
pip install lightrag-hku
```

## 快速开始

### LightRAG的LLM及配套技术栈要求

LightRAG对大型语言模型(LLM)的能力要求远高于传统RAG,因为它需要LLM执行文档中的实体关系抽取任务。配置合适的Embedding和Reranker模型对提高查询表现也至关重要。

- **LLM选型**:
  - 推荐选用参数量至少为32B的LLM。
  - 上下文长度至少为32KB,推荐达到64KB。
- **Embedding模型**:
  - 高性能的Embedding模型对RAG至关重要。
  - 推荐使用主流的多语言Embedding模型,例如:BAAI/bge-m3 和 text-embedding-3-large。
  - **重要提示**:在文档索引前必须确定使用的Embedding模型,且在文档查询阶段必须沿用与索引阶段相同的模型。
- **Reranker模型配置**:
  - 配置Reranker模型能够显著提升LightRAG的检索效果。
  - 启用Reranker模型后,推荐将"mix模式"设为默认查询模式。
  - 推荐选用主流的Reranker模型,例如:BAAI/bge-reranker-v2-m3 或 Jina 等服务商提供的模型。

### 使用LightRAG服务器

**有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。**

### 使用LightRAG Core

LightRAG核心功能的示例代码请参见`examples`目录。您还可参照[视频](https://www.youtube.com/watch?v=g21royNJ4fw)完成环境配置。若已持有OpenAI API密钥,可以通过以下命令运行演示代码:

```bash
### you should run the demo code with project folder
cd LightRAG
### provide your API-KEY for OpenAI
export OPENAI_API_KEY="sk-...your_openai_key..."
### download the demo document of "A Christmas Carol" by Charles Dickens
curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt
### run the demo code
python examples/lightrag_openai_demo.py
```

如需流式响应示例的实现代码,请参阅 `examples/lightrag_openai_compatible_demo.py`。运行前,请确保根据需求修改示例代码中的LLM及嵌入模型配置。

**注意1**:不同的演示程序可能使用不同的Embedding模型。更换Embedding模型时需要清空数据目录(`./dickens`),否则程序执行会出错。如果你想保留LLM缓存,可以在清空数据目录时保留`kv_store_llm_response_cache.json`文件。

**注意2**:官方支持的示例代码仅为 `lightrag_openai_demo.py` 和 `lightrag_openai_compatible_demo.py` 两个文件。其他示例文件均为社区贡献内容,尚未经过完整测试与优化。

## 使用LightRAG Core进行编程

> 如果您希望将LightRAG集成到您的项目中,建议您使用LightRAG Server提供的REST API。LightRAG Core通常用于嵌入式应用,或供希望进行研究与评估的学者使用。

### 一个简单程序

以下Python代码片段演示了如何初始化LightRAG、插入文本并进行查询:

```python
import os
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, gpt_4o_complete, openai_embed
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

WORKING_DIR = "./rag_storage"
if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

async def initialize_rag():
    rag = LightRAG(
        working_dir=WORKING_DIR,
        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete,
    )
    await rag.initialize_storages()
    await initialize_pipeline_status()
    return rag

async def main():
    rag = None
    try:
        # 初始化RAG实例
        rag = await initialize_rag()
        # 插入文本
        await rag.ainsert("Your text")

        # 执行混合检索
        mode = "hybrid"
        print(
            await rag.query(
                "这个故事的主要主题是什么?",
                param=QueryParam(mode=mode)
            )
        )

    except Exception as e:
        print(f"发生错误: {e}")
    finally:
        if rag:
            await rag.finalize_storages()

if __name__ == "__main__":
    asyncio.run(main())
```

重要说明:
- 运行脚本前请先导出你的OPENAI_API_KEY环境变量。
- 该程序使用LightRAG的默认存储设置,所有数据将持久化在WORKING_DIR/rag_storage目录下。
- 该示例仅展示了初始化LightRAG对象的最简单方式:注入embedding和LLM函数,并在创建LightRAG对象后初始化存储和管道状态。

### LightRAG初始化参数

以下是完整的LightRAG对象初始化参数清单:
247 | 参数 248 | 249 | | **参数** | **类型** | **说明** | **默认值** | 250 | |--------------|----------|-----------------|-------------| 251 | | **working_dir** | `str` | 存储缓存的目录 | `lightrag_cache+timestamp` | 252 | | **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` | 253 | | **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` | 254 | | **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` | 255 | | **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` | 256 | | **chunk_token_size** | `int` | 拆分文档时每个块的最大令牌大小 | `1200` | 257 | | **chunk_overlap_token_size** | `int` | 拆分文档时两个块之间的重叠令牌大小 | `100` | 258 | | **tokenizer** | `Tokenizer` | 用于将文本转换为 tokens(数字)以及使用遵循 TokenizerInterface 协议的 .encode() 和 .decode() 函数将 tokens 转换回文本的函数。 如果您不指定,它将使用默认的 Tiktoken tokenizer。 | `TiktokenTokenizer` | 259 | | **tiktoken_model_name** | `str` | 如果您使用的是默认的 Tiktoken tokenizer,那么这是要使用的特定 Tiktoken 模型的名称。如果您提供自己的 tokenizer,则忽略此设置。 | `gpt-4o-mini` | 260 | | **entity_extract_max_gleaning** | `int` | 实体提取过程中的循环次数,附加历史消息 | `1` | 261 | | **node_embedding_algorithm** | `str` | 节点嵌入算法(当前未使用) | `node2vec` | 262 | | **node2vec_params** | `dict` | 节点嵌入的参数 | `{"dimensions": 1536,"num_walks": 10,"walk_length": 40,"window_size": 2,"iterations": 3,"random_seed": 3,}` | 263 | | **embedding_func** | `EmbeddingFunc` | 从文本生成嵌入向量的函数 | `openai_embed` | 264 | | **embedding_batch_num** | `int` | 嵌入过程的最大批量大小(每批发送多个文本) | `32` | 265 | | **embedding_func_max_async** | `int` | 最大并发异步嵌入进程数 | `16` | 266 | | **llm_model_func** | `callable` | LLM生成的函数 | `gpt_4o_mini_complete` | 267 | | **llm_model_name** | `str` | 用于生成的LLM模型名称 | `meta-llama/Llama-3.2-1B-Instruct` | 268 | | **summary_max_tokens** | `int` | 生成实体关系摘要时送给LLM的最大令牌数 | `32000`(默认值由环境变量MAX_TOKENS更改) | 269 | | **llm_model_max_async** | `int` | 最大并发异步LLM进程数 | `4`(默认值由环境变量MAX_ASYNC更改) | 270 | | **llm_model_kwargs** | `dict` | LLM生成的附加参数 | | 271 | | **vector_db_storage_cls_kwargs** | `dict` | 向量数据库的附加参数,如设置节点和关系检索的阈值 | cosine_better_than_threshold: 0.2(默认值由环境变量COSINE_THRESHOLD更改) | 272 | | **enable_llm_cache** | `bool` | 如果为`TRUE`,将LLM结果存储在缓存中;重复的提示返回缓存的响应 | `TRUE` | 273 | | **enable_llm_cache_for_entity_extract** | `bool` | 如果为`TRUE`,将实体提取的LLM结果存储在缓存中;适合初学者调试应用程序 | `TRUE` | 274 | | **addon_params** | `dict` | 附加参数,例如`{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`:设置示例限制、输出语言和文档处理的批量大小 | `example_number: 所有示例, language: English` | 275 | | **embedding_cache_config** | `dict` | 问答缓存的配置。包含三个参数:`enabled`:布尔值,启用/禁用缓存查找功能。启用时,系统将在生成新答案之前检查缓存的响应。`similarity_threshold`:浮点值(0-1),相似度阈值。当新问题与缓存问题的相似度超过此阈值时,将直接返回缓存的答案而不调用LLM。`use_llm_check`:布尔值,启用/禁用LLM相似度验证。启用时,在返回缓存答案之前,将使用LLM作为二次检查来验证问题之间的相似度。 | 默认:`{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` | 276 | 277 |
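下面给出一个最小的初始化示例(仅作示意,参数取值请结合自身环境与上表说明调整),展示如何在构造LightRAG对象时传入上表中的部分常用参数:

```python
from lightrag import LightRAG
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

# 示意性配置:显式选择存储实现并调整分块参数(取值仅供参考)
rag = LightRAG(
    working_dir="./rag_storage",
    kv_storage="JsonKVStorage",            # 文档与文本块的KV存储
    vector_storage="NanoVectorDBStorage",  # 向量存储
    graph_storage="NetworkXStorage",       # 图存储
    doc_status_storage="JsonDocStatusStorage",  # 文档处理状态存储
    chunk_token_size=1200,                 # 每个文本块的最大token数
    chunk_overlap_token_size=100,          # 相邻文本块之间的重叠token数
    embedding_func=openai_embed,
    llm_model_func=gpt_4o_mini_complete,
    vector_db_storage_cls_kwargs={"cosine_better_than_threshold": 0.2},
)
# 与"一个简单程序"中一样,使用前仍需 await rag.initialize_storages() 并调用 initialize_pipeline_status()
```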

### 查询参数

使用QueryParam控制你的查询行为:

```python
class QueryParam:
    """Configuration parameters for query execution in LightRAG."""

    mode: Literal["local", "global", "hybrid", "naive", "mix", "bypass"] = "global"
    """Specifies the retrieval mode:
    - "local": Focuses on context-dependent information.
    - "global": Utilizes global knowledge.
    - "hybrid": Combines local and global retrieval methods.
    - "naive": Performs a basic search without advanced techniques.
    - "mix": Integrates knowledge graph and vector retrieval.
    """

    only_need_context: bool = False
    """If True, only returns the retrieved context without generating a response."""

    only_need_prompt: bool = False
    """If True, only returns the generated prompt without producing a response."""

    response_type: str = "Multiple Paragraphs"
    """Defines the response format. Examples: 'Multiple Paragraphs', 'Single Paragraph', 'Bullet Points'."""

    stream: bool = False
    """If True, enables streaming output for real-time responses."""

    top_k: int = int(os.getenv("TOP_K", "60"))
    """Number of top items to retrieve. Represents entities in 'local' mode and relationships in 'global' mode."""

    chunk_top_k: int = int(os.getenv("CHUNK_TOP_K", "10"))
    """Number of text chunks to retrieve initially from vector search and keep after reranking.
    If None, defaults to top_k value.
    """

    max_entity_tokens: int = int(os.getenv("MAX_ENTITY_TOKENS", "10000"))
    """Maximum number of tokens allocated for entity context in unified token control system."""

    max_relation_tokens: int = int(os.getenv("MAX_RELATION_TOKENS", "10000"))
    """Maximum number of tokens allocated for relationship context in unified token control system."""

    max_total_tokens: int = int(os.getenv("MAX_TOTAL_TOKENS", "30000"))
    """Maximum total tokens budget for the entire query context (entities + relations + chunks + system prompt)."""

    hl_keywords: list[str] = field(default_factory=list)
    """List of high-level keywords to prioritize in retrieval."""

    ll_keywords: list[str] = field(default_factory=list)
    """List of low-level keywords to refine retrieval focus."""

    conversation_history: list[dict[str, str]] = field(default_factory=list)
    """Stores past conversation history to maintain context.
    Format: [{"role": "user/assistant", "content": "message"}].
    """

    # Deprecated: history messages have a negative effect on query performance
    history_turns: int = 0
    """Number of complete conversation turns (user-assistant pairs) to consider in the response context."""

    ids: list[str] | None = None
    """List of ids to filter the results."""

    model_func: Callable[..., object] | None = None
    """Optional override for the LLM model function to use for this specific query.
    If provided, this will be used instead of the global model function.
    This allows using different models for different query modes.
    """

    user_prompt: str | None = None
    """User-provided prompt for the query.
    If provided, this will be used instead of the default value from the prompt template.
    """

    enable_rerank: bool = True
    """Enable reranking for retrieved text chunks. If True but no rerank model is configured, a warning will be issued.
    Default is True to enable reranking when rerank model is available.
    """
```

> top_k的默认值可以通过环境变量TOP_K更改。

### LLM and Embedding注入

LightRAG需要利用LLM和Embedding模型来完成文档索引和知识库查询工作。在初始化LightRAG的时候,需要把LLM和Embedding的操作函数注入到对象中:

<details>
367 | 使用类OpenAI的API 368 | 369 | * LightRAG还支持类OpenAI的聊天/嵌入API: 370 | 371 | ```python 372 | async def llm_model_func( 373 | prompt, system_prompt=None, history_messages=[], keyword_extraction=False, **kwargs 374 | ) -> str: 375 | return await openai_complete_if_cache( 376 | "solar-mini", 377 | prompt, 378 | system_prompt=system_prompt, 379 | history_messages=history_messages, 380 | api_key=os.getenv("UPSTAGE_API_KEY"), 381 | base_url="https://api.upstage.ai/v1/solar", 382 | **kwargs 383 | ) 384 | 385 | async def embedding_func(texts: list[str]) -> np.ndarray: 386 | return await openai_embed( 387 | texts, 388 | model="solar-embedding-1-large-query", 389 | api_key=os.getenv("UPSTAGE_API_KEY"), 390 | base_url="https://api.upstage.ai/v1/solar" 391 | ) 392 | 393 | async def initialize_rag(): 394 | rag = LightRAG( 395 | working_dir=WORKING_DIR, 396 | llm_model_func=llm_model_func, 397 | embedding_func=EmbeddingFunc( 398 | embedding_dim=4096, 399 | func=embedding_func 400 | ) 401 | ) 402 | 403 | await rag.initialize_storages() 404 | await initialize_pipeline_status() 405 | 406 | return rag 407 | ``` 408 | 409 |
410 | 411 |
412 | 使用Hugging Face模型 413 | 414 | * 如果您想使用Hugging Face模型,只需要按如下方式设置LightRAG: 415 | 416 | 参见`lightrag_hf_demo.py` 417 | 418 | ```python 419 | # 使用Hugging Face模型初始化LightRAG 420 | rag = LightRAG( 421 | working_dir=WORKING_DIR, 422 | llm_model_func=hf_model_complete, # 使用Hugging Face模型进行文本生成 423 | llm_model_name='meta-llama/Llama-3.1-8B-Instruct', # Hugging Face的模型名称 424 | # 使用Hugging Face嵌入函数 425 | embedding_func=EmbeddingFunc( 426 | embedding_dim=384, 427 | func=lambda texts: hf_embed( 428 | texts, 429 | tokenizer=AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2"), 430 | embed_model=AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2") 431 | ) 432 | ), 433 | ) 434 | ``` 435 | 436 |
437 | 438 |
439 | 使用Ollama模型 440 | 如果您想使用Ollama模型,您需要拉取计划使用的模型和嵌入模型,例如`nomic-embed-text`。 441 | 442 | 然后您只需要按如下方式设置LightRAG: 443 | 444 | ```python 445 | # 使用Ollama模型初始化LightRAG 446 | rag = LightRAG( 447 | working_dir=WORKING_DIR, 448 | llm_model_func=ollama_model_complete, # 使用Ollama模型进行文本生成 449 | llm_model_name='your_model_name', # 您的模型名称 450 | # 使用Ollama嵌入函数 451 | embedding_func=EmbeddingFunc( 452 | embedding_dim=768, 453 | func=lambda texts: ollama_embed( 454 | texts, 455 | embed_model="nomic-embed-text" 456 | ) 457 | ), 458 | ) 459 | ``` 460 | 461 | * **增加上下文大小** 462 | 463 | 为了使LightRAG正常工作,上下文应至少为32k令牌。默认情况下,Ollama模型的上下文大小为8k。您可以通过以下两种方式之一实现这一点: 464 | 465 | * **在Modelfile中增加`num_ctx`参数** 466 | 467 | 1. 拉取模型: 468 | 469 | ```bash 470 | ollama pull qwen2 471 | ``` 472 | 473 | 2. 显示模型文件: 474 | 475 | ```bash 476 | ollama show --modelfile qwen2 > Modelfile 477 | ``` 478 | 479 | 3. 编辑Modelfile,添加以下行: 480 | 481 | ```bash 482 | PARAMETER num_ctx 32768 483 | ``` 484 | 485 | 4. 创建修改后的模型: 486 | 487 | ```bash 488 | ollama create -f Modelfile qwen2m 489 | ``` 490 | 491 | * **通过Ollama API设置`num_ctx`** 492 | 493 | 您可以使用`llm_model_kwargs`参数配置ollama: 494 | 495 | ```python 496 | rag = LightRAG( 497 | working_dir=WORKING_DIR, 498 | llm_model_func=ollama_model_complete, # 使用Ollama模型进行文本生成 499 | llm_model_name='your_model_name', # 您的模型名称 500 | llm_model_kwargs={"options": {"num_ctx": 32768}}, 501 | # 使用Ollama嵌入函数 502 | embedding_func=EmbeddingFunc( 503 | embedding_dim=768, 504 | func=lambda texts: ollama_embed( 505 | texts, 506 | embed_model="nomic-embed-text" 507 | ) 508 | ), 509 | ) 510 | ``` 511 | 512 | * **低RAM GPU** 513 | 514 | 为了在低RAM GPU上运行此实验,您应该选择小型模型并调整上下文窗口(增加上下文会增加内存消耗)。例如,在6Gb RAM的改装挖矿GPU上运行这个ollama示例需要将上下文大小设置为26k,同时使用`gemma2:2b`。它能够在`book.txt`中找到197个实体和19个关系。 515 | 516 |
517 |
LlamaIndex

LightRAG支持与LlamaIndex集成 (`llm/llama_index_impl.py`):

- 通过LlamaIndex与OpenAI和其他提供商集成
- 详细设置和示例请参见[LlamaIndex文档](lightrag/llm/Readme.md)

**使用示例:**

```python
# 使用LlamaIndex直接访问OpenAI
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.utils import EmbeddingFunc, setup_logger

# 为LightRAG设置日志处理程序
setup_logger("lightrag", level="INFO")

# 定义LlamaIndex嵌入模型(此处使用默认的OpenAI嵌入模型,维度为1536)
embed_model = OpenAIEmbedding()

async def initialize_rag():
    rag = LightRAG(
        working_dir="your/path",
        llm_model_func=llama_index_complete_if_cache,  # LlamaIndex兼容的完成函数
        embedding_func=EmbeddingFunc(  # LlamaIndex兼容的嵌入函数
            embedding_dim=1536,
            func=lambda texts: llama_index_embed(texts, embed_model=embed_model)
        ),
    )

    await rag.initialize_storages()
    await initialize_pipeline_status()

    return rag

def main():
    # 初始化RAG实例
    rag = asyncio.run(initialize_rag())

    with open("./book.txt", "r", encoding="utf-8") as f:
        rag.insert(f.read())

    # 执行朴素搜索
    print(
        rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="naive"))
    )

    # 执行本地搜索
    print(
        rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="local"))
    )

    # 执行全局搜索
    print(
        rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="global"))
    )

    # 执行混合搜索
    print(
        rag.query("这个故事的主要主题是什么?", param=QueryParam(mode="hybrid"))
    )

if __name__ == "__main__":
    main()
```

**详细文档和示例,请参见:**

- [LlamaIndex文档](lightrag/llm/Readme.md)
- [直接OpenAI示例](examples/lightrag_llamaindex_direct_demo.py)
- [LiteLLM代理示例](examples/lightrag_llamaindex_litellm_demo.py)

</details>
593 | 594 | ### 对话历史 595 | 596 | LightRAG现在通过对话历史功能支持多轮对话。以下是使用方法: 597 | 598 | ```python 599 | # 创建对话历史 600 | conversation_history = [ 601 | {"role": "user", "content": "主角对圣诞节的态度是什么?"}, 602 | {"role": "assistant", "content": "在故事开始时,埃比尼泽·斯克鲁奇对圣诞节持非常消极的态度..."}, 603 | {"role": "user", "content": "他的态度是如何改变的?"} 604 | ] 605 | 606 | # 创建带有对话历史的查询参数 607 | query_param = QueryParam( 608 | mode="mix", # 或其他模式:"local"、"global"、"hybrid" 609 | conversation_history=conversation_history, # 添加对话历史 610 | history_turns=3 # 考虑最近的对话轮数 611 | ) 612 | 613 | # 进行考虑对话历史的查询 614 | response = rag.query( 615 | "是什么导致了他性格的这种变化?", 616 | param=query_param 617 | ) 618 | ``` 619 | 620 | ### 用户提示词 vs. 查询内容 621 | 622 | 当使用LightRAG查询内容的时候,不要把内容查询和与查询结果无关的输出加工写在一起。因为把两者混在一起会严重影响查询的效果。Query Param中的`user_prompt`就是为解决这一问题而设计的。`user_prompt`中的内容不参与RAG中的查询过程,它仅会在获得查询结果之后,与查询结果一起送给LLM,指导LLM如何处理查询结果。以下是使用方法: 623 | 624 | ```python 625 | # Create query parameters 626 | query_param = QueryParam( 627 | mode = "hybrid", # Other modes:local, global, hybrid, mix, naive 628 | user_prompt = "如需画图使用mermaid格式,节点名称用英文或拼音,显示名称用中文", 629 | ) 630 | 631 | # Query and process 632 | response_default = rag.query( 633 | "请画出 Scrooge 的人物关系图谱", 634 | param=query_param 635 | ) 636 | print(response_default) 637 | ``` 638 | 639 | ### 插入 640 | 641 |
642 | 基本插入 643 | 644 | ```python 645 | # 基本插入 646 | rag.insert("文本") 647 | ``` 648 | 649 |
650 | 651 |
652 | 批量插入 653 | 654 | ```python 655 | # 基本批量插入:一次插入多个文本 656 | rag.insert(["文本1", "文本2",...]) 657 | 658 | # 带有自定义批量大小配置的批量插入 659 | rag = LightRAG( 660 | ... 661 | working_dir=WORKING_DIR, 662 | max_parallel_insert = 4 663 | ) 664 | 665 | rag.insert(["文本1", "文本2", "文本3", ...]) # 文档将以4个为一批进行处理 666 | ``` 667 | 668 | 参数 `max_parallel_insert` 用于控制文档索引流水线中并行处理的文档数量。若未指定,默认值为 **2**。建议将该参数设置为 **10 以下**,因为性能瓶颈通常出现在大语言模型(LLM)的处理环节。 669 | 670 |
671 | 672 |
673 | 带ID插入 674 | 675 | 如果您想为文档提供自己的ID,文档数量和ID数量必须相同。 676 | 677 | ```python 678 | # 插入单个文本,并为其提供ID 679 | rag.insert("文本1", ids=["文本1的ID"]) 680 | 681 | # 插入多个文本,并为它们提供ID 682 | rag.insert(["文本1", "文本2",...], ids=["文本1的ID", "文本2的ID"]) 683 | ``` 684 | 685 |
686 | 687 |
使用流水线插入

`apipeline_enqueue_documents`和`apipeline_process_enqueue_documents`函数允许您将文档增量插入到知识图谱中。

这对于需要在后台处理文档的场景很有用:主线程可以继续执行,由后台例程处理新入队的文档。

```python
rag = LightRAG(...)  # 初始化参数与前文示例相同

await rag.apipeline_enqueue_documents(input)
# 您的循环例程
await rag.apipeline_process_enqueue_documents(input)
```

</details>
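如果希望让文档索引真正在后台运行,可以把队列处理放进一个异步任务中。下面是一个示意性写法(沿用上文的调用方式,具体函数签名以实际版本为准):

```python
import asyncio

async def background_indexing(rag, docs: list[str]):
    # 先把文档加入处理队列
    await rag.apipeline_enqueue_documents(docs)
    # 把队列处理放到后台任务中,主协程可以继续执行其他逻辑
    task = asyncio.create_task(rag.apipeline_process_enqueue_documents(docs))
    print("文档已入队,主流程继续执行……")
    # 需要等待索引完成时再 await 该任务
    await task
```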
705 | 706 |
707 | 插入多文件类型支持 708 | 709 | `textract`支持读取TXT、DOCX、PPTX、CSV和PDF等文件类型。 710 | 711 | ```python 712 | import textract 713 | 714 | file_path = 'TEXT.pdf' 715 | text_content = textract.process(file_path) 716 | 717 | rag.insert(text_content.decode('utf-8')) 718 | ``` 719 | 720 |
721 | 722 |
723 | 引文功能 724 | 725 | 通过提供文件路径,系统确保可以将来源追溯到其原始文档。 726 | 727 | ```python 728 | # 定义文档及其文件路径 729 | documents = ["文档内容1", "文档内容2"] 730 | file_paths = ["path/to/doc1.txt", "path/to/doc2.txt"] 731 | 732 | # 插入带有文件路径的文档 733 | rag.insert(documents, file_paths=file_paths) 734 | ``` 735 | 736 |

### 存储

LightRAG 使用 4 种类型的存储用于不同目的:

* KV_STORAGE:LLM响应缓存、文本块、文档信息
* VECTOR_STORAGE:实体向量、关系向量、块向量
* GRAPH_STORAGE:实体关系图
* DOC_STATUS_STORAGE:文档索引状态

每种存储类型都有几种实现:

* KV_STORAGE 支持的实现名称

```
JsonKVStorage    JsonFile(默认)
PGKVStorage      Postgres
RedisKVStorage   Redis
MongoKVStorage   MongoDB
```

* GRAPH_STORAGE 支持的实现名称

```
NetworkXStorage  NetworkX(默认)
Neo4JStorage     Neo4J
PGGraphStorage   PostgreSQL with AGE plugin
```

> 在测试中Neo4j图形数据库相比PostgreSQL AGE有更好的性能表现。

* VECTOR_STORAGE 支持的实现名称

```
NanoVectorDBStorage    NanoVector(默认)
PGVectorStorage        Postgres
MilvusVectorDBStorage  Milvus
FaissVectorDBStorage   Faiss
QdrantVectorDBStorage  Qdrant
MongoVectorDBStorage   MongoDB
```

* DOC_STATUS_STORAGE 支持的实现名称

```
JsonDocStatusStorage   JsonFile(默认)
PGDocStatusStorage     Postgres
MongoDocStatusStorage  MongoDB
```

每一种存储类型的连接配置范例可以在 `env.example` 文件中找到。连接字符串中的数据库实例需要你预先在数据库服务器上创建好,LightRAG 仅负责在数据库实例中创建数据表,不负责创建数据库实例。如果使用 Redis 作为存储,记得给 Redis 配置自动持久化数据规则,否则 Redis 服务重启后数据会丢失。如果使用PostgreSQL数据库,推荐使用16.6版本或以上。

<details>
使用Neo4J进行存储

* 对于生产级场景,您很可能想要利用企业级解决方案进行KG存储。推荐在Docker中运行Neo4J以进行无缝本地测试。
* 参见:https://hub.docker.com/_/neo4j

```python
import os

# 通过环境变量提供Neo4j连接信息
os.environ["NEO4J_URI"] = "neo4j://localhost:7687"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "password"

# 为LightRAG设置日志记录器
setup_logger("lightrag", level="INFO")

# 当您启动项目时,请确保通过指定graph_storage="Neo4JStorage"来覆盖默认的KG:NetworkX。

# 注意:默认设置使用NetworkX
# 使用Neo4J实现初始化LightRAG。
async def initialize_rag():
    rag = LightRAG(
        working_dir=WORKING_DIR,
        llm_model_func=gpt_4o_mini_complete,  # 使用gpt_4o_mini_complete LLM模型
        graph_storage="Neo4JStorage",  # <-----------覆盖KG默认值
    )

    # 初始化数据库连接
    await rag.initialize_storages()
    # 初始化文档处理的管道状态
    await initialize_pipeline_status()

    return rag
```

参见test_neo4j.py获取工作示例。

</details>
826 | 827 |
使用Faiss进行存储
在使用Faiss向量数据库之前必须手工安装`faiss-cpu`或`faiss-gpu`。

- 安装所需依赖:

```bash
pip install faiss-cpu
```

如果您有GPU支持,也可以安装`faiss-gpu`。

- 这里我们使用`sentence-transformers`,但您也可以使用维度为`3072`的`OpenAIEmbedding`模型。

```python
import numpy as np
from sentence_transformers import SentenceTransformer

async def embedding_func(texts: list[str]) -> np.ndarray:
    model = SentenceTransformer('all-MiniLM-L6-v2')
    embeddings = model.encode(texts, convert_to_numpy=True)
    return embeddings

# 使用LLM模型函数和嵌入函数初始化LightRAG
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=384,
        func=embedding_func,
    ),
    vector_storage="FaissVectorDBStorage",
    vector_db_storage_cls_kwargs={
        "cosine_better_than_threshold": 0.3  # 您期望的阈值
    }
)
```

</details>
863 | 864 |
使用PostgreSQL进行存储

对于生产级场景,您很可能想要利用企业级解决方案。PostgreSQL可以为您提供一站式解决方案,作为KV存储、向量数据库(pgvector)和图数据库(Apache AGE)。支持的 PostgreSQL 版本为16.6或以上。

* PostgreSQL很轻量,整个二进制发行版包括所有必要的插件可以压缩到40MB:参考[Windows发布版](https://github.com/ShanGor/apache-age-windows/releases/tag/PG17%2Fv1.5.0-rc0),它在Linux/Mac上也很容易安装。
* 如果您是初学者并想避免麻烦,推荐使用Docker,请从这个镜像开始(请务必阅读概述):https://hub.docker.com/r/shangor/postgres-for-rag
* 如何开始?参考:[examples/lightrag_zhipu_postgres_demo.py](https://github.com/HKUDS/LightRAG/blob/main/examples/lightrag_zhipu_postgres_demo.py)
* Apache AGE的性能不如Neo4j。追求高性能的图数据库请使用Neo4j。

</details>

### LightRAG实例间的数据隔离

通过 workspace 参数可以实现不同LightRAG实例之间的存储数据隔离。LightRAG在初始化后workspace就已经确定,之后修改workspace是无效的。下面是不同类型的存储实现工作空间的方式:

- **对于本地基于文件的数据库,数据隔离通过工作空间子目录实现:** JsonKVStorage, JsonDocStatusStorage, NetworkXStorage, NanoVectorDBStorage, FaissVectorDBStorage。
- **对于将数据存储在集合(collection)中的数据库,通过在集合名称前添加工作空间前缀来实现:** RedisKVStorage, RedisDocStatusStorage, MilvusVectorDBStorage, QdrantVectorDBStorage, MongoKVStorage, MongoDocStatusStorage, MongoVectorDBStorage, MongoGraphStorage, PGGraphStorage。
- **对于关系型数据库,数据隔离通过向表中添加 `workspace` 字段进行数据的逻辑隔离:** PGKVStorage, PGVectorStorage, PGDocStatusStorage。
- **对于Neo4j图数据库,通过label来实现数据的逻辑隔离:** Neo4JStorage

为了保持对遗留数据的兼容,在未配置工作空间时PostgreSQL非图存储的工作空间为`default`,PostgreSQL AGE图存储的工作空间为空,Neo4j图存储的默认工作空间为`base`。对于所有的外部存储,系统都提供了专用的工作空间环境变量,用于覆盖公共的 `WORKSPACE` 环境变量配置。这些适用于指定存储类型的工作空间环境变量为:`REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`。

## 编辑实体和关系

LightRAG现在支持全面的知识图谱管理功能,允许您在知识图谱中创建、编辑和删除实体和关系。

<details>
894 | 创建实体和关系 895 | 896 | ```python 897 | # 创建新实体 898 | entity = rag.create_entity("Google", { 899 | "description": "Google是一家专注于互联网相关服务和产品的跨国科技公司。", 900 | "entity_type": "company" 901 | }) 902 | 903 | # 创建另一个实体 904 | product = rag.create_entity("Gmail", { 905 | "description": "Gmail是由Google开发的电子邮件服务。", 906 | "entity_type": "product" 907 | }) 908 | 909 | # 创建实体之间的关系 910 | relation = rag.create_relation("Google", "Gmail", { 911 | "description": "Google开发和运营Gmail。", 912 | "keywords": "开发 运营 服务", 913 | "weight": 2.0 914 | }) 915 | ``` 916 | 917 |
918 | 919 |
920 | 编辑实体和关系 921 | 922 | ```python 923 | # 编辑现有实体 924 | updated_entity = rag.edit_entity("Google", { 925 | "description": "Google是Alphabet Inc.的子公司,成立于1998年。", 926 | "entity_type": "tech_company" 927 | }) 928 | 929 | # 重命名实体(所有关系都会正确迁移) 930 | renamed_entity = rag.edit_entity("Gmail", { 931 | "entity_name": "Google Mail", 932 | "description": "Google Mail(前身为Gmail)是一项电子邮件服务。" 933 | }) 934 | 935 | # 编辑实体之间的关系 936 | updated_relation = rag.edit_relation("Google", "Google Mail", { 937 | "description": "Google创建并维护Google Mail服务。", 938 | "keywords": "创建 维护 电子邮件服务", 939 | "weight": 3.0 940 | }) 941 | ``` 942 | 943 | 所有操作都有同步和异步版本。异步版本带有前缀"a"(例如,`acreate_entity`,`aedit_relation`)。 944 | 945 |
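下面是一个使用异步版本接口的示意代码(接口名称按上文"同步接口加前缀a"的约定给出,实体与属性仅为示例):

```python
import asyncio

async def update_kg(rag):
    # 创建两个实体,再在它们之间创建关系
    await rag.acreate_entity("OpenAI", {
        "description": "OpenAI是一家AI研究与产品公司。",
        "entity_type": "company"
    })
    await rag.acreate_entity("ChatGPT", {
        "description": "ChatGPT是OpenAI开发的对话式AI产品。",
        "entity_type": "product"
    })
    await rag.acreate_relation("OpenAI", "ChatGPT", {
        "description": "OpenAI开发并运营ChatGPT。",
        "keywords": "开发 运营",
        "weight": 2.0
    })
    # 更新实体属性
    await rag.aedit_entity("OpenAI", {"entity_type": "organization"})

asyncio.run(update_kg(rag))
```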
946 | 947 |
948 | 插入自定义知识 949 | 950 | ```python 951 | custom_kg = { 952 | "chunks": [ 953 | { 954 | "content": "Alice和Bob正在合作进行量子计算研究。", 955 | "source_id": "doc-1" 956 | } 957 | ], 958 | "entities": [ 959 | { 960 | "entity_name": "Alice", 961 | "entity_type": "person", 962 | "description": "Alice是一位专门研究量子物理的研究员。", 963 | "source_id": "doc-1" 964 | }, 965 | { 966 | "entity_name": "Bob", 967 | "entity_type": "person", 968 | "description": "Bob是一位数学家。", 969 | "source_id": "doc-1" 970 | }, 971 | { 972 | "entity_name": "量子计算", 973 | "entity_type": "technology", 974 | "description": "量子计算利用量子力学现象进行计算。", 975 | "source_id": "doc-1" 976 | } 977 | ], 978 | "relationships": [ 979 | { 980 | "src_id": "Alice", 981 | "tgt_id": "Bob", 982 | "description": "Alice和Bob是研究伙伴。", 983 | "keywords": "合作 研究", 984 | "weight": 1.0, 985 | "source_id": "doc-1" 986 | }, 987 | { 988 | "src_id": "Alice", 989 | "tgt_id": "量子计算", 990 | "description": "Alice进行量子计算研究。", 991 | "keywords": "研究 专业", 992 | "weight": 1.0, 993 | "source_id": "doc-1" 994 | }, 995 | { 996 | "src_id": "Bob", 997 | "tgt_id": "量子计算", 998 | "description": "Bob研究量子计算。", 999 | "keywords": "研究 应用", 1000 | "weight": 1.0, 1001 | "source_id": "doc-1" 1002 | } 1003 | ] 1004 | } 1005 | 1006 | rag.insert_custom_kg(custom_kg) 1007 | ``` 1008 | 1009 |
1010 | 1011 |
1012 | 其它实体与关系操作 1013 | 1014 | - **create_entity**:创建具有指定属性的新实体 1015 | - **edit_entity**:更新现有实体的属性或重命名它 1016 | 1017 | - **create_relation**:在现有实体之间创建新关系 1018 | - **edit_relation**:更新现有关系的属性 1019 | 1020 | 这些操作在图数据库和向量数据库组件之间保持数据一致性,确保您的知识图谱保持连贯。 1021 | 1022 |
1023 | 1024 | ## 删除功能 1025 | 1026 | LightRAG提供了全面的删除功能,允许您删除文档、实体和关系。 1027 | 1028 |
1029 | 删除实体 1030 | 1031 | 您可以通过实体名称删除实体及其所有关联关系: 1032 | 1033 | ```python 1034 | # 删除实体及其所有关系(同步版本) 1035 | rag.delete_by_entity("Google") 1036 | 1037 | # 异步版本 1038 | await rag.adelete_by_entity("Google") 1039 | ``` 1040 | 1041 | 删除实体时会: 1042 | - 从知识图谱中移除该实体节点 1043 | - 删除该实体的所有关联关系 1044 | - 从向量数据库中移除相关的嵌入向量 1045 | - 保持知识图谱的完整性 1046 | 1047 |
1048 | 1049 |
1050 | 删除关系 1051 | 1052 | 您可以删除两个特定实体之间的关系: 1053 | 1054 | ```python 1055 | # 删除两个实体之间的关系(同步版本) 1056 | rag.delete_by_relation("Google", "Gmail") 1057 | 1058 | # 异步版本 1059 | await rag.adelete_by_relation("Google", "Gmail") 1060 | ``` 1061 | 1062 | 删除关系时会: 1063 | - 移除指定的关系边 1064 | - 从向量数据库中删除关系的嵌入向量 1065 | - 保留两个实体节点及其他关系 1066 | 1067 |
1068 | 1069 |
1070 | 通过文档ID删除 1071 | 1072 | 您可以通过文档ID删除整个文档及其相关的所有知识: 1073 | 1074 | ```python 1075 | # 通过文档ID删除(异步版本) 1076 | await rag.adelete_by_doc_id("doc-12345") 1077 | ``` 1078 | 1079 | 通过文档ID删除时的优化处理: 1080 | - **智能清理**:自动识别并删除仅属于该文档的实体和关系 1081 | - **保留共享知识**:如果实体或关系在其他文档中也存在,则会保留并重新构建描述 1082 | - **缓存优化**:清理相关的LLM缓存以减少存储开销 1083 | - **增量重建**:从剩余文档重新构建受影响的实体和关系描述 1084 | 1085 | 删除过程包括: 1086 | 1. 删除文档相关的所有文本块 1087 | 2. 识别仅属于该文档的实体和关系并删除 1088 | 3. 重新构建在其他文档中仍存在的实体和关系 1089 | 4. 更新所有相关的向量索引 1090 | 5. 清理文档状态记录 1091 | 1092 | 注意:通过文档ID删除是一个异步操作,因为它涉及复杂的知识图谱重构过程。 1093 | 1094 |
1095 | 1096 |
1097 | 删除注意事项 1098 | 1099 | **重要提醒:** 1100 | 1101 | 1. **不可逆操作**:所有删除操作都是不可逆的,请谨慎使用 1102 | 2. **性能考虑**:删除大量数据时可能需要一些时间,特别是通过文档ID删除 1103 | 3. **数据一致性**:删除操作会自动维护知识图谱和向量数据库之间的一致性 1104 | 4. **备份建议**:在执行重要删除操作前建议备份数据 1105 | 1106 | **批量删除建议:** 1107 | - 对于批量删除操作,建议使用异步方法以获得更好的性能 1108 | - 大规模删除时,考虑分批进行以避免系统负载过高 1109 | 1110 |
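针对上面提到的"分批进行批量删除"的建议,下面给出一个示意性的批量删除脚本(假设 doc_ids 为待删除的文档ID列表):

```python
import asyncio

async def delete_docs_in_batches(rag, doc_ids: list[str], batch_size: int = 10):
    """按批次异步删除文档,避免一次性删除造成系统负载过高(示意代码)。"""
    for i in range(0, len(doc_ids), batch_size):
        batch = doc_ids[i:i + batch_size]
        for doc_id in batch:
            await rag.adelete_by_doc_id(doc_id)
        print(f"已删除 {i + len(batch)}/{len(doc_ids)} 个文档")

# 用法示例
# asyncio.run(delete_docs_in_batches(rag, ["doc-1", "doc-2", "doc-3"]))
```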
1111 | 1112 | ## 实体合并 1113 | 1114 |
1115 | 合并实体及其关系 1116 | 1117 | LightRAG现在支持将多个实体合并为单个实体,自动处理所有关系: 1118 | 1119 | ```python 1120 | # 基本实体合并 1121 | rag.merge_entities( 1122 | source_entities=["人工智能", "AI", "机器智能"], 1123 | target_entity="AI技术" 1124 | ) 1125 | ``` 1126 | 1127 | 使用自定义合并策略: 1128 | 1129 | ```python 1130 | # 为不同字段定义自定义合并策略 1131 | rag.merge_entities( 1132 | source_entities=["约翰·史密斯", "史密斯博士", "J·史密斯"], 1133 | target_entity="约翰·史密斯", 1134 | merge_strategy={ 1135 | "description": "concatenate", # 组合所有描述 1136 | "entity_type": "keep_first", # 保留第一个实体的类型 1137 | "source_id": "join_unique" # 组合所有唯一的源ID 1138 | } 1139 | ) 1140 | ``` 1141 | 1142 | 使用自定义目标实体数据: 1143 | 1144 | ```python 1145 | # 为合并后的实体指定确切值 1146 | rag.merge_entities( 1147 | source_entities=["纽约", "NYC", "大苹果"], 1148 | target_entity="纽约市", 1149 | target_entity_data={ 1150 | "entity_type": "LOCATION", 1151 | "description": "纽约市是美国人口最多的城市。", 1152 | } 1153 | ) 1154 | ``` 1155 | 1156 | 结合两种方法的高级用法: 1157 | 1158 | ```python 1159 | # 使用策略和自定义数据合并公司实体 1160 | rag.merge_entities( 1161 | source_entities=["微软公司", "Microsoft Corporation", "MSFT"], 1162 | target_entity="微软", 1163 | merge_strategy={ 1164 | "description": "concatenate", # 组合所有描述 1165 | "source_id": "join_unique" # 组合源ID 1166 | }, 1167 | target_entity_data={ 1168 | "entity_type": "ORGANIZATION", 1169 | } 1170 | ) 1171 | ``` 1172 | 1173 | 合并实体时: 1174 | 1175 | * 所有来自源实体的关系都会重定向到目标实体 1176 | * 重复的关系会被智能合并 1177 | * 防止自我关系(循环) 1178 | * 合并后删除源实体 1179 | * 保留关系权重和属性 1180 | 1181 |
1182 | 1183 | ## 多模态文档处理(RAG-Anything集成) 1184 | 1185 | LightRAG 现已与 [RAG-Anything](https://github.com/HKUDS/RAG-Anything) 实现无缝集成,这是一个专为 LightRAG 构建的**全能多模态文档处理RAG系统**。RAG-Anything 提供先进的解析和检索增强生成(RAG)能力,让您能够无缝处理多模态文档,并从各种文档格式中提取结构化内容——包括文本、图片、表格和公式——以集成到您的RAG流程中。 1186 | 1187 | **主要特性:** 1188 | - **端到端多模态流程**:从文档摄取解析到智能多模态问答的完整工作流程 1189 | - **通用文档支持**:无缝处理PDF、Office文档(DOC/DOCX/PPT/PPTX/XLS/XLSX)、图片和各种文件格式 1190 | - **专业内容分析**:针对图片、表格、数学公式和异构内容类型的专用处理器 1191 | - **多模态知识图谱**:自动实体提取和跨模态关系发现以增强理解 1192 | - **混合智能检索**:覆盖文本和多模态内容的高级搜索能力,具备上下文理解 1193 | 1194 | **快速开始:** 1195 | 1. 安装RAG-Anything: 1196 | ```bash 1197 | pip install raganything 1198 | ``` 1199 | 2. 处理多模态文档: 1200 |
1201 | RAGAnything 使用示例 1202 | 1203 | ```python 1204 | import asyncio 1205 | from raganything import RAGAnything 1206 | from lightrag import LightRAG 1207 | from lightrag.llm.openai import openai_complete_if_cache, openai_embed 1208 | from lightrag.utils import EmbeddingFunc 1209 | import os 1210 | 1211 | async def load_existing_lightrag(): 1212 | # 首先,创建或加载现有的 LightRAG 实例 1213 | lightrag_working_dir = "./existing_lightrag_storage" 1214 | 1215 | # 检查是否存在之前的 LightRAG 实例 1216 | if os.path.exists(lightrag_working_dir) and os.listdir(lightrag_working_dir): 1217 | print("✅ Found existing LightRAG instance, loading...") 1218 | else: 1219 | print("❌ No existing LightRAG instance found, will create new one") 1220 | 1221 | # 使用您的配置创建/加载 LightRAG 实例 1222 | lightrag_instance = LightRAG( 1223 | working_dir=lightrag_working_dir, 1224 | llm_model_func=lambda prompt, system_prompt=None, history_messages=[], **kwargs: openai_complete_if_cache( 1225 | "gpt-4o-mini", 1226 | prompt, 1227 | system_prompt=system_prompt, 1228 | history_messages=history_messages, 1229 | api_key="your-api-key", 1230 | **kwargs, 1231 | ), 1232 | embedding_func=EmbeddingFunc( 1233 | embedding_dim=3072, 1234 | func=lambda texts: openai_embed( 1235 | texts, 1236 | model="text-embedding-3-large", 1237 | api_key=api_key, 1238 | base_url=base_url, 1239 | ), 1240 | ) 1241 | ) 1242 | 1243 | # 初始化存储(如果有现有数据,这将加载现有数据) 1244 | await lightrag_instance.initialize_storages() 1245 | 1246 | # 现在使用现有的 LightRAG 实例初始化 RAGAnything 1247 | rag = RAGAnything( 1248 | lightrag=lightrag_instance, # 传递现有的 LightRAG 实例 1249 | # 仅需要视觉模型用于多模态处理 1250 | vision_model_func=lambda prompt, system_prompt=None, history_messages=[], image_data=None, **kwargs: openai_complete_if_cache( 1251 | "gpt-4o", 1252 | "", 1253 | system_prompt=None, 1254 | history_messages=[], 1255 | messages=[ 1256 | {"role": "system", "content": system_prompt} if system_prompt else None, 1257 | {"role": "user", "content": [ 1258 | {"type": "text", "text": prompt}, 1259 | {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}} 1260 | ]} if image_data else {"role": "user", "content": prompt} 1261 | ], 1262 | api_key="your-api-key", 1263 | **kwargs, 1264 | ) if image_data else openai_complete_if_cache( 1265 | "gpt-4o-mini", 1266 | prompt, 1267 | system_prompt=system_prompt, 1268 | history_messages=history_messages, 1269 | api_key="your-api-key", 1270 | **kwargs, 1271 | ) 1272 | # 注意:working_dir、llm_model_func、embedding_func 等都从 lightrag_instance 继承 1273 | ) 1274 | 1275 | # 查询现有的知识库 1276 | result = await rag.query_with_multimodal( 1277 | "What data has been processed in this LightRAG instance?", 1278 | mode="hybrid" 1279 | ) 1280 | print("Query result:", result) 1281 | 1282 | # 向现有的 LightRAG 实例添加新的多模态文档 1283 | await rag.process_document_complete( 1284 | file_path="path/to/new/multimodal_document.pdf", 1285 | output_dir="./output" 1286 | ) 1287 | 1288 | if __name__ == "__main__": 1289 | asyncio.run(load_existing_lightrag()) 1290 | ``` 1291 | 1292 |
1293 | 1294 | 如需详细文档和高级用法,请参阅 [RAG-Anything 仓库](https://github.com/HKUDS/RAG-Anything)。 1295 | 1296 | ## Token统计功能 1297 | 1298 |
1299 | 概述和使用 1300 | 1301 | LightRAG提供了TokenTracker工具来跟踪和管理大模型的token消耗。这个功能对于控制API成本和优化性能特别有用。 1302 | 1303 | ### 使用方法 1304 | 1305 | ```python 1306 | from lightrag.utils import TokenTracker 1307 | 1308 | # 创建TokenTracker实例 1309 | token_tracker = TokenTracker() 1310 | 1311 | # 方法1:使用上下文管理器(推荐) 1312 | # 适用于需要自动跟踪token使用的场景 1313 | with token_tracker: 1314 | result1 = await llm_model_func("你的问题1") 1315 | result2 = await llm_model_func("你的问题2") 1316 | 1317 | # 方法2:手动添加token使用记录 1318 | # 适用于需要更精细控制token统计的场景 1319 | token_tracker.reset() 1320 | 1321 | rag.insert() 1322 | 1323 | rag.query("你的问题1", param=QueryParam(mode="naive")) 1324 | rag.query("你的问题2", param=QueryParam(mode="mix")) 1325 | 1326 | # 显示总token使用量(包含插入和查询操作) 1327 | print("Token usage:", token_tracker.get_usage()) 1328 | ``` 1329 | 1330 | ### 使用建议 1331 | - 在长会话或批量操作中使用上下文管理器,可以自动跟踪所有token消耗 1332 | - 对于需要分段统计的场景,使用手动模式并适时调用reset() 1333 | - 定期检查token使用情况,有助于及时发现异常消耗 1334 | - 在开发测试阶段积极使用此功能,以便优化生产环境的成本 1335 | 1336 | ### 实际应用示例 1337 | 您可以参考以下示例来实现token统计: 1338 | - `examples/lightrag_gemini_track_token_demo.py`:使用Google Gemini模型的token统计示例 1339 | - `examples/lightrag_siliconcloud_track_token_demo.py`:使用SiliconCloud模型的token统计示例 1340 | 1341 | 这些示例展示了如何在不同模型和场景下有效地使用TokenTracker功能。 1342 | 1343 |
1344 | 1345 | ## 数据导出功能 1346 | 1347 | ### 概述 1348 | 1349 | LightRAG允许您以各种格式导出知识图谱数据,用于分析、共享和备份目的。系统支持导出实体、关系和关系数据。 1350 | 1351 | ### 导出功能 1352 | 1353 | #### 基本用法 1354 | 1355 | ```python 1356 | # 基本CSV导出(默认格式) 1357 | rag.export_data("knowledge_graph.csv") 1358 | 1359 | # 指定任意格式 1360 | rag.export_data("output.xlsx", file_format="excel") 1361 | ``` 1362 | 1363 | #### 支持的不同文件格式 1364 | 1365 | ```python 1366 | # 以CSV格式导出数据 1367 | rag.export_data("graph_data.csv", file_format="csv") 1368 | 1369 | # 导出数据到Excel表格 1370 | rag.export_data("graph_data.xlsx", file_format="excel") 1371 | 1372 | # 以markdown格式导出数据 1373 | rag.export_data("graph_data.md", file_format="md") 1374 | 1375 | # 导出数据为文本 1376 | rag.export_data("graph_data.txt", file_format="txt") 1377 | ``` 1378 | 1379 | #### 附加选项 1380 | 1381 | 在导出中包含向量嵌入(可选): 1382 | 1383 | ```python 1384 | rag.export_data("complete_data.csv", include_vector_data=True) 1385 | ``` 1386 | 1387 | ### 导出数据包括 1388 | 1389 | 所有导出包括: 1390 | 1391 | * 实体信息(名称、ID、元数据) 1392 | * 关系数据(实体之间的连接) 1393 | * 来自向量数据库的关系信息 1394 | 1395 | ## 缓存 1396 | 1397 |
1398 | 清除缓存 1399 | 1400 | 您可以使用不同模式清除LLM响应缓存: 1401 | 1402 | ```python 1403 | # 清除所有缓存 1404 | await rag.aclear_cache() 1405 | 1406 | # 清除本地模式缓存 1407 | await rag.aclear_cache(modes=["local"]) 1408 | 1409 | # 清除提取缓存 1410 | await rag.aclear_cache(modes=["default"]) 1411 | 1412 | # 清除多个模式 1413 | await rag.aclear_cache(modes=["local", "global", "hybrid"]) 1414 | 1415 | # 同步版本 1416 | rag.clear_cache(modes=["local"]) 1417 | ``` 1418 | 1419 | 有效的模式包括: 1420 | 1421 | - `"default"`:提取缓存 1422 | - `"naive"`:朴素搜索缓存 1423 | - `"local"`:本地搜索缓存 1424 | - `"global"`:全局搜索缓存 1425 | - `"hybrid"`:混合搜索缓存 1426 | - `"mix"`:混合搜索缓存 1427 | 1428 |
1429 | 1430 | ## LightRAG API 1431 | 1432 | LightRAG服务器旨在提供Web UI和API支持。**有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。** 1433 | 1434 | ## 知识图谱可视化 1435 | 1436 | LightRAG服务器提供全面的知识图谱可视化功能。它支持各种重力布局、节点查询、子图过滤等。**有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。** 1437 | 1438 | ![iShot_2025-03-23_12.40.08](./README.assets/iShot_2025-03-23_12.40.08.png) 1439 | 1440 | ## 评估 1441 | 1442 | ### 数据集 1443 | 1444 | LightRAG使用的数据集可以从[TommyChien/UltraDomain](https://huggingface.co/datasets/TommyChien/UltraDomain)下载。 1445 | 1446 | ### 生成查询 1447 | 1448 | LightRAG使用以下提示生成高级查询,相应的代码在`example/generate_query.py`中。 1449 | 1450 |
1451 | 提示 1452 | 1453 | ```python 1454 | 给定以下数据集描述: 1455 | 1456 | {description} 1457 | 1458 | 请识别5个可能会使用此数据集的潜在用户。对于每个用户,列出他们会使用此数据集执行的5个任务。然后,对于每个(用户,任务)组合,生成5个需要对整个数据集有高级理解的问题。 1459 | 1460 | 按以下结构输出结果: 1461 | - 用户1:[用户描述] 1462 | - 任务1:[任务描述] 1463 | - 问题1: 1464 | - 问题2: 1465 | - 问题3: 1466 | - 问题4: 1467 | - 问题5: 1468 | - 任务2:[任务描述] 1469 | ... 1470 | - 任务5:[任务描述] 1471 | - 用户2:[用户描述] 1472 | ... 1473 | - 用户5:[用户描述] 1474 | ... 1475 | ``` 1476 | 1477 |
1478 | 1479 | ### 批量评估 1480 | 1481 | 为了评估两个RAG系统在高级查询上的性能,LightRAG使用以下提示,具体代码可在`example/batch_eval.py`中找到。 1482 | 1483 |
1484 | 提示 1485 | 1486 | ```python 1487 | ---角色--- 1488 | 您是一位专家,负责根据三个标准评估同一问题的两个答案:**全面性**、**多样性**和**赋能性**。 1489 | ---目标--- 1490 | 您将根据三个标准评估同一问题的两个答案:**全面性**、**多样性**和**赋能性**。 1491 | 1492 | - **全面性**:答案提供了多少细节来涵盖问题的所有方面和细节? 1493 | - **多样性**:答案在提供关于问题的不同视角和见解方面有多丰富多样? 1494 | - **赋能性**:答案在多大程度上帮助读者理解并对主题做出明智判断? 1495 | 1496 | 对于每个标准,选择更好的答案(答案1或答案2)并解释原因。然后,根据这三个类别选择总体赢家。 1497 | 1498 | 这是问题: 1499 | {query} 1500 | 1501 | 这是两个答案: 1502 | 1503 | **答案1:** 1504 | {answer1} 1505 | 1506 | **答案2:** 1507 | {answer2} 1508 | 1509 | 使用上述三个标准评估两个答案,并为每个标准提供详细解释。 1510 | 1511 | 以下列JSON格式输出您的评估: 1512 | 1513 | {{ 1514 | "全面性": {{ 1515 | "获胜者": "[答案1或答案2]", 1516 | "解释": "[在此提供解释]" 1517 | }}, 1518 | "赋能性": {{ 1519 | "获胜者": "[答案1或答案2]", 1520 | "解释": "[在此提供解释]" 1521 | }}, 1522 | "总体获胜者": {{ 1523 | "获胜者": "[答案1或答案2]", 1524 | "解释": "[根据三个标准总结为什么这个答案是总体获胜者]" 1525 | }} 1526 | }} 1527 | ``` 1528 | 1529 |
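`example/batch_eval.py` 的基本思路,是把上面的评估提示填充后交给LLM并解析返回的JSON。下面是一个简化的示意(假设 EVAL_PROMPT 保存了上面的提示模板,实际脚本的实现细节可能不同):

```python
import json
from lightrag.llm.openai import gpt_4o_mini_complete

async def evaluate_pair(query: str, answer1: str, answer2: str) -> dict:
    # 将问题与两个答案填入提示模板({{ }} 中的JSON花括号已转义,可直接使用str.format)
    prompt = EVAL_PROMPT.format(query=query, answer1=answer1, answer2=answer2)
    result = await gpt_4o_mini_complete(prompt)
    # LLM按要求返回JSON字符串,这里直接解析
    return json.loads(result)
```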
1530 | 1531 | ### 总体性能表 1532 | 1533 | | |**农业**| |**计算机科学**| |**法律**| |**混合**| | 1534 | |----------------------|---------------|------------|------|------------|---------|------------|-------|------------| 1535 | | |NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**|NaiveRAG|**LightRAG**| 1536 | |**全面性**|32.4%|**67.6%**|38.4%|**61.6%**|16.4%|**83.6%**|38.8%|**61.2%**| 1537 | |**多样性**|23.6%|**76.4%**|38.0%|**62.0%**|13.6%|**86.4%**|32.4%|**67.6%**| 1538 | |**赋能性**|32.4%|**67.6%**|38.8%|**61.2%**|16.4%|**83.6%**|42.8%|**57.2%**| 1539 | |**总体**|32.4%|**67.6%**|38.8%|**61.2%**|15.2%|**84.8%**|40.0%|**60.0%**| 1540 | | |RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**|RQ-RAG|**LightRAG**| 1541 | |**全面性**|31.6%|**68.4%**|38.8%|**61.2%**|15.2%|**84.8%**|39.2%|**60.8%**| 1542 | |**多样性**|29.2%|**70.8%**|39.2%|**60.8%**|11.6%|**88.4%**|30.8%|**69.2%**| 1543 | |**赋能性**|31.6%|**68.4%**|36.4%|**63.6%**|15.2%|**84.8%**|42.4%|**57.6%**| 1544 | |**总体**|32.4%|**67.6%**|38.0%|**62.0%**|14.4%|**85.6%**|40.0%|**60.0%**| 1545 | | |HyDE|**LightRAG**|HyDE|**LightRAG**|HyDE|**LightRAG**|HyDE|**LightRAG**| 1546 | |**全面性**|26.0%|**74.0%**|41.6%|**58.4%**|26.8%|**73.2%**|40.4%|**59.6%**| 1547 | |**多样性**|24.0%|**76.0%**|38.8%|**61.2%**|20.0%|**80.0%**|32.4%|**67.6%**| 1548 | |**赋能性**|25.2%|**74.8%**|40.8%|**59.2%**|26.0%|**74.0%**|46.0%|**54.0%**| 1549 | |**总体**|24.8%|**75.2%**|41.6%|**58.4%**|26.4%|**73.6%**|42.4%|**57.6%**| 1550 | | |GraphRAG|**LightRAG**|GraphRAG|**LightRAG**|GraphRAG|**LightRAG**|GraphRAG|**LightRAG**| 1551 | |**全面性**|45.6%|**54.4%**|48.4%|**51.6%**|48.4%|**51.6%**|**50.4%**|49.6%| 1552 | |**多样性**|22.8%|**77.2%**|40.8%|**59.2%**|26.4%|**73.6%**|36.0%|**64.0%**| 1553 | |**赋能性**|41.2%|**58.8%**|45.2%|**54.8%**|43.6%|**56.4%**|**50.8%**|49.2%| 1554 | |**总体**|45.2%|**54.8%**|48.0%|**52.0%**|47.2%|**52.8%**|**50.4%**|49.6%| 1555 | 1556 | ## 复现 1557 | 1558 | 所有代码都可以在`./reproduce`目录中找到。 1559 | 1560 | ### 步骤0 提取唯一上下文 1561 | 1562 | 首先,我们需要提取数据集中的唯一上下文。 1563 | 1564 |
1565 | 代码 1566 | 1567 | ```python 1568 | def extract_unique_contexts(input_directory, output_directory): 1569 | 1570 | os.makedirs(output_directory, exist_ok=True) 1571 | 1572 | jsonl_files = glob.glob(os.path.join(input_directory, '*.jsonl')) 1573 | print(f"找到{len(jsonl_files)}个JSONL文件。") 1574 | 1575 | for file_path in jsonl_files: 1576 | filename = os.path.basename(file_path) 1577 | name, ext = os.path.splitext(filename) 1578 | output_filename = f"{name}_unique_contexts.json" 1579 | output_path = os.path.join(output_directory, output_filename) 1580 | 1581 | unique_contexts_dict = {} 1582 | 1583 | print(f"处理文件:{filename}") 1584 | 1585 | try: 1586 | with open(file_path, 'r', encoding='utf-8') as infile: 1587 | for line_number, line in enumerate(infile, start=1): 1588 | line = line.strip() 1589 | if not line: 1590 | continue 1591 | try: 1592 | json_obj = json.loads(line) 1593 | context = json_obj.get('context') 1594 | if context and context not in unique_contexts_dict: 1595 | unique_contexts_dict[context] = None 1596 | except json.JSONDecodeError as e: 1597 | print(f"文件{filename}第{line_number}行JSON解码错误:{e}") 1598 | except FileNotFoundError: 1599 | print(f"未找到文件:{filename}") 1600 | continue 1601 | except Exception as e: 1602 | print(f"处理文件{filename}时发生错误:{e}") 1603 | continue 1604 | 1605 | unique_contexts_list = list(unique_contexts_dict.keys()) 1606 | print(f"文件{filename}中有{len(unique_contexts_list)}个唯一的`context`条目。") 1607 | 1608 | try: 1609 | with open(output_path, 'w', encoding='utf-8') as outfile: 1610 | json.dump(unique_contexts_list, outfile, ensure_ascii=False, indent=4) 1611 | print(f"唯一的`context`条目已保存到:{output_filename}") 1612 | except Exception as e: 1613 | print(f"保存到文件{output_filename}时发生错误:{e}") 1614 | 1615 | print("所有文件已处理完成。") 1616 | 1617 | ``` 1618 | 1619 |
1620 | 1621 | ### 步骤1 插入上下文 1622 | 1623 | 对于提取的上下文,我们将它们插入到LightRAG系统中。 1624 | 1625 |
1626 | 代码 1627 | 1628 | ```python 1629 | def insert_text(rag, file_path): 1630 | with open(file_path, mode='r') as f: 1631 | unique_contexts = json.load(f) 1632 | 1633 | retries = 0 1634 | max_retries = 3 1635 | while retries < max_retries: 1636 | try: 1637 | rag.insert(unique_contexts) 1638 | break 1639 | except Exception as e: 1640 | retries += 1 1641 | print(f"插入失败,重试({retries}/{max_retries}),错误:{e}") 1642 | time.sleep(10) 1643 | if retries == max_retries: 1644 | print("超过最大重试次数后插入失败") 1645 | ``` 1646 | 1647 |
1648 | 1649 | ### 步骤2 生成查询 1650 | 1651 | 我们从数据集中每个上下文的前半部分和后半部分提取令牌,然后将它们组合为数据集描述以生成查询。 1652 | 1653 |
代码

```python
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

def get_summary(context, tot_tokens=2000):
    tokens = tokenizer.tokenize(context)
    half_tokens = tot_tokens // 2

    # 跳过开头1000个token后取前半部分,并对称地取结尾前1000个token之前的后半部分
    start_tokens = tokens[1000:1000 + half_tokens]
    end_tokens = tokens[-(1000 + half_tokens):-1000]

    summary_tokens = start_tokens + end_tokens
    summary = tokenizer.convert_tokens_to_string(summary_tokens)

    return summary
```

</details>
1673 | 1674 | ### 步骤3 查询 1675 | 1676 | 对于步骤2中生成的查询,我们将提取它们并查询LightRAG。 1677 | 1678 |
1679 | 代码 1680 | 1681 | ```python 1682 | def extract_queries(file_path): 1683 | with open(file_path, 'r') as f: 1684 | data = f.read() 1685 | 1686 | data = data.replace('**', '') 1687 | 1688 | queries = re.findall(r'- Question \d+: (.+)', data) 1689 | 1690 | return queries 1691 | ``` 1692 | 1693 |
1694 | 1695 | ## Star历史 1696 | 1697 | 1698 | 1699 | 1700 | 1701 | Star History Chart 1702 | 1703 | 1704 | 1705 | ## 贡献 1706 | 1707 | 感谢所有贡献者! 1708 | 1709 | 1710 | 1711 | 1712 | 1713 | ## 🌟引用 1714 | 1715 | ```python 1716 | @article{guo2024lightrag, 1717 | title={LightRAG: Simple and Fast Retrieval-Augmented Generation}, 1718 | author={Zirui Guo and Lianghao Xia and Yanhua Yu and Tu Ao and Chao Huang}, 1719 | year={2024}, 1720 | eprint={2410.05779}, 1721 | archivePrefix={arXiv}, 1722 | primaryClass={cs.IR} 1723 | } 1724 | ``` 1725 | 1726 | **感谢您对我们工作的关注!** 1727 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | # Reporting Security Issues 2 | 3 | The LightRAG team and community take security bugs seriously. We appreciate your efforts to responsibly disclose your findings, and will make every effort to acknowledge your contributions. 4 | 5 | To report a security issue, please use the GitHub Security Advisory: [Report a Vulnerability](https://github.com/HKUDS/LightRAG/security/advisories/new) 6 | 7 | The LightRAG team will send a response indicating the next steps in handling your report. After the initial reply to your report, the security team will keep you informed of the progress towards a fix and full announcement, and may ask for additional information or guidance. 8 | 9 | Report security bugs in third-party modules to the person or team maintaining the module. 10 | 11 | ### Supported Versions 12 | 13 | The following versions currently being supported with security updates. 14 | 15 | | Version | Supported | 16 | | ------- | ------------------ | 17 | | 1.2.x | :x: | 18 | | 1.3.x | :white_check_mark: | 19 | -------------------------------------------------------------------------------- /docs/Algorithm.md: -------------------------------------------------------------------------------- 1 | ![LightRAG Indexing Flowchart](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-VectorDB-Json-KV-Store-Indexing-Flowchart-scaled.jpg) 2 | *Figure 1: LightRAG Indexing Flowchart - Img Caption : [Source](https://learnopencv.com/lightrag/)* 3 | ![LightRAG Retrieval and Querying Flowchart](https://learnopencv.com/wp-content/uploads/2024/11/LightRAG-Querying-Flowchart-Dual-Level-Retrieval-Generation-Knowledge-Graphs-scaled.jpg) 4 | *Figure 2: LightRAG Retrieval and Querying Flowchart - Img Caption : [Source](https://learnopencv.com/lightrag/)* 5 | -------------------------------------------------------------------------------- /docs/DockerDeployment.md: -------------------------------------------------------------------------------- 1 | # LightRAG 2 | 3 | A lightweight Knowledge Graph Retrieval-Augmented Generation system with multiple LLM backend support. 4 | 5 | ## 🚀 Installation 6 | 7 | ### Prerequisites 8 | - Python 3.10+ 9 | - Git 10 | - Docker (optional for Docker deployment) 11 | 12 | ### Native Installation 13 | 14 | 1. Clone the repository: 15 | ```bash 16 | # Linux/MacOS 17 | git clone https://github.com/HKUDS/LightRAG.git 18 | cd LightRAG 19 | ``` 20 | ```powershell 21 | # Windows PowerShell 22 | git clone https://github.com/HKUDS/LightRAG.git 23 | cd LightRAG 24 | ``` 25 | 26 | 2. 
Configure your environment: 27 | ```bash 28 | # Linux/MacOS 29 | cp .env.example .env 30 | # Edit .env with your preferred configuration 31 | ``` 32 | ```powershell 33 | # Windows PowerShell 34 | Copy-Item .env.example .env 35 | # Edit .env with your preferred configuration 36 | ``` 37 | 38 | 3. Create and activate virtual environment: 39 | ```bash 40 | # Linux/MacOS 41 | python -m venv venv 42 | source venv/bin/activate 43 | ``` 44 | ```powershell 45 | # Windows PowerShell 46 | python -m venv venv 47 | .\venv\Scripts\Activate 48 | ``` 49 | 50 | 4. Install dependencies: 51 | ```bash 52 | # Both platforms 53 | pip install -r requirements.txt 54 | ``` 55 | 56 | ## 🐳 Docker Deployment 57 | 58 | Docker instructions work the same on all platforms with Docker Desktop installed. 59 | 60 | 1. Build and start the container: 61 | ```bash 62 | docker-compose up -d 63 | ``` 64 | 65 | ### Configuration Options 66 | 67 | LightRAG can be configured using environment variables in the `.env` file: 68 | 69 | #### Server Configuration 70 | - `HOST`: Server host (default: 0.0.0.0) 71 | - `PORT`: Server port (default: 9621) 72 | 73 | #### LLM Configuration 74 | - `LLM_BINDING`: LLM backend to use (lollms/ollama/openai) 75 | - `LLM_BINDING_HOST`: LLM server host URL 76 | - `LLM_MODEL`: Model name to use 77 | 78 | #### Embedding Configuration 79 | - `EMBEDDING_BINDING`: Embedding backend (lollms/ollama/openai) 80 | - `EMBEDDING_BINDING_HOST`: Embedding server host URL 81 | - `EMBEDDING_MODEL`: Embedding model name 82 | 83 | #### RAG Configuration 84 | - `MAX_ASYNC`: Maximum async operations 85 | - `MAX_TOKENS`: Maximum token size 86 | - `EMBEDDING_DIM`: Embedding dimensions 87 | 88 | #### Security 89 | - `LIGHTRAG_API_KEY`: API key for authentication 90 | 91 | ### Data Storage Paths 92 | 93 | The system uses the following paths for data storage: 94 | ``` 95 | data/ 96 | ├── rag_storage/ # RAG data persistence 97 | └── inputs/ # Input documents 98 | ``` 99 | 100 | ### Example Deployments 101 | 102 | 1. Using with Ollama: 103 | ```env 104 | LLM_BINDING=ollama 105 | LLM_BINDING_HOST=http://host.docker.internal:11434 106 | LLM_MODEL=mistral 107 | EMBEDDING_BINDING=ollama 108 | EMBEDDING_BINDING_HOST=http://host.docker.internal:11434 109 | EMBEDDING_MODEL=bge-m3 110 | ``` 111 | 112 | you can't just use localhost from docker, that's why you need to use host.docker.internal which is defined in the docker compose file and should allow you to access the localhost services. 113 | 114 | 2. Using with OpenAI: 115 | ```env 116 | LLM_BINDING=openai 117 | LLM_MODEL=gpt-3.5-turbo 118 | EMBEDDING_BINDING=openai 119 | EMBEDDING_MODEL=text-embedding-ada-002 120 | OPENAI_API_KEY=your-api-key 121 | ``` 122 | 123 | ### API Usage 124 | 125 | Once deployed, you can interact with the API at `http://localhost:9621` 126 | 127 | Example query using PowerShell: 128 | ```powershell 129 | $headers = @{ 130 | "X-API-Key" = "your-api-key" 131 | "Content-Type" = "application/json" 132 | } 133 | $body = @{ 134 | query = "your question here" 135 | } | ConvertTo-Json 136 | 137 | Invoke-RestMethod -Uri "http://localhost:9621/query" -Method Post -Headers $headers -Body $body 138 | ``` 139 | 140 | Example query using curl: 141 | ```bash 142 | curl -X POST "http://localhost:9621/query" \ 143 | -H "X-API-Key: your-api-key" \ 144 | -H "Content-Type: application/json" \ 145 | -d '{"query": "your question here"}' 146 | ``` 147 | 148 | ## 🔒 Security 149 | 150 | Remember to: 151 | 1. Set a strong API key in production 152 | 2. 
Use SSL in production environments 153 | 3. Configure proper network security 154 | 155 | ## 📦 Updates 156 | 157 | To update the Docker container: 158 | ```bash 159 | docker-compose pull 160 | docker-compose up -d --build 161 | ``` 162 | 163 | To update native installation: 164 | ```bash 165 | # Linux/MacOS 166 | git pull 167 | source venv/bin/activate 168 | pip install -r requirements.txt 169 | ``` 170 | ```powershell 171 | # Windows PowerShell 172 | git pull 173 | .\venv\Scripts\Activate 174 | pip install -r requirements.txt 175 | ``` 176 | -------------------------------------------------------------------------------- /docs/LightRAG_concurrent_explain.md: -------------------------------------------------------------------------------- 1 | ## LightRAG Multi-Document Processing: Concurrent Control Strategy 2 | 3 | LightRAG employs a multi-layered concurrent control strategy when processing multiple documents. This article provides an in-depth analysis of the concurrent control mechanisms at document level, chunk level, and LLM request level, helping you understand why specific concurrent behaviors occur. 4 | 5 | ### 1. Document-Level Concurrent Control 6 | 7 | **Control Parameter**: `max_parallel_insert` 8 | 9 | This parameter controls the number of documents processed simultaneously. The purpose is to prevent excessive parallelism from overwhelming system resources, which could lead to extended processing times for individual files. Document-level concurrency is governed by the `max_parallel_insert` attribute within LightRAG, which defaults to 2 and is configurable via the `MAX_PARALLEL_INSERT` environment variable. `max_parallel_insert` is recommended to be set between 2 and 10, typically `llm_model_max_async/3`. Setting this value too high can increase the likelihood of naming conflicts among entities and relationships across different documents during the merge phase, thereby reducing its overall efficiency. 10 | 11 | ### 2. Chunk-Level Concurrent Control 12 | 13 | **Control Parameter**: `llm_model_max_async` 14 | 15 | This parameter controls the number of chunks processed simultaneously in the extraction stage within a document. The purpose is to prevent a high volume of concurrent requests from monopolizing LLM processing resources, which would impede the efficient parallel processing of multiple files. Chunk-Level Concurrent Control is governed by the `llm_model_max_async` attribute within LightRAG, which defaults to 4 and is configurable via the `MAX_ASYNC` environment variable. The purpose of this parameter is to fully leverage the LLM's concurrency capabilities when processing individual documents. 16 | 17 | In the `extract_entities` function, **each document independently creates** its own chunk semaphore. Since each document independently creates chunk semaphores, the theoretical chunk concurrency of the system is: 18 | $$ 19 | ChunkConcurrency = Max Parallel Insert × LLM Model Max Async 20 | $$ 21 | For example: 22 | - `max_parallel_insert = 2` (process 2 documents simultaneously) 23 | - `llm_model_max_async = 4` (maximum 4 chunk concurrency per document) 24 | - Theoretical chunk-level concurrent: 2 × 4 = 8 25 | 26 | ### 3. Graph-Level Concurrent Control 27 | 28 | **Control Parameter**: `llm_model_max_async * 2` 29 | 30 | This parameter controls the number of entities and relations processed simultaneously in the merging stage within a document. 
The purpose is to prevent a high volume of concurrent requests from monopolizing LLM processing resources, which would impede the efficient parallel processing of multiple files. Graph-level concurrency is governed by the `llm_model_max_async` attribute within LightRAG, which defaults to 4 and is configurable via the `MAX_ASYNC` environment variable. Graph-level parallelism control parameters are equally applicable to managing parallelism during the entity relationship reconstruction phase after document deletion. 31 | 32 | Given that the entity relationship merging phase doesn't necessitate LLM interaction for every operation, its parallelism is set at double the LLM's parallelism. This optimizes machine utilization while concurrently preventing excessive queuing resource contention for the LLM. 33 | 34 | ### 4. LLM-Level Concurrent Control 35 | 36 | **Control Parameter**: `llm_model_max_async` 37 | 38 | This parameter governs the **concurrent volume** of LLM requests dispatched by the entire LightRAG system, encompassing the document extraction stage, merging stage, and user query handling. 39 | 40 | LLM request prioritization is managed via a global priority queue, which **systematically prioritizes user queries** over merging-related requests, and merging-related requests over extraction-related requests. This strategic prioritization **minimizes user query latency**. 41 | 42 | LLM-level concurrency is governed by the `llm_model_max_async` attribute within LightRAG, which defaults to 4 and is configurable via the `MAX_ASYNC` environment variable. 43 | 44 | ### 5. Complete Concurrent Hierarchy Diagram 45 | 46 | ```mermaid 47 | graph TD 48 | classDef doc fill:#e6f3ff,stroke:#5b9bd5,stroke-width:2px; 49 | classDef chunk fill:#fbe5d6,stroke:#ed7d31,stroke-width:1px; 50 | classDef merge fill:#e2f0d9,stroke:#70ad47,stroke-width:2px; 51 | 52 | A["Multiple Documents
max_parallel_insert = 2"] --> A1 53 | A --> B1 54 | 55 | A1[DocA: split to n chunks] --> A_chunk; 56 | B1[DocB: split to m chunks] --> B_chunk; 57 | 58 | subgraph A_chunk[Extraction Stage] 59 | A_chunk_title[Entity Relation Extraction
llm_model_max_async = 4]; 60 | A_chunk_title --> A_chunk1[Chunk A1]:::chunk; 61 | A_chunk_title --> A_chunk2[Chunk A2]:::chunk; 62 | A_chunk_title --> A_chunk3[Chunk A3]:::chunk; 63 | A_chunk_title --> A_chunk4[Chunk A4]:::chunk; 64 | A_chunk1 & A_chunk2 & A_chunk3 & A_chunk4 --> A_chunk_done([Extraction Complete]); 65 | end 66 | 67 | subgraph B_chunk[Extraction Stage] 68 | B_chunk_title[Entity Relation Extraction
llm_model_max_async = 4]; 69 | B_chunk_title --> B_chunk1[Chunk B1]:::chunk; 70 | B_chunk_title --> B_chunk2[Chunk B2]:::chunk; 71 | B_chunk_title --> B_chunk3[Chunk B3]:::chunk; 72 | B_chunk_title --> B_chunk4[Chunk B4]:::chunk; 73 | B_chunk1 & B_chunk2 & B_chunk3 & B_chunk4 --> B_chunk_done([Extraction Complete]); 74 | end 75 | A_chunk -.->|LLM Request| LLM_Queue; 76 | 77 | A_chunk --> A_merge; 78 | B_chunk --> B_merge; 79 | 80 | subgraph A_merge[Merge Stage] 81 | A_merge_title[Entity Relation Merging
llm_model_max_async * 2 = 8]; 82 | A_merge_title --> A1_entity[Ent a1]:::merge; 83 | A_merge_title --> A2_entity[Ent a2]:::merge; 84 | A_merge_title --> A3_entity[Rel a3]:::merge; 85 | A_merge_title --> A4_entity[Rel a4]:::merge; 86 | A1_entity & A2_entity & A3_entity & A4_entity --> A_done([Merge Complete]) 87 | end 88 | 89 | subgraph B_merge[Merge Stage] 90 | B_merge_title[Entity Relation Merging
llm_model_max_async * 2 = 8]; 91 | B_merge_title --> B1_entity[Ent b1]:::merge; 92 | B_merge_title --> B2_entity[Ent b2]:::merge; 93 | B_merge_title --> B3_entity[Rel b3]:::merge; 94 | B_merge_title --> B4_entity[Rel b4]:::merge; 95 | B1_entity & B2_entity & B3_entity & B4_entity --> B_done([Merge Complete]) 96 | end 97 | 98 | A_merge -.->|LLM Request| LLM_Queue["LLM Request Prioritized Queue
llm_model_max_async = 4"]; 99 | B_merge -.->|LLM Request| LLM_Queue; 100 | B_chunk -.->|LLM Request| LLM_Queue; 101 | 102 | ``` 103 | 104 | > The extraction and merge stages share a global prioritized LLM queue, regulated by `llm_model_max_async`. While numerous entity and relation extraction and merging operations may be "actively processing", **only a limited number will concurrently execute LLM requests** the remainder will be queued and awaiting their turn. 105 | 106 | ### 6. Performance Optimization Recommendations 107 | 108 | * **Increase LLM Concurrent Setting based on the capabilities of your LLM server or API provider** 109 | 110 | During the file processing phase, the performance and concurrency capabilities of the LLM are critical bottlenecks. When deploying LLMs locally, the service's concurrency capacity must adequately account for the context length requirements of LightRAG. LightRAG recommends that LLMs support a minimum context length of 32KB; therefore, server concurrency should be calculated based on this benchmark. For API providers, LightRAG will retry requests up to three times if the client's request is rejected due to concurrent request limits. Backend logs can be used to determine if LLM retries are occurring, thereby indicating whether `MAX_ASYNC` has exceeded the API provider's limits. 111 | 112 | * **Align Parallel Document Insertion Settings with LLM Concurrency Configurations** 113 | 114 | The recommended number of parallel document processing tasks is 1/4 of the LLM's concurrency, with a minimum of 2 and a maximum of 10. Setting a higher number of parallel document processing tasks typically does not accelerate overall document processing speed, as even a small number of concurrently processed documents can fully utilize the LLM's parallel processing capabilities. Excessive parallel document processing can significantly increase the processing time for each individual document. Since LightRAG commits processing results on a file-by-file basis, a large number of concurrent files would necessitate caching a substantial amount of data. In the event of a system error, all documents in the middle stage would require reprocessing, thereby increasing error handling costs. For instance, setting `MAX_PARALLEL_INSERT` to 3 is appropriate when `MAX_ASYNC` is configured to 12. 115 | -------------------------------------------------------------------------------- /docs/rerank_integration.md: -------------------------------------------------------------------------------- 1 | # Rerank Integration Guide 2 | 3 | LightRAG supports reranking functionality to improve retrieval quality by re-ordering documents based on their relevance to the query. Reranking is now controlled per query via the `enable_rerank` parameter (default: True). 
4 | 5 | ## Quick Start 6 | 7 | ### Environment Variables 8 | 9 | Set these variables in your `.env` file or environment for rerank model configuration: 10 | 11 | ```bash 12 | # Rerank model configuration (required when enable_rerank=True in queries) 13 | RERANK_MODEL=BAAI/bge-reranker-v2-m3 14 | RERANK_BINDING_HOST=https://api.your-provider.com/v1/rerank 15 | RERANK_BINDING_API_KEY=your_api_key_here 16 | ``` 17 | 18 | ### Programmatic Configuration 19 | 20 | ```python 21 | from lightrag import LightRAG, QueryParam 22 | from lightrag.rerank import custom_rerank, RerankModel 23 | 24 | # Method 1: Using a custom rerank function with all settings included 25 | async def my_rerank_func(query: str, documents: list, top_n: int = None, **kwargs): 26 | return await custom_rerank( 27 | query=query, 28 | documents=documents, 29 | model="BAAI/bge-reranker-v2-m3", 30 | base_url="https://api.your-provider.com/v1/rerank", 31 | api_key="your_api_key_here", 32 | top_n=top_n or 10, # Handle top_n within the function 33 | **kwargs 34 | ) 35 | 36 | rag = LightRAG( 37 | working_dir="./rag_storage", 38 | llm_model_func=your_llm_func, 39 | embedding_func=your_embedding_func, 40 | rerank_model_func=my_rerank_func, # Configure rerank function 41 | ) 42 | 43 | # Query with rerank enabled (default) 44 | result = await rag.aquery( 45 | "your query", 46 | param=QueryParam(enable_rerank=True) # Control rerank per query 47 | ) 48 | 49 | # Query with rerank disabled 50 | result = await rag.aquery( 51 | "your query", 52 | param=QueryParam(enable_rerank=False) 53 | ) 54 | 55 | # Method 2: Using RerankModel wrapper 56 | rerank_model = RerankModel( 57 | rerank_func=custom_rerank, 58 | kwargs={ 59 | "model": "BAAI/bge-reranker-v2-m3", 60 | "base_url": "https://api.your-provider.com/v1/rerank", 61 | "api_key": "your_api_key_here", 62 | } 63 | ) 64 | 65 | rag = LightRAG( 66 | working_dir="./rag_storage", 67 | llm_model_func=your_llm_func, 68 | embedding_func=your_embedding_func, 69 | rerank_model_func=rerank_model.rerank, 70 | ) 71 | 72 | # Control rerank per query 73 | result = await rag.aquery( 74 | "your query", 75 | param=QueryParam( 76 | enable_rerank=True, # Enable rerank for this query 77 | chunk_top_k=5 # Number of chunks to keep after reranking 78 | ) 79 | ) 80 | ``` 81 | 82 | ## Supported Providers 83 | 84 | ### 1. Custom/Generic API (Recommended) 85 | 86 | For Jina/Cohere compatible APIs: 87 | 88 | ```python 89 | from lightrag.rerank import custom_rerank 90 | 91 | # Your custom API endpoint 92 | result = await custom_rerank( 93 | query="your query", 94 | documents=documents, 95 | model="BAAI/bge-reranker-v2-m3", 96 | base_url="https://api.your-provider.com/v1/rerank", 97 | api_key="your_api_key_here", 98 | top_n=10 99 | ) 100 | ``` 101 | 102 | ### 2. Jina AI 103 | 104 | ```python 105 | from lightrag.rerank import jina_rerank 106 | 107 | result = await jina_rerank( 108 | query="your query", 109 | documents=documents, 110 | model="BAAI/bge-reranker-v2-m3", 111 | api_key="your_jina_api_key", 112 | top_n=10 113 | ) 114 | ``` 115 | 116 | ### 3. Cohere 117 | 118 | ```python 119 | from lightrag.rerank import cohere_rerank 120 | 121 | result = await cohere_rerank( 122 | query="your query", 123 | documents=documents, 124 | model="rerank-english-v2.0", 125 | api_key="your_cohere_api_key", 126 | top_n=10 127 | ) 128 | ``` 129 | 130 | ## Integration Points 131 | 132 | Reranking is automatically applied at these key retrieval stages: 133 | 134 | 1. **Naive Mode**: After vector similarity search in `_get_vector_context` 135 | 2. 
**Local Mode**: After entity retrieval in `_get_node_data` 136 | 3. **Global Mode**: After relationship retrieval in `_get_edge_data` 137 | 4. **Hybrid/Mix Modes**: Applied to all relevant components 138 | 139 | ## Configuration Parameters 140 | 141 | | Parameter | Type | Default | Description | 142 | |-----------|------|---------|-------------| 143 | | `enable_rerank` | bool | False | Enable/disable reranking | 144 | | `rerank_model_func` | callable | None | Custom rerank function containing all configurations (model, API keys, top_n, etc.) | 145 | 146 | ## Example Usage 147 | 148 | ### Basic Usage 149 | 150 | ```python 151 | import asyncio 152 | from lightrag import LightRAG, QueryParam 153 | from lightrag.llm.openai import gpt_4o_mini_complete, openai_embedding 154 | from lightrag.kg.shared_storage import initialize_pipeline_status 155 | from lightrag.rerank import jina_rerank 156 | 157 | async def my_rerank_func(query: str, documents: list, top_n: int = None, **kwargs): 158 | """Custom rerank function with all settings included""" 159 | return await jina_rerank( 160 | query=query, 161 | documents=documents, 162 | model="BAAI/bge-reranker-v2-m3", 163 | api_key="your_jina_api_key_here", 164 | top_n=top_n or 10, # Default top_n if not provided 165 | **kwargs 166 | ) 167 | 168 | async def main(): 169 | # Initialize with rerank enabled 170 | rag = LightRAG( 171 | working_dir="./rag_storage", 172 | llm_model_func=gpt_4o_mini_complete, 173 | embedding_func=openai_embedding, 174 | rerank_model_func=my_rerank_func, 175 | ) 176 | 177 | await rag.initialize_storages() 178 | await initialize_pipeline_status() 179 | 180 | # Insert documents 181 | await rag.ainsert([ 182 | "Document 1 content...", 183 | "Document 2 content...", 184 | ]) 185 | 186 | # Query with rerank (automatically applied) 187 | result = await rag.aquery( 188 | "Your question here", 189 | param=QueryParam(enable_rerank=True) # This top_n is passed to rerank function 190 | ) 191 | 192 | print(result) 193 | 194 | asyncio.run(main()) 195 | ``` 196 | 197 | ### Direct Rerank Usage 198 | 199 | ```python 200 | from lightrag.rerank import custom_rerank 201 | 202 | async def test_rerank(): 203 | documents = [ 204 | {"content": "Text about topic A"}, 205 | {"content": "Text about topic B"}, 206 | {"content": "Text about topic C"}, 207 | ] 208 | 209 | reranked = await custom_rerank( 210 | query="Tell me about topic A", 211 | documents=documents, 212 | model="BAAI/bge-reranker-v2-m3", 213 | base_url="https://api.your-provider.com/v1/rerank", 214 | api_key="your_api_key_here", 215 | top_n=2 216 | ) 217 | 218 | for doc in reranked: 219 | print(f"Score: {doc.get('rerank_score')}, Content: {doc.get('content')}") 220 | ``` 221 | 222 | ## Best Practices 223 | 224 | 1. **Self-Contained Functions**: Include all necessary configurations (API keys, models, top_n handling) within your rerank function 225 | 2. **Performance**: Use reranking selectively for better performance vs. quality tradeoff 226 | 3. **API Limits**: Monitor API usage and implement rate limiting within your rerank function 227 | 4. **Fallback**: Always handle rerank failures gracefully (returns original results) 228 | 5. **Top-n Handling**: Handle top_n parameter appropriately within your rerank function 229 | 6. **Cost Management**: Consider rerank API costs in your budget planning 230 | 231 | ## Troubleshooting 232 | 233 | ### Common Issues 234 | 235 | 1. **API Key Missing**: Ensure API keys are properly configured within your rerank function 236 | 2. 
**Network Issues**: Check API endpoints and network connectivity 237 | 3. **Model Errors**: Verify the rerank model name is supported by your API 238 | 4. **Document Format**: Ensure documents have `content` or `text` fields 239 | 240 | ### Debug Mode 241 | 242 | Enable debug logging to see rerank operations: 243 | 244 | ```python 245 | import logging 246 | logging.getLogger("lightrag.rerank").setLevel(logging.DEBUG) 247 | ``` 248 | 249 | ### Error Handling 250 | 251 | The rerank integration includes automatic fallback: 252 | 253 | ```python 254 | # If rerank fails, original documents are returned 255 | # No exceptions are raised to the user 256 | # Errors are logged for debugging 257 | ``` 258 | 259 | ## API Compatibility 260 | 261 | The generic rerank API expects this response format: 262 | 263 | ```json 264 | { 265 | "results": [ 266 | { 267 | "index": 0, 268 | "relevance_score": 0.95 269 | }, 270 | { 271 | "index": 2, 272 | "relevance_score": 0.87 273 | } 274 | ] 275 | } 276 | ``` 277 | 278 | This is compatible with: 279 | - Jina AI Rerank API 280 | - Cohere Rerank API 281 | - Custom APIs following the same format 282 | -------------------------------------------------------------------------------- /k8s-deploy/README-zh.md: -------------------------------------------------------------------------------- 1 | # LightRAG Helm Chart 2 | 3 | 这是用于在Kubernetes集群上部署LightRAG服务的Helm chart。 4 | 5 | LightRAG有两种推荐的部署方法: 6 | 1. **轻量级部署**:使用内置轻量级存储,适合测试和小规模使用 7 | 2. **生产环境部署**:使用外部数据库(如PostgreSQL和Neo4J),适合生产环境和大规模使用 8 | 9 | > 如果您想要部署过程的视频演示,可以查看[bilibili](https://www.bilibili.com/video/BV1bUJazBEq2/)上的视频教程,对于喜欢视觉指导的用户可能会有所帮助。 10 | 11 | ## 前提条件 12 | 13 | 确保安装和配置了以下工具: 14 | 15 | * **Kubernetes集群** 16 | * 需要一个运行中的Kubernetes集群。 17 | * 对于本地开发或演示,可以使用[Minikube](https://minikube.sigs.k8s.io/docs/start/)(需要≥2个CPU,≥4GB内存,以及Docker/VM驱动支持)。 18 | * 任何标准的云端或本地Kubernetes集群(EKS、GKE、AKS等)也可以使用。 19 | 20 | * **kubectl** 21 | * Kubernetes命令行工具,用于管理集群。 22 | * 按照官方指南安装:[安装和设置kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl)。 23 | 24 | * **Helm**(v3.x+) 25 | * Kubernetes包管理器,用于安装LightRAG。 26 | * 通过官方指南安装:[安装Helm](https://helm.sh/docs/intro/install/)。 27 | 28 | ## 轻量级部署(无需外部数据库) 29 | 30 | 这种部署选项使用内置的轻量级存储组件,非常适合测试、演示或小规模使用场景。无需外部数据库配置。 31 | 32 | 您可以使用提供的便捷脚本或直接使用Helm命令部署LightRAG。两种方法都配置了`lightrag/values.yaml`文件中定义的相同环境变量。 33 | 34 | ### 使用便捷脚本(推荐): 35 | 36 | ```bash 37 | export OPENAI_API_BASE=<您的OPENAI_API_BASE> 38 | export OPENAI_API_KEY=<您的OPENAI_API_KEY> 39 | bash ./install_lightrag_dev.sh 40 | ``` 41 | 42 | ### 或直接使用Helm: 43 | 44 | ```bash 45 | # 您可以覆盖任何想要的环境参数 46 | helm upgrade --install lightrag ./lightrag \ 47 | --namespace rag \ 48 | --set-string env.LIGHTRAG_KV_STORAGE=JsonKVStorage \ 49 | --set-string env.LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage \ 50 | --set-string env.LIGHTRAG_GRAPH_STORAGE=NetworkXStorage \ 51 | --set-string env.LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage \ 52 | --set-string env.LLM_BINDING=openai \ 53 | --set-string env.LLM_MODEL=gpt-4o-mini \ 54 | --set-string env.LLM_BINDING_HOST=$OPENAI_API_BASE \ 55 | --set-string env.LLM_BINDING_API_KEY=$OPENAI_API_KEY \ 56 | --set-string env.EMBEDDING_BINDING=openai \ 57 | --set-string env.EMBEDDING_MODEL=text-embedding-ada-002 \ 58 | --set-string env.EMBEDDING_DIM=1536 \ 59 | --set-string env.EMBEDDING_BINDING_API_KEY=$OPENAI_API_KEY 60 | ``` 61 | 62 | ### 访问应用程序: 63 | 64 | ```bash 65 | # 1. 在终端中运行此端口转发命令: 66 | kubectl --namespace rag port-forward svc/lightrag-dev 9621:9621 67 | 68 | # 2. 
当命令运行时,打开浏览器并导航到: 69 | # http://localhost:9621 70 | ``` 71 | 72 | ## 生产环境部署(使用外部数据库) 73 | 74 | ### 1. 安装数据库 75 | > 如果您已经准备好了数据库,可以跳过此步骤。详细信息可以在:[README.md](databases%2FREADME.md)中找到。 76 | 77 | 我们推荐使用KubeBlocks进行数据库部署。KubeBlocks是一个云原生数据库操作符,可以轻松地在Kubernetes上以生产规模运行任何数据库。 78 | 79 | 首先,安装KubeBlocks和KubeBlocks-Addons(如已安装可跳过): 80 | ```bash 81 | bash ./databases/01-prepare.sh 82 | ``` 83 | 84 | 然后安装所需的数据库。默认情况下,这将安装PostgreSQL和Neo4J,但您可以修改[00-config.sh](databases%2F00-config.sh)以根据需要选择不同的数据库: 85 | ```bash 86 | bash ./databases/02-install-database.sh 87 | ``` 88 | 89 | 验证集群是否正在运行: 90 | ```bash 91 | kubectl get clusters -n rag 92 | # 预期输出: 93 | # NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE 94 | # neo4j-cluster Delete Running 39s 95 | # pg-cluster postgresql Delete Running 42s 96 | 97 | kubectl get po -n rag 98 | # 预期输出: 99 | # NAME READY STATUS RESTARTS AGE 100 | # neo4j-cluster-neo4j-0 1/1 Running 0 58s 101 | # pg-cluster-postgresql-0 4/4 Running 0 59s 102 | # pg-cluster-postgresql-1 4/4 Running 0 59s 103 | ``` 104 | 105 | ### 2. 安装LightRAG 106 | 107 | LightRAG及其数据库部署在同一Kubernetes集群中,使配置变得简单。 108 | 安装脚本会自动从KubeBlocks获取所有数据库连接信息,无需手动设置数据库凭证: 109 | 110 | ```bash 111 | export OPENAI_API_BASE=<您的OPENAI_API_BASE> 112 | export OPENAI_API_KEY=<您的OPENAI_API_KEY> 113 | bash ./install_lightrag.sh 114 | ``` 115 | 116 | ### 访问应用程序: 117 | 118 | ```bash 119 | # 1. 在终端中运行此端口转发命令: 120 | kubectl --namespace rag port-forward svc/lightrag 9621:9621 121 | 122 | # 2. 当命令运行时,打开浏览器并导航到: 123 | # http://localhost:9621 124 | ``` 125 | 126 | ## 配置 127 | 128 | ### 修改资源配置 129 | 130 | 您可以通过修改`values.yaml`文件来配置LightRAG的资源使用: 131 | 132 | ```yaml 133 | replicaCount: 1 # 副本数量,可根据需要增加 134 | 135 | resources: 136 | limits: 137 | cpu: 1000m # CPU限制,可根据需要调整 138 | memory: 2Gi # 内存限制,可根据需要调整 139 | requests: 140 | cpu: 500m # CPU请求,可根据需要调整 141 | memory: 1Gi # 内存请求,可根据需要调整 142 | ``` 143 | 144 | ### 修改持久存储 145 | 146 | ```yaml 147 | persistence: 148 | enabled: true 149 | ragStorage: 150 | size: 10Gi # RAG存储大小,可根据需要调整 151 | inputs: 152 | size: 5Gi # 输入数据存储大小,可根据需要调整 153 | ``` 154 | 155 | ### 配置环境变量 156 | 157 | `values.yaml`文件中的`env`部分包含LightRAG的所有环境配置,类似于`.env`文件。当使用helm upgrade或helm install命令时,可以使用--set标志覆盖这些变量。 158 | 159 | ```yaml 160 | env: 161 | HOST: 0.0.0.0 162 | PORT: 9621 163 | WEBUI_TITLE: Graph RAG Engine 164 | WEBUI_DESCRIPTION: Simple and Fast Graph Based RAG System 165 | 166 | # LLM配置 167 | LLM_BINDING: openai # LLM服务提供商 168 | LLM_MODEL: gpt-4o-mini # LLM模型 169 | LLM_BINDING_HOST: # API基础URL(可选) 170 | LLM_BINDING_API_KEY: # API密钥 171 | 172 | # 嵌入配置 173 | EMBEDDING_BINDING: openai # 嵌入服务提供商 174 | EMBEDDING_MODEL: text-embedding-ada-002 # 嵌入模型 175 | EMBEDDING_DIM: 1536 # 嵌入维度 176 | EMBEDDING_BINDING_API_KEY: # API密钥 177 | 178 | # 存储配置 179 | LIGHTRAG_KV_STORAGE: PGKVStorage # 键值存储类型 180 | LIGHTRAG_VECTOR_STORAGE: PGVectorStorage # 向量存储类型 181 | LIGHTRAG_GRAPH_STORAGE: Neo4JStorage # 图存储类型 182 | LIGHTRAG_DOC_STATUS_STORAGE: PGDocStatusStorage # 文档状态存储类型 183 | ``` 184 | 185 | ## 注意事项 186 | 187 | - 在部署前确保设置了所有必要的环境变量(API密钥和数据库密码) 188 | - 出于安全原因,建议使用环境变量传递敏感信息,而不是直接写入脚本或values文件 189 | - 轻量级部署适合测试和小规模使用,但数据持久性和性能可能有限 190 | - 生产环境部署(PostgreSQL + Neo4J)推荐用于生产环境和大规模使用 191 | - 有关更多自定义配置,请参考LightRAG官方文档 192 | -------------------------------------------------------------------------------- /k8s-deploy/README.md: -------------------------------------------------------------------------------- 1 | # LightRAG Helm Chart 2 | 3 | This is the Helm chart for LightRAG, used to deploy LightRAG services on a Kubernetes cluster. 
4 | 5 | There are two recommended deployment methods for LightRAG: 6 | 1. **Lightweight Deployment**: Using built-in lightweight storage, suitable for testing and small-scale usage 7 | 2. **Production Deployment**: Using external databases (such as PostgreSQL and Neo4J), suitable for production environments and large-scale usage 8 | 9 | > If you'd like a video walkthrough of the deployment process, feel free to check out this optional [video tutorial](https://youtu.be/JW1z7fzeKTw?si=vPzukqqwmdzq9Q4q) on YouTube. It might help clarify some steps for those who prefer visual guidance. 10 | 11 | ## Prerequisites 12 | 13 | Make sure the following tools are installed and configured: 14 | 15 | * **Kubernetes cluster** 16 | * A running Kubernetes cluster is required. 17 | * For local development or demos you can use [Minikube](https://minikube.sigs.k8s.io/docs/start/) (needs ≥ 2 CPUs, ≥ 4 GB RAM, and Docker/VM-driver support). 18 | * Any standard cloud or on-premises Kubernetes cluster (EKS, GKE, AKS, etc.) also works. 19 | 20 | * **kubectl** 21 | * The Kubernetes command-line tool for managing your cluster. 22 | * Follow the official guide: [Install and Set Up kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl). 23 | 24 | * **Helm** (v3.x+) 25 | * Kubernetes package manager used to install LightRAG. 26 | * Install it via the official instructions: [Installing Helm](https://helm.sh/docs/intro/install/). 27 | 28 | ## Lightweight Deployment (No External Databases Required) 29 | 30 | This deployment option uses built-in lightweight storage components that are perfect for testing, demos, or small-scale usage scenarios. No external database configuration is required. 31 | 32 | You can deploy LightRAG using either the provided convenience script or direct Helm commands. Both methods configure the same environment variables defined in the `lightrag/values.yaml` file. 33 | 34 | ### Using the convenience script (recommended): 35 | 36 | ```bash 37 | export OPENAI_API_BASE= 38 | export OPENAI_API_KEY= 39 | bash ./install_lightrag_dev.sh 40 | ``` 41 | 42 | ### Or using Helm directly: 43 | 44 | ```bash 45 | # You can override any env param you want 46 | helm upgrade --install lightrag ./lightrag \ 47 | --namespace rag \ 48 | --set-string env.LIGHTRAG_KV_STORAGE=JsonKVStorage \ 49 | --set-string env.LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage \ 50 | --set-string env.LIGHTRAG_GRAPH_STORAGE=NetworkXStorage \ 51 | --set-string env.LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage \ 52 | --set-string env.LLM_BINDING=openai \ 53 | --set-string env.LLM_MODEL=gpt-4o-mini \ 54 | --set-string env.LLM_BINDING_HOST=$OPENAI_API_BASE \ 55 | --set-string env.LLM_BINDING_API_KEY=$OPENAI_API_KEY \ 56 | --set-string env.EMBEDDING_BINDING=openai \ 57 | --set-string env.EMBEDDING_MODEL=text-embedding-ada-002 \ 58 | --set-string env.EMBEDDING_DIM=1536 \ 59 | --set-string env.EMBEDDING_BINDING_API_KEY=$OPENAI_API_KEY 60 | ``` 61 | 62 | ### Accessing the application: 63 | 64 | ```bash 65 | # 1. Run this port-forward command in your terminal: 66 | kubectl --namespace rag port-forward svc/lightrag-dev 9621:9621 67 | 68 | # 2. While the command is running, open your browser and navigate to: 69 | # http://localhost:9621 70 | ``` 71 | 72 | ## Production Deployment (Using External Databases) 73 | 74 | ### 1. Install Databases 75 | > You can skip this step if you've already prepared databases. Detailed information can be found in: [README.md](databases%2FREADME.md). 76 | 77 | We recommend KubeBlocks for database deployment. 
KubeBlocks is a cloud-native database operator that makes it easy to run any database on Kubernetes at production scale. 78 | 79 | First, install KubeBlocks and KubeBlocks-Addons (skip if already installed): 80 | ```bash 81 | bash ./databases/01-prepare.sh 82 | ``` 83 | 84 | Then install the required databases. By default, this will install PostgreSQL and Neo4J, but you can modify [00-config.sh](databases%2F00-config.sh) to select different databases based on your needs: 85 | ```bash 86 | bash ./databases/02-install-database.sh 87 | ``` 88 | 89 | Verify that the clusters are up and running: 90 | ```bash 91 | kubectl get clusters -n rag 92 | # Expected output: 93 | # NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE 94 | # neo4j-cluster Delete Running 39s 95 | # pg-cluster postgresql Delete Running 42s 96 | 97 | kubectl get po -n rag 98 | # Expected output: 99 | # NAME READY STATUS RESTARTS AGE 100 | # neo4j-cluster-neo4j-0 1/1 Running 0 58s 101 | # pg-cluster-postgresql-0 4/4 Running 0 59s 102 | # pg-cluster-postgresql-1 4/4 Running 0 59s 103 | ``` 104 | 105 | ### 2. Install LightRAG 106 | 107 | LightRAG and its databases are deployed within the same Kubernetes cluster, making configuration straightforward. 108 | The installation script automatically retrieves all database connection information from KubeBlocks, eliminating the need to manually set database credentials: 109 | 110 | ```bash 111 | export OPENAI_API_BASE= 112 | export OPENAI_API_KEY= 113 | bash ./install_lightrag.sh 114 | ``` 115 | 116 | ### Accessing the application: 117 | 118 | ```bash 119 | # 1. Run this port-forward command in your terminal: 120 | kubectl --namespace rag port-forward svc/lightrag 9621:9621 121 | 122 | # 2. While the command is running, open your browser and navigate to: 123 | # http://localhost:9621 124 | ``` 125 | 126 | ## Configuration 127 | 128 | ### Modifying Resource Configuration 129 | 130 | You can configure LightRAG's resource usage by modifying the `values.yaml` file: 131 | 132 | ```yaml 133 | replicaCount: 1 # Number of replicas, can be increased as needed 134 | 135 | resources: 136 | limits: 137 | cpu: 1000m # CPU limit, can be adjusted as needed 138 | memory: 2Gi # Memory limit, can be adjusted as needed 139 | requests: 140 | cpu: 500m # CPU request, can be adjusted as needed 141 | memory: 1Gi # Memory request, can be adjusted as needed 142 | ``` 143 | 144 | ### Modifying Persistent Storage 145 | 146 | ```yaml 147 | persistence: 148 | enabled: true 149 | ragStorage: 150 | size: 10Gi # RAG storage size, can be adjusted as needed 151 | inputs: 152 | size: 5Gi # Input data storage size, can be adjusted as needed 153 | ``` 154 | 155 | ### Configuring Environment Variables 156 | 157 | The `env` section in the `values.yaml` file contains all environment configurations for LightRAG, similar to a `.env` file. When using helm upgrade or helm install commands, you can override these with the --set flag. 
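For example, a single setting can be overridden at upgrade time without editing `values.yaml` (release name, namespace, and model value below are illustrative; the full set of defaults is listed next):

```bash
# Override one environment variable on an existing release;
# --reuse-values keeps the values from the previous release and applies only this override
helm upgrade --install lightrag ./lightrag \
  --namespace rag \
  --reuse-values \
  --set-string env.LLM_MODEL=gpt-4o
```
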
158 | 159 | ```yaml 160 | env: 161 | HOST: 0.0.0.0 162 | PORT: 9621 163 | WEBUI_TITLE: Graph RAG Engine 164 | WEBUI_DESCRIPTION: Simple and Fast Graph Based RAG System 165 | 166 | # LLM Configuration 167 | LLM_BINDING: openai # LLM service provider 168 | LLM_MODEL: gpt-4o-mini # LLM model 169 | LLM_BINDING_HOST: # API base URL (optional) 170 | LLM_BINDING_API_KEY: # API key 171 | 172 | # Embedding Configuration 173 | EMBEDDING_BINDING: openai # Embedding service provider 174 | EMBEDDING_MODEL: text-embedding-ada-002 # Embedding model 175 | EMBEDDING_DIM: 1536 # Embedding dimension 176 | EMBEDDING_BINDING_API_KEY: # API key 177 | 178 | # Storage Configuration 179 | LIGHTRAG_KV_STORAGE: PGKVStorage # Key-value storage type 180 | LIGHTRAG_VECTOR_STORAGE: PGVectorStorage # Vector storage type 181 | LIGHTRAG_GRAPH_STORAGE: Neo4JStorage # Graph storage type 182 | LIGHTRAG_DOC_STATUS_STORAGE: PGDocStatusStorage # Document status storage type 183 | ``` 184 | 185 | ## Notes 186 | 187 | - Ensure all necessary environment variables (API keys and database passwords) are set before deployment 188 | - For security reasons, it's recommended to pass sensitive information using environment variables rather than writing them directly in scripts or values files 189 | - Lightweight deployment is suitable for testing and small-scale usage, but data persistence and performance may be limited 190 | - Production deployment (PostgreSQL + Neo4J) is recommended for production environments and large-scale usage 191 | - For more customized configurations, please refer to the official LightRAG documentation 192 | -------------------------------------------------------------------------------- /k8s-deploy/databases/README.md: -------------------------------------------------------------------------------- 1 | # Using KubeBlocks to Deploy and Manage Databases 2 | 3 | Learn how to quickly deploy and manage various databases in a Kubernetes (K8s) environment through KubeBlocks. 4 | 5 | ## Introduction to KubeBlocks 6 | 7 | KubeBlocks is a production-ready, open-source toolkit that runs any database--SQL, NoSQL, vector, or document--on Kubernetes. 8 | It scales smoothly from quick dev tests to full production clusters, making it a solid choice for RAG workloads like FastGPT that need several data stores working together. 9 | 10 | ## Prerequisites 11 | 12 | Make sure the following tools are installed and configured: 13 | 14 | * **Kubernetes cluster** 15 | * A running Kubernetes cluster is required. 16 | * For local development or demos you can use [Minikube](https://minikube.sigs.k8s.io/docs/start/) (needs ≥ 2 CPUs, ≥ 4 GB RAM, and Docker/VM-driver support). 17 | * Any standard cloud or on-premises Kubernetes cluster (EKS, GKE, AKS, etc.) also works. 18 | 19 | * **kubectl** 20 | * The Kubernetes command-line interface. 21 | * Follow the official guide: [Install and Set Up kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl). 22 | 23 | * **Helm** (v3.x+) 24 | * Kubernetes package manager used by the scripts below. 25 | * Install it via the official instructions: [Installing Helm](https://helm.sh/docs/intro/install/). 26 | 27 | ## Installing 28 | 29 | 1. **Configure the databases you want** 30 | Edit `00-config.sh` file. Based on your requirements, set the variable to `true` for the databases you want to install. 
31 | For example, to install PostgreSQL and Neo4j: 32 | 33 | ```bash 34 | ENABLE_POSTGRESQL=true 35 | ENABLE_REDIS=false 36 | ENABLE_ELASTICSEARCH=false 37 | ENABLE_QDRANT=false 38 | ENABLE_MONGODB=false 39 | ENABLE_NEO4J=true 40 | ``` 41 | 42 | 2. **Prepare the environment and install KubeBlocks add-ons** 43 | 44 | ```bash 45 | bash ./01-prepare.sh 46 | ``` 47 | 48 | *What the script does* 49 | `01-prepare.sh` performs basic pre-checks (Helm, kubectl, cluster reachability), adds the KubeBlocks Helm repo, and installs any core CRDs or controllers that KubeBlocks itself needs. It also installs the addons for every database you enabled in `00-config.sh`, but **does not** create the actual database clusters yet. 50 | 51 | 3. **(Optional) Modify database settings** 52 | Before deployment you can edit the `values.yaml` file inside each `/` directory to change `version`, `replicas`, `CPU`, `memory`, `storage size`, etc. 53 | 54 | 4. **Install the database clusters** 55 | 56 | ```bash 57 | bash ./02-install-database.sh 58 | ``` 59 | 60 | *What the script does* 61 | `02-install-database.sh` **actually deploys the chosen databases to Kubernetes**. 62 | 63 | When the script completes, confirm that the clusters are up. It may take a few minutes for all the clusters to become ready, 64 | especially if this is the first time running the script as Kubernetes needs to pull container images from registries. 65 | You can monitor the progress using the following commands: 66 | 67 | ```bash 68 | kubectl get clusters -n rag 69 | NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE 70 | es-cluster Delete Running 11m 71 | mongodb-cluster mongodb Delete Running 11m 72 | pg-cluster postgresql Delete Running 11m 73 | qdrant-cluster qdrant Delete Running 11m 74 | redis-cluster redis Delete Running 11m 75 | ``` 76 | 77 | You can see all the Database `Pods` created by KubeBlocks. 78 | Initially, you might see pods in `ContainerCreating` or `Pending` status - this is normal while images are being pulled and containers are starting up. 79 | Wait until all pods show `Running` status: 80 | 81 | ```bash 82 | kubectl get po -n rag 83 | NAME READY STATUS RESTARTS AGE 84 | es-cluster-mdit-0 2/2 Running 0 11m 85 | mongodb-cluster-mongodb-0 2/2 Running 0 11m 86 | pg-cluster-postgresql-0 4/4 Running 0 11m 87 | pg-cluster-postgresql-1 4/4 Running 0 11m 88 | qdrant-cluster-qdrant-0 2/2 Running 0 11m 89 | redis-cluster-redis-0 2/2 Running 0 11m 90 | ``` 91 | 92 | You can also check the detailed status of a specific pod if it's taking longer than expected: 93 | 94 | ```bash 95 | kubectl describe pod -n rag 96 | ``` 97 | 98 | ## Connect to Databases 99 | 100 | To connect to your databases, follow these steps to identify available accounts, retrieve credentials, and establish connections: 101 | 102 | ### 1. List Available Database Clusters 103 | 104 | First, view the database clusters running in your namespace: 105 | 106 | ```bash 107 | kubectl get cluster -n rag 108 | ``` 109 | 110 | ### 2. 
Retrieve Authentication Credentials 111 | 112 | For PostgreSQL, retrieve the username and password from Kubernetes secrets: 113 | 114 | ```bash 115 | # Get PostgreSQL username 116 | kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.username}' | base64 -d 117 | # Get PostgreSQL password 118 | kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.password}' | base64 -d 119 | ``` 120 | 121 | If you have trouble finding the correct secret name, list all secrets: 122 | 123 | ```bash 124 | kubectl get secrets -n rag 125 | ``` 126 | 127 | ### 3. Port Forward to Local Machine 128 | 129 | Use port forwarding to access PostgreSQL from your local machine: 130 | 131 | ```bash 132 | # Forward PostgreSQL port (5432) to your local machine 133 | # You can see all services with: kubectl get svc -n rag 134 | kubectl port-forward -n rag svc/pg-cluster-postgresql-postgresql 5432:5432 135 | ``` 136 | 137 | ### 4. Connect Using Database Client 138 | 139 | Now you can connect using your preferred PostgreSQL client with the retrieved credentials: 140 | 141 | ```bash 142 | # Example: connecting with psql 143 | export PGUSER=$(kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.username}' | base64 -d) 144 | export PGPASSWORD=$(kubectl get secrets -n rag pg-cluster-postgresql-account-postgres -o jsonpath='{.data.password}' | base64 -d) 145 | psql -h localhost -p 5432 -U $PGUSER 146 | ``` 147 | 148 | Keep the port-forwarding terminal running while you're connecting to the database. 149 | 150 | 151 | ## Uninstalling 152 | 153 | 1. **Remove the database clusters** 154 | 155 | ```bash 156 | bash ./03-uninstall-database.sh 157 | ``` 158 | 159 | The script deletes the database clusters that were enabled in `00-config.sh`. 160 | 161 | 2. **Clean up KubeBlocks add-ons** 162 | 163 | ```bash 164 | bash ./04-cleanup.sh 165 | ``` 166 | 167 | This removes the addons installed by `01-prepare.sh`. 
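If you would like to verify the cleanup, the same commands used to check the installation can be run again (assuming the default `rag` namespace used above):

```bash
# After removal completes, these should no longer list the database clusters or pods
kubectl get clusters -n rag
kubectl get po -n rag
```
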
168 | 169 | ## Reference 170 | * [Kubeblocks Documentation](https://kubeblocks.io/docs/preview/user_docs/overview/introduction) 171 | -------------------------------------------------------------------------------- /lightrag/api/README-zh.md: -------------------------------------------------------------------------------- 1 | # LightRAG 服务器和 Web 界面 2 | 3 | LightRAG 服务器旨在提供 Web 界面和 API 支持。Web 界面便于文档索引、知识图谱探索和简单的 RAG 查询界面。LightRAG 服务器还提供了与 Ollama 兼容的接口,旨在将 LightRAG 模拟为 Ollama 聊天模型。这使得 AI 聊天机器人(如 Open WebUI)可以轻松访问 LightRAG。 4 | 5 | ![image-20250323122538997](./README.assets/image-20250323122538997.png) 6 | 7 | ![image-20250323122754387](./README.assets/image-20250323122754387.png) 8 | 9 | ![image-20250323123011220](./README.assets/image-20250323123011220.png) 10 | 11 | ## 入门指南 12 | 13 | ### 安装 14 | 15 | * 从 PyPI 安装 16 | 17 | ```bash 18 | pip install "lightrag-hku[api]" 19 | ``` 20 | 21 | * 从源代码安装 22 | 23 | ```bash 24 | # 克隆仓库 25 | git clone https://github.com/HKUDS/lightrag.git 26 | 27 | # 切换到仓库目录 28 | cd lightrag 29 | 30 | # 如有必要,创建 Python 虚拟环境 31 | # 以可编辑模式安装并支持 API 32 | pip install -e ".[api]" 33 | ``` 34 | 35 | ### 启动 LightRAG 服务器前的准备 36 | 37 | LightRAG 需要同时集成 LLM(大型语言模型)和嵌入模型以有效执行文档索引和查询操作。在首次部署 LightRAG 服务器之前,必须配置 LLM 和嵌入模型的设置。LightRAG 支持绑定到各种 LLM/嵌入后端: 38 | 39 | * ollama 40 | * lollms 41 | * openai 或 openai 兼容 42 | * azure_openai 43 | 44 | 建议使用环境变量来配置 LightRAG 服务器。项目根目录中有一个名为 `env.example` 的示例环境变量文件。请将此文件复制到启动目录并重命名为 `.env`。之后,您可以在 `.env` 文件中修改与 LLM 和嵌入模型相关的参数。需要注意的是,LightRAG 服务器每次启动时都会将 `.env` 中的环境变量加载到系统环境变量中。**LightRAG 服务器会优先使用系统环境变量中的设置**。 45 | 46 | > 由于安装了 Python 扩展的 VS Code 可能会在集成终端中自动加载 .env 文件,请在每次修改 .env 文件后打开新的终端会话。 47 | 48 | 以下是 LLM 和嵌入模型的一些常见设置示例: 49 | 50 | * OpenAI LLM + Ollama 嵌入 51 | 52 | ``` 53 | LLM_BINDING=openai 54 | LLM_MODEL=gpt-4o 55 | LLM_BINDING_HOST=https://api.openai.com/v1 56 | LLM_BINDING_API_KEY=your_api_key 57 | 58 | EMBEDDING_BINDING=ollama 59 | EMBEDDING_BINDING_HOST=http://localhost:11434 60 | EMBEDDING_MODEL=bge-m3:latest 61 | EMBEDDING_DIM=1024 62 | # EMBEDDING_BINDING_API_KEY=your_api_key 63 | ``` 64 | 65 | * Ollama LLM + Ollama 嵌入 66 | 67 | ``` 68 | LLM_BINDING=ollama 69 | LLM_MODEL=mistral-nemo:latest 70 | LLM_BINDING_HOST=http://localhost:11434 71 | # LLM_BINDING_API_KEY=your_api_key 72 | ### Ollama 服务器上下文 token 数(必须大于 MAX_TOTAL_TOKENS+2000) 73 | OLLAMA_LLM_NUM_CTX=8192 74 | 75 | EMBEDDING_BINDING=ollama 76 | EMBEDDING_BINDING_HOST=http://localhost:11434 77 | EMBEDDING_MODEL=bge-m3:latest 78 | EMBEDDING_DIM=1024 79 | # EMBEDDING_BINDING_API_KEY=your_api_key 80 | ``` 81 | 82 | ### 启动 LightRAG 服务器 83 | 84 | LightRAG 服务器支持两种运行模式: 85 | * 简单高效的 Uvicorn 模式 86 | 87 | ``` 88 | lightrag-server 89 | ``` 90 | * 多进程 Gunicorn + Uvicorn 模式(生产模式,不支持 Windows 环境) 91 | 92 | ``` 93 | lightrag-gunicorn --workers 4 94 | ``` 95 | 启动LightRAG的时候,当前工作目录必须含有`.env`配置文件。**要求将.env文件置于启动目录中是经过特意设计的**。 这样做的目的是支持用户同时启动多个LightRAG实例,并为不同实例配置不同的.env文件。**修改.env文件后,您需要重新打开终端以使新设置生效**。 这是因为每次启动时,LightRAG Server会将.env文件中的环境变量加载至系统环境变量,且系统环境变量的设置具有更高优先级。 96 | 97 | 启动时可以通过命令行参数覆盖`.env`文件中的配置。常用的命令行参数包括: 98 | 99 | - `--host`:服务器监听地址(默认:0.0.0.0) 100 | - `--port`:服务器监听端口(默认:9621) 101 | - `--timeout`:LLM 请求超时时间(默认:150 秒) 102 | - `--log-level`:日志级别(默认:INFO) 103 | - `--working-dir`:数据库持久化目录(默认:./rag_storage) 104 | - `--input-dir`:上传文件存放目录(默认:./inputs) 105 | - `--workspace`: 工作空间名称,用于逻辑上隔离多个LightRAG实例之间的数据(默认:空) 106 | 107 | ### 使用 Docker 启动 LightRAG 服务器 108 | 109 | * 配置 .env 文件: 110 | 通过复制示例文件 [`env.example`](env.example) 创建个性化的 .env 文件,并根据实际需求设置 LLM 及 Embedding 参数。 111 | * 创建一个名为 
docker-compose.yml 的文件: 112 | 113 | ```yaml 114 | services: 115 | lightrag: 116 | container_name: lightrag 117 | image: ghcr.io/hkuds/lightrag:latest 118 | ports: 119 | - "${PORT:-9621}:9621" 120 | volumes: 121 | - ./data/rag_storage:/app/data/rag_storage 122 | - ./data/inputs:/app/data/inputs 123 | - ./config.ini:/app/config.ini 124 | - ./.env:/app/.env 125 | env_file: 126 | - .env 127 | restart: unless-stopped 128 | extra_hosts: 129 | - "host.docker.internal:host-gateway" 130 | ``` 131 | 132 | * 通过以下命令启动 LightRAG 服务器: 133 | 134 | ```shell 135 | docker compose up 136 | # 如果希望启动后让程序退到后台运行,需要在命令的最后添加 -d 参数 137 | ``` 138 | > 可以通过以下链接获取官方的docker compose文件:[docker-compose.yml]( https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml) 。如需获取LightRAG的历史版本镜像,可以访问以下链接: [LightRAG Docker Images]( https://github.com/HKUDS/LightRAG/pkgs/container/lightrag) 139 | 140 | ### 启动时自动扫描 141 | 142 | 当使用 `--auto-scan-at-startup` 参数启动LightRAG Server时,系统将自动: 143 | 144 | 1. 扫描输入目录中的新文件 145 | 2. 为尚未在数据库中的新文档建立索引 146 | 3. 使所有内容立即可用于 RAG 查询 147 | 148 | 这种工作模式给启动一个临时的RAG任务提供给了方便。 149 | 150 | > `--input-dir` 参数指定要扫描的输入目录。您可以从 webui 触发输入目录扫描。 151 | 152 | ### 启动多个LightRAG实例 153 | 154 | 有两种方式可以启动多个LightRAG实例。第一种方式是为每个实例配置一个完全独立的工作环境。此时需要为每个实例创建一个独立的工作目录,然后在这个工作目录上放置一个当前实例专用的`.env`配置文件。不同实例的配置文件中的服务器监听端口不能重复,然后在工作目录上执行 lightrag-server 启动服务即可。 155 | 156 | 第二种方式是所有实例共享一套相同的`.env`配置文件,然后通过命令行参数来为每个实例指定不同的服务器监听端口和工作空间。你可以在同一个工作目录中通过不同的命令行参数启动多个LightRAG实例。例如: 157 | 158 | ``` 159 | # 启动实例1 160 | lightrag-server --port 9621 --workspace space1 161 | 162 | # 启动实例2 163 | lightrag-server --port 9622 --workspace space2 164 | ``` 165 | 166 | 工作空间的作用是实现不同实例之间的数据隔离。因此不同实例之间的`workspace`参数必须不同,否则会导致数据混乱,数据将会被破坏。 167 | 168 | 通过 Docker Compose 启动多个 LightRAG 实例时,只需在 `docker-compose.yml` 中为每个容器指定不同的 `WORKSPACE` 和 `PORT` 环境变量即可。即使所有实例共享同一个 `.env` 文件,Compose 中定义的容器环境变量也会优先覆盖 `.env` 文件中的同名设置,从而确保每个实例拥有独立的配置。 169 | 170 | ### LightRAG实例间的数据隔离 171 | 172 | 每个实例配置一个独立的工作目录和专用`.env`配置文件通常能够保证内存数据库中的本地持久化文件保存在各自的工作目录,实现数据的相互隔离。LightRAG默认存储全部都是内存数据库,通过这种方式进行数据隔离是没有问题的。但是如果使用的是外部数据库,如果不同实例访问的是同一个数据库实例,就需要通过配置工作空间来实现数据隔离,否则不同实例的数据将会出现冲突并被破坏。 173 | 174 | 命令行的 workspace 参数和`.env`文件中的环境变量`WORKSPACE` 都可以用于指定当前实例的工作空间名字,命令行参数的优先级别更高。下面是不同类型的存储实现工作空间的方式: 175 | 176 | - **对于本地基于文件的数据库,数据隔离通过工作空间子目录实现:** JsonKVStorage, JsonDocStatusStorage, NetworkXStorage, NanoVectorDBStorage, FaissVectorDBStorage。 177 | - **对于将数据存储在集合(collection)中的数据库,通过在集合名称前添加工作空间前缀来实现:** RedisKVStorage, RedisDocStatusStorage, MilvusVectorDBStorage, QdrantVectorDBStorage, MongoKVStorage, MongoDocStatusStorage, MongoVectorDBStorage, MongoGraphStorage, PGGraphStorage。 178 | - **对于关系型数据库,数据隔离通过向表中添加 `workspace` 字段进行数据的逻辑隔离:** PGKVStorage, PGVectorStorage, PGDocStatusStorage。 179 | 180 | * **对于Neo4j图数据库,通过label来实现数据的逻辑隔离**:Neo4JStorage 181 | 182 | 为了保持对遗留数据的兼容,在未配置工作空间时PostgreSQL的默认工作空间为`default`,Neo4j的默认工作空间为`base`。对于所有的外部存储,系统都提供了专用的工作空间环境变量,用于覆盖公共的 `WORKSPACE`环境变量配置。这些适用于指定存储类型的工作空间环境变量为:`REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`。 183 | 184 | ### Gunicorn + Uvicorn 的多工作进程 185 | 186 | LightRAG 服务器可以在 `Gunicorn + Uvicorn` 预加载模式下运行。Gunicorn 的多工作进程(多进程)功能可以防止文档索引任务阻塞 RAG 查询。使用 CPU 密集型文档提取工具(如 docling)在纯 Uvicorn 模式下可能会导致整个系统被阻塞。 187 | 188 | 虽然 LightRAG 服务器使用一个工作进程来处理文档索引流程,但通过 Uvicorn 的异步任务支持,可以并行处理多个文件。文档索引速度的瓶颈主要在于 LLM。如果您的 LLM 支持高并发,您可以通过增加 LLM 的并发级别来加速文档索引。以下是几个与并发处理相关的环境变量及其默认值: 189 | 190 | ``` 191 | ### 工作进程数,数字不大于 (2 x 核心数) + 1 192 | WORKERS=2 193 | ### 一批中并行处理的文件数 194 | 
MAX_PARALLEL_INSERT=2 195 | # LLM 的最大并发请求数 196 | MAX_ASYNC=4 197 | ``` 198 | 199 | ### 将 Lightrag 安装为 Linux 服务 200 | 201 | 从示例文件 `lightrag.service.example` 创建您的服务文件 `lightrag.service`。修改服务文件中的 WorkingDirectory 和 ExecStart: 202 | 203 | ```text 204 | Description=LightRAG Ollama Service 205 | WorkingDirectory= 206 | ExecStart=/lightrag/api/lightrag-api 207 | ``` 208 | 209 | 修改您的服务启动脚本:`lightrag-api`。根据需要更改 python 虚拟环境激活命令: 210 | 211 | ```shell 212 | #!/bin/bash 213 | 214 | # 您的 python 虚拟环境激活命令 215 | source /home/netman/lightrag-xyj/venv/bin/activate 216 | # 启动 lightrag api 服务器 217 | lightrag-server 218 | ``` 219 | 220 | 安装 LightRAG 服务。如果您的系统是 Ubuntu,以下命令将生效: 221 | 222 | ```shell 223 | sudo cp lightrag.service /etc/systemd/system/ 224 | sudo systemctl daemon-reload 225 | sudo systemctl start lightrag.service 226 | sudo systemctl status lightrag.service 227 | sudo systemctl enable lightrag.service 228 | ``` 229 | 230 | ## Ollama 模拟 231 | 232 | 我们为 LightRAG 提供了 Ollama 兼容接口,旨在将 LightRAG 模拟为 Ollama 聊天模型。这使得支持 Ollama 的 AI 聊天前端(如 Open WebUI)可以轻松访问 LightRAG。 233 | 234 | ### 将 Open WebUI 连接到 LightRAG 235 | 236 | 启动 lightrag-server 后,您可以在 Open WebUI 管理面板中添加 Ollama 类型的连接。然后,一个名为 `lightrag:latest` 的模型将出现在 Open WebUI 的模型管理界面中。用户随后可以通过聊天界面向 LightRAG 发送查询。对于这种用例,最好将 LightRAG 安装为服务。 237 | 238 | Open WebUI 使用 LLM 来执行会话标题和会话关键词生成任务。因此,Ollama 聊天补全 API 会检测并将 OpenWebUI 会话相关请求直接转发给底层 LLM。Open WebUI 的截图: 239 | 240 | ![image-20250323194750379](./README.assets/image-20250323194750379.png) 241 | 242 | ### 在聊天中选择查询模式 243 | 244 | 如果您从 LightRAG 的 Ollama 接口发送消息(查询),默认查询模式是 `hybrid`。您可以通过发送带有查询前缀的消息来选择查询模式。 245 | 246 | 查询字符串中的查询前缀可以决定使用哪种 LightRAG 查询模式来生成响应。支持的前缀包括: 247 | 248 | ``` 249 | /local 250 | /global 251 | /hybrid 252 | /naive 253 | /mix 254 | 255 | /bypass 256 | /context 257 | /localcontext 258 | /globalcontext 259 | /hybridcontext 260 | /naivecontext 261 | /mixcontext 262 | ``` 263 | 264 | 例如,聊天消息 "/mix 唐僧有几个徒弟" 将触发 LightRAG 的混合模式查询。没有查询前缀的聊天消息默认会触发混合模式查询。 265 | 266 | "/bypass" 不是 LightRAG 查询模式,它会告诉 API 服务器将查询连同聊天历史直接传递给底层 LLM。因此用户可以使用 LLM 基于聊天历史回答问题。如果您使用 Open WebUI 作为前端,您可以直接切换到普通 LLM 模型,而不是使用 /bypass 前缀。 267 | 268 | "/context" 也不是 LightRAG 查询模式,它会告诉 LightRAG 只返回为 LLM 准备的上下文信息。您可以检查上下文是否符合您的需求,或者自行处理上下文。 269 | 270 | ### 在聊天中添加用户提示词 271 | 272 | 使用LightRAG进行内容查询时,应避免将搜索过程与无关的输出处理相结合,这会显著影响查询效果。用户提示(user prompt)正是为解决这一问题而设计 -- 它不参与RAG检索阶段,而是在查询完成后指导大语言模型(LLM)如何处理检索结果。我们可以在查询前缀末尾添加方括号,从而向LLM传递用户提示词: 273 | 274 | ``` 275 | /[使用mermaid格式画图] 请画出 Scrooge 的人物关系图谱 276 | /mix[使用mermaid格式画图] 请画出 Scrooge 的人物关系图谱 277 | ``` 278 | 279 | ## API 密钥和认证 280 | 281 | 默认情况下,LightRAG 服务器可以在没有任何认证的情况下访问。我们可以使用 API 密钥或账户凭证配置服务器以确保其安全。 282 | 283 | * API 密钥 284 | 285 | ``` 286 | LIGHTRAG_API_KEY=your-secure-api-key-here 287 | WHITELIST_PATHS=/health,/api/* 288 | ``` 289 | 290 | > 健康检查和 Ollama 模拟端点默认不进行 API 密钥检查。 291 | 292 | * 账户凭证(Web 界面需要登录后才能访问) 293 | 294 | LightRAG API 服务器使用基于 HS256 算法的 JWT 认证。要启用安全访问控制,需要以下环境变量: 295 | 296 | ```bash 297 | # JWT 认证 298 | AUTH_ACCOUNTS='admin:admin123,user1:pass456' 299 | TOKEN_SECRET='your-key' 300 | TOKEN_EXPIRE_HOURS=4 301 | ``` 302 | 303 | > 目前仅支持配置一个管理员账户和密码。尚未开发和实现完整的账户系统。 304 | 305 | 如果未配置账户凭证,Web 界面将以访客身份访问系统。因此,即使仅配置了 API 密钥,所有 API 仍然可以通过访客账户访问,这仍然不安全。因此,要保护 API,需要同时配置这两种认证方法。 306 | 307 | ## Azure OpenAI 后端配置 308 | 309 | 可以使用以下 Azure CLI 命令创建 Azure OpenAI API(您需要先从 [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) 安装 Azure CLI): 310 | 311 | ```bash 312 | # 根据需要更改资源组名称、位置和 OpenAI 资源名称 313 | RESOURCE_GROUP_NAME=LightRAG 314 
| LOCATION=swedencentral 315 | RESOURCE_NAME=LightRAG-OpenAI 316 | 317 | az login 318 | az group create --name $RESOURCE_GROUP_NAME --location $LOCATION 319 | az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location swedencentral 320 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard" 321 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard" 322 | az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint" 323 | az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME 324 | ``` 325 | 326 | 最后一个命令的输出将提供 OpenAI API 的端点和密钥。您可以使用这些值在 `.env` 文件中设置环境变量。 327 | 328 | ``` 329 | # .env 中的 Azure OpenAI 配置 330 | LLM_BINDING=azure_openai 331 | LLM_BINDING_HOST=your-azure-endpoint 332 | LLM_MODEL=your-model-deployment-name 333 | LLM_BINDING_API_KEY=your-azure-api-key 334 | ### API Version可选,默认为最新版本 335 | AZURE_OPENAI_API_VERSION=2024-08-01-preview 336 | 337 | ### 如果使用 Azure OpenAI 进行嵌入 338 | EMBEDDING_BINDING=azure_openai 339 | EMBEDDING_MODEL=your-embedding-deployment-name 340 | ``` 341 | 342 | ## LightRAG 服务器详细配置 343 | 344 | API 服务器可以通过三种方式配置(优先级从高到低): 345 | 346 | * 命令行参数 347 | * 环境变量或 .env 文件 348 | * Config.ini(仅用于存储配置) 349 | 350 | 大多数配置都有默认设置,详细信息请查看示例文件:`.env.example`。数据存储配置也可以通过 config.ini 设置。为方便起见,提供了示例文件 `config.ini.example`。 351 | 352 | ### 支持的 LLM 和嵌入后端 353 | 354 | LightRAG 支持绑定到各种 LLM/嵌入后端: 355 | 356 | * ollama 357 | * openai 和 openai 兼容 358 | * azure_openai 359 | * lollms 360 | 361 | 使用环境变量 `LLM_BINDING` 或 CLI 参数 `--llm-binding` 选择 LLM 后端类型。使用环境变量 `EMBEDDING_BINDING` 或 CLI 参数 `--embedding-binding` 选择嵌入后端类型。 362 | 363 | LLM和Embedding配置例子请查看项目根目录的 env.example 文件。OpenAI和Ollama兼容LLM接口的支持的完整配置选型可以通过一下命令查看: 364 | 365 | ``` 366 | lightrag-server --llm-binding openai --help 367 | lightrag-server --llm-binding ollama --help 368 | lightrag-server --embedding-binding ollama --help 369 | ``` 370 | 371 | ### 实体提取配置 372 | * ENABLE_LLM_CACHE_FOR_EXTRACT:为实体提取启用 LLM 缓存(默认:true) 373 | 374 | 在测试环境中将 `ENABLE_LLM_CACHE_FOR_EXTRACT` 设置为 true 以减少 LLM 调用成本是很常见的做法。 375 | 376 | ### 支持的存储类型 377 | 378 | LightRAG 使用 4 种类型的存储用于不同目的: 379 | 380 | * KV_STORAGE:llm 响应缓存、文本块、文档信息 381 | * VECTOR_STORAGE:实体向量、关系向量、块向量 382 | * GRAPH_STORAGE:实体关系图 383 | * DOC_STATUS_STORAGE:文档索引状态 384 | 385 | 每种存储类型都有几种实现: 386 | 387 | * KV_STORAGE 支持的实现名称 388 | 389 | ``` 390 | JsonKVStorage JsonFile(默认) 391 | PGKVStorage Postgres 392 | RedisKVStorage Redis 393 | MongoKVStorage MogonDB 394 | ``` 395 | 396 | * GRAPH_STORAGE 支持的实现名称 397 | 398 | ``` 399 | NetworkXStorage NetworkX(默认) 400 | Neo4JStorage Neo4J 401 | PGGraphStorage PostgreSQL with AGE plugin 402 | ``` 403 | 404 | > 在测试中Neo4j图形数据库相比PostgreSQL AGE有更好的性能表现。 405 | 406 | * VECTOR_STORAGE 支持的实现名称 407 | 408 | ``` 409 | NanoVectorDBStorage NanoVector(默认) 410 | PGVectorStorage Postgres 411 | MilvusVectorDBStorge Milvus 412 | FaissVectorDBStorage Faiss 413 | QdrantVectorDBStorage Qdrant 414 | MongoVectorDBStorage MongoDB 415 | ``` 416 | 417 | * DOC_STATUS_STORAGE 支持的实现名称 418 | 419 | ``` 420 | JsonDocStatusStorage JsonFile(默认) 421 | 
PGDocStatusStorage Postgres 422 | MongoDocStatusStorage MongoDB 423 | ``` 424 | 425 | 每一种存储类型的链接配置范例可以在 `env.example` 文件中找到。链接字符串中的数据库实例是需要你预先在数据库服务器上创建好的,LightRAG 仅负责在数据库实例中创建数据表,不负责创建数据库实例。如果使用 Redis 作为存储,记得给 Redis 配置自动持久化数据规则,否则 Redis 服务重启后数据会丢失。如果使用PostgreSQL数据库,推荐使用16.6版本或以上。 426 | 427 | ### 如何选择存储实现 428 | 429 | 您可以通过环境变量选择存储实现。在首次启动 API 服务器之前,您可以将以下环境变量设置为特定的存储实现名称: 430 | 431 | ``` 432 | LIGHTRAG_KV_STORAGE=PGKVStorage 433 | LIGHTRAG_VECTOR_STORAGE=PGVectorStorage 434 | LIGHTRAG_GRAPH_STORAGE=PGGraphStorage 435 | LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage 436 | ``` 437 | 438 | 在向 LightRAG 添加文档后,您不能更改存储实现选择。目前尚不支持从一个存储实现迁移到另一个存储实现。更多信息请阅读示例 env 文件或 config.ini 文件。 439 | 440 | ### LightRag API 服务器命令行选项 441 | 442 | | 参数 | 默认值 | 描述 | 443 | |-----------|---------|-------------| 444 | | --host | 0.0.0.0 | 服务器主机 | 445 | | --port | 9621 | 服务器端口 | 446 | | --working-dir | ./rag_storage | RAG 存储的工作目录 | 447 | | --input-dir | ./inputs | 包含输入文档的目录 | 448 | | --max-async | 4 | 最大异步操作数 | 449 | | --max-tokens | 32768 | 最大 token 大小 | 450 | | --timeout | 150 | 超时时间(秒)。None 表示无限超时(不推荐) | 451 | | --log-level | INFO | 日志级别(DEBUG、INFO、WARNING、ERROR、CRITICAL) | 452 | | --verbose | - | 详细调试输出(True、False) | 453 | | --key | None | 用于认证的 API 密钥。保护 lightrag 服务器免受未授权访问 | 454 | | --ssl | False | 启用 HTTPS | 455 | | --ssl-certfile | None | SSL 证书文件路径(如果启用 --ssl 则必需) | 456 | | --ssl-keyfile | None | SSL 私钥文件路径(如果启用 --ssl 则必需) | 457 | | --top-k | 50 | 要检索的 top-k 项目数;在"local"模式下对应实体,在"global"模式下对应关系。 | 458 | | --cosine-threshold | 0.4 | 节点和关系检索的余弦阈值,与 top-k 一起控制节点和关系的检索。 | 459 | | --llm-binding | ollama | LLM 绑定类型(lollms、ollama、openai、openai-ollama、azure_openai) | 460 | | --embedding-binding | ollama | 嵌入绑定类型(lollms、ollama、openai、azure_openai) | 461 | | auto-scan-at-startup | - | 扫描输入目录中的新文件并开始索引 | 462 | 463 | ### .env 文件示例 464 | 465 | ```bash 466 | ### Server Configuration 467 | # HOST=0.0.0.0 468 | PORT=9621 469 | WORKERS=2 470 | 471 | ### Settings for document indexing 472 | ENABLE_LLM_CACHE_FOR_EXTRACT=true 473 | SUMMARY_LANGUAGE=Chinese 474 | MAX_PARALLEL_INSERT=2 475 | 476 | ### LLM Configuration (Use valid host. For local services installed with docker, you can use host.docker.internal) 477 | TIMEOUT=200 478 | MAX_ASYNC=4 479 | 480 | LLM_BINDING=openai 481 | LLM_MODEL=gpt-4o-mini 482 | LLM_BINDING_HOST=https://api.openai.com/v1 483 | LLM_BINDING_API_KEY=your-api-key 484 | 485 | ### Embedding Configuration (Use valid host. 
For local services installed with docker, you can use host.docker.internal) 486 | EMBEDDING_MODEL=bge-m3:latest 487 | EMBEDDING_DIM=1024 488 | EMBEDDING_BINDING=ollama 489 | EMBEDDING_BINDING_HOST=http://localhost:11434 490 | 491 | ### For JWT Auth 492 | # AUTH_ACCOUNTS='admin:admin123,user1:pass456' 493 | # TOKEN_SECRET=your-key-for-LightRAG-API-Server-xxx 494 | # TOKEN_EXPIRE_HOURS=48 495 | 496 | # LIGHTRAG_API_KEY=your-secure-api-key-here-123 497 | # WHITELIST_PATHS=/api/* 498 | # WHITELIST_PATHS=/health,/api/* 499 | ``` 500 | 501 | #### 使用 ollama 默认本地服务器作为 llm 和嵌入后端运行 Lightrag 服务器 502 | 503 | Ollama 是 llm 和嵌入的默认后端,因此默认情况下您可以不带参数运行 lightrag-server,将使用默认值。确保已安装 ollama 并且正在运行,且默认模型已安装在 ollama 上。 504 | 505 | ```bash 506 | # 使用 ollama 运行 lightrag,llm 使用 mistral-nemo:latest,嵌入使用 bge-m3:latest 507 | lightrag-server 508 | 509 | # 使用认证密钥 510 | lightrag-server --key my-key 511 | ``` 512 | 513 | #### 使用 lollms 默认本地服务器作为 llm 和嵌入后端运行 Lightrag 服务器 514 | 515 | ```bash 516 | # 使用 lollms 运行 lightrag,llm 使用 mistral-nemo:latest,嵌入使用 bge-m3:latest 517 | # 在 .env 或 config.ini 中配置 LLM_BINDING=lollms 和 EMBEDDING_BINDING=lollms 518 | lightrag-server 519 | 520 | # 使用认证密钥 521 | lightrag-server --key my-key 522 | ``` 523 | 524 | #### 使用 openai 服务器作为 llm 和嵌入后端运行 Lightrag 服务器 525 | 526 | ```bash 527 | # 使用 openai 运行 lightrag,llm 使用 GPT-4o-mini,嵌入使用 text-embedding-3-small 528 | # 在 .env 或 config.ini 中配置: 529 | # LLM_BINDING=openai 530 | # LLM_MODEL=GPT-4o-mini 531 | # EMBEDDING_BINDING=openai 532 | # EMBEDDING_MODEL=text-embedding-3-small 533 | lightrag-server 534 | 535 | # 使用认证密钥 536 | lightrag-server --key my-key 537 | ``` 538 | 539 | #### 使用 azure openai 服务器作为 llm 和嵌入后端运行 Lightrag 服务器 540 | 541 | ```bash 542 | # 使用 azure_openai 运行 lightrag 543 | # 在 .env 或 config.ini 中配置: 544 | # LLM_BINDING=azure_openai 545 | # LLM_MODEL=your-model 546 | # EMBEDDING_BINDING=azure_openai 547 | # EMBEDDING_MODEL=your-embedding-model 548 | lightrag-server 549 | 550 | # 使用认证密钥 551 | lightrag-server --key my-key 552 | ``` 553 | 554 | **重要说明:** 555 | - 对于 LoLLMs:确保指定的模型已安装在您的 LoLLMs 实例中 556 | - 对于 Ollama:确保指定的模型已安装在您的 Ollama 实例中 557 | - 对于 OpenAI:确保您已设置 OPENAI_API_KEY 环境变量 558 | - 对于 Azure OpenAI:按照先决条件部分所述构建和配置您的服务器 559 | 560 | 要获取任何服务器的帮助,使用 --help 标志: 561 | ```bash 562 | lightrag-server --help 563 | ``` 564 | 565 | 注意:如果您不需要 API 功能,可以使用以下命令安装不带 API 支持的基本包: 566 | ```bash 567 | pip install lightrag-hku 568 | ``` 569 | 570 | ## 文档和块处理逻辑说明 571 | 572 | LightRAG 中的文档处理流程有些复杂,分为两个主要阶段:提取阶段(实体和关系提取)和合并阶段(实体和关系合并)。有两个关键参数控制流程并发性:并行处理的最大文件数(`MAX_PARALLEL_INSERT`)和最大并发 LLM 请求数(`MAX_ASYNC`)。工作流程描述如下: 573 | 574 | 1. `MAX_ASYNC` 限制系统中并发 LLM 请求的总数,包括查询、提取和合并的请求。LLM 请求具有不同的优先级:查询操作优先级最高,其次是合并,然后是提取。 575 | 2. `MAX_PARALLEL_INSERT` 控制提取阶段并行处理的文件数量。`MAX_PARALLEL_INSERT`建议设置为2~10之间,通常设置为 `MAX_ASYNC/3`,设置太大会导致合并阶段不同文档之间实体和关系重名的机会增大,降低合并阶段的效率。 576 | 3. 在单个文件中,来自不同文本块的实体和关系提取是并发处理的,并发度由 `MAX_ASYNC` 设置。只有在处理完 `MAX_ASYNC` 个文本块后,系统才会继续处理同一文件中的下一批文本块。 577 | 4. 当一个文件完成实体和关系提后,将进入实体和关系合并阶段。这一阶段也会并发处理多个实体和关系,其并发度同样是由 `MAX_ASYNC` 控制。 578 | 5. 合并阶段的 LLM 请求的优先级别高于提取阶段,目的是让进入合并阶段的文件尽快完成处理,并让处理结果尽快更新到向量数据库中。 579 | 6. 为防止竞争条件,合并阶段会避免并发处理同一个实体或关系,当多个文件中都涉及同一个实体或关系需要合并的时候他们会串行执行。 580 | 7. 每个文件在流程中被视为一个原子处理单元。只有当其所有文本块都完成提取和合并后,文件才会被标记为成功处理。如果在处理过程中发生任何错误,整个文件将被标记为失败,并且必须重新处理。 581 | 8. 当由于错误而重新处理文件时,由于 LLM 缓存,先前处理的文本块可以快速跳过。尽管 LLM 缓存在合并阶段也会被利用,但合并顺序的不一致可能会限制其在此阶段的有效性。 582 | 9. 如果在提取过程中发生错误,系统不会保留任何中间结果。如果在合并过程中发生错误,已合并的实体和关系可能会被保留;当重新处理同一文件时,重新提取的实体和关系将与现有实体和关系合并,而不会影响查询结果。 583 | 10. 
在合并阶段结束时,所有实体和关系数据都会在向量数据库中更新。如果此时发生错误,某些更新可能会被保留。但是,下一次处理尝试将覆盖先前结果,确保成功重新处理的文件不会影响未来查询结果的完整性。 584 | 585 | 大型文件应分割成较小的片段以启用增量处理。可以通过在 Web UI 上按“扫描”按钮来启动失败文件的重新处理。 586 | 587 | ## API 端点 588 | 589 | 所有服务器(LoLLMs、Ollama、OpenAI 和 Azure OpenAI)都为 RAG 功能提供相同的 REST API 端点。当 API 服务器运行时,访问: 590 | 591 | - Swagger UI:http://localhost:9621/docs 592 | - ReDoc:http://localhost:9621/redoc 593 | 594 | 您可以使用提供的 curl 命令或通过 Swagger UI 界面测试 API 端点。确保: 595 | 596 | 1. 启动适当的后端服务(LoLLMs、Ollama 或 OpenAI) 597 | 2. 启动 RAG 服务器 598 | 3. 使用文档管理端点上传一些文档 599 | 4. 使用查询端点查询系统 600 | 5. 如果在输入目录中放入新文件,触发文档扫描 601 | 602 | ## 异步文档索引与进度跟踪 603 | 604 | LightRAG采用异步文档索引机制,便于前端监控和查询文档处理进度。用户通过指定端点上传文件或插入文本时,系统将返回唯一的跟踪ID,以便实时监控处理进度。 605 | 606 | **支持生成跟踪ID的API端点:** 607 | * `/documents/upload` 608 | * `/documents/text` 609 | * `/documents/texts` 610 | 611 | **文档处理状态查询端点:** 612 | * `/track_status/{track_id}` 613 | 614 | 该端点提供全面的状态信息,包括: 615 | * 文档处理状态(待处理/处理中/已处理/失败) 616 | * 内容摘要和元数据 617 | * 处理失败时的错误信息 618 | * 创建和更新时间戳 619 | -------------------------------------------------------------------------------- /lightrag/api/README.md: -------------------------------------------------------------------------------- 1 | # LightRAG Server and WebUI 2 | 3 | The LightRAG Server is designed to provide a Web UI and API support. The Web UI facilitates document indexing, knowledge graph exploration, and a simple RAG query interface. LightRAG Server also provides an Ollama-compatible interface, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat bots, such as Open WebUI, to access LightRAG easily. 4 | 5 | ![image-20250323122538997](./README.assets/image-20250323122538997.png) 6 | 7 | ![image-20250323122754387](./README.assets/image-20250323122754387.png) 8 | 9 | ![image-20250323123011220](./README.assets/image-20250323123011220.png) 10 | 11 | ## Getting Started 12 | 13 | ### Installation 14 | 15 | * Install from PyPI 16 | 17 | ```bash 18 | pip install "lightrag-hku[api]" 19 | ``` 20 | 21 | * Installation from Source 22 | 23 | ```bash 24 | # Clone the repository 25 | git clone https://github.com/HKUDS/lightrag.git 26 | 27 | # Change to the repository directory 28 | cd lightrag 29 | 30 | # create a Python virtual environment if necessary 31 | # Install in editable mode with API support 32 | pip install -e ".[api]" 33 | ``` 34 | 35 | ### Before Starting LightRAG Server 36 | 37 | LightRAG necessitates the integration of both an LLM (Large Language Model) and an Embedding Model to effectively execute document indexing and querying operations. Prior to the initial deployment of the LightRAG server, it is essential to configure the settings for both the LLM and the Embedding Model. LightRAG supports binding to various LLM/Embedding backends: 38 | 39 | * ollama 40 | * lollms 41 | * openai or openai compatible 42 | * azure_openai 43 | 44 | It is recommended to use environment variables to configure the LightRAG Server. There is an example environment variable file named `env.example` in the root directory of the project. Please copy this file to the startup directory and rename it to `.env`. After that, you can modify the parameters related to the LLM and Embedding models in the `.env` file. It is important to note that the LightRAG Server will load the environment variables from `.env` into the system environment variables each time it starts. **LightRAG Server will prioritize the settings in the system environment variables to .env file**. 
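For example, a variable exported in the shell wins over the same key defined in `.env` (the model names below are purely illustrative):

```bash
# .env contains: LLM_MODEL=gpt-4o-mini
export LLM_MODEL=gpt-4o   # system environment variable takes precedence
lightrag-server           # starts with LLM_MODEL=gpt-4o, not the value from .env
```
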
45 | 46 | > Since VS Code with the Python extension may automatically load the .env file in the integrated terminal, please open a new terminal session after each modification to the .env file. 47 | 48 | Here are some examples of common settings for LLM and Embedding models: 49 | 50 | * OpenAI LLM + Ollama Embedding: 51 | 52 | ``` 53 | LLM_BINDING=openai 54 | LLM_MODEL=gpt-4o 55 | LLM_BINDING_HOST=https://api.openai.com/v1 56 | LLM_BINDING_API_KEY=your_api_key 57 | 58 | EMBEDDING_BINDING=ollama 59 | EMBEDDING_BINDING_HOST=http://localhost:11434 60 | EMBEDDING_MODEL=bge-m3:latest 61 | EMBEDDING_DIM=1024 62 | # EMBEDDING_BINDING_API_KEY=your_api_key 63 | ``` 64 | 65 | * Ollama LLM + Ollama Embedding: 66 | 67 | ``` 68 | LLM_BINDING=ollama 69 | LLM_MODEL=mistral-nemo:latest 70 | LLM_BINDING_HOST=http://localhost:11434 71 | # LLM_BINDING_API_KEY=your_api_key 72 | ### Ollama Server context length (Must be larger than MAX_TOTAL_TOKENS+2000) 73 | OLLAMA_LLM_NUM_CTX=16384 74 | 75 | EMBEDDING_BINDING=ollama 76 | EMBEDDING_BINDING_HOST=http://localhost:11434 77 | EMBEDDING_MODEL=bge-m3:latest 78 | EMBEDDING_DIM=1024 79 | # EMBEDDING_BINDING_API_KEY=your_api_key 80 | ``` 81 | 82 | ### Starting LightRAG Server 83 | 84 | The LightRAG Server supports two operational modes: 85 | * The simple and efficient Uvicorn mode: 86 | 87 | ``` 88 | lightrag-server 89 | ``` 90 | * The multiprocess Gunicorn + Uvicorn mode (production mode, not supported on Windows environments): 91 | 92 | ``` 93 | lightrag-gunicorn --workers 4 94 | ``` 95 | 96 | When starting LightRAG, the current working directory must contain the `.env` configuration file. **It is intentionally designed that the `.env` file must be placed in the startup directory**. The purpose of this is to allow users to launch multiple LightRAG instances simultaneously and configure different `.env` files for different instances. **After modifying the `.env` file, you need to reopen the terminal for the new settings to take effect.** This is because each time LightRAG Server starts, it loads the environment variables from the `.env` file into the system environment variables, and system environment variables have higher precedence. 97 | 98 | During startup, configurations in the `.env` file can be overridden by command-line parameters. Common command-line parameters include: 99 | 100 | - `--host`: Server listening address (default: 0.0.0.0) 101 | - `--port`: Server listening port (default: 9621) 102 | - `--timeout`: LLM request timeout (default: 150 seconds) 103 | - `--log-level`: Log level (default: INFO) 104 | - `--working-dir`: Database persistence directory (default: ./rag_storage) 105 | - `--input-dir`: Directory for uploaded files (default: ./inputs) 106 | - `--workspace`: Workspace name, used to logically isolate data between multiple LightRAG instances (default: empty) 107 | 108 | ### Launching LightRAG Server with Docker 109 | 110 | * Prepare the .env file: 111 | Create a personalized .env file by copying the sample file [`env.example`](env.example). Configure the LLM and embedding parameters according to your requirements. 
112 | 113 | * Create a file named `docker-compose.yml`: 114 | 115 | ```yaml 116 | services: 117 | lightrag: 118 | container_name: lightrag 119 | image: ghcr.io/hkuds/lightrag:latest 120 | ports: 121 | - "${PORT:-9621}:9621" 122 | volumes: 123 | - ./data/rag_storage:/app/data/rag_storage 124 | - ./data/inputs:/app/data/inputs 125 | - ./config.ini:/app/config.ini 126 | - ./.env:/app/.env 127 | env_file: 128 | - .env 129 | restart: unless-stopped 130 | extra_hosts: 131 | - "host.docker.internal:host-gateway" 132 | ``` 133 | 134 | * Start the LightRAG Server with the following command: 135 | 136 | ```shell 137 | docker compose up 138 | # If you want the program to run in the background after startup, add the -d parameter at the end of the command. 139 | ``` 140 | 141 | > You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit this link: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag) 142 | 143 | ### Auto scan on startup 144 | 145 | When starting the LightRAG Server with the `--auto-scan-at-startup` parameter, the system will automatically: 146 | 147 | 1. Scan for new files in the input directory 148 | 2. Index new documents that aren't already in the database 149 | 3. Make all content immediately available for RAG queries 150 | 151 | This offers an efficient method for deploying ad-hoc RAG processes. 152 | 153 | > The `--input-dir` parameter specifies the input directory to scan. You can trigger the input directory scan from the Web UI. 154 | 155 | ### Starting Multiple LightRAG Instances 156 | 157 | There are two ways to start multiple LightRAG instances. The first way is to configure a completely independent working environment for each instance. This requires creating a separate working directory for each instance and placing a dedicated `.env` configuration file in that directory. The server listening ports in the configuration files of different instances cannot be the same. Then, you can start the service by running `lightrag-server` in the working directory. 158 | 159 | The second way is for all instances to share the same set of `.env` configuration files, and then use command-line arguments to specify different server listening ports and workspaces for each instance. You can start multiple LightRAG instances in the same working directory with different command-line arguments. For example: 160 | 161 | ``` 162 | # Start instance 1 163 | lightrag-server --port 9621 --workspace space1 164 | 165 | # Start instance 2 166 | lightrag-server --port 9622 --workspace space2 167 | ``` 168 | 169 | The purpose of a workspace is to achieve data isolation between different instances. Therefore, the `workspace` parameter must be different for different instances; otherwise, it will lead to data confusion and corruption. 170 | 171 | When launching multiple LightRAG instances via Docker Compose, simply specify unique `WORKSPACE` and `PORT` environment variables for each container within your `docker-compose.yml`. Even if all instances share a common `.env` file, the container-specific environment variables defined in Compose will take precedence, ensuring independent configurations for each instance. 
172 | 173 | ### Data Isolation Between LightRAG Instances 174 | 175 | Configuring an independent working directory and a dedicated `.env` configuration file for each instance can generally ensure that locally persisted files in the in-memory database are saved in their respective working directories, achieving data isolation. By default, LightRAG uses all in-memory databases, and this method of data isolation is sufficient. However, if you are using an external database, and different instances access the same database instance, you need to use workspaces to achieve data isolation; otherwise, the data of different instances will conflict and be destroyed. 176 | 177 | The command-line `workspace` argument and the `WORKSPACE` environment variable in the `.env` file can both be used to specify the workspace name for the current instance, with the command-line argument having higher priority. Here is how workspaces are implemented for different types of storage: 178 | 179 | - **For local file-based databases, data isolation is achieved through workspace subdirectories:** `JsonKVStorage`, `JsonDocStatusStorage`, `NetworkXStorage`, `NanoVectorDBStorage`, `FaissVectorDBStorage`. 180 | - **For databases that store data in collections, it's done by adding a workspace prefix to the collection name:** `RedisKVStorage`, `RedisDocStatusStorage`, `MilvusVectorDBStorage`, `QdrantVectorDBStorage`, `MongoKVStorage`, `MongoDocStatusStorage`, `MongoVectorDBStorage`, `MongoGraphStorage`, `PGGraphStorage`. 181 | - **For relational databases, data isolation is achieved by adding a `workspace` field to the tables for logical data separation:** `PGKVStorage`, `PGVectorStorage`, `PGDocStatusStorage`. 182 | - **For graph databases, logical data isolation is achieved through labels:** `Neo4JStorage`, `MemgraphStorage` 183 | 184 | To maintain compatibility with legacy data, the default workspace for PostgreSQL is `default` and for Neo4j is `base` when no workspace is configured. For all external storages, the system provides dedicated workspace environment variables to override the common `WORKSPACE` environment variable configuration. These storage-specific workspace environment variables are: `REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`, `MEMGRAPH_WORKSPACE`. 185 | 186 | ### Multiple workers for Gunicorn + Uvicorn 187 | 188 | The LightRAG Server can operate in the `Gunicorn + Uvicorn` preload mode. Gunicorn's multiple worker (multiprocess) capability prevents document indexing tasks from blocking RAG queries. Using CPU-exhaustive document extraction tools, such as docling, can lead to the entire system being blocked in pure Uvicorn mode. 189 | 190 | Though LightRAG Server uses one worker to process the document indexing pipeline, with the async task support of Uvicorn, multiple files can be processed in parallel. The bottleneck of document indexing speed mainly lies with the LLM. If your LLM supports high concurrency, you can accelerate document indexing by increasing the concurrency level of the LLM. 
Below are several environment variables related to concurrent processing, along with their default values: 191 | 192 | ``` 193 | ### Number of worker processes, not greater than (2 x number_of_cores) + 1 194 | WORKERS=2 195 | ### Number of parallel files to process in one batch 196 | MAX_PARALLEL_INSERT=2 197 | ### Max concurrent requests to the LLM 198 | MAX_ASYNC=4 199 | ``` 200 | 201 | ### Install LightRAG as a Linux Service 202 | 203 | Create your service file `lightrag.service` from the sample file: `lightrag.service.example`. Modify the `WorkingDirectory` and `ExecStart` in the service file: 204 | 205 | ```text 206 | Description=LightRAG Ollama Service 207 | WorkingDirectory= 208 | ExecStart=/lightrag/api/lightrag-api 209 | ``` 210 | 211 | Modify your service startup script: `lightrag-api`. Change your Python virtual environment activation command as needed: 212 | 213 | ```shell 214 | #!/bin/bash 215 | 216 | # your python virtual environment activation 217 | source /home/netman/lightrag-xyj/venv/bin/activate 218 | # start lightrag api server 219 | lightrag-server 220 | ``` 221 | 222 | Install LightRAG service. If your system is Ubuntu, the following commands will work: 223 | 224 | ```shell 225 | sudo cp lightrag.service /etc/systemd/system/ 226 | sudo systemctl daemon-reload 227 | sudo systemctl start lightrag.service 228 | sudo systemctl status lightrag.service 229 | sudo systemctl enable lightrag.service 230 | ``` 231 | 232 | ## Ollama Emulation 233 | 234 | We provide Ollama-compatible interfaces for LightRAG, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat frontends supporting Ollama, such as Open WebUI, to access LightRAG easily. 235 | 236 | ### Connect Open WebUI to LightRAG 237 | 238 | After starting the lightrag-server, you can add an Ollama-type connection in the Open WebUI admin panel. And then a model named `lightrag:latest` will appear in Open WebUI's model management interface. Users can then send queries to LightRAG through the chat interface. You should install LightRAG as a service for this use case. 239 | 240 | Open WebUI uses an LLM to do the session title and session keyword generation task. So the Ollama chat completion API detects and forwards OpenWebUI session-related requests directly to the underlying LLM. Screenshot from Open WebUI: 241 | 242 | ![image-20250323194750379](./README.assets/image-20250323194750379.png) 243 | 244 | ### Choose Query mode in chat 245 | 246 | The default query mode is `hybrid` if you send a message (query) from the Ollama interface of LightRAG. You can select query mode by sending a message with a query prefix. 247 | 248 | A query prefix in the query string can determine which LightRAG query mode is used to generate the response for the query. The supported prefixes include: 249 | 250 | ``` 251 | /local 252 | /global 253 | /hybrid 254 | /naive 255 | /mix 256 | 257 | /bypass 258 | /context 259 | /localcontext 260 | /globalcontext 261 | /hybridcontext 262 | /naivecontext 263 | /mixcontext 264 | ``` 265 | 266 | For example, the chat message `/mix What's LightRAG?` will trigger a mix mode query for LightRAG. A chat message without a query prefix will trigger a hybrid mode query by default. 267 | 268 | `/bypass` is not a LightRAG query mode; it will tell the API Server to pass the query directly to the underlying LLM, including the chat history. So the user can use the LLM to answer questions based on the chat history. 
If you are using Open WebUI as a front end, you can just switch the model to a normal LLM instead of using the `/bypass` prefix. 269 | 270 | `/context` is also not a LightRAG query mode; it will tell LightRAG to return only the context information prepared for the LLM. You can check the context if it's what you want, or process the context by yourself. 271 | 272 | ### Add user prompt in chat 273 | 274 | When using LightRAG for content queries, avoid combining the search process with unrelated output processing, as this significantly impacts query effectiveness. User prompt is specifically designed to address this issue — it does not participate in the RAG retrieval phase, but rather guides the LLM on how to process the retrieved results after the query is completed. We can append square brackets to the query prefix to provide the LLM with the user prompt: 275 | 276 | ``` 277 | /[Use mermaid format for diagrams] Please draw a character relationship diagram for Scrooge 278 | /mix[Use mermaid format for diagrams] Please draw a character relationship diagram for Scrooge 279 | ``` 280 | 281 | ## API Key and Authentication 282 | 283 | By default, the LightRAG Server can be accessed without any authentication. We can configure the server with an API Key or account credentials to secure it. 284 | 285 | * API Key: 286 | 287 | ``` 288 | LIGHTRAG_API_KEY=your-secure-api-key-here 289 | WHITELIST_PATHS=/health,/api/* 290 | ``` 291 | 292 | > Health check and Ollama emulation endpoints are excluded from API Key check by default. 293 | 294 | * Account credentials (the Web UI requires login before access can be granted): 295 | 296 | LightRAG API Server implements JWT-based authentication using the HS256 algorithm. To enable secure access control, the following environment variables are required: 297 | 298 | ```bash 299 | # For jwt auth 300 | AUTH_ACCOUNTS='admin:admin123,user1:pass456' 301 | TOKEN_SECRET='your-key' 302 | TOKEN_EXPIRE_HOURS=4 303 | ``` 304 | 305 | > Currently, only the configuration of an administrator account and password is supported. A comprehensive account system is yet to be developed and implemented. 306 | 307 | If Account credentials are not configured, the Web UI will access the system as a Guest. Therefore, even if only an API Key is configured, all APIs can still be accessed through the Guest account, which remains insecure. Hence, to safeguard the API, it is necessary to configure both authentication methods simultaneously. 
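If only an API key is configured, programmatic clients must send the key with every request to non-whitelisted paths. The snippet below is a minimal client-side sketch, assuming the key is passed in an `X-API-Key` header and that the server exposes a `POST /query` endpoint accepting a JSON body with `query` and `mode` fields; verify the exact header name and request schema in the Swagger UI (`/docs`) of your running server.

```python
# Minimal client-side sketch (not an official client): calling a protected
# endpoint with an API key. The "X-API-Key" header name, the "/query" path and
# the request body fields are assumptions based on the default server setup --
# confirm the exact schema in the Swagger UI at http://localhost:9621/docs.
import requests

BASE_URL = "http://localhost:9621"       # adjust to your server address
API_KEY = "your-secure-api-key-here"     # must match LIGHTRAG_API_KEY on the server

headers = {"X-API-Key": API_KEY}

# A simple hybrid-mode query against the RAG pipeline
payload = {"query": "What is LightRAG?", "mode": "hybrid"}
resp = requests.post(f"{BASE_URL}/query", json=payload, headers=headers, timeout=120)
resp.raise_for_status()
print(resp.json())
```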
308 | 309 | ## For Azure OpenAI Backend 310 | 311 | Azure OpenAI API can be created using the following commands in Azure CLI (you need to install Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)): 312 | 313 | ```bash 314 | # Change the resource group name, location, and OpenAI resource name as needed 315 | RESOURCE_GROUP_NAME=LightRAG 316 | LOCATION=swedencentral 317 | RESOURCE_NAME=LightRAG-OpenAI 318 | 319 | az login 320 | az group create --name $RESOURCE_GROUP_NAME --location $LOCATION 321 | az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location swedencentral 322 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard" 323 | az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard" 324 | az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint" 325 | az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME 326 | 327 | ``` 328 | 329 | The output of the last command will give you the endpoint and the key for the OpenAI API. You can use these values to set the environment variables in the `.env` file. 330 | 331 | ``` 332 | # Azure OpenAI Configuration in .env: 333 | LLM_BINDING=azure_openai 334 | LLM_BINDING_HOST=your-azure-endpoint 335 | LLM_MODEL=your-model-deployment-name 336 | LLM_BINDING_API_KEY=your-azure-api-key 337 | ### API version is optional, defaults to latest version 338 | AZURE_OPENAI_API_VERSION=2024-08-01-preview 339 | 340 | ### If using Azure OpenAI for embeddings 341 | EMBEDDING_BINDING=azure_openai 342 | EMBEDDING_MODEL=your-embedding-deployment-name 343 | ``` 344 | 345 | ## LightRAG Server Configuration in Detail 346 | 347 | The API Server can be configured in three ways (highest priority first): 348 | 349 | * Command line arguments 350 | * Environment variables or .env file 351 | * Config.ini (Only for storage configuration) 352 | 353 | Most of the configurations come with default settings; check out the details in the sample file: `.env.example`. Data storage configuration can also be set by config.ini. A sample file `config.ini.example` is provided for your convenience. 354 | 355 | ### LLM and Embedding Backend Supported 356 | 357 | LightRAG supports binding to various LLM/Embedding backends: 358 | 359 | * ollama 360 | * openai & openai compatible 361 | * azure_openai 362 | * lollms 363 | 364 | Use environment variables `LLM_BINDING` or CLI argument `--llm-binding` to select the LLM backend type. Use environment variables `EMBEDDING_BINDING` or CLI argument `--embedding-binding` to select the Embedding backend type. 365 | 366 | For LLM and embedding configuration examples, please refer to the `env.example` file in the project's root directory. 
To view the complete list of configurable options for OpenAI and Ollama-compatible LLM interfaces, use the following commands: 367 | ``` 368 | lightrag-server --llm-binding openai --help 369 | lightrag-server --llm-binding ollama --help 370 | lightrag-server --embedding-binding ollama --help 371 | ``` 372 | 373 | ### Entity Extraction Configuration 374 | * ENABLE_LLM_CACHE_FOR_EXTRACT: Enable LLM cache for entity extraction (default: true) 375 | 376 | It's very common to set `ENABLE_LLM_CACHE_FOR_EXTRACT` to true for a test environment to reduce the cost of LLM calls. 377 | 378 | ### Storage Types Supported 379 | 380 | LightRAG uses 4 types of storage for different purposes: 381 | 382 | * KV_STORAGE: llm response cache, text chunks, document information 383 | * VECTOR_STORAGE: entities vectors, relation vectors, chunks vectors 384 | * GRAPH_STORAGE: entity relation graph 385 | * DOC_STATUS_STORAGE: document indexing status 386 | 387 | Each storage type has several implementations: 388 | 389 | * KV_STORAGE supported implementations: 390 | 391 | ``` 392 | JsonKVStorage JsonFile (default) 393 | PGKVStorage Postgres 394 | RedisKVStorage Redis 395 | MongoKVStorage MongoDB 396 | ``` 397 | 398 | * GRAPH_STORAGE supported implementations: 399 | 400 | ``` 401 | NetworkXStorage NetworkX (default) 402 | Neo4JStorage Neo4J 403 | PGGraphStorage PostgreSQL with AGE plugin 404 | MemgraphStorage. Memgraph 405 | ``` 406 | 407 | > Testing has shown that Neo4J delivers superior performance in production environments compared to PostgreSQL with AGE plugin. 408 | 409 | * VECTOR_STORAGE supported implementations: 410 | 411 | ``` 412 | NanoVectorDBStorage NanoVector (default) 413 | PGVectorStorage Postgres 414 | MilvusVectorDBStorage Milvus 415 | FaissVectorDBStorage Faiss 416 | QdrantVectorDBStorage Qdrant 417 | MongoVectorDBStorage MongoDB 418 | ``` 419 | 420 | * DOC_STATUS_STORAGE: supported implementations: 421 | 422 | ``` 423 | JsonDocStatusStorage JsonFile (default) 424 | PGDocStatusStorage Postgres 425 | MongoDocStatusStorage MongoDB 426 | ``` 427 | Example connection configurations for each storage type can be found in the `env.example` file. The database instance in the connection string needs to be created by you on the database server beforehand. LightRAG is only responsible for creating tables within the database instance, not for creating the database instance itself. If using Redis as storage, remember to configure automatic data persistence rules for Redis, otherwise data will be lost after the Redis service restarts. If using PostgreSQL, it is recommended to use version 16.6 or above. 428 | 429 | 430 | ### How to Select Storage Implementation 431 | 432 | You can select storage implementation by environment variables. You can set the following environment variables to a specific storage implementation name before the first start of the API Server: 433 | 434 | ``` 435 | LIGHTRAG_KV_STORAGE=PGKVStorage 436 | LIGHTRAG_VECTOR_STORAGE=PGVectorStorage 437 | LIGHTRAG_GRAPH_STORAGE=PGGraphStorage 438 | LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage 439 | ``` 440 | 441 | You cannot change storage implementation selection after adding documents to LightRAG. Data migration from one storage implementation to another is not supported yet. For further information, please read the sample env file or config.ini file. 
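The same storage selection also applies when LightRAG is embedded as a library rather than run through the API server. The sketch below is illustrative only: it assumes the `kv_storage`, `vector_storage`, `graph_storage` and `doc_status_storage` keyword arguments of the `LightRAG` class (which mirror the `LIGHTRAG_*_STORAGE` environment variables) and that the PostgreSQL connection settings are supplied via environment variables as in `env.example`. Check the class signature of your installed version before relying on it.

```python
# Illustrative sketch: selecting storage implementations when using LightRAG as
# a library. The keyword argument names below mirror the LIGHTRAG_*_STORAGE
# environment variables and are assumed from the library's documented defaults;
# verify them against the LightRAG class in your installed version.
# PostgreSQL connection details (POSTGRES_HOST, POSTGRES_USER, ...) are still
# read from the environment / .env file, as described in env.example.
from lightrag import LightRAG

rag = LightRAG(
    working_dir="./rag_storage",
    kv_storage="PGKVStorage",
    vector_storage="PGVectorStorage",
    graph_storage="PGGraphStorage",
    doc_status_storage="PGDocStatusStorage",
    # llm_model_func=... and embedding_func=... must also be supplied,
    # exactly as in the LLM binding examples elsewhere in this repository.
)
```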
442 | 443 | ### LightRAG API Server Command Line Options 444 | 445 | | Parameter | Default | Description | 446 | | --------------------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------- | 447 | | --host | 0.0.0.0 | Server host | 448 | | --port | 9621 | Server port | 449 | | --working-dir | ./rag_storage | Working directory for RAG storage | 450 | | --input-dir | ./inputs | Directory containing input documents | 451 | | --max-async | 4 | Maximum number of async operations | 452 | | --max-tokens | 32768 | Maximum token size | 453 | | --timeout | 150 | Timeout in seconds. None for infinite timeout (not recommended) | 454 | | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | 455 | | --verbose | - | Verbose debug output (True, False) | 456 | | --key | None | API key for authentication. Protects the LightRAG server against unauthorized access | 457 | | --ssl | False | Enable HTTPS | 458 | | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) | 459 | | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) | 460 | | --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode. | 461 | | --cosine-threshold | 0.4 | The cosine threshold for nodes and relation retrieval, works with top-k to control the retrieval of nodes and relations. | 462 | | --llm-binding | ollama | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai) | 463 | | --embedding-binding | ollama | Embedding binding type (lollms, ollama, openai, azure_openai) | 464 | | --auto-scan-at-startup| - | Scan input directory for new files and start indexing | 465 | 466 | ### Additional Ollama Binding Options 467 | 468 | When using `--llm-binding ollama` or `--embedding-binding ollama`, additional Ollama-specific configuration options are available. To see all available Ollama binding options, add `--help` to the command line when starting the server. These additional options allow for fine-tuning of Ollama model parameters and connection settings. 469 | 470 | ### .env Examples 471 | 472 | ```bash 473 | ### Server Configuration 474 | # HOST=0.0.0.0 475 | PORT=9621 476 | WORKERS=2 477 | 478 | ### Settings for document indexing 479 | ENABLE_LLM_CACHE_FOR_EXTRACT=true 480 | SUMMARY_LANGUAGE=Chinese 481 | MAX_PARALLEL_INSERT=2 482 | 483 | ### LLM Configuration (Use valid host. For local services installed with docker, you can use host.docker.internal) 484 | TIMEOUT=200 485 | MAX_ASYNC=4 486 | 487 | LLM_BINDING=openai 488 | LLM_MODEL=gpt-4o-mini 489 | LLM_BINDING_HOST=https://api.openai.com/v1 490 | LLM_BINDING_API_KEY=your-api-key 491 | 492 | ### Embedding Configuration (Use valid host. 
For local services installed with docker, you can use host.docker.internal) 493 | # see also env.ollama-binding-options.example for fine tuning ollama 494 | EMBEDDING_MODEL=bge-m3:latest 495 | EMBEDDING_DIM=1024 496 | EMBEDDING_BINDING=ollama 497 | EMBEDDING_BINDING_HOST=http://localhost:11434 498 | 499 | ### For JWT Auth 500 | # AUTH_ACCOUNTS='admin:admin123,user1:pass456' 501 | # TOKEN_SECRET=your-key-for-LightRAG-API-Server-xxx 502 | # TOKEN_EXPIRE_HOURS=48 503 | 504 | # LIGHTRAG_API_KEY=your-secure-api-key-here-123 505 | # WHITELIST_PATHS=/api/* 506 | # WHITELIST_PATHS=/health,/api/* 507 | 508 | ``` 509 | 510 | ## Document and Chunk Processing Logic Clarification 511 | 512 | The document processing pipeline in LightRAG is somewhat complex and is divided into two primary stages: the Extraction stage (entity and relationship extraction) and the Merging stage (entity and relationship merging). There are two key parameters that control pipeline concurrency: the maximum number of files processed in parallel (MAX_PARALLEL_INSERT) and the maximum number of concurrent LLM requests (MAX_ASYNC). The workflow is described as follows: 513 | 514 | 1. MAX_ASYNC limits the total number of concurrent LLM requests in the system, including those for querying, extraction, and merging. LLM requests have different priorities: query operations have the highest priority, followed by merging, and then extraction. 515 | 2. MAX_PARALLEL_INSERT controls the number of files processed in parallel during the extraction stage. For optimal performance, MAX_PARALLEL_INSERT is recommended to be set between 2 and 10, typically MAX_ASYNC/3. Setting this value too high can increase the likelihood of naming conflicts among entities and relationships across different documents during the merge phase, thereby reducing the overall efficiency of that phase. 516 | 3. Within a single file, entity and relationship extractions from different text blocks are processed concurrently, with the degree of concurrency set by MAX_ASYNC. Only after MAX_ASYNC text blocks are processed will the system proceed to the next batch within the same file. 517 | 4. When a file completes entity and relationship extraction, it enters the entity and relationship merging stage. This stage also processes multiple entities and relationships concurrently, with the concurrency level also controlled by `MAX_ASYNC`. 518 | 5. LLM requests for the merging stage are prioritized over the extraction stage to ensure that files in the merging phase are processed quickly and their results are promptly updated in the vector database. 519 | 6. To prevent race conditions, the merging stage avoids concurrent processing of the same entity or relationship. When multiple files involve the same entity or relationship that needs to be merged, they are processed serially. 520 | 7. Each file is treated as an atomic processing unit in the pipeline. A file is marked as successfully processed only after all its text blocks have completed extraction and merging. If any error occurs during processing, the entire file is marked as failed and must be reprocessed. 521 | 8. When a file is reprocessed due to errors, previously processed text blocks can be quickly skipped thanks to LLM caching. Although LLM cache is also utilized during the merging stage, inconsistencies in merging order may limit its effectiveness in this stage. 522 | 9. If an error occurs during extraction, the system does not retain any intermediate results.
If an error occurs during merging, already merged entities and relationships might be preserved; when the same file is reprocessed, re-extracted entities and relationships will be merged with the existing ones, without impacting the query results. 523 | 10. At the end of the merging stage, all entity and relationship data are updated in the vector database. Should an error occur at this point, some updates may be retained. However, the next processing attempt will overwrite previous results, ensuring that successfully reprocessed files do not affect the integrity of future query results. 524 | 525 | Large files should be divided into smaller segments to enable incremental processing. Reprocessing of failed files can be initiated by pressing the "Scan" button on the web UI. 526 | 527 | ## API Endpoints 528 | 529 | All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality. When the API Server is running, visit: 530 | 531 | - Swagger UI: http://localhost:9621/docs 532 | - ReDoc: http://localhost:9621/redoc 533 | 534 | You can test the API endpoints using the provided curl commands or through the Swagger UI interface. Make sure to: 535 | 536 | 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI) 537 | 2. Start the RAG server 538 | 3. Upload some documents using the document management endpoints 539 | 4. Query the system using the query endpoints 540 | 5. Trigger document scan if new files are put into the inputs directory 541 | 542 | ## Asynchronous Document Indexing with Progress Tracking 543 | 544 | LightRAG implements asynchronous document indexing to enable frontend monitoring and querying of document processing progress. Upon uploading files or inserting text through designated endpoints, a unique Track ID is returned to facilitate real-time progress monitoring. 545 | 546 | **API Endpoints Supporting Track ID Generation:** 547 | 548 | * `/documents/upload` 549 | * `/documents/text` 550 | * `/documents/texts` 551 | 552 | **Document Processing Status Query Endpoint:** 553 | * `/track_status/{track_id}` 554 | 555 | This endpoint provides comprehensive status information including: 556 | * Document processing status (pending/processing/processed/failed) 557 | * Content summary and metadata 558 | * Error messages if processing failed 559 | * Timestamps for creation and updates 560 | -------------------------------------------------------------------------------- /lightrag/llm/Readme.md: -------------------------------------------------------------------------------- 1 | 2 | 1. **LlamaIndex** (`llm/llama_index.py`): 3 | - Provides integration with OpenAI and other providers through LlamaIndex 4 | - Supports both direct API access and proxy services like LiteLLM 5 | - Handles embeddings and completions with consistent interfaces 6 | - See example implementations: 7 | - [Direct OpenAI Usage](../../examples/lightrag_llamaindex_direct_demo.py) 8 | - [LiteLLM Proxy Usage](../../examples/lightrag_llamaindex_litellm_demo.py) 9 | 10 |
11 | Using LlamaIndex 12 | 13 | LightRAG supports LlamaIndex for embeddings and completions in two ways: direct OpenAI usage or through LiteLLM proxy. 14 | 15 | ### Setup 16 | 17 | First, install the required dependencies: 18 | ```bash 19 | pip install llama-index-llms-litellm llama-index-embeddings-litellm 20 | ``` 21 | 22 | ### Standard OpenAI Usage 23 | 24 | ```python 25 | from lightrag import LightRAG 26 | from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed 27 | from llama_index.embeddings.openai import OpenAIEmbedding 28 | from llama_index.llms.openai import OpenAI 29 | from lightrag.utils import EmbeddingFunc 30 | 31 | # Initialize with direct OpenAI access 32 | async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs): 33 | try: 34 | # Initialize OpenAI if not in kwargs 35 | if 'llm_instance' not in kwargs: 36 | llm_instance = OpenAI( 37 | model="gpt-4", 38 | api_key="your-openai-key", 39 | temperature=0.7, 40 | ) 41 | kwargs['llm_instance'] = llm_instance 42 | 43 | response = await llama_index_complete_if_cache( 44 | kwargs['llm_instance'], 45 | prompt, 46 | system_prompt=system_prompt, 47 | history_messages=history_messages, 48 | **kwargs, 49 | ) 50 | return response 51 | except Exception as e: 52 | logger.error(f"LLM request failed: {str(e)}") 53 | raise 54 | 55 | # Initialize LightRAG with OpenAI 56 | rag = LightRAG( 57 | working_dir="your/path", 58 | llm_model_func=llm_model_func, 59 | embedding_func=EmbeddingFunc( 60 | embedding_dim=1536, 61 | func=lambda texts: llama_index_embed( 62 | texts, 63 | embed_model=OpenAIEmbedding( 64 | model="text-embedding-3-large", 65 | api_key="your-openai-key" 66 | ) 67 | ), 68 | ), 69 | ) 70 | ``` 71 | 72 | ### Using LiteLLM Proxy 73 | 74 | 1. Use any LLM provider through LiteLLM 75 | 2. Leverage LlamaIndex's embedding and completion capabilities 76 | 3. 
Maintain consistent configuration across services 77 | 78 | ```python 79 | from lightrag import LightRAG 80 | from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed 81 | from llama_index.llms.litellm import LiteLLM 82 | from llama_index.embeddings.litellm import LiteLLMEmbedding 83 | from lightrag.utils import EmbeddingFunc 84 | 85 | # Initialize with LiteLLM proxy 86 | async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs): 87 | try: 88 | # Initialize LiteLLM if not in kwargs 89 | if 'llm_instance' not in kwargs: 90 | llm_instance = LiteLLM( 91 | model=f"openai/{settings.LLM_MODEL}", # Format: "provider/model_name" 92 | api_base=settings.LITELLM_URL, 93 | api_key=settings.LITELLM_KEY, 94 | temperature=0.7, 95 | ) 96 | kwargs['llm_instance'] = llm_instance 97 | 98 | response = await llama_index_complete_if_cache( 99 | kwargs['llm_instance'], 100 | prompt, 101 | system_prompt=system_prompt, 102 | history_messages=history_messages, 103 | **kwargs, 104 | ) 105 | return response 106 | except Exception as e: 107 | logger.error(f"LLM request failed: {str(e)}") 108 | raise 109 | 110 | # Initialize LightRAG with LiteLLM 111 | rag = LightRAG( 112 | working_dir="your/path", 113 | llm_model_func=llm_model_func, 114 | embedding_func=EmbeddingFunc( 115 | embedding_dim=1536, 116 | func=lambda texts: llama_index_embed( 117 | texts, 118 | embed_model=LiteLLMEmbedding( 119 | model_name=f"openai/{settings.EMBEDDING_MODEL}", 120 | api_base=settings.LITELLM_URL, 121 | api_key=settings.LITELLM_KEY, 122 | ) 123 | ), 124 | ), 125 | ) 126 | ``` 127 | 128 | ### Environment Variables 129 | 130 | For OpenAI direct usage: 131 | ```bash 132 | OPENAI_API_KEY=your-openai-key 133 | ``` 134 | 135 | For LiteLLM proxy: 136 | ```bash 137 | # LiteLLM Configuration 138 | LITELLM_URL=http://litellm:4000 139 | LITELLM_KEY=your-litellm-key 140 | 141 | # Model Configuration 142 | LLM_MODEL=gpt-4 143 | EMBEDDING_MODEL=text-embedding-3-large 144 | ``` 145 | 146 | ### Key Differences 147 | 1. **Direct OpenAI**: 148 | - Simpler setup 149 | - Direct API access 150 | - Requires OpenAI API key 151 | 152 | 2. **LiteLLM Proxy**: 153 | - Model provider agnostic 154 | - Centralized API key management 155 | - Support for multiple providers 156 | - Better cost control and monitoring 157 | 158 |
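Either configuration produces a regular `LightRAG` instance, so indexing and querying work the same way afterwards. Below is a minimal usage sketch for the `rag` object built in the examples above; it assumes the synchronous `insert`/`query` helpers and `QueryParam` from the core library, and recent releases may additionally require an explicit async initialization step (such as awaiting `rag.initialize_storages()`) before first use — follow the initialization instructions in the main README for your installed version.

```python
# Minimal usage sketch for a configured `rag` instance (see the examples above).
# Assumes the synchronous insert/query helpers of the core library; newer
# releases may require an explicit async initialization step first (see the
# main README), so treat this as an outline rather than a drop-in script.
from lightrag import QueryParam

# Index a document; chunking, entity extraction and graph merging happen here.
rag.insert("Scrooge was a cold, tight-fisted old man who ran a counting-house in London.")

# Query in hybrid mode (combining local entity context and global relation context).
answer = rag.query(
    "Who is Scrooge and what does he do?",
    param=QueryParam(mode="hybrid"),
)
print(answer)
```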
159 | -------------------------------------------------------------------------------- /lightrag/tools/lightrag_visualizer/README-zh.md: -------------------------------------------------------------------------------- 1 | # 3D GraphML Viewer 2 | 3 | 一个基于 Dear ImGui 和 ModernGL 的交互式 3D 图可视化工具。 4 | 5 | ## 功能特点 6 | 7 | - **3D 交互式可视化**: 使用 ModernGL 实现高性能的 3D 图形渲染 8 | - **多种布局算法**: 支持多种图布局方式 9 | - Spring 布局 10 | - Circular 布局 11 | - Shell 布局 12 | - Random 布局 13 | - **社区检测**: 支持图社区结构的自动检测和可视化 14 | - **交互控制**: 15 | - WASD + QE 键控制相机移动 16 | - 鼠标右键拖拽控制视角 17 | - 节点选择和高亮 18 | - 可调节节点大小和边宽度 19 | - 可控制标签显示 20 | - 可在节点的Connections间快速跳转 21 | - **社区检测**: 支持图社区结构的自动检测和可视化 22 | - **交互控制**: 23 | - WASD + QE 键控制相机移动 24 | - 鼠标右键拖拽控制视角 25 | - 节点选择和高亮 26 | - 可调节节点大小和边宽度 27 | - 可控制标签显示 28 | 29 | ## 技术栈 30 | 31 | - **imgui_bundle**: 用户界面 32 | - **ModernGL**: OpenGL 图形渲染 33 | - **NetworkX**: 图数据结构和算法 34 | - **NumPy**: 数值计算 35 | - **community**: 社区检测 36 | 37 | ## 使用方法 38 | 39 | 1. **启动程序**: 40 | ```bash 41 | pip install lightrag-hku[tools] 42 | lightrag-viewer 43 | ``` 44 | 45 | 2. **加载字体**: 46 | - 将中文字体文件 `font.ttf` 放置在 `assets` 目录下 47 | - 或者修改 `CUSTOM_FONT` 常量来使用其他字体文件 48 | 49 | 3. **加载图文件**: 50 | - 点击界面上的 "Load GraphML" 按钮 51 | - 选择 GraphML 格式的图文件 52 | 53 | 4. **交互控制**: 54 | - **相机移动**: 55 | - W: 前进 56 | - S: 后退 57 | - A: 左移 58 | - D: 右移 59 | - Q: 上升 60 | - E: 下降 61 | - **视角控制**: 62 | - 按住鼠标右键拖动来旋转视角 63 | - **节点交互**: 64 | - 鼠标悬停可高亮节点 65 | - 点击可选中节点 66 | 67 | 5. **可视化设置**: 68 | - 可通过 UI 控制面板调整: 69 | - 布局类型 70 | - 节点大小 71 | - 边的宽度 72 | - 标签显示 73 | - 标签大小 74 | - 背景颜色 75 | 76 | ## 自定义设置 77 | 78 | - **节点缩放**: 通过 `node_scale` 参数调整节点大小 79 | - **边宽度**: 通过 `edge_width` 参数调整边的宽度 80 | - **标签显示**: 可通过 `show_labels` 开关标签显示 81 | - **标签大小**: 使用 `label_size` 调整标签大小 82 | - **标签颜色**: 通过 `label_color` 设置标签颜色 83 | - **视距控制**: 使用 `label_culling_distance` 控制标签显示的最大距离 84 | 85 | ## 性能优化 86 | 87 | - 使用 ModernGL 进行高效的图形渲染 88 | - 视距裁剪优化标签显示 89 | - 社区检测算法优化大规模图的可视化效果 90 | 91 | ## 系统要求 92 | 93 | - Python 3.10+ 94 | - OpenGL 3.3+ 兼容的显卡 95 | - 支持的操作系统:Windows/Linux/MacOS 96 | -------------------------------------------------------------------------------- /lightrag/tools/lightrag_visualizer/README.md: -------------------------------------------------------------------------------- 1 | # LightRAG 3D Graph Viewer 2 | 3 | An interactive 3D graph visualization tool included in the LightRAG package for visualizing and analyzing RAG (Retrieval-Augmented Generation) graphs and other graph structures. 
4 | 5 | ![image](https://github.com/user-attachments/assets/b0d86184-99fc-468c-96ed-c611f14292bf) 6 | 7 | ## Installation 8 | 9 | ### Quick Install 10 | ```bash 11 | pip install lightrag-hku[tools] # Install with visualization tool only 12 | # or 13 | pip install lightrag-hku[api,tools] # Install with both API and visualization tools 14 | ``` 15 | 16 | ## Launch the Viewer 17 | ```bash 18 | lightrag-viewer 19 | ``` 20 | 21 | ## Features 22 | 23 | - **3D Interactive Visualization**: High-performance 3D graphics rendering using ModernGL 24 | - **Multiple Layout Algorithms**: Support for various graph layouts 25 | - Spring layout 26 | - Circular layout 27 | - Shell layout 28 | - Random layout 29 | - **Community Detection**: Automatic detection and visualization of graph community structures 30 | - **Interactive Controls**: 31 | - WASD + QE keys for camera movement 32 | - Right mouse drag for view angle control 33 | - Node selection and highlighting 34 | - Adjustable node size and edge width 35 | - Configurable label display 36 | - Quick navigation between node connections 37 | 38 | ## Tech Stack 39 | 40 | - **imgui_bundle**: User interface 41 | - **ModernGL**: OpenGL graphics rendering 42 | - **NetworkX**: Graph data structures and algorithms 43 | - **NumPy**: Numerical computations 44 | - **community**: Community detection 45 | 46 | ## Interactive Controls 47 | 48 | ### Camera Movement 49 | - W: Move forward 50 | - S: Move backward 51 | - A: Move left 52 | - D: Move right 53 | - Q: Move up 54 | - E: Move down 55 | 56 | ### View Control 57 | - Hold right mouse button and drag to rotate view 58 | 59 | ### Node Interaction 60 | - Hover mouse to highlight nodes 61 | - Click to select nodes 62 | 63 | ## Visualization Settings 64 | 65 | Adjustable via UI control panel: 66 | - Layout type 67 | - Node size 68 | - Edge width 69 | - Label visibility 70 | - Label size 71 | - Background color 72 | 73 | ## Customization Options 74 | 75 | - **Node Scaling**: Adjust node size via `node_scale` parameter 76 | - **Edge Width**: Modify edge width using `edge_width` parameter 77 | - **Label Display**: Toggle label visibility with `show_labels` 78 | - **Label Size**: Adjust label size using `label_size` 79 | - **Label Color**: Set label color through `label_color` 80 | - **View Distance**: Control maximum label display distance with `label_culling_distance` 81 | 82 | ## System Requirements 83 | 84 | - Python 3.9+ 85 | - Graphics card with OpenGL 3.3+ support 86 | - Supported Operating Systems: Windows/Linux/MacOS 87 | 88 | ## Troubleshooting 89 | 90 | ### Common Issues 91 | 92 | 1. **Command Not Found** 93 | ```bash 94 | # Make sure you installed with the 'tools' option 95 | pip install lightrag-hku[tools] 96 | 97 | # Verify installation 98 | pip list | grep lightrag-hku 99 | ``` 100 | 101 | 2. **ModernGL Initialization Failed** 102 | ```bash 103 | # Check OpenGL version 104 | glxinfo | grep "OpenGL version" 105 | 106 | # Update graphics drivers if needed 107 | ``` 108 | 109 | 3. 
**Font Loading Issues** 110 | - The required fonts are included in the package 111 | - If issues persist, check your graphics drivers 112 | 113 | ## Usage with LightRAG 114 | 115 | The viewer is particularly useful for: 116 | - Visualizing RAG knowledge graphs 117 | - Analyzing document relationships 118 | - Exploring semantic connections 119 | - Debugging retrieval patterns 120 | 121 | ## Performance Optimizations 122 | 123 | - Efficient graphics rendering using ModernGL 124 | - View distance culling for label display optimization 125 | - Community detection algorithms for optimized visualization of large-scale graphs 126 | 127 | ## Support 128 | 129 | - GitHub Issues: [LightRAG Repository](https://github.com/HKUDS/LightRAG) 130 | - Documentation: [LightRAG Docs](https://URL-to-docs) 131 | 132 | ## License 133 | 134 | This tool is part of LightRAG and is distributed under the MIT License. See `LICENSE` for more information. 135 | 136 | Note: This visualization tool is an optional component of the LightRAG package. Install with the [tools] option to access the viewer functionality. 137 | -------------------------------------------------------------------------------- /lightrag_webui/README.md: -------------------------------------------------------------------------------- 1 | # LightRAG WebUI 2 | 3 | LightRAG WebUI is a React-based web interface for interacting with the LightRAG system. It provides a user-friendly interface for querying, managing, and exploring LightRAG's functionalities. 4 | 5 | ## Installation 6 | 7 | 1. **Install Bun:** 8 | 9 | If you haven't already installed Bun, follow the official documentation: [https://bun.sh/docs/installation](https://bun.sh/docs/installation) 10 | 11 | 2. **Install Dependencies:** 12 | 13 | In the `lightrag_webui` directory, run the following command to install project dependencies: 14 | 15 | ```bash 16 | bun install --frozen-lockfile 17 | ``` 18 | 19 | 3. **Build the Project:** 20 | 21 | Run the following command to build the project: 22 | 23 | ```bash 24 | bun run build --emptyOutDir 25 | ``` 26 | 27 | This command will bundle the project and output the built files to the `lightrag/api/webui` directory. 28 | 29 | ## Development 30 | 31 | - **Start the Development Server:** 32 | 33 | If you want to run the WebUI in development mode, use the following command: 34 | 35 | ```bash 36 | bun run dev 37 | ``` 38 | 39 | ## Script Commands 40 | 41 | The following are some commonly used script commands defined in `package.json`: 42 | 43 | - `bun install`: Installs project dependencies. 44 | - `bun run dev`: Starts the development server. 45 | - `bun run build`: Builds the project. 46 | - `bun run lint`: Runs the linter. 47 | -------------------------------------------------------------------------------- /paging.md: -------------------------------------------------------------------------------- 1 | # 文档列表页面分页显示功能改造方案 2 | 3 | ## 一、改造目标 4 | 5 | ### 问题现状 6 | - 当前文档页面一次性加载所有文档,导致大量文档时界面加载慢 7 | - 前端内存占用过大,用户操作体验差 8 | - 状态过滤和排序都在前端进行,效率低下 9 | 10 | ### 改造目标 11 | - 实现后端分页查询,减少单次数据传输量 12 | - 添加分页控制组件,支持翻页和跳转功能 13 | - 允许用户设置每页显示行数(10-200条) 14 | - 保持现有状态过滤和排序功能不变 15 | - 提升大数据量场景下的性能表现 16 | 17 | ## 二、总体架构设计 18 | 19 | ### 设计原则 20 | 1. **统一分页接口**:后端提供统一的分页API,支持状态过滤和排序 21 | 2. **智能刷新策略**:根据处理状态选择合适的刷新频率和范围 22 | 3. **即时用户反馈**:状态切换、分页操作提供立即响应 23 | 4. **向后兼容**:保持现有功能完整性,不影响现有操作流程 24 | 5. 
**性能优化**:减少内存占用,优化网络请求 25 | 26 | ### 技术方案 27 | - **后端**:在现有存储层基础上添加分页查询接口 28 | - **前端**:改造DocumentManager组件,添加分页控制 29 | - **数据流**:统一分页查询 + 独立状态计数查询 30 | 31 | ## 三、后端改造步骤 32 | 33 | ### 步骤1:存储层接口扩展 34 | 35 | **改动文件**:`lightrag/kg/base.py` 36 | 37 | **关键思路**: 38 | - 在BaseDocStatusStorage抽象类中添加分页查询方法 39 | - 设计统一的分页接口,支持状态过滤、排序、分页参数 40 | - 返回文档列表和总数量的元组 41 | 42 | **接口设计要点**: 43 | ``` 44 | get_docs_paginated(status_filter, page, page_size, sort_field, sort_direction) -> (documents, total_count) 45 | count_by_status(status) -> int 46 | get_all_status_counts() -> Dict[str, int] 47 | ``` 48 | 49 | ### 步骤2:各存储后端实现 50 | 51 | **改动文件**: 52 | - `lightrag/kg/postgres_impl.py` 53 | - `lightrag/kg/mongo_impl.py` 54 | - `lightrag/kg/redis_impl.py` 55 | - `lightrag/kg/json_doc_status_impl.py` 56 | 57 | **PostgreSQL实现要点**: 58 | - 使用LIMIT和OFFSET实现分页 59 | - 构建动态WHERE条件支持状态过滤 60 | - 使用COUNT查询获取总数量 61 | - 添加合适的数据库索引优化查询性能 62 | 63 | **MongoDB实现要点**: 64 | - 使用skip()和limit()实现分页 65 | - 使用聚合管道进行状态统计 66 | - 优化查询条件和索引 67 | 68 | **Redis 与 Json实现要点:** 69 | 70 | * 考虑先用简单的方式实现,即把所有文件清单读到内存中后进行过滤和排序 71 | 72 | **关键考虑**: 73 | 74 | - 确保各存储后端的分页逻辑一致性 75 | - 处理边界情况(空结果、超出页码范围等) 76 | - 优化查询性能,避免全表扫描 77 | 78 | ### 步骤3:API路由层改造 79 | 80 | **改动文件**:`lightrag/api/routers/document_routes.py` 81 | 82 | **新增接口**: 83 | 1. `POST /documents/paginated` - 分页查询文档 84 | 2. `GET /documents/status_counts` - 获取状态计数 85 | 86 | **数据模型设计**: 87 | - DocumentsRequest:分页请求参数 88 | - PaginatedDocsResponse:分页响应数据 89 | - PaginationInfo:分页元信息 90 | 91 | **关键逻辑**: 92 | - 参数验证(页码范围、页面大小限制) 93 | - 并行查询分页数据和状态计数 94 | - 错误处理和异常响应 95 | 96 | ### 步骤4:数据库优化 97 | 98 | **索引策略**: 99 | - 为workspace + status + updated_at创建复合索引 100 | - 为workspace + status + created_at创建复合索引 101 | - 为workspace + updated_at创建索引 102 | - 为workspace + created_at创建索引 103 | 104 | **性能考虑**: 105 | - 避免深度分页的性能问题 106 | - 考虑添加缓存层优化状态计数查询 107 | - 监控查询性能,必要时调整索引策略 108 | 109 | ## 四、前端改造步骤 110 | 111 | ### 步骤1:API客户端扩展 112 | 113 | **改动文件**:`lightrag_webui/src/api/lightrag.ts` 114 | 115 | **新增函数**: 116 | - `getDocumentsPaginated()` - 分页查询文档 117 | - `getDocumentStatusCounts()` - 获取状态计数 118 | 119 | **类型定义**: 120 | - 定义分页请求和响应的TypeScript类型 121 | - 确保类型安全和代码提示 122 | 123 | ### 步骤2:分页控制组件开发 124 | 125 | **新增文件**:`lightrag_webui/src/components/ui/PaginationControls.tsx` 126 | 127 | **组件功能**: 128 | - 支持紧凑模式和完整模式 129 | - 页码输入和跳转功能 130 | - 每页显示数量选择(10-200) 131 | - 总数信息显示 132 | - 禁用状态处理 133 | 134 | **设计要点**: 135 | - 响应式设计,适配不同屏幕尺寸 136 | - 防抖处理,避免频繁请求 137 | - 错误处理和状态回滚 138 | - 组件摆放位置:目前状态按钮上方,与scan按钮同一层,居中摆放 139 | 140 | ### 步骤3:状态过滤按钮优化 141 | 142 | **改动文件**:现有状态过滤相关组件 143 | 144 | **优化要点**: 145 | 146 | - 添加加载状态指示 147 | - 数据不足时的智能提示 148 | - 定期刷新数据,状态切换时如果最先的状态数据距离上次刷新数据超过5秒应即时刷新数据 149 | - 防止重复点击和并发请求 150 | 151 | ### 步骤4:主组件DocumentManager改造 152 | 153 | **改动文件**:`lightrag_webui/src/features/DocumentManager.tsx` 154 | 155 | **核心改动**: 156 | 157 | **状态管理重构**: 158 | - 将docs状态改为currentPageDocs(仅存储当前页数据) 159 | - 添加pagination状态管理分页信息 160 | - 添加statusCounts状态独立管理状态计数 161 | - 添加加载状态管理(isStatusChanging, isRefreshing) 162 | 163 | **数据获取策略**: 164 | - 实现智能刷新:活跃期完整刷新,稳定期轻量刷新 165 | - 状态切换时立即刷新数据 166 | - 分页操作时立即更新数据 167 | - 定期刷新与手动操作协调 168 | 169 | **布局调整**: 170 | - 将分页控制组件放置在顶部操作栏中间位置 171 | - 保持状态过滤按钮在表格上方 172 | - 确保响应式布局适配 173 | 174 | **事件处理优化**: 175 | - 状态切换时,如果当前页码数据不足,则重置到第一页 176 | - 页面大小变更时智能计算新页码 177 | - 错误时状态回滚机制 178 | 179 | ## 五、用户体验优化 180 | 181 | ### 即时反馈机制 182 | - 状态切换时显示加载动画 183 | - 分页操作时提供视觉反馈 184 | - 数据不足时智能提示用户 185 | 186 | ### 错误处理策略 187 | - 网络错误时自动重试 188 | - 操作失败时状态回滚 189 | - 友好的错误提示信息 190 | 191 | ### 性能优化措施 192 | - 防抖处理频繁操作 
193 | - 智能刷新策略减少不必要请求 194 | - 组件卸载时清理定时器和请求 195 | 196 | ## 六、兼容性保障 197 | 198 | ### 向后兼容 199 | - 保留原有的/documents接口作为备用 200 | - 现有功能(排序、过滤、选择)保持不变 201 | - 渐进式升级,支持配置开关 202 | 203 | ### 数据一致性 204 | - 确保分页数据与状态计数同步 205 | - 处理并发更新的数据一致性问题 206 | - 定期刷新保持数据最新 207 | 208 | ## 七、测试策略 209 | 210 | ### 功能测试 211 | - 各种分页场景测试 212 | - 状态过滤组合测试 213 | - 排序功能验证 214 | - 边界条件测试 215 | 216 | ### 性能测试 217 | - 大数据量场景测试 218 | - 并发访问压力测试 219 | - 内存使用情况监控 220 | - 响应时间测试 221 | 222 | ### 兼容性测试 223 | - 不同存储后端测试 224 | - 不同浏览器兼容性 225 | - 移动端响应式测试 226 | 227 | ## 八、关键实现细节 228 | 229 | ### 后端分页查询设计 230 | - **统一接口**:所有存储后端实现相同的分页接口签名 231 | - **参数验证**:严格验证页码、页面大小、排序参数的合法性 232 | - **性能优化**:使用数据库原生分页功能,避免应用层分页 233 | - **错误处理**:统一的错误响应格式和异常处理机制 234 | 235 | ### 前端状态管理策略 236 | - **数据分离**:当前页数据与状态计数分别管理 237 | - **智能刷新**:根据文档处理状态选择刷新策略 238 | - **状态同步**:确保UI状态与后端数据保持一致 239 | - **错误恢复**:操作失败时自动回滚到之前状态 240 | 241 | ### 分页控制组件设计 242 | - **紧凑布局**:适配顶部操作栏的空间限制 243 | - **响应式设计**:在不同屏幕尺寸下自适应布局 244 | - **交互优化**:防抖处理、加载状态、禁用状态管理 245 | - **可访问性**:支持键盘导航和屏幕阅读器 246 | 247 | ### 数据库索引优化 248 | - **复合索引**:workspace + status + sort_field的组合索引 249 | - **覆盖索引**:尽可能使用覆盖索引减少回表查询 250 | - **索引监控**:定期监控索引使用情况和查询性能 251 | - **渐进优化**:根据实际使用情况调整索引策略 252 | --------------------------------------------------------------------------------