├── CONTRIBUTING.md ├── LICENSE └── README.md /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contribution Guidelines 2 | 3 | Thank you for your interest in contributing to the **RAGHub** project! We appreciate your support in making this a valuable resource for the community. 4 | 5 | ## How to Contribute 6 | 7 | We welcome contributions from everyone. If you'd like to add a new framework, project, or resource, follow these steps to get started: 8 | 9 | 1. **Fork this repository**. 10 | 2. **Create a new branch**: Use a descriptive name for your branch. For example: `git checkout -b add-framework-yourname`. 11 | 3. **Make your changes**: Update the appropriate section in the `README.md` by following the table format below. 12 | 4. **Update the Table of Contents (TOC)**: If you are adding a new category or section, make sure to add a link to the TOC. 13 | 5. **Submit a pull request (PR)**: Once your changes are complete, submit a PR to the main branch for review. 14 | 15 | ### Contribution Format 16 | 17 | #### Adding a new resource (Frameworks, projects etc.): 18 | 19 | If you're adding a resource, please insert a new row into the correct category table in the README.md file, using the following format: 20 | 21 | ```md 22 | | Framework Name | Description (max 80 char) | Website link | Github link | Github stars | Last activity on Github | 23 | ``` 24 | 25 | Example: 26 | 27 | ```md 28 | | LangChain | A framework for building applications with LLMs. | [Website](https://langchain.com) | [Github](https://github.com/langchain-ai/langchain) | 93.2k | 9h ago | 29 | ``` 30 | 31 | If you don't see a category that fits your resource, feel free to create a new category by adding it both to the README.md and updating the Table of Contents accordingly. 
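The row format above can be sketched as a small helper. This is only an illustration, not tooling that ships with the repository: `format_resource_row`, its parameters, and the `-` placeholder for a missing link are hypothetical choices (the placeholder mirrors rows already in the README), and the 80-character limit is taken from the "max 80 char" note in the format line.

```python
from typing import Optional


def format_resource_row(
    name: str,
    description: str,
    website: Optional[str],
    github: Optional[str],
    stars: str,
    last_activity: str,
    max_description: int = 80,
) -> str:
    """Build one Markdown table row in the column order given in CONTRIBUTING.md.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if len(description) > max_description:
        raise ValueError(
            f"description is {len(description)} characters; limit is {max_description}"
        )
    # A missing link is written as "-" (assumption, matching existing README rows).
    website_cell = f"[Website]({website})" if website else "-"
    github_cell = f"[Github]({github})" if github else "-"
    cells = [name, description, website_cell, github_cell, stars, last_activity]
    return "| " + " | ".join(cells) + " |"


if __name__ == "__main__":
    # Rebuilds the LangChain example row from the guidelines.
    print(
        format_resource_row(
            "LangChain",
            "A framework for building applications with LLMs.",
            "https://langchain.com",
            "https://github.com/langchain-ai/langchain",
            "93.2k",
            "9h ago",
        )
    )
```

Generating the row this way keeps the column order consistent and catches over-long descriptions before a pull request is opened.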
32 | 33 | ### Updating the Table of Contents (TOC) 34 | 35 | If you're adding a new category or section, please ensure that you update the Table of Contents to reflect your changes. Use the following format to add new categories: 36 | 37 | ```md 38 | - [New Category Name](#new-category-name) 39 | ``` 40 | 41 | For example, if you're adding a new section called RAG Best Practices, add it like this in the TOC: 42 | 43 | ```md 44 | - [RAG Best Practices](#rag-best-practices) 45 | ``` 46 | 47 | ### Final Steps 48 | 49 | Once you've added your contribution, please: 50 | 51 | 1. Ensure that your changes follow the format provided and are consistent with the current style. 52 | 2. Double-check that all links work and point to the correct URLs. 53 | 3. Submit your pull request. We will review it and provide feedback if needed. 54 | 55 | Thank you again for contributing to **RAGHub**! Your help makes this a valuable resource for the community. 56 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Andrew Jang 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # RAGHub: A Directory of Tools for Retrieval-Augmented Generation (RAG) 2 | 3 | Welcome to **RAGHub**, a living collection of **new and emerging frameworks, projects, and resources** in the **Retrieval-Augmented Generation (RAG)** ecosystem. This is a **community-driven project for [r/RAG](https://www.reddit.com/r/Rag/)**, where we aim to catalog the rapid growth of RAG tools and projects that are pushing the boundaries of the field. 4 | 5 | Each day, it feels like a new tool or framework emerges, and choosing the right one is becoming more of an art than a science. Is the framework from three months ago still relevant? Or was it just hype, rehashing old concepts with a fresh look? **RAGHub exists to help you stay ahead of these changes**, providing a platform for the latest innovations in RAG. 6 | 7 | ## How to Contribute 8 | 9 | This is a community project, and **we welcome contributions from everyone**! If you’d like to add a new framework, project, or resource, please check out our [Contribution Guidelines](CONTRIBUTING.md) for details on how to get started. 
10 | 11 | ## Table of Contents 12 | 13 | - [RAGHub: A Directory of Tools for Retrieval-Augmented Generation (RAG)](#raghub-a-directory-of-tools-for-retrieval-augmented-generation-rag) 14 | - [How to Contribute](#how-to-contribute) 15 | - [Table of Contents](#table-of-contents) 16 | - [RAG Frameworks](#rag-frameworks) 17 | - [RAG Evaluation and Optimization Frameworks](#rag-evaluation-and-optimization-frameworks) 18 | - [RAG Engines](#rag-engines) 19 | - [RAG Data Preparation Frameworks](#rag-data-preparation-frameworks) 20 | - [RAG Projects](#rag-projects) 21 | - [RAG Resources and Sites](#rag-resources-and-sites) 22 | - [Model LeaderBoards](#model-leaderboards) 23 | - [License](#license) 24 | - [Join the Conversation](#join-the-conversation) 25 | 26 | ## RAG Frameworks 27 | 28 | | Name | Description | Website | Github | Stars | Activity | 29 | | --------------------------- | ------------------------------------------------------------------ | ---------------------------------------------------------- | --------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- | ---------- | 30 | | Dcup Open-Source RAG-as-a-Service | Connect your app to user data in minutes with self-hostable RAG pipelines. 
| [Website](https://dcup.dev) | [Github](https://github.com/Dcup-dev/dcup) | [![GitHub](https://img.shields.io/github/stars/Dcup-dev/dcup?color=5B5BD6)](https://github.com/Dcup-dev/) | 1h ago | 31 | | LangChain | Building applications with LLMs | [Website](https://langchain.com) | [Github](https://github.com/langchain-ai/langchain) | [![GitHub](https://img.shields.io/github/stars/langchain-ai/langchain?color=5B5BD6)](https://github.com/langchain-ai/langchain) | 9h ago | 32 | | Scout | Building apps with LLMs/vector databases/web scraping | [Website](https://scoutos.com) | [Github](https://github.com/scoutos) | [![GitHub](https://img.shields.io/github/stars/scoutos?color=5B5BD6)](https://github.com/scoutos) | 1h ago | 33 | | Haystack | A framework for building search engines using neural networks | [Website](https://haystack.deepset.ai) | [Github](https://github.com/deepset-ai/haystack) | [![GitHub](https://img.shields.io/github/stars/deepset-ai/haystack?color=5B5BD6)](https://github.com/deepset-ai/haystack) | Last week | 34 | | LlamaIndex | A framework for building data-driven LLM applications | [Website](https://www.llamaindex.ai/) | [Github](https://github.com/run-llama/llama_index) | [![GitHub](https://img.shields.io/github/stars/run-llama/llama_index?color=5B5BD6)](https://github.com/run-llama/llama_index) | 7h ago | 35 | | BentoML | Build Inference APIs, LLM apps, Multi-model chains, RAG | [Website](https://www.bentoml.com/) | [Github](https://github.com/bentoml/BentoML) | [![GitHub](https://img.shields.io/github/stars/bentoml/BentoML?color=5B5BD6)](https://github.com/bentoml/BentoML) | 1h ago | 36 | | LightRAG | Simple and fast Retrieval-Augmented Generation | [Website](https://arxiv.org/abs/2410.05779) | [Github](https://github.com/HKUDS/LightRAG) | [![GitHub](https://img.shields.io/github/stars/HKUDS/LightRAG?color=5B5BD6)](https://github.com/HKUDS/LightRAG) | 1d ago | 37 | | Swarm by OpenAI | Educational framework for lightweight multi-agent orchestration | 
- | [Github](https://github.com/openai/swarm) | [![GitHub](https://img.shields.io/github/stars/openai/swarm?color=5B5BD6)](https://github.com/openai/swarm) | 1d ago | 38 | | Langroid | Python framework to easily build LLM-powered applications | [Website](https://langroid.github.io/langroid/) | [Github](https://github.com/langroid/langroid) | [![GitHub](https://img.shields.io/github/stars/langroid/langroid?color=5B5BD6)](https://github.com/langroid/langroid) | 10h ago | 39 | | NeMo-Guardrails | Add programmable guardrails to LLM-based applications | [Website](https://docs.nvidia.com/nemo-guardrails/) | [Github](https://github.com/NVIDIA/NeMo-Guardrails) | [![GitHub](https://img.shields.io/github/stars/NVIDIA/NeMo-Guardrails?color=5B5BD6)](https://github.com/NVIDIA/NeMo-Guardrails) | Last week | 40 | | Swiftide | A Rust library for building fast, streaming applications with LLMs | [Website](https://swiftide.rs) | [GitHub](https://github.com/bosun-ai/swiftide) | ![GitHub stars](https://img.shields.io/github/stars/bosun-ai/swiftide?style=social) | 1h ago | 41 | | Korvus | The entire RAG pipeline in a single database query | [Website](https://postgresml.org) | [GitHub](https://github.com/postgresml/korvus) | ![GitHub stars](https://img.shields.io/github/stars/postgresml/korvus?style=social) | Last month | 42 | | semantic-router | A framework for routing LLM requests using semantic vectors | [Website](https://www.aurelio.ai/semantic-router) | [GitHub](https://github.com/aurelio-labs/semantic-router) | ![GitHub stars](https://img.shields.io/github/stars/aurelio-labs/semantic-router?style=social) | 4h ago | 43 | | AWS Bedrock Knowledge Bases | Service to build, scale, and deploy RAG-powered applications | [Website](https://aws.amazon.com/bedrock/knowledge-bases/) | - | - | 1h ago | 44 | | langflow | Build, scale, and deploy RAG and multi-agent AI apps | [Website](https://www.langflow.org/) | [GitHub](https://github.com/langflow-ai/langflow) | ![GitHub 
stars](https://img.shields.io/github/stars/langflow-ai/langflow?style=social) | 1h ago | 45 | | dspy | Build language model apps with modular programming | [Website](https://dspy-docs.vercel.app/) | [GitHub](https://github.com/stanfordnlp/dspy) | ![GitHub stars](https://img.shields.io/github/stars/stanfordnlp/dspy?style=social) | 13h ago | 46 | | mem0 | The Memory layer for your AI apps | [Website](https://mem0.ai/) | [GitHub](https://github.com/mem0ai/mem0) | ![GitHub stars](https://img.shields.io/github/stars/mem0ai/mem0?style=social) | 2h ago | 47 | | RAGLite | A Python package for building RAG applications | [Website](https://superlinear.eu) | [GitHub](https://github.com/superlinear-ai/raglite) | ![GitHub stars](https://img.shields.io/github/stars/superlinear-ai/raglite?style=social) | 18h ago | 48 | | cognee | Memory framework for building GraphRAG applications | [Website](https://www.cognee.ai) | [GitHub](https://github.com/topoteretes/cognee) | ![GitHub stars](https://img.shields.io/github/stars/topoteretes/cognee?style=social) | 2h ago | 49 | 50 | ## RAG Evaluation and Optimization Frameworks 51 | 52 | | Name | Description | Website | GitHub | Stars | Activity | 53 | | ------------ | ------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------- | ------------------------------------------------------ | ------------------------------------------------------------------------------------------- | -------- | 54 | | Trulens | Measures and enhances LLM app quality with feedback functions for scalable evaluation | [Website](https://www.trulens.org/) | [GitHub](https://github.com/truera/trulens) | ![GitHub stars](https://img.shields.io/github/stars/truera/trulens?style=social) | 11h ago | 55 | | Phoenix | AI observability platform designed for experimentation, evaluation, and troubleshooting | [Website](https://github.com/Arize-ai/phoenix) | 
[GitHub](https://github.com/Arize-ai/phoenix) | ![GitHub stars](https://img.shields.io/github/stars/Arize-ai/phoenix?style=social) | 1d ago | 56 | | ragas | Evaluates and quantifies the performance of RAG pipelines that enhance LLM context with external data | [Website](https://docs.ragas.io/en/stable/) | [GitHub](https://github.com/explodinggradients/ragas) | ![GitHub stars](https://img.shields.io/github/stars/explodinggradients/ragas?style=social) | 3h ago | 57 | | Deepchecks | Continuous validation of AI & ML models, detecting data drift and model issues | [Website](https://docs.deepchecks.com/stable) | [GitHub](https://github.com/deepchecks/deepchecks) | ![GitHub stars](https://img.shields.io/github/stars/deepchecks/deepchecks?style=social) | 8m ago | 58 | | AutoRAG | End-to-end RAG optimization: parsing, chunking, evaluation dataset creation, and pipeline deployment | [Website](https://auto-rag.com/) | [GitHub](https://github.com/Marker-Inc-Korea/AutoRAG) | ![GitHub stars](https://img.shields.io/github/stars/Marker-Inc-Korea/AutoRAG?style=social) | 1h ago | 59 | | evalmy.ai | Fine-tuned lightweight RAG evaluation service + Python client library | [Website](https://www.evalmy.ai/) | [GitHub](https://github.com/evalmy-ai/evalmyai-python) | ![GitHub stars](https://img.shields.io/github/stars/evalmy-ai/evalmyai-python?style=social) | -- | 60 | | TextGrad | A framework for LLM-based text optimization, focusing on reducing hallucinations and improving prompts | [Website](https://textgrad.com/) | [GitHub](https://github.com/zou-group/textgrad) | ![GitHub stars](https://img.shields.io/github/stars/zou-group/textgrad?style=social) | 24h ago | 61 | | langfuse | Traces, evals, prompt management, and metrics to debug and improve your LLM application. 
| [Website](https://langfuse.com/) | [GitHub](https://github.com/langfuse/langfuse) | ![GitHub stars](https://img.shields.io/github/stars/langfuse/langfuse?style=social) | 1h ago | 62 | | Vectara HHEM | Hallucination evaluation model for RAG | [Huggingface](https://huggingface.co/vectara/hallucination_evaluation_model) | -- | -- | -- | 63 | | StepsTrack | An observability tool built to track, inspect, and visualize every step in a pipeline | - | [GitHub](https://github.com/lokwkin/steps-track) | ![GitHub stars](https://img.shields.io/github/stars/lokwkin/steps-track?style=social) | 15h ago | 64 | | syftr | Multi-objective end-to-end agentic RAG optimization. | - | [GitHub](https://github.com/datarobot/syftr) | ![GitHub stars](https://img.shields.io/github/stars/datarobot/syftr?style=social) | 1h ago | 65 | 66 | ## RAG Engines 67 | 68 | | Name | Description | Website | GitHub | Stars | Activity | 69 | | -------------------------- | ---------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------- | ------------------------------------------------------------------------------------------------ | ------------ | 70 | | Agentset | Open-source agentic RAG platform. 
| [Website](https://agentset.ai) | [GitHub](https://github.com/agentset-ai/agentset) | ![GitHub stars](https://img.shields.io/github/stars/agentset-ai/agentset?style=social) | 1d ago | 71 | | Engramic | RAG engine focused on long-term memory and advanced context management | [Website](https://engramic.org) | [Github](https://github.com/engramic/engramic) | ![GitHub stars](https://img.shields.io/github/stars/engramic/engramic?style=social) | 2h ago | 72 | | TrustGraph | LLM Agnostic Agent Development Platform | [Website](https://trustgraph.ai) | [GitHub](https://github.com/trustgraph-ai/trustgraph) | ![GitHub stars](https://img.shields.io/github/stars/trustgraph-ai/trustgraph?style=social) | 2d ago | 73 | | R2R | The Elasticsearch for RAG, helps you quickly build and launch scalable RAG solutions | [Website](https://r2r-docs.sciphi.ai/introduction) | [GitHub](https://github.com/SciPhi-AI/R2R) | ![GitHub stars](https://img.shields.io/github/stars/SciPhi-AI/R2R?style=social) | 6h ago | 74 | | RAGFlow | Open-source RAG engine based on deep document understanding | [Website](https://ragflow.io) | [GitHub](https://github.com/infiniflow/ragflow) | ![GitHub stars](https://img.shields.io/github/stars/infiniflow/ragflow?style=social) | 1h ago | 75 | | Liquid Index | The Unified RAG Platform. One API. 
Every Tool You Need | [Website](https://liquidindex.dev) | - | - | 1h ago | 76 | | Vertex AI Knowledge Engine | A data framework for context-augmented LLM applications | [Website](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-overview) | - | - | 1d ago | 77 | | Embedchain | Open Source Framework for personalizing LLM responses under 10 lines of code | [Website](https://docs.embedchain.ai/get-started/quickstart) | [GitHub](https://github.com/mem0ai/mem0/tree/main/embedchain) | ![GitHub stars](https://img.shields.io/github/stars/mem0ai/mem0?style=social) | Last week | 78 | | txtai | All-in-one embeddings database for semantic search, LLM orchestration, and RAG workflows | [Website](https://neuml.github.io/txtai/) | [GitHub](https://github.com/neuml/txtai) | ![GitHub stars](https://img.shields.io/github/stars/neuml/txtai?style=social) | Last week | 79 | | dsRAG | High-performance retrieval engine for unstructured data | - | [GitHub](https://github.com/D-Star-AI/dsRAG) | ![GitHub stars](https://img.shields.io/github/stars/D-Star-AI/dsRAG?style=social) | Last week | 80 | | Flash-Rank | Use Pairwise or Listwise rerankers to improve search accuracy before passing to LLMs. 
| - | [GitHub](https://github.com/PrithivirajDamodaran/FlashRank) | ![GitHub stars](https://img.shields.io/github/stars/PrithivirajDamodaran/FlashRank?style=social) | 2w ago | 81 | | Graphlit | API-first platform for building knowledge-driven AI applications and agents | [Website](https://www.graphlit.com) | [GitHub](https://github.com/graphlit) | ![GitHub stars](https://img.shields.io/github/stars/graphlit?style=social) | 8h ago | 82 | | rag-citation | Combines RAG with automatic citation generation to enhance content credibility | [Website](https://pypi.org/project/rag-citation/) | [GitHub](https://github.com/rahulanand1103/rag-citation) | ![GitHub stars](https://img.shields.io/github/stars/rahulanand1103/rag-citation?style=social) | Last week | 83 | | PostgresML | Postgres + GPUs with functions for chunking, embedding, transforming and ranking | [Website](https://postgresml.org) | [GitHub](https://github.com/postgresml/postgresml) | ![GitHub stars](https://img.shields.io/github/stars/postgresml/postgresml?style=social) | Yesterday | 84 | | chainlit | Build production-ready Conversational AI applications in minutes, not weeks | [Website](https://chainlit.io/) | [GitHub](https://github.com/Chainlit/chainlit) | ![GitHub stars](https://img.shields.io/github/stars/Chainlit/chainlit?style=social) | 24h ago | 85 | | pathway | Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG. | [Website](https://pathway.com/) | [GitHub](https://github.com/pathwaycom/pathway) | ![GitHub stars](https://img.shields.io/github/stars/pathwaycom/pathway?style=social) | 7h ago | 86 | | cognita | RAG framework for modular, open-source production apps. 
| [Website](https://cognita.truefoundry.com/) | [GitHub](https://github.com/truefoundry/cognita) | ![GitHub stars](https://img.shields.io/github/stars/truefoundry/cognita?style=social) | 2 days ago | 87 | | FlashRAG | A Python Toolkit for Efficient RAG Research | - | [GitHub](https://github.com/RUC-NLPIR/FlashRAG) | ![GitHub stars](https://img.shields.io/github/stars/RUC-NLPIR/FlashRAG?style=social) | 3h ago | 88 | | RAGatouille | Easily train and use advanced retrieval methods in any RAG pipeline. | - | [GitHub](https://github.com/AnswerDotAI/RAGatouille) | ![GitHub stars](https://img.shields.io/github/stars/AnswerDotAI/RAGatouille?style=social) | 4 months ago | 89 | | pgai | A suite of tools to develop RAG, semantic search, and other AI applications just in PostgreSQL | [Website](https://www.timescale.com/ai) | [GitHub](https://github.com/timescale/pgai) | ![GitHub stars](https://img.shields.io/github/stars/timescale/pgai?style=social) | 10h ago | 90 | | Vectara | The trusted RAG platform for quickly building AI assistants and agents. | [Website](https://www.vectara.com/) | [GitHub](https://github.com/vectara/) | - | - | 91 | | mode | RAG framework with expert models, smart clustering, and efficient retrieval for small datasets. 
| - | [GitHub](https://github.com/rahulanand1103/mode) |![GitHub stars](https://img.shields.io/github/stars/rahulanand1103/mode?style=social) | 2 days ago | 92 | 93 | ## RAG Data Preparation Frameworks 94 | 95 | | Name | Description | Website | GitHub | Stars | Activity | 96 | | --------- | ---------------------------------- | -------------------------------- | --------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- | -------- | 97 | | CocoIndex | ETL framework to build fresh index | [Website](https://cocoindex.io/) | [Github](https://github.com/cocoindex-io/cocoindex) | [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex) | 1h ago | 98 | | Gitana.io | Content platform for editorial approval and scheduled deployment of trained data sets to RAG vector DBs | [Website](https://gitana.io/) | - | - | - | 99 | | Chonkie | No-nonsense, lightweight and fast RAG chunking library | [Website](https://chonkie.ai/) | [GitHub](https://github.com/chonkie-inc/chonkie) | [![GitHub stars](https://img.shields.io/github/stars/chonkie-inc/chonkie?style=social)](https://github.com/chonkie-inc/chonkie) | 1h ago | 100 | 101 | ## RAG Projects 102 | 103 | | Name | Description | Website | GitHub | Stars | Activity | 104 | | ---------------------------------------- | ------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | --------- | 105 | | LlamaParse | GenAI-native document parsing platform | 
[Website](https://docs.llamaindex.ai/en/stable/llama_cloud/llama_parse/) | [GitHub](https://github.com/run-llama/llama_parse) | ![GitHub stars](https://img.shields.io/github/stars/run-llama/llama_parse?style=social) | 2d ago | 106 | | Langchain-extract | Web server to extract information from text and files using LLMs | [Website](https://python.langchain.com/v0.1/docs/use_cases/extraction/) | [GitHub](https://github.com/langchain-ai/langchain-extract) | ![GitHub stars](https://img.shields.io/github/stars/langchain-ai/langchain-extract?style=social) | 4m ago | 107 | | Needle | Production-ready RAG pipelines out of the box. | [Website](https://needle-ai.com/) | [GitHub](https://github.com/oeken/needle-python) | ![GitHub stars](https://img.shields.io/github/stars/oeken/needle-python?style=social) | 1h ago | 108 | | Unstructured.io | Build custom preprocessing pipelines for labeling, training, or production ML | [Website](https://unstructured.io/) | [GitHub](https://github.com/Unstructured-IO/unstructured) | ![GitHub stars](https://img.shields.io/github/stars/Unstructured-IO/unstructured?style=social) | 3d ago | 109 | | Verba | RAG chatbot powered by Weaviate | [Website](https://verba.weaviate.io/) | [GitHub](https://github.com/weaviate/Verba) | ![GitHub stars](https://img.shields.io/github/stars/weaviate/Verba?style=social) | 2w ago | 110 | | Unstract | No-code platform to launch APIs and ETL Pipelines to structure unstructured documents | [Website](https://unstract.com/) | [GitHub](https://github.com/Zipstack/unstract) | ![GitHub stars](https://img.shields.io/github/stars/Zipstack/unstract?style=social) | 4h ago | 111 | | Humata.ai | Ask questions across all of your document files | [Website](https://www.humata.ai/) | - | - | 4h ago | 112 | | Ragie.ai | Fully managed RAG-as-a-Service for developers. 
| [Website](https://www.ragie.ai/) | [GitHub](https://github.com/ragieai) | - | 12h ago | 113 | | Reducto | Parses complex documents and creates LLM-ready inputs | [Website](https://reducto.ai/) | [GitHub](https://github.com/reductoai) | - | 2w ago | 114 | | Midship | Extract document data straight into your spreadsheet/ERP/CRM | [Website](https://midship.ai/) | - | - | - | 115 | | DocuPanda | Convert documents into a structured, standard set of fields and values | [Website](https://www.docupanda.io/) | - | - | - | 116 | | contextual-doc-retrieval-opneai-reranker | Using GPT-4 and Cohere for query expansion and re-ranking with BM25 | - | [GitHub](https://github.com/lesteroliver911/contextual-doc-retrieval-opneai-reranker) | ![GitHub stars](https://img.shields.io/github/stars/lesteroliver911/contextual-doc-retrieval-opneai-reranker?style=social) | Last week | 117 | | Raggenie | Low-code platform to build custom RAG-based AI applications | [Website](https://www.raggenie.com) | [GitHub](https://github.com/sirocco-ventures/raggenie) | ![GitHub stars](https://img.shields.io/github/stars/sirocco-ventures/raggenie?style=social) | 10h ago | 118 | | Chunkr | Vision model-based PDF chunking and OCR, optimized for fast processing of large datasets | [Website](https://chunkr.ai) | [GitHub](https://github.com/lumina-ai-inc/chunkr) | ![GitHub stars](https://img.shields.io/github/stars/lumina-ai-inc/chunkr?style=social) | 11h ago | 119 | | tldw | Open-source project similar to NotebookLM | [Website](https://tldwproject.com) | [GitHub](https://github.com/rmusser01/tldw) | ![GitHub stars](https://img.shields.io/github/stars/rmusser01/tldw?style=social) | Yesterday | 120 | | Cerbos | Access control for RAG and LLMs. 
| [Website](https://solutions.cerbos.dev/authorization-in-rag-based-ai-systems-with-cerbos) | [GitHub](https://github.com/cerbos/cerbos) | ![GitHub stars](https://img.shields.io/github/stars/cerbos/cerbos?style=social) | 14h ago | 121 | | extractous | Extremely fast data extraction for your AI applications | [Website](https://www.extractous.com/) | [GitHub](https://github.com/yobix-ai/extractous) | ![GitHub stars](https://img.shields.io/github/stars/yobix-ai/extractous?style=social) | - | 122 | | SWIRL | AI search & RAG for your workplace. Get AI insights from your company's knowledge instantly. | [Website](https://www.swirlaiconnect.com/) | [GitHub](https://github.com/swirlai/swirl-search) | ![GitHub stars](https://img.shields.io/github/stars/swirlai/swirl-search?style=social) | 2w ago | 123 | | ChatDOC PDF Parser | Precision PDF parsing that transforms documents into flawless structured data for RAG systems. | [Website](https://pdfparser.io/?src=github) | - | - | - | 124 | | Gurubase | Create AI-powered Q&A assistants by indexing websites, PDF documents, YouTube videos, and GitHub code repositories. | [Website](https://gurubase.io) | [GitHub](https://github.com/Gurubase/gurubase) | ![GitHub stars](https://img.shields.io/github/stars/Gurubase/gurubase?style=social) | 1d ago | 125 | | Archive Agent | Open-source semantic file tracker with OCR + AI search. Smart indexer with RAG engine. | - | [GitHub](https://github.com/shredEngineer/Archive-Agent) | ![GitHub stars](https://img.shields.io/github/stars/shredEngineer/Archive-Agent?style=social) | - | 126 | | MidrasAI | Simple API for Colpali, a multi-modal retrieval model. 
| - | [Github](https://github.com/ajac-zero/midrasai) | ![GitHub stars](https://img.shields.io/github/stars/ajac-zero/midrasai?style=social) | 6m ago | 127 | 128 | ## RAG Resources and Sites 129 | 130 | | Site/Article | Description | Link | 131 | | -------------------- | ----------------------------------------------------------------------------- | -------------------------------------------------------------- | 132 | | Contextual Retrieval | Anthropic introducing Contextual Retrieval | [Website](https://www.anthropic.com/news/contextual-retrieval) | 133 | | Open-RAG | Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | [Website](https://arxiv.org/abs/2410.01782) | 134 | | ColPali | Efficient Document Retrieval with Vision Language Models | [Website](https://arxiv.org/abs/2407.01449) | 135 | | RAG_Techniques | Showcases various advanced techniques for RAG systems | [Website](https://github.com/NirDiamant/RAG_Techniques) | 136 | | GenAI_Agents | Tutorials and implementations for various AI Agent techniques | [Website](https://github.com/NirDiamant/GenAI_Agents) | 137 | 138 | ## Model LeaderBoards 139 | 140 | | Name | Description | Link | 141 | | --------------------------------- | ---------------------------------- | --------------------------------------------------------------- | 142 | | Artificial Analysis | LLM Comparison | [Website](https://artificialanalysis.ai) | 143 | | HuggingFace/mteb | Embedding models leaderboard | [Website](https://huggingface.co/spaces/mteb/leaderboard) | 144 | | Vectara Hallucination Leaderboard | Hallucination leaderboard for LLMs | [Website](https://github.com/vectara/hallucination-leaderboard) | 145 | 146 | > If you're looking for mainstream RAG frameworks and techniques, check out the excellent repository by Nir Diamant: [RAG Techniques](https://github.com/NirDiamant/RAG_Techniques). This repository focuses on more established tools and methods that have already gained traction in the community. 
147 | 148 | ## License 149 | 150 | This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details. 151 | 152 | ## Join the Conversation 153 | 154 | This project is part of the [r/RAG](https://www.reddit.com/r/Rag/) community. Have feedback or suggestions? Feel free to open an issue, start a discussion, or join the conversation on our [Discord server](https://discord.gg/nn92wC5QmN)! We want to make this repository a valuable resource for everyone exploring the RAG ecosystem, and your input is crucial. 155 | --------------------------------------------------------------------------------