├── .gitignore
├── README.md
├── config
│   ├── build_config.json
│   ├── qwen2_config.json
│   └── tiny_llm_config.json
├── doc
│   ├── README.md
│   ├── image.png
│   ├── index
│   │   ├── Embedding模型.md
│   │   └── image
│   │       ├── image_-YfBZmJauO.png
│   │       ├── image_03V-6r-dxq.png
│   │       ├── image_0X3ViRbGuT.png
│   │       ├── image_0oV5hi-wTo.png
│   │       ├── image_RkVb_BRhx-.png
│   │       ├── image_aNJS5vo6X7.png
│   │       ├── image_cKPVG6fU-Z.png
│   │       ├── image_lyFZ-z41e4.png
│   │       ├── image_mc5vdNa3-S.png
│   │       ├── image_n-lRTsnuua.png
│   │       └── lr3r0h6wjf_VCg5aguvM7.png
│   ├── rag
│   │   ├── RAG技术.md
│   │   └── image
│   │       ├── image_-YfBZmJauO.png
│   │       ├── image_03V-6r-dxq.png
│   │       ├── image_0X3ViRbGuT.png
│   │       ├── image_0oV5hi-wTo.png
│   │       ├── image_RkVb_BRhx-.png
│   │       ├── image_aNJS5vo6X7.png
│   │       ├── image_cKPVG6fU-Z.png
│   │       ├── image_lyFZ-z41e4.png
│   │       ├── image_mc5vdNa3-S.png
│   │       ├── image_n-lRTsnuua.png
│   │       └── lr3r0h6wjf_VCg5aguvM7.png
│   └── search
│       ├── ReRank vs Embedding.md
│       ├── image
│       │   ├── image_-YfBZmJauO.png
│       │   ├── image_03V-6r-dxq.png
│       │   ├── image_0X3ViRbGuT.png
│       │   ├── image_0oV5hi-wTo.png
│       │   ├── image_RkVb_BRhx-.png
│       │   ├── image_aNJS5vo6X7.png
│       │   ├── image_cKPVG6fU-Z.png
│       │   ├── image_lyFZ-z41e4.png
│       │   ├── image_mc5vdNa3-S.png
│       │   ├── image_n-lRTsnuua.png
│       │   └── lr3r0h6wjf_VCg5aguvM7.png
│       └── 两阶段检索.md
├── script
│   ├── build_database.py
│   ├── load_db_search.py
│   └── tiny_rag.py
├── test
│   ├── test_bm25.py
│   ├── test_bm25_recall.py
│   ├── test_emb.py
│   ├── test_emb_recall.py
│   ├── test_llm.py
│   ├── test_parser.py
│   ├── test_reranker_bge.py
│   ├── test_search.py
│   └── test_sent_split.py
└── tinyrag
    ├── __init__.py
    ├── embedding
    │   ├── __init__.py
    │   ├── base_emb.py
    │   ├── hf_emb.py
    │   ├── img_emb.py
    │   ├── openai_emb.py
    │   └── zhipu_emb.py
    ├── llm
    │   ├── base_llm.py
    │   ├── qwen2_llm.py
    │   └── tiny_llm.py
    ├── parser
    │   ├── __init__.py
    │   ├── base_parser.py
    │   ├── doc_parser.py
    │   ├── img_parser.py
    │   ├── md_parser.py
    │   ├── pdf_parser.py
    │   ├── ppt_parser.py
    │   └── txt_parser.py
    ├── requirements.txt
    ├── searcher
    │   ├── __init__.py
    │   ├── bm25_recall
    │   │   ├── bm25_retriever.py
    │   │   └── rank_bm25.py
    │   ├── emb_recall
    │   │   ├── emb_index.py
    │   │   └── emb_retriever.py
    │   ├── reranker
    │   │   ├── reanker_bge_m3.py
    │   │   └── reranker_base.py
    │   └── searcher.py
    ├── sentence_splitter.py
    ├── tiny_rag.py
    └── utils.py

/.gitignore:
--------------------------------------------------------------------------------
models
data
*.pyc
__pycache__
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/README.md
--------------------------------------------------------------------------------
/config/build_config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/config/build_config.json
--------------------------------------------------------------------------------
/config/qwen2_config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/config/qwen2_config.json
--------------------------------------------------------------------------------
/config/tiny_llm_config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/config/tiny_llm_config.json
--------------------------------------------------------------------------------
/doc/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/README.md
--------------------------------------------------------------------------------
/doc/image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/image.png
--------------------------------------------------------------------------------
/doc/index/Embedding模型.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/Embedding模型.md
--------------------------------------------------------------------------------
/doc/index/image/image_-YfBZmJauO.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_-YfBZmJauO.png
--------------------------------------------------------------------------------
/doc/index/image/image_03V-6r-dxq.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_03V-6r-dxq.png
--------------------------------------------------------------------------------
/doc/index/image/image_0X3ViRbGuT.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_0X3ViRbGuT.png
--------------------------------------------------------------------------------
/doc/index/image/image_0oV5hi-wTo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_0oV5hi-wTo.png
--------------------------------------------------------------------------------
/doc/index/image/image_RkVb_BRhx-.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_RkVb_BRhx-.png
--------------------------------------------------------------------------------
/doc/index/image/image_aNJS5vo6X7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_aNJS5vo6X7.png
--------------------------------------------------------------------------------
/doc/index/image/image_cKPVG6fU-Z.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_cKPVG6fU-Z.png
--------------------------------------------------------------------------------
/doc/index/image/image_lyFZ-z41e4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_lyFZ-z41e4.png
--------------------------------------------------------------------------------
/doc/index/image/image_mc5vdNa3-S.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_mc5vdNa3-S.png
--------------------------------------------------------------------------------
/doc/index/image/image_n-lRTsnuua.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/image_n-lRTsnuua.png
--------------------------------------------------------------------------------
/doc/index/image/lr3r0h6wjf_VCg5aguvM7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/index/image/lr3r0h6wjf_VCg5aguvM7.png
--------------------------------------------------------------------------------
/doc/rag/RAG技术.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/RAG技术.md
--------------------------------------------------------------------------------
/doc/rag/image/image_-YfBZmJauO.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_-YfBZmJauO.png
--------------------------------------------------------------------------------
/doc/rag/image/image_03V-6r-dxq.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_03V-6r-dxq.png
--------------------------------------------------------------------------------
/doc/rag/image/image_0X3ViRbGuT.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_0X3ViRbGuT.png
--------------------------------------------------------------------------------
/doc/rag/image/image_0oV5hi-wTo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_0oV5hi-wTo.png
--------------------------------------------------------------------------------
/doc/rag/image/image_RkVb_BRhx-.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_RkVb_BRhx-.png
--------------------------------------------------------------------------------
/doc/rag/image/image_aNJS5vo6X7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_aNJS5vo6X7.png
--------------------------------------------------------------------------------
/doc/rag/image/image_cKPVG6fU-Z.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_cKPVG6fU-Z.png
--------------------------------------------------------------------------------
/doc/rag/image/image_lyFZ-z41e4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_lyFZ-z41e4.png
--------------------------------------------------------------------------------
/doc/rag/image/image_mc5vdNa3-S.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_mc5vdNa3-S.png
--------------------------------------------------------------------------------
/doc/rag/image/image_n-lRTsnuua.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/image_n-lRTsnuua.png
--------------------------------------------------------------------------------
/doc/rag/image/lr3r0h6wjf_VCg5aguvM7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/rag/image/lr3r0h6wjf_VCg5aguvM7.png
--------------------------------------------------------------------------------
/doc/search/ReRank vs Embedding.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/ReRank vs Embedding.md
--------------------------------------------------------------------------------
/doc/search/image/image_-YfBZmJauO.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_-YfBZmJauO.png
--------------------------------------------------------------------------------
/doc/search/image/image_03V-6r-dxq.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_03V-6r-dxq.png
--------------------------------------------------------------------------------
/doc/search/image/image_0X3ViRbGuT.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_0X3ViRbGuT.png
--------------------------------------------------------------------------------
/doc/search/image/image_0oV5hi-wTo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_0oV5hi-wTo.png
--------------------------------------------------------------------------------
/doc/search/image/image_RkVb_BRhx-.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_RkVb_BRhx-.png
--------------------------------------------------------------------------------
/doc/search/image/image_aNJS5vo6X7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_aNJS5vo6X7.png
--------------------------------------------------------------------------------
/doc/search/image/image_cKPVG6fU-Z.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_cKPVG6fU-Z.png
--------------------------------------------------------------------------------
/doc/search/image/image_lyFZ-z41e4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_lyFZ-z41e4.png
--------------------------------------------------------------------------------
/doc/search/image/image_mc5vdNa3-S.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_mc5vdNa3-S.png
--------------------------------------------------------------------------------
/doc/search/image/image_n-lRTsnuua.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/image_n-lRTsnuua.png
--------------------------------------------------------------------------------
/doc/search/image/lr3r0h6wjf_VCg5aguvM7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/image/lr3r0h6wjf_VCg5aguvM7.png
--------------------------------------------------------------------------------
/doc/search/两阶段检索.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/doc/search/两阶段检索.md
--------------------------------------------------------------------------------
/script/build_database.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/script/build_database.py
--------------------------------------------------------------------------------
/script/load_db_search.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/script/load_db_search.py
--------------------------------------------------------------------------------
/script/tiny_rag.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/script/tiny_rag.py
--------------------------------------------------------------------------------
/test/test_bm25.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_bm25.py
--------------------------------------------------------------------------------
/test/test_bm25_recall.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_bm25_recall.py
--------------------------------------------------------------------------------
/test/test_emb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_emb.py
--------------------------------------------------------------------------------
/test/test_emb_recall.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_emb_recall.py
--------------------------------------------------------------------------------
/test/test_llm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_llm.py
--------------------------------------------------------------------------------
/test/test_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_parser.py
--------------------------------------------------------------------------------
/test/test_reranker_bge.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_reranker_bge.py
--------------------------------------------------------------------------------
/test/test_search.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_search.py
--------------------------------------------------------------------------------
/test/test_sent_split.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/test/test_sent_split.py
--------------------------------------------------------------------------------
/tinyrag/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/__init__.py
--------------------------------------------------------------------------------
/tinyrag/embedding/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/embedding/__init__.py
--------------------------------------------------------------------------------
/tinyrag/embedding/base_emb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/embedding/base_emb.py
--------------------------------------------------------------------------------
/tinyrag/embedding/hf_emb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/embedding/hf_emb.py
--------------------------------------------------------------------------------
/tinyrag/embedding/img_emb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/embedding/img_emb.py
--------------------------------------------------------------------------------
/tinyrag/embedding/openai_emb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/embedding/openai_emb.py
--------------------------------------------------------------------------------
/tinyrag/embedding/zhipu_emb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/embedding/zhipu_emb.py
--------------------------------------------------------------------------------
/tinyrag/llm/base_llm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/llm/base_llm.py
--------------------------------------------------------------------------------
/tinyrag/llm/qwen2_llm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/llm/qwen2_llm.py
--------------------------------------------------------------------------------
/tinyrag/llm/tiny_llm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/llm/tiny_llm.py
--------------------------------------------------------------------------------
/tinyrag/parser/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/__init__.py
--------------------------------------------------------------------------------
/tinyrag/parser/base_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/base_parser.py
--------------------------------------------------------------------------------
/tinyrag/parser/doc_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/doc_parser.py
--------------------------------------------------------------------------------
/tinyrag/parser/img_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/img_parser.py
--------------------------------------------------------------------------------
/tinyrag/parser/md_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/md_parser.py
--------------------------------------------------------------------------------
/tinyrag/parser/pdf_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/pdf_parser.py
--------------------------------------------------------------------------------
/tinyrag/parser/ppt_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/ppt_parser.py
--------------------------------------------------------------------------------
/tinyrag/parser/txt_parser.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/parser/txt_parser.py
--------------------------------------------------------------------------------
/tinyrag/requirements.txt:
--------------------------------------------------------------------------------
fitz
frontend
PyMuPDF
modelscope
--------------------------------------------------------------------------------
/tinyrag/searcher/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/__init__.py
--------------------------------------------------------------------------------
/tinyrag/searcher/bm25_recall/bm25_retriever.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/bm25_recall/bm25_retriever.py
--------------------------------------------------------------------------------
/tinyrag/searcher/bm25_recall/rank_bm25.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/bm25_recall/rank_bm25.py
--------------------------------------------------------------------------------
/tinyrag/searcher/emb_recall/emb_index.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/emb_recall/emb_index.py
--------------------------------------------------------------------------------
/tinyrag/searcher/emb_recall/emb_retriever.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/emb_recall/emb_retriever.py
--------------------------------------------------------------------------------
/tinyrag/searcher/reranker/reanker_bge_m3.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/reranker/reanker_bge_m3.py
--------------------------------------------------------------------------------
/tinyrag/searcher/reranker/reranker_base.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/reranker/reranker_base.py
--------------------------------------------------------------------------------
/tinyrag/searcher/searcher.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/searcher/searcher.py
--------------------------------------------------------------------------------
/tinyrag/sentence_splitter.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/sentence_splitter.py
--------------------------------------------------------------------------------
/tinyrag/tiny_rag.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/tiny_rag.py
--------------------------------------------------------------------------------
/tinyrag/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wdndev/tiny-rag/HEAD/tinyrag/utils.py
--------------------------------------------------------------------------------