├── LICENSE
├── README.md
├── faiss
│   ├── camp
│   │   ├── index.faiss
│   │   └── index.pkl
│   └── logistics
│       ├── index.faiss
│       └── index.pkl
├── get_vector.py
├── main.py
├── model.py
└── 物流信息.txt

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 Haohao-end

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Logistics Knowledge QA System - LogisticsQA

**Querying logistics information made as easy as chatting!**
An intelligent question-answering solution that combines a large language model with vector retrieval to quickly pinpoint the key facts in logistics records.

---

## Features

- **Fast response**: a FAISS vector store enables millisecond-level retrieval
- **Semantic understanding**: the M3E Chinese embedding model captures context accurately
- **Natural dialogue**: ChatGLM2-6B generates fluent, natural answers
- **Ready to use**: modular design, deployable in about five minutes

---

## Demo

**User question**:
`我买的商品来自于哪个仓库,从哪出发的,预计什么时候到达?`
(Which warehouse did my order ship from, where did it depart, and when is it expected to arrive?)

**System answer**:
Your parcel is on its way!
Origin: Guangzhou warehouse
Departure: January 20, 2023
Expected arrival: January 23 (3 days in transit)
Destination: Chongqing
Shipping method: ground transport

---

## Tech stack

```bash
- LangChain
- FAISS
- HuggingFace Transformers
- ChatGLM2-6B
- M3E Chinese embedding model
```

---

## Quick start

### 1. Install dependencies
```bash
# The repo does not ship a requirements.txt; install the packages the code imports:
pip install langchain langchain-community transformers torch sentence-transformers
pip install unstructured faiss-cpu
# Needed by unstructured for document loading
pip install python-magic python-magic-bin
```

### 2. Prepare the models and data
```bash
# Download the ChatGLM2-6B weights
git lfs install
git clone https://huggingface.co/THUDM/chatglm2-6b ./model_files/

# Prepare the logistics document
echo "广州仓库 发货时间 2023-01-20..." > 物流信息.txt
```
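
The embedding model is expected locally at `./m3e-base` (see `get_vector.py` and `main.py`). The `moka-ai/m3e-base` checkpoint on Hugging Face is one option (an assumption; substitute whichever M3E weights you use) and can be fetched the same way: `git clone https://huggingface.co/moka-ai/m3e-base ./m3e-base`.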

### 3. Launch the system
```bash
# Build the vector store
python get_vector.py

# Run the QA pipeline
python main.py
```

---

## Core logic

```mermaid
graph TD
    A[User question] --> B(Semantic embedding)
    B --> C[FAISS vector search]
    C --> D{Top-1 match}
    D --> E[Fill the prompt template]
    E --> F[ChatGLM2 generates the answer]
    F --> G[Return the result]
```

---

## Usage example

```python
from main import qa

# The demo question is currently hard-coded in define_prompt() inside main.py;
# edit it there to ask something else.
response = qa()
print(f"物流小助手:{response}")
```
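
To see what the retriever pulls out of the vector store before it reaches ChatGLM2, or to experiment with your own questions, you can query the saved FAISS index directly. A minimal sketch, assuming `./m3e-base` is downloaded and `./faiss/camp` was built by `get_vector.py`; recent `langchain-community` releases may additionally require `allow_dangerous_deserialization=True` in `load_local`:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(model_name="./m3e-base")
db = FAISS.load_local("./faiss/camp", embeddings)

# Show the top-3 chunks and their distance scores (lower = closer)
for doc, score in db.similarity_search_with_score("我买的商品来自于哪个仓库?", k=3):
    print(f"{score:.3f}  {doc.page_content}")
```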

---

## Notes

1. **Model path**
   Point `ChatGLM2.load_model()` at your local ChatGLM2-6B directory:
   `llm.load_model("/path/to/chatglm2-6b")`

2. **Text splitting**
   Tune `chunk_size` in `get_vector.py` to fit your documents:
   ```python
   # Roughly 100-500 characters per chunk usually works well
   RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=4)
   ```

3. **Performance**
   `load_model()` in `model.py` loads the weights in float32 on the CPU (`.float()`). On a GPU machine you can switch that line to half precision:
   ```python
   self.model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
   ```

---

## License

This project is released under the [MIT License](LICENSE). Open source, let's improve it together!

---

**Keep logistics information flowing**
Suggestions are welcome: open an Issue, or Star the repo to support us!

--------------------------------------------------------------------------------
/faiss/camp/index.faiss:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Haohao-end/LangChain-ChatGLM-6B-RAG/ec7e35c94140951dba12418eaf8fa055e9f7d263/faiss/camp/index.faiss
--------------------------------------------------------------------------------
/faiss/camp/index.pkl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Haohao-end/LangChain-ChatGLM-6B-RAG/ec7e35c94140951dba12418eaf8fa055e9f7d263/faiss/camp/index.pkl
--------------------------------------------------------------------------------
/faiss/logistics/index.faiss:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Haohao-end/LangChain-ChatGLM-6B-RAG/ec7e35c94140951dba12418eaf8fa055e9f7d263/faiss/logistics/index.faiss
--------------------------------------------------------------------------------
/faiss/logistics/index.pkl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Haohao-end/LangChain-ChatGLM-6B-RAG/ec7e35c94140951dba12418eaf8fa055e9f7d263/faiss/logistics/index.pkl
--------------------------------------------------------------------------------
/get_vector.py:
--------------------------------------------------------------------------------
from langchain_community.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS  # vector store
# Older LangChain releases expose the same classes under langchain.document_loaders,
# langchain.embeddings and langchain.vectorstores instead of langchain_community.


def main():
    # Path to the local embedding model
    EMBEDDING_MODEL = './m3e-base'

    # Step 1: load the document
    loader = UnstructuredFileLoader('物流信息.txt')
    data = loader.load()
    # print(f'data-->{data}')

    # Step 2: split the document into chunks
    text_split = RecursiveCharacterTextSplitter(chunk_size=128,
                                                chunk_overlap=4)
    split_data = text_split.split_documents(data)
    # print(f'split_data-->{split_data}')

    # Step 3: initialise the HuggingFace embedding model
    embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL)

    # Step 4: embed the chunks and persist the FAISS index
    db = FAISS.from_documents(split_data, embeddings)
    db.save_local('./faiss/camp')

    return split_data


if __name__ == '__main__':
    split_data = main()
    print(f'split_data-->{split_data}')
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
# coding:utf-8
# Import the required tooling
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from model import ChatGLM2

# Load the FAISS vector store
EMBEDDING_MODEL = './m3e-base'
embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL)
# Recent langchain-community releases may also require
# allow_dangerous_deserialization=True here.
db = FAISS.load_local('./faiss/camp', embeddings)


def get_related_content(related_docs):
    # Join the retrieved chunks into a single context string
    related_content = []
    for doc in related_docs:
        related_content.append(doc.page_content.replace('\n\n', '\n'))
    return '\n'.join(related_content)


def define_prompt():
    question = '我买的商品来自于哪个仓库,从哪出发的,预计什么时候到达'
    docs = db.similarity_search(question, k=1)
    # print(f'docs-->{docs}')
    related_docs = get_related_content(docs)

    # Build the prompt from a template
    PROMPT_TEMPLATE = """
    基于以下已知信息,简洁和专业的来回答用户的问题。不允许在答案中添加编造成分。
    已知内容:
    {context}
    问题:
    {question}"""
    prompt = PromptTemplate(input_variables=["context", "question"],
                            template=PROMPT_TEMPLATE)

    my_prompt = prompt.format(context=related_docs,
                              question=question)
    return my_prompt


def qa():
    llm = ChatGLM2()
    # Replace with the path to your local ChatGLM checkpoint
    llm.load_model('/Users/ligang/PycharmProjects/llm/ChatGLM-6B/THUDM/chatglm-6b')
    my_prompt = define_prompt()
    result = llm(my_prompt)
    return result


if __name__ == '__main__':
    result = qa()
    print(f'result-->{result}')
--------------------------------------------------------------------------------
/model.py:
--------------------------------------------------------------------------------
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from transformers import AutoTokenizer, AutoModel
from typing import List, Optional


# Custom LangChain LLM wrapper around ChatGLM
class ChatGLM2(LLM):
    max_token: int = 4096
    temperature: float = 0.8
    top_p = 0.9
    tokenizer: object = None
    model: object = None
    history = []

    def __init__(self):
        super().__init__()

    @property
    def _llm_type(self) -> str:
        return "custom_chatglm2"

    # Load the tokenizer and model weights from a local path
    def load_model(self, model_path=None):
        # Load the tokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
        # Load the model in float32 on CPU (switch to .half().cuda() on a GPU machine)
        self.model = AutoModel.from_pretrained(model_path, trust_remote_code=True).float()

    # Run inference on a single prompt
    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        response, _ = self.model.chat(self.tokenizer,
                                      prompt,
                                      history=self.history,
                                      temperature=self.temperature,
                                      top_p=self.top_p)

        if stop is not None:
            response = enforce_stop_tokens(response, stop)

        self.history = self.history + [[None, response]]
        return response


if __name__ == '__main__':
    llm = ChatGLM2()
    llm.load_model(model_path='/Users/ligang/PycharmProjects/llm/ChatGLM-6B/THUDM/chatglm-6b-int4')
    print(f'llm--->{llm}')
    print(llm("1+1等于几?"))
--------------------------------------------------------------------------------
/物流信息.txt:
--------------------------------------------------------------------------------
物流公司:速达物流
公司总部:北京市
业务范围:国际快递、仓储管理

货物追踪:
- 货物编号:ABC123456
- 发货日期:2023-01-15
- 当前位置:上海分拨中心
- 预计到达日期:2023-01-20

运输方式:
- 运输公司:快运通
- 运输方式:陆运
- 出发地:广州
- 目的地:重庆
- 预计运输时间:3天

仓储信息:
- 仓库名称:东方仓储中心
- 仓库位置:深圳市
- 存储货物类型:电子产品
- 存储条件:常温仓储
- 当前库存量:1000件
--------------------------------------------------------------------------------