├── .gitignore
├── LICENSE
├── README.md
├── README_CN.md
├── image
    ├── 1.gif
    ├── 2.gif
    └── image.png
└── src
    ├── Agent
        ├── Action.py
        ├── ReAct.py
        └── __init__.py
    ├── Models
        ├── Factory.py
        ├── Github.py
        └── __init__.py
    ├── Tools
        ├── EmailTool.py
        ├── ExcelTool.py
        ├── FileQATool.py
        ├── FileTool.py
        ├── FinishTool.py
        ├── PythonTool.py
        ├── Python_structure_Tool.py
        ├── Tools.py
        ├── WriterTool.py
        ├── __init__.py
        └── githubTool.py
    ├── Utils
        ├── CallbackHandlers.py
        ├── PrintUtils.py
        └── __init__.py
    ├── config.json
    ├── main.py
    ├── prompts
        └── main
        │   ├── allinall_prompt.txt
        │   ├── choose_agent.txt
        │   ├── dict_list_prompt.txt
        │   ├── main.txt
        │   └── sample_agent.txt
    └── requirements.txt


/.gitignore:
--------------------------------------------------------------------------------
 1 | # Compiled source #
 2 | ###################
 3 | *.com
 4 | *.class
 5 | *.dll
 6 | *.exe
 7 | *.o
 8 | *.so
 9 | .idea
10 | __pycache__
11 | 
12 | # Packages #
13 | ############
14 | # it's better to unpack these files and commit the raw source
15 | # git has its own built in compression methods
16 | *.7z
17 | *.dmg
18 | *.gz
19 | *.iso
20 | *.jar
21 | *.rar
22 | *.tar
23 | *.zip
24 | 
25 | # Logs and databases #
26 | ######################
27 | *.log
28 | *.sql
29 | *.sqlite
30 | 
31 | # OS generated files #
32 | ######################
33 | .DS_Store
34 | .DS_Store?
35 | ._*
36 | .Spotlight-V100
37 | .Trashes
38 | ehthumbs.db
39 | Thumbs.db
40 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2024 syzhy113
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | **Switch language: [中文](README_CN.md)**
  2 | 
  3 | # Engineering-Code-Analysis
  4 | Fast, automated parsing of local project directories and GitHub repositories, with one-click local deployment of an AutoGPT-style agent.
  5 | 
  6 | ## Introduction
  7 | 
  8 | This project is an innovative localized AI assistant system designed to overcome the limitations of traditional online AI services. It supports direct access to local folders and can analyze the content of local project structures. With no file size limitations, it efficiently parses GitHub projects while ensuring that data is stored locally to enhance security.
  9 | 
 10 | ### Workflow Framework
 11 | ![Workflow framework](image/image.png)
 12 | 
 13 | ### Demo
 14 | ![notebook_gif_demo](image/1.gif)
 15 | 
 16 | ![notebook_gif_demo](image/2.gif)
 17 | 
 18 | ## Main Advantages
 19 | 
 20 | - **Increased File Size and Access Speed**: No 100MB file size limit and no dependence on network speed. With the local version, everything is under your control.
 21 | - **Explicit Network Connection Access**: The official version cannot explicitly access web links and handles GitHub projects poorly. This project can analyze GitHub projects directly.
 22 | - **Direct Access to Local Files**: Runs in your local directory, which makes custom directory operations and real-time directory analysis convenient.
 23 | - **Data Security**: The code runs locally, so there is no need to upload files to the internet, which improves data security.
 24 | - **Model Support**: Models are accessed via API, so you can use ```GPT-4``` without an ```OpenAI Plus``` subscription.
 25 | 
 26 | ## Precautions
 27 | AI-generated code executed on local devices, without human review, may pose security risks. You are responsible for any consequences arising from running such unreviewed programs.
 28 | 
 29 | ## Usage
 30 | 
 31 | ### Installation
 32 | 
 33 | 1. Clone this repository:
 34 |    ```shell
 35 |    git clone https://github.com/syzhy113/Engineering-Code-Analysis.git
 36 |    cd Engineering-Code-Analysis
 37 |    ```
 38 | 
 39 | 2. Install dependencies. This program has been tested on Windows 11 and Ubuntu 18.04.
 40 |    Create a conda environment:
 41 |    ```shell
 42 |    conda create -n env_name python=3.10
 43 |    conda activate env_name
 44 |    ```
 45 |    Then install the required libraries with the following command:
 46 |    ```shell
 47 |    pip install -r requirements.txt
 48 |    ```
 49 | 
 50 | ### API Configuration
 51 | 1. Fill in the corresponding fields in ```src/config.json``` with your own OpenAI API key and base URL to enable model calls.<br>
 52 | 2. If you also need to access GitHub projects, obtain a ```Github API KEY``` and set it as well.
 53 |    ```json
 54 |    {
 55 |      "GIT_KEY": "",
 56 |      "OPENAI_API_KEY": "",
 57 |      "OPENAI_BASE_URL": "",
 58 |      "LANGFUSE_PUBLIC_KEY": "",
 59 |      "LANGFUSE_SECRET_KEY": "",
 60 |      "LANGFUSE_HOST": ""
 61 |    }
 62 |    ```
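
The keys above line up with what the code reads at runtime: `src/Models/Github.py`, for example, reads `GIT_KEY` from the environment. The sketch below shows one way the values in `config.json` could be copied into environment variables. The helper name `load_config_into_env` is hypothetical, and the project's own loader in `main.py` may work differently.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict, MutableMapping, Optional


def load_config_into_env(
    path: Path,
    environ: Optional[MutableMapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy every non-empty value in a JSON config file into `environ`.

    Hypothetical helper: it assumes the flat key/value layout shown in the README.
    """
    target = os.environ if environ is None else environ
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    for key, value in config.items():
        if value:  # skip keys that were left as empty strings
            target[key] = str(value)
    return config


# Demo with a throwaway file and a plain dict standing in for os.environ.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(
        json.dumps({"GIT_KEY": "dummy-token", "OPENAI_BASE_URL": ""}),
        encoding="utf-8",
    )
    demo_env: Dict[str, str] = {}
    load_config_into_env(demo_path, demo_env)
```

Skipping empty values means a key left blank fails loudly later (a missing environment variable) instead of being silently set to an empty string.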
 63 | 
 64 | ## Run
 65 | ### Use it directly from the command line
 66 | 1. Navigate to the src directory:
 67 |    ```shell
 68 |    cd src
 69 |    ```
 70 | 
 71 | 2. Run the following command:
 72 |    ```shell
 73 |    python main.py
 74 |    ```
 75 | 
 76 | 3. Chat with the agent at the command-line prompt:
 77 |    ```shell
 78 |    🤖: How can I help you?
 79 |    👨:
 80 |    ```
 81 | ### Use Web UI
 82 | 1. Navigate to the src directory:
 83 |    ```shell
 84 |    cd src
 85 |    ```
 86 | 
 87 | 2. Run the following command:
 88 |    ```shell
 89 |    python web_ui.py
 90 |    ```
 91 | 
 92 | 
 93 | 
 94 | 
 95 | ## Example
 96 | 
 97 | Here is an example of using this program for engineering task analysis:
 98 | 
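 99 | 1. Analyzing a local folder:
100 |    ```shell
101 |    🤖: How can I help you?
102 |    👨: What is the core code in the Agent folder, what does it contain, and what does it do? Analyze it in detail.
103 |    🤖: The core code in the Agent folder consists of two files: `Action.py` and `ReAct.py`. `Action.py` defines an `Action` class that describes an action. `ReAct.py`
104 |       implements a Langchain-based agent framework, with methods for formatting the thought process, extracting actions, initializing the agent, finding tools, running a thinking step, and running the agent.
105 |    —Agent
106 |       ——Action.py: defines the action class
107 |          --class Action: describes an action
108 |               -- function __str__: builds the string representation of an action
109 |       ——ReAct.py: implements the Langchain-based agent
110 |          --class ReActAgent: implements the agent framework
111 |               -- function __format_thought_observation: formats the thought process and the observation
112 |               -- function __extract_json_action: extracts the JSON-formatted action from text
113 |               -- function __init__: initializes the agent
114 |               -- function __init_prompt_templates: initializes the prompt templates
115 |               -- function __init_chains: initializes the main chain
116 |               -- function __find_tool: looks up a tool by name
117 |               -- function __step: runs one thinking step
118 |               -- function __exec_action: executes an action and returns the observation
119 |               -- function extract_between_markers: extracts the substring between two markers
120 |               -- function run: runs the agent
121 |    ```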
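122 | 2. Analyzing a GitHub project:
123 |    ```shell
124 |    🤖: How can I help you?
125 |    👨: https://github.com/raghakot/keras-resnet Analyze this project
126 |    🤖: Project overview: this project implements residual networks (ResNet) in Keras and supports the Theano and TensorFlow backends. Its main goal is to provide a convenient Keras-based ResNet implementation.
127 |          Installation and usage
128 |          1. Clone the repository:
129 |             git clone https://github.com/raghakot/keras-resnet.git
130 |             cd keras-resnet
131 |          
132 |          2. Install the dependencies with pip:
133 |             pip install -r requirements.txt
134 |          
135 |          3. Usage:
136 |             - Train a ResNet-18 model on the CIFAR-10 dataset:
137 |               THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10.py
138 |          
139 |          Core code structure
140 |          
141 |          cifar10.py
142 |          - Purpose: trains a ResNet-18 model on the CIFAR-10 dataset.
143 |            - Loads and preprocesses the CIFAR-10 data.
144 |            - Builds the ResNet-18 model.
145 |            - Applies data augmentation (optional).
146 |            - Trains the model and saves the training log.
147 |          
148 |          resnet.py
149 |          - Purpose: defines how ResNet models are built.
150 |            - _bn_relu: builds a Batch Normalization and ReLU activation block.
151 |            - _conv_bn_relu: builds a convolution, Batch Normalization, and ReLU activation block.
152 |            - _bn_relu_conv: builds a Batch Normalization, ReLU activation, and convolution block.
153 |            - _shortcut: builds the shortcut connection of a residual block.
154 |            - _residual_block: builds a residual block.
155 |            - basic_block: defines the basic residual block.
156 |            - bottleneck: defines the bottleneck residual block.
157 |            - _handle_dim_ordering: handles the different image dimension orderings.
158 |            - ResnetBuilder: provides methods for building ResNet models of different depths, such as ResNet-18, ResNet-34, and ResNet-50.
159 |          
160 |          tests/test_resnet.py
161 |          - Purpose: tests building and compiling ResNet models.
162 |            - Defines several test functions that build and compile ResNet models of different depths (such as ResNet-18, ResNet-34, and ResNet-50).
163 |            - Uses the pytest framework.
164 |          
165 |          Core code structure diagram
166 |          plaintext
167 |          ── cifar10.py (trains a ResNet-18 model on the CIFAR-10 dataset)
168 |              -- load and preprocess the data
169 |              -- build and train the model
170 |          
171 |          ── resnet.py (defines how ResNet models are built)
172 |              -- class ResnetBuilder (provides methods for building ResNet models)
173 |                  -- function build (core method for building a ResNet model)
174 |                  -- function build_resnet_18 (builds the ResNet-18 model)
175 |                  -- function build_resnet_34 (builds the ResNet-34 model)
176 |                  -- function build_resnet_50 (builds the ResNet-50 model)
177 |                  -- function build_resnet_101 (builds the ResNet-101 model)
178 |                  -- function build_resnet_152 (builds the ResNet-152 model)
179 |          
180 |          ── tests/test_resnet.py (tests building and compiling ResNet models)
181 |              -- function test_resnet18 (tests the ResNet-18 model)
182 |              -- function test_resnet34 (tests the ResNet-34 model)
183 |              -- function test_resnet50 (tests the ResNet-50 model)
184 |              -- function test_resnet101 (tests the ResNet-101 model)
185 |              -- function test_resnet152 (tests the ResNet-152 model)
186 |              -- function test_custom1 (custom test 1)
187 |              -- function test_custom2 (custom test 2)
188 |    ```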
189 | 
190 | 


--------------------------------------------------------------------------------
/README_CN.md:
--------------------------------------------------------------------------------
  1 | **Read in another language: [English](README.md)**
  2 | 
  3 | # 工程代码解释器
  4 | 使用Openai有关项目工程的代码解释器
  5 | 
  6 | ## 简介
  7 | 
  8 | 本项目是一款创新的本地化AI助手系统，
  9 | 旨在突破传统在线AI服务的局限。它支持直接访问本地文件夹，对本地项目结构内容进行解析。
 10 | 无文件大小限制，能高效解析GitHub项目，同时保证数据仅存储在本地以提高安全性。
 11 | 
 12 | ### 工作流框架
 13 | ![工作流框架](image/image.png)
 14 | 
 15 | ### Demo
 16 | ![notebook_gif_demo](image/1.gif)
 17 | 
 18 | ![notebook_gif_demo](image/2.gif)
 19 | 
 20 | ## 优势
 21 | 
 22 | - **文件大小及访问速度提升**：告别100MB文件大小限制和网速问题。使用本地版，一切尽在掌控之中。
 23 | - **显式访问网络连接**：官方无法显式访问网页链接，对GitHub项目不友好，该工程可以对Github项目进行解析。
 24 | - **直接访问本地文件**：在您本地目录中运行，方便进行个性化文件目录操作，实时解析文件目录。
 25 | -  **数据安全**：代码在本地运行，无需将文件上至网络，提高了数据的安全性。
 26 | -  **模型支持**：使用API进行访问，无需```OpenAI plus```也可以使用```GPT4```。
 27 | 
 28 | ## 注意事项
 29 | 本地设备上执行AI生成但未经人工审核的代码可能存在安全风险。若未经审核运行程序所产生的所有后果，您需自行承担。
 30 | 
 31 | ## 使用方法
 32 | 
 33 | ### 安装
 34 | 
 35 | 1. 克隆本仓库
 36 |    ```shell
 37 |    git clone https://github.com/syzhy113/Engineering-Code-Analysis.git
 38 |    cd Engineering-Code-Analysis
 39 |    ```
 40 | 
 41 | 2. 安装依赖。该程序已在Windows 11和ubuntu18.04测试。所需的库及版本：
 42 |    创建conda环境
 43 |    ```shell
 44 |    conda create -n env_name python=3.10
 45 |    conda activate env_name
 46 |    ```
 47 |    可以直接使用以下命令安装
 48 |    ```shell
 49 |    pip install -r requirements.txt
 50 |    ```
 51 | 
 52 | ### API配置
 53 | 1. 使用自己的```OpenAI API```替换 
 54 | ```src/config.json ```
 55 | 下的对应变量，以实现对模型的调用。<br>
 56 | 2. 同时，如果需要使用对Github工程的访问功能，需要申请对应的```Github API KEY```。
 57 |    ```shell
 58 |    {
 59 |      "GIT_KEY": "",
 60 |      "OPENAI_API_KEY": "",
 61 |      "OPENAI_BASE_URL": "",
 62 |      "LANGFUSE_PUBLIC_KEY": "",
 63 |      "LANGFUSE_SECRET_KEY": "",
 64 |      "LANGFUSE_HOST": ""
 65 |    }
 66 |    ```
 67 | 
 68 | ## 使用
 69 | ### 直接在命令行使用
 70 | 1. 进入`src`目录。
 71 |    ```shell
 72 |    cd src
 73 |    ```
 74 | 
 75 | 2. 运行以下命令：
 76 |    ```shell
 77 |    python main.py
 78 |    ```
 79 | 
 80 | 3. 在命令行中，完成对话使用
 81 |    ```shell
 82 |    🤖：有什么可以帮您？
 83 |    👨：
 84 |    ```
 85 | ### UI界面使用
 86 | 1. 进入`src`目录。
 87 |    ```shell
 88 |    cd src
 89 |    ```
 90 | 2. 运行以下命令：
 91 |    ```shell
 92 |    python web_ui.py
 93 |    ```
 94 | 
 95 | ## 示例
 96 | 
 97 | 以下是一个使用本程序进行工程任务分析的示例：
 98 | 
 99 | 1. ：
100 |    ```shell
101 |    🤖：有什么可以帮您？
102 |    👨：Agent文件夹中的核心代码是什么，内容是什么，是干什么的，具体分析一下
103 |    🤖：Agent文件夹中的核心代码包括两个文件：`Action.py`和`ReAct.py`。`Action.py`定义了一个`Action`类，用于描述一个动作。`ReAct.py`实
104 |       现了一个基于Langchain的智能体框架，包括多个方法来格式化思维过程、提取动作、初始化智能体、查找工具、执行思考过程和运行智能体。
105 |    —Agent
106 |       ——Action.py 文件作用是定义动作类
107 |          --class Action 类的作用是描述一个动作
108 |               -- function __str__ 函数的作用是生成动作的字符串表示
109 |       ——ReAct.py 文件作用是实现基于Langchain的智能体
110 |          --class ReActAgent 类的作用是实现智能体框架
111 |               -- function __format_thought_observation 函数的作用是格式化思维过程和观察结果
112 |               -- function __extract_json_action 函数的作用是从文本中提取JSON格式的动作
113 |               -- function __init__ 函数的作用是初始化智能体
114 |               -- function __init_prompt_templates 函数的作用是初始化提示模板
115 |               -- function __init_chains 函数的作用是初始化主流程链
116 |               -- function __find_tool 函数的作用是根据工具名称查找工具
117 |               -- function __step 函数的作用是执行一步思考过程
118 |               -- function __exec_action 函数的作用是执行动作并返回观察结果
119 |               -- function extract_between_markers 函数的作用是从字符串中提取标记之间的子字符串
120 |               -- function run 函数的作用是运行智能体
121 |    ```
122 | 2. ：
123 |    ```shell
124 |    🤖：有什么可以帮您？
125 |    👨：https://github.com/raghakot/keras-resnet解析一下这个工程
126 |    🤖：工程基本信息该工程是使用Keras实现的残差网络（ResNet），支持Theano和TensorFlow后端。主要目的是提供一种基于Keras的方便的ResNet实现。
127 |          安装和使用方法
128 |          1. 克隆本仓库：
129 |             git clone https://github.com/raghakot/keras-resnet.git
130 |             cd keras-resnet
131 |          
132 |          2. 使用pip安装依赖：
133 |             pip install -r requirements.txt
134 |          
135 |          3. 使用方法：
136 |             - 训练ResNet-18模型在CIFAR-10数据集上：
137 |               THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10.py
138 |          
139 |          核心代码结构
140 |          
141 |          cifar10.py
142 |          - 作用：训练ResNet-18模型在CIFAR-10数据集上。
143 |            - 加载和预处理CIFAR-10数据。
144 |            - 构建ResNet-18模型。
145 |            - 使用数据增广（可选）。
146 |            - 训练模型并保存训练日志。
147 |          
148 |          resnet.py
149 |          - 作用：定义ResNet模型的构建逻辑。
150 |            - _bn_relu：构建Batch Normalization和ReLU激活块。
151 |            - _conv_bn_relu：构建卷积、Batch Normalization和ReLU激活块。
152 |            - _bn_relu_conv：构建Batch Normalization、ReLU激活和卷积块。
153 |            - _shortcut：构建残差块的shortcut连接。
154 |            - _residual_block：构建残差块。
155 |            - basic_block：定义基本残差块。
156 |            - bottleneck：定义瓶颈残差块。
157 |            - _handle_dim_ordering：处理不同的图像维度顺序。
158 |            - ResnetBuilder：提供构建不同深度ResNet模型的方法，如ResNet-18、ResNet-34、ResNet-50等。
159 |          
160 |          tests/test_resnet.py
161 |          - 作用：测试ResNet模型的构建和编译。
162 |            - 定义了多个测试函数，测试不同深度的ResNet模型（如ResNet-18、ResNet-34、ResNet-50等）的构建和编译。
163 |            - 使用pytest框架进行测试。
164 |          
165 |          核心代码结构图
166 |          plaintext
167 |          ── cifar10.py (训练ResNet-18模型在CIFAR-10数据集上)
168 |              -- 加载和预处理数据
169 |              -- 构建和训练模型
170 |          
171 |          ── resnet.py (定义ResNet模型的构建逻辑)
172 |              -- class ResnetBuilder (提供构建ResNet模型的方法)
173 |                  -- function build (构建ResNet模型的核心方法)
174 |                  -- function build_resnet_18 (构建ResNet-18模型)
175 |                  -- function build_resnet_34 (构建ResNet-34模型)
176 |                  -- function build_resnet_50 (构建ResNet-50模型)
177 |                  -- function build_resnet_101 (构建ResNet-101模型)
178 |                  -- function build_resnet_152 (构建ResNet-152模型)
179 |          
180 |          ── tests/test_resnet.py (测试ResNet模型的构建和编译)
181 |              -- function test_resnet18 (测试ResNet-18模型)
182 |              -- function test_resnet34 (测试ResNet-34模型)
183 |              -- function test_resnet50 (测试ResNet-50模型)
184 |              -- function test_resnet101 (测试ResNet-101模型)
185 |              -- function test_resnet152 (测试ResNet-152模型)
186 |              -- function test_custom1 (自定义测试1)
187 |              -- function test_custom2 (自定义测试2)
188 |    ```
189 | 
190 | 
191 | 


--------------------------------------------------------------------------------
/image/1.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/syzhy113/Engineering-Code-Analysis/a2fdb2c4d565c63ec80ce2c0897833e546879389/image/1.gif


--------------------------------------------------------------------------------
/image/2.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/syzhy113/Engineering-Code-Analysis/a2fdb2c4d565c63ec80ce2c0897833e546879389/image/2.gif


--------------------------------------------------------------------------------
/image/image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/syzhy113/Engineering-Code-Analysis/a2fdb2c4d565c63ec80ce2c0897833e546879389/image/image.png


--------------------------------------------------------------------------------
/src/Agent/Action.py:
--------------------------------------------------------------------------------
 1 | from pydantic.v1 import BaseModel, Field
 2 | from typing import List, Optional, Dict, Any
 3 | 
 4 | 
 5 | class Action(BaseModel):
 6 |     name: str = Field(description="Tool name")
 7 |     args: Optional[Dict[str, Any]] = Field(description="Tool input arguments, containing arguments names and values")
 8 | 
 9 |     def __str__(self):
10 |         ret = f"Action(name={self.name}"
11 |         if self.args:
12 |             for k, v in self.args.items():
13 |                 ret += f", {k}={v}"
14 |         ret += ")"
15 |         return ret
16 | 
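
The `__str__` method above is what ends up in the agent's scratchpad, so it helps to see what it renders. The sketch below reproduces the same formatting with a plain dataclass so that it runs without pydantic. `ActionSketch` and the tool and argument names are hypothetical stand-ins, not part of the repository.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ActionSketch:
    """Stand-in for Action: a tool name plus optional keyword arguments."""

    name: str
    args: Optional[Dict[str, Any]] = None

    def __str__(self) -> str:
        # Same layout as Action.__str__: the name first, then each argument.
        parts = [f"name={self.name}"]
        if self.args:
            parts.extend(f"{key}={value}" for key, value in self.args.items())
        return "Action(" + ", ".join(parts) + ")"


# Hypothetical tool call with two arguments, and the argument-free FINISH action.
rendered = str(ActionSketch(name="FileQATool", args={"filename": "main.py", "query": "summary"}))
finish = str(ActionSketch(name="FINISH"))
```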


--------------------------------------------------------------------------------
/src/Agent/ReAct.py:
--------------------------------------------------------------------------------
  1 | import re
  2 | from typing import List, Optional, Tuple
  3 | 
  4 | from langchain_community.chat_message_histories.in_memory import ChatMessageHistory
  5 | from langchain_core.language_models.chat_models import BaseChatModel
  6 | from langchain.output_parsers import PydanticOutputParser, OutputFixingParser
  7 | from langchain.schema.output_parser import StrOutputParser
  8 | from langchain.tools.base import BaseTool
  9 | from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
 10 | from langchain_core.tools import render_text_description
 11 | from pydantic import ValidationError
 12 | from langchain_core.prompts import HumanMessagePromptTemplate
 13 | from Tools.githubTool import github_core
 14 | 
 15 | from Agent.Action import Action
 16 | from Utils.CallbackHandlers import *
 17 | 
 18 | 
 19 | class ReActAgent:
 20 |     """AutoGPT: implemented with Langchain"""
 21 | 
 22 |     @staticmethod
 23 |     def __format_thought_observation(thought: str, action: Action, observation: str) -> str:
 24 |         ret = re.sub(r'```json(.*?)```', '', thought, flags=re.DOTALL)
 25 |         ret += "\n" + str(action) + "\n返回结果:\n" + observation
 26 |         return ret
 27 | 
 28 |     @staticmethod
 29 |     def __extract_json_action(text: str) -> str | None:
 30 |         json_pattern = re.compile(r'```json(.*?)```', re.DOTALL)
 31 |         matches = json_pattern.findall(text)
 32 |         if matches:
 33 |             last_json_str = matches[-1]
 34 |             return last_json_str
 35 |         return None
 36 | 
 37 |     def __init__(
 38 |             self,
 39 |             llm: BaseChatModel,
 40 |             tools: List[BaseTool],
 41 |             main_prompt_file: str,
 42 |             max_thought_steps: Optional[int] = 10,
 43 |             work_dir: Optional[str] = None,
 44 |     ):
 45 |         self.llm = llm
 46 |         self.tools = tools
 47 |         self.work_dir = work_dir
 48 |         self.max_thought_steps = max_thought_steps
 49 |         self.github_core = github_core()
 50 | 
 51 |         self.output_parser = PydanticOutputParser(pydantic_object=Action)
 52 |         self.robust_parser = OutputFixingParser.from_llm(
 53 |             parser=self.output_parser,
 54 |             llm=llm
 55 |         )
 56 | 
 57 |         self.main_prompt_file = main_prompt_file
 58 | 
 59 |         self.__init_prompt_templates()
 60 |         self.__init_chains()
 61 | 
 62 |         self.verbose_handler = ColoredPrintHandler(color=THOUGHT_COLOR)
 63 | 
 64 |     def __init_prompt_templates(self):
 65 |         with open(self.main_prompt_file, 'r', encoding='utf-8') as f:
 66 |             self.prompt = ChatPromptTemplate.from_messages(
 67 |                 [
 68 |                     MessagesPlaceholder(variable_name="chat_history"),
 69 |                     HumanMessagePromptTemplate.from_template(f.read()),
 70 |                 ]
 71 |             ).partial(
 72 |                 tools=render_text_description(self.tools),
 73 |                 tool_names=','.join([tool.name for tool in self.tools]),
 74 |                 format_instructions=self.output_parser.get_format_instructions(),
 75 |             )
 76 | 
 77 |     def __init_chains(self):
 78 |         # Chain for the main flow
 79 |         self.main_chain = (self.prompt | self.llm | StrOutputParser())
 80 | 
 81 |     def __find_tool(self, tool_name: str) -> Optional[BaseTool]:
 82 |         for tool in self.tools:
 83 |             if tool.name == tool_name:
 84 |                 return tool
 85 |         return None
 86 | 
 87 |     def __step(self,
 88 |                task,
 89 |                short_term_memory,
 90 |                chat_history,
 91 |                verbose=False
 92 |                ) -> Tuple[Action, str]:
 93 | 
 94 |         """Run one thinking step."""
 95 | 
 96 |         inputs = {
 97 |             "input": task,
 98 |             "agent_scratchpad": "\n".join(short_term_memory),
 99 |             "chat_history": chat_history.messages,
100 |             "work_dir": self.work_dir,
101 |         }
102 | 
103 |         config = {
104 |             "callbacks": [self.verbose_handler]
105 |             if verbose else []
106 |         }
107 |         response = ""
108 |         for s in self.main_chain.stream(inputs, config=config):
109 |             response += s
110 | 
111 |         json_action = self.__extract_json_action(response)
112 | 
113 |         action = self.robust_parser.parse(
114 |             json_action if json_action else response
115 |         )
116 |         return action, response
117 | 
118 |     def __exec_action(self, action: Action) -> str:
119 |         # Look up the tool
120 |         tool = self.__find_tool(action.name)
121 |         if tool is None:
122 |             observation = (
123 |                 f"Error: 找不到工具或指令 '{action.name}'. "
124 |                 f"请从提供的工具/指令列表中选择，请确保按规定格式输出。"
125 |             )
126 |         else:
127 |             try:
128 |                 # Run the tool
129 |                 observation = tool.run(action.args)
130 |             except ValidationError as e:
131 |                 # The tool received invalid arguments
132 |                 observation = (
133 |                     f"Validation Error in args: {str(e)}, args: {action.args}"
134 |                 )
135 |             except Exception as e:
136 |                 # The tool failed while running
137 |                 observation = f"Error: {str(e)}, {type(e).__name__}, args: {action.args}"
138 | 
139 |         return observation
140 | 
141 |     def extract_between_markers(self, s, marker1, marker2):
142 |         """
143 |         Extracts a substring from 's' that is between two markers 'marker1' and 'marker2'.
144 | 
145 |         Parameters:
146 |         - s (str): The string from which to extract the substring.
147 |         - marker1 (str): The start marker.
148 |         - marker2 (str): The end marker.
149 | 
150 |         Returns:
151 |         - str: The extracted substring. If either marker is not found, the sentinel string "标志不存在" ("marker does not exist") is returned.
152 |         """
153 |         start_pos = s.find(marker1)
154 |         end_pos = s.find(marker2, start_pos + len(marker1)) if start_pos != -1 else -1  # look for marker2 only after marker1
155 | 
156 |         if start_pos != -1 and end_pos != -1:
157 |             return s[start_pos + len(marker1):end_pos]
158 |         else:
159 |             return "标志不存在"
160 | 
161 |     def run(
162 |             self,
163 |             task: str,
164 |             chat_history: ChatMessageHistory,
165 |             verbose=False
166 |     ) -> str:
167 |         """
168 |         Run the agent.
169 |         :param task: the user's task
170 |         :param chat_history: conversation context (long-term memory)
171 |         :param verbose: whether to print detailed output
172 |         """
173 |         short_term_memory = []
174 | 
175 |         thought_step_count = 0
176 | 
177 |         reply = ""
178 | 
179 |         while thought_step_count < self.max_thought_steps:
180 |             if verbose:
181 |                 self.verbose_handler.on_thought_start(thought_step_count)
182 | 
183 |             action, response = self.__step(
184 |                 task=task,
185 |                 short_term_memory=short_term_memory,
186 |                 chat_history=chat_history,
187 |                 verbose=verbose,
188 |             )
189 | 
190 |             if action.name == "FINISH":
191 |                 reply = response
192 |                 break
193 | 
194 |             observation = self.__exec_action(action)
195 | 
196 |             if verbose:
197 |                 self.verbose_handler.on_tool_end(observation)
198 | 
199 |             short_term_memory.append(
200 |                 self.__format_thought_observation(
201 |                     response, action, observation
202 |                 )
203 |             )
204 | 
205 |             thought_step_count += 1
206 | 
207 |         if thought_step_count >= self.max_thought_steps:
208 |             reply = "抱歉，我没能完成您的任务。"
209 | 
210 |         chat_history.add_user_message(task)
211 |         chat_history.add_ai_message(reply)
212 |         return reply
213 | 
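
The parsing step in `__step` depends on pulling the last fenced `json` block out of the model's reply, and `__format_thought_observation` strips those blocks before the thought is stored. The sketch below shows both operations using only the standard library. The function names and the `ListDirectory` tool are hypothetical. Unlike the class above, which falls back to `OutputFixingParser` when parsing fails, this sketch simply returns `None`.

```python
import json
import re
from typing import Any, Dict, Optional

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
JSON_BLOCK = re.compile(FENCE + r"json(.*?)" + FENCE, re.DOTALL)


def extract_last_json_action(text: str) -> Optional[Dict[str, Any]]:
    """Return the last fenced json block in `text` as a dict, or None."""
    matches = JSON_BLOCK.findall(text)
    if not matches:
        return None
    try:
        parsed = json.loads(matches[-1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def strip_json_blocks(text: str) -> str:
    """Remove every fenced json block, keeping only the free-text thought."""
    return JSON_BLOCK.sub("", text).strip()


# A reply shaped like the ones the main prompt asks the model to produce.
reply = (
    "I should look at the directory first.\n"
    + FENCE + 'json\n{"name": "ListDirectory", "args": {"path": "."}}\n' + FENCE
)
action = extract_last_json_action(reply)
thought = strip_json_blocks(reply)
```

Taking the last block rather than the first matters when the model quotes an earlier action while reasoning and then emits its real decision at the end.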


--------------------------------------------------------------------------------
/src/Agent/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/syzhy113/Engineering-Code-Analysis/a2fdb2c4d565c63ec80ce2c0897833e546879389/src/Agent/__init__.py


--------------------------------------------------------------------------------
/src/Models/Factory.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | from dotenv import load_dotenv, find_dotenv
 3 | 
 4 | _ = load_dotenv(find_dotenv())
 5 | 
 6 | from langchain_openai import ChatOpenAI, OpenAIEmbeddings, AzureChatOpenAI, AzureOpenAIEmbeddings
 7 | 
 8 | 
 9 | class ChatModelFactory:
10 |     model_params = {
11 |         "temperature": 0,
12 |         "model_kwargs": {"seed": 42},
13 |     }
14 | 
15 |     @classmethod
16 |     def get_model(cls, model_name: str, use_azure: bool = False):
17 |         if "gpt" in model_name:
18 |             if not use_azure:
19 |                 return ChatOpenAI(model=model_name, **cls.model_params)
20 |             else:
21 |                 return AzureChatOpenAI(
22 |                     azure_deployment=model_name,
23 |                     api_version="2024-05-01-preview",
24 |                     **cls.model_params
25 |                 )
26 |         elif model_name == "qwen2":
27 |             # Try an open-source model instead.
28 |             # https://siliconflow.cn/
29 |             # A Model-as-a-Service platform that
30 |             # exposes various open-source language models through an OpenAI-compatible API.
31 |             return ChatOpenAI(
32 |                 model="alibaba/Qwen2-72B-Instruct",  # model name
33 |                 openai_api_key=os.getenv("SILICONFLOW_API_KEY"),  # obtained after registering on the platform
34 |                 openai_api_base="https://api.siliconflow.cn/v1",  # platform API endpoint
35 |                 **cls.model_params,
36 |             )
37 |         raise NotImplementedError(f"Model {model_name} not implemented.")
38 |     @classmethod
39 |     def get_default_model(cls):
40 |         return cls.get_model("gpt-3.5-turbo")
41 | 
42 | 
43 | class EmbeddingModelFactory:
44 | 
45 |     @classmethod
46 |     def get_model(cls, model_name: str, use_azure: bool = False):
47 |         if model_name.startswith("text-embedding"):
48 |             if not use_azure:
49 |                 return OpenAIEmbeddings(model=model_name)
50 |             else:
51 |                 return AzureOpenAIEmbeddings(
52 |                     azure_deployment=model_name,
53 |                     openai_api_version="2024-05-01-preview",
54 |                 )
55 |         else:
56 |             raise NotImplementedError(f"Model {model_name} not implemented.")
57 | 
58 |     @classmethod
59 |     def get_default_model(cls):
60 |         return cls.get_model("text-embedding-ada-002")
61 | 


--------------------------------------------------------------------------------
/src/Models/Github.py:
--------------------------------------------------------------------------------
 1 | from Tools.githubTool import github_core
 2 | from github import Github
 3 | from Tools.Python_structure_Tool import analyze_python_dict
 4 | import re
 5 | import os
 6 | 
 7 | 
 8 | g = Github(os.environ["GIT_KEY"])
 9 | core = github_core()
10 | 
11 | 
12 | def parse_github_url(url):
13 |     match = re.match(r"https://github\.com/([^/]+)/([^/]+)", url)
14 |     if match:
15 |         owner, repo_name = match.groups()
16 |         return f"{owner}/{repo_name}"
17 |     else:
18 |         return "无效的 GitHub URL，不存在该URL"
19 | 
20 | 
21 | def directory_contents2str_github(contents, repo, branch_name, level=0):
22 |     content_str = ""
23 |     for content_file in contents:
24 |         content_str += "  " * level + '--' + content_file.path + '\n'
25 |         if content_file.type == "dir":
26 |             content_str += directory_contents2str_github(repo.get_contents(content_file.path, ref=branch_name), repo,
27 |                                                          branch_name, level + 1)
28 |     return content_str
29 | 
30 | 
31 | def directory_contents2str(directory, level=0):
32 |     content_str = str(directory) + "文件夹下的路径为" + '\n'
33 |     for item in os.listdir(directory):
34 |         item_path = os.path.join(directory, item)
35 |         content_str += "  " * level + '--' + item + '\n'
36 |         if item_path.endswith('.py'):
37 |             content_str += "  " * (level + 1) + analyze_python_dict([item_path]) + '\n'
38 |         if os.path.isdir(item_path):
39 |             content_str += directory_contents2str(item_path, level + 1)
40 |     return content_str
41 | 
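
`parse_github_url` above accepts any run of non-slash characters as the repository name, so a URL followed directly by other text (as in the README example) or ending in `.git` yields a slug that is not a valid repository. The sketch below is a stricter variant. It assumes repository names contain only ASCII letters, digits, `.`, `-`, and `_`, and it returns `None` instead of an error string. `parse_repo_slug` is a hypothetical name, not a function in the repository.

```python
import re
from typing import Optional

# Owner and repository are limited to the characters assumed valid on GitHub,
# so trailing text such as a path, a query string, or a sentence is ignored.
_GITHUB_REPO = re.compile(r"^https?://github\.com/([A-Za-z0-9-]+)/([A-Za-z0-9._-]+)")


def parse_repo_slug(url: str) -> Optional[str]:
    """Return "owner/repo" for a GitHub URL, or None if the URL does not match."""
    match = _GITHUB_REPO.match(url.strip())
    if match is None:
        return None
    owner, repo = match.groups()
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]  # clone URLs carry a .git suffix
    return f"{owner}/{repo}"


plain = parse_repo_slug("https://github.com/raghakot/keras-resnet")
clone = parse_repo_slug("https://github.com/raghakot/keras-resnet.git")
nested = parse_repo_slug("https://github.com/raghakot/keras-resnet/tree/master/tests")
glued = parse_repo_slug("https://github.com/raghakot/keras-resnet解析一下这个工程")
other = parse_repo_slug("https://example.com/raghakot/keras-resnet")
```

Returning `None` on failure lets the caller distinguish a bad URL from a real slug, which the error-string return value above does not.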


--------------------------------------------------------------------------------
/src/Models/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/syzhy113/Engineering-Code-Analysis/a2fdb2c4d565c63ec80ce2c0897833e546879389/src/Models/__init__.py


--------------------------------------------------------------------------------
/src/Tools/EmailTool.py:
--------------------------------------------------------------------------------
 1 | import webbrowser
 2 | import urllib.parse
 3 | import re
 4 | 
 5 | 
 6 | def _is_valid_email(email: str) -> bool:
 7 |     receivers = email.split(';')
 8 |     # Regular expression that matches an email address
 9 |     pattern = r'^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$'
10 |     for receiver in receivers:
11 |         if not bool(re.match(pattern, receiver.strip())):
12 |             return False
13 |     return True
14 | 
15 | 
16 | def send_email(
17 |         to: str,
18 |         subject: str,
19 |         body: str,
20 |         cc: str = None,
21 |         bcc: str = None,
22 | ) -> str:
23 |     """Send an email to the specified address."""
24 | 
25 |     if not _is_valid_email(to):
26 |         return f"Invalid email address: {to}"
27 | 
28 |     # URL-encode the subject and body of the email
29 |     subject_code = urllib.parse.quote(subject)
30 |     body_code = urllib.parse.quote(body)
31 | 
32 |     # Build the mailto link
33 |     mailto_url = f'mailto:{to}?subject={subject_code}&body={body_code}'
34 |     if cc is not None:
35 |         cc = urllib.parse.quote(cc)
36 |         mailto_url += f'&cc={cc}'
37 |     if bcc is not None:
38 |         bcc = urllib.parse.quote(bcc)
39 |         mailto_url += f'&bcc={bcc}'
40 | 
41 |     webbrowser.open(mailto_url)
42 | 
43 |     return f"Status: success\nNote: email sent to {to}, subject: {subject}"
44 | 


--------------------------------------------------------------------------------
/src/Tools/ExcelTool.py:
--------------------------------------------------------------------------------
 1 | import pandas as pd
 2 | 
 3 | 
 4 | def get_sheet_names(
 5 |         filename: str
 6 | ) -> str:
 7 |     """Return the sheet names of an Excel file."""
 8 |     excel_file = pd.ExcelFile(filename, engine='openpyxl')
 9 |     sheet_names = excel_file.sheet_names
10 |     return f"Sheet names in '{filename}':\n\n{sheet_names}"
11 | 
12 | 
13 | def get_column_names(
14 |         filename: str
15 | ) -> str:
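16 |     """Return the column names of the first sheet of an Excel file."""
17 | 
18 |     # Read the first sheet; the first row is treated as the header row.
19 |     df = pd.read_excel(filename, sheet_name=0, engine='openpyxl')  # sheet_name=0 selects the first sheet
20 | 
21 |     # Column labels are not always strings (they may be ints or dates),
22 |     # so convert each one to str before joining.
23 |     #
24 |     column_names = '\n'.join(map(str, df.columns.to_list()))
25 | 
26 |     result = f"Column names of the first sheet in '{filename}':\n\n{column_names}"
27 |     return result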
28 | 
29 | 
30 | def get_first_n_rows(
31 |         filename: str,
32 |         n: int = 3
33 | ) -> str:
34 |     """Return the first n rows of an Excel file."""
35 | 
36 |     result = get_sheet_names(filename) + "\n\n"
37 | 
38 |     result += get_column_names(filename) + "\n\n"
39 | 
40 |     # Read the first sheet of the Excel file
41 |     df = pd.read_excel(filename, sheet_name=0, engine='openpyxl')  # sheet_name=0 selects the first sheet
42 | 
43 |     n_lines = '\n'.join(
44 |         df.head(n).to_string(index=False, header=True).split('\n')
45 |     )
46 | 
47 |     result += f"Sample of the first {n} rows of the first sheet in '{filename}':\n\n{n_lines}"
48 |     return result
49 | 
50 | 
51 | if __name__ == '__main__':
52 |     file_name = '../data/labels.xlsx'
53 |     print(get_first_n_rows(file_name, 3))
54 | 


--------------------------------------------------------------------------------
/src/Tools/FileQATool.py:
--------------------------------------------------------------------------------
 1 | from typing import List
 2 | from langchain.schema import Document
 3 | from langchain.text_splitter import RecursiveCharacterTextSplitter
 4 | from langchain_community.vectorstores import Chroma
 5 | from langchain_community.document_loaders import PyMuPDFLoader
 6 | from langchain_community.document_loaders import UnstructuredWordDocumentLoader
 7 | from langchain.chains import RetrievalQA
 8 | 
 9 | from Models.Factory import ChatModelFactory, EmbeddingModelFactory
10 | 
11 | 
12 | class FileLoadFactory:
13 |     @staticmethod
14 |     def get_loader(filename: str):
15 |         ext = get_file_extension(filename)
16 |         if ext == "pdf":
17 |             return PyMuPDFLoader(filename)
18 |         elif ext == "docx" or ext == "doc":
19 |             return UnstructuredWordDocumentLoader(filename)
20 |         else:
21 |             raise NotImplementedError(f"File extension {ext} not supported.")
22 | 
23 | 
24 | def get_file_extension(filename: str) -> str:
25 |     return filename.split(".")[-1]
26 | 
27 | 
28 | def load_docs(filename: str) -> List[Document]:
29 |     file_loader = FileLoadFactory.get_loader(filename)
30 |     pages = file_loader.load_and_split()
31 |     return pages
32 | 
33 | 
34 | def ask_docment(
35 |         filename: str,
36 |         query: str,
37 | ) -> str:
38 |     """Answer a question based on the content of a PDF or Word document."""
39 | 
40 |     raw_docs = load_docs(filename)
41 |     if len(raw_docs) == 0:
42 |         return "Sorry, the document is empty."
43 |     text_splitter = RecursiveCharacterTextSplitter(
44 |         chunk_size=200,
45 |         chunk_overlap=100,
46 |         length_function=len,
47 |         add_start_index=True,
48 |     )
49 |     documents = text_splitter.split_documents(raw_docs)
50 |     if documents is None or len(documents) == 0:
51 |         return "Unable to read the document content."
52 |     db = Chroma.from_documents(documents, EmbeddingModelFactory.get_default_model())
53 |     qa_chain = RetrievalQA.from_chain_type(
54 |         llm=ChatModelFactory.get_default_model(),  # language model
55 |         chain_type="stuff",  # "stuff" places all retrieved chunks into a single prompt
56 |         retriever=db.as_retriever()  # retriever
57 |     )
58 |     response = qa_chain.run(query + " (Please answer in Chinese)")
59 |     return response
60 | 
61 | 
62 | if __name__ == "__main__":
63 |     filename = "../data/2023年10月份销售计划.docx"
64 |     query = "销售额达标的标准是多少？"
65 |     response = ask_docment(filename, query)
66 |     print(response)
67 | 


--------------------------------------------------------------------------------
/src/Tools/FileTool.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | 
 3 | 
 4 | def list_files_in_directory(path: str) -> str:
 5 |     """List all file names in the directory"""
 6 |     file_names = os.listdir(path)
 7 | 
 8 |     # Join the file names into a single string, separated by a newline
 9 |     return "\n".join(file_names)
10 | 
11 | 
12 | def return_dict(path: str) -> str:
13 |     return path
14 | 


--------------------------------------------------------------------------------
/src/Tools/FinishTool.py:
--------------------------------------------------------------------------------
1 | def finish(the_final_answer: str) -> str:
2 |     return the_final_answer
3 | 
4 | 


--------------------------------------------------------------------------------
/src/Tools/PythonTool.py:
--------------------------------------------------------------------------------
  1 | import re
  2 | from typing import Union
  3 | 
  4 | from langchain.tools import StructuredTool
  5 | from langchain_core.language_models import BaseChatModel, BaseLanguageModel
  6 | from langchain_core.output_parsers import BaseOutputParser, StrOutputParser
  7 | from langchain_core.prompts import PromptTemplate
  8 | 
  9 | from Models.Factory import ChatModelFactory
 10 | from Utils.CallbackHandlers import ColoredPrintHandler
 11 | from Utils.PrintUtils import CODE_COLOR
 12 | from langchain_openai import ChatOpenAI
 13 | from .ExcelTool import get_first_n_rows, get_column_names
 14 | from langchain_experimental.utilities import PythonREPL
 15 | 
 16 | 
 17 | class PythonCodeParser(BaseOutputParser):
 18 |     """Extract Python code from the text returned by OpenAI."""
 19 | 
 20 |     @staticmethod
 21 |     def __remove_marked_lines(input_str: str) -> str:
 22 |         lines = input_str.strip().split('\n')
 23 |         if lines and lines[0].strip().startswith('```'):
 24 |             del lines[0]
 25 |         if lines and lines[-1].strip().startswith('```'):
 26 |             del lines[-1]
 27 | 
 28 |         ans = '\n'.join(lines)
 29 |         return ans
 30 | 
 31 |     def parse(self, text: str) -> str:
 32 |         # Use a regular expression to find all Python code blocks
 33 |         python_code_blocks = re.findall(r'```python\n(.*?)\n```', text, re.DOTALL)
 34 |         # Take the first matched block as the Python code text
 35 |         python_code = None
 36 |         if len(python_code_blocks) > 0:
 37 |             python_code = python_code_blocks[0]
 38 |             python_code = self.__remove_marked_lines(python_code)
 39 |         return python_code
 40 | 
 41 | 
 42 | class ExcelAnalyser:
 43 |     """
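 44 |     Analyse the content of a structured file (such as an Excel file) by running a script.
 45 |     The input must include the full file path, the specific analysis method, and its basis, such as threshold constants.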
 46 |     """
 47 | 
 48 |     def __init__(
 49 |             self,
 50 |             llm: Union[BaseLanguageModel, BaseChatModel],
 51 |             prompt_file="./prompts/tools/excel_analyser.txt",
 52 |             verbose=False
 53 |     ):
 54 |         self.llm = llm
 55 |         self.prompt = PromptTemplate.from_file(prompt_file)
 56 |         self.verbose = verbose
 57 |         self.verbose_handler = ColoredPrintHandler(CODE_COLOR)
 58 | 
 59 |     def analyse(self, query, filename):
 60 | 
 61 |         """Analyse the content of a structured file (such as an Excel file)."""
 62 | 
 63 |         # columns = get_column_names(filename)
 64 |         inspections = get_first_n_rows(filename, 3)
 65 | 
 66 |         code_parser = PythonCodeParser()
 67 |         chain = self.prompt | self.llm | StrOutputParser()
 68 | 
 69 |         response = ""
 70 | 
 71 |         for c in chain.stream({
 72 |             "query": query,
 73 |             "filename": filename,
 74 |             "inspections": inspections
 75 |         }, config={
 76 |             "callbacks": [
 77 |                 self.verbose_handler
 78 |             ] if self.verbose else []
 79 |         }):
 80 |             response += c
 81 | 
 82 |         code = code_parser.parse(response)
 83 | 
 84 |         if code:
 85 |             ans = query + "\n" + PythonREPL().run(code)
 86 |             return ans
 87 |         else:
 88 |             return "No executable Python code was found."
 89 | 
 90 |     def as_tool(self):
 91 |         return StructuredTool.from_function(
 92 |             func=self.analyse,
 93 |             name="AnalyseExcel",
 94 |             description=self.__class__.__doc__.replace("\n", ""),
 95 |         )
 96 | 
 97 | 
 98 | def read_python(filenames):
 99 |     codes = ""
100 |     for filename in filenames:
101 |         with open(filename, "r", encoding="utf-8") as file:
102 |             code = file.read()
103 |             codes += "######" + str(filename) + "#####" + code + '\n'
104 |     return codes
105 | 
106 | 


--------------------------------------------------------------------------------
/src/Tools/Python_structure_Tool.py:
--------------------------------------------------------------------------------
 1 | import ast
 2 | import os
 3 | 
 4 | 
 5 | def analyze_python_dict(file_paths):
 6 |     contents = ''
 7 |     for file_path in file_paths:
 8 |         contents += f'#######{file_path}#######' + '\n' + (analyze_python_file(file_path) or '')
 9 | 
10 |     return contents
11 | 
12 | 
13 | def analyze_python_file(file_path):
14 |     if file_path.endswith('.py'):
15 |         with open(file_path, "r", encoding="utf-8") as file:
16 |             code = file.read()
17 |         contents = ""
18 |         try:
19 |             parsed_code = ast.parse(code)
20 |             classes = [node for node in ast.walk(parsed_code) if isinstance(node, ast.ClassDef)]
21 |             functions = [node for node in ast.walk(parsed_code) if isinstance(node, ast.FunctionDef)]
22 | 
23 |             print(f"\nParsing file: {file_path}")
24 | 
25 |             for cls in classes:
26 |                 contents += f"Class: {cls.name}" + '\n'
27 |                 for item in cls.body:
28 |                     # Methods are reported under their class, so drop them
29 |                     # from the list of standalone functions.
30 |                     if item in functions:
31 |                         functions.remove(item)
32 |                     if isinstance(item, ast.FunctionDef):
33 |                         contents += f"  Method: {item.name}" + '\n'
34 |                         contents += "  Parameters: " + str([arg.arg for arg in item.args.args]) + '\n'
35 | 
36 |             for func in functions:
37 |                 if func not in classes:
38 |                     contents += "Function: " + func.name + '\n'
39 |                     contents += "Parameters: " + str([arg.arg for arg in func.args.args]) + '\n'
40 |                     # print("Function body:")
41 |                     # for stmt in func.body:
42 |                     #     print(f"  {ast.dump(stmt)}")
43 | 
44 |         except Exception as e:
45 |             contents += f"Error while parsing {file_path}: {e}"
46 |         return contents
47 | 
48 |     elif file_path.endswith('.m'):
49 |         with open(file_path, "r", encoding="gbk") as file:
50 |             code = file.read()
51 | 
52 |         print(f"\nParsing file: {file_path}")
53 | 
54 | 
55 | if __name__ == "__main__":
56 |     contents = analyze_python_file('../Agent/ReAct.py')
57 |     contents = analyze_python_dict(['../Agent/ReAct.py', '../Agent/Action.py'])
58 |     pass
59 | 


--------------------------------------------------------------------------------
/src/Tools/Tools.py:
--------------------------------------------------------------------------------
 1 | import warnings
 2 | warnings.filterwarnings("ignore")
 3 | 
 4 | from langchain.tools import StructuredTool
 5 | from .FileQATool import ask_docment
 6 | from .WriterTool import write
 7 | from .EmailTool import send_email
 8 | from .ExcelTool import get_first_n_rows
 9 | from .FileTool import list_files_in_directory, return_dict
10 | from .FinishTool import finish
11 | from .PythonTool import read_python
12 | from .Python_structure_Tool import analyze_python_dict
13 | from .githubTool import github_core
14 | 
15 | document_qa_tool = StructuredTool.from_function(
16 |     func=ask_docment,
17 |     name="AskDocument",
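18 |     description="Answer a question based on the content of a Word or PDF document. Take the context into account and make sure the question fully states the definitions of the relevant concepts.",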
19 | )
20 | 
21 | document_generation_tool = StructuredTool.from_function(
22 |     func=write,
23 |     name="GenerateDocument",
24 |     description="Generate a formal document from a description of the requirements",
25 | )
26 | 
27 | email_tool = StructuredTool.from_function(
28 |     func=send_email,
29 |     name="SendEmail",
30 |     description="Send an email to the specified address. Make sure the address has the form xxx@xxx.xxx. Separate multiple addresses with ';'.",
31 | )
32 | 
33 | excel_inspection_tool = StructuredTool.from_function(
34 |     func=get_first_n_rows,
35 |     name="InspectExcel",
36 |     description="Inspect the content and structure of a spreadsheet file, showing its column names and first n rows; n defaults to 3",
37 | )
38 | 
39 | directory_inspection_tool = StructuredTool.from_function(
40 |     func=list_files_in_directory,
41 |     name="ListDirectory",
42 |     description="Inspect the content and structure of a directory, showing the names of its files and subdirectories",
43 | )
44 | 
45 | finish_placeholder = StructuredTool.from_function(
46 |     func=finish,
47 |     name="FINISH",
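48 |     description="End the task and return the final answer. The final answer must state explicitly what the answer is, for example: the core code in tool.py is xxx, and it connects to the API"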
49 | )
50 | 
51 | python_tool = StructuredTool.from_function(
52 |     func=read_python,
53 |     name="ReadPython",
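54 |     description="Do not call this tool for GitHub projects #### Read Python code and return it as a string. More than one file can be read at a time. A complete file path must be provided for each file. "
55 |                 "If the code structure already given to you is enough to answer the question, you do not need to call this tool"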
56 | 
57 | )
58 | 
59 | return_dict_tool = StructuredTool.from_function(
60 |     func=return_dict,
61 |     name="return_file",
62 |     description="Inspect the content and structure of a directory, showing the names of its files and subdirectories"
63 | )
64 | 
65 | get_python_structure = StructuredTool.from_function(
66 |     func=analyze_python_dict,
67 |     name="get_python_structure",
68 |     description="Get the structure of Python code. More than one file can be read at a time. A complete file path must be provided for each file"
69 | )
70 | 
71 | 
72 | 


--------------------------------------------------------------------------------
/src/Tools/WriterTool.py:
--------------------------------------------------------------------------------
 1 | from langchain.prompts import ChatPromptTemplate
 2 | from langchain.prompts.chat import SystemMessagePromptTemplate, HumanMessagePromptTemplate
 3 | from langchain_core.output_parsers import StrOutputParser
 4 | from langchain_core.runnables import RunnablePassthrough
 5 | from langchain_openai import ChatOpenAI
 6 | 
 7 | from Models.Factory import ChatModelFactory
 8 | 
 9 | 
10 | def write(query: str, verbose=False):
11 |     """Write a document according to the user's requirements."""
12 |     template = ChatPromptTemplate.from_messages(
13 |         [
14 |             SystemMessagePromptTemplate.from_template(
15 |                 "You are a professional document writer. Write a document according to the client's requirements. Write the output in Chinese."),
16 |             HumanMessagePromptTemplate.from_template("{query}"),
17 |         ]
18 |     )
19 | 
20 |     llm = ChatModelFactory.get_default_model()
21 | 
22 |     chain = {"query": RunnablePassthrough()} | template | llm | StrOutputParser()
23 | 
24 |     return chain.invoke(query)
25 | 
26 | 
27 | if __name__ == "__main__":
28 |     print(write("Write an email to Zhang San with the following content: Hello, I am Li Si."))
29 | 


--------------------------------------------------------------------------------
/src/Tools/__init__.py:
--------------------------------------------------------------------------------
 1 | from .Tools import (
 2 |     document_qa_tool,
 3 |     document_generation_tool,
 4 |     email_tool,
 5 |     excel_inspection_tool,
 6 |     directory_inspection_tool,
 7 |     finish_placeholder,
 8 |     python_tool,
 9 |     return_dict_tool,
10 |     get_python_structure,
11 | )
12 | 


--------------------------------------------------------------------------------
/src/Tools/githubTool.py:
--------------------------------------------------------------------------------
 1 | from langchain.tools import StructuredTool
 2 | 
 3 | 
 4 | class github_core:
 5 |     def __init__(self, repo=None, branch_name=None):
 6 |         self.repo = repo
 7 |         self.branch_name = branch_name
 8 | 
 9 |     def set_repo_branch(self, repo, branch_name):
10 |         self.repo = repo
11 |         self.branch_name = branch_name
12 | 
13 |     def get_github_codes(self, file):
14 |         if isinstance(file, str):
15 |             return self.repo.get_contents(file, ref=self.branch_name).decoded_content.decode("utf-8")
16 |         else:
17 |             contents = ""
18 |             for f in file:
19 |                 contents += f'#######{f}#######' + '\n' + \
20 |                             self.repo.get_contents(f, ref=self.branch_name).decoded_content.decode("utf-8")
21 |             return contents
22 | 
23 |     def as_tool(self):
24 |         return StructuredTool.from_function(
25 |             func=self.get_github_codes,
26 |             name="get_github_code",
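27 |             description='For GitHub tasks, this is the only tool that may be used to read code. It reads code and READMEs from GitHub and returns them as a string. More than one file can be read at a time. '
28 |                         'To read more than one file, pass the paths as a list. A complete file path must be provided for each file',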
29 |         )
30 | 


--------------------------------------------------------------------------------
/src/Utils/CallbackHandlers.py:
--------------------------------------------------------------------------------
 1 | from typing import Optional, Union, Any, Dict
 2 | from uuid import UUID
 3 | 
 4 | from langchain_core.callbacks import BaseCallbackHandler
 5 | from langchain_core.outputs import GenerationChunk, ChatGenerationChunk, LLMResult
 6 | 
 7 | from Utils.PrintUtils import *
 8 | 
 9 | 
10 | class ColoredPrintHandler(BaseCallbackHandler):
11 |     def __init__(self, color: str):
12 |         BaseCallbackHandler.__init__(self)
13 |         self._color = color
14 | 
15 |     def on_llm_new_token(
16 |             self,
17 |             token: str,
18 |             *,
19 |             chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
20 |             run_id: UUID,
21 |             parent_run_id: Optional[UUID] = None,
22 |             **kwargs: Any,
23 |     ) -> Any:
24 |         color_print(token, self._color, end="")
25 |         return token
26 | 
27 |     def on_llm_end(self, response: LLMResult, **kwargs: Any) -> Any:
28 |         color_print("\n", self._color, end="")
29 |         return response
30 | 
31 |     def on_tool_end(self, output: Any, **kwargs: Any) -> Any:
32 |         """Run when tool ends running."""
33 |         print()
34 |         color_print("\n[Tool Return]", RETURN_COLOR)
35 |         color_print(output, OBSERVATION_COLOR)
36 |         return output
37 | 
38 |     @staticmethod
39 |     def on_thought_start(index: int, **kwargs: Any) -> Any:
40 |         """Custom event; not inherited from BaseCallbackHandler."""
41 |         color_print(f"\n[Thought: {index}]", ROUND_COLOR)
42 |         return index
43 | 
44 | 


--------------------------------------------------------------------------------
/src/Utils/PrintUtils.py:
--------------------------------------------------------------------------------
 1 | from colorama import init, Fore, Back, Style
 2 | import sys
 3 | 
 4 | THOUGHT_COLOR = Fore.GREEN
 5 | OBSERVATION_COLOR = Fore.YELLOW
 6 | ROUND_COLOR = Fore.BLUE
 7 | RETURN_COLOR = Fore.CYAN
 8 | CODE_COLOR = Fore.WHITE
 9 | 
10 | 
11 | def color_print(text, color=None, end="\n"):
12 |     if color is not None:
13 |         content = color + str(text) + Style.RESET_ALL + end
14 |     else:
15 |         content = str(text) + end
16 |     sys.stdout.write(content)
17 |     sys.stdout.flush()
18 | 


--------------------------------------------------------------------------------
/src/Utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/syzhy113/Engineering-Code-Analysis/a2fdb2c4d565c63ec80ce2c0897833e546879389/src/Utils/__init__.py


--------------------------------------------------------------------------------
/src/config.json:
--------------------------------------------------------------------------------
1 | {
2 |   "GIT_KEY": "",
3 |   "OPENAI_API_KEY": "",
4 |   "OPENAI_BASE_URL": ""
5 | }


--------------------------------------------------------------------------------
/src/main.py:
--------------------------------------------------------------------------------
  1 | import warnings
  2 | warnings.filterwarnings("ignore")
  3 | import asyncio
  4 | import os
  5 | from Tools import *
  6 | from langchain_core.prompts import HumanMessagePromptTemplate, ChatPromptTemplate
  7 | from langchain.schema.output_parser import StrOutputParser
  8 | from langchain_core.tools import render_text_description
  9 | from Agent.Action import Action
 10 | from langchain_community.chat_message_histories.in_memory import ChatMessageHistory
 11 | from langchain.output_parsers import PydanticOutputParser, OutputFixingParser
 12 | from Agent.ReAct import ReActAgent
 13 | from langchain_openai import ChatOpenAI
 14 | import json
 15 | 
 16 | 
 17 | with open('config.json') as f:
 18 |     config = json.load(f)
 19 |     os.environ["GIT_KEY"] = config["GIT_KEY"]
 20 |     os.environ["OPENAI_API_KEY"] = config["OPENAI_API_KEY"]
 21 |     os.environ["OPENAI_BASE_URL"] = config["OPENAI_BASE_URL"]
 22 | 
 23 | from typing import Annotated
 24 | from os.path import exists
 25 | from langgraph.graph.message import add_messages
 26 | 
 27 | from typing import Literal
 28 | 
 29 | from langchain_core.runnables import RunnableConfig
 30 | from langgraph.graph import END, START, StateGraph
 31 | from typing import TypedDict, Optional
 32 | from Models.Github import *
 33 | 
 34 | 
 35 | class State(TypedDict):
 36 |     messages: Annotated[list, add_messages]
 37 |     dict_struct: Optional[str]
 38 |     python_code_list: Optional[list]
 39 | 
 40 | 
 41 | tools = [python_tool, finish_placeholder, core.as_tool()]
 42 | 
 43 | model_name = "gpt-4o"
 44 | 
 45 | model = ChatOpenAI(model=model_name)
 46 | 
 47 | agent = ReActAgent(
 48 |     llm=model,
 49 |     tools=tools,
 50 |     main_prompt_file="./prompts/main/sample_agent.txt",
 51 |     max_thought_steps=5,
 52 | )
 53 | 
 54 | large_agent = ReActAgent(
 55 |     llm=model,
 56 |     tools=tools,
 57 |     main_prompt_file="./prompts/main/main.txt",
 58 |     max_thought_steps=20,
 59 | )
 60 | 
 61 | 
 62 | def should_agent(state: State) -> Literal["agent", "sample_agent", "__end__"]:
 63 |     messages = state["messages"]
 64 |     last_message = messages[-1].content
 65 |     if last_message == "NO":
 66 |         return "sample_agent"
 67 |     elif last_message == "YES":
 68 |         return "agent"
 69 |     else:
 70 |         print("No result")
 71 |         return END
 72 | 
 73 | 
 74 | def no_dict(state: State) -> Literal["sample_agent", "agent_select"]:
 75 |     dict_str = state["dict_struct"]
 76 |     if dict_str:
 77 |         return "agent_select"
 78 |     return "sample_agent"
 79 | 
 80 | 
 81 | def extract_json_action(text: str) -> Optional[str]:
 82 |     json_pattern = re.compile(r'```json(.*?)```', re.DOTALL)
 83 |     matches = json_pattern.findall(text)
 84 |     if matches:
 85 |         last_json_str = matches[-1]
 86 |         return last_json_str
 87 |     return None
 88 | 
 89 | 
 90 | async def dict_list(state: State, config: RunnableConfig):
 91 |     contents = "not found Start-*&^%"
 92 |     output_parser = PydanticOutputParser(pydantic_object=Action)
 93 |     with open('./prompts/main/dict_list_prompt.txt', 'r', encoding='utf-8') as f:
 94 |         prompt = ChatPromptTemplate.from_messages([HumanMessagePromptTemplate.from_template(f.read())]).partial(
 95 |             tools=render_text_description([return_dict_tool]),
 96 |             tool_names=','.join([return_dict_tool.name]),
 97 |             format_instructions=output_parser.get_format_instructions(),
 98 |         )
 99 |     messages = state["messages"]
100 |     run_model = (prompt | model | StrOutputParser())
101 |     short_term_memory = []
102 |     robust_parser = OutputFixingParser.from_llm(
103 |         parser=output_parser,
104 |         llm=model
105 |     )
106 |     time = 5
107 |     while contents.find('not found') != -1 and time > 0:
108 |         time -= 1
109 |         if contents.find('Start-*&^%') != -1:
110 |             input_temp = {'input': messages[0].content, "agent_scratchpad": short_term_memory}
111 |         else:
112 |             input_temp = {'input': messages[0].content + contents, "agent_scratchpad": short_term_memory}
113 |         response = await run_model.ainvoke(input_temp, config)
114 |         json_action = extract_json_action(response)
115 |         action = robust_parser.parse(json_action)
116 |         response = return_dict_tool.run(action.args)
117 |         if response == "NO":
118 |             break
119 |         if response.find('github') != -1:
120 |             repo_full_name = parse_github_url(response)  # extract the "owner/repo" name
121 |             try:
122 |                 branch_name = "master"
123 |                 repo = g.get_repo(repo_full_name)
124 |                 contents = repo.get_contents("", ref=branch_name)
125 |                 contents = directory_contents2str_github(contents=contents, repo=repo, branch_name=branch_name)
126 |                 core.set_repo_branch(repo, branch_name)
127 |             except Exception:
128 |                 pass
129 | 
130 |             try:
131 |                 branch_name = "main"
132 |                 repo = g.get_repo(repo_full_name)
133 |                 contents = repo.get_contents("", ref=branch_name)
134 |                 contents = directory_contents2str_github(contents=contents, repo=repo, branch_name=branch_name)
135 |                 core.set_repo_branch(repo, branch_name)
136 |             except Exception:
137 |                 pass
138 |         else:
139 |             if exists(response):
140 |                 contents = directory_contents2str(response)
141 |             else:
142 |                 contents = 'Path not found, please check it again'
143 | 
144 |     return {"messages": response, 'dict_struct': contents}
145 | 
146 | 
147 | async def call_agent(state: State, config: RunnableConfig):
148 |     messages = state["messages"][-3]
149 |     large_agent.work_dir = state["dict_struct"]
150 |     response = large_agent.run(messages.content, chat_history, verbose=True)
151 |     return {"messages": response}
152 | 
153 | 
154 | async def select_model(state: State, config: RunnableConfig):
155 |     with open('./prompts/main/choose_agent.txt', 'r', encoding='utf-8') as f:
156 |         prompt = ChatPromptTemplate.from_messages([HumanMessagePromptTemplate.from_template(f.read())])
157 |     messages = state["messages"]
158 |     run_model = (prompt | model | StrOutputParser())
159 |     input_temp = {'input': messages[0].content}
160 |     response = await run_model.ainvoke(input_temp, config)
161 |     return {"messages": response}
162 | 
163 | 
164 | async def call_model(state: State, config: RunnableConfig):
165 |     messages = state["messages"][-3]
166 |     agent.work_dir = state["dict_struct"]
167 |     response = agent.run(messages.content, chat_history, verbose=True)
168 | 
169 |     return {"messages": response}
170 | 
171 | 
172 | async def all_in_all_model(state: State, config: RunnableConfig):
173 |     messages = state["messages"][-1]
174 |     input_temp = {
175 |         "input": messages,
176 |     }
177 |     with open('./prompts/main/allinall_prompt.txt', 'r', encoding='utf-8') as f:
178 |         prompt = ChatPromptTemplate.from_messages([HumanMessagePromptTemplate.from_template(f.read())])
179 |     run = prompt | model | StrOutputParser()
180 |     response = ""
181 |     async for s in run.astream(input=input_temp):
182 |         response += s
183 |     print(f"{response}\n")
184 |     return {"messages": response}
185 | 
186 | 
187 | chat_history = ChatMessageHistory()
188 | workflow = StateGraph(State)
189 | 
190 | workflow.add_node("dict_list", dict_list)
191 | workflow.add_node("agent_select", select_model)
192 | workflow.add_node("sample_agent", call_model)
193 | workflow.add_node("agent", call_agent)
194 | workflow.add_node("all_agent", all_in_all_model)
195 | 
196 | workflow.add_edge(START, "dict_list")
197 | workflow.add_edge("dict_list", "agent_select")
198 | workflow.add_edge("sample_agent", "all_agent")
199 | workflow.add_edge("agent", "all_agent")
200 | workflow.add_edge("all_agent", END)
201 | 
202 | workflow.add_conditional_edges(
203 |     "dict_list",
204 |     no_dict,
205 | )
206 | 
207 | workflow.add_conditional_edges(
208 |     "agent_select",
209 |     should_agent,
210 | )
211 | 
212 | app = workflow.compile()
213 | 
214 | 
215 | async def generate(inputs):
216 |     async for event in app.astream_events({"messages": inputs}, version="v1"):
217 |         kind = event["event"]
218 |         if kind == "on_chat_model_stream":
219 |             content = event["data"]["chunk"].content
220 | 
221 | 
222 | async def main():
223 |     human_icon = "\U0001F468"
224 |     ai_icon = "\U0001F916"
225 | 
226 |     while True:
227 |         task = input(f"{ai_icon}: How can I help you?\n{human_icon}: ")
228 |         if task.strip().lower() == "quit":
229 |             break
230 |         await generate(task)
231 | 
232 | if __name__ == "__main__":
233 |     asyncio.run(main())
234 | 


--------------------------------------------------------------------------------
/src/prompts/main/allinall_prompt.txt:
--------------------------------------------------------------------------------
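 1 | Your task is to summarize the output of an agent. Preserve the original meaning, but present it more clearly.
 2 | 
 3 | The content to summarize is:
 4 | {input}
 5 | 
 6 | You must follow these constraints to complete the task.
 7 | 1. Ignore the reasoning, decomposition, key-concept, and similar steps in the input that are unrelated to the question.
 8 | 2. Omit the final JSON part; it is unrelated to the question.
 9 | 3. Do not open with "The summary is as follows", and do not mention irrelevant details such as task difficulty, tool calls, or what the task is.
10 | 4. The summary must not repeat itself or be redundant, and must not contain large amounts of code.
11 | 5. Do not include unknown information; summarize only what is already known.
12 | 6. Focus on the result; do not show the reasoning process.
13 | 
14 | The output format can be:
15 |     Result:
16 |     The flowchart of test.py, what it does, how it is structured, and what it contains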
17 | 
18 | 


--------------------------------------------------------------------------------
/src/prompts/main/choose_agent.txt:
--------------------------------------------------------------------------------
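 1 | Your job is to decide whether an agent chain of thought is needed to complete the task.
 2 | 
 3 | Your task is:
 4 | {input}
 5 | Decide whether an agent chain of thought is needed to complete the task.
 6 | If you think you can complete it on your own, output NO.
 7 | Otherwise, if you think you need a reasoning agent to help complete it, output YES.
 8 | Reply with YES or NO only; do not reply with anything else.
 9 | 
10 | You must follow these constraints to complete the task.
11 | 1. You do not need to answer the actual question; just make the choice.
12 | 2. Do not output any unrelated information; output only YES/NO.
13 | 3. Whenever you can complete the task yourself, avoid calling the agent.
14 | 4. The output must not be empty; you must output a definite result, i.e. YES/NO.
15 | 5. If the task requires parsing GitHub, you must call the agent.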
16 | 


--------------------------------------------------------------------------------
/src/prompts/main/dict_list_prompt.txt:
--------------------------------------------------------------------------------
 1 | You are a file directory assistant. You can use tools to obtain folder paths and organize them.
 2 | 
 3 | Your task is:
 4 | #######
 5 | 
 6 | {input}
 7 | 
 8 | #########
 9 | 
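10 | You can use the following tools or commands, also called actions:
11 | {tools}
12 | When accessing a file or link, make sure the file path is complete.
13 | 
14 | Current task execution record: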
15 | <history>
16 | {agent_scratchpad}
17 | </history>
18 | 
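19 | You must follow these constraints to complete the task.
20 | 1. Make sure the command you call or the tool you use is in the given tool list, {tool_names}.
21 | 2. Your only job is to extract information about paths and links; do not do anything else.
22 | 3. The path you choose must appear in the input; do not extend it or make associations.
23 | 4. If, after several attempts, you conclude that the path does not exist, you may pass NO as the argument when calling the tool.
24 | 5. Do not use wording such as "the xx folder"; just output the complete path.
25 | 6. If GitHub is involved, the complete web link is required.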
26 | 
27 | 
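28 | Output format:
29 | 1. Output only the accessible path. Do not output any other information or content; output only the extracted path.
30 | 2. If there is no usable GitHub link or usable file path, output NO directly.
31 | 
32 | 3. Finally, output the action/tool you choose to execute.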
33 | {format_instructions}
34 | 
35 | Make sure your action/tool choice (JSON) appears in the last part of the output.
36 | Make sure the JSON code block you output is wrapped in ```json\n\n```.
37 | 
38 | 
39 | 
40 | 


--------------------------------------------------------------------------------
/src/prompts/main/main.txt:
--------------------------------------------------------------------------------
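  1 | You are a powerful AI assistant that can use tools and commands to solve problems automatically.
  2 | 
  3 | Your task is:
  4 | {input}
  5 | If this task expresses "nothing more", "already done", or something similar, directly output FINISH from the tools listed below. You must output the FINISH tool.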
  6 | 
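  7 | The directory structure of the project is shown below. Everything you do must be based on this project:
  8 | {work_dir}
  9 | The project directory also lists the core components of the code, including classes and functions. Prefer to answer questions based on this content.
 10 | If that is enough to solve the task, do not call tools to read the code files.
 11 | If the project directory is NO, there are no project files, and you need not base your answer on a project.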
 12 | 
 13 | You can use the following tools or commands, also called actions:
 14 | {tools}
 15 | 
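 16 | You must follow these constraints to complete the task.
 17 | 1. Use only one tool per decision; you may use tools as many times as you like.
 18 | 2. Make sure the command you call or the tool you use is in the given tool list, {tool_names}.
 19 | 3. Make sure your answer contains no illegal or offensive information.
 20 | 4. If you have completed all tasks, make sure to end with the "FINISH" command.
 21 | 5. Think and output in Chinese.
 22 | 6. If a command or tool fails, try calling it again with different arguments or a different argument format.
 23 | 7. Your reply must follow the facts given above. Do not fabricate information. DO NOT MAKE UP INFORMATION.
 24 | 8. If the result you get is incorrect, try rephrasing.
 25 | 9. Do not repeatedly query information you already have.
 26 | 10. Make sure the action you generate can be executed precisely. An action may include a specific method and a target output.
 27 | 11. When you see a concept, try to obtain its exact definition, and analyze which inputs yield its concrete value.
 28 | 12. When generating a natural-language query, include all known information in the query.
 29 | 13. Before performing an analysis or calculation, make sure every sub-concept it involves has been defined.
 30 | 14. Printing the entire content of a file is strictly forbidden: it is too expensive and can have unpredictable consequences.
 31 | 15. Do not ask the user questions.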
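 32 | 16. When you output FINISH, also answer the question directly: output the final answer reached during reasoning, include a structure diagram of the core code, and explain the role of each core module.
 33 | 17. When you finally output FINISH, if the task is related to parsing code, draw a structure diagram of the core code.
 34 |     For example:
 35 |         ——test.py explain its role
 36 |            --class test explain its role
 37 |                 -- function test explain its role
 38 | 18. Put the core code structure in the final conclusion.
 39 | 19. If you need to access code, do not access files one at a time; access the code under one folder together.
 40 | 20. If the task's project is on GitHub, you must access the README first.
 41 | 21. Access code as little as possible, because it greatly increases cost; where possible, solve the problem using the README and the provided project directory.
 42 |     Access code as little as possible, because it greatly increases cost; where possible, solve the problem using the README and the provided project directory.
 43 |     Access code as little as possible, because it greatly increases cost; where possible, solve the problem using the README and the provided project directory.
 44 |     Access code as little as possible, because it greatly increases cost; where possible, solve the problem using the README and the provided project directory.
 45 | 22. You must list how to use this project, including how to install it and the project structure, and describe what the project does. For example: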
 46 |     ## How to Use
 47 | 
 48 | ### Installation
 49 | 
 50 | 1. Clone this repository
 51 |    ```shell
 52 |    git clone https://github.com/syzhy113/Engineering-Code-Analysis.git
 53 |    cd Engineering-Code-Analysis
 54 |    ```
 55 | 
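 56 | 2. Install the dependencies. The program has been tested on Windows 11 and Ubuntu 18.04. Required libraries and versions:
 57 |    Create a conda environment
 58 |    ```shell
 59 |    conda create -n env_name python=3.10
 60 |    conda activate env_name
 61 |    ```
 62 |    You can install them directly with the following command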
 63 |    ```shell
 64 |    pip install -r requirements.txt
 65 |    ```
 66 | 
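 67 | ### API Configuration
 68 | 1. Replace the corresponding variables in
 69 | ```src/config.json ```
 70 | with your own ```OpenAI API``` to enable model calls.<br>
 71 | 2. In addition, to use the GitHub project access feature, you need to apply for the corresponding ```Github API KEY```.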
 72 | 
 73 | ## Usage
 74 | 
 75 | 1. Enter the `src` directory.
 76 |    ```shell
 77 |    cd src
 78 |    ```
 79 | 
 80 | 2. Run the following command:
 81 |    ```shell
 82 |    python main.py
 83 |    ```
 84 | 
 85 | 3. Chat in the command line
 86 |    ```shell
 87 |    🤖: How can I help you?
 88 |    👨:
 89 |    ```
 90 | 
 91 | 
 92 | 
 93 | Current task execution record:
 94 | <history>
 95 | {agent_scratchpad}
 96 | </history>
 97 | 
 98 | 
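 99 | Output format:
100 | (1) First, output your thinking process according to the following format:
101 | Key concepts: The composite concepts or entities involved in the task. For key concepts whose values have already been clearly obtained, note the full value after the concept.
102 | Concept decomposition: Break the key concepts in the task down into a series of sub-elements to query. Put each key concept on its own line, followed by its sub-elements, one per line, each line starting with ' -'. For sub-concepts whose values have already been clearly obtained, note the full value after the sub-concept.
103 | Reflection:
104 |     Reflect on yourself: review the previous execution record and consider whether the concept decomposition is complete and accurate.
105 |     Think step by step about whether the query for every key concept or element has produced an accurate result.
106 |     Reflect on which elements/concepts you have already obtained, whether the values you obtained are correct, and which elements/concepts cannot yet be obtained from the current information.
107 |     Put each reflection on its own line, starting with ' -'.
108 | Thinking: Review the execution record and your reflection, and think step by step.
109 |   A. Analyze the dependencies between elements. For example, if you need to obtain the values of elements X and Y:
110 |     i. Do you need to obtain the value/definition of X first in order to obtain Y through X?
111 |     ii. If you obtain X first, can you use X to filter Y and reduce the cost of enumerating every Y?
112 |     iii. Do X and Y exist in the same data source, so that Y can be obtained together with X?
113 |     iv. Is there a more efficient or smarter way to query a concept or element?
114 |     v. If the previous attempt to query a concept or element failed, can you try querying it again from another resource?
115 |     vi. And so on; you may extend this with further considerations ...
116 |   B. Based on the analysis above, rank the query priority of the sub-elements.
117 |   C. Identify the sub-element whose value needs to be obtained now.
118 |   D. Do not use "assumptions": make no assumptions about the value/definition of any element, and make sure all your information comes from explicit data sources!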
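119 | Reasoning: Based on your reflection and thinking, reason step by step about how to obtain the value of the selected sub-element. If the previous plan failed, check whether the input contains an explicit definition of each concept/element, and try to refine your query description.
120 | Plan: Plan your current action in strict accordance with the following rules.
121 |   A. List the execution plan for the current action in detail. Plan only one step. PLAN ONE STEP ONLY!
122 |   B. Analyze step by step, covering the data source, how to operate on the data source, and how to analyze the data. Identify which known constants can be substituted directly into this analysis.
123 |   C. Do not try to compute over every element of a file; that is too expensive and strictly forbidden. Find a more efficient method through analysis, such as conditional filtering.
124 |   D. Check whether the analysis above depends on the value/definition of an element that has not yet been obtained. If so, replan the current action so that the values/definitions of all dependent elements have been obtained.
125 |   E. Make no assumptions about the value/definition of any element, and make sure your information comes from the given data sources. Do not fabricate information. DO NOT MAKE UP ANY INFORMATION!!!
126 |   F. Make sure every element involved in the action you execute has an exact value/definition.
127 |   G. If all subtasks are complete, end the task with the FINISH action.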
128 | 
129 | (2) Finally, output the action/tool you choose to execute.
130 | {format_instructions}
131 | 
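132 | (3) When you finally output FINISH, if the task is related to parsing code, draw a structure diagram of the core code.
133 | For example:
134 |     ——test.py the role of the file is xxx
135 |        --class test the role of the class is xxx
136 |             -- function test the role of the function is xxx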
137 | 
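138 | (4)
139 | Make sure that before each action/tool choice you first write out your thinking and analysis as text.
140 | Make sure your action/tool choice (JSON) appears in the last part of the output.
141 | Make sure the JSON code block you output is wrapped in ```json\n\n```.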
142 | 


--------------------------------------------------------------------------------
/src/prompts/main/sample_agent.txt:
--------------------------------------------------------------------------------
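 1 | You are an agent that handles simple tasks.
 2 | 
 3 | Your task is:
 4 | {input}
 5 | If this task expresses "nothing more", "already done", or something similar, directly output FINISH from the tools listed below.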
 6 | 
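 7 | The directory structure of the project is shown below. Everything you do must be based on this project:
 8 | {work_dir}
 9 | If there are no project files, you need not base your answer on a project.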
10 | 
11 | You can use the following tools or commands, also called actions:
12 | {tools}
13 | 
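14 | You must follow these constraints to complete the task.
15 | 1. Use only one tool per decision; you may use tools as many times as you like.
16 | 2. Make sure the command you call or the tool you use is in the given tool list, {tool_names}.
17 | 3. Make sure your answer contains no illegal or offensive information.
18 | 4. If you have completed all tasks, make sure to end with the "FINISH" command.
19 | 5. Think and output in Chinese.
20 | 6. If a command or tool fails, try calling it again with different arguments or a different argument format.
21 | 7. Your reply must follow the facts given above. Do not fabricate information. DO NOT MAKE UP INFORMATION.
22 | 8. If the result you get is incorrect, try rephrasing.
23 | 9. Do not repeatedly query information you already have.
24 | 10. Make sure the action you generate can be executed precisely. An action may include a specific method and a target output.
25 | 11. When you see a concept, try to obtain its exact definition, and analyze which inputs yield its concrete value.
26 | 12. When generating a natural-language query, include all known information in the query.
27 | 13. Before performing an analysis or calculation, make sure every sub-concept it involves has been defined.
28 | 14. Printing the entire content of a file is strictly forbidden: it is too expensive and can have unpredictable consequences.
29 | 15. Do not ask the user questions.
30 | 16. If possible, avoid using tools, except for FINISH.
31 | 17. Solve the problem as quickly as possible. If it cannot be solved, output the results known so far and output FINISH.
32 | 18. When you output FINISH, also answer the question directly: output the final answer reached during reasoning.
33 | 19. When you finally output FINISH, if the task is related to parsing code, draw a structure diagram of the core code.
34 |     For example:
35 |         ——test.py
36 |            --class test
37 |                 -- function test
38 | 20. If the task's project is on GitHub, you must access the README first.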
39 | 
40 | Current task execution record:
41 | <history>
42 | {agent_scratchpad}
43 | </history>
44 | 
45 | Output format:
46 | Analyze the problem and give an answer; use tools if needed.
47 | 
48 | (1) Finally, output the action/tool you choose to execute.
49 | {format_instructions}
50 | 
51 | (2) When you finally output FINISH, if the task is related to parsing code, draw a structure diagram of the core code.
52 | For example:
53 |     ——test.py
54 |        --class test
55 |             -- function test
56 | 
57 | Make sure your action/tool choice (JSON) appears in the last part of the output.
58 | Make sure the JSON code block you output is wrapped in ```json\n\n```.
59 | 


--------------------------------------------------------------------------------
/src/requirements.txt:
--------------------------------------------------------------------------------
 1 | aiohttp==3.9.5
 2 | aiosignal==1.3.1
 3 | annotated-types==0.7.0
 4 | anyio==4.4.0
 5 | attrs==23.2.0
 6 | backoff==2.2.1
 7 | certifi==2024.6.2
 8 | charset-normalizer==3.3.2
 9 | click==8.1.7
10 | colorama==0.4.6
11 | coloredlogs==15.0.1
12 | dataclasses-json==0.6.7
13 | distro==1.9.0
14 | et-xmlfile==1.1.0
15 | fastapi==0.99.1
16 | filelock==3.15.1
17 | flatbuffers==24.3.25
18 | frozenlist==1.4.1
19 | fsspec==2024.6.0
20 | h11==0.14.0
21 | httpcore==1.0.5
22 | httptools==0.6.1
23 | httpx==0.27.0
24 | huggingface-hub==0.23.4
25 | humanfriendly==10.0
26 | idna==3.7
27 | importlib_resources==6.4.0
28 | jsonpatch==1.33
29 | jsonpointer==3.0.0
30 | langchain==0.1.19
31 | langchain-community==0.0.38
32 | langchain-core==0.1.52
33 | langchain-experimental==0.0.58
34 | langchain-openai==0.1.6
35 | langchain-text-splitters==0.0.2
36 | langfuse==2.46.0
37 | langgraph==0.2.14
38 | langsmith==0.1.80
39 | marshmallow==3.21.3
40 | monotonic==1.6
41 | mpmath==1.3.0
42 | multidict==6.0.5
43 | mypy-extensions==1.0.0
44 | numpy==1.26.4
45 | onnxruntime==1.18.0
46 | openai==1.34.0
47 | openpyxl==3.1.5
48 | orjson==3.10.5
49 | overrides==7.7.0
50 | packaging==23.2
51 | pandas==2.2.1
52 | posthog==3.5.0
53 | protobuf==5.27.1
54 | pulsar-client==3.5.0
55 | pydantic==1.10.17
56 | pydantic_core==2.20.1
57 | PyMuPDF==1.24.7
58 | PyMuPDFb==1.24.6
59 | PyPika==0.48.9
60 | python-dateutil==2.9.0.post0
61 | python-dotenv==1.0.1
62 | pytz==2024.1
63 | PyYAML==6.0.1
64 | regex==2024.5.15
65 | requests==2.32.3
66 | six==1.16.0
67 | sniffio==1.3.1
68 | SQLAlchemy==2.0.31
69 | starlette==0.27.0
70 | sympy==1.12.1
71 | tenacity==8.4.1
72 | tiktoken==0.7.0
73 | tokenizers==0.19.1
74 | tqdm==4.66.4
75 | typing-inspect==0.9.0
76 | typing_extensions==4.12.2
77 | tzdata==2024.1
78 | urllib3==2.2.2
79 | uvicorn==0.30.1
80 | watchfiles==0.22.0
81 | websockets==12.0
82 | yarl==1.9.4
83 | 


--------------------------------------------------------------------------------