├── .DS_Store
├── README.md
├── Translate.html
├── get_freeai_api_v2.py
├── gpt_academic
│   ├── config.py
│   ├── requirements.txt
│   └── toolbox.py
├── images
│   └── error
│       └── pandora_public_pool_token.png
└── old
    ├── 3.43
    │   ├── config.py
    │   ├── requirements.txt
    │   └── toolbox.py
    ├── README_old.md
    └── gpt_academic_old
        ├── config_private.py
        ├── credentials.txt
        ├── get_freeai_api.py
        └── toolbox.py
/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/elphen-wang/FreeAI/25cd8bfff04607fc73a464bfcff948d102b65c4f/.DS_Store
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # FreeAI
3 |
4 | **OpenAI should not be a closed AI.**
5 |
6 | Still fretting over the VPN you need just to reach OpenAI?
7 | 
8 | Still put off by OpenAI's pay-as-you-go pricing?
9 | 
10 | Frustrated that there is no free API key for building your own ChatGPT tools?
11 | 
12 | This project draws together the work of many excellent developers on GitHub into a fairly complete solution, and keeps pushing toward AI that is more usable, more capable, and cheaper to open up. **If you like this project, please leave a free star. Thank you!**
13 | 
14 | `Tips: common questions and caveats are annotated throughout this page. Before trying anything, please read the part of this tutorial you need in full, to avoid asking duplicate questions and wasting time waiting for answers.`
15 |
16 | ---
17 | #### FreeAI went live on July 16, 2023:
18 | + ChatGPT 3.5 without a VPN, built on Pandora and OpenAIAuth;
19 | + A demo configuration of gpt_academic (version 3.45) that uses ChatGPT 3.5 for free and without a VPN;
20 | 
21 | #### **Highlights of the August 1, 2023 update:**
22 | + A home-made Pool Token (built from 10 accounts);
23 | + OpenAIAuth is retired. A new VPN-free way to obtain the Access Token (i.e., the user cookie) of your own OpenAI account, so you can build your own Pandora Share Token and Pool Token;
24 | + A demo of using `ChatGPT 3.5` without a VPN, based on gpt_academic (version 3.47);
25 | + Solutions to common problems reported in the issues, sprinkled throughout.
26 | 
27 | #### **Highlights of the September 22, 2023 update:**
28 | + The pool FreeAI provides is now refreshed every 4 hours, to lower the odds that the cookies of the accounts backing it expire;
29 | + Files updated to track gpt_academic (version 3.53); every changed region is marked "FreeAI更新", so you can search for that keyword and compare. The changes are few, so they are easy to follow;
30 | + If you are upgrading from gpt_academic (version 3.45) to gpt_academic (version 3.53), run `python3 -m pip install -r requirements.txt` to refresh the dependencies.
31 |
32 | ---
33 |
34 | **Acknowledgements:**
35 | + [pengzhile/pandora](https://github.com/pengzhile/pandora): the key technology that makes the OpenAI GPT-3.5 API free and reachable without a VPN.
36 | + [binary-husky/gpt_academic](https://github.com/binary-husky/gpt_academic): used as the running example; we remove its need for a VPN and for a paid native OpenAI API key, turning OpenAI into FreeAI.
37 |
38 | ## Pandora
39 | Pandora aims to deliver the most authentic ChatGPT experience possible without a VPN. It is built on an access-token [technique](https://zhile.io/2023/05/19/how-to-get-chatgpt-access-token-via-pkce.html). There is an official demo site, [https://chat.zhile.io](https://chat.zhile.io), which signs in with your OpenAI credentials and shares the conversation history of the official site; there is also a Pandora-based shared pool, [Shared Chat](https://baipiao.io/chatgpt), which requires no account at all. `Tips: Shared Chat feeling sluggish at the moment is normal; there are simply too many users.`
40 |
41 | The most valuable thing the Pandora project provides is a service that turns a user's cookie into an Access Token shaped like an API key, together with a reverse-proxy endpoint that answers to that Access Token (and to native OpenAI API keys as well). For independent developers building on OpenAI, this is an enormous gift. Details: ["this service emulates the Turbo API, free of charge, backed by ChatGPT's 8k model"](https://github.com/pengzhile/pandora/issues/837).
42 | + Obtain your own user cookie (i.e., the ChatGPT Access Token) without a VPN; demo endpoints: [https://ai-20230626.fakeopen.com/auth](https://ai-20230626.fakeopen.com/auth) and [https://ai-20230626.fakeopen.com/auth1](https://ai-20230626.fakeopen.com/auth1); `Tips: whether Pandora's backend logs your account credentials is unknown, but it certainly works well.`
43 | + Convert a cookie into a 43-character Share Token starting with `fk-`; demo: [https://ai.fakeopen.com/token](https://ai.fakeopen.com/token);
44 | + Convert cookies into a 43-character Pool Token starting with `pk-`; demo: [https://ai.fakeopen.com/pool](https://ai.fakeopen.com/pool). This solves concurrent use across multiple accounts;
45 | + The reverse-proxy endpoint that answers to the tokens above is [https://ai.fakeopen.com/v1/chat/completions](https://ai.fakeopen.com/v1/chat/completions); a usage sketch follows below.
46 |
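Putting the pieces together: a minimal sketch (mine, not Pandora's; the token value is a placeholder) of calling the reverse proxy exactly as if it were the official OpenAI chat-completions API:

```python
import requests

API_URL = "https://ai.fakeopen.com/v1/chat/completions"
TOKEN = "pk-your-pool-token-here"  # placeholder: substitute your own Share (fk-) or Pool (pk-) Token

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}
headers = {"Authorization": f"Bearer {TOKEN}"}
resp = requests.post(API_URL, json=payload, headers=headers, timeout=60)
if resp.status_code == 200:
    print(resp.json()["choices"][0]["message"]["content"])
```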
47 | The Pandora project also offers two free Pool Tokens:
48 | + `pk-this-is-a-real-free-pool-token-for-everyone`, a pool built from many Share Tokens.
49 | + ~~`pk-this-is-a-real-free-api-key-pk-for-everyone`~~, a pool built from $120 API keys. `(It was already out of credit when I tested it; using it keeps throwing errors, so leave it alone.)`
50 |
51 | Testing with Share Tokens and Pool Tokens generated from my own accounts shows that conversations held this way do not show up in the account's chat history. `That said, people on the Pandora forum have reported such conversations being saved into their account history, so whether this stays true is hard to say.`
52 |
53 | I am very fond of ChatGPT's translation quality, so I wrote a simple Pandora-backed translation page, [Translate.html](https://github.com/elphen-wang/FreeAI/blob/main/Translate.html); tests show it works reasonably well. `Tips: it uses the Pool Token provided by Pandora.`
54 | ## FreeAI now provides its own Pool Token
55 |
56 | **Previously**, because my own pool was too small and a user cookie lives for only **14 days**, constantly refreshing Access Tokens was a chore, so I relied on the Pool Token provided by Pandora. After some time in the field, however, everyone (myself included) kept running into errors like the following:
57 |
58 | ![pandora_public_pool_token](images/error/pandora_public_pool_token.png)
59 | 
60 | My **guess** is that the free Pool Token Pandora provides sits on a pool of roughly 100 accounts, each with an Access Token that lives only 14 days and was presumably issued on a different date, so the Pool Token must constantly refresh the Access Tokens of those 100 accounts, or the error above appears. Sure enough, these errors tend to clear up on their own after a day or two, which also suggests the refresh mechanism is strained or unfinished. **This tutorial previously relied on [OpenAIAuth](https://github.com/acheong08/OpenAIAuth)** for [a VPN-free way to obtain your own Pandora Share Token and Pool Token](https://github.com/elphen-wang/FreeAI/blob/main/old/gpt_academic_old/get_freeai_api.py). In practice, though, the **machine serving OpenAIAuth is under load and often fails to return your account's Access Token**, so a replacement was clearly needed. **All of this adds up to a thoroughly bad user experience.**
61 |
62 | Hence: FreeAI now provides its own Pool Token. You can fetch the FreeAI Pool Token from:
63 | [https://api.elphen.site/api?mode=default_my_poolkey](https://api.elphen.site/api?mode=default_my_poolkey) 。
64 |
65 | When using this link, **please note the following**:
66 | + The FreeAI Pool Token served by this link is **refreshed every day at 04:10**; its value is **not fixed forever**. Its lifetime is currently set to one day, so **fetching it once a day is enough**;
67 | + The pool hangs off my small cloud VPS, so **please be gentle and do not poll it frequently**.
68 | + This FreeAI Pool Token is built from 10 OpenAI accounts. The pool is small, but it should be enough, and it will be expanded over time.
69 | + Python code to fetch the FreeAI Pool Token (a disk-caching sketch follows the block):
70 |
71 | ```python
72 | import requests
73 | response = requests.get("https://api.elphen.site/api?mode=default_my_poolkey", timeout=30)
74 | if response.status_code == 200:
75 |     FreeAI_Pool_Token = response.json()
76 | ```
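To honor the once-a-day rule without having to think about it, here is a small sketch (not part of this repo; the cache file name is an arbitrary choice) that stores the token on disk and only re-fetches once the copy has aged out:

```python
import json, os, time
import requests

CACHE_FILE = "freeai_pool_token.json"  # hypothetical cache location
URL = "https://api.elphen.site/api?mode=default_my_poolkey"

def get_pool_token(max_age_seconds=24 * 3600):
    # reuse the cached token while it is younger than max_age_seconds
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "r", encoding="utf-8") as f:
            cached = json.load(f)
        if time.time() - cached["fetched_at"] < max_age_seconds:
            return cached["token"]
    # otherwise fetch a fresh token and remember when we got it
    token = requests.get(URL, timeout=30).json()
    with open(CACHE_FILE, "w", encoding="utf-8") as f:
        json.dump({"token": token, "fetched_at": time.time()}, f)
    return token
```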
77 |
78 | You can also build your own Pandora token through the APIs the Pandora project provides:
79 | ```python
80 | import requests
81 | # Obtain your OpenAI account's Access Token, no VPN required.
82 | # Tips: whether Pandora's backend logs your credentials is unknown, but it works well.
83 | data0 = {'username': username, # your OpenAI account name
84 |          'password': password, # your OpenAI password
85 |          'prompt': 'login',}
86 | resp0 = requests.post('https://ai-20230626.fakeopen.com/auth/login', data=data0)
87 | if resp0.status_code == 200:
88 |     your_openai_cookie = resp0.json()['access_token']
89 | 
90 | # Register your own Pandora token.
91 | data1 = {'unique_name': 'get my token', # no need to change this
92 |          'access_token': your_openai_cookie,
93 |          'expires_in': 0,
94 |          }
95 | resp1 = requests.post('https://ai.fakeopen.com/token/register', data=data1)
96 | if resp1.status_code == 200:
97 |     your_pandora_token = resp1.json()['token_key']
98 | ```
99 | To build your own Pandora Pool Token, suppose you already have your_pandora_token_list, an array of Pandora Share Tokens from two or more accounts; the following Python then yields the Pool Token (a quick format check follows the block):
100 | ```python
101 | data2 = {'share_tokens': '\n'.join(your_pandora_token_list),}
102 | resp2 = requests.post('https://ai.fakeopen.com/pool/update', data=data2)
103 | if resp2.status_code == 200:
104 |     your_pool_token = resp2.json()['pool_token']
105 | ```
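Whatever the endpoints return, a quick format check is cheap; this sketch mirrors the `is_freeai_api_key` test in this repo's toolbox.py (Share Tokens are `fk-` plus 43 characters, Pool Tokens `pk-` plus 43):

```python
import re

def looks_like_pandora_token(key: str) -> bool:
    # pk- (pool) or fk- (share), followed by 43 URL-safe characters
    return bool(re.match(r"(pk|fk)-[a-zA-Z0-9\-_]{43}$", key))

assert looks_like_pandora_token("pk-this-is-a-real-free-pool-token-for-everyone")
```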
106 |
107 | [get_freeai_api_v2.py](get_freeai_api_v2.py) in this tutorial is a complete demo program for producing Pandora Share/Pool Tokens.
108 | 
109 | **I strongly encourage you to use your own Pandora token. And please be considerate in your code: do not fire off frequent requests that push the serving machines (individual developers rarely run powerful servers) to their limit, or the slow responses will come back to bite your own requests.**
110 |
111 | ## gpt_academic
112 | My earlier tutorial on hosting your own OpenAI API reverse proxy, [ChatGPT Wallfree](https://github.com/elphen-wang/chatgpt_wallfree), only freed gpt_academic from the VPN requirement; it still needed a native, paid OpenAI API key. Taking gpt_academic as the example once more, this time there is no reverse proxy for you to host and no native OpenAI API key at all, which spares the average research group an expense that is hard to get reimbursed.
113 |
114 | Developers can replace the official files with those in this project's [gpt_academic](https://github.com/elphen-wang/FreeAI/tree/main/gpt_academic) folder (`the changes mainly teach toolbox.py and config.py to recognize and fetch Pandora tokens`), and can layer their own settings on top (such as gpt_academic login credentials). After that, following the official instructions for debugging, running, and deployment, gpt_academic can use gpt-3.5 without a VPN and without paying!
115 |
116 | **Deployment notes**:
117 | + gpt_academic reads user settings with the priority environment variables > config_private.py > config.py, so while debugging it is best to mirror any change into config.py as well (i.e., keep the two files identical). Otherwise your settings may fail to take effect in some debugging setups; this is a gpt_academic bug that I have not patched. **My advice: simply skip config_private.py (delete it, or never create it), or keep the two files exactly identical.** A toy sketch of this lookup order appears after this list.
118 | + Replace the corresponding official files with the files under this project's [gpt_academic](https://github.com/elphen-wang/FreeAI/tree/main/gpt_academic) folder and adjust as needed. Testing was done against gpt_academic v3.47.
119 | `A few notes:`
120 | + `requirements.txt` adds pdfminer, pdflatex, and apscheduler over the official list. The first two back the LaTeX features; the last drives the scheduled API_KEY refresh and is the only one that is strictly required. You can adapt the code along these lines to use your own Pandora token;
121 | + `toolbox.py` adds recognition of Pandora tokens over the official version;
122 | + `config.py` adds the scheduled fetch of the Pool Token FreeAI provides, changes API_URL_REDIRECT to the reverse-proxy endpoint (without which Pandora tokens cannot be handled), and sets WEB_PORT to 86 (pick whatever number you like). You can also add gpt_academic login credentials and other features.
123 | + The usual docker build and run is:
124 | ```bash {.line-numbers}
125 | # build the docker image
126 | docker build -t gpt-academic .
127 | # the port is up to you; just keep it consistent with config.py
128 | docker run -d --restart=always -p 86:86 --name gpt-academic gpt-academic
129 | ```
130 | + To use gpt_academic's arxiv translation feature under docker, build and run like this:
131 | ``` bash {.line-numbers}
132 | # build the docker image
133 | docker build -t gpt-academic-nolocal-latex -f docs/GithubAction+NoLocal+Latex .
134 | # the port is up to you; keep it consistent with config.py and config_private.py
135 | # /home/fuqingxu/arxiv_cache is a folder outside the container that holds the arxiv content; change the path to taste
136 | docker run -d -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --net=host -p 86:86 --restart=always --name gpt-academic gpt-academic-nolocal-latex
137 | ```
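As promised above, a toy sketch of the configuration lookup order; it mirrors `read_single_conf_with_lru_cache` in this repo's toolbox.py, minus the type coercion and caching the real function does:

```python
import importlib
import os

def resolve_conf(arg):
    # 1. environment variables win (with or without the GPT_ACADEMIC_ prefix)
    for name in ("GPT_ACADEMIC_" + arg, arg):
        if name in os.environ:
            return os.environ[name]
    # 2. then config_private.py, if it exists and defines the setting
    try:
        return getattr(importlib.import_module("config_private"), arg)
    except (ImportError, AttributeError):
        pass
    # 3. finally config.py
    return getattr(importlib.import_module("config"), arg)
```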
138 | ## Postscript
139 | + Because Pandora essentially reconstructs OpenAI's own web service, paid offerings such as ChatGPT-4 cannot yet be used for free. Making that possible is the direction that I, together with the many developers committed to bringing AI to a wider public, will keep working toward.
140 | + The earlier ChatGPT Wallfree tutorial mentioned ZeroTier for NAT traversal; in practice, [Frp](https://github.com/fatedier/frp) suits Chinese research folk better: more stable, faster, and no client required on the third-party side.
141 | 
142 | ## To-do List
143 | + [ ] Since I currently work in research, I will put my limited energy into arxiv-related features first, integrating the services I can and want to integrate.
144 |
145 | ## Star History
146 |
147 | 
148 |
149 |
150 |
--------------------------------------------------------------------------------
/Translate.html:
--------------------------------------------------------------------------------
6 | Translate
13 | Translation based on the FreeAI
--------------------------------------------------------------------------------
/get_freeai_api_v2.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 |
3 | from os import path
4 | import requests,json
5 |
6 | def run():
7 | expires_in = 0
8 | unique_name = 'my share token'
9 | current_dir = path.dirname(path.abspath(__file__))
10 | # credentials.txt: your accounts, one "username,password" pair per line
11 | # share_tokens.txt: records the resulting Pandora share tokens and/or pool token.
12 | credentials_file = path.join(current_dir, 'credentials.txt')
13 | share_tokens_file = path.join(current_dir, 'share_tokens.txt')
14 | with open(credentials_file, 'r', encoding='utf-8') as f:
15 | credentials = f.read().split('\n')
16 | credentials = [credential.split(',', 1) for credential in credentials]
17 | count = 0
18 | token_keys = []
19 | for idx, credential in enumerate(credentials):
20 | progress = '{}/{}'.format(idx + 1, len(credentials))  # enumerate reports progress correctly even for duplicate entries
21 | if not credential or len(credential) != 2:
22 | continue
23 |
24 | count += 1
25 | username, password = credential[0].strip(), credential[1].strip()
26 | token_info = {
27 | 'token': 'None',
28 | 'share_token': 'None',
29 | }
30 | token_keys.append(token_info)
31 | try:
32 | #auth = Auth0(email=username, password=password)
33 | #token_info['token'] = auth.get_access_token()
34 | data = {'username': username,
35 | 'password': password,
36 | 'prompt': 'login',}
37 | resp = requests.post('https://ai-20230626.fakeopen.com/auth/login', data=data)
38 | if resp.status_code == 200:
39 | feedback_info= resp.json()
40 | token_info['token'] =feedback_info['access_token']
41 | #print(auth)
42 | print('Login success: {}, {}'.format(username, progress))
43 | except Exception as e:
44 | err_str = str(e).replace('\n', '').replace('\r', '').strip()
45 | print('Login failed: {}, {}'.format(username, err_str))
46 | token_info['token'] = err_str
47 | continue
48 | data = {
49 | 'unique_name': unique_name,
50 | 'access_token': token_info['token'],
51 | 'expires_in': expires_in,
52 | }
53 | resp = requests.post('https://ai.fakeopen.com/token/register', data=data)
54 | if resp.status_code == 200:
55 | token_info['share_token'] = resp.json()['token_key']
56 | else:
57 | continue
58 |
59 | with open(share_tokens_file, 'w', encoding='utf-8') as f:
60 | # more than one account: prefer a pool token; exactly one: use its share token; none: fall back to the public pools.
61 | if count==0:
62 | f.write('pk-this-is-a-real-free-pool-token-for-everyone\n')
63 | f.write('pk-this-is-a-real-free-api-key-pk-for-everyone\n')
64 | elif count==1:
65 | f.write('{}\n'.format(token_keys[0]['share_token']))
66 | else:
67 | data = {
68 | 'share_tokens': '\n'.join([token_info['share_token'] for token_info in token_keys]),
69 | }
70 | resp = requests.post('https://ai.fakeopen.com/pool/update', data=data)
71 | if resp.status_code == 200:
72 | f.write('{}\n'.format(resp.json()['pool_token']))
73 | for token_info in token_keys:
74 | f.write('{}\n'.format(token_info['share_token']))
75 | 
76 |
77 | if __name__ == '__main__':
78 | run()
79 |
80 |
--------------------------------------------------------------------------------
/gpt_academic/config.py:
--------------------------------------------------------------------------------
1 | """
2 | 以下所有配置也都支持利用环境变量覆写,环境变量配置格式见docker-compose.yml。
3 | 读取优先级:环境变量 > config_private.py > config.py
4 | --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
5 | All the following configurations also support using environment variables to override,
6 | and the environment variable configuration format can be seen in docker-compose.yml.
7 | Configuration reading priority: environment variable > config_private.py > config.py
8 | """
9 |
10 | # [step 1]>> API_KEY = "sk-123456789xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123456789"。极少数情况下,还需要填写组织(格式如org-123456789abcdefghijklmno的),请向下翻,找 API_ORG 设置项
11 | API_KEY = "此处填API密钥" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
12 |
13 | ####################FreeAI更改的部分:定时更新API_KEY的部分####################
14 | import requests
15 | from apscheduler.schedulers.background import BackgroundScheduler
16 | ELPHEN_URL="https://api.elphen.site/api?mode=default_my_poolkey"
17 | def GET_API_KEY(url=ELPHEN_URL):
18 | response = requests.get(url)
19 | global API_KEY
20 | if response.status_code == 200:
21 | API_KEY=response.json()
22 | else:
23 | API_KEY="pk-this-is-a-real-free-pool-token-for-everyone"
24 | return API_KEY
25 |
26 | API_KEY=GET_API_KEY(ELPHEN_URL)
27 | # create the background thread for the scheduled job
28 | scheduler = BackgroundScheduler()
29 | scheduler.add_job(GET_API_KEY, trigger='cron', hour=5, minute=8) # refresh the FreeAI key once a day at 05:08
30 | # start the scheduler
31 | scheduler.start()
32 | #print(API_KEY)
33 | ###########################################################
34 |
35 | # [step 2]>> 改为True应用代理,如果直接在海外服务器部署,此处不修改;如果使用本地或无地域限制的大模型时,此处也不需要修改
36 | USE_PROXY = False
37 | if USE_PROXY:
38 | """
39 | 填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
40 | <配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
41 | [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
42 | [地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了(localhost意思是代理软件安装在本机上)
43 | [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
44 | """
45 | # 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
46 | proxies = {
47 | # [协议]:// [地址] :[端口]
48 | "http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
49 | "https": "socks5h://localhost:11284", # 再例如 "https": "http://127.0.0.1:7890",
50 | }
51 | else:
52 | proxies = None
53 |
54 | # ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
55 |
56 | # 重新URL重新定向,实现更换API_URL的作用(高危设置! 常规情况下不要修改! 通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人!)
57 | # 格式: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
58 | # 举例: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "https://reverse-proxy-url/v1/chat/completions"}
59 |
60 | ############FreeAI更改的部分#################
61 | API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://ai.fakeopen.com/v1/chat/completions"}
62 | ###########################################
63 |
64 | # 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
65 | # 一言以蔽之:免费(5刀)用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询:https://platform.openai.com/docs/guides/rate-limits/overview
66 | DEFAULT_WORKER_NUM = 3
67 |
68 |
69 | # 色彩主题, 可选 ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast"]
70 | # 更多主题, 请查阅Gradio主题商店: https://huggingface.co/spaces/gradio/theme-gallery 可选 ["Gstaff/Xkcd", "NoCrypt/Miku", ...]
71 | THEME = "Default"
72 | AVAIL_THEMES = ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast", "Gstaff/Xkcd", "NoCrypt/Miku"]
73 |
74 | # 对话窗的高度 (仅在LAYOUT="TOP-DOWN"时生效)
75 | CHATBOT_HEIGHT = 1115
76 |
77 |
78 | # 代码高亮
79 | CODE_HIGHLIGHT = True
80 |
81 |
82 | # 窗口布局
83 | LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
84 | DARK_MODE = True # 暗色模式 / 亮色模式
85 |
86 |
87 | # 发送请求到OpenAI后,等待多久判定为超时
88 | TIMEOUT_SECONDS = 30
89 |
90 |
91 | # 网页的端口, -1代表随机端口
92 | ############FreeAI更改的部分#################
93 | WEB_PORT = 86
94 | ############################################
95 |
96 | # 如果OpenAI不响应(网络卡顿、代理失败、KEY失效),重试的次数限制
97 | MAX_RETRY = 2
98 |
99 |
100 | # 插件分类默认选项
101 | DEFAULT_FN_GROUPS = ['对话', '编程', '学术']
102 |
103 |
104 | # 模型选择是 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
105 | LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
106 | AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo",
107 | "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
108 | # P.S. 其他可用的模型还包括 ["qianfan", "llama2", "qwen", "gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613",
109 | # "spark", "sparkv2", "chatglm_onnx", "claude-1-100k", "claude-2", "internlm", "jittorllms_pangualpha", "jittorllms_llama"]
110 |
111 |
112 | # 百度千帆(LLM_MODEL="qianfan")
113 | BAIDU_CLOUD_API_KEY = ''
114 | BAIDU_CLOUD_SECRET_KEY = ''
115 | BAIDU_CLOUD_QIANFAN_MODEL = 'ERNIE-Bot' # 可选 "ERNIE-Bot"(文心一言), "ERNIE-Bot-turbo", "BLOOMZ-7B", "Llama-2-70B-Chat", "Llama-2-13B-Chat", "Llama-2-7B-Chat"
116 |
117 |
118 | # 如果使用ChatGLM2微调模型,请把 LLM_MODEL="chatglmft",并在此处指定模型路径
119 | CHATGLM_PTUNING_CHECKPOINT = "" # 例如"/home/hmp/ChatGLM2-6B/ptuning/output/6b-pt-128-1e-2/checkpoint-100"
120 |
121 |
122 | # 本地LLM模型如ChatGLM的执行方式 CPU/GPU
123 | LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
124 | LOCAL_MODEL_QUANT = "FP16" # 默认 "FP16" "INT4" 启用量化INT4版本 "INT8" 启用量化INT8版本
125 |
126 |
127 | # 设置gradio的并行线程数(不需要修改)
128 | CONCURRENT_COUNT = 100
129 |
130 |
131 | # 是否在提交时自动清空输入框
132 | AUTO_CLEAR_TXT = False
133 |
134 |
135 | # 加一个live2d装饰
136 | ADD_WAIFU = False
137 |
138 |
139 | # 设置用户名和密码(不需要修改)(相关功能不稳定,与gradio版本和网络都相关,如果本地使用不建议加这个)
140 | # [("username", "password"), ("username2", "password2"), ...]
141 | AUTHENTICATION = []
142 |
143 |
144 | # 如果需要在二级路径下运行(常规情况下,不要修改!!)(需要配合修改main.py才能生效!)
145 | CUSTOM_PATH = "/"
146 |
147 |
148 | # 极少数情况下,openai的官方KEY需要伴随组织编码(格式如org-xxxxxxxxxxxxxxxxxxxxxxxx)使用
149 | API_ORG = ""
150 |
151 |
152 | # 如果需要使用Slack Claude,使用教程详情见 request_llm/README.md
153 | SLACK_CLAUDE_BOT_ID = ''
154 | SLACK_CLAUDE_USER_TOKEN = ''
155 |
156 |
157 | # 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
158 | AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
159 | AZURE_API_KEY = "填入azure openai api的密钥" # 建议直接在API_KEY处填写,该选项即将被弃用
160 | AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
161 |
162 |
163 | # 使用Newbing
164 | NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
165 | NEWBING_COOKIES = """
166 | put your new bing cookies here
167 | """
168 |
169 |
170 | # 阿里云实时语音识别 配置难度较高 仅建议高手用户使用 参考 https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md
171 | ENABLE_AUDIO = False
172 | ALIYUN_TOKEN="" # 例如 f37f30e0f9934c34a992f6f64f7eba4f
173 | ALIYUN_APPKEY="" # 例如 RoPlZrM88DnAFkZK
174 | ALIYUN_ACCESSKEY="" # (无需填写)
175 | ALIYUN_SECRET="" # (无需填写)
176 |
177 |
178 | # 接入讯飞星火大模型 https://console.xfyun.cn/services/iat
179 | XFYUN_APPID = "00000000"
180 | XFYUN_API_SECRET = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
181 | XFYUN_API_KEY = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
182 |
183 |
184 | # Claude API KEY
185 | ANTHROPIC_API_KEY = ""
186 |
187 |
188 | # 自定义API KEY格式
189 | CUSTOM_API_KEY_PATTERN = ""
190 |
191 |
192 | # HUGGINGFACE的TOKEN,下载LLAMA时起作用 https://huggingface.co/docs/hub/security-tokens
193 | HUGGINGFACE_ACCESS_TOKEN = "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"
194 |
195 |
196 | # GROBID服务器地址(填写多个可以均衡负载),用于高质量地读取PDF文档
197 | # 获取方法:复制以下空间https://huggingface.co/spaces/qingxu98/grobid,设为public,然后GROBID_URL = "https://(你的hf用户名如qingxu98)-(你的填写的空间名如grobid).hf.space"
198 | GROBID_URLS = [
199 | "https://qingxu98-grobid.hf.space","https://qingxu98-grobid2.hf.space","https://qingxu98-grobid3.hf.space",
200 | "https://shaocongma-grobid.hf.space","https://FBR123-grobid.hf.space", "https://yeku-grobid.hf.space",
201 | ]
202 |
203 |
204 | # 是否允许通过自然语言描述修改本页的配置,该功能具有一定的危险性,默认关闭
205 | ALLOW_RESET_CONFIG = False
206 | # 临时的上传文件夹位置,请勿修改
207 | PATH_PRIVATE_UPLOAD = "private_upload"
208 | # 日志文件夹的位置,请勿修改
209 | PATH_LOGGING = "gpt_log"
210 |
211 | """
212 | 在线大模型配置关联关系示意图
213 | │
214 | ├── "gpt-3.5-turbo" 等openai模型
215 | │ ├── API_KEY
216 | │ ├── CUSTOM_API_KEY_PATTERN(不常用)
217 | │ ├── API_ORG(不常用)
218 | │ └── API_URL_REDIRECT(不常用)
219 | │
220 | ├── "azure-gpt-3.5" 等azure模型
221 | │ ├── API_KEY
222 | │ ├── AZURE_ENDPOINT
223 | │ ├── AZURE_API_KEY
224 | │ ├── AZURE_ENGINE
225 | │ └── API_URL_REDIRECT
226 | │
227 | ├── "spark" 星火认知大模型 spark & sparkv2
228 | │ ├── XFYUN_APPID
229 | │ ├── XFYUN_API_SECRET
230 | │ └── XFYUN_API_KEY
231 | │
232 | ├── "claude-1-100k" 等claude模型
233 | │ └── ANTHROPIC_API_KEY
234 | │
235 | ├── "stack-claude"
236 | │ ├── SLACK_CLAUDE_BOT_ID
237 | │ └── SLACK_CLAUDE_USER_TOKEN
238 | │
239 | ├── "qianfan" 百度千帆大模型库
240 | │ ├── BAIDU_CLOUD_QIANFAN_MODEL
241 | │ ├── BAIDU_CLOUD_API_KEY
242 | │ └── BAIDU_CLOUD_SECRET_KEY
243 | │
244 | ├── "newbing" Newbing接口不再稳定,不推荐使用
245 | ├── NEWBING_STYLE
246 | └── NEWBING_COOKIES
247 |
248 |
249 | 用户图形界面布局依赖关系示意图
250 | │
251 | ├── CHATBOT_HEIGHT 对话窗的高度
252 | ├── CODE_HIGHLIGHT 代码高亮
253 | ├── LAYOUT 窗口布局
254 | ├── DARK_MODE 暗色模式 / 亮色模式
255 | ├── DEFAULT_FN_GROUPS 插件分类默认选项
256 | ├── THEME 色彩主题
257 | ├── AUTO_CLEAR_TXT 是否在提交时自动清空输入框
258 | ├── ADD_WAIFU 加一个live2d装饰
259 | ├── ALLOW_RESET_CONFIG 是否允许通过自然语言描述修改本页的配置,该功能具有一定的危险性
260 |
261 |
262 | 插件在线服务配置依赖关系示意图
263 | │
264 | ├── 语音功能
265 | │ ├── ENABLE_AUDIO
266 | │ ├── ALIYUN_TOKEN
267 | │ ├── ALIYUN_APPKEY
268 | │ ├── ALIYUN_ACCESSKEY
269 | │ └── ALIYUN_SECRET
270 | │
271 | ├── PDF文档精准解析
272 | │ └── GROBID_URLS
273 |
274 | """
275 |
--------------------------------------------------------------------------------
/gpt_academic/requirements.txt:
--------------------------------------------------------------------------------
1 | ./docs/gradio-3.32.2-py3-none-any.whl
2 | pydantic==1.10.11
3 | tiktoken>=0.3.3
4 | requests[socks]
5 | transformers
6 | python-markdown-math
7 | beautifulsoup4
8 | prompt_toolkit
9 | latex2mathml
10 | python-docx
11 | mdtex2html
12 | anthropic
13 | colorama
14 | Markdown
15 | pygments
16 | pymupdf
17 | openai
18 | numpy
19 | arxiv
20 | rich
21 | pypdf2==2.12.1
22 | websocket-client
23 | scipdf_parser>=0.3
24 | pdfminer
25 | pdflatex
26 | apscheduler
27 |
--------------------------------------------------------------------------------
/gpt_academic/toolbox.py:
--------------------------------------------------------------------------------
1 | import markdown
2 | import importlib
3 | import time
4 | import inspect
5 | import re
6 | import os
7 | import gradio
8 | import shutil
9 | import glob
10 | from latex2mathml.converter import convert as tex2mathml
11 | from functools import wraps, lru_cache
12 | pj = os.path.join
13 |
14 | """
15 | ========================================================================
16 | 第一部分
17 | 函数插件输入输出接驳区
18 | - ChatBotWithCookies: 带Cookies的Chatbot类,为实现更多强大的功能做基础
19 | - ArgsGeneralWrapper: 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构
20 | - update_ui: 刷新界面用 yield from update_ui(chatbot, history)
21 | - CatchException: 将插件中出的所有问题显示在界面上
22 | - HotReload: 实现插件的热更新
23 | - trimmed_format_exc: 打印traceback,为了安全而隐藏绝对地址
24 | ========================================================================
25 | """
26 |
27 | class ChatBotWithCookies(list):
28 | def __init__(self, cookie):
29 | """
30 | cookies = {
31 | 'top_p': top_p,
32 | 'temperature': temperature,
33 | 'lock_plugin': bool,
34 | "files_to_promote": ["file1", "file2"],
35 | "most_recent_uploaded": {
36 | "path": "uploaded_path",
37 | "time": time.time(),
38 | "time_str": "timestr",
39 | }
40 | }
41 | """
42 | self._cookies = cookie
43 |
44 | def write_list(self, list):
45 | for t in list:
46 | self.append(t)
47 |
48 | def get_list(self):
49 | return [t for t in self]
50 |
51 | def get_cookies(self):
52 | return self._cookies
53 |
54 |
55 | def ArgsGeneralWrapper(f):
56 | """
57 | 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构。
58 | """
59 | def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
60 | txt_passon = txt
61 | if txt == "" and txt2 != "": txt_passon = txt2
62 | # 引入一个有cookie的chatbot
63 | cookies.update({
64 | 'top_p':top_p,
65 | 'api_key': cookies['api_key'],
66 | 'llm_model': llm_model,
67 | 'temperature':temperature,
68 | })
69 | llm_kwargs = {
70 | 'api_key': cookies['api_key'],
71 | 'llm_model': llm_model,
72 | 'top_p':top_p,
73 | 'max_length': max_length,
74 | 'temperature':temperature,
75 | 'client_ip': request.client.host,
76 | }
77 | plugin_kwargs = {
78 | "advanced_arg": plugin_advanced_arg,
79 | }
80 | chatbot_with_cookie = ChatBotWithCookies(cookies)
81 | chatbot_with_cookie.write_list(chatbot)
82 |
83 | if cookies.get('lock_plugin', None) is None:
84 | # 正常状态
85 | if len(args) == 0: # 插件通道
86 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, request)
87 | else: # 对话通道,或者基础功能通道
88 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
89 | else:
90 | # 处理少数情况下的特殊插件的锁定状态
91 | module, fn_name = cookies['lock_plugin'].split('->')
92 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
93 | yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, request)
94 | # 判断一下用户是否错误地通过对话通道进入,如果是,则进行提醒
95 | final_cookies = chatbot_with_cookie.get_cookies()
96 | # len(args) != 0 代表“提交”键对话通道,或者基础功能通道
97 | if len(args) != 0 and 'files_to_promote' in final_cookies and len(final_cookies['files_to_promote']) > 0:
98 | chatbot_with_cookie.append(["检测到**滞留的缓存文档**,请及时处理。", "请及时点击“**保存当前对话**”获取所有滞留文档。"])
99 | yield from update_ui(chatbot_with_cookie, final_cookies['history'], msg="检测到被滞留的缓存文档")
100 | return decorated
101 |
102 |
103 | def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
104 | """
105 | 刷新用户界面
106 | """
107 | assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
108 | cookies = chatbot.get_cookies()
109 | # 备份一份History作为记录
110 | cookies.update({'history': history})
111 | # 解决插件锁定时的界面显示问题
112 | if cookies.get('lock_plugin', None):
113 | label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
114 | chatbot_gr = gradio.update(value=chatbot, label=label)
115 | if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
116 | elif cookies.get('label', None):
117 | chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
118 | cookies['label'] = None # 清空label
119 | else:
120 | chatbot_gr = chatbot
121 |
122 | yield cookies, chatbot_gr, history, msg
123 |
124 | def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
125 | """
126 | 刷新用户界面
127 | """
128 | if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
129 | chatbot[-1] = list(chatbot[-1])
130 | chatbot[-1][-1] = lastmsg
131 | yield from update_ui(chatbot=chatbot, history=history)
132 | time.sleep(delay)
133 |
134 |
135 | def trimmed_format_exc():
136 | import os, traceback
137 | str = traceback.format_exc()
138 | current_path = os.getcwd()
139 | replace_path = "."
140 | return str.replace(current_path, replace_path)
141 |
142 | def CatchException(f):
143 | """
144 | 装饰器函数,捕捉函数f中的异常并封装到一个生成器中返回,并显示到聊天当中。
145 | """
146 |
147 | @wraps(f)
148 | def decorated(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs):
149 | try:
150 | yield from f(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs)
151 | except Exception as e:
152 | from check_proxy import check_proxy
153 | from toolbox import get_conf
154 | proxies, = get_conf('proxies')
155 | tb_str = '```\n' + trimmed_format_exc() + '```'
156 | if len(chatbot_with_cookie) == 0:
157 | chatbot_with_cookie.clear()
158 | chatbot_with_cookie.append(["插件调度异常", "异常原因"])
159 | chatbot_with_cookie[-1] = (chatbot_with_cookie[-1][0],
160 | f"[Local Message] 实验性函数调用出错: \n\n{tb_str} \n\n当前代理可用性: \n\n{check_proxy(proxies)}")
161 | yield from update_ui(chatbot=chatbot_with_cookie, history=history, msg=f'异常 {e}') # 刷新界面
162 | return decorated
163 |
164 |
165 | def HotReload(f):
166 | """
167 | HotReload的装饰器函数,用于实现Python函数插件的热更新。
168 | 函数热更新是指在不停止程序运行的情况下,更新函数代码,从而达到实时更新功能。
169 | 在装饰器内部,使用wraps(f)来保留函数的元信息,并定义了一个名为decorated的内部函数。
170 | 内部函数通过使用importlib模块的reload函数和inspect模块的getmodule函数来重新加载并获取函数模块,
171 | 然后通过getattr函数获取函数名,并在新模块中重新加载函数。
172 | 最后,使用yield from语句返回重新加载过的函数,并在被装饰的函数上执行。
173 | 最终,装饰器函数返回内部函数。这个内部函数可以将函数的原始定义更新为最新版本,并执行函数的新版本。
174 | """
175 | @wraps(f)
176 | def decorated(*args, **kwargs):
177 | fn_name = f.__name__
178 | f_hot_reload = getattr(importlib.reload(inspect.getmodule(f)), fn_name)
179 | yield from f_hot_reload(*args, **kwargs)
180 | return decorated
181 |
182 |
183 | """
184 | ========================================================================
185 | 第二部分
186 | 其他小工具:
187 | - write_history_to_file: 将结果写入markdown文件中
188 | - regular_txt_to_markdown: 将普通文本转换为Markdown格式的文本。
189 | - report_execption: 向chatbot中添加简单的意外错误信息
190 | - text_divide_paragraph: 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
191 | - markdown_convertion: 用多种方式组合,将markdown转化为好看的html
192 | - format_io: 接管gradio默认的markdown处理方式
193 | - on_file_uploaded: 处理文件的上传(自动解压)
194 | - on_report_generated: 将生成的报告自动投射到文件上传区
195 | - clip_history: 当历史上下文过长时,自动截断
196 | - get_conf: 获取设置
197 | - select_api_key: 根据当前的模型类别,抽取可用的api-key
198 | ========================================================================
199 | """
200 |
201 | def get_reduce_token_percent(text):
202 | """
203 | * 此函数未来将被弃用
204 | """
205 | try:
206 | # text = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
207 | pattern = r"(\d+)\s+tokens\b"
208 | match = re.findall(pattern, text)
209 | EXCEED_ALLO = 500 # 稍微留一点余地,否则在回复时会因余量太少出问题
210 | max_limit = float(match[0]) - EXCEED_ALLO
211 | current_tokens = float(match[1])
212 | ratio = max_limit/current_tokens
213 | assert ratio > 0 and ratio < 1
214 | return ratio, str(int(current_tokens-max_limit))
215 | except:
216 | return 0.5, '不详'
217 |
218 |
219 | def write_history_to_file(history, file_basename=None, file_fullname=None, auto_caption=True):
220 | """
221 | 将对话记录history以Markdown格式写入文件中。如果没有指定文件名,则使用当前时间生成文件名。
222 | """
223 | import os
224 | import time
225 | if file_fullname is None:
226 | if file_basename is not None:
227 | file_fullname = pj(get_log_folder(), file_basename)
228 | else:
229 | file_fullname = pj(get_log_folder(), f'GPT-Academic-{gen_time_str()}.md')
230 | os.makedirs(os.path.dirname(file_fullname), exist_ok=True)
231 | with open(file_fullname, 'w', encoding='utf8') as f:
232 | f.write('# GPT-Academic Report\n')
233 | for i, content in enumerate(history):
234 | try:
235 | if type(content) != str: content = str(content)
236 | except:
237 | continue
238 | if i % 2 == 0 and auto_caption:
239 | f.write('## ')
240 | try:
241 | f.write(content)
242 | except:
243 | # remove everything that cannot be handled by utf8
244 | f.write(content.encode('utf-8', 'ignore').decode())
245 | f.write('\n\n')
246 | res = os.path.abspath(file_fullname)
247 | return res
248 |
249 |
250 | def regular_txt_to_markdown(text):
251 | """
252 | 将普通文本转换为Markdown格式的文本。
253 | """
254 | text = text.replace('\n', '\n\n')
255 | text = text.replace('\n\n\n', '\n\n')
256 | text = text.replace('\n\n\n', '\n\n')
257 | return text
258 |
259 |
260 |
261 |
262 | def report_execption(chatbot, history, a, b):
263 | """
264 | 向chatbot中添加错误信息
265 | """
266 | chatbot.append((a, b))
267 | history.extend([a, b])
268 |
269 |
270 | def text_divide_paragraph(text):
271 | """
272 | 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
273 | """
274 | pre = '<div class="markdown-body">'
275 | suf = '</div>'
276 | if text.startswith(pre) and text.endswith(suf):
277 | return text
278 |
279 | if '```' in text:
280 | # careful input
281 | return pre + text + suf
282 | else:
283 | # wtf input
284 | lines = text.split("\n")
285 | for i, line in enumerate(lines):
286 | lines[i] = lines[i].replace(" ", " ")
287 | text = "".join(lines)
288 | return pre + text + suf
289 |
290 |
291 | @lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
292 | def markdown_convertion(txt):
293 | """
294 | 将Markdown格式的文本转换为HTML格式。如果包含数学公式,则先将公式转换为HTML格式。
295 | """
296 | pre = '<div class="markdown-body">'
297 | suf = '</div>'
298 | if txt.startswith(pre) and txt.endswith(suf):
299 | # print('警告,输入了已经经过转化的字符串,二次转化可能出问题')
300 | return txt # 已经被转化过,不需要再次转化
301 |
302 | markdown_extension_configs = {
303 | 'mdx_math': {
304 | 'enable_dollar_delimiter': True,
305 | 'use_gitlab_delimiters': False,
306 | },
307 | }
307 | find_equation_pattern = r'<script type="math/tex(?:.*?)>(.*?)</script>'
342 | content = content.replace('</script>\n</script>', '</script>')
343 | return content
344 |
345 | def is_equation(txt):
346 | """
347 | 判定是否为公式 | 测试1 写出洛伦兹定律,使用tex格式公式 测试2 给出柯西不等式,使用latex格式 测试3 写出麦克斯韦方程组
348 | """
349 | if '```' in txt and '```reference' not in txt: return False
350 | if '$' not in txt and '\\[' not in txt: return False
351 | mathpatterns = {
352 | r'(?<!\\|\$)(\$)([^\$]+)(\$)': {'allow_multi_lines': False},  # $...$
509 | if file_created_time >= one_minute_ago:
510 | if os.path.isdir(file_path):
511 | continue
512 | recent_files.append(file_path)
513 |
514 | return recent_files
515 |
516 | def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
517 | # 将文件复制一份到下载区
518 | import shutil
519 | if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
520 | new_path = pj(get_log_folder(), rename_file)
521 | # 如果已经存在,先删除
522 | if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
523 | # 把文件复制过去
524 | if not os.path.exists(new_path): shutil.copyfile(file, new_path)
525 | # 将文件添加到chatbot cookie中,避免多用户干扰
526 | if chatbot:
527 | if 'files_to_promote' in chatbot._cookies: current = chatbot._cookies['files_to_promote']
528 | else: current = []
529 | chatbot._cookies.update({'files_to_promote': [new_path] + current})
530 |
531 | def disable_auto_promotion(chatbot):
532 | chatbot._cookies.update({'files_to_promote': []})
533 | return
534 |
535 | def is_the_upload_folder(string):
536 | PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
537 | pattern = r'^PATH_PRIVATE_UPLOAD/[A-Za-z0-9_-]+/\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$'
538 | pattern = pattern.replace('PATH_PRIVATE_UPLOAD', PATH_PRIVATE_UPLOAD)
539 | if re.match(pattern, string): return True
540 | else: return False
541 |
542 | def del_outdated_uploads(outdate_time_seconds):
543 | PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
544 | current_time = time.time()
545 | one_hour_ago = current_time - outdate_time_seconds
546 | # Get a list of all subdirectories in the PATH_PRIVATE_UPLOAD folder
547 | # Remove subdirectories that are older than one hour
548 | for subdirectory in glob.glob(f'{PATH_PRIVATE_UPLOAD}/*/*'):
549 | subdirectory_time = os.path.getmtime(subdirectory)
550 | if subdirectory_time < one_hour_ago:
551 | try: shutil.rmtree(subdirectory)
552 | except: pass
553 | return
554 |
555 | def on_file_uploaded(request: gradio.Request, files, chatbot, txt, txt2, checkboxes, cookies):
556 | """
557 | 当文件被上传时的回调函数
558 | """
559 | if len(files) == 0:
560 | return chatbot, txt
561 |
562 | # 移除过时的旧文件从而节省空间&保护隐私
563 | outdate_time_seconds = 60
564 | del_outdated_uploads(outdate_time_seconds)
565 |
566 | # 创建工作路径
567 | user_name = "default" if not request.username else request.username
568 | time_tag = gen_time_str()
569 | PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
570 | target_path_base = pj(PATH_PRIVATE_UPLOAD, user_name, time_tag)
571 | os.makedirs(target_path_base, exist_ok=True)
572 |
573 | # 逐个文件转移到目标路径
574 | upload_msg = ''
575 | for file in files:
576 | file_origin_name = os.path.basename(file.orig_name)
577 | this_file_path = pj(target_path_base, file_origin_name)
578 | shutil.move(file.name, this_file_path)
579 | upload_msg += extract_archive(file_path=this_file_path, dest_dir=this_file_path+'.extract')
580 |
581 | # 整理文件集合
582 | moved_files = [fp for fp in glob.glob(f'{target_path_base}/**/*', recursive=True)]
583 | if "底部输入区" in checkboxes:
584 | txt, txt2 = "", target_path_base
585 | else:
586 | txt, txt2 = target_path_base, ""
587 |
588 | # 输出消息
589 | moved_files_str = '\t\n\n'.join(moved_files)
590 | chatbot.append(['我上传了文件,请查收',
591 | f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
592 | f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
593 | f'\n\n现在您点击任意函数插件时,以上文件将被作为输入参数'+upload_msg])
594 |
595 | # 记录近期文件
596 | cookies.update({
597 | 'most_recent_uploaded': {
598 | 'path': target_path_base,
599 | 'time': time.time(),
600 | 'time_str': time_tag
601 | }})
602 | return chatbot, txt, txt2, cookies
603 |
604 |
605 | def on_report_generated(cookies, files, chatbot):
606 | from toolbox import find_recent_files
607 | PATH_LOGGING, = get_conf('PATH_LOGGING')
608 | if 'files_to_promote' in cookies:
609 | report_files = cookies['files_to_promote']
610 | cookies.pop('files_to_promote')
611 | else:
612 | report_files = find_recent_files(PATH_LOGGING)
613 | if len(report_files) == 0:
614 | return cookies, None, chatbot
615 | # files.extend(report_files)
616 | file_links = ''
617 | for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
618 | chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
619 | return cookies, report_files, chatbot
620 |
621 | def load_chat_cookies():
622 | API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
623 | if is_any_api_key(AZURE_API_KEY):
624 | if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
625 | else: API_KEY = AZURE_API_KEY
626 | return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
627 |
628 | def is_openai_api_key(key):
629 | CUSTOM_API_KEY_PATTERN, = get_conf('CUSTOM_API_KEY_PATTERN')
630 | if len(CUSTOM_API_KEY_PATTERN) != 0:
631 | API_MATCH_ORIGINAL = re.match(CUSTOM_API_KEY_PATTERN, key)
632 | else:
633 | API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
634 | return bool(API_MATCH_ORIGINAL)
635 |
636 | def is_azure_api_key(key):
637 | API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
638 | return bool(API_MATCH_AZURE)
639 |
640 | def is_api2d_key(key):
641 | API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
642 | return bool(API_MATCH_API2D)
643 |
644 | ############FreeAI更改的部分#################
645 | def is_freeai_api_key(key): # new add: matches Pandora Share/Pool Tokens (fk-/pk- plus 43 characters)
646 | API_MATCH_FREEAI0 = re.match(r"pk-[a-zA-Z0-9-_]{43}$", key)
647 | API_MATCH_FREEAI1 = re.match(r"fk-[a-zA-Z0-9-_]{43}$", key)
648 | return bool(API_MATCH_FREEAI0) or bool(API_MATCH_FREEAI1)
649 |
650 | def is_any_api_key(key):
651 | if ',' in key:
652 | keys = key.split(',')
653 | for k in keys:
654 | if is_any_api_key(k): return True
655 | return False
656 | else:#new add
657 | return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key) or is_freeai_api_key(key)
658 |
659 | def what_keys(keys):
660 | avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0, "FreeAI Key":0} # FreeAI Key entry initialized to avoid a KeyError below
661 | key_list = keys.split(',')
662 |
663 | for k in key_list:
664 | if is_openai_api_key(k):
665 | avail_key_list['OpenAI Key'] += 1
666 |
667 | for k in key_list:
668 | if is_api2d_key(k):
669 | avail_key_list['API2D Key'] += 1
670 |
671 | for k in key_list:
672 | if is_azure_api_key(k):
673 | avail_key_list['Azure Key'] += 1
674 |
675 | for k in key_list: # new add
676 | if is_freeai_api_key(k):
677 | avail_key_list['FreeAI Key'] += 1
678 | # new add
679 | return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']} 个, FreeAI Key {avail_key_list['FreeAI Key']} 个"
680 |
681 | def select_api_key(keys, llm_model):
682 | import random
683 | avail_key_list = []
684 | key_list = keys.split(',')
685 |
686 | if llm_model.startswith('gpt-'):
687 | for k in key_list:
688 | if is_openai_api_key(k): avail_key_list.append(k)
689 | for k in key_list:# new add
690 | if is_freeai_api_key(k): avail_key_list.append(k)
691 |
692 | if llm_model.startswith('api2d-'):
693 | for k in key_list:
694 | if is_api2d_key(k): avail_key_list.append(k)
695 |
696 | if llm_model.startswith('azure-'):
697 | for k in key_list:
698 | if is_azure_api_key(k): avail_key_list.append(k)
699 |
700 | if len(avail_key_list) == 0:
701 | raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源(右下角更换模型菜单中可切换openai,azure,claude,api2d等请求源)。")
702 |
703 | api_key = random.choice(avail_key_list) # 随机负载均衡
704 | return api_key
705 |
706 | ###########################################
707 |
708 |
709 | def read_env_variable(arg, default_value):
710 | """
711 | 环境变量可以是 `GPT_ACADEMIC_CONFIG`(优先),也可以直接是`CONFIG`
712 | 例如在windows cmd中,既可以写:
713 | set USE_PROXY=True
714 | set API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
715 | set proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
716 | set AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
717 | set AUTHENTICATION=[("username", "password"), ("username2", "password2")]
718 | 也可以写:
719 | set GPT_ACADEMIC_USE_PROXY=True
720 | set GPT_ACADEMIC_API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
721 | set GPT_ACADEMIC_proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
722 | set GPT_ACADEMIC_AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
723 | set GPT_ACADEMIC_AUTHENTICATION=[("username", "password"), ("username2", "password2")]
724 | """
725 | from colorful import print亮红, print亮绿
726 | arg_with_prefix = "GPT_ACADEMIC_" + arg
727 | if arg_with_prefix in os.environ:
728 | env_arg = os.environ[arg_with_prefix]
729 | elif arg in os.environ:
730 | env_arg = os.environ[arg]
731 | else:
732 | raise KeyError
733 | print(f"[ENV_VAR] 尝试加载{arg},默认值:{default_value} --> 修正值:{env_arg}")
734 | try:
735 | if isinstance(default_value, bool):
736 | env_arg = env_arg.strip()
737 | if env_arg == 'True': r = True
738 | elif env_arg == 'False': r = False
739 | else: print('enter True or False, but have:', env_arg); r = default_value
740 | elif isinstance(default_value, int):
741 | r = int(env_arg)
742 | elif isinstance(default_value, float):
743 | r = float(env_arg)
744 | elif isinstance(default_value, str):
745 | r = env_arg.strip()
746 | elif isinstance(default_value, dict):
747 | r = eval(env_arg)
748 | elif isinstance(default_value, list):
749 | r = eval(env_arg)
750 | elif default_value is None:
751 | assert arg == "proxies"
752 | r = eval(env_arg)
753 | else:
754 | print亮红(f"[ENV_VAR] 环境变量{arg}不支持通过环境变量设置! ")
755 | raise KeyError
756 | except:
757 | print亮红(f"[ENV_VAR] 环境变量{arg}加载失败! ")
758 | raise KeyError(f"[ENV_VAR] 环境变量{arg}加载失败! ")
759 |
760 | print亮绿(f"[ENV_VAR] 成功读取环境变量{arg}")
761 | return r
762 |
763 | @lru_cache(maxsize=128)
764 | def read_single_conf_with_lru_cache(arg):
765 | from colorful import print亮红, print亮绿, print亮蓝
766 | try:
767 | # 优先级1. 获取环境变量作为配置
768 | default_ref = getattr(importlib.import_module('config'), arg) # 读取默认值作为数据类型转换的参考
769 | r = read_env_variable(arg, default_ref)
770 | except:
771 | try:
772 | # 优先级2. 获取config_private中的配置
773 | r = getattr(importlib.import_module('config_private'), arg)
774 | except:
775 | # 优先级3. 获取config中的配置
776 | r = getattr(importlib.import_module('config'), arg)
777 |
778 | # 在读取API_KEY时,检查一下是不是忘了改config
779 | if arg == 'API_KEY':
780 | print亮蓝(f"[API_KEY] 本项目现已支持OpenAI和Azure的api-key。也支持同时填写多个api-key,如API_KEY=\"openai-key1,openai-key2,azure-key3\"")
781 | print亮蓝(f"[API_KEY] 您既可以在config.py中修改api-key(s),也可以在问题输入区输入临时的api-key(s),然后回车键提交后即可生效。")
782 | if is_any_api_key(r):
783 | print亮绿(f"[API_KEY] 您的 API_KEY 是: {r[:15]}*** API_KEY 导入成功")
784 | else:
785 | print亮红( "[API_KEY] 您的 API_KEY 不满足任何一种已知的密钥格式,请在config文件中修改API密钥之后再运行。")
786 | if arg == 'proxies':
787 | if not read_single_conf_with_lru_cache('USE_PROXY'): r = None # 检查USE_PROXY,防止proxies单独起作用
788 | if r is None:
789 | print亮红('[PROXY] 网络代理状态:未配置。无代理状态下很可能无法访问OpenAI家族的模型。建议:检查USE_PROXY选项是否修改。')
790 | else:
791 | print亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', r)
792 | assert isinstance(r, dict), 'proxies格式错误,请注意proxies选项的格式,不要遗漏括号。'
793 | return r
794 |
795 |
796 | @lru_cache(maxsize=128)
797 | def get_conf(*args):
798 | # 建议您复制一个config_private.py放自己的秘密, 如API和代理网址, 避免不小心传github被别人看到
799 | res = []
800 | for arg in args:
801 | r = read_single_conf_with_lru_cache(arg)
802 | res.append(r)
803 | return res
804 |
805 |
806 | def clear_line_break(txt):
807 | txt = txt.replace('\n', ' ')
808 | txt = txt.replace(' ', ' ')
809 | txt = txt.replace(' ', ' ')
810 | return txt
811 |
812 |
813 | class DummyWith():
814 | """
815 | 这段代码定义了一个名为DummyWith的空上下文管理器,
816 | 它的作用是……额……就是不起作用,即在代码结构不变得情况下取代其他的上下文管理器。
817 | 上下文管理器是一种Python对象,用于与with语句一起使用,
818 | 以确保一些资源在代码块执行期间得到正确的初始化和清理。
819 | 上下文管理器必须实现两个方法,分别为 __enter__()和 __exit__()。
820 | 在上下文执行开始的情况下,__enter__()方法会在代码块被执行前被调用,
821 | 而在上下文执行结束时,__exit__()方法则会被调用。
822 | """
823 | def __enter__(self):
824 | return self
825 |
826 | def __exit__(self, exc_type, exc_value, traceback):
827 | return
828 |
829 | def run_gradio_in_subpath(demo, auth, port, custom_path):
830 | """
831 | 把gradio的运行地址更改到指定的二次路径上
832 | """
833 | def is_path_legal(path: str)->bool:
834 | '''
835 | check path for sub url
836 | path: path to check
837 | return value: do sub url wrap
838 | '''
839 | if path == "/": return True
840 | if len(path) == 0:
841 | print("illegal custom path: {}\npath must not be empty\ndeploy on root url".format(path))
842 | return False
843 | if path[0] == '/':
844 | if path[1] != '/':
845 | print("deploy on sub-path {}".format(path))
846 | return True
847 | return False
848 | print("illegal custom path: {}\npath should begin with \'/\'\ndeploy on root url".format(path))
849 | return False
850 |
851 | if not is_path_legal(custom_path): raise RuntimeError('Illegal custom path')
852 | import uvicorn
853 | import gradio as gr
854 | from fastapi import FastAPI
855 | app = FastAPI()
856 | if custom_path != "/":
857 | @app.get("/")
858 | def read_main():
859 | return {"message": f"Gradio is running at: {custom_path}"}
860 | app = gr.mount_gradio_app(app, demo, path=custom_path)
861 | uvicorn.run(app, host="0.0.0.0", port=port) # , auth=auth
862 |
863 |
864 | def clip_history(inputs, history, tokenizer, max_token_limit):
865 | """
866 | reduce the length of history by clipping.
867 | this function search for the longest entries to clip, little by little,
868 | until the number of token of history is reduced under threshold.
869 | 通过裁剪来缩短历史记录的长度。
870 | 此函数逐渐地搜索最长的条目进行剪辑,
871 | 直到历史记录的标记数量降低到阈值以下。
872 | """
873 | import numpy as np
874 | from request_llm.bridge_all import model_info
875 | def get_token_num(txt):
876 | return len(tokenizer.encode(txt, disallowed_special=()))
877 | input_token_num = get_token_num(inputs)
878 | if input_token_num < max_token_limit * 3 / 4:
879 | # 当输入部分的token占比小于限制的3/4时,裁剪时
880 | # 1. 把input的余量留出来
881 | max_token_limit = max_token_limit - input_token_num
882 | # 2. 把输出用的余量留出来
883 | max_token_limit = max_token_limit - 128
884 | # 3. 如果余量太小了,直接清除历史
885 | if max_token_limit < 128:
886 | history = []
887 | return history
888 | else:
889 | # 当输入部分的token占比 > 限制的3/4时,直接清除历史
890 | history = []
891 | return history
892 |
893 | everything = ['']
894 | everything.extend(history)
895 | n_token = get_token_num('\n'.join(everything))
896 | everything_token = [get_token_num(e) for e in everything]
897 |
898 | # 截断时的颗粒度
899 | delta = max(everything_token) // 16
900 |
901 | while n_token > max_token_limit:
902 | where = np.argmax(everything_token)
903 | encoded = tokenizer.encode(everything[where], disallowed_special=())
904 | clipped_encoded = encoded[:len(encoded)-delta]
905 | everything[where] = tokenizer.decode(clipped_encoded)[:-1] # -1 to remove the may-be illegal char
906 | everything_token[where] = get_token_num(everything[where])
907 | n_token = get_token_num('\n'.join(everything))
908 |
909 | history = everything[1:]
910 | return history
911 |
912 | """
913 | ========================================================================
914 | 第三部分
915 | 其他小工具:
916 | - zip_folder: 把某个路径下所有文件压缩,然后转移到指定的另一个路径中(gpt写的)
917 | - gen_time_str: 生成时间戳
918 | - ProxyNetworkActivate: 临时地启动代理网络(如果有)
919 | - objdump/objload: 快捷的调试函数
920 | ========================================================================
921 | """
922 |
923 | def zip_folder(source_folder, dest_folder, zip_name):
924 | import zipfile
925 | import os
926 | # Make sure the source folder exists
927 | if not os.path.exists(source_folder):
928 | print(f"{source_folder} does not exist")
929 | return
930 |
931 | # Make sure the destination folder exists
932 | if not os.path.exists(dest_folder):
933 | print(f"{dest_folder} does not exist")
934 | return
935 |
936 | # Create the name for the zip file
937 | zip_file = pj(dest_folder, zip_name)
938 |
939 | # Create a ZipFile object
940 | with zipfile.ZipFile(zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
941 | # Walk through the source folder and add files to the zip file
942 | for foldername, subfolders, filenames in os.walk(source_folder):
943 | for filename in filenames:
944 | filepath = pj(foldername, filename)
945 | zipf.write(filepath, arcname=os.path.relpath(filepath, source_folder))
946 |
947 | # Move the zip file to the destination folder (if it wasn't already there)
948 | if os.path.dirname(zip_file) != dest_folder:
949 | os.rename(zip_file, pj(dest_folder, os.path.basename(zip_file)))
950 | zip_file = pj(dest_folder, os.path.basename(zip_file))
951 |
952 | print(f"Zip file created at {zip_file}")
953 |
954 | def zip_result(folder):
955 | t = gen_time_str()
956 | zip_folder(folder, get_log_folder(), f'{t}-result.zip')
957 | return pj(get_log_folder(), f'{t}-result.zip')
958 |
959 | def gen_time_str():
960 | import time
961 | return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
962 |
963 | def get_log_folder(user='default', plugin_name='shared'):
964 | PATH_LOGGING, = get_conf('PATH_LOGGING')
965 | _dir = pj(PATH_LOGGING, user, plugin_name)
966 | if not os.path.exists(_dir): os.makedirs(_dir)
967 | return _dir
968 |
969 | class ProxyNetworkActivate():
970 | """
971 | 这段代码定义了一个名为TempProxy的空上下文管理器, 用于给一小段代码上代理
972 | """
973 | def __enter__(self):
974 | from toolbox import get_conf
975 | proxies, = get_conf('proxies')
976 | if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
977 | if proxies is not None:
978 | if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
979 | if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
980 | return self
981 |
982 | def __exit__(self, exc_type, exc_value, traceback):
983 | os.environ['no_proxy'] = '*'
984 | if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
985 | if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
986 | return
987 |
988 | def objdump(obj, file='objdump.tmp'):
989 | import pickle
990 | with open(file, 'wb+') as f:
991 | pickle.dump(obj, f)
992 | return
993 |
994 | def objload(file='objdump.tmp'):
995 | import pickle, os
996 | if not os.path.exists(file):
997 | return
998 | with open(file, 'rb') as f:
999 | return pickle.load(f)
1000 |
1001 | def Singleton(cls):
1002 | """
1003 | 一个单实例装饰器
1004 | """
1005 | _instance = {}
1006 |
1007 | def _singleton(*args, **kargs):
1008 | if cls not in _instance:
1009 | _instance[cls] = cls(*args, **kargs)
1010 | return _instance[cls]
1011 |
1012 | return _singleton
1013 |
1014 | """
1015 | ========================================================================
1016 | 第四部分
1017 | 接驳虚空终端:
1018 | - set_conf: 在运行过程中动态地修改配置
1019 | - set_multi_conf: 在运行过程中动态地修改多个配置
1020 | - get_plugin_handle: 获取插件的句柄
1021 | - get_plugin_default_kwargs: 获取插件的默认参数
1022 | - get_chat_handle: 获取简单聊天的句柄
1023 | - get_chat_default_kwargs: 获取简单聊天的默认参数
1024 | ========================================================================
1025 | """
1026 |
1027 | def set_conf(key, value):
1028 | from toolbox import read_single_conf_with_lru_cache, get_conf
1029 | read_single_conf_with_lru_cache.cache_clear()
1030 | get_conf.cache_clear()
1031 | os.environ[key] = str(value)
1032 | altered, = get_conf(key)
1033 | return altered
1034 |
1035 | def set_multi_conf(dic):
1036 | for k, v in dic.items(): set_conf(k, v)
1037 | return
1038 |
1039 | def get_plugin_handle(plugin_name):
1040 | """
1041 | e.g. plugin_name = 'crazy_functions.批量Markdown翻译->Markdown翻译指定语言'
1042 | """
1043 | import importlib
1044 | assert '->' in plugin_name, \
1045 | "Example of plugin_name: crazy_functions.批量Markdown翻译->Markdown翻译指定语言"
1046 | module, fn_name = plugin_name.split('->')
1047 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
1048 | return f_hot_reload
1049 |
1050 | def get_chat_handle():
1051 | """
1052 | """
1053 | from request_llm.bridge_all import predict_no_ui_long_connection
1054 | return predict_no_ui_long_connection
1055 |
1056 | def get_plugin_default_kwargs():
1057 | """
1058 | """
1059 | from toolbox import get_conf, ChatBotWithCookies
1060 |
1061 | WEB_PORT, LLM_MODEL, API_KEY = \
1062 | get_conf('WEB_PORT', 'LLM_MODEL', 'API_KEY')
1063 |
1064 | llm_kwargs = {
1065 | 'api_key': API_KEY,
1066 | 'llm_model': LLM_MODEL,
1067 | 'top_p':1.0,
1068 | 'max_length': None,
1069 | 'temperature':1.0,
1070 | }
1071 | chatbot = ChatBotWithCookies(llm_kwargs)
1072 |
1073 | # txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port
1074 | DEFAULT_FN_GROUPS_kwargs = {
1075 | "main_input": "./README.md",
1076 | "llm_kwargs": llm_kwargs,
1077 | "plugin_kwargs": {},
1078 | "chatbot_with_cookie": chatbot,
1079 | "history": [],
1080 | "system_prompt": "You are a good AI.",
1081 | "web_port": WEB_PORT
1082 | }
1083 | return DEFAULT_FN_GROUPS_kwargs
1084 |
1085 | def get_chat_default_kwargs():
1086 | """
1087 | """
1088 | from toolbox import get_conf
1089 |
1090 | LLM_MODEL, API_KEY = get_conf('LLM_MODEL', 'API_KEY')
1091 |
1092 | llm_kwargs = {
1093 | 'api_key': API_KEY,
1094 | 'llm_model': LLM_MODEL,
1095 | 'top_p':1.0,
1096 | 'max_length': None,
1097 | 'temperature':1.0,
1098 | }
1099 |
1100 | default_chat_kwargs = {
1101 | "inputs": "Hello there, are you ready?",
1102 | "llm_kwargs": llm_kwargs,
1103 | "history": [],
1104 | "sys_prompt": "You are AI assistant",
1105 | "observe_window": None,
1106 | "console_slience": False,
1107 | }
1108 |
1109 | return default_chat_kwargs
1110 |
1111 |
--------------------------------------------------------------------------------
/images/error/pandora_public_pool_token.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/elphen-wang/FreeAI/25cd8bfff04607fc73a464bfcff948d102b65c4f/images/error/pandora_public_pool_token.png
--------------------------------------------------------------------------------
/old/3.43/config.py:
--------------------------------------------------------------------------------
1 | """
2 | 以下所有配置也都支持利用环境变量覆写,环境变量配置格式见docker-compose.yml。
3 | 读取优先级:环境变量 > config_private.py > config.py
4 | --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
5 | All the following configurations also support using environment variables to override,
6 | and the environment variable configuration format can be seen in docker-compose.yml.
7 | Configuration reading priority: environment variable > config_private.py > config.py
8 | """
9 |
10 | # [step 1]>> API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY'。极少数情况下,还需要填写组织(格式如org-123456789abcdefghijklmno的),请向下翻,找 API_ORG 设置项
11 | API_KEY = "此处填API密钥" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY'
12 |
13 | ####################定时更新API_KEY的部分####################
14 | import requests
15 | from apscheduler.schedulers.background import BackgroundScheduler
16 | ELPHEN_URL="https://api.elphen.site/api?mode=default_my_poolkey"
17 | def GET_API_KEY(url=ELPHEN_URL):
18 | response = requests.get(url)
19 | global API_KEY
20 | if response.status_code == 200:
21 | API_KEY=response.json()
22 | else:
23 | API_KEY="pk-this-is-a-real-free-pool-token-for-everyone"
24 | return API_KEY
25 |
26 | API_KEY=GET_API_KEY(ELPHEN_URL)
27 | # 创建定时任务的线程
28 | scheduler = BackgroundScheduler()
29 | scheduler.add_job(GET_API_KEY, trigger='cron', hour=5, minute=8)#定时每天凌晨5点8分更新一次FreeAI Key
30 | # 启动定时任务的调度器
31 | scheduler.start()
32 | #print(API_KEY)
33 | ###########################################################
34 |
35 | # [step 2]>> 改为True应用代理,如果直接在海外服务器部署,此处不修改
36 | USE_PROXY = False
37 | if USE_PROXY:
38 | """
39 | 填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
40 | <配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
41 | [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
42 | [地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了(localhost意思是代理软件安装在本机上)
43 | [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
44 | """
45 | # 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
46 | proxies = {
47 | # [协议]:// [地址] :[端口]
48 | "http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
49 | "https": "socks5h://localhost:11284", # 再例如 "https": "http://127.0.0.1:7890",
50 | }
51 | else:
52 | proxies = None
53 |
54 | # ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
55 |
56 | # URL重定向,实现更换API_URL的作用(高危设置! 常规情况下不要修改! 通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人!)
57 | # 格式: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
58 | # 举例: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "https://reverse-proxy-url/v1/chat/completions"}
59 | API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://ai.fakeopen.com/v1/chat/completions"}
60 |
61 |
62 | # 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
63 | # 一言以蔽之:免费(5刀)用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询:https://platform.openai.com/docs/guides/rate-limits/overview
64 | DEFAULT_WORKER_NUM = 3
65 |
66 |
67 | # 对话窗的高度
68 | CHATBOT_HEIGHT = 1115
69 |
70 |
71 | # 代码高亮
72 | CODE_HIGHLIGHT = True
73 |
74 |
75 | # 窗口布局
76 | LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
77 | DARK_MODE = True # 暗色模式 / 亮色模式
78 |
79 |
80 | # 发送请求到OpenAI后,等待多久判定为超时
81 | TIMEOUT_SECONDS = 30
82 |
83 |
84 | # 网页的端口, -1代表随机端口
85 | WEB_PORT = 86
86 |
87 |
88 | # 如果OpenAI不响应(网络卡顿、代理失败、KEY失效),重试的次数限制
89 | MAX_RETRY = 2
90 |
91 |
92 | # 模型选择 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
93 | LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
94 | AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
95 | # P.S. 其他可用的模型还包括 ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "claude-1-100k", "claude-2", "internlm", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
96 |
97 |
98 | # ChatGLM(2) Finetune Model Path (如果使用ChatGLM2微调模型,需要把"chatglmft"加入AVAIL_LLM_MODELS中)
99 | ChatGLM_PTUNING_CHECKPOINT = "" # 例如"/home/hmp/ChatGLM2-6B/ptuning/output/6b-pt-128-1e-2/checkpoint-100"
100 |
101 |
102 | # 本地LLM模型如ChatGLM的执行方式 CPU/GPU
103 | LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
104 | LOCAL_MODEL_QUANT = "FP16" # 默认 "FP16" "INT4" 启用量化INT4版本 "INT8" 启用量化INT8版本
105 |
106 |
107 | # 设置gradio的并行线程数(不需要修改)
108 | CONCURRENT_COUNT = 100
109 |
110 |
111 | # 是否在提交时自动清空输入框
112 | AUTO_CLEAR_TXT = False
113 |
114 |
115 | # 色彩主题,可选 ["Default", "Chuanhu-Small-and-Beautiful"]
116 | THEME = "Default"
117 |
118 |
119 | # 加一个live2d装饰
120 | ADD_WAIFU = False
121 |
122 |
123 | # 设置用户名和密码(不需要修改)(相关功能不稳定,与gradio版本和网络都相关,如果本地使用不建议加这个)
124 | # [("username", "password"), ("username2", "password2"), ...]
125 | AUTHENTICATION = []
126 |
127 |
128 | # 如果需要在二级路径下运行(常规情况下,不要修改!!)(需要配合修改main.py才能生效!)
129 | CUSTOM_PATH = "/"
130 |
131 |
132 | # 极少数情况下,openai的官方KEY需要伴随组织编码(格式如org-xxxxxxxxxxxxxxxxxxxxxxxx)使用
133 | API_ORG = ""
134 |
135 |
136 | # 如果需要使用Slack Claude,使用教程详情见 request_llm/README.md
137 | SLACK_CLAUDE_BOT_ID = ''
138 | SLACK_CLAUDE_USER_TOKEN = ''
139 |
140 |
141 | # 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
142 | AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
143 | AZURE_API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY' # 建议直接在API_KEY处填写,该选项即将被弃用
144 | AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
145 |
146 |
147 | # 使用Newbing
148 | NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
149 | NEWBING_COOKIES = """
150 | put your new bing cookies here
151 | """
152 |
153 |
154 | # 阿里云实时语音识别 配置难度较高 仅建议高手用户使用 参考 https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md
155 | ENABLE_AUDIO = False
156 | ALIYUN_TOKEN="" # 例如 f37f30e0f9934c34a992f6f64f7eba4f
157 | ALIYUN_APPKEY="" # 例如 RoPlZrM88DnAFkZK
158 | ALIYUN_ACCESSKEY="" # (无需填写)
159 | ALIYUN_SECRET="" # (无需填写)
160 |
161 | # Claude API KEY
162 | ANTHROPIC_API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY'
163 |
164 |
165 | # 自定义API KEY格式
166 | CUSTOM_API_KEY_PATTERN = ""
167 |
--------------------------------------------------------------------------------
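上面 config.py 的定时拉取逻辑假定 api.elphen.site 直接返回一个 key;而 gpt_academic 要求多个 key 以英文逗号拼接。下面是一段把"单个字符串/列表"两种可能的返回都归一化、并带超时与兜底的示意代码(接口的具体返回格式属假设,仅供参考):

```python
import requests

FALLBACK = "pk-this-is-a-real-free-pool-token-for-everyone"  # 公共池子兜底

def fetch_freeai_key(url="https://api.elphen.site/api?mode=default_my_poolkey"):
    """示意:拉取 FreeAI key,把字符串/列表两种可能的返回统一成逗号分隔形式。"""
    try:
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        data = resp.json()
    except (requests.RequestException, ValueError):
        return FALLBACK  # 网络异常或返回非 JSON 时退回公共池子
    if isinstance(data, list):
        return ','.join(k for k in data if k)
    return data if isinstance(data, str) else FALLBACK
```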
/old/3.43/requirements.txt:
--------------------------------------------------------------------------------
1 | ./docs/gradio-3.32.2-py3-none-any.whl
2 | pydantic==1.10.11
3 | tiktoken>=0.3.3
4 | requests[socks]
5 | transformers
6 | python-markdown-math
7 | beautifulsoup4
8 | prompt_toolkit
9 | latex2mathml
10 | python-docx
11 | mdtex2html
12 | anthropic
13 | colorama
14 | Markdown
15 | pygments
16 | pymupdf
17 | openai
18 | numpy
19 | arxiv
20 | rich
21 | pypdf2==2.12.1
22 | pdfminer
23 | pdflatex
24 | apscheduler
--------------------------------------------------------------------------------
/old/3.43/toolbox.py:
--------------------------------------------------------------------------------
1 | import markdown
2 | import importlib
3 | import time
4 | import inspect
5 | import re
6 | import os
7 | import gradio
8 | from latex2mathml.converter import convert as tex2mathml
9 | from functools import wraps, lru_cache
10 | pj = os.path.join
11 |
12 | """
13 | ========================================================================
14 | 第一部分
15 | 函数插件输入输出接驳区
16 | - ChatBotWithCookies: 带Cookies的Chatbot类,为实现更多强大的功能做基础
17 | - ArgsGeneralWrapper: 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构
18 | - update_ui: 刷新界面用 yield from update_ui(chatbot, history)
19 | - CatchException: 将插件中出的所有问题显示在界面上
20 | - HotReload: 实现插件的热更新
21 | - trimmed_format_exc: 打印traceback,为了安全而隐藏绝对地址
22 | ========================================================================
23 | """
24 |
25 | class ChatBotWithCookies(list):
26 | def __init__(self, cookie):
27 | self._cookies = cookie
28 |
29 | def write_list(self, list):
30 | for t in list:
31 | self.append(t)
32 |
33 | def get_list(self):
34 | return [t for t in self]
35 |
36 | def get_cookies(self):
37 | return self._cookies
38 |
39 |
40 | def ArgsGeneralWrapper(f):
41 | """
42 | 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构。
43 | """
44 | def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
45 | txt_passon = txt
46 | if txt == "" and txt2 != "": txt_passon = txt2
47 | # 引入一个有cookie的chatbot
48 | cookies.update({
49 | 'top_p':top_p,
50 | 'temperature':temperature,
51 | })
52 | llm_kwargs = {
53 | 'api_key': cookies['api_key'],
54 | 'llm_model': llm_model,
55 | 'top_p':top_p,
56 | 'max_length': max_length,
57 | 'temperature':temperature,
58 | 'client_ip': request.client.host,
59 | }
60 | plugin_kwargs = {
61 | "advanced_arg": plugin_advanced_arg,
62 | }
63 | chatbot_with_cookie = ChatBotWithCookies(cookies)
64 | chatbot_with_cookie.write_list(chatbot)
65 | if cookies.get('lock_plugin', None) is None:
66 | # 正常状态
67 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
68 | else:
69 | # 处理个别特殊插件的锁定状态
70 | module, fn_name = cookies['lock_plugin'].split('->')
71 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
72 | yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
73 | return decorated
74 |
75 |
76 | def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
77 | """
78 | 刷新用户界面
79 | """
80 | assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
81 | cookies = chatbot.get_cookies()
82 |
83 | # 解决插件锁定时的界面显示问题
84 | if cookies.get('lock_plugin', None):
85 | label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
86 | chatbot_gr = gradio.update(value=chatbot, label=label)
87 | if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
88 | elif cookies.get('label', None):
89 | chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
90 | cookies['label'] = None # 清空label
91 | else:
92 | chatbot_gr = chatbot
93 |
94 | yield cookies, chatbot_gr, history, msg
95 |
96 | def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
97 | """
98 | 刷新用户界面
99 | """
100 | if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
101 | chatbot[-1] = list(chatbot[-1])
102 | chatbot[-1][-1] = lastmsg
103 | yield from update_ui(chatbot=chatbot, history=history)
104 | time.sleep(delay)
105 |
106 |
107 | def trimmed_format_exc():
108 | import os, traceback
109 | str = traceback.format_exc()
110 | current_path = os.getcwd()
111 | replace_path = "."
112 | return str.replace(current_path, replace_path)
113 |
114 | def CatchException(f):
115 | """
116 | 装饰器函数,捕捉函数f中的异常并封装到一个生成器中返回,并显示到聊天当中。
117 | """
118 |
119 | @wraps(f)
120 | def decorated(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs):
121 | try:
122 | yield from f(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs)
123 | except Exception as e:
124 | from check_proxy import check_proxy
125 | from toolbox import get_conf
126 | proxies, = get_conf('proxies')
127 | tb_str = '```\n' + trimmed_format_exc() + '```'
128 | if len(chatbot_with_cookie) == 0:
129 | chatbot_with_cookie.clear()
130 | chatbot_with_cookie.append(["插件调度异常", "异常原因"])
131 | chatbot_with_cookie[-1] = (chatbot_with_cookie[-1][0],
132 | f"[Local Message] 实验性函数调用出错: \n\n{tb_str} \n\n当前代理可用性: \n\n{check_proxy(proxies)}")
133 | yield from update_ui(chatbot=chatbot_with_cookie, history=history, msg=f'异常 {e}') # 刷新界面
134 | return decorated
135 |
136 |
137 | def HotReload(f):
138 | """
139 | HotReload的装饰器函数,用于实现Python函数插件的热更新。
140 | 函数热更新是指在不停止程序运行的情况下,更新函数代码,从而达到实时更新功能。
141 | 在装饰器内部,使用wraps(f)来保留函数的元信息,并定义了一个名为decorated的内部函数。
142 | 内部函数通过使用importlib模块的reload函数和inspect模块的getmodule函数来重新加载并获取函数模块,
143 | 然后通过getattr函数获取函数名,并在新模块中重新加载函数。
144 | 最后,使用yield from语句返回重新加载过的函数,并在被装饰的函数上执行。
145 | 最终,装饰器函数返回内部函数。这个内部函数可以将函数的原始定义更新为最新版本,并执行函数的新版本。
146 | """
147 | @wraps(f)
148 | def decorated(*args, **kwargs):
149 | fn_name = f.__name__
150 | f_hot_reload = getattr(importlib.reload(inspect.getmodule(f)), fn_name)
151 | yield from f_hot_reload(*args, **kwargs)
152 | return decorated
153 |
154 |
155 | """
156 | ========================================================================
157 | 第二部分
158 | 其他小工具:
159 | - write_results_to_file: 将结果写入markdown文件中
160 | - regular_txt_to_markdown: 将普通文本转换为Markdown格式的文本。
161 | - report_execption: 向chatbot中添加简单的意外错误信息
162 | - text_divide_paragraph: 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
163 | - markdown_convertion: 用多种方式组合,将markdown转化为好看的html
164 | - format_io: 接管gradio默认的markdown处理方式
165 | - on_file_uploaded: 处理文件的上传(自动解压)
166 | - on_report_generated: 将生成的报告自动投射到文件上传区
167 | - clip_history: 当历史上下文过长时,自动截断
168 | - get_conf: 获取设置
169 | - select_api_key: 根据当前的模型类别,抽取可用的api-key
170 | ========================================================================
171 | """
172 |
173 | def get_reduce_token_percent(text):
174 | """
175 | * 此函数未来将被弃用
176 | """
177 | try:
178 | # text = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
179 | pattern = r"(\d+)\s+tokens\b"
180 | match = re.findall(pattern, text)
181 | EXCEED_ALLO = 500 # 稍微留一点余地,否则在回复时会因余量太少出问题
182 | max_limit = float(match[0]) - EXCEED_ALLO
183 | current_tokens = float(match[1])
184 | ratio = max_limit/current_tokens
185 | assert ratio > 0 and ratio < 1
186 | return ratio, str(int(current_tokens-max_limit))
187 | except:
188 | return 0.5, '不详'
189 |
190 |
191 | def write_results_to_file(history, file_name=None):
192 | """
193 | 将对话记录history以Markdown格式写入文件中。如果没有指定文件名,则使用当前时间生成文件名。
194 | """
195 | import os
196 | import time
197 | if file_name is None:
198 | # file_name = time.strftime("chatGPT分析报告%Y-%m-%d-%H-%M-%S", time.localtime()) + '.md'
199 | file_name = 'GPT-Report-' + gen_time_str() + '.md'
200 | os.makedirs('./gpt_log/', exist_ok=True)
201 | with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
202 | f.write('# GPT-Academic Report\n')
203 | for i, content in enumerate(history):
204 | try:
205 | if type(content) != str: content = str(content)
206 | except:
207 | continue
208 | if i % 2 == 0:
209 | f.write('## ')
210 | try:
211 | f.write(content)
212 | except:
213 | # remove everything that cannot be handled by utf8
214 | f.write(content.encode('utf-8', 'ignore').decode())
215 | f.write('\n\n')
216 | res = '以上材料已经被写入:\t' + os.path.abspath(f'./gpt_log/{file_name}')
217 | print(res)
218 | return res
219 |
220 |
221 | def write_history_to_file(history, file_basename=None, file_fullname=None):
222 | """
223 | 将对话记录history以Markdown格式写入文件中。如果没有指定文件名,则使用当前时间生成文件名。
224 | """
225 | import os
226 | import time
227 | if file_fullname is None:
228 | if file_basename is not None:
229 | file_fullname = os.path.join(get_log_folder(), file_basename)
230 | else:
231 | file_fullname = os.path.join(get_log_folder(), f'GPT-Academic-{gen_time_str()}.md')
232 | os.makedirs(os.path.dirname(file_fullname), exist_ok=True)
233 | with open(file_fullname, 'w', encoding='utf8') as f:
234 | f.write('# GPT-Academic Report\n')
235 | for i, content in enumerate(history):
236 | try:
237 | if type(content) != str: content = str(content)
238 | except:
239 | continue
240 | if i % 2 == 0:
241 | f.write('## ')
242 | try:
243 | f.write(content)
244 | except:
245 | # remove everything that cannot be handled by utf8
246 | f.write(content.encode('utf-8', 'ignore').decode())
247 | f.write('\n\n')
248 | res = os.path.abspath(file_fullname)
249 | return res
250 |
251 |
252 | def regular_txt_to_markdown(text):
253 | """
254 | 将普通文本转换为Markdown格式的文本。
255 | """
256 | text = text.replace('\n', '\n\n')
257 | text = text.replace('\n\n\n', '\n\n')
258 | text = text.replace('\n\n\n', '\n\n')
259 | return text
260 |
261 |
262 |
263 |
264 | def report_execption(chatbot, history, a, b):
265 | """
266 | 向chatbot中添加错误信息
267 | """
268 | chatbot.append((a, b))
269 | history.append(a)
270 | history.append(b)
271 |
272 |
273 | def text_divide_paragraph(text):
274 | """
275 | 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
276 | """
277 |     pre = '<div class="markdown-body">'
278 |     suf = '</div>'
279 | if text.startswith(pre) and text.endswith(suf):
280 | return text
281 |
282 | if '```' in text:
283 | # careful input
284 | return pre + text + suf
285 | else:
286 | # wtf input
287 | lines = text.split("\n")
288 | for i, line in enumerate(lines):
289 |             lines[i] = lines[i].replace(" ", "&nbsp;")
290 |         text = "</br>".join(lines)
291 | return pre + text + suf
292 |
293 | @lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
294 | def markdown_convertion(txt):
295 | """
296 | 将Markdown格式的文本转换为HTML格式。如果包含数学公式,则先将公式转换为HTML格式。
297 | """
298 |     pre = '<div class="markdown-body">'
299 |     suf = '</div>'
300 | if txt.startswith(pre) and txt.endswith(suf):
301 | # print('警告,输入了已经经过转化的字符串,二次转化可能出问题')
302 | return txt # 已经被转化过,不需要再次转化
303 |
304 | markdown_extension_configs = {
305 | 'mdx_math': {
306 | 'enable_dollar_delimiter': True,
307 | 'use_gitlab_delimiters': False,
308 | },
309 | }
310 |     find_equation_pattern = r'<script type="math/tex(?:.*?)>(.*?)</script>'
311 | 
312 |     def tex2mathml_catch_exception(content, *args, **kwargs):
313 |         try:
314 |             content = tex2mathml(content, *args, **kwargs)
315 |         except:
316 |             content = content
317 |         return content
318 | 
319 |     def replace_math_no_render(match):
320 |         content = match.group(1)
321 |         if 'mode=display' in match.group(0):
322 |             content = content.replace('\n', '</br>')
323 |             return f"<font color=\"#00FF00\">$$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$$</font>"
324 |         else:
325 |             return f"<font color=\"#00FF00\">$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$</font>"
326 | 
327 |     def replace_math_render(match):
328 |         content = match.group(1)
329 |         if 'mode=display' in match.group(0):
330 |             if '\\begin{aligned}' in content:
331 |                 content = content.replace('\\begin{aligned}', '\\begin{array}')
332 |                 content = content.replace('\\end{aligned}', '\\end{array}')
333 |                 content = content.replace('&', ' ')
334 |             content = tex2mathml_catch_exception(content, display="block")
335 |             return content
336 |         else:
337 |             return tex2mathml_catch_exception(content)
338 | 
339 |     def markdown_bug_hunt(content):
340 |         """
341 |         解决一个mdx_math的bug(单$包裹begin命令时多余<script>)
342 |         """
343 |         content = content.replace('<script type="math/tex">\n<script type="math/tex; mode=display">', '<script type="math/tex; mode=display">')
344 |         content = content.replace('</script>\n</script>', '</script>')
345 | return content
346 |
347 | def no_code(txt):
348 | if '```' not in txt:
349 | return True
350 | else:
351 | if '```reference' in txt: return True # newbing
352 | else: return False
353 |
354 | if ('$' in txt) and no_code(txt): # 有$标识的公式符号,且没有代码段```的标识
355 | # convert everything to html format
356 | split = markdown.markdown(text='---')
357 | convert_stage_1 = markdown.markdown(text=txt, extensions=['mdx_math', 'fenced_code', 'tables', 'sane_lists'], extension_configs=markdown_extension_configs)
358 | convert_stage_1 = markdown_bug_hunt(convert_stage_1)
359 | # re.DOTALL: Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline. Corresponds to the inline flag (?s).
360 | # 1. convert to easy-to-copy tex (do not render math)
361 | convert_stage_2_1, n = re.subn(find_equation_pattern, replace_math_no_render, convert_stage_1, flags=re.DOTALL)
362 | # 2. convert to rendered equation
363 | convert_stage_2_2, n = re.subn(find_equation_pattern, replace_math_render, convert_stage_1, flags=re.DOTALL)
364 | # cat them together
365 | return pre + convert_stage_2_1 + f'{split}' + convert_stage_2_2 + suf
366 | else:
367 | return pre + markdown.markdown(txt, extensions=['fenced_code', 'codehilite', 'tables', 'sane_lists']) + suf
368 |
369 |
370 | def close_up_code_segment_during_stream(gpt_reply):
371 | """
372 | 在gpt输出代码的中途(输出了前面的```,但还没输出完后面的```),补上后面的```
373 |
374 | Args:
375 | gpt_reply (str): GPT模型返回的回复字符串。
376 |
377 | Returns:
378 | str: 返回一个新的字符串,将输出代码片段的“后面的```”补上。
379 |
380 | """
381 | if '```' not in gpt_reply:
382 | return gpt_reply
383 | if gpt_reply.endswith('```'):
384 | return gpt_reply
385 |
386 |     # 排除了以上两个情况后,统计```标记的数量:若为奇数,说明代码块正输出到一半
387 | segments = gpt_reply.split('```')
388 | n_mark = len(segments) - 1
389 | if n_mark % 2 == 1:
390 | # print('输出代码片段中!')
391 | return gpt_reply+'\n```'
392 | else:
393 | return gpt_reply
394 |
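一个用法示意(假设本函数已在当前作用域,演示流式输出到一半时补全代码围栏):

```python
streaming_reply = "参考实现如下:\n```python\nprint('hello')"
fixed = close_up_code_segment_during_stream(streaming_reply)
print(fixed)  # 末尾被补上 "\n```",避免半截代码块破坏 Markdown 渲染
```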
395 |
396 | def format_io(self, y):
397 | """
398 | 将输入和输出解析为HTML格式。将y中最后一项的输入部分段落化,并将输出部分的Markdown和数学公式转换为HTML格式。
399 | """
400 | if y is None or y == []:
401 | return []
402 | i_ask, gpt_reply = y[-1]
403 | # 输入部分太自由,预处理一波
404 | if i_ask is not None: i_ask = text_divide_paragraph(i_ask)
405 | # 当代码输出半截的时候,试着补上后个```
406 | if gpt_reply is not None: gpt_reply = close_up_code_segment_during_stream(gpt_reply)
407 | # process
408 | y[-1] = (
409 | None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
410 | None if gpt_reply is None else markdown_convertion(gpt_reply)
411 | )
412 | return y
413 |
414 |
415 | def find_free_port():
416 | """
417 | 返回当前系统中可用的未使用端口。
418 | """
419 | import socket
420 | from contextlib import closing
421 | with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as s:
422 | s.bind(('', 0))
423 | s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
424 | return s.getsockname()[1]
425 |
426 |
427 | def extract_archive(file_path, dest_dir):
428 | import zipfile
429 | import tarfile
430 | import os
431 | # Get the file extension of the input file
432 | file_extension = os.path.splitext(file_path)[1]
433 |
434 | # Extract the archive based on its extension
435 | if file_extension == '.zip':
436 | with zipfile.ZipFile(file_path, 'r') as zipobj:
437 | zipobj.extractall(path=dest_dir)
438 | print("Successfully extracted zip archive to {}".format(dest_dir))
439 |
440 | elif file_extension in ['.tar', '.gz', '.bz2']:
441 | with tarfile.open(file_path, 'r:*') as tarobj:
442 | tarobj.extractall(path=dest_dir)
443 | print("Successfully extracted tar archive to {}".format(dest_dir))
444 |
445 | # 第三方库,需要预先pip install rarfile
446 | # 此外,Windows上还需要安装winrar软件,配置其Path环境变量,如"C:\Program Files\WinRAR"才可以
447 | elif file_extension == '.rar':
448 | try:
449 | import rarfile
450 | with rarfile.RarFile(file_path) as rf:
451 | rf.extractall(path=dest_dir)
452 | print("Successfully extracted rar archive to {}".format(dest_dir))
453 | except:
454 | print("Rar format requires additional dependencies to install")
455 | return '\n\n解压失败! 需要安装pip install rarfile来解压rar文件'
456 |
457 | # 第三方库,需要预先pip install py7zr
458 | elif file_extension == '.7z':
459 | try:
460 | import py7zr
461 | with py7zr.SevenZipFile(file_path, mode='r') as f:
462 | f.extractall(path=dest_dir)
463 | print("Successfully extracted 7z archive to {}".format(dest_dir))
464 | except:
465 | print("7z format requires additional dependencies to install")
466 | return '\n\n解压失败! 需要安装pip install py7zr来解压7z文件'
467 | else:
468 | return ''
469 | return ''
470 |
471 |
472 | def find_recent_files(directory):
473 | """
474 | me: find files that is created with in one minutes under a directory with python, write a function
475 | gpt: here it is!
476 | """
477 | import os
478 | import time
479 | current_time = time.time()
480 | one_minute_ago = current_time - 60
481 | recent_files = []
482 |
483 | for filename in os.listdir(directory):
484 | file_path = os.path.join(directory, filename)
485 | if file_path.endswith('.log'):
486 | continue
487 | created_time = os.path.getmtime(file_path)
488 | if created_time >= one_minute_ago:
489 | if os.path.isdir(file_path):
490 | continue
491 | recent_files.append(file_path)
492 |
493 | return recent_files
494 |
495 | def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
496 | # 将文件复制一份到下载区
497 | import shutil
498 | if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
499 | new_path = os.path.join(get_log_folder(), rename_file)
500 | # 如果已经存在,先删除
501 | if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
502 | # 把文件复制过去
503 | if not os.path.exists(new_path): shutil.copyfile(file, new_path)
504 | # 将文件添加到chatbot cookie中,避免多用户干扰
505 | if chatbot:
506 | if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
507 | else: current = []
508 | chatbot._cookies.update({'file_to_promote': [new_path] + current})
509 |
510 | def disable_auto_promotion(chatbot):
511 | chatbot._cookies.update({'file_to_promote': []})
512 | return
513 |
514 | def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
515 | """
516 | 当文件被上传时的回调函数
517 | """
518 | if len(files) == 0:
519 | return chatbot, txt
520 | import shutil
521 | import os
522 | import time
523 | import glob
524 | from toolbox import extract_archive
525 | try:
526 | shutil.rmtree('./private_upload/')
527 | except:
528 | pass
529 | time_tag = gen_time_str()
530 | os.makedirs(f'private_upload/{time_tag}', exist_ok=True)
531 | err_msg = ''
532 | for file in files:
533 | file_origin_name = os.path.basename(file.orig_name)
534 | shutil.copy(file.name, f'private_upload/{time_tag}/{file_origin_name}')
535 | err_msg += extract_archive(f'private_upload/{time_tag}/{file_origin_name}',
536 | dest_dir=f'private_upload/{time_tag}/{file_origin_name}.extract')
537 | moved_files = [fp for fp in glob.glob('private_upload/**/*', recursive=True)]
538 | if "底部输入区" in checkboxes:
539 | txt = ""
540 | txt2 = f'private_upload/{time_tag}'
541 | else:
542 | txt = f'private_upload/{time_tag}'
543 | txt2 = ""
544 | moved_files_str = '\t\n\n'.join(moved_files)
545 | chatbot.append(['我上传了文件,请查收',
546 | f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
547 | f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
548 | f'\n\n现在您点击任意“红颜色”标识的函数插件时,以上文件将被作为输入参数'+err_msg])
549 | return chatbot, txt, txt2
550 |
551 |
552 | def on_report_generated(cookies, files, chatbot):
553 | from toolbox import find_recent_files
554 | if 'file_to_promote' in cookies:
555 | report_files = cookies['file_to_promote']
556 | cookies.pop('file_to_promote')
557 | else:
558 | report_files = find_recent_files('gpt_log')
559 | if len(report_files) == 0:
560 | return cookies, None, chatbot
561 | # files.extend(report_files)
562 | file_links = ''
563 |     for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
564 | chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
565 | return cookies, report_files, chatbot
566 |
567 | def load_chat_cookies():
568 | API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
569 | if is_any_api_key(AZURE_API_KEY):
570 | if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
571 | else: API_KEY = AZURE_API_KEY
572 | return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
573 |
574 | def is_openai_api_key(key):
575 | CUSTOM_API_KEY_PATTERN, = get_conf('CUSTOM_API_KEY_PATTERN')
576 | if len(CUSTOM_API_KEY_PATTERN) != 0:
577 | API_MATCH_ORIGINAL = re.match(CUSTOM_API_KEY_PATTERN, key)
578 | else:
579 | API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
580 | return bool(API_MATCH_ORIGINAL)
581 |
582 | def is_azure_api_key(key):
583 | API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
584 | return bool(API_MATCH_AZURE)
585 |
586 | def is_api2d_key(key):
587 | API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
588 | return bool(API_MATCH_API2D)
589 |
590 | def is_freeai_api_key(key):#new add
591 | API_MATCH_FREEAI0 = re.match(r"pk-[a-zA-Z0-9-_]{43}$", key)
592 | API_MATCH_FREEAI1 = re.match(r"fk-[a-zA-Z0-9-_]{43}$", key)
593 | return bool(API_MATCH_FREEAI0) or bool(API_MATCH_FREEAI1)
594 |
595 | def is_any_api_key(key):
596 | if ',' in key:
597 | keys = key.split(',')
598 | for k in keys:
599 | if is_any_api_key(k): return True
600 | return False
601 | else:#new add
602 | return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key) or is_freeai_api_key(key)
603 |
604 | def what_keys(keys):
605 | # new add
606 | avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0, "FreeAI Key":0}
607 |
608 | key_list = keys.split(',')
609 |
610 | for k in key_list:
611 | if is_openai_api_key(k):
612 | avail_key_list['OpenAI Key'] += 1
613 |
614 | for k in key_list:
615 | if is_api2d_key(k):
616 | avail_key_list['API2D Key'] += 1
617 |
618 | for k in key_list:
619 | if is_azure_api_key(k):
620 | avail_key_list['Azure Key'] += 1
621 |
622 | for k in key_list: # new add
623 | if is_freeai_api_key(k):
624 | avail_key_list['FreeAI Key'] += 1
625 |
626 | # new add
627 | return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']} 个, FreeAI Key {avail_key_list['FreeAI Key']} 个"
628 |
629 | def select_api_key(keys, llm_model):
630 | import random
631 | avail_key_list = []
632 | key_list = keys.split(',')
633 |
634 | if llm_model.startswith('gpt-'):
635 | for k in key_list:
636 | if is_openai_api_key(k): avail_key_list.append(k)
637 | for k in key_list:# new add
638 | if is_freeai_api_key(k): avail_key_list.append(k)
639 |
640 | if llm_model.startswith('api2d-'):
641 | for k in key_list:
642 | if is_api2d_key(k): avail_key_list.append(k)
643 |
644 | if llm_model.startswith('azure-'):
645 | for k in key_list:
646 | if is_azure_api_key(k): avail_key_list.append(k)
647 |
648 | if len(avail_key_list) == 0:
649 | raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源(右下角更换模型菜单中可切换openai,azure,claude,api2d等请求源)。")
650 |
651 | api_key = random.choice(avail_key_list) # 随机负载均衡
652 | return api_key
653 |
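为便于核对上面各 key 格式的判断逻辑,这里独立复现这几条正则(示意代码,不依赖 get_conf,对应 CUSTOM_API_KEY_PATTERN 留空时的默认行为):

```python
import re

samples = [
    ('sk-' + 'a' * 48, 'OpenAI Key'),
    ('fk-' + 'a' * 43, 'FreeAI Share Token'),
    ('pk-this-is-a-real-free-pool-token-for-everyone', 'FreeAI Pool Token'),
]
for key, kind in samples:
    matched = (bool(re.match(r"sk-[a-zA-Z0-9]{48}$", key)) or
               bool(re.match(r"(pk|fk)-[a-zA-Z0-9-_]{43}$", key)))
    print(f'{kind}: {"匹配" if matched else "不匹配"}')  # 三个样例均应匹配
```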
654 | def read_env_variable(arg, default_value):
655 | """
656 | 环境变量可以是 `GPT_ACADEMIC_CONFIG`(优先),也可以直接是`CONFIG`
657 | 例如在windows cmd中,既可以写:
658 | set USE_PROXY=True
659 | set API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
660 | set proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
661 | set AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
662 | set AUTHENTICATION=[("username", "password"), ("username2", "password2")]
663 | 也可以写:
664 | set GPT_ACADEMIC_USE_PROXY=True
665 | set GPT_ACADEMIC_API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
666 | set GPT_ACADEMIC_proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
667 | set GPT_ACADEMIC_AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
668 | set GPT_ACADEMIC_AUTHENTICATION=[("username", "password"), ("username2", "password2")]
669 | """
670 | from colorful import print亮红, print亮绿
671 | arg_with_prefix = "GPT_ACADEMIC_" + arg
672 | if arg_with_prefix in os.environ:
673 | env_arg = os.environ[arg_with_prefix]
674 | elif arg in os.environ:
675 | env_arg = os.environ[arg]
676 | else:
677 | raise KeyError
678 | print(f"[ENV_VAR] 尝试加载{arg},默认值:{default_value} --> 修正值:{env_arg}")
679 | try:
680 | if isinstance(default_value, bool):
681 | env_arg = env_arg.strip()
682 | if env_arg == 'True': r = True
683 | elif env_arg == 'False': r = False
684 | else: print('enter True or False, but have:', env_arg); r = default_value
685 | elif isinstance(default_value, int):
686 | r = int(env_arg)
687 | elif isinstance(default_value, float):
688 | r = float(env_arg)
689 | elif isinstance(default_value, str):
690 | r = env_arg.strip()
691 | elif isinstance(default_value, dict):
692 | r = eval(env_arg)
693 | elif isinstance(default_value, list):
694 | r = eval(env_arg)
695 | elif default_value is None:
696 | assert arg == "proxies"
697 | r = eval(env_arg)
698 | else:
699 | print亮红(f"[ENV_VAR] 环境变量{arg}不支持通过环境变量设置! ")
700 | raise KeyError
701 | except:
702 | print亮红(f"[ENV_VAR] 环境变量{arg}加载失败! ")
703 | raise KeyError(f"[ENV_VAR] 环境变量{arg}加载失败! ")
704 |
705 | print亮绿(f"[ENV_VAR] 成功读取环境变量{arg}")
706 | return r
707 |
708 | @lru_cache(maxsize=128)
709 | def read_single_conf_with_lru_cache(arg):
710 | from colorful import print亮红, print亮绿, print亮蓝
711 | try:
712 | # 优先级1. 获取环境变量作为配置
713 | default_ref = getattr(importlib.import_module('config'), arg) # 读取默认值作为数据类型转换的参考
714 | r = read_env_variable(arg, default_ref)
715 | except:
716 | try:
717 | # 优先级2. 获取config_private中的配置
718 | r = getattr(importlib.import_module('config_private'), arg)
719 | except:
720 | # 优先级3. 获取config中的配置
721 | r = getattr(importlib.import_module('config'), arg)
722 |
723 | # 在读取API_KEY时,检查一下是不是忘了改config
724 | if arg == 'API_KEY':
725 | print亮蓝(f"[API_KEY] 本项目现已支持OpenAI和Azure的api-key。也支持同时填写多个api-key,如API_KEY=\"openai-key1,openai-key2,azure-key3\"")
726 | print亮蓝(f"[API_KEY] 您既可以在config.py中修改api-key(s),也可以在问题输入区输入临时的api-key(s),然后回车键提交后即可生效。")
727 | if is_any_api_key(r):
728 | print亮绿(f"[API_KEY] 您的 API_KEY 是: {r[:15]}*** API_KEY 导入成功")
729 | else:
730 | print亮红( "[API_KEY] 您的 API_KEY 不满足任何一种已知的密钥格式,请在config文件中修改API密钥之后再运行。")
731 | if arg == 'proxies':
732 | if not read_single_conf_with_lru_cache('USE_PROXY'): r = None # 检查USE_PROXY,防止proxies单独起作用
733 | if r is None:
734 | print亮红('[PROXY] 网络代理状态:未配置。无代理状态下很可能无法访问OpenAI家族的模型。建议:检查USE_PROXY选项是否修改。')
735 | else:
736 | print亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', r)
737 | assert isinstance(r, dict), 'proxies格式错误,请注意proxies选项的格式,不要遗漏括号。'
738 | return r
739 |
740 |
741 | @lru_cache(maxsize=128)
742 | def get_conf(*args):
743 | # 建议您复制一个config_private.py放自己的秘密, 如API和代理网址, 避免不小心传github被别人看到
744 | res = []
745 | for arg in args:
746 | r = read_single_conf_with_lru_cache(arg)
747 | res.append(r)
748 | return res
749 |
750 |
751 | def clear_line_break(txt):
752 | txt = txt.replace('\n', ' ')
753 | txt = txt.replace(' ', ' ')
754 | txt = txt.replace(' ', ' ')
755 | return txt
756 |
757 |
758 | class DummyWith():
759 | """
760 | 这段代码定义了一个名为DummyWith的空上下文管理器,
761 |     它的作用是……额……就是不起作用,即在代码结构不变的情况下取代其他的上下文管理器。
762 | 上下文管理器是一种Python对象,用于与with语句一起使用,
763 | 以确保一些资源在代码块执行期间得到正确的初始化和清理。
764 | 上下文管理器必须实现两个方法,分别为 __enter__()和 __exit__()。
765 | 在上下文执行开始的情况下,__enter__()方法会在代码块被执行前被调用,
766 | 而在上下文执行结束时,__exit__()方法则会被调用。
767 | """
768 | def __enter__(self):
769 | return self
770 |
771 | def __exit__(self, exc_type, exc_value, traceback):
772 | return
773 |
774 | def run_gradio_in_subpath(demo, auth, port, custom_path):
775 | """
776 | 把gradio的运行地址更改到指定的二次路径上
777 | """
778 | def is_path_legal(path: str)->bool:
779 | '''
780 | check path for sub url
781 | path: path to check
782 | return value: do sub url wrap
783 | '''
784 | if path == "/": return True
785 | if len(path) == 0:
786 | print("ilegal custom path: {}\npath must not be empty\ndeploy on root url".format(path))
787 | return False
788 | if path[0] == '/':
789 | if path[1] != '/':
790 | print("deploy on sub-path {}".format(path))
791 | return True
792 | return False
793 | print("ilegal custom path: {}\npath should begin with \'/\'\ndeploy on root url".format(path))
794 | return False
795 |
796 |     if not is_path_legal(custom_path): raise RuntimeError('Illegal custom path')
797 | import uvicorn
798 | import gradio as gr
799 | from fastapi import FastAPI
800 | app = FastAPI()
801 | if custom_path != "/":
802 | @app.get("/")
803 | def read_main():
804 | return {"message": f"Gradio is running at: {custom_path}"}
805 | app = gr.mount_gradio_app(app, demo, path=custom_path)
806 | uvicorn.run(app, host="0.0.0.0", port=port) # , auth=auth
807 |
808 |
809 | def clip_history(inputs, history, tokenizer, max_token_limit):
810 | """
811 | reduce the length of history by clipping.
812 | this function search for the longest entries to clip, little by little,
813 | until the number of token of history is reduced under threshold.
814 | 通过裁剪来缩短历史记录的长度。
815 | 此函数逐渐地搜索最长的条目进行剪辑,
816 | 直到历史记录的标记数量降低到阈值以下。
817 | """
818 | import numpy as np
819 | from request_llm.bridge_all import model_info
820 | def get_token_num(txt):
821 | return len(tokenizer.encode(txt, disallowed_special=()))
822 | input_token_num = get_token_num(inputs)
823 | if input_token_num < max_token_limit * 3 / 4:
824 |         # 当输入部分的token占比小于限制的3/4时,按以下步骤裁剪历史
825 | # 1. 把input的余量留出来
826 | max_token_limit = max_token_limit - input_token_num
827 | # 2. 把输出用的余量留出来
828 | max_token_limit = max_token_limit - 128
829 | # 3. 如果余量太小了,直接清除历史
830 | if max_token_limit < 128:
831 | history = []
832 | return history
833 | else:
834 | # 当输入部分的token占比 > 限制的3/4时,直接清除历史
835 | history = []
836 | return history
837 |
838 | everything = ['']
839 | everything.extend(history)
840 | n_token = get_token_num('\n'.join(everything))
841 | everything_token = [get_token_num(e) for e in everything]
842 |
843 | # 截断时的颗粒度
844 | delta = max(everything_token) // 16
845 |
846 | while n_token > max_token_limit:
847 | where = np.argmax(everything_token)
848 | encoded = tokenizer.encode(everything[where], disallowed_special=())
849 | clipped_encoded = encoded[:len(encoded)-delta]
850 | everything[where] = tokenizer.decode(clipped_encoded)[:-1] # -1 to remove the may-be illegal char
851 | everything_token[where] = get_token_num(everything[where])
852 | n_token = get_token_num('\n'.join(everything))
853 |
854 | history = everything[1:]
855 | return history
856 |
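clip_history 只要求 tokenizer 提供 encode/decode 两个接口。下面是一段用法示意(假设在 gpt_academic 工程内运行——函数内部会 import request_llm——且已安装 tiktoken):

```python
import tiktoken

tokenizer = tiktoken.get_encoding("cl100k_base")  # gpt-3.5 系列所用编码
history = ["问题一", "非常长的回答…… " * 800, "问题二", "简短回答"]
clipped = clip_history(inputs="新的问题", history=history,
                       tokenizer=tokenizer, max_token_limit=4096)
# 最长的条目被逐步截短,直到整个 history 的 token 数落入限制以内
print([len(h) for h in clipped])
```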
857 | """
858 | ========================================================================
859 | 第三部分
860 | 其他小工具:
861 | - zip_folder: 把某个路径下所有文件压缩,然后转移到指定的另一个路径中(gpt写的)
862 | - gen_time_str: 生成时间戳
863 | - ProxyNetworkActivate: 临时地启动代理网络(如果有)
864 | - objdump/objload: 快捷的调试函数
865 | ========================================================================
866 | """
867 |
868 | def zip_folder(source_folder, dest_folder, zip_name):
869 | import zipfile
870 | import os
871 | # Make sure the source folder exists
872 | if not os.path.exists(source_folder):
873 | print(f"{source_folder} does not exist")
874 | return
875 |
876 | # Make sure the destination folder exists
877 | if not os.path.exists(dest_folder):
878 | print(f"{dest_folder} does not exist")
879 | return
880 |
881 | # Create the name for the zip file
882 | zip_file = os.path.join(dest_folder, zip_name)
883 |
884 | # Create a ZipFile object
885 | with zipfile.ZipFile(zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
886 | # Walk through the source folder and add files to the zip file
887 | for foldername, subfolders, filenames in os.walk(source_folder):
888 | for filename in filenames:
889 | filepath = os.path.join(foldername, filename)
890 | zipf.write(filepath, arcname=os.path.relpath(filepath, source_folder))
891 |
892 | # Move the zip file to the destination folder (if it wasn't already there)
893 | if os.path.dirname(zip_file) != dest_folder:
894 | os.rename(zip_file, os.path.join(dest_folder, os.path.basename(zip_file)))
895 | zip_file = os.path.join(dest_folder, os.path.basename(zip_file))
896 |
897 | print(f"Zip file created at {zip_file}")
898 |
899 | def zip_result(folder):
900 | t = gen_time_str()
901 | zip_folder(folder, './gpt_log/', f'{t}-result.zip')
902 | return pj('./gpt_log/', f'{t}-result.zip')
903 |
904 | def gen_time_str():
905 | import time
906 | return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
907 |
908 | def get_log_folder(user='default', plugin_name='shared'):
909 | _dir = os.path.join(os.path.dirname(__file__), 'gpt_log', user, plugin_name)
910 | if not os.path.exists(_dir): os.makedirs(_dir)
911 | return _dir
912 |
913 | class ProxyNetworkActivate():
914 | """
915 |     这段代码定义了一个名为ProxyNetworkActivate的上下文管理器, 用于给一小段代码临时加上网络代理
916 | """
917 | def __enter__(self):
918 | from toolbox import get_conf
919 | proxies, = get_conf('proxies')
920 | if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
921 | if proxies is not None:
922 | if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
923 | if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
924 | return self
925 |
926 | def __exit__(self, exc_type, exc_value, traceback):
927 | os.environ['no_proxy'] = '*'
928 | if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
929 | if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
930 | return
931 |
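一个用法示意(假设 config 中 USE_PROXY / proxies 已正确配置):

```python
import requests

with ProxyNetworkActivate():   # 进入时临时写入 HTTP_PROXY / HTTPS_PROXY 环境变量
    r = requests.get('https://example.com', timeout=10)
print(r.status_code)           # 退出 with 后代理环境变量已被清理
```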
932 | def objdump(obj, file='objdump.tmp'):
933 | import pickle
934 | with open(file, 'wb+') as f:
935 | pickle.dump(obj, f)
936 | return
937 |
938 | def objload(file='objdump.tmp'):
939 | import pickle, os
940 | if not os.path.exists(file):
941 | return
942 | with open(file, 'rb') as f:
943 | return pickle.load(f)
944 |
945 | def Singleton(cls):
946 | """
947 | 一个单实例装饰器
948 | """
949 | _instance = {}
950 |
951 | def _singleton(*args, **kargs):
952 | if cls not in _instance:
953 | _instance[cls] = cls(*args, **kargs)
954 | return _instance[cls]
955 |
956 | return _singleton
957 |
958 | """
959 | ========================================================================
960 | 第四部分
961 | 接驳虚空终端:
962 | - set_conf: 在运行过程中动态地修改配置
963 | - set_multi_conf: 在运行过程中动态地修改多个配置
964 | - get_plugin_handle: 获取插件的句柄
965 | - get_plugin_default_kwargs: 获取插件的默认参数
966 | - get_chat_handle: 获取简单聊天的句柄
967 | - get_chat_default_kwargs: 获取简单聊天的默认参数
968 | ========================================================================
969 | """
970 |
971 | def set_conf(key, value):
972 | from toolbox import read_single_conf_with_lru_cache, get_conf
973 | read_single_conf_with_lru_cache.cache_clear()
974 | get_conf.cache_clear()
975 | os.environ[key] = str(value)
976 | altered, = get_conf(key)
977 | return altered
978 |
979 | def set_multi_conf(dic):
980 | for k, v in dic.items(): set_conf(k, v)
981 | return
982 |
983 | def get_plugin_handle(plugin_name):
984 | """
985 | e.g. plugin_name = 'crazy_functions.批量Markdown翻译->Markdown翻译指定语言'
986 | """
987 | import importlib
988 | assert '->' in plugin_name, \
989 | "Example of plugin_name: crazy_functions.批量Markdown翻译->Markdown翻译指定语言"
990 | module, fn_name = plugin_name.split('->')
991 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
992 | return f_hot_reload
993 |
994 | def get_chat_handle():
995 |     """获取简单聊天的句柄。
996 |     """
997 | from request_llm.bridge_all import predict_no_ui_long_connection
998 | return predict_no_ui_long_connection
999 |
1000 | def get_plugin_default_kwargs():
1001 |     """获取插件的默认参数。
1002 |     """
1003 | from toolbox import get_conf, ChatBotWithCookies
1004 |
1005 | WEB_PORT, LLM_MODEL, API_KEY = \
1006 | get_conf('WEB_PORT', 'LLM_MODEL', 'API_KEY')
1007 |
1008 | llm_kwargs = {
1009 | 'api_key': API_KEY,
1010 | 'llm_model': LLM_MODEL,
1011 | 'top_p':1.0,
1012 | 'max_length': None,
1013 | 'temperature':1.0,
1014 | }
1015 | chatbot = ChatBotWithCookies(llm_kwargs)
1016 |
1017 | # txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port
1018 | default_plugin_kwargs = {
1019 | "main_input": "./README.md",
1020 | "llm_kwargs": llm_kwargs,
1021 | "plugin_kwargs": {},
1022 | "chatbot_with_cookie": chatbot,
1023 | "history": [],
1024 | "system_prompt": "You are a good AI.",
1025 | "web_port": WEB_PORT
1026 | }
1027 | return default_plugin_kwargs
1028 |
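配合 get_plugin_handle,上面这组默认参数按注释所示的位置顺序传给插件即可。一个示意(假设在 gpt_academic 工程内运行;插件是生成器,靠 yield 刷新界面状态):

```python
plugin = get_plugin_handle('crazy_functions.批量Markdown翻译->Markdown翻译指定语言')
kw = get_plugin_default_kwargs()
gen = plugin(kw["main_input"], kw["llm_kwargs"], kw["plugin_kwargs"],
             kw["chatbot_with_cookie"], kw["history"], kw["system_prompt"], kw["web_port"])
for cookies, chatbot, history, msg in gen:  # 与 update_ui 的 yield 四元组对应
    print(msg)
```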
1029 | def get_chat_default_kwargs():
1030 |     """获取简单聊天的默认参数。
1031 |     """
1032 | from toolbox import get_conf
1033 |
1034 | LLM_MODEL, API_KEY = get_conf('LLM_MODEL', 'API_KEY')
1035 |
1036 | llm_kwargs = {
1037 | 'api_key': API_KEY,
1038 | 'llm_model': LLM_MODEL,
1039 | 'top_p':1.0,
1040 | 'max_length': None,
1041 | 'temperature':1.0,
1042 | }
1043 |
1044 | default_chat_kwargs = {
1045 | "inputs": "Hello there, are you ready?",
1046 | "llm_kwargs": llm_kwargs,
1047 | "history": [],
1048 | "sys_prompt": "You are AI assistant",
1049 | "observe_window": None,
1050 | "console_slience": False,
1051 | }
1052 |
1053 | return default_chat_kwargs
1054 |
1055 |
--------------------------------------------------------------------------------
/old/README_old.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | # FreeAI
4 |
5 | **OpenAI should not be a closed AI.**
6 |
7 | 你是否还在为OpenAI需要科学上网而犯愁?
8 |
9 | 你是否还在为OpenAI的付费模式而望而却步?
10 |
11 | 你是否苦恼没有免费的API Key来开发自己的ChatGPT工具?
12 |
13 | 本项目综述Github众优秀开发者的努力,给出一个比较完美的解决方案,并持续向更好用、更强大、更便宜的开放AI方向努力。**如果你喜欢本项目,请给一个免费的star,谢谢!**
14 |
15 | `写在前面(因为issues里很多人没看到这句话而遇到缺少API key的报错):`
16 | + ***由于gpt_academic设定用户参数配置的读取优先级: 环境变量 > config_private.py > config.py,所以调试中,最好config.py文件也做对应的修改(即改得和config_private.py一模一样)。不然,用户的配置可能在某些调试情况下不生效,这可能是gpt_academic的bug。***
17 |
18 | **鸣谢:**
19 | + [pengzhile/pandora](https://github.com/pengzhile/pandora):让OpenAI GPT-3.5的API免费和免科学上网的关键技术。
20 | + [acheong08/OpenAIAuth](https://github.com/acheong08/OpenAIAuth):免科学上网获取自己OpenAI账户的Cookie。
21 | + [binary-husky/gpt_academic](https://github.com/binary-husky/gpt_academic), 以它为例,解决它需翻墙和需要付费的OpenAI API key的问题,演示OpenAI变为FreeAI。
22 |
23 | ## Pandora
24 | 旨在打造免科学上网情况下,最原汁原味的ChatGPT。基于access token的[技术原理](https://zhile.io/2023/05/19/how-to-get-chatgpt-access-token-via-pkce.html)实现的。目前有官方的体验网站[https://chat.zhile.io](https://chat.zhile.io),需要使用OpenAI的账户密码,所有对话记录与在官网的一致;也有基于Pandora技术的共享[Shared Chat](https://baipiao.io/chatgpt)的资源池,无需账号密码也能体验。
25 |
26 | Pandora项目最难能可贵的是提供了可将用户的Cookie转化为形式如同API key的Access Token和响应这个Access Token的反代接口(也可响应OpenAI原生的API key)的服务,此举无疑是对基于OpenAI的自由开发者最大的福音。详情请见:[“这个服务旨在模拟 Turbo API,免费且使用的是ChatGPT的8k模型”](https://github.com/pengzhile/pandora/issues/837)。
27 | + Cookie转 `fk-`开头、43位的 Share Token 演示地址:[https://ai.fakeopen.com/token](https://ai.fakeopen.com/token)(程序化注册的示意见本节末尾);
28 | + Cookie转 `pk-`开头、43位的 Pool Token 演示地址:[https://ai.fakeopen.com/pool](https://ai.fakeopen.com/pool)。解决多账号并发的问题;
29 | + 响应上述 Access Token 的反代接口是:[https://ai.fakeopen.com/v1/chat/completions](https://ai.fakeopen.com/v1/chat/completions)。
30 |
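如果想程序化地完成“Cookie → Share Token”的转换,也可以直接调用上述演示页面背后的注册接口,做法与本仓库 get_freeai_api.py 中一致(以下为示意代码):

```python
import requests

data = {
    'unique_name': 'my share token',       # 自定义标识
    'access_token': '<你的Access Token>',   # 即上文获取到的用户Cookie
    'expires_in': 0,                       # 与 get_freeai_api.py 的默认值一致
}
resp = requests.post('https://ai.fakeopen.com/token/register', data=data, timeout=30)
if resp.status_code == 200:
    print('Share Token:', resp.json()['token_key'])  # fk- 开头、43位
```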
31 | Pandora项目还提供了两个免费的Pool Token:
32 | + `pk-this-is-a-real-free-pool-token-for-everyone` 很多 Share Token 组成的池子。
33 | + `pk-this-is-a-real-free-api-key-pk-for-everyone` 一些120刀 Api Key组成的池子。`(我测试的时候已经没钱了,衰。)`
34 |
35 | 经使用自己的账号生成的Share Token和Pool Token进行测试,使用Access Token进行的对话记录,不会出现在该账户的记录中。所以我自己使用的也是Pandora提供的Pool Token,毕竟自己的池子不够大,而且自己的用户cookie的生命周期只有14天,时常更新Access Token也很烦。
36 |
37 | 本人十分中意ChatGPT的翻译效果,所以编写了一个基于Pandora的简易翻译服务网页,即文件[Translate.html](https://github.com/elphen-wang/FreeAI/blob/main/Translate.html),测试效果尚可,其思路大致等价于下面的示意代码。
38 |
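一段等价的 Python 示意:把待译文本按标准的 chat/completions 协议发给上述反代接口(此处假设使用 Pandora 的公共 Pool Token 与 gpt-3.5-turbo):

```python
import requests

def translate(text, target_lang='中文'):
    resp = requests.post(
        'https://ai.fakeopen.com/v1/chat/completions',
        headers={'Authorization': 'Bearer pk-this-is-a-real-free-pool-token-for-everyone'},
        json={'model': 'gpt-3.5-turbo',
              'messages': [{'role': 'user',
                            'content': f'请把下面的内容翻译成{target_lang}:\n{text}'}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()['choices'][0]['message']['content']

print(translate('OpenAI should not be a closed AI.'))
```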
39 | ## OpenAIAuth
40 | 现在,Pandora的讨论帖中就有人提出希望Access Token对话也能保留问询记录的需求。如果你对使用Pandora提供的Pool Token还有隐私和安全上的顾虑,也可以同时使用[OpenAIAuth](https://github.com/acheong08/OpenAIAuth)和`pandora-chatgpt`的python函数包来生成并定时更新专属自己的Access Token。
41 |
42 | Pandora项目其实也独立提供了[这种服务](https://gist.github.com/pengzhile/448bfcfd548b3ae4e665a84cc86c4694)。但是我实操后,还是觉得结合OpenAIAuth更好用一些,并把修改后的代码放进[get_freeai_api.py](https://github.com/elphen-wang/FreeAI/blob/main/get_freeai_api.py)文件,生成的`share_tokens.txt`中Pool Token(如果有两个及以上的账户密码的话)和Share Token是并存的。
43 |
44 | ## gpt_academic
45 | 本人之前搭建专属自己的OpenAI API反向代理的教程[ChatGPT Wallfree](https://github.com/elphen-wang/chatgpt_wallfree)只实现了gpt_academic免科学上网功能,但仍需使用OpenAI原生的API key。这里还是以它为例,本次开发者无需自己搭建反向代理服务,也无需OpenAI原生的API key,可以为一般的科研组省下一笔不易报销的经费支出。
46 |
47 | 开发者可使用本项目中[gpt_academic](https://github.com/elphen-wang/FreeAI/tree/main/gpt_academic)文件夹中的文件替代官方的文件(`主要是修改toolbox.py和config_private.py对access token的识别和获取`),也可在此基础上加入自己的设定(如gpt_academic账户密码等)。如此之后,按照官方的调试运行和部署指引,gpt_academic就可以不用科学上网又能免费使用gpt-3.5啦!
48 |
49 | 在使用自己账户的access token的场景中,需要用户自行设定定时执行get_freeai_api.py的功能,如每天凌晨四点执行一次,以克服OpenAI cookie只有14天生命周期带来的频繁手动更新access token的问题。
50 |
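定时执行既可以用系统的 crontab,也可以复用依赖中已有的 apscheduler。一个“每天凌晨四点刷新一次”的 Python 示意(假设与 get_freeai_api.py 同目录运行):

```python
from apscheduler.schedulers.blocking import BlockingScheduler
from get_freeai_api import run   # 重新生成 share_tokens.txt

scheduler = BlockingScheduler()
scheduler.add_job(run, trigger='cron', hour=4, minute=0)  # 每天 04:00 刷新
scheduler.start()  # 阻塞运行;常驻后台可改用 BackgroundScheduler
```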
51 | `tips`:
52 | + 要使用gpt_academic arxiv翻译功能,在docker模式下,需要进行以下编译:
53 | ``` bash {.line-numbers}
54 | #编译 docker 镜像
55 | docker build -t gpt-academic-nolocal-latex -f docs/GithubAction+NoLocal+Latex .
56 | #端口可以自由更换,保持和config.py和config_private.py中设置的一样即可
57 | docker run -d -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --net=host -p 86:86 --restart=always --name gpt-academic gpt-academic-nolocal-latex
58 | ```
59 |
60 | + ***由于gpt_academic设定用户参数配置的读取优先级: 环境变量 > config_private.py > config.py,所以调试中,最好config.py文件也做对应的修改(即改为一样)。不然,用户的配置可能在某些调试情况下不生效,这可能是gpt_academic的bug。***
61 |
62 | ## 后记
63 | + 因为Pandora目前本质上是将OpenAI原生的网页服务还原出来,所以目前还不能免费使用诸如GPT-4等付费服务。不过,这将是本人和一众致力于使AI技术服务更广大群众的开发者今后努力的方向。
64 | + 之前ChatGPT Wallfree教程中提及的ZeroTier内网穿透技术,实测不如[Frp](https://github.com/fatedier/frp)适合中国科研宝宝的体质:Frp更稳定、速度更快,且第三方访问无需安装客户端。
65 |
66 |
67 | ## To-do List
68 | + [ ] 完善gpt_academic的arxiv翻译功能,因为我是一个科研民工...
69 | + [ ] 集成new bing的gpt4服务...
70 |
71 | ## Star历史
72 |
73 | 
74 |
75 |
76 |
--------------------------------------------------------------------------------
/old/gpt_academic_old/config_private.py:
--------------------------------------------------------------------------------
1 | """
2 | 以下所有配置也都支持利用环境变量覆写,环境变量配置格式见docker-compose.yml。
3 | 读取优先级:环境变量 > config_private.py > config.py
4 | --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
5 | All the following configurations also support using environment variables to override,
6 | and the environment variable configuration format can be seen in docker-compose.yml.
7 | Configuration reading priority: environment variable > config_private.py > config.py
8 | """
9 |
10 | # [step 1]>> API_KEY = "sk-123456789xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123456789"。极少数情况下,还需要填写组织(格式如org-123456789abcdefghijklmno的),请向下翻,找 API_ORG 设置项
11 | API_KEY = "pk-this-is-a-real-free-pool-token-for-everyone" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
12 |
13 | ''' #和上面的API_KEY赋值可以二选一,这段代码是配合用自己的OpenAI账户密码来设定的。
14 | from os import path
15 | current_dir = path.dirname(path.abspath(__file__))
16 | share_tokens_file = path.join(current_dir, 'share_tokens.txt')
17 | with open(share_tokens_file, 'r', encoding='utf-8') as f:
18 | free_apis= f.read().split('\n')
19 | API_KEY=','.join(filter(None,free_apis))
20 | '''
21 |
22 |
23 | # [step 2]>> 改为True应用代理,如果直接在海外服务器部署,此处不修改
24 | USE_PROXY = False
25 | if USE_PROXY:
26 | """
27 | 填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
28 | <配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
29 | [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
30 | [地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了(localhost意思是代理软件安装在本机上)
31 | [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
32 | """
33 | # 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
34 | proxies = {
35 | # [协议]:// [地址] :[端口]
36 | "http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
37 | "https": "socks5h://localhost:11284", # 再例如 "https": "http://127.0.0.1:7890",
38 | }
39 | else:
40 | proxies = None
41 |
42 | # ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
43 |
44 | # URL重定向,实现更换API_URL的作用(常规情况下,不要修改!! 高危设置!通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人!)
45 | # 格式 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
46 | # 例如 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://reverse-proxy-url/v1/chat/completions"}
47 | API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://ai.fakeopen.com/v1/chat/completions"}
48 |
49 |
50 | # 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
51 | # 一言以蔽之:免费(5刀)用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询:https://platform.openai.com/docs/guides/rate-limits/overview
52 | DEFAULT_WORKER_NUM = 3
53 |
54 |
55 | # 对话窗的高度
56 | CHATBOT_HEIGHT = 1115
57 |
58 |
59 | # 代码高亮
60 | CODE_HIGHLIGHT = True
61 |
62 |
63 | # 窗口布局
64 | LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
65 | DARK_MODE = True # 暗色模式 / 亮色模式
66 |
67 |
68 | # 发送请求到OpenAI后,等待多久判定为超时
69 | TIMEOUT_SECONDS = 30
70 |
71 |
72 | # 网页的端口, -1代表随机端口
73 | WEB_PORT = -1
74 |
75 |
76 | # 如果OpenAI不响应(网络卡顿、代理失败、KEY失效),重试的次数限制
77 | MAX_RETRY = 2
78 |
79 |
80 | # 模型选择 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
81 | LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
82 | AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
83 | # P.S. 其他可用的模型还包括 ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
84 |
85 |
86 | # 本地LLM模型如ChatGLM的执行方式 CPU/GPU
87 | LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
88 |
89 |
90 | # 设置gradio的并行线程数(不需要修改)
91 | CONCURRENT_COUNT = 100
92 |
93 |
94 | # 是否在提交时自动清空输入框
95 | AUTO_CLEAR_TXT = False
96 |
97 |
98 | # 加一个live2d装饰
99 | ADD_WAIFU = False
100 |
101 |
102 | # 设置用户名和密码(不需要修改)(相关功能不稳定,与gradio版本和网络都相关,如果本地使用不建议加这个)
103 | # [("username", "password"), ("username2", "password2"), ...]
104 | AUTHENTICATION = []
105 |
106 |
107 | # 如果需要在二级路径下运行(常规情况下,不要修改!!)(需要配合修改main.py才能生效!)
108 | CUSTOM_PATH = "/"
109 |
110 |
111 | # 极少数情况下,openai的官方KEY需要伴随组织编码(格式如org-xxxxxxxxxxxxxxxxxxxxxxxx)使用
112 | API_ORG = ""
113 |
114 |
115 | # 如果需要使用Slack Claude,使用教程详情见 request_llm/README.md
116 | SLACK_CLAUDE_BOT_ID = ''
117 | SLACK_CLAUDE_USER_TOKEN = ''
118 |
119 |
120 | # 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
121 | AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
122 | AZURE_API_KEY = "填入azure openai api的密钥" # 建议直接在API_KEY处填写,该选项即将被弃用
123 | AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
124 |
125 |
126 | # 使用Newbing
127 | NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
128 | NEWBING_COOKIES = """
129 | put your new bing cookies here
130 | """
131 |
--------------------------------------------------------------------------------
/old/gpt_academic_old/credentials.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/elphen-wang/FreeAI/25cd8bfff04607fc73a464bfcff948d102b65c4f/old/gpt_academic_old/credentials.txt
--------------------------------------------------------------------------------
/old/gpt_academic_old/get_freeai_api.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 |
3 | from os import path
4 | import requests
5 | from OpenAIAuth import Auth0
6 |
7 | def run():
8 | expires_in = 0
9 | unique_name = 'my share token'
10 | current_dir = path.dirname(path.abspath(__file__))
11 | credentials_file = path.join(current_dir, 'credentials.txt')
12 | share_tokens_file = path.join(current_dir, 'share_tokens.txt')
13 | with open(credentials_file, 'r', encoding='utf-8') as f:
14 | credentials = f.read().split('\n')
15 | credentials = [credential.split(',', 1) for credential in credentials]
16 | count = 0
17 | token_keys = []
18 | for credential in credentials:
19 | progress = '{}/{}'.format(credentials.index(credential) + 1, len(credentials))
20 | if not credential or len(credential) != 2:
21 | continue
22 |
23 | count += 1
24 | username, password = credential[0].strip(), credential[1].strip()
25 | token_info = {
26 | 'token': 'None',
27 | 'share_token': 'None',
28 | }
29 | token_keys.append(token_info)
30 | try:
31 | auth = Auth0(email=username, password=password)
32 | token_info['token'] = auth.get_access_token()
33 | #print('Login success: {}, {}'.format(username, progress))
34 | except Exception as e:
35 | err_str = str(e).replace('\n', '').replace('\r', '').strip()
36 | #print('Login failed: {}, {}'.format(username, err_str))
37 | token_info['token'] = err_str
38 | continue
39 | data = {
40 | 'unique_name': unique_name,
41 | 'access_token': token_info['token'],
42 | 'expires_in': expires_in,
43 | }
44 | resp = requests.post('https://ai.fakeopen.com/token/register', data=data)
45 | if resp.status_code == 200:
46 | token_info['share_token'] = resp.json()['token_key']
47 | else:
48 | continue
49 |
50 | with open(share_tokens_file, 'w', encoding='utf-8') as f:
51 |         # 如果账号大于一个,优先使用pool;只有一个时,使用单独的api;一个都没有,则用公共pool。
52 | if count==0:
53 | f.write('pk-this-is-a-real-free-pool-token-for-everyone\n')
54 | f.write('pk-this-is-a-real-free-api-key-pk-for-everyone\n')
55 | elif count==1:
56 | f.write('{}\n'.format(token_keys[0]['share_token']))
57 | else:
58 | data = {
59 | 'share_tokens': '\n'.join([token_info['share_token'] for token_info in token_keys]),
60 | }
61 | resp = requests.post('https://ai.fakeopen.com/pool/update', data=data)
62 | if resp.status_code == 200:
63 | f.write('{}\n'.format(resp.json()['pool_token']))
64 | for token_info in token_keys:
65 | f.write('{}\n'.format(token_info['share_token']))
66 | f.close()
67 |
68 | if __name__ == '__main__':
69 | run()
70 |
71 |
--------------------------------------------------------------------------------
/old/gpt_academic_old/toolbox.py:
--------------------------------------------------------------------------------
1 | import markdown
2 | import importlib
3 | import time
4 | import inspect
5 | import re
6 | import os
7 | import gradio
8 | from latex2mathml.converter import convert as tex2mathml
9 | from functools import wraps, lru_cache
10 | pj = os.path.join
11 |
12 | """
13 | ========================================================================
14 | 第一部分
15 | 函数插件输入输出接驳区
16 | - ChatBotWithCookies: 带Cookies的Chatbot类,为实现更多强大的功能做基础
17 | - ArgsGeneralWrapper: 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构
18 | - update_ui: 刷新界面用 yield from update_ui(chatbot, history)
19 | - CatchException: 将插件中出的所有问题显示在界面上
20 | - HotReload: 实现插件的热更新
21 | - trimmed_format_exc: 打印traceback,为了安全而隐藏绝对地址
22 | ========================================================================
23 | """
24 |
25 | class ChatBotWithCookies(list):
26 | def __init__(self, cookie):
27 | self._cookies = cookie
28 |
29 | def write_list(self, list):
30 | for t in list:
31 | self.append(t)
32 |
33 | def get_list(self):
34 | return [t for t in self]
35 |
36 | def get_cookies(self):
37 | return self._cookies
38 |
39 |
40 | def ArgsGeneralWrapper(f):
41 | """
42 | 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构。
43 | """
44 | def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
45 | txt_passon = txt
46 | if txt == "" and txt2 != "": txt_passon = txt2
47 | # 引入一个有cookie的chatbot
48 | cookies.update({
49 | 'top_p':top_p,
50 | 'temperature':temperature,
51 | })
52 | llm_kwargs = {
53 | 'api_key': cookies['api_key'],
54 | 'llm_model': llm_model,
55 | 'top_p':top_p,
56 | 'max_length': max_length,
57 | 'temperature':temperature,
58 | 'client_ip': request.client.host,
59 | }
60 | plugin_kwargs = {
61 | "advanced_arg": plugin_advanced_arg,
62 | }
63 | chatbot_with_cookie = ChatBotWithCookies(cookies)
64 | chatbot_with_cookie.write_list(chatbot)
65 | if cookies.get('lock_plugin', None) is None:
66 | # 正常状态
67 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
68 | else:
69 | # 处理个别特殊插件的锁定状态
70 | module, fn_name = cookies['lock_plugin'].split('->')
71 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
72 | yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
73 | return decorated
74 |
75 |
76 | def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
77 | """
78 | 刷新用户界面
79 | """
80 | assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
81 | cookies = chatbot.get_cookies()
82 |
83 | # 解决插件锁定时的界面显示问题
84 | if cookies.get('lock_plugin', None):
85 | label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
86 | chatbot_gr = gradio.update(value=chatbot, label=label)
87 | if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
88 | elif cookies.get('label', None):
89 | chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
90 | cookies['label'] = None # 清空label
91 | else:
92 | chatbot_gr = chatbot
93 |
94 | yield cookies, chatbot_gr, history, msg
95 |
96 | def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
97 | """
98 | 刷新用户界面
99 | """
100 | if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
101 | chatbot[-1] = list(chatbot[-1])
102 | chatbot[-1][-1] = lastmsg
103 | yield from update_ui(chatbot=chatbot, history=history)
104 | time.sleep(delay)
105 |
106 |
107 | def trimmed_format_exc():
108 | import os, traceback
109 | str = traceback.format_exc()
110 | current_path = os.getcwd()
111 | replace_path = "."
112 | return str.replace(current_path, replace_path)
113 |
114 | def CatchException(f):
115 | """
116 | 装饰器函数,捕捉函数f中的异常并封装到一个生成器中返回,并显示到聊天当中。
117 | """
118 |
119 | @wraps(f)
120 | def decorated(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT=-1):
121 | try:
122 | yield from f(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT)
123 | except Exception as e:
124 | from check_proxy import check_proxy
125 | from toolbox import get_conf
126 | proxies, = get_conf('proxies')
127 | tb_str = '```\n' + trimmed_format_exc() + '```'
128 | if len(chatbot) == 0:
129 | chatbot.clear()
130 | chatbot.append(["插件调度异常", "异常原因"])
131 | chatbot[-1] = (chatbot[-1][0],
132 | f"[Local Message] 实验性函数调用出错: \n\n{tb_str} \n\n当前代理可用性: \n\n{check_proxy(proxies)}")
133 |             yield from update_ui(chatbot=chatbot, history=history, msg=f'异常 {e}')  # refresh the UI
134 | return decorated
135 |
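# --- Editorial sketch (not part of the original toolbox.py) ---
# A minimal plugin wrapped with CatchException: if the body raises, the
# decorator appends the traceback to the chatbot instead of crashing the UI.
@CatchException
def _demo_failing_plugin(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT=-1):
    chatbot.append([txt, "about to fail..."])
    yield from update_ui(chatbot=chatbot, history=history)
    raise RuntimeError("simulated plugin failure")  # ends up in the chat window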
136 |
137 | def HotReload(f):
138 | """
139 |     Decorator that hot-reloads a Python function plugin.
140 |     Hot reloading means updating a function's code while the program keeps running, so changes take effect immediately.
141 |     Inside the decorator, wraps(f) preserves the function's metadata, and an inner function named decorated is defined.
142 |     The inner function reloads the module that defines f via importlib.reload and inspect.getmodule,
143 |     looks the function up again by name with getattr,
144 |     and delegates to the freshly loaded version with yield from.
145 |     The decorator returns this inner function, which always runs the latest definition of f.
146 | """
147 | @wraps(f)
148 | def decorated(*args, **kwargs):
149 | fn_name = f.__name__
150 | f_hot_reload = getattr(importlib.reload(inspect.getmodule(f)), fn_name)
151 | yield from f_hot_reload(*args, **kwargs)
152 | return decorated
153 |
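# --- Editorial sketch (not part of the original toolbox.py) ---
# What HotReload buys you: the module defining the plugin is re-imported on
# every call, so edits to its source take effect without a restart.
# `my_plugin`/`my_func` are hypothetical names.
def _demo_hot_reload():
    from my_plugin import my_func          # hypothetical generator plugin
    hot_func = HotReload(my_func)          # wrap once, at registration time
    for _ in hot_func("some input"):       # each call reloads my_plugin first
        pass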
154 |
155 | """
156 | ========================================================================
157 | Part 2
158 | Miscellaneous utilities:
159 | - write_results_to_file: write the results to a markdown file
160 | - regular_txt_to_markdown: convert plain text to Markdown-formatted text
161 | - report_execption: append a brief error report to the chatbot
162 | - text_divide_paragraph: split text on paragraph separators and generate HTML with paragraph tags
163 | - markdown_convertion: convert markdown into presentable html by combining several techniques
164 | - format_io: take over gradio's default markdown handling
165 | - on_file_uploaded: handle file uploads (auto-extract archives)
166 | - on_report_generated: push generated reports to the file-upload area
167 | - clip_history: automatically truncate the history when the context grows too long
168 | - get_conf: read a config option
169 | - select_api_key: pick a usable api-key for the current model class
170 | ========================================================================
171 | """
172 |
173 | def get_reduce_token_percent(text):
174 | """
175 |     * This function will be deprecated in the future.
176 | """
177 | try:
178 | # text = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
179 | pattern = r"(\d+)\s+tokens\b"
180 | match = re.findall(pattern, text)
181 |         EXCEED_ALLO = 500  # leave some headroom, otherwise the reply may fail when too little budget remains
182 | max_limit = float(match[0]) - EXCEED_ALLO
183 | current_tokens = float(match[1])
184 | ratio = max_limit/current_tokens
185 | assert ratio > 0 and ratio < 1
186 | return ratio, str(int(current_tokens-max_limit))
187 | except:
188 | return 0.5, '不详'
189 |
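# --- Editorial sketch (not part of the original toolbox.py) ---
# The regex above pulls both token counts out of an OpenAI overflow error;
# the returned ratio says how far the history must shrink.
def _demo_get_reduce_token_percent():
    msg = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
    ratio, exceeded = get_reduce_token_percent(msg)
    # ratio == (4097 - 500) / 4870 ≈ 0.74, exceeded == '1273'
    print(ratio, exceeded)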
190 |
191 | def write_results_to_file(history, file_name=None):
192 | """
193 |     Write the conversation history to a file in Markdown format. If no file name is given, generate one from the current time.
194 | """
195 | import os
196 | import time
197 | if file_name is None:
198 | # file_name = time.strftime("chatGPT分析报告%Y-%m-%d-%H-%M-%S", time.localtime()) + '.md'
199 | file_name = 'chatGPT分析报告' + \
200 | time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + '.md'
201 | os.makedirs('./gpt_log/', exist_ok=True)
202 | with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
203 | f.write('# chatGPT 分析报告\n')
204 | for i, content in enumerate(history):
205 | try:
206 | if type(content) != str: content = str(content)
207 | except:
208 | continue
209 | if i % 2 == 0:
210 | f.write('## ')
211 | try:
212 | f.write(content)
213 | except:
214 | # remove everything that cannot be handled by utf8
215 | f.write(content.encode('utf-8', 'ignore').decode())
216 | f.write('\n\n')
217 | res = '以上材料已经被写入' + os.path.abspath(f'./gpt_log/{file_name}')
218 | print(res)
219 | return res
220 |
221 |
222 | def regular_txt_to_markdown(text):
223 | """
224 |     Convert plain text to Markdown-formatted text.
225 | """
226 | text = text.replace('\n', '\n\n')
227 | text = text.replace('\n\n\n', '\n\n')
228 | text = text.replace('\n\n\n', '\n\n')
229 | return text
230 |
231 |
232 |
233 |
234 | def report_execption(chatbot, history, a, b):
235 | """
236 |     Append error information to the chatbot.
237 | """
238 | chatbot.append((a, b))
239 | history.append(a)
240 | history.append(b)
241 |
242 |
243 | def text_divide_paragraph(text):
244 | """
245 |     Split the text on paragraph separators and generate HTML code with paragraph tags.
246 |     """
247 |     pre = '<div class="markdown-body">'
248 |     suf = '</div>'
249 | if text.startswith(pre) and text.endswith(suf):
250 | return text
251 |
252 | if '```' in text:
253 | # careful input
254 | return pre + text + suf
255 | else:
256 | # wtf input
257 | lines = text.split("\n")
258 | for i, line in enumerate(lines):
259 |             lines[i] = lines[i].replace(" ", "&nbsp;")
260 |         text = "</br>".join(lines)
261 | return pre + text + suf
262 |
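# --- Editorial sketch (not part of the original toolbox.py) ---
# Plain input is wrapped in the markdown-body div; spaces become &nbsp; and
# newlines become </br> so gradio preserves the layout.
def _demo_text_divide_paragraph():
    html = text_divide_paragraph("a  b\nc")
    assert html.startswith('<div class="markdown-body">') and '&nbsp;' in html
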
263 | @lru_cache(maxsize=128)  # use an LRU cache to speed up conversion
264 | def markdown_convertion(txt):
265 | """
266 |     Convert Markdown text into HTML. If the text contains math formulas, convert the formulas to HTML first.
267 |     """
268 |     pre = '<div class="markdown-body">'
269 |     suf = '</div>'
270 | if txt.startswith(pre) and txt.endswith(suf):
271 |         # print('warning: the input string has already been converted; converting it twice may cause problems')
272 |         return txt  # already converted, no need to convert again
273 |
274 | markdown_extension_configs = {
275 | 'mdx_math': {
276 | 'enable_dollar_delimiter': True,
277 | 'use_gitlab_delimiters': False,
278 | },
279 | }
280 |     find_equation_pattern = r'<script type="math/tex(?:.*?)>(.*?)</script>'
281 | 
282 |     def tex2mathml_catch_exception(content, *args, **kwargs):
283 |         try:
284 |             content = tex2mathml(content, *args, **kwargs)
285 |         except:
286 |             content = content
287 |         return content
288 | 
289 |     def replace_math_no_render(match):
290 |         content = match.group(1)
291 |         if 'mode=display' in match.group(0):
292 |             content = content.replace('\n', '</br>')
293 |             return f"<font color=\"#00FF00\">$$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$$</font>"
294 |         else:
295 |             return f"<font color=\"#00FF00\">$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$</font>"
296 | 
297 |     def replace_math_render(match):
298 |         content = match.group(1)
299 |         if 'mode=display' in match.group(0):
300 |             if '\\begin{aligned}' in content:
301 |                 content = content.replace('\\begin{aligned}', '\\begin{array}')
302 |                 content = content.replace('\\end{aligned}', '\\end{array}')
303 |                 content = content.replace('&', ' ')
304 |             content = tex2mathml_catch_exception(content, display="block")
305 |             return content
306 |         else:
307 |             return tex2mathml_catch_exception(content)
308 | 
309 |     def markdown_bug_hunt(txt):
310 |         """
311 |         Work around an mdx_math bug (a redundant <script></script> pair when a begin-command is wrapped in single $).
312 |         """
313 |         content = txt.replace('<script type="math/tex">\n<script type="math/tex; mode=display">', '<script type="math/tex; mode=display">')
314 |         content = content.replace('</script>\n</script>', '</script>')
315 |         return content
316 |
317 | def no_code(txt):
318 | if '```' not in txt:
319 | return True
320 | else:
321 | if '```reference' in txt: return True # newbing
322 | else: return False
323 |
324 |     if ('$' in txt) and no_code(txt):  # the text contains $-delimited math and no ``` code fences
325 | # convert everything to html format
326 | split = markdown.markdown(text='---')
327 | convert_stage_1 = markdown.markdown(text=txt, extensions=['mdx_math', 'fenced_code', 'tables', 'sane_lists'], extension_configs=markdown_extension_configs)
328 | convert_stage_1 = markdown_bug_hunt(convert_stage_1)
329 | # re.DOTALL: Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline. Corresponds to the inline flag (?s).
330 | # 1. convert to easy-to-copy tex (do not render math)
331 | convert_stage_2_1, n = re.subn(find_equation_pattern, replace_math_no_render, convert_stage_1, flags=re.DOTALL)
332 | # 2. convert to rendered equation
333 | convert_stage_2_2, n = re.subn(find_equation_pattern, replace_math_render, convert_stage_1, flags=re.DOTALL)
334 | # cat them together
335 | return pre + convert_stage_2_1 + f'{split}' + convert_stage_2_2 + suf
336 | else:
337 | return pre + markdown.markdown(txt, extensions=['fenced_code', 'codehilite', 'tables', 'sane_lists']) + suf
338 |
339 |
340 | def close_up_code_segment_during_stream(gpt_reply):
341 | """
342 |     While gpt is streaming a code block (the opening ``` is out but the closing ``` is not yet), append the closing ```.
343 | 
344 |     Args:
345 |         gpt_reply (str): the reply string returned by the GPT model.
346 | 
347 |     Returns:
348 |         str: a new string with the pending code block's closing ``` appended.
349 |
350 | """
351 | if '```' not in gpt_reply:
352 | return gpt_reply
353 | if gpt_reply.endswith('```'):
354 | return gpt_reply
355 |
356 |     # with the two cases above ruled out, count the ``` marks: an odd count means a code block is still open
357 |     segments = gpt_reply.split('```')
358 |     n_mark = len(segments) - 1
359 |     if n_mark % 2 == 1:
360 |         # print('currently streaming a code block!')
361 |         return gpt_reply + '\n```'
362 | else:
363 | return gpt_reply
364 |
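# --- Editorial sketch (not part of the original toolbox.py) ---
# With an odd number of ``` marks the reply is still inside a code block,
# so a closing fence is appended before rendering.
def _demo_close_up_code_segment():
    partial = "Here is the code:\n```python\nprint('hi')"
    assert close_up_code_segment_during_stream(partial).endswith("\n```")
    done = "```python\nprint('hi')\n```"
    assert close_up_code_segment_during_stream(done) == done  # already balanced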
365 |
366 | def format_io(self, y):
367 | """
368 |     Parse input and output into HTML: paragraphize the input part of the last item in y, and convert the Markdown and math in the output part to HTML.
369 | """
370 | if y is None or y == []:
371 | return []
372 | i_ask, gpt_reply = y[-1]
373 |     # the input side is free-form, so preprocess it first
374 |     if i_ask is not None: i_ask = text_divide_paragraph(i_ask)
375 |     # if a code block was cut off mid-stream, try to append the closing ```
376 |     if gpt_reply is not None: gpt_reply = close_up_code_segment_during_stream(gpt_reply)
377 | # process
378 | y[-1] = (
379 | None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
380 | None if gpt_reply is None else markdown_convertion(gpt_reply)
381 | )
382 | return y
383 |
384 |
385 | def find_free_port():
386 | """
387 |     Return an unused port that is currently available on this system.
388 | """
389 | import socket
390 | from contextlib import closing
391 | with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as s:
392 | s.bind(('', 0))
393 | s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
394 | return s.getsockname()[1]
395 |
396 |
397 | def extract_archive(file_path, dest_dir):
398 | import zipfile
399 | import tarfile
400 | import os
401 | # Get the file extension of the input file
402 | file_extension = os.path.splitext(file_path)[1]
403 |
404 | # Extract the archive based on its extension
405 | if file_extension == '.zip':
406 | with zipfile.ZipFile(file_path, 'r') as zipobj:
407 | zipobj.extractall(path=dest_dir)
408 | print("Successfully extracted zip archive to {}".format(dest_dir))
409 |
410 | elif file_extension in ['.tar', '.gz', '.bz2']:
411 | with tarfile.open(file_path, 'r:*') as tarobj:
412 | tarobj.extractall(path=dest_dir)
413 | print("Successfully extracted tar archive to {}".format(dest_dir))
414 |
415 |     # third-party library; requires `pip install rarfile` in advance
416 |     # on Windows, WinRAR must also be installed and added to the Path environment variable, e.g. "C:\Program Files\WinRAR"
417 | elif file_extension == '.rar':
418 | try:
419 | import rarfile
420 | with rarfile.RarFile(file_path) as rf:
421 | rf.extractall(path=dest_dir)
422 | print("Successfully extracted rar archive to {}".format(dest_dir))
423 | except:
424 | print("Rar format requires additional dependencies to install")
425 | return '\n\n解压失败! 需要安装pip install rarfile来解压rar文件'
426 |
427 |     # third-party library; requires `pip install py7zr` in advance
428 | elif file_extension == '.7z':
429 | try:
430 | import py7zr
431 | with py7zr.SevenZipFile(file_path, mode='r') as f:
432 | f.extractall(path=dest_dir)
433 | print("Successfully extracted 7z archive to {}".format(dest_dir))
434 | except:
435 | print("7z format requires additional dependencies to install")
436 | return '\n\n解压失败! 需要安装pip install py7zr来解压7z文件'
437 | else:
438 | return ''
439 | return ''
440 |
441 |
442 | def find_recent_files(directory):
443 | """
444 | me: find files that is created with in one minutes under a directory with python, write a function
445 | gpt: here it is!
446 | """
447 | import os
448 | import time
449 | current_time = time.time()
450 | one_minute_ago = current_time - 60
451 | recent_files = []
452 |
453 | for filename in os.listdir(directory):
454 | file_path = os.path.join(directory, filename)
455 | if file_path.endswith('.log'):
456 | continue
457 | created_time = os.path.getmtime(file_path)
458 | if created_time >= one_minute_ago:
459 | if os.path.isdir(file_path):
460 | continue
461 | recent_files.append(file_path)
462 |
463 | return recent_files
464 |
465 | def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
466 |     # copy the file into the download area
467 | import shutil
468 | if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
469 | new_path = os.path.join(f'./gpt_log/', rename_file)
470 | if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
471 | if not os.path.exists(new_path): shutil.copyfile(file, new_path)
472 | if chatbot:
473 | if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
474 | else: current = []
475 | chatbot._cookies.update({'file_to_promote': [new_path] + current})
476 |
477 | def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
478 | """
479 |     Callback invoked when files are uploaded.
480 | """
481 |     if len(files) == 0:
482 |         return chatbot, txt, txt2  # match the three outputs of the normal return path
483 | import shutil
484 | import os
485 | import time
486 | import glob
487 | from toolbox import extract_archive
488 | try:
489 | shutil.rmtree('./private_upload/')
490 | except:
491 | pass
492 | time_tag = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
493 | os.makedirs(f'private_upload/{time_tag}', exist_ok=True)
494 | err_msg = ''
495 | for file in files:
496 | file_origin_name = os.path.basename(file.orig_name)
497 | shutil.copy(file.name, f'private_upload/{time_tag}/{file_origin_name}')
498 | err_msg += extract_archive(f'private_upload/{time_tag}/{file_origin_name}',
499 | dest_dir=f'private_upload/{time_tag}/{file_origin_name}.extract')
500 | moved_files = [fp for fp in glob.glob('private_upload/**/*', recursive=True)]
501 | if "底部输入区" in checkboxes:
502 | txt = ""
503 | txt2 = f'private_upload/{time_tag}'
504 | else:
505 | txt = f'private_upload/{time_tag}'
506 | txt2 = ""
507 | moved_files_str = '\t\n\n'.join(moved_files)
508 | chatbot.append(['我上传了文件,请查收',
509 | f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
510 | f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
511 | f'\n\n现在您点击任意“红颜色”标识的函数插件时,以上文件将被作为输入参数'+err_msg])
512 | return chatbot, txt, txt2
513 |
514 |
515 | def on_report_generated(cookies, files, chatbot):
516 | from toolbox import find_recent_files
517 | if 'file_to_promote' in cookies:
518 | report_files = cookies['file_to_promote']
519 | cookies.pop('file_to_promote')
520 | else:
521 | report_files = find_recent_files('gpt_log')
522 | if len(report_files) == 0:
523 | return cookies, None, chatbot
524 | # files.extend(report_files)
525 | file_links = ''
526 |     for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
527 | chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
528 | return cookies, report_files, chatbot
529 |
530 | def load_chat_cookies():
531 | API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
532 | if is_any_api_key(AZURE_API_KEY):
533 | if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
534 | else: API_KEY = AZURE_API_KEY
535 | #print(API_KEY)
536 | return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
537 |
538 | def is_openai_api_key(key):
539 | API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
540 | return bool(API_MATCH_ORIGINAL)
541 |
542 | def is_azure_api_key(key):
543 | API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
544 | return bool(API_MATCH_AZURE)
545 |
546 | def is_api2d_key(key):
547 | API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
548 | return bool(API_MATCH_API2D)
549 |
550 | def is_freeai_api_key(key):#new add
551 | API_MATCH_FREEAI0 = re.match(r"pk-[a-zA-Z0-9-_]{43}$", key)
552 | API_MATCH_FREEAI1 = re.match(r"fk-[a-zA-Z0-9-_]{43}$", key)
553 | return bool(API_MATCH_FREEAI0) or bool(API_MATCH_FREEAI1)
554 |
555 | def is_any_api_key(key):
556 | if ',' in key:
557 | keys = key.split(',')
558 | for k in keys:
559 | if is_any_api_key(k): return True
560 | return False
561 | else:#new add
562 | return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key) or is_freeai_api_key(key)
563 |
564 | def what_keys(keys):#new add
565 | avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0, "FreeAI Key":0}
566 | key_list = keys.split(',')
567 |
568 | for k in key_list:
569 | if is_openai_api_key(k):
570 | avail_key_list['OpenAI Key'] += 1
571 |
572 | for k in key_list:
573 | if is_api2d_key(k):
574 | avail_key_list['API2D Key'] += 1
575 |
576 | for k in key_list:
577 | if is_azure_api_key(k):
578 | avail_key_list['Azure Key'] += 1
579 |
580 | for k in key_list: # new add
581 | if is_freeai_api_key(k):
582 | avail_key_list['FreeAI Key'] += 1
583 |
584 | #new add
585 | return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']} 个, FreeAI Key {avail_key_list['FreeAI Key']} 个"
586 |
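# --- Editorial sketch (not part of the original toolbox.py) ---
# Shapes accepted by the validators above; every key below is a fake
# placeholder of the right length, not a real credential.
def _demo_key_formats():
    assert is_openai_api_key("sk-" + "a" * 48)                    # sk- plus 48 chars
    assert is_api2d_key("fk" + "a" * 6 + "-" + "b" * 32)          # API2D key
    assert is_freeai_api_key("pk-" + "c" * 43)                    # Pandora pool token
    assert is_freeai_api_key("fk-" + "d" * 43)                    # Pandora share token
    assert is_any_api_key("sk-" + "a" * 48 + ",pk-" + "c" * 43)   # comma-separated mix
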
587 | def select_api_key(keys, llm_model):
588 | import random
589 | avail_key_list = []
590 | key_list = keys.split(',')
591 |
592 | if llm_model.startswith('gpt-'):
593 | for k in key_list:
594 | if is_openai_api_key(k): avail_key_list.append(k)
595 | for k in key_list:# new add
596 | if is_freeai_api_key(k): avail_key_list.append(k)
597 |
598 | if llm_model.startswith('api2d-'):
599 | for k in key_list:
600 | if is_api2d_key(k): avail_key_list.append(k)
601 |
602 | if llm_model.startswith('azure-'):
603 | for k in key_list:
604 | if is_azure_api_key(k): avail_key_list.append(k)
605 |
606 | if len(avail_key_list) == 0:
607 | raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源(右下角更换模型菜单中可切换openai,azure和api2d请求源)")
608 |
609 |     api_key = random.choice(avail_key_list)  # random load balancing
610 | return api_key
611 |
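# --- Editorial sketch (not part of the original toolbox.py) ---
# Routing behaviour: for gpt-* models both native sk- keys and FreeAI pk-/fk-
# tokens are eligible, and one is drawn at random as naive load balancing.
def _demo_select_api_key():
    keys = "sk-" + "a" * 48 + ",pk-" + "b" * 43   # fake placeholder keys
    chosen = select_api_key(keys, 'gpt-3.5-turbo')
    assert chosen in keys.split(',')
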
612 | def read_env_variable(arg, default_value):
613 | """
614 |     The environment variable can be named `GPT_ACADEMIC_CONFIG` (takes priority) or simply `CONFIG`
615 |     For example, in the Windows cmd shell you can write either:
616 | set USE_PROXY=True
617 | set API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
618 | set proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
619 | set AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
620 | set AUTHENTICATION=[("username", "password"), ("username2", "password2")]
621 |     or equivalently:
622 | set GPT_ACADEMIC_USE_PROXY=True
623 | set GPT_ACADEMIC_API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
624 | set GPT_ACADEMIC_proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
625 | set GPT_ACADEMIC_AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
626 | set GPT_ACADEMIC_AUTHENTICATION=[("username", "password"), ("username2", "password2")]
627 | """
628 | from colorful import print亮红, print亮绿
629 | arg_with_prefix = "GPT_ACADEMIC_" + arg
630 | if arg_with_prefix in os.environ:
631 | env_arg = os.environ[arg_with_prefix]
632 | elif arg in os.environ:
633 | env_arg = os.environ[arg]
634 | else:
635 | raise KeyError
636 | print(f"[ENV_VAR] 尝试加载{arg},默认值:{default_value} --> 修正值:{env_arg}")
637 | try:
638 | if isinstance(default_value, bool):
639 | env_arg = env_arg.strip()
640 | if env_arg == 'True': r = True
641 | elif env_arg == 'False': r = False
642 |             else: print('expected True or False, but got:', env_arg); r = default_value
643 | elif isinstance(default_value, int):
644 | r = int(env_arg)
645 | elif isinstance(default_value, float):
646 | r = float(env_arg)
647 | elif isinstance(default_value, str):
648 | r = env_arg.strip()
649 | elif isinstance(default_value, dict):
650 | r = eval(env_arg)
651 | elif isinstance(default_value, list):
652 | r = eval(env_arg)
653 | elif default_value is None:
654 | assert arg == "proxies"
655 | r = eval(env_arg)
656 | else:
657 | print亮红(f"[ENV_VAR] 环境变量{arg}不支持通过环境变量设置! ")
658 | raise KeyError
659 | except:
660 | print亮红(f"[ENV_VAR] 环境变量{arg}加载失败! ")
661 | raise KeyError(f"[ENV_VAR] 环境变量{arg}加载失败! ")
662 |
663 | print亮绿(f"[ENV_VAR] 成功读取环境变量{arg}")
664 | return r
665 |
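# --- Editorial sketch (not part of the original toolbox.py) ---
# The raw environment string is coerced to the type of the config default,
# so list/dict options can be passed through the environment as literals.
def _demo_read_env_variable():
    os.environ['GPT_ACADEMIC_AVAIL_LLM_MODELS'] = '["gpt-3.5-turbo", "chatglm"]'
    models = read_env_variable('AVAIL_LLM_MODELS', default_value=[])
    assert models == ["gpt-3.5-turbo", "chatglm"]
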
666 | @lru_cache(maxsize=128)
667 | def read_single_conf_with_lru_cache(arg):
668 | from colorful import print亮红, print亮绿, print亮蓝
669 | try:
670 |         # priority 1: take the config value from an environment variable
671 |         default_ref = getattr(importlib.import_module('config'), arg)  # read the default value as the reference for type conversion
672 | r = read_env_variable(arg, default_ref)
673 | except:
674 | try:
675 |             # priority 2: read the value from config_private
676 | r = getattr(importlib.import_module('config_private'), arg)
677 | except:
678 |             # priority 3: read the value from config
679 | r = getattr(importlib.import_module('config'), arg)
680 |
681 |     # when reading API_KEY, check whether the user forgot to edit config
682 | if arg == 'API_KEY':
683 | print亮蓝(f"[API_KEY] 本项目现已支持OpenAI和API2D的api-key。也支持同时填写多个api-key,如API_KEY=\"openai-key1,openai-key2,api2d-key3\"")
684 | print亮蓝(f"[API_KEY] 您既可以在config.py中修改api-key(s),也可以在问题输入区输入临时的api-key(s),然后回车键提交后即可生效。")
685 | if is_any_api_key(r):
686 | print亮绿(f"[API_KEY] 您的 API_KEY 是: {r[:15]}*** API_KEY 导入成功")
687 | else:
688 | print亮红( "[API_KEY] 正确的 API_KEY 是'sk'开头的51位密钥(OpenAI),或者 'fk'开头的41位密钥,请在config文件中修改API密钥之后再运行。")
689 | if arg == 'proxies':
690 | if r is None:
691 | print亮红('[PROXY] 网络代理状态:未配置。无代理状态下很可能无法访问OpenAI家族的模型。建议:检查USE_PROXY选项是否修改。')
692 | else:
693 | print亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', r)
694 | assert isinstance(r, dict), 'proxies格式错误,请注意proxies选项的格式,不要遗漏括号。'
695 | return r
696 |
697 |
698 | def get_conf(*args):
699 |     # it is recommended to keep your secrets, such as API keys and proxy URLs, in a separate config_private.py, so they are not accidentally pushed to github
700 | res = []
701 | for arg in args:
702 | r = read_single_conf_with_lru_cache(arg)
703 | res.append(r)
704 | return res
705 |
706 |
707 | def clear_line_break(txt):
708 | txt = txt.replace('\n', ' ')
709 | txt = txt.replace(' ', ' ')
710 | txt = txt.replace(' ', ' ')
711 | return txt
712 |
713 |
714 | class DummyWith():
715 | """
716 |     This defines an empty context manager named DummyWith.
717 |     Its job is... well... to do nothing, i.e. to stand in for another context manager without changing the code structure.
718 |     A context manager is a Python object meant to be used with the with statement,
719 |     ensuring that resources are properly initialized and cleaned up while a block of code runs.
720 |     A context manager must implement two methods: __enter__() and __exit__().
721 |     __enter__() is called just before the code block starts executing,
722 |     and __exit__() is called when the context ends.
723 | """
724 | def __enter__(self):
725 | return self
726 |
727 | def __exit__(self, exc_type, exc_value, traceback):
728 | return
729 |
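# --- Editorial sketch (not part of the original toolbox.py) ---
# DummyWith keeps the code shape stable when a real context manager is only
# sometimes needed (here paired with ProxyNetworkActivate, defined below).
def _demo_dummy_with(use_proxy: bool):
    ctx = ProxyNetworkActivate() if use_proxy else DummyWith()
    with ctx:
        pass  # same indentation and structure either way
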
730 | def run_gradio_in_subpath(demo, auth, port, custom_path):
731 | """
732 |     Serve the gradio app under the specified sub-path instead of the root URL.
733 | """
734 | def is_path_legal(path: str)->bool:
735 | '''
736 | check path for sub url
737 | path: path to check
738 | return value: do sub url wrap
739 | '''
740 | if path == "/": return True
741 | if len(path) == 0:
742 | print("ilegal custom path: {}\npath must not be empty\ndeploy on root url".format(path))
743 | return False
744 | if path[0] == '/':
745 | if path[1] != '/':
746 | print("deploy on sub-path {}".format(path))
747 | return True
748 | return False
749 | print("ilegal custom path: {}\npath should begin with \'/\'\ndeploy on root url".format(path))
750 | return False
751 |
752 |     if not is_path_legal(custom_path): raise RuntimeError('Illegal custom path')
753 | import uvicorn
754 | import gradio as gr
755 | from fastapi import FastAPI
756 | app = FastAPI()
757 | if custom_path != "/":
758 | @app.get("/")
759 | def read_main():
760 | return {"message": f"Gradio is running at: {custom_path}"}
761 | app = gr.mount_gradio_app(app, demo, path=custom_path)
762 | uvicorn.run(app, host="0.0.0.0", port=port) # , auth=auth
763 |
764 |
765 | def clip_history(inputs, history, tokenizer, max_token_limit):
766 | """
767 |     Reduce the length of the history by clipping:
768 |     this function repeatedly finds the longest entry and clips it, little by little,
769 |     until the token count of the history falls below the threshold.
773 | """
774 | import numpy as np
775 | from request_llm.bridge_all import model_info
776 | def get_token_num(txt):
777 | return len(tokenizer.encode(txt, disallowed_special=()))
778 | input_token_num = get_token_num(inputs)
779 | if input_token_num < max_token_limit * 3 / 4:
780 |         # when the input takes up less than 3/4 of the limit, clip the history:
781 |         # 1. reserve budget for the input
782 |         max_token_limit = max_token_limit - input_token_num
783 |         # 2. reserve budget for the output
784 |         max_token_limit = max_token_limit - 128
785 |         # 3. if too little budget remains, just clear the history
786 |         if max_token_limit < 128:
787 |             history = []
788 |             return history
789 |     else:
790 |         # when the input takes up more than 3/4 of the limit, clear the history outright
791 |         history = []
792 |         return history
793 |
794 | everything = ['']
795 | everything.extend(history)
796 | n_token = get_token_num('\n'.join(everything))
797 | everything_token = [get_token_num(e) for e in everything]
798 |
799 |     # granularity of each truncation step
800 | delta = max(everything_token) // 16
801 |
802 | while n_token > max_token_limit:
803 | where = np.argmax(everything_token)
804 | encoded = tokenizer.encode(everything[where], disallowed_special=())
805 | clipped_encoded = encoded[:len(encoded)-delta]
806 |         everything[where] = tokenizer.decode(clipped_encoded)[:-1]  # [:-1] drops the last char, which may be a broken half-token
807 | everything_token[where] = get_token_num(everything[where])
808 | n_token = get_token_num('\n'.join(everything))
809 |
810 | history = everything[1:]
811 | return history
812 |
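# --- Editorial sketch (not part of the original toolbox.py) ---
# clip_history with a toy whitespace "tokenizer"; the real caller passes a
# tiktoken encoder exposing the same encode/decode interface.
class _ToyTokenizer:
    def encode(self, txt, disallowed_special=()): return txt.split()
    def decode(self, tokens): return ' '.join(tokens)

def _demo_clip_history():
    history = ["short entry", "a much much much longer entry " * 50]
    clipped = clip_history("question", history, _ToyTokenizer(), max_token_limit=300)
    # the longest entry is trimmed step by step until the total fits the budget
    assert len(_ToyTokenizer().encode('\n'.join([''] + clipped))) <= 300
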
813 | """
814 | ========================================================================
815 | Part 3
816 | Miscellaneous utilities:
817 | - zip_folder: compress every file under a path into a zip and move it to another given path (written by gpt)
818 | - gen_time_str: generate a timestamp
819 | - ProxyNetworkActivate: temporarily enable the proxy network (if one is configured)
820 | - objdump/objload: quick-and-dirty debugging helpers
821 | ========================================================================
822 | """
823 |
824 | def zip_folder(source_folder, dest_folder, zip_name):
825 | import zipfile
826 | import os
827 | # Make sure the source folder exists
828 | if not os.path.exists(source_folder):
829 | print(f"{source_folder} does not exist")
830 | return
831 |
832 | # Make sure the destination folder exists
833 | if not os.path.exists(dest_folder):
834 | print(f"{dest_folder} does not exist")
835 | return
836 |
837 | # Create the name for the zip file
838 | zip_file = os.path.join(dest_folder, zip_name)
839 |
840 | # Create a ZipFile object
841 | with zipfile.ZipFile(zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
842 | # Walk through the source folder and add files to the zip file
843 | for foldername, subfolders, filenames in os.walk(source_folder):
844 | for filename in filenames:
845 | filepath = os.path.join(foldername, filename)
846 | zipf.write(filepath, arcname=os.path.relpath(filepath, source_folder))
847 |
848 | # Move the zip file to the destination folder (if it wasn't already there)
849 | if os.path.dirname(zip_file) != dest_folder:
850 | os.rename(zip_file, os.path.join(dest_folder, os.path.basename(zip_file)))
851 | zip_file = os.path.join(dest_folder, os.path.basename(zip_file))
852 |
853 | print(f"Zip file created at {zip_file}")
854 |
855 | def zip_result(folder):
856 | import time
857 | t = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
858 | zip_folder(folder, './gpt_log/', f'{t}-result.zip')
859 | return pj('./gpt_log/', f'{t}-result.zip')
860 |
861 | def gen_time_str():
862 | import time
863 | return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
864 |
865 | class ProxyNetworkActivate():
866 | """
867 |     Context manager that temporarily applies the configured proxy to a small block of code.
868 | """
869 | def __enter__(self):
870 | from toolbox import get_conf
871 | proxies, = get_conf('proxies')
872 | if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
873 | if proxies is not None:
874 | if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
875 | if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
876 | return self
877 |
878 | def __exit__(self, exc_type, exc_value, traceback):
879 | os.environ['no_proxy'] = '*'
880 | if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
881 | if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
882 | return
883 |
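# --- Editorial sketch (not part of the original toolbox.py) ---
# Scoping a proxied request with ProxyNetworkActivate: HTTP(S)_PROXY exist
# only inside the with-block and are removed again on exit.
def _demo_proxy_network_activate():
    with ProxyNetworkActivate():
        pass  # e.g. a requests.get(...) here would honour the configured proxy
    assert 'HTTP_PROXY' not in os.environ  # cleaned up on exit
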
884 | def objdump(obj, file='objdump.tmp'):
885 | import pickle
886 | with open(file, 'wb+') as f:
887 | pickle.dump(obj, f)
888 | return
889 |
890 | def objload(file='objdump.tmp'):
891 | import pickle, os
892 | if not os.path.exists(file):
893 | return
894 | with open(file, 'rb') as f:
895 | return pickle.load(f)
896 |
897 |
--------------------------------------------------------------------------------