├── .DS_Store
├── README.md
├── Translate.html
├── get_freeai_api_v2.py
├── gpt_academic
│   ├── config.py
│   ├── requirements.txt
│   └── toolbox.py
├── images
│   └── error
│       └── pandora_public_pool_token.png
└── old
    ├── 3.43
    │   ├── config.py
    │   ├── requirements.txt
    │   └── toolbox.py
    ├── README_old.md
    └── gpt_academic_old
        ├── config_private.py
        ├── credentials.txt
        ├── get_freeai_api.py
        └── toolbox.py
/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/elphen-wang/FreeAI/25cd8bfff04607fc73a464bfcff948d102b65c4f/.DS_Store
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # FreeAI
3 |
4 | **OpenAI should not be a closed AI.**
5 |
6 | Still fretting over the VPN you need just to reach OpenAI?
7 | 
8 | Still put off by OpenAI's pay-as-you-go pricing?
9 | 
10 | Frustrated that there is no free API key for building your own ChatGPT tools?
11 | 
12 | This project draws together the work of many excellent developers on GitHub into a fairly complete solution, and keeps pushing toward AI that is more usable, more capable, and cheaper to open up. **If you like this project, please leave a free star. Thank you!**
13 | 
14 | `Tips: common questions and caveats are annotated throughout this page. Before trying anything, please read the part of this tutorial you need in full, to avoid asking duplicate questions and wasting time waiting for answers.`
15 |
16 | ---
17 | #### FreeAI went live on July 16, 2023:
18 | + ChatGPT 3.5 without a VPN, built on Pandora and OpenAIAuth;
19 | + A demo configuration of gpt_academic (version 3.45) that uses ChatGPT 3.5 for free and without a VPN;
20 | 
21 | #### **Highlights of the August 1, 2023 update:**
22 | + A home-made Pool Token (built from 10 accounts);
23 | + OpenAIAuth is retired. A new VPN-free way to obtain the Access Token (i.e., the user cookie) of your own OpenAI account, so you can build your own Pandora Share Token and Pool Token;
24 | + A demo of using `ChatGPT 3.5` without a VPN, based on gpt_academic (version 3.47);
25 | + Solutions to common problems reported in the issues, sprinkled throughout.
26 | 
27 | #### **Highlights of the September 22, 2023 update:**
28 | + The pool FreeAI provides is now refreshed every 4 hours, to lower the odds that the cookies of the accounts backing it expire;
29 | + Files updated to track gpt_academic (version 3.53); every changed region is marked "FreeAI更新", so you can search for that keyword and compare. The changes are few, so they are easy to follow;
30 | + If you are upgrading from gpt_academic (version 3.45) to gpt_academic (version 3.53), run `python3 -m pip install -r requirements.txt` to refresh the dependencies.
31 |
32 | ---
33 |
34 | **Acknowledgements:**
35 | + [pengzhile/pandora](https://github.com/pengzhile/pandora): the key technology that makes the OpenAI GPT-3.5 API free and reachable without a VPN.
36 | + [binary-husky/gpt_academic](https://github.com/binary-husky/gpt_academic): used as the running example; we remove its need for a VPN and for a paid native OpenAI API key, turning OpenAI into FreeAI.
37 |
38 | ## Pandora
39 | Pandora aims to deliver the most authentic ChatGPT experience possible without a VPN. It is built on an access-token [technique](https://zhile.io/2023/05/19/how-to-get-chatgpt-access-token-via-pkce.html). There is an official demo site, [https://chat.zhile.io](https://chat.zhile.io), which signs in with your OpenAI credentials and shares the conversation history of the official site; there is also a Pandora-based shared pool, [Shared Chat](https://baipiao.io/chatgpt), which requires no account at all. `Tips: Shared Chat feeling sluggish at the moment is normal; there are simply too many users.`
40 |
41 | The most valuable thing the Pandora project provides is a service that turns a user's cookie into an Access Token shaped like an API key, together with a reverse-proxy endpoint that answers to that Access Token (and to native OpenAI API keys as well). For independent developers building on OpenAI, this is an enormous gift. Details: ["this service emulates the Turbo API, free of charge, backed by ChatGPT's 8k model"](https://github.com/pengzhile/pandora/issues/837).
42 | + Obtain your own user cookie (i.e., the ChatGPT Access Token) without a VPN; demo endpoints: [https://ai-20230626.fakeopen.com/auth](https://ai-20230626.fakeopen.com/auth) and [https://ai-20230626.fakeopen.com/auth1](https://ai-20230626.fakeopen.com/auth1); `Tips: whether Pandora's backend logs your account credentials is unknown, but it certainly works well.`
43 | + Convert a cookie into a 43-character Share Token starting with `fk-`; demo: [https://ai.fakeopen.com/token](https://ai.fakeopen.com/token);
44 | + Convert cookies into a 43-character Pool Token starting with `pk-`; demo: [https://ai.fakeopen.com/pool](https://ai.fakeopen.com/pool). This solves concurrent use across multiple accounts;
45 | + The reverse-proxy endpoint that answers to the tokens above is [https://ai.fakeopen.com/v1/chat/completions](https://ai.fakeopen.com/v1/chat/completions); a usage sketch follows below.
46 |
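Putting the pieces together: a minimal sketch (mine, not Pandora's; the token value is a placeholder) of calling the reverse proxy exactly as if it were the official OpenAI chat-completions API:

```python
import requests

API_URL = "https://ai.fakeopen.com/v1/chat/completions"
TOKEN = "pk-your-pool-token-here"  # placeholder: substitute your own Share (fk-) or Pool (pk-) Token

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}
headers = {"Authorization": f"Bearer {TOKEN}"}
resp = requests.post(API_URL, json=payload, headers=headers, timeout=60)
if resp.status_code == 200:
    print(resp.json()["choices"][0]["message"]["content"])
```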
47 | The Pandora project also offers two free Pool Tokens:
48 | + `pk-this-is-a-real-free-pool-token-for-everyone`, a pool built from many Share Tokens.
49 | + ~~`pk-this-is-a-real-free-api-key-pk-for-everyone`~~, a pool built from $120 API keys. `(It was already out of credit when I tested it; using it keeps throwing errors, so leave it alone.)`
50 |
51 | Testing with Share Tokens and Pool Tokens generated from my own accounts shows that conversations held this way do not show up in the account's chat history. `That said, people on the Pandora forum have reported such conversations being saved into their account history, so whether this stays true is hard to say.`
52 |
53 | I am very fond of ChatGPT's translation quality, so I wrote a simple Pandora-backed translation page, [Translate.html](https://github.com/elphen-wang/FreeAI/blob/main/Translate.html); tests show it works reasonably well. `Tips: it uses the Pool Token provided by Pandora.`
54 | ## FreeAI now provides its own Pool Token
55 |
56 | **Previously**, because my own pool was too small and a user cookie lives for only **14 days**, constantly refreshing Access Tokens was a chore, so I relied on the Pool Token provided by Pandora. After some time in the field, however, everyone (myself included) kept running into errors like the following:
57 |
58 | ![pandora_public_pool_token](images/error/pandora_public_pool_token.png)
59 | 
60 | My **guess** is that the free Pool Token Pandora provides sits on a pool of roughly 100 accounts, each with an Access Token that lives only 14 days and was presumably issued on a different date, so the Pool Token must constantly refresh the Access Tokens of those 100 accounts, or the error above appears. Sure enough, these errors tend to clear up on their own after a day or two, which also suggests the refresh mechanism is strained or unfinished. **This tutorial previously relied on [OpenAIAuth](https://github.com/acheong08/OpenAIAuth)** for [a VPN-free way to obtain your own Pandora Share Token and Pool Token](https://github.com/elphen-wang/FreeAI/blob/main/old/gpt_academic_old/get_freeai_api.py). In practice, though, the **machine serving OpenAIAuth is under load and often fails to return your account's Access Token**, so a replacement was clearly needed. **All of this adds up to a thoroughly bad user experience.**
61 |
62 | Hence: FreeAI now provides its own Pool Token. You can fetch the FreeAI Pool Token from:
63 | [https://api.elphen.site/api?mode=default_my_poolkey](https://api.elphen.site/api?mode=default_my_poolkey) 。
64 |
65 | When using this link, **please note the following**:
66 | + The FreeAI Pool Token served by this link is **refreshed every day at 04:10**; its value is **not fixed forever**. Its lifetime is currently set to one day, so **fetching it once a day is enough**;
67 | + The pool hangs off my small cloud VPS, so **please be gentle and do not poll it frequently**.
68 | + This FreeAI Pool Token is built from 10 OpenAI accounts. The pool is small, but it should be enough, and it will be expanded over time.
69 | + Python code to fetch the FreeAI Pool Token (a disk-caching sketch follows the block):
70 |
71 | ```python
72 | import requests
73 | response = requests.get("https://api.elphen.site/api?mode=default_my_poolkey", timeout=30)
74 | if response.status_code == 200:
75 |     FreeAI_Pool_Token = response.json()
76 | ```
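To honor the once-a-day rule without having to think about it, here is a small sketch (not part of this repo; the cache file name is an arbitrary choice) that stores the token on disk and only re-fetches once the copy has aged out:

```python
import json, os, time
import requests

CACHE_FILE = "freeai_pool_token.json"  # hypothetical cache location
URL = "https://api.elphen.site/api?mode=default_my_poolkey"

def get_pool_token(max_age_seconds=24 * 3600):
    # reuse the cached token while it is younger than max_age_seconds
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "r", encoding="utf-8") as f:
            cached = json.load(f)
        if time.time() - cached["fetched_at"] < max_age_seconds:
            return cached["token"]
    # otherwise fetch a fresh token and remember when we got it
    token = requests.get(URL, timeout=30).json()
    with open(CACHE_FILE, "w", encoding="utf-8") as f:
        json.dump({"token": token, "fetched_at": time.time()}, f)
    return token
```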
77 |
78 | You can also build your own Pandora token through the APIs the Pandora project provides:
79 | ```python
80 | import requests
81 | # Obtain your OpenAI account's Access Token, no VPN required.
82 | # Tips: whether Pandora's backend logs your credentials is unknown, but it works well.
83 | data0 = {'username': username, # your OpenAI account name
84 |          'password': password, # your OpenAI password
85 |          'prompt': 'login',}
86 | resp0 = requests.post('https://ai-20230626.fakeopen.com/auth/login', data=data0)
87 | if resp0.status_code == 200:
88 |     your_openai_cookie = resp0.json()['access_token']
89 | 
90 | # Register your own Pandora token.
91 | data1 = {'unique_name': 'get my token', # no need to change this
92 |          'access_token': your_openai_cookie,
93 |          'expires_in': 0,
94 |          }
95 | resp1 = requests.post('https://ai.fakeopen.com/token/register', data=data1)
96 | if resp1.status_code == 200:
97 |     your_pandora_token = resp1.json()['token_key']
98 | ```
99 | To build your own Pandora Pool Token, suppose you already have your_pandora_token_list, an array of Pandora Share Tokens from two or more accounts; the following Python then yields the Pool Token (a quick format check follows the block):
100 | ```python
101 | data2 = {'share_tokens': '\n'.join(your_pandora_token_list),}
102 | resp2 = requests.post('https://ai.fakeopen.com/pool/update', data=data2)
103 | if resp2.status_code == 200:
104 |     your_pool_token = resp2.json()['pool_token']
105 | ```
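Whatever the endpoints return, a quick format check is cheap; this sketch mirrors the `is_freeai_api_key` test in this repo's toolbox.py (Share Tokens are `fk-` plus 43 characters, Pool Tokens `pk-` plus 43):

```python
import re

def looks_like_pandora_token(key: str) -> bool:
    # pk- (pool) or fk- (share), followed by 43 URL-safe characters
    return bool(re.match(r"(pk|fk)-[a-zA-Z0-9\-_]{43}$", key))

assert looks_like_pandora_token("pk-this-is-a-real-free-pool-token-for-everyone")
```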
106 |
107 | [get_freeai_api_v2.py](get_freeai_api_v2.py) in this tutorial is a complete demo program for producing Pandora Share/Pool Tokens.
108 | 
109 | **I strongly encourage you to use your own Pandora token. And please be considerate in your code: do not fire off frequent requests that push the serving machines (individual developers rarely run powerful servers) to their limit, or the slow responses will come back to bite your own requests.**
110 |
111 | ## gpt_academic
112 | My earlier tutorial on hosting your own OpenAI API reverse proxy, [ChatGPT Wallfree](https://github.com/elphen-wang/chatgpt_wallfree), only freed gpt_academic from the VPN requirement; it still needed a native, paid OpenAI API key. Taking gpt_academic as the example once more, this time there is no reverse proxy for you to host and no native OpenAI API key at all, which spares the average research group an expense that is hard to get reimbursed.
113 |
114 | Developers can replace the official files with those in this project's [gpt_academic](https://github.com/elphen-wang/FreeAI/tree/main/gpt_academic) folder (`the changes mainly teach toolbox.py and config.py to recognize and fetch Pandora tokens`), and can layer their own settings on top (such as gpt_academic login credentials). After that, following the official instructions for debugging, running, and deployment, gpt_academic can use gpt-3.5 without a VPN and without paying!
115 |
116 | **Deployment notes**:
117 | + gpt_academic reads user settings with the priority environment variables > config_private.py > config.py, so while debugging it is best to mirror any change into config.py as well (i.e., keep the two files identical). Otherwise your settings may fail to take effect in some debugging setups; this is a gpt_academic bug that I have not patched. **My advice: simply skip config_private.py (delete it, or never create it), or keep the two files exactly identical.** A toy sketch of this lookup order appears after this list.
118 | + Replace the corresponding official files with the files under this project's [gpt_academic](https://github.com/elphen-wang/FreeAI/tree/main/gpt_academic) folder and adjust as needed. Testing was done against gpt_academic v3.47.
119 | `A few notes:`
120 | + `requirements.txt` adds pdfminer, pdflatex, and apscheduler over the official list. The first two back the LaTeX features; the last drives the scheduled API_KEY refresh and is the only one that is strictly required. You can adapt the code along these lines to use your own Pandora token;
121 | + `toolbox.py` adds recognition of Pandora tokens over the official version;
122 | + `config.py` adds the scheduled fetch of the Pool Token FreeAI provides, changes API_URL_REDIRECT to the reverse-proxy endpoint (without which Pandora tokens cannot be handled), and sets WEB_PORT to 86 (pick whatever number you like). You can also add gpt_academic login credentials and other features.
123 | + The usual docker build and run is:
124 | ```bash {.line-numbers}
125 | # build the docker image
126 | docker build -t gpt-academic .
127 | # the port is up to you; just keep it consistent with config.py
128 | docker run -d --restart=always -p 86:86 --name gpt-academic gpt-academic
129 | ```
130 | + To use gpt_academic's arxiv translation feature under docker, build and run like this:
131 | ``` bash {.line-numbers}
132 | # build the docker image
133 | docker build -t gpt-academic-nolocal-latex -f docs/GithubAction+NoLocal+Latex .
134 | # the port is up to you; keep it consistent with config.py and config_private.py
135 | # /home/fuqingxu/arxiv_cache is a folder outside the container that holds the arxiv content; change the path to taste
136 | docker run -d -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --net=host -p 86:86 --restart=always --name gpt-academic gpt-academic-nolocal-latex
137 | ```
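As promised above, a toy sketch of the configuration lookup order; it mirrors `read_single_conf_with_lru_cache` in this repo's toolbox.py, minus the type coercion and caching the real function does:

```python
import importlib
import os

def resolve_conf(arg):
    # 1. environment variables win (with or without the GPT_ACADEMIC_ prefix)
    for name in ("GPT_ACADEMIC_" + arg, arg):
        if name in os.environ:
            return os.environ[name]
    # 2. then config_private.py, if it exists and defines the setting
    try:
        return getattr(importlib.import_module("config_private"), arg)
    except (ImportError, AttributeError):
        pass
    # 3. finally config.py
    return getattr(importlib.import_module("config"), arg)
```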
138 | ## Postscript
139 | + Because Pandora essentially reconstructs OpenAI's own web service, paid offerings such as ChatGPT-4 cannot yet be used for free. Making that possible is the direction that I, together with the many developers committed to bringing AI to a wider public, will keep working toward.
140 | + The earlier ChatGPT Wallfree tutorial mentioned ZeroTier for NAT traversal; in practice, [Frp](https://github.com/fatedier/frp) suits Chinese research folk better: more stable, faster, and no client required on the third-party side.
141 | 
142 | ## To-do List
143 | + [ ] Since I currently work in research, I will put my limited energy into arxiv-related features first, integrating the services I can and want to integrate.
144 |
145 | ## Star History
146 |
147 | 
148 |
149 |
150 |
--------------------------------------------------------------------------------
/Translate.html:
--------------------------------------------------------------------------------
6 | Translate
13 | Translation based on the FreeAI
--------------------------------------------------------------------------------
/get_freeai_api_v2.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 |
3 | from os import path
4 | import requests,json
5 |
6 | def run():
7 | expires_in = 0
8 | unique_name = 'my share token'
9 | current_dir = path.dirname(path.abspath(__file__))
10 | # credentials.txt: your accounts, one "username,password" pair per line
11 | # share_tokens.txt: records the resulting Pandora share tokens and/or pool token.
12 | credentials_file = path.join(current_dir, 'credentials.txt')
13 | share_tokens_file = path.join(current_dir, 'share_tokens.txt')
14 | with open(credentials_file, 'r', encoding='utf-8') as f:
15 | credentials = f.read().split('\n')
16 | credentials = [credential.split(',', 1) for credential in credentials]
17 | count = 0
18 | token_keys = []
19 | for idx, credential in enumerate(credentials):
20 | progress = '{}/{}'.format(idx + 1, len(credentials))  # enumerate reports progress correctly even for duplicate entries
21 | if not credential or len(credential) != 2:
22 | continue
23 |
24 | count += 1
25 | username, password = credential[0].strip(), credential[1].strip()
26 | token_info = {
27 | 'token': 'None',
28 | 'share_token': 'None',
29 | }
30 | token_keys.append(token_info)
31 | try:
32 | #auth = Auth0(email=username, password=password)
33 | #token_info['token'] = auth.get_access_token()
34 | data = {'username': username,
35 | 'password': password,
36 | 'prompt': 'login',}
37 | resp = requests.post('https://ai-20230626.fakeopen.com/auth/login', data=data)
38 | if resp.status_code == 200:
39 | feedback_info= resp.json()
40 | token_info['token'] =feedback_info['access_token']
41 | #print(auth)
42 | print('Login success: {}, {}'.format(username, progress))
43 | except Exception as e:
44 | err_str = str(e).replace('\n', '').replace('\r', '').strip()
45 | print('Login failed: {}, {}'.format(username, err_str))
46 | token_info['token'] = err_str
47 | continue
48 | data = {
49 | 'unique_name': unique_name,
50 | 'access_token': token_info['token'],
51 | 'expires_in': expires_in,
52 | }
53 | resp = requests.post('https://ai.fakeopen.com/token/register', data=data)
54 | if resp.status_code == 200:
55 | token_info['share_token'] = resp.json()['token_key']
56 | else:
57 | continue
58 |
59 | with open(share_tokens_file, 'w', encoding='utf-8') as f:
60 | # more than one account: prefer a pool token; exactly one: use its share token; none: fall back to the public pools.
61 | if count==0:
62 | f.write('pk-this-is-a-real-free-pool-token-for-everyone\n')
63 | f.write('pk-this-is-a-real-free-api-key-pk-for-everyone\n')
64 | elif count==1:
65 | f.write('{}\n'.format(token_keys[0]['share_token']))
66 | else:
67 | data = {
68 | 'share_tokens': '\n'.join([token_info['share_token'] for token_info in token_keys]),
69 | }
70 | resp = requests.post('https://ai.fakeopen.com/pool/update', data=data)
71 | if resp.status_code == 200:
72 | f.write('{}\n'.format(resp.json()['pool_token']))
73 | for token_info in token_keys:
74 | f.write('{}\n'.format(token_info['share_token']))
75 | 
76 |
77 | if __name__ == '__main__':
78 | run()
79 |
80 |
--------------------------------------------------------------------------------
/gpt_academic/config.py:
--------------------------------------------------------------------------------
1 | """
2 | 以下所有配置也都支持利用环境变量覆写,环境变量配置格式见docker-compose.yml。
3 | 读取优先级:环境变量 > config_private.py > config.py
4 | --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
5 | All the following configurations also support using environment variables to override,
6 | and the environment variable configuration format can be seen in docker-compose.yml.
7 | Configuration reading priority: environment variable > config_private.py > config.py
8 | """
9 |
10 | # [step 1]>> API_KEY = "sk-123456789xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123456789"。极少数情况下,还需要填写组织(格式如org-123456789abcdefghijklmno的),请向下翻,找 API_ORG 设置项
11 | API_KEY = "此处填API密钥" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
12 |
13 | ####################FreeAI更改的部分:定时更新API_KEY的部分####################
14 | import requests
15 | from apscheduler.schedulers.background import BackgroundScheduler
16 | ELPHEN_URL="https://api.elphen.site/api?mode=default_my_poolkey"
17 | def GET_API_KEY(url=ELPHEN_URL):
18 | response = requests.get(url)
19 | global API_KEY
20 | if response.status_code == 200:
21 | API_KEY=response.json()
22 | else:
23 | API_KEY="pk-this-is-a-real-free-pool-token-for-everyone"
24 | return API_KEY
25 |
26 | API_KEY=GET_API_KEY(ELPHEN_URL)
27 | # create the background thread for the scheduled job
28 | scheduler = BackgroundScheduler()
29 | scheduler.add_job(GET_API_KEY, trigger='cron', hour=5, minute=8) # refresh the FreeAI key once a day at 05:08
30 | # start the scheduler
31 | scheduler.start()
32 | #print(API_KEY)
33 | ###########################################################
34 |
35 | # [step 2]>> 改为True应用代理,如果直接在海外服务器部署,此处不修改;如果使用本地或无地域限制的大模型时,此处也不需要修改
36 | USE_PROXY = False
37 | if USE_PROXY:
38 | """
39 | 填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
40 | <配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
41 | [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
42 | [地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了(localhost意思是代理软件安装在本机上)
43 | [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
44 | """
45 | # 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
46 | proxies = {
47 | # [协议]:// [地址] :[端口]
48 | "http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
49 | "https": "socks5h://localhost:11284", # 再例如 "https": "http://127.0.0.1:7890",
50 | }
51 | else:
52 | proxies = None
53 |
54 | # ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
55 |
56 | # 重新URL重新定向,实现更换API_URL的作用(高危设置! 常规情况下不要修改! 通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人!)
57 | # 格式: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
58 | # 举例: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "https://reverse-proxy-url/v1/chat/completions"}
59 |
60 | ############FreeAI更改的部分#################
61 | API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://ai.fakeopen.com/v1/chat/completions"}
62 | ###########################################
63 |
64 | # 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
65 | # 一言以蔽之:免费(5刀)用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询:https://platform.openai.com/docs/guides/rate-limits/overview
66 | DEFAULT_WORKER_NUM = 3
67 |
68 |
69 | # 色彩主题, 可选 ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast"]
70 | # 更多主题, 请查阅Gradio主题商店: https://huggingface.co/spaces/gradio/theme-gallery 可选 ["Gstaff/Xkcd", "NoCrypt/Miku", ...]
71 | THEME = "Default"
72 | AVAIL_THEMES = ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast", "Gstaff/Xkcd", "NoCrypt/Miku"]
73 |
74 | # 对话窗的高度 (仅在LAYOUT="TOP-DOWN"时生效)
75 | CHATBOT_HEIGHT = 1115
76 |
77 |
78 | # 代码高亮
79 | CODE_HIGHLIGHT = True
80 |
81 |
82 | # 窗口布局
83 | LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
84 | DARK_MODE = True # 暗色模式 / 亮色模式
85 |
86 |
87 | # 发送请求到OpenAI后,等待多久判定为超时
88 | TIMEOUT_SECONDS = 30
89 |
90 |
91 | # 网页的端口, -1代表随机端口
92 | ############FreeAI更改的部分#################
93 | WEB_PORT = 86
94 | ############################################
95 |
96 | # 如果OpenAI不响应(网络卡顿、代理失败、KEY失效),重试的次数限制
97 | MAX_RETRY = 2
98 |
99 |
100 | # 插件分类默认选项
101 | DEFAULT_FN_GROUPS = ['对话', '编程', '学术']
102 |
103 |
104 | # 模型选择是 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
105 | LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
106 | AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo",
107 | "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
108 | # P.S. 其他可用的模型还包括 ["qianfan", "llama2", "qwen", "gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613",
109 | # "spark", "sparkv2", "chatglm_onnx", "claude-1-100k", "claude-2", "internlm", "jittorllms_pangualpha", "jittorllms_llama"]
110 |
111 |
112 | # 百度千帆(LLM_MODEL="qianfan")
113 | BAIDU_CLOUD_API_KEY = ''
114 | BAIDU_CLOUD_SECRET_KEY = ''
115 | BAIDU_CLOUD_QIANFAN_MODEL = 'ERNIE-Bot' # 可选 "ERNIE-Bot"(文心一言), "ERNIE-Bot-turbo", "BLOOMZ-7B", "Llama-2-70B-Chat", "Llama-2-13B-Chat", "Llama-2-7B-Chat"
116 |
117 |
118 | # 如果使用ChatGLM2微调模型,请把 LLM_MODEL="chatglmft",并在此处指定模型路径
119 | CHATGLM_PTUNING_CHECKPOINT = "" # 例如"/home/hmp/ChatGLM2-6B/ptuning/output/6b-pt-128-1e-2/checkpoint-100"
120 |
121 |
122 | # 本地LLM模型如ChatGLM的执行方式 CPU/GPU
123 | LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
124 | LOCAL_MODEL_QUANT = "FP16" # 默认 "FP16" "INT4" 启用量化INT4版本 "INT8" 启用量化INT8版本
125 |
126 |
127 | # 设置gradio的并行线程数(不需要修改)
128 | CONCURRENT_COUNT = 100
129 |
130 |
131 | # 是否在提交时自动清空输入框
132 | AUTO_CLEAR_TXT = False
133 |
134 |
135 | # 加一个live2d装饰
136 | ADD_WAIFU = False
137 |
138 |
139 | # 设置用户名和密码(不需要修改)(相关功能不稳定,与gradio版本和网络都相关,如果本地使用不建议加这个)
140 | # [("username", "password"), ("username2", "password2"), ...]
141 | AUTHENTICATION = []
142 |
143 |
144 | # 如果需要在二级路径下运行(常规情况下,不要修改!!)(需要配合修改main.py才能生效!)
145 | CUSTOM_PATH = "/"
146 |
147 |
148 | # 极少数情况下,openai的官方KEY需要伴随组织编码(格式如org-xxxxxxxxxxxxxxxxxxxxxxxx)使用
149 | API_ORG = ""
150 |
151 |
152 | # 如果需要使用Slack Claude,使用教程详情见 request_llm/README.md
153 | SLACK_CLAUDE_BOT_ID = ''
154 | SLACK_CLAUDE_USER_TOKEN = ''
155 |
156 |
157 | # 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
158 | AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
159 | AZURE_API_KEY = "填入azure openai api的密钥" # 建议直接在API_KEY处填写,该选项即将被弃用
160 | AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
161 |
162 |
163 | # 使用Newbing
164 | NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
165 | NEWBING_COOKIES = """
166 | put your new bing cookies here
167 | """
168 |
169 |
170 | # 阿里云实时语音识别 配置难度较高 仅建议高手用户使用 参考 https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md
171 | ENABLE_AUDIO = False
172 | ALIYUN_TOKEN="" # 例如 f37f30e0f9934c34a992f6f64f7eba4f
173 | ALIYUN_APPKEY="" # 例如 RoPlZrM88DnAFkZK
174 | ALIYUN_ACCESSKEY="" # (无需填写)
175 | ALIYUN_SECRET="" # (无需填写)
176 |
177 |
178 | # 接入讯飞星火大模型 https://console.xfyun.cn/services/iat
179 | XFYUN_APPID = "00000000"
180 | XFYUN_API_SECRET = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
181 | XFYUN_API_KEY = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
182 |
183 |
184 | # Claude API KEY
185 | ANTHROPIC_API_KEY = ""
186 |
187 |
188 | # 自定义API KEY格式
189 | CUSTOM_API_KEY_PATTERN = ""
190 |
191 |
192 | # HUGGINGFACE的TOKEN,下载LLAMA时起作用 https://huggingface.co/docs/hub/security-tokens
193 | HUGGINGFACE_ACCESS_TOKEN = "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"
194 |
195 |
196 | # GROBID服务器地址(填写多个可以均衡负载),用于高质量地读取PDF文档
197 | # 获取方法:复制以下空间https://huggingface.co/spaces/qingxu98/grobid,设为public,然后GROBID_URL = "https://(你的hf用户名如qingxu98)-(你的填写的空间名如grobid).hf.space"
198 | GROBID_URLS = [
199 | "https://qingxu98-grobid.hf.space","https://qingxu98-grobid2.hf.space","https://qingxu98-grobid3.hf.space",
200 | "https://shaocongma-grobid.hf.space","https://FBR123-grobid.hf.space", "https://yeku-grobid.hf.space",
201 | ]
202 |
203 |
204 | # 是否允许通过自然语言描述修改本页的配置,该功能具有一定的危险性,默认关闭
205 | ALLOW_RESET_CONFIG = False
206 | # 临时的上传文件夹位置,请勿修改
207 | PATH_PRIVATE_UPLOAD = "private_upload"
208 | # 日志文件夹的位置,请勿修改
209 | PATH_LOGGING = "gpt_log"
210 |
211 | """
212 | 在线大模型配置关联关系示意图
213 | │
214 | ├── "gpt-3.5-turbo" 等openai模型
215 | │ ├── API_KEY
216 | │ ├── CUSTOM_API_KEY_PATTERN(不常用)
217 | │ ├── API_ORG(不常用)
218 | │ └── API_URL_REDIRECT(不常用)
219 | │
220 | ├── "azure-gpt-3.5" 等azure模型
221 | │ ├── API_KEY
222 | │ ├── AZURE_ENDPOINT
223 | │ ├── AZURE_API_KEY
224 | │ ├── AZURE_ENGINE
225 | │ └── API_URL_REDIRECT
226 | │
227 | ├── "spark" 星火认知大模型 spark & sparkv2
228 | │ ├── XFYUN_APPID
229 | │ ├── XFYUN_API_SECRET
230 | │ └── XFYUN_API_KEY
231 | │
232 | ├── "claude-1-100k" 等claude模型
233 | │ └── ANTHROPIC_API_KEY
234 | │
235 | ├── "stack-claude"
236 | │ ├── SLACK_CLAUDE_BOT_ID
237 | │ └── SLACK_CLAUDE_USER_TOKEN
238 | │
239 | ├── "qianfan" 百度千帆大模型库
240 | │ ├── BAIDU_CLOUD_QIANFAN_MODEL
241 | │ ├── BAIDU_CLOUD_API_KEY
242 | │ └── BAIDU_CLOUD_SECRET_KEY
243 | │
244 | ├── "newbing" Newbing接口不再稳定,不推荐使用
245 | ├── NEWBING_STYLE
246 | └── NEWBING_COOKIES
247 |
248 |
249 | 用户图形界面布局依赖关系示意图
250 | │
251 | ├── CHATBOT_HEIGHT 对话窗的高度
252 | ├── CODE_HIGHLIGHT 代码高亮
253 | ├── LAYOUT 窗口布局
254 | ├── DARK_MODE 暗色模式 / 亮色模式
255 | ├── DEFAULT_FN_GROUPS 插件分类默认选项
256 | ├── THEME 色彩主题
257 | ├── AUTO_CLEAR_TXT 是否在提交时自动清空输入框
258 | ├── ADD_WAIFU 加一个live2d装饰
259 | ├── ALLOW_RESET_CONFIG 是否允许通过自然语言描述修改本页的配置,该功能具有一定的危险性
260 |
261 |
262 | 插件在线服务配置依赖关系示意图
263 | │
264 | ├── 语音功能
265 | │ ├── ENABLE_AUDIO
266 | │ ├── ALIYUN_TOKEN
267 | │ ├── ALIYUN_APPKEY
268 | │ ├── ALIYUN_ACCESSKEY
269 | │ └── ALIYUN_SECRET
270 | │
271 | ├── PDF文档精准解析
272 | │ └── GROBID_URLS
273 |
274 | """
275 |
--------------------------------------------------------------------------------
/gpt_academic/requirements.txt:
--------------------------------------------------------------------------------
1 | ./docs/gradio-3.32.2-py3-none-any.whl
2 | pydantic==1.10.11
3 | tiktoken>=0.3.3
4 | requests[socks]
5 | transformers
6 | python-markdown-math
7 | beautifulsoup4
8 | prompt_toolkit
9 | latex2mathml
10 | python-docx
11 | mdtex2html
12 | anthropic
13 | colorama
14 | Markdown
15 | pygments
16 | pymupdf
17 | openai
18 | numpy
19 | arxiv
20 | rich
21 | pypdf2==2.12.1
22 | websocket-client
23 | scipdf_parser>=0.3
24 | pdfminer
25 | pdflatex
26 | apscheduler
27 |
--------------------------------------------------------------------------------
/gpt_academic/toolbox.py:
--------------------------------------------------------------------------------
1 | import markdown
2 | import importlib
3 | import time
4 | import inspect
5 | import re
6 | import os
7 | import gradio
8 | import shutil
9 | import glob
10 | from latex2mathml.converter import convert as tex2mathml
11 | from functools import wraps, lru_cache
12 | pj = os.path.join
13 |
14 | """
15 | ========================================================================
16 | 第一部分
17 | 函数插件输入输出接驳区
18 | - ChatBotWithCookies: 带Cookies的Chatbot类,为实现更多强大的功能做基础
19 | - ArgsGeneralWrapper: 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构
20 | - update_ui: 刷新界面用 yield from update_ui(chatbot, history)
21 | - CatchException: 将插件中出的所有问题显示在界面上
22 | - HotReload: 实现插件的热更新
23 | - trimmed_format_exc: 打印traceback,为了安全而隐藏绝对地址
24 | ========================================================================
25 | """
26 |
27 | class ChatBotWithCookies(list):
28 | def __init__(self, cookie):
29 | """
30 | cookies = {
31 | 'top_p': top_p,
32 | 'temperature': temperature,
33 | 'lock_plugin': bool,
34 | "files_to_promote": ["file1", "file2"],
35 | "most_recent_uploaded": {
36 | "path": "uploaded_path",
37 | "time": time.time(),
38 | "time_str": "timestr",
39 | }
40 | }
41 | """
42 | self._cookies = cookie
43 |
44 | def write_list(self, list):
45 | for t in list:
46 | self.append(t)
47 |
48 | def get_list(self):
49 | return [t for t in self]
50 |
51 | def get_cookies(self):
52 | return self._cookies
53 |
54 |
55 | def ArgsGeneralWrapper(f):
56 | """
57 | 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构。
58 | """
59 | def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
60 | txt_passon = txt
61 | if txt == "" and txt2 != "": txt_passon = txt2
62 | # 引入一个有cookie的chatbot
63 | cookies.update({
64 | 'top_p':top_p,
65 | 'api_key': cookies['api_key'],
66 | 'llm_model': llm_model,
67 | 'temperature':temperature,
68 | })
69 | llm_kwargs = {
70 | 'api_key': cookies['api_key'],
71 | 'llm_model': llm_model,
72 | 'top_p':top_p,
73 | 'max_length': max_length,
74 | 'temperature':temperature,
75 | 'client_ip': request.client.host,
76 | }
77 | plugin_kwargs = {
78 | "advanced_arg": plugin_advanced_arg,
79 | }
80 | chatbot_with_cookie = ChatBotWithCookies(cookies)
81 | chatbot_with_cookie.write_list(chatbot)
82 |
83 | if cookies.get('lock_plugin', None) is None:
84 | # 正常状态
85 | if len(args) == 0: # 插件通道
86 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, request)
87 | else: # 对话通道,或者基础功能通道
88 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
89 | else:
90 | # 处理少数情况下的特殊插件的锁定状态
91 | module, fn_name = cookies['lock_plugin'].split('->')
92 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
93 | yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, request)
94 | # 判断一下用户是否错误地通过对话通道进入,如果是,则进行提醒
95 | final_cookies = chatbot_with_cookie.get_cookies()
96 | # len(args) != 0 代表“提交”键对话通道,或者基础功能通道
97 | if len(args) != 0 and 'files_to_promote' in final_cookies and len(final_cookies['files_to_promote']) > 0:
98 | chatbot_with_cookie.append(["检测到**滞留的缓存文档**,请及时处理。", "请及时点击“**保存当前对话**”获取所有滞留文档。"])
99 | yield from update_ui(chatbot_with_cookie, final_cookies['history'], msg="检测到被滞留的缓存文档")
100 | return decorated
101 |
102 |
103 | def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
104 | """
105 | 刷新用户界面
106 | """
107 | assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
108 | cookies = chatbot.get_cookies()
109 | # 备份一份History作为记录
110 | cookies.update({'history': history})
111 | # 解决插件锁定时的界面显示问题
112 | if cookies.get('lock_plugin', None):
113 | label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
114 | chatbot_gr = gradio.update(value=chatbot, label=label)
115 | if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
116 | elif cookies.get('label', None):
117 | chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
118 | cookies['label'] = None # 清空label
119 | else:
120 | chatbot_gr = chatbot
121 |
122 | yield cookies, chatbot_gr, history, msg
123 |
124 | def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
125 | """
126 | 刷新用户界面
127 | """
128 | if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
129 | chatbot[-1] = list(chatbot[-1])
130 | chatbot[-1][-1] = lastmsg
131 | yield from update_ui(chatbot=chatbot, history=history)
132 | time.sleep(delay)
133 |
134 |
135 | def trimmed_format_exc():
136 | import os, traceback
137 | str = traceback.format_exc()
138 | current_path = os.getcwd()
139 | replace_path = "."
140 | return str.replace(current_path, replace_path)
141 |
142 | def CatchException(f):
143 | """
144 | 装饰器函数,捕捉函数f中的异常并封装到一个生成器中返回,并显示到聊天当中。
145 | """
146 |
147 | @wraps(f)
148 | def decorated(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs):
149 | try:
150 | yield from f(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs)
151 | except Exception as e:
152 | from check_proxy import check_proxy
153 | from toolbox import get_conf
154 | proxies, = get_conf('proxies')
155 | tb_str = '```\n' + trimmed_format_exc() + '```'
156 | if len(chatbot_with_cookie) == 0:
157 | chatbot_with_cookie.clear()
158 | chatbot_with_cookie.append(["插件调度异常", "异常原因"])
159 | chatbot_with_cookie[-1] = (chatbot_with_cookie[-1][0],
160 | f"[Local Message] 实验性函数调用出错: \n\n{tb_str} \n\n当前代理可用性: \n\n{check_proxy(proxies)}")
161 | yield from update_ui(chatbot=chatbot_with_cookie, history=history, msg=f'异常 {e}') # 刷新界面
162 | return decorated
163 |
164 |
165 | def HotReload(f):
166 | """
167 | HotReload的装饰器函数,用于实现Python函数插件的热更新。
168 | 函数热更新是指在不停止程序运行的情况下,更新函数代码,从而达到实时更新功能。
169 | 在装饰器内部,使用wraps(f)来保留函数的元信息,并定义了一个名为decorated的内部函数。
170 | 内部函数通过使用importlib模块的reload函数和inspect模块的getmodule函数来重新加载并获取函数模块,
171 | 然后通过getattr函数获取函数名,并在新模块中重新加载函数。
172 | 最后,使用yield from语句返回重新加载过的函数,并在被装饰的函数上执行。
173 | 最终,装饰器函数返回内部函数。这个内部函数可以将函数的原始定义更新为最新版本,并执行函数的新版本。
174 | """
175 | @wraps(f)
176 | def decorated(*args, **kwargs):
177 | fn_name = f.__name__
178 | f_hot_reload = getattr(importlib.reload(inspect.getmodule(f)), fn_name)
179 | yield from f_hot_reload(*args, **kwargs)
180 | return decorated
181 |
182 |
183 | """
184 | ========================================================================
185 | 第二部分
186 | 其他小工具:
187 | - write_history_to_file: 将结果写入markdown文件中
188 | - regular_txt_to_markdown: 将普通文本转换为Markdown格式的文本。
189 | - report_execption: 向chatbot中添加简单的意外错误信息
190 | - text_divide_paragraph: 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
191 | - markdown_convertion: 用多种方式组合,将markdown转化为好看的html
192 | - format_io: 接管gradio默认的markdown处理方式
193 | - on_file_uploaded: 处理文件的上传(自动解压)
194 | - on_report_generated: 将生成的报告自动投射到文件上传区
195 | - clip_history: 当历史上下文过长时,自动截断
196 | - get_conf: 获取设置
197 | - select_api_key: 根据当前的模型类别,抽取可用的api-key
198 | ========================================================================
199 | """
200 |
201 | def get_reduce_token_percent(text):
202 | """
203 | * 此函数未来将被弃用
204 | """
205 | try:
206 | # text = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
207 | pattern = r"(\d+)\s+tokens\b"
208 | match = re.findall(pattern, text)
209 | EXCEED_ALLO = 500 # 稍微留一点余地,否则在回复时会因余量太少出问题
210 | max_limit = float(match[0]) - EXCEED_ALLO
211 | current_tokens = float(match[1])
212 | ratio = max_limit/current_tokens
213 | assert ratio > 0 and ratio < 1
214 | return ratio, str(int(current_tokens-max_limit))
215 | except:
216 | return 0.5, '不详'
217 |
218 |
219 | def write_history_to_file(history, file_basename=None, file_fullname=None, auto_caption=True):
220 | """
221 | 将对话记录history以Markdown格式写入文件中。如果没有指定文件名,则使用当前时间生成文件名。
222 | """
223 | import os
224 | import time
225 | if file_fullname is None:
226 | if file_basename is not None:
227 | file_fullname = pj(get_log_folder(), file_basename)
228 | else:
229 | file_fullname = pj(get_log_folder(), f'GPT-Academic-{gen_time_str()}.md')
230 | os.makedirs(os.path.dirname(file_fullname), exist_ok=True)
231 | with open(file_fullname, 'w', encoding='utf8') as f:
232 | f.write('# GPT-Academic Report\n')
233 | for i, content in enumerate(history):
234 | try:
235 | if type(content) != str: content = str(content)
236 | except:
237 | continue
238 | if i % 2 == 0 and auto_caption:
239 | f.write('## ')
240 | try:
241 | f.write(content)
242 | except:
243 | # remove everything that cannot be handled by utf8
244 | f.write(content.encode('utf-8', 'ignore').decode())
245 | f.write('\n\n')
246 | res = os.path.abspath(file_fullname)
247 | return res
248 |
249 |
250 | def regular_txt_to_markdown(text):
251 | """
252 | 将普通文本转换为Markdown格式的文本。
253 | """
254 | text = text.replace('\n', '\n\n')
255 | text = text.replace('\n\n\n', '\n\n')
256 | text = text.replace('\n\n\n', '\n\n')
257 | return text
258 |
259 |
260 |
261 |
262 | def report_execption(chatbot, history, a, b):
263 | """
264 | 向chatbot中添加错误信息
265 | """
266 | chatbot.append((a, b))
267 | history.extend([a, b])
268 |
269 |
270 | def text_divide_paragraph(text):
271 | """
272 | 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
273 | """
274 | pre = '<div class="markdown-body">'
275 | suf = '</div>'
276 | if text.startswith(pre) and text.endswith(suf):
277 | return text
278 |
279 | if '```' in text:
280 | # careful input
281 | return pre + text + suf
282 | else:
283 | # wtf input
284 | lines = text.split("\n")
285 | for i, line in enumerate(lines):
286 | lines[i] = lines[i].replace(" ", " ")
287 | text = "".join(lines)
288 | return pre + text + suf
289 |
290 |
291 | @lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
292 | def markdown_convertion(txt):
293 | """
294 | 将Markdown格式的文本转换为HTML格式。如果包含数学公式,则先将公式转换为HTML格式。
295 | """
296 | pre = '<div class="markdown-body">'
297 | suf = '</div>'
298 | if txt.startswith(pre) and txt.endswith(suf):
299 | # print('警告,输入了已经经过转化的字符串,二次转化可能出问题')
300 | return txt # 已经被转化过,不需要再次转化
301 |
302 | markdown_extension_configs = {
303 | 'mdx_math': {
304 | 'enable_dollar_delimiter': True,
305 | 'use_gitlab_delimiters': False,
306 | },
307 | }
307 | find_equation_pattern = r'<script type="math/tex(?:.*?)>(.*?)</script>'
342 | content = content.replace('</script>\n</script>', '</script>')
343 | return content
344 |
345 | def is_equation(txt):
346 | """
347 | 判定是否为公式 | 测试1 写出洛伦兹定律,使用tex格式公式 测试2 给出柯西不等式,使用latex格式 测试3 写出麦克斯韦方程组
348 | """
349 | if '```' in txt and '```reference' not in txt: return False
350 | if '$' not in txt and '\\[' not in txt: return False
351 | mathpatterns = {
352 | r'(?<!\\|\$)(\$)([^\$]+)(\$)': {'allow_multi_lines': False},  # $...$
509 | if file_created_time >= one_minute_ago:
510 | if os.path.isdir(file_path):
511 | continue
512 | recent_files.append(file_path)
513 |
514 | return recent_files
515 |
516 | def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
517 | # 将文件复制一份到下载区
518 | import shutil
519 | if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
520 | new_path = pj(get_log_folder(), rename_file)
521 | # 如果已经存在,先删除
522 | if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
523 | # 把文件复制过去
524 | if not os.path.exists(new_path): shutil.copyfile(file, new_path)
525 | # 将文件添加到chatbot cookie中,避免多用户干扰
526 | if chatbot:
527 | if 'files_to_promote' in chatbot._cookies: current = chatbot._cookies['files_to_promote']
528 | else: current = []
529 | chatbot._cookies.update({'files_to_promote': [new_path] + current})
530 |
531 | def disable_auto_promotion(chatbot):
532 | chatbot._cookies.update({'files_to_promote': []})
533 | return
534 |
535 | def is_the_upload_folder(string):
536 | PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
537 | pattern = r'^PATH_PRIVATE_UPLOAD/[A-Za-z0-9_-]+/\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$'
538 | pattern = pattern.replace('PATH_PRIVATE_UPLOAD', PATH_PRIVATE_UPLOAD)
539 | if re.match(pattern, string): return True
540 | else: return False
541 |
542 | def del_outdated_uploads(outdate_time_seconds):
543 | PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
544 | current_time = time.time()
545 | one_hour_ago = current_time - outdate_time_seconds
546 | # Get a list of all subdirectories in the PATH_PRIVATE_UPLOAD folder
547 | # Remove subdirectories that are older than one hour
548 | for subdirectory in glob.glob(f'{PATH_PRIVATE_UPLOAD}/*/*'):
549 | subdirectory_time = os.path.getmtime(subdirectory)
550 | if subdirectory_time < one_hour_ago:
551 | try: shutil.rmtree(subdirectory)
552 | except: pass
553 | return
554 |
555 | def on_file_uploaded(request: gradio.Request, files, chatbot, txt, txt2, checkboxes, cookies):
556 | """
557 | 当文件被上传时的回调函数
558 | """
559 | if len(files) == 0:
560 | return chatbot, txt
561 |
562 | # 移除过时的旧文件从而节省空间&保护隐私
563 | outdate_time_seconds = 60
564 | del_outdated_uploads(outdate_time_seconds)
565 |
566 | # 创建工作路径
567 | user_name = "default" if not request.username else request.username
568 | time_tag = gen_time_str()
569 | PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
570 | target_path_base = pj(PATH_PRIVATE_UPLOAD, user_name, time_tag)
571 | os.makedirs(target_path_base, exist_ok=True)
572 |
573 | # 逐个文件转移到目标路径
574 | upload_msg = ''
575 | for file in files:
576 | file_origin_name = os.path.basename(file.orig_name)
577 | this_file_path = pj(target_path_base, file_origin_name)
578 | shutil.move(file.name, this_file_path)
579 | upload_msg += extract_archive(file_path=this_file_path, dest_dir=this_file_path+'.extract')
580 |
581 | # 整理文件集合
582 | moved_files = [fp for fp in glob.glob(f'{target_path_base}/**/*', recursive=True)]
583 | if "底部输入区" in checkboxes:
584 | txt, txt2 = "", target_path_base
585 | else:
586 | txt, txt2 = target_path_base, ""
587 |
588 | # 输出消息
589 | moved_files_str = '\t\n\n'.join(moved_files)
590 | chatbot.append(['我上传了文件,请查收',
591 | f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
592 | f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
593 | f'\n\n现在您点击任意函数插件时,以上文件将被作为输入参数'+upload_msg])
594 |
595 | # 记录近期文件
596 | cookies.update({
597 | 'most_recent_uploaded': {
598 | 'path': target_path_base,
599 | 'time': time.time(),
600 | 'time_str': time_tag
601 | }})
602 | return chatbot, txt, txt2, cookies
603 |
604 |
605 | def on_report_generated(cookies, files, chatbot):
606 | from toolbox import find_recent_files
607 | PATH_LOGGING, = get_conf('PATH_LOGGING')
608 | if 'files_to_promote' in cookies:
609 | report_files = cookies['files_to_promote']
610 | cookies.pop('files_to_promote')
611 | else:
612 | report_files = find_recent_files(PATH_LOGGING)
613 | if len(report_files) == 0:
614 | return cookies, None, chatbot
615 | # files.extend(report_files)
616 | file_links = ''
617 | for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
618 | chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
619 | return cookies, report_files, chatbot
620 |
621 | def load_chat_cookies():
622 | API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
623 | if is_any_api_key(AZURE_API_KEY):
624 | if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
625 | else: API_KEY = AZURE_API_KEY
626 | return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
627 |
628 | def is_openai_api_key(key):
629 | CUSTOM_API_KEY_PATTERN, = get_conf('CUSTOM_API_KEY_PATTERN')
630 | if len(CUSTOM_API_KEY_PATTERN) != 0:
631 | API_MATCH_ORIGINAL = re.match(CUSTOM_API_KEY_PATTERN, key)
632 | else:
633 | API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
634 | return bool(API_MATCH_ORIGINAL)
635 |
636 | def is_azure_api_key(key):
637 | API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
638 | return bool(API_MATCH_AZURE)
639 |
640 | def is_api2d_key(key):
641 | API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
642 | return bool(API_MATCH_API2D)
643 |
644 | ############FreeAI更改的部分#################
645 | def is_freeai_api_key(key): # new add: matches Pandora Share/Pool Tokens (fk-/pk- plus 43 characters)
646 | API_MATCH_FREEAI0 = re.match(r"pk-[a-zA-Z0-9-_]{43}$", key)
647 | API_MATCH_FREEAI1 = re.match(r"fk-[a-zA-Z0-9-_]{43}$", key)
648 | return bool(API_MATCH_FREEAI0) or bool(API_MATCH_FREEAI1)
649 |
650 | def is_any_api_key(key):
651 | if ',' in key:
652 | keys = key.split(',')
653 | for k in keys:
654 | if is_any_api_key(k): return True
655 | return False
656 | else:#new add
657 | return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key) or is_freeai_api_key(key)
658 |
659 | def what_keys(keys):
660 | avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0, "FreeAI Key":0} # FreeAI Key entry initialized to avoid a KeyError below
661 | key_list = keys.split(',')
662 |
663 | for k in key_list:
664 | if is_openai_api_key(k):
665 | avail_key_list['OpenAI Key'] += 1
666 |
667 | for k in key_list:
668 | if is_api2d_key(k):
669 | avail_key_list['API2D Key'] += 1
670 |
671 | for k in key_list:
672 | if is_azure_api_key(k):
673 | avail_key_list['Azure Key'] += 1
674 |
675 | for k in key_list: # new add
676 | if is_freeai_api_key(k):
677 | avail_key_list['FreeAI Key'] += 1
678 | # new add
679 | return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']} 个, FreeAI Key {avail_key_list['FreeAI Key']} 个"
680 |
681 | def select_api_key(keys, llm_model):
682 | import random
683 | avail_key_list = []
684 | key_list = keys.split(',')
685 |
686 | if llm_model.startswith('gpt-'):
687 | for k in key_list:
688 | if is_openai_api_key(k): avail_key_list.append(k)
689 | for k in key_list:# new add
690 | if is_freeai_api_key(k): avail_key_list.append(k)
691 |
692 | if llm_model.startswith('api2d-'):
693 | for k in key_list:
694 | if is_api2d_key(k): avail_key_list.append(k)
695 |
696 | if llm_model.startswith('azure-'):
697 | for k in key_list:
698 | if is_azure_api_key(k): avail_key_list.append(k)
699 |
700 | if len(avail_key_list) == 0:
701 | raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源(右下角更换模型菜单中可切换openai,azure,claude,api2d等请求源)。")
702 |
703 | api_key = random.choice(avail_key_list) # 随机负载均衡
704 | return api_key
705 |
706 | ###########################################
707 |
708 |
709 | def read_env_variable(arg, default_value):
710 | """
711 | 环境变量可以是 `GPT_ACADEMIC_CONFIG`(优先),也可以直接是`CONFIG`
712 | 例如在windows cmd中,既可以写:
713 | set USE_PROXY=True
714 | set API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
715 | set proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
716 | set AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
717 | set AUTHENTICATION=[("username", "password"), ("username2", "password2")]
718 | 也可以写:
719 | set GPT_ACADEMIC_USE_PROXY=True
720 | set GPT_ACADEMIC_API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
721 | set GPT_ACADEMIC_proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
722 | set GPT_ACADEMIC_AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
723 | set GPT_ACADEMIC_AUTHENTICATION=[("username", "password"), ("username2", "password2")]
724 | """
725 | from colorful import print亮红, print亮绿
726 | arg_with_prefix = "GPT_ACADEMIC_" + arg
727 | if arg_with_prefix in os.environ:
728 | env_arg = os.environ[arg_with_prefix]
729 | elif arg in os.environ:
730 | env_arg = os.environ[arg]
731 | else:
732 | raise KeyError
733 | print(f"[ENV_VAR] 尝试加载{arg},默认值:{default_value} --> 修正值:{env_arg}")
734 | try:
735 | if isinstance(default_value, bool):
736 | env_arg = env_arg.strip()
737 | if env_arg == 'True': r = True
738 | elif env_arg == 'False': r = False
739 | else: print('enter True or False, but have:', env_arg); r = default_value
740 | elif isinstance(default_value, int):
741 | r = int(env_arg)
742 | elif isinstance(default_value, float):
743 | r = float(env_arg)
744 | elif isinstance(default_value, str):
745 | r = env_arg.strip()
746 | elif isinstance(default_value, dict):
747 | r = eval(env_arg)
748 | elif isinstance(default_value, list):
749 | r = eval(env_arg)
750 | elif default_value is None:
751 | assert arg == "proxies"
752 | r = eval(env_arg)
753 | else:
754 | print亮红(f"[ENV_VAR] 环境变量{arg}不支持通过环境变量设置! ")
755 | raise KeyError
756 | except:
757 | print亮红(f"[ENV_VAR] 环境变量{arg}加载失败! ")
758 | raise KeyError(f"[ENV_VAR] 环境变量{arg}加载失败! ")
759 |
760 | print亮绿(f"[ENV_VAR] 成功读取环境变量{arg}")
761 | return r
762 |
763 | @lru_cache(maxsize=128)
764 | def read_single_conf_with_lru_cache(arg):
765 | from colorful import print亮红, print亮绿, print亮蓝
766 | try:
767 | # 优先级1. 获取环境变量作为配置
768 | default_ref = getattr(importlib.import_module('config'), arg) # 读取默认值作为数据类型转换的参考
769 | r = read_env_variable(arg, default_ref)
770 | except:
771 | try:
772 | # 优先级2. 获取config_private中的配置
773 | r = getattr(importlib.import_module('config_private'), arg)
774 | except:
775 | # 优先级3. 获取config中的配置
776 | r = getattr(importlib.import_module('config'), arg)
777 |
778 | # 在读取API_KEY时,检查一下是不是忘了改config
779 | if arg == 'API_KEY':
780 | print亮蓝(f"[API_KEY] 本项目现已支持OpenAI和Azure的api-key。也支持同时填写多个api-key,如API_KEY=\"openai-key1,openai-key2,azure-key3\"")
781 | print亮蓝(f"[API_KEY] 您既可以在config.py中修改api-key(s),也可以在问题输入区输入临时的api-key(s),然后回车键提交后即可生效。")
782 | if is_any_api_key(r):
783 | print亮绿(f"[API_KEY] 您的 API_KEY 是: {r[:15]}*** API_KEY 导入成功")
784 | else:
785 | print亮红( "[API_KEY] 您的 API_KEY 不满足任何一种已知的密钥格式,请在config文件中修改API密钥之后再运行。")
786 | if arg == 'proxies':
787 | if not read_single_conf_with_lru_cache('USE_PROXY'): r = None # 检查USE_PROXY,防止proxies单独起作用
788 | if r is None:
789 | print亮红('[PROXY] 网络代理状态:未配置。无代理状态下很可能无法访问OpenAI家族的模型。建议:检查USE_PROXY选项是否修改。')
790 | else:
791 | print亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', r)
792 | assert isinstance(r, dict), 'proxies格式错误,请注意proxies选项的格式,不要遗漏括号。'
793 | return r
794 |
795 |
796 | @lru_cache(maxsize=128)
797 | def get_conf(*args):
798 | # 建议您复制一个config_private.py放自己的秘密, 如API和代理网址, 避免不小心传github被别人看到
799 | res = []
800 | for arg in args:
801 | r = read_single_conf_with_lru_cache(arg)
802 | res.append(r)
803 | return res
804 |
805 |
806 | def clear_line_break(txt):
807 | txt = txt.replace('\n', ' ')
808 | txt = txt.replace(' ', ' ')
809 | txt = txt.replace(' ', ' ')
810 | return txt
811 |
812 |
813 | class DummyWith():
814 | """
815 | 这段代码定义了一个名为DummyWith的空上下文管理器,
816 | 它的作用是……额……就是不起作用,即在代码结构不变得情况下取代其他的上下文管理器。
817 | 上下文管理器是一种Python对象,用于与with语句一起使用,
818 | 以确保一些资源在代码块执行期间得到正确的初始化和清理。
819 | 上下文管理器必须实现两个方法,分别为 __enter__()和 __exit__()。
820 | 在上下文执行开始的情况下,__enter__()方法会在代码块被执行前被调用,
821 | 而在上下文执行结束时,__exit__()方法则会被调用。
822 | """
823 | def __enter__(self):
824 | return self
825 |
826 | def __exit__(self, exc_type, exc_value, traceback):
827 | return
828 |
829 | def run_gradio_in_subpath(demo, auth, port, custom_path):
830 | """
831 | 把gradio的运行地址更改到指定的二次路径上
832 | """
833 | def is_path_legal(path: str)->bool:
834 | '''
835 | check path for sub url
836 | path: path to check
837 | return value: do sub url wrap
838 | '''
839 | if path == "/": return True
840 | if len(path) == 0:
841 | print("illegal custom path: {}\npath must not be empty\ndeploy on root url".format(path))
842 | return False
843 | if path[0] == '/':
844 | if path[1] != '/':
845 | print("deploy on sub-path {}".format(path))
846 | return True
847 | return False
848 | print("illegal custom path: {}\npath should begin with \'/\'\ndeploy on root url".format(path))
849 | return False
850 |
851 | if not is_path_legal(custom_path): raise RuntimeError('Illegal custom path')
852 | import uvicorn
853 | import gradio as gr
854 | from fastapi import FastAPI
855 | app = FastAPI()
856 | if custom_path != "/":
857 | @app.get("/")
858 | def read_main():
859 | return {"message": f"Gradio is running at: {custom_path}"}
860 | app = gr.mount_gradio_app(app, demo, path=custom_path)
861 | uvicorn.run(app, host="0.0.0.0", port=port) # , auth=auth
862 |
863 |
864 | def clip_history(inputs, history, tokenizer, max_token_limit):
865 | """
866 | reduce the length of history by clipping.
867 | this function search for the longest entries to clip, little by little,
868 | until the number of token of history is reduced under threshold.
869 | 通过裁剪来缩短历史记录的长度。
870 | 此函数逐渐地搜索最长的条目进行剪辑,
871 | 直到历史记录的标记数量降低到阈值以下。
872 | """
873 | import numpy as np
874 | from request_llm.bridge_all import model_info
875 | def get_token_num(txt):
876 | return len(tokenizer.encode(txt, disallowed_special=()))
877 | input_token_num = get_token_num(inputs)
878 | if input_token_num < max_token_limit * 3 / 4:
879 | # 当输入部分的token占比小于限制的3/4时,裁剪时
880 | # 1. 把input的余量留出来
881 | max_token_limit = max_token_limit - input_token_num
882 | # 2. 把输出用的余量留出来
883 | max_token_limit = max_token_limit - 128
884 | # 3. 如果余量太小了,直接清除历史
885 | if max_token_limit < 128:
886 | history = []
887 | return history
888 | else:
889 | # 当输入部分的token占比 > 限制的3/4时,直接清除历史
890 | history = []
891 | return history
892 |
893 | everything = ['']
894 | everything.extend(history)
895 | n_token = get_token_num('\n'.join(everything))
896 | everything_token = [get_token_num(e) for e in everything]
897 |
898 | # 截断时的颗粒度
899 | delta = max(everything_token) // 16
900 |
901 | while n_token > max_token_limit:
902 | where = np.argmax(everything_token)
903 | encoded = tokenizer.encode(everything[where], disallowed_special=())
904 | clipped_encoded = encoded[:len(encoded)-delta]
905 | everything[where] = tokenizer.decode(clipped_encoded)[:-1] # -1 to remove the may-be illegal char
906 | everything_token[where] = get_token_num(everything[where])
907 | n_token = get_token_num('\n'.join(everything))
908 |
909 | history = everything[1:]
910 | return history
911 |
912 | """
913 | ========================================================================
914 | 第三部分
915 | 其他小工具:
916 | - zip_folder: 把某个路径下所有文件压缩,然后转移到指定的另一个路径中(gpt写的)
917 | - gen_time_str: 生成时间戳
918 | - ProxyNetworkActivate: 临时地启动代理网络(如果有)
919 | - objdump/objload: 快捷的调试函数
920 | ========================================================================
921 | """
922 |
923 | def zip_folder(source_folder, dest_folder, zip_name):
924 | import zipfile
925 | import os
926 | # Make sure the source folder exists
927 | if not os.path.exists(source_folder):
928 | print(f"{source_folder} does not exist")
929 | return
930 |
931 | # Make sure the destination folder exists
932 | if not os.path.exists(dest_folder):
933 | print(f"{dest_folder} does not exist")
934 | return
935 |
936 | # Create the name for the zip file
937 | zip_file = pj(dest_folder, zip_name)
938 |
939 | # Create a ZipFile object
940 | with zipfile.ZipFile(zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
941 | # Walk through the source folder and add files to the zip file
942 | for foldername, subfolders, filenames in os.walk(source_folder):
943 | for filename in filenames:
944 | filepath = pj(foldername, filename)
945 | zipf.write(filepath, arcname=os.path.relpath(filepath, source_folder))
946 |
947 | # Move the zip file to the destination folder (if it wasn't already there)
948 | if os.path.dirname(zip_file) != dest_folder:
949 | os.rename(zip_file, pj(dest_folder, os.path.basename(zip_file)))
950 | zip_file = pj(dest_folder, os.path.basename(zip_file))
951 |
952 | print(f"Zip file created at {zip_file}")
953 |
954 | def zip_result(folder):
955 | t = gen_time_str()
956 | zip_folder(folder, get_log_folder(), f'{t}-result.zip')
957 | return pj(get_log_folder(), f'{t}-result.zip')
958 |
959 | def gen_time_str():
960 | import time
961 | return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
962 |
963 | def get_log_folder(user='default', plugin_name='shared'):
964 | PATH_LOGGING, = get_conf('PATH_LOGGING')
965 | _dir = pj(PATH_LOGGING, user, plugin_name)
966 | if not os.path.exists(_dir): os.makedirs(_dir)
967 | return _dir
968 |
969 | class ProxyNetworkActivate():
970 | """
971 | 这段代码定义了一个名为TempProxy的空上下文管理器, 用于给一小段代码上代理
972 | """
973 | def __enter__(self):
974 | from toolbox import get_conf
975 | proxies, = get_conf('proxies')
976 | if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
977 | if proxies is not None:
978 | if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
979 | if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
980 | return self
981 |
982 | def __exit__(self, exc_type, exc_value, traceback):
983 | os.environ['no_proxy'] = '*'
984 | if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
985 | if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
986 | return
987 |
988 | def objdump(obj, file='objdump.tmp'):
989 | import pickle
990 | with open(file, 'wb+') as f:
991 | pickle.dump(obj, f)
992 | return
993 |
994 | def objload(file='objdump.tmp'):
995 | import pickle, os
996 | if not os.path.exists(file):
997 | return
998 | with open(file, 'rb') as f:
999 | return pickle.load(f)
1000 |
1001 | def Singleton(cls):
1002 | """
1003 | 一个单实例装饰器
1004 | """
1005 | _instance = {}
1006 |
1007 | def _singleton(*args, **kargs):
1008 | if cls not in _instance:
1009 | _instance[cls] = cls(*args, **kargs)
1010 | return _instance[cls]
1011 |
1012 | return _singleton
1013 |
1014 | """
1015 | ========================================================================
1016 | 第四部分
1017 | 接驳虚空终端:
1018 | - set_conf: 在运行过程中动态地修改配置
1019 | - set_multi_conf: 在运行过程中动态地修改多个配置
1020 | - get_plugin_handle: 获取插件的句柄
1021 | - get_plugin_default_kwargs: 获取插件的默认参数
1022 | - get_chat_handle: 获取简单聊天的句柄
1023 | - get_chat_default_kwargs: 获取简单聊天的默认参数
1024 | ========================================================================
1025 | """
1026 |
1027 | def set_conf(key, value):
1028 | from toolbox import read_single_conf_with_lru_cache, get_conf
1029 | read_single_conf_with_lru_cache.cache_clear()
1030 | get_conf.cache_clear()
1031 | os.environ[key] = str(value)
1032 | altered, = get_conf(key)
1033 | return altered
1034 |
1035 | def set_multi_conf(dic):
1036 | for k, v in dic.items(): set_conf(k, v)
1037 | return
1038 |
1039 | def get_plugin_handle(plugin_name):
1040 | """
1041 | e.g. plugin_name = 'crazy_functions.批量Markdown翻译->Markdown翻译指定语言'
1042 | """
1043 | import importlib
1044 | assert '->' in plugin_name, \
1045 | "Example of plugin_name: crazy_functions.批量Markdown翻译->Markdown翻译指定语言"
1046 | module, fn_name = plugin_name.split('->')
1047 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
1048 | return f_hot_reload
1049 |
1050 | def get_chat_handle():
1051 | """
1052 | """
1053 | from request_llm.bridge_all import predict_no_ui_long_connection
1054 | return predict_no_ui_long_connection
1055 |
1056 | def get_plugin_default_kwargs():
1057 | """
1058 | """
1059 | from toolbox import get_conf, ChatBotWithCookies
1060 |
1061 | WEB_PORT, LLM_MODEL, API_KEY = \
1062 | get_conf('WEB_PORT', 'LLM_MODEL', 'API_KEY')
1063 |
1064 | llm_kwargs = {
1065 | 'api_key': API_KEY,
1066 | 'llm_model': LLM_MODEL,
1067 | 'top_p':1.0,
1068 | 'max_length': None,
1069 | 'temperature':1.0,
1070 | }
1071 | chatbot = ChatBotWithCookies(llm_kwargs)
1072 |
1073 | # txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port
1074 | DEFAULT_FN_GROUPS_kwargs = {
1075 | "main_input": "./README.md",
1076 | "llm_kwargs": llm_kwargs,
1077 | "plugin_kwargs": {},
1078 | "chatbot_with_cookie": chatbot,
1079 | "history": [],
1080 | "system_prompt": "You are a good AI.",
1081 | "web_port": WEB_PORT
1082 | }
1083 | return DEFAULT_FN_GROUPS_kwargs
1084 |
1085 | def get_chat_default_kwargs():
1086 | """
1087 | """
1088 | from toolbox import get_conf
1089 |
1090 | LLM_MODEL, API_KEY = get_conf('LLM_MODEL', 'API_KEY')
1091 |
1092 | llm_kwargs = {
1093 | 'api_key': API_KEY,
1094 | 'llm_model': LLM_MODEL,
1095 | 'top_p':1.0,
1096 | 'max_length': None,
1097 | 'temperature':1.0,
1098 | }
1099 |
1100 | default_chat_kwargs = {
1101 | "inputs": "Hello there, are you ready?",
1102 | "llm_kwargs": llm_kwargs,
1103 | "history": [],
1104 | "sys_prompt": "You are AI assistant",
1105 | "observe_window": None,
1106 | "console_slience": False,
1107 | }
1108 |
1109 | return default_chat_kwargs
1110 |
1111 |
--------------------------------------------------------------------------------
/images/error/pandora_public_pool_token.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/elphen-wang/FreeAI/25cd8bfff04607fc73a464bfcff948d102b65c4f/images/error/pandora_public_pool_token.png
--------------------------------------------------------------------------------
/old/3.43/config.py:
--------------------------------------------------------------------------------
1 | """
2 | 以下所有配置也都支持利用环境变量覆写,环境变量配置格式见docker-compose.yml。
3 | 读取优先级:环境变量 > config_private.py > config.py
4 | --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
5 | All the following configurations also support using environment variables to override,
6 | and the environment variable configuration format can be seen in docker-compose.yml.
7 | Configuration reading priority: environment variable > config_private.py > config.py
8 | """
9 |
10 | # [step 1]>> API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY'。极少数情况下,还需要填写组织(格式如org-123456789abcdefghijklmno的),请向下翻,找 API_ORG 设置项
11 | API_KEY = "此处填API密钥" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY'
12 |
13 | ####################定时更新API_KEY的部分####################
14 | import requests
15 | from apscheduler.schedulers.background import BackgroundScheduler
16 | ELPHEN_URL="https://api.elphen.site/api?mode=default_my_poolkey"
17 | def GET_API_KEY(url=ELPHEN_URL):
18 | response = requests.get(url)
19 | global API_KEY
20 | if response.status_code == 200:
21 | API_KEY=response.json()
22 | else:
23 | API_KEY="pk-this-is-a-real-free-pool-token-for-everyone"
24 | return API_KEY
25 |
26 | API_KEY=GET_API_KEY(ELPHEN_URL)
27 | # 创建定时任务的线程
28 | scheduler = BackgroundScheduler()
29 | scheduler.add_job(GET_API_KEY, trigger='cron', hour=5, minute=8)#定时每天凌晨5点8分更新一次FreeAI Key
30 | # 启动定时任务的调度器
31 | scheduler.start()
32 | #print(API_KEY)
33 | ###########################################################
34 |
35 | # [step 2]>> 改为True应用代理,如果直接在海外服务器部署,此处不修改
36 | USE_PROXY = False
37 | if USE_PROXY:
38 | """
39 | 填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
40 | <配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
41 | [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
42 | [地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了(localhost意思是代理软件安装在本机上)
43 | [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
44 | """
45 | # 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
46 | proxies = {
47 | # [协议]:// [地址] :[端口]
48 | "http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
49 | "https": "socks5h://localhost:11284", # 再例如 "https": "http://127.0.0.1:7890",
50 | }
51 | else:
52 | proxies = None
53 |
54 | # ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
55 |
56 | # URL重定向,实现更换API_URL的作用(高危设置! 常规情况下不要修改! 通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人!)
57 | # 格式: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
58 | # 举例: API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "https://reverse-proxy-url/v1/chat/completions"}
59 | API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://ai.fakeopen.com/v1/chat/completions"}
60 |
61 |
62 | # 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
63 | # 一言以蔽之:免费(5刀)用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询:https://platform.openai.com/docs/guides/rate-limits/overview
64 | DEFAULT_WORKER_NUM = 3
65 |
66 |
67 | # 对话窗的高度
68 | CHATBOT_HEIGHT = 1115
69 |
70 |
71 | # 代码高亮
72 | CODE_HIGHLIGHT = True
73 |
74 |
75 | # 窗口布局
76 | LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
77 | DARK_MODE = True # 暗色模式 / 亮色模式
78 |
79 |
80 | # 发送请求到OpenAI后,等待多久判定为超时
81 | TIMEOUT_SECONDS = 30
82 |
83 |
84 | # 网页的端口, -1代表随机端口
85 | WEB_PORT = 86
86 |
87 |
88 | # 如果OpenAI不响应(网络卡顿、代理失败、KEY失效),重试的次数限制
89 | MAX_RETRY = 2
90 |
91 |
92 | # 模型选择 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
93 | LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
94 | AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
95 | # P.S. 其他可用的模型还包括 ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "claude-1-100k", "claude-2", "internlm", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
96 |
97 |
98 | # ChatGLM(2) Finetune Model Path (如果使用ChatGLM2微调模型,需要把"chatglmft"加入AVAIL_LLM_MODELS中)
99 | ChatGLM_PTUNING_CHECKPOINT = "" # 例如"/home/hmp/ChatGLM2-6B/ptuning/output/6b-pt-128-1e-2/checkpoint-100"
100 |
101 |
102 | # 本地LLM模型如ChatGLM的执行方式 CPU/GPU
103 | LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
104 | LOCAL_MODEL_QUANT = "FP16" # 默认 "FP16" "INT4" 启用量化INT4版本 "INT8" 启用量化INT8版本
105 |
106 |
107 | # 设置gradio的并行线程数(不需要修改)
108 | CONCURRENT_COUNT = 100
109 |
110 |
111 | # 是否在提交时自动清空输入框
112 | AUTO_CLEAR_TXT = False
113 |
114 |
115 | # 色彩主题,可选 ["Default", "Chuanhu-Small-and-Beautiful"]
116 | THEME = "Default"
117 |
118 |
119 | # 加一个live2d装饰
120 | ADD_WAIFU = False
121 |
122 |
123 | # 设置用户名和密码(不需要修改)(相关功能不稳定,与gradio版本和网络都相关,如果本地使用不建议加这个)
124 | # [("username", "password"), ("username2", "password2"), ...]
125 | AUTHENTICATION = []
126 |
127 |
128 | # 如果需要在二级路径下运行(常规情况下,不要修改!!)(需要配合修改main.py才能生效!)
129 | CUSTOM_PATH = "/"
130 |
131 |
132 | # 极少数情况下,openai的官方KEY需要伴随组织编码(格式如org-xxxxxxxxxxxxxxxxxxxxxxxx)使用
133 | API_ORG = ""
134 |
135 |
136 | # 如果需要使用Slack Claude,使用教程详情见 request_llm/README.md
137 | SLACK_CLAUDE_BOT_ID = ''
138 | SLACK_CLAUDE_USER_TOKEN = ''
139 |
140 |
141 | # 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
142 | AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
143 | AZURE_API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY' # 建议直接在API_KEY处填写,该选项即将被弃用
144 | AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
145 |
146 |
147 | # 使用Newbing
148 | NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
149 | NEWBING_COOKIES = """
150 | put your new bing cookies here
151 | """
152 |
153 |
154 | # 阿里云实时语音识别 配置难度较高 仅建议高手用户使用 参考 https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md
155 | ENABLE_AUDIO = False
156 | ALIYUN_TOKEN="" # 例如 f37f30e0f9934c34a992f6f64f7eba4f
157 | ALIYUN_APPKEY="" # 例如 RoPlZrM88DnAFkZK
158 | ALIYUN_ACCESSKEY="" # (无需填写)
159 | ALIYUN_SECRET="" # (无需填写)
160 |
161 | # Claude API KEY
162 | ANTHROPIC_API_KEY = 'pk-jvDh20uTsC02w50uOESZop_PP0BgxwnBV31jTgA7LOY'
163 |
164 |
165 | # 自定义API KEY格式
166 | CUSTOM_API_KEY_PATTERN = ""
167 |
--------------------------------------------------------------------------------
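上面 config.py 的定时拉取逻辑假定 api.elphen.site 直接返回一个 key;而 gpt_academic 要求多个 key 以英文逗号拼接。下面是一段把"单个字符串/列表"两种可能的返回都归一化、并带超时与兜底的示意代码(接口的具体返回格式属假设,仅供参考):

```python
import requests

FALLBACK = "pk-this-is-a-real-free-pool-token-for-everyone"  # 公共池子兜底

def fetch_freeai_key(url="https://api.elphen.site/api?mode=default_my_poolkey"):
    """示意:拉取 FreeAI key,把字符串/列表两种可能的返回统一成逗号分隔形式。"""
    try:
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        data = resp.json()
    except (requests.RequestException, ValueError):
        return FALLBACK  # 网络异常或返回非 JSON 时退回公共池子
    if isinstance(data, list):
        return ','.join(k for k in data if k)
    return data if isinstance(data, str) else FALLBACK
```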
/old/3.43/requirements.txt:
--------------------------------------------------------------------------------
1 | ./docs/gradio-3.32.2-py3-none-any.whl
2 | pydantic==1.10.11
3 | tiktoken>=0.3.3
4 | requests[socks]
5 | transformers
6 | python-markdown-math
7 | beautifulsoup4
8 | prompt_toolkit
9 | latex2mathml
10 | python-docx
11 | mdtex2html
12 | anthropic
13 | colorama
14 | Markdown
15 | pygments
16 | pymupdf
17 | openai
18 | numpy
19 | arxiv
20 | rich
21 | pypdf2==2.12.1
22 | pdfminer
23 | pdflatex
24 | apscheduler
--------------------------------------------------------------------------------
/old/3.43/toolbox.py:
--------------------------------------------------------------------------------
1 | import markdown
2 | import importlib
3 | import time
4 | import inspect
5 | import re
6 | import os
7 | import gradio
8 | from latex2mathml.converter import convert as tex2mathml
9 | from functools import wraps, lru_cache
10 | pj = os.path.join
11 |
12 | """
13 | ========================================================================
14 | 第一部分
15 | 函数插件输入输出接驳区
16 | - ChatBotWithCookies: 带Cookies的Chatbot类,为实现更多强大的功能做基础
17 | - ArgsGeneralWrapper: 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构
18 | - update_ui: 刷新界面用 yield from update_ui(chatbot, history)
19 | - CatchException: 将插件中出的所有问题显示在界面上
20 | - HotReload: 实现插件的热更新
21 | - trimmed_format_exc: 打印traceback,为了安全而隐藏绝对地址
22 | ========================================================================
23 | """
24 |
25 | class ChatBotWithCookies(list):
26 | def __init__(self, cookie):
27 | self._cookies = cookie
28 |
29 | def write_list(self, list):
30 | for t in list:
31 | self.append(t)
32 |
33 | def get_list(self):
34 | return [t for t in self]
35 |
36 | def get_cookies(self):
37 | return self._cookies
38 |
39 |
40 | def ArgsGeneralWrapper(f):
41 | """
42 | 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构。
43 | """
44 | def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
45 | txt_passon = txt
46 | if txt == "" and txt2 != "": txt_passon = txt2
47 | # 引入一个有cookie的chatbot
48 | cookies.update({
49 | 'top_p':top_p,
50 | 'temperature':temperature,
51 | })
52 | llm_kwargs = {
53 | 'api_key': cookies['api_key'],
54 | 'llm_model': llm_model,
55 | 'top_p':top_p,
56 | 'max_length': max_length,
57 | 'temperature':temperature,
58 | 'client_ip': request.client.host,
59 | }
60 | plugin_kwargs = {
61 | "advanced_arg": plugin_advanced_arg,
62 | }
63 | chatbot_with_cookie = ChatBotWithCookies(cookies)
64 | chatbot_with_cookie.write_list(chatbot)
65 | if cookies.get('lock_plugin', None) is None:
66 | # 正常状态
67 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
68 | else:
69 | # 处理个别特殊插件的锁定状态
70 | module, fn_name = cookies['lock_plugin'].split('->')
71 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
72 | yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
73 | return decorated
74 |
75 |
76 | def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
77 | """
78 | 刷新用户界面
79 | """
80 | assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
81 | cookies = chatbot.get_cookies()
82 |
83 | # 解决插件锁定时的界面显示问题
84 | if cookies.get('lock_plugin', None):
85 | label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
86 | chatbot_gr = gradio.update(value=chatbot, label=label)
87 | if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
88 | elif cookies.get('label', None):
89 | chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
90 | cookies['label'] = None # 清空label
91 | else:
92 | chatbot_gr = chatbot
93 |
94 | yield cookies, chatbot_gr, history, msg
95 |
96 | def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
97 | """
98 | 刷新用户界面
99 | """
100 | if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
101 | chatbot[-1] = list(chatbot[-1])
102 | chatbot[-1][-1] = lastmsg
103 | yield from update_ui(chatbot=chatbot, history=history)
104 | time.sleep(delay)
105 |
106 |
107 | def trimmed_format_exc():
108 | import os, traceback
109 | str = traceback.format_exc()
110 | current_path = os.getcwd()
111 | replace_path = "."
112 | return str.replace(current_path, replace_path)
113 |
114 | def CatchException(f):
115 | """
116 | 装饰器函数,捕捉函数f中的异常并封装到一个生成器中返回,并显示到聊天当中。
117 | """
118 |
119 | @wraps(f)
120 | def decorated(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs):
121 | try:
122 | yield from f(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs)
123 | except Exception as e:
124 | from check_proxy import check_proxy
125 | from toolbox import get_conf
126 | proxies, = get_conf('proxies')
127 | tb_str = '```\n' + trimmed_format_exc() + '```'
128 | if len(chatbot_with_cookie) == 0:
129 | chatbot_with_cookie.clear()
130 | chatbot_with_cookie.append(["插件调度异常", "异常原因"])
131 | chatbot_with_cookie[-1] = (chatbot_with_cookie[-1][0],
132 | f"[Local Message] 实验性函数调用出错: \n\n{tb_str} \n\n当前代理可用性: \n\n{check_proxy(proxies)}")
133 | yield from update_ui(chatbot=chatbot_with_cookie, history=history, msg=f'异常 {e}') # 刷新界面
134 | return decorated
135 |
136 |
137 | def HotReload(f):
138 | """
139 | HotReload的装饰器函数,用于实现Python函数插件的热更新。
140 | 函数热更新是指在不停止程序运行的情况下,更新函数代码,从而达到实时更新功能。
141 | 在装饰器内部,使用wraps(f)来保留函数的元信息,并定义了一个名为decorated的内部函数。
142 | 内部函数通过使用importlib模块的reload函数和inspect模块的getmodule函数来重新加载并获取函数模块,
143 | 然后通过getattr函数获取函数名,并在新模块中重新加载函数。
144 | 最后,使用yield from语句返回重新加载过的函数,并在被装饰的函数上执行。
145 | 最终,装饰器函数返回内部函数。这个内部函数可以将函数的原始定义更新为最新版本,并执行函数的新版本。
146 | """
147 | @wraps(f)
148 | def decorated(*args, **kwargs):
149 | fn_name = f.__name__
150 | f_hot_reload = getattr(importlib.reload(inspect.getmodule(f)), fn_name)
151 | yield from f_hot_reload(*args, **kwargs)
152 | return decorated
153 |
154 |
155 | """
156 | ========================================================================
157 | 第二部分
158 | 其他小工具:
159 | - write_results_to_file: 将结果写入markdown文件中
160 | - regular_txt_to_markdown: 将普通文本转换为Markdown格式的文本。
161 | - report_execption: 向chatbot中添加简单的意外错误信息
162 | - text_divide_paragraph: 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
163 | - markdown_convertion: 用多种方式组合,将markdown转化为好看的html
164 | - format_io: 接管gradio默认的markdown处理方式
165 | - on_file_uploaded: 处理文件的上传(自动解压)
166 | - on_report_generated: 将生成的报告自动投射到文件上传区
167 | - clip_history: 当历史上下文过长时,自动截断
168 | - get_conf: 获取设置
169 | - select_api_key: 根据当前的模型类别,抽取可用的api-key
170 | ========================================================================
171 | """
172 |
173 | def get_reduce_token_percent(text):
174 | """
175 | * 此函数未来将被弃用
176 | """
177 | try:
178 | # text = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
179 | pattern = r"(\d+)\s+tokens\b"
180 | match = re.findall(pattern, text)
181 | EXCEED_ALLO = 500 # 稍微留一点余地,否则在回复时会因余量太少出问题
182 | max_limit = float(match[0]) - EXCEED_ALLO
183 | current_tokens = float(match[1])
184 | ratio = max_limit/current_tokens
185 | assert ratio > 0 and ratio < 1
186 | return ratio, str(int(current_tokens-max_limit))
187 | except:
188 | return 0.5, '不详'
189 |
190 |
191 | def write_results_to_file(history, file_name=None):
192 | """
193 | 将对话记录history以Markdown格式写入文件中。如果没有指定文件名,则使用当前时间生成文件名。
194 | """
195 | import os
196 | import time
197 | if file_name is None:
198 | # file_name = time.strftime("chatGPT分析报告%Y-%m-%d-%H-%M-%S", time.localtime()) + '.md'
199 | file_name = 'GPT-Report-' + gen_time_str() + '.md'
200 | os.makedirs('./gpt_log/', exist_ok=True)
201 | with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
202 | f.write('# GPT-Academic Report\n')
203 | for i, content in enumerate(history):
204 | try:
205 | if type(content) != str: content = str(content)
206 | except:
207 | continue
208 | if i % 2 == 0:
209 | f.write('## ')
210 | try:
211 | f.write(content)
212 | except:
213 | # remove everything that cannot be handled by utf8
214 | f.write(content.encode('utf-8', 'ignore').decode())
215 | f.write('\n\n')
216 | res = '以上材料已经被写入:\t' + os.path.abspath(f'./gpt_log/{file_name}')
217 | print(res)
218 | return res
219 |
220 |
221 | def write_history_to_file(history, file_basename=None, file_fullname=None):
222 | """
223 | 将对话记录history以Markdown格式写入文件中。如果没有指定文件名,则使用当前时间生成文件名。
224 | """
225 | import os
226 | import time
227 | if file_fullname is None:
228 | if file_basename is not None:
229 | file_fullname = os.path.join(get_log_folder(), file_basename)
230 | else:
231 | file_fullname = os.path.join(get_log_folder(), f'GPT-Academic-{gen_time_str()}.md')
232 | os.makedirs(os.path.dirname(file_fullname), exist_ok=True)
233 | with open(file_fullname, 'w', encoding='utf8') as f:
234 | f.write('# GPT-Academic Report\n')
235 | for i, content in enumerate(history):
236 | try:
237 | if type(content) != str: content = str(content)
238 | except:
239 | continue
240 | if i % 2 == 0:
241 | f.write('## ')
242 | try:
243 | f.write(content)
244 | except:
245 | # remove everything that cannot be handled by utf8
246 | f.write(content.encode('utf-8', 'ignore').decode())
247 | f.write('\n\n')
248 | res = os.path.abspath(file_fullname)
249 | return res
250 |
251 |
252 | def regular_txt_to_markdown(text):
253 | """
254 | 将普通文本转换为Markdown格式的文本。
255 | """
256 | text = text.replace('\n', '\n\n')
257 | text = text.replace('\n\n\n', '\n\n')
258 | text = text.replace('\n\n\n', '\n\n')
259 | return text
260 |
261 |
262 |
263 |
264 | def report_execption(chatbot, history, a, b):
265 | """
266 | 向chatbot中添加错误信息
267 | """
268 | chatbot.append((a, b))
269 | history.append(a)
270 | history.append(b)
271 |
272 |
273 | def text_divide_paragraph(text):
274 | """
275 | 将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
276 | """
277 |     pre = '<div class="markdown-body">'
278 |     suf = '</div>'
279 | if text.startswith(pre) and text.endswith(suf):
280 | return text
281 |
282 | if '```' in text:
283 | # careful input
284 | return pre + text + suf
285 | else:
286 | # wtf input
287 | lines = text.split("\n")
288 | for i, line in enumerate(lines):
289 |             lines[i] = lines[i].replace(" ", "&nbsp;")
290 |         text = "</br>".join(lines)
291 | return pre + text + suf
292 |
293 | @lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
294 | def markdown_convertion(txt):
295 | """
296 | 将Markdown格式的文本转换为HTML格式。如果包含数学公式,则先将公式转换为HTML格式。
297 | """
298 |     pre = '<div class="markdown-body">'
299 |     suf = '</div>'
300 | if txt.startswith(pre) and txt.endswith(suf):
301 | # print('警告,输入了已经经过转化的字符串,二次转化可能出问题')
302 | return txt # 已经被转化过,不需要再次转化
303 |
304 | markdown_extension_configs = {
305 | 'mdx_math': {
306 | 'enable_dollar_delimiter': True,
307 | 'use_gitlab_delimiters': False,
308 | },
309 | }
310 |     find_equation_pattern = r'<script type="math/tex(?:.*?)>(.*?)</script>'
311 | 
312 |     def tex2mathml_catch_exception(content, *args, **kwargs):
313 |         try:
314 |             content = tex2mathml(content, *args, **kwargs)
315 |         except:
316 |             content = content
317 |         return content
318 | 
319 |     def replace_math_no_render(match):
320 |         content = match.group(1)
321 |         if 'mode=display' in match.group(0):
322 |             content = content.replace('\n', '</br>')
323 |             return f"<font color=\"#00FF00\">$$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$$</font>"
324 |         else:
325 |             return f"<font color=\"#00FF00\">$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$</font>"
326 | 
327 |     def replace_math_render(match):
328 |         content = match.group(1)
329 |         if 'mode=display' in match.group(0):
330 |             if '\\begin{aligned}' in content:
331 |                 content = content.replace('\\begin{aligned}', '\\begin{array}')
332 |                 content = content.replace('\\end{aligned}', '\\end{array}')
333 |                 content = content.replace('&', ' ')
334 |             content = tex2mathml_catch_exception(content, display="block")
335 |             return content
336 |         else:
337 |             return tex2mathml_catch_exception(content)
338 | 
339 |     def markdown_bug_hunt(content):
340 |         """
341 |         解决一个mdx_math的bug(单$包裹begin命令时多余<script>)
342 |         """
343 |         content = content.replace('<script type="math/tex">\n<script type="math/tex; mode=display">', '<script type="math/tex; mode=display">')
344 |         content = content.replace('</script>\n</script>', '</script>')
345 | return content
346 |
347 | def no_code(txt):
348 | if '```' not in txt:
349 | return True
350 | else:
351 | if '```reference' in txt: return True # newbing
352 | else: return False
353 |
354 | if ('$' in txt) and no_code(txt): # 有$标识的公式符号,且没有代码段```的标识
355 | # convert everything to html format
356 | split = markdown.markdown(text='---')
357 | convert_stage_1 = markdown.markdown(text=txt, extensions=['mdx_math', 'fenced_code', 'tables', 'sane_lists'], extension_configs=markdown_extension_configs)
358 | convert_stage_1 = markdown_bug_hunt(convert_stage_1)
359 | # re.DOTALL: Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline. Corresponds to the inline flag (?s).
360 | # 1. convert to easy-to-copy tex (do not render math)
361 | convert_stage_2_1, n = re.subn(find_equation_pattern, replace_math_no_render, convert_stage_1, flags=re.DOTALL)
362 | # 2. convert to rendered equation
363 | convert_stage_2_2, n = re.subn(find_equation_pattern, replace_math_render, convert_stage_1, flags=re.DOTALL)
364 | # cat them together
365 | return pre + convert_stage_2_1 + f'{split}' + convert_stage_2_2 + suf
366 | else:
367 | return pre + markdown.markdown(txt, extensions=['fenced_code', 'codehilite', 'tables', 'sane_lists']) + suf
368 |
369 |
370 | def close_up_code_segment_during_stream(gpt_reply):
371 | """
372 | 在gpt输出代码的中途(输出了前面的```,但还没输出完后面的```),补上后面的```
373 |
374 | Args:
375 | gpt_reply (str): GPT模型返回的回复字符串。
376 |
377 | Returns:
378 | str: 返回一个新的字符串,将输出代码片段的“后面的```”补上。
379 |
380 | """
381 | if '```' not in gpt_reply:
382 | return gpt_reply
383 | if gpt_reply.endswith('```'):
384 | return gpt_reply
385 |
386 |     # 排除了以上两个情况后,统计```标记的数量:若为奇数,说明代码块正输出到一半
387 | segments = gpt_reply.split('```')
388 | n_mark = len(segments) - 1
389 | if n_mark % 2 == 1:
390 | # print('输出代码片段中!')
391 | return gpt_reply+'\n```'
392 | else:
393 | return gpt_reply
394 |
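一个用法示意(假设本函数已在当前作用域,演示流式输出到一半时补全代码围栏):

```python
streaming_reply = "参考实现如下:\n```python\nprint('hello')"
fixed = close_up_code_segment_during_stream(streaming_reply)
print(fixed)  # 末尾被补上 "\n```",避免半截代码块破坏 Markdown 渲染
```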
395 |
396 | def format_io(self, y):
397 | """
398 | 将输入和输出解析为HTML格式。将y中最后一项的输入部分段落化,并将输出部分的Markdown和数学公式转换为HTML格式。
399 | """
400 | if y is None or y == []:
401 | return []
402 | i_ask, gpt_reply = y[-1]
403 | # 输入部分太自由,预处理一波
404 | if i_ask is not None: i_ask = text_divide_paragraph(i_ask)
405 | # 当代码输出半截的时候,试着补上后个```
406 | if gpt_reply is not None: gpt_reply = close_up_code_segment_during_stream(gpt_reply)
407 | # process
408 | y[-1] = (
409 | None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
410 | None if gpt_reply is None else markdown_convertion(gpt_reply)
411 | )
412 | return y
413 |
414 |
415 | def find_free_port():
416 | """
417 | 返回当前系统中可用的未使用端口。
418 | """
419 | import socket
420 | from contextlib import closing
421 | with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as s:
422 | s.bind(('', 0))
423 | s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
424 | return s.getsockname()[1]
425 |
426 |
427 | def extract_archive(file_path, dest_dir):
428 | import zipfile
429 | import tarfile
430 | import os
431 | # Get the file extension of the input file
432 | file_extension = os.path.splitext(file_path)[1]
433 |
434 | # Extract the archive based on its extension
435 | if file_extension == '.zip':
436 | with zipfile.ZipFile(file_path, 'r') as zipobj:
437 | zipobj.extractall(path=dest_dir)
438 | print("Successfully extracted zip archive to {}".format(dest_dir))
439 |
440 | elif file_extension in ['.tar', '.gz', '.bz2']:
441 | with tarfile.open(file_path, 'r:*') as tarobj:
442 | tarobj.extractall(path=dest_dir)
443 | print("Successfully extracted tar archive to {}".format(dest_dir))
444 |
445 | # 第三方库,需要预先pip install rarfile
446 | # 此外,Windows上还需要安装winrar软件,配置其Path环境变量,如"C:\Program Files\WinRAR"才可以
447 | elif file_extension == '.rar':
448 | try:
449 | import rarfile
450 | with rarfile.RarFile(file_path) as rf:
451 | rf.extractall(path=dest_dir)
452 | print("Successfully extracted rar archive to {}".format(dest_dir))
453 | except:
454 | print("Rar format requires additional dependencies to install")
455 | return '\n\n解压失败! 需要安装pip install rarfile来解压rar文件'
456 |
457 | # 第三方库,需要预先pip install py7zr
458 | elif file_extension == '.7z':
459 | try:
460 | import py7zr
461 | with py7zr.SevenZipFile(file_path, mode='r') as f:
462 | f.extractall(path=dest_dir)
463 | print("Successfully extracted 7z archive to {}".format(dest_dir))
464 | except:
465 | print("7z format requires additional dependencies to install")
466 | return '\n\n解压失败! 需要安装pip install py7zr来解压7z文件'
467 | else:
468 | return ''
469 | return ''
470 |
471 |
472 | def find_recent_files(directory):
473 | """
474 | me: find files that is created with in one minutes under a directory with python, write a function
475 | gpt: here it is!
476 | """
477 | import os
478 | import time
479 | current_time = time.time()
480 | one_minute_ago = current_time - 60
481 | recent_files = []
482 |
483 | for filename in os.listdir(directory):
484 | file_path = os.path.join(directory, filename)
485 | if file_path.endswith('.log'):
486 | continue
487 | created_time = os.path.getmtime(file_path)
488 | if created_time >= one_minute_ago:
489 | if os.path.isdir(file_path):
490 | continue
491 | recent_files.append(file_path)
492 |
493 | return recent_files
494 |
495 | def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
496 | # 将文件复制一份到下载区
497 | import shutil
498 | if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
499 | new_path = os.path.join(get_log_folder(), rename_file)
500 | # 如果已经存在,先删除
501 | if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
502 | # 把文件复制过去
503 | if not os.path.exists(new_path): shutil.copyfile(file, new_path)
504 | # 将文件添加到chatbot cookie中,避免多用户干扰
505 | if chatbot:
506 | if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
507 | else: current = []
508 | chatbot._cookies.update({'file_to_promote': [new_path] + current})
509 |
510 | def disable_auto_promotion(chatbot):
511 | chatbot._cookies.update({'file_to_promote': []})
512 | return
513 |
514 | def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
515 | """
516 | 当文件被上传时的回调函数
517 | """
518 | if len(files) == 0:
519 | return chatbot, txt
520 | import shutil
521 | import os
522 | import time
523 | import glob
524 | from toolbox import extract_archive
525 | try:
526 | shutil.rmtree('./private_upload/')
527 | except:
528 | pass
529 | time_tag = gen_time_str()
530 | os.makedirs(f'private_upload/{time_tag}', exist_ok=True)
531 | err_msg = ''
532 | for file in files:
533 | file_origin_name = os.path.basename(file.orig_name)
534 | shutil.copy(file.name, f'private_upload/{time_tag}/{file_origin_name}')
535 | err_msg += extract_archive(f'private_upload/{time_tag}/{file_origin_name}',
536 | dest_dir=f'private_upload/{time_tag}/{file_origin_name}.extract')
537 | moved_files = [fp for fp in glob.glob('private_upload/**/*', recursive=True)]
538 | if "底部输入区" in checkboxes:
539 | txt = ""
540 | txt2 = f'private_upload/{time_tag}'
541 | else:
542 | txt = f'private_upload/{time_tag}'
543 | txt2 = ""
544 | moved_files_str = '\t\n\n'.join(moved_files)
545 | chatbot.append(['我上传了文件,请查收',
546 | f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
547 | f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
548 | f'\n\n现在您点击任意“红颜色”标识的函数插件时,以上文件将被作为输入参数'+err_msg])
549 | return chatbot, txt, txt2
550 |
551 |
552 | def on_report_generated(cookies, files, chatbot):
553 | from toolbox import find_recent_files
554 | if 'file_to_promote' in cookies:
555 | report_files = cookies['file_to_promote']
556 | cookies.pop('file_to_promote')
557 | else:
558 | report_files = find_recent_files('gpt_log')
559 | if len(report_files) == 0:
560 | return cookies, None, chatbot
561 | # files.extend(report_files)
562 | file_links = ''
563 |     for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
564 | chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
565 | return cookies, report_files, chatbot
566 |
567 | def load_chat_cookies():
568 | API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
569 | if is_any_api_key(AZURE_API_KEY):
570 | if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
571 | else: API_KEY = AZURE_API_KEY
572 | return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
573 |
574 | def is_openai_api_key(key):
575 | CUSTOM_API_KEY_PATTERN, = get_conf('CUSTOM_API_KEY_PATTERN')
576 | if len(CUSTOM_API_KEY_PATTERN) != 0:
577 | API_MATCH_ORIGINAL = re.match(CUSTOM_API_KEY_PATTERN, key)
578 | else:
579 | API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
580 | return bool(API_MATCH_ORIGINAL)
581 |
582 | def is_azure_api_key(key):
583 | API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
584 | return bool(API_MATCH_AZURE)
585 |
586 | def is_api2d_key(key):
587 | API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
588 | return bool(API_MATCH_API2D)
589 |
590 | def is_freeai_api_key(key):#new add
591 | API_MATCH_FREEAI0 = re.match(r"pk-[a-zA-Z0-9-_]{43}$", key)
592 | API_MATCH_FREEAI1 = re.match(r"fk-[a-zA-Z0-9-_]{43}$", key)
593 | return bool(API_MATCH_FREEAI0) or bool(API_MATCH_FREEAI1)
594 |
595 | def is_any_api_key(key):
596 | if ',' in key:
597 | keys = key.split(',')
598 | for k in keys:
599 | if is_any_api_key(k): return True
600 | return False
601 | else:#new add
602 | return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key) or is_freeai_api_key(key)
603 |
604 | def what_keys(keys):
605 | # new add
606 | avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0, "FreeAI Key":0}
607 |
608 | key_list = keys.split(',')
609 |
610 | for k in key_list:
611 | if is_openai_api_key(k):
612 | avail_key_list['OpenAI Key'] += 1
613 |
614 | for k in key_list:
615 | if is_api2d_key(k):
616 | avail_key_list['API2D Key'] += 1
617 |
618 | for k in key_list:
619 | if is_azure_api_key(k):
620 | avail_key_list['Azure Key'] += 1
621 |
622 | for k in key_list: # new add
623 | if is_freeai_api_key(k):
624 | avail_key_list['FreeAI Key'] += 1
625 |
626 | # new add
627 | return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']} 个, FreeAI Key {avail_key_list['FreeAI Key']} 个"
628 |
629 | def select_api_key(keys, llm_model):
630 | import random
631 | avail_key_list = []
632 | key_list = keys.split(',')
633 |
634 | if llm_model.startswith('gpt-'):
635 | for k in key_list:
636 | if is_openai_api_key(k): avail_key_list.append(k)
637 | for k in key_list:# new add
638 | if is_freeai_api_key(k): avail_key_list.append(k)
639 |
640 | if llm_model.startswith('api2d-'):
641 | for k in key_list:
642 | if is_api2d_key(k): avail_key_list.append(k)
643 |
644 | if llm_model.startswith('azure-'):
645 | for k in key_list:
646 | if is_azure_api_key(k): avail_key_list.append(k)
647 |
648 | if len(avail_key_list) == 0:
649 | raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源(右下角更换模型菜单中可切换openai,azure,claude,api2d等请求源)。")
650 |
651 | api_key = random.choice(avail_key_list) # 随机负载均衡
652 | return api_key
653 |
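为便于核对上面各 key 格式的判断逻辑,这里独立复现这几条正则(示意代码,不依赖 get_conf,对应 CUSTOM_API_KEY_PATTERN 留空时的默认行为):

```python
import re

samples = [
    ('sk-' + 'a' * 48, 'OpenAI Key'),
    ('fk-' + 'a' * 43, 'FreeAI Share Token'),
    ('pk-this-is-a-real-free-pool-token-for-everyone', 'FreeAI Pool Token'),
]
for key, kind in samples:
    matched = (bool(re.match(r"sk-[a-zA-Z0-9]{48}$", key)) or
               bool(re.match(r"(pk|fk)-[a-zA-Z0-9-_]{43}$", key)))
    print(f'{kind}: {"匹配" if matched else "不匹配"}')  # 三个样例均应匹配
```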
654 | def read_env_variable(arg, default_value):
655 | """
656 | 环境变量可以是 `GPT_ACADEMIC_CONFIG`(优先),也可以直接是`CONFIG`
657 | 例如在windows cmd中,既可以写:
658 | set USE_PROXY=True
659 | set API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
660 | set proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
661 | set AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
662 | set AUTHENTICATION=[("username", "password"), ("username2", "password2")]
663 | 也可以写:
664 | set GPT_ACADEMIC_USE_PROXY=True
665 | set GPT_ACADEMIC_API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
666 | set GPT_ACADEMIC_proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
667 | set GPT_ACADEMIC_AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
668 | set GPT_ACADEMIC_AUTHENTICATION=[("username", "password"), ("username2", "password2")]
669 | """
670 | from colorful import print亮红, print亮绿
671 | arg_with_prefix = "GPT_ACADEMIC_" + arg
672 | if arg_with_prefix in os.environ:
673 | env_arg = os.environ[arg_with_prefix]
674 | elif arg in os.environ:
675 | env_arg = os.environ[arg]
676 | else:
677 | raise KeyError
678 | print(f"[ENV_VAR] 尝试加载{arg},默认值:{default_value} --> 修正值:{env_arg}")
679 | try:
680 | if isinstance(default_value, bool):
681 | env_arg = env_arg.strip()
682 | if env_arg == 'True': r = True
683 | elif env_arg == 'False': r = False
684 | else: print('enter True or False, but have:', env_arg); r = default_value
685 | elif isinstance(default_value, int):
686 | r = int(env_arg)
687 | elif isinstance(default_value, float):
688 | r = float(env_arg)
689 | elif isinstance(default_value, str):
690 | r = env_arg.strip()
691 | elif isinstance(default_value, dict):
692 | r = eval(env_arg)
693 | elif isinstance(default_value, list):
694 | r = eval(env_arg)
695 | elif default_value is None:
696 | assert arg == "proxies"
697 | r = eval(env_arg)
698 | else:
699 | print亮红(f"[ENV_VAR] 环境变量{arg}不支持通过环境变量设置! ")
700 | raise KeyError
701 | except:
702 | print亮红(f"[ENV_VAR] 环境变量{arg}加载失败! ")
703 | raise KeyError(f"[ENV_VAR] 环境变量{arg}加载失败! ")
704 |
705 | print亮绿(f"[ENV_VAR] 成功读取环境变量{arg}")
706 | return r
707 |
708 | @lru_cache(maxsize=128)
709 | def read_single_conf_with_lru_cache(arg):
710 | from colorful import print亮红, print亮绿, print亮蓝
711 | try:
712 | # 优先级1. 获取环境变量作为配置
713 | default_ref = getattr(importlib.import_module('config'), arg) # 读取默认值作为数据类型转换的参考
714 | r = read_env_variable(arg, default_ref)
715 | except:
716 | try:
717 | # 优先级2. 获取config_private中的配置
718 | r = getattr(importlib.import_module('config_private'), arg)
719 | except:
720 | # 优先级3. 获取config中的配置
721 | r = getattr(importlib.import_module('config'), arg)
722 |
723 | # 在读取API_KEY时,检查一下是不是忘了改config
724 | if arg == 'API_KEY':
725 | print亮蓝(f"[API_KEY] 本项目现已支持OpenAI和Azure的api-key。也支持同时填写多个api-key,如API_KEY=\"openai-key1,openai-key2,azure-key3\"")
726 | print亮蓝(f"[API_KEY] 您既可以在config.py中修改api-key(s),也可以在问题输入区输入临时的api-key(s),然后回车键提交后即可生效。")
727 | if is_any_api_key(r):
728 | print亮绿(f"[API_KEY] 您的 API_KEY 是: {r[:15]}*** API_KEY 导入成功")
729 | else:
730 | print亮红( "[API_KEY] 您的 API_KEY 不满足任何一种已知的密钥格式,请在config文件中修改API密钥之后再运行。")
731 | if arg == 'proxies':
732 | if not read_single_conf_with_lru_cache('USE_PROXY'): r = None # 检查USE_PROXY,防止proxies单独起作用
733 | if r is None:
734 | print亮红('[PROXY] 网络代理状态:未配置。无代理状态下很可能无法访问OpenAI家族的模型。建议:检查USE_PROXY选项是否修改。')
735 | else:
736 | print亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', r)
737 | assert isinstance(r, dict), 'proxies格式错误,请注意proxies选项的格式,不要遗漏括号。'
738 | return r
739 |
740 |
741 | @lru_cache(maxsize=128)
742 | def get_conf(*args):
743 | # 建议您复制一个config_private.py放自己的秘密, 如API和代理网址, 避免不小心传github被别人看到
744 | res = []
745 | for arg in args:
746 | r = read_single_conf_with_lru_cache(arg)
747 | res.append(r)
748 | return res
749 |
750 |
751 | def clear_line_break(txt):
752 | txt = txt.replace('\n', ' ')
753 | txt = txt.replace(' ', ' ')
754 | txt = txt.replace(' ', ' ')
755 | return txt
756 |
757 |
758 | class DummyWith():
759 | """
760 | 这段代码定义了一个名为DummyWith的空上下文管理器,
761 |     它的作用是……额……就是不起作用,即在代码结构不变的情况下取代其他的上下文管理器。
762 | 上下文管理器是一种Python对象,用于与with语句一起使用,
763 | 以确保一些资源在代码块执行期间得到正确的初始化和清理。
764 | 上下文管理器必须实现两个方法,分别为 __enter__()和 __exit__()。
765 | 在上下文执行开始的情况下,__enter__()方法会在代码块被执行前被调用,
766 | 而在上下文执行结束时,__exit__()方法则会被调用。
767 | """
768 | def __enter__(self):
769 | return self
770 |
771 | def __exit__(self, exc_type, exc_value, traceback):
772 | return
773 |
774 | def run_gradio_in_subpath(demo, auth, port, custom_path):
775 | """
776 | 把gradio的运行地址更改到指定的二次路径上
777 | """
778 | def is_path_legal(path: str)->bool:
779 | '''
780 | check path for sub url
781 | path: path to check
782 | return value: do sub url wrap
783 | '''
784 | if path == "/": return True
785 | if len(path) == 0:
786 | print("ilegal custom path: {}\npath must not be empty\ndeploy on root url".format(path))
787 | return False
788 | if path[0] == '/':
789 | if path[1] != '/':
790 | print("deploy on sub-path {}".format(path))
791 | return True
792 | return False
793 | print("ilegal custom path: {}\npath should begin with \'/\'\ndeploy on root url".format(path))
794 | return False
795 |
796 |     if not is_path_legal(custom_path): raise RuntimeError('Illegal custom path')
797 | import uvicorn
798 | import gradio as gr
799 | from fastapi import FastAPI
800 | app = FastAPI()
801 | if custom_path != "/":
802 | @app.get("/")
803 | def read_main():
804 | return {"message": f"Gradio is running at: {custom_path}"}
805 | app = gr.mount_gradio_app(app, demo, path=custom_path)
806 | uvicorn.run(app, host="0.0.0.0", port=port) # , auth=auth
807 |
808 |
809 | def clip_history(inputs, history, tokenizer, max_token_limit):
810 | """
811 | reduce the length of history by clipping.
812 | this function search for the longest entries to clip, little by little,
813 | until the number of token of history is reduced under threshold.
814 | 通过裁剪来缩短历史记录的长度。
815 | 此函数逐渐地搜索最长的条目进行剪辑,
816 | 直到历史记录的标记数量降低到阈值以下。
817 | """
818 | import numpy as np
819 | from request_llm.bridge_all import model_info
820 | def get_token_num(txt):
821 | return len(tokenizer.encode(txt, disallowed_special=()))
822 | input_token_num = get_token_num(inputs)
823 | if input_token_num < max_token_limit * 3 / 4:
824 |         # 当输入部分的token占比小于限制的3/4时,按以下步骤裁剪历史
825 | # 1. 把input的余量留出来
826 | max_token_limit = max_token_limit - input_token_num
827 | # 2. 把输出用的余量留出来
828 | max_token_limit = max_token_limit - 128
829 | # 3. 如果余量太小了,直接清除历史
830 | if max_token_limit < 128:
831 | history = []
832 | return history
833 | else:
834 | # 当输入部分的token占比 > 限制的3/4时,直接清除历史
835 | history = []
836 | return history
837 |
838 | everything = ['']
839 | everything.extend(history)
840 | n_token = get_token_num('\n'.join(everything))
841 | everything_token = [get_token_num(e) for e in everything]
842 |
843 | # 截断时的颗粒度
844 | delta = max(everything_token) // 16
845 |
846 | while n_token > max_token_limit:
847 | where = np.argmax(everything_token)
848 | encoded = tokenizer.encode(everything[where], disallowed_special=())
849 | clipped_encoded = encoded[:len(encoded)-delta]
850 | everything[where] = tokenizer.decode(clipped_encoded)[:-1] # -1 to remove the may-be illegal char
851 | everything_token[where] = get_token_num(everything[where])
852 | n_token = get_token_num('\n'.join(everything))
853 |
854 | history = everything[1:]
855 | return history
856 |
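clip_history 只要求 tokenizer 提供 encode/decode 两个接口。下面是一段用法示意(假设在 gpt_academic 工程内运行——函数内部会 import request_llm——且已安装 tiktoken):

```python
import tiktoken

tokenizer = tiktoken.get_encoding("cl100k_base")  # gpt-3.5 系列所用编码
history = ["问题一", "非常长的回答…… " * 800, "问题二", "简短回答"]
clipped = clip_history(inputs="新的问题", history=history,
                       tokenizer=tokenizer, max_token_limit=4096)
# 最长的条目被逐步截短,直到整个 history 的 token 数落入限制以内
print([len(h) for h in clipped])
```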
857 | """
858 | ========================================================================
859 | 第三部分
860 | 其他小工具:
861 | - zip_folder: 把某个路径下所有文件压缩,然后转移到指定的另一个路径中(gpt写的)
862 | - gen_time_str: 生成时间戳
863 | - ProxyNetworkActivate: 临时地启动代理网络(如果有)
864 | - objdump/objload: 快捷的调试函数
865 | ========================================================================
866 | """
867 |
868 | def zip_folder(source_folder, dest_folder, zip_name):
869 | import zipfile
870 | import os
871 | # Make sure the source folder exists
872 | if not os.path.exists(source_folder):
873 | print(f"{source_folder} does not exist")
874 | return
875 |
876 | # Make sure the destination folder exists
877 | if not os.path.exists(dest_folder):
878 | print(f"{dest_folder} does not exist")
879 | return
880 |
881 | # Create the name for the zip file
882 | zip_file = os.path.join(dest_folder, zip_name)
883 |
884 | # Create a ZipFile object
885 | with zipfile.ZipFile(zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
886 | # Walk through the source folder and add files to the zip file
887 | for foldername, subfolders, filenames in os.walk(source_folder):
888 | for filename in filenames:
889 | filepath = os.path.join(foldername, filename)
890 | zipf.write(filepath, arcname=os.path.relpath(filepath, source_folder))
891 |
892 | # Move the zip file to the destination folder (if it wasn't already there)
893 | if os.path.dirname(zip_file) != dest_folder:
894 | os.rename(zip_file, os.path.join(dest_folder, os.path.basename(zip_file)))
895 | zip_file = os.path.join(dest_folder, os.path.basename(zip_file))
896 |
897 | print(f"Zip file created at {zip_file}")
898 |
899 | def zip_result(folder):
900 | t = gen_time_str()
901 | zip_folder(folder, './gpt_log/', f'{t}-result.zip')
902 | return pj('./gpt_log/', f'{t}-result.zip')
903 |
904 | def gen_time_str():
905 | import time
906 | return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
907 |
908 | def get_log_folder(user='default', plugin_name='shared'):
909 | _dir = os.path.join(os.path.dirname(__file__), 'gpt_log', user, plugin_name)
910 | if not os.path.exists(_dir): os.makedirs(_dir)
911 | return _dir
912 |
913 | class ProxyNetworkActivate():
914 | """
915 |     这段代码定义了一个名为ProxyNetworkActivate的上下文管理器, 用于给一小段代码临时加上网络代理
916 | """
917 | def __enter__(self):
918 | from toolbox import get_conf
919 | proxies, = get_conf('proxies')
920 | if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
921 | if proxies is not None:
922 | if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
923 | if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
924 | return self
925 |
926 | def __exit__(self, exc_type, exc_value, traceback):
927 | os.environ['no_proxy'] = '*'
928 | if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
929 | if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
930 | return
931 |
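一个用法示意(假设 config 中 USE_PROXY / proxies 已正确配置):

```python
import requests

with ProxyNetworkActivate():   # 进入时临时写入 HTTP_PROXY / HTTPS_PROXY 环境变量
    r = requests.get('https://example.com', timeout=10)
print(r.status_code)           # 退出 with 后代理环境变量已被清理
```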
932 | def objdump(obj, file='objdump.tmp'):
933 | import pickle
934 | with open(file, 'wb+') as f:
935 | pickle.dump(obj, f)
936 | return
937 |
938 | def objload(file='objdump.tmp'):
939 | import pickle, os
940 | if not os.path.exists(file):
941 | return
942 | with open(file, 'rb') as f:
943 | return pickle.load(f)
944 |
945 | def Singleton(cls):
946 | """
947 | 一个单实例装饰器
948 | """
949 | _instance = {}
950 |
951 | def _singleton(*args, **kargs):
952 | if cls not in _instance:
953 | _instance[cls] = cls(*args, **kargs)
954 | return _instance[cls]
955 |
956 | return _singleton
957 |
958 | """
959 | ========================================================================
960 | 第四部分
961 | 接驳虚空终端:
962 | - set_conf: 在运行过程中动态地修改配置
963 | - set_multi_conf: 在运行过程中动态地修改多个配置
964 | - get_plugin_handle: 获取插件的句柄
965 | - get_plugin_default_kwargs: 获取插件的默认参数
966 | - get_chat_handle: 获取简单聊天的句柄
967 | - get_chat_default_kwargs: 获取简单聊天的默认参数
968 | ========================================================================
969 | """
970 |
971 | def set_conf(key, value):
972 | from toolbox import read_single_conf_with_lru_cache, get_conf
973 | read_single_conf_with_lru_cache.cache_clear()
974 | get_conf.cache_clear()
975 | os.environ[key] = str(value)
976 | altered, = get_conf(key)
977 | return altered
978 |
979 | def set_multi_conf(dic):
980 | for k, v in dic.items(): set_conf(k, v)
981 | return
982 |
983 | def get_plugin_handle(plugin_name):
984 | """
985 | e.g. plugin_name = 'crazy_functions.批量Markdown翻译->Markdown翻译指定语言'
986 | """
987 | import importlib
988 | assert '->' in plugin_name, \
989 | "Example of plugin_name: crazy_functions.批量Markdown翻译->Markdown翻译指定语言"
990 | module, fn_name = plugin_name.split('->')
991 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
992 | return f_hot_reload
993 |
994 | def get_chat_handle():
995 |     """获取简单聊天的句柄。
996 |     """
997 | from request_llm.bridge_all import predict_no_ui_long_connection
998 | return predict_no_ui_long_connection
999 |
1000 | def get_plugin_default_kwargs():
1001 |     """获取插件的默认参数。
1002 |     """
1003 | from toolbox import get_conf, ChatBotWithCookies
1004 |
1005 | WEB_PORT, LLM_MODEL, API_KEY = \
1006 | get_conf('WEB_PORT', 'LLM_MODEL', 'API_KEY')
1007 |
1008 | llm_kwargs = {
1009 | 'api_key': API_KEY,
1010 | 'llm_model': LLM_MODEL,
1011 | 'top_p':1.0,
1012 | 'max_length': None,
1013 | 'temperature':1.0,
1014 | }
1015 | chatbot = ChatBotWithCookies(llm_kwargs)
1016 |
1017 | # txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port
1018 | default_plugin_kwargs = {
1019 | "main_input": "./README.md",
1020 | "llm_kwargs": llm_kwargs,
1021 | "plugin_kwargs": {},
1022 | "chatbot_with_cookie": chatbot,
1023 | "history": [],
1024 | "system_prompt": "You are a good AI.",
1025 | "web_port": WEB_PORT
1026 | }
1027 | return default_plugin_kwargs
1028 |
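配合 get_plugin_handle,上面这组默认参数按注释所示的位置顺序传给插件即可。一个示意(假设在 gpt_academic 工程内运行;插件是生成器,靠 yield 刷新界面状态):

```python
plugin = get_plugin_handle('crazy_functions.批量Markdown翻译->Markdown翻译指定语言')
kw = get_plugin_default_kwargs()
gen = plugin(kw["main_input"], kw["llm_kwargs"], kw["plugin_kwargs"],
             kw["chatbot_with_cookie"], kw["history"], kw["system_prompt"], kw["web_port"])
for cookies, chatbot, history, msg in gen:  # 与 update_ui 的 yield 四元组对应
    print(msg)
```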
1029 | def get_chat_default_kwargs():
1030 |     """获取简单聊天的默认参数。
1031 |     """
1032 | from toolbox import get_conf
1033 |
1034 | LLM_MODEL, API_KEY = get_conf('LLM_MODEL', 'API_KEY')
1035 |
1036 | llm_kwargs = {
1037 | 'api_key': API_KEY,
1038 | 'llm_model': LLM_MODEL,
1039 | 'top_p':1.0,
1040 | 'max_length': None,
1041 | 'temperature':1.0,
1042 | }
1043 |
1044 | default_chat_kwargs = {
1045 | "inputs": "Hello there, are you ready?",
1046 | "llm_kwargs": llm_kwargs,
1047 | "history": [],
1048 | "sys_prompt": "You are AI assistant",
1049 | "observe_window": None,
1050 | "console_slience": False,
1051 | }
1052 |
1053 | return default_chat_kwargs
1054 |
1055 |
--------------------------------------------------------------------------------
/old/README_old.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | # FreeAI
4 |
5 | **OpenAI should not be a closed AI.**
6 |
7 | 你是否还在为OpenAI需要科学上网而犯愁?
8 |
9 | 你是否还在为OpenAI的付费模式而望而却步?
10 |
11 | 你是否苦恼没有免费的API Key来开发自己的ChatGPT工具?
12 |
13 | 本项目综述Github众优秀开发者的努力,给出一个比较完美的解决方案,并持续向更好用、更强大、更便宜的开放AI方向努力。**如果你喜欢本项目,请给一个免费的star,谢谢!**
14 |
15 | `写在前面(因为issues里很多人没看到这句话而遇到缺少API key的报错):`
16 | + ***由于gpt_academic设定用户参数配置的读取优先级: 环境变量 > config_private.py > config.py,所以调试中,最好config.py文件也做对应的修改(即改得和config_private.py一模一样)。不然,用户的配置可能在某些调试情况下不生效,这可能是gpt_academic的bug。***
17 |
18 | **鸣谢:**
19 | + [pengzhile/pandora](https://github.com/pengzhile/pandora):让OpenAI GPT-3.5的API免费和免科学上网的关键技术。
20 | + [acheong08/OpenAIAuth](https://github.com/acheong08/OpenAIAuth):免科学上网获取自己OpenAI账户的Cookie。
21 | + [binary-husky/gpt_academic](https://github.com/binary-husky/gpt_academic), 以它为例,解决它需翻墙和需要付费的OpenAI API key的问题,演示OpenAI变为FreeAI。
22 |
23 | ## Pandora
24 | 旨在打造免科学上网情况下,最原汁原味的ChatGPT。基于access token的[技术原理](https://zhile.io/2023/05/19/how-to-get-chatgpt-access-token-via-pkce.html)实现的。目前有官方的体验网站[https://chat.zhile.io](https://chat.zhile.io),需要使用OpenAI的账户密码,所有对话记录与在官网的一致;也有基于Pandora技术的共享[Shared Chat](https://baipiao.io/chatgpt)的资源池,无需账号密码也能体验。
25 |
26 | Pandora项目最难能可贵的是提供了可将用户的Cookie转化为形式如同API key的Access Token和响应这个Access Token的反代接口(也可响应OpenAI原生的API key)的服务,此举无疑是对基于OpenAI的自由开发者最大的福音。详情请见:[“这个服务旨在模拟 Turbo API,免费且使用的是ChatGPT的8k模型”](https://github.com/pengzhile/pandora/issues/837)。
27 | + Cookie转 `fk-`开头、43位的 Share Token 演示地址:[https://ai.fakeopen.com/token](https://ai.fakeopen.com/token)(程序化注册的示意见本节末尾);
28 | + Cookie转 `pk-`开头、43位的 Pool Token 演示地址:[https://ai.fakeopen.com/pool](https://ai.fakeopen.com/pool)。解决多账号并发的问题;
29 | + 响应上述 Access Token 的反代接口是:[https://ai.fakeopen.com/v1/chat/completions](https://ai.fakeopen.com/v1/chat/completions)。
30 |
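如果想程序化地完成“Cookie → Share Token”的转换,也可以直接调用上述演示页面背后的注册接口,做法与本仓库 get_freeai_api.py 中一致(以下为示意代码):

```python
import requests

data = {
    'unique_name': 'my share token',       # 自定义标识
    'access_token': '<你的Access Token>',   # 即上文获取到的用户Cookie
    'expires_in': 0,                       # 与 get_freeai_api.py 的默认值一致
}
resp = requests.post('https://ai.fakeopen.com/token/register', data=data, timeout=30)
if resp.status_code == 200:
    print('Share Token:', resp.json()['token_key'])  # fk- 开头、43位
```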
31 | Pandora项目还提供了两个免费的Pool Token:
32 | + `pk-this-is-a-real-free-pool-token-for-everyone` 很多 Share Token 组成的池子。
33 | + `pk-this-is-a-real-free-api-key-pk-for-everyone` 一些120刀 Api Key组成的池子。`(我测试的时候已经没钱了,衰。)`
34 |
35 | 经使用自己的账号生成的Share Token和Pool Token进行测试,使用Access Token进行的对话记录,不会出现在该账户的记录中。所以我自己使用的也是Pandora提供的Pool Token,毕竟自己的池子不够大,而且自己的用户cookie的生命周期只有14天,时常更新Access Token也很烦。
36 |
37 | 本人十分中意ChatGPT的翻译效果,所以编写了一个基于Pandora的简易翻译服务网页,即文件[Translate.html](https://github.com/elphen-wang/FreeAI/blob/main/Translate.html),测试效果尚可,其思路大致等价于下面的示意代码。
38 |
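一段等价的 Python 示意:把待译文本按标准的 chat/completions 协议发给上述反代接口(此处假设使用 Pandora 的公共 Pool Token 与 gpt-3.5-turbo):

```python
import requests

def translate(text, target_lang='中文'):
    resp = requests.post(
        'https://ai.fakeopen.com/v1/chat/completions',
        headers={'Authorization': 'Bearer pk-this-is-a-real-free-pool-token-for-everyone'},
        json={'model': 'gpt-3.5-turbo',
              'messages': [{'role': 'user',
                            'content': f'请把下面的内容翻译成{target_lang}:\n{text}'}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()['choices'][0]['message']['content']

print(translate('OpenAI should not be a closed AI.'))
```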
39 | ## OpenAIAuth
40 | 现在,Pandora的讨论帖中就有人提出希望Access Token对话也能保留问询记录的需求。如果你对使用Pandora提供的Pool Token还有隐私和安全上的顾虑,也可以同时使用[OpenAIAuth](https://github.com/acheong08/OpenAIAuth)和`pandora-chatgpt`的python函数包来生成并定时更新专属自己的Access Token。
41 |
42 | Pandora项目其实也独立提供了[这种服务](https://gist.github.com/pengzhile/448bfcfd548b3ae4e665a84cc86c4694)。但是我实操后,还是觉得结合OpenAIAuth更好用一些,并把修改后的代码放进[get_freeai_api.py](https://github.com/elphen-wang/FreeAI/blob/main/get_freeai_api.py)文件,生成的`share_tokens.txt`中Pool Token(如果有两个及以上的账户密码的话)和Share Token是并存的。
43 |
44 | ## gpt_academic
45 | 本人之前搭建专属自己的OpenAI API反向代理的教程[ChatGPT Wallfree](https://github.com/elphen-wang/chatgpt_wallfree)只实现了gpt_academic免科学上网功能,但仍需使用OpenAI原生的API key。这里还是以它为例,本次开发者无需自己搭建反向代理服务,也无需OpenAI原生的API key,可以为一般的科研组省下一笔不易报销的经费支出。
46 |
47 | 开发者可使用本项目中[gpt_academic](https://github.com/elphen-wang/FreeAI/tree/main/gpt_academic)文件夹中的文件替代官方的文件(`主要是修改toolbox.py和config_private.py对access token的识别和获取`),也可在此基础上加入自己的设定(如gpt_academic账户密码等)。如此之后,按照官方的调试运行和部署指引,gpt_academic就可以不用科学上网又能免费使用gpt-3.5啦!
48 |
49 | 在使用自己账户的access token的场景中,需要用户自行设定定时执行get_freeai_api.py的功能,如每天凌晨四点执行一次,以克服OpenAI cookie只有14天生命周期带来的频繁手动更新access token的问题。
50 |
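定时执行既可以用系统的 crontab,也可以复用依赖中已有的 apscheduler。一个“每天凌晨四点刷新一次”的 Python 示意(假设与 get_freeai_api.py 同目录运行):

```python
from apscheduler.schedulers.blocking import BlockingScheduler
from get_freeai_api import run   # 重新生成 share_tokens.txt

scheduler = BlockingScheduler()
scheduler.add_job(run, trigger='cron', hour=4, minute=0)  # 每天 04:00 刷新
scheduler.start()  # 阻塞运行;常驻后台可改用 BackgroundScheduler
```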
51 | `tips`:
52 | + 要使用gpt_academic arxiv翻译功能,在docker模式下,需要进行以下编译:
53 | ``` bash {.line-numbers}
54 | #编译 docker 镜像
55 | docker build -t gpt-academic-nolocal-latex -f docs/GithubAction+NoLocal+Latex .
56 | #端口可以自由更换,保持和config.py和config_private.py中设置的一样即可
57 | docker run -d -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --net=host -p 86:86 --restart=always --name gpt-academic gpt-academic-nolocal-latex
58 | ```
59 |
60 | + ***由于gpt_academic设定用户参数配置的读取优先级: 环境变量 > config_private.py > config.py,所以调试中,最好config.py文件也做对应的修改(即改为一样)。不然,用户的配置可能在某些调试情况下不生效,这可能是gpt_academic的bug。***
61 |
62 | ## 后记
63 | + 因为Pandora目前本质上是将OpenAI原生的网页服务还原出来,所以目前还不能免费使用诸如GPT-4等付费服务。不过,这将是本人和一众致力于使AI技术服务更广大群众的开发者今后努力的方向。
64 | + 之前ChatGPT Wallfree教程中提及的ZeroTier内网穿透技术,实测不如[Frp](https://github.com/fatedier/frp)适合中国科研宝宝的体质:Frp更稳定、速度更快,且第三方访问无需安装客户端。
65 |
66 |
67 | ## To-do List
68 | + [ ] 完善gpt_academic的arxiv翻译功能,因为我是一个科研民工...
69 | + [ ] 集成new bing的gpt4服务...
70 |
71 | ## Star历史
72 |
73 | 
74 |
75 |
76 |
--------------------------------------------------------------------------------
/old/gpt_academic_old/config_private.py:
--------------------------------------------------------------------------------
1 | """
2 | 以下所有配置也都支持利用环境变量覆写,环境变量配置格式见docker-compose.yml。
3 | 读取优先级:环境变量 > config_private.py > config.py
4 | --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
5 | All the following configurations also support using environment variables to override,
6 | and the environment variable configuration format can be seen in docker-compose.yml.
7 | Configuration reading priority: environment variable > config_private.py > config.py
8 | """
9 |
10 | # [step 1]>> API_KEY = "sk-123456789xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123456789"。极少数情况下,还需要填写组织(格式如org-123456789abcdefghijklmno的),请向下翻,找 API_ORG 设置项
11 | API_KEY = "pk-this-is-a-real-free-pool-token-for-everyone" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
12 |
13 | ''' #和上面的API_KEY赋值可以二选一,这段代码是配合用自己的OpenAI账户密码来设定的。
14 | from os import path
15 | current_dir = path.dirname(path.abspath(__file__))
16 | share_tokens_file = path.join(current_dir, 'share_tokens.txt')
17 | with open(share_tokens_file, 'r', encoding='utf-8') as f:
18 | free_apis= f.read().split('\n')
19 | API_KEY=','.join(filter(None,free_apis))
20 | '''
21 |
22 |
23 | # [step 2]>> 改为True应用代理,如果直接在海外服务器部署,此处不修改
24 | USE_PROXY = False
25 | if USE_PROXY:
26 | """
27 | 填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
28 | <配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
29 | [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
30 | [地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了(localhost意思是代理软件安装在本机上)
31 | [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
32 | """
33 | # 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
34 | proxies = {
35 | # [协议]:// [地址] :[端口]
36 | "http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
37 | "https": "socks5h://localhost:11284", # 再例如 "https": "http://127.0.0.1:7890",
38 | }
39 | else:
40 | proxies = None
41 |
42 | # ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
43 |
44 | # URL重定向,实现更换API_URL的作用(常规情况下,不要修改!! 高危设置!通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人!)
45 | # 格式 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
46 | # 例如 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://reverse-proxy-url/v1/chat/completions"}
47 | API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://ai.fakeopen.com/v1/chat/completions"}
48 |
49 |
50 | # 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
51 | # 一言以蔽之:免费(5刀)用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询:https://platform.openai.com/docs/guides/rate-limits/overview
52 | DEFAULT_WORKER_NUM = 3
53 |
54 |
55 | # 对话窗的高度
56 | CHATBOT_HEIGHT = 1115
57 |
58 |
59 | # 代码高亮
60 | CODE_HIGHLIGHT = True
61 |
62 |
63 | # 窗口布局
64 | LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
65 | DARK_MODE = True # 暗色模式 / 亮色模式
66 |
67 |
68 | # 发送请求到OpenAI后,等待多久判定为超时
69 | TIMEOUT_SECONDS = 30
70 |
71 |
72 | # 网页的端口, -1代表随机端口
73 | WEB_PORT = -1
74 |
75 |
76 | # 如果OpenAI不响应(网络卡顿、代理失败、KEY失效),重试的次数限制
77 | MAX_RETRY = 2
78 |
79 |
80 | # 模型选择 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
81 | LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
82 | AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
83 | # P.S. 其他可用的模型还包括 ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
84 |
85 |
86 | # 本地LLM模型如ChatGLM的执行方式 CPU/GPU
87 | LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
88 |
89 |
90 | # 设置gradio的并行线程数(不需要修改)
91 | CONCURRENT_COUNT = 100
92 |
93 |
94 | # 是否在提交时自动清空输入框
95 | AUTO_CLEAR_TXT = False
96 |
97 |
98 | # 加一个live2d装饰
99 | ADD_WAIFU = False
100 |
101 |
102 | # 设置用户名和密码(不需要修改)(相关功能不稳定,与gradio版本和网络都相关,如果本地使用不建议加这个)
103 | # [("username", "password"), ("username2", "password2"), ...]
104 | AUTHENTICATION = []
105 |
106 |
107 | # 如果需要在二级路径下运行(常规情况下,不要修改!!)(需要配合修改main.py才能生效!)
108 | CUSTOM_PATH = "/"
109 |
110 |
111 | # 极少数情况下,openai的官方KEY需要伴随组织编码(格式如org-xxxxxxxxxxxxxxxxxxxxxxxx)使用
112 | API_ORG = ""
113 |
114 |
115 | # 如果需要使用Slack Claude,使用教程详情见 request_llm/README.md
116 | SLACK_CLAUDE_BOT_ID = ''
117 | SLACK_CLAUDE_USER_TOKEN = ''
118 |
119 |
120 | # 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
121 | AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
122 | AZURE_API_KEY = "填入azure openai api的密钥" # 建议直接在API_KEY处填写,该选项即将被弃用
123 | AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
124 |
125 |
126 | # 使用Newbing
127 | NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
128 | NEWBING_COOKIES = """
129 | put your new bing cookies here
130 | """
131 |
--------------------------------------------------------------------------------
/old/gpt_academic_old/credentials.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/elphen-wang/FreeAI/25cd8bfff04607fc73a464bfcff948d102b65c4f/old/gpt_academic_old/credentials.txt
--------------------------------------------------------------------------------
/old/gpt_academic_old/get_freeai_api.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 |
3 | from os import path
4 | import requests
5 | from OpenAIAuth import Auth0
6 |
7 | def run():
8 | expires_in = 0
9 | unique_name = 'my share token'
10 | current_dir = path.dirname(path.abspath(__file__))
11 | credentials_file = path.join(current_dir, 'credentials.txt')
12 | share_tokens_file = path.join(current_dir, 'share_tokens.txt')
13 | with open(credentials_file, 'r', encoding='utf-8') as f:
14 | credentials = f.read().split('\n')
15 | credentials = [credential.split(',', 1) for credential in credentials]
16 | count = 0
17 | token_keys = []
18 | for credential in credentials:
19 | progress = '{}/{}'.format(credentials.index(credential) + 1, len(credentials))
20 | if not credential or len(credential) != 2:
21 | continue
22 |
23 | count += 1
24 | username, password = credential[0].strip(), credential[1].strip()
25 | token_info = {
26 | 'token': 'None',
27 | 'share_token': 'None',
28 | }
29 | token_keys.append(token_info)
30 | try:
31 | auth = Auth0(email=username, password=password)
32 | token_info['token'] = auth.get_access_token()
33 | #print('Login success: {}, {}'.format(username, progress))
34 | except Exception as e:
35 | err_str = str(e).replace('\n', '').replace('\r', '').strip()
36 | #print('Login failed: {}, {}'.format(username, err_str))
37 | token_info['token'] = err_str
38 | continue
39 | data = {
40 | 'unique_name': unique_name,
41 | 'access_token': token_info['token'],
42 | 'expires_in': expires_in,
43 | }
44 | resp = requests.post('https://ai.fakeopen.com/token/register', data=data)
45 | if resp.status_code == 200:
46 | token_info['share_token'] = resp.json()['token_key']
47 | else:
48 | continue
49 |
50 | with open(share_tokens_file, 'w', encoding='utf-8') as f:
51 |         # 如果账号大于一个,优先使用pool;只有一个时,使用单独的api;一个都没有,则用公共pool。
52 | if count==0:
53 | f.write('pk-this-is-a-real-free-pool-token-for-everyone\n')
54 | f.write('pk-this-is-a-real-free-api-key-pk-for-everyone\n')
55 | elif count==1:
56 | f.write('{}\n'.format(token_keys[0]['share_token']))
57 | else:
58 | data = {
59 | 'share_tokens': '\n'.join([token_info['share_token'] for token_info in token_keys]),
60 | }
61 | resp = requests.post('https://ai.fakeopen.com/pool/update', data=data)
62 | if resp.status_code == 200:
63 | f.write('{}\n'.format(resp.json()['pool_token']))
64 | for token_info in token_keys:
65 | f.write('{}\n'.format(token_info['share_token']))
66 | f.close()
67 |
68 | if __name__ == '__main__':
69 | run()
70 |
71 |
--------------------------------------------------------------------------------
/old/gpt_academic_old/toolbox.py:
--------------------------------------------------------------------------------
1 | import markdown
2 | import importlib
3 | import time
4 | import inspect
5 | import re
6 | import os
7 | import gradio
8 | from latex2mathml.converter import convert as tex2mathml
9 | from functools import wraps, lru_cache
10 | pj = os.path.join
11 |
12 | """
13 | ========================================================================
14 | 第一部分
15 | 函数插件输入输出接驳区
16 | - ChatBotWithCookies: 带Cookies的Chatbot类,为实现更多强大的功能做基础
17 | - ArgsGeneralWrapper: 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构
18 | - update_ui: 刷新界面用 yield from update_ui(chatbot, history)
19 | - CatchException: 将插件中出的所有问题显示在界面上
20 | - HotReload: 实现插件的热更新
21 | - trimmed_format_exc: 打印traceback,为了安全而隐藏绝对地址
22 | ========================================================================
23 | """
24 |
25 | class ChatBotWithCookies(list):
26 | def __init__(self, cookie):
27 | self._cookies = cookie
28 |
29 | def write_list(self, list):
30 | for t in list:
31 | self.append(t)
32 |
33 | def get_list(self):
34 | return [t for t in self]
35 |
36 | def get_cookies(self):
37 | return self._cookies
38 |
39 |
40 | def ArgsGeneralWrapper(f):
41 | """
42 | 装饰器函数,用于重组输入参数,改变输入参数的顺序与结构。
43 | """
44 | def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
45 | txt_passon = txt
46 | if txt == "" and txt2 != "": txt_passon = txt2
47 | # 引入一个有cookie的chatbot
48 | cookies.update({
49 | 'top_p':top_p,
50 | 'temperature':temperature,
51 | })
52 | llm_kwargs = {
53 | 'api_key': cookies['api_key'],
54 | 'llm_model': llm_model,
55 | 'top_p':top_p,
56 | 'max_length': max_length,
57 | 'temperature':temperature,
58 | 'client_ip': request.client.host,
59 | }
60 | plugin_kwargs = {
61 | "advanced_arg": plugin_advanced_arg,
62 | }
63 | chatbot_with_cookie = ChatBotWithCookies(cookies)
64 | chatbot_with_cookie.write_list(chatbot)
65 | if cookies.get('lock_plugin', None) is None:
66 | # 正常状态
67 | yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
68 | else:
69 | # 处理个别特殊插件的锁定状态
70 | module, fn_name = cookies['lock_plugin'].split('->')
71 | f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
72 | yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
73 | return decorated
74 |
75 |
76 | def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
77 | """
78 | 刷新用户界面
79 | """
80 | assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
81 | cookies = chatbot.get_cookies()
82 |
83 | # 解决插件锁定时的界面显示问题
84 | if cookies.get('lock_plugin', None):
85 | label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
86 | chatbot_gr = gradio.update(value=chatbot, label=label)
87 | if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
88 | elif cookies.get('label', None):
89 | chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
90 | cookies['label'] = None # 清空label
91 | else:
92 | chatbot_gr = chatbot
93 |
94 | yield cookies, chatbot_gr, history, msg
95 |
96 | def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
97 | """
98 | 刷新用户界面
99 | """
100 | if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
101 | chatbot[-1] = list(chatbot[-1])
102 | chatbot[-1][-1] = lastmsg
103 | yield from update_ui(chatbot=chatbot, history=history)
104 | time.sleep(delay)
105 |
106 |
107 | def trimmed_format_exc():
108 | import os, traceback
109 | str = traceback.format_exc()
110 | current_path = os.getcwd()
111 | replace_path = "."
112 | return str.replace(current_path, replace_path)
113 |
114 | def CatchException(f):
115 | """
116 | 装饰器函数,捕捉函数f中的异常并封装到一个生成器中返回,并显示到聊天当中。
117 | """
118 |
119 | @wraps(f)
120 | def decorated(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT=-1):
121 | try:
122 | yield from f(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT)
123 | except Exception as e:
124 | from check_proxy import check_proxy
125 | from toolbox import get_conf
126 | proxies, = get_conf('proxies')
127 | tb_str = '```\n' + trimmed_format_exc() + '```'
128 | if len(chatbot) == 0:
129 | chatbot.clear()
130 | chatbot.append(["插件调度异常", "异常原因"])
131 | chatbot[-1] = (chatbot[-1][0],
132 | f"[Local Message] 实验性函数调用出错: \n\n{tb_str} \n\n当前代理可用性: \n\n{check_proxy(proxies)}")
133 |             yield from update_ui(chatbot=chatbot, history=history, msg=f'异常 {e}')  # refresh the UI
134 | return decorated
135 |
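# --- Editorial sketch (not part of the original toolbox.py) ---
# A minimal plugin wrapped with CatchException: if the body raises, the
# decorator appends the traceback to the chatbot instead of crashing the UI.
@CatchException
def _demo_failing_plugin(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT=-1):
    chatbot.append([txt, "about to fail..."])
    yield from update_ui(chatbot=chatbot, history=history)
    raise RuntimeError("simulated plugin failure")  # ends up in the chat window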
136 |
137 | def HotReload(f):
138 | """
139 |     Decorator that hot-reloads a Python function plugin.
140 |     Hot reloading means updating a function's code while the program keeps running, so changes take effect immediately.
141 |     Inside the decorator, wraps(f) preserves the function's metadata, and an inner function named decorated is defined.
142 |     The inner function reloads the module that defines f via importlib.reload and inspect.getmodule,
143 |     looks the function up again by name with getattr,
144 |     and delegates to the freshly loaded version with yield from.
145 |     The decorator returns this inner function, which always runs the latest definition of f.
146 | """
147 | @wraps(f)
148 | def decorated(*args, **kwargs):
149 | fn_name = f.__name__
150 | f_hot_reload = getattr(importlib.reload(inspect.getmodule(f)), fn_name)
151 | yield from f_hot_reload(*args, **kwargs)
152 | return decorated
153 |
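# --- Editorial sketch (not part of the original toolbox.py) ---
# What HotReload buys you: the module defining the plugin is re-imported on
# every call, so edits to its source take effect without a restart.
# `my_plugin`/`my_func` are hypothetical names.
def _demo_hot_reload():
    from my_plugin import my_func          # hypothetical generator plugin
    hot_func = HotReload(my_func)          # wrap once, at registration time
    for _ in hot_func("some input"):       # each call reloads my_plugin first
        pass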
154 |
155 | """
156 | ========================================================================
157 | Part 2
158 | Miscellaneous utilities:
159 | - write_results_to_file: write the results to a markdown file
160 | - regular_txt_to_markdown: convert plain text to Markdown-formatted text
161 | - report_execption: append a brief error report to the chatbot
162 | - text_divide_paragraph: split text on paragraph separators and generate HTML with paragraph tags
163 | - markdown_convertion: convert markdown into presentable html by combining several techniques
164 | - format_io: take over gradio's default markdown handling
165 | - on_file_uploaded: handle file uploads (auto-extract archives)
166 | - on_report_generated: push generated reports to the file-upload area
167 | - clip_history: automatically truncate the history when the context grows too long
168 | - get_conf: read a config option
169 | - select_api_key: pick a usable api-key for the current model class
170 | ========================================================================
171 | """
172 |
173 | def get_reduce_token_percent(text):
174 | """
175 |     * This function will be deprecated in the future.
176 | """
177 | try:
178 | # text = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
179 | pattern = r"(\d+)\s+tokens\b"
180 | match = re.findall(pattern, text)
181 |         EXCEED_ALLO = 500  # leave some headroom, otherwise the reply may fail when too little budget remains
182 | max_limit = float(match[0]) - EXCEED_ALLO
183 | current_tokens = float(match[1])
184 | ratio = max_limit/current_tokens
185 | assert ratio > 0 and ratio < 1
186 | return ratio, str(int(current_tokens-max_limit))
187 | except:
188 | return 0.5, '不详'
189 |
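# --- Editorial sketch (not part of the original toolbox.py) ---
# The regex above pulls both token counts out of an OpenAI overflow error;
# the returned ratio says how far the history must shrink.
def _demo_get_reduce_token_percent():
    msg = "maximum context length is 4097 tokens. However, your messages resulted in 4870 tokens"
    ratio, exceeded = get_reduce_token_percent(msg)
    # ratio == (4097 - 500) / 4870 ≈ 0.74, exceeded == '1273'
    print(ratio, exceeded)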
190 |
191 | def write_results_to_file(history, file_name=None):
192 | """
193 |     Write the conversation history to a file in Markdown format. If no file name is given, generate one from the current time.
194 | """
195 | import os
196 | import time
197 | if file_name is None:
198 | # file_name = time.strftime("chatGPT分析报告%Y-%m-%d-%H-%M-%S", time.localtime()) + '.md'
199 | file_name = 'chatGPT分析报告' + \
200 | time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + '.md'
201 | os.makedirs('./gpt_log/', exist_ok=True)
202 | with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
203 | f.write('# chatGPT 分析报告\n')
204 | for i, content in enumerate(history):
205 | try:
206 | if type(content) != str: content = str(content)
207 | except:
208 | continue
209 | if i % 2 == 0:
210 | f.write('## ')
211 | try:
212 | f.write(content)
213 | except:
214 | # remove everything that cannot be handled by utf8
215 | f.write(content.encode('utf-8', 'ignore').decode())
216 | f.write('\n\n')
217 | res = '以上材料已经被写入' + os.path.abspath(f'./gpt_log/{file_name}')
218 | print(res)
219 | return res
220 |
221 |
222 | def regular_txt_to_markdown(text):
223 | """
224 |     Convert plain text to Markdown-formatted text.
225 | """
226 | text = text.replace('\n', '\n\n')
227 | text = text.replace('\n\n\n', '\n\n')
228 | text = text.replace('\n\n\n', '\n\n')
229 | return text
230 |
231 |
232 |
233 |
234 | def report_execption(chatbot, history, a, b):
235 | """
236 |     Append error information to the chatbot.
237 | """
238 | chatbot.append((a, b))
239 | history.append(a)
240 | history.append(b)
241 |
242 |
243 | def text_divide_paragraph(text):
244 | """
245 |     Split the text on paragraph separators and generate HTML code with paragraph tags.
246 |     """
247 |     pre = '<div class="markdown-body">'
248 |     suf = '</div>'
249 | if text.startswith(pre) and text.endswith(suf):
250 | return text
251 |
252 | if '```' in text:
253 | # careful input
254 | return pre + text + suf
255 | else:
256 | # wtf input
257 | lines = text.split("\n")
258 | for i, line in enumerate(lines):
259 |             lines[i] = lines[i].replace(" ", "&nbsp;")
260 |         text = "</br>".join(lines)
261 | return pre + text + suf
262 |
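# --- Editorial sketch (not part of the original toolbox.py) ---
# Plain input is wrapped in the markdown-body div; spaces become &nbsp; and
# newlines become </br> so gradio preserves the layout.
def _demo_text_divide_paragraph():
    html = text_divide_paragraph("a  b\nc")
    assert html.startswith('<div class="markdown-body">') and '&nbsp;' in html
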
263 | @lru_cache(maxsize=128)  # use an LRU cache to speed up conversion
264 | def markdown_convertion(txt):
265 | """
266 |     Convert Markdown text into HTML. If the text contains math formulas, convert the formulas to HTML first.
267 |     """
268 |     pre = '<div class="markdown-body">'
269 |     suf = '</div>'
270 | if txt.startswith(pre) and txt.endswith(suf):
271 |         # print('warning: the input string has already been converted; converting it twice may cause problems')
272 |         return txt  # already converted, no need to convert again
273 |
274 | markdown_extension_configs = {
275 | 'mdx_math': {
276 | 'enable_dollar_delimiter': True,
277 | 'use_gitlab_delimiters': False,
278 | },
279 | }
280 |     find_equation_pattern = r'<script type="math/tex(?:.*?)>(.*?)</script>'
281 | 
282 |     def tex2mathml_catch_exception(content, *args, **kwargs):
283 |         try:
284 |             content = tex2mathml(content, *args, **kwargs)
285 |         except:
286 |             content = content
287 |         return content
288 | 
289 |     def replace_math_no_render(match):
290 |         content = match.group(1)
291 |         if 'mode=display' in match.group(0):
292 |             content = content.replace('\n', '</br>')
293 |             return f"<font color=\"#00FF00\">$$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$$</font>"
294 |         else:
295 |             return f"<font color=\"#00FF00\">$</font><font color=\"#FF00FF\">{content}</font><font color=\"#00FF00\">$</font>"
296 | 
297 |     def replace_math_render(match):
298 |         content = match.group(1)
299 |         if 'mode=display' in match.group(0):
300 |             if '\\begin{aligned}' in content:
301 |                 content = content.replace('\\begin{aligned}', '\\begin{array}')
302 |                 content = content.replace('\\end{aligned}', '\\end{array}')
303 |                 content = content.replace('&', ' ')
304 |             content = tex2mathml_catch_exception(content, display="block")
305 |             return content
306 |         else:
307 |             return tex2mathml_catch_exception(content)
308 | 
309 |     def markdown_bug_hunt(txt):
310 |         """
311 |         Work around an mdx_math bug (a redundant <script></script> pair when a begin-command is wrapped in single $).
312 |         """
313 |         content = txt.replace('<script type="math/tex">\n<script type="math/tex; mode=display">', '<script type="math/tex; mode=display">')
314 |         content = content.replace('</script>\n</script>', '</script>')
315 |         return content
316 |
317 | def no_code(txt):
318 | if '```' not in txt:
319 | return True
320 | else:
321 | if '```reference' in txt: return True # newbing
322 | else: return False
323 |
324 |     if ('$' in txt) and no_code(txt):  # the text contains $-delimited math and no ``` code fences
325 | # convert everything to html format
326 | split = markdown.markdown(text='---')
327 | convert_stage_1 = markdown.markdown(text=txt, extensions=['mdx_math', 'fenced_code', 'tables', 'sane_lists'], extension_configs=markdown_extension_configs)
328 | convert_stage_1 = markdown_bug_hunt(convert_stage_1)
329 | # re.DOTALL: Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline. Corresponds to the inline flag (?s).
330 | # 1. convert to easy-to-copy tex (do not render math)
331 | convert_stage_2_1, n = re.subn(find_equation_pattern, replace_math_no_render, convert_stage_1, flags=re.DOTALL)
332 | # 2. convert to rendered equation
333 | convert_stage_2_2, n = re.subn(find_equation_pattern, replace_math_render, convert_stage_1, flags=re.DOTALL)
334 | # cat them together
335 | return pre + convert_stage_2_1 + f'{split}' + convert_stage_2_2 + suf
336 | else:
337 | return pre + markdown.markdown(txt, extensions=['fenced_code', 'codehilite', 'tables', 'sane_lists']) + suf
338 |
339 |
340 | def close_up_code_segment_during_stream(gpt_reply):
341 | """
342 |     While gpt is streaming a code block (the opening ``` is out but the closing ``` is not yet), append the closing ```.
343 | 
344 |     Args:
345 |         gpt_reply (str): the reply string returned by the GPT model.
346 | 
347 |     Returns:
348 |         str: a new string with the pending code block's closing ``` appended.
349 |
350 | """
351 | if '```' not in gpt_reply:
352 | return gpt_reply
353 | if gpt_reply.endswith('```'):
354 | return gpt_reply
355 |
356 |     # with the two cases above ruled out, count the ``` marks: an odd count means a code block is still open
357 |     segments = gpt_reply.split('```')
358 |     n_mark = len(segments) - 1
359 |     if n_mark % 2 == 1:
360 |         # print('currently streaming a code block!')
361 |         return gpt_reply + '\n```'
362 | else:
363 | return gpt_reply
364 |
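# --- Editorial sketch (not part of the original toolbox.py) ---
# With an odd number of ``` marks the reply is still inside a code block,
# so a closing fence is appended before rendering.
def _demo_close_up_code_segment():
    partial = "Here is the code:\n```python\nprint('hi')"
    assert close_up_code_segment_during_stream(partial).endswith("\n```")
    done = "```python\nprint('hi')\n```"
    assert close_up_code_segment_during_stream(done) == done  # already balanced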
365 |
366 | def format_io(self, y):
367 | """
368 |     Parse input and output into HTML: paragraphize the input part of the last item in y, and convert the Markdown and math in the output part to HTML.
369 | """
370 | if y is None or y == []:
371 | return []
372 | i_ask, gpt_reply = y[-1]
373 |     # the input side is free-form, so preprocess it first
374 |     if i_ask is not None: i_ask = text_divide_paragraph(i_ask)
375 |     # if a code block was cut off mid-stream, try to append the closing ```
376 |     if gpt_reply is not None: gpt_reply = close_up_code_segment_during_stream(gpt_reply)
377 | # process
378 | y[-1] = (
379 | None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
380 | None if gpt_reply is None else markdown_convertion(gpt_reply)
381 | )
382 | return y
383 |
384 |
385 | def find_free_port():
386 | """
387 |     Return an unused port that is currently available on this system.
388 | """
389 | import socket
390 | from contextlib import closing
391 | with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as s:
392 | s.bind(('', 0))
393 | s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
394 | return s.getsockname()[1]
395 |
396 |
397 | def extract_archive(file_path, dest_dir):
398 | import zipfile
399 | import tarfile
400 | import os
401 | # Get the file extension of the input file
402 | file_extension = os.path.splitext(file_path)[1]
403 |
404 | # Extract the archive based on its extension
405 | if file_extension == '.zip':
406 | with zipfile.ZipFile(file_path, 'r') as zipobj:
407 | zipobj.extractall(path=dest_dir)
408 | print("Successfully extracted zip archive to {}".format(dest_dir))
409 |
410 | elif file_extension in ['.tar', '.gz', '.bz2']:
411 | with tarfile.open(file_path, 'r:*') as tarobj:
412 | tarobj.extractall(path=dest_dir)
413 | print("Successfully extracted tar archive to {}".format(dest_dir))
414 |
415 |     # third-party library; requires `pip install rarfile` in advance
416 |     # on Windows, WinRAR must also be installed and added to the Path environment variable, e.g. "C:\Program Files\WinRAR"
417 | elif file_extension == '.rar':
418 | try:
419 | import rarfile
420 | with rarfile.RarFile(file_path) as rf:
421 | rf.extractall(path=dest_dir)
422 | print("Successfully extracted rar archive to {}".format(dest_dir))
423 | except:
424 | print("Rar format requires additional dependencies to install")
425 | return '\n\n解压失败! 需要安装pip install rarfile来解压rar文件'
426 |
427 |     # third-party library; requires `pip install py7zr` in advance
428 | elif file_extension == '.7z':
429 | try:
430 | import py7zr
431 | with py7zr.SevenZipFile(file_path, mode='r') as f:
432 | f.extractall(path=dest_dir)
433 | print("Successfully extracted 7z archive to {}".format(dest_dir))
434 | except:
435 | print("7z format requires additional dependencies to install")
436 | return '\n\n解压失败! 需要安装pip install py7zr来解压7z文件'
437 | else:
438 | return ''
439 | return ''
440 |
441 |
442 | def find_recent_files(directory):
443 | """
444 | me: find files that is created with in one minutes under a directory with python, write a function
445 | gpt: here it is!
446 | """
447 | import os
448 | import time
449 | current_time = time.time()
450 | one_minute_ago = current_time - 60
451 | recent_files = []
452 |
453 | for filename in os.listdir(directory):
454 | file_path = os.path.join(directory, filename)
455 | if file_path.endswith('.log'):
456 | continue
457 | created_time = os.path.getmtime(file_path)
458 | if created_time >= one_minute_ago:
459 | if os.path.isdir(file_path):
460 | continue
461 | recent_files.append(file_path)
462 |
463 | return recent_files
464 |
465 | def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
466 |     # copy the file into the download area
467 | import shutil
468 | if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
469 | new_path = os.path.join(f'./gpt_log/', rename_file)
470 | if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
471 | if not os.path.exists(new_path): shutil.copyfile(file, new_path)
472 | if chatbot:
473 | if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
474 | else: current = []
475 | chatbot._cookies.update({'file_to_promote': [new_path] + current})
476 |
477 | def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
478 | """
479 |     Callback invoked when files are uploaded.
480 | """
481 |     if len(files) == 0:
482 |         return chatbot, txt, txt2  # match the three outputs of the normal return path
483 | import shutil
484 | import os
485 | import time
486 | import glob
487 | from toolbox import extract_archive
488 | try:
489 | shutil.rmtree('./private_upload/')
490 | except:
491 | pass
492 | time_tag = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
493 | os.makedirs(f'private_upload/{time_tag}', exist_ok=True)
494 | err_msg = ''
495 | for file in files:
496 | file_origin_name = os.path.basename(file.orig_name)
497 | shutil.copy(file.name, f'private_upload/{time_tag}/{file_origin_name}')
498 | err_msg += extract_archive(f'private_upload/{time_tag}/{file_origin_name}',
499 | dest_dir=f'private_upload/{time_tag}/{file_origin_name}.extract')
500 | moved_files = [fp for fp in glob.glob('private_upload/**/*', recursive=True)]
501 | if "底部输入区" in checkboxes:
502 | txt = ""
503 | txt2 = f'private_upload/{time_tag}'
504 | else:
505 | txt = f'private_upload/{time_tag}'
506 | txt2 = ""
507 | moved_files_str = '\t\n\n'.join(moved_files)
508 | chatbot.append(['我上传了文件,请查收',
509 | f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
510 | f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
511 | f'\n\n现在您点击任意“红颜色”标识的函数插件时,以上文件将被作为输入参数'+err_msg])
512 | return chatbot, txt, txt2
513 |
514 |
515 | def on_report_generated(cookies, files, chatbot):
516 | from toolbox import find_recent_files
517 | if 'file_to_promote' in cookies:
518 | report_files = cookies['file_to_promote']
519 | cookies.pop('file_to_promote')
520 | else:
521 | report_files = find_recent_files('gpt_log')
522 | if len(report_files) == 0:
523 | return cookies, None, chatbot
524 | # files.extend(report_files)
525 | file_links = ''
526 |     for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
527 | chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
528 | return cookies, report_files, chatbot
529 |
530 | def load_chat_cookies():
531 | API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
532 | if is_any_api_key(AZURE_API_KEY):
533 | if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
534 | else: API_KEY = AZURE_API_KEY
535 | #print(API_KEY)
536 | return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
537 |
538 | def is_openai_api_key(key):
539 | API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
540 | return bool(API_MATCH_ORIGINAL)
541 |
542 | def is_azure_api_key(key):
543 | API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
544 | return bool(API_MATCH_AZURE)
545 |
546 | def is_api2d_key(key):
547 | API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
548 | return bool(API_MATCH_API2D)
549 |
550 | def is_freeai_api_key(key):#new add
551 | API_MATCH_FREEAI0 = re.match(r"pk-[a-zA-Z0-9-_]{43}$", key)
552 | API_MATCH_FREEAI1 = re.match(r"fk-[a-zA-Z0-9-_]{43}$", key)
553 | return bool(API_MATCH_FREEAI0) or bool(API_MATCH_FREEAI1)
554 |
555 | def is_any_api_key(key):
556 | if ',' in key:
557 | keys = key.split(',')
558 | for k in keys:
559 | if is_any_api_key(k): return True
560 | return False
561 | else:#new add
562 | return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key) or is_freeai_api_key(key)
563 |
564 | def what_keys(keys):#new add
565 | avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0, "FreeAI Key":0}
566 | key_list = keys.split(',')
567 |
568 | for k in key_list:
569 | if is_openai_api_key(k):
570 | avail_key_list['OpenAI Key'] += 1
571 |
572 | for k in key_list:
573 | if is_api2d_key(k):
574 | avail_key_list['API2D Key'] += 1
575 |
576 | for k in key_list:
577 | if is_azure_api_key(k):
578 | avail_key_list['Azure Key'] += 1
579 |
580 | for k in key_list: # new add
581 | if is_freeai_api_key(k):
582 | avail_key_list['FreeAI Key'] += 1
583 |
584 | #new add
585 | return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']} 个, FreeAI Key {avail_key_list['FreeAI Key']} 个"
586 |
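# --- Editorial sketch (not part of the original toolbox.py) ---
# Shapes accepted by the validators above; every key below is a fake
# placeholder of the right length, not a real credential.
def _demo_key_formats():
    assert is_openai_api_key("sk-" + "a" * 48)                    # sk- plus 48 chars
    assert is_api2d_key("fk" + "a" * 6 + "-" + "b" * 32)          # API2D key
    assert is_freeai_api_key("pk-" + "c" * 43)                    # Pandora pool token
    assert is_freeai_api_key("fk-" + "d" * 43)                    # Pandora share token
    assert is_any_api_key("sk-" + "a" * 48 + ",pk-" + "c" * 43)   # comma-separated mix
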
587 | def select_api_key(keys, llm_model):
588 | import random
589 | avail_key_list = []
590 | key_list = keys.split(',')
591 |
592 | if llm_model.startswith('gpt-'):
593 | for k in key_list:
594 | if is_openai_api_key(k): avail_key_list.append(k)
595 | for k in key_list:# new add
596 | if is_freeai_api_key(k): avail_key_list.append(k)
597 |
598 | if llm_model.startswith('api2d-'):
599 | for k in key_list:
600 | if is_api2d_key(k): avail_key_list.append(k)
601 |
602 | if llm_model.startswith('azure-'):
603 | for k in key_list:
604 | if is_azure_api_key(k): avail_key_list.append(k)
605 |
606 | if len(avail_key_list) == 0:
607 | raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源(右下角更换模型菜单中可切换openai,azure和api2d请求源)")
608 |
609 |     api_key = random.choice(avail_key_list)  # random load balancing
610 | return api_key
611 |
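# --- Editorial sketch (not part of the original toolbox.py) ---
# Routing behaviour: for gpt-* models both native sk- keys and FreeAI pk-/fk-
# tokens are eligible, and one is drawn at random as naive load balancing.
def _demo_select_api_key():
    keys = "sk-" + "a" * 48 + ",pk-" + "b" * 43   # fake placeholder keys
    chosen = select_api_key(keys, 'gpt-3.5-turbo')
    assert chosen in keys.split(',')
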
612 | def read_env_variable(arg, default_value):
613 | """
614 |     The environment variable can be named `GPT_ACADEMIC_CONFIG` (takes priority) or simply `CONFIG`
615 |     For example, in the Windows cmd shell you can write either:
616 | set USE_PROXY=True
617 | set API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
618 | set proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
619 | set AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
620 | set AUTHENTICATION=[("username", "password"), ("username2", "password2")]
621 |     or equivalently:
622 | set GPT_ACADEMIC_USE_PROXY=True
623 | set GPT_ACADEMIC_API_KEY=sk-j7caBpkRoxxxxxxxxxxxxxxxxxxxxxxxxxxxx
624 | set GPT_ACADEMIC_proxies={"http":"http://127.0.0.1:10085", "https":"http://127.0.0.1:10085",}
625 | set GPT_ACADEMIC_AVAIL_LLM_MODELS=["gpt-3.5-turbo", "chatglm"]
626 | set GPT_ACADEMIC_AUTHENTICATION=[("username", "password"), ("username2", "password2")]
627 | """
628 | from colorful import print亮红, print亮绿
629 | arg_with_prefix = "GPT_ACADEMIC_" + arg
630 | if arg_with_prefix in os.environ:
631 | env_arg = os.environ[arg_with_prefix]
632 | elif arg in os.environ:
633 | env_arg = os.environ[arg]
634 | else:
635 | raise KeyError
636 | print(f"[ENV_VAR] 尝试加载{arg},默认值:{default_value} --> 修正值:{env_arg}")
637 | try:
638 | if isinstance(default_value, bool):
639 | env_arg = env_arg.strip()
640 | if env_arg == 'True': r = True
641 | elif env_arg == 'False': r = False
642 |             else: print('expected True or False, but got:', env_arg); r = default_value
643 | elif isinstance(default_value, int):
644 | r = int(env_arg)
645 | elif isinstance(default_value, float):
646 | r = float(env_arg)
647 | elif isinstance(default_value, str):
648 | r = env_arg.strip()
649 | elif isinstance(default_value, dict):
650 | r = eval(env_arg)
651 | elif isinstance(default_value, list):
652 | r = eval(env_arg)
653 | elif default_value is None:
654 | assert arg == "proxies"
655 | r = eval(env_arg)
656 | else:
657 | print亮红(f"[ENV_VAR] 环境变量{arg}不支持通过环境变量设置! ")
658 | raise KeyError
659 | except:
660 | print亮红(f"[ENV_VAR] 环境变量{arg}加载失败! ")
661 | raise KeyError(f"[ENV_VAR] 环境变量{arg}加载失败! ")
662 |
663 | print亮绿(f"[ENV_VAR] 成功读取环境变量{arg}")
664 | return r
665 |
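# --- Editorial sketch (not part of the original toolbox.py) ---
# The raw environment string is coerced to the type of the config default,
# so list/dict options can be passed through the environment as literals.
def _demo_read_env_variable():
    os.environ['GPT_ACADEMIC_AVAIL_LLM_MODELS'] = '["gpt-3.5-turbo", "chatglm"]'
    models = read_env_variable('AVAIL_LLM_MODELS', default_value=[])
    assert models == ["gpt-3.5-turbo", "chatglm"]
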
666 | @lru_cache(maxsize=128)
667 | def read_single_conf_with_lru_cache(arg):
668 | from colorful import print亮红, print亮绿, print亮蓝
669 | try:
670 |         # priority 1: take the config value from an environment variable
671 |         default_ref = getattr(importlib.import_module('config'), arg)  # read the default value as the reference for type conversion
672 | r = read_env_variable(arg, default_ref)
673 | except:
674 | try:
675 |             # priority 2: read the value from config_private
676 | r = getattr(importlib.import_module('config_private'), arg)
677 | except:
678 |             # priority 3: read the value from config
679 | r = getattr(importlib.import_module('config'), arg)
680 |
681 |     # when reading API_KEY, check whether the user forgot to edit config
682 | if arg == 'API_KEY':
683 | print亮蓝(f"[API_KEY] 本项目现已支持OpenAI和API2D的api-key。也支持同时填写多个api-key,如API_KEY=\"openai-key1,openai-key2,api2d-key3\"")
684 | print亮蓝(f"[API_KEY] 您既可以在config.py中修改api-key(s),也可以在问题输入区输入临时的api-key(s),然后回车键提交后即可生效。")
685 | if is_any_api_key(r):
686 | print亮绿(f"[API_KEY] 您的 API_KEY 是: {r[:15]}*** API_KEY 导入成功")
687 | else:
688 | print亮红( "[API_KEY] 正确的 API_KEY 是'sk'开头的51位密钥(OpenAI),或者 'fk'开头的41位密钥,请在config文件中修改API密钥之后再运行。")
689 | if arg == 'proxies':
690 | if r is None:
691 | print亮红('[PROXY] 网络代理状态:未配置。无代理状态下很可能无法访问OpenAI家族的模型。建议:检查USE_PROXY选项是否修改。')
692 | else:
693 | print亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', r)
694 | assert isinstance(r, dict), 'proxies格式错误,请注意proxies选项的格式,不要遗漏括号。'
695 | return r
696 |
697 |
698 | def get_conf(*args):
699 |     # it is recommended to keep your secrets, such as API keys and proxy URLs, in a separate config_private.py, so they are not accidentally pushed to github
700 | res = []
701 | for arg in args:
702 | r = read_single_conf_with_lru_cache(arg)
703 | res.append(r)
704 | return res
705 |
706 |
707 | def clear_line_break(txt):
708 | txt = txt.replace('\n', ' ')
709 | txt = txt.replace(' ', ' ')
710 | txt = txt.replace(' ', ' ')
711 | return txt
712 |
713 |
714 | class DummyWith():
715 | """
716 |     This defines an empty context manager named DummyWith.
717 |     Its job is... well... to do nothing, i.e. to stand in for another context manager without changing the code structure.
718 |     A context manager is a Python object meant to be used with the with statement,
719 |     ensuring that resources are properly initialized and cleaned up while a block of code runs.
720 |     A context manager must implement two methods: __enter__() and __exit__().
721 |     __enter__() is called just before the code block starts executing,
722 |     and __exit__() is called when the context ends.
723 | """
724 | def __enter__(self):
725 | return self
726 |
727 | def __exit__(self, exc_type, exc_value, traceback):
728 | return
729 |
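# --- Editorial sketch (not part of the original toolbox.py) ---
# DummyWith keeps the code shape stable when a real context manager is only
# sometimes needed (here paired with ProxyNetworkActivate, defined below).
def _demo_dummy_with(use_proxy: bool):
    ctx = ProxyNetworkActivate() if use_proxy else DummyWith()
    with ctx:
        pass  # same indentation and structure either way
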
730 | def run_gradio_in_subpath(demo, auth, port, custom_path):
731 | """
732 |     Serve the gradio app under the specified sub-path instead of the root URL.
733 | """
734 | def is_path_legal(path: str)->bool:
735 | '''
736 | check path for sub url
737 | path: path to check
738 | return value: do sub url wrap
739 | '''
740 | if path == "/": return True
741 | if len(path) == 0:
742 | print("ilegal custom path: {}\npath must not be empty\ndeploy on root url".format(path))
743 | return False
744 | if path[0] == '/':
745 | if path[1] != '/':
746 | print("deploy on sub-path {}".format(path))
747 | return True
748 | return False
749 | print("ilegal custom path: {}\npath should begin with \'/\'\ndeploy on root url".format(path))
750 | return False
751 |
752 |     if not is_path_legal(custom_path): raise RuntimeError('Illegal custom path')
753 | import uvicorn
754 | import gradio as gr
755 | from fastapi import FastAPI
756 | app = FastAPI()
757 | if custom_path != "/":
758 | @app.get("/")
759 | def read_main():
760 | return {"message": f"Gradio is running at: {custom_path}"}
761 | app = gr.mount_gradio_app(app, demo, path=custom_path)
762 | uvicorn.run(app, host="0.0.0.0", port=port) # , auth=auth
763 |
764 |
765 | def clip_history(inputs, history, tokenizer, max_token_limit):
766 | """
767 |     Reduce the length of the history by clipping:
768 |     this function repeatedly finds the longest entry and clips it, little by little,
769 |     until the token count of the history falls below the threshold.
773 | """
774 | import numpy as np
775 | from request_llm.bridge_all import model_info
776 | def get_token_num(txt):
777 | return len(tokenizer.encode(txt, disallowed_special=()))
778 | input_token_num = get_token_num(inputs)
779 | if input_token_num < max_token_limit * 3 / 4:
780 |         # when the input takes up less than 3/4 of the limit, clip the history:
781 |         # 1. reserve budget for the input
782 |         max_token_limit = max_token_limit - input_token_num
783 |         # 2. reserve budget for the output
784 |         max_token_limit = max_token_limit - 128
785 |         # 3. if too little budget remains, just clear the history
786 |         if max_token_limit < 128:
787 |             history = []
788 |             return history
789 |     else:
790 |         # when the input takes up more than 3/4 of the limit, clear the history outright
791 |         history = []
792 |         return history
793 |
794 | everything = ['']
795 | everything.extend(history)
796 | n_token = get_token_num('\n'.join(everything))
797 | everything_token = [get_token_num(e) for e in everything]
798 |
799 |     # granularity of each truncation step
800 | delta = max(everything_token) // 16
801 |
802 | while n_token > max_token_limit:
803 | where = np.argmax(everything_token)
804 | encoded = tokenizer.encode(everything[where], disallowed_special=())
805 | clipped_encoded = encoded[:len(encoded)-delta]
806 |         everything[where] = tokenizer.decode(clipped_encoded)[:-1]  # [:-1] drops the last char, which may be a broken half-token
807 | everything_token[where] = get_token_num(everything[where])
808 | n_token = get_token_num('\n'.join(everything))
809 |
810 | history = everything[1:]
811 | return history
812 |
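# --- Editorial sketch (not part of the original toolbox.py) ---
# clip_history with a toy whitespace "tokenizer"; the real caller passes a
# tiktoken encoder exposing the same encode/decode interface.
class _ToyTokenizer:
    def encode(self, txt, disallowed_special=()): return txt.split()
    def decode(self, tokens): return ' '.join(tokens)

def _demo_clip_history():
    history = ["short entry", "a much much much longer entry " * 50]
    clipped = clip_history("question", history, _ToyTokenizer(), max_token_limit=300)
    # the longest entry is trimmed step by step until the total fits the budget
    assert len(_ToyTokenizer().encode('\n'.join([''] + clipped))) <= 300
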
813 | """
814 | ========================================================================
815 | Part 3
816 | Miscellaneous utilities:
817 | - zip_folder: compress every file under a path into a zip and move it to another given path (written by gpt)
818 | - gen_time_str: generate a timestamp
819 | - ProxyNetworkActivate: temporarily enable the proxy network (if one is configured)
820 | - objdump/objload: quick-and-dirty debugging helpers
821 | ========================================================================
822 | """
823 |
824 | def zip_folder(source_folder, dest_folder, zip_name):
825 | import zipfile
826 | import os
827 | # Make sure the source folder exists
828 | if not os.path.exists(source_folder):
829 | print(f"{source_folder} does not exist")
830 | return
831 |
832 | # Make sure the destination folder exists
833 | if not os.path.exists(dest_folder):
834 | print(f"{dest_folder} does not exist")
835 | return
836 |
837 | # Create the name for the zip file
838 | zip_file = os.path.join(dest_folder, zip_name)
839 |
840 | # Create a ZipFile object
841 | with zipfile.ZipFile(zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
842 | # Walk through the source folder and add files to the zip file
843 | for foldername, subfolders, filenames in os.walk(source_folder):
844 | for filename in filenames:
845 | filepath = os.path.join(foldername, filename)
846 | zipf.write(filepath, arcname=os.path.relpath(filepath, source_folder))
847 |
848 | # Move the zip file to the destination folder (if it wasn't already there)
849 | if os.path.dirname(zip_file) != dest_folder:
850 | os.rename(zip_file, os.path.join(dest_folder, os.path.basename(zip_file)))
851 | zip_file = os.path.join(dest_folder, os.path.basename(zip_file))
852 |
853 | print(f"Zip file created at {zip_file}")
854 |
855 | def zip_result(folder):
856 | import time
857 | t = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
858 | zip_folder(folder, './gpt_log/', f'{t}-result.zip')
859 | return pj('./gpt_log/', f'{t}-result.zip')
860 |
861 | def gen_time_str():
862 | import time
863 | return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
864 |
865 | class ProxyNetworkActivate():
866 | """
867 |     Context manager that temporarily applies the configured proxy to a small block of code.
868 | """
869 | def __enter__(self):
870 | from toolbox import get_conf
871 | proxies, = get_conf('proxies')
872 | if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
873 | if proxies is not None:
874 | if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
875 | if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
876 | return self
877 |
878 | def __exit__(self, exc_type, exc_value, traceback):
879 | os.environ['no_proxy'] = '*'
880 | if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
881 | if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
882 | return
883 |
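# --- Editorial sketch (not part of the original toolbox.py) ---
# Scoping a proxied request with ProxyNetworkActivate: HTTP(S)_PROXY exist
# only inside the with-block and are removed again on exit.
def _demo_proxy_network_activate():
    with ProxyNetworkActivate():
        pass  # e.g. a requests.get(...) here would honour the configured proxy
    assert 'HTTP_PROXY' not in os.environ  # cleaned up on exit
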
884 | def objdump(obj, file='objdump.tmp'):
885 | import pickle
886 | with open(file, 'wb+') as f:
887 | pickle.dump(obj, f)
888 | return
889 |
890 | def objload(file='objdump.tmp'):
891 | import pickle, os
892 | if not os.path.exists(file):
893 | return
894 | with open(file, 'rb') as f:
895 | return pickle.load(f)
896 |
897 |
--------------------------------------------------------------------------------