(truncated HTML fragment, original lines 152–173 and 323–325: a refresh-status panel with the labels "Next refresh countdown" (--:--:--), "Estimated next refresh", "Last refresh time", and "Baseline refresh time", each initially showing "Calculating..." / "Loading..." in Beijing time)
--------------------------------------------------------------------------------
/copilot-models.md:
--------------------------------------------------------------------------------
1 | # AI Model Configuration Reference
2 |
3 | ## Basic Model Information
4 |
5 | | Model ID | Model Name | Vendor | Version | Context Window | Official Max Prompt Tokens | Measured Max Input Tokens | Max Output Tokens | Preview |
6 | |--------|---------|------|------|---------------|-------------------|-----------------|-------------|--------|
7 | | gpt-3.5-turbo | GPT 3.5 Turbo | Azure OpenAI | gpt-3.5-turbo-0613 | 16,384 | 16,384 | 12,288 | 4,096 | ❌ |
8 | | gpt-3.5-turbo-0613 | GPT 3.5 Turbo | Azure OpenAI | gpt-3.5-turbo-0613 | 16,384 | 16,384 | 12,288 | 4,096 | ❌ |
9 | | gpt-4o-mini | GPT-4o mini | Azure OpenAI | gpt-4o-mini-2024-07-18 | 128,000 | 64,000 | 12,288 | 4,096 | ❌ |
10 | | gpt-4o-mini-2024-07-18 | GPT-4o mini | Azure OpenAI | gpt-4o-mini-2024-07-18 | 128,000 | 64,000 | 12,288 | 4,096 | ❌ |
11 | | gpt-4 | GPT 4 | Azure OpenAI | gpt-4-0613 | 32,768 | 32,768 | 32,768 | 4,096 | ❌ |
12 | | gpt-4-0613 | GPT 4 | Azure OpenAI | gpt-4-0613 | 32,768 | 32,768 | 32,768 | 4,096 | ❌ |
13 | | gpt-4o | GPT-4o | Azure OpenAI | gpt-4o-2024-11-20 | 128,000 | 64,000 | 64,000 | 16,384 | ❌ |
14 | | gpt-4o-2024-11-20 | GPT-4o | Azure OpenAI | gpt-4o-2024-11-20 | 128,000 | 64,000 | 64,000 | 16,384 | ❌ |
15 | | gpt-4o-2024-05-13 | GPT-4o | Azure OpenAI | gpt-4o-2024-05-13 | 128,000 | 64,000 | 64,000 | 4,096 | ❌ |
16 | | gpt-4-o-preview | GPT-4o | Azure OpenAI | gpt-4o-2024-05-13 | 128,000 | 64,000 | 64,000 | 4,096 | ❌ |
17 | | gpt-4o-2024-08-06 | GPT-4o | Azure OpenAI | gpt-4o-2024-08-06 | 128,000 | 64,000 | 64,000 | 16,384 | ❌ |
18 | | o1 | o1 (Preview) | Azure OpenAI | o1-2024-12-17 | 200,000 | 16,384 | 20,000 | - | ✅ |
19 | | o1-2024-12-17 | o1 (Preview) | Azure OpenAI | o1-2024-12-17 | 200,000 | 16,384 | 20,000 | - | ✅ |
20 | | o3-mini | o3-mini | Azure OpenAI | o3-mini-2025-01-31 | 200,000 | 64,000 | 64,000 | 100,000 | ❌ |
21 | | o3-mini-2025-01-31 | o3-mini | Azure OpenAI | o3-mini-2025-01-31 | 200,000 | 64,000 | 64,000 | 100,000 | ❌ |
22 | | o3-mini-paygo | o3-mini | Azure OpenAI | o3-mini-paygo | 200,000 | 64,000 | 64,000 | 100,000 | ❌ |
23 | | text-embedding-ada-002 | Embedding V2 Ada | Azure OpenAI | text-embedding-3-small | - | - | - | - | ❌ |
24 | | text-embedding-3-small | Embedding V3 small | Azure OpenAI | text-embedding-3-small | - | - | - | - | ❌ |
25 | | text-embedding-3-small-inference | Embedding V3 small (Inference) | Azure OpenAI | text-embedding-3-small | - | - | - | - | ❌ |
26 | | claude-3.5-sonnet | Claude 3.5 Sonnet | Anthropic | claude-3.5-sonnet | 90,000 | 90,000 | 90,000 | 8,192 | ❌ |
27 | | claude-3.7-sonnet | Claude 3.7 Sonnet | Anthropic | claude-3.7-sonnet | 200,000 | 128,000 | 90,000 | 16,384 | ❌ |
28 | | claude-3.7-sonnet-thought | Claude 3.7 Sonnet Thinking | Anthropic | claude-3.7-sonnet-thought | 200,000 | 90,000 | 90,000 | 16,384 | ❌ |
29 | | gemini-2.0-flash-001 | Gemini 2.0 Flash | Google | gemini-2.0-flash-001 | 1,000,000 | 128,000 | 128,000 | 8,192 | ❌ |
30 | | gemini-2.5-pro | Gemini 2.5 Pro (Preview) | Google | gemini-2.5-pro-preview-03-25 | 128,000 | 128,000 | 128,000 | 64,000 | ✅ |
31 | | gemini-2.5-pro-preview-03-25 | Gemini 2.5 Pro (Preview) | Google | gemini-2.5-pro-preview-03-25 | 128,000 | 128,000 | 128,000 | 64,000 | ✅ |
32 | | o4-mini | o4-mini (Preview) | Azure OpenAI | o4-mini-2025-04-16 | 128,000 | 128,000 | 128,000 | 16,384 | ✅ |
33 | | o4-mini-2025-04-16 | o4-mini (Preview) | OpenAI | o4-mini-2025-04-16 | 128,000 | 128,000 | 128,000 | 16,384 | ✅ |
34 | | gpt-4.1 | GPT-4.1 (Preview) | Azure OpenAI | gpt-4.1-2025-04-14 | 128,000 | 128,000 | 128,000 | 16,384 | ✅ |
35 | | gpt-4.1-2025-04-14 | GPT-4.1 (Preview) | OpenAI | gpt-4.1-2025-04-14 | 128,000 | 128,000 | 128,000 | 16,384 | ✅ |
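When consuming this table programmatically, the conservative prompt budget is the smaller of the official and measured input limits. A minimal Go sketch of that rule; the `ModelLimits` type and the two hard-coded rows (copied from the gpt-4o and o1 rows above) are illustrative assumptions, not an official schema:

```go
package main

import "fmt"

// ModelLimits mirrors one row of the table above.
type ModelLimits struct {
	ContextWindow  int
	OfficialPrompt int
	MeasuredInput  int
	MaxOutput      int
}

var limits = map[string]ModelLimits{
	"gpt-4o": {ContextWindow: 128000, OfficialPrompt: 64000, MeasuredInput: 64000, MaxOutput: 16384},
	"o1":     {ContextWindow: 200000, OfficialPrompt: 16384, MeasuredInput: 20000, MaxOutput: 0},
}

// safeInputBudget returns the smaller of the official and measured input
// limits: the conservative budget to use when packing a prompt.
func safeInputBudget(model string) (int, bool) {
	l, ok := limits[model]
	if !ok {
		return 0, false
	}
	if l.MeasuredInput < l.OfficialPrompt {
		return l.MeasuredInput, true
	}
	return l.OfficialPrompt, true
}

func main() {
	b, _ := safeInputBudget("o1")
	fmt.Println(b) // 16384: o1's official prompt limit is below its measured input limit
}
```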
36 |
37 | ## Model Capability Support
38 |
39 | | Model ID | vision | tool_calls | parallel_tool_calls | streaming | structured_outputs |
40 | |--------|--------|-----------|---------------------|-----------|-------------------|
41 | | gpt-3.5-turbo | ❌ | ✅ | ❌ | ✅ | ❌ |
42 | | gpt-3.5-turbo-0613 | ❌ | ✅ | ❌ | ✅ | ❌ |
43 | | gpt-4o-mini | ❌ | ✅ | ✅ | ✅ | ❌ |
44 | | gpt-4o-mini-2024-07-18 | ❌ | ✅ | ✅ | ✅ | ❌ |
45 | | gpt-4 | ❌ | ✅ | ❌ | ✅ | ❌ |
46 | | gpt-4-0613 | ❌ | ✅ | ❌ | ✅ | ❌ |
47 | | gpt-4o | ✅ | ✅ | ✅ | ✅ | ❌ |
48 | | gpt-4o-2024-11-20 | ✅ | ✅ | ✅ | ✅ | ❌ |
49 | | gpt-4o-2024-05-13 | ✅ | ✅ | ✅ | ✅ | ❌ |
50 | | gpt-4-o-preview | ❌ | ✅ | ✅ | ✅ | ❌ |
51 | | gpt-4o-2024-08-06 | ❌ | ✅ | ✅ | ✅ | ❌ |
52 | | o1 | ❌ | ✅ | ❌ | ❌ | ✅ |
53 | | o1-2024-12-17 | ❌ | ✅ | ❌ | ❌ | ✅ |
54 | | o3-mini | ❌ | ✅ | ❌ | ✅ | ✅ |
55 | | o3-mini-2025-01-31 | ❌ | ✅ | ❌ | ✅ | ✅ |
56 | | o3-mini-paygo | ❌ | ✅ | ❌ | ✅ | ✅ |
57 | | claude-3.5-sonnet | ✅ | ✅ | ✅ | ✅ | ❌ |
58 | | claude-3.7-sonnet | ✅ | ✅ | ✅ | ✅ | ❌ |
59 | | claude-3.7-sonnet-thought | ✅ | ❌ | ❌ | ✅ | ❌ |
60 | | gemini-2.0-flash-001 | ✅ | ✅ | ✅ | ✅ | ❌ |
61 | | gemini-2.5-pro | ✅ | ✅ | ✅ | ✅ | ❌ |
62 | | gemini-2.5-pro-preview-03-25 | ✅ | ✅ | ✅ | ✅ | ❌ |
63 | | o4-mini | ❌ | ✅ | ✅ | ✅ | ✅ |
64 | | o4-mini-2025-04-16 | ❌ | ✅ | ✅ | ✅ | ✅ |
65 | | gpt-4.1 | ✅ | ✅ | ✅ | ✅ | ✅ |
66 | | gpt-4.1-2025-04-14 | ✅ | ✅ | ✅ | ✅ | ✅ |
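Because capabilities vary per model (o1, per the table, does not stream), a client should check the flags before setting `stream: true`. A hedged Go sketch; the `Caps` type and the two hard-coded rows are an illustrative subset of the matrix above:

```go
package main

import "fmt"

// Caps mirrors one row of the capability table.
type Caps struct {
	Vision, ToolCalls, ParallelToolCalls, Streaming, StructuredOutputs bool
}

var caps = map[string]Caps{
	"gpt-4o": {Vision: true, ToolCalls: true, ParallelToolCalls: true, Streaming: true},
	"o1":     {ToolCalls: true, StructuredOutputs: true},
}

// supportsStreaming reports whether a model can stream; callers should
// fall back to a non-streaming request otherwise.
func supportsStreaming(model string) bool {
	c, ok := caps[model]
	return ok && c.Streaming
}

func main() {
	fmt.Println(supportsStreaming("gpt-4o"), supportsStreaming("o1"))
}
```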
67 |
68 | ## Embedding Models
69 |
70 | | Model ID | Model Name | Vendor | Version | Max Input | Custom Dimensions | Tokenizer |
71 | |--------|---------|------|------|---------|--------------|----------|
72 | | text-embedding-ada-002 | Embedding V2 Ada | Azure OpenAI | text-embedding-3-small | 512 | ❌ | cl100k_base |
73 | | text-embedding-3-small | Embedding V3 small | Azure OpenAI | text-embedding-3-small | 512 | ✅ | cl100k_base |
74 | | text-embedding-3-small-inference | Embedding V3 small (Inference) | Azure OpenAI | text-embedding-3-small | - | ✅ | cl100k_base |
75 |
76 | ## Model Tokenizer Information
77 |
78 | | Model Family | Tokenizer |
79 | |---------|----------------|
80 | | GPT-3.5 family | cl100k_base |
81 | | GPT-4 family | cl100k_base |
82 | | GPT-4o family | o200k_base |
83 | | o1 family | o200k_base |
84 | | o3 family | o200k_base |
85 | | o4 family | o200k_base |
86 | | Claude family | o200k_base |
87 | | Gemini family | o200k_base |
88 | | Embedding models | cl100k_base |
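The family-to-tokenizer mapping above collapses to a small lookup. A Go sketch; the exact-match set mirrors the cl100k_base model IDs listed earlier and is illustrative, not exhaustive:

```go
package main

import "fmt"

// Model IDs that use cl100k_base, per the tables above; every other family
// listed (GPT-4o, o1/o3/o4, Claude, Gemini) maps to o200k_base.
var cl100k = map[string]bool{
	"gpt-3.5-turbo": true, "gpt-3.5-turbo-0613": true,
	"gpt-4": true, "gpt-4-0613": true,
	"text-embedding-ada-002": true, "text-embedding-3-small": true,
	"text-embedding-3-small-inference": true,
}

// tokenizerFor returns the tokenizer name for a known model ID.
func tokenizerFor(model string) string {
	if cl100k[model] {
		return "cl100k_base"
	}
	return "o200k_base"
}

func main() {
	fmt.Println(tokenizerFor("gpt-4"), tokenizerFor("claude-3.7-sonnet"))
}
```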
89 |
--------------------------------------------------------------------------------
/hfs/api-pool/Dockerfile:
--------------------------------------------------------------------------------
1 | # --- Stage 1: Builder Stage ---
2 | # Compile with the official Go image
3 | FROM golang:1.22-alpine AS builder
4 |
5 | # Set the working directory
6 | WORKDIR /build
7 |
8 | # Copy the Go source file
9 | COPY api-pool.go .
10 |
11 | # Build the Go application
12 | # CGO_ENABLED=0 attempts a static link to reduce runtime dependencies
13 | # -ldflags="-w -s" shrinks the binary
14 | # -o /app/api-pool sets the output path and name
15 | RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o /app/api-pool api-pool.go
16 |
17 | # --- Stage 2: Final Stage ---
18 | # Run on a lightweight Alpine image
19 | FROM alpine:latest
20 |
21 | # Set the working directory
22 | WORKDIR /app
23 |
24 | # Copy the compiled binary from the builder stage
25 | COPY --from=builder /app/api-pool /app/api-pool
26 |
27 | # Copy the startup script
28 | COPY entrypoint.sh /app/entrypoint.sh
29 |
30 | # Make both executable
31 | RUN chmod +x /app/api-pool /app/entrypoint.sh
32 |
33 | # Expose the port the application listens on (6969, per the startup arguments)
34 | EXPOSE 6969
35 |
36 | # Use the startup script as the container entrypoint
37 | ENTRYPOINT ["/app/entrypoint.sh"]
38 |
39 | # Note: the CMD is now exec'd by entrypoint.sh
--------------------------------------------------------------------------------
/hfs/api-pool/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: API Key Pool # change as desired
3 | emoji: 🔑 # change as desired
4 | colorFrom: green # change as desired
5 | colorTo: blue # change as desired
6 | sdk: docker
7 | app_port: 6969 # must match the --port argument and the EXPOSE'd port
8 | pinned: false
9 | ---
10 |
11 | (Add your Space description here)
--------------------------------------------------------------------------------
/hfs/api-pool/entrypoint.sh:
--------------------------------------------------------------------------------
1 | #!/bin/sh
2 | # Exit immediately if a command fails
3 | set -e
4 |
5 | # Path of the temporary key file
6 | KEY_FILE_PATH="/tmp/keys.txt"
7 |
8 | echo "--- Checking Secrets ---"
9 |
10 | # Ensure API_PASSWORD is set
11 | if [ -z "${API_PASSWORD}" ]; then
12 | echo "[ERROR] API_PASSWORD must be set in the Hugging Face Secrets!"
13 | exit 1
14 | fi
15 |
16 | # Ensure key_list is set
17 | if [ -z "${key_list}" ]; then
18 | echo "[ERROR] key_list must be set in the Hugging Face Secrets!"
19 | exit 1
20 | fi
21 |
22 | echo "--- Creating temporary key file (${KEY_FILE_PATH}) from Secret 'key_list' ---"
23 |
24 | # Write the contents of the key_list env var to the temporary file.
25 | # printf '%b' is used instead of 'echo -e': the -e flag is not portable
26 | # under /bin/sh, while printf reliably expands any '\n' escapes.
27 | printf '%b\n' "${key_list}" > "${KEY_FILE_PATH}"
28 |
29 | # Verify the file was created and is non-empty
30 | if [ ! -s "${KEY_FILE_PATH}" ]; then
31 | echo "[ERROR] Failed to create the key file, or it is empty! Check the 'key_list' Secret."
32 | exit 1
33 | fi
34 |
35 | echo "--- Key file generated ---"
36 |
37 | # !!! IMPORTANT: never uncomment the lines below in production; they would leak keys into the logs !!!
38 | # echo "Key file preview (first lines):"
39 | # head -n 3 "${KEY_FILE_PATH}"
40 |
41 | echo "--- Starting the api-pool service ---"
42 |
43 | # exec replaces this shell process with the Go process
44 | # --key-file receives the temporary file path
45 | # --password receives the password read from the Secret
46 | # --address 0.0.0.0 so the service is reachable from outside the container
47 | # plus the remaining tuning flags
48 | exec /app/api-pool \
49 | --key-file "${KEY_FILE_PATH}" \
50 | --target-url "https://api.siliconflow.cn" \
51 | --port "6969" \
52 | --address "0.0.0.0" \
53 | --password "${API_PASSWORD}" \
54 | --max-workers=1000 \
55 | --max-queue=2000
56 | # Note: --max-workers and --max-queue are high; watch the Space's resource usage
--------------------------------------------------------------------------------
/hfs/hunyuan2api/Dockerfile:
--------------------------------------------------------------------------------
1 | # --- Stage 1: Builder Stage ---
2 | # Use the official Go image as the build environment; the Alpine variant is small
3 | FROM golang:1.22-alpine AS builder
4 | # or: FROM golang:1.22  # if alpine's musl libc is incompatible with your code
5 |
6 | # Working directory for the build stage
7 | WORKDIR /build
8 |
9 | # Copy the Go source file (hunyuan2api.go) into /build/
10 | COPY hunyuan2api.go .
11 |
12 | # Build the Go application
13 | # CGO_ENABLED=0 attempts a static link, avoiding C library issues, especially on alpine
14 | # -ldflags="-w -s" shrinks the resulting binary
15 | # -o /app/hunyuan2api sets the output path and name
16 | # hunyuan2api.go is the source file
17 | RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o /app/hunyuan2api hunyuan2api.go
18 |
19 | # --- Stage 2: Final Stage ---
20 | # Run the compiled program on a minimal base image
21 | FROM alpine:latest
22 | # Note: if static compilation (CGO_ENABLED=0) fails or runtime problems persist,
23 | # switch to a glibc-based image such as 'debian:stable-slim'
24 | # FROM debian:stable-slim
25 |
26 | # Working directory for the final stage
27 | WORKDIR /app
28 |
29 | # Copy the compiled binary from the builder stage into /app/
30 | COPY --from=builder /app/hunyuan2api /app/hunyuan2api
31 |
32 | # Ensure the copied binary is executable
33 | RUN chmod +x /app/hunyuan2api
34 |
35 | # Expose the port the application listens on (6677, per the startup arguments)
36 | EXPOSE 6677
37 |
38 | # Command run when the container starts
39 | # These arguments must match the ones you supplied
40 | CMD ["/app/hunyuan2api", "--address", "0.0.0.0", "--port", "6677", "--verify-ssl=false", "--dev", "--workers", "400", "--queue-size", "1000", "--max-concurrent", "400"]
--------------------------------------------------------------------------------
/hfs/hunyuan2api/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Hunyuan2api # title
3 | emoji: 🌍 # emoji
4 | colorFrom: indigo # gradient start color
5 | colorTo: red # gradient end color
6 | sdk: docker # use the Docker SDK
7 | app_port: 6677 # (new) the port the app listens on inside the container
8 | pinned: false # pin to profile page
9 | license: gpl-3.0 # open-source license
10 | ---
--------------------------------------------------------------------------------
/hfs/qwen2api/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM node:16-slim
2 |
3 | # Install git
4 | RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
5 |
6 | # Set the working directory
7 | WORKDIR /app
8 |
9 | # Clone the code repository
10 | RUN git clone https://github.com/Rfym21/Qwen2API .
11 |
12 | # Pre-create the data directory and set permissions
13 | RUN mkdir -p /app/data && \
14 | chmod 777 /app/data && \
15 | chmod 777 /app
16 |
17 | # Install dependencies
18 | RUN npm install
19 |
20 | # Expose the port
21 | EXPOSE 3000
22 |
23 | # Create a startup script that does not write a .env file
24 | RUN echo '#!/bin/bash\n\
25 | \n\
26 | # logging helper\n\
27 | log() {\n\
28 | echo "[$(date "+%Y-%m-%d %H:%M:%S")] $1"\n\
29 | }\n\
30 | \n\
31 | # show configuration (secret values are never printed)\n\
32 | log "Configuration:"\n\
33 | log "API_PREFIX: ${API_PREFIX:-(not set)}"\n\
34 | log "SERVICE_PORT: ${SERVICE_PORT:-3000}"\n\
35 | log "API_KEY: $([ -n "${API_KEY}" ] && echo "set" || echo "not set")"\n\
36 | log "ACCOUNT_TOKENS: $([ -n "${ACCOUNT_TOKENS}" ] && echo "set" || echo "not set")"\n\
37 | log "SEARCH_INFO_MODE: ${SEARCH_INFO_MODE:-table}"\n\
38 | \n\
39 | # start the service directly from environment variables\n\
40 | log "Starting the Qwen2API service..."\n\
41 | node src/server.js\n\
42 | ' > /app/start.sh && chmod +x /app/start.sh
43 |
44 | # Set the startup command
45 | CMD ["/app/start.sh"]
--------------------------------------------------------------------------------
/hfs/qwen2api/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Qwen2API
3 | emoji: 📚
4 | colorFrom: blue
5 | colorTo: green
6 | sdk: docker
7 | pinned: false
8 | app_port: 3000
9 | ---
--------------------------------------------------------------------------------
/hunyuan2api.go:
--------------------------------------------------------------------------------
1 | package main
2 |
3 | import (
4 | "bufio"
5 | "bytes"
6 | "context"
7 | "crypto/tls"
8 | "encoding/json"
9 | "flag"
10 | "fmt"
11 | "io"
12 | "log"
13 | "net/http"
14 | "os"
15 | "os/signal"
16 | "strings"
17 | "sync"
18 | "sync/atomic"
19 | "syscall"
20 | "time"
21 | )
22 |
23 | // WorkerPool manages a pool of worker goroutines.
24 | type WorkerPool struct {
25 | taskQueue chan *Task
26 | workerCount int
27 | shutdownChannel chan struct{}
28 | wg sync.WaitGroup
29 | }
30 |
31 | // Task carries everything needed to handle one request.
32 | type Task struct {
33 | r *http.Request
34 | w http.ResponseWriter
35 | done chan struct{}
36 | reqID string
37 | isStream bool
38 | hunyuanReq HunyuanRequest
39 | }
40 |
41 | // NewWorkerPool creates and starts a new worker pool.
42 | func NewWorkerPool(workerCount int, queueSize int) *WorkerPool {
43 | pool := &WorkerPool{
44 | taskQueue: make(chan *Task, queueSize),
45 | workerCount: workerCount,
46 | shutdownChannel: make(chan struct{}),
47 | }
48 |
49 | pool.Start()
50 | return pool
51 | }
52 |
53 | // Start launches the pool's worker goroutines.
54 | func (pool *WorkerPool) Start() {
55 | // start the workers
56 | for i := 0; i < pool.workerCount; i++ {
57 | pool.wg.Add(1)
58 | go func(workerID int) {
59 | defer pool.wg.Done()
60 |
61 | logInfo("Worker %d started", workerID)
62 |
63 | for {
64 | select {
65 | case task, ok := <-pool.taskQueue:
66 | if !ok {
67 | // queue closed: exit this worker
68 | logInfo("Worker %d: task queue closed, exiting", workerID)
69 | return
70 | }
71 |
72 | logDebug("Worker %d handling task reqID:%s", workerID, task.reqID)
73 |
74 | // process the task
75 | if task.isStream {
76 | err := handleStreamingRequest(task.w, task.r, task.hunyuanReq, task.reqID)
77 | if err != nil {
78 | logError("Worker %d failed to handle streaming task: %v", workerID, err)
79 | }
80 | } else {
81 | err := handleNonStreamingRequest(task.w, task.r, task.hunyuanReq, task.reqID)
82 | if err != nil {
83 | logError("Worker %d failed to handle non-streaming task: %v", workerID, err)
84 | }
85 | }
86 |
87 | // signal that the task is done
88 | close(task.done)
89 |
90 | case <-pool.shutdownChannel:
91 | // shutdown signal received: exit this worker
92 | logInfo("Worker %d received shutdown signal, exiting", workerID)
93 | return
94 | }
95 | }
96 | }(i)
97 | }
98 | }
99 |
100 | // SubmitTask submits a task to the pool without blocking.
101 | func (pool *WorkerPool) SubmitTask(task *Task) (bool, error) {
102 | select {
103 | case pool.taskQueue <- task:
104 | // task queued successfully
105 | return true, nil
106 | default:
107 | // queue is full
108 | return false, fmt.Errorf("task queue is full")
109 | }
110 | }
111 |
112 | // Shutdown stops the worker pool.
113 | func (pool *WorkerPool) Shutdown() {
114 | logInfo("Shutting down worker pool...")
115 |
116 | // signal every worker to stop
117 | close(pool.shutdownChannel)
118 |
119 | // wait for all workers to exit
120 | pool.wg.Wait()
121 |
122 | // close the task queue
123 | close(pool.taskQueue)
124 |
125 | logInfo("Worker pool shut down")
126 | }
127 |
128 | // Semaphore limits concurrency via a buffered channel.
129 | type Semaphore struct {
130 | sem chan struct{}
131 | }
132 |
133 | // NewSemaphore creates a semaphore of the given size.
134 | func NewSemaphore(size int) *Semaphore {
135 | return &Semaphore{
136 | sem: make(chan struct{}, size),
137 | }
138 | }
139 |
140 | // Acquire takes a slot (blocking).
141 | func (s *Semaphore) Acquire() {
142 | s.sem <- struct{}{}
143 | }
144 |
145 | // Release frees a slot.
146 | func (s *Semaphore) Release() {
147 | <-s.sem
148 | }
149 |
150 | // TryAcquire takes a slot without blocking.
151 | func (s *Semaphore) TryAcquire() bool {
152 | select {
153 | case s.sem <- struct{}{}:
154 | return true
155 | default:
156 | return false
157 | }
158 | }
159 |
160 | // Config stores the command-line options.
161 | type Config struct {
162 | Port string // proxy listen port
163 | Address string // proxy listen address
164 | LogLevel string // log level
165 | DevMode bool // development-mode flag
166 | MaxRetries int // maximum retries for failed requests
167 | Timeout int // request timeout in seconds
168 | VerifySSL bool // whether to verify SSL certificates
169 | ModelName string // default model name
170 | BearerToken string // bearer token (a public token is provided by default)
171 | WorkerCount int // number of workers in the pool
172 | QueueSize int // task queue size
173 | MaxConcurrent int // maximum concurrent requests
174 | }
175 |
176 | // List of supported models
177 | var SupportedModels = []string{
178 | "hunyuan-t1-latest",
179 | "hunyuan-turbos-latest",
180 | }
181 |
182 | // Tencent Hunyuan API target URL
183 | const (
184 | TargetURL = "https://llm.hunyuan.tencent.com/aide/api/v2/triton_image/demo_text_chat/"
185 | Version = "1.0.0" // version number
186 | )
187 |
188 | // Log levels
189 | const (
190 | LogLevelDebug = "debug"
191 | LogLevelInfo = "info"
192 | LogLevelWarn = "warn"
193 | LogLevelError = "error"
194 | )
195 |
196 | // parseFlags parses the command-line flags and returns a Config.
197 | func parseFlags() *Config {
198 | cfg := &Config{}
199 | flag.StringVar(&cfg.Port, "port", "6666", "Port to listen on")
200 | flag.StringVar(&cfg.Address, "address", "localhost", "Address to listen on")
201 | flag.StringVar(&cfg.LogLevel, "log-level", LogLevelInfo, "Log level (debug, info, warn, error)")
202 | flag.BoolVar(&cfg.DevMode, "dev", false, "Enable development mode with enhanced logging")
203 | flag.IntVar(&cfg.MaxRetries, "max-retries", 3, "Maximum number of retries for failed requests")
204 | flag.IntVar(&cfg.Timeout, "timeout", 300, "Request timeout in seconds")
205 | flag.BoolVar(&cfg.VerifySSL, "verify-ssl", true, "Verify SSL certificates")
206 | flag.StringVar(&cfg.ModelName, "model", "hunyuan-t1-latest", "Default Hunyuan model name")
207 | flag.StringVar(&cfg.BearerToken, "token", "7auGXNATFSKl7dF", "Bearer token for Hunyuan API")
208 | flag.IntVar(&cfg.WorkerCount, "workers", 50, "Number of worker goroutines in the pool")
209 | flag.IntVar(&cfg.QueueSize, "queue-size", 500, "Size of the task queue")
210 | flag.IntVar(&cfg.MaxConcurrent, "max-concurrent", 100, "Maximum number of concurrent requests")
211 | flag.Parse()
212 |
213 | // in development mode, force the log level to debug
214 | if cfg.DevMode && cfg.LogLevel != LogLevelDebug {
215 | cfg.LogLevel = LogLevelDebug
216 | fmt.Println("Development mode enabled; log level set to debug")
217 | }
218 |
219 | return cfg
220 | }
221 |
222 | // Global configuration
223 | var (
224 | appConfig *Config
225 | )
226 |
227 | // Performance metrics
228 | var (
229 | requestCounter int64
230 | successCounter int64
231 | errorCounter int64
232 | avgResponseTime int64
233 | latencyHistogram [10]int64 // 0-100ms, 100-200ms, ... >1s
234 | queuedRequests int64 // requests currently queued
235 | rejectedRequests int64 // requests rejected
236 | )
237 |
238 | // Concurrency-control components
239 | var (
240 | workerPool *WorkerPool // worker pool
241 | requestSem *Semaphore // request semaphore
242 | )
243 |
244 | // Logger
245 | var (
246 | logger *log.Logger
247 | logLevel string
248 | logMutex sync.Mutex
249 | )
250 |
251 | // initLogger initializes the logger.
252 | func initLogger(level string) {
253 | logger = log.New(os.Stdout, "[HunyuanAPI] ", log.LstdFlags)
254 | logLevel = level
255 | }
256 |
257 | // Leveled logging helpers
258 | func logDebug(format string, v ...interface{}) {
259 | if logLevel == LogLevelDebug {
260 | logMutex.Lock()
261 | logger.Printf("[DEBUG] "+format, v...)
262 | logMutex.Unlock()
263 | }
264 | }
265 |
266 | func logInfo(format string, v ...interface{}) {
267 | if logLevel == LogLevelDebug || logLevel == LogLevelInfo {
268 | logMutex.Lock()
269 | logger.Printf("[INFO] "+format, v...)
270 | logMutex.Unlock()
271 | }
272 | }
273 |
274 | func logWarn(format string, v ...interface{}) {
275 | if logLevel == LogLevelDebug || logLevel == LogLevelInfo || logLevel == LogLevelWarn {
276 | logMutex.Lock()
277 | logger.Printf("[WARN] "+format, v...)
278 | logMutex.Unlock()
279 | }
280 | }
281 |
282 | func logError(format string, v ...interface{}) {
283 | logMutex.Lock()
284 | logger.Printf("[ERROR] "+format, v...)
285 | logMutex.Unlock()
286 |
287 | // count the error
288 | atomic.AddInt64(&errorCounter, 1)
289 | }
290 |
291 | // OpenAI/DeepSeek message format
292 | type APIMessage struct {
293 | Role string `json:"role"`
294 | Content interface{} `json:"content"` // interface{} so any content type is accepted
295 | }
296 |
297 | // OpenAI/DeepSeek request format
298 | type APIRequest struct {
299 | Model string `json:"model"`
300 | Messages []APIMessage `json:"messages"`
301 | Stream bool `json:"stream"`
302 | Temperature float64 `json:"temperature,omitempty"`
303 | MaxTokens int `json:"max_tokens,omitempty"`
304 | }
305 |
306 | // Tencent Hunyuan request format
307 | type HunyuanRequest struct {
308 | Stream bool `json:"stream"`
309 | Model string `json:"model"`
310 | QueryID string `json:"query_id"`
311 | Messages []APIMessage `json:"messages"`
312 | StreamModeration bool `json:"stream_moderation"`
313 | EnableEnhancement bool `json:"enable_enhancement"`
314 | }
315 |
316 | // Tencent Hunyuan response format
317 | type HunyuanResponse struct {
318 | ID string `json:"id"`
319 | Object string `json:"object"`
320 | Created int64 `json:"created"`
321 | Model string `json:"model"`
322 | SystemFingerprint string `json:"system_fingerprint"`
323 | Choices []Choice `json:"choices"`
324 | Note string `json:"note,omitempty"`
325 | }
326 |
327 | // Choice element of a response
328 | type Choice struct {
329 | Index int `json:"index"`
330 | Delta Delta `json:"delta"`
331 | FinishReason *string `json:"finish_reason"`
332 | }
333 |
334 | // Delta holds content and reasoning content
335 | type Delta struct {
336 | Role string `json:"role,omitempty"`
337 | Content string `json:"content,omitempty"`
338 | ReasoningContent string `json:"reasoning_content,omitempty"`
339 | }
340 |
341 | // DeepSeek streaming response format
342 | type StreamChunk struct {
343 | ID string `json:"id"`
344 | Object string `json:"object"`
345 | Created int64 `json:"created"`
346 | Model string `json:"model"`
347 | Choices []struct {
348 | Index int `json:"index"`
349 | FinishReason *string `json:"finish_reason,omitempty"`
350 | Delta struct {
351 | Role string `json:"role,omitempty"`
352 | Content string `json:"content,omitempty"`
353 | ReasoningContent string `json:"reasoning_content,omitempty"`
354 | } `json:"delta"`
355 | } `json:"choices"`
356 | }
357 |
358 | // Non-streaming response format
359 | type CompletionResponse struct {
360 | ID string `json:"id"`
361 | Object string `json:"object"`
362 | Created int64 `json:"created"`
363 | Model string `json:"model"`
364 | Choices []struct {
365 | Index int `json:"index"`
366 | FinishReason string `json:"finish_reason"`
367 | Message struct {
368 | Role string `json:"role"`
369 | Content string `json:"content"`
370 | ReasoningContent string `json:"reasoning_content,omitempty"`
371 | } `json:"message"`
372 | } `json:"choices"`
373 | Usage struct {
374 | PromptTokens int `json:"prompt_tokens"`
375 | CompletionTokens int `json:"completion_tokens"`
376 | TotalTokens int `json:"total_tokens"`
377 | } `json:"usage"`
378 | }
379 |
380 | // Request counter and mutex, used for monitoring
381 | var (
382 | requestCount uint64 = 0
383 | countMutex sync.Mutex
384 | )
385 |
386 | // Main entry point
387 | func main() {
388 | // parse configuration
389 | appConfig = parseFlags()
390 |
391 | // initialize logging
392 | initLogger(appConfig.LogLevel)
393 |
394 | logInfo("Service starting: TargetURL=%s, Address=%s, Port=%s, Version=%s, LogLevel=%s, SupportedModels=%v, BearerToken=***, WorkerCount=%d, QueueSize=%d, MaxConcurrent=%d",
395 | TargetURL, appConfig.Address, appConfig.Port, Version, appConfig.LogLevel, SupportedModels,
396 | appConfig.WorkerCount, appConfig.QueueSize, appConfig.MaxConcurrent)
397 |
398 | // create the worker pool and semaphore
399 | workerPool = NewWorkerPool(appConfig.WorkerCount, appConfig.QueueSize)
400 | requestSem = NewSemaphore(appConfig.MaxConcurrent)
401 |
402 | logInfo("Worker pool created: %d workers, queue size %d", appConfig.WorkerCount, appConfig.QueueSize)
403 |
404 | // Tune the default transport for higher concurrency
405 | http.DefaultTransport.(*http.Transport).MaxIdleConnsPerHost = 100
406 | http.DefaultTransport.(*http.Transport).MaxIdleConns = 100
407 | http.DefaultTransport.(*http.Transport).IdleConnTimeout = 90 * time.Second
408 |
409 | // Custom server supporting higher concurrency
410 | server := &http.Server{
411 | Addr: appConfig.Address + ":" + appConfig.Port,
412 | ReadTimeout: time.Duration(appConfig.Timeout) * time.Second,
413 | WriteTimeout: time.Duration(appConfig.Timeout) * time.Second,
414 | IdleTimeout: 120 * time.Second,
415 | Handler: nil, // use the default ServeMux
416 | }
417 |
418 | // Register handlers
419 | http.HandleFunc("/v1/models", func(w http.ResponseWriter, r *http.Request) {
420 | setCORSHeaders(w)
421 | if r.Method == "OPTIONS" {
422 | w.WriteHeader(http.StatusOK)
423 | return
424 | }
425 | handleModelsRequest(w, r)
426 | })
427 |
428 | http.HandleFunc("/v1/chat/completions", func(w http.ResponseWriter, r *http.Request) {
429 | setCORSHeaders(w)
430 | if r.Method == "OPTIONS" {
431 | w.WriteHeader(http.StatusOK)
432 | return
433 | }
434 |
435 | // bump the request counter
436 | countMutex.Lock()
437 | requestCount++
438 | currentCount := requestCount
439 | countMutex.Unlock()
440 |
441 | logInfo("Received new request #%d", currentCount)
442 |
443 | // count the request
444 | atomic.AddInt64(&requestCounter, 1)
445 |
446 | // try to acquire the semaphore
447 | if !requestSem.TryAcquire() {
448 | // concurrency limit reached
449 | atomic.AddInt64(&rejectedRequests, 1)
450 | logWarn("Request #%d rejected: concurrent request limit reached", currentCount)
451 | w.Header().Set("Retry-After", "30")
452 | http.Error(w, "Server is busy, please try again later", http.StatusServiceUnavailable)
453 | return
454 | }
455 |
456 | // release the semaphore when the handler returns
457 | defer requestSem.Release()
458 |
459 | // handle the request
460 | handleChatCompletionRequestWithPool(w, r, currentCount)
461 | })
462 |
463 | // Health-check endpoint
464 | http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
465 | setCORSHeaders(w)
466 | if r.Method == "OPTIONS" {
467 | w.WriteHeader(http.StatusOK)
468 | return
469 | }
470 |
471 | // read the counters
472 | reqCount := atomic.LoadInt64(&requestCounter)
473 | succCount := atomic.LoadInt64(&successCounter)
474 | errCount := atomic.LoadInt64(&errorCounter)
475 | queuedCount := atomic.LoadInt64(&queuedRequests)
476 | rejectedCount := atomic.LoadInt64(&rejectedRequests)
477 |
478 | // compute the average response time
479 | var avgTime int64 = 0
480 | if reqCount > 0 {
481 | avgTime = atomic.LoadInt64(&avgResponseTime) / max(reqCount, 1)
482 | }
483 |
484 | // build the latency histogram
485 | histogram := make([]int64, 10)
486 | for i := 0; i < 10; i++ {
487 | histogram[i] = atomic.LoadInt64(&latencyHistogram[i])
488 | }
489 |
490 | // build the response
491 | stats := map[string]interface{}{
492 | "status": "ok",
493 | "version": Version,
494 | "requests": reqCount,
495 | "success": succCount,
496 | "errors": errCount,
497 | "queued": queuedCount,
498 | "rejected": rejectedCount,
499 | "avg_time_ms": avgTime,
500 | "histogram_ms": histogram,
501 | "worker_count": workerPool.workerCount,
502 | "queue_size": len(workerPool.taskQueue),
503 | "queue_capacity": cap(workerPool.taskQueue),
504 | "queue_percent": float64(len(workerPool.taskQueue)) / float64(cap(workerPool.taskQueue)) * 100,
505 | "concurrent_limit": appConfig.MaxConcurrent,
506 | }
507 |
508 | w.Header().Set("Content-Type", "application/json")
509 | w.WriteHeader(http.StatusOK)
510 | json.NewEncoder(w).Encode(stats)
511 | })
512 |
513 | // channel for stop signals
514 | stop := make(chan os.Signal, 1)
515 | signal.Notify(stop, os.Interrupt, syscall.SIGTERM)
516 |
517 | // start the server in a goroutine
518 | go func() {
519 | logInfo("Starting proxy server on %s", server.Addr)
520 | if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
521 | logError("Failed to start server: %v", err)
522 | os.Exit(1)
523 | }
524 | }()
525 |
526 | // wait for a stop signal
527 | <-stop
528 |
529 | // context for graceful shutdown
530 | ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
531 | defer cancel()
532 |
533 | // shut the server down gracefully
534 | logInfo("Server is shutting down...")
535 | if err := server.Shutdown(ctx); err != nil {
536 | logError("Server shutdown failed: %v", err)
537 | }
538 |
539 | // shut down the worker pool
540 | workerPool.Shutdown()
541 |
542 | logInfo("Server gracefully stopped")
543 | }
544 |
545 | // setCORSHeaders sets the CORS headers.
546 | func setCORSHeaders(w http.ResponseWriter) {
547 | w.Header().Set("Access-Control-Allow-Origin", "*")
548 | w.Header().Set("Access-Control-Allow-Methods", "POST, GET, OPTIONS")
549 | w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")
550 | }
551 |
552 | // validateMessages checks the message format.
553 | func validateMessages(messages []APIMessage) (bool, string) {
554 | reqID := generateRequestID()
555 | logDebug("[reqID:%s] validating message format", reqID)
556 |
557 | if len(messages) == 0 {
558 | return false, "Messages array is required"
559 | }
560 |
561 | for _, msg := range messages {
562 | if msg.Role == "" || msg.Content == nil {
563 | return false, "Invalid message format: each message must have role and content"
564 | }
565 | }
566 |
567 | return true, ""
568 | }
569 |
570 | // extractToken extracts the bearer token from the request headers.
571 | func extractToken(r *http.Request) (string, error) {
572 | // read the Authorization header
573 | authHeader := r.Header.Get("Authorization")
574 | if authHeader == "" {
575 | return "", fmt.Errorf("missing Authorization header")
576 | }
577 |
578 | // validate the format and extract the token
579 | if !strings.HasPrefix(authHeader, "Bearer ") {
580 | return "", fmt.Errorf("invalid Authorization header format, must start with 'Bearer '")
581 | }
582 |
583 | // extract the token value
584 | token := strings.TrimPrefix(authHeader, "Bearer ")
585 | if token == "" {
586 | return "", fmt.Errorf("empty token in Authorization header")
587 | }
588 |
589 | return token, nil
590 | }
591 |
592 | // contentToString converts arbitrary content to a string.
593 | func contentToString(content interface{}) string {
594 | if content == nil {
595 | return ""
596 | }
597 |
598 | switch v := content.(type) {
599 | case string:
600 | return v
601 | default:
602 | jsonBytes, err := json.Marshal(v)
603 | if err != nil {
604 | logWarn("Failed to marshal content to JSON: %v", err)
605 | return ""
606 | }
607 | return string(jsonBytes)
608 | }
609 | }
610 |
611 | // generateQueryID generates a query ID.
612 | func generateQueryID() string {
613 | return fmt.Sprintf("%s%d", getRandomString(8), time.Now().UnixNano())
614 | }
615 |
616 | // isModelSupported reports whether the model is in the supported list.
617 | func isModelSupported(modelName string) bool {
618 | for _, supportedModel := range SupportedModels {
619 | if modelName == supportedModel {
620 | return true
621 | }
622 | }
623 | return false
624 | }
625 |
626 | // handleModelsRequest handles the model-list request.
627 | func handleModelsRequest(w http.ResponseWriter, r *http.Request) {
628 | logInfo("Handling models list request")
629 |
630 | // return the model list
631 | w.Header().Set("Content-Type", "application/json")
632 | w.WriteHeader(http.StatusOK)
633 |
634 | // build the model entries
635 | modelData := make([]map[string]interface{}, 0, len(SupportedModels))
636 | for _, model := range SupportedModels {
637 | modelData = append(modelData, map[string]interface{}{
638 | "id": model,
639 | "object": "model",
640 | "created": time.Now().Unix(),
641 | "owned_by": "TencentCloud",
642 | "capabilities": map[string]interface{}{
643 | "chat": true,
644 | "completions": true,
645 | "reasoning": true,
646 | },
647 | })
648 | }
649 |
650 | modelsList := map[string]interface{}{
651 | "object": "list",
652 | "data": modelData,
653 | }
654 |
655 | json.NewEncoder(w).Encode(modelsList)
656 | }
657 |
658 | // handleChatCompletionRequestWithPool handles a chat-completion request via the worker pool.
659 | func handleChatCompletionRequestWithPool(w http.ResponseWriter, r *http.Request, requestNum uint64) {
660 | reqID := generateRequestID()
661 | startTime := time.Now()
662 | logInfo("[reqID:%s] Handling chat completion request #%d", reqID, requestNum)
663 |
664 | // set up a timeout context
665 | ctx, cancel := context.WithTimeout(r.Context(), time.Duration(appConfig.Timeout)*time.Second)
666 | defer cancel()
667 |
668 | // attach the timeout context to the request
669 | r = r.WithContext(ctx)
670 |
671 | // recover from panics so one request cannot crash the server
672 | defer func() {
673 | if r := recover(); r != nil {
674 | logError("[reqID:%s] panic while handling request: %v", reqID, r)
675 | http.Error(w, "Internal server error", http.StatusInternalServerError)
676 | }
677 | }()
678 |
679 | // 解析请求体
680 | var apiReq APIRequest
681 | if err := json.NewDecoder(r.Body).Decode(&apiReq); err != nil {
682 | logError("[reqID:%s] 解析请求失败: %v", reqID, err)
683 | http.Error(w, "Invalid request body", http.StatusBadRequest)
684 | return
685 | }
686 |
687 | // 验证消息格式
688 | valid, errMsg := validateMessages(apiReq.Messages)
689 | if !valid {
690 | logError("[reqID:%s] 消息格式验证失败: %s", reqID, errMsg)
691 | http.Error(w, errMsg, http.StatusBadRequest)
692 | return
693 | }
694 |
695 | // 是否使用流式处理
696 | isStream := apiReq.Stream
697 |
698 | // 确定使用的模型
699 | modelName := appConfig.ModelName
700 | if apiReq.Model != "" {
701 | // 检查请求的模型是否是我们支持的
702 | if isModelSupported(apiReq.Model) {
703 | modelName = apiReq.Model
704 | } else {
705 | logWarn("[reqID:%s] 请求的模型 %s 不支持,使用默认模型 %s", reqID, apiReq.Model, modelName)
706 | }
707 | }
708 |
709 | logInfo("[reqID:%s] 使用模型: %s", reqID, modelName)
710 |
711 | // 创建混元API请求
712 | hunyuanReq := HunyuanRequest{
713 | Stream: true, // 混元API总是使用流式响应
714 | Model: modelName,
715 | QueryID: generateQueryID(),
716 | Messages: apiReq.Messages,
717 | StreamModeration: true,
718 | EnableEnhancement: false,
719 | }
720 |
721 | // 创建任务
722 | task := &Task{
723 | r: r,
724 | w: w,
725 | done: make(chan struct{}),
726 | reqID: reqID,
727 | isStream: isStream,
728 | hunyuanReq: hunyuanReq,
729 | }
730 |
731 | // 添加到任务队列
732 | atomic.AddInt64(&queuedRequests, 1)
733 | submitted, err := workerPool.SubmitTask(task)
734 | if !submitted {
735 | atomic.AddInt64(&queuedRequests, -1)
736 | atomic.AddInt64(&rejectedRequests, 1)
737 | logError("[reqID:%s] 提交任务失败: %v", reqID, err)
738 | w.Header().Set("Retry-After", "60")
739 | http.Error(w, "Server queue is full, please try again later", http.StatusServiceUnavailable)
740 | return
741 | }
742 |
743 | logInfo("[reqID:%s] 任务已提交到队列", reqID)
744 |
745 | // 等待任务完成或超时
746 | select {
747 | case <-task.done:
748 | // 任务已完成
749 | logInfo("[reqID:%s] 任务已完成", reqID)
750 | case <-r.Context().Done():
751 | // 请求被取消或超时
752 | logWarn("[reqID:%s] 请求被取消或超时", reqID)
753 | // 注意:虽然请求被取消,但worker可能仍在处理任务
754 | }
755 |
756 | // 请求处理完成,更新指标
757 | atomic.AddInt64(&queuedRequests, -1)
758 | elapsed := time.Since(startTime).Milliseconds()
759 |
760 | // 更新延迟直方图
761 | bucketIndex := min(int(elapsed/100), 9)
762 | atomic.AddInt64(&latencyHistogram[bucketIndex], 1)
763 |
764 | // 累计响应时间(读取指标时除以请求数即得平均值)
765 | atomic.AddInt64(&avgResponseTime, elapsed)
766 |
767 | if r.Context().Err() == nil {
768 | // 成功计数增加
769 | atomic.AddInt64(&successCounter, 1)
770 | logInfo("[reqID:%s] 请求处理成功,耗时: %dms", reqID, elapsed)
771 | } else {
772 | logError("[reqID:%s] 请求处理失败: %v, 耗时: %dms", reqID, r.Context().Err(), elapsed)
773 | }
774 | }
775 |
776 | // 处理聊天补全请求(原实现,已不使用)
777 | func handleChatCompletionRequest(w http.ResponseWriter, r *http.Request) {
778 | reqID := generateRequestID()
779 | startTime := time.Now()
780 | logInfo("[reqID:%s] 处理聊天补全请求", reqID)
781 |
782 | // 解析请求体
783 | var apiReq APIRequest
784 | if err := json.NewDecoder(r.Body).Decode(&apiReq); err != nil {
785 | logError("[reqID:%s] 解析请求失败: %v", reqID, err)
786 | http.Error(w, "Invalid request body", http.StatusBadRequest)
787 | return
788 | }
789 |
790 | // 验证消息格式
791 | valid, errMsg := validateMessages(apiReq.Messages)
792 | if !valid {
793 | logError("[reqID:%s] 消息格式验证失败: %s", reqID, errMsg)
794 | http.Error(w, errMsg, http.StatusBadRequest)
795 | return
796 | }
797 |
798 | // 是否使用流式处理
799 | isStream := apiReq.Stream
800 |
801 | // 确定使用的模型
802 | modelName := appConfig.ModelName
803 | if apiReq.Model != "" {
804 | // 检查请求的模型是否是我们支持的
805 | if isModelSupported(apiReq.Model) {
806 | modelName = apiReq.Model
807 | } else {
808 | logWarn("[reqID:%s] 请求的模型 %s 不支持,使用默认模型 %s", reqID, apiReq.Model, modelName)
809 | }
810 | }
811 |
812 | logInfo("[reqID:%s] 使用模型: %s", reqID, modelName)
813 |
814 | // 创建混元API请求
815 | hunyuanReq := HunyuanRequest{
816 | Stream: true, // 混元API总是使用流式响应
817 | Model: modelName,
818 | QueryID: generateQueryID(),
819 | Messages: apiReq.Messages,
820 | StreamModeration: true,
821 | EnableEnhancement: false,
822 | }
823 |
824 | // 转发请求到混元API
825 | var responseErr error
826 | if isStream {
827 | responseErr = handleStreamingRequest(w, r, hunyuanReq, reqID)
828 | } else {
829 | responseErr = handleNonStreamingRequest(w, r, hunyuanReq, reqID)
830 | }
831 |
832 | // 请求处理完成,更新指标
833 | elapsed := time.Since(startTime).Milliseconds()
834 |
835 | // 更新延迟直方图
836 | bucketIndex := min(int(elapsed/100), 9)
837 | atomic.AddInt64(&latencyHistogram[bucketIndex], 1)
838 |
839 | // 累计响应时间(读取指标时除以请求数即得平均值)
840 | atomic.AddInt64(&avgResponseTime, elapsed)
841 |
842 | if responseErr == nil {
843 | // 成功计数增加
844 | atomic.AddInt64(&successCounter, 1)
845 | logInfo("[reqID:%s] 请求处理成功,耗时: %dms", reqID, elapsed)
846 | } else {
847 | logError("[reqID:%s] 请求处理失败: %v, 耗时: %dms", reqID, responseErr, elapsed)
848 | }
849 | }
850 |
851 | // 安全的HTTP客户端,支持禁用SSL验证
852 | func getHTTPClient() *http.Client {
853 | tr := &http.Transport{
854 | MaxIdleConnsPerHost: 100,
855 | IdleConnTimeout: 90 * time.Second,
856 | TLSClientConfig: nil, // 默认配置
857 | }
858 |
859 | // 如果配置了禁用SSL验证
860 | if !appConfig.VerifySSL {
861 | tr.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
862 | }
863 |
864 | return &http.Client{
865 | Timeout: time.Duration(appConfig.Timeout) * time.Second,
866 | Transport: tr,
867 | }
868 | }
869 |
870 | // 处理流式请求
871 | func handleStreamingRequest(w http.ResponseWriter, r *http.Request, hunyuanReq HunyuanRequest, reqID string) error {
872 | logInfo("[reqID:%s] 处理流式请求", reqID)
873 |
874 | // 序列化请求
875 | jsonData, err := json.Marshal(hunyuanReq)
876 | if err != nil {
877 | logError("[reqID:%s] 序列化请求失败: %v", reqID, err)
878 | http.Error(w, "Internal server error", http.StatusInternalServerError)
879 | return err
880 | }
881 |
882 | // 创建请求
883 | httpReq, err := http.NewRequestWithContext(r.Context(), "POST", TargetURL, bytes.NewBuffer(jsonData))
884 | if err != nil {
885 | logError("[reqID:%s] 创建请求失败: %v", reqID, err)
886 | http.Error(w, "Internal server error", http.StatusInternalServerError)
887 | return err
888 | }
889 |
890 | // 设置请求头
891 | httpReq.Header.Set("Content-Type", "application/json")
892 | httpReq.Header.Set("Model", hunyuanReq.Model)
893 | setCommonHeaders(httpReq)
894 |
895 | // 创建HTTP客户端
896 | client := getHTTPClient()
897 |
898 | // 发送请求
899 | resp, err := client.Do(httpReq)
900 | if err != nil {
901 | logError("[reqID:%s] 发送请求失败: %v", reqID, err)
902 | http.Error(w, "Failed to connect to API", http.StatusBadGateway)
903 | return err
904 | }
905 | defer resp.Body.Close()
906 |
907 | // 检查响应状态
908 | if resp.StatusCode != http.StatusOK {
909 | logError("[reqID:%s] API返回非200状态码: %d", reqID, resp.StatusCode)
910 |
911 | bodyBytes, _ := io.ReadAll(resp.Body)
912 | logError("[reqID:%s] 错误响应内容: %s", reqID, string(bodyBytes))
913 |
914 | http.Error(w, fmt.Sprintf("API error with status code: %d", resp.StatusCode), resp.StatusCode)
915 | return fmt.Errorf("API返回非200状态码: %d", resp.StatusCode)
916 | }
917 |
918 | // 设置响应头
919 | w.Header().Set("Content-Type", "text/event-stream")
920 | w.Header().Set("Cache-Control", "no-cache")
921 | w.Header().Set("Connection", "keep-alive")
922 |
923 | // 创建响应ID和时间戳
924 | respID := fmt.Sprintf("chatcmpl-%s", getRandomString(10))
925 | createdTime := time.Now().Unix()
926 |
927 | // 创建读取器
928 | reader := bufio.NewReaderSize(resp.Body, 16384)
929 |
930 | // 创建Flusher
931 | flusher, ok := w.(http.Flusher)
932 | if !ok {
933 | logError("[reqID:%s] Streaming not supported", reqID)
934 | http.Error(w, "Streaming not supported", http.StatusInternalServerError)
935 | return fmt.Errorf("streaming not supported")
936 | }
937 |
938 | // 发送角色块
939 | roleChunk := createRoleChunk(respID, createdTime, hunyuanReq.Model)
940 | w.Write([]byte("data: " + string(roleChunk) + "\n\n"))
941 | flusher.Flush()
942 |
943 | // 持续读取响应
944 | for {
945 | // 添加超时检测
946 | select {
947 | case <-r.Context().Done():
948 | logWarn("[reqID:%s] 请求超时或被客户端取消", reqID)
949 | return fmt.Errorf("请求超时或被取消")
950 | default:
951 | // 继续处理
952 | }
953 |
954 | // 读取一行数据
955 | line, err := reader.ReadBytes('\n')
956 | if err != nil {
957 | if err != io.EOF {
958 | logError("[reqID:%s] 读取响应出错: %v", reqID, err)
959 | return err
960 | }
961 | break
962 | }
963 |
964 | // 处理数据行
965 | lineStr := string(line)
966 | if strings.HasPrefix(lineStr, "data: ") {
967 | jsonStr := strings.TrimPrefix(lineStr, "data: ")
968 | jsonStr = strings.TrimSpace(jsonStr)
969 |
970 | // 特殊处理[DONE]消息
971 | if jsonStr == "[DONE]" {
972 | logDebug("[reqID:%s] 收到[DONE]消息", reqID)
973 | w.Write([]byte("data: [DONE]\n\n"))
974 | flusher.Flush()
975 | return nil // 已正常收到[DONE],提前返回,避免循环结束后重复发送结束块
976 | }
977 |
978 | // 解析混元响应
979 | var hunyuanResp HunyuanResponse
980 | if err := json.Unmarshal([]byte(jsonStr), &hunyuanResp); err != nil {
981 | logWarn("[reqID:%s] 解析JSON失败: %v, data: %s", reqID, err, jsonStr)
982 | continue
983 | }
984 |
985 | // 处理各种类型的内容
986 | for _, choice := range hunyuanResp.Choices {
987 | if choice.Delta.Content != "" {
988 | // 发送内容块
989 | contentChunk := createContentChunk(respID, createdTime, hunyuanReq.Model, choice.Delta.Content)
990 | w.Write([]byte("data: " + string(contentChunk) + "\n\n"))
991 | flusher.Flush()
992 | }
993 |
994 | if choice.Delta.ReasoningContent != "" {
995 | // 发送推理内容块
996 | reasoningChunk := createReasoningChunk(respID, createdTime, hunyuanReq.Model, choice.Delta.ReasoningContent)
997 | w.Write([]byte("data: " + string(reasoningChunk) + "\n\n"))
998 | flusher.Flush()
999 | }
1000 |
1001 | // 处理完成标志
1002 | if choice.FinishReason != nil {
1003 | finishReason := *choice.FinishReason
1004 | if finishReason != "" {
1005 | doneChunk := createDoneChunk(respID, createdTime, hunyuanReq.Model, finishReason)
1006 | w.Write([]byte("data: " + string(doneChunk) + "\n\n"))
1007 | flusher.Flush()
1008 | }
1009 | }
1010 | }
1011 | }
1012 | }
1013 |
1014 | // 发送结束信号(如果没有正常结束)
1015 | finishReason := "stop"
1016 | doneChunk := createDoneChunk(respID, createdTime, hunyuanReq.Model, finishReason)
1017 | w.Write([]byte("data: " + string(doneChunk) + "\n\n"))
1018 | w.Write([]byte("data: [DONE]\n\n"))
1019 | flusher.Flush()
1020 |
1021 | return nil
1022 | }
1023 |
1024 | // 处理非流式请求
1025 | func handleNonStreamingRequest(w http.ResponseWriter, r *http.Request, hunyuanReq HunyuanRequest, reqID string) error {
1026 | logInfo("[reqID:%s] 处理非流式请求", reqID)
1027 |
1028 | // 序列化请求
1029 | jsonData, err := json.Marshal(hunyuanReq)
1030 | if err != nil {
1031 | logError("[reqID:%s] 序列化请求失败: %v", reqID, err)
1032 | http.Error(w, "Internal server error", http.StatusInternalServerError)
1033 | return err
1034 | }
1035 |
1036 | // 创建请求
1037 | httpReq, err := http.NewRequestWithContext(r.Context(), "POST", TargetURL, bytes.NewBuffer(jsonData))
1038 | if err != nil {
1039 | logError("[reqID:%s] 创建请求失败: %v", reqID, err)
1040 | http.Error(w, "Internal server error", http.StatusInternalServerError)
1041 | return err
1042 | }
1043 |
1044 | // 设置请求头
1045 | httpReq.Header.Set("Content-Type", "application/json")
1046 | httpReq.Header.Set("Model", hunyuanReq.Model)
1047 | setCommonHeaders(httpReq)
1048 |
1049 | // 创建HTTP客户端
1050 | client := getHTTPClient()
1051 |
1052 | // 发送请求
1053 | resp, err := client.Do(httpReq)
1054 | if err != nil {
1055 | logError("[reqID:%s] 发送请求失败: %v", reqID, err)
1056 | http.Error(w, "Failed to connect to API", http.StatusBadGateway)
1057 | return err
1058 | }
1059 | defer resp.Body.Close()
1060 |
1061 | // 检查响应状态
1062 | if resp.StatusCode != http.StatusOK {
1063 | logError("[reqID:%s] API返回非200状态码: %d", reqID, resp.StatusCode)
1064 |
1065 | bodyBytes, _ := io.ReadAll(resp.Body)
1066 | logError("[reqID:%s] 错误响应内容: %s", reqID, string(bodyBytes))
1067 |
1068 | http.Error(w, fmt.Sprintf("API error with status code: %d", resp.StatusCode), resp.StatusCode)
1069 | return fmt.Errorf("API返回非200状态码: %d", resp.StatusCode)
1070 | }
1071 |
1072 | // 读取完整的流式响应
1073 | bodyBytes, err := io.ReadAll(resp.Body)
1074 | if err != nil {
1075 | logError("[reqID:%s] 读取响应失败: %v", reqID, err)
1076 | http.Error(w, "Failed to read API response", http.StatusInternalServerError)
1077 | return err
1078 | }
1079 |
1080 | // 解析流式响应并提取完整内容
1081 | fullContent, reasoningContent, err := extractFullContentFromStream(bodyBytes, reqID)
1082 | if err != nil {
1083 | logError("[reqID:%s] 解析流式响应失败: %v", reqID, err)
1084 | http.Error(w, "Failed to parse streaming response", http.StatusInternalServerError)
1085 | return err
1086 | }
1087 |
1088 | // 构建完整的非流式响应
1089 | completionResponse := CompletionResponse{
1090 | ID: fmt.Sprintf("chatcmpl-%s", getRandomString(10)),
1091 | Object: "chat.completion",
1092 | Created: time.Now().Unix(),
1093 | Model: hunyuanReq.Model,
1094 | Choices: []struct {
1095 | Index int `json:"index"`
1096 | FinishReason string `json:"finish_reason"`
1097 | Message struct {
1098 | Role string `json:"role"`
1099 | Content string `json:"content"`
1100 | ReasoningContent string `json:"reasoning_content,omitempty"`
1101 | } `json:"message"`
1102 | }{
1103 | {
1104 | Index: 0,
1105 | FinishReason: "stop",
1106 | Message: struct {
1107 | Role string `json:"role"`
1108 | Content string `json:"content"`
1109 | ReasoningContent string `json:"reasoning_content,omitempty"`
1110 | }{
1111 | Role: "assistant",
1112 | Content: fullContent,
1113 | ReasoningContent: reasoningContent,
1114 | },
1115 | },
1116 | },
1117 | Usage: struct {
1118 | PromptTokens int `json:"prompt_tokens"`
1119 | CompletionTokens int `json:"completion_tokens"`
1120 | TotalTokens int `json:"total_tokens"`
1121 | }{
1122 | PromptTokens: len(formatMessages(hunyuanReq.Messages)) / 4, // 粗略估算:按约 4 字节 1 个 Token
1123 | CompletionTokens: len(fullContent) / 4,
1124 | TotalTokens: (len(formatMessages(hunyuanReq.Messages)) + len(fullContent)) / 4,
1125 | },
1126 | }
1127 |
1128 | // 返回响应
1129 | w.Header().Set("Content-Type", "application/json")
1130 | if err := json.NewEncoder(w).Encode(completionResponse); err != nil {
1131 | logError("[reqID:%s] 编码响应失败: %v", reqID, err)
1132 | http.Error(w, "Failed to encode response", http.StatusInternalServerError)
1133 | return err
1134 | }
1135 |
1136 | return nil
1137 | }
1138 |
1139 | // 从流式响应中提取完整内容
1140 | func extractFullContentFromStream(bodyBytes []byte, reqID string) (string, string, error) {
1141 | bodyStr := string(bodyBytes)
1142 | lines := strings.Split(bodyStr, "\n")
1143 |
1144 | // 内容累积器
1145 | var contentBuilder strings.Builder
1146 | var reasoningBuilder strings.Builder
1147 |
1148 | // 解析每一行
1149 | for _, line := range lines {
1150 | if strings.HasPrefix(line, "data: ") && !strings.Contains(line, "[DONE]") {
1151 | jsonStr := strings.TrimPrefix(line, "data: ")
1152 | jsonStr = strings.TrimSpace(jsonStr)
1153 |
1154 | // 解析JSON
1155 | var hunyuanResp HunyuanResponse
1156 | if err := json.Unmarshal([]byte(jsonStr), &hunyuanResp); err != nil {
1157 | continue // 跳过无效JSON
1158 | }
1159 |
1160 | // 提取内容和推理内容
1161 | for _, choice := range hunyuanResp.Choices {
1162 | if choice.Delta.Content != "" {
1163 | contentBuilder.WriteString(choice.Delta.Content)
1164 | }
1165 | if choice.Delta.ReasoningContent != "" {
1166 | reasoningBuilder.WriteString(choice.Delta.ReasoningContent)
1167 | }
1168 | }
1169 | }
1170 | }
1171 |
1172 | return contentBuilder.String(), reasoningBuilder.String(), nil
1173 | }
1174 |
1175 | // 创建角色块
1176 | func createRoleChunk(id string, created int64, model string) []byte {
1177 | chunk := StreamChunk{
1178 | ID: id,
1179 | Object: "chat.completion.chunk",
1180 | Created: created,
1181 | Model: model,
1182 | Choices: []struct {
1183 | Index int `json:"index"`
1184 | FinishReason *string `json:"finish_reason,omitempty"`
1185 | Delta struct {
1186 | Role string `json:"role,omitempty"`
1187 | Content string `json:"content,omitempty"`
1188 | ReasoningContent string `json:"reasoning_content,omitempty"`
1189 | } `json:"delta"`
1190 | }{
1191 | {
1192 | Index: 0,
1193 | Delta: struct {
1194 | Role string `json:"role,omitempty"`
1195 | Content string `json:"content,omitempty"`
1196 | ReasoningContent string `json:"reasoning_content,omitempty"`
1197 | }{
1198 | Role: "assistant",
1199 | },
1200 | },
1201 | },
1202 | }
1203 |
1204 | data, _ := json.Marshal(chunk)
1205 | return data
1206 | }
1207 |
1208 | // 创建内容块
1209 | func createContentChunk(id string, created int64, model string, content string) []byte {
1210 | chunk := StreamChunk{
1211 | ID: id,
1212 | Object: "chat.completion.chunk",
1213 | Created: created,
1214 | Model: model,
1215 | Choices: []struct {
1216 | Index int `json:"index"`
1217 | FinishReason *string `json:"finish_reason,omitempty"`
1218 | Delta struct {
1219 | Role string `json:"role,omitempty"`
1220 | Content string `json:"content,omitempty"`
1221 | ReasoningContent string `json:"reasoning_content,omitempty"`
1222 | } `json:"delta"`
1223 | }{
1224 | {
1225 | Index: 0,
1226 | Delta: struct {
1227 | Role string `json:"role,omitempty"`
1228 | Content string `json:"content,omitempty"`
1229 | ReasoningContent string `json:"reasoning_content,omitempty"`
1230 | }{
1231 | Content: content,
1232 | },
1233 | },
1234 | },
1235 | }
1236 |
1237 | data, _ := json.Marshal(chunk)
1238 | return data
1239 | }
1240 |
1241 | // 创建推理内容块
1242 | func createReasoningChunk(id string, created int64, model string, reasoningContent string) []byte {
1243 | chunk := StreamChunk{
1244 | ID: id,
1245 | Object: "chat.completion.chunk",
1246 | Created: created,
1247 | Model: model,
1248 | Choices: []struct {
1249 | Index int `json:"index"`
1250 | FinishReason *string `json:"finish_reason,omitempty"`
1251 | Delta struct {
1252 | Role string `json:"role,omitempty"`
1253 | Content string `json:"content,omitempty"`
1254 | ReasoningContent string `json:"reasoning_content,omitempty"`
1255 | } `json:"delta"`
1256 | }{
1257 | {
1258 | Index: 0,
1259 | Delta: struct {
1260 | Role string `json:"role,omitempty"`
1261 | Content string `json:"content,omitempty"`
1262 | ReasoningContent string `json:"reasoning_content,omitempty"`
1263 | }{
1264 | ReasoningContent: reasoningContent,
1265 | },
1266 | },
1267 | },
1268 | }
1269 |
1270 | data, _ := json.Marshal(chunk)
1271 | return data
1272 | }
1273 |
1274 | // 创建完成块
1275 | func createDoneChunk(id string, created int64, model string, reason string) []byte {
1276 | finishReason := reason
1277 | chunk := StreamChunk{
1278 | ID: id,
1279 | Object: "chat.completion.chunk",
1280 | Created: created,
1281 | Model: model,
1282 | Choices: []struct {
1283 | Index int `json:"index"`
1284 | FinishReason *string `json:"finish_reason,omitempty"`
1285 | Delta struct {
1286 | Role string `json:"role,omitempty"`
1287 | Content string `json:"content,omitempty"`
1288 | ReasoningContent string `json:"reasoning_content,omitempty"`
1289 | } `json:"delta"`
1290 | }{
1291 | {
1292 | Index: 0,
1293 | FinishReason: &finishReason,
1294 | Delta: struct {
1295 | Role string `json:"role,omitempty"`
1296 | Content string `json:"content,omitempty"`
1297 | ReasoningContent string `json:"reasoning_content,omitempty"`
1298 | }{},
1299 | },
1300 | },
1301 | }
1302 |
1303 | data, _ := json.Marshal(chunk)
1304 | return data
1305 | }
1306 |
1307 | // 设置常见的请求头 - 参考Python版本
1308 | func setCommonHeaders(req *http.Request) {
1309 | req.Header.Set("accept", "*/*")
1310 | req.Header.Set("accept-language", "zh-CN,zh;q=0.9,en;q=0.8,zh-TW;q=0.7")
1311 | req.Header.Set("authorization", "Bearer "+appConfig.BearerToken)
1312 | req.Header.Set("dnt", "1")
1313 | req.Header.Set("origin", "https://llm.hunyuan.tencent.com")
1314 | req.Header.Set("polaris", "stream-server-online-sbs-10697")
1315 | req.Header.Set("priority", "u=1, i")
1316 | req.Header.Set("referer", "https://llm.hunyuan.tencent.com/")
1317 | req.Header.Set("sec-ch-ua", "\"Chromium\";v=\"134\", \"Not:A-Brand\";v=\"24\", \"Google Chrome\";v=\"134\"")
1318 | req.Header.Set("sec-ch-ua-mobile", "?0")
1319 | req.Header.Set("sec-ch-ua-platform", "\"Windows\"")
1320 | req.Header.Set("sec-fetch-dest", "empty")
1321 | req.Header.Set("sec-fetch-mode", "cors")
1322 | req.Header.Set("sec-fetch-site", "same-origin")
1323 | req.Header.Set("staffname", "staryxzhang")
1324 | req.Header.Set("wsid", "10697")
1325 | req.Header.Set("user-agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36")
1326 | }
1327 |
1328 | // 生成请求ID
1329 | func generateRequestID() string {
1330 | return fmt.Sprintf("%x", time.Now().UnixNano())
1331 | }
1332 |
1333 | // 生成随机字符串
1334 | // 原实现按纳秒时钟取模并逐字符 Sleep,时钟精度不足时会产生大量重复字符;改用 math/rand(需导入)
1335 | func getRandomString(length int) string {
1336 | const charset = "abcdefghijklmnopqrstuvwxyz0123456789"
1337 | b := make([]byte, length)
1338 | for i := range b {
1339 | b[i] = charset[rand.Intn(len(charset))]
1340 | }
1341 | return string(b)
1342 | }
1343 |
1344 | // 格式化消息为字符串
1345 | func formatMessages(messages []APIMessage) string {
1346 | var result strings.Builder
1347 | for _, msg := range messages {
1348 | result.WriteString(msg.Role)
1349 | result.WriteString(": ")
1350 | result.WriteString(contentToString(msg.Content))
1351 | result.WriteString("\n")
1352 | }
1353 | return result.String()
1354 | }
1355 |
1356 | // 获取两个整数中的最小值
1357 | func min(a, b int) int {
1358 | if a < b {
1359 | return a
1360 | }
1361 | return b
1362 | }
1363 |
1364 | // 获取两个整数中的最大值
1365 | func max(a, b int64) int64 {
1366 | if a > b {
1367 | return a
1368 | }
1369 | return b
1370 | }
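
// 编辑补充:下面是一个可独立运行的最小示例(假设性简化,结构体只保留 delta.content 字段),
// 演示上文 extractFullContentFromStream 所用的 SSE 行解析与内容累积思路:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// 简化的流式块结构(假设性示例,仅保留 delta.content)
type sseChunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// extractContent 逐行解析 SSE 文本:跳过 [DONE] 与无效 JSON,累积 delta.content
func extractContent(body string) string {
	var sb strings.Builder
	for _, line := range strings.Split(body, "\n") {
		if !strings.HasPrefix(line, "data: ") || strings.Contains(line, "[DONE]") {
			continue
		}
		jsonStr := strings.TrimSpace(strings.TrimPrefix(line, "data: "))
		var c sseChunk
		if err := json.Unmarshal([]byte(jsonStr), &c); err != nil {
			continue // 跳过无效 JSON,与上文实现一致
		}
		for _, ch := range c.Choices {
			sb.WriteString(ch.Delta.Content)
		}
	}
	return sb.String()
}

func main() {
	stream := "data: {\"choices\":[{\"delta\":{\"content\":\"你好\"}}]}\n" +
		"data: {\"choices\":[{\"delta\":{\"content\":\",世界\"}}]}\n" +
		"data: [DONE]\n"
	fmt.Println(extractContent(stream)) // 输出:你好,世界
}
```

// 该示例与正式实现的差别:正式实现还单独累积 reasoning_content,并返回错误值。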
--------------------------------------------------------------------------------
/qwen2api-cf.js:
--------------------------------------------------------------------------------
1 | // 通义千问 OpenAI 兼容代理 - 完整版
2 | // 包含 /v1/models、/v1/chat/completions(流式与非流式)、/v1/images/generations 以及图片上传功能
3 | // 使用方法:将 https://chat.qwen.ai/ 的 Cookie 中 token 字段的值作为 APIKEY 传入,按 OpenAI 兼容标准接口调用即可
4 |
5 | export default {
6 | // 内置模型列表(当获取接口失败时使用)
7 | defaultModels: [
8 | "qwen-max-latest",
9 | "qwen-plus-latest",
10 | "qwen2.5-vl-72b-instruct",
11 | "qwen2.5-14b-instruct-1m",
12 | "qvq-72b-preview",
13 | "qwq-32b-preview",
14 | "qwen2.5-coder-32b-instruct",
15 | "qwen-turbo-latest",
16 | "qwen2.5-72b-instruct"
17 | ],
18 |
19 | // 主入口:根据 URL 路径分发请求
20 | async fetch(request, env, ctx) {
21 | const url = new URL(request.url);
22 | const path = url.pathname;
23 | const apiPrefix = env.API_PREFIX || '';
24 |
25 | if (path === `${apiPrefix}/v1/models`) {
26 | return this.handleModels(request);
27 | } else if (path === `${apiPrefix}/v1/chat/completions`) {
28 | return this.handleChatCompletions(request);
29 | } else if (path === `${apiPrefix}/v1/images/generations`) {
30 | return this.handleImageGenerations(request);
31 | }
32 |
33 | return new Response("Not Found", { status: 404 });
34 | },
35 |
36 | // 从请求中提取 Authorization token
37 | getAuthToken(request) {
38 | const authHeader = request.headers.get('authorization');
39 | if (!authHeader) return null;
40 | return authHeader.replace('Bearer ', '');
41 | },
42 |
43 | // 处理模型列表接口
44 | async handleModels(request) {
45 | const authToken = this.getAuthToken(request);
46 | let modelsList = [];
47 |
48 | if (authToken) {
49 | try {
50 | const response = await fetch('https://chat.qwen.ai/api/models', {
51 | headers: {
52 | 'Authorization': `Bearer ${authToken}`,
53 | 'User-Agent': 'Mozilla/5.0'
54 | }
55 | });
56 | if (response.ok) {
57 | const data = await response.json();
58 | modelsList = data.data.map(item => item.id);
59 | } else {
60 | modelsList = [...this.defaultModels];
61 | }
62 | } catch (e) {
63 | console.error('获取模型列表失败:', e);
64 | modelsList = [...this.defaultModels];
65 | }
66 | } else {
67 | modelsList = [...this.defaultModels];
68 | }
69 |
70 | // 扩展模型列表,增加变种后缀
71 | const expandedModels = [];
72 | for (const model of modelsList) {
73 | expandedModels.push(model);
74 | expandedModels.push(model + '-thinking');
75 | expandedModels.push(model + '-search');
76 | expandedModels.push(model + '-thinking-search');
77 | expandedModels.push(model + '-draw');
78 | }
79 |
80 | return new Response(JSON.stringify({
81 | object: "list",
82 | data: expandedModels.map(id => ({
83 | id,
84 | object: "model",
85 | created: Math.floor(Date.now() / 1000),
86 | owned_by: "qwen"
87 | }))
88 | }), { headers: { 'Content-Type': 'application/json' } });
89 | },
90 |
91 | // 处理 /v1/chat/completions 接口
92 | async handleChatCompletions(request) {
93 | const authToken = this.getAuthToken(request);
94 | if (!authToken) {
95 | return new Response(JSON.stringify({
96 | error: "请提供正确的 Authorization token"
97 | }), { status: 401, headers: { 'Content-Type': 'application/json' } });
98 | }
99 |
100 | let body;
101 | try {
102 | body = await request.json();
103 | } catch (error) {
104 | return new Response(JSON.stringify({
105 | error: "无效的请求体,请提供有效的JSON"
106 | }), { status: 400, headers: { 'Content-Type': 'application/json' } });
107 | }
108 |
109 | const stream = !!body.stream;
110 | const messages = body.messages || [];
111 | const requestId = crypto.randomUUID();
112 |
113 | if (!Array.isArray(messages) || messages.length === 0) {
114 | return new Response(JSON.stringify({
115 | error: "请提供有效的 messages 数组"
116 | }), { status: 400, headers: { 'Content-Type': 'application/json' } });
117 | }
118 |
119 | let modelName = body.model || "qwen-turbo-latest";
120 | let chatType = "t2t";
121 |
122 | // 如果模型名包含 -draw,则走图像生成流程
123 | if (modelName.includes('-draw')) {
124 | return this.handleDrawRequest(messages, modelName, authToken);
125 | }
126 |
127 | // 如果是 -thinking 模式,则设置思考配置
128 | if (modelName.includes('-thinking')) {
129 | modelName = modelName.replace('-thinking', '');
130 | if (messages[messages.length - 1]) {
131 | messages[messages.length - 1].feature_config = { thinking_enabled: true };
132 | }
133 | }
134 |
135 | // 如果是 -search 模式,则修改 chat_type
136 | if (modelName.includes('-search')) {
137 | modelName = modelName.replace('-search', '');
138 | chatType = "search";
139 | if (messages[messages.length - 1]) {
140 | messages[messages.length - 1].chat_type = "search";
141 | }
142 | }
143 |
144 | const requestBody = {
145 | model: modelName,
146 | messages,
147 | stream,
148 | chat_type: chatType,
149 | id: requestId
150 | };
151 |
152 | // 处理图片消息(例如上传图片):
153 | const lastMessage = messages[messages.length - 1];
154 | if (Array.isArray(lastMessage?.content)) {
155 | const imageItem = lastMessage.content.find(item =>
156 | item.image_url && item.image_url.url
157 | );
158 | if (imageItem) {
159 | const imageId = await this.uploadImage(imageItem.image_url.url, authToken);
160 | if (imageId) {
161 | const index = lastMessage.content.findIndex(item =>
162 | item.image_url && item.image_url.url
163 | );
164 | if (index >= 0) {
165 | lastMessage.content[index] = {
166 | type: "image",
167 | image: imageId
168 | };
169 | }
170 | }
171 | }
172 | }
173 |
174 | try {
175 | const response = await fetch('https://chat.qwen.ai/api/chat/completions', {
176 | method: 'POST',
177 | headers: {
178 | 'Authorization': `Bearer ${authToken}`,
179 | 'Content-Type': 'application/json',
180 | 'User-Agent': 'Mozilla/5.0'
181 | },
182 | body: JSON.stringify(requestBody)
183 | });
184 |
185 | if (!response.ok) {
186 | const errText = await response.text();
187 | console.error('Qwen 接口调用失败:', response.status, errText);
188 | return new Response(JSON.stringify({
189 | error: `请求通义千问API失败: ${response.status}`,
190 | details: errText
191 | }), { status: response.status, headers: { 'Content-Type': 'application/json' } });
192 | }
193 |
194 | if (stream) {
195 | return this.handleStreamResponse(response, requestId, modelName);
196 | } else {
197 | return this.handleNormalResponse(response, requestId, modelName);
198 | }
199 | } catch (e) {
200 | console.error('请求失败:', e);
201 | return new Response(JSON.stringify({
202 | error: "请求通义千问API失败,请检查 token 是否正确"
203 | }), { status: 500, headers: { 'Content-Type': 'application/json' } });
204 | }
205 | },
206 |
207 | // ---------------------- 流式响应处理(改进) ----------------------
208 | async handleStreamResponse(fetchResponse, requestId, modelName) {
209 | const { readable, writable } = new TransformStream();
210 | const writer = writable.getWriter();
211 | const encoder = new TextEncoder();
212 |
213 | // 辅助函数:将 payload 包装为 SSE 格式后写入,并编码成字节
214 | const sendSSE = async (payload) => {
215 | await writer.write(encoder.encode(`data: ${payload}\n\n`));
216 | };
217 |
218 | // 用于去重和累积内容;isFirstChunk 必须声明在此公共作用域,processBuffer 闭包才能访问
219 | let previousDelta = "";
220 | let cumulativeContent = ""; // 累积完整内容,解决断流问题
221 | let isFirstChunk = true; // 首个内容块需携带 role: 'assistant'
222 |
223 | const processStream = async () => {
224 | try {
225 | const reader = fetchResponse.body.getReader();
226 | const decoder = new TextDecoder('utf-8');
227 | let buffer = '';
228 |
229 | while (true) {
230 | const { done, value } = await reader.read();
231 | if (done) {
232 | // 确保最后一个缓冲区也被处理
233 | if (buffer.trim()) {
234 | await processBuffer(buffer);
235 | }
236 | break;
237 | }
238 |
239 | const chunkStr = decoder.decode(value, { stream: true });
240 | buffer += chunkStr;
241 |
242 | // 按 SSE 规范以双换行分隔消息:只把最后一个完整边界之前的部分交给 processBuffer,
243 | // 避免边界后的残留片段在下一轮读取时被重复解析
244 | const lastBoundaryIndex = buffer.lastIndexOf('\n\n');
245 | if (lastBoundaryIndex !== -1) {
246 | await processBuffer(buffer.substring(0, lastBoundaryIndex + 2));
247 | // 仅保留可能不完整的最后一部分
248 | buffer = buffer.substring(lastBoundaryIndex + 2);
249 | }
250 | }
251 |
252 | // 确保发送最终 DONE 信号
253 | console.log(`流处理完成,累积内容长度: ${cumulativeContent.length}`);
254 | await sendSSE('[DONE]');
255 | } catch (err) {
256 | console.error('处理 SSE 流时出错:', err);
257 | const errorChunk = {
258 | id: `chatcmpl-${requestId}`,
259 | object: 'chat.completion.chunk',
260 | created: Math.floor(Date.now() / 1000),
261 | model: modelName,
262 | choices: [
263 | {
264 | index: 0,
265 | delta: { content: '【流式处理出错,请重试】' },
266 | finish_reason: 'error'
267 | }
268 | ]
269 | };
270 | try {
271 | await sendSSE(JSON.stringify(errorChunk));
272 | await sendSSE('[DONE]');
273 | } catch (_) {}
274 | } finally {
275 | await writer.close();
276 | }
277 | };
278 |
279 | // 处理缓冲区内的完整 SSE 消息
280 | const processBuffer = async (buffer) => {
281 | // 按 data: 行分割
282 | const dataLineRegex = /^data: (.+)$/gm;
283 | let match;
284 |
285 | while ((match = dataLineRegex.exec(buffer)) !== null) {
286 | const dataStr = match[1].trim();
287 |
288 | if (dataStr === '[DONE]') {
289 | await sendSSE('[DONE]');
290 | console.log('收到 [DONE],流结束');
291 | continue;
292 | }
293 |
294 | try {
295 | const jsonData = JSON.parse(dataStr);
296 | const delta = jsonData?.choices?.[0]?.delta;
297 | if (!delta) continue;
298 |
299 | let currentDelta = delta.content || "";
300 |
301 | // 改进的去重逻辑:如果有完整内容,检查是否为前缀
302 | if (currentDelta) {
303 | let newContent = currentDelta;
304 | let needsSending = true;
305 |
306 | if (previousDelta && currentDelta.startsWith(previousDelta)) {
307 | // 只提取新增部分
308 | newContent = currentDelta.substring(previousDelta.length);
309 | // 如果没有新增内容,跳过发送
310 | if (!newContent) needsSending = false;
311 | }
312 |
313 | if (needsSending) {
314 | // 创建并发送内容块
315 | const openaiChunk = {
316 | id: `chatcmpl-${requestId}`,
317 | object: 'chat.completion.chunk',
318 | created: Math.floor(Date.now() / 1000),
319 | model: modelName,
320 | choices: [
321 | {
322 | index: 0,
323 | delta: isFirstChunk
324 | ? { role: 'assistant', content: newContent }
325 | : { content: newContent },
326 | finish_reason: null
327 | }
328 | ]
329 | };
330 |
331 | if (isFirstChunk) isFirstChunk = false;
332 | await sendSSE(JSON.stringify(openaiChunk));
333 |
334 | // 累积内容
335 | cumulativeContent += newContent;
336 | }
337 |
338 | // 更新之前的内容为当前完整内容
339 | previousDelta = currentDelta;
340 | }
341 |
342 | // 处理完成标志
343 | if (jsonData?.choices?.[0]?.finish_reason) {
344 | const finishChunk = {
345 | id: `chatcmpl-${requestId}`,
346 | object: 'chat.completion.chunk',
347 | created: Math.floor(Date.now() / 1000),
348 | model: modelName,
349 | choices: [
350 | {
351 | index: 0,
352 | delta: {},
353 | finish_reason: jsonData.choices[0].finish_reason
354 | }
355 | ]
356 | };
357 | await sendSSE(JSON.stringify(finishChunk));
358 | }
359 | } catch (err) {
360 |         console.error('Failed to parse SSE JSON:', dataStr, err);
361 | }
362 | }
363 | };
364 |
365 | processStream();
366 | return new Response(readable, {
367 | headers: {
368 | 'Content-Type': 'text/event-stream',
369 | 'Connection': 'keep-alive',
370 | 'Cache-Control': 'no-cache',
371 | 'X-Accel-Buffering': 'no'
372 | }
373 | });
374 | },
375 |
376 |   // ---------------------- Normal (non-streaming) response ----------------------
377 | async handleNormalResponse(fetchResponse, requestId, modelName) {
378 | try {
379 | const data = await fetchResponse.json();
380 | const content = data?.choices?.[0]?.message?.content || '';
381 | const finishReason = data?.choices?.[0]?.finish_reason || 'stop';
382 |
383 | return new Response(JSON.stringify({
384 | id: `chatcmpl-${requestId}`,
385 | object: 'chat.completion',
386 |         created: Math.floor(Date.now() / 1000), // Unix seconds
387 | model: modelName,
388 | choices: [
389 | {
390 | index: 0,
391 | message: {
392 | role: 'assistant',
393 | content
394 | },
395 | finish_reason: finishReason
396 | }
397 | ],
398 | usage: data?.usage || { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 }
399 | }), { headers: { 'Content-Type': 'application/json' } });
400 | } catch (e) {
401 |       console.error('Failed to parse non-streaming response:', e);
402 |       return new Response(JSON.stringify({
403 |         error: "Failed to parse the Qwen response"
404 | }), { status: 500, headers: { 'Content-Type': 'application/json' } });
405 | }
406 | },
407 |
408 |   // ---------------------- Image generation request (handleDrawRequest) ----------------------
409 | async handleDrawRequest(messages, model, authToken) {
410 | const prompt = messages[messages.length - 1].content;
411 | const size = '1024*1024';
412 | const pureModelName = model.replace('-draw', '').replace('-thinking', '').replace('-search', '');
413 |
414 | try {
415 |       // Create the image-generation task
416 | const createResponse = await fetch('https://chat.qwen.ai/api/chat/completions', {
417 | method: 'POST',
418 | headers: {
419 | 'Authorization': `Bearer ${authToken}`,
420 | 'Content-Type': 'application/json',
421 | 'User-Agent': 'Mozilla/5.0'
422 | },
423 | body: JSON.stringify({
424 | stream: false,
425 | incremental_output: true,
426 | chat_type: "t2i",
427 | model: pureModelName,
428 | messages: [
429 | {
430 | role: "user",
431 | content: prompt,
432 | chat_type: "t2i",
433 | extra: {},
434 | feature_config: { thinking_enabled: false }
435 | }
436 | ],
437 | id: crypto.randomUUID(),
438 | size: size
439 | })
440 | });
441 |
442 | if (!createResponse.ok) {
443 | const errorText = await createResponse.text();
444 | return new Response(JSON.stringify({
445 |           error: "Failed to create image-generation task",
446 | details: errorText
447 | }), {
448 | status: 500,
449 | headers: { 'Content-Type': 'application/json' }
450 | });
451 | }
452 |
453 | const createData = await createResponse.json();
454 | let taskId = null;
455 |
456 |       // Find the task ID (guard against a missing messages array)
457 |       for (const msg of createData.messages || []) {
458 | if (msg.role === 'assistant' && msg.extra?.wanx?.task_id) {
459 | taskId = msg.extra.wanx.task_id;
460 | break;
461 | }
462 | }
463 |
464 | if (!taskId) {
465 | return new Response(JSON.stringify({
466 |           error: "Could not obtain the image-generation task ID"
467 | }), {
468 | status: 500,
469 | headers: { 'Content-Type': 'application/json' }
470 | });
471 | }
472 |
473 |       // Poll until the image is ready (up to 30 attempts, 6 seconds apart)
474 | let imageUrl = null;
475 | for (let i = 0; i < 30; i++) {
476 | try {
477 | const statusResponse = await fetch(`https://chat.qwen.ai/api/v1/tasks/status/${taskId}`, {
478 | headers: {
479 | 'Authorization': `Bearer ${authToken}`,
480 | 'User-Agent': 'Mozilla/5.0'
481 | }
482 | });
483 | if (statusResponse.ok) {
484 | const statusData = await statusResponse.json();
485 | if (statusData.content) {
486 | imageUrl = statusData.content;
487 | break;
488 | }
489 | }
490 | } catch (error) {
491 |           // Ignore individual polling errors
492 | }
493 | await new Promise(resolve => setTimeout(resolve, 6000));
494 | }
495 |
496 | if (!imageUrl) {
497 | return new Response(JSON.stringify({
498 |           error: "Image generation timed out"
499 | }), {
500 | status: 500,
501 | headers: { 'Content-Type': 'application/json' }
502 | });
503 | }
504 |
505 |       // Return an OpenAI-style response (image embedded as Markdown)
506 | return new Response(JSON.stringify({
507 | id: `chatcmpl-${crypto.randomUUID()}`,
508 | object: "chat.completion",
509 |         created: Math.floor(Date.now() / 1000), // Unix seconds
510 | model: model,
511 | choices: [
512 | {
513 | index: 0,
514 | message: {
515 | role: "assistant",
516 |               content: `![](${imageUrl})`
517 | },
518 | finish_reason: "stop"
519 | }
520 | ],
521 | usage: {
522 | prompt_tokens: 1024,
523 | completion_tokens: 1024,
524 | total_tokens: 2048
525 | }
526 | }), {
527 | headers: { 'Content-Type': 'application/json' }
528 | });
529 | } catch (error) {
530 |       console.error('Image generation failed:', error);
531 |       return new Response(JSON.stringify({
532 |         error: "Image generation request failed"
533 | }), {
534 | status: 500,
535 | headers: { 'Content-Type': 'application/json' }
536 | });
537 | }
538 | },
539 |
540 |   // ---------------------- Image generation endpoint (/v1/images/generations) ----------------------
541 | async handleImageGenerations(request) {
542 | const authToken = this.getAuthToken(request);
543 | if (!authToken) {
544 | return new Response(JSON.stringify({
545 |         error: "Please provide a valid Authorization token"
546 | }), {
547 | status: 401,
548 | headers: { 'Content-Type': 'application/json' }
549 | });
550 | }
551 |
552 | let body;
553 | try {
554 | body = await request.json();
555 | } catch (error) {
556 | return new Response(JSON.stringify({
557 |         error: "Invalid request body; please provide valid JSON"
558 | }), {
559 | status: 400,
560 | headers: { 'Content-Type': 'application/json' }
561 | });
562 | }
563 |
564 | const { model = "qwen-max-latest-draw", prompt, n = 1, size = '1024*1024' } = body;
565 | const pureModelName = model.replace('-draw', '').replace('-thinking', '').replace('-search', '');
566 |
567 | try {
568 |       // Create the image-generation task (non-streaming, incremental_output: true)
569 | const createResponse = await fetch('https://chat.qwen.ai/api/chat/completions', {
570 | method: 'POST',
571 | headers: {
572 | 'Authorization': `Bearer ${authToken}`,
573 | 'Content-Type': 'application/json',
574 | 'User-Agent': 'Mozilla/5.0'
575 | },
576 | body: JSON.stringify({
577 | stream: false,
578 | incremental_output: true,
579 | chat_type: "t2i",
580 | model: pureModelName,
581 | messages: [
582 | {
583 | role: "user",
584 | content: prompt,
585 | chat_type: "t2i",
586 | extra: {},
587 | feature_config: { thinking_enabled: false }
588 | }
589 | ],
590 | id: crypto.randomUUID(),
591 | size: size
592 | })
593 | });
594 |
595 | if (!createResponse.ok) {
596 | const errorText = await createResponse.text();
597 | return new Response(JSON.stringify({
598 |           error: "Failed to create image-generation task",
599 | details: errorText
600 | }), {
601 | status: 500,
602 | headers: { 'Content-Type': 'application/json' }
603 | });
604 | }
605 |
606 | const createData = await createResponse.json();
607 | let taskId = null;
608 |       for (const msg of createData.messages || []) {
609 | if (msg.role === 'assistant' && msg.extra?.wanx?.task_id) {
610 | taskId = msg.extra.wanx.task_id;
611 | break;
612 | }
613 | }
614 | if (!taskId) {
615 | return new Response(JSON.stringify({
616 |           error: "Could not obtain the image-generation task ID"
617 | }), {
618 | status: 500,
619 | headers: { 'Content-Type': 'application/json' }
620 | });
621 | }
622 |
623 | let imageUrl = null;
624 | for (let i = 0; i < 30; i++) {
625 | try {
626 | const statusResponse = await fetch(`https://chat.qwen.ai/api/v1/tasks/status/${taskId}`, {
627 | headers: {
628 | 'Authorization': `Bearer ${authToken}`,
629 | 'User-Agent': 'Mozilla/5.0'
630 | }
631 | });
632 | if (statusResponse.ok) {
633 | const statusData = await statusResponse.json();
634 | if (statusData.content) {
635 | imageUrl = statusData.content;
636 | break;
637 | }
638 | }
639 | } catch (error) {
640 |           // Ignore polling errors
641 | }
642 | await new Promise(resolve => setTimeout(resolve, 6000));
643 | }
644 |
645 | if (!imageUrl) {
646 | return new Response(JSON.stringify({
647 |           error: "Image generation timed out"
648 | }), {
649 | status: 500,
650 | headers: { 'Content-Type': 'application/json' }
651 | });
652 | }
653 |
654 |       // Build an OpenAI-style response (list of image URLs)
655 | const images = Array(n).fill().map(() => ({ url: imageUrl }));
656 | return new Response(JSON.stringify({
657 |         created: Math.floor(Date.now() / 1000), // Unix seconds
658 | data: images
659 | }), {
660 | headers: { 'Content-Type': 'application/json' }
661 | });
662 | } catch (error) {
663 |       console.error('Image generation failed:', error);
664 |       return new Response(JSON.stringify({
665 |         error: "Image generation request failed"
666 | }), {
667 | status: 500,
668 | headers: { 'Content-Type': 'application/json' }
669 | });
670 | }
671 | },
672 |
673 |   // ---------------------- Image upload ----------------------
674 | async uploadImage(base64Data, authToken) {
675 | try {
676 |       // Extract the raw image bytes from the base64 data URL
677 | const base64Image = base64Data.split(';base64,').pop();
678 | const imageData = atob(base64Image);
679 | const arrayBuffer = new ArrayBuffer(imageData.length);
680 | const uint8Array = new Uint8Array(arrayBuffer);
681 | for (let i = 0; i < imageData.length; i++) {
682 | uint8Array[i] = imageData.charCodeAt(i);
683 | }
684 | const formData = new FormData();
685 | const blob = new Blob([uint8Array], { type: 'image/jpeg' });
686 | formData.append('file', blob, `image-${Date.now()}.jpg`);
687 |
688 | const response = await fetch('https://chat.qwen.ai/api/v1/files/', {
689 | method: 'POST',
690 | headers: {
691 | 'Authorization': `Bearer ${authToken}`,
692 | 'User-Agent': 'Mozilla/5.0'
693 | },
694 | body: formData
695 | });
696 |
697 | if (response.ok) {
698 | const data = await response.json();
699 | return data.id;
700 | }
701 | return null;
702 | } catch (error) {
703 |       console.error('Image upload failed:', error);
704 | return null;
705 | }
706 | }
707 | };
708 |
--------------------------------------------------------------------------------
/qwen2api-cf.md:
--------------------------------------------------------------------------------
1 | # Qwen2API
2 | ## Project Overview
3 |
4 | Qwen2API converts the Qwen (通义千问) web chat into an OpenAI-compatible API, so you can use Qwen models through standard OpenAI API calls. The proxy supports model listing, chat completions (streaming and non-streaming), image generation, and image upload, giving developers a convenient integration path.
5 |
6 | ## Features
7 |
8 | - **OpenAI API compatible**: exposes OpenAI-style endpoints, easing migration of existing OpenAI projects
9 | - **Model support**: works with the Qwen model family, including qwen-max, qwen-plus, and others
10 | - **Model variants**: model names are automatically expanded with the following suffixes:
11 |   - `-thinking`: enables thinking mode
12 |   - `-search`: enables search augmentation
13 |   - `-draw`: enables image generation (may be unstable)
14 |   - Suffixes can be combined, e.g. `qwen-max-latest-thinking-search`
15 | - **Streaming output**: supports streamed responses, reducing time to first token
16 | - **Multimodal interaction**: supports image upload and image generation (may be unstable)
17 | - **Image generation**: provides a dedicated image-generation endpoint (may be unstable)
18 |
19 | ## Requirements
20 |
21 | - A Cloudflare account
22 | - Cloudflare Workers
23 |
24 | ## Deployment
25 |
26 | 1. Log in to the Cloudflare Workers dashboard
27 | 2. Create a new Worker
28 | 3. Copy the code from [qwen2api-cf.js](qwen2api-cf.js) into the Worker editor
29 | 4. Save and deploy
30 |
31 | ## Configuration
32 |
33 | The following options can be configured via environment variables:
34 |
35 | | Variable | Description | Default |
36 | |-------|------|--------|
37 | | API_PREFIX | API path prefix, usable for custom routing | empty string |
38 |
39 | ## Usage
40 |
41 | ### Authentication
42 |
43 | Use your Qwen token as the API key by setting the request header `Authorization: Bearer {YOUR_QWEN_TOKEN}`.
44 |
45 | **How to obtain the token**:
46 | 1. Visit the [Qwen website](https://chat.qwen.ai/)
47 | 2. Log in to your account
48 | 3. Copy the value of the `token` field from the cookies
49 |
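With the token in hand, any OpenAI-style HTTP client can talk to the Worker. A minimal sketch in JavaScript (the Worker URL and token are placeholders, and `buildChatRequest` is an illustrative helper, not part of the Worker code):

```javascript
// Illustrative helper: assembles the fetch options for a chat request.
// Only the endpoint path and header format come from this proxy's docs;
// the helper name and structure exist just for this sketch.
function buildChatRequest(token, model, userText) {
  return {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: userText }],
      stream: false
    })
  };
}

// Usage (WORKER_URL is your own deployment):
// const res = await fetch(`${WORKER_URL}/v1/chat/completions`,
//   buildChatRequest('YOUR_QWEN_TOKEN', 'qwen-max-latest', 'Hello'));
```
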
50 | ### Supported API Endpoints
51 |
52 | #### 1. List models
53 |
54 | ```
55 | GET /v1/models
56 | ```
57 |
58 | **Example response**:
59 | ```json
60 | {
61 | "object": "list",
62 | "data": [
63 | {
64 | "id": "qwen-max-latest",
65 | "object": "model",
66 | "created": 1709128113453,
67 | "owned_by": "qwen"
68 | },
69 | {
70 | "id": "qwen-max-latest-thinking",
71 | "object": "model",
72 | "created": 1709128113453,
73 | "owned_by": "qwen"
74 | },
75 |     // more models...
76 | ]
77 | }
78 | ```
79 |
80 | #### 2. Chat completions
81 |
82 | ```
83 | POST /v1/chat/completions
84 | ```
85 |
86 | **Example request body**:
87 | ```json
88 | {
89 |   "model": "qwen-max-latest",
90 |   "messages": [
91 |     {
92 |       "role": "user",
93 |       "content": "Hello, please introduce yourself"
94 |     }
95 |   ],
96 |   "stream": false
97 | }
98 | ```
99 |
100 | **Special features**:
101 | - Append the `-thinking` suffix to enable thinking mode
102 | - Append the `-search` suffix to enable search augmentation
103 | - Pass images along with the text (multimodal)
104 |
105 | **Multimodal example**:
106 | ```json
107 | {
108 | "model": "qwen2.5-vl-72b-instruct",
109 | "messages": [
110 | {
111 | "role": "user",
112 | "content": [
113 | {
114 | "type": "text",
115 |           "text": "What does this picture show?"
116 | },
117 | {
118 | "type": "image_url",
119 | "image_url": {
120 | "url": "data:image/jpeg;base64,/9j/4AAQ..."
121 | }
122 | }
123 | ]
124 | }
125 | ]
126 | }
127 | ```
128 |
129 | #### 3. Image generation
130 |
131 | **Method 1**: via the chat endpoint
132 |
133 | ```
134 | POST /v1/chat/completions
135 | ```
136 |
137 | ```json
138 | {
139 |   "model": "qwen-max-latest-draw",
140 |   "messages": [
141 |     {
142 |       "role": "user",
143 |       "content": "Draw a cute cat"
144 |     }
145 |   ]
146 | }
147 | ```
148 |
149 | **Method 2**: via the dedicated image-generation endpoint
150 |
151 | ```
152 | POST /v1/images/generations
153 | ```
154 |
155 | ```json
156 | {
157 |   "model": "qwen-max-latest-draw",
158 |   "prompt": "Draw a cute cat",
159 |   "n": 1,
160 |   "size": "1024*1024"
161 | }
162 | ```
163 |
164 | ## Supported Models
165 |
166 | The following default models are built in and serve as a fallback when fetching the model list from the API fails:
167 |
168 | - qwen-max-latest
169 | - qwen-plus-latest
170 | - qwen2.5-vl-72b-instruct
171 | - qwen2.5-14b-instruct-1m
172 | - qvq-72b-preview
173 | - qwq-32b-preview
174 | - qwen2.5-coder-32b-instruct
175 | - qwen-turbo-latest
176 | - qwen2.5-72b-instruct
177 |
178 | Every model supports the suffix variants (-thinking, -search, -draw).
179 |
180 | ## Technical Implementation Details
181 |
182 | ### Architecture Overview
183 |
184 | The proxy acts as a middle layer that converts OpenAI-format requests into the Qwen API format and converts Qwen responses back into OpenAI format. The main processing flow:
185 |
186 | 1. Parse the request and extract the token
187 | 2. Dispatch to a handler based on the URL path
188 | 3. Convert the request format and call the Qwen API
189 | 4. Process the response and convert its format
190 | 5. Apply special handling for streaming responses and image generation
191 |
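Step 2 of the flow above can be sketched as a simple path dispatch (the handler names are illustrative and an empty `API_PREFIX` is assumed):

```javascript
// Illustrative dispatch on the request path; the real Worker prepends
// API_PREFIX and binds these names to its own handler methods.
function route(pathname) {
  if (pathname.endsWith('/v1/models')) return 'handleModels';
  if (pathname.endsWith('/v1/chat/completions')) return 'handleChat';
  if (pathname.endsWith('/v1/images/generations')) return 'handleImageGenerations';
  return 'notFound';
}
```
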
192 | ### Model Name Handling
193 |
194 | - Base model name: extracted from the request
195 | - Suffix handling: suffixes are parsed and mapped to the corresponding configuration
196 |   - `-thinking`: sets `feature_config.thinking_enabled = true`
197 |   - `-search`: sets `chat_type = "search"`
198 |   - `-draw`: switches to the image-generation flow
199 |
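The suffix handling can be sketched as follows (`parseModelName` is an illustrative standalone version, not the Worker's exact code):

```javascript
// Detect feature suffixes, then strip them to recover the base model
// name that is sent upstream to the Qwen API.
function parseModelName(model) {
  const features = {
    thinking: model.includes('-thinking'),
    search: model.includes('-search'),
    draw: model.includes('-draw')
  };
  const baseModel = model
    .replace('-thinking', '')
    .replace('-search', '')
    .replace('-draw', '');
  return { baseModel, ...features };
}
```
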
200 | ### Streaming Response Handling
201 |
202 | The proxy implements an efficient streaming pipeline:
203 | 1. A TransformStream processes the data flow
204 | 2. Data is converted to SSE (Server-Sent Events) format
205 | 3. Incremental dedup logic prevents repeated content
206 | 4. Finish markers and stream termination are handled explicitly
207 |
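The dedup step (item 3 above) relies on the upstream sending cumulative content snapshots, so only the newly appended part is forwarded. A standalone sketch of that logic:

```javascript
// If the previous snapshot is a prefix of the current one, only the
// appended tail is new; otherwise treat the whole delta as new content.
function extractNewContent(previousDelta, currentDelta) {
  if (previousDelta && currentDelta.startsWith(previousDelta)) {
    return currentDelta.substring(previousDelta.length);
  }
  return currentDelta;
}
```
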
208 | ### Image Generation Flow
209 |
210 | Image generation uses task creation plus status polling:
211 | 1. Create an image-generation task and obtain a taskId
212 | 2. Poll the task status periodically (up to 30 attempts, every 6 seconds)
213 | 3. Retrieve the generated image URL and return it
214 |
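The polling loop can be sketched like this (`checkStatus` is an illustrative callback standing in for the task-status fetch):

```javascript
// Poll until checkStatus() yields an image URL, up to `attempts` tries
// spaced `intervalMs` apart; individual errors are ignored, as in the
// Worker, and null signals a timeout.
async function pollForImage(checkStatus, attempts = 30, intervalMs = 6000) {
  for (let i = 0; i < attempts; i++) {
    try {
      const url = await checkStatus();
      if (url) return url;
    } catch (_) {
      // ignore a single failed poll
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  return null;
}
```
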
215 | ## Common Issues and Solutions
216 |
217 | ### Invalid or expired token
218 |
219 | **Symptom**: requests return a 401 error
220 | **Solution**: re-extract the `token` value from the Qwen cookie
221 |
222 | ### Model list fetch fails
223 |
224 | **Symptom**: only the default model list is shown
225 | **Solution**: check your network connection and token validity
226 |
227 | ### Image generation timeout
228 |
229 | **Symptom**: an "Image generation timed out" error is returned
230 | **Solution**:
231 | - Check your network connection
232 | - Simplify the image prompt
233 | - Try a smaller image size
234 |
235 | ### Streaming response interrupted
236 |
237 | **Symptom**: the response stops abruptly
238 | **Solution**:
239 | - Check network stability
240 | - Reduce the complexity of the request
241 |
242 |
--------------------------------------------------------------------------------