├── .gitignore ├── .idea ├── inspectionProfiles │ ├── Project_Default.xml │ └── profiles_settings.xml └── modules.xml ├── LICENSE ├── README.md ├── assets ├── image-20201225053752740.png ├── image-20201225140607971.png └── sponsor.jpg ├── requirements.txt ├── src ├── __init__.pyw ├── misc │ ├── icon.icns │ ├── icon.ico │ ├── icon_listning.icns │ ├── icon_listning.ico │ ├── png转ico和icns.bat │ ├── requirements.txt │ ├── sponsor.jpg │ └── style.css └── moduels │ ├── component │ ├── Ali_CallBack.py │ ├── NormalValue.py │ ├── QEditBox_StdoutBox.py │ ├── SponsorDialog.py │ └── Stream.py │ ├── function │ ├── createDB.py │ ├── getAlibabaRecognizer.py │ └── getAlibabaToken.py │ ├── gui │ ├── Combo_EngineList.py │ ├── Dialog_AddEngine.py │ ├── Group_EditableList.py │ ├── List_List.py │ ├── MainWindow.py │ ├── SystemTray.py │ ├── Tab_CapsWriter.py │ ├── Tab_Config.py │ └── Tab_Help.py │ └── thread │ └── Thread_AliEngine.py ├── 安装指南 ├── PyAudio-0.2.11-cp36-cp36m-win32.whl ├── PyAudio-0.2.11-cp36-cp36m-win_amd64.whl ├── PyAudio-0.2.11-cp37-cp37m-win32.whl ├── PyAudio-0.2.11-cp37-cp37m-win_amd64.whl ├── PyAudio-0.2.11-cp38-cp38-win32.whl ├── PyAudio-0.2.11-cp38-cp38-win_amd64.whl ├── PyAudio-0.2.11-cp39-cp39-win32.whl ├── PyAudio-0.2.11-cp39-cp39-win_amd64.whl ├── alibabacloud-nls-python-sdk │ ├── LICENSE │ ├── NOTICE │ ├── README.rst │ ├── ali_speech │ │ ├── __init__.py │ │ ├── _client.py │ │ ├── _constant.py │ │ ├── _create_token.py │ │ ├── _logging.py │ │ ├── _speech_recognizer.py │ │ ├── _speech_reqprotocol.py │ │ ├── _speech_synthesizer.py │ │ ├── _speech_transcriber.py │ │ ├── callbacks.py │ │ └── constant.py │ ├── alibabacloud_nls_java_sdk.egg-info │ │ ├── PKG-INFO │ │ ├── SOURCES.txt │ │ ├── dependency_links.txt │ │ ├── requires.txt │ │ └── top_level.txt │ ├── build │ │ └── lib │ │ │ └── ali_speech │ │ │ ├── __init__.py │ │ │ ├── _client.py │ │ │ ├── _constant.py │ │ │ ├── _create_token.py │ │ │ ├── _logging.py │ │ │ ├── _speech_recognizer.py │ │ │ ├── 
_speech_reqprotocol.py │ │ │ ├── _speech_synthesizer.py │ │ │ ├── _speech_transcriber.py │ │ │ ├── callbacks.py │ │ │ └── constant.py │ ├── create_token_demo.py │ ├── dist │ │ └── alibabacloud-nls-java-sdk-2.0.0.tar.gz │ ├── nls-sample-16k.wav │ ├── setup.py │ ├── speech_recognizer_demo.py │ ├── speech_synthesizer_demo.py │ └── speech_transcriber_demo.py ├── requirements.txt └── 安装指南.md └── 打包 └── Pyinstaller 编译和打包 Win64.bat /.gitignore: -------------------------------------------------------------------------------- 1 | *.spec 2 | __pycache__ 3 | *.log 4 | *.spec 5 | *.mp4 6 | *.mkv 7 | *.wav 8 | .idea 9 | .DS_Store 10 | *info 11 | *database.db 12 | *test.py 13 | *.7z 14 | */dist/* 15 | */build/* 16 | *.db 17 | *test/* 18 | *.afphoto 19 | icon*.png 20 | Scripts/* 21 | Lib/* 22 | pyvenv.cfg 23 | 视频封面.png 24 | 25 | -------------------------------------------------------------------------------- /.idea/inspectionProfiles/Project_Default.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 88 | -------------------------------------------------------------------------------- /.idea/inspectionProfiles/profiles_settings.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 6 | -------------------------------------------------------------------------------- /.idea/modules.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Haujet Zhao 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, 
merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [Gitee](https://gitee.com/haujet/CapsWriter) | [Github](https://github.com/HaujetZhao/CapsWriter)
2 | 
3 | # Caps Writer
4 | 
5 | ## 💡 Introduction
6 | 
7 | This is a voice input tool for desktop computers. As the name suggests, Caps Writer lets you type by pressing the Caps Lock key. Specifically: when you press and hold the Caps Lock key, the software starts speech recognition; when you release the key, the recognized text is typed in immediately.
8 | 
9 | It is very handy for quick input while chatting, or for quickly adding Chinese comments while writing code.
10 | 
11 | The software currently has built-in support for the Alibaba Cloud short sentence recognition (一句话识别) API. To use it, you first need to complete real-name verification on Alibaba Cloud, apply for the speech recognition API, and add a speech recognition engine on the settings page.
12 | 
13 | > Engines from other providers could be added as well, but the Alibaba Cloud engine is good enough for now, so there has not been enough motivation to add others.
14 | 
15 | For a demonstration and for how to apply for the Alibaba Cloud API, see this video: [CapsWriter 2.0 usage video](https://www.bilibili.com/video/BV12A411p73r/)
16 | 
17 | After adding an engine, select it on the main page and click the enable button, and speech recognition is ready to use!
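18 | 
19 | Once enabled, the software starts recording the moment you press the CapsLock key:
20 | 
21 | * If you simply tap CapsLock and release it, the recorded audio is discarded immediately;
22 | * If you hold CapsLock for more than 0.3 seconds, the software connects to the network and starts speech recognition; when you release CapsLock, the recognition result is typed in immediately.
23 | 
24 | So you can start speaking as soon as you press CapsLock, without waiting, because recording begins the moment the key goes down. As long as you hold the key for more than 0.3 seconds, the audio is sent for recognition. When you finish speaking, release the key and the result appears on screen right away.
25 | 
26 | image-20201225053752740
27 | 
28 | ## ⭐ Tips
29 | 
30 | On the settings page, check the `点击关闭按钮时隐藏到托盘` (hide to tray when the close button is clicked) option to keep the software running in the system tray:
31 | 
32 | image-20201225140607971
33 | 
34 | ### 📝 Background
35 | 
36 | 
37 | It both angers and frustrates me that, as late as 2020, no developer had built a voice input tool that is actually pleasant to use. After all, there is no money in it, so naturally nobody makes one.
38 | 
39 | 
40 | Some people suggest the voice input in Sogou Input Method or iFlytek Input Method, but the following points are truly unbearable:
41 | 
42 | 
43 | * Too many ads: I refuse to install such software
44 | * Slow: iFlytek's voice input is quite fast on phones, but its speech recognition on PC is extremely slow
45 | * Take Sogou Input Method as an example: its voice input shortcut can only be `Ctrl + Shift + A/B/C……`, which has these problems:
46 |     * The shortcut conflicts with those of many other programs and is hard to remember
47 |     * Pressing a three-key shortcut while typing is awkward on the fingers
48 | * iFlytek's voice input shortcut is F6 and can only be changed to another F function key, which is too far from the fingers, hard to reach, and also conflicts with many programs' shortcuts
49 | 
50 | 
51 | 
52 | 
53 | ## 🔮 Ready to use out of the box
54 | 
55 | Windows users who are new to this only need to download the packaged archive from [Gitee Releases](https://gitee.com/haujet/CapsWriter/releases) or [Github Releases](https://github.com/HaujetZhao/CapsWriter/releases), extract it, and run the exe file inside. Then create a new engine on the settings page and fill in the following, obtained from Alibaba Cloud:
56 | 
57 | * The **Accesskey Id** and **Accesskey Secret** of a **RAM访问控制** (RAM access control) user that has the **管理智能语音交互(NLS)** (manage Intelligent Speech Interaction) permission
58 | * The **appkey** of your Intelligent Speech Interaction speech recognition project
59 | 
60 | After that, the software is ready to use.
61 | 
62 | For detailed steps on applying for the API and filling in the credentials, see the video tutorial: [CapsWriter 2.0 usage video](https://www.bilibili.com/video/BV12A411p73r/).
63 | 
64 | Mac and Linux users can use the software too, but I have no Mac or Linux machine and cannot build packages for them. You will need to download the source code and install the dependencies, then either package it yourself or run it directly.
65 | 
66 | ### 🛠 Running from source
67 | 
68 | The Release download is simply an exe file exported with pyinstaller. To work from the source code, you need to install the following modules:
69 | 
70 | - keyboard (listens for keyboard input)
71 | - pyaudio (captures audio)
72 | - PySide2 (GUI framework)
73 | - aliyun-python-sdk-core (Alibaba Cloud SDK)
74 | - alibabacloud-nls-java-sdk (Alibaba Cloud Intelligent Speech engine SDK)
75 | 
76 | Notes:
77 | 
78 | - pyaudio is not easy to install on Windows. You can first download the pyaudio whl file matching your Python version from [this link](https://www.lfd.uci.edu/~gohlke/pythonlibs) and then install it with pip. On Mac and Linux, PortAudio must be installed before pyaudio can be installed
79 | - alibabacloud-nls-java-sdk refers to the Python implementation of Alibaba Cloud's official Java SDK. It is not installed via pip (it has not been officially uploaded to PyPI) but by following [the method in the official Alibaba Cloud documentation](https://www.alibabacloud.com/help/zh/doc-detail/120693.htm).
80 | - All other modules can be installed with pip
81 | 
82 | This folder contains a `安装指南` (installation guide) folder with a detailed installation guide, along with pre-downloaded copies of `alibabacloud-nls-python-sdk` and the `pyaudio` whl files.
83 | 
84 | ## ☕ Donations
85 | 
86 | As the saying goes, every little bit shows you care. This software is completely open source and made for the love of it. If you like, you can support me with a donation:
87 | 
88 | sponsor
89 | 
90 | 
91 | 
92 | ## 😀 Feedback and discussion
93 | 
94 | For feedback about the software, you can open an issue or join the QQ group: [1146626791](https://qm.qq.com/cgi-bin/qm/qr?k=DgiFh5cclAElnELH4mOxqWUBxReyEVpm&jump_from=webapi)
--------------------------------------------------------------------------------
/assets/image-20201225053752740.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/assets/image-20201225053752740.png
--------------------------------------------------------------------------------
/assets/image-20201225140607971.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/assets/image-20201225140607971.png
--------------------------------------------------------------------------------
/assets/sponsor.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/assets/sponsor.jpg
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | setuptools
2 | pyaudio
3 | keyboard
4 | aliyun-python-sdk-core
5 | PySide2
6 | pywin32
--------------------------------------------------------------------------------
/src/__init__.pyw:
--------------------------------------------------------------------------------
1 | # -*- coding: UTF-8 -*-
2 | 
3 | import os, sys, time
4 | 
5 | 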
os.chdir(os.path.dirname(os.path.abspath(__file__))) # 更改工作目录,指向正确的当前文件夹 6 | sys.path.append(os.path.dirname(os.path.abspath(__file__))) # 将当前目录导入 python 寻找 package 和 moduel 的变量 7 | # os.environ['PATH'] += os.pathsep + os.path.abspath('./bin') # 将可执行文件的目录加入环境变量 8 | 9 | from PySide2.QtWidgets import QApplication 10 | from PySide2.QtCore import QCoreApplication 11 | from PySide2.QtGui import Qt 12 | 13 | from moduels.function.createDB import createDB # 引入检查和创建创建数据库的函数 14 | 15 | from moduels.gui.MainWindow import MainWindow 16 | from moduels.gui.SystemTray import SystemTray 17 | from moduels.component.NormalValue import 常量 18 | 19 | 20 | ############# 主窗口和托盘 ################ 21 | 22 | def 高分屏变量设置(app): 23 | os.environ['QT_SCALE_FACTOR'] = '1' 24 | QCoreApplication.instance().setAttribute(Qt.AA_UseHighDpiPixmaps) 25 | 26 | 27 | def main(): 28 | QApplication.setAttribute(Qt.AA_EnableHighDpiScaling) 29 | app = QApplication(sys.argv) 30 | 高分屏变量设置(app) 31 | createDB() 32 | mainWindow = MainWindow() 33 | tray = SystemTray(mainWindow) 34 | 常量.托盘 = tray 35 | sys.exit(app.exec_()) 36 | 37 | if __name__ == '__main__': 38 | main() 39 | -------------------------------------------------------------------------------- /src/misc/icon.icns: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/src/misc/icon.icns -------------------------------------------------------------------------------- /src/misc/icon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/src/misc/icon.ico -------------------------------------------------------------------------------- /src/misc/icon_listning.icns: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/src/misc/icon_listning.icns
--------------------------------------------------------------------------------
/src/misc/icon_listning.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/src/misc/icon_listning.ico
--------------------------------------------------------------------------------
/src/misc/png转ico和icns.bat:
--------------------------------------------------------------------------------
1 | magick convert "%1" -resize 128x128 "%~dp1%~n1.ico"
2 | magick convert "%1" -resize 128x128 "%~dp1%~n1.icns"
3 | 
--------------------------------------------------------------------------------
/src/misc/requirements.txt:
--------------------------------------------------------------------------------
1 | setuptools
2 | pyaudio
3 | keyboard
4 | aliyun-python-sdk-core
5 | PySide2
--------------------------------------------------------------------------------
/src/misc/sponsor.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/src/misc/sponsor.jpg
--------------------------------------------------------------------------------
/src/misc/style.css:
--------------------------------------------------------------------------------
1 | /* Reference: https://doc.qt.io/qt-5/stylesheet-reference.html */
2 | 
3 | 
4 | /* Switch to the "split video" tab: it contains several boxes with a caption on top; those boxes are QGroupBox */
5 | QGroupBox{
6 |     border: 1px solid #ccc;
7 |     border-radius:6px;
8 |     margin-top: 2ex;
9 |     margin-bottom: 0.5ex;
10 |     padding: 0.3em 0.4em 0.4em 0.3em; /* top right bottom left */
11 | }
12 | 
13 | /* This is the title shown above a QGroupBox */
14 | QGroupBox:title {
15 |     color: #005980;
16 |     subcontrol-origin: margin;
17 |     margin-top: 0.5ex;
18 |     left: 2ex;
19 | }
20 | 
21 | /*
22 | QPushButton {
23 |     border: 2px 
solid #8f8f91; 24 | border-radius: 6px; 25 | padding: 1em 0.4em 1em 0.3em; /* 上 右 下 左* 26 | background-color: qlineargradient(x1: 0, y1: 0, x2: 0, y2: 1, 27 | stop: 0 #f6f7fa, stop: 1 #dadbde); 28 | min-width: 80px; 29 | } 30 | 31 | QPushButton:pressed { 32 | background-color: qlineargradient(x1: 0, y1: 0, x2: 0, y2: 1, 33 | stop: 0 #dadbde, stop: 1 #f6f7fa); 34 | } 35 | 36 | QPushButton:flat { 37 | border: none; /* no border for a flat push button * 38 | } 39 | 40 | QPushButton:default { 41 | border-color: navy; /* make the default button prominent * 42 | }*/ -------------------------------------------------------------------------------- /src/moduels/component/Ali_CallBack.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | import keyboard 4 | 5 | from ali_speech.callbacks import SpeechRecognizerCallback 6 | 7 | class Ali_Callback(SpeechRecognizerCallback): 8 | """ 9 | 构造函数的参数没有要求,可根据需要设置添加 10 | 示例中的name参数可作为待识别的音频文件名,用于在多线程中进行区分 11 | """ 12 | def __init__(self, name='default'): 13 | self._name = name 14 | def on_started(self, message): 15 | #print('MyCallback.OnRecognitionStarted: %s' % message) 16 | pass 17 | def on_result_changed(self, message): 18 | print('任务信息: task_id: %s, result: %s' % ( 19 | message['header']['task_id'], message['payload']['result'])) 20 | def on_completed(self, message): 21 | print('结果: %s' % ( 22 | message['payload']['result'])) 23 | result = message['payload']['result'] 24 | try: 25 | if result[-1] == '。': # 如果最后一个符号是句号,就去掉。 26 | result = result[0:-1] 27 | except Exception as e: 28 | pass 29 | keyboard.write(result) # 输入识别结果 30 | def on_task_failed(self, message): 31 | print('MyCallback.OnRecognitionTaskFailed: %s' % message) 32 | def on_channel_closed(self): 33 | # print('MyCallback.OnRecognitionChannelClosed') 34 | pass -------------------------------------------------------------------------------- /src/moduels/component/NormalValue.py: 
--------------------------------------------------------------------------------
1 | import sqlite3
2 | import platform
3 | import subprocess
4 | 
5 | 
6 | class NormalValue():
7 |     样式文件 = 'misc/style.css'
8 |     软件版本 = '2.1.0'
9 | 
10 |     主窗口 = None
11 |     托盘 = None
12 |     状态栏 = None
13 | 
14 |     Token配置路径 = 'misc/Token.ini'
15 | 
16 |     数据库路径 = 'misc/database.db'
17 |     数据库连接 = sqlite3.connect(数据库路径)
18 | 
19 |     偏好设置表单名 = '偏好设置'
20 |     语音引擎表单名 = '语音引擎'
21 | 
22 |     关闭时隐藏到托盘 = False
23 | 
24 | 
25 |     系统平台 = platform.system()
26 | 
27 |     图标路径 = 'misc/icon.icns' if 系统平台 == 'Darwin' else 'misc/icon.ico'
28 |     聆听图标路径 = 'misc/icon_listning.icns' if 系统平台 == 'Darwin' else 'misc/icon_listning.ico'
29 | 
30 |     subprocessStartUpInfo = subprocess.STARTUPINFO() if 系统平台 == 'Windows' else None  # STARTUPINFO exists only on Windows
31 |     if 系统平台 == 'Windows':
32 |         subprocessStartUpInfo.dwFlags = subprocess.STARTF_USESHOWWINDOW
33 |         subprocessStartUpInfo.wShowWindow = subprocess.SW_HIDE
34 | 
35 | class ThreadValue():
36 |     pass
37 | 
38 | 常量 = NormalValue()
39 | 线程值 = ThreadValue()
--------------------------------------------------------------------------------
/src/moduels/component/QEditBox_StdoutBox.py:
--------------------------------------------------------------------------------
1 | # -*- coding: UTF-8 -*-
2 | import sys
3 | 
4 | from PySide2.QtWidgets import QTextEdit
5 | from PySide2.QtGui import QTextCursor
6 | 
7 | from moduels.component.Stream import Stream
8 | 
9 | # Multi-line text box in the command output window
10 | class QEditBox_StdoutBox(QTextEdit):
11 |     # A QTextEdit subclass with a print method, used to display redirected stdout.
12 |     def __init__(self, parent=None):
13 |         super(QEditBox_StdoutBox, self).__init__(parent)
14 |         self.setReadOnly(True)
15 |         self.标准输出流 = Stream()
16 |         self.标准输出流.newText.connect(self.print)
17 |         sys.stdout = self.标准输出流
18 | 
19 |     def print(self, text):
20 |         try:
21 |             cursor = self.textCursor()
22 |             cursor.movePosition(QTextCursor.End)
23 |             cursor.insertText(text)
24 |             self.setTextCursor(cursor)
25 |             self.ensureCursorVisible()
26 |         except Exception:
27 |             print('文本框更新文本失败')
28 | 
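The redirection in QEditBox_StdoutBox.py comes down to replacing `sys.stdout` with an object that offers `write` and `flush`. A minimal Qt-free sketch of that idea follows; `CallbackStream` and `captured` are hypothetical names that stand in for `Stream` and the text box, and are not part of the repository:

```python
import contextlib


class CallbackStream:
    """Hypothetical stand-in for Stream: forwards written text to a callback."""

    def __init__(self, on_text):
        self.on_text = on_text  # plays the role of the newText signal

    def write(self, text):
        self.on_text(str(text))
        return len(text)

    def flush(self):
        # Nothing is buffered, so there is nothing to flush.
        pass


captured = []  # plays the role of the QTextEdit contents
stream = CallbackStream(captured.append)

# redirect_stdout swaps sys.stdout temporarily and restores it afterwards.
with contextlib.redirect_stdout(stream):
    print('hello')

joined = ''.join(captured)
```

Unlike the sketch, the repository assigns `sys.stdout` permanently and delivers the text through a Qt signal, so the widget update is handled by the connected slot.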
-------------------------------------------------------------------------------- /src/moduels/component/SponsorDialog.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import QDialog 4 | from PySide2.QtGui import Qt, QIcon, QPainter, QPixmap 5 | 6 | from moduels.component.NormalValue import 常量 7 | 8 | 9 | # 打赏对话框 10 | class SponsorDialog(QDialog): 11 | def __init__(self, parent=None): 12 | super(SponsorDialog, self).__init__(parent) 13 | self.resize(500, 567) 14 | 图标路径 = 'misc/icon.icns' if 常量.系统平台 == 'Darwin' else 'misc/icon.ico' 15 | self.setWindowIcon(QIcon(图标路径)) 16 | self.setWindowTitle(self.tr('打赏作者')) 17 | self.setWindowModality(Qt.NonModal) # 让窗口不要阻挡主窗口 18 | self.show() 19 | 20 | def paintEvent(self, event): 21 | painter = QPainter(self) 22 | pixmap = QPixmap('misc/sponsor.jpg') 23 | painter.drawPixmap(self.rect(), pixmap) -------------------------------------------------------------------------------- /src/moduels/component/Stream.py: -------------------------------------------------------------------------------- 1 | 2 | from PySide2.QtCore import Signal, QObject 3 | 4 | 5 | class Stream(QObject): 6 | # 用于将控制台的输出定向到一个槽 7 | newText = Signal(str) 8 | 9 | # def __init__(self): 10 | # super().__init__() 11 | # self.newText = Signal(str) 12 | 13 | def write(self, text): 14 | self.newText.emit(str(text)) 15 | # QApplication.processEvents() 16 | 17 | def flush(self): 18 | pass -------------------------------------------------------------------------------- /src/moduels/function/createDB.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from moduels.component.NormalValue import 常量 4 | 5 | def createDB(): 6 | 7 | 数据库连接 = 常量.数据库连接 8 | 偏好设置表单名 = 常量.偏好设置表单名 9 | 语音引擎表单名 = 常量.语音引擎表单名 10 | # 模板表单名 = 常量.数据库模板表单名 11 | # 皮肤表单名 = 常量.数据库皮肤表单名 12 | cursor = 数据库连接.cursor() 13 | 14 | result = cursor.execute(f'select 
* from sqlite_master where name = "{偏好设置表单名}";') 15 | if result.fetchone() == None: 16 | cursor.execute(f'''create table {偏好设置表单名} ( 17 | id integer primary key autoincrement, 18 | item text, 19 | value text 20 | )''') 21 | else: 22 | # print('偏好设置表单已存在') 23 | pass 24 | result = cursor.execute(f'select * from sqlite_master where name = "{语音引擎表单名}";') 25 | if result.fetchone() == None: 26 | cursor.execute(f'''create table {语音引擎表单名} ( 27 | id integer primary key autoincrement, 28 | 引擎名称 text, 29 | 服务商 text, 30 | AppKey text, 31 | 语言 text, 32 | AccessKeyId text, 33 | AccessKeySecret text 34 | )''') 35 | else: 36 | # print('语音引擎表单名已存在') 37 | pass 38 | # 39 | # result = cursor.execute(f'select * from sqlite_master where name = "{皮肤表单名}";') 40 | # if result.fetchone() == None: 41 | # cursor.execute(f'''create table {皮肤表单名} ( 42 | # id integer primary key autoincrement, 43 | # skinName text, 44 | # outputFileName text, 45 | # sourceFilePath text, 46 | # supportDarkMode BOOLEAN)''') 47 | # else: 48 | # print('皮肤表单已存在') 49 | # 50 | 数据库连接.commit() # 最后要提交更改 51 | -------------------------------------------------------------------------------- /src/moduels/function/getAlibabaRecognizer.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | import configparser, sqlite3, json 3 | 4 | from ali_speech.constant import ASRFormat 5 | from ali_speech.constant import ASRSampleRate 6 | 7 | from moduels.component.NormalValue import 常量 8 | from moduels.component.Ali_CallBack import Ali_Callback 9 | from moduels.function.getAlibabaToken import getAlibabaToken 10 | 11 | def getAlibabaRecognizer(client, appkey, accessKeyId, accessKeySecret, tokenId, tokenExpireTime, 线程): 12 | tokenId, tokenExpireTime = getAlibabaToken(accessKeyId, accessKeySecret, tokenId, tokenExpireTime) 13 | if tokenId == False: return False 14 | 线程.tokenId = tokenId 15 | 线程.tokenExpireTime = tokenExpireTime 16 | audio_name = 'none' 17 | callback = 
Ali_Callback(audio_name) 18 | recognizer = client.create_recognizer(callback) 19 | recognizer.set_appkey(appkey) 20 | recognizer.set_token(tokenId) 21 | recognizer.set_format(ASRFormat.PCM) 22 | recognizer.set_sample_rate(ASRSampleRate.SAMPLE_RATE_16K) 23 | recognizer.set_enable_intermediate_result(False) 24 | recognizer.set_enable_punctuation_prediction(True) 25 | recognizer.set_enable_inverse_text_normalization(True) 26 | return (recognizer) -------------------------------------------------------------------------------- /src/moduels/function/getAlibabaToken.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | import configparser, sqlite3, json, time, sys 3 | 4 | 5 | from aliyunsdkcore.client import AcsClient 6 | from aliyunsdkcore.request import CommonRequest 7 | 8 | from moduels.component.NormalValue import 常量 9 | 10 | def getAlibabaToken(accessID, accessKey, tokenId, tokenExpireTime): 11 | # 要是 token 还有 50 秒过期,那就重新获得一个。 12 | if (int(tokenExpireTime) - time.time()) < 50 : 13 | # 创建AcsClient实例 14 | client = AcsClient( 15 | accessID, # 填写 AccessID 16 | accessKey, # 填写 AccessKey = 得到AccessKey(引擎名称) 17 | "cn-shanghai" 18 | ); 19 | # 创建request,并设置参数 20 | request = CommonRequest() 21 | request.set_method('POST') 22 | request.set_domain('nls-meta.cn-shanghai.aliyuncs.com') 23 | request.set_version('2019-02-28') 24 | request.set_action_name('CreateToken') 25 | try: 26 | response = json.loads(client.do_action_with_exception(request)) 27 | except Exception as e: 28 | print(f'''获取 Token 出错了,出错信息如下:\n{e}\n''') 29 | return False, False 30 | tokenId = response['Token']['Id'] 31 | tokenExpireTime = str(response['Token']['ExpireTime']) 32 | return tokenId, tokenExpireTime 33 | 34 | # def 得到AccessKey(引擎名称): 35 | # 数据库连接 = sqlite3.connect(常量.数据库路径) 36 | # AccessKeyId, AccessKeySecret = 数据库连接.execute(f'''select AccessKeyId, 37 | # AccessKeySecret 38 | # from {常量.语音引擎表单名} 39 | # where 引擎名称 = :引擎名称''', 40 | # {'引擎名称': 
引擎名称}).fetchone() 41 | # 数据库连接.close() 42 | # return AccessKeyId, AccessKeySecret -------------------------------------------------------------------------------- /src/moduels/gui/Combo_EngineList.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | import os, sqlite3 4 | from PySide2.QtWidgets import QComboBox 5 | from moduels.component.NormalValue import 常量 6 | 7 | # 添加预设对话框 8 | class Combo_EngineList(QComboBox): 9 | def __init__(self): 10 | super().__init__() 11 | self.initElements() # 先初始化各个控件 12 | self.initSlots() # 再将各个控件连接到信号槽 13 | self.initLayouts() # 然后布局 14 | self.initValues() # 再定义各个控件的值 15 | 16 | def initElements(self): 17 | pass 18 | 19 | def initSlots(self): 20 | pass 21 | 22 | def initLayouts(self): 23 | pass 24 | 25 | def initValues(self): 26 | self.初始化列表() 27 | 28 | def mousePressEvent(self, e): 29 | self.列表更新() 30 | self.showPopup() 31 | 32 | def 初始化列表(self): 33 | self.列表项 = [] 34 | 数据库连接 = 常量.数据库连接 35 | cursor = 数据库连接.cursor() 36 | result = cursor.execute(f'''select 引擎名称 from {常量.语音引擎表单名} order by id;''').fetchall() 37 | if len(result) != 0: 38 | for item in result: 39 | self.列表项.append(item[0]) 40 | self.addItems(self.列表项) 41 | # if not os.path.exists(常量.音效文件路径): os.makedirs(常量.音效文件路径) 42 | # with os.scandir(常量.音效文件路径) as 目录条目: 43 | # for entry in 目录条目: 44 | # if not entry.name.startswith('.') and entry.is_dir(): 45 | # self.列表项.append(entry.name) 46 | 47 | 48 | def 列表更新(self): 49 | 新列表 = [] 50 | 数据库连接 = 常量.数据库连接 51 | cursor = 数据库连接.cursor() 52 | result = cursor.execute(f'''select 引擎名称 from {常量.语音引擎表单名} order by id;''').fetchall() 53 | if len(result) != 0: 54 | for item in result: 55 | 新列表.append(item[0]) 56 | if self.列表项 == 新列表: return True 57 | self.clear() 58 | self.列表项 = 新列表 59 | self.addItems(self.列表项) 60 | 61 | -------------------------------------------------------------------------------- /src/moduels/gui/Dialog_AddEngine.py: 
-------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import * 4 | from PySide2.QtGui import * 5 | from PySide2.QtCore import * 6 | from moduels.component.NormalValue import 常量 7 | 8 | 9 | class Dialog_AddEngine(QDialog): 10 | def __init__(self, 列表, 数据库连接, 表单名字, 显示的列名): 11 | super().__init__(常量.主窗口) 12 | self.列表 = 列表 13 | self.数据库连接 = 数据库连接 14 | self.表单名字 = 表单名字 15 | self.显示的列名 = 显示的列名 16 | self.initElements() # 先初始化各个控件 17 | self.initSlots() # 再将各个控件连接到信号槽 18 | self.initLayouts() # 然后布局 19 | self.initValues() # 再定义各个控件的值 20 | 21 | def initElements(self): 22 | self.引擎名称编辑框 = QLineEdit() 23 | self.服务商选择框 = QComboBox() 24 | self.appKey输入框 = QLineEdit() 25 | self.语言Combobox = QComboBox() 26 | self.accessKeyId输入框 = QLineEdit() 27 | self.AccessKeySecret输入框 = QLineEdit() 28 | 29 | self.确定按钮 = QPushButton(self.tr('确定')) 30 | self.取消按钮 = QPushButton(self.tr('取消')) 31 | 32 | self.纵向布局 = QVBoxLayout() 33 | self.表格布局 = QFormLayout() 34 | self.按钮横向布局 = QHBoxLayout() 35 | 36 | self.Api输入校验器 = QRegExpValidator() 37 | 38 | 39 | def initSlots(self): 40 | self.服务商选择框.currentTextChanged.connect(self.服务商变化) 41 | 42 | self.确定按钮.clicked.connect(self.确认) 43 | self.取消按钮.clicked.connect(self.取消) 44 | 45 | def initLayouts(self): 46 | self.表格布局.addRow('引擎名称:', self.引擎名称编辑框) 47 | self.表格布局.addRow('服务商:', self.服务商选择框) 48 | self.表格布局.addRow('AppKey:', self.appKey输入框) 49 | self.表格布局.addRow('语言:', self.语言Combobox) 50 | self.表格布局.addRow('AccessKeyId:', self.accessKeyId输入框) 51 | self.表格布局.addRow('AccessKeySecret:', self.AccessKeySecret输入框) 52 | 53 | self.按钮横向布局.addWidget(self.确定按钮) 54 | self.按钮横向布局.addWidget(self.取消按钮) 55 | 56 | self.纵向布局.addLayout(self.表格布局) 57 | self.纵向布局.addLayout(self.按钮横向布局) 58 | 59 | self.setLayout(self.纵向布局) 60 | 61 | def initValues(self): 62 | self.引擎名称编辑框.setPlaceholderText(self.tr('例如:阿里-中文')) 63 | 64 | self.服务商选择框.addItems(['Alibaba']) 65 | self.服务商选择框.setCurrentText('Alibaba') 66 | 67 
| self.accessKeyId输入框.setEchoMode(QLineEdit.Password) 68 | self.AccessKeySecret输入框.setEchoMode(QLineEdit.Password) 69 | 70 | self.setWindowIcon(QIcon(常量.图标路径)) 71 | self.setWindowTitle(self.tr('添加或更新 Api')) 72 | self.setWindowModality(Qt.NonModal) 73 | 74 | self.Api输入校验器.setRegExp(QRegExp(r'\w+')) 75 | self.appKey输入框.setValidator(self.Api输入校验器) 76 | self.accessKeyId输入框.setValidator(self.Api输入校验器) 77 | self.AccessKeySecret输入框.setValidator(self.Api输入校验器) 78 | 79 | if self.列表.currentItem(): 80 | 已选中的列表项 = self.列表.currentItem().text() 81 | 填充数据 = self.从数据库得到选中项的数据(已选中的列表项) 82 | self.引擎名称编辑框.setText(填充数据[0]) 83 | self.服务商选择框.setCurrentText(填充数据[1]) 84 | self.appKey输入框.setText(填充数据[2]) 85 | self.语言Combobox.setCurrentText(填充数据[3]) 86 | self.accessKeyId输入框.setText(填充数据[4]) 87 | self.AccessKeySecret输入框.setText(填充数据[5]) 88 | 89 | self.show() 90 | 91 | 92 | def 服务商变化(self): 93 | if self.服务商选择框.currentText() == 'Alibaba': 94 | self.语言Combobox.clear() 95 | self.语言Combobox.addItem(self.tr('由 Api 的云端配置决定')) 96 | self.语言Combobox.setCurrentText(self.tr('由 Api 的云端配置决定')) 97 | self.语言Combobox.setEnabled(False) 98 | self.appKey输入框.setEnabled(True) 99 | # self.accessKeyId标签.setText('AccessKeyId:') 100 | # self.AccessKeySecret标签.setText('AccessKeySecret:') 101 | elif self.服务商选择框.currentText() == 'Tencent': 102 | self.语言Combobox.clear() 103 | self.语言Combobox.addItems(['中文普通话', '英语', '粤语']) 104 | self.语言Combobox.setCurrentText('中文普通话') 105 | self.语言Combobox.setEnabled(True) 106 | self.appKey输入框.setEnabled(False) 107 | # self.accessKeyId标签.setText('AccessSecretId:') 108 | # self.AccessKeySecret标签.setText('AccessSecretKey:') 109 | 110 | 111 | def 确认(self): 112 | self.引擎名称 = self.引擎名称编辑框.text() # str 113 | self.服务商 = self.服务商选择框.currentText() # str 114 | self.AppKey = self.appKey输入框.text() # str 115 | self.语言 = self.语言Combobox.currentText() # str 116 | self.AccessKeyId = self.accessKeyId输入框.text() # str 117 | self.AccessKeySecret = self.AccessKeySecret输入框.text() # str 118 | self.有重名项 = 
self.检查数据库是否有重名项() 119 | if self.引擎名称 == '': 120 | return False 121 | if self.有重名项: 122 | 是否覆盖 = QMessageBox.warning(self, '覆盖确认', '已存在相同名字的引擎,是否覆盖?', QMessageBox.Yes | QMessageBox.Cancel, QMessageBox.Cancel) 123 | if 是否覆盖 != QMessageBox.Yes: 124 | return False 125 | self.更新数据库() 126 | else: 127 | self.插入数据库() 128 | self.close() 129 | 130 | def 取消(self): 131 | self.close() 132 | 133 | def 从数据库得到选中项的数据(self, 已选中的列表项): 134 | 数据库连接 = self.数据库连接 135 | cursor = 数据库连接.cursor() 136 | result = cursor.execute(f'''select 引擎名称, 137 | 服务商, 138 | AppKey, 139 | 语言, 140 | AccessKeyId, 141 | AccessKeySecret 142 | from {self.表单名字} where {self.显示的列名} = :引擎名称;''', 143 | {'引擎名称': 已选中的列表项}) 144 | return result.fetchone() 145 | # 146 | def 检查数据库是否有重名项(self): 147 | 数据库连接 = self.数据库连接 148 | cursor = 数据库连接.cursor() 149 | result = cursor.execute(f'''select * from {self.表单名字} where {self.显示的列名} = :引擎名称;''', {'引擎名称': self.引擎名称}) 150 | if result.fetchone() == None: return False # 没有重名项,返回 False 151 | return True 152 | # 153 | def 更新数据库(self): 154 | 数据库连接 = self.数据库连接 155 | cursor = 数据库连接.cursor() 156 | cursor.execute(f'''update {self.表单名字} set 服务商 = :服务商, 157 | AppKey = :AppKey, 158 | 语言 = :语言, 159 | AccessKeyId = :AccessKeyId, 160 | AccessKeySecret = :AccessKeySecret 161 | where {self.显示的列名} = :引擎名称 ''', 162 | {'服务商': self.服务商, 163 | 'AppKey': self.AppKey, 164 | '语言': self.语言, 165 | 'AccessKeyId': self.AccessKeyId, 166 | 'AccessKeySecret': self.AccessKeySecret, 167 | '引擎名称': self.引擎名称}) 168 | 数据库连接.commit() 169 | # 170 | def 插入数据库(self): 171 | 数据库连接 = self.数据库连接 172 | cursor = 数据库连接.cursor() 173 | cursor.execute(f'''insert into {self.表单名字} (引擎名称, 174 | 服务商, 175 | AppKey, 176 | 语言, 177 | AccessKeyId, 178 | AccessKeySecret) 179 | values (:引擎名称, 180 | :服务商, 181 | :AppKey, 182 | :语言, 183 | :AccessKeyId, 184 | :AccessKeySecret)''', 185 | {'引擎名称': self.引擎名称, 186 | '服务商': self.服务商, 187 | 'AppKey': self.AppKey, 188 | '语言': self.语言, 189 | 'AccessKeyId': self.AccessKeyId, 190 | 'AccessKeySecret': 
self.AccessKeySecret}) 191 | 数据库连接.commit() 192 | 193 | # 根据刚开始预设名字是否为空,设置确定键可否使用 194 | def closeEvent(self, a0: QCloseEvent) -> None: 195 | try: 196 | self.列表.刷新列表() 197 | except: 198 | print('引擎列表刷新失败') 199 | -------------------------------------------------------------------------------- /src/moduels/gui/Group_EditableList.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import QGroupBox, QLineEdit, QPushButton, QGridLayout, QMessageBox 4 | from moduels.gui.List_List import List_List 5 | 6 | # 添加预设对话框 7 | class Group_EditableList(QGroupBox): 8 | def __init__(self, 组名, 对话框类, 数据库连接, 表单名字, 显示的列名): 9 | super().__init__(组名) 10 | self.对话框类 = 对话框类 11 | self.数据库连接 = 数据库连接 12 | self.表单名字 = 表单名字 13 | self.显示的列名 = 显示的列名 14 | self.initElements() # 先初始化各个控件 15 | self.initSlots() # 再将各个控件连接到信号槽 16 | self.initLayouts() # 然后布局 17 | self.initValues() # 再定义各个控件的值 18 | 19 | def initElements(self): 20 | self.筛选文字输入框 = QLineEdit() 21 | self.列表 = List_List(self.数据库连接, self.表单名字, self.显示的列名) 22 | self.添加按钮 = QPushButton('+') 23 | self.删除按钮 = QPushButton('-') 24 | self.上移按钮 = QPushButton('↑') 25 | self.下移按钮 = QPushButton('↓') 26 | self.部件布局 = QGridLayout() 27 | 28 | def initSlots(self): 29 | self.筛选文字输入框.textChanged.connect(self.筛选) 30 | self.添加按钮.clicked.connect(self.添加或修改) 31 | self.删除按钮.clicked.connect(self.删除) 32 | self.上移按钮.clicked.connect(self.上移) 33 | self.下移按钮.clicked.connect(self.下移) 34 | 35 | def initLayouts(self): 36 | self.部件布局.addWidget(self.筛选文字输入框, 0, 0, 1, 2) 37 | self.部件布局.addWidget(self.列表, 1, 0, 1, 2) 38 | self.部件布局.addWidget(self.添加按钮, 2, 0, 1, 1) 39 | self.部件布局.addWidget(self.删除按钮, 2, 1, 1, 1) 40 | self.部件布局.addWidget(self.上移按钮, 3, 0, 1, 1) 41 | self.部件布局.addWidget(self.下移按钮, 3, 1, 1, 1) 42 | self.setLayout(self.部件布局) 43 | 44 | def initValues(self): 45 | self.筛选文字输入框.setPlaceholderText('筛选') 46 | self.列表.刷新列表() 47 | 48 | 49 | def 添加或修改(self): 50 | ''' 51 | 打开对话框,添加或修改条目 
52 | ''' 53 | 对话框 = self.对话框类(self.列表, self.数据库连接, self.表单名字, self.显示的列名) 54 | 55 | def 删除(self): 56 | if not self.列表.currentItem(): return False 57 | 当前排 = self.列表.currentRow() 58 | 已选中的列表项 = self.列表.currentItem().text() 59 | answer = QMessageBox.question(self, self.tr('删除预设'), self.tr(f'将要删除“{已选中的列表项}”项,是否确认?')) 60 | if answer != QMessageBox.Yes: return False 61 | id = self.数据库连接.cursor().execute( 62 | f'''select id from {self.表单名字} where {self.显示的列名} = :已选中的列表项''', {'已选中的列表项': 已选中的列表项}).fetchone()[0] 63 | self.数据库连接.cursor().execute(f'''delete from {self.表单名字} where id = :id''', {'id': id}) 64 | self.数据库连接.cursor().execute(f'''update {self.表单名字} set id=id-1 where id > :id''', {'id': id}) 65 | self.数据库连接.commit() 66 | self.列表.刷新列表() 67 | if self.列表.count() >= 当前排: 68 | self.列表.setCurrentRow(当前排) 69 | 70 | def 上移(self): 71 | 当前排 = self.列表.currentRow() 72 | if 当前排 > 0: 73 | 已选中的列表项 = self.列表.currentItem().text() 74 | id = self.数据库连接.cursor().execute( 75 | f'''select id from {self.表单名字} where {self.显示的列名} = :已选中的列表项 ''', {'已选中的列表项': 已选中的列表项}).fetchone()[0] 76 | self.数据库连接.cursor().execute(f'''update {self.表单名字} set id = 100000 where id = :id - 1 ''', {'id': id}) 77 | self.数据库连接.cursor().execute(f'''update {self.表单名字} set id = id - 1 where {self.显示的列名} = :已选中的列表项''', {'已选中的列表项': 已选中的列表项}) 78 | self.数据库连接.cursor().execute(f'''update {self.表单名字} set id = :id where id=100000 ''', {'id': id}) 79 | self.数据库连接.commit() 80 | self.列表.刷新列表() 81 | if self.列表.筛选文字 == '': 82 | self.列表.setCurrentRow(当前排 - 1) 83 | return 84 | 85 | # 向下移动预设 86 | def 下移(self): 87 | 当前排 = self.列表.currentRow() 88 | 总行数 = self.列表.count() 89 | if 当前排 > -1 and 当前排 < 总行数 - 1: 90 | 已选中的列表项 = self.列表.currentItem().text() 91 | id = self.数据库连接.cursor().execute( 92 | f'''select id from {self.表单名字} where {self.显示的列名} = :已选中的列表项''', {'已选中的列表项': 已选中的列表项}).fetchone()[0] 93 | self.数据库连接.cursor().execute(f'''update {self.表单名字} set id = 100000 where id = :id + 1 ''', {'id': id}) 94 | 
self.数据库连接.cursor().execute(f'''update {self.表单名字} set id = id + 1 where {self.显示的列名} = :已选中的列表项''', {'已选中的列表项': 已选中的列表项}) 95 | self.数据库连接.cursor().execute(f'''update {self.表单名字} set id = :id where id=100000 ''', {'id': id}) 96 | self.数据库连接.commit() 97 | self.列表.刷新列表() 98 | if self.列表.筛选文字 == '': 99 | if 当前排 < 总行数: 100 | self.列表.setCurrentRow(当前排 + 1) 101 | else: 102 | self.列表.setCurrentRow(当前排) 103 | return 104 | 105 | def 筛选(self): 106 | self.列表.筛选文字 = self.筛选文字输入框.text() 107 | self.列表.刷新列表() 108 | 109 | -------------------------------------------------------------------------------- /src/moduels/gui/List_List.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import QListWidget 4 | from PySide2.QtCore import Signal 5 | from moduels.component.NormalValue import 常量 6 | 7 | # 添加预设对话框 8 | class List_List(QListWidget): 9 | 10 | 选中文字 = Signal(str) 11 | 12 | def __init__(self, 数据库连接, 表单名字, 显示的列名): 13 | super().__init__() 14 | self.筛选文字 = '' 15 | self.数据库连接 = 数据库连接 16 | self.表单名字 = 表单名字 17 | self.显示的列名 = 显示的列名 18 | self.initElements() # 先初始化各个控件 19 | self.initSlots() # 再将各个控件连接到信号槽 20 | self.initLayouts() # 然后布局 21 | self.initValues() # 再定义各个控件的值 22 | 23 | def initElements(self): 24 | pass 25 | 26 | def initSlots(self): 27 | pass 28 | 29 | def initLayouts(self): 30 | pass 31 | 32 | def initValues(self): 33 | self.刷新列表() 34 | 35 | def currentChanged(self, current, previous): 36 | if current.row() > -1: 37 | self.选中文字.emit(current.data()) 38 | 39 | def 刷新列表(self): 40 | cursor = self.数据库连接.cursor() 41 | if self.筛选文字 == '': 42 | 显示项 = cursor.execute( 43 | f'''select id, {self.显示的列名} from {self.表单名字} order by id''') 44 | self.clear() 45 | for i in 显示项: 46 | self.addItem(i[1]) 47 | else: 48 | 显示项 = cursor.execute( 49 | f'''select id, {self.显示的列名}, * from {self.表单名字} order by id''') 50 | self.clear() 51 | for i in 显示项: 52 | for j in i[2:]: 53 | if self.筛选文字 in str(j): 54 | 
self.addItem(i[1]) 55 | break 56 | 57 | -------------------------------------------------------------------------------- /src/moduels/gui/MainWindow.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import QApplication, QMainWindow, QMessageBox, QTabWidget 4 | from PySide2.QtGui import QIcon, QTextCursor 5 | from PySide2.QtCore import Qt 6 | from moduels.component.NormalValue import 常量 7 | 8 | from moduels.gui.Tab_CapsWriter import Tab_CapsWriter 9 | from moduels.gui.Tab_Config import Tab_Config 10 | from moduels.gui.Tab_Help import Tab_Help 11 | 12 | import sys, os 13 | 14 | class MainWindow(QMainWindow): 15 | def __init__(self): 16 | super().__init__() 17 | 常量.主窗口 = self 18 | self.loadStyleSheet() 19 | self.initElements() # initialize the widgets first 20 | self.initSlots() # then connect the widgets to their signals and slots 21 | self.initLayouts() # then lay them out 22 | self.initValues() # finally set the widgets' initial values 23 | 24 | 25 | # self.setWindowState(Qt.WindowMaximized) 26 | 27 | def initElements(self): 28 | self.状态栏 = self.statusBar() 29 | # use a multi-tab widget as the central widget 30 | self.tabs = QTabWidget() 31 | 32 | # create one tab per feature 33 | self.设置标签页 = Tab_Config() # load the settings tab first so that its settings are in place for the other tabs 34 | self.CapsWriter标签页 = Tab_CapsWriter() # tab with the main functionality 35 | self.帮助标签页 = Tab_Help() 36 | 37 | # 38 | 39 | 40 | 41 | def initLayouts(self): 42 | 43 | self.tabs.addTab(self.CapsWriter标签页, 'CapsWriter') 44 | self.tabs.addTab(self.设置标签页, '设置') 45 | self.tabs.addTab(self.帮助标签页, '帮助') 46 | self.setCentralWidget(self.tabs) 47 | 48 | def initSlots(self): 49 | self.CapsWriter标签页.状态栏消息.connect(lambda 消息, 时间: self.状态栏.showMessage(消息, 时间)) 50 | # self.打印输出标签页.状态栏消息.connect(lambda 消息, 时间: self.状态栏.showMessage(消息, 时间)) 51 | # self.设置标签页.状态栏消息.connect(lambda 消息, 时间: self.状态栏.showMessage(消息, 时间)) 52 | self.帮助标签页.状态栏消息.connect(lambda 消息, 时间: self.状态栏.showMessage(消息, 时间)) 53 | 54 | 55 | pass 56 | 57 | def initValues(self): 58 | # self.adjustSize() 59 | # self.setGeometry(QStyle(Qt.LeftToRight, Qt.AlignCenter, self.size(), QApplication.desktop().availableGeometry())) 60 | 
常量.状态栏 = self.状态栏 61 | 62 | self.setWindowIcon(QIcon(常量.图标路径)) 63 | self.setWindowTitle('CapsWriter 语音输入工具') 64 | self.setWindowFlag(Qt.WindowStaysOnTopHint) # always stay on top 65 | 66 | self.show() 67 | 68 | def 移动到屏幕中央(self): 69 | rectangle = self.frameGeometry() 70 | center = QApplication.desktop().availableGeometry().center() 71 | rectangle.moveCenter(center) 72 | self.move(rectangle.topLeft()) 73 | 74 | def 更新控制台输出(self, text): 75 | self.打印输出标签页.print(text) 76 | 77 | def loadStyleSheet(self): 78 | try: 79 | try: 80 | with open(常量.样式文件, 'r', encoding='utf-8') as style: 81 | self.setStyleSheet(style.read()) 82 | except UnicodeDecodeError: 83 | with open(常量.样式文件, 'r', encoding='gbk') as style: 84 | self.setStyleSheet(style.read()) 85 | except Exception: 86 | QMessageBox.warning(self, self.tr('主题载入错误'), self.tr('未能成功载入主题,请确保软件 misc 目录有 "style.css" 文件存在。')) 87 | 88 | def keyPressEvent(self, event) -> None: 89 | # reload the style.css theme when F5 is pressed 90 | if (event.key() == Qt.Key_F5): 91 | self.loadStyleSheet() 92 | self.状态栏.showMessage('已成功更新主题', 800) 93 | 94 | def onUpdateText(self, text): 95 | """Write console output to text widget.""" 96 | 97 | cursor = self.consoleTab.consoleEditBox.textCursor() 98 | cursor.movePosition(QTextCursor.End) 99 | cursor.insertText(text) 100 | self.consoleTab.consoleEditBox.setTextCursor(cursor) 101 | self.consoleTab.consoleEditBox.ensureCursorVisible() 102 | 103 | def 状态栏提示(self, 提示文字:str, 时间:int): 104 | self.状态栏.showMessage(提示文字, 时间) 105 | 106 | 107 | def closeEvent(self, event): 108 | """Shuts down application on close.""" 109 | # Return stdout to defaults. 
110 | if 常量.关闭时隐藏到托盘: 111 | event.ignore() 112 | self.hide() 113 | else: 114 | sys.stdout = sys.__stdout__ 115 | super().closeEvent(event) 116 | -------------------------------------------------------------------------------- /src/moduels/gui/SystemTray.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import QSystemTrayIcon, QMenu, QApplication, QAction 4 | # from PySide2.QtCore import * 5 | from PySide2.QtGui import QIcon 6 | from PySide2.QtCore import Qt 7 | 8 | import sys 9 | 10 | from moduels.component.NormalValue import 常量 11 | 12 | class SystemTray(QSystemTrayIcon): 13 | def __init__(self, 主窗口): 14 | super(SystemTray, self).__init__() 15 | self.主窗口 = 主窗口 16 | self.initElements() # initialize the widgets first 17 | self.initSlots() # then connect the widgets to their signals and slots 18 | self.initLayouts() # then lay them out 19 | self.initValues() # finally set the widgets' initial values 20 | 21 | # self.RestoreAction = QAction(u'还原 ', self, triggered=self.showWindow) # top-level menu action: restore the main window 22 | # self.StyleAction = QAction(self.tr('更新主题'), self, triggered=mainWindow.loadStyleSheet) # top-level menu action: reload the QSS theme 23 | # self.tray_menu.addAction(self.RestoreAction) # add the actions to the menu 24 | # self.tray_menu.addAction(self.StyleAction) 25 | 26 | def initElements(self): 27 | self.托盘菜单 = QMenu(QApplication.desktop()) # create the tray menu 28 | self.QuitAction = QAction(self.tr('退出'), self, triggered=self.退出) # top-level menu action: quit the program 29 | 30 | def initSlots(self): 31 | self.activated.connect(self.trayEvent) # handle clicks on the tray icon 32 | 33 | def initLayouts(self): 34 | self.托盘菜单.addAction(self.QuitAction) 35 | 36 | def initValues(self): 37 | self.setIcon(QIcon(常量.图标路径)) 38 | self.setParent(self.主窗口) 39 | self.setContextMenu(self.托盘菜单) # attach the menu to the system tray icon 40 | self.show() 41 | 42 | def 显示主窗口(self): 43 | self.主窗口.showNormal() 44 | self.主窗口.activateWindow() 45 | self.主窗口.setWindowFlag(Qt.WindowStaysOnTopHint) # always stay on top 46 | self.主窗口.show() 47 | 48 | def 退出(self): 49 | sys.stdout = sys.__stdout__ 50 | self.hide() 51 | 
QApplication.quit() 52 | 53 | def 切换聆听中的图标(self): 54 | self.setIcon(QIcon(常量.聆听图标路径)) 55 | 56 | def 切换正常图标(self): 57 | self.setIcon(QIcon(常量.图标路径)) 58 | 59 | def trayEvent(self, reason): 60 | # 鼠标点击icon传递的信号会带有一个整形的值,1是表示单击右键,2是双击,3是单击左键,4是用鼠标中键点击 61 | if reason == 2 or reason == 3: 62 | if 常量.主窗口.isMinimized() or not 常量.主窗口.isVisible(): 63 | # 若是最小化或者最小化到托盘,则先正常显示窗口,再变为活动窗口(暂时显示在最前面) 64 | self.显示主窗口() 65 | else: 66 | # 若不是最小化,则最小化 67 | # self.window.showMinimized() 68 | self.主窗口.hide() 69 | pass -------------------------------------------------------------------------------- /src/moduels/gui/Tab_CapsWriter.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import QWidget, QVBoxLayout, QHBoxLayout, QGroupBox, QPushButton 4 | # from PySide2.QtGui import * 5 | from PySide2.QtCore import Signal 6 | import os, re, subprocess, time 7 | 8 | import pyaudio 9 | 10 | 11 | from moduels.component.NormalValue import 常量 12 | from moduels.component.QEditBox_StdoutBox import QEditBox_StdoutBox 13 | from moduels.thread.Thread_AliEngine import Thread_AliEngine 14 | from moduels.gui.Combo_EngineList import Combo_EngineList 15 | 16 | 17 | class Tab_CapsWriter(QWidget): 18 | 状态栏消息 = Signal(str, int) 19 | 20 | def __init__(self): 21 | super().__init__() 22 | self.initElements() # 先初始化各个控件 23 | self.initSlots() # 再将各个控件连接到信号槽 24 | self.initLayouts() # 然后布局 25 | self.initValues() # 再定义各个控件的值 26 | 27 | 28 | def initElements(self): 29 | self.页面总布局 = QVBoxLayout() 30 | 31 | self.引擎选择Box = QGroupBox('引擎选择') 32 | self.引擎选择下拉框 = Combo_EngineList() 33 | self.引擎选择Box布局 = QVBoxLayout() 34 | 35 | self.控制台输出Box = QGroupBox('提示消息') 36 | self.控制台输出框 = QEditBox_StdoutBox() 37 | self.控制台输出Box布局 = QVBoxLayout() 38 | 39 | self.启停按钮Box = QGroupBox('开关') 40 | self.启动按钮 = QPushButton('启用 CapsWriter') 41 | self.停止按钮 = QPushButton('停止 CapsWriter') 42 | self.启停按钮Box布局 = QHBoxLayout() 43 | 44 | 45 | 46 | def 
initLayouts(self): 47 | 48 | 49 | self.引擎选择Box布局.addWidget(self.引擎选择下拉框) 50 | 51 | self.控制台输出Box布局.addWidget(self.控制台输出框) 52 | 53 | self.启停按钮Box布局.addWidget(self.启动按钮) 54 | self.启停按钮Box布局.addWidget(self.停止按钮) 55 | 56 | self.引擎选择Box.setLayout(self.引擎选择Box布局) 57 | self.控制台输出Box.setLayout(self.控制台输出Box布局) 58 | self.启停按钮Box.setLayout(self.启停按钮Box布局) 59 | 60 | self.页面总布局.addWidget(self.引擎选择Box) 61 | self.页面总布局.addWidget(self.控制台输出Box) 62 | self.页面总布局.addWidget(self.启停按钮Box) 63 | 64 | self.setLayout(self.页面总布局) 65 | 66 | def initSlots(self): 67 | self.启动按钮.clicked.connect(self.启动引擎) 68 | self.停止按钮.clicked.connect(self.停止引擎) 69 | 70 | def 更新控制台输出(self, 文本): 71 | self.控制台输出框.print(文本) 72 | 73 | def initValues(self): 74 | self.引擎线程 = None 75 | self.停止按钮.setDisabled(True) 76 | 77 | print("""\n软件介绍: 78 | 79 | CapsWriter,顾名思义,就是按下大写锁定键来打字的工具。它的具体作用是:当你按下键盘上的大写锁定键后,软件开始语音识别,当你松开大写锁定键时,识别的结果就可以立马上屏。 80 | 81 | 目前软件内置了对阿里云一句话识别 API 的支持。如果你要使用,就需要先在阿里云上实名认证,申请语音识别 API,在设置页面添加一个语音识别引擎。 82 | 83 | 具体申请阿里云 API 的方法,可以参考我这个视频:https://www.bilibili.com/video/BV1qK4y1s7Fb/ 84 | 85 | 添加上引擎后,在当前页面选择一个引擎,点击启用按钮,就可以进行语音识别了! 86 | 87 | 启用后,在实际使用中,只要按下 CapsLock 键,软件就会立刻开始录音: 88 | 89 | 如果只是单击 CapsLock 后松开,录音数据会立刻被删除; 90 | 如果按下 CapsLock 键时长超过 0.3 秒,就会开始连网进行语音识别,松开 CapsLock 键时,语音识别结果会被立刻输入。 91 | 92 | 所以你只需要按下 CapsLock 键,无需等待,就可以开始说话,因为当你按下 CapsLock 键的时候,程序就开始录音了。说完后,松开,识别结果立马上屏。\r\n""") 93 | 94 | def 启动引擎(self): 95 | if self.引擎线程 is not None: return 96 | 引擎名称 = self.引擎选择下拉框.currentText() 97 | if 引擎名称 == '': return 98 | result = 常量.数据库连接.execute(f'''select * from {常量.语音引擎表单名} where 引擎名称 = :引擎名称''', 99 | {'引擎名称': 引擎名称}).fetchone() 100 | if result is None: return # look the engine up before touching the UI, so a missing row does not leave the controls disabled 101 | self.启动按钮.setDisabled(True) 102 | self.引擎选择下拉框.setDisabled(True) 103 | self.parent().parent().setTabEnabled(1, False) 104 | self.parent().parent().setTabEnabled(2, False) 105 | self.停止按钮.setEnabled(True) 106 | self.引擎线程 = Thread_AliEngine(引擎名称, self) 107 | self.引擎线程.识别中的信号.connect(常量.托盘.切换聆听中的图标) 108 | 
self.引擎线程.结束识别的信号.connect(常量.托盘.切换正常图标) 109 | self.引擎线程.引擎出错信号.connect(self.停止引擎) 110 | self.引擎线程.start() 111 | 112 | 113 | def 停止引擎(self): 114 | if self.引擎线程 != None: 115 | self.引擎线程.停止引擎() 116 | # print(self.引擎线程.isRunning()) 117 | self.引擎线程 = None 118 | self.启动按钮.setEnabled(True) 119 | self.引擎选择下拉框.setEnabled(True) 120 | self.parent().parent().setTabEnabled(1, True) 121 | self.parent().parent().setTabEnabled(2, True) 122 | self.停止按钮.setDisabled(True) 123 | 124 | 125 | # self.压缩图片_按钮纵向布局.通过id勾选单选按钮(常量.输出选项['图片压缩']) 126 | # self.输出格式_按钮纵向布局.通过id勾选单选按钮(常量.输出选项['输出格式']) 127 | # self.其它_安装到手机选框.setChecked(常量.输出选项['adb发送至手机']) 128 | # self.其它_清理注释选框.setChecked(常量.输出选项['清理注释']) 129 | # 130 | # def 无线adb(self): 131 | # self.无线adb线程.start() 132 | # 133 | # def 提取皮肤(self): 134 | # self.提取皮肤线程.start() 135 | # 136 | # def 解压皮肤(self): 137 | # 解压皮肤对话框 = Dialog_DecompressSkin() 138 | # 139 | # def 发送皮肤(self): 140 | # 获得的皮肤路径 = QFileDialog.getOpenFileName(self, caption='选择皮肤', dir=常量.皮肤输出路径, filter='皮肤文件 (*.bds)')[0] 141 | # if 获得的皮肤路径 == '': return True 142 | # 皮肤文件名 = os.path.basename(获得的皮肤路径) 143 | # 手机皮肤路径 = '/sdcard/baidu/ime/skins/' + 皮肤文件名 144 | # 发送皮肤命令 = f'''adb push "{获得的皮肤路径}" "{手机皮肤路径}"''' 145 | # subprocess.run(发送皮肤命令, startupinfo=常量.subprocessStartUpInfo) 146 | # 安装皮肤命令 = f'''adb shell am start -a android.intent.action.VIEW -c android.intent.category.DEFAULT -n com.baidu.input/com.baidu.input.ImeUpdateActivity -d '{手机皮肤路径}' ''' 147 | # subprocess.run(安装皮肤命令, startupinfo=常量.subprocessStartUpInfo) 148 | # 149 | # 150 | # def 备份选中皮肤(self): 151 | # if self.皮肤列表Box.列表.currentRow() < 0: return 152 | # 已选中的列表项 = self.皮肤列表Box.列表.currentItem().text() 153 | # 输出文件名, 皮肤源文件目录 = 常量.数据库连接.cursor().execute( 154 | # f'select outputFileName, sourceFilePath from {常量.数据库皮肤表单名} where skinName = :皮肤名字;', 155 | # {'皮肤名字': 已选中的列表项}).fetchone() 156 | # 备份时间 = time.localtime() 157 | # 备份压缩文件名 = 
f'{输出文件名}_备份_{备份时间.tm_year}年{备份时间.tm_mon}月{备份时间.tm_mday}日{备份时间.tm_hour}时{备份时间.tm_min}分{备份时间.tm_sec}秒.bds' 158 | # 备份文件完整路径 = os.path.join(常量.皮肤输出路径, '皮肤备份文件', 备份压缩文件名) 159 | # 备份命令 = f'''winrar a -afzip -ibck -r -ep1 "{备份文件完整路径}" "{皮肤源文件目录}/*"''' 160 | # if not os.path.exists(os.path.dirname(备份文件完整路径)): os.makedirs(os.path.dirname(备份文件完整路径)) 161 | # subprocess.run(备份命令, startupinfo=常量.subprocessStartUpInfo) 162 | # os.startfile(os.path.dirname(备份文件完整路径)) 163 | # 164 | # 165 | # def 还原选中皮肤(self): 166 | # if self.皮肤列表Box.列表.currentRow() < 0: return 167 | # 已选中的列表项 = self.皮肤列表Box.列表.currentItem().text() 168 | # 输出文件名, 皮肤源文件目录 = 常量.数据库连接.cursor().execute( 169 | # f'select outputFileName, sourceFilePath from {常量.数据库皮肤表单名} where skinName = :皮肤名字;', 170 | # {'皮肤名字': 已选中的列表项}).fetchone() 171 | # 备份文件夹路径 = os.path.join(常量.皮肤输出路径, '皮肤备份文件') 172 | # Dialog_RestoreSkin(备份文件夹路径, 输出文件名, 皮肤源文件目录) 173 | # 174 | # def 打开皮肤输出文件夹(self): 175 | # if not os.path.exists(常量.皮肤输出路径): os.makedirs(常量.皮肤输出路径) 176 | # os.startfile(常量.皮肤输出路径) 177 | # 178 | # def 打包选中皮肤(self): 179 | # if self.皮肤列表Box.列表.currentRow() < 0: return True 180 | # self.备份选中皮肤_按钮.setDisabled(True) 181 | # self.还原选中皮肤_按钮.setDisabled(True) 182 | # self.打包选中皮肤_按钮.setDisabled(True) 183 | # self.打包所有皮肤_按钮.setDisabled(True) 184 | # self.生成皮肤线程.是否要全部生成 = False 185 | # self.生成皮肤线程.start() 186 | # 187 | # def 打包所有皮肤(self): 188 | # self.备份选中皮肤_按钮.setDisabled(True) 189 | # self.还原选中皮肤_按钮.setDisabled(True) 190 | # self.打包选中皮肤_按钮.setDisabled(True) 191 | # self.打包所有皮肤_按钮.setDisabled(True) 192 | # self.生成皮肤线程.是否要全部生成 = True 193 | # self.生成皮肤线程.start() 194 | # 195 | # def 生成皮肤线程完成(self): 196 | # self.备份选中皮肤_按钮.setEnabled(True) 197 | # self.还原选中皮肤_按钮.setEnabled(True) 198 | # self.打包选中皮肤_按钮.setEnabled(True) 199 | # self.打包所有皮肤_按钮.setEnabled(True) 200 | # 常量.状态栏.showMessage('打包任务完成!', 5000) 201 | 202 | -------------------------------------------------------------------------------- /src/moduels/gui/Tab_Config.py: 
-------------------------------------------------------------------------------- 1 | import webbrowser 2 | from PySide2.QtCore import Signal 3 | from PySide2.QtWidgets import QWidget, QVBoxLayout, QHBoxLayout, QGridLayout, QGroupBox, QPushButton, QCheckBox 4 | from moduels.component.NormalValue import 常量 5 | from moduels.gui.Group_EditableList import Group_EditableList 6 | from moduels.gui.Dialog_AddEngine import Dialog_AddEngine 7 | # from moduels.gui.Group_PathSetting import Group_PathSetting 8 | 9 | 10 | class Tab_Config(QWidget): 11 | 状态栏消息 = Signal(str, int) 12 | 13 | def __init__(self, parent=None): 14 | super(Tab_Config, self).__init__(parent) 15 | self.initElements() # 先初始化各个控件 16 | self.initSlots() # 再将各个控件连接到信号槽 17 | self.initLayouts() # 然后布局 18 | self.initValues() # 再定义各个控件的值 19 | 20 | def initElements(self): 21 | self.程序设置Box = QGroupBox(self.tr('程序设置')) 22 | self.开关_关闭窗口时隐藏到托盘 = QCheckBox(self.tr('点击关闭按钮时隐藏到托盘')) 23 | self.程序设置横向布局 = QHBoxLayout() 24 | 25 | self.引擎列表 = Group_EditableList('语音引擎', Dialog_AddEngine, 常量.数据库连接, 常量.语音引擎表单名, '引擎名称') 26 | 27 | self.常用网址Box = QGroupBox('网页控制台') 28 | self.常用网址Box布局 = QGridLayout() 29 | self.智能语音交互控制台按钮 = QPushButton('智能语音交互') 30 | self.RAM访问控制控制台按钮 = QPushButton('RAM访问控制') 31 | 32 | self.页面布局 = QVBoxLayout() 33 | 34 | 35 | def initSlots(self): 36 | self.开关_关闭窗口时隐藏到托盘.stateChanged.connect(self.设置_隐藏到状态栏) 37 | self.智能语音交互控制台按钮.clicked.connect(lambda: webbrowser.open(r'https://nls-portal.console.aliyun.com/')) 38 | self.RAM访问控制控制台按钮.clicked.connect(lambda: webbrowser.open(r'https://ram.console.aliyun.com/')) 39 | # self.路径设置Box.皮肤输出路径输入框.textChanged.connect(self.设置_皮肤输出路径) 40 | # self.路径设置Box.音效文件路径输入框.textChanged.connect(self.设置_音效文件路径) 41 | 42 | def initLayouts(self): 43 | self.程序设置横向布局.addWidget(self.开关_关闭窗口时隐藏到托盘) 44 | self.程序设置Box.setLayout(self.程序设置横向布局) 45 | 46 | self.常用网址Box布局.addWidget(self.智能语音交互控制台按钮, 0, 0) 47 | self.常用网址Box布局.addWidget(self.RAM访问控制控制台按钮, 0, 1) 48 | self.常用网址Box.setLayout(self.常用网址Box布局) 
49 | 50 | self.页面布局.addWidget(self.程序设置Box) 51 | self.页面布局.addWidget(self.引擎列表) 52 | self.页面布局.addWidget(self.常用网址Box) 53 | self.页面布局.addStretch(1) 54 | 55 | self.setLayout(self.页面布局) 56 | 57 | def initValues(self): 58 | self.检查数据库() 59 | 60 | 61 | def 检查数据库(self): 62 | 数据库连接 = 常量.数据库连接 63 | self.检查数据库_关闭时最小化(数据库连接) 64 | 65 | def 检查数据库_关闭时最小化(self, 数据库连接): 66 | result = 数据库连接.cursor().execute(f'''select value from {常量.偏好设置表单名} where item = :item''', 67 | {'item': 'hideToTrayWhenHitCloseButton'}).fetchone() 68 | if result == None: # 如果关闭窗口最小化到状态栏这个选项还没有在数据库创建,那就创建一个 69 | 初始值 = 'False' 70 | 数据库连接.cursor().execute(f'''insert into {常量.偏好设置表单名} (item, value) values (:item, :value) ''', 71 | {'item': 'hideToTrayWhenHitCloseButton', 72 | 'value':初始值}) 73 | 数据库连接.commit() 74 | self.开关_关闭窗口时隐藏到托盘.setChecked(初始值 == 'True') 75 | else: 76 | self.开关_关闭窗口时隐藏到托盘.setChecked(result[0] == 'True') 77 | # 78 | # def 检查数据库_皮肤输出路径(self, 数据库连接): 79 | # result = 数据库连接.cursor().execute(f'''select value from {常量.偏好设置表单名} where item = :item''', 80 | # {'item': 'skinOutputPath'}).fetchone() 81 | # if result == None: # 如果关闭窗口最小化到状态栏这个选项还没有在数据库创建,那就创建一个 82 | # 初始值 = 'output' 83 | # 数据库连接.cursor().execute(f'''insert into {常量.偏好设置表单名} (item, value) values (:item, :value) ''', 84 | # {'item': 'skinOutputPath', 85 | # 'value': 初始值}) 86 | # 数据库连接.commit() 87 | # self.路径设置Box.皮肤输出路径输入框.setText(初始值) 88 | # else: 89 | # self.路径设置Box.皮肤输出路径输入框.setText(result[0]) 90 | # 91 | # def 检查数据库_音效文件路径(self, 数据库连接): 92 | # result = 数据库连接.cursor().execute(f'''select value from {常量.偏好设置表单名} where item = :item''', 93 | # {'item': 'soundFilePath'}).fetchone() 94 | # if result == None: # 如果关闭窗口最小化到状态栏这个选项还没有在数据库创建,那就创建一个 95 | # 初始值 = 'sound' 96 | # 数据库连接.cursor().execute(f'''insert into {常量.偏好设置表单名} (item, value) values (:item, :value) ''', 97 | # {'item': 'soundFilePath', 98 | # 'value': 初始值}) 99 | # 数据库连接.commit() 100 | # self.路径设置Box.音效文件路径输入框.setText(初始值) 101 | # else: 102 | # 
self.路径设置Box.音效文件路径输入框.setText(result[0]) 103 | 104 | def 设置_隐藏到状态栏(self): 105 | 数据库连接 = 常量.数据库连接 106 | 数据库连接.cursor().execute(f'''update {常量.偏好设置表单名} set value = :value where item = :item''', 107 | {'item': 'hideToTrayWhenHitCloseButton', 108 | 'value': str(self.开关_关闭窗口时隐藏到托盘.isChecked())}) 109 | 数据库连接.commit() 110 | 常量.关闭时隐藏到托盘 = self.开关_关闭窗口时隐藏到托盘.isChecked() 111 | 112 | # def 设置_皮肤输出路径(self): 113 | # 数据库连接 = 常量.数据库连接 114 | # 数据库连接.cursor().execute(f'''update {常量.数据库偏好设置表单名} set value = :value where item = :item''', 115 | # {'item': 'skinOutputPath', 116 | # 'value': self.路径设置Box.皮肤输出路径输入框.text()}) 117 | # 数据库连接.commit() 118 | # 常量.皮肤输出路径 = self.路径设置Box.皮肤输出路径输入框.text() 119 | # 120 | # 121 | # def 设置_音效文件路径(self): 122 | # 数据库连接 = 常量.数据库连接 123 | # 数据库连接.cursor().execute(f'''update {常量.数据库偏好设置表单名} set value = :value where item = :item''', 124 | # {'item': 'soundFilePath', 125 | # 'value': self.路径设置Box.音效文件路径输入框.text()}) 126 | # 数据库连接.commit() 127 | # 常量.音效文件路径 = self.路径设置Box.音效文件路径输入框.text() 128 | 129 | def 隐藏到状态栏开关被点击(self): 130 | cursor = 常量.数据库连接.cursor() 131 | cursor.execute(f'''update {常量.偏好设置表单名} set value = :value where item = :item''', {'item': 'hideToTrayWhenHitCloseButton', 'value': str(self.开关_关闭窗口时隐藏到托盘.isChecked())}) 132 | 常量.数据库连接.commit() 133 | -------------------------------------------------------------------------------- /src/moduels/gui/Tab_Help.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | from PySide2.QtWidgets import QWidget, QPushButton, QVBoxLayout 4 | from PySide2.QtCore import Signal 5 | from moduels.component.NormalValue import 常量 6 | from moduels.component.SponsorDialog import SponsorDialog 7 | 8 | import os, webbrowser 9 | 10 | 11 | class Tab_Help(QWidget): 12 | 状态栏消息 = Signal(str, int) 13 | 14 | def __init__(self): 15 | super().__init__() 16 | self.initElement() # initialize the widgets first 17 | self.initSlots() # then connect the widgets to their signals and slots 18 | self.initLayout() # then lay them out 19 | self.initValue() # finally set the widgets' initial values 20 | 21 | 
def initElement(self): 22 | self.打开帮助按钮 = QPushButton(self.tr('打开帮助文档')) 23 | self.ffmpegMannualNoteButton = QPushButton(self.tr('查看作者的 FFmpeg 笔记')) 24 | self.openVideoHelpButtone = QPushButton(self.tr('查看视频教程')) 25 | self.openGiteePage = QPushButton(self.tr(f'当前版本是 v{常量.软件版本},到 Gitee 检查新版本')) 26 | self.openGithubPage = QPushButton(self.tr(f'当前版本是 v{常量.软件版本},到 Github 检查新版本')) 27 | self.linkToDiscussPage = QPushButton(self.tr('加入 QQ 群')) 28 | self.tipButton = QPushButton(self.tr('打赏作者')) 29 | 30 | self.masterLayout = QVBoxLayout() 31 | 32 | def initSlots(self): 33 | self.打开帮助按钮.clicked.connect(self.openHelpDocument) 34 | self.ffmpegMannualNoteButton.clicked.connect(lambda: webbrowser.open(self.tr(r'https://hacpai.com/article/1595480295489'))) 35 | self.openVideoHelpButtone.clicked.connect(lambda: webbrowser.open(self.tr(r'https://www.bilibili.com/video/BV12A411p73r/'))) 36 | self.openGiteePage.clicked.connect(lambda: webbrowser.open(self.tr(r'https://gitee.com/haujet/CapsWriter/releases'))) 37 | self.openGithubPage.clicked.connect(lambda: webbrowser.open(self.tr(r'https://github.com/HaujetZhao/CapsWriter/releases'))) 38 | self.linkToDiscussPage.clicked.connect(lambda: webbrowser.open( 39 | self.tr(r'https://qm.qq.com/cgi-bin/qm/qr?k=DgiFh5cclAElnELH4mOxqWUBxReyEVpm&jump_from=webapi'))) 40 | self.tipButton.clicked.connect(lambda: SponsorDialog(self)) 41 | 42 | def initLayout(self): 43 | self.setLayout(self.masterLayout) 44 | # self.masterLayout.addWidget(self.打开帮助按钮) 45 | # self.masterLayout.addWidget(self.ffmpegMannualNoteButton) 46 | self.masterLayout.addWidget(self.openVideoHelpButtone) 47 | self.masterLayout.addWidget(self.openGiteePage) 48 | self.masterLayout.addWidget(self.openGithubPage) 49 | self.masterLayout.addWidget(self.linkToDiscussPage) 50 | self.masterLayout.addWidget(self.tipButton) 51 | 52 | def initValue(self): 53 | self.打开帮助按钮.setMaximumHeight(100) 54 | self.ffmpegMannualNoteButton.setMaximumHeight(100) 55 | 
self.openVideoHelpButtone.setMaximumHeight(100) 56 | self.openGiteePage.setMaximumHeight(100) 57 | self.openGithubPage.setMaximumHeight(100) 58 | self.linkToDiscussPage.setMaximumHeight(100) 59 | self.tipButton.setMaximumHeight(100) 60 | 61 | def openHelpDocument(self): 62 | try: 63 | if 常量.系统平台 == 'Darwin': 64 | import shlex 65 | os.system("open " + shlex.quote(self.tr("./misc/Docs/README_zh.html"))) 66 | elif 常量.系统平台 == 'Windows': 67 | os.startfile(os.path.realpath(self.tr('./misc/Docs/README_zh.html'))) 68 | except: 69 | print('未能打开帮助文档') 70 | -------------------------------------------------------------------------------- /src/moduels/thread/Thread_AliEngine.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | import json 4 | import os 5 | import pyaudio 6 | import threading 7 | import keyboard 8 | import sqlite3 9 | import time 10 | 11 | import ali_speech 12 | 13 | from PySide2.QtCore import QThread, Signal 14 | from PySide2.QtWidgets import QApplication 15 | 16 | from moduels.component.NormalValue import 常量 17 | from moduels.function.getAlibabaRecognizer import getAlibabaRecognizer 18 | 19 | 20 | 21 | 22 | class Thread_AliEngine(QThread): 23 | 状态栏消息 = Signal(str, int) 24 | 引擎出错信号 = Signal() 25 | 26 | CHUNK = 1024 # 数据包或者数据片段 27 | FORMAT = pyaudio.paInt16 # pyaudio.paInt16表示我们使用量化位数 16位来进行录音 28 | CHANNELS = 1 # 声道,1为单声道,2为双声道 29 | RATE = 16000 # 采样率,每秒钟16000次 30 | 总共写入音频片段数 = 0 31 | 32 | count = 1 # 计数 33 | 待命中 = True # 是否准备开始录音 34 | 识别中 = False # 控制录音是否停止 35 | 36 | 识别中的信号 = Signal() 37 | 结束识别的信号 = Signal() 38 | 39 | 40 | def __init__(self, 引擎名称, parent=None): 41 | super().__init__(parent) 42 | self.持续录音 = True 43 | self.正在运行 = 0 44 | self.引擎名称 = 引擎名称 45 | self.得到引擎信息() 46 | self.tokenId = 0 47 | self.tokenExpireTime = 0 48 | self.构建按键发送器() 49 | QApplication.instance().aboutToQuit.connect(self.要退出了) 50 | 51 | def 要退出了(self): 52 | self.terminate() 53 | 54 | def 构建按键发送器(self): 55 | if 常量.系统平台 == 
'Windows': 56 | import win32com.client as comclt 57 | self.按键发送器 = comclt.Dispatch("WScript.Shell") 58 | 59 | def 发送大写锁定键(self): 60 | if 常量.系统平台 == 'Windows': 61 | self.按键发送器.SendKeys("{CAPSLOCK}") 62 | else: 63 | self.取消监听大写锁定键() 64 | keyboard.press_and_release('caps lock') 65 | self.开始监听大写锁定键() 66 | 67 | def 开始监听大写锁定键(self): 68 | keyboard.hook_key('caps lock', self.大写锁定键被触发) 69 | 70 | def 取消监听大写锁定键(self): 71 | try: 72 | keyboard.unhook('caps lock') 73 | except: 74 | pass 75 | 76 | def 停止引擎(self): 77 | self.取消监听大写锁定键() 78 | keyboard.unhook_all() 79 | self.setTerminationEnabled(True) 80 | self.terminate() 81 | print('引擎已停止\n\n') 82 | self.正在运行 = 0 83 | 84 | def 得到引擎信息(self): 85 | 数据库连接 = sqlite3.connect(常量.数据库路径) 86 | self.appKey, self.accessKeyId, self.accessKeySecret = 数据库连接.execute(f'''select AppKey, 87 | AccessKeyId, 88 | AccessKeySecret 89 | from {常量.语音引擎表单名} 90 | where 引擎名称 = :引擎名称''', 91 | {'引擎名称': self.引擎名称}).fetchone() 92 | 数据库连接.close() 93 | 94 | def 大写锁定键被触发(self, event): 95 | if event.event_type == "down": 96 | if self.识别中: 97 | return 98 | self.识别中 = True 99 | try: 100 | self.data = [] 101 | if not self.持续录音: # 如果录音进程不是被持续开启着,那么就需要在这里主动开启 102 | threading.Thread(target=self.录音线程, args=[self.p]).start() # 开始录音 103 | threading.Thread(target=self.识别线程).start() # 开始识别 104 | 105 | except: 106 | print('process 启动失败') 107 | elif event.event_type == "up": 108 | # self.访问录音数据的线程锁.acquire() 109 | self.识别中 = False 110 | # self.访问录音数据的线程锁.release() 111 | else: 112 | # print(event.event_type) 113 | pass 114 | 115 | def 为下一次输入准备识别器(self): 116 | self.识别器 = getAlibabaRecognizer(self.client, 117 | self.appKey, 118 | self.accessKeyId, 119 | self.accessKeySecret, 120 | self.tokenId, 121 | self.tokenExpireTime, 122 | 线程=self) 123 | if self.识别器 == False: 124 | print('获取云端识别器出错\n') 125 | self.引擎出错信号.emit() 126 | return False 127 | 128 | def 录音线程(self, p): 129 | self.录音(p) 130 | 131 | def 识别线程(self): 132 | self.识别中的信号.emit() 133 | if not self.识别(): 134 | self.count += 1 135 | 
self.总共写入音频片段数 = 0 136 | self.结束识别的信号.emit() 137 | 138 | def 录音(self, p): 139 | # print('准备录制') 140 | stream = p.open(channels=self.CHANNELS, 141 | format=self.FORMAT, 142 | rate=self.RATE, 143 | input=True, 144 | frames_per_buffer=self.CHUNK) 145 | if self.持续录音: 146 | while self.isRunning(): 147 | if self.识别中: 148 | self.录音数据存入内存(stream) 149 | else: 150 | stream.read(self.CHUNK) 151 | else: 152 | self.录音数据存入内存(stream) 153 | 154 | stream.stop_stream()# print('停止录制流') 155 | stream.close() 156 | 157 | def 录音数据存入内存(self, stream): 158 | for i in range(5): 159 | if not self.识别中: 160 | self.data = [] 161 | # self.访问录音数据的线程锁.release() 162 | return 163 | # print(f'录音{录音写入序号},开始写入,时间 {time.time()}') 164 | self.data.append(stream.read(self.CHUNK)) 165 | # print(f'录音{录音写入序号},写入结束,时间 {time.time()}') 166 | # 录音写入序号 += 1 167 | # 在这里录下5个小片段,大约录制了0.32秒,如果这个时候松开了大写锁定键,就不打开连接。如果还继续按着,那就开始识别。 168 | 169 | while self.识别中: 170 | # self.访问录音数据的线程锁.acquire() 171 | # print(f'录音{录音写入序号},开始写入,时间 {time.time()}') 172 | self.data.append(stream.read(self.CHUNK)) 173 | # print(f'录音{录音写入序号},写入结束,时间 {time.time()}\n') 174 | # 录音写入序号 += 1 175 | # self.访问录音数据的线程锁.release() 176 | # self.访问录音数据的线程锁.acquire() 177 | time.sleep(0.0) 178 | self.总共写入音频片段数 = len(self.data) 179 | # self.访问录音数据的线程锁.release() 180 | self.发送大写锁定键() # 再按下大写锁定键,还原大写锁定 181 | 182 | # 这边开始上传识别 183 | def 识别(self): 184 | # print('识别器开始等待') 185 | for i in range(5): 186 | time.sleep(0.06) 187 | if not self.识别中: 188 | return # 如果这个时候大写锁定键松开了 那就返回 189 | # print('识别器等待完闭') 190 | # try: 191 | print(self.tr('\n{}:在识别了,说完后请松开 CapsLock 键...').format(self.count)) 192 | 识别器 = self.识别器 193 | self.识别器 = None 194 | threading.Thread(target=self.为下一次输入准备识别器).start() # 用新线程为下一次识别准备识别器 195 | # print('准备新的识别器') 196 | try: 197 | ret = 识别器.start() # 识别器开始识别 198 | except: 199 | print('识别器开启失败') 200 | return False 201 | if ret < 0: 202 | return False # 如果开始识别出错了,那就返回 203 | 已发送音频片段数 = 0 # 对音频片段记数 204 | # j = 1 205 | 当前进程测得数据片段总数 = len(self.data) 206 | while 
self.识别中 or 已发送音频片段数 < 当前进程测得数据片段总数 or 已发送音频片段数 < self.总共写入音频片段数: 207 | # self.访问录音数据的线程锁.acquire() 208 | 当前进程测得数据片段总数 = len(self.data) 209 | # self.访问录音数据的线程锁.release() 210 | # print(f' 已发送音频片段数: {已发送音频片段数}, 当前进程测得数据片段总数: {当前进程测得数据片段总数}') 211 | if 已发送音频片段数 > 当前进程测得数据片段总数: 212 | return True 213 | elif 已发送音频片段数 == 当前进程测得数据片段总数: 214 | time.sleep(0.05) 215 | if 已发送音频片段数 < 当前进程测得数据片段总数: 216 | # self.访问录音数据的线程锁.acquire() 217 | 要发送的音频数据 = self.data[已发送音频片段数] 218 | # self.访问录音数据的线程锁.release() 219 | try: 220 | # print(f' 发送器{j},开始发送,时间 {time.time()}') 221 | ret = 识别器.send(要发送的音频数据) # 发送音频数据 222 | # print(f' 发送器{j},发送结束,时间 {time.time()}\n') 223 | # j += 1 224 | except: 225 | print('识别器发送失败') 226 | return False 227 | 已发送音频片段数 += 1 228 | # print(self.tr('\n{}:按住 CapsLock 键后开始说话...').format(self.count + 1)) 229 | self.总共写入音频片段数 = 0 230 | self.结束识别的信号.emit() 231 | self.count += 1 232 | 识别器.stop() 233 | 识别器.close() 234 | return True 235 | 236 | 237 | def run(self): 238 | if self.正在运行 == 1: return False 239 | self.正在运行 = 1 240 | 241 | self.client = ali_speech.NlsClient() 242 | self.client.set_log_level('WARNING') # 设置 client 输出日志信息的级别:DEBUG、INFO、WARNING、ERROR 243 | 244 | self.tokenId = 0 245 | self.tokenExpireTime = 0 246 | # try: 247 | self.识别器 = getAlibabaRecognizer(self.client, 248 | self.appKey, 249 | self.accessKeyId, 250 | self.accessKeySecret, 251 | self.tokenId, 252 | self.tokenExpireTime, 253 | 线程=self) 254 | if self.识别器 == False: 255 | print('获取云端识别器出错\n') 256 | self.引擎出错信号.emit() 257 | return False 258 | 259 | self.p = pyaudio.PyAudio() # 在 QThread 中引入 PyAudio 会使得 PySide2 图形界面阻塞 260 | if self.持续录音: 261 | threading.Thread(target=self.录音线程, args=[self.p]).start() 262 | 263 | self.开始监听大写锁定键() 264 | 265 | print("""引擎初始化完成\n""") 266 | print('按住 CapsLock 键后开始说话...'.format(self.count)) 267 | keyboard.wait() 268 | 269 | 270 | 271 | 272 | -------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp36-cp36m-win32.whl: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp36-cp36m-win32.whl -------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp36-cp36m-win_amd64.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp36-cp36m-win_amd64.whl -------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp37-cp37m-win32.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp37-cp37m-win32.whl -------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp37-cp37m-win_amd64.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp37-cp37m-win_amd64.whl -------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp38-cp38-win32.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp38-cp38-win32.whl -------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp38-cp38-win_amd64.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp38-cp38-win_amd64.whl 
-------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp39-cp39-win32.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp39-cp39-win32.whl -------------------------------------------------------------------------------- /安装指南/PyAudio-0.2.11-cp39-cp39-win_amd64.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/PyAudio-0.2.11-cp39-cp39-win_amd64.whl -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 
26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. 
For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. 
If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. 
You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. 
(Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/NOTICE: -------------------------------------------------------------------------------- 1 | The following notices pertain to this software license. 2 | 3 | 4 | ==================================================================== 5 | ==================================================================== 6 | websocket-client 7 | 8 | Copyright 2018 Hiroki Ohtani. 9 | 10 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 11 | 12 | 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 15 | 16 | 3. 
Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. 17 | 18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 19 | 20 | 21 | ==================================================================== 22 | ==================================================================== 23 | requests 24 | 25 | Copyright 2018 Kenneth Reitz 26 | 27 | Licensed under the Apache License, Version 2.0 (the "License"); 28 | you may not use this file except in compliance with the License. 29 | You may obtain a copy of the License at 30 | 31 | https://www.apache.org/licenses/LICENSE-2.0 32 | 33 | Unless required by applicable law or agreed to in writing, software 34 | distributed under the License is distributed on an "AS IS" BASIS, 35 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 36 | See the License for the specific language governing permissions and 37 | limitations under the License. 
38 | 39 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/README.rst: -------------------------------------------------------------------------------- 1 | ============================ 2 | alibabacloud-nls-java-sdk 3 | ============================ 4 | 5 | Overview 6 | -------- 7 | 8 | This is the Python SDK for Alibaba's Intelligent Speech Interaction 2.0 service. 9 | Supported services: token acquisition, short-sentence recognition, real-time speech recognition, and speech synthesis. 10 | 11 | Requirements 12 | ------------ 13 | 14 | Python 3.4 or later 15 | 16 | .. note:: 17 | 18 | Python 2 is not supported yet. 19 | 20 | 21 | Installation 22 | ------------ 23 | 24 | Make sure the Python packaging tool setuptools is installed. If it is not, install it: 25 | 26 | .. code-block:: bash 27 | 28 | $ pip install setuptools 29 | 30 | 31 | Run the following commands in the SDK directory: 32 | 33 | .. code-block:: bash 34 | 35 | # Build the package 36 | $ python setup.py bdist_egg 37 | # Install 38 | $ python setup.py install 39 | 40 | 41 | .. note:: 42 | 43 | The pip and python commands above refer to their Python 3 versions. 44 | 45 | 46 | API reference 47 | ------------- 48 | 49 | - `Short-sentence recognition Python SDK guide `_ 50 | - `Real-time speech recognition Python SDK guide `_ 51 | - `Speech synthesis Python SDK guide `_ -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | * Copyright 2015 Alibaba Group Holding Limited 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License.
15 | """ 16 | 17 | from ali_speech._client import NlsClient 18 | 19 | __version__ = "0.1.0" 20 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_client.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import websocket 20 | 21 | try: 22 | import thread 23 | except ImportError: 24 | import _thread as thread 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._create_token import AccessToken 28 | from ali_speech._speech_recognizer import SpeechRecognizer 29 | from ali_speech._speech_transcriber import SpeechTranscriber 30 | from ali_speech._speech_synthesizer import SpeechSynthesizer 31 | 32 | __all__ = ["NlsClient"] 33 | 34 | 35 | class NlsClient: 36 | URL_GATEWAY = 'wss://nls-gateway.cn-shanghai.aliyuncs.com/ws/v1' 37 | 38 | def __init__(self): 39 | websocket.enableTrace(False) 40 | 41 | @staticmethod 42 | def set_log_level(level): 43 | _log.setLevel(level) 44 | 45 | @staticmethod 46 | def create_token(access_key_id, access_key_secret): 47 | return AccessToken.create_token(access_key_id, access_key_secret) 48 | 49 | def create_recognizer(self, callback, gateway_url=URL_GATEWAY): 50 | request = SpeechRecognizer(callback, gateway_url) 51 | return request 52 | 53 | def create_transcriber(self, callback, gateway_url=URL_GATEWAY): 54 | transcriber = SpeechTranscriber(callback, gateway_url) 55 | return transcriber 56 | 57 | def create_synthesizer(self, callback, gateway_url=URL_GATEWAY): 58 | synthesizer = SpeechSynthesizer(callback, gateway_url) 59 | return synthesizer 60 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_constant.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 
8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 17 | """ 18 | 19 | 20 | class Status: 21 | # Initial state 22 | STATUS_INIT = 1 23 | # WebSocket connection established (on_open) 24 | STATUS_WS_CONNECTED = 2 25 | # Connecting to the gateway service 26 | STATUS_STARTING = 3 27 | # Connected to the service (on_message RecognitionStarted) 28 | STATUS_STARTED = 4 29 | # The client is actively disconnecting 30 | STATUS_STOPPING = 5 31 | # Disconnected from the service 32 | STATUS_STOPPED = 6 33 | # VAD is enabled and the server sent the completed event on its own 34 | STATUS_COMPLETED_WITH_OUT_STOP = 7 35 | 36 | 37 | class Constant: 38 | CONTEXT = 'context' 39 | CONTEXT_SDK_KEY = 'sdk' 40 | CONTEXT_SDK_KEY_NAME = 'name' 41 | CONTEXT_SDK_VALUE_NAME = 'nls-sdk-python' 42 | CONTEXT_SDK_KEY_VERSION = 'version' 43 | CONTEXT_SDK_VALUE_VERSION = '2.0.1' 44 | 45 | HEADER_TOKEN = 'X-NLS-Token' 46 | 47 | HEADER = 'header' 48 | HEADER_KEY_NAMESPACE = 'namespace' 49 | HEADER_KEY_NAME = 'name' 50 | HEADER_KEY_MESSAGE_ID = 'message_id' 51 | HEADER_KEY_APPKEY = 'appkey' 52 | HEADER_KEY_TASK_ID = 'task_id' 53 | HEADER_KEY_STATUS = 'status' 54 | HEADER_KEY_STATUS_TEXT = 'status_text' 55 | 56 | PAYLOAD = 'payload' 57 | PAYLOAD_KEY_SAMPLE_RATE = 'sample_rate' 58 | PAYLOAD_KEY_FORMAT = 'format' 59 | PAYLOAD_KEY_ENABLE_ITN = 'enable_inverse_text_normalization' 60 | PAYLOAD_KEY_ENABLE_INTERMEDIATE_RESULT = 'enable_intermediate_result' 61 | PAYLOAD_KEY_ENABLE_PUNCTUATION_PREDICTION = 'enable_punctuation_prediction' 62 | 63 | PAYLOAD_KEY_VOICE = 'voice' 64 | PAYLOAD_KEY_TEXT = 'text' 65 | PAYLOAD_KEY_VOLUME = 'volume' 66 | PAYLOAD_KEY_SPEECH_RATE = 'speech_rate' 67 | PAYLOAD_KEY_PITCH_RATE = 'pitch_rate' 68 | 69 | HEADER_VALUE_ASR_NAMESPACE =
'SpeechRecognizer' 70 | HEADER_VALUE_ASR_NAME_START = 'StartRecognition' 71 | HEADER_VALUE_ASR_NAME_STOP = 'StopRecognition' 72 | HEADER_VALUE_ASR_NAME_STARTED = 'RecognitionStarted' 73 | HEADER_VALUE_ASR_NAME_RESULT_CHANGED = 'RecognitionResultChanged' 74 | HEADER_VALUE_ASR_NAME_COMPLETED = 'RecognitionCompleted' 75 | 76 | HEADER_VALUE_NAME_TASK_FAILED = 'TaskFailed' 77 | 78 | HEADER_VALUE_TRANS_NAMESPACE = 'SpeechTranscriber' 79 | HEADER_VALUE_TRANS_NAME_START = 'StartTranscription' 80 | HEADER_VALUE_TRANS_NAME_STOP = 'StopTranscription' 81 | HEADER_VALUE_TRANS_NAME_STARTED = 'TranscriptionStarted' 82 | HEADER_VALUE_TRANS_NAME_SENTENCE_BEGIN = 'SentenceBegin' 83 | HEADER_VALUE_TRANS_NAME_SENTENCE_END = 'SentenceEnd' 84 | HEADER_VALUE_TRANS_NAME_RESULT_CHANGE = 'TranscriptionResultChanged' 85 | HEADER_VALUE_TRANS_NAME_COMPLETED = 'TranscriptionCompleted' 86 | 87 | HEADER_VALUE_TTS_NAMESPACE = 'SpeechSynthesizer' 88 | HEADER_VALUE_TTS_NAME_START = 'StartSynthesis' 89 | HEADER_VALUE_TTS_NAME_COMPLETED = 'SynthesisCompleted' 90 | 91 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_create_token.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | """ 5 | * Copyright 2015 Alibaba Group Holding Limited 6 | * 7 | * Licensed under the Apache License, Version 2.0 (the "License"); 8 | * you may not use this file except in compliance with the License. 9 | * You may obtain a copy of the License at 10 | * 11 | * http://www.apache.org/licenses/LICENSE-2.0 12 | * 13 | * Unless required by applicable law or agreed to in writing, software 14 | * distributed under the License is distributed on an "AS IS" BASIS, 15 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 16 | * See the License for the specific language governing permissions and 17 | * limitations under the License. 
18 | """ 19 | 20 | import base64 21 | import hashlib 22 | import hmac 23 | import requests 24 | import time 25 | import uuid 26 | 27 | from urllib import parse 28 | from ali_speech._logging import _log 29 | 30 | 31 | class AccessToken: 32 | @staticmethod 33 | def _encode_text(text): 34 | encoded_text = parse.quote_plus(text) 35 | return encoded_text.replace('+', '%20').replace('*', '%2A').replace('%7E', '~') 36 | 37 | @staticmethod 38 | def _encode_dict(dic): 39 | keys = dic.keys() 40 | dic_sorted = [(key, dic[key]) for key in sorted(keys)] 41 | encoded_text = parse.urlencode(dic_sorted) 42 | return encoded_text.replace('+', '%20').replace('*', '%2A').replace('%7E', '~') 43 | 44 | @staticmethod 45 | def create_token(access_key_id, access_key_secret): 46 | parameters = {'AccessKeyId': access_key_id, 47 | 'Action': 'CreateToken', 48 | 'Format': 'JSON', 49 | 'RegionId': 'cn-shanghai', 50 | 'SignatureMethod': 'HMAC-SHA1', 51 | 'SignatureNonce': str(uuid.uuid1()), 52 | 'SignatureVersion': '1.0', 53 | 'Timestamp': time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()), 54 | 'Version': '2019-02-28'} 55 | # Build the canonicalized query string 56 | query_string = AccessToken._encode_dict(parameters) 57 | _log.debug('规范化的请求字符串: %s' % query_string) 58 | # Build the string to sign 59 | string_to_sign = 'GET' + '&' + AccessToken._encode_text('/') + '&' + AccessToken._encode_text(query_string) 60 | _log.debug('待签名的字符串: %s' % string_to_sign) 61 | # Compute the signature 62 | secreted_string = hmac.new(bytes(access_key_secret + '&', encoding='utf-8'), 63 | bytes(string_to_sign, encoding='utf-8'), 64 | hashlib.sha1).digest() 65 | signature = base64.b64encode(secreted_string) 66 | _log.debug('签名: %s' % signature) 67 | # URL-encode the signature 68 | signature = AccessToken._encode_text(signature) 69 | _log.debug('URL编码后的签名: %s' % signature) 70 | # Call the service 71 | full_url = 'http://nls-meta.cn-shanghai.aliyuncs.com/?Signature=%s&%s' % (signature, query_string) 72 | _log.debug('url: %s' % full_url) 73 | # Send the HTTP GET request 74 | response = requests.get(full_url) 75 | if
response.ok: 76 | root_obj = response.json() 77 | key = 'Token' 78 | if key in root_obj: 79 | token = root_obj[key]['Id'] 80 | expire_time = root_obj[key]['ExpireTime'] 81 | return token, expire_time 82 | 83 | _log.error(response.text) 84 | return None, None 85 | 86 | 87 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_logging.py: -------------------------------------------------------------------------------- 1 | # -*- coding:utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import logging.handlers 20 | 21 | __all__ = ['_log'] 22 | 23 | FORMAT = '%(asctime)15s %(name)s-%(levelname)s %(funcName)s:%(lineno)s %(message)s' 24 | logging.basicConfig(level=logging.DEBUG, format=FORMAT) 25 | _log = logging.getLogger('alispeech') 26 | 27 | handler = logging.handlers.RotatingFileHandler('alispeech.log', maxBytes=1024 * 1024, 28 | backupCount=5, encoding='utf-8') 29 | handler.setLevel(logging.DEBUG) 30 | handler.setFormatter(logging.Formatter(FORMAT)) 31 | _log.addHandler(handler) 32 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_speech_recognizer.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import json 20 | import six 21 | import websocket 22 | import uuid 23 | import threading 24 | import time 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._constant import Status 28 | from ali_speech._constant import Constant 29 | from ali_speech._speech_reqprotocol import SpeechReqProtocol 30 | 31 | 32 | class SpeechRecognizer(SpeechReqProtocol): 33 | 34 | def __init__(self, callback, url): 35 | super(SpeechRecognizer, self).__init__(callback, url) 36 | self._last_start_retry = False 37 | self._is_connected = False 38 | 39 | self._header[Constant.HEADER_KEY_NAMESPACE] = Constant.HEADER_VALUE_ASR_NAMESPACE 40 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = "pcm" 41 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = 16000 42 | 43 | def set_enable_intermediate_result(self, flag): 44 | self._payload[Constant.PAYLOAD_KEY_ENABLE_INTERMEDIATE_RESULT] = flag 45 | 46 | def set_enable_punctuation_prediction(self, flag): 47 | self._payload[Constant.PAYLOAD_KEY_ENABLE_PUNCTUATION_PREDICTION] = flag 48 | 49 | def set_enable_inverse_text_normalization(self, flag): 50 | self._payload[Constant.PAYLOAD_KEY_ENABLE_ITN] = flag 51 | 52 | def start(self, ping_interval=5, ping_timeout=3): 53 | """ 54 | Start recognition and open a new connection to the server. 55 | :param ping_interval: interval, in seconds, at which ping frames are sent automatically 56 | :param ping_timeout: timeout, in seconds, to wait for the pong reply 57 | :return: 0 if the connection to the server was established, 58 | -1 if the connection to the server failed 59 | """ 60 | if self._status == Status.STATUS_INIT: 61 | _log.debug('starting recognizer...') 62 | self._status = Status.STATUS_STARTING 63 | else: 64 | _log.error("Illegal status: %s" % self._status) 65 | return -1 66 | 67 | def _on_open(ws): 68 | _log.debug('websocket connected') 69 | self._status = Status.STATUS_WS_CONNECTED 70 | self._is_connected = True 71 | msg_id = six.u(uuid.uuid1().hex) 72 | self._task_id = six.u(uuid.uuid1().hex) 73 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_ASR_NAME_START 74 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 75 |
self._header[Constant.HEADER_KEY_TASK_ID] = self._task_id 76 | 77 | text = self.serialize() 78 | _log.info('sending start cmd: ' + text) 79 | ws.send(text) 80 | 81 | def _on_message(ws, raw): 82 | _log.debug('websocket message received: ' + raw) 83 | msg = json.loads(raw) 84 | name = msg[Constant.HEADER][Constant.HEADER_KEY_NAME] 85 | if name == Constant.HEADER_VALUE_ASR_NAME_STARTED: 86 | self._status = Status.STATUS_STARTED 87 | _log.debug('callback on_started') 88 | self._callback.on_started(msg) 89 | elif name == Constant.HEADER_VALUE_ASR_NAME_RESULT_CHANGED: 90 | _log.debug('callback on_result_changed') 91 | self._callback.on_result_changed(msg) 92 | elif name == Constant.HEADER_VALUE_ASR_NAME_COMPLETED: 93 | if self._status == Status.STATUS_STOPPING: 94 | # completed event returned after the client called stop() 95 | self._status = Status.STATUS_STOPPED 96 | else: 97 | # VAD is enabled: completed event sent by the server on its own 98 | self._status = Status.STATUS_COMPLETED_WITH_OUT_STOP 99 | 100 | _log.debug('websocket status changed to stopped') 101 | _log.debug('callback on_completed') 102 | self._callback.on_completed(msg) 103 | elif name == Constant.HEADER_VALUE_NAME_TASK_FAILED: 104 | self._status = Status.STATUS_STOPPED 105 | _log.error(msg) 106 | _log.debug('websocket status changed to stopped') 107 | _log.debug('callback on_task_failed') 108 | self._callback.on_task_failed(msg) 109 | 110 | def _on_close(ws): 111 | _log.debug('callback on_channel_closed') 112 | self._callback.on_channel_closed() 113 | 114 | def _on_error(ws, error): 115 | if self._is_connected or self._last_start_retry: 116 | _log.error(error) 117 | self._status = Status.STATUS_STOPPED 118 | # build the dict directly: formatting the error into a JSON string breaks on quotes 119 | message = {"header": {"namespace": "Default", "name": "TaskFailed", 120 | "status": 400, "message_id": "0", "task_id": "0", 121 | "status_text": str(error)}} 122 | self._callback.on_task_failed(message) 123 | else: 124 | _log.warning('retry start: %s' % error) 125 | 126 | retry_count = 3 127 | for count in range(retry_count): 128 | self._status 
= Status.STATUS_STARTING 129 | if count == (retry_count - 1): 130 | self._last_start_retry = True 131 | 132 | # Init WebSocket 133 | self._ws = websocket.WebSocketApp(self._gateway_url, 134 | on_open=_on_open, 135 | on_message=_on_message, 136 | on_error=_on_error, 137 | on_close=_on_close, 138 | header={Constant.HEADER_TOKEN: self._token}) 139 | 140 | self._thread = threading.Thread(target=self._ws.run_forever, 141 | args=(None, None, ping_interval, ping_timeout)) 142 | self._thread.daemon = True 143 | self._thread.start() 144 | # wait for no more than 10 seconds 145 | for i in range(1000): 146 | if self._status == Status.STATUS_STARTED or self._status == Status.STATUS_STOPPED: 147 | break 148 | else: 149 | time.sleep(0.01) 150 | 151 | if self._status == Status.STATUS_STARTED: 152 | # connection to the server established 153 | _log.debug('start succeed!') 154 | return 0 155 | else: 156 | if self._is_connected or self._last_start_retry: 157 | # the WebSocket link was opened but the server connection failed, or this was the last retry: return -1 158 | _log.error("start failed, status: %s" % self._status) 159 | return -1 160 | else: 161 | # try to reconnect 162 | continue 163 | 164 | def send(self, audio_data): 165 | """ 166 | Send audio data to the server; 1000 to 8000 bytes per call is recommended 167 | :param audio_data: binary audio data 168 | :return: 0 if the data was sent 169 | -1 if sending failed 170 | """ 171 | if self._status == Status.STATUS_STARTED: 172 | self._ws.send(audio_data, opcode=websocket.ABNF.OPCODE_BINARY) 173 | return 0 174 | elif self._status == Status.STATUS_COMPLETED_WITH_OUT_STOP: 175 | _log.info('the recognizer finished with VAD, no need to send data anymore!') 176 | return -1 177 | else: 178 | _log.error('should not send data in state %d', self._status) 179 | return -1 180 | 181 | def stop(self): 182 | """ 183 | Finish recognition and close the connection to the server 184 | :return: 0 if the connection was closed 185 | -1 if closing failed 186 | """ 187 | ret = 0 188 | if self._status == Status.STATUS_COMPLETED_WITH_OUT_STOP: 189 | _log.info('the recognizer finished with VAD') 190 | ret = 0 191 | elif self._status == Status.STATUS_STARTED: 192 | self._status = 
Status.STATUS_STOPPING 193 | msg_id = six.u(uuid.uuid1().hex) 194 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_ASR_NAME_STOP 195 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 196 | self._payload.clear() 197 | text = self.serialize() 198 | _log.info('sending stop cmd: ' + text) 199 | self._ws.send(text) 200 | 201 | for i in range(100): 202 | if self._status == Status.STATUS_STOPPED: 203 | break 204 | else: 205 | time.sleep(0.1) 206 | _log.debug('wait 100ms') 207 | 208 | if self._status != Status.STATUS_STOPPED: 209 | ret = -1 210 | else: 211 | ret = 0 212 | else: 213 | _log.error('should not stop in state %d', self._status) 214 | ret = -1 215 | 216 | return ret 217 | 218 | def close(self): 219 | """ 220 | Close the WebSocket connection 221 | """ 222 | if self._ws: 223 | if self._thread and self._thread.is_alive(): 224 | self._ws.keep_running = False 225 | self._thread.join() 226 | self._ws.close() 227 | 228 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_speech_reqprotocol.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import json 20 | 21 | from ali_speech._constant import Status 22 | from ali_speech._constant import Constant 23 | 24 | 25 | class SpeechReqProtocol: 26 | def __init__(self, callback, url): 27 | self._header = {} 28 | self._context = {} 29 | self._payload = {} 30 | self._token = None 31 | self._gateway_url = url 32 | self._callback = callback 33 | self._status = Status.STATUS_INIT 34 | self._ws = None 35 | self._thread = None 36 | self._task_id = None 37 | 38 | sdk_info = {Constant.CONTEXT_SDK_KEY_NAME: Constant.CONTEXT_SDK_VALUE_NAME, 39 | Constant.CONTEXT_SDK_KEY_VERSION: Constant.CONTEXT_SDK_VALUE_VERSION} 40 | self._context[Constant.CONTEXT_SDK_KEY] = sdk_info 41 | 42 | def set_appkey(self, appkey): 43 | self._header[Constant.HEADER_KEY_APPKEY] = appkey 44 | 45 | def get_appkey(self): 46 | return self._header[Constant.HEADER_KEY_APPKEY] 47 | 48 | def set_token(self, token): 49 | self._token = token 50 | 51 | def get_token(self): 52 | return self._token 53 | 54 | def set_format(self, format): 55 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = format 56 | 57 | def get_format(self): 58 | return self._payload[Constant.PAYLOAD_KEY_FORMAT] 59 | 60 | def set_sample_rate(self, sample_rate): 61 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = sample_rate 62 | 63 | def get_sample_rate(self): 64 | return self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] 65 | 66 | def get_task_id(self): 67 | return self._header[Constant.HEADER_KEY_TASK_ID] 68 | 69 | def put_context(self, key, obj): 70 | self._context[key] = obj 71 | 72 | def add_payload_param(self, key, obj): 73 | self._payload[key] = obj 74 | 75 | def get_status(self): 76 | return self._status 77 | 78 | def serialize(self): 79 | root = {Constant.HEADER: self._header} 80 | 81 | if len(self._payload) != 0: 82 | root[Constant.CONTEXT] = self._context 83 | root[Constant.PAYLOAD] = self._payload 84 | 85 | return json.dumps(root) 86 | 
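The `serialize()` method above attaches `context` and `payload` only when the payload is non-empty, which is why the stop command (sent after `self._payload.clear()`) carries a header alone. A minimal standalone sketch of that rule follows. It uses plain dicts, and the key strings `"header"`, `"context"` and `"payload"` as well as the sample values are assumptions for illustration; the SDK reads the real names from `ali_speech._constant.Constant`.

```python
import json


def serialize(header, context, payload):
    """Standalone mirror of SpeechReqProtocol.serialize() using plain dicts.

    The literal keys "header", "context" and "payload" are assumed here;
    the SDK takes them from Constant.HEADER, Constant.CONTEXT and
    Constant.PAYLOAD.
    """
    root = {"header": header}
    # context and payload are attached only when the payload is non-empty
    if len(payload) != 0:
        root["context"] = context
        root["payload"] = payload
    return json.dumps(root)


# Hypothetical sample values, not taken from the SDK constants
sdk_context = {"sdk": {"name": "demo-sdk", "version": "0.0.0"}}

# A start-style message has a payload, so all three sections are present
start_msg = json.loads(serialize({"name": "Start"}, sdk_context,
                                 {"format": "pcm", "sample_rate": 16000}))

# A stop-style message is serialized after the payload was cleared
stop_msg = json.loads(serialize({"name": "Stop"}, sdk_context, {}))
```

With these inputs, `start_msg` holds the `header`, `context` and `payload` sections, while `stop_msg` holds only `header`.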
-------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_speech_synthesizer.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 17 | """ 18 | 19 | import json 20 | import six 21 | import websocket 22 | import uuid 23 | import threading 24 | import time 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._constant import Status 28 | from ali_speech._constant import Constant 29 | from ali_speech._speech_reqprotocol import SpeechReqProtocol 30 | 31 | 32 | class SpeechSynthesizer(SpeechReqProtocol): 33 | 34 | def __init__(self, callback, url): 35 | super(SpeechSynthesizer, self).__init__(callback, url) 36 | self._last_start_retry = False 37 | self._is_connected = False 38 | 39 | self._header[Constant.HEADER_KEY_NAMESPACE] = Constant.HEADER_VALUE_TTS_NAMESPACE 40 | self._payload[Constant.PAYLOAD_KEY_VOICE] = 'xiaoyun' 41 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = 'pcm' 42 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = 16000 43 | 44 | def set_text(self, text): 45 | self._payload[Constant.PAYLOAD_KEY_TEXT] = text 46 | 47 | def set_voice(self, voice): 48 | self._payload[Constant.PAYLOAD_KEY_VOICE] = voice 49 | 50 | def set_volume(self, volume): 51 | 
self._payload[Constant.PAYLOAD_KEY_VOLUME] = volume 52 | 53 | def set_speech_rate(self, speech_rate): 54 | self._payload[Constant.PAYLOAD_KEY_SPEECH_RATE] = speech_rate 55 | 56 | def set_pitch_rate(self, pitch_rate): 57 | self._payload[Constant.PAYLOAD_KEY_PITCH_RATE] = pitch_rate 58 | 59 | def start(self, ping_interval=5, ping_timeout=3): 60 | """ 61 | Start synthesis and open a new connection to the server 62 | :param ping_interval: interval, in seconds, between automatically sent ping commands 63 | :param ping_timeout: timeout, in seconds, to wait for the pong message 64 | :return: 0 if the connection to the server is established 65 | -1 if the connection to the server fails 66 | """ 67 | if self._status == Status.STATUS_INIT: 68 | _log.debug('starting synthesizer...') 69 | self._status = Status.STATUS_STARTING 70 | else: 71 | _log.error("Illegal status: %s" % self._status) 72 | return -1 73 | 74 | def _on_open(ws): 75 | _log.debug('websocket connected') 76 | self._status = Status.STATUS_STARTED 77 | self._is_connected = True 78 | time.sleep(0.01) 79 | msg_id = six.u(uuid.uuid1().hex) 80 | self._task_id = six.u(uuid.uuid1().hex) 81 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_TTS_NAME_START 82 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 83 | self._header[Constant.HEADER_KEY_TASK_ID] = self._task_id 84 | 85 | text = self.serialize() 86 | _log.info('sending start cmd: ' + text) 87 | ws.send(text) 88 | 89 | def _on_data(ws, raw, opcode, flag): 90 | if opcode == websocket.ABNF.OPCODE_BINARY: 91 | _log.debug("received binary data, size: %s" % len(raw)) 92 | self._callback.on_binary_data_received(raw) 93 | elif opcode == websocket.ABNF.OPCODE_TEXT: 94 | _log.debug("websocket message received: %s" % raw) 95 | msg = json.loads(raw) 96 | name = msg[Constant.HEADER][Constant.HEADER_KEY_NAME] 97 | if name == Constant.HEADER_VALUE_TTS_NAME_COMPLETED: 98 | self._status = Status.STATUS_STOPPED 99 | _log.debug('websocket status changed to stopped') 100 | _log.debug('callback on_completed') 101 | self._callback.on_completed(msg) 102 | elif name == Constant.HEADER_VALUE_NAME_TASK_FAILED: 103 | self._status = 
Status.STATUS_STOPPED 104 | _log.error(msg) 105 | _log.debug('websocket status changed to stopped') 106 | _log.debug('callback on_task_failed') 107 | self._callback.on_task_failed(msg) 108 | 109 | def _on_close(ws): 110 | _log.debug('callback on_channel_closed') 111 | self._callback.on_channel_closed() 112 | 113 | def _on_error(ws, error): 114 | if self._is_connected or self._last_start_retry: 115 | _log.error(error) 116 | self._status = Status.STATUS_STOPPED 117 | # build the dict directly: formatting the error into a JSON string breaks on quotes 118 | message = {"header": {"namespace": "Default", "name": "TaskFailed", 119 | "status": 400, "message_id": "0", "task_id": "0", 120 | "status_text": str(error)}} 121 | self._callback.on_task_failed(message) 122 | else: 123 | _log.warning('retry start: %s' % error) 124 | 125 | retry_count = 3 126 | for count in range(retry_count): 127 | self._status = Status.STATUS_STARTING 128 | if count == (retry_count - 1): 129 | self._last_start_retry = True 130 | 131 | # Init WebSocket 132 | self._ws = websocket.WebSocketApp(self._gateway_url, 133 | on_open=_on_open, 134 | on_data=_on_data, 135 | on_error=_on_error, 136 | on_close=_on_close, 137 | header={Constant.HEADER_TOKEN: self._token}) 138 | 139 | self._thread = threading.Thread(target=self._ws.run_forever, args=(None, None, ping_interval, ping_timeout)) 140 | self._thread.daemon = True 141 | self._thread.start() 142 | # wait for no more than 10 seconds 143 | for i in range(1000): 144 | if self._status == Status.STATUS_STARTED or self._status == Status.STATUS_STOPPED: 145 | break 146 | else: 147 | time.sleep(0.01) 148 | 149 | if self._status == Status.STATUS_STARTED: 150 | # connection to the server established 151 | _log.debug('start succeed!') 152 | return 0 153 | else: 154 | if self._is_connected or self._last_start_retry: 155 | # the WebSocket link was opened but the server connection failed, or this was the last retry: return -1 156 | _log.error("start failed, status: %s" % self._status) 157 | return -1 158 | else: 159 | # try to reconnect 160 | continue 161 | 162 | def wait_completed(self): 163 | """ 164 | Wait for synthesis to finish 165 | :return: 0 when synthesis has finished 
166 | -1 if synthesis timed out 167 | """ 168 | ret = 0 169 | if self._status == Status.STATUS_STARTED: 170 | for i in range(100): 171 | if self._status == Status.STATUS_STOPPED: 172 | break 173 | else: 174 | time.sleep(0.1) 175 | _log.debug('wait 100ms') 176 | 177 | if self._status != Status.STATUS_STOPPED: 178 | ret = -1 179 | else: 180 | ret = 0 181 | else: 182 | _log.error('should not wait completed in state %d', self._status) 183 | ret = -1 184 | 185 | return ret 186 | 187 | def close(self): 188 | """ 189 | Close the WebSocket connection 190 | :return: 191 | """ 192 | if self._ws: 193 | if self._thread and self._thread.is_alive(): 194 | self._ws.keep_running = False 195 | self._thread.join() 196 | self._ws.close() 197 | 198 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/_speech_transcriber.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import json 20 | import six 21 | import websocket 22 | import uuid 23 | import threading 24 | import time 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._constant import Status 28 | from ali_speech._constant import Constant 29 | from ali_speech._speech_reqprotocol import SpeechReqProtocol 30 | 31 | 32 | class SpeechTranscriber(SpeechReqProtocol): 33 | def __init__(self, callback, url): 34 | super(SpeechTranscriber, self).__init__(callback, url) 35 | self._last_start_retry = False 36 | self._is_connected = False 37 | 38 | self._header[Constant.HEADER_KEY_NAMESPACE] = Constant.HEADER_VALUE_TRANS_NAMESPACE 39 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = 'pcm' 40 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = 16000 41 | 42 | def set_enable_intermediate_result(self, flag): 43 | self._payload[Constant.PAYLOAD_KEY_ENABLE_INTERMEDIATE_RESULT] = flag 44 | 45 | def set_enable_punctuation_prediction(self, flag): 46 | self._payload[Constant.PAYLOAD_KEY_ENABLE_PUNCTUATION_PREDICTION] = flag 47 | 48 | def set_enable_inverse_text_normalization(self, flag): 49 | self._payload[Constant.PAYLOAD_KEY_ENABLE_ITN] = flag 50 | 51 | def start(self, ping_interval=5, ping_timeout=3): 52 | """ 53 | Start recognition and open a new connection to the server 54 | :param ping_interval: interval, in seconds, between automatically sent ping commands 55 | :param ping_timeout: timeout, in seconds, to wait for the pong message 56 | :return: 0 if the connection to the server is established 57 | -1 if the connection to the server fails 58 | """ 59 | if self._status == Status.STATUS_INIT: 60 | _log.debug('starting transcriber...') 61 | self._status = Status.STATUS_STARTING 62 | else: 63 | _log.error("Illegal status: %s" % self._status) 64 | return -1 65 | 66 | def _on_open(ws): 67 | _log.debug('websocket connected') 68 | self._status = Status.STATUS_WS_CONNECTED 69 | self._is_connected = True 70 | msg_id = six.u(uuid.uuid1().hex) 71 | self._task_id = six.u(uuid.uuid1().hex) 72 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_TRANS_NAME_START 73 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 74 | 
self._header[Constant.HEADER_KEY_TASK_ID] = self._task_id 75 | 76 | text = self.serialize() 77 | _log.info('sending start cmd: ' + text) 78 | ws.send(text) 79 | 80 | def _on_message(ws, raw): 81 | _log.debug('websocket message received: ' + raw) 82 | msg = json.loads(raw) 83 | name = msg[Constant.HEADER][Constant.HEADER_KEY_NAME] 84 | if name == Constant.HEADER_VALUE_TRANS_NAME_STARTED: 85 | self._status = Status.STATUS_STARTED 86 | _log.debug('callback on_started') 87 | self._callback.on_started(msg) 88 | elif name == Constant.HEADER_VALUE_TRANS_NAME_RESULT_CHANGE: 89 | _log.debug('callback on_result_changed') 90 | self._callback.on_result_changed(msg) 91 | elif name == Constant.HEADER_VALUE_TRANS_NAME_SENTENCE_BEGIN: 92 | _log.debug('callback on_sentence_begin') 93 | self._callback.on_sentence_begin(msg) 94 | elif name == Constant.HEADER_VALUE_TRANS_NAME_SENTENCE_END: 95 | _log.debug('callback on_sentence_end') 96 | self._callback.on_sentence_end(msg) 97 | elif name == Constant.HEADER_VALUE_TRANS_NAME_COMPLETED: 98 | self._status = Status.STATUS_STOPPED 99 | _log.debug('websocket status changed to stopped') 100 | _log.debug('callback on_completed') 101 | self._callback.on_completed(msg) 102 | elif name == Constant.HEADER_VALUE_NAME_TASK_FAILED: 103 | self._status = Status.STATUS_STOPPED 104 | _log.error(msg) 105 | _log.debug('websocket status changed to stopped') 106 | _log.debug('callback on_task_failed') 107 | self._callback.on_task_failed(msg) 108 | 109 | def _on_close(ws): 110 | _log.debug('callback on_channel_closed') 111 | self._callback.on_channel_closed() 112 | 113 | def _on_error(ws, error): 114 | if self._is_connected or self._last_start_retry: 115 | _log.error(error) 116 | self._status = Status.STATUS_STOPPED 117 | # build the dict directly: formatting the error into a JSON string breaks on quotes 118 | message = {"header": {"namespace": "Default", "name": "TaskFailed", 119 | "status": 400, "message_id": "0", "task_id": "0", 120 | "status_text": str(error)}} 121 | self._callback.on_task_failed(message) 122 | else: 123 | 
_log.warning('retry start: %s' % error) 124 | 125 | retry_count = 3 126 | for count in range(retry_count): 127 | self._status = Status.STATUS_STARTING 128 | if count == (retry_count - 1): 129 | self._last_start_retry = True 130 | 131 | # Init WebSocket 132 | self._ws = websocket.WebSocketApp(self._gateway_url, 133 | on_open=_on_open, 134 | on_message=_on_message, 135 | on_error=_on_error, 136 | on_close=_on_close, 137 | header={Constant.HEADER_TOKEN: self._token}) 138 | 139 | self._thread = threading.Thread(target=self._ws.run_forever, args=(None, None, ping_interval, ping_timeout)) 140 | self._thread.daemon = True 141 | self._thread.start() 142 | # wait for no more than 10 seconds (1000 polls x 0.01 s) 143 | for i in range(1000): 144 | if self._status == Status.STATUS_STARTED or self._status == Status.STATUS_STOPPED: 145 | break 146 | else: 147 | time.sleep(0.01) 148 | 149 | if self._status == Status.STATUS_STARTED: 150 | _log.debug('start succeed!') 151 | return 0 152 | else: 153 | if self._is_connected or self._last_start_retry: 154 | _log.error("start failed, status: %s" % self._status) 155 | return -1 156 | else: 157 | continue 158 | 159 | def send(self, audio_data): 160 | """ 161 | Send audio data to the server; 1000 to 8000 bytes per call is recommended 162 | :param audio_data: binary audio data 163 | :return: 0 if the data was sent 164 | -1 if sending failed 165 | """ 166 | if self._status == Status.STATUS_STARTED: 167 | self._ws.send(audio_data, opcode=websocket.ABNF.OPCODE_BINARY) 168 | return 0 169 | else: 170 | _log.error('should not send data in state %d', self._status) 171 | return -1 172 | 173 | def stop(self): 174 | """ 175 | Finish recognition and close the connection to the server 176 | :return: 0 if the connection was closed 177 | -1 if closing failed 178 | """ 179 | ret = 0 180 | if self._status == Status.STATUS_STARTED: 181 | self._status = Status.STATUS_STOPPING 182 | msg_id = six.u(uuid.uuid1().hex) 183 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_TRANS_NAME_STOP 184 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 185 | self._payload.clear() 186 | text = self.serialize() 187 | 
_log.info('sending stop cmd: ' + text) 188 | self._ws.send(text) 189 | 190 | for i in range(100): 191 | if self._status == Status.STATUS_STOPPED: 192 | break 193 | else: 194 | time.sleep(0.1) 195 | _log.debug('wait 100ms') 196 | 197 | if self._status != Status.STATUS_STOPPED: 198 | ret = -1 199 | else: 200 | ret = 0 201 | else: 202 | _log.error('should not stop in state %d', self._status) 203 | ret = -1 204 | 205 | return ret 206 | 207 | def close(self): 208 | """ 209 | Close the WebSocket connection 210 | :return: 211 | """ 212 | if self._ws: 213 | if self._thread and self._thread.is_alive(): 214 | self._ws.keep_running = False 215 | self._thread.join() 216 | self._ws.close() 217 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/callbacks.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
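17 | """ 18 | 19 | 20 | class SpeechRecognizerCallback: 21 | """ 22 | * @brief After start() is called and the connection to the service is established, the SDK's internal thread reports the started event 23 | * @note Do not call stop() inside the callback 24 | * @param message the response returned by the service 25 | * @return 26 | """ 27 | def on_started(self, message): 28 | raise Exception('Not implemented!') 29 | 30 | """ 31 | * @brief When intermediate results are enabled and the SDK receives one from the service, the SDK's internal thread reports the ResultChanged event 32 | * @note Do not call stop() inside the callback 33 | * @param message the response returned by the service 34 | * @return 35 | """ 36 | def on_result_changed(self, message): 37 | raise Exception('Not implemented!') 38 | 39 | """ 40 | * @brief When the SDK receives the recognition-finished message from the service, its internal thread reports the Completed event 41 | * @note After the Completed event is reported, the SDK closes the recognition channel. send() then returns -1, so stop sending. 42 | * Do not call stop() inside the callback 43 | * @param message the response returned by the service 44 | * @return 45 | """ 46 | def on_completed(self, message): 47 | raise Exception('Not implemented!') 48 | 49 | """ 50 | * @brief When an error occurs during recognition (including start(), send(), stop()), the SDK's internal thread reports the TaskFailed event 51 | * @note After the TaskFailed event is reported, the SDK closes the recognition channel. send() then returns -1, so stop sending. 52 | * Do not call stop() inside the callback 53 | * @param message the response returned by the service 54 | * @return 55 | """ 56 | def on_task_failed(self, message): 57 | raise Exception('Not implemented!') 58 | 59 | """ 60 | * @brief The websocket channel is closed when recognition finishes or an error occurs 61 | * @note Do not call stop() inside the callback 62 | * @return 63 | """ 64 | def on_channel_closed(self): 65 | raise Exception('Not implemented!') 66 | 67 | 68 | class SpeechTranscriberCallback: 69 | """ 70 | * @brief After start() is called and the connection to the service is established, the SDK's internal thread reports the started event 71 | * @note Do not call stop() inside the callback 72 | * @param message the response returned by the service 73 | * @return 74 | """ 75 | def on_started(self, message): 76 | raise Exception('Not implemented!') 77 | 78 | """ 79 | * @brief When intermediate results are enabled and the SDK receives one from the service, the SDK's internal thread reports the ResultChanged event 80 | * @note Do not call stop() inside the callback 81 | * @param message the response returned by the service 82 | * @return 83 | """ 84 | def on_result_changed(self, message): 85 | raise Exception('Not implemented!') 86 | 87 | """ 88 | * @brief When the service reports that the start of a sentence was detected, the SDK's internal thread reports the SentenceBegin event 89 | * @note 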
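This event marks the detected start of a sentence. Do not call stop() inside the callback 90 | * @param message the response returned by the service 91 | * @return 92 | """ 93 | def on_sentence_begin(self, message): 94 | raise Exception('Not implemented!') 95 | 96 | """ 97 | * @brief When the service reports that the end of a sentence was detected, the SDK's internal thread reports the SentenceEnd event 98 | * @note This event marks the detected end of a sentence. Do not call stop() inside the callback 99 | * @param message the response returned by the service 100 | * @return 101 | """ 102 | def on_sentence_end(self, message): 103 | raise Exception('Not implemented!') 104 | 105 | """ 106 | * @brief When the SDK receives the recognition-finished message from the service, its internal thread reports the Completed event 107 | * @note After the Completed event is reported, the SDK closes the recognition channel. send() then returns -1, so stop sending. 108 | * Do not call stop() inside the callback 109 | * @param message the response returned by the service 110 | * @return 111 | """ 112 | def on_completed(self, message): 113 | raise Exception('Not implemented!') 114 | 115 | """ 116 | * @brief When an error occurs during recognition (including start(), send(), stop()), the SDK's internal thread reports the TaskFailed event 117 | * @note After the TaskFailed event is reported, the SDK closes the recognition channel. send() then returns -1, so stop sending. 118 | * Do not call stop() inside the callback 119 | * @param message the response returned by the service 120 | * @return 121 | """ 122 | def on_task_failed(self, message): 123 | raise Exception('Not implemented!') 124 | 125 | """ 126 | * @brief The websocket channel is closed when recognition finishes or an error occurs 127 | * @note Do not call stop() inside the callback 128 | * @return 129 | """ 130 | def on_channel_closed(self): 131 | raise Exception('Not implemented!') 132 | 133 | 134 | class SpeechSynthesizerCallback: 135 | 136 | def on_binary_data_received(self, raw): 137 | raise Exception('Not implemented!') 138 | 139 | """ 140 | * @brief When the SDK receives the recognition-finished message from the service, its internal thread reports the Completed event 141 | * @note After the Completed event is reported, the SDK closes the recognition channel. send() then returns -1, so stop sending. 142 | * Do not call stop() inside the callback 143 | * @param message the response returned by the service 144 | * @return 145 | """ 146 | def on_completed(self, message): 147 | raise Exception('Not implemented!') 148 | 149 | """ 150 | * @brief When an error occurs during recognition (including start(), send(), stop()), the SDK's internal thread reports the TaskFailed event 151 | * @note After the TaskFailed event is reported, the SDK closes the recognition channel. send() then returns -1, so stop sending. 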
152 | * Do not call stop() inside the callback 153 | * @param message the response returned by the service 154 | * @return 155 | """ 156 | def on_task_failed(self, message): 157 | raise Exception('Not implemented!') 158 | 159 | """ 160 | * @brief The websocket channel is closed when recognition finishes or an error occurs 161 | * @note Do not call stop() inside the callback 162 | * @return 163 | """ 164 | def on_channel_closed(self): 165 | raise Exception('Not implemented!') 166 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/ali_speech/constant.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | 20 | class ASRFormat: 21 | PCM = 'pcm' 22 | OPUS = 'opus' 23 | 24 | 25 | class TTSFormat: 26 | PCM = 'pcm' 27 | WAV = 'wav' 28 | MP3 = 'mp3' 29 | 30 | 31 | class ASRSampleRate: 32 | SAMPLE_RATE_8K = 8000 33 | SAMPLE_RATE_16K = 16000 34 | 35 | 36 | class TTSSampleRate: 37 | SAMPLE_RATE_8K = 8000 38 | SAMPLE_RATE_16K = 16000 39 | SAMPLE_RATE_24K = 24000 40 | 41 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/alibabacloud_nls_java_sdk.egg-info/PKG-INFO: -------------------------------------------------------------------------------- 1 | Metadata-Version: 1.1 2 | Name: alibabacloud-nls-java-sdk 3 | Version: 2.0.0 4 | Summary: ali_speech python sdk 5 | Home-page: https://github.com/aliyun/alibabacloud-nls-python-sdk.git 6 | Author: Alibaba Cloud NLS Team 7 | Author-email: nls-system-client@list.alibaba-inc.com 8 | License: Apache License 2.0 9 | Description: UNKNOWN 10 | Platform: UNKNOWN 11 | Classifier: Development Status :: 5 - Production/Stable 12 | Classifier: Intended Audience :: Developers 13 | Classifier: License :: OSI Approved :: Apache Software License 14 | Classifier: Programming Language :: Python :: 3 15 | Classifier: Programming Language :: Python :: 3.4 16 | Classifier: Programming Language :: Python :: 3.5 17 | Classifier: Programming Language :: Python :: 3.6 18 | Classifier: Programming Language :: Python :: 3.7 19 | Classifier: Topic :: Software Development 20 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/alibabacloud_nls_java_sdk.egg-info/SOURCES.txt: -------------------------------------------------------------------------------- 1 | README.rst 2 | setup.py 3 | ali_speech/__init__.py 4 | ali_speech/_client.py 5 | ali_speech/_constant.py 6 | ali_speech/_create_token.py 7 | ali_speech/_logging.py 8 | ali_speech/_speech_recognizer.py 9 | ali_speech/_speech_reqprotocol.py 10 | 
ali_speech/_speech_synthesizer.py 11 | ali_speech/_speech_transcriber.py 12 | ali_speech/callbacks.py 13 | ali_speech/constant.py 14 | alibabacloud_nls_java_sdk.egg-info/PKG-INFO 15 | alibabacloud_nls_java_sdk.egg-info/SOURCES.txt 16 | alibabacloud_nls_java_sdk.egg-info/dependency_links.txt 17 | alibabacloud_nls_java_sdk.egg-info/requires.txt 18 | alibabacloud_nls_java_sdk.egg-info/top_level.txt -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/alibabacloud_nls_java_sdk.egg-info/dependency_links.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/alibabacloud_nls_java_sdk.egg-info/requires.txt: -------------------------------------------------------------------------------- 1 | websocket-client 2 | requests 3 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/alibabacloud_nls_java_sdk.egg-info/top_level.txt: -------------------------------------------------------------------------------- 1 | ali_speech 2 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | * Copyright 2015 Alibaba Group Holding Limited 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | """ 16 | 17 | from ali_speech._client import NlsClient 18 | 19 | __version__ = "0.1.0" 20 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_client.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import websocket 20 | 21 | try: 22 | import thread 23 | except ImportError: 24 | import _thread as thread 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._create_token import AccessToken 28 | from ali_speech._speech_recognizer import SpeechRecognizer 29 | from ali_speech._speech_transcriber import SpeechTranscriber 30 | from ali_speech._speech_synthesizer import SpeechSynthesizer 31 | 32 | __all__ = ["NlsClient"] 33 | 34 | 35 | class NlsClient: 36 | URL_GATEWAY = 'wss://nls-gateway.cn-shanghai.aliyuncs.com/ws/v1' 37 | 38 | def __init__(self): 39 | websocket.enableTrace(False) 40 | 41 | @staticmethod 42 | def set_log_level(level): 43 | _log.setLevel(level) 44 | 45 | @staticmethod 46 | def create_token(access_key_id, access_key_secret): 47 | return AccessToken.create_token(access_key_id, access_key_secret) 48 | 49 | def create_recognizer(self, callback, gateway_url=URL_GATEWAY): 50 | request = SpeechRecognizer(callback, gateway_url) 51 | return request 52 | 53 | def create_transcriber(self, callback, gateway_url=URL_GATEWAY): 54 | transcriber = SpeechTranscriber(callback, gateway_url) 55 | return transcriber 56 | 57 | def create_synthesizer(self, callback, gateway_url=URL_GATEWAY): 58 | synthesizer = SpeechSynthesizer(callback, gateway_url) 59 | return synthesizer 60 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_constant.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 
8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 17 | """ 18 | 19 | 20 | class Status: 21 | # initial state 22 | STATUS_INIT = 1 23 | # WebSocket network connection established (on_open) 24 | STATUS_WS_CONNECTED = 2 25 | # connecting to the gateway service 26 | STATUS_STARTING = 3 27 | # connected to the service (on_message RecognitionStarted) 28 | STATUS_STARTED = 4 29 | # the client is actively disconnecting 30 | STATUS_STOPPING = 5 31 | # disconnected from the service 32 | STATUS_STOPPED = 6 33 | # VAD enabled: the server sent the completed event on its own initiative 34 | STATUS_COMPLETED_WITH_OUT_STOP = 7 35 | 36 | 37 | class Constant: 38 | CONTEXT = 'context' 39 | CONTEXT_SDK_KEY = 'sdk' 40 | CONTEXT_SDK_KEY_NAME = 'name' 41 | CONTEXT_SDK_VALUE_NAME = 'nls-sdk-python' 42 | CONTEXT_SDK_KEY_VERSION = 'version' 43 | CONTEXT_SDK_VALUE_VERSION = '2.0.1' 44 | 45 | HEADER_TOKEN = 'X-NLS-Token' 46 | 47 | HEADER = 'header' 48 | HEADER_KEY_NAMESPACE = 'namespace' 49 | HEADER_KEY_NAME = 'name' 50 | HEADER_KEY_MESSAGE_ID = 'message_id' 51 | HEADER_KEY_APPKEY = 'appkey' 52 | HEADER_KEY_TASK_ID = 'task_id' 53 | HEADER_KEY_STATUS = 'status' 54 | HEADER_KEY_STATUS_TEXT = 'status_text' 55 | 56 | PAYLOAD = 'payload' 57 | PAYLOAD_KEY_SAMPLE_RATE = 'sample_rate' 58 | PAYLOAD_KEY_FORMAT = 'format' 59 | PAYLOAD_KEY_ENABLE_ITN = 'enable_inverse_text_normalization' 60 | PAYLOAD_KEY_ENABLE_INTERMEDIATE_RESULT = 'enable_intermediate_result' 61 | PAYLOAD_KEY_ENABLE_PUNCTUATION_PREDICTION = 'enable_punctuation_prediction' 62 | 63 | PAYLOAD_KEY_VOICE = 'voice' 64 | PAYLOAD_KEY_TEXT = 'text' 65 | PAYLOAD_KEY_VOLUME = 'volume' 66 | PAYLOAD_KEY_SPEECH_RATE = 'speech_rate' 67 | PAYLOAD_KEY_PITCH_RATE = 'pitch_rate' 68 | 69 | HEADER_VALUE_ASR_NAMESPACE = 
'SpeechRecognizer' 70 | HEADER_VALUE_ASR_NAME_START = 'StartRecognition' 71 | HEADER_VALUE_ASR_NAME_STOP = 'StopRecognition' 72 | HEADER_VALUE_ASR_NAME_STARTED = 'RecognitionStarted' 73 | HEADER_VALUE_ASR_NAME_RESULT_CHANGED = 'RecognitionResultChanged' 74 | HEADER_VALUE_ASR_NAME_COMPLETED = 'RecognitionCompleted' 75 | 76 | HEADER_VALUE_NAME_TASK_FAILED = 'TaskFailed' 77 | 78 | HEADER_VALUE_TRANS_NAMESPACE = 'SpeechTranscriber' 79 | HEADER_VALUE_TRANS_NAME_START = 'StartTranscription' 80 | HEADER_VALUE_TRANS_NAME_STOP = 'StopTranscription' 81 | HEADER_VALUE_TRANS_NAME_STARTED = 'TranscriptionStarted' 82 | HEADER_VALUE_TRANS_NAME_SENTENCE_BEGIN = 'SentenceBegin' 83 | HEADER_VALUE_TRANS_NAME_SENTENCE_END = 'SentenceEnd' 84 | HEADER_VALUE_TRANS_NAME_RESULT_CHANGE = 'TranscriptionResultChanged' 85 | HEADER_VALUE_TRANS_NAME_COMPLETED = 'TranscriptionCompleted' 86 | 87 | HEADER_VALUE_TTS_NAMESPACE = 'SpeechSynthesizer' 88 | HEADER_VALUE_TTS_NAME_START = 'StartSynthesis' 89 | HEADER_VALUE_TTS_NAME_COMPLETED = 'SynthesisCompleted' 90 | 91 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_create_token.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | """ 5 | * Copyright 2015 Alibaba Group Holding Limited 6 | * 7 | * Licensed under the Apache License, Version 2.0 (the "License"); 8 | * you may not use this file except in compliance with the License. 9 | * You may obtain a copy of the License at 10 | * 11 | * http://www.apache.org/licenses/LICENSE-2.0 12 | * 13 | * Unless required by applicable law or agreed to in writing, software 14 | * distributed under the License is distributed on an "AS IS" BASIS, 15 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 16 | * See the License for the specific language governing permissions and 17 | * limitations under the License. 
18 | """ 19 | 20 | import base64 21 | import hashlib 22 | import hmac 23 | import requests 24 | import time 25 | import uuid 26 | 27 | from urllib import parse 28 | from ali_speech._logging import _log 29 | 30 | 31 | class AccessToken: 32 | @staticmethod 33 | def _encode_text(text): 34 | encoded_text = parse.quote_plus(text) 35 | return encoded_text.replace('+', '%20').replace('*', '%2A').replace('%7E', '~') 36 | 37 | @staticmethod 38 | def _encode_dict(dic): 39 | keys = dic.keys() 40 | dic_sorted = [(key, dic[key]) for key in sorted(keys)] 41 | encoded_text = parse.urlencode(dic_sorted) 42 | return encoded_text.replace('+', '%20').replace('*', '%2A').replace('%7E', '~') 43 | 44 | @staticmethod 45 | def create_token(access_key_id, access_key_secret): 46 | parameters = {'AccessKeyId': access_key_id, 47 | 'Action': 'CreateToken', 48 | 'Format': 'JSON', 49 | 'RegionId': 'cn-shanghai', 50 | 'SignatureMethod': 'HMAC-SHA1', 51 | 'SignatureNonce': str(uuid.uuid1()), 52 | 'SignatureVersion': '1.0', 53 | 'Timestamp': time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()), 54 | 'Version': '2019-02-28'} 55 | # build the canonicalized query string 56 | query_string = AccessToken._encode_dict(parameters) 57 | _log.debug('canonicalized query string: %s' % query_string) 58 | # build the string to sign 59 | string_to_sign = 'GET' + '&' + AccessToken._encode_text('/') + '&' + AccessToken._encode_text(query_string) 60 | _log.debug('string to sign: %s' % string_to_sign) 61 | # compute the signature 62 | secreted_string = hmac.new(bytes(access_key_secret + '&', encoding='utf-8'), 63 | bytes(string_to_sign, encoding='utf-8'), 64 | hashlib.sha1).digest() 65 | signature = base64.b64encode(secreted_string) 66 | _log.debug('signature: %s' % signature) 67 | # URL-encode the signature 68 | signature = AccessToken._encode_text(signature) 69 | _log.debug('URL-encoded signature: %s' % signature) 70 | # call the service 71 | full_url = 'http://nls-meta.cn-shanghai.aliyuncs.com/?Signature=%s&%s' % (signature, query_string) 72 | _log.debug('url: %s' % full_url) 73 | # send the HTTP GET request 74 | response = requests.get(full_url) 75 | if 
response.ok: 76 | root_obj = response.json() 77 | key = 'Token' 78 | if key in root_obj: 79 | token = root_obj[key]['Id'] 80 | expire_time = root_obj[key]['ExpireTime'] 81 | return token, expire_time 82 | 83 | _log.error(response.text) 84 | return None, None 85 | 86 | 87 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_logging.py: -------------------------------------------------------------------------------- 1 | # -*- coding:utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import logging.handlers 20 | 21 | __all__ = ['_log'] 22 | 23 | FORMAT = '%(asctime)15s %(name)s-%(levelname)s %(funcName)s:%(lineno)s %(message)s' 24 | logging.basicConfig(level=logging.DEBUG, format=FORMAT) 25 | _log = logging.getLogger('alispeech') 26 | 27 | handler = logging.handlers.RotatingFileHandler('alispeech.log', maxBytes=1024 * 1024, 28 | backupCount=5, encoding='utf-8') 29 | handler.setLevel(logging.DEBUG) 30 | handler.setFormatter(logging.Formatter(FORMAT)) 31 | _log.addHandler(handler) 32 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_speech_recognizer.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import json 20 | import six 21 | import websocket 22 | import uuid 23 | import threading 24 | import time 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._constant import Status 28 | from ali_speech._constant import Constant 29 | from ali_speech._speech_reqprotocol import SpeechReqProtocol 30 | 31 | 32 | class SpeechRecognizer(SpeechReqProtocol): 33 | 34 | def __init__(self, callback, url): 35 | super(SpeechRecognizer, self).__init__(callback, url) 36 | self._last_start_retry = False 37 | self._is_connected = False 38 | 39 | self._header[Constant.HEADER_KEY_NAMESPACE] = Constant.HEADER_VALUE_ASR_NAMESPACE 40 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = "pcm" 41 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = 16000 42 | 43 | def set_enable_intermediate_result(self, flag): 44 | self._payload[Constant.PAYLOAD_KEY_ENABLE_INTERMEDIATE_RESULT] = flag 45 | 46 | def set_enable_punctuation_prediction(self, flag): 47 | self._payload[Constant.PAYLOAD_KEY_ENABLE_PUNCTUATION_PREDICTION] = flag 48 | 49 | def set_enable_inverse_text_normalization(self, flag): 50 | self._payload[Constant.PAYLOAD_KEY_ENABLE_ITN] = flag 51 | 52 | def start(self, ping_interval=5, ping_timeout=3): 53 | """ 54 | Start recognition and open a new connection to the server 55 | :param ping_interval: interval, in seconds, between automatically sent ping frames 56 | :param ping_timeout: timeout, in seconds, to wait for the pong reply 57 | :return: 0 if the connection to the server is established, 58 | -1 if it fails 59 | """ 60 | if self._status == Status.STATUS_INIT: 61 | _log.debug('starting recognizer...') 62 | self._status = Status.STATUS_STARTING 63 | else: 64 | _log.error("Illegal status: %s" % self._status) 65 | return -1 66 | 67 | def _on_open(ws): 68 | _log.debug('websocket connected') 69 | self._status = Status.STATUS_WS_CONNECTED 70 | self._is_connected = True 71 | msg_id = six.u(uuid.uuid1().hex) 72 | self._task_id = six.u(uuid.uuid1().hex) 73 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_ASR_NAME_START 74 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 75 | 
self._header[Constant.HEADER_KEY_TASK_ID] = self._task_id 76 | 77 | text = self.serialize() 78 | _log.info('sending start cmd: ' + text) 79 | ws.send(text) 80 | 81 | def _on_message(ws, raw): 82 | _log.debug('websocket message received: ' + raw) 83 | msg = json.loads(raw) 84 | name = msg[Constant.HEADER][Constant.HEADER_KEY_NAME] 85 | if name == Constant.HEADER_VALUE_ASR_NAME_STARTED: 86 | self._status = Status.STATUS_STARTED 87 | _log.debug('callback on_started') 88 | self._callback.on_started(msg) 89 | elif name == Constant.HEADER_VALUE_ASR_NAME_RESULT_CHANGED: 90 | _log.debug('callback on_result_changed') 91 | self._callback.on_result_changed(msg) 92 | elif name == Constant.HEADER_VALUE_ASR_NAME_COMPLETED: 93 | if self._status == Status.STATUS_STOPPING: 94 | # completed event returned after the client called stop() 95 | self._status = Status.STATUS_STOPPED 96 | else: 97 | # VAD enabled: completed event sent by the server on its own initiative 98 | self._status = Status.STATUS_COMPLETED_WITH_OUT_STOP 99 | 100 | _log.debug('websocket status changed to stopped') 101 | _log.debug('callback on_completed') 102 | self._callback.on_completed(msg) 103 | elif name == Constant.HEADER_VALUE_NAME_TASK_FAILED: 104 | self._status = Status.STATUS_STOPPED 105 | _log.error(msg) 106 | _log.debug('websocket status changed to stopped') 107 | _log.debug('callback on_task_failed') 108 | self._callback.on_task_failed(msg) 109 | 110 | def _on_close(ws): 111 | _log.debug('callback on_channel_closed') 112 | self._callback.on_channel_closed() 113 | 114 | def _on_error(ws, error): 115 | if self._is_connected or self._last_start_retry: 116 | _log.error(error) 117 | self._status = Status.STATUS_STOPPED 118 | # build the dict directly: formatting the error into a JSON string fails when it contains quotes 119 | message = {'header': {'namespace': 'Default', 'name': 'TaskFailed', 120 | 'status': 400, 'message_id': '0', 'task_id': '0', 121 | 'status_text': str(error)}} 122 | self._callback.on_task_failed(message) 123 | else: 124 | _log.warning('retry start: %s' % error) 125 | 126 | retry_count = 3 127 | for count in range(retry_count): 128 | self._status 
= Status.STATUS_STARTING 129 | if count == (retry_count - 1): 130 | self._last_start_retry = True 131 | 132 | # Init WebSocket 133 | self._ws = websocket.WebSocketApp(self._gateway_url, 134 | on_open=_on_open, 135 | on_message=_on_message, 136 | on_error=_on_error, 137 | on_close=_on_close, 138 | header={Constant.HEADER_TOKEN: self._token}) 139 | 140 | self._thread = threading.Thread(target=self._ws.run_forever, 141 | args=(None, None, ping_interval, ping_timeout)) 142 | self._thread.daemon = True 143 | self._thread.start() 144 | # wait for no more than 10 seconds 145 | for i in range(1000): 146 | if self._status == Status.STATUS_STARTED or self._status == Status.STATUS_STOPPED: 147 | break 148 | else: 149 | time.sleep(0.01) 150 | 151 | if self._status == Status.STATUS_STARTED: 152 | # connection to the server established 153 | _log.debug('start succeed!') 154 | return 0 155 | else: 156 | if self._is_connected or self._last_start_retry: 157 | # the WebSocket connected but the connection to the service failed, or this was the last retry: return -1 158 | _log.error("start failed, status: %s" % self._status) 159 | return -1 160 | else: 161 | # retry the connection 162 | continue 163 | 164 | def send(self, audio_data): 165 | """ 166 | Send audio data to the server; 1000 to 8000 bytes per call is recommended 167 | :param audio_data: binary audio data 168 | :return: 0 if the data was sent, 169 | -1 if sending failed 170 | """ 171 | if self._status == Status.STATUS_STARTED: 172 | self._ws.send(audio_data, opcode=websocket.ABNF.OPCODE_BINARY) 173 | return 0 174 | elif self._status == Status.STATUS_COMPLETED_WITH_OUT_STOP: 175 | _log.info('the recognizer finished with VAD, no need to send data anymore!') 176 | return -1 177 | else: 178 | _log.error('should not send data in state %d', self._status) 179 | return -1 180 | 181 | def stop(self): 182 | """ 183 | Stop recognition and close the connection to the server 184 | :return: 0 if the connection was closed, 185 | -1 if closing failed 186 | """ 187 | ret = 0 188 | if self._status == Status.STATUS_COMPLETED_WITH_OUT_STOP: 189 | _log.info('the recognizer finished with VAD') 190 | ret = 0 191 | elif self._status == Status.STATUS_STARTED: 192 | self._status = 
Status.STATUS_STOPPING 193 | msg_id = six.u(uuid.uuid1().hex) 194 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_ASR_NAME_STOP 195 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 196 | self._payload.clear() 197 | text = self.serialize() 198 | _log.info('sending stop cmd: ' + text) 199 | self._ws.send(text) 200 | 201 | for i in range(100): 202 | if self._status == Status.STATUS_STOPPED: 203 | break 204 | else: 205 | time.sleep(0.1) 206 | _log.debug('wait 100ms') 207 | 208 | if self._status != Status.STATUS_STOPPED: 209 | ret = -1 210 | else: 211 | ret = 0 212 | else: 213 | _log.error('should not stop in state %d', self._status) 214 | ret = -1 215 | 216 | return ret 217 | 218 | def close(self): 219 | """ 220 | Close the WebSocket connection 221 | """ 222 | if self._ws: 223 | if self._thread and self._thread.is_alive(): 224 | self._ws.keep_running = False 225 | self._thread.join() 226 | self._ws.close() 227 | 228 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_speech_reqprotocol.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import json 20 | 21 | from ali_speech._constant import Status 22 | from ali_speech._constant import Constant 23 | 24 | 25 | class SpeechReqProtocol: 26 | def __init__(self, callback, url): 27 | self._header = {} 28 | self._context = {} 29 | self._payload = {} 30 | self._token = None 31 | self._gateway_url = url 32 | self._callback = callback 33 | self._status = Status.STATUS_INIT 34 | self._ws = None 35 | self._thread = None 36 | self._task_id = None 37 | 38 | sdk_info = {Constant.CONTEXT_SDK_KEY_NAME: Constant.CONTEXT_SDK_VALUE_NAME, 39 | Constant.CONTEXT_SDK_KEY_VERSION: Constant.CONTEXT_SDK_VALUE_VERSION} 40 | self._context[Constant.CONTEXT_SDK_KEY] = sdk_info 41 | 42 | def set_appkey(self, appkey): 43 | self._header[Constant.HEADER_KEY_APPKEY] = appkey 44 | 45 | def get_appkey(self): 46 | return self._header[Constant.HEADER_KEY_APPKEY] 47 | 48 | def set_token(self, token): 49 | self._token = token 50 | 51 | def get_token(self): 52 | return self._token 53 | 54 | def set_format(self, format): 55 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = format 56 | 57 | def get_format(self): 58 | return self._payload[Constant.PAYLOAD_KEY_FORMAT] 59 | 60 | def set_sample_rate(self, sample_rate): 61 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = sample_rate 62 | 63 | def get_sample_rate(self): 64 | return self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] 65 | 66 | def get_task_id(self): 67 | return self._header[Constant.HEADER_KEY_TASK_ID] 68 | 69 | def put_context(self, key, obj): 70 | self._context[key] = obj 71 | 72 | def add_payload_param(self, key, obj): 73 | self._payload[key] = obj 74 | 75 | def get_status(self): 76 | return self._status 77 | 78 | def serialize(self): 79 | root = {Constant.HEADER: self._header} 80 | 81 | if len(self._payload) != 0: 82 | root[Constant.CONTEXT] = self._context 83 | root[Constant.PAYLOAD] = self._payload 84 | 85 | return json.dumps(root) 86 | 
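The `serialize()` method above attaches `context` and `payload` only while the payload is non-empty, which is why `stop()` calls `self._payload.clear()` before serializing: the stop command then carries the header alone. A minimal standalone sketch of that behavior follows. It re-implements the logic as a free function rather than importing `ali_speech`, and the `appkey` and `task_id` values are hypothetical placeholders.

```python
import json


def serialize(header, context, payload):
    """Mirror of SpeechReqProtocol.serialize(): context and payload
    are attached only when the payload dict is non-empty."""
    root = {'header': header}
    if len(payload) != 0:
        root['context'] = context
        root['payload'] = payload
    return json.dumps(root)


# Hypothetical header values, for illustration only.
header = {'namespace': 'SpeechRecognizer',
          'name': 'StartRecognition',
          'appkey': 'demo-appkey',
          'task_id': '0' * 32}
context = {'sdk': {'name': 'nls-sdk-python', 'version': '2.0.1'}}

# Start command: the payload is populated, so all three sections are sent.
start_msg = json.loads(serialize(header, context,
                                 {'format': 'pcm', 'sample_rate': 16000}))

# Stop command: the payload was cleared, so only the header is sent.
header['name'] = 'StopRecognition'
stop_msg = json.loads(serialize(header, context, {}))

print(sorted(start_msg))
print(sorted(stop_msg))
```

One consequence of this design is that `get_format()` and `get_sample_rate()` raise `KeyError` once `stop()` has cleared the payload.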
-------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_speech_synthesizer.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 17 | """ 18 | 19 | import json 20 | import six 21 | import websocket 22 | import uuid 23 | import threading 24 | import time 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._constant import Status 28 | from ali_speech._constant import Constant 29 | from ali_speech._speech_reqprotocol import SpeechReqProtocol 30 | 31 | 32 | class SpeechSynthesizer(SpeechReqProtocol): 33 | 34 | def __init__(self, callback, url): 35 | super(SpeechSynthesizer, self).__init__(callback, url) 36 | self._last_start_retry = False 37 | self._is_connected = False 38 | 39 | self._header[Constant.HEADER_KEY_NAMESPACE] = Constant.HEADER_VALUE_TTS_NAMESPACE 40 | self._payload[Constant.PAYLOAD_KEY_VOICE] = 'xiaoyun' 41 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = 'pcm' 42 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = 16000 43 | 44 | def set_text(self, text): 45 | self._payload[Constant.PAYLOAD_KEY_TEXT] = text 46 | 47 | def set_voice(self, voice): 48 | self._payload[Constant.PAYLOAD_KEY_VOICE] = voice 49 | 50 | def set_volume(self, volume): 51 | 
self._payload[Constant.PAYLOAD_KEY_VOLUME] = volume 52 | 53 | def set_speech_rate(self, speech_rate): 54 | self._payload[Constant.PAYLOAD_KEY_SPEECH_RATE] = speech_rate 55 | 56 | def set_pitch_rate(self, pitch_rate): 57 | self._payload[Constant.PAYLOAD_KEY_PITCH_RATE] = pitch_rate 58 | 59 | def start(self, ping_interval=5, ping_timeout=3): 60 | """ 61 | Start synthesis and open a new connection to the server 62 | :param ping_interval: interval, in seconds, between automatically sent ping frames 63 | :param ping_timeout: timeout, in seconds, to wait for the pong reply 64 | :return: 0 if the connection to the server is established, 65 | -1 if it fails 66 | """ 67 | if self._status == Status.STATUS_INIT: 68 | _log.debug('starting synthesizer...') 69 | self._status = Status.STATUS_STARTING 70 | else: 71 | _log.error("Illegal status: %s" % self._status) 72 | return -1 73 | 74 | def _on_open(ws): 75 | _log.debug('websocket connected') 76 | self._status = Status.STATUS_STARTED 77 | self._is_connected = True 78 | time.sleep(0.01) 79 | msg_id = six.u(uuid.uuid1().hex) 80 | self._task_id = six.u(uuid.uuid1().hex) 81 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_TTS_NAME_START 82 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 83 | self._header[Constant.HEADER_KEY_TASK_ID] = self._task_id 84 | 85 | text = self.serialize() 86 | _log.info('sending start cmd: ' + text) 87 | ws.send(text) 88 | 89 | def _on_data(ws, raw, opcode, flag): 90 | if opcode == websocket.ABNF.OPCODE_BINARY: 91 | _log.debug("received binary data, size: %s" % len(raw)) 92 | self._callback.on_binary_data_received(raw) 93 | elif opcode == websocket.ABNF.OPCODE_TEXT: 94 | _log.debug("websocket message received: %s" % raw) 95 | msg = json.loads(raw) 96 | name = msg[Constant.HEADER][Constant.HEADER_KEY_NAME] 97 | if name == Constant.HEADER_VALUE_TTS_NAME_COMPLETED: 98 | self._status = Status.STATUS_STOPPED 99 | _log.debug('websocket status changed to stopped') 100 | _log.debug('callback on_completed') 101 | self._callback.on_completed(msg) 102 | elif name == Constant.HEADER_VALUE_NAME_TASK_FAILED: 103 | self._status = 
Status.STATUS_STOPPED 104 | _log.error(msg) 105 | _log.debug('websocket status changed to stopped') 106 | _log.debug('callback on_task_failed') 107 | self._callback.on_task_failed(msg) 108 | 109 | def _on_close(ws): 110 | _log.debug('callback on_channel_closed') 111 | self._callback.on_channel_closed() 112 | 113 | def _on_error(ws, error): 114 | if self._is_connected or self._last_start_retry: 115 | _log.error(error) 116 | self._status = Status.STATUS_STOPPED 117 | # build the dict directly: formatting the error into a JSON string fails when it contains quotes 118 | message = {'header': {'namespace': 'Default', 'name': 'TaskFailed', 119 | 'status': 400, 'message_id': '0', 'task_id': '0', 120 | 'status_text': str(error)}} 121 | self._callback.on_task_failed(message) 122 | else: 123 | _log.warning('retry start: %s' % error) 124 | 125 | retry_count = 3 126 | for count in range(retry_count): 127 | self._status = Status.STATUS_STARTING 128 | if count == (retry_count - 1): 129 | self._last_start_retry = True 130 | 131 | # Init WebSocket 132 | self._ws = websocket.WebSocketApp(self._gateway_url, 133 | on_open=_on_open, 134 | on_data=_on_data, 135 | on_error=_on_error, 136 | on_close=_on_close, 137 | header={Constant.HEADER_TOKEN: self._token}) 138 | 139 | self._thread = threading.Thread(target=self._ws.run_forever, args=(None, None, ping_interval, ping_timeout)) 140 | self._thread.daemon = True 141 | self._thread.start() 142 | # wait for no more than 10 seconds 143 | for i in range(1000): 144 | if self._status == Status.STATUS_STARTED or self._status == Status.STATUS_STOPPED: 145 | break 146 | else: 147 | time.sleep(0.01) 148 | 149 | if self._status == Status.STATUS_STARTED: 150 | # connection to the server established 151 | _log.debug('start succeed!') 152 | return 0 153 | else: 154 | if self._is_connected or self._last_start_retry: 155 | # the WebSocket connected but the connection to the service failed, or this was the last retry: return -1 156 | _log.error("start failed, status: %s" % self._status) 157 | return -1 158 | else: 159 | # retry the connection 160 | continue 161 | 162 | def wait_completed(self): 163 | """ 164 | Wait for synthesis to finish 165 | :return: 0 if synthesis finished, 
166 | -1 if synthesis timed out 167 | """ 168 | ret = 0 169 | if self._status == Status.STATUS_STARTED: 170 | for i in range(100): 171 | if self._status == Status.STATUS_STOPPED: 172 | break 173 | else: 174 | time.sleep(0.1) 175 | _log.debug('wait 100ms') 176 | 177 | if self._status != Status.STATUS_STOPPED: 178 | ret = -1 179 | else: 180 | ret = 0 181 | else: 182 | _log.error('should not wait completed in state %d', self._status) 183 | ret = -1 184 | 185 | return ret 186 | 187 | def close(self): 188 | """ 189 | Close the WebSocket connection 190 | :return: 191 | """ 192 | if self._ws: 193 | if self._thread and self._thread.is_alive(): 194 | self._ws.keep_running = False 195 | self._thread.join() 196 | self._ws.close() 197 | 198 | -------------------------------------------------------------------------------- /安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/_speech_transcriber.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | * Copyright 2015 Alibaba Group Holding Limited 5 | * 6 | * Licensed under the Apache License, Version 2.0 (the "License"); 7 | * you may not use this file except in compliance with the License. 8 | * You may obtain a copy of the License at 9 | * 10 | * http://www.apache.org/licenses/LICENSE-2.0 11 | * 12 | * Unless required by applicable law or agreed to in writing, software 13 | * distributed under the License is distributed on an "AS IS" BASIS, 14 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | * See the License for the specific language governing permissions and 16 | * limitations under the License. 
17 | """ 18 | 19 | import json 20 | import six 21 | import websocket 22 | import uuid 23 | import threading 24 | import time 25 | 26 | from ali_speech._logging import _log 27 | from ali_speech._constant import Status 28 | from ali_speech._constant import Constant 29 | from ali_speech._speech_reqprotocol import SpeechReqProtocol 30 | 31 | 32 | class SpeechTranscriber(SpeechReqProtocol): 33 | def __init__(self, callback, url): 34 | super(SpeechTranscriber, self).__init__(callback, url) 35 | self._last_start_retry = False 36 | self._is_connected = False 37 | 38 | self._header[Constant.HEADER_KEY_NAMESPACE] = Constant.HEADER_VALUE_TRANS_NAMESPACE 39 | self._payload[Constant.PAYLOAD_KEY_FORMAT] = 'pcm' 40 | self._payload[Constant.PAYLOAD_KEY_SAMPLE_RATE] = 16000 41 | 42 | def set_enable_intermediate_result(self, flag): 43 | self._payload[Constant.PAYLOAD_KEY_ENABLE_INTERMEDIATE_RESULT] = flag 44 | 45 | def set_enable_punctuation_prediction(self, flag): 46 | self._payload[Constant.PAYLOAD_KEY_ENABLE_PUNCTUATION_PREDICTION] = flag 47 | 48 | def set_enable_inverse_text_normalization(self, flag): 49 | self._payload[Constant.PAYLOAD_KEY_ENABLE_ITN] = flag 50 | 51 | def start(self, ping_interval=5, ping_timeout=3): 52 | """ 53 | Start transcription and open a new connection to the server 54 | :param ping_interval: interval, in seconds, between automatically sent ping frames 55 | :param ping_timeout: timeout, in seconds, to wait for the pong reply 56 | :return: 0 if the connection to the server is established, 57 | -1 if it fails 58 | """ 59 | if self._status == Status.STATUS_INIT: 60 | _log.debug('starting transcriber...') 61 | self._status = Status.STATUS_STARTING 62 | else: 63 | _log.error("Illegal status: %s" % self._status) 64 | return -1 65 | 66 | def _on_open(ws): 67 | _log.debug('websocket connected') 68 | self._status = Status.STATUS_WS_CONNECTED 69 | self._is_connected = True 70 | msg_id = six.u(uuid.uuid1().hex) 71 | self._task_id = six.u(uuid.uuid1().hex) 72 | self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_TRANS_NAME_START 73 | self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id 74 | 
self._header[Constant.HEADER_KEY_TASK_ID] = self._task_id 75 | 76 | text = self.serialize() 77 | _log.info('sending start cmd: ' + text) 78 | ws.send(text) 79 | 80 | def _on_message(ws, raw): 81 | _log.debug('websocket message received: ' + raw) 82 | msg = json.loads(raw) 83 | name = msg[Constant.HEADER][Constant.HEADER_KEY_NAME] 84 | if name == Constant.HEADER_VALUE_TRANS_NAME_STARTED: 85 | self._status = Status.STATUS_STARTED 86 | _log.debug('callback on_started') 87 | self._callback.on_started(msg) 88 | elif name == Constant.HEADER_VALUE_TRANS_NAME_RESULT_CHANGE: 89 | _log.debug('callback on_result_changed') 90 | self._callback.on_result_changed(msg) 91 | elif name == Constant.HEADER_VALUE_TRANS_NAME_SENTENCE_BEGIN: 92 | _log.debug('callback on_sentence_begin') 93 | self._callback.on_sentence_begin(msg) 94 | elif name == Constant.HEADER_VALUE_TRANS_NAME_SENTENCE_END: 95 | _log.debug('callback on_sentence_end') 96 | self._callback.on_sentence_end(msg) 97 | elif name == Constant.HEADER_VALUE_TRANS_NAME_COMPLETED: 98 | self._status = Status.STATUS_STOPPED 99 | _log.debug('websocket status changed to stopped') 100 | _log.debug('callback on_completed') 101 | self._callback.on_completed(msg) 102 | elif name == Constant.HEADER_VALUE_NAME_TASK_FAILED: 103 | self._status = Status.STATUS_STOPPED 104 | _log.error(msg) 105 | _log.debug('websocket status changed to stopped') 106 | _log.debug('callback on_task_failed') 107 | self._callback.on_task_failed(msg) 108 | 109 | def _on_close(ws): 110 | _log.debug('callback on_channel_closed') 111 | self._callback.on_channel_closed() 112 | 113 | def _on_error(ws, error): 114 | if self._is_connected or self._last_start_retry: 115 | _log.error(error) 116 | self._status = Status.STATUS_STOPPED 117 | # build the dict directly: formatting the error into a JSON string fails when it contains quotes 118 | message = {'header': {'namespace': 'Default', 'name': 'TaskFailed', 119 | 'status': 400, 'message_id': '0', 'task_id': '0', 120 | 'status_text': str(error)}} 121 | self._callback.on_task_failed(message) 122 | else: 123 | 
                _log.warning('retry start: %s' % error)

        retry_count = 3
        for count in range(retry_count):
            self._status = Status.STATUS_STARTING
            if count == (retry_count - 1):
                self._last_start_retry = True

            # Init WebSocket
            self._ws = websocket.WebSocketApp(self._gateway_url,
                                              on_open=_on_open,
                                              on_message=_on_message,
                                              on_error=_on_error,
                                              on_close=_on_close,
                                              header={Constant.HEADER_TOKEN: self._token})

            self._thread = threading.Thread(target=self._ws.run_forever, args=(None, None, ping_interval, ping_timeout))
            self._thread.daemon = True
            self._thread.start()
            # wait for no more than 10 seconds (1000 polls x 10 ms)
            for i in range(1000):
                if self._status == Status.STATUS_STARTED or self._status == Status.STATUS_STOPPED:
                    break
                else:
                    time.sleep(0.01)

            if self._status == Status.STATUS_STARTED:
                _log.debug('start succeed!')
                return 0
            else:
                if self._is_connected or self._last_start_retry:
                    _log.error("start failed, status: %s" % self._status)
                    return -1
                else:
                    continue

    def send(self, audio_data):
        """
        Send audio data to the server; 1000 to 8000 bytes per call is recommended.
        :param audio_data: binary audio data
        :return: 0 if the data was sent,
                 -1 if sending failed
        """
        if self._status == Status.STATUS_STARTED:
            self._ws.send(audio_data, opcode=websocket.ABNF.OPCODE_BINARY)
            return 0
        else:
            _log.error('should not send data in state %d', self._status)
            return -1

    def stop(self):
        """
        Stop recognition and close the connection to the server.
        :return: 0 if the connection was closed,
                 -1 if closing failed
        """
        ret = 0
        if self._status == Status.STATUS_STARTED:
            self._status = Status.STATUS_STOPPING
            msg_id = six.u(uuid.uuid1().hex)
            self._header[Constant.HEADER_KEY_NAME] = Constant.HEADER_VALUE_TRANS_NAME_STOP
            self._header[Constant.HEADER_KEY_MESSAGE_ID] = msg_id
            self._payload.clear()
            text = self.serialize()
            _log.info('sending stop cmd: ' + text)
            self._ws.send(text)

            for i in range(100):
                if self._status == Status.STATUS_STOPPED:
                    break
                else:
                    time.sleep(0.1)
                    _log.debug('wait 100ms')

            if self._status != Status.STATUS_STOPPED:
                ret = -1
            else:
                ret = 0
        else:
            _log.error('should not stop in state %d', self._status)
            ret = -1

        return ret

    def close(self):
        """
        Close the WebSocket connection.
        :return:
        """
        if self._ws:
            if self._thread and self._thread.is_alive():
                self._ws.keep_running = False
                self._thread.join()
            self._ws.close()

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/callbacks.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

"""
 * Copyright 2015 Alibaba Group Holding Limited
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
"""


class SpeechRecognizerCallback:
    """
     * @brief After start() is called and the connection to the service is established, the SDK's internal thread reports the started event
     * @note Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_started(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When intermediate results are enabled and the SDK receives an intermediate result from the service, the SDK's internal thread reports the ResultChanged event
     * @note Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_result_changed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When the SDK receives the recognition-completed message from the service, the SDK's internal thread reports the Completed event
     * @note After the Completed event is reported, the SDK closes the recognition connection internally. Calling send() at that point returns -1, so stop sending.
     *       Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_completed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When an error occurs during recognition (including start(), send() and stop()), the SDK's internal thread reports the TaskFailed event
     * @note After the TaskFailed event is reported, the SDK closes the recognition connection internally. Calling send() at that point returns -1, so stop sending.
     *       Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_task_failed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When recognition ends or an error occurs, the websocket connection is closed
     * @note Do not call stop() inside the callback
     * @return
    """
    def on_channel_closed(self):
        raise Exception('Not implemented!')


class SpeechTranscriberCallback:
    """
     * @brief After start() is called and the connection to the service is established, the SDK's internal thread reports the started event
     * @note Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_started(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When intermediate results are enabled and the SDK receives an intermediate result from the service, the SDK's internal thread reports the ResultChanged event
     * @note Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_result_changed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When the SDK receives the start-of-sentence message from the service, the SDK's internal thread reports the SentenceBegin event
     * @note This event marks the detected start of a sentence. Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_sentence_begin(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When the SDK receives the end-of-sentence message from the service, the SDK's internal thread reports the SentenceEnd event
     * @note This event marks the detected end of a sentence. Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_sentence_end(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When the SDK receives the recognition-completed message from the service, the SDK's internal thread reports the Completed event
     * @note After the Completed event is reported, the SDK closes the recognition connection internally. Calling send() at that point returns -1, so stop sending.
     *       Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_completed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When an error occurs during recognition (including start(), send() and stop()), the SDK's internal thread reports the TaskFailed event
     * @note After the TaskFailed event is reported, the SDK closes the recognition connection internally. Calling send() at that point returns -1, so stop sending.
     *       Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_task_failed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When recognition ends or an error occurs, the websocket connection is closed
     * @note Do not call stop() inside the callback
     * @return
    """
    def on_channel_closed(self):
        raise Exception('Not implemented!')


class SpeechSynthesizerCallback:

    def on_binary_data_received(self, raw):
        raise Exception('Not implemented!')

    """
     * @brief When the SDK receives the recognition-completed message from the service, the SDK's internal thread reports the Completed event
     * @note After the Completed event is reported, the SDK closes the recognition connection internally. Calling send() at that point returns -1, so stop sending.
     *       Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_completed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When an error occurs during recognition (including start(), send() and stop()), the SDK's internal thread reports the TaskFailed event
     * @note After the TaskFailed event is reported, the SDK closes the recognition connection internally. Calling send() at that point returns -1, so stop sending.
     *       Do not call stop() inside the callback
     * @param message the response returned by the service
     * @return
    """
    def on_task_failed(self, message):
        raise Exception('Not implemented!')

    """
     * @brief When recognition ends or an error occurs, the websocket connection is closed
     * @note Do not call stop() inside the callback
     * @return
    """
    def on_channel_closed(self):
        raise Exception('Not implemented!')

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/build/lib/ali_speech/constant.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

"""
 * Copyright 2015 Alibaba Group Holding Limited
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
"""


class ASRFormat:
    PCM = 'pcm'
    OPUS = 'opus'


class TTSFormat:
    PCM = 'pcm'
    WAV = 'wav'
    MP3 = 'mp3'


class ASRSampleRate:
    SAMPLE_RATE_8K = 8000
    SAMPLE_RATE_16K = 16000


class TTSSampleRate:
    SAMPLE_RATE_8K = 8000
    SAMPLE_RATE_16K = 16000
    SAMPLE_RATE_24K = 24000

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/create_token_demo.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

"""
 * Copyright 2015 Alibaba Group Holding Limited
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
"""

import time
import ali_speech

if __name__ == "__main__":
    ali_speech.NlsClient.set_log_level('INFO')
    # User credentials
    access_key_id = 'your AccessKeyId'
    access_key_secret = 'your AccessKeySecret'
    token, expire_time = ali_speech.NlsClient.create_token(access_key_id, access_key_secret)
    print('token: %s, expire time(s): %s' % (token, expire_time))
    if expire_time:
        # time.localtime() uses the machine's time zone, so this is Beijing time only on a UTC+8 machine
        print('token expiry time (local time): %s' % (time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(expire_time))))

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/dist/alibabacloud-nls-java-sdk-2.0.0.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/alibabacloud-nls-python-sdk/dist/alibabacloud-nls-java-sdk-2.0.0.tar.gz

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/nls-sample-16k.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/安装指南/alibabacloud-nls-python-sdk/nls-sample-16k.wav

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/setup.py:
--------------------------------------------------------------------------------
#!/usr/bin/python
"""
 * Copyright 2015 Alibaba Group Holding Limited
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
"""

try:
    from setuptools import setup
except ImportError:
    from distutils.core import setup

config = {
    'name': 'alibabacloud-nls-java-sdk',
    'version': '2.0.0',
    'description': 'ali_speech python sdk',
    'author': 'Alibaba Cloud NLS Team',
    'author_email': 'nls-system-client@list.alibaba-inc.com',
    'license': 'Apache License 2.0',
    'url': 'https://github.com/aliyun/alibabacloud-nls-python-sdk.git',
    'install_requires': ['websocket-client', 'requests'],
    'packages': ['ali_speech'],
    # A list rather than a tuple: setup() expects classifiers to be a list
    'classifiers': [
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: Apache Software License',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Topic :: Software Development',
    ]
}

setup(**config)

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/speech_recognizer_demo.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

"""
 * Copyright 2015 Alibaba Group Holding Limited
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
"""

import os
import time
import threading
import ali_speech
from ali_speech.callbacks import SpeechRecognizerCallback
from ali_speech.constant import ASRFormat
from ali_speech.constant import ASRSampleRate


class MyCallback(SpeechRecognizerCallback):
    """
    The constructor's parameters are not prescribed; add whatever you need.
    In this example the name parameter holds the name of the audio file being
    recognized, which tells the threads apart in a multithreaded run.
    """
    def __init__(self, name='default'):
        self._name = name

    def on_started(self, message):
        print('MyCallback.OnRecognitionStarted: %s' % message)

    def on_result_changed(self, message):
        print('MyCallback.OnRecognitionResultChanged: file: %s, task_id: %s, result: %s' % (
            self._name, message['header']['task_id'], message['payload']['result']))

    def on_completed(self, message):
        print('MyCallback.OnRecognitionCompleted: file: %s, task_id:%s, result:%s' % (
            self._name, message['header']['task_id'], message['payload']['result']))

    def on_task_failed(self, message):
        print('MyCallback.OnRecognitionTaskFailed: %s' % message)

    def on_channel_closed(self):
        print('MyCallback.OnRecognitionChannelClosed')


def process(client, appkey, token):
    audio_name = 'nls-sample-16k.wav'
    callback = MyCallback(audio_name)
    recognizer = client.create_recognizer(callback)
    recognizer.set_appkey(appkey)
    recognizer.set_token(token)
    recognizer.set_format(ASRFormat.PCM)
    recognizer.set_sample_rate(ASRSampleRate.SAMPLE_RATE_16K)
    recognizer.set_enable_intermediate_result(False)
    recognizer.set_enable_punctuation_prediction(True)
    recognizer.set_enable_inverse_text_normalization(True)

    try:
        ret = recognizer.start()
        if ret < 0:
            return ret

        print('sending audio...')
        with open(audio_name, 'rb') as f:
            audio = f.read(3200)
            while audio:
                ret = recognizer.send(audio)
                if ret < 0:
                    break
                time.sleep(0.1)
                audio = f.read(3200)

        recognizer.stop()
    except Exception as e:
        print(e)
    finally:
        recognizer.close()


def process_multithread(client, appkey, token, number):
    thread_list = []
    for i in range(0, number):
        thread = threading.Thread(target=process, args=(client, appkey, token))
        thread_list.append(thread)
        thread.start()

    for thread in thread_list:
        thread.join()


if __name__ == "__main__":
    client = ali_speech.NlsClient()
    # Set the log output level: DEBUG, INFO, WARNING, ERROR
    client.set_log_level('INFO')

    appkey = 'your appkey'
    token = 'your Token'

    process(client, appkey, token)

    # Multithreaded example
    # process_multithread(client, appkey, token, 10)

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/speech_synthesizer_demo.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

"""
 * Copyright 2015 Alibaba Group Holding Limited
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
"""

import threading
import ali_speech
from ali_speech.callbacks import SpeechSynthesizerCallback
from ali_speech.constant import TTSFormat
from ali_speech.constant import TTSSampleRate


class MyCallback(SpeechSynthesizerCallback):
    # The name parameter specifies the file the audio is saved to
    def __init__(self, name):
        self._name = name
        self._fout = open(name, 'wb')

    def on_binary_data_received(self, raw):
        print('MyCallback.on_binary_data_received: %s' % len(raw))
        self._fout.write(raw)

    def on_completed(self, message):
        print('MyCallback.OnRecognitionCompleted: %s' % message)
        self._fout.close()

    def on_task_failed(self, message):
        print('MyCallback.OnRecognitionTaskFailed-task_id:%s, status_text:%s' % (
            message['header']['task_id'], message['header']['status_text']))
        self._fout.close()

    def on_channel_closed(self):
        print('MyCallback.OnRecognitionChannelClosed')


def process(client, appkey, token, text, audio_name):
    callback = MyCallback(audio_name)
    synthesizer = client.create_synthesizer(callback)
    synthesizer.set_appkey(appkey)
    synthesizer.set_token(token)
    synthesizer.set_voice('xiaoyun')
    synthesizer.set_text(text)
    synthesizer.set_format(TTSFormat.WAV)
    synthesizer.set_sample_rate(TTSSampleRate.SAMPLE_RATE_16K)
    synthesizer.set_volume(50)
    synthesizer.set_speech_rate(0)
    synthesizer.set_pitch_rate(0)

    try:
        ret = synthesizer.start()
        if ret < 0:
            return ret

        synthesizer.wait_completed()
    except Exception as e:
        print(e)
    finally:
        synthesizer.close()


def process_multithread(client, appkey, token, number):
    thread_list = []
    for i in range(0, number):
        # Chinese input text for the synthesizer: "This is the synthesis for thread <i>."
        text = "这是线程" + str(i) + "的合成。"
        audio_name = "sy_audio_" + str(i) + ".wav"
        thread = threading.Thread(target=process, args=(client, appkey, token, text, audio_name))
        thread_list.append(thread)
        thread.start()

    for thread in thread_list:
        thread.join()


if __name__ == "__main__":
    client = ali_speech.NlsClient()
    # Set the log output level: DEBUG, INFO, WARNING, ERROR
    client.set_log_level('INFO')

    appkey = 'your appkey'
    token = 'your token'

    # Chinese input text for the synthesizer: "Today is Monday and the weather is quite nice."
    text = "今天是周一,天气挺好的。"
    audio_name = 'sy_audio.wav'
    process(client, appkey, token, text, audio_name)

    # Multithreaded example
    # process_multithread(client, appkey, token, 10)

--------------------------------------------------------------------------------
/安装指南/alibabacloud-nls-python-sdk/speech_transcriber_demo.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

"""
 * Copyright 2015 Alibaba Group Holding Limited
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
"""

import os
import time
import threading
import ali_speech
from ali_speech.callbacks import SpeechTranscriberCallback
from ali_speech.constant import ASRFormat
from ali_speech.constant import ASRSampleRate


class MyCallback(SpeechTranscriberCallback):
    """
    The constructor's parameters are not prescribed; add whatever you need.
    In this example the name parameter holds the name of the audio file being
    recognized, which tells the threads apart in a multithreaded run.
    """
    def __init__(self, name='default'):
        self._name = name

    def on_started(self, message):
        print('MyCallback.OnRecognitionStarted: %s' % message)

    def on_result_changed(self, message):
        print('MyCallback.OnRecognitionResultChanged: file: %s, task_id: %s, result: %s' % (
            self._name, message['header']['task_id'], message['payload']['result']))

    def on_sentence_begin(self, message):
        print('MyCallback.on_sentence_begin: file: %s, task_id: %s, sentence_id: %s, time: %s' % (
            self._name, message['header']['task_id'], message['payload']['index'], message['payload']['time']))

    def on_sentence_end(self, message):
        print('MyCallback.on_sentence_end: file: %s, task_id: %s, sentence_id: %s, time: %s, result: %s' % (
            self._name,
            message['header']['task_id'], message['payload']['index'],
            message['payload']['time'], message['payload']['result']))

    def on_completed(self, message):
        print('MyCallback.OnRecognitionCompleted: %s' % message)

    def on_task_failed(self, message):
        print('MyCallback.OnRecognitionTaskFailed-task_id:%s, status_text:%s' % (
            message['header']['task_id'], message['header']['status_text']))

    def on_channel_closed(self):
        print('MyCallback.OnRecognitionChannelClosed')


def process(client, appkey, token):
    audio_name = 'nls-sample-16k.wav'
    callback = MyCallback(audio_name)
    transcriber = client.create_transcriber(callback)
    transcriber.set_appkey(appkey)
    transcriber.set_token(token)
    transcriber.set_format(ASRFormat.PCM)
    transcriber.set_sample_rate(ASRSampleRate.SAMPLE_RATE_16K)
    transcriber.set_enable_intermediate_result(False)
    transcriber.set_enable_punctuation_prediction(True)
    transcriber.set_enable_inverse_text_normalization(True)

    try:
        ret = transcriber.start()
        if ret < 0:
            return ret

        print('sending audio...')
        with open(audio_name, 'rb') as f:
            audio = f.read(3200)
            while audio:
                ret = transcriber.send(audio)
                if ret < 0:
                    break
                time.sleep(0.1)
                audio = f.read(3200)

        transcriber.stop()
    except Exception as e:
        print(e)
    finally:
        transcriber.close()


def process_multithread(client, appkey, token, number):
    thread_list = []
    for i in range(0, number):
        thread = threading.Thread(target=process, args=(client, appkey, token))
        thread_list.append(thread)
        thread.start()

    for thread in thread_list:
        thread.join()


if __name__ == "__main__":
    client = ali_speech.NlsClient()
    # Set the log output level: DEBUG, INFO, WARNING, ERROR
    client.set_log_level('INFO')

    appkey = 'your appkey'
    token = 'your Token'

    process(client, appkey, token)

    # Multithreaded example
    # process_multithread(client, appkey, token, 10)

--------------------------------------------------------------------------------
/安装指南/requirements.txt:
--------------------------------------------------------------------------------
setuptools
pyaudio
keyboard
aliyunsdkcore
PySide2

--------------------------------------------------------------------------------
/安装指南/安装指南.md:
--------------------------------------------------------------------------------
## Installing dependencies

First make sure Python is installed.

Then open a command-line window, change to this folder, and install the dependencies with the following commands:

```
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
pip install setuptools keyboard aliyunsdkcore configparser
cd alibabacloud-nls-python-sdk
python setup.py bdist_egg
python setup.py install
```

The `pip install aliyunsdkcore` step will almost certainly fail while installing the dependency `pycrypto`. The fix is to install `Microsoft Visual Studio 14.0` first, then run `set CL=-FI"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\stdint.h"` on the command line, and then run `pip install aliyunsdkcore` again.

In addition, the `pyaudio` dependency is not easy to install. First download the whl file that matches your Python version from [this link](https://www.lfd.uci.edu/~gohlke/pythonlibs). For example, I have the 64-bit build of Python 3.7 installed, so I download `PyAudio-0.2.11-cp37-cp37m-win_amd64.whl`, put it in this folder, and install it with

```
pip install PyAudio-0.2.11-cp37-cp37m-win_amd64.whl
```

That installs pyaudio. (For convenience, the author has already downloaded the whl files for Python 3.6 to 3.9 into this folder; install the one that matches your version.)

Do not install Alibaba Cloud's `aliyunsdkcore` with `pip install aliyunsdkcore`; that fetches an old version that does not work. Use `pip install aliyun-python-sdk-core==2.13.3` instead.

## API configuration

### Enabling the services

First go to the Alibaba Cloud developer console and enable **RAM Access Control** (RAM访问控制). In the **RAM Access Control** console, create a new user and note its **accessID** and **accessKey**, then grant it the **Manage Intelligent Speech Interaction (NLS)** permission (管理智能语音交互(NLS)).

Next, enable the **Intelligent Speech Interaction** (智能语音交互) service and create a new project:

- Category: non-telephone (非电话)
- Classification: general (通用)
- Scenario: Mandarin Chinese (中文普通话) or another language (use whichever language you want to recognize)

Publish the project, then note its **appkey**.

--------------------------------------------------------------------------------
/打包/Pyinstaller 编译和打包 Win64.bat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HaujetZhao/CapsWriter/f2b2038a2c0984a1d356f024cbac421fe594601a/打包/Pyinstaller 编译和打包 Win64.bat
--------------------------------------------------------------------------------
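A note on the 3200-byte reads in the recognizer and transcriber demos above: they set a 16 kHz sample rate, read 3200 bytes at a time, and sleep 0.1 s between sends. Assuming 16-bit mono PCM (an assumption; the demos do not state the sample width), 3200 bytes is exactly 0.1 s of audio, and it falls inside the 1000 to 8000 bytes per call that the `send()` docstring recommends. The sketch below only works through that arithmetic; `chunk_size_bytes` and `iter_chunks` are hypothetical helper names and are not part of `ali_speech`.

```python
def chunk_size_bytes(sample_rate=16000, sample_width=2, channels=1, chunk_ms=100):
    """Return how many bytes of raw PCM cover chunk_ms milliseconds.

    sample_width is bytes per sample (2 assumes 16-bit audio, which the
    demos do not state explicitly).
    """
    return sample_rate * sample_width * channels * chunk_ms // 1000


def iter_chunks(pcm, size):
    """Yield consecutive slices of at most `size` bytes from a bytes object."""
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


if __name__ == "__main__":
    size = chunk_size_bytes()
    print(size)  # 3200 for 16 kHz, 16-bit, mono, 100 ms

    fake_pcm = bytes(8000)  # 8000 zero bytes standing in for real audio
    print([len(chunk) for chunk in iter_chunks(fake_pcm, size)])  # [3200, 3200, 1600]
```

With an 8 kHz source the same 100 ms pacing would give 1600 bytes per send, so a caller who changes the sample rate may want to derive the read size rather than hard-code 3200.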