├── .gitignore ├── README.md ├── model └── onnx-json files.txt ├── outputs └── Audio files.txt ├── script.py └── settings.json /.gitignore: -------------------------------------------------------------------------------- 1 | # Piper TTS 2 | model/ 3 | outputs/ 4 | piper/ 5 | 6 | # Byte-compiled / optimized / DLL files 7 | __pycache__/ 8 | *.py[cod] 9 | *$py.class 10 | 11 | # C extensions 12 | *.so 13 | 14 | # Distribution / packaging 15 | .Python 16 | build/ 17 | develop-eggs/ 18 | dist/ 19 | downloads/ 20 | eggs/ 21 | .eggs/ 22 | lib/ 23 | lib64/ 24 | parts/ 25 | sdist/ 26 | var/ 27 | wheels/ 28 | share/python-wheels/ 29 | *.egg-info/ 30 | .installed.cfg 31 | *.egg 32 | MANIFEST 33 | 34 | # PyInstaller 35 | # Usually these files are written by a python script from a template 36 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 37 | *.manifest 38 | *.spec 39 | 40 | # Installer logs 41 | pip-log.txt 42 | pip-delete-this-directory.txt 43 | 44 | # Unit test / coverage reports 45 | htmlcov/ 46 | .tox/ 47 | .nox/ 48 | .coverage 49 | .coverage.* 50 | .cache 51 | nosetests.xml 52 | coverage.xml 53 | *.cover 54 | *.py,cover 55 | .hypothesis/ 56 | .pytest_cache/ 57 | cover/ 58 | 59 | # Translations 60 | *.mo 61 | *.pot 62 | 63 | # Django stuff: 64 | *.log 65 | local_settings.py 66 | db.sqlite3 67 | db.sqlite3-journal 68 | 69 | # Flask stuff: 70 | instance/ 71 | .webassets-cache 72 | 73 | # Scrapy stuff: 74 | .scrapy 75 | 76 | # Sphinx documentation 77 | docs/_build/ 78 | 79 | # PyBuilder 80 | .pybuilder/ 81 | target/ 82 | 83 | # Jupyter Notebook 84 | .ipynb_checkpoints 85 | 86 | # IPython 87 | profile_default/ 88 | ipython_config.py 89 | 90 | # pyenv 91 | # For a library or package, you might want to ignore these files since the code is 92 | # intended to run in multiple environments; otherwise, check them in: 93 | # .python-version 94 | 95 | # pipenv 96 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version 
control. 97 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 98 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 99 | # install all needed dependencies. 100 | #Pipfile.lock 101 | 102 | # poetry 103 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 104 | # This is especially recommended for binary packages to ensure reproducibility, and is more 105 | # commonly ignored for libraries. 106 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 107 | #poetry.lock 108 | 109 | # pdm 110 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 111 | #pdm.lock 112 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it 113 | # in version control. 114 | # https://pdm.fming.dev/#use-with-ide 115 | .pdm.toml 116 | 117 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 118 | __pypackages__/ 119 | 120 | # Celery stuff 121 | celerybeat-schedule 122 | celerybeat.pid 123 | 124 | # SageMath parsed files 125 | *.sage.py 126 | 127 | # Environments 128 | .env 129 | .venv 130 | env/ 131 | venv/ 132 | ENV/ 133 | env.bak/ 134 | venv.bak/ 135 | 136 | # Spyder project settings 137 | .spyderproject 138 | .spyproject 139 | 140 | # Rope project settings 141 | .ropeproject 142 | 143 | # mkdocs documentation 144 | /site 145 | 146 | # mypy 147 | .mypy_cache/ 148 | .dmypy.json 149 | dmypy.json 150 | 151 | # Pyre type checker 152 | .pyre/ 153 | 154 | # pytype static type analyzer 155 | .pytype/ 156 | 157 | # Cython debug symbols 158 | cython_debug/ 159 | 160 | # PyCharm 161 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 162 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 163 | # and can be added to the global gitignore or merged into this file. For a more nuclear 164 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 165 | #.idea/ 166 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # piper_tts 2 | An extension for the [text-generation-webui by oobabooga](https://github.com/oobabooga/text-generation-webui) that uses [Piper](https://github.com/rhasspy/piper) for fast voice generation. 3 | 4 | This project is a Web user interface (WebUI) for text generation using Gradio and a Piper text-to-speech (TTS) model. The main objective is to provide a user-friendly experience for text generation with audio. 
5 | 6 | ![Image](https://drive.google.com/uc?export=view&id=1TOnHWGWDqHWgNNn6HFOfvEv9egC1z7g-) 7 | 8 | 9 | ## Features 10 | 11 | - **16/11/2023 -- Speaker ID:** Some models contain several voices. To find out which ID to use, refer to the model's JSON file. 12 | - **16/11/2023 -- Sentence silence:** Specify the duration, in seconds, of the silence added after each sentence during text-to-speech. 13 | - **Enable/Disable:** Enable or disable the TTS extension. 14 | - **Autoplay:** Choose whether generated text is read aloud automatically. 15 | - **Text display:** Choose to show or hide generated text. 16 | - **Custom settings:** Adjust audio parameters such as noise scale, length scale (phoneme length) and noise width. 17 | - **Model selection:** Choose among the voice models available in the `model` folder. 18 | - **WAV save:** Audio files are saved in the `outputs` folder. 19 | - **Save settings:** Save your settings. 20 | - **Remove WAV:** Delete all WAV files from the `outputs` folder to free up storage space. 21 | 22 | ## Saved settings 23 | 24 | Selected settings are saved in a JSON file, `settings.json`, so that users get their preferences back each time they use the extension. 25 | 26 | ## Initial configuration 27 | 28 | Make sure you install all necessary dependencies and configure your environment according to the project instructions. 29 | 30 | ## Installation 31 | 32 | 1. Clone the repository into the `extensions` directory of text-generation-webui. 33 | 34 | ```bash 35 | git clone https://github.com/tijo95/piper_tts.git 36 | ``` 37 | 38 | 39 | 2. 
Download the appropriate binary for your platform from the Piper repository: 40 | 41 | For Windows, download `https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_windows_amd64.zip` 42 | Unzip all contents into `piper_tts` 43 | 44 | ![Image](https://drive.google.com/uc?export=view&id=1bO8QyVR7v7gwoLsUdXquTeZx5rEwF7EY) 45 | 46 | For Linux: 47 | ```bash 48 | cd piper_tts/ 49 | wget https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz 50 | tar -xvf piper_linux_x86_64.tar.gz 51 | rm piper_linux_x86_64.tar.gz 52 | ``` 53 | 54 | 3. Download the .onnx models and their .json files and place them in the `piper_tts/model` directory. 55 | 56 | The models are available at this address: https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0 57 | 58 | ![Image](https://drive.google.com/uc?export=view&id=16JkRmOfCL-E37Xe6V6jm7MJShZzNHTyr) 59 | 60 | 61 | 4. Run the main script and have fun surprising your AI. 62 | 63 | ## Contributions 64 | 65 | Contributions are welcome! Feel free to open an issue or a pull request to improve this project. 
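
The installation steps above leave a specific folder layout behind: a Piper executable under `piper/`, and each voice as an `.onnx` file with a matching `.onnx.json` config under `model/`. The sketch below is one way to check that layout before starting the web UI. `check_layout` is a hypothetical helper, not part of the extension, and the `piper.exe` name for Windows is an assumption based on the archive names above.

```python
from pathlib import Path


def check_layout(ext_dir: Path) -> list[str]:
    """Return human-readable problems found in the extension folder (hypothetical helper)."""
    problems = []

    # script.py launches piper/piper; piper.exe is assumed to be the Windows name.
    piper_dir = ext_dir / "piper"
    if not any((piper_dir / name).is_file() for name in ("piper", "piper.exe")):
        problems.append(f"Piper executable not found in {piper_dir}")

    # Each voice needs both <name>.onnx and <name>.onnx.json in model/.
    model_dir = ext_dir / "model"
    models = sorted(model_dir.glob("*.onnx")) if model_dir.is_dir() else []
    if not models:
        problems.append("no .onnx voice found in model/")
    for model in models:
        config = model.with_name(model.name + ".json")
        if not config.is_file():
            problems.append(f"missing config for {model.name}: expected {config.name}")

    return problems
```

Calling `check_layout(Path("extensions/piper_tts"))` returns an empty list when everything is in place; otherwise each entry names the missing piece.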
66 | 67 | ## Piper Github 68 | 69 | Github : https://github.com/rhasspy/piper#running-in-python 70 | 71 | Listen to voice samples : https://rhasspy.github.io/piper-samples 72 | -------------------------------------------------------------------------------- /model/onnx-json files.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tijo95/piper_tts/5c0db63c41542b26264925828741deb6af3fb1c8/model/onnx-json files.txt -------------------------------------------------------------------------------- /outputs/Audio files.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tijo95/piper_tts/5c0db63c41542b26264925828741deb6af3fb1c8/outputs/Audio files.txt -------------------------------------------------------------------------------- /script.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | import re 4 | import subprocess 5 | import time 6 | 7 | from modules import shared 8 | from pathlib import Path 9 | 10 | import gradio as gr 11 | 12 | 13 | root_dir = Path(__file__).resolve().parent 14 | settings_file = root_dir / 'settings.json' 15 | piper_path = root_dir / 'piper/piper' 16 | model_folder = root_dir / 'model' 17 | output_folder = root_dir / 'outputs' 18 | 19 | params = { 20 | "display_name": "Piper TTS", 21 | "active": True, 22 | "autoplay": True, 23 | "show_text": True, 24 | "ignore_asterisk_text": False, 25 | "quiet": False, 26 | "selected_model": "", 27 | "speaker_id": 0, 28 | "noise_scale": 0.667, 29 | "length_scale": 1.0, 30 | "noise_w": 0.8, 31 | "sentence_silence": 0.2, 32 | } 33 | defaults = params.copy() 34 | 35 | def load_settings(): 36 | try: 37 | with open(settings_file, 'r') as json_file: 38 | settings = json.load(json_file) 39 | params.update(settings) 40 | except FileNotFoundError: 41 | pass 42 | 43 | # Load parameters from JSON file at start of script 44 
| load_settings() 45 | 46 | def clean_text(text): 47 | cleaned_text = text 48 | 49 | replacements = { 50 | '&#x27;': "'", 51 | '&quot;': '"', 52 | '&lt;': '<', 53 | '&gt;': '>', 54 | '&nbsp;': ' ', 55 | '&copy;': '©', 56 | '&reg;': '®', 57 | '&amp;': '&' # decoded last so that "&amp;lt;" is not unescaped twice 58 | } 59 | 60 | for key, value in replacements.items(): 61 | cleaned_text = cleaned_text.replace(key, value) 62 | 63 | cleaned_text = cleaned_text.replace("***", "*").replace("**", "*") 64 | cleaned_text = re.sub(r"[\U0001F600-\U0001F64F\U0001F300-\U0001F5FF\U0001F680-\U0001F6FF\U0001F700-\U0001F77F\U0001F780-\U0001F7FF\U0001F800-\U0001F8FF\U0001F900-\U0001F9FF\U0001FA00-\U0001FA6F\U0001FA70-\U0001FAFF\U00002702-\U000027B0\U000024C2-\U0001F251]+", "", cleaned_text) 65 | 66 | # Ignore text between asterisks if option is enabled 67 | if params["ignore_asterisk_text"]: 68 | while '*' in cleaned_text: 69 | start = cleaned_text.find('*') 70 | end = cleaned_text.find('*', start + 1) 71 | if end == -1: 72 | break # unmatched asterisk: stop instead of looping forever 73 | cleaned_text = cleaned_text[:start] + cleaned_text[end + 1:] 74 | 75 | return cleaned_text 76 | 77 | 78 | def tts(text, output_file): 79 | cleaned_text = clean_text(text) 80 | print(f"tts: {cleaned_text} -> {output_file}") 81 | 82 | selected_model = params.get('selected_model', '') 83 | model_path = model_folder / selected_model 84 | config_path = model_folder / f'{selected_model}.json' 85 | 86 | output_file_path = output_folder / output_file 87 | output_file_str = output_file.as_posix() 88 | 89 | process = subprocess.Popen( 90 | [ 91 | piper_path.as_posix(), 92 | '--sentence_silence', str(params['sentence_silence']), 93 | '--noise_scale', str(params['noise_scale']), 94 | '--length_scale', str(params['length_scale']), 95 | '--noise_w', str(params['noise_w']), 96 | '--speaker', str(params['speaker_id']), 97 | '--model', model_path.as_posix(), 98 | '--config', config_path.as_posix(), 99 | '--output_file', output_file_str, 100 | *(['--quiet'] if params['quiet'] else []), # add the flag only when set, never an empty argument 101 | ], 102 | 
stdin=subprocess.PIPE, 103 | text=True 104 | ) 105 | 106 | process.communicate(input=cleaned_text) 107 | 108 | def output_modifier(string, state): 109 | 110 | if not params['active']: 111 | return string 112 | 113 | if string == '': 114 | string = '*Empty reply, try regenerating*' 115 | else: 116 | output_file = Path(os.path.relpath(output_folder / f'{state["character_menu"]}_{int(time.time())}.wav')) 117 | tts(string, output_file) 118 | autoplay = 'autoplay' if params['autoplay'] else '' 119 | html_string = f'<audio src="file/{output_file.as_posix()}" controls {autoplay}></audio>' 120 | if params['show_text']: 121 | string = f'{html_string}\n\n{string}' 122 | else: 123 | string = html_string 124 | 125 | shared.processing_message = "*Is typing...*" 126 | return string 127 | 128 | def history_modifier(history): 129 | if len(history['internal']) > 0: 130 | history['visible'][-1] = [ 131 | history['visible'][-1][0], 132 | history['visible'][-1][1].replace('controls autoplay>', 'controls>') 133 | ] 134 | 135 | return history 136 | 137 | def remove_directory(): 138 | for file in output_folder.glob('*.wav'): 139 | file.unlink() 140 | 141 | def custom_update_selected_model(selected_model): 142 | if selected_model: 143 | model_path = model_folder / selected_model 144 | params.update({'selected_model': selected_model, 'model_path': model_path}) 145 | 146 | def create_model_dropdown(): 147 | available_models = [model.name for model in model_folder.glob('*.onnx')] 148 | available_models.sort() 149 | 150 | model_dropdown = gr.Dropdown(choices=available_models, label="Choose Model", value=params["selected_model"]) 151 | 152 | def update_selected_model(selected_model): 153 | custom_update_selected_model(selected_model) 154 | 155 | model_dropdown.change(update_selected_model, model_dropdown, None) 156 | 157 | return model_dropdown 158 | 159 | def set_initial_model(): 160 | available_models = [model.name for model in model_folder.glob('*.onnx')] 161 | 162 | load_settings() 163 | 164 | if not params["selected_model"] and available_models: 165 | 
initial_model = available_models[0] # selected_model is empty here, so fall back to the first model found 166 | 167 | params.update({ 168 | "selected_model": initial_model, 169 | "active": params.get("active", True), 170 | "autoplay": params.get("autoplay", True), 171 | "show_text": params.get("show_text", True), 172 | }) 173 | 174 | # Pick a default model at import time if none was saved 175 | set_initial_model() 176 | 177 | def save_settings(): 178 | settings = { 179 | "active": params["active"], 180 | "autoplay": params["autoplay"], 181 | "show_text": params["show_text"], 182 | "quiet": params["quiet"], 183 | "selected_model": params["selected_model"], 184 | "speaker_id": params["speaker_id"], 185 | "noise_scale": params["noise_scale"], 186 | "length_scale": params["length_scale"], 187 | "noise_w": params["noise_w"], 188 | "sentence_silence": params["sentence_silence"], 189 | "ignore_asterisk_text": params["ignore_asterisk_text"], 190 | } 191 | 192 | with open(settings_file, 'w') as json_file: 193 | json.dump(settings, json_file, indent=4) 194 | 195 | def ui(): 196 | with gr.Accordion(params["display_name"], open=False): 197 | 198 | activate = gr.Checkbox(value=params['active'], label='Active extension') 199 | autoplay = gr.Checkbox(value=params['autoplay'], label='Play TTS automatically') 200 | show_text = gr.Checkbox(value=params['show_text'], label='Show message text under audio player') 201 | ignore_asterisk_checkbox = gr.Checkbox(value=params["ignore_asterisk_text"], label="*Ignore text inside asterisk*") 202 | quiet_checkbox = gr.Checkbox(value=params["quiet"], label='Disable log') 203 | 204 | noise_scale_slider = gr.Slider(minimum=0.0, maximum=1.0, label=f'Noise Scale : Default ({defaults["noise_scale"]})', value=params['noise_scale']) 205 | length_scale_slider = gr.Slider(minimum=0.0, maximum=2.0, label=f'Length Scale : Default ({defaults["length_scale"]})', value=params['length_scale']) 206 | noise_w_slider = gr.Slider(minimum=0.0, maximum=1.0, label=f'Noise Width : Default ({defaults["noise_w"]})', 
value=params['noise_w']) 207 | sentence_silence_slider = gr.Slider(minimum=0.0, maximum=1.0, label=f'Sentence Silence : Default ({defaults["sentence_silence"]})', value=params['sentence_silence']) 208 | 209 | activate.change(lambda x: params.update({'active': x}), activate, None) 210 | autoplay.change(lambda x: params.update({'autoplay': x}), autoplay, None) 211 | show_text.change(lambda x: params.update({'show_text': x}), show_text, None) 212 | ignore_asterisk_checkbox.change(lambda x: params.update({"ignore_asterisk_text": x}), ignore_asterisk_checkbox, None) 213 | quiet_checkbox.change(lambda x: params.update({'quiet': x}), quiet_checkbox, None) 214 | 215 | noise_scale_slider.change(lambda x: params.update({'noise_scale': x}), noise_scale_slider, None) 216 | length_scale_slider.change(lambda x: params.update({'length_scale': x}), length_scale_slider, None) 217 | noise_w_slider.change(lambda x: params.update({'noise_w': x}), noise_w_slider, None) 218 | sentence_silence_slider.change(lambda x: params.update({'sentence_silence': x}), sentence_silence_slider, None) 219 | 220 | # Use params["selected_model"] as initial drop-down value 221 | model_dropdown = create_model_dropdown() 222 | 223 | speaker_id_input = gr.Number(value=params["speaker_id"], label=f'Speaker ID : Default ({defaults["speaker_id"]}) See the model JSON file to find out which ID are available for the selected model.') 224 | speaker_id_input.change(lambda x: params.update({'speaker_id': int(x)}), speaker_id_input, None) 225 | 226 | with gr.Row(): 227 | save_button = gr.Button("Save Settings") 228 | save_button.click(save_settings, None) 229 | 230 | remove_directory_button = gr.Button("Remove WAV") 231 | remove_directory_button.click(remove_directory, None) 232 | 233 | -------------------------------------------------------------------------------- /settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "active": true, 3 | "autoplay": true, 4 | 
"show_text": true, 5 | "ignore_asterisk_text": true, 6 | "quiet": false, 7 | "selected_model": "en_US-hfc_female-medium.onnx", 8 | "speaker_id": 0, 9 | "noise_scale": 0.66, 10 | "length_scale": 1, 11 | "noise_w": 0.8, 12 | "sentence_silence": 0.2 13 | } --------------------------------------------------------------------------------