├── .env ├── LICENSE ├── README.md ├── requirements.txt └── script.py /.env: -------------------------------------------------------------------------------- 1 | # Model path 2 | MODEL_PATH="extensions/bark_tts/models/" 3 | 4 | # Whether to use small models 5 | USE_SMALL_MODELS=false 6 | 7 | # Whether to use CPU 8 | USE_CPU=false 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Attribution-NonCommercial 4.0 International 3 | 4 | ======================================================================= 5 | 6 | Creative Commons Corporation ("Creative Commons") is not a law firm and 7 | does not provide legal services or legal advice. Distribution of 8 | Creative Commons public licenses does not create a lawyer-client or 9 | other relationship. Creative Commons makes its licenses and related 10 | information available on an "as-is" basis. Creative Commons gives no 11 | warranties regarding its licenses, any material licensed under their 12 | terms and conditions, or any related information. Creative Commons 13 | disclaims all liability for damages resulting from their use to the 14 | fullest extent possible. 15 | 16 | Using Creative Commons Public Licenses 17 | 18 | Creative Commons public licenses provide a standard set of terms and 19 | conditions that creators and other rights holders may use to share 20 | original works of authorship and other material subject to copyright 21 | and certain other rights specified in the public license below. The 22 | following considerations are for informational purposes only, are not 23 | exhaustive, and do not form part of our licenses. 24 | 25 | Considerations for licensors: Our public licenses are 26 | intended for use by those authorized to give the public 27 | permission to use material in ways otherwise restricted by 28 | copyright and certain other rights. Our licenses are 29 | irrevocable. 
Licensors should read and understand the terms 30 | and conditions of the license they choose before applying it. 31 | Licensors should also secure all rights necessary before 32 | applying our licenses so that the public can reuse the 33 | material as expected. Licensors should clearly mark any 34 | material not subject to the license. This includes other CC- 35 | licensed material, or material used under an exception or 36 | limitation to copyright. More considerations for licensors: 37 | wiki.creativecommons.org/Considerations_for_licensors 38 | 39 | Considerations for the public: By using one of our public 40 | licenses, a licensor grants the public permission to use the 41 | licensed material under specified terms and conditions. If 42 | the licensor's permission is not necessary for any reason--for 43 | example, because of any applicable exception or limitation to 44 | copyright--then that use is not regulated by the license. Our 45 | licenses grant only permissions under copyright and certain 46 | other rights that a licensor has authority to grant. Use of 47 | the licensed material may still be restricted for other 48 | reasons, including because others have copyright or other 49 | rights in the material. A licensor may make special requests, 50 | such as asking that all changes be marked or described. 51 | Although not required by our licenses, you are encouraged to 52 | respect those requests where reasonable. More_considerations 53 | for the public: 54 | wiki.creativecommons.org/Considerations_for_licensees 55 | 56 | ======================================================================= 57 | 58 | Creative Commons Attribution-NonCommercial 4.0 International Public 59 | License 60 | 61 | By exercising the Licensed Rights (defined below), You accept and agree 62 | to be bound by the terms and conditions of this Creative Commons 63 | Attribution-NonCommercial 4.0 International Public License ("Public 64 | License"). 
To the extent this Public License may be interpreted as a 65 | contract, You are granted the Licensed Rights in consideration of Your 66 | acceptance of these terms and conditions, and the Licensor grants You 67 | such rights in consideration of benefits the Licensor receives from 68 | making the Licensed Material available under these terms and 69 | conditions. 70 | 71 | Section 1 -- Definitions. 72 | 73 | a. Adapted Material means material subject to Copyright and Similar 74 | Rights that is derived from or based upon the Licensed Material 75 | and in which the Licensed Material is translated, altered, 76 | arranged, transformed, or otherwise modified in a manner requiring 77 | permission under the Copyright and Similar Rights held by the 78 | Licensor. For purposes of this Public License, where the Licensed 79 | Material is a musical work, performance, or sound recording, 80 | Adapted Material is always produced where the Licensed Material is 81 | synched in timed relation with a moving image. 82 | 83 | b. Adapter's License means the license You apply to Your Copyright 84 | and Similar Rights in Your contributions to Adapted Material in 85 | accordance with the terms and conditions of this Public License. 86 | 87 | c. Copyright and Similar Rights means copyright and/or similar rights 88 | closely related to copyright including, without limitation, 89 | performance, broadcast, sound recording, and Sui Generis Database 90 | Rights, without regard to how the rights are labeled or 91 | categorized. For purposes of this Public License, the rights 92 | specified in Section 2(b)(1)-(2) are not Copyright and Similar 93 | Rights. 94 | d. Effective Technological Measures means those measures that, in the 95 | absence of proper authority, may not be circumvented under laws 96 | fulfilling obligations under Article 11 of the WIPO Copyright 97 | Treaty adopted on December 20, 1996, and/or similar international 98 | agreements. 99 | 100 | e. 
Exceptions and Limitations means fair use, fair dealing, and/or 101 | any other exception or limitation to Copyright and Similar Rights 102 | that applies to Your use of the Licensed Material. 103 | 104 | f. Licensed Material means the artistic or literary work, database, 105 | or other material to which the Licensor applied this Public 106 | License. 107 | 108 | g. Licensed Rights means the rights granted to You subject to the 109 | terms and conditions of this Public License, which are limited to 110 | all Copyright and Similar Rights that apply to Your use of the 111 | Licensed Material and that the Licensor has authority to license. 112 | 113 | h. Licensor means the individual(s) or entity(ies) granting rights 114 | under this Public License. 115 | 116 | i. NonCommercial means not primarily intended for or directed towards 117 | commercial advantage or monetary compensation. For purposes of 118 | this Public License, the exchange of the Licensed Material for 119 | other material subject to Copyright and Similar Rights by digital 120 | file-sharing or similar means is NonCommercial provided there is 121 | no payment of monetary compensation in connection with the 122 | exchange. 123 | 124 | j. Share means to provide material to the public by any means or 125 | process that requires permission under the Licensed Rights, such 126 | as reproduction, public display, public performance, distribution, 127 | dissemination, communication, or importation, and to make material 128 | available to the public including in ways that members of the 129 | public may access the material from a place and at a time 130 | individually chosen by them. 131 | 132 | k. Sui Generis Database Rights means rights other than copyright 133 | resulting from Directive 96/9/EC of the European Parliament and of 134 | the Council of 11 March 1996 on the legal protection of databases, 135 | as amended and/or succeeded, as well as other essentially 136 | equivalent rights anywhere in the world. 
137 | 138 | l. You means the individual or entity exercising the Licensed Rights 139 | under this Public License. Your has a corresponding meaning. 140 | 141 | Section 2 -- Scope. 142 | 143 | a. License grant. 144 | 145 | 1. Subject to the terms and conditions of this Public License, 146 | the Licensor hereby grants You a worldwide, royalty-free, 147 | non-sublicensable, non-exclusive, irrevocable license to 148 | exercise the Licensed Rights in the Licensed Material to: 149 | 150 | a. reproduce and Share the Licensed Material, in whole or 151 | in part, for NonCommercial purposes only; and 152 | 153 | b. produce, reproduce, and Share Adapted Material for 154 | NonCommercial purposes only. 155 | 156 | 2. Exceptions and Limitations. For the avoidance of doubt, where 157 | Exceptions and Limitations apply to Your use, this Public 158 | License does not apply, and You do not need to comply with 159 | its terms and conditions. 160 | 161 | 3. Term. The term of this Public License is specified in Section 162 | 6(a). 163 | 164 | 4. Media and formats; technical modifications allowed. The 165 | Licensor authorizes You to exercise the Licensed Rights in 166 | all media and formats whether now known or hereafter created, 167 | and to make technical modifications necessary to do so. The 168 | Licensor waives and/or agrees not to assert any right or 169 | authority to forbid You from making technical modifications 170 | necessary to exercise the Licensed Rights, including 171 | technical modifications necessary to circumvent Effective 172 | Technological Measures. For purposes of this Public License, 173 | simply making modifications authorized by this Section 2(a) 174 | (4) never produces Adapted Material. 175 | 176 | 5. Downstream recipients. 177 | 178 | a. Offer from the Licensor -- Licensed Material. 
Every 179 | recipient of the Licensed Material automatically 180 | receives an offer from the Licensor to exercise the 181 | Licensed Rights under the terms and conditions of this 182 | Public License. 183 | 184 | b. No downstream restrictions. You may not offer or impose 185 | any additional or different terms or conditions on, or 186 | apply any Effective Technological Measures to, the 187 | Licensed Material if doing so restricts exercise of the 188 | Licensed Rights by any recipient of the Licensed 189 | Material. 190 | 191 | 6. No endorsement. Nothing in this Public License constitutes or 192 | may be construed as permission to assert or imply that You 193 | are, or that Your use of the Licensed Material is, connected 194 | with, or sponsored, endorsed, or granted official status by, 195 | the Licensor or others designated to receive attribution as 196 | provided in Section 3(a)(1)(A)(i). 197 | 198 | b. Other rights. 199 | 200 | 1. Moral rights, such as the right of integrity, are not 201 | licensed under this Public License, nor are publicity, 202 | privacy, and/or other similar personality rights; however, to 203 | the extent possible, the Licensor waives and/or agrees not to 204 | assert any such rights held by the Licensor to the limited 205 | extent necessary to allow You to exercise the Licensed 206 | Rights, but not otherwise. 207 | 208 | 2. Patent and trademark rights are not licensed under this 209 | Public License. 210 | 211 | 3. To the extent possible, the Licensor waives any right to 212 | collect royalties from You for the exercise of the Licensed 213 | Rights, whether directly or through a collecting society 214 | under any voluntary or waivable statutory or compulsory 215 | licensing scheme. In all other cases the Licensor expressly 216 | reserves any right to collect such royalties, including when 217 | the Licensed Material is used other than for NonCommercial 218 | purposes. 219 | 220 | Section 3 -- License Conditions. 
221 | 222 | Your exercise of the Licensed Rights is expressly made subject to the 223 | following conditions. 224 | 225 | a. Attribution. 226 | 227 | 1. If You Share the Licensed Material (including in modified 228 | form), You must: 229 | 230 | a. retain the following if it is supplied by the Licensor 231 | with the Licensed Material: 232 | 233 | i. identification of the creator(s) of the Licensed 234 | Material and any others designated to receive 235 | attribution, in any reasonable manner requested by 236 | the Licensor (including by pseudonym if 237 | designated); 238 | 239 | ii. a copyright notice; 240 | 241 | iii. a notice that refers to this Public License; 242 | 243 | iv. a notice that refers to the disclaimer of 244 | warranties; 245 | 246 | v. a URI or hyperlink to the Licensed Material to the 247 | extent reasonably practicable; 248 | 249 | b. indicate if You modified the Licensed Material and 250 | retain an indication of any previous modifications; and 251 | 252 | c. indicate the Licensed Material is licensed under this 253 | Public License, and include the text of, or the URI or 254 | hyperlink to, this Public License. 255 | 256 | 2. You may satisfy the conditions in Section 3(a)(1) in any 257 | reasonable manner based on the medium, means, and context in 258 | which You Share the Licensed Material. For example, it may be 259 | reasonable to satisfy the conditions by providing a URI or 260 | hyperlink to a resource that includes the required 261 | information. 262 | 263 | 3. If requested by the Licensor, You must remove any of the 264 | information required by Section 3(a)(1)(A) to the extent 265 | reasonably practicable. 266 | 267 | 4. If You Share Adapted Material You produce, the Adapter's 268 | License You apply must not prevent recipients of the Adapted 269 | Material from complying with this Public License. 270 | 271 | Section 4 -- Sui Generis Database Rights. 
272 | 273 | Where the Licensed Rights include Sui Generis Database Rights that 274 | apply to Your use of the Licensed Material: 275 | 276 | a. for the avoidance of doubt, Section 2(a)(1) grants You the right 277 | to extract, reuse, reproduce, and Share all or a substantial 278 | portion of the contents of the database for NonCommercial purposes 279 | only; 280 | 281 | b. if You include all or a substantial portion of the database 282 | contents in a database in which You have Sui Generis Database 283 | Rights, then the database in which You have Sui Generis Database 284 | Rights (but not its individual contents) is Adapted Material; and 285 | 286 | c. You must comply with the conditions in Section 3(a) if You Share 287 | all or a substantial portion of the contents of the database. 288 | 289 | For the avoidance of doubt, this Section 4 supplements and does not 290 | replace Your obligations under this Public License where the Licensed 291 | Rights include other Copyright and Similar Rights. 292 | 293 | Section 5 -- Disclaimer of Warranties and Limitation of Liability. 294 | 295 | a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE 296 | EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS 297 | AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF 298 | ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, 299 | IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, 300 | WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR 301 | PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, 302 | ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT 303 | KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT 304 | ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. 305 | 306 | b. 
TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE 307 | TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, 308 | NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, 309 | INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, 310 | COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR 311 | USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN 312 | ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR 313 | DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR 314 | IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. 315 | 316 | c. The disclaimer of warranties and limitation of liability provided 317 | above shall be interpreted in a manner that, to the extent 318 | possible, most closely approximates an absolute disclaimer and 319 | waiver of all liability. 320 | 321 | Section 6 -- Term and Termination. 322 | 323 | a. This Public License applies for the term of the Copyright and 324 | Similar Rights licensed here. However, if You fail to comply with 325 | this Public License, then Your rights under this Public License 326 | terminate automatically. 327 | 328 | b. Where Your right to use the Licensed Material has terminated under 329 | Section 6(a), it reinstates: 330 | 331 | 1. automatically as of the date the violation is cured, provided 332 | it is cured within 30 days of Your discovery of the 333 | violation; or 334 | 335 | 2. upon express reinstatement by the Licensor. 336 | 337 | For the avoidance of doubt, this Section 6(b) does not affect any 338 | right the Licensor may have to seek remedies for Your violations 339 | of this Public License. 340 | 341 | c. For the avoidance of doubt, the Licensor may also offer the 342 | Licensed Material under separate terms or conditions or stop 343 | distributing the Licensed Material at any time; however, doing so 344 | will not terminate this Public License. 345 | 346 | d. 
Sections 1, 5, 6, 7, and 8 survive termination of this Public 347 | License. 348 | 349 | Section 7 -- Other Terms and Conditions. 350 | 351 | a. The Licensor shall not be bound by any additional or different 352 | terms or conditions communicated by You unless expressly agreed. 353 | 354 | b. Any arrangements, understandings, or agreements regarding the 355 | Licensed Material not stated herein are separate from and 356 | independent of the terms and conditions of this Public License. 357 | 358 | Section 8 -- Interpretation. 359 | 360 | a. For the avoidance of doubt, this Public License does not, and 361 | shall not be interpreted to, reduce, limit, restrict, or impose 362 | conditions on any use of the Licensed Material that could lawfully 363 | be made without permission under this Public License. 364 | 365 | b. To the extent possible, if any provision of this Public License is 366 | deemed unenforceable, it shall be automatically reformed to the 367 | minimum extent necessary to make it enforceable. If the provision 368 | cannot be reformed, it shall be severed from this Public License 369 | without affecting the enforceability of the remaining terms and 370 | conditions. 371 | 372 | c. No term or condition of this Public License will be waived and no 373 | failure to comply consented to unless expressly agreed to by the 374 | Licensor. 375 | 376 | d. Nothing in this Public License constitutes or may be interpreted 377 | as a limitation upon, or waiver of, any privileges and immunities 378 | that apply to the Licensor or You, including from the legal 379 | processes of any jurisdiction or authority. 380 | 381 | ======================================================================= 382 | 383 | Creative Commons is not a party to its public 384 | licenses. 
Notwithstanding, Creative Commons may elect to apply one of 385 | its public licenses to material it publishes and in those instances 386 | will be considered the “Licensor.” The text of the Creative Commons 387 | public licenses is dedicated to the public domain under the CC0 Public 388 | Domain Dedication. Except for the limited purpose of indicating that 389 | material is shared under a Creative Commons public license or as 390 | otherwise permitted by the Creative Commons policies published at 391 | creativecommons.org/policies, Creative Commons does not authorize the 392 | use of the trademark "Creative Commons" or any other trademark or logo 393 | of Creative Commons without its prior written consent including, 394 | without limitation, in connection with any unauthorized modifications 395 | to any of its public licenses or any other arrangements, 396 | understandings, or agreements concerning use of licensed material. For 397 | the avoidance of doubt, this paragraph does not form part of the 398 | public licenses. 399 | 400 | Creative Commons may be contacted at creativecommons.org. 401 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # text-generation-webui-barktts 2 | A simple extension for the [text-generation-webui by oobabooga](https://github.com/oobabooga/text-generation-webui) that uses [Bark](https://github.com/suno-ai/bark) for audio output. 3 | 4 | # ⚠️Important Info⚠️ 5 | This repo is not actively maintained _(at least for a while)_. 6 | If you want a version that works and fixes some issues with my code, 7 | have a look at [RandomInternetPreson's fork](https://github.com/RandomInternetPreson/text-generation-webui-barktts). 8 | 9 | ## How to install 10 | Assuming you already have the webui set up: 11 | 12 | 1. Activate the conda environment with the `cmd_xxx.bat` script or by running `conda activate textgen` 13 | 2.
Enter the `text-generation-webui/extensions/` directory and clone this repository: 14 | ``` 15 | cd text-generation-webui/extensions/ 16 | git clone https://github.com/minemo/text-generation-webui-barktts bark_tts/ 17 | ``` 18 | 3. Return to the `text-generation-webui/` root and install the requirements: 19 | ``` 20 | pip install -r extensions/bark_tts/requirements.txt 21 | ``` 22 | 4. Add `--extensions bark_tts` to your startup script, or enable it through the `Interface Mode` tab in the webui 23 | 24 | ## Tips 25 | The full version of Bark requires around 12 GB of memory to hold everything on the GPU at the same time. However, even smaller cards down to ~2 GB work with some additional settings. For this extension, open `extensions/bark_tts/.env` and set `USE_SMALL_MODELS` and `USE_CPU` to `true`: 26 | 27 | ``` 28 | # Whether to use small models 29 | USE_SMALL_MODELS=true 30 | 31 | # Whether to use CPU 32 | USE_CPU=true 33 | ``` 34 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | git+https://github.com/suno-ai/bark.git 2 | nltk 3 | python-dotenv 4 | -------------------------------------------------------------------------------- /script.py: -------------------------------------------------------------------------------- 1 | """ 2 | Bark TTS extension for https://github.com/oobabooga/text-generation-webui/ 3 | All credit for the amazing tts model goes to https://github.com/suno-ai/bark 4 | """ 5 | import hashlib 6 | from http.client import IncompleteRead 7 | import os 8 | import time 9 | import urllib.request 10 | from pathlib import Path 11 | from dotenv import load_dotenv 12 | 13 | # Read .env file 14 | load_dotenv() 15 | 16 | # This environment variable must be set before importing bark 17 | model_path = Path(os.environ.get('MODEL_PATH', 'extensions/bark_tts/models/')) 18 | os.environ['XDG_CACHE_HOME'] = model_path.resolve().as_posix() 19 | 20 | import nltk 21 | import gradio as gr 22 | import numpy as np 23 | from bark import SAMPLE_RATE, preload_models 24 | from bark.generation import ALLOWED_PROMPTS, generate_text_semantic 25 | from bark.api import semantic_to_waveform 26 | from modules import shared 27 | from scipy.io.wavfile import write as write_wav 28 | 29 | nltk.download('punkt') 30 | 31 | params = { 32 | 'activate': True, 33 | 'autoplay': False, 34
| 'forced_speaker_enabled': False, 35 | 'forced_speaker': 'Man', 36 | 'show_text': False, 37 | 'modifiers': [], 38 | 'use_small_models': os.environ.get("USE_SMALL_MODELS", 'false').lower() == 'true', 39 | 'use_cpu': os.environ.get("USE_CPU", 'false').lower() == 'true', 40 | 'force_manual_download': False, 41 | 'voice': 'v2/en_speaker_3', 42 | 'sample_rate': SAMPLE_RATE, 43 | 'temperature': 0.7 44 | } 45 | 46 | input_hijack = { 47 | 'state': False, 48 | 'value': ["", ""] 49 | } 50 | 51 | streaming_state = shared.args.no_stream 52 | forced_modes = ["Man", "Woman", "Narrator"] 53 | modifier_options = ["[laughter]","[laughs]","[sighs]","[music]","[gasps]","[clears throat]"] 54 | voice_presets = sorted(list(ALLOWED_PROMPTS)) 55 | 56 | def manual_model_preload(): 57 | for model in ["text","coarse","fine","text_2","coarse_2","fine_2"]: 58 | remote_url=f"https://dl.suno-models.io/bark/models/v0/{model}.pt" 59 | remote_md5=hashlib.md5(remote_url.encode()).hexdigest() 60 | out_path = f"{os.environ['XDG_CACHE_HOME']}/suno/bark_v0/{remote_md5}.pt" # same cache root that was set before importing bark 61 | if not Path(out_path).exists(): 62 | print(f"\t+ Downloading {model} model to {out_path}...") 63 | # the download only works with a browser-like User-Agent header 64 | req = urllib.request.Request(remote_url, headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0'}) 65 | with urllib.request.urlopen(req) as response, open(out_path, 'wb') as out_file: 66 | try: 67 | data = response.read() 68 | except IncompleteRead as e: 69 | data = e.partial 70 | out_file.write(data) 71 | else: 72 | print(f"\t+ {model} model already exists, skipping...") 73 | preload_models( 74 | text_use_gpu= not params['use_cpu'], 75 | text_use_small= params['use_small_models'], 76 | coarse_use_gpu= not params['use_cpu'], 77 | coarse_use_small=params['use_small_models'], 78 | fine_use_gpu= not params['use_cpu'], 79 | fine_use_small=params['use_small_models'], 80 | codec_use_gpu= not
params['use_cpu'] 81 | ) 82 | 83 | def input_modifier(string): 84 | if not params['activate']: 85 | shared.processing_message = "*Is typing...*" 86 | return string 87 | shared.processing_message = "*Is recording a voice message...*" 88 | shared.args.no_stream = True 89 | return string 90 | 91 | def output_modifier(string): 92 | 93 | if not params['activate']: 94 | return string 95 | 96 | ttstext = string 97 | 98 | if params['modifiers']: 99 | ttstext = f"{' '.join(params['modifiers'])}: {ttstext}" 100 | 101 | if params['forced_speaker_enabled']: 102 | ttstext = f"{params['forced_speaker'].upper()}: {ttstext}" 103 | 104 | sentences = nltk.sent_tokenize(ttstext) 105 | silence = np.zeros(int(0.25 * params['sample_rate'])) # quarter second of silence 106 | pieces = [] 107 | for sentence in sentences: 108 | semantic_tokens = generate_text_semantic( 109 | sentence, 110 | history_prompt=params['voice'], 111 | temp=params['temperature'], 112 | min_eos_p=0.05, # this controls how likely the generation is to end 113 | ) 114 | audio_array = semantic_to_waveform(semantic_tokens, history_prompt=params['voice'],) 115 | pieces += [audio_array, silence.copy()] 116 | audio = np.array(np.concatenate(pieces), dtype="float32") 117 | time_label = int(time.time()) 118 | write_wav(f"extensions/bark_tts/generated/{shared.character}_{time_label}.wav", params['sample_rate'], audio) 119 | autoplay = 'autoplay' if params['autoplay'] else '' 120 | if params['show_text']: 121 | string = f'
{ttstext}' 122 | else: 123 | string = f'' 124 | 125 | shared.args.no_stream = streaming_state 126 | return string 127 | 128 | 129 | def setup(): 130 | # tell the user what's going on 131 | print() 132 | print("== Loading Bark TTS extension ==") 133 | print("+ This may take a while on first run don't worry!") 134 | 135 | print("+ Creating directories (if they don't exist)...") 136 | if not Path("extensions/bark_tts/generated").exists(): 137 | Path("extensions/bark_tts/generated").mkdir(parents=True) 138 | if not Path(model_path).exists(): 139 | Path(model_path).mkdir(parents=True) 140 | print("+ Done!") 141 | 142 | # load models into extension directory so we don't clutter the pc 143 | print("+ Loading model...") 144 | if not params['force_manual_download']: 145 | try: 146 | preload_models( 147 | text_use_gpu= not params['use_cpu'], 148 | text_use_small= params['use_small_models'], 149 | coarse_use_gpu= not params['use_cpu'], 150 | coarse_use_small=params['use_small_models'], 151 | fine_use_gpu= not params['use_cpu'], 152 | fine_use_small=params['use_small_models'], 153 | codec_use_gpu= not params['use_cpu'] 154 | ) 155 | except ValueError as e: 156 | # for some reason the download fails sometimes, so we just do it manually 157 | # solution adapted from https://github.com/suno-ai/bark/issues/46 158 | print("\t+ Automatic download failed, trying manual download...") 159 | manual_model_preload() 160 | 161 | else: 162 | print("\t+ Forcing manual download...") 163 | manual_model_preload() 164 | 165 | 166 | 167 | print("+ Done!") 168 | 169 | print("== Bark TTS extension loaded ==\n\n") 170 | 171 | def ui(): 172 | with gr.Accordion("Bark TTS"): 173 | with gr.Row(): 174 | activate = gr.Checkbox(value=params['activate'], label='Activate TTS') 175 | autoplay = gr.Checkbox(value=params['autoplay'], label='Autoplay') 176 | show_text = gr.Checkbox(value=params['show_text'], label='Show text') 177 | forced_speaker_enabled = gr.Checkbox(value=params['forced_speaker_enabled'], 
label='Forced speaker enabled') 178 | with gr.Row(): 179 | forced_speaker = gr.Dropdown(forced_modes, label='Forced speaker', value=params['forced_speaker']) 180 | modifiers = gr.Dropdown(modifier_options, label='Modifiers', value=params['modifiers'], multiselect=True) 181 | voice = gr.Dropdown(voice_presets, label='Voice Preset', value=params['voice']) 182 | with gr.Row(): 183 | sample_rate = gr.Slider(minimum=18000, maximum=30000, value=params['sample_rate'], label='Sample Rate') 184 | temperature = gr.Slider(minimum=0.1, maximum=1.0, value=params['temperature'], label='Temperature') 185 | 186 | activate.change(lambda x: params.update({'activate': x}), activate, None) 187 | autoplay.change(lambda x: params.update({'autoplay': x}), autoplay, None) 188 | show_text.change(lambda x: params.update({'show_text': x}), show_text, None) 189 | forced_speaker_enabled.change(lambda x: params.update({'forced_speaker_enabled': x}), forced_speaker_enabled, None) 190 | forced_speaker.change(lambda x: params.update({'forced_speaker': x}), forced_speaker, None) 191 | modifiers.change(lambda x: params.update({'modifiers': x}), modifiers, None) 192 | voice.change(lambda x: params.update({'voice': x}), voice, None) 193 | sample_rate.change(lambda x: params.update({'sample_rate': x}), sample_rate, None) 194 | temperature.change(lambda x: params.update({'temperature': x}), temperature, None) 195 | --------------------------------------------------------------------------------
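The manual download in `script.py` names each cached model file after the MD5 hex digest of its download URL. The following is a minimal sketch of that naming scheme, not part of the repository: `bark_cache_path` is a hypothetical helper, and resolving the cache under `XDG_CACHE_HOME` with a `suno/bark_v0` subdirectory is an assumption based on the paths and the environment variable that `script.py` itself uses.

```python
import hashlib
import os
from pathlib import Path
from typing import Optional


def bark_cache_path(model: str, cache_home: Optional[str] = None) -> Path:
    """Hypothetical helper: expected cache location for one Bark model file."""
    # Same URL pattern that script.py uses for the manual download
    remote_url = f"https://dl.suno-models.io/bark/models/v0/{model}.pt"
    # The file name is the MD5 hex digest of the URL, as in script.py
    remote_md5 = hashlib.md5(remote_url.encode()).hexdigest()
    # Assumption: prefer an explicit cache root, then XDG_CACHE_HOME, then ~/.cache
    base = cache_home or os.environ.get("XDG_CACHE_HOME") or os.path.expanduser("~/.cache")
    return Path(base) / "suno" / "bark_v0" / f"{remote_md5}.pt"


if __name__ == "__main__":
    # Example: where the "text" model would land for the default MODEL_PATH
    print(bark_cache_path("text", cache_home="extensions/bark_tts/models"))
```

Passing the cache root explicitly keeps the helper easy to check in isolation; inside the extension, the `XDG_CACHE_HOME` value set at the top of `script.py` would be picked up by the fallback.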