├── .env
├── LICENSE
├── README.md
├── requirements.txt
└── script.py
/.env:
--------------------------------------------------------------------------------
1 | # Model path
2 | MODEL_PATH="extensions/bark_tts/models/"
3 |
4 | # Whether to use small models
5 | USE_SMALL_MODELS=false
6 |
7 | # Whether to use CPU
8 | USE_CPU=false
9 |
--------------------------------------------------------------------------------
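A minimal sketch of how flags like these can be interpreted, mirroring the `os.environ.get(...).lower() == 'true'` comparison that `script.py` applies to `USE_SMALL_MODELS` and `USE_CPU`. The helper name `env_flag` is hypothetical and not part of the extension.

```python
import os


def env_flag(name: str, default: str = "false") -> bool:
    """Return True only when the variable equals 'true', ignoring case.

    Hypothetical helper; it mirrors the comparison script.py applies to
    USE_SMALL_MODELS and USE_CPU.
    """
    return os.environ.get(name, default).lower() == "true"


if __name__ == "__main__":
    os.environ["USE_SMALL_MODELS"] = "True"
    os.environ["USE_CPU"] = "1"
    print(env_flag("USE_SMALL_MODELS"))  # True
    print(env_flag("USE_CPU"))           # False: "1" is not the string "true"
```

Under this rule only the string `true` (in any letter case) enables a flag, so values such as `1` or `yes` are treated as false.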
/LICENSE:
--------------------------------------------------------------------------------
1 |
2 | Attribution-NonCommercial 4.0 International
3 |
4 | =======================================================================
5 |
6 | Creative Commons Corporation ("Creative Commons") is not a law firm and
7 | does not provide legal services or legal advice. Distribution of
8 | Creative Commons public licenses does not create a lawyer-client or
9 | other relationship. Creative Commons makes its licenses and related
10 | information available on an "as-is" basis. Creative Commons gives no
11 | warranties regarding its licenses, any material licensed under their
12 | terms and conditions, or any related information. Creative Commons
13 | disclaims all liability for damages resulting from their use to the
14 | fullest extent possible.
15 |
16 | Using Creative Commons Public Licenses
17 |
18 | Creative Commons public licenses provide a standard set of terms and
19 | conditions that creators and other rights holders may use to share
20 | original works of authorship and other material subject to copyright
21 | and certain other rights specified in the public license below. The
22 | following considerations are for informational purposes only, are not
23 | exhaustive, and do not form part of our licenses.
24 |
25 | Considerations for licensors: Our public licenses are
26 | intended for use by those authorized to give the public
27 | permission to use material in ways otherwise restricted by
28 | copyright and certain other rights. Our licenses are
29 | irrevocable. Licensors should read and understand the terms
30 | and conditions of the license they choose before applying it.
31 | Licensors should also secure all rights necessary before
32 | applying our licenses so that the public can reuse the
33 | material as expected. Licensors should clearly mark any
34 | material not subject to the license. This includes other CC-
35 | licensed material, or material used under an exception or
36 | limitation to copyright. More considerations for licensors:
37 | wiki.creativecommons.org/Considerations_for_licensors
38 |
39 | Considerations for the public: By using one of our public
40 | licenses, a licensor grants the public permission to use the
41 | licensed material under specified terms and conditions. If
42 | the licensor's permission is not necessary for any reason--for
43 | example, because of any applicable exception or limitation to
44 | copyright--then that use is not regulated by the license. Our
45 | licenses grant only permissions under copyright and certain
46 | other rights that a licensor has authority to grant. Use of
47 | the licensed material may still be restricted for other
48 | reasons, including because others have copyright or other
49 | rights in the material. A licensor may make special requests,
50 | such as asking that all changes be marked or described.
51 | Although not required by our licenses, you are encouraged to
52 | respect those requests where reasonable. More_considerations
53 | for the public:
54 | wiki.creativecommons.org/Considerations_for_licensees
55 |
56 | =======================================================================
57 |
58 | Creative Commons Attribution-NonCommercial 4.0 International Public
59 | License
60 |
61 | By exercising the Licensed Rights (defined below), You accept and agree
62 | to be bound by the terms and conditions of this Creative Commons
63 | Attribution-NonCommercial 4.0 International Public License ("Public
64 | License"). To the extent this Public License may be interpreted as a
65 | contract, You are granted the Licensed Rights in consideration of Your
66 | acceptance of these terms and conditions, and the Licensor grants You
67 | such rights in consideration of benefits the Licensor receives from
68 | making the Licensed Material available under these terms and
69 | conditions.
70 |
71 | Section 1 -- Definitions.
72 |
73 | a. Adapted Material means material subject to Copyright and Similar
74 | Rights that is derived from or based upon the Licensed Material
75 | and in which the Licensed Material is translated, altered,
76 | arranged, transformed, or otherwise modified in a manner requiring
77 | permission under the Copyright and Similar Rights held by the
78 | Licensor. For purposes of this Public License, where the Licensed
79 | Material is a musical work, performance, or sound recording,
80 | Adapted Material is always produced where the Licensed Material is
81 | synched in timed relation with a moving image.
82 |
83 | b. Adapter's License means the license You apply to Your Copyright
84 | and Similar Rights in Your contributions to Adapted Material in
85 | accordance with the terms and conditions of this Public License.
86 |
87 | c. Copyright and Similar Rights means copyright and/or similar rights
88 | closely related to copyright including, without limitation,
89 | performance, broadcast, sound recording, and Sui Generis Database
90 | Rights, without regard to how the rights are labeled or
91 | categorized. For purposes of this Public License, the rights
92 | specified in Section 2(b)(1)-(2) are not Copyright and Similar
93 | Rights.
94 | d. Effective Technological Measures means those measures that, in the
95 | absence of proper authority, may not be circumvented under laws
96 | fulfilling obligations under Article 11 of the WIPO Copyright
97 | Treaty adopted on December 20, 1996, and/or similar international
98 | agreements.
99 |
100 | e. Exceptions and Limitations means fair use, fair dealing, and/or
101 | any other exception or limitation to Copyright and Similar Rights
102 | that applies to Your use of the Licensed Material.
103 |
104 | f. Licensed Material means the artistic or literary work, database,
105 | or other material to which the Licensor applied this Public
106 | License.
107 |
108 | g. Licensed Rights means the rights granted to You subject to the
109 | terms and conditions of this Public License, which are limited to
110 | all Copyright and Similar Rights that apply to Your use of the
111 | Licensed Material and that the Licensor has authority to license.
112 |
113 | h. Licensor means the individual(s) or entity(ies) granting rights
114 | under this Public License.
115 |
116 | i. NonCommercial means not primarily intended for or directed towards
117 | commercial advantage or monetary compensation. For purposes of
118 | this Public License, the exchange of the Licensed Material for
119 | other material subject to Copyright and Similar Rights by digital
120 | file-sharing or similar means is NonCommercial provided there is
121 | no payment of monetary compensation in connection with the
122 | exchange.
123 |
124 | j. Share means to provide material to the public by any means or
125 | process that requires permission under the Licensed Rights, such
126 | as reproduction, public display, public performance, distribution,
127 | dissemination, communication, or importation, and to make material
128 | available to the public including in ways that members of the
129 | public may access the material from a place and at a time
130 | individually chosen by them.
131 |
132 | k. Sui Generis Database Rights means rights other than copyright
133 | resulting from Directive 96/9/EC of the European Parliament and of
134 | the Council of 11 March 1996 on the legal protection of databases,
135 | as amended and/or succeeded, as well as other essentially
136 | equivalent rights anywhere in the world.
137 |
138 | l. You means the individual or entity exercising the Licensed Rights
139 | under this Public License. Your has a corresponding meaning.
140 |
141 | Section 2 -- Scope.
142 |
143 | a. License grant.
144 |
145 | 1. Subject to the terms and conditions of this Public License,
146 | the Licensor hereby grants You a worldwide, royalty-free,
147 | non-sublicensable, non-exclusive, irrevocable license to
148 | exercise the Licensed Rights in the Licensed Material to:
149 |
150 | a. reproduce and Share the Licensed Material, in whole or
151 | in part, for NonCommercial purposes only; and
152 |
153 | b. produce, reproduce, and Share Adapted Material for
154 | NonCommercial purposes only.
155 |
156 | 2. Exceptions and Limitations. For the avoidance of doubt, where
157 | Exceptions and Limitations apply to Your use, this Public
158 | License does not apply, and You do not need to comply with
159 | its terms and conditions.
160 |
161 | 3. Term. The term of this Public License is specified in Section
162 | 6(a).
163 |
164 | 4. Media and formats; technical modifications allowed. The
165 | Licensor authorizes You to exercise the Licensed Rights in
166 | all media and formats whether now known or hereafter created,
167 | and to make technical modifications necessary to do so. The
168 | Licensor waives and/or agrees not to assert any right or
169 | authority to forbid You from making technical modifications
170 | necessary to exercise the Licensed Rights, including
171 | technical modifications necessary to circumvent Effective
172 | Technological Measures. For purposes of this Public License,
173 | simply making modifications authorized by this Section 2(a)
174 | (4) never produces Adapted Material.
175 |
176 | 5. Downstream recipients.
177 |
178 | a. Offer from the Licensor -- Licensed Material. Every
179 | recipient of the Licensed Material automatically
180 | receives an offer from the Licensor to exercise the
181 | Licensed Rights under the terms and conditions of this
182 | Public License.
183 |
184 | b. No downstream restrictions. You may not offer or impose
185 | any additional or different terms or conditions on, or
186 | apply any Effective Technological Measures to, the
187 | Licensed Material if doing so restricts exercise of the
188 | Licensed Rights by any recipient of the Licensed
189 | Material.
190 |
191 | 6. No endorsement. Nothing in this Public License constitutes or
192 | may be construed as permission to assert or imply that You
193 | are, or that Your use of the Licensed Material is, connected
194 | with, or sponsored, endorsed, or granted official status by,
195 | the Licensor or others designated to receive attribution as
196 | provided in Section 3(a)(1)(A)(i).
197 |
198 | b. Other rights.
199 |
200 | 1. Moral rights, such as the right of integrity, are not
201 | licensed under this Public License, nor are publicity,
202 | privacy, and/or other similar personality rights; however, to
203 | the extent possible, the Licensor waives and/or agrees not to
204 | assert any such rights held by the Licensor to the limited
205 | extent necessary to allow You to exercise the Licensed
206 | Rights, but not otherwise.
207 |
208 | 2. Patent and trademark rights are not licensed under this
209 | Public License.
210 |
211 | 3. To the extent possible, the Licensor waives any right to
212 | collect royalties from You for the exercise of the Licensed
213 | Rights, whether directly or through a collecting society
214 | under any voluntary or waivable statutory or compulsory
215 | licensing scheme. In all other cases the Licensor expressly
216 | reserves any right to collect such royalties, including when
217 | the Licensed Material is used other than for NonCommercial
218 | purposes.
219 |
220 | Section 3 -- License Conditions.
221 |
222 | Your exercise of the Licensed Rights is expressly made subject to the
223 | following conditions.
224 |
225 | a. Attribution.
226 |
227 | 1. If You Share the Licensed Material (including in modified
228 | form), You must:
229 |
230 | a. retain the following if it is supplied by the Licensor
231 | with the Licensed Material:
232 |
233 | i. identification of the creator(s) of the Licensed
234 | Material and any others designated to receive
235 | attribution, in any reasonable manner requested by
236 | the Licensor (including by pseudonym if
237 | designated);
238 |
239 | ii. a copyright notice;
240 |
241 | iii. a notice that refers to this Public License;
242 |
243 | iv. a notice that refers to the disclaimer of
244 | warranties;
245 |
246 | v. a URI or hyperlink to the Licensed Material to the
247 | extent reasonably practicable;
248 |
249 | b. indicate if You modified the Licensed Material and
250 | retain an indication of any previous modifications; and
251 |
252 | c. indicate the Licensed Material is licensed under this
253 | Public License, and include the text of, or the URI or
254 | hyperlink to, this Public License.
255 |
256 | 2. You may satisfy the conditions in Section 3(a)(1) in any
257 | reasonable manner based on the medium, means, and context in
258 | which You Share the Licensed Material. For example, it may be
259 | reasonable to satisfy the conditions by providing a URI or
260 | hyperlink to a resource that includes the required
261 | information.
262 |
263 | 3. If requested by the Licensor, You must remove any of the
264 | information required by Section 3(a)(1)(A) to the extent
265 | reasonably practicable.
266 |
267 | 4. If You Share Adapted Material You produce, the Adapter's
268 | License You apply must not prevent recipients of the Adapted
269 | Material from complying with this Public License.
270 |
271 | Section 4 -- Sui Generis Database Rights.
272 |
273 | Where the Licensed Rights include Sui Generis Database Rights that
274 | apply to Your use of the Licensed Material:
275 |
276 | a. for the avoidance of doubt, Section 2(a)(1) grants You the right
277 | to extract, reuse, reproduce, and Share all or a substantial
278 | portion of the contents of the database for NonCommercial purposes
279 | only;
280 |
281 | b. if You include all or a substantial portion of the database
282 | contents in a database in which You have Sui Generis Database
283 | Rights, then the database in which You have Sui Generis Database
284 | Rights (but not its individual contents) is Adapted Material; and
285 |
286 | c. You must comply with the conditions in Section 3(a) if You Share
287 | all or a substantial portion of the contents of the database.
288 |
289 | For the avoidance of doubt, this Section 4 supplements and does not
290 | replace Your obligations under this Public License where the Licensed
291 | Rights include other Copyright and Similar Rights.
292 |
293 | Section 5 -- Disclaimer of Warranties and Limitation of Liability.
294 |
295 | a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
296 | EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
297 | AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
298 | ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
299 | IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
300 | WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
301 | PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
302 | ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
303 | KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
304 | ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.
305 |
306 | b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
307 | TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
308 | NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
309 | INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
310 | COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
311 | USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
312 | ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
313 | DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
314 | IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.
315 |
316 | c. The disclaimer of warranties and limitation of liability provided
317 | above shall be interpreted in a manner that, to the extent
318 | possible, most closely approximates an absolute disclaimer and
319 | waiver of all liability.
320 |
321 | Section 6 -- Term and Termination.
322 |
323 | a. This Public License applies for the term of the Copyright and
324 | Similar Rights licensed here. However, if You fail to comply with
325 | this Public License, then Your rights under this Public License
326 | terminate automatically.
327 |
328 | b. Where Your right to use the Licensed Material has terminated under
329 | Section 6(a), it reinstates:
330 |
331 | 1. automatically as of the date the violation is cured, provided
332 | it is cured within 30 days of Your discovery of the
333 | violation; or
334 |
335 | 2. upon express reinstatement by the Licensor.
336 |
337 | For the avoidance of doubt, this Section 6(b) does not affect any
338 | right the Licensor may have to seek remedies for Your violations
339 | of this Public License.
340 |
341 | c. For the avoidance of doubt, the Licensor may also offer the
342 | Licensed Material under separate terms or conditions or stop
343 | distributing the Licensed Material at any time; however, doing so
344 | will not terminate this Public License.
345 |
346 | d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
347 | License.
348 |
349 | Section 7 -- Other Terms and Conditions.
350 |
351 | a. The Licensor shall not be bound by any additional or different
352 | terms or conditions communicated by You unless expressly agreed.
353 |
354 | b. Any arrangements, understandings, or agreements regarding the
355 | Licensed Material not stated herein are separate from and
356 | independent of the terms and conditions of this Public License.
357 |
358 | Section 8 -- Interpretation.
359 |
360 | a. For the avoidance of doubt, this Public License does not, and
361 | shall not be interpreted to, reduce, limit, restrict, or impose
362 | conditions on any use of the Licensed Material that could lawfully
363 | be made without permission under this Public License.
364 |
365 | b. To the extent possible, if any provision of this Public License is
366 | deemed unenforceable, it shall be automatically reformed to the
367 | minimum extent necessary to make it enforceable. If the provision
368 | cannot be reformed, it shall be severed from this Public License
369 | without affecting the enforceability of the remaining terms and
370 | conditions.
371 |
372 | c. No term or condition of this Public License will be waived and no
373 | failure to comply consented to unless expressly agreed to by the
374 | Licensor.
375 |
376 | d. Nothing in this Public License constitutes or may be interpreted
377 | as a limitation upon, or waiver of, any privileges and immunities
378 | that apply to the Licensor or You, including from the legal
379 | processes of any jurisdiction or authority.
380 |
381 | =======================================================================
382 |
383 | Creative Commons is not a party to its public
384 | licenses. Notwithstanding, Creative Commons may elect to apply one of
385 | its public licenses to material it publishes and in those instances
386 | will be considered the “Licensor.” The text of the Creative Commons
387 | public licenses is dedicated to the public domain under the CC0 Public
388 | Domain Dedication. Except for the limited purpose of indicating that
389 | material is shared under a Creative Commons public license or as
390 | otherwise permitted by the Creative Commons policies published at
391 | creativecommons.org/policies, Creative Commons does not authorize the
392 | use of the trademark "Creative Commons" or any other trademark or logo
393 | of Creative Commons without its prior written consent including,
394 | without limitation, in connection with any unauthorized modifications
395 | to any of its public licenses or any other arrangements,
396 | understandings, or agreements concerning use of licensed material. For
397 | the avoidance of doubt, this paragraph does not form part of the
398 | public licenses.
399 |
400 | Creative Commons may be contacted at creativecommons.org.
401 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # text-generation-webui-barktts
2 | A simple extension for the [text-generation-webui by oobabooga](https://github.com/oobabooga/text-generation-webui) that uses [Bark](https://github.com/suno-ai/bark) for audio output.
3 |
4 | # ⚠️Important Info⚠️
5 | This repo is not actively maintained _(at least for a while)_.
6 | If you want a version that works and fixes some issues with my code,
7 | have a look at [RandomInternetPreson's fork](https://github.com/RandomInternetPreson/text-generation-webui-barktts).
8 |
9 | ## How to install
10 | Assuming you already have the webui set up:
11 |
12 | 1. Activate the conda environment with the `cmd_xxx.bat` script or with `conda activate textgen`
13 | 2. Enter the `text-generation-webui/extensions/` directory and clone this repository
14 | ```
15 | cd text-generation-webui/extensions/
16 | git clone https://github.com/minemo/text-generation-webui-barktts bark_tts/
17 | ```
18 | 3. Go back to the `text-generation-webui` root directory and install the requirements
19 | ```
20 | pip install -r extensions/bark_tts/requirements.txt
21 | ```
22 | 4. Add `--extensions bark_tts` to your startup script or enable it through the `Interface Mode` tab in the webui
23 |
24 | ## Tips
25 | The full version of Bark requires around 12 GB of memory to hold everything on the GPU at the same time. However, smaller cards with as little as ~2 GB also work with some additional settings. For this extension, open `extensions/bark_tts/.env` and set `USE_SMALL_MODELS` and `USE_CPU` to `true`:
26 |
27 | ```
28 | # Whether to use small models
29 | USE_SMALL_MODELS=true
30 |
31 | # Whether to use CPU
32 | USE_CPU=true
33 | ```
34 |
--------------------------------------------------------------------------------
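The edit described in the Tips section can also be scripted. The following is a minimal sketch that rewrites `KEY=value` lines in `.env`-style text; the function name `set_env_values` is hypothetical, and the sketch assumes simple unquoted `KEY=value` lines like the ones in this repository's `.env`.

```python
def set_env_values(text: str, updates: dict) -> str:
    """Return .env text with the given KEY=value pairs replaced or appended."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            # Comments, blank lines and unrelated keys are kept unchanged.
            out.append(line)
    # Keys that were not present yet are appended at the end.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = (
        "# Whether to use small models\n"
        "USE_SMALL_MODELS=false\n"
        "\n"
        "# Whether to use CPU\n"
        "USE_CPU=false\n"
    )
    print(set_env_values(original, {"USE_SMALL_MODELS": "true", "USE_CPU": "true"}))
```

To apply it to the real file, read `extensions/bark_tts/.env`, pass its text through the function, and write the result back before starting the webui.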
/requirements.txt:
--------------------------------------------------------------------------------
1 | git+https://github.com/suno-ai/bark.git
2 | nltk
3 | python-dotenv
4 |
--------------------------------------------------------------------------------
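`manual_model_preload` in `script.py` names each downloaded checkpoint after the MD5 hex digest of its download URL. A minimal sketch of that naming step follows; the function name `checkpoint_path` and the example URL are hypothetical, and the cache directory is passed in by the caller instead of being hard-coded.

```python
import hashlib
from pathlib import Path


def checkpoint_path(remote_url: str, cache_dir: Path) -> Path:
    """Map a download URL to <cache_dir>/<md5 of the URL string>.pt."""
    digest = hashlib.md5(remote_url.encode()).hexdigest()
    return cache_dir / f"{digest}.pt"


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    path = checkpoint_path("https://example.invalid/models/text.pt", Path("models"))
    print(path.name)  # 32 hex characters followed by ".pt"
```

Because the digest is computed from the URL string and not from the file contents, it identifies which model a file belongs to but does not verify that the download is complete.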
/script.py:
--------------------------------------------------------------------------------
1 | """
2 | Bark TTS extension for https://github.com/oobabooga/text-generation-webui/
3 | All credit for the amazing tts model goes to https://github.com/suno-ai/bark
4 | """
5 | import hashlib
6 | from http.client import IncompleteRead
7 | import os
8 | import time
9 | import urllib.request
10 | from pathlib import Path
11 | from dotenv import load_dotenv
12 |
13 | # Read .env file
14 | load_dotenv()
15 |
16 | # Set this environment variable before importing bark so that its cache ends up under MODEL_PATH
17 | model_path = Path(os.environ.get('MODEL_PATH', 'extensions/bark_tts/models/'))
18 | os.environ['XDG_CACHE_HOME'] = model_path.resolve().as_posix()
19 |
20 | import nltk
21 | import gradio as gr
22 | import numpy as np
23 | from bark import SAMPLE_RATE, preload_models
24 | from bark.generation import ALLOWED_PROMPTS, generate_text_semantic
25 | from bark.api import semantic_to_waveform
26 | from modules import shared
27 | from scipy.io.wavfile import write as write_wav
28 |
29 | nltk.download('punkt')
30 |
31 | params = {
32 |     'activate': True,
33 |     'autoplay': False,
34 |     'forced_speaker_enabled': False,
35 |     'forced_speaker': 'Man',
36 |     'show_text': False,
37 |     'modifiers': [],
38 |     'use_small_models': os.environ.get("USE_SMALL_MODELS", 'false').lower() == 'true',
39 |     'use_cpu': os.environ.get("USE_CPU", 'false').lower() == 'true',
40 |     'force_manual_download': False,
41 |     'voice': 'v2/en_speaker_3',
42 |     'sample_rate': SAMPLE_RATE,
43 |     'temperature': 0.7
44 | }
45 |
46 | input_hijack = {
47 |     'state': False,
48 |     'value': ["", ""]
49 | }
50 |
51 | streaming_state = shared.args.no_stream
52 | forced_modes = ["Man", "Woman", "Narrator"]
53 | modifier_options = ["[laughter]","[laughs]","[sighs]","[music]","[gasps]","[clears throat]"]
54 | voice_presets = sorted(list(ALLOWED_PROMPTS))
55 |
56 | def manual_model_preload():
57 |     for model in ["text","coarse","fine","text_2","coarse_2","fine_2"]:
58 |         remote_url = f"https://dl.suno-models.io/bark/models/v0/{model}.pt"
59 |         out_path = Path(os.environ['XDG_CACHE_HOME'], 'suno', 'bark_v0', f"{hashlib.md5(remote_url.encode()).hexdigest()}.pt")
60 |         if not out_path.exists():
61 |             out_path.parent.mkdir(parents=True, exist_ok=True)  # XDG_CACHE_HOME is set above; assumes bark caches under suno/bark_v0 there
62 |             print(f"\t+ Downloading {model} model to {out_path}...")
63 |             # we also have to do some user agent tomfoolery to get the download to work
64 |             req = urllib.request.Request(remote_url, headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0'})
65 |             with urllib.request.urlopen(req) as response, open(out_path, 'wb') as out_file:
66 |                 try:
67 |                     data = response.read()
68 |                 except IncompleteRead as e:
69 |                     data = e.partial
70 |                 out_file.write(data)
71 |         else:
72 |             print(f"\t+ {model} model already exists, skipping...")
73 |     preload_models(
74 |         text_use_gpu=not params['use_cpu'],
75 |         text_use_small=params['use_small_models'],
76 |         coarse_use_gpu=not params['use_cpu'],
77 |         coarse_use_small=params['use_small_models'],
78 |         fine_use_gpu=not params['use_cpu'],
79 |         fine_use_small=params['use_small_models'],
80 |         codec_use_gpu=not params['use_cpu']
81 |     )
82 |
83 | def input_modifier(string):
84 |     if not params['activate']:
85 |         shared.processing_message = "*Is typing...*"
86 |         return string
87 |     shared.processing_message = "*Is recording a voice message...*"
88 |     shared.args.no_stream = True
89 |     return string
90 |
91 | def output_modifier(string):
92 |
93 |     if not params['activate']:
94 |         return string
95 |
96 |     ttstext = string
97 |
98 |     if params['modifiers']:
99 |         ttstext = f"{' '.join(params['modifiers'])}: {ttstext}"
100 |
101 |     if params['forced_speaker_enabled']:
102 |         ttstext = f"{params['forced_speaker'].upper()}: {ttstext}"
103 |
104 |     sentences = nltk.sent_tokenize(ttstext)
105 |     silence = np.zeros(int(0.25 * params['sample_rate']))  # quarter second of silence
106 |     pieces = []
107 |     for sentence in sentences:
108 |         semantic_tokens = generate_text_semantic(
109 |             sentence,
110 |             history_prompt=params['voice'],
111 |             temp=params['temperature'],
112 |             min_eos_p=0.05,  # this controls how likely the generation is to end
113 |         )
114 |         audio_array = semantic_to_waveform(semantic_tokens, history_prompt=params['voice'],)
115 |         pieces += [audio_array, silence.copy()]
116 |     audio = np.array(np.concatenate(pieces), dtype="float32")
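117 |     time_label = int(time.time())
118 |     wav_path = f"extensions/bark_tts/generated/{shared.character}_{time_label}.wav"
119 |     write_wav(wav_path, params['sample_rate'], audio)
120 |     autoplay = 'autoplay' if params['autoplay'] else ''
121 |     # Assumption: the <audio> markup was lost from this listing; it is rebuilt here from wav_path and autoplay
122 |     string = f'<audio src="file/{wav_path}" controls {autoplay}></audio>'
123 |     string += f'\n\n{ttstext}' if params['show_text'] else ''
124 |
125 |     shared.args.no_stream = streaming_state
126 |     return string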
127 |
128 |
129 | def setup():
130 |     # tell the user what's going on
131 |     print()
132 |     print("== Loading Bark TTS extension ==")
133 |     print("+ This may take a while on first run, don't worry!")
134 |
135 |     print("+ Creating directories (if they don't exist)...")
136 |     if not Path("extensions/bark_tts/generated").exists():
137 |         Path("extensions/bark_tts/generated").mkdir(parents=True)
138 |     if not Path(model_path).exists():
139 |         Path(model_path).mkdir(parents=True)
140 |     print("+ Done!")
141 |
142 |     # load models into extension directory so we don't clutter the pc
143 |     print("+ Loading model...")
144 |     if not params['force_manual_download']:
145 |         try:
146 |             preload_models(
147 |                 text_use_gpu=not params['use_cpu'],
148 |                 text_use_small=params['use_small_models'],
149 |                 coarse_use_gpu=not params['use_cpu'],
150 |                 coarse_use_small=params['use_small_models'],
151 |                 fine_use_gpu=not params['use_cpu'],
152 |                 fine_use_small=params['use_small_models'],
153 |                 codec_use_gpu=not params['use_cpu']
154 |             )
155 |         except ValueError as e:
156 |             # for some reason the download fails sometimes, so we just do it manually
157 |             # solution adapted from https://github.com/suno-ai/bark/issues/46
158 |             print("\t+ Automatic download failed, trying manual download...")
159 |             manual_model_preload()
160 |
161 |     else:
162 |         print("\t+ Forcing manual download...")
163 |         manual_model_preload()
164 |
165 |
166 |
167 |     print("+ Done!")
168 |
169 |     print("== Bark TTS extension loaded ==\n\n")
170 |
171 | def ui():
172 |     with gr.Accordion("Bark TTS"):
173 |         with gr.Row():
174 |             activate = gr.Checkbox(value=params['activate'], label='Activate TTS')
175 |             autoplay = gr.Checkbox(value=params['autoplay'], label='Autoplay')
176 |             show_text = gr.Checkbox(value=params['show_text'], label='Show text')
177 |             forced_speaker_enabled = gr.Checkbox(value=params['forced_speaker_enabled'], label='Forced speaker enabled')
178 |         with gr.Row():
179 |             forced_speaker = gr.Dropdown(forced_modes, label='Forced speaker', value=params['forced_speaker'])
180 |             modifiers = gr.Dropdown(modifier_options, label='Modifiers', value=params['modifiers'], multiselect=True)
181 |             voice = gr.Dropdown(voice_presets, label='Voice Preset', value=params['voice'])
182 |         with gr.Row():
183 |             sample_rate = gr.Slider(minimum=18000, maximum=30000, value=params['sample_rate'], label='Sample Rate')
184 |             temperature = gr.Slider(minimum=0.1, maximum=1.0, value=params['temperature'], label='Temperature')
185 |
186 |     activate.change(lambda x: params.update({'activate': x}), activate, None)
187 |     autoplay.change(lambda x: params.update({'autoplay': x}), autoplay, None)
188 |     show_text.change(lambda x: params.update({'show_text': x}), show_text, None)
189 |     forced_speaker_enabled.change(lambda x: params.update({'forced_speaker_enabled': x}), forced_speaker_enabled, None)
190 |     forced_speaker.change(lambda x: params.update({'forced_speaker': x}), forced_speaker, None)
191 |     modifiers.change(lambda x: params.update({'modifiers': x}), modifiers, None)
192 |     voice.change(lambda x: params.update({'voice': x}), voice, None)
193 |     sample_rate.change(lambda x: params.update({'sample_rate': x}), sample_rate, None)
194 |     temperature.change(lambda x: params.update({'temperature': x}), temperature, None)
195 |
--------------------------------------------------------------------------------
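The audio assembly in `output_modifier` can be sketched without Bark or NumPy: each sentence's samples are followed by a quarter second of silence, and everything is concatenated into one track. The function name `join_with_silence` and the plain-list representation are assumptions made for illustration; the extension itself works on the NumPy arrays it gets back from Bark.

```python
from typing import List, Sequence


def join_with_silence(pieces: Sequence[Sequence[float]],
                      sample_rate: int,
                      gap_seconds: float = 0.25) -> List[float]:
    """Concatenate audio pieces, appending a silent gap after each one."""
    silence = [0.0] * int(gap_seconds * sample_rate)
    track: List[float] = []
    for piece in pieces:
        track.extend(piece)
        track.extend(silence)
    return track


if __name__ == "__main__":
    # Two fake "sentences" of 3 and 2 samples at a toy rate of 8 samples per second.
    track = join_with_silence([[0.1, 0.2, 0.3], [0.4, 0.5]], sample_rate=8)
    print(len(track))  # 9: five samples plus two gaps of int(0.25 * 8) = 2 zeros
```

As in the extension, the gap length depends on the configured sample rate, so changing the `Sample Rate` slider also changes how many silent samples are inserted between sentences.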