├── .gitignore ├── README.md ├── __init__.py ├── assets └── StableAudio_00001.wav ├── nodes.py ├── pyproject.toml ├── requirements.txt ├── util_config.py ├── util_dependencies.py ├── web └── js │ └── playSound.js └── workflows └── audio-space-exploration.json /.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__/ 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ComfyUI-StableAudioSampler 2 | The New Stable Audio Open 1.0 Sampler In a ComfyUI Node. Make some beats! 3 | ![image](https://github.com/lks-ai/ComfyUI-StableAudioSampler/assets/163685473/477272f3-46c5-46e5-8de9-d74a93e91716) 4 | 5 | ## An Example I Pasted Together 6 | In this [workflow](https://github.com/lks-ai/ComfyUI-StableAudioSampler/blob/main/workflows/audio-space-exploration.json), I got random `cfg_scale`, `sigma_min` and `steps` values making variations on the same 16 beats; same `prompt` and `seed`. **VOLUME WARNING!** 7 | 8 | https://github.com/lks-ai/ComfyUI-StableAudioSampler/assets/163685473/5f43db75-cc35-47f3-999b-6f65f91420eb 9 | 10 | # Caveats 11 | - The longer your audio, the more VRAM you need to stitch it together 12 | - On a 3060, we've tried up to 10 seconds so far 13 | 14 | ## Installation 15 | 16 | ### Download the Model and Config 17 | 1. Go to [Stable Audio Open on HuggingFace](https://huggingface.co/stabilityai/stable-audio-open-1.0/tree/main/) and download the `model.safetensors` and `model_config.json` files. 18 | 2. Place the files in the `models/audio_checkpoints` folder. If you don't have one, make one in your comfy folder. 19 | 3. Open Comfy and the Load Stable Audio Model node will see your model and config. 20 | 21 | ### With a HuggingFace Token 22 | 1. Make sure your `HF_TOKEN` environment variable for Hugging Face is set, because model loading doesn't work directly from a saved file just yet 23 | 2. 
Go ahead and download the model from here for when we fix that: [Stable Audio Open on HuggingFace](https://huggingface.co/stabilityai/stable-audio-open-1.0/tree/main/model.safetensors) 24 | 3. Make sure to run `pip install -r requirements.txt` inside the repo folder if you're not using Manager 25 | 4. It should just run if you've got your environment variable set up 26 | 27 | There will definitely be issues, because this is so new and was coded quickly, so we couldn't test it out. 28 | 29 | This is not an official StableAudioOpen repository. 30 | 31 | ## Current Features 32 | - Load your own models! 33 | - Optionally runs in half precision 34 | - Nodes 35 | - A Sampler Node: now with seed control, positive and negative prompts 36 | - A Pre-Conditioning Node: kind of like empty latent audio with batch option 37 | - A Prompt Node: Pipes conditioning 38 | - A Model Loading Node: Includes repo options and scans `models/audio_checkpoints` for models and config `.json` files 39 | - `control_after_generate` option 40 | - Audio to Audio (like in the Gradio Example) **Still working on a fix for this** 41 | - Can still use the `HF_TOKEN` environment variable if you want 42 | - Generates audio and outputs raw bytes and a sample rate for use with VHS 43 | - Includes all of the original Stable Audio Open parameters 44 | - Sampler outputs a Spectrogram image (experimental) 45 | - Can save audio to file 46 | - New Prefix Templates for save file naming 47 | - Outputs a temporary `wav` to `temp/stableaudiosampler.wav` you can use for looping like in [this video](https://www.youtube.com/watch?v=_eR6tP-c8W4). 
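
The prefix templates in the feature list can be sketched like this. It is a standalone copy of the `replace_variables` helper from `nodes.py`, fed with the sampler's default `save_prefix`; the values in `settings` are made up for illustration and do not come from a real run.

```python
import re

def replace_variables(template, values_dict):
    """Replace each {key} in template with values_dict[key] when it is a simple scalar."""
    pattern = r'\{(\w+)\}'

    def replacer(match):
        key = match.group(1)
        value = values_dict.get(key, match.group(0))
        if isinstance(value, (str, int, float, bool)):
            return str(value)
        # Non-scalar values (tensors, lists, ...) leave the placeholder untouched
        return match.group(0)

    return re.sub(pattern, replacer, template)

# Hypothetical generation settings, for illustration only
settings = {
    "prompt": "128 BPM tech house drum loop",
    "seed": 42,
    "cfg_scale": 7.0,
    "steps": 100,
    "sigma_min": 0.3,
}

# Default save_prefix of the sampler node
template = "{prompt}-{seed}-{cfg_scale}-{steps}-{sigma_min}"
prefix = replace_variables(template, settings)
print(prefix)  # 128 BPM tech house drum loop-42-7.0-100-0.3

# Unknown keys are kept as-is instead of raising an error
print(replace_variables("{missing}-{seed}", settings))  # {missing}-42
```

In `save_audio_files`, the expanded prefix is then prepended with a timestamp, passed through `urllib.parse.quote` (so spaces become `%20`), and suffixed with a four-digit counter and `.wav`.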
48 | 49 | ### Example Workflows 50 | #### [Exploring Same Prompt and Seed](https://github.com/lks-ai/ComfyUI-StableAudioSampler/blob/main/workflows/audio-space-exploration.json) 51 | 52 | The part I use AnyNode for is just getting random values within a range for `cfg_scale`, `steps` and `sigma_min`. Thanks to feedback from the community and some tinkering, I think I found a way in this workflow to get endless sequences from the same seed/prompt in any key (because I mentioned what key the synth lead needed to be in). 53 | 54 | The new save prefix templating makes it easy to look at a file name and know what settings were used (since wav doesn't have PNGinfo). 55 | 56 | ## Roadmap and Requested Features 57 | Keeping track of requests and ideas as they come in: 58 | - Stereo output 59 | - Nodes 60 | - A Mixer Node (mix your audio outputs with some sort of mastering) 61 | - A Tiling Sampler (concatenate the audios) 62 | - More Sampler Node Options 63 | - Gain 64 | - Possibly Clipping at some dB 65 | - Cleaning up some of the current options with selectors, etc. 66 | - Upfi (upscaling fidelity) 67 | - Audio Preview Node 68 | 69 | ## Error: `progressbar` 70 | If you get the `progressbar` error, you can use our new utility from the latest update. 71 | ```bash 72 | cd ComfyUI/custom_nodes/ComfyUI-StableAudioSampler 73 | python util_dependencies.py progressbar 74 | ``` 75 | You will see something like this... 76 | ![Screenshot from 2024-06-13 13-02-30](https://github.com/lks-ai/ComfyUI-StableAudioSampler/assets/163685473/5ce10eb3-d841-4d21-bd48-93d697cff3d8) 77 | In this screenshot, you see `protobuf`, but that is only because I don't have version issues with `progressbar`. 78 | **Note**: If you install one of the suggested versions, StableAudioSampler should work, but at the same time, it might make other packages not work. 79 | 80 | # Contributions 81 | We are very open to anyone who wants to contribute from the open source community. 
Make your forks and pull requests. We will build something cool. If it's already on the roadmap, chances are we're already working on it, so check in with us. We will start a dev branch. 82 | 83 | # Feature Requests 84 | If you have a request for a feature, open an issue about it and it will be seen. 85 | 86 | Happy Diffusing! 87 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | @author: lks-ai 3 | @title: StableAudioSampler 4 | @nickname: stableaudio 5 | @description: A Simple integration of Stable Audio Diffusion with knobs and stuff! 6 | """ 7 | 8 | from .nodes import StableAudioSampler, NODE_CLASS_MAPPINGS, NODE_DISPLAY_NAME_MAPPINGS 9 | 10 | __all__ = ['NODE_CLASS_MAPPINGS', 'NODE_DISPLAY_NAME_MAPPINGS'] 11 | WEB_DIRECTORY = "./web" -------------------------------------------------------------------------------- /assets/StableAudio_00001.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lks-ai/ComfyUI-StableAudioSampler/1fa0c1155b1c7f42a3c73e48bddc0b55fef1dba0/assets/StableAudio_00001.wav -------------------------------------------------------------------------------- /nodes.py: -------------------------------------------------------------------------------- 1 | import os, sys, json, gc 2 | import glob 3 | import typing as tp 4 | import numpy as np 5 | import torch 6 | import torchaudio 7 | from einops import rearrange 8 | from safetensors.torch import load_file 9 | from torchaudio import transforms as T 10 | from aeiou.viz import audio_spectrogram_image 11 | 12 | from .util_dependencies import PackageDependencyChecker 13 | from .util_config import get_model_config 14 | 15 | # Add local stable-audio-tools to path 16 | def add_stable_audio_tools_path(): 17 | current_path = os.path.dirname(os.path.abspath(__file__)) 18 | # Updated path to point to 
custom-extensions 19 | stable_audio_path = os.path.abspath(os.path.join(current_path, '../../custom-extensions/stable-audio-tools')) 20 | if stable_audio_path not in sys.path: 21 | sys.path.insert(0, stable_audio_path) 22 | print(f"[comfyui-stable-audio-sampler, nodes.py, add_stable_audio_tools_path] Added stable-audio-tools path: {stable_audio_path}") 23 | 24 | add_stable_audio_tools_path() 25 | 26 | # Import stable-audio-tools after path modification 27 | from stable_audio_tools import get_pretrained_model, create_model_from_config 28 | from stable_audio_tools.inference.generation import generate_diffusion_cond, generate_diffusion_uncond 29 | from stable_audio_tools.inference.utils import prepare_audio 30 | from stable_audio_tools.models.utils import load_ckpt_state_dict 31 | from stable_audio_tools.training.utils import copy_state_dict 32 | 33 | # try: 34 | import torch 35 | import torchaudio 36 | from einops import rearrange 37 | from stable_audio_tools import get_pretrained_model 38 | from stable_audio_tools.inference.generation import generate_diffusion_cond 39 | import numpy as np 40 | from safetensors.torch import load_file 41 | from .util_config import get_model_config 42 | # from stable_audio_tools.models.factory import create_model_from_config 43 | # from stable_audio_tools.models.utils import load_ckpt_state_dict 44 | from stable_audio_tools import get_pretrained_model, create_model_from_config 45 | # from stable_audio_tools.inference.generation import generate_diffusion_cond 46 | from stable_audio_tools.models.utils import load_ckpt_state_dict 47 | 48 | from stable_audio_tools.inference.generation import generate_diffusion_cond, generate_diffusion_uncond 49 | from stable_audio_tools.inference.utils import prepare_audio 50 | from stable_audio_tools.training.utils import copy_state_dict 51 | from aeiou.viz import audio_spectrogram_image 52 | from torchaudio import transforms as T 53 | # except ImportError as e: 54 | # checker = PackageDependencyChecker() 
55 | # discrepancies = checker.check_version_discrepancies('requirements.txt') 56 | # #instructions = checker.generate_user_instructions(discrepancies) 57 | 58 | # # Find dependent discrepancies for all packages with issues 59 | # dependent_discrepancies = [] 60 | # for discrepancy in discrepancies: 61 | # package_name = discrepancy['package_name'] 62 | # dependent_discrepancies.extend(checker.check_dependents_discrepancies(package_name)) 63 | 64 | # # Analyze discrepancies and suggest solutions 65 | # solutions = checker.analyze_discrepancies(discrepancies + dependent_discrepancies) 66 | # solution_suggestions = checker.suggest_solutions(solutions) 67 | 68 | # out = "\nSuggested solutions:\n" 69 | # for suggestion in solution_suggestions: 70 | # out += f"{suggestion}\n" 71 | 72 | # raise ValueError(f"<>: You Have some Environment Problems...\n\n{out}") 73 | 74 | # Test current setup 75 | # Add in Audio2Audio 76 | 77 | # Comfy libs 78 | def add_comfy_path(): 79 | current_path = os.path.dirname(os.path.abspath(__file__)) 80 | comfy_path = os.path.abspath(os.path.join(current_path, '../../../comfy')) 81 | if comfy_path not in sys.path: 82 | sys.path.insert(0, comfy_path) 83 | 84 | 85 | add_comfy_path() 86 | 87 | from comfy.utils import ProgressBar # type: ignore 88 | import folder_paths # type: ignore 89 | 90 | device = "cuda" if torch.cuda.is_available() else "cpu" 91 | MAX_FP32 = np.iinfo(np.int32).max 92 | SCHEDULERS = ["dpmpp-3m-sde", "dpmpp-2m-sde", "k-heun", "k-lms", "k-dpmpp-2s-ancestral", "k-dpm-2", "k-dpm-fast"] 93 | ACKPT_FOLDER = "models/audio_checkpoints/" 94 | TEMP_FOLDER = "temp/" 95 | 96 | class AnyType(str): 97 | def __ne__(self, __value: object) -> bool: 98 | return False 99 | 100 | base_path = os.path.dirname(os.path.realpath(__file__)) 101 | os.makedirs(ACKPT_FOLDER, exist_ok=True) 102 | os.makedirs(TEMP_FOLDER, exist_ok=True) 103 | 104 | # Our any instance wants to be a wildcard string 105 | any = AnyType("*") 106 | def get_models_path(ckpt_name): 
107 | if not ckpt_name: 108 | return None 109 | return f"{ACKPT_FOLDER}{ckpt_name}" 110 | 111 | model_files = [os.path.basename(file) for file in glob.glob(f"{ACKPT_FOLDER}*.safetensors")] + [os.path.basename(file) for file in glob.glob("models/audio_checkpoints/*.ckpt")] 112 | config_files = [os.path.basename(file) for file in glob.glob(f"{ACKPT_FOLDER}*.json")] 113 | if len(model_files) == 0: 114 | model_files.append(f"Put models in {ACKPT_FOLDER}") 115 | 116 | def repo_path(repo, filename): 117 | path = os.path.join(repo, filename) 118 | instance_path = os.path.normpath(path) 119 | if sys.platform == 'win32': 120 | instance_path = instance_path.replace('\\', "/") 121 | return instance_path 122 | 123 | import re 124 | def replace_variables(template, values_dict): 125 | """Replace variables from a template where {} encloses a variable key from values_dict.""" 126 | pattern = r'\{(\w+)\}' 127 | 128 | def replacer(match): 129 | variable_name = match.group(1) 130 | value = values_dict.get(variable_name, match.group(0)) 131 | if isinstance(value, (str, int, float, bool)): 132 | return str(value) 133 | return match.group(0) 134 | 135 | result = re.sub(pattern, replacer, template) 136 | return result 137 | 138 | import io 139 | # def wav_bytes_to_tensor(wav_bytes: bytes, model, sample_rate) -> tp.Tuple[int, torch.Tensor]: 140 | # # Load the audio data and sample rate using torchaudio 141 | # audio_tensor, in_sr = torchaudio.load(io.BytesIO(wav_bytes)) 142 | # #audio_tensor = torch.from_numpy(audio_tensor).float().div(32767) 143 | # print("Before Transform", audio_tensor) 144 | # audio_tensor.float().div(32767) 145 | # print("Converted", audio_tensor) 146 | # if audio_tensor.dim() == 1: 147 | # audio_tensor = audio_tensor.unsqueeze(0) # [1, n] 148 | # elif audio_tensor.dim() == 2: 149 | # audio_tensor = audio_tensor.transpose(0, 1) # [n, 2] -> [2, n] print(sample_rate) 150 | # print("Unsquoze", audio_tensor) 151 | # if in_sr != sample_rate: 152 | # resample_tf = 
T.Resample(in_sr, sample_rate).to(audio_tensor.device) 153 | # audio_tensor = resample_tf(audio_tensor) 154 | # print("Resampled", audio_tensor) 155 | # dtype = next(model.parameters()).dtype 156 | # audio_tensor = audio_tensor.to(dtype) 157 | # print("Retyped", audio_tensor) 158 | # return sample_rate, audio_tensor 159 | 160 | def wav_bytes_to_tensor(wav_bytes: tp.Union[bytes, dict], model, sample_rate, sample_size: int) -> tp.Tuple[int, torch.Tensor]: 161 | # Handle dictionary input case 162 | if isinstance(wav_bytes, dict): 163 | if 'waveform' in wav_bytes: 164 | # Handle ComfyUI LoadAudio format 165 | audio_tensor = wav_bytes['waveform'] 166 | in_sr = wav_bytes.get('sample_rate', sample_rate) 167 | elif 'path' in wav_bytes: 168 | # Load from file path 169 | audio_tensor, in_sr = torchaudio.load(wav_bytes['path']) 170 | elif 'tensor' in wav_bytes: 171 | # Direct tensor data 172 | audio_tensor = wav_bytes['tensor'] 173 | in_sr = wav_bytes.get('sample_rate', sample_rate) 174 | elif 'filename' in wav_bytes: 175 | # Load from filename 176 | audio_tensor, in_sr = torchaudio.load(wav_bytes['filename']) 177 | else: 178 | print("Audio dictionary contents:", wav_bytes.keys()) 179 | raise ValueError("Invalid audio dictionary format - must contain 'waveform', 'path', 'tensor', or 'filename' key") 180 | else: 181 | # Original bytes handling 182 | audio_tensor, in_sr = torchaudio.load(io.BytesIO(wav_bytes)) 183 | 184 | # Handle different tensor shapes 185 | if audio_tensor.dim() == 3: 186 | # If we have a 3D tensor [batch, channels, samples], squeeze out the batch dimension 187 | audio_tensor = audio_tensor.squeeze(0) 188 | elif audio_tensor.dim() == 1: 189 | # If we have a 1D tensor [samples], add channel dimension 190 | audio_tensor = audio_tensor.unsqueeze(0) 191 | 192 | # Ensure we have [channels, samples] format 193 | if audio_tensor.shape[0] > audio_tensor.shape[-1]: 194 | audio_tensor = audio_tensor.transpose(0, -1) 195 | 196 | if in_sr != sample_rate: 197 | 
resample_tf = T.Resample(in_sr, sample_rate).to(audio_tensor.device) 198 | audio_tensor = resample_tf(audio_tensor) 199 | 200 | num_channels, num_samples = audio_tensor.shape 201 | if num_samples > sample_size: 202 | audio_tensor = audio_tensor[:, :sample_size] 203 | elif num_samples < sample_size: 204 | padding = torch.zeros((num_channels, sample_size - num_samples), dtype=audio_tensor.dtype, device=audio_tensor.device) 205 | audio_tensor = torch.cat((audio_tensor, padding), dim=1) 206 | 207 | dtype = next(model.parameters()).dtype 208 | audio_tensor = audio_tensor.to(dtype) 209 | 210 | return sample_rate, audio_tensor 211 | 212 | 213 | def generate_audio(cond_batch, steps, cfg_scale, sigma_min, sigma_max, sampler_type, device, save, save_prefix, modelinfo, batch_size=1, seed=-1, after_generate="randomize", counter=0, init_noise_level=1.0, init_audio=None, quantum=True): 214 | if torch.cuda.is_available(): 215 | torch.cuda.empty_cache() 216 | gc.collect() 217 | 218 | model, sample_rate, sample_size, _device = modelinfo 219 | b_pos, b_neg = cond_batch 220 | p_conditioning, p_batch_size = b_pos 221 | n_conditioning, n_batch_size = b_neg 222 | sample_size = p_conditioning[0]['seconds_total'] * sample_rate 223 | 224 | #dprint("Model Loaded:", model) 225 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Positive Conditioning:", p_conditioning) 226 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Negative Conditioning:", n_conditioning) 227 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Sample Size:", sample_size) 228 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Sample Rate:", sample_rate) 229 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Seconds:", sample_size / sample_rate) 230 | 231 | # if init_audio is not None: 232 | # print(len(init_audio)) 233 | # in_sr, init_audio = init_audio 234 | # # Turn into torch tensor, converting from int16 to float32 235 | # init_audio = 
torch.from_numpy(init_audio).float().div(32767) 236 | 237 | # if init_audio.dim() == 1: 238 | # init_audio = init_audio.unsqueeze(0) # [1, n] 239 | # elif init_audio.dim() == 2: 240 | # init_audio = init_audio.transpose(0, 1) # [n, 2] -> [2, n] 241 | 242 | # if in_sr != sample_rate: 243 | # resample_tf = T.Resample(in_sr, sample_rate).to(init_audio.device) 244 | # init_audio = resample_tf(init_audio) 245 | 246 | # audio_length = init_audio.shape[-1] 247 | 248 | # if audio_length > sample_size: 249 | 250 | # input_sample_size = audio_length + (model.min_input_length - (audio_length % model.min_input_length)) % model.min_input_length 251 | 252 | # init_audio = (sample_rate, init_audio) 253 | 254 | wt = None if init_audio is None else wav_bytes_to_tensor(init_audio, model, sample_rate, sample_size) 255 | output = generate_diffusion_cond( 256 | model, 257 | steps=steps, 258 | cfg_scale=cfg_scale, 259 | conditioning=p_conditioning, 260 | negative_conditioning=n_conditioning, 261 | sample_size=sample_size, 262 | sigma_min=sigma_min, 263 | sigma_max=sigma_max, 264 | sampler_type=sampler_type, 265 | device=_device, 266 | seed=seed, 267 | batch_size=p_batch_size, 268 | init_noise_level=init_noise_level, 269 | init_audio=wt, 270 | quantum=quantum 271 | ) 272 | 273 | gendata = locals() 274 | gendata['prompt'] = p_conditioning[0]['prompt'] 275 | gendata['negative_prompt'] = n_conditioning[0]['prompt'] 276 | 277 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Raw Output:", output) 278 | 279 | output = rearrange(output, "b d n -> d (b n)") 280 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Rearranged Output:", output) 281 | 282 | output = output.to(torch.float32).div(torch.max(torch.abs(output))).clamp(-1, 1).mul(32767).to(torch.int16).cpu() 283 | print("[comfyui-stable-audio-sampler, nodes.py, generate_audio] Transformed Output:", output) 284 | 285 | filepaths = None 286 | if save: 287 | filepaths = save_audio_files(output, sample_rate, 
save_prefix, counter, data=gendata) 288 | 289 | spectrogram = audio_spectrogram_image(output, sample_rate=sample_rate) 290 | 291 | audio_bytes = output.numpy().tobytes() 292 | 293 | del model 294 | if torch.cuda.is_available(): 295 | torch.cuda.empty_cache() 296 | gc.collect() 297 | 298 | return audio_bytes, sample_rate, spectrogram, filepaths 299 | 300 | 301 | def get_model(model_filename=None, config=None, repo=None, half_precision=False, device_override=None): 302 | #print(model_filename, config, repo, half_precision) 303 | if model_filename: 304 | model_path = get_models_path(model_filename) #f"models/audio_checkpoints/{model_filename}" 305 | if model_filename.endswith(".safetensors") or model_filename.endswith(".ckpt"): 306 | if not config: 307 | model_config = get_model_config() 308 | else: 309 | with open(config, 'r') as f: 310 | model_config = json.load(f) 311 | model = create_model_from_config(model_config) 312 | print(f"[comfyui-stable-audio-sampler, nodes.py, get_model] Model path: {model_path}") 313 | model.load_state_dict(load_ckpt_state_dict(model_path)) 314 | else: 315 | repo_id = "stabilityai/stable-audio-open-1.0" if not repo else repo 316 | print(f"[comfyui-stable-audio-sampler, nodes.py, get_model] Loading pretrained model {repo_id}") 317 | model, model_config = get_pretrained_model(repo_id) 318 | sample_rate = model_config["sample_rate"] 319 | sample_size = model_config["sample_size"] 320 | elif repo: 321 | if repo == "stabilityai/stable-audio-open-1.0": 322 | print(f"[comfyui-stable-audio-sampler, nodes.py, get_model] Loading pretrained model {repo}") 323 | model, model_config = get_pretrained_model(repo) 324 | else: 325 | json_path = config or repo_path(repo, "model_config.json") 326 | model_path = repo_path(repo, "model.safetensors") 327 | with open(json_path) as f: 328 | model_config = json.load(f) 329 | model = create_model_from_config(model_config) 330 | model.load_state_dict(load_ckpt_state_dict(model_path), strict=False) 331 | 
sample_rate = model_config["sample_rate"] 332 | sample_size = model_config["sample_size"] 333 | else: 334 | raise ValueError("You must specify an Audio Checkpoint or a Repo to load from.") 335 | 336 | _device = device if not device_override else device_override 337 | model = model.to(_device).requires_grad_(False) #.eval().requires_grad_(False) 338 | 339 | if half_precision and _device != "cpu": 340 | model.to(torch.float16) 341 | 342 | return (model, sample_rate, sample_size, _device) 343 | 344 | def load_model(model_config=None, model_ckpt_path=None, pretrained_name=None, pretransform_ckpt_path=None, device="cuda", model_half=False): 345 | global model, sample_rate, sample_size 346 | 347 | if pretrained_name is not None: 348 | print(f"[comfyui-stable-audio-sampler, nodes.py, load_model] Loading pretrained model {pretrained_name}") 349 | model, model_config = get_pretrained_model(pretrained_name) 350 | 351 | elif model_config is not None and model_ckpt_path is not None: 352 | print(f"[comfyui-stable-audio-sampler, nodes.py, load_model] Creating model from config") 353 | model = create_model_from_config(model_config) 354 | 355 | print(f"[comfyui-stable-audio-sampler, nodes.py, load_model] Loading model checkpoint from {model_ckpt_path}") 356 | # Load checkpoint 357 | copy_state_dict(model, load_ckpt_state_dict(model_ckpt_path)) 358 | #model.load_state_dict(load_ckpt_state_dict(model_ckpt_path)) 359 | 360 | sample_rate = model_config["sample_rate"] 361 | sample_size = model_config["sample_size"] 362 | 363 | if pretransform_ckpt_path is not None: 364 | print(f"[comfyui-stable-audio-sampler, nodes.py, load_model] Loading pretransform checkpoint from {pretransform_ckpt_path}") 365 | model.pretransform.load_state_dict(load_ckpt_state_dict(pretransform_ckpt_path), strict=False) 366 | print(f"[comfyui-stable-audio-sampler, nodes.py, load_model] Done loading pretransform") 367 | 368 | model.to(device).eval().requires_grad_(False) 369 | 370 | if model_half: 371 | 
model.to(torch.float16) 372 | 373 | print(f"[comfyui-stable-audio-sampler, nodes.py, load_model] Done loading model") 374 | 375 | return model, model_config 376 | 377 | import shutil 378 | from urllib.parse import quote 379 | def save_audio_files(output, sample_rate, filename_prefix, counter, data=None, save_temp=True): 380 | from datetime import datetime 381 | 382 | filename_prefix += "" 383 | output_dir = "output" 384 | os.makedirs(output_dir, exist_ok=True) 385 | 386 | # Get current datetime and format it 387 | current_time = datetime.now().strftime("%Y%m%d_%H%M%S") 388 | 389 | # Create filename with datetime 390 | wavname = f"{current_time}_{filename_prefix}" if not data else f"{current_time}_{replace_variables(filename_prefix, data)}" 391 | 392 | filepaths = [] 393 | for i, audio in enumerate(output): 394 | if i > 0: # TODO fix batches 395 | break 396 | fpath = f"{quote(wavname)}_{counter:04}.wav" 397 | file_path = os.path.join(output_dir, fpath) 398 | print(f"[comfyui-stable-audio-sampler, nodes.py, save_audio_files] Saving audio to {file_path}") 399 | torchaudio.save(file_path, audio.unsqueeze(0), sample_rate) 400 | filepaths.append(fpath) 401 | # Saves to temporary path so it can be used for streaming loops 402 | if save_temp: 403 | tpath = os.path.join(TEMP_FOLDER, "stableaudiosampler.wav") 404 | print(f"[comfyui-stable-audio-sampler, nodes.py, save_audio_files] Saving temp audio to: {tpath}") 405 | shutil.copyfile(file_path, tpath) 406 | counter += 1 407 | return filepaths 408 | 409 | from aeiou.viz import spectrogram_image 410 | 411 | def create_image_batch(spectrograms, batch_size): 412 | images = [] 413 | for spec in spectrograms: 414 | im = spec.convert("RGB") # Ensure image is in RGB format 415 | im_tensor = torch.tensor(np.array(im)) # Convert to tensor, keeping the dimensions as (height, width, channels) 416 | images.append(im_tensor) 417 | batch_tensor = torch.stack(images) # Stack images into a batch 418 | return batch_tensor 419 | 420 | class 
StableAudioSampler: 421 | def __init__(self): 422 | self.counter = 0 423 | 424 | @classmethod 425 | def INPUT_TYPES(s): 426 | return { 427 | "required": { 428 | "audio_model": ("SAOMODEL", {"forceInput": True}), 429 | "positive": ("CONDITIONING", {"forceInput": True}), 430 | "negative": ("CONDITIONING", {"forceInput": True}), 431 | "seed": ("INT", {"default": -1, "min": -1, "max": MAX_FP32}), 432 | "steps": ("INT", {"default": 100, "min": 1, "max": 10000}), 433 | "cfg_scale": ("FLOAT", {"default": 7.0, "min": 0.0, "max": 100.0, "step": 0.1}), 434 | # "sample_size": ("INT", {"default": 65536, "min": 1, "max": 1000000}), 435 | "sigma_min": ("FLOAT", {"default": 0.3, "min": 0.01, "max": 1000.0, "step": 0.01}), 436 | "sigma_max": ("FLOAT", {"default": 500.0, "min": 0.0, "max": 1000.0, "step": 0.01}), 437 | "sampler_type": (SCHEDULERS, {"default": "dpmpp-3m-sde"}), 438 | "denoise": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 20.0, "step": 0.01}), 439 | "save": ("BOOLEAN", {"default": True}), 440 | "save_prefix": ("STRING", {"default": "{prompt}-{seed}-{cfg_scale}-{steps}-{sigma_min}"}), 441 | "quantum": ("BOOLEAN", {"default": True}), 442 | }, 443 | "optional": { 444 | "audio": (any, ) 445 | } 446 | } 447 | 448 | RETURN_TYPES = (any, "INT", "IMAGE") 449 | RETURN_NAMES = ("audio", "sample_rate", "image") 450 | FUNCTION = "sample" 451 | OUTPUT_NODE = True 452 | 453 | CATEGORY = "audio/samplers" 454 | 455 | def sample(self, audio_model, positive, negative, seed, steps, cfg_scale, sigma_min, sigma_max, sampler_type, denoise, save, save_prefix, quantum=True, audio=None): 456 | audio_bytes, sample_rate, spectrogram, filepaths = generate_audio( 457 | (positive, negative), 458 | steps, 459 | cfg_scale, 460 | sigma_min, 461 | sigma_max, 462 | sampler_type, 463 | device, 464 | save, 465 | save_prefix, 466 | audio_model, 467 | seed=seed, 468 | counter=self.counter, 469 | init_noise_level=denoise, 470 | init_audio=audio, 471 | quantum=quantum 472 | ) 473 | spectrograms = 
create_image_batch([spectrogram], 1) 474 | return {"ui": {"paths": filepaths}, "result": (audio_bytes, sample_rate, spectrograms)} 475 | #return (audio_bytes, sample_rate, spectrograms) 476 | 477 | class StableLoadAudioModel: 478 | @classmethod 479 | def INPUT_TYPES(s): 480 | return { 481 | "required": { 482 | "model_filename": (model_files, ), 483 | }, 484 | "optional": { 485 | "model_config": (config_files, ), 486 | "repo": ("STRING", {"default": "stabilityai/stable-audio-open-1.0"}), 487 | "half_precision": ("BOOLEAN", {"default": False}), 488 | "force_cpu": ("BOOLEAN", {"default": False}), 489 | } 490 | } 491 | 492 | RETURN_TYPES = ("SAOMODEL", ) 493 | RETURN_NAMES = ("audio_model", ) 494 | FUNCTION = "load" 495 | 496 | CATEGORY = "audio/loaders" 497 | 498 | def load(self, model_filename, model_config=None, repo=None, half_precision=None, force_cpu=None): 499 | mpath = get_models_path(model_config) 500 | modelinfo = get_model(model_filename=model_filename, config=mpath, repo=repo, half_precision=half_precision, device_override=None if not force_cpu else "cpu") 501 | return (modelinfo,) 502 | 503 | class StableAudioPrompt: 504 | @classmethod 505 | def INPUT_TYPES(s): 506 | return { 507 | "required": { 508 | "conditioning": ("CONDITIONING", {"forceInput": True}), 509 | "prompt": ("STRING", {"multiline": True}), 510 | } 511 | } 512 | 513 | RETURN_TYPES = ("CONDITIONING", ) 514 | RETURN_NAMES = ("conditioning", ) 515 | FUNCTION = "go" 516 | 517 | CATEGORY = "audio/conditioning" 518 | 519 | def go(self, conditioning, prompt): 520 | print("[comfyui-stable-audio-sampler, nodes.py, go] PROMPT", prompt) 521 | cond, batch_size = conditioning 522 | print("[comfyui-stable-audio-sampler, nodes.py, go] cond, batch_size", cond, batch_size) 523 | o = [] 524 | #cond[0]['prompt'] = prompt 525 | for v in cond: 526 | # Build a new dict so the upstream (possibly cached) conditioning is not mutated 527 | o.append({**v, 'prompt': prompt}) 528 | #c = conditioning[0] 529 | # conditioning = [{ 530 | # "prompt": prompt, 531 | # "seconds_start": seconds_start, 
532 | # "seconds_total": seconds_total 533 | # }] 534 | print("[comfyui-stable-audio-sampler, nodes.py, go] o, batch_size", o, batch_size) 535 | return ((o, batch_size), ) 536 | 537 | import time 538 | 539 | class StableAudioConditioning: 540 | @classmethod 541 | def INPUT_TYPES(s): 542 | return { 543 | "required": { 544 | "seconds_start": ("INT", {"default": 0, "min": 0, "max": 60, "step": 1, "display": "number"}), 545 | "seconds_total": ("INT", {"default": 30, "min": 0, "max": 60, "step": 1, "display": "number"}), 546 | "batch_size": ("INT", {"default": 1, "min": 1, "max": 50, "step": 1, "display": "number"}), 547 | } 548 | } 549 | 550 | RETURN_TYPES = ("CONDITIONING", ) 551 | RETURN_NAMES = ("conditioning", ) 552 | FUNCTION = "go" 553 | 554 | CATEGORY = "audio/conditioning" 555 | 556 | def go(self, seconds_start, seconds_total, batch_size): 557 | conditioning = [{ 558 | "prompt": None, 559 | "seconds_start": seconds_start, 560 | "seconds_total": seconds_total 561 | }] 562 | return ((conditioning, batch_size), ) 563 | 564 | @classmethod 565 | def IS_CHANGED(s, **kwargs): # ComfyUI passes this node's own inputs as keyword arguments 566 | return time.time() 567 | 568 | NODE_CLASS_MAPPINGS = { 569 | "StableAudioSampler": StableAudioSampler, 570 | "StableAudioLoadModel": StableLoadAudioModel, 571 | "StableAudioPrompt": StableAudioPrompt, 572 | "StableAudioConditioning": StableAudioConditioning, 573 | } 574 | 575 | NODE_DISPLAY_NAME_MAPPINGS = { 576 | "StableAudioSampler": "Stable Audio Sampler", 577 | "StableAudioLoadModel": "Load Stable Audio Model", 578 | "StableAudioPrompt": "Stable Audio Prompt", 579 | "StableAudioConditioning": "Stable Audio Pre-Conditioning" 580 | } 581 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [project] 2 | name = "comfyui-stableaudiosampler" 3 | description = "Nodes: StableAudioSampler. 
Wraps the new Stable Audio Open Model in the sampler that dropped Jun 5th. See Github for Features" 4 | version = "1.0.0" 5 | license = "LICENSE" 6 | dependencies = ["stable-audio-tools"] 7 | 8 | [project.urls] 9 | Repository = "https://github.com/lks-ai/ComfyUI-StableAudioSampler" 10 | # Used by Comfy Registry https://comfyregistry.org 11 | 12 | [tool.comfy] 13 | PublisherId = "" 14 | DisplayName = "ComfyUI-StableAudioSampler" 15 | Icon = "" 16 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | stable-audio-tools 2 | torch 3 | torchaudio 4 | einops 5 | numpy 6 | aeiou 7 | flash_attn -------------------------------------------------------------------------------- /util_config.py: -------------------------------------------------------------------------------- 1 | def get_model_config(): 2 | return { 3 | "model_type": "diffusion_cond", 4 | "sample_size": 2097152, 5 | "sample_rate": 44100, 6 | "audio_channels": 2, 7 | "model": { 8 | "pretransform": { 9 | "type": "autoencoder", 10 | "iterate_batch": True, 11 | "config": { 12 | "encoder": { 13 | "type": "oobleck", 14 | "requires_grad": False, 15 | "config": { 16 | "in_channels": 2, 17 | "channels": 128, 18 | "c_mults": [1, 2, 4, 8, 16], 19 | "strides": [2, 4, 4, 8, 8], 20 | "latent_dim": 128, 21 | "use_snake": True 22 | } 23 | }, 24 | "decoder": { 25 | "type": "oobleck", 26 | "config": { 27 | "out_channels": 2, 28 | "channels": 128, 29 | "c_mults": [1, 2, 4, 8, 16], 30 | "strides": [2, 4, 4, 8, 8], 31 | "latent_dim": 64, 32 | "use_snake": True, 33 | "final_tanh": False 34 | } 35 | }, 36 | "bottleneck": { 37 | "type": "vae" 38 | }, 39 | "latent_dim": 64, 40 | "downsampling_ratio": 2048, 41 | "io_channels": 2 42 | } 43 | }, 44 | "conditioning": { 45 | "configs": [ 46 | { 47 | "id": "prompt", 48 | "type": "t5", 49 | "config": { 50 | "t5_model_name": "t5-base", 51 | "max_length": 128 52 | 
} 53 | }, 54 | { 55 | "id": "seconds_start", 56 | "type": "number", 57 | "config": { 58 | "min_val": 0, 59 | "max_val": 512 60 | } 61 | }, 62 | { 63 | "id": "seconds_total", 64 | "type": "number", 65 | "config": { 66 | "min_val": 0, 67 | "max_val": 512 68 | } 69 | } 70 | ], 71 | "cond_dim": 768 72 | }, 73 | "diffusion": { 74 | "cross_attention_cond_ids": ["prompt", "seconds_start", "seconds_total"], 75 | "global_cond_ids": ["seconds_start", "seconds_total"], 76 | "type": "dit", 77 | "config": { 78 | "io_channels": 64, 79 | "embed_dim": 1536, 80 | "depth": 24, 81 | "num_heads": 24, 82 | "cond_token_dim": 768, 83 | "global_cond_dim": 1536, 84 | "project_cond_tokens": False, 85 | "transformer_type": "continuous_transformer" 86 | } 87 | }, 88 | "io_channels": 64 89 | }, 90 | "training": { 91 | "use_ema": True, 92 | "log_loss_info": False, 93 | "optimizer_configs": { 94 | "diffusion": { 95 | "optimizer": { 96 | "type": "AdamW", 97 | "config": { 98 | "lr": 5e-5, 99 | "betas": [0.9, 0.999], 100 | "weight_decay": 1e-3 101 | } 102 | }, 103 | "scheduler": { 104 | "type": "InverseLR", 105 | "config": { 106 | "inv_gamma": 1000000, 107 | "power": 0.5, 108 | "warmup": 0.99 109 | } 110 | } 111 | } 112 | }, 113 | "demo": { 114 | "demo_every": 2000, 115 | "demo_steps": 250, 116 | "num_demos": 4, 117 | "demo_cond": [ 118 | {"prompt": "Amen break 174 BPM", "seconds_start": 0, "seconds_total": 12}, 119 | {"prompt": "A beautiful orchestral symphony, classical music", "seconds_start": 0, "seconds_total": 160}, 120 | {"prompt": "Chill hip-hop beat, chillhop", "seconds_start": 0, "seconds_total": 190}, 121 | {"prompt": "A pop song about love and loss", "seconds_start": 0, "seconds_total": 180} 122 | ], 123 | "demo_cfg_scales": [3, 6, 9] 124 | } 125 | } 126 | } 127 | -------------------------------------------------------------------------------- /util_dependencies.py: -------------------------------------------------------------------------------- 1 | import pkg_resources 2 | from 
packaging import version 3 | from collections import defaultdict 4 | import subprocess 5 | 6 | class PackageDependencyChecker: 7 | """Utility functions for unspaghettifying version hell.""" 8 | def __init__(self): 9 | self.installed_packages = {dist.project_name: dist for dist in pkg_resources.working_set} 10 | 11 | def find_dependents(self, package_name): 12 | dependents = [] 13 | for dist in self.installed_packages.values(): 14 | requires = [str(req) for req in dist.requires()] 15 | if any(package_name in req for req in requires): 16 | dependents.append((dist.project_name, dist.version, requires)) 17 | return dependents 18 | 19 | def check_version_discrepancy_for_line(self, line: str) -> dict: 20 | line = line.strip() 21 | if not line or line.startswith('#'): 22 | return None 23 | 24 | try: 25 | requirement = pkg_resources.Requirement.parse(line) 26 | package_name = requirement.name 27 | version_spec = str(requirement.specifier) 28 | return self._check_version(package_name, version_spec) 29 | except pkg_resources.extern.packaging.requirements.InvalidRequirement: 30 | return None 31 | 32 | def _check_version(self, package_name, version_spec): 33 | if package_name in self.installed_packages: 34 | installed_version = self.installed_packages[package_name].version 35 | #print(f"Checking version: {package_name} {version_spec}") 36 | requirement = pkg_resources.Requirement.parse(f"{package_name} {version_spec}") 37 | if not requirement.specifier.contains(installed_version): 38 | dependents = self.find_dependents(package_name) 39 | return { 40 | 'package_name': package_name, 41 | 'installed_version': installed_version, 42 | 'required_version': version_spec, 43 | 'dependents': dependents 44 | } 45 | else: 46 | return { 47 | 'package_name': package_name, 48 | 'installed_version': None, 49 | 'required_version': version_spec, 50 | 'dependents': [] 51 | } 52 | return None 53 | 54 | def check_version_discrepancies(self, requirements_file): 55 | discrepancies = [] 56 | with 
open(requirements_file, 'r') as file: 57 | for line in file: 58 | discrepancy = self.check_version_discrepancy_for_line(line) 59 | if discrepancy: 60 | discrepancies.append(discrepancy) 61 | return discrepancies 62 | 63 | def generate_user_instructions(self, discrepancies): 64 | instructions = [] 65 | for discrepancy in discrepancies: 66 | if discrepancy['installed_version'] is None: 67 | instructions.append(f"Package '{discrepancy['package_name']}' is missing. Please install version {discrepancy['required_version']}.") 68 | else: 69 | instructions.append(f"Package '{discrepancy['package_name']}' has version {discrepancy['installed_version']} installed, but version {discrepancy['required_version']} is required.") 70 | if discrepancy['dependents']: 71 | instructions.append("Dependent packages:") 72 | for dependent in discrepancy['dependents']: 73 | instructions.append(f" - {dependent[0]} (version {dependent[1]}) requires {discrepancy['package_name']}.") 74 | return "\n".join(instructions) 75 | 76 | def full_check(self, requirements_file): 77 | discrepancies = self.check_version_discrepancies(requirements_file) 78 | instructions = self.generate_user_instructions(discrepancies) 79 | return instructions 80 | 81 | def check_dependents_discrepancies(self, package_name): 82 | discrepancies = [] 83 | dependents = self.find_dependents(package_name) 84 | for dependent in dependents: 85 | for requirement in dependent[2]: 86 | if package_name in requirement: 87 | try: 88 | req = pkg_resources.Requirement.parse(requirement) 89 | if req.name == package_name: 90 | requirement_version = str(req.specifier) 91 | discrepancy = self._check_version(package_name, requirement_version) 92 | if discrepancy: 93 | discrepancies.append(discrepancy) 94 | except pkg_resources.extern.packaging.requirements.InvalidRequirement: 95 | continue 96 | return discrepancies 97 | 98 | def analyze_discrepancies(self, discrepancies): 99 | """Analyzes discrepancies and finds possible version ranges.""" 100 | 
print("Analyzing package discrepancies...") 101 | requirements = defaultdict(list) 102 | for discrepancy in discrepancies: 103 | requirements[discrepancy['package_name']].append(discrepancy['required_version']) 104 | 105 | solutions = {} 106 | for package, version_specs in requirements.items(): 107 | solutions[package] = self.find_version_solutions(version_specs) 108 | return solutions 109 | 110 | def find_version_solutions(self, version_specs): 111 | """Finds version ranges that satisfy the given version specifications.""" 112 | ranges = [] 113 | for spec in version_specs: 114 | requirement = pkg_resources.Requirement.parse(f"dummy {spec}") 115 | ranges.append(requirement.specifier) 116 | 117 | return self.find_common_version_range(ranges) 118 | 119 | def find_common_version_range(self, ranges): 120 | """Finds common version range for the given specifiers.""" 121 | min_version = version.parse("0") 122 | max_version = version.parse("9999") 123 | excluded_versions = set() 124 | 125 | for r in ranges: 126 | for spec in r: 127 | if spec.operator in ('>=', '>'): 128 | min_version = max(min_version, version.parse(spec.version)) 129 | if spec.operator == '>': 130 | # Version objects do not support '+', so a strict bound excludes the boundary itself. 131 | excluded_versions.add(version.parse(spec.version)) 132 | elif spec.operator in ('<=', '<'): 133 | max_version = min(max_version, version.parse(spec.version)) 134 | if spec.operator == '<': 135 | excluded_versions.add(version.parse(spec.version)) 136 | elif spec.operator == '!=': 137 | excluded_versions.add(version.parse(spec.version)) 138 | 139 | if min_version <= max_version: 140 | return (min_version, max_version, excluded_versions) 141 | else: 142 | return None 143 | 144 | def fetch_available_versions(self, package_name): 145 | """Fetches all available versions of a package from pip.""" 146 | result = subprocess.run(['pip', 'index', 'versions', package_name], 
capture_output=True, text=True) 147 | if result.returncode != 0: 148 | return [] 149 | 150 | lines = result.stdout.splitlines() 151 | versions = [] 152 | for line in lines: 153 | if line.startswith('Available versions:'): 154 | versions = [ver.strip() for ver in line.split(':')[1].split(',')] 155 | break 156 | return versions 157 | 158 | def analyze_possible_upgrades(self, discrepancies): 159 | possible_upgrades = {} 160 | for discrepancy in discrepancies: 161 | package_name = discrepancy['package_name'] 162 | current_version = discrepancy['installed_version'] 163 | # Fetch possible upgrades from pip; compare parsed versions, not raw strings 164 | available_versions = self.fetch_available_versions(package_name) 165 | for candidate in available_versions: 166 | if current_version is not None and version.parse(candidate) > version.parse(current_version): 167 | impact = self.calculate_upgrade_impact(package_name, candidate, discrepancies) 168 | possible_upgrades[(package_name, candidate)] = impact 169 | return possible_upgrades 170 | 171 | def calculate_upgrade_impact(self, package_name, new_version, discrepancies): 172 | # Simulate upgrading the package and recheck discrepancies 173 | new_discrepancies = self.simulate_upgrade(package_name, new_version, discrepancies) 174 | return len(discrepancies) - len(new_discrepancies) 175 | 176 | def simulate_upgrade(self, package_name, new_version, discrepancies): 177 | original_dist = self.installed_packages[package_name] 178 | self.installed_packages[package_name] = pkg_resources.Distribution(project_name=package_name, version=new_version) 179 | new_discrepancies = [] 180 | for discrepancy in discrepancies: 181 | if discrepancy['package_name'] != package_name: 182 | new_discrepancy = self._check_version(discrepancy['package_name'], discrepancy['required_version']) 183 | if new_discrepancy: 184 | new_discrepancies.append(new_discrepancy) 185 | self.installed_packages[package_name] = original_dist 186 | return new_discrepancies 187 | 188 | def suggest_solutions(self, 
solutions): 189 | """Generates solution suggestions based on the analyzed discrepancies.""" 190 | suggestions = [] 191 | for package, version_range in solutions.items(): 192 | dependents = self.find_dependents(package) 193 | 194 | if version_range: 195 | min_version, max_version, excluded_versions = version_range 196 | excluded_versions_str = ", ".join(str(v) for v in sorted(excluded_versions)) 197 | 198 | if excluded_versions: 199 | suggestions.append(f"Package '{package}' can be installed in the version range {min_version} - {max_version}, excluding versions: {excluded_versions_str}.") 200 | else: 201 | suggestions.append(f"Package '{package}' can be installed in the version range {min_version} - {max_version}.") 202 | 203 | suggestions.append("This version range is required by the following packages:") 204 | for dependent in dependents: 205 | suggestions.append(f" - {dependent[0]} (version {dependent[1]})") 206 | 207 | # Fetch available versions from pip 208 | available_versions = self.fetch_available_versions(package) 209 | valid_versions = [ 210 | version.parse(ver) for ver in available_versions 211 | if min_version <= version.parse(ver) <= max_version and version.parse(ver) not in excluded_versions 212 | ] 213 | 214 | # Filter to keep only the latest version in each subversion 215 | latest_versions = self.filter_latest_versions(valid_versions) 216 | 217 | if latest_versions: 218 | suggestions.append("Possible Commands:") 219 | for ver in sorted(latest_versions, reverse=True): 220 | suggestions.append(f" - pip install {package}=={ver}") 221 | else: 222 | suggestions.append(f"No valid versions found for '{package}' within the specified range.") 223 | else: 224 | suggestions.append(f"No common version range found for package '{package}'. 
However, you can try the following versions required by other packages:") 225 | package_commands = defaultdict(list) 226 | for dependent in dependents: 227 | for requirement in dependent[2]: 228 | if package in requirement: 229 | suggestions.append(f" - {requirement} required by {dependent[0]} (version {dependent[1]})") 230 | available_versions = self.fetch_available_versions(package) 231 | try: 232 | req = pkg_resources.Requirement.parse(requirement) 233 | for ver in available_versions: 234 | # Keep only the versions that satisfy this dependent's specifier. 235 | if req.specifier.contains(ver): 236 | package_commands[(dependent[0], dependent[1])].append(f" - pip install {package}=={ver}") 237 | except (ValueError, IndexError): 238 | continue 239 | 240 | if package_commands: 241 | for packages, commands in package_commands.items(): 242 | package_list = f"{packages[0]} (version {packages[1]})" 243 | suggestions.append(f"To get the package {package_list} working, you can use the following commands:") 244 | suggestions.extend(commands) 245 | else: 246 | suggestions.append(f"No valid versions found for '{package}' based on individual requirements.") 247 | return suggestions 248 | 249 | def filter_latest_versions(self, versions): 250 | """Filters the versions to keep only the latest version in each subversion.""" 251 | latest_versions = {} 252 | for ver in versions: 253 | major_minor = (ver.major, ver.minor) 254 | if major_minor not in latest_versions or ver > latest_versions[major_minor]: 255 | latest_versions[major_minor] = ver 256 | return latest_versions.values() 257 | 258 | def rank_solutions(self, solutions): 259 | """Ranks version ranges based on the number of discrepancies they resolve.""" 260 | version_range_counts = {} 261 | for package, version_range in solutions.items(): 262 | if version_range: 263 | min_version, max_version, _ = version_range 264 | range_key = (min_version, max_version) 265 | if range_key not in version_range_counts: 266 | version_range_counts[range_key] = 0 267 | 
version_range_counts[range_key] += 1 268 | 269 | ranked_solutions = sorted(version_range_counts.items(), key=lambda x: -x[1]) 270 | return ranked_solutions 271 | 272 | def rank_upgrades(self, possible_upgrades): 273 | """Ranks upgrades based on the number of discrepancies they resolve.""" 274 | ranked_upgrades = sorted(possible_upgrades.items(), key=lambda x: -x[1]) 275 | return ranked_upgrades 276 | 277 | def find_best_upgrade_path(self, discrepancies): 278 | """Finds the best upgrade path to resolve the most discrepancies.""" 279 | solutions = self.analyze_discrepancies(discrepancies) 280 | possible_upgrades = self.analyze_possible_upgrades(discrepancies) 281 | ranked_solutions = self.rank_solutions(solutions) 282 | ranked_upgrades = self.rank_upgrades(possible_upgrades) 283 | 284 | best_path = [] 285 | 286 | # Combine ranked solutions and upgrades to find the optimal path 287 | for upgrade, impact in ranked_upgrades: 288 | package, version = upgrade 289 | best_path.append(f"Upgrading {package} to version {version} resolves {impact} discrepancies.") 290 | 291 | for version_range, count in ranked_solutions: 292 | best_path.append(f"Installing packages in the version range {version_range} resolves {count} discrepancies.") 293 | 294 | return best_path 295 | 296 | 297 | if __name__ == "__main__": 298 | import sys 299 | # Usage example: 300 | checker = PackageDependencyChecker() 301 | 302 | # Find dependents of a specific package 303 | package_name = sys.argv[1] 304 | dependents = checker.find_dependents(package_name) 305 | print(f"Dependents of '{package_name}':\n") 306 | for dependent in dependents: 307 | print(f"Package: {dependent[0]}, Version: {dependent[1]}") 308 | print(f" Requires: {dependent[2]}") 309 | 310 | # Check version discrepancies based on requirements.txt 311 | # discrepancies = checker.check_version_discrepancies('requirements.txt') 312 | # print("\nVersion discrepancies:") 313 | # for discrepancy in discrepancies: 314 | # print(f"Package: 
{discrepancy['package_name']}, Installed: {discrepancy['installed_version']}, Required: {discrepancy['required_version']}") 315 | # print("Dependents:") 316 | # for dependent in discrepancy['dependents']: 317 | # print(f" - Dependent Package: {dependent[0]}, Version: {dependent[1]}") 318 | # print(f" Requires: {dependent[2]}") 319 | 320 | # instructions = checker.generate_user_instructions(discrepancies) 321 | # print("\nVersion discrepancies and instructions:") 322 | # print(instructions) 323 | 324 | # Check version discrepancies among dependents of the specific package 325 | dependent_discrepancies = checker.check_dependents_discrepancies(package_name) 326 | # if dependent_discrepancies: 327 | # print(f"\nDiscrepancies in dependents of '{package_name}':") 328 | # for discrepancy in dependent_discrepancies: 329 | # print(f"Package: {discrepancy['package_name']}, Installed: {discrepancy['installed_version']}, Required: {discrepancy['required_version']}") 330 | # print("Dependents:") 331 | # for dependent in discrepancy['dependents']: 332 | # print(f" - Dependent Package: {dependent[0]}, Version: {dependent[1]}") 333 | # print(f" Requires: {dependent[2]}") 334 | # else: 335 | # print(f"No discrepancies found among the dependents of '{package_name}'.") 336 | 337 | # Analyze discrepancies and suggest solutions 338 | solutions = checker.analyze_discrepancies(dependent_discrepancies) 339 | solution_suggestions = checker.suggest_solutions(solutions) 340 | print("\nSuggested solutions:") 341 | for suggestion in solution_suggestions: 342 | print(suggestion) 343 | 344 | best_upgrade_path = checker.find_best_upgrade_path(dependent_discrepancies) 345 | for up in best_upgrade_path: 346 | print(up) -------------------------------------------------------------------------------- /web/js/playSound.js: -------------------------------------------------------------------------------- 1 | /* 2 | shouts to: pygoss, Fill, and Joviex! 
3 | */ 4 | import { app } from "../../../scripts/app.js"; 5 | 6 | console.log("StableAudioSampler") 7 | 8 | app.registerExtension({ 9 | name: "lks-ai.StableAudioSampler", 10 | async beforeRegisterNodeDef(nodeType, nodeData, app) { 11 | if (nodeData.name === "StableAudioSampler") { 12 | console.log(app); 13 | console.log(nodeData); 14 | const onExecuted = nodeType.prototype.onExecuted; 15 | //console.log(onExecuted); 16 | nodeType.prototype.onExecuted = async function (message) { 17 | onExecuted?.apply(this, arguments); 18 | // console.log(this.widgets); 19 | // console.log(app.ui.lastQueueSize); 20 | // console.log(message) 21 | 22 | // TODO can check this.widgets[] for specific controls 23 | let file = message?.paths?.[0]; 24 | if (!file) { 25 | file = "temp/stableaudiosampler.wav"; 26 | } 27 | 28 | // const url = new URL(`http://localhost:8188/view?filename=${encodeURIComponent(file)}&subfolder=&type=output&format=audio%2Fwav`); 29 | // console.log(import.meta.url) 30 | // console.log(url) 31 | // const audio = new Audio(url); 32 | // audio.volume = 1.0; //this.widgets[1].value; 33 | // audio.play(); 34 | }; 35 | } 36 | }, 37 | }); -------------------------------------------------------------------------------- /workflows/audio-space-exploration.json: -------------------------------------------------------------------------------- 1 | { 2 | "last_node_id": 37, 3 | "last_link_id": 53, 4 | "nodes": [ 5 | { 6 | "id": 4, 7 | "type": "StableAudioPrompt", 8 | "pos": [ 9 | 640, 10 | 350 11 | ], 12 | "size": { 13 | "0": 276, 14 | "1": 100 15 | }, 16 | "flags": {}, 17 | "order": 8, 18 | "mode": 0, 19 | "inputs": [ 20 | { 21 | "name": "conditioning", 22 | "type": "SAOCOND", 23 | "link": 13, 24 | "slot_index": 0 25 | } 26 | ], 27 | "outputs": [ 28 | { 29 | "name": "conditioning", 30 | "type": "SAOCOND", 31 | "links": [ 32 | 35 33 | ], 34 | "shape": 3 35 | } 36 | ], 37 | "properties": { 38 | "Node name for S&R": "StableAudioPrompt" 39 | }, 40 | "widgets_values": [ 41 | "" 42 | 
], 43 | "color": "#322", 44 | "bgcolor": "#533" 45 | }, 46 | { 47 | "id": 18, 48 | "type": "StableAudioLoadModel", 49 | "pos": [ 50 | 363, 51 | 500 52 | ], 53 | "size": { 54 | "0": 315, 55 | "1": 154 56 | }, 57 | "flags": {}, 58 | "order": 0, 59 | "mode": 0, 60 | "outputs": [ 61 | { 62 | "name": "audio_model", 63 | "type": "SAOMODEL", 64 | "links": [ 65 | 33 66 | ], 67 | "shape": 3, 68 | "slot_index": 0 69 | } 70 | ], 71 | "properties": { 72 | "Node name for S&R": "StableAudioLoadModel" 73 | }, 74 | "widgets_values": [ 75 | "model.safetensors", 76 | "model_config.json", 77 | "stabilityai/stable-audio-open-1.0", 78 | true, 79 | false 80 | ] 81 | }, 82 | { 83 | "id": 35, 84 | "type": "ShowText|pysssss", 85 | "pos": [ 86 | 962, 87 | 52 88 | ], 89 | "size": { 90 | "0": 210, 91 | "1": 80 92 | }, 93 | "flags": {}, 94 | "order": 5, 95 | "mode": 0, 96 | "inputs": [ 97 | { 98 | "name": "text", 99 | "type": "STRING", 100 | "link": 48, 101 | "widget": { 102 | "name": "text" 103 | } 104 | } 105 | ], 106 | "outputs": [ 107 | { 108 | "name": "STRING", 109 | "type": "STRING", 110 | "links": null, 111 | "shape": 6 112 | } 113 | ], 114 | "title": "Show Text 🐍 Sigma Min", 115 | "properties": { 116 | "Node name for S&R": "ShowText|pysssss" 117 | }, 118 | "widgets_values": [ 119 | "", 120 | "0.5" 121 | ] 122 | }, 123 | { 124 | "id": 34, 125 | "type": "AnyNode", 126 | "pos": [ 127 | 730, 128 | 51 129 | ], 130 | "size": { 131 | "0": 210, 132 | "1": 120 133 | }, 134 | "flags": {}, 135 | "order": 1, 136 | "mode": 0, 137 | "inputs": [ 138 | { 139 | "name": "any", 140 | "type": "*", 141 | "link": null 142 | }, 143 | { 144 | "name": "any2", 145 | "type": "*", 146 | "link": null 147 | } 148 | ], 149 | "outputs": [ 150 | { 151 | "name": "any", 152 | "type": "*", 153 | "links": [ 154 | 47, 155 | 48 156 | ], 157 | "shape": 3, 158 | "slot_index": 0 159 | }, 160 | { 161 | "name": "control", 162 | "type": "CTRL", 163 | "links": null, 164 | "shape": 3, 165 | "slot_index": 1 166 | } 167 | ], 168 | 
"properties": { 169 | "Node name for S&R": "AnyNode" 170 | }, 171 | "widgets_values": [ 172 | "output a random float between 0.2 and 2.0 ... output precision should be rounded to 2 decimal places.", 173 | "gpt-4o" 174 | ] 175 | }, 176 | { 177 | "id": 28, 178 | "type": "ShowText|pysssss", 179 | "pos": [ 180 | 950, 181 | 580 182 | ], 183 | "size": { 184 | "0": 210, 185 | "1": 80 186 | }, 187 | "flags": {}, 188 | "order": 7, 189 | "mode": 0, 190 | "inputs": [ 191 | { 192 | "name": "text", 193 | "type": "STRING", 194 | "link": 40, 195 | "widget": { 196 | "name": "text" 197 | } 198 | } 199 | ], 200 | "outputs": [ 201 | { 202 | "name": "STRING", 203 | "type": "STRING", 204 | "links": null, 205 | "shape": 6 206 | } 207 | ], 208 | "title": "Show Text 🐍 Steps", 209 | "properties": { 210 | "Node name for S&R": "ShowText|pysssss" 211 | }, 212 | "widgets_values": [ 213 | "", 214 | "337" 215 | ] 216 | }, 217 | { 218 | "id": 27, 219 | "type": "ShowText|pysssss", 220 | "pos": [ 221 | 950, 222 | 710 223 | ], 224 | "size": { 225 | "0": 210, 226 | "1": 80 227 | }, 228 | "flags": {}, 229 | "order": 6, 230 | "mode": 0, 231 | "inputs": [ 232 | { 233 | "name": "text", 234 | "type": "STRING", 235 | "link": 39, 236 | "widget": { 237 | "name": "text" 238 | } 239 | } 240 | ], 241 | "outputs": [ 242 | { 243 | "name": "STRING", 244 | "type": "STRING", 245 | "links": null, 246 | "shape": 6 247 | } 248 | ], 249 | "title": "Show Text 🐍 CFG Scale", 250 | "properties": { 251 | "Node name for S&R": "ShowText|pysssss" 252 | }, 253 | "widgets_values": [ 254 | "", 255 | "35.93" 256 | ] 257 | }, 258 | { 259 | "id": 26, 260 | "type": "AnyNode", 261 | "pos": [ 262 | 710, 263 | 670 264 | ], 265 | "size": { 266 | "0": 210, 267 | "1": 120 268 | }, 269 | "flags": {}, 270 | "order": 2, 271 | "mode": 0, 272 | "inputs": [ 273 | { 274 | "name": "any", 275 | "type": "*", 276 | "link": null 277 | }, 278 | { 279 | "name": "any2", 280 | "type": "*", 281 | "link": null 282 | } 283 | ], 284 | "outputs": [ 285 | { 286 
| "name": "any", 287 | "type": "*", 288 | "links": [ 289 | 39, 290 | 51 291 | ], 292 | "shape": 3, 293 | "slot_index": 0 294 | }, 295 | { 296 | "name": "control", 297 | "type": "CTRL", 298 | "links": null, 299 | "shape": 3 300 | } 301 | ], 302 | "properties": { 303 | "Node name for S&R": "AnyNode" 304 | }, 305 | "widgets_values": [ 306 | "output a random float between 7.0 and 40.0. output precision should be rounded to 2 decimal places.", 307 | "gpt-4o" 308 | ] 309 | }, 310 | { 311 | "id": 25, 312 | "type": "AnyNode", 313 | "pos": [ 314 | 710, 315 | 500 316 | ], 317 | "size": { 318 | "0": 210, 319 | "1": 120 320 | }, 321 | "flags": {}, 322 | "order": 3, 323 | "mode": 0, 324 | "inputs": [ 325 | { 326 | "name": "any", 327 | "type": "*", 328 | "link": null 329 | }, 330 | { 331 | "name": "any2", 332 | "type": "*", 333 | "link": null 334 | } 335 | ], 336 | "outputs": [ 337 | { 338 | "name": "any", 339 | "type": "*", 340 | "links": [ 341 | 40, 342 | 53 343 | ], 344 | "shape": 3, 345 | "slot_index": 0 346 | }, 347 | { 348 | "name": "control", 349 | "type": "CTRL", 350 | "links": null, 351 | "shape": 3 352 | } 353 | ], 354 | "properties": { 355 | "Node name for S&R": "AnyNode" 356 | }, 357 | "widgets_values": [ 358 | "output a random integer between 200 and 500.", 359 | "gpt-4o" 360 | ] 361 | }, 362 | { 363 | "id": 10, 364 | "type": "StableAudioConditioning", 365 | "pos": [ 366 | 280, 367 | 270 368 | ], 369 | "size": { 370 | "0": 315, 371 | "1": 106 372 | }, 373 | "flags": {}, 374 | "order": 4, 375 | "mode": 0, 376 | "outputs": [ 377 | { 378 | "name": "conditioning", 379 | "type": "SAOCOND", 380 | "links": [ 381 | 13, 382 | 45 383 | ], 384 | "shape": 3, 385 | "slot_index": 0 386 | } 387 | ], 388 | "properties": { 389 | "Node name for S&R": "StableAudioConditioning" 390 | }, 391 | "widgets_values": [ 392 | 0, 393 | 8, 394 | 1 395 | ] 396 | }, 397 | { 398 | "id": 33, 399 | "type": "StableAudioPrompt", 400 | "pos": [ 401 | 656, 402 | 227 403 | ], 404 | "size": { 405 | "0": 
211.60000610351562, 406 | "1": 76.00000762939453 407 | }, 408 | "flags": {}, 409 | "order": 9, 410 | "mode": 0, 411 | "inputs": [ 412 | { 413 | "name": "conditioning", 414 | "type": "SAOCOND", 415 | "link": 45 416 | } 417 | ], 418 | "outputs": [ 419 | { 420 | "name": "conditioning", 421 | "type": "SAOCOND", 422 | "links": [ 423 | 46 424 | ], 425 | "shape": 3, 426 | "slot_index": 0 427 | } 428 | ], 429 | "properties": { 430 | "Node name for S&R": "StableAudioPrompt" 431 | }, 432 | "widgets_values": [ 433 | "trance synth lead 120BPM in E Minor" 434 | ] 435 | }, 436 | { 437 | "id": 24, 438 | "type": "StableAudioSampler", 439 | "pos": [ 440 | 986, 441 | 199 442 | ], 443 | "size": { 444 | "0": 315, 445 | "1": 334 446 | }, 447 | "flags": {}, 448 | "order": 10, 449 | "mode": 0, 450 | "inputs": [ 451 | { 452 | "name": "audio_model", 453 | "type": "SAOMODEL", 454 | "link": 33 455 | }, 456 | { 457 | "name": "positive", 458 | "type": "SAOCOND", 459 | "link": 46 460 | }, 461 | { 462 | "name": "negative", 463 | "type": "SAOCOND", 464 | "link": 35 465 | }, 466 | { 467 | "name": "audio", 468 | "type": "*", 469 | "link": null 470 | }, 471 | { 472 | "name": "sigma_min", 473 | "type": "FLOAT", 474 | "link": 47, 475 | "widget": { 476 | "name": "sigma_min" 477 | }, 478 | "slot_index": 4 479 | }, 480 | { 481 | "name": "cfg_scale", 482 | "type": "FLOAT", 483 | "link": 51, 484 | "widget": { 485 | "name": "cfg_scale" 486 | } 487 | }, 488 | { 489 | "name": "steps", 490 | "type": "INT", 491 | "link": 53, 492 | "widget": { 493 | "name": "steps" 494 | } 495 | } 496 | ], 497 | "outputs": [ 498 | { 499 | "name": "audio", 500 | "type": "*", 501 | "links": [], 502 | "shape": 3, 503 | "slot_index": 0 504 | }, 505 | { 506 | "name": "sample_rate", 507 | "type": "INT", 508 | "links": null, 509 | "shape": 3 510 | }, 511 | { 512 | "name": "image", 513 | "type": "IMAGE", 514 | "links": [], 515 | "shape": 3, 516 | "slot_index": 2 517 | } 518 | ], 519 | "properties": { 520 | "Node name for S&R": 
"StableAudioSampler" 521 | }, 522 | "widgets_values": [ 523 | 1991606814, 524 | "fixed", 525 | 238, 526 | 10.3, 527 | 0.3, 528 | 500, 529 | "dpmpp-3m-sde", 530 | 1, 531 | true, 532 | "{prompt}-{seed}-{cfg_scale}-{steps}-{sigma_min}" 533 | ] 534 | } 535 | ], 536 | "links": [ 537 | [ 538 | 13, 539 | 10, 540 | 0, 541 | 4, 542 | 0, 543 | "SAOCOND" 544 | ], 545 | [ 546 | 33, 547 | 18, 548 | 0, 549 | 24, 550 | 0, 551 | "SAOMODEL" 552 | ], 553 | [ 554 | 35, 555 | 4, 556 | 0, 557 | 24, 558 | 2, 559 | "SAOCOND" 560 | ], 561 | [ 562 | 39, 563 | 26, 564 | 0, 565 | 27, 566 | 0, 567 | "STRING" 568 | ], 569 | [ 570 | 40, 571 | 25, 572 | 0, 573 | 28, 574 | 0, 575 | "STRING" 576 | ], 577 | [ 578 | 45, 579 | 10, 580 | 0, 581 | 33, 582 | 0, 583 | "SAOCOND" 584 | ], 585 | [ 586 | 46, 587 | 33, 588 | 0, 589 | 24, 590 | 1, 591 | "SAOCOND" 592 | ], 593 | [ 594 | 47, 595 | 34, 596 | 0, 597 | 24, 598 | 4, 599 | "FLOAT" 600 | ], 601 | [ 602 | 48, 603 | 34, 604 | 0, 605 | 35, 606 | 0, 607 | "STRING" 608 | ], 609 | [ 610 | 51, 611 | 26, 612 | 0, 613 | 24, 614 | 5, 615 | "FLOAT" 616 | ], 617 | [ 618 | 53, 619 | 25, 620 | 0, 621 | 24, 622 | 6, 623 | "INT" 624 | ] 625 | ], 626 | "groups": [], 627 | "config": {}, 628 | "extra": {}, 629 | "version": 0.4 630 | } --------------------------------------------------------------------------------