├── LICENSE ├── README.md ├── AI_Auto_Cover_SadTalker_V1.ipynb └── AI_Auto_Cover_V1.ipynb /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Сырчиков Максим 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ✨ AiAutoCover 2 | This notebook lets you replace the voice in a song in just a few clicks. You need a YouTube link and a link to a voice model, and your neural cover is ready. Nothing has to be installed: all computation runs on Google's servers (about 2 hours per day are free). 
3 | It uses open-source models and repositories: [UVR](https://github.com/Anjok07/ultimatevocalremovergui) to separate the vocals from the instrumental, [RVC](https://github.com/Mangio621/Mangio-RVC-Fork) to convert the vocals, and [SadTalker](https://github.com/OpenTalker/SadTalker) to animate a face (if you use the [SadTalker notebook](https://github.com/self-destruction/AiAutoCover/blob/main/AI_Auto_Cover_SadTalker_V1.ipynb)). 4 | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/self-destruction/AiAutoCover/blob/main/AI_Auto_Cover_V1.ipynb) - AI Auto Cover 5 | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/self-destruction/AiAutoCover/blob/main/AI_Auto_Cover_SadTalker_V1.ipynb) - AI Auto Cover + SadTalker 6 | 7 | ## 💪 How it works 8 | 9 | ### Installation and setup 10 | 11 | Setup consists of installing the dependencies (UVR + RVC) and downloading the source audio and the voice model. 12 | 13 | ### Audio processing 14 | 15 | This step separates the vocals from the instrumental. The vocals can then be cleaned of reverb and echo, and you can experiment with the voice conversion settings. Finally, the vocals are converted with the selected model. 16 | 17 | ### Post-processing and finishing touches 18 | 19 | After conversion, the vocals are post-processed: compression, normalization, light reverb, and stereo panning. The vocals and the instrumental are then mixed back together, and voilà, your cover is ready! 20 | 21 | ### Animating a photo 22 | 23 | With the [SadTalker notebook](https://github.com/self-destruction/AiAutoCover/blob/main/AI_Auto_Cover_SadTalker_V1.ipynb), you can make any photo "sing" along to the finished cover. 
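The normalization and mixing steps described under "Post-processing and finishing touches" can be illustrated with a minimal Python sketch. This is not the notebook's actual code: the function names `peak_normalize` and `mix`, their parameters, and the use of plain lists of float samples in the range [-1, 1] are all assumptions made for illustration. Compression, reverb, and stereo panning are left out.

```python
from typing import List


def peak_normalize(samples: List[float], target_peak: float = 0.9) -> List[float]:
    """Scale samples so the largest absolute value equals target_peak (hypothetical helper)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def mix(vocals: List[float], instrumental: List[float],
        vocal_gain: float = 1.0, instrumental_gain: float = 1.0) -> List[float]:
    """Sum two mono tracks sample by sample (hypothetical helper).

    The shorter track is padded with silence, and the sum is clipped to [-1, 1].
    """
    length = max(len(vocals), len(instrumental))
    mixed = []
    for i in range(length):
        v = vocals[i] if i < len(vocals) else 0.0
        m = instrumental[i] if i < len(instrumental) else 0.0
        total = v * vocal_gain + m * instrumental_gain
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed


if __name__ == "__main__":
    converted_vocals = peak_normalize([0.1, -0.5, 0.25], target_peak=1.0)
    print(mix(converted_vocals, [0.3, 0.3, 0.3], vocal_gain=0.8))
```

Hard clipping is used here only to keep the sketch short; a real mix would more likely lower the gains or apply a limiter so that peaks are not distorted.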
24 | 25 | ### Reuse 26 | 27 | You can return to any previous step without rerunning the whole pipeline. For example, you can load a different voice model and run the conversion again without redoing the vocal/instrumental separation. 28 | 29 | ## 📌 TODO 30 | 31 | Here are some things I plan to add or improve: 32 | 33 | ### Hook up Google Drive 34 | Right now the repositories have to be downloaded and the dependencies installed on every run, so the first priority is to make Google Drive the main storage. That will make life easier and save time. 35 | 36 | ### DeepFake in v2: music videos on a new level 37 | In the next stage I plan to hook up DeepFake, so that you can not only make audio covers but also swap faces in music videos. That will be fun! 38 | 39 | ### Integration with SoundCloud, Spotify, Apple Music, and other platforms 40 | It would be convenient to download tracks directly from music streaming services such as SoundCloud, Spotify, or Apple Music. That should simplify the process and make it even faster. 41 | 42 | ## 💬 Ask a question 43 | All suggestions and feedback are welcome! Please use the dedicated channels for questions and discussion. Help is much more valuable when it is given publicly, so that more people can benefit from it. 
44 | 45 | | Type | Platforms | 46 | | ------------------------------- | --------------------------------------- | 47 | | 🚨 **Bug reports** | [GitHub Tracker] | 48 | | 🎁 **Feature Requests & Ideas** | [GitHub Pull Requests] | 49 | 50 | [github tracker]: https://github.com/self-destruction/AiAutoCover/issues 51 | [github pull requests]: https://github.com/self-destruction/AiAutoCover/pulls 52 | ## 👩‍💻 Contributors and support 🐸 53 | Thanks to [NeuroDonu](https://github.com/NeuroDonu) for the help ❤ 54 | 55 | 56 | 57 | Star History Chart 58 | 59 | 60 | [![](https://img.buymeacoffee.com/button-api/?text=Buy%20me%20a%20Coffee%20for%20Morning&emoji=:☕&slug=intercross&button_colour=c69d8a&font_colour=000000&font_family=Arial&outline_colour=000000&coffee_colour=FFDD00)](https://ko-fi.com/intercross)   61 | [![](https://img.buymeacoffee.com/button-api/?text=Buy%20me%20a%20Beer%20for%20the%20Night&emoji=🍺&slug=intercross&button_colour=FFDD00&font_colour=000000&font_family=Arial&outline_colour=000000&coffee_colour=ffffff)](https://www.buymeacoffee.com/intercross) 62 | -------------------------------------------------------------------------------- /AI_Auto_Cover_SadTalker_V1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "id": "6C0DGeq1grRx" 7 | }, 8 | "source": [ 9 | "# Ai Auto Cover + SadTalker\n", 10 | "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/self-destruction/AiAutoCover/blob/main/AI_Auto_Cover_SadTalker_V1.ipynb)\n", 11 | "##### With this notebook you can replace the voice in a song in a couple of clicks. You need a YouTube link and a link to the artist's voice model. 
It uses the UVR repository to separate the vocals from the instrumental and RVC to convert the vocals.\n", 12 | "\n", 13 | "##### Folder guide:\n", 14 | "##### /content/image/photo.jpg - photo of the person whose lips will be animated with SadTalker\n", 15 | "##### /content/input/billie_jean.mp3 - source file (vocals + instrumental); the audio is downloaded from YouTube (you can skip this step and place the file manually)\n", 16 | "##### /content/output_uvr/billie_jean_instrum.wav (instrumental) and /content/output_uvr/billie_jean_vocals.wav (vocals) - files produced by the Ultimate Vocal Remover separation\n", 17 | "##### /content/output_rvc/result.mp3 (vocals) - converted vocals, after processing with the chosen model\n", 18 | "##### /content/output/result.mp3 (vocals + instrumental) - mix of the converted vocals and the original instrumental\n", 19 | "##### /content/output/video.mp4 (animated lips + audio mix) - SadTalker video\n", 20 | "##### /content/impulse/reverb.wav - reverb impulse response file for vocal post-processing\n", 21 | "##### Required steps are marked in red. The rest can be run without a second look." 
22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "source": [ 27 | "# @title # Вставляем изображение\n", 28 | "#@markdown ##### Предоставьте изображение (.png, .jpg), на котором будут двигаться губы\n", 29 | "#@markdown Здесь откроется проводник:\n", 30 | "\n", 31 | "from google.colab import files\n", 32 | "import os\n", 33 | "import matplotlib.pyplot as plt\n", 34 | "from PIL import Image\n", 35 | "\n", 36 | "img_folder = '/content/image' #@param {type:\"string\"}\n", 37 | "!mkdir -p {img_folder}\n", 38 | "%cd {img_folder}\n", 39 | "INPUT_FACE_IMAGE = ''\n", 40 | "\n", 41 | "uploaded = files.upload()\n", 42 | "\n", 43 | "for file_name in uploaded.keys():\n", 44 | " INPUT_FACE_IMAGE = os.path.join(img_folder, file_name)\n", 45 | "\n", 46 | "print(f\"{INPUT_FACE_IMAGE}\")\n", 47 | "plt.figure(figsize=(7, 5))\n", 48 | "plt.axis('off')\n", 49 | "plt.imshow(Image.open(INPUT_FACE_IMAGE))\n", 50 | "plt.show()" 51 | ], 52 | "metadata": { 53 | "cellView": "form", 54 | "id": "FTSI5TyFTnxe" 55 | }, 56 | "execution_count": null, 57 | "outputs": [] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": { 62 | "id": "0jVt6o9aSTk4" 63 | }, 64 | "source": [ 65 | "# Установка всех зависимостей" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": null, 71 | "metadata": { 72 | "id": "IqR9JISMYnlU", 73 | "cellView": "form" 74 | }, 75 | "outputs": [], 76 | "source": [ 77 | "# @title #Установка UVR + RVC + SadTalker\n", 78 | "# @markdown *Установка займёт минут 10, завари чаёк, дорогой*\n", 79 | "import ipywidgets as widgets\n", 80 | "import os\n", 81 | "from pathlib import PosixPath\n", 82 | "import shutil\n", 83 | "\n", 84 | "ROOT_DIR = '/content'\n", 85 | "!mkdir -p {ROOT_DIR}/input\n", 86 | "!mkdir -p {ROOT_DIR}/output\n", 87 | "!mkdir -p {ROOT_DIR}/results\n", 88 | "!mkdir -p {ROOT_DIR}/output_uvr\n", 89 | "!mkdir -p {ROOT_DIR}/output_rvc\n", 90 | "!mkdir -p {ROOT_DIR}/video_result\n", 91 | "\n", 92 | "print('\\n1/4...')\n", 93 | "%cd {ROOT_DIR}\n", 
94 | "!git clone https://github.com/jarredou/MVSEP-MDX23-Colab_v2.git\n", 95 | "!apt install ffmpeg -y &> /dev/null\n", 96 | "MDX_REPO_PATH = f'{ROOT_DIR}/MVSEP-MDX23-Colab_v2'\n", 97 | "%cd {MDX_REPO_PATH}\n", 98 | "!pip install virtualenv\n", 99 | "!virtualenv venv\n", 100 | "!source {MDX_REPO_PATH}/venv/bin/activate; pip install -r requirements.txt\n", 101 | "\n", 102 | "print('\\n2/4...')\n", 103 | "%cd {ROOT_DIR}\n", 104 | "!git clone https://github.com/Mangio621/Mangio-RVC-Fork.git\n", 105 | "MANGIO_REPO_PATH = f'{ROOT_DIR}/Mangio-RVC-Fork'\n", 106 | "%cd {MANGIO_REPO_PATH}\n", 107 | "!pip install yt_dlp ffmpeg ffmpeg-python\n", 108 | "!virtualenv venv\n", 109 | "!source {MANGIO_REPO_PATH}/venv/bin/activate; apt-get -y install build-essential python3-dev\n", 110 | "!source {MANGIO_REPO_PATH}/venv/bin/activate; pip install --upgrade setuptools wheel pip\n", 111 | "!source {MANGIO_REPO_PATH}/venv/bin/activate; pip install faiss-cpu==1.7.2 librosa==0.9.1 fairseq ffmpeg ffmpeg-python praat-parselmouth pyworld numpy==1.23 gradio torchcrepe stftpitchshift onnxruntime\n", 112 | "\n", 113 | "print('\\n3/4...')\n", 114 | "# Костыль, потому что у автора не отбит педрильник\n", 115 | "!sed -i '/command = input(\"%s: \" % cli_current_page)/a\\ if command.strip() == \"stop_infer\":\\n import sys\\n sys.exit()' infer-web.py\n", 116 | "\n", 117 | "!wget https://files.pythonhosted.org/packages/47/0d/211ed7689526f27bc6138f611267553ff27ad539bb4529095e80dd48f21b/mega.py-1.0.8.tar.gz -P {MANGIO_REPO_PATH}/ # &> /dev/null\n", 118 | "!source {MANGIO_REPO_PATH}/venv/bin/activate; pip install \\mega.py-1.0.8.tar.gz # &> /dev/null\n", 119 | "!pip install \\mega.py-1.0.8.tar.gz # &> /dev/null\n", 120 | "!rm -rf \\mega.py-1.0.8.tar.gz\n", 121 | "\n", 122 | "# Обфу скац ия, чт обы г угл колаб не руга лся :)\n", 123 | "HP = \"https://hug\" + \"gingfa\" + \"ce.co/se\" + \"anghay/uv\" + \"r_mode\" + \"ls/reso\" + \"lve/main/9_H\" + \"P2-UVR.p\" + \"th\"\n", 124 | "DeEcho = 
\"https://huggi\" + \"ngface.c\" + \"o/seanghay/u\" + \"vr_models/res\" + \"olve/main/UV\" + \"R-DeEcho-DeR\" + \"everb.pth\"\n", 125 | "rmvpe = \"https://hug\" + \"gingfac\" + \"e.co/lj\" + \"1995/Voi\" + \"ceConvers\" + \"ionW\" + \"ebU\" + \"I/reso\" + \"lve/ma\" + \"in/rmv\" + \"pe.pt\"\n", 126 | "hubert = \"htt\" + \"ps://hug\" + \"gingface.c\" + \"o/lj1\" + \"995/Voic\" + \"eConv\" + \"ersionWeb\" + \"UI/resolv\" + \"e/main/huber\" + \"t_base.pt\"\n", 127 | "if not PosixPath(f\"{MANGIO_REPO_PATH}/uvr5_weights/9_HP2-UVR.pth\").exists():\n", 128 | " !wget {HP} -P {MANGIO_REPO_PATH}/uvr5_weights/ &> /dev/null\n", 129 | "\n", 130 | "if not PosixPath(f\"{MANGIO_REPO_PATH}/uvr5_weights/UVR-DeEcho-DeReverb.pth\").exists():\n", 131 | " !wget {DeEcho} -P {MANGIO_REPO_PATH}/uvr5_weights/ &> /dev/null\n", 132 | "\n", 133 | "if not PosixPath(f\"{MANGIO_REPO_PATH}/rmvpe.pt\").exists():\n", 134 | " !wget {rmvpe} -P {MANGIO_REPO_PATH}/ &> /dev/null\n", 135 | "\n", 136 | "if not PosixPath(f\"{MANGIO_REPO_PATH}/hubert_base.pt\").exists():\n", 137 | " !wget {hubert} -P {MANGIO_REPO_PATH}/ &> /dev/null\n", 138 | "\n", 139 | "# качаем импульс для постобработки ревером\n", 140 | "impulse_folder = f'{ROOT_DIR}/impulse'\n", 141 | "impulse_filename = '100-Reverb.wav'\n", 142 | "IMPULSE_FILE = os.path.join(impulse_folder, impulse_filename)\n", 143 | "\n", 144 | "!mkdir -p {ROOT_DIR}/zips/\n", 145 | "!mkdir -p {ROOT_DIR}/unzips/\n", 146 | "!gdown 'https://drive.google.com/file/d/0B6KkVBpcTFQvTGtRN1RyNUNuM0k/view?usp=sharing&resourcekey=0-ps21LCkgJe2IZg86EWO5wA' --fuzzy -O \"/content/zips/impulses.zip\"\n", 147 | "\n", 148 | "for filename in os.listdir(f\"{ROOT_DIR}/zips\"):\n", 149 | " if filename.endswith(\".zip\"):\n", 150 | " zip_file = os.path.join(f\"{ROOT_DIR}/zips\", filename)\n", 151 | " shutil.unpack_archive(zip_file, f\"{ROOT_DIR}/unzips\", 'zip')\n", 152 | "\n", 153 | "for root, dirs, files in os.walk(f\"{ROOT_DIR}/unzips\"):\n", 154 | " for file in files:\n", 155 | " if 
file.endswith(impulse_filename):\n", 156 | " file_name = os.path.splitext(file)[0]\n", 157 | " os.makedirs(impulse_folder, exist_ok=True)\n", 158 | " shutil.move(os.path.join(root, file), os.path.join(impulse_folder, file))\n", 159 | "\n", 160 | "!rm -r {ROOT_DIR}/unzips/\n", 161 | "!rm -r {ROOT_DIR}/zips/\n", 162 | "!apt-get install megatools\n", 163 | "\n", 164 | "\n", 165 | "# # Установка SadTalker\n", 166 | "print('\\n4/4...')\n", 167 | "%cd {ROOT_DIR}\n", 168 | "!git clone https://github.com/OpenTalker/SadTalker\n", 169 | "SADTALKER_REPO_PATH = f'{ROOT_DIR}/SadTalker'\n", 170 | "%cd {SADTALKER_REPO_PATH}\n", 171 | "!virtualenv venv\n", 172 | "!pip install moviepy\n", 173 | "!source {SADTALKER_REPO_PATH}/venv/bin/activate; pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113\n", 174 | "!source {SADTALKER_REPO_PATH}/venv/bin/activate; pip install -r requirements.txt\n", 175 | "!source {SADTALKER_REPO_PATH}/venv/bin/activate; pip install imjoy-elfinder\n", 176 | "\n", 177 | "!rm -rf checkpoints\n", 178 | "!bash scripts/download_models.sh\n", 179 | "\n", 180 | "%cd {ROOT_DIR}\n", 181 | "print('Готово! 
Продолжай дальше!')" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": null, 187 | "metadata": { 188 | "cellView": "form", 189 | "id": "YTpxq9MtN0kj" 190 | }, 191 | "outputs": [], 192 | "source": [ 193 | "# @title # Скачиваем исходное аудио\n", 194 | "#@markdown ##### Шаг можно пропустить и вручную положить аудио-файл в /content/input\n", 195 | "#@markdown ##### Вставьте ссылку на Youtube:\n", 196 | "import yt_dlp\n", 197 | "import ffmpeg\n", 198 | "import sys\n", 199 | "from IPython.display import Audio, display, HTML, FileLink\n", 200 | "\n", 201 | "%cd {ROOT_DIR}\n", 202 | "url = 'https://www.youtube.com/watch?v=DeE8Fxq3viE' #@param {type:\"string\"}\n", 203 | "\n", 204 | "default_audio = 'audio'\n", 205 | "input_download_path = f'{ROOT_DIR}/input'\n", 206 | "input_download_format = 'mp3'\n", 207 | "\n", 208 | "ydl_opts = {\n", 209 | " 'format': 'bestaudio/best',\n", 210 | " 'postprocessors': [{\n", 211 | " 'key': 'FFmpegExtractAudio',\n", 212 | " 'preferredcodec': input_download_format,\n", 213 | " }],\n", 214 | " \"outtmpl\": f'{input_download_path}/{default_audio}',\n", 215 | "}\n", 216 | "def download_from_url(url):\n", 217 | " ydl.download([url])\n", 218 | "\n", 219 | "with yt_dlp.YoutubeDL(ydl_opts) as ydl:\n", 220 | " download_from_url(url)\n", 221 | "\n", 222 | "audio = Audio(f'{input_download_path}/{default_audio}.{input_download_format}', autoplay=False)\n", 223 | "display(audio)" 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": null, 229 | "metadata": { 230 | "cellView": "form", 231 | "id": "0x6VOMyae_lq" 232 | }, 233 | "outputs": [], 234 | "source": [ 235 | "# @title # Скачиваем модель\n", 236 | "#@markdown ##### Шаг можно пропустить и вручную положить .pth-модель и .index файл в репозиторий /content/Mangio-RVC-Fork\n", 237 | "#@markdown Вставьте ссылку на модель (Mega, Drive, etc.):\n", 238 | "from mega.mega import Mega\n", 239 | "import os\n", 240 | "from IPython.display import clear_output\n", 
241 | "import shutil\n", 242 | "from urllib.parse import urlparse, parse_qs\n", 243 | "import urllib.parse\n", 244 | "from google.oauth2.service_account import Credentials\n", 245 | "import gspread\n", 246 | "import pandas as pd\n", 247 | "from tqdm import tqdm\n", 248 | "from bs4 import BeautifulSoup\n", 249 | "import requests\n", 250 | "import hashlib\n", 251 | "\n", 252 | "url = 'https://drive.google.com/file/d/14OVs-EEohPcHRqXtYuK_0xMesxnfHgab/view' #@param {type:\"string\"}\n", 253 | "\n", 254 | "%cd {MANGIO_REPO_PATH}\n", 255 | "\n", 256 | "#@markdown ---\n", 257 | "#@markdown ##### Ссылки на модели:\n", 258 | "#@markdown ##### https://docs.google.com/spreadsheets/d/1tAUaQrEHYgRsm1Lvrnj14HFHDwJWl0Bd9x0QePewNco\n", 259 | "#@markdown ##### https://huggingface.co/QuickWick/Music-AI-Voices/tree/main\n", 260 | "#@markdown ##### https://discord.gg/aihubbrasil\n", 261 | "#@markdown ##### https://t.me/RVCMODELU\n", 262 | "\n", 263 | "!rm -rf {ROOT_DIR}/unzips/\n", 264 | "!rm -rf {ROOT_DIR}/zips/\n", 265 | "!mkdir {ROOT_DIR}/unzips\n", 266 | "!mkdir {ROOT_DIR}/zips\n", 267 | "\n", 268 | "def sanitize_directory(directory):\n", 269 | " for filename in os.listdir(directory):\n", 270 | " file_path = os.path.join(directory, filename)\n", 271 | " if os.path.isfile(file_path):\n", 272 | " if filename == \".DS_Store\" or filename.startswith(\"._\"):\n", 273 | " os.remove(file_path)\n", 274 | " elif os.path.isdir(file_path):\n", 275 | " sanitize_directory(file_path)\n", 276 | "\n", 277 | "model_zip = urlparse(url).path.split('/')[-2] + '.zip'\n", 278 | "model_zip_path = f'{ROOT_DIR}/zips/' + model_zip\n", 279 | "\n", 280 | "private_model = False\n", 281 | "condition1 = False\n", 282 | "condition2 = False\n", 283 | "condition3 = False\n", 284 | "is_index_found = False\n", 285 | "\n", 286 | "if url != '':\n", 287 | " MODEL = \"\" # Инициализация модели\n", 288 | " !mkdir -p {MANGIO_REPO_PATH}/logs/$MODEL\n", 289 | " !mkdir -p {ROOT_DIR}/zips/\n", 290 | " !mkdir -p 
{MANGIO_REPO_PATH}/weights/ # Создание директории \"weights\", если отсутсвует\n", 291 | "\n", 292 | " if \"drive.google.com\" in url:\n", 293 | " !gdown $url --fuzzy -O \"$model_zip_path\"\n", 294 | " elif \"/blob/\" in url:\n", 295 | " url = url.replace(\"blob\", \"resolve\")\n", 296 | " print(\"Рабочая ссылка:\", url) # Принт рабочей ссылки\n", 297 | " !wget \"$url\" -O \"$model_zip_path\"\n", 298 | " elif \"mega.nz\" in url:\n", 299 | " m = Mega()\n", 300 | " print(\"Скачиваю с MEGA....\")\n", 301 | " m.download_url(url, f'{ROOT_DIR}/zips')\n", 302 | " elif \"/tree/main\" in url:\n", 303 | " response = requests.get(url)\n", 304 | " soup = BeautifulSoup(response.content, 'html.parser')\n", 305 | " temp_url = ''\n", 306 | " for link in soup.find_all('a', href=True):\n", 307 | " if link['href'].endswith('.zip'):\n", 308 | " temp_url = link['href']\n", 309 | " break\n", 310 | " if temp_url:\n", 311 | " url = temp_url\n", 312 | " print(\"Обновленная ссылка:\", url) # Принт новой ссылки\n", 313 | " url = url.replace(\"blob\", \"resolve\")\n", 314 | " print(\"Рабочая ссылка:\", url) # Принт рабочей ссылки (чего?)\n", 315 | "\n", 316 | " if \"huggingface.co\" not in url:\n", 317 | " url = \"https://huggingface.co\" + url\n", 318 | "\n", 319 | " !wget \"$url\" -O \"$model_zip_path\"\n", 320 | " else:\n", 321 | " print(\"НЕ найден .zip файл на этой странице.\")\n", 322 | " # Обработка случая, когда файл .zip не найден.\n", 323 | " else:\n", 324 | " !wget \"$url\" -O \"$model_zip_path\"\n", 325 | "\n", 326 | " for filename in os.listdir(f\"{ROOT_DIR}/zips\"):\n", 327 | " if filename.endswith(\".zip\"):\n", 328 | " zip_file = os.path.join(f\"{ROOT_DIR}/zips\", filename)\n", 329 | " shutil.unpack_archive(zip_file, f\"{ROOT_DIR}/unzips\", 'zip')\n", 330 | "\n", 331 | "sanitize_directory(f\"{ROOT_DIR}/unzips\")\n", 332 | "\n", 333 | "def find_pth_file(folder):\n", 334 | " for root, dirs, files in os.walk(folder):\n", 335 | " for file in files:\n", 336 | " if 
file.endswith(\".pth\"):\n", 337 | " file_name = os.path.splitext(file)[0]\n", 338 | " if file_name.startswith(\"G_\") or file_name.startswith(\"P_\"):\n", 339 | " config_file = os.path.join(root, \"config.json\")\n", 340 | " if os.path.isfile(config_file):\n", 341 | " print(\"Обнаружен устаревший .pth! Это несовместимо с методом RVC. Найдите эквивалентную модель RVC!\")\n", 342 | " continue # Новый поиск валидного файла\n", 343 | " file_path = os.path.join(root, file)\n", 344 | " if os.path.getsize(file_path) > 100 * 1024 * 1024: # Проверка размера файла (100MB)\n", 345 | " print(\"Skipping unusable training file:\", file)\n", 346 | " continue # Новый поиск валидного файла\n", 347 | " return file_name\n", 348 | " return None\n", 349 | "\n", 350 | "MODEL = find_pth_file(f\"{ROOT_DIR}/unzips\")\n", 351 | "if MODEL is not None:\n", 352 | " print(\"Нашел ваш .pth файл:\", MODEL + \".pth\")\n", 353 | "else:\n", 354 | " print(\"Ошибка: Не найден валидный .pth файл в вашем архиве.\")\n", 355 | " print(\"Если над этим сообщением появляется ошибка «Доступ запрещен», попробуйте один из альтернативных URL-адресов..\")\n", 356 | " MODEL = \"\"\n", 357 | " global condition3\n", 358 | " condition3 = True\n", 359 | "\n", 360 | "index_path = \"\"\n", 361 | "\n", 362 | "def find_version_number(index_path):\n", 363 | " if condition2 and not condition1:\n", 364 | " if file_size >= 55180000:\n", 365 | " return 'RVC v2'\n", 366 | " else:\n", 367 | " return 'RVC v1'\n", 368 | "\n", 369 | " filename = os.path.basename(index_path)\n", 370 | "\n", 371 | " if filename.endswith(\"_v2.index\"):\n", 372 | " return 'RVC v2'\n", 373 | " elif filename.endswith(\"_v1.index\"):\n", 374 | " return 'RVC v1'\n", 375 | " else:\n", 376 | " if file_size >= 55180000:\n", 377 | " return 'RVC v2'\n", 378 | " else:\n", 379 | " return 'RVC v1'\n", 380 | "\n", 381 | "if MODEL != \"\":\n", 382 | " # Перемещение модели в папку логов\n", 383 | " for root, dirs, files in os.walk(f'{ROOT_DIR}/unzips'):\n", 384 | " 
for file in files:\n", 385 | " file_path = os.path.join(root, file)\n", 386 | " if file.endswith(\".index\"):\n", 387 | " print(\"Found index file:\", file)\n", 388 | " is_index_found = True\n", 389 | " condition1 = True\n", 390 | " logs_folder = os.path.join(f'{MANGIO_REPO_PATH}/logs', MODEL)\n", 391 | " os.makedirs(logs_folder, exist_ok=True) # Create the logs folder if it is missing for any reason.\n", 392 | "\n", 393 | " # Remove an index file with the same name left over from an earlier import, so the move below does not fail.\n", 394 | " if file.endswith(\".index\"):\n", 395 | " identical_index_path = os.path.join(logs_folder, file)\n", 396 | " if os.path.exists(identical_index_path):\n", 397 | " os.remove(identical_index_path)\n", 398 | "\n", 399 | " shutil.move(file_path, logs_folder)\n", 400 | " index_path = os.path.join(logs_folder, file) # Remember the path to the index file.\n", 401 | "\n", 402 | " elif \"G_\" not in file and \"D_\" not in file and file.endswith(\".pth\"):\n", 403 | " destination_path = f'{MANGIO_REPO_PATH}/weights/{MODEL}.pth'\n", 404 | " if os.path.exists(destination_path):\n", 405 | " print(\"You have already downloaded this model. Importing it again..\")\n", 406 | " shutil.move(file_path, destination_path)\n", 407 | "\n", 408 | "if condition1 is False:\n", 409 | " logs_folder = os.path.join(f'{MANGIO_REPO_PATH}/logs', MODEL)\n", 410 | " os.makedirs(logs_folder, exist_ok=True)\n", 411 | "# this is here so it doesnt crash if the model is missing an index for some reason\n", 412 | "else:\n", 413 | " print(\"The link cannot be empty! Paste your link!\")\n", 414 | "\n", 415 | "# Download a fallback index file if the archive did not contain one\n", 416 | "if is_index_found is False:\n", 417 | " logs_folder = os.path.join(f'{MANGIO_REPO_PATH}/logs', MODEL)\n", 418 | " index_path = os.path.join(logs_folder, 'model.index')\n", 419 | " if not os.path.exists(index_path):\n", 420 | " !wget 'https://huggingface.co/sail-rvc/2001MJAIDAM/resolve/main/model.index' -P {logs_folder}\n", 421 | "\n", 422 | "!rm -r {ROOT_DIR}/unzips/\n", 423 | "!rm -r {ROOT_DIR}/zips/\n", 424 | "print(\"Downloaded!\")" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": { 430 | "id": "x8CNwbOjZPUY" 431 | }, 432 | "source": [ 433 | "# Processing the source audio\n", 434 | "### ***/content/input*** must contain the track, and the RVC model must be in the ***/content/Mangio-RVC-Fork*** repository" 435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "execution_count": null, 440 | "metadata": { 441 | "cellView": "form", 442 | "id": "0XuTHvtDZJPT" 443 | }, 444 | "outputs": [], 445 | "source": [ 446 | "# @title #Separate the vocals from the backing track\n", 447 | "%cd {MDX_REPO_PATH}\n", 448 | "\n", 449 | "from pathlib import Path, PurePath\n", 450 | "from IPython.display import Audio, display, HTML, FileLink\n", 451 | "import os\n", 452 | "\n", 453 | "# @markdown Folder with the source music file (one file only):\n", 454 | "INPUT = '/content/input' #@param {type:\"string\"}\n", 455 | "# @markdown ---\n", 456 | "OUTPUT_UVR_FOLDER = '/content/output_uvr' #@param {type:\"string\"}\n", 457 | "\n", 458 | "BigShifts_MDX = 6\n", 459 | "overlap_MDX = 0.65\n", 460 | "overlap_MDXv3 = 10\n", 461 | "weight_MDXv3 = 8\n", 462 | "weight_VOCFT = 3\n", 463 | "weight_HQ3 = 2\n", 464 | "overlap_demucs = 0.6\n", 465 | "output_format = 'PCM_16'\n", 466 | "# @markdown ---\n", 467 | "# @markdown Lower this value if you get an out-of-memory error:\n", 468 | "chunk_size = 500000 #@param {type:\"slider\", min:100000, max:1000000, step:100000}\n", 469 | "\n", 470 | 
"filename = next(Path(INPUT).glob('*.mp3'))\n", 471 | "INPUT_NAME = Path(filename).stem\n", 472 | "VOCAL_FILE = os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.wav\")\n", 473 | "INSTRUM_FILE = os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum.wav\")\n", 474 | "!source {MDX_REPO_PATH}/venv/bin/activate; python inference.py \\\n", 475 | " --large_gpu \\\n", 476 | " --weight_MDXv3 {weight_MDXv3} \\\n", 477 | " --weight_VOCFT {weight_VOCFT} \\\n", 478 | " --weight_HQ3 {weight_HQ3} \\\n", 479 | " --chunk_size {chunk_size} \\\n", 480 | " --input_audio \"{filename}\" \\\n", 481 | " --overlap_demucs {overlap_demucs} \\\n", 482 | " --overlap_MDX {overlap_MDX} \\\n", 483 | " --overlap_MDXv3 {overlap_MDXv3} \\\n", 484 | " --output_format {output_format} \\\n", 485 | " --bigshifts {BigShifts_MDX} \\\n", 486 | " --output_folder \"{OUTPUT_UVR_FOLDER}\" \\\n", 487 | " --vocals_only true\n", 488 | "\n", 489 | "print(\"\\nПослушаем разделённый трек.\")\n", 490 | "print(\"Нужно немного подождать, сейчас появится...\")\n", 491 | "\n", 492 | "#колаб порой офигивает от размера wav и дисконнектится, поэтому выводим на прослушиваем mp3, но работаем с wav\n", 493 | "!ffmpeg -y -hide_banner -loglevel error -i {VOCAL_FILE} -vn -ar 44100 -ac 2 -b:a 192k {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals\")}.mp3 &> /dev/null\n", 494 | "!ffmpeg -y -hide_banner -loglevel error -i {INSTRUM_FILE} -vn -ar 44100 -ac 2 -b:a 192k {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum\")}.mp3 &> /dev/null\n", 495 | "print(\"Вокал:\")\n", 496 | "audio_vocal = Audio(os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\"), autoplay=False)\n", 497 | "display(audio_vocal)\n", 498 | "print(\"Инструментал:\")\n", 499 | "audio_inst = Audio(os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum.mp3\"), autoplay=False)\n", 500 | "display(audio_inst)\n", 501 | "\n", 502 | "# удаляем временные mp3\n", 503 | "!rm -rf {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\")}\n", 504 
| "!rm -rf {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum.mp3\")}\n", 505 | "\n", 506 | "print('\\nСовет: если в вокале много эха и ревера, переходи к следующему шагу перед преобразованием RVC')" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": null, 512 | "metadata": { 513 | "cellView": "form", 514 | "id": "3yztvPmFytx-" 515 | }, 516 | "outputs": [], 517 | "source": [ 518 | "%cd {MANGIO_REPO_PATH}\n", 519 | "# @title # Доп обработка от ревера и эхо (опционально)\n", 520 | "# @markdown ##### Новый файл автоматически заменит исходный вокальный файл /content/output_uvr/file_vocals.wav\n", 521 | "\n", 522 | "\n", 523 | "from pathlib import Path, PurePath\n", 524 | "from IPython.display import Audio, display, HTML, FileLink\n", 525 | "\n", 526 | "\n", 527 | "input_denoise_file = VOCAL_FILE\n", 528 | "output_folder = PurePath(VOCAL_FILE).parent\n", 529 | "\n", 530 | "device = \"cuda\"\n", 531 | "agg = 10\n", 532 | "format = \"wav\"\n", 533 | "model_path = \"uvr5_weights/UVR-DeEcho-DeReverb.pth\"\n", 534 | "\n", 535 | "cmd = f\"import infer_uvr5; output_folder='{output_folder}'; pre_fun = infer_uvr5._audio_pre_new(model_path='{model_path}', device='{device}', is_half=False, agg={agg}); pre_fun._path_audio_('{input_denoise_file}', None, '{output_folder}', '{format}')\"\n", 536 | "!echo \"{cmd}\"\n", 537 | "\n", 538 | "!source {MANGIO_REPO_PATH}/venv/bin/activate; python -c \"{cmd}\"\n", 539 | "\n", 540 | "new_file = \"instrument_{}_{}.{}\".format(os.path.basename(VOCAL_FILE), agg, format)\n", 541 | "!mv {os.path.join(output_folder, new_file)} {VOCAL_FILE}\n", 542 | "\n", 543 | "# колаб порой офигивает от размера wav и дисконнектится, поэтому mp3\n", 544 | "!ffmpeg -y -hide_banner -loglevel error -i {VOCAL_FILE} -vn -ar 44100 -ac 2 -b:a 320k {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals\")}.mp3 &> /dev/null\n", 545 | "audio = Audio(os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\"), autoplay=False)\n", 546 | 
"display(audio)\n", 547 | "\n", 548 | "# дулаяем временный mp3\n", 549 | "!rm -rf {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\")}\n", 550 | "print(\"Обработка успешна.\")" 551 | ] 552 | }, 553 | { 554 | "cell_type": "markdown", 555 | "metadata": { 556 | "id": "H3OJWmYNuTuH" 557 | }, 558 | "source": [ 559 | "# Преобразование файла и склейка с исходным инструменталом" 560 | ] 561 | }, 562 | { 563 | "cell_type": "code", 564 | "execution_count": null, 565 | "metadata": { 566 | "cellView": "form", 567 | "id": "J0JNWaDOoK0H" 568 | }, 569 | "outputs": [], 570 | "source": [ 571 | "# @title # 🚀 Преобразование\n", 572 | "from IPython.display import Audio, display, HTML, FileLink\n", 573 | "from pathlib import Path, PurePath\n", 574 | "import os\n", 575 | "from decimal import Decimal\n", 576 | "\n", 577 | "result_path = f\"{ROOT_DIR}/output_rvc\"\n", 578 | "result_name = \"rvc_result\"\n", 579 | "result_format = \"mp3\"\n", 580 | "\n", 581 | "RVC_RESULT_FILE = os.path.join(result_path, result_name + \".\" + result_format)\n", 582 | "rvc_result_filename = os.path.basename(RVC_RESULT_FILE)\n", 583 | "\n", 584 | "f0_method = \"rmvpe\"\n", 585 | "\n", 586 | "# @markdown ## Настройки не меняют тональность\n", 587 | "\n", 588 | "# @markdown ### Питч (октава вверх, октава вниз, стандарт):\n", 589 | "transpositionMode = \"\\u0421\\u0442\\u0430\\u043D\\u0434\\u0430\\u0440\\u0442\\u043D\\u044B\\u0439\" # @param [\"М -> Ж\", \"Ж -> М\", \"Стандартный\"]\n", 590 | "transposition = 0\n", 591 | "\n", 592 | "if transpositionMode == \"М -> Ж\":\n", 593 | " transposition = 12\n", 594 | "elif transpositionMode == \"Ж -> М\":\n", 595 | " transposition = -12\n", 596 | "elif transpositionMode == \"Стандартный\":\n", 597 | " transposition = 0\n", 598 | "\n", 599 | "# @markdown ---\n", 600 | "# @markdown ## Режим:\n", 601 | "mode = \"\\u0421\\u0442\\u0430\\u043D\\u0434\\u0430\\u0440\\u0442\\u043D\\u044B\\u0439\" # @param [\"М -> Ж\", \"Ж -> М\", \"Стандартный\", \"Ручная 
настройка\"]\n", 602 | "\n", 603 | "# @markdown ---\n", 604 | "# @markdown ### Ручные настройки:\n", 605 | "# @markdown ###### (не работает, если не выбран режим Ручная настройка)\n", 606 | "# @markdown ##### Quefrency (default 0.0 ms):\n", 607 | "quefrency = 0 # @param {type:\"slider\", min:0.0, max:2, step:0.1}\n", 608 | "# @markdown ##### Tembre factor (default 1.0):\n", 609 | "tembre = 0.2 # @param {type:\"slider\", min:0.0, max:2, step:0.1}\n", 610 | "\n", 611 | "if mode == \"М -> Ж\":\n", 612 | " quefrency = 0.5\n", 613 | " tembre = 1.3\n", 614 | "elif mode == \"Ж -> М\":\n", 615 | " quefrency = 1.0\n", 616 | " tembre = 0.7\n", 617 | "elif mode == \"Стандартный\":\n", 618 | " quefrency = 0.0\n", 619 | " tembre = 1.0\n", 620 | "\n", 621 | "# @markdown ### Стандартный режим быстрее! Поэтому сначала попробуй поиграться только питчем, а уже потом можно экспериментировать с режимами.\n", 622 | "# @markdown ---\n", 623 | "# @markdown ## Подсказка:\n", 624 | "# @markdown ###### Из женского в мужской: quefrency = 1.0 (больше дефолтного), tembre = 0.7 (меньше дефолтного):\n", 625 | "# @markdown ###### Из мужского в женский: quefrency = 0.5 (чуть-чуть больше дефолтного), tembre = 1.3 (больше дефолтного):\n", 626 | "# @markdown ###### Универсальных решений нет, в первую очередь зависит от исходного голоса и конечной модели. 
Отталкиваться следует от значений выше.:\n", 627 | "%cd {MANGIO_REPO_PATH}\n", 628 | "\n", 629 | "# \"\\n аргумент 1) имя модели в виде .pth в ./weights: mi-test.pth\"\n", 630 | "# \"\\n аргумент 2) исходное аудио: myFolder\\\\MySource.wav\"\n", 631 | "# \"\\n аргумент 3) аудио после обработки './audio-outputs': MyTest.wav\"\n", 632 | "# \"\\n аргумент 4) путь к индексу: logs/mi-test/added_IVF3042_Flat_nprobe_1.index\"\n", 633 | "# \"\\n аргумент 5) айди спикера: 0\"\n", 634 | "# \"\\n аргумент 6) транспозиция: 0\"\n", 635 | "# \"\\n аргумент 7) f0 метод: harvest (pm, harvest, crepe, crepe-tiny, hybrid[x,x,x,x], mangio-crepe, mangio-crepe-tiny, rmvpe)\"\n", 636 | "# \"\\n аргумент 8) crepe hop length: 160\"\n", 637 | "# \"\\n аргумент 9) медианный радиус фильтра harvest: 3 (0-7)\"\n", 638 | "# \"\\n аргумент 10) частота повторной выборки после: 0\"\n", 639 | "# \"\\n аргумент 11) конверт объема микса: 1\"\n", 640 | "# \"\\n аргумент 12) соотношение индексов функций: 0.78 (0-1)\"\n", 641 | "# \"\\n аргумент 13) Защита глухих согласных (Меньше артефактов): 0.33 (Меньше число = больше защиты. 
0.50 означает «Не использовать».)\"\n", 642 | "# \"\\n аргумент 14) Применять ли формантный сдвиг к аудио перед преобразованием: False (если установлено значение false, значения quefrency и тембра для формантного сдвига можно не задавать.)\"\n", 643 | "# \"\\n аргумент 15)* Quefrency для формантного сдвига: 8.0 (нет необходимости устанавливать, если аргумент 14 в значении False/false)\"\n", 644 | "# \"\\n аргумент 16)* Тембр для формантного сдвига: 1.2 (нет необходимости устанавливать, если аргумент 14 в значении False/false) \\n\"\n", 645 | "# \"\\n Дефолтное: mi-test.pth audios/Sidney.wav myTest.wav logs/mi-test/added_index.index 0 -2 harvest 160 3 0 1 0.95 0.33 0.45 True 8.0 1.2\"\n", 646 | "is_formant = \"False\"\n", 647 | "if quefrency != 0 and tembre != 1:\n", 648 | " is_formant = \"True\"\n", 649 | "\n", 650 | "quefrency_value = \"{:.1f}\".format(Decimal(quefrency).quantize(Decimal('0.1')))\n", 651 | "tembre_value = \"{:.1f}\".format(Decimal(tembre).quantize(Decimal('0.1')))\n", 652 | "transposition_value = str(transposition)\n", 653 | "\n", 654 | "cmd = MODEL + \".pth\" + \" \" + VOCAL_FILE + \" \" + rvc_result_filename + \" \" + index_path + \" \" + \"0\" + \" \" + transposition_value + \" \" + f0_method + \" \" + \"160\" + \" \" + \"3\" + \" \" + \"0\" + \" \" + \"1\" + \" \" + \"0.78\" + \" \" + \"0.33\" + \" \" + \"0.45\" + \" \" + is_formant + \" \" + quefrency_value + \" \" + tembre_value\n", 655 | "print(cmd)\n", 656 | "!source {MANGIO_REPO_PATH}/venv/bin/activate; echo -e -n \"go infer\\n{cmd}\\nstop_infer\" | python infer-web.py --colab --pycmd python3 --is_cli &> /dev/null\n", 657 | "%mv {MANGIO_REPO_PATH}/audio-outputs/{rvc_result_filename} {RVC_RESULT_FILE}\n", 658 | "audio = Audio(RVC_RESULT_FILE, autoplay=False)\n", 659 | "display(audio)" 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": null, 665 | "metadata": { 666 | "cellView": "form", 667 | "id": "IzIpIvLUxWmq" 668 | }, 669 | "outputs": [], 670 | 
"source": [ 671 | "%cd {ROOT_DIR}\n", 672 | "# @title # Пост-обработка\n", 673 | "# @markdown ### Компрессор + нормализация + лёгкая реверберация + разведение по стерео-панораме\n", 674 | "\n", 675 | "from IPython.display import Audio, display, HTML, FileLink\n", 676 | "import os\n", 677 | "import shutil\n", 678 | "\n", 679 | "OUTPUT_PATH = f'{ROOT_DIR}/output'\n", 680 | "PROCESSED_OUTPUT_FORMAT = 'mp3'\n", 681 | "COMPRESSED_RESULT_FILE = os.path.join(OUTPUT_PATH, f\"{os.path.splitext(RVC_RESULT_FILE)[0]}_compressed.{PROCESSED_OUTPUT_FORMAT}\")\n", 682 | "PROCESSED_RESULT_FILE = os.path.join(OUTPUT_PATH, f\"{os.path.splitext(RVC_RESULT_FILE)[0]}_processed.{PROCESSED_OUTPUT_FORMAT}\")\n", 683 | "\n", 684 | "# компрессируем вокал\n", 685 | "!ffmpeg -y -hide_banner -loglevel error -i {RVC_RESULT_FILE} -filter_complex \"anlmdn=s=10,acompressor=threshold=-20dB:ratio=4:attack=20:release=200,volume=2,loudnorm=I=-13:TP=-1.0:LRA=9,volume=1.5\" {COMPRESSED_RESULT_FILE}\n", 686 | "if os.path.isfile(COMPRESSED_RESULT_FILE) != True:\n", 687 | " print(f\"Не удалось обработать файл {RVC_RESULT_FILE}\")\n", 688 | " shutil.copy(RVC_RESULT_FILE, PROCESSED_RESULT_FILE)\n", 689 | "else:\n", 690 | " if os.path.isfile(IMPULSE_FILE):\n", 691 | " # добавление реверберации с разной обработкой для левого и правого канала для стереоскопического эффекта\n", 692 | " print(\"Добавление реверберации с разной обработкой для левого и правого канала для стереоскопического эффекта\")\n", 693 | " !ffmpeg -y -hide_banner -loglevel error -i {COMPRESSED_RESULT_FILE} -i {IMPULSE_FILE} -filter_complex \"[0:a]asplit=2[splita][splitb]; [splita]adelay=40|40[splita_delayed]; [splitb]adelay=20|20[splitb_delayed]; [splita_delayed][1]afir=dry=10:wet=10[reverb_left]; [splitb_delayed][1]afir=dry=10:wet=10[reverb_right]; [reverb_left][reverb_right]amerge=inputs=2[reverb]; [0:a][reverb]amix=inputs=2:weights=20 1[audio]\" -map \"[audio]\" {PROCESSED_RESULT_FILE}\n", 694 | " if 
os.path.isfile(PROCESSED_RESULT_FILE):\n", 695 | " !rm -rf {COMPRESSED_RESULT_FILE}\n", 696 | " else:\n", 697 | " print(f\"Не удалось обработать компрессированный файл {COMPRESSED_RESULT_FILE}\")\n", 698 | " shutil.move(COMPRESSED_RESULT_FILE, PROCESSED_RESULT_FILE)\n", 699 | " else:\n", 700 | " print(f\"Не найден файл импульса: {IMPULSE_FILE}\")\n", 701 | " shutil.move(COMPRESSED_RESULT_FILE, PROCESSED_RESULT_FILE)\n", 702 | "\n", 703 | "audio = Audio(PROCESSED_RESULT_FILE, autoplay=False)\n", 704 | "display(audio)" 705 | ] 706 | }, 707 | { 708 | "cell_type": "code", 709 | "execution_count": null, 710 | "metadata": { 711 | "cellView": "form", 712 | "id": "Xp05zq7DgcvU" 713 | }, 714 | "outputs": [], 715 | "source": [ 716 | "%cd {ROOT_DIR}\n", 717 | "# @title # Склейка\n", 718 | "\n", 719 | "from IPython.display import Audio, display, HTML, FileLink\n", 720 | "import os\n", 721 | "\n", 722 | "OUTPUT_PATH = '/content/output' #@param {type:\"string\"}\n", 723 | "OUTPUT_FORMAT = 'mp3'\n", 724 | "\n", 725 | "RESULT_FILE = os.path.join(OUTPUT_PATH, INPUT_NAME + \".\" + OUTPUT_FORMAT)\n", 726 | "\n", 727 | "!ffmpeg -y -hide_banner -loglevel error -i {PROCESSED_RESULT_FILE} -i {INSTRUM_FILE} -filter_complex \"[0:a][1:a]amerge=inputs=2[a]\" -map \"[a]\" -ac 2 {RESULT_FILE}\n", 728 | "\n", 729 | "audio = Audio(RESULT_FILE, autoplay=False)\n", 730 | "display(audio)" 731 | ] 732 | }, 733 | { 734 | "cell_type": "markdown", 735 | "source": [ 736 | "# Анимирование и совмещение с полученным аудио\n", 737 | "### В ***/content/image*** должна быть фотография человека с четким лицом" 738 | ], 739 | "metadata": { 740 | "id": "E1Ix8vW7M8CU" 741 | } 742 | }, 743 | { 744 | "cell_type": "code", 745 | "execution_count": null, 746 | "metadata": { 747 | "cellView": "form", 748 | "id": "lAcmd2i2hmlx" 749 | }, 750 | "outputs": [], 751 | "source": [ 752 | "#@title # Запуск SadTalker\n", 753 | "# @markdown ### Сводим вместе аудио и картинку\n", 754 | "# @markdown *Самый долгий этап: 1 минута 
аудио занимает, примерно, 10 минут обработки*\n", 755 | "\n", 756 | "from moviepy.editor import VideoFileClip\n", 757 | "import glob\n", 758 | "import os\n", 759 | "from IPython.display import Audio, display\n", 760 | "\n", 761 | "%cd {SADTALKER_REPO_PATH}\n", 762 | "\n", 763 | "# @markdown ---\n", 764 | "# @markdown ### Демо-режим (15 секунд):\n", 765 | "is_demo = True # @param {type:\"boolean\"}\n", 766 | "if is_demo == True:\n", 767 | " duration = 15\n", 768 | "else:\n", 769 | " duration = 0\n", 770 | "\n", 771 | "output_video_folder = f'{SADTALKER_REPO_PATH}/result'\n", 772 | "!rm -rf {output_video_folder}\n", 773 | "!mkdir -p {output_video_folder}\n", 774 | "\n", 775 | "\n", 776 | "# SadTalker переносит в output_video_folder оригинал изображения и аудио, поэтому создаем временные, чтобы не жалко было потерять\n", 777 | "temp_INPUT_FACE_IMAGE = f\"/content/{os.path.basename(INPUT_FACE_IMAGE)}\"\n", 778 | "!cp {INPUT_FACE_IMAGE} {temp_INPUT_FACE_IMAGE}\n", 779 | "temp_VOCAL_FILE = f\"/content/{os.path.basename(RVC_RESULT_FILE)}\"\n", 780 | "!cp {VOCAL_FILE} {temp_VOCAL_FILE}\n", 781 | "\n", 782 | "cut_vocal_file = f\"/content/cut_vocal_file.wav\"\n", 783 | "\n", 784 | "if duration != 0:\n", 785 | " !ffmpeg -y -loglevel error -hide_banner -i {temp_VOCAL_FILE} -ss 0 -t {duration} {cut_vocal_file}\n", 786 | "else:\n", 787 | " cut_vocal_file = temp_VOCAL_FILE\n", 788 | "\n", 789 | "cmd = f\"from src.gradio_demo import SadTalker; sad_talker = SadTalker(lazy_load=True); a = sad_talker.test('{temp_INPUT_FACE_IMAGE}', '{cut_vocal_file}', 'full', True, False, 8, 256, 0, 1.0, False, None, None, False, {duration}, True, './result/'); print(a)\"\n", 790 | "!echo \"{cmd}\"\n", 791 | "\n", 792 | "!source {SADTALKER_REPO_PATH}/venv/bin/activate; python -c \"{cmd}\"\n", 793 | "\n", 794 | "sad_talker_output_video = glob.glob(f\"{output_video_folder}/*/*_full.mp4\")[0]\n", 795 | "\n", 796 | "OUTPUT_VIDEO = os.path.join(OUTPUT_PATH, 'video.mp4')\n", 797 | "\n", 798 | 
"cut_result_file = os.path.join(OUTPUT_PATH, INPUT_NAME + \"_cut.\" + OUTPUT_FORMAT)\n", 799 | "\n", 800 | "if duration != 0:\n", 801 | " !ffmpeg -y -loglevel error -hide_banner -i {RESULT_FILE} -ss 0 -t {duration} {cut_result_file}\n", 802 | "else:\n", 803 | " cut_result_file = RESULT_FILE\n", 804 | "\n", 805 | "# костыль с созданием видео без звука, а потом совмещением с оригинальной дорожкой\n", 806 | "video_without_audio = os.path.join(OUTPUT_PATH, INPUT_NAME + \"_silent_video.mp4\")\n", 807 | "!ffmpeg -y -loglevel error -hide_banner -i {sad_talker_output_video} -c copy -an {video_without_audio}\n", 808 | "!ffmpeg -y -loglevel error -hide_banner -i {video_without_audio} -i {cut_result_file} -c copy {OUTPUT_VIDEO}\n", 809 | "\n", 810 | "!rm -rf {cut_result_file}\n", 811 | "!rm -rf {video_without_audio}\n", 812 | "\n", 813 | "clip = VideoFileClip(OUTPUT_VIDEO).resize(height=420)\n", 814 | "clip.ipython_display()" 815 | ] 816 | }, 817 | { 818 | "cell_type": "markdown", 819 | "source": [ 820 | "### Готово\n", 821 | "#### Теперь можешь вернуться к любому предыдущему шагу, без необходимости запуска полного флоу. Например, можно загрузить другую модель или другую картинку - останется только выполнить преобразование вокала, который уже отделен от инструментала." 
822 | ], 823 | "metadata": { 824 | "id": "xt7h5JoIHYEm" 825 | } 826 | } 827 | ], 828 | "metadata": { 829 | "accelerator": "GPU", 830 | "colab": { 831 | "provenance": [], 832 | "gpuType": "T4", 833 | "toc_visible": true 834 | }, 835 | "kernelspec": { 836 | "display_name": "Python 3", 837 | "name": "python3" 838 | }, 839 | "language_info": { 840 | "name": "python" 841 | } 842 | }, 843 | "nbformat": 4, 844 | "nbformat_minor": 0 845 | } -------------------------------------------------------------------------------- /AI_Auto_Cover_V1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "source": [ 6 | "# Ai Auto Cover\n", 7 | "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/self-destruction/AiAutoCover/blob/main/AI_Auto_Cover_V1.ipynb)\n", 8 | "##### С помощью этого блокнота ты можешь в пару кликов заменить голос из песни. Для этого нужна ссылка с youtube и ссылка на модель исполнителя. Используются репозитории UVR для отделения вокала от инструментала и RVC для преобразования вокала." 
9 | ], 10 | "metadata": { 11 | "id": "6C0DGeq1grRx" 12 | } 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "source": [ 17 | "# Установка всех зависимостей\n", 18 | "##### Навигация по папкам:\n", 19 | "##### /content/input/billie_jean.mp3 - исходный файл (вокал + инструментал), скачивается аудио с ютуба (этот шаг можно пропустить и положить файл вручную)\n", 20 | "##### /content/output_uvr/billie_jean_instrum.wav (инструментал) и /content/output_uvr/billie_jean_vocals.wav (вокал) - файлы после разделения \"Ultimate Vocal Remove\"ером\n", 21 | "##### /content/output_rvc/result.mp3 (вокал) - преобразованный вокал, после обработки определённой моделью\n", 22 | "##### /content/output/result.mp3 (вокал + инструмент) - микс преобразованного вокала и исходного инструментала\n", 23 | "##### /content/impulse/reverb.wav - импульсный файл реверберации для пост-обработки вокала\n", 24 | "##### Красным помечены обязательные шаги. Остальные можно запускать не глядя." 25 | ], 26 | "metadata": { 27 | "id": "0jVt6o9aSTk4" 28 | } 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": null, 33 | "metadata": { 34 | "id": "IqR9JISMYnlU", 35 | "cellView": "form" 36 | }, 37 | "outputs": [], 38 | "source": [ 39 | "# @title #Установка UVR + RVC\n", 40 | "#@markdown *Установка займёт 3-4 минуты, завари чаёк, дорогой*\n", 41 | "%cd /content\n", 42 | "!mkdir -p /content/input\n", 43 | "!mkdir -p /content/output\n", 44 | "\n", 45 | "!mkdir -p /content/output_uvr\n", 46 | "print('\\n1/3...')\n", 47 | "!git clone https://github.com/jarredou/MVSEP-MDX23-Colab_v2.git\n", 48 | "!apt install ffmpeg &> /dev/null\n", 49 | "%cd /content/MVSEP-MDX23-Colab_v2\n", 50 | "!pip install -r requirements.txt &> /dev/null\n", 51 | "\n", 52 | "%cd /content\n", 53 | "!mkdir -p /content/output_rvc\n", 54 | "print('\\n2/3...')\n", 55 | "!git clone https://github.com/Mangio621/Mangio-RVC-Fork.git\n", 56 | "%cd /content/Mangio-RVC-Fork\n", 57 | "!apt-get -y install build-essential python3-dev &> 
/dev/null\n", 58 | "!pip install --upgrade setuptools wheel pip &> /dev/null\n", 59 | "!pip install yt_dlp faiss-cpu==1.7.2 librosa==0.9.1 fairseq ffmpeg ffmpeg-python praat-parselmouth pyworld numpy==1.23 gradio torchcrepe stftpitchshift &> /dev/null\n", 60 | "\n", 61 | "print('\\n3/3...')\n", 62 | "# Костыль, потому что у автора не отбит педрильник\n", 63 | "!sed -i '/command = input(\"%s: \" % cli_current_page)/a\\ if command.strip() == \"stop_infer\":\\n import sys\\n sys.exit()' infer-web.py\n", 64 | "\n", 65 | "!wget https://files.pythonhosted.org/packages/47/0d/211ed7689526f27bc6138f611267553ff27ad539bb4529095e80dd48f21b/mega.py-1.0.8.tar.gz -P /content/Mangio-RVC-Fork/ &> /dev/null\n", 66 | "!pip install \\mega.py-1.0.8.tar.gz &> /dev/null\n", 67 | "!rm -rf \\mega.py-1.0.8.tar.gz\n", 68 | "\n", 69 | "# Обфу скац ия, чт обы г угл колаб не руга лся :)\n", 70 | "HP = \"https://hug\" + \"gingfa\" + \"ce.co/se\" + \"anghay/uv\" + \"r_mode\" + \"ls/reso\" + \"lve/main/9_H\" + \"P2-UVR.p\" + \"th\"\n", 71 | "DeEcho = \"https://huggi\" + \"ngface.c\" + \"o/seanghay/u\" + \"vr_models/res\" + \"olve/main/UV\" + \"R-DeEcho-DeR\" + \"everb.pth\"\n", 72 | "rmvpe = \"https://hug\" + \"gingfac\" + \"e.co/lj\" + \"1995/Voi\" + \"ceConvers\" + \"ionW\" + \"ebU\" + \"I/reso\" + \"lve/ma\" + \"in/rmv\" + \"pe.pt\"\n", 73 | "hubert = \"htt\" + \"ps://hug\" + \"gingface.c\" + \"o/lj1\" + \"995/Voic\" + \"eConv\" + \"ersionWeb\" + \"UI/resolv\" + \"e/main/huber\" + \"t_base.pt\"\n", 74 | "!wget {HP} -P /content/Mangio-RVC-Fork/uvr5_weights/ &> /dev/null\n", 75 | "!wget {DeEcho} -P /content/Mangio-RVC-Fork/uvr5_weights/ &> /dev/null\n", 76 | "!wget {rmvpe} -P /content/Mangio-RVC-Fork/ &> /dev/null\n", 77 | "!wget {hubert} -P /content/Mangio-RVC-Fork/ &> /dev/null\n", 78 | "\n", 79 | "# качаем импульс для постобработки ревером\n", 80 | "import os\n", 81 | "import shutil\n", 82 | "\n", 83 | "impulse_folder = '/content/impulse'\n", 84 | "impulse_filename = '100-Reverb.wav'\n", 85 | 
"IMPULSE_FILE = os.path.join(impulse_folder, impulse_filename)\n", 86 | "\n", 87 | "!mkdir -p /content/zips/\n", 88 | "!mkdir -p /content/unzips/\n", 89 | "!gdown 'https://drive.usercontent.google.com/download?id=0B6KkVBpcTFQvTGtRN1RyNUNuM0k&authuser=0&confirm=t&uuid=e6179a3c-0dd8-48ce-9460-0d6783171e6e&at=APZUnTW5UwuRkhWxtY_HU3VS7XVk%3A1710311671579' --fuzzy -O \"/content/zips/impulses.zip\"\n", 90 | "\n", 91 | "for filename in os.listdir(\"/content/zips\"):\n", 92 | " if filename.endswith(\".zip\"):\n", 93 | " zip_file = os.path.join(\"/content/zips\", filename)\n", 94 | " shutil.unpack_archive(zip_file, \"/content/unzips\", 'zip')\n", 95 | "\n", 96 | "for root, dirs, files in os.walk(\"/content/unzips\"):\n", 97 | " for file in files:\n", 98 | " if file.endswith(impulse_filename):\n", 99 | " file_name = os.path.splitext(file)[0]\n", 100 | " os.makedirs(impulse_folder, exist_ok=True)\n", 101 | " shutil.move(os.path.join(root, file), os.path.join(impulse_folder, file))\n", 102 | "\n", 103 | "!rm -r /content/unzips/\n", 104 | "!rm -r /content/zips/\n", 105 | "\n", 106 | "print('Готово!')" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "source": [ 112 | "# @title # Скачиваем исходное аудио\n", 113 | "#@markdown ##### Шаг можно пропустить и вручную положить аудио-файл в /content/input\n", 114 | "#@markdown ##### Вставьте ссылку на Youtube:\n", 115 | "%cd /content/Mangio-RVC-Fork\n", 116 | "url = 'https://www.youtube.com/watch?v=DeE8Fxq3viE' #@param {type:\"string\"}\n", 117 | "\n", 118 | "default_audio = 'audio'\n", 119 | "input_download_path = '/content/input'\n", 120 | "input_download_format = 'mp3'\n", 121 | "\n", 122 | "import yt_dlp\n", 123 | "import ffmpeg\n", 124 | "import sys\n", 125 | "from IPython.display import Audio, display, HTML, FileLink\n", 126 | "\n", 127 | "ydl_opts = {\n", 128 | " 'format': 'bestaudio/best',\n", 129 | " 'postprocessors': [{\n", 130 | " 'key': 'FFmpegExtractAudio',\n", 131 | " 'preferredcodec': input_download_format,\n", 
132 | " }],\n", 133 | " \"outtmpl\": f'{input_download_path}/{default_audio}',\n", 134 | "}\n", 135 | "def download_from_url(url):\n", 136 | " ydl.download([url])\n", 137 | "\n", 138 | "with yt_dlp.YoutubeDL(ydl_opts) as ydl:\n", 139 | " download_from_url(url)\n", 140 | "\n", 141 | "audio = Audio(f'{input_download_path}/{default_audio}.{input_download_format}', autoplay=False)\n", 142 | "display(audio)" 143 | ], 144 | "metadata": { 145 | "cellView": "form", 146 | "id": "YTpxq9MtN0kj" 147 | }, 148 | "execution_count": null, 149 | "outputs": [] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "source": [ 154 | "# @title # Скачиваем модель\n", 155 | "#@markdown ##### Шаг можно пропустить и вручную положить .pth-модель и .index файл в репозиторий /content/Mangio-RVC-Fork\n", 156 | "#@markdown Вставьте ссылку на модель (Mega, Drive, etc.):\n", 157 | "url = 'https://drive.google.com/file/d/14OVs-EEohPcHRqXtYuK_0xMesxnfHgab/view' #@param {type:\"string\"}\n", 158 | "\n", 159 | "%cd /content/Mangio-RVC-Fork\n", 160 | "\n", 161 | "#@markdown ---\n", 162 | "#@markdown ##### Ссылки на модели:\n", 163 | "#@markdown ##### https://docs.google.com/spreadsheets/d/1tAUaQrEHYgRsm1Lvrnj14HFHDwJWl0Bd9x0QePewNco\n", 164 | "#@markdown ##### https://huggingface.co/QuickWick/Music-AI-Voices/tree/main\n", 165 | "#@markdown ##### https://discord.gg/aihubbrasil\n", 166 | "\n", 167 | "from mega.mega import Mega\n", 168 | "import os\n", 169 | "import shutil\n", 170 | "from urllib.parse import urlparse, parse_qs\n", 171 | "import urllib.parse\n", 172 | "from google.oauth2.service_account import Credentials\n", 173 | "import gspread\n", 174 | "import pandas as pd\n", 175 | "from tqdm import tqdm\n", 176 | "from bs4 import BeautifulSoup\n", 177 | "import requests\n", 178 | "import hashlib\n", 179 | "\n", 180 | "!rm -rf /content/unzips/\n", 181 | "!rm -rf /content/zips/\n", 182 | "!mkdir /content/unzips\n", 183 | "!mkdir /content/zips\n", 184 | "\n", 185 | "def sanitize_directory(directory):\n", 
186 | " for filename in os.listdir(directory):\n", 187 | " file_path = os.path.join(directory, filename)\n", 188 | " if os.path.isfile(file_path):\n", 189 | " if filename == \".DS_Store\" or filename.startswith(\"._\"):\n", 190 | " os.remove(file_path)\n", 191 | " elif os.path.isdir(file_path):\n", 192 | " sanitize_directory(file_path)\n", 193 | "\n", 194 | "model_zip = urlparse(url).path.split('/')[-2] + '.zip'\n", 195 | "model_zip_path = '/content/zips/' + model_zip\n", 196 | "\n", 197 | "private_model = False\n", 198 | "condition1 = False\n", 199 | "condition2 = False\n", 200 | "condition3 = False\n", 201 | "is_index_found = False\n", 202 | "\n", 203 | "if url != '':\n", 204 | " MODEL = \"\" # Initialize MODEL variable\n", 205 | " !mkdir -p /content/Mangio-RVC-Fork/logs/$MODEL\n", 206 | " !mkdir -p /content/zips/\n", 207 | " !mkdir -p /content/Mangio-RVC-Fork/weights/ # Create the 'weights' directory\n", 208 | "\n", 209 | " if \"drive.google.com\" in url:\n", 210 | " !gdown $url --fuzzy -O \"$model_zip_path\"\n", 211 | " elif \"/blob/\" in url:\n", 212 | " url = url.replace(\"blob\", \"resolve\")\n", 213 | " print(\"Resolved URL:\", url) # Print the resolved URL\n", 214 | " !wget \"$url\" -O \"$model_zip_path\"\n", 215 | " elif \"mega.nz\" in url:\n", 216 | " m = Mega()\n", 217 | " print(\"Starting download from MEGA....\")\n", 218 | " m.download_url(url, '/content/zips')\n", 219 | " elif \"/tree/main\" in url:\n", 220 | " response = requests.get(url)\n", 221 | " soup = BeautifulSoup(response.content, 'html.parser')\n", 222 | " temp_url = ''\n", 223 | " for link in soup.find_all('a', href=True):\n", 224 | " if link['href'].endswith('.zip'):\n", 225 | " temp_url = link['href']\n", 226 | " break\n", 227 | " if temp_url:\n", 228 | " url = temp_url\n", 229 | " print(\"Updated URL:\", url) # Print the updated URL\n", 230 | " url = url.replace(\"blob\", \"resolve\")\n", 231 | " print(\"Resolved URL:\", url) # Print the resolved URL\n", 232 | "\n", 233 | " if 
\"huggingface.co\" not in url:\n", 234 | " url = \"https://huggingface.co\" + url\n", 235 | "\n", 236 | " !wget \"$url\" -O \"$model_zip_path\"\n", 237 | " else:\n", 238 | " print(\"No .zip file found on the page.\")\n", 239 | " # Handle the case when no .zip file is found\n", 240 | " else:\n", 241 | " !wget \"$url\" -O \"$model_zip_path\"\n", 242 | "\n", 243 | " for filename in os.listdir(\"/content/zips\"):\n", 244 | " if filename.endswith(\".zip\"):\n", 245 | " zip_file = os.path.join(\"/content/zips\", filename)\n", 246 | " shutil.unpack_archive(zip_file, \"/content/unzips\", 'zip')\n", 247 | "\n", 248 | "sanitize_directory(\"/content/unzips\")\n", 249 | "\n", 250 | "def find_pth_file(folder):\n", 251 | " for root, dirs, files in os.walk(folder):\n", 252 | " for file in files:\n", 253 | " if file.endswith(\".pth\"):\n", 254 | " file_name = os.path.splitext(file)[0]\n", 255 | " if file_name.startswith(\"G_\") or file_name.startswith(\"P_\"):\n", 256 | " config_file = os.path.join(root, \"config.json\")\n", 257 | " if os.path.isfile(config_file):\n", 258 | " print(\"Outdated .pth detected! This is not compatible with the RVC method. 
Find the RVC equivalent model!\")\n", 259 | " continue # Continue searching for a valid file\n", 260 | " file_path = os.path.join(root, file)\n", 261 | " if os.path.getsize(file_path) > 100 * 1024 * 1024: # Check file size in bytes (100MB)\n", 262 | " print(\"Skipping unusable training file:\", file)\n", 263 | " continue # Continue searching for a valid file\n", 264 | " return file_name\n", 265 | " return None\n", 266 | "\n", 267 | "MODEL = find_pth_file(\"/content/unzips\")\n", 268 | "if MODEL is not None:\n", 269 | " print(\"Found .pth file:\", MODEL + \".pth\")\n", 270 | "else:\n", 271 | " print(\"Error: Could not find a valid .pth file within the extracted zip.\")\n", 272 | " print(\"If there's an error above this talking about 'Access denied', try one of the Alt URLs in the Google Sheets for this model.\")\n", 273 | " MODEL = \"\"\n", 274 | " global condition3\n", 275 | " condition3 = True\n", 276 | "\n", 277 | "index_path = \"\"\n", 278 | "\n", 279 | "def find_version_number(index_path):\n", 280 | " if condition2 and not condition1:\n", 281 | " if file_size >= 55180000:\n", 282 | " return 'RVC v2'\n", 283 | " else:\n", 284 | " return 'RVC v1'\n", 285 | "\n", 286 | " filename = os.path.basename(index_path)\n", 287 | "\n", 288 | " if filename.endswith(\"_v2.index\"):\n", 289 | " return 'RVC v2'\n", 290 | " elif filename.endswith(\"_v1.index\"):\n", 291 | " return 'RVC v1'\n", 292 | " else:\n", 293 | " if file_size >= 55180000:\n", 294 | " return 'RVC v2'\n", 295 | " else:\n", 296 | " return 'RVC v1'\n", 297 | "\n", 298 | "if MODEL != \"\":\n", 299 | " # Move model into logs folder\n", 300 | " for root, dirs, files in os.walk('/content/unzips'):\n", 301 | " for file in files:\n", 302 | " file_path = os.path.join(root, file)\n", 303 | " if file.endswith(\".index\"):\n", 304 | " print(\"Found index file:\", file)\n", 305 | " is_index_found = True\n", 306 | " condition1 = True\n", 307 | " logs_folder = os.path.join('/content/Mangio-RVC-Fork/logs', MODEL)\n", 308 | 
" os.makedirs(logs_folder, exist_ok=True) # Create the logs folder if it doesn't exist\n", 309 | "\n", 310 | " # Delete identical .index file if it exists\n", 311 | " if file.endswith(\".index\"):\n", 312 | " identical_index_path = os.path.join(logs_folder, file)\n", 313 | " if os.path.exists(identical_index_path):\n", 314 | " os.remove(identical_index_path)\n", 315 | "\n", 316 | " shutil.move(file_path, logs_folder)\n", 317 | " index_path = os.path.join(logs_folder, file) # Set index_path variable\n", 318 | "\n", 319 | " elif \"G_\" not in file and \"D_\" not in file and file.endswith(\".pth\"):\n", 320 | " destination_path = f'/content/Mangio-RVC-Fork/weights/{MODEL}.pth'\n", 321 | " if os.path.exists(destination_path):\n", 322 | " print(\"You already downloaded this model. Re-importing anyways..\")\n", 323 | " shutil.move(file_path, destination_path)\n", 324 | "\n", 325 | "if condition1 is False:\n", 326 | " logs_folder = os.path.join('/content/Mangio-RVC-Fork/logs', MODEL)\n", 327 | " os.makedirs(logs_folder, exist_ok=True)\n", 328 | "# this is here so it doesnt crash if the model is missing an index for some reason\n", 329 | "else:\n", 330 | " print(\"URL cannot be left empty. 
If you don't want to download a model now, just skip this step.\")\n", 331 | "\n", 332 | "# Качаем любой index-файл, если в архиве его не было\n", 333 | "if is_index_found is False:\n", 334 | " logs_folder = os.path.join('/content/Mangio-RVC-Fork/logs', MODEL)\n", 335 | " index_path = os.path.join(logs_folder, 'model.index')\n", 336 | " if os.path.exists(index_path) == False:\n", 337 | " !wget 'https://huggingface.co/sail-rvc/2001MJAIDAM/resolve/main/model.index' -P {logs_folder}\n", 338 | "\n", 339 | "!rm -r /content/unzips/\n", 340 | "!rm -r /content/zips/" 341 | ], 342 | "metadata": { 343 | "cellView": "form", 344 | "id": "0x6VOMyae_lq" 345 | }, 346 | "execution_count": null, 347 | "outputs": [] 348 | }, 349 | { 350 | "cell_type": "markdown", 351 | "metadata": { 352 | "id": "x8CNwbOjZPUY" 353 | }, 354 | "source": [ 355 | "# Начинаем обработку исходника\n", 356 | "### В ***/content/input*** должен быть трек, а RVC-модель должна быть в репозитории ***/content/Mangio-RVC-Fork***" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": null, 362 | "metadata": { 363 | "cellView": "form", 364 | "id": "0XuTHvtDZJPT" 365 | }, 366 | "outputs": [], 367 | "source": [ 368 | "# @title #Отделяем вокал от минуса\n", 369 | "%cd /content/MVSEP-MDX23-Colab_v2\n", 370 | "\n", 371 | "from pathlib import Path, PurePath\n", 372 | "from IPython.display import Audio, display, HTML, FileLink\n", 373 | "import os\n", 374 | "\n", 375 | "# @markdown Папка с исходным музыкальным файлом (только один файл):\n", 376 | "INPUT = '/content/input' #@param {type:\"string\"}\n", 377 | "# @markdown ---\n", 378 | "OUTPUT_UVR_FOLDER = '/content/output_uvr' #@param {type:\"string\"}\n", 379 | "# @markdown ---\n", 380 | "# @markdown Соотношение скорость/качество (1 - максимальная скорость, 10 - максимальное качество):\n", 381 | "quality = 5 # @param {type:\"slider\", min:1, max:10, step:1}\n", 382 | "quality_max = 10\n", 383 | "\n", 384 | "BigShifts = round(11 * quality / 
quality_max)\n", 385 | "overlap_InstVoc = round(8 * quality / quality_max)\n", 386 | "overlap_VitLarge = round(8 * quality / quality_max)\n", 387 | "weight_InstVoc = 8\n", 388 | "weight_VitLarge = 5\n", 389 | "is_VOCFT = False\n", 390 | "overlap_VOCFT = 0.1\n", 391 | "weight_VOCFT = 1\n", 392 | "is_vocals_only = True\n", 393 | "overlap_demucs = 0.6\n", 394 | "output_format = 'PCM_16'\n", 395 | "vocals_only = ''\n", 396 | "use_VOCFT = ''\n", 397 | "if is_vocals_only:\n", 398 | " vocals_only = '--vocals_only true'\n", 399 | "else:\n", 400 | " vocals_only = ''\n", 401 | "if is_VOCFT:\n", 402 | " use_VOCFT = '--use_VOCFT true'\n", 403 | "else:\n", 404 | " use_VOCFT = ''\n", 405 | "\n", 406 | "\n", 407 | "filename = next(Path(INPUT).glob('*.mp3'))\n", 408 | "INPUT_NAME = Path(filename).stem\n", 409 | "VOCAL_FILE = os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.wav\")\n", 410 | "INSTRUM_FILE = os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum.wav\")\n", 411 | "\n", 412 | "!python inference.py \\\n", 413 | " --large_gpu \\\n", 414 | " --weight_InstVoc {weight_InstVoc} \\\n", 415 | " --weight_VOCFT {weight_VOCFT} \\\n", 416 | " --weight_VitLarge {weight_VitLarge} \\\n", 417 | " --input_audio \"{filename}\" \\\n", 418 | " --overlap_demucs {overlap_demucs} \\\n", 419 | " --overlap_VOCFT {overlap_VOCFT} \\\n", 420 | " --overlap_InstVoc {overlap_InstVoc} \\\n", 421 | " --overlap_VitLarge {overlap_VitLarge} \\\n", 422 | " --output_format {output_format} \\\n", 423 | " --BigShifts {BigShifts} \\\n", 424 | " --output_folder \"{OUTPUT_UVR_FOLDER}\" \\\n", 425 | " {vocals_only} \\\n", 426 | " {use_VOCFT}\n", 427 | "\n", 428 | "print(\"\\nПослушаем разделённый трек.\")\n", 429 | "print(\"Нужно немного подождать, сейчас появится...\")\n", 430 | "# колаб порой офигивает от размера wav и дисконнектится\n", 431 | "!ffmpeg -y -i {VOCAL_FILE} -vn -ar 44100 -ac 2 -b:a 192k {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals\")}.mp3 &> /dev/null\n", 432 | "!ffmpeg -y -i 
{INSTRUM_FILE} -vn -ar 44100 -ac 2 -b:a 192k {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum\")}.mp3 &> /dev/null\n", 433 | "print(\"Вокал:\")\n", 434 | "audio_vocal = Audio(os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\"), autoplay=False)\n", 435 | "display(audio_vocal)\n", 436 | "print(\"Инструментал:\")\n", 437 | "audio_inst = Audio(os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum.mp3\"), autoplay=False)\n", 438 | "display(audio_inst)\n", 439 | "\n", 440 | "!rm -rf {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\")}\n", 441 | "!rm -rf {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_instrum.mp3\")}\n", 442 | "\n", 443 | "print('\\nСовет: если в вокале много эха и ревера, переходи к следующему шагу перед преобразованием RVC')" 444 | ] 445 | }, 446 | { 447 | "cell_type": "code", 448 | "execution_count": null, 449 | "metadata": { 450 | "cellView": "form", 451 | "id": "3yztvPmFytx-" 452 | }, 453 | "outputs": [], 454 | "source": [ 455 | "%cd /content/Mangio-RVC-Fork\n", 456 | "# @title # Доп обработка от ревера и эхо (опционально)\n", 457 | "# @markdown ##### Новый файл автоматически заменит исходный вокальный файл /content/output_uvr/file_vocals.wav\n", 458 | "# @markdown ---\n", 459 | "# @markdown ##### Эти значения можно не трогать:\n", 460 | "\n", 461 | "import numpy as np\n", 462 | "np.float = float\n", 463 | "\n", 464 | "postprocess = False #@param {type:\"boolean\"}\n", 465 | "tta = True #@param {type:\"boolean\"}\n", 466 | "is_half = False #@param {type:\"boolean\"}\n", 467 | "window_size = 512 # @param {type:\"slider\", min:0, max:1024, step:32}\n", 468 | "\n", 469 | "from pathlib import Path, PurePath\n", 470 | "from IPython.display import Audio, display, HTML, FileLink\n", 471 | "\n", 472 | "input_denoise_file = VOCAL_FILE\n", 473 | "output_folder = PurePath(VOCAL_FILE).parent\n", 474 | "\n", 475 | "import os, sys, torch, warnings, pdb\n", 476 | "\n", 477 | "now_dir = os.getcwd()\n", 478 | "sys.path.append(now_dir)\n", 
479 | "from json import load as ll\n", 480 | "\n", 481 | "warnings.filterwarnings(\"ignore\")\n", 482 | "import librosa\n", 483 | "import importlib\n", 484 | "import numpy as np\n", 485 | "import hashlib, math\n", 486 | "from tqdm import tqdm\n", 487 | "from lib.uvr5_pack.lib_v5 import spec_utils\n", 488 | "from lib.uvr5_pack.utils import _get_name_params, inference\n", 489 | "from lib.uvr5_pack.lib_v5.model_param_init import ModelParameters\n", 490 | "import soundfile as sf\n", 491 | "from lib.uvr5_pack.lib_v5.nets_new import CascadedNet\n", 492 | "from lib.uvr5_pack.lib_v5 import nets_61968KB as nets\n", 493 | "\n", 494 | "class _audio_pre_new:\n", 495 | " def __init__(self, agg, model_path, device, is_half):\n", 496 | " self.model_path = model_path\n", 497 | " self.device = device\n", 498 | " self.data = {\n", 499 | " # Processing Options\n", 500 | " \"postprocess\": postprocess,\n", 501 | " \"tta\": tta,\n", 502 | " # Constants\n", 503 | " \"window_size\": window_size,\n", 504 | " \"agg\": agg,\n", 505 | " \"high_end_process\": \"mirroring\",\n", 506 | " }\n", 507 | " mp = ModelParameters(\"lib/uvr5_pack/lib_v5/modelparams/4band_v3.json\")\n", 508 | " nout = 64 if \"DeReverb\" in model_path else 48\n", 509 | " model = CascadedNet(mp.param[\"bins\"] * 2, nout)\n", 510 | " cpk = torch.load(model_path, map_location=\"cuda\")\n", 511 | " model.load_state_dict(cpk)\n", 512 | " model.eval()\n", 513 | " if is_half:\n", 514 | " model = model.half().to(device)\n", 515 | " else:\n", 516 | " model = model.to(device)\n", 517 | "\n", 518 | " self.mp = mp\n", 519 | " self.model = model\n", 520 | "\n", 521 | " def _path_audio_(\n", 522 | " self, music_file, vocal_root=None, ins_root=None, format=\"flac\"\n", 523 | " ):\n", 524 | " if ins_root is None and vocal_root is None:\n", 525 | " return \"No save root.\"\n", 526 | " name = os.path.basename(music_file)\n", 527 | " if ins_root is not None:\n", 528 | " os.makedirs(ins_root, exist_ok=True)\n", 529 | " if vocal_root is not 
None:\n", 530 | " os.makedirs(vocal_root, exist_ok=True)\n", 531 | " X_wave, y_wave, X_spec_s, y_spec_s = {}, {}, {}, {}\n", 532 | " bands_n = len(self.mp.param[\"band\"])\n", 533 | " for d in range(bands_n, 0, -1):\n", 534 | " bp = self.mp.param[\"band\"][d]\n", 535 | " if d == bands_n: # high-end band\n", 536 | " (\n", 537 | " X_wave[d],\n", 538 | " _,\n", 539 | " ) = librosa.core.load(\n", 540 | " music_file,\n", 541 | " bp[\"sr\"],\n", 542 | " False,\n", 543 | " dtype=np.float32,\n", 544 | " res_type=bp[\"res_type\"],\n", 545 | " )\n", 546 | " if X_wave[d].ndim == 1:\n", 547 | " X_wave[d] = np.asfortranarray([X_wave[d], X_wave[d]])\n", 548 | " else: # lower bands\n", 549 | " X_wave[d] = librosa.core.resample(\n", 550 | " X_wave[d + 1],\n", 551 | " self.mp.param[\"band\"][d + 1][\"sr\"],\n", 552 | " bp[\"sr\"],\n", 553 | " res_type=bp[\"res_type\"],\n", 554 | " )\n", 555 | " # Stft of wave source\n", 556 | " X_spec_s[d] = spec_utils.wave_to_spectrogram_mt(\n", 557 | " X_wave[d],\n", 558 | " bp[\"hl\"],\n", 559 | " bp[\"n_fft\"],\n", 560 | " self.mp.param[\"mid_side\"],\n", 561 | " self.mp.param[\"mid_side_b2\"],\n", 562 | " self.mp.param[\"reverse\"],\n", 563 | " )\n", 564 | " # pdb.set_trace()\n", 565 | " if d == bands_n and self.data[\"high_end_process\"] != \"none\":\n", 566 | " input_high_end_h = (bp[\"n_fft\"] // 2 - bp[\"crop_stop\"]) + (\n", 567 | " self.mp.param[\"pre_filter_stop\"] - self.mp.param[\"pre_filter_start\"]\n", 568 | " )\n", 569 | " input_high_end = X_spec_s[d][\n", 570 | " :, bp[\"n_fft\"] // 2 - input_high_end_h : bp[\"n_fft\"] // 2, :\n", 571 | " ]\n", 572 | "\n", 573 | " X_spec_m = spec_utils.combine_spectrograms(X_spec_s, self.mp)\n", 574 | " aggresive_set = float(self.data[\"agg\"] / 100)\n", 575 | " aggressiveness = {\n", 576 | " \"value\": aggresive_set,\n", 577 | " \"split_bin\": self.mp.param[\"band\"][1][\"crop_stop\"],\n", 578 | " }\n", 579 | " with torch.no_grad():\n", 580 | " pred, X_mag, X_phase = inference(\n", 581 | " 
X_spec_m, self.device, self.model, aggressiveness, self.data\n", 582 | " )\n", 583 | " # Postprocess\n", 584 | " if self.data[\"postprocess\"]:\n", 585 | " pred_inv = np.clip(X_mag - pred, 0, np.inf)\n", 586 | " pred = spec_utils.mask_silence(pred, pred_inv)\n", 587 | " y_spec_m = pred * X_phase\n", 588 | " v_spec_m = X_spec_m - y_spec_m\n", 589 | "\n", 590 | " if ins_root is not None:\n", 591 | " if self.data[\"high_end_process\"].startswith(\"mirroring\"):\n", 592 | " input_high_end_ = spec_utils.mirroring(\n", 593 | " self.data[\"high_end_process\"], y_spec_m, input_high_end, self.mp\n", 594 | " )\n", 595 | " wav_instrument = spec_utils.cmb_spectrogram_to_wave(\n", 596 | " y_spec_m, self.mp, input_high_end_h, input_high_end_\n", 597 | " )\n", 598 | " else:\n", 599 | " wav_instrument = spec_utils.cmb_spectrogram_to_wave(y_spec_m, self.mp)\n", 600 | " if format in [\"wav\", \"flac\"]:\n", 601 | " sf.write(\n", 602 | " os.path.join(\n", 603 | " ins_root,\n", 604 | " \"denoised_{}.{}\".format(name, format),\n", 605 | " ),\n", 606 | " (np.array(wav_instrument) * 32768).astype(\"int16\"),\n", 607 | " self.mp.param[\"sr\"],\n", 608 | " )\n", 609 | " else:\n", 610 | " path = os.path.join(\n", 611 | " ins_root, \"denoised_{}.wav\".format(name)\n", 612 | " )\n", 613 | " sf.write(\n", 614 | " path,\n", 615 | " (np.array(wav_instrument) * 32768).astype(\"int16\"),\n", 616 | " self.mp.param[\"sr\"],\n", 617 | " )\n", 618 | " if os.path.exists(path):\n", 619 | " os.system(\n", 620 | " \"ffmpeg -i %s -vn %s -q:a 2 -y\"\n", 621 | " % (path, path[:-4] + \".%s\" % format)\n", 622 | " )\n", 623 | " if vocal_root is not None:\n", 624 | " if self.data[\"high_end_process\"].startswith(\"mirroring\"):\n", 625 | " input_high_end_ = spec_utils.mirroring(\n", 626 | " self.data[\"high_end_process\"], v_spec_m, input_high_end, self.mp\n", 627 | " )\n", 628 | " wav_vocals = spec_utils.cmb_spectrogram_to_wave(\n", 629 | " v_spec_m, self.mp, input_high_end_h, input_high_end_\n", 630 | " )\n", 
631 | " else:\n", 632 | " wav_vocals = spec_utils.cmb_spectrogram_to_wave(v_spec_m, self.mp)\n", 633 | " if format in [\"wav\", \"flac\"]:\n", 634 | " sf.write(\n", 635 | " os.path.join(\n", 636 | " vocal_root,\n", 637 | " \"vocal_{}_{}.{}\".format(name, self.data[\"agg\"], format),\n", 638 | " ),\n", 639 | " (np.array(wav_vocals) * 32768).astype(\"int16\"),\n", 640 | " self.mp.param[\"sr\"],\n", 641 | " )\n", 642 | " else:\n", 643 | " path = os.path.join(\n", 644 | " vocal_root, \"vocal_{}_{}.wav\".format(name, self.data[\"agg\"])\n", 645 | " )\n", 646 | " sf.write(\n", 647 | " path,\n", 648 | " (np.array(wav_vocals) * 32768).astype(\"int16\"),\n", 649 | " self.mp.param[\"sr\"],\n", 650 | " )\n", 651 | " if os.path.exists(path):\n", 652 | " os.system(\n", 653 | " \"ffmpeg -i %s -vn %s -q:a 2 -y\"\n", 654 | " % (path, path[:-4] + \".%s\" % format)\n", 655 | " )\n", 656 | "\n", 657 | "\n", 658 | "device = \"cuda\"\n", 659 | "model_path = \"uvr5_weights/UVR-DeEcho-DeReverb.pth\"\n", 660 | "pre_fun = _audio_pre_new(model_path=model_path, device=device, is_half=is_half, agg=10)\n", 661 | "pre_fun._path_audio_(input_denoise_file, None, output_folder, \"wav\")\n", 662 | "\n", 663 | "%mv {os.path.join(os.path.dirname(VOCAL_FILE), \"denoised_\" + os.path.basename(VOCAL_FILE)) + PurePath(VOCAL_FILE).suffix} {VOCAL_FILE}\n", 664 | "\n", 665 | "# Colab sometimes chokes on the size of a wav file and disconnects, so play back an mp3 instead\n", 666 | "!ffmpeg -y -i {VOCAL_FILE} -vn -ar 44100 -ac 2 -b:a 192k {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals\")}.mp3 &> /dev/null\n", 667 | "audio = Audio(os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\"), autoplay=False)\n", 668 | "display(audio)\n", 669 | "\n", 670 | "!rm -rf {os.path.join(OUTPUT_UVR_FOLDER, INPUT_NAME + \"_vocals.mp3\")}" 671 | ] 672 | }, 673 | { 674 | "cell_type": "markdown", 675 | "source": [ 676 | "# Converting the file and merging it with the original instrumental" 677 | ], 678 | "metadata": { 679 | "id": "H3OJWmYNuTuH" 680 | 
} 681 | }, 682 | { 683 | "cell_type": "code", 684 | "execution_count": null, 685 | "metadata": { 686 | "id": "J0JNWaDOoK0H", 687 | "cellView": "form" 688 | }, 689 | "outputs": [], 690 | "source": [ 691 | "# @title # 🚀 Conversion\n", 692 | "from IPython.display import Audio, display, HTML, FileLink\n", 693 | "from pathlib import Path, PurePath\n", 694 | "import os\n", 695 | "from decimal import Decimal\n", 696 | "\n", 697 | "result_path = \"/content/output_rvc\"\n", 698 | "result_name = \"rvc_result\"\n", 699 | "result_format = \"mp3\"\n", 700 | "\n", 701 | "RVC_RESULT_FILE = os.path.join(result_path, result_name + \".\" + result_format)\n", 702 | "rvc_result_filename = os.path.basename(RVC_RESULT_FILE)\n", 703 | "\n", 704 | "f0_method = \"rmvpe\"\n", 705 | "\n", 706 | "# @markdown ## These settings do not change the key\n", 707 | "\n", 708 | "# @markdown ### Pitch (octave up, octave down, standard):\n", 709 | "transpositionMode = \"Standard\" # @param [\"M -> F\", \"F -> M\", \"Standard\"]\n", 710 | "transposition = 0\n", 711 | "\n", 712 | "if transpositionMode == \"M -> F\":\n", 713 | " transposition = 12\n", 714 | "elif transpositionMode == \"F -> M\":\n", 715 | " transposition = -12\n", 716 | "elif transpositionMode == \"Standard\":\n", 717 | " transposition = 0\n", 718 | "\n", 719 | "# @markdown ---\n", 720 | "# @markdown ## Mode:\n", 721 | "mode = \"Standard\" # @param [\"M -> F\", \"F -> M\", \"Standard\", \"Manual settings\"]\n", 722 | "\n", 723 | "# @markdown ---\n", 724 | "# @markdown ### Manual settings:\n", 725 | "# @markdown ###### (ignored unless the Manual settings mode is selected)\n", 726 | "# @markdown ##### Quefrency (default 0.0 ms):\n", 727 | "quefrency = 0 # @param {type:\"slider\", min:0.0, max:2, step:0.1}\n", 728 | "# @markdown ##### Timbre factor (default 1.0):\n", 729 | "tembre = 1 # @param 
{type:\"slider\", min:0.0, max:2, step:0.1}\n", 730 | "\n", 731 | "if mode == \"M -> F\":\n", 732 | " quefrency = 0.5\n", 733 | " tembre = 1.3\n", 734 | "elif mode == \"F -> M\":\n", 735 | " quefrency = 1.0\n", 736 | " tembre = 0.7\n", 737 | "elif mode == \"Standard\":\n", 738 | " quefrency = 0.0\n", 739 | " tembre = 1.0\n", 740 | "\n", 741 | "# @markdown ### Standard mode is faster! So first try adjusting only the pitch, and experiment with the modes afterwards.\n", 742 | "# @markdown ---\n", 743 | "# @markdown ## Hint:\n", 744 | "# @markdown ###### Female to male: quefrency = 1.0 (above the default), tembre = 0.7 (below the default)\n", 745 | "# @markdown ###### Male to female: quefrency = 0.5 (slightly above the default), tembre = 1.3 (above the default)\n", 746 | "# @markdown ###### There is no universal setting; the result depends primarily on the source voice and the target model. Use the values above as a starting point.\n", 747 | "%cd /content/Mangio-RVC-Fork\n", 748 | "\n", 749 | "# \"\\n arg 1) model name with .pth in ./weights: mi-test.pth\"\n", 750 | "# \"\\n arg 2) source audio path: myFolder\\\\MySource.wav\"\n", 751 | "# \"\\n arg 3) output file name to be placed in './audio-outputs': MyTest.wav\"\n", 752 | "# \"\\n arg 4) feature index file path: logs/mi-test/added_IVF3042_Flat_nprobe_1.index\"\n", 753 | "# \"\\n arg 5) speaker id: 0\"\n", 754 | "# \"\\n arg 6) transposition: 0\"\n", 755 | "# \"\\n arg 7) f0 method: harvest (pm, harvest, crepe, crepe-tiny, hybrid[x,x,x,x], mangio-crepe, mangio-crepe-tiny, rmvpe)\"\n", 756 | "# \"\\n arg 8) crepe hop length: 160\"\n", 757 | "# \"\\n arg 9) harvest median filter radius: 3 (0-7)\"\n", 758 | "# \"\\n arg 10) post resample rate: 0\"\n", 759 | "# \"\\n arg 11) mix volume envelope: 1\"\n", 760 | "# \"\\n arg 12) feature index ratio: 0.78 (0-1)\"\n", 761 | "# \"\\n arg 13) Voiceless Consonant Protection (Less Artifact): 0.33 (Smaller number = more protection. 
0.50 means Dont Use.)\"\n", 762 | "# \"\\n arg 14) Whether to formant shift the inference audio before conversion: False (if set to false, you can ignore setting the quefrency and timbre values for formanting)\"\n", 763 | "# \"\\n arg 15)* Quefrency for formanting: 8.0 (no need to set if arg14 is False/false)\"\n", 764 | "# \"\\n arg 16)* Timbre for formanting: 1.2 (no need to set if arg14 is False/false) \\n\"\n", 765 | "# \"\\nExample: mi-test.pth audios/Sidney.wav myTest.wav logs/mi-test/added_index.index 0 -2 harvest 160 3 0 1 0.95 0.33 0.45 True 8.0 1.2\"\n", 766 | "is_formant = \"False\"\n", 767 | "if quefrency != 0 and tembre != 1:\n", 768 | " is_formant = \"True\"\n", 769 | "\n", 770 | "quefrency_value = \"{:.1f}\".format(Decimal(quefrency).quantize(Decimal('0.1')))\n", 771 | "tembre_value = \"{:.1f}\".format(Decimal(tembre).quantize(Decimal('0.1')))\n", 772 | "transposition_value = str(transposition)\n", 773 | "\n", 774 | "cmd = MODEL + \".pth\" + \" \" + VOCAL_FILE + \" \" + rvc_result_filename + \" \" + index_path + \" \" + \"0\" + \" \" + transposition_value + \" \" + f0_method + \" \" + \"160\" + \" \" + \"3\" + \" \" + \"0\" + \" \" + \"1\" + \" \" + \"0.78\" + \" \" + \"0.33\" + \" \" + \"0.45\" + \" \" + is_formant + \" \" + quefrency_value + \" \" + tembre_value\n", 775 | "print(cmd)\n", 776 | "!echo -e -n \"go infer\\n{cmd}\\nstop_infer\" | python3 infer-web.py --colab --pycmd python3 --is_cli &> /dev/null\n", 777 | "%mv /content/Mangio-RVC-Fork/audio-outputs/{rvc_result_filename} {RVC_RESULT_FILE}\n", 778 | "audio = Audio(RVC_RESULT_FILE, autoplay=False)\n", 779 | "display(audio)" 780 | ] 781 | }, 782 | { 783 | "cell_type": "code", 784 | "source": [ 785 | "%cd /content\n", 786 | "# @title # Post-processing\n", 787 | "# @markdown ### Compressor + normalization + light reverb + stereo spread\n", 788 | "\n", 789 | "from IPython.display import Audio, display, HTML, FileLink\n", 790 | "import os\n", 791 | "import shutil\n", 792 | 
"\n", 793 | "OUTPUT_PATH = '/content/output'\n", 794 | "PROCESSED_OUTPUT_FORMAT = 'mp3'\n", 795 | "COMPRESSED_RESULT_FILE = os.path.join(OUTPUT_PATH, f\"{os.path.splitext(RVC_RESULT_FILE)[0]}_compressed.{PROCESSED_OUTPUT_FORMAT}\")\n", 796 | "PROCESSED_RESULT_FILE = os.path.join(OUTPUT_PATH, f\"{os.path.splitext(RVC_RESULT_FILE)[0]}_processed.{PROCESSED_OUTPUT_FORMAT}\")\n", 797 | "\n", 798 | "# compress the vocals\n", 799 | "!ffmpeg -y -i {RVC_RESULT_FILE} -filter_complex \"anlmdn=s=10,acompressor=threshold=-20dB:ratio=4:attack=20:release=200,volume=2,loudnorm=I=-13:TP=-1.0:LRA=9,volume=1.5\" {COMPRESSED_RESULT_FILE}\n", 800 | "if not os.path.isfile(COMPRESSED_RESULT_FILE):\n", 801 | " print(f\"Failed to process file {RVC_RESULT_FILE}\")\n", 802 | " shutil.copy(RVC_RESULT_FILE, PROCESSED_RESULT_FILE)\n", 803 | "else:\n", 804 | " if os.path.isfile(IMPULSE_FILE):\n", 805 | " # add reverb with different processing for the left and right channels to create a stereo effect\n", 806 | " print(\"Adding reverb with different processing for the left and right channels to create a stereo effect\")\n", 807 | " !ffmpeg -y -i {COMPRESSED_RESULT_FILE} -i {IMPULSE_FILE} -filter_complex \"[0:a]asplit=2[splita][splitb]; [splita]adelay=40|40[splita_delayed]; [splitb]adelay=20|20[splitb_delayed]; [splita_delayed][1]afir=dry=10:wet=10[reverb_left]; [splitb_delayed][1]afir=dry=10:wet=10[reverb_right]; [reverb_left][reverb_right]amerge=inputs=2[reverb]; [0:a][reverb]amix=inputs=2:weights=20 1[audio]\" -map \"[audio]\" {PROCESSED_RESULT_FILE}\n", 808 | " if os.path.isfile(PROCESSED_RESULT_FILE):\n", 809 | " !rm -rf {COMPRESSED_RESULT_FILE}\n", 810 | " else:\n", 811 | " print(f\"Failed to process the compressed file {COMPRESSED_RESULT_FILE}\")\n", 812 | " shutil.move(COMPRESSED_RESULT_FILE, PROCESSED_RESULT_FILE)\n", 813 | " else:\n", 814 | " print(f\"Impulse file not found: {IMPULSE_FILE}\")\n", 815 | " shutil.move(COMPRESSED_RESULT_FILE, 
PROCESSED_RESULT_FILE)\n", 816 | "\n", 817 | "audio = Audio(PROCESSED_RESULT_FILE, autoplay=False)\n", 818 | "display(audio)" 819 | ], 820 | "metadata": { 821 | "cellView": "form", 822 | "id": "IzIpIvLUxWmq" 823 | }, 824 | "execution_count": null, 825 | "outputs": [] 826 | }, 827 | { 828 | "cell_type": "code", 829 | "source": [ 830 | "%cd /content\n", 831 | "# @title # Merging\n", 832 | "\n", 833 | "from IPython.display import Audio, display, HTML, FileLink\n", 834 | "import os\n", 835 | "\n", 836 | "OUTPUT_PATH = '/content/output' #@param {type:\"string\"}\n", 837 | "OUTPUT_FORMAT = 'mp3'\n", 838 | "\n", 839 | "RESULT_FILE = os.path.join(OUTPUT_PATH, INPUT_NAME + \".\" + OUTPUT_FORMAT)\n", 840 | "\n", 841 | "!ffmpeg -y -i {PROCESSED_RESULT_FILE} -i {INSTRUM_FILE} -filter_complex \"[0:a][1:a]amerge=inputs=2[a]\" -map \"[a]\" -ac 2 {RESULT_FILE}\n", 842 | "\n", 843 | "audio = Audio(RESULT_FILE, autoplay=False)\n", 844 | "display(audio)" 845 | ], 846 | "metadata": { 847 | "cellView": "form", 848 | "id": "Xp05zq7DgcvU" 849 | }, 850 | "execution_count": null, 851 | "outputs": [] 852 | }, 853 | { 854 | "cell_type": "markdown", 855 | "source": [ 856 | "### Done\n", 857 | "#### You can now return to any previous step without rerunning the whole flow. For example, you can load a different model; then only the vocal conversion needs to be rerun, because the vocals are already separated from the instrumental." 858 | ], 859 | "metadata": { 860 | "id": "Tr6iEhD2fi0d" 861 | } 862 | } 863 | ], 864 | "metadata": { 865 | "accelerator": "GPU", 866 | "colab": { 867 | "provenance": [], 868 | "toc_visible": true 869 | }, 870 | "kernelspec": { 871 | "display_name": "Python 3", 872 | "name": "python3" 873 | }, 874 | "language_info": { 875 | "name": "python" 876 | } 877 | }, 878 | "nbformat": 4, 879 | "nbformat_minor": 0 880 | } 881 | --------------------------------------------------------------------------------