├── .github └── FUNDING.yml ├── Disco_Diffusion_v5_2_Warp.ipynb ├── Disco_Diffusion_v5_2_Warp_custom_model.ipynb ├── LICENSE ├── README.md └── image_morphing_3d.ipynb /.github/FUNDING.yml: -------------------------------------------------------------------------------- 1 | # These are supported funding model platforms 2 | 3 | github: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2] 4 | patreon: sxela 5 | open_collective: # Replace with a single Open Collective username 6 | ko_fi: # 7 | tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel 8 | community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry 9 | liberapay: # Replace with a single Liberapay username 10 | issuehunt: # Replace with a single IssueHunt username 11 | otechie: # Replace with a single Otechie username 12 | lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry 13 | custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2'] -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Alex, Respective copyrights for code pieces included can be found in the notebooks 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Disco Diffusion v5.2 - WarpFusion 2 | 3 | [![][github-release-shield]][github-release-link] 4 | [![][github-release-date-shield]][github-release-link] 5 | [![][github-downloads-shield]][github-downloads-link] 6 | 7 | 8 | [github-release-shield]: https://img.shields.io/github/v/release/Sxela/DiscoDiffusion-Warp?style=flat&sort=semver 9 | [github-release-link]: https://github.com/Sxela/DiscoDiffusion-Warp/releases 10 | [github-release-date-shield]: https://img.shields.io/github/release-date/Sxela/DiscoDiffusion-Warp?style=flat 11 | [github-downloads-shield]: https://img.shields.io/github/downloads/Sxela/DiscoDiffusion-Warp/total?style=flat 12 | [github-downloads-link]: https://github.com/Sxela/DiscoDiffusion-Warp/releases 13 | 14 | [![Disco Diffusion v5.2 - Warp](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Sxela/DiscoDiffusion-Warp/blob/main/Disco_Diffusion_v5_2_Warp.ipynb) 15 | ![visitors](https://visitor-badge.glitch.me/badge?page_id=sxela_ddwarp_repo) 16 | 17 | [Discuss on Discord](https://linktr.ee/devdef) (keeping it on linktree now so it's always an active link) 18 | 19 | # About 20 | This version improves video init. You can now generate optical flow maps from input videos and use them to: 21 | - warp init frames for a consistent style 22 | - warp processed frames for less noise in the final video 23 | 24 | ## Init warping 25 | The feature works like this: we take the 1st frame and diffuse it as usual, as an image input with fixed skip steps. Then we warp it with its flow map into the 2nd frame and blend it with the original raw video 2nd frame. This way we get the style from the heavily stylized 1st frame (warped accordingly) and the content from the 2nd frame (reducing warping artifacts and preventing overexposure).
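
To make the warp-and-blend step concrete, here is a minimal sketch of the idea. It assumes `flow` is a forward optical flow field of shape (H, W, 2) (e.g. RAFT output mapping the 1st frame onto the 2nd); the function names, the backward-remap approximation, and the default `flow_blend` value are illustrative, not the notebook's actual code:

```python
import cv2
import numpy as np

def warp_frame(frame: np.ndarray, flow: np.ndarray) -> np.ndarray:
    # Warp `frame` along `flow` via backward remapping: each output pixel
    # samples the input at (x, y) - flow, a common approximation when the
    # flow field is smooth.
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x - flow[..., 0]).astype(np.float32)
    map_y = (grid_y - flow[..., 1]).astype(np.float32)
    return cv2.remap(frame, map_x, map_y, interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)

def make_init(stylized_prev: np.ndarray, raw_next: np.ndarray,
              flow: np.ndarray, flow_blend: float = 0.5) -> np.ndarray:
    # Blend the warped stylized frame with the raw next frame:
    # flow_blend = 0 -> raw video frame, flow_blend = 1 -> warped stylized frame.
    warped = warp_frame(stylized_prev, flow)
    return cv2.addWeighted(warped, flow_blend, raw_next, 1.0 - flow_blend, 0)
```

The blended result is then used as the init image when diffusing the 2nd frame, and the same step repeats for every following frame.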
26 | 27 | # Changelog 28 | 29 | ### 27.05.2022 30 | - Add existing-flow check: flow is now generated only if no flow is found 31 | - Add comprehensive error reporting for missing video_init, video frames, and flow files 32 | - Fix non-alphanumeric batch names not working 33 | - Fix frames not being sorted before creating the output video 34 | - Fix incorrect RAFT root folder on local machines 35 | - Add storing RAFT on gdrive 36 | 37 | ### 23.05.2022 38 | - Add [colab](https://github.com/Sxela/DiscoDiffusion-Warp/blob/main/image_morphing_3d.ipynb) for 3d animation only 39 | 40 | ### 22.05.2022: 41 | - Add saving frames and flow to google drive (suggested by Chris the Wizard#8082) 42 | - Add back consistency checking 43 | 44 | ### 18.05.2022 45 | - Update 512x512 and secondary model urls 46 | 47 | ### 17.05.2022 48 | - Remove consistency checking for stability 49 | 50 | ### 15.05.2022 51 | - Add 256x256 comics faces model 52 | 53 | ### 22.04.2022: 54 | - Add ViT-L/14@336px 55 | ### 21.04.2022: 56 | - Add warp parameters to saved settings 57 | ### 16.04.2022: 58 | - Use width_height size instead of the input video size 59 | - Bring back adabins and 2d/3d anim modes 60 | - Install RAFT only when video input animation mode is selected 61 | - Generate optical flow maps only in video input animation mode, even with flow_warp unchecked, so you can still save an optical-flow-blended video later 62 | - Install AdaBins for 3d mode only (should do the same for midas) 63 | - Add animation mode check to create video tab 64 | ### 15.04.2022: Init 65 | 66 | # 67 | 68 | ## Optical flow input warping 69 | 70 | ### Settings: 71 | (Located in the animation settings tab) 72 | 73 | Video Optical Flow Settings: 74 | - flow_warp: check to enable warping 75 | - flow_blend: 0 - you get the raw input, 1 - you get the warped diffused previous frame 76 | - check_consistency: check forward-backward flow consistency (leave unchecked unless you're getting too many warping artifacts; see the appendix sketch at the end of this README) 77 | 78 | ## Output warping 79 | This feature is plain and simple: we take each frame, warp it into the next frame, blend it with the real next frame, and get a smooth, noise-free result. 80 | 81 | ### Settings: 82 | (located in the create video tab) 83 | blend_mode: 84 | - none: just mash the frames together into a video 85 | - optical flow: take a frame, warp it, blend it with the next frame 86 | - check_consistency: use consistency maps (may prevent warping artifacts) 87 | - blend: 0 - you get the raw 2nd frame, 1 - you get the warped 1st frame 88 | 89 | ## TODO: 90 | - [x] Add automatic flow map management (i.e. create only when needed) 91 | - [x] Add error reporting for missing inputs, flows, frames 92 | - [ ] Add turbo 93 | - [ ] Add turbosmooth 94 | 95 | # 96 | 97 | This is a variation of the awesome [DiscoDiffusion colab](https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb#scrollTo=Changelog) 98 | 99 | If you like what I'm doing, you can check my linktree: 100 | - follow me on [twitter](https://twitter.com/devdef) 101 | - tip me on [patreon](https://www.patreon.com/sxela) 102 | 103 | Thank you for being awesome!
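
## Appendix: flow consistency sketch

The consistency maps used by check_consistency boil down to a forward-backward flow check. Here is a hedged sketch of that idea (not the notebook's exact implementation); `fwd` is the flow from one frame to the next, `bwd` is the flow in the opposite direction, and `tol` is an illustrative pixel tolerance:

```python
import numpy as np

def consistency_mask(fwd: np.ndarray, bwd: np.ndarray, tol: float = 1.0) -> np.ndarray:
    # True where the forward and backward flows agree: following the forward
    # flow and then sampling the backward flow should roughly return each
    # pixel to its starting point. Occluded or badly tracked pixels fail the
    # check, and the blend can fall back to the raw frame there to avoid
    # warping artifacts.
    h, w = fwd.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    x1 = np.clip(grid_x + fwd[..., 0], 0, w - 1).astype(np.int32)
    y1 = np.clip(grid_y + fwd[..., 1], 0, h - 1).astype(np.int32)
    bwd_at_target = bwd[y1, x1]                         # backward flow at each landing point
    err = np.linalg.norm(fwd + bwd_at_target, axis=-1)  # round-trip error in pixels
    return err < tol
```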
104 | -------------------------------------------------------------------------------- /image_morphing_3d.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "image_morphing_3d.ipynb", 7 | "provenance": [], 8 | "collapsed_sections": [], 9 | "authorship_tag": "ABX9TyNXm3NuGhT+nWPBvfTpn/kl", 10 | "include_colab_link": true 11 | }, 12 | "kernelspec": { 13 | "name": "python3", 14 | "display_name": "Python 3" 15 | }, 16 | "language_info": { 17 | "name": "python" 18 | }, 19 | "accelerator": "GPU"
"model_module": "@jupyter-widgets/controls", 351 | "model_name": "DescriptionStyleModel", 352 | "model_module_version": "1.5.0", 353 | "state": { 354 | "_model_module": "@jupyter-widgets/controls", 355 | "_model_module_version": "1.5.0", 356 | "_model_name": "DescriptionStyleModel", 357 | "_view_count": null, 358 | "_view_module": "@jupyter-widgets/base", 359 | "_view_module_version": "1.2.0", 360 | "_view_name": "StyleView", 361 | "description_width": "" 362 | } 363 | } 364 | } 365 | } 366 | }, 367 | "cells": [ 368 | { 369 | "cell_type": "markdown", 370 | "metadata": { 371 | "id": "view-in-github", 372 | "colab_type": "text" 373 | }, 374 | "source": [ 375 | "\"Open" 376 | ] 377 | }, 378 | { 379 | "cell_type": "markdown", 380 | "source": [ 381 | "# DiscoDiffusion 3d Animation only mode by [Alex Spirin](https://linktr.ee/devdef) \n", 382 | "![visitors](https://visitor-badge.glitch.me/badge?page_id=sxela_3dmorph_colab)\n", 383 | "\n", 384 | "\n", 385 | "This is an amputated 3d animation mode from the awesome [DiscoDiffusion colab](https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb#scrollTo=Changelog) \n", 386 | "\n", 387 | "\n", 388 | "It takes an image as an input, distorts it based on the animation settings below, and makes a video.\n" 389 | ], 390 | "metadata": { 391 | "id": "KnCBcp4ctDVR" 392 | } 393 | }, 394 | { 395 | "cell_type": "code", 396 | "source": [ 397 | "#@title 1.2 Prepare Folders\n", 398 | "import subprocess, os, sys, ipykernel\n", 399 | "\n", 400 | "def gitclone(url):\n", 401 | " res = subprocess.run(['git', 'clone', url], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", 402 | " print(res)\n", 403 | "\n", 404 | "def pipi(modulestr):\n", 405 | " res = subprocess.run(['pip', 'install', modulestr], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", 406 | " print(res)\n", 407 | "\n", 408 | "def pipie(modulestr):\n", 409 | " res = subprocess.run(['git', 'install', '-e', modulestr], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", 410 | " print(res)\n", 411 | "\n", 412 | "def wget(url, outputdir):\n", 413 | " res = subprocess.run(['wget', url, '-P', f'{outputdir}'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", 414 | " print(res)\n", 415 | "\n", 416 | "try:\n", 417 | " from google.colab import drive\n", 418 | " print(\"Google Colab detected. 
Using Google Drive.\")\n", 419 | " is_colab = True\n", 420 | " #@markdown If you connect your Google Drive, you can save the final image of each run on your drive.\n", 421 | " google_drive = True #@param {type:\"boolean\"}\n", 422 | " #@markdown Click here if you'd like to save the diffusion model checkpoint file to (and/or load from) your Google Drive:\n", 423 | " save_models_to_google_drive = True #@param {type:\"boolean\"}\n", 424 | "except:\n", 425 | " is_colab = False\n", 426 | " google_drive = False\n", 427 | " save_models_to_google_drive = False\n", 428 | " print(\"Google Colab not detected.\")\n", 429 | "\n", 430 | "if is_colab:\n", 431 | " if google_drive is True:\n", 432 | " drive.mount('/content/drive')\n", 433 | " root_path = '/content/drive/MyDrive/AI/Disco_Diffusion'\n", 434 | " else:\n", 435 | " root_path = '/content'\n", 436 | "else:\n", 437 | " root_path = os.getcwd()\n", 438 | "\n", 439 | "import os\n", 440 | "def createPath(filepath):\n", 441 | " os.makedirs(filepath, exist_ok=True)\n", 442 | "\n", 443 | "initDirPath = f'{root_path}/init_images'\n", 444 | "createPath(initDirPath)\n", 445 | "outDirPath = f'{root_path}/images_out'\n", 446 | "createPath(outDirPath)\n", 447 | "\n", 448 | "if is_colab:\n", 449 | " if google_drive and not save_models_to_google_drive or not google_drive:\n", 450 | " model_path = '/content/models'\n", 451 | " createPath(model_path)\n", 452 | " if google_drive and save_models_to_google_drive:\n", 453 | " model_path = f'{root_path}/models'\n", 454 | " createPath(model_path)\n", 455 | "else:\n", 456 | " model_path = f'{root_path}/models'\n", 457 | " createPath(model_path)\n", 458 | "\n", 459 | "# libraries = f'{root_path}/libraries'\n", 460 | "# createPath(libraries)" 461 | ], 462 | "metadata": { 463 | "colab": { 464 | "base_uri": "https://localhost:8080/" 465 | }, 466 | "id": "S65d1tI_RfEL", 467 | "outputId": "f19c348c-6cdc-4a3b-d42b-5e3f8e917aff", 468 | "cellView": "form" 469 | }, 470 | "execution_count": 1, 471 | "outputs": [ 472 | { 473 | "output_type": "stream", 474 | "name": "stdout", 475 | "text": [ 476 | "Google Colab detected. 
Using Google Drive.\n", 477 | "Mounted at /content/drive\n" 478 | ] 479 | } 480 | ] 481 | }, 482 | { 483 | "cell_type": "code", 484 | "source": [ 485 | "#@title ### 1.3 Install and import dependencies\n", 486 | "\n", 487 | "import pathlib, shutil, os, sys\n", 488 | "\n", 489 | "if not is_colab:\n", 490 | " # If running locally, there's a good chance your env will need this in order to not crash upon np.matmul() or similar operations.\n", 491 | " os.environ['KMP_DUPLICATE_LIB_OK']='TRUE'\n", 492 | "\n", 493 | "PROJECT_DIR = os.path.abspath(os.getcwd())\n", 494 | "USE_ADABINS = True\n", 495 | "\n", 496 | "if is_colab:\n", 497 | " if google_drive is not True:\n", 498 | " root_path = f'/content'\n", 499 | " model_path = '/content/models' \n", 500 | "else:\n", 501 | " root_path = os.getcwd()\n", 502 | " model_path = f'{root_path}/models'\n", 503 | "\n", 504 | "model_256_downloaded = False\n", 505 | "model_512_downloaded = False\n", 506 | "model_secondary_downloaded = False\n", 507 | "\n", 508 | "multipip_res = subprocess.run(['pip', 'install', 'lpips', 'datetime', 'timm', 'ftfy', 'einops', 'pytorch-lightning', 'omegaconf'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", 509 | "print(multipip_res)\n", 510 | "\n", 511 | "if is_colab:\n", 512 | " subprocess.run(['apt', 'install', 'imagemagick'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", 513 | "\n", 514 | "# try:\n", 515 | "# from CLIP import clip\n", 516 | "# except:\n", 517 | "# if not os.path.exists(\"CLIP\"):\n", 518 | "# gitclone(\"https://github.com/openai/CLIP\")\n", 519 | "# sys.path.append(f'{PROJECT_DIR}/CLIP')\n", 520 | "\n", 521 | "# try:\n", 522 | "# from guided_diffusion.script_util import create_model_and_diffusion\n", 523 | "# except:\n", 524 | "# if not os.path.exists(\"guided-diffusion\"):\n", 525 | "# gitclone(\"https://github.com/crowsonkb/guided-diffusion\")\n", 526 | "# sys.path.append(f'{PROJECT_DIR}/guided-diffusion')\n", 527 | "\n", 528 | "# try:\n", 529 | "# from resize_right import resize\n", 530 | "# except:\n", 531 | "# if not os.path.exists(\"ResizeRight\"):\n", 532 | "# gitclone(\"https://github.com/assafshocher/ResizeRight.git\")\n", 533 | "# sys.path.append(f'{PROJECT_DIR}/ResizeRight')\n", 534 | "\n", 535 | "try:\n", 536 | " import py3d_tools\n", 537 | "except:\n", 538 | " if not os.path.exists('pytorch3d-lite'):\n", 539 | " gitclone(\"https://github.com/MSFTserver/pytorch3d-lite.git\")\n", 540 | " sys.path.append(f'{PROJECT_DIR}/pytorch3d-lite')\n", 541 | "\n", 542 | "try:\n", 543 | " from midas.dpt_depth import DPTDepthModel\n", 544 | "except:\n", 545 | " if not os.path.exists('MiDaS'):\n", 546 | " gitclone(\"https://github.com/isl-org/MiDaS.git\")\n", 547 | " if not os.path.exists('MiDaS/midas_utils.py'):\n", 548 | " shutil.move('MiDaS/utils.py', 'MiDaS/midas_utils.py')\n", 549 | " if not os.path.exists(f'{model_path}/dpt_large-midas-2f21e586.pt'):\n", 550 | " wget(\"https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt\", model_path)\n", 551 | " sys.path.append(f'{PROJECT_DIR}/MiDaS')\n", 552 | "\n", 553 | "try:\n", 554 | " sys.path.append(PROJECT_DIR)\n", 555 | " import disco_xform_utils as dxf\n", 556 | "except:\n", 557 | " if not os.path.exists(\"disco-diffusion\"):\n", 558 | " gitclone(\"https://github.com/alembics/disco-diffusion.git\")\n", 559 | " if os.path.exists('disco_xform_utils.py') is not True:\n", 560 | " shutil.move('disco-diffusion/disco_xform_utils.py', 'disco_xform_utils.py')\n", 561 | " sys.path.append(PROJECT_DIR)\n", 562 | "\n", 563 | "import 
torch\n", 564 | "from dataclasses import dataclass\n", 565 | "from functools import partial\n", 566 | "import cv2\n", 567 | "import pandas as pd\n", 568 | "import gc\n", 569 | "import io\n", 570 | "import math\n", 571 | "import timm\n", 572 | "from IPython import display\n", 573 | "import lpips\n", 574 | "from PIL import Image, ImageOps\n", 575 | "import requests\n", 576 | "from glob import glob\n", 577 | "import json\n", 578 | "from types import SimpleNamespace\n", 579 | "from torch import nn\n", 580 | "from torch.nn import functional as F\n", 581 | "import torchvision.transforms as T\n", 582 | "import torchvision.transforms.functional as TF\n", 583 | "from tqdm.notebook import tqdm\n", 584 | "# from CLIP import clip\n", 585 | "# from resize_right import resize\n", 586 | "# from guided_diffusion.script_util import create_model_and_diffusion, model_and_diffusion_defaults\n", 587 | "from datetime import datetime\n", 588 | "import numpy as np\n", 589 | "import matplotlib.pyplot as plt\n", 590 | "import random\n", 591 | "from ipywidgets import Output\n", 592 | "import hashlib\n", 593 | "from functools import partial\n", 594 | "if is_colab:\n", 595 | " os.chdir('/content')\n", 596 | " from google.colab import files\n", 597 | "else:\n", 598 | " os.chdir(f'{PROJECT_DIR}')\n", 599 | "from IPython.display import Image as ipyimg\n", 600 | "from numpy import asarray\n", 601 | "from einops import rearrange, repeat\n", 602 | "import torch, torchvision\n", 603 | "import time\n", 604 | "from omegaconf import OmegaConf\n", 605 | "import warnings\n", 606 | "warnings.filterwarnings(\"ignore\", category=UserWarning)\n", 607 | "\n", 608 | "# AdaBins stuff\n", 609 | "if USE_ADABINS:\n", 610 | " try:\n", 611 | " from infer import InferenceHelper\n", 612 | " except:\n", 613 | " if os.path.exists(\"AdaBins\") is not True:\n", 614 | " gitclone(\"https://github.com/shariqfarooq123/AdaBins.git\")\n", 615 | " if not os.path.exists(f'{PROJECT_DIR}/pretrained/AdaBins_nyu.pt'):\n", 616 | " createPath(f'{PROJECT_DIR}/pretrained')\n", 617 | " wget(\"https://cloudflare-ipfs.com/ipfs/Qmd2mMnDLWePKmgfS8m6ntAg4nhV5VkUyAydYBp8cWWeB7/AdaBins_nyu.pt\", f'{PROJECT_DIR}/pretrained')\n", 618 | " sys.path.append(f'{PROJECT_DIR}/AdaBins')\n", 619 | " from infer import InferenceHelper\n", 620 | " MAX_ADABINS_AREA = 500000\n", 621 | "\n", 622 | "import torch\n", 623 | "DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n", 624 | "print('Using device:', DEVICE)\n", 625 | "device = DEVICE # At least one of the modules expects this name..\n", 626 | "\n", 627 | "if torch.cuda.get_device_capability(DEVICE) == (8,0): ## A100 fix thanks to Emad\n", 628 | " print('Disabling CUDNN for A100 gpu', file=sys.stderr)\n", 629 | " torch.backends.cudnn.enabled = False" 630 | ], 631 | "metadata": { 632 | "id": "UZSVli6h_V_Q", 633 | "cellView": "form", 634 | "colab": { 635 | "base_uri": "https://localhost:8080/" 636 | }, 637 | "outputId": "02cc49b5-b6b2-47ce-addd-8fe04cff4868" 638 | }, 639 | "execution_count": 2, 640 | "outputs": [ 641 | { 642 | "output_type": "stream", 643 | "name": "stdout", 644 | "text": [ 645 | "Collecting lpips\n", 646 | " Downloading lpips-0.1.4-py3-none-any.whl (53 kB)\n", 647 | "Collecting datetime\n", 648 | " Downloading DateTime-4.4-py2.py3-none-any.whl (51 kB)\n", 649 | "Collecting timm\n", 650 | " Downloading timm-0.5.4-py3-none-any.whl (431 kB)\n", 651 | "Collecting ftfy\n", 652 | " Downloading ftfy-6.1.1-py3-none-any.whl (53 kB)\n", 653 | "Collecting einops\n", 654 | " Downloading 
einops-0.4.1-py3-none-any.whl (28 kB)\n", 655 | "Collecting pytorch-lightning\n", 656 | " Downloading pytorch_lightning-1.6.3-py3-none-any.whl (584 kB)\n", 657 | "Collecting omegaconf\n", 658 | " Downloading omegaconf-2.2.1-py3-none-any.whl (78 kB)\n", 659 | "Requirement already satisfied: scipy>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from lpips) (1.4.1)\n", 660 | "Requirement already satisfied: torchvision>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from lpips) (0.12.0+cu113)\n", 661 | "Requirement already satisfied: torch>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from lpips) (1.11.0+cu113)\n", 662 | "Requirement already satisfied: numpy>=1.14.3 in /usr/local/lib/python3.7/dist-packages (from lpips) (1.21.6)\n", 663 | "Requirement already satisfied: tqdm>=4.28.1 in /usr/local/lib/python3.7/dist-packages (from lpips) (4.64.0)\n", 664 | "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch>=0.4.0->lpips) (4.2.0)\n", 665 | "Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from torchvision>=0.2.1->lpips) (2.23.0)\n", 666 | "Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /usr/local/lib/python3.7/dist-packages (from torchvision>=0.2.1->lpips) (7.1.2)\n", 667 | "Collecting zope.interface\n", 668 | " Downloading zope.interface-5.4.0-cp37-cp37m-manylinux2010_x86_64.whl (251 kB)\n", 669 | "Requirement already satisfied: pytz in /usr/local/lib/python3.7/dist-packages (from datetime) (2022.1)\n", 670 | "Requirement already satisfied: wcwidth>=0.2.5 in /usr/local/lib/python3.7/dist-packages (from ftfy) (0.2.5)\n", 671 | "Collecting pyDeprecate<0.4.0,>=0.3.1\n", 672 | " Downloading pyDeprecate-0.3.2-py3-none-any.whl (10 kB)\n", 673 | "Requirement already satisfied: tensorboard>=2.2.0 in /usr/local/lib/python3.7/dist-packages (from pytorch-lightning) (2.8.0)\n", 674 | "Collecting torchmetrics>=0.4.1\n", 675 | " Downloading torchmetrics-0.8.2-py3-none-any.whl (409 kB)\n", 676 | "Collecting fsspec[http]!=2021.06.0,>=2021.05.0\n", 677 | " Downloading fsspec-2022.5.0-py3-none-any.whl (140 kB)\n", 678 | "Requirement already satisfied: packaging>=17.0 in /usr/local/lib/python3.7/dist-packages (from pytorch-lightning) (21.3)\n", 679 | "Collecting PyYAML>=5.4\n", 680 | " Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)\n", 681 | "Collecting aiohttp\n", 682 | " Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)\n", 683 | "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging>=17.0->pytorch-lightning) (3.0.9)\n", 684 | "Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (0.37.1)\n", 685 | "Requirement already satisfied: grpcio>=1.24.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (1.46.1)\n", 686 | "Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (0.6.1)\n", 687 | "Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (1.0.0)\n", 688 | "Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (1.0.1)\n", 
689 | "Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (1.35.0)\n", 690 | "Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (1.8.1)\n", 691 | "Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (0.4.6)\n", 692 | "Requirement already satisfied: protobuf>=3.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (3.17.3)\n", 693 | "Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (3.3.7)\n", 694 | "Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=2.2.0->pytorch-lightning) (57.4.0)\n", 695 | "Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from absl-py>=0.4->tensorboard>=2.2.0->pytorch-lightning) (1.15.0)\n", 696 | "Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch-lightning) (4.2.4)\n", 697 | "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch-lightning) (0.2.8)\n", 698 | "Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch-lightning) (4.8)\n", 699 | "Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=2.2.0->pytorch-lightning) (1.3.1)\n", 700 | "Requirement already satisfied: importlib-metadata>=4.4 in /usr/local/lib/python3.7/dist-packages (from markdown>=2.6.8->tensorboard>=2.2.0->pytorch-lightning) (4.11.3)\n", 701 | "Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard>=2.2.0->pytorch-lightning) (3.8.0)\n", 702 | "Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch-lightning) (0.4.8)\n", 703 | "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->torchvision>=0.2.1->lpips) (2.10)\n", 704 | "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->torchvision>=0.2.1->lpips) (1.24.3)\n", 705 | "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->torchvision>=0.2.1->lpips) (2021.10.8)\n", 706 | "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->torchvision>=0.2.1->lpips) (3.0.4)\n", 707 | "Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=2.2.0->pytorch-lightning) (3.2.0)\n", 708 | "Collecting antlr4-python3-runtime==4.9.*\n", 709 | " Downloading antlr4-python3-runtime-4.9.3.tar.gz (117 kB)\n", 710 | "Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /usr/local/lib/python3.7/dist-packages (from 
aiohttp->fsspec[http]!=2021.06.0,>=2021.05.0->pytorch-lightning) (2.0.12)\n", 711 | "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.7/dist-packages (from aiohttp->fsspec[http]!=2021.06.0,>=2021.05.0->pytorch-lightning) (21.4.0)\n", 712 | "Collecting multidict<7.0,>=4.5\n", 713 | " Downloading multidict-6.0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (94 kB)\n", 714 | "Collecting yarl<2.0,>=1.0\n", 715 | " Downloading yarl-1.7.2-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (271 kB)\n", 716 | "Collecting asynctest==0.13.0\n", 717 | " Downloading asynctest-0.13.0-py3-none-any.whl (26 kB)\n", 718 | "Collecting async-timeout<5.0,>=4.0.0a3\n", 719 | " Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)\n", 720 | "Collecting aiosignal>=1.1.2\n", 721 | " Downloading aiosignal-1.2.0-py3-none-any.whl (8.2 kB)\n", 722 | "Collecting frozenlist>=1.1.1\n", 723 | " Downloading frozenlist-1.3.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (144 kB)\n", 724 | "Building wheels for collected packages: antlr4-python3-runtime\n", 725 | " Building wheel for antlr4-python3-runtime (setup.py): started\n", 726 | " Building wheel for antlr4-python3-runtime (setup.py): finished with status 'done'\n", 727 | " Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.9.3-py3-none-any.whl size=144575 sha256=e708a06e07e1ae8a35d4e2f9454f334b49878b2494885ff904213b25ef13eecc\n", 728 | " Stored in directory: /root/.cache/pip/wheels/8b/8d/53/2af8772d9aec614e3fc65e53d4a993ad73c61daa8bbd85a873\n", 729 | "Successfully built antlr4-python3-runtime\n", 730 | "Installing collected packages: multidict, frozenlist, yarl, asynctest, async-timeout, aiosignal, pyDeprecate, fsspec, aiohttp, zope.interface, torchmetrics, PyYAML, antlr4-python3-runtime, timm, pytorch-lightning, omegaconf, lpips, ftfy, einops, datetime\n", 731 | " Attempting uninstall: PyYAML\n", 732 | " Found existing installation: PyYAML 3.13\n", 733 | " Uninstalling PyYAML-3.13:\n", 734 | " Successfully uninstalled PyYAML-3.13\n", 735 | "Successfully installed PyYAML-6.0 aiohttp-3.8.1 aiosignal-1.2.0 antlr4-python3-runtime-4.9.3 async-timeout-4.0.2 asynctest-0.13.0 datetime-4.4 einops-0.4.1 frozenlist-1.3.0 fsspec-2022.5.0 ftfy-6.1.1 lpips-0.1.4 multidict-6.0.2 omegaconf-2.2.1 pyDeprecate-0.3.2 pytorch-lightning-1.6.3 timm-0.5.4 torchmetrics-0.8.2 yarl-1.7.2 zope.interface-5.4.0\n", 736 | "\n", 737 | "\n", 738 | "\n", 739 | "\n", 740 | "\n", 741 | "\n", 742 | "Using device: cuda:0\n" 743 | ] 744 | } 745 | ] 746 | }, 747 | { 748 | "cell_type": "code", 749 | "source": [ 750 | "#@markdown ####**Animation Mode:**\n", 751 | "animation_mode = '3D'\n", 752 | "#@markdown *For animation, you probably want to turn `cutn_batches` to 1 to make it quicker.*\n", 753 | "\n", 754 | "\n", 755 | "\n", 756 | "\n", 757 | "\n", 758 | "if is_colab:\n", 759 | " video_init_path = \"/content/training.mp4\" \n", 760 | "else:\n", 761 | " video_init_path = \"training.mp4\"\n", 762 | "extract_nth_frame = 2 \n", 763 | "video_init_seed_continuity = True\n", 764 | "\n", 765 | "if animation_mode == \"Video Input\":\n", 766 | " if is_colab:\n", 767 | " videoFramesFolder = f'/content/videoFrames'\n", 768 | " else:\n", 769 | " videoFramesFolder = f'videoFrames'\n", 770 | " createPath(videoFramesFolder)\n", 771 | " print(f\"Exporting Video Frames (1 every {extract_nth_frame})...\")\n", 772 | " try:\n", 773 | " for f in 
pathlib.Path(f'{videoFramesFolder}').glob('*.jpg'):\n", 774 | " f.unlink()\n", 775 | " except:\n", 776 | " print('')\n", 777 | " vf = f'select=not(mod(n\\,{extract_nth_frame}))'\n", 778 | " subprocess.run(['ffmpeg', '-i', f'{video_init_path}', '-vf', f'{vf}', '-vsync', 'vfr', '-q:v', '2', '-loglevel', 'error', '-stats', f'{videoFramesFolder}/%04d.jpg'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", 779 | " #!ffmpeg -i {video_init_path} -vf {vf} -vsync vfr -q:v 2 -loglevel error -stats {videoFramesFolder}/%04d.jpg\n", 780 | "\n", 781 | "key_frames = True \n", 782 | "max_frames = 10000\n", 783 | "\n", 784 | "if animation_mode == \"Video Input\":\n", 785 | " max_frames = len(glob(f'{videoFramesFolder}/*.jpg'))\n", 786 | "\n", 787 | "interp_spline = 'Linear' #Do not change, currently will not look good. param ['Linear','Quadratic','Cubic']{type:\"string\"}\n", 788 | "angle = \"0:(0)\"#@param {type:\"string\"}\n", 789 | "zoom = \"0: (1)\"#@param {type:\"string\"}\n", 790 | "translation_x = \"0: (1.25)\"#@param {type:\"string\"}\n", 791 | "translation_y = \"0: (0)\"#@param {type:\"string\"}\n", 792 | "translation_z = \"0: (.5)\"#@param {type:\"string\"}\n", 793 | "rotation_3d_x = \"0: (0)\"#@param {type:\"string\"}\n", 794 | "rotation_3d_y = \"0: (-0.125)\"#@param {type:\"string\"}\n", 795 | "rotation_3d_z = \"0: (0)\"#@param {type:\"string\"}\n", 796 | "midas_depth_model = \"dpt_large\"#@param {type:\"string\"}\n", 797 | "midas_weight = 0.8#@param {type:\"number\"}\n", 798 | "near_plane = 200#@param {type:\"number\"}\n", 799 | "far_plane = 10000#@param {type:\"number\"}\n", 800 | "fov = 120#@param {type:\"number\"}\n", 801 | "padding_mode = 'border'#@param {type:\"string\"}\n", 802 | "sampling_mode = 'bicubic'#@param {type:\"string\"}\n", 803 | "\n", 804 | "\n", 805 | "turbo_mode = False\n", 806 | "turbo_steps = \"3\" \n", 807 | "turbo_preroll = 10\n", 808 | "\n", 809 | "#insist turbo be used only w 3d anim.\n", 810 | "if turbo_mode and animation_mode != '3D':\n", 811 | " print('=====')\n", 812 | " print('Turbo mode only available with 3D animations. Disabling Turbo.')\n", 813 | " print('=====')\n", 814 | " turbo_mode = False\n", 815 | "\n", 816 | "\n", 817 | "frames_scale = 1500\n", 818 | "frames_skip_steps = '60%' \n", 819 | "\n", 820 | "vr_mode = False \n", 821 | "vr_eye_angle = 0.5\n", 822 | "vr_ipd = 5.0 \n", 823 | "\n", 824 | "#insist VR be used only w 3d anim.\n", 825 | "if vr_mode and animation_mode != '3D':\n", 826 | " print('=====')\n", 827 | " print('VR mode only available with 3D animations. 
Disabling VR.')\n", 828 | " print('=====')\n", 829 | " vr_mode = False\n", 830 | "\n", 831 | "\n", 832 | "def parse_key_frames(string, prompt_parser=None):\n", 833 | " \"\"\"Given a string representing frame numbers paired with parameter values at that frame,\n", 834 | " return a dictionary with the frame numbers as keys and the parameter values as the values.\n", 835 | "\n", 836 | " Parameters\n", 837 | " ----------\n", 838 | " string: string\n", 839 | " Frame numbers paired with parameter values at that frame number, in the format\n", 840 | " 'framenumber1: (parametervalues1), framenumber2: (parametervalues2), ...'\n", 841 | " prompt_parser: function or None, optional\n", 842 | " If provided, prompt_parser will be applied to each string of parameter values.\n", 843 | " \n", 844 | " Returns\n", 845 | " -------\n", 846 | " dict\n", 847 | " Frame numbers as keys, parameter values at that frame number as values\n", 848 | "\n", 849 | " Raises\n", 850 | " ------\n", 851 | " RuntimeError\n", 852 | " If the input string does not match the expected format.\n", 853 | " \n", 854 | " Examples\n", 855 | " --------\n", 856 | " >>> parse_key_frames(\"10:(Apple: 1| Orange: 0), 20: (Apple: 0| Orange: 1| Peach: 1)\")\n", 857 | " {10: 'Apple: 1| Orange: 0', 20: 'Apple: 0| Orange: 1| Peach: 1'}\n", 858 | "\n", 859 | " >>> parse_key_frames(\"10:(Apple: 1| Orange: 0), 20: (Apple: 0| Orange: 1| Peach: 1)\", prompt_parser=lambda x: x.lower())\n", 860 | " {10: 'apple: 1| orange: 0', 20: 'apple: 0| orange: 1| peach: 1'}\n", 861 | " \"\"\"\n", 862 | " import re\n", 863 | " pattern = r'((?P<frame>[0-9]+):[\\s]*[\\(](?P<param>[\\S\\s]*?)[\\)])'\n", 864 | " frames = dict()\n", 865 | " for match_object in re.finditer(pattern, string):\n", 866 | " frame = int(match_object.groupdict()['frame'])\n", 867 | " param = match_object.groupdict()['param']\n", 868 | " if prompt_parser:\n", 869 | " frames[frame] = prompt_parser(param)\n", 870 | " else:\n", 871 | " frames[frame] = param\n", 872 | "\n", 873 | " if frames == {} and len(string) != 0:\n", 874 | " raise RuntimeError('Key Frame string not correctly formatted')\n", 875 | " return frames\n", 876 | "\n", 877 | "def get_inbetweens(key_frames, integer=False):\n", 878 | " \"\"\"Given a dict with frame numbers as keys and a parameter value as values,\n", 879 | " return a pandas Series containing the value of the parameter at every frame from 0 to max_frames.\n", 880 | " Any values not provided in the input dict are calculated by linear interpolation between\n", 881 | " the values of the previous and next provided frames. If there is no previous provided frame, then\n", 882 | " the value is equal to the value of the next provided frame, or if there is no next provided frame,\n", 883 | " then the value is equal to the value of the previous provided frame. 
If no frames are provided,\n", 884 | " all frame values are NaN.\n", 885 | "\n", 886 | " Parameters\n", 887 | " ----------\n", 888 | " key_frames: dict\n", 889 | " A dict with integer frame numbers as keys and numerical values of a particular parameter as values.\n", 890 | " integer: Bool, optional\n", 891 | " If True, the values of the output series are converted to integers.\n", 892 | " Otherwise, the values are floats.\n", 893 | " \n", 894 | " Returns\n", 895 | " -------\n", 896 | " pd.Series\n", 897 | " A Series with length max_frames representing the parameter values for each frame.\n", 898 | " \n", 899 | " Examples\n", 900 | " --------\n", 901 | " >>> max_frames = 5\n", 902 | " >>> get_inbetweens({1: 5, 3: 6})\n", 903 | " 0 5.0\n", 904 | " 1 5.0\n", 905 | " 2 5.5\n", 906 | " 3 6.0\n", 907 | " 4 6.0\n", 908 | " dtype: float64\n", 909 | "\n", 910 | " >>> get_inbetweens({1: 5, 3: 6}, integer=True)\n", 911 | " 0 5\n", 912 | " 1 5\n", 913 | " 2 5\n", 914 | " 3 6\n", 915 | " 4 6\n", 916 | " dtype: int64\n", 917 | " \"\"\"\n", 918 | " key_frame_series = pd.Series([np.nan for a in range(max_frames)])\n", 919 | "\n", 920 | " for i, value in key_frames.items():\n", 921 | " key_frame_series[i] = value\n", 922 | " key_frame_series = key_frame_series.astype(float)\n", 923 | " \n", 924 | " interp_method = interp_spline\n", 925 | "\n", 926 | " if interp_method == 'Cubic' and len(key_frames.items()) <=3:\n", 927 | " interp_method = 'Quadratic'\n", 928 | " \n", 929 | " if interp_method == 'Quadratic' and len(key_frames.items()) <= 2:\n", 930 | " interp_method = 'Linear'\n", 931 | " \n", 932 | " \n", 933 | " key_frame_series[0] = key_frame_series[key_frame_series.first_valid_index()]\n", 934 | " key_frame_series[max_frames-1] = key_frame_series[key_frame_series.last_valid_index()]\n", 935 | " # key_frame_series = key_frame_series.interpolate(method=intrp_method,order=1, limit_direction='both')\n", 936 | " key_frame_series = key_frame_series.interpolate(method=interp_method.lower(),limit_direction='both')\n", 937 | " if integer:\n", 938 | " return key_frame_series.astype(int)\n", 939 | " return key_frame_series\n", 940 | "\n", 941 | "def split_prompts(prompts):\n", 942 | " prompt_series = pd.Series([np.nan for a in range(max_frames)])\n", 943 | " for i, prompt in prompts.items():\n", 944 | " prompt_series[i] = prompt\n", 945 | " # prompt_series = prompt_series.astype(str)\n", 946 | " prompt_series = prompt_series.ffill().bfill()\n", 947 | " return prompt_series\n", 948 | "\n", 949 | "if key_frames:\n", 950 | " try:\n", 951 | " angle_series = get_inbetweens(parse_key_frames(angle))\n", 952 | " except RuntimeError as e:\n", 953 | " print(\n", 954 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 955 | " \"formatted `angle` correctly for key frames.\\n\"\n", 956 | " \"Attempting to interpret `angle` as \"\n", 957 | " f'\"0: ({angle})\"\\n'\n", 958 | " \"Please read the instructions to find out how to use key frames \"\n", 959 | " \"correctly.\\n\"\n", 960 | " )\n", 961 | " angle = f\"0: ({angle})\"\n", 962 | " angle_series = get_inbetweens(parse_key_frames(angle))\n", 963 | "\n", 964 | " try:\n", 965 | " zoom_series = get_inbetweens(parse_key_frames(zoom))\n", 966 | " except RuntimeError as e:\n", 967 | " print(\n", 968 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 969 | " \"formatted `zoom` correctly for key frames.\\n\"\n", 970 | " \"Attempting to interpret `zoom` as \"\n", 971 | " f'\"0: ({zoom})\"\\n'\n", 972 | " \"Please read the instructions to 
find out how to use key frames \"\n", 973 | " \"correctly.\\n\"\n", 974 | " )\n", 975 | " zoom = f\"0: ({zoom})\"\n", 976 | " zoom_series = get_inbetweens(parse_key_frames(zoom))\n", 977 | "\n", 978 | " try:\n", 979 | " translation_x_series = get_inbetweens(parse_key_frames(translation_x))\n", 980 | " except RuntimeError as e:\n", 981 | " print(\n", 982 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 983 | " \"formatted `translation_x` correctly for key frames.\\n\"\n", 984 | " \"Attempting to interpret `translation_x` as \"\n", 985 | " f'\"0: ({translation_x})\"\\n'\n", 986 | " \"Please read the instructions to find out how to use key frames \"\n", 987 | " \"correctly.\\n\"\n", 988 | " )\n", 989 | " translation_x = f\"0: ({translation_x})\"\n", 990 | " translation_x_series = get_inbetweens(parse_key_frames(translation_x))\n", 991 | "\n", 992 | " try:\n", 993 | " translation_y_series = get_inbetweens(parse_key_frames(translation_y))\n", 994 | " except RuntimeError as e:\n", 995 | " print(\n", 996 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 997 | " \"formatted `translation_y` correctly for key frames.\\n\"\n", 998 | " \"Attempting to interpret `translation_y` as \"\n", 999 | " f'\"0: ({translation_y})\"\\n'\n", 1000 | " \"Please read the instructions to find out how to use key frames \"\n", 1001 | " \"correctly.\\n\"\n", 1002 | " )\n", 1003 | " translation_y = f\"0: ({translation_y})\"\n", 1004 | " translation_y_series = get_inbetweens(parse_key_frames(translation_y))\n", 1005 | "\n", 1006 | " try:\n", 1007 | " translation_z_series = get_inbetweens(parse_key_frames(translation_z))\n", 1008 | " except RuntimeError as e:\n", 1009 | " print(\n", 1010 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 1011 | " \"formatted `translation_z` correctly for key frames.\\n\"\n", 1012 | " \"Attempting to interpret `translation_z` as \"\n", 1013 | " f'\"0: ({translation_z})\"\\n'\n", 1014 | " \"Please read the instructions to find out how to use key frames \"\n", 1015 | " \"correctly.\\n\"\n", 1016 | " )\n", 1017 | " translation_z = f\"0: ({translation_z})\"\n", 1018 | " translation_z_series = get_inbetweens(parse_key_frames(translation_z))\n", 1019 | "\n", 1020 | " try:\n", 1021 | " rotation_3d_x_series = get_inbetweens(parse_key_frames(rotation_3d_x))\n", 1022 | " except RuntimeError as e:\n", 1023 | " print(\n", 1024 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 1025 | " \"formatted `rotation_3d_x` correctly for key frames.\\n\"\n", 1026 | " \"Attempting to interpret `rotation_3d_x` as \"\n", 1027 | " f'\"0: ({rotation_3d_x})\"\\n'\n", 1028 | " \"Please read the instructions to find out how to use key frames \"\n", 1029 | " \"correctly.\\n\"\n", 1030 | " )\n", 1031 | " rotation_3d_x = f\"0: ({rotation_3d_x})\"\n", 1032 | " rotation_3d_x_series = get_inbetweens(parse_key_frames(rotation_3d_x))\n", 1033 | "\n", 1034 | " try:\n", 1035 | " rotation_3d_y_series = get_inbetweens(parse_key_frames(rotation_3d_y))\n", 1036 | " except RuntimeError as e:\n", 1037 | " print(\n", 1038 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 1039 | " \"formatted `rotation_3d_y` correctly for key frames.\\n\"\n", 1040 | " \"Attempting to interpret `rotation_3d_y` as \"\n", 1041 | " f'\"0: ({rotation_3d_y})\"\\n'\n", 1042 | " \"Please read the instructions to find out how to use key frames \"\n", 1043 | " \"correctly.\\n\"\n", 1044 | " )\n", 1045 | " rotation_3d_y = f\"0: 
({rotation_3d_y})\"\n", 1046 | " rotation_3d_y_series = get_inbetweens(parse_key_frames(rotation_3d_y))\n", 1047 | "\n", 1048 | " try:\n", 1049 | " rotation_3d_z_series = get_inbetweens(parse_key_frames(rotation_3d_z))\n", 1050 | " except RuntimeError as e:\n", 1051 | " print(\n", 1052 | " \"WARNING: You have selected to use key frames, but you have not \"\n", 1053 | " \"formatted `rotation_3d_z` correctly for key frames.\\n\"\n", 1054 | " \"Attempting to interpret `rotation_3d_z` as \"\n", 1055 | " f'\"0: ({rotation_3d_z})\"\\n'\n", 1056 | " \"Please read the instructions to find out how to use key frames \"\n", 1057 | " \"correctly.\\n\"\n", 1058 | " )\n", 1059 | " rotation_3d_z = f\"0: ({rotation_3d_z})\"\n", 1060 | " rotation_3d_z_series = get_inbetweens(parse_key_frames(rotation_3d_z))\n", 1061 | "\n", 1062 | "else:\n", 1063 | " angle = float(angle)\n", 1064 | " zoom = float(zoom)\n", 1065 | " translation_x = float(translation_x)\n", 1066 | " translation_y = float(translation_y)\n", 1067 | " translation_z = float(translation_z)\n", 1068 | " rotation_3d_x = float(rotation_3d_x)\n", 1069 | " rotation_3d_y = float(rotation_3d_y)\n", 1070 | " rotation_3d_z = float(rotation_3d_z)\n", 1071 | "\n", 1072 | "#@title Default title text\n", 1073 | "args = {\n", 1074 | " # 'batchNum': batchNum,\n", 1075 | " # 'prompts_series':split_prompts(text_prompts) if text_prompts else None,\n", 1076 | " # 'image_prompts_series':split_prompts(image_prompts) if image_prompts else None,\n", 1077 | " # 'seed': seed,\n", 1078 | " # 'display_rate':display_rate,\n", 1079 | " # 'n_batches':n_batches if animation_mode == 'None' else 1,\n", 1080 | " # 'batch_size':batch_size,\n", 1081 | " # 'batch_name': batch_name,\n", 1082 | " # 'steps': steps,\n", 1083 | " # 'diffusion_sampling_mode': diffusion_sampling_mode,\n", 1084 | " # 'width_height': width_height,\n", 1085 | " # 'clip_guidance_scale': clip_guidance_scale,\n", 1086 | " # 'tv_scale': tv_scale,\n", 1087 | " # 'range_scale': range_scale,\n", 1088 | " # 'sat_scale': sat_scale,\n", 1089 | " # 'cutn_batches': cutn_batches,\n", 1090 | " # 'init_image': init_image,\n", 1091 | " # 'init_scale': init_scale,\n", 1092 | " # 'skip_steps': skip_steps,\n", 1093 | " # 'side_x': side_x,\n", 1094 | " # 'side_y': side_y,\n", 1095 | " # 'timestep_respacing': timestep_respacing,\n", 1096 | " # 'diffusion_steps': diffusion_steps,\n", 1097 | " 'animation_mode': animation_mode,\n", 1098 | " 'video_init_path': video_init_path,\n", 1099 | " 'extract_nth_frame': extract_nth_frame,\n", 1100 | " 'video_init_seed_continuity': video_init_seed_continuity,\n", 1101 | " 'key_frames': key_frames,\n", 1102 | " 'max_frames': max_frames if animation_mode != \"None\" else 1,\n", 1103 | " 'interp_spline': interp_spline,\n", 1104 | " # 'start_frame': start_frame,\n", 1105 | " 'angle': angle,\n", 1106 | " 'zoom': zoom,\n", 1107 | " 'translation_x': translation_x,\n", 1108 | " 'translation_y': translation_y,\n", 1109 | " 'translation_z': translation_z,\n", 1110 | " 'rotation_3d_x': rotation_3d_x,\n", 1111 | " 'rotation_3d_y': rotation_3d_y,\n", 1112 | " 'rotation_3d_z': rotation_3d_z,\n", 1113 | " 'midas_depth_model': midas_depth_model,\n", 1114 | " 'midas_weight': midas_weight,\n", 1115 | " 'near_plane': near_plane,\n", 1116 | " 'far_plane': far_plane,\n", 1117 | " 'fov': fov,\n", 1118 | " 'padding_mode': padding_mode,\n", 1119 | " 'sampling_mode': sampling_mode,\n", 1120 | " 'angle_series':angle_series,\n", 1121 | " 'zoom_series':zoom_series,\n", 1122 | " 
'translation_x_series':translation_x_series,\n", 1123 | " 'translation_y_series':translation_y_series,\n", 1124 | " 'translation_z_series':translation_z_series,\n", 1125 | " 'rotation_3d_x_series':rotation_3d_x_series,\n", 1126 | " 'rotation_3d_y_series':rotation_3d_y_series,\n", 1127 | " 'rotation_3d_z_series':rotation_3d_z_series,\n", 1128 | " 'frames_scale': frames_scale,\n", 1129 | " # 'calc_frames_skip_steps': calc_frames_skip_steps,\n", 1130 | " # 'skip_step_ratio': skip_step_ratio,\n", 1131 | " # 'calc_frames_skip_steps': calc_frames_skip_steps,\n", 1132 | " # 'text_prompts': text_prompts,\n", 1133 | " # 'image_prompts': image_prompts,\n", 1134 | " # 'cut_overview': eval(cut_overview),\n", 1135 | " # 'cut_innercut': eval(cut_innercut),\n", 1136 | " # 'cut_ic_pow': cut_ic_pow,\n", 1137 | " # 'cut_icgray_p': eval(cut_icgray_p),\n", 1138 | " # 'intermediate_saves': intermediate_saves,\n", 1139 | " # 'intermediates_in_subfolder': intermediates_in_subfolder,\n", 1140 | " # 'steps_per_checkpoint': steps_per_checkpoint,\n", 1141 | " # 'perlin_init': perlin_init,\n", 1142 | " # 'perlin_mode': perlin_mode,\n", 1143 | " # 'set_seed': set_seed,\n", 1144 | " # 'eta': eta,\n", 1145 | " # 'clamp_grad': clamp_grad,\n", 1146 | " # 'clamp_max': clamp_max,\n", 1147 | " # 'skip_augs': skip_augs,\n", 1148 | " # 'randomize_class': randomize_class,\n", 1149 | " # 'clip_denoised': clip_denoised,\n", 1150 | " # 'fuzzy_prompt': fuzzy_prompt,\n", 1151 | " # 'rand_mag': rand_mag,\n", 1152 | "}\n", 1153 | "\n", 1154 | "args = SimpleNamespace(**args)" 1155 | ], 1156 | "metadata": { 1157 | "cellView": "form", 1158 | "id": "JmMlLvD6ALqR" 1159 | }, 1160 | "execution_count": 3, 1161 | "outputs": [] 1162 | }, 1163 | { 1164 | "cell_type": "code", 1165 | "execution_count": 4, 1166 | "metadata": { 1167 | "cellView": "form", 1168 | "id": "H_UBFv6x7G00", 1169 | "colab": { 1170 | "base_uri": "https://localhost:8080/" 1171 | }, 1172 | "outputId": "fa9a29a8-531d-4f71-c975-545b5dd5541a" 1173 | }, 1174 | "outputs": [ 1175 | { 1176 | "output_type": "stream", 1177 | "name": "stdout", 1178 | "text": [ 1179 | "Initializing MiDaS 'dpt_large' depth model...\n", 1180 | "MiDaS 'dpt_large' depth model initialized.\n" 1181 | ] 1182 | } 1183 | ], 1184 | "source": [ 1185 | "#@title ### 1.4 Define Midas functions\n", 1186 | "\n", 1187 | "from midas.dpt_depth import DPTDepthModel\n", 1188 | "from midas.midas_net import MidasNet\n", 1189 | "from midas.midas_net_custom import MidasNet_small\n", 1190 | "from midas.transforms import Resize, NormalizeImage, PrepareForNet\n", 1191 | "\n", 1192 | "# Initialize MiDaS depth model.\n", 1193 | "# It remains resident in VRAM and likely takes around 2GB VRAM.\n", 1194 | "# You could instead initialize it for each frame (and free it after each frame) to save VRAM.. 
but initializing it is slow. A sketch of that per-frame approach follows the animation cell below.\n", 1195 | "default_models = {\n", 1196 | " \"midas_v21_small\": f\"{model_path}/midas_v21_small-70d6b9c8.pt\",\n", 1197 | " \"midas_v21\": f\"{model_path}/midas_v21-f6b98070.pt\",\n", 1198 | " \"dpt_large\": f\"{model_path}/dpt_large-midas-2f21e586.pt\",\n", 1199 | " \"dpt_hybrid\": f\"{model_path}/dpt_hybrid-midas-501f0c75.pt\",\n", 1200 | " \"dpt_hybrid_nyu\": f\"{model_path}/dpt_hybrid_nyu-2ce69ec7.pt\",}\n", 1201 | "\n", 1202 | "\n", 1203 | "def init_midas_depth_model(midas_model_type=\"dpt_large\", optimize=True):\n", 1204 | " midas_model = None\n", 1205 | " net_w = None\n", 1206 | " net_h = None\n", 1207 | " resize_mode = None\n", 1208 | " normalization = None\n", 1209 | "\n", 1210 | " print(f\"Initializing MiDaS '{midas_model_type}' depth model...\")\n", 1211 | " # load network\n", 1212 | " midas_model_path = default_models[midas_model_type]\n", 1213 | "\n", 1214 | " if midas_model_type == \"dpt_large\": # DPT-Large\n", 1215 | " midas_model = DPTDepthModel(\n", 1216 | " path=midas_model_path,\n", 1217 | " backbone=\"vitl16_384\",\n", 1218 | " non_negative=True,\n", 1219 | " )\n", 1220 | " net_w, net_h = 384, 384\n", 1221 | " resize_mode = \"minimal\"\n", 1222 | " normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])\n", 1223 | " elif midas_model_type == \"dpt_hybrid\": # DPT-Hybrid\n", 1224 | " midas_model = DPTDepthModel(\n", 1225 | " path=midas_model_path,\n", 1226 | " backbone=\"vitb_rn50_384\",\n", 1227 | " non_negative=True,\n", 1228 | " )\n", 1229 | " net_w, net_h = 384, 384\n", 1230 | " resize_mode = \"minimal\"\n", 1231 | " normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])\n", 1232 | " elif midas_model_type == \"dpt_hybrid_nyu\": # DPT-Hybrid-NYU\n", 1233 | " midas_model = DPTDepthModel(\n", 1234 | " path=midas_model_path,\n", 1235 | " backbone=\"vitb_rn50_384\",\n", 1236 | " non_negative=True,\n", 1237 | " )\n", 1238 | " net_w, net_h = 384, 384\n", 1239 | " resize_mode = \"minimal\"\n", 1240 | " normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])\n", 1241 | " elif midas_model_type == \"midas_v21\":\n", 1242 | " midas_model = MidasNet(midas_model_path, non_negative=True)\n", 1243 | " net_w, net_h = 384, 384\n", 1244 | " resize_mode = \"upper_bound\"\n", 1245 | " normalization = NormalizeImage(\n", 1246 | " mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]\n", 1247 | " )\n", 1248 | " elif midas_model_type == \"midas_v21_small\":\n", 1249 | " midas_model = MidasNet_small(midas_model_path, features=64, backbone=\"efficientnet_lite3\", exportable=True, non_negative=True, blocks={'expand': True})\n", 1250 | " net_w, net_h = 256, 256\n", 1251 | " resize_mode = \"upper_bound\"\n", 1252 | " normalization = NormalizeImage(\n", 1253 | " mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]\n", 1254 | " )\n", 1255 | " else:\n", 1256 | " raise NotImplementedError(\n", 1257 | " f\"midas_model_type '{midas_model_type}' not implemented\")\n", 1258 | "\n", 1259 | " midas_transform = T.Compose(\n", 1260 | " [\n", 1261 | " Resize(\n", 1262 | " net_w,\n", 1263 | " net_h,\n", 1264 | " resize_target=None,\n", 1265 | " keep_aspect_ratio=True,\n", 1266 | " ensure_multiple_of=32,\n", 1267 | " resize_method=resize_mode,\n", 1268 | " image_interpolation_method=cv2.INTER_CUBIC,\n", 1269 | " ),\n", 1270 | " normalization,\n", 1271 | " PrepareForNet(),\n", 1272 | " ]\n", 1273 | " )\n", 1274 | "\n", 1275 | " midas_model.eval()\n", 1276 | "\n", 1277 | " if optimize:\n", 1278 | " if DEVICE == 
torch.device(\"cuda\"):\n", 1279 | " midas_model = midas_model.to(memory_format=torch.channels_last) \n", 1280 | " midas_model = midas_model.half()\n", 1281 | "\n", 1282 | " midas_model.to(DEVICE)\n", 1283 | "\n", 1284 | " print(f\"MiDaS '{midas_model_type}' depth model initialized.\")\n", 1285 | " return midas_model, midas_transform, net_w, net_h, resize_mode, normalization\n", 1286 | "\n", 1287 | "#@title Default title text\n", 1288 | "def do_3d_step(img_filepath, frame_num, midas_model, midas_transform):\n", 1289 | " if args.key_frames:\n", 1290 | " translation_x = args.translation_x_series[frame_num]\n", 1291 | " translation_y = args.translation_y_series[frame_num]\n", 1292 | " translation_z = args.translation_z_series[frame_num]\n", 1293 | " rotation_3d_x = args.rotation_3d_x_series[frame_num]\n", 1294 | " rotation_3d_y = args.rotation_3d_y_series[frame_num]\n", 1295 | " rotation_3d_z = args.rotation_3d_z_series[frame_num]\n", 1296 | " # print(\n", 1297 | " # f'translation_x: {translation_x}',\n", 1298 | " # f'translation_y: {translation_y}',\n", 1299 | " # f'translation_z: {translation_z}',\n", 1300 | " # f'rotation_3d_x: {rotation_3d_x}',\n", 1301 | " # f'rotation_3d_y: {rotation_3d_y}',\n", 1302 | " # f'rotation_3d_z: {rotation_3d_z}',\n", 1303 | " # )\n", 1304 | "\n", 1305 | " translate_xyz = [-translation_x*TRANSLATION_SCALE, translation_y*TRANSLATION_SCALE, -translation_z*TRANSLATION_SCALE]\n", 1306 | " rotate_xyz_degrees = [rotation_3d_x, rotation_3d_y, rotation_3d_z]\n", 1307 | " # print('translation:',translate_xyz)\n", 1308 | " # print('rotation:',rotate_xyz_degrees)\n", 1309 | " rotate_xyz = [math.radians(rotate_xyz_degrees[0]), math.radians(rotate_xyz_degrees[1]), math.radians(rotate_xyz_degrees[2])]\n", 1310 | " rot_mat = p3dT.euler_angles_to_matrix(torch.tensor(rotate_xyz, device=device), \"XYZ\").unsqueeze(0)\n", 1311 | " # print(\"rot_mat: \" + str(rot_mat))\n", 1312 | " next_step_pil = dxf.transform_image_3d(img_filepath, midas_model, midas_transform, DEVICE,\n", 1313 | " rot_mat, translate_xyz, args.near_plane, args.far_plane,\n", 1314 | " args.fov, padding_mode=args.padding_mode,\n", 1315 | " sampling_mode=args.sampling_mode, midas_weight=args.midas_weight)\n", 1316 | " return next_step_pil\n", 1317 | "\n", 1318 | "import py3d_tools as p3dT\n", 1319 | "import disco_xform_utils as dxf\n", 1320 | "from tqdm.notebook import trange\n", 1321 | "midas_model, midas_transform, midas_net_w, midas_net_h, midas_resize_mode, midas_normalization = init_midas_depth_model(args.midas_depth_model)\n", 1322 | "TRANSLATION_SCALE = 1.0/200.0" 1323 | ] 1324 | }, 1325 | { 1326 | "cell_type": "code", 1327 | "source": [ 1328 | "#@title Set image path, number of frames to make, and run this cell.\n", 1329 | "#@markdown Output video will be saved to /content/\n", 1330 | "from tqdm.notebook import trange\n", 1331 | "\n", 1332 | "image = '/content/Wonder_Women_Poster_Cropped.webp' #@param {type:\"string\"}\n", 1333 | "init = image\n", 1334 | "max_frames = 20 #@param {type:\"number\"}\n", 1335 | "!mkdir ./out\n", 1336 | "!rm -rf ./out/*\n", 1337 | "\n", 1338 | "for i in trange(1, max_frames):\n", 1339 | " out = do_3d_step(image, i, midas_model, midas_transform)\n", 1340 | " new_fname = f'./out/frame_{i:04d}.png'\n", 1341 | " out.save(new_fname)\n", 1342 | " image = new_fname\n", 1343 | "\n", 1344 | "out_fname = f'/content/{init.split(\"/\")[-1]}_n{near_plane}_o{far_plane}_f{fov}_mw{midas_weight}.mp4'\n", 1345 | "!ffmpeg -y -pattern_type glob -i \"/content/out/*.png\" \"{out_fname}\"" 1346 | ], 
1347 | "metadata": { 1348 | "id": "65crZ7HfSkW2", 1349 | "colab": { 1350 | "base_uri": "https://localhost:8080/", 1351 | "height": 285, 1352 | "referenced_widgets": [ 1353 | "8688f18d323c48b1860f2a0f8c5ebb7b", 1354 | "10fddb5c0ee14dbaba53515a3faf652b", 1355 | "1bd46af147274667ac6d9b135d31f462", 1356 | "40a05829465243699948262cbba458ff", 1357 | "9f6bc988297f4d20962e47954d18be55", 1358 | "51e7f724c27a4b65a74ad1b9f25b1fcc", 1359 | "2e2465965971480cb00785f1a94e8067", 1360 | "4e2e6d1914f74655be50c9f247ace1bf", 1361 | "1bf7b6ab4a4b4381a3fd443010020b58", 1362 | "f150940fbf83498e96c759a5f8c7b209", 1363 | "de3f79a0115c4a50aa8203e7e4022d01" 1364 | ] 1365 | }, 1366 | "cellView": "form", 1367 | "outputId": "8efd693a-69da-47ba-c954-956985b6bff5" 1368 | }, 1369 | "execution_count": null, 1370 | "outputs": [ 1371 | { 1372 | "output_type": "display_data", 1373 | "data": { 1374 | "text/plain": [ 1375 | " 0%| | 0/19 [00:00