├── tests
    ├── __init__.py
    ├── models
    │   ├── __init__.py
    │   ├── test_models_vae_flax.py
    │   └── test_activations.py
    ├── pipelines
    │   ├── __init__.py
    │   ├── ddim
    │   │   └── __init__.py
    │   ├── ddpm
    │   │   └── __init__.py
    │   ├── dit
    │   │   └── __init__.py
    │   ├── pndm
    │   │   └── __init__.py
    │   ├── audioldm
    │   │   └── __init__.py
    │   ├── audioldm2
    │   │   └── __init__.py
    │   ├── controlnet
    │   │   └── __init__.py
    │   ├── kandinsky
    │   │   └── __init__.py
    │   ├── karras_ve
    │   │   └── __init__.py
    │   ├── musicldm
    │   │   └── __init__.py
    │   ├── repaint
    │   │   └── __init__.py
    │   ├── shap_e
    │   │   └── __init__.py
    │   ├── unclip
    │   │   └── __init__.py
    │   ├── altdiffusion
    │   │   └── __init__.py
    │   ├── audio_diffusion
    │   │   └── __init__.py
    │   ├── dance_diffusion
    │   │   └── __init__.py
    │   ├── kandinsky_v22
    │   │   └── __init__.py
    │   ├── score_sde_ve
    │   │   └── __init__.py
    │   ├── stable_unclip
    │   │   └── __init__.py
    │   ├── text_to_video
    │   │   ├── __init__.py
    │   │   └── test_text_to_video_zero.py
    │   ├── unidiffuser
    │   │   └── __init__.py
    │   ├── vq_diffusion
    │   │   └── __init__.py
    │   ├── consistency_models
    │   │   └── __init__.py
    │   ├── latent_diffusion
    │   │   └── __init__.py
    │   ├── paint_by_example
    │   │   └── __init__.py
    │   ├── stable_diffusion
    │   │   └── __init__.py
    │   ├── stable_diffusion_2
    │   │   └── __init__.py
    │   ├── stable_diffusion_xl
    │   │   └── __init__.py
    │   ├── versatile_diffusion
    │   │   └── __init__.py
    │   ├── semantic_stable_diffusion
    │   │   └── __init__.py
    │   ├── spectrogram_diffusion
    │   │   └── __init__.py
    │   ├── stable_diffusion_safe
    │   │   └── __init__.py
    │   └── test_pipelines_onnx_common.py
    ├── schedulers
    │   ├── __init__.py
    │   └── test_scheduler_vq_diffusion.py
    ├── fixtures
    │   └── elise_format0.mid
    ├── conftest.py
    └── others
    │   ├── test_dependencies.py
    │   └── test_hub_utils.py
├── scripts
    ├── __init__.py
    ├── convert_unclip_txt2img_to_image_variation.py
    └── conversion_ldm_uncond.py
├── src
    └── diffusers
    │   ├── experimental
    │       ├── __init__.py
    │       ├── rl
    │       │   └── __init__.py
    │       └── README.md
    │   ├── pipelines
    │       ├── dit
    │       │   └── __init__.py
    │       ├── ddim
    │       │   └── __init__.py
    │       ├── ddpm
    │       │   └── __init__.py
    │       ├── pndm
    │       │   └── __init__.py
    │       ├── repaint
    │       │   └── __init__.py
    │       ├── score_sde_ve
    │       │   └── __init__.py
    │       ├── dance_diffusion
    │       │   └── __init__.py
    │       ├── latent_diffusion_uncond
    │       │   └── __init__.py
    │       ├── stochastic_karras_ve
    │       │   └── __init__.py
    │       ├── consistency_models
    │       │   └── __init__.py
    │       ├── audio_diffusion
    │       │   └── __init__.py
    │       ├── vq_diffusion
    │       │   └── __init__.py
    │       ├── t2i_adapter
    │       │   └── __init__.py
    │       ├── latent_diffusion
    │       │   └── __init__.py
    │       ├── audioldm
    │       │   └── __init__.py
    │       ├── musicldm
    │       │   └── __init__.py
    │       ├── paint_by_example
    │       │   └── __init__.py
    │       ├── unclip
    │       │   └── __init__.py
    │       ├── unidiffuser
    │       │   └── __init__.py
    │       ├── audioldm2
    │       │   └── __init__.py
    │       ├── shap_e
    │       │   └── __init__.py
    │       ├── kandinsky
    │       │   ├── __init__.py
    │       │   └── text_encoder.py
    │       ├── controlnet
    │       │   └── __init__.py
    │       ├── spectrogram_diffusion
    │       │   └── __init__.py
    │       ├── versatile_diffusion
    │       │   └── __init__.py
    │       ├── kandinsky2_2
    │       │   └── __init__.py
    │       ├── text_to_video_synthesis
    │       │   └── __init__.py
    │       ├── stable_diffusion
    │       │   ├── pipeline_flax_stable_diffusion_controlnet.py
    │       │   ├── pipeline_stable_diffusion_controlnet.py
    │       │   └── stable_unclip_image_normalizer.py
    │       ├── stable_diffusion_xl
    │       │   ├── watermark.py
    │       │   └── __init__.py
    │       ├── semantic_stable_diffusion
    │       │   └── __init__.py
    │       ├── alt_diffusion
    │       │   └── __init__.py
    │       └── deepfloyd_if
    │       │   └── watermark.py
    │   ├── models
    │       ├── README.md
    │       ├── activations.py
    │       └── __init__.py
    │   ├── schedulers
    │       └── README.md
    │   ├── utils
    │       ├── dummy_onnx_objects.py
    │       ├── dummy_note_seq_objects.py
    │       ├── dummy_torch_and_scipy_objects.py
    │       ├── dummy_torch_and_torchsde_objects.py
    │       ├── dummy_transformers_and_torch_and_note_seq_objects.py
    │       ├── dummy_torch_and_transformers_and_k_diffusion_objects.py
    │       ├── dummy_torch_and_librosa_objects.py
    │       ├── constants.py
    │       ├── doc_utils.py
    │       ├── model_card_template.md
    │       ├── accelerate_utils.py
    │       └── dummy_flax_and_transformers_objects.py
    │   ├── commands
    │       ├── __init__.py
    │       └── diffusers_cli.py
    │   ├── pipeline_utils.py
    │   ├── dependency_versions_table.py
    │   └── dependency_versions_check.py
├── MANIFEST.in
├── examples
    ├── unconditional_image_generation
    │   └── requirements.txt
    ├── controlnet
    │   ├── requirements.txt
    │   ├── requirements_sdxl.txt
    │   └── requirements_flax.txt
    ├── custom_diffusion
    │   └── requirements.txt
    ├── dreambooth
    │   ├── requirements.txt
    │   ├── requirements_sdxl.txt
    │   └── requirements_flax.txt
    ├── instruct_pix2pix
    │   └── requirements.txt
    ├── research_projects
    │   ├── colossalai
    │   │   ├── requirement.txt
    │   │   └── inference.py
    │   ├── onnxruntime
    │   │   ├── unconditional_image_generation
    │   │   │   ├── requirements.txt
    │   │   │   └── README.md
    │   │   ├── textual_inversion
    │   │   │   └── requirements.txt
    │   │   ├── text_to_image
    │   │   │   └── requirements.txt
    │   │   └── README.md
    │   ├── multi_subject_dreambooth
    │   │   └── requirements.txt
    │   ├── mulit_token_textual_inversion
    │   │   ├── requirements.txt
    │   │   └── requirements_flax.txt
    │   ├── dreambooth_inpaint
    │   │   └── requirements.txt
    │   ├── intel_opts
    │   │   ├── textual_inversion_dfq
    │   │   │   └── requirements.txt
    │   │   ├── textual_inversion
    │   │   │   └── requirements.txt
    │   │   └── README.md
    │   ├── lora
    │   │   └── requirements.txt
    │   └── README.md
    ├── textual_inversion
    │   ├── requirements.txt
    │   └── requirements_flax.txt
    ├── text_to_image
    │   ├── requirements_sdxl.txt
    │   ├── requirements.txt
    │   └── requirements_flax.txt
    ├── inference
    │   ├── image_to_image.py
    │   ├── inpainting.py
    │   └── README.md
    ├── community
    │   └── one_step_unet.py
    ├── reinforcement_learning
    │   ├── README.md
    │   └── run_diffuser_locomotion.py
    └── conftest.py
├── docs
    └── source
    │   ├── en
    │       ├── imgs
    │       │   ├── access_request.png
    │       │   └── diffusers_library.jpg
    │       ├── api
    │       │   ├── models
    │       │   │   ├── transformer_temporal.md
    │       │   │   ├── overview.md
    │       │   │   ├── transformer2d.md
    │       │   │   ├── vq.md
    │       │   │   ├── prior_transformer.md
    │       │   │   ├── autoencoder_tiny.md
    │       │   │   ├── unet.md
    │       │   │   ├── unet2d.md
    │       │   │   ├── unet3d-cond.md
    │       │   │   └── unet2d-cond.md
    │       │   ├── utilities.md
    │       │   ├── schedulers
    │       │   │   ├── ipndm.md
    │       │   │   ├── dpm_sde.md
    │       │   │   ├── ddim_inverse.md
    │       │   │   ├── pndm.md
    │       │   │   ├── heun.md
    │       │   │   ├── stochastic_karras_ve.md
    │       │   │   ├── dpm_discrete.md
    │       │   │   ├── lms_discrete.md
    │       │   │   ├── dpm_discrete_ancestral.md
    │       │   │   ├── euler_ancestral.md
    │       │   │   ├── euler.md
    │       │   │   ├── ddpm.md
    │       │   │   ├── cm_stochastic_iterative.md
    │       │   │   ├── multistep_dpm_solver_inverse.md
    │       │   │   └── singlestep_dpm_solver.md
    │       │   ├── configuration.md
    │       │   ├── attnprocessor.md
    │       │   ├── pipelines
    │       │   │   ├── dance_diffusion.md
    │       │   │   ├── audio_diffusion.md
    │       │   │   ├── stable_diffusion
    │       │   │   │   ├── upscale.md
    │       │   │   │   ├── image_variation.md
    │       │   │   │   ├── depth2img.md
    │       │   │   │   └── latent_upscale.md
    │       │   │   ├── overview.md
    │       │   │   ├── ddim.md
    │       │   │   └── dit.md
    │       │   ├── diffusion_pipeline.md
    │       │   ├── loaders.md
    │       │   ├── image_processor.md
    │       │   └── outputs.md
    │       ├── using-diffusers
    │       │   ├── using_safetensors
    │       │   ├── other-modalities.md
    │       │   ├── loading_overview.md
    │       │   └── pipeline_overview.md
    │       ├── optimization
    │       │   ├── opt_overview.md
    │       │   └── xformers.md
    │       └── tutorials
    │       │   └── tutorial_overview.md
    │   ├── zh
    │       └── _toctree.yml
    │   ├── _config.py
    │   └── ko
    │       ├── using-diffusers
    │           ├── using_safetensors.md
    │           └── pipeline_overview.md
    │       ├── in_translation.md
    │       ├── optimization
    │           ├── opt_overview.md
    │           ├── xformers.md
    │           └── open_vino.md
    │       └── tutorials
    │           └── tutorial_overview.md
├── .github
    ├── ISSUE_TEMPLATE
    │   ├── config.yml
    │   ├── feedback.md
    │   ├── feature_request.md
    │   └── new-model-addition.yml
    └── workflows
    │   ├── typos.yml
    │   ├── delete_doc_comment_trigger.yml
    │   ├── delete_doc_comment.yml
    │   ├── upload_pr_documentation.yml
    │   ├── build_pr_documentation.yml
    │   ├── build_documentation.yml
    │   ├── stale.yml
    │   ├── pr_dependency_test.yml
    │   ├── build_docker_images.yml
    │   ├── pr_quality.yml
    │   └── push_tests_mps.yml
├── _typos.toml
├── setup.cfg
├── pyproject.toml
├── CITATION.cff
├── docker
    ├── diffusers-onnxruntime-cpu
    │   └── Dockerfile
    ├── diffusers-pytorch-cpu
    │   └── Dockerfile
    ├── diffusers-onnxruntime-cuda
    │   └── Dockerfile
    ├── diffusers-flax-cpu
    │   └── Dockerfile
    ├── diffusers-pytorch-cuda
    │   └── Dockerfile
    └── diffusers-flax-tpu
    │   └── Dockerfile
└── utils
    ├── print_env.py
    └── get_modified_files.py


/tests/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/scripts/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/models/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/schedulers/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/ddim/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/ddpm/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/dit/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/pndm/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/audioldm/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/audioldm2/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/controlnet/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/kandinsky/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/karras_ve/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/musicldm/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/repaint/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/shap_e/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/unclip/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/altdiffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/audio_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/dance_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/kandinsky_v22/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/score_sde_ve/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/stable_unclip/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/text_to_video/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/unidiffuser/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/vq_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/consistency_models/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/latent_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/paint_by_example/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/stable_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/stable_diffusion_2/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/stable_diffusion_xl/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/versatile_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/semantic_stable_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/spectrogram_diffusion/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/tests/pipelines/stable_diffusion_safe/__init__.py:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/src/diffusers/experimental/__init__.py:
--------------------------------------------------------------------------------
from .rl import ValueGuidedRLPipeline


--------------------------------------------------------------------------------
/src/diffusers/pipelines/dit/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_dit import DiTPipeline


--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
include LICENSE
include src/diffusers/utils/model_card_template.md


--------------------------------------------------------------------------------
/src/diffusers/pipelines/ddim/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_ddim import DDIMPipeline


--------------------------------------------------------------------------------
/src/diffusers/pipelines/ddpm/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_ddpm import DDPMPipeline


--------------------------------------------------------------------------------
/src/diffusers/pipelines/pndm/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_pndm import PNDMPipeline


--------------------------------------------------------------------------------
/src/diffusers/pipelines/repaint/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_repaint import RePaintPipeline


--------------------------------------------------------------------------------
/src/diffusers/experimental/rl/__init__.py:
--------------------------------------------------------------------------------
from .value_guided_sampling import ValueGuidedRLPipeline


--------------------------------------------------------------------------------
/examples/unconditional_image_generation/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
datasets


--------------------------------------------------------------------------------
/src/diffusers/pipelines/score_sde_ve/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_score_sde_ve import ScoreSdeVePipeline


--------------------------------------------------------------------------------
/src/diffusers/pipelines/dance_diffusion/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_dance_diffusion import DanceDiffusionPipeline


--------------------------------------------------------------------------------
/src/diffusers/pipelines/latent_diffusion_uncond/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_latent_diffusion_uncond import LDMPipeline


--------------------------------------------------------------------------------
/src/diffusers/pipelines/stochastic_karras_ve/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_stochastic_karras_ve import KarrasVePipeline


--------------------------------------------------------------------------------
/tests/fixtures/elise_format0.mid:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mnslarcher/diffusers/main/tests/fixtures/elise_format0.mid


--------------------------------------------------------------------------------
/src/diffusers/pipelines/consistency_models/__init__.py:
--------------------------------------------------------------------------------
from .pipeline_consistency_models import ConsistencyModelPipeline


--------------------------------------------------------------------------------
/docs/source/en/imgs/access_request.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mnslarcher/diffusers/main/docs/source/en/imgs/access_request.png


--------------------------------------------------------------------------------
/docs/source/en/imgs/diffusers_library.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mnslarcher/diffusers/main/docs/source/en/imgs/diffusers_library.jpg


--------------------------------------------------------------------------------
/examples/controlnet/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
datasets


--------------------------------------------------------------------------------
/examples/custom_diffusion/requirements.txt:
--------------------------------------------------------------------------------
accelerate
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/dreambooth/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/dreambooth/requirements_sdxl.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/instruct_pix2pix/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
datasets
ftfy
tensorboard


--------------------------------------------------------------------------------
/examples/research_projects/colossalai/requirement.txt:
--------------------------------------------------------------------------------
diffusers
torch
torchvision
ftfy
tensorboard
Jinja2
transformers


--------------------------------------------------------------------------------
/examples/textual_inversion/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/src/diffusers/pipelines/audio_diffusion/__init__.py:
--------------------------------------------------------------------------------
from .mel import Mel
from .pipeline_audio_diffusion import AudioDiffusionPipeline


--------------------------------------------------------------------------------
/examples/dreambooth/requirements_flax.txt:
--------------------------------------------------------------------------------
transformers>=4.25.1
flax
optax
torch
torchvision
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/text_to_image/requirements_sdxl.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/research_projects/onnxruntime/unconditional_image_generation/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
datasets
tensorboard


--------------------------------------------------------------------------------
/examples/text_to_image/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
datasets
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/textual_inversion/requirements_flax.txt:
--------------------------------------------------------------------------------
transformers>=4.25.1
flax
optax
torch
torchvision
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/research_projects/multi_subject_dreambooth/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/controlnet/requirements_sdxl.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2
datasets
wandb


--------------------------------------------------------------------------------
/src/diffusers/models/README.md:
--------------------------------------------------------------------------------
# Models

For more detail on the models, please refer to the [docs](https://huggingface.co/docs/diffusers/api/models/overview).


--------------------------------------------------------------------------------
/examples/controlnet/requirements_flax.txt:
--------------------------------------------------------------------------------
transformers>=4.25.1
datasets
flax
optax
torch
torchvision
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/research_projects/mulit_token_textual_inversion/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/text_to_image/requirements_flax.txt:
--------------------------------------------------------------------------------
transformers>=4.25.1
datasets
flax
optax
torch
torchvision
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/research_projects/onnxruntime/textual_inversion/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
modelcards


--------------------------------------------------------------------------------
/examples/research_projects/dreambooth_inpaint/requirements.txt:
--------------------------------------------------------------------------------
diffusers==0.9.0
accelerate>=0.16.0
torchvision
transformers>=4.21.0
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/research_projects/mulit_token_textual_inversion/requirements_flax.txt:
--------------------------------------------------------------------------------
transformers>=4.25.1
flax
optax
torch
torchvision
ftfy
tensorboard
Jinja2


--------------------------------------------------------------------------------
/examples/research_projects/onnxruntime/text_to_image/requirements.txt:
--------------------------------------------------------------------------------
accelerate>=0.16.0
torchvision
transformers>=4.25.1
datasets
ftfy
tensorboard
modelcards


--------------------------------------------------------------------------------
/src/diffusers/schedulers/README.md:
--------------------------------------------------------------------------------
# Schedulers

For more information on the schedulers, please refer to the [docs](https://huggingface.co/docs/diffusers/api/schedulers/overview).


--------------------------------------------------------------------------------
/examples/research_projects/intel_opts/textual_inversion_dfq/requirements.txt:
--------------------------------------------------------------------------------
accelerate
torchvision
transformers>=4.25.0
ftfy
tensorboard
modelcards
neural-compressor


--------------------------------------------------------------------------------
/docs/source/zh/_toctree.yml:
--------------------------------------------------------------------------------
- sections:
  - local: index
    title: 🧨 Diffusers
  - local: quicktour
    title: 快速入门
  - local: installation
    title: 安装
  title: 开始


--------------------------------------------------------------------------------
/examples/research_projects/lora/requirements.txt:
--------------------------------------------------------------------------------
1 | accelerate>=0.16.0
2 | torchvision
3 | transformers>=4.25.1
4 | datasets
5 | ftfy
6 | tensorboard
7 | Jinja2
8 | git+https://github.com/huggingface/peft.git


--------------------------------------------------------------------------------
/examples/research_projects/intel_opts/textual_inversion/requirements.txt:
--------------------------------------------------------------------------------
1 | accelerate>=0.16.0
2 | torchvision
3 | transformers>=4.21.0
4 | ftfy
5 | tensorboard
6 | Jinja2
7 | intel_extension_for_pytorch>=1.13
8 | 


--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/config.yml:
--------------------------------------------------------------------------------
1 | contact_links:
2 |   - name: Blank issue
3 |     url: https://github.com/huggingface/diffusers/issues/new
4 |     about: Other
5 |   - name: Forum
6 |     url: https://discuss.huggingface.co/
7 |     about: General usage questions and community discussions


--------------------------------------------------------------------------------
/.github/workflows/typos.yml:
--------------------------------------------------------------------------------
 1 | name: Check typos
 2 | 
 3 | on:
 4 |   workflow_dispatch:
 5 | 
 6 | jobs:
 7 |   build:
 8 |     runs-on: ubuntu-latest
 9 | 
10 |     steps:
11 |       - uses: actions/checkout@v3
12 | 
13 |       - name: typos-action
14 |         uses: crate-ci/typos@v1.12.4
15 | 


--------------------------------------------------------------------------------
/examples/inference/image_to_image.py:
--------------------------------------------------------------------------------
 1 | import warnings
 2 | 
 3 | from diffusers import StableDiffusionImg2ImgPipeline  # noqa F401
 4 | 
 5 | 
 6 | warnings.warn(
 7 |     "The `image_to_image.py` script is outdated. Please use directly `from diffusers import"
 8 |     " StableDiffusionImg2ImgPipeline` instead."
 9 | )
10 | 


--------------------------------------------------------------------------------
/src/diffusers/experimental/README.md:
--------------------------------------------------------------------------------
1 | # 🧨 Diffusers Experimental
2 | 
3 | We are adding experimental code to support novel applications and usages of the Diffusers library.
4 | Currently, the following experiments are supported:
5 | * Reinforcement learning via an implementation of the [Diffuser](https://arxiv.org/abs/2205.09991) model.


--------------------------------------------------------------------------------
/.github/workflows/delete_doc_comment_trigger.yml:
--------------------------------------------------------------------------------
 1 | name: Delete doc comment trigger
 2 | 
 3 | on:
 4 |   pull_request:
 5 |     types: [ closed ]
 6 | 
 7 | 
 8 | jobs:
 9 |   delete:
10 |     uses: huggingface/doc-builder/.github/workflows/delete_doc_comment_trigger.yml@main
11 |     with:
12 |       pr_number: ${{ github.event.number }}
13 | 


--------------------------------------------------------------------------------
/examples/inference/inpainting.py:
--------------------------------------------------------------------------------
 1 | import warnings
 2 | 
 3 | from diffusers import StableDiffusionInpaintPipeline as StableDiffusionInpaintPipeline  # noqa F401
 4 | 
 5 | 
 6 | warnings.warn(
 7 |     "The `inpainting.py` script is outdated. Please use directly `from diffusers import"
 8 |     " StableDiffusionInpaintPipeline` instead."
 9 | )
10 | 


--------------------------------------------------------------------------------
/docs/source/en/api/models/transformer_temporal.md:
--------------------------------------------------------------------------------
 1 | # Transformer Temporal
 2 | 
 3 | A Transformer model for video-like data.
 4 | 
 5 | ## TransformerTemporalModel
 6 | 
 7 | [[autodoc]] models.transformer_temporal.TransformerTemporalModel
 8 | 
 9 | ## TransformerTemporalModelOutput
10 | 
11 | [[autodoc]] models.transformer_temporal.TransformerTemporalModelOutput


--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/feedback.md:
--------------------------------------------------------------------------------
 1 | ---
 2 | name: "💬 Feedback about API Design"
 3 | about: Give feedback about the current API design
 4 | title: ''
 5 | labels: ''
 6 | assignees: ''
 7 | 
 8 | ---
 9 | 
10 | **What API design would you like to have changed or added to the library? Why?**
11 | 
12 | **What use case would this enable or better enable? Can you give us a code example?**
13 | 


--------------------------------------------------------------------------------
/.github/workflows/delete_doc_comment.yml:
--------------------------------------------------------------------------------
 1 | name: Delete doc comment
 2 | 
 3 | on:
 4 |   workflow_run:
 5 |     workflows: ["Delete doc comment trigger"]
 6 |     types:
 7 |       - completed
 8 | 
 9 | 
10 | jobs:
11 |   delete:
12 |     uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
13 |     secrets:
14 |       comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}


--------------------------------------------------------------------------------
/docs/source/_config.py:
--------------------------------------------------------------------------------
 1 | # docstyle-ignore
 2 | INSTALL_CONTENT = """
 3 | # Diffusers installation
 4 | ! pip install diffusers transformers datasets accelerate
 5 | # To install from source instead of the last release, comment the command above and uncomment the following one.
 6 | # ! pip install git+https://github.com/huggingface/diffusers.git
 7 | """
 8 | 
 9 | notebook_first_cells = [{"type": "code", "content": INSTALL_CONTENT}]
10 | 


--------------------------------------------------------------------------------
/examples/research_projects/colossalai/inference.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | 
 3 | from diffusers import StableDiffusionPipeline
 4 | 
 5 | 
 6 | model_id = "path-to-your-trained-model"
 7 | pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
 8 | 
 9 | prompt = "A photo of sks dog in a bucket"
10 | image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
11 | 
12 | image.save("dog-bucket.png")
13 | 


--------------------------------------------------------------------------------
/src/diffusers/models/activations.py:
--------------------------------------------------------------------------------
 1 | from torch import nn
 2 | 
 3 | 
 4 | def get_activation(act_fn):
 5 |     if act_fn in ["swish", "silu"]:
 6 |         return nn.SiLU()
 7 |     elif act_fn == "mish":
 8 |         return nn.Mish()
 9 |     elif act_fn == "gelu":
10 |         return nn.GELU()
11 |     elif act_fn == "relu":
12 |         return nn.ReLU()
13 |     else:
14 |         raise ValueError(f"Unsupported activation function: {act_fn}")
15 | 


--------------------------------------------------------------------------------
/tests/pipelines/test_pipelines_onnx_common.py:
--------------------------------------------------------------------------------
 1 | from diffusers.utils.testing_utils import require_onnxruntime
 2 | 
 3 | 
 4 | @require_onnxruntime
 5 | class OnnxPipelineTesterMixin:
 6 |     """
 7 |     This mixin is designed to be used with unittest.TestCase classes.
 8 |     It provides a set of common tests for each ONNXRuntime pipeline, e.g. saving and loading the pipeline,
 9 |     equivalence of dict and tuple outputs, etc.
10 |     """
11 | 
12 |     pass
13 | 


--------------------------------------------------------------------------------
/_typos.toml:
--------------------------------------------------------------------------------
 1 | # Files for typos
 2 | # Instruction:  https://github.com/marketplace/actions/typos-action#getting-started
 3 | 
 4 | [default.extend-identifiers]
 5 | 
 6 | [default.extend-words]
 7 | NIN="NIN" # NIN is used in scripts/convert_ncsnpp_original_checkpoint_to_diffusers.py
 8 | nd="np" # nd may be np (numpy)
 9 | parms="parms" # parms is used in scripts/convert_original_stable_diffusion_to_diffusers.py
10 | 
11 | 
12 | [files]
13 | extend-exclude = ["_typos.toml"]
14 | 


--------------------------------------------------------------------------------
/.github/workflows/upload_pr_documentation.yml:
--------------------------------------------------------------------------------
 1 | name: Upload PR Documentation
 2 | 
 3 | on:
 4 |   workflow_run:
 5 |     workflows: ["Build PR Documentation"]
 6 |     types:
 7 |       - completed
 8 | 
 9 | jobs:
10 |   build:
11 |     uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@main
12 |     with:
13 |       package_name: diffusers
14 |     secrets:
15 |       hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
16 |       comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}
17 | 


--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
 1 | [isort]
 2 | default_section = FIRSTPARTY
 3 | ensure_newline_before_comments = True
 4 | force_grid_wrap = 0
 5 | include_trailing_comma = True
 6 | known_first_party = accelerate
 7 | known_third_party =
 8 |     numpy
 9 |     torch
10 |     torch_xla
11 | 
12 | line_length = 119
13 | lines_after_imports = 2
14 | multi_line_output = 3
15 | use_parentheses = True
16 | 
17 | [flake8]
18 | ignore = E203, E722, E501, E741, W503, W605
19 | max-line-length = 119
20 | per-file-ignores = __init__.py:F401
21 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/vq_diffusion/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import OptionalDependencyNotAvailable, is_torch_available, is_transformers_available
 2 | 
 3 | 
 4 | try:
 5 |     if not (is_transformers_available() and is_torch_available()):
 6 |         raise OptionalDependencyNotAvailable()
 7 | except OptionalDependencyNotAvailable:
 8 |     from ...utils.dummy_torch_and_transformers_objects import *
 9 | else:
10 |     from .pipeline_vq_diffusion import LearnedClassifierFreeSamplingEmbeddings, VQDiffusionPipeline
11 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/t2i_adapter/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 | )
 6 | 
 7 | 
 8 | try:
 9 |     if not (is_transformers_available() and is_torch_available()):
10 |         raise OptionalDependencyNotAvailable()
11 | except OptionalDependencyNotAvailable:
12 |     from ...utils.dummy_torch_and_transformers_objects import *  # noqa F403
13 | else:
14 |     from .pipeline_stable_diffusion_adapter import StableDiffusionAdapterPipeline
15 | 


--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
 1 | [tool.black]
 2 | line-length = 119
 3 | target-version = ['py37']
 4 | 
 5 | [tool.ruff]
 6 | # Never enforce `E501` (line length violations).
 7 | ignore = ["C901", "E501", "E741", "W605"]
 8 | select = ["C", "E", "F", "I", "W"]
 9 | line-length = 119
10 | 
11 | # Ignore import violations in all `__init__.py` files.
12 | [tool.ruff.per-file-ignores]
13 | "__init__.py" = ["E402", "F401", "F403", "F811"]
14 | "src/diffusers/utils/dummy_*.py" = ["F401"]
15 | 
16 | [tool.ruff.isort]
17 | lines-after-imports = 2
18 | known-first-party = ["diffusers"]
19 | 


--------------------------------------------------------------------------------
/.github/workflows/build_pr_documentation.yml:
--------------------------------------------------------------------------------
 1 | name: Build PR Documentation
 2 | 
 3 | on:
 4 |   pull_request:
 5 | 
 6 | concurrency:
 7 |   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
 8 |   cancel-in-progress: true
 9 | 
10 | jobs:
11 |   build:
12 |     uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
13 |     with:
14 |       commit_sha: ${{ github.event.pull_request.head.sha }}
15 |       pr_number: ${{ github.event.number }}
16 |       install_libgl1: true
17 |       package: diffusers
18 |       languages: en ko zh
19 | 


--------------------------------------------------------------------------------
/docs/source/en/api/utilities.md:
--------------------------------------------------------------------------------
 1 | # Utilities
 2 | 
 3 | Utility and helper functions for working with 🤗 Diffusers.
 4 | 
 5 | ## randn_tensor
 6 | 
 7 | [[autodoc]] diffusers.utils.randn_tensor
 8 | 
 9 | ## numpy_to_pil
10 | 
11 | [[autodoc]] utils.pil_utils.numpy_to_pil
12 | 
13 | ## pt_to_pil
14 | 
15 | [[autodoc]] utils.pil_utils.pt_to_pil
16 | 
17 | ## load_image
18 | 
19 | [[autodoc]] utils.testing_utils.load_image
20 | 
21 | ## export_to_video
22 | 
23 | [[autodoc]] utils.testing_utils.export_to_video
24 | 
25 | ## make_image_grid
26 | 
27 | [[autodoc]] utils.pil_utils.make_image_grid


--------------------------------------------------------------------------------
/docs/source/ko/using-diffusers/using_safetensors.md:
--------------------------------------------------------------------------------
 1 | # 세이프텐서란 무엇인가요?
 2 | 
 3 | [세이프텐서](https://github.com/huggingface/safetensors)는 피클 기반의 파이토치가 사용하는 기존의 '.bin'과는 다른 형식입니다.
 4 | 
 5 | 피클은 악의적인 파일이 임의의 코드를 실행할 수 있는 안전하지 않은 것으로 악명이 높습니다.
 6 | 허브 자체에서 문제를 방지하기 위해 노력하고 있지만 만병통치약은 아닙니다.
 7 | 
 8 | 세이프텐서의 가장 중요한 목표는 컴퓨터를 탈취할 수 없다는 의미에서 머신 러닝 모델 로딩을 *안전하게* 만드는 것입니다.
 9 | 
10 | # 왜 세이프텐서를 사용하나요?
11 | 
12 | 잘 알려지지 않은 모델을 사용하려는 경우, 그리고 파일의 출처가 확실하지 않은 경우 **안전성**이 하나의 이유가 될 수 있습니다.
13 | 
14 | 그리고 두 번째 이유는 **로딩 속도**입니다. 세이프텐서는 일반 피클 파일보다 훨씬 빠르게 모델을 로드할 수 있습니다. 모델을 전환하는 데 많은 시간을 소비하는 경우, 이는 엄청난 시간 절약이 될 수 있습니다.
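
아래는 피클 데이터를 로드하는 것만으로 코드가 실행된다는 점을 보여 주는 최소한의 스케치입니다. 파이썬 표준 라이브러리만 사용하며, `Payload`라는 클래스 이름과 `"6 * 7"`이라는 식은 설명을 위해 임의로 정한 것입니다. 실제 악성 파일이라면 이 자리에 훨씬 위험한 호출이 들어갈 수 있습니다.

```python
import pickle


class Payload:
    # 설명용 가상 클래스입니다.
    # pickle은 역직렬화할 때 __reduce__가 돌려준 호출 가능 객체를 인자와 함께 실행합니다.
    def __reduce__(self):
        return (eval, ("6 * 7",))


# 직렬화된 바이트에는 Payload 객체 대신 "eval을 호출하라"는 지시가 담깁니다.
blob = pickle.dumps(Payload())

# 로드하는 순간 eval("6 * 7")이 실행되고, 그 결과가 반환됩니다.
result = pickle.loads(blob)
```

`pickle.loads`는 `Payload` 객체를 되돌려 주는 대신 `eval("6 * 7")`을 실행하고 그 결과를 반환합니다. 출처를 알 수 없는 피클 파일을 로드하면 안 되는 이유가 여기에 있습니다.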


--------------------------------------------------------------------------------
/src/diffusers/pipelines/latent_diffusion/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import OptionalDependencyNotAvailable, is_torch_available, is_transformers_available
 2 | from .pipeline_latent_diffusion_superresolution import LDMSuperResolutionPipeline
 3 | 
 4 | 
 5 | try:
 6 |     if not (is_transformers_available() and is_torch_available()):
 7 |         raise OptionalDependencyNotAvailable()
 8 | except OptionalDependencyNotAvailable:
 9 |     from ...utils.dummy_torch_and_transformers_objects import LDMTextToImagePipeline
10 | else:
11 |     from .pipeline_latent_diffusion import LDMBertModel, LDMTextToImagePipeline
12 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_onnx_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class OnnxRuntimeModel(metaclass=DummyObject):
 6 |     _backends = ["onnx"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["onnx"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["onnx"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["onnx"])
18 | 


--------------------------------------------------------------------------------
/examples/research_projects/onnxruntime/README.md:
--------------------------------------------------------------------------------
1 | ## Diffusers examples with ONNXRuntime optimizations
2 | 
3 | **This research project is not actively maintained by the diffusers team. For any questions or comments, please contact Prathik Rao (prathikr), Sunghoon Choi (hanbitmyths), Ashwini Khade (askhade), or Peng Wang (pengwa) on GitHub.**
4 | 
5 | This aims to provide diffusers examples with ONNXRuntime optimizations for training/fine-tuning unconditional image generation, text to image, and textual inversion. Please see individual directories for more details on how to run each task using ONNXRuntime.


--------------------------------------------------------------------------------
/src/diffusers/pipelines/audioldm/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 |     is_transformers_version,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available() and is_transformers_version(">=", "4.27.0")):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import (
14 |         AudioLDMPipeline,
15 |     )
16 | else:
17 |     from .pipeline_audioldm import AudioLDMPipeline
18 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/musicldm/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 |     is_transformers_version,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available() and is_transformers_version(">=", "4.27.0")):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import (
14 |         MusicLDMPipeline,
15 |     )
16 | else:
17 |     from .pipeline_musicldm import MusicLDMPipeline
18 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_note_seq_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class MidiProcessor(metaclass=DummyObject):
 6 |     _backends = ["note_seq"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["note_seq"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["note_seq"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["note_seq"])
18 | 


--------------------------------------------------------------------------------
/.github/workflows/build_documentation.yml:
--------------------------------------------------------------------------------
 1 | name: Build documentation
 2 | 
 3 | on:
 4 |   push:
 5 |     branches:
 6 |       - main
 7 |       - doc-builder*
 8 |       - v*-release
 9 |       - v*-patch
10 | 
11 | jobs:
12 |   build:
13 |     uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
14 |     with:
15 |       commit_sha: ${{ github.sha }}
16 |       install_libgl1: true
17 |       package: diffusers
18 |       notebook_folder: diffusers_doc
19 |       languages: en ko zh
20 | 
21 |     secrets:
22 |       token: ${{ secrets.HUGGINGFACE_PUSH }}
23 |       hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
24 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_torch_and_scipy_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class LMSDiscreteScheduler(metaclass=DummyObject):
 6 |     _backends = ["torch", "scipy"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["torch", "scipy"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["torch", "scipy"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["torch", "scipy"])
18 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_torch_and_torchsde_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class DPMSolverSDEScheduler(metaclass=DummyObject):
 6 |     _backends = ["torch", "torchsde"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["torch", "torchsde"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["torch", "torchsde"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["torch", "torchsde"])
18 | 


--------------------------------------------------------------------------------
/.github/workflows/stale.yml:
--------------------------------------------------------------------------------
 1 | name: Stale Bot
 2 | 
 3 | on:
 4 |   schedule:
 5 |     - cron: "0 15 * * *"
 6 | 
 7 | jobs:
 8 |   close_stale_issues:
 9 |     name: Close Stale Issues
10 |     if: github.repository == 'huggingface/diffusers'
11 |     runs-on: ubuntu-latest
12 |     env:
13 |       GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
14 |     steps:
15 |     - uses: actions/checkout@v2
16 | 
17 |     - name: Setup Python
18 |       uses: actions/setup-python@v1
19 |       with:
20 |         python-version: 3.7
21 | 
22 |     - name: Install requirements
23 |       run: |
24 |         pip install PyGithub
25 |     - name: Close stale issues
26 |       run: |
27 |         python utils/stale.py
28 | 


--------------------------------------------------------------------------------
/examples/research_projects/README.md:
--------------------------------------------------------------------------------
 1 | # Research projects
 2 | 
 3 | This folder contains various research projects using 🧨 Diffusers.
 4 | They are not actively maintained by the core maintainers of this library and often require a specific version of Diffusers, which is indicated in the requirements file of each folder.
 5 | Updating them to the most recent version of the library will require some work.
 6 | 
 7 | To use any of them, just run the command
 8 | 
 9 | ```
10 | pip install -r requirements.txt
11 | ```
12 | inside the folder of your choice.
13 | 
14 | If you need help with any of those, please open an issue where you directly ping the author(s), as indicated at the top of the README of each folder.
15 | 


--------------------------------------------------------------------------------
/docs/source/ko/in_translation.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # 번역중
14 | 
15 | 열심히 번역을 진행중입니다. 조금만 기다려주세요.
16 | 감사합니다!


--------------------------------------------------------------------------------
/src/diffusers/pipelines/paint_by_example/__init__.py:
--------------------------------------------------------------------------------
 1 | from dataclasses import dataclass
 2 | from typing import List, Optional, Union
 3 | 
 4 | import numpy as np
 5 | import PIL
 6 | from PIL import Image
 7 | 
 8 | from ...utils import OptionalDependencyNotAvailable, is_torch_available, is_transformers_available
 9 | 
10 | 
11 | try:
12 |     if not (is_transformers_available() and is_torch_available()):
13 |         raise OptionalDependencyNotAvailable()
14 | except OptionalDependencyNotAvailable:
15 |     from ...utils.dummy_torch_and_transformers_objects import PaintByExamplePipeline
16 | else:
17 |     from .image_encoder import PaintByExampleImageEncoder
18 |     from .pipeline_paint_by_example import PaintByExamplePipeline
19 | 


--------------------------------------------------------------------------------
/docs/source/en/api/models/overview.md:
--------------------------------------------------------------------------------
 1 | # Models
 2 | 
 3 | 🤗 Diffusers provides pretrained models for popular algorithms and modules to create custom diffusion systems. The primary function of models is to denoise an input sample as modeled by the distribution \\(p_{\theta}(x_{t-1}|x_{t})\\).
 4 | 
 5 | All models are built from the base [`ModelMixin`] class which is a [`torch.nn.Module`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html) providing basic functionality for saving and loading models, locally and from the Hugging Face Hub.
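
As a rough illustration of the mixin pattern described above, the sketch below adds save and load methods to a plain Python class. `ToySaveLoadMixin`, `ToyModel`, and the `config.json` file name are hypothetical names chosen for this example. It uses only the standard library, handles configuration only (no weights), and is not how [`ModelMixin`] itself is implemented.

```python
import json
import tempfile
from pathlib import Path


class ToySaveLoadMixin:
    """Hypothetical mixin: a subclass that keeps its settings in `self.config` gains save/load."""

    def save_pretrained(self, save_directory):
        # Write the configuration as JSON into the target directory.
        path = Path(save_directory)
        path.mkdir(parents=True, exist_ok=True)
        (path / "config.json").write_text(json.dumps(self.config))

    @classmethod
    def from_pretrained(cls, save_directory):
        # Rebuild an instance of the calling class from the saved configuration.
        config = json.loads((Path(save_directory) / "config.json").read_text())
        return cls(**config)


class ToyModel(ToySaveLoadMixin):
    def __init__(self, in_channels=3, out_channels=3):
        self.config = {"in_channels": in_channels, "out_channels": out_channels}


# Round trip: save to a temporary directory, then load a fresh instance from it.
with tempfile.TemporaryDirectory() as tmp_dir:
    ToyModel(in_channels=4).save_pretrained(tmp_dir)
    reloaded = ToyModel.from_pretrained(tmp_dir)
```

Because the save and load logic lives in the mixin, every subclass gets the same behavior without reimplementing it.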
 6 | 
 7 | ## ModelMixin
 8 | [[autodoc]] ModelMixin
 9 | 
10 | ## FlaxModelMixin
11 | 
12 | [[autodoc]] FlaxModelMixin
13 | 
14 | ## PushToHubMixin
15 | 
16 | [[autodoc]] utils.PushToHubMixin


--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/feature_request.md:
--------------------------------------------------------------------------------
 1 | ---
 2 | name: "\U0001F680 Feature request"
 3 | about: Suggest an idea for this project
 4 | title: ''
 5 | labels: ''
 6 | assignees: ''
 7 | 
 8 | ---
 9 | 
10 | **Is your feature request related to a problem? Please describe.**
11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
12 | 
13 | **Describe the solution you'd like**
14 | A clear and concise description of what you want to happen.
15 | 
16 | **Describe alternatives you've considered**
17 | A clear and concise description of any alternative solutions or features you've considered.
18 | 
19 | **Additional context**
20 | Add any other context or screenshots about the feature request here.
21 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/unclip/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 |     is_transformers_version,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available() and is_transformers_version(">=", "4.25.0")):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import UnCLIPImageVariationPipeline, UnCLIPPipeline
14 | else:
15 |     from .pipeline_unclip import UnCLIPPipeline
16 |     from .pipeline_unclip_image_variation import UnCLIPImageVariationPipeline
17 |     from .text_proj import UnCLIPTextProjModel
18 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_transformers_and_torch_and_note_seq_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class SpectrogramDiffusionPipeline(metaclass=DummyObject):
 6 |     _backends = ["transformers", "torch", "note_seq"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["transformers", "torch", "note_seq"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["transformers", "torch", "note_seq"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["transformers", "torch", "note_seq"])
18 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_torch_and_transformers_and_k_diffusion_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class StableDiffusionKDiffusionPipeline(metaclass=DummyObject):
 6 |     _backends = ["torch", "transformers", "k_diffusion"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["torch", "transformers", "k_diffusion"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["torch", "transformers", "k_diffusion"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["torch", "transformers", "k_diffusion"])
18 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/unidiffuser/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 |     is_transformers_version,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available()):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import (
14 |         ImageTextPipelineOutput,
15 |         UniDiffuserPipeline,
16 |     )
17 | else:
18 |     from .modeling_text_decoder import UniDiffuserTextDecoder
19 |     from .modeling_uvit import UniDiffuserModel, UTransformer2DModel
20 |     from .pipeline_unidiffuser import ImageTextPipelineOutput, UniDiffuserPipeline
21 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/audioldm2/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 |     is_transformers_version,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available() and is_transformers_version(">=", "4.27.0")):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import (
14 |         AudioLDM2Pipeline,
15 |         AudioLDM2ProjectionModel,
16 |         AudioLDM2UNet2DConditionModel,
17 |     )
18 | else:
19 |     from .modeling_audioldm2 import AudioLDM2ProjectionModel, AudioLDM2UNet2DConditionModel
20 |     from .pipeline_audioldm2 import AudioLDM2Pipeline
21 | 


--------------------------------------------------------------------------------
/examples/community/one_step_unet.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python3
 2 | import torch
 3 | 
 4 | from diffusers import DiffusionPipeline
 5 | 
 6 | 
 7 | class UnetSchedulerOneForwardPipeline(DiffusionPipeline):
 8 |     def __init__(self, unet, scheduler):
 9 |         super().__init__()
10 | 
11 |         self.register_modules(unet=unet, scheduler=scheduler)
12 | 
13 |     def __call__(self):
14 |         image = torch.randn(
15 |             (1, self.unet.config.in_channels, self.unet.config.sample_size, self.unet.config.sample_size),
16 |         )
17 |         timestep = 1
18 | 
19 |         model_output = self.unet(image, timestep).sample
20 |         scheduler_output = self.scheduler.step(model_output, timestep, image).prev_sample
21 | 
22 |         result = scheduler_output - scheduler_output + torch.ones_like(scheduler_output)
23 | 
24 |         return result
25 | 


--------------------------------------------------------------------------------
/.github/workflows/pr_dependency_test.yml:
--------------------------------------------------------------------------------
 1 | name: Run dependency tests
 2 | 
 3 | on:
 4 |   pull_request:
 5 |     branches:
 6 |       - main
 7 |   push:
 8 |     branches:
 9 |       - main
10 | 
11 | concurrency:
12 |   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
13 |   cancel-in-progress: true
14 | 
15 | jobs:
16 |   check_dependencies:
17 |     runs-on: ubuntu-latest
18 |     steps:
19 |       - uses: actions/checkout@v3
20 |       - name: Set up Python
21 |         uses: actions/setup-python@v4
22 |         with:
23 |           python-version: "3.7"
24 |       - name: Install dependencies
25 |         run: |
26 |           python -m pip install --upgrade pip
27 |           pip install -e .
28 |           pip install pytest
29 |       - name: Check for soft dependencies
30 |         run: |
31 |           pytest tests/others/test_dependencies.py
32 |       


--------------------------------------------------------------------------------
/examples/inference/README.md:
--------------------------------------------------------------------------------
1 | # Inference Examples
2 | 
3 | **The inference examples folder is deprecated and will be removed in a future version**.
4 | **Officially supported inference examples can be found in the [Pipelines folder](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines)**.
5 | 
6 | - For `Image-to-Image text-guided generation with Stable Diffusion`, please have a look at the official [Pipeline examples](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines#examples)
7 | - For `In-painting using Stable Diffusion`, please have a look at the official [Pipeline examples](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines#examples)
8 | - For `Tweak prompts reusing seeds and latents`, please have a look at the official [Pipeline examples](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines#examples)
9 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/shap_e/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 |     is_transformers_version,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available()):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import ShapEPipeline
14 | else:
15 |     from .camera import create_pan_cameras
16 |     from .pipeline_shap_e import ShapEPipeline
17 |     from .pipeline_shap_e_img2img import ShapEImg2ImgPipeline
18 |     from .renderer import (
19 |         BoundingBoxVolume,
20 |         ImportanceRaySampler,
21 |         MLPNeRFModelOutput,
22 |         MLPNeRSTFModel,
23 |         ShapEParamsProjModel,
24 |         ShapERenderer,
25 |         StratifiedRaySampler,
26 |         VoidNeRFModel,
27 |     )
28 | 


--------------------------------------------------------------------------------
/examples/reinforcement_learning/README.md:
--------------------------------------------------------------------------------
 1 | # Overview
 2 | 
 3 | These examples show how to run [Diffuser](https://arxiv.org/abs/2205.09991) in Diffusers. 
 4 | There are two ways to use the script, `run_diffuser_locomotion.py`.
 5 | 
 6 | The two modes differ in the value of the variable `n_guide_steps`.
 7 | When `n_guide_steps=0`, the trajectories are sampled from the diffusion model, but not fine-tuned to maximize reward in the environment.
 8 | By default, `n_guide_steps=2` to match the original implementation.
 9 |  
10 | 
11 | You will need some RL-specific requirements to run the examples:
12 | 
13 | ```
14 | pip install -f https://download.pytorch.org/whl/torch_stable.html \
15 |                 free-mujoco-py \
16 |                 einops \
17 |                 gym==0.24.1 \
18 |                 protobuf==3.20.1 \
19 |                 git+https://github.com/rail-berkeley/d4rl.git \
20 |                 mediapy \
21 |                 Pillow==9.0.0
22 | ```
23 | 


--------------------------------------------------------------------------------
/docs/source/en/using-diffusers/using_safetensors:
--------------------------------------------------------------------------------
 1 | # What is safetensors?
 2 | 
 3 | [safetensors](https://github.com/huggingface/safetensors) is a different format
 4 | from the classic `.bin` files, which PyTorch saves using pickle.
 5 | 
 6 | Pickle is notoriously unsafe: a malicious file can execute arbitrary code when it is loaded.
 7 | The Hub itself tries to prevent issues from it, but it's not a silver bullet.
 8 | 
 9 | The first and foremost goal of `safetensors` is to make loading machine learning models *safe*,
10 | in the sense that loading a file cannot be used to take over your computer.
11 | 
12 | # Why use safetensors?
13 | 
14 | **Safety** is one reason, if you're trying out a model that is not well known and
15 | you're not sure about the source of the file.
16 | 
17 | A secondary reason is **loading speed**. Safetensors can load models much faster
18 | than regular pickle files. If you spend a lot of time switching models, this can be
19 | a huge time saver.
20 | 
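The pickle risk described above can be illustrated with a small, harmless sketch that uses only the Python standard library. It is not taken from safetensors or PyTorch; the class name `Malicious` is hypothetical, and the payload simply evaluates an arithmetic expression where a real attack would run something destructive.

```python
import pickle


class Malicious:
    """Hypothetical object whose pickle payload runs code when loaded."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object's data.
        # A real attacker would put a harmful call here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Malicious())

# Loading the bytes executes the embedded call; the caller never
# gets a Malicious instance back, only the result of the call.
loaded = pickle.loads(blob)
print(loaded)  # prints 42
```

Nothing in `pickle.loads` asks for confirmation before calling the embedded function, which is why loading an untrusted pickle file is equivalent to running untrusted code.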


--------------------------------------------------------------------------------
/src/diffusers/pipelines/kandinsky/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 | )
 6 | 
 7 | 
 8 | try:
 9 |     if not (is_transformers_available() and is_torch_available()):
10 |         raise OptionalDependencyNotAvailable()
11 | except OptionalDependencyNotAvailable:
12 |     from ...utils.dummy_torch_and_transformers_objects import *
13 | else:
14 |     from .pipeline_kandinsky import KandinskyPipeline
15 |     from .pipeline_kandinsky_combined import (
16 |         KandinskyCombinedPipeline,
17 |         KandinskyImg2ImgCombinedPipeline,
18 |         KandinskyInpaintCombinedPipeline,
19 |     )
20 |     from .pipeline_kandinsky_img2img import KandinskyImg2ImgPipeline
21 |     from .pipeline_kandinsky_inpaint import KandinskyInpaintPipeline
22 |     from .pipeline_kandinsky_prior import KandinskyPriorPipeline, KandinskyPriorPipelineOutput
23 |     from .text_encoder import MultilingualCLIP
24 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/controlnet/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_flax_available,
 4 |     is_torch_available,
 5 |     is_transformers_available,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available()):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import *  # noqa F403
14 | else:
15 |     from .multicontrolnet import MultiControlNetModel
16 |     from .pipeline_controlnet import StableDiffusionControlNetPipeline
17 |     from .pipeline_controlnet_img2img import StableDiffusionControlNetImg2ImgPipeline
18 |     from .pipeline_controlnet_inpaint import StableDiffusionControlNetInpaintPipeline
19 |     from .pipeline_controlnet_sd_xl import StableDiffusionXLControlNetPipeline
20 | 
21 | 
22 | if is_transformers_available() and is_flax_available():
23 |     from .pipeline_flax_controlnet import FlaxStableDiffusionControlNetPipeline
24 | 


--------------------------------------------------------------------------------
/src/diffusers/commands/__init__.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | from abc import ABC, abstractmethod
16 | from argparse import ArgumentParser
17 | 
18 | 
19 | class BaseDiffusersCLICommand(ABC):
20 |     @staticmethod
21 |     @abstractmethod
22 |     def register_subcommand(parser: ArgumentParser):
23 |         raise NotImplementedError()
24 | 
25 |     @abstractmethod
26 |     def run(self):
27 |         raise NotImplementedError()
28 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/ipndm.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # IPNDMScheduler
14 | 
15 | `IPNDMScheduler` is a fourth-order Improved Pseudo Linear Multistep scheduler. The original implementation can be found at [crowsonkb/v-diffusion-pytorch](https://github.com/crowsonkb/v-diffusion-pytorch/blob/987f8985e38208345c1959b0ea767a625831cc9b/diffusion/sampling.py#L296).
16 | 
17 | ## IPNDMScheduler
18 | [[autodoc]] IPNDMScheduler
19 | 
20 | ## SchedulerOutput
21 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_torch_and_librosa_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class AudioDiffusionPipeline(metaclass=DummyObject):
 6 |     _backends = ["torch", "librosa"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["torch", "librosa"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["torch", "librosa"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["torch", "librosa"])
18 | 
19 | 
20 | class Mel(metaclass=DummyObject):
21 |     _backends = ["torch", "librosa"]
22 | 
23 |     def __init__(self, *args, **kwargs):
24 |         requires_backends(self, ["torch", "librosa"])
25 | 
26 |     @classmethod
27 |     def from_config(cls, *args, **kwargs):
28 |         requires_backends(cls, ["torch", "librosa"])
29 | 
30 |     @classmethod
31 |     def from_pretrained(cls, *args, **kwargs):
32 |         requires_backends(cls, ["torch", "librosa"])
33 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/spectrogram_diffusion/__init__.py:
--------------------------------------------------------------------------------
 1 | # flake8: noqa
 2 | from ...utils import is_note_seq_available, is_transformers_available, is_torch_available
 3 | from ...utils import OptionalDependencyNotAvailable
 4 | 
 5 | 
 6 | try:
 7 |     if not (is_transformers_available() and is_torch_available()):
 8 |         raise OptionalDependencyNotAvailable()
 9 | except OptionalDependencyNotAvailable:
10 |     from ...utils.dummy_torch_and_transformers_objects import *  # noqa F403
11 | else:
12 |     from .notes_encoder import SpectrogramNotesEncoder
13 |     from .continous_encoder import SpectrogramContEncoder
14 |     from .pipeline_spectrogram_diffusion import (
15 |         SpectrogramContEncoder,
16 |         SpectrogramDiffusionPipeline,
17 |         T5FilmDecoder,
18 |     )
19 | 
20 | try:
21 |     if not (is_transformers_available() and is_torch_available() and is_note_seq_available()):
22 |         raise OptionalDependencyNotAvailable()
23 | except OptionalDependencyNotAvailable:
24 |     from ...utils.dummy_transformers_and_torch_and_note_seq_objects import *  # noqa F403
25 | else:
26 |     from .midi_utils import MidiProcessor
27 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/dpm_sde.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # DPMSolverSDEScheduler
14 | 
15 | The `DPMSolverSDEScheduler` is inspired by the stochastic sampler from the [Elucidating the Design Space of Diffusion-Based Generative Models](https://huggingface.co/papers/2206.00364) paper, and the scheduler is ported from and created by [Katherine Crowson](https://github.com/crowsonkb/).
16 | 
17 | ## DPMSolverSDEScheduler
18 | [[autodoc]] DPMSolverSDEScheduler
19 | 
20 | ## SchedulerOutput
21 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/ddim_inverse.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # DDIMInverseScheduler
14 | 
15 | `DDIMInverseScheduler` is the inverted scheduler from [Denoising Diffusion Implicit Models](https://huggingface.co/papers/2010.02502) (DDIM) by Jiaming Song, Chenlin Meng and Stefano Ermon.
16 | The implementation is mostly based on the DDIM inversion definition from [Null-text Inversion for Editing Real Images using Guided Diffusion Models](https://huggingface.co/papers/2211.09794.pdf).
17 | 
18 | ## DDIMInverseScheduler
19 | [[autodoc]] DDIMInverseScheduler
20 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/kandinsky/text_encoder.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | from transformers import PreTrainedModel, XLMRobertaConfig, XLMRobertaModel
 3 | 
 4 | 
 5 | class MCLIPConfig(XLMRobertaConfig):
 6 |     model_type = "M-CLIP"
 7 | 
 8 |     def __init__(self, transformerDimSize=1024, imageDimSize=768, **kwargs):
 9 |         self.transformerDimensions = transformerDimSize
10 |         self.numDims = imageDimSize
11 |         super().__init__(**kwargs)
12 | 
13 | 
14 | class MultilingualCLIP(PreTrainedModel):
15 |     config_class = MCLIPConfig
16 | 
17 |     def __init__(self, config, *args, **kwargs):
18 |         super().__init__(config, *args, **kwargs)
19 |         self.transformer = XLMRobertaModel(config)
20 |         self.LinearTransformation = torch.nn.Linear(
21 |             in_features=config.transformerDimensions, out_features=config.numDims
22 |         )
23 | 
24 |     def forward(self, input_ids, attention_mask):
25 |         embs = self.transformer(input_ids=input_ids, attention_mask=attention_mask)[0]
26 |         embs2 = (embs * attention_mask.unsqueeze(2)).sum(dim=1) / attention_mask.sum(dim=1)[:, None]
27 |         return self.LinearTransformation(embs2), embs
28 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/versatile_diffusion/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 |     is_transformers_version,
 6 | )
 7 | 
 8 | 
 9 | try:
10 |     if not (is_transformers_available() and is_torch_available() and is_transformers_version(">=", "4.25.0")):
11 |         raise OptionalDependencyNotAvailable()
12 | except OptionalDependencyNotAvailable:
13 |     from ...utils.dummy_torch_and_transformers_objects import (
14 |         VersatileDiffusionDualGuidedPipeline,
15 |         VersatileDiffusionImageVariationPipeline,
16 |         VersatileDiffusionPipeline,
17 |         VersatileDiffusionTextToImagePipeline,
18 |     )
19 | else:
20 |     from .modeling_text_unet import UNetFlatConditionModel
21 |     from .pipeline_versatile_diffusion import VersatileDiffusionPipeline
22 |     from .pipeline_versatile_diffusion_dual_guided import VersatileDiffusionDualGuidedPipeline
23 |     from .pipeline_versatile_diffusion_image_variation import VersatileDiffusionImageVariationPipeline
24 |     from .pipeline_versatile_diffusion_text_to_image import VersatileDiffusionTextToImagePipeline
25 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/pndm.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # PNDMScheduler
14 | 
15 | `PNDMScheduler`, or pseudo numerical methods for diffusion models, uses more advanced ODE integration techniques like the Runge-Kutta and linear multi-step method. The original implementation can be found at [crowsonkb/k-diffusion](https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L181).
16 | 
17 | ## PNDMScheduler
18 | [[autodoc]] PNDMScheduler
19 | 
20 | ## SchedulerOutput
21 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/heun.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # HeunDiscreteScheduler
14 | 
15 | The Heun scheduler (Algorithm 1) is from the [Elucidating the Design Space of Diffusion-Based Generative Models](https://huggingface.co/papers/2206.00364) paper by Karras et al. The scheduler is ported from the [k-diffusion](https://github.com/crowsonkb/k-diffusion) library and created by [Katherine Crowson](https://github.com/crowsonkb/).
16 | 
17 | ## HeunDiscreteScheduler
18 | [[autodoc]] HeunDiscreteScheduler
19 | 
20 | ## SchedulerOutput
21 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/stochastic_karras_ve.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # KarrasVeScheduler
14 | 
15 | `KarrasVeScheduler` is a stochastic sampler tailored to variance-expanding (VE) models. It is based on the [Elucidating the Design Space of Diffusion-Based Generative Models](https://huggingface.co/papers/2206.00364) and [Score-based generative modeling through stochastic differential equations](https://huggingface.co/papers/2011.13456) papers.
16 | 
17 | ## KarrasVeScheduler
18 | [[autodoc]] KarrasVeScheduler
19 | 
20 | ## KarrasVeOutput
21 | [[autodoc]] schedulers.scheduling_karras_ve.KarrasVeOutput


--------------------------------------------------------------------------------
/src/diffusers/pipelines/kandinsky2_2/__init__.py:
--------------------------------------------------------------------------------
 1 | from ...utils import (
 2 |     OptionalDependencyNotAvailable,
 3 |     is_torch_available,
 4 |     is_transformers_available,
 5 | )
 6 | 
 7 | 
 8 | try:
 9 |     if not (is_transformers_available() and is_torch_available()):
10 |         raise OptionalDependencyNotAvailable()
11 | except OptionalDependencyNotAvailable:
12 |     from ...utils.dummy_torch_and_transformers_objects import *
13 | else:
14 |     from .pipeline_kandinsky2_2 import KandinskyV22Pipeline
15 |     from .pipeline_kandinsky2_2_combined import (
16 |         KandinskyV22CombinedPipeline,
17 |         KandinskyV22Img2ImgCombinedPipeline,
18 |         KandinskyV22InpaintCombinedPipeline,
19 |     )
20 |     from .pipeline_kandinsky2_2_controlnet import KandinskyV22ControlnetPipeline
21 |     from .pipeline_kandinsky2_2_controlnet_img2img import KandinskyV22ControlnetImg2ImgPipeline
22 |     from .pipeline_kandinsky2_2_img2img import KandinskyV22Img2ImgPipeline
23 |     from .pipeline_kandinsky2_2_inpainting import KandinskyV22InpaintPipeline
24 |     from .pipeline_kandinsky2_2_prior import KandinskyV22PriorPipeline
25 |     from .pipeline_kandinsky2_2_prior_emb2emb import KandinskyV22PriorEmb2EmbPipeline
26 | 


--------------------------------------------------------------------------------
/CITATION.cff:
--------------------------------------------------------------------------------
 1 | cff-version: 1.2.0
 2 | title: 'Diffusers: State-of-the-art diffusion models'
 3 | message: >-
 4 |   If you use this software, please cite it using the
 5 |   metadata from this file.
 6 | type: software
 7 | authors:
 8 |   - given-names: Patrick
 9 |     family-names: von Platen
10 |   - given-names: Suraj
11 |     family-names: Patil
12 |   - given-names: Anton
13 |     family-names: Lozhkov
14 |   - given-names: Pedro
15 |     family-names: Cuenca
16 |   - given-names: Nathan
17 |     family-names: Lambert
18 |   - given-names: Kashif
19 |     family-names: Rasul
20 |   - given-names: Mishig
21 |     family-names: Davaadorj
22 |   - given-names: Thomas
23 |     family-names: Wolf
24 | repository-code: 'https://github.com/huggingface/diffusers'
25 | abstract: >-
26 |   Diffusers provides pretrained diffusion models across
27 |   multiple modalities, such as vision and audio, and serves
28 |   as a modular toolbox for inference and training of
29 |   diffusion models.
30 | keywords:
31 |   - deep-learning
32 |   - pytorch
33 |   - image-generation
34 |   - diffusion
35 |   - text2image
36 |   - image2image
37 |   - score-based-generative-modeling
38 |   - stable-diffusion
39 | license: Apache-2.0
40 | version: 0.12.1
41 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/dpm_discrete.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # KDPM2DiscreteScheduler
14 | 
15 | The `KDPM2DiscreteScheduler` is inspired by the [Elucidating the Design Space of Diffusion-Based Generative Models](https://huggingface.co/papers/2206.00364) paper, and the scheduler is ported from and created by [Katherine Crowson](https://github.com/crowsonkb/).
16 | 
17 | The original codebase can be found at [crowsonkb/k-diffusion](https://github.com/crowsonkb/k-diffusion).
18 | 
19 | ## KDPM2DiscreteScheduler
20 | [[autodoc]] KDPM2DiscreteScheduler
21 | 
22 | ## SchedulerOutput
23 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/lms_discrete.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # LMSDiscreteScheduler
14 | 
15 | `LMSDiscreteScheduler` is a linear multistep scheduler for discrete beta schedules. The scheduler is ported from and created by [Katherine Crowson](https://github.com/crowsonkb/), and the original implementation can be found at [crowsonkb/k-diffusion](https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L181).
16 | 
17 | ## LMSDiscreteScheduler
18 | [[autodoc]] LMSDiscreteScheduler
19 | 
20 | ## LMSDiscreteSchedulerOutput
21 | [[autodoc]] schedulers.scheduling_lms_discrete.LMSDiscreteSchedulerOutput


--------------------------------------------------------------------------------
/docs/source/en/api/configuration.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Configuration
14 | 
15 | Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which stores all the parameters that are passed to their respective `__init__` methods in a JSON-configuration file.
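The idea of storing `__init__` parameters in a JSON configuration can be sketched as follows. This is a toy illustration, not the actual `ConfigMixin` implementation: `ToyConfigMixin`, `ToyScheduler`, and `_record_init_args` are hypothetical names, and the real `from_config` and `to_json_string` methods may take different arguments.

```python
import json


class ToyConfigMixin:
    """Hypothetical stand-in for illustration; not the diffusers ConfigMixin."""

    def _record_init_args(self, **kwargs):
        # Keep a plain dict of the constructor arguments.
        self.config = dict(kwargs)

    def to_json_string(self):
        # Serialize the recorded arguments as JSON text.
        return json.dumps(self.config, indent=2, sort_keys=True)

    @classmethod
    def from_config(cls, config):
        # Rebuild an equivalent object from a config dict.
        return cls(**config)


class ToyScheduler(ToyConfigMixin):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001):
        self._record_init_args(
            num_train_timesteps=num_train_timesteps, beta_start=beta_start
        )


scheduler = ToyScheduler(num_train_timesteps=50)
serialized = scheduler.to_json_string()

# Round trip: JSON text -> dict -> new object with the same config.
restored = ToyScheduler.from_config(json.loads(serialized))
print(serialized)
```

Because the configuration holds only the constructor arguments, it is enough to recreate an object with the same settings. Learned weights, if any, would be stored separately.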
16 | 
17 | <Tip>
18 | 
19 | To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with `huggingface-cli login`.
20 | 
21 | </Tip>
22 | 
23 | ## ConfigMixin
24 | 
25 | [[autodoc]] ConfigMixin
26 | 	- load_config
27 | 	- from_config
28 | 	- save_config
29 | 	- to_json_file
30 | 	- to_json_string
31 | 


--------------------------------------------------------------------------------
/docs/source/en/api/models/transformer2d.md:
--------------------------------------------------------------------------------
 1 | # Transformer2D
 2 | 
 3 | A Transformer model for image-like data from [CompVis](https://huggingface.co/CompVis) that is based on the [Vision Transformer](https://huggingface.co/papers/2010.11929) introduced by Dosovitskiy et al. The [`Transformer2DModel`] accepts discrete (classes of vector embeddings) or continuous (actual embeddings) inputs.
 4 | 
 5 | When the input is **continuous**:
 6 | 
 7 | 1. Project the input and reshape it to `(batch_size, sequence_length, feature_dimension)`.
 8 | 2. Apply the Transformer blocks in the standard way.
 9 | 3. Reshape to image.
10 | 
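The three continuous-input steps can be sketched with NumPy. This is a shape-level illustration only: the sizes, the random projection matrices, and the identity used in place of the Transformer blocks are all assumptions made for the example, not the `Transformer2DModel` implementation.

```python
import numpy as np

batch_size, channels, height, width = 2, 8, 4, 4
inner_dim = 16  # hypothetical feature dimension of the Transformer blocks

rng = np.random.default_rng(0)
latents = rng.standard_normal((batch_size, channels, height, width))
proj_in = rng.standard_normal((channels, inner_dim))   # stand-in for the input projection
proj_out = rng.standard_normal((inner_dim, channels))  # stand-in for the output projection

# 1. Project and reshape to (batch_size, sequence_length, feature_dimension).
tokens = latents.transpose(0, 2, 3, 1).reshape(batch_size, height * width, channels) @ proj_in

# 2. The Transformer blocks would run here; identity keeps the sketch short.
hidden = tokens

# 3. Project back and reshape to an image-like tensor.
image = (hidden @ proj_out).reshape(batch_size, height, width, channels).transpose(0, 3, 1, 2)

print(tokens.shape, image.shape)
```

Each spatial position becomes one token, so the sequence length is `height * width` and the output has the same shape as the input.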
11 | When the input is **discrete**:
12 | 
13 | <Tip>
14 | 
15 | It is assumed one of the input classes is the masked latent pixel. The predicted classes of the unnoised image don't contain a prediction for the masked pixel because the unnoised image cannot be masked.
16 | 
17 | </Tip>
18 | 
19 | 1. Convert input (classes of latent pixels) to embeddings and apply positional embeddings.
20 | 2. Apply the Transformer blocks in the standard way.
21 | 3. Predict classes of unnoised image.
22 | 
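The discrete-input steps can be sketched in the same spirit. The codebook size, the random embedding tables, and the output projection below are hypothetical values chosen for illustration. The output layer covers one class fewer than the input vocabulary, because the unnoised image cannot contain the masked class.

```python
import numpy as np

num_classes = 10          # hypothetical codebook size, including the masked class
height, width = 4, 4
inner_dim = 16            # hypothetical feature dimension
seq_len = height * width

rng = np.random.default_rng(0)
class_embed = rng.standard_normal((num_classes, inner_dim))
pos_embed = rng.standard_normal((seq_len, inner_dim))
to_logits = rng.standard_normal((inner_dim, num_classes - 1))  # no output for the masked class

# Discrete input: one class index per latent pixel, flattened to a sequence.
latent_classes = rng.integers(0, num_classes, size=(2, seq_len))

# 1. Look up class embeddings and add positional embeddings.
hidden = class_embed[latent_classes] + pos_embed

# 2. The Transformer blocks would run here; identity keeps the sketch short.

# 3. Predict a class for every latent pixel of the unnoised image.
logits = hidden @ to_logits
predicted = logits.argmax(axis=-1)

print(logits.shape, predicted.shape)
```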
23 | ## Transformer2DModel
24 | 
25 | [[autodoc]] Transformer2DModel
26 | 
27 | ## Transformer2DModelOutput
28 | 
29 | [[autodoc]] models.transformer_2d.Transformer2DModelOutput
30 | 


--------------------------------------------------------------------------------
/docs/source/ko/using-diffusers/pipeline_overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Overview
14 | 
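 15 | A pipeline is an end-to-end class that provides a quick and easy way to use a diffusion system for inference by bundling independently trained models and schedulers together. Certain combinations of models and schedulers define specific pipeline types, such as [`StableDiffusionPipeline`] or [`StableDiffusionControlNetPipeline`], each with its own capabilities. All pipeline types inherit from the base [`DiffusionPipeline`] class. Pass it any checkpoint, and it automatically detects the pipeline type and loads the necessary components.
 16 | 
 17 | This section introduces the tasks supported by the pipelines, such as unconditional image generation and the different techniques and variations of text-to-image generation. You'll also learn how to gain more control over the generation process by setting a seed for reproducibility and by weighting prompts to adjust how much influence particular words have on the output. Finally, you'll see how to create a community pipeline for a custom task such as generating images from speech.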
18 | 


--------------------------------------------------------------------------------
/src/diffusers/pipeline_utils.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
15 | 
16 | # NOTE: This file is deprecated and will be removed in a future version.
 17 | # It only exists so that `from diffusers.pipeline_utils import DiffusionPipeline` temporarily keeps working.
18 | 
19 | from .pipelines import DiffusionPipeline, ImagePipelineOutput  # noqa: F401
20 | from .utils import deprecate
21 | 
22 | 
23 | deprecate(
24 |     "pipelines_utils",
25 |     "0.22.0",
26 |     "Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.",
27 |     standard_warn=False,
28 |     stacklevel=3,
29 | )
30 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/dpm_discrete_ancestral.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # KDPM2AncestralDiscreteScheduler
14 | 
15 | The `KDPM2DiscreteScheduler` with ancestral sampling is inspired by the [Elucidating the Design Space of Diffusion-Based Generative Models](https://huggingface.co/papers/2206.00364) paper, and the scheduler is ported from and created by [Katherine Crowson](https://github.com/crowsonkb/).
16 | 
17 | The original codebase can be found at [crowsonkb/k-diffusion](https://github.com/crowsonkb/k-diffusion).
18 | 
19 | ## KDPM2AncestralDiscreteScheduler
20 | [[autodoc]] KDPM2AncestralDiscreteScheduler
21 | 
22 | ## SchedulerOutput
23 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput


--------------------------------------------------------------------------------
/docs/source/ko/optimization/opt_overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
 13 | # Overview
 14 | 
 15 | Generating high-quality outputs is computationally intensive, because every iterative step that goes from a noisier output to a less noisy one requires a lot of computation. One of the goals of 🧨 Diffusers is to make this technology widely accessible to everyone, which includes enabling fast inference on consumer and specialized hardware.
 16 | 
 17 | This section covers tips and tricks, such as half-precision weights and sliced attention, for optimizing inference speed and reducing memory consumption. You'll also learn how to speed up your PyTorch code with [`torch.compile`](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html) or [ONNX Runtime](https://onnxruntime.ai/docs/), and how to enable memory-efficient attention with [xFormers](https://facebookresearch.github.io/xformers/). There are also guides for running inference on specific hardware such as Apple Silicon and Intel or Habana processors.


--------------------------------------------------------------------------------
/docs/source/en/using-diffusers/other-modalities.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Using Diffusers with other modalities
14 | 
15 | Diffusers is in the process of expanding to modalities other than images.
16 | 
17 | Example type        | Colab | Pipeline |
18 | :-------------------------:|:-------------------------:|:-------------------------:|
19 | [Molecule conformation](https://www.nature.com/subjects/molecular-conformation#:~:text=Definition,to%20changes%20in%20their%20environment.) generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/geodiff_molecule_conformation.ipynb) | ❌
20 | 
21 | More coming soon!


--------------------------------------------------------------------------------
/tests/models/test_models_vae_flax.py:
--------------------------------------------------------------------------------
 1 | import unittest
 2 | 
 3 | from diffusers import FlaxAutoencoderKL
 4 | from diffusers.utils import is_flax_available
 5 | from diffusers.utils.testing_utils import require_flax
 6 | 
 7 | from .test_modeling_common_flax import FlaxModelTesterMixin
 8 | 
 9 | 
10 | if is_flax_available():
11 |     import jax
12 | 
13 | 
14 | @require_flax
15 | class FlaxAutoencoderKLTests(FlaxModelTesterMixin, unittest.TestCase):
16 |     model_class = FlaxAutoencoderKL
17 | 
18 |     @property
19 |     def dummy_input(self):
20 |         batch_size = 4
21 |         num_channels = 3
22 |         sizes = (32, 32)
23 | 
24 |         prng_key = jax.random.PRNGKey(0)
25 |         image = jax.random.uniform(prng_key, ((batch_size, num_channels) + sizes))
26 | 
27 |         return {"sample": image, "prng_key": prng_key}
28 | 
29 |     def prepare_init_args_and_inputs_for_common(self):
30 |         init_dict = {
31 |             "block_out_channels": [32, 64],
32 |             "in_channels": 3,
33 |             "out_channels": 3,
34 |             "down_block_types": ["DownEncoderBlock2D", "DownEncoderBlock2D"],
35 |             "up_block_types": ["UpDecoderBlock2D", "UpDecoderBlock2D"],
36 |             "latent_channels": 4,
37 |         }
38 |         inputs_dict = self.dummy_input
39 |         return init_dict, inputs_dict
40 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/euler_ancestral.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # EulerAncestralDiscreteScheduler
14 | 
15 | A scheduler that uses ancestral sampling with Euler method steps. This is a fast scheduler which can often generate good outputs in 20-30 steps. The scheduler is based on the original [k-diffusion](https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L72) implementation by [Katherine Crowson](https://github.com/crowsonkb/).
16 | 
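A single ancestral Euler step can be sketched on a scalar sample with the standard library. The formulas follow the way the k-diffusion ancestral sampler is commonly written. The function names and the example values are hypothetical, and this is not the `EulerAncestralDiscreteScheduler` implementation, which operates on tensors and exposes more options.

```python
import math
import random


def ancestral_sigmas(sigma_from, sigma_to, eta=1.0):
    """Split the step to sigma_to into a deterministic part and fresh noise."""
    sigma_up = min(
        sigma_to,
        eta * math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up


def euler_ancestral_step(x, denoised, sigma, sigma_next, rng):
    sigma_down, sigma_up = ancestral_sigmas(sigma, sigma_next)
    d = (x - denoised) / sigma                 # derivative estimate at the current noise level
    x = x + d * (sigma_down - sigma)           # Euler step down to sigma_down
    return x + rng.gauss(0.0, 1.0) * sigma_up  # add fresh noise back up to sigma_next


rng = random.Random(0)
x = euler_ancestral_step(x=3.0, denoised=1.0, sigma=2.0, sigma_next=1.0, rng=rng)
sigma_down, sigma_up = ancestral_sigmas(2.0, 1.0)
print(sigma_down, sigma_up)
```

The two noise levels satisfy `sigma_down**2 + sigma_up**2 == sigma_to**2`, so the sample sits at the intended noise level after the fresh noise is added.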
17 | ## EulerAncestralDiscreteScheduler
18 | [[autodoc]] EulerAncestralDiscreteScheduler
19 | 
20 | ## EulerAncestralDiscreteSchedulerOutput
21 | [[autodoc]] schedulers.scheduling_euler_ancestral_discrete.EulerAncestralDiscreteSchedulerOutput


--------------------------------------------------------------------------------
/src/diffusers/pipelines/text_to_video_synthesis/__init__.py:
--------------------------------------------------------------------------------
 1 | from dataclasses import dataclass
 2 | from typing import List, Optional, Union
 3 | 
 4 | import numpy as np
 5 | import torch
 6 | 
 7 | from ...utils import BaseOutput, OptionalDependencyNotAvailable, is_torch_available, is_transformers_available
 8 | 
 9 | 
10 | @dataclass
11 | class TextToVideoSDPipelineOutput(BaseOutput):
12 |     """
13 |     Output class for text-to-video pipelines.
14 | 
15 |     Args:
16 |         frames (`List[np.ndarray]` or `torch.FloatTensor`)
17 |             List of denoised frames (essentially images) as NumPy arrays of shape `(height, width, num_channels)` or as
18 |             a `torch` tensor. The length of the list denotes the video length (the number of frames).
19 |     """
20 | 
21 |     frames: Union[List[np.ndarray], torch.FloatTensor]
22 | 
23 | 
24 | try:
25 |     if not (is_transformers_available() and is_torch_available()):
26 |         raise OptionalDependencyNotAvailable()
27 | except OptionalDependencyNotAvailable:
28 |     from ...utils.dummy_torch_and_transformers_objects import *  # noqa F403
29 | else:
30 |     from .pipeline_text_to_video_synth import TextToVideoSDPipeline
31 |     from .pipeline_text_to_video_synth_img2img import VideoToVideoSDPipeline  # noqa: F401
32 |     from .pipeline_text_to_video_zero import TextToVideoZeroPipeline
33 | 


--------------------------------------------------------------------------------
/.github/workflows/build_docker_images.yml:
--------------------------------------------------------------------------------
 1 | name: Build Docker images (nightly)
 2 | 
 3 | on:
 4 |   workflow_dispatch:
 5 |   schedule:
 6 |     - cron: "0 0 * * *" # every day at midnight
 7 | 
 8 | concurrency:
 9 |   group: docker-image-builds
10 |   cancel-in-progress: false
11 | 
12 | env:
13 |   REGISTRY: diffusers
14 | 
15 | jobs:
16 |   build-docker-images:
17 |     runs-on: ubuntu-latest
18 | 
19 |     permissions:
20 |       contents: read
21 |       packages: write
22 | 
23 |     strategy:
24 |       fail-fast: false
25 |       matrix:
26 |         image-name:
27 |           - diffusers-pytorch-cpu
28 |           - diffusers-pytorch-cuda
29 |           - diffusers-flax-cpu
30 |           - diffusers-flax-tpu
31 |           - diffusers-onnxruntime-cpu
32 |           - diffusers-onnxruntime-cuda
33 | 
34 |     steps:
35 |       - name: Checkout repository
36 |         uses: actions/checkout@v3
37 | 
38 |       - name: Login to Docker Hub
39 |         uses: docker/login-action@v2
40 |         with:
41 |           username: ${{ env.REGISTRY }}
42 |           password: ${{ secrets.DOCKERHUB_TOKEN }}
43 | 
44 |       - name: Build and push
45 |         uses: docker/build-push-action@v3
46 |         with:
47 |           no-cache: true
48 |           context: ./docker/${{ matrix.image-name }}
49 |           push: true
50 |           tags: ${{ env.REGISTRY }}/${{ matrix.image-name }}:latest
51 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/euler.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # EulerDiscreteScheduler
14 | 
15 | The Euler scheduler (Algorithm 2) is from the [Elucidating the Design Space of Diffusion-Based Generative Models](https://huggingface.co/papers/2206.00364) paper by Karras et al. This is a fast scheduler which can often generate good outputs in 20-30 steps. The scheduler is based on the original [k-diffusion](https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L51) implementation by [Katherine Crowson](https://github.com/crowsonkb/).
16 | 
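The deterministic Euler update can be sketched on a scalar sample. The noise schedule, the starting value, and `toy_denoiser` are hypothetical; the denoiser stands in for a trained model by assuming all the data sits at a single point. This is an illustration of the first-order update, not the `EulerDiscreteScheduler` implementation.

```python
def euler_step(x, denoised, sigma, sigma_next):
    d = (x - denoised) / sigma           # slope of the sampling ODE at (x, sigma)
    return x + d * (sigma_next - sigma)  # first-order (Euler) update


def toy_denoiser(x, sigma, target=1.0):
    # Hypothetical ideal denoiser when all the data sits at a single point.
    return target


sigmas = [10.0, 5.0, 2.0, 1.0, 0.0]  # hypothetical noise schedule ending at 0
x = 7.5                              # hypothetical noisy starting sample
for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
    x = euler_step(x, toy_denoiser(x, sigma), sigma, sigma_next)

print(x)
```

With this toy denoiser, each step shrinks the distance to the target by the ratio `sigma_next / sigma`, so the final step to `sigma = 0` lands on the target up to floating-point error.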
17 | 
18 | ## EulerDiscreteScheduler
19 | [[autodoc]] EulerDiscreteScheduler
20 | 
21 | ## EulerDiscreteSchedulerOutput
22 | [[autodoc]] schedulers.scheduling_euler_discrete.EulerDiscreteSchedulerOutput


--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/new-model-addition.yml:
--------------------------------------------------------------------------------
 1 | name: "\U0001F31F New model/pipeline/scheduler addition"
 2 | description: Submit a proposal/request to implement a new diffusion model / pipeline / scheduler
 3 | labels: [ "New model/pipeline/scheduler" ]
 4 | 
 5 | body:
 6 |   - type: textarea
 7 |     id: description-request
 8 |     validations:
 9 |       required: true
10 |     attributes:
11 |       label: Model/Pipeline/Scheduler description
12 |       description: |
13 |         Put any and all important information relative to the model/pipeline/scheduler
14 | 
15 |   - type: checkboxes
16 |     id: information-tasks
17 |     attributes:
18 |       label: Open source status
19 |       description: |
20 |           Please note that if the model implementation isn't available or if the weights aren't open-source, we are less likely to implement it in `diffusers`.
21 |       options:
22 |         - label: "The model implementation is available"
23 |         - label: "The model weights are available (Only relevant if addition is not a scheduler)."
24 | 
25 |   - type: textarea
26 |     id: additional-info
27 |     attributes:
28 |       label: Provide useful links for the implementation
29 |       description: |
30 |         Please provide information regarding the implementation, the weights, and the authors.
31 |         Please mention the authors by @gh-username if you're aware of their usernames.
32 | 


--------------------------------------------------------------------------------
/docker/diffusers-onnxruntime-cpu/Dockerfile:
--------------------------------------------------------------------------------
 1 | FROM ubuntu:20.04
 2 | LABEL maintainer="Hugging Face"
 3 | LABEL repository="diffusers"
 4 | 
 5 | ENV DEBIAN_FRONTEND=noninteractive
 6 | 
 7 | RUN apt update && \
 8 |     apt install -y bash \
 9 |                    build-essential \
10 |                    git \
11 |                    git-lfs \
12 |                    curl \
13 |                    ca-certificates \
14 |                    libsndfile1-dev \
15 |                    python3.8 \
16 |                    python3-pip \
17 |                    python3.8-venv && \
18 |     rm -rf /var/lib/apt/lists
19 | 
20 | # make sure to use venv
21 | RUN python3 -m venv /opt/venv
22 | ENV PATH="/opt/venv/bin:$PATH"
23 | 
24 | # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
25 | RUN python3 -m pip install --no-cache-dir --upgrade pip && \
26 |     python3 -m pip install --no-cache-dir \
27 |         torch \
28 |         torchvision \
29 |         torchaudio \
30 |         onnxruntime \
31 |         --extra-index-url https://download.pytorch.org/whl/cpu && \
32 |     python3 -m pip install --no-cache-dir \
33 |         accelerate \
34 |         datasets \
35 |         hf-doc-builder \
36 |         huggingface-hub \
37 |         Jinja2 \
38 |         librosa \
39 |         numpy \
40 |         scipy \
41 |         tensorboard \
42 |         transformers
43 | 
44 | CMD ["/bin/bash"]


--------------------------------------------------------------------------------
/docs/source/ko/tutorials/tutorial_overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Overview
14 | 
 15 | Welcome to 🧨 Diffusers! If you're new to diffusion models and generative AI and want to learn more, you've come to the right place. These tutorials are designed to give you a gentle introduction to diffusion models and to help you understand the library fundamentals (the core components and how 🧨 Diffusers is meant to be used).
 16 | 
 17 | You'll learn how to use a pipeline for inference to generate things quickly, and then how to deconstruct that pipeline so you can use the library as a modular toolbox for building your own diffusion systems. In the next lesson, you'll learn how to train your own diffusion model to generate what you want.
 18 | 
 19 | After completing the tutorials, you'll have the skills you need to explore the library on your own and apply it to your own projects and applications.
 20 | 
 21 | Feel free to join the community on [Discord](https://discord.com/invite/JfAtkvEtRb) or the [forums](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63) to connect and collaborate with other users and developers!
 22 | 
 23 | Let's start diffusing! 🧨


--------------------------------------------------------------------------------
/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion_controlnet.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | # NOTE: This file is deprecated and will be removed in a future version.
 16 | # It only exists so that importing `FlaxStableDiffusionControlNetPipeline` from this module temporarily keeps working.
17 | 
18 | from ...utils import deprecate
19 | from ..controlnet.pipeline_flax_controlnet import FlaxStableDiffusionControlNetPipeline  # noqa: F401
20 | 
21 | 
22 | deprecate(
23 |     "stable diffusion controlnet",
24 |     "0.22.0",
 25 |     "Importing `FlaxStableDiffusionControlNetPipeline` from diffusers.pipelines.stable_diffusion.pipeline_flax_stable_diffusion_controlnet is deprecated. Please import `from diffusers import FlaxStableDiffusionControlNetPipeline` instead.",
26 |     standard_warn=False,
27 |     stacklevel=3,
28 | )
29 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/stable_diffusion_xl/watermark.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import torch
 3 | 
 4 | from ...utils import is_invisible_watermark_available
 5 | 
 6 | 
 7 | if is_invisible_watermark_available():
 8 |     from imwatermark import WatermarkEncoder
 9 | 
10 | 
11 | # Copied from https://github.com/Stability-AI/generative-models/blob/613af104c6b85184091d42d374fef420eddb356d/scripts/demo/streamlit_helpers.py#L66
12 | WATERMARK_MESSAGE = 0b101100111110110010010000011110111011000110011110
13 | # bin(x)[2:] gives bits of x as str, use int to convert them to 0/1
14 | WATERMARK_BITS = [int(bit) for bit in bin(WATERMARK_MESSAGE)[2:]]
15 | 
16 | 
17 | class StableDiffusionXLWatermarker:
18 |     def __init__(self):
19 |         self.watermark = WATERMARK_BITS
20 |         self.encoder = WatermarkEncoder()
21 | 
22 |         self.encoder.set_watermark("bits", self.watermark)
23 | 
24 |     def apply_watermark(self, images: torch.FloatTensor):
25 |         # can't encode images that are smaller than 256
26 |         if images.shape[-1] < 256:
27 |             return images
28 | 
29 |         images = (255 * (images / 2 + 0.5)).cpu().permute(0, 2, 3, 1).float().numpy()
30 | 
31 |         images = [self.encoder.encode(image, "dwtDct") for image in images]
32 | 
33 |         images = torch.from_numpy(np.array(images)).permute(0, 3, 1, 2)
34 | 
35 |         images = torch.clamp(2 * (images / 255 - 0.5), min=-1.0, max=1.0)
36 |         return images
37 | 


--------------------------------------------------------------------------------
/docker/diffusers-pytorch-cpu/Dockerfile:
--------------------------------------------------------------------------------
 1 | FROM ubuntu:20.04
 2 | LABEL maintainer="Hugging Face"
 3 | LABEL repository="diffusers"
 4 | 
 5 | ENV DEBIAN_FRONTEND=noninteractive
 6 | 
 7 | RUN apt update && \
 8 |     apt install -y bash \
 9 |                    build-essential \
10 |                    git \
11 |                    git-lfs \
12 |                    curl \
13 |                    ca-certificates \
14 |                    libsndfile1-dev \
15 |                    python3.8 \
16 |                    python3-pip \
17 |                    libgl1 \
18 |                    python3.8-venv && \
19 |     rm -rf /var/lib/apt/lists
20 | 
21 | # make sure to use venv
22 | RUN python3 -m venv /opt/venv
23 | ENV PATH="/opt/venv/bin:$PATH"
24 | 
25 | # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
26 | RUN python3 -m pip install --no-cache-dir --upgrade pip && \
27 |     python3 -m pip install --no-cache-dir \
28 |         torch \
29 |         torchvision \
30 |         torchaudio \
31 |         invisible_watermark \
32 |         --extra-index-url https://download.pytorch.org/whl/cpu && \
33 |     python3 -m pip install --no-cache-dir \
34 |         accelerate \
35 |         datasets \
36 |         hf-doc-builder \
37 |         huggingface-hub \
38 |         Jinja2 \
39 |         librosa \
40 |         numpy \
41 |         scipy \
42 |         tensorboard \
43 |         transformers
44 | 
45 | CMD ["/bin/bash"]
46 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/constants.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Inc. team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | import os
15 | 
16 | from huggingface_hub.constants import HUGGINGFACE_HUB_CACHE, hf_cache_home
17 | 
18 | 
19 | default_cache_path = HUGGINGFACE_HUB_CACHE
20 | 
21 | 
22 | CONFIG_NAME = "config.json"
23 | WEIGHTS_NAME = "diffusion_pytorch_model.bin"
24 | FLAX_WEIGHTS_NAME = "diffusion_flax_model.msgpack"
25 | ONNX_WEIGHTS_NAME = "model.onnx"
26 | SAFETENSORS_WEIGHTS_NAME = "diffusion_pytorch_model.safetensors"
27 | ONNX_EXTERNAL_WEIGHTS_NAME = "weights.pb"
28 | HUGGINGFACE_CO_RESOLVE_ENDPOINT = "https://huggingface.co"
29 | DIFFUSERS_CACHE = default_cache_path
30 | DIFFUSERS_DYNAMIC_MODULE_NAME = "diffusers_modules"
31 | HF_MODULES_CACHE = os.getenv("HF_MODULES_CACHE", os.path.join(hf_cache_home, "modules"))
32 | DEPRECATED_REVISION_ARGS = ["fp16", "non-ema"]
33 | 


--------------------------------------------------------------------------------
/docker/diffusers-onnxruntime-cuda/Dockerfile:
--------------------------------------------------------------------------------
 1 | FROM nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04
 2 | LABEL maintainer="Hugging Face"
 3 | LABEL repository="diffusers"
 4 | 
 5 | ENV DEBIAN_FRONTEND=noninteractive
 6 | 
 7 | RUN apt update && \
 8 |     apt install -y bash \
 9 |                    build-essential \
10 |                    git \
11 |                    git-lfs \
12 |                    curl \
13 |                    ca-certificates \
14 |                    libsndfile1-dev \
15 |                    python3.8 \
16 |                    python3-pip \
17 |                    python3.8-venv && \
18 |     rm -rf /var/lib/apt/lists
19 | 
20 | # make sure to use venv
21 | RUN python3 -m venv /opt/venv
22 | ENV PATH="/opt/venv/bin:$PATH"
23 | 
24 | # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
25 | RUN python3 -m pip install --no-cache-dir --upgrade pip && \
26 |     python3 -m pip install --no-cache-dir \
27 |         torch \
28 |         torchvision \
29 |         torchaudio \
30 |         "onnxruntime-gpu>=1.13.1" \
31 |         --extra-index-url https://download.pytorch.org/whl/cu117 && \
32 |     python3 -m pip install --no-cache-dir \
33 |         accelerate \
34 |         datasets \
35 |         hf-doc-builder \
36 |         huggingface-hub \
37 |         Jinja2 \
38 |         librosa \
39 |         numpy \
40 |         scipy \
41 |         tensorboard \
42 |         transformers
43 | 
44 | CMD ["/bin/bash"]


--------------------------------------------------------------------------------
/src/diffusers/pipelines/semantic_stable_diffusion/__init__.py:
--------------------------------------------------------------------------------
 1 | from dataclasses import dataclass
 2 | from enum import Enum
 3 | from typing import List, Optional, Union
 4 | 
 5 | import numpy as np
 6 | import PIL
 7 | from PIL import Image
 8 | 
 9 | from ...utils import BaseOutput, OptionalDependencyNotAvailable, is_torch_available, is_transformers_available
10 | 
11 | 
12 | @dataclass
13 | class SemanticStableDiffusionPipelineOutput(BaseOutput):
14 |     """
15 |     Output class for Stable Diffusion pipelines.
16 | 
17 |     Args:
18 |         images (`List[PIL.Image.Image]` or `np.ndarray`)
19 |             List of denoised PIL images of length `batch_size` or NumPy array of shape `(batch_size, height, width,
20 |             num_channels)`.
21 |         nsfw_content_detected (`List[bool]`)
22 |             List indicating whether the corresponding generated image contains “not-safe-for-work” (nsfw) content or
23 |             `None` if safety checking could not be performed.
24 |     """
25 | 
26 |     images: Union[List[PIL.Image.Image], np.ndarray]
27 |     nsfw_content_detected: Optional[List[bool]]
28 | 
29 | 
30 | try:
31 |     if not (is_transformers_available() and is_torch_available()):
32 |         raise OptionalDependencyNotAvailable()
33 | except OptionalDependencyNotAvailable:
34 |     from ...utils.dummy_torch_and_transformers_objects import *  # noqa F403
35 | else:
36 |     from .pipeline_semantic_stable_diffusion import SemanticStableDiffusionPipeline
37 | 


--------------------------------------------------------------------------------
/docs/source/en/api/attnprocessor.md:
--------------------------------------------------------------------------------
 1 | # Attention Processor
 2 | 
 3 | An attention processor is a class for applying different types of attention mechanisms.
 4 | 
 5 | ## AttnProcessor
 6 | [[autodoc]] models.attention_processor.AttnProcessor
 7 | 
 8 | ## AttnProcessor2_0
 9 | [[autodoc]] models.attention_processor.AttnProcessor2_0
10 | 
11 | ## LoRAAttnProcessor
12 | [[autodoc]] models.attention_processor.LoRAAttnProcessor
13 | 
14 | ## LoRAAttnProcessor2_0
15 | [[autodoc]] models.attention_processor.LoRAAttnProcessor2_0
16 | 
17 | ## CustomDiffusionAttnProcessor
18 | [[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor
19 | 
20 | ## AttnAddedKVProcessor
21 | [[autodoc]] models.attention_processor.AttnAddedKVProcessor
22 | 
23 | ## AttnAddedKVProcessor2_0
24 | [[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0
25 | 
26 | ## LoRAAttnAddedKVProcessor
27 | [[autodoc]] models.attention_processor.LoRAAttnAddedKVProcessor
28 | 
29 | ## XFormersAttnProcessor
30 | [[autodoc]] models.attention_processor.XFormersAttnProcessor
31 | 
32 | ## LoRAXFormersAttnProcessor
33 | [[autodoc]] models.attention_processor.LoRAXFormersAttnProcessor
34 | 
35 | ## CustomDiffusionXFormersAttnProcessor
36 | [[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor
37 | 
38 | ## SlicedAttnProcessor
39 | [[autodoc]] models.attention_processor.SlicedAttnProcessor
40 | 
41 | ## SlicedAttnAddedKVProcessor
42 | [[autodoc]] models.attention_processor.SlicedAttnAddedKVProcessor


--------------------------------------------------------------------------------
/docker/diffusers-flax-cpu/Dockerfile:
--------------------------------------------------------------------------------
 1 | FROM ubuntu:20.04
 2 | LABEL maintainer="Hugging Face"
 3 | LABEL repository="diffusers"
 4 | 
 5 | ENV DEBIAN_FRONTEND=noninteractive
 6 | 
 7 | RUN apt update && \
 8 |     apt install -y bash \
 9 |                    build-essential \
10 |                    git \
11 |                    git-lfs \
12 |                    curl \
13 |                    ca-certificates \
14 |                    libsndfile1-dev \
15 |                    python3.8 \
16 |                    python3-pip \
17 |                    python3.8-venv && \
18 |     rm -rf /var/lib/apt/lists
19 | 
20 | # make sure to use venv
21 | RUN python3 -m venv /opt/venv
22 | ENV PATH="/opt/venv/bin:$PATH"
23 | 
24 | # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
25 | # follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
26 | RUN python3 -m pip install --no-cache-dir --upgrade pip && \
27 |     python3 -m pip install --upgrade --no-cache-dir \
28 |         clu \
29 |         "jax[cpu]>=0.2.16,!=0.3.2" \
30 |         "flax>=0.4.1" \
31 |         "jaxlib>=0.1.65" && \
32 |     python3 -m pip install --no-cache-dir \
33 |         accelerate \
34 |         datasets \
35 |         hf-doc-builder \
36 |         huggingface-hub \
37 |         Jinja2 \
38 |         librosa \
39 |         numpy \
40 |         scipy \
41 |         tensorboard \
42 |         transformers
43 | 
44 | CMD ["/bin/bash"]


--------------------------------------------------------------------------------
/docker/diffusers-pytorch-cuda/Dockerfile:
--------------------------------------------------------------------------------
 1 | FROM nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04
 2 | LABEL maintainer="Hugging Face"
 3 | LABEL repository="diffusers"
 4 | 
 5 | ENV DEBIAN_FRONTEND=noninteractive
 6 | 
 7 | RUN apt update && \
 8 |     apt install -y bash \
 9 |                    build-essential \
10 |                    git \
11 |                    git-lfs \
12 |                    curl \
13 |                    ca-certificates \
14 |                    libsndfile1-dev \
15 |                    libgl1 \
16 |                    python3.8 \
17 |                    python3-pip \
18 |                    python3.8-venv && \
19 |     rm -rf /var/lib/apt/lists
20 | 
21 | # make sure to use venv
22 | RUN python3 -m venv /opt/venv
23 | ENV PATH="/opt/venv/bin:$PATH"
24 | 
25 | # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
26 | RUN python3 -m pip install --no-cache-dir --upgrade pip && \
27 |     python3 -m pip install --no-cache-dir \
28 |         torch \
29 |         torchvision \
30 |         torchaudio \
31 |         invisible_watermark && \
32 |     python3 -m pip install --no-cache-dir \
33 |         accelerate \
34 |         datasets \
35 |         hf-doc-builder \
36 |         huggingface-hub \
37 |         Jinja2 \
38 |         librosa \
39 |         numpy \
40 |         scipy \
41 |         tensorboard \
42 |         transformers \
43 |         omegaconf \
44 |         pytorch-lightning \
45 |         xformers
46 | 
47 | CMD ["/bin/bash"]
48 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | # NOTE: This file is deprecated and will be removed in a future version.
16 | # It only exists so that `from diffusers.pipelines import DiffusionPipeline` temporarily keeps working
17 | from ...utils import deprecate
18 | from ..controlnet.multicontrolnet import MultiControlNetModel  # noqa: F401
19 | from ..controlnet.pipeline_controlnet import StableDiffusionControlNetPipeline  # noqa: F401
20 | 
21 | 
22 | deprecate(
23 |     "stable diffusion controlnet",
24 |     "0.22.0",
25 |     "Importing `StableDiffusionControlNetPipeline` or `MultiControlNetModel` from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_controlnet is deprecated. Please import `from diffusers import StableDiffusionControlNetPipeline` instead.",
26 |     standard_warn=False,
27 |     stacklevel=3,
28 | )
29 | 


--------------------------------------------------------------------------------
/src/diffusers/pipelines/stable_diffusion_xl/__init__.py:
--------------------------------------------------------------------------------
 1 | from dataclasses import dataclass
 2 | from typing import List, Optional, Union
 3 | 
 4 | import numpy as np
 5 | import PIL
 6 | 
 7 | from ...utils import (
 8 |     BaseOutput,
 9 |     OptionalDependencyNotAvailable,
10 |     is_torch_available,
11 |     is_transformers_available,
12 | )
13 | 
14 | 
15 | @dataclass
16 | class StableDiffusionXLPipelineOutput(BaseOutput):
17 |     """
18 |     Output class for Stable Diffusion XL pipelines.
19 | 
20 |     Args:
21 |         images (`List[PIL.Image.Image]` or `np.ndarray`)
22 |             List of denoised PIL images of length `batch_size` or NumPy array of shape `(batch_size, height, width,
23 |             num_channels)`. The PIL images or NumPy array represent the denoised images of the diffusion pipeline.
24 |     """
25 | 
26 |     images: Union[List[PIL.Image.Image], np.ndarray]
27 | 
28 | 
29 | try:
30 |     if not (is_transformers_available() and is_torch_available()):
31 |         raise OptionalDependencyNotAvailable()
32 | except OptionalDependencyNotAvailable:
33 |     from ...utils.dummy_torch_and_transformers_objects import *  # noqa F403
34 | else:
35 |     from .pipeline_stable_diffusion_xl import StableDiffusionXLPipeline
36 |     from .pipeline_stable_diffusion_xl_img2img import StableDiffusionXLImg2ImgPipeline
37 |     from .pipeline_stable_diffusion_xl_inpaint import StableDiffusionXLInpaintPipeline
38 |     from .pipeline_stable_diffusion_xl_instruct_pix2pix import StableDiffusionXLInstructPix2PixPipeline
39 | 
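
The `try` / `except` / `else` guard used in the `__init__.py` files above can be sketched in isolation. This is a minimal illustration, not the library's implementation: `is_available` and `pick_backend` are hypothetical helpers, and the exception class is a local stand-in for the one exported by `diffusers.utils`.

```python
import importlib.util


class OptionalDependencyNotAvailable(Exception):
    """Local stand-in for the exception of the same name in diffusers.utils."""


def is_available(module_name: str) -> bool:
    # Hypothetical helper: find_spec returns None when a top-level module
    # cannot be located on the import path.
    return importlib.util.find_spec(module_name) is not None


def pick_backend(required_modules):
    # Same control flow as the guard above: raise inside `try` when a
    # dependency is missing, fall back in `except`, use the real objects in `else`.
    try:
        if not all(is_available(name) for name in required_modules):
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        return "dummy"  # where the real files import placeholder objects
    else:
        return "real"  # where the real files import the actual pipelines
```

Raising and catching the same exception looks redundant here, but it keeps the fallback import and the real import in clearly separated branches.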


--------------------------------------------------------------------------------
/scripts/convert_unclip_txt2img_to_image_variation.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | 
 3 | from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection
 4 | 
 5 | from diffusers import UnCLIPImageVariationPipeline, UnCLIPPipeline
 6 | 
 7 | 
 8 | if __name__ == "__main__":
 9 |     parser = argparse.ArgumentParser()
10 | 
11 |     parser.add_argument("--dump_path", default=None, type=str, required=True, help="Path to the output model.")
12 | 
13 |     parser.add_argument(
14 |         "--txt2img_unclip",
15 |         default="kakaobrain/karlo-v1-alpha",
16 |         type=str,
17 |         required=False,
18 |         help="The pretrained txt2img unclip.",
19 |     )
20 | 
21 |     args = parser.parse_args()
22 | 
23 |     txt2img = UnCLIPPipeline.from_pretrained(args.txt2img_unclip)
24 | 
25 |     feature_extractor = CLIPImageProcessor()
26 |     image_encoder = CLIPVisionModelWithProjection.from_pretrained("openai/clip-vit-large-patch14")
27 | 
28 |     img2img = UnCLIPImageVariationPipeline(
29 |         decoder=txt2img.decoder,
30 |         text_encoder=txt2img.text_encoder,
31 |         tokenizer=txt2img.tokenizer,
32 |         text_proj=txt2img.text_proj,
33 |         feature_extractor=feature_extractor,
34 |         image_encoder=image_encoder,
35 |         super_res_first=txt2img.super_res_first,
36 |         super_res_last=txt2img.super_res_last,
37 |         decoder_scheduler=txt2img.decoder_scheduler,
38 |         super_res_scheduler=txt2img.super_res_scheduler,
39 |     )
40 | 
41 |     img2img.save_pretrained(args.dump_path)
42 | 


--------------------------------------------------------------------------------
/src/diffusers/commands/diffusers_cli.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 3 | #
 4 | # Licensed under the Apache License, Version 2.0 (the "License");
 5 | # you may not use this file except in compliance with the License.
 6 | # You may obtain a copy of the License at
 7 | #
 8 | #     http://www.apache.org/licenses/LICENSE-2.0
 9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | 
16 | from argparse import ArgumentParser
17 | 
18 | from .env import EnvironmentCommand
19 | from .fp16_safetensors import FP16SafetensorsCommand
20 | 
21 | 
22 | def main():
23 |     parser = ArgumentParser("Diffusers CLI tool", usage="diffusers-cli <command> [<args>]")
24 |     commands_parser = parser.add_subparsers(help="diffusers-cli command helpers")
25 | 
26 |     # Register commands
27 |     EnvironmentCommand.register_subcommand(commands_parser)
28 |     FP16SafetensorsCommand.register_subcommand(commands_parser)
29 | 
30 |     # Let's go
31 |     args = parser.parse_args()
32 | 
33 |     if not hasattr(args, "func"):
34 |         parser.print_help()
35 |         exit(1)
36 | 
37 |     # Run
38 |     service = args.func(args)
39 |     service.run()
40 | 
41 | 
42 | if __name__ == "__main__":
43 |     main()
44 | 
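
The registration pattern in `main()` above expects each command class to expose a `register_subcommand` static method that attaches a `func` factory to its sub-parser. The sketch below shows one way such a command could look. `HelloCommand`, its `--name` flag, and the `demo-cli` name are hypothetical and are not part of diffusers; `main` here also accepts an `argv` list and returns a value so it can be called directly.

```python
from argparse import ArgumentParser


class HelloCommand:
    """Hypothetical command used only to illustrate the registration pattern."""

    @staticmethod
    def register_subcommand(subparsers):
        hello_parser = subparsers.add_parser("hello")
        hello_parser.add_argument("--name", default="world")
        # `func` is the attribute main() looks for; it builds the command object.
        hello_parser.set_defaults(func=lambda args: HelloCommand(args.name))

    def __init__(self, name):
        self.name = name

    def run(self):
        return f"hello, {self.name}"


def main(argv=None):
    parser = ArgumentParser("Demo CLI tool", usage="demo-cli <command> [<args>]")
    commands_parser = parser.add_subparsers(help="demo-cli command helpers")

    # Register commands
    HelloCommand.register_subcommand(commands_parser)

    args = parser.parse_args(argv)
    if not hasattr(args, "func"):
        # No sub-command was given, so there is nothing to run.
        parser.print_help()
        return None

    service = args.func(args)
    return service.run()
```

Calling `main(["hello", "--name", "diffusers"])` builds a `HelloCommand` and runs it, while `main([])` prints the help text because no `func` default was set.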


--------------------------------------------------------------------------------
/src/diffusers/utils/doc_utils.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | """
15 | Doc utilities: Utilities related to documentation
16 | """
17 | import re
18 | 
19 | 
20 | def replace_example_docstring(example_docstring):
21 |     def docstring_decorator(fn):
22 |         func_doc = fn.__doc__
23 |         lines = func_doc.split("\n")
24 |         i = 0
25 |         while i < len(lines) and re.search(r"^\s*Examples?:\s*$", lines[i]) is None:
26 |             i += 1
27 |         if i < len(lines):
28 |             lines[i] = example_docstring
29 |             func_doc = "\n".join(lines)
30 |         else:
31 |             raise ValueError(
32 |                 f"The function {fn} should have an empty 'Examples:' in its docstring as placeholder, "
33 |                 f"current docstring is:\n{func_doc}"
34 |             )
35 |         fn.__doc__ = func_doc
36 |         return fn
37 | 
38 |     return docstring_decorator
39 | 
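
A short usage sketch for the decorator above: it swaps the empty `Examples:` placeholder line in a docstring for a shared example string. The decorator body is repeated so the block runs on its own; `EXAMPLE_DOC_STRING` and `generate` are hypothetical names chosen for illustration.

```python
import re


def replace_example_docstring(example_docstring):
    def docstring_decorator(fn):
        func_doc = fn.__doc__
        lines = func_doc.split("\n")
        i = 0
        # Find the first line that consists only of "Example:" or "Examples:".
        while i < len(lines) and re.search(r"^\s*Examples?:\s*$", lines[i]) is None:
            i += 1
        if i < len(lines):
            lines[i] = example_docstring
            func_doc = "\n".join(lines)
        else:
            raise ValueError(
                f"The function {fn} should have an empty 'Examples:' in its docstring as placeholder, "
                f"current docstring is:\n{func_doc}"
            )
        fn.__doc__ = func_doc
        return fn

    return docstring_decorator


# Hypothetical example text shared across functions.
EXAMPLE_DOC_STRING = """    Examples:
        >>> generate("a photo of a cat")
"""


@replace_example_docstring(EXAMPLE_DOC_STRING)
def generate(prompt):
    """
    Hypothetical function that stands in for a pipeline `__call__`.

    Examples:

    Returns:
        `str`: The upper-cased prompt.
    """
    return prompt.upper()
```

After decoration, `generate.__doc__` contains the example call, and a function whose docstring lacks the placeholder line raises `ValueError` at decoration time.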


--------------------------------------------------------------------------------
/docs/source/en/api/models/vq.md:
--------------------------------------------------------------------------------
 1 | # VQModel
 2 | 
 3 | The VQ-VAE model was introduced in [Neural Discrete Representation Learning](https://huggingface.co/papers/1711.00937) by Aaron van den Oord, Oriol Vinyals and Koray Kavukcuoglu. The model is used in 🤗 Diffusers to decode latent representations into images. Unlike [`AutoencoderKL`], the [`VQModel`] works in a quantized latent space.
 4 | 
 5 | The abstract from the paper is:
 6 | 
 7 | *Learning useful representations without supervision remains a key challenge in machine learning. In this paper, we propose a simple yet powerful generative model that learns such discrete representations. Our model, the Vector Quantised-Variational AutoEncoder (VQ-VAE), differs from VAEs in two key ways: the encoder network outputs discrete, rather than continuous, codes; and the prior is learnt rather than static. In order to learn a discrete latent representation, we incorporate ideas from vector quantisation (VQ). Using the VQ method allows the model to circumvent issues of "posterior collapse" -- where the latents are ignored when they are paired with a powerful autoregressive decoder -- typically observed in the VAE framework. Pairing these representations with an autoregressive prior, the model can generate high quality images, videos, and speech as well as doing high quality speaker conversion and unsupervised learning of phonemes, providing further evidence of the utility of the learnt representations.*
 8 | 
 9 | ## VQModel
10 | 
11 | [[autodoc]] VQModel
12 | 
13 | ## VQEncoderOutput
14 | 
15 | [[autodoc]] models.vq_model.VQEncoderOutput


--------------------------------------------------------------------------------
/.github/workflows/pr_quality.yml:
--------------------------------------------------------------------------------
 1 | name: Run code quality checks
 2 | 
 3 | on:
 4 |   pull_request:
 5 |     branches:
 6 |       - main
 7 |   push:
 8 |     branches:
 9 |       - main
10 | 
11 | concurrency:
12 |   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
13 |   cancel-in-progress: true
14 | 
15 | jobs:
16 |   check_code_quality:
17 |     runs-on: ubuntu-latest
18 |     steps:
19 |       - uses: actions/checkout@v3
20 |       - name: Set up Python
21 |         uses: actions/setup-python@v4
22 |         with:
23 |           python-version: "3.7"
24 |       - name: Install dependencies
25 |         run: |
26 |           python -m pip install --upgrade pip
27 |           pip install .[quality]
28 |       - name: Check quality
29 |         run: |
30 |           black --check examples tests src utils scripts
31 |           ruff examples tests src utils scripts
32 |           doc-builder style src/diffusers docs/source --max_len 119 --check_only --path_to_docs docs/source
33 | 
34 |   check_repository_consistency:
35 |     runs-on: ubuntu-latest
36 |     steps:
37 |       - uses: actions/checkout@v3
38 |       - name: Set up Python
39 |         uses: actions/setup-python@v4
40 |         with:
41 |           python-version: "3.7"
42 |       - name: Install dependencies
43 |         run: |
44 |           python -m pip install --upgrade pip
45 |           pip install .[quality]
46 |       - name: Check quality
47 |         run: |
48 |           python utils/check_copies.py
49 |           python utils/check_dummies.py
50 |           make deps_table_check_updated
51 | 


--------------------------------------------------------------------------------
/docs/source/en/optimization/opt_overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Overview
14 | 
15 | Generating high-quality outputs is computationally intensive, especially during each iterative step where you go from a noisy output to a less noisy output. One of 🧨 Diffusers' goals is to make this technology widely accessible to everyone, which includes enabling fast inference on consumer and specialized hardware.
16 | 
17 | This section covers tips and tricks - like half-precision weights and sliced attention - for optimizing inference speed and reducing memory consumption. You can also learn how to speed up your PyTorch code with [`torch.compile`](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html) or [ONNX Runtime](https://onnxruntime.ai/docs/), and enable memory-efficient attention with [xFormers](https://facebookresearch.github.io/xformers/). There are also guides for running inference on specific hardware like Apple Silicon, and Intel or Habana processors.


--------------------------------------------------------------------------------
/utils/print_env.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python3
 2 | 
 3 | # coding=utf-8
 4 | # Copyright 2023 The HuggingFace Inc. team.
 5 | #
 6 | # Licensed under the Apache License, Version 2.0 (the "License");
 7 | # you may not use this file except in compliance with the License.
 8 | # You may obtain a copy of the License at
 9 | #
10 | #     http://www.apache.org/licenses/LICENSE-2.0
11 | #
12 | # Unless required by applicable law or agreed to in writing, software
13 | # distributed under the License is distributed on an "AS IS" BASIS,
14 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15 | # See the License for the specific language governing permissions and
16 | # limitations under the License.
17 | 
18 | # this script dumps information about the environment
19 | 
20 | import os
21 | import platform
22 | import sys
23 | 
24 | 
25 | os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
26 | 
27 | print("Python version:", sys.version)
28 | 
29 | print("OS platform:", platform.platform())
30 | print("OS architecture:", platform.machine())
31 | 
32 | try:
33 |     import torch
34 | 
35 |     print("Torch version:", torch.__version__)
36 |     print("Cuda available:", torch.cuda.is_available())
37 |     print("Cuda version:", torch.version.cuda)
38 |     print("CuDNN version:", torch.backends.cudnn.version())
39 |     print("Number of GPUs available:", torch.cuda.device_count())
40 | except ImportError:
41 |     print("Torch version:", None)
42 | 
43 | try:
44 |     import transformers
45 | 
46 |     print("transformers version:", transformers.__version__)
47 | except ImportError:
48 |     print("transformers version:", None)
49 | 


--------------------------------------------------------------------------------
/docs/source/en/api/models/prior_transformer.md:
--------------------------------------------------------------------------------
 1 | # Prior Transformer
 2 | 
 3 | The Prior Transformer was originally introduced in
 4 | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://huggingface.co/papers/2204.06125) by Ramesh et al. It is used to predict CLIP image embeddings from CLIP text embeddings; image embeddings are predicted through a denoising diffusion process.
 5 | 
 6 | The abstract from the paper is:
 7 | 
 8 | *Contrastive models like CLIP have been shown to learn robust representations of images that capture both semantics and style. To leverage these representations for image generation, we propose a two-stage model: a prior that generates a CLIP image embedding given a text caption, and a decoder that generates an image conditioned on the image embedding. We show that explicitly generating image representations improves image diversity with minimal loss in photorealism and caption similarity. Our decoders conditioned on image representations can also produce variations of an image that preserve both its semantics and style, while varying the non-essential details absent from the image representation. Moreover, the joint embedding space of CLIP enables language-guided image manipulations in a zero-shot fashion. We use diffusion models for the decoder and experiment with both autoregressive and diffusion models for the prior, finding that the latter are computationally more efficient and produce higher-quality samples.*
 9 | 
10 | ## PriorTransformer
11 | 
12 | [[autodoc]] PriorTransformer
13 | 
14 | ## PriorTransformerOutput
15 | 
16 | [[autodoc]] models.prior_transformer.PriorTransformerOutput


--------------------------------------------------------------------------------
/docker/diffusers-flax-tpu/Dockerfile:
--------------------------------------------------------------------------------
 1 | FROM ubuntu:20.04
 2 | LABEL maintainer="Hugging Face"
 3 | LABEL repository="diffusers"
 4 | 
 5 | ENV DEBIAN_FRONTEND=noninteractive
 6 | 
 7 | RUN apt update && \
 8 |     apt install -y bash \
 9 |                    build-essential \
10 |                    git \
11 |                    git-lfs \
12 |                    curl \
13 |                    ca-certificates \
14 |                    libsndfile1-dev \
15 |                    python3.8 \
16 |                    python3-pip \
17 |                    python3.8-venv && \
18 |     rm -rf /var/lib/apt/lists
19 | 
20 | # make sure to use venv
21 | RUN python3 -m venv /opt/venv
22 | ENV PATH="/opt/venv/bin:$PATH"
23 | 
24 | # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
25 | # follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
26 | RUN python3 -m pip install --no-cache-dir --upgrade pip && \
27 |     python3 -m pip install --no-cache-dir \
28 |         "jax[tpu]>=0.2.16,!=0.3.2" \
29 |         -f https://storage.googleapis.com/jax-releases/libtpu_releases.html && \
30 |     python3 -m pip install --upgrade --no-cache-dir \
31 |         clu \
32 |         "flax>=0.4.1" \
33 |         "jaxlib>=0.1.65" && \
34 |     python3 -m pip install --no-cache-dir \
35 |         accelerate \
36 |         datasets \
37 |         hf-doc-builder \
38 |         huggingface-hub \
39 |         Jinja2 \
40 |         librosa \
41 |         numpy \
42 |         scipy \
43 |         tensorboard \
44 |         transformers
45 | 
46 | CMD ["/bin/bash"]


--------------------------------------------------------------------------------
/docs/source/ko/optimization/xformers.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Installing xFormers
14 | 
15 | We recommend using [xFormers](https://github.com/facebookresearch/xformers) for both inference and training.
16 | In our own tests, the optimizations performed in the attention blocks gave both faster speed and lower memory consumption.
17 | 
18 | Starting with xFormers version `0.0.16`, released in January 2023, installation is easy using pre-built pip wheels:
19 | 
20 | ```bash
21 | pip install xformers
22 | ```
23 | 
24 | <Tip>
25 | 
26 | The xFormers pip package requires the latest version of PyTorch (1.13.1 as of xFormers 0.0.16). If you need to use an earlier version of PyTorch, we recommend installing xFormers from source by following the [project instructions](https://github.com/facebookresearch/xformers#installing-xformers).
27 | 
28 | </Tip>
29 | 
30 | After xFormers is installed, you can use `enable_xformers_memory_efficient_attention()` to speed up inference and reduce memory consumption, as described [here](fp16#memory-efficient-attention).
31 | 
32 | <Tip warning={true}>
33 | 
34 | According to [this issue](https://github.com/huggingface/diffusers/issues/2234#issuecomment-1416931212), xFormers `v0.0.16` cannot be used for training (fine-tuning or Dreambooth) on GPU. If you run into this problem, install the development version as described in that comment.
35 | 
36 | </Tip>
37 | 


--------------------------------------------------------------------------------
/utils/get_modified_files.py:
--------------------------------------------------------------------------------
 1 | # coding=utf-8
 2 | # Copyright 2023 The HuggingFace Inc. team.
 3 | #
 4 | # Licensed under the Apache License, Version 2.0 (the "License");
 5 | # you may not use this file except in compliance with the License.
 6 | # You may obtain a copy of the License at
 7 | #
 8 | #     http://www.apache.org/licenses/LICENSE-2.0
 9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | 
16 | # this script reports modified .py files under the desired list of top-level sub-dirs passed as a list of arguments, e.g.:
17 | #   python ./utils/get_modified_files.py utils src tests examples
18 | #
19 | # it uses git to find the forking point and which files were modified - i.e. files not under git won't be considered
20 | # since the output of this script is fed into Makefile commands it doesn't print a newline after the results
21 | 
22 | import re
23 | import subprocess
24 | import sys
25 | 
26 | 
27 | fork_point_sha = subprocess.check_output("git merge-base main HEAD".split()).decode("utf-8")
28 | modified_files = subprocess.check_output(f"git diff --name-only {fork_point_sha}".split()).decode("utf-8").split()
29 | 
30 | joined_dirs = "|".join(sys.argv[1:])
31 | regex = re.compile(rf"^({joined_dirs}).*?\.py$")
32 | 
33 | relevant_modified_files = [x for x in modified_files if regex.match(x)]
34 | print(" ".join(relevant_modified_files), end="")
35 | 
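
The filtering step at the end of the script above can be tried without git by passing a file list directly. In this sketch, `filter_modified` is a hypothetical wrapper around the same regular expression and the file paths are made up for illustration.

```python
import re


def filter_modified(modified_files, top_level_dirs):
    # Same pattern as the script: path starts with one of the given
    # directory names and ends in ".py".
    joined_dirs = "|".join(top_level_dirs)
    regex = re.compile(rf"^({joined_dirs}).*?\.py$")
    return [x for x in modified_files if regex.match(x)]


# Made-up list standing in for the output of `git diff --name-only`.
files = [
    "src/diffusers/utils/doc_utils.py",
    "docs/source/en/index.md",
    "tests/test_config.py",
    "setup.py",
    "utils/print_env.py",
]

selected = filter_modified(files, ["utils", "src"])
print(" ".join(selected), end="")
```

Note that the pattern is a plain prefix match, so a directory name such as `utils` would also select paths under a sibling directory whose name merely starts with `utils`.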


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/dance_diffusion.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Dance Diffusion
14 | 
15 | [Dance Diffusion](https://github.com/Harmonai-org/sample-generator) is by Zach Evans.
16 | 
17 | Dance Diffusion is the first in a suite of generative audio tools for producers and musicians released by [Harmonai](https://github.com/Harmonai-org).
18 | 
19 | The original codebase of this implementation can be found at [Harmonai-org](https://github.com/Harmonai-org/sample-generator).
20 | 
21 | <Tip>
22 | 
23 | Make sure to check out the Schedulers [guide](/using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](/using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
24 | 
25 | </Tip>
26 | 
27 | ## DanceDiffusionPipeline
28 | [[autodoc]] DanceDiffusionPipeline
29 | 	- all
30 | 	- __call__
31 | 
32 | ## AudioPipelineOutput
33 | [[autodoc]] pipelines.AudioPipelineOutput


--------------------------------------------------------------------------------
/docs/source/en/using-diffusers/loading_overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Overview
14 | 
15 | 🧨 Diffusers offers many pipelines, models, and schedulers for generative tasks. To make loading these components as simple as possible, we provide a single and unified method - `from_pretrained()` - that loads any of these components from either the Hugging Face [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) or your local machine. Whenever you load a pipeline or model, the latest files are automatically downloaded and cached so you can quickly reuse them next time without redownloading the files.
16 | 
17 | This section will show you everything you need to know about loading pipelines, how to load different components in a pipeline, how to load checkpoint variants, and how to load community pipelines. You'll also learn how to load schedulers and compare the speed and quality trade-offs of using different schedulers. Finally, you'll see how to convert and load KerasCV checkpoints so you can use them in PyTorch with 🧨 Diffusers.


--------------------------------------------------------------------------------
/src/diffusers/pipelines/alt_diffusion/__init__.py:
--------------------------------------------------------------------------------
 1 | from dataclasses import dataclass
 2 | from typing import List, Optional, Union
 3 | 
 4 | import numpy as np
 5 | import PIL
 6 | from PIL import Image
 7 | 
 8 | from ...utils import BaseOutput, OptionalDependencyNotAvailable, is_torch_available, is_transformers_available
 9 | 
10 | 
11 | @dataclass
12 | # Copied from diffusers.pipelines.stable_diffusion.__init__.StableDiffusionPipelineOutput with Stable->Alt
13 | class AltDiffusionPipelineOutput(BaseOutput):
14 |     """
15 |     Output class for Alt Diffusion pipelines.
16 | 
17 |     Args:
18 |         images (`List[PIL.Image.Image]` or `np.ndarray`)
19 |             List of denoised PIL images of length `batch_size` or NumPy array of shape `(batch_size, height, width,
20 |             num_channels)`.
21 |         nsfw_content_detected (`List[bool]`)
22 |             List indicating whether the corresponding generated image contains "not-safe-for-work" (nsfw) content or
23 |             `None` if safety checking could not be performed.
24 |     """
25 | 
26 |     images: Union[List[PIL.Image.Image], np.ndarray]
27 |     nsfw_content_detected: Optional[List[bool]]
28 | 
29 | 
30 | try:
31 |     if not (is_transformers_available() and is_torch_available()):
32 |         raise OptionalDependencyNotAvailable()
33 | except OptionalDependencyNotAvailable:
34 |     from ...utils.dummy_torch_and_transformers_objects import AltDiffusionImg2ImgPipeline, AltDiffusionPipeline
35 | else:
36 |     from .modeling_roberta_series import RobertaSeriesModelWithTransformation
37 |     from .pipeline_alt_diffusion import AltDiffusionPipeline
38 |     from .pipeline_alt_diffusion_img2img import AltDiffusionImg2ImgPipeline
39 | 


--------------------------------------------------------------------------------
/src/diffusers/dependency_versions_table.py:
--------------------------------------------------------------------------------
 1 | # THIS FILE HAS BEEN AUTOGENERATED. To update:
 2 | # 1. modify the `_deps` dict in setup.py
 3 | # 2. run `make deps_table_update`
 4 | deps = {
 5 |     "Pillow": "Pillow",
 6 |     "accelerate": "accelerate>=0.11.0",
 7 |     "compel": "compel==0.1.8",
 8 |     "black": "black~=23.1",
 9 |     "datasets": "datasets",
10 |     "filelock": "filelock",
11 |     "flax": "flax>=0.4.1",
12 |     "hf-doc-builder": "hf-doc-builder>=0.3.0",
13 |     "huggingface-hub": "huggingface-hub>=0.13.2",
14 |     "requests-mock": "requests-mock==1.10.0",
15 |     "importlib_metadata": "importlib_metadata",
16 |     "invisible-watermark": "invisible-watermark>=0.2.0",
17 |     "isort": "isort>=5.5.4",
18 |     "jax": "jax>=0.2.8,!=0.3.2",
19 |     "jaxlib": "jaxlib>=0.1.65",
20 |     "Jinja2": "Jinja2",
21 |     "k-diffusion": "k-diffusion>=0.0.12",
22 |     "torchsde": "torchsde",
23 |     "note_seq": "note_seq",
24 |     "librosa": "librosa",
25 |     "numpy": "numpy",
26 |     "omegaconf": "omegaconf",
27 |     "parameterized": "parameterized",
28 |     "protobuf": "protobuf>=3.20.3,<4",
29 |     "pytest": "pytest",
30 |     "pytest-timeout": "pytest-timeout",
31 |     "pytest-xdist": "pytest-xdist",
32 |     "ruff": "ruff==0.0.280",
33 |     "safetensors": "safetensors>=0.3.1",
34 |     "sentencepiece": "sentencepiece>=0.1.91,!=0.1.92",
35 |     "scipy": "scipy",
36 |     "onnx": "onnx",
37 |     "regex": "regex!=2019.12.17",
38 |     "requests": "requests",
39 |     "tensorboard": "tensorboard",
40 |     "torch": "torch>=1.4",
41 |     "torchvision": "torchvision",
42 |     "transformers": "transformers>=4.25.1",
43 |     "urllib3": "urllib3<=2.0.0",
44 | }
45 | 


--------------------------------------------------------------------------------
/docs/source/en/api/diffusion_pipeline.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Pipelines
14 | 
15 | The [`DiffusionPipeline`] is the quickest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) for inference.
16 | 
17 | <Tip>
18 | 
19 | You shouldn't use the [`DiffusionPipeline`] class for training or finetuning a diffusion model. Individual 
20 | components (for example, [`UNet2DModel`] and [`UNet2DConditionModel`]) of diffusion pipelines are usually trained individually, so we suggest directly working with them instead.
21 | 
22 | </Tip>
23 | 
24 | The pipeline type (for example [`StableDiffusionPipeline`]) of any diffusion pipeline loaded with [`~DiffusionPipeline.from_pretrained`] is automatically 
25 | detected and pipeline components are loaded and passed to the `__init__` function of the pipeline.
26 | 
27 | Any pipeline object can be saved locally with [`~DiffusionPipeline.save_pretrained`].
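
The automatic detection described above can be pictured as a lookup from a class name stored in the checkpoint's configuration to a registered pipeline class. The following is a minimal sketch of that idea only, not the library's implementation: `BasePipeline`, `ToyTextToImagePipeline`, `from_config_text`, and the string "components" are all hypothetical stand-ins, and the only detail taken from 🤗 Diffusers is that the configuration carries a `_class_name` entry.

```python
import json


class BasePipeline:
    """Hypothetical base class that keeps a registry of its subclasses by name."""

    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Every subclass registers itself under its own class name.
        BasePipeline.registry[cls.__name__] = cls

    def __init__(self, **components):
        self.components = components

    @classmethod
    def from_config_text(cls, config_text):
        """Pick the concrete pipeline class named in the config and build it."""
        config = json.loads(config_text)
        pipeline_cls = BasePipeline.registry[config["_class_name"]]
        # Keys starting with "_" are treated as metadata; the rest are components
        # that get passed to the concrete pipeline's __init__.
        components = {k: v for k, v in config.items() if not k.startswith("_")}
        return pipeline_cls(**components)


class ToyTextToImagePipeline(BasePipeline):
    """Hypothetical pipeline type with two components."""

    def __init__(self, unet, scheduler):
        super().__init__(unet=unet, scheduler=scheduler)


# A toy configuration; real checkpoints describe components in more detail.
config_text = json.dumps(
    {"_class_name": "ToyTextToImagePipeline", "unet": "toy-unet", "scheduler": "toy-scheduler"}
)

pipe = BasePipeline.from_config_text(config_text)
print(type(pipe).__name__)
print(pipe.components)
```

Calling the loader on the base class still returns an instance of the specific subclass, which mirrors how a generic entry point can hand back a task-specific pipeline.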
28 | 
29 | ## DiffusionPipeline
30 | 
31 | [[autodoc]] DiffusionPipeline
32 | 	- all
33 | 	- __call__
34 | 	- device
35 | 	- to
36 | 	- components
37 | 


--------------------------------------------------------------------------------
/docs/source/en/api/models/autoencoder_tiny.md:
--------------------------------------------------------------------------------
 1 | # Tiny AutoEncoder
 2 | 
 3 | Tiny AutoEncoder for Stable Diffusion (TAESD) was introduced in [madebyollin/taesd](https://github.com/madebyollin/taesd) by Ollin Boer Bohan. It is a tiny distilled version of Stable Diffusion's VAE that can decode the latents in a [`StableDiffusionPipeline`] or [`StableDiffusionXLPipeline`] almost instantly.
 4 | 
 5 | To use with Stable Diffusion v2.1:
 6 | 
 7 | ```python
 8 | import torch
 9 | from diffusers import DiffusionPipeline, AutoencoderTiny
10 | 
11 | pipe = DiffusionPipeline.from_pretrained(
12 |     "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
13 | )
14 | pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16)
15 | pipe = pipe.to("cuda")
16 | 
17 | prompt = "slice of delicious New York-style berry cheesecake"
18 | image = pipe(prompt, num_inference_steps=25).images[0]
19 | image.save("cheesecake.png")
20 | ```
21 | 
22 | To use with Stable Diffusion XL 1.0:
23 | 
24 | ```python
25 | import torch
26 | from diffusers import DiffusionPipeline, AutoencoderTiny
27 | 
28 | pipe = DiffusionPipeline.from_pretrained(
29 |     "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
30 | )
31 | pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesdxl", torch_dtype=torch.float16)
32 | pipe = pipe.to("cuda")
33 | 
34 | prompt = "slice of delicious New York-style berry cheesecake"
35 | image = pipe(prompt, num_inference_steps=25).images[0]
36 | image.save("cheesecake_sdxl.png")
37 | ```
38 | 
39 | ## AutoencoderTiny
40 | 
41 | [[autodoc]] AutoencoderTiny
42 | 
43 | ## AutoencoderTinyOutput
44 | 
45 | [[autodoc]] models.autoencoder_tiny.AutoencoderTinyOutput


--------------------------------------------------------------------------------
/docs/source/en/using-diffusers/pipeline_overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Overview
14 | 
15 | A pipeline is an end-to-end class that provides a quick and easy way to use a diffusion system for inference by bundling independently trained models and schedulers together. Certain combinations of models and schedulers define specific pipeline types, like [`StableDiffusionPipeline`] or [`StableDiffusionControlNetPipeline`], with specific capabilities. All pipeline types inherit from the base [`DiffusionPipeline`] class; pass it any checkpoint, and it'll automatically detect the pipeline type and load the necessary components.
16 | 
17 | This section introduces you to some of the tasks supported by our pipelines, such as unconditional image generation and different techniques and variations of text-to-image generation. You'll also learn how to gain more control over the generation process by setting a seed for reproducibility and weighting prompts to adjust the influence certain words in the prompt have over the output. Finally, you'll see how you can create a community pipeline for a custom task like generating images from speech.


--------------------------------------------------------------------------------
/src/diffusers/utils/model_card_template.md:
--------------------------------------------------------------------------------
 1 | ---
 2 | {{ card_data }}
 3 | ---
 4 | 
 5 | <!-- This model card has been generated automatically according to the information the training script had access to. You
 6 | should probably proofread and complete it, then remove this comment. -->
 7 | 
 8 | # {{ model_name | default("Diffusion Model") }}
 9 | 
10 | ## Model description
11 | 
12 | This diffusion model is trained with the [🤗 Diffusers](https://github.com/huggingface/diffusers) library 
13 | on the `{{ dataset_name }}` dataset.
14 | 
15 | ## Intended uses & limitations
16 | 
17 | #### How to use
18 | 
19 | ```python
20 | # TODO: add an example code snippet for running this diffusion pipeline
21 | ```
22 | 
23 | #### Limitations and bias
24 | 
25 | [TODO: provide examples of latent issues and potential remediations]
26 | 
27 | ## Training data
28 | 
29 | [TODO: describe the data used to train the model]
30 | 
31 | ### Training hyperparameters
32 | 
33 | The following hyperparameters were used during training:
34 | - learning_rate: {{ learning_rate }}
35 | - train_batch_size: {{ train_batch_size }}
36 | - eval_batch_size: {{ eval_batch_size }}
37 | - gradient_accumulation_steps: {{ gradient_accumulation_steps }}
38 | - optimizer: AdamW with betas=({{ adam_beta1 }}, {{ adam_beta2 }}), weight_decay={{ adam_weight_decay }} and epsilon={{ adam_epsilon }}
39 | - lr_scheduler: {{ lr_scheduler }}
40 | - lr_warmup_steps: {{ lr_warmup_steps }}
41 | - ema_inv_gamma: {{ ema_inv_gamma }}
42 | - ema_power: {{ ema_power }}
43 | - ema_max_decay: {{ ema_max_decay }}
44 | - mixed_precision: {{ mixed_precision }}
45 | 
46 | ### Training results
47 | 
48 | 📈 [TensorBoard logs](https://huggingface.co/{{ repo_name }}/tensorboard?#scalars)
49 | 
50 | 
51 | 


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/audio_diffusion.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Audio Diffusion
14 | 
15 | [Audio Diffusion](https://github.com/teticio/audio-diffusion) is by Robert Dargavel Smith, and it leverages the recent advances in image generation from diffusion models by converting audio samples to and from Mel spectrogram images.
16 | 
17 | The original codebase, training scripts and example notebooks can be found at [teticio/audio-diffusion](https://github.com/teticio/audio-diffusion).
18 | 
19 | <Tip>
20 | 
21 | Make sure to check out the Schedulers [guide](/using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](/using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
22 | 
23 | </Tip>
24 | 
25 | ## AudioDiffusionPipeline
26 | [[autodoc]] AudioDiffusionPipeline
27 | 	- all
28 | 	- __call__
29 | 
30 | ## AudioPipelineOutput
31 | [[autodoc]] pipelines.AudioPipelineOutput
32 | 
33 | ## ImagePipelineOutput
34 | [[autodoc]] pipelines.ImagePipelineOutput
35 | 
36 | ## Mel
37 | [[autodoc]] Mel
38 | 


--------------------------------------------------------------------------------
/src/diffusers/models/__init__.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | from ..utils import is_flax_available, is_torch_available
16 | 
17 | 
18 | if is_torch_available():
19 |     from .adapter import MultiAdapter, T2IAdapter
20 |     from .autoencoder_asym_kl import AsymmetricAutoencoderKL
21 |     from .autoencoder_kl import AutoencoderKL
22 |     from .autoencoder_tiny import AutoencoderTiny
23 |     from .controlnet import ControlNetModel
24 |     from .dual_transformer_2d import DualTransformer2DModel
25 |     from .modeling_utils import ModelMixin
26 |     from .prior_transformer import PriorTransformer
27 |     from .t5_film_transformer import T5FilmDecoder
28 |     from .transformer_2d import Transformer2DModel
29 |     from .unet_1d import UNet1DModel
30 |     from .unet_2d import UNet2DModel
31 |     from .unet_2d_condition import UNet2DConditionModel
32 |     from .unet_3d_condition import UNet3DConditionModel
33 |     from .vq_model import VQModel
34 | 
35 | if is_flax_available():
36 |     from .controlnet_flax import FlaxControlNetModel
37 |     from .unet_2d_condition_flax import FlaxUNet2DConditionModel
38 |     from .vae_flax import FlaxAutoencoderKL
39 | 


--------------------------------------------------------------------------------
/docs/source/ko/optimization/open_vino.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # How to use OpenVINO for inference
14 | 
15 | 🤗 [Optimum](https://github.com/huggingface/optimum-intel) provides a Stable Diffusion pipeline compatible with OpenVINO.
16 | You can now easily run inference with OpenVINO Runtime on a variety of Intel processors (see the [full list](https://docs.openvino.ai/latest/openvino_docs_OV_UG_supported_plugins_Supported_Devices.html) of supported devices).
17 | 
18 | ## Installation
19 | 
20 | Install 🤗 Optimum with the following command:
21 | 
22 | ```
23 | pip install optimum["openvino"]
24 | ```
25 | 
26 | ## Stable Diffusion inference
27 | 
28 | To load an OpenVINO model and run inference with OpenVINO Runtime, replace `StableDiffusionPipeline` with `OVStableDiffusionPipeline`. If you want to load a PyTorch model and convert it to the OpenVINO format on the fly, set `export=True`.
29 | 
30 | ```python
31 | from optimum.intel.openvino import OVStableDiffusionPipeline
32 | 
33 | model_id = "runwayml/stable-diffusion-v1-5"
34 | pipe = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)
35 | prompt = "a photo of an astronaut riding a horse on mars"
36 | image = pipe(prompt).images[0]
37 | ```
38 | 
39 | You can find more examples (such as static reshaping and model compilation) in the [Optimum documentation](https://huggingface.co/docs/optimum/intel/inference#export-and-inference-of-stable-diffusion-models).
40 | 


--------------------------------------------------------------------------------
/docs/source/en/api/models/unet.md:
--------------------------------------------------------------------------------
 1 | # UNet1DModel
 2 | 
 3 | The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 1D UNet model.
 4 | 
 5 | The abstract from the paper is:
 6 | 
 7 | *There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
 8 | 
 9 | ## UNet1DModel
10 | [[autodoc]] UNet1DModel
11 | 
12 | ## UNet1DOutput
13 | [[autodoc]] models.unet_1d.UNet1DOutput


--------------------------------------------------------------------------------
/docs/source/en/api/models/unet2d.md:
--------------------------------------------------------------------------------
 1 | # UNet2DModel
 2 | 
 3 | The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 2D UNet model.
 4 | 
 5 | The abstract from the paper is:
 6 | 
 7 | *There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
 8 | 
 9 | ## UNet2DModel
10 | [[autodoc]] UNet2DModel
11 | 
12 | ## UNet2DOutput
13 | [[autodoc]] models.unet_2d.UNet2DOutput


--------------------------------------------------------------------------------
/src/diffusers/pipelines/deepfloyd_if/watermark.py:
--------------------------------------------------------------------------------
 1 | from typing import List
 2 | 
 3 | import PIL
 4 | import torch
 5 | from PIL import Image
 6 | 
 7 | from ...configuration_utils import ConfigMixin
 8 | from ...models.modeling_utils import ModelMixin
 9 | from ...utils import PIL_INTERPOLATION
10 | 
11 | 
12 | class IFWatermarker(ModelMixin, ConfigMixin):
13 |     def __init__(self):
14 |         super().__init__()
15 | 
16 |         self.register_buffer("watermark_image", torch.zeros((62, 62, 4)))
17 |         self.watermark_image_as_pil = None
18 | 
19 |     def apply_watermark(self, images: List[PIL.Image.Image], sample_size=None):
20 |         # copied from https://github.com/deep-floyd/IF/blob/b77482e36ca2031cb94dbca1001fc1e6400bf4ab/deepfloyd_if/modules/base.py#L287
21 | 
22 |         h = images[0].height
23 |         w = images[0].width
24 | 
25 |         sample_size = sample_size or h
26 | 
27 |         coef = min(h / sample_size, w / sample_size)
28 |         img_h, img_w = (int(h / coef), int(w / coef)) if coef < 1 else (h, w)
29 | 
30 |         S1, S2 = 1024**2, img_w * img_h
31 |         K = (S2 / S1) ** 0.5
32 |         wm_size, wm_x, wm_y = int(K * 62), img_w - int(14 * K), img_h - int(14 * K)
33 | 
34 |         if self.watermark_image_as_pil is None:
35 |             watermark_image = self.watermark_image.to(torch.uint8).cpu().numpy()
36 |             watermark_image = Image.fromarray(watermark_image, mode="RGBA")
37 |             self.watermark_image_as_pil = watermark_image
38 | 
39 |         wm_img = self.watermark_image_as_pil.resize(
40 |             (wm_size, wm_size), PIL_INTERPOLATION["bicubic"], reducing_gap=None
41 |         )
42 | 
43 |         for pil_img in images:
44 |             pil_img.paste(wm_img, box=(wm_x - wm_size, wm_y - wm_size, wm_x, wm_y), mask=wm_img.split()[-1])
45 | 
46 |         return images
47 | 


--------------------------------------------------------------------------------
/tests/schedulers/test_scheduler_vq_diffusion.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn.functional as F
 3 | 
 4 | from diffusers import VQDiffusionScheduler
 5 | 
 6 | from .test_schedulers import SchedulerCommonTest
 7 | 
 8 | 
 9 | class VQDiffusionSchedulerTest(SchedulerCommonTest):
10 |     scheduler_classes = (VQDiffusionScheduler,)
11 | 
12 |     def get_scheduler_config(self, **kwargs):
13 |         config = {
14 |             "num_vec_classes": 4097,
15 |             "num_train_timesteps": 100,
16 |         }
17 | 
18 |         config.update(**kwargs)
19 |         return config
20 | 
21 |     def dummy_sample(self, num_vec_classes):
22 |         batch_size = 4
23 |         height = 8
24 |         width = 8
25 | 
26 |         sample = torch.randint(0, num_vec_classes, (batch_size, height * width))
27 | 
28 |         return sample
29 | 
30 |     @property
31 |     def dummy_sample_deter(self):
32 |         assert False
33 | 
34 |     def dummy_model(self, num_vec_classes):
35 |         def model(sample, t, *args):
36 |             batch_size, num_latent_pixels = sample.shape
37 |             logits = torch.rand((batch_size, num_vec_classes - 1, num_latent_pixels))
38 |             return_value = F.log_softmax(logits.double(), dim=1).float()
39 |             return return_value
40 | 
41 |         return model
42 | 
43 |     def test_timesteps(self):
44 |         for timesteps in [2, 5, 100, 1000]:
45 |             self.check_over_configs(num_train_timesteps=timesteps)
46 | 
47 |     def test_num_vec_classes(self):
48 |         for num_vec_classes in [5, 100, 1000, 4000]:
49 |             self.check_over_configs(num_vec_classes=num_vec_classes)
50 | 
51 |     def test_time_indices(self):
52 |         for t in [0, 50, 99]:
53 |             self.check_over_forward(time_step=t)
54 | 
55 |     def test_add_noise_device(self):
56 |         pass
57 | 


--------------------------------------------------------------------------------
/docs/source/en/tutorials/tutorial_overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Overview
14 | 
15 | Welcome to 🧨 Diffusers! If you're new to diffusion models and generative AI, and want to learn more, then you've come to the right place. These beginner-friendly tutorials are designed to provide a gentle introduction to diffusion models and help you understand the library fundamentals - the core components and how 🧨 Diffusers is meant to be used. 
16 | 
17 | You'll learn how to use a pipeline for inference to rapidly generate things, and then deconstruct that pipeline to really understand how to use the library as a modular toolbox for building your own diffusion systems. In the next lesson, you'll learn how to train your own diffusion model to generate what you want.
18 | 
19 | After completing the tutorials, you'll have gained the necessary skills to start exploring the library on your own and see how to use it for your own projects and applications.
20 | 
21 | Feel free to join our community on [Discord](https://discord.com/invite/JfAtkvEtRb) or the [forums](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63) to connect and collaborate with other users and developers!
22 | 
23 | Let's start diffusing! 🧨


--------------------------------------------------------------------------------
/tests/pipelines/text_to_video/test_text_to_video_zero.py:
--------------------------------------------------------------------------------
 1 | # coding=utf-8
 2 | # Copyright 2023 HuggingFace Inc.
 3 | #
 4 | # Licensed under the Apache License, Version 2.0 (the "License");
 5 | # you may not use this file except in compliance with the License.
 6 | # You may obtain a copy of the License at
 7 | #
 8 | #     http://www.apache.org/licenses/LICENSE-2.0
 9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | 
16 | import unittest
17 | 
18 | import torch
19 | 
20 | from diffusers import DDIMScheduler, TextToVideoZeroPipeline
21 | from diffusers.utils import load_pt, require_torch_gpu, slow
22 | 
23 | from ..test_pipelines_common import assert_mean_pixel_difference
24 | 
25 | 
26 | @slow
27 | @require_torch_gpu
28 | class TextToVideoZeroPipelineSlowTests(unittest.TestCase):
29 |     def test_full_model(self):
30 |         model_id = "runwayml/stable-diffusion-v1-5"
31 |         pipe = TextToVideoZeroPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
32 |         pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
33 |         generator = torch.Generator(device="cuda").manual_seed(0)
34 | 
35 |         prompt = "A bear is playing a guitar on Times Square"
36 |         result = pipe(prompt=prompt, generator=generator).images
37 | 
38 |         expected_result = load_pt(
39 |             "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/text-to-video/A bear is playing a guitar on Times Square.pt"
40 |         )
41 | 
42 |         assert_mean_pixel_difference(result, expected_result)
43 | 


--------------------------------------------------------------------------------
/docs/source/en/api/loaders.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Loaders
14 | 
15 | Adapters (textual inversion, LoRA, hypernetworks) allow you to modify a diffusion model to generate images in a specific style without training or finetuning the entire model. The adapter weights are typically only a tiny fraction of the pretrained model's weights, which makes them very portable. 🤗 Diffusers provides an easy-to-use `LoaderMixin` API to load adapter weights.
16 | 
17 | <Tip warning={true}>
18 | 
19 | 🧪 The `LoaderMixins` are highly experimental and prone to future changes. To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log in with `huggingface-cli login`.
20 | 
21 | </Tip>
22 | 
23 | ## UNet2DConditionLoadersMixin
24 | 
25 | [[autodoc]] loaders.UNet2DConditionLoadersMixin
26 | 
27 | ## TextualInversionLoaderMixin
28 | 
29 | [[autodoc]] loaders.TextualInversionLoaderMixin
30 | 
31 | ## LoraLoaderMixin
32 | 
33 | [[autodoc]] loaders.LoraLoaderMixin
34 | 
35 | ## FromSingleFileMixin
36 | 
37 | [[autodoc]] loaders.FromSingleFileMixin
38 | 
39 | ## FromOriginalControlnetMixin
40 | 
41 | [[autodoc]] loaders.FromOriginalControlnetMixin
42 | 
43 | ## FromOriginalVAEMixin
44 | 
45 | [[autodoc]] loaders.FromOriginalVAEMixin
46 | 


--------------------------------------------------------------------------------
/examples/reinforcement_learning/run_diffuser_locomotion.py:
--------------------------------------------------------------------------------
 1 | import d4rl  # noqa
 2 | import gym
 3 | import tqdm
 4 | from diffusers.experimental import ValueGuidedRLPipeline
 5 | 
 6 | 
 7 | config = {
 8 |     "n_samples": 64,
 9 |     "horizon": 32,
10 |     "num_inference_steps": 20,
11 |     "n_guide_steps": 2,  # can set to 0 for faster sampling, does not use value network
12 |     "scale_grad_by_std": True,
13 |     "scale": 0.1,
14 |     "eta": 0.0,
15 |     "t_grad_cutoff": 2,
16 |     "device": "cpu",
17 | }
18 | 
19 | 
20 | if __name__ == "__main__":
21 |     env_name = "hopper-medium-v2"
22 |     env = gym.make(env_name)
23 | 
24 |     pipeline = ValueGuidedRLPipeline.from_pretrained(
25 |         "bglick13/hopper-medium-v2-value-function-hor32",
26 |         env=env,
27 |     )
28 | 
29 |     env.seed(0)
30 |     obs = env.reset()
31 |     total_reward = 0
32 |     total_score = 0
33 |     T = 1000
34 |     rollout = [obs.copy()]
35 |     try:
36 |         for t in tqdm.tqdm(range(T)):
37 |             # call the policy
38 |             denorm_actions = pipeline(obs, planning_horizon=32)
39 | 
40 |             # execute action in environment
41 |             next_observation, reward, terminal, _ = env.step(denorm_actions)
42 | 
43 |             # update return, then score the return accumulated so far
44 |             total_reward += reward
45 |             score = env.get_normalized_score(total_reward)
46 |             total_score += score
47 |             print(
48 |                 f"Step: {t}, Reward: {reward}, Total Reward: {total_reward}, Score: {score}, Total Score:"
49 |                 f" {total_score}"
50 |             )
51 | 
52 |             # save observations for rendering
53 |             rollout.append(next_observation.copy())
54 | 
55 |             obs = next_observation
56 |     except KeyboardInterrupt:
57 |         pass
58 | 
59 |     print(f"Total reward: {total_reward}")
60 | 


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/stable_diffusion/upscale.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Super-resolution
14 | 
15 | The Stable Diffusion upscaler diffusion model was created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), and [LAION](https://laion.ai/). It is used to enhance the resolution of input images by a factor of 4.
16 | 
17 | <Tip>
18 | 
19 | Make sure to check out the Stable Diffusion [Tips](overview#tips) section to learn how to explore the tradeoff between scheduler speed and quality, and how to reuse pipeline components efficiently! 
20 | 
21 | If you're interested in using one of the official checkpoints for a task, explore the [CompVis](https://huggingface.co/CompVis), [Runway](https://huggingface.co/runwayml), and [Stability AI](https://huggingface.co/stabilityai) Hub organizations!
22 | 
23 | </Tip>
24 | 
25 | ## StableDiffusionUpscalePipeline
26 | 
27 | [[autodoc]] StableDiffusionUpscalePipeline
28 | 	- all
29 | 	- __call__
30 | 	- enable_attention_slicing
31 | 	- disable_attention_slicing
32 | 	- enable_xformers_memory_efficient_attention
33 | 	- disable_xformers_memory_efficient_attention
34 | 
35 | ## StableDiffusionPipelineOutput
36 | 
37 | [[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput


--------------------------------------------------------------------------------
/docs/source/en/api/models/unet3d-cond.md:
--------------------------------------------------------------------------------
 1 | # UNet3DConditionModel
 2 | 
 3 | The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 3D UNet conditional model.
 4 | 
 5 | The abstract from the paper is:
 6 | 
 7 | *There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
 8 | 
 9 | ## UNet3DConditionModel
10 | [[autodoc]] UNet3DConditionModel
11 | 
12 | ## UNet3DConditionOutput
13 | [[autodoc]] models.unet_3d_condition.UNet3DConditionOutput


--------------------------------------------------------------------------------
/tests/conftest.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | # tests directory-specific settings - this file is run automatically
16 | # by pytest before any tests are run
17 | 
18 | import sys
19 | import warnings
20 | from os.path import abspath, dirname, join
21 | 
22 | 
23 | # allow having multiple repository checkouts and not needing to remember to rerun
24 | # 'pip install -e .[dev]' when switching between checkouts and running tests.
25 | git_repo_path = abspath(join(dirname(dirname(__file__)), "src"))
26 | sys.path.insert(1, git_repo_path)
27 | 
28 | # silence FutureWarning warnings in tests since often we can't act on them until
29 | # they become normal warnings - i.e. the tests still need to test the current functionality
30 | warnings.simplefilter(action="ignore", category=FutureWarning)
31 | 
32 | 
33 | def pytest_addoption(parser):
34 |     from diffusers.utils.testing_utils import pytest_addoption_shared
35 | 
36 |     pytest_addoption_shared(parser)
37 | 
38 | 
39 | def pytest_terminal_summary(terminalreporter):
40 |     from diffusers.utils.testing_utils import pytest_terminal_summary_main
41 | 
42 |     make_reports = terminalreporter.config.getoption("--make-reports")
43 |     if make_reports:
44 |         pytest_terminal_summary_main(terminalreporter, id=make_reports)
45 | 


--------------------------------------------------------------------------------
/docs/source/en/api/image_processor.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # VAE Image Processor
14 | 
15 | The [`VaeImageProcessor`] provides a unified API for [`StableDiffusionPipeline`]s to prepare image inputs for VAE encoding and to post-process outputs once they're decoded. This includes transformations such as resizing, normalization, and conversion between PIL Images, PyTorch tensors, and NumPy arrays.
16 | 
17 | All pipelines with a [`VaeImageProcessor`] accept PIL Images, PyTorch tensors, or NumPy arrays as image inputs and return outputs in the format given by the `output_type` argument. You can pass encoded image latents directly to the pipeline and return latents from the pipeline as a specific output with the `output_type` argument (for example `output_type="latent"`). This allows you to take the generated latents from one pipeline and pass them to another pipeline as input without leaving the latent space. It also makes it much easier to use multiple pipelines together by passing PyTorch tensors directly between different pipelines.
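
The normalization step mentioned above can be illustrated with a minimal pure-Python sketch. It assumes the common convention that images enter the VAE in the range [-1, 1] and are mapped back to [0, 1] after decoding. The function names `normalize` and `denormalize` here are hypothetical stand-ins written for illustration and operate on plain lists; they are not the library's implementation.

```python
# Hypothetical stand-ins for the value-range conversions described above.
# They work on plain Python lists so the sketch has no dependencies.


def normalize(pixels):
    """Map pixel values from [0, 1] to [-1, 1] before VAE encoding (assumed convention)."""
    return [2.0 * p - 1.0 for p in pixels]


def denormalize(values):
    """Map decoded values from [-1, 1] back to [0, 1], clamping anything out of range."""
    return [min(max(v / 2.0 + 0.5, 0.0), 1.0) for v in values]


pixels = [0.0, 0.25, 1.0]
scaled = normalize(pixels)       # values now lie in [-1, 1]
restored = denormalize(scaled)   # values are back in [0, 1]
```

The clamp in `denormalize` matters because decoded values can fall slightly outside [-1, 1]; without it, the restored image could contain values that are invalid for an 8-bit conversion.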
18 | 
19 | ## VaeImageProcessor
20 | 
21 | [[autodoc]] image_processor.VaeImageProcessor
22 | 
23 | ## VaeImageProcessorLDM3D
24 | 
25 | The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs.
26 | 
27 | [[autodoc]] image_processor.VaeImageProcessorLDM3D


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/ddpm.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # DDPMScheduler
14 | 
15 | [Denoising Diffusion Probabilistic Models](https://huggingface.co/papers/2006.11239) (DDPM) by Jonathan Ho, Ajay Jain and Pieter Abbeel proposes a diffusion-based model of the same name. In the context of the 🤗 Diffusers library, DDPM refers to the discrete denoising scheduler from the paper as well as the pipeline.
16 | 
17 | The abstract from the paper is:
18 | 
19 | *We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding. On the unconditional CIFAR10 dataset, we obtain an Inception score of 9.46 and a state-of-the-art FID score of 3.17. On 256x256 LSUN, we obtain sample quality similar to ProgressiveGAN.*
20 | 
21 | ## DDPMScheduler
22 | [[autodoc]] DDPMScheduler
23 | 
24 | ## DDPMSchedulerOutput
25 | [[autodoc]] schedulers.scheduling_ddpm.DDPMSchedulerOutput


--------------------------------------------------------------------------------
/examples/conftest.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | # tests directory-specific settings - this file is run automatically
16 | # by pytest before any tests are run
17 | 
18 | import sys
19 | import warnings
20 | from os.path import abspath, dirname, join
21 | 
22 | 
23 | # allow having multiple repository checkouts and not needing to remember to rerun
24 | # 'pip install -e .[dev]' when switching between checkouts and running tests.
25 | git_repo_path = abspath(join(dirname(dirname(dirname(__file__))), "src"))
26 | sys.path.insert(1, git_repo_path)
27 | 
28 | 
29 | # silence FutureWarning warnings in tests since often we can't act on them until
30 | # they become normal warnings - i.e. the tests still need to test the current functionality
31 | warnings.simplefilter(action="ignore", category=FutureWarning)
32 | 
33 | 
34 | def pytest_addoption(parser):
35 |     from diffusers.utils.testing_utils import pytest_addoption_shared
36 | 
37 |     pytest_addoption_shared(parser)
38 | 
39 | 
40 | def pytest_terminal_summary(terminalreporter):
41 |     from diffusers.utils.testing_utils import pytest_terminal_summary_main
42 | 
43 |     make_reports = terminalreporter.config.getoption("--make-reports")
44 |     if make_reports:
45 |         pytest_terminal_summary_main(terminalreporter, id=make_reports)
46 | 


--------------------------------------------------------------------------------
/examples/research_projects/onnxruntime/unconditional_image_generation/README.md:
--------------------------------------------------------------------------------
 1 | ## Training examples
 2 | 
 3 | Creating a training image set is [described in a different document](https://huggingface.co/docs/datasets/image_process#image-datasets).
 4 | 
 5 | ### Installing the dependencies
 6 | 
 7 | Before running the scripts, make sure to install the library's training dependencies:
 8 | 
 9 | **Important**
10 | 
11 | To make sure you can successfully run the latest versions of the example scripts, we highly recommend **installing from source** and keeping the install up to date as we update the example scripts frequently and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:
12 | ```bash
13 | git clone https://github.com/huggingface/diffusers
14 | cd diffusers
15 | pip install .
16 | ```
17 | 
18 | Then cd into the example folder and run
19 | ```bash
20 | pip install -r requirements.txt
21 | ```
22 | 
23 | 
24 | And initialize an [🤗Accelerate](https://github.com/huggingface/accelerate/) environment with:
25 | 
26 | ```bash
27 | accelerate config
28 | ```
29 | 
30 | #### Use ONNXRuntime to accelerate training
31 | 
32 | In order to leverage onnxruntime to accelerate training, please use train_unconditional_ort.py
33 | 
34 | The command to train a DDPM UNet model on the Oxford Flowers dataset with onnxruntime:
35 | 
36 | ```bash
37 | accelerate launch train_unconditional.py \
38 |   --dataset_name="huggan/flowers-102-categories" \
39 |   --resolution=64 --center_crop --random_flip \
40 |   --output_dir="ddpm-ema-flowers-64" \
41 |   --use_ema \
42 |   --train_batch_size=16 \
43 |   --num_epochs=1 \
44 |   --gradient_accumulation_steps=1 \
45 |   --learning_rate=1e-4 \
46 |   --lr_warmup_steps=500 \
47 |   --mixed_precision=fp16
48 | ```
49 | 
50 | Please contact Prathik Rao (prathikr), Sunghoon Choi (hanbitmyths), Ashwini Khade (askhade), or Peng Wang (pengwa) on GitHub with any questions.
51 | 


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/stable_diffusion/image_variation.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Image variation
14 | 
15 | The Stable Diffusion model can also generate variations from an input image. It uses a fine-tuned version of a Stable Diffusion model by [Justin Pinkney](https://www.justinpinkney.com/) from [Lambda](https://lambdalabs.com/).
16 | 
17 | The original codebase can be found at [LambdaLabsML/lambda-diffusers](https://github.com/LambdaLabsML/lambda-diffusers#stable-diffusion-image-variations) and additional official checkpoints for image variation can be found at [lambdalabs/sd-image-variations-diffusers](https://huggingface.co/lambdalabs/sd-image-variations-diffusers).
18 | 
19 | <Tip>
20 | 
21 | Make sure to check out the Stable Diffusion [Tips](./overview#tips) section to learn how to explore the tradeoff between scheduler speed and quality, and how to reuse pipeline components efficiently!
22 | 
23 | </Tip>
24 | 
25 | ## StableDiffusionImageVariationPipeline
26 | 
27 | [[autodoc]] StableDiffusionImageVariationPipeline
28 | 	- all
29 | 	- __call__
30 | 	- enable_attention_slicing
31 | 	- disable_attention_slicing
32 | 	- enable_xformers_memory_efficient_attention
33 | 	- disable_xformers_memory_efficient_attention
34 | 
35 | ## StableDiffusionPipelineOutput
36 | 
37 | [[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput
38 | 


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/overview.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Pipelines
14 | 
15 | Pipelines provide a simple way to run state-of-the-art diffusion models in inference by bundling all of the necessary components (multiple independently-trained models, schedulers, and processors) into a single end-to-end class. Pipelines are flexible and can be adapted to use a different scheduler or even different model components.
16 | 
17 | All pipelines are built from the base [`DiffusionPipeline`] class which provides basic functionality for loading, downloading, and saving all the components.
18 | 
19 | <Tip warning={true}>
20 | 
21 | Pipelines do not offer any training functionality. You'll notice PyTorch's autograd is disabled by decorating the [`~DiffusionPipeline.__call__`] method with a [`torch.no_grad`](https://pytorch.org/docs/stable/generated/torch.no_grad.html) decorator because pipelines should not be used for training. If you're interested in training, please take a look at the [Training](../training/overview) guides instead!
22 | 
23 | </Tip>
24 | 
25 | ## DiffusionPipeline
26 | 
27 | [[autodoc]] DiffusionPipeline
28 | 	- all
29 | 	- __call__
30 | 	- device
31 | 	- to
32 | 	- components
33 | 
34 | ## FlaxDiffusionPipeline
35 | 
36 | [[autodoc]] pipelines.pipeline_flax_utils.FlaxDiffusionPipeline
37 | 
38 | ## PushToHubMixin
39 | 
40 | [[autodoc]] utils.PushToHubMixin
41 | 


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/stable_diffusion/depth2img.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Depth-to-image
14 | 
15 | The Stable Diffusion model can also infer depth based on an image using [MiDaS](https://github.com/isl-org/MiDaS). This allows you to pass a text prompt and an initial image to condition the generation of new images as well as a `depth_map` to preserve the image structure.
16 | 
17 | <Tip>
18 | 
19 | Make sure to check out the Stable Diffusion [Tips](overview#tips) section to learn how to explore the tradeoff between scheduler speed and quality, and how to reuse pipeline components efficiently! 
20 | 
21 | If you're interested in using one of the official checkpoints for a task, explore the [CompVis](https://huggingface.co/CompVis), [Runway](https://huggingface.co/runwayml), and [Stability AI](https://huggingface.co/stabilityai) Hub organizations!
22 | 
23 | </Tip>
24 | 
25 | ## StableDiffusionDepth2ImgPipeline
26 | 
27 | [[autodoc]] StableDiffusionDepth2ImgPipeline
28 | 	- all
29 | 	- __call__
30 | 	- enable_attention_slicing
31 | 	- disable_attention_slicing
32 | 	- enable_xformers_memory_efficient_attention
33 | 	- disable_xformers_memory_efficient_attention
34 | 	- load_textual_inversion
35 | 	- load_lora_weights
36 | 	- save_lora_weights
37 | 
38 | ## StableDiffusionPipelineOutput
39 | 
40 | [[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput


--------------------------------------------------------------------------------
/src/diffusers/dependency_versions_check.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | import sys
15 | 
16 | from .dependency_versions_table import deps
17 | from .utils.versions import require_version, require_version_core
18 | 
19 | 
20 | # define which module versions we always want to check at run time
21 | # (usually the ones defined in `install_requires` in setup.py)
22 | #
23 | # order specific notes:
24 | # - tqdm must be checked before tokenizers
25 | 
26 | pkgs_to_check_at_runtime = "python tqdm regex requests packaging filelock numpy tokenizers".split()
27 | if sys.version_info < (3, 7):
28 |     pkgs_to_check_at_runtime.append("dataclasses")
29 | if sys.version_info < (3, 8):
30 |     pkgs_to_check_at_runtime.append("importlib_metadata")
31 | 
32 | for pkg in pkgs_to_check_at_runtime:
33 |     if pkg in deps:
34 |         if pkg == "tokenizers":
35 |             # must be loaded here, or else tqdm check may fail
36 |             from .utils import is_tokenizers_available
37 | 
38 |             if not is_tokenizers_available():
39 |                 continue  # not required, check version only if installed
40 | 
41 |         require_version_core(deps[pkg])
42 |     else:
43 |         raise ValueError(f"can't find {pkg} in {deps.keys()}, check dependency_versions_table.py")
44 | 
45 | 
46 | def dep_version_check(pkg, hint=None):
47 |     require_version(deps[pkg], hint)
48 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/cm_stochastic_iterative.md:
--------------------------------------------------------------------------------
 1 | # CMStochasticIterativeScheduler
 2 | 
 3 | [Consistency Models](https://huggingface.co/papers/2303.01469) by Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever introduced a multistep and onestep scheduler (Algorithm 1) that is capable of generating good samples in one or a small number of steps.
 4 | 
 5 | The abstract from the paper is:
 6 | 
 7 | *Diffusion models have made significant breakthroughs in image, audio, and video generation, but they depend on an iterative generation process that causes slow sampling speed and caps their potential for real-time applications. To overcome this limitation, we propose consistency models, a new family of generative models that achieve high sample quality without adversarial training. They support fast one-step generation by design, while still allowing for few-step sampling to trade compute for sample quality. They also support zero-shot data editing, like image inpainting, colorization, and super-resolution, without requiring explicit training on these tasks. Consistency models can be trained either as a way to distill pre-trained diffusion models, or as standalone generative models. Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step generation. For example, we achieve the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained as standalone generative models, consistency models also outperform single-step, non-adversarial generative models on standard benchmarks like CIFAR-10, ImageNet 64x64 and LSUN 256x256.*
 8 | 
 9 | The original codebase can be found at [openai/consistency_models](https://github.com/openai/consistency_models).
10 | 
11 | ## CMStochasticIterativeScheduler
12 | [[autodoc]] CMStochasticIterativeScheduler
13 | 
14 | ## CMStochasticIterativeSchedulerOutput
15 | [[autodoc]] schedulers.scheduling_consistency_models.CMStochasticIterativeSchedulerOutput


--------------------------------------------------------------------------------
/docs/source/en/optimization/xformers.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Installing xFormers
14 | 
15 | We recommend the use of [xFormers](https://github.com/facebookresearch/xformers) for both inference and training. In our tests, the optimizations performed in the attention blocks allow for both faster speed and reduced memory consumption.
16 | 
17 | Starting from version `0.0.16` of xFormers, released in January 2023, installation can be easily performed using pre-built pip wheels:
18 | 
19 | ```bash
20 | pip install xformers
21 | ```
22 | 
23 | <Tip>
24 | 
25 | The xFormers pip package requires the latest version of PyTorch (1.13.1 as of xFormers 0.0.16). If you need to use a previous version of PyTorch, then we recommend you install xFormers from source using [the project instructions](https://github.com/facebookresearch/xformers#installing-xformers).
26 | 
27 | </Tip>
28 | 
29 | After xFormers is installed, you can use `enable_xformers_memory_efficient_attention()` for faster inference and reduced memory consumption, as discussed [here](fp16#memory-efficient-attention).
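
Since xFormers is an optional dependency, it can help to guard the call so that the same script still runs on machines without it. The following is a minimal sketch under that assumption: the helper names `xformers_is_installed` and `maybe_enable_xformers` are hypothetical and only wrap the `enable_xformers_memory_efficient_attention()` method mentioned above, and `pipe` stands for any already-loaded pipeline object that exposes that method.

```python
import importlib.util


def xformers_is_installed():
    """Return True if the `xformers` package can be found in the current environment."""
    return importlib.util.find_spec("xformers") is not None


def maybe_enable_xformers(pipe):
    """Enable memory-efficient attention on `pipe` only when xFormers is installed.

    Returns True if the optimization was enabled, False otherwise.
    """
    if xformers_is_installed():
        pipe.enable_xformers_memory_efficient_attention()
        return True
    return False
```

A script would call `maybe_enable_xformers(pipe)` once, right after loading the pipeline, and fall back to the default attention implementation when it returns `False`.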
30 | 
31 | <Tip warning={true}>
32 | 
33 | According to [this issue](https://github.com/huggingface/diffusers/issues/2234#issuecomment-1416931212), xFormers `v0.0.16` cannot be used for training (fine-tune or Dreambooth) on some GPUs. If you observe that problem, please install a development version as indicated in that comment.
34 | 
35 | </Tip>
36 | 


--------------------------------------------------------------------------------
/examples/research_projects/intel_opts/README.md:
--------------------------------------------------------------------------------
 1 | ## Diffusers examples with Intel optimizations
 2 | 
 3 | **This research project is not actively maintained by the diffusers team. For any questions or comments, please make sure to tag @hshen14 .**
 4 | 
 5 | This aims to provide diffusers examples with Intel optimizations such as Bfloat16 for training/fine-tuning acceleration and 8-bit integer (INT8) for inference acceleration on Intel platforms.
 6 | 
 7 | ## Accelerating the fine-tuning for textual inversion
 8 | 
 9 | We accelerate the fine-tuning for textual inversion with Intel Extension for PyTorch. The [examples](textual_inversion) enable both single node and multi-node distributed training with Bfloat16 support on Intel Xeon Scalable Processors.
10 | 
11 | ## Accelerating the inference for Stable Diffusion using Bfloat16
12 | 
13 | We start the inference acceleration with Bfloat16 using Intel Extension for PyTorch. The [script](inference_bf16.py) is generally designed to support standard Stable Diffusion models with Bfloat16 support.
14 | ```bash
15 | pip install diffusers transformers accelerate scipy safetensors
16 | 
17 | export KMP_BLOCKTIME=1
18 | export KMP_SETTINGS=1
19 | export KMP_AFFINITY=granularity=fine,compact,1,0
20 | 
21 | # Intel OpenMP
22 | export OMP_NUM_THREADS=< Cores to use >
23 | export LD_PRELOAD=${LD_PRELOAD}:/path/to/lib/libiomp5.so
24 | # Jemalloc is a recommended malloc implementation that emphasizes fragmentation avoidance and scalable concurrency support.
25 | export LD_PRELOAD=${LD_PRELOAD}:/path/to/lib/libjemalloc.so
26 | export MALLOC_CONF="oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:-1,muzzy_decay_ms:9000000000"
27 | 
28 | # Launch with default DDIM
29 | numactl --membind <node N> -C <cpu list> python inference_bf16.py
30 | # Launch with DPMSolverMultistepScheduler
31 | numactl --membind <node N> -C <cpu list> python inference_bf16.py --dpm
32 | 
33 | ```
34 | 
35 | ## Accelerating the inference for Stable Diffusion using INT8
36 | 
37 | Coming soon ...
38 | 


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/ddim.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # DDIM
14 | 
15 | [Denoising Diffusion Implicit Models](https://huggingface.co/papers/2010.02502) (DDIM) by Jiaming Song, Chenlin Meng and Stefano Ermon.
16 | 
17 | The abstract from the paper is:
18 | 
19 | *Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose reverse process can be much faster to sample from. We empirically demonstrate that DDIMs can produce high quality samples 10× to 50× faster in terms of wall-clock time compared to DDPMs, allow us to trade off computation for sample quality, and can perform semantically meaningful image interpolation directly in the latent space.*
20 | 
21 | The original codebase can be found at [ermongroup/ddim](https://github.com/ermongroup/ddim).
22 | 
23 | ## DDIMPipeline
24 | [[autodoc]] DDIMPipeline
25 | 	- all
26 | 	- __call__
27 | 
28 | ## ImagePipelineOutput
29 | [[autodoc]] pipelines.ImagePipelineOutput


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/multistep_dpm_solver_inverse.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # DPMSolverMultistepInverse
14 | 
15 | `DPMSolverMultistepInverse` is the inverted scheduler from [DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps](https://huggingface.co/papers/2206.00927) and [DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models](https://huggingface.co/papers/2211.01095) by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu.
16 | 
17 | The implementation is mostly based on the DDIM inversion definition of [Null-text Inversion for Editing Real Images using Guided Diffusion Models](https://huggingface.co/papers/2211.09794.pdf) and notebook implementation of the [`DiffEdit`] latent inversion from [Xiang-cd/DiffEdit-stable-diffusion](https://github.com/Xiang-cd/DiffEdit-stable-diffusion/blob/main/diffedit.ipynb).
18 | 
19 | ## Tips
20 | 
21 | Dynamic thresholding from Imagen (https://huggingface.co/papers/2205.11487) is supported, and for pixel-space
22 | diffusion models, you can set both `algorithm_type="dpmsolver++"` and `thresholding=True` to use dynamic
23 | thresholding. This thresholding method is unsuitable for latent-space diffusion models such as
24 | Stable Diffusion.
25 | 
26 | ## DPMSolverMultistepInverseScheduler
27 | [[autodoc]] DPMSolverMultistepInverseScheduler
28 | 
29 | ## SchedulerOutput
30 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput
31 | 


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/stable_diffusion/latent_upscale.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Latent upscaler
14 | 
15 | The Stable Diffusion latent upscaler model was created by [Katherine Crowson](https://github.com/crowsonkb/k-diffusion) in collaboration with [Stability AI](https://stability.ai/). It is used to enhance the output image resolution by a factor of 2 (see this demo [notebook](https://colab.research.google.com/drive/1o1qYJcFeywzCIdkfKJy7cTpgZTCM2EI4) for a demonstration of the original implementation).
16 | 
17 | <Tip>
18 | 
19 | Make sure to check out the Stable Diffusion [Tips](overview#tips) section to learn how to explore the tradeoff between scheduler speed and quality, and how to reuse pipeline components efficiently! 
20 | 
21 | If you're interested in using one of the official checkpoints for a task, explore the [CompVis](https://huggingface.co/CompVis), [Runway](https://huggingface.co/runwayml), and [Stability AI](https://huggingface.co/stabilityai) Hub organizations!
22 | 
23 | </Tip>
24 | 
25 | ## StableDiffusionLatentUpscalePipeline
26 | 
27 | [[autodoc]] StableDiffusionLatentUpscalePipeline
28 | 	- all
29 | 	- __call__
30 | 	- enable_sequential_cpu_offload
31 | 	- enable_attention_slicing
32 | 	- disable_attention_slicing
33 | 	- enable_xformers_memory_efficient_attention
34 | 	- disable_xformers_memory_efficient_attention
35 | 
36 | ## StableDiffusionPipelineOutput
37 | 
38 | [[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput


--------------------------------------------------------------------------------
/src/diffusers/utils/accelerate_utils.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | """
15 | Accelerate utilities: Utilities related to accelerate
16 | """
17 | 
18 | from packaging import version
19 | 
20 | from .import_utils import is_accelerate_available
21 | 
22 | 
23 | if is_accelerate_available():
24 |     import accelerate
25 | 
26 | 
27 | def apply_forward_hook(method):
28 |     """
29 |     Decorator that applies a registered CpuOffload hook to an arbitrary function rather than `forward`. This is useful
30 |     for cases where a PyTorch module provides functions other than `forward` that should trigger a move to the
31 |     appropriate acceleration device. This is the case for `encode` and `decode` in [`AutoencoderKL`].
32 | 
33 |     This decorator looks inside the internal `_hf_hook` property to find a registered offload hook.
34 | 
35 |     :param method: The method to decorate. This method should be a method of a PyTorch module.
36 |     """
37 |     if not is_accelerate_available():
38 |         return method
39 |     accelerate_version = version.parse(accelerate.__version__).base_version
40 |     if version.parse(accelerate_version) < version.parse("0.17.0"):
41 |         return method
42 | 
43 |     def wrapper(self, *args, **kwargs):
44 |         if hasattr(self, "_hf_hook") and hasattr(self._hf_hook, "pre_forward"):
45 |             self._hf_hook.pre_forward(self)
46 |         return method(self, *args, **kwargs)
47 | 
48 |     return wrapper
49 | 
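
The wrapper logic of `apply_forward_hook` above can be exercised without accelerate installed. The following is a minimal sketch, not the library's code: `RecordingHook`, `apply_hook_sketch`, and `ToyAutoencoder` are hypothetical names introduced for illustration, the accelerate version check is omitted, and `functools.wraps` is added here only to preserve the method name.

```python
import functools


class RecordingHook:
    """Hypothetical stand-in for an offload hook; it only records calls."""

    def __init__(self):
        self.calls = []

    def pre_forward(self, module):
        # A real offload hook would move `module` to the execution device here.
        self.calls.append(type(module).__name__)


def apply_hook_sketch(method):
    """Simplified decorator: same wrapper logic, no accelerate version check."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        # Trigger the hook only if one is attached and exposes `pre_forward`.
        if hasattr(self, "_hf_hook") and hasattr(self._hf_hook, "pre_forward"):
            self._hf_hook.pre_forward(self)
        return method(self, *args, **kwargs)

    return wrapper


class ToyAutoencoder:
    """Hypothetical module with a non-`forward` entry point."""

    @apply_hook_sketch
    def encode(self, x):
        return x * 2


model = ToyAutoencoder()
plain_result = model.encode(3)  # no hook attached: the method runs unchanged

model._hf_hook = RecordingHook()
hooked_result = model.encode(3)  # the hook fires first, then the method runs
```

In this sketch the decorated method behaves identically with or without a hook; the hook only adds a side effect before the call, which is the property that lets `encode` and `decode` trigger a device move.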


--------------------------------------------------------------------------------
/tests/models/test_activations.py:
--------------------------------------------------------------------------------
 1 | import unittest
 2 | 
 3 | import torch
 4 | from torch import nn
 5 | 
 6 | from diffusers.models.activations import get_activation
 7 | 
 8 | 
 9 | class ActivationsTests(unittest.TestCase):
10 |     def test_swish(self):
11 |         act = get_activation("swish")
12 | 
13 |         self.assertIsInstance(act, nn.SiLU)
14 | 
15 |         self.assertEqual(act(torch.tensor(-100, dtype=torch.float32)).item(), 0)
16 |         self.assertNotEqual(act(torch.tensor(-1, dtype=torch.float32)).item(), 0)
17 |         self.assertEqual(act(torch.tensor(0, dtype=torch.float32)).item(), 0)
18 |         self.assertEqual(act(torch.tensor(20, dtype=torch.float32)).item(), 20)
19 | 
20 |     def test_silu(self):
21 |         act = get_activation("silu")
22 | 
23 |         self.assertIsInstance(act, nn.SiLU)
24 | 
25 |         self.assertEqual(act(torch.tensor(-100, dtype=torch.float32)).item(), 0)
26 |         self.assertNotEqual(act(torch.tensor(-1, dtype=torch.float32)).item(), 0)
27 |         self.assertEqual(act(torch.tensor(0, dtype=torch.float32)).item(), 0)
28 |         self.assertEqual(act(torch.tensor(20, dtype=torch.float32)).item(), 20)
29 | 
30 |     def test_mish(self):
31 |         act = get_activation("mish")
32 | 
33 |         self.assertIsInstance(act, nn.Mish)
34 | 
35 |         self.assertEqual(act(torch.tensor(-200, dtype=torch.float32)).item(), 0)
36 |         self.assertNotEqual(act(torch.tensor(-1, dtype=torch.float32)).item(), 0)
37 |         self.assertEqual(act(torch.tensor(0, dtype=torch.float32)).item(), 0)
38 |         self.assertEqual(act(torch.tensor(20, dtype=torch.float32)).item(), 20)
39 | 
40 |     def test_gelu(self):
41 |         act = get_activation("gelu")
42 | 
43 |         self.assertIsInstance(act, nn.GELU)
44 | 
45 |         self.assertEqual(act(torch.tensor(-100, dtype=torch.float32)).item(), 0)
46 |         self.assertNotEqual(act(torch.tensor(-1, dtype=torch.float32)).item(), 0)
47 |         self.assertEqual(act(torch.tensor(0, dtype=torch.float32)).item(), 0)
48 |         self.assertEqual(act(torch.tensor(20, dtype=torch.float32)).item(), 20)
49 | 


--------------------------------------------------------------------------------
/.github/workflows/push_tests_mps.yml:
--------------------------------------------------------------------------------
 1 | name: Fast mps tests on main
 2 | 
 3 | on:
 4 |   push:
 5 |     branches:
 6 |       - main
 7 | 
 8 | env:
 9 |   DIFFUSERS_IS_CI: yes
10 |   HF_HOME: /mnt/cache
11 |   OMP_NUM_THREADS: 8
12 |   MKL_NUM_THREADS: 8
13 |   PYTEST_TIMEOUT: 600
14 |   RUN_SLOW: no
15 | 
16 | jobs:
17 |   run_fast_tests_apple_m1:
18 |     name: Fast PyTorch MPS tests on MacOS
19 |     runs-on: [ self-hosted, apple-m1 ]
20 | 
21 |     steps:
22 |     - name: Checkout diffusers
23 |       uses: actions/checkout@v3
24 |       with:
25 |         fetch-depth: 2
26 | 
27 |     - name: Clean checkout
28 |       shell: arch -arch arm64 bash {0}
29 |       run: |
30 |         git clean -fxd
31 | 
32 |     - name: Setup miniconda
33 |       uses: ./.github/actions/setup-miniconda
34 |       with:
35 |         python-version: 3.9
36 | 
37 |     - name: Install dependencies
38 |       shell: arch -arch arm64 bash {0}
39 |       run: |
40 |         ${CONDA_RUN} python -m pip install --upgrade pip
41 |         ${CONDA_RUN} python -m pip install -e .[quality,test]
42 |         ${CONDA_RUN} python -m pip install torch torchvision torchaudio
43 |         ${CONDA_RUN} python -m pip install accelerate --upgrade
44 |         ${CONDA_RUN} python -m pip install transformers --upgrade
45 | 
46 |     - name: Environment
47 |       shell: arch -arch arm64 bash {0}
48 |       run: |
49 |         ${CONDA_RUN} python utils/print_env.py
50 | 
51 |     - name: Run fast PyTorch tests on M1 (MPS)
52 |       shell: arch -arch arm64 bash {0}
53 |       env:
54 |         HF_HOME: /System/Volumes/Data/mnt/cache
55 |         HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
56 |       run: |
57 |         ${CONDA_RUN} python -m pytest -n 0 -s -v --make-reports=tests_torch_mps tests/
58 | 
59 |     - name: Failure short reports
60 |       if: ${{ failure() }}
61 |       run: cat reports/tests_torch_mps_failures_short.txt
62 | 
63 |     - name: Test suite reports artifacts
64 |       if: ${{ always() }}
65 |       uses: actions/upload-artifact@v2
66 |       with:
67 |         name: pr_torch_mps_test_reports
68 |         path: reports
69 | 


--------------------------------------------------------------------------------
/docs/source/en/api/models/unet2d-cond.md:
--------------------------------------------------------------------------------
 1 | # UNet2DConditionModel
 2 | 
 3 | The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 2D UNet conditional model.
 4 | 
 5 | The abstract from the paper is:
 6 | 
 7 | *There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
 8 | 
 9 | ## UNet2DConditionModel
10 | [[autodoc]] UNet2DConditionModel
11 | 
12 | ## UNet2DConditionOutput
13 | [[autodoc]] models.unet_2d_condition.UNet2DConditionOutput
14 | 
15 | ## FlaxUNet2DConditionModel
16 | [[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionModel
17 | 
18 | ## FlaxUNet2DConditionOutput
19 | [[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionOutput


--------------------------------------------------------------------------------
/scripts/conversion_ldm_uncond.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | 
 3 | from omegaconf import OmegaConf
 4 | import torch
 5 | 
 6 | from diffusers import DDIMScheduler, LDMPipeline, UNetLDMModel, VQModel
 7 | 
 8 | 
 9 | def convert_ldm_original(checkpoint_path, config_path, output_path):
10 |     config = OmegaConf.load(config_path)
11 |     state_dict = torch.load(checkpoint_path, map_location="cpu")["model"]
12 |     keys = list(state_dict.keys())
13 | 
14 |     # extract state_dict for VQVAE
15 |     first_stage_dict = {}
16 |     first_stage_key = "first_stage_model."
17 |     for key in keys:
18 |         if key.startswith(first_stage_key):
19 |             first_stage_dict[key.replace(first_stage_key, "")] = state_dict[key]
20 | 
21 |     # extract state_dict for UNetLDM
22 |     unet_state_dict = {}
23 |     unet_key = "model.diffusion_model."
24 |     for key in keys:
25 |         if key.startswith(unet_key):
26 |             unet_state_dict[key.replace(unet_key, "")] = state_dict[key]
27 | 
28 |     vqvae_init_args = config.model.params.first_stage_config.params
29 |     unet_init_args = config.model.params.unet_config.params
30 | 
31 |     vqvae = VQModel(**vqvae_init_args).eval()
32 |     vqvae.load_state_dict(first_stage_dict)
33 | 
34 |     unet = UNetLDMModel(**unet_init_args).eval()
35 |     unet.load_state_dict(unet_state_dict)
36 | 
37 |     noise_scheduler = DDIMScheduler(
38 |         timesteps=config.model.params.timesteps,
39 |         beta_schedule="scaled_linear",
40 |         beta_start=config.model.params.linear_start,
41 |         beta_end=config.model.params.linear_end,
42 |         clip_sample=False,
43 |     )
44 | 
45 |     pipeline = LDMPipeline(vqvae, unet, noise_scheduler)
46 |     pipeline.save_pretrained(output_path)
47 | 
48 | 
49 | if __name__ == "__main__":
50 |     parser = argparse.ArgumentParser()
51 |     parser.add_argument("--checkpoint_path", type=str, required=True)
52 |     parser.add_argument("--config_path", type=str, required=True)
53 |     parser.add_argument("--output_path", type=str, required=True)
54 |     args = parser.parse_args()
55 | 
56 |     convert_ldm_original(args.checkpoint_path, args.config_path, args.output_path)
57 | 
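
The prefix-based extraction used for the VQVAE and UNet weights in the script above can be sketched on a toy dictionary. This is an illustrative sketch only: `split_by_prefix` is a hypothetical helper, the integer values stand in for tensors, and it strips the prefix by slicing, whereas the script uses `str.replace`, which would also remove later occurrences of the prefix inside a key.

```python
def split_by_prefix(state_dict, prefix):
    """Return the entries whose key starts with `prefix`, with the prefix removed."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Toy checkpoint: integers stand in for weight tensors.
state_dict = {
    "first_stage_model.encoder.weight": 1,
    "first_stage_model.decoder.weight": 2,
    "model.diffusion_model.input.weight": 3,
    "unrelated.key": 4,
}

first_stage_dict = split_by_prefix(state_dict, "first_stage_model.")
unet_state_dict = split_by_prefix(state_dict, "model.diffusion_model.")
```

Keys that match neither prefix, such as `unrelated.key` here, are simply left out of both sub-dictionaries.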


--------------------------------------------------------------------------------
/tests/others/test_dependencies.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | import inspect
16 | import unittest
17 | from importlib import import_module
18 | 
19 | 
20 | class DependencyTester(unittest.TestCase):
21 |     def test_diffusers_import(self):
22 |         try:
23 |             import diffusers  # noqa: F401
24 |         except ImportError:
25 |             assert False
26 | 
27 |     def test_backend_registration(self):
28 |         import diffusers
29 |         from diffusers.dependency_versions_table import deps
30 | 
31 |         all_classes = inspect.getmembers(diffusers, inspect.isclass)
32 | 
33 |         for cls_name, cls_module in all_classes:
34 |             if "dummy_" in cls_module.__module__:
35 |                 for backend in cls_module._backends:
36 |                     if backend == "k_diffusion":
37 |                         backend = "k-diffusion"
38 |                     elif backend == "invisible_watermark":
39 |                         backend = "invisible-watermark"
40 |                     assert backend in deps, f"{backend} is not in the deps table!"
41 | 
42 |     def test_pipeline_imports(self):
43 |         import diffusers
44 |         import diffusers.pipelines
45 | 
46 |         all_classes = inspect.getmembers(diffusers, inspect.isclass)
47 |         for cls_name, cls_module in all_classes:
48 |             if hasattr(diffusers.pipelines, cls_name):
49 |                 pipeline_folder_module = ".".join(str(cls_module.__module__).split(".")[:3])
50 |                 _ = import_module(pipeline_folder_module, str(cls_name))
51 | 


--------------------------------------------------------------------------------
/docs/source/en/api/schedulers/singlestep_dpm_solver.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # DPMSolverSinglestepScheduler
14 | 
15 | `DPMSolverSinglestepScheduler` is a single step scheduler from [DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps](https://huggingface.co/papers/2206.00927) and [DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models](https://huggingface.co/papers/2211.01095) by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu.
16 | 
17 | DPMSolver (and the improved version DPMSolver++) is a fast dedicated high-order solver for diffusion ODEs with a convergence order guarantee. Empirically, DPMSolver sampling with only 20 steps can generate high-quality
18 | samples, and it can generate quite good samples even in 10 steps.
19 | 
20 | The original implementation can be found at [LuChengTHU/dpm-solver](https://github.com/LuChengTHU/dpm-solver).
21 | 
22 | ## Tips
23 | 
24 | It is recommended to set `solver_order` to 2 for guided sampling, and `solver_order=3` for unconditional sampling.
25 | 
26 | Dynamic thresholding from Imagen (https://huggingface.co/papers/2205.11487) is supported, and for pixel-space
27 | diffusion models, you can set both `algorithm_type="dpmsolver++"` and `thresholding=True` to use dynamic
28 | thresholding. This thresholding method is unsuitable for latent-space diffusion models such as
29 | Stable Diffusion.
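
The idea behind dynamic thresholding can be sketched in plain Python. This is a simplified illustration of the technique described in the Imagen paper, not the scheduler's implementation: `dynamic_threshold` and its `ratio` parameter are hypothetical names, the percentile uses a nearest-rank rule (an assumption; libraries usually interpolate), and the input is assumed to be a non-empty flat list of predicted pixel values.

```python
def dynamic_threshold(sample, ratio=0.995):
    """Clip a predicted sample to [-s, s] and rescale it into [-1, 1].

    `s` is a high percentile of the absolute values, but never below 1,
    so samples that already lie within [-1, 1] are left unchanged.
    """
    abs_sorted = sorted(abs(value) for value in sample)
    # Nearest-rank percentile, clamped to the last valid index.
    index = min(len(abs_sorted) - 1, int(ratio * len(abs_sorted)))
    s = max(abs_sorted[index], 1.0)
    return [max(-s, min(s, value)) / s for value in sample]


# Out-of-range prediction: the threshold becomes 3.0 with ratio=0.5.
clipped = dynamic_threshold([0.2, -0.5, 3.0, -4.0], ratio=0.5)

# In-range prediction: the threshold stays at 1.0, so nothing changes.
unchanged = dynamic_threshold([0.2, -0.5], ratio=0.5)
```

Because the result is always rescaled into [-1, 1], this only makes sense when the model predicts values with a known pixel range, which is why the tip above restricts it to pixel-space models.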
30 | 
31 | ## DPMSolverSinglestepScheduler
32 | [[autodoc]] DPMSolverSinglestepScheduler
33 | 
34 | ## SchedulerOutput
35 | [[autodoc]] schedulers.scheduling_utils.SchedulerOutput


--------------------------------------------------------------------------------
/src/diffusers/pipelines/stable_diffusion/stable_unclip_image_normalizer.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | from typing import Optional, Union
16 | 
17 | import torch
18 | from torch import nn
19 | 
20 | from ...configuration_utils import ConfigMixin, register_to_config
21 | from ...models.modeling_utils import ModelMixin
22 | 
23 | 
24 | class StableUnCLIPImageNormalizer(ModelMixin, ConfigMixin):
25 |     """
26 |     This class is used to hold the mean and standard deviation of the CLIP embedder used in stable unCLIP.
27 | 
28 |     It is used to normalize the image embeddings before the noise is applied and un-normalize the noised image
29 |     embeddings.
30 |     """
31 | 
32 |     @register_to_config
33 |     def __init__(
34 |         self,
35 |         embedding_dim: int = 768,
36 |     ):
37 |         super().__init__()
38 | 
39 |         self.mean = nn.Parameter(torch.zeros(1, embedding_dim))
40 |         self.std = nn.Parameter(torch.ones(1, embedding_dim))
41 | 
42 |     def to(
43 |         self,
44 |         torch_device: Optional[Union[str, torch.device]] = None,
45 |         torch_dtype: Optional[torch.dtype] = None,
46 |     ):
47 |         self.mean = nn.Parameter(self.mean.to(torch_device).to(torch_dtype))
48 |         self.std = nn.Parameter(self.std.to(torch_device).to(torch_dtype))
49 |         return self
50 | 
51 |     def scale(self, embeds):
52 |         embeds = (embeds - self.mean) * 1.0 / self.std
53 |         return embeds
54 | 
55 |     def unscale(self, embeds):
56 |         embeds = (embeds * self.std) + self.mean
57 |         return embeds
58 | 
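
The `scale` and `unscale` methods above are inverses of each other, which can be checked on plain lists. This is a minimal sketch using Python floats instead of tensors; the function names mirror the methods, but the `mean`, `std`, and `embeds` values are made up for illustration.

```python
def scale(embeds, mean, std):
    """Normalize each component: (x - mean) / std."""
    return [(e - m) / s for e, m, s in zip(embeds, mean, std)]


def unscale(embeds, mean, std):
    """Undo the normalization: x * std + mean."""
    return [e * s + m for e, m, s in zip(embeds, mean, std)]


# Hypothetical per-dimension statistics and a 3-dimensional embedding.
mean = [0.5, -1.0, 2.0]
std = [2.0, 0.5, 4.0]
embeds = [1.5, -0.5, 10.0]

scaled = scale(embeds, mean, std)
restored = unscale(scaled, mean, std)
```

In stable unCLIP the noise is applied between the two calls, so the round trip is not exact in practice; the sketch only shows that the two transforms cancel when nothing happens in between.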


--------------------------------------------------------------------------------
/tests/others/test_hub_utils.py:
--------------------------------------------------------------------------------
 1 | # coding=utf-8
 2 | # Copyright 2023 HuggingFace Inc.
 3 | #
 4 | # Licensed under the Apache License, Version 2.0 (the "License");
 5 | # you may not use this file except in compliance with the License.
 6 | # You may obtain a copy of the License at
 7 | #
 8 | #     http://www.apache.org/licenses/LICENSE-2.0
 9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | import unittest
16 | from pathlib import Path
17 | from tempfile import TemporaryDirectory
18 | from unittest.mock import Mock, patch
19 | 
20 | import diffusers.utils.hub_utils
21 | 
22 | 
23 | class CreateModelCardTest(unittest.TestCase):
24 |     @patch("diffusers.utils.hub_utils.get_full_repo_name")
25 |     def test_create_model_card(self, repo_name_mock: Mock) -> None:
26 |         repo_name_mock.return_value = "full_repo_name"
27 |         with TemporaryDirectory() as tmpdir:
28 |             # Dummy args values
29 |             args = Mock()
30 |             args.output_dir = tmpdir
31 |             args.local_rank = 0
32 |             args.hub_token = "hub_token"
33 |             args.dataset_name = "dataset_name"
34 |             args.learning_rate = 0.01
35 |             args.train_batch_size = 100000
36 |             args.eval_batch_size = 10000
37 |             args.gradient_accumulation_steps = 0.01
38 |             args.adam_beta1 = 0.02
39 |             args.adam_beta2 = 0.03
40 |             args.adam_weight_decay = 0.0005
41 |             args.adam_epsilon = 0.000001
42 |             args.lr_scheduler = 1
43 |             args.lr_warmup_steps = 10
44 |             args.ema_inv_gamma = 0.001
45 |             args.ema_power = 0.1
46 |             args.ema_max_decay = 0.2
47 |             args.mixed_precision = True
48 | 
49 |             # Model card must be rendered and saved
50 |             diffusers.utils.hub_utils.create_model_card(args, model_name="model_name")
51 |             self.assertTrue((Path(tmpdir) / "README.md").is_file())
52 | 


--------------------------------------------------------------------------------
/src/diffusers/utils/dummy_flax_and_transformers_objects.py:
--------------------------------------------------------------------------------
 1 | # This file is autogenerated by the command `make fix-copies`, do not edit.
 2 | from ..utils import DummyObject, requires_backends
 3 | 
 4 | 
 5 | class FlaxStableDiffusionControlNetPipeline(metaclass=DummyObject):
 6 |     _backends = ["flax", "transformers"]
 7 | 
 8 |     def __init__(self, *args, **kwargs):
 9 |         requires_backends(self, ["flax", "transformers"])
10 | 
11 |     @classmethod
12 |     def from_config(cls, *args, **kwargs):
13 |         requires_backends(cls, ["flax", "transformers"])
14 | 
15 |     @classmethod
16 |     def from_pretrained(cls, *args, **kwargs):
17 |         requires_backends(cls, ["flax", "transformers"])
18 | 
19 | 
20 | class FlaxStableDiffusionImg2ImgPipeline(metaclass=DummyObject):
21 |     _backends = ["flax", "transformers"]
22 | 
23 |     def __init__(self, *args, **kwargs):
24 |         requires_backends(self, ["flax", "transformers"])
25 | 
26 |     @classmethod
27 |     def from_config(cls, *args, **kwargs):
28 |         requires_backends(cls, ["flax", "transformers"])
29 | 
30 |     @classmethod
31 |     def from_pretrained(cls, *args, **kwargs):
32 |         requires_backends(cls, ["flax", "transformers"])
33 | 
34 | 
35 | class FlaxStableDiffusionInpaintPipeline(metaclass=DummyObject):
36 |     _backends = ["flax", "transformers"]
37 | 
38 |     def __init__(self, *args, **kwargs):
39 |         requires_backends(self, ["flax", "transformers"])
40 | 
41 |     @classmethod
42 |     def from_config(cls, *args, **kwargs):
43 |         requires_backends(cls, ["flax", "transformers"])
44 | 
45 |     @classmethod
46 |     def from_pretrained(cls, *args, **kwargs):
47 |         requires_backends(cls, ["flax", "transformers"])
48 | 
49 | 
50 | class FlaxStableDiffusionPipeline(metaclass=DummyObject):
51 |     _backends = ["flax", "transformers"]
52 | 
53 |     def __init__(self, *args, **kwargs):
54 |         requires_backends(self, ["flax", "transformers"])
55 | 
56 |     @classmethod
57 |     def from_config(cls, *args, **kwargs):
58 |         requires_backends(cls, ["flax", "transformers"])
59 | 
60 |     @classmethod
61 |     def from_pretrained(cls, *args, **kwargs):
62 |         requires_backends(cls, ["flax", "transformers"])
63 | 


--------------------------------------------------------------------------------
/docs/source/en/api/outputs.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # Outputs
14 | 
15 | All model outputs are subclasses of [`~utils.BaseOutput`], data structures containing all the information returned by the model. The outputs can also be used as tuples or dictionaries.
16 | 
17 | For example:
18 | 
19 | ```python
20 | from diffusers import DDIMPipeline
21 | 
22 | pipeline = DDIMPipeline.from_pretrained("google/ddpm-cifar10-32")
23 | outputs = pipeline()
24 | ```
25 | 
26 | The `outputs` object is a [`~pipelines.ImagePipelineOutput`], which means it has an `images` attribute.
27 | 
28 | You can access each attribute as you normally would or with a keyword lookup, and if that attribute is not returned by the model, you will get `None`:
29 | 
30 | ```python
31 | outputs.images
32 | outputs["images"]
33 | ```
34 | 
35 | When the `outputs` object is treated as a tuple, only the attributes that don't have `None` values are considered.
36 | For instance, slicing it as shown here returns the one-element tuple `(outputs.images,)`:
37 | 
38 | ```python
39 | outputs[:1]
40 | ```
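
The three access styles described above can be mimicked with a small stand-in class that needs no model download. This is a hedged sketch, not the library's `BaseOutput`: `SketchOutput` and the field name `extra` are hypothetical, and only the behavior described on this page (attribute access, key lookup, and a tuple view that skips `None` values) is modeled.

```python
class SketchOutput:
    """Hypothetical stand-in for an output container; not `BaseOutput`."""

    def __init__(self, **fields):
        # Store fields as attributes, keeping their order; values may be None.
        self.__dict__.update(fields)

    def to_tuple(self):
        # The tuple view skips every field whose value is None.
        return tuple(v for v in self.__dict__.values() if v is not None)

    def __getitem__(self, key):
        # String keys behave like a dictionary lookup,
        # integers and slices index into the tuple view.
        if isinstance(key, str):
            return self.__dict__[key]
        return self.to_tuple()[key]


outputs = SketchOutput(images=["img0", "img1"], extra=None)

by_attribute = outputs.images
by_key = outputs["images"]
as_slice = outputs[:1]
missing = outputs.extra
```

Here `as_slice` is a one-element tuple holding the images list, and `outputs.to_tuple()` has length 1 because the `None`-valued `extra` field is skipped.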
41 | 
42 | <Tip>
43 | 
44 | To check a specific pipeline or model output, refer to its corresponding API documentation.
45 | 
46 | </Tip>
47 | 
48 | ## BaseOutput
49 | 
50 | [[autodoc]] utils.BaseOutput
51 |     - to_tuple
52 | 
53 | ## ImagePipelineOutput
54 | 
55 | [[autodoc]] pipelines.ImagePipelineOutput
56 | 
57 | ## FlaxImagePipelineOutput
58 | 
59 | [[autodoc]] pipelines.pipeline_flax_utils.FlaxImagePipelineOutput
60 | 
61 | ## AudioPipelineOutput
62 | 
63 | [[autodoc]] pipelines.AudioPipelineOutput
64 | 
65 | ## ImageTextPipelineOutput
66 | 
67 | [[autodoc]] ImageTextPipelineOutput


--------------------------------------------------------------------------------
/docs/source/en/api/pipelines/dit.md:
--------------------------------------------------------------------------------
 1 | <!--Copyright 2023 The HuggingFace Team. All rights reserved.
 2 | 
 3 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 4 | the License. You may obtain a copy of the License at
 5 | 
 6 | http://www.apache.org/licenses/LICENSE-2.0
 7 | 
 8 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
 9 | an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10 | specific language governing permissions and limitations under the License.
11 | -->
12 | 
13 | # DiT
14 | 
15 | [Scalable Diffusion Models with Transformers](https://huggingface.co/papers/2212.09748) (DiT) is by William Peebles and Saining Xie.
16 | 
17 | The abstract from the paper is:
18 | 
19 | *We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches. We analyze the scalability of our Diffusion Transformers (DiTs) through the lens of forward pass complexity as measured by Gflops. We find that DiTs with higher Gflops -- through increased transformer depth/width or increased number of input tokens -- consistently have lower FID. In addition to possessing good scalability properties, our largest DiT-XL/2 models outperform all prior diffusion models on the class-conditional ImageNet 512x512 and 256x256 benchmarks, achieving a state-of-the-art FID of 2.27 on the latter.*
20 | 
21 | The original codebase can be found at [facebookresearch/dit](https://github.com/facebookresearch/dit).
22 | 
23 | <Tip>
24 | 
25 | Make sure to check out the Schedulers [guide](/using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](/using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
26 | 
27 | </Tip>
28 | 
29 | ## DiTPipeline
30 | [[autodoc]] DiTPipeline
31 | 	- all
32 | 	- __call__
33 | 
34 | ## ImagePipelineOutput
35 | [[autodoc]] pipelines.ImagePipelineOutput


--------------------------------------------------------------------------------