├── .gitignore
├── 01-2_pytorch-fabric.py
├── 01_pytorch-vit.py
├── 02_mixed-precision.py
├── 03_bfloat16.py
├── 04_lower-batchsize.py
├── 05_gradient-accum.py
├── 06_sgd-with-scheduler.py
├── 07_01_init-module.py
├── 07_02_init-module.py
├── 07_03_init-module.py
├── 08-10-vit32
│   ├── 08_baseline.py
│   ├── 08a_fsdp-defaults.py
│   ├── 08b_fsdp-custom.py
│   ├── 08c_fsdp-size-wrap.py
│   ├── 09_fsdp-act-checkp.py
│   ├── 10_fsdp-with-cpu-offload.py
│   ├── 10b_fsdp-with-cpu-offload-no-act-check.py
│   └── local_utilities.py
├── 08_baseline.py
├── 08a_fsdp-defaults.py
├── 08b_fsdp-custom.py
├── 08c_fsdp-size-wrap.py
├── 09_fsdp-act-checkp.py
├── 10_fsdp-with-cpu-offload.py
├── 10b_fsdp-with-cpu-offload-no-act-check.py
├── 11_delay-allocation.py
├── 12_fsdp-overlap.py
├── LICENSE.txt
├── README.md
├── bonus_bigbird-after.py
├── bonus_bigbird-before.py
├── bonus_distilbert-after.py
├── bonus_distilbert-before.py
├── figures
│   └── overview.png
├── local_utilities.py
├── logs.md
└── requirements.txt

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/.gitignore
--------------------------------------------------------------------------------
/01-2_pytorch-fabric.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/01-2_pytorch-fabric.py
--------------------------------------------------------------------------------
/01_pytorch-vit.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/01_pytorch-vit.py
--------------------------------------------------------------------------------
/02_mixed-precision.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/02_mixed-precision.py
--------------------------------------------------------------------------------
/03_bfloat16.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/03_bfloat16.py
--------------------------------------------------------------------------------
/04_lower-batchsize.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/04_lower-batchsize.py
--------------------------------------------------------------------------------
/05_gradient-accum.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/05_gradient-accum.py
--------------------------------------------------------------------------------
/06_sgd-with-scheduler.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/06_sgd-with-scheduler.py
--------------------------------------------------------------------------------
/07_01_init-module.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/07_01_init-module.py
--------------------------------------------------------------------------------
/07_02_init-module.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/07_02_init-module.py
--------------------------------------------------------------------------------
/07_03_init-module.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/07_03_init-module.py
--------------------------------------------------------------------------------
/08-10-vit32/08_baseline.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/08_baseline.py
--------------------------------------------------------------------------------
/08-10-vit32/08a_fsdp-defaults.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/08a_fsdp-defaults.py
--------------------------------------------------------------------------------
/08-10-vit32/08b_fsdp-custom.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/08b_fsdp-custom.py
--------------------------------------------------------------------------------
/08-10-vit32/08c_fsdp-size-wrap.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/08c_fsdp-size-wrap.py
--------------------------------------------------------------------------------
/08-10-vit32/09_fsdp-act-checkp.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/09_fsdp-act-checkp.py
--------------------------------------------------------------------------------
/08-10-vit32/10_fsdp-with-cpu-offload.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/10_fsdp-with-cpu-offload.py
--------------------------------------------------------------------------------
/08-10-vit32/10b_fsdp-with-cpu-offload-no-act-check.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/10b_fsdp-with-cpu-offload-no-act-check.py
--------------------------------------------------------------------------------
/08-10-vit32/local_utilities.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08-10-vit32/local_utilities.py
--------------------------------------------------------------------------------
/08_baseline.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08_baseline.py
--------------------------------------------------------------------------------
/08a_fsdp-defaults.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08a_fsdp-defaults.py
--------------------------------------------------------------------------------
/08b_fsdp-custom.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08b_fsdp-custom.py
--------------------------------------------------------------------------------
/08c_fsdp-size-wrap.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/08c_fsdp-size-wrap.py
--------------------------------------------------------------------------------
/09_fsdp-act-checkp.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/09_fsdp-act-checkp.py
--------------------------------------------------------------------------------
/10_fsdp-with-cpu-offload.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/10_fsdp-with-cpu-offload.py
--------------------------------------------------------------------------------
/10b_fsdp-with-cpu-offload-no-act-check.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/10b_fsdp-with-cpu-offload-no-act-check.py
--------------------------------------------------------------------------------
/11_delay-allocation.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/11_delay-allocation.py
--------------------------------------------------------------------------------
/12_fsdp-overlap.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/12_fsdp-overlap.py
--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/LICENSE.txt
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/README.md
--------------------------------------------------------------------------------
/bonus_bigbird-after.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/bonus_bigbird-after.py
--------------------------------------------------------------------------------
/bonus_bigbird-before.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/bonus_bigbird-before.py
--------------------------------------------------------------------------------
/bonus_distilbert-after.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/bonus_distilbert-after.py
--------------------------------------------------------------------------------
/bonus_distilbert-before.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/bonus_distilbert-before.py
--------------------------------------------------------------------------------
/figures/overview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/figures/overview.png
--------------------------------------------------------------------------------
/local_utilities.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/local_utilities.py
--------------------------------------------------------------------------------
/logs.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/logs.md
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rasbt/pytorch-memory-optim/HEAD/requirements.txt
--------------------------------------------------------------------------------