├── .github
    └── workflows
    │   └── pre-commit.yaml
├── .gitignore
├── .idea
    └── workspace.xml
├── .pre-commit-config.yaml
├── .pylintrc
├── LICENSE
├── Makefile
├── Model Merge And Analysis Tools
    ├── Enhanced_Mixer.py
    ├── Enhanced_Mixer_Requirements.txt
    ├── LM_BlockMerge.py
    ├── LM_BlockMerge_Requirements.txt
    ├── StratusScope.py
    ├── StratusScope_BarGraph.png
    ├── StratusScope_ConsoleOutput.png
    ├── StratusScope_Requirements.txt
    └── __Quick Tool Explainer__.txt
├── README.md
├── clustering
    ├── download.py
    ├── feature_extractor.py
    ├── hierarchical_clustering.py
    ├── memmap_utils.py
    └── train_clusterer.py
├── conda-mdel.yml
├── configs
    ├── fp16_1-3B_4M_bs_1.4T_tok_summit_pp3_mp2_256_nodes.yml
    ├── fp16_2-7B_4M_bs_1.4T_tok_summit_pp6_mp2_256nodes.yml
    └── fp16_6-7B_4M_bs_1T_tok_summit_pp12_mp2_mbs2_512_nodes_real.yml
├── distillation_sparsification
    ├── README.md
    ├── datautils.py
    ├── distill.py
    ├── falcon.py
    ├── lion.py
    ├── lm_seqs_dataset.py
    ├── make_student.py
    ├── modelutils.py
    ├── process_data.py
    ├── quant.py
    ├── sparsegpt.py
    ├── test.py
    ├── test1.py
    ├── tracker.py
    ├── train.py
    └── utils.py
├── docs
    └── .gitkeep
├── lora-x
    ├── README.md
    ├── bpt_attention_plugin.py
    ├── bpt_pt.py
    ├── bpt_triton.py
    ├── config.py
    ├── configs
    │   └── zero3_offload_config.json
    ├── data.py
    ├── experimental
    │   ├── qlora_lomo.py
    │   └── train_qlora_lomo.py
    ├── flash_patch.py
    ├── lora.py
    ├── qlora_bpt.py
    ├── requirements.txt
    ├── scripts
    │   └── juwels_booster.sh
    └── utils.py
├── notebooks
    ├── .gitkeep
    ├── CalculatePerplexity.ipynb
    └── Merge_N_Experts.ipynb
├── requirements.txt
├── resources.md
├── scripts
    ├── c-btmInference.py
    ├── calc_perplexities.sh
    ├── calc_perplexities_slurm.sh
    ├── create_domain_pile_mix.sh
    ├── get_pile_shard1_data.sh
    └── upload_to_hf.sh
├── setup.py
└── src
    └── mdel
        ├── __init__.py
        ├── calculate_perplexity.py
        ├── configs
            ├── config.yaml
            └── zero_config.json
        ├── eval_merges.py
        ├── iterate_layers.sh
        ├── merge_experts.py
        ├── pile_upload.py
        ├── pile_utils.py
        ├── train.sh
        ├── train_cbtm_classifier.py
        ├── train_chat.sh
        ├── train_ds.sh
        ├── trainer.py
        ├── trainer_chat.bat
        └── trainer_chat.py

/.github/workflows/pre-commit.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/.github/workflows/pre-commit.yaml
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/.gitignore
--------------------------------------------------------------------------------
/.idea/workspace.xml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/.idea/workspace.xml
--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/.pre-commit-config.yaml
--------------------------------------------------------------------------------
/.pylintrc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/.pylintrc
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/LICENSE
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Makefile
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/Enhanced_Mixer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/Enhanced_Mixer.py
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/Enhanced_Mixer_Requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/Enhanced_Mixer_Requirements.txt
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/LM_BlockMerge.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/LM_BlockMerge.py
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/LM_BlockMerge_Requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/LM_BlockMerge_Requirements.txt
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/StratusScope.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/StratusScope.py
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/StratusScope_BarGraph.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/StratusScope_BarGraph.png
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/StratusScope_ConsoleOutput.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/StratusScope_ConsoleOutput.png
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/StratusScope_Requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/StratusScope_Requirements.txt
--------------------------------------------------------------------------------
/Model Merge And Analysis Tools/__Quick Tool Explainer__.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/Model Merge And Analysis Tools/__Quick Tool Explainer__.txt
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/README.md
--------------------------------------------------------------------------------
/clustering/download.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/clustering/download.py
--------------------------------------------------------------------------------
/clustering/feature_extractor.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/clustering/feature_extractor.py
--------------------------------------------------------------------------------
/clustering/hierarchical_clustering.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/clustering/hierarchical_clustering.py
--------------------------------------------------------------------------------
/clustering/memmap_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/clustering/memmap_utils.py
--------------------------------------------------------------------------------
/clustering/train_clusterer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/clustering/train_clusterer.py
--------------------------------------------------------------------------------
/conda-mdel.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/conda-mdel.yml
--------------------------------------------------------------------------------
/configs/fp16_1-3B_4M_bs_1.4T_tok_summit_pp3_mp2_256_nodes.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/configs/fp16_1-3B_4M_bs_1.4T_tok_summit_pp3_mp2_256_nodes.yml
--------------------------------------------------------------------------------
/configs/fp16_2-7B_4M_bs_1.4T_tok_summit_pp6_mp2_256nodes.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/configs/fp16_2-7B_4M_bs_1.4T_tok_summit_pp6_mp2_256nodes.yml
--------------------------------------------------------------------------------
/configs/fp16_6-7B_4M_bs_1T_tok_summit_pp12_mp2_mbs2_512_nodes_real.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/configs/fp16_6-7B_4M_bs_1T_tok_summit_pp12_mp2_mbs2_512_nodes_real.yml
--------------------------------------------------------------------------------
/distillation_sparsification/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/README.md
--------------------------------------------------------------------------------
/distillation_sparsification/datautils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/datautils.py
--------------------------------------------------------------------------------
/distillation_sparsification/distill.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/distill.py
--------------------------------------------------------------------------------
/distillation_sparsification/falcon.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/falcon.py
--------------------------------------------------------------------------------
/distillation_sparsification/lion.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/lion.py
--------------------------------------------------------------------------------
/distillation_sparsification/lm_seqs_dataset.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/lm_seqs_dataset.py
--------------------------------------------------------------------------------
/distillation_sparsification/make_student.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/make_student.py
--------------------------------------------------------------------------------
/distillation_sparsification/modelutils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/modelutils.py
--------------------------------------------------------------------------------
/distillation_sparsification/process_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/process_data.py
--------------------------------------------------------------------------------
/distillation_sparsification/quant.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/quant.py
--------------------------------------------------------------------------------
/distillation_sparsification/sparsegpt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/sparsegpt.py
--------------------------------------------------------------------------------
/distillation_sparsification/test.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/test.py
--------------------------------------------------------------------------------
/distillation_sparsification/test1.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/test1.py
--------------------------------------------------------------------------------
/distillation_sparsification/tracker.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/tracker.py
--------------------------------------------------------------------------------
/distillation_sparsification/train.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/train.py
--------------------------------------------------------------------------------
/distillation_sparsification/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/distillation_sparsification/utils.py
--------------------------------------------------------------------------------
/docs/.gitkeep:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/lora-x/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/README.md
--------------------------------------------------------------------------------
/lora-x/bpt_attention_plugin.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/bpt_attention_plugin.py
--------------------------------------------------------------------------------
/lora-x/bpt_pt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/bpt_pt.py
--------------------------------------------------------------------------------
/lora-x/bpt_triton.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/bpt_triton.py
--------------------------------------------------------------------------------
/lora-x/config.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/config.py
--------------------------------------------------------------------------------
/lora-x/configs/zero3_offload_config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/configs/zero3_offload_config.json
--------------------------------------------------------------------------------
/lora-x/data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/data.py
--------------------------------------------------------------------------------
/lora-x/experimental/qlora_lomo.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/experimental/qlora_lomo.py
--------------------------------------------------------------------------------
/lora-x/experimental/train_qlora_lomo.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/experimental/train_qlora_lomo.py
--------------------------------------------------------------------------------
/lora-x/flash_patch.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/flash_patch.py
--------------------------------------------------------------------------------
/lora-x/lora.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/lora.py
--------------------------------------------------------------------------------
/lora-x/qlora_bpt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/qlora_bpt.py
--------------------------------------------------------------------------------
/lora-x/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/requirements.txt
--------------------------------------------------------------------------------
/lora-x/scripts/juwels_booster.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/scripts/juwels_booster.sh
--------------------------------------------------------------------------------
/lora-x/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/lora-x/utils.py
--------------------------------------------------------------------------------
/notebooks/.gitkeep:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/notebooks/CalculatePerplexity.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/notebooks/CalculatePerplexity.ipynb
--------------------------------------------------------------------------------
/notebooks/Merge_N_Experts.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/notebooks/Merge_N_Experts.ipynb
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/requirements.txt
--------------------------------------------------------------------------------
/resources.md:
--------------------------------------------------------------------------------
https://github.com/TehVenomm/LM_Transformers_BlockMerge
--------------------------------------------------------------------------------
/scripts/c-btmInference.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/scripts/c-btmInference.py
--------------------------------------------------------------------------------
/scripts/calc_perplexities.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/scripts/calc_perplexities.sh
--------------------------------------------------------------------------------
/scripts/calc_perplexities_slurm.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/scripts/calc_perplexities_slurm.sh
--------------------------------------------------------------------------------
/scripts/create_domain_pile_mix.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/scripts/create_domain_pile_mix.sh
--------------------------------------------------------------------------------
/scripts/get_pile_shard1_data.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/scripts/get_pile_shard1_data.sh
--------------------------------------------------------------------------------
/scripts/upload_to_hf.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/scripts/upload_to_hf.sh
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/setup.py
--------------------------------------------------------------------------------
/src/mdel/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/src/mdel/calculate_perplexity.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/calculate_perplexity.py
--------------------------------------------------------------------------------
/src/mdel/configs/config.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/configs/config.yaml
--------------------------------------------------------------------------------
/src/mdel/configs/zero_config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/configs/zero_config.json
--------------------------------------------------------------------------------
/src/mdel/eval_merges.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/eval_merges.py
--------------------------------------------------------------------------------
/src/mdel/iterate_layers.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/iterate_layers.sh
--------------------------------------------------------------------------------
/src/mdel/merge_experts.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/merge_experts.py
--------------------------------------------------------------------------------
/src/mdel/pile_upload.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/pile_upload.py
--------------------------------------------------------------------------------
/src/mdel/pile_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/pile_utils.py
--------------------------------------------------------------------------------
/src/mdel/train.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/train.sh
--------------------------------------------------------------------------------
/src/mdel/train_cbtm_classifier.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/train_cbtm_classifier.py
--------------------------------------------------------------------------------
/src/mdel/train_chat.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/train_chat.sh
--------------------------------------------------------------------------------
/src/mdel/train_ds.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/train_ds.sh
--------------------------------------------------------------------------------
/src/mdel/trainer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/trainer.py
--------------------------------------------------------------------------------
/src/mdel/trainer_chat.bat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/trainer_chat.bat
--------------------------------------------------------------------------------
/src/mdel/trainer_chat.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huu4ontocord/MDEL/HEAD/src/mdel/trainer_chat.py
--------------------------------------------------------------------------------