├── .gitignore ├── INSTALLATION.md ├── README.md ├── configs ├── base_fp16.yaml ├── base_fp32.yaml ├── dynamo_fp16.yaml └── dynamo_fp32.yaml ├── experiments ├── accelerate_script.py ├── base.py ├── base_fp16.py ├── dynamic.py ├── dynamic_fp16.py ├── dynamic_optimized.py ├── dynamic_optimized_fp16.py ├── generate_script.py ├── optimize_forward.py ├── optimize_forward_fp16.py ├── optimize_model.py ├── optimize_model_fp16.py ├── optimize_train_step.py └── optimize_train_step_fp16.py ├── requirements.txt ├── run_experiments.sh ├── scripts ├── cv_classification.py ├── language_modeling.py ├── text_classification.py └── translation.py └── tools ├── summarize.py └── verify_dynamo.py /.gitignore: -------------------------------------------------------------------------------- 1 | .idea 2 | trained-resnet 3 | -------------------------------------------------------------------------------- /INSTALLATION.md: -------------------------------------------------------------------------------- 1 | # Installation guide on a new instance 2 | 3 | Jump to the last section if you already have CUDA installed. 4 | 5 | ## Install drivers: 6 | 7 | ```bash 8 | sudo apt install ubuntu-drivers-common 9 | ``` 10 | 11 | Run 12 | 13 | ```bash 14 | ubuntu-drivers devices 15 | ``` 16 | 17 | Output 18 | 19 | ``` 20 | == /sys/devices/pci0000:00/0000:00:04.0== 21 | modalias : pci:v000010DEd000020B0sv000010DEsd0000134Fbc03sc02i00 22 | vendor : NVIDIA Corporation 23 | driver : nvidia-driver-470-server - distro non-free 24 | driver : nvidia-driver-515-open - distro non-free recommended 25 | driver : nvidia-driver-515 - distro non-free 26 | driver : nvidia-driver-450-server - distro non-free 27 | driver : nvidia-driver-510 - distro non-free 28 | driver : nvidia-driver-510-server - distro non-free 29 | driver : nvidia-driver-515-server - distro non-free 30 | driver : nvidia-driver-470 - distro non-free 31 | driver : xserver-xorg-video-nouveau - distro free builtin 32 | ``` 33 | 34 | Pick the recommended driver version 35 | 36 | ```bash 37 | sudo apt install nvidia-headless-515-server nvidia-utils-515-server 38 | ``` 39 | 40 | Reboot 41 | ```bash 42 | sudo reboot 43 | ``` 44 | 45 | ## Install CUDA 46 | 47 | ```bash 48 | wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin 49 | sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600 50 | ``` 51 | 52 | Add the public key and repository 53 | ```bash 54 | sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub 55 | sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /" 56 | ``` 57 | 58 | Install the CUDA toolkit. 59 | If you want to use the conda setup, refer to the corresponding section below and skip these steps. 60 | 61 | ```bash 62 | sudo apt update 63 | sudo apt install cuda-toolkit-11-7 64 | ``` 65 | 66 | (you can press Tab after `cuda-toolkit-` to list all available versions.) 67 | 68 | Download [cuDNN](https://developer.nvidia.com/cudnn) and `scp` it to the instance.
69 | 70 | Extract and install it 71 | 72 | ```bash 73 | tar -xf cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz 74 | sudo cp cudnn-linux-x86_64-8.6.0.163_cuda11-archive/include/cudnn*.h /usr/local/cuda/include 75 | sudo cp cudnn-linux-x86_64-8.6.0.163_cuda11-archive/lib/libcudnn* /usr/local/cuda/lib64 76 | sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn* 77 | ``` 78 | 79 | Add to .bashrc 80 | 81 | ```bash 82 | export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64" 83 | export CUDA_HOME=/usr/local/cuda 84 | export PATH="/usr/local/cuda/bin:$PATH" 85 | ``` 86 | 87 | then 88 | 89 | ```bash 90 | source ~/.bashrc 91 | ``` 92 | 93 | Check that everything is working 94 | 95 | ```bash 96 | nvidia-smi 97 | ``` 98 | 99 | ## Python 100 | 101 | ```bash 102 | sudo apt-get install python3-pip 103 | sudo apt install python3.8-venv 104 | python3 -m venv dynamo 105 | ``` 106 | 107 | Add to .bashrc 108 | ```bash 109 | source dynamo/bin/activate 110 | ``` 111 | 112 | then 113 | 114 | ```bash 115 | source ~/.bashrc 116 | ``` 117 | 118 | Install the PyTorch nightlies with dynamo 119 | 120 | ```bash 121 | pip install numpy 122 | pip install --pre torch[dynamo] --extra-index-url https://download.pytorch.org/whl/nightly/cu117/ 123 | ``` 124 | 125 | # Conda Installation instructions 126 | 127 | 1. Install miniconda 128 | ```bash 129 | wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 130 | bash Miniconda3-latest-Linux-x86_64.sh 131 | ``` 132 | 133 | 2. Create a conda Python env and then activate it 134 | ```bash 135 | conda create --name dynamo python 136 | conda activate dynamo 137 | ``` 138 | 139 | 3. Install cuda-toolkit 11.7. Please refer to [cuda-toolkit](https://anaconda.org/nvidia/cuda-toolkit) 140 | for more information 141 | ```bash 142 | conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit 143 | ``` 144 | 145 | 4. Install PyTorch along with Torch Dynamo dependencies 146 | ```bash 147 | pip install numpy 148 | pip install --pre torch[dynamo] --extra-index-url https://download.pytorch.org/whl/nightly/cu117/ 149 | ``` 150 | 151 | 5. Verify torch-dynamo with the command below, assuming you are in the top-level folder 152 | ```bash 153 | python tools/verify_dynamo.py 154 | ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Repo to test torchdynamo 2 | 3 | `pip install -r requirements.txt` 4 | 5 | ## Experiments 6 | 7 | This folder contains quick reproducers to observe different behaviors. The base scripts give a benchmark without any optimization. Then we can observe what happens when optimizing different things.
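The `optimize_model`, `optimize_forward`, and `optimize_train_step` rows in the tables below correspond to three different ways of applying TorchDynamo. Since the `optimize_*` scripts are not reproduced in this excerpt, here is only a minimal sketch of the three strategies, assuming they follow the same `dynamo.optimize("inductor")` pattern as `dynamic_optimized.py` further down; the actual scripts may differ in details.

```python
import torch._dynamo as dynamo

# Illustrative sketch only -- not the exact contents of the optimize_* scripts.
def apply_strategy(model, optimizer, strategy):
    if strategy == "optimize_model":
        # Compile the whole nn.Module; the backward pass goes through AOT Autograd.
        return dynamo.optimize("inductor")(model), None
    if strategy == "optimize_forward":
        # Compile only the forward function, leaving the module object as-is.
        model.forward = dynamo.optimize("inductor")(model.forward)
        return model, None
    if strategy == "optimize_train_step":
        # Compile forward + backward + optimizer update as a single function.
        # This is the variant that triggers the AOT Autograd warnings quoted below.
        def train_step(batch):
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            return loss
        return model, dynamo.optimize("inductor")(train_step)
    return model, None
```

The FP16 variants additionally wrap the forward call in `torch.cuda.amp.autocast()`, as in `base_fp16.py`.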
The tables below report the average iteration time (all runs executed on an A100): 8 | 9 | Batch size 16: 10 | 11 | | Script | FP32 | FP16 | 12 | |:--|:-:|:-:| 13 | | base | 54.44ms | 62.24ms | 14 | | optimize_model | 38.20ms | 29.85ms | 15 | | optimize_forward | 38.36ms | 29.49ms | 16 | | train_step | x | x | 17 | 18 | Batch size 8: 19 | 20 | | Script | FP32 | FP16 | 21 | |:--|:-:|:-:| 22 | | base | 53.47ms | 59.68ms | 23 | | optimize_model | 28.34ms | 23.80ms | 24 | | optimize_forward | 28.16ms | 29.34ms | 25 | | train_step | 1754.47ms | 1740.21ms | 26 | 27 | Using torchdynamo to optimize the train step does not really work: it produces lots of warnings like the following, and the reported times are not right: 28 | 29 | ``` 30 | [2022-11-04 15:32:56,201] torch._dynamo.optimizations.training: [WARNING] Unable to use Aot Autograd because of presence of mutation 31 | [2022-11-04 15:32:56,201] torch._inductor.compile_fx: [WARNING] Aot Autograd is not safe to run, so falling back to eager 32 | ``` 33 | 34 | Reproducer: `python experiments/optimize_train_step_fp16.py` 35 | 36 | Dynamic sequence lengths: 37 | 38 | | Script | FP32 | FP16 | 39 | |:--|:-:|:-:| 40 | | dynamic | 59.23ms | 63.53ms | 41 | | dynamic_optimized | OOM? | OOM? | 42 | 43 | ## Scripts 44 | 45 | ### Text classification 46 | 47 | Average iteration time for training/evaluation (excluding the first iteration) when fine-tuning BERT on MRPC. Final results of the models are within the variance of fine-tuning; no particular performance drop is observed, except for FP16 + torchdynamo, which seems to underperform a bit (82%-84% accuracy compared to 86%-87% for the other runs, and 0.86/0.87 F1 score instead of 0.89/0.90). 48 | 49 | | Dynamo | FP32 | FP16 | 50 | |:--|:-:|:-:| 51 | | no | 57.9ms/15.65ms | 65.87ms/18.52ms | 52 | | inductor | 36.24ms/10.55ms | 39.43ms/9.09ms | 53 | 54 | To reproduce: 55 | 56 | ```bash 57 | accelerate launch --config_file configs/base_fp32.yaml scripts/text_classification.py --task_name mrpc 58 | ``` 59 | 60 | Change the config file to one of the four options in `configs` to get the four table cells.
61 | 62 | 63 | ### Language Modeling 64 | 65 | ```bash 66 | accelerate launch scripts/language_modeling.py \ 67 | --dataset_name wikitext \ 68 | --dataset_config_name wikitext-2-raw-v1 \ 69 | --model_name_or_path gpt2 \ 70 | --dynamo_backend inductor \ 71 | --mixed_precision fp16 72 | ``` 73 | 74 | ### Vision Classification 75 | 76 | ```bash 77 | accelerate launch scripts/cv_classification.py \ 78 | --model_name_or_path microsoft/resnet-18 \ 79 | --dataset_name beans \ 80 | --dynamo_backend inductor \ 81 | --mixed_precision no 82 | ``` -------------------------------------------------------------------------------- /configs/base_fp16.yaml: -------------------------------------------------------------------------------- 1 | command_file: null 2 | commands: null 3 | compute_environment: LOCAL_MACHINE 4 | deepspeed_config: {} 5 | distributed_type: 'NO' 6 | downcast_bf16: 'no' 7 | dynamo_backend: 'NO' 8 | fsdp_config: {} 9 | gpu_ids: all 10 | machine_rank: 0 11 | main_process_ip: null 12 | main_process_port: null 13 | main_training_function: main 14 | megatron_lm_config: {} 15 | mixed_precision: fp16 16 | num_machines: 1 17 | num_processes: 1 18 | rdzv_backend: static 19 | same_network: true 20 | tpu_name: null 21 | tpu_zone: null 22 | use_cpu: false 23 | -------------------------------------------------------------------------------- /configs/base_fp32.yaml: -------------------------------------------------------------------------------- 1 | command_file: null 2 | commands: null 3 | compute_environment: LOCAL_MACHINE 4 | deepspeed_config: {} 5 | distributed_type: 'NO' 6 | downcast_bf16: 'no' 7 | dynamo_backend: 'NO' 8 | fsdp_config: {} 9 | gpu_ids: all 10 | machine_rank: 0 11 | main_process_ip: null 12 | main_process_port: null 13 | main_training_function: main 14 | megatron_lm_config: {} 15 | mixed_precision: 'no' 16 | num_machines: 1 17 | num_processes: 1 18 | rdzv_backend: static 19 | same_network: true 20 | tpu_name: null 21 | tpu_zone: null 22 | use_cpu: false 23 | -------------------------------------------------------------------------------- /configs/dynamo_fp16.yaml: -------------------------------------------------------------------------------- 1 | command_file: null 2 | commands: null 3 | compute_environment: LOCAL_MACHINE 4 | deepspeed_config: {} 5 | distributed_type: 'NO' 6 | downcast_bf16: 'no' 7 | dynamo_backend: INDUCTOR 8 | fsdp_config: {} 9 | gpu_ids: all 10 | machine_rank: 0 11 | main_process_ip: null 12 | main_process_port: null 13 | main_training_function: main 14 | megatron_lm_config: {} 15 | mixed_precision: fp16 16 | num_machines: 1 17 | num_processes: 1 18 | rdzv_backend: static 19 | same_network: true 20 | tpu_name: null 21 | tpu_zone: null 22 | use_cpu: false 23 | -------------------------------------------------------------------------------- /configs/dynamo_fp32.yaml: -------------------------------------------------------------------------------- 1 | command_file: null 2 | commands: null 3 | compute_environment: LOCAL_MACHINE 4 | deepspeed_config: {} 5 | distributed_type: 'NO' 6 | downcast_bf16: 'no' 7 | dynamo_backend: INDUCTOR 8 | fsdp_config: {} 9 | gpu_ids: all 10 | machine_rank: 0 11 | main_process_ip: null 12 | main_process_port: null 13 | main_training_function: main 14 | megatron_lm_config: {} 15 | mixed_precision: 'no' 16 | num_machines: 1 17 | num_processes: 1 18 | rdzv_backend: static 19 | same_network: true 20 | tpu_name: null 21 | tpu_zone: null 22 | use_cpu: false 23 | -------------------------------------------------------------------------------- 
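The four configs above differ only in `mixed_precision` (`'no'` vs `fp16`) and `dynamo_backend` (`'NO'` vs `INDUCTOR`); the latter is what turns TorchDynamo/inductor on for the `accelerate launch` commands shown earlier. The same switch can also be made in code rather than through a config file; a minimal sketch, assuming a recent Accelerate release in which `Accelerator` accepts a `dynamo_backend` argument (check your installed version):

```python
from accelerate import Accelerator

# Roughly equivalent to launching with configs/dynamo_fp16.yaml
# (dynamo_backend: INDUCTOR, mixed_precision: fp16). The dynamo_backend
# keyword is assumed to exist in the installed Accelerate version.
accelerator = Accelerator(mixed_precision="fp16", dynamo_backend="inductor")

# The model, optimizer, etc. are then passed through accelerator.prepare(...)
# exactly as in experiments/accelerate_script.py below.
```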
/experiments/accelerate_script.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | from torch.optim import AdamW 6 | from accelerate import Accelerator 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. |url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | def parse_args(): 48 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 49 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 50 | parser.add_argument("--batch_size", type=int, default=16) 51 | parser.add_argument("--num_batches", type=int, default=100) 52 | 53 | args = parser.parse_args() 54 | return args 55 | 56 | class DataLoader(): 57 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 58 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 59 | self.batch_size = batch_size 60 | self.num_batches = num_batches 61 | self.seq_len = seq_len 62 | self.mask_token_id = tokenizer.mask_token_id 63 | 64 | def __iter__(self): 65 | for _ in range(self.num_batches): 66 | masked_samples = [] 67 | samples = [] 68 | for _ in range(self.batch_size): 69 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 70 | tokens = self.tokenized_corpus[start: start + self.seq_len] 71 | samples.append(tokens) 72 | 73 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 74 | masked_samples.append(masked_tokens) 75 | 76 | 77 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 78 | 79 | def __len__(self): 80 | return self.num_batches 81 | 82 | 83 | def main(): 84 | args = parse_args() 85 | accelerator = Accelerator() 86 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 87 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 88 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 89 | optimizer = AdamW(model.parameters(), lr=1e-4) 90 | 91 | model = model.train() 92 | model, optimizer = accelerator.prepare(model, optimizer) 93 | 94 | start_time = time.time() 95 | for step, batch in enumerate(train_dl): 96 | batch = {k: v.to(accelerator.device) for k, v in batch.items()} 97 | output = model(**batch) 98 | loss = output.loss 99 | loss.backward() 100 | optimizer.step() 101 | optimizer.zero_grad() 102 | if step == 0: 103 | first_step_time = time.time() - start_time 104 | 105 | total_training_time = time.time() - start_time 106 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 107 | print("Training finished.") 108 | print(f"First iteration took: {first_step_time:.2f}s") 109 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 110 | 111 | if __name__ == "__main__": 112 | main() 113 | -------------------------------------------------------------------------------- /experiments/base.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | from torch.optim import AdamW 6 | from transformers import AutoModelForMaskedLM, AutoTokenizer 7 | 8 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 9 | 10 | == History == 11 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 12 | 13 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 14 | 15 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 16 | 17 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 18 | 19 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 20 | 21 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 22 | 23 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 24 | 25 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 26 | 27 | == Services and technologies == 28 | === Transformers Library === 29 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 30 | 31 | 32 | === Hugging Face Hub === 33 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
34 | 35 | == References == 36 | {{Reflist}} 37 | 38 | {{Portal bar|Companies}} 39 | 40 | {{DEFAULTSORT:Hugging Face}} 41 | [[Category:Machine learning]] 42 | [[Category:Open-source artificial intelligence]] 43 | 44 | """ 45 | 46 | torch.backends.cuda.matmul.allow_tf32 = True 47 | 48 | def parse_args(): 49 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 50 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 51 | parser.add_argument("--batch_size", type=int, default=16) 52 | parser.add_argument("--num_batches", type=int, default=100) 53 | 54 | args = parser.parse_args() 55 | return args 56 | 57 | class DataLoader(): 58 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 59 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 60 | self.batch_size = batch_size 61 | self.num_batches = num_batches 62 | self.seq_len = seq_len 63 | self.mask_token_id = tokenizer.mask_token_id 64 | 65 | def __iter__(self): 66 | for _ in range(self.num_batches): 67 | masked_samples = [] 68 | samples = [] 69 | for _ in range(self.batch_size): 70 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 71 | tokens = self.tokenized_corpus[start: start + self.seq_len] 72 | samples.append(tokens) 73 | 74 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 75 | masked_samples.append(masked_tokens) 76 | 77 | 78 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 79 | 80 | def __len__(self): 81 | return self.num_batches 82 | 83 | 84 | def main(): 85 | args = parse_args() 86 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 87 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 88 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 89 | optimizer = AdamW(model.parameters(), lr=1e-4) 90 | 91 | device = "cuda" if torch.cuda.is_available() else "cpu" 92 | model = model.to(device).train() 93 | 94 | start_time = time.time() 95 | for step, batch in enumerate(train_dl): 96 | batch = {k: v.to(device) for k, v in batch.items()} 97 | output = model(**batch) 98 | loss = output.loss 99 | loss.backward() 100 | optimizer.step() 101 | optimizer.zero_grad() 102 | if step == 0: 103 | first_step_time = time.time() - start_time 104 | 105 | total_training_time = time.time() - start_time 106 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 107 | print("Training finished.") 108 | print(f"First iteration took: {first_step_time:.2f}s") 109 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 110 | 111 | if __name__ == "__main__": 112 | main() 113 | -------------------------------------------------------------------------------- /experiments/base_fp16.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | from torch.optim import AdamW 6 | from transformers import AutoModelForMaskedLM, AutoTokenizer 7 | 8 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 9 | 10 | == History == 11 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 12 | 13 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 14 | 15 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 16 | 17 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 18 | 19 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 20 | 21 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 22 | 23 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 24 | 25 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 26 | 27 | == Services and technologies == 28 | === Transformers Library === 29 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 30 | 31 | 32 | === Hugging Face Hub === 33 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
34 | 35 | == References == 36 | {{Reflist}} 37 | 38 | {{Portal bar|Companies}} 39 | 40 | {{DEFAULTSORT:Hugging Face}} 41 | [[Category:Machine learning]] 42 | [[Category:Open-source artificial intelligence]] 43 | 44 | """ 45 | 46 | def parse_args(): 47 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 48 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 49 | parser.add_argument("--batch_size", type=int, default=16) 50 | parser.add_argument("--num_batches", type=int, default=100) 51 | 52 | args = parser.parse_args() 53 | return args 54 | 55 | class DataLoader(): 56 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 57 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 58 | self.batch_size = batch_size 59 | self.num_batches = num_batches 60 | self.seq_len = seq_len 61 | self.mask_token_id = tokenizer.mask_token_id 62 | 63 | def __iter__(self): 64 | for _ in range(self.num_batches): 65 | masked_samples = [] 66 | samples = [] 67 | for _ in range(self.batch_size): 68 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 69 | tokens = self.tokenized_corpus[start: start + self.seq_len] 70 | samples.append(tokens) 71 | 72 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 73 | masked_samples.append(masked_tokens) 74 | 75 | 76 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 77 | 78 | def __len__(self): 79 | return self.num_batches 80 | 81 | 82 | def main(): 83 | args = parse_args() 84 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 85 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 86 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 87 | optimizer = AdamW(model.parameters(), lr=1e-4) 88 | 89 | device = "cuda" if torch.cuda.is_available() else "cpu" 90 | model = model.to(device).train() 91 | 92 | start_time = time.time() 93 | for step, batch in enumerate(train_dl): 94 | batch = {k: v.to(device) for k, v in batch.items()} 95 | with torch.cuda.amp.autocast(): 96 | output = model(**batch) 97 | loss = output.loss 98 | loss.backward() 99 | optimizer.step() 100 | optimizer.zero_grad() 101 | if step == 0: 102 | first_step_time = time.time() - start_time 103 | 104 | total_training_time = time.time() - start_time 105 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 106 | print("Training finished.") 107 | print(f"First iteration took: {first_step_time:.2f}s") 108 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 109 | 110 | if __name__ == "__main__": 111 | main() 112 | -------------------------------------------------------------------------------- /experiments/dynamic.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | from torch.optim import AdamW 6 | from transformers import AutoModelForMaskedLM, AutoTokenizer 7 | 8 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 9 | 10 | == History == 11 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 12 | 13 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 14 | 15 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 16 | 17 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 18 | 19 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 20 | 21 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 22 | 23 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 24 | 25 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 26 | 27 | == Services and technologies == 28 | === Transformers Library === 29 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 30 | 31 | 32 | === Hugging Face Hub === 33 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
34 | 35 | == References == 36 | {{Reflist}} 37 | 38 | {{Portal bar|Companies}} 39 | 40 | {{DEFAULTSORT:Hugging Face}} 41 | [[Category:Machine learning]] 42 | [[Category:Open-source artificial intelligence]] 43 | 44 | """ 45 | 46 | torch.backends.cuda.matmul.allow_tf32 = True 47 | 48 | def parse_args(): 49 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 50 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 51 | parser.add_argument("--batch_size", type=int, default=16) 52 | parser.add_argument("--num_batches", type=int, default=100) 53 | 54 | args = parser.parse_args() 55 | return args 56 | 57 | class DataLoader(): 58 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 59 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 60 | self.batch_size = batch_size 61 | self.num_batches = num_batches 62 | self.seq_len = seq_len 63 | self.mask_token_id = tokenizer.mask_token_id 64 | 65 | def __iter__(self): 66 | for _ in range(self.num_batches): 67 | masked_samples = [] 68 | samples = [] 69 | seq_len = random.randint(self.seq_len // 8, self.seq_len // 4 - 1) * 8 70 | for _ in range(self.batch_size): 71 | start = random.randint(0, len(self.tokenized_corpus) - seq_len - 1) 72 | tokens = self.tokenized_corpus[start: start + seq_len] 73 | samples.append(tokens) 74 | 75 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 76 | masked_samples.append(masked_tokens) 77 | 78 | 79 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 80 | 81 | def __len__(self): 82 | return self.num_batches 83 | 84 | 85 | def main(): 86 | args = parse_args() 87 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 88 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 89 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 90 | optimizer = AdamW(model.parameters(), lr=1e-4) 91 | 92 | device = "cuda" if torch.cuda.is_available() else "cpu" 93 | model = model.to(device).train() 94 | 95 | start_time = time.time() 96 | for step, batch in enumerate(train_dl): 97 | batch = {k: v.to(device) for k, v in batch.items()} 98 | output = model(**batch) 99 | loss = output.loss 100 | loss.backward() 101 | optimizer.step() 102 | optimizer.zero_grad() 103 | 104 | if step == 0: 105 | first_step_time = time.time() - start_time 106 | 107 | total_training_time = time.time() - start_time 108 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 109 | print("Training finished.") 110 | print(f"First iteration took: {first_step_time:.2f}s") 111 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 112 | 113 | 114 | 115 | 116 | 117 | 118 | if __name__ == "__main__": 119 | main() 120 | -------------------------------------------------------------------------------- /experiments/dynamic_fp16.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | from torch.optim import AdamW 6 | from transformers import AutoModelForMaskedLM, AutoTokenizer 7 | 8 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 9 | 10 | == History == 11 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 12 | 13 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 14 | 15 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 16 | 17 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 18 | 19 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 20 | 21 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 22 | 23 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 24 | 25 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 26 | 27 | == Services and technologies == 28 | === Transformers Library === 29 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 30 | 31 | 32 | === Hugging Face Hub === 33 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
34 | 35 | == References == 36 | {{Reflist}} 37 | 38 | {{Portal bar|Companies}} 39 | 40 | {{DEFAULTSORT:Hugging Face}} 41 | [[Category:Machine learning]] 42 | [[Category:Open-source artificial intelligence]] 43 | 44 | """ 45 | 46 | def parse_args(): 47 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 48 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 49 | parser.add_argument("--batch_size", type=int, default=16) 50 | parser.add_argument("--num_batches", type=int, default=100) 51 | 52 | args = parser.parse_args() 53 | return args 54 | 55 | class DataLoader(): 56 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 57 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 58 | self.batch_size = batch_size 59 | self.num_batches = num_batches 60 | self.seq_len = seq_len 61 | self.mask_token_id = tokenizer.mask_token_id 62 | 63 | def __iter__(self): 64 | for _ in range(self.num_batches): 65 | masked_samples = [] 66 | samples = [] 67 | seq_len = random.randint(self.seq_len // 8, self.seq_len // 4 - 1) * 8 68 | for _ in range(self.batch_size): 69 | start = random.randint(0, len(self.tokenized_corpus) - seq_len - 1) 70 | tokens = self.tokenized_corpus[start: start + seq_len] 71 | samples.append(tokens) 72 | 73 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 74 | masked_samples.append(masked_tokens) 75 | 76 | 77 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 78 | 79 | def __len__(self): 80 | return self.num_batches 81 | 82 | 83 | def main(): 84 | args = parse_args() 85 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 86 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 87 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 88 | optimizer = AdamW(model.parameters(), lr=1e-4) 89 | 90 | device = "cuda" if torch.cuda.is_available() else "cpu" 91 | model = model.to(device).train() 92 | 93 | start_time = time.time() 94 | for step, batch in enumerate(train_dl): 95 | batch = {k: v.to(device) for k, v in batch.items()} 96 | with torch.cuda.amp.autocast(): 97 | output = model(**batch) 98 | loss = output.loss 99 | loss.backward() 100 | optimizer.step() 101 | optimizer.zero_grad() 102 | 103 | if step == 0: 104 | first_step_time = time.time() - start_time 105 | 106 | total_training_time = time.time() - start_time 107 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 108 | print("Training finished.") 109 | print(f"First iteration took: {first_step_time:.2f}s") 110 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 111 | 112 | 113 | 114 | 115 | 116 | 117 | if __name__ == "__main__": 118 | main() 119 | -------------------------------------------------------------------------------- /experiments/dynamic_optimized.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | torch.backends.cuda.matmul.allow_tf32 = True 48 | 49 | def parse_args(): 50 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 51 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 52 | parser.add_argument("--batch_size", type=int, default=16) 53 | parser.add_argument("--num_batches", type=int, default=100) 54 | 55 | args = parser.parse_args() 56 | return args 57 | 58 | class DataLoader(): 59 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 60 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 61 | self.batch_size = batch_size 62 | self.num_batches = num_batches 63 | self.seq_len = seq_len 64 | self.mask_token_id = tokenizer.mask_token_id 65 | 66 | def __iter__(self): 67 | for _ in range(self.num_batches): 68 | masked_samples = [] 69 | samples = [] 70 | seq_len = random.randint(self.seq_len // 8, self.seq_len // 4 - 1) * 8 71 | for _ in range(self.batch_size): 72 | start = random.randint(0, len(self.tokenized_corpus) - seq_len - 1) 73 | tokens = self.tokenized_corpus[start: start + seq_len] 74 | samples.append(tokens) 75 | 76 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 77 | masked_samples.append(masked_tokens) 78 | 79 | 80 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 81 | 82 | def __len__(self): 83 | return self.num_batches 84 | 85 | 86 | def main(): 87 | args = parse_args() 88 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 89 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 90 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 91 | optimizer = AdamW(model.parameters(), lr=1e-4) 92 | 93 | device = "cuda" if torch.cuda.is_available() else "cpu" 94 | model = model.to(device).train() 95 | 96 | model = dynamo.optimize("inductor")(model) 97 | 98 | start_time = time.time() 99 | for step, batch in enumerate(train_dl): 100 | batch = {k: v.to(device) for k, v in batch.items()} 101 | output = model(**batch) 102 | loss = output.loss 103 | loss.backward() 104 | optimizer.step() 105 | optimizer.zero_grad() 106 | 107 | if step == 0: 108 | first_step_time = time.time() - start_time 109 | 110 | total_training_time = time.time() - start_time 111 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 112 | print("Training finished.") 113 | print(f"First iteration took: {first_step_time:.2f}s") 114 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 115 | 116 | 117 | 118 | 119 | 120 | 121 | if __name__ == "__main__": 122 | main() 123 | -------------------------------------------------------------------------------- /experiments/dynamic_optimized_fp16.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | def parse_args(): 48 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 49 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 50 | parser.add_argument("--batch_size", type=int, default=16) 51 | parser.add_argument("--num_batches", type=int, default=100) 52 | 53 | args = parser.parse_args() 54 | return args 55 | 56 | class DataLoader(): 57 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 58 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 59 | self.batch_size = batch_size 60 | self.num_batches = num_batches 61 | self.seq_len = seq_len 62 | self.mask_token_id = tokenizer.mask_token_id 63 | 64 | def __iter__(self): 65 | for _ in range(self.num_batches): 66 | masked_samples = [] 67 | samples = [] 68 | seq_len = random.randint(self.seq_len // 8, self.seq_len // 4 - 1) * 8 69 | for _ in range(self.batch_size): 70 | start = random.randint(0, len(self.tokenized_corpus) - seq_len - 1) 71 | tokens = self.tokenized_corpus[start: start + seq_len] 72 | samples.append(tokens) 73 | 74 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 75 | masked_samples.append(masked_tokens) 76 | 77 | 78 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 79 | 80 | def __len__(self): 81 | return self.num_batches 82 | 83 | 84 | def main(): 85 | args = parse_args() 86 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 87 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 88 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 89 | optimizer = AdamW(model.parameters(), lr=1e-4) 90 | 91 | device = "cuda" if torch.cuda.is_available() else "cpu" 92 | model = model.to(device).train() 93 | 94 | model = dynamo.optimize("inductor")(model) 95 | 96 | start_time = time.time() 97 | for step, batch in enumerate(train_dl): 98 | batch = {k: v.to(device) for k, v in batch.items()} 99 | with torch.cuda.amp.autocast(): 100 | output = model(**batch) 101 | loss = output.loss 102 | loss.backward() 103 | optimizer.step() 104 | optimizer.zero_grad() 105 | 106 | if step == 0: 107 | first_step_time = time.time() - start_time 108 | 109 | total_training_time = time.time() - start_time 110 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 111 | print("Training finished.") 112 | print(f"First iteration took: {first_step_time:.2f}s") 113 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 114 | 115 | 116 | 117 | 118 | 119 | 120 | if __name__ == "__main__": 121 | main() 122 | -------------------------------------------------------------------------------- /experiments/generate_script.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | 6 | from accelerate import Accelerator 7 | from transformers import AutoModelForCausalLM, AutoTokenizer 8 | 9 | torch.backends.cuda.matmul.allow_tf32 = True 10 | 11 | 12 | def parse_args(): 13 | parser = argparse.ArgumentParser(description="Make a couple of generations") 14 | parser.add_argument("--model_name", type=str, default="gpt2") 15 | args = parser.parse_args() 16 | 
return args 17 | 18 | 19 | def main(): 20 | args = parse_args() 21 | accelerator = Accelerator() 22 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 23 | inputs = tokenizer(["Once upon a time,"] * 8, return_tensors="pt") 24 | model = AutoModelForCausalLM.from_pretrained(args.model_name) 25 | 26 | model = model.eval() 27 | model = accelerator.prepare(model) 28 | 29 | start_time = time.time() 30 | for step in range(50): 31 | batch = {k: v.to(accelerator.device) for k, v in inputs.items()} 32 | output = model.generate(**batch) 33 | if step == 0: 34 | first_step_time = time.time() - start_time 35 | 36 | total_training_time = time.time() - start_time 37 | avg_iteration_time = (total_training_time - first_step_time) / (50 - 1) 38 | print("Generations finished.") 39 | print(f"First iteration took: {first_step_time:.2f}s") 40 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 41 | 42 | 43 | if __name__ == "__main__": 44 | main() 45 | -------------------------------------------------------------------------------- /experiments/optimize_forward.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. |url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 
13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! |url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. 
It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | torch.backends.cuda.matmul.allow_tf32 = True 48 | 49 | def parse_args(): 50 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 51 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 52 | parser.add_argument("--batch_size", type=int, default=16) 53 | parser.add_argument("--num_batches", type=int, default=100) 54 | 55 | args = parser.parse_args() 56 | return args 57 | 58 | class DataLoader(): 59 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 60 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 61 | self.batch_size = batch_size 62 | self.num_batches = num_batches 63 | self.seq_len = seq_len 64 | self.mask_token_id = tokenizer.mask_token_id 65 | 66 | def __iter__(self): 67 | for _ in range(self.num_batches): 68 | masked_samples = [] 69 | samples = [] 70 | for _ in range(self.batch_size): 71 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 72 | tokens = self.tokenized_corpus[start: start + self.seq_len] 73 | samples.append(tokens) 74 | 75 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 76 | masked_samples.append(masked_tokens) 77 | 78 | 79 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 80 | 81 | def __len__(self): 82 | return self.num_batches 83 | 84 | 85 | def main(): 86 | args = parse_args() 87 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 88 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 89 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 90 | optimizer = AdamW(model.parameters(), lr=1e-4) 91 | 92 | device = "cuda" if torch.cuda.is_available() else "cpu" 93 | model = model.to(device).train() 94 | 95 | model.forward = dynamo.optimize("inductor")(model.forward) 96 | 97 | start_time = time.time() 98 | for step, batch in enumerate(train_dl): 99 | batch = {k: v.to(device) for k, v in batch.items()} 100 | output = model(**batch) 101 | loss = output.loss 102 | loss.backward() 103 | optimizer.step() 104 | optimizer.zero_grad() 105 | 106 | if step == 0: 107 | first_step_time = time.time() - start_time 108 | 109 | total_training_time = time.time() - start_time 110 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 111 | print("Training 
finished.") 112 | print(f"First iteration took: {first_step_time:.2f}s") 113 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 114 | 115 | 116 | 117 | 118 | 119 | 120 | if __name__ == "__main__": 121 | main() 122 | -------------------------------------------------------------------------------- /experiments/optimize_forward_fp16.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. |url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 
21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! |url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | def parse_args(): 48 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 49 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 50 | parser.add_argument("--batch_size", type=int, default=16) 51 | parser.add_argument("--num_batches", type=int, default=100) 52 | 53 | args = parser.parse_args() 54 | return args 55 | 56 | class DataLoader(): 57 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 58 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 59 | self.batch_size = batch_size 60 | self.num_batches = num_batches 61 | self.seq_len = seq_len 62 | self.mask_token_id = tokenizer.mask_token_id 63 | 64 | def __iter__(self): 65 | for _ in range(self.num_batches): 66 | masked_samples = [] 67 | samples = [] 68 | for _ in range(self.batch_size): 69 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 70 | tokens = self.tokenized_corpus[start: start + self.seq_len] 71 | samples.append(tokens) 72 | 73 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 74 | masked_samples.append(masked_tokens) 75 | 76 | 77 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 78 | 79 | def __len__(self): 80 | return self.num_batches 81 | 82 | 83 | def main(): 84 | args = parse_args() 85 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 86 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 87 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 88 | optimizer = AdamW(model.parameters(), lr=1e-4) 89 | 90 | device = "cuda" if torch.cuda.is_available() else "cpu" 91 | model = model.to(device).train() 92 | 93 | model.forward = dynamo.optimize("inductor")(model.forward) 94 | 95 | start_time = time.time() 96 | for step, batch in enumerate(train_dl): 97 | batch = {k: v.to(device) for k, v in batch.items()} 98 | with torch.cuda.amp.autocast(): 99 | output = model(**batch) 100 | loss = output.loss 101 | loss.backward() 102 | optimizer.step() 103 | optimizer.zero_grad() 104 | 105 | if step == 0: 106 | first_step_time = time.time() - start_time 107 | 108 | total_training_time = time.time() - start_time 109 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 110 | print("Training finished.") 111 | print(f"First iteration took: {first_step_time:.2f}s") 112 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 113 | 114 | 115 | 116 | 117 | 118 | 119 | if __name__ == "__main__": 120 | main() 121 | -------------------------------------------------------------------------------- /experiments/optimize_model.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | torch.backends.cuda.matmul.allow_tf32 = True 48 | 49 | def parse_args(): 50 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 51 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 52 | parser.add_argument("--batch_size", type=int, default=16) 53 | parser.add_argument("--num_batches", type=int, default=100) 54 | 55 | args = parser.parse_args() 56 | return args 57 | 58 | class DataLoader(): 59 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 60 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 61 | self.batch_size = batch_size 62 | self.num_batches = num_batches 63 | self.seq_len = seq_len 64 | self.mask_token_id = tokenizer.mask_token_id 65 | 66 | def __iter__(self): 67 | for _ in range(self.num_batches): 68 | masked_samples = [] 69 | samples = [] 70 | for _ in range(self.batch_size): 71 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 72 | tokens = self.tokenized_corpus[start: start + self.seq_len] 73 | samples.append(tokens) 74 | 75 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 76 | masked_samples.append(masked_tokens) 77 | 78 | 79 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 80 | 81 | def __len__(self): 82 | return self.num_batches 83 | 84 | 85 | def main(): 86 | args = parse_args() 87 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 88 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 89 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 90 | optimizer = AdamW(model.parameters(), lr=1e-4) 91 | 92 | device = "cuda" if torch.cuda.is_available() else "cpu" 93 | model = model.to(device).train() 94 | 95 | model = dynamo.optimize("inductor")(model) 96 | 97 | start_time = time.time() 98 | for step, batch in enumerate(train_dl): 99 | batch = {k: v.to(device) for k, v in batch.items()} 100 | output = model(**batch) 101 | loss = output.loss 102 | loss.backward() 103 | optimizer.step() 104 | optimizer.zero_grad() 105 | 106 | if step == 0: 107 | first_step_time = time.time() - start_time 108 | 109 | total_training_time = time.time() - start_time 110 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 111 | print("Training finished.") 112 | print(f"First iteration took: {first_step_time:.2f}s") 113 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 114 | 115 | 116 | 117 | 118 | 119 | 120 | if __name__ == "__main__": 121 | main() 122 | -------------------------------------------------------------------------------- /experiments/optimize_model_fp16.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | def parse_args(): 48 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 49 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 50 | parser.add_argument("--batch_size", type=int, default=16) 51 | parser.add_argument("--num_batches", type=int, default=100) 52 | 53 | args = parser.parse_args() 54 | return args 55 | 56 | class DataLoader(): 57 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 58 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 59 | self.batch_size = batch_size 60 | self.num_batches = num_batches 61 | self.seq_len = seq_len 62 | self.mask_token_id = tokenizer.mask_token_id 63 | 64 | def __iter__(self): 65 | for _ in range(self.num_batches): 66 | masked_samples = [] 67 | samples = [] 68 | for _ in range(self.batch_size): 69 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 70 | tokens = self.tokenized_corpus[start: start + self.seq_len] 71 | samples.append(tokens) 72 | 73 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 74 | masked_samples.append(masked_tokens) 75 | 76 | 77 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 78 | 79 | def __len__(self): 80 | return self.num_batches 81 | 82 | 83 | def main(): 84 | args = parse_args() 85 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 86 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 87 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 88 | optimizer = AdamW(model.parameters(), lr=1e-4) 89 | 90 | device = "cuda" if torch.cuda.is_available() else "cpu" 91 | model = model.to(device).train() 92 | 93 | model = dynamo.optimize("inductor")(model) 94 | 95 | start_time = time.time() 96 | for step, batch in enumerate(train_dl): 97 | batch = {k: v.to(device) for k, v in batch.items()} 98 | with torch.cuda.amp.autocast(): 99 | output = model(**batch) 100 | loss = output.loss 101 | loss.backward() 102 | optimizer.step() 103 | optimizer.zero_grad() 104 | 105 | if step == 0: 106 | first_step_time = time.time() - start_time 107 | 108 | total_training_time = time.time() - start_time 109 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 110 | print("Training finished.") 111 | print(f"First iteration took: {first_step_time:.2f}s") 112 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 113 | 114 | 115 | 116 | 117 | 118 | 119 | if __name__ == "__main__": 120 | main() 121 | -------------------------------------------------------------------------------- /experiments/optimize_train_step.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | def parse_args(): 48 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 49 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 50 | parser.add_argument("--batch_size", type=int, default=16) 51 | parser.add_argument("--num_batches", type=int, default=100) 52 | 53 | args = parser.parse_args() 54 | return args 55 | 56 | class DataLoader(): 57 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 58 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 59 | self.batch_size = batch_size 60 | self.num_batches = num_batches 61 | self.seq_len = seq_len 62 | self.mask_token_id = tokenizer.mask_token_id 63 | 64 | def __iter__(self): 65 | for _ in range(self.num_batches): 66 | masked_samples = [] 67 | samples = [] 68 | for _ in range(self.batch_size): 69 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 70 | tokens = self.tokenized_corpus[start: start + self.seq_len] 71 | samples.append(tokens) 72 | 73 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 74 | masked_samples.append(masked_tokens) 75 | 76 | 77 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 78 | 79 | def __len__(self): 80 | return self.num_batches 81 | 82 | 83 | def main(): 84 | args = parse_args() 85 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 86 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 87 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 88 | optimizer = AdamW(model.parameters(), lr=1e-4) 89 | 90 | device = "cuda" if torch.cuda.is_available() else "cpu" 91 | model = model.to(device).train() 92 | 93 | @dynamo.optimize("inductor") 94 | def train_step(batch): 95 | output = model(**batch) 96 | loss = output.loss 97 | loss.backward() 98 | optimizer.step() 99 | 100 | start_time = time.time() 101 | for step, batch in enumerate(train_dl): 102 | batch = {k: v.to(device) for k, v in batch.items()} 103 | train_step(batch) 104 | optimizer.zero_grad() 105 | 106 | if step == 0: 107 | first_step_time = time.time() - start_time 108 | 109 | total_training_time = time.time() - start_time 110 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 111 | print("Training finished.") 112 | print(f"First iteration took: {first_step_time:.2f}s") 113 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 114 | 115 | 116 | 117 | 118 | 119 | 120 | if __name__ == "__main__": 121 | main() 122 | -------------------------------------------------------------------------------- /experiments/optimize_train_step_fp16.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | import time 4 | import torch 5 | import torch._dynamo as dynamo 6 | from torch.optim import AdamW 7 | from transformers import AutoModelForMaskedLM, AutoTokenizer 8 | 9 | CORPUS = """'''Hugging Face, Inc.''' is an American company that develops tools for building applications using [[machine learning]].{{Cite web |title=Hugging Face – The AI community building the future. 
|url=https://huggingface.co/ |access-date=2022-08-20 |website=huggingface.co}} It is most notable for its Transformers library built for [[natural language processing]] applications and its platform that allows users to share machine learning models and datasets. 10 | 11 | == History == 12 | The company was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf originally as a company that developed a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://social.techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2022-08-20 |website=TechCrunch |language=en-US}} After open-sourcing the model behind the chatbot, the company [[Lean startup|pivoted]] to focus on being a platform for democratizing machine learning. 13 | 14 | In March 2021, Hugging Face raised $40 million in a [[Series B]] funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library}} 15 | 16 | On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/}} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large [[language model]] with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co}} 17 | 18 | On December 21, 2021, the company announced its acquisition of Gradio, a software library used to make interactive browser demos of machine learning models.{{Cite web |title=Gradio is joining Hugging Face! |url=https://huggingface.co/blog/gradio-joins-hf |access-date=2022-08-20 |website=huggingface.co}} 19 | 20 | On May 5, 2022, the company announced its [[Series C]] funding round led by [[Coatue Management|Coatue]] and [[Sequoia fund|Sequoia]].{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en}} The company received a $2 billion valuation. 21 | 22 | On May 13, 2022, the company introduced its Student Ambassador Program to help fulfill its mission to teach machine learning to 5 million people by 2023.{{Cite web |title=Student Ambassador Program’s call for applications is open! 
|url=https://huggingface.co/blog/ambassadors |access-date=2022-08-20 |website=huggingface.co}} 23 | 24 | On May 26, 2022, the company announced a partnership with [[Graphcore]] to optimize its Transformers library for the Graphcore IPU.{{Cite web |title=Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers |url=https://huggingface.co/blog/graphcore-update |access-date=2022-08-19 |website=huggingface.co}} 25 | 26 | On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports [[Software as a service|SaaS]] or [[On-premises software|on-premise]] deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co}} 27 | 28 | == Services and technologies == 29 | === Transformers Library === 30 | The Transformers library is a [[Python (programming language)|Python]] package that contains open-source implementations of [[Transformer (machine learning model)|transformer]] models for text, image, and audio tasks. It is compatible with the [[PyTorch]], [[TensorFlow]] and [[Google JAX|JAX]] [[deep learning]] libraries and includes implementations of notable models like [[BERT (language model)|BERT]] and [[GPT-2|GPT]].{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co}} 31 | 32 | 33 | === Hugging Face Hub === 34 | The Hugging Face Hub is a platform where users can share pretrained datasets, models, and demos of machine learning projects.{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co}} The Hub contains [[GitHub]]-inspired features for code-sharing and collaboration, including discussions and pull requests for projects. It also hosts Hugging Face Spaces, a hosted service that allows users to build web-based demos of machine learning apps using the Gradio or Streamlit. 
35 | 36 | == References == 37 | {{Reflist}} 38 | 39 | {{Portal bar|Companies}} 40 | 41 | {{DEFAULTSORT:Hugging Face}} 42 | [[Category:Machine learning]] 43 | [[Category:Open-source artificial intelligence]] 44 | 45 | """ 46 | 47 | def parse_args(): 48 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a tiny corpus for masked LM") 49 | parser.add_argument("--model_name", type=str, default="bert-base-cased") 50 | parser.add_argument("--batch_size", type=int, default=16) 51 | parser.add_argument("--num_batches", type=int, default=100) 52 | 53 | args = parser.parse_args() 54 | return args 55 | 56 | class DataLoader(): 57 | def __init__(self, tokenizer, batch_size=8, num_batches=100, seq_len=128): 58 | self.tokenized_corpus = tokenizer(CORPUS).input_ids 59 | self.batch_size = batch_size 60 | self.num_batches = num_batches 61 | self.seq_len = seq_len 62 | self.mask_token_id = tokenizer.mask_token_id 63 | 64 | def __iter__(self): 65 | for _ in range(self.num_batches): 66 | masked_samples = [] 67 | samples = [] 68 | for _ in range(self.batch_size): 69 | start = random.randint(0, len(self.tokenized_corpus) - self.seq_len - 1) 70 | tokens = self.tokenized_corpus[start: start + self.seq_len] 71 | samples.append(tokens) 72 | 73 | masked_tokens = [(t if random.random() < 0.8 else self.mask_token_id) for t in tokens] 74 | masked_samples.append(masked_tokens) 75 | 76 | 77 | yield {"input_ids": torch.tensor(masked_samples), "labels": torch.tensor(samples)} 78 | 79 | def __len__(self): 80 | return self.num_batches 81 | 82 | 83 | def main(): 84 | args = parse_args() 85 | tokenizer = AutoTokenizer.from_pretrained(args.model_name) 86 | model = AutoModelForMaskedLM.from_pretrained(args.model_name) 87 | train_dl = DataLoader(tokenizer, batch_size=args.batch_size, num_batches=args.num_batches) 88 | optimizer = AdamW(model.parameters(), lr=1e-4) 89 | 90 | device = "cuda" if torch.cuda.is_available() else "cpu" 91 | model = model.to(device).train() 92 | 93 | @dynamo.optimize("inductor") 94 | def train_step(batch): 95 | output = model(**batch) 96 | loss = output.loss 97 | loss.backward() 98 | optimizer.step() 99 | 100 | start_time = time.time() 101 | for step, batch in enumerate(train_dl): 102 | batch = {k: v.to(device) for k, v in batch.items()} 103 | train_step(batch) 104 | optimizer.zero_grad() 105 | 106 | if step == 0: 107 | first_step_time = time.time() - start_time 108 | 109 | total_training_time = time.time() - start_time 110 | avg_iteration_time = (total_training_time - first_step_time) / (len(train_dl) - 1) 111 | print("Training finished.") 112 | print(f"First iteration took: {first_step_time:.2f}s") 113 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 114 | 115 | 116 | 117 | 118 | 119 | 120 | if __name__ == "__main__": 121 | main() 122 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | transformers 2 | datasets 3 | evaluate 4 | scikit-learn 5 | git+https://github.com/huggingface/accelerate@main -------------------------------------------------------------------------------- /run_experiments.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | if [[ -z $1 ]]; 4 | then 5 | echo "model_name_or_path not passed" 6 | exit 1 7 | else 8 | echo "model_name_or_path = $1" 9 | fi 10 | 11 | if [[ -z $2 ]]; 12 | then 13 | echo "num_runs not passed" 14 | exit 1 15 
| else 16 | echo "num_runs = $2" 17 | fi 18 | 19 | if [[ -z $3 ]]; 20 | then 21 | echo "task_name not passed" 22 | exit 1 23 | else 24 | echo "task_name = $3" 25 | fi 26 | 27 | model_name_or_path=$1 28 | num_runs=$2 29 | task_name=$3 30 | 31 | for ((i = 1; i <= $num_runs; i++)); 32 | do 33 | echo "experiment run $i" 34 | 35 | case $task_name in 36 | "text_classification") 37 | echo "Running text_classification" 38 | echo "inductor backend with fp32" 39 | accelerate launch scripts/text_classification.py \ 40 | --task_name mrpc \ 41 | --seed $i \ 42 | --model_name_or_path $model_name_or_path \ 43 | --dynamo_backend inductor 44 | echo "no backend with fp32" 45 | accelerate launch scripts/text_classification.py \ 46 | --task_name mrpc \ 47 | --seed $i \ 48 | --model_name_or_path $model_name_or_path 49 | echo "inductor backend with fp16" 50 | accelerate launch scripts/text_classification.py \ 51 | --task_name mrpc \ 52 | --seed $i \ 53 | --model_name_or_path $model_name_or_path \ 54 | --dynamo_backend inductor \ 55 | --mixed_precision fp16 56 | echo "no backend with fp16" 57 | accelerate launch scripts/text_classification.py \ 58 | --task_name mrpc \ 59 | --seed $i \ 60 | --model_name_or_path $model_name_or_path \ 61 | --mixed_precision fp16 62 | ;; 63 | "language_modeling") 64 | echo "Running language_modeling" 65 | echo "inductor backend with fp32" 66 | accelerate launch scripts/language_modeling.py \ 67 | --dataset_name wikitext \ 68 | --dataset_config_name wikitext-2-raw-v1 \ 69 | --seed $i \ 70 | --model_name_or_path $model_name_or_path \ 71 | --dynamo_backend inductor 72 | echo "no backend with fp32" 73 | accelerate launch scripts/language_modeling.py \ 74 | --dataset_name wikitext \ 75 | --dataset_config_name wikitext-2-raw-v1 \ 76 | --seed $i \ 77 | --model_name_or_path $model_name_or_path 78 | echo "inductor backend with fp16" 79 | accelerate launch scripts/language_modeling.py \ 80 | --dataset_name wikitext \ 81 | --dataset_config_name wikitext-2-raw-v1 \ 82 | --seed $i \ 83 | --model_name_or_path $model_name_or_path \ 84 | --dynamo_backend inductor \ 85 | --mixed_precision fp16 86 | echo "no backend with fp16" 87 | accelerate launch scripts/language_modeling.py \ 88 | --dataset_name wikitext \ 89 | --dataset_config_name wikitext-2-raw-v1 \ 90 | --seed $i \ 91 | --model_name_or_path $model_name_or_path \ 92 | --mixed_precision fp16 93 | ;; 94 | "cv_classification") 95 | echo "Running cv_classification" 96 | echo "inductor backend with fp32" 97 | accelerate launch scripts/cv_classification.py \ 98 | --dataset_name beans \ 99 | --seed $i \ 100 | --model_name_or_path $model_name_or_path \ 101 | --dynamo_backend inductor 102 | echo "no backend with fp32" 103 | accelerate launch scripts/cv_classification.py \ 104 | --dataset_name beans \ 105 | --seed $i \ 106 | --model_name_or_path $model_name_or_path 107 | echo "inductor backend with fp16" 108 | accelerate launch scripts/cv_classification.py \ 109 | --dataset_name beans \ 110 | --seed $i \ 111 | --model_name_or_path $model_name_or_path \ 112 | --dynamo_backend inductor \ 113 | --mixed_precision fp16 114 | echo "no backend with fp16" 115 | accelerate launch scripts/cv_classification.py \ 116 | --dataset_name beans \ 117 | --seed $i \ 118 | --model_name_or_path $model_name_or_path \ 119 | --mixed_precision fp16 120 | ;; 121 | *) 122 | echo "Invalid task_name" 123 | exit 1 124 | ;; 125 | esac 126 | done -------------------------------------------------------------------------------- /scripts/cv_classification.py:
-------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2022 The HuggingFace Inc. team. All rights reserved. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | """ Finetuning any 🤗 Transformers model for image classification leveraging 🤗 Accelerate.""" 16 | import argparse 17 | import json 18 | import logging 19 | import math 20 | import os 21 | from pathlib import Path 22 | import time 23 | 24 | import datasets 25 | import torch 26 | from datasets import load_dataset 27 | from torch.utils.data import DataLoader 28 | from torchvision.transforms import ( 29 | CenterCrop, 30 | Compose, 31 | Normalize, 32 | RandomHorizontalFlip, 33 | RandomResizedCrop, 34 | Resize, 35 | ToTensor, 36 | ) 37 | from tqdm.auto import tqdm 38 | 39 | import evaluate 40 | import transformers 41 | from accelerate import Accelerator 42 | from accelerate.logging import get_logger 43 | from accelerate.utils import set_seed 44 | from huggingface_hub import Repository 45 | from transformers import ( 46 | AutoFeatureExtractor, 47 | AutoModelForImageClassification, 48 | get_scheduler, 49 | ) 50 | 51 | 52 | torch.backends.cuda.matmul.allow_tf32 = True 53 | logger = get_logger(__name__) 54 | 55 | 56 | def parse_args(): 57 | parser = argparse.ArgumentParser(description="Fine-tune a Transformers model on an image classification dataset") 58 | parser.add_argument( 59 | "--dataset_name", 60 | type=str, 61 | default="cifar10", 62 | help=( 63 | "The name of the Dataset (from the HuggingFace hub) to train on (could be your own, possibly private," 64 | " dataset)." 65 | ), 66 | ) 67 | parser.add_argument( 68 | "--model_name_or_path", 69 | type=str, 70 | help="Path to pretrained model or model identifier from huggingface.co/models.", 71 | default="google/vit-base-patch16-224-in21k", 72 | ) 73 | parser.add_argument( 74 | "--batch_size", 75 | type=int, 76 | default=8, 77 | help="Batch size (per device) for the training dataloader.", 78 | ) 79 | parser.add_argument( 80 | "--learning_rate", 81 | type=float, 82 | default=5e-5, 83 | help="Initial learning rate (after the potential warmup period) to use.", 84 | ) 85 | parser.add_argument("--num_epochs", type=int, default=3, help="Total number of training epochs to perform.") 86 | parser.add_argument("--seed", type=int, default=0, help="A seed for reproducible training.") 87 | parser.add_argument("--dynamo_backend", type=str, default="no", help="Dynamo backend") 88 | parser.add_argument("--mixed_precision", type=str, default="no", help="`no` or `fp16`") 89 | args = parser.parse_args() 90 | return args 91 | 92 | 93 | def main(): 94 | args = parse_args() 95 | set_seed(args.seed) 96 | accelerator = Accelerator(dynamo_backend=args.dynamo_backend, mixed_precision=args.mixed_precision) 97 | 98 | logger.info(accelerator.state) 99 | # Make one log on every process with the configuration for debugging. 
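# The basicConfig call below sets a shared log format, and accelerator.state is then logged from every process, so each benchmark run records the device, distributed setup and mixed-precision mode it actually used.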
100 | logging.basicConfig( 101 | format="%(asctime)s - %(levelname)s - %(name)s - %(message)s", 102 | datefmt="%m/%d/%Y %H:%M:%S", 103 | level=logging.INFO, 104 | ) 105 | logger.info(accelerator.state, main_process_only=False) 106 | if accelerator.is_local_main_process: 107 | datasets.utils.logging.set_verbosity_warning() 108 | transformers.utils.logging.set_verbosity_info() 109 | else: 110 | datasets.utils.logging.set_verbosity_error() 111 | transformers.utils.logging.set_verbosity_error() 112 | 113 | dataset = load_dataset(args.dataset_name, task="image-classification") 114 | feature_extractor = AutoFeatureExtractor.from_pretrained(args.model_name_or_path) 115 | model = AutoModelForImageClassification.from_pretrained( 116 | args.model_name_or_path, 117 | num_labels=len(dataset["train"].features["labels"].names), 118 | ignore_mismatched_sizes=True, 119 | ) 120 | 121 | # Preprocessing the datasets 122 | 123 | # Define torchvision transforms to be applied to each image. 124 | if "shortest_edge" in feature_extractor.size: 125 | size = feature_extractor.size["shortest_edge"] 126 | else: 127 | size = (feature_extractor.size["height"], feature_extractor.size["width"]) 128 | normalize = Normalize(mean=feature_extractor.image_mean, std=feature_extractor.image_std) 129 | train_transforms = Compose( 130 | [ 131 | RandomResizedCrop(size), 132 | RandomHorizontalFlip(), 133 | ToTensor(), 134 | normalize, 135 | ] 136 | ) 137 | val_transforms = Compose( 138 | [ 139 | Resize(size), 140 | CenterCrop(size), 141 | ToTensor(), 142 | normalize, 143 | ] 144 | ) 145 | 146 | def preprocess_train(example_batch): 147 | """Apply _train_transforms across a batch.""" 148 | example_batch["pixel_values"] = [train_transforms(image.convert("RGB")) for image in example_batch["image"]] 149 | return example_batch 150 | 151 | def preprocess_val(example_batch): 152 | """Apply _val_transforms across a batch.""" 153 | example_batch["pixel_values"] = [val_transforms(image.convert("RGB")) for image in example_batch["image"]] 154 | return example_batch 155 | 156 | with accelerator.main_process_first(): 157 | dataset["train"] = dataset["train"].shuffle(seed=args.seed) 158 | # Set the training transforms 159 | train_dataset = dataset["train"].with_transform(preprocess_train) 160 | dataset["validation"] = dataset["validation"].shuffle(seed=args.seed) 161 | # Set the validation transforms 162 | eval_dataset = dataset["validation"].with_transform(preprocess_val) 163 | 164 | # DataLoaders creation: 165 | def collate_fn(examples): 166 | pixel_values = torch.stack([example["pixel_values"] for example in examples]) 167 | labels = torch.tensor([example["labels"] for example in examples]) 168 | return {"pixel_values": pixel_values, "labels": labels} 169 | 170 | train_dataloader = DataLoader( 171 | train_dataset, shuffle=True, collate_fn=collate_fn, batch_size=args.batch_size, drop_last=True 172 | ) 173 | eval_dataloader = DataLoader(eval_dataset, collate_fn=collate_fn, batch_size=args.batch_size, drop_last=True) 174 | 175 | # Optimizer 176 | optimizer = torch.optim.AdamW(model.parameters(), lr=args.learning_rate) 177 | 178 | # Scheduler. 179 | lr_scheduler = get_scheduler( 180 | name="linear", 181 | optimizer=optimizer, 182 | num_warmup_steps=0, 183 | num_training_steps=len(train_dataloader) * args.num_epochs, 184 | ) 185 | 186 | # Prepare everything with our `accelerator`. 
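# prepare() moves the model, optimizer and dataloaders to the selected device, wraps them for the requested mixed precision, and applies the chosen dynamo_backend to the model, which is why these scripts never call dynamo.optimize themselves.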
187 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler = accelerator.prepare( 188 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler 189 | ) 190 | 191 | # Get the metric function 192 | metric = evaluate.load("accuracy") 193 | # Train! 194 | # Only show the progress bar once on each machine. 195 | train_steps = len(train_dataloader) * args.num_epochs 196 | progress_bar = tqdm(range(train_steps), disable=not accelerator.is_local_main_process) 197 | 198 | start_time = time.time() 199 | for epoch in range(args.num_epochs): 200 | model.train() 201 | for step, batch in enumerate(train_dataloader): 202 | outputs = model(**batch) 203 | loss = outputs.loss 204 | predictions, references = accelerator.gather_for_metrics((outputs.logits.argmax(dim=-1), batch["labels"])) 205 | metric.add_batch(predictions=predictions, references=references) 206 | accelerator.backward(loss) 207 | optimizer.step() 208 | lr_scheduler.step() 209 | optimizer.zero_grad() 210 | progress_bar.update(1) 211 | if step == 0 and epoch == 0: 212 | first_step_time = time.time() - start_time 213 | 214 | eval_train_metric = metric.compute() 215 | print(f"Training Accuracy for backend {args.dynamo_backend} at epoch {epoch}: {eval_train_metric}") 216 | 217 | total_training_time = time.time() - start_time 218 | avg_train_iteration_time = (total_training_time - first_step_time) / (train_steps - 1) 219 | print("Training finished.") 220 | print(f"First iteration took: {first_step_time:.2f}s") 221 | print(f"Average time after the first iteration: {avg_train_iteration_time * 1000:.2f}ms") 222 | 223 | model.eval() 224 | start_time = time.time() 225 | for step, batch in enumerate(eval_dataloader): 226 | with torch.no_grad(): 227 | outputs = model(**batch) 228 | predictions = outputs.logits.argmax(dim=-1) 229 | predictions, references = accelerator.gather_for_metrics((predictions, batch["labels"])) 230 | metric.add_batch(predictions=predictions, references=references) 231 | 232 | if step == 0: 233 | first_step_time = time.time() - start_time 234 | total_eval_time = time.time() - start_time 235 | avg_test_iteration_time = (total_eval_time - first_step_time) / (len(eval_dataloader) - 1) 236 | print("Evaluation finished.") 237 | print(f"First iteration took: {first_step_time:.2f}s") 238 | print(f"Average time after the first iteration: {avg_test_iteration_time * 1000:.2f}ms") 239 | 240 | eval_test_metric = metric.compute() 241 | print(f"Test Accuracy for backend {args.dynamo_backend}: {eval_test_metric}") 242 | 243 | out_dict = { 244 | "backend": args.dynamo_backend, 245 | "mixed_precision": args.mixed_precision, 246 | "num_epochs": str(args.num_epochs), 247 | "seed": str(args.seed), 248 | "train_acc": str(eval_train_metric["accuracy"]), 249 | "avg_train_time": str(avg_train_iteration_time * 1000), 250 | "test_acc": str(eval_test_metric["accuracy"]), 251 | "avg_test_time": str(avg_test_iteration_time * 1000), 252 | } 253 | prefix = args.model_name_or_path.split("/")[-1] 254 | with open(f"{prefix}_cv_classification_results.csv", "a+") as fd: 255 | fd.seek(0) 256 | if len(fd.read(1)) == 0: 257 | fd.write(",".join(out_dict.keys()) + "\n") 258 | else: 259 | fd.write("\n") 260 | fd.write(",".join(out_dict.values())) 261 | 262 | 263 | if __name__ == "__main__": 264 | main() 265 | -------------------------------------------------------------------------------- /scripts/language_modeling.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding=utf-8 3 | # 
Copyright 2021 The HuggingFace Inc. team. All rights reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | """ 17 | Fine-tuning the library models for causal language modeling (GPT, GPT-2, CTRL, ...) 18 | on a text file or a dataset without using HuggingFace Trainer. 19 | 20 | Here is the full list of checkpoints on the hub that can be fine-tuned by this script: 21 | https://huggingface.co/models?filter=text-generation 22 | """ 23 | # You can also adapt this script on your own causal language modeling task. Pointers for this are left as comments. 24 | 25 | import argparse 26 | import logging 27 | import math 28 | import os 29 | from itertools import chain 30 | import time 31 | 32 | import datasets 33 | import torch 34 | from datasets import load_dataset 35 | from torch.utils.data import DataLoader 36 | from tqdm.auto import tqdm 37 | 38 | import transformers 39 | from accelerate import Accelerator 40 | from accelerate.logging import get_logger 41 | from accelerate.utils import set_seed 42 | from transformers import ( 43 | AutoModelForCausalLM, 44 | AutoTokenizer, 45 | default_data_collator, 46 | get_scheduler, 47 | ) 48 | 49 | torch.backends.cuda.matmul.allow_tf32 = True 50 | logger = get_logger(__name__) 51 | 52 | 53 | def parse_args(): 54 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a causal language modeling task") 55 | parser.add_argument( 56 | "--dataset_name", 57 | type=str, 58 | default=None, 59 | help="The name of the dataset to use (via the datasets library).", 60 | ) 61 | parser.add_argument( 62 | "--dataset_config_name", 63 | type=str, 64 | default=None, 65 | help="The configuration name of the dataset to use (via the datasets library).", 66 | ) 67 | parser.add_argument( 68 | "--model_name_or_path", 69 | type=str, 70 | help="Path to pretrained model or model identifier from huggingface.co/models.", 71 | required=False, 72 | ) 73 | parser.add_argument( 74 | "--batch_size", 75 | type=int, 76 | default=8, 77 | help="Batch size (per device) for the training dataloader.", 78 | ) 79 | parser.add_argument("--num_epochs", type=int, default=3, help="Total number of training epochs to perform.") 80 | parser.add_argument("--seed", type=int, default=0, help="A seed for reproducible training.") 81 | parser.add_argument("--dynamo_backend", type=str, default="no", help="Dynamo backend") 82 | parser.add_argument("--mixed_precision", type=str, default="no", help="`no` or `fp16`") 83 | args = parser.parse_args() 84 | return args 85 | 86 | 87 | def main(): 88 | args = parse_args() 89 | set_seed(args.seed) 90 | accelerator = Accelerator(dynamo_backend=args.dynamo_backend, mixed_precision=args.mixed_precision) 91 | 92 | # Make one log on every process with the configuration for debugging. 
93 | logging.basicConfig( 94 | format="%(asctime)s - %(levelname)s - %(name)s - %(message)s", 95 | datefmt="%m/%d/%Y %H:%M:%S", 96 | level=logging.INFO, 97 | ) 98 | logger.info(accelerator.state, main_process_only=False) 99 | if accelerator.is_local_main_process: 100 | datasets.utils.logging.set_verbosity_warning() 101 | transformers.utils.logging.set_verbosity_info() 102 | else: 103 | datasets.utils.logging.set_verbosity_error() 104 | transformers.utils.logging.set_verbosity_error() 105 | 106 | if args.dataset_name is not None: 107 | # Downloading and loading a dataset from the hub. 108 | raw_datasets = load_dataset(args.dataset_name, args.dataset_config_name) 109 | if "validation" not in raw_datasets.keys(): 110 | raw_datasets["validation"] = load_dataset( 111 | args.dataset_name, 112 | args.dataset_config_name, 113 | split=f"train[:{args.validation_split_percentage}%]", 114 | ) 115 | raw_datasets["train"] = load_dataset( 116 | args.dataset_name, 117 | args.dataset_config_name, 118 | split=f"train[{args.validation_split_percentage}%:]", 119 | ) 120 | 121 | tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path) 122 | model = AutoModelForCausalLM.from_pretrained(args.model_name_or_path) 123 | 124 | # Preprocessing the datasets. 125 | # First we tokenize all the texts. 126 | column_names = raw_datasets["train"].column_names 127 | text_column_name = "text" if "text" in column_names else column_names[0] 128 | 129 | def tokenize_function(examples): 130 | return tokenizer(examples[text_column_name]) 131 | 132 | with accelerator.main_process_first(): 133 | tokenized_datasets = raw_datasets.map( 134 | tokenize_function, 135 | batched=True, 136 | num_proc=4, 137 | remove_columns=column_names, 138 | load_from_cache_file=False, 139 | desc="Running tokenizer on dataset", 140 | ) 141 | block_size = tokenizer.model_max_length 142 | 143 | # Main data processing function that will concatenate all texts from our dataset and generate chunks of block_size. 144 | def group_texts(examples): 145 | # Concatenate all texts. 146 | concatenated_examples = {k: list(chain(*examples[k])) for k in examples.keys()} 147 | total_length = len(concatenated_examples[list(examples.keys())[0]]) 148 | # We drop the small remainder, we could add padding if the model supported it instead of this drop, you can 149 | # customize this part to your needs. 150 | if total_length >= block_size: 151 | total_length = (total_length // block_size) * block_size 152 | # Split by chunks of max_len. 153 | result = { 154 | k: [t[i : i + block_size] for i in range(0, total_length, block_size)] 155 | for k, t in concatenated_examples.items() 156 | } 157 | result["labels"] = result["input_ids"].copy() 158 | return result 159 | 160 | with accelerator.main_process_first(): 161 | lm_datasets = tokenized_datasets.map( 162 | group_texts, 163 | batched=True, 164 | num_proc=4, 165 | load_from_cache_file=False, 166 | desc=f"Grouping texts in chunks of {block_size}", 167 | ) 168 | 169 | train_dataset = lm_datasets["train"] 170 | eval_dataset = lm_datasets["validation"] 171 | 172 | # DataLoaders creation: 173 | train_dataloader = DataLoader( 174 | train_dataset, shuffle=True, collate_fn=default_data_collator, batch_size=args.batch_size, drop_last=True 175 | ) 176 | eval_dataloader = DataLoader( 177 | eval_dataset, collate_fn=default_data_collator, batch_size=args.batch_size, drop_last=True 178 | ) 179 | 180 | # Optimizer 181 | optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5) 182 | 183 | # Scheduler. 
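# Linear decay from the initial learning rate to zero over all training steps, with no warmup.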
184 | lr_scheduler = get_scheduler( 185 | name="linear", 186 | optimizer=optimizer, 187 | num_warmup_steps=0, 188 | num_training_steps=len(train_dataloader) * args.num_epochs, 189 | ) 190 | 191 | # Prepare everything with our `accelerator`. 192 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler = accelerator.prepare( 193 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler 194 | ) 195 | 196 | # Train! 197 | # Only show the progress bar once on each machine. 198 | train_steps = len(train_dataloader) * args.num_epochs 199 | progress_bar = tqdm(range(train_steps), disable=not accelerator.is_local_main_process) 200 | 201 | start_time = time.time() 202 | for epoch in range(args.num_epochs): 203 | model.train() 204 | total_loss = 0 205 | for step, batch in enumerate(train_dataloader): 206 | outputs = model(**batch) 207 | loss = outputs.loss 208 | total_loss += loss.detach().float() 209 | accelerator.backward(loss) 210 | optimizer.step() 211 | lr_scheduler.step() 212 | optimizer.zero_grad() 213 | progress_bar.update(1) 214 | if step == 0 and epoch == 0: 215 | first_step_time = time.time() - start_time 216 | train_perplexity = torch.exp(total_loss / len(train_dataloader)) 217 | print(f"Training Perplexity for backend {args.dynamo_backend} at epoch {epoch}: {train_perplexity}") 218 | 219 | total_training_time = time.time() - start_time 220 | avg_train_iteration_time = (total_training_time - first_step_time) / (train_steps - 1) 221 | print("Training finished.") 222 | print(f"First iteration took: {first_step_time:.2f}s") 223 | print(f"Average time after the first iteration: {avg_train_iteration_time * 1000:.2f}ms") 224 | model.eval() 225 | total_loss = 0 226 | start_time = time.time() 227 | for step, batch in enumerate(eval_dataloader): 228 | with torch.no_grad(): 229 | outputs = model(**batch) 230 | loss = outputs.loss 231 | total_loss += loss.detach().float() 232 | if step == 0: 233 | first_step_time = time.time() - start_time 234 | 235 | total_eval_time = time.time() - start_time 236 | total_eval_time = time.time() - start_time 237 | avg_test_iteration_time = (total_eval_time - first_step_time) / (len(eval_dataloader) - 1) 238 | print("Evaluation finished.") 239 | print(f"First iteration took: {first_step_time:.2f}s") 240 | print(f"Average time after the first iteration: {avg_test_iteration_time * 1000:.2f}ms") 241 | test_perplexity = torch.exp(total_loss / len(eval_dataloader)) 242 | print(f"Test Perplexity for backend {args.dynamo_backend}: {test_perplexity}") 243 | out_dict = { 244 | "backend": args.dynamo_backend, 245 | "mixed_precision": args.mixed_precision, 246 | "num_epochs": str(args.num_epochs), 247 | "seed": str(args.seed), 248 | "train_perplexity": str(train_perplexity.item()), 249 | "avg_train_time": str(avg_train_iteration_time * 1000), 250 | "test_perplexity": str(test_perplexity.item()), 251 | "avg_test_time": str(avg_test_iteration_time * 1000), 252 | } 253 | prefix = args.model_name_or_path.split("/")[-1] 254 | with open(f"{prefix}_language_modeling_task_results.csv", "a+") as fd: 255 | fd.seek(0) 256 | if len(fd.read(1)) == 0: 257 | fd.write(",".join(out_dict.keys()) + "\n") 258 | else: 259 | fd.write("\n") 260 | fd.write(",".join(out_dict.values())) 261 | 262 | 263 | if __name__ == "__main__": 264 | main() 265 | -------------------------------------------------------------------------------- /scripts/text_classification.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2021 
The HuggingFace Inc. team. All rights reserved. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | """ Finetuning a 🤗 Transformers model for sequence classification on GLUE.""" 16 | import argparse 17 | import logging 18 | import time 19 | 20 | import datasets 21 | import torch 22 | from datasets import load_dataset 23 | from torch.utils.data import DataLoader 24 | from tqdm.auto import tqdm 25 | 26 | import evaluate 27 | import transformers 28 | from accelerate import Accelerator 29 | from accelerate.logging import get_logger 30 | from transformers import ( 31 | AutoModelForSequenceClassification, 32 | AutoTokenizer, 33 | DataCollatorWithPadding, 34 | default_data_collator, 35 | get_scheduler, 36 | ) 37 | 38 | torch.backends.cuda.matmul.allow_tf32 = True 39 | logger = get_logger(__name__) 40 | 41 | task_to_keys = { 42 | "cola": ("sentence", None), 43 | "mnli": ("premise", "hypothesis"), 44 | "mrpc": ("sentence1", "sentence2"), 45 | "qnli": ("question", "sentence"), 46 | "qqp": ("question1", "question2"), 47 | "rte": ("sentence1", "sentence2"), 48 | "sst2": ("sentence", None), 49 | "stsb": ("sentence1", "sentence2"), 50 | "wnli": ("sentence1", "sentence2"), 51 | } 52 | 53 | 54 | def parse_args(): 55 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a text classification task") 56 | parser.add_argument( 57 | "--task_name", 58 | type=str, 59 | default=None, 60 | help="The name of the glue task to train on.", 61 | choices=list(task_to_keys.keys()), 62 | ) 63 | parser.add_argument( 64 | "--max_length", 65 | type=int, 66 | default=128, 67 | help=( 68 | "The maximum total input sequence length after tokenization. Sequences longer than this will be truncated," 69 | " sequences shorter will be padded unless `--dynamic_lengh` is passed." 70 | ), 71 | ) 72 | parser.add_argument( 73 | "--dynamic_length", 74 | action="store_true", 75 | help="If passed, pad all samples to `max_length`. 
Otherwise, dynamic padding is used.", 76 | ) 77 | parser.add_argument( 78 | "--model_name_or_path", 79 | type=str, 80 | help="Path to pretrained model or model identifier from huggingface.co/models.", 81 | default="bert-base-cased", 82 | ) 83 | parser.add_argument( 84 | "--batch_size", 85 | type=int, 86 | default=16, 87 | help="Batch size (per device) for the dataloaders.", 88 | ) 89 | parser.add_argument( 90 | "--num_epochs", 91 | type=int, 92 | default=3, 93 | help="Number of training epochs.", 94 | ) 95 | parser.add_argument("--dynamo_backend", type=str, default="no", help="Dynamo backend") 96 | parser.add_argument("--seed", type=int, default=0, help="random seed for torch") 97 | parser.add_argument("--mixed_precision", type=str, default="no", help="`no` or `fp16`") 98 | args = parser.parse_args() 99 | return args 100 | 101 | 102 | def main(): 103 | args = parse_args() 104 | torch.manual_seed(args.seed) 105 | accelerator = Accelerator(dynamo_backend=args.dynamo_backend, mixed_precision=args.mixed_precision) 106 | 107 | # Make one log on every process with the configuration for debugging. 108 | logging.basicConfig( 109 | format="%(asctime)s - %(levelname)s - %(name)s - %(message)s", 110 | datefmt="%m/%d/%Y %H:%M:%S", 111 | level=logging.INFO, 112 | ) 113 | logger.info(accelerator.state, main_process_only=False) 114 | if accelerator.is_local_main_process: 115 | datasets.utils.logging.set_verbosity_warning() 116 | transformers.utils.logging.set_verbosity_info() 117 | else: 118 | datasets.utils.logging.set_verbosity_error() 119 | transformers.utils.logging.set_verbosity_error() 120 | 121 | # Load data 122 | raw_datasets = load_dataset("glue", args.task_name) 123 | 124 | is_regression = args.task_name == "stsb" 125 | if not is_regression: 126 | label_list = raw_datasets["train"].features["label"].names 127 | num_labels = len(label_list) 128 | else: 129 | num_labels = 1 130 | 131 | # Load pretrained model and tokenizer 132 | tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path) 133 | model = AutoModelForSequenceClassification.from_pretrained(args.model_name_or_path, num_labels=num_labels) 134 | 135 | # Preprocessing the datasets 136 | sentence1_key, sentence2_key = task_to_keys[args.task_name] 137 | padding = False if args.dynamic_length else "max_length" 138 | 139 | def preprocess_function(examples): 140 | # Tokenize the texts 141 | texts = ( 142 | (examples[sentence1_key],) if sentence2_key is None else (examples[sentence1_key], examples[sentence2_key]) 143 | ) 144 | result = tokenizer(*texts, padding=padding, max_length=args.max_length, truncation=True) 145 | result["labels"] = examples["label"] 146 | return result 147 | 148 | with accelerator.main_process_first(): 149 | processed_datasets = raw_datasets.map( 150 | preprocess_function, 151 | batched=True, 152 | remove_columns=raw_datasets["train"].column_names, 153 | desc="Running tokenizer on dataset", 154 | ) 155 | 156 | train_dataset = processed_datasets["train"] 157 | eval_dataset = processed_datasets["validation_matched" if args.task_name == "mnli" else "validation"] 158 | 159 | # DataLoaders creation: 160 | if not args.dynamic_length: 161 | data_collator = default_data_collator 162 | else: 163 | data_collator = DataCollatorWithPadding(tokenizer, pad_to_multiple_of=8) 164 | 165 | train_dataloader = DataLoader( 166 | train_dataset, shuffle=True, collate_fn=data_collator, batch_size=args.batch_size, drop_last=True 167 | ) 168 | eval_dataloader = DataLoader( 169 | eval_dataset, collate_fn=data_collator, 
batch_size=args.batch_size, drop_last=not args.dynamic_length 170 | ) 171 | 172 | # Optimizer 173 | optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5) 174 | 175 | # Scheduler. 176 | lr_scheduler = get_scheduler( 177 | name="linear", 178 | optimizer=optimizer, 179 | num_warmup_steps=0, 180 | num_training_steps=len(train_dataloader) * args.num_epochs, 181 | ) 182 | 183 | # Prepare everything with our `accelerator`. 184 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler = accelerator.prepare( 185 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler 186 | ) 187 | 188 | # Get the metric function 189 | metric = evaluate.load("glue", args.task_name) 190 | # Train! 191 | # Only show the progress bar once on each machine. 192 | train_steps = len(train_dataloader) * args.num_epochs 193 | progress_bar = tqdm(range(train_steps), disable=not accelerator.is_local_main_process) 194 | start_time = time.time() 195 | for epoch in range(args.num_epochs): 196 | model.train() 197 | for step, batch in enumerate(train_dataloader): 198 | # We need to skip steps until we reach the resumed step 199 | outputs = model(**batch) 200 | loss = outputs.loss 201 | predictions, references = accelerator.gather_for_metrics((outputs.logits.argmax(dim=-1), batch["labels"])) 202 | metric.add_batch(predictions=predictions, references=references) 203 | accelerator.backward(loss) 204 | optimizer.step() 205 | lr_scheduler.step() 206 | optimizer.zero_grad() 207 | progress_bar.update(1) 208 | if step == 0 and epoch == 0: 209 | first_step_time = time.time() - start_time 210 | 211 | eval_train_metric = metric.compute() 212 | print(f"Training Accuracy for backend {args.dynamo_backend} at epoch {epoch}: {eval_train_metric}") 213 | 214 | total_training_time = time.time() - start_time 215 | avg_train_iteration_time = (total_training_time - first_step_time) / (train_steps - 1) 216 | print("Training finished.") 217 | print(f"First iteration took: {first_step_time:.2f}s") 218 | print(f"Average time after the first iteration: {avg_train_iteration_time * 1000:.2f}ms") 219 | 220 | model.eval() 221 | start_time = time.time() 222 | for step, batch in enumerate(eval_dataloader): 223 | with torch.no_grad(): 224 | outputs = model(**batch) 225 | predictions = outputs.logits.argmax(dim=-1) if not is_regression else outputs.logits.squeeze() 226 | predictions, references = accelerator.gather_for_metrics((predictions, batch["labels"])) 227 | metric.add_batch(predictions=predictions, references=references) 228 | 229 | if step == 0: 230 | first_step_time = time.time() - start_time 231 | total_eval_time = time.time() - start_time 232 | avg_test_iteration_time = (total_eval_time - first_step_time) / (len(eval_dataloader) - 1) 233 | print("Evaluation finished.") 234 | print(f"First iteration took: {first_step_time:.2f}s") 235 | print(f"Average time after the first iteration: {avg_test_iteration_time * 1000:.2f}ms") 236 | 237 | eval_test_metric = metric.compute() 238 | print(f"Test Accuracy for backend {args.dynamo_backend}: {eval_test_metric}") 239 | 240 | out_dict = { 241 | "backend": args.dynamo_backend, 242 | "mixed_precision": args.mixed_precision, 243 | "num_epochs": str(args.num_epochs), 244 | "seed": str(args.seed), 245 | "train_acc": str(eval_train_metric["accuracy"]), 246 | "train_f1": str(eval_train_metric["f1"]), 247 | "avg_train_time": str(avg_train_iteration_time * 1000), 248 | "test_acc": str(eval_test_metric["accuracy"]), 249 | "test_f1": str(eval_test_metric["f1"]), 250 | "avg_test_time": 
str(avg_test_iteration_time * 1000), 251 | } 252 | prefix = args.model_name_or_path.split("/")[-1] 253 | with open(f"{prefix}_text_classification_results.csv", "a+") as fd: 254 | fd.seek(0) 255 | if len(fd.read(1)) == 0: 256 | fd.write(",".join(out_dict.keys()) + "\n") 257 | else: 258 | fd.write("\n") 259 | fd.write(",".join(out_dict.values())) 260 | 261 | 262 | if __name__ == "__main__": 263 | main() 264 | -------------------------------------------------------------------------------- /scripts/translation.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding=utf-8 3 | # Copyright The HuggingFace Team and The HuggingFace Inc. team. All rights reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | """ 17 | Fine-tuning a 🤗 Transformers model on text translation. 18 | """ 19 | # You can also adapt this script on your own text translation task. Pointers for this are left as comments. 20 | 21 | import argparse 22 | import logging 23 | import random 24 | import time 25 | 26 | import datasets 27 | import numpy as np 28 | import torch 29 | from datasets import load_dataset 30 | from torch.utils.data import DataLoader 31 | from tqdm.auto import tqdm 32 | 33 | import evaluate 34 | import transformers 35 | from accelerate import Accelerator 36 | from accelerate.logging import get_logger 37 | from transformers import ( 38 | AutoModelForSeq2SeqLM, 39 | AutoTokenizer, 40 | DataCollatorForSeq2Seq, 41 | MBartTokenizer, 42 | MBartTokenizerFast, 43 | default_data_collator, 44 | get_scheduler, 45 | ) 46 | 47 | torch.backends.cuda.matmul.allow_tf32 = True 48 | logger = get_logger(__name__) 49 | 50 | 51 | # Parsing input arguments 52 | def parse_args(): 53 | 54 | parser = argparse.ArgumentParser(description="Finetune a transformers model on a text classification task") 55 | parser.add_argument( 56 | "--model_name_or_path", 57 | type=str, 58 | help="Path to pretrained model or model identifier from huggingface.co/models.", 59 | default="t5-small", 60 | ) 61 | parser.add_argument( 62 | "--max_length", 63 | type=int, 64 | default=128, 65 | help=( 66 | "The maximum total input sequence length after tokenization. Sequences longer than this will be truncated," 67 | " sequences shorter will be padded unless `--dynamic_lengh` is passed." 68 | ), 69 | ) 70 | parser.add_argument( 71 | "--dynamic_length", 72 | action="store_true", 73 | help="If passed, pad all samples to `max_length`. 
Otherwise, dynamic padding is used.", 74 | ) 75 | parser.add_argument( 76 | "--batch_size", 77 | type=int, 78 | default=16, 79 | help="Batch size (per device) for the dataloaders.", 80 | ) 81 | parser.add_argument( 82 | "--num_epochs", 83 | type=int, 84 | default=1, 85 | help="Number of training epochs.", 86 | ) 87 | parser.add_argument("--seed", type=int, default=0, help="A seed for reproducible training.") 88 | parser.add_argument("--dynamo_backend", type=str, default="no", help="Dynamo backend") 89 | parser.add_argument("--mixed_precision", type=str, default="no", help="`no` or `fp16`") 90 | return parser.parse_args() 91 | 92 | 93 | def main(): 94 | args = parse_args() 95 | torch.manual_seed(args.seed) 96 | accelerator = Accelerator(dynamo_backend=args.dynamo_backend, mixed_precision=args.mixed_precision) 97 | 98 | # Make one log on every process with the configuration for debugging. 99 | logging.basicConfig( 100 | format="%(asctime)s - %(levelname)s - %(name)s - %(message)s", 101 | datefmt="%m/%d/%Y %H:%M:%S", 102 | level=logging.INFO, 103 | ) 104 | logger.info(accelerator.state, main_process_only=False) 105 | if accelerator.is_local_main_process: 106 | datasets.utils.logging.set_verbosity_warning() 107 | transformers.utils.logging.set_verbosity_info() 108 | else: 109 | datasets.utils.logging.set_verbosity_error() 110 | transformers.utils.logging.set_verbosity_error() 111 | 112 | # Load data 113 | raw_datasets = load_dataset("wmt16", "ro-en") 114 | 115 | # Load pretrained model and tokenizer 116 | tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path) 117 | model = AutoModelForSeq2SeqLM.from_pretrained(args.model_name_or_path) 118 | 119 | # MBART requires some language codes 120 | if isinstance(tokenizer, (MBartTokenizer, MBartTokenizerFast)): 121 | tokenizer.src_lang = "en_XX" 122 | tokenizer.tgt_lang = "ro_RO" 123 | if model.config.decoder_start_token_id is None: 124 | if isinstance(tokenizer, MBartTokenizer): 125 | model.config.decoder_start_token_id = tokenizer.lang_code_to_id["ro_RO"] 126 | else: 127 | model.config.decoder_start_token_id = tokenizer.convert_tokens_to_ids("ro_RO") 128 | 129 | # T5 requires a prefix 130 | if args.model_name_or_path in ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]: 131 | prefix = "translate English to Romanian: " 132 | else: 133 | prefix = "" 134 | 135 | # Preprocessing the datasets. 136 | padding = False if args.dynamic_length else "max_length" 137 | 138 | def preprocess_function(examples): 139 | inputs = [ex["en"] for ex in examples["translation"]] 140 | targets = [ex["ro"] for ex in examples["translation"]] 141 | inputs = [prefix + inp for inp in inputs] 142 | model_inputs = tokenizer( 143 | inputs, text_target=targets, max_length=args.max_length, padding=padding, truncation=True 144 | ) 145 | 146 | # If we are padding here, replace all tokenizer.pad_token_id in the labels by -100 when we want to ignore 147 | # padding in the loss. 
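# (-100 is the ignore_index of the cross-entropy loss computed by the model, so padded label positions do not contribute to it.)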
148 | if padding == "max_length": 149 | model_inputs["labels"] = [ 150 | [(l if l != tokenizer.pad_token_id else -100) for l in label] for label in model_inputs["labels"] 151 | ] 152 | 153 | return model_inputs 154 | 155 | with accelerator.main_process_first(): 156 | processed_datasets = raw_datasets.map( 157 | preprocess_function, 158 | batched=True, 159 | remove_columns=raw_datasets["train"].column_names, 160 | desc="Running tokenizer on dataset", 161 | ) 162 | 163 | train_dataset = processed_datasets["train"] 164 | eval_dataset = processed_datasets["validation"] 165 | 166 | # Log a few random samples from the training set: 167 | for index in random.sample(range(len(train_dataset)), 3): 168 | logger.info(f"Sample {index} of the training set: {train_dataset[index]}.") 169 | 170 | # DataLoaders creation: 171 | if not args.dynamic_length: 172 | data_collator = default_data_collator 173 | else: 174 | data_collator = DataCollatorForSeq2Seq( 175 | tokenizer, 176 | model=model, 177 | label_pad_token_id=-100, 178 | pad_to_multiple_of=8, 179 | ) 180 | 181 | train_dataloader = DataLoader( 182 | train_dataset, shuffle=True, collate_fn=data_collator, batch_size=args.batch_size, drop_last=True 183 | ) 184 | eval_dataloader = DataLoader( 185 | eval_dataset, collate_fn=data_collator, batch_size=args.batch_size, drop_last=not args.dynamic_length 186 | ) 187 | 188 | # Optimizer 189 | optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5) 190 | 191 | # Scheduler. 192 | lr_scheduler = get_scheduler( 193 | name="linear", 194 | optimizer=optimizer, 195 | num_warmup_steps=0, 196 | num_training_steps=len(train_dataloader) * args.num_epochs, 197 | ) 198 | 199 | # Prepare everything with our `accelerator`. 200 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler = accelerator.prepare( 201 | model, optimizer, train_dataloader, eval_dataloader, lr_scheduler 202 | ) 203 | 204 | # Metric 205 | metric = evaluate.load("sacrebleu") 206 | 207 | def postprocess_text(preds, labels): 208 | preds = [pred.strip() for pred in preds] 209 | labels = [[label.strip()] for label in labels] 210 | 211 | return preds, labels 212 | 213 | # Train! 214 | # Only show the progress bar once on each machine. 
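# The run is capped at roughly 1000 training steps (the loop below breaks once step >= 1000) to keep the WMT16 benchmark short.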
215 | train_steps = min(len(train_dataloader) * args.num_epochs, 1000) 216 | progress_bar = tqdm(range(train_steps), disable=not accelerator.is_local_main_process) 217 | start_time = time.time() 218 | 219 | for epoch in range(args.num_epochs): 220 | model.train() 221 | for step, batch in enumerate(train_dataloader): 222 | # We need to skip steps until we reach the resumed step 223 | outputs = model(**batch) 224 | loss = outputs.loss 225 | accelerator.backward(loss) 226 | optimizer.step() 227 | lr_scheduler.step() 228 | optimizer.zero_grad() 229 | progress_bar.update(1) 230 | if step == 0 and epoch == 0: 231 | first_step_time = time.time() - start_time 232 | elif step >= 1000: 233 | break 234 | 235 | total_training_time = time.time() - start_time 236 | avg_iteration_time = (total_training_time - first_step_time) / (train_steps - 1) 237 | print("Training finished.") 238 | print(f"First iteration took: {first_step_time:.2f}s") 239 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 240 | 241 | model.eval() 242 | start_time = time.time() 243 | for step, batch in enumerate(eval_dataloader): 244 | with torch.no_grad(): 245 | generated_tokens = accelerator.unwrap_model(model).generate( 246 | batch["input_ids"], attention_mask=batch["attention_mask"], max_length=args.max_length 247 | ) 248 | generated_tokens = accelerator.pad_across_processes( 249 | generated_tokens, dim=1, pad_index=tokenizer.pad_token_id 250 | ) 251 | labels = batch["labels"] 252 | if args.dynamic_length: 253 | labels = accelerator.pad_across_processes(batch["labels"], dim=1, pad_index=tokenizer.pad_token_id) 254 | 255 | generated_tokens = accelerator.gather(generated_tokens).cpu().numpy() 256 | labels = accelerator.gather(labels).cpu().numpy() 257 | 258 | # Replace -100 in the labels as we can't decode them. 
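# (padding positions in the labels were set to -100 for the loss; map them back to the pad token id so the tokenizer can decode them)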
259 | labels = np.where(labels != -100, labels, tokenizer.pad_token_id) 260 | 261 | decoded_preds = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True) 262 | decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True) 263 | 264 | decoded_preds, decoded_labels = postprocess_text(decoded_preds, decoded_labels) 265 | 266 | metric.add_batch(predictions=decoded_preds, references=decoded_labels) 267 | if step == 0: 268 | first_step_time = time.time() - start_time 269 | 270 | total_eval_time = time.time() - start_time 271 | avg_iteration_time = (total_eval_time - first_step_time) / (len(eval_dataloader) - 1) 272 | 273 | print("Evaluation finished.") 274 | print(f"First iteration took: {first_step_time:.2f}s") 275 | print(f"Average time after the first iteration: {avg_iteration_time * 1000:.2f}ms") 276 | 277 | eval_metric = metric.compute() 278 | print(f"Test BLEU score for backend {args.dynamo_backend}: {eval_metric['score']}") 279 | 280 | 281 | if __name__ == "__main__": 282 | main() 283 | -------------------------------------------------------------------------------- /tools/summarize.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import os 3 | from collections import defaultdict 4 | import argparse 5 | 6 | 7 | def generate_and_save_plots(df, output_dir): 8 | for mixed_precision in set(df["mixed_precision"].values): 9 | metrics = defaultdict(list) 10 | filtered_df = df[df["mixed_precision"] == mixed_precision] 11 | columns = list(df.columns)[2:] 12 | 13 | # saving performance plots 14 | metrics_columns = [column for column in columns if "time" not in column] 15 | df_metric = filtered_df[metrics_columns + ["backend"]] 16 | inductor_backend_values = df_metric[df_metric["backend"] == "inductor"].values[0] 17 | pytorch_backend_values = df_metric[df_metric["backend"] == "no"].values[0] 18 | for i, column in enumerate(metrics_columns): 19 | metrics["metric"].append(column) 20 | metrics["inductor"].append(inductor_backend_values[i]) 21 | metrics["no"].append(pytorch_backend_values[i]) 22 | df_metric = pd.DataFrame(metrics) 23 | plot = df_metric.plot.bar(x="metric", rot=0) 24 | fig = plot.get_figure() 25 | fig.savefig(os.path.join(output_dir, f"{mixed_precision=}_metric.png")) 26 | 27 | # saving avg time plots 28 | metrics = defaultdict(list) 29 | time_columns = [column for column in columns if "time" in column] 30 | df_time = filtered_df[time_columns + ["backend"]] 31 | inductor_backend_values = df_time[df_time["backend"] == "inductor"].values[0] 32 | pytorch_backend_values = df_time[df_time["backend"] == "no"].values[0] 33 | for i, column in enumerate(time_columns): 34 | metrics["avg_time"].append(column) 35 | metrics["inductor"].append(inductor_backend_values[i]) 36 | metrics["no"].append(pytorch_backend_values[i]) 37 | df_metric = pd.DataFrame(metrics) 38 | plot = df_metric.plot.bar(x="avg_time", rot=0) 39 | fig = plot.get_figure() 40 | fig.savefig(os.path.join(output_dir, f"{mixed_precision=}_avg_time.png")) 41 | 42 | 43 | def get_diff_percentage(df): 44 | diff_percentage = defaultdict(list) 45 | for mixed_precision in set(df["mixed_precision"].values): 46 | diff_percentage["mixed_precision"].append(mixed_precision) 47 | filtered_df = df[df["mixed_precision"] == mixed_precision] 48 | columns = list(df.columns)[2:] 49 | inductor_backend_values = filtered_df[filtered_df["backend"] == "inductor"].values[0][2:] 50 | pytorch_backend_values = filtered_df[filtered_df["backend"] == "no"].values[0][2:] 51 | 52 | 
for i, column in enumerate(columns): 53 | if "time" in column: 54 | diff_percentage[f"{column}_speedup"].append( 55 | str(round((pytorch_backend_values[i] / inductor_backend_values[i]), 2)) + "x" 56 | ) 57 | else: 58 | diff_percentage[f"{column}_diff%"].append( 59 | str(round((100 * (inductor_backend_values[i] / pytorch_backend_values[i] - 1)), 2)) + "%" 60 | ) 61 | return pd.DataFrame(diff_percentage) 62 | 63 | 64 | def main(): 65 | parser = argparse.ArgumentParser(description="Get plots and summary table") 66 | parser.add_argument("--input_csv_file", type=str, required=True) 67 | parser.add_argument("--output_dir", type=str, required=True) 68 | 69 | args = parser.parse_args() 70 | os.makedirs(args.output_dir, exist_ok=True) 71 | df = pd.read_csv(args.input_csv_file) 72 | group_by_columns = ["backend", "mixed_precision"] 73 | drop_columns = ["num_epochs", "seed"] 74 | df.drop(columns=drop_columns, inplace=True) 75 | df = df.groupby(group_by_columns).agg("mean") 76 | df = df.reset_index() 77 | 78 | generate_and_save_plots(df, args.output_dir) 79 | diff_df = get_diff_percentage(df) 80 | file_prefix = args.input_csv_file.split("/")[-1].split(".")[0] 81 | diff_df.to_csv(os.path.join(args.output_dir, f"{file_prefix}_summary_table.csv"), header=True, index=False) 82 | 83 | 84 | if __name__ == "__main__": 85 | main() 86 | -------------------------------------------------------------------------------- /tools/verify_dynamo.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | import subprocess 4 | import sys 5 | import traceback 6 | import warnings 7 | 8 | from pkg_resources import packaging 9 | 10 | MIN_CUDA_VERSION = packaging.version.parse("11.6") 11 | MIN_PYTHON_VERSION = (3, 7) 12 | 13 | 14 | class VerifyDynamoError(BaseException): 15 | pass 16 | 17 | 18 | def check_python(): 19 | if sys.version_info < MIN_PYTHON_VERSION: 20 | raise VerifyDynamoError( 21 | f"Python version not supported: {sys.version_info} " 22 | f"- minimum requirement: {MIN_PYTHON_VERSION}" 23 | ) 24 | return sys.version_info 25 | 26 | 27 | def check_torch(): 28 | import torch 29 | 30 | return packaging.version.parse(torch.__version__) 31 | 32 | 33 | # based on torch/utils/cpp_extension.py 34 | def get_cuda_version(): 35 | from torch.utils import cpp_extension 36 | 37 | CUDA_HOME = cpp_extension._find_cuda_home() 38 | if not CUDA_HOME: 39 | raise VerifyDynamoError(cpp_extension.CUDA_NOT_FOUND_MESSAGE) 40 | 41 | nvcc = os.path.join(CUDA_HOME, "bin", "nvcc") 42 | cuda_version_str = ( 43 | subprocess.check_output([nvcc, "--version"]) 44 | .strip() 45 | .decode(*cpp_extension.SUBPROCESS_DECODE_ARGS) 46 | ) 47 | cuda_version = re.search(r"release (\d+[.]\d+)", cuda_version_str) 48 | if cuda_version is None: 49 | raise VerifyDynamoError("CUDA version not found in `nvcc --version` output") 50 | 51 | cuda_str_version = cuda_version.group(1) 52 | return packaging.version.parse(cuda_str_version) 53 | 54 | 55 | def check_cuda(): 56 | import torch 57 | 58 | if not torch.cuda.is_available(): 59 | return None 60 | 61 | torch_cuda_ver = packaging.version.parse(torch.version.cuda) 62 | 63 | # check if torch cuda version matches system cuda version 64 | cuda_ver = get_cuda_version() 65 | if cuda_ver != torch_cuda_ver: 66 | # raise VerifyDynamoError( 67 | warnings.warn( 68 | f"CUDA version mismatch, `torch` version: {torch_cuda_ver}, env version: {cuda_ver}" 69 | ) 70 | 71 | if torch_cuda_ver < MIN_CUDA_VERSION: 72 | # raise VerifyDynamoError( 73 | warnings.warn( 74 | 
f"(`torch`) CUDA version not supported: {torch_cuda_ver} " 75 | f"- minimum requirement: {MIN_CUDA_VERSION}" 76 | ) 77 | if cuda_ver < MIN_CUDA_VERSION: 78 | # raise VerifyDynamoError( 79 | warnings.warn( 80 | f"(env) CUDA version not supported: {cuda_ver} " 81 | f"- minimum requirement: {MIN_CUDA_VERSION}" 82 | ) 83 | 84 | return cuda_ver 85 | 86 | 87 | def check_dynamo(backend, device, err_msg): 88 | import torch 89 | 90 | if device == "cuda" and not torch.cuda.is_available(): 91 | print(f"CUDA not available -- skipping CUDA check on {backend} backend\n") 92 | return 93 | 94 | try: 95 | import torch._dynamo as dynamo 96 | 97 | dynamo.reset() 98 | 99 | @dynamo.optimize(backend, nopython=True) 100 | def fn(x): 101 | return x + x 102 | 103 | class Module(torch.nn.Module): 104 | def __init__(self): 105 | super().__init__() 106 | 107 | def forward(self, x): 108 | return x + x 109 | 110 | mod = Module() 111 | opt_mod = dynamo.optimize(backend, nopython=True)(mod) 112 | 113 | for f in (fn, opt_mod): 114 | x = torch.randn(10, 10).to(device) 115 | x.requires_grad = True 116 | y = f(x) 117 | torch.testing.assert_close(y, x + x) 118 | z = y.sum() 119 | z.backward() 120 | torch.testing.assert_close(x.grad, 2 * torch.ones_like(x)) 121 | except Exception: 122 | sys.stderr.write(traceback.format_exc() + "\n" + err_msg + "\n\n") 123 | sys.exit(1) 124 | 125 | 126 | _SANITY_CHECK_ARGS = ( 127 | ("eager", "cpu", "CPU eager sanity check failed"), 128 | ("eager", "cuda", "CUDA eager sanity check failed"), 129 | ("aot_eager", "cpu", "CPU aot_eager sanity check failed"), 130 | ("aot_eager", "cuda", "CUDA aot_eager sanity check failed"), 131 | ("inductor", "cpu", "CPU inductor sanity check failed"), 132 | ( 133 | "inductor", 134 | "cuda", 135 | "CUDA inductor sanity check failed\n" 136 | + "NOTE: Please check that you installed the correct hash/version of `triton`", 137 | ), 138 | ) 139 | 140 | 141 | def main(): 142 | python_ver = check_python() 143 | torch_ver = check_torch() 144 | cuda_ver = check_cuda() 145 | print( 146 | f"Python version: {python_ver.major}.{python_ver.minor}.{python_ver.micro}\n" 147 | f"`torch` version: {torch_ver}\n" 148 | f"CUDA version: {cuda_ver}\n" 149 | ) 150 | for args in _SANITY_CHECK_ARGS: 151 | check_dynamo(*args) 152 | print("All required checks passed") 153 | 154 | 155 | if __name__ == "__main__": 156 | main() 157 | --------------------------------------------------------------------------------