├── .gitignore ├── LICENSE ├── README.md ├── ch_03 ├── 01_word2vec.ipynb ├── 02_fasttext.ipynb ├── 03_character_language_model.ipynb ├── charlm_dataset_kafka.pt └── charlm_kafka.pt ├── ch_04 ├── 01_positional_encodings.ipynb ├── 02_encoder_transformers_nlp_tasks.ipynb └── 03_gpt2_headlines_generator.ipynb ├── ch_05 ├── 01_instruction_tuning.ipynb ├── 02_RLHF_gpt2_positive_reviewer.ipynb ├── assets │ └── instruct_gpt_rlhf.png └── news_english_german_instruction_dataset_20240909.json ├── ch_06 └── Chapter6_huggingface_openllms.ipynb ├── ch_07 ├── 01_prompt_engineering.ipynb └── assets │ ├── ch_07_06.png │ ├── ch_07_08.png │ ├── ch_07_09.png │ ├── llava_test_image.png │ └── llava_test_image_2.jpg ├── ch_08 └── Chapter8.ipynb ├── ch_09 ├── 01_llm_training_and_scaling.ipynb ├── 02_pretraining_optimizations.ipynb ├── 03_finetuning_optimizations.ipynb ├── 04_instruction_tuning_llama_t2sql.ipynb └── assets │ ├── ch_09_01.png │ ├── ch_09_02.png │ ├── ch_09_03.png │ ├── ch_09_04.png │ ├── ch_09_05.png │ ├── ch_09_09.png │ ├── ch_09_10.png │ └── llama.png ├── ch_11 └── Chapter11.ipynb ├── ch_12 ├── 01_vanilla_gan.ipynb ├── 02_deep_convolutional_gan.ipynb ├── 03_conditional_gan.ipynb └── 04_progressive_gan.ipynb ├── ch_13 ├── cyclegan.ipynb └── pix2pix.ipynb ├── ch_14 ├── 01_dlib_facial_landmarks_demo.ipynb ├── 02_face_recognition_demo.ipynb ├── 03_reenactment_pix2pix_training.ipynb ├── 04_reactment_pix2pix.ipynb ├── constants.py ├── dataset_utils.py ├── deepfake_banner.png ├── face_utils.py ├── gan_utils.py ├── nicolas_ref_cc.jpg ├── obama.mp4 ├── sample_image_cc.png └── trump_ref_cc.jpg └── ch_15 └── StableDiffusionExample.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | *.DS_Store 2 | *.ipynb_checkpoints/ 3 | *cached_* 4 | *dontcommit* 5 | *__pycache__* -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Packt 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

2 | Generative AI with Python and PyTorch, Second Edition

3 |

This is the code repository for Generative AI with Python and PyTorch, Second Edition, published by Packt. 4 |

5 | 6 |

7 | Hands-on projects and cutting-edge techniques using generative adversarial networks and LLMs 8 |

9 |

10 | Joseph Babcock, Raghav Bali

11 | 12 |

13 | 14 |       15 | Free PDF 16 |       17 | Graphic Bundle 18 |       19 | Amazon 20 |       21 |

22 |
23 |

About the book

24 | 25 | Generative AI with Python and PyTorch, Second Edition 26 | 27 | 28 | Become an expert in generative AI through practical projects to leverage cutting-edge models for Natural Language Processing (NLP) and computer vision. Generative AI with Python and PyTorch, Second Edition, is your comprehensive guide to creating advanced AI applications. Leveraging Python, this book provides a detailed exploration of the latest generative AI technologies. 29 | 30 | From NLP to image generation, this edition dives into practical applications and the underlying theories that enable these technologies. By integrating the latest advancements and applications of large language models, this book prepares you to design and implement powerful AI systems that transform data into actionable insights. 31 | 32 | You’ll build your LLM toolbox by learning about various models, tools, and techniques, including GPT-4, LangChain, RLHF, LoRA, and retrieval augmented generation. This deep learning book shows you how to generate images and apply style transfer using GANs, before implementing CLIP and diffusion models. 33 | 34 | Whether you’re creating dynamic content or developing complex AI-driven solutions, Generative AI with Python and PyTorch, Second Edition, equips you with the knowledge to use Python and AI to their full potential. 35 |
36 |
37 |

Key Learnings

38 | 53 | 54 |
55 | 56 |
57 |

Chapters

58 | 59 | 60 | | Chapters | Colab | Kaggle | Gradient | Studio Lab | 61 | | :-------- | :-------- | :------- | :-------- | :-------- | 62 | | **Chapter 1: An Introduction to Generative AI: "Drawing" Data from Models** | | | | | 63 | | **Chapter 2: Building Blocks of Deep Neural Networks** | | | | | 64 | | **Chapter 3: The Rise of Methods for Text Generation** | | | | | 65 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 66 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 67 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 68 | | **Chapter 4: NLP 2.0: Using Transformers to Generate Text** | | | | | 69 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 70 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 71 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 72 | | **Chapter 5: Foundations of LLMs** | | | | | 73 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 74 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 75 | | **Chapter 6: Open Source LLMs** | | | | | 76 | | **Chapter 7: Prompt Engineering** | | | | | 77 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 78 | | **Chapter 8: LLM Toolbox/Ecosystem** | | | | | 79 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 80 | | **Chapter 9: LLM Optimisation Techniques** | | | | | 81 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 82 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 83 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 84 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 85 | | **Chapter 10: Emerging Applications in Generative AI** | | | | | 86 | | **Chapter 11: Neural Networks Using VAEs** | | | | | 87 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 88 | | **Chapter 12: Image Generation with GANs** | | | | | 89 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 90 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 91 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 92 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 93 | | **Chapter 13: Style Transfer with GANs** | | | | | 94 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 95 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 96 | | **Chapter 14: Deepfakes with GANs** | | | | | 97 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 98 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 99 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 100 | | | Open In Colab
| Open In Kaggle
| Open In Gradient
| Open In Studio Lab
| 101 | | **Chapter 15: Diffusion Models and AI Art** | | | | | 102 | 103 | 104 | 105 | 106 | 107 | 108 |
109 | 110 | 111 |
112 |

Requirements for this book

113 | Basic understanding of Python syntax and programming experience will help you understand the majority of the code base. Additionally, an intermediate-level understanding of concepts related to machine learning and deep learning would enable you to appreciate and understand complex generative models and techniques discussed throughout the book. 114 | A quick setup guide is as follows: 115 | 116 | ### Hardware (Minimum) 117 | - 512 GB HDD 118 | - 32 GB RAM 119 | - Intel Core i5 processor or better / Apple Silicon M1 or better 120 | - Access to a 32 GB graphics card or better (T4 or better) 121 | 122 | ### Software 123 | - Python 3.11 and above 124 | - PyTorch 2.5.x and above 125 | 126 | ### Browser 127 | - Chrome, Safari, or Firefox for executing code directly via Google Colab or other cloud services
128 | 129 | 130 | 131 |
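To confirm that a local setup meets these requirements before running the notebooks, a short check like the one below can help. This is only a suggested sanity check, not part of the book's chapter code; it assumes Python and PyTorch are already installed and simply reports the interpreter version, the PyTorch version, and any visible GPU.

```python
# Suggested environment sanity check (assumes Python and PyTorch are installed).
import sys
import torch

print(f"Python : {sys.version.split()[0]}")   # expecting 3.11 or above
print(f"PyTorch: {torch.__version__}")        # expecting 2.5.x or above

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU    : {props.name} ({props.total_memory / 1e9:.0f} GB)")
elif torch.backends.mps.is_available():
    print("GPU    : Apple Silicon (MPS) backend available")
else:
    print("No GPU detected; consider Google Colab/Kaggle or expect slow CPU-only runs")
```

If no suitable GPU is available locally, the Colab/Kaggle/Gradient/Studio Lab links in the Chapters table above provide hosted alternatives.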
132 |

Get to know Authors

133 | 134 | _Joseph Babcock_ Joseph Babcock has spent over a decade working with big data and AI in the e-commerce, digital streaming, and quantitative finance domains. Throughout his career, he has worked on recommender systems, petabyte-scale cloud data pipelines, A/B testing, causal inference, and time series analysis. He completed his PhD studies at Johns Hopkins University, applying machine learning to drug discovery and genomics. 135 | 136 | _Raghav Bali_ Raghav Bali is a Staff Data Scientist at Delivery Hero, a leading food delivery service headquartered in Berlin, Germany. With 12+ years of expertise, he specializes in research and development of enterprise-level solutions leveraging Machine Learning, Deep Learning, Natural Language Processing, and Recommendation Engines for practical business applications. 137 | Besides his professional endeavors, Raghav is an esteemed mentor and an accomplished public speaker. He has contributed to multiple peer-reviewed papers and authored multiple well-received books. Additionally, he holds co-inventor credits on multiple patents in healthcare, machine learning, deep learning, and natural language processing. 138 | 139 | 140 | 141 |
142 |
143 |

Other Related Books

144 | 151 | 152 |
153 | -------------------------------------------------------------------------------- /ch_03/charlm_dataset_kafka.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/5992bcc2e28c2ad573fa39d290a6e342b4d3820e/ch_03/charlm_dataset_kafka.pt -------------------------------------------------------------------------------- /ch_03/charlm_kafka.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/5992bcc2e28c2ad573fa39d290a6e342b4d3820e/ch_03/charlm_kafka.pt -------------------------------------------------------------------------------- /ch_04/03_gpt2_headlines_generator.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "19b9f9be-bff1-47dc-8be0-576ef6a557ff", 6 | "metadata": { 7 | "editable": true, 8 | "slideshow": { 9 | "slide_type": "" 10 | }, 11 | "tags": [] 12 | }, 13 | "source": [ 14 | "# Generating News Headlines using GPT2\n", 15 | "## GPT2\n", 16 | "The GPT-2 model was released as part of the work titled “Language Models are Unsupervised Multitask Learners” in 2019. The largest GPT-2 variant is a huge 1.5B-parameter transformer-based model that performs remarkably well on various NLP tasks. The most striking aspect of this work is that the authors showcase how a model trained in an unsupervised fashion (language modeling) achieves state-of-the-art performance in a zero-shot setting.\n", 17 | "\n", 18 | "## HuggingFace Transformers\n", 19 | "One of the most popular Python packages for working with transformer-based NLP models. HuggingFace Transformers is a high-level API to easily load, fine-tune, and re-train models such as GPT2, BERT, T5, and so on.\n", 20 | "\n", 21 | "## Fake Headlines\n", 22 | "The ABC News dataset is a dataset of a million headlines, available here, collected over a period of 17 years. We will make use of this dataset to fine-tune the GPT2 model. 
Once fine-tuned we will use it to generate some fake headlines\n", 23 | "\n", 24 | "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/blob/master/ch_04/03_gpt2_headlines_generator.ipynb)" 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": 4, 30 | "id": "d4f3d782-174b-4c55-8d11-8b9a52c34fe9", 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "# !pip3 install scikit-learn==1.5.1\n", 35 | "# !pip3 install transformers==4.42.4" 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": 1, 41 | "id": "a457e201-f112-4bbe-9ebb-04a39961ce73", 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "import pandas as pd\n", 46 | "from sklearn.model_selection import train_test_split\n", 47 | "\n", 48 | "import torch\n", 49 | "from transformers import pipeline\n", 50 | "from transformers import AutoTokenizer\n", 51 | "from transformers import TextDataset,DataCollatorForLanguageModeling\n", 52 | "from transformers import Trainer, TrainingArguments,AutoModelForCausalLM" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "id": "fffef4bc-e884-4509-9636-eedb039e8961", 58 | "metadata": {}, 59 | "source": [ 60 | "### Prepare Dataset" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 2, 66 | "id": "791b7720-872d-4da9-84a2-d7000e7946fb", 67 | "metadata": {}, 68 | "outputs": [], 69 | "source": [ 70 | "# download from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL\n", 71 | "# !unzip abcnews.zip" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 22, 77 | "id": "15c18110-cd6a-4695-97fe-17f279ec87e3", 78 | "metadata": {}, 79 | "outputs": [ 80 | { 81 | "data": { 82 | "text/plain": [ 83 | "(1241692, 2)" 84 | ] 85 | }, 86 | "execution_count": 22, 87 | "metadata": {}, 88 | "output_type": "execute_result" 89 | } 90 | ], 91 | "source": [ 92 | "news = pd.read_csv('abcnews-date-text-sample.csv')\n", 93 | "news.shape" 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": 25, 99 | "id": "ce909286-1c48-45bf-8227-51dccaabca6f", 100 | "metadata": {}, 101 | "outputs": [ 102 | { 103 | "data": { 104 | "text/html": [ 105 | "
\n", 106 | "\n", 119 | "\n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | "
publish_dateheadline_textline_length
020030219aba decides against community broadcasting lic...50
120030219act fire witnesses must be aware of defamation46
220030219a g calls for infrastructure protection summit46
320030219air nz staff in aust strike for pay rise40
420030219air nz strike to affect australian travellers45
\n", 161 | "
" 162 | ], 163 | "text/plain": [ 164 | " publish_date headline_text \\\n", 165 | "0 20030219 aba decides against community broadcasting lic... \n", 166 | "1 20030219 act fire witnesses must be aware of defamation \n", 167 | "2 20030219 a g calls for infrastructure protection summit \n", 168 | "3 20030219 air nz staff in aust strike for pay rise \n", 169 | "4 20030219 air nz strike to affect australian travellers \n", 170 | "\n", 171 | " line_length \n", 172 | "0 50 \n", 173 | "1 46 \n", 174 | "2 46 \n", 175 | "3 40 \n", 176 | "4 45 " 177 | ] 178 | }, 179 | "execution_count": 25, 180 | "metadata": {}, 181 | "output_type": "execute_result" 182 | } 183 | ], 184 | "source": [ 185 | "news.head()" 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": 28, 191 | "id": "9e9205e3-0921-45c1-af1e-68889ee2342d", 192 | "metadata": {}, 193 | "outputs": [ 194 | { 195 | "data": { 196 | "text/plain": [ 197 | "" 198 | ] 199 | }, 200 | "execution_count": 28, 201 | "metadata": {}, 202 | "output_type": "execute_result" 203 | }, 204 | { 205 | "data": { 206 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAkIAAAGdCAYAAAD+JxxnAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/TGe4hAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAuAElEQVR4nO3df1iUdb7/8RcgDGCCogvIEZWtTv7WwiT6dSyR0bg6ubltPzxmZnblBW3IOaa2SqjtstH6s0hOW2p7rexq17VZaQeZMDVX1ETZ1NKt1s06NtjJH5OYw8jM94+9uL+OIEgLjM7n+bguLp37fnPP+/0Rxhf3zD2E+Hw+nwAAAAwUGugGAAAAAoUgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwVqdAN3A583q9Onr0qLp06aKQkJBAtwMAAC6Bz+fTd999p6SkJIWGNn/OhyDUjKNHjyo5OTnQbQAAgB/gyy+/VK9evZqtIQg1o0uXLpL+sZAxMTEB7qZteTwelZeXKzMzU+Hh4YFuJyBMXwPmN3t+iTUwfX4peNfA5XIpOTnZ+n+8OQShZjQ8HRYTExOUQSg6OloxMTFB9cXfGqavAfObPb/EGpg+vxT8a3ApL2vhxdIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxuoU6AYAIJAGFWyUuz4k0G1csr//OivQLQBBhTNCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYvLM0cBnqO2tDu9+HLcynohFt987KvOMxgCsRQQhB72Khoq2DAADgysNTYwAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxOgW6AQDBoe+sDYFuoVVsYT4VjQh0FwACjTNCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGCsVgWhwsJC3XjjjerSpYvi4+M1btw4HTp0yK9m5MiRCgkJ8ft44okn/GqOHDmirKwsRUdHKz4+XjNmzNC5c+f8ajZv3qwbbrhBNptN11xzjVatWtWon+LiYvXt21eRkZFKS0vTrl27/PafPXtW2dnZ6t69u6666iqNHz9eNTU1rRkZAAAEsVYFoS1btig7O1s7duyQw+GQx+NRZmamamtr/eqmTp2qr7/+2vooKiqy9tXX1ysrK0t1dXXavn27Xn/9da1atUr5+flWzeHDh5WVlaU77rhD1dXVys3N1WOPPaaNGzdaNWvWrFFeXp6effZZ7dmzR0OHDpXdbtexY8esmunTp+udd97RG2+8oS1btujo0aO69957W71IAAAgOLXql66WlZX53V61apXi4+NVVVWl22+/3doeHR2txMTEJo9RXl6ujz/+WO+9954SEhI0bNgwLViwQDNnzlRBQYEiIiJUUlKilJQULVy4UJLUv39/bdu2TYsXL5bdbpckLVq0SFOnTtXkyZMlSSUlJdqwYYNWrFihWbNm6dSpU3rttddUWlqqO++8U5K0cuVK9e/fXzt27NBNN93UmtEBAEAQ+qd++/ypU6ckSXFxcX7bV69erd///vdKTEzU3Xffrblz5yo6OlqSVFlZqcGDByshIcGqt9vtmjZtmg4cOKDrr79elZWVysjI8Dum3W5Xbm6uJKmurk5VVVWaPXu2tT80NFQZGRmqrKyUJFVVVcnj8fgdp1+/furdu7cqKyubDEJut1tut9u67XK5JEkej0cej6fV63M5a5gn2OZqii3M1/T2UJ/fn6Zh/itz/rb8njXpcaApps8vBe8atGaeHxyEvF6vcnNzdcstt2jQoEHW9oceekh9+vRRUlKSPvroI82cOVOHDh3Sn/70J0mS0+n0C0GSrNtOp7PZGpfLpe+//14nTpxQfX19kzUHDx60jhEREaGuXbs2qmm4nwsVFhZq3rx5jbaXl5dbQS7YOByOQLfQ7opGNL9/wXBvxzRymWL+K2v+d999t82PacLjQHNMn18KvjU4c+bMJdf+4CCUnZ2t/fv3a9u2bX7bH3/8cevvgwcPVs+ePTVq1Ch9/vnnuvrqq3/o3XWI2bNnKy8vz7rtcrmUnJyszMxMxcTEBLCztufxeORwODR69GiFh4cHup12NahgY5PbbaE+LRju1dzdoXJ7Qzq4q8Bj/itz/v0F9jY7lkmPA00xfX4peNeg4RmdS/GDglBOTo7Wr1+vrVu3qlevXs3WpqWlSZI+++wzXX311UpMTGx0dVfDlVwNrytKTExsdHVXTU
2NYmJiFBUVpbCwMIWFhTVZc/4x6urqdPLkSb+zQufXXMhms8lmszXaHh4eHlRfIOcL5tkauOub/0/O7Q1psSaYMf+VNX97fL+a8DjQHNPnl4JvDVozS6uuGvP5fMrJydGbb76pTZs2KSUlpcXPqa6uliT17NlTkpSenq59+/b5Xd3lcDgUExOjAQMGWDUVFRV+x3E4HEpPT5ckRUREKDU11a/G6/WqoqLCqklNTVV4eLhfzaFDh3TkyBGrBgAAmK1VZ4Sys7NVWlqqt956S126dLFeaxMbG6uoqCh9/vnnKi0t1V133aXu3bvro48+0vTp03X77bdryJAhkqTMzEwNGDBAEydOVFFRkZxOp+bMmaPs7GzrbMwTTzyhl156SU8//bQeffRRbdq0SWvXrtWGDRusXvLy8jRp0iQNHz5cI0aM0JIlS1RbW2tdRRYbG6spU6YoLy9PcXFxiomJ0ZNPPqn09HSuGAMAAJJaGYSWL18u6R9vmni+lStX6pFHHlFERITee+89K5QkJydr/PjxmjNnjlUbFham9evXa9q0aUpPT1fnzp01adIkzZ8/36pJSUnRhg0bNH36dC1dulS9evXSq6++al06L0n333+/vvnmG+Xn58vpdGrYsGEqKyvzewH14sWLFRoaqvHjx8vtdstut+vll19u1QIBAIDg1aog5PM1f5lpcnKytmzZ0uJx+vTp0+KVDyNHjtTevXubrcnJyVFOTs5F90dGRqq4uFjFxcUt9gQAAMzD7xoDAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxWhWECgsLdeONN6pLly6Kj4/XuHHjdOjQIb+as2fPKjs7W927d9dVV12l8ePHq6amxq/myJEjysrKUnR0tOLj4zVjxgydO3fOr2bz5s264YYbZLPZdM0112jVqlWN+ikuLlbfvn0VGRmptLQ07dq1q9W9AAAAc7UqCG3ZskXZ2dnasWOHHA6HPB6PMjMzVVtba9VMnz5d77zzjt544w1t2bJFR48e1b333mvtr6+vV1ZWlurq6rR9+3a9/vrrWrVqlfLz862aw4cPKysrS3fccYeqq6uVm5urxx57TBs3brRq1qxZo7y8PD377LPas2ePhg4dKrvdrmPHjl1yLwAAwGydWlNcVlbmd3vVqlWKj49XVVWVbr/9dp06dUqvvfaaSktLdeedd0qSVq5cqf79+2vHjh266aabVF5ero8//ljvvfeeEhISNGzYMC1YsEAzZ85UQUGBIiIiVFJSopSUFC1cuFCS1L9/f23btk2LFy+W3W6XJC1atEhTp07V5MmTJUklJSXasGGDVqxYoVmzZl1SLwAAwGytCkIXOnXqlCQpLi5OklRVVSWPx6OMjAyrpl+/furdu7cqKyt10003qbKyUoMHD1ZCQoJVY7fbNW3aNB04cEDXX3+9Kisr/Y7RUJObmytJqqurU1VVlWbPnm3tDw0NVUZGhiorKy+5lwu53W653W7rtsvlkiR5PB55PJ4ftEaXq4Z5gm2uptjCfE1vD/X5/Wka5r8y52/L71mTHgeaYvr8UvCuQWvm+cFByOv1Kjc3V7fccosGDRokSXI6nYqIiFDXrl39ahMSEuR0Oq2a80NQw/6Gfc3VuFwuff/99zpx4oTq6+ubrDl48OAl93KhwsJCzZs3r9H28vJyRUdHX2wprmgOhyPQLbS7ohHN718w3NsxjVymmP/Kmv/dd99t82Oa8DjQHNPnl4JvDc6cOXPJtT84CGVnZ2v//v3atm3bDz3EZWf27NnKy8uzbrtcLiUnJyszM1MxMTEB7KzteTweORwOjR49WuHh4YFup10NKtjY5HZbqE8Lhns1d3eo3N6QDu4q8Jj/ypx/f4G9zY5l0uNAU0yfXwreNWh4RudS/KAglJOTo/Xr12vr1q3q1auXtT0xMVF1dXU6efKk35mYmpoaJSYmWjUXXt3VcCXX+TUXXt1VU1OjmJgYRUVFKSwsTGFhYU3WnH+Mlnq5kM1mk81ma7Q9PDw8qL5AzhfMszVw1zf/n5zbG9JiTTBj/itr/vb4fjXhcaA5ps8vBd8atGaWVl015vP5lJOTozfffFObNm1SSkqK3/7U1FSFh4eroqLC2nbo0CEdOXJE6enpkqT09HTt27fP7+ouh8OhmJgYDRgwwKo5/xgNNQ3HiIiIUGpqql+N1+tVRUWFVXMpvQAAALO16oxQdna2SktL9dZbb6lLly7Wa21iY2MVFRWl2NhYTZkyRXl5eYqLi1NMTIyefPJJpaenWy9OzszM1IABAzRx4kQVFRXJ6XRqzpw5ys7Ots7GPPHEE3rppZf09NNP69FHH9WmTZu0du1abdiwweolLy9PkyZN0vDhwzVixAgtWbJEtbW11lVkl9ILAAAwW6uC0PLlyyVJI0eO9Nu+cuVKPfLII5KkxYsXKzQ0VOPHj5fb7ZbdbtfLL79s1YaFhWn9+vWaNm2a0tPT1blzZ02aNEnz58+3alJSUrRhwwZNnz5dS5cuVa9evfTqq69al85L0v33369vvvlG+fn5cjqdGjZsmMrKyvxeQN1SLwAAwGytCkI+X8uXmUZGRqq4uFjFxcUXrenTp0+LVz6MHDlSe/fubbYmJydHOTk5/1QvAADAXPyuMQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGKvVQWjr1q26++67lZSUpJCQEK1bt85v/yOPPKKQkBC/jzFjxvjVHD9+XBMmTFBMTIy6du2qKVOm6PTp0341H330kW677TZFRkYqOTlZRUVFjXp544031K9fP0VGRmrw4MF69913/fb7fD7l5+erZ8+eioqKUkZGh
j799NPWjgwAAIJUq4NQbW2thg4dquLi4ovWjBkzRl9//bX18Yc//MFv/4QJE3TgwAE5HA6tX79eW7du1eOPP27td7lcyszMVJ8+fVRVVaUXXnhBBQUFeuWVV6ya7du368EHH9SUKVO0d+9ejRs3TuPGjdP+/futmqKiIi1btkwlJSXauXOnOnfuLLvdrrNnz7Z2bAAAEIQ6tfYTxo4dq7FjxzZbY7PZlJiY2OS+Tz75RGVlZfrwww81fPhwSdKLL76ou+66S7/5zW+UlJSk1atXq66uTitWrFBERIQGDhyo6upqLVq0yApMS5cu1ZgxYzRjxgxJ0oIFC+RwOPTSSy+ppKREPp9PS5Ys0Zw5c3TPPfdIkn73u98pISFB69at0wMPPNDa0QEAQJBpdRC6FJs3b1Z8fLy6deumO++8U88995y6d+8uSaqsrFTXrl2tECRJGRkZCg0N1c6dO/WTn/xElZWVuv322xUREWHV2O12Pf/88zpx4oS6deumyspK5eXl+d2v3W63nqo7fPiwnE6nMjIyrP2xsbFKS0tTZWVlk0HI7XbL7XZbt10ulyTJ4/HI4/H88wtzGWmYJ9jmaootzNf09lCf35+mYf4rc/62/J416XGgKabPLwXvGrRmnjYPQmPGjNG9996rlJQUff7553rmmWc0duxYVVZWKiwsTE6nU/Hx8f5NdOqkuLg4OZ1OSZLT6VRKSopfTUJCgrWvW7ducjqd1rbza84/xvmf11TNhQoLCzVv3rxG28vLyxUdHX2pS3BFcTgcgW6h3RWNaH7/guHejmnkMsX8V9b8F74Wsi2Y8DjQHNPnl4JvDc6cOXPJtW0ehM4/0zJ48GANGTJEV199tTZv3qxRo0a19d21qdmzZ/udZXK5XEpOTlZmZqZiYmIC2Fnb83g8cjgcGj16tMLDwwPdTrsaVLCxye22UJ8WDPdq7u5Qub0hHdxV4DH/lTn//gJ7mx3LpMeBppg+vxS8a9DwjM6laJenxs734x//WD169NBnn32mUaNGKTExUceOHfOrOXfunI4fP269rigxMVE1NTV+NQ23W6o5f3/Dtp49e/rVDBs2rMlebTabbDZbo+3h4eFB9QVyvmCerYG7vvn/5NzekBZrghnzX1nzt8f3qwmPA80xfX4p+NagNbO0+/sIffXVV/r222+tMJKenq6TJ0+qqqrKqtm0aZO8Xq/S0tKsmq1bt/o9x+dwOHTdddepW7duVk1FRYXffTkcDqWnp0uSUlJSlJiY6Ffjcrm0c+dOqwYAAJit1UHo9OnTqq6uVnV1taR/vCi5urpaR44c0enTpzVjxgzt2LFDf//731VRUaF77rlH11xzjez2f5zO7d+/v8aMGaOpU6dq165d+vOf/6ycnBw98MADSkpKkiQ99NBDioiI0JQpU3TgwAGtWbNGS5cu9Xva6qmnnlJZWZkWLlyogwcPqqCgQLt371ZOTo4kKSQkRLm5uXruuef09ttva9++fXr44YeVlJSkcePG/ZPLBgAAgkGrnxrbvXu37rjjDut2QziZNGmSli9fro8++kivv/66Tp48qaSkJGVmZmrBggV+TzmtXr1aOTk5GjVqlEJDQzV+/HgtW7bM2h8bG6vy8nJlZ2crNTVVPXr0UH5+vt97Dd18880qLS3VnDlz9Mwzz+jaa6/VunXrNGjQIKvm6aefVm1trR5//HGdPHlSt956q8rKyhQZGdnasQEAQBBqdRAaOXKkfL6LX266cWPTL0w9X1xcnEpLS5utGTJkiD744INma+677z7dd999F90fEhKi+fPna/78+S32BAAAzMPvGgMAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGO1Oght3bpVd999t5KSkhQSEqJ169b57ff5fMrPz1fPnj0VFRWljIwMffrpp341x48f14QJExQTE6OuXbtqypQpOn36tF/NRx99pNtuu02RkZFKTk5WUVFRo17eeOMN9evXT5GRkRo8eLDefffdVvcCAADM1eogVFtbq6FDh6q4uLjJ/UVFRVq2bJlKSkq0c+dOde7cWXa7XWfPnrVqJkyYoAMHDsjhcGj9+vXaunWrHn/8cWu/y+VSZmam+vTpo6qqKr3wwgsqKCjQK6+8YtVs375dDz74oKZMmaK9e/dq3LhxGjdunPbv39+qXgAAgLk6tfYTxo4dq7Fjxza5z+fzacmSJZozZ47uueceSdLvfvc7JSQkaN26dXrggQf0ySefqKysTB9++KGGDx8uSXrxxRd111136Te/+Y2SkpK0evVq1dXVacWKFYqIiNDAgQNVXV2tRYsWWYFp6dKlGjNmjGbMmCFJWrBggRwOh1566SWVlJRcUi8AAMBsbfoaocOHD8vpdCojI8PaFhsbq7S0NFVWVkqSKisr1bVrVysESVJGRoZCQ0O1c+dOq+b2229XRESEVWO323Xo0CGdOHHCqjn/fhpqGu7nUnoBAABma/UZoeY4nU5JUkJCgt/2hIQEa5/T6VR8fLx/E506KS4uzq8mJSWl0TEa9nXr1k1Op7PF+2mplwu53W653W7rtsvlkiR5PB55PJ7mRr/iNMwTbHM1xRbma3p7qM/vT9Mw/5U5f1t+z5r0ONAU0+eXgncNWjNPmwahK11hYaHmzZvXaHt5ebmio6MD0FH7czgcgW6h3RWNaH7/guHejmnkMsX8V9b8F14U0hZMeBxojunzS8G3BmfOnLnk2jYNQomJiZKkmpoa9ezZ09peU1OjYcOGWTXHjh3z+7xz587p+PHj1ucnJiaqpqbGr6bhdks15+9vqZcLzZ49W3l5edZtl8ul5ORkZWZmKiYmpuUFuIJ4PB45HA6NHj1a4eHhgW6nXQ0q2NjkdluoTwuGezV3d6jc3pAO7irwmP/KnH9/gb3NjmXS40BTTJ9fCt41aHhG51K0aRBKSUlRYmKiKioqrLDhcrm0c+dOTZs2TZKUnp6ukydPqqqqSqmpqZKkTZs2yev1Ki0tzar5xS9+IY/HY/3DOBwOXXfdderWrZtVU1FRodzcXOv+HQ6H0tPTL7mXC9lsNtlstkbbw8PDg+oL5HzBPFsDd33z/8m5vSEt1gQz5r+y5m+P71cTHgeaY/r8UvCtQWtmafWLpU+fPq3q6mpVV1dL+seLkqurq3XkyBGFhIQoNzdXzz33nN5++23t27dPDz/8sJKSkjRu3DhJUv/+
/TVmzBhNnTpVu3bt0p///Gfl5OTogQceUFJSkiTpoYceUkREhKZMmaIDBw5ozZo1Wrp0qd/ZmqeeekplZWVauHChDh48qIKCAu3evVs5OTmSdEm9AAAAs7X6jNDu3bt1xx13WLcbwsmkSZO0atUqPf3006qtrdXjjz+ukydP6tZbb1VZWZkiIyOtz1m9erVycnI0atQohYaGavz48Vq2bJm1PzY2VuXl5crOzlZqaqp69Oih/Px8v/cauvnmm1VaWqo5c+bomWee0bXXXqt169Zp0KBBVs2l9AIAAMzV6iA0cuRI+XwXv8oiJCRE8+fP1/z58y9aExcXp9LS0mbvZ8iQIfrggw+arbnvvvt03333/VO9AAAAc/G7xgAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYKw2D0IFBQUKCQnx++jXr5+1/+zZs8rOzlb37t111VVXafz48aqpqfE7xpEjR5SVlaXo6GjFx8drxowZOnfunF/N5s2bdcMNN8hms+maa67RqlWrGvVSXFysvn37KjIyUmlpadq1a1dbjwsAAK5g7XJGaODAgfr666+tj23btln7pk+frnfeeUdvvPGGtmzZoqNHj+ree++19tfX1ysrK0t1dXXavn27Xn/9da1atUr5+flWzeHDh5WVlaU77rhD1dXVys3N1WOPPaaNGzdaNWvWrFFeXp6effZZ7dmzR0OHDpXdbtexY8faY2QAAHAFapcg1KlTJyUmJlofPXr0kCSdOnVKr732mhYtWqQ777xTqampWrlypbZv364dO3ZIksrLy/Xxxx/r97//vYYNG6axY8dqwYIFKi4uVl1dnSSppKREKSkpWrhwofr376+cnBz99Kc/1eLFi60eFi1apKlTp2ry5MkaMGCASkpKFB0drRUrVrTHyAAA4ArUqT0O+umnnyopKUmRkZFKT09XYWGhevfuraqqKnk8HmVkZFi1/fr1U+/evVVZWambbrpJlZWVGjx4sBISEqwau92uadOm6cCBA7r++utVWVnpd4yGmtzcXElSXV2dqqqqNHv2bGt/aGioMjIyVFlZedG+3W633G63ddvlckmSPB6PPB7PP7Uml5uGeYJtrqbYwnxNbw/1+f1pGua/Mue/7hfr2+xYtlCfFgyXUueXye0NabPjNmV/gb1dj/9DmPQ4eDHBugatmafNg1BaWppWrVql6667Tl9//bXmzZun2267Tfv375fT6VRERIS6du3q9zkJCQlyOp2SJKfT6ReCGvY37GuuxuVy6fvvv9eJEydUX1/fZM3Bgwcv2nthYaHmzZvXaHt5ebmio6MvbQGuMA6HI9AttLuiEc3vXzDc2zGNXKaY3+z5pY5Zg3fffbfd7+OHMuFxsCXBtgZnzpy55No2D0Jjx461/j5kyBClpaWpT58+Wrt2raKiotr67trU7NmzlZeXZ912uVxKTk5WZmamYmJiAthZ2/N4PHI4HBo9erTCw8MD3U67GlSwscnt//hp2Ku5u0Pb/afhyxHzmz2/1LFrcLmeETLlcfBignUNGp7RuRTt8tTY+bp27ap//dd/1WeffabRo0errq5OJ0+e9DsrVFNTo8TERElSYmJio6u7Gq4qO7/mwivNampqFBMTo6ioKIWFhSksLKzJmoZjNMVms8lmszXaHh4eHlRfIOcL5tkauOubf4B3e0NarAlmzG/2/FLHrMHl/DhjwuNgS4JtDVozS7u/j9Dp06f1+eefq2fPnkpNTVV4eLgqKiqs/YcOHdKRI0eUnp4uSUpPT9e+ffv8ru5yOByKiYnRgAEDrJrzj9FQ03CMiIgIpaam+tV4vV5VVFRYNQAAAG0ehP7rv/5LW7Zs0d///ndt375dP/nJTxQWFqYHH3xQsbGxmjJlivLy8vT++++rqqpKkydPVnp6um666SZJUmZmpgYMGKCJEyfqL3/5izZu3Kg5c+YoOzvbOlvzxBNP6G9/+5uefvppHTx4UC+//LLWrl2r6dOnW33k5eXpt7/9rV5//XV98sknmjZtmmprazV58uS2HhkAAFyh2vypsa+++koPPvigvv32W/3oRz/Srbfeqh07duhHP/qRJGnx4sUKDQ3V+PHj5Xa7Zbfb9fLLL1ufHxYWpvXr12vatGlKT09X586dNWnSJM2fP9+qSUlJ0YYNGzR9+nQtXbpUvXr10quvviq7/f8/B33//ffrm2++UX5+vpxOp4YNG6aysrJGL6AGAADmavMg9Mc//rHZ/ZGRkSouLlZxcfFFa/r06dPiFQYjR47U3r17m63JyclRTk5OszUAAMBc/K4xAABgLIIQAAAwVrtfPo/g0nfWhkC3AABAm+GMEAAAMBZBCAAAGIsgBAAAjEUQAgAAxiIIAQAAYxGEAACAsQhCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFidAt0AACD49Z21IdAtNGIL86lohDSoYKPc9SGN9v/911kB6AodjTNCAADAWJwRCqBA/oTU0k9CAACYgDNCAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABiLIAQAAIxFEAIAAMYiCAEAAGMRhAAAgLEIQgAAwFgEIQAAYCyCEAAAMBZBCAAAGIsgBAAAjEUQAgAAxjIiCBUXF6tv376KjIxUWlqadu3aFeiWAADAZSDog9CaNWuUl5enZ599Vnv27NHQoUNlt9t17NixQLcGAAACLOiD0KJFizR16lRNnjxZAwYMUElJiaKjo7VixYpAtwYAAAKsU6AbaE91dXWqqqrS7NmzrW2hoaHKyMhQZWVlo3q32y23223dPnXqlCTp+PHj8ng8bd5fp3O1bX7MS75vr09nznjVyROqem9IwPoIJNPXgPnNnl9iDVqa/5r/WhuArv45O2ePalW9x+PRmTNn9O233yo8PLyduup43333nSTJ5/O1WBvUQej//u//VF9fr4SEBL/tCQkJOnjwYKP6wsJCzZs3r9H2lJSUdusxkB4KdAOXAdPXgPlh+hoE2/w9Fga6g8vLd999p9jY2GZrgjoItdbs2bOVl5dn3fZ6vTp
+/Li6d++ukJDg+mnJ5XIpOTlZX375pWJiYgLdTkCYvgbMb/b8Emtg+vxS8K6Bz+fTd999p6SkpBZrgzoI9ejRQ2FhYaqpqfHbXlNTo8TExEb1NptNNpvNb1vXrl3bs8WAi4mJCaov/h/C9DVgfrPnl1gD0+eXgnMNWjoT1CCoXywdERGh1NRUVVRUWNu8Xq8qKiqUnp4ewM4AAMDlIKjPCElSXl6eJk2apOHDh2vEiBFasmSJamtrNXny5EC3BgAAAizog9D999+vb775Rvn5+XI6nRo2bJjKysoavYDaNDabTc8++2yjpwJNYvoaML/Z80usgenzS6yBJIX4LuXaMgAAgCAU1K8RAgAAaA5BCAAAGIsgBAAAjEUQAgAAxiIIGaawsFA33nijunTpovj4eI0bN06HDh0KdFsB8+tf/1ohISHKzc0NdCsd5n//93/1H//xH+revbuioqI0ePBg7d69O9BtdZj6+nrNnTtXKSkpioqK0tVXX60FCxZc0u8kulJt3bpVd999t5KSkhQSEqJ169b57ff5fMrPz1fPnj0VFRWljIwMffrpp4Fpth00N7/H49HMmTM1ePBgde7cWUlJSXr44Yd19OjRwDXcxlr69z/fE088oZCQEC1ZsqTD+gs0gpBhtmzZouzsbO3YsUMOh0Mej0eZmZmqrQ3cL4ANlA8//FD//d//rSFDhgS6lQ5z4sQJ3XLLLQoPD9f//M//6OOPP9bChQvVrVu3QLfWYZ5//nktX75cL730kj755BM9//zzKioq0osvvhjo1tpNbW2thg4dquLi4ib3FxUVadmyZSopKdHOnTvVuXNn2e12nT17toM7bR/NzX/mzBnt2bNHc+fO1Z49e/SnP/1Jhw4d0r//+78HoNP20dK/f4M333xTO3bsuKRfSxFUfDDasWPHfJJ8W7ZsCXQrHeq7777zXXvttT6Hw+H7t3/7N99TTz0V6JY6xMyZM3233nproNsIqKysLN+jjz7qt+3ee+/1TZgwIUAddSxJvjfffNO67fV6fYmJib4XXnjB2nby5EmfzWbz/eEPfwhAh+3rwvmbsmvXLp8k3xdffNExTXWgi83/1Vdf+f7lX/7Ft3//fl+fPn18ixcv7vDeAoUzQoY7deqUJCkuLi7AnXSs7OxsZWVlKSMjI9CtdKi3335bw4cP13333af4+Hhdf/31+u1vfxvotjrUzTffrIqKCv31r3+VJP3lL3/Rtm3bNHbs2AB3FhiHDx+W0+n0+16IjY1VWlqaKisrA9hZ4Jw6dUohISFB/7smG3i9Xk2cOFEzZszQwIEDA91Ohwv6d5bGxXm9XuXm5uqWW27RoEGDAt1Oh/njH/+oPXv26MMPPwx0Kx3ub3/7m5YvX668vDw988wz+vDDD/Xzn/9cERERmjRpUqDb6xCzZs2Sy+VSv379FBYWpvr6ev3yl7/UhAkTAt1aQDidTklq9G77CQkJ1j6TnD17VjNnztSDDz4YdL+E9GKef/55derUST//+c8D3UpAEIQMlp2drf3792vbtm2BbqXDfPnll3rqqafkcDgUGRkZ6HY6nNfr1fDhw/WrX/1KknT99ddr//79KikpMSYIrV27VqtXr1ZpaakGDhyo6upq5ebmKikpyZg1QNM8Ho9+9rOfyefzafny5YFup0NUVVVp6dKl2rNnj0JCQgLdTkDw1JihcnJytH79er3//vvq1atXoNvpMFVVVTp27JhuuOEGderUSZ06ddKWLVu0bNkyderUSfX19YFusV317NlTAwYM8NvWv39/HTlyJEAddbwZM2Zo1qxZeuCBBzR48GBNnDhR06dPV2FhYaBbC4jExERJUk1Njd/2mpoaa58JGkLQF198IYfDYczZoA8++EDHjh1T7969rcfEL774Qv/5n/+pvn37Brq9DsEZIcP4fD49+eSTevPNN7V582alpKQEuqUONWrUKO3bt89v2+TJk9WvXz/NnDlTYWFhAeqsY9xyyy2N3i7hr3/9q/r06ROgjjremTNnFBrq/zNgWFiYvF5vgDoKrJSUFCUmJqqiokLDhg2TJLlcLu3cuVPTpk0LbHMdpCEEffrpp3r//ffVvXv3QLfUYSZOnNjotZJ2u10TJ07U5MmTA9RVxyIIGSY7O1ulpaV666231KVLF+s1ALGxsYqKigpwd+2vS5cujV4P1blzZ3Xv3t2I10lNnz5dN998s371q1/pZz/7mXbt2qVXXnlFr7zySqBb6zB33323fvnLX6p3794aOHCg9u7dq0WLFunRRx8NdGvt5vTp0/rss8+s24cPH1Z1dbXi4uLUu3dv5ebm6rnnntO1116rlJQUzZ07V0lJSRo3blzgmm5Dzc3fs2dP/fSnP9WePXu0fv161dfXW4+LcXFxioiICFTbbaalf/8Lg194eLgSExN13XXXdXSrgRHoy9bQsSQ1+bFy5cpAtxYwJl0+7/P5fO+8845v0KBBPpvN5uvXr5/vlVdeCXRLHcrlcvmeeuopX+/evX2RkZG+H//4x75f/OIXPrfbHejW2s3777/f5Pf9pEmTfD7fPy6hnzt3ri8hIcFns9l8o0aN8h06dCiwTbeh5uY/fPjwRR8X33///UC33iZa+ve/kGmXz4f4fEH8dqoAAADN4MXSAADAWAQhAABgLIIQAAAwFkEIAAAYiyAEAACMRRACAADGIggBAABjEYQAAICxCEIAAMBYBCEAAGAsghAAADAWQQgAABjr/wESCjxMkjUgKwAAAABJRU5ErkJggg==", 207 | "text/plain": [ 208 | "
" 209 | ] 210 | }, 211 | "metadata": {}, 212 | "output_type": "display_data" 213 | } 214 | ], 215 | "source": [ 216 | "news['line_length'].hist()" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": 4, 222 | "id": "05188d98-d126-46f7-aa2b-f9d33483575d", 223 | "metadata": {}, 224 | "outputs": [ 225 | { 226 | "data": { 227 | "text/plain": [ 228 | "(4159, 2049)" 229 | ] 230 | }, 231 | "execution_count": 4, 232 | "metadata": {}, 233 | "output_type": "execute_result" 234 | } 235 | ], 236 | "source": [ 237 | "X_train, X_test= train_test_split(news.headline_text.sample(int(0.005*news.shape[0])).tolist(),test_size=0.33, random_state=42)\n", 238 | "len(X_train), len(X_test)" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": 5, 244 | "id": "c606838c-86f1-4ca1-ab82-41dd14915a22", 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "with open('train_dataset.txt','w') as f:\n", 249 | " for line in X_train:\n", 250 | " f.write(line)\n", 251 | " f.write(\"\\n\")\n", 252 | "\n", 253 | "with open('test_dataset.txt','w') as f:\n", 254 | " for line in X_test:\n", 255 | " f.write(line)\n", 256 | " f.write(\"\\n\")" 257 | ] 258 | }, 259 | { 260 | "cell_type": "code", 261 | "execution_count": 2, 262 | "id": "75e15f55-5921-4489-9a9d-2544626b6acf", 263 | "metadata": {}, 264 | "outputs": [ 265 | { 266 | "name": "stderr", 267 | "output_type": "stream", 268 | "text": [ 269 | "/Users/raghavbali/.pyenv/versions/3.11.9/envs/deeplearning/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884\n", 270 | " warnings.warn(\n" 271 | ] 272 | } 273 | ], 274 | "source": [ 275 | "tokenizer = AutoTokenizer.from_pretrained(\"gpt2\",pad_token='')\n", 276 | "\n", 277 | "train_path = './train_dataset.txt'\n", 278 | "test_path = './test_dataset.txt'" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": 3, 284 | "id": "cc598a4e-6ea4-41b3-8241-57ea59c21b59", 285 | "metadata": {}, 286 | "outputs": [], 287 | "source": [ 288 | "import datasets\n", 289 | "# train_dataset = datasets.load_dataset('csv',data_files=train_path)\n", 290 | "# test_dataset = datasets.load_dataset('csv',data_files=test_path)\n", 291 | "dataset = datasets.load_dataset('text',data_files={'train':train_path,'test':test_path})" 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": 6, 297 | "id": "8d2dad66-49cd-4af9-9133-9d2add0d1bd0", 298 | "metadata": {}, 299 | "outputs": [ 300 | { 301 | "data": { 302 | "application/vnd.jupyter.widget-view+json": { 303 | "model_id": "a9abfcfb3c95495295c5d3d4e2567ca6", 304 | "version_major": 2, 305 | "version_minor": 0 306 | }, 307 | "text/plain": [ 308 | "Map (num_proc=8): 0%| | 0/4159 [00:00' for line in examples[\"text\"] if len(line) > 0 and not line.isspace()]\n", 333 | " return tokenizer(\n", 334 | " examples[\"text\"],\n", 335 | " truncation=True,\n", 336 | " )\n", 337 | "\n", 338 | "tokenized_train_dataset = dataset['train'].map(\n", 339 | " tokenize_function,\n", 340 | " batched=True,\n", 341 | " num_proc=8,\n", 342 | " remove_columns=[\"text\"],\n", 343 | ")\n", 344 | "\n", 345 | "tokenized_test_dataset = dataset['test'].map(\n", 346 | " tokenize_function,\n", 347 | " batched=True,\n", 348 | " num_proc=8,\n", 349 | 
" remove_columns=[\"text\"],\n", 350 | ")" 351 | ] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "execution_count": 7, 356 | "id": "42589349-ece5-4b79-9159-a81112e500df", 357 | "metadata": {}, 358 | "outputs": [], 359 | "source": [ 360 | "data_collator = DataCollatorForLanguageModeling(\n", 361 | " tokenizer=tokenizer, mlm=False,\n", 362 | ")" 363 | ] 364 | }, 365 | { 366 | "cell_type": "markdown", 367 | "id": "035383de-b4fb-470e-8448-d53c62c60911", 368 | "metadata": {}, 369 | "source": [ 370 | "## Prepare Model for Training" 371 | ] 372 | }, 373 | { 374 | "cell_type": "code", 375 | "execution_count": 8, 376 | "id": "0e401adf-db30-4675-a11b-bce1aa031457", 377 | "metadata": {}, 378 | "outputs": [ 379 | { 380 | "data": { 381 | "application/vnd.jupyter.widget-view+json": { 382 | "model_id": "93f49cd0cecb4a719f3830b993b55b44", 383 | "version_major": 2, 384 | "version_minor": 0 385 | }, 386 | "text/plain": [ 387 | "VBox(children=(HTML(value='
\n", 540 | " \n", 541 | " \n", 542 | " [130/130 02:19, Epoch 2/2]\n", 543 | " \n", 544 | " \n", 545 | " \n", 546 | " \n", 547 | " \n", 548 | " \n", 549 | " \n", 550 | " \n", 551 | " \n", 552 | " \n", 553 | " \n", 554 | " \n", 555 | " \n", 556 | " \n", 557 | " \n", 558 | " \n", 559 | " \n", 560 | " \n", 561 | " \n", 562 | " \n", 563 | " \n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | " \n", 571 | " \n", 572 | " \n", 573 | " \n", 574 | " \n", 575 | " \n", 576 | " \n", 577 | " \n", 578 | " \n", 579 | " \n", 580 | " \n", 581 | " \n", 582 | " \n", 583 | " \n", 584 | " \n", 585 | " \n", 586 | " \n", 587 | " \n", 588 | " \n", 589 | " \n", 590 | " \n", 591 | " \n", 592 | " \n", 593 | " \n", 594 | " \n", 595 | " \n", 596 | " \n", 597 | " \n", 598 | " \n", 599 | " \n", 600 | " \n", 601 | " \n", 602 | " \n", 603 | " \n", 604 | " \n", 605 | " \n", 606 | " \n", 607 | " \n", 608 | " \n", 609 | " \n", 610 | " \n", 611 | " \n", 612 | " \n", 613 | " \n", 614 | " \n", 615 | " \n", 616 | " \n", 617 | "
StepTraining Loss
86.501000
165.618000
245.407000
325.382200
405.262300
485.105400
565.107700
645.169300
724.687800
804.680900
884.734500
964.685100
1044.556900
1124.660200
1204.634800
1284.633200

" 618 | ], 619 | "text/plain": [ 620 | "" 621 | ] 622 | }, 623 | "metadata": {}, 624 | "output_type": "display_data" 625 | }, 626 | { 627 | "data": { 628 | "text/plain": [ 629 | "TrainOutput(global_step=130, training_loss=5.044873604407678, metrics={'train_runtime': 140.587, 'train_samples_per_second': 59.166, 'train_steps_per_second': 0.925, 'total_flos': 248723096358912.0, 'train_loss': 5.044873604407678, 'epoch': 2.0})" 630 | ] 631 | }, 632 | "execution_count": 15, 633 | "metadata": {}, 634 | "output_type": "execute_result" 635 | } 636 | ], 637 | "source": [ 638 | "trainer.train()" 639 | ] 640 | }, 641 | { 642 | "cell_type": "code", 643 | "execution_count": 16, 644 | "id": "c823eb84-895c-4745-bac0-054772cedab7", 645 | "metadata": {}, 646 | "outputs": [], 647 | "source": [ 648 | "# trainer.save_model()" 649 | ] 650 | }, 651 | { 652 | "cell_type": "code", 653 | "execution_count": 25, 654 | "id": "f2e812b9-38a4-43cd-aac1-3f9cac8d41e8", 655 | "metadata": {}, 656 | "outputs": [ 657 | { 658 | "data": { 659 | "application/vnd.jupyter.widget-view+json": { 660 | "model_id": "e16c98a2e9cc49b3a9be5c82787a7ea5", 661 | "version_major": 2, 662 | "version_minor": 0 663 | }, 664 | "text/plain": [ 665 | "model.safetensors: 0%| | 0.00/1.42G [00:00 Back of the Envelope Calculations : A quick way to get rough estimates\n", 23 | "\n", 24 | "\n", 25 | "**[LLaMA 3.1](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md)** from Meta.AI launched very recently. The model is available in 8B, 70B and 405B sizes and is outperforming a number of existing LLMs on various benchmarks. \n", 26 | "\n", 27 | "![image.png](attachment:08264d12-83d1-45ac-8664-90c3f5af5ad6.png)" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "id": "04de8150-f71a-4478-89b6-075cf10a602b", 33 | "metadata": {}, 34 | "source": [ 35 | "## But how much does it cost to train such model(s)?\n", 36 | "\n", 37 | "\n", 38 | "> Source: https://x.com/deedydas/status/1629312480165109760\n", 39 | "\n", 40 | "__Assumptions__\n", 41 | "For the sake of our understanding, we will make the following assumptions:\n", 42 | "- Ignore costs associated with preparing datasets\n", 43 | "- Ignore costs associated with training restarts, infra-failures, etc.\n", 44 | "- Cost of forward and backward pass is set to 1\n", 45 | "- Assume a very simplified view of overhead associated with multi-GPU/multi-node clusters by setting a standard efficiency ratio (ex: 0.25 efficiency in terms of TFLOPs)" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "id": "060e81bd-68c3-4d3f-b04b-2fb47d176b16", 51 | "metadata": {}, 52 | "source": [ 53 | "### Model Parameters\n", 54 | "- Model Size : 405 **B**illion\n", 55 | "- Training Dataset : 15 **T**rillion" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": 39, 61 | "id": "baea2c80-41aa-420a-a9f1-a84b75ff0c89", 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "# define model and dataset size\n", 66 | "model_name = 'LLaMA3.1'\n", 67 | "model_size = 405e9\n", 68 | "dataset_size = 15e12 #15Trillion Tokens. Hint use scientific notation\n", 69 | "forward_backward_pass_ops = 1 # better estimate from table 1 @ Kaplan et. al." 
70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "id": "cf57b284-9971-4dbe-b3b5-42346a8d0cea", 75 | "metadata": {}, 76 | "source": [ 77 | "### Compute Required " 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": 40, 83 | "id": "6ce98c19-04c0-429b-b198-81afad05316f", 84 | "metadata": {}, 85 | "outputs": [ 86 | { 87 | "name": "stdout", 88 | "output_type": "stream", 89 | "text": [ 90 | "We will need approximately \u001b[1m6.075e+24\u001b[0m FLOPs to train \u001b[1mLLaMA3.1\u001b[0m\n", 91 | "\t,where FLOPs is Floating Point Operations Per Second\n" 92 | ] 93 | } 94 | ], 95 | "source": [ 96 | "APPROX_COMPUTE_REQUIRED = model_size * dataset_size * forward_backward_pass_ops\n", 97 | "print(f\"We will need approximately \\033[1m{APPROX_COMPUTE_REQUIRED}\\033[0m FLOPs to train \\033[1m{model_name}\\033[0m\")\n", 98 | "print(\"\\t,where FLOPs is Floating Point Operations Per Second\")" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "id": "97faa229-cc24-4635-8ade-ade15bd51eb4", 104 | "metadata": {}, 105 | "source": [ 106 | "### GPU Performance and Compute Time" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 41, 112 | "id": "cda26c00-0ff2-4ff5-b337-122abacf2368", 113 | "metadata": {}, 114 | "outputs": [], 115 | "source": [ 116 | "# cost source: https://fullstackdeeplearning.com/cloud-gpus/\n", 117 | "gpu_details = {\n", 118 | " 't4':{\n", 119 | " 'flops':0.081e14, #colab free\n", 120 | " 'cost':0.21, #usd per hour\n", 121 | " 'ram':16 #gb\n", 122 | " },\n", 123 | " 'v100':{\n", 124 | " 'flops':0.164e14, #standard nvidia\n", 125 | " 'cost':0.84, #usd per hour\n", 126 | " 'ram':32 #gb\n", 127 | " \n", 128 | " },\n", 129 | " 'a100':{\n", 130 | " 'flops':3.12e14, #standard nvidia\n", 131 | " 'cost':1.1, #usd per hour\n", 132 | " 'ram':80 #gb\n", 133 | " },\n", 134 | "}\n", 135 | "hour_constant = 60*60 # number of seconds in an hour\n", 136 | "gpu_efficiency = 0.5 #50% efficiency" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": 42, 142 | "id": "0db43d82-5fc3-4bce-8e51-5b76dd49d4a4", 143 | "metadata": {}, 144 | "outputs": [ 145 | { 146 | "name": "stdout", 147 | "output_type": "stream", 148 | "text": [ 149 | "We will need approximately \u001b[1m1.08E+07\u001b[0m GPU hours to train \u001b[1mLLaMA3.1\u001b[0m on a \u001b[1ma100\u001b[0m GPU\n" 150 | ] 151 | } 152 | ], 153 | "source": [ 154 | "gpu = #TODO: Select one of the GPUs, ex: a100\n", 155 | "COMPUTE_TIME = APPROX_COMPUTE_REQUIRED/(gpu_details.get(gpu).get('flops')*hour_constant*gpu_efficiency)\n", 156 | "print(f\"We will need approximately \\033[1m{COMPUTE_TIME:.2E}\\033[0m GPU hours to train \\033[1m{model_name}\\033[0m on a \\033[1m{gpu}\\033[0m GPU\")" 157 | ] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "id": "313aa189-1ec4-4885-9300-022284d39480", 162 | "metadata": {}, 163 | "source": [ 164 | "### Cost of Training" 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 43, 170 | "id": "c7383ba9-a742-4a26-a361-046770020450", 171 | "metadata": {}, 172 | "outputs": [ 173 | { 174 | "name": "stdout", 175 | "output_type": "stream", 176 | "text": [ 177 | "We will need approximately spend \u001b[1m$11,899,038.46\u001b[0m to train \u001b[1mLLaMA3.1\u001b[0m on a \u001b[1ma100\u001b[0m GPU\n" 178 | ] 179 | } 180 | ], 181 | "source": [ 182 | "TRAINING_COST = COMPUTE_TIME*gpu_details.get(gpu).get('cost')\n", 183 | "print(f\"We will need approximately spend \\033[1m${TRAINING_COST:,.2f}\\033[0m to train 
\\033[1m{model_name}\\033[0m on a \\033[1m{gpu}\\033[0m GPU\")" 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "id": "7b02f0be-2c00-440c-9303-597c68c16893", 189 | "metadata": {}, 190 | "source": [ 191 | "## Big but How Big?\n", 192 | "\n", 193 | "The latest and the greatest seem to be a thing only the _GPU-rich_ can afford to play with. The exponential increase in the size of models along with their training datasets (we saw GPT vs GPT2 vs GPT3.5 in the previous module) indicates scale is our best friend. \n", 194 | "\n", 195 | "Work by Kaplan et. al. in the work titled **[Scaling Laws for Neural Language Models](https://arxiv.org/pdf/2001.08361)** presents some interesting takeaways. \n", 196 | "We will use the notation from paper as:\n", 197 | "- **$N$**: Model parameters excluding embeddings\n", 198 | "- **$D$**: Size of the dataset\n", 199 | "- **$C$**: Compute used for training the model\n", 200 | "\n", 201 | "_Scale is a function of $N$, $D$ and $C$_\n", 202 | "\n", 203 | "\n", 204 | "Let's look at some of the insights from the paper:" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "id": "fbfb1e79-23b5-42b4-81c2-a77486a9ac19", 210 | "metadata": {}, 211 | "source": [ 212 | "1. Performance depends **strongly on scale** and weakly on model shape\n", 213 | "2. Performance improves predictably as long as we **scale up** **$N$** and **$D$** : \n", 214 | "_Every time we increase model size 8x, we only need to increase the dataset by roughly 5x_\n", 215 | "3. Large Models are more **sample efficient** than small models reaching same level of performance with fewer steps and fewer data points" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "id": "7da49860-5be3-4cd6-bac7-fcfde7403193", 221 | "metadata": {}, 222 | "source": [ 223 | "\n", 224 | "\n", 225 | "> Source: [Kaplan et. al.](https://arxiv.org/pdf/2001.08361)" 226 | ] 227 | }, 228 | { 229 | "cell_type": "markdown", 230 | "id": "c4b3aab8-83db-4bcd-ae71-84f723fd7194", 231 | "metadata": {}, 232 | "source": [ 233 | "## So Should We Just Keep Growing?\n", 234 | "\n", 235 | "**TL;DR**: Probably not! \n", 236 | "\n", 237 | "**Long Answer**: In their work titled [Training Compute-Optimal Large Language Models](https://arxiv.org/pdf/2203.15556) Hoffman et. al. build upon the previous works to showcase that current(_2022_) set of models are **significantly under trained** or the current set of LLMs are far too large for their compute budgets and datasets!\n", 238 | "\n", 239 | "They present a 70B parameter model titled **Chincilla** which was:\n", 240 | "- 4x smaller than 280B parameter Gopher\n", 241 | "- trained on 4x more data than Gopher, 1.3T tokens vs 300B tokens\n", 242 | "\n", 243 | "and yet **outperformed** Gopher on every task they evaluated!\n", 244 | "\n", 245 | "\n", 246 | "\n", 247 | "> Source: [Hoffman et. al.](https://arxiv.org/pdf/2203.15556)\n", 248 | "> Fine-print: Though undertrained, LLMs increasingly show performance improvement with increasing dataset size" 249 | ] 250 | }, 251 | { 252 | "cell_type": "markdown", 253 | "id": "05fcc526-f317-404d-bbaa-0bbd5204a82e", 254 | "metadata": {}, 255 | "source": [ 256 | "## Ok, So I have a lot of Compute, What's the Problem?\n", 257 | "\n", 258 | "The scaling laws are all good for BigTech, but you could say that most companies have a lot of compute available. Where is the problem? 
Let us understand this with a simple example walk through\n", 259 | "\n", 260 | "Assumptions/Setup:\n", 261 | "- System RAM (CPU): 32GB\n", 262 | "- GPU RAM : 32 GB\n", 263 | "- Model Size : 20B\n", 264 | "- Parameter Size: 2bytes" 265 | ] 266 | }, 267 | { 268 | "cell_type": "code", 269 | "execution_count": 1, 270 | "id": "18976f94-44ff-41ae-923e-5f851e15ac4b", 271 | "metadata": {}, 272 | "outputs": [], 273 | "source": [ 274 | "from utils import humanbytes, memory_fit" 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": 2, 280 | "id": "b0ea1172-6d32-417d-832d-1b5aa51c80c2", 281 | "metadata": {}, 282 | "outputs": [], 283 | "source": [ 284 | "CPU_RAM = 32e9 # 32GB\n", 285 | "GPU_RAM = 32e9 #32GB\n", 286 | "model_size = 20e9 #20B\n", 287 | "param_size = 2" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 6, 293 | "id": "a845b652-3eff-4d78-8244-58268c312526", 294 | "metadata": {}, 295 | "outputs": [], 296 | "source": [ 297 | "inference_memory = #TODO: Model Size Multiplied with Bytes per Parameter\n", 298 | "inference_outcome = memory_fit(inference_memory,CPU_RAM,GPU_RAM)" 299 | ] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "execution_count": 8, 304 | "id": "89408375-8106-44e8-8214-e3c29c287129", 305 | "metadata": {}, 306 | "outputs": [ 307 | { 308 | "name": "stdout", 309 | "output_type": "stream", 310 | "text": [ 311 | "Amount of memory needed to load model for inference=\u001b[1m40.00 GB\u001b[0m\n", 312 | "\n", 313 | "Can this work on my setup?\n", 314 | "\u001b[1mYes, but fit needs both CPU and GPU\u001b[0m\n" 315 | ] 316 | } 317 | ], 318 | "source": [ 319 | "print(f\"Amount of memory needed to load model for inference=\\033[1m{humanbytes(inference_memory)}\\033[0m\")\n", 320 | "print()\n", 321 | "print(f\"Can this work on my setup?\\n\\033[1m{inference_outcome}\\033[0m\")" 322 | ] 323 | }, 324 | { 325 | "cell_type": "markdown", 326 | "id": "8aad896f-ce57-463d-8dfc-6a9e551caab7", 327 | "metadata": {}, 328 | "source": [ 329 | "\n", 330 | "This is good for inference but we need to train/fine-tune this model.\n", 331 | "We need to accomodate for:\n", 332 | "- **Gradients/backpropagation** : Size same as model size\n", 333 | "- **Optimizer States** (ex: ADAM needs momentum and variance, can't be FP16): typically 12x of model size" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": 9, 339 | "id": "cea8e259-3f88-4304-8f04-889e24bc859b", 340 | "metadata": {}, 341 | "outputs": [], 342 | "source": [ 343 | "gradient_params = model_size\n", 344 | "optimizer_memory = model_size*12" 345 | ] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "execution_count": 10, 350 | "id": "e7371150-4a51-4c94-a2da-54bdd382d9ff", 351 | "metadata": {}, 352 | "outputs": [], 353 | "source": [ 354 | "finetune_memory = inference_memory + gradient_params + optimizer_memory\n", 355 | "finetune_outcome = memory_fit(finetune_memory,CPU_RAM,GPU_RAM)" 356 | ] 357 | }, 358 | { 359 | "cell_type": "code", 360 | "execution_count": 11, 361 | "id": "7f9bc390-e1d5-4501-b733-5538b6d29ca9", 362 | "metadata": {}, 363 | "outputs": [ 364 | { 365 | "name": "stdout", 366 | "output_type": "stream", 367 | "text": [ 368 | "Amount of memory needed to load model for fintuning=\u001b[1m300.00 GB\u001b[0m\n", 369 | "\n", 370 | "Can this work on my setup?\n", 371 | "\u001b[1mNope, does not fit available memory\u001b[0m\n" 372 | ] 373 | } 374 | ], 375 | "source": [ 376 | "print(f\"Amount of memory needed to load model for 
fintuning=\\033[1m{humanbytes(finetune_memory)}\\033[0m\")\n", 377 | "print()\n", 378 | "print(f\"Can this work on my setup?\\n\\033[1m{finetune_outcome}\\033[0m\")" 379 | ] 380 | }, 381 | { 382 | "cell_type": "markdown", 383 | "id": "5d07e0ca-4baa-423c-bc68-fe1726bb6de7", 384 | "metadata": {}, 385 | "source": [ 386 | "We need more memory (and faster GPUs). But just by usual scaling we would need:" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": 38, 392 | "id": "c60e7c9f-f5c7-4a64-a29f-e6c0fe6a63e9", 393 | "metadata": {}, 394 | "outputs": [ 395 | { 396 | "name": "stdout", 397 | "output_type": "stream", 398 | "text": [ 399 | "We Would need roughly need \u001b[1m8.0 more GPUs\u001b[0m to setup fine-tuning\n" 400 | ] 401 | } 402 | ], 403 | "source": [ 404 | "additional_gpus = #TODO: HINT Required Memory / RAM per GPU\n", 405 | "print(f\"We Would need roughly need \\033[1m{additional_gpus} more GPUs\\033[0m to setup fine-tuning\")" 406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": 47, 411 | "id": "d4aad26e-ac4c-4599-924a-c751dd08609a", 412 | "metadata": {}, 413 | "outputs": [ 414 | { 415 | "name": "stdout", 416 | "output_type": "stream", 417 | "text": [ 418 | "We Would spend roughly \u001b[1m$7.56/hr\u001b[0m to for fine-tuning with this setup\n" 419 | ] 420 | } 421 | ], 422 | "source": [ 423 | "gpu = 'v100' # GPU RAM size is same for our example\n", 424 | "total_gpu_cost_per_hour = gpu_details.get(gpu).get('cost')*(additional_gpus+1)\n", 425 | "print(f\"We Would spend roughly \\033[1m${total_gpu_cost_per_hour}/hr\\033[0m to for fine-tuning with this setup\")" 426 | ] 427 | } 428 | ], 429 | "metadata": { 430 | "kernelspec": { 431 | "display_name": "Python 3 (ipykernel)", 432 | "language": "python", 433 | "name": "python3" 434 | }, 435 | "language_info": { 436 | "codemirror_mode": { 437 | "name": "ipython", 438 | "version": 3 439 | }, 440 | "file_extension": ".py", 441 | "mimetype": "text/x-python", 442 | "name": "python", 443 | "nbconvert_exporter": "python", 444 | "pygments_lexer": "ipython3", 445 | "version": "3.11.9" 446 | } 447 | }, 448 | "nbformat": 4, 449 | "nbformat_minor": 5 450 | } 451 | -------------------------------------------------------------------------------- /ch_09/02_pretraining_optimizations.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "9d821cda-1586-4727-94c1-460f7f145c1e", 6 | "metadata": { 7 | "id": "9d821cda-1586-4727-94c1-460f7f145c1e" 8 | }, 9 | "source": [ 10 | "# Pretraining Optimizations\n", 11 | "The pretraining step involves the largest amount of data along and is impacted by architectural aspects of the model: its size (parameters), shape (width and depth), and so on.\n", 12 | "This notebook covers optimization techniques focussed on the pretraining step.\n", 13 | "\n", 14 | "We will cover:\n", 15 | "- Different Floating Point Representations/Formats\n", 16 | "- Quantization of Floats\n", 17 | "- Post Training Quantization of Models:\n", 18 | " - Torch based dynamic quantization\n", 19 | " - Huggingface and bitsandbytes based 8bit and 4bit quantization" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "id": "f6ec8627-b1c5-441e-8333-c354aa2a2469", 25 | "metadata": {}, 26 | "source": [ 27 | "

This Notebook requires GPU" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 1, 33 | "id": "njB6s73CpEZZ", 34 | "metadata": { 35 | "id": "njB6s73CpEZZ" 36 | }, 37 | "outputs": [], 38 | "source": [ 39 | "# !pip3 install -U bitsandbytes\n", 40 | "# restart after this step" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 2, 46 | "id": "978b59cd-0255-4c5d-87eb-4c91047c9f46", 47 | "metadata": { 48 | "id": "978b59cd-0255-4c5d-87eb-4c91047c9f46" 49 | }, 50 | "outputs": [], 51 | "source": [ 52 | "import torch\n", 53 | "import struct\n", 54 | "import numpy as np\n", 55 | "from time import time\n", 56 | "from utils import get_model_size\n", 57 | "from huggingface_hub import notebook_login\n", 58 | "from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, QuantoConfig" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": 3, 64 | "id": "_-0pKdMAv0W1", 65 | "metadata": { 66 | "colab": { 67 | "base_uri": "https://localhost:8080/" 68 | }, 69 | "id": "_-0pKdMAv0W1", 70 | "outputId": "d2a5ff7f-f22c-48a9-9287-02db45320b10" 71 | }, 72 | "outputs": [ 73 | { 74 | "data": { 75 | "text/plain": [ 76 | "" 77 | ] 78 | }, 79 | "execution_count": 3, 80 | "metadata": {}, 81 | "output_type": "execute_result" 82 | } 83 | ], 84 | "source": [ 85 | "# Set up warnings\n", 86 | "import warnings\n", 87 | "warnings.filterwarnings(\n", 88 | " action='ignore',\n", 89 | " category=DeprecationWarning,\n", 90 | " module=r'.*'\n", 91 | ")\n", 92 | "warnings.filterwarnings(\n", 93 | " action='default',\n", 94 | " module=r'torch.ao.quantization'\n", 95 | ")\n", 96 | "\n", 97 | "# Specify random seed for repeatable results\n", 98 | "torch.manual_seed(191009)" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "id": "614f50e6-d91d-431b-b3ed-d528d225cfab", 104 | "metadata": { 105 | "id": "614f50e6-d91d-431b-b3ed-d528d225cfab" 106 | }, 107 | "source": [ 108 | "## Representing Floating Point Numbers\n", 109 | "\n", 110 | "\n" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "id": "4e21bd57-9d5a-4d3f-b6ca-7cee737f8e65", 116 | "metadata": { 117 | "id": "4e21bd57-9d5a-4d3f-b6ca-7cee737f8e65" 118 | }, 119 | "source": [ 120 | "### Binary Representation of Floats\n", 121 | "- Sign bit\n", 122 | "- Exponent bits\n", 123 | "- Mantissa bits" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": 4, 129 | "id": "fade62aa-bb72-4c67-b7f4-76232ba9ec59", 130 | "metadata": { 131 | "colab": { 132 | "base_uri": "https://localhost:8080/" 133 | }, 134 | "id": "fade62aa-bb72-4c67-b7f4-76232ba9ec59", 135 | "outputId": "90c90522-a361-4ecc-c446-52c8ce2b9a96" 136 | }, 137 | "outputs": [ 138 | { 139 | "name": "stdout", 140 | "output_type": "stream", 141 | "text": [ 142 | "Sample Floating Point Number:3.1457898\n" 143 | ] 144 | } 145 | ], 146 | "source": [ 147 | "num = 3.1457898\n", 148 | "print(f\"Sample Floating Point Number:{num}\")" 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": 5, 154 | "id": "ad346123-cc0a-4e64-ba52-5e231933a7f0", 155 | "metadata": { 156 | "colab": { 157 | "base_uri": "https://localhost:8080/" 158 | }, 159 | "id": "ad346123-cc0a-4e64-ba52-5e231933a7f0", 160 | "outputId": "eab07fc3-e487-41f9-94a6-0b8d8448ba29" 161 | }, 162 | "outputs": [ 163 | { 164 | "name": "stdout", 165 | "output_type": "stream", 166 | "text": [ 167 | "Float32 representation of 3.1457898:\n", 168 | "Sign: 0\n", 169 | "Exponent: 10000000\n", 170 | "Fraction: 10010010101010010011111\n" 171 | ] 172 | } 173 | ], 
174 | "source": [ 175 | "def float32_to_binary(num):\n", 176 | " return ''.join(f'{b:08b}' for b in struct.pack('!f', num))\n", 177 | "\n", 178 | "binary = float32_to_binary(num)\n", 179 | "\n", 180 | "print(f\"Float32 representation of {num}:\")\n", 181 | "print(f\"Sign: {binary[0]}\")\n", 182 | "print(f\"Exponent: {binary[1:9]}\")\n", 183 | "print(f\"Fraction: {binary[9:]}\")" 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "id": "96e544f8-2d9c-4498-af70-8e016c602ca8", 189 | "metadata": { 190 | "id": "96e544f8-2d9c-4498-af70-8e016c602ca8" 191 | }, 192 | "source": [ 193 | "### Different Types of Floats\n", 194 | "\n", 195 | "- FP32\n", 196 | "- FP16\n", 197 | "- bFloat16" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": 6, 203 | "id": "d725e79a-adf2-48a0-b45e-357c06fc8059", 204 | "metadata": { 205 | "colab": { 206 | "base_uri": "https://localhost:8080/" 207 | }, 208 | "id": "d725e79a-adf2-48a0-b45e-357c06fc8059", 209 | "outputId": "11075f32-2542-40b0-89ae-ad286e755f1d" 210 | }, 211 | "outputs": [ 212 | { 213 | "name": "stdout", 214 | "output_type": "stream", 215 | "text": [ 216 | "Float32: [3.1457899]\n", 217 | "Float16: [3.146]\n" 218 | ] 219 | } 220 | ], 221 | "source": [ 222 | "# Create arrays with different float types\n", 223 | "f32 = np.array([num], dtype=np.float32)\n", 224 | "f16 = np.array([num], dtype=np.float16)\n", 225 | "\n", 226 | "print(f\"Float32: {f32}\")\n", 227 | "print(f\"Float16: {f16}\")" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": 7, 233 | "id": "94d6b8c8-f625-4b3a-8ff3-353f38f5674d", 234 | "metadata": { 235 | "id": "94d6b8c8-f625-4b3a-8ff3-353f38f5674d" 236 | }, 237 | "outputs": [], 238 | "source": [ 239 | "og_scalar = torch.scalar_tensor(num)\n", 240 | "fp16_scalar = og_scalar.to(dtype=torch.float16)\n", 241 | "bf16_scalar = og_scalar.to(dtype=torch.bfloat16)" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": 8, 247 | "id": "af64e0e9-d107-406e-89d1-fb0b8a24271d", 248 | "metadata": { 249 | "colab": { 250 | "base_uri": "https://localhost:8080/" 251 | }, 252 | "id": "af64e0e9-d107-406e-89d1-fb0b8a24271d", 253 | "outputId": "1b58d1a5-12a5-463b-d0a4-721bf4b9f1c8" 254 | }, 255 | "outputs": [ 256 | { 257 | "name": "stdout", 258 | "output_type": "stream", 259 | "text": [ 260 | "Torch Float32: 3.145789861679077\n", 261 | "Torch Float16: 3.146484375\n", 262 | "Torch bFloat16: 3.140625\n" 263 | ] 264 | } 265 | ], 266 | "source": [ 267 | "print(f\"Torch Float32: {og_scalar}\")\n", 268 | "print(f\"Torch Float16: {fp16_scalar}\")\n", 269 | "print(f\"Torch bFloat16: {bf16_scalar}\")" 270 | ] 271 | }, 272 | { 273 | "attachments": {}, 274 | "cell_type": "markdown", 275 | "id": "7cac62e7-e42c-4c80-bf8f-0c2d328546e9", 276 | "metadata": { 277 | "id": "7cac62e7-e42c-4c80-bf8f-0c2d328546e9" 278 | }, 279 | "source": [ 280 | "## Quantization\n", 281 | "Quantization aims to reduce the number of bits needed to store these weights by binning floating-point values into lower-precision buckets. This reduces memory usage with minimal impact on performance, as small precision losses are often acceptable. 
\n", 282 | "" 283 | ] 284 | }, 285 | { 286 | "cell_type": "code", 287 | "execution_count": 9, 288 | "id": "a0f25fa8-faee-4a31-86c6-570f77224511", 289 | "metadata": { 290 | "colab": { 291 | "base_uri": "https://localhost:8080/" 292 | }, 293 | "id": "a0f25fa8-faee-4a31-86c6-570f77224511", 294 | "outputId": "d86d95b4-be5c-43ae-e764-77bf930adb0f" 295 | }, 296 | "outputs": [ 297 | { 298 | "data": { 299 | "text/plain": [ 300 | "(31.875, 0)" 301 | ] 302 | }, 303 | "execution_count": 9, 304 | "metadata": {}, 305 | "output_type": "execute_result" 306 | } 307 | ], 308 | "source": [ 309 | "min_x = -np.ceil([num])[0]\n", 310 | "max_x = np.ceil([num])[0]\n", 311 | "scale = 255/(max_x-min_x)\n", 312 | "zero_point = -round(scale*min_x)-128\n", 313 | "scale,zero_point" 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": 10, 319 | "id": "d2833335-4601-4633-849e-5978177f21a1", 320 | "metadata": { 321 | "colab": { 322 | "base_uri": "https://localhost:8080/" 323 | }, 324 | "id": "d2833335-4601-4633-849e-5978177f21a1", 325 | "outputId": "bb729e5f-176d-4010-c532-eaf120cd9c22" 326 | }, 327 | "outputs": [ 328 | { 329 | "data": { 330 | "text/plain": [ 331 | "100" 332 | ] 333 | }, 334 | "execution_count": 10, 335 | "metadata": {}, 336 | "output_type": "execute_result" 337 | } 338 | ], 339 | "source": [ 340 | "x_quant = round(scale*og_scalar.numpy()+zero_point)\n", 341 | "x_quant" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": 11, 347 | "id": "63809a33-ba42-45da-8d13-652363d141f0", 348 | "metadata": { 349 | "colab": { 350 | "base_uri": "https://localhost:8080/" 351 | }, 352 | "id": "63809a33-ba42-45da-8d13-652363d141f0", 353 | "outputId": "93fe29f8-20df-4628-f97a-a4c0d97a8433" 354 | }, 355 | "outputs": [ 356 | { 357 | "data": { 358 | "text/plain": [ 359 | "3.1372549019607843" 360 | ] 361 | }, 362 | "execution_count": 11, 363 | "metadata": {}, 364 | "output_type": "execute_result" 365 | } 366 | ], 367 | "source": [ 368 | "x_dequant = (x_quant-zero_point)/scale\n", 369 | "x_dequant" 370 | ] 371 | }, 372 | { 373 | "cell_type": "markdown", 374 | "id": "01ed12ca-d7e5-406a-b7ea-0d0e71f096de", 375 | "metadata": { 376 | "id": "01ed12ca-d7e5-406a-b7ea-0d0e71f096de" 377 | }, 378 | "source": [ 379 | "### Quantization using Torch" 380 | ] 381 | }, 382 | { 383 | "cell_type": "markdown", 384 | "id": "5dd56dd8-07cb-4ac4-a907-1326ce6beb94", 385 | "metadata": { 386 | "id": "5dd56dd8-07cb-4ac4-a907-1326ce6beb94" 387 | }, 388 | "source": [ 389 | "

Static Quantization" 390 | ] 391 | }, 392 | { 393 | "cell_type": "code", 394 | "execution_count": 12, 395 | "id": "9bf481e0-6c3d-438f-a1f0-900163ea29b9", 396 | "metadata": { 397 | "colab": { 398 | "base_uri": "https://localhost:8080/" 399 | }, 400 | "id": "9bf481e0-6c3d-438f-a1f0-900163ea29b9", 401 | "outputId": "eeee32f5-3a1a-4288-aba1-49d0ff75e24e" 402 | }, 403 | "outputs": [ 404 | { 405 | "data": { 406 | "text/plain": [ 407 | "tensor(0., size=(), dtype=torch.qint8,\n", 408 | " quantization_scheme=torch.per_tensor_affine, scale=31.875, zero_point=0)" 409 | ] 410 | }, 411 | "execution_count": 12, 412 | "metadata": {}, 413 | "output_type": "execute_result" 414 | } 415 | ], 416 | "source": [ 417 | "qscalar = torch.quantize_per_tensor(og_scalar,torch.scalar_tensor(scale),torch.scalar_tensor(zero_point),torch.qint8)\n", 418 | "qscalar" 419 | ] 420 | }, 421 | { 422 | "cell_type": "code", 423 | "execution_count": 13, 424 | "id": "eb675447-f05f-4097-8788-919112168d80", 425 | "metadata": { 426 | "colab": { 427 | "base_uri": "https://localhost:8080/" 428 | }, 429 | "id": "eb675447-f05f-4097-8788-919112168d80", 430 | "outputId": "54c03a05-aa7f-4312-c6e4-2999d9e40a08" 431 | }, 432 | "outputs": [ 433 | { 434 | "name": "stdout", 435 | "output_type": "stream", 436 | "text": [ 437 | "Data Type Original Scalar:torch.float32\n", 438 | "Data Type Quantized Scalar:torch.qint8\n", 439 | "Integer Representation of Quantized Scalar:0\n" 440 | ] 441 | } 442 | ], 443 | "source": [ 444 | "print(f\"Data Type Original Scalar:{og_scalar.dtype}\")\n", 445 | "print(f\"Data Type Quantized Scalar:{qscalar.dtype}\")\n", 446 | "print(f\"Integer Representation of Quantized Scalar:{qscalar.int_repr()}\")" 447 | ] 448 | }, 449 | { 450 | "cell_type": "markdown", 451 | "id": "f7ca5818-a3e3-4d0d-ae9b-129241f1f40d", 452 | "metadata": { 453 | "id": "f7ca5818-a3e3-4d0d-ae9b-129241f1f40d" 454 | }, 455 | "source": [ 456 | "

Dynamic Quantization" 457 | ] 458 | }, 459 | { 460 | "cell_type": "code", 461 | "execution_count": 14, 462 | "id": "25a198bf-90e7-48d6-bf4e-13fac2d4b361", 463 | "metadata": { 464 | "colab": { 465 | "base_uri": "https://localhost:8080/" 466 | }, 467 | "id": "25a198bf-90e7-48d6-bf4e-13fac2d4b361", 468 | "outputId": "2702b0f5-05ef-47cf-d2d0-28016835ffab" 469 | }, 470 | "outputs": [ 471 | { 472 | "data": { 473 | "text/plain": [ 474 | "tensor(3.1458, size=(), dtype=torch.qint8,\n", 475 | " quantization_scheme=torch.per_tensor_affine, scale=0.012336430830114029,\n", 476 | " zero_point=-128)" 477 | ] 478 | }, 479 | "execution_count": 14, 480 | "metadata": {}, 481 | "output_type": "execute_result" 482 | } 483 | ], 484 | "source": [ 485 | "dq_scalar = torch.quantize_per_tensor_dynamic(og_scalar,torch.qint8,False)\n", 486 | "dq_scalar" 487 | ] 488 | }, 489 | { 490 | "cell_type": "code", 491 | "execution_count": 15, 492 | "id": "f1274d24-c448-4d5c-9594-e6a388e655b7", 493 | "metadata": { 494 | "colab": { 495 | "base_uri": "https://localhost:8080/" 496 | }, 497 | "id": "f1274d24-c448-4d5c-9594-e6a388e655b7", 498 | "outputId": "f58a8070-ad70-4ce2-c7cc-141d0bda7fa1" 499 | }, 500 | "outputs": [ 501 | { 502 | "name": "stdout", 503 | "output_type": "stream", 504 | "text": [ 505 | "Data Type Dynamically Quantized Scalar:torch.qint8\n", 506 | "Integer Representation of Dynamically Quantized Scalar:127\n" 507 | ] 508 | } 509 | ], 510 | "source": [ 511 | "print(f\"Data Type Dynamically Quantized Scalar:{dq_scalar.dtype}\")\n", 512 | "print(f\"Integer Representation of Dynamically Quantized Scalar:{dq_scalar.int_repr()}\")" 513 | ] 514 | }, 515 | { 516 | "attachments": {}, 517 | "cell_type": "markdown", 518 | "id": "8ab6f796-e4ea-4904-ba2b-ad30a31a76f2", 519 | "metadata": { 520 | "id": "8ab6f796-e4ea-4904-ba2b-ad30a31a76f2" 521 | }, 522 | "source": [ 523 | "## Post Training Quantization\n", 524 | "\n", 525 | "Post-training quantization (PTQ), unlike mixed precision training, is performed after the model has been fully trained in high precision. In PTQ, weights are converted to lower-precision formats such as int8 or bfloat16, with techniques like static quantization using pre-calibrated scaling factors or dynamic quantization, which adjusts on-the-fly at runtime. PTQ is particularly advantageous for deployment scenarios, where reduced memory and latency are critical." 
526 | ] 527 | }, 528 | { 529 | "cell_type": "markdown", 530 | "id": "AP2knprcr7vZ", 531 | "metadata": { 532 | "id": "AP2knprcr7vZ" 533 | }, 534 | "source": [ 535 | "### Torch Quantization" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": 16, 541 | "id": "yJgIlqr0sNaz", 542 | "metadata": { 543 | "id": "yJgIlqr0sNaz" 544 | }, 545 | "outputs": [], 546 | "source": [ 547 | "MODEL = \"bert-base-uncased\"" 548 | ] 549 | }, 550 | { 551 | "cell_type": "code", 552 | "execution_count": 17, 553 | "id": "pxROgU80r7Aa", 554 | "metadata": { 555 | "colab": { 556 | "base_uri": "https://localhost:8080/" 557 | }, 558 | "id": "pxROgU80r7Aa", 559 | "outputId": "43676c85-2acc-439c-f57b-2d95fca7ddfa" 560 | }, 561 | "outputs": [ 562 | { 563 | "name": "stderr", 564 | "output_type": "stream", 565 | "text": [ 566 | "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: \n", 567 | "The secret `HF_TOKEN` does not exist in your Colab secrets.\n", 568 | "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n", 569 | "You will be able to reuse this secret in all of your notebooks.\n", 570 | "Please note that authentication is recommended but still optional to access public models or datasets.\n", 571 | " warnings.warn(\n", 572 | "If you want to use `BertLMHeadModel` as a standalone, add `is_decoder=True.`\n" 573 | ] 574 | } 575 | ], 576 | "source": [ 577 | "tokenizer = AutoTokenizer.from_pretrained(MODEL)\n", 578 | "model = AutoModelForCausalLM.from_pretrained(MODEL)" 579 | ] 580 | }, 581 | { 582 | "cell_type": "code", 583 | "execution_count": 18, 584 | "id": "31Qf8u1psPVl", 585 | "metadata": { 586 | "id": "31Qf8u1psPVl" 587 | }, 588 | "outputs": [], 589 | "source": [ 590 | "quantized_model = torch.quantization.quantize_dynamic(\n", 591 | " model, {torch.nn.Linear}, dtype=torch.qint8\n", 592 | ")" 593 | ] 594 | }, 595 | { 596 | "cell_type": "code", 597 | "execution_count": 19, 598 | "id": "gZrxTwTbwk-s", 599 | "metadata": { 600 | "colab": { 601 | "base_uri": "https://localhost:8080/" 602 | }, 603 | "id": "gZrxTwTbwk-s", 604 | "outputId": "1d582c73-3fd4-4057-93cb-c316cab383ec" 605 | }, 606 | "outputs": [ 607 | { 608 | "name": "stdout", 609 | "output_type": "stream", 610 | "text": [ 611 | "Original model's size: 3504457536 bits | 438.06 MB\n" 612 | ] 613 | } 614 | ], 615 | "source": [ 616 | "size_model = get_model_size(model)\n", 617 | "print(f\"Original model's size: {size_model} bits | {size_model / 8e6:.2f} MB\")" 618 | ] 619 | }, 620 | { 621 | "cell_type": "code", 622 | "execution_count": 20, 623 | "id": "R4uUi0jHsPP2", 624 | "metadata": { 625 | "colab": { 626 | "base_uri": "https://localhost:8080/" 627 | }, 628 | "id": "R4uUi0jHsPP2", 629 | "outputId": "a02f34c6-0ea2-442a-8aaa-034c078e1d7f" 630 | }, 631 | "outputs": [ 632 | { 633 | "name": "stdout", 634 | "output_type": "stream", 635 | "text": [ 636 | "Quantized model's size: 764995392 bits | 95.62 MB\n" 637 | ] 638 | } 639 | ], 640 | "source": [ 641 | "size_model = get_model_size(quantized_model)\n", 642 | "print(f\"Quantized model's size: {size_model} bits | {size_model / 8e6:.2f} MB\")" 643 | ] 644 | }, 645 | { 646 | "cell_type": "markdown", 647 | "id": "bR9CNw3ir4gc", 648 | "metadata": { 649 | "id": "bR9CNw3ir4gc" 650 | }, 651 | "source": [ 652 | "### HuggingFace" 653 | ] 654 | }, 655 | { 656 | "cell_type": "markdown", 657 | "id": "6324bac2-026b-4ac7-8587-4a2450f15923", 658 
| "metadata": {}, 659 | "source": [ 660 | "

This Section Needs GPU" 661 | ] 662 | }, 663 | { 664 | "cell_type": "code", 665 | "execution_count": 21, 666 | "id": "1496910d-ddec-4789-92bd-f7b94f6eed6f", 667 | "metadata": { 668 | "id": "1496910d-ddec-4789-92bd-f7b94f6eed6f" 669 | }, 670 | "outputs": [], 671 | "source": [ 672 | "MODEL = \"raghavbali/aligned-gpt2-movie_reviewer\"" 673 | ] 674 | }, 675 | { 676 | "cell_type": "code", 677 | "execution_count": 22, 678 | "id": "6f6f52a0-028e-4a87-a105-1f40b3c36f24", 679 | "metadata": { 680 | "colab": { 681 | "base_uri": "https://localhost:8080/" 682 | }, 683 | "id": "6f6f52a0-028e-4a87-a105-1f40b3c36f24", 684 | "outputId": "c358712d-324e-492c-e6b2-6b648e3dfd4d" 685 | }, 686 | "outputs": [ 687 | { 688 | "name": "stderr", 689 | "output_type": "stream", 690 | "text": [ 691 | "Some weights of the model checkpoint at raghavbali/aligned-gpt2-movie_reviewer were not used when initializing GPT2LMHeadModel: ['v_head.summary.bias', 'v_head.summary.weight']\n", 692 | "- This IS expected if you are initializing GPT2LMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n", 693 | "- This IS NOT expected if you are initializing GPT2LMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n" 694 | ] 695 | } 696 | ], 697 | "source": [ 698 | "tokenizer = AutoTokenizer.from_pretrained(MODEL)\n", 699 | "model = AutoModelForCausalLM.from_pretrained(MODEL)" 700 | ] 701 | }, 702 | { 703 | "cell_type": "code", 704 | "execution_count": 23, 705 | "id": "f45abe57-83e4-48c2-8b98-4e224bf2f716", 706 | "metadata": { 707 | "colab": { 708 | "base_uri": "https://localhost:8080/" 709 | }, 710 | "id": "f45abe57-83e4-48c2-8b98-4e224bf2f716", 711 | "outputId": "b0d269fb-eb0a-4944-9507-940102b3977d" 712 | }, 713 | "outputs": [ 714 | { 715 | "name": "stdout", 716 | "output_type": "stream", 717 | "text": [ 718 | "Original model's size: 3982098432 bits | 497.76 MB\n" 719 | ] 720 | } 721 | ], 722 | "source": [ 723 | "size_model = get_model_size(model)\n", 724 | "print(f\"Original model's size: {size_model} bits | {size_model / 8e6:.2f} MB\")" 725 | ] 726 | }, 727 | { 728 | "cell_type": "code", 729 | "execution_count": 24, 730 | "id": "LSY2ChYrppfl", 731 | "metadata": { 732 | "colab": { 733 | "base_uri": "https://localhost:8080/" 734 | }, 735 | "id": "LSY2ChYrppfl", 736 | "outputId": "e0642223-c712-4138-87e9-b23707e01d6c" 737 | }, 738 | "outputs": [ 739 | { 740 | "name": "stderr", 741 | "output_type": "stream", 742 | "text": [ 743 | "`low_cpu_mem_usage` was None, now default to True since model is quantized.\n", 744 | "Some weights of the model checkpoint at raghavbali/aligned-gpt2-movie_reviewer were not used when initializing GPT2LMHeadModel: ['v_head.summary.bias', 'v_head.summary.weight']\n", 745 | "- This IS expected if you are initializing GPT2LMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. 
initializing a BertForSequenceClassification model from a BertForPreTraining model).\n", 746 | "- This IS NOT expected if you are initializing GPT2LMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n", 747 | "`low_cpu_mem_usage` was None, now default to True since model is quantized.\n", 748 | "Some weights of the model checkpoint at raghavbali/aligned-gpt2-movie_reviewer were not used when initializing GPT2LMHeadModel: ['v_head.summary.bias', 'v_head.summary.weight']\n", 749 | "- This IS expected if you are initializing GPT2LMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n", 750 | "- This IS NOT expected if you are initializing GPT2LMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n" 751 | ] 752 | } 753 | ], 754 | "source": [ 755 | "model_4bit = AutoModelForCausalLM.from_pretrained(\n", 756 | " MODEL,\n", 757 | " quantization_config=BitsAndBytesConfig(load_in_4bit=True)\n", 758 | ")\n", 759 | "\n", 760 | "model_8bit = AutoModelForCausalLM.from_pretrained(\n", 761 | " MODEL,\n", 762 | " quantization_config=BitsAndBytesConfig(load_in_8bit=True)\n", 763 | ")" 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": 25, 769 | "id": "2DXaYWstp1S9", 770 | "metadata": { 771 | "colab": { 772 | "base_uri": "https://localhost:8080/" 773 | }, 774 | "id": "2DXaYWstp1S9", 775 | "outputId": "b2a073ca-f64e-41e2-ba23-7f0ac01fc473" 776 | }, 777 | "outputs": [ 778 | { 779 | "name": "stdout", 780 | "output_type": "stream", 781 | "text": [ 782 | "Model size after 8bit quantization: 1311571968 bits | 163.95 MB\n", 783 | "Model size after 4bit quantization: 971833344 bits | 121.48 MB\n" 784 | ] 785 | } 786 | ], 787 | "source": [ 788 | "size_model_4bit = get_model_size(model_4bit)\n", 789 | "size_model_8bit = get_model_size(model_8bit)\n", 790 | "\n", 791 | "print(f\"Model size after 8bit quantization: {size_model_8bit} bits | {size_model_8bit / 8e6:.2f} MB\")\n", 792 | "print(f\"Model size after 4bit quantization: {size_model_4bit} bits | {size_model_4bit / 8e6:.2f} MB\")" 793 | ] 794 | }, 795 | { 796 | "cell_type": "markdown", 797 | "id": "2g4Ue_FzqVR8", 798 | "metadata": { 799 | "id": "2g4Ue_FzqVR8" 800 | }, 801 | "source": [ 802 | "Confirm if the models still work as intended after quantization" 803 | ] 804 | }, 805 | { 806 | "cell_type": "code", 807 | "execution_count": 26, 808 | "id": "v-aFCXtmqJYg", 809 | "metadata": { 810 | "id": "v-aFCXtmqJYg" 811 | }, 812 | "outputs": [], 813 | "source": [ 814 | "inputs = tokenizer(\"King Kong\", return_tensors=\"pt\", return_token_type_ids=False)" 815 | ] 816 | }, 817 | { 818 | "cell_type": "code", 819 | "execution_count": 27, 820 | "id": "UqnAGMvVqgPg", 821 | "metadata": { 822 | "colab": { 823 | "base_uri": "https://localhost:8080/" 824 | }, 825 | "id": "UqnAGMvVqgPg", 826 | "outputId": "9db07f34-4884-4ff6-996c-b66f34a111db" 827 | }, 828 | "outputs": [ 829 | { 830 | "name": "stderr", 831 | "output_type": "stream", 832 | "text": [ 833 | "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:2097: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. 
`input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`.\n", 834 | " warnings.warn(\n", 835 | "/usr/local/lib/python3.10/dist-packages/bitsandbytes/nn/modules.py:452: UserWarning: Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed.\n", 836 | " warnings.warn(\n", 837 | "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:2097: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`.\n", 838 | " warnings.warn(\n" 839 | ] 840 | } 841 | ], 842 | "source": [ 843 | "og_start= time()\n", 844 | "outputs_og = model.generate(**inputs,\n", 845 | " max_new_tokens=25,\n", 846 | " temperature=0.8,\n", 847 | " do_sample=True,\n", 848 | " pad_token_id=tokenizer.eos_token_id)\n", 849 | "og_end= time()\n", 850 | "q4_start= time()\n", 851 | "outputs_4bit = model_4bit.generate(**inputs,\n", 852 | " max_new_tokens=25,\n", 853 | " temperature=0.8,\n", 854 | " do_sample=True,\n", 855 | " pad_token_id=tokenizer.eos_token_id)\n", 856 | "q4_end= time()\n", 857 | "q8_start= time()\n", 858 | "outputs_8bit = model_8bit.generate(**inputs,\n", 859 | " max_new_tokens=25,\n", 860 | " temperature=0.8,\n", 861 | " do_sample=True,\n", 862 | " pad_token_id=tokenizer.eos_token_id)\n", 863 | "q8_end= time()" 864 | ] 865 | }, 866 | { 867 | "cell_type": "code", 868 | "execution_count": 28, 869 | "id": "HaKOCMqlquK5", 870 | "metadata": { 871 | "colab": { 872 | "base_uri": "https://localhost:8080/" 873 | }, 874 | "id": "HaKOCMqlquK5", 875 | "outputId": "d913f21f-423f-4c4f-d50f-72b23fcf8120" 876 | }, 877 | "outputs": [ 878 | { 879 | "name": "stdout", 880 | "output_type": "stream", 881 | "text": [ 882 | "::Model Outputs::\n", 883 | "***************\n", 884 | "\n", 885 | "Original Model:(1.6615946292877197)\n", 886 | "---------------\n", 887 | "King Kong and the Killing Joke is the best in modern cinema. The acting is great, the direction is wonderful, the performances are\n", 888 | "\n", 889 | "8bit Model:(1.7423856258392334)\n", 890 | "---------------\n", 891 | "King Kong: Skull Island - Full HD Remaster - 2.5/10.\n", 892 | "\n", 893 | " video is beautiful and the music is great\n", 894 | "\n", 895 | "4bit Model:(4.4493348598480225)\n", 896 | "---------------\n", 897 | "King Kong movie, then I'd like to see a big action movie with an action movie attached. 
The first two thirds of the movie\n" 898 | ] 899 | } 900 | ], 901 | "source": [ 902 | "print(\"::Model Outputs::\")\n", 903 | "print(\"*\"*15)\n", 904 | "print()\n", 905 | "print(f\"Original Model:({og_end-og_start})\")\n", 906 | "print(\"-\"*15)\n", 907 | "print(tokenizer.decode(outputs_og[0], skip_special_tokens=True))\n", 908 | "print()\n", 909 | "print(f\"8bit Model:({q8_end-q8_start})\")\n", 910 | "print(\"-\"*15)\n", 911 | "print(tokenizer.decode(outputs_8bit[0], skip_special_tokens=True))\n", 912 | "print()\n", 913 | "print(f\"4bit Model:({q4_end-q4_start})\")\n", 914 | "print(\"-\"*15)\n", 915 | "print(tokenizer.decode(outputs_4bit[0], skip_special_tokens=True))" 916 | ] 917 | }, 918 | { 919 | "cell_type": "code", 920 | "execution_count": 28, 921 | "id": "hZePqfezvPlU", 922 | "metadata": { 923 | "id": "hZePqfezvPlU" 924 | }, 925 | "outputs": [], 926 | "source": [] 927 | } 928 | ], 929 | "metadata": { 930 | "accelerator": "GPU", 931 | "colab": { 932 | "gpuType": "T4", 933 | "provenance": [], 934 | "toc_visible": true 935 | }, 936 | "kernelspec": { 937 | "display_name": "Python 3 (ipykernel)", 938 | "language": "python", 939 | "name": "python3" 940 | }, 941 | "language_info": { 942 | "codemirror_mode": { 943 | "name": "ipython", 944 | "version": 3 945 | }, 946 | "file_extension": ".py", 947 | "mimetype": "text/x-python", 948 | "name": "python", 949 | "nbconvert_exporter": "python", 950 | "pygments_lexer": "ipython3", 951 | "version": "3.11.9" 952 | } 953 | }, 954 | "nbformat": 4, 955 | "nbformat_minor": 5 956 | } 957 | -------------------------------------------------------------------------------- /ch_09/03_finetuning_optimizations.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "attachments": {}, 5 | "cell_type": "markdown", 6 | "id": "7886de70-bf2d-4dba-b5e5-672073534002", 7 | "metadata": {}, 8 | "source": [ 9 | "# Finetuning Optimizations\n", 10 | "\n", 11 | "Finetuning is a very important step in improving the quality of the models, and hence it makes sense to understand how we can optimize this step without impacting performance. Efficiencies in this step also enable us to iterate faster, thereby improving adaptability in many fast-moving domains. \n", 12 | "\n", 13 | "In this notebook we will cover:\n", 14 | "- Additive PEFT using Prompt Tuning" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "id": "80259db8-5d17-48cf-8252-05f510cbb651", 20 | "metadata": {}, 21 | "source": [ 22 | "

This Notebook requires GPU" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "id": "41027bd7-6939-4091-926e-9fe37bc24ab8", 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "# !pip3 install peft==0.13.2" 33 | ] 34 | }, 35 | { 36 | "attachments": {}, 37 | "cell_type": "markdown", 38 | "id": "183c4f1f-9a91-493f-a8a2-03fa9bd55d3f", 39 | "metadata": {}, 40 | "source": [ 41 | "## Prompt Tuning\n", 42 | "Add some text and imagesThe usual manual prompting (or hard prompting) works to a great extent but requires a lot of effort to create a good prompt. On the other hand, soft prompts are learnable parameters/tensors added to input embeddings and optimized as per task(s) and dataset.\n", 43 | "\n", 44 | "\n", 45 | "\n", 46 | "Prompt tuning is a form of soft prompting technique which involves introducing task specific tokens or virtual tokens to the model's input space. The virtual tokens are not part of the actual vocabulary of the model and only specify the task. " 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 1, 52 | "id": "45005d47-0555-4374-9130-03d648ce078f", 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [ 56 | "import torch\n", 57 | "from tqdm.notebook import tqdm\n", 58 | "from datasets import load_dataset\n", 59 | "from transformers import AutoTokenizer\n", 60 | "from torch.utils.data import DataLoader\n", 61 | "from transformers import get_linear_schedule_with_warmup\n", 62 | "from transformers import default_data_collator, AutoModelForCausalLM\n", 63 | "from peft import PromptTuningConfig, PromptTuningInit, get_peft_model" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": 2, 69 | "id": "4fedbc83-c49f-43b1-9ea7-1043b88fc0b1", 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "MODEL = \"bigscience/bloomz-560m\"#\"meta-llama/Llama-3.2-1B\"\n", 74 | "DATASET = \"lmsys/toxic-chat\"" 75 | ] 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "id": "ad62673c-ae8f-47e1-a810-b12e00ef7b65", 80 | "metadata": {}, 81 | "source": [ 82 | "### Toxicity Dataset\n", 83 | "\n", 84 | "This dataset contains toxicity annotations on 10K user prompts collected from the Vicuna online demo. 
The authors utilize a human-AI collaborative annotation framework to guarantee the quality of annotation while maintaining a feasible annotation workload.\n", 85 | "\n", 86 | "### Prompt Tuning Task\n", 87 | "In this section, we will leverage prompt tuning as PEFT technique to fine-tune a model to classify if a user-prompt is toxic or not.\n", 88 | "\n", 89 | "---\n", 90 | "**Source**: \n", 91 | "\n", 101 | "[ToxicChat Dataset](#cite-lin2023toxicchat)" 102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "id": "90f96a02-6922-405a-a49e-7b93dfe71377", 107 | "metadata": {}, 108 | "source": [ 109 | "### Prepare Dataset" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": 3, 115 | "id": "928f6807-e34f-42fd-a6b2-4a52845fe92d", 116 | "metadata": {}, 117 | "outputs": [], 118 | "source": [ 119 | "dataset = load_dataset(DATASET, \"toxicchat0124\")\n", 120 | "classes = ['non toxic','toxic']" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 4, 126 | "id": "8fa5eec8-be1f-40d0-8c6e-3c5e8ffb1569", 127 | "metadata": {}, 128 | "outputs": [], 129 | "source": [ 130 | "dataset = dataset.map(\n", 131 | " lambda x: {\"toxicity_label\": [classes[label] for label in x[\"toxicity\"]]},\n", 132 | " batched=True,\n", 133 | " num_proc=1,\n", 134 | ")" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 5, 140 | "id": "763dd2ae-c82b-45a8-93a4-ea0116ced46a", 141 | "metadata": {}, 142 | "outputs": [ 143 | { 144 | "data": { 145 | "text/plain": [ 146 | "{'conv_id': 'e0c9b3e05414814485dbdcb9a29334d502e59803af9c26df03e9d1de5e7afe67',\n", 147 | " 'user_input': 'Masturbacja jest proces co oitrzebuje',\n", 148 | " 'model_output': 'Masturbacja to proces, który może pozytywnie wpłynąć na zdrowie psychiczne i fizyczne człowieka, ponieważ pomaga w relaksie, redukuje stres i pomaga w uśpieniu. 
Może też być używana jako dodatkowa form',\n", 149 | " 'human_annotation': True,\n", 150 | " 'toxicity': 0,\n", 151 | " 'jailbreaking': 0,\n", 152 | " 'openai_moderation': '[[\"sexual\", 0.4609803557395935], [\"sexual/minors\", 0.0012527990620583296], [\"harassment\", 0.0001862536446424201], [\"hate\", 0.00015521160094067454], [\"violence\", 6.580814078915864e-05], [\"self-harm\", 3.212967567378655e-05], [\"violence/graphic\", 1.5190824342425913e-05], [\"self-harm/instructions\", 1.0009921425080393e-05], [\"hate/threatening\", 4.4459093260229565e-06], [\"self-harm/intent\", 3.378846486157272e-06], [\"harassment/threatening\", 1.7095695739044459e-06]]',\n", 153 | " 'toxicity_label': 'non toxic'}" 154 | ] 155 | }, 156 | "execution_count": 5, 157 | "metadata": {}, 158 | "output_type": "execute_result" 159 | } 160 | ], 161 | "source": [ 162 | "dataset[\"train\"][0]" 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": 7, 168 | "id": "5fa744b8-f5a7-4a38-8a85-e323a287983b", 169 | "metadata": {}, 170 | "outputs": [ 171 | { 172 | "name": "stdout", 173 | "output_type": "stream", 174 | "text": [ 175 | "2\n" 176 | ] 177 | } 178 | ], 179 | "source": [ 180 | "tokenizer = AutoTokenizer.from_pretrained(MODEL)\n", 181 | "if tokenizer.pad_token_id is None:\n", 182 | " tokenizer.pad_token_id = tokenizer.eos_token_id\n", 183 | "target_max_length = max([len(tokenizer(str(class_label))[\"input_ids\"]) for class_label in classes])\n", 184 | "print(target_max_length)" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 9, 190 | "id": "c7c66b73-f00c-4717-9d16-441f33eeda91", 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [ 194 | "max_length = 32\n", 195 | "def preprocess_function(examples, text_column=\"user_input\", label_column=\"toxicity_label\"):\n", 196 | " batch_size = len(examples[text_column])\n", 197 | " inputs = [f\"{text_column} : {x} Label : \" for x in examples[text_column]]\n", 198 | " targets = [x for x in examples[label_column]]\n", 199 | " model_inputs = tokenizer(inputs)\n", 200 | " labels = tokenizer(targets)\n", 201 | " for i in range(batch_size):\n", 202 | " sample_input_ids = model_inputs[\"input_ids\"][i]\n", 203 | " label_input_ids = labels[\"input_ids\"][i]\n", 204 | " model_inputs[\"input_ids\"][i] = [tokenizer.pad_token_id] * (\n", 205 | " max_length - len(sample_input_ids)\n", 206 | " ) + sample_input_ids\n", 207 | " model_inputs[\"attention_mask\"][i] = [0] * (max_length - len(sample_input_ids)) + model_inputs[\n", 208 | " \"attention_mask\"\n", 209 | " ][i]\n", 210 | " labels[\"input_ids\"][i] = [-100] * (max_length - len(label_input_ids)) + label_input_ids\n", 211 | " model_inputs[\"input_ids\"][i] = torch.tensor(model_inputs[\"input_ids\"][i][:max_length])\n", 212 | " model_inputs[\"attention_mask\"][i] = torch.tensor(model_inputs[\"attention_mask\"][i][:max_length])\n", 213 | " labels[\"input_ids\"][i] = torch.tensor(labels[\"input_ids\"][i][:max_length])\n", 214 | " model_inputs[\"labels\"] = labels[\"input_ids\"]\n", 215 | " return model_inputs" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": 10, 221 | "id": "f2bbce5b-7c09-41e2-9bba-426e00c976ab", 222 | "metadata": {}, 223 | "outputs": [ 224 | { 225 | "data": { 226 | "application/vnd.jupyter.widget-view+json": { 227 | "model_id": "5a9e870fbaa749779dfe9694c385bb02", 228 | "version_major": 2, 229 | "version_minor": 0 230 | }, 231 | "text/plain": [ 232 | "Running tokenizer on dataset (num_proc=2): 0%| | 0/5082 [00:00 Re-enactment using 
Pix2Pix\n", 9 | "\n", 10 | "\n", 11 | "\n", 12 | "We covered image-to-image translation GAN architectures in Chapter 5. Particularly, we discussed in detail how **pix2pix GAN** is a powerful architecture which enables paired translation tasks. In this notebook, we will leverage pix2pix GAN to develop a face re-enactment setup from scratch. We will:\n", 13 | "+ build a pix2pix network\n", 14 | "+ prepare the dataset using a video\n", 15 | "+ train the model for reenactment using facial landmarks\n", 16 | "\n", 17 | "The actual reenactment part is covered in the second notebook for this chapter. " 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": { 23 | "id": "N6FXnJHk32ND" 24 | }, 25 | "source": [ 26 | "## Load Libraries" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": { 33 | "id": "9KlERwzUJbNr" 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "import os\n", 38 | "import cv2\n", 39 | "import dlib\n", 40 | "import numpy as np\n", 41 | "import torch\n", 42 | "from torch.autograd import Variable\n", 43 | "from torch.utils.data import Dataset\n", 44 | "from torch.utils.data import DataLoader\n", 45 | "from torchvision.utils import save_image\n", 46 | "import torchvision.transforms as transforms\n", 47 | "from PIL import Image" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": null, 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [ 56 | "from gan_utils import PATCH_GAN_SHAPE\n", 57 | "from gan_utils import Generator,Discriminator \n", 58 | "from gan_utils import (IMG_WIDTH,\n", 59 | " IMG_HEIGHT,\n", 60 | " NUM_CHANNELS,\n", 61 | " BATCH_SIZE,\n", 62 | " N_EPOCHS,\n", 63 | " SAMPLE_INTERVAL)" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "from dataset_utils import ImageDataset, prepare_data\n", 73 | "from dataset_utils import DATASET_PATH, DOWNSAMPLE_RATIO" 74 | ] 75 | }, 76 | { 77 | "cell_type": "markdown", 78 | "metadata": { 79 | "id": "_T7bKz_T5EP4" 80 | }, 81 | "source": [ 82 | "## Set Parameters" 83 | ] 84 | }, 85 | { 86 | "cell_type": "code", 87 | "execution_count": null, 88 | "metadata": { 89 | "id": "-3M00EI0suiY" 90 | }, 91 | "outputs": [], 92 | "source": [ 93 | "CUDA = True if torch.cuda.is_available() else False\n", 94 | "os.makedirs(\"saved_models/\", exist_ok=True)" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "metadata": { 101 | "colab": { 102 | "base_uri": "https://localhost:8080/" 103 | }, 104 | "id": "_x6XGNalsuVz", 105 | "outputId": "f079c7c3-461b-4a9f-fa83-8ba81ca41251" 106 | }, 107 | "outputs": [], 108 | "source": [ 109 | "# get landmarks model if not already available\n", 110 | "!wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2\n", 111 | "!bunzip2 \"shape_predictor_68_face_landmarks.dat.bz2\"" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 117 | "metadata": { 118 | "id": "PDFOoTQks6JM" 119 | }, 120 | "outputs": [], 121 | "source": [ 122 | "# instantiate objects for face and landmark detection\n", 123 | "detector = dlib.get_frontal_face_detector()\n", 124 | "predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": { 130 | "id": "mEWl8xxU3sKu" 131 | }, 132 | "source": [ 133 | "# Pix2Pix GAN for Re-enactment\n", 134 | "\n", 135 | "In their work titled [“Image to Image Translation with Conditional 
Adversarial Networks”](https://arxiv.org/abs/1611.07004), Isola and Zhu et. al. present a conditional GAN network which is able to learn task specific loss functions and thus work across datasets. As the name suggests, this GAN architecture takes a specific type of image as input and transforms it into a different domain. It is called pair-wise style transfer as the training set needs to have samples from both, source and target domains." 136 | ] 137 | }, 138 | { 139 | "cell_type": "markdown", 140 | "metadata": { 141 | "id": "E5hkGY1V4OG8" 142 | }, 143 | "source": [ 144 | "## U-Net Generator\n", 145 | "The U-Net architecture uses skip connections to shuttle important features between the input and outputs. In case of pix2pix GAN, skip connections are added between every $ith$ down-sampling and $(n-i)th$ over-sampling layers, where $n$ is the total number of layers in the generator. The skip connection leads to concatenation of all channels from the ith and $(n-i)th$ layers." 146 | ] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": { 151 | "id": "d5gQ7yrG4Uzp" 152 | }, 153 | "source": [ 154 | "## Patch-GAN Discriminator\n", 155 | "The authors for pix2pix propose a Patch-GAN setup for the discriminator which takes the required inputs and generates an output of size NxN. Each $x_{ij}$ element of the NxN output signifies whether the corresponding patch ij in the generated image is real or fake. Each output patch can be traced back to its initial input patch basis the effective receptive field for each of the layers." 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": { 161 | "id": "FbXMgH8_5Pyp" 162 | }, 163 | "source": [ 164 | "## Initialize Generator and Discriminator Model Objects" 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": null, 170 | "metadata": { 171 | "id": "CFJbgyHeRlFh" 172 | }, 173 | "outputs": [], 174 | "source": [ 175 | "# Initialize generator and discriminator\n", 176 | "generator = Generator()\n", 177 | "discriminator = Discriminator()\n", 178 | "\n", 179 | "# Loss functions\n", 180 | "adversarial_loss = torch.nn.MSELoss()\n", 181 | "pixelwise_loss = torch.nn.L1Loss()\n", 182 | "\n", 183 | "# Loss weight of L1 pixel-wise loss between translated image and real image\n", 184 | "weight_pixel_wise_identity = 100\n", 185 | "\n", 186 | "# Optimizers\n", 187 | "optimizer_G = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))\n", 188 | "optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))" 189 | ] 190 | }, 191 | { 192 | "cell_type": "markdown", 193 | "metadata": { 194 | "id": "2bTJhEZG5brD" 195 | }, 196 | "source": [ 197 | "## Prepare Dataset" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": null, 203 | "metadata": { 204 | "colab": { 205 | "base_uri": "https://localhost:8080/" 206 | }, 207 | "id": "plTpZOWDs6GD", 208 | "outputId": "a7017603-2471-4dbf-dcab-f0b400b95cc3" 209 | }, 210 | "outputs": [], 211 | "source": [ 212 | "# prepare data\n", 213 | "prepare_data('obama.mp4',\n", 214 | " detector,\n", 215 | " predictor,\n", 216 | " num_samples=400,\n", 217 | " downsample_ratio = DOWNSAMPLE_RATIO)" 218 | ] 219 | }, 220 | { 221 | "cell_type": "markdown", 222 | "metadata": {}, 223 | "source": [ 224 | "## Setup Objects based on GPU Availability" 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": null, 230 | "metadata": { 231 | "id": "6dr3Jr7ySNJ4" 232 | }, 233 | "outputs": [], 234 | "source": [ 235 | "if 
CUDA:\n", 236 | " generator = generator.cuda()\n", 237 | " discriminator = discriminator.cuda()\n", 238 | " adversarial_loss.cuda()\n", 239 | " pixelwise_loss.cuda()\n", 240 | " Tensor = torch.cuda.FloatTensor\n", 241 | "else:\n", 242 | " Tensor = torch.FloatTensor" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": { 248 | "id": "VvRfehAN5o8I" 249 | }, 250 | "source": [ 251 | "## Define Transformations and Dataloaders" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": null, 257 | "metadata": { 258 | "id": "DjbEdO81SNHa" 259 | }, 260 | "outputs": [], 261 | "source": [ 262 | "image_transformations = [\n", 263 | " transforms.Resize((IMG_HEIGHT, IMG_WIDTH), Image.BICUBIC),\n", 264 | " transforms.ToTensor(),\n", 265 | " transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),\n", 266 | "]" 267 | ] 268 | }, 269 | { 270 | "cell_type": "code", 271 | "execution_count": null, 272 | "metadata": { 273 | "id": "lv0x9dX_SNEs" 274 | }, 275 | "outputs": [], 276 | "source": [ 277 | "train_dataloader = DataLoader(\n", 278 | " ImageDataset(DATASET_PATH, image_transformations=image_transformations),\n", 279 | " batch_size=BATCH_SIZE,\n", 280 | " shuffle=True\n", 281 | ")" 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": null, 287 | "metadata": { 288 | "id": "3cA2eGNXSNBz" 289 | }, 290 | "outputs": [], 291 | "source": [ 292 | "val_dataloader = DataLoader(\n", 293 | " ImageDataset(DATASET_PATH,image_transformations=image_transformations),\n", 294 | " batch_size=BATCH_SIZE//8,\n", 295 | " shuffle=True\n", 296 | ")" 297 | ] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "metadata": { 302 | "id": "sfCtlMsj5tTV" 303 | }, 304 | "source": [ 305 | "## Training Begins!" 306 | ] 307 | }, 308 | { 309 | "cell_type": "code", 310 | "execution_count": null, 311 | "metadata": {}, 312 | "outputs": [], 313 | "source": [ 314 | "def sample_images(val_dataloader,batches_done):\n", 315 | " \"\"\"\n", 316 | " Method to generate sample images for validation\n", 317 | " Parameters:\n", 318 | " val_dataloader: instance of dataloader\n", 319 | " batches_done: training iteration counter\n", 320 | " \"\"\"\n", 321 | " imgs = next(iter(val_dataloader))\n", 322 | " # condition\n", 323 | " real_A = Variable(imgs[\"B\"].type(Tensor))\n", 324 | " # real\n", 325 | " real_B = Variable(imgs[\"A\"].type(Tensor))\n", 326 | " # generated\n", 327 | " generator.eval()\n", 328 | " fake_B = generator(real_A)\n", 329 | " img_sample = torch.cat((real_A.data, fake_B.data, real_B.data), -2)\n", 330 | " save_image(img_sample, f\"{DATASET_PATH}/{batches_done}.png\", nrow=4, normalize=True)" 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "execution_count": null, 336 | "metadata": { 337 | "colab": { 338 | "base_uri": "https://localhost:8080/", 339 | "height": 1000 340 | }, 341 | "id": "41d5MJCGZoZv", 342 | "outputId": "05b6e281-ada8-4244-d747-cc8fd319ecde" 343 | }, 344 | "outputs": [], 345 | "source": [ 346 | "for epoch in range(0, N_EPOCHS):\n", 347 | " for i, batch in enumerate(train_dataloader):\n", 348 | "\n", 349 | " # prepare inputs\n", 350 | " real_A = Variable(batch[\"B\"].type(Tensor))\n", 351 | " real_B = Variable(batch[\"A\"].type(Tensor))\n", 352 | "\n", 353 | " # ground truth\n", 354 | " valid = Variable(Tensor(np.ones((real_A.size(0), *PATCH_GAN_SHAPE))), requires_grad=False)\n", 355 | " fake = Variable(Tensor(np.zeros((real_A.size(0), *PATCH_GAN_SHAPE))), requires_grad=False)\n", 356 | "\n", 357 | " # Train Generator\n", 358 | " 
optimizer_G.zero_grad()\n", 359 | "\n", 360 | " # generator loss\n", 361 | " fake_B = generator(real_A)\n", 362 | " pred_fake = discriminator(fake_B, real_A)\n", 363 | " adv_loss = adversarial_loss(pred_fake, valid)\n", 364 | " loss_pixel = pixelwise_loss(fake_B, real_B)\n", 365 | "\n", 366 | " # Overall Generator loss\n", 367 | " g_loss = adv_loss + weight_pixel_wise_identity * loss_pixel\n", 368 | "\n", 369 | " g_loss.backward()\n", 370 | "\n", 371 | " optimizer_G.step()\n", 372 | "\n", 373 | " # Train Discriminator\n", 374 | " optimizer_D.zero_grad()\n", 375 | "\n", 376 | " pred_real = discriminator(real_B, real_A)\n", 377 | " loss_real = adversarial_loss(pred_real, valid)\n", 378 | " pred_fake = discriminator(fake_B.detach(), real_A)\n", 379 | " loss_fake = adversarial_loss(pred_fake, fake)\n", 380 | "\n", 381 | " # Overall Discriminator loss\n", 382 | " d_loss = 0.5 * (loss_real + loss_fake)\n", 383 | "\n", 384 | " d_loss.backward()\n", 385 | " optimizer_D.step()\n", 386 | "\n", 387 | " # Progress Report\n", 388 | " batches_done = epoch * len(train_dataloader) + i\n", 389 | " print(f'Epoch: {epoch}/{N_EPOCHS}-Batch: {i}/{len(train_dataloader)}--D.loss:{d_loss.item():.4f},G.loss:{g_loss.item():.4f}--Adv.Loss:{adv_loss.item():.4f}')\n", 390 | "\n", 391 | " # generate samples\n", 392 | " if batches_done % SAMPLE_INTERVAL == 0:\n", 393 | " sample_images(val_dataloader,batches_done)" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "metadata": {}, 399 | "source": [ 400 | "## Save the Trained Models" 401 | ] 402 | }, 403 | { 404 | "cell_type": "code", 405 | "execution_count": null, 406 | "metadata": { 407 | "id": "sIzQFLMrajAw" 408 | }, 409 | "outputs": [], 410 | "source": [ 411 | "torch.save(generator.state_dict(), \"saved_models/generator.pt\")\n", 412 | "torch.save(discriminator.state_dict(), \"saved_models/discriminator.pt\")" 413 | ] 414 | } 415 | ], 416 | "metadata": { 417 | "accelerator": "GPU", 418 | "colab": { 419 | "gpuType": "T4", 420 | "provenance": [] 421 | }, 422 | "kernelspec": { 423 | "display_name": "Python 3 (ipykernel)", 424 | "language": "python", 425 | "name": "python3" 426 | }, 427 | "language_info": { 428 | "codemirror_mode": { 429 | "name": "ipython", 430 | "version": 3 431 | }, 432 | "file_extension": ".py", 433 | "mimetype": "text/x-python", 434 | "name": "python", 435 | "nbconvert_exporter": "python", 436 | "pygments_lexer": "ipython3", 437 | "version": "3.9.19" 438 | } 439 | }, 440 | "nbformat": 4, 441 | "nbformat_minor": 4 442 | } 443 | -------------------------------------------------------------------------------- /ch_14/constants.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | mean_face_x = np.array([0.000213256, 0.0752622, 0.18113, 0.29077, 5 | 0.393397, 0.586856, 0.689483, 0.799124, 6 | 0.904991, 0.98004, 0.490127, 0.490127, 7 | 0.490127, 0.490127, 0.36688, 0.426036, 8 | 0.490127, 0.554217, 0.613373, 0.121737, 9 | 0.187122, 0.265825, 0.334606, 0.260918, 10 | 0.182743, 0.645647, 0.714428, 0.793132, 11 | 0.858516, 0.79751, 0.719335, 0.254149, 12 | 0.340985, 0.428858, 0.490127, 0.551395, 13 | 0.639268, 0.726104, 0.642159, 0.556721, 14 | 0.490127, 0.423532, 0.338094, 0.290379, 15 | 0.428096, 0.490127, 0.552157, 0.689874, 16 | 0.553364, 0.490127, 0.42689]) 17 | 18 | mean_face_y = np.array([ 19 | 0.106454, 0.038915, 0.0187482, 0.0344891, 0.0773906, 0.0773906, 0.0344891, 20 | 0.0187482, 0.038915, 0.106454, 0.203352, 0.307009, 0.409805, 0.515625, 21 | 0.587326, 0.609345, 0.628106, 
0.609345, 0.587326, 0.216423, 0.178758, 22 | 0.179852, 0.231733, 0.245099, 0.244077, 0.231733, 0.179852, 0.178758, 23 | 0.216423, 0.244077, 0.245099, 0.780233, 0.745405, 0.727388, 0.742578, 24 | 0.727388, 0.745405, 0.780233, 0.864805, 0.902192, 0.909281, 0.902192, 25 | 0.864805, 0.784792, 0.778746, 0.785343, 0.778746, 0.784792, 0.824182, 26 | 0.831803, 0.824182]) 27 | 28 | random_transform_args = { 29 | 'rotation_range': 10, 30 | 'zoom_range': 0.05, 31 | 'shift_range': 0.05, 32 | 'random_flip': 0.4, 33 | } 34 | 35 | face_coverage = 220 36 | 37 | landmarks_2D = np.stack([mean_face_x, mean_face_y], axis=1) 38 | -------------------------------------------------------------------------------- /ch_14/dataset_utils.py: -------------------------------------------------------------------------------- 1 | from torch.utils.data import Dataset 2 | import torchvision.transforms as transforms 3 | 4 | import os 5 | import glob 6 | import numpy as np 7 | 8 | import cv2 9 | from PIL import Image 10 | from imutils import video 11 | 12 | 13 | DOWNSAMPLE_RATIO = 4 14 | DATASET_PATH = "./images/" 15 | 16 | class ImageDataset(Dataset): 17 | def __init__(self, dataset_path, image_transformations=None): 18 | self.transform = transforms.Compose(image_transformations) 19 | self.orig_files = sorted(glob.glob(os.path.join(dataset_path,'original') + "/*.*")) 20 | self.landmark_files = sorted(glob.glob(os.path.join(dataset_path,'landmarks') + "/*.*")) 21 | 22 | def __getitem__(self, index): 23 | 24 | orig_img = Image.open(self.orig_files[index % len(self.orig_files)]) 25 | landmark_img = Image.open(self.landmark_files[index % len(self.landmark_files)]) 26 | 27 | # flip images randomly 28 | if np.random.random() < 0.5: 29 | orig_img = Image.fromarray(np.array(orig_img)[:, ::-1, :], "RGB") 30 | landmark_img = Image.fromarray(np.array(landmark_img)[:, ::-1, :], "RGB") 31 | 32 | orig_img = self.transform(orig_img) 33 | landmark_img = self.transform(landmark_img) 34 | 35 | return {"A": orig_img, "B": landmark_img} 36 | 37 | def __len__(self): 38 | return len(self.orig_files) 39 | 40 | def reshape_array(array): 41 | return np.array(array, np.int32).reshape((-1, 1, 2)) 42 | 43 | 44 | def resize(image,img_width,img_height): 45 | """Crop and resize image for pix2pix.""" 46 | height, width, _ = image.shape 47 | if height != width: 48 | # crop to correct ratio 49 | size = min(height, width) 50 | oh = (height - size) // 2 51 | ow = (width - size) // 2 52 | cropped_image = image[oh:(oh + size), ow:(ow + size)] 53 | image_resize = cv2.resize(cropped_image, (img_width, img_height), interpolation = cv2.INTER_LINEAR) 54 | return image_resize 55 | 56 | def rescale_frame(frame): 57 | dim = (256, 256) 58 | return cv2.resize(frame, dim, interpolation =cv2.INTER_AREA) 59 | 60 | def get_landmarks(black_image,gray,faces,predictor): 61 | for face in faces: 62 | detected_landmarks = predictor(gray, face).parts() 63 | landmarks = [[p.x * DOWNSAMPLE_RATIO, p.y * DOWNSAMPLE_RATIO] for p in detected_landmarks] 64 | 65 | jaw = reshape_array(landmarks[0:17]) 66 | left_eyebrow = reshape_array(landmarks[22:27]) 67 | right_eyebrow = reshape_array(landmarks[17:22]) 68 | nose_bridge = reshape_array(landmarks[27:31]) 69 | lower_nose = reshape_array(landmarks[30:35]) 70 | left_eye = reshape_array(landmarks[42:48]) 71 | right_eye = reshape_array(landmarks[36:42]) 72 | outer_lip = reshape_array(landmarks[48:60]) 73 | inner_lip = reshape_array(landmarks[60:68]) 74 | 75 | color = (255, 255, 255) 76 | thickness = 3 77 | 78 | cv2.polylines(black_image, [jaw], 
False, color, thickness) 79 | cv2.polylines(black_image, [left_eyebrow], False, color, thickness) 80 | cv2.polylines(black_image, [right_eyebrow], False, color, thickness) 81 | cv2.polylines(black_image, [nose_bridge], False, color, thickness) 82 | cv2.polylines(black_image, [lower_nose], True, color, thickness) 83 | cv2.polylines(black_image, [left_eye], True, color, thickness) 84 | cv2.polylines(black_image, [right_eye], True, color, thickness) 85 | cv2.polylines(black_image, [outer_lip], True, color, thickness) 86 | cv2.polylines(black_image, [inner_lip], True, color, thickness) 87 | return black_image 88 | 89 | def prepare_data(video_file_path, detector, predictor, num_samples=400, downsample_ratio = DOWNSAMPLE_RATIO): 90 | """ 91 | Utility to prepare data for pix2pix based deepfake. 92 | Output is a set of directories with original frames 93 | and their corresponding facial landmarks 94 | Parameters: 95 | video_file_path : path to video to be analysed 96 | num_samples : number of frames/samples to be extracted 97 | """ 98 | 99 | # create output directories 100 | os.makedirs(f'{DATASET_PATH}',exist_ok=True) 101 | os.makedirs(f'{DATASET_PATH}/original', exist_ok=True) 102 | os.makedirs(f'{DATASET_PATH}/landmarks', exist_ok=True) 103 | 104 | # get video capture object 105 | cap = cv2.VideoCapture(video_file_path) 106 | fps = video.FPS().start() 107 | 108 | # iterate through video frame by fame 109 | count = 0 110 | while cap.isOpened(): 111 | ret, frame = cap.read() 112 | 113 | # resize frame 114 | frame_resize = cv2.resize(frame, 115 | None, 116 | fx=1 / downsample_ratio, 117 | fy=1 / downsample_ratio) 118 | 119 | # gray scale 120 | gray = cv2.cvtColor(frame_resize, cv2.COLOR_BGR2GRAY) 121 | 122 | # detect face 123 | faces = detector(gray, 1) 124 | 125 | # black background 126 | black_image = np.zeros(frame.shape, np.uint8) 127 | 128 | # Proceed only if face is detected 129 | if len(faces) == 1: 130 | black_image = get_landmarks(black_image,gray,faces,predictor) 131 | 132 | # Display the resulting frame 133 | count += 1 134 | cv2.imwrite(f"{DATASET_PATH}/original/{count}.png", frame) 135 | cv2.imwrite(f"{DATASET_PATH}/landmarks/{count}.png", black_image) 136 | fps.update() 137 | 138 | # stop after num_samples 139 | if count == num_samples: 140 | break 141 | elif cv2.waitKey(1) & 0xFF == ord('q'): 142 | break 143 | else: 144 | print("No face detected") 145 | 146 | fps.stop() 147 | print('Total time: {:.2f}'.format(fps.elapsed())) 148 | print('Approx. 
FPS: {:.2f}'.format(fps.fps())) 149 | 150 | cap.release() 151 | cv2.destroyAllWindows() -------------------------------------------------------------------------------- /ch_14/deepfake_banner.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/5992bcc2e28c2ad573fa39d290a6e342b4d3820e/ch_14/deepfake_banner.png -------------------------------------------------------------------------------- /ch_14/face_utils.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | import face_recognition 4 | from skimage import transform 5 | 6 | from constants import landmarks_2D 7 | 8 | 9 | def get_align_mat(face): 10 | mat = transform.SimilarityTransform() 11 | mat.estimate(np.array(face.landmarksAsXY()[17:]), landmarks_2D) 12 | return mat.params[0:2] 13 | 14 | 15 | # Extraction class to identify and crop out the face from an image 16 | class Extract(object): 17 | def extract(self, image, face, size): 18 | if face.landmarks is None: 19 | print("Warning! landmarks not found. Switching to crop!") 20 | return cv2.resize(face.image, (size, size)) 21 | 22 | alignment = get_align_mat(face) 23 | return self.transform(image, alignment, size, padding=48) 24 | 25 | def transform(self, image, mat, size, padding=0): 26 | mat = mat * (size - 2 * padding) 27 | mat[:, 2] += padding 28 | return cv2.warpAffine(image, mat, (size, size)) 29 | 30 | 31 | # Filter class to check if a given image has a specific identity or not 32 | class FaceFilter(): 33 | def __init__(self, reference_file_path, threshold=0.65): 34 | """ 35 | Works only for single face images 36 | """ 37 | image = face_recognition.load_image_file(reference_file_path) 38 | self.encoding = face_recognition.face_encodings(image)[0] 39 | self.threshold = threshold 40 | 41 | def check(self, detected_face): 42 | encodings = face_recognition.face_encodings(np.array(detected_face.image)) 43 | if len(encodings) > 0: 44 | encodings = encodings[0] 45 | score = face_recognition.face_distance([self.encoding], encodings) 46 | else: 47 | print("No faces found in the image!") 48 | score = 0.8 49 | return score <= self.threshold 50 | 51 | 52 | class Convert(): 53 | def __init__(self, encoder, 54 | blur_size=2, 55 | seamless_clone=False, 56 | mask_type="facehullandrect", 57 | erosion_kernel_size=None, 58 | **kwargs): 59 | self.encoder = encoder 60 | 61 | self.erosion_kernel = None 62 | if erosion_kernel_size is not None: 63 | self.erosion_kernel = cv2.getStructuringElement( 64 | cv2.MORPH_ELLIPSE, (erosion_kernel_size, erosion_kernel_size)) 65 | 66 | self.blur_size = blur_size 67 | self.seamless_clone = seamless_clone 68 | self.mask_type = mask_type.lower() 69 | 70 | def patch_image(self, image, face_detected): 71 | size = 64 72 | image_size = image.shape[1], image.shape[0] 73 | 74 | mat = np.array(get_align_mat(face_detected)).reshape(2, 3) * size 75 | 76 | new_face = self.get_new_face(image, mat, size) 77 | 78 | image_mask = self.get_image_mask( 79 | image, new_face, face_detected, mat, image_size) 80 | 81 | return self.apply_new_face(image, 82 | new_face, 83 | image_mask, 84 | mat, 85 | image_size, 86 | size) 87 | 88 | def apply_new_face(self, 89 | image, 90 | new_face, 91 | image_mask, 92 | mat, 93 | image_size, 94 | size): 95 | base_image = np.copy(image) 96 | new_image = np.copy(image) 97 | 98 | cv2.warpAffine(new_face, mat, image_size, new_image, 99 | 
    def apply_new_face(self,
                       image,
                       new_face,
                       image_mask,
                       mat,
                       image_size,
                       size):
        base_image = np.copy(image)
        new_image = np.copy(image)

        # paste the converted face back into the frame using the inverse alignment
        cv2.warpAffine(new_face, mat, image_size, new_image,
                       cv2.WARP_INVERSE_MAP, cv2.BORDER_TRANSPARENT)

        outimage = None
        if self.seamless_clone:
            # centre of the aligned face mapped back to original image coordinates
            masky, maskx = cv2.transform(np.array([size / 2, size / 2]).reshape(
                1, 1, 2), cv2.invertAffineTransform(mat)).reshape(2).astype(int)
            outimage = cv2.seamlessClone(new_image.astype(np.uint8), base_image.astype(
                np.uint8), (image_mask * 255).astype(np.uint8), (masky, maskx), cv2.NORMAL_CLONE)
        else:
            foreground = cv2.multiply(image_mask, new_image.astype(float))
            background = cv2.multiply(
                1.0 - image_mask, base_image.astype(float))
            outimage = cv2.add(foreground, background)

        return outimage

    def get_new_face(self, image, mat, size):
        face = cv2.warpAffine(image, mat, (size, size))
        face = np.expand_dims(face, 0)
        new_face = self.encoder(face / 255.0)[0]

        return np.clip(new_face * 255, 0, 255).astype(image.dtype)

    def get_image_mask(self, image, new_face, face_detected, mat, image_size):

        face_mask = np.zeros(image.shape, dtype=float)
        if 'rect' in self.mask_type:
            face_src = np.ones(new_face.shape, dtype=float)
            cv2.warpAffine(face_src, mat, image_size, face_mask,
                           cv2.WARP_INVERSE_MAP, cv2.BORDER_TRANSPARENT)

        hull_mask = np.zeros(image.shape, dtype=float)
        if 'hull' in self.mask_type:
            hull = cv2.convexHull(np.array(face_detected.landmarksAsXY()).reshape(
                (-1, 2)).astype(int)).flatten().reshape((-1, 2))
            cv2.fillConvexPoly(hull_mask, hull, (1, 1, 1))

        if self.mask_type == 'rect':
            image_mask = face_mask
        elif self.mask_type == 'facehull':  # compare lower-case: mask_type is lower-cased in __init__
            image_mask = hull_mask
        else:
            image_mask = face_mask * hull_mask

        if self.erosion_kernel is not None:
            image_mask = cv2.erode(
                image_mask, self.erosion_kernel, iterations=1)

        if self.blur_size != 0:
            image_mask = cv2.blur(image_mask, (self.blur_size, self.blur_size))

        return image_mask


# Entity class
class DetectedFace(object):
    def __init__(self, image, x, w, y, h, landmarks):
        self.image = image
        self.x = x
        self.w = w
        self.y = y
        self.h = h
        self.landmarks = landmarks

    def landmarksAsXY(self):
        return [(p.x, p.y) for p in self.landmarks.parts()]
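

# --- Usage sketch (illustrative only, not part of the original module) ---
# Shows how a DetectedFace could be built from dlib detections and screened with FaceFilter.
# The dlib calls, the shape-predictor file name and the choice of bundled images are
# assumptions for demonstration purposes.
if __name__ == "__main__":
    import dlib
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed local model file
    frame = cv2.imread("sample_image_cc.png")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)
    if rects:
        rect = rects[0]
        landmarks = predictor(gray, rect)
        # face_recognition expects RGB, so convert the cropped face before the identity check
        crop = cv2.cvtColor(frame[rect.top():rect.bottom(), rect.left():rect.right()],
                            cv2.COLOR_BGR2RGB)
        face = DetectedFace(crop, rect.left(), rect.width(), rect.top(), rect.height(), landmarks)
        print("Matches reference identity:", FaceFilter("trump_ref_cc.jpg").check(face))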
--------------------------------------------------------------------------------
/ch_14/gan_utils.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn


IMG_WIDTH = 256
IMG_HEIGHT = 256
NUM_CHANNELS = 3
BATCH_SIZE = 64
N_EPOCHS = 100
SAMPLE_INTERVAL = 18

# prepare patch size for our setup
patch = int(IMG_HEIGHT / 2**4)
PATCH_GAN_SHAPE = (1, patch, patch)


class DownSampleBlock(nn.Module):
    def __init__(self, input_channels, output_channels, normalize=True):
        super(DownSampleBlock, self).__init__()
        layers = [
            nn.Conv2d(
                input_channels,
                output_channels,
                kernel_size=4,
                stride=2,
                padding=1,
                bias=False)
        ]
        if normalize:
            layers.append(nn.InstanceNorm2d(output_channels))
        layers.append(nn.LeakyReLU(0.2))
        layers.append(nn.Dropout(0.5))
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)


class UpSampleBlock(nn.Module):
    def __init__(self, input_channels, output_channels):
        super(UpSampleBlock, self).__init__()
        layers = [
            nn.ConvTranspose2d(
                input_channels,
                output_channels,
                kernel_size=4,
                stride=2,
                padding=1,
                bias=False),
        ]
        layers.append(nn.InstanceNorm2d(output_channels))
        layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Dropout(0.5))
        self.model = nn.Sequential(*layers)

    def forward(self, x, skip_connection):
        x = self.model(x)
        x = torch.cat((x, skip_connection), 1)

        return x


class Generator(nn.Module):
    def __init__(self, input_channels=3, out_channels=3):
        super(Generator, self).__init__()

        self.downsample1 = DownSampleBlock(input_channels, 64, normalize=False)
        self.downsample2 = DownSampleBlock(64, 128)
        self.downsample3 = DownSampleBlock(128, 256)
        self.downsample4 = DownSampleBlock(256, 512)
        self.downsample5 = DownSampleBlock(512, 512)
        self.downsample6 = DownSampleBlock(512, 512)
        self.downsample7 = DownSampleBlock(512, 512)
        self.downsample8 = DownSampleBlock(512, 512, normalize=False)

        self.upsample1 = UpSampleBlock(512, 512)
        self.upsample2 = UpSampleBlock(1024, 512)
        self.upsample3 = UpSampleBlock(1024, 512)
        self.upsample4 = UpSampleBlock(1024, 512)
        self.upsample5 = UpSampleBlock(1024, 256)
        self.upsample6 = UpSampleBlock(512, 128)
        self.upsample7 = UpSampleBlock(256, 64)

        self.final_layer = nn.Sequential(
            nn.Upsample(scale_factor=2),
            # padding left, right, top, bottom
            nn.ZeroPad2d((1, 0, 1, 0)),
            nn.Conv2d(128, out_channels, 4, padding=1),
            nn.Tanh(),
        )

    def forward(self, x):
        # downsampling blocks
        d1 = self.downsample1(x)
        d2 = self.downsample2(d1)
        d3 = self.downsample3(d2)
        d4 = self.downsample4(d3)
        d5 = self.downsample5(d4)
        d6 = self.downsample6(d5)
        d7 = self.downsample7(d6)
        d8 = self.downsample8(d7)
        # upsampling blocks with skip connections
        u1 = self.upsample1(d8, d7)
        u2 = self.upsample2(u1, d6)
        u3 = self.upsample3(u2, d5)
        u4 = self.upsample4(u3, d4)
        u5 = self.upsample5(u4, d3)
        u6 = self.upsample6(u5, d2)
        u7 = self.upsample7(u6, d1)

        return self.final_layer(u7)


class Discriminator(nn.Module):
    def __init__(self, input_channels=3):
        super(Discriminator, self).__init__()

        def discriminator_block(input_filters, output_filters):
            layers = [
                nn.Conv2d(
                    input_filters,
                    output_filters,
                    kernel_size=4,
                    stride=2,
                    padding=1)
            ]
            layers.append(nn.InstanceNorm2d(output_filters))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.model = nn.Sequential(
            *discriminator_block(input_channels * 2, output_filters=64),
            *discriminator_block(64, 128),
            *discriminator_block(128, 256),
            *discriminator_block(256, 512),
            # padding left, right, top, bottom
            nn.ZeroPad2d((1, 0, 1, 0)),
            nn.Conv2d(512, 1, 4, padding=1, bias=False)
        )

    def forward(self, img_A, img_B):
        img_input = torch.cat((img_A, img_B), 1)
        return self.model(img_input)
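

# --- Shape-check sketch (illustrative only, not part of the original module) ---
# Confirms that the U-Net generator maps a 3x256x256 input to an image of the same size
# and that the PatchGAN discriminator output matches PATCH_GAN_SHAPE.
if __name__ == "__main__":
    generator = Generator()
    discriminator = Discriminator()
    source = torch.randn(1, NUM_CHANNELS, IMG_HEIGHT, IMG_WIDTH)
    with torch.no_grad():
        translated = generator(source)
        patch_scores = discriminator(source, translated)
    print(translated.shape)     # torch.Size([1, 3, 256, 256])
    print(patch_scores.shape)   # torch.Size([1, 1, 16, 16])
    print(PATCH_GAN_SHAPE)      # (1, 16, 16)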
--------------------------------------------------------------------------------
/ch_14/nicolas_ref_cc.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/5992bcc2e28c2ad573fa39d290a6e342b4d3820e/ch_14/nicolas_ref_cc.jpg
--------------------------------------------------------------------------------
/ch_14/obama.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/5992bcc2e28c2ad573fa39d290a6e342b4d3820e/ch_14/obama.mp4
--------------------------------------------------------------------------------
/ch_14/sample_image_cc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/5992bcc2e28c2ad573fa39d290a6e342b4d3820e/ch_14/sample_image_cc.png
--------------------------------------------------------------------------------
/ch_14/trump_ref_cc.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Generative-AI-with-Python-and-PyTorch-Second-Edition/5992bcc2e28c2ad573fa39d290a6e342b4d3820e/ch_14/trump_ref_cc.jpg
--------------------------------------------------------------------------------