├── 01. Transformer Models
    ├── Requirements.txt
    ├── README.md
    └── TransformerModels.ipynb
├── 02. Pipeline Function
    ├── Requirements.txt
    ├── README.md
    └── PipelineFunction.ipynb
├── 03. Tokenizers & Models
    ├── Requirements.txt
    ├── README.md
    └── Models&Tokenizers.ipynb
├── 04. Pretrained Models
    ├── Requirements.txt
    ├── README.md
    └── PretrainedModel.ipynb
├── LICENSE
└── README.md


/01. Transformer Models/Requirements.txt:
--------------------------------------------------------------------------------
1 | transformer


--------------------------------------------------------------------------------
/02. Pipeline Function/Requirements.txt:
--------------------------------------------------------------------------------
1 | transformer
2 | torch


--------------------------------------------------------------------------------
/03. Tokenizers & Models/Requirements.txt:
--------------------------------------------------------------------------------
1 | transformer
2 | torch


--------------------------------------------------------------------------------
/04. Pretrained Models/Requirements.txt:
--------------------------------------------------------------------------------
1 | transformer
2 | torch
3 | datasets


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2022 Thinam Tamang
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # **HUGGING FACE**
 2 | 
 3 | The repository contains a list of the projects which I have worked on while reading **Hugging Face 🤗**. 
 4 | 
 5 | ### **📚NOTEBOOKS:**
 6 | 
 7 | [**1. TRANSFORMER MODELS**](https://github.com/ThinamXx/HuggingFace/tree/main/01.%20Transformer%20Models)
 8 | - The **Transformer Models** notebook is a comprehensive notebook as it contains a list of real world applications of Transformers such as Sentiment Analysis, Zero-shot Classification, Text Generation, Mask Filling, Named Entity Recognition, Question Answering, Summarization, and Translation.
 9 | 
10 | [**2. PIPELINE FUNCTION**](https://github.com/ThinamXx/HuggingFace/tree/main/02.%20Pipeline%20Function)
11 | - The **Pipeline Function** notebook is a comprehensive notebook as it contains all three steps of the pipeline: preprocessing with tokenizers, passing the inputs through the model, and postprocessing the outputs.
12 | 
13 | [**3. TOKENIZERS & MODELS**](https://github.com/ThinamXx/HuggingFace/tree/main/03.%20Tokenizers%20%26%20Models)
14 | - The **Tokenizers & Models** notebook contains the comprehensive information about the basic buildings blocks of a Transformer model, tokenization pipeline, limitations of input IDs, attention masks, and configurable tokenizer methods.
15 | 


--------------------------------------------------------------------------------
/04. Pretrained Models/README.md:
--------------------------------------------------------------------------------
 1 | ## **Hugging Face : Fine Tuning Pretrained Models**
 2 | 
 3 | The [**Pretrained Models**](https://github.com/ThinamXx/HuggingFace/blob/main/04.%20Pretrained%20Models/PretrainedModel.ipynb) notebook contains information about Preprocessing the Dataset, Dynamic Padding, Training & Evaluation of 🤗 Transformer Models. 
 4 | 
 5 | **Note:**
 6 | - 📑[**Pretrained Models**](https://github.com/ThinamXx/HuggingFace/blob/main/04.%20Pretrained%20Models/PretrainedModel.ipynb)
 7 | 
 8 | **MRPC (Microsoft Research Paraphrase Corpus)**
 9 | - In this notebook, we will use MRPC (Microsoft Research Paraphrase Corpus) dataset introduced by William B. Dolan and Chris Brockett. The dataset consist of 5801 pairs of sentences, with a label indicating if they are paraphrases or not. It is one of the 10 datasets composing the GLUE benchmark, which is an academic benchmark that is used to measure the performance of ML models across 10 different text classification tasks.
10 | 
11 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2031a.PNG) 
12 | 
13 | **Dynamic Padding**
14 | - The function that is responsible for putting together samples inside a batch is called a collate function. Dynamic Padding means the samples in the batch should all be padded to the maximum length inside the batch. I have presented the implementation of Tokenization Pipeline & Training here in the snapshot.
15 | 
16 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2031b.PNG)
17 | 


--------------------------------------------------------------------------------
/03. Tokenizers & Models/README.md:
--------------------------------------------------------------------------------
 1 | ## **Hugging Face : Tokenizers & Models**
 2 | 
 3 | The [**Tokenizers & Models**](https://github.com/ThinamXx/HuggingFace/blob/main/03.%20Tokenizers%20%26%20Models/Models%26Tokenizers.ipynb) notebook contains the comprehensive information about the basic buildings blocks of a Transformer model, tokenization pipeline, limitations of input IDs, attention masks, and configurable tokenizer methods. 
 4 | 
 5 | **Note:**
 6 | - 📑[**Tokenizers & Models**](https://github.com/ThinamXx/HuggingFace/blob/main/03.%20Tokenizers%20%26%20Models/Models%26Tokenizers.ipynb)
 7 | 
 8 | **Tokenizers**
 9 | - The basic type of tokenizer that comes to mind is word-based tokenizer. It's generally very easy to set up and use with only a few rules, and it yeilds decent results.
10 | 
11 | **Subword Tokenization**
12 | - Subword Tokenization rely on the principle that frequently used words should not be split into smaller subwords, but rare words should be decomposed into meaningful subwords.
13 | 
14 | **Encoding**
15 | - Translating text to numbers is known as encoding. Encoding is done in a two-step process: the tokenization, followed by the conversion to input IDs.
16 | - Batching is the act of sending multiple sentences through the model, all at once. If we have only one sentence, we can just build a batch with a single sequence.
17 | - Padding makes sure all our sentences have the same length by adding a special word called the padding token to the sentences with fewer values.
18 | 
19 | **Attention Masks**
20 | - Attention masks are tensors with the same shapes as the input IDs tensors, filled with 0s and 1s: 1s indicate the corresponding tokens should be attended to, and 0s indicate the corresponding tokens should not be attended to i.e. they should be ignored by the attention layers of the model.
21 | 
22 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2029.PNG)
23 | 
24 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2030.PNG)
25 | 


--------------------------------------------------------------------------------
/02. Pipeline Function/README.md:
--------------------------------------------------------------------------------
 1 | ## **Hugging Face : Pipeline Function**
 2 | 
 3 | The [**Pipeline Function**](https://github.com/ThinamXx/HuggingFace/blob/main/02.%20Pipeline%20Function/PipelineFunction.ipynb) notebook is a comprehensive notebook as it contains all three steps of the **pipeline**: preprocessing with tokenizers, passing the inputs through the model, and postprocessing the outputs.  
 4 | 
 5 | **Note:**
 6 | - 📑[**Pipeline Function**](https://github.com/ThinamXx/HuggingFace/blob/main/02.%20Pipeline%20Function/PipelineFunction.ipynb) 
 7 | 
 8 | **Natural Language Processing**
 9 | - NLP is a field of linguistics and machine learning focused on understanding everything related to human language. The aim of NLP tasks is not only to understand single words individually, but to be able to understand the context of those words.
10 | 
11 | **Transformer Pipelines**
12 | - The three main steps involved when we pass some text to a **pipeline** are:
13 |   - The text is preprocessed into a format the model can understand.
14 |   - The preprocessed inputs are passed to the model.
15 |   - The predictions of the model are post-processed, so we can make sense of them.
16 | 
17 | **Preprocessing with Tokenizer**
18 | - The tokenizer will be responsible for:
19 |   - Splitting the input into words, subwords, or symbols like puncutation that are also called tokens.
20 |   - Mapping each token to an integer.
21 |   - Adding additional inputs that may be useful to the model.
22 | 
23 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2027.PNG) 
24 | 
25 | **Going through Model**
26 | - The vector output by the Transformer module is usually large. It generally has 3 dimensions:
27 |   - Batch size: The number of sequences processed at a time.
28 |   - Sequence length: The length of the numerical representation of the sequence.
29 |   - Hidden size: The vector dimension of each model input.
30 | 
31 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2027.PNG) 
32 | 
33 | 


--------------------------------------------------------------------------------
/01. Transformer Models/README.md:
--------------------------------------------------------------------------------
 1 | ## **Hugging Face : Transformer Models**
 2 | 
 3 | The [**Transformer Models**](https://github.com/ThinamXx/HuggingFace/blob/main/01.%20Transformer%20Models/TransformerModels.ipynb) notebook is a comprehensive notebook as it contains a list of real world applications of **Transformers** such as **Sentiment Analysis**, **Zero-shot Classification**, **Text Generation**, **Mask Filling**, **Named Entity Recognition**, **Question Answering**, **Summarization**, and **Translation**. 
 4 | 
 5 | **Note:**
 6 | - 📑[**Transformer Models**](https://github.com/ThinamXx/HuggingFace/blob/main/01.%20Transformer%20Models/TransformerModels.ipynb) 
 7 | 
 8 | **Natural Language Processing**
 9 | - NLP is a field of linguistics and machine learning focused on understanding everything related to human language. The aim of NLP tasks is not only to understand single words individually, but to be able to understand the context of those words.
10 | 
11 | **Transformer Pipelines**
12 | - The three main steps involved when we pass some text to a **pipeline** are:
13 |   - The text is preprocessed into a format the model can understand.
14 |   - The preprocessed inputs are passed to the model.
15 |   - The predictions of the model are post-processed, so we can make sense of them.
16 | 
17 | **Sentiment Analysis**
18 | 
19 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2026.PNG)
20 | 
21 | **Zero-shot Classification**
22 | - The zero-shot-classification pipeline is very powerful, as it allows us to specify which labels to use for the classification, so we don't have to rely on the labels of the pretrained model. This pipeline is called zero-shot because we don't need to fine-tune the model on our data to use it. It can directly return probability scores for any list of labels we want.
23 | 
24 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2026.PNG)
25 | 
26 | **Text Generation**
27 | - The main idea here is that when we provide a prompt and the model will auto-complete it by generating the remaining text.
28 | 
29 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2026.PNG)
30 | 
31 | **Mask Filling**
32 | - The idea of this task is to fill in the blanks in a given text.
33 | 
34 | **Named Entity Recognition**
35 | - NER is a task where the model has to find which parts of the input text correspond to entities such as persons, locations, or organizations.
36 | 
37 | **Question Answering**
38 | - The question-answering pipeline answers questions using information from a given context.
39 | 
40 | **Summarization**
41 | - Summarization is the task of reducing a text into a shorter text while keeping all or most of the important aspects referenced in the text.
42 | 
43 | **Translation**
44 | 
45 | ![Image](https://github.com/ThinamXx/MachineLearning_DeepLearning/blob/main/Images/Day%2026.PNG)
46 | 


--------------------------------------------------------------------------------
/02. Pipeline Function/PipelineFunction.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |   "nbformat": 4,
  3 |   "nbformat_minor": 0,
  4 |   "metadata": {
  5 |     "colab": {
  6 |       "name": "PipelineFunction.ipynb",
  7 |       "provenance": []
  8 |     },
  9 |     "kernelspec": {
 10 |       "name": "python3",
 11 |       "display_name": "Python 3"
 12 |     },
 13 |     "language_info": {
 14 |       "name": "python"
 15 |     },
 16 |     "accelerator": "GPU"
 17 |   },
 18 |   "cells": [
 19 |     {
 20 |       "cell_type": "markdown",
 21 |       "source": [
 22 |         "**INITIALIZATION:**\n",
 23 |         "- I use these three lines of code on top of my each notebooks because it will help to prevent any problems while reloading the same project. And the third line of code helps to make visualization within the notebook."
 24 |       ],
 25 |       "metadata": {
 26 |         "id": "uD9wr0Cmweq0"
 27 |       }
 28 |     },
 29 |     {
 30 |       "cell_type": "code",
 31 |       "execution_count": 1,
 32 |       "metadata": {
 33 |         "id": "fSALgkZsJCHw"
 34 |       },
 35 |       "outputs": [],
 36 |       "source": [
 37 |         "#@ INITIALIZATION: \n",
 38 |         "%reload_ext autoreload\n",
 39 |         "%autoreload 2\n",
 40 |         "%matplotlib inline"
 41 |       ]
 42 |     },
 43 |     {
 44 |       "cell_type": "markdown",
 45 |       "source": [
 46 |         "**LIBRARIES AND DEPENDENCIES:**\n",
 47 |         "- I have downloaded all the libraries and dependencies required for the project in one particular cell."
 48 |       ],
 49 |       "metadata": {
 50 |         "id": "1OnmQkSVwp9N"
 51 |       }
 52 |     },
 53 |     {
 54 |       "cell_type": "code",
 55 |       "source": [
 56 |         "#@ INSTALLING DEPENDENCIES: UNCOMMENT BELOW: \n",
 57 |         "# !pip install transformers[sentencepiece]"
 58 |       ],
 59 |       "metadata": {
 60 |         "id": "e6bx-9iQwni7"
 61 |       },
 62 |       "execution_count": 3,
 63 |       "outputs": []
 64 |     },
 65 |     {
 66 |       "cell_type": "code",
 67 |       "source": [
 68 |         "#@ DOWNLOADING LIBRARIES AND DEPENDENCIES:\n",
 69 |         "import torch\n",
 70 |         "import transformers\n",
 71 |         "from transformers import pipeline\n",
 72 |         "from transformers import AutoTokenizer\n",
 73 |         "from transformers import AutoModel\n",
 74 |         "from transformers import AutoModelForSequenceClassification"
 75 |       ],
 76 |       "metadata": {
 77 |         "id": "JTf0fK0Kwu3R"
 78 |       },
 79 |       "execution_count": 18,
 80 |       "outputs": []
 81 |     },
 82 |     {
 83 |       "cell_type": "markdown",
 84 |       "source": [
 85 |         "**SENTIMENT ANALYSIS:**"
 86 |       ],
 87 |       "metadata": {
 88 |         "id": "kkJil872w4m4"
 89 |       }
 90 |     },
 91 |     {
 92 |       "cell_type": "code",
 93 |       "source": [
 94 |         "#@ IMPLEMENTATION OF SENTIMENT ANALYSIS PIPELINE:\n",
 95 |         "classifier = pipeline(\"sentiment-analysis\")                                 # Initializing Classifier Object. \n",
 96 |         "classifier(\"I've started the HuggingFace course which fascinates me.\")      # Inspecting Sentiment."
 97 |       ],
 98 |       "metadata": {
 99 |         "colab": {
100 |           "base_uri": "https://localhost:8080/"
101 |         },
102 |         "id": "f_SApdSEwzdC",
103 |         "outputId": "17d6c32f-b6ba-4c2e-fcc8-ca0d580c16ef"
104 |       },
105 |       "execution_count": 6,
106 |       "outputs": [
107 |         {
108 |           "output_type": "stream",
109 |           "name": "stderr",
110 |           "text": [
111 |             "No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)\n"
112 |           ]
113 |         },
114 |         {
115 |           "output_type": "execute_result",
116 |           "data": {
117 |             "text/plain": [
118 |               "[{'label': 'POSITIVE', 'score': 0.9997233748435974}]"
119 |             ]
120 |           },
121 |           "metadata": {},
122 |           "execution_count": 6
123 |         }
124 |       ]
125 |     },
126 |     {
127 |       "cell_type": "markdown",
128 |       "source": [
129 |         "**PREPROCESSING WITH TOKENIZER:**\n",
130 |         "- The **tokenizer** will be responsible for: \n",
131 |         "    - Splitting the input into words, subwords, or symbols like puncutation that are also called tokens. \n",
132 |         "    - Mapping each token to an integer. \n",
133 |         "    - Adding additional inputs that may be useful to the model. "
134 |       ],
135 |       "metadata": {
136 |         "id": "mB8yznK7xeV9"
137 |       }
138 |     },
139 |     {
140 |       "cell_type": "code",
141 |       "source": [
142 |         "#@ INITIALIZATION OF AUTOTOKENIZER: \n",
143 |         "checkpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"                  # Initialization. \n",
144 |         "tokenizer = AutoTokenizer.from_pretrained(checkpoint)                           # Initializing Tokenizer. "
145 |       ],
146 |       "metadata": {
147 |         "id": "Jw48OV8fxNsC"
148 |       },
149 |       "execution_count": 8,
150 |       "outputs": []
151 |     },
152 |     {
153 |       "cell_type": "code",
154 |       "source": [
155 |         "#@ CONVERTING INTO TENSORS:\n",
156 |         "raw_inputs = [\"I've started the HuggingFace course which fascinates me.\",\n",
157 |         "              \"I will no longer read it's documentation.\",\n",
158 |         "              \"I think the course is awesome!\"]                                 # Initializing Input Text. \n",
159 |         "inputs = tokenizer(raw_inputs, padding=True, truncation=True,                   # Converting into Equal Size. \n",
160 |         "                   return_tensors=\"pt\")                                         # Getting PyTorch Tensors. \n",
161 |         "print(inputs)                                                                   # Inspecting Tensors with Attention Mask. "
162 |       ],
163 |       "metadata": {
164 |         "colab": {
165 |           "base_uri": "https://localhost:8080/"
166 |         },
167 |         "id": "-QRTHuWPzejE",
168 |         "outputId": "40fbd083-be03-4771-ee79-6a827846a0ee"
169 |       },
170 |       "execution_count": 9,
171 |       "outputs": [
172 |         {
173 |           "output_type": "stream",
174 |           "name": "stdout",
175 |           "text": [
176 |             "{'input_ids': tensor([[  101,  1045,  1005,  2310,  2318,  1996, 17662, 12172,  2607,  2029,\n",
177 |             "          6904, 11020, 28184,  2033,  1012,   102],\n",
178 |             "        [  101,  1045,  2097,  2053,  2936,  3191,  2009,  1005,  1055, 12653,\n",
179 |             "          1012,   102,     0,     0,     0,     0],\n",
180 |             "        [  101,  1045,  2228,  1996,  2607,  2003, 12476,   999,   102,     0,\n",
181 |             "             0,     0,     0,     0,     0,     0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
182 |             "        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],\n",
183 |             "        [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]])}\n"
184 |           ]
185 |         }
186 |       ]
187 |     },
188 |     {
189 |       "cell_type": "markdown",
190 |       "source": [
191 |         "**GOING THROUGH MODEL:**\n",
192 |         "- The vector output by the Transformer module is usually large. It generally has 3 dimensions:\n",
193 |         "    - **Batch size:** The number of sequences processed at a time. \n",
194 |         "    - **Sequence length:** The length of the numerical representation of the sequence. \n",
195 |         "    - **Hidden size:** The vector dimension of each model input. "
196 |       ],
197 |       "metadata": {
198 |         "id": "EhHtxFMu07O3"
199 |       }
200 |     },
201 |     {
202 |       "cell_type": "code",
203 |       "source": [
204 |         "#@ INITIALIZATION OF TRANSFORMER MODEL:\n",
205 |         "checkpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"                  # Initialization. \n",
206 |         "model = AutoModel.from_pretrained(checkpoint)                                   # Initializing Model. "
207 |       ],
208 |       "metadata": {
209 |         "colab": {
210 |           "base_uri": "https://localhost:8080/"
211 |         },
212 |         "id": "qBziV4RZ0tYn",
213 |         "outputId": "9da736bf-80bb-4bff-d286-be99fe05b474"
214 |       },
215 |       "execution_count": 12,
216 |       "outputs": [
217 |         {
218 |           "output_type": "stream",
219 |           "name": "stderr",
220 |           "text": [
221 |             "Some weights of the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing DistilBertModel: ['classifier.bias', 'pre_classifier.weight', 'classifier.weight', 'pre_classifier.bias']\n",
222 |             "- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
223 |             "- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n"
224 |           ]
225 |         }
226 |       ]
227 |     },
228 |     {
229 |       "cell_type": "code",
230 |       "source": [
231 |         "#@ INSPECTING VECTOR DIMENSIONS:\n",
232 |         "output = model(**inputs)                                                        # Initializing Output. \n",
233 |         "print(output.last_hidden_state.shape)                                           # Inspecting Shape. "
234 |       ],
235 |       "metadata": {
236 |         "colab": {
237 |           "base_uri": "https://localhost:8080/"
238 |         },
239 |         "id": "o0DF1xLD1idI",
240 |         "outputId": "28558a2c-8725-436e-870f-b41f553bcade"
241 |       },
242 |       "execution_count": 13,
243 |       "outputs": [
244 |         {
245 |           "output_type": "stream",
246 |           "name": "stdout",
247 |           "text": [
248 |             "torch.Size([3, 16, 768])\n"
249 |           ]
250 |         }
251 |       ]
252 |     },
253 |     {
254 |       "cell_type": "code",
255 |       "source": [
256 |         "#@ INITIALIZATION OF SEQUENCE CLASSIFICATION TRANSFORMER MODEL:\n",
257 |         "checkpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"                  # Initialization. \n",
258 |         "model = AutoModelForSequenceClassification.from_pretrained(checkpoint)          # Initializing Model.\n",
259 |         "output = model(**inputs)                                                        # Initializing Output. \n",
260 |         "print(output.logits.shape)                                                      # Inspecting Shape.  "
261 |       ],
262 |       "metadata": {
263 |         "colab": {
264 |           "base_uri": "https://localhost:8080/"
265 |         },
266 |         "id": "vgx0wBTG3rC2",
267 |         "outputId": "dc8fa02b-9cb9-4358-daf3-dddcac135f45"
268 |       },
269 |       "execution_count": 16,
270 |       "outputs": [
271 |         {
272 |           "output_type": "stream",
273 |           "name": "stdout",
274 |           "text": [
275 |             "torch.Size([3, 2])\n"
276 |           ]
277 |         }
278 |       ]
279 |     },
280 |     {
281 |       "cell_type": "markdown",
282 |       "source": [
283 |         "**POSTPROCESSING THE OUTPUT:**"
284 |       ],
285 |       "metadata": {
286 |         "id": "0Vpcn_cP5ZNx"
287 |       }
288 |     },
289 |     {
290 |       "cell_type": "code",
291 |       "source": [
292 |         "#@ INSPECTING THE OUTPUT:\n",
293 |         "print(output.logits)"
294 |       ],
295 |       "metadata": {
296 |         "colab": {
297 |           "base_uri": "https://localhost:8080/"
298 |         },
299 |         "id": "abcy0rRU5Hkf",
300 |         "outputId": "785a4d1f-18e2-49ad-a275-1da1a0c11dd1"
301 |       },
302 |       "execution_count": 17,
303 |       "outputs": [
304 |         {
305 |           "output_type": "stream",
306 |           "name": "stdout",
307 |           "text": [
308 |             "tensor([[-3.9449,  4.2477],\n",
309 |             "        [ 4.1948, -3.3446],\n",
310 |             "        [-4.2149,  4.5673]], grad_fn=<AddmmBackward0>)\n"
311 |           ]
312 |         }
313 |       ]
314 |     },
315 |     {
316 |       "cell_type": "code",
317 |       "source": [
318 |         "#@ CONVERTING LOGITS INTO PROBABILITIES:\n",
319 |         "predictions = torch.nn.functional.softmax(output.logits, dim=-1)                # Applying Softmax Function. \n",
320 |         "print(predictions)                                                              # Inspecting Prediction Probabilities. "
321 |       ],
322 |       "metadata": {
323 |         "colab": {
324 |           "base_uri": "https://localhost:8080/"
325 |         },
326 |         "id": "0M1iZ3Xa5jMn",
327 |         "outputId": "7e897944-2bdc-481b-b95f-224008263ef8"
328 |       },
329 |       "execution_count": 20,
330 |       "outputs": [
331 |         {
332 |           "output_type": "stream",
333 |           "name": "stdout",
334 |           "text": [
335 |             "tensor([[2.7661e-04, 9.9972e-01],\n",
336 |             "        [9.9947e-01, 5.3144e-04],\n",
337 |             "        [1.5341e-04, 9.9985e-01]], grad_fn=<SoftmaxBackward0>)\n"
338 |           ]
339 |         }
340 |       ]
341 |     },
342 |     {
343 |       "cell_type": "code",
344 |       "source": [
345 |         "#@ INSPECTING MODEL ATTRIBUTE:\n",
346 |         "model.config.id2label"
347 |       ],
348 |       "metadata": {
349 |         "colab": {
350 |           "base_uri": "https://localhost:8080/"
351 |         },
352 |         "id": "4UFDFhhl6VPX",
353 |         "outputId": "8ff92bfd-8d06-411a-f2fa-64483f003353"
354 |       },
355 |       "execution_count": 22,
356 |       "outputs": [
357 |         {
358 |           "output_type": "execute_result",
359 |           "data": {
360 |             "text/plain": [
361 |               "{0: 'NEGATIVE', 1: 'POSITIVE'}"
362 |             ]
363 |           },
364 |           "metadata": {},
365 |           "execution_count": 22
366 |         }
367 |       ]
368 |     }
369 |   ]
370 | }


--------------------------------------------------------------------------------
/03. Tokenizers & Models/Models&Tokenizers.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |   "nbformat": 4,
  3 |   "nbformat_minor": 0,
  4 |   "metadata": {
  5 |     "colab": {
  6 |       "name": "Models&Tokenizers.ipynb",
  7 |       "provenance": [],
  8 |       "collapsed_sections": []
  9 |     },
 10 |     "kernelspec": {
 11 |       "name": "python3",
 12 |       "display_name": "Python 3"
 13 |     },
 14 |     "language_info": {
 15 |       "name": "python"
 16 |     },
 17 |     "accelerator": "GPU"
 18 |   },
 19 |   "cells": [
 20 |     {
 21 |       "cell_type": "markdown",
 22 |       "source": [
 23 |         "**INITIALIZATION:**\n",
 24 |         "- I use these three lines of code on top of my each notebooks because it will help to prevent any problems while reloading the same project. And the third line of code helps to make visualization within the notebook."
 25 |       ],
 26 |       "metadata": {
 27 |         "id": "77HDATy1OdHg"
 28 |       }
 29 |     },
 30 |     {
 31 |       "cell_type": "code",
 32 |       "execution_count": 3,
 33 |       "metadata": {
 34 |         "id": "NGfR4PCjOHZZ"
 35 |       },
 36 |       "outputs": [],
 37 |       "source": [
 38 |         "#@ INITIALIZATION: \n",
 39 |         "%reload_ext autoreload\n",
 40 |         "%autoreload 2\n",
 41 |         "%matplotlib inline"
 42 |       ]
 43 |     },
 44 |     {
 45 |       "cell_type": "markdown",
 46 |       "source": [
 47 |         "**LIBRARIES AND DEPENDENCIES:**\n",
 48 |         "- I have downloaded all the libraries and dependencies required for the project in one particular cell."
 49 |       ],
 50 |       "metadata": {
 51 |         "id": "oOvUVdpdOlcW"
 52 |       }
 53 |     },
 54 |     {
 55 |       "cell_type": "code",
 56 |       "source": [
 57 |         "#@ INSTALLING DEPENDENCIES: UNCOMMENT BELOW: \n",
 58 |         "# !pip install transformers[sentencepiece]"
 59 |       ],
 60 |       "metadata": {
 61 |         "id": "7_xvpbkUOicZ"
 62 |       },
 63 |       "execution_count": 2,
 64 |       "outputs": []
 65 |     },
 66 |     {
 67 |       "cell_type": "code",
 68 |       "source": [
 69 |         "#@ DOWNLOADING LIBRARIES AND DEPENDENCIES:\n",
 70 |         "import torch\n",
 71 |         "import transformers\n",
 72 |         "from transformers import BertConfig, BertModel\n",
 73 |         "from transformers import BertTokenizer\n",
 74 |         "from transformers import AutoTokenizer\n",
 75 |         "from transformers import AutoModelForSequenceClassification"
 76 |       ],
 77 |       "metadata": {
 78 |         "id": "SjikyD4ROrCJ"
 79 |       },
 80 |       "execution_count": 4,
 81 |       "outputs": []
 82 |     },
 83 |     {
 84 |       "cell_type": "markdown",
 85 |       "source": [
 86 |         "**TRANSFORMER**\n",
 87 |         "- I will create the BERT model here. "
 88 |       ],
 89 |       "metadata": {
 90 |         "id": "6ZjshvfyaVm9"
 91 |       }
 92 |     },
 93 |     {
 94 |       "cell_type": "code",
 95 |       "source": [
 96 |         "#@ INITIALIZING BERT MODEL: RANDOM: UNCOMMENT BELOW:\n",
 97 |         "# config = BertConfig()                               # Building BERT Config.\n",
 98 |         "# model = BertModel(config)                           # Building BERT Model. \n",
 99 |         "# print(config)                                       # Inspecting Configurations. "
100 |       ],
101 |       "metadata": {
102 |         "id": "BpoRHTJmOwk7"
103 |       },
104 |       "execution_count": 5,
105 |       "outputs": []
106 |     },
107 |     {
108 |       "cell_type": "code",
109 |       "source": [
110 |         "#@ INITIALIZING PRETRAINED MODEL:\n",
111 |         "model = BertModel.from_pretrained(\"bert-base-cased\")"
112 |       ],
113 |       "metadata": {
114 |         "colab": {
115 |           "base_uri": "https://localhost:8080/"
116 |         },
117 |         "id": "OU_WmDpzb1lA",
118 |         "outputId": "bd11fd3d-4455-4237-db4a-b6a9c0ae19b6"
119 |       },
120 |       "execution_count": 8,
121 |       "outputs": [
122 |         {
123 |           "output_type": "stream",
124 |           "name": "stderr",
125 |           "text": [
126 |             "Some weights of the model checkpoint at bert-base-cased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.bias']\n",
127 |             "- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
128 |             "- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n"
129 |           ]
130 |         }
131 |       ]
132 |     },
133 |     {
134 |       "cell_type": "code",
135 |       "source": [
136 |         "#@ SAVING THE PRETRAINED BERT MODEL:\n",
137 |         "model.save_pretrained(\"./model\")"
138 |       ],
139 |       "metadata": {
140 |         "id": "3uD2bXb1dyXu"
141 |       },
142 |       "execution_count": 9,
143 |       "outputs": []
144 |     },
145 |     {
146 |       "cell_type": "markdown",
147 |       "source": [
148 |         "**TOKENIZERS:**\n",
149 |         "- The basic type of tokenizer that comes to mind is `word-based` tokenizer. It's generally very easy to set up and use with only a few rules, and it yeilds decent results. "
150 |       ],
151 |       "metadata": {
152 |         "id": "5RxMTWEggKeo"
153 |       }
154 |     },
155 |     {
156 |       "cell_type": "code",
157 |       "source": [
158 |         "#@ EXAMPLE OF WORD BASED TOKENIZATION:\n",
159 |         "tokenized_text = \"I am Thinam Tamang\".split()           # Initializing Tokenization. \n",
160 |         "print(tokenized_text)                                   # Inspecting Tokens. "
161 |       ],
162 |       "metadata": {
163 |         "colab": {
164 |           "base_uri": "https://localhost:8080/"
165 |         },
166 |         "id": "qjXZzYK8d-kv",
167 |         "outputId": "c10c7e53-409c-4287-dfbd-bfb5a5543ac4"
168 |       },
169 |       "execution_count": 10,
170 |       "outputs": [
171 |         {
172 |           "output_type": "stream",
173 |           "name": "stdout",
174 |           "text": [
175 |             "['I', 'am', 'Thinam', 'Tamang']\n"
176 |           ]
177 |         }
178 |       ]
179 |     },
180 |     {
181 |       "cell_type": "markdown",
182 |       "source": [
183 |         "**SUBWORD TOKENIZATION:**\n",
184 |         "- Subword Tokenization rely on the principle that frequently used words should not be split into smaller subwords, but rare words should be decomposed into meaningful subwords. "
185 |       ],
186 |       "metadata": {
187 |         "id": "IeeTX9DTxV8H"
188 |       }
189 |     },
190 |     {
191 |       "cell_type": "code",
192 |       "source": [
193 |         "#@ INITIALIZING TOKENIZATION:\n",
194 |         "tokenizer = BertTokenizer.from_pretrained(\"bert-base-cased\")            # Initializing Tokenizer. \n",
195 |         "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-cased\")            # Initializing Tokenizer. \n",
196 |         "tokenizer(\"Using a Transformer network is simple !\")                    # Implementing Tokenizer. "
197 |       ],
198 |       "metadata": {
199 |         "id": "dcksc1GOhPI9",
200 |         "colab": {
201 |           "base_uri": "https://localhost:8080/"
202 |         },
203 |         "outputId": "5c2e1f2c-6091-44f3-c60f-870de597e030"
204 |       },
205 |       "execution_count": 14,
206 |       "outputs": [
207 |         {
208 |           "output_type": "execute_result",
209 |           "data": {
210 |             "text/plain": [
211 |               "{'input_ids': [101, 7993, 170, 13809, 23763, 2443, 1110, 3014, 106, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}"
212 |             ]
213 |           },
214 |           "metadata": {},
215 |           "execution_count": 14
216 |         }
217 |       ]
218 |     },
219 |     {
220 |       "cell_type": "code",
221 |       "source": [
222 |         "#@ SAVING THE TOKENIZER:\n",
223 |         "tokenizer.save_pretrained(\"./tokenizer\");"
224 |       ],
225 |       "metadata": {
226 |         "id": "noY9v9P42fyc"
227 |       },
228 |       "execution_count": 15,
229 |       "outputs": []
230 |     },
231 |     {
232 |       "cell_type": "markdown",
233 |       "source": [
234 |         "**ENCODING:**\n",
235 |         "- Translating text to numbers is known as `encoding`. Encoding is done in a two-step process: the tokenization, followed by the conversion to input IDs. "
236 |       ],
237 |       "metadata": {
238 |         "id": "pxCJRDbS4ds8"
239 |       }
240 |     },
241 |     {
242 |       "cell_type": "code",
243 |       "source": [
244 |         "#@ IMPLEMENTATION OF TOKENIZATION:\n",
245 |         "sequence = \"Using a Transformer network is simple !\"                    # Initializing Sequence. \n",
246 |         "tokens = tokenizer.tokenize(sequence)                                   # Initializing Tokenization. \n",
247 |         "print(tokens)                                                           # Inspecting Tokens. "
248 |       ],
249 |       "metadata": {
250 |         "colab": {
251 |           "base_uri": "https://localhost:8080/"
252 |         },
253 |         "id": "83UaJad62v_K",
254 |         "outputId": "8a48626a-8b74-47bb-870c-e109dc7629dd"
255 |       },
256 |       "execution_count": 16,
257 |       "outputs": [
258 |         {
259 |           "output_type": "stream",
260 |           "name": "stdout",
261 |           "text": [
262 |             "['Using', 'a', 'Trans', '##former', 'network', 'is', 'simple', '!']\n"
263 |           ]
264 |         }
265 |       ]
266 |     },
267 |     {
268 |       "cell_type": "code",
269 |       "source": [
270 |         "#@ CONVERSION TO INPUT IDs:\n",
271 |         "ids = tokenizer.convert_tokens_to_ids(tokens)                           # Conversion.\n",
272 |         "print(ids)"
273 |       ],
274 |       "metadata": {
275 |         "colab": {
276 |           "base_uri": "https://localhost:8080/"
277 |         },
278 |         "id": "Na48g0t676Qr",
279 |         "outputId": "655c9b95-8a42-4132-b8e3-42b5514ece16"
280 |       },
281 |       "execution_count": 17,
282 |       "outputs": [
283 |         {
284 |           "output_type": "stream",
285 |           "name": "stdout",
286 |           "text": [
287 |             "[7993, 170, 13809, 23763, 2443, 1110, 3014, 106]\n"
288 |           ]
289 |         }
290 |       ]
291 |     },
292 |     {
293 |       "cell_type": "code",
294 |       "source": [
295 |         "#@ IMPLEMENTATION OF TOKENIZATION: \n",
296 |         "sequence =[\"I've been waiting for a HuggingFace course my whole life.\",\n",
297 |         "            \"I hate this so much!\"]                                     # Initializing Sequences. \n",
298 |         "tokens = tokenizer.tokenize(sequence)                                   # Initializing Tokenization. \n",
299 |         "ids = tokenizer.convert_tokens_to_ids(tokens)                           # Conversion.\n",
300 |         "print(ids)"
301 |       ],
302 |       "metadata": {
303 |         "colab": {
304 |           "base_uri": "https://localhost:8080/"
305 |         },
306 |         "id": "IuC_bCr98-Ck",
307 |         "outputId": "59dbbe18-3332-4e7f-f192-a99c0a6f427e"
308 |       },
309 |       "execution_count": 18,
310 |       "outputs": [
311 |         {
312 |           "output_type": "stream",
313 |           "name": "stdout",
314 |           "text": [
315 |             "[146, 112, 1396, 1151, 2613, 1111, 170, 20164, 10932, 2271, 7954, 1736, 1139, 2006, 1297, 119, 146, 4819, 1142, 1177, 1277, 106]\n"
316 |           ]
317 |         }
318 |       ]
319 |     },
320 |     {
321 |       "cell_type": "markdown",
322 |       "source": [
323 |         "**Batching** is the act of sending multiple sentences through the model, all at once. If we have only one sentence, we can just build a batch with a single sequence. "
324 |       ],
325 |       "metadata": {
326 |         "id": "68a-88dazOCC"
327 |       }
328 |     },
329 |     {
330 |       "cell_type": "code",
331 |       "source": [
332 |         "#@ IMPLEMENTATION OF TOKENIZATION: UNCOMMENT BELOW: \n",
333 |         "checkpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"          # Initialization. \n",
334 |         "tokenizer = AutoTokenizer.from_pretrained(checkpoint)                   # Initializing Tokenizer. \n",
335 |         "model = AutoModelForSequenceClassification.from_pretrained(checkpoint)  # Initializing Model.\n",
336 |         "sequence = \"I've been waiting for a Spiderman movie my whole life.\"     # Initialization.\n",
337 |         "tokens = tokenizer.tokenize(sequence)                                   # Encoding. \n",
338 |         "ids = tokenizer.convert_tokens_to_ids(tokens)                           # Converting into Token IDs. \n",
339 |         "input_ids = torch.tensor(ids)                                           # Converting into Tensors. \n",
340 |         "# model(input_ids)                                                      # Inspection. "
341 |       ],
342 |       "metadata": {
343 |         "id": "biysan099x-Z"
344 |       },
345 |       "execution_count": 20,
346 |       "outputs": []
347 |     },
348 |     {
349 |       "cell_type": "code",
350 |       "source": [
351 |         "#@ IMPLEMENTATION OF TOKENIZATION: ADDING DIMENSION: \n",
352 |         "sequence = \"I've been waiting for a Spiderman movie my whole life.\"     # Initialization.\n",
353 |         "tokens = tokenizer.tokenize(sequence)                                   # Encoding. \n",
354 |         "ids = tokenizer.convert_tokens_to_ids(tokens)                           # Converting into Token IDs. \n",
355 |         "input_ids = torch.tensor([ids])                                         # Converting into Tensors. \n",
356 |         "print(\"Input IDs:\", input_ids)\n",
357 |         "output = model(input_ids)                                               # Implementation of Model. \n",
358 |         "print(\"Logits:\", output.logits)                                         # Inspection. "
359 |       ],
360 |       "metadata": {
361 |         "colab": {
362 |           "base_uri": "https://localhost:8080/"
363 |         },
364 |         "id": "FezN5kfqxdo8",
365 |         "outputId": "f3ed1fec-b734-478e-bd5a-5a98fcd20080"
366 |       },
367 |       "execution_count": 21,
368 |       "outputs": [
369 |         {
370 |           "output_type": "stream",
371 |           "name": "stdout",
372 |           "text": [
373 |             "Input IDs: tensor([[1045, 1005, 2310, 2042, 3403, 2005, 1037, 6804, 2386, 3185, 2026, 2878,\n",
374 |             "         2166, 1012]])\n",
375 |             "Logits: tensor([[-0.4018,  0.7059]], grad_fn=<AddmmBackward0>)\n"
376 |           ]
377 |         }
378 |       ]
379 |     },
380 |     {
381 |       "cell_type": "code",
382 |       "source": [
383 |         "#@ IMPLEMENTATION OF TOKENIZATION: ADDING DIMENSION: \n",
384 |         "sequence = \"I've been waiting for a Spiderman movie my whole life.\"     # Initialization.\n",
385 |         "tokens = tokenizer.tokenize(sequence)                                   # Encoding. \n",
386 |         "ids = tokenizer.convert_tokens_to_ids(tokens)                           # Converting into Token IDs. \n",
387 |         "input_ids = torch.tensor([ids, ids])                                    # Converting into Tensors. \n",
388 |         "print(\"Input IDs:\", input_ids)\n",
389 |         "output = model(input_ids)                                               # Implementation of Model. \n",
390 |         "print(\"Logits:\", output.logits)                                         # Inspection. "
391 |       ],
392 |       "metadata": {
393 |         "colab": {
394 |           "base_uri": "https://localhost:8080/"
395 |         },
396 |         "id": "ZFVO9p6FzHRm",
397 |         "outputId": "89c7d400-8872-4c69-de05-ec46fbfef7ce"
398 |       },
399 |       "execution_count": 22,
400 |       "outputs": [
401 |         {
402 |           "output_type": "stream",
403 |           "name": "stdout",
404 |           "text": [
405 |             "Input IDs: tensor([[1045, 1005, 2310, 2042, 3403, 2005, 1037, 6804, 2386, 3185, 2026, 2878,\n",
406 |             "         2166, 1012],\n",
407 |             "        [1045, 1005, 2310, 2042, 3403, 2005, 1037, 6804, 2386, 3185, 2026, 2878,\n",
408 |             "         2166, 1012]])\n",
409 |             "Logits: tensor([[-0.4018,  0.7059],\n",
410 |             "        [-0.4018,  0.7059]], grad_fn=<AddmmBackward0>)\n"
411 |           ]
412 |         }
413 |       ]
414 |     },
415 |     {
416 |       "cell_type": "markdown",
417 |       "source": [
418 |         "**PADDING THE INPUTS:**\n",
419 |         "- **Padding** makes sure all our sentences have the same length by adding a special word called the `padding token` to the sentences with fewer values. "
420 |       ],
421 |       "metadata": {
422 |         "id": "8kMMCcPw0eTv"
423 |       }
424 |     },
425 |     {
426 |       "cell_type": "code",
427 |       "source": [
428 |         "#@ IMPLEMENTATION OF PADDING:\n",
429 |         "sequence1_ids = [[200, 200, 200]]                               # Initializing Sequence IDs.  \n",
430 |         "sequence2_ids = [[200, 200]]                                    # Initializing Sequence IDs.  \n",
431 |         "batched_ids = [[200, 200, 200], \n",
432 |         "               [200, 200, tokenizer.pad_token_id]]              # Implementation of Padding. \n",
433 |         "print(model(torch.tensor(sequence1_ids)).logits)                # Inspecting Logits. \n",
434 |         "print(model(torch.tensor(sequence2_ids)).logits)                # Inspecting Logits.\n",
435 |         "print(model(torch.tensor(batched_ids)).logits)                  # Inspecting Logits."
436 |       ],
437 |       "metadata": {
438 |         "colab": {
439 |           "base_uri": "https://localhost:8080/"
440 |         },
441 |         "id": "0eVFQRy4z_IB",
442 |         "outputId": "511b0684-5479-4f10-db54-06951e458eca"
443 |       },
444 |       "execution_count": 23,
445 |       "outputs": [
446 |         {
447 |           "output_type": "stream",
448 |           "name": "stdout",
449 |           "text": [
450 |             "tensor([[ 1.5694, -1.3895]], grad_fn=<AddmmBackward0>)\n",
451 |             "tensor([[ 0.5803, -0.4125]], grad_fn=<AddmmBackward0>)\n",
452 |             "tensor([[ 1.5694, -1.3895],\n",
453 |             "        [ 1.3374, -1.2163]], grad_fn=<AddmmBackward0>)\n"
454 |           ]
455 |         }
456 |       ]
457 |     },
458 |     {
459 |       "cell_type": "markdown",
460 |       "source": [
461 |         "**ATTENTION MASKS:**\n",
462 |         "- **Attention masks** are tensors with the same shapes as the input IDs tensors, filled with 0s and 1s: 1s indicate the corresponding tokens should be attended to, and 0s indicate the corresponding tokens should not be attended to i.e. they should be ignored by the attention layers of the model. "
463 |       ],
464 |       "metadata": {
465 |         "id": "WDwbduFU3QDn"
466 |       }
467 |     },
468 |     {
469 |       "cell_type": "code",
470 |       "source": [
471 |         "#@ IMPLEMENTATION OF PADDING AND ATTENTION MASKS: \n",
472 |         "batched_ids = [[200, 200, 200], \n",
473 |         "               [200, 200, tokenizer.pad_token_id]]              # Implementation of Padding. \n",
474 |         "attention_mask = [[1, 1, 1],\n",
475 |         "                  [1, 1, 0]]                                    # Initialization. \n",
476 |         "outputs = model(torch.tensor(batched_ids),\n",
477 |         "                attention_mask=torch.tensor(attention_mask))    # Implementation of Attention Masks. \n",
478 |         "print(outputs.logits)                                           # Inspection. "
479 |       ],
480 |       "metadata": {
481 |         "colab": {
482 |           "base_uri": "https://localhost:8080/"
483 |         },
484 |         "id": "GnqH24Li2btO",
485 |         "outputId": "2940396e-31bb-4507-d254-11c0476823fa"
486 |       },
487 |       "execution_count": 24,
488 |       "outputs": [
489 |         {
490 |           "output_type": "stream",
491 |           "name": "stdout",
492 |           "text": [
493 |             "tensor([[ 1.5694, -1.3895],\n",
494 |             "        [ 0.5803, -0.4125]], grad_fn=<AddmmBackward0>)\n"
495 |           ]
496 |         }
497 |       ]
498 |     },
499 |     {
500 |       "cell_type": "markdown",
501 |       "source": [
502 |         "**CONCLUSION:**"
503 |       ],
504 |       "metadata": {
505 |         "id": "MkN8AO80eFLv"
506 |       }
507 |     },
508 |     {
509 |       "cell_type": "code",
510 |       "source": [
511 |         "#@ IMPLEMENTATION OF TOKENIZER:\n",
512 |         "sequence = \"I've been waiting for a HuggingFace course my whole life.\"      # Text Example. \n",
513 |         "model_inputs = tokenizer(sequence)                                          # Initializing Model Inputs. \n",
514 |         "model_inputs = tokenizer(sequence, padding=\"longest\")                       # Padding Upto Maximum Length.\n",
515 |         "model_inputs = tokenizer(sequence, padding=\"max_length\")                    # Padding Upto Model Max Length. \n",
516 |         "model_inputs = tokenizer(sequence, padding=\"max_length\", max_length=8)      # Padding Upto Specified Length. \n",
517 |         "model_inputs = tokenizer(sequence, truncation=True)                         # Truncating Long Sequence. \n",
518 |         "model_inputs = tokenizer(sequence, max_length=8, truncation=True)           # Truncating Long Sequence. \n",
519 |         "model_inputs = tokenizer(sequence, padding=True, return_tensors=\"pt\")       # Return PyTorch Tensors.\n",
520 |         "model_inputs = tokenizer(sequence, padding=True, return_tensors=\"tf\")       # Return TensorFlow Tensors. \n",
521 |         "model_inputs = tokenizer(sequence, padding=True, return_tensors=\"np\")       # Return NumPy Arrays. "
522 |       ],
523 |       "metadata": {
524 |         "id": "gdO7M4Oa4tRn"
525 |       },
526 |       "execution_count": 25,
527 |       "outputs": []
528 |     },
529 |     {
530 |       "cell_type": "code",
531 |       "source": [
532 |         "#@ IMPLEMENTATION OF TOKENIZER: SPECIAL TOKENS: \n",
533 |         "sequence = \"I've been waiting for a HuggingFace course my whole life.\"      # Text Example. \n",
534 |         "model_inputs = tokenizer(sequence)                                          # Tokenization. \n",
535 |         "tokens = tokenizer.tokenize(sequence)                                       # Initializing Tokens\n",
536 |         "ids = tokenizer.convert_tokens_to_ids(tokens)                               # Initializing Input IDs. \n",
537 |         "print(ids)"
538 |       ],
539 |       "metadata": {
540 |         "colab": {
541 |           "base_uri": "https://localhost:8080/"
542 |         },
543 |         "id": "WVEkdP44gzOQ",
544 |         "outputId": "fae56745-4097-4db5-a005-3c27c7d96ec1"
545 |       },
546 |       "execution_count": 29,
547 |       "outputs": [
548 |         {
549 |           "output_type": "stream",
550 |           "name": "stdout",
551 |           "text": [
552 |             "[1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012]\n"
553 |           ]
554 |         }
555 |       ]
556 |     },
557 |     {
558 |       "cell_type": "code",
559 |       "source": [
560 |         "#@ IMPLEMENTATION OF TOKENIZER: DECODING TOKENS: \n",
561 |         "print(tokenizer.decode(model_inputs[\"input_ids\"]))\n",
562 |         "print(tokenizer.decode(ids))"
563 |       ],
564 |       "metadata": {
565 |         "colab": {
566 |           "base_uri": "https://localhost:8080/"
567 |         },
568 |         "id": "-Nqz4P3MhLRT",
569 |         "outputId": "6cf37051-31d1-4c52-81c6-e643af5bf99e"
570 |       },
571 |       "execution_count": 30,
572 |       "outputs": [
573 |         {
574 |           "output_type": "stream",
575 |           "name": "stdout",
576 |           "text": [
577 |             "[CLS] i've been waiting for a huggingface course my whole life. [SEP]\n",
578 |             "i've been waiting for a huggingface course my whole life.\n"
579 |           ]
580 |         }
581 |       ]
582 |     },
583 |     {
584 |       "cell_type": "code",
585 |       "source": [
586 |         "#@ TOKENIZER TO MODEL:\n",
587 |         "model = AutoModelForSequenceClassification.from_pretrained(checkpoint)          # Initializing Model.\n",
588 |         "sequences = [\"I've been waiting for a HuggingFace course my whole life.\",\"So\"]  # Text Example. \n",
589 |         "tokens = tokenizer(sequences, padding=True, truncation=True, \n",
590 |         "                   return_tensors=\"pt\")                                         # Initializing Tokenization. \n",
591 |         "output = model(**tokens)                                                        # Implementation of Model. "
592 |       ],
593 |       "metadata": {
594 |         "id": "jSSQfzbRhsLI"
595 |       },
596 |       "execution_count": 31,
597 |       "outputs": []
598 |     }
599 |   ]
600 | }


--------------------------------------------------------------------------------
/01. Transformer Models/TransformerModels.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |   "nbformat": 4,
  3 |   "nbformat_minor": 0,
  4 |   "metadata": {
  5 |     "colab": {
  6 |       "name": "TransformerModels.ipynb",
  7 |       "provenance": []
  8 |     },
  9 |     "kernelspec": {
 10 |       "name": "python3",
 11 |       "display_name": "Python 3"
 12 |     },
 13 |     "language_info": {
 14 |       "name": "python"
 15 |     },
 16 |     "accelerator": "GPU"
 17 |   },
 18 |   "cells": [
 19 |     {
 20 |       "cell_type": "markdown",
 21 |       "source": [
 22 |         "**INITIALIZATION:**\n",
 23 |         "- I use these three lines of code on top of my each notebooks because it will help to prevent any problems while reloading the same project. And the third line of code helps to make visualization within the notebook."
 24 |       ],
 25 |       "metadata": {
 26 |         "id": "s_rod7u5Z3Bk"
 27 |       }
 28 |     },
 29 |     {
 30 |       "cell_type": "code",
 31 |       "execution_count": 1,
 32 |       "metadata": {
 33 |         "id": "7-c8RRG5XCK7"
 34 |       },
 35 |       "outputs": [],
 36 |       "source": [
 37 |         "#@ INITIALIZATION: \n",
 38 |         "%reload_ext autoreload\n",
 39 |         "%autoreload 2\n",
 40 |         "%matplotlib inline"
 41 |       ]
 42 |     },
 43 |     {
 44 |       "cell_type": "markdown",
 45 |       "source": [
 46 |         "**LIBRARIES AND DEPENDENCIES:**\n",
 47 |         "- I have downloaded all the libraries and dependencies required for the project in one particular cell."
 48 |       ],
 49 |       "metadata": {
 50 |         "id": "n9I8VhyfaBr7"
 51 |       }
 52 |     },
 53 |     {
 54 |       "cell_type": "code",
 55 |       "source": [
 56 |         "#@ INSTALLING DEPENDENCIES: UNCOMMENT BELOW: \n",
 57 |         "# !pip install transformers[sentencepiece]"
 58 |       ],
 59 |       "metadata": {
 60 |         "id": "6xlT0WUMZ-62"
 61 |       },
 62 |       "execution_count": 3,
 63 |       "outputs": []
 64 |     },
 65 |     {
 66 |       "cell_type": "code",
 67 |       "source": [
 68 |         "#@ DOWNLOADING LIBRARIES AND DEPENDENCIES:\n",
 69 |         "import transformers\n",
 70 |         "from transformers import pipeline"
 71 |       ],
 72 |       "metadata": {
 73 |         "id": "NOL1UW_vaWtG"
 74 |       },
 75 |       "execution_count": 4,
 76 |       "outputs": []
 77 |     },
 78 |     {
 79 |       "cell_type": "markdown",
 80 |       "source": [
 81 |         "**Natural Language Processing**\n",
 82 |         "- **NLP** is a field of linguistics and machine learning focused on understanding everything related to human language. The aim of **NLP** tasks is not only to understand single words individually, but to be able to understand the context of those words. "
 83 |       ],
 84 |       "metadata": {
 85 |         "id": "CECqmYGOawxR"
 86 |       }
 87 |     },
 88 |     {
 89 |       "cell_type": "markdown",
 90 |       "source": [
 91 |         "**SENTIMENT ANALYSIS:**"
 92 |       ],
 93 |       "metadata": {
 94 |         "id": "RkznUFUOcY3F"
 95 |       }
 96 |     },
 97 |     {
 98 |       "cell_type": "code",
 99 |       "source": [
100 |         "#@ IMPLEMENTATION OF SENTIMENT ANALYSIS PIPELINE:\n",
101 |         "classifier = pipeline(\"sentiment-analysis\")                                 # Initializing Classifier Object. \n",
102 |         "classifier(\"I've started the HuggingFace course which fascinates me.\")      # Inspecting Sentiment. "
103 |       ],
104 |       "metadata": {
105 |         "colab": {
106 |           "base_uri": "https://localhost:8080/"
107 |         },
108 |         "id": "ibYIi6HBatZn",
109 |         "outputId": "467f1699-c91c-4bc6-d47d-e8dcb1805f3d"
110 |       },
111 |       "execution_count": 6,
112 |       "outputs": [
113 |         {
114 |           "output_type": "stream",
115 |           "name": "stderr",
116 |           "text": [
117 |             "No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)\n"
118 |           ]
119 |         },
120 |         {
121 |           "output_type": "execute_result",
122 |           "data": {
123 |             "text/plain": [
124 |               "[{'label': 'POSITIVE', 'score': 0.9997233748435974}]"
125 |             ]
126 |           },
127 |           "metadata": {},
128 |           "execution_count": 6
129 |         }
130 |       ]
131 |     },
132 |     {
133 |       "cell_type": "code",
134 |       "source": [
135 |         "#@ IMPLEMENTATION OF SENTIMENT ANALYSIS PIPELINE: MULTIPLE:\n",
136 |         "classifier([\"I've started the HuggingFace course which fascinates me.\",\n",
137 |         "            \"I will no longer read it's documentation.\",\n",
138 |         "            \"I think the course is awesome!\"])                              # Inspecting Sentiment. "
139 |       ],
140 |       "metadata": {
141 |         "colab": {
142 |           "base_uri": "https://localhost:8080/"
143 |         },
144 |         "id": "pYpdpAjXc4dH",
145 |         "outputId": "2f3ffbfe-456c-4914-8a2b-e72c84ace700"
146 |       },
147 |       "execution_count": 7,
148 |       "outputs": [
149 |         {
150 |           "output_type": "execute_result",
151 |           "data": {
152 |             "text/plain": [
153 |               "[{'label': 'POSITIVE', 'score': 0.9997233748435974},\n",
154 |               " {'label': 'NEGATIVE', 'score': 0.9994686245918274},\n",
155 |               " {'label': 'POSITIVE', 'score': 0.9998465776443481}]"
156 |             ]
157 |           },
158 |           "metadata": {},
159 |           "execution_count": 7
160 |         }
161 |       ]
162 |     },
163 |     {
164 |       "cell_type": "markdown",
165 |       "source": [
166 |         "**Pipelines**\n",
167 |         "- The three main steps involved when we pass some text to a `pipeline` are: \n",
168 |         "    - The text is preprocessed into a format the model can understand.\n",
169 |         "    - The preprocessed inputs are passed to the model. \n",
170 |         "    - The predictions of the model are post-processed, so we can make sense of them. "
171 |       ],
172 |       "metadata": {
173 |         "id": "bp4Yx3MffEw6"
174 |       }
175 |     },
176 |     {
177 |       "cell_type": "markdown",
178 |       "source": [
179 |         "**ZERO-SHOT CLASSIFICATION:**\n",
180 |         "- The `zero-shot-classification` pipeline is very powerful, as it allows us to specify which labels to use for the classification, so we don't have to rely on the labels of the pretrained model. This `pipeline` is called `zero-shot` because we don't need to fine-tune the model on our data to use it. It can directly return probability scores for any list of labels we want. "
181 |       ],
182 |       "metadata": {
183 |         "id": "natPtwAFgD9T"
184 |       }
185 |     },
186 |     {
187 |       "cell_type": "code",
188 |       "source": [
189 |         "#@ IMPLEMENTATION OF ZERO SHOT CLASSIFICATION PIPELINE: \n",
190 |         "classifier = pipeline(\"zero-shot-classification\")                           # Initializing Classifier Object. \n",
191 |         "classifier(\"This is a course about Transformers library\", \n",
192 |         "           candidate_labels=[\"education\", \"health\", \"programming\"])         # Inspecting Classification. "
193 |       ],
194 |       "metadata": {
195 |         "colab": {
196 |           "base_uri": "https://localhost:8080/"
197 |         },
198 |         "id": "Wt8ilH9VeW1A",
199 |         "outputId": "cd15fd0e-b9ec-4d68-bd96-64d6bfa6ca14"
200 |       },
201 |       "execution_count": 9,
202 |       "outputs": [
203 |         {
204 |           "output_type": "stream",
205 |           "name": "stderr",
206 |           "text": [
207 |             "No model was supplied, defaulted to facebook/bart-large-mnli (https://huggingface.co/facebook/bart-large-mnli)\n"
208 |           ]
209 |         },
210 |         {
211 |           "output_type": "execute_result",
212 |           "data": {
213 |             "text/plain": [
214 |               "{'labels': ['education', 'programming', 'health'],\n",
215 |               " 'scores': [0.690150260925293, 0.19084124267101288, 0.11900852620601654],\n",
216 |               " 'sequence': 'This is a course about Transformers library'}"
217 |             ]
218 |           },
219 |           "metadata": {},
220 |           "execution_count": 9
221 |         }
222 |       ]
223 |     },
224 |     {
225 |       "cell_type": "markdown",
226 |       "source": [
227 |         "**TEXT GENERATION:**\n",
228 |         "- The main idea here is that when we provide a prompt and the model will auto-complete it by generating the remaining text. "
229 |       ],
230 |       "metadata": {
231 |         "id": "HSCQbBQdylz6"
232 |       }
233 |     },
234 |     {
235 |       "cell_type": "code",
236 |       "source": [
237 |         "#@ IMPLEMENTATION OF TEXT GENERATION PIPELINE: \n",
238 |         "generator = pipeline(\"text-generation\")                                 # Initializing Generator Object. \n",
239 |         "generator(\"In this course, you will learn to\")                          # Inspecting Generated Text. "
240 |       ],
241 |       "metadata": {
242 |         "id": "GS1r-kAQirUE",
243 |         "colab": {
244 |           "base_uri": "https://localhost:8080/"
245 |         },
246 |         "outputId": "8945cadb-dcfc-44c2-fb05-3d4a8768d3f9"
247 |       },
248 |       "execution_count": 11,
249 |       "outputs": [
250 |         {
251 |           "output_type": "stream",
252 |           "name": "stderr",
253 |           "text": [
254 |             "No model was supplied, defaulted to gpt2 (https://huggingface.co/gpt2)\n",
255 |             "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
256 |           ]
257 |         },
258 |         {
259 |           "output_type": "execute_result",
260 |           "data": {
261 |             "text/plain": [
262 |               "[{'generated_text': 'In this course, you will learn to create functional web applications with AngularJS. One part of this course will be working with web components in real time, as well as web technologies in a way so that you can integrate them seamlessly. So be sure'}]"
263 |             ]
264 |           },
265 |           "metadata": {},
266 |           "execution_count": 11
267 |         }
268 |       ]
269 |     },
270 |     {
271 |       "cell_type": "code",
272 |       "source": [
273 |         "#@ IMPLEMENTATION OF TEXT GENERATION PIPELINE: \n",
274 |         "generator(\"I like python because\", num_return_sequences=2, max_length=15)     # Inspecting Generated Sequences. "
275 |       ],
276 |       "metadata": {
277 |         "colab": {
278 |           "base_uri": "https://localhost:8080/"
279 |         },
280 |         "id": "YeJPE8k5zjPL",
281 |         "outputId": "7be625bb-9c1d-4258-83ca-a28f4832ea10"
282 |       },
283 |       "execution_count": 15,
284 |       "outputs": [
285 |         {
286 |           "output_type": "stream",
287 |           "name": "stderr",
288 |           "text": [
289 |             "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
290 |           ]
291 |         },
292 |         {
293 |           "output_type": "execute_result",
294 |           "data": {
295 |             "text/plain": [
296 |               "[{'generated_text': 'I like python because of all that it makes my day.'},\n",
297 |               " {'generated_text': 'I like python because of its amazing support for object oriented, modular code.'}]"
298 |             ]
299 |           },
300 |           "metadata": {},
301 |           "execution_count": 15
302 |         }
303 |       ]
304 |     },
305 |     {
306 |       "cell_type": "code",
307 |       "source": [
308 |         "#@ IMPLEMENTATION OF TEXT GENERATION PIPELINE: DISTILGPT2:\n",
309 |         "generator = pipeline(\"text-generation\", model=\"distilgpt2\")                   # Initializing Generator Object. \n",
310 |         "generator(\"I want to be a programmer so that\", \n",
311 |         "          num_return_sequences=2, max_length=30)                              # Inspecting Generated Sequences. "
312 |       ],
313 |       "metadata": {
314 |         "colab": {
315 |           "base_uri": "https://localhost:8080/"
316 |         },
317 |         "id": "i8tULAYiz_1g",
318 |         "outputId": "682c579f-cbc4-4868-e46d-381b058715a1"
319 |       },
320 |       "execution_count": 19,
321 |       "outputs": [
322 |         {
323 |           "output_type": "stream",
324 |           "name": "stderr",
325 |           "text": [
326 |             "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
327 |           ]
328 |         },
329 |         {
330 |           "output_type": "execute_result",
331 |           "data": {
332 |             "text/plain": [
333 |               "[{'generated_text': 'I want to be a programmer so that I can enjoy the world.'},\n",
334 |               " {'generated_text': 'I want to be a programmer so that we can do the same thing. If you want to start making things, you really need to get creative,'}]"
335 |             ]
336 |           },
337 |           "metadata": {},
338 |           "execution_count": 19
339 |         }
340 |       ]
341 |     },
342 |     {
343 |       "cell_type": "markdown",
344 |       "source": [
345 |         "**MASK FILLING:**\n",
346 |         "- The idea of this task is to fill in the blanks in a given text. "
347 |       ],
348 |       "metadata": {
349 |         "id": "J0zsBIMFCwDw"
350 |       }
351 |     },
352 |     {
353 |       "cell_type": "code",
354 |       "source": [
355 |         "#@ IMPLEMENTATION OF MASK FILLING PIPELINE:\n",
356 |         "unmasker = pipeline(\"fill-mask\")                                        # Initilizing Mask Filling Object. \n",
357 |         "unmasker(\"This course will teach you all about <mask> models.\", \n",
358 |         "         top_k=2)                                                       # Inspecting Mask Token."
359 |       ],
360 |       "metadata": {
361 |         "colab": {
362 |           "base_uri": "https://localhost:8080/"
363 |         },
364 |         "id": "4c1UCi9C2kI6",
365 |         "outputId": "f5bd0254-94a7-4b03-b0d5-a61eb914e57f"
366 |       },
367 |       "execution_count": 21,
368 |       "outputs": [
369 |         {
370 |           "output_type": "stream",
371 |           "name": "stderr",
372 |           "text": [
373 |             "No model was supplied, defaulted to distilroberta-base (https://huggingface.co/distilroberta-base)\n"
374 |           ]
375 |         },
376 |         {
377 |           "output_type": "execute_result",
378 |           "data": {
379 |             "text/plain": [
380 |               "[{'score': 0.196198508143425,\n",
381 |               "  'sequence': 'This course will teach you all about mathematical models.',\n",
382 |               "  'token': 30412,\n",
383 |               "  'token_str': ' mathematical'},\n",
384 |               " {'score': 0.040527332574129105,\n",
385 |               "  'sequence': 'This course will teach you all about computational models.',\n",
386 |               "  'token': 38163,\n",
387 |               "  'token_str': ' computational'}]"
388 |             ]
389 |           },
390 |           "metadata": {},
391 |           "execution_count": 21
392 |         }
393 |       ]
394 |     },
395 |     {
396 |       "cell_type": "code",
397 |       "source": [
398 |         "#@ IMPLEMENTATION OF MASK FILLING PIPELINE:\n",
399 |         "unmasker = pipeline(\"fill-mask\", model=\"bert-base-cased\")               # Initilizing Mask Filling Object. \n",
400 |         "unmasker(\"This course will teach you all about [MASK] models.\", \n",
401 |         "         top_k=2)                                                       # Inspecting Mask Token."
402 |       ],
403 |       "metadata": {
404 |         "colab": {
405 |           "base_uri": "https://localhost:8080/"
406 |         },
407 |         "id": "kjvAPF2zHCgt",
408 |         "outputId": "b33e49ac-a579-4831-e83d-66f62665fcbe"
409 |       },
410 |       "execution_count": 23,
411 |       "outputs": [
412 |         {
413 |           "output_type": "stream",
414 |           "name": "stderr",
415 |           "text": [
416 |             "Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']\n",
417 |             "- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
418 |             "- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n"
419 |           ]
420 |         },
421 |         {
422 |           "output_type": "execute_result",
423 |           "data": {
424 |             "text/plain": [
425 |               "[{'score': 0.2596317529678345,\n",
426 |               "  'sequence': 'This course will teach you all about role models.',\n",
427 |               "  'token': 1648,\n",
428 |               "  'token_str': 'role'},\n",
429 |               " {'score': 0.0942726731300354,\n",
430 |               "  'sequence': 'This course will teach you all about the models.',\n",
431 |               "  'token': 1103,\n",
432 |               "  'token_str': 'the'}]"
433 |             ]
434 |           },
435 |           "metadata": {},
436 |           "execution_count": 23
437 |         }
438 |       ]
439 |     },
440 |     {
441 |       "cell_type": "markdown",
442 |       "source": [
443 |         "**NAMED ENTITY RECOGNITION:**\n",
444 |         "- **NER** is a task where the model has to find which parts of the input text correspond to entities such as persons, locations, or organizations. "
445 |       ],
446 |       "metadata": {
447 |         "id": "T1IuXZA-Fdrd"
448 |       }
449 |     },
450 |     {
451 |       "cell_type": "code",
452 |       "source": [
453 |         "#@ IMPLEMENTATION OF NER PIPELINE:\n",
454 |         "ner = pipeline(\"ner\", grouped_entities=True)                           # Initializing NER Object. \n",
455 |         "ner(\"I am Thinam Tamang and I am from Kathmandu, Nepal.\")              # Inspecting Entities. "
456 |       ],
457 |       "metadata": {
458 |         "colab": {
459 |           "base_uri": "https://localhost:8080/"
460 |         },
461 |         "id": "xkHKDYCPFEZT",
462 |         "outputId": "5af30e26-791a-4575-b7ca-525a481e9300"
463 |       },
464 |       "execution_count": 25,
465 |       "outputs": [
466 |         {
467 |           "output_type": "stream",
468 |           "name": "stderr",
469 |           "text": [
470 |             "No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english)\n",
471 |             "/usr/local/lib/python3.7/dist-packages/transformers/pipelines/token_classification.py:136: UserWarning: `grouped_entities` is deprecated and will be removed in version v5.0.0, defaulted to `aggregation_strategy=\"AggregationStrategy.SIMPLE\"` instead.\n",
472 |             "  f'`grouped_entities` is deprecated and will be removed in version v5.0.0, defaulted to `aggregation_strategy=\"{aggregation_strategy}\"` instead.'\n"
473 |           ]
474 |         },
475 |         {
476 |           "output_type": "execute_result",
477 |           "data": {
478 |             "text/plain": [
479 |               "[{'end': 18,\n",
480 |               "  'entity_group': 'PER',\n",
481 |               "  'score': 0.9967239,\n",
482 |               "  'start': 5,\n",
483 |               "  'word': 'Thinam Tamang'},\n",
484 |               " {'end': 42,\n",
485 |               "  'entity_group': 'LOC',\n",
486 |               "  'score': 0.9990222,\n",
487 |               "  'start': 33,\n",
488 |               "  'word': 'Kathmandu'},\n",
489 |               " {'end': 49,\n",
490 |               "  'entity_group': 'LOC',\n",
491 |               "  'score': 0.9997008,\n",
492 |               "  'start': 44,\n",
493 |               "  'word': 'Nepal'}]"
494 |             ]
495 |           },
496 |           "metadata": {},
497 |           "execution_count": 25
498 |         }
499 |       ]
500 |     },
501 |     {
502 |       "cell_type": "markdown",
503 |       "source": [
504 |         "**QUESTION ANSWERING:**\n",
505 |         "- The `question-answering` pipeline answers questions using information from a given context. "
506 |       ],
507 |       "metadata": {
508 |         "id": "G85OQxxHIBQp"
509 |       }
510 |     },
511 |     {
512 |       "cell_type": "code",
513 |       "source": [
514 |         "#@ IMPLEMENTATION OF QUESTION ANSWERING PIPELINE: \n",
515 |         "question_answerer = pipeline(\"question-answering\")                          # Initialization. \n",
516 |         "question_answerer(\n",
517 |         "    question=\"What is my name?\",\n",
518 |         "    context=\"I am Thinam from Nepal.\")                                      # Inspecting Answer. "
519 |       ],
520 |       "metadata": {
521 |         "colab": {
522 |           "base_uri": "https://localhost:8080/"
523 |         },
524 |         "id": "F4q1loPYH3pB",
525 |         "outputId": "c16effd2-731a-4a2b-d1a7-d4c8ca51a8b1"
526 |       },
527 |       "execution_count": 27,
528 |       "outputs": [
529 |         {
530 |           "output_type": "stream",
531 |           "name": "stderr",
532 |           "text": [
533 |             "No model was supplied, defaulted to distilbert-base-cased-distilled-squad (https://huggingface.co/distilbert-base-cased-distilled-squad)\n"
534 |           ]
535 |         },
536 |         {
537 |           "output_type": "execute_result",
538 |           "data": {
539 |             "text/plain": [
540 |               "{'answer': 'Thinam', 'end': 11, 'score': 0.8733119964599609, 'start': 5}"
541 |             ]
542 |           },
543 |           "metadata": {},
544 |           "execution_count": 27
545 |         }
546 |       ]
547 |     },
548 |     {
549 |       "cell_type": "markdown",
550 |       "source": [
551 |         "**SUMMARIZATION:**\n",
552 |         "- Summarization is the task of reducing a text into a shorter text while keeping all or most of the important aspects referenced in the text. "
553 |       ],
554 |       "metadata": {
555 |         "id": "uxANYti2Jzen"
556 |       }
557 |     },
558 |     {
559 |       "cell_type": "code",
560 |       "source": [
561 |         "#@ IMPLEMENTATION OF SUMMARIZATION PIPELINE:\n",
562 |         "summarizer = pipeline(\"summarization\")                                                  # Initializing Summarizer Object. \n",
563 |         "summarizer(\n",
564 |         "    \"\"\"\n",
565 |         "    America has changed dramatically during recent years. Not only has the number of \n",
566 |         "    graduates in traditional engineering disciplines such as mechanical, civil, \n",
567 |         "    electrical, chemical, and aeronautical engineering declined, but in most of \n",
568 |         "    the premier American universities engineering curricula now concentrate on \n",
569 |         "    and encourage largely the study of engineering science. As a result, there \n",
570 |         "    are declining offerings in engineering subjects dealing with infrastructure, \n",
571 |         "    the environment, and related issues, and greater concentration on high \n",
572 |         "    technology subjects, largely supporting increasingly complex scientific \n",
573 |         "    developments. While the latter is important, it should not be at the expense \n",
574 |         "    of more traditional engineering.\n",
575 |         "\n",
576 |         "    Rapidly developing economies such as China and India, as well as other \n",
577 |         "    industrial countries in Europe and Asia, continue to encourage and advance \n",
578 |         "    the teaching of engineering. Both China and India, respectively, graduate \n",
579 |         "    six and eight times as many traditional engineers as does the United States. \n",
580 |         "    Other industrial countries at minimum maintain their output, while America \n",
581 |         "    suffers an increasingly serious decline in the number of engineering graduates \n",
582 |         "    and a lack of well-educated engineers.\n",
583 |         "\"\"\"\n",
584 |         ")                                                                                         # Inspecting Summarized Text. "
585 |       ],
586 |       "metadata": {
587 |         "colab": {
588 |           "base_uri": "https://localhost:8080/"
589 |         },
590 |         "id": "2-jGKeWkJeVa",
591 |         "outputId": "eb0d8461-e28c-444c-cc2b-604c68a2bfa2"
592 |       },
593 |       "execution_count": 29,
594 |       "outputs": [
595 |         {
596 |           "output_type": "stream",
597 |           "name": "stderr",
598 |           "text": [
599 |             "No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 (https://huggingface.co/sshleifer/distilbart-cnn-12-6)\n"
600 |           ]
601 |         },
602 |         {
603 |           "output_type": "execute_result",
604 |           "data": {
605 |             "text/plain": [
606 |               "[{'summary_text': ' America has changed dramatically during recent years . The number of engineering graduates in the U.S. has declined in traditional engineering disciplines such as mechanical, civil,    electrical, chemical, and aeronautical engineering . Rapidly developing economies such as China and India continue to encourage and advance the teaching of engineering .'}]"
607 |             ]
608 |           },
609 |           "metadata": {},
610 |           "execution_count": 29
611 |         }
612 |       ]
613 |     },
614 |     {
615 |       "cell_type": "markdown",
616 |       "source": [
617 |         "**TRANSLATION:**"
618 |       ],
619 |       "metadata": {
620 |         "id": "JukLFMN6LGRq"
621 |       }
622 |     },
623 |     {
624 |       "cell_type": "code",
625 |       "source": [
626 |         "#@ IMPLEMENTATION OF TRANSLATION PIPELINE: \n",
627 |         "translator = pipeline(\"translation\", model=\"Helsinki-NLP/opus-mt-fr-en\")        # Initializing Translator Object. \n",
628 |         "translator(\"Ce cours est produit par Hugging Face.\")                            # Inspecting Translation. "
629 |       ],
630 |       "metadata": {
631 |         "colab": {
632 |           "base_uri": "https://localhost:8080/"
633 |         },
634 |         "id": "aEcD0rquKsMx",
635 |         "outputId": "03b449f7-424a-4b5b-d834-526e9006ad9b"
636 |       },
637 |       "execution_count": 31,
638 |       "outputs": [
639 |         {
640 |           "output_type": "execute_result",
641 |           "data": {
642 |             "text/plain": [
643 |               "[{'translation_text': 'This course is produced by Hugging Face.'}]"
644 |             ]
645 |           },
646 |           "metadata": {},
647 |           "execution_count": 31
648 |         }
649 |       ]
650 |     }
651 |   ]
652 | }


--------------------------------------------------------------------------------
/04. Pretrained Models/PretrainedModel.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |   "nbformat": 4,
   3 |   "nbformat_minor": 0,
   4 |   "metadata": {
   5 |     "colab": {
   6 |       "name": "PretrainedModel.ipynb",
   7 |       "provenance": []
   8 |     },
   9 |     "kernelspec": {
  10 |       "name": "python3",
  11 |       "display_name": "Python 3"
  12 |     },
  13 |     "language_info": {
  14 |       "name": "python"
  15 |     },
  16 |     "accelerator": "GPU",
  17 |     "widgets": {
  18 |       "application/vnd.jupyter.widget-state+json": {
  19 |         "957e095a73a04591bb5fab8696af061b": {
  20 |           "model_module": "@jupyter-widgets/controls",
  21 |           "model_name": "HBoxModel",
  22 |           "model_module_version": "1.5.0",
  23 |           "state": {
  24 |             "_view_name": "HBoxView",
  25 |             "_dom_classes": [],
  26 |             "_model_name": "HBoxModel",
  27 |             "_view_module": "@jupyter-widgets/controls",
  28 |             "_model_module_version": "1.5.0",
  29 |             "_view_count": null,
  30 |             "_view_module_version": "1.5.0",
  31 |             "box_style": "",
  32 |             "layout": "IPY_MODEL_bc08769e79eb4c2890dbfef83d60a9b5",
  33 |             "_model_module": "@jupyter-widgets/controls",
  34 |             "children": [
  35 |               "IPY_MODEL_97a100676e024babb68562fa29104237",
  36 |               "IPY_MODEL_727208403fe1401a9bbaa24c00874fc7",
  37 |               "IPY_MODEL_828124b3ba894e1d8d5dec057991e495"
  38 |             ]
  39 |           }
  40 |         },
  41 |         "bc08769e79eb4c2890dbfef83d60a9b5": {
  42 |           "model_module": "@jupyter-widgets/base",
  43 |           "model_name": "LayoutModel",
  44 |           "model_module_version": "1.2.0",
  45 |           "state": {
  46 |             "_view_name": "LayoutView",
  47 |             "grid_template_rows": null,
  48 |             "right": null,
  49 |             "justify_content": null,
  50 |             "_view_module": "@jupyter-widgets/base",
  51 |             "overflow": null,
  52 |             "_model_module_version": "1.2.0",
  53 |             "_view_count": null,
  54 |             "flex_flow": null,
  55 |             "width": null,
  56 |             "min_width": null,
  57 |             "border": null,
  58 |             "align_items": null,
  59 |             "bottom": null,
  60 |             "_model_module": "@jupyter-widgets/base",
  61 |             "top": null,
  62 |             "grid_column": null,
  63 |             "overflow_y": null,
  64 |             "overflow_x": null,
  65 |             "grid_auto_flow": null,
  66 |             "grid_area": null,
  67 |             "grid_template_columns": null,
  68 |             "flex": null,
  69 |             "_model_name": "LayoutModel",
  70 |             "justify_items": null,
  71 |             "grid_row": null,
  72 |             "max_height": null,
  73 |             "align_content": null,
  74 |             "visibility": null,
  75 |             "align_self": null,
  76 |             "height": null,
  77 |             "min_height": null,
  78 |             "padding": null,
  79 |             "grid_auto_rows": null,
  80 |             "grid_gap": null,
  81 |             "max_width": null,
  82 |             "order": null,
  83 |             "_view_module_version": "1.2.0",
  84 |             "grid_template_areas": null,
  85 |             "object_position": null,
  86 |             "object_fit": null,
  87 |             "grid_auto_columns": null,
  88 |             "margin": null,
  89 |             "display": null,
  90 |             "left": null
  91 |           }
  92 |         },
  93 |         "97a100676e024babb68562fa29104237": {
  94 |           "model_module": "@jupyter-widgets/controls",
  95 |           "model_name": "HTMLModel",
  96 |           "model_module_version": "1.5.0",
  97 |           "state": {
  98 |             "_view_name": "HTMLView",
  99 |             "style": "IPY_MODEL_fa9ec3fc52bc4176bb2b0e247cb0d73b",
 100 |             "_dom_classes": [],
 101 |             "description": "",
 102 |             "_model_name": "HTMLModel",
 103 |             "placeholder": "​",
 104 |             "_view_module": "@jupyter-widgets/controls",
 105 |             "_model_module_version": "1.5.0",
 106 |             "value": "100%",
 107 |             "_view_count": null,
 108 |             "_view_module_version": "1.5.0",
 109 |             "description_tooltip": null,
 110 |             "_model_module": "@jupyter-widgets/controls",
 111 |             "layout": "IPY_MODEL_31ce7c77e7f14b139d2f06840d291b02"
 112 |           }
 113 |         },
 114 |         "727208403fe1401a9bbaa24c00874fc7": {
 115 |           "model_module": "@jupyter-widgets/controls",
 116 |           "model_name": "FloatProgressModel",
 117 |           "model_module_version": "1.5.0",
 118 |           "state": {
 119 |             "_view_name": "ProgressView",
 120 |             "style": "IPY_MODEL_785e6e4979fa401883bed2871ee92efd",
 121 |             "_dom_classes": [],
 122 |             "description": "",
 123 |             "_model_name": "FloatProgressModel",
 124 |             "bar_style": "success",
 125 |             "max": 3,
 126 |             "_view_module": "@jupyter-widgets/controls",
 127 |             "_model_module_version": "1.5.0",
 128 |             "value": 3,
 129 |             "_view_count": null,
 130 |             "_view_module_version": "1.5.0",
 131 |             "orientation": "horizontal",
 132 |             "min": 0,
 133 |             "description_tooltip": null,
 134 |             "_model_module": "@jupyter-widgets/controls",
 135 |             "layout": "IPY_MODEL_113f430a406847d69ec9e635faa5bd35"
 136 |           }
 137 |         },
 138 |         "828124b3ba894e1d8d5dec057991e495": {
 139 |           "model_module": "@jupyter-widgets/controls",
 140 |           "model_name": "HTMLModel",
 141 |           "model_module_version": "1.5.0",
 142 |           "state": {
 143 |             "_view_name": "HTMLView",
 144 |             "style": "IPY_MODEL_48888bda89b84a64ac37fe485f93f37c",
 145 |             "_dom_classes": [],
 146 |             "description": "",
 147 |             "_model_name": "HTMLModel",
 148 |             "placeholder": "​",
 149 |             "_view_module": "@jupyter-widgets/controls",
 150 |             "_model_module_version": "1.5.0",
 151 |             "value": " 3/3 [00:00&lt;00:00, 58.84it/s]",
 152 |             "_view_count": null,
 153 |             "_view_module_version": "1.5.0",
 154 |             "description_tooltip": null,
 155 |             "_model_module": "@jupyter-widgets/controls",
 156 |             "layout": "IPY_MODEL_0bee05318e8147f596e3046701ff24fc"
 157 |           }
 158 |         },
 159 |         "fa9ec3fc52bc4176bb2b0e247cb0d73b": {
 160 |           "model_module": "@jupyter-widgets/controls",
 161 |           "model_name": "DescriptionStyleModel",
 162 |           "model_module_version": "1.5.0",
 163 |           "state": {
 164 |             "_view_name": "StyleView",
 165 |             "_model_name": "DescriptionStyleModel",
 166 |             "description_width": "",
 167 |             "_view_module": "@jupyter-widgets/base",
 168 |             "_model_module_version": "1.5.0",
 169 |             "_view_count": null,
 170 |             "_view_module_version": "1.2.0",
 171 |             "_model_module": "@jupyter-widgets/controls"
 172 |           }
 173 |         },
 174 |         "31ce7c77e7f14b139d2f06840d291b02": {
 175 |           "model_module": "@jupyter-widgets/base",
 176 |           "model_name": "LayoutModel",
 177 |           "model_module_version": "1.2.0",
 178 |           "state": {
 179 |             "_view_name": "LayoutView",
 180 |             "grid_template_rows": null,
 181 |             "right": null,
 182 |             "justify_content": null,
 183 |             "_view_module": "@jupyter-widgets/base",
 184 |             "overflow": null,
 185 |             "_model_module_version": "1.2.0",
 186 |             "_view_count": null,
 187 |             "flex_flow": null,
 188 |             "width": null,
 189 |             "min_width": null,
 190 |             "border": null,
 191 |             "align_items": null,
 192 |             "bottom": null,
 193 |             "_model_module": "@jupyter-widgets/base",
 194 |             "top": null,
 195 |             "grid_column": null,
 196 |             "overflow_y": null,
 197 |             "overflow_x": null,
 198 |             "grid_auto_flow": null,
 199 |             "grid_area": null,
 200 |             "grid_template_columns": null,
 201 |             "flex": null,
 202 |             "_model_name": "LayoutModel",
 203 |             "justify_items": null,
 204 |             "grid_row": null,
 205 |             "max_height": null,
 206 |             "align_content": null,
 207 |             "visibility": null,
 208 |             "align_self": null,
 209 |             "height": null,
 210 |             "min_height": null,
 211 |             "padding": null,
 212 |             "grid_auto_rows": null,
 213 |             "grid_gap": null,
 214 |             "max_width": null,
 215 |             "order": null,
 216 |             "_view_module_version": "1.2.0",
 217 |             "grid_template_areas": null,
 218 |             "object_position": null,
 219 |             "object_fit": null,
 220 |             "grid_auto_columns": null,
 221 |             "margin": null,
 222 |             "display": null,
 223 |             "left": null
 224 |           }
 225 |         },
 226 |         "785e6e4979fa401883bed2871ee92efd": {
 227 |           "model_module": "@jupyter-widgets/controls",
 228 |           "model_name": "ProgressStyleModel",
 229 |           "model_module_version": "1.5.0",
 230 |           "state": {
 231 |             "_view_name": "StyleView",
 232 |             "_model_name": "ProgressStyleModel",
 233 |             "description_width": "",
 234 |             "_view_module": "@jupyter-widgets/base",
 235 |             "_model_module_version": "1.5.0",
 236 |             "_view_count": null,
 237 |             "_view_module_version": "1.2.0",
 238 |             "bar_color": null,
 239 |             "_model_module": "@jupyter-widgets/controls"
 240 |           }
 241 |         },
 242 |         "113f430a406847d69ec9e635faa5bd35": {
 243 |           "model_module": "@jupyter-widgets/base",
 244 |           "model_name": "LayoutModel",
 245 |           "model_module_version": "1.2.0",
 246 |           "state": {
 247 |             "_view_name": "LayoutView",
 248 |             "grid_template_rows": null,
 249 |             "right": null,
 250 |             "justify_content": null,
 251 |             "_view_module": "@jupyter-widgets/base",
 252 |             "overflow": null,
 253 |             "_model_module_version": "1.2.0",
 254 |             "_view_count": null,
 255 |             "flex_flow": null,
 256 |             "width": null,
 257 |             "min_width": null,
 258 |             "border": null,
 259 |             "align_items": null,
 260 |             "bottom": null,
 261 |             "_model_module": "@jupyter-widgets/base",
 262 |             "top": null,
 263 |             "grid_column": null,
 264 |             "overflow_y": null,
 265 |             "overflow_x": null,
 266 |             "grid_auto_flow": null,
 267 |             "grid_area": null,
 268 |             "grid_template_columns": null,
 269 |             "flex": null,
 270 |             "_model_name": "LayoutModel",
 271 |             "justify_items": null,
 272 |             "grid_row": null,
 273 |             "max_height": null,
 274 |             "align_content": null,
 275 |             "visibility": null,
 276 |             "align_self": null,
 277 |             "height": null,
 278 |             "min_height": null,
 279 |             "padding": null,
 280 |             "grid_auto_rows": null,
 281 |             "grid_gap": null,
 282 |             "max_width": null,
 283 |             "order": null,
 284 |             "_view_module_version": "1.2.0",
 285 |             "grid_template_areas": null,
 286 |             "object_position": null,
 287 |             "object_fit": null,
 288 |             "grid_auto_columns": null,
 289 |             "margin": null,
 290 |             "display": null,
 291 |             "left": null
 292 |           }
 293 |         },
 294 |         "48888bda89b84a64ac37fe485f93f37c": {
 295 |           "model_module": "@jupyter-widgets/controls",
 296 |           "model_name": "DescriptionStyleModel",
 297 |           "model_module_version": "1.5.0",
 298 |           "state": {
 299 |             "_view_name": "StyleView",
 300 |             "_model_name": "DescriptionStyleModel",
 301 |             "description_width": "",
 302 |             "_view_module": "@jupyter-widgets/base",
 303 |             "_model_module_version": "1.5.0",
 304 |             "_view_count": null,
 305 |             "_view_module_version": "1.2.0",
 306 |             "_model_module": "@jupyter-widgets/controls"
 307 |           }
 308 |         },
 309 |         "0bee05318e8147f596e3046701ff24fc": {
 310 |           "model_module": "@jupyter-widgets/base",
 311 |           "model_name": "LayoutModel",
 312 |           "model_module_version": "1.2.0",
 313 |           "state": {
 314 |             "_view_name": "LayoutView",
 315 |             "grid_template_rows": null,
 316 |             "right": null,
 317 |             "justify_content": null,
 318 |             "_view_module": "@jupyter-widgets/base",
 319 |             "overflow": null,
 320 |             "_model_module_version": "1.2.0",
 321 |             "_view_count": null,
 322 |             "flex_flow": null,
 323 |             "width": null,
 324 |             "min_width": null,
 325 |             "border": null,
 326 |             "align_items": null,
 327 |             "bottom": null,
 328 |             "_model_module": "@jupyter-widgets/base",
 329 |             "top": null,
 330 |             "grid_column": null,
 331 |             "overflow_y": null,
 332 |             "overflow_x": null,
 333 |             "grid_auto_flow": null,
 334 |             "grid_area": null,
 335 |             "grid_template_columns": null,
 336 |             "flex": null,
 337 |             "_model_name": "LayoutModel",
 338 |             "justify_items": null,
 339 |             "grid_row": null,
 340 |             "max_height": null,
 341 |             "align_content": null,
 342 |             "visibility": null,
 343 |             "align_self": null,
 344 |             "height": null,
 345 |             "min_height": null,
 346 |             "padding": null,
 347 |             "grid_auto_rows": null,
 348 |             "grid_gap": null,
 349 |             "max_width": null,
 350 |             "order": null,
 351 |             "_view_module_version": "1.2.0",
 352 |             "grid_template_areas": null,
 353 |             "object_position": null,
 354 |             "object_fit": null,
 355 |             "grid_auto_columns": null,
 356 |             "margin": null,
 357 |             "display": null,
 358 |             "left": null
 359 |           }
 360 |         },
 361 |         "ba551b3375564a7b9ea59576cde50f50": {
 362 |           "model_module": "@jupyter-widgets/controls",
 363 |           "model_name": "HBoxModel",
 364 |           "model_module_version": "1.5.0",
 365 |           "state": {
 366 |             "_view_name": "HBoxView",
 367 |             "_dom_classes": [],
 368 |             "_model_name": "HBoxModel",
 369 |             "_view_module": "@jupyter-widgets/controls",
 370 |             "_model_module_version": "1.5.0",
 371 |             "_view_count": null,
 372 |             "_view_module_version": "1.5.0",
 373 |             "box_style": "",
 374 |             "layout": "IPY_MODEL_726ccfe3b3fb4c429f6745fe0c59062c",
 375 |             "_model_module": "@jupyter-widgets/controls",
 376 |             "children": [
 377 |               "IPY_MODEL_555f9571507c43c997a5a434e71fec44",
 378 |               "IPY_MODEL_c24b3ef3c3c8453b95c9f92cac448ecc",
 379 |               "IPY_MODEL_f592f52d24af46ac9b9e21c7e801206f"
 380 |             ]
 381 |           }
 382 |         },
 383 |         "726ccfe3b3fb4c429f6745fe0c59062c": {
 384 |           "model_module": "@jupyter-widgets/base",
 385 |           "model_name": "LayoutModel",
 386 |           "model_module_version": "1.2.0",
 387 |           "state": {
 388 |             "_view_name": "LayoutView",
 389 |             "grid_template_rows": null,
 390 |             "right": null,
 391 |             "justify_content": null,
 392 |             "_view_module": "@jupyter-widgets/base",
 393 |             "overflow": null,
 394 |             "_model_module_version": "1.2.0",
 395 |             "_view_count": null,
 396 |             "flex_flow": null,
 397 |             "width": null,
 398 |             "min_width": null,
 399 |             "border": null,
 400 |             "align_items": null,
 401 |             "bottom": null,
 402 |             "_model_module": "@jupyter-widgets/base",
 403 |             "top": null,
 404 |             "grid_column": null,
 405 |             "overflow_y": null,
 406 |             "overflow_x": null,
 407 |             "grid_auto_flow": null,
 408 |             "grid_area": null,
 409 |             "grid_template_columns": null,
 410 |             "flex": null,
 411 |             "_model_name": "LayoutModel",
 412 |             "justify_items": null,
 413 |             "grid_row": null,
 414 |             "max_height": null,
 415 |             "align_content": null,
 416 |             "visibility": null,
 417 |             "align_self": null,
 418 |             "height": null,
 419 |             "min_height": null,
 420 |             "padding": null,
 421 |             "grid_auto_rows": null,
 422 |             "grid_gap": null,
 423 |             "max_width": null,
 424 |             "order": null,
 425 |             "_view_module_version": "1.2.0",
 426 |             "grid_template_areas": null,
 427 |             "object_position": null,
 428 |             "object_fit": null,
 429 |             "grid_auto_columns": null,
 430 |             "margin": null,
 431 |             "display": null,
 432 |             "left": null
 433 |           }
 434 |         },
 435 |         "555f9571507c43c997a5a434e71fec44": {
 436 |           "model_module": "@jupyter-widgets/controls",
 437 |           "model_name": "HTMLModel",
 438 |           "model_module_version": "1.5.0",
 439 |           "state": {
 440 |             "_view_name": "HTMLView",
 441 |             "style": "IPY_MODEL_87aba6a79c924802be023c6601f47090",
 442 |             "_dom_classes": [],
 443 |             "description": "",
 444 |             "_model_name": "HTMLModel",
 445 |             "placeholder": "​",
 446 |             "_view_module": "@jupyter-widgets/controls",
 447 |             "_model_module_version": "1.5.0",
 448 |             "value": "100%",
 449 |             "_view_count": null,
 450 |             "_view_module_version": "1.5.0",
 451 |             "description_tooltip": null,
 452 |             "_model_module": "@jupyter-widgets/controls",
 453 |             "layout": "IPY_MODEL_3b752510457846f99d590e042db7a8c6"
 454 |           }
 455 |         },
 456 |         "c24b3ef3c3c8453b95c9f92cac448ecc": {
 457 |           "model_module": "@jupyter-widgets/controls",
 458 |           "model_name": "FloatProgressModel",
 459 |           "model_module_version": "1.5.0",
 460 |           "state": {
 461 |             "_view_name": "ProgressView",
 462 |             "style": "IPY_MODEL_dd0139ca19074aed86585e27f7c3c1c9",
 463 |             "_dom_classes": [],
 464 |             "description": "",
 465 |             "_model_name": "FloatProgressModel",
 466 |             "bar_style": "success",
 467 |             "max": 4,
 468 |             "_view_module": "@jupyter-widgets/controls",
 469 |             "_model_module_version": "1.5.0",
 470 |             "value": 4,
 471 |             "_view_count": null,
 472 |             "_view_module_version": "1.5.0",
 473 |             "orientation": "horizontal",
 474 |             "min": 0,
 475 |             "description_tooltip": null,
 476 |             "_model_module": "@jupyter-widgets/controls",
 477 |             "layout": "IPY_MODEL_c7469b58d5a64256b7a44a084e8b3ffa"
 478 |           }
 479 |         },
 480 |         "f592f52d24af46ac9b9e21c7e801206f": {
 481 |           "model_module": "@jupyter-widgets/controls",
 482 |           "model_name": "HTMLModel",
 483 |           "model_module_version": "1.5.0",
 484 |           "state": {
 485 |             "_view_name": "HTMLView",
 486 |             "style": "IPY_MODEL_f3483fa8082b437cb9ee9ccd1bc25aba",
 487 |             "_dom_classes": [],
 488 |             "description": "",
 489 |             "_model_name": "HTMLModel",
 490 |             "placeholder": "​",
 491 |             "_view_module": "@jupyter-widgets/controls",
 492 |             "_model_module_version": "1.5.0",
 493 |             "value": " 4/4 [00:00&lt;00:00,  6.91ba/s]",
 494 |             "_view_count": null,
 495 |             "_view_module_version": "1.5.0",
 496 |             "description_tooltip": null,
 497 |             "_model_module": "@jupyter-widgets/controls",
 498 |             "layout": "IPY_MODEL_89d97214d04c4d4188d7f12c72cd34d5"
 499 |           }
 500 |         },
 501 |         "87aba6a79c924802be023c6601f47090": {
 502 |           "model_module": "@jupyter-widgets/controls",
 503 |           "model_name": "DescriptionStyleModel",
 504 |           "model_module_version": "1.5.0",
 505 |           "state": {
 506 |             "_view_name": "StyleView",
 507 |             "_model_name": "DescriptionStyleModel",
 508 |             "description_width": "",
 509 |             "_view_module": "@jupyter-widgets/base",
 510 |             "_model_module_version": "1.5.0",
 511 |             "_view_count": null,
 512 |             "_view_module_version": "1.2.0",
 513 |             "_model_module": "@jupyter-widgets/controls"
 514 |           }
 515 |         },
 516 |         "3b752510457846f99d590e042db7a8c6": {
 517 |           "model_module": "@jupyter-widgets/base",
 518 |           "model_name": "LayoutModel",
 519 |           "model_module_version": "1.2.0",
 520 |           "state": {
 521 |             "_view_name": "LayoutView",
 522 |             "grid_template_rows": null,
 523 |             "right": null,
 524 |             "justify_content": null,
 525 |             "_view_module": "@jupyter-widgets/base",
 526 |             "overflow": null,
 527 |             "_model_module_version": "1.2.0",
 528 |             "_view_count": null,
 529 |             "flex_flow": null,
 530 |             "width": null,
 531 |             "min_width": null,
 532 |             "border": null,
 533 |             "align_items": null,
 534 |             "bottom": null,
 535 |             "_model_module": "@jupyter-widgets/base",
 536 |             "top": null,
 537 |             "grid_column": null,
 538 |             "overflow_y": null,
 539 |             "overflow_x": null,
 540 |             "grid_auto_flow": null,
 541 |             "grid_area": null,
 542 |             "grid_template_columns": null,
 543 |             "flex": null,
 544 |             "_model_name": "LayoutModel",
 545 |             "justify_items": null,
 546 |             "grid_row": null,
 547 |             "max_height": null,
 548 |             "align_content": null,
 549 |             "visibility": null,
 550 |             "align_self": null,
 551 |             "height": null,
 552 |             "min_height": null,
 553 |             "padding": null,
 554 |             "grid_auto_rows": null,
 555 |             "grid_gap": null,
 556 |             "max_width": null,
 557 |             "order": null,
 558 |             "_view_module_version": "1.2.0",
 559 |             "grid_template_areas": null,
 560 |             "object_position": null,
 561 |             "object_fit": null,
 562 |             "grid_auto_columns": null,
 563 |             "margin": null,
 564 |             "display": null,
 565 |             "left": null
 566 |           }
 567 |         },
 568 |         "dd0139ca19074aed86585e27f7c3c1c9": {
 569 |           "model_module": "@jupyter-widgets/controls",
 570 |           "model_name": "ProgressStyleModel",
 571 |           "model_module_version": "1.5.0",
 572 |           "state": {
 573 |             "_view_name": "StyleView",
 574 |             "_model_name": "ProgressStyleModel",
 575 |             "description_width": "",
 576 |             "_view_module": "@jupyter-widgets/base",
 577 |             "_model_module_version": "1.5.0",
 578 |             "_view_count": null,
 579 |             "_view_module_version": "1.2.0",
 580 |             "bar_color": null,
 581 |             "_model_module": "@jupyter-widgets/controls"
 582 |           }
 583 |         },
 584 |         "c7469b58d5a64256b7a44a084e8b3ffa": {
 585 |           "model_module": "@jupyter-widgets/base",
 586 |           "model_name": "LayoutModel",
 587 |           "model_module_version": "1.2.0",
 588 |           "state": {
 589 |             "_view_name": "LayoutView",
 590 |             "grid_template_rows": null,
 591 |             "right": null,
 592 |             "justify_content": null,
 593 |             "_view_module": "@jupyter-widgets/base",
 594 |             "overflow": null,
 595 |             "_model_module_version": "1.2.0",
 596 |             "_view_count": null,
 597 |             "flex_flow": null,
 598 |             "width": null,
 599 |             "min_width": null,
 600 |             "border": null,
 601 |             "align_items": null,
 602 |             "bottom": null,
 603 |             "_model_module": "@jupyter-widgets/base",
 604 |             "top": null,
 605 |             "grid_column": null,
 606 |             "overflow_y": null,
 607 |             "overflow_x": null,
 608 |             "grid_auto_flow": null,
 609 |             "grid_area": null,
 610 |             "grid_template_columns": null,
 611 |             "flex": null,
 612 |             "_model_name": "LayoutModel",
 613 |             "justify_items": null,
 614 |             "grid_row": null,
 615 |             "max_height": null,
 616 |             "align_content": null,
 617 |             "visibility": null,
 618 |             "align_self": null,
 619 |             "height": null,
 620 |             "min_height": null,
 621 |             "padding": null,
 622 |             "grid_auto_rows": null,
 623 |             "grid_gap": null,
 624 |             "max_width": null,
 625 |             "order": null,
 626 |             "_view_module_version": "1.2.0",
 627 |             "grid_template_areas": null,
 628 |             "object_position": null,
 629 |             "object_fit": null,
 630 |             "grid_auto_columns": null,
 631 |             "margin": null,
 632 |             "display": null,
 633 |             "left": null
 634 |           }
 635 |         },
 636 |         "f3483fa8082b437cb9ee9ccd1bc25aba": {
 637 |           "model_module": "@jupyter-widgets/controls",
 638 |           "model_name": "DescriptionStyleModel",
 639 |           "model_module_version": "1.5.0",
 640 |           "state": {
 641 |             "_view_name": "StyleView",
 642 |             "_model_name": "DescriptionStyleModel",
 643 |             "description_width": "",
 644 |             "_view_module": "@jupyter-widgets/base",
 645 |             "_model_module_version": "1.5.0",
 646 |             "_view_count": null,
 647 |             "_view_module_version": "1.2.0",
 648 |             "_model_module": "@jupyter-widgets/controls"
 649 |           }
 650 |         },
 651 |         "89d97214d04c4d4188d7f12c72cd34d5": {
 652 |           "model_module": "@jupyter-widgets/base",
 653 |           "model_name": "LayoutModel",
 654 |           "model_module_version": "1.2.0",
 655 |           "state": {
 656 |             "_view_name": "LayoutView",
 657 |             "grid_template_rows": null,
 658 |             "right": null,
 659 |             "justify_content": null,
 660 |             "_view_module": "@jupyter-widgets/base",
 661 |             "overflow": null,
 662 |             "_model_module_version": "1.2.0",
 663 |             "_view_count": null,
 664 |             "flex_flow": null,
 665 |             "width": null,
 666 |             "min_width": null,
 667 |             "border": null,
 668 |             "align_items": null,
 669 |             "bottom": null,
 670 |             "_model_module": "@jupyter-widgets/base",
 671 |             "top": null,
 672 |             "grid_column": null,
 673 |             "overflow_y": null,
 674 |             "overflow_x": null,
 675 |             "grid_auto_flow": null,
 676 |             "grid_area": null,
 677 |             "grid_template_columns": null,
 678 |             "flex": null,
 679 |             "_model_name": "LayoutModel",
 680 |             "justify_items": null,
 681 |             "grid_row": null,
 682 |             "max_height": null,
 683 |             "align_content": null,
 684 |             "visibility": null,
 685 |             "align_self": null,
 686 |             "height": null,
 687 |             "min_height": null,
 688 |             "padding": null,
 689 |             "grid_auto_rows": null,
 690 |             "grid_gap": null,
 691 |             "max_width": null,
 692 |             "order": null,
 693 |             "_view_module_version": "1.2.0",
 694 |             "grid_template_areas": null,
 695 |             "object_position": null,
 696 |             "object_fit": null,
 697 |             "grid_auto_columns": null,
 698 |             "margin": null,
 699 |             "display": null,
 700 |             "left": null
 701 |           }
 702 |         }
 703 |       }
 704 |     }
 705 |   },
 706 |   "cells": [
 707 |     {
 708 |       "cell_type": "markdown",
 709 |       "source": [
 710 |         "**INITIALIZATION:**\n",
 711 |         "- I use these three lines of code on top of my each notebooks because it will help to prevent any problems while reloading the same project. And the third line of code helps to make visualization within the notebook."
 712 |       ],
 713 |       "metadata": {
 714 |         "id": "NTM4eJtYoUrW"
 715 |       }
 716 |     },
 717 |     {
 718 |       "cell_type": "code",
 719 |       "execution_count": 1,
 720 |       "metadata": {
 721 |         "id": "vrzwsW9LnzLu"
 722 |       },
 723 |       "outputs": [],
 724 |       "source": [
 725 |         "#@ INITIALIZATION: \n",
 726 |         "%reload_ext autoreload\n",
 727 |         "%autoreload 2\n",
 728 |         "%matplotlib inline"
 729 |       ]
 730 |     },
 731 |     {
 732 |       "cell_type": "markdown",
 733 |       "source": [
 734 |         "**LIBRARIES AND DEPENDENCIES:**\n",
 735 |         "- I have downloaded all the libraries and dependencies required for the project in one particular cell."
 736 |       ],
 737 |       "metadata": {
 738 |         "id": "TeFko4nqodm1"
 739 |       }
 740 |     },
 741 |     {
 742 |       "cell_type": "code",
 743 |       "source": [
 744 |         "#@ INSTALLING DEPENDENCIES: UNCOMMENT BELOW: \n",
 745 |         "# !pip install datasets transformers[sentencepiece]"
 746 |       ],
 747 |       "metadata": {
 748 |         "id": "Alj-bR9rogkb"
 749 |       },
 750 |       "execution_count": 3,
 751 |       "outputs": []
 752 |     },
 753 |     {
 754 |       "cell_type": "code",
 755 |       "source": [
 756 |         "#@ DOWNLOADING LIBRARIES AND DEPENDENCIES:\n",
 757 |         "import numpy as np\n",
 758 |         "import torch\n",
 759 |         "import transformers\n",
 760 |         "import datasets\n",
 761 |         "from datasets import load_dataset\n",
 762 |         "from datasets import load_metric\n",
 763 |         "from transformers import AdamW\n",
 764 |         "from transformers import AutoTokenizer\n",
 765 |         "from transformers import AutoModelForSequenceClassification\n",
 766 |         "from transformers import DataCollatorWithPadding\n",
 767 |         "from transformers import TrainingArguments\n",
 768 |         "from transformers import Trainer\n",
 769 |         "\n",
 770 |         "#@ IGNORING WARNINGS: \n",
 771 |         "import warnings\n",
 772 |         "warnings.filterwarnings(\"ignore\")"
 773 |       ],
 774 |       "metadata": {
 775 |         "id": "xvPCBSX5oluM"
 776 |       },
 777 |       "execution_count": 22,
 778 |       "outputs": []
 779 |     },
 780 |     {
 781 |       "cell_type": "markdown",
 782 |       "source": [
 783 |         "**PROCESSING THE DATA:**"
 784 |       ],
 785 |       "metadata": {
 786 |         "id": "3MupS6cJpHLb"
 787 |       }
 788 |     },
 789 |     {
 790 |       "cell_type": "code",
 791 |       "source": [
 792 |         "#@ PROCESSING THE DATA:\n",
 793 |         "checkpoint = \"bert-base-uncased\"                                        # Initialization. \n",
 794 |         "tokenizer = AutoTokenizer.from_pretrained(checkpoint)                   # Initializing Tokenizer. \n",
 795 |         "model = AutoModelForSequenceClassification.from_pretrained(checkpoint)  # Initializing Sequence Model. \n",
 796 |         "sequences = [\n",
 797 |         "        \"I've been waiting for a HuggingFace course my whole life\",\n",
 798 |         "        \"This course is amazing!\"\n",
 799 |         "]                                                                       # Text Sequences. \n",
 800 |         "batch = tokenizer(sequences, padding=True, truncation=True, \n",
 801 |         "                  return_tensors=\"pt\")                                  # Getting Batch of Tensors. \n",
 802 |         "batch[\"labels\"] = torch.tensor([1, 1])                                  # Initializing Labels. \n",
 803 |         "\n",
 804 |         "#@ INITIALIZING MODEL TRAINING PARAMETERS:\n",
 805 |         "optimizer = AdamW(model.parameters())                                   # Initializing Optimizer. \n",
 806 |         "loss = model(**batch).loss                                              # Initializing Loss. \n",
 807 |         "loss.backward()                                                         # Initializing Back Propagation. \n",
 808 |         "optimizer.step()                                                        # Updating Parameters. "
 809 |       ],
 810 |       "metadata": {
 811 |         "colab": {
 812 |           "base_uri": "https://localhost:8080/"
 813 |         },
 814 |         "id": "V45HKkHQpD61",
 815 |         "outputId": "19fc89cb-50df-4a6d-b6aa-c2d54f87e76e"
 816 |       },
 817 |       "execution_count": 6,
 818 |       "outputs": [
 819 |         {
 820 |           "output_type": "stream",
 821 |           "name": "stderr",
 822 |           "text": [
 823 |             "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias']\n",
 824 |             "- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
 825 |             "- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
 826 |             "Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']\n",
 827 |             "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
 828 |           ]
 829 |         }
 830 |       ]
 831 |     },
 832 |     {
 833 |       "cell_type": "markdown",
 834 |       "source": [
 835 |         "**GETTING THE DATASET:**\n",
 836 |         "- In this notebook, we will use MRPC (Microsoft Research Paraphrase Corpus) dataset introduced by William B. Dolan and Chris Brockett. The dataset consist of 5801 pairs of sentences, with a label indicating if they are paraphrases or not. It is one of the 10 datasets composing the GLUE benchmark, which is an academic benchmark that is used to measure the performance of ML models across 10 different text classification tasks. "
 837 |       ],
 838 |       "metadata": {
 839 |         "id": "h-BFfURztEfK"
 840 |       }
 841 |     },
 842 |     {
 843 |       "cell_type": "code",
 844 |       "source": [
 845 |         "#@ GETTING THE DATASET:\n",
 846 |         "raw_datasets = load_dataset(\"glue\", \"mrpc\")             # Getting MRPC Dataset. \n",
 847 |         "raw_datasets                                            # Inspecting Dataset. "
 848 |       ],
 849 |       "metadata": {
 850 |         "colab": {
 851 |           "base_uri": "https://localhost:8080/",
 852 |           "height": 309,
 853 |           "referenced_widgets": [
 854 |             "957e095a73a04591bb5fab8696af061b",
 855 |             "bc08769e79eb4c2890dbfef83d60a9b5",
 856 |             "97a100676e024babb68562fa29104237",
 857 |             "727208403fe1401a9bbaa24c00874fc7",
 858 |             "828124b3ba894e1d8d5dec057991e495",
 859 |             "fa9ec3fc52bc4176bb2b0e247cb0d73b",
 860 |             "31ce7c77e7f14b139d2f06840d291b02",
 861 |             "785e6e4979fa401883bed2871ee92efd",
 862 |             "113f430a406847d69ec9e635faa5bd35",
 863 |             "48888bda89b84a64ac37fe485f93f37c",
 864 |             "0bee05318e8147f596e3046701ff24fc"
 865 |           ]
 866 |         },
 867 |         "id": "TVvxjHqhsbcx",
 868 |         "outputId": "c9f700eb-66ba-4e7c-964e-30525a858706"
 869 |       },
 870 |       "execution_count": 8,
 871 |       "outputs": [
 872 |         {
 873 |           "output_type": "stream",
 874 |           "name": "stderr",
 875 |           "text": [
 876 |             "Reusing dataset glue (/root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)\n"
 877 |           ]
 878 |         },
 879 |         {
 880 |           "output_type": "display_data",
 881 |           "data": {
 882 |             "application/vnd.jupyter.widget-view+json": {
 883 |               "model_id": "957e095a73a04591bb5fab8696af061b",
 884 |               "version_minor": 0,
 885 |               "version_major": 2
 886 |             },
 887 |             "text/plain": [
 888 |               "  0%|          | 0/3 [00:00<?, ?it/s]"
 889 |             ]
 890 |           },
 891 |           "metadata": {}
 892 |         },
 893 |         {
 894 |           "output_type": "execute_result",
 895 |           "data": {
 896 |             "text/plain": [
 897 |               "DatasetDict({\n",
 898 |               "    train: Dataset({\n",
 899 |               "        features: ['sentence1', 'sentence2', 'label', 'idx'],\n",
 900 |               "        num_rows: 3668\n",
 901 |               "    })\n",
 902 |               "    validation: Dataset({\n",
 903 |               "        features: ['sentence1', 'sentence2', 'label', 'idx'],\n",
 904 |               "        num_rows: 408\n",
 905 |               "    })\n",
 906 |               "    test: Dataset({\n",
 907 |               "        features: ['sentence1', 'sentence2', 'label', 'idx'],\n",
 908 |               "        num_rows: 1725\n",
 909 |               "    })\n",
 910 |               "})"
 911 |             ]
 912 |           },
 913 |           "metadata": {},
 914 |           "execution_count": 8
 915 |         }
 916 |       ]
 917 |     },
 918 |     {
 919 |       "cell_type": "code",
 920 |       "source": [
 921 |         "#@ INSPECTING TRAINING DATASET: \n",
 922 |         "raw_train_dataset = raw_datasets[\"train\"]               # Training Dataset. \n",
 923 |         "raw_train_dataset[15]                                    # Inspection. "
 924 |       ],
 925 |       "metadata": {
 926 |         "colab": {
 927 |           "base_uri": "https://localhost:8080/"
 928 |         },
 929 |         "id": "YFr08GBQ0xtW",
 930 |         "outputId": "e67d005c-cba1-43d2-b921-51e269784f4c"
 931 |       },
 932 |       "execution_count": 9,
 933 |       "outputs": [
 934 |         {
 935 |           "output_type": "execute_result",
 936 |           "data": {
 937 |             "text/plain": [
 938 |               "{'idx': 16,\n",
 939 |               " 'label': 0,\n",
 940 |               " 'sentence1': 'Rudder was most recently senior vice president for the Developer & Platform Evangelism Business .',\n",
 941 |               " 'sentence2': 'Senior Vice President Eric Rudder , formerly head of the Developer and Platform Evangelism unit , will lead the new entity .'}"
 942 |             ]
 943 |           },
 944 |           "metadata": {},
 945 |           "execution_count": 9
 946 |         }
 947 |       ]
 948 |     },
 949 |     {
 950 |       "cell_type": "code",
 951 |       "source": [
 952 |         "#@ INSPECTING TYPE OF COLUMNS:\n",
 953 |         "raw_train_dataset.features"
 954 |       ],
 955 |       "metadata": {
 956 |         "colab": {
 957 |           "base_uri": "https://localhost:8080/"
 958 |         },
 959 |         "id": "AGVarErY3vIe",
 960 |         "outputId": "7952f95d-c049-4fbb-93d2-232438c24bde"
 961 |       },
 962 |       "execution_count": 10,
 963 |       "outputs": [
 964 |         {
 965 |           "output_type": "execute_result",
 966 |           "data": {
 967 |             "text/plain": [
 968 |               "{'idx': Value(dtype='int32', id=None),\n",
 969 |               " 'label': ClassLabel(num_classes=2, names=['not_equivalent', 'equivalent'], id=None),\n",
 970 |               " 'sentence1': Value(dtype='string', id=None),\n",
 971 |               " 'sentence2': Value(dtype='string', id=None)}"
 972 |             ]
 973 |           },
 974 |           "metadata": {},
 975 |           "execution_count": 10
 976 |         }
 977 |       ]
 978 |     },
 979 |     {
 980 |       "cell_type": "markdown",
 981 |       "source": [
 982 |         "**PREPROCESSING THE DATASET:**\n",
 983 |         "- To preprocess the dataset, we will convert the text to numbers the model can make sense of, with the help of tokenizer. "
 984 |       ],
 985 |       "metadata": {
 986 |         "id": "BiiE5nTe5nZo"
 987 |       }
 988 |     },
 989 |     {
 990 |       "cell_type": "code",
 991 |       "source": [
 992 |         "#@ INITIALIZING TOKENIZATION:\n",
 993 |         "tokenized_1 = tokenizer(raw_datasets[\"train\"][\"sentence1\"])                  # Tokenization. \n",
 994 |         "tokenized_2 = tokenizer(raw_datasets[\"train\"][\"sentence2\"])                  # Tokenization. "
 995 |       ],
 996 |       "metadata": {
 997 |         "id": "Fz6lI8xN4GQ9"
 998 |       },
 999 |       "execution_count": 11,
1000 |       "outputs": []
1001 |     },
1002 |     {
1003 |       "cell_type": "code",
1004 |       "source": [
1005 |         "#@ IMPLEMENTING TOKENIZER:\n",
1006 |         "inputs = tokenizer(\"This is a first sentence\", \n",
1007 |         "                   \"This is a second sentence\")                              # Tokenization. \n",
1008 |         "inputs"
1009 |       ],
1010 |       "metadata": {
1011 |         "colab": {
1012 |           "base_uri": "https://localhost:8080/"
1013 |         },
1014 |         "id": "Tnz8kEcyjPH8",
1015 |         "outputId": "a0315c47-0bdc-430a-84ab-057919e768d2"
1016 |       },
1017 |       "execution_count": 12,
1018 |       "outputs": [
1019 |         {
1020 |           "output_type": "execute_result",
1021 |           "data": {
1022 |             "text/plain": [
1023 |               "{'input_ids': [101, 2023, 2003, 1037, 2034, 6251, 102, 2023, 2003, 1037, 2117, 6251, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}"
1024 |             ]
1025 |           },
1026 |           "metadata": {},
1027 |           "execution_count": 12
1028 |         }
1029 |       ]
1030 |     },
1031 |     {
1032 |       "cell_type": "code",
1033 |       "source": [
1034 |         "#@ DEFINING TOKENIZATION FUNCTION:\n",
1035 |         "def tokenize_function(example):                                               # Defining Function. \n",
1036 |         "    return tokenizer(example[\"sentence1\"], example[\"sentence2\"], \n",
1037 |         "                     truncation=True)                                         # Implementation of Tokenizer. \n",
1038 |         "\n",
1039 |         "#@ IMPLEMENTATION OF FUNCTION:\n",
1040 |         "tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)        # Initializing Tokenization. \n",
1041 |         "tokenized_datasets                                                            # Inspection. "
1042 |       ],
1043 |       "metadata": {
1044 |         "colab": {
1045 |           "base_uri": "https://localhost:8080/",
1046 |           "height": 347,
1047 |           "referenced_widgets": [
1048 |             "ba551b3375564a7b9ea59576cde50f50",
1049 |             "726ccfe3b3fb4c429f6745fe0c59062c",
1050 |             "555f9571507c43c997a5a434e71fec44",
1051 |             "c24b3ef3c3c8453b95c9f92cac448ecc",
1052 |             "f592f52d24af46ac9b9e21c7e801206f",
1053 |             "87aba6a79c924802be023c6601f47090",
1054 |             "3b752510457846f99d590e042db7a8c6",
1055 |             "dd0139ca19074aed86585e27f7c3c1c9",
1056 |             "c7469b58d5a64256b7a44a084e8b3ffa",
1057 |             "f3483fa8082b437cb9ee9ccd1bc25aba",
1058 |             "89d97214d04c4d4188d7f12c72cd34d5"
1059 |           ]
1060 |         },
1061 |         "id": "YORtBNZOjgdR",
1062 |         "outputId": "8e35bc74-c445-4ab1-ed9f-34b341b03ddb"
1063 |       },
1064 |       "execution_count": 14,
1065 |       "outputs": [
1066 |         {
1067 |           "output_type": "display_data",
1068 |           "data": {
1069 |             "application/vnd.jupyter.widget-view+json": {
1070 |               "model_id": "ba551b3375564a7b9ea59576cde50f50",
1071 |               "version_minor": 0,
1072 |               "version_major": 2
1073 |             },
1074 |             "text/plain": [
1075 |               "  0%|          | 0/4 [00:00<?, ?ba/s]"
1076 |             ]
1077 |           },
1078 |           "metadata": {}
1079 |         },
1080 |         {
1081 |           "output_type": "stream",
1082 |           "name": "stderr",
1083 |           "text": [
1084 |             "Loading cached processed dataset at /root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-c44a7f7bf29f5b58.arrow\n",
1085 |             "Loading cached processed dataset at /root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-3e01f4e4c8ccd211.arrow\n"
1086 |           ]
1087 |         },
1088 |         {
1089 |           "output_type": "execute_result",
1090 |           "data": {
1091 |             "text/plain": [
1092 |               "DatasetDict({\n",
1093 |               "    train: Dataset({\n",
1094 |               "        features: ['sentence1', 'sentence2', 'label', 'idx', 'input_ids', 'token_type_ids', 'attention_mask'],\n",
1095 |               "        num_rows: 3668\n",
1096 |               "    })\n",
1097 |               "    validation: Dataset({\n",
1098 |               "        features: ['sentence1', 'sentence2', 'label', 'idx', 'input_ids', 'token_type_ids', 'attention_mask'],\n",
1099 |               "        num_rows: 408\n",
1100 |               "    })\n",
1101 |               "    test: Dataset({\n",
1102 |               "        features: ['sentence1', 'sentence2', 'label', 'idx', 'input_ids', 'token_type_ids', 'attention_mask'],\n",
1103 |               "        num_rows: 1725\n",
1104 |               "    })\n",
1105 |               "})"
1106 |             ]
1107 |           },
1108 |           "metadata": {},
1109 |           "execution_count": 14
1110 |         }
1111 |       ]
1112 |     },
1113 |     {
1114 |       "cell_type": "markdown",
1115 |       "source": [
1116 |         "**DYNAMIC PADDING**\n",
1117 |         "- The function that is responsible for putting together samples inside a batch is called a `collate function`. **Dynamic Padding** means the samples in the batch should all be padded to the maximum length inside the batch. "
1118 |       ],
1119 |       "metadata": {
1120 |         "id": "W73wiTROuhkF"
1121 |       }
1122 |     },
1123 |     {
1124 |       "cell_type": "code",
1125 |       "source": [
1126 |         "#@ IMPLEMENTATION OF COLLATOR FUNCTION: \n",
1127 |         "data_collator = DataCollatorWithPadding(tokenizer=tokenizer)    # Initialization. \n",
1128 |         "\n",
1129 |         "#@ IMPLEMENTATION OF COLLATOR FUNCTION: INITIALIZATION: \n",
1130 |         "samples = tokenized_datasets[\"train\"][:8]  \n",
1131 |         "samples = {k:v for k,v in samples.items() if \n",
1132 |         "           k not in [\"idx\", \"sentence1\", \"sentence2\"]}\n",
1133 |         "[len(x) for x in samples[\"input_ids\"]]"
1134 |       ],
1135 |       "metadata": {
1136 |         "colab": {
1137 |           "base_uri": "https://localhost:8080/"
1138 |         },
1139 |         "id": "SZrfuz8DmD7l",
1140 |         "outputId": "a6353f97-dfb1-4bda-8610-ce5b38c678a0"
1141 |       },
1142 |       "execution_count": 15,
1143 |       "outputs": [
1144 |         {
1145 |           "output_type": "execute_result",
1146 |           "data": {
1147 |             "text/plain": [
1148 |               "[50, 59, 47, 67, 59, 50, 62, 32]"
1149 |             ]
1150 |           },
1151 |           "metadata": {},
1152 |           "execution_count": 15
1153 |         }
1154 |       ]
1155 |     },
1156 |     {
1157 |       "cell_type": "code",
1158 |       "source": [
1159 |         "#@ IMPLEMENTATION OF COLLATOR FUNCTION: \n",
1160 |         "batch = data_collator(samples)                  # Implementation. \n",
1161 |         "{k:v.shape for k, v in batch.items()}           # Inspection. "
1162 |       ],
1163 |       "metadata": {
1164 |         "colab": {
1165 |           "base_uri": "https://localhost:8080/"
1166 |         },
1167 |         "id": "ddpwArvGwCjW",
1168 |         "outputId": "7f75509f-be96-4c99-e881-e9c6aa531797"
1169 |       },
1170 |       "execution_count": 16,
1171 |       "outputs": [
1172 |         {
1173 |           "output_type": "execute_result",
1174 |           "data": {
1175 |             "text/plain": [
1176 |               "{'attention_mask': torch.Size([8, 67]),\n",
1177 |               " 'input_ids': torch.Size([8, 67]),\n",
1178 |               " 'labels': torch.Size([8]),\n",
1179 |               " 'token_type_ids': torch.Size([8, 67])}"
1180 |             ]
1181 |           },
1182 |           "metadata": {},
1183 |           "execution_count": 16
1184 |         }
1185 |       ]
1186 |     },
1187 |     {
1188 |       "cell_type": "markdown",
1189 |       "source": [
1190 |         "**TRAINING:**"
1191 |       ],
1192 |       "metadata": {
1193 |         "id": "4W_RLfRwriWC"
1194 |       }
1195 |     },
1196 |     {
1197 |       "cell_type": "code",
1198 |       "source": [
1199 |         "#@ INITIALIZING TRAINING:\n",
1200 |         "training_args = TrainingArguments(checkpoint)                                           # Initializing Training Arguments. \n",
1201 |         "model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)    # Initializing Classification Model. \n",
1202 |         "\n",
1203 |         "\n",
1204 |         "#@ INITIALIZING TRAINER:\n",
1205 |         "trainer = Trainer(model, training_args, train_dataset=tokenized_datasets[\"train\"],\n",
1206 |         "                  eval_dataset=tokenized_datasets[\"validation\"], \n",
1207 |         "                  data_collator=data_collator, tokenizer=tokenizer)                     # Initializing Trainer.\n",
1208 |         "trainer.train()                                                                         # Initializing Training.  "
1209 |       ],
1210 |       "metadata": {
1211 |         "colab": {
1212 |           "base_uri": "https://localhost:8080/",
1213 |           "height": 662
1214 |         },
1215 |         "id": "JIeHvtSMrR6R",
1216 |         "outputId": "ad36abcb-f740-4579-917e-0e511a06a135"
1217 |       },
1218 |       "execution_count": 17,
1219 |       "outputs": [
1220 |         {
1221 |           "output_type": "stream",
1222 |           "name": "stderr",
1223 |           "text": [
1224 |             "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias']\n",
1225 |             "- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
1226 |             "- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
1227 |             "Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']\n",
1228 |             "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n",
1229 |             "The following columns in the training set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.\n",
1230 |             "***** Running training *****\n",
1231 |             "  Num examples = 3668\n",
1232 |             "  Num Epochs = 3\n",
1233 |             "  Instantaneous batch size per device = 8\n",
1234 |             "  Total train batch size (w. parallel, distributed & accumulation) = 8\n",
1235 |             "  Gradient Accumulation steps = 1\n",
1236 |             "  Total optimization steps = 1377\n"
1237 |           ]
1238 |         },
1239 |         {
1240 |           "output_type": "display_data",
1241 |           "data": {
1242 |             "text/html": [
1243 |               "\n",
1244 |               "    <div>\n",
1245 |               "      \n",
1246 |               "      <progress value='1377' max='1377' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
1247 |               "      [1377/1377 06:37, Epoch 3/3]\n",
1248 |               "    </div>\n",
1249 |               "    <table border=\"1\" class=\"dataframe\">\n",
1250 |               "  <thead>\n",
1251 |               " <tr style=\"text-align: left;\">\n",
1252 |               "      <th>Step</th>\n",
1253 |               "      <th>Training Loss</th>\n",
1254 |               "    </tr>\n",
1255 |               "  </thead>\n",
1256 |               "  <tbody>\n",
1257 |               "    <tr>\n",
1258 |               "      <td>500</td>\n",
1259 |               "      <td>0.562600</td>\n",
1260 |               "    </tr>\n",
1261 |               "    <tr>\n",
1262 |               "      <td>1000</td>\n",
1263 |               "      <td>0.366000</td>\n",
1264 |               "    </tr>\n",
1265 |               "  </tbody>\n",
1266 |               "</table><p>"
1267 |             ],
1268 |             "text/plain": [
1269 |               "<IPython.core.display.HTML object>"
1270 |             ]
1271 |           },
1272 |           "metadata": {}
1273 |         },
1274 |         {
1275 |           "output_type": "stream",
1276 |           "name": "stderr",
1277 |           "text": [
1278 |             "Saving model checkpoint to bert-base-uncased/checkpoint-500\n",
1279 |             "Configuration saved in bert-base-uncased/checkpoint-500/config.json\n",
1280 |             "Model weights saved in bert-base-uncased/checkpoint-500/pytorch_model.bin\n",
1281 |             "tokenizer config file saved in bert-base-uncased/checkpoint-500/tokenizer_config.json\n",
1282 |             "Special tokens file saved in bert-base-uncased/checkpoint-500/special_tokens_map.json\n",
1283 |             "Saving model checkpoint to bert-base-uncased/checkpoint-1000\n",
1284 |             "Configuration saved in bert-base-uncased/checkpoint-1000/config.json\n",
1285 |             "Model weights saved in bert-base-uncased/checkpoint-1000/pytorch_model.bin\n",
1286 |             "tokenizer config file saved in bert-base-uncased/checkpoint-1000/tokenizer_config.json\n",
1287 |             "Special tokens file saved in bert-base-uncased/checkpoint-1000/special_tokens_map.json\n",
1288 |             "\n",
1289 |             "\n",
1290 |             "Training completed. Do not forget to share your model on huggingface.co/models =)\n",
1291 |             "\n",
1292 |             "\n"
1293 |           ]
1294 |         },
1295 |         {
1296 |           "output_type": "execute_result",
1297 |           "data": {
1298 |             "text/plain": [
1299 |               "TrainOutput(global_step=1377, training_loss=0.40232000856531125, metrics={'train_runtime': 397.7197, 'train_samples_per_second': 27.668, 'train_steps_per_second': 3.462, 'total_flos': 405470580750720.0, 'train_loss': 0.40232000856531125, 'epoch': 3.0})"
1300 |             ]
1301 |           },
1302 |           "metadata": {},
1303 |           "execution_count": 17
1304 |         }
1305 |       ]
1306 |     },
1307 |     {
1308 |       "cell_type": "markdown",
1309 |       "source": [
1310 |         "**EVALUATION:**"
1311 |       ],
1312 |       "metadata": {
1313 |         "id": "6Gg-pF6aDLA1"
1314 |       }
1315 |     },
1316 |     {
1317 |       "cell_type": "code",
1318 |       "source": [
1319 |         "#@ INITIALIZING MODEL EVALUATION:\n",
1320 |         "predictions = trainer.predict(tokenized_datasets[\"validation\"])         # Getting Predictions. \n",
1321 |         "print(predictions.predictions.shape, predictions.label_ids.shape)       # Inspecting Predictions. "
1322 |       ],
1323 |       "metadata": {
1324 |         "id": "8ZJnOI-xuEX1",
1325 |         "colab": {
1326 |           "base_uri": "https://localhost:8080/",
1327 |           "height": 144
1328 |         },
1329 |         "outputId": "d8518103-a3ae-4813-c50e-8243f7cc6164"
1330 |       },
1331 |       "execution_count": 19,
1332 |       "outputs": [
1333 |         {
1334 |           "output_type": "stream",
1335 |           "name": "stderr",
1336 |           "text": [
1337 |             "The following columns in the test set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.\n",
1338 |             "***** Running Prediction *****\n",
1339 |             "  Num examples = 408\n",
1340 |             "  Batch size = 8\n"
1341 |           ]
1342 |         },
1343 |         {
1344 |           "output_type": "display_data",
1345 |           "data": {
1346 |             "text/html": [
1347 |               "\n",
1348 |               "    <div>\n",
1349 |               "      \n",
1350 |               "      <progress value='102' max='51' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
1351 |               "      [51/51 01:27]\n",
1352 |               "    </div>\n",
1353 |               "    "
1354 |             ],
1355 |             "text/plain": [
1356 |               "<IPython.core.display.HTML object>"
1357 |             ]
1358 |           },
1359 |           "metadata": {}
1360 |         },
1361 |         {
1362 |           "output_type": "stream",
1363 |           "name": "stdout",
1364 |           "text": [
1365 |             "(408, 2) (408,)\n"
1366 |           ]
1367 |         }
1368 |       ]
1369 |     },
1370 |     {
1371 |       "cell_type": "code",
1372 |       "source": [
1373 |         "#@ INSPECTING MODEL PREDICTION:\n",
1374 |         "preds = np.argmax(predictions.predictions, axis=-1)                     # Getting Maximum Index.\n",
1375 |         "metric = load_metric(\"glue\", \"mrpc\")                                    # Initializing Metrics. \n",
1376 |         "metric.compute(predictions=preds, references=predictions.label_ids)     # Computing Metrices. "
1377 |       ],
1378 |       "metadata": {
1379 |         "colab": {
1380 |           "base_uri": "https://localhost:8080/"
1381 |         },
1382 |         "id": "h0evBLawFnVV",
1383 |         "outputId": "4a39c8cd-6035-492a-9b2f-9b5f641353d0"
1384 |       },
1385 |       "execution_count": 24,
1386 |       "outputs": [
1387 |         {
1388 |           "output_type": "execute_result",
1389 |           "data": {
1390 |             "text/plain": [
1391 |               "{'accuracy': 0.8676470588235294, 'f1': 0.9072164948453608}"
1392 |             ]
1393 |           },
1394 |           "metadata": {},
1395 |           "execution_count": 24
1396 |         }
1397 |       ]
1398 |     },
1399 |     {
1400 |       "cell_type": "code",
1401 |       "source": [
1402 |         "#@ DEFINING FUNCTION FOR COMPUTING METRICS:\n",
1403 |         "def compute_metrics(eval_preds):                                        # Defining Function. \n",
1404 |         "    metric = load_metric(\"glue\", \"mrpc\")                                # Initializing Metrics. \n",
1405 |         "    logits, labels = eval_preds\n",
1406 |         "    predictions = np.argmax(logits, axis=-1)                            # Getting Maximum Index. \n",
1407 |         "    return metric.compute(predictions=predictions, references=labels)    # Getting Metrices. "
1408 |       ],
1409 |       "metadata": {
1410 |         "id": "ZF4eRpM6HYwk"
1411 |       },
1412 |       "execution_count": 31,
1413 |       "outputs": []
1414 |     },
1415 |     {
1416 |       "cell_type": "code",
1417 |       "source": [
1418 |         "#@ DEFINING NEW TRAINER: \n",
1419 |         "training_args = TrainingArguments(\"test-trainer\", \n",
1420 |         "                                  evaluation_strategy=\"epoch\")          # Initializing Training Arguments. \n",
1421 |         "trainer = Trainer(model, training_args, \n",
1422 |         "                  train_dataset=tokenized_datasets[\"train\"],            # Initializing Training Datasets. \n",
1423 |         "                  eval_dataset=tokenized_datasets[\"validation\"],        # Initializing Validation Datasets. \n",
1424 |         "                  data_collator=data_collator, tokenizer=tokenizer, \n",
1425 |         "                  compute_metrics=compute_metrics)                      # Initializing Trainer. \n",
1426 |         "trainer.train()                                                         # Training the Model. "
1427 |       ],
1428 |       "metadata": {
1429 |         "colab": {
1430 |           "base_uri": "https://localhost:8080/",
1431 |           "height": 849
1432 |         },
1433 |         "id": "xeTSv_p7KJFX",
1434 |         "outputId": "1d44da42-9627-4acb-b812-0e1d43d3c73c"
1435 |       },
1436 |       "execution_count": 32,
1437 |       "outputs": [
1438 |         {
1439 |           "output_type": "stream",
1440 |           "name": "stderr",
1441 |           "text": [
1442 |             "PyTorch: setting up devices\n",
1443 |             "The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).\n",
1444 |             "The following columns in the training set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.\n",
1445 |             "***** Running training *****\n",
1446 |             "  Num examples = 3668\n",
1447 |             "  Num Epochs = 3\n",
1448 |             "  Instantaneous batch size per device = 8\n",
1449 |             "  Total train batch size (w. parallel, distributed & accumulation) = 8\n",
1450 |             "  Gradient Accumulation steps = 1\n",
1451 |             "  Total optimization steps = 1377\n"
1452 |           ]
1453 |         },
1454 |         {
1455 |           "output_type": "display_data",
1456 |           "data": {
1457 |             "text/html": [
1458 |               "\n",
1459 |               "    <div>\n",
1460 |               "      \n",
1461 |               "      <progress value='1377' max='1377' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
1462 |               "      [1377/1377 06:38, Epoch 3/3]\n",
1463 |               "    </div>\n",
1464 |               "    <table border=\"1\" class=\"dataframe\">\n",
1465 |               "  <thead>\n",
1466 |               " <tr style=\"text-align: left;\">\n",
1467 |               "      <th>Epoch</th>\n",
1468 |               "      <th>Training Loss</th>\n",
1469 |               "      <th>Validation Loss</th>\n",
1470 |               "      <th>Accuracy</th>\n",
1471 |               "      <th>F1</th>\n",
1472 |               "    </tr>\n",
1473 |               "  </thead>\n",
1474 |               "  <tbody>\n",
1475 |               "    <tr>\n",
1476 |               "      <td>1</td>\n",
1477 |               "      <td>No log</td>\n",
1478 |               "      <td>0.838388</td>\n",
1479 |               "      <td>0.835784</td>\n",
1480 |               "      <td>0.879713</td>\n",
1481 |               "    </tr>\n",
1482 |               "    <tr>\n",
1483 |               "      <td>2</td>\n",
1484 |               "      <td>0.156100</td>\n",
1485 |               "      <td>0.802094</td>\n",
1486 |               "      <td>0.848039</td>\n",
1487 |               "      <td>0.892734</td>\n",
1488 |               "    </tr>\n",
1489 |               "    <tr>\n",
1490 |               "      <td>3</td>\n",
1491 |               "      <td>0.088100</td>\n",
1492 |               "      <td>0.924342</td>\n",
1493 |               "      <td>0.835784</td>\n",
1494 |               "      <td>0.886248</td>\n",
1495 |               "    </tr>\n",
1496 |               "  </tbody>\n",
1497 |               "</table><p>"
1498 |             ],
1499 |             "text/plain": [
1500 |               "<IPython.core.display.HTML object>"
1501 |             ]
1502 |           },
1503 |           "metadata": {}
1504 |         },
1505 |         {
1506 |           "output_type": "stream",
1507 |           "name": "stderr",
1508 |           "text": [
1509 |             "The following columns in the evaluation set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.\n",
1510 |             "***** Running Evaluation *****\n",
1511 |             "  Num examples = 408\n",
1512 |             "  Batch size = 8\n",
1513 |             "Saving model checkpoint to test-trainer/checkpoint-500\n",
1514 |             "Configuration saved in test-trainer/checkpoint-500/config.json\n",
1515 |             "Model weights saved in test-trainer/checkpoint-500/pytorch_model.bin\n",
1516 |             "tokenizer config file saved in test-trainer/checkpoint-500/tokenizer_config.json\n",
1517 |             "Special tokens file saved in test-trainer/checkpoint-500/special_tokens_map.json\n",
1518 |             "The following columns in the evaluation set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.\n",
1519 |             "***** Running Evaluation *****\n",
1520 |             "  Num examples = 408\n",
1521 |             "  Batch size = 8\n",
1522 |             "Saving model checkpoint to test-trainer/checkpoint-1000\n",
1523 |             "Configuration saved in test-trainer/checkpoint-1000/config.json\n",
1524 |             "Model weights saved in test-trainer/checkpoint-1000/pytorch_model.bin\n",
1525 |             "tokenizer config file saved in test-trainer/checkpoint-1000/tokenizer_config.json\n",
1526 |             "Special tokens file saved in test-trainer/checkpoint-1000/special_tokens_map.json\n",
1527 |             "The following columns in the evaluation set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.\n",
1528 |             "***** Running Evaluation *****\n",
1529 |             "  Num examples = 408\n",
1530 |             "  Batch size = 8\n",
1531 |             "\n",
1532 |             "\n",
1533 |             "Training completed. Do not forget to share your model on huggingface.co/models =)\n",
1534 |             "\n",
1535 |             "\n"
1536 |           ]
1537 |         },
1538 |         {
1539 |           "output_type": "execute_result",
1540 |           "data": {
1541 |             "text/plain": [
1542 |               "TrainOutput(global_step=1377, training_loss=0.10326220942478553, metrics={'train_runtime': 398.7433, 'train_samples_per_second': 27.597, 'train_steps_per_second': 3.453, 'total_flos': 405470580750720.0, 'train_loss': 0.10326220942478553, 'epoch': 3.0})"
1543 |             ]
1544 |           },
1545 |           "metadata": {},
1546 |           "execution_count": 32
1547 |         }
1548 |       ]
1549 |     }
1550 |   ]
1551 | }


--------------------------------------------------------------------------------