├── Customising Large Language Models.ipynb ├── README.md ├── fake_news_detection.ipynb ├── falcon_finetune.ipynb └── gemma_training_grpo_connectfour.ipynb /Customising Large Language Models.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "20f23767", 6 | "metadata": {}, 7 | "source": [ 8 | "

Training or finetuning GPT Models within a business or organisation (and its challenges).

" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "5ffcfeaa", 14 | "metadata": {}, 15 | "source": [ 16 | "The increasing popularity and use cases of Large Language models in real world applications is on the rise. One of the main challenges presenting business and organisations is tuning these models to their own use cases." 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "e0cb9dea", 22 | "metadata": {}, 23 | "source": [ 24 | "

In this notebook I will explore options for customising these models for specific use cases, along with relevant research papers and the challenges each approach presents.

" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "id": "3e164967", 30 | "metadata": {}, 31 | "source": [ 32 | "

#1 Training your own GPT-level model.

" 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "id": "88ab5dd7", 38 | "metadata": {}, 39 | "source": [ 40 | "One of the main benefits of using the GPT architecture is that you can effectively chuck as much unstructured data at the model as possible and it can gain a predictive knowledge or at least learn the patterns of the data. This helps a model to generate new text based off your own.\n", 41 | "\n", 42 | "The challenge with this approach is that its a huge effort needing really skilled AI engineers (many of which are already highly paid in private companies) and huge costs in compute - in the following paper its estimated it costs about 3 million dollars just to train the model alone. Forgetting the cost of curating so much data and keeping this model up to date too. \n", 43 | "\n", 44 | "This would also only create a foundation model and you would need additional training to get it to respond conversationally in a chatGPT style - this is another huge task." 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "id": "c3ae53b0", 50 | "metadata": {}, 51 | "source": [ 52 | " See BloombergGPT Paper: https://arxiv.org/abs/2303.17564 " 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "id": "e336fec4", 58 | "metadata": {}, 59 | "source": [ 60 | "

#2 Finetuning an existing Language Model

" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "id": "b3fbe406", 66 | "metadata": {}, 67 | "source": [ 68 | " An option currently presented, (Azure, Google are beginning to offer this service) is finetuning these models to specific use cases. \n", 69 | "\n", 70 | " This approach does assume that your data is already organised, representative and within a format to train a language model. The format typically expected is a sort of question and answer such as prompt: [prompt text], Response: [refered response] \n", 71 | "\n", 72 | "\n", 73 | " This would potentially be a massive task and you can imagine within a huge organisation with lots of data how many examples you may need to train a model and transfer that knowledge effectively! \n", 74 | "\n", 75 | "\n", 76 | " The below paper experiments with using LLMs (specifically gpt-3 and 4) to generate these datasets. *Its worth pointing out though that using OpenAI's outputs to train your own models commercially does break their terms of service!* \n", 77 | " \n", 78 | " We could use opensource Language models to create synthetic datasets but that would raise the question why you would use Azure, Google all together to finetune these models. I dont imagine Azure will be offering this service with opensource models too as they are all-in with OpenAI's closed-source models. Take a look at my 'Finetuning Falcon' notebook in this repo to see how to do this without using services by cloud providers.\n", 79 | "\n", 80 | " See **Orca LLM Paper** - https://arxiv.org/abs/2306.02707 " 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "944172c9", 86 | "metadata": {}, 87 | "source": [ 88 | "

Final Option - although technically it is not training a model!

" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "cdd463da", 94 | "metadata": {}, 95 | "source": [ 96 | "

#3 Prompt Engineering - giving LLMs context with vector databases.

" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "8e5dbcac", 102 | "metadata": {}, 103 | "source": [ 104 | " Some practices have been put in place to apply cosine similarity against a vector database and then whatever text is found feeding that into the prompt for the language model and asking it to return relevant information etc. \n", 105 | "\n", 106 | " Although this is a hacky way to take advantage of LLMs. I am doubtful on its use cases in a production setting. Some questions to answer. \n", 107 | "\n", 108 | "1. How do you ensure you retrieve the most relevant context for the model? Is applying cosine enough?\n", 109 | "2. How do you know the model wont hallucinate given the wrong context (think across documents) or it simply does not know what you feed in (*) ? Could be less of a problem with GPT-4 but still assumes the model knows specific organisational terms etc.\n", 110 | "3. Do Large Language models have big enough context windows to understand something fully?\n", 111 | "\n", 112 | " * *You can ask these models to return nothing for things it does not know but this does not solve hallucinations completely.*" 113 | ] 114 | } 115 | ], 116 | "metadata": { 117 | "kernelspec": { 118 | "display_name": "Python 3 (ipykernel)", 119 | "language": "python", 120 | "name": "python3" 121 | }, 122 | "language_info": { 123 | "codemirror_mode": { 124 | "name": "ipython", 125 | "version": 3 126 | }, 127 | "file_extension": ".py", 128 | "mimetype": "text/x-python", 129 | "name": "python", 130 | "nbconvert_exporter": "python", 131 | "pygments_lexer": "ipython3", 132 | "version": "3.9.12" 133 | } 134 | }, 135 | "nbformat": 4, 136 | "nbformat_minor": 5 137 | } 138 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## 🔗 Notebooks 2 | 3 | Here is a list of the notebooks available in this repository: 4 | 5 | ### 1. Reinforcement Learning 6 | 7 | - [Training an LLM with RL](gemma_training_grpo_connectfour.ipynb): Training Gemma 1B to play Connect-Four using Deepseeks GRPO algorithm. 8 | 9 | ### 2. LLMs 10 | - [Fake News Detection](fake_news_detection.ipynb): Detecting fake news with Bidirectional Encoder (BERT) + LSTM model 11 | - [Falcon Finetune](falcon_finetune.ipynb): Old repo of how to train Falcon 12 | - [Custom LLM options](Customising Large Language Models.ipynb): Options for customising LLMs 13 | -------------------------------------------------------------------------------- /fake_news_detection.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "1eb579f9", 6 | "metadata": {}, 7 | "source": [ 8 | "

Predicting Fake News with a Bidirectional Encoder (BERT) + LSTM model

" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "c8008089", 14 | "metadata": {}, 15 | "source": [ 16 | "

Based on my reading of this paper

" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 | "id": "2ee45319", 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "import pandas as pd\n", 27 | "import glob\n", 28 | "\n", 29 | "\n", 30 | "csv_files = ['politifact_fake.csv', 'gossipcop_fake.csv', 'gossipcop_real.csv', 'politifact_real.csv']\n", 31 | "\n", 32 | "dfs = []\n", 33 | "\n", 34 | "# Loop through the list of files and read each file into a DataFrame\n", 35 | "for file in csv_files:\n", 36 | " df = pd.read_csv(file)\n", 37 | " if \"fake\" in file: \n", 38 | " df[\"verdict\"] = \"Fake\"\n", 39 | " else:\n", 40 | " df[\"verdict\"] = \"True\"\n", 41 | " dfs.append(df)\n", 42 | "\n", 43 | "# Concatenate all DataFrames in the list into a single DataFrame\n", 44 | "df = pd.concat(dfs, ignore_index=True)" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "id": "f145c8d1", 50 | "metadata": {}, 51 | "source": [ 52 | "

The data comes from a dataset release called FakeNewsNet. It combines data from PolitiFact, a user-contributed fact-checking site, with data from a similar site called GossipCop.

" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 2, 58 | "id": "dbb7ff4c", 59 | "metadata": {}, 60 | "outputs": [ 61 | { 62 | "data": { 63 | "text/plain": [ 64 | "(23196, 5)" 65 | ] 66 | }, 67 | "execution_count": 2, 68 | "metadata": {}, 69 | "output_type": "execute_result" 70 | } 71 | ], 72 | "source": [ 73 | "df.shape" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": 31, 79 | "id": "8815f09f", 80 | "metadata": {}, 81 | "outputs": [ 82 | { 83 | "data": { 84 | "text/html": [ 85 | "
\n", 86 | "\n", 99 | "\n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | "
idnews_urltitletweet_idsverdict
7526gossipcop-2895484840www.magzter.com/article/Celebrity/OK/Taylors-L...Taylor's Lonely Life279594449154232321\\t280553081064800256\\t280797...Fake
4351gossipcop-884847https://www.longroom.com/discussion/720445/wat...Watch \"Belligerent\" Scott Disick Freak Out at ...NaNTrue
10822gossipcop-3662901506radaronline.com/videos/ellen-degeneres-talk-sh...Boss From Hell! Ellen DeGeneres Treats Her Tal...943461276277231616\\t943471999036411904\\t943499...Fake
5118gossipcop-925000https://www.pinterest.co.uk/pin/29533768803792...Fearless from Beyoncé and Jay Z's Vacation Pics981654072489947136True
4236gossipcop-907444https://www.usmagazine.com/celebrity-news/news...James Franco to Attend SAG Awards 2018 Amid Mi...954377483410960386\\t954389751481675778\\t954389...True
\n", 153 | "
" 154 | ], 155 | "text/plain": [ 156 | " id \\\n", 157 | "7526 gossipcop-2895484840 \n", 158 | "4351 gossipcop-884847 \n", 159 | "10822 gossipcop-3662901506 \n", 160 | "5118 gossipcop-925000 \n", 161 | "4236 gossipcop-907444 \n", 162 | "\n", 163 | " news_url \\\n", 164 | "7526 www.magzter.com/article/Celebrity/OK/Taylors-L... \n", 165 | "4351 https://www.longroom.com/discussion/720445/wat... \n", 166 | "10822 radaronline.com/videos/ellen-degeneres-talk-sh... \n", 167 | "5118 https://www.pinterest.co.uk/pin/29533768803792... \n", 168 | "4236 https://www.usmagazine.com/celebrity-news/news... \n", 169 | "\n", 170 | " title \\\n", 171 | "7526 Taylor's Lonely Life \n", 172 | "4351 Watch \"Belligerent\" Scott Disick Freak Out at ... \n", 173 | "10822 Boss From Hell! Ellen DeGeneres Treats Her Tal... \n", 174 | "5118 Fearless from Beyoncé and Jay Z's Vacation Pics \n", 175 | "4236 James Franco to Attend SAG Awards 2018 Amid Mi... \n", 176 | "\n", 177 | " tweet_ids verdict \n", 178 | "7526 279594449154232321\\t280553081064800256\\t280797... Fake \n", 179 | "4351 NaN True \n", 180 | "10822 943461276277231616\\t943471999036411904\\t943499... Fake \n", 181 | "5118 981654072489947136 True \n", 182 | "4236 954377483410960386\\t954389751481675778\\t954389... True " 183 | ] 184 | }, 185 | "execution_count": 31, 186 | "metadata": {}, 187 | "output_type": "execute_result" 188 | } 189 | ], 190 | "source": [ 191 | "df.sample(5)" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": 4, 197 | "id": "e12ec5ff", 198 | "metadata": {}, 199 | "outputs": [ 200 | { 201 | "data": { 202 | "text/html": [ 203 | "
\n", 204 | "\n", 217 | "\n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | " \n", 232 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 237 | "
verdictcount
0True17441
1Fake5755
\n", 238 | "
" 239 | ], 240 | "text/plain": [ 241 | " verdict count\n", 242 | "0 True 17441\n", 243 | "1 Fake 5755" 244 | ] 245 | }, 246 | "execution_count": 4, 247 | "metadata": {}, 248 | "output_type": "execute_result" 249 | } 250 | ], 251 | "source": [ 252 | "df[\"verdict\"].value_counts().reset_index()" 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "id": "27b46b28", 258 | "metadata": {}, 259 | "source": [ 260 | "

There is a bit of a class imbalance between real and fake, so I will sample the same number of True rows as Fake.

" 261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": 5, 266 | "id": "9fef50ef", 267 | "metadata": {}, 268 | "outputs": [], 269 | "source": [ 270 | "sampled_0_25 = df[df['verdict'] == 'True'].sample(n=5755, random_state=42)\n", 271 | "rest_df = df[df['verdict'] != 'True']\n", 272 | "df = pd.concat([sampled_0_25, rest_df], ignore_index=True)" 273 | ] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "execution_count": 6, 278 | "id": "0d693143", 279 | "metadata": {}, 280 | "outputs": [ 281 | { 282 | "data": { 283 | "text/html": [ 284 | "
\n", 285 | "\n", 298 | "\n", 299 | " \n", 300 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | " \n", 307 | " \n", 308 | " \n", 309 | " \n", 310 | " \n", 311 | " \n", 312 | " \n", 313 | " \n", 314 | " \n", 315 | " \n", 316 | " \n", 317 | " \n", 318 | "
verdictcount
0True5755
1Fake5755
\n", 319 | "
" 320 | ], 321 | "text/plain": [ 322 | " verdict count\n", 323 | "0 True 5755\n", 324 | "1 Fake 5755" 325 | ] 326 | }, 327 | "execution_count": 6, 328 | "metadata": {}, 329 | "output_type": "execute_result" 330 | } 331 | ], 332 | "source": [ 333 | "df[\"verdict\"].value_counts().reset_index()" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": 7, 339 | "id": "2b423e3f", 340 | "metadata": {}, 341 | "outputs": [], 342 | "source": [ 343 | "from sklearn.model_selection import train_test_split\n", 344 | "from sklearn.preprocessing import LabelEncoder\n", 345 | "\n", 346 | "# Convert labels to integers\n", 347 | "label_encoder = LabelEncoder()\n", 348 | "\n", 349 | "texts = df[\"title\"].to_list()\n", 350 | "\n", 351 | "labels = label_encoder.fit_transform(df[\"verdict\"].to_list())\n", 352 | "\n", 353 | "X_train, X_test, y_train, y_test = train_test_split(\n", 354 | " texts, labels, \n", 355 | " test_size=0.2, # 20% of the data for testing\n", 356 | " stratify=labels, # This ensures the distribution of labels is similar in both sets\n", 357 | " random_state=42 # For reproducibility of results\n", 358 | ")\n" 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": 8, 364 | "id": "1362d155", 365 | "metadata": {}, 366 | "outputs": [ 367 | { 368 | "name": "stderr", 369 | "output_type": "stream", 370 | "text": [ 371 | "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n", 372 | "To disable this warning, you can either:\n", 373 | "\t- Avoid using `tokenizers` before the fork if possible\n", 374 | "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n", 375 | "Epoch 1: 0%| | 0/576 [00:00 1 else 0)\n", 409 | " self.dropout = nn.Dropout(dropout)\n", 410 | " self.classifier = nn.Linear(hidden_dim, num_classes)\n", 411 | "\n", 412 | " def forward(self, input_ids, attention_mask):\n", 413 | " with torch.no_grad():\n", 414 | " outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)\n", 415 | " sequence_output = outputs.last_hidden_state\n", 416 | " lstm_output, (h_n, c_n) = self.lstm(sequence_output)\n", 417 | " lstm_output = self.dropout(lstm_output[:, -1, :])\n", 418 | " logits = self.classifier(lstm_output)\n", 419 | " return logits\n", 420 | "\n", 421 | "# Tokenizer\n", 422 | "tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')\n", 423 | "\n", 424 | "# Tokenizing the texts\n", 425 | "train_encodings = tokenizer(X_train, truncation=True, padding=True, max_length=128, return_tensors=\"pt\")\n", 426 | "test_encodings = tokenizer(X_test, truncation=True, padding=True, max_length=128, return_tensors=\"pt\")\n", 427 | "\n", 428 | "# Creating the datasets\n", 429 | "train_dataset = PolitifactDataset(train_encodings, y_train)\n", 430 | "test_dataset = PolitifactDataset(test_encodings, y_test)\n", 431 | "\n", 432 | "# Model\n", 433 | "model = BertLSTM()\n", 434 | "\n", 435 | "# Working with myt MPS on macbook M1\n", 436 | "device = torch.device(\"mps\" if torch.backends.mps.is_available() else \"cpu\")\n", 437 | "model.to(device)\n", 438 | "\n", 439 | "# DataLoader setup - batch size of 16..\n", 440 | "train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)\n", 441 | "test_loader = DataLoader(test_dataset, batch_size=16, shuffle=False)\n", 442 | "\n", 443 | "optimizer = torch.optim.AdamW(model.parameters(), lr=0.001)\n", 444 | "\n", 445 | "# training loop:\n", 446 | "for epoch in range(3): # Number 
of epochs\n", 447 | " model.train()\n", 448 | " total_loss = 0.0 \n", 449 | " num_batches = 0 # Count the number of batches processed\n", 450 | " progress_bar = tqdm(train_loader, desc=f\"Epoch {epoch+1}\")\n", 451 | " for batch in progress_bar:\n", 452 | " optimizer.zero_grad()\n", 453 | " input_ids = batch['input_ids'].to(device)\n", 454 | " attention_mask = batch['attention_mask'].to(device)\n", 455 | " labels = batch['labels'].to(device)\n", 456 | " \n", 457 | " outputs = model(input_ids, attention_mask)\n", 458 | " loss = nn.CrossEntropyLoss()(outputs, labels)\n", 459 | " loss.backward()\n", 460 | " optimizer.step()\n", 461 | " \n", 462 | " total_loss += loss.item() # Accumulate loss\n", 463 | " num_batches += 1\n", 464 | " \n", 465 | " # Update progress bar with mean loss for the current epoch\n", 466 | " progress_bar.set_postfix({'mean_loss': total_loss / num_batches})\n" 467 | ] 468 | }, 469 | { 470 | "cell_type": "markdown", 471 | "id": "971a2280", 472 | "metadata": {}, 473 | "source": [ 474 | "

Let's test how well the model predicts on the reserved test set

" 475 | ] 476 | }, 477 | { 478 | "cell_type": "code", 479 | "execution_count": 9, 480 | "id": "13a431fd", 481 | "metadata": {}, 482 | "outputs": [ 483 | { 484 | "name": "stderr", 485 | "output_type": "stream", 486 | "text": [ 487 | "Evaluating: 0%| | 0/144 [00:00 73% is not too bad for a quick go- the paper said they got between 73% and 83% with DL

" 564 | ] 565 | }, 566 | { 567 | "cell_type": "markdown", 568 | "id": "c90884d3", 569 | "metadata": {}, 570 | "source": [ 571 | "

Let's test it on an article I saw posted

" 572 | ] 573 | }, 574 | { 575 | "cell_type": "code", 576 | "execution_count": 30, 577 | "id": "bc3d0e1f", 578 | "metadata": {}, 579 | "outputs": [ 580 | { 581 | "name": "stdout", 582 | "output_type": "stream", 583 | "text": [ 584 | "Predicted label: true\n" 585 | ] 586 | } 587 | ], 588 | "source": [ 589 | "model.eval()\n", 590 | "\n", 591 | "input_ids = tokenizer([\"Disillusioned Businesses Discovering That AI Kind of Sucks.\"], truncation=True, padding=True, max_length=128, return_tensors=\"pt\")\n", 592 | "input_ids_test = input_ids['input_ids']\n", 593 | "attention_mask_test = input_ids['attention_mask']\n", 594 | "\n", 595 | "outputs = model(input_ids_test.to(device), attention_mask_test.to(device))\n", 596 | "logits = outputs\n", 597 | "predictions = torch.argmax(logits, dim=1).cpu().numpy() # Move predictions to CPU and convert to numpy\n", 598 | "\n", 599 | "index_to_class = {0: \"fake\", 1: \"true\"} # Adjust based on your actual classes\n", 600 | "\n", 601 | "predicted_label = index_to_class[predictions[0]]\n", 602 | "print(f\"Predicted label: {predicted_label}\")\n" 603 | ] 604 | }, 605 | { 606 | "cell_type": "markdown", 607 | "id": "7b75928e", 608 | "metadata": {}, 609 | "source": [ 610 | "

And one I have just made up

" 611 | ] 612 | }, 613 | { 614 | "cell_type": "code", 615 | "execution_count": 23, 616 | "id": "cc887839", 617 | "metadata": {}, 618 | "outputs": [ 619 | { 620 | "name": "stdout", 621 | "output_type": "stream", 622 | "text": [ 623 | "Predicted label: fake\n" 624 | ] 625 | } 626 | ], 627 | "source": [ 628 | "input_ids = tokenizer([\"Aliens have landed in manchester\"], truncation=True, padding=True, max_length=128, return_tensors=\"pt\")\n", 629 | "input_ids_test = input_ids['input_ids']\n", 630 | "attention_mask_test = input_ids['attention_mask']\n", 631 | "\n", 632 | "outputs = model(input_ids_test.to(device), attention_mask_test.to(device))\n", 633 | "logits = outputs\n", 634 | "predictions = torch.argmax(logits, dim=1).cpu().numpy() \n", 635 | "\n", 636 | "predicted_label = index_to_class[predictions[0]]\n", 637 | "print(f\"Predicted label: {predicted_label}\")\n" 638 | ] 639 | }, 640 | { 641 | "cell_type": "code", 642 | "execution_count": null, 643 | "id": "a8aa651f", 644 | "metadata": {}, 645 | "outputs": [], 646 | "source": [] 647 | } 648 | ], 649 | "metadata": { 650 | "kernelspec": { 651 | "display_name": "Python 3 (ipykernel)", 652 | "language": "python", 653 | "name": "python3" 654 | }, 655 | "language_info": { 656 | "codemirror_mode": { 657 | "name": "ipython", 658 | "version": 3 659 | }, 660 | "file_extension": ".py", 661 | "mimetype": "text/x-python", 662 | "name": "python", 663 | "nbconvert_exporter": "python", 664 | "pygments_lexer": "ipython3", 665 | "version": "3.10.13" 666 | } 667 | }, 668 | "nbformat": 4, 669 | "nbformat_minor": 5 670 | } 671 | -------------------------------------------------------------------------------- /falcon_finetune.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "machine_shape": "hm", 8 | "gpuType": "T4" 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | }, 17 | "accelerator": "GPU" 18 | }, 19 | "cells": [ 20 | { 21 | "cell_type": "markdown", 22 | "source": [ 23 | " # Finetuning Falcon 7B the first commercially opensource LLM on an opensource instruction dataset." 24 | ], 25 | "metadata": { 26 | "id": "5IFPfsoApc5A" 27 | } 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "Recent papers are showing the ability to train smaller models to outperform GPT-4. Notably recent research papers have demonstrated the ability to finetune smaller 7B models some highlights include: \n", 33 | "\n", 34 | "* **Gorilla** - LLM Connected with Massive APIs- https://arxiv.org/abs/2305.15334\n", 35 | "* **Goat** - able to surpass GPT-4 on Arithmetic Tasks https://arxiv.org/pdf/2305.14201v1.pdf\n", 36 | "\n", 37 | "Other recent papers also suggest that a quality small dataset can help smaller language models shine.\n", 38 | "\n", 39 | "* **LIMA** - less is more for alignment. 
https://arxiv.org/pdf/2305.11206.pdf\n", 40 | "\n", 41 | "With the advancements announced in the QLora paper it is now possible to finetune smaller models on one GPU.\n", 42 | "\n", 43 | "* **QLoRA** https://arxiv.org/pdf/2305.14314v1.pdf\n", 44 | "\n", 45 | "\n", 46 | "\n", 47 | "\n", 48 | "\n", 49 | "\n", 50 | "\n", 51 | "\n" 52 | ], 53 | "metadata": { 54 | "id": "4q4tOB_ipt6C" 55 | } 56 | }, 57 | { 58 | "cell_type": "markdown", 59 | "source": [ 60 | "# Below is code on how you can finetune an LLM (in this case Falcon released by the UAE for free commercial use) on instruction data to achieve state of the art results on specific tasks. This code implements QLoRA so is capable of running on a single gpu with <24gb vram.\n", 61 | "\n", 62 | "**As recent paper show the most important part is not the size of the model but the dataset you are training it on.**" 63 | ], 64 | "metadata": { 65 | "id": "oPLssiEQra8h" 66 | } 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": null, 71 | "metadata": { 72 | "id": "KqPYMutlXFmu" 73 | }, 74 | "outputs": [], 75 | "source": [ 76 | "!pip install -q -U bitsandbytes\n", 77 | "!pip install einops\n", 78 | "!pip install -q -U git+https://github.com/huggingface/transformers.git \n", 79 | "!pip install -q -U git+https://github.com/huggingface/peft.git\n", 80 | "!pip install -q -U git+https://github.com/huggingface/accelerate.git\n", 81 | "!pip install -q datasets" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "source": [ 87 | "!nvidia-smi" 88 | ], 89 | "metadata": { 90 | "colab": { 91 | "base_uri": "https://localhost:8080/" 92 | }, 93 | "id": "Tj3bBAFBZxmZ", 94 | "outputId": "656a433f-2d39-41c3-edd4-f58c7fdcf371" 95 | }, 96 | "execution_count": 5, 97 | "outputs": [ 98 | { 99 | "output_type": "stream", 100 | "name": "stdout", 101 | "text": [ 102 | "Fri Jun 2 09:24:02 2023 \n", 103 | "+-----------------------------------------------------------------------------+\n", 104 | "| NVIDIA-SMI 525.85.12 Driver Version: 525.85.12 CUDA Version: 12.0 |\n", 105 | "|-------------------------------+----------------------+----------------------+\n", 106 | "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", 107 | "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", 108 | "| | | MIG M. 
|\n", 109 | "|===============================+======================+======================|\n", 110 | "| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |\n", 111 | "| N/A 44C P0 27W / 70W | 6793MiB / 15360MiB | 0% Default |\n", 112 | "| | | N/A |\n", 113 | "+-------------------------------+----------------------+----------------------+\n", 114 | " \n", 115 | "+-----------------------------------------------------------------------------+\n", 116 | "| Processes: |\n", 117 | "| GPU GI CI PID Type Process name GPU Memory |\n", 118 | "| ID ID Usage |\n", 119 | "|=============================================================================|\n", 120 | "+-----------------------------------------------------------------------------+\n" 121 | ] 122 | } 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "source": [ 128 | "import torch\n", 129 | "from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig\n", 130 | "\n", 131 | "model_id = \"tiiuae/falcon-7b\"\n", 132 | "\n", 133 | "bnb_config = BitsAndBytesConfig(\n", 134 | " load_in_4bit=True,\n", 135 | " bnb_4bit_use_double_quant=True,\n", 136 | " bnb_4bit_quant_type=\"nf4\",\n", 137 | " bnb_4bit_compute_dtype=torch.bfloat16\n", 138 | ")\n", 139 | "\n", 140 | "tokenizer = AutoTokenizer.from_pretrained(model_id)\n", 141 | "model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, trust_remote_code=True, device_map={\"\":0})" 142 | ], 143 | "metadata": { 144 | "id": "l8CpKUCDXN_P" 145 | }, 146 | "execution_count": null, 147 | "outputs": [] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "source": [ 152 | "

Prepare for PEFT training (parameter-efficient fine-tuning)

" 153 | ], 154 | "metadata": { 155 | "id": "C7DFt9GEXbXD" 156 | } 157 | }, 158 | { 159 | "cell_type": "code", 160 | "source": [ 161 | "from peft import prepare_model_for_kbit_training\n", 162 | "\n", 163 | "model.gradient_checkpointing_enable()\n", 164 | "model = prepare_model_for_kbit_training(model)" 165 | ], 166 | "metadata": { 167 | "id": "aSRF17jmXaVt" 168 | }, 169 | "execution_count": 6, 170 | "outputs": [] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "source": [ 175 | "def print_trainable_parameters(model):\n", 176 | " \"\"\"\n", 177 | " Prints the number of trainable parameters in the model.\n", 178 | " \"\"\"\n", 179 | " trainable_params = 0\n", 180 | " all_param = 0\n", 181 | " for _, param in model.named_parameters():\n", 182 | " all_param += param.numel()\n", 183 | " if param.requires_grad:\n", 184 | " trainable_params += param.numel()\n", 185 | " print(\n", 186 | " f\"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}\"\n", 187 | " )" 188 | ], 189 | "metadata": { 190 | "id": "D79ii0NRl8Sq" 191 | }, 192 | "execution_count": 49, 193 | "outputs": [] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "source": [ 198 | "from peft import LoraConfig, get_peft_model\n", 199 | "\n", 200 | "config = LoraConfig(\n", 201 | " r=8, \n", 202 | " lora_alpha=32, \n", 203 | " target_modules=[\"query_key_value\"], \n", 204 | " lora_dropout=0.05, \n", 205 | " bias=\"none\", \n", 206 | " task_type=\"CAUSAL_LM\"\n", 207 | ")\n", 208 | "\n", 209 | "model = get_peft_model(model, config)\n", 210 | "print_trainable_parameters(model)" 211 | ], 212 | "metadata": { 213 | "colab": { 214 | "base_uri": "https://localhost:8080/" 215 | }, 216 | "id": "RquYNWeNl2MC", 217 | "outputId": "0eb55523-f421-45f3-bbb3-bc506de474d4" 218 | }, 219 | "execution_count": 50, 220 | "outputs": [ 221 | { 222 | "output_type": "stream", 223 | "name": "stdout", 224 | "text": [ 225 | "trainable params: 2359296 || all params: 3611104128 || trainable%: 0.06533447711203746\n" 226 | ] 227 | } 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "source": [ 233 | "#Load our commercially opensource instruction dataset (Dolly 15k)" 234 | ], 235 | "metadata": { 236 | "id": "7vlfopuroIMi" 237 | } 238 | }, 239 | { 240 | "cell_type": "code", 241 | "source": [ 242 | "from datasets import load_dataset\n", 243 | "\n", 244 | "data = load_dataset(\"databricks/databricks-dolly-15k\")" 245 | ], 246 | "metadata": { 247 | "id": "QXYB68GRakEd" 248 | }, 249 | "execution_count": null, 250 | "outputs": [] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "source": [ 255 | "data" 256 | ], 257 | "metadata": { 258 | "colab": { 259 | "base_uri": "https://localhost:8080/" 260 | }, 261 | "id": "pDk6qGjCbkXk", 262 | "outputId": "ffbb4022-43b3-4eb8-a75d-90ea9f42d042" 263 | }, 264 | "execution_count": 53, 265 | "outputs": [ 266 | { 267 | "output_type": "execute_result", 268 | "data": { 269 | "text/plain": [ 270 | "DatasetDict({\n", 271 | " train: Dataset({\n", 272 | " features: ['instruction', 'context', 'response', 'category'],\n", 273 | " num_rows: 15011\n", 274 | " })\n", 275 | "})" 276 | ] 277 | }, 278 | "metadata": {}, 279 | "execution_count": 53 280 | } 281 | ] 282 | }, 283 | { 284 | "cell_type": "markdown", 285 | "source": [ 286 | "# Prepare the dataset by tokenizing and also joining some of the columns" 287 | ], 288 | "metadata": { 289 | "id": "uAblZRiPo4Tr" 290 | } 291 | }, 292 | { 293 | "cell_type": "code", 294 | "source": [ 295 | "def tokenize_function(examples):\n", 296 | " # 
Concatenate instruction and input text\n", 297 | " input_text = [' '.join(t) for t in zip(examples[\"instruction\"], examples[\"context\"])]\n", 298 | " output_text = examples[\"response\"]\n", 299 | "\n", 300 | " # Tokenize inputs and outputs\n", 301 | " inputs = tokenizer(input_text, padding=\"max_length\", truncation=True, max_length=512)\n", 302 | " outputs = tokenizer(output_text, padding=\"max_length\", truncation=True, max_length=512)\n", 303 | " \n", 304 | " return {\"input_ids\": inputs.input_ids, \"attention_mask\": inputs.attention_mask, \"labels\": outputs.input_ids}\n", 305 | "\n", 306 | "dataset = data.map(tokenize_function, batched=True, remove_columns=[\"instruction\", \"context\", \"response\"])" 307 | ], 308 | "metadata": { 309 | "id": "WgZyOW25cOzD" 310 | }, 311 | "execution_count": null, 312 | "outputs": [] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "source": [ 317 | "dataset.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])" 318 | ], 319 | "metadata": { 320 | "id": "essiQx04iIFm" 321 | }, 322 | "execution_count": 57, 323 | "outputs": [] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "source": [ 328 | "dataset" 329 | ], 330 | "metadata": { 331 | "colab": { 332 | "base_uri": "https://localhost:8080/" 333 | }, 334 | "id": "AMsx0DZjiwTc", 335 | "outputId": "7a836201-8b3e-4080-e11c-3a09845efbdc" 336 | }, 337 | "execution_count": 58, 338 | "outputs": [ 339 | { 340 | "output_type": "execute_result", 341 | "data": { 342 | "text/plain": [ 343 | "DatasetDict({\n", 344 | " train: Dataset({\n", 345 | " features: ['category', 'input_ids', 'attention_mask', 'labels'],\n", 346 | " num_rows: 15011\n", 347 | " })\n", 348 | "})" 349 | ] 350 | }, 351 | "metadata": {}, 352 | "execution_count": 58 353 | } 354 | ] 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "source": [ 359 | "# Split the dataset up into train and test" 360 | ], 361 | "metadata": { 362 | "id": "_NszDl7Mon80" 363 | } 364 | }, 365 | { 366 | "cell_type": "code", 367 | "source": [ 368 | "from datasets import DatasetDict\n", 369 | "\n", 370 | "# Split the data into 80% for training and 20% for evaluation\n", 371 | "split_data = dataset['train'].train_test_split(test_size=0.2)\n", 372 | "\n", 373 | "# Now we update the dataset with the new split data\n", 374 | "dataset = DatasetDict({\n", 375 | " 'train': split_data['train'],\n", 376 | " 'eval': split_data['test']\n", 377 | "})\n" 378 | ], 379 | "metadata": { 380 | "id": "rvkY9elXi46H" 381 | }, 382 | "execution_count": 59, 383 | "outputs": [] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "source": [ 388 | "# Train the model" 389 | ], 390 | "metadata": { 391 | "id": "hf9nRcappThv" 392 | } 393 | }, 394 | { 395 | "cell_type": "markdown", 396 | "source": [ 397 | "As per bitsandbytes example i have set this to run for 10 steps as a tester. If you uncomment out the **num_train_epochs** and remove **max_steps** this will train properly, and based on the ETA would finetune in about 24 hours." 
398 | ], 399 | "metadata": { 400 | "id": "pROarGdzs53J" 401 | } 402 | }, 403 | { 404 | "cell_type": "code", 405 | "source": [ 406 | "import transformers\n", 407 | "\n", 408 | "trainer = transformers.Trainer(\n", 409 | " model=model,\n", 410 | " args=transformers.TrainingArguments(\n", 411 | " per_device_train_batch_size=1,\n", 412 | " gradient_accumulation_steps=4,\n", 413 | " warmup_steps=2,\n", 414 | " # num_train_epochs=3,\n", 415 | " max_steps=10,\n", 416 | " learning_rate=2e-4,\n", 417 | " fp16=True,\n", 418 | " logging_steps=1,\n", 419 | " output_dir=\"outputs\",\n", 420 | " optim=\"paged_adamw_8bit\"\n", 421 | " ),\n", 422 | " train_dataset=dataset['train'],\n", 423 | " eval_dataset=dataset['eval'],\n", 424 | ")\n", 425 | "model.config.use_cache = False # silence the warnings. Please re-enable for inference!\n", 426 | "trainer.train()" 427 | ], 428 | "metadata": { 429 | "colab": { 430 | "base_uri": "https://localhost:8080/", 431 | "height": 426 432 | }, 433 | "id": "-ja-ESroiVSb", 434 | "outputId": "577cd0f6-588d-48b8-f253-8196fa93f034" 435 | }, 436 | "execution_count": 64, 437 | "outputs": [ 438 | { 439 | "output_type": "display_data", 440 | "data": { 441 | "text/plain": [ 442 | "" 443 | ], 444 | "text/html": [ 445 | "\n", 446 | "
\n", 447 | " \n", 448 | " \n", 449 | " [10/10 01:33, Epoch 0/1]\n", 450 | "
\n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | "
StepTraining Loss
11.025000
22.115400
31.343800
42.544700
50.564300
63.495900
71.927200
80.640500
90.759200
101.656900

" 501 | ] 502 | }, 503 | "metadata": {} 504 | }, 505 | { 506 | "output_type": "execute_result", 507 | "data": { 508 | "text/plain": [ 509 | "TrainOutput(global_step=10, training_loss=1.6072865188121797, metrics={'train_runtime': 103.5639, 'train_samples_per_second': 0.386, 'train_steps_per_second': 0.097, 'total_flos': 407425237647360.0, 'train_loss': 1.6072865188121797, 'epoch': 0.0})" 510 | ] 511 | }, 512 | "metadata": {}, 513 | "execution_count": 64 514 | } 515 | ] 516 | } 517 | ] 518 | } 519 | -------------------------------------------------------------------------------- /gemma_training_grpo_connectfour.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": { 7 | "id": "qaZrW1OHF0KP" 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "# %%capture\n", 12 | "import os\n", 13 | "if \"COLAB_\" not in \"\".join(os.environ.keys()):\n", 14 | " !pip install unsloth vllm\n", 15 | "else:\n", 16 | " # [NOTE] Do the below ONLY in Colab! Use [[pip install unsloth vllm]]\n", 17 | " !pip install --no-deps unsloth vllm\n", 18 | "# Install latest Hugging Face for Gemma-3!\n", 19 | "!pip install --no-deps git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 2, 25 | "metadata": { 26 | "id": "_urJTB8FGi1S" 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "\n", 31 | "#@title Colab Extra Install { display-mode: \"form\" }\n", 32 | "%%capture\n", 33 | "import os\n", 34 | "if \"COLAB_\" not in \"\".join(os.environ.keys()):\n", 35 | " !pip install unsloth vllm\n", 36 | "else:\n", 37 | " !pip install --no-deps unsloth vllm\n", 38 | " # [NOTE] Do the below ONLY in Colab! 
Use [[pip install unsloth vllm]]\n", 39 | " # Skip restarting message in Colab\n", 40 | " import sys, re, requests; modules = list(sys.modules.keys())\n", 41 | " for x in modules: sys.modules.pop(x) if \"PIL\" in x or \"google\" in x else None\n", 42 | " !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl triton cut_cross_entropy unsloth_zoo\n", 43 | " !pip install sentencepiece protobuf datasets huggingface_hub hf_transfer\n", 44 | "\n", 45 | " # vLLM requirements - vLLM breaks Colab due to reinstalling numpy\n", 46 | " f = requests.get(\"https://raw.githubusercontent.com/vllm-project/vllm/refs/heads/main/requirements/common.txt\").content\n", 47 | " with open(\"vllm_requirements.txt\", \"wb\") as file:\n", 48 | " file.write(re.sub(rb\"(transformers|numpy|xformers)[^\\n]{1,}\\n\", b\"\", f))\n", 49 | " !pip install -r vllm_requirements.txt\n" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": null, 55 | "metadata": { 56 | "colab": { 57 | "base_uri": "https://localhost:8080/", 58 | "height": 461, 59 | "referenced_widgets": [ 60 | "61fe9da71b97485f92c6ee2380d46b4d", 61 | "50656ef62a804b908bf05493789cc362", 62 | "b5601c72540c4ec9be8cfbe7b577ef65", 63 | "11788b948d5b46bd88de9235bbb3179d", 64 | "915ba23ce31640e4b487d7d402d203d8", 65 | "5e7f4d73b97142448857a67a66c47b34", 66 | "02e45237f0ac4b5ab7c6cd68b09da051", 67 | "7fed1cef573e47d08300438fe4c158e9", 68 | "222df800dad945e199a7ccd8dfbc7a8d", 69 | "58872b9c5b074813af7b6d9bdc9ba60f", 70 | "3a8ee07769a94ea2a7d675bb113ce709", 71 | "72278d146c6a43c88282be435ff8b7ed", 72 | "e4354fcaac5642ae89ce8a3c69d1b68d", 73 | "aea7cfbba79340e98ca0a00ea6e6a5c2", 74 | "adf9b0c5dacb426aa21ae59db45657ca", 75 | "d0a7b3b62181431b9ca9870c167b983e", 76 | "6da8546c83864bd98dd3bc7568de095b", 77 | "39367aac2c894342ab3ed257ac8c2e4f", 78 | "21c5356744e149d6a2d67fdb9830fbeb", 79 | "92bf50844694495daf0abd9f3fb01ef1", 80 | "29c6198a3adb4d909230ed1eb1e3cd74", 81 | "3f956e7672f34276b78e5e0cadaaedbd", 82 | "3abecbd6292a4b4e94e5213616cf2122", 83 | "72d5f442747f4bfda91945c58033ead2", 84 | "34b06389fce04892a242003bf79610ed", 85 | "9f2bda486bea4880829704c05223562e", 86 | "136e78cc0e764bda86121f1383e5bb0b", 87 | "4441bde44ec44063bed8f43ae4a30a4e", 88 | "1e8533a81384457dab93020c702eb2d6", 89 | "5b2d5be5b4134fe080add6f4e48be59f", 90 | "fa2d656a9fd549dfb64e97546e82591d", 91 | "98b2d0f27f4a40c6a09b5eeb8e6deafd", 92 | "edce2515b0cd4d7298064476d43646a1", 93 | "b451038191ef4a9dadecf1140c32fffe", 94 | "fa4045a862034e428ab40009d04d313e", 95 | "d425bdb1be0144d48c43708e61745b4c", 96 | "fb8e26b8d050464595bf725cf2b2e9d3", 97 | "3ace3d3b4baf4599a6673f58088351d4", 98 | "8001fd2f05784a66b7f3aac77fcf2303", 99 | "39323e862d2f4da089cbd7a7a45b378c", 100 | "b3f74dd34648458ca9df0ed01ab0f559", 101 | "81c6f32b33b0420081e0c7b27299a2b8", 102 | "8a9b97f5d64a4781b6d46134bc5ebff8", 103 | "8bbd54857d834502b94847a6903dcdf9", 104 | "fd1d0df20f1c4b3082d4ee128570c4e5", 105 | "218927b46cf148e6903d64e656449a7c", 106 | "954458b7719f442cafd3409ab7a90116", 107 | "ac32520514b34a2fbe86b90fccbf9bae", 108 | "c35c454a893a4edfae1f27a6707cb77c", 109 | "8d3e0594460d4e149c88f1193f5bed6e", 110 | "6f66bbc9f1084c8899c64cefd3b0934a", 111 | "3ffcec797ef542808540f5dd4ed7650b", 112 | "430dbe6044ac4c19a72fb996ddc9fbd1", 113 | "b9e417466b8940a39b9e5755576c1c5c", 114 | "b07788470b864279a763197ae593f1bd", 115 | "6009acf8a81645209f3d94660fdd5fea", 116 | "1d3b2ea61ff54562a38e66b6d8d36a0f", 117 | "d526fa7789664d2f87e9ad1d370cb848", 118 | "9ff84d1955eb474f87e3691ce38c98de", 119 | 
"2eded99c82784f55b140d8d92b0d9a1c", 120 | "221361471b7d49d7a4e7e23d6c84fe4f", 121 | "00c56ee901c4434583ac71772ed9a7ac", 122 | "525bd5ebb77e45ac8654b01651388c97", 123 | "d84e7c4018c1448d9909cd46b7edf68d", 124 | "b52f5ea85a644ddba56654b59d564db1", 125 | "53f024e3fff24976958b2bf43794c28d", 126 | "c90552efb9e948be970c16b2b56e5042", 127 | "44c5ff8f8c174a1a994062849c8488f7", 128 | "e27560d2924647fb8123edbeac778d6f", 129 | "3a487f0804d34da58bcd2afc2234027e", 130 | "c50e91d15c7a4919bb2d048ef4aa51ef", 131 | "77b0e8db32ce4558961c1871ecd2a147", 132 | "4cf0111ca172405fb7b08f337d82f911", 133 | "c8fb24696fc14bbb81827eb5fad2dbc0", 134 | "63040bada9764ff8b5dac77d22a90008", 135 | "0be87abfc633419c834932c46174e3b9", 136 | "dbc76d5ff39b449691b110d839594912" 137 | ] 138 | }, 139 | "id": "WJWGzYdiGmWP", 140 | "outputId": "2bca917a-f14a-4f11-81a6-75652249e418" 141 | }, 142 | "outputs": [], 143 | "source": [ 144 | "from unsloth import FastModel\n", 145 | "import torch\n", 146 | "max_seq_length = 1024\n", 147 | "\n", 148 | "fourbit_models = [\n", 149 | " # 4bit dynamic quants for superior accuracy and low memory use\n", 150 | " \"unsloth/gemma-3-1b-it-unsloth-bnb-4bit\",\n", 151 | " \"unsloth/gemma-3-4b-it-unsloth-bnb-4bit\",\n", 152 | " \"unsloth/gemma-3-12b-it-unsloth-bnb-4bit\",\n", 153 | " \"unsloth/gemma-3-27b-it-unsloth-bnb-4bit\",\n", 154 | "\n", 155 | " # Other popular models!\n", 156 | " \"unsloth/Llama-3.1-8B\",\n", 157 | " \"unsloth/Llama-3.2-3B\",\n", 158 | " \"unsloth/Llama-3.3-70B\",\n", 159 | " \"unsloth/mistral-7b-instruct-v0.3\",\n", 160 | " \"unsloth/Phi-4\",\n", 161 | "] # More models at https://huggingface.co/unsloth\n", 162 | "\n", 163 | "model, tokenizer = FastModel.from_pretrained(\n", 164 | " model_name = \"unsloth/gemma-3-1b-it\",\n", 165 | " max_seq_length = max_seq_length, # Choose any for long context!\n", 166 | " load_in_4bit = False, # 4 bit quantization to reduce memory\n", 167 | " load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory\n", 168 | " full_finetuning = False, # [NEW!] We have full finetuning now!\n", 169 | " # token = \"hf_...\", # use one if using gated models\n", 170 | ")\n", 171 | "\n", 172 | "model = FastModel.get_peft_model(\n", 173 | " model,\n", 174 | " finetune_vision_layers = False, # Turn off for just text!\n", 175 | " finetune_language_layers = True, # Should leave on!\n", 176 | " finetune_attention_modules = True, # Attention good for GRPO\n", 177 | " finetune_mlp_modules = True, # Should leave on always!\n", 178 | "\n", 179 | " r = 8, # Larger = higher accuracy, but might overfit\n", 180 | " lora_alpha = 8, # Recommended alpha == r at least\n", 181 | " lora_dropout = 0,\n", 182 | " bias = \"none\",\n", 183 | " random_state = 3407,\n", 184 | ")" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "metadata": { 190 | "id": "lOkYETaG_ixw" 191 | }, 192 | "source": [ 193 | "# Creating our Connect Four Games Dataset" 194 | ] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "metadata": { 199 | "id": "xlm51ffR_VkY" 200 | }, 201 | "source": [ 202 | "Below is our method to generate lots of games of connect four. I collect games up to a point where one player wins. I can then ask the model what move it needs to take to win the game if its either X or O! I also then know what the answer should be which I can rewards a model on." 
203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": { 209 | "id": "gSUfxDWG_TXg" 210 | }, 211 | "outputs": [], 212 | "source": [ 213 | "import numpy as np\n", 214 | "import random\n", 215 | "import pandas as pd\n", 216 | "\n", 217 | "ROWS = 6\n", 218 | "COLS = 7\n", 219 | "\n", 220 | "def create_board():\n", 221 | " \"\"\"Creates an empty Connect Four board.\"\"\"\n", 222 | " return np.full((ROWS, COLS), '.', dtype=str)\n", 223 | "\n", 224 | "def drop_piece(board, col, piece):\n", 225 | " \"\"\"Drops a piece in the given column and returns the row where it landed.\"\"\"\n", 226 | " for row in range(ROWS - 1, -1, -1): # start from the bottom row\n", 227 | " if board[row, col] == '.':\n", 228 | " board[row, col] = piece\n", 229 | " return row\n", 230 | " return None # Column is full\n", 231 | "\n", 232 | "def is_winning_move(board, row, col, piece):\n", 233 | " \"\"\"Checks if placing a piece at (row, col) wins the game.\n", 234 | " It checks vertical, horizontal, and both diagonal directions.\"\"\"\n", 235 | " directions = [\n", 236 | " (1, 0), # vertical (down)\n", 237 | " (0, 1), # horizontal (right)\n", 238 | " (1, 1), # diagonal ↘\n", 239 | " (1, -1) # diagonal ↙\n", 240 | " ]\n", 241 | " for dr, dc in directions:\n", 242 | " count = 1 # count the piece just placed\n", 243 | " for d in [-1, 1]: # check both directions along (dr, dc)\n", 244 | " r, c = row + d * dr, col + d * dc\n", 245 | " while 0 <= r < ROWS and 0 <= c < COLS and board[r, c] == piece:\n", 246 | " count += 1\n", 247 | " if count == 4:\n", 248 | " return True\n", 249 | " r += d * dr\n", 250 | " c += d * dc\n", 251 | " return False\n", 252 | "\n", 253 | "def get_almost_winning_boards(num_games=1000):\n", 254 | " \"\"\"\n", 255 | " Simulates games and collects board states that are just one move away from winning.\n", 256 | " For each board state, the function returns a tuple with:\n", 257 | " (board state copy, winning player, winning column)\n", 258 | " \"\"\"\n", 259 | " winning_positions = []\n", 260 | "\n", 261 | " # Run through multiple game simulations.\n", 262 | " for _ in range(num_games):\n", 263 | " board = create_board()\n", 264 | " current_piece = \"X\"\n", 265 | "\n", 266 | " # Simulate moves until board is full or a winning move is found.\n", 267 | " for _ in range(ROWS * COLS):\n", 268 | " available_columns = [c for c in range(COLS) if board[0, c] == '.']\n", 269 | " if not available_columns:\n", 270 | " break # board is full\n", 271 | "\n", 272 | " col = random.choice(available_columns)\n", 273 | " row = drop_piece(board, col, current_piece)\n", 274 | "\n", 275 | " # Check if the last move is a winning move.\n", 276 | " if is_winning_move(board, row, col, current_piece):\n", 277 | " # Remove the winning move to get the board state just before the win.\n", 278 | " board[row, col] = '.'\n", 279 | " # Save the board state, the winning player, and the column to win.\n", 280 | " winning_positions.append((board.copy(), current_piece, col))\n", 281 | " break\n", 282 | "\n", 283 | " # Switch the player.\n", 284 | " current_piece = \"O\" if current_piece == \"X\" else \"X\"\n", 285 | "\n", 286 | " return winning_positions\n", 287 | "\n", 288 | "# Generate board states that are one move away from winning.\n", 289 | "almost_winning_boards = get_almost_winning_boards(num_games=1000)\n", 290 | "\n", 291 | "# Convert board states into a DataFrame with one board per row.\n", 292 | "# Each row will include a label for the game, the winning player, the winning column,\n", 
293 | "# and the board rows (from top row 0 to bottom row ROWS-1).\n", 294 | "board_data = []\n", 295 | "for idx, (board, winner, win_col) in enumerate(almost_winning_boards):\n", 296 | " # Convert each row (a numpy array) to a string for display.\n", 297 | " rows_as_str = [\"\".join(row) for row in board]\n", 298 | " board_data.append([f\"Game {idx+1}\", winner, win_col] + rows_as_str)\n", 299 | "\n", 300 | "# Create DataFrame columns: Game, Winning Player, Winning Column, and one column for each board row.\n", 301 | "columns = [\"Game\", \"Winning Player\", \"Winning Column\"] + [f\"Row {i}\" for i in range(ROWS)]\n", 302 | "df = pd.DataFrame(board_data, columns=columns)\n", 303 | "df.to_csv(\"games.csv\")\n" 304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "metadata": { 310 | "id": "joRvKHzX_wtI" 311 | }, 312 | "outputs": [], 313 | "source": [] 314 | }, 315 | { 316 | "cell_type": "markdown", 317 | "metadata": { 318 | "id": "cyJVhyie_yK6" 319 | }, 320 | "source": [ 321 | "# Training & Rewarding Gemma 1B" 322 | ] 323 | }, 324 | { 325 | "cell_type": "markdown", 326 | "metadata": { 327 | "id": "0nkSkVNnAOxT" 328 | }, 329 | "source": [ 330 | "This section of the code focuses on preparing the Connect Four game data and structuring the interaction with the AI model. It begins by importing necessary libraries like Pandas for data handling and re for regular expressions. The Connect Four game data is loaded from a CSV file named \"games.csv\" into a Pandas DataFrame. To guide the AI, a system prompt is defined, instructing it to provide reasoning and solutions within specific tags.\n", 331 | "\n", 332 | "The format_puzzle function then transforms game states from the DataFrame into prompts for the AI, including a visual representation of the board and instructions. To ensure the AI's responses adhere to the desired format and to extract the predicted move, regular expressions are utilized. Finally, a check_answer function, is sued as part of the reward system, it assesses the accuracy of the AI's predictions during training by comparing them to the correct moves stored in our CSV. This setup lays the groundwork for effectively training and evaluating the AI's performance in playing Connect Four over time.\n", 333 | "\n", 334 | "The core of the training process relies on the GRPO algorithm. GRPO scores the AI's responses using a combination of reward functions which we just mentioned, assessing aspects like output format adherence and prediction accuracy. These scores are then used to calculate gradients that guide the model's learning process. Importantly, training is conducted using LORA (Low-Rank Adaptation), a parameter-efficient fine-tuning technique. LORA enables fine-tuning specific model layers while keeping the majority of the pre-trained weights frozen, leading to faster training and reduced memory requirements. 
Ultimately, this process aims to enhance the AI's ability to predict winning moves in Connect Four, progressively refining its performance through iterative training and feedback" 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": 7, 340 | "metadata": { 341 | "id": "gHfF21maF0H3" 342 | }, 343 | "outputs": [], 344 | "source": [ 345 | "import pandas as pd\n", 346 | "import re\n", 347 | "\n", 348 | "# Load the games CSV - this is a automated csv\n", 349 | "df = pd.read_csv(\"games.csv\")\n", 350 | "\n", 351 | "# Format constants for responses\n", 352 | "reasoning_start = \"\"\n", 353 | "reasoning_end = \"\"\n", 354 | "solution_start = \"\"\n", 355 | "solution_end = \"\"\n", 356 | "\n", 357 | "system_prompt = \\\n", 358 | "f\"\"\"You are given a problem.\n", 359 | "Think about the problem and provide your working out.\n", 360 | "Place it between {reasoning_start} and {reasoning_end}.\n", 361 | "Then, provide your solution between {solution_start}{solution_end}\"\"\"\n", 362 | "system_prompt\n", 363 | "\n", 364 | "def format_puzzle(row):\n", 365 | " \"\"\"Formats a Connect 4 board from a DataFrame row into a prompt.\"\"\"\n", 366 | "\n", 367 | " board_rows = [df.loc[row, f'Row {i}'] for i in range(6)] # Display from bottom to top\n", 368 | " board_str = \"1234567\\n \" + \"\\n \".join(board_rows) # Add column numbers at the top\n", 369 | " winning_player = df.loc[row, f'Winning Player']\n", 370 | " winning_column = df.loc[row, f'Winning Column']\n", 371 | " # Create the prompt\n", 372 | " prompt = f\"\"\"\n", 373 | " You are a connect four master.\n", 374 | "\n", 375 | " board position:\n", 376 | " {board_str}\n", 377 | "\n", 378 | " It is {winning_player}'s turn to move. Find where they should move their piece.\n", 379 | " Columns are labelled from 1-7. 
Choose the best column to win the game.\n", 380 | "\n", 381 | " {reasoning_start}\n", 382 | " As {winning_player} which number column should you place your column to win the game and connect four?\n", 383 | " {reasoning_end}\n", 384 | "\n", 385 | " {solution_start}\"\"\"\n", 386 | "\n", 387 | " return prompt\n", 388 | "\n", 389 | "\n", 390 | "# Compile regex to check that the AI output follows the expected format\n", 391 | "match_format = re.compile(\n", 392 | " rf\"^[\\s]*\"\\\n", 393 | " rf\"{reasoning_start}.+?{reasoning_end}.*?\"\\\n", 394 | " rf\"{solution_start}(.+?){solution_end}\"\\\n", 395 | " rf\"[\\s]*$\",\n", 396 | " flags = re.MULTILINE | re.DOTALL\n", 397 | ")\n", 398 | "\n", 399 | "def extract_move(response):\n", 400 | " \"\"\"Extracts the move number from the AI response.\"\"\"\n", 401 | " move_pattern = re.search(r\"\\b([1-7])\\b\", response) # Look for a single-digit column number\n", 402 | " return move_pattern.group(1) if move_pattern else None\n", 403 | "\n", 404 | "def check_answer(prompts, completions, answer, **kwargs):\n", 405 | " \"\"\"Reward function comparing extracted move with true answer.\"\"\"\n", 406 | " # Handle different completion formats\n", 407 | " responses = []\n", 408 | " for completion in completions:\n", 409 | " if isinstance(completion, list) and len(completion) > 0 and isinstance(completion[0], dict) and \"content\" in completion[0]:\n", 410 | " responses.append(completion[0][\"content\"])\n", 411 | " elif isinstance(completion, dict) and \"content\" in completion:\n", 412 | " responses.append(completion[\"content\"])\n", 413 | " else:\n", 414 | " # Handle case where completion is already a string\n", 415 | " responses.append(completion)\n", 416 | "\n", 417 | " extracted_responses = []\n", 418 | " for r in responses:\n", 419 | " if not isinstance(r, str):\n", 420 | " extracted_responses.append(None)\n", 421 | " continue\n", 422 | "\n", 423 | " match = match_format.search(r)\n", 424 | " if match is not None:\n", 425 | " guess = match.group(1)\n", 426 | " # Further extract the number from the solution section\n", 427 | " num_match = re.search(r\"\\b([1-7])\\b\", guess)\n", 428 | " extracted_responses.append(num_match.group(1) if num_match else None)\n", 429 | " else:\n", 430 | " extracted_responses.append(None)\n", 431 | "\n", 432 | " scores = []\n", 433 | " for guess, true_answer in zip(extracted_responses, answer):\n", 434 | " # Convert true_answer to int, add 1, then convert back to string\n", 435 | " adjusted_true_answer = str(int(true_answer) + 1)\n", 436 | "\n", 437 | " score = 0\n", 438 | " if guess is None:\n", 439 | " scores.append(0)\n", 440 | " continue\n", 441 | " # Correct answer gets 3 points!\n", 442 | " if guess == adjusted_true_answer:\n", 443 | " score += 3.0\n", 444 | " # If stripping whitespace makes them match, reward partially\n", 445 | " elif guess.strip() == adjusted_true_answer.strip():\n", 446 | " score += 1.5\n", 447 | " else:\n", 448 | " # Also check if the answer is close via ratio comparison\n", 449 | " try:\n", 450 | " ratio = float(guess) / float(adjusted_true_answer)\n", 451 | " if 0.9 <= ratio <= 1.1:\n", 452 | " score += 0.5\n", 453 | " elif 0.8 <= ratio <= 1.2:\n", 454 | " score += 0.25\n", 455 | " else:\n", 456 | " score -= 1.0 # Penalize wrong answers\n", 457 | " except:\n", 458 | " score -= 0.5 # Penalize non-numeric or badly formatted answers\n", 459 | " scores.append(score)\n", 460 | " return scores\n", 461 | "\n", 462 | "def match_format_exactly(prompts, completions, answer, **kwargs):\n", 463 | " 
\"\"\"Reward full credit if the output strictly follows the expected format.\"\"\"\n", 464 | " responses = [comp[0][\"content\"] if isinstance(comp, list) and isinstance(comp[0], dict) else comp for comp in completions]\n", 465 | " scores = []\n", 466 | " for r in responses:\n", 467 | " if match_format.fullmatch(r.strip()):\n", 468 | " scores.append(1.0)\n", 469 | " else:\n", 470 | " scores.append(0.0)\n", 471 | " return scores\n", 472 | "\n", 473 | "\n", 474 | "def match_format_exactly(prompts, completions, answer, **kwargs):\n", 475 | " \"\"\"Reward full credit if the output strictly follows the expected format.\"\"\"\n", 476 | " responses = []\n", 477 | " for completion in completions:\n", 478 | " if isinstance(completion, list) and len(completion) > 0 and isinstance(completion[0], dict) and \"content\" in completion[0]:\n", 479 | " responses.append(completion[0][\"content\"])\n", 480 | " elif isinstance(completion, dict) and \"content\" in completion:\n", 481 | " responses.append(completion[\"content\"])\n", 482 | " else:\n", 483 | " # Handle case where completion is already a string\n", 484 | " responses.append(completion)\n", 485 | "\n", 486 | " scores = []\n", 487 | " for r in responses:\n", 488 | " if isinstance(r, str) and match_format.search(r):\n", 489 | " scores.append(1.0)\n", 490 | " else:\n", 491 | " scores.append(0.0)\n", 492 | " return scores\n", 493 | "\n", 494 | "def match_format_approximately(prompts, completions, answer, **kwargs):\n", 495 | " \"\"\"Reward if key formatting tags are present in the output.\"\"\"\n", 496 | " responses = []\n", 497 | " for completion in completions:\n", 498 | " if isinstance(completion, list) and len(completion) > 0 and isinstance(completion[0], dict) and \"content\" in completion[0]:\n", 499 | " responses.append(completion[0][\"content\"])\n", 500 | " elif isinstance(completion, dict) and \"content\" in completion:\n", 501 | " responses.append(completion[\"content\"])\n", 502 | " else:\n", 503 | " # Handle case where completion is already a string\n", 504 | " responses.append(completion)\n", 505 | "\n", 506 | " scores = []\n", 507 | " for r in responses:\n", 508 | " if isinstance(r, str) and reasoning_start in r and reasoning_end in r and solution_start in r and solution_end in r:\n", 509 | " scores.append(0.5)\n", 510 | " else:\n", 511 | " scores.append(0.0)\n", 512 | " return scores\n", 513 | "\n", 514 | "def check_numbers(prompts, completions, answer, **kwargs):\n", 515 | " \"\"\"Reward if the extracted number from the output is correct.\"\"\"\n", 516 | " responses = []\n", 517 | " for completion in completions:\n", 518 | " if isinstance(completion, list) and len(completion) > 0 and isinstance(completion[0], dict) and \"content\" in completion[0]:\n", 519 | " responses.append(completion[0][\"content\"])\n", 520 | " elif isinstance(completion, dict) and \"content\" in completion:\n", 521 | " responses.append(completion[\"content\"])\n", 522 | " else:\n", 523 | " # Handle case where completion is already a string\n", 524 | " responses.append(completion)\n", 525 | "\n", 526 | " scores = []\n", 527 | " for r, true in zip(responses, answer):\n", 528 | " if not isinstance(r, str):\n", 529 | " scores.append(0.0)\n", 530 | " continue\n", 531 | "\n", 532 | " num = extract_move(r)\n", 533 | " # Convert true to int, add 1, then convert back to string\n", 534 | " adjusted_true = str(int(true) + 1)\n", 535 | "\n", 536 | " if num == adjusted_true:\n", 537 | " scores.append(1.0)\n", 538 | " else:\n", 539 | " scores.append(0.0)\n", 540 | " return 
scores\n", 541 | "# Create a simple dataset that cycles through the games CSV\n", 542 | "class ConnectFourDataset:\n", 543 | " def __init__(self, dataframe):\n", 544 | " self.dataframe = dataframe\n", 545 | "\n", 546 | " def __len__(self):\n", 547 | " return len(self.dataframe)\n", 548 | "\n", 549 | " def __getitem__(self, idx):\n", 550 | " prompt = format_puzzle(idx)\n", 551 | " # The answer is the winning column (as a string)\n", 552 | " # Keep as 0-based index from CSV, conversion happens in check functions\n", 553 | " answer = str(self.dataframe.loc[idx, 'Winning Column'])\n", 554 | " return {'prompt': prompt, 'answer': answer}\n", 555 | "\n", 556 | "dataset = ConnectFourDataset(df)" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": null, 562 | "metadata": { 563 | "colab": { 564 | "base_uri": "https://localhost:8080/", 565 | "height": 1000 566 | }, 567 | "id": "yVvVk4BuF0FK", 568 | "outputId": "dfdc5c66-7ef3-4992-c5a5-3018aac18ad6" 569 | }, 570 | "outputs": [], 571 | "source": [ 572 | "# Set max prompt length and import GRPO trainer components\n", 573 | "max_prompt_length = 256\n", 574 | "\n", 575 | "from trl import GRPOConfig, GRPOTrainer\n", 576 | "\n", 577 | "training_args = GRPOConfig(\n", 578 | " learning_rate = 5e-6,\n", 579 | " adam_beta1 = 0.9,\n", 580 | " adam_beta2 = 0.99,\n", 581 | " weight_decay = 0.1,\n", 582 | " warmup_ratio = 0.1,\n", 583 | " lr_scheduler_type = \"cosine\",\n", 584 | " optim = \"adamw_torch_fused\",\n", 585 | " logging_steps = 1,\n", 586 | " per_device_train_batch_size = 1,\n", 587 | " gradient_accumulation_steps = 1, # Increase to 4 for smoother training if needed\n", 588 | " num_generations = 4, # Decrease if out of memory\n", 589 | " max_prompt_length = max_prompt_length,\n", 590 | " max_completion_length = max_seq_length - max_prompt_length,\n", 591 | " # num_train_epochs = 1, # Uncomment for a full training run\n", 592 | " max_steps = 50,\n", 593 | " save_steps = 50,\n", 594 | " max_grad_norm = 0.1,\n", 595 | " report_to = \"none\", # Can use Weights & Biases if desired\n", 596 | " output_dir = \"outputs\",\n", 597 | ")\n", 598 | "\n", 599 | "trainer = GRPOTrainer(\n", 600 | " model = model,\n", 601 | " processing_class = tokenizer,\n", 602 | " reward_funcs = [\n", 603 | " match_format_exactly,\n", 604 | " match_format_approximately,\n", 605 | " check_answer,\n", 606 | " check_numbers,\n", 607 | " ],\n", 608 | " args = training_args,\n", 609 | " train_dataset = dataset,\n", 610 | ")\n", 611 | "\n", 612 | "# Start training with GRPO!\n", 613 | "trainer.train()" 614 | ] 615 | } 616 | ], 617 | "metadata": { 618 | "accelerator": "GPU", 619 | "colab": { 620 | "gpuType": "T4", 621 | "provenance": [] 622 | }, 623 | "kernelspec": { 624 | "display_name": "Python 3", 625 | "name": "python3" 626 | }, 627 | "language_info": { 628 | "name": "python" 629 | }, 630 | "widgets": { 631 | "application/vnd.jupyter.widget-state+json": { 632 | "00c56ee901c4434583ac71772ed9a7ac": { 633 | "model_module": "@jupyter-widgets/controls", 634 | "model_module_version": "1.5.0", 635 | "model_name": "DescriptionStyleModel", 636 | "state": { 637 | "_model_module": "@jupyter-widgets/controls", 638 | "_model_module_version": "1.5.0", 639 | "_model_name": "DescriptionStyleModel", 640 | "_view_count": null, 641 | "_view_module": "@jupyter-widgets/base", 642 | "_view_module_version": "1.2.0", 643 | "_view_name": "StyleView", 644 | "description_width": "" 645 | } 646 | }, 647 | "02e45237f0ac4b5ab7c6cd68b09da051": { 648 | "model_module": 
"@jupyter-widgets/controls", 649 | "model_module_version": "1.5.0", 650 | "model_name": "DescriptionStyleModel", 651 | "state": { 652 | "_model_module": "@jupyter-widgets/controls", 653 | "_model_module_version": "1.5.0", 654 | "_model_name": "DescriptionStyleModel", 655 | "_view_count": null, 656 | "_view_module": "@jupyter-widgets/base", 657 | "_view_module_version": "1.2.0", 658 | "_view_name": "StyleView", 659 | "description_width": "" 660 | } 661 | }, 662 | "0be87abfc633419c834932c46174e3b9": { 663 | "model_module": "@jupyter-widgets/base", 664 | "model_module_version": "1.2.0", 665 | "model_name": "LayoutModel", 666 | "state": { 667 | "_model_module": "@jupyter-widgets/base", 668 | "_model_module_version": "1.2.0", 669 | "_model_name": "LayoutModel", 670 | "_view_count": null, 671 | "_view_module": "@jupyter-widgets/base", 672 | "_view_module_version": "1.2.0", 673 | "_view_name": "LayoutView", 674 | "align_content": null, 675 | "align_items": null, 676 | "align_self": null, 677 | "border": null, 678 | "bottom": null, 679 | "display": null, 680 | "flex": null, 681 | "flex_flow": null, 682 | "grid_area": null, 683 | "grid_auto_columns": null, 684 | "grid_auto_flow": null, 685 | "grid_auto_rows": null, 686 | "grid_column": null, 687 | "grid_gap": null, 688 | "grid_row": null, 689 | "grid_template_areas": null, 690 | "grid_template_columns": null, 691 | "grid_template_rows": null, 692 | "height": null, 693 | "justify_content": null, 694 | "justify_items": null, 695 | "left": null, 696 | "margin": null, 697 | "max_height": null, 698 | "max_width": null, 699 | "min_height": null, 700 | "min_width": null, 701 | "object_fit": null, 702 | "object_position": null, 703 | "order": null, 704 | "overflow": null, 705 | "overflow_x": null, 706 | "overflow_y": null, 707 | "padding": null, 708 | "right": null, 709 | "top": null, 710 | "visibility": null, 711 | "width": null 712 | } 713 | }, 714 | "11788b948d5b46bd88de9235bbb3179d": { 715 | "model_module": "@jupyter-widgets/controls", 716 | "model_module_version": "1.5.0", 717 | "model_name": "HTMLModel", 718 | "state": { 719 | "_dom_classes": [], 720 | "_model_module": "@jupyter-widgets/controls", 721 | "_model_module_version": "1.5.0", 722 | "_model_name": "HTMLModel", 723 | "_view_count": null, 724 | "_view_module": "@jupyter-widgets/controls", 725 | "_view_module_version": "1.5.0", 726 | "_view_name": "HTMLView", 727 | "description": "", 728 | "description_tooltip": null, 729 | "layout": "IPY_MODEL_58872b9c5b074813af7b6d9bdc9ba60f", 730 | "placeholder": "​", 731 | "style": "IPY_MODEL_3a8ee07769a94ea2a7d675bb113ce709", 732 | "value": " 2.00G/2.00G [00:17<00:00, 756MB/s]" 733 | } 734 | }, 735 | "136e78cc0e764bda86121f1383e5bb0b": { 736 | "model_module": "@jupyter-widgets/base", 737 | "model_module_version": "1.2.0", 738 | "model_name": "LayoutModel", 739 | "state": { 740 | "_model_module": "@jupyter-widgets/base", 741 | "_model_module_version": "1.2.0", 742 | "_model_name": "LayoutModel", 743 | "_view_count": null, 744 | "_view_module": "@jupyter-widgets/base", 745 | "_view_module_version": "1.2.0", 746 | "_view_name": "LayoutView", 747 | "align_content": null, 748 | "align_items": null, 749 | "align_self": null, 750 | "border": null, 751 | "bottom": null, 752 | "display": null, 753 | "flex": null, 754 | "flex_flow": null, 755 | "grid_area": null, 756 | "grid_auto_columns": null, 757 | "grid_auto_flow": null, 758 | "grid_auto_rows": null, 759 | "grid_column": null, 760 | "grid_gap": null, 761 | "grid_row": null, 762 | "grid_template_areas": null, 
763 | "grid_template_columns": null, 764 | "grid_template_rows": null, 765 | "height": null, 766 | "justify_content": null, 767 | "justify_items": null, 768 | "left": null, 769 | "margin": null, 770 | "max_height": null, 771 | "max_width": null, 772 | "min_height": null, 773 | "min_width": null, 774 | "object_fit": null, 775 | "object_position": null, 776 | "order": null, 777 | "overflow": null, 778 | "overflow_x": null, 779 | "overflow_y": null, 780 | "padding": null, 781 | "right": null, 782 | "top": null, 783 | "visibility": null, 784 | "width": null 785 | } 786 | }, 787 | "1d3b2ea61ff54562a38e66b6d8d36a0f": { 788 | "model_module": "@jupyter-widgets/controls", 789 | "model_module_version": "1.5.0", 790 | "model_name": "HTMLModel", 791 | "state": { 792 | "_dom_classes": [], 793 | "_model_module": "@jupyter-widgets/controls", 794 | "_model_module_version": "1.5.0", 795 | "_model_name": "HTMLModel", 796 | "_view_count": null, 797 | "_view_module": "@jupyter-widgets/controls", 798 | "_view_module_version": "1.5.0", 799 | "_view_name": "HTMLView", 800 | "description": "", 801 | "description_tooltip": null, 802 | "layout": "IPY_MODEL_221361471b7d49d7a4e7e23d6c84fe4f", 803 | "placeholder": "​", 804 | "style": "IPY_MODEL_00c56ee901c4434583ac71772ed9a7ac", 805 | "value": "added_tokens.json: 100%" 806 | } 807 | }, 808 | "1e8533a81384457dab93020c702eb2d6": { 809 | "model_module": "@jupyter-widgets/controls", 810 | "model_module_version": "1.5.0", 811 | "model_name": "DescriptionStyleModel", 812 | "state": { 813 | "_model_module": "@jupyter-widgets/controls", 814 | "_model_module_version": "1.5.0", 815 | "_model_name": "DescriptionStyleModel", 816 | "_view_count": null, 817 | "_view_module": "@jupyter-widgets/base", 818 | "_view_module_version": "1.2.0", 819 | "_view_name": "StyleView", 820 | "description_width": "" 821 | } 822 | }, 823 | "218927b46cf148e6903d64e656449a7c": { 824 | "model_module": "@jupyter-widgets/controls", 825 | "model_module_version": "1.5.0", 826 | "model_name": "HTMLModel", 827 | "state": { 828 | "_dom_classes": [], 829 | "_model_module": "@jupyter-widgets/controls", 830 | "_model_module_version": "1.5.0", 831 | "_model_name": "HTMLModel", 832 | "_view_count": null, 833 | "_view_module": "@jupyter-widgets/controls", 834 | "_view_module_version": "1.5.0", 835 | "_view_name": "HTMLView", 836 | "description": "", 837 | "description_tooltip": null, 838 | "layout": "IPY_MODEL_8d3e0594460d4e149c88f1193f5bed6e", 839 | "placeholder": "​", 840 | "style": "IPY_MODEL_6f66bbc9f1084c8899c64cefd3b0934a", 841 | "value": "tokenizer.json: 100%" 842 | } 843 | }, 844 | "21c5356744e149d6a2d67fdb9830fbeb": { 845 | "model_module": "@jupyter-widgets/base", 846 | "model_module_version": "1.2.0", 847 | "model_name": "LayoutModel", 848 | "state": { 849 | "_model_module": "@jupyter-widgets/base", 850 | "_model_module_version": "1.2.0", 851 | "_model_name": "LayoutModel", 852 | "_view_count": null, 853 | "_view_module": "@jupyter-widgets/base", 854 | "_view_module_version": "1.2.0", 855 | "_view_name": "LayoutView", 856 | "align_content": null, 857 | "align_items": null, 858 | "align_self": null, 859 | "border": null, 860 | "bottom": null, 861 | "display": null, 862 | "flex": null, 863 | "flex_flow": null, 864 | "grid_area": null, 865 | "grid_auto_columns": null, 866 | "grid_auto_flow": null, 867 | "grid_auto_rows": null, 868 | "grid_column": null, 869 | "grid_gap": null, 870 | "grid_row": null, 871 | "grid_template_areas": null, 872 | "grid_template_columns": null, 873 | "grid_template_rows": null, 874 
| "height": null, 875 | "justify_content": null, 876 | "justify_items": null, 877 | "left": null, 878 | "margin": null, 879 | "max_height": null, 880 | "max_width": null, 881 | "min_height": null, 882 | "min_width": null, 883 | "object_fit": null, 884 | "object_position": null, 885 | "order": null, 886 | "overflow": null, 887 | "overflow_x": null, 888 | "overflow_y": null, 889 | "padding": null, 890 | "right": null, 891 | "top": null, 892 | "visibility": null, 893 | "width": null 894 | } 895 | }, 896 | "221361471b7d49d7a4e7e23d6c84fe4f": { 897 | "model_module": "@jupyter-widgets/base", 898 | "model_module_version": "1.2.0", 899 | "model_name": "LayoutModel", 900 | "state": { 901 | "_model_module": "@jupyter-widgets/base", 902 | "_model_module_version": "1.2.0", 903 | "_model_name": "LayoutModel", 904 | "_view_count": null, 905 | "_view_module": "@jupyter-widgets/base", 906 | "_view_module_version": "1.2.0", 907 | "_view_name": "LayoutView", 908 | "align_content": null, 909 | "align_items": null, 910 | "align_self": null, 911 | "border": null, 912 | "bottom": null, 913 | "display": null, 914 | "flex": null, 915 | "flex_flow": null, 916 | "grid_area": null, 917 | "grid_auto_columns": null, 918 | "grid_auto_flow": null, 919 | "grid_auto_rows": null, 920 | "grid_column": null, 921 | "grid_gap": null, 922 | "grid_row": null, 923 | "grid_template_areas": null, 924 | "grid_template_columns": null, 925 | "grid_template_rows": null, 926 | "height": null, 927 | "justify_content": null, 928 | "justify_items": null, 929 | "left": null, 930 | "margin": null, 931 | "max_height": null, 932 | "max_width": null, 933 | "min_height": null, 934 | "min_width": null, 935 | "object_fit": null, 936 | "object_position": null, 937 | "order": null, 938 | "overflow": null, 939 | "overflow_x": null, 940 | "overflow_y": null, 941 | "padding": null, 942 | "right": null, 943 | "top": null, 944 | "visibility": null, 945 | "width": null 946 | } 947 | }, 948 | "222df800dad945e199a7ccd8dfbc7a8d": { 949 | "model_module": "@jupyter-widgets/controls", 950 | "model_module_version": "1.5.0", 951 | "model_name": "ProgressStyleModel", 952 | "state": { 953 | "_model_module": "@jupyter-widgets/controls", 954 | "_model_module_version": "1.5.0", 955 | "_model_name": "ProgressStyleModel", 956 | "_view_count": null, 957 | "_view_module": "@jupyter-widgets/base", 958 | "_view_module_version": "1.2.0", 959 | "_view_name": "StyleView", 960 | "bar_color": null, 961 | "description_width": "" 962 | } 963 | }, 964 | "29c6198a3adb4d909230ed1eb1e3cd74": { 965 | "model_module": "@jupyter-widgets/base", 966 | "model_module_version": "1.2.0", 967 | "model_name": "LayoutModel", 968 | "state": { 969 | "_model_module": "@jupyter-widgets/base", 970 | "_model_module_version": "1.2.0", 971 | "_model_name": "LayoutModel", 972 | "_view_count": null, 973 | "_view_module": "@jupyter-widgets/base", 974 | "_view_module_version": "1.2.0", 975 | "_view_name": "LayoutView", 976 | "align_content": null, 977 | "align_items": null, 978 | "align_self": null, 979 | "border": null, 980 | "bottom": null, 981 | "display": null, 982 | "flex": null, 983 | "flex_flow": null, 984 | "grid_area": null, 985 | "grid_auto_columns": null, 986 | "grid_auto_flow": null, 987 | "grid_auto_rows": null, 988 | "grid_column": null, 989 | "grid_gap": null, 990 | "grid_row": null, 991 | "grid_template_areas": null, 992 | "grid_template_columns": null, 993 | "grid_template_rows": null, 994 | "height": null, 995 | "justify_content": null, 996 | "justify_items": null, 997 | "left": null, 998 | 
"margin": null, 999 | "max_height": null, 1000 | "max_width": null, 1001 | "min_height": null, 1002 | "min_width": null, 1003 | "object_fit": null, 1004 | "object_position": null, 1005 | "order": null, 1006 | "overflow": null, 1007 | "overflow_x": null, 1008 | "overflow_y": null, 1009 | "padding": null, 1010 | "right": null, 1011 | "top": null, 1012 | "visibility": null, 1013 | "width": null 1014 | } 1015 | }, 1016 | "2eded99c82784f55b140d8d92b0d9a1c": { 1017 | "model_module": "@jupyter-widgets/base", 1018 | "model_module_version": "1.2.0", 1019 | "model_name": "LayoutModel", 1020 | "state": { 1021 | "_model_module": "@jupyter-widgets/base", 1022 | "_model_module_version": "1.2.0", 1023 | "_model_name": "LayoutModel", 1024 | "_view_count": null, 1025 | "_view_module": "@jupyter-widgets/base", 1026 | "_view_module_version": "1.2.0", 1027 | "_view_name": "LayoutView", 1028 | "align_content": null, 1029 | "align_items": null, 1030 | "align_self": null, 1031 | "border": null, 1032 | "bottom": null, 1033 | "display": null, 1034 | "flex": null, 1035 | "flex_flow": null, 1036 | "grid_area": null, 1037 | "grid_auto_columns": null, 1038 | "grid_auto_flow": null, 1039 | "grid_auto_rows": null, 1040 | "grid_column": null, 1041 | "grid_gap": null, 1042 | "grid_row": null, 1043 | "grid_template_areas": null, 1044 | "grid_template_columns": null, 1045 | "grid_template_rows": null, 1046 | "height": null, 1047 | "justify_content": null, 1048 | "justify_items": null, 1049 | "left": null, 1050 | "margin": null, 1051 | "max_height": null, 1052 | "max_width": null, 1053 | "min_height": null, 1054 | "min_width": null, 1055 | "object_fit": null, 1056 | "object_position": null, 1057 | "order": null, 1058 | "overflow": null, 1059 | "overflow_x": null, 1060 | "overflow_y": null, 1061 | "padding": null, 1062 | "right": null, 1063 | "top": null, 1064 | "visibility": null, 1065 | "width": null 1066 | } 1067 | }, 1068 | "34b06389fce04892a242003bf79610ed": { 1069 | "model_module": "@jupyter-widgets/controls", 1070 | "model_module_version": "1.5.0", 1071 | "model_name": "FloatProgressModel", 1072 | "state": { 1073 | "_dom_classes": [], 1074 | "_model_module": "@jupyter-widgets/controls", 1075 | "_model_module_version": "1.5.0", 1076 | "_model_name": "FloatProgressModel", 1077 | "_view_count": null, 1078 | "_view_module": "@jupyter-widgets/controls", 1079 | "_view_module_version": "1.5.0", 1080 | "_view_name": "ProgressView", 1081 | "bar_style": "success", 1082 | "description": "", 1083 | "description_tooltip": null, 1084 | "layout": "IPY_MODEL_5b2d5be5b4134fe080add6f4e48be59f", 1085 | "max": 1157007, 1086 | "min": 0, 1087 | "orientation": "horizontal", 1088 | "style": "IPY_MODEL_fa2d656a9fd549dfb64e97546e82591d", 1089 | "value": 1157007 1090 | } 1091 | }, 1092 | "39323e862d2f4da089cbd7a7a45b378c": { 1093 | "model_module": "@jupyter-widgets/controls", 1094 | "model_module_version": "1.5.0", 1095 | "model_name": "DescriptionStyleModel", 1096 | "state": { 1097 | "_model_module": "@jupyter-widgets/controls", 1098 | "_model_module_version": "1.5.0", 1099 | "_model_name": "DescriptionStyleModel", 1100 | "_view_count": null, 1101 | "_view_module": "@jupyter-widgets/base", 1102 | "_view_module_version": "1.2.0", 1103 | "_view_name": "StyleView", 1104 | "description_width": "" 1105 | } 1106 | }, 1107 | "39367aac2c894342ab3ed257ac8c2e4f": { 1108 | "model_module": "@jupyter-widgets/controls", 1109 | "model_module_version": "1.5.0", 1110 | "model_name": "DescriptionStyleModel", 1111 | "state": { 1112 | "_model_module": 
"@jupyter-widgets/controls", 1113 | "_model_module_version": "1.5.0", 1114 | "_model_name": "DescriptionStyleModel", 1115 | "_view_count": null, 1116 | "_view_module": "@jupyter-widgets/base", 1117 | "_view_module_version": "1.2.0", 1118 | "_view_name": "StyleView", 1119 | "description_width": "" 1120 | } 1121 | }, 1122 | "3a487f0804d34da58bcd2afc2234027e": { 1123 | "model_module": "@jupyter-widgets/controls", 1124 | "model_module_version": "1.5.0", 1125 | "model_name": "HTMLModel", 1126 | "state": { 1127 | "_dom_classes": [], 1128 | "_model_module": "@jupyter-widgets/controls", 1129 | "_model_module_version": "1.5.0", 1130 | "_model_name": "HTMLModel", 1131 | "_view_count": null, 1132 | "_view_module": "@jupyter-widgets/controls", 1133 | "_view_module_version": "1.5.0", 1134 | "_view_name": "HTMLView", 1135 | "description": "", 1136 | "description_tooltip": null, 1137 | "layout": "IPY_MODEL_0be87abfc633419c834932c46174e3b9", 1138 | "placeholder": "​", 1139 | "style": "IPY_MODEL_dbc76d5ff39b449691b110d839594912", 1140 | "value": " 670/670 [00:00<00:00, 60.3kB/s]" 1141 | } 1142 | }, 1143 | "3a8ee07769a94ea2a7d675bb113ce709": { 1144 | "model_module": "@jupyter-widgets/controls", 1145 | "model_module_version": "1.5.0", 1146 | "model_name": "DescriptionStyleModel", 1147 | "state": { 1148 | "_model_module": "@jupyter-widgets/controls", 1149 | "_model_module_version": "1.5.0", 1150 | "_model_name": "DescriptionStyleModel", 1151 | "_view_count": null, 1152 | "_view_module": "@jupyter-widgets/base", 1153 | "_view_module_version": "1.2.0", 1154 | "_view_name": "StyleView", 1155 | "description_width": "" 1156 | } 1157 | }, 1158 | "3abecbd6292a4b4e94e5213616cf2122": { 1159 | "model_module": "@jupyter-widgets/controls", 1160 | "model_module_version": "1.5.0", 1161 | "model_name": "HBoxModel", 1162 | "state": { 1163 | "_dom_classes": [], 1164 | "_model_module": "@jupyter-widgets/controls", 1165 | "_model_module_version": "1.5.0", 1166 | "_model_name": "HBoxModel", 1167 | "_view_count": null, 1168 | "_view_module": "@jupyter-widgets/controls", 1169 | "_view_module_version": "1.5.0", 1170 | "_view_name": "HBoxView", 1171 | "box_style": "", 1172 | "children": [ 1173 | "IPY_MODEL_72d5f442747f4bfda91945c58033ead2", 1174 | "IPY_MODEL_34b06389fce04892a242003bf79610ed", 1175 | "IPY_MODEL_9f2bda486bea4880829704c05223562e" 1176 | ], 1177 | "layout": "IPY_MODEL_136e78cc0e764bda86121f1383e5bb0b" 1178 | } 1179 | }, 1180 | "3ace3d3b4baf4599a6673f58088351d4": { 1181 | "model_module": "@jupyter-widgets/base", 1182 | "model_module_version": "1.2.0", 1183 | "model_name": "LayoutModel", 1184 | "state": { 1185 | "_model_module": "@jupyter-widgets/base", 1186 | "_model_module_version": "1.2.0", 1187 | "_model_name": "LayoutModel", 1188 | "_view_count": null, 1189 | "_view_module": "@jupyter-widgets/base", 1190 | "_view_module_version": "1.2.0", 1191 | "_view_name": "LayoutView", 1192 | "align_content": null, 1193 | "align_items": null, 1194 | "align_self": null, 1195 | "border": null, 1196 | "bottom": null, 1197 | "display": null, 1198 | "flex": null, 1199 | "flex_flow": null, 1200 | "grid_area": null, 1201 | "grid_auto_columns": null, 1202 | "grid_auto_flow": null, 1203 | "grid_auto_rows": null, 1204 | "grid_column": null, 1205 | "grid_gap": null, 1206 | "grid_row": null, 1207 | "grid_template_areas": null, 1208 | "grid_template_columns": null, 1209 | "grid_template_rows": null, 1210 | "height": null, 1211 | "justify_content": null, 1212 | "justify_items": null, 1213 | "left": null, 1214 | "margin": null, 1215 | 
"max_height": null, 1216 | "max_width": null, 1217 | "min_height": null, 1218 | "min_width": null, 1219 | "object_fit": null, 1220 | "object_position": null, 1221 | "order": null, 1222 | "overflow": null, 1223 | "overflow_x": null, 1224 | "overflow_y": null, 1225 | "padding": null, 1226 | "right": null, 1227 | "top": null, 1228 | "visibility": null, 1229 | "width": null 1230 | } 1231 | }, 1232 | "3f956e7672f34276b78e5e0cadaaedbd": { 1233 | "model_module": "@jupyter-widgets/controls", 1234 | "model_module_version": "1.5.0", 1235 | "model_name": "DescriptionStyleModel", 1236 | "state": { 1237 | "_model_module": "@jupyter-widgets/controls", 1238 | "_model_module_version": "1.5.0", 1239 | "_model_name": "DescriptionStyleModel", 1240 | "_view_count": null, 1241 | "_view_module": "@jupyter-widgets/base", 1242 | "_view_module_version": "1.2.0", 1243 | "_view_name": "StyleView", 1244 | "description_width": "" 1245 | } 1246 | }, 1247 | "3ffcec797ef542808540f5dd4ed7650b": { 1248 | "model_module": "@jupyter-widgets/base", 1249 | "model_module_version": "1.2.0", 1250 | "model_name": "LayoutModel", 1251 | "state": { 1252 | "_model_module": "@jupyter-widgets/base", 1253 | "_model_module_version": "1.2.0", 1254 | "_model_name": "LayoutModel", 1255 | "_view_count": null, 1256 | "_view_module": "@jupyter-widgets/base", 1257 | "_view_module_version": "1.2.0", 1258 | "_view_name": "LayoutView", 1259 | "align_content": null, 1260 | "align_items": null, 1261 | "align_self": null, 1262 | "border": null, 1263 | "bottom": null, 1264 | "display": null, 1265 | "flex": null, 1266 | "flex_flow": null, 1267 | "grid_area": null, 1268 | "grid_auto_columns": null, 1269 | "grid_auto_flow": null, 1270 | "grid_auto_rows": null, 1271 | "grid_column": null, 1272 | "grid_gap": null, 1273 | "grid_row": null, 1274 | "grid_template_areas": null, 1275 | "grid_template_columns": null, 1276 | "grid_template_rows": null, 1277 | "height": null, 1278 | "justify_content": null, 1279 | "justify_items": null, 1280 | "left": null, 1281 | "margin": null, 1282 | "max_height": null, 1283 | "max_width": null, 1284 | "min_height": null, 1285 | "min_width": null, 1286 | "object_fit": null, 1287 | "object_position": null, 1288 | "order": null, 1289 | "overflow": null, 1290 | "overflow_x": null, 1291 | "overflow_y": null, 1292 | "padding": null, 1293 | "right": null, 1294 | "top": null, 1295 | "visibility": null, 1296 | "width": null 1297 | } 1298 | }, 1299 | "430dbe6044ac4c19a72fb996ddc9fbd1": { 1300 | "model_module": "@jupyter-widgets/controls", 1301 | "model_module_version": "1.5.0", 1302 | "model_name": "ProgressStyleModel", 1303 | "state": { 1304 | "_model_module": "@jupyter-widgets/controls", 1305 | "_model_module_version": "1.5.0", 1306 | "_model_name": "ProgressStyleModel", 1307 | "_view_count": null, 1308 | "_view_module": "@jupyter-widgets/base", 1309 | "_view_module_version": "1.2.0", 1310 | "_view_name": "StyleView", 1311 | "bar_color": null, 1312 | "description_width": "" 1313 | } 1314 | }, 1315 | "4441bde44ec44063bed8f43ae4a30a4e": { 1316 | "model_module": "@jupyter-widgets/base", 1317 | "model_module_version": "1.2.0", 1318 | "model_name": "LayoutModel", 1319 | "state": { 1320 | "_model_module": "@jupyter-widgets/base", 1321 | "_model_module_version": "1.2.0", 1322 | "_model_name": "LayoutModel", 1323 | "_view_count": null, 1324 | "_view_module": "@jupyter-widgets/base", 1325 | "_view_module_version": "1.2.0", 1326 | "_view_name": "LayoutView", 1327 | "align_content": null, 1328 | "align_items": null, 1329 | "align_self": null, 1330 
| "border": null, 1331 | "bottom": null, 1332 | "display": null, 1333 | "flex": null, 1334 | "flex_flow": null, 1335 | "grid_area": null, 1336 | "grid_auto_columns": null, 1337 | "grid_auto_flow": null, 1338 | "grid_auto_rows": null, 1339 | "grid_column": null, 1340 | "grid_gap": null, 1341 | "grid_row": null, 1342 | "grid_template_areas": null, 1343 | "grid_template_columns": null, 1344 | "grid_template_rows": null, 1345 | "height": null, 1346 | "justify_content": null, 1347 | "justify_items": null, 1348 | "left": null, 1349 | "margin": null, 1350 | "max_height": null, 1351 | "max_width": null, 1352 | "min_height": null, 1353 | "min_width": null, 1354 | "object_fit": null, 1355 | "object_position": null, 1356 | "order": null, 1357 | "overflow": null, 1358 | "overflow_x": null, 1359 | "overflow_y": null, 1360 | "padding": null, 1361 | "right": null, 1362 | "top": null, 1363 | "visibility": null, 1364 | "width": null 1365 | } 1366 | }, 1367 | "44c5ff8f8c174a1a994062849c8488f7": { 1368 | "model_module": "@jupyter-widgets/controls", 1369 | "model_module_version": "1.5.0", 1370 | "model_name": "HTMLModel", 1371 | "state": { 1372 | "_dom_classes": [], 1373 | "_model_module": "@jupyter-widgets/controls", 1374 | "_model_module_version": "1.5.0", 1375 | "_model_name": "HTMLModel", 1376 | "_view_count": null, 1377 | "_view_module": "@jupyter-widgets/controls", 1378 | "_view_module_version": "1.5.0", 1379 | "_view_name": "HTMLView", 1380 | "description": "", 1381 | "description_tooltip": null, 1382 | "layout": "IPY_MODEL_77b0e8db32ce4558961c1871ecd2a147", 1383 | "placeholder": "​", 1384 | "style": "IPY_MODEL_4cf0111ca172405fb7b08f337d82f911", 1385 | "value": "special_tokens_map.json: 100%" 1386 | } 1387 | }, 1388 | "4cf0111ca172405fb7b08f337d82f911": { 1389 | "model_module": "@jupyter-widgets/controls", 1390 | "model_module_version": "1.5.0", 1391 | "model_name": "DescriptionStyleModel", 1392 | "state": { 1393 | "_model_module": "@jupyter-widgets/controls", 1394 | "_model_module_version": "1.5.0", 1395 | "_model_name": "DescriptionStyleModel", 1396 | "_view_count": null, 1397 | "_view_module": "@jupyter-widgets/base", 1398 | "_view_module_version": "1.2.0", 1399 | "_view_name": "StyleView", 1400 | "description_width": "" 1401 | } 1402 | }, 1403 | "50656ef62a804b908bf05493789cc362": { 1404 | "model_module": "@jupyter-widgets/controls", 1405 | "model_module_version": "1.5.0", 1406 | "model_name": "HTMLModel", 1407 | "state": { 1408 | "_dom_classes": [], 1409 | "_model_module": "@jupyter-widgets/controls", 1410 | "_model_module_version": "1.5.0", 1411 | "_model_name": "HTMLModel", 1412 | "_view_count": null, 1413 | "_view_module": "@jupyter-widgets/controls", 1414 | "_view_module_version": "1.5.0", 1415 | "_view_name": "HTMLView", 1416 | "description": "", 1417 | "description_tooltip": null, 1418 | "layout": "IPY_MODEL_5e7f4d73b97142448857a67a66c47b34", 1419 | "placeholder": "​", 1420 | "style": "IPY_MODEL_02e45237f0ac4b5ab7c6cd68b09da051", 1421 | "value": "model.safetensors: 100%" 1422 | } 1423 | }, 1424 | "525bd5ebb77e45ac8654b01651388c97": { 1425 | "model_module": "@jupyter-widgets/base", 1426 | "model_module_version": "1.2.0", 1427 | "model_name": "LayoutModel", 1428 | "state": { 1429 | "_model_module": "@jupyter-widgets/base", 1430 | "_model_module_version": "1.2.0", 1431 | "_model_name": "LayoutModel", 1432 | "_view_count": null, 1433 | "_view_module": "@jupyter-widgets/base", 1434 | "_view_module_version": "1.2.0", 1435 | "_view_name": "LayoutView", 1436 | "align_content": null, 1437 | 
"align_items": null, 1438 | "align_self": null, 1439 | "border": null, 1440 | "bottom": null, 1441 | "display": null, 1442 | "flex": null, 1443 | "flex_flow": null, 1444 | "grid_area": null, 1445 | "grid_auto_columns": null, 1446 | "grid_auto_flow": null, 1447 | "grid_auto_rows": null, 1448 | "grid_column": null, 1449 | "grid_gap": null, 1450 | "grid_row": null, 1451 | "grid_template_areas": null, 1452 | "grid_template_columns": null, 1453 | "grid_template_rows": null, 1454 | "height": null, 1455 | "justify_content": null, 1456 | "justify_items": null, 1457 | "left": null, 1458 | "margin": null, 1459 | "max_height": null, 1460 | "max_width": null, 1461 | "min_height": null, 1462 | "min_width": null, 1463 | "object_fit": null, 1464 | "object_position": null, 1465 | "order": null, 1466 | "overflow": null, 1467 | "overflow_x": null, 1468 | "overflow_y": null, 1469 | "padding": null, 1470 | "right": null, 1471 | "top": null, 1472 | "visibility": null, 1473 | "width": null 1474 | } 1475 | }, 1476 | "53f024e3fff24976958b2bf43794c28d": { 1477 | "model_module": "@jupyter-widgets/controls", 1478 | "model_module_version": "1.5.0", 1479 | "model_name": "DescriptionStyleModel", 1480 | "state": { 1481 | "_model_module": "@jupyter-widgets/controls", 1482 | "_model_module_version": "1.5.0", 1483 | "_model_name": "DescriptionStyleModel", 1484 | "_view_count": null, 1485 | "_view_module": "@jupyter-widgets/base", 1486 | "_view_module_version": "1.2.0", 1487 | "_view_name": "StyleView", 1488 | "description_width": "" 1489 | } 1490 | }, 1491 | "58872b9c5b074813af7b6d9bdc9ba60f": { 1492 | "model_module": "@jupyter-widgets/base", 1493 | "model_module_version": "1.2.0", 1494 | "model_name": "LayoutModel", 1495 | "state": { 1496 | "_model_module": "@jupyter-widgets/base", 1497 | "_model_module_version": "1.2.0", 1498 | "_model_name": "LayoutModel", 1499 | "_view_count": null, 1500 | "_view_module": "@jupyter-widgets/base", 1501 | "_view_module_version": "1.2.0", 1502 | "_view_name": "LayoutView", 1503 | "align_content": null, 1504 | "align_items": null, 1505 | "align_self": null, 1506 | "border": null, 1507 | "bottom": null, 1508 | "display": null, 1509 | "flex": null, 1510 | "flex_flow": null, 1511 | "grid_area": null, 1512 | "grid_auto_columns": null, 1513 | "grid_auto_flow": null, 1514 | "grid_auto_rows": null, 1515 | "grid_column": null, 1516 | "grid_gap": null, 1517 | "grid_row": null, 1518 | "grid_template_areas": null, 1519 | "grid_template_columns": null, 1520 | "grid_template_rows": null, 1521 | "height": null, 1522 | "justify_content": null, 1523 | "justify_items": null, 1524 | "left": null, 1525 | "margin": null, 1526 | "max_height": null, 1527 | "max_width": null, 1528 | "min_height": null, 1529 | "min_width": null, 1530 | "object_fit": null, 1531 | "object_position": null, 1532 | "order": null, 1533 | "overflow": null, 1534 | "overflow_x": null, 1535 | "overflow_y": null, 1536 | "padding": null, 1537 | "right": null, 1538 | "top": null, 1539 | "visibility": null, 1540 | "width": null 1541 | } 1542 | }, 1543 | "5b2d5be5b4134fe080add6f4e48be59f": { 1544 | "model_module": "@jupyter-widgets/base", 1545 | "model_module_version": "1.2.0", 1546 | "model_name": "LayoutModel", 1547 | "state": { 1548 | "_model_module": "@jupyter-widgets/base", 1549 | "_model_module_version": "1.2.0", 1550 | "_model_name": "LayoutModel", 1551 | "_view_count": null, 1552 | "_view_module": "@jupyter-widgets/base", 1553 | "_view_module_version": "1.2.0", 1554 | "_view_name": "LayoutView", 1555 | "align_content": null, 1556 | 
"align_items": null, 1557 | "align_self": null, 1558 | "border": null, 1559 | "bottom": null, 1560 | "display": null, 1561 | "flex": null, 1562 | "flex_flow": null, 1563 | "grid_area": null, 1564 | "grid_auto_columns": null, 1565 | "grid_auto_flow": null, 1566 | "grid_auto_rows": null, 1567 | "grid_column": null, 1568 | "grid_gap": null, 1569 | "grid_row": null, 1570 | "grid_template_areas": null, 1571 | "grid_template_columns": null, 1572 | "grid_template_rows": null, 1573 | "height": null, 1574 | "justify_content": null, 1575 | "justify_items": null, 1576 | "left": null, 1577 | "margin": null, 1578 | "max_height": null, 1579 | "max_width": null, 1580 | "min_height": null, 1581 | "min_width": null, 1582 | "object_fit": null, 1583 | "object_position": null, 1584 | "order": null, 1585 | "overflow": null, 1586 | "overflow_x": null, 1587 | "overflow_y": null, 1588 | "padding": null, 1589 | "right": null, 1590 | "top": null, 1591 | "visibility": null, 1592 | "width": null 1593 | } 1594 | }, 1595 | "5e7f4d73b97142448857a67a66c47b34": { 1596 | "model_module": "@jupyter-widgets/base", 1597 | "model_module_version": "1.2.0", 1598 | "model_name": "LayoutModel", 1599 | "state": { 1600 | "_model_module": "@jupyter-widgets/base", 1601 | "_model_module_version": "1.2.0", 1602 | "_model_name": "LayoutModel", 1603 | "_view_count": null, 1604 | "_view_module": "@jupyter-widgets/base", 1605 | "_view_module_version": "1.2.0", 1606 | "_view_name": "LayoutView", 1607 | "align_content": null, 1608 | "align_items": null, 1609 | "align_self": null, 1610 | "border": null, 1611 | "bottom": null, 1612 | "display": null, 1613 | "flex": null, 1614 | "flex_flow": null, 1615 | "grid_area": null, 1616 | "grid_auto_columns": null, 1617 | "grid_auto_flow": null, 1618 | "grid_auto_rows": null, 1619 | "grid_column": null, 1620 | "grid_gap": null, 1621 | "grid_row": null, 1622 | "grid_template_areas": null, 1623 | "grid_template_columns": null, 1624 | "grid_template_rows": null, 1625 | "height": null, 1626 | "justify_content": null, 1627 | "justify_items": null, 1628 | "left": null, 1629 | "margin": null, 1630 | "max_height": null, 1631 | "max_width": null, 1632 | "min_height": null, 1633 | "min_width": null, 1634 | "object_fit": null, 1635 | "object_position": null, 1636 | "order": null, 1637 | "overflow": null, 1638 | "overflow_x": null, 1639 | "overflow_y": null, 1640 | "padding": null, 1641 | "right": null, 1642 | "top": null, 1643 | "visibility": null, 1644 | "width": null 1645 | } 1646 | }, 1647 | "6009acf8a81645209f3d94660fdd5fea": { 1648 | "model_module": "@jupyter-widgets/controls", 1649 | "model_module_version": "1.5.0", 1650 | "model_name": "HBoxModel", 1651 | "state": { 1652 | "_dom_classes": [], 1653 | "_model_module": "@jupyter-widgets/controls", 1654 | "_model_module_version": "1.5.0", 1655 | "_model_name": "HBoxModel", 1656 | "_view_count": null, 1657 | "_view_module": "@jupyter-widgets/controls", 1658 | "_view_module_version": "1.5.0", 1659 | "_view_name": "HBoxView", 1660 | "box_style": "", 1661 | "children": [ 1662 | "IPY_MODEL_1d3b2ea61ff54562a38e66b6d8d36a0f", 1663 | "IPY_MODEL_d526fa7789664d2f87e9ad1d370cb848", 1664 | "IPY_MODEL_9ff84d1955eb474f87e3691ce38c98de" 1665 | ], 1666 | "layout": "IPY_MODEL_2eded99c82784f55b140d8d92b0d9a1c" 1667 | } 1668 | }, 1669 | "61fe9da71b97485f92c6ee2380d46b4d": { 1670 | "model_module": "@jupyter-widgets/controls", 1671 | "model_module_version": "1.5.0", 1672 | "model_name": "HBoxModel", 1673 | "state": { 1674 | "_dom_classes": [], 1675 | "_model_module": 
"@jupyter-widgets/controls", 1676 | "_model_module_version": "1.5.0", 1677 | "_model_name": "HBoxModel", 1678 | "_view_count": null, 1679 | "_view_module": "@jupyter-widgets/controls", 1680 | "_view_module_version": "1.5.0", 1681 | "_view_name": "HBoxView", 1682 | "box_style": "", 1683 | "children": [ 1684 | "IPY_MODEL_50656ef62a804b908bf05493789cc362", 1685 | "IPY_MODEL_b5601c72540c4ec9be8cfbe7b577ef65", 1686 | "IPY_MODEL_11788b948d5b46bd88de9235bbb3179d" 1687 | ], 1688 | "layout": "IPY_MODEL_915ba23ce31640e4b487d7d402d203d8" 1689 | } 1690 | }, 1691 | "63040bada9764ff8b5dac77d22a90008": { 1692 | "model_module": "@jupyter-widgets/controls", 1693 | "model_module_version": "1.5.0", 1694 | "model_name": "ProgressStyleModel", 1695 | "state": { 1696 | "_model_module": "@jupyter-widgets/controls", 1697 | "_model_module_version": "1.5.0", 1698 | "_model_name": "ProgressStyleModel", 1699 | "_view_count": null, 1700 | "_view_module": "@jupyter-widgets/base", 1701 | "_view_module_version": "1.2.0", 1702 | "_view_name": "StyleView", 1703 | "bar_color": null, 1704 | "description_width": "" 1705 | } 1706 | }, 1707 | "6da8546c83864bd98dd3bc7568de095b": { 1708 | "model_module": "@jupyter-widgets/base", 1709 | "model_module_version": "1.2.0", 1710 | "model_name": "LayoutModel", 1711 | "state": { 1712 | "_model_module": "@jupyter-widgets/base", 1713 | "_model_module_version": "1.2.0", 1714 | "_model_name": "LayoutModel", 1715 | "_view_count": null, 1716 | "_view_module": "@jupyter-widgets/base", 1717 | "_view_module_version": "1.2.0", 1718 | "_view_name": "LayoutView", 1719 | "align_content": null, 1720 | "align_items": null, 1721 | "align_self": null, 1722 | "border": null, 1723 | "bottom": null, 1724 | "display": null, 1725 | "flex": null, 1726 | "flex_flow": null, 1727 | "grid_area": null, 1728 | "grid_auto_columns": null, 1729 | "grid_auto_flow": null, 1730 | "grid_auto_rows": null, 1731 | "grid_column": null, 1732 | "grid_gap": null, 1733 | "grid_row": null, 1734 | "grid_template_areas": null, 1735 | "grid_template_columns": null, 1736 | "grid_template_rows": null, 1737 | "height": null, 1738 | "justify_content": null, 1739 | "justify_items": null, 1740 | "left": null, 1741 | "margin": null, 1742 | "max_height": null, 1743 | "max_width": null, 1744 | "min_height": null, 1745 | "min_width": null, 1746 | "object_fit": null, 1747 | "object_position": null, 1748 | "order": null, 1749 | "overflow": null, 1750 | "overflow_x": null, 1751 | "overflow_y": null, 1752 | "padding": null, 1753 | "right": null, 1754 | "top": null, 1755 | "visibility": null, 1756 | "width": null 1757 | } 1758 | }, 1759 | "6f66bbc9f1084c8899c64cefd3b0934a": { 1760 | "model_module": "@jupyter-widgets/controls", 1761 | "model_module_version": "1.5.0", 1762 | "model_name": "DescriptionStyleModel", 1763 | "state": { 1764 | "_model_module": "@jupyter-widgets/controls", 1765 | "_model_module_version": "1.5.0", 1766 | "_model_name": "DescriptionStyleModel", 1767 | "_view_count": null, 1768 | "_view_module": "@jupyter-widgets/base", 1769 | "_view_module_version": "1.2.0", 1770 | "_view_name": "StyleView", 1771 | "description_width": "" 1772 | } 1773 | }, 1774 | "72278d146c6a43c88282be435ff8b7ed": { 1775 | "model_module": "@jupyter-widgets/controls", 1776 | "model_module_version": "1.5.0", 1777 | "model_name": "HBoxModel", 1778 | "state": { 1779 | "_dom_classes": [], 1780 | "_model_module": "@jupyter-widgets/controls", 1781 | "_model_module_version": "1.5.0", 1782 | "_model_name": "HBoxModel", 1783 | "_view_count": null, 1784 | 
"_view_module": "@jupyter-widgets/controls", 1785 | "_view_module_version": "1.5.0", 1786 | "_view_name": "HBoxView", 1787 | "box_style": "", 1788 | "children": [ 1789 | "IPY_MODEL_e4354fcaac5642ae89ce8a3c69d1b68d", 1790 | "IPY_MODEL_aea7cfbba79340e98ca0a00ea6e6a5c2", 1791 | "IPY_MODEL_adf9b0c5dacb426aa21ae59db45657ca" 1792 | ], 1793 | "layout": "IPY_MODEL_d0a7b3b62181431b9ca9870c167b983e" 1794 | } 1795 | }, 1796 | "72d5f442747f4bfda91945c58033ead2": { 1797 | "model_module": "@jupyter-widgets/controls", 1798 | "model_module_version": "1.5.0", 1799 | "model_name": "HTMLModel", 1800 | "state": { 1801 | "_dom_classes": [], 1802 | "_model_module": "@jupyter-widgets/controls", 1803 | "_model_module_version": "1.5.0", 1804 | "_model_name": "HTMLModel", 1805 | "_view_count": null, 1806 | "_view_module": "@jupyter-widgets/controls", 1807 | "_view_module_version": "1.5.0", 1808 | "_view_name": "HTMLView", 1809 | "description": "", 1810 | "description_tooltip": null, 1811 | "layout": "IPY_MODEL_4441bde44ec44063bed8f43ae4a30a4e", 1812 | "placeholder": "​", 1813 | "style": "IPY_MODEL_1e8533a81384457dab93020c702eb2d6", 1814 | "value": "tokenizer_config.json: 100%" 1815 | } 1816 | }, 1817 | "77b0e8db32ce4558961c1871ecd2a147": { 1818 | "model_module": "@jupyter-widgets/base", 1819 | "model_module_version": "1.2.0", 1820 | "model_name": "LayoutModel", 1821 | "state": { 1822 | "_model_module": "@jupyter-widgets/base", 1823 | "_model_module_version": "1.2.0", 1824 | "_model_name": "LayoutModel", 1825 | "_view_count": null, 1826 | "_view_module": "@jupyter-widgets/base", 1827 | "_view_module_version": "1.2.0", 1828 | "_view_name": "LayoutView", 1829 | "align_content": null, 1830 | "align_items": null, 1831 | "align_self": null, 1832 | "border": null, 1833 | "bottom": null, 1834 | "display": null, 1835 | "flex": null, 1836 | "flex_flow": null, 1837 | "grid_area": null, 1838 | "grid_auto_columns": null, 1839 | "grid_auto_flow": null, 1840 | "grid_auto_rows": null, 1841 | "grid_column": null, 1842 | "grid_gap": null, 1843 | "grid_row": null, 1844 | "grid_template_areas": null, 1845 | "grid_template_columns": null, 1846 | "grid_template_rows": null, 1847 | "height": null, 1848 | "justify_content": null, 1849 | "justify_items": null, 1850 | "left": null, 1851 | "margin": null, 1852 | "max_height": null, 1853 | "max_width": null, 1854 | "min_height": null, 1855 | "min_width": null, 1856 | "object_fit": null, 1857 | "object_position": null, 1858 | "order": null, 1859 | "overflow": null, 1860 | "overflow_x": null, 1861 | "overflow_y": null, 1862 | "padding": null, 1863 | "right": null, 1864 | "top": null, 1865 | "visibility": null, 1866 | "width": null 1867 | } 1868 | }, 1869 | "7fed1cef573e47d08300438fe4c158e9": { 1870 | "model_module": "@jupyter-widgets/base", 1871 | "model_module_version": "1.2.0", 1872 | "model_name": "LayoutModel", 1873 | "state": { 1874 | "_model_module": "@jupyter-widgets/base", 1875 | "_model_module_version": "1.2.0", 1876 | "_model_name": "LayoutModel", 1877 | "_view_count": null, 1878 | "_view_module": "@jupyter-widgets/base", 1879 | "_view_module_version": "1.2.0", 1880 | "_view_name": "LayoutView", 1881 | "align_content": null, 1882 | "align_items": null, 1883 | "align_self": null, 1884 | "border": null, 1885 | "bottom": null, 1886 | "display": null, 1887 | "flex": null, 1888 | "flex_flow": null, 1889 | "grid_area": null, 1890 | "grid_auto_columns": null, 1891 | "grid_auto_flow": null, 1892 | "grid_auto_rows": null, 1893 | "grid_column": null, 1894 | "grid_gap": null, 1895 | "grid_row": 
null, 1896 | "grid_template_areas": null, 1897 | "grid_template_columns": null, 1898 | "grid_template_rows": null, 1899 | "height": null, 1900 | "justify_content": null, 1901 | "justify_items": null, 1902 | "left": null, 1903 | "margin": null, 1904 | "max_height": null, 1905 | "max_width": null, 1906 | "min_height": null, 1907 | "min_width": null, 1908 | "object_fit": null, 1909 | "object_position": null, 1910 | "order": null, 1911 | "overflow": null, 1912 | "overflow_x": null, 1913 | "overflow_y": null, 1914 | "padding": null, 1915 | "right": null, 1916 | "top": null, 1917 | "visibility": null, 1918 | "width": null 1919 | } 1920 | }, 1921 | "8001fd2f05784a66b7f3aac77fcf2303": { 1922 | "model_module": "@jupyter-widgets/base", 1923 | "model_module_version": "1.2.0", 1924 | "model_name": "LayoutModel", 1925 | "state": { 1926 | "_model_module": "@jupyter-widgets/base", 1927 | "_model_module_version": "1.2.0", 1928 | "_model_name": "LayoutModel", 1929 | "_view_count": null, 1930 | "_view_module": "@jupyter-widgets/base", 1931 | "_view_module_version": "1.2.0", 1932 | "_view_name": "LayoutView", 1933 | "align_content": null, 1934 | "align_items": null, 1935 | "align_self": null, 1936 | "border": null, 1937 | "bottom": null, 1938 | "display": null, 1939 | "flex": null, 1940 | "flex_flow": null, 1941 | "grid_area": null, 1942 | "grid_auto_columns": null, 1943 | "grid_auto_flow": null, 1944 | "grid_auto_rows": null, 1945 | "grid_column": null, 1946 | "grid_gap": null, 1947 | "grid_row": null, 1948 | "grid_template_areas": null, 1949 | "grid_template_columns": null, 1950 | "grid_template_rows": null, 1951 | "height": null, 1952 | "justify_content": null, 1953 | "justify_items": null, 1954 | "left": null, 1955 | "margin": null, 1956 | "max_height": null, 1957 | "max_width": null, 1958 | "min_height": null, 1959 | "min_width": null, 1960 | "object_fit": null, 1961 | "object_position": null, 1962 | "order": null, 1963 | "overflow": null, 1964 | "overflow_x": null, 1965 | "overflow_y": null, 1966 | "padding": null, 1967 | "right": null, 1968 | "top": null, 1969 | "visibility": null, 1970 | "width": null 1971 | } 1972 | }, 1973 | "81c6f32b33b0420081e0c7b27299a2b8": { 1974 | "model_module": "@jupyter-widgets/controls", 1975 | "model_module_version": "1.5.0", 1976 | "model_name": "ProgressStyleModel", 1977 | "state": { 1978 | "_model_module": "@jupyter-widgets/controls", 1979 | "_model_module_version": "1.5.0", 1980 | "_model_name": "ProgressStyleModel", 1981 | "_view_count": null, 1982 | "_view_module": "@jupyter-widgets/base", 1983 | "_view_module_version": "1.2.0", 1984 | "_view_name": "StyleView", 1985 | "bar_color": null, 1986 | "description_width": "" 1987 | } 1988 | }, 1989 | "8a9b97f5d64a4781b6d46134bc5ebff8": { 1990 | "model_module": "@jupyter-widgets/base", 1991 | "model_module_version": "1.2.0", 1992 | "model_name": "LayoutModel", 1993 | "state": { 1994 | "_model_module": "@jupyter-widgets/base", 1995 | "_model_module_version": "1.2.0", 1996 | "_model_name": "LayoutModel", 1997 | "_view_count": null, 1998 | "_view_module": "@jupyter-widgets/base", 1999 | "_view_module_version": "1.2.0", 2000 | "_view_name": "LayoutView", 2001 | "align_content": null, 2002 | "align_items": null, 2003 | "align_self": null, 2004 | "border": null, 2005 | "bottom": null, 2006 | "display": null, 2007 | "flex": null, 2008 | "flex_flow": null, 2009 | "grid_area": null, 2010 | "grid_auto_columns": null, 2011 | "grid_auto_flow": null, 2012 | "grid_auto_rows": null, 2013 | "grid_column": null, 2014 | "grid_gap": null, 
2015 | "grid_row": null, 2016 | "grid_template_areas": null, 2017 | "grid_template_columns": null, 2018 | "grid_template_rows": null, 2019 | "height": null, 2020 | "justify_content": null, 2021 | "justify_items": null, 2022 | "left": null, 2023 | "margin": null, 2024 | "max_height": null, 2025 | "max_width": null, 2026 | "min_height": null, 2027 | "min_width": null, 2028 | "object_fit": null, 2029 | "object_position": null, 2030 | "order": null, 2031 | "overflow": null, 2032 | "overflow_x": null, 2033 | "overflow_y": null, 2034 | "padding": null, 2035 | "right": null, 2036 | "top": null, 2037 | "visibility": null, 2038 | "width": null 2039 | } 2040 | }, 2041 | "8bbd54857d834502b94847a6903dcdf9": { 2042 | "model_module": "@jupyter-widgets/controls", 2043 | "model_module_version": "1.5.0", 2044 | "model_name": "DescriptionStyleModel", 2045 | "state": { 2046 | "_model_module": "@jupyter-widgets/controls", 2047 | "_model_module_version": "1.5.0", 2048 | "_model_name": "DescriptionStyleModel", 2049 | "_view_count": null, 2050 | "_view_module": "@jupyter-widgets/base", 2051 | "_view_module_version": "1.2.0", 2052 | "_view_name": "StyleView", 2053 | "description_width": "" 2054 | } 2055 | }, 2056 | "8d3e0594460d4e149c88f1193f5bed6e": { 2057 | "model_module": "@jupyter-widgets/base", 2058 | "model_module_version": "1.2.0", 2059 | "model_name": "LayoutModel", 2060 | "state": { 2061 | "_model_module": "@jupyter-widgets/base", 2062 | "_model_module_version": "1.2.0", 2063 | "_model_name": "LayoutModel", 2064 | "_view_count": null, 2065 | "_view_module": "@jupyter-widgets/base", 2066 | "_view_module_version": "1.2.0", 2067 | "_view_name": "LayoutView", 2068 | "align_content": null, 2069 | "align_items": null, 2070 | "align_self": null, 2071 | "border": null, 2072 | "bottom": null, 2073 | "display": null, 2074 | "flex": null, 2075 | "flex_flow": null, 2076 | "grid_area": null, 2077 | "grid_auto_columns": null, 2078 | "grid_auto_flow": null, 2079 | "grid_auto_rows": null, 2080 | "grid_column": null, 2081 | "grid_gap": null, 2082 | "grid_row": null, 2083 | "grid_template_areas": null, 2084 | "grid_template_columns": null, 2085 | "grid_template_rows": null, 2086 | "height": null, 2087 | "justify_content": null, 2088 | "justify_items": null, 2089 | "left": null, 2090 | "margin": null, 2091 | "max_height": null, 2092 | "max_width": null, 2093 | "min_height": null, 2094 | "min_width": null, 2095 | "object_fit": null, 2096 | "object_position": null, 2097 | "order": null, 2098 | "overflow": null, 2099 | "overflow_x": null, 2100 | "overflow_y": null, 2101 | "padding": null, 2102 | "right": null, 2103 | "top": null, 2104 | "visibility": null, 2105 | "width": null 2106 | } 2107 | }, 2108 | "915ba23ce31640e4b487d7d402d203d8": { 2109 | "model_module": "@jupyter-widgets/base", 2110 | "model_module_version": "1.2.0", 2111 | "model_name": "LayoutModel", 2112 | "state": { 2113 | "_model_module": "@jupyter-widgets/base", 2114 | "_model_module_version": "1.2.0", 2115 | "_model_name": "LayoutModel", 2116 | "_view_count": null, 2117 | "_view_module": "@jupyter-widgets/base", 2118 | "_view_module_version": "1.2.0", 2119 | "_view_name": "LayoutView", 2120 | "align_content": null, 2121 | "align_items": null, 2122 | "align_self": null, 2123 | "border": null, 2124 | "bottom": null, 2125 | "display": null, 2126 | "flex": null, 2127 | "flex_flow": null, 2128 | "grid_area": null, 2129 | "grid_auto_columns": null, 2130 | "grid_auto_flow": null, 2131 | "grid_auto_rows": null, 2132 | "grid_column": null, 2133 | "grid_gap": null, 2134 
| "grid_row": null, 2135 | "grid_template_areas": null, 2136 | "grid_template_columns": null, 2137 | "grid_template_rows": null, 2138 | "height": null, 2139 | "justify_content": null, 2140 | "justify_items": null, 2141 | "left": null, 2142 | "margin": null, 2143 | "max_height": null, 2144 | "max_width": null, 2145 | "min_height": null, 2146 | "min_width": null, 2147 | "object_fit": null, 2148 | "object_position": null, 2149 | "order": null, 2150 | "overflow": null, 2151 | "overflow_x": null, 2152 | "overflow_y": null, 2153 | "padding": null, 2154 | "right": null, 2155 | "top": null, 2156 | "visibility": null, 2157 | "width": null 2158 | } 2159 | }, 2160 | "92bf50844694495daf0abd9f3fb01ef1": { 2161 | "model_module": "@jupyter-widgets/controls", 2162 | "model_module_version": "1.5.0", 2163 | "model_name": "ProgressStyleModel", 2164 | "state": { 2165 | "_model_module": "@jupyter-widgets/controls", 2166 | "_model_module_version": "1.5.0", 2167 | "_model_name": "ProgressStyleModel", 2168 | "_view_count": null, 2169 | "_view_module": "@jupyter-widgets/base", 2170 | "_view_module_version": "1.2.0", 2171 | "_view_name": "StyleView", 2172 | "bar_color": null, 2173 | "description_width": "" 2174 | } 2175 | }, 2176 | "954458b7719f442cafd3409ab7a90116": { 2177 | "model_module": "@jupyter-widgets/controls", 2178 | "model_module_version": "1.5.0", 2179 | "model_name": "FloatProgressModel", 2180 | "state": { 2181 | "_dom_classes": [], 2182 | "_model_module": "@jupyter-widgets/controls", 2183 | "_model_module_version": "1.5.0", 2184 | "_model_name": "FloatProgressModel", 2185 | "_view_count": null, 2186 | "_view_module": "@jupyter-widgets/controls", 2187 | "_view_module_version": "1.5.0", 2188 | "_view_name": "ProgressView", 2189 | "bar_style": "success", 2190 | "description": "", 2191 | "description_tooltip": null, 2192 | "layout": "IPY_MODEL_3ffcec797ef542808540f5dd4ed7650b", 2193 | "max": 33384568, 2194 | "min": 0, 2195 | "orientation": "horizontal", 2196 | "style": "IPY_MODEL_430dbe6044ac4c19a72fb996ddc9fbd1", 2197 | "value": 33384568 2198 | } 2199 | }, 2200 | "98b2d0f27f4a40c6a09b5eeb8e6deafd": { 2201 | "model_module": "@jupyter-widgets/base", 2202 | "model_module_version": "1.2.0", 2203 | "model_name": "LayoutModel", 2204 | "state": { 2205 | "_model_module": "@jupyter-widgets/base", 2206 | "_model_module_version": "1.2.0", 2207 | "_model_name": "LayoutModel", 2208 | "_view_count": null, 2209 | "_view_module": "@jupyter-widgets/base", 2210 | "_view_module_version": "1.2.0", 2211 | "_view_name": "LayoutView", 2212 | "align_content": null, 2213 | "align_items": null, 2214 | "align_self": null, 2215 | "border": null, 2216 | "bottom": null, 2217 | "display": null, 2218 | "flex": null, 2219 | "flex_flow": null, 2220 | "grid_area": null, 2221 | "grid_auto_columns": null, 2222 | "grid_auto_flow": null, 2223 | "grid_auto_rows": null, 2224 | "grid_column": null, 2225 | "grid_gap": null, 2226 | "grid_row": null, 2227 | "grid_template_areas": null, 2228 | "grid_template_columns": null, 2229 | "grid_template_rows": null, 2230 | "height": null, 2231 | "justify_content": null, 2232 | "justify_items": null, 2233 | "left": null, 2234 | "margin": null, 2235 | "max_height": null, 2236 | "max_width": null, 2237 | "min_height": null, 2238 | "min_width": null, 2239 | "object_fit": null, 2240 | "object_position": null, 2241 | "order": null, 2242 | "overflow": null, 2243 | "overflow_x": null, 2244 | "overflow_y": null, 2245 | "padding": null, 2246 | "right": null, 2247 | "top": null, 2248 | "visibility": null, 2249 | 
"width": null 2250 | } 2251 | }, 2252 | "9f2bda486bea4880829704c05223562e": { 2253 | "model_module": "@jupyter-widgets/controls", 2254 | "model_module_version": "1.5.0", 2255 | "model_name": "HTMLModel", 2256 | "state": { 2257 | "_dom_classes": [], 2258 | "_model_module": "@jupyter-widgets/controls", 2259 | "_model_module_version": "1.5.0", 2260 | "_model_name": "HTMLModel", 2261 | "_view_count": null, 2262 | "_view_module": "@jupyter-widgets/controls", 2263 | "_view_module_version": "1.5.0", 2264 | "_view_name": "HTMLView", 2265 | "description": "", 2266 | "description_tooltip": null, 2267 | "layout": "IPY_MODEL_98b2d0f27f4a40c6a09b5eeb8e6deafd", 2268 | "placeholder": "​", 2269 | "style": "IPY_MODEL_edce2515b0cd4d7298064476d43646a1", 2270 | "value": " 1.16M/1.16M [00:00<00:00, 7.96MB/s]" 2271 | } 2272 | }, 2273 | "9ff84d1955eb474f87e3691ce38c98de": { 2274 | "model_module": "@jupyter-widgets/controls", 2275 | "model_module_version": "1.5.0", 2276 | "model_name": "HTMLModel", 2277 | "state": { 2278 | "_dom_classes": [], 2279 | "_model_module": "@jupyter-widgets/controls", 2280 | "_model_module_version": "1.5.0", 2281 | "_model_name": "HTMLModel", 2282 | "_view_count": null, 2283 | "_view_module": "@jupyter-widgets/controls", 2284 | "_view_module_version": "1.5.0", 2285 | "_view_name": "HTMLView", 2286 | "description": "", 2287 | "description_tooltip": null, 2288 | "layout": "IPY_MODEL_b52f5ea85a644ddba56654b59d564db1", 2289 | "placeholder": "​", 2290 | "style": "IPY_MODEL_53f024e3fff24976958b2bf43794c28d", 2291 | "value": " 35.0/35.0 [00:00<00:00, 3.47kB/s]" 2292 | } 2293 | }, 2294 | "ac32520514b34a2fbe86b90fccbf9bae": { 2295 | "model_module": "@jupyter-widgets/controls", 2296 | "model_module_version": "1.5.0", 2297 | "model_name": "HTMLModel", 2298 | "state": { 2299 | "_dom_classes": [], 2300 | "_model_module": "@jupyter-widgets/controls", 2301 | "_model_module_version": "1.5.0", 2302 | "_model_name": "HTMLModel", 2303 | "_view_count": null, 2304 | "_view_module": "@jupyter-widgets/controls", 2305 | "_view_module_version": "1.5.0", 2306 | "_view_name": "HTMLView", 2307 | "description": "", 2308 | "description_tooltip": null, 2309 | "layout": "IPY_MODEL_b9e417466b8940a39b9e5755576c1c5c", 2310 | "placeholder": "​", 2311 | "style": "IPY_MODEL_b07788470b864279a763197ae593f1bd", 2312 | "value": " 33.4M/33.4M [00:00<00:00, 132MB/s]" 2313 | } 2314 | }, 2315 | "adf9b0c5dacb426aa21ae59db45657ca": { 2316 | "model_module": "@jupyter-widgets/controls", 2317 | "model_module_version": "1.5.0", 2318 | "model_name": "HTMLModel", 2319 | "state": { 2320 | "_dom_classes": [], 2321 | "_model_module": "@jupyter-widgets/controls", 2322 | "_model_module_version": "1.5.0", 2323 | "_model_name": "HTMLModel", 2324 | "_view_count": null, 2325 | "_view_module": "@jupyter-widgets/controls", 2326 | "_view_module_version": "1.5.0", 2327 | "_view_name": "HTMLView", 2328 | "description": "", 2329 | "description_tooltip": null, 2330 | "layout": "IPY_MODEL_29c6198a3adb4d909230ed1eb1e3cd74", 2331 | "placeholder": "​", 2332 | "style": "IPY_MODEL_3f956e7672f34276b78e5e0cadaaedbd", 2333 | "value": " 215/215 [00:00<00:00, 15.8kB/s]" 2334 | } 2335 | }, 2336 | "aea7cfbba79340e98ca0a00ea6e6a5c2": { 2337 | "model_module": "@jupyter-widgets/controls", 2338 | "model_module_version": "1.5.0", 2339 | "model_name": "FloatProgressModel", 2340 | "state": { 2341 | "_dom_classes": [], 2342 | "_model_module": "@jupyter-widgets/controls", 2343 | "_model_module_version": "1.5.0", 2344 | "_model_name": "FloatProgressModel", 2345 | "_view_count": 
null, 2346 | "_view_module": "@jupyter-widgets/controls", 2347 | "_view_module_version": "1.5.0", 2348 | "_view_name": "ProgressView", 2349 | "bar_style": "success", 2350 | "description": "", 2351 | "description_tooltip": null, 2352 | "layout": "IPY_MODEL_21c5356744e149d6a2d67fdb9830fbeb", 2353 | "max": 215, 2354 | "min": 0, 2355 | "orientation": "horizontal", 2356 | "style": "IPY_MODEL_92bf50844694495daf0abd9f3fb01ef1", 2357 | "value": 215 2358 | } 2359 | }, 2360 | "b07788470b864279a763197ae593f1bd": { 2361 | "model_module": "@jupyter-widgets/controls", 2362 | "model_module_version": "1.5.0", 2363 | "model_name": "DescriptionStyleModel", 2364 | "state": { 2365 | "_model_module": "@jupyter-widgets/controls", 2366 | "_model_module_version": "1.5.0", 2367 | "_model_name": "DescriptionStyleModel", 2368 | "_view_count": null, 2369 | "_view_module": "@jupyter-widgets/base", 2370 | "_view_module_version": "1.2.0", 2371 | "_view_name": "StyleView", 2372 | "description_width": "" 2373 | } 2374 | }, 2375 | "b3f74dd34648458ca9df0ed01ab0f559": { 2376 | "model_module": "@jupyter-widgets/base", 2377 | "model_module_version": "1.2.0", 2378 | "model_name": "LayoutModel", 2379 | "state": { 2380 | "_model_module": "@jupyter-widgets/base", 2381 | "_model_module_version": "1.2.0", 2382 | "_model_name": "LayoutModel", 2383 | "_view_count": null, 2384 | "_view_module": "@jupyter-widgets/base", 2385 | "_view_module_version": "1.2.0", 2386 | "_view_name": "LayoutView", 2387 | "align_content": null, 2388 | "align_items": null, 2389 | "align_self": null, 2390 | "border": null, 2391 | "bottom": null, 2392 | "display": null, 2393 | "flex": null, 2394 | "flex_flow": null, 2395 | "grid_area": null, 2396 | "grid_auto_columns": null, 2397 | "grid_auto_flow": null, 2398 | "grid_auto_rows": null, 2399 | "grid_column": null, 2400 | "grid_gap": null, 2401 | "grid_row": null, 2402 | "grid_template_areas": null, 2403 | "grid_template_columns": null, 2404 | "grid_template_rows": null, 2405 | "height": null, 2406 | "justify_content": null, 2407 | "justify_items": null, 2408 | "left": null, 2409 | "margin": null, 2410 | "max_height": null, 2411 | "max_width": null, 2412 | "min_height": null, 2413 | "min_width": null, 2414 | "object_fit": null, 2415 | "object_position": null, 2416 | "order": null, 2417 | "overflow": null, 2418 | "overflow_x": null, 2419 | "overflow_y": null, 2420 | "padding": null, 2421 | "right": null, 2422 | "top": null, 2423 | "visibility": null, 2424 | "width": null 2425 | } 2426 | }, 2427 | "b451038191ef4a9dadecf1140c32fffe": { 2428 | "model_module": "@jupyter-widgets/controls", 2429 | "model_module_version": "1.5.0", 2430 | "model_name": "HBoxModel", 2431 | "state": { 2432 | "_dom_classes": [], 2433 | "_model_module": "@jupyter-widgets/controls", 2434 | "_model_module_version": "1.5.0", 2435 | "_model_name": "HBoxModel", 2436 | "_view_count": null, 2437 | "_view_module": "@jupyter-widgets/controls", 2438 | "_view_module_version": "1.5.0", 2439 | "_view_name": "HBoxView", 2440 | "box_style": "", 2441 | "children": [ 2442 | "IPY_MODEL_fa4045a862034e428ab40009d04d313e", 2443 | "IPY_MODEL_d425bdb1be0144d48c43708e61745b4c", 2444 | "IPY_MODEL_fb8e26b8d050464595bf725cf2b2e9d3" 2445 | ], 2446 | "layout": "IPY_MODEL_3ace3d3b4baf4599a6673f58088351d4" 2447 | } 2448 | }, 2449 | "b52f5ea85a644ddba56654b59d564db1": { 2450 | "model_module": "@jupyter-widgets/base", 2451 | "model_module_version": "1.2.0", 2452 | "model_name": "LayoutModel", 2453 | "state": { 2454 | "_model_module": "@jupyter-widgets/base", 2455 | 
"_model_module_version": "1.2.0", 2456 | "_model_name": "LayoutModel", 2457 | "_view_count": null, 2458 | "_view_module": "@jupyter-widgets/base", 2459 | "_view_module_version": "1.2.0", 2460 | "_view_name": "LayoutView", 2461 | "align_content": null, 2462 | "align_items": null, 2463 | "align_self": null, 2464 | "border": null, 2465 | "bottom": null, 2466 | "display": null, 2467 | "flex": null, 2468 | "flex_flow": null, 2469 | "grid_area": null, 2470 | "grid_auto_columns": null, 2471 | "grid_auto_flow": null, 2472 | "grid_auto_rows": null, 2473 | "grid_column": null, 2474 | "grid_gap": null, 2475 | "grid_row": null, 2476 | "grid_template_areas": null, 2477 | "grid_template_columns": null, 2478 | "grid_template_rows": null, 2479 | "height": null, 2480 | "justify_content": null, 2481 | "justify_items": null, 2482 | "left": null, 2483 | "margin": null, 2484 | "max_height": null, 2485 | "max_width": null, 2486 | "min_height": null, 2487 | "min_width": null, 2488 | "object_fit": null, 2489 | "object_position": null, 2490 | "order": null, 2491 | "overflow": null, 2492 | "overflow_x": null, 2493 | "overflow_y": null, 2494 | "padding": null, 2495 | "right": null, 2496 | "top": null, 2497 | "visibility": null, 2498 | "width": null 2499 | } 2500 | }, 2501 | "b5601c72540c4ec9be8cfbe7b577ef65": { 2502 | "model_module": "@jupyter-widgets/controls", 2503 | "model_module_version": "1.5.0", 2504 | "model_name": "FloatProgressModel", 2505 | "state": { 2506 | "_dom_classes": [], 2507 | "_model_module": "@jupyter-widgets/controls", 2508 | "_model_module_version": "1.5.0", 2509 | "_model_name": "FloatProgressModel", 2510 | "_view_count": null, 2511 | "_view_module": "@jupyter-widgets/controls", 2512 | "_view_module_version": "1.5.0", 2513 | "_view_name": "ProgressView", 2514 | "bar_style": "danger", 2515 | "description": "", 2516 | "description_tooltip": null, 2517 | "layout": "IPY_MODEL_7fed1cef573e47d08300438fe4c158e9", 2518 | "max": 1999811208, 2519 | "min": 0, 2520 | "orientation": "horizontal", 2521 | "style": "IPY_MODEL_222df800dad945e199a7ccd8dfbc7a8d", 2522 | "value": 1999811018 2523 | } 2524 | }, 2525 | "b9e417466b8940a39b9e5755576c1c5c": { 2526 | "model_module": "@jupyter-widgets/base", 2527 | "model_module_version": "1.2.0", 2528 | "model_name": "LayoutModel", 2529 | "state": { 2530 | "_model_module": "@jupyter-widgets/base", 2531 | "_model_module_version": "1.2.0", 2532 | "_model_name": "LayoutModel", 2533 | "_view_count": null, 2534 | "_view_module": "@jupyter-widgets/base", 2535 | "_view_module_version": "1.2.0", 2536 | "_view_name": "LayoutView", 2537 | "align_content": null, 2538 | "align_items": null, 2539 | "align_self": null, 2540 | "border": null, 2541 | "bottom": null, 2542 | "display": null, 2543 | "flex": null, 2544 | "flex_flow": null, 2545 | "grid_area": null, 2546 | "grid_auto_columns": null, 2547 | "grid_auto_flow": null, 2548 | "grid_auto_rows": null, 2549 | "grid_column": null, 2550 | "grid_gap": null, 2551 | "grid_row": null, 2552 | "grid_template_areas": null, 2553 | "grid_template_columns": null, 2554 | "grid_template_rows": null, 2555 | "height": null, 2556 | "justify_content": null, 2557 | "justify_items": null, 2558 | "left": null, 2559 | "margin": null, 2560 | "max_height": null, 2561 | "max_width": null, 2562 | "min_height": null, 2563 | "min_width": null, 2564 | "object_fit": null, 2565 | "object_position": null, 2566 | "order": null, 2567 | "overflow": null, 2568 | "overflow_x": null, 2569 | "overflow_y": null, 2570 | "padding": null, 2571 | "right": null, 2572 | "top": 
null, 2573 | "visibility": null, 2574 | "width": null 2575 | } 2576 | }, 2577 | "c35c454a893a4edfae1f27a6707cb77c": { 2578 | "model_module": "@jupyter-widgets/base", 2579 | "model_module_version": "1.2.0", 2580 | "model_name": "LayoutModel", 2581 | "state": { 2582 | "_model_module": "@jupyter-widgets/base", 2583 | "_model_module_version": "1.2.0", 2584 | "_model_name": "LayoutModel", 2585 | "_view_count": null, 2586 | "_view_module": "@jupyter-widgets/base", 2587 | "_view_module_version": "1.2.0", 2588 | "_view_name": "LayoutView", 2589 | "align_content": null, 2590 | "align_items": null, 2591 | "align_self": null, 2592 | "border": null, 2593 | "bottom": null, 2594 | "display": null, 2595 | "flex": null, 2596 | "flex_flow": null, 2597 | "grid_area": null, 2598 | "grid_auto_columns": null, 2599 | "grid_auto_flow": null, 2600 | "grid_auto_rows": null, 2601 | "grid_column": null, 2602 | "grid_gap": null, 2603 | "grid_row": null, 2604 | "grid_template_areas": null, 2605 | "grid_template_columns": null, 2606 | "grid_template_rows": null, 2607 | "height": null, 2608 | "justify_content": null, 2609 | "justify_items": null, 2610 | "left": null, 2611 | "margin": null, 2612 | "max_height": null, 2613 | "max_width": null, 2614 | "min_height": null, 2615 | "min_width": null, 2616 | "object_fit": null, 2617 | "object_position": null, 2618 | "order": null, 2619 | "overflow": null, 2620 | "overflow_x": null, 2621 | "overflow_y": null, 2622 | "padding": null, 2623 | "right": null, 2624 | "top": null, 2625 | "visibility": null, 2626 | "width": null 2627 | } 2628 | }, 2629 | "c50e91d15c7a4919bb2d048ef4aa51ef": { 2630 | "model_module": "@jupyter-widgets/base", 2631 | "model_module_version": "1.2.0", 2632 | "model_name": "LayoutModel", 2633 | "state": { 2634 | "_model_module": "@jupyter-widgets/base", 2635 | "_model_module_version": "1.2.0", 2636 | "_model_name": "LayoutModel", 2637 | "_view_count": null, 2638 | "_view_module": "@jupyter-widgets/base", 2639 | "_view_module_version": "1.2.0", 2640 | "_view_name": "LayoutView", 2641 | "align_content": null, 2642 | "align_items": null, 2643 | "align_self": null, 2644 | "border": null, 2645 | "bottom": null, 2646 | "display": null, 2647 | "flex": null, 2648 | "flex_flow": null, 2649 | "grid_area": null, 2650 | "grid_auto_columns": null, 2651 | "grid_auto_flow": null, 2652 | "grid_auto_rows": null, 2653 | "grid_column": null, 2654 | "grid_gap": null, 2655 | "grid_row": null, 2656 | "grid_template_areas": null, 2657 | "grid_template_columns": null, 2658 | "grid_template_rows": null, 2659 | "height": null, 2660 | "justify_content": null, 2661 | "justify_items": null, 2662 | "left": null, 2663 | "margin": null, 2664 | "max_height": null, 2665 | "max_width": null, 2666 | "min_height": null, 2667 | "min_width": null, 2668 | "object_fit": null, 2669 | "object_position": null, 2670 | "order": null, 2671 | "overflow": null, 2672 | "overflow_x": null, 2673 | "overflow_y": null, 2674 | "padding": null, 2675 | "right": null, 2676 | "top": null, 2677 | "visibility": null, 2678 | "width": null 2679 | } 2680 | }, 2681 | "c8fb24696fc14bbb81827eb5fad2dbc0": { 2682 | "model_module": "@jupyter-widgets/base", 2683 | "model_module_version": "1.2.0", 2684 | "model_name": "LayoutModel", 2685 | "state": { 2686 | "_model_module": "@jupyter-widgets/base", 2687 | "_model_module_version": "1.2.0", 2688 | "_model_name": "LayoutModel", 2689 | "_view_count": null, 2690 | "_view_module": "@jupyter-widgets/base", 2691 | "_view_module_version": "1.2.0", 2692 | "_view_name": "LayoutView", 2693 | 
"align_content": null, 2694 | "align_items": null, 2695 | "align_self": null, 2696 | "border": null, 2697 | "bottom": null, 2698 | "display": null, 2699 | "flex": null, 2700 | "flex_flow": null, 2701 | "grid_area": null, 2702 | "grid_auto_columns": null, 2703 | "grid_auto_flow": null, 2704 | "grid_auto_rows": null, 2705 | "grid_column": null, 2706 | "grid_gap": null, 2707 | "grid_row": null, 2708 | "grid_template_areas": null, 2709 | "grid_template_columns": null, 2710 | "grid_template_rows": null, 2711 | "height": null, 2712 | "justify_content": null, 2713 | "justify_items": null, 2714 | "left": null, 2715 | "margin": null, 2716 | "max_height": null, 2717 | "max_width": null, 2718 | "min_height": null, 2719 | "min_width": null, 2720 | "object_fit": null, 2721 | "object_position": null, 2722 | "order": null, 2723 | "overflow": null, 2724 | "overflow_x": null, 2725 | "overflow_y": null, 2726 | "padding": null, 2727 | "right": null, 2728 | "top": null, 2729 | "visibility": null, 2730 | "width": null 2731 | } 2732 | }, 2733 | "c90552efb9e948be970c16b2b56e5042": { 2734 | "model_module": "@jupyter-widgets/controls", 2735 | "model_module_version": "1.5.0", 2736 | "model_name": "HBoxModel", 2737 | "state": { 2738 | "_dom_classes": [], 2739 | "_model_module": "@jupyter-widgets/controls", 2740 | "_model_module_version": "1.5.0", 2741 | "_model_name": "HBoxModel", 2742 | "_view_count": null, 2743 | "_view_module": "@jupyter-widgets/controls", 2744 | "_view_module_version": "1.5.0", 2745 | "_view_name": "HBoxView", 2746 | "box_style": "", 2747 | "children": [ 2748 | "IPY_MODEL_44c5ff8f8c174a1a994062849c8488f7", 2749 | "IPY_MODEL_e27560d2924647fb8123edbeac778d6f", 2750 | "IPY_MODEL_3a487f0804d34da58bcd2afc2234027e" 2751 | ], 2752 | "layout": "IPY_MODEL_c50e91d15c7a4919bb2d048ef4aa51ef" 2753 | } 2754 | }, 2755 | "d0a7b3b62181431b9ca9870c167b983e": { 2756 | "model_module": "@jupyter-widgets/base", 2757 | "model_module_version": "1.2.0", 2758 | "model_name": "LayoutModel", 2759 | "state": { 2760 | "_model_module": "@jupyter-widgets/base", 2761 | "_model_module_version": "1.2.0", 2762 | "_model_name": "LayoutModel", 2763 | "_view_count": null, 2764 | "_view_module": "@jupyter-widgets/base", 2765 | "_view_module_version": "1.2.0", 2766 | "_view_name": "LayoutView", 2767 | "align_content": null, 2768 | "align_items": null, 2769 | "align_self": null, 2770 | "border": null, 2771 | "bottom": null, 2772 | "display": null, 2773 | "flex": null, 2774 | "flex_flow": null, 2775 | "grid_area": null, 2776 | "grid_auto_columns": null, 2777 | "grid_auto_flow": null, 2778 | "grid_auto_rows": null, 2779 | "grid_column": null, 2780 | "grid_gap": null, 2781 | "grid_row": null, 2782 | "grid_template_areas": null, 2783 | "grid_template_columns": null, 2784 | "grid_template_rows": null, 2785 | "height": null, 2786 | "justify_content": null, 2787 | "justify_items": null, 2788 | "left": null, 2789 | "margin": null, 2790 | "max_height": null, 2791 | "max_width": null, 2792 | "min_height": null, 2793 | "min_width": null, 2794 | "object_fit": null, 2795 | "object_position": null, 2796 | "order": null, 2797 | "overflow": null, 2798 | "overflow_x": null, 2799 | "overflow_y": null, 2800 | "padding": null, 2801 | "right": null, 2802 | "top": null, 2803 | "visibility": null, 2804 | "width": null 2805 | } 2806 | }, 2807 | "d425bdb1be0144d48c43708e61745b4c": { 2808 | "model_module": "@jupyter-widgets/controls", 2809 | "model_module_version": "1.5.0", 2810 | "model_name": "FloatProgressModel", 2811 | "state": { 2812 | "_dom_classes": [], 
2813 | "_model_module": "@jupyter-widgets/controls", 2814 | "_model_module_version": "1.5.0", 2815 | "_model_name": "FloatProgressModel", 2816 | "_view_count": null, 2817 | "_view_module": "@jupyter-widgets/controls", 2818 | "_view_module_version": "1.5.0", 2819 | "_view_name": "ProgressView", 2820 | "bar_style": "success", 2821 | "description": "", 2822 | "description_tooltip": null, 2823 | "layout": "IPY_MODEL_b3f74dd34648458ca9df0ed01ab0f559", 2824 | "max": 4689074, 2825 | "min": 0, 2826 | "orientation": "horizontal", 2827 | "style": "IPY_MODEL_81c6f32b33b0420081e0c7b27299a2b8", 2828 | "value": 4689074 2829 | } 2830 | }, 2831 | "d526fa7789664d2f87e9ad1d370cb848": { 2832 | "model_module": "@jupyter-widgets/controls", 2833 | "model_module_version": "1.5.0", 2834 | "model_name": "FloatProgressModel", 2835 | "state": { 2836 | "_dom_classes": [], 2837 | "_model_module": "@jupyter-widgets/controls", 2838 | "_model_module_version": "1.5.0", 2839 | "_model_name": "FloatProgressModel", 2840 | "_view_count": null, 2841 | "_view_module": "@jupyter-widgets/controls", 2842 | "_view_module_version": "1.5.0", 2843 | "_view_name": "ProgressView", 2844 | "bar_style": "success", 2845 | "description": "", 2846 | "description_tooltip": null, 2847 | "layout": "IPY_MODEL_525bd5ebb77e45ac8654b01651388c97", 2848 | "max": 35, 2849 | "min": 0, 2850 | "orientation": "horizontal", 2851 | "style": "IPY_MODEL_d84e7c4018c1448d9909cd46b7edf68d", 2852 | "value": 35 2853 | } 2854 | }, 2855 | "d84e7c4018c1448d9909cd46b7edf68d": { 2856 | "model_module": "@jupyter-widgets/controls", 2857 | "model_module_version": "1.5.0", 2858 | "model_name": "ProgressStyleModel", 2859 | "state": { 2860 | "_model_module": "@jupyter-widgets/controls", 2861 | "_model_module_version": "1.5.0", 2862 | "_model_name": "ProgressStyleModel", 2863 | "_view_count": null, 2864 | "_view_module": "@jupyter-widgets/base", 2865 | "_view_module_version": "1.2.0", 2866 | "_view_name": "StyleView", 2867 | "bar_color": null, 2868 | "description_width": "" 2869 | } 2870 | }, 2871 | "dbc76d5ff39b449691b110d839594912": { 2872 | "model_module": "@jupyter-widgets/controls", 2873 | "model_module_version": "1.5.0", 2874 | "model_name": "DescriptionStyleModel", 2875 | "state": { 2876 | "_model_module": "@jupyter-widgets/controls", 2877 | "_model_module_version": "1.5.0", 2878 | "_model_name": "DescriptionStyleModel", 2879 | "_view_count": null, 2880 | "_view_module": "@jupyter-widgets/base", 2881 | "_view_module_version": "1.2.0", 2882 | "_view_name": "StyleView", 2883 | "description_width": "" 2884 | } 2885 | }, 2886 | "e27560d2924647fb8123edbeac778d6f": { 2887 | "model_module": "@jupyter-widgets/controls", 2888 | "model_module_version": "1.5.0", 2889 | "model_name": "FloatProgressModel", 2890 | "state": { 2891 | "_dom_classes": [], 2892 | "_model_module": "@jupyter-widgets/controls", 2893 | "_model_module_version": "1.5.0", 2894 | "_model_name": "FloatProgressModel", 2895 | "_view_count": null, 2896 | "_view_module": "@jupyter-widgets/controls", 2897 | "_view_module_version": "1.5.0", 2898 | "_view_name": "ProgressView", 2899 | "bar_style": "success", 2900 | "description": "", 2901 | "description_tooltip": null, 2902 | "layout": "IPY_MODEL_c8fb24696fc14bbb81827eb5fad2dbc0", 2903 | "max": 670, 2904 | "min": 0, 2905 | "orientation": "horizontal", 2906 | "style": "IPY_MODEL_63040bada9764ff8b5dac77d22a90008", 2907 | "value": 670 2908 | } 2909 | }, 2910 | "e4354fcaac5642ae89ce8a3c69d1b68d": { 2911 | "model_module": "@jupyter-widgets/controls", 2912 | 
"model_module_version": "1.5.0", 2913 | "model_name": "HTMLModel", 2914 | "state": { 2915 | "_dom_classes": [], 2916 | "_model_module": "@jupyter-widgets/controls", 2917 | "_model_module_version": "1.5.0", 2918 | "_model_name": "HTMLModel", 2919 | "_view_count": null, 2920 | "_view_module": "@jupyter-widgets/controls", 2921 | "_view_module_version": "1.5.0", 2922 | "_view_name": "HTMLView", 2923 | "description": "", 2924 | "description_tooltip": null, 2925 | "layout": "IPY_MODEL_6da8546c83864bd98dd3bc7568de095b", 2926 | "placeholder": "​", 2927 | "style": "IPY_MODEL_39367aac2c894342ab3ed257ac8c2e4f", 2928 | "value": "generation_config.json: 100%" 2929 | } 2930 | }, 2931 | "edce2515b0cd4d7298064476d43646a1": { 2932 | "model_module": "@jupyter-widgets/controls", 2933 | "model_module_version": "1.5.0", 2934 | "model_name": "DescriptionStyleModel", 2935 | "state": { 2936 | "_model_module": "@jupyter-widgets/controls", 2937 | "_model_module_version": "1.5.0", 2938 | "_model_name": "DescriptionStyleModel", 2939 | "_view_count": null, 2940 | "_view_module": "@jupyter-widgets/base", 2941 | "_view_module_version": "1.2.0", 2942 | "_view_name": "StyleView", 2943 | "description_width": "" 2944 | } 2945 | }, 2946 | "fa2d656a9fd549dfb64e97546e82591d": { 2947 | "model_module": "@jupyter-widgets/controls", 2948 | "model_module_version": "1.5.0", 2949 | "model_name": "ProgressStyleModel", 2950 | "state": { 2951 | "_model_module": "@jupyter-widgets/controls", 2952 | "_model_module_version": "1.5.0", 2953 | "_model_name": "ProgressStyleModel", 2954 | "_view_count": null, 2955 | "_view_module": "@jupyter-widgets/base", 2956 | "_view_module_version": "1.2.0", 2957 | "_view_name": "StyleView", 2958 | "bar_color": null, 2959 | "description_width": "" 2960 | } 2961 | }, 2962 | "fa4045a862034e428ab40009d04d313e": { 2963 | "model_module": "@jupyter-widgets/controls", 2964 | "model_module_version": "1.5.0", 2965 | "model_name": "HTMLModel", 2966 | "state": { 2967 | "_dom_classes": [], 2968 | "_model_module": "@jupyter-widgets/controls", 2969 | "_model_module_version": "1.5.0", 2970 | "_model_name": "HTMLModel", 2971 | "_view_count": null, 2972 | "_view_module": "@jupyter-widgets/controls", 2973 | "_view_module_version": "1.5.0", 2974 | "_view_name": "HTMLView", 2975 | "description": "", 2976 | "description_tooltip": null, 2977 | "layout": "IPY_MODEL_8001fd2f05784a66b7f3aac77fcf2303", 2978 | "placeholder": "​", 2979 | "style": "IPY_MODEL_39323e862d2f4da089cbd7a7a45b378c", 2980 | "value": "tokenizer.model: 100%" 2981 | } 2982 | }, 2983 | "fb8e26b8d050464595bf725cf2b2e9d3": { 2984 | "model_module": "@jupyter-widgets/controls", 2985 | "model_module_version": "1.5.0", 2986 | "model_name": "HTMLModel", 2987 | "state": { 2988 | "_dom_classes": [], 2989 | "_model_module": "@jupyter-widgets/controls", 2990 | "_model_module_version": "1.5.0", 2991 | "_model_name": "HTMLModel", 2992 | "_view_count": null, 2993 | "_view_module": "@jupyter-widgets/controls", 2994 | "_view_module_version": "1.5.0", 2995 | "_view_name": "HTMLView", 2996 | "description": "", 2997 | "description_tooltip": null, 2998 | "layout": "IPY_MODEL_8a9b97f5d64a4781b6d46134bc5ebff8", 2999 | "placeholder": "​", 3000 | "style": "IPY_MODEL_8bbd54857d834502b94847a6903dcdf9", 3001 | "value": " 4.69M/4.69M [00:00<00:00, 21.2MB/s]" 3002 | } 3003 | }, 3004 | "fd1d0df20f1c4b3082d4ee128570c4e5": { 3005 | "model_module": "@jupyter-widgets/controls", 3006 | "model_module_version": "1.5.0", 3007 | "model_name": "HBoxModel", 3008 | "state": { 3009 | "_dom_classes": [], 
3010 | "_model_module": "@jupyter-widgets/controls", 3011 | "_model_module_version": "1.5.0", 3012 | "_model_name": "HBoxModel", 3013 | "_view_count": null, 3014 | "_view_module": "@jupyter-widgets/controls", 3015 | "_view_module_version": "1.5.0", 3016 | "_view_name": "HBoxView", 3017 | "box_style": "", 3018 | "children": [ 3019 | "IPY_MODEL_218927b46cf148e6903d64e656449a7c", 3020 | "IPY_MODEL_954458b7719f442cafd3409ab7a90116", 3021 | "IPY_MODEL_ac32520514b34a2fbe86b90fccbf9bae" 3022 | ], 3023 | "layout": "IPY_MODEL_c35c454a893a4edfae1f27a6707cb77c" 3024 | } 3025 | } 3026 | } 3027 | } 3028 | }, 3029 | "nbformat": 4, 3030 | "nbformat_minor": 0 3031 | } 3032 | --------------------------------------------------------------------------------