├── .gitignore ├── Lab 1 : PyTorch & Brevitas ├── LAB_1_8BIT_TRAIN.ipynb ├── OLD_FP_TRAINING.ipynb └── readme.md ├── Lab 2 : FINN Compiler ├── LAB2.ipynb ├── LAB2_VERIFICATION.ipynb └── readme.md ├── Lab 3 : Porting to Zynq FPGA Manually ├── Vitis_programs │ ├── data_generator │ │ └── main.py │ └── main.c ├── final_custom_system.png ├── final_output.png └── readme.md ├── Lecture 1 : Reminders ├── PTQ_quantization_for_nn.ipynb ├── PyTorch.ipynb └── quantization.ipynb ├── Lecture 2 : Project concepts ├── Brevitas.ipynb ├── fc_mnist_simple.png └── readme.md ├── SLIDES_CC.pdf └── readme.md /.gitignore: -------------------------------------------------------------------------------- 1 | data 2 | mnist_samples.h -------------------------------------------------------------------------------- /Lab 1 : PyTorch & Brevitas/LAB_1_8BIT_TRAIN.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "a71a8839-7c58-4862-a5f0-0f7e97085918", 6 | "metadata": {}, 7 | "source": [ 8 | "# THIS NOTEBOOK TRAINS A FASHION-MNIST MODEL USING 0-255 VALUES (UINT8)\n", 9 | "\n", 10 | "After creating the docker environement,\n", 11 | "\n", 12 | "The goal of this lab is to create a MNIST Fasion model in pytorch and experiment with the different parameters\n", 13 | "\n", 14 | "Then, we will do the same model but fully quantized and start adapting it for FINN" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "id": "2a09c1a7-e82e-4d14-bced-29aaf340638c", 20 | "metadata": {}, 21 | "source": [ 22 | "## Base model creation" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "id": "a4e9dec1-299c-45ec-bc1a-41d4825fea44", 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "import torch\n", 33 | "from torchvision import datasets, transforms\n", 34 | "from torch.utils.data import DataLoader\n", 35 | "\n", 36 | "root_dir = \"/tmp/finn_dev_rootmin\"" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 2, 42 | "id": "b9651dd1-25d9-4c48-8e02-08c28d29fa84", 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "# Define a transform to normalize the data\n", 47 | "transform = transforms.Compose([\n", 48 | " transforms.ToTensor(), # Convert the image to a PyTorch tensor\n", 49 | " #transforms.Normalize((0.5,), (0.5,)) # Normalize the tensor with mean and std\n", 50 | "]);\n", 51 | "\n", 52 | "# Load the training dataset\n", 53 | "train_dataset = datasets.FashionMNIST(\n", 54 | " root='./data', # Directory to save the dataset\n", 55 | " train=True, # Load the training set\n", 56 | " download=True, # Download the dataset if it doesn't exist\n", 57 | " transform=transform # Apply the defined transformations\n", 58 | ");\n", 59 | "\n", 60 | "# Load the test dataset\n", 61 | "test_dataset = datasets.FashionMNIST(\n", 62 | " root='./data',\n", 63 | " train=False, # Load the test set\n", 64 | " download=True,\n", 65 | " transform=transform\n", 66 | ")" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 3, 72 | "id": "7cb48f47-1369-4f87-8c9b-c0e29a8de5e5", 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "import matplotlib.pyplot as plt\n", 77 | "import numpy as np\n", 78 | "\n", 79 | "image, label = train_dataset[5]\n", 80 | "image = np.array(image).squeeze()\n", 81 | "print(\"Min : \", np.min(image[0]), \" /// Max : \", np.max(image[0]))\n", 82 | "# plot the sample\n", 83 | "\n", 84 | "fig = plt.figure\n", 85 | "plt.imshow(image, 
cmap='gray')\n", 86 | "plt.show()" 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 4, 92 | "id": "ccdce0f7-c128-48f5-956e-7174818139d0", 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [ 96 | "batch_size = 100\n", 97 | "\n", 98 | "# Create a data loader for the training set\n", 99 | "train_loader = DataLoader(\n", 100 | " dataset=train_dataset,\n", 101 | " batch_size=batch_size, # Number of samples per batch\n", 102 | " shuffle=True # Shuffle the data\n", 103 | ")\n", 104 | "\n", 105 | "# Create a data loader for the test set\n", 106 | "test_loader = DataLoader(\n", 107 | " dataset=test_dataset,\n", 108 | " batch_size=batch_size,\n", 109 | " shuffle=False # No need to shuffle the test data\n", 110 | ")" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": 5, 116 | "id": "5c0f87bc-cfa0-4d45-9ed3-eddff267ddc7", 117 | "metadata": {}, 118 | "outputs": [], 119 | "source": [ 120 | "import torch\n", 121 | "import torch.nn as nn\n", 122 | "import torch.optim as optim" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": 6, 128 | "id": "135060a6-6f2f-4cee-a32e-3881d985b165", 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "input_size = 28*28\n", 133 | "hidden1 = 64\n", 134 | "hidden2 = 64\n", 135 | "num_classes = 10\n", 136 | "\n", 137 | "class SimpleFCModel(nn.Module):\n", 138 | " def __init__(self):\n", 139 | " super(SimpleFCModel, self).__init__()\n", 140 | " \n", 141 | " # Define the layers\n", 142 | " self.relu = nn.ReLU() # Activation function\n", 143 | " self.fc1 = nn.Linear(input_size, hidden1) # First hidden layer\n", 144 | " self.fc2 = nn.Linear(hidden1, hidden2) # Second hidden layer\n", 145 | " self.fc3 = nn.Linear(hidden2, num_classes) # Output layer\n", 146 | " \n", 147 | " def forward(self, x):\n", 148 | " # Forward pass through the network\n", 149 | " out = self.fc1(x)\n", 150 | " out = self.relu(out)\n", 151 | " out = self.fc2(out)\n", 152 | " out = self.relu(out)\n", 153 | " out = self.fc3(out)\n", 154 | " return out" 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": 7, 160 | "id": "559a5697-2f00-4a5e-8f5f-119ea87351d0", 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "model = SimpleFCModel()\n", 165 | "# Loss function\n", 166 | "criterion = nn.CrossEntropyLoss()\n", 167 | "# Optimizer\n", 168 | "optimizer = optim.Adam(model.parameters(), lr=0.001)\n", 169 | "model" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": 8, 175 | "id": "779b5632-3ac5-41da-a155-63f7c4d343a8", 176 | "metadata": {}, 177 | "outputs": [], 178 | "source": [ 179 | "num_epochs = 5\n", 180 | "model.train()\n", 181 | "\n", 182 | "for epoch in range(num_epochs):\n", 183 | " for batch_idx, (images, labels) in enumerate(train_loader):\n", 184 | " images = torch.reshape(images, (batch_size, input_size))\n", 185 | " out = model(images)\n", 186 | " loss = criterion(out, labels)\n", 187 | " optimizer.zero_grad()\n", 188 | " loss.backward()\n", 189 | " optimizer.step()\n", 190 | "\n", 191 | " print(f'Epoch [{epoch+1}/5], Loss: {loss.item():.4f}')" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": 9, 197 | "id": "6fb5c432-a6e6-4668-87e9-98cbd7799974", 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "# test the model\n", 202 | "\n", 203 | "model.eval()\n", 204 | "correct = 0\n", 205 | "total = 0\n", 206 | "loss_total = 0\n", 207 | "\n", 208 | "with torch.no_grad():\n", 209 | " \n", 210 | " for batch_idx, 
(images, labels) in enumerate(test_loader):\n", 211 | " images = torch.reshape(images, (batch_size, input_size))\n", 212 | " out = model(images)\n", 213 | " _, predicted = torch.max(out.data, 1)\n", 214 | " total += labels.size(0)\n", 215 | " correct += (predicted == labels).sum().item()\n", 216 | "\n", 217 | " accuracy = 100 * correct / total\n", 218 | " print(\"accuracy =\", accuracy)" 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "id": "1648a989-431b-418d-9dbf-118ca52aa7e1", 224 | "metadata": {}, 225 | "source": [ 226 | "# PART 2\n", 227 | "\n", 228 | "This part is about creating a quantized version of the model and adapting it to finn." 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 10, 234 | "id": "dab5b0a0-9cfd-41c7-99cc-3c2d3365cbcf", 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [ 238 | "import torch\n", 239 | "from brevitas.nn import QuantLinear\n", 240 | "from brevitas.nn import QuantReLU\n", 241 | "from brevitas.nn import QuantIdentity\n", 242 | "\n", 243 | "import torch.nn as nn\n", 244 | "\n", 245 | "brevitas_input_size = 28 * 28\n", 246 | "brevitas_hidden1 = 64\n", 247 | "brevitas_hidden2 = 64\n", 248 | "brevitas_num_classes = 10\n", 249 | "weight_bit_width = 4\n", 250 | "act_bit_width = 4\n", 251 | "dropout_prob = 0.5\n", 252 | "\n", 253 | "#is this model fully quantized or only the wieghts, i shall dig to find out once done !\n", 254 | "brevitas_model = nn.Sequential(\n", 255 | " QuantLinear(brevitas_input_size, brevitas_hidden1, bias=True, weight_bit_width=weight_bit_width),\n", 256 | " nn.BatchNorm1d(brevitas_hidden1),\n", 257 | " nn.Dropout(0.5),\n", 258 | " QuantReLU(bit_width=act_bit_width),\n", 259 | " QuantLinear(brevitas_hidden1, brevitas_hidden2, bias=True, weight_bit_width=weight_bit_width),\n", 260 | " nn.BatchNorm1d(brevitas_hidden2),\n", 261 | " nn.Dropout(0.5),\n", 262 | " QuantReLU(bit_width=act_bit_width),\n", 263 | " QuantLinear(brevitas_hidden2, brevitas_num_classes, bias=True, weight_bit_width=weight_bit_width),\n", 264 | " QuantReLU(bit_width=act_bit_width)\n", 265 | ")\n", 266 | "\n", 267 | "# uncomment to check the network object\n", 268 | "#brevitas_model" 269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "id": "289eec74", 274 | "metadata": {}, 275 | "source": [ 276 | "### The input data has to be quantized.\n", 277 | "\n", 278 | "Normaly in brevistas, we can use the ```QuantIdentity()``` layer for this but unfortunatly, it does not convert to hardware (yet)" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": 11, 284 | "id": "670acb3a", 285 | "metadata": {}, 286 | "outputs": [], 287 | "source": [ 288 | "from torch.utils.data import Dataset\n", 289 | "\n", 290 | "# Define a custom quantization function\n", 291 | "def quantize_tensor(x, num_bits=8):\n", 292 | " qmin = 0.\n", 293 | " qmax = 2.**num_bits - 1.\n", 294 | " min_val, max_val = x.min(), x.max()\n", 295 | "\n", 296 | " scale = (max_val - min_val) / (qmax - qmin)\n", 297 | " initial_zero_point = qmin - min_val / scale\n", 298 | "\n", 299 | " zero_point = 0\n", 300 | " if initial_zero_point < qmin:\n", 301 | " zero_point = qmin\n", 302 | " elif initial_zero_point > qmax:\n", 303 | " zero_point = qmax\n", 304 | " else:\n", 305 | " zero_point = initial_zero_point\n", 306 | "\n", 307 | " zero_point = int(zero_point)\n", 308 | " q_x = zero_point + x / scale\n", 309 | " q_x.clamp_(qmin, qmax).round_()\n", 310 | " \n", 311 | " return q_x\n", 312 | "\n", 313 | "# Define the quantized transform\n", 314 | 
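The `quantize_tensor` helper above maps the 0-1 float images onto the 0-255 UINT8 grid, but it does not return the scale and zero-point it derived. As a quick sanity check, here is a small standalone sketch (the helper names `quantize_with_params` and `dequantize` are illustrative, not part of the lab code) that recomputes the same affine mapping and verifies that the round-trip error stays within roughly half a quantization step:

```python
# Standalone sanity check for the asymmetric quantization used above.
# quantize_with_params / dequantize are illustrative names, not lab code.
import torch

def quantize_with_params(x, num_bits=8):
    qmin, qmax = 0.0, 2.0 ** num_bits - 1.0
    min_val, max_val = x.min(), x.max()
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = int(torch.clamp(qmin - min_val / scale, qmin, qmax).round())
    q_x = (zero_point + x / scale).clamp(qmin, qmax).round()
    return q_x, scale.item(), zero_point

def dequantize(q_x, scale, zero_point):
    return (q_x - zero_point) * scale

x = torch.rand(4, 4)                      # values in [0, 1), like ToTensor() output
q_x, scale, zp = quantize_with_params(x)
x_hat = dequantize(q_x, scale, zp)
print("scale:", scale, "zero_point:", zp)
print("max abs round-trip error:", (x - x_hat).abs().max().item())  # ~scale / 2 at most
```

If the printed error is much larger than `scale / 2`, the scale or zero-point computation is off.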
"transform_quantized = transforms.Compose([\n", 315 | " transforms.ToTensor(), # Convert the image to a PyTorch tensor\n", 316 | " transforms.Lambda(lambda x: quantize_tensor(x)) # Apply quantization\n", 317 | "])\n", 318 | "\n", 319 | "# Load the training dataset\n", 320 | "train_dataset_qnt = datasets.FashionMNIST(\n", 321 | " root='./data', # Directory to save the dataset\n", 322 | " train=True, # Load the training set\n", 323 | " download=True, # Download the dataset if it doesn't exist\n", 324 | " transform=transform_quantized # Apply the defined transformations\n", 325 | ");\n", 326 | "\n", 327 | "# Load the test dataset\n", 328 | "test_dataset_qnt = datasets.FashionMNIST(\n", 329 | " root='./data',\n", 330 | " train=False, # Load the test set\n", 331 | " download=True,\n", 332 | " transform=transform_quantized\n", 333 | ")\n", 334 | "\n", 335 | "train_loader = DataLoader(train_dataset_qnt, 100)\n", 336 | "test_loader = DataLoader(test_dataset_qnt, 100)" 337 | ] 338 | }, 339 | { 340 | "cell_type": "code", 341 | "execution_count": 12, 342 | "id": "67c819d8", 343 | "metadata": {}, 344 | "outputs": [], 345 | "source": [ 346 | "import matplotlib.pyplot as plt\n", 347 | "import numpy as np\n", 348 | "\n", 349 | "image, label = train_dataset_qnt[10]\n", 350 | "image = np.array(image).squeeze()\n", 351 | "print(\"Min : \", np.min(image), \" /// Max : \", np.max(image))\n", 352 | "print(image.dtype)\n", 353 | "# plot the sample\n", 354 | "\n", 355 | "fig = plt.figure\n", 356 | "plt.imshow(image, cmap='gray')\n", 357 | "plt.show()" 358 | ] 359 | }, 360 | { 361 | "cell_type": "code", 362 | "execution_count": 13, 363 | "id": "236d85fe-986a-44f4-84d1-fce1d5591f80", 364 | "metadata": {}, 365 | "outputs": [], 366 | "source": [ 367 | "# loss criterion and optimizer\n", 368 | "criterion = nn.CrossEntropyLoss()\n", 369 | "optimizer = torch.optim.Adam(brevitas_model.parameters(), lr=0.001, betas=(0.9, 0.999))\n", 370 | "\n", 371 | "num_epochs = 5\n", 372 | "brevitas_model.train()\n", 373 | "\n", 374 | "for epoch in range(num_epochs):\n", 375 | " for batch_idx, (images, labels) in enumerate(train_loader):\n", 376 | " images = torch.reshape(images, (batch_size, 28*28))\n", 377 | " out = brevitas_model(images.float()) # This just make the value a float ie 255 becomes 255,0 and not 1\n", 378 | " loss = criterion(out, labels)\n", 379 | " optimizer.zero_grad()\n", 380 | " loss.backward()\n", 381 | " optimizer.step()\n", 382 | "\n", 383 | " print(f'Epoch [{epoch+1}/5], Loss: {loss.item():.4f}')" 384 | ] 385 | }, 386 | { 387 | "cell_type": "code", 388 | "execution_count": 14, 389 | "id": "f4d239cf-edb2-4faa-a502-d05861a156fb", 390 | "metadata": {}, 391 | "outputs": [], 392 | "source": [ 393 | "# test the model\n", 394 | "\n", 395 | "brevitas_model.eval()\n", 396 | "correct = 0\n", 397 | "total = 0\n", 398 | "loss_total = 0\n", 399 | "\n", 400 | "with torch.no_grad():\n", 401 | " for batch_idx, (images, labels) in enumerate(test_loader):\n", 402 | " images = torch.reshape(images, (batch_size, 28*28))\n", 403 | " out = brevitas_model(images.float())\n", 404 | " _, predicted = torch.max(out.data, 1)\n", 405 | " total += labels.size(0)\n", 406 | " correct += (predicted == labels).sum().item()\n", 407 | "\n", 408 | " accuracy = 100 * correct / total\n", 409 | " print(\"accuracy =\", accuracy, \"%\")" 410 | ] 411 | }, 412 | { 413 | "cell_type": "code", 414 | "execution_count": 15, 415 | "id": "c3843919-1e83-4c50-accd-3e9292cecc2f", 416 | "metadata": {}, 417 | "outputs": [], 418 | "source": [ 419 | "#lets have a 
quick look at the weights too\n", 420 | "print(brevitas_model[0].quant_weight())\n", 421 | "#internally, weoght are stored as float 32, here nare ways to visualize actual quantized weights :\n", 422 | "print(brevitas_model[0].quant_weight().int())\n", 423 | "print(brevitas_model[0].quant_weight().int().dtype)" 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "execution_count": 16, 429 | "id": "8247848d-6308-4524-984c-e15e9d1e71c1", 430 | "metadata": {}, 431 | "outputs": [], 432 | "source": [ 433 | "# Model wrapper to :\n", 434 | "# - make sure the model can rack in bipolar data, we just saw so no need for that\n", 435 | "# - add a binary quantizer on the output whith bipolar behavior\n", 436 | "# note to myself, may have to rework that output quantizer (or just do no pre/post, also works fine\n", 437 | "from brevitas.nn import QuantIdentity\n", 438 | "\n", 439 | "\n", 440 | "class ModelForExport(nn.Module):\n", 441 | " def __init__(self, my_pretrained_model):\n", 442 | " super(ModelForExport, self).__init__()\n", 443 | " self.pretrained = my_pretrained_model\n", 444 | " \n", 445 | " def forward(self, x):\n", 446 | " out= self.pretrained(x)\n", 447 | " return out\n", 448 | "\n", 449 | "model_for_export = ModelForExport(brevitas_model)" 450 | ] 451 | }, 452 | { 453 | "cell_type": "code", 454 | "execution_count": 17, 455 | "id": "058a27d4-d48a-41b4-834e-75253bb230ff", 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [ 459 | "# test the model\n", 460 | "\n", 461 | "model_for_export.eval()\n", 462 | "correct = 0\n", 463 | "total = 0\n", 464 | "loss_total = 0\n", 465 | "\n", 466 | "with torch.no_grad():\n", 467 | " for batch_idx, (images, labels) in enumerate(test_loader):\n", 468 | " images = torch.reshape(images, (batch_size, 28*28))\n", 469 | " out = model_for_export(images.float())\n", 470 | " # print(images)\n", 471 | " _, predicted = torch.max(out.data, 1)\n", 472 | " total += labels.size(0)\n", 473 | " correct += (predicted == labels).sum().item()\n", 474 | " print(\"accuracy =\", accuracy)" 475 | ] 476 | }, 477 | { 478 | "cell_type": "markdown", 479 | "id": "3a1d9899-000b-4343-ac81-2c425b571519", 480 | "metadata": {}, 481 | "source": [ 482 | "# PART 3\n", 483 | "\n", 484 | "Exporting the model and visualizing it" 485 | ] 486 | }, 487 | { 488 | "cell_type": "code", 489 | "execution_count": 18, 490 | "id": "11f27a3b-88c6-483e-be82-85d1ee53129b", 491 | "metadata": {}, 492 | "outputs": [], 493 | "source": [ 494 | "from brevitas.export import export_qonnx\n", 495 | "from qonnx.util.cleanup import cleanup as qonnx_cleanup\n", 496 | "from qonnx.core.modelwrapper import ModelWrapper\n", 497 | "from qonnx.core.datatype import DataType\n", 498 | "from finn.transformation.qonnx.convert_qonnx_to_finn import ConvertQONNXtoFINN\n", 499 | "\n", 500 | "filename = root_dir + \"/LAB_1.onnx\"\n", 501 | "filename_clean = root_dir + \"/LAB1_clean.onnx\"\n", 502 | "\n", 503 | "def asymmetric_quantize(arr, num_bits=8):\n", 504 | " min = 0\n", 505 | " max = 2**num_bits - 1\n", 506 | " \n", 507 | " beta = np.min(arr)\n", 508 | " alpha = np.max(arr)\n", 509 | " scale = (alpha - beta) / max\n", 510 | " zero_point = np.clip((-beta/scale),0,max).round().astype(np.int8)\n", 511 | "\n", 512 | " quantized_arr = np.clip(np.round(arr / scale + zero_point), min, max).astype(np.float32)\n", 513 | " \n", 514 | " return quantized_arr\n", 515 | "\n", 516 | "#Crete a tensor ressembling the input tensor we saw earlier\n", 517 | "input_a = np.random.rand(1,28*28).astype(np.float32)\n", 518 | "input_a = 
asymmetric_quantize(input_a)\n", 519 | "print(np.max(input_a[0]))\n", 520 | "scale = 1.0\n", 521 | "input_t = torch.from_numpy(input_a * scale)\n", 522 | "\n", 523 | "# Export to ONNX\n", 524 | "export_qonnx(\n", 525 | " model_for_export, export_path=filename, input_t=input_t\n", 526 | ")\n", 527 | "\n", 528 | "# clean-up\n", 529 | "qonnx_cleanup(filename, out_file=filename_clean)\n", 530 | "\n", 531 | "# ModelWrapper\n", 532 | "model = ModelWrapper(filename_clean)\n", 533 | "model = model.transform(ConvertQONNXtoFINN())\n", 534 | "model.save(root_dir + \"/ready_finn.onnx\")\n", 535 | "\n", 536 | "print(\"Model saved to \" + root_dir + \"/ready_finn.onnx\")" 537 | ] 538 | }, 539 | { 540 | "cell_type": "code", 541 | "execution_count": 20, 542 | "id": "8d6f0250-5b1c-4276-9293-c29a1b112782", 543 | "metadata": {}, 544 | "outputs": [], 545 | "source": [ 546 | "from finn.util.visualization import showInNetron\n", 547 | "\n", 548 | "showInNetron(root_dir + \"/ready_finn.onnx\")" 549 | ] 550 | }, 551 | { 552 | "cell_type": "code", 553 | "execution_count": null, 554 | "id": "3f09bf3d-0802-4b3f-b686-fd649f5f6687", 555 | "metadata": {}, 556 | "outputs": [], 557 | "source": [] 558 | } 559 | ], 560 | "metadata": { 561 | "kernelspec": { 562 | "display_name": "Python 3 (ipykernel)", 563 | "language": "python", 564 | "name": "python3" 565 | } 566 | }, 567 | "nbformat": 4, 568 | "nbformat_minor": 5 569 | } 570 | -------------------------------------------------------------------------------- /Lab 1 : PyTorch & Brevitas/OLD_FP_TRAINING.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "a71a8839-7c58-4862-a5f0-0f7e97085918", 6 | "metadata": {}, 7 | "source": [ 8 | "# PART 1\n", 9 | "\n", 10 | "After creating the docker environement,\n", 11 | "\n", 12 | "The goal of this lab is to create a MNIST Fasion model in pytorch and experiment with the different parameters\n", 13 | "\n", 14 | "Then, we will do the same model but fully quantized and start adapting it for FINN" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "id": "2a09c1a7-e82e-4d14-bced-29aaf340638c", 20 | "metadata": {}, 21 | "source": [ 22 | "## Base model creation" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "id": "a4e9dec1-299c-45ec-bc1a-41d4825fea44", 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "import torch\n", 33 | "from torchvision import datasets, transforms\n", 34 | "from torch.utils.data import DataLoader" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": 2, 40 | "id": "b9651dd1-25d9-4c48-8e02-08c28d29fa84", 41 | "metadata": {}, 42 | "outputs": [], 43 | "source": [ 44 | "# Define a transform to normalize the data\n", 45 | "transform = transforms.Compose([\n", 46 | " transforms.ToTensor(), # Convert the image to a PyTorch tensor\n", 47 | "]);\n", 48 | "\n", 49 | "# Load the training dataset\n", 50 | "train_dataset = datasets.FashionMNIST(\n", 51 | " root='./data', # Directory to save the dataset\n", 52 | " train=True, # Load the training set\n", 53 | " download=True, # Download the dataset if it doesn't exist\n", 54 | " transform=transform # Apply the defined transformations\n", 55 | ");\n", 56 | "\n", 57 | "# Load the test dataset\n", 58 | "test_dataset = datasets.FashionMNIST(\n", 59 | " root='./data',\n", 60 | " train=False, # Load the test set\n", 61 | " download=True,\n", 62 | " transform=transform\n", 63 | ")" 64 | ] 65 | }, 66 | { 67 | "cell_type": 
"code", 68 | "execution_count": null, 69 | "id": "7cb48f47-1369-4f87-8c9b-c0e29a8de5e5", 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "import matplotlib.pyplot as plt\n", 74 | "import numpy as np\n", 75 | "\n", 76 | "image, label = train_dataset[5]\n", 77 | "image = np.array(image).squeeze()\n", 78 | "print(\"Min : \", np.min(image[0]), \" /// Max : \", np.max(image[0]))\n", 79 | "# plot the sample\n", 80 | "\n", 81 | "fig = plt.figure\n", 82 | "plt.imshow(image, cmap='gray')\n", 83 | "plt.show()" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 4, 89 | "id": "ccdce0f7-c128-48f5-956e-7174818139d0", 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "batch_size = 100\n", 94 | "\n", 95 | "# Create a data loader for the training set\n", 96 | "train_loader = DataLoader(\n", 97 | " dataset=train_dataset,\n", 98 | " batch_size=batch_size, # Number of samples per batch\n", 99 | " shuffle=True # Shuffle the data\n", 100 | ")\n", 101 | "\n", 102 | "# Create a data loader for the test set\n", 103 | "test_loader = DataLoader(\n", 104 | " dataset=test_dataset,\n", 105 | " batch_size=batch_size,\n", 106 | " shuffle=False # No need to shuffle the test data\n", 107 | ")" 108 | ] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "execution_count": 5, 113 | "id": "5c0f87bc-cfa0-4d45-9ed3-eddff267ddc7", 114 | "metadata": {}, 115 | "outputs": [], 116 | "source": [ 117 | "import torch\n", 118 | "import torch.nn as nn\n", 119 | "import torch.optim as optim" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": 6, 125 | "id": "135060a6-6f2f-4cee-a32e-3881d985b165", 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "input_size = 28*28\n", 130 | "hidden1 = 64\n", 131 | "hidden2 = 64\n", 132 | "num_classes = 10\n", 133 | "\n", 134 | "class SimpleFCModel(nn.Module):\n", 135 | " def __init__(self):\n", 136 | " super(SimpleFCModel, self).__init__()\n", 137 | " \n", 138 | " # Define the layers\n", 139 | " self.relu = nn.ReLU() # Activation function\n", 140 | " self.fc1 = nn.Linear(input_size, hidden1) # First hidden layer\n", 141 | " self.fc2 = nn.Linear(hidden1, hidden2) # Second hidden layer\n", 142 | " self.fc3 = nn.Linear(hidden2, num_classes) # Output layer\n", 143 | " \n", 144 | " def forward(self, x):\n", 145 | " # Forward pass through the network\n", 146 | " out = self.fc1(x)\n", 147 | " out = self.relu(out)\n", 148 | " out = self.fc2(out)\n", 149 | " out = self.relu(out)\n", 150 | " out = self.fc3(out)\n", 151 | " return out" 152 | ] 153 | }, 154 | { 155 | "cell_type": "code", 156 | "execution_count": null, 157 | "id": "559a5697-2f00-4a5e-8f5f-119ea87351d0", 158 | "metadata": {}, 159 | "outputs": [], 160 | "source": [ 161 | "model = SimpleFCModel()\n", 162 | "# Loss function\n", 163 | "criterion = nn.CrossEntropyLoss()\n", 164 | "# Optimizer\n", 165 | "optimizer = optim.Adam(model.parameters(), lr=0.001)\n", 166 | "model" 167 | ] 168 | }, 169 | { 170 | "cell_type": "code", 171 | "execution_count": null, 172 | "id": "779b5632-3ac5-41da-a155-63f7c4d343a8", 173 | "metadata": {}, 174 | "outputs": [], 175 | "source": [ 176 | "num_epochs = 5\n", 177 | "model.train()\n", 178 | "\n", 179 | "for epoch in range(num_epochs):\n", 180 | " for batch_idx, (images, labels) in enumerate(train_loader):\n", 181 | " images = torch.reshape(images, (batch_size, input_size))\n", 182 | " out = model(images)\n", 183 | " loss = criterion(out, labels)\n", 184 | " optimizer.zero_grad()\n", 185 | " loss.backward()\n", 186 | " optimizer.step()\n", 
187 | "\n", 188 | " print(f'Epoch [{epoch+1}/5], Loss: {loss.item():.4f}')" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": null, 194 | "id": "6fb5c432-a6e6-4668-87e9-98cbd7799974", 195 | "metadata": {}, 196 | "outputs": [], 197 | "source": [ 198 | "# test the model\n", 199 | "\n", 200 | "model.eval()\n", 201 | "correct = 0\n", 202 | "total = 0\n", 203 | "loss_total = 0\n", 204 | "\n", 205 | "with torch.no_grad():\n", 206 | " for batch_idx, (images, labels) in enumerate(test_loader):\n", 207 | " images = torch.reshape(images, (batch_size, input_size))\n", 208 | " out = model(images)\n", 209 | " _, predicted = torch.max(out.data, 1)\n", 210 | " total += labels.size(0)\n", 211 | " correct += (predicted == labels).sum().item()\n", 212 | "\n", 213 | " accuracy = 100 * correct / total\n", 214 | " print(\"accuracy =\", accuracy)" 215 | ] 216 | }, 217 | { 218 | "cell_type": "markdown", 219 | "id": "1648a989-431b-418d-9dbf-118ca52aa7e1", 220 | "metadata": {}, 221 | "source": [ 222 | "# PART 2\n", 223 | "\n", 224 | "This part is about creating a quantized version of the model and adapting it to finn." 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": 10, 230 | "id": "dab5b0a0-9cfd-41c7-99cc-3c2d3365cbcf", 231 | "metadata": {}, 232 | "outputs": [], 233 | "source": [ 234 | "import torch\n", 235 | "from brevitas.nn import QuantLinear\n", 236 | "from brevitas.nn import QuantReLU\n", 237 | "from brevitas.nn import QuantIdentity\n", 238 | "\n", 239 | "import torch.nn as nn\n", 240 | "\n", 241 | "brevitas_input_size = 28 * 28\n", 242 | "brevitas_hidden1 = 64\n", 243 | "brevitas_hidden2 = 64\n", 244 | "brevitas_num_classes = 10\n", 245 | "weight_bit_width = 4\n", 246 | "act_bit_width = 4\n", 247 | "dropout_prob = 0.5\n", 248 | "\n", 249 | "#is this model fully quantized or only the wieghts, i shall dig to find out once done !\n", 250 | "brevitas_model = nn.Sequential(\n", 251 | " QuantLinear(brevitas_input_size, brevitas_hidden1, bias=True, weight_bit_width=weight_bit_width),\n", 252 | " nn.BatchNorm1d(brevitas_hidden1),\n", 253 | " nn.Dropout(0.5),\n", 254 | " QuantReLU(bit_width=act_bit_width),\n", 255 | " QuantLinear(brevitas_hidden1, brevitas_hidden2, bias=True, weight_bit_width=weight_bit_width),\n", 256 | " nn.BatchNorm1d(brevitas_hidden2),\n", 257 | " nn.Dropout(0.5),\n", 258 | " QuantReLU(bit_width=act_bit_width),\n", 259 | " QuantLinear(brevitas_hidden2, brevitas_num_classes, bias=True, weight_bit_width=weight_bit_width),\n", 260 | " QuantReLU(bit_width=act_bit_width)\n", 261 | ")\n", 262 | "\n", 263 | "# uncomment to check the network object\n", 264 | "#brevitas_model" 265 | ] 266 | }, 267 | { 268 | "cell_type": "markdown", 269 | "id": "289eec74", 270 | "metadata": {}, 271 | "source": [ 272 | "### The input data has to be quantized.\n", 273 | "\n", 274 | "Normaly in brevistas, we can use the ```QuantIdentity()``` layer for this but unfortunatly, it does not convert to hardware (yet)" 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": 11, 280 | "id": "670acb3a", 281 | "metadata": {}, 282 | "outputs": [], 283 | "source": [ 284 | "# Define the quantized transform\n", 285 | "transform = transforms.Compose([\n", 286 | " transforms.ToTensor(), # Convert the image to a PyTorch tensor\n", 287 | "])\n", 288 | "\n", 289 | "# Load the training dataset\n", 290 | "train_dataset = datasets.FashionMNIST(\n", 291 | " root='./data', # Directory to save the dataset\n", 292 | " train=True, # Load the training set\n", 293 | " 
download=True, # Download the dataset if it doesn't exist\n", 294 | " transform=transform # Apply the defined transformations\n", 295 | ");\n", 296 | "\n", 297 | "# Load the test dataset\n", 298 | "test_dataset = datasets.FashionMNIST(\n", 299 | " root='./data',\n", 300 | " train=False, # Load the test set\n", 301 | " download=True,\n", 302 | " transform=transform\n", 303 | ")\n", 304 | "\n", 305 | "train_loader = DataLoader(train_dataset, 100)\n", 306 | "test_loader = DataLoader(test_dataset, 100)" 307 | ] 308 | }, 309 | { 310 | "cell_type": "code", 311 | "execution_count": null, 312 | "id": "67c819d8", 313 | "metadata": {}, 314 | "outputs": [], 315 | "source": [ 316 | "import matplotlib.pyplot as plt\n", 317 | "import numpy as np\n", 318 | "\n", 319 | "image, label = train_dataset[10]\n", 320 | "image = np.array(image).squeeze()\n", 321 | "print(\"Min : \", np.min(image), \" /// Max : \", np.max(image))\n", 322 | "print(image.dtype)\n", 323 | "# plot the sample\n", 324 | "\n", 325 | "fig = plt.figure\n", 326 | "plt.imshow(image, cmap='gray')\n", 327 | "plt.show()" 328 | ] 329 | }, 330 | { 331 | "cell_type": "code", 332 | "execution_count": null, 333 | "id": "236d85fe-986a-44f4-84d1-fce1d5591f80", 334 | "metadata": {}, 335 | "outputs": [], 336 | "source": [ 337 | "# loss criterion and optimizer\n", 338 | "criterion = nn.CrossEntropyLoss()\n", 339 | "optimizer = torch.optim.Adam(brevitas_model.parameters(), lr=0.001, betas=(0.9, 0.999))\n", 340 | "\n", 341 | "num_epochs = 5\n", 342 | "brevitas_model.train()\n", 343 | "\n", 344 | "for epoch in range(num_epochs):\n", 345 | " for batch_idx, (images, labels) in enumerate(train_loader):\n", 346 | " images = torch.reshape(images, (batch_size, 28*28))\n", 347 | " out = brevitas_model(images)\n", 348 | " loss = criterion(out, labels)\n", 349 | " optimizer.zero_grad()\n", 350 | " loss.backward()\n", 351 | " optimizer.step()\n", 352 | "\n", 353 | " print(f'Epoch [{epoch+1}/5], Loss: {loss.item():.4f}')" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "execution_count": null, 359 | "id": "f4d239cf-edb2-4faa-a502-d05861a156fb", 360 | "metadata": {}, 361 | "outputs": [], 362 | "source": [ 363 | "# test the model\n", 364 | "\n", 365 | "brevitas_model.eval()\n", 366 | "correct = 0\n", 367 | "total = 0\n", 368 | "loss_total = 0\n", 369 | "\n", 370 | "with torch.no_grad():\n", 371 | " for batch_idx, (images, labels) in enumerate(test_loader):\n", 372 | " images = torch.reshape(images, (batch_size, 28*28))\n", 373 | " out = brevitas_model(images)\n", 374 | " _, predicted = torch.max(out.data, 1)\n", 375 | " total += labels.size(0)\n", 376 | " correct += (predicted == labels).sum().item()\n", 377 | "\n", 378 | " accuracy = 100 * correct / total\n", 379 | " print(\"accuracy =\", accuracy, \"%\")" 380 | ] 381 | }, 382 | { 383 | "cell_type": "code", 384 | "execution_count": null, 385 | "id": "c3843919-1e83-4c50-accd-3e9292cecc2f", 386 | "metadata": {}, 387 | "outputs": [], 388 | "source": [ 389 | "#lets have a quick look at the weights too\n", 390 | "print(brevitas_model[0].quant_weight())\n", 391 | "#internally, weoght are stored as float 32, here nare ways to visualize actual quantized weights :\n", 392 | "print(brevitas_model[0].quant_weight().int())\n", 393 | "print(brevitas_model[0].quant_weight().int().dtype)" 394 | ] 395 | }, 396 | { 397 | "cell_type": "code", 398 | "execution_count": 20, 399 | "id": "8247848d-6308-4524-984c-e15e9d1e71c1", 400 | "metadata": {}, 401 | "outputs": [], 402 | "source": [ 403 | "# You can use this model wrapper to 
add some layers dempending on the data\n", 404 | "# we will also add pre/post proc in FINN later on\n", 405 | "\n", 406 | "class ModelForExport(nn.Module):\n", 407 | " def __init__(self, my_pretrained_model):\n", 408 | " super(ModelForExport, self).__init__()\n", 409 | " self.pretrained = my_pretrained_model\n", 410 | " \n", 411 | " def forward(self, x):\n", 412 | " out= self.pretrained(x)\n", 413 | " return out\n", 414 | "\n", 415 | "model_for_export = ModelForExport(brevitas_model)" 416 | ] 417 | }, 418 | { 419 | "cell_type": "code", 420 | "execution_count": null, 421 | "id": "058a27d4-d48a-41b4-834e-75253bb230ff", 422 | "metadata": {}, 423 | "outputs": [], 424 | "source": [ 425 | "# test the model\n", 426 | "\n", 427 | "model_for_export.eval()\n", 428 | "correct = 0\n", 429 | "total = 0\n", 430 | "loss_total = 0\n", 431 | "\n", 432 | "with torch.no_grad():\n", 433 | " for batch_idx, (images, labels) in enumerate(test_loader):\n", 434 | " images = torch.reshape(images, (batch_size, 28*28))\n", 435 | " out = model_for_export(images)\n", 436 | " _, predicted = torch.max(out.data, 1)\n", 437 | " total += labels.size(0)\n", 438 | " correct += (predicted == labels).sum().item()\n", 439 | " print(\"accuracy =\", accuracy)" 440 | ] 441 | }, 442 | { 443 | "cell_type": "markdown", 444 | "id": "3a1d9899-000b-4343-ac81-2c425b571519", 445 | "metadata": {}, 446 | "source": [ 447 | "# PART 3\n", 448 | "\n", 449 | "Exporting the model and visualizing it" 450 | ] 451 | }, 452 | { 453 | "cell_type": "code", 454 | "execution_count": null, 455 | "id": "11f27a3b-88c6-483e-be82-85d1ee53129b", 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [ 459 | "from brevitas.export import export_qonnx\n", 460 | "from qonnx.util.cleanup import cleanup as qonnx_cleanup\n", 461 | "from qonnx.core.modelwrapper import ModelWrapper\n", 462 | "from qonnx.core.datatype import DataType\n", 463 | "from finn.transformation.qonnx.convert_qonnx_to_finn import ConvertQONNXtoFINN\n", 464 | "\n", 465 | "filename = \"/tmp/finn_dev_rootmin/LAB_1.onnx\"\n", 466 | "filename_clean = \"/tmp/finn_dev_rootmin/LAB1_clean.onnx\"\n", 467 | "\n", 468 | "#Crete a tensor ressembling the input tensor we saw earlier\n", 469 | "input_a = np.random.rand(1,28*28).astype(np.float32)\n", 470 | "print(np.max(input_a[0]))\n", 471 | "scale = 1.0\n", 472 | "input_t = torch.from_numpy(input_a * scale)\n", 473 | "\n", 474 | "# Export to ONNX\n", 475 | "export_qonnx(\n", 476 | " model_for_export, export_path=filename, input_t=input_t\n", 477 | ")\n", 478 | "\n", 479 | "# clean-up\n", 480 | "qonnx_cleanup(filename, out_file=filename_clean)\n", 481 | "\n", 482 | "# ModelWrapper\n", 483 | "model = ModelWrapper(filename_clean)\n", 484 | "# Setting the input datatype explicitly because it doesn't get derived from the export function\n", 485 | "model = model.transform(ConvertQONNXtoFINN())\n", 486 | "model.save(\"/tmp/finn_dev_rootmin/ready_finn.onnx\")\n", 487 | "\n", 488 | "print(\"Model saved to /tmp/finn_dev_rootmin/ready_finn.onnx\")" 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": null, 494 | "id": "8d6f0250-5b1c-4276-9293-c29a1b112782", 495 | "metadata": {}, 496 | "outputs": [], 497 | "source": [ 498 | "from finn.util.visualization import showInNetron\n", 499 | "\n", 500 | "showInNetron(\"/tmp/finn_dev_rootmin/ready_finn.onnx\")" 501 | ] 502 | }, 503 | { 504 | "cell_type": "code", 505 | "execution_count": null, 506 | "id": "2f6561bc", 507 | "metadata": {}, 508 | "outputs": [], 509 | "source": [] 510 | } 511 | ], 512 | 
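With the model exported and converted to FINN-ONNX, it is worth one quick cross-check before moving on to LAB 2: run the saved graph with the QONNX execution wrapper and compare it against the Brevitas model in PyTorch. This is a minimal sketch, assuming the objects from the cells above (`model_for_export` and the saved `ready_finn.onnx`) are still in memory; the input tensor name is read from the graph rather than hard-coded, since it is not guaranteed to be `global_in`:

```python
# Cross-check: execute the exported FINN-ONNX graph and the PyTorch model on
# the same random input and compare outputs. Small numerical differences can
# appear, so we print the gap instead of asserting strict equality.
import numpy as np
import torch
import qonnx.core.onnx_exec as oxe
from qonnx.core.modelwrapper import ModelWrapper

finn_model = ModelWrapper("/tmp/finn_dev_rootmin/ready_finn.onnx")
in_name = finn_model.graph.input[0].name          # often "global_in" after cleanup

x = np.random.rand(1, 28 * 28).astype(np.float32)
onnx_out = oxe.execute_onnx(finn_model, {in_name: x})
onnx_out = onnx_out[list(onnx_out.keys())[0]]

model_for_export.eval()                            # disable dropout for a fair comparison
with torch.no_grad():
    torch_out = model_for_export(torch.from_numpy(x)).numpy()

print("ONNX argmax:", onnx_out.argmax(), "| PyTorch argmax:", torch_out.argmax())
print("max abs difference:", np.abs(onnx_out - torch_out).max())
```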
"metadata": { 513 | "kernelspec": { 514 | "display_name": "Python 3 (ipykernel)", 515 | "language": "python", 516 | "name": "python3" 517 | } 518 | }, 519 | "nbformat": 4, 520 | "nbformat_minor": 5 521 | } 522 | -------------------------------------------------------------------------------- /Lab 1 : PyTorch & Brevitas/readme.md: -------------------------------------------------------------------------------- 1 | # LAB 1 : PyTorch, First model, and Quantize Aware Training 2 | 3 | As the title suggests, this lab is all about handling the PyTorch Framework and reproduce this workflow to train a similar classifier using Quantize Awara Training (QAT). 4 | 5 | *Note that 2 notbooks are present, one trains the model on FP32 values (0-1), the other one on UINT8 values (0-255), we will discuss the impact of such a choice during the lectures* 6 | 7 | ## 1 : PyTorch 8 | 9 | Quick refreseher on how to use pytorch, import and analyse data and how to manipulate a model. 10 | 11 | ## 2 : Brevitas 12 | 13 | Brevitas is a QAT framework based on PyTorch, It allows us to train a simple quantized classifier. 14 | 15 | In our example, The model is quantized on weights and activations. 16 | 17 | You will also learn how to transforms the input data as manipulating datatypes to fit the end usage is a big part of the job. 18 | 19 | ## 3 : Export for usage in LAB 2 20 | 21 | Lab 2 is based on this lab as we will use this model for conversion to HW layers. Kowing how use differents ONNX tools is essentials. 22 | 23 | ## What is next ? 24 | 25 | In lab 2, we will de some processing on the model and then use FINN, a compiler that allows us to convert a model into HW layers. (See lab2 for ressources) 26 | -------------------------------------------------------------------------------- /Lab 2 : FINN Compiler/LAB2_VERIFICATION.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# MODEL VERIFICATION\n", 8 | "\n", 9 | "When creating the model from scratch and manipulating the data in different ways to adapt it to the use case, verifying the model at each step turns out to be important.\n", 10 | "\n", 11 | "During LAB2 notebook, we already verified the FINN-ONNX model, which indicates that we did the job right on our side, but what if something goes wrong in the way we apply further transformations ?\n", 12 | "\n", 13 | "To make sure we are ready for FPGA inference, verification is a very important step to avoid hours of useless hardware dubugging.\n", 14 | "\n", 15 | "Verifications convered by this notebook :\n", 16 | "\n", 17 | "- HLS layers verification using C++\n", 18 | "- RTL output verification using PyVerilator\n", 19 | "\n", 20 | "This notebook was based on [this example](https://github.com/Xilinx/finn/blob/main/notebooks/end2end_example/bnn-pynq/tfc_end2end_verification.ipynb) from FINN tutorials.\n", 21 | "\n", 22 | "As you will see, verification will be fairly easy as FINN provides a very user-friendly API for these tools.\n", 23 | "\n", 24 | "## Workflow : manual transformations\n", 25 | "\n", 26 | "In this notebook, we will use [fpgadataflow transformations](https://finn.readthedocs.io/en/latest/source_code/finn.transformation.fpgadataflow.html) manualy. These usually are done automatically when building in FINN notebooks, depending on what output you ask for. Because we want control in these simulation examples, we wil use transformations \"manualy\"." 
27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 1, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "root_dir = \"/tmp/finn_dev_rootmin\"" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": {}, 41 | "source": [ 42 | "# C++ Simulation\n", 43 | "\n", 44 | "First, execute LAB2, we will grab the models from the common ```/tmp/finn_dev_yourusername/``` output folder" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "We first define the \"golden reference\" for comparison" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 2, 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "import numpy as np\n", 61 | "from qonnx.core.modelwrapper import ModelWrapper\n", 62 | "import qonnx.core.onnx_exec as oxe\n", 63 | "\n", 64 | "input_tensor = input_a = np.random.uniform(low=0, high=255, size=(28*28)).astype(np.uint8).astype(np.float32)\n", 65 | "input_dict = {\"global_in\": input_tensor.reshape(1,28*28)}\n", 66 | "golden_model = ModelWrapper(root_dir + \"/full_preproc.onnx\")\n", 67 | "output_dict = oxe.execute_onnx(golden_model, input_dict)\n", 68 | "golden_output = output_dict[list(output_dict.keys())[0]]\n", 69 | "\n", 70 | "print(golden_output)" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "We will generate the different source code : ```PrepareCppSim``` and executables : ```CompileCppSim```" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": 3, 83 | "metadata": {}, 84 | "outputs": [], 85 | "source": [ 86 | "from finn.transformation.fpgadataflow.prepare_cppsim import PrepareCppSim\n", 87 | "from finn.transformation.fpgadataflow.compile_cppsim import CompileCppSim\n", 88 | "from qonnx.transformation.general import GiveUniqueNodeNames\n", 89 | "from qonnx.core.modelwrapper import ModelWrapper\n", 90 | "\n", 91 | "model_cppsim = ModelWrapper(root_dir + \"/to_hw_conv.onnx\")\n", 92 | "model_cppsim = model_cppsim.transform(GiveUniqueNodeNames())\n", 93 | "model_cppsim = model_cppsim.transform(PrepareCppSim())\n", 94 | "model_cppsim = model_cppsim.transform(CompileCppSim())\n", 95 | "\n", 96 | "from finn.util.visualization import showSrc, showInNetron\n", 97 | "\n", 98 | "model_cppsim.save(root_dir + \"/cppsim.onnx\")\n", 99 | "showInNetron(root_dir + \"/cppsim.onnx\")\n" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "graph manipulation reminder : [cutomOp Docs](https://finn.readthedocs.io/en/latest/source_code/finn.custom_op.html#module-qonnx.custom_op.registry)" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 4, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [ 115 | "# Look at the generated files\n", 116 | "from qonnx.custom_op.registry import getCustomOp\n", 117 | "\n", 118 | "model = ModelWrapper(root_dir + \"/cppsim.onnx\")\n", 119 | "\n", 120 | "fc0 = model.graph.node[0]\n", 121 | "fc0w = getCustomOp(fc0)\n", 122 | "cpp_code_dir = fc0w.get_nodeattr(\"code_gen_dir_cppsim\")\n", 123 | "\n", 124 | "!ls {cpp_code_dir}" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": {}, 130 | "source": [ 131 | "## Simulation and testing" 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 5, 137 | "metadata": {}, 138 | "outputs": [], 139 | "source": [ 140 | "from finn.transformation.fpgadataflow.set_exec_mode import SetExecMode\n", 141 | "\n", 142 | "model_cppsim = 
model_cppsim.transform(SetExecMode(\"cppsim\"))\n", 143 | "model_cppsim.save(root_dir + \"/cppsim_exec.onnx\")" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": 6, 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "import numpy as np\n", 153 | "import onnx.numpy_helper as nph\n", 154 | "import qonnx.core.onnx_exec as oxe\n", 155 | "\n", 156 | "input_dict = {\"global_in\": input_tensor.reshape(1,28*28)}\n", 157 | "\n", 158 | "parent_model = ModelWrapper(root_dir + \"/df_part.onnx\")\n", 159 | "sdp_node = parent_model.graph.node[0]\n", 160 | "child_model = root_dir + \"/cppsim_exec.onnx\"\n", 161 | "getCustomOp(sdp_node).set_nodeattr(\"model\", child_model)\n", 162 | "output_dict = oxe.execute_onnx(parent_model, input_dict)\n", 163 | "output_cppsim = output_dict[list(output_dict.keys())[0]]\n", 164 | "\n", 165 | "try:\n", 166 | " print(golden_output[0], output_cppsim[0])\n", 167 | " assert np.isclose(output_cppsim[0], np.where(golden_output[0]==np.amax(golden_output[0])), atol=1e-3).all()\n", 168 | " print(\"Predictions are the same!\")\n", 169 | "except AssertionError:\n", 170 | " assert False, \"The results are not the same!\"" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "Great results are the same ! Note that this very small exmaple was done as an example and compares simple top label output. You can use the exmaple in a loop to check for hundreds of random sample et even setup a dataloader and testing loop for verification like we did like in LAB2." 178 | ] 179 | }, 180 | { 181 | "cell_type": "markdown", 182 | "metadata": {}, 183 | "source": [ 184 | "# PyVerilator RTL Verification / Emulation\n", 185 | "\n", 186 | "Once the RTL has been generated, we can also emulate it to compare with the golden result. 
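Before that, one practical note: the single-sample comparison used for cppsim above (and reused for rtlsim below) extends naturally to a loop over many random inputs. A small sketch, assuming `golden_model` and the cppsim-configured `parent_model` from the previous cells are still defined; keep the sample count modest, since every call triggers a full simulation run:

```python
# Compare predictions of the simulated dataflow model against the golden
# ONNX model over several random inputs instead of a single one.
import numpy as np
import qonnx.core.onnx_exec as oxe

n_checks = 20
matches = 0
for _ in range(n_checks):
    x = np.random.uniform(0, 255, size=(1, 28 * 28)).astype(np.uint8).astype(np.float32)
    inp = {"global_in": x}
    golden = oxe.execute_onnx(golden_model, inp)
    golden = golden[list(golden.keys())[0]]
    sim = oxe.execute_onnx(parent_model, inp)   # parent model with the cppsim child set above
    sim = sim[list(sim.keys())[0]]
    matches += int(np.squeeze(sim) == np.argmax(golden[0]))
print(f"{matches}/{n_checks} predictions match the golden model")
```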
They are multpile ways to go about this simulation, we will go for a \"node by node\" method.\n", 187 | "\n", 188 | "manual worflow comm" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": 7, 194 | "metadata": {}, 195 | "outputs": [], 196 | "source": [ 197 | "from finn.transformation.fpgadataflow.prepare_rtlsim import PrepareRTLSim\n", 198 | "from finn.transformation.fpgadataflow.prepare_ip import PrepareIP\n", 199 | "from finn.transformation.fpgadataflow.hlssynth_ip import HLSSynthIP\n", 200 | "from qonnx.core.modelwrapper import ModelWrapper\n", 201 | "from qonnx.transformation.general import GiveUniqueNodeNames\n", 202 | "\n", 203 | "\n", 204 | "test_fpga_part = \"xc7z020clg400-1\"\n", 205 | "target_clk_ns = 10" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": 8, 211 | "metadata": {}, 212 | "outputs": [], 213 | "source": [ 214 | "from finn.util.visualization import showSrc, showInNetron\n", 215 | "\n", 216 | "model_rtlsim = ModelWrapper(root_dir + \"/post_synth.onnx\")\n", 217 | "showInNetron(root_dir + \"/post_synth.onnx\")" 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": 9, 223 | "metadata": {}, 224 | "outputs": [], 225 | "source": [ 226 | "from finn.transformation.fpgadataflow.set_exec_mode import SetExecMode\n", 227 | "\n", 228 | "model_rtlsim = model_rtlsim.transform(SetExecMode(\"rtlsim\"))\n", 229 | "model_rtlsim = model_rtlsim.transform(PrepareRTLSim())\n", 230 | "model_rtlsim.save(root_dir + \"/rtlsim.onnx\")" 231 | ] 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": 10, 236 | "metadata": {}, 237 | "outputs": [], 238 | "source": [ 239 | "# declare parent model from dataflow partition parent model\n", 240 | "parent_rtsim = ModelWrapper(root_dir + \"/df_part.onnx\")\n", 241 | "# reference child rtl model to the streming dataflow node\n", 242 | "sdp_node = getCustomOp(parent_model.graph.node[0])\n", 243 | "sdp_node.set_nodeattr(\"model\", root_dir + \"/rtlsim.onnx\")\n", 244 | "\n", 245 | "# set the exec mode for when we'll use oxe runtime, just like with C++ simulation\n", 246 | "parent_rtsim = parent_rtsim.transform(SetExecMode(\"rtlsim\"))" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": null, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [ 255 | "#declare input data and run oxe runtime inference in RTL SIM mode\n", 256 | "input_dict = {\"global_in\": input_tensor.reshape(1,28*28)}\n", 257 | "output_dict = oxe.execute_onnx(parent_rtsim, input_dict)\n", 258 | "output_rtlsim = output_dict[list(output_dict.keys())[0]]\n", 259 | "print(golden_output, output_rtlsim)" 260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": null, 265 | "metadata": {}, 266 | "outputs": [], 267 | "source": [ 268 | "try:\n", 269 | " assert np.isclose(output_rtlsim, np.where(golden_output[0]==np.amax(golden_output[0])), atol=1e-3).all()\n", 270 | " print(\"Predictions are the same!\")\n", 271 | "except AssertionError:\n", 272 | " assert False, \"The results are not the same!\"" 273 | ] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "execution_count": null, 278 | "metadata": {}, 279 | "outputs": [], 280 | "source": [] 281 | } 282 | ], 283 | "metadata": { 284 | "kernelspec": { 285 | "display_name": "Python 3 (ipykernel)", 286 | "language": "python", 287 | "name": "python3" 288 | } 289 | }, 290 | "nbformat": 4, 291 | "nbformat_minor": 2 292 | } 293 | -------------------------------------------------------------------------------- /Lab 2 : FINN 
Compiler/readme.md: -------------------------------------------------------------------------------- 1 | # LAB 2 : Using FINN to convert our model to HW. 2 |  3 | FINN is an incredible tool that allows us to convert common AI tensor operations into HW layers. 4 |  5 | The tool is still under active research and development. The generated layers depend on the architecture you choose. 6 |  7 | You have to think about your architecture in a way that enables FINN to do its work; that is the whole subject of this lab. 8 |  9 | After the course, you might want to check these docs in order to build your own larger classifiers that involve deeper work on the model : 10 | - [FINN Docs](https://finn.readthedocs.io/en/latest/) 11 | - [FINN additional resources & papers](https://xilinx.github.io/finn/quickstart) 12 |  13 | But for now, you can stick to the lab and I'll try my best to explain as many of the tricky moves as possible. 14 |  15 | ## 1 - Processing the model 16 |  17 | We will apply some light processing to the model: adding labels, etc. 18 |  19 | ## 2 - Verify the model 20 |  21 | Now that the model is in FINN-ONNX format, verifying it using actual data that you might use for inference is important. 22 |  23 | We will use a wrapper provided by FINN to execute the model directly from the ONNX representation and check that our accuracy is coherent. 24 |  25 | ## 3 - Convert to HW Layers 26 |  27 | We will then perform a series of transformations on the model; the goal here is to give the network a shape that FINN will be able to work with. 28 |  29 | This step is where quantization and architecture choices matter. 30 |  31 | ## What's next ? 32 |  33 | FINN provides a pretty neat workflow for beginners that uses PYNQ : Python drivers & runtime, automated HW conversion for standard models, etc. 34 |  35 | My goals in this course were to : 36 | - Give a deeper understanding of the way things work 37 | - Implement the final product on Zynq 38 |  39 | As the Zynq board I will use is not supported, like many others, we will hack our way around it and create our own logic using the produced IP and the Vivado & Vitis workflow, with some custom RTL along the way ! 40 | -------------------------------------------------------------------------------- /Lab 3 : Porting to Zynq FPGA Manually/Vitis_programs/data_generator/main.py: -------------------------------------------------------------------------------- 1 | # This generates a header file with N flattened MNIST samples 2 | # data is 8-bit unsigned (stored as unsigned char) meant to be imported 3 | # into a Tx buffer for use in Vitis (for DMA). 4 | # we can then use the labels to check the FPGA results 5 |  6 | import torch 7 | from torchvision import datasets, transforms 8 | import random 9 |  10 | def quantize_tensor(x, num_bits=8): 11 |     qmin = 0. 12 |     qmax = 2.**num_bits - 1. 
13 | min_val, max_val = x.min(), x.max() 14 | 15 | scale = (max_val - min_val) / (qmax - qmin) 16 | initial_zero_point = qmin - min_val / scale 17 | 18 | zero_point = 0 19 | if initial_zero_point < qmin: 20 | zero_point = qmin 21 | elif initial_zero_point > qmax: 22 | zero_point = qmax 23 | else: 24 | zero_point = initial_zero_point 25 | 26 | zero_point = int(zero_point) 27 | q_x = zero_point + x / scale 28 | q_x.clamp_(qmin, qmax).round_() 29 | 30 | return q_x.byte() 31 | 32 | # Load MNIST dataset 33 | transform = transforms.Compose([ 34 | transforms.ToTensor(), 35 | transforms.Lambda(lambda x: quantize_tensor(x)) 36 | ]) 37 | 38 | mnist_dataset = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform) 39 | 40 | print(mnist_dataset[0][0]) 41 | 42 | # Select random samples 43 | num_samples = 100 44 | indices = random.sample(range(num_samples), num_samples) 45 | print(indices) 46 | samples = [mnist_dataset[i][0] for i in indices] 47 | labels = [mnist_dataset[i][1] for i in indices] 48 | print(labels) 49 | 50 | # Generate C header file 51 | with open('mnist_samples.h', 'w') as f: 52 | f.write("// This file has been auto-generated\n\n") 53 | f.write("#ifndef MNIST_SAMPLES_H\n") 54 | f.write("#define MNIST_SAMPLES_H\n\n") 55 | f.write(f"#define NUM_SAMPLES {str(num_samples)}\n") 56 | f.write("#define IMAGE_SIZE 784\n\n") 57 | 58 | f.write("const unsigned char mnist_samples[NUM_SAMPLES][IMAGE_SIZE] = {\n") 59 | 60 | for i, sample in enumerate(samples): 61 | # Denormalize, scale to 0-255, and flatten 62 | img = sample.squeeze() 63 | img_flat = img.reshape(-1).byte().tolist() 64 | 65 | f.write(" {") 66 | f.write(", ".join(map(str, img_flat))) 67 | f.write("},\n") 68 | 69 | f.write("};\n\n") 70 | 71 | f.write("const unsigned char mnist_labels[NUM_SAMPLES] = {\n") 72 | 73 | for j, label in enumerate(labels): 74 | f.write(" " + str(label) + ",\n") 75 | 76 | f.write("};\n\n") 77 | f.write("#endif // MNIST_SAMPLES_H\n") 78 | 79 | print("MNIST samples have been generated and saved in 'mnist_samples.h'") -------------------------------------------------------------------------------- /Lab 3 : Porting to Zynq FPGA Manually/Vitis_programs/main.c: -------------------------------------------------------------------------------- 1 | // This code is to be copy pasted in you vivtis application component 2 | // alongside data generated by the python generator 3 | // PLEASE INCREASE YOUR STACK AND HEAP SIZE ! 
to avoid program stall 4 | 5 | #include "xparameters.h" 6 | #include "xaxidma.h" 7 | #include 8 | #include 9 | #include "mnist_samples.h" 10 | 11 | XAxiDma AxiDma; 12 | 13 | int init_dma(XAxiDma *AxiDma) { 14 | XAxiDma_Config* CfgPtr; 15 | int status; 16 | 17 | CfgPtr = XAxiDma_LookupConfig(XPAR_AXI_DMA_0_BASEADDR); 18 | if (!CfgPtr) { 19 | xil_printf("No configuration found for %d\n", XPAR_AXI_DMA_0_BASEADDR); 20 | return XST_FAILURE; 21 | } 22 | 23 | status = XAxiDma_CfgInitialize(AxiDma, CfgPtr); 24 | if (status != XST_SUCCESS) { 25 | xil_printf("Initialization failed\n"); 26 | return XST_FAILURE; 27 | } 28 | 29 | if (XAxiDma_HasSg(AxiDma)) { 30 | xil_printf("Device configured as SG mode\n"); 31 | return XST_FAILURE; 32 | } 33 | 34 | return XST_SUCCESS; 35 | } 36 | 37 | static inline void enable_pmu_cycle_counter(void) { 38 | asm volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(1 << 31)); // Enable cycle counter 39 | asm volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(1)); // Enable all counters 40 | } 41 | 42 | static inline uint32_t read_pmu_cycle_counter(void) { 43 | uint32_t value; 44 | asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value)); 45 | return value; 46 | } 47 | 48 | int main(void) { 49 | enable_pmu_cycle_counter(); 50 | uint32_t start, end; 51 | 52 | int status = init_dma(&AxiDma); 53 | if(status != XST_SUCCESS) { 54 | xil_printf("Error while initializing the DMA\n"); 55 | return 1; 56 | } 57 | 58 | xil_printf("DMA initialized successfully\n"); 59 | 60 | volatile char TxBuffer[IMAGE_SIZE*NUM_SAMPLES] __attribute__ ((aligned (32))); 61 | volatile int RxBuffer[NUM_SAMPLES] __attribute__ ((aligned (32))); 62 | 63 | xil_printf("Memory init OKAY\n"); 64 | 65 | for(int j = 0; j < NUM_SAMPLES; j++) { 66 | for(int i = 0; i < IMAGE_SIZE; i++) { 67 | // xil_printf("I : %d /// J : %d\n", i, j); // debug purpose 68 | TxBuffer[j * IMAGE_SIZE + i] = (char)mnist_samples[j][i]; // fill with variable placeholder data 69 | } 70 | } 71 | 72 | xil_printf("Memory allocation OKAY\n"); 73 | 74 | Xil_DCacheFlushRange((UINTPTR)TxBuffer, NUM_SAMPLES * IMAGE_SIZE * sizeof(char)); 75 | Xil_DCacheFlushRange((UINTPTR)RxBuffer, NUM_SAMPLES * sizeof(char)); 76 | 77 | xil_printf("Cach flush OKAY, Strating transfers...\n"); 78 | 79 | start = read_pmu_cycle_counter(); 80 | for(int k = 0; k < NUM_SAMPLES; k++) { 81 | 82 | status = XAxiDma_SimpleTransfer(&AxiDma, (UINTPTR)&TxBuffer[k*IMAGE_SIZE], IMAGE_SIZE * sizeof(char), XAXIDMA_DMA_TO_DEVICE); 83 | //printf("%i TO_DEVICE status code\n", status); 84 | if (status != XST_SUCCESS) { 85 | xil_printf("Error: DMA transfer to device failed\n"); 86 | return XST_FAILURE; 87 | } 88 | 89 | status = XAxiDma_SimpleTransfer(&AxiDma, (UINTPTR)&RxBuffer[k], sizeof(int), XAXIDMA_DEVICE_TO_DMA); 90 | //printf("%i FROM_DEVICE status code\n", status); 91 | if (status != XST_SUCCESS) { 92 | xil_printf("Error: DMA transfer from device failed\n"); 93 | return XST_FAILURE; 94 | } 95 | 96 | while (XAxiDma_Busy(&AxiDma, XAXIDMA_DMA_TO_DEVICE) || 97 | XAxiDma_Busy(&AxiDma, XAXIDMA_DEVICE_TO_DMA)) { 98 | ; 99 | } 100 | xil_printf("#%i iteration done\n", k); 101 | } 102 | end = read_pmu_cycle_counter(); 103 | 104 | // Output classifier's results & compute the accuracy 105 | 106 | int valid = 0; 107 | int accuracy_percentage; 108 | 109 | for(int i = 0; i < NUM_SAMPLES; i++) { 110 | xil_printf("FPGA value RxBuffer[%d] = %d\n", i, RxBuffer[i]); 111 | if(RxBuffer[i] == mnist_labels[i]){ 112 | valid++; 113 | } 114 | } 115 | // Calculate accuracy as a percentage, multiplied by 100 to preserve 
precision 116 |     accuracy_percentage = (valid * 100) / NUM_SAMPLES; 117 |     xil_printf("\n\nMODEL ACCURACY = %d%%\n", accuracy_percentage); 118 |  119 |     uint32_t cycles = end - start; 120 |     double time_ms = (double)cycles / 667000.0; 121 |     printf("Execution time: %f milliseconds\n", time_ms); 122 |  123 |     return 0; 124 | } -------------------------------------------------------------------------------- /Lab 3 : Porting to Zynq FPGA Manually/final_custom_system.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0BAB1/AI_to_FPGA_course/6d94feb967484eb6ac0d1ae596c5c4d7ae79b4fa/Lab 3 : Porting to Zynq FPGA Manually/final_custom_system.png -------------------------------------------------------------------------------- /Lab 3 : Porting to Zynq FPGA Manually/final_output.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0BAB1/AI_to_FPGA_course/6d94feb967484eb6ac0d1ae596c5c4d7ae79b4fa/Lab 3 : Porting to Zynq FPGA Manually/final_output.png -------------------------------------------------------------------------------- /Lab 3 : Porting to Zynq FPGA Manually/readme.md: -------------------------------------------------------------------------------- 1 | # LAB 3 Presentation 2 |  3 | FINN provides a very neat runtime environment based on PYNQ to run FPGA inference directly from a notebook. 4 |  5 | However, you might want to run inference on other (unsupported) FPGA boards. Given that only two boards are made for PYNQ and only [a few](http://www.pynq.io/boards.html) are officially supported at the moment, we will do the FPGA inference manually. 6 | This allows for better understanding and flexibility for your future projects. 7 |  8 | To do so, we will go over various steps to make it work. 9 |  10 | - 0 => Find the FINN-exported stitched IP and integrate it into your Vivado project 11 | - 1 => Create our own "glue logic" IP to interface between the model and Xilinx's DMA IP 12 | - 2 => Run synthesis & implementation and export the hardware to Vitis 13 | - 3 => Create software to run inference using the DMA drivers 14 |  15 | ## What is in this folder ? 16 |  17 | This folder contains : 18 |  19 | - HARDWARE : The glue logic IP's SystemVerilog code (use the ```--recursive``` flag, i.e. ```git clone --recursive https://github.com/0BAB1/python_to_fpga_course.git```, to clone the sub repo if needed, or access it [here](https://github.com/0BAB1/Axi-Stream-FIFO-for-FINN)) 20 | - SOFTWARE : The code we'll run in Vitis 21 | - SOFTWARE : A data generator for C inference of MNIST data. 22 | - A LAB MANUAL 23 |  24 | # LAB 3 MANUAL 25 |  26 | ## 0 : Find the FINN-exported stitched IP and integrate it into your Vivado project 27 |  28 | During LAB 2, we used FINN to generate a "stitched IP" that conveniently comes with a Zynq project generated for us. Regardless of the workflow you chose (FINN has multiple workflows, such as a [CLI](https://finn.readthedocs.io/en/latest/command_line.html) or [custom builders](https://finn.readthedocs.io/en/latest/command_line.html), that do everything we did in the lab in an automated way), you will always get a collection of outputs including : 29 |  30 | - The different layers as IPs 31 | - A stitched IP 32 |  33 | After LAB 2, you can access the stitched IP by opening the ```/tmp/finn_dev_yourusername``` folder, where you will find a range of output products. 34 | We are going to focus on the ```vivado_zynq_xxx.../``` folder and open the .xpr using Vivado (a short sketch for locating it programmatically is given below). 
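As a convenience, the FINN build outputs can also be located from Python. This is a minimal sketch, assuming the default ```/tmp/finn_dev_<username>``` build directory used throughout these labs; the exact folder layout can vary between FINN versions, hence the permissive glob pattern:

```python
# Locate the Vivado project (.xpr) produced by the FINN stitched-IP build.
# The build folder name depends on your user name and the project folder name
# is partly randomized, so we search with glob instead of hard-coding paths.
import getpass
import glob

build_dir = f"/tmp/finn_dev_{getpass.getuser()}"
xpr_files = glob.glob(build_dir + "/vivado_zynq_*/**/*.xpr", recursive=True)

if xpr_files:
    print("Stitched-IP Vivado project:", xpr_files[0])
else:
    print("No .xpr found under", build_dir, "- check your FINN build output folder.")
```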
35 | 36 | ## 1 : Create our own "glue logic" IP to interface between the model and Xilinx's DMA IP 37 | 38 | With the output Vivado project opened, we will now proceed to delete every IP used in the block design **except for the stitched IP** : we will keep it and build our system around it. 39 | 40 | As we can see, the stitched IP is very conveniently packed with simplified stream interfaces and expects an 8-bit input for the data, just as planned ! 41 | 42 | But there is a problem : to transfer data, we will use Xilinx's DMA IP, which needs the TLAST signal to function properly. 43 | 44 | You can create a custom FIFO IP using the HDL in this repo's folder in order to assert the correct signals so the DMA functions properly. 45 | 46 | Then add this custom IP right after the FINN IP. 47 | 48 | We then add the usual DMA etc. to send data to the model via AXI Stream directly from memory. [Here is a small tutorial illustrating how to use DMA](https://www.youtube.com/watch?v=aySO9jCKj9g) 49 | 50 | The end custom system should then look like this : 51 | 52 | ![Final system image](./final_custom_system.png) 53 | 54 | ## 2 : Run synth & impl and export to Vitis 55 | 56 | This step is not really a step per se : you simply have to generate a bitstream and export the resulting hardware to Vitis so we can use the drivers to write some software. 57 | 58 | ## 3 : Create software to run inference using the DMA's drivers 59 | 60 | Once in Vitis with the platform & app components created, you can take inspiration from the code in the repo's ```Vitis_programs``` folder. 61 | 62 | You also have, in this repo, a main.py file that will generate random quantized (UINT8) data alongside the corresponding labels and put these in a header file to use in our software. 63 | 64 | Be sure that your heap and stack sizes can accommodate the number of samples you will load in memory (I use 0xf000 for both stack and heap to run this example). 65 | 66 | ## 4 : What is next ? 67 | 68 | - Open a UART tty 69 | - Run debugging mode, observe results 70 | - Compare FPGA accuracy with the Python simulations (they should be equal) 71 | - Use a System ILA to debug DMA & driver problems 72 | 73 | Do not hesitate to open an issue or contact me if you have a problem. Here is a final output example : 74 | 75 | ![final output image](final_output.png) -------------------------------------------------------------------------------- /Lecture 1 : Reminders/PTQ_quantization_for_nn.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Quantization for Neural Networks\n", 8 | "\n", 9 | "After the small asymmetric quantization example, in this notebook we will see how to quantize a Neural Network (NN).\n", 10 | "\n", 11 | "## Post Training Quantization (PTQ)\n", 12 | "\n", 13 | "PTQ involves training a regular model and then quantizing it.\n", 14 | "\n", 15 | "To do so, we will use observers to determine the alpha, beta, scale and zero-point factors, whilst simply running inference.
Just like we did in the f32 to int8 vector quantization example.\n", 16 | "\n", 17 | "This will be done using pytorch only.\n", 18 | "\n", 19 | "## Quantization Aware Training (QAT)\n", 20 | "\n", 21 | "For this, you will have to wait until the the **next lecture**, where we will use Brevitas, a superset of pytorch, to do QAT\n", 22 | "\n", 23 | "### Side note on Pytorch vs Brevitas for Quantization\n", 24 | "\n", 25 | "Note that other framework than brevitas exists for QAT but FINN (a very important tool for later in the course) was built for working with Brevitas.\n", 26 | "\n", 27 | "*Meaning* : even though we use PyTorch here as it is the easiest for PQT, we will quickly transition to brevitas for QAT. See this notebook serve as learning material to demonstrate that you can also you quantization for simpler AI use cases to save inference costs." 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 1, 33 | "metadata": {}, 34 | "outputs": [], 35 | "source": [ 36 | "import torch\n", 37 | "import torchvision.datasets as datasets\n", 38 | "import torchvision.transforms as transforms\n", 39 | "import torch.nn as nn\n", 40 | "import matplotlib.pyplot as plt\n", 41 | "_ = torch.manual_seed(0)" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 2, 47 | "metadata": {}, 48 | "outputs": [], 49 | "source": [ 50 | "# IMPORT THE DATA\n", 51 | "import torchvision\n", 52 | "import torchvision.transforms as transforms\n", 53 | "from torch.utils.data import DataLoader\n", 54 | "\n", 55 | "# Data preparation\n", 56 | "transform = transforms.Compose([\n", 57 | " transforms.ToTensor(),\n", 58 | " transforms.Normalize((0.1307,), (0.3081,))\n", 59 | "])\n", 60 | "\n", 61 | "train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)\n", 62 | "test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=transform, download=True)\n", 63 | "\n", 64 | "train_loader = DataLoader(dataset=train_dataset, batch_size=100, shuffle=True)\n", 65 | "test_loader = DataLoader(dataset=test_dataset, batch_size=100, shuffle=False)" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": 3, 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "# DEFINE THE MODEL\n", 75 | "# This example will be more elaborated in the second lecture, along side a full QAT example in Brevitas\n", 76 | "\n", 77 | "class SimpleClassifier(nn.Module):\n", 78 | " def __init__ (self):\n", 79 | " super(SimpleClassifier, self).__init__()\n", 80 | " self.model = nn.Sequential(\n", 81 | " nn.Linear(28*28, 128),\n", 82 | " nn.ReLU(),\n", 83 | " nn.Linear(128, 10)\n", 84 | " )\n", 85 | "\n", 86 | " def forward(self, x):\n", 87 | " return self.model(x)" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": 4, 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [ 96 | "# DECLARE THE MODEL AND OPTIMIZATION PARAMETERS\n", 97 | "import torch.optim as optim\n", 98 | "\n", 99 | "model = SimpleClassifier()\n", 100 | "# Define loss function and optimizer\n", 101 | "criterion = nn.CrossEntropyLoss()\n", 102 | "optimizer = optim.Adam(model.parameters(), lr=0.001)" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "metadata": {}, 109 | "outputs": [], 110 | "source": [ 111 | "# TRAIN THE MODEL\n", 112 | "for epoch in range(5):\n", 113 | " for i, (images, labels) in enumerate(train_loader):\n", 114 | " # Flatten the image\n", 115 | " images = images.reshape(-1, 28*28)\n", 116 | " \n", 
117 | " # Forward pass\n", 118 | " outputs = model(images)\n", 119 | " loss = criterion(outputs, labels)\n", 120 | " \n", 121 | " # Backward pass and optimize\n", 122 | " optimizer.zero_grad()\n", 123 | " loss.backward()\n", 124 | " optimizer.step()\n", 125 | " \n", 126 | " \n", 127 | " print(f'Epoch [{epoch+1}/5], Loss: {loss.item():.4f}')\n", 128 | "\n", 129 | "print(\"Training finished!\")" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [ 138 | "# Testing loop\n", 139 | "import torch\n", 140 | "model.eval()\n", 141 | "correct = 0\n", 142 | "with torch.no_grad():\n", 143 | " for data, target in test_loader:\n", 144 | " # Flatten the image\n", 145 | " data = data.reshape(-1, 28*28)\n", 146 | " output = model(data)\n", 147 | " pred = output.argmax(dim=1, keepdim=True)\n", 148 | " correct += pred.eq(target.view_as(pred)).sum().item()\n", 149 | "\n", 150 | "accuracy = 100. * correct / len(test_loader.dataset)\n", 151 | "print(f'Test Accuracy: {accuracy:.2f}%')\n", 152 | "\n", 153 | "old_accuracy = accuracy #save for later" 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "metadata": {}, 159 | "source": [ 160 | "## NOW LET'S ANALYSE !" 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": null, 166 | "metadata": {}, 167 | "outputs": [], 168 | "source": [ 169 | "import os\n", 170 | "\n", 171 | "# GET MODEL SIZE\n", 172 | "def get_size(model):\n", 173 | " torch.save(model.state_dict(), \"model_before_PTQ.p\")\n", 174 | " size = os.path.getsize(\"model_before_PTQ.p\")/1e3\n", 175 | " os.remove(\"model_before_PTQ.p\")\n", 176 | " return(size)\n", 177 | "\n", 178 | "old_size = get_size(model)\n", 179 | "print(\"size of the model before PTQ : \", old_size, \"KB\")" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "# POST TRAINING QUANT\n", 187 | "\n", 188 | "When we did the quantization example, we saw that we use min and max valus to compute adequate quantization.\n", 189 | "\n", 190 | "Here inputs changes all the time ! 
we have to run inference to gather data in order to determine the best parameters.\n", 191 | "\n", 192 | "To do this, we will simply use \"Obervers\"\n", 193 | "\n", 194 | "## Crerate a model with observers\n", 195 | "\n", 196 | "first we add quant and dequant [stubs](https://pytorch.org/docs/stable/generated/torch.ao.quantization.QuantStub.html) wich are observers for broad I/O quantization" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": 8, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "# DEFINE THE MODEL\n", 206 | "\n", 207 | "class SimpleQuantClassifier(nn.Module):\n", 208 | " def __init__ (self):\n", 209 | " super(SimpleQuantClassifier, self).__init__()\n", 210 | " self.quant = torch.quantization.QuantStub()\n", 211 | " self.model = nn.Sequential(\n", 212 | " nn.Linear(28*28, 128),\n", 213 | " nn.ReLU(),\n", 214 | " nn.Linear(128, 10),\n", 215 | " )\n", 216 | " self.dequant = torch.quantization.DeQuantStub()\n", 217 | "\n", 218 | " def forward(self, x):\n", 219 | " out = self.quant(x)\n", 220 | " out = self.model(out)\n", 221 | " out = self.dequant(out)\n", 222 | " return out" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": {}, 228 | "source": [ 229 | "and add observers to intermediate layers too" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [ 238 | "import torch.ao.quantization\n", 239 | "\n", 240 | "\n", 241 | "quant_model = SimpleQuantClassifier()\n", 242 | "quant_model.load_state_dict(model.state_dict()) # load pre-trained weights into the quant model\n", 243 | "quant_model.eval()\n", 244 | "\n", 245 | "quant_model.qconfig = torch.ao.quantization.default_qconfig\n", 246 | "quant_model = torch.ao.quantization.prepare(quant_model) # insert observers\n", 247 | "quant_model" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": {}, 253 | "source": [ 254 | "## Run inference on the new model\n", 255 | "\n", 256 | "This will allow observer to gather data" 257 | ] 258 | }, 259 | { 260 | "cell_type": "code", 261 | "execution_count": null, 262 | "metadata": {}, 263 | "outputs": [], 264 | "source": [ 265 | "import torch\n", 266 | "quant_model.eval()\n", 267 | "correct = 0\n", 268 | "with torch.no_grad():\n", 269 | " for data, target in test_loader:\n", 270 | " # Flatten the image\n", 271 | " data = data.reshape(-1, 28*28)\n", 272 | " output = quant_model(data)\n", 273 | " pred = output.argmax(dim=1, keepdim=True)\n", 274 | " correct += pred.eq(target.view_as(pred)).sum().item()\n", 275 | "\n", 276 | "accuracy = 100. * correct / len(test_loader.dataset)\n", 277 | "print(f'Test Accuracy: {accuracy:.2f}%')" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": {}, 283 | "source": [ 284 | "We now re check our model, we now see that observers carry data with them, good !" 285 | ] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": null, 290 | "metadata": {}, 291 | "outputs": [], 292 | "source": [ 293 | "quant_model" 294 | ] 295 | }, 296 | { 297 | "cell_type": "markdown", 298 | "metadata": {}, 299 | "source": [ 300 | "## Quantize the model\n", 301 | "\n", 302 | "We can now simply use othe pytorch API to use these data for quantization !\n", 303 | "\n", 304 | "we than visualize our weights, they are now INT8 !" 
305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": 12, 310 | "metadata": {}, 311 | "outputs": [], 312 | "source": [ 313 | "import torch.ao.quantization\n", 314 | "\n", 315 | "quant_model = torch.ao.quantization.convert(quant_model)" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": null, 321 | "metadata": {}, 322 | "outputs": [], 323 | "source": [ 324 | "print(torch.int_repr(quant_model.model[0].weight()))" 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "We can also compare quantized and dequantized weights, we can see that a small error has been introduced." 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": null, 337 | "metadata": {}, 338 | "outputs": [], 339 | "source": [ 340 | "print(model.model[0].weight) # original weights\n", 341 | "print(quant_model.model[0].weight()) # dequant weights" 342 | ] 343 | }, 344 | { 345 | "cell_type": "markdown", 346 | "metadata": {}, 347 | "source": [ 348 | "# Lets compare again !\n", 349 | "\n", 350 | "we will now analyse accuracy and size of the model. (/4 theorically)" 351 | ] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "execution_count": null, 356 | "metadata": {}, 357 | "outputs": [], 358 | "source": [ 359 | "import torch\n", 360 | "quant_model.eval()\n", 361 | "correct = 0\n", 362 | "with torch.no_grad():\n", 363 | " for data, target in test_loader:\n", 364 | " # Flatten the image\n", 365 | " data = data.reshape(-1, 28*28)\n", 366 | " output = quant_model(data)\n", 367 | " pred = output.argmax(dim=1, keepdim=True)\n", 368 | " correct += pred.eq(target.view_as(pred)).sum().item()\n", 369 | "\n", 370 | "accuracy = 100. * correct / len(test_loader.dataset)\n", 371 | "print(f'Test Accuracy (ORIGINAL): {old_accuracy:.2f}%')\n", 372 | "print(f'Test Accuracy (Quantized): {accuracy:.2f}%')" 373 | ] 374 | }, 375 | { 376 | "cell_type": "code", 377 | "execution_count": null, 378 | "metadata": {}, 379 | "outputs": [], 380 | "source": [ 381 | "print(\"size of the model before PTQ : \", old_size, \"KB\")\n", 382 | "print(\"size of the model after PTQ : \", get_size(quant_model), \"KB\")" 383 | ] 384 | } 385 | ], 386 | "metadata": { 387 | "kernelspec": { 388 | "display_name": "Python 3 (ipykernel)", 389 | "language": "python", 390 | "name": "python3" 391 | }, 392 | "language_info": { 393 | "codemirror_mode": { 394 | "name": "ipython", 395 | "version": 3 396 | }, 397 | "file_extension": ".py", 398 | "mimetype": "text/x-python", 399 | "name": "python", 400 | "nbconvert_exporter": "python", 401 | "pygments_lexer": "ipython3", 402 | "version": "3.10.12" 403 | } 404 | }, 405 | "nbformat": 4, 406 | "nbformat_minor": 2 407 | } 408 | -------------------------------------------------------------------------------- /Lecture 1 : Reminders/PyTorch.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# PyTorch\n", 8 | "\n", 9 | "Many of you might already be familiar with the basics of AI, a reminder always helps, especialy if you did not use the PyTorch framework before !" 
10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 1, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "import torch\n", 19 | "import torch.nn as nn\n", 20 | "import torch.optim as optim" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": [ 27 | "Let's create the simplest classifier for MNIST !" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 2, 33 | "metadata": {}, 34 | "outputs": [], 35 | "source": [ 36 | "class SimpleClassifier(nn.Module):\n", 37 | " def __init__ (self):\n", 38 | " super(SimpleClassifier, self).__init__()\n", 39 | " self.model = nn.Sequential(\n", 40 | " nn.Linear(28*28, 128),\n", 41 | " nn.ReLU(),\n", 42 | " nn.Linear(128, 10)\n", 43 | " )\n", 44 | "\n", 45 | " def forward(self, x):\n", 46 | " return self.model(x)" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "This very simple model can be represented like this (made the image myself, might not be the best but the spirit is here !):\n", 54 | "\n", 55 | "

*(figure: diagram of this simple fully-connected MNIST classifier)*\n", 56 | "\n", 57 | "
\n", 58 | "\n", 59 | "Now we simply declare the model, the optimizer (the way we will update weights through training) and the loss function (how we will compare the output and the expected classification)" 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": 3, 65 | "metadata": {}, 66 | "outputs": [], 67 | "source": [ 68 | "model = SimpleClassifier()\n", 69 | "# Define loss function and optimizer\n", 70 | "criterion = nn.CrossEntropyLoss()\n", 71 | "optimizer = optim.Adam(model.parameters(), lr=0.001)" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "We will now import the MNIST dataset. In real life application, you may have to make this dataset yourself, which is in most cases, one of the hardest part..." 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 4, 84 | "metadata": {}, 85 | "outputs": [ 86 | { 87 | "name": "stdout", 88 | "output_type": "stream", 89 | "text": [ 90 | "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\n", 91 | "Failed to download (trying next):\n", 92 | "HTTP Error 403: Forbidden\n", 93 | "\n", 94 | "Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz\n", 95 | "Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz\n" 96 | ] 97 | }, 98 | { 99 | "name": "stderr", 100 | "output_type": "stream", 101 | "text": [ 102 | "100%|██████████| 9912422/9912422 [00:02<00:00, 4167931.35it/s]\n" 103 | ] 104 | }, 105 | { 106 | "name": "stdout", 107 | "output_type": "stream", 108 | "text": [ 109 | "Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw\n", 110 | "\n", 111 | "Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\n", 112 | "Failed to download (trying next):\n", 113 | "HTTP Error 403: Forbidden\n", 114 | "\n", 115 | "Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz\n", 116 | "Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz\n" 117 | ] 118 | }, 119 | { 120 | "name": "stderr", 121 | "output_type": "stream", 122 | "text": [ 123 | "100%|██████████| 28881/28881 [00:00<00:00, 337261.87it/s]\n" 124 | ] 125 | }, 126 | { 127 | "name": "stdout", 128 | "output_type": "stream", 129 | "text": [ 130 | "Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw\n", 131 | "\n", 132 | "Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\n", 133 | "Failed to download (trying next):\n", 134 | "HTTP Error 403: Forbidden\n", 135 | "\n", 136 | "Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz\n", 137 | "Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz\n" 138 | ] 139 | }, 140 | { 141 | "name": "stderr", 142 | "output_type": "stream", 143 | "text": [ 144 | "100%|██████████| 1648877/1648877 [00:01<00:00, 1063683.01it/s]\n" 145 | ] 146 | }, 147 | { 148 | "name": "stdout", 149 | "output_type": "stream", 150 | "text": [ 151 | "Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw\n", 152 | "\n", 153 | "Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\n", 154 | "Failed to download (trying next):\n", 155 | "HTTP Error 403: Forbidden\n", 156 | "\n", 157 | "Downloading 
https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz\n", 158 | "Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz\n" 159 | ] 160 | }, 161 | { 162 | "name": "stderr", 163 | "output_type": "stream", 164 | "text": [ 165 | "100%|██████████| 4542/4542 [00:00<00:00, 9896378.58it/s]" 166 | ] 167 | }, 168 | { 169 | "name": "stdout", 170 | "output_type": "stream", 171 | "text": [ 172 | "Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw\n", 173 | "\n" 174 | ] 175 | }, 176 | { 177 | "name": "stderr", 178 | "output_type": "stream", 179 | "text": [ 180 | "\n" 181 | ] 182 | } 183 | ], 184 | "source": [ 185 | "import torchvision\n", 186 | "import torchvision.transforms as transforms\n", 187 | "from torch.utils.data import DataLoader\n", 188 | "\n", 189 | "# Define transformations\n", 190 | "transform = transforms.Compose([\n", 191 | " transforms.ToTensor(),\n", 192 | " transforms.Normalize((0.1307,), (0.3081,)) # MNIST mean and std\n", 193 | "])\n", 194 | "\n", 195 | "# Load MNIST dataset\n", 196 | "train_dataset = torchvision.datasets.MNIST(root='./data', \n", 197 | " train=True, \n", 198 | " transform=transform, \n", 199 | " download=True)\n", 200 | "\n", 201 | "# Create DataLoader\n", 202 | "train_loader = DataLoader(dataset=train_dataset, \n", 203 | " batch_size=100, \n", 204 | " shuffle=True)" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": {}, 210 | "source": [ 211 | "### And now train the model !\n", 212 | "\n", 213 | "We will do so by running 5 times the same loop (5 epochs) to see how the model adapts durring training.\n", 214 | "\n", 215 | "The DataLoader stucture is organized in batches extracted from the raw dataset (here we have 600 batches of 100 sample)\n", 216 | "\n", 217 | "We can also check out the sample shape (or size) and also see they are all the same!" 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": 21, 223 | "metadata": {}, 224 | "outputs": [ 225 | { 226 | "name": "stdout", 227 | "output_type": "stream", 228 | "text": [ 229 | "600\n", 230 | "torch.Size([100, 1, 28, 28])\n", 231 | "torch.Size([100, 1, 28, 28])\n", 232 | "torch.Size([100, 1, 28, 28])\n", 233 | "torch.Size([100, 1, 28, 28])\n", 234 | "torch.Size([100, 1, 28, 28])\n" 235 | ] 236 | } 237 | ], 238 | "source": [ 239 | "print(len(train_loader))\n", 240 | "# note that print(train_loader[0]) will not work as it is itrable but not indexable !\n", 241 | "i = 0\n", 242 | "for batch in train_loader:\n", 243 | " print(batch[0].shape)\n", 244 | " if i == 4 :\n", 245 | " break \n", 246 | " i+=1" 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "### We now train our model\n", 254 | "\n", 255 | "We reshpe the images, perform model inferce and compute loss/weights gradients. 
We then uypdate the weights and go to the next batch (and after that, redo an epoch, until the end)" 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": 6, 261 | "metadata": {}, 262 | "outputs": [ 263 | { 264 | "name": "stdout", 265 | "output_type": "stream", 266 | "text": [ 267 | "Epoch [1/5], Loss: 0.0293\n", 268 | "Epoch [2/5], Loss: 0.0239\n", 269 | "Epoch [3/5], Loss: 0.0365\n", 270 | "Epoch [4/5], Loss: 0.0428\n", 271 | "Epoch [5/5], Loss: 0.0184\n", 272 | "Training finished!\n" 273 | ] 274 | } 275 | ], 276 | "source": [ 277 | "# Assuming you have your data loaded into train_loader\n", 278 | "for epoch in range(5):\n", 279 | " for i, (images, labels) in enumerate(train_loader):\n", 280 | " # Flatten the image\n", 281 | " images = images.reshape(-1, 28*28)\n", 282 | " \n", 283 | " # Forward pass\n", 284 | " outputs = model(images)\n", 285 | " loss = criterion(outputs, labels)\n", 286 | " \n", 287 | " # Backward pass and optimize\n", 288 | " optimizer.zero_grad()\n", 289 | " loss.backward()\n", 290 | " optimizer.step()\n", 291 | " \n", 292 | " \n", 293 | " print(f'Epoch [{epoch+1}/5], Loss: {loss.item():.4f}')\n", 294 | "\n", 295 | "print(\"Training finished!\")" 296 | ] 297 | }, 298 | { 299 | "cell_type": "markdown", 300 | "metadata": {}, 301 | "source": [ 302 | "### We now use the exact same principle to test our model" 303 | ] 304 | }, 305 | { 306 | "cell_type": "code", 307 | "execution_count": 22, 308 | "metadata": {}, 309 | "outputs": [ 310 | { 311 | "name": "stdout", 312 | "output_type": "stream", 313 | "text": [ 314 | "Test Accuracy: 97.61%\n" 315 | ] 316 | } 317 | ], 318 | "source": [ 319 | "test_dataset = torchvision.datasets.MNIST(root='./data', \n", 320 | " train=False, \n", 321 | " transform=transform)\n", 322 | "test_loader = DataLoader(dataset=test_dataset, \n", 323 | " batch_size=100, \n", 324 | " shuffle=False)\n", 325 | "\n", 326 | "model.eval()\n", 327 | "\n", 328 | "correct = 0\n", 329 | "total = 0\n", 330 | "\n", 331 | "with torch.no_grad():\n", 332 | " for images, labels in test_loader:\n", 333 | " images = images.reshape(-1, 28*28)\n", 334 | " outputs = model(images)\n", 335 | " _, predicted = torch.max(outputs.data, 1)\n", 336 | " total += labels.size(0)\n", 337 | " correct += (predicted == labels).sum().item()\n", 338 | "\n", 339 | "# Calculate accuracy\n", 340 | "accuracy = 100 * correct / total\n", 341 | "print(f'Test Accuracy: {accuracy:.2f}%')" 342 | ] 343 | }, 344 | { 345 | "cell_type": "markdown", 346 | "metadata": {}, 347 | "source": [ 348 | "# Learn more...\n", 349 | "\n", 350 | "AI is the most researched topic nowdays. Ressources are everywhere, this small and specific introduction s nothing compared to how broad the field is (even tougb it always comes down to the same basic logic)." 
351 | ] 352 | } 353 | ], 354 | "metadata": { 355 | "kernelspec": { 356 | "display_name": "Python 3", 357 | "language": "python", 358 | "name": "python3" 359 | }, 360 | "language_info": { 361 | "codemirror_mode": { 362 | "name": "ipython", 363 | "version": 3 364 | }, 365 | "file_extension": ".py", 366 | "mimetype": "text/x-python", 367 | "name": "python", 368 | "nbconvert_exporter": "python", 369 | "pygments_lexer": "ipython3", 370 | "version": "3.10.12" 371 | } 372 | }, 373 | "nbformat": 4, 374 | "nbformat_minor": 2 375 | } 376 | -------------------------------------------------------------------------------- /Lecture 1 : Reminders/quantization.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Quantization\n", 8 | "\n", 9 | "This notebook serves as a basic comprehensive introductoin to quantization by manipulating np arrays" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "### Lets create a FP32 vector" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": null, 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "import numpy as np\n", 26 | "my_fp_data = np.random.rand(50).astype(np.float32)\n", 27 | "my_fp_data" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": {}, 33 | "source": [ 34 | "### We will now create a quantization function and quantize it" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": null, 40 | "metadata": {}, 41 | "outputs": [], 42 | "source": [ 43 | "def asymmetric_quantize(arr, num_bits=8):\n", 44 | " min = 0\n", 45 | " max = 2**num_bits - 1\n", 46 | " \n", 47 | " beta = np.min(arr)\n", 48 | " alpha = np.max(arr)\n", 49 | " scale = (alpha - beta) / max\n", 50 | " zero_point = np.clip((-beta/scale),0,max).round().astype(np.int8)\n", 51 | "\n", 52 | " quantized_arr = np.clip(np.round(arr / scale + zero_point), min, max).astype(np.uint8)\n", 53 | " \n", 54 | " return quantized_arr, scale, zero_point\n", 55 | "\n", 56 | "def asymmetric_dequantize(quantized_arr, scale, zero_point):\n", 57 | " return (quantized_arr.astype(np.float32) - zero_point) * scale" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "metadata": {}, 64 | "outputs": [], 65 | "source": [ 66 | "my_int_data, scale, zero = asymmetric_quantize(my_fp_data)\n", 67 | "my_int_data" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "### Now let's recover our inital data !" 
75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": {}, 81 | "outputs": [], 82 | "source": [ 83 | "dequant = asymmetric_dequantize(my_int_data, scale, zero)\n", 84 | "dequant" 85 | ] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "metadata": {}, 90 | "source": [ 91 | "# We can see the values are close but not the same...\n", 92 | "\n", 93 | "This error is induced by the lowering of bits resolutions.\n", 94 | "It may look bad but it's not !\n", 95 | "It allows for :\n", 96 | "- Lighter models (int8 << fp32 in size)\n", 97 | "- Faster operations within the model\n", 98 | "\n", 99 | "Of course this error has an effect, let's see those effects on NN inference\n", 100 | "and how to quantize a NN\n", 101 | "\n", 102 | "## Exercice to do : Implement symetric quantization" 103 | ] 104 | }, 105 | { 106 | "cell_type": "markdown", 107 | "metadata": {}, 108 | "source": [ 109 | "# To go further ...\n", 110 | "\n", 111 | "If quantization sparked your curiosity, I recomend you watch this 1hour video that will go over in details what quantization is for NN : [Video](https://www.youtube.com/watch?v=0VdNflU08yA)" 112 | ] 113 | } 114 | ], 115 | "metadata": { 116 | "kernelspec": { 117 | "display_name": "Python 3", 118 | "language": "python", 119 | "name": "python3" 120 | }, 121 | "language_info": { 122 | "codemirror_mode": { 123 | "name": "ipython", 124 | "version": 3 125 | }, 126 | "file_extension": ".py", 127 | "mimetype": "text/x-python", 128 | "name": "python", 129 | "nbconvert_exporter": "python", 130 | "pygments_lexer": "ipython3", 131 | "version": "3.10.12" 132 | } 133 | }, 134 | "nbformat": 4, 135 | "nbformat_minor": 2 136 | } 137 | -------------------------------------------------------------------------------- /Lecture 2 : Project concepts/Brevitas.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Brevitas\n", 8 | "\n", 9 | "Brevitas alows us to train quantized NN. This tool is very useful and future tools used in the course are based on this.\n", 10 | "\n", 11 | "This notebook serves as an introducion to quantizing NNs through the crration (and training) on a fully quantized MNIST classifier.\n", 12 | "\n", 13 | "I opteed for a more robust architecture this time to avoid low precision." 
14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": 1, 19 | "metadata": {}, 20 | "outputs": [], 21 | "source": [ 22 | "from torch.nn import Module, Flatten\n", 23 | "import torch.nn.functional as F\n", 24 | "\n", 25 | "import brevitas.nn as qnn\n", 26 | "from brevitas.quant import Int8Bias" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "### The model itself.\n", 34 | "\n", 35 | "- As it is a fully quantized model, we introduduce a quntidentity to quantize the input (4 bit activation)\n", 36 | "- All the data passing through this network will be quantized until the output ass all operation are int" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 2, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "class QuantWeightActBiasLeNet(Module):\n", 46 | " def __init__(self):\n", 47 | " super(QuantWeightActBiasLeNet, self).__init__()\n", 48 | " self.quant_inp = qnn.QuantIdentity(bit_width=4, return_quant_tensor=True)\n", 49 | " self.fc1 = qnn.QuantLinear(28*28, 128, bias=True, weight_bit_width=4, bias_quant=Int8Bias)\n", 50 | " self.relu = qnn.QuantReLU(bit_width=4, return_quant_tensor=True)\n", 51 | " self.fc2 = qnn.QuantLinear(128, 10, bias=True, weight_bit_width=4, bias_quant=Int8Bias)\n", 52 | "\n", 53 | "\n", 54 | " def forward(self, x):\n", 55 | " out = self.quant_inp(x)\n", 56 | " out = out.reshape(out.shape[0], -1)\n", 57 | " out = self.relu(self.fc1(out))\n", 58 | " out = self.fc2(out)\n", 59 | " return out\n", 60 | "\n", 61 | "quant_weight_act_bias_lenet = QuantWeightActBiasLeNet()\n" 62 | ] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": {}, 67 | "source": [ 68 | "### Some inspections\n", 69 | "\n", 70 | "Lets play around with the layers, see what they have that's so special !" 
71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": 4, 76 | "metadata": {}, 77 | "outputs": [ 78 | { 79 | "data": { 80 | "text/plain": [ 81 | "QuantWeightActBiasLeNet(\n", 82 | " (quant_inp): QuantIdentity(\n", 83 | " (input_quant): ActQuantProxyFromInjector(\n", 84 | " (_zero_hw_sentinel): StatelessBuffer()\n", 85 | " )\n", 86 | " (act_quant): ActQuantProxyFromInjector(\n", 87 | " (_zero_hw_sentinel): StatelessBuffer()\n", 88 | " (fused_activation_quant_proxy): FusedActivationQuantProxy(\n", 89 | " (activation_impl): Identity()\n", 90 | " (tensor_quant): RescalingIntQuant(\n", 91 | " (int_quant): IntQuant(\n", 92 | " (float_to_int_impl): RoundSte()\n", 93 | " (tensor_clamp_impl): TensorClamp()\n", 94 | " (delay_wrapper): DelayWrapper(\n", 95 | " (delay_impl): _NoDelay()\n", 96 | " )\n", 97 | " )\n", 98 | " (scaling_impl): ParameterFromRuntimeStatsScaling(\n", 99 | " (stats_input_view_shape_impl): OverTensorView()\n", 100 | " (stats): _Stats(\n", 101 | " (stats_impl): AbsPercentile()\n", 102 | " )\n", 103 | " (restrict_scaling): _RestrictValue(\n", 104 | " (restrict_value_impl): FloatRestrictValue()\n", 105 | " )\n", 106 | " (clamp_scaling): _ClampValue(\n", 107 | " (clamp_min_ste): ScalarClampMinSte()\n", 108 | " )\n", 109 | " (restrict_inplace_preprocess): Identity()\n", 110 | " (restrict_preprocess): Identity()\n", 111 | " )\n", 112 | " (int_scaling_impl): IntScaling()\n", 113 | " (zero_point_impl): ZeroZeroPoint(\n", 114 | " (zero_point): StatelessBuffer()\n", 115 | " )\n", 116 | " (msb_clamp_bit_width_impl): BitWidthConst(\n", 117 | " (bit_width): StatelessBuffer()\n", 118 | " )\n", 119 | " )\n", 120 | " )\n", 121 | " )\n", 122 | " )\n", 123 | " (fc1): QuantLinear(\n", 124 | " in_features=784, out_features=128, bias=True\n", 125 | " (input_quant): ActQuantProxyFromInjector(\n", 126 | " (_zero_hw_sentinel): StatelessBuffer()\n", 127 | " )\n", 128 | " (output_quant): ActQuantProxyFromInjector(\n", 129 | " (_zero_hw_sentinel): StatelessBuffer()\n", 130 | " )\n", 131 | " (weight_quant): WeightQuantProxyFromInjector(\n", 132 | " (_zero_hw_sentinel): StatelessBuffer()\n", 133 | " (tensor_quant): RescalingIntQuant(\n", 134 | " (int_quant): IntQuant(\n", 135 | " (float_to_int_impl): RoundSte()\n", 136 | " (tensor_clamp_impl): TensorClampSte()\n", 137 | " (delay_wrapper): DelayWrapper(\n", 138 | " (delay_impl): _NoDelay()\n", 139 | " )\n", 140 | " )\n", 141 | " (scaling_impl): StatsFromParameterScaling(\n", 142 | " (parameter_list_stats): _ParameterListStats(\n", 143 | " (first_tracked_param): _ViewParameterWrapper(\n", 144 | " (view_shape_impl): OverTensorView()\n", 145 | " )\n", 146 | " (stats): _Stats(\n", 147 | " (stats_impl): AbsMax()\n", 148 | " )\n", 149 | " )\n", 150 | " (stats_scaling_impl): _StatsScaling(\n", 151 | " (affine_rescaling): Identity()\n", 152 | " (restrict_clamp_scaling): _RestrictClampValue(\n", 153 | " (clamp_min_ste): ScalarClampMinSte()\n", 154 | " (restrict_value_impl): FloatRestrictValue()\n", 155 | " )\n", 156 | " (restrict_scaling_pre): Identity()\n", 157 | " )\n", 158 | " )\n", 159 | " (int_scaling_impl): IntScaling()\n", 160 | " (zero_point_impl): ZeroZeroPoint(\n", 161 | " (zero_point): StatelessBuffer()\n", 162 | " )\n", 163 | " (msb_clamp_bit_width_impl): BitWidthConst(\n", 164 | " (bit_width): StatelessBuffer()\n", 165 | " )\n", 166 | " )\n", 167 | " )\n", 168 | " (bias_quant): BiasQuantProxyFromInjector(\n", 169 | " (_zero_hw_sentinel): StatelessBuffer()\n", 170 | " (tensor_quant): PrescaledRestrictIntQuant(\n", 171 | " (int_quant): 
IntQuant(\n", 172 | " (float_to_int_impl): RoundSte()\n", 173 | " (tensor_clamp_impl): TensorClamp()\n", 174 | " (delay_wrapper): DelayWrapper(\n", 175 | " (delay_impl): _NoDelay()\n", 176 | " )\n", 177 | " )\n", 178 | " (msb_clamp_bit_width_impl): BitWidthConst(\n", 179 | " (bit_width): StatelessBuffer()\n", 180 | " )\n", 181 | " (zero_point): StatelessBuffer()\n", 182 | " )\n", 183 | " )\n", 184 | " )\n", 185 | " (relu): QuantReLU(\n", 186 | " (input_quant): ActQuantProxyFromInjector(\n", 187 | " (_zero_hw_sentinel): StatelessBuffer()\n", 188 | " )\n", 189 | " (act_quant): ActQuantProxyFromInjector(\n", 190 | " (_zero_hw_sentinel): StatelessBuffer()\n", 191 | " (fused_activation_quant_proxy): FusedActivationQuantProxy(\n", 192 | " (activation_impl): ReLU()\n", 193 | " (tensor_quant): RescalingIntQuant(\n", 194 | " (int_quant): IntQuant(\n", 195 | " (float_to_int_impl): RoundSte()\n", 196 | " (tensor_clamp_impl): TensorClamp()\n", 197 | " (delay_wrapper): DelayWrapper(\n", 198 | " (delay_impl): _NoDelay()\n", 199 | " )\n", 200 | " )\n", 201 | " (scaling_impl): ParameterFromRuntimeStatsScaling(\n", 202 | " (stats_input_view_shape_impl): OverTensorView()\n", 203 | " (stats): _Stats(\n", 204 | " (stats_impl): AbsPercentile()\n", 205 | " )\n", 206 | " (restrict_scaling): _RestrictValue(\n", 207 | " (restrict_value_impl): FloatRestrictValue()\n", 208 | " )\n", 209 | " (clamp_scaling): _ClampValue(\n", 210 | " (clamp_min_ste): ScalarClampMinSte()\n", 211 | " )\n", 212 | " (restrict_inplace_preprocess): Identity()\n", 213 | " (restrict_preprocess): Identity()\n", 214 | " )\n", 215 | " (int_scaling_impl): IntScaling()\n", 216 | " (zero_point_impl): ZeroZeroPoint(\n", 217 | " (zero_point): StatelessBuffer()\n", 218 | " )\n", 219 | " (msb_clamp_bit_width_impl): BitWidthConst(\n", 220 | " (bit_width): StatelessBuffer()\n", 221 | " )\n", 222 | " )\n", 223 | " )\n", 224 | " )\n", 225 | " )\n", 226 | " (fc2): QuantLinear(\n", 227 | " in_features=128, out_features=10, bias=True\n", 228 | " (input_quant): ActQuantProxyFromInjector(\n", 229 | " (_zero_hw_sentinel): StatelessBuffer()\n", 230 | " )\n", 231 | " (output_quant): ActQuantProxyFromInjector(\n", 232 | " (_zero_hw_sentinel): StatelessBuffer()\n", 233 | " )\n", 234 | " (weight_quant): WeightQuantProxyFromInjector(\n", 235 | " (_zero_hw_sentinel): StatelessBuffer()\n", 236 | " (tensor_quant): RescalingIntQuant(\n", 237 | " (int_quant): IntQuant(\n", 238 | " (float_to_int_impl): RoundSte()\n", 239 | " (tensor_clamp_impl): TensorClampSte()\n", 240 | " (delay_wrapper): DelayWrapper(\n", 241 | " (delay_impl): _NoDelay()\n", 242 | " )\n", 243 | " )\n", 244 | " (scaling_impl): StatsFromParameterScaling(\n", 245 | " (parameter_list_stats): _ParameterListStats(\n", 246 | " (first_tracked_param): _ViewParameterWrapper(\n", 247 | " (view_shape_impl): OverTensorView()\n", 248 | " )\n", 249 | " (stats): _Stats(\n", 250 | " (stats_impl): AbsMax()\n", 251 | " )\n", 252 | " )\n", 253 | " (stats_scaling_impl): _StatsScaling(\n", 254 | " (affine_rescaling): Identity()\n", 255 | " (restrict_clamp_scaling): _RestrictClampValue(\n", 256 | " (clamp_min_ste): ScalarClampMinSte()\n", 257 | " (restrict_value_impl): FloatRestrictValue()\n", 258 | " )\n", 259 | " (restrict_scaling_pre): Identity()\n", 260 | " )\n", 261 | " )\n", 262 | " (int_scaling_impl): IntScaling()\n", 263 | " (zero_point_impl): ZeroZeroPoint(\n", 264 | " (zero_point): StatelessBuffer()\n", 265 | " )\n", 266 | " (msb_clamp_bit_width_impl): BitWidthConst(\n", 267 | " (bit_width): StatelessBuffer()\n", 268 
| " )\n", 269 | " )\n", 270 | " )\n", 271 | " (bias_quant): BiasQuantProxyFromInjector(\n", 272 | " (_zero_hw_sentinel): StatelessBuffer()\n", 273 | " (tensor_quant): PrescaledRestrictIntQuant(\n", 274 | " (int_quant): IntQuant(\n", 275 | " (float_to_int_impl): RoundSte()\n", 276 | " (tensor_clamp_impl): TensorClamp()\n", 277 | " (delay_wrapper): DelayWrapper(\n", 278 | " (delay_impl): _NoDelay()\n", 279 | " )\n", 280 | " )\n", 281 | " (msb_clamp_bit_width_impl): BitWidthConst(\n", 282 | " (bit_width): StatelessBuffer()\n", 283 | " )\n", 284 | " (zero_point): StatelessBuffer()\n", 285 | " )\n", 286 | " )\n", 287 | " )\n", 288 | ")" 289 | ] 290 | }, 291 | "execution_count": 4, 292 | "metadata": {}, 293 | "output_type": "execute_result" 294 | } 295 | ], 296 | "source": [ 297 | "model = QuantWeightActBiasLeNet()\n", 298 | "model" 299 | ] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "execution_count": 5, 304 | "metadata": {}, 305 | "outputs": [ 306 | { 307 | "name": "stdout", 308 | "output_type": "stream", 309 | "text": [ 310 | "Parameter containing:\n", 311 | "tensor([[-0.0326, -0.0227, -0.0050, ..., 0.0253, -0.0092, 0.0329],\n", 312 | " [ 0.0050, -0.0345, 0.0211, ..., 0.0196, -0.0355, -0.0124],\n", 313 | " [ 0.0065, -0.0054, 0.0175, ..., -0.0345, 0.0011, -0.0011],\n", 314 | " ...,\n", 315 | " [ 0.0343, -0.0034, -0.0246, ..., 0.0229, -0.0110, -0.0022],\n", 316 | " [ 0.0330, 0.0281, 0.0260, ..., -0.0251, 0.0294, -0.0145],\n", 317 | " [ 0.0013, -0.0068, -0.0140, ..., -0.0218, 0.0356, -0.0237]],\n", 318 | " requires_grad=True)\n", 319 | "QuantTensor(value=tensor([[-0.0306, -0.0204, -0.0051, ..., 0.0255, -0.0102, 0.0306],\n", 320 | " [ 0.0051, -0.0357, 0.0204, ..., 0.0204, -0.0357, -0.0102],\n", 321 | " [ 0.0051, -0.0051, 0.0153, ..., -0.0357, 0.0000, -0.0000],\n", 322 | " ...,\n", 323 | " [ 0.0357, -0.0051, -0.0255, ..., 0.0204, -0.0102, -0.0000],\n", 324 | " [ 0.0306, 0.0306, 0.0255, ..., -0.0255, 0.0306, -0.0153],\n", 325 | " [ 0.0000, -0.0051, -0.0153, ..., -0.0204, 0.0357, -0.0255]],\n", 326 | " grad_fn=), scale=tensor(0.0051, grad_fn=), zero_point=tensor(0.), bit_width=tensor(4.), signed_t=tensor(True), training_t=tensor(True))\n", 327 | "tensor([[-6, -4, -1, ..., 5, -2, 6],\n", 328 | " [ 1, -7, 4, ..., 4, -7, -2],\n", 329 | " [ 1, -1, 3, ..., -7, 0, 0],\n", 330 | " ...,\n", 331 | " [ 7, -1, -5, ..., 4, -2, 0],\n", 332 | " [ 6, 6, 5, ..., -5, 6, -3],\n", 333 | " [ 0, -1, -3, ..., -4, 7, -5]], dtype=torch.int8)\n", 334 | "torch.int8\n" 335 | ] 336 | } 337 | ], 338 | "source": [ 339 | "print(model.fc1.weight)\n", 340 | "print(model.fc1.quant_weight())\n", 341 | "print(model.fc1.quant_weight().int())\n", 342 | "print(model.fc1.quant_weight().int().dtype)" 343 | ] 344 | }, 345 | { 346 | "cell_type": "markdown", 347 | "metadata": {}, 348 | "source": [ 349 | "### Training and testing\n", 350 | "\n", 351 | "sameprinciples as studied previously" 352 | ] 353 | }, 354 | { 355 | "cell_type": "code", 356 | "execution_count": 6, 357 | "metadata": {}, 358 | "outputs": [], 359 | "source": [ 360 | "import torchvision\n", 361 | "import torchvision.transforms as transforms\n", 362 | "from torch.utils.data import DataLoader\n", 363 | "\n", 364 | "# Data preparation\n", 365 | "transform = transforms.Compose([\n", 366 | " transforms.ToTensor(),\n", 367 | " transforms.Normalize((0.1307,), (0.3081,))\n", 368 | "])\n", 369 | "\n", 370 | "train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)\n", 371 | "test_dataset = torchvision.datasets.MNIST(root='./data', 
train=False, transform=transform, download=True)\n", 372 | "\n", 373 | "train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)\n", 374 | "test_loader = DataLoader(dataset=test_dataset, batch_size=100, shuffle=False)\n", 375 | "\n" 376 | ] 377 | }, 378 | { 379 | "cell_type": "code", 380 | "execution_count": 7, 381 | "metadata": {}, 382 | "outputs": [], 383 | "source": [ 384 | "from torch import nn\n", 385 | "import torch.optim as optim\n", 386 | "\n", 387 | "# Model, loss function, and optimizer\n", 388 | "model = QuantWeightActBiasLeNet()\n", 389 | "criterion = nn.CrossEntropyLoss()\n", 390 | "optimizer = optim.Adam(model.parameters(), lr=0.001)" 391 | ] 392 | }, 393 | { 394 | "cell_type": "code", 395 | "execution_count": 8, 396 | "metadata": {}, 397 | "outputs": [ 398 | { 399 | "name": "stderr", 400 | "output_type": "stream", 401 | "text": [ 402 | "/usr/local/lib/python3.10/dist-packages/torch/_tensor.py:1255: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ../c10/core/TensorImpl.h:1758.)\n", 403 | " return super(Tensor, self).rename(names)\n" 404 | ] 405 | }, 406 | { 407 | "name": "stdout", 408 | "output_type": "stream", 409 | "text": [ 410 | "Epoch 1, Loss: 0.0402638241648674\n", 411 | "Epoch 2, Loss: 0.15108023583889008\n", 412 | "Epoch 3, Loss: 0.05071258917450905\n", 413 | "Epoch 4, Loss: 0.02540937438607216\n", 414 | "Epoch 5, Loss: 0.11885809898376465\n" 415 | ] 416 | } 417 | ], 418 | "source": [ 419 | "# Training loop\n", 420 | "for epoch in range(5): # Train for 5 epochs\n", 421 | " model.train()\n", 422 | " for batch_idx, (data, target) in enumerate(train_loader):\n", 423 | " optimizer.zero_grad()\n", 424 | " output = model(data)\n", 425 | " loss = criterion(output, target)\n", 426 | " loss.backward()\n", 427 | " optimizer.step()\n", 428 | " \n", 429 | " print(f'Epoch {epoch+1}, Loss: {loss.item()}')\n", 430 | "\n" 431 | ] 432 | }, 433 | { 434 | "cell_type": "code", 435 | "execution_count": 9, 436 | "metadata": {}, 437 | "outputs": [ 438 | { 439 | "name": "stdout", 440 | "output_type": "stream", 441 | "text": [ 442 | "Test Accuracy: 97.52%\n" 443 | ] 444 | } 445 | ], 446 | "source": [ 447 | "# Testing loop\n", 448 | "import torch\n", 449 | "model.eval()\n", 450 | "correct = 0\n", 451 | "with torch.no_grad():\n", 452 | " for data, target in test_loader:\n", 453 | " output = model(data)\n", 454 | " pred = output.argmax(dim=1, keepdim=True)\n", 455 | " correct += pred.eq(target.view_as(pred)).sum().item()\n", 456 | "\n", 457 | "accuracy = 100. * correct / len(test_loader.dataset)\n", 458 | "print(f'Test Accuracy: {accuracy:.2f}%')\n" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": {}, 464 | "source": [ 465 | "As we can see, the accuracy dropped compared to last example, but only ~0.1%" 466 | ] 467 | }, 468 | { 469 | "cell_type": "markdown", 470 | "metadata": {}, 471 | "source": [ 472 | "# Learn more\n", 473 | "\n", 474 | "You can learn more about quantizing your model here : [Quant getting started](https://xilinx.github.io/brevitas/getting_started.html)\n", 475 | "\n", 476 | "This documentation will introduction you to weight-only quantization all the way to full quantization in a simple lighthearted way .\n", 477 | "\n", 478 | "We will also have a lot of tie during the lab where we'll take time to slow down and look at what's happenning. Stay tuned !" 
479 | ] 480 | } 481 | ], 482 | "metadata": { 483 | "kernelspec": { 484 | "display_name": "Python 3 (ipykernel)", 485 | "language": "python", 486 | "name": "python3" 487 | }, 488 | "language_info": { 489 | "codemirror_mode": { 490 | "name": "ipython", 491 | "version": 3 492 | }, 493 | "file_extension": ".py", 494 | "mimetype": "text/x-python", 495 | "name": "python", 496 | "nbconvert_exporter": "python", 497 | "pygments_lexer": "ipython3", 498 | "version": "3.10.12" 499 | } 500 | }, 501 | "nbformat": 4, 502 | "nbformat_minor": 2 503 | } 504 | -------------------------------------------------------------------------------- /Lecture 2 : Project concepts/fc_mnist_simple.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0BAB1/AI_to_FPGA_course/6d94feb967484eb6ac0d1ae596c5c4d7ae79b4fa/Lecture 2 : Project concepts/fc_mnist_simple.png -------------------------------------------------------------------------------- /Lecture 2 : Project concepts/readme.md: -------------------------------------------------------------------------------- 1 | ## Lecture 2 2 | 3 | Examples in the lecture : 4 | 5 | - ONNX examples : [Notebook access here](https://github.com/Xilinx/finn/tree/main/notebooks/basics) 6 | - Brevitas example : Local -------------------------------------------------------------------------------- /SLIDES_CC.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0BAB1/AI_to_FPGA_course/6d94feb967484eb6ac0d1ae596c5c4d7ae79b4fa/SLIDES_CC.pdf -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | # Python AI to FPGA : Course Material 2 | 3 | This repo contains all the course's material from my teaching activities (do not use it outside the course; it remains intellectual property under French law). 4 | 5 | Lectures given at : [IDEC](https://www.idec.or.kr/) & [Sungkyunkwan University](https://www.skku.edu/eng/index.do). 6 | 7 | > [!NOTE] 8 | > This course is 9 hours long and this repo is meant for teaching everything people need to know to deploy end-to-end solutions. 9 | > It was not meant to be done alone. Attending the lectures is indeed far more efficient but students can reach out to me if needed. 10 | 11 | # Lectures 12 | 13 | Examples are meant to be watched during the lectures to understand basic concepts; you can also follow along. 14 | 15 | If these concepts are not acquired / understood, I strongly encourage you to look into deeper material. *(the notebooks contain clues on where to look for such material in the "Learn More" sections)* 16 | 17 | # Labs 18 | 19 | The labs are meant to be done from scratch, and I highly recommend you do them yourself, either by following along during the labs or at home. 20 | 21 | Each Lab has its specificities so each folder contains a ```readme.md``` file to provide details & context to the student.
22 | 23 | ## Lab prerequisites : 24 | 25 | - A Linux system is preferred but not mandatory 26 | - Python and PyTorch installed on your system 27 | - A Docker environment set up for FINN and Brevitas (see below) 28 | - Xilinx tools Vivado, Vitis & Vitis HLS (2023 version) 29 | - A Zynq board for inference 30 | 31 | ## Set up your Docker environment for the lab in advance : 32 | 33 | You can set up your Docker environment by cloning [finn](https://github.com/Xilinx/finn) and running : 34 | 35 | ```bash 36 | bash run_docker.sh notebook 37 | ``` 38 | 39 | This will set up the notebook dev environment. Here is the [official tutorial](https://finn.readthedocs.io/en/latest/getting_started.html#running-finn-in-docker) to follow in order to also set up the environment variables. 40 | 41 | ```/tmp/finn_dev_username``` will be a common folder where you can examine compiled outputs. 42 | --------------------------------------------------------------------------------