├── Assignments
│   ├── Homework_1
│   │   └── Homework1.pdf
│   ├── Homework_2
│   │   └── Homework2.pdf
│   ├── Homework_3
│   │   └── Homework3.pdf
│   ├── Lab_1
│   │   └── Lab1.pdf
│   ├── Lab_2
│   │   └── Lab2.pdf
│   ├── Lab_3
│   │   └── Lab3.pdf
│   ├── Lab_4
│   │   └── Lab4.pdf
│   └── Lab_5
│       └── Lab5.pdf
├── MSDS631_DeepLearning_Syllabus.pdf
├── Notebooks
│   ├── Lecture1_Introduction.ipynb
│   ├── Lecture2_Make_DL_Work.ipynb
│   ├── Lecture3_Images_and_CNNs.ipynb
│   ├── Lecture4_Transfer_Augmentation.ipynb
│   ├── Lecture5_Text_Embeddings_Models.ipynb
│   ├── Lecture6_Sequence_Models.ipynb
│   ├── Lecture8_More_Imaging.ipynb
│   ├── Neptune_PyTorch.ipynb
│   └── Pytorch_Lightning.ipynb
├── README.md
└── Slides
    ├── Lecture1_Introduction.pdf
    ├── Lecture2_Make_DL_Work.pdf
    ├── Lecture3_Images_and_CNNs.pdf
    ├── Lecture4_Imaging_Small_Datasets.pdf
    ├── Lecture5_Text_Data.pdf
    ├── Lecture6_Sequence_Models.pdf
    ├── Lecture7_DrInterian_Attention.pdf
    ├── Lecture8_More_Imaging.pdf
    └── Lecture9_More_Topics.pdf
/Assignments/Homework_1/Homework1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Homework_1/Homework1.pdf
--------------------------------------------------------------------------------
/Assignments/Homework_2/Homework2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Homework_2/Homework2.pdf
--------------------------------------------------------------------------------
/Assignments/Homework_3/Homework3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Homework_3/Homework3.pdf
--------------------------------------------------------------------------------
/Assignments/Lab_1/Lab1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Lab_1/Lab1.pdf
--------------------------------------------------------------------------------
/Assignments/Lab_2/Lab2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Lab_2/Lab2.pdf
--------------------------------------------------------------------------------
/Assignments/Lab_3/Lab3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Lab_3/Lab3.pdf
--------------------------------------------------------------------------------
/Assignments/Lab_4/Lab4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Lab_4/Lab4.pdf
--------------------------------------------------------------------------------
/Assignments/Lab_5/Lab5.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Assignments/Lab_5/Lab5.pdf
--------------------------------------------------------------------------------
/MSDS631_DeepLearning_Syllabus.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/MSDS631_DeepLearning_Syllabus.pdf
--------------------------------------------------------------------------------
/Notebooks/Lecture1_Introduction.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "(As always a new environment specifically for this course is recommended!)"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "## Handling Data in PyTorch\n",
15 | "\n",
16 | "The torch ```tensor``` is the fundamental datatype used by PyTorch\n",
17 | "- works very similar to arrays\n",
18 | "- designed to work with GPUs\n",
19 | "- optimized for automatic differentiation"
20 | ]
21 | },
22 | {
23 | "cell_type": "code",
24 | "execution_count": null,
25 | "metadata": {},
26 | "outputs": [],
27 | "source": [
28 | "import torch\n",
29 | "import numpy as np\n",
30 | "\n",
31 | "# array\n",
32 | "X_array = np.array([[1,0],[0,1]])\n",
33 | "\n",
34 | "# tensor\n",
35 | "X_tensor = torch.tensor([[1,0],[0,1]])\n",
36 | "\n",
37 | "X_array, X_tensor"
38 | ]
39 | },
40 | {
41 | "cell_type": "code",
42 | "execution_count": null,
43 | "metadata": {},
44 | "outputs": [],
45 | "source": [
46 | "# can easily convert back and forth\n",
47 | "X_tensor.numpy(), torch.from_numpy(X_array)"
48 | ]
49 | },
50 | {
51 | "cell_type": "markdown",
52 | "metadata": {},
53 | "source": [
54 | "If you want to use a GPU, you can use one for free in Google Colab or Kaggle for a limited amount of time each week."
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": null,
60 | "metadata": {},
61 | "outputs": [],
62 | "source": [
63 | "# if a GPU is available we can send the tensor to the GPU\n",
64 | "if torch.cuda.is_available():\n",
65 | " device = torch.device(0)\n",
66 | " X_tensor_cuda = X_tensor.to(device)"
67 | ]
68 | },
69 | {
70 | "cell_type": "markdown",
71 | "metadata": {},
72 | "source": [
73 | "The ```Dataset``` is an abstract class which holds the recipe for producing your data\n",
74 | "- can do complex operations to retrieve/transform your data in parallel\n",
75 | "- You must implement the following methods:\n",
76 | " - ```__init__```\n",
77 | " - ```__len__```: length of the dataset\n",
78 | " - ```__getitem__```: recipe for retrieving the *i*-th datapoint"
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": null,
84 | "metadata": {},
85 | "outputs": [],
86 | "source": [
87 | "from torch.utils.data import Dataset\n",
88 | "\n",
89 | "# create some linear data on [0,10] according to a slope, intercept, and number of desired points\n",
90 | "def random_linear_data(m, b, n):\n",
91 | " x = 10 * np.random.rand(n)\n",
92 | " y = m * x + b + np.random.rand(n)\n",
93 | " return x, y\n",
94 | "\n",
95 | "# create a dataset class\n",
96 | "class LinearDataset(Dataset):\n",
97 | " # things I need to intialize\n",
98 | " def __init__(self, m, b, n):\n",
99 | " x, y = random_linear_data(m, b, n)\n",
100 | " self.x, self.y = torch.from_numpy(x), torch.from_numpy(y)\n",
101 | " self.n = n\n",
102 | " \n",
103 | " # length of the dataset\n",
104 | " def __len__(self):\n",
105 | " return self.n\n",
106 | " \n",
107 | " # how to get a datapoint\n",
108 | " # any transformations you want to do on-the-fly\n",
109 | " def __getitem__(self, idx):\n",
110 | " return self.x[idx], self.y[idx]\n",
111 | " \n",
112 | "linear_ds = LinearDataset(1.5, 50, 100)"
113 | ]
114 | },
115 | {
116 | "cell_type": "code",
117 | "execution_count": null,
118 | "metadata": {},
119 | "outputs": [],
120 | "source": [
121 | "# get first datapoint\n",
122 | "next(iter(linear_ds))"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": null,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "X = []\n",
132 | "Y = []\n",
133 | "# iterate through the dataset\n",
134 | "for x,y in linear_ds:\n",
135 | " X.append(x.item())\n",
136 | " Y.append(y.item())\n",
137 | "\n",
138 | "import matplotlib.pyplot as plt\n",
139 | "plt.scatter(X, Y)"
140 | ]
141 | },
142 | {
143 | "cell_type": "code",
144 | "execution_count": null,
145 | "metadata": {},
146 | "outputs": [],
147 | "source": [
148 | "# turn iris data into a Dataset\n",
149 | "import seaborn as sns\n",
150 | "import pandas as pd\n",
151 | "\n",
152 | "iris = sns.load_dataset('iris')\n",
153 | "iris = iris[iris.species != 'virginica']\n",
154 | "iris.head()"
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "metadata": {},
161 | "outputs": [],
162 | "source": [
163 | "class IrisDataset(Dataset):\n",
164 | " def __init__(self, df):\n",
165 | " self.df = df\n",
166 | " self.species_val = {'setosa':0,\n",
167 | " 'versicolor':1}\n",
168 | " \n",
169 | " def __len__(self):\n",
170 | " return len(self.df)\n",
171 | " \n",
172 | " def __getitem__(self, idx):\n",
173 | " row = self.df.iloc[idx]\n",
174 | " x = torch.tensor([row['sepal_length'],\n",
175 | " row['sepal_width']]).float()\n",
176 | " \n",
177 | " y = torch.tensor(self.species_val[row['species']]).float()\n",
178 | " \n",
179 | " return x, y\n",
180 | "iris_ds = IrisDataset(iris)\n",
181 | "next(iter(iris_ds))"
182 | ]
183 | },
184 | {
185 | "cell_type": "markdown",
186 | "metadata": {},
187 | "source": [
188 | "In general a good rule of thumb for what to do on-the-fly vs. preprocessing:\n",
189 | "- If it is random alteration (data augmentation): on-the-fly\n",
190 | "- If it is a time-consuming step that is also the same each time: preprocessing"
191 | ]
192 | },
193 | {
194 | "cell_type": "markdown",
195 | "metadata": {},
196 | "source": [
197 | "The ```Dataloader``` helps us iterate over a Dataset\n",
198 | "- can choose batch size\n",
199 | "- can shuffle\n",
200 | "- can be retrieved in parallel\n",
201 | "- automatically collates tensors"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": null,
207 | "metadata": {},
208 | "outputs": [],
209 | "source": [
210 | "from torch.utils.data import DataLoader\n",
211 | "\n",
212 | "iris_dl = DataLoader(iris_ds, batch_size=10, shuffle=True)"
213 | ]
214 | },
215 | {
216 | "cell_type": "code",
217 | "execution_count": null,
218 | "metadata": {},
219 | "outputs": [],
220 | "source": [
221 | "x, y = next(iter(iris_dl))\n",
222 | "print(x.shape, y.shape)\n",
223 | "print(x, y)"
224 | ]
225 | },
226 | {
227 | "cell_type": "markdown",
228 | "metadata": {},
229 | "source": [
230 | "## Defining a Model\n",
231 | "\n",
232 | "Let's define a simple Feed Forward neural network for the iris dataset"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": null,
238 | "metadata": {},
239 | "outputs": [],
240 | "source": [
241 | "import torch.nn as nn\n",
242 | "\n",
243 | "class TwoLayerNN(nn.Module):\n",
244 | " def __init__(self, input_dim, hidden_dim, output_dim):\n",
245 | " super(TwoLayerNN, self).__init__()\n",
246 | " # initialize the layers with random weights\n",
247 | " self.linear1 = nn.Linear(input_dim, hidden_dim)\n",
248 | " self.linear2 = nn.Linear(hidden_dim, output_dim)\n",
249 | " self.relu = nn.ReLU()\n",
250 | " \n",
251 | " def forward(self, x):\n",
252 | " # define the actual function\n",
253 | " x = self.linear1(x)\n",
254 | " x = self.relu(x)\n",
255 | " x = self.linear2(x)\n",
256 | " # don't worry about the last activation function for now\n",
257 | " return torch.squeeze(x)\n",
258 | " \n",
259 | "model = TwoLayerNN(2, 5, 1)"
260 | ]
261 | },
262 | {
263 | "cell_type": "code",
264 | "execution_count": null,
265 | "metadata": {},
266 | "outputs": [],
267 | "source": [
268 | "print(model)"
269 | ]
270 | },
271 | {
272 | "cell_type": "markdown",
273 | "metadata": {},
274 | "source": [
275 | "Note the attached gradient function below. PyTorch autograd is keeping track of the computational graph for computing partial derivatives with respect to the various parameters/weights."
276 | ]
277 | },
278 | {
279 | "cell_type": "code",
280 | "execution_count": null,
281 | "metadata": {
282 | "scrolled": true
283 | },
284 | "outputs": [],
285 | "source": [
286 | "x, y = next(iter(iris_dl))\n",
287 | "model(x), y"
288 | ]
289 | },
290 | {
291 | "cell_type": "markdown",
292 | "metadata": {},
293 | "source": [
294 | "Some very useful tools for looking at models"
295 | ]
296 | },
297 | {
298 | "cell_type": "code",
299 | "execution_count": null,
300 | "metadata": {
301 | "scrolled": true
302 | },
303 | "outputs": [],
304 | "source": [
305 | "from torchsummary import summary\n",
306 | "summary(model, input_size = (2,), device='cpu')"
307 | ]
308 | },
309 | {
310 | "cell_type": "code",
311 | "execution_count": null,
312 | "metadata": {
313 | "scrolled": true
314 | },
315 | "outputs": [],
316 | "source": [
317 | "# uh oh\n",
318 | "summary(model, input_size = (3,), device='cpu')"
319 | ]
320 | },
321 | {
322 | "cell_type": "code",
323 | "execution_count": null,
324 | "metadata": {},
325 | "outputs": [],
326 | "source": [
327 | "from torchviz import make_dot\n",
328 | "make_dot(model(x), params=dict(list(model.named_parameters())))"
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": null,
334 | "metadata": {},
335 | "outputs": [],
336 | "source": [
337 | "# if the function is straightforward we can just use Sequential\n",
338 | "#model = nn.Sequential(nn.Linear(2, 5),\n",
339 | "# nn.ReLU(),\n",
340 | "# nn.Linear(5, 1))"
341 | ]
342 | },
343 | {
344 | "cell_type": "markdown",
345 | "metadata": {},
346 | "source": [
347 | "## Train the Model\n",
348 | "We need the following ingredients\n",
349 | "- A loss function for our model\n",
350 | "- An optimization algorithm"
351 | ]
352 | },
353 | {
354 | "cell_type": "code",
355 | "execution_count": null,
356 | "metadata": {},
357 | "outputs": [],
358 | "source": [
359 | "import torch.optim as optim\n",
360 | "\n",
361 | "# feeds outputs through a sigmoid before computing BCE Loss\n",
362 | "lossFun = nn.BCEWithLogitsLoss()\n",
363 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)"
364 | ]
365 | },
366 | {
367 | "cell_type": "markdown",
368 | "metadata": {},
369 | "source": [
370 | "Below we adjust the weights of the model according to one batch"
371 | ]
372 | },
373 | {
374 | "cell_type": "code",
375 | "execution_count": null,
376 | "metadata": {},
377 | "outputs": [],
378 | "source": [
379 | "# adjust the gradients according to one batch\n",
380 | "\n",
381 | "x, y = next(iter(iris_dl))\n",
382 | "\n",
383 | "# some layers will do different things during training/prediction (i.e. dropout)\n",
384 | "model.train()\n",
385 | "\n",
386 | "# compute the predictions then loss\n",
387 | "y_pred = model(x)\n",
388 | "loss = lossFun(y_pred, y)\n",
389 | "print(loss.item())\n",
390 | "\n",
391 | "# zero out the gradients in the optimizer (otherwise they will accumulate)\n",
392 | "optimizer.zero_grad()\n",
393 | "\n",
394 | "# compute the gradients w.r.t. loss function\n",
395 | "loss.backward()\n",
396 | "\n",
397 | "# adjust weights!\n",
398 | "optimizer.step()"
399 | ]
400 | },
401 | {
402 | "cell_type": "markdown",
403 | "metadata": {},
404 | "source": [
405 | "An *epoch* is one pass through the training set"
406 | ]
407 | },
408 | {
409 | "cell_type": "code",
410 | "execution_count": null,
411 | "metadata": {
412 | "scrolled": true
413 | },
414 | "outputs": [],
415 | "source": [
416 | "# very crude training loop (you'll make a fancier one in your first lab)\n",
417 | "for epoch in range(100):\n",
418 | " for x, y in iris_dl:\n",
419 | " model.train()\n",
420 | " \n",
421 | " y_pred = model(x)\n",
422 | " loss = lossFun(y_pred, y)\n",
423 | " print(loss.item())\n",
424 | " \n",
425 | " optimizer.zero_grad()\n",
426 | " \n",
427 | " loss.backward()\n",
428 | " optimizer.step()"
429 | ]
430 | },
431 | {
432 | "cell_type": "markdown",
433 | "metadata": {},
434 | "source": [
435 | "## After Training\n",
436 | "Let's use our model to make some predictions"
437 | ]
438 | },
439 | {
440 | "cell_type": "code",
441 | "execution_count": null,
442 | "metadata": {},
443 | "outputs": [],
444 | "source": [
445 | "x,y = next(iter(iris_dl))\n",
446 | "\n",
447 | "# some layers will do different things during training/prediction (i.e. dropout)\n",
448 | "model.eval()\n",
449 | "\n",
450 | "# don't compute gradients\n",
451 | "with torch.no_grad():\n",
452 | " outputs = torch.sigmoid(model(x))\n",
453 | "\n",
454 | "y_pred = torch.zeros(10)\n",
455 | "y_pred[outputs > .5] = 1\n",
456 | "\n",
457 | "y_pred, y"
458 | ]
459 | },
460 | {
461 | "cell_type": "code",
462 | "execution_count": null,
463 | "metadata": {},
464 | "outputs": [],
465 | "source": [
466 | "# save your model parameters and optimizater checkpoint\n",
467 | "checkpoint = {'model_state_dict': model.state_dict(),\n",
468 | " 'optimizer_state_dict' :optimizer.state_dict()}\n",
469 | "torch.save(checkpoint, 'model_checkpoint.pt')"
470 | ]
471 | },
472 | {
473 | "cell_type": "code",
474 | "execution_count": null,
475 | "metadata": {},
476 | "outputs": [],
477 | "source": [
478 | "# now load them up!\n",
479 | "checkpoint = torch.load('model_checkpoint.pt')\n",
480 | "model.load_state_dict(checkpoint['model_state_dict'])\n",
481 | "optimizer.load_state_dict(checkpoint['optimizer_state_dict'])"
482 | ]
483 | },
484 | {
485 | "cell_type": "markdown",
486 | "metadata": {},
487 | "source": [
488 | "You can save other things in the checkpoint such as the loss history, epoch number, etc. if you really want to save every aspect of your progress."
489 | ]
490 | }
491 | ],
492 | "metadata": {
493 | "kernelspec": {
494 | "display_name": "Python 3",
495 | "language": "python",
496 | "name": "python3"
497 | },
498 | "language_info": {
499 | "codemirror_mode": {
500 | "name": "ipython",
501 | "version": 3
502 | },
503 | "file_extension": ".py",
504 | "mimetype": "text/x-python",
505 | "name": "python",
506 | "nbconvert_exporter": "python",
507 | "pygments_lexer": "ipython3",
508 | "version": "3.8.10"
509 | }
510 | },
511 | "nbformat": 4,
512 | "nbformat_minor": 4
513 | }
514 |
--------------------------------------------------------------------------------
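
Companion sketch for Lecture 1 (not from the repo): the notebook's "very crude" training loop prints a loss for every batch, and the first lab asks for something fancier. Below is a minimal sketch of one way to factor it, assuming model, iris_dl, lossFun, and optimizer are defined as in the notebook; the helper name run_epoch is ours, not the course's.

import torch

def run_epoch(model, dataloader, lossFun, optimizer=None):
    # train if an optimizer is passed, otherwise evaluate
    training = optimizer is not None
    if training:
        model.train()
    else:
        model.eval()

    total_loss, n_batches = 0.0, 0
    # skip gradient tracking during evaluation
    with torch.set_grad_enabled(training):
        for x, y in dataloader:
            y_pred = model(x)
            loss = lossFun(y_pred, y)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item()
            n_batches += 1
    return total_loss / n_batches

# usage: report one average loss per epoch instead of one per batch
# for epoch in range(100):
#     print(epoch, run_epoch(model, iris_dl, lossFun, optimizer))
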
/Notebooks/Lecture2_Make_DL_Work.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import numpy as np\n",
10 | "import pandas as pd\n",
11 | "\n",
12 | "# PyTorch stuff we'll need\n",
13 | "import torch\n",
14 | "from torch.utils.data import Dataset\n",
15 | "from torch.utils.data import DataLoader\n",
16 | "import torch.nn as nn\n",
17 | "import torch.optim as optim"
18 | ]
19 | },
20 | {
21 | "cell_type": "markdown",
22 | "metadata": {},
23 | "source": [
24 | "## Learning Rate Scheduler"
25 | ]
26 | },
27 | {
28 | "cell_type": "code",
29 | "execution_count": null,
30 | "metadata": {},
31 | "outputs": [],
32 | "source": [
33 | "# model and optimizer\n",
34 | "model = nn.Sequential(nn.Linear(2, 5),\n",
35 | " nn.ReLU(),\n",
36 | " nn.Linear(5, 1))\n",
37 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)"
38 | ]
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "metadata": {},
43 | "source": [
44 | "We can group the parameters of our model into different groups (will be used for transfer learning later)\n",
45 | "- here we only have one group\n",
46 | "- note the learning rate"
47 | ]
48 | },
49 | {
50 | "cell_type": "code",
51 | "execution_count": null,
52 | "metadata": {},
53 | "outputs": [],
54 | "source": [
55 | "optimizer.param_groups"
56 | ]
57 | },
58 | {
59 | "cell_type": "markdown",
60 | "metadata": {},
61 | "source": [
62 | "The learning rate scheduler lets us adjust the learning rate according to different schemes\n",
63 | "- For example the following is [Cosine Annealing](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html#torch.optim.lr_scheduler.CosineAnnealingLR) set for 100 iterations"
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "execution_count": null,
69 | "metadata": {},
70 | "outputs": [],
71 | "source": [
72 | "lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, 100)"
73 | ]
74 | },
75 | {
76 | "cell_type": "code",
77 | "execution_count": null,
78 | "metadata": {
79 | "scrolled": true
80 | },
81 | "outputs": [],
82 | "source": [
83 | "# why do you think we are getting an error here?\n",
84 | "print(optimizer.param_groups[0]['lr'])\n",
85 | "lr_scheduler.step()\n",
86 | "print(optimizer.param_groups[0]['lr'])"
87 | ]
88 | },
89 | {
90 | "cell_type": "code",
91 | "execution_count": null,
92 | "metadata": {},
93 | "outputs": [],
94 | "source": [
95 | "for i in range (10):\n",
96 | " lr_scheduler.step()\n",
97 | " print(optimizer.param_groups[0]['lr'])"
98 | ]
99 | },
100 | {
101 | "cell_type": "markdown",
102 | "metadata": {},
103 | "source": [
104 | "Try implementing and plotting the learning rate for a [One Cycle](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.OneCycleLR.html#torch.optim.lr_scheduler.OneCycleLR) learning rate"
105 | ]
106 | },
107 | {
108 | "cell_type": "code",
109 | "execution_count": null,
110 | "metadata": {},
111 | "outputs": [],
112 | "source": []
113 | },
114 | {
115 | "cell_type": "markdown",
116 | "metadata": {},
117 | "source": [
118 | "## Dropout\n",
119 | "\n",
120 | "The layer ```nn.Dropout(p)``` randomly zeros out elements on the input tensor with probability ```p```. The resulting tensor is then scaling by $\\frac{1}{1-p}$.\n",
121 | "- Keeps output same scale as during test time (no dropout)\n",
122 | "- Think about it as making it so that Dropout is adding noise with mean 0"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": null,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "drop = nn.Dropout(p=.2)\n",
132 | "x = torch.ones((100,100))\n",
133 | "print(x)\n",
134 | "y = drop(x)\n",
135 | "y"
136 | ]
137 | },
138 | {
139 | "cell_type": "code",
140 | "execution_count": null,
141 | "metadata": {},
142 | "outputs": [],
143 | "source": [
144 | "# model with Dropout\n",
145 | "class TwoLayerNN_drop(nn.Module):\n",
146 | " def __init__(self, input_dim, hidden_dim, output_dim):\n",
147 | " super(TwoLayerNN_drop, self).__init__()\n",
148 | " self.linear1 = nn.Linear(input_dim, hidden_dim)\n",
149 | " self.linear2 = nn.Linear(hidden_dim, output_dim)\n",
150 | " self.relu = nn.ReLU()\n",
151 | " self.dropout = nn.Dropout(p=.25)\n",
152 | " \n",
153 | " def forward(self, x):\n",
154 | " x = self.linear1(x)\n",
155 | " x = self.relu(x)\n",
156 | " \n",
157 | " # Add some dropout after first layer\n",
158 | " x = self.dropout(x)\n",
159 | " \n",
160 | " x = self.linear2(x)\n",
161 | " return torch.squeeze(x)"
162 | ]
163 | },
164 | {
165 | "cell_type": "markdown",
166 | "metadata": {},
167 | "source": [
168 | "## Weight Decay\n",
169 | "- Let's use [Stochastic Gradient Descent](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD) with weight decay"
170 | ]
171 | },
172 | {
173 | "cell_type": "code",
174 | "execution_count": null,
175 | "metadata": {},
176 | "outputs": [],
177 | "source": [
178 | "# blowing up weight decay so you can see it in action\n",
179 | "\n",
180 | "model = nn.Sequential(nn.Linear(2, 5),\n",
181 | " nn.ReLU(),\n",
182 | " nn.Linear(5, 1))\n",
183 | "optimizer = optim.SGD(model.parameters(), lr = 0.001, weight_decay = 1000.0)"
184 | ]
185 | },
186 | {
187 | "cell_type": "code",
188 | "execution_count": null,
189 | "metadata": {},
190 | "outputs": [],
191 | "source": [
192 | "for param in model.parameters():\n",
193 | " print(param)"
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": null,
199 | "metadata": {},
200 | "outputs": [],
201 | "source": [
202 | "model.train()\n",
203 | "y = model(torch.ones(10, 2))\n",
204 | "\n",
205 | "# train w.r.t a loss function that wants to maximize output\n",
206 | "(1/sum(y)).backward()\n",
207 | "optimizer.step()"
208 | ]
209 | },
210 | {
211 | "cell_type": "code",
212 | "execution_count": null,
213 | "metadata": {},
214 | "outputs": [],
215 | "source": [
216 | "# weights have decreased\n",
217 | "for param in model.parameters():\n",
218 | " print(param)"
219 | ]
220 | },
221 | {
222 | "cell_type": "markdown",
223 | "metadata": {},
224 | "source": [
225 | "## Batch Normalization"
226 | ]
227 | },
228 | {
229 | "cell_type": "code",
230 | "execution_count": null,
231 | "metadata": {},
232 | "outputs": [],
233 | "source": [
234 | "# model with batch normalization\n",
235 | "class TwoLayerNN_BN(nn.Module):\n",
236 | " def __init__(self, input_dim, hidden_dim, output_dim):\n",
237 | " super(TwoLayerNN_BN, self).__init__()\n",
238 | " self.linear1 = nn.Linear(input_dim, hidden_dim)\n",
239 | " self.linear2 = nn.Linear(hidden_dim, output_dim)\n",
240 | " self.relu = nn.ReLU()\n",
241 | " \n",
242 | " # we input the number of features to be normalizing across a batch\n",
243 | " self.bn = nn.BatchNorm1d(hidden_dim)\n",
244 | " \n",
245 | " def forward(self, x):\n",
246 | " x = self.linear1(x)\n",
247 | " \n",
248 | " # add batch normalization before activation\n",
249 | " x = self.bn(x)\n",
250 | " x = self.relu(x)\n",
251 | " \n",
252 | " x = self.linear2(x)\n",
253 | " # no batch norm for final output!\n",
254 | " \n",
255 | " return torch.squeeze(x)"
256 | ]
257 | },
258 | {
259 | "cell_type": "code",
260 | "execution_count": null,
261 | "metadata": {},
262 | "outputs": [],
263 | "source": [
264 | "model = TwoLayerNN_BN(2, 5, 1)\n",
265 | "bn_layer = model.bn\n",
266 | "\n",
267 | "# note that batch normalization intializes with pure mini-batch noramlization\n",
268 | "# will change during training\n",
269 | "for param in bn_layer.parameters():\n",
270 | " print(param)"
271 | ]
272 | },
273 | {
274 | "cell_type": "markdown",
275 | "metadata": {},
276 | "source": [
277 | "## Early Stopping\n",
278 | "- ideas\n",
279 | " * stop training after validation loss does not improve after so many epochs\n",
280 | " * save model parameters after each epoch if they are a new minimum validation loss"
281 | ]
282 | },
283 | {
284 | "cell_type": "code",
285 | "execution_count": null,
286 | "metadata": {},
287 | "outputs": [],
288 | "source": [
289 | "import seaborn as sns\n",
290 | "mpg = sns.load_dataset('mpg')\n",
291 | "mpg.head()"
292 | ]
293 | },
294 | {
295 | "cell_type": "code",
296 | "execution_count": null,
297 | "metadata": {},
298 | "outputs": [],
299 | "source": [
300 | "class MPGDataset(Dataset):\n",
301 | " def __init__(self, df):\n",
302 | " self.df = df\n",
303 | " \n",
304 | " def __len__(self):\n",
305 | " return len(self.df)\n",
306 | " \n",
307 | " def __getitem__(self, idx):\n",
308 | " row = self.df.iloc[idx]\n",
309 | " x = torch.tensor([row['displacement'],\n",
310 | " row['weight']]).float()\n",
311 | " \n",
312 | " y = torch.tensor(row['mpg']).float()\n",
313 | " \n",
314 | " return x, y\n",
315 | "\n",
316 | "# train/val split\n",
317 | "mpg_train = mpg[100:].reset_index(drop=True)\n",
318 | "mpg_val = mpg[:100].reset_index(drop=True)\n",
319 | "mpg_train_ds = MPGDataset(mpg_train)\n",
320 | "mpg_val_ds = MPGDataset(mpg_val)\n",
321 | "\n",
322 | "# load into dataloader\n",
323 | "mpg_train_dl = DataLoader(mpg_train_ds, batch_size=50, shuffle=True)\n",
324 | "mpg_val_dl = DataLoader(mpg_val_ds, batch_size=100, shuffle=False)"
325 | ]
326 | },
327 | {
328 | "cell_type": "code",
329 | "execution_count": null,
330 | "metadata": {},
331 | "outputs": [],
332 | "source": [
333 | "# vanilla two-layer\n",
334 | "class TwoLayerNN(nn.Module):\n",
335 | " def __init__(self, input_dim, hidden_dim, output_dim):\n",
336 | " super(TwoLayerNN, self).__init__()\n",
337 | " self.linear1 = nn.Linear(input_dim, hidden_dim)\n",
338 | " self.linear2 = nn.Linear(hidden_dim, output_dim)\n",
339 | " self.relu = nn.ReLU()\n",
340 | " \n",
341 | " def forward(self, x):\n",
342 | " x = self.linear1(x)\n",
343 | " x = self.relu(x) \n",
344 | " x = self.linear2(x)\n",
345 | " return torch.squeeze(x)"
346 | ]
347 | },
348 | {
349 | "cell_type": "code",
350 | "execution_count": null,
351 | "metadata": {},
352 | "outputs": [],
353 | "source": [
354 | "# large network to induce overfitting\n",
355 | "model = TwoLayerNN(2, 10, 1)\n",
356 | "lossFun = nn.L1Loss()\n",
357 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)"
358 | ]
359 | },
360 | {
361 | "cell_type": "code",
362 | "execution_count": null,
363 | "metadata": {
364 | "scrolled": false
365 | },
366 | "outputs": [],
367 | "source": [
368 | "from tqdm.notebook import tqdm\n",
369 | "\n",
370 | "# collect losses\n",
371 | "avg_train = []\n",
372 | "avg_val = []\n",
373 | "\n",
374 | "for epoch in tqdm(range(500)):\n",
375 | " train_losses = []\n",
376 | " for x, y in mpg_train_dl:\n",
377 | " \n",
378 | " model.train()\n",
379 | " \n",
380 | " y_pred = model(x)\n",
381 | " loss = lossFun(y_pred, y)\n",
382 | " train_losses.append(loss.item())\n",
383 | " \n",
384 | " optimizer.zero_grad()\n",
385 | " \n",
386 | " loss.backward()\n",
387 | " optimizer.step()\n",
388 | " \n",
389 | " avg_train.append(sum(train_losses) / len(train_losses))\n",
390 | " \n",
391 | " for x, y in mpg_val_dl:\n",
392 | " model.eval()\n",
393 | " \n",
394 | " y_pred = model(x)\n",
395 | " loss = lossFun(y_pred, y)\n",
396 | " \n",
397 | " avg_val.append(loss.item())"
398 | ]
399 | },
400 | {
401 | "cell_type": "code",
402 | "execution_count": null,
403 | "metadata": {},
404 | "outputs": [],
405 | "source": [
406 | "import matplotlib.pyplot as plt\n",
407 | "\n",
408 | "# plot losses\n",
409 | "plt.plot(avg_train)\n",
410 | "plt.plot(avg_val)\n",
411 | "plt.show()"
412 | ]
413 | },
414 | {
415 | "cell_type": "markdown",
416 | "metadata": {},
417 | "source": [
418 | "What about with batch normalization?"
419 | ]
420 | },
421 | {
422 | "cell_type": "code",
423 | "execution_count": null,
424 | "metadata": {},
425 | "outputs": [],
426 | "source": [
427 | "model = TwoLayerNN_BN(2, 10, 1)\n",
428 | "lossFun = nn.L1Loss()\n",
429 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)\n",
430 | "\n",
431 | "# collect losses\n",
432 | "avg_train = []\n",
433 | "avg_val = []\n",
434 | "\n",
435 | "for epoch in tqdm(range(500)):\n",
436 | " train_losses = []\n",
437 | " for x, y in mpg_train_dl:\n",
438 | " \n",
439 | " model.train()\n",
440 | " \n",
441 | " y_pred = model(x)\n",
442 | " loss = lossFun(y_pred, y)\n",
443 | " train_losses.append(loss.item())\n",
444 | " \n",
445 | " optimizer.zero_grad()\n",
446 | " \n",
447 | " loss.backward()\n",
448 | " optimizer.step()\n",
449 | " \n",
450 | " avg_train.append(sum(train_losses) / len(train_losses))\n",
451 | " \n",
452 | " for x, y in mpg_val_dl:\n",
453 | " model.eval()\n",
454 | " \n",
455 | " y_pred = model(x)\n",
456 | " loss = lossFun(y_pred, y)\n",
457 | " \n",
458 | " avg_val.append(loss.item())\n",
459 | " \n",
460 | "plt.plot(avg_train)\n",
461 | "plt.plot(avg_val)\n",
462 | "plt.show()"
463 | ]
464 | },
465 | {
466 | "cell_type": "markdown",
467 | "metadata": {},
468 | "source": [
469 | "## Categorical Embeddings\n",
470 | "- let's include the make of the car in our model"
471 | ]
472 | },
473 | {
474 | "cell_type": "code",
475 | "execution_count": null,
476 | "metadata": {},
477 | "outputs": [],
478 | "source": [
479 | "mpg.head()"
480 | ]
481 | },
482 | {
483 | "cell_type": "code",
484 | "execution_count": null,
485 | "metadata": {},
486 | "outputs": [],
487 | "source": [
488 | "makes = []\n",
489 | "for idx in range(len(mpg)):\n",
490 | " row = mpg.iloc[idx]\n",
491 | " makes.append(row['name'].split(' ')[0])\n",
492 | "mpg['make'] = makes\n",
493 | "mpg.head()"
494 | ]
495 | },
496 | {
497 | "cell_type": "code",
498 | "execution_count": null,
499 | "metadata": {},
500 | "outputs": [],
501 | "source": [
502 | "# create an index for possible values of make\n",
503 | "# sort of like a one-hot-encoding here\n",
504 | "make_dict = {make: i for i, make in enumerate(set(makes))}\n",
505 | "make_dict"
506 | ]
507 | },
508 | {
509 | "cell_type": "code",
510 | "execution_count": null,
511 | "metadata": {},
512 | "outputs": [],
513 | "source": [
514 | "class MPGDataset(Dataset):\n",
515 | " def __init__(self, df):\n",
516 | " self.df = df\n",
517 | " \n",
518 | " def __len__(self):\n",
519 | " return len(self.df)\n",
520 | " \n",
521 | " def __getitem__(self, idx):\n",
522 | " row = self.df.iloc[idx]\n",
523 | " make_idx = make_dict[row['make']]\n",
524 | " \n",
525 | " x1 = torch.tensor([row['displacement'],\n",
526 | " row['weight']]).float()\n",
527 | " \n",
528 | " x2 = torch.tensor(make_idx)\n",
529 | " \n",
530 | " y = torch.tensor(row['mpg']).float()\n",
531 | " \n",
532 | " return x1, x2, y\n",
533 | " \n",
534 | "mpg_ds = MPGDataset(mpg)\n",
535 | "\n",
536 | "# note the second tensor\n",
537 | "next(iter(mpg_ds))"
538 | ]
539 | },
540 | {
541 | "cell_type": "code",
542 | "execution_count": null,
543 | "metadata": {},
544 | "outputs": [],
545 | "source": [
546 | "# let's add an embedding layer\n",
547 | "class TwoLayerNN_Emb(nn.Module):\n",
548 | " def __init__(self, input_dim, hidden_dim, output_dim):\n",
549 | " super(TwoLayerNN_Emb, self).__init__()\n",
550 | " self.linear1 = nn.Linear(input_dim, hidden_dim)\n",
551 | " self.linear2 = nn.Linear(hidden_dim, output_dim)\n",
552 | " \n",
553 | " # first argument is number of values, next is size of embedding\n",
554 | " self.emb = nn.Embedding(len(make_dict), 2)\n",
555 | " \n",
556 | " # let's keep in batch normalization\n",
557 | " self.bn = nn.BatchNorm1d(hidden_dim)\n",
558 | " self.relu = nn.ReLU()\n",
559 | " \n",
560 | " def forward(self, x1, x2):\n",
561 | " \n",
562 | " x2 = self.emb(x2)\n",
563 | " \n",
564 | " # concatenate the vectors along dim=1, skipping batch dim\n",
565 | " x = torch.cat((x1, x2), dim=1)\n",
566 | " \n",
567 | " x = self.linear1(x)\n",
568 | " x = self.bn(x)\n",
569 | " x = self.relu(x)\n",
570 | " \n",
571 | " x = self.linear2(x)\n",
572 | " \n",
573 | " return torch.squeeze(x)"
574 | ]
575 | },
576 | {
577 | "cell_type": "code",
578 | "execution_count": null,
579 | "metadata": {},
580 | "outputs": [],
581 | "source": [
582 | "mpg_dl = DataLoader(mpg_ds, batch_size=50, shuffle=True)\n",
583 | "\n",
584 | "# what is the correct dimension here?\n",
585 | "model = TwoLayerNN_Emb(4, 5, 1)\n",
586 | "x1, x2, y = next(iter(mpg_dl))\n",
587 | "model(x1, x2)"
588 | ]
589 | },
590 | {
591 | "cell_type": "markdown",
592 | "metadata": {},
593 | "source": [
594 | "Let's train it the same way we did above"
595 | ]
596 | },
597 | {
598 | "cell_type": "code",
599 | "execution_count": null,
600 | "metadata": {},
601 | "outputs": [],
602 | "source": [
603 | "# train/val split\n",
604 | "mpg_train = mpg[100:].reset_index(drop=True)\n",
605 | "mpg_val = mpg[:100].reset_index(drop=True)\n",
606 | "mpg_train_ds = MPGDataset(mpg_train)\n",
607 | "mpg_val_ds = MPGDataset(mpg_val)\n",
608 | "\n",
609 | "# load into dataloader\n",
610 | "mpg_train_dl = DataLoader(mpg_train_ds, batch_size=50, shuffle=True)\n",
611 | "mpg_val_dl = DataLoader(mpg_val_ds, batch_size=100, shuffle=False)"
612 | ]
613 | },
614 | {
615 | "cell_type": "code",
616 | "execution_count": null,
617 | "metadata": {},
618 | "outputs": [],
619 | "source": [
620 | "lossFun = nn.L1Loss()\n",
621 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)\n",
622 | "\n",
623 | "# collect losses\n",
624 | "avg_train = []\n",
625 | "avg_val = []\n",
626 | "\n",
627 | "for epoch in tqdm(range(500)):\n",
628 | " train_losses = []\n",
629 | " for x1, x2, y in mpg_train_dl:\n",
630 | " \n",
631 | " model.train()\n",
632 | " \n",
633 | " y_pred = model(x1, x2)\n",
634 | " loss = lossFun(y_pred, y)\n",
635 | " train_losses.append(loss.item())\n",
636 | " \n",
637 | " optimizer.zero_grad()\n",
638 | " \n",
639 | " loss.backward()\n",
640 | " optimizer.step()\n",
641 | " \n",
642 | " avg_train.append(sum(train_losses) / len(train_losses))\n",
643 | " \n",
644 | " for x1, x2, y in mpg_val_dl:\n",
645 | " model.eval()\n",
646 | " \n",
647 | " y_pred = model(x1, x2)\n",
648 | " loss = lossFun(y_pred, y)\n",
649 | " \n",
650 | " avg_val.append(loss.item())"
651 | ]
652 | },
653 | {
654 | "cell_type": "code",
655 | "execution_count": null,
656 | "metadata": {},
657 | "outputs": [],
658 | "source": [
659 | "plt.plot(avg_train)\n",
660 | "plt.plot(avg_val)\n",
661 | "plt.show()"
662 | ]
663 | },
664 | {
665 | "cell_type": "code",
666 | "execution_count": null,
667 | "metadata": {},
668 | "outputs": [],
669 | "source": [
670 | "# let's look at the embedding matrix\n",
671 | "for param in model.emb.parameters():\n",
672 | " print(param)"
673 | ]
674 | },
675 | {
676 | "cell_type": "code",
677 | "execution_count": null,
678 | "metadata": {},
679 | "outputs": [],
680 | "source": [
681 | "# compare embeddings for vw and volkswagon\n",
682 | "for param in model.emb.parameters():\n",
683 | " print(param[?], param[?])"
684 | ]
685 | },
686 | {
687 | "cell_type": "code",
688 | "execution_count": null,
689 | "metadata": {},
690 | "outputs": [],
691 | "source": []
692 | }
693 | ],
694 | "metadata": {
695 | "kernelspec": {
696 | "display_name": "Python 3",
697 | "language": "python",
698 | "name": "python3"
699 | },
700 | "language_info": {
701 | "codemirror_mode": {
702 | "name": "ipython",
703 | "version": 3
704 | },
705 | "file_extension": ".py",
706 | "mimetype": "text/x-python",
707 | "name": "python",
708 | "nbconvert_exporter": "python",
709 | "pygments_lexer": "ipython3",
710 | "version": "3.8.10"
711 | }
712 | },
713 | "nbformat": 4,
714 | "nbformat_minor": 4
715 | }
716 |
--------------------------------------------------------------------------------
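
One possible answer to the One Cycle exercise left open in the notebook above (not from the repo; the throwaway nn.Linear model exists only to give the optimizer some parameters to manage):

import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# throwaway model so the optimizer has parameters
model = nn.Linear(2, 1)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# one-cycle schedule over 100 iterations: ramps up to max_lr, then anneals back down
lr_scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.01, total_steps=100)

lrs = []
for _ in range(100):
    lrs.append(optimizer.param_groups[0]['lr'])
    optimizer.step()       # step the optimizer before the scheduler
    lr_scheduler.step()

plt.plot(lrs)
plt.xlabel('iteration')
plt.ylabel('learning rate')
plt.show()
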
/Notebooks/Lecture3_Images_and_CNNs.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "id": "7e032868",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import numpy as np\n",
11 | "import pandas as pd\n",
12 | "import torch\n",
13 | "from torch.utils.data import Dataset\n",
14 | "from torch.utils.data import DataLoader\n",
15 | "import torch.nn as nn\n",
16 | "import torch.optim as optim\n",
17 | "\n",
18 | "import torchvision\n",
19 | "\n",
20 | "from tqdm.notebook import tqdm\n",
21 | "import matplotlib.pyplot as plt"
22 | ]
23 | },
24 | {
25 | "cell_type": "markdown",
26 | "id": "a1bb290b-54b8-4d2f-af18-fff71dcd33ab",
27 | "metadata": {},
28 | "source": [
29 | "Image Data from [here](https://www.kaggle.com/andrewmvd/animal-faces)\n",
30 | "- Animal Faces"
31 | ]
32 | },
33 | {
34 | "cell_type": "markdown",
35 | "id": "ea405e99-a951-4762-af0d-93a1f7b844b6",
36 | "metadata": {},
37 | "source": [
38 | "## Images"
39 | ]
40 | },
41 | {
42 | "cell_type": "code",
43 | "execution_count": null,
44 | "id": "15b39fed-5b4f-4bec-96b3-b1bca2670727",
45 | "metadata": {},
46 | "outputs": [],
47 | "source": [
48 | "# What's in this dataset?\n",
49 | "import os\n",
50 | "os.listdir('course_data/afhq')"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": null,
56 | "id": "0d8a10db-0c51-4d07-8fa1-7c198b919450",
57 | "metadata": {},
58 | "outputs": [],
59 | "source": [
60 | "# three labels\n",
61 | "os.listdir('course_data/afhq/train')"
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": null,
67 | "id": "6b39b04d-74c5-4fb3-9a25-9204b2f8afae",
68 | "metadata": {},
69 | "outputs": [],
70 | "source": [
71 | "# within each folder are the images\n",
72 | "os.listdir('course_data/afhq/train/cat')[:5]"
73 | ]
74 | },
75 | {
76 | "cell_type": "code",
77 | "execution_count": null,
78 | "id": "5e5d8a22-1b81-4ac9-b113-55e1fc471fb1",
79 | "metadata": {},
80 | "outputs": [],
81 | "source": [
82 | "# create a dataframe for our data\n",
83 | "data_path = 'course_data/afhq'\n",
84 | "\n",
85 | "rows = []\n",
86 | "for dataset in os.listdir(data_path):\n",
87 | " for label in os.listdir(data_path + f'/{dataset}'):\n",
88 | " for image in os.listdir(data_path + f'/{dataset}' + f'/{label}'):\n",
89 | " row = dict()\n",
90 | " row['image_file'] = image\n",
91 | " row['label'] = label\n",
92 | " row['dataset'] = dataset\n",
93 | " \n",
94 | " # a bit redudant, could build from other data in __getitem__ if wanted\n",
95 | " row['image_path'] = data_path + f'/{dataset}' + f'/{label}'\n",
96 | " rows.append(row)\n",
97 | " \n",
98 | "df = pd.DataFrame(rows)\n",
99 | "print(len(df))\n",
100 | "df.head()"
101 | ]
102 | },
103 | {
104 | "cell_type": "code",
105 | "execution_count": null,
106 | "id": "297a6797-0867-4da1-abd4-7a547982d252",
107 | "metadata": {},
108 | "outputs": [],
109 | "source": [
110 | "# training and validation data\n",
111 | "df_train = df[df['dataset'] == 'train'].reset_index(drop=True)\n",
112 | "df_val = df[df['dataset'] == 'val'].reset_index(drop=True)\n",
113 | "len(df_train), len(df_val)"
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "id": "bb81c9e0-5cf7-4931-a34d-cef8f088c8b1",
119 | "metadata": {},
120 | "source": [
121 | "Before creating a Dataset class, let's think about what we want as our input to the network"
122 | ]
123 | },
124 | {
125 | "cell_type": "code",
126 | "execution_count": null,
127 | "id": "3439e30c-3e2a-4f40-ab62-cc8b049f878e",
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "import cv2\n",
132 | "\n",
133 | "# pull up an image\n",
134 | "row = df.iloc[0]\n",
135 | "image_path = row['image_path']\n",
136 | "fname = row['image_file']\n",
137 | "path = image_path+'/'+fname\n",
138 | "img = cv2.imread(path)\n",
139 | "\n",
140 | "# what is an image?\n",
141 | "img"
142 | ]
143 | },
144 | {
145 | "cell_type": "code",
146 | "execution_count": null,
147 | "id": "6f057c43-0eff-4754-b45a-8ee432db0ed8",
148 | "metadata": {},
149 | "outputs": [],
150 | "source": [
151 | "# 512x512 image with 3 channels\n",
152 | "print(img.shape)\n",
153 | "\n",
154 | "# pixel intensity goes from 0 to 255\n",
155 | "print(np.max(img), np.min(img))"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": null,
161 | "id": "ad211e21-044b-44f2-a28d-12b1abe5adfa",
162 | "metadata": {},
163 | "outputs": [],
164 | "source": [
165 | "# look at the image\n",
166 | "plt.imshow(img)"
167 | ]
168 | },
169 | {
170 | "cell_type": "code",
171 | "execution_count": null,
172 | "id": "fe556d9b-77e8-4a54-9311-2526867a0eb2",
173 | "metadata": {},
174 | "outputs": [],
175 | "source": [
176 | "# why is it weird? cv2 opens in BGR instead of RGB\n",
177 | "plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))"
178 | ]
179 | },
180 | {
181 | "cell_type": "markdown",
182 | "id": "8484f820-2061-4662-8c08-1bd52f6cfa77",
183 | "metadata": {},
184 | "source": [
185 | "## Convolutional Layers\n",
186 | "- [Documentation](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) for ```Conv2d``` is a must-read"
187 | ]
188 | },
189 | {
190 | "cell_type": "code",
191 | "execution_count": null,
192 | "id": "c0620919-793e-45c8-8e74-568fb0b7bf40",
193 | "metadata": {},
194 | "outputs": [],
195 | "source": [
196 | "# number of channels of the input\n",
197 | "in_channels = 3\n",
198 | "# number of filters (hence number of output channels)\n",
199 | "out_channels = 32\n",
200 | "# filter size\n",
201 | "kernel_size = 3 # equivalent to (3,3)\n",
202 | "\n",
203 | "# define the layer\n",
204 | "conv = nn.Conv2d(in_channels, out_channels, kernel_size)\n",
205 | "\n",
206 | "# why error? (two reasons!)\n",
207 | "conv(torch.tensor(img))"
208 | ]
209 | },
210 | {
211 | "cell_type": "code",
212 | "execution_count": null,
213 | "id": "7accca5f-14c5-40ed-9648-ff57c57e05f7",
214 | "metadata": {},
215 | "outputs": [],
216 | "source": [
217 | "# let's try again\n",
218 | "img2 = img[np.newaxis, :, :, :]\n",
219 | "img2 = np.transpose(img2, (0, 3, 1, 2))\n",
220 | "img2 = torch.tensor(img2).float()\n",
221 | "\n",
222 | "output = conv(img2)\n",
223 | "\n",
224 | "# why this shape?\n",
225 | "output.shape"
226 | ]
227 | },
228 | {
229 | "cell_type": "markdown",
230 | "id": "d78240b9-0f24-44b0-8385-ca7dd810577a",
231 | "metadata": {},
232 | "source": [
233 | "Think: How can we change this so that the output has the same 2D shape?"
234 | ]
235 | },
236 | {
237 | "cell_type": "markdown",
238 | "id": "090d4787-85b5-4f59-943a-1e09ecc93f2d",
239 | "metadata": {},
240 | "source": [
241 | "## Dataset and Model"
242 | ]
243 | },
244 | {
245 | "cell_type": "code",
246 | "execution_count": null,
247 | "id": "d777d74d-9040-4dda-917c-63834ddc617a",
248 | "metadata": {},
249 | "outputs": [],
250 | "source": [
251 | "# Let's create a Dataset for our animal faces! \n",
252 | "class AnimalFacesDataset(Dataset):\n",
253 | " def __init__(self, df):\n",
254 | " self.df = df\n",
255 | " \n",
256 | " # label dictionary\n",
257 | " self.label_dict = {'cat':0, 'dog':1, 'wild':2}\n",
258 | " \n",
259 | " def __len__(self):\n",
260 | " return len(self.df)\n",
261 | " \n",
262 | " def __getitem__(self, idx):\n",
263 | " row = self.df.iloc[idx]\n",
264 | " \n",
265 | " # get ingredients for retrieving image\n",
266 | " image_path = row['image_path']\n",
267 | " fname = row['image_file']\n",
268 | " path = image_path+'/'+fname\n",
269 | " \n",
270 | " # read the img\n",
271 | " img = cv2.imread(path)\n",
272 | " \n",
273 | " # convert to RGB\n",
274 | " img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n",
275 | " \n",
276 | " # move color channels to correct spot\n",
277 | " img = np.transpose(img, (2, 0, 1))\n",
278 | " \n",
279 | " # convert to [0,1] scale\n",
280 | " img = torch.tensor(img / 255.).float()\n",
281 | " \n",
282 | " label = torch.tensor(self.label_dict[row['label']])\n",
283 | " \n",
284 | " return img, label"
285 | ]
286 | },
287 | {
288 | "cell_type": "code",
289 | "execution_count": null,
290 | "id": "faa53271-5aa5-40df-8a0e-d90b74dcb9e0",
291 | "metadata": {},
292 | "outputs": [],
293 | "source": [
294 | "ds_train = AnimalFacesDataset(df_train)\n",
295 | "dl_train = DataLoader(ds_train, batch_size = 2, shuffle=True)"
296 | ]
297 | },
298 | {
299 | "cell_type": "code",
300 | "execution_count": null,
301 | "id": "f77c6612-15fa-463e-98ce-8dd856f040cb",
302 | "metadata": {},
303 | "outputs": [],
304 | "source": [
305 | "# make sure our recipe works!\n",
306 | "# notice the time...\n",
307 | "for img, label in tqdm(dl_train):\n",
308 | " None"
309 | ]
310 | },
311 | {
312 | "cell_type": "markdown",
313 | "id": "741d8ac0-360c-4eca-81e7-bca2abe811c8",
314 | "metadata": {},
315 | "source": [
316 | "Have to sketch out dimensions while constructing!\n",
317 | "\n",
318 | "Input: (3, 512, 512)\n",
319 | "\n",
320 | "Conv1 -> (32, 512, 512)\n",
321 | "\n",
322 | "Pool -> (32, 256, 256)\n",
323 | "\n",
324 | "Conv2 -> (64, 256, 256)\n",
325 | "\n",
326 | "Pool -> (64, 128, 128)\n",
327 | "\n",
328 | "Conv3 -> (128, 128, 128)\n",
329 | "\n",
330 | "Pool -> (128, 64, 64)\n",
331 | "\n",
332 | "Conv4 -> (1, 64, 64)"
333 | ]
334 | },
335 | {
336 | "cell_type": "code",
337 | "execution_count": null,
338 | "id": "9d31eb6b-86f2-4cfc-a578-04b56b65de91",
339 | "metadata": {},
340 | "outputs": [],
341 | "source": [
342 | "class CNN(nn.Module):\n",
343 | " def __init__(self):\n",
344 | " super().__init__()\n",
345 | " \n",
346 | " # same padding!\n",
347 | " self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)\n",
348 | " self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)\n",
349 | " self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1)\n",
350 | " \n",
351 | " # doing this to shrink size enough!\n",
352 | " self.conv4 = nn.Conv2d(in_channels=128, out_channels=1, kernel_size=3, padding=1)\n",
353 | " \n",
354 | " self.linear1 = nn.Linear(4096, 100)\n",
355 | " \n",
356 | " # read documentation for CrossEntropy Loss!\n",
357 | " self.linear2 = nn.Linear(100, 3)\n",
358 | " \n",
359 | " # pooling\n",
360 | " self.pool = nn.MaxPool2d(kernel_size=2)\n",
361 | " \n",
362 | " # activation\n",
363 | " self.relu = nn.ReLU()\n",
364 | " \n",
365 | " # for unrolling into FC layer\n",
366 | " self.unroll = nn.Flatten()\n",
367 | " \n",
368 | " def forward(self, x):\n",
369 | " # helpful to do this along the way sometimes!\n",
370 | " #print(x.shape)\n",
371 | " \n",
372 | " x = self.conv1(x)\n",
373 | " x = self.relu(x)\n",
374 | " x = self.pool(x)\n",
375 | " \n",
376 | " x = self.conv2(x)\n",
377 | " x = self.relu(x)\n",
378 | " x = self.pool(x)\n",
379 | " \n",
380 | " x = self.conv3(x)\n",
381 | " x = self.relu(x)\n",
382 | " x = self.pool(x)\n",
383 | " \n",
384 | " x = self.conv4(x)\n",
385 | " x = self.relu(x)\n",
386 | " \n",
387 | " # unroll x for FC layer\n",
388 | " x = self.linear1(self.unroll(x))\n",
389 | " x = self.relu(x)\n",
390 | " x = self.linear2(x)\n",
391 | " \n",
392 | " return x\n",
393 | " \n",
394 | "model = CNN()"
395 | ]
396 | },
397 | {
398 | "cell_type": "code",
399 | "execution_count": null,
400 | "id": "379e0e49-efd7-42cc-83ff-7e3115651629",
401 | "metadata": {},
402 | "outputs": [],
403 | "source": [
404 | "model(img2)\n",
405 | "\n",
406 | "from torchsummary import summary\n",
407 | "summary(model, input_size = (3, 512, 512), device='cpu')"
408 | ]
409 | },
410 | {
411 | "cell_type": "code",
412 | "execution_count": null,
413 | "id": "7fac55bb-a11e-4877-9410-3a09d3e06c67",
414 | "metadata": {},
415 | "outputs": [],
416 | "source": []
417 | }
418 | ],
419 | "metadata": {
420 | "kernelspec": {
421 | "display_name": "Python 3",
422 | "language": "python",
423 | "name": "python3"
424 | },
425 | "language_info": {
426 | "codemirror_mode": {
427 | "name": "ipython",
428 | "version": 3
429 | },
430 | "file_extension": ".py",
431 | "mimetype": "text/x-python",
432 | "name": "python",
433 | "nbconvert_exporter": "python",
434 | "pygments_lexer": "ipython3",
435 | "version": "3.8.8"
436 | }
437 | },
438 | "nbformat": 4,
439 | "nbformat_minor": 5
440 | }
441 |
--------------------------------------------------------------------------------
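
A quick sketch answering the notebook's question about keeping the same 2D shape (not from the repo; the random input stands in for an animal-faces batch). The shape rule for Conv2d is out = (H + 2*p - k) // s + 1, so a 3x3 kernel with stride 1 needs padding 1 to preserve H:

import torch
import torch.nn as nn

x = torch.randn(1, 3, 512, 512)   # (batch, channels, height, width)

# no padding: a 3x3 kernel shrinks each spatial dimension by 2
conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)
print(conv(x).shape)              # torch.Size([1, 32, 510, 510])

# "same" padding: (512 + 2*1 - 3) // 1 + 1 = 512
conv_same = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)
print(conv_same(x).shape)         # torch.Size([1, 32, 512, 512])
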
/Notebooks/Lecture4_Transfer_Augmentation.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "id": "7e032868",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import os\n",
11 | "import cv2\n",
12 | "import time\n",
13 | "import numpy as np\n",
14 | "import pandas as pd\n",
15 | "from tqdm.notebook import tqdm\n",
16 | "import matplotlib.pyplot as plt\n",
17 | "\n",
18 | "import torch\n",
19 | "from torch.utils.data import Dataset, DataLoader\n",
20 | "import torch.nn as nn\n",
21 | "import torch.optim as optim\n",
22 | "\n",
23 | "# data augmentation\n",
24 | "import albumentations as A\n",
25 | "\n",
26 | "# pretrained models\n",
27 | "import torchvision\n",
28 | "from torchvision import models, transforms"
29 | ]
30 | },
31 | {
32 | "cell_type": "markdown",
33 | "id": "a1bb290b-54b8-4d2f-af18-fff71dcd33ab",
34 | "metadata": {},
35 | "source": [
36 | "Image Data from [here](https://www.kaggle.com/andrewmvd/animal-faces)\n",
37 | "- Animal Faces"
38 | ]
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "id": "369bebf4-9016-426b-b48c-3228c4e7b1e5",
43 | "metadata": {},
44 | "source": [
45 | "## Resizing"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": null,
51 | "id": "bda1aedf-dbfe-493f-a859-659fcadcd30a",
52 | "metadata": {},
53 | "outputs": [],
54 | "source": [
55 | "# create a dataframe for our image data\n",
56 | "data_path = 'course_data/afhq'\n",
57 | "\n",
58 | "rows = []\n",
59 | "for dataset in os.listdir(data_path):\n",
60 | " for label in os.listdir(data_path + f'/{dataset}'):\n",
61 | " for image in os.listdir(data_path + f'/{dataset}' + f'/{label}'):\n",
62 | " row = dict()\n",
63 | " row['image_file'] = image\n",
64 | " row['label'] = label\n",
65 | " row['dataset'] = dataset\n",
66 | " \n",
67 | " # a bit redudant, could build from other data in __getitem__ if wanted\n",
68 | " row['image_path'] = data_path + f'/{dataset}' + f'/{label}'\n",
69 | " rows.append(row)\n",
70 | " \n",
71 | "df = pd.DataFrame(rows)\n",
72 | "print(len(df))\n",
73 | "df.head()"
74 | ]
75 | },
76 | {
77 | "cell_type": "code",
78 | "execution_count": null,
79 | "id": "120426b8-4e1d-4dea-bcf3-b0f4eb4ba9e4",
80 | "metadata": {},
81 | "outputs": [],
82 | "source": [
83 | "# training and validation data\n",
84 | "df_train = df[df['dataset'] == 'train'].reset_index(drop=True)\n",
85 | "df_val = df[df['dataset'] == 'val'].reset_index(drop=True)\n",
86 | "len(df_train), len(df_val)"
87 | ]
88 | },
89 | {
90 | "cell_type": "markdown",
91 | "id": "57230353-94d5-40c1-97a0-799c0ebffa5c",
92 | "metadata": {},
93 | "source": [
94 | "We're going to work with a pre-trained model that takes in images of size 224x224. We will reduce the resolution as a *pre-processings* step rather than on the fly to save time during training.\n",
95 | "- Notice the time we save during each epoch: 3 seconds for me"
96 | ]
97 | },
98 | {
99 | "cell_type": "code",
100 | "execution_count": null,
101 | "id": "b18bef54-d3e6-4628-9fa6-b7cf6d4cfc41",
102 | "metadata": {},
103 | "outputs": [],
104 | "source": [
105 | "def resize_img(path, size):\n",
106 | " img = cv2.imread(path)\n",
107 | " \n",
108 | " start = time.time()\n",
109 | " img = cv2.resize(img, size)\n",
110 | " end = time.time()\n",
111 | " \n",
112 | " cv2.imwrite(path, img)\n",
113 | " return end - start\n",
114 | "\n",
115 | "# resize all of the images to 256x256\n",
116 | "total_time_resize = 0.0\n",
117 | "for idx in tqdm(range(len(df_train))):\n",
118 | " row = df_train.iloc[idx]\n",
119 | " image_path = row['image_path']\n",
120 | " fname = row['image_file']\n",
121 | " path = image_path+'/'+fname\n",
122 | " \n",
123 | " total_time_resize += resize_img(path, (256, 256))\n",
124 | " \n",
125 | "for idx in tqdm(range(len(df_val))):\n",
126 | " row = df_train.iloc[idx]\n",
127 | " image_path = row['image_path']\n",
128 | " fname = row['image_file']\n",
129 | " path = image_path+'/'+fname\n",
130 | " \n",
131 | " total_time_resize += resize_img(path, (256, 256))\n",
132 | " "
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "execution_count": null,
138 | "id": "5dfb0e02-a3c9-46c4-a7f6-9c49bc78db25",
139 | "metadata": {},
140 | "outputs": [],
141 | "source": [
142 | "total_time_resize"
143 | ]
144 | },
145 | {
146 | "cell_type": "code",
147 | "execution_count": null,
148 | "id": "0ee046eb-4c26-4db2-be5f-cc3400e95e3f",
149 | "metadata": {},
150 | "outputs": [],
151 | "source": [
152 | "row = df_train.iloc[100]\n",
153 | "image_path = row['image_path']\n",
154 | "fname = row['image_file']\n",
155 | "path = image_path+'/'+fname\n",
156 | "img = cv2.imread(path)\n",
157 | "\n",
158 | "img.shape"
159 | ]
160 | },
161 | {
162 | "cell_type": "markdown",
163 | "id": "0ff44043-dfca-4ee2-b5a3-ce71956e75e9",
164 | "metadata": {},
165 | "source": [
166 | "## Data Augmentation with [Albumentations](https://github.com/albumentations-team/albumentations)\n",
167 | "- A suite of very fast transformations for images\n",
168 | "- Supports masks and keypoints!"
169 | ]
170 | },
171 | {
172 | "cell_type": "code",
173 | "execution_count": null,
174 | "id": "1b578b01-e5cd-4b9d-959e-0b881003073e",
175 | "metadata": {},
176 | "outputs": [],
177 | "source": [
178 | "from albumentations.pytorch import ToTensorV2\n",
179 | "\n",
180 | "# let's add an augmentation option\n",
181 | "class AnimalFacesDataset(Dataset):\n",
182 | " def __init__(self, df, augment=False):\n",
183 | " self.df = df\n",
184 | " self.augment = augment\n",
185 | " \n",
186 | " # label dictionary\n",
187 | " self.label_dict = {'cat':0, 'dog':1, 'wild':2}\n",
188 | " \n",
189 | " # define the transformation\n",
190 | " if augment == True:\n",
191 | " self.transforms = A.Compose([\n",
192 | " # spatial transforms\n",
193 | " A.RandomCrop(width=224, height=224),\n",
194 | " A.HorizontalFlip(p=.5),\n",
195 | " A.VerticalFlip(p=.5),\n",
196 | " A.Rotate(limit = 10, \n",
197 | " border_mode = cv2.BORDER_CONSTANT, \n",
198 | " value = 0.0, p = .75),\n",
199 | " \n",
200 | " # pixel-level transformation\n",
201 | " A.RandomBrightnessContrast(p=0.5),\n",
202 | " \n",
203 | " # we will normalize according to ImageNet since we will be using a pre-trained ResNet\n",
204 | " # this adjusts from [0,255] to [0,1]\n",
205 | " A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),\n",
206 | " \n",
207 | " # convert to a tensor and move color channels\n",
208 | " ToTensorV2()\n",
209 | " ])\n",
210 | " else:\n",
211 | " self.transforms = A.Compose([\n",
212 | " # training/valid images have same size\n",
213 | " A.CenterCrop(width=224, height=224),\n",
214 | " \n",
215 | " # normalize\n",
216 | " A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),\n",
217 | " \n",
218 | " # convert to a tensor and move color channels\n",
219 | " ToTensorV2()\n",
220 | " ])\n",
221 | " \n",
222 | " def __len__(self):\n",
223 | " return len(self.df)\n",
224 | " \n",
225 | " def __getitem__(self, idx):\n",
226 | " row = self.df.iloc[idx]\n",
227 | " \n",
228 | " # get ingredients for retrieving image\n",
229 | " image_path = row['image_path']\n",
230 | " fname = row['image_file']\n",
231 | " path = image_path+'/'+fname\n",
232 | " \n",
233 | " # read the img\n",
234 | " img = cv2.imread(path)\n",
235 | " \n",
236 | " # convert to RGB\n",
237 | " img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n",
238 | " \n",
239 | " # transform the image\n",
240 | " # certain transformations expect the uint8 datatype\n",
241 | " transformed = self.transforms(image=img.astype(np.uint8))\n",
242 | " img = transformed['image']\n",
243 | " \n",
244 | " label = torch.tensor(self.label_dict[row['label']])\n",
245 | " \n",
246 | " return img, label"
247 | ]
248 | },
249 | {
250 | "cell_type": "code",
251 | "execution_count": null,
252 | "id": "e490a686-4006-4423-a91d-a4a4ccd04fa8",
253 | "metadata": {},
254 | "outputs": [],
255 | "source": [
256 | "ds_train = AnimalFacesDataset(df_train, augment=True)\n",
257 | "dl_train = DataLoader(ds_train, batch_size = 16, shuffle=True)\n",
258 | "\n",
259 | "ds_val = AnimalFacesDataset(df_val)\n",
260 | "dl_val = DataLoader(ds_val, batch_size = 16, shuffle=True)"
261 | ]
262 | },
263 | {
264 | "cell_type": "markdown",
265 | "id": "52261356-1c91-485c-ab2c-072e3dc33e04",
266 | "metadata": {},
267 | "source": [
268 | "Below we double check that this is working properly, and can see the transformation in practice"
269 | ]
270 | },
271 | {
272 | "cell_type": "code",
273 | "execution_count": null,
274 | "id": "88ab0d6e-71da-45f2-9f64-a4c0c49a5257",
275 | "metadata": {},
276 | "outputs": [],
277 | "source": [
278 | "img, label = next(iter(ds_train))\n",
279 | "print(img.shape)\n",
280 | "\n",
281 | "# restructure for plt\n",
282 | "img = np.transpose(np.array(img), (1,2,0))\n",
283 | "\n",
284 | "# reverse the normalization\n",
285 | "mean = (0.485, 0.456, 0.406)\n",
286 | "std = (0.229, 0.224, 0.225)\n",
287 | "for i in range(3):\n",
288 | " img[:,:,i] = (img[:,:,i] * std[i]) + mean[i]\n",
289 | "\n",
290 | "plt.imshow(img)\n",
291 | "print(label)"
292 | ]
293 | },
294 | {
295 | "cell_type": "markdown",
296 | "id": "c9fb861e-0ace-4bb9-8f99-e1202078f49d",
297 | "metadata": {},
298 | "source": [
299 | "## Pretrained Models\n",
300 | "- Freezing Layers (feature extraction)\n",
301 | "- Finetuning (weight initialization)"
302 | ]
303 | },
304 | {
305 | "cell_type": "code",
306 | "execution_count": null,
307 | "id": "83406fad-22c7-4bd4-ab35-dc35d150f21d",
308 | "metadata": {},
309 | "outputs": [],
310 | "source": [
311 | "# drum roll...the pretrained resnet!\n",
312 | "resnet = models.resnet18(pretrained=True)"
313 | ]
314 | },
315 | {
316 | "cell_type": "code",
317 | "execution_count": null,
318 | "id": "b5923144-ced7-4e12-a698-faa5e86de73d",
319 | "metadata": {},
320 | "outputs": [],
321 | "source": [
322 | "# we can see the architecture\n",
323 | "# note how many of the layers are organized in \"BasicBlock\"\n",
324 | "resnet"
325 | ]
326 | },
327 | {
328 | "cell_type": "markdown",
329 | "id": "ef1fbd33-4d5e-47f7-9691-c8b2499e0146",
330 | "metadata": {},
331 | "source": [
332 | "- Notice how the image eventually becomes a 1D vector of dimension 512\n",
333 | "- In some sense the network has transformed an image into a vector of features helpful for image classification\n",
334 | "- The last layer is a simple function (linear followed by softmax) on this feature space that predicts an images class\n",
335 | "- One strategy is to train a new simple function on this **same** feature space for our classification task"
336 | ]
337 | },
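To make that 512-dimensional feature space concrete, here is a minimal sketch (consistent with the imports used in this notebook, but not a cell from it) that drops the final fc layer and reads the features directly:

```python
import torch
import torch.nn as nn
from torchvision import models

resnet = models.resnet18(pretrained=True)

# everything up to (and including) the global average pool, dropping fc
feature_extractor = nn.Sequential(*list(resnet.children())[:-1])

x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    feats = feature_extractor(x)
print(feats.shape)  # torch.Size([1, 512, 1, 1]) -- flatten for a 512-dim feature vector
```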
338 | {
339 | "cell_type": "code",
340 | "execution_count": null,
341 | "id": "5c72d09a-78ea-4ec7-aceb-67bfaa081858",
342 | "metadata": {},
343 | "outputs": [],
344 | "source": [
345 | "from torchsummary import summary\n",
346 | "summary(resnet, input_size = (3, 224, 224), device='cpu')"
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": null,
352 | "id": "1348cac1-5a90-4795-8c71-7f309c3abb5a",
353 | "metadata": {},
354 | "outputs": [],
355 | "source": [
356 | "# turn off gradients for all the parameters\n",
357 | "for param in resnet.parameters():\n",
358 | " param.requires_grad = False"
359 | ]
360 | },
361 | {
362 | "cell_type": "code",
363 | "execution_count": null,
364 | "id": "3239a40b-79e2-44c6-9d3f-3f224e85e670",
365 | "metadata": {},
366 | "outputs": [],
367 | "source": [
368 | "# re-intialize the last layer for our task\n",
369 | "print(resnet.fc)\n",
370 | "resnet.fc = nn.Linear(512, 3)\n",
371 | "print(resnet.fc)"
372 | ]
373 | },
374 | {
375 | "cell_type": "code",
376 | "execution_count": null,
377 | "id": "96d64c4b-6851-4b7e-8bc4-36e991409662",
378 | "metadata": {},
379 | "outputs": [],
380 | "source": [
381 | "# re-initializing the layer reset to default settings\n",
382 | "for param in resnet.fc.parameters():\n",
383 | " print(param.requires_grad)"
384 | ]
385 | },
386 | {
387 | "cell_type": "code",
388 | "execution_count": null,
389 | "id": "7f819a11-860d-41f2-8598-960119767120",
390 | "metadata": {},
391 | "outputs": [],
392 | "source": [
393 | "# double-check all the parameters\n",
394 | "for name, param in resnet.named_parameters():\n",
395 | " print(f\"{name} gradient is set to\", param.requires_grad)"
396 | ]
397 | },
398 | {
399 | "cell_type": "code",
400 | "execution_count": null,
401 | "id": "1de65d46-4ac3-4d01-baad-6659d1f09fba",
402 | "metadata": {},
403 | "outputs": [],
404 | "source": [
405 | "# pass the appropriate parameters to the optimizer\n",
406 | "params_to_update = []\n",
407 | "\n",
408 | "for param in resnet.parameters():\n",
409 | " if param.requires_grad == True:\n",
410 | " params_to_update.append(param)\n",
411 | "\n",
412 | "optimizer = optim.Adam(params_to_update, lr=0.001)"
413 | ]
414 | },
415 | {
416 | "cell_type": "code",
417 | "execution_count": null,
418 | "id": "c48bbab6-2250-485e-950a-a049857d675c",
419 | "metadata": {},
420 | "outputs": [],
421 | "source": [
422 | "# let's make sure that this actually freezes/trains the layers, take a sample weight\n",
423 | "print(resnet.conv1.weight[0])\n",
424 | "print(resnet.fc.bias)"
425 | ]
426 | },
427 | {
428 | "cell_type": "code",
429 | "execution_count": null,
430 | "id": "173e63ff-6da8-41f5-90d6-b24d640b374e",
431 | "metadata": {},
432 | "outputs": [],
433 | "source": [
434 | "def one_pass(model, dataloader, optimizer, lossFun, backwards=True, print_loss=False):\n",
435 | " \n",
436 | " if backwards == True:\n",
437 | " model.train()\n",
438 | " else:\n",
439 | " model.eval()\n",
440 | " \n",
441 | " total_loss = 0.0\n",
442 | " for x, y in tqdm(dataloader):\n",
443 | " \n",
444 | " y_pred = model(x)\n",
445 | " loss = lossFun(y_pred, y)\n",
446 | " total_loss += loss.item()\n",
447 | " \n",
448 | " if backwards == True:\n",
449 | " optimizer.zero_grad()\n",
450 | " loss.backward()\n",
451 | " optimizer.step()\n",
452 | " avg_loss = total_loss / len(dataloader)\n",
453 | " \n",
454 | " if print_loss == True:\n",
455 | " print(avg_loss)\n",
456 | " \n",
457 | " return avg_loss\n",
458 | "\n",
459 | "def one_pass_acc(model, dataloader, num_points):\n",
460 | " model.eval()\n",
461 | " total_incorrect = 0\n",
462 | " \n",
463 | " softmax = nn.LogSoftmax(dim=1)\n",
464 | " \n",
465 | " for x, y in dataloader:\n",
466 | " y_pred = softmax(model(x))\n",
467 | " y_pred = torch.argmax(y_pred, dim=1)\n",
468 | " \n",
469 | " total_incorrect += torch.count_nonzero(y - y_pred).item()\n",
470 | " \n",
471 | " percent_wrong = total_incorrect / num_points\n",
472 | " return 1 - percent_wrong"
473 | ]
474 | },
475 | {
476 | "cell_type": "code",
477 | "execution_count": null,
478 | "id": "d04ce83b-c4f2-4120-b15a-f729a6eef0ed",
479 | "metadata": {},
480 | "outputs": [],
481 | "source": [
482 | "from tqdm.notebook import tqdm\n",
483 | "\n",
484 | "lossFun = nn.CrossEntropyLoss()\n",
485 | "\n",
486 | "num_epochs = 5\n",
487 | "train_losses = []\n",
488 | "valid_losses = []\n",
489 | "\n",
490 | "for epoch in tqdm(range(num_epochs)):\n",
491 | " print('Epoch: ', epoch)\n",
492 | " \n",
493 | " train_loss = one_pass(resnet, dl_train, optimizer, lossFun)\n",
494 | " train_losses.append(train_loss)\n",
495 | " print('Train loss: ', train_loss)\n",
496 | " \n",
497 | " valid_loss = one_pass(resnet, dl_val, optimizer, lossFun, backwards=False)\n",
498 | " valid_losses.append(valid_loss)\n",
499 | " print('Valid loss: ', valid_loss)\n",
500 | " \n",
501 | " train_acc = one_pass_acc(resnet, dl_train, len(ds_train))\n",
502 | " valid_acc = one_pass_acc(resnet, dl_val, len(ds_val))\n",
503 | " print('Train Acc: ', train_acc)\n",
504 | " print('Valid Acc: ', valid_acc)"
505 | ]
506 | },
507 | {
508 | "cell_type": "markdown",
509 | "id": "edaf246f-c486-4971-b40f-5c108dc04f15",
510 | "metadata": {},
511 | "source": [
512 | "Note how long it takes to train for images"
513 | ]
514 | },
515 | {
516 | "cell_type": "code",
517 | "execution_count": null,
518 | "id": "9bb0a828-4c2b-4509-b085-17e1f19f54f1",
519 | "metadata": {},
520 | "outputs": [],
521 | "source": [
522 | "print(resnet.conv1.weight[0])\n",
523 | "print(resnet.fc.bias)"
524 | ]
525 | },
526 | {
527 | "cell_type": "markdown",
528 | "id": "94cb6e8e-d4c2-49c7-9708-7290fb27d28b",
529 | "metadata": {},
530 | "source": [
531 | "If we want to finetune, we can either\n",
532 | "- use the resnet as a starting point and train by treating the pretrained weights as good weight initilaization OR\n",
533 | "- we can train different layers at different learning rates (the later the layer, the more we want to adjust the feature)"
534 | ]
535 | },
536 | {
537 | "cell_type": "code",
538 | "execution_count": null,
539 | "id": "ac3144f5-38d6-4994-b801-f64beaee572f",
540 | "metadata": {},
541 | "outputs": [],
542 | "source": [
543 | "# we can pass the optimizer groups of parameters rather than all the parameters in one group\n",
544 | "for param_group in optimizer.param_groups:\n",
545 | " print(param_group)"
546 | ]
547 | },
548 | {
549 | "cell_type": "code",
550 | "execution_count": null,
551 | "id": "4db9f39a-ea40-4fa6-ae0e-7ca5d9e1c25b",
552 | "metadata": {},
553 | "outputs": [],
554 | "source": [
555 | "for name, layer in resnet.named_children():\n",
556 | " print(name)"
557 | ]
558 | },
559 | {
560 | "cell_type": "code",
561 | "execution_count": null,
562 | "id": "a155115e-eabf-4f10-a5ae-daf8bec10438",
563 | "metadata": {},
564 | "outputs": [],
565 | "source": [
566 | "max_lr = 0.01\n",
567 | "params = []\n",
568 | "for i, layer in enumerate(resnet.children()):\n",
569 | " if i < 6:\n",
570 | " params.append({'params': layer.parameters(), 'lr': max_lr / 100})\n",
571 | " elif 5 < i < 9:\n",
572 | " params.append({'params': layer.parameters(), 'lr': max_lr / 10})\n",
573 | " else:\n",
574 | " params.append({'params': layer.parameters()})\n",
575 | " \n",
576 | "# only the parameters we didn't manually set the learning rate for inherit the learning rate set when defining the optimizer\n",
577 | "optimizer = optim.Adam(params, lr = max_lr)"
578 | ]
579 | },
580 | {
581 | "cell_type": "code",
582 | "execution_count": null,
583 | "id": "d11ca47f-cc50-467b-9bab-ce25a26f5a40",
584 | "metadata": {},
585 | "outputs": [],
586 | "source": [
587 | "# we can see the parameters groups here\n",
588 | "optimizer"
589 | ]
590 | },
591 | {
592 | "cell_type": "code",
593 | "execution_count": null,
594 | "id": "221cce29-a6be-4d9b-97fb-c2a6292b40fd",
595 | "metadata": {},
596 | "outputs": [],
597 | "source": [
598 | "# to make this a bit cleaner you can make a new model class\n",
599 | "# use model.features1, model.features2, and model.classifier to set the learning rates\n",
600 | "class Tune_ResNet(nn.Module):\n",
601 | " def __init__(self):\n",
602 | " super(Tune_ResNet, self).__init__()\n",
603 | " resnet = models.resnet18(pretrained=True)\n",
604 | " layers = list(resnet.children())[:9]\n",
605 | " self.features1 = nn.Sequential(*layers[:6])\n",
606 | " self.features2 = nn.Sequential(*layers[6:])\n",
607 | " self.classifier = nn.Linear(512, 3)\n",
608 | " self.unroll = nn.Flatten()\n",
609 | " \n",
610 | " def forward(self, x):\n",
611 | " x = self.features1(x)\n",
612 | " x = self.features2(x)\n",
613 | " x = self.unroll(x)\n",
614 | " x = self.classifier(x)\n",
615 | " return x\n",
616 | " \n",
617 | "model = Tune_ResNet()\n",
618 | "summary(model, input_size = (3, 224, 224), device='cpu')"
619 | ]
620 | },
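With the named submodules above, the discriminative learning rates can be written per module rather than by child index. A minimal sketch, reusing the `max_lr` convention from the earlier cell:

```python
max_lr = 0.01
optimizer = optim.Adam([
    {'params': model.features1.parameters(), 'lr': max_lr / 100},  # early layers: barely move
    {'params': model.features2.parameters(), 'lr': max_lr / 10},   # later layers: adjust more
    {'params': model.classifier.parameters()},                     # new head: trains at max_lr
], lr=max_lr)
```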
621 | {
622 | "cell_type": "markdown",
623 | "id": "f2f02122-57fa-452e-8f1a-94c5529d2844",
624 | "metadata": {},
625 | "source": [
626 | "## Training on a GPU\n",
627 | "- We saw how slow it was to train images on a cpu\n",
628 | "- PyTorch makes it easy to do this training on a GPU!\n",
629 | "- Always follow GPU etiquette and check who is running what"
630 | ]
631 | },
632 | {
633 | "cell_type": "code",
634 | "execution_count": null,
635 | "id": "4adba611-dcdc-4982-bfa1-f4b512e00a97",
636 | "metadata": {},
637 | "outputs": [],
638 | "source": [
639 | "# is a GPU available?\n",
640 | "torch.cuda.is_available()"
641 | ]
642 | },
643 | {
644 | "cell_type": "code",
645 | "execution_count": null,
646 | "id": "b265d230-a907-4071-80e0-4c72a6360948",
647 | "metadata": {},
648 | "outputs": [],
649 | "source": [
650 | "# check who is using what\n",
651 | "!nvidia-smi"
652 | ]
653 | },
654 | {
655 | "cell_type": "code",
656 | "execution_count": null,
657 | "id": "9f16ded2-6486-491f-9f06-5205665e1c0a",
658 | "metadata": {},
659 | "outputs": [],
660 | "source": [
661 | "# how many devices are there?\n",
662 | "torch.cuda.device_count()"
663 | ]
664 | },
665 | {
666 | "cell_type": "code",
667 | "execution_count": null,
668 | "id": "de5d5cdc-9953-4336-92f7-761e5347ead2",
669 | "metadata": {},
670 | "outputs": [],
671 | "source": [
672 | "device_no = 0\n",
673 | "if torch.cuda.is_available() == True:\n",
674 | " device = torch.device(device_no)\n",
675 | "else:\n",
676 | " device = torch.device('cpu')\n",
677 | "device"
678 | ]
679 | },
680 | {
681 | "cell_type": "code",
682 | "execution_count": null,
683 | "id": "17eaf798-c904-45ab-8879-141e9931687f",
684 | "metadata": {},
685 | "outputs": [],
686 | "source": [
687 | "# move model parameters to device\n",
688 | "model.to(device)"
689 | ]
690 | },
691 | {
692 | "cell_type": "code",
693 | "execution_count": null,
694 | "id": "2071d8b1-a383-41f1-bbf7-af78a0647577",
695 | "metadata": {},
696 | "outputs": [],
697 | "source": [
698 | "# let's adapt our earlier function\n",
699 | "def one_pass(model, dataloader, optimizer, lossFun, device, backwards=True, print_loss=False):\n",
700 | " \n",
701 | " if backwards == True:\n",
702 | " model.train()\n",
703 | " else:\n",
704 | " model.eval()\n",
705 | " \n",
706 | " total_loss = 0.0\n",
707 | " for x, y in tqdm(dataloader):\n",
708 | " \n",
709 | " # send labelled data to the device\n",
710 | " x, y = x.to(device), y.to(device)\n",
711 | " \n",
712 | " y_pred = model(x)\n",
713 | " loss = lossFun(y_pred, y)\n",
714 | " total_loss += loss.item()\n",
715 | " \n",
716 | " if backwards == True:\n",
717 | " optimizer.zero_grad()\n",
718 | " loss.backward()\n",
719 | " optimizer.step()\n",
720 | " avg_loss = total_loss / len(dataloader)\n",
721 | " \n",
722 | " if print_loss == True:\n",
723 | " print(avg_loss)\n",
724 | " \n",
725 | " return avg_loss"
726 | ]
727 | },
728 | {
729 | "cell_type": "markdown",
730 | "id": "3986f108-ae8c-4eb8-b699-5526ef094d73",
731 | "metadata": {},
732 | "source": [
733 | "Note that\n",
734 | "- The model can only take inputs on the same device\n",
735 | "- The output is also on the specified device and cannot interact with tensors on a different device"
736 | ]
737 | },
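As a sketch of that failure mode (assuming a CUDA device was found above; on a CPU-only machine both tensors land on the same device and no error occurs):

```python
a = torch.rand(3, device=device)  # on the GPU, if one is available
b = torch.rand(3)                 # on the CPU

try:
    a + b
except RuntimeError as e:
    # PyTorch raises an "expected all tensors to be on the same device" style error
    print(e)
```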
738 | {
739 | "cell_type": "code",
740 | "execution_count": null,
741 | "id": "b6fa035e-d834-47db-a435-bfa2214c2dcc",
742 | "metadata": {},
743 | "outputs": [],
744 | "source": [
745 | "x, y = next(iter(dl_train))\n",
746 | "\n",
747 | "# move to device\n",
748 | "x, y = x.to(device), y.to(device)\n",
749 | "\n",
750 | "# perform computation\n",
751 | "y_pred = model(x)\n",
752 | "\n",
753 | "# now its on the cpu again\n",
754 | "y_pred.cpu()"
755 | ]
756 | }
757 | ],
758 | "metadata": {
759 | "kernelspec": {
760 | "display_name": "Python 3",
761 | "language": "python",
762 | "name": "python3"
763 | },
764 | "language_info": {
765 | "codemirror_mode": {
766 | "name": "ipython",
767 | "version": 3
768 | },
769 | "file_extension": ".py",
770 | "mimetype": "text/x-python",
771 | "name": "python",
772 | "nbconvert_exporter": "python",
773 | "pygments_lexer": "ipython3",
774 | "version": "3.8.10"
775 | }
776 | },
777 | "nbformat": 4,
778 | "nbformat_minor": 5
779 | }
780 |
--------------------------------------------------------------------------------
/Notebooks/Lecture5_Text_Embeddings_Models.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "id": "f5adfd5c-c155-444d-beeb-89161a51b239",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import os\n",
11 | "import cv2\n",
12 | "import time\n",
13 | "import numpy as np\n",
14 | "import pandas as pd\n",
15 | "from tqdm.notebook import tqdm\n",
16 | "import matplotlib.pyplot as plt\n",
17 | "\n",
18 | "import torch\n",
19 | "from torch.utils.data import Dataset, DataLoader\n",
20 | "import torch.nn as nn\n",
21 | "import torch.optim as optim"
22 | ]
23 | },
24 | {
25 | "cell_type": "markdown",
26 | "id": "e0668f27-fd1f-4930-819f-fcbdb2702a9c",
27 | "metadata": {},
28 | "source": [
29 | "IMDB Movie Review Dataset (cleaned)\n",
30 | "- Originally from [here](https://ai.stanford.edu/~amaas/data/sentiment/)\n",
31 | "- Cleaned into a csv [here](https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews)"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": null,
37 | "id": "7c14c97e-da75-46c3-aadb-e1576bedcf8f",
38 | "metadata": {},
39 | "outputs": [],
40 | "source": [
41 | "df = pd.read_csv('course_data/IMDB Dataset.csv')\n",
42 | "df.head()"
43 | ]
44 | },
45 | {
46 | "cell_type": "markdown",
47 | "id": "8a73f20c-ac25-407e-8318-77680e570778",
48 | "metadata": {},
49 | "source": [
50 | "## Automatic Tokenization with Spacy"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": null,
56 | "id": "83e50c35-b1b7-429d-8004-fbc1fdcb9284",
57 | "metadata": {},
58 | "outputs": [],
59 | "source": [
60 | "# tool for text\n",
61 | "import spacy\n",
62 | "\n",
63 | "# load information about words\n",
64 | "!python3 -m spacy download en_core_web_sm\n",
65 | "nlp = spacy.load('en_core_web_sm')"
66 | ]
67 | },
68 | {
69 | "cell_type": "code",
70 | "execution_count": null,
71 | "id": "c754d50b-148e-48a6-a470-0eac8236796e",
72 | "metadata": {},
73 | "outputs": [],
74 | "source": [
75 | "some_text = df.iloc[9]['review']\n",
76 | "print(some_text)\n",
77 | "\n",
78 | "# automatically tokenize the text\n",
79 | "tokenized_text = nlp(some_text)"
80 | ]
81 | },
82 | {
83 | "cell_type": "code",
84 | "execution_count": null,
85 | "id": "5d3c905d-66ad-41b7-8ad5-ad0a85e41615",
86 | "metadata": {},
87 | "outputs": [],
88 | "source": [
89 | "# it's not perfect\n",
90 | "for token in tokenized_text:\n",
91 | " print(token.text)"
92 | ]
93 | },
94 | {
95 | "cell_type": "code",
96 | "execution_count": null,
97 | "id": "62fae344-cb8d-411b-8c6a-ef3e226d7c0a",
98 | "metadata": {},
99 | "outputs": [],
100 | "source": [
101 | "idx = 5\n",
102 | "\n",
103 | "token = tokenized_text[idx]\n",
104 | "\n",
105 | "# lemmatization\n",
106 | "print('Lemmatization of', token.text, 'is', token.lemma_)\n",
107 | "\n",
108 | "# part of speech tagging\n",
109 | "print(token.text, 'is a', token.pos_)\n",
110 | "\n",
111 | "# is it a stop word?\n",
112 | "print('The fact that', token.text, 'is a stop word is', token.is_stop)"
113 | ]
114 | },
115 | {
116 | "cell_type": "code",
117 | "execution_count": null,
118 | "id": "12a84263-6ea6-4ebd-b092-9c69aa053e10",
119 | "metadata": {},
120 | "outputs": [],
121 | "source": [
122 | "# sentence segmentation\n",
123 | "for sentence in tokenized_text.sents:\n",
124 | " print(sentence)"
125 | ]
126 | },
127 | {
128 | "cell_type": "markdown",
129 | "id": "cb184749-fc09-425e-bd27-4f56759b4c0e",
130 | "metadata": {},
131 | "source": [
132 | "- tons more fancy features!\n",
133 | "- Let's do a simple pipeline where we ignore non-alphabetic characters"
134 | ]
135 | },
136 | {
137 | "cell_type": "code",
138 | "execution_count": null,
139 | "id": "1c291843-acb8-406c-bced-b8a477bfbf17",
140 | "metadata": {},
141 | "outputs": [],
142 | "source": [
143 | "import re\n",
144 | "\n",
145 | "a_review = df.iloc[9]['review']\n",
146 | "\n",
147 | "# remove those
s\n",
148 | "a_review = a_review.replace('
', ' ')\n",
149 | "print(a_review)\n",
150 | "\n",
151 | "# remove non-alphabetic characters\n",
152 | "a_review = re.sub(\"[^A-Za-z']+\", ' ', a_review)\n",
153 | "print(a_review)"
154 | ]
155 | },
156 | {
157 | "cell_type": "code",
158 | "execution_count": null,
159 | "id": "0c6a2ccd-516e-43ac-a485-8f03ca2032ff",
160 | "metadata": {},
161 | "outputs": [],
162 | "source": [
163 | "# disabling some fancy features of spacy for speed\n",
164 | "nlp = spacy.load('en_core_web_sm', disable = ['ner', 'parser'])\n",
165 | "\n",
166 | "rows = []\n",
167 | "for idx in tqdm(range(len(df))):\n",
168 | " row = df.iloc[idx].copy()\n",
169 | " \n",
170 | " # first we remove numeric characters and lowercase everything\n",
171 | " cleaned_review = re.sub(\"[^A-Za-z']+\", ' ', row['review'].replace('
', ' ')).lower()\n",
172 | " \n",
173 | " # we let spaCy tokenize and lemmatize the text for us\n",
174 | " tokenized_review = nlp(cleaned_review)\n",
175 | " cleaned_tokenized = [token.lemma_ for token in tokenized_review if ((not token.is_stop) or (' ' in token.text))]\n",
176 | " \n",
177 | " if len(cleaned_tokenized) > 1:\n",
178 | " row['cleaned'] = ' '.join(cleaned_tokenized)\n",
179 | " rows.append(row)\n",
180 | "df_clean = pd.DataFrame(rows)\n",
181 | "df_clean.head()\n",
182 | "df_clean.to_csv('course_data/IMDB_cleaned.csv')"
183 | ]
184 | },
185 | {
186 | "cell_type": "code",
187 | "execution_count": null,
188 | "id": "400239e5-9d76-462b-be23-c6165b7e2905",
189 | "metadata": {},
190 | "outputs": [],
191 | "source": [
192 | "df_clean = pd.read_csv('course_data/IMDB_cleaned.csv')\n",
193 | "df_clean.head()"
194 | ]
195 | },
196 | {
197 | "cell_type": "markdown",
198 | "id": "44dd434d-a060-40b6-89b2-a7d745efe03c",
199 | "metadata": {},
200 | "source": [
201 | "## Prepare for Training"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": null,
207 | "id": "11b07a32-67ad-4952-9acf-8fbf925d6768",
208 | "metadata": {},
209 | "outputs": [],
210 | "source": [
211 | "# count words, send infrequent to unknown\n",
212 | "\n",
213 | "# let's get an idea of word frequency\n",
214 | "from collections import Counter\n",
215 | "\n",
216 | "reviews = [review.split(' ') for review in list(df_clean['cleaned'])]\n",
217 | "word_freq = Counter([token for review in reviews for token in review]).most_common()"
218 | ]
219 | },
220 | {
221 | "cell_type": "code",
222 | "execution_count": null,
223 | "id": "a2d58387-6570-40aa-8190-8fa84fa28c4f",
224 | "metadata": {},
225 | "outputs": [],
226 | "source": [
227 | "# no surprises here\n",
228 | "word_freq[:10]"
229 | ]
230 | },
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "id": "9f73b89b-60b9-4600-bd31-8e8ab6fdba73",
235 | "metadata": {},
236 | "outputs": [],
237 | "source": [
238 | "# words only seen once\n",
239 | "word_freq[-25:]"
240 | ]
241 | },
242 | {
243 | "cell_type": "code",
244 | "execution_count": null,
245 | "id": "15e6d11b-bfeb-499d-bdb5-3f6bd9c62018",
246 | "metadata": {},
247 | "outputs": [],
248 | "source": [
249 | "# remove words that appear infrequently\n",
250 | "word_freq = dict(word_freq)\n",
251 | "print(len(word_freq))\n",
252 | "min_freq = 5\n",
253 | "word_dict = {}\n",
254 | "\n",
255 | "# sending all the unknowns to 0\n",
256 | "i = 1\n",
257 | "for word in word_freq:\n",
258 | " if word_freq[word] > min_freq:\n",
259 | " word_dict[word] = i\n",
260 | " i += 1\n",
261 | " else:\n",
262 | " word_dict[word] = 0\n",
263 | "\n",
264 | "# dictionary length \n",
265 | "dict_length = max(word_dict.values()) + 1\n",
266 | "dict_length"
267 | ]
268 | },
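A quick sanity check (a sketch, not a cell from the notebook): frequent words get unique indices, while anything at or below the frequency cutoff collapses to index 0.

```python
sample = reviews[0][:5]
print(sample)
print([word_dict[token] for token in sample])  # rare tokens show up as 0
```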
269 | {
270 | "cell_type": "code",
271 | "execution_count": null,
272 | "id": "481996b9-1b4d-457f-869e-3665b48f0b3c",
273 | "metadata": {},
274 | "outputs": [],
275 | "source": [
276 | "# to collate the tensors into batches, sentence need to be the same size\n",
277 | "# we could overwrite the collate function, or we could pick a max sentence size and pad\n",
278 | "\n",
279 | "max_length = 0\n",
280 | "for idx in tqdm(range(len(df_clean))):\n",
281 | " row = df_clean.iloc[idx]\n",
282 | " length = len(row['cleaned'].split(' '))\n",
283 | " if length > max_length:\n",
284 | " max_length = length\n",
285 | "print(max_length)"
286 | ]
287 | },
288 | {
289 | "cell_type": "code",
290 | "execution_count": null,
291 | "id": "d32d0945-6010-4589-9777-5777630b020e",
292 | "metadata": {},
293 | "outputs": [],
294 | "source": [
295 | "class IMDBDataset(Dataset):\n",
296 | " def __init__(self, df, word_dict, max_length):\n",
297 | " self.df = df\n",
298 | " self.word_dict = word_dict\n",
299 | " self.sent_dict = {'negative': 0, 'positive': 1}\n",
300 | " self.max_len = max_length\n",
301 | " \n",
302 | " def __len__(self):\n",
303 | " return len(self.df)\n",
304 | " \n",
305 | " def __getitem__(self, idx):\n",
306 | " row = self.df.iloc[idx]\n",
307 | " review = row['cleaned'].split(' ')\n",
308 | " x = torch.zeros(self.max_len)\n",
309 | " \n",
310 | " # get review as a list of integers\n",
311 | " for idx in range(len(review)):\n",
312 | " # we want to front pad for RNN\n",
313 | " x[self.max_len - len(review) + idx] = self.word_dict[review[idx]]\n",
314 | " \n",
315 | " y = torch.tensor(self.sent_dict[row['sentiment']]).float()\n",
316 | " \n",
317 | " # embedding likes long tensors\n",
318 | " return x.long(), y\n",
319 | "ds = IMDBDataset(df_clean, word_dict, max_length)\n",
320 | "next(iter(ds))"
321 | ]
322 | },
323 | {
324 | "cell_type": "markdown",
325 | "id": "37f05b5f-1c74-434c-bc72-b6a0f60dec48",
326 | "metadata": {},
327 | "source": [
328 | "## Models"
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": null,
334 | "id": "658d591e-2aea-4faf-8c8f-6c4941524464",
335 | "metadata": {},
336 | "outputs": [],
337 | "source": [
338 | "# CBOW model for sentiment analysis\n",
339 | "# train the embedding during training\n",
340 | "class CBOW(nn.Module):\n",
341 | " def __init__(self, dict_length, embedding_size):\n",
342 | " super(CBOW, self).__init__()\n",
343 | " # padding index turns off gradient for unknown tokens\n",
344 | " self.word_emb = nn.Embedding(dict_length, embedding_size, padding_idx=0)\n",
345 | " self.linear = nn.Linear(embedding_size, 1)\n",
346 | " self.emb_size = embedding_size\n",
347 | " \n",
348 | " def forward(self, x):\n",
349 | " sent_length = x.shape[1]\n",
350 | " x = self.word_emb(x)\n",
351 | " sent_length = torch.count_nonzero(x, dim=1)\n",
352 | " x = torch.sum(x, dim=1) / sent_length\n",
353 | " x = self.linear(x)\n",
354 | " return torch.squeeze(x)"
355 | ]
356 | },
357 | {
358 | "cell_type": "code",
359 | "execution_count": null,
360 | "id": "138d8654-d202-4297-88f8-025cb96d4e17",
361 | "metadata": {},
362 | "outputs": [],
363 | "source": [
364 | "dl = DataLoader(ds, batch_size=1000, shuffle=True)\n",
365 | "x, y = next(iter(dl))\n",
366 | "\n",
367 | "cbow_model = CBOW(dict_length, 100)\n",
368 | "cbow_model(x).shape"
369 | ]
370 | },
371 | {
372 | "cell_type": "code",
373 | "execution_count": null,
374 | "id": "bb4f2893-55e8-402c-ad06-a43ab14219e7",
375 | "metadata": {
376 | "tags": []
377 | },
378 | "outputs": [],
379 | "source": [
380 | "def one_pass(model, dataloader, optimizer, lossFun, backwards=True, print_loss=False):\n",
381 | " \n",
382 | " if backwards == True:\n",
383 | " model.train()\n",
384 | " else:\n",
385 | " model.eval()\n",
386 | " \n",
387 | " total_loss = 0.0\n",
388 | " for x, y in tqdm(dataloader):\n",
389 | " \n",
390 | " y_pred = model(x)\n",
391 | " loss = lossFun(y_pred, y)\n",
392 | " total_loss += loss.item()\n",
393 | " \n",
394 | " if backwards == True:\n",
395 | " optimizer.zero_grad()\n",
396 | " loss.backward()\n",
397 | " optimizer.step()\n",
398 | " avg_loss = total_loss / len(dataloader)\n",
399 | " \n",
400 | " if print_loss == True:\n",
401 | " print(avg_loss)\n",
402 | " \n",
403 | " return avg_loss\n",
404 | "\n",
405 | "def one_pass_acc(model, dataloader, num_points):\n",
406 | " model.eval()\n",
407 | " total_incorrect = 0\n",
408 | " \n",
409 | " for x, y in dataloader:\n",
410 | " y_pred = (torch.sigmoid(model(x)) > 0.5).float()\n",
411 | " \n",
412 | " total_incorrect += torch.count_nonzero(y - y_pred).item()\n",
413 | " \n",
414 | " percent_wrong = total_incorrect / num_points\n",
415 | " return 1 - percent_wrong"
416 | ]
417 | },
418 | {
419 | "cell_type": "code",
420 | "execution_count": null,
421 | "id": "59d73c4a-61a4-47c3-8d65-8f319f4f572e",
422 | "metadata": {},
423 | "outputs": [],
424 | "source": [
425 | "lossFun = nn.BCEWithLogitsLoss()\n",
426 | "optimizer = optim.Adam(cbow_model.parameters(), lr = 0.01)\n",
427 | "\n",
428 | "num_epochs = 5\n",
429 | "\n",
430 | "for epoch in tqdm(range(num_epochs)):\n",
431 | " print('Epoch: ', epoch)\n",
432 | " \n",
433 | " loss = one_pass(cbow_model, dl, optimizer, lossFun)\n",
434 | " print('Loss: ', loss)\n",
435 | " \n",
436 | " acc = one_pass_acc(cbow_model, dl, len(ds))\n",
437 | " print('Accuracy: ', acc)"
438 | ]
439 | },
440 | {
441 | "cell_type": "code",
442 | "execution_count": null,
443 | "id": "4c4c25a5-3aa6-4325-b66e-2b1fc80daa06",
444 | "metadata": {},
445 | "outputs": [],
446 | "source": [
447 | "# RNN model for sentiment analysis (read Documentation for nn.RNN!)\n",
448 | "# train the embedding during training\n",
449 | "class RNN(nn.Module):\n",
450 | " def __init__(self, dict_length, embedding_size):\n",
451 | " super(RNN, self).__init__()\n",
452 | " # padding index turns off gradient for unknown tokens\n",
453 | " self.word_emb = nn.Embedding(dict_length, embedding_size, padding_idx=0)\n",
454 | " \n",
455 | " # RNN doesn't care about length of sequence\n",
456 | " # RNN does care about the size of the word embedding\n",
457 | " # hidden size dictates dimension of output of RNN\n",
458 | " self.rnn = nn.RNN(input_size=embedding_size, hidden_size=1, batch_first=True)\n",
459 | " \n",
460 | " # PyTorch RNN outputs a sequence of same length as input\n",
461 | " # For many to one, we can either use the final hidden state OR\n",
462 | " # slap a linear layer on the output, taking in all the hidden states\n",
463 | " \n",
464 | " def forward(self, x):\n",
465 | " x = self.word_emb(x)\n",
466 | " \n",
467 | " # RNN layer outputs a tuple, the output and the final hidden state\n",
468 | " # taking the final hidden state as output\n",
469 | " x = self.rnn(x)[1]\n",
470 | " \n",
471 | " return torch.squeeze(x)\n",
472 | "\n",
473 | "x, y = next(iter(dl))\n",
474 | "rnn_model = RNN(dict_length, 100)\n",
475 | "rnn_model(x).shape"
476 | ]
477 | },
478 | {
479 | "cell_type": "code",
480 | "execution_count": null,
481 | "id": "1883e871-61d8-4883-987c-7f36f98ee5a2",
482 | "metadata": {},
483 | "outputs": [],
484 | "source": [
485 | "# does way better\n",
486 | "# hidden state updates each time it sees a new word\n",
487 | "# intuition: probably gets excited when it sees a word like bad/good and ignores the rest\n",
488 | "lossFun = nn.BCEWithLogitsLoss()\n",
489 | "optimizer = optim.Adam(model.parameters(), lr = 0.01)\n",
490 | "\n",
491 | "num_epochs = 5\n",
492 | "\n",
493 | "for epoch in tqdm(range(num_epochs)):\n",
494 | " print('Epoch: ', epoch)\n",
495 | " \n",
496 | " loss = one_pass(model, dl, optimizer, lossFun)\n",
497 | " print('Loss: ', loss)\n",
498 | " \n",
499 | " acc = one_pass_acc(model, dl, len(ds))\n",
500 | " print('Accuracy: ', acc)"
501 | ]
502 | },
503 | {
504 | "cell_type": "markdown",
505 | "id": "08415a2a-18ab-4cb2-bd4a-2da97ad9aa73",
506 | "metadata": {},
507 | "source": [
508 | "## Tools for Word Embeddings"
509 | ]
510 | },
511 | {
512 | "cell_type": "code",
513 | "execution_count": null,
514 | "id": "18a60bae-91dc-4062-9710-f03e18c89acb",
515 | "metadata": {},
516 | "outputs": [],
517 | "source": [
518 | "# gensim is a great package for word embeddings\n",
519 | "# easy to train your own!\n",
520 | "import gensim.downloader\n",
521 | "\n",
522 | "# twitter embedding might be helpful for doing NLP related to social media!\n",
523 | "print(list(gensim.downloader.info()['models'].keys()))"
524 | ]
525 | },
526 | {
527 | "cell_type": "code",
528 | "execution_count": null,
529 | "id": "e699102d-2d7f-409a-a3c3-2b27144b6f2e",
530 | "metadata": {},
531 | "outputs": [],
532 | "source": [
533 | "# let's get the glove-wiki-gigaword-100\n",
534 | "# you can freeze the embedding for a model, finetune the embedding, or use it as a starting point for an embedding layer\n",
535 | "glove_emb = gensim.downloader.load('glove-wiki-gigaword-100')"
536 | ]
537 | },
538 | {
539 | "cell_type": "code",
540 | "execution_count": null,
541 | "id": "fe240be6-7cec-449d-9f82-fa6c6071ecb3",
542 | "metadata": {},
543 | "outputs": [],
544 | "source": [
545 | "# you can easily perform all the fancy features of word embeddings\n",
546 | "glove_emb.most_similar('cat')"
547 | ]
548 | },
549 | {
550 | "cell_type": "code",
551 | "execution_count": null,
552 | "id": "fac0527c-e0de-4898-85a6-782a304f00de",
553 | "metadata": {},
554 | "outputs": [],
555 | "source": [
556 | "# normed cat vector\n",
557 | "cat_vec = glove_emb.get_vector('cat')\n",
558 | "cat_vec / np.linalg.norm(cat_vec)"
559 | ]
560 | },
561 | {
562 | "cell_type": "code",
563 | "execution_count": null,
564 | "id": "3e696c59-43e8-46ee-a570-7c46a7eeb881",
565 | "metadata": {},
566 | "outputs": [],
567 | "source": [
568 | "# get the weights\n",
569 | "weights = glove_emb.get_normed_vectors()\n",
570 | "weights.shape"
571 | ]
572 | },
573 | {
574 | "cell_type": "code",
575 | "execution_count": null,
576 | "id": "dd11e0ba-e00b-426d-892f-500df86c9451",
577 | "metadata": {},
578 | "outputs": [],
579 | "source": [
580 | "# PyTorch makes it easy to load the weights\n",
581 | "glove_emb_layer = nn.Embedding.from_pretrained(torch.tensor(weights))"
582 | ]
583 | },
584 | {
585 | "cell_type": "code",
586 | "execution_count": null,
587 | "id": "e83c862f-1bd5-44e6-83d6-6c78a54ab8fb",
588 | "metadata": {},
589 | "outputs": [],
590 | "source": [
591 | "cat_idx = glove_emb.get_index('cat')\n",
592 | "glove_emb_layer(torch.tensor(cat_idx))"
593 | ]
594 | },
595 | {
596 | "cell_type": "code",
597 | "execution_count": null,
598 | "id": "6ad35fba-d354-419b-9be1-2a38a76bc104",
599 | "metadata": {},
600 | "outputs": [],
601 | "source": [
602 | "# make sure you turn off the gradients when training!\n",
603 | "for param in glove_emb_layer.parameters():\n",
604 | " print(param.requires_grad)"
605 | ]
606 | },
607 | {
608 | "cell_type": "code",
609 | "execution_count": null,
610 | "id": "93624f77-ea54-4f87-98a1-8203624020ac",
611 | "metadata": {},
612 | "outputs": [],
613 | "source": [
614 | "glove_emb_layer = nn.Embedding.from_pretrained(torch.tensor(weights), freeze=True)\n",
615 | "for param in glove_emb_layer.parameters():\n",
616 | " print(param.requires_grad)"
617 | ]
618 | },
619 | {
620 | "cell_type": "code",
621 | "execution_count": null,
622 | "id": "883a5922-8576-4a82-9eb3-8b52bebfd473",
623 | "metadata": {},
624 | "outputs": [],
625 | "source": [
626 | "# define a word2vec model\n",
627 | "from gensim.models import Word2Vec\n",
628 | "\n",
629 | "# different options for how to perform word2vec training\n",
630 | "# check out documentation for more options related to sampling frequent vs. infrequent words\n",
631 | "w2v_model = Word2Vec(# only consider words that show up at least a 100 times\n",
632 | " min_count = 100, \n",
633 | " \n",
634 | " # context window\n",
635 | " window = 2,\n",
636 | " \n",
637 | " #size of embedding\n",
638 | " vector_size = 300)\n",
639 | "# has methods build_vocab and train"
640 | ]
641 | }
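A minimal sketch of those two methods (reusing the tokenized `reviews` list built earlier in this notebook; the epoch count is illustrative):

```python
# gensim expects an iterable of token lists, which is what `reviews` already is
w2v_model.build_vocab(reviews)
w2v_model.train(reviews, total_examples=w2v_model.corpus_count, epochs=5)

# the trained vectors live on w2v_model.wv
print(w2v_model.wv.most_similar('movie'))
```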
642 | ],
643 | "metadata": {
644 | "kernelspec": {
645 | "display_name": "Python 3",
646 | "language": "python",
647 | "name": "python3"
648 | },
649 | "language_info": {
650 | "codemirror_mode": {
651 | "name": "ipython",
652 | "version": 3
653 | },
654 | "file_extension": ".py",
655 | "mimetype": "text/x-python",
656 | "name": "python",
657 | "nbconvert_exporter": "python",
658 | "pygments_lexer": "ipython3",
659 | "version": "3.8.10"
660 | }
661 | },
662 | "nbformat": 4,
663 | "nbformat_minor": 5
664 | }
665 |
--------------------------------------------------------------------------------
/Notebooks/Lecture6_Sequence_Models.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "id": "5b376ace-3837-418d-9f01-d7def6be3e25",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import os\n",
11 | "import cv2\n",
12 | "import time\n",
13 | "import random\n",
14 | "import numpy as np\n",
15 | "import pandas as pd\n",
16 | "from tqdm.notebook import tqdm\n",
17 | "import matplotlib.pyplot as plt\n",
18 | "\n",
19 | "import torch\n",
20 | "from torch.utils.data import Dataset, DataLoader\n",
21 | "import torch.nn as nn\n",
22 | "import torch.optim as optim"
23 | ]
24 | },
25 | {
26 | "cell_type": "markdown",
27 | "id": "9fefa422-45ca-4994-b6ca-16fd9507420a",
28 | "metadata": {},
29 | "source": [
30 | "## The ```nn.RNN``` module\n",
31 | "Some basic options for ```nn.RNN```\n",
32 | "- ```input_size```: refers to size of embedding/feature vectors (i.e. number of channels)\n",
33 | "- ```hidden_size```: desired dimensions of hidden state vector\n",
34 | "- ```num_layers```: number of RNNs stacked on top\n",
35 | "- ```batch_first```: If True, the input/output dimension is *(batch size, sequence length, embedding/feature vector size)*, otherwise it is *(sequence length, batch size, embedding/feature vector size)*"
36 | ]
37 | },
38 | {
39 | "cell_type": "code",
40 | "execution_count": null,
41 | "id": "42a48bde-5580-435e-bf9f-3cdd1e1a032b",
42 | "metadata": {},
43 | "outputs": [],
44 | "source": [
45 | "# assume we have a sequence of 300 dimensional vectors\n",
46 | "# hidden state dimension will be 100\n",
47 | "basic_rnn = nn.RNN(input_size=300, hidden_size=100, num_layers=1, batch_first=True)"
48 | ]
49 | },
50 | {
51 | "cell_type": "code",
52 | "execution_count": null,
53 | "id": "c3b4a84c-7af7-42a9-80c1-445329be6296",
54 | "metadata": {},
55 | "outputs": [],
56 | "source": [
57 | "# what's in here?\n",
58 | "for name, param in basic_rnn.named_parameters():\n",
59 | " print(name, param.shape)"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "id": "67e044fc-792d-4344-a57c-69807fea18e1",
66 | "metadata": {},
67 | "outputs": [],
68 | "source": [
69 | "# assume we have batch_size=3 and a length 10 sequence of 300 dimensional vectors\n",
70 | "input_seq = torch.rand((3, 10, 300))"
71 | ]
72 | },
73 | {
74 | "cell_type": "code",
75 | "execution_count": null,
76 | "id": "d1fd1ebd-2c87-4f75-afc6-2e1097ba5a79",
77 | "metadata": {},
78 | "outputs": [],
79 | "source": [
80 | "# we get two outputs when we pass a batch to the RNN\n",
81 | "output = basic_rnn(input_seq)\n",
82 | "for element in output:\n",
83 | " print(element.shape)"
84 | ]
85 | },
86 | {
87 | "cell_type": "markdown",
88 | "id": "ccad583a-2bc4-40c5-afd6-7c2259c57410",
89 | "metadata": {},
90 | "source": [
91 | "- The first output is a length ten sequence of 100 dimensional vectors (per datapoint in batch of size 3)\n",
92 | "- These are all the hidden states as we passed the sequence through the RNN"
93 | ]
94 | },
95 | {
96 | "cell_type": "code",
97 | "execution_count": null,
98 | "id": "d74f7e89-cf07-41c3-9816-d99e5c69d2cc",
99 | "metadata": {},
100 | "outputs": [],
101 | "source": [
102 | "output[0]"
103 | ]
104 | },
105 | {
106 | "cell_type": "markdown",
107 | "id": "475bfb24-44e2-4b4c-9034-5c2d1e0b7c6b",
108 | "metadata": {},
109 | "source": [
110 | "- The second output is a single 100 dimensional vector (per datapoint in batch of size 3)\n",
111 | "- This is the *last* hidden state"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": null,
117 | "id": "56b6048c-2d16-4578-857d-65ce7445c3da",
118 | "metadata": {},
119 | "outputs": [],
120 | "source": [
121 | "print(output[1] - output[0][:,-1,:])"
122 | ]
123 | },
124 | {
125 | "cell_type": "markdown",
126 | "id": "28c15200-284d-460d-8786-d5d0d4d7ae48",
127 | "metadata": {},
128 | "source": [
129 | "We can give the RNN layer a second input: a initial hidden state"
130 | ]
131 | },
132 | {
133 | "cell_type": "code",
134 | "execution_count": null,
135 | "id": "27c1e08e-b81e-42d5-800e-3705a110439d",
136 | "metadata": {},
137 | "outputs": [],
138 | "source": [
139 | "# a different initial hidden state changes the output slightly\n",
140 | "basic_rnn(input_seq)[1] - basic_rnn(input_seq, torch.rand((1, 3, 100)))[1]"
141 | ]
142 | },
143 | {
144 | "cell_type": "markdown",
145 | "id": "f6c9d521-de64-44c7-a8ca-64cc2bc82307",
146 | "metadata": {},
147 | "source": [
148 | "- We see two sets of weights if we do more than one layer\n",
149 | "- Note that the $W_{ih}$ weight of the second layer is 100$\\times$100 since the input vectors for the second layer of the RNN are 100-dimensional vectors"
150 | ]
151 | },
152 | {
153 | "cell_type": "code",
154 | "execution_count": null,
155 | "id": "58236eba-2c31-4270-949e-377b9b606e73",
156 | "metadata": {},
157 | "outputs": [],
158 | "source": [
159 | "two_layer_basic_rnn = nn.RNN(input_size=300, hidden_size=100, num_layers=2, batch_first=True)\n",
160 | "for name, param in two_layer_basic_rnn.named_parameters():\n",
161 | " print(name, param.shape)"
162 | ]
163 | },
164 | {
165 | "cell_type": "markdown",
166 | "id": "f955c217-72b6-465e-a414-1245f2e838b2",
167 | "metadata": {},
168 | "source": [
169 | "- The shape of our output changes slightly\n",
170 | "- The first element are the hidden states of the top/last layer\n",
171 | "- The second element are the hidden states output by the two layers (let's one use this as input to a new RNN)"
172 | ]
173 | },
174 | {
175 | "cell_type": "code",
176 | "execution_count": null,
177 | "id": "f94a3b0e-9a58-4fee-a3db-ca603938c61f",
178 | "metadata": {},
179 | "outputs": [],
180 | "source": [
181 | "output = two_layer_basic_rnn(input_seq)\n",
182 | "for element in output:\n",
183 | " print(element.shape)"
184 | ]
185 | },
186 | {
187 | "cell_type": "code",
188 | "execution_count": null,
189 | "id": "85587e0b-0d68-4c58-85b4-7873bcd88404",
190 | "metadata": {},
191 | "outputs": [],
192 | "source": [
193 | "# Vanilla RNN using nn.RNN\n",
194 | "class Vanilla_RNN(nn.Module):\n",
195 | " def __init__(self, input_size, hidden_size, output_size):\n",
196 | " super(Vanilla_RNN, self).__init__()\n",
197 | " self.rnn = nn.RNN(input_size, hidden_size)\n",
198 | " \n",
199 | " # here is our g function from the lecture slides\n",
200 | " # linear layer turning the i-th hidden state into the i-th output\n",
201 | " self.g = nn.Linear(hidden_size, output_size)\n",
202 | "\n",
203 | " def forward(self, x):\n",
204 | " \n",
205 | " out, hidden = self.rnn(x)\n",
206 | " out = self.g(out)\n",
207 | "\n",
208 | " return out, hidden\n",
209 | " \n",
210 | "v_rnn = Vanilla_RNN(300, 100, 50)"
211 | ]
212 | },
213 | {
214 | "cell_type": "code",
215 | "execution_count": null,
216 | "id": "65fa67c8-d7e9-4748-9367-bf38f40b1754",
217 | "metadata": {},
218 | "outputs": [],
219 | "source": [
220 | "# output dimension has changing because we did a linear layer from 100-dim to 50-dim\n",
221 | "for output in v_rnn(input_seq):\n",
222 | " print(output.shape)"
223 | ]
224 | },
225 | {
226 | "cell_type": "markdown",
227 | "id": "9cd26686-5f00-4e29-bb71-ecc3dd7c9f9f",
228 | "metadata": {},
229 | "source": [
230 | "## Fancier RNN architectures"
231 | ]
232 | },
233 | {
234 | "cell_type": "markdown",
235 | "id": "f3231765-bed1-48d9-bd5f-ea1f1c9e41bc",
236 | "metadata": {},
237 | "source": [
238 | "- `nn.GRU` works almost identically to the `nn.RNN` (more parameters inside the $f$ function)\n",
239 | "- ``nn.LSTM`` is slightly different in that it also has a cell state. So the second output element is a tupe of *(final hidden state, final cell state)*"
240 | ]
241 | },
242 | {
243 | "cell_type": "code",
244 | "execution_count": null,
245 | "id": "5394ee59-6450-4528-a697-cd4806cfaac0",
246 | "metadata": {},
247 | "outputs": [],
248 | "source": [
249 | "basic_gru = nn.GRU(input_size=300, hidden_size=100, num_layers=1, batch_first=True)\n",
250 | "for output in basic_gru(input_seq):\n",
251 | " print(output.shape)"
252 | ]
253 | },
254 | {
255 | "cell_type": "code",
256 | "execution_count": null,
257 | "id": "68119347-e598-4566-9b91-9a704dca30c8",
258 | "metadata": {},
259 | "outputs": [],
260 | "source": [
261 | "basic_lstm = nn.LSTM(input_size=300, hidden_size=100, num_layers=1, batch_first=True)\n",
262 | "for output in basic_lstm(input_seq):\n",
263 | " try:\n",
264 | " print(output.shape)\n",
265 | " except:\n",
266 | " name = 'hidden'\n",
267 | " for ele in output:\n",
268 | " print(f'{name} state size:', ele.shape)\n",
269 | " name = 'cell'"
270 | ]
271 | },
272 | {
273 | "cell_type": "markdown",
274 | "id": "b718ce26-12a5-4514-b800-0bcd019c557a",
275 | "metadata": {},
276 | "source": [
277 | "## Generating Text\n",
278 | "- Idea: Take a text and use the shifted text as target"
279 | ]
280 | },
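Concretely, a tiny illustration (not from the notebook) of the shifted-target idea on a lemmatized token list:

```python
tokens = ['this', 'film', 'be', 'great']  # cleaned/lemmatized tokens, as in df_clean

x = tokens[:-1]  # ['this', 'film', 'be']
y = tokens[1:]   # ['film', 'be', 'great'] -- predict the next word at every position
```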
281 | {
282 | "cell_type": "code",
283 | "execution_count": null,
284 | "id": "e773ccf6-7431-494d-9f8f-abd61a75919b",
285 | "metadata": {},
286 | "outputs": [],
287 | "source": [
288 | "df_clean = pd.read_csv('course_data/IMDB_cleaned.csv')\n",
289 | "df_clean.head()"
290 | ]
291 | },
292 | {
293 | "cell_type": "code",
294 | "execution_count": null,
295 | "id": "26d3773f-05fa-4fe1-82b7-599c6544bbd5",
296 | "metadata": {},
297 | "outputs": [],
298 | "source": [
299 | "# count words, send infrequent to unknown\n",
300 | "from collections import Counter\n",
301 | "\n",
302 | "reviews = [review.split(' ') for review in list(df_clean['cleaned'])]\n",
303 | "word_freq = dict(Counter([token for review in reviews for token in review]).most_common())\n",
304 | "print(len(word_freq))\n",
305 | "min_freq = 50\n",
306 | "word_dict = {}\n",
307 | "\n",
308 | "# sending all the unknowns to 0\n",
309 | "i = 1\n",
310 | "for word in word_freq:\n",
311 | " if word_freq[word] > min_freq:\n",
312 | " word_dict[word] = i\n",
313 | " i += 1\n",
314 | " else:\n",
315 | " word_dict[word] = 0\n",
316 | "\n",
317 | "# dictionary length \n",
318 | "dict_length = max(word_dict.values()) + 1\n",
319 | "dict_length"
320 | ]
321 | },
322 | {
323 | "cell_type": "code",
324 | "execution_count": null,
325 | "id": "0d68badd-ab98-4b5e-94a9-f01173b409b3",
326 | "metadata": {},
327 | "outputs": [],
328 | "source": [
329 | "# clean out unknown tokens for simplicity\n",
330 | "df_cleaner = pd.DataFrame(list(df_clean.apply(lambda x:\n",
331 | " {'cleaned': ' '.join([token for token in x['cleaned'].split(' ') if word_dict[token] != 0]),\n",
332 | " 'sentiment':x['sentiment']}, axis=1)))"
333 | ]
334 | },
335 | {
336 | "cell_type": "code",
337 | "execution_count": null,
338 | "id": "a26f31ab-c6d2-4275-a6a9-079c213d05b8",
339 | "metadata": {},
340 | "outputs": [],
341 | "source": [
342 | "# clean out reviews that are too short\n",
343 | "min_length = 12\n",
344 | "print(len(df_clean))\n",
345 | "df_cleaner = df_cleaner[df_cleaner.apply(lambda x: len(x['cleaned'].split(' ')) >= min_length, axis=1)].reset_index(drop=True)\n",
346 | "len(df_cleaner)"
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": null,
352 | "id": "0985d6d9-42c4-4980-9241-e25346bf3ea0",
353 | "metadata": {},
354 | "outputs": [],
355 | "source": [
356 | "import random\n",
357 | "\n",
358 | "# max length here will be maximum length of the sequence predicted\n",
359 | "class IMDBDataset(Dataset):\n",
360 | " def __init__(self, df, word_dict, max_length):\n",
361 | " self.df = df\n",
362 | " self.word_dict = word_dict\n",
363 | " self.sent_dict = {'negative': 0, 'positive': 1}\n",
364 | " self.max_len = max_length\n",
365 | " \n",
366 | " def __len__(self):\n",
367 | " return len(self.df)\n",
368 | " \n",
369 | " def __getitem__(self, idx):\n",
370 | " row = self.df.iloc[idx]\n",
371 | " review = row['cleaned'].split(' ')\n",
372 | " \n",
373 | " \n",
374 | " x = torch.zeros(self.max_len-1)\n",
375 | " y = torch.zeros(self.max_len-1)\n",
376 | " \n",
377 | " starting_point = random.randint(0,len(review) - (self.max_len))\n",
378 | " \n",
379 | " # get reviews as a list of integers\n",
380 | " for idx in range(self.max_len-1):\n",
381 | " x[idx] = self.word_dict[review[starting_point + idx]]\n",
382 | " y[idx] = self.word_dict[review[starting_point + idx + 1]]\n",
383 | " \n",
384 | " \n",
385 | " # embedding likes long tensors\n",
386 | " return x.long(), y.long()\n",
387 | "ds = IMDBDataset(df_cleaner, word_dict, 10)\n",
388 | "\n",
389 | "# target is the input review shifted over one\n",
390 | "# i.e. predict next word from first part of the sequence\n",
391 | "next(iter(ds))"
392 | ]
393 | },
394 | {
395 | "cell_type": "code",
396 | "execution_count": null,
397 | "id": "72000546-06ee-4b37-a96c-c42a00919a2b",
398 | "metadata": {},
399 | "outputs": [],
400 | "source": [
401 | "dl = DataLoader(ds, batch_size = 1000, shuffle=True)\n",
402 | "for element in tqdm(dl):\n",
403 | " None"
404 | ]
405 | },
406 | {
407 | "cell_type": "code",
408 | "execution_count": null,
409 | "id": "41bbd5ac-5b0d-4c27-8623-f6e1a15dea5a",
410 | "metadata": {},
411 | "outputs": [],
412 | "source": [
413 | "# create a model to generate a synthetic review\n",
414 | "class LSTM_Gen(nn.Module):\n",
415 | " def __init__(self, word_dict, embedding_size, hidden_size):\n",
416 | " super(LSTM_Gen, self).__init__()\n",
417 | " self.word_dict = word_dict\n",
418 | " self.hidden_size = hidden_size\n",
419 | " \n",
420 | " # integer to word dictionary\n",
421 | " self.idx2word = dict([(x, y) for x, y in zip(self.word_dict.values(), self.word_dict.keys())])\n",
422 | " self.idx2word[0] = 'UNK'\n",
423 | " \n",
424 | " # length of dictionary\n",
425 | " dict_length = max(word_dict.values()) + 1\n",
426 | " \n",
427 | " # embed the words\n",
428 | " self.emb = nn.Embedding(dict_length, embedding_size)\n",
429 | " \n",
430 | " # pass through an LSTM\n",
431 | " self.lstm = nn.LSTM(embedding_size, hidden_size)\n",
432 | " \n",
433 | " # send output through a linear layer\n",
434 | " self.linear = nn.Linear(hidden_size, dict_length)\n",
435 | "\n",
436 | " def forward(self, x):\n",
437 | " x = self.emb(x)\n",
438 | " out, hidden = self.lstm(x)\n",
439 | " out = self.linear(out)\n",
440 | "\n",
441 | " return out.permute((0, 2, 1))\n",
442 | " \n",
443 | " # method to generate sequence using LSTM module\n",
444 | " def gen_seq(self, start_token, seq_length):\n",
445 | " print(start_token)\n",
446 | " softmax = nn.LogSoftmax(dim=2)\n",
447 | " \n",
448 | " # embedding of start token\n",
449 | " next_emb = self.emb(torch.tensor([[self.word_dict[start_token]]]))\n",
450 | " \n",
451 | " # initial hidden/cell states\n",
452 | " next_state = (torch.zeros((1,1,self.hidden_size)), torch.zeros((1,1,self.hidden_size)))\n",
453 | " \n",
454 | " # generate a sequence!\n",
455 | " for i in range(seq_length):\n",
456 | " # use the hidden/cell states for input into next pass through LSTM layer\n",
457 | " out, next_state = self.lstm(next_emb, next_state)\n",
458 | " \n",
459 | " # make prediction\n",
460 | " y_pred = self.linear(out)\n",
461 | " next_idx = torch.argmax(softmax(y_pred), dim=2)\n",
462 | " print(self.idx2word[torch.squeeze(next_idx).item()])\n",
463 | " \n",
464 | " # embed prediction for input into next pass\n",
465 | " next_emb = self.emb(next_idx)\n",
466 | " \n",
467 | "\n",
468 | "lstm_model = LSTM_Gen(word_dict, embedding_size=100, hidden_size=100)"
469 | ]
470 | },
471 | {
472 | "cell_type": "code",
473 | "execution_count": null,
474 | "id": "4fd675d8-e08e-414b-b608-bb6e3a3d5b8e",
475 | "metadata": {},
476 | "outputs": [],
477 | "source": [
478 | "lstm_model.gen_seq('first', 10)"
479 | ]
480 | },
481 | {
482 | "cell_type": "code",
483 | "execution_count": null,
484 | "id": "48f49f96-c10f-41b1-9874-cb3f85fd0475",
485 | "metadata": {},
486 | "outputs": [],
487 | "source": [
488 | "def one_pass(model, dataloader, optimizer, lossFun, backwards=True, print_loss=False):\n",
489 | " \n",
490 | " if backwards == True:\n",
491 | " model.train()\n",
492 | " else:\n",
493 | " model.eval()\n",
494 | " \n",
495 | " total_loss = 0.0\n",
496 | " for x, y in tqdm(dataloader):\n",
497 | " \n",
498 | " y_pred = model(x)\n",
499 | " loss = lossFun(y_pred, y)\n",
500 | " total_loss += loss.item()\n",
501 | " \n",
502 | " if backwards == True:\n",
503 | " optimizer.zero_grad()\n",
504 | " loss.backward()\n",
505 | " optimizer.step()\n",
506 | " avg_loss = total_loss / len(dataloader)\n",
507 | " \n",
508 | " if print_loss == True:\n",
509 | " print(avg_loss)\n",
510 | " \n",
511 | " return avg_loss\n",
512 | "\n",
513 | "def one_pass_acc(model, dataloader, num_points):\n",
514 | " model.eval()\n",
515 | " total_incorrect = 0\n",
516 | " \n",
517 | " softmax = nn.LogSoftmax(dim=1)\n",
518 | " \n",
519 | " for x, y in dataloader:\n",
520 | " y_pred = torch.argmax(softmax(model(x)), dim=1)\n",
521 | " total_incorrect += torch.count_nonzero(y - y_pred).item()\n",
522 | " \n",
523 | " percent_wrong = total_incorrect / num_points\n",
524 | " return 1 - percent_wrong"
525 | ]
526 | },
527 | {
528 | "cell_type": "code",
529 | "execution_count": null,
530 | "id": "8b3a5e71-2bdf-4d32-bdc8-8097e1cab9a4",
531 | "metadata": {},
532 | "outputs": [],
533 | "source": [
534 | "total = sum(list(word_freq.values()))\n",
535 | "\n",
536 | "# need to weight the cross entropy loss because of imbalanced dataset\n",
537 | "weights = [0]\n",
538 | "for value in word_freq.values():\n",
539 | " weights.append(total / (dict_length * value))\n",
540 | "\n",
541 | "nn.CrossEntropyLoss(weight=torch.tensor(weights))\n",
542 | "\n",
543 | "optimizer = optim.Adam(lstm_model.parameters(), lr = 0.01)"
544 | ]
545 | },
546 | {
547 | "cell_type": "code",
548 | "execution_count": null,
549 | "id": "c63aba79-7ee8-4aa7-81c3-8d4de86b6876",
550 | "metadata": {},
551 | "outputs": [],
552 | "source": [
553 | "num_epochs = 2\n",
554 | "\n",
555 | "for epoch in tqdm(range(num_epochs)):\n",
556 | " print('Epoch: ', epoch)\n",
557 | " \n",
558 | " loss = one_pass(lstm_model, dl, optimizer, lossFun)\n",
559 | " print('Loss: ', loss)"
560 | ]
561 | },
562 | {
563 | "cell_type": "code",
564 | "execution_count": null,
565 | "id": "d23affae-678d-42b4-bc45-a2ffd3b3b9ad",
566 | "metadata": {},
567 | "outputs": [],
568 | "source": [
569 | "lstm_model.gen_seq('film', 10)"
570 | ]
571 | },
572 | {
573 | "cell_type": "markdown",
574 | "id": "9177511c-553f-47cb-9547-9f188f659d4b",
575 | "metadata": {},
576 | "source": [
577 | "## Seq2Seq\n",
578 | "- Great introduction [here](https://github.com/bentrevett/pytorch-seq2seq)"
579 | ]
580 | },
581 | {
582 | "cell_type": "code",
583 | "execution_count": null,
584 | "id": "98c947aa-ef5b-41a5-9b4c-a765f0bf15a9",
585 | "metadata": {},
586 | "outputs": [],
587 | "source": [
588 | "# 30,000 english-german sentences\n",
589 | "from torchtext.datasets import Multi30k\n",
590 | "train_data, valid_data, test_data = Multi30k()"
591 | ]
592 | },
593 | {
594 | "cell_type": "code",
595 | "execution_count": null,
596 | "id": "ba4b3bc9-4fc7-46d3-90f0-05c0af0b11b6",
597 | "metadata": {},
598 | "outputs": [],
599 | "source": [
600 | "next(train_data)"
601 | ]
602 | },
603 | {
604 | "cell_type": "code",
605 | "execution_count": null,
606 | "id": "d246aeef-50d5-4bc9-9f30-c53935cbf018",
607 | "metadata": {},
608 | "outputs": [],
609 | "source": [
610 | "# need tokenizers for english and german\n",
611 | "import spacy\n",
612 | "#!python -m spacy download en_core_web_sm\n",
613 | "#!python -m spacy download de_core_news_sm\n",
614 | "\n",
615 | "spacy_de = spacy.load('de_core_news_sm')\n",
616 | "spacy_en = spacy.load('en_core_web_sm')"
617 | ]
618 | },
619 | {
620 | "cell_type": "code",
621 | "execution_count": null,
622 | "id": "96b81f94-1ecc-48e8-b303-4e4915d790e8",
623 | "metadata": {
624 | "tags": []
625 | },
626 | "outputs": [],
627 | "source": [
628 | "def clean_to_csv(iterator, path):\n",
629 | " rows = []\n",
630 | " for sent_de, sent_en in tqdm(iterator):\n",
631 | " # create a state of sentence token\n",
632 | " tokenized_text_de = ['']\n",
633 | " tokenized_text_en = ['']\n",
634 | " \n",
635 | " # no lemmatization for translation!\n",
636 | " for token in spacy_de(sent_de):\n",
637 | " if token.text not in ['.', '\\n']:\n",
638 | " tokenized_text_de.append(token.text.lower())\n",
639 | " for token in spacy_en(sent_en):\n",
640 | " if token.text not in ['.', '\\n']:\n",
641 | " tokenized_text_en.append(token.text.lower())\n",
642 | "        tokenized_text_de.append('<eos>')\n",
643 | "        tokenized_text_en.append('<eos>')\n",
644 | " row = {'english': tokenized_text_en,\n",
645 | " 'german': tokenized_text_de}\n",
646 | " rows.append(row)\n",
647 | " df = pd.DataFrame(rows)\n",
648 | " df.to_csv(path)\n",
649 | " return df\n",
650 | " \n",
651 | "df_train = clean_to_csv(train_data, 'course_data/Multi30k_train.csv')\n",
652 | "df_val = clean_to_csv(valid_data, 'course_data/Multi30k_val.csv')\n",
653 | "df_test = clean_to_csv(test_data, 'course_data/Multi30k_test.csv')\n",
654 | "df_train.head()"
655 | ]
656 | },
657 | {
658 | "cell_type": "code",
659 | "execution_count": null,
660 | "id": "f2e7ea64-cf1e-4d9c-9146-0be83ac27dda",
661 | "metadata": {},
662 | "outputs": [],
663 | "source": [
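664 | "# reading the CSV back gives each token list as one string, so strip the brackets and split;\n",
665 | "# each token keeps its quote characters, which is why strip(\"''\") is used when decoding later\n",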
664 | "df_train = pd.read_csv('course_data/Multi30k_train.csv').drop(columns=[\"Unnamed: 0\"]).applymap(lambda x: x.strip('][').split(', '))\n",
665 | "df_val = pd.read_csv('course_data/Multi30k_val.csv').drop(columns=[\"Unnamed: 0\"]).applymap(lambda x: x.strip('][').split(', '))\n",
666 | "df_test = pd.read_csv('course_data/Multi30k_test.csv').drop(columns=[\"Unnamed: 0\"]).applymap(lambda x: x.strip('][').split(', '))"
667 | ]
668 | },
669 | {
670 | "cell_type": "code",
671 | "execution_count": null,
672 | "id": "862e63e9-5373-48f3-b29d-bef115ae1753",
673 | "metadata": {},
674 | "outputs": [],
675 | "source": [
676 | "# build vocab from JUST training data (prevent data leakage)\n",
677 | "from collections import Counter\n",
678 | "\n",
679 | "def build_vocab(df, col_name, min_freq):\n",
680 | " all_words = [token for sentence in list(df[col_name]) for token in sentence if token != '\\n']\n",
681 | " \n",
682 | " word_freq = dict(Counter(all_words).most_common())\n",
683 | "    word_dict = {'<unk>' : 0}\n",
684 | " \n",
685 | " i = 0\n",
686 | " for word in word_freq:\n",
687 | " if word_freq[word] >= min_freq:\n",
688 | " word_dict[word] = i+1\n",
689 | " i += 1\n",
690 | " else:\n",
691 | " word_dict[word] = 0\n",
692 | " \n",
693 | " idx2word = dict([(x, y) for x, y in zip(word_dict.values(), word_dict.keys())])\n",
694 | "    idx2word[0] = '<unk>'\n",
695 | " \n",
696 | " return word_freq, word_dict, idx2word\n",
697 | "\n",
698 | "word_freq_en, word_dict_en, idx2word_en = build_vocab(df_train, 'english', 2)\n",
699 | "word_freq_de, word_dict_de, idx2word_de = build_vocab(df_train, 'german', 2)\n",
700 | "\n",
701 | "len(idx2word_en), len(idx2word_de)"
702 | ]
703 | },
704 | {
705 | "cell_type": "code",
706 | "execution_count": null,
707 | "id": "2372aaf7-5d6c-4849-925b-a486eb34a696",
708 | "metadata": {},
709 | "outputs": [],
710 | "source": [
711 | "max_len = 0\n",
712 | "for idx in range(len(df_train)):\n",
713 | " row = df_train.iloc[idx]\n",
714 | " if len(row['english']) > max_len:\n",
715 | " max_len = len(row['english'])\n",
716 | " if len(row['german']) > max_len:\n",
717 | " max_len = len(row['german'])\n",
718 | "max_len"
719 | ]
720 | },
721 | {
722 | "cell_type": "code",
723 | "execution_count": null,
724 | "id": "25288b46-914b-4220-bea0-abb1fcb3df78",
725 | "metadata": {},
726 | "outputs": [],
727 | "source": [
728 | "class en2deDataset(Dataset):\n",
729 | " def __init__(self, df, word_dict_en, word_dict_de, max_len):\n",
730 | " self.df = df\n",
731 | " self.word_dict_en = word_dict_en\n",
732 | " self.word_dict_de = word_dict_de\n",
733 | " self.max_len = max_len\n",
734 | " \n",
735 | " def __len__(self):\n",
736 | " return len(self.df)\n",
737 | " \n",
738 | " def __getitem__(self, idx):\n",
739 | " row = self.df.iloc[idx]\n",
740 | " sent_en = row['english']\n",
741 | " sent_de = row['german']\n",
742 | " \n",
743 | " x = torch.zeros(self.max_len)\n",
744 | " y = torch.zeros(self.max_len)\n",
745 | " \n",
746 | " # front pad english sentence\n",
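747 | "        # e.g. with max_len 6, ['<sos>', 'a', 'dog', '<eos>'] becomes [0, 0, sos, a, dog, eos] as indices (toy example)\n",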
747 | " for idx in range(len(sent_en)):\n",
748 | " # unknown words get sent to 0\n",
749 | " try:\n",
750 | " word_idx = self.word_dict_en[sent_en[idx]]\n",
751 | "            except KeyError:\n",
752 | " word_idx = 0\n",
753 | " x[self.max_len - len(sent_en) + idx] = word_idx\n",
754 | " \n",
755 | " # back pad german sentence\n",
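756 | "        # e.g. [sos, ein, hund, eos, 0, 0] -- the decoder consumes the target left to right (toy example)\n",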
756 | " for idx in range(len(sent_de)):\n",
757 | " # unknown words get sent to 0\n",
758 | " try:\n",
759 | " word_idx = self.word_dict_de[sent_de[idx]]\n",
760 | "            except KeyError:\n",
761 | " word_idx = 0\n",
762 | " y[idx] = word_idx\n",
763 | " \n",
764 | " # embedding likes long tensors\n",
765 | " return x.long(), y.long()"
766 | ]
767 | },
768 | {
769 | "cell_type": "code",
770 | "execution_count": null,
771 | "id": "764d63c1-52aa-4599-a09a-f131e9d38708",
772 | "metadata": {},
773 | "outputs": [],
774 | "source": [
775 | "ds_train = en2deDataset(df_train, word_dict_en, word_dict_de, max_len)\n",
776 | "next(iter(ds_train))"
777 | ]
778 | },
779 | {
780 | "cell_type": "code",
781 | "execution_count": null,
782 | "id": "e5720819-5ec4-4f14-9079-af54399c130a",
783 | "metadata": {},
784 | "outputs": [],
785 | "source": [
786 | "ds_train = en2deDataset(df_train, word_dict_en, word_dict_de, max_len)\n",
787 | "dl_train = DataLoader(ds_train, batch_size=100, shuffle=True)\n",
788 | "\n",
789 | "ds_val = en2deDataset(df_val, word_dict_en, word_dict_de, max_len)\n",
790 | "dl_val = DataLoader(ds_val, batch_size=100, shuffle=False)\n",
791 | "\n",
792 | "ds_test = en2deDataset(df_test, word_dict_en, word_dict_de, max_len)\n",
793 | "dl_test = DataLoader(ds_test, batch_size=100, shuffle=False)\n",
794 | " \n",
795 | "next(iter(dl_train))"
796 | ]
797 | },
798 | {
799 | "cell_type": "code",
800 | "execution_count": null,
801 | "id": "5982b8fb-92e3-4032-909a-ab070b2e8d7d",
802 | "metadata": {},
803 | "outputs": [],
804 | "source": [
805 | "# now we define a simple Encoder with an LSTM\n",
806 | "class Encoder(nn.Module):\n",
807 | " def __init__(self, dict_length_en, emb_size, hidden_size):\n",
808 | " super().__init__()\n",
809 | " \n",
810 | " self.emb_en = nn.Embedding(dict_length_en, emb_size)\n",
811 | " self.rnn = nn.LSTM(input_size=emb_size, hidden_size=hidden_size, batch_first=True)\n",
812 | " \n",
813 | " def forward(self, x):\n",
814 | " \n",
815 | " # don't need the outputs, just the hidden/cell states for input into the decoder\n",
816 | " outputs, (hidden, cell) = self.rnn(self.emb_en(x))\n",
817 | " \n",
818 | " return hidden, cell"
819 | ]
820 | },
821 | {
822 | "cell_type": "code",
823 | "execution_count": null,
824 | "id": "95df86d1-b88f-437f-bbda-71e0c204a988",
825 | "metadata": {},
826 | "outputs": [],
827 | "source": [
828 | "# for the decoder, we need the states from the encoder as input as well as the target sentence\n",
829 | "# the forward pass represents the prediction of a single German word (the next word in the sentence)\n",
830 | "class Decoder(nn.Module):\n",
831 | " def __init__(self, dict_length_de, emb_size, hidden_size):\n",
832 | " super().__init__()\n",
833 | " \n",
834 | " self.emb_de = nn.Embedding(dict_length_de, emb_size)\n",
835 | " self.rnn = nn.LSTM(input_size=emb_size, hidden_size=hidden_size, batch_first=True)\n",
836 | " \n",
837 | " # output function\n",
838 | " self.linear = nn.Linear(hidden_size, dict_length_de)\n",
839 | " \n",
840 | " def forward(self, input_word, hidden, cell):\n",
841 | " \n",
842 | " input_emb = self.emb_de(input_word)\n",
843 | " \n",
844 | " # output the next hidden/cell states\n",
845 | " output, (hidden, cell) = self.rnn(input_emb, (hidden, cell))\n",
846 | " \n",
847 | " # prediction for next word\n",
848 | " output = self.linear(output)\n",
849 | " \n",
850 | " return output, (hidden, cell)"
851 | ]
852 | },
853 | {
854 | "cell_type": "code",
855 | "execution_count": null,
856 | "id": "048f9dd6-971c-4f7c-87ae-c3d03bd8a250",
857 | "metadata": {},
858 | "outputs": [],
859 | "source": [
860 | "class Seq2Seq(nn.Module):\n",
861 | " def __init__(self, dict_length_en, dict_length_de, emb_size, hidden_size, max_len):\n",
862 | " super().__init__()\n",
863 | " \n",
864 | " self.encoder = Encoder(dict_length_en, emb_size, hidden_size)\n",
865 | " self.decoder = Decoder(dict_length_de, emb_size, hidden_size)\n",
866 | " self.softmax = nn.LogSoftmax(dim=2)\n",
867 | " self.max_len = max_len\n",
868 | " self.output_size = dict_length_de\n",
869 | " \n",
870 | " def forward(self, x, y):\n",
871 | " \n",
872 | " hidden, cell = self.encoder(x)\n",
873 | " \n",
874 | " next_word = y[:, 0:1]\n",
875 | " prediction = torch.zeros((y.shape[0], self.output_size, y.shape[1]))\n",
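876 | "        # prediction holds class scores with shape (batch, vocab, seq_len); dim 1 is the class dimension\n",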
876 | " \n",
877 | "        # first token is always <sos>\n",
878 | " prediction[:, 1, 0] = 1\n",
879 | " \n",
880 | " for i in range(self.max_len-1):\n",
881 | " \n",
882 | " output, (hidden, cell) = self.decoder(next_word, hidden, cell)\n",
883 | " prediction[:, :, i+1] = torch.squeeze(output)\n",
884 | " \n",
885 | " # can implement teacher forcing here (sometimes use target word rather than predicted word for next token)\n",
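886 | "            # a coin flip each step: > 0.5 feeds back the model's own prediction, otherwise the ground-truth word\n",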
886 | " teacher_forcing_prob = random.uniform(0, 1)\n",
887 | " #teacher_forcing_prob = 1.0\n",
888 | " if teacher_forcing_prob > 0.5:\n",
889 | " next_word = torch.argmax(self.softmax(output), dim=2)\n",
890 | " else:\n",
891 | " next_word = y[:, (i+1):(i+2)]\n",
892 | " \n",
893 | " \n",
894 | " return prediction"
895 | ]
896 | },
897 | {
898 | "cell_type": "code",
899 | "execution_count": null,
900 | "id": "0317aa05-2b18-4a04-b2b9-2f485da6e8ae",
901 | "metadata": {},
902 | "outputs": [],
903 | "source": [
904 | "model = Seq2Seq(len(idx2word_en), len(idx2word_de), 100, 100, max_len)"
905 | ]
906 | },
907 | {
908 | "cell_type": "code",
909 | "execution_count": null,
910 | "id": "fe7d7f7b-1556-47af-b58d-6495de8558ef",
911 | "metadata": {},
912 | "outputs": [],
913 | "source": [
914 | "x, y = next(iter(dl_train))\n",
915 | "model(x, y)"
916 | ]
917 | },
918 | {
919 | "cell_type": "code",
920 | "execution_count": null,
921 | "id": "31a7ec45-7b31-4a30-9b48-629f416a3d6c",
922 | "metadata": {},
923 | "outputs": [],
924 | "source": [
925 | "def one_pass(model, dataloader, optimizer, lossFun, backwards=True, print_loss=False):\n",
926 | " \n",
927 | " if backwards == True:\n",
928 | " model.train()\n",
929 | " else:\n",
930 | " model.eval()\n",
931 | " \n",
932 | " total_loss = 0.0\n",
933 | " for x, y in tqdm(dataloader):\n",
934 | " \n",
935 | " y_pred = model(x, y)\n",
936 | " loss = lossFun(y_pred, y)\n",
937 | " total_loss += loss.item()\n",
938 | " \n",
939 | " if backwards == True:\n",
940 | " optimizer.zero_grad()\n",
941 | " loss.backward()\n",
942 | " optimizer.step()\n",
943 | " avg_loss = total_loss / len(dataloader)\n",
944 | " \n",
945 | " if print_loss == True:\n",
946 | " print(avg_loss)\n",
947 | " \n",
948 | " return avg_loss\n",
949 | "\n",
950 | "def one_pass_acc(model, dataloader, num_points):\n",
951 | " model.eval()\n",
952 | " total_incorrect = 0\n",
953 | " \n",
954 | " softmax = nn.LogSoftmax(dim=1)\n",
955 | " \n",
956 | " for x, y in dataloader:\n",
957 | " y_pred = torch.argmax(softmax(model(x, y)), dim=1)\n",
958 | " total_incorrect += torch.count_nonzero(y - y_pred).item()\n",
959 | " \n",
960 | " percent_wrong = total_incorrect / num_points\n",
961 | " return 1 - percent_wrong"
962 | ]
963 | },
964 | {
965 | "cell_type": "code",
966 | "execution_count": null,
967 | "id": "a83924ef-41cf-4216-9f9a-ad0eebb5a682",
968 | "metadata": {},
969 | "outputs": [],
970 | "source": [
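971 | "# nn.CrossEntropyLoss accepts (batch, num_classes, seq_len) scores with (batch, seq_len) integer targets,\n",
972 | "# which matches the prediction tensor built in Seq2Seq.forward\n",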
971 | "lossFun = nn.CrossEntropyLoss()\n",
972 | "optimizer = optim.Adam(model.parameters(), lr = 0.01)"
973 | ]
974 | },
975 | {
976 | "cell_type": "code",
977 | "execution_count": null,
978 | "id": "ec0c39de-4557-4835-a005-d3f75c04a800",
979 | "metadata": {},
980 | "outputs": [],
981 | "source": [
982 | "num_epochs = 2\n",
983 | "\n",
984 | "for epoch in tqdm(range(num_epochs)):\n",
985 | " print('Epoch: ', epoch)\n",
986 | " \n",
987 | " loss_train = one_pass(model, dl_train, optimizer, lossFun)\n",
988 | " print('Loss: ', loss_train)\n",
989 | " \n",
990 | " #acc_train = one_pass_acc(model, dl_train, len(ds_train))\n",
991 | " #print('Accuracy: ', acc_train)"
992 | ]
993 | },
994 | {
995 | "cell_type": "code",
996 | "execution_count": null,
997 | "id": "362d9ef2-7040-4862-9a16-2a35919e65da",
998 | "metadata": {},
999 | "outputs": [],
1000 | "source": [
1001 | "# test a translation\n",
1002 | "softmax = nn.LogSoftmax(dim=1)\n",
1003 | "\n",
1004 | "# one batch\n",
1005 | "x, y = next(iter(dl_train))\n",
1006 | "y_pred = model(x, y)\n",
1007 | "# english sentence\n",
1008 | "sent_en = []\n",
1009 | "for index in x[0]:\n",
1010 | " next_word = idx2word_en[index.item()].strip(\"''\")\n",
1011 | "    if next_word not in ['<sos>', '<eos>', '<unk>']:\n",
1012 | " sent_en.append(next_word)\n",
1013 | "print(' '.join(sent_en))\n",
1014 | "\n",
1015 | "sent_de = []\n",
1016 | "for index in torch.argmax(model.softmax(y_pred), dim=1)[0]:\n",
1017 | " next_word = idx2word_de[index.item()].strip(\"''\")\n",
1018 | "    if next_word not in ['<sos>', '<eos>', '<unk>']:\n",
1019 | " sent_de.append(next_word)\n",
1020 | "print(' '.join(sent_de))"
1021 | ]
1022 | },
1023 | {
1024 | "cell_type": "markdown",
1025 | "id": "2e0d30e3-c82f-4b39-8bbb-95eaa43a4d43",
1026 | "metadata": {},
1027 | "source": [
1028 | "## Custom Loss Function"
1029 | ]
1030 | },
1031 | {
1032 | "cell_type": "code",
1033 | "execution_count": null,
1034 | "id": "b45b95dc-48a8-4e67-b679-152da323bf47",
1035 | "metadata": {},
1036 | "outputs": [],
1037 | "source": [
1038 | "class some_loss(nn.Module):\n",
1039 | " def __init__(self, hyperparam):\n",
1040 | " super(some_loss, self).__init__()\n",
1041 | " self.hyperparam = hyperparam\n",
1042 | " \n",
1043 | " \n",
1044 | " def forward(self, y_pred, y):\n",
1045 | " diff = y_pred - y\n",
1046 | " \n",
1047 | " # average over each entry and batch size\n",
1048 | "        return torch.norm(diff) / torch.numel(diff)",
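1049 | "\n",
1050 | "# a minimal usage sketch with hypothetical values: any nn.Module whose forward\n",
1051 | "# takes (y_pred, y) and returns a scalar tensor can be used wherever lossFun was above\n",
1052 | "customLoss = some_loss(hyperparam=0.1)\n",
1053 | "customLoss(torch.randn(4, 10), torch.randn(4, 10))"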
1050 | ]
1051 | }
1052 | ],
1053 | "metadata": {
1054 | "kernelspec": {
1055 | "display_name": "Python 3 (ipykernel)",
1056 | "language": "python",
1057 | "name": "python3"
1058 | },
1059 | "language_info": {
1060 | "codemirror_mode": {
1061 | "name": "ipython",
1062 | "version": 3
1063 | },
1064 | "file_extension": ".py",
1065 | "mimetype": "text/x-python",
1066 | "name": "python",
1067 | "nbconvert_exporter": "python",
1068 | "pygments_lexer": "ipython3",
1069 | "version": "3.8.10"
1070 | }
1071 | },
1072 | "nbformat": 4,
1073 | "nbformat_minor": 5
1074 | }
1075 |
--------------------------------------------------------------------------------
/Notebooks/Neptune_PyTorch.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "500564d8-f85b-43b9-bff1-90c0bd51b941",
6 | "metadata": {},
7 | "source": [
8 | "## Using neptune.ai with PyTorch to log information during model development\n",
9 | "\n",
10 | "### by Michael Ruddy"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": null,
16 | "id": "be3f7b10-9d68-48e3-9286-12e0a0a63fb0",
17 | "metadata": {},
18 | "outputs": [],
19 | "source": [
20 | "import numpy as np\n",
21 | "import pandas as pd\n",
22 | "from tqdm.notebook import tqdm\n",
23 | "\n",
24 | "# PyTorch stuff\n",
25 | "import torch, torchvision\n",
26 | "from torch.utils.data import Dataset\n",
27 | "from torch.utils.data import DataLoader\n",
28 | "import torch.nn as nn\n",
29 | "import torch.optim as optim\n",
30 | "from torchvision import models, transforms\n",
31 | "\n",
32 | "# Neptune\n",
33 | "import neptune.new as neptune"
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "id": "1d9a01b0-fc0e-4eae-9b58-ee327ba1c176",
39 | "metadata": {},
40 | "source": [
41 | "Let's use the MNIST dataset to test out these features."
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": null,
47 | "id": "a7f0b673-95b0-499b-92f3-6652dcc96ea9",
48 | "metadata": {},
49 | "outputs": [],
50 | "source": [
51 | "# load up the MNIST dataset\n",
52 | "trnsfm = transforms.Compose(\n",
53 | " [transforms.ToTensor(),\n",
54 | " transforms.Normalize((.5), (.5))])\n",
55 | "\n",
56 | "ds_train = torchvision.datasets.MNIST(root='./data', train=True,\n",
57 | " download=True, transform=trnsfm)\n",
58 | "ds_val = torchvision.datasets.MNIST(root='./data', train=False,\n",
59 | " download=True, transform=trnsfm)\n",
60 | "\n",
61 | "# I'm going to do more than one \"run\" in this notebook\n",
62 | "global_hyperparam = {'N_train':len(ds_train),\n",
63 | " 'N_val':len(ds_val)}\n",
64 | "\n",
65 | "batch_size = 4\n",
66 | "global_hyperparam['batch_size'] = batch_size\n",
67 | "\n",
68 | "# dataloaders\n",
69 | "dl_train = torch.utils.data.DataLoader(ds_train, batch_size=batch_size,\n",
70 | " shuffle=True, num_workers=2)\n",
71 | "dl_val = torch.utils.data.DataLoader(ds_val, batch_size=len(ds_val),\n",
72 | " shuffle=False, num_workers=2)"
73 | ]
74 | },
75 | {
76 | "cell_type": "markdown",
77 | "id": "02e33878-00bc-47e2-a1f8-7e26e13817e8",
78 | "metadata": {},
79 | "source": [
80 | "And a very simple CNN architecture."
81 | ]
82 | },
83 | {
84 | "cell_type": "code",
85 | "execution_count": null,
86 | "id": "750e41cc-9d6b-4274-9ba9-1564055c8336",
87 | "metadata": {},
88 | "outputs": [],
89 | "source": [
90 | "# simple CNN\n",
91 | "class small_CNN(nn.Module):\n",
92 | " def __init__(self):\n",
93 | " super().__init__()\n",
94 | " \n",
95 | " self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)\n",
96 | " self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)\n",
97 | " self.conv3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)\n",
98 | " \n",
99 | " self.linear1 = nn.Linear(64*7*7, 100)\n",
100 | " self.linear2 = nn.Linear(100, 10)\n",
101 | " \n",
102 | " self.pool = nn.MaxPool2d(kernel_size=2)\n",
103 | " self.relu = nn.ReLU()\n",
104 | " self.unroll = nn.Flatten()\n",
105 | " \n",
106 | " def forward(self, x):\n",
107 | " \n",
108 | " x = self.conv1(x)\n",
109 | " x = self.relu(x)\n",
110 | " x = self.pool(x)\n",
111 | " \n",
112 | " x = self.conv2(x)\n",
113 | " x = self.relu(x)\n",
114 | " x = self.pool(x)\n",
115 | " \n",
116 | " x = self.conv3(x)\n",
117 | " x = self.relu(x)\n",
118 | " \n",
119 | " x = self.linear1(self.unroll(x))\n",
120 | " x = self.relu(x)\n",
121 | " x = self.linear2(x)\n",
122 | " \n",
123 | " return x"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "id": "214377f6-6e06-4ee6-97ed-5a82ed128389",
129 | "metadata": {},
130 | "source": [
131 | "Here we have some simple functions to run one pass through the data each epoch and to compute accuracy."
132 | ]
133 | },
134 | {
135 | "cell_type": "code",
136 | "execution_count": null,
137 | "id": "9c982a64-2cf9-41b9-bcb3-d5742ac07e25",
138 | "metadata": {},
139 | "outputs": [],
140 | "source": [
141 | "def one_pass(model, dataloader, optimizer, lossFun, backwards=True, print_loss=False, log=None):\n",
142 | " \n",
143 | " if backwards == True:\n",
144 | " model.train()\n",
145 | " else:\n",
146 | " model.eval()\n",
147 | " \n",
148 | " total_loss = 0.0\n",
149 | " for x, y in tqdm(dataloader):\n",
150 | " \n",
151 | " y_pred = model(x)\n",
152 | " loss = lossFun(y_pred, y)\n",
153 | " total_loss += loss.item()\n",
154 | " \n",
155 | " # pass the key name to log the loss each batch\n",
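156 | "        # (log relies on the global neptune run object created in the cells below)\n",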
156 | " if log:\n",
157 | " run[log].log(loss.item())\n",
158 | " \n",
159 | " if backwards == True:\n",
160 | " optimizer.zero_grad()\n",
161 | " loss.backward()\n",
162 | " optimizer.step()\n",
163 | " avg_loss = total_loss / len(dataloader)\n",
164 | " \n",
165 | " if print_loss == True:\n",
166 | " print(avg_loss)\n",
167 | " \n",
168 | " return avg_loss\n",
169 | "\n",
170 | "\n",
171 | "def one_pass_acc(model, dataloader, num_points):\n",
172 | " model.eval()\n",
173 | " total_incorrect = 0\n",
174 | " \n",
175 | " softmax = nn.LogSoftmax(dim=1)\n",
176 | " \n",
177 | " for x, y in dataloader:\n",
178 | " y_pred = softmax(model(x))\n",
179 | " y_pred = torch.argmax(y_pred, dim=1)\n",
180 | " total_incorrect += torch.count_nonzero(y - y_pred).item()\n",
181 | " \n",
182 | " acc = 1 - (total_incorrect / num_points)\n",
183 | " \n",
184 | " return acc"
185 | ]
186 | },
187 | {
188 | "cell_type": "markdown",
189 | "id": "ab58ad52-c79b-46dc-823b-1603f54b4828",
190 | "metadata": {},
191 | "source": [
192 | "Now let's perform an experiment. We must first create a project using our account at neptune.ai and get the api_token and project name from there. We'll keep track of various hyperparameters, as well as training statistics such as the training/validation loss each epoch. Finally, we can save the model parameters as well.\n",
193 | "\n",
194 | "Some helpful tidbits:\n",
195 | "- The choice of organizing the set-up into a `config` folder is arbitrary. I can organize this information however I please. The same goes for the train and validation folders.\n",
196 | "- What is helpful is to make sure that these have the same organization across runs, so that comparisons are easy."
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "execution_count": null,
202 | "id": "bdbde3ce-91cb-4fc8-8d12-e43cfa82bea4",
203 | "metadata": {},
204 | "outputs": [],
205 | "source": [
206 | "# initialize a run\n",
207 | "run = neptune.init(\n",
208 | " project=\"your_project_name\",\n",
209 | " api_token=\"your_api_key\",\n",
210 | " name = \"Small_CNN\",\n",
211 | " tags = [\"Scratch\", \"3 Downsamples\"]\n",
212 | ")\n",
213 | "\n",
214 | "# set up model and training\n",
215 | "model = small_CNN()\n",
216 | "lossFun = nn.CrossEntropyLoss()\n",
217 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)\n",
218 | "num_epochs = 10\n",
219 | "\n",
220 | "# log the set-up\n",
221 | "for key, value in global_hyperparam.items():\n",
222 | " run[f'config/{key}'] = value\n",
223 | " \n",
224 | "run['config/model'] = type(model).__name__\n",
225 | "run['config/criterion'] = type(lossFun).__name__\n",
226 | "run['config/optimizer'] = type(optimizer).__name__\n",
227 | "run['config/params'] = {\"learning_rate\": optimizer.param_groups[0]['lr'],\n",
228 | " \"epoch_nr\" : num_epochs}\n",
229 | "\n",
230 | "for epoch in tqdm(range(num_epochs)):\n",
231 | " \n",
232 | " train_loss = one_pass(model, dl_train, optimizer, lossFun, log=\"train/batch_loss\")\n",
233 | " valid_loss = one_pass(model, dl_val, optimizer, lossFun, backwards=False)\n",
234 | " \n",
235 | " train_acc = one_pass_acc(model, dl_train, len(ds_train))\n",
236 | " valid_acc = one_pass_acc(model, dl_val, len(ds_val))\n",
237 | " \n",
238 | " # log the loss and accuracy each epoch\n",
239 | " run[\"train/loss\"].log(train_loss)\n",
240 | " run[\"val/loss\"].log(valid_loss)\n",
241 | " run[\"train/acc\"].log(train_acc)\n",
242 | " run[\"val/acc\"].log(valid_acc)\n",
243 | "\n",
244 | "# save your progress\n",
245 | "checkpoint = {'model_state_dict': model.state_dict(),\n",
246 | " 'optimizer_state_dict' :optimizer.state_dict()}\n",
247 | "torch.save(checkpoint, 'model_checkpoint.pt')\n",
248 | "\n",
249 | "# upload the model weights along with an architecture description\n",
250 | "run['model/model_checkpoint'].upload('model_checkpoint.pt')\n",
251 | "\n",
252 | "# save model architecture description\n",
253 | "model_arch = open(\"model_arch.txt\", \"w\")\n",
254 | "model_arch.write(str(model))\n",
255 | "model_arch.close()\n",
256 | "run['model/architecture'].upload(\"model_arch.txt\")\n",
257 | " \n",
258 | "# stop logging this run\n",
259 | "run.stop()"
260 | ]
261 | },
262 | {
263 | "cell_type": "markdown",
264 | "id": "1e38bd58-32b1-4edf-bb1e-a8b11412988f",
265 | "metadata": {},
266 | "source": [
267 | "Let's say I close the notebook and want to go back and keep logging to the previous run with the same model."
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": null,
273 | "id": "98444bb6-4040-4317-a765-05012c201271",
274 | "metadata": {},
275 | "outputs": [],
276 | "source": [
277 | "# get back to that same run\n",
278 | "run = neptune.init(\n",
279 | " project=\"your_project_name\",\n",
280 | " api_token=\"your_api_key\",\n",
281 | " run='NEP-1'\n",
282 | ")\n",
283 | "\n",
284 | "# downloads the file with the same name (will overwrite if already there)\n",
285 | "run['model/model_checkpoint'].download()\n",
286 | "\n",
287 | "# set up model and training again\n",
288 | "model = small_CNN()\n",
289 | "lossFun = nn.CrossEntropyLoss()\n",
290 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)\n",
291 | "num_epochs = 10\n",
292 | "\n",
293 | "# load up the previous checkpoint\n",
294 | "# model architecture must be the same!\n",
295 | "checkpoint = torch.load('model_checkpoint.pt')\n",
296 | "model.load_state_dict(checkpoint['model_state_dict'])\n",
297 | "optimizer.load_state_dict(checkpoint['optimizer_state_dict'])\n",
298 | "\n",
299 | "for key, value in global_hyperparam.items():\n",
300 | " run[f'config/{key}'] = value\n",
301 | " \n",
302 | "run['config/model'] = type(model).__name__\n",
303 | "run['config/criterion'] = type(lossFun).__name__\n",
304 | "run['config/optimizer'] = type(optimizer).__name__\n",
305 | "run['config/params'] = {\"learning_rate\": optimizer.param_groups[0]['lr'],\n",
306 | " \"epoch_nr\" : num_epochs}\n",
307 | "\n",
308 | "for epoch in tqdm(range(num_epochs)):\n",
309 | " \n",
310 | " train_loss = one_pass(model, dl_train, optimizer, lossFun, log=\"train/batch_loss\")\n",
311 | " valid_loss = one_pass(model, dl_val, optimizer, lossFun, backwards=False)\n",
312 | " \n",
313 | " train_acc = one_pass_acc(model, dl_train, len(ds_train))\n",
314 | " valid_acc = one_pass_acc(model, dl_val, len(ds_val))\n",
315 | " \n",
316 | " # continue to log the loss and accuracy each epoch\n",
317 | " run[\"train/loss\"].log(train_loss)\n",
318 | " run[\"val/loss\"].log(valid_loss)\n",
319 | " run[\"train/acc\"].log(train_acc)\n",
320 | " run[\"val/acc\"].log(valid_acc)\n",
321 | "\n",
322 | "# save your progress again\n",
323 | "checkpoint = {'model_state_dict': model.state_dict(),\n",
324 | " 'optimizer_state_dict' :optimizer.state_dict()}\n",
325 | "torch.save(checkpoint, 'model_checkpoint.pt')\n",
326 | "run['model/model_checkpoint'].upload('model_checkpoint.pt')\n",
327 | " \n",
328 | "# stop logging the run\n",
329 | "run.stop()"
330 | ]
331 | },
332 | {
333 | "cell_type": "markdown",
334 | "id": "97f45c01-bf27-4f53-82a4-ff76fcc9a92c",
335 | "metadata": {},
336 | "source": [
337 | "Let's compare this to a different style of model. After running the cells below, go to the Compare Runs tab in neptune.ai."
338 | ]
339 | },
340 | {
341 | "cell_type": "code",
342 | "execution_count": null,
343 | "id": "158f61ca-5f9a-4f76-bae3-7a881020f665",
344 | "metadata": {},
345 | "outputs": [],
346 | "source": [
347 | "# simple CNN\n",
348 | "class smaller_CNN(nn.Module):\n",
349 | " def __init__(self):\n",
350 | " super().__init__()\n",
351 | " \n",
352 | " self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)\n",
353 | " self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)\n",
354 | " \n",
355 | " self.linear1 = nn.Linear(32*14*14, 100)\n",
356 | " self.linear2 = nn.Linear(100, 10)\n",
357 | " \n",
358 | " self.pool = nn.MaxPool2d(kernel_size=2)\n",
359 | " self.relu = nn.ReLU()\n",
360 | " self.unroll = nn.Flatten()\n",
361 | " \n",
362 | " def forward(self, x):\n",
363 | " \n",
364 | " x = self.conv1(x)\n",
365 | " x = self.relu(x)\n",
366 | " x = self.pool(x)\n",
367 | " \n",
368 | " x = self.conv2(x)\n",
369 | " x = self.relu(x)\n",
370 | " \n",
371 | " x = self.linear1(self.unroll(x))\n",
372 | " x = self.relu(x)\n",
373 | " x = self.linear2(x)\n",
374 | " \n",
375 | " return x"
376 | ]
377 | },
378 | {
379 | "cell_type": "code",
380 | "execution_count": null,
381 | "id": "589a0d9c-f603-4db0-87ea-0469b2703042",
382 | "metadata": {},
383 | "outputs": [],
384 | "source": [
385 | "# initialize a run\n",
386 | "run = neptune.init(\n",
387 | " project=\"your_project_name\",\n",
388 | " api_token=\"your_api_key\",\n",
389 | " name = \"Smaller_CNN\",\n",
390 | " tags = [\"Scratch\", \"2 Downsamples\"]\n",
391 | ")\n",
392 | "\n",
393 | "# set up model and training\n",
394 | "model = smaller_CNN()\n",
395 | "lossFun = nn.CrossEntropyLoss()\n",
396 | "optimizer = optim.Adam(model.parameters(), lr = 0.001)\n",
397 | "num_epochs = 10\n",
398 | "\n",
399 | "# log the set-up\n",
400 | "for key, value in global_hyperparam.items():\n",
401 | " run[f'config/{key}'] = value\n",
402 | " \n",
403 | "run['config/model'] = type(model).__name__\n",
404 | "run['config/criterion'] = type(lossFun).__name__\n",
405 | "run['config/optimizer'] = type(optimizer).__name__\n",
406 | "run['config/params'] = {\"learning_rate\": optimizer.param_groups[0]['lr'],\n",
407 | " \"epoch_nr\" : num_epochs}\n",
408 | "\n",
409 | "for epoch in tqdm(range(num_epochs)):\n",
410 | " \n",
411 | " train_loss = one_pass(model, dl_train, optimizer, lossFun, log=\"train/batch_loss\")\n",
412 | " valid_loss = one_pass(model, dl_val, optimizer, lossFun, backwards=False)\n",
413 | " \n",
414 | " train_acc = one_pass_acc(model, dl_train, len(ds_train))\n",
415 | " valid_acc = one_pass_acc(model, dl_val, len(ds_val))\n",
416 | " \n",
417 | " # log the loss and accuracy each epoch\n",
418 | " run[\"train/loss\"].log(train_loss)\n",
419 | " run[\"val/loss\"].log(valid_loss)\n",
420 | " run[\"train/acc\"].log(train_acc)\n",
421 | " run[\"val/acc\"].log(valid_acc)\n",
422 | "\n",
423 | "# save your progress\n",
424 | "checkpoint = {'model_state_dict': model.state_dict(),\n",
425 | " 'optimizer_state_dict' :optimizer.state_dict()}\n",
426 | "torch.save(checkpoint, 'model_checkpoint.pt')\n",
427 | "\n",
428 | "# upload the model weights along with an architecture description\n",
429 | "run['model/model_checkpoint'].upload('model_checkpoint.pt')\n",
430 | "\n",
431 | "# save model architecture description\n",
432 | "model_arch = open(\"model_arch.txt\", \"w\")\n",
433 | "model_arch.write(str(model))\n",
434 | "model_arch.close()\n",
435 | "run['model/architecture'].upload(\"model_arch.txt\")\n",
436 | " \n",
437 | "# stop logging this run\n",
438 | "run.stop()"
439 | ]
440 | }
441 | ],
442 | "metadata": {
443 | "kernelspec": {
444 | "display_name": "Python 3 (ipykernel)",
445 | "language": "python",
446 | "name": "python3"
447 | },
448 | "language_info": {
449 | "codemirror_mode": {
450 | "name": "ipython",
451 | "version": 3
452 | },
453 | "file_extension": ".py",
454 | "mimetype": "text/x-python",
455 | "name": "python",
456 | "nbconvert_exporter": "python",
457 | "pygments_lexer": "ipython3",
458 | "version": "3.8.12"
459 | }
460 | },
461 | "nbformat": 4,
462 | "nbformat_minor": 5
463 | }
464 |
--------------------------------------------------------------------------------
/Notebooks/Pytorch_Lightning.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "0133fec1-13b1-47bb-be64-e455ab94f663",
6 | "metadata": {},
7 | "source": [
8 | "## Using PyTorch Lightning\n",
9 | "\n",
10 | "### by Michael Ruddy"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "id": "d99e1bea-335e-413a-8cb7-ee6a13a6bd15",
16 | "metadata": {},
17 | "source": [
18 | "To get PyTorch Lightning:\n",
19 | "\n",
20 | "`conda install -c conda-forge pytorch-lightning`"
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": null,
26 | "id": "3269a8b5-00f0-48f1-9de1-00b419962ed7",
27 | "metadata": {},
28 | "outputs": [],
29 | "source": [
30 | "import numpy as np\n",
31 | "import pandas as pd\n",
32 | "from tqdm.notebook import tqdm\n",
33 | "\n",
34 | "# PyTorch stuff\n",
35 | "import torch, torchvision\n",
36 | "from torch.utils.data import Dataset\n",
37 | "from torch.utils.data import DataLoader\n",
38 | "import torch.nn as nn\n",
39 | "import torch.optim as optim\n",
40 | "from torchvision import models, transforms\n",
41 | "\n",
42 | "# PyTorch Lightning\n",
43 | "import pytorch_lightning as pl"
44 | ]
45 | },
46 | {
47 | "cell_type": "markdown",
48 | "id": "26e707ab-a687-49bb-8fd9-8193f1fc4f76",
49 | "metadata": {},
50 | "source": [
51 | "Let's use the MNIST dataset to test out these features."
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "id": "bb5c6853-9675-4c64-a8fa-06234b00811b",
58 | "metadata": {},
59 | "outputs": [],
60 | "source": [
61 | "# load up the MNIST dataset\n",
62 | "trnsfm = transforms.Compose(\n",
63 | " [transforms.ToTensor(),\n",
64 | " transforms.Normalize((.5), (.5))])\n",
65 | "\n",
66 | "ds_train = torchvision.datasets.MNIST(root='./data', train=True,\n",
67 | " download=True, transform=trnsfm)\n",
68 | "ds_val = torchvision.datasets.MNIST(root='./data', train=False,\n",
69 | " download=True, transform=trnsfm)\n",
70 | "\n",
71 | "# I'm going to do more than one \"run\" in this notebook\n",
72 | "global_hyperparam = {'N_train':len(ds_train),\n",
73 | " 'N_val':len(ds_val)}\n",
74 | "\n",
75 | "batch_size = 4\n",
76 | "global_hyperparam['batch_size'] = batch_size\n",
77 | "\n",
78 | "# dataloaders\n",
79 | "dl_train = torch.utils.data.DataLoader(ds_train, batch_size=batch_size,\n",
80 | " shuffle=True, num_workers=2)\n",
81 | "dl_val = torch.utils.data.DataLoader(ds_val, batch_size=batch_size,\n",
82 | " shuffle=False, num_workers=2)"
83 | ]
84 | },
85 | {
86 | "cell_type": "markdown",
87 | "id": "b80d5519-c5b2-46a4-85ae-e9da0203f7ab",
88 | "metadata": {},
89 | "source": [
90 | "### PyTorch Set-Up + Training"
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": null,
96 | "id": "d157d933-fd28-474f-8523-5bbc24fd1418",
97 | "metadata": {},
98 | "outputs": [],
99 | "source": [
100 | "# model\n",
101 | "class small_CNN(nn.Module):\n",
102 | " def __init__(self):\n",
103 | " super().__init__()\n",
104 | " \n",
105 | " self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)\n",
106 | " self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)\n",
107 | " self.conv3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)\n",
108 | " \n",
109 | " self.linear1 = nn.Linear(64*7*7, 100)\n",
110 | " self.linear2 = nn.Linear(100, 10)\n",
111 | " \n",
112 | " self.pool = nn.MaxPool2d(kernel_size=2)\n",
113 | " self.relu = nn.ReLU()\n",
114 | " self.unroll = nn.Flatten()\n",
115 | " \n",
116 | " def forward(self, x):\n",
117 | " \n",
118 | " x = self.conv1(x)\n",
119 | " x = self.relu(x)\n",
120 | " x = self.pool(x)\n",
121 | " \n",
122 | " x = self.conv2(x)\n",
123 | " x = self.relu(x)\n",
124 | " x = self.pool(x)\n",
125 | " \n",
126 | " x = self.conv3(x)\n",
127 | " x = self.relu(x)\n",
128 | " \n",
129 | " x = self.linear1(self.unroll(x))\n",
130 | " x = self.relu(x)\n",
131 | " x = self.linear2(x)\n",
132 | " \n",
133 | " return x"
134 | ]
135 | },
136 | {
137 | "cell_type": "markdown",
138 | "id": "6f82a43d-d2f2-4a40-9d07-c3bd550c11b0",
139 | "metadata": {},
140 | "source": [
141 | "Note that these next functions must be altered if I alter the task or the number of inputs to the model's forward pass, or if I want to switch to regression."
142 | ]
143 | },
144 | {
145 | "cell_type": "code",
146 | "execution_count": null,
147 | "id": "8e479948-c7dc-4ccf-af00-317e6b99e37d",
148 | "metadata": {
149 | "tags": []
150 | },
151 | "outputs": [],
152 | "source": [
153 | "# one pass through the dataloader, keyword for whether to backprop or not\n",
154 | "def one_pass(model, dataloader, optimizer, scheduler, lossFun, backwards=True, print_loss=False):\n",
155 | " \n",
156 | " if backwards == True:\n",
157 | " model.train()\n",
158 | " else:\n",
159 | " model.eval()\n",
160 | " \n",
161 | " total_loss = 0.0\n",
162 | " for x, y in tqdm(dataloader):\n",
163 | " \n",
164 | " y_pred = model(x)\n",
165 | " loss = lossFun(y_pred, y)\n",
166 | " total_loss += loss.item()\n",
167 | " \n",
168 | " if backwards == True:\n",
169 | " optimizer.zero_grad()\n",
170 | " loss.backward()\n",
171 | " optimizer.step()\n",
172 | " scheduler.step()\n",
173 | " avg_loss = total_loss / len(dataloader)\n",
174 | " \n",
175 | " if print_loss == True:\n",
176 | " print(avg_loss)\n",
177 | " \n",
178 | " return avg_loss\n",
179 | "\n",
180 | "# one pass to gather metrics\n",
181 | "def one_pass_acc(model, dataloader, num_points):\n",
182 | " model.eval()\n",
183 | " total_incorrect = 0\n",
184 | " \n",
185 | " softmax = nn.LogSoftmax(dim=1)\n",
186 | " \n",
187 | " for x, y in dataloader:\n",
188 | " y_pred = softmax(model(x))\n",
189 | " y_pred = torch.argmax(y_pred, dim=1)\n",
190 | " total_incorrect += torch.count_nonzero(y - y_pred).item()\n",
191 | " \n",
192 | " acc = 1 - (total_incorrect / num_points)\n",
193 | " \n",
194 | " return acc"
195 | ]
196 | },
197 | {
198 | "cell_type": "markdown",
199 | "id": "408ccf50-2392-4651-8ad4-58700d98e9f3",
200 | "metadata": {},
201 | "source": [
202 | "The training loop"
203 | ]
204 | },
205 | {
206 | "cell_type": "code",
207 | "execution_count": null,
208 | "id": "0fddb2e5-9b2e-482b-81de-780070b18704",
209 | "metadata": {},
210 | "outputs": [],
211 | "source": [
212 | "num_epochs = 2\n",
213 | "model = small_CNN()\n",
214 | "lossFun = nn.CrossEntropyLoss()\n",
215 | "optimizer = optim.Adam(model.parameters(), lr=0.001)\n",
216 | "lr_scheduler = optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.002, epochs=num_epochs, steps_per_epoch=len(dl_train))\n",
217 | "\n",
218 | "for epoch in tqdm(range(num_epochs)):\n",
219 | " \n",
220 | " train_loss = one_pass(model, dl_train, optimizer, lr_scheduler, lossFun)\n",
221 | " valid_loss = one_pass(model, dl_val, optimizer, lr_scheduler, lossFun, backwards=False)\n",
222 | " \n",
223 | " print(f\"Train loss, Epoch {epoch}:\", train_loss)\n",
224 | " print(f\"Val loss, Epoch {epoch}:\", valid_loss)\n",
225 | " \n",
226 | " train_acc = one_pass_acc(model, dl_train, len(ds_train))\n",
227 | " valid_acc = one_pass_acc(model, dl_val, len(ds_val))"
228 | ]
229 | },
230 | {
231 | "cell_type": "markdown",
232 | "id": "1144760a-78a1-469c-9414-8f715e469cd1",
233 | "metadata": {},
234 | "source": [
235 | "Now let's do the same model and training in PyTorch Lightning."
236 | ]
237 | },
238 | {
239 | "cell_type": "code",
240 | "execution_count": null,
241 | "id": "2f9e837b-0839-4394-b696-8f6a06e9169a",
242 | "metadata": {},
243 | "outputs": [],
244 | "source": [
245 | "class lightning_small_CNN(pl.LightningModule):\n",
246 | " \n",
247 | " # Similarly need to set-up the weights and the forward pass\n",
248 | " def __init__(self, hparams):\n",
249 | " super().__init__()\n",
250 | " \n",
251 | " self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)\n",
252 | " self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)\n",
253 | " self.conv3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)\n",
254 | " \n",
255 | " self.linear1 = nn.Linear(64*7*7, 100)\n",
256 | " self.linear2 = nn.Linear(100, 10)\n",
257 | " \n",
258 | " self.pool = nn.MaxPool2d(kernel_size=2)\n",
259 | " self.relu = nn.ReLU()\n",
260 | " self.unroll = nn.Flatten()\n",
261 | " \n",
262 | " # going to attach the loss function to the module\n",
263 | " self.CELoss = nn.CrossEntropyLoss()\n",
264 | " self.softmax = nn.LogSoftmax(dim=1)\n",
265 | " \n",
266 | "        # needed for the scheduler; this attribute can't be named self.hparams (Lightning reserves that name)\n",
267 | " self.hp = hparams\n",
268 | " \n",
269 | " def forward(self, x):\n",
270 | " \n",
271 | " x = self.conv1(x)\n",
272 | " x = self.relu(x)\n",
273 | " x = self.pool(x)\n",
274 | " \n",
275 | " x = self.conv2(x)\n",
276 | " x = self.relu(x)\n",
277 | " x = self.pool(x)\n",
278 | " \n",
279 | " x = self.conv3(x)\n",
280 | " x = self.relu(x)\n",
281 | " \n",
282 | " x = self.linear1(self.unroll(x))\n",
283 | " x = self.relu(x)\n",
284 | " x = self.linear2(x)\n",
285 | " \n",
286 | " return x\n",
287 | "\n",
288 | " # method for computing the loss\n",
289 | " def lossFun(self, y_pred, y):\n",
290 | " return self.CELoss(y_pred, y)\n",
291 | " \n",
292 | " # we can define our metric functions below\n",
293 | " def acc(self, y_pred, y):\n",
294 | " y_pred = torch.argmax(y_pred, dim=1)\n",
295 | " total_incorrect = torch.count_nonzero(y - y_pred).item()\n",
296 | " \n",
297 | " return 1 - (total_incorrect / torch.numel(y))\n",
298 | " \n",
299 | " # this method must be named training_step\n",
300 | " def training_step(self, train_batch, batch_idx):\n",
301 | " \n",
302 | " x, y = train_batch\n",
303 | "\n",
304 | " # now these functions are wrapped up in self\n",
305 | " y_pred = self.forward(x)\n",
306 | " loss = self.lossFun(y_pred, y)\n",
307 | " self.log('train_loss', loss, on_epoch=True)\n",
308 | " \n",
309 | " # compute metrics\n",
310 | " acc = self.acc(y_pred, y)\n",
311 | " self.log('train_acc', acc, on_step=False, on_epoch=True)\n",
312 | " \n",
313 | " return loss\n",
314 | "\n",
315 | " # instead of a on/off switch for the backward pass, we simply define a separate step for validation\n",
316 | " # must be named validation_step\n",
317 | " def validation_step(self, val_batch, batch_idx):\n",
318 | " x, y = val_batch\n",
319 | " y_pred = self.forward(x)\n",
320 | " loss = self.lossFun(y_pred, y)\n",
321 | " self.log('val_loss', loss)\n",
322 | " \n",
323 | " acc = self.acc(y_pred, y)\n",
324 | " self.log('val_acc', acc)\n",
325 | "\n",
326 | " # here we configure the optimizer\n",
327 | " def configure_optimizers(self):\n",
328 | " optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)\n",
329 | " \n",
330 | " # we can even pass the scheduler!\n",
331 | " # because the annealing scheduler needs to know the number of batches and epochs, we'll pass a hparam dictionary to the model later\n",
332 | " lr_scheduler = {\n",
333 | " 'scheduler': optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.002,\n",
334 | " epochs=self.hp['num_epochs'],\n",
335 | " steps_per_epoch=self.hp['num_batches']),\n",
336 | " 'interval': 'step' # forces updates after each training step, instead of per epoch\n",
337 | " }\n",
338 | "        # we pass lists here, because Lightning supports multiple optimizers!\n",
339 | " return [optimizer], [lr_scheduler]"
340 | ]
341 | },
342 | {
343 | "cell_type": "markdown",
344 | "id": "9d6bffe6-8e09-476e-9a5c-9e566b6ee663",
345 | "metadata": {},
346 | "source": [
347 | "Now we have the same training loop."
348 | ]
349 | },
350 | {
351 | "cell_type": "code",
352 | "execution_count": null,
353 | "id": "abf21a86-be81-4062-abef-7248c652d556",
354 | "metadata": {},
355 | "outputs": [],
356 | "source": [
357 | "num_epochs = 2\n",
358 | "hparams = {'num_epochs': num_epochs,\n",
359 | " 'num_batches': len(dl_train)}\n",
360 | "model = lightning_small_CNN(hparams)\n",
361 | "trainer = pl.Trainer(max_epochs=num_epochs)\n",
362 | "\n",
363 | "trainer.fit(model, dl_train, dl_val)"
364 | ]
365 | },
366 | {
367 | "cell_type": "markdown",
368 | "id": "4fa7637d-4c13-4857-a488-7672d990bbed",
369 | "metadata": {},
370 | "source": [
371 | "There are a few nice bells and whistles here.\n",
372 | "- Automatic progress bar!\n",
373 | "- Makes logging the train and validation loss easy (logs are stored in `lightning_logs`)\n",
374 | "- The Trainer first runs the forward loop on the validation set as a quick sanity check\n",
375 | "- Most of the training loop can be abstracted into the Module, which makes training from scripts very easy"
376 | ]
377 | },
378 | {
379 | "cell_type": "code",
380 | "execution_count": null,
381 | "id": "dbe332d3-1b85-4522-8a0a-a6755adb00f8",
382 | "metadata": {},
383 | "outputs": [],
384 | "source": [
385 | "# load up tensorboard to view the logs!\n",
386 | "%load_ext tensorboard\n",
387 | "%tensorboard --logdir lightning_logs"
388 | ]
389 | },
390 | {
391 | "cell_type": "markdown",
392 | "id": "1c1d1e6c-8b8a-4590-80a0-f11fae57ba6c",
393 | "metadata": {},
394 | "source": [
395 | "We can even pair this with neptune.ai; we need to run the following first:\n",
396 | "\n",
397 | "`pip install neptune-pytorch-lightning`"
398 | ]
399 | },
400 | {
401 | "cell_type": "code",
402 | "execution_count": null,
403 | "id": "de84f04d-ecaa-4748-9b06-d94df46a8b8c",
404 | "metadata": {},
405 | "outputs": [],
406 | "source": [
407 | "from neptune.new.integrations.pytorch_lightning import NeptuneLogger\n",
408 | "\n",
409 | "# frustratingly enough note that api_token is called api_key here!\n",
410 | "run = NeptuneLogger(\n",
411 | " project=\"your_project_name\",\n",
412 | " api_key=\"your_api_key\",\n",
413 | " name = \"Lightning_Test\",\n",
414 | ")\n",
415 | "\n",
416 | "num_epochs = 2\n",
417 | "hparams = {'num_epochs': num_epochs,\n",
418 | " 'num_batches': len(dl_train)}\n",
419 | "model = lightning_small_CNN(hparams)\n",
420 | "trainer = pl.Trainer(max_epochs=num_epochs, logger=run)\n",
421 | "\n",
422 | "trainer.fit(model, dl_train, dl_val)"
423 | ]
424 | },
425 | {
426 | "cell_type": "code",
427 | "execution_count": null,
428 | "id": "4e1215ff-5a71-48bd-a71e-125b850c75b5",
429 | "metadata": {},
430 | "outputs": [],
431 | "source": []
432 | }
433 | ],
434 | "metadata": {
435 | "kernelspec": {
436 | "display_name": "Python 3 (ipykernel)",
437 | "language": "python",
438 | "name": "python3"
439 | },
440 | "language_info": {
441 | "codemirror_mode": {
442 | "name": "ipython",
443 | "version": 3
444 | },
445 | "file_extension": ".py",
446 | "mimetype": "text/x-python",
447 | "name": "python",
448 | "nbconvert_exporter": "python",
449 | "pygments_lexer": "ipython3",
450 | "version": "3.8.12"
451 | }
452 | },
453 | "nbformat": 4,
454 | "nbformat_minor": 5
455 | }
456 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # MSDS 631: Deep Learning Neural Networks
2 | The repository associated with MSDS631 at the University of San Francisco, Summer 2021. This course has ended; solutions to the assignments are available upon request.
3 |
4 | **Instructor:** Michael Ruddy
5 |
6 | **Email**: mruddy@usfca.edu
7 |
8 | **Class Time**: Tuesdays/Thursdays at 10am - 12pm Pacific Time
9 |
10 | **Location:** Zoom (see Canvas)
11 |
12 | **Office Hours**: Tuesdays/Thursdays at 3pm - 4pm Pacific Time
13 |
14 | **Syllabus**: [Link](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/MSDS631_DeepLearning_Syllabus.pdf)
15 |
16 | ## Course Learning Outcomes
17 |
18 | - Understand Neural Networks and basic techniques for using them effectively (dropout, batch normalization, data augmentation, etc.).
19 | - Learn how to handle image data and use Convolutional Neural Networks for various imaging tasks.
20 | - Learn how to handle text data and use Recurrent Neural Networks and Transformers for various Natural Language Processing tasks and for other tasks involving sequential data.
21 | - Practice implementing these techniques from scratch using PyTorch.
22 | - Complete a Deep Learning project as part of a team from start to finish using real-world data:
23 | - Research literature for the problem,
24 | - Prepare/clean data,
25 | - Apply the appropriate deep learning techniques,
26 | - Evaluate and communicate the results.
27 |
28 | ## Course Overview
29 |
30 | ### Assessment
31 |
32 | The course will be graded based on the following components:
33 |
34 | - **Professionalism** (20%): Students will attend each live lecture (unless given prior permission to watch the recording), participate in all quizzes/polls, and be ready to respond if called upon. Additionally, students will follow the code of conduct.
35 | - **Assignments** (30%): There will be 3-4 Homeworks and 1-2 live Tests/Quizzes. These will be graded for correctness. Homeworks turned in late receive a 10% score reduction for each day after the due date.
36 | - **Labs** (20%): There will be many Lab assignments, which will be graded for completion. Completion means each problem has been reasonably attempted or the student indicates *in detail* why they were confused by or unable to complete it. Labs turned in late receive half credit if they are received within a week of the due date.
37 | - **Final Project** (30%): Students will complete a final project that includes two presentations and a GitHub repository (each worth 10% of their total grade). Students will work in teams of two to complete their final project.
38 |
39 | ### Code of Conduct
40 |
41 | As students will often be asked to share and discuss their work, and work on assignments with their peers, the code of conduct **MUST** be followed. Students are expected to be highly respectful of their peers during and outside of class to help foster an inclusive environment where everyone is comfortable making mistakes and contributing. When working as a group, it is every group member's responsibility to make sure everyone has input and understands the answers given. Students are also expected to be respectful of the instructor and other MSDS faculty.
42 |
43 | ## Schedule
44 |
45 | **Week 1**
46 |
47 | | Topic | Slides | In Class Code | Lab | Due Date |
48 | | :--- | :---: | :---: | :---: | :---: |
49 | | 7/6: Introduction to Deep Learning and this course | [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture1_Introduction.pdf) | [Notebook](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Notebooks/Lecture1_Introduction.ipynb) | [Lab 1](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Lab_1/Lab1.pdf) | July 7th 11:59pm PDT|
50 | | 7/8: Techniques for making Deep Learning work | [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture2_Make_DL_Work.pdf) | [Notebook](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Notebooks/Lecture2_Make_DL_Work.ipynb) | [Lab 2](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Lab_2/Lab2.pdf) | July 9th 11:59pm PDT |
51 |
52 | HW1: [Recommendation Systems with a Feed Forward Neural Network](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Homework_1/Homework1.pdf) (Due 7/16 at 11:59pm Pacific Time)
53 |
54 | **Week 2**
55 |
56 | | Topic | Slides | In Class Code | Lab | Due Date |
57 | | :--- | :---: | :---: | :---: | :---: |
58 | | 7/13: Deep Learning with Image Data | [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture3_Images_and_CNNs.pdf) | [Notebook](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Notebooks/Lecture3_Images_and_CNNs.ipynb) |[Lab 3](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Lab_3/Lab3.pdf) | July 15th 11:59 PDT |
59 | | 7/15: Data Augmentation and Transfer Learning | [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture4_Imaging_Small_Datasets.pdf) | [Notebook](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Notebooks/Lecture4_Transfer_Augmentation.ipynb) | [Lab 4](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Lab_4/Lab4.pdf) | July 18th 11:59 PDT|
60 |
61 | **Week 3**
62 |
63 | | Topic | Slides | In Class Code | Lab | Due Date |
64 | | :--- | :---: | :---: | :---: | :---: |
65 | | 7/20: Final Project Description Presentation | | | | |
66 | | 7/22: Deep Learning with Text Data| [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture5_Text_Data.pdf) | [Notebook](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Notebooks/Lecture5_Text_Embeddings_Models.ipynb) | [Lab 5](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Lab_5/Lab5.pdf) | July 25th 11:59 PDT|
67 |
68 | **Week 4**
69 |
70 | | Topic | Slides | In Class Code | Lab | Due Date |
71 | | :--- | :---: | :---: | :---: | :---: |
72 | | 7/27: Sequence Models | [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture6_Sequence_Models.pdf) | [Notebook](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Notebooks/Lecture6_Sequence_Models.ipynb) | See HW2 | |
73 | | 7/29: BERT/Transformers | [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture7_DrInterian_Attention.pdf), (by [Dr. Interian](https://github.com/yanneta))| | | |
74 |
75 | HW2: [RNNs and a Twin Neural Network](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Homework_2/Homework2.pdf) (Due 8/3 at 11:59pm Pacific Time)
76 |
77 | **Week 5**
78 |
79 | | Topic | Slides | In Class Code | Lab | Due Date |
80 | | :--- | :---: | :---: | :---: | :---: |
81 | | 8/3: Multi-Task Learning and More Imaging | [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture8_More_Imaging.pdf) | [Notebook](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Notebooks/Lecture8_More_Imaging.ipynb) | See HW3 | |
82 | | 8/5: Final Project Office Hours | | | | |
83 |
84 | HW3: [Bounding Boxes](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Assignments/Homework_3/Homework3.pdf) (Due 8/10 at 11:59pm Pacific Time)
85 |
86 | **Week 6**
87 |
88 | | Topic | Slides | In Class Code | Lab | Due Date |
89 | | :--- | :---: | :---: | :---: | :---: |
90 | | 8/10: More Topics: Style Transfer, GANs, Reinforcement Learning| [Slides](https://github.com/mgruddy/DeepLearning_MSDS21/blob/main/Slides/Lecture9_More_Topics.pdf) | | | |
91 | | 8/12: Final Project Presentation| | | | |
92 |
93 | The final projects produced by the students in this course can be viewed [here](https://docs.google.com/spreadsheets/d/1kUJCgP5zK7Mg0FWoD59JZxE04A1s6AfPNAjaomQSHP0/edit?usp=sharing).
94 |
95 | ### Important USF Dates
96 |
97 | - Tuesday, July 6th: First day of classes
98 | - Monday, July 12th: Census Day (Last day to withdraw with tuition reversal)
99 | - Thursday, August 12th: Last day of class
100 |
101 | ### Important Class Dates
102 |
103 | - Thursday, July 15th: Final Project Groups must be finalized
104 | - Tuesday, July 20th: Final Project Description Presentation Date
105 | - Thursday, August 12th: Final Project Presentation Date
106 |
--------------------------------------------------------------------------------
/Slides/Lecture1_Introduction.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture1_Introduction.pdf
--------------------------------------------------------------------------------
/Slides/Lecture2_Make_DL_Work.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture2_Make_DL_Work.pdf
--------------------------------------------------------------------------------
/Slides/Lecture3_Images_and_CNNs.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture3_Images_and_CNNs.pdf
--------------------------------------------------------------------------------
/Slides/Lecture4_Imaging_Small_Datasets.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture4_Imaging_Small_Datasets.pdf
--------------------------------------------------------------------------------
/Slides/Lecture5_Text_Data.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture5_Text_Data.pdf
--------------------------------------------------------------------------------
/Slides/Lecture6_Sequence_Models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture6_Sequence_Models.pdf
--------------------------------------------------------------------------------
/Slides/Lecture7_DrInterian_Attention.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture7_DrInterian_Attention.pdf
--------------------------------------------------------------------------------
/Slides/Lecture8_More_Imaging.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture8_More_Imaging.pdf
--------------------------------------------------------------------------------
/Slides/Lecture9_More_Topics.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mgruddy/DeepLearning_MSDS21/8445b9a607b4df2c5cbf19da6065b2924d811060/Slides/Lecture9_More_Topics.pdf
--------------------------------------------------------------------------------